* [PATCH v3 00/19] KVM GICv3 emulation
@ 2014-10-31 17:26 Andre Przywara
  2014-10-31 17:26 ` [PATCH v3 01/19] arm/arm64: KVM: rework MPIDR assignment and add accessors Andre Przywara
                   ` (20 more replies)
  0 siblings, 21 replies; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

This is an updated version of the GICv3 guest emulation series.

This version is based on v3.18-rc2, which makes the series
self-contained, as all formerly required patches are now upstream.

I addressed most of the comments from Christoffer's review (thanks
for that!); this includes splitting up two patches, so the new
series now carries more patches to ease review.

There still seem to be endianness issues with this, so I don't claim
this version to be compatible with anything other than LE on LE.
I am about to debug this and will include fixes in the next version.

A git repo hosting all these patches lives in the kvm-gicv3/v3 branch
of: http://www.linux-arm.org/git?p=linux-ap.git
-----

GICv3 is the ARM generic interrupt controller designed to overcome
some limits of the prevalent GICv2. Most notably it lifts the 8-CPU
limit. Although recent patches from Marc added support for hosts to
use a GICv3, the CPU limitation still applies to KVM guests, since
the current code emulates only a GICv2.
Also, since GICv2 backward compatibility is optional in GICv3, a
number of systems won't be able to run GICv2 guests.

This patch series provides code to emulate a GICv3 distributor and
redistributor for any KVM guest. It requires a GICv3 in the host to
work. With these patches one can run guests efficiently on any GICv3
host. It has the following features:
- Affinity routing (support for up to 255 VCPUs, more possible)
- System registers (as opposed to MMIO access)
- No ITS
- No priority support (as with the GICv2 emulation)
- No save / restore support so far (will be added soon)

The first patches actually refactor the current VGIC code to make
room for a different VGIC model to be dropped in with Patch 16.
The remaining patches connect the new model to the kernel backend and
the userland-facing code.

The series goes on top of v3.18-rc2.
The necessary patches for kvmtool to enable the guest's GICv3 have
been posted here before [1]; an updated version will follow soon.
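
For reference, a rough sketch of how a VMM can request the new model
through the KVM device API (assuming the KVM_DEV_TYPE_ARM_VGIC_V3 type
and the GICv3 address attributes that the uapi changes in this series
add; vm_fd is an already created VM file descriptor):

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Illustrative only: create a virtual GICv3 and set its base addresses. */
static int create_vgic_v3(int vm_fd, uint64_t dist_base, uint64_t redist_base)
{
	struct kvm_create_device cd = { .type = KVM_DEV_TYPE_ARM_VGIC_V3 };
	struct kvm_device_attr attr;

	if (ioctl(vm_fd, KVM_CREATE_DEVICE, &cd) < 0)
		return -1;	/* kernel or host GIC lacks GICv3 emulation */

	memset(&attr, 0, sizeof(attr));
	attr.group = KVM_DEV_ARM_VGIC_GRP_ADDR;
	attr.attr  = KVM_VGIC_V3_ADDR_TYPE_DIST;
	attr.addr  = (uint64_t)(unsigned long)&dist_base;
	if (ioctl(cd.fd, KVM_SET_DEVICE_ATTR, &attr) < 0)
		return -1;

	attr.attr = KVM_VGIC_V3_ADDR_TYPE_REDIST;
	attr.addr = (uint64_t)(unsigned long)&redist_base;
	if (ioctl(cd.fd, KVM_SET_DEVICE_ATTR, &attr) < 0)
		return -1;

	return cd.fd;		/* keep the device fd for later attributes */
}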

Testing was done on the fast model, with I/O and interrupt affinity
shuffling in a Linux guest with a varying number of VCPUs, as well as
on a Juno board (GICv2 only, to spot regressions).

Please review and test.
I would be grateful if people could also test for GICv2 regressions
(that is, on a GICv2 host with current kvmtool/qemu), as there is
quite a bit of refactoring on that front.

Much of the code was inspired by MarcZ; kudos to him also for doing
the rather painful rebase on top of v3.17-rc1.

Cheers,
Andre.

[1] https://lists.cs.columbia.edu/pipermail/kvmarm/2014-June/010086.html

Changes v2 ... v3:
* rebase to v3.18-rc2
* adapt to new kvm_register_device() function
* split up vm_ops patch and the GICv2 split-off patch to ease review
* various smaller changes due to Christoffer's review
* fix compilation for arm
* remove support for trapping SGI sysreg accesses on arm hosts

Changes v1 ... v2:
* rebase to v3.17-rc1, caused quite some changes to the init code
* new 9/15 patch to make 10/15 smaller
* fix wrongly ordered cp15 register trap entry (MarcZ)
* fix SGI broadcast (thanks to wanghaibin for spotting)
* fix broken bailout path in kvm_vgic_create (wanghaibin)
* check return value of init_emulation_ops() (wanghaibin)
* fix return value check in vgic_[sg]et_attr()
* add header inclusion guards
* remove double definition of VCPU_NOT_ALLOCATED
* some code move-around
* whitespace fixes

Andre Przywara (19):
  arm/arm64: KVM: rework MPIDR assignment and add accessors
  arm/arm64: KVM: pass down user space provided GIC type into vGIC code
  arm/arm64: KVM: refactor vgic_handle_mmio() function
  arm/arm64: KVM: wrap 64 bit MMIO accesses with two 32 bit ones
  arm/arm64: KVM: introduce per-VM ops
  arm/arm64: KVM: move [sg]et_lr into per-VM ops
  arm/arm64: KVM: move kvm_register_device_ops() into vGIC probing
  arm/arm64: KVM: dont rely on a valid GICH base address
  arm/arm64: KVM: make the maximum number of vCPUs a per-VM value
  arm/arm64: KVM: make the value of ICC_SRE_EL1 a per-VM variable
  arm/arm64: KVM: refactor MMIO accessors
  arm/arm64: KVM: refactor/wrap vgic_set/get_attr()
  arm/arm64: KVM: add vgic.h header file
  arm/arm64: KVM: split GICv2 specific emulation code from vgic.c
  arm/arm64: KVM: add opaque private pointer to MMIO accessors
  arm/arm64: KVM: add virtual GICv3 distributor emulation
  arm64: KVM: add SGI system register trapping
  arm/arm64: KVM: enable kernel side of GICv3 emulation
  arm/arm64: KVM: allow userland to request a virtual GICv3

 arch/arm/include/asm/kvm_emulate.h   |    3 +-
 arch/arm/include/asm/kvm_host.h      |    3 +
 arch/arm/kvm/Makefile                |    1 +
 arch/arm/kvm/arm.c                   |   23 +-
 arch/arm/kvm/psci.c                  |   15 +-
 arch/arm64/include/asm/kvm_emulate.h |    3 +-
 arch/arm64/include/asm/kvm_host.h    |    5 +
 arch/arm64/include/uapi/asm/kvm.h    |    7 +
 arch/arm64/kernel/asm-offsets.c      |    1 +
 arch/arm64/kvm/Makefile              |    2 +
 arch/arm64/kvm/sys_regs.c            |   37 +-
 arch/arm64/kvm/vgic-v3-switch.S      |   14 +-
 include/kvm/arm_vgic.h               |   37 +-
 include/linux/irqchip/arm-gic-v3.h   |   26 +
 include/linux/kvm_host.h             |    2 +
 include/uapi/linux/kvm.h             |    2 +
 virt/kvm/arm/vgic-v2-emul.c          |  802 ++++++++++++++++++++++++++
 virt/kvm/arm/vgic-v2.c               |   26 +-
 virt/kvm/arm/vgic-v3-emul.c          |  894 +++++++++++++++++++++++++++++
 virt/kvm/arm/vgic-v3.c               |  192 +++++--
 virt/kvm/arm/vgic.c                  | 1018 +++++++---------------------------
 virt/kvm/arm/vgic.h                  |  128 +++++
 22 files changed, 2366 insertions(+), 875 deletions(-)
 create mode 100644 virt/kvm/arm/vgic-v2-emul.c
 create mode 100644 virt/kvm/arm/vgic-v3-emul.c
 create mode 100644 virt/kvm/arm/vgic.h

-- 
1.7.9.5


* [PATCH v3 01/19] arm/arm64: KVM: rework MPIDR assignment and add accessors
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-03 13:13   ` Christoffer Dall
  2014-10-31 17:26 ` [PATCH v3 02/19] arm/arm64: KVM: pass down user space provided GIC type into vGIC code Andre Przywara
                   ` (19 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

The virtual MPIDR registers (containing topology information) for the
guest are currently mapped linearly to the vcpu_id. Improve this
mapping for arm64 by using three affinity levels so as not to
artificially limit the number of vCPUs. Also add an accessor to later
allow easier lookup of a vCPU with a given MPIDR.
Use this new accessor in the PSCI emulation.
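
As an illustration of the new mapping, here is a standalone sketch
(not the kernel code; the 0/8/16 shifts stand in for
MPIDR_LEVEL_SHIFT(0..2) and bit 31 is set as in reset_mpidr()):

#include <stdio.h>
#include <stdint.h>

/* Pack a vcpu_id into Aff0 (16 vCPUs per cluster), Aff1 and Aff2. */
static uint64_t vcpu_id_to_mpidr(unsigned int vcpu_id)
{
	uint64_t mpidr;

	mpidr  = (uint64_t)(vcpu_id & 0x0f);			/* Aff0 */
	mpidr |= (uint64_t)((vcpu_id >> 4) & 0xff) << 8;	/* Aff1 */
	mpidr |= (uint64_t)((vcpu_id >> 12) & 0xff) << 16;	/* Aff2 */
	return (1ULL << 31) | mpidr;
}

int main(void)
{
	/* vcpu_id 21 -> Aff0 = 5, Aff1 = 1 -> 0x80000105 */
	printf("vcpu 21 -> MPIDR 0x%llx\n",
	       (unsigned long long)vcpu_id_to_mpidr(21));
	return 0;
}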

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 arch/arm/include/asm/kvm_emulate.h   |    3 ++-
 arch/arm/include/asm/kvm_host.h      |    2 ++
 arch/arm/kvm/arm.c                   |   15 +++++++++++++++
 arch/arm/kvm/psci.c                  |   15 ++++-----------
 arch/arm64/include/asm/kvm_emulate.h |    3 ++-
 arch/arm64/include/asm/kvm_host.h    |    2 ++
 arch/arm64/kvm/sys_regs.c            |   11 +++++++++--
 7 files changed, 36 insertions(+), 15 deletions(-)

diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
index b9db269..bd54383 100644
--- a/arch/arm/include/asm/kvm_emulate.h
+++ b/arch/arm/include/asm/kvm_emulate.h
@@ -23,6 +23,7 @@
 #include <asm/kvm_asm.h>
 #include <asm/kvm_mmio.h>
 #include <asm/kvm_arm.h>
+#include <asm/cputype.h>
 
 unsigned long *vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num);
 unsigned long *vcpu_spsr(struct kvm_vcpu *vcpu);
@@ -164,7 +165,7 @@ static inline u32 kvm_vcpu_hvc_get_imm(struct kvm_vcpu *vcpu)
 
 static inline unsigned long kvm_vcpu_get_mpidr(struct kvm_vcpu *vcpu)
 {
-	return vcpu->arch.cp15[c0_MPIDR];
+	return vcpu->arch.cp15[c0_MPIDR] & MPIDR_HWID_BITMASK;
 }
 
 static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 53036e2..b443dfe 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -236,6 +236,8 @@ static inline void vgic_arch_setup(const struct vgic_params *vgic)
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
 
+struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
+
 static inline void kvm_arch_hardware_disable(void) {}
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 9e193c8..61f13cc 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -977,6 +977,21 @@ static void check_kvm_target_cpu(void *ret)
 	*(int *)ret = kvm_target_cpu();
 }
 
+struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr)
+{
+	unsigned long c_mpidr;
+	struct kvm_vcpu *vcpu;
+	int i;
+
+	mpidr &= MPIDR_HWID_BITMASK;
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		c_mpidr = kvm_vcpu_get_mpidr(vcpu);
+		if (c_mpidr == mpidr)
+			return vcpu;
+	}
+	return NULL;
+}
+
 /**
  * Initialize Hyp-mode and memory mappings on all CPUs.
  */
diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
index 09cf377..49f0992 100644
--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -21,6 +21,7 @@
 #include <asm/cputype.h>
 #include <asm/kvm_emulate.h>
 #include <asm/kvm_psci.h>
+#include <asm/kvm_host.h>
 
 /*
  * This is an implementation of the Power State Coordination Interface
@@ -65,25 +66,17 @@ static void kvm_psci_vcpu_off(struct kvm_vcpu *vcpu)
 static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 {
 	struct kvm *kvm = source_vcpu->kvm;
-	struct kvm_vcpu *vcpu = NULL, *tmp;
+	struct kvm_vcpu *vcpu = NULL;
 	wait_queue_head_t *wq;
 	unsigned long cpu_id;
 	unsigned long context_id;
-	unsigned long mpidr;
 	phys_addr_t target_pc;
-	int i;
 
-	cpu_id = *vcpu_reg(source_vcpu, 1);
+	cpu_id = *vcpu_reg(source_vcpu, 1) & MPIDR_HWID_BITMASK;
 	if (vcpu_mode_is_32bit(source_vcpu))
 		cpu_id &= ~((u32) 0);
 
-	kvm_for_each_vcpu(i, tmp, kvm) {
-		mpidr = kvm_vcpu_get_mpidr(tmp);
-		if ((mpidr & MPIDR_HWID_BITMASK) == (cpu_id & MPIDR_HWID_BITMASK)) {
-			vcpu = tmp;
-			break;
-		}
-	}
+	vcpu = kvm_mpidr_to_vcpu(kvm, cpu_id);
 
 	/*
 	 * Make sure the caller requested a valid CPU and that the CPU is
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 5674a55..37316dd 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -27,6 +27,7 @@
 #include <asm/kvm_arm.h>
 #include <asm/kvm_mmio.h>
 #include <asm/ptrace.h>
+#include <asm/cputype.h>
 
 unsigned long *vcpu_reg32(const struct kvm_vcpu *vcpu, u8 reg_num);
 unsigned long *vcpu_spsr32(const struct kvm_vcpu *vcpu);
@@ -184,7 +185,7 @@ static inline u8 kvm_vcpu_trap_get_fault_type(const struct kvm_vcpu *vcpu)
 
 static inline unsigned long kvm_vcpu_get_mpidr(struct kvm_vcpu *vcpu)
 {
-	return vcpu_sys_reg(vcpu, MPIDR_EL1);
+	return vcpu_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK;
 }
 
 static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2012c4b..286bb61 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -207,6 +207,8 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
 
+struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
+
 static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 				       phys_addr_t pgd_ptr,
 				       unsigned long hyp_stack_ptr,
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 4cc3b71..dcc5867 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -252,10 +252,17 @@ static void reset_amair_el1(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
 
 static void reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
 {
+	u64 mpidr;
+
 	/*
-	 * Simply map the vcpu_id into the Aff0 field of the MPIDR.
+	 * Map the vcpu_id into the first three Aff fields of the MPIDR.
+	 * Aff0 uses only 16 CPUs, since there is a SGI injection
+	 * limitation of GICv3.
 	 */
-	vcpu_sys_reg(vcpu, MPIDR_EL1) = (1UL << 31) | (vcpu->vcpu_id & 0xff);
+	mpidr = (vcpu->vcpu_id & 0x0f) << MPIDR_LEVEL_SHIFT(0);
+	mpidr |= ((vcpu->vcpu_id >> 4) & 0xff) << MPIDR_LEVEL_SHIFT(1);
+	mpidr |= ((vcpu->vcpu_id >> 12) & 0xff) << MPIDR_LEVEL_SHIFT(2);
+	vcpu_sys_reg(vcpu, MPIDR_EL1) = (1ULL << 31) | mpidr;
 }
 
 /* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go */
-- 
1.7.9.5


* [PATCH v3 02/19] arm/arm64: KVM: pass down user space provided GIC type into vGIC code
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
  2014-10-31 17:26 ` [PATCH v3 01/19] arm/arm64: KVM: rework MPIDR assignment and add accessors Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-03 13:14   ` Christoffer Dall
  2014-10-31 17:26 ` [PATCH v3 03/19] arm/arm64: KVM: refactor vgic_handle_mmio() function Andre Przywara
                   ` (18 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

With the introduction of a second emulated GIC model we need to let
userspace specify the GIC model to use for each VM. Pass the
userspace-provided value down into the vGIC code and store it there
so we can differentiate between the models later.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 arch/arm/kvm/arm.c     |    2 +-
 include/kvm/arm_vgic.h |    7 +++++--
 virt/kvm/arm/vgic.c    |    5 +++--
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 61f13cc..60c7997 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -753,7 +753,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	switch (ioctl) {
 	case KVM_CREATE_IRQCHIP: {
 		if (vgic_present)
-			return kvm_vgic_create(kvm);
+			return kvm_vgic_create(kvm, KVM_DEV_TYPE_ARM_VGIC_V2);
 		else
 			return -ENXIO;
 	}
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 206dcc3..dde5a00 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -140,6 +140,9 @@ struct vgic_dist {
 	bool			in_kernel;
 	bool			ready;
 
+	/* vGIC model the kernel emulates for the guest (GICv2 or GICv3) */
+	u32			vgic_model;
+
 	int			nr_cpus;
 	int			nr_irqs;
 
@@ -275,7 +278,7 @@ struct kvm_exit_mmio;
 int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write);
 int kvm_vgic_hyp_init(void);
 int kvm_vgic_init(struct kvm *kvm);
-int kvm_vgic_create(struct kvm *kvm);
+int kvm_vgic_create(struct kvm *kvm, u32 type);
 void kvm_vgic_destroy(struct kvm *kvm);
 void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu);
 void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
@@ -326,7 +329,7 @@ static inline int kvm_vgic_init(struct kvm *kvm)
 	return 0;
 }
 
-static inline int kvm_vgic_create(struct kvm *kvm)
+static inline int kvm_vgic_create(struct kvm *kvm, u32 type)
 {
 	return 0;
 }
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 3aaca49..2403d72 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1931,7 +1931,7 @@ out:
 	return ret;
 }
 
-int kvm_vgic_create(struct kvm *kvm)
+int kvm_vgic_create(struct kvm *kvm, u32 type)
 {
 	int i, vcpu_lock_idx = -1, ret = 0;
 	struct kvm_vcpu *vcpu;
@@ -1963,6 +1963,7 @@ int kvm_vgic_create(struct kvm *kvm)
 
 	spin_lock_init(&kvm->arch.vgic.lock);
 	kvm->arch.vgic.in_kernel = true;
+	kvm->arch.vgic.vgic_model = type;
 	kvm->arch.vgic.vctrl_base = vgic->vctrl_base;
 	kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
 	kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
@@ -2388,7 +2389,7 @@ static void vgic_destroy(struct kvm_device *dev)
 
 static int vgic_create(struct kvm_device *dev, u32 type)
 {
-	return kvm_vgic_create(dev->kvm);
+	return kvm_vgic_create(dev->kvm, type);
 }
 
 static struct kvm_device_ops kvm_arm_vgic_v2_ops = {
-- 
1.7.9.5


* [PATCH v3 03/19] arm/arm64: KVM: refactor vgic_handle_mmio() function
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
  2014-10-31 17:26 ` [PATCH v3 01/19] arm/arm64: KVM: rework MPIDR assignment and add accessors Andre Przywara
  2014-10-31 17:26 ` [PATCH v3 02/19] arm/arm64: KVM: pass down user space provided GIC type into vGIC code Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-03 13:23   ` Christoffer Dall
  2014-10-31 17:26 ` [PATCH v3 04/19] arm/arm64: KVM: wrap 64 bit MMIO accesses with two 32 bit ones Andre Przywara
                   ` (17 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

Currently we only need to deal with one MMIO region for the GIC
emulation, but we will soon need to extend this. Refactor the existing
code to allow different register ranges to be added more easily,
without duplicating code.
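
The resulting structure is a table of handler ranges plus one generic
dispatcher. A self-contained stand-in for that pattern (names and
layout are purely illustrative, not the kernel structures):

#include <stdbool.h>
#include <stdio.h>

struct demo_range {
	unsigned long base;
	unsigned long len;
	bool (*handle)(unsigned long offset, bool is_write);
};

static bool handle_ctlr(unsigned long offset, bool is_write)
{
	printf("CTLR %s at +0x%lx\n", is_write ? "write" : "read", offset);
	return true;
}

static const struct demo_range demo_dist_ranges[] = {
	{ .base = 0x000, .len = 0x004, .handle = handle_ctlr },
	{ /* further register blocks (e.g. a redistributor) plug in here */ },
};

static bool demo_dispatch(const struct demo_range *ranges,
			  unsigned long offset, bool is_write)
{
	for (; ranges->handle; ranges++)
		if (offset >= ranges->base &&
		    offset < ranges->base + ranges->len)
			return ranges->handle(offset - ranges->base, is_write);
	return false;		/* unhandled: leave it to userspace */
}

int main(void)
{
	return demo_dispatch(demo_dist_ranges, 0x0, false) ? 0 : 1;
}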

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 virt/kvm/arm/vgic.c |   77 +++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 56 insertions(+), 21 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 2403d72..704be48 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1032,37 +1032,28 @@ static bool vgic_validate_access(const struct vgic_dist *dist,
 	return true;
 }
 
-/**
- * vgic_handle_mmio - handle an in-kernel MMIO access
+/*
+ * vgic_handle_mmio_range - handle an in-kernel MMIO access
  * @vcpu:	pointer to the vcpu performing the access
  * @run:	pointer to the kvm_run structure
  * @mmio:	pointer to the data describing the access
+ * @ranges:	pointer to the register defining structure
+ * @mmio_base:	base address for this mapping
  *
- * returns true if the MMIO access has been performed in kernel space,
- * and false if it needs to be emulated in user space.
+ * returns true if the MMIO access could be performed
  */
-bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
-		      struct kvm_exit_mmio *mmio)
+static bool vgic_handle_mmio_range(struct kvm_vcpu *vcpu, struct kvm_run *run,
+			    struct kvm_exit_mmio *mmio,
+			    const struct mmio_range *ranges,
+			    unsigned long mmio_base)
 {
 	const struct mmio_range *range;
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
-	unsigned long base = dist->vgic_dist_base;
 	bool updated_state;
 	unsigned long offset;
 
-	if (!irqchip_in_kernel(vcpu->kvm) ||
-	    mmio->phys_addr < base ||
-	    (mmio->phys_addr + mmio->len) > (base + KVM_VGIC_V2_DIST_SIZE))
-		return false;
-
-	/* We don't support ldrd / strd or ldm / stm to the emulated vgic */
-	if (mmio->len > 4) {
-		kvm_inject_dabt(vcpu, mmio->phys_addr);
-		return true;
-	}
-
-	offset = mmio->phys_addr - base;
-	range = find_matching_range(vgic_dist_ranges, mmio, offset);
+	offset = mmio->phys_addr - mmio_base;
+	range = find_matching_range(ranges, mmio, offset);
 	if (unlikely(!range || !range->handle_mmio)) {
 		pr_warn("Unhandled access %d %08llx %d\n",
 			mmio->is_write, mmio->phys_addr, mmio->len);
@@ -1070,7 +1061,7 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
 	}
 
 	spin_lock(&vcpu->kvm->arch.vgic.lock);
-	offset = mmio->phys_addr - range->base - base;
+	offset -= range->base;
 	if (vgic_validate_access(dist, range, offset)) {
 		updated_state = range->handle_mmio(vcpu, mmio, offset);
 	} else {
@@ -1088,6 +1079,50 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
 	return true;
 }
 
+static inline bool is_in_range(phys_addr_t addr, unsigned long len,
+			       phys_addr_t baseaddr, unsigned long size)
+{
+	if (addr < baseaddr)
+		return false;
+	return addr + len <= baseaddr + size;
+}
+
+static bool vgic_v2_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
+				struct kvm_exit_mmio *mmio)
+{
+	unsigned long base = vcpu->kvm->arch.vgic.vgic_dist_base;
+
+	if (!is_in_range(mmio->phys_addr, mmio->len, base,
+			 KVM_VGIC_V2_DIST_SIZE))
+		return false;
+
+	/* GICv2 does not support accesses wider than 32 bits */
+	if (mmio->len > 4) {
+		kvm_inject_dabt(vcpu, mmio->phys_addr);
+		return true;
+	}
+
+	return vgic_handle_mmio_range(vcpu, run, mmio, vgic_dist_ranges, base);
+}
+
+/**
+ * vgic_handle_mmio - handle an in-kernel MMIO access for the GIC emulation
+ * @vcpu:      pointer to the vcpu performing the access
+ * @run:       pointer to the kvm_run structure
+ * @mmio:      pointer to the data describing the access
+ *
+ * returns true if the MMIO access has been performed in kernel space,
+ * and false if it needs to be emulated in user space.
+ */
+bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
+		      struct kvm_exit_mmio *mmio)
+{
+	if (!irqchip_in_kernel(vcpu->kvm))
+		return false;
+
+	return vgic_v2_handle_mmio(vcpu, run, mmio);
+}
+
 static u8 *vgic_get_sgi_sources(struct vgic_dist *dist, int vcpu_id, int sgi)
 {
 	return dist->irq_sgi_sources + vcpu_id * VGIC_NR_SGIS + sgi;
-- 
1.7.9.5


* [PATCH v3 04/19] arm/arm64: KVM: wrap 64 bit MMIO accesses with two 32 bit ones
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
                   ` (2 preceding siblings ...)
  2014-10-31 17:26 ` [PATCH v3 03/19] arm/arm64: KVM: refactor vgic_handle_mmio() function Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-03 13:25   ` Christoffer Dall
  2014-10-31 17:26 ` [PATCH v3 05/19] arm/arm64: KVM: introduce per-VM ops Andre Przywara
                   ` (16 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

Some GICv3 registers can and will be accessed as 64 bit registers.
Currently the register handling code can only deal with 32 bit
accesses, so we do two consecutive calls to cover this.
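
Conceptually the split looks like the standalone sketch below, which
assumes a little-endian layout (matching the LE-on-LE caveat from the
cover letter); the handler and register here are made up for the
example:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint32_t fake_reg[2];	/* stands in for one 64-bit register */

static bool demo_handler(unsigned long offset, uint32_t *val, bool is_write)
{
	if (is_write)
		fake_reg[offset / 4] = *val;
	else
		*val = fake_reg[offset / 4];
	return is_write;
}

/* Split one 64-bit access into two 32-bit handler calls: the high word
 * goes to offset + 4, the low word to the original offset. */
static bool demo_handle_64bit(unsigned long offset, uint8_t data[8],
			      bool is_write)
{
	uint32_t half;
	bool changed;

	memcpy(&half, data + 4, 4);
	changed = demo_handler(offset + 4, &half, is_write);
	if (!is_write)
		memcpy(data + 4, &half, 4);

	memcpy(&half, data, 4);
	changed |= demo_handler(offset, &half, is_write);
	if (!is_write)
		memcpy(data, &half, 4);

	return changed;
}

int main(void)
{
	uint8_t buf[8] = { 0x78, 0x56, 0x34, 0x12, 0xef, 0xcd, 0xab, 0x89 };

	demo_handle_64bit(0, buf, true);
	printf("low=0x%08x high=0x%08x\n", fake_reg[0], fake_reg[1]);
	return 0;
}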

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 virt/kvm/arm/vgic.c |   48 +++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 45 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 704be48..0cbdde9 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1033,6 +1033,48 @@ static bool vgic_validate_access(const struct vgic_dist *dist,
 }
 
 /*
+ * Call the respective handler function for the given range.
+ * We split up any 64 bit accesses into two consecutive 32 bit
+ * handler calls and merge the result afterwards.
+ */
+static bool call_range_handler(struct kvm_vcpu *vcpu,
+			       struct kvm_exit_mmio *mmio,
+			       unsigned long offset,
+			       const struct mmio_range *range)
+{
+	u32 *data32 = (void *)mmio->data;
+	struct kvm_exit_mmio mmio32;
+	bool ret;
+
+	if (likely(mmio->len <= 4))
+		return range->handle_mmio(vcpu, mmio, offset);
+
+	/*
+	 * Any access bigger than 4 bytes (that we currently handle in KVM)
+	 * is actually 8 bytes long, caused by a 64-bit access
+	 */
+
+	mmio32.len = 4;
+	mmio32.is_write = mmio->is_write;
+
+	mmio32.phys_addr = mmio->phys_addr + 4;
+	if (mmio->is_write)
+		*(u32 *)mmio32.data = data32[1];
+	ret = range->handle_mmio(vcpu, &mmio32, offset + 4);
+	if (!mmio->is_write)
+		data32[1] = *(u32 *)mmio32.data;
+
+	mmio32.phys_addr = mmio->phys_addr;
+	if (mmio->is_write)
+		*(u32 *)mmio32.data = data32[0];
+	ret |= range->handle_mmio(vcpu, &mmio32, offset);
+	if (!mmio->is_write)
+		data32[0] = *(u32 *)mmio32.data;
+
+	return ret;
+}
+
+/*
  * vgic_handle_mmio_range - handle an in-kernel MMIO access
  * @vcpu:	pointer to the vcpu performing the access
  * @run:	pointer to the kvm_run structure
@@ -1063,10 +1105,10 @@ static bool vgic_handle_mmio_range(struct kvm_vcpu *vcpu, struct kvm_run *run,
 	spin_lock(&vcpu->kvm->arch.vgic.lock);
 	offset -= range->base;
 	if (vgic_validate_access(dist, range, offset)) {
-		updated_state = range->handle_mmio(vcpu, mmio, offset);
+		updated_state = call_range_handler(vcpu, mmio, offset, range);
 	} else {
-		vgic_reg_access(mmio, NULL, offset,
-				ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
+		if (!mmio->is_write)
+			memset(mmio->data, 0, mmio->len);
 		updated_state = false;
 	}
 	spin_unlock(&vcpu->kvm->arch.vgic.lock);
-- 
1.7.9.5


* [PATCH v3 05/19] arm/arm64: KVM: introduce per-VM ops
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
                   ` (3 preceding siblings ...)
  2014-10-31 17:26 ` [PATCH v3 04/19] arm/arm64: KVM: wrap 64 bit MMIO accesses with two 32 bit ones Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-03 13:59   ` Christoffer Dall
  2014-10-31 17:26 ` [PATCH v3 06/19] arm/arm64: KVM: move [sg]et_lr into " Andre Przywara
                   ` (15 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

Currently we support only one virtual GIC model, so all guests use
the same emulation code. With the addition of another model we end up
with different guests using potentially different vGIC models, so we
have to split up some functions to be per-VM.
Introduce a vgic_vm_ops struct to hold function pointers for those
functions that differ between the models and provide the necessary
code to initialize them.
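
Reduced to its core, the pattern is a per-VM table of function
pointers that gets filled once at creation time; a small illustrative
sketch (not the kernel definitions):

#include <stdbool.h>

struct demo_vm_ops {
	bool (*handle_mmio)(unsigned long addr);
	bool (*queue_sgi)(int irq);
};

struct demo_vm {
	struct demo_vm_ops ops;
};

static bool v2_handle_mmio(unsigned long addr) { (void)addr; return true; }
static bool v2_queue_sgi(int irq) { (void)irq; return true; }

/* Pick the emulation backend once, based on the requested GIC model. */
static bool demo_init_ops(struct demo_vm *vm, int type)
{
	switch (type) {
	case 2:		/* stands in for KVM_DEV_TYPE_ARM_VGIC_V2 */
		vm->ops.handle_mmio = v2_handle_mmio;
		vm->ops.queue_sgi = v2_queue_sgi;
		return true;
	}
	return false;	/* unknown model: refuse to create the VM's GIC */
}

int main(void)
{
	struct demo_vm vm;

	return (demo_init_ops(&vm, 2) && vm.ops.handle_mmio(0)) ? 0 : 1;
}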

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 include/kvm/arm_vgic.h |   10 ++++++
 virt/kvm/arm/vgic.c    |   81 +++++++++++++++++++++++++++++++++++-------------
 2 files changed, 69 insertions(+), 22 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index dde5a00..bfb660a 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -134,6 +134,14 @@ struct vgic_params {
 	void __iomem	*vctrl_base;
 };
 
+struct vgic_vm_ops {
+	bool	(*handle_mmio)(struct kvm_vcpu *, struct kvm_run *,
+			       struct kvm_exit_mmio *);
+	bool	(*queue_sgi)(struct kvm_vcpu *vcpu, int irq);
+	void	(*add_sgi_source)(struct kvm_vcpu *vcpu, int irq, int source);
+	int	(*vgic_init)(struct kvm *kvm, const struct vgic_params *params);
+};
+
 struct vgic_dist {
 #ifdef CONFIG_KVM_ARM_VGIC
 	spinlock_t		lock;
@@ -215,6 +223,8 @@ struct vgic_dist {
 
 	/* Bitmap indicating which CPU has something pending */
 	unsigned long		*irq_pending_on_cpu;
+
+	struct vgic_vm_ops	vm_ops;
 #endif
 };
 
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 0cbdde9..2c16684 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -105,6 +105,8 @@ static void vgic_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr);
 static const struct vgic_ops *vgic_ops;
 static const struct vgic_params *vgic;
 
+#define vgic_vm_op(kvm, fn) ((kvm)->arch.vgic.vm_ops.fn)
+
 /*
  * struct vgic_bitmap contains a bitmap made of unsigned longs, but
  * extracts u32s out of them.
@@ -761,6 +763,13 @@ static bool handle_mmio_sgi_reg(struct kvm_vcpu *vcpu,
 	return false;
 }
 
+static void vgic_v2_add_sgi_source(struct kvm_vcpu *vcpu, int irq, int source)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+	*vgic_get_sgi_sources(dist, vcpu->vcpu_id, irq) |= 1 << source;
+}
+
 /**
  * vgic_unqueue_irqs - move pending IRQs from LRs to the distributor
  * @vgic_cpu: Pointer to the vgic_cpu struct holding the LRs
@@ -775,9 +784,7 @@ static bool handle_mmio_sgi_reg(struct kvm_vcpu *vcpu,
  */
 static void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 {
-	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
-	int vcpu_id = vcpu->vcpu_id;
 	int i;
 
 	for_each_set_bit(i, vgic_cpu->lr_used, vgic_cpu->nr_lr) {
@@ -804,7 +811,8 @@ static void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 		 */
 		vgic_dist_irq_set_pending(vcpu, lr.irq);
 		if (lr.irq < VGIC_NR_SGIS)
-			*vgic_get_sgi_sources(dist, vcpu_id, lr.irq) |= 1 << lr.source;
+			vgic_vm_op(vcpu->kvm, add_sgi_source)(vcpu, lr.irq,
+							      lr.source);
 		lr.state &= ~LR_STATE_PENDING;
 		vgic_set_lr(vcpu, i, lr);
 
@@ -1162,7 +1170,7 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
 	if (!irqchip_in_kernel(vcpu->kvm))
 		return false;
 
-	return vgic_v2_handle_mmio(vcpu, run, mmio);
+	return vgic_vm_op(vcpu->kvm, handle_mmio)(vcpu, run, mmio);
 }
 
 static u8 *vgic_get_sgi_sources(struct vgic_dist *dist, int vcpu_id, int sgi)
@@ -1414,7 +1422,7 @@ static bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 	return true;
 }
 
-static bool vgic_queue_sgi(struct kvm_vcpu *vcpu, int irq)
+static bool vgic_v2_queue_sgi(struct kvm_vcpu *vcpu, int irq)
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 	unsigned long sources;
@@ -1489,7 +1497,7 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
 
 	/* SGIs */
 	for_each_set_bit(i, vgic_cpu->pending_percpu, VGIC_NR_SGIS) {
-		if (!vgic_queue_sgi(vcpu, i))
+		if (!vgic_vm_op(vcpu->kvm, queue_sgi)(vcpu, i))
 			overflow = 1;
 	}
 
@@ -1944,9 +1952,6 @@ static int vgic_init_maps(struct kvm *kvm)
 		}
 	}
 
-	for (i = VGIC_NR_PRIVATE_IRQS; i < dist->nr_irqs; i += 4)
-		vgic_set_target_reg(kvm, 0, i);
-
 out:
 	if (ret)
 		kvm_vgic_destroy(kvm);
@@ -1954,6 +1959,31 @@ out:
 	return ret;
 }
 
+static int vgic_v2_init(struct kvm *kvm, const struct vgic_params *params)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	int ret, i;
+
+	if (IS_VGIC_ADDR_UNDEF(dist->vgic_dist_base) ||
+	    IS_VGIC_ADDR_UNDEF(dist->vgic_cpu_base)) {
+		kvm_err("Need to set vgic distributor addresses first\n");
+		return -ENXIO;
+	}
+
+	ret = kvm_phys_addr_ioremap(kvm, dist->vgic_cpu_base,
+				    params->vcpu_base,
+				    KVM_VGIC_V2_CPU_SIZE, true);
+	if (ret) {
+		kvm_err("Unable to remap VGIC CPU to VCPU\n");
+		return ret;
+	}
+
+	for (i = VGIC_NR_PRIVATE_IRQS; i < dist->nr_irqs; i += 4)
+		vgic_set_target_reg(kvm, 0, i);
+
+	return 0;
+}
+
 /**
  * kvm_vgic_init - Initialize global VGIC state before running any VCPUs
  * @kvm: pointer to the kvm struct
@@ -1976,26 +2006,15 @@ int kvm_vgic_init(struct kvm *kvm)
 	if (vgic_initialized(kvm))
 		goto out;
 
-	if (IS_VGIC_ADDR_UNDEF(kvm->arch.vgic.vgic_dist_base) ||
-	    IS_VGIC_ADDR_UNDEF(kvm->arch.vgic.vgic_cpu_base)) {
-		kvm_err("Need to set vgic cpu and dist addresses first\n");
-		ret = -ENXIO;
-		goto out;
-	}
-
 	ret = vgic_init_maps(kvm);
 	if (ret) {
 		kvm_err("Unable to allocate maps\n");
 		goto out;
 	}
 
-	ret = kvm_phys_addr_ioremap(kvm, kvm->arch.vgic.vgic_cpu_base,
-				    vgic->vcpu_base, KVM_VGIC_V2_CPU_SIZE,
-				    true);
-	if (ret) {
-		kvm_err("Unable to remap VGIC CPU to VCPU\n");
+	ret = vgic_vm_op(kvm, vgic_init)(kvm, vgic);
+	if (ret)
 		goto out;
-	}
 
 	kvm_for_each_vcpu(i, vcpu, kvm)
 		kvm_vgic_vcpu_init(vcpu);
@@ -2008,6 +2027,21 @@ out:
 	return ret;
 }
 
+static bool init_emulation_ops(struct kvm *kvm, int type)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+
+	switch (type) {
+	case KVM_DEV_TYPE_ARM_VGIC_V2:
+		dist->vm_ops.handle_mmio = vgic_v2_handle_mmio;
+		dist->vm_ops.queue_sgi = vgic_v2_queue_sgi;
+		dist->vm_ops.add_sgi_source = vgic_v2_add_sgi_source;
+		dist->vm_ops.vgic_init = vgic_v2_init;
+		return true;
+	}
+	return false;
+}
+
 int kvm_vgic_create(struct kvm *kvm, u32 type)
 {
 	int i, vcpu_lock_idx = -1, ret = 0;
@@ -2045,6 +2079,9 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
 	kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
 	kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
 
+	if (!init_emulation_ops(kvm, type))
+		ret = -ENODEV;
+
 out_unlock:
 	for (; vcpu_lock_idx >= 0; vcpu_lock_idx--) {
 		vcpu = kvm_get_vcpu(kvm, vcpu_lock_idx);
-- 
1.7.9.5


* [PATCH v3 06/19] arm/arm64: KVM: move [sg]et_lr into per-VM ops
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
                   ` (4 preceding siblings ...)
  2014-10-31 17:26 ` [PATCH v3 05/19] arm/arm64: KVM: introduce per-VM ops Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-03 14:15   ` Christoffer Dall
  2014-10-31 17:26 ` [PATCH v3 07/19] arm/arm64: KVM: move kvm_register_device_ops() into vGIC probing Andre Przywara
                   ` (14 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

The functions to access the VGIC's list registers not only depend on
the host GIC model, but also need to behave slightly differently
depending on the type of emulated guest GIC.
So move the functions into the new struct vgic_vm_ops and initialize
them properly to prepare for guest GICv3 support later.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 include/kvm/arm_vgic.h |    5 +++--
 virt/kvm/arm/vgic-v2.c |   17 +++++++++++++++--
 virt/kvm/arm/vgic-v3.c |   16 ++++++++++++++--
 virt/kvm/arm/vgic.c    |    9 +++++++--
 4 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index bfb660a..a6d41f1 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -108,8 +108,6 @@ struct vgic_vmcr {
 };
 
 struct vgic_ops {
-	struct vgic_lr	(*get_lr)(const struct kvm_vcpu *, int);
-	void	(*set_lr)(struct kvm_vcpu *, int, struct vgic_lr);
 	void	(*sync_lr_elrsr)(struct kvm_vcpu *, int, struct vgic_lr);
 	u64	(*get_elrsr)(const struct kvm_vcpu *vcpu);
 	u64	(*get_eisr)(const struct kvm_vcpu *vcpu);
@@ -132,9 +130,12 @@ struct vgic_params {
 	unsigned int	maint_irq;
 	/* Virtual control interface base address */
 	void __iomem	*vctrl_base;
+	bool (*init_emul)(struct kvm *kvm, int type);
 };
 
 struct vgic_vm_ops {
+	struct vgic_lr	(*get_lr)(const struct kvm_vcpu *, int);
+	void	(*set_lr)(struct kvm_vcpu *, int, struct vgic_lr);
 	bool	(*handle_mmio)(struct kvm_vcpu *, struct kvm_run *,
 			       struct kvm_exit_mmio *);
 	bool	(*queue_sgi)(struct kvm_vcpu *vcpu, int irq);
diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
index 2935405..bdc8d97 100644
--- a/virt/kvm/arm/vgic-v2.c
+++ b/virt/kvm/arm/vgic-v2.c
@@ -143,8 +143,6 @@ static void vgic_v2_enable(struct kvm_vcpu *vcpu)
 }
 
 static const struct vgic_ops vgic_v2_ops = {
-	.get_lr			= vgic_v2_get_lr,
-	.set_lr			= vgic_v2_set_lr,
 	.sync_lr_elrsr		= vgic_v2_sync_lr_elrsr,
 	.get_elrsr		= vgic_v2_get_elrsr,
 	.get_eisr		= vgic_v2_get_eisr,
@@ -158,6 +156,20 @@ static const struct vgic_ops vgic_v2_ops = {
 
 static struct vgic_params vgic_v2_params;
 
+static bool vgic_v2_init_emul(struct kvm *kvm, int type)
+{
+	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
+
+	switch (type) {
+	case KVM_DEV_TYPE_ARM_VGIC_V2:
+		vm_ops->get_lr = vgic_v2_get_lr;
+		vm_ops->set_lr = vgic_v2_set_lr;
+		return true;
+	}
+
+	return false;
+}
+
 /**
  * vgic_v2_probe - probe for a GICv2 compatible interrupt controller in DT
  * @node:	pointer to the DT node
@@ -196,6 +208,7 @@ int vgic_v2_probe(struct device_node *vgic_node,
 		ret = -ENOMEM;
 		goto out;
 	}
+	vgic->init_emul = vgic_v2_init_emul;
 
 	vgic->nr_lr = readl_relaxed(vgic->vctrl_base + GICH_VTR);
 	vgic->nr_lr = (vgic->nr_lr & 0x3f) + 1;
diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
index 1c2c8ee..a38339e 100644
--- a/virt/kvm/arm/vgic-v3.c
+++ b/virt/kvm/arm/vgic-v3.c
@@ -157,8 +157,6 @@ static void vgic_v3_enable(struct kvm_vcpu *vcpu)
 }
 
 static const struct vgic_ops vgic_v3_ops = {
-	.get_lr			= vgic_v3_get_lr,
-	.set_lr			= vgic_v3_set_lr,
 	.sync_lr_elrsr		= vgic_v3_sync_lr_elrsr,
 	.get_elrsr		= vgic_v3_get_elrsr,
 	.get_eisr		= vgic_v3_get_eisr,
@@ -170,6 +168,19 @@ static const struct vgic_ops vgic_v3_ops = {
 	.enable			= vgic_v3_enable,
 };
 
+static bool vgic_v3_init_emul_compat(struct kvm *kvm, int type)
+{
+	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
+
+	switch (type) {
+	case KVM_DEV_TYPE_ARM_VGIC_V2:
+		vm_ops->get_lr = vgic_v3_get_lr;
+		vm_ops->set_lr = vgic_v3_set_lr;
+		return true;
+	}
+	return false;
+}
+
 static struct vgic_params vgic_v3_params;
 
 /**
@@ -231,6 +242,7 @@ int vgic_v3_probe(struct device_node *vgic_node,
 		goto out;
 	}
 
+	vgic->init_emul = vgic_v3_init_emul_compat;
 	vgic->vcpu_base = vcpu_res.start;
 	vgic->vctrl_base = NULL;
 	vgic->type = VGIC_V3;
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 2c16684..8c2e707 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1278,13 +1278,13 @@ static void vgic_update_state(struct kvm *kvm)
 
 static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr)
 {
-	return vgic_ops->get_lr(vcpu, lr);
+	return vgic_vm_op(vcpu->kvm, get_lr)(vcpu, lr);
 }
 
 static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr,
 			       struct vgic_lr vlr)
 {
-	vgic_ops->set_lr(vcpu, lr, vlr);
+	return vgic_vm_op(vcpu->kvm, set_lr)(vcpu, lr, vlr);
 }
 
 static void vgic_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
@@ -2072,6 +2072,11 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
 		}
 	}
 
+	if (!vgic->init_emul(kvm, type)) {
+		ret = -ENODEV;
+		goto out_unlock;
+	}
+
 	spin_lock_init(&kvm->arch.vgic.lock);
 	kvm->arch.vgic.in_kernel = true;
 	kvm->arch.vgic.vgic_model = type;
-- 
1.7.9.5


* [PATCH v3 07/19] arm/arm64: KVM: move kvm_register_device_ops() into vGIC probing
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
                   ` (5 preceding siblings ...)
  2014-10-31 17:26 ` [PATCH v3 06/19] arm/arm64: KVM: move [sg]et_lr into " Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-03 20:05   ` Christoffer Dall
  2014-10-31 17:26 ` [PATCH v3 08/19] arm/arm64: KVM: dont rely on a valid GICH base address Andre Przywara
                   ` (13 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

Currently we unconditionally register the GICv2 emulation device
during the host's KVM initialization. Since with GICv3 support we may
end up with only v2, only v3, or both supported, move the
registration into the GIC probing function, which is where we know
which combination is available.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 include/linux/kvm_host.h |    1 +
 virt/kvm/arm/vgic-v2.c   |    2 ++
 virt/kvm/arm/vgic-v3.c   |    1 +
 virt/kvm/arm/vgic.c      |    5 ++---
 4 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ea53b04..326ba7a 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1084,6 +1084,7 @@ void kvm_unregister_device_ops(u32 type);
 
 extern struct kvm_device_ops kvm_mpic_ops;
 extern struct kvm_device_ops kvm_xics_ops;
+extern struct kvm_device_ops kvm_arm_vgic_v2_ops;
 
 #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
 
diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
index bdc8d97..417ecaa 100644
--- a/virt/kvm/arm/vgic-v2.c
+++ b/virt/kvm/arm/vgic-v2.c
@@ -242,6 +242,8 @@ int vgic_v2_probe(struct device_node *vgic_node,
 		goto out_unmap;
 	}
 
+	kvm_register_device_ops(&kvm_arm_vgic_v2_ops, KVM_DEV_TYPE_ARM_VGIC_V2);
+
 	vgic->vcpu_base = vcpu_res.start;
 
 	kvm_info("%s@%llx IRQ%d\n", vgic_node->name,
diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
index a38339e..6825c71 100644
--- a/virt/kvm/arm/vgic-v3.c
+++ b/virt/kvm/arm/vgic-v3.c
@@ -241,6 +241,7 @@ int vgic_v3_probe(struct device_node *vgic_node,
 		ret = -ENXIO;
 		goto out;
 	}
+	kvm_register_device_ops(&kvm_arm_vgic_v2_ops, KVM_DEV_TYPE_ARM_VGIC_V2);
 
 	vgic->init_emul = vgic_v3_init_emul_compat;
 	vgic->vcpu_base = vcpu_res.start;
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 8c2e707..98fffb4 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -2511,7 +2511,7 @@ static int vgic_create(struct kvm_device *dev, u32 type)
 	return kvm_vgic_create(dev->kvm, type);
 }
 
-static struct kvm_device_ops kvm_arm_vgic_v2_ops = {
+struct kvm_device_ops kvm_arm_vgic_v2_ops = {
 	.name = "kvm-arm-vgic",
 	.create = vgic_create,
 	.destroy = vgic_destroy,
@@ -2590,8 +2590,7 @@ int kvm_vgic_hyp_init(void)
 
 	on_each_cpu(vgic_init_maintenance_interrupt, NULL, 1);
 
-	return kvm_register_device_ops(&kvm_arm_vgic_v2_ops,
-				       KVM_DEV_TYPE_ARM_VGIC_V2);
+	return 0;
 
 out_free_irq:
 	free_percpu_irq(vgic->maint_irq, kvm_get_running_vcpus());
-- 
1.7.9.5


* [PATCH v3 08/19] arm/arm64: KVM: dont rely on a valid GICH base address
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
                   ` (6 preceding siblings ...)
  2014-10-31 17:26 ` [PATCH v3 07/19] arm/arm64: KVM: move kvm_register_device_ops() into vGIC probing Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-03 20:05   ` Christoffer Dall
  2014-10-31 17:26 ` [PATCH v3 09/19] arm/arm64: KVM: make the maximum number of vCPUs a per-VM value Andre Przywara
                   ` (12 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

To check whether the vGIC was already initialized, we currently check
that the GICH base address is not NULL. Since with GICv3 we may get
along without this address, let's use the irqchip_in_kernel() function
to detect an already initialized vGIC.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 virt/kvm/arm/vgic.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 98fffb4..0407c6c 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -2049,7 +2049,7 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
 
 	mutex_lock(&kvm->lock);
 
-	if (kvm->arch.vgic.vctrl_base) {
+	if (irqchip_in_kernel(kvm)) {
 		ret = -EEXIST;
 		goto out;
 	}
-- 
1.7.9.5


* [PATCH v3 09/19] arm/arm64: KVM: make the maximum number of vCPUs a per-VM value
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
                   ` (7 preceding siblings ...)
  2014-10-31 17:26 ` [PATCH v3 08/19] arm/arm64: KVM: dont rely on a valid GICH base address Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-03 20:06   ` Christoffer Dall
  2014-10-31 17:26 ` [PATCH v3 10/19] arm/arm64: KVM: make the value of ICC_SRE_EL1 a per-VM variable Andre Przywara
                   ` (11 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

Currently the maximum number of vCPUs supported is a global value
limited by the GIC model in use. GICv3 will lift this limit, but we
still need to observe it for guests using GICv2.
So make the maximum number of vCPUs a per-VM value, depending on the
GIC model the guest uses.
Store and check the value in struct kvm_arch, but keep it at 8 for
now.
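
From userspace the limit shows up at vCPU creation time; a small
sketch of what a VMM would observe (assuming the -EINVAL return added
below, and a vm_fd that already has a GICv2 irqchip):

#include <errno.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Create vCPUs until the per-VM limit is hit; returns the count created. */
static int create_vcpus(int vm_fd, int wanted)
{
	int id;

	for (id = 0; id < wanted; id++) {
		if (ioctl(vm_fd, KVM_CREATE_VCPU, (unsigned long)id) < 0) {
			if (errno == EINVAL)
				fprintf(stderr,
					"vCPU %d exceeds the GIC model's limit\n",
					id);
			break;
		}
	}
	return id;
}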

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 arch/arm/include/asm/kvm_host.h   |    1 +
 arch/arm/kvm/arm.c                |    6 ++++++
 arch/arm64/include/asm/kvm_host.h |    3 +++
 virt/kvm/arm/vgic-v2.c            |    7 +++++++
 virt/kvm/arm/vgic-v3.c            |    8 ++++++++
 5 files changed, 25 insertions(+)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index b443dfe..7969e6e 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -68,6 +68,7 @@ struct kvm_arch {
 
 	/* Interrupt controller */
 	struct vgic_dist	vgic;
+	int max_vcpus;
 };
 
 #define KVM_NR_MEM_OBJS     40
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 60c7997..ac0aa7f 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -131,6 +131,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 
 	/* Mark the initial VMID generation invalid */
 	kvm->arch.vmid_gen = 0;
+	kvm->arch.max_vcpus = CONFIG_KVM_ARM_MAX_VCPUS;
 
 	return ret;
 out_free_stage2_pgd:
@@ -213,6 +214,11 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id)
 	int err;
 	struct kvm_vcpu *vcpu;
 
+	if (id >= kvm->arch.max_vcpus) {
+		err = -EINVAL;
+		goto out;
+	}
+
 	vcpu = kmem_cache_zalloc(kvm_vcpu_cache, GFP_KERNEL);
 	if (!vcpu) {
 		err = -ENOMEM;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 286bb61..f9e130d 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -59,6 +59,9 @@ struct kvm_arch {
 	/* VTTBR value associated with above pgd and vmid */
 	u64    vttbr;
 
+	/* The maximum number of vCPUs depends on the used GIC model */
+	int max_vcpus;
+
 	/* Interrupt controller */
 	struct vgic_dist	vgic;
 
diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
index 417ecaa..c92ac33 100644
--- a/virt/kvm/arm/vgic-v2.c
+++ b/virt/kvm/arm/vgic-v2.c
@@ -159,11 +159,18 @@ static struct vgic_params vgic_v2_params;
 static bool vgic_v2_init_emul(struct kvm *kvm, int type)
 {
 	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
+	int nr_vcpus;
 
 	switch (type) {
 	case KVM_DEV_TYPE_ARM_VGIC_V2:
+		nr_vcpus = atomic_read(&kvm->online_vcpus);
+		if (nr_vcpus > 8) {
+			pr_warn_ratelimited("VGICv2 only supports up to 8 vCPUs\n");
+			return false;
+		}
 		vm_ops->get_lr = vgic_v2_get_lr;
 		vm_ops->set_lr = vgic_v2_set_lr;
+		kvm->arch.max_vcpus = 8;
 		return true;
 	}
 
diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
index 6825c71..fc4d628 100644
--- a/virt/kvm/arm/vgic-v3.c
+++ b/virt/kvm/arm/vgic-v3.c
@@ -171,11 +171,19 @@ static const struct vgic_ops vgic_v3_ops = {
 static bool vgic_v3_init_emul_compat(struct kvm *kvm, int type)
 {
 	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
+	int nr_vcpus;
 
 	switch (type) {
 	case KVM_DEV_TYPE_ARM_VGIC_V2:
+		nr_vcpus = atomic_read(&kvm->online_vcpus);
+		if (nr_vcpus > 8) {
+			pr_warn_ratelimited("VGICv2 supports only up to 8 vCPUs\n");
+			return false;
+		}
+
 		vm_ops->get_lr = vgic_v3_get_lr;
 		vm_ops->set_lr = vgic_v3_set_lr;
+		kvm->arch.max_vcpus = 8;
 		return true;
 	}
 	return false;
-- 
1.7.9.5


* [PATCH v3 10/19] arm/arm64: KVM: make the value of ICC_SRE_EL1 a per-VM variable
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
                   ` (8 preceding siblings ...)
  2014-10-31 17:26 ` [PATCH v3 09/19] arm/arm64: KVM: make the maximum number of vCPUs a per-VM value Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-03 20:04   ` Christoffer Dall
  2014-10-31 17:26 ` [PATCH v3 11/19] arm/arm64: KVM: refactor MMIO accessors Andre Przywara
                   ` (10 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

ICC_SRE_EL1 is a system register allowing msr/mrs accesses to the
GIC CPU interface for EL1 (guests). Currently we force it to 0, but
for proper GICv3 support we have to allow guests to use it (depending
on their selected virtual GIC model).
So add ICC_SRE_EL1 to the list of registers saved/restored on a
world switch, but actually disallow a guest from changing it by only
ever restoring a fixed, once-initialized value.
This value depends on the GIC model userland has chosen for the guest.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 arch/arm64/kernel/asm-offsets.c |    1 +
 arch/arm64/kvm/vgic-v3-switch.S |   14 +++++++++-----
 include/kvm/arm_vgic.h          |    1 +
 virt/kvm/arm/vgic-v3.c          |    9 +++++++--
 4 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 9a9fce0..9d34486 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -140,6 +140,7 @@ int main(void)
   DEFINE(VGIC_V2_CPU_ELRSR,	offsetof(struct vgic_cpu, vgic_v2.vgic_elrsr));
   DEFINE(VGIC_V2_CPU_APR,	offsetof(struct vgic_cpu, vgic_v2.vgic_apr));
   DEFINE(VGIC_V2_CPU_LR,	offsetof(struct vgic_cpu, vgic_v2.vgic_lr));
+  DEFINE(VGIC_V3_CPU_SRE,	offsetof(struct vgic_cpu, vgic_v3.vgic_sre));
   DEFINE(VGIC_V3_CPU_HCR,	offsetof(struct vgic_cpu, vgic_v3.vgic_hcr));
   DEFINE(VGIC_V3_CPU_VMCR,	offsetof(struct vgic_cpu, vgic_v3.vgic_vmcr));
   DEFINE(VGIC_V3_CPU_MISR,	offsetof(struct vgic_cpu, vgic_v3.vgic_misr));
diff --git a/arch/arm64/kvm/vgic-v3-switch.S b/arch/arm64/kvm/vgic-v3-switch.S
index d160469..617a012 100644
--- a/arch/arm64/kvm/vgic-v3-switch.S
+++ b/arch/arm64/kvm/vgic-v3-switch.S
@@ -148,17 +148,18 @@
  * x0: Register pointing to VCPU struct
  */
 .macro	restore_vgic_v3_state
-	// Disable SRE_EL1 access. Necessary, otherwise
-	// ICH_VMCR_EL2.VFIQEn becomes one, and FIQ happens...
-	msr_s	ICC_SRE_EL1, xzr
-	isb
-
 	// Compute the address of struct vgic_cpu
 	add	x3, x0, #VCPU_VGIC_CPU
 
 	// Restore all interesting registers
 	ldr	w4, [x3, #VGIC_V3_CPU_HCR]
 	ldr	w5, [x3, #VGIC_V3_CPU_VMCR]
+	ldr	w25, [x3, #VGIC_V3_CPU_SRE]
+
+	msr_s	ICC_SRE_EL1, x25
+
+	// make sure SRE is valid before writing the other registers
+	isb
 
 	msr_s	ICH_HCR_EL2, x4
 	msr_s	ICH_VMCR_EL2, x5
@@ -244,9 +245,12 @@
 	dsb	sy
 
 	// Prevent the guest from touching the GIC system registers
+	// if SRE isn't enabled for GICv3 emulation
+	cbnz	x25, 1f
 	mrs_s	x5, ICC_SRE_EL2
 	and	x5, x5, #~ICC_SRE_EL2_ENABLE
 	msr_s	ICC_SRE_EL2, x5
+1:
 .endm
 
 ENTRY(__save_vgic_v3_state)
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index a6d41f1..8827bc7 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -243,6 +243,7 @@ struct vgic_v3_cpu_if {
 #ifdef CONFIG_ARM_GIC_V3
 	u32		vgic_hcr;
 	u32		vgic_vmcr;
+	u32		vgic_sre;	/* Restored only, change ignored */
 	u32		vgic_misr;	/* Saved only */
 	u32		vgic_eisr;	/* Saved only */
 	u32		vgic_elrsr;	/* Saved only */
diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
index fc4d628..ce50918 100644
--- a/virt/kvm/arm/vgic-v3.c
+++ b/virt/kvm/arm/vgic-v3.c
@@ -145,15 +145,20 @@ static void vgic_v3_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcrp)
 
 static void vgic_v3_enable(struct kvm_vcpu *vcpu)
 {
+	struct vgic_v3_cpu_if *vgic_v3;
+
+	vgic_v3 = &vcpu->arch.vgic_cpu.vgic_v3;
 	/*
 	 * By forcing VMCR to zero, the GIC will restore the binary
 	 * points to their reset values. Anything else resets to zero
 	 * anyway.
 	 */
-	vcpu->arch.vgic_cpu.vgic_v3.vgic_vmcr = 0;
+	vgic_v3->vgic_vmcr = 0;
+
+	vgic_v3->vgic_sre = 0;
 
 	/* Get the show on the road... */
-	vcpu->arch.vgic_cpu.vgic_v3.vgic_hcr = ICH_HCR_EN;
+	vgic_v3->vgic_hcr = ICH_HCR_EN;
 }
 
 static const struct vgic_ops vgic_v3_ops = {
-- 
1.7.9.5


* [PATCH v3 11/19] arm/arm64: KVM: refactor MMIO accessors
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
                   ` (9 preceding siblings ...)
  2014-10-31 17:26 ` [PATCH v3 10/19] arm/arm64: KVM: make the value of ICC_SRE_EL1 a per-VM variable Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-04 11:55   ` Christoffer Dall
  2014-10-31 17:26 ` [PATCH v3 12/19] arm/arm64: KVM: refactor/wrap vgic_set/get_attr() Andre Przywara
                   ` (9 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

The MMIO accessors for GICD_I[CS]ENABLER, GICD_I[CS]PENDR and
GICD_ICFGR behave very similarly in GICv3, although the way the
affected vCPU is determined differs.
Factor out a generic, backend-facing implementation and use small
wrappers in the current GICv2 emulation to ease code sharing later.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 virt/kvm/arm/vgic.c |  126 ++++++++++++++++++++++++++++++---------------------
 1 file changed, 74 insertions(+), 52 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 0407c6c..da501a2 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -478,64 +478,66 @@ static bool handle_mmio_raz_wi(struct kvm_vcpu *vcpu,
 	return false;
 }
 
-static bool handle_mmio_set_enable_reg(struct kvm_vcpu *vcpu,
-				       struct kvm_exit_mmio *mmio,
-				       phys_addr_t offset)
+static bool vgic_handle_enable_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
+				   phys_addr_t offset, int vcpu_id, int access)
 {
-	u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_enabled,
-				       vcpu->vcpu_id, offset);
-	vgic_reg_access(mmio, reg, offset,
-			ACCESS_READ_VALUE | ACCESS_WRITE_SETBIT);
+	u32 *reg;
+	int mode = ACCESS_READ_VALUE | access;
+	struct kvm_vcpu *target_vcpu = kvm_get_vcpu(kvm, vcpu_id);
+
+	reg = vgic_bitmap_get_reg(&kvm->arch.vgic.irq_enabled, vcpu_id, offset);
+	vgic_reg_access(mmio, reg, offset, mode);
 	if (mmio->is_write) {
-		vgic_update_state(vcpu->kvm);
+		if (access & ACCESS_WRITE_CLEARBIT) {
+			if (offset < 4) /* Force SGI enabled */
+				*reg |= 0xffff;
+			vgic_retire_disabled_irqs(target_vcpu);
+		}
+		vgic_update_state(kvm);
 		return true;
 	}
 
 	return false;
 }
 
+static bool handle_mmio_set_enable_reg(struct kvm_vcpu *vcpu,
+				       struct kvm_exit_mmio *mmio,
+				       phys_addr_t offset)
+{
+	return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
+				      vcpu->vcpu_id, ACCESS_WRITE_SETBIT);
+}
+
 static bool handle_mmio_clear_enable_reg(struct kvm_vcpu *vcpu,
 					 struct kvm_exit_mmio *mmio,
 					 phys_addr_t offset)
 {
-	u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_enabled,
-				       vcpu->vcpu_id, offset);
-	vgic_reg_access(mmio, reg, offset,
-			ACCESS_READ_VALUE | ACCESS_WRITE_CLEARBIT);
-	if (mmio->is_write) {
-		if (offset < 4) /* Force SGI enabled */
-			*reg |= 0xffff;
-		vgic_retire_disabled_irqs(vcpu);
-		vgic_update_state(vcpu->kvm);
-		return true;
-	}
-
-	return false;
+	return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
+				      vcpu->vcpu_id, ACCESS_WRITE_CLEARBIT);
 }
 
-static bool handle_mmio_set_pending_reg(struct kvm_vcpu *vcpu,
+static bool vgic_handle_set_pending_reg(struct kvm *kvm,
 					struct kvm_exit_mmio *mmio,
-					phys_addr_t offset)
+					phys_addr_t offset, int vcpu_id)
 {
 	u32 *reg, orig;
 	u32 level_mask;
-	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	int mode = ACCESS_READ_VALUE | ACCESS_WRITE_SETBIT;
+	struct vgic_dist *dist = &kvm->arch.vgic;
 
-	reg = vgic_bitmap_get_reg(&dist->irq_cfg, vcpu->vcpu_id, offset);
+	reg = vgic_bitmap_get_reg(&dist->irq_cfg, vcpu_id, offset);
 	level_mask = (~(*reg));
 
 	/* Mark both level and edge triggered irqs as pending */
-	reg = vgic_bitmap_get_reg(&dist->irq_pending, vcpu->vcpu_id, offset);
+	reg = vgic_bitmap_get_reg(&dist->irq_pending, vcpu_id, offset);
 	orig = *reg;
-	vgic_reg_access(mmio, reg, offset,
-			ACCESS_READ_VALUE | ACCESS_WRITE_SETBIT);
+	vgic_reg_access(mmio, reg, offset, mode);
 
 	if (mmio->is_write) {
 		/* Set the soft-pending flag only for level-triggered irqs */
 		reg = vgic_bitmap_get_reg(&dist->irq_soft_pend,
-					  vcpu->vcpu_id, offset);
-		vgic_reg_access(mmio, reg, offset,
-				ACCESS_READ_VALUE | ACCESS_WRITE_SETBIT);
+					  vcpu_id, offset);
+		vgic_reg_access(mmio, reg, offset, mode);
 		*reg &= level_mask;
 
 		/* Ignore writes to SGIs */
@@ -544,31 +546,30 @@ static bool handle_mmio_set_pending_reg(struct kvm_vcpu *vcpu,
 			*reg |= orig & 0xffff;
 		}
 
-		vgic_update_state(vcpu->kvm);
+		vgic_update_state(kvm);
 		return true;
 	}
 
 	return false;
 }
 
-static bool handle_mmio_clear_pending_reg(struct kvm_vcpu *vcpu,
+static bool vgic_handle_clear_pending_reg(struct kvm *kvm,
 					  struct kvm_exit_mmio *mmio,
-					  phys_addr_t offset)
+					  phys_addr_t offset, int vcpu_id)
 {
 	u32 *level_active;
 	u32 *reg, orig;
-	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	int mode = ACCESS_READ_VALUE | ACCESS_WRITE_CLEARBIT;
+	struct vgic_dist *dist = &kvm->arch.vgic;
 
-	reg = vgic_bitmap_get_reg(&dist->irq_pending, vcpu->vcpu_id, offset);
+	reg = vgic_bitmap_get_reg(&dist->irq_pending, vcpu_id, offset);
 	orig = *reg;
-	vgic_reg_access(mmio, reg, offset,
-			ACCESS_READ_VALUE | ACCESS_WRITE_CLEARBIT);
+	vgic_reg_access(mmio, reg, offset, mode);
 	if (mmio->is_write) {
 		/* Re-set level triggered level-active interrupts */
 		level_active = vgic_bitmap_get_reg(&dist->irq_level,
-					  vcpu->vcpu_id, offset);
-		reg = vgic_bitmap_get_reg(&dist->irq_pending,
-					  vcpu->vcpu_id, offset);
+					  vcpu_id, offset);
+		reg = vgic_bitmap_get_reg(&dist->irq_pending, vcpu_id, offset);
 		*reg |= *level_active;
 
 		/* Ignore writes to SGIs */
@@ -579,17 +580,31 @@ static bool handle_mmio_clear_pending_reg(struct kvm_vcpu *vcpu,
 
 		/* Clear soft-pending flags */
 		reg = vgic_bitmap_get_reg(&dist->irq_soft_pend,
-					  vcpu->vcpu_id, offset);
-		vgic_reg_access(mmio, reg, offset,
-				ACCESS_READ_VALUE | ACCESS_WRITE_CLEARBIT);
+					  vcpu_id, offset);
+		vgic_reg_access(mmio, reg, offset, mode);
 
-		vgic_update_state(vcpu->kvm);
+		vgic_update_state(kvm);
 		return true;
 	}
-
 	return false;
 }
 
+static bool handle_mmio_set_pending_reg(struct kvm_vcpu *vcpu,
+					struct kvm_exit_mmio *mmio,
+					phys_addr_t offset)
+{
+	return vgic_handle_set_pending_reg(vcpu->kvm, mmio, offset,
+					   vcpu->vcpu_id);
+}
+
+static bool handle_mmio_clear_pending_reg(struct kvm_vcpu *vcpu,
+					  struct kvm_exit_mmio *mmio,
+					  phys_addr_t offset)
+{
+	return vgic_handle_clear_pending_reg(vcpu->kvm, mmio, offset,
+					     vcpu->vcpu_id);
+}
+
 static bool handle_mmio_priority_reg(struct kvm_vcpu *vcpu,
 				     struct kvm_exit_mmio *mmio,
 				     phys_addr_t offset)
@@ -712,14 +727,10 @@ static u16 vgic_cfg_compress(u32 val)
  * LSB is always 0. As such, we only keep the upper bit, and use the
  * two above functions to compress/expand the bits
  */
-static bool handle_mmio_cfg_reg(struct kvm_vcpu *vcpu,
-				struct kvm_exit_mmio *mmio, phys_addr_t offset)
+static bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
+				phys_addr_t offset)
 {
 	u32 val;
-	u32 *reg;
-
-	reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_cfg,
-				  vcpu->vcpu_id, offset >> 1);
 
 	if (offset & 4)
 		val = *reg >> 16;
@@ -748,6 +759,17 @@ static bool handle_mmio_cfg_reg(struct kvm_vcpu *vcpu,
 	return false;
 }
 
+static bool handle_mmio_cfg_reg(struct kvm_vcpu *vcpu,
+				struct kvm_exit_mmio *mmio, phys_addr_t offset)
+{
+	u32 *reg;
+
+	reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_cfg,
+				  vcpu->vcpu_id, offset >> 1);
+
+	return vgic_handle_cfg_reg(reg, mmio, offset);
+}
+
 static bool handle_mmio_sgi_reg(struct kvm_vcpu *vcpu,
 				struct kvm_exit_mmio *mmio, phys_addr_t offset)
 {
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v3 12/19] arm/arm64: KVM: refactor/wrap vgic_set/get_attr()
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
                   ` (10 preceding siblings ...)
  2014-10-31 17:26 ` [PATCH v3 11/19] arm/arm64: KVM: refactor MMIO accessors Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-04 19:30   ` Christoffer Dall
  2014-10-31 17:26 ` [PATCH v3 13/19] arm/arm64: KVM: add vgic.h header file Andre Przywara
                   ` (8 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

vgic_set_attr() and vgic_get_attr() contain both code specific to
the emulated GIC model and code for the userland-facing, generic
part of the GIC.
Split the guest-GIC-facing code off from the generic part to allow
easier splitting later.
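
For illustration only (not part of the patch): a standalone sketch of the
dispatch pattern this change introduces, with simplified stand-in types
instead of the real kvm_device_attr plumbing. The common handler returns
-ENXIO for groups it does not know, and the wrapper then falls through to
the model-specific cases:

#include <errno.h>
#include <stdio.h>

struct attr { int group; };

/* Generic part: handles groups common to all GIC models, -ENXIO otherwise. */
static int set_common_attr(struct attr *a)
{
	if (a->group == 0)	/* e.g. KVM_DEV_ARM_VGIC_GRP_ADDR */
		return 0;
	return -ENXIO;
}

/* Wrapper: try the common code first, only then the GIC-specific groups. */
static int set_attr(struct attr *a)
{
	int ret = set_common_attr(a);

	if (ret != -ENXIO)
		return ret;

	if (a->group == 1)	/* e.g. KVM_DEV_ARM_VGIC_GRP_DIST_REGS */
		return 0;	/* model-specific register access */

	return -ENXIO;
}

int main(void)
{
	struct attr a = { .group = 1 };

	printf("set_attr() = %d\n", set_attr(&a));
	return 0;
}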

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 virt/kvm/arm/vgic.c |   78 +++++++++++++++++++++++++++++++++++----------------
 1 file changed, 54 insertions(+), 24 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index da501a2..6ff5acd 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -2383,7 +2383,8 @@ out:
 	return ret;
 }
 
-static int vgic_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
+static int vgic_set_common_attr(struct kvm_device *dev,
+				struct kvm_device_attr *attr)
 {
 	int r;
 
@@ -2399,17 +2400,6 @@ static int vgic_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 		r = kvm_vgic_addr(dev->kvm, type, &addr, true);
 		return (r == -ENODEV) ? -ENXIO : r;
 	}
-
-	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
-	case KVM_DEV_ARM_VGIC_GRP_CPU_REGS: {
-		u32 __user *uaddr = (u32 __user *)(long)attr->addr;
-		u32 reg;
-
-		if (get_user(reg, uaddr))
-			return -EFAULT;
-
-		return vgic_attr_regs_access(dev, attr, &reg, true);
-	}
 	case KVM_DEV_ARM_VGIC_GRP_NR_IRQS: {
 		u32 __user *uaddr = (u32 __user *)(long)attr->addr;
 		u32 val;
@@ -2446,7 +2436,33 @@ static int vgic_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 	return -ENXIO;
 }
 
-static int vgic_get_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
+static int vgic_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
+{
+	int ret;
+
+	ret = vgic_set_common_attr(dev, attr);
+	if (ret != -ENXIO)
+		return ret;
+
+	switch (attr->group) {
+	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
+	case KVM_DEV_ARM_VGIC_GRP_CPU_REGS: {
+		u32 __user *uaddr = (u32 __user *)(long)attr->addr;
+		u32 reg;
+
+		if (get_user(reg, uaddr))
+			return -EFAULT;
+
+		return vgic_attr_regs_access(dev, attr, &reg, true);
+	}
+
+	}
+
+	return -ENXIO;
+}
+
+static int vgic_get_common_attr(struct kvm_device *dev,
+				struct kvm_device_attr *attr)
 {
 	int r = -ENXIO;
 
@@ -2464,27 +2480,41 @@ static int vgic_get_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 			return -EFAULT;
 		break;
 	}
+	case KVM_DEV_ARM_VGIC_GRP_NR_IRQS: {
+		u32 __user *uaddr = (u32 __user *)(long)attr->addr;
+
+		r = put_user(dev->kvm->arch.vgic.nr_irqs, uaddr);
+		break;
+	}
+
+	}
+
+	return r;
+}
+
+static int vgic_get_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
+{
+	int ret;
+
+	ret = vgic_get_common_attr(dev, attr);
+	if (ret != -ENXIO)
+		return ret;
 
+	switch (attr->group) {
 	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
 	case KVM_DEV_ARM_VGIC_GRP_CPU_REGS: {
 		u32 __user *uaddr = (u32 __user *)(long)attr->addr;
 		u32 reg = 0;
 
-		r = vgic_attr_regs_access(dev, attr, &reg, false);
-		if (r)
-			return r;
-		r = put_user(reg, uaddr);
-		break;
-	}
-	case KVM_DEV_ARM_VGIC_GRP_NR_IRQS: {
-		u32 __user *uaddr = (u32 __user *)(long)attr->addr;
-		r = put_user(dev->kvm->arch.vgic.nr_irqs, uaddr);
-		break;
+		ret = vgic_attr_regs_access(dev, attr, &reg, false);
+		if (ret)
+			return ret;
+		return put_user(reg, uaddr);
 	}
 
 	}
 
-	return r;
+	return -ENXIO;
 }
 
 static int vgic_has_attr_regs(const struct mmio_range *ranges,
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v3 13/19] arm/arm64: KVM: add vgic.h header file
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
                   ` (11 preceding siblings ...)
  2014-10-31 17:26 ` [PATCH v3 12/19] arm/arm64: KVM: refactor/wrap vgic_set/get_attr() Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-04 19:30   ` Christoffer Dall
  2014-10-31 17:26 ` [PATCH v3 14/19] arm/arm64: KVM: split GICv2 specific emulation code from vgic.c Andre Przywara
                   ` (7 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

vgic.c is currently a mixture of generic vGIC emulation code and
functions specific to emulating a GICv2. To ease the addition of
GICv3 later, we create a new header file, vgic.h, which holds constants
and prototypes of commonly used functions.
I removed the long-standing comment about using the kvm_io_bus API
to tackle the GIC register ranges, as it wouldn't be a win for us
anymore.
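
For illustration only (not part of the patch): a standalone sketch of the
is_in_range() helper that moves into vgic.h as a static inline, using plain
unsigned long instead of phys_addr_t so it compiles outside the kernel:

#include <stdbool.h>
#include <stdio.h>

static inline bool is_in_range(unsigned long addr, unsigned long len,
			       unsigned long baseaddr, unsigned long size)
{
	if (addr < baseaddr)
		return false;
	return addr + len <= baseaddr + size;
}

int main(void)
{
	/* A 4-byte access at offset 0x100 of a 0x1000-byte region: in range. */
	printf("%d\n", is_in_range(0x8000100, 4, 0x8000000, 0x1000));
	/* An access straddling the end of the region is rejected. */
	printf("%d\n", is_in_range(0x8000ffe, 4, 0x8000000, 0x1000));
	return 0;
}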

Signed-off-by: Andre Przywara <andre.przywara@arm.com>

-------
As the diff isn't always obvious here (and to aid eventual rebases),
here is a list of high-level changes done to the code:
* moved definitions and prototypes from vgic.c to vgic.h:
  - VGIC_ADDR_UNDEF
  - ACCESS_{READ,WRITE}_*
  - vgic_update_state()
  - vgic_kick_vcpus()
  - vgic_get_vmcr()
  - vgic_set_vmcr()
  - struct mmio_range {}
  - IS_IN_RANGE() macro
* removed the static keyword and exported the prototype in vgic.h for:
  - vgic_bitmap_get_reg()
  - vgic_bitmap_set_irq_val()
  - vgic_bitmap_get_shared_map()
  - vgic_bytemap_get_reg()
  - vgic_dist_irq_set()
  - vgic_dist_irq_clear()
  - vgic_cpu_irq_clear()
  - vgic_reg_access()
  - handle_mmio_raz_wi()
  - vgic_handle_enable_reg()
  - vgic_handle_pending_reg()
  - vgic_handle_cfg_reg()
  - vgic_unqueue_irqs()
  - find_matching_range() (renamed to vgic_* to avoid namespace clutter)
  - vgic_handle_mmio_range()
  - vgic_update_state()
  - vgic_get_vmcr()
  - vgic_set_vmcr()
  - vgic_queue_irq()
  - vgic_kick_vcpus()
  - vgic_init_maps()
  - vgic_has_attr_regs()
  - vgic_set_common_attr()
  - vgic_get_common_attr()
  - vgic_destroy()
  - vgic_create()
* moved functions to vgic.h (static inline):
  - mmio_data_read()
  - mmio_data_write()
---
 virt/kvm/arm/vgic.c |  138 ++++++++++++++++-----------------------------------
 virt/kvm/arm/vgic.h |  124 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 167 insertions(+), 95 deletions(-)
 create mode 100644 virt/kvm/arm/vgic.h

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 6ff5acd..8f1e6ee 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -75,32 +75,16 @@
  *   inactive as long as the external input line is held high.
  */
 
-#define VGIC_ADDR_UNDEF		(-1)
-#define IS_VGIC_ADDR_UNDEF(_x)  ((_x) == VGIC_ADDR_UNDEF)
+#include "vgic.h"
 
-#define PRODUCT_ID_KVM		0x4b	/* ASCII code K */
-#define IMPLEMENTER_ARM		0x43b
 #define GICC_ARCH_VERSION_V2	0x2
 
-#define ACCESS_READ_VALUE	(1 << 0)
-#define ACCESS_READ_RAZ		(0 << 0)
-#define ACCESS_READ_MASK(x)	((x) & (1 << 0))
-#define ACCESS_WRITE_IGNORED	(0 << 1)
-#define ACCESS_WRITE_SETBIT	(1 << 1)
-#define ACCESS_WRITE_CLEARBIT	(2 << 1)
-#define ACCESS_WRITE_VALUE	(3 << 1)
-#define ACCESS_WRITE_MASK(x)	((x) & (3 << 1))
-
 static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
 static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu);
-static void vgic_update_state(struct kvm *kvm);
-static void vgic_kick_vcpus(struct kvm *kvm);
 static u8 *vgic_get_sgi_sources(struct vgic_dist *dist, int vcpu_id, int sgi);
 static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg);
 static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr);
 static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc);
-static void vgic_get_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr);
-static void vgic_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr);
 
 static const struct vgic_ops *vgic_ops;
 static const struct vgic_params *vgic;
@@ -161,8 +145,7 @@ static unsigned long *u64_to_bitmask(u64 *val)
 	return (unsigned long *)val;
 }
 
-static u32 *vgic_bitmap_get_reg(struct vgic_bitmap *x,
-				int cpuid, u32 offset)
+u32 *vgic_bitmap_get_reg(struct vgic_bitmap *x, int cpuid, u32 offset)
 {
 	offset >>= 2;
 	if (!offset)
@@ -180,8 +163,8 @@ static int vgic_bitmap_get_irq_val(struct vgic_bitmap *x,
 	return test_bit(irq - VGIC_NR_PRIVATE_IRQS, x->shared);
 }
 
-static void vgic_bitmap_set_irq_val(struct vgic_bitmap *x, int cpuid,
-				    int irq, int val)
+void vgic_bitmap_set_irq_val(struct vgic_bitmap *x, int cpuid,
+			     int irq, int val)
 {
 	unsigned long *reg;
 
@@ -203,7 +186,7 @@ static unsigned long *vgic_bitmap_get_cpu_map(struct vgic_bitmap *x, int cpuid)
 	return x->private + cpuid;
 }
 
-static unsigned long *vgic_bitmap_get_shared_map(struct vgic_bitmap *x)
+unsigned long *vgic_bitmap_get_shared_map(struct vgic_bitmap *x)
 {
 	return x->shared;
 }
@@ -230,7 +213,7 @@ static void vgic_free_bytemap(struct vgic_bytemap *b)
 	b->shared = NULL;
 }
 
-static u32 *vgic_bytemap_get_reg(struct vgic_bytemap *x, int cpuid, u32 offset)
+u32 *vgic_bytemap_get_reg(struct vgic_bytemap *x, int cpuid, u32 offset)
 {
 	u32 *reg;
 
@@ -327,14 +310,14 @@ static int vgic_dist_irq_is_pending(struct kvm_vcpu *vcpu, int irq)
 	return vgic_bitmap_get_irq_val(&dist->irq_pending, vcpu->vcpu_id, irq);
 }
 
-static void vgic_dist_irq_set_pending(struct kvm_vcpu *vcpu, int irq)
+void vgic_dist_irq_set_pending(struct kvm_vcpu *vcpu, int irq)
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 
 	vgic_bitmap_set_irq_val(&dist->irq_pending, vcpu->vcpu_id, irq, 1);
 }
 
-static void vgic_dist_irq_clear_pending(struct kvm_vcpu *vcpu, int irq)
+void vgic_dist_irq_clear_pending(struct kvm_vcpu *vcpu, int irq)
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 
@@ -350,7 +333,7 @@ static void vgic_cpu_irq_set(struct kvm_vcpu *vcpu, int irq)
 			vcpu->arch.vgic_cpu.pending_shared);
 }
 
-static void vgic_cpu_irq_clear(struct kvm_vcpu *vcpu, int irq)
+void vgic_cpu_irq_clear(struct kvm_vcpu *vcpu, int irq)
 {
 	if (irq < VGIC_NR_PRIVATE_IRQS)
 		clear_bit(irq, vcpu->arch.vgic_cpu.pending_percpu);
@@ -364,16 +347,6 @@ static bool vgic_can_sample_irq(struct kvm_vcpu *vcpu, int irq)
 	return vgic_irq_is_edge(vcpu, irq) || !vgic_irq_is_queued(vcpu, irq);
 }
 
-static u32 mmio_data_read(struct kvm_exit_mmio *mmio, u32 mask)
-{
-	return le32_to_cpu(*((u32 *)mmio->data)) & mask;
-}
-
-static void mmio_data_write(struct kvm_exit_mmio *mmio, u32 mask, u32 value)
-{
-	*((u32 *)mmio->data) = cpu_to_le32(value) & mask;
-}
-
 /**
  * vgic_reg_access - access vgic register
  * @mmio:   pointer to the data describing the mmio access
@@ -385,8 +358,8 @@ static void mmio_data_write(struct kvm_exit_mmio *mmio, u32 mask, u32 value)
  * modes defined for vgic register access
  * (read,raz,write-ignored,setbit,clearbit,write)
  */
-static void vgic_reg_access(struct kvm_exit_mmio *mmio, u32 *reg,
-			    phys_addr_t offset, int mode)
+void vgic_reg_access(struct kvm_exit_mmio *mmio, u32 *reg,
+		     phys_addr_t offset, int mode)
 {
 	int word_offset = (offset & 3) * 8;
 	u32 mask = (1UL << (mmio->len * 8)) - 1;
@@ -470,16 +443,16 @@ static bool handle_mmio_misc(struct kvm_vcpu *vcpu,
 	return false;
 }
 
-static bool handle_mmio_raz_wi(struct kvm_vcpu *vcpu,
-			       struct kvm_exit_mmio *mmio, phys_addr_t offset)
+bool handle_mmio_raz_wi(struct kvm_vcpu *vcpu, struct kvm_exit_mmio *mmio,
+			phys_addr_t offset)
 {
 	vgic_reg_access(mmio, NULL, offset,
 			ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
 	return false;
 }
 
-static bool vgic_handle_enable_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
-				   phys_addr_t offset, int vcpu_id, int access)
+bool vgic_handle_enable_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
+			    phys_addr_t offset, int vcpu_id, int access)
 {
 	u32 *reg;
 	int mode = ACCESS_READ_VALUE | access;
@@ -516,9 +489,9 @@ static bool handle_mmio_clear_enable_reg(struct kvm_vcpu *vcpu,
 				      vcpu->vcpu_id, ACCESS_WRITE_CLEARBIT);
 }
 
-static bool vgic_handle_set_pending_reg(struct kvm *kvm,
-					struct kvm_exit_mmio *mmio,
-					phys_addr_t offset, int vcpu_id)
+bool vgic_handle_set_pending_reg(struct kvm *kvm,
+				 struct kvm_exit_mmio *mmio,
+				 phys_addr_t offset, int vcpu_id)
 {
 	u32 *reg, orig;
 	u32 level_mask;
@@ -553,9 +526,9 @@ static bool vgic_handle_set_pending_reg(struct kvm *kvm,
 	return false;
 }
 
-static bool vgic_handle_clear_pending_reg(struct kvm *kvm,
-					  struct kvm_exit_mmio *mmio,
-					  phys_addr_t offset, int vcpu_id)
+bool vgic_handle_clear_pending_reg(struct kvm *kvm,
+				   struct kvm_exit_mmio *mmio,
+				   phys_addr_t offset, int vcpu_id)
 {
 	u32 *level_active;
 	u32 *reg, orig;
@@ -727,8 +700,8 @@ static u16 vgic_cfg_compress(u32 val)
  * LSB is always 0. As such, we only keep the upper bit, and use the
  * two above functions to compress/expand the bits
  */
-static bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
-				phys_addr_t offset)
+bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
+			 phys_addr_t offset)
 {
 	u32 val;
 
@@ -804,7 +777,7 @@ static void vgic_v2_add_sgi_source(struct kvm_vcpu *vcpu, int irq, int source)
  * to the distributor but the active state stays in the LRs, because we don't
  * track the active state on the distributor side.
  */
-static void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
+void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	int i;
@@ -930,20 +903,6 @@ static bool handle_mmio_sgi_clear(struct kvm_vcpu *vcpu,
 		return write_set_clear_sgi_pend_reg(vcpu, mmio, offset, false);
 }
 
-/*
- * I would have liked to use the kvm_bus_io_*() API instead, but it
- * cannot cope with banked registers (only the VM pointer is passed
- * around, and we need the vcpu). One of these days, someone please
- * fix it!
- */
-struct mmio_range {
-	phys_addr_t base;
-	unsigned long len;
-	int bits_per_irq;
-	bool (*handle_mmio)(struct kvm_vcpu *vcpu, struct kvm_exit_mmio *mmio,
-			    phys_addr_t offset);
-};
-
 static const struct mmio_range vgic_dist_ranges[] = {
 	{
 		.base		= GIC_DIST_CTRL,
@@ -1029,10 +988,10 @@ static const struct mmio_range vgic_dist_ranges[] = {
 	{}
 };
 
-static const
-struct mmio_range *find_matching_range(const struct mmio_range *ranges,
-				       struct kvm_exit_mmio *mmio,
-				       phys_addr_t offset)
+const
+struct mmio_range *vgic_find_matching_range(const struct mmio_range *ranges,
+					    struct kvm_exit_mmio *mmio,
+					    phys_addr_t offset)
 {
 	const struct mmio_range *r = ranges;
 
@@ -1114,7 +1073,7 @@ static bool call_range_handler(struct kvm_vcpu *vcpu,
  *
  * returns true if the MMIO access could be performed
  */
-static bool vgic_handle_mmio_range(struct kvm_vcpu *vcpu, struct kvm_run *run,
+bool vgic_handle_mmio_range(struct kvm_vcpu *vcpu, struct kvm_run *run,
 			    struct kvm_exit_mmio *mmio,
 			    const struct mmio_range *ranges,
 			    unsigned long mmio_base)
@@ -1125,7 +1084,7 @@ static bool vgic_handle_mmio_range(struct kvm_vcpu *vcpu, struct kvm_run *run,
 	unsigned long offset;
 
 	offset = mmio->phys_addr - mmio_base;
-	range = find_matching_range(ranges, mmio, offset);
+	range = vgic_find_matching_range(ranges, mmio, offset);
 	if (unlikely(!range || !range->handle_mmio)) {
 		pr_warn("Unhandled access %d %08llx %d\n",
 			mmio->is_write, mmio->phys_addr, mmio->len);
@@ -1151,14 +1110,6 @@ static bool vgic_handle_mmio_range(struct kvm_vcpu *vcpu, struct kvm_run *run,
 	return true;
 }
 
-static inline bool is_in_range(phys_addr_t addr, unsigned long len,
-			       phys_addr_t baseaddr, unsigned long size)
-{
-	if (addr < baseaddr)
-		return false;
-	return addr + len <= baseaddr + size;
-}
-
 static bool vgic_v2_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
 				struct kvm_exit_mmio *mmio)
 {
@@ -1279,7 +1230,7 @@ static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
  * Update the interrupt state and determine which CPUs have pending
  * interrupts. Must be called with distributor lock held.
  */
-static void vgic_update_state(struct kvm *kvm)
+void vgic_update_state(struct kvm *kvm)
 {
 	struct vgic_dist *dist = &kvm->arch.vgic;
 	struct kvm_vcpu *vcpu;
@@ -1340,12 +1291,12 @@ static inline void vgic_disable_underflow(struct kvm_vcpu *vcpu)
 	vgic_ops->disable_underflow(vcpu);
 }
 
-static inline void vgic_get_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
+void vgic_get_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
 {
 	vgic_ops->get_vmcr(vcpu, vmcr);
 }
 
-static void vgic_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
+void vgic_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
 {
 	vgic_ops->set_vmcr(vcpu, vmcr);
 }
@@ -1395,7 +1346,7 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
  * Queue an interrupt to a CPU virtual interface. Return true on success,
  * or false if it wasn't possible to queue it.
  */
-static bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
+bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
@@ -1681,7 +1632,7 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
 	return test_bit(vcpu->vcpu_id, dist->irq_pending_on_cpu);
 }
 
-static void vgic_kick_vcpus(struct kvm *kvm)
+void vgic_kick_vcpus(struct kvm *kvm)
 {
 	struct kvm_vcpu *vcpu;
 	int c;
@@ -1911,7 +1862,7 @@ void kvm_vgic_destroy(struct kvm *kvm)
  * Allocate and initialize the various data structures. Must be called
  * with kvm->lock held!
  */
-static int vgic_init_maps(struct kvm *kvm)
+int vgic_init_maps(struct kvm *kvm)
 {
 	struct vgic_dist *dist = &kvm->arch.vgic;
 	struct kvm_vcpu *vcpu;
@@ -2337,7 +2288,7 @@ static int vgic_attr_regs_access(struct kvm_device *dev,
 	default:
 		BUG();
 	}
-	r = find_matching_range(ranges, &mmio, offset);
+	r = vgic_find_matching_range(ranges, &mmio, offset);
 
 	if (unlikely(!r || !r->handle_mmio)) {
 		ret = -ENXIO;
@@ -2383,8 +2334,7 @@ out:
 	return ret;
 }
 
-static int vgic_set_common_attr(struct kvm_device *dev,
-				struct kvm_device_attr *attr)
+int vgic_set_common_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 {
 	int r;
 
@@ -2461,8 +2411,7 @@ static int vgic_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 	return -ENXIO;
 }
 
-static int vgic_get_common_attr(struct kvm_device *dev,
-				struct kvm_device_attr *attr)
+int vgic_get_common_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 {
 	int r = -ENXIO;
 
@@ -2517,13 +2466,12 @@ static int vgic_get_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 	return -ENXIO;
 }
 
-static int vgic_has_attr_regs(const struct mmio_range *ranges,
-			      phys_addr_t offset)
+int vgic_has_attr_regs(const struct mmio_range *ranges, phys_addr_t offset)
 {
 	struct kvm_exit_mmio dev_attr_mmio;
 
 	dev_attr_mmio.len = 4;
-	if (find_matching_range(ranges, &dev_attr_mmio, offset))
+	if (vgic_find_matching_range(ranges, &dev_attr_mmio, offset))
 		return 0;
 	else
 		return -ENXIO;
@@ -2553,12 +2501,12 @@ static int vgic_has_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 	return -ENXIO;
 }
 
-static void vgic_destroy(struct kvm_device *dev)
+void vgic_destroy(struct kvm_device *dev)
 {
 	kfree(dev);
 }
 
-static int vgic_create(struct kvm_device *dev, u32 type)
+int vgic_create(struct kvm_device *dev, u32 type)
 {
 	return kvm_vgic_create(dev->kvm, type);
 }
diff --git a/virt/kvm/arm/vgic.h b/virt/kvm/arm/vgic.h
new file mode 100644
index 0000000..f320333
--- /dev/null
+++ b/virt/kvm/arm/vgic.h
@@ -0,0 +1,124 @@
+/*
+ * Copyright (C) 2012-2014 ARM Ltd.
+ * Author: Marc Zyngier <marc.zyngier@arm.com>
+ *
+ * Derived from virt/kvm/arm/vgic.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __KVM_VGIC_H__
+#define __KVM_VGIC_H__
+
+#define VGIC_ADDR_UNDEF		(-1)
+#define IS_VGIC_ADDR_UNDEF(_x)  ((_x) == VGIC_ADDR_UNDEF)
+
+#define PRODUCT_ID_KVM		0x4b	/* ASCII code K */
+#define IMPLEMENTER_ARM		0x43b
+
+#define ACCESS_READ_VALUE	(1 << 0)
+#define ACCESS_READ_RAZ		(0 << 0)
+#define ACCESS_READ_MASK(x)	((x) & (1 << 0))
+#define ACCESS_WRITE_IGNORED	(0 << 1)
+#define ACCESS_WRITE_SETBIT	(1 << 1)
+#define ACCESS_WRITE_CLEARBIT	(2 << 1)
+#define ACCESS_WRITE_VALUE	(3 << 1)
+#define ACCESS_WRITE_MASK(x)	((x) & (3 << 1))
+
+unsigned long *vgic_bitmap_get_shared_map(struct vgic_bitmap *x);
+
+void vgic_update_state(struct kvm *kvm);
+int vgic_init_maps(struct kvm *kvm);
+
+u32 *vgic_bitmap_get_reg(struct vgic_bitmap *x, int cpuid, u32 offset);
+u32 *vgic_bytemap_get_reg(struct vgic_bytemap *x, int cpuid, u32 offset);
+
+void vgic_dist_irq_set_pending(struct kvm_vcpu *vcpu, int irq);
+void vgic_dist_irq_clear_pending(struct kvm_vcpu *vcpu, int irq);
+void vgic_cpu_irq_clear(struct kvm_vcpu *vcpu, int irq);
+void vgic_bitmap_set_irq_val(struct vgic_bitmap *x, int cpuid,
+			     int irq, int val);
+
+void vgic_get_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr);
+void vgic_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr);
+
+bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq);
+void vgic_unqueue_irqs(struct kvm_vcpu *vcpu);
+
+void vgic_reg_access(struct kvm_exit_mmio *mmio, u32 *reg,
+		     phys_addr_t offset, int mode);
+bool handle_mmio_raz_wi(struct kvm_vcpu *vcpu, struct kvm_exit_mmio *mmio,
+			phys_addr_t offset);
+
+static inline
+u32 mmio_data_read(struct kvm_exit_mmio *mmio, u32 mask)
+{
+	return le32_to_cpu(*((u32 *)mmio->data)) & mask;
+}
+
+static inline
+void mmio_data_write(struct kvm_exit_mmio *mmio, u32 mask, u32 value)
+{
+	*((u32 *)mmio->data) = cpu_to_le32(value) & mask;
+}
+
+struct mmio_range {
+	phys_addr_t base;
+	unsigned long len;
+	int bits_per_irq;
+	bool (*handle_mmio)(struct kvm_vcpu *vcpu, struct kvm_exit_mmio *mmio,
+			    phys_addr_t offset);
+};
+
+static inline bool is_in_range(phys_addr_t addr, unsigned long len,
+			       phys_addr_t baseaddr, unsigned long size)
+{
+	if (addr < baseaddr)
+		return false;
+	return addr + len <= baseaddr + size;
+}
+
+const
+struct mmio_range *vgic_find_matching_range(const struct mmio_range *ranges,
+					    struct kvm_exit_mmio *mmio,
+					    phys_addr_t offset);
+
+bool vgic_handle_mmio_range(struct kvm_vcpu *vcpu, struct kvm_run *run,
+			    struct kvm_exit_mmio *mmio,
+			    const struct mmio_range *ranges,
+			    unsigned long mmio_base);
+
+bool vgic_handle_enable_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
+			    phys_addr_t offset, int vcpu_id, int access);
+
+bool vgic_handle_set_pending_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
+				 phys_addr_t offset, int vcpu_id);
+
+bool vgic_handle_clear_pending_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
+				   phys_addr_t offset, int vcpu_id);
+
+bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
+			 phys_addr_t offset);
+
+void vgic_kick_vcpus(struct kvm *kvm);
+
+int vgic_create(struct kvm_device *dev, u32 type);
+void vgic_destroy(struct kvm_device *dev);
+
+int vgic_has_attr_regs(const struct mmio_range *ranges, phys_addr_t offset);
+int vgic_set_common_attr(struct kvm_device *dev, struct kvm_device_attr *attr);
+int vgic_get_common_attr(struct kvm_device *dev, struct kvm_device_attr *attr);
+
+bool vgic_v2_init_emulation_ops(struct kvm *kvm, int type);
+
+#endif
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v3 14/19] arm/arm64: KVM: split GICv2 specific emulation code from vgic.c
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
                   ` (12 preceding siblings ...)
  2014-10-31 17:26 ` [PATCH v3 13/19] arm/arm64: KVM: add vgic.h header file Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-04 19:30   ` Christoffer Dall
  2014-10-31 17:26 ` [PATCH v3 15/19] arm/arm64: KVM: add opaque private pointer to MMIO accessors Andre Przywara
                   ` (6 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

vgic.c is currently a mixture of generic vGIC emulation code and
functions specific to emulating a GICv2. To ease the addition of
GICv3, split off the strictly v2-specific parts into a new file,
vgic-v2-emul.c.
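
For illustration only (not part of the patch): a standalone sketch of the
vm_ops indirection that lets the generic vgic.c stay model-agnostic while
vgic-v2-emul.c plugs in its handlers. The struct below is a simplified
stand-in, not the kernel's vgic_vm_ops definition; only the wiring mirrors
what vgic_v2_init_emulation_ops() does:

#include <stdbool.h>
#include <stdio.h>

struct vm_ops {
	bool (*handle_mmio)(void);
	bool (*queue_sgi)(int irq);
};

/* GICv2-specific implementations, private to the v2 emulation file. */
static bool v2_handle_mmio(void)
{
	puts("v2 MMIO handler");
	return true;
}

static bool v2_queue_sgi(int irq)
{
	printf("queue SGI%d\n", irq);
	return true;
}

/* Roughly what vgic_v2_init_emulation_ops() does: wire up the ops table. */
static void v2_init_emulation_ops(struct vm_ops *ops)
{
	ops->handle_mmio = v2_handle_mmio;
	ops->queue_sgi = v2_queue_sgi;
}

int main(void)
{
	struct vm_ops ops;

	v2_init_emulation_ops(&ops);
	ops.handle_mmio();	/* generic code only ever calls through ops */
	ops.queue_sgi(3);
	return 0;
}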

Signed-off-by: Andre Przywara <andre.przywara@arm.com>

-------
As the diff isn't always obvious here (and to aid eventual rebases),
here is a list of high-level changes done to the code:
* added new file to respective arm/arm64 Makefiles
* moved GICv2 specific functions to vgic-v2-emul.c:
  - handle_mmio_misc()
  - handle_mmio_set_enable_reg()
  - handle_mmio_clear_enable_reg()
  - handle_mmio_set_pending_reg()
  - handle_mmio_clear_pending_reg()
  - handle_mmio_priority_reg()
  - vgic_get_target_reg()
  - vgic_set_target_reg()
  - handle_mmio_target_reg()
  - handle_mmio_cfg_reg()
  - handle_mmio_sgi_reg()
  - vgic_v2_unqueue_sgi()
  - read_set_clear_sgi_pend_reg()
  - write_set_clear_sgi_pend_reg()
  - handle_mmio_sgi_set()
  - handle_mmio_sgi_clear()
  - vgic_v2_handle_mmio()
  - vgic_get_sgi_sources()
  - vgic_dispatch_sgi()
  - vgic_v2_queue_sgi()
  - vgic_v2_init()
  - handle_cpu_mmio_misc()
  - handle_mmio_abpr()
  - handle_cpu_mmio_ident()
  - vgic_attr_regs_access()
  - vgic_has_attr() (renamed to vgic_v2_has_attr())
  - vgic_set_attr() (renamed to vgic_v2_set_attr())
  - vgic_get_attr() (renamed to vgic_v2_get_attr())
  - struct mmio_range vgic_dist_ranges[]
  - struct mmio_range vgic_cpu_ranges[]
  - struct kvm_device_ops kvm_arm_vgic_v2_ops {}
* moved the content of init_emulation_ops() into a separate function in vgic-v2-emul.c
---
 arch/arm/kvm/Makefile       |    1 +
 arch/arm64/kvm/Makefile     |    1 +
 virt/kvm/arm/vgic-v2-emul.c |  795 +++++++++++++++++++++++++++++++++++++++++++
 virt/kvm/arm/vgic.c         |  745 +---------------------------------------
 4 files changed, 799 insertions(+), 743 deletions(-)
 create mode 100644 virt/kvm/arm/vgic-v2-emul.c

diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
index f7057ed..443b8be 100644
--- a/arch/arm/kvm/Makefile
+++ b/arch/arm/kvm/Makefile
@@ -22,4 +22,5 @@ obj-y += arm.o handle_exit.o guest.o mmu.o emulate.o reset.o
 obj-y += coproc.o coproc_a15.o coproc_a7.o mmio.o psci.o perf.o
 obj-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic.o
 obj-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v2.o
+obj-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v2-emul.o
 obj-$(CONFIG_KVM_ARM_TIMER) += $(KVM)/arm/arch_timer.o
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 32a0961..d957353 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -21,6 +21,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += guest.o reset.o sys_regs.o sys_regs_generic_v8.o
 
 kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic.o
 kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v2.o
+kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v2-emul.o
 kvm-$(CONFIG_KVM_ARM_VGIC) += vgic-v2-switch.o
 kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v3.o
 kvm-$(CONFIG_KVM_ARM_VGIC) += vgic-v3-switch.o
diff --git a/virt/kvm/arm/vgic-v2-emul.c b/virt/kvm/arm/vgic-v2-emul.c
new file mode 100644
index 0000000..e64f215
--- /dev/null
+++ b/virt/kvm/arm/vgic-v2-emul.c
@@ -0,0 +1,795 @@
+/*
+ * Contains GICv2 specific emulation code, was in vgic.c before.
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ * Author: Marc Zyngier <marc.zyngier@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/cpu.h>
+#include <linux/kvm.h>
+#include <linux/kvm_host.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/uaccess.h>
+
+#include <linux/irqchip/arm-gic.h>
+
+#include <asm/kvm_emulate.h>
+#include <asm/kvm_arm.h>
+#include <asm/kvm_mmu.h>
+
+#include "vgic.h"
+
+#define GICC_ARCH_VERSION_V2		0x2
+
+static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg);
+static u8 *vgic_get_sgi_sources(struct vgic_dist *dist, int vcpu_id, int sgi)
+{
+	return dist->irq_sgi_sources + vcpu_id * VGIC_NR_SGIS + sgi;
+}
+
+static bool handle_mmio_misc(struct kvm_vcpu *vcpu,
+			     struct kvm_exit_mmio *mmio, phys_addr_t offset)
+{
+	u32 reg;
+	u32 word_offset = offset & 3;
+
+	switch (offset & ~3) {
+	case 0:			/* GICD_CTLR */
+		reg = vcpu->kvm->arch.vgic.enabled;
+		vgic_reg_access(mmio, &reg, word_offset,
+				ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
+		if (mmio->is_write) {
+			vcpu->kvm->arch.vgic.enabled = reg & 1;
+			vgic_update_state(vcpu->kvm);
+			return true;
+		}
+		break;
+
+	case 4:			/* GICD_TYPER */
+		reg  = (atomic_read(&vcpu->kvm->online_vcpus) - 1) << 5;
+		reg |= (vcpu->kvm->arch.vgic.nr_irqs >> 5) - 1;
+		vgic_reg_access(mmio, &reg, word_offset,
+				ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
+		break;
+
+	case 8:			/* GICD_IIDR */
+		reg = (PRODUCT_ID_KVM << 24) | (IMPLEMENTER_ARM << 0);
+		vgic_reg_access(mmio, &reg, word_offset,
+				ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
+		break;
+	}
+
+	return false;
+}
+
+static bool handle_mmio_set_enable_reg(struct kvm_vcpu *vcpu,
+				       struct kvm_exit_mmio *mmio,
+				       phys_addr_t offset)
+{
+	return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
+				      vcpu->vcpu_id, ACCESS_WRITE_SETBIT);
+}
+
+static bool handle_mmio_clear_enable_reg(struct kvm_vcpu *vcpu,
+					 struct kvm_exit_mmio *mmio,
+					 phys_addr_t offset)
+{
+	return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
+				      vcpu->vcpu_id, ACCESS_WRITE_CLEARBIT);
+}
+
+static bool handle_mmio_set_pending_reg(struct kvm_vcpu *vcpu,
+					struct kvm_exit_mmio *mmio,
+					phys_addr_t offset)
+{
+	return vgic_handle_set_pending_reg(vcpu->kvm, mmio, offset,
+					   vcpu->vcpu_id);
+}
+
+static bool handle_mmio_clear_pending_reg(struct kvm_vcpu *vcpu,
+					  struct kvm_exit_mmio *mmio,
+					  phys_addr_t offset)
+{
+	return vgic_handle_clear_pending_reg(vcpu->kvm, mmio, offset,
+					     vcpu->vcpu_id);
+}
+
+static bool handle_mmio_priority_reg(struct kvm_vcpu *vcpu,
+				     struct kvm_exit_mmio *mmio,
+				     phys_addr_t offset)
+{
+	u32 *reg = vgic_bytemap_get_reg(&vcpu->kvm->arch.vgic.irq_priority,
+					vcpu->vcpu_id, offset);
+	vgic_reg_access(mmio, reg, offset,
+			ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
+	return false;
+}
+
+#define GICD_ITARGETSR_SIZE	32
+#define GICD_CPUTARGETS_BITS	8
+#define GICD_IRQS_PER_ITARGETSR	(GICD_ITARGETSR_SIZE / GICD_CPUTARGETS_BITS)
+static u32 vgic_get_target_reg(struct kvm *kvm, int irq)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	int i;
+	u32 val = 0;
+
+	irq -= VGIC_NR_PRIVATE_IRQS;
+
+	for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++)
+		val |= 1 << (dist->irq_spi_cpu[irq + i] + i * 8);
+
+	return val;
+}
+
+static void vgic_set_target_reg(struct kvm *kvm, u32 val, int irq)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	struct kvm_vcpu *vcpu;
+	int i, c;
+	unsigned long *bmap;
+	u32 target;
+
+	irq -= VGIC_NR_PRIVATE_IRQS;
+
+	/*
+	 * Pick the LSB in each byte. This ensures we target exactly
+	 * one vcpu per IRQ. If the byte is null, assume we target
+	 * CPU0.
+	 */
+	for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++) {
+		int shift = i * GICD_CPUTARGETS_BITS;
+
+		target = ffs((val >> shift) & 0xffU);
+		target = target ? (target - 1) : 0;
+		dist->irq_spi_cpu[irq + i] = target;
+		kvm_for_each_vcpu(c, vcpu, kvm) {
+			bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]);
+			if (c == target)
+				set_bit(irq + i, bmap);
+			else
+				clear_bit(irq + i, bmap);
+		}
+	}
+}
+
+static bool handle_mmio_target_reg(struct kvm_vcpu *vcpu,
+				   struct kvm_exit_mmio *mmio,
+				   phys_addr_t offset)
+{
+	u32 reg;
+
+	/* We treat the banked interrupts targets as read-only */
+	if (offset < 32) {
+		u32 roreg;
+
+		roreg = 1 << vcpu->vcpu_id;
+		roreg |= roreg << 8;
+		roreg |= roreg << 16;
+
+		vgic_reg_access(mmio, &roreg, offset,
+				ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
+		return false;
+	}
+
+	reg = vgic_get_target_reg(vcpu->kvm, offset & ~3U);
+	vgic_reg_access(mmio, &reg, offset,
+			ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
+	if (mmio->is_write) {
+		vgic_set_target_reg(vcpu->kvm, reg, offset & ~3U);
+		vgic_update_state(vcpu->kvm);
+		return true;
+	}
+
+	return false;
+}
+
+static bool handle_mmio_cfg_reg(struct kvm_vcpu *vcpu,
+				struct kvm_exit_mmio *mmio, phys_addr_t offset)
+{
+	u32 *reg;
+
+	reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_cfg,
+				  vcpu->vcpu_id, offset >> 1);
+
+	return vgic_handle_cfg_reg(reg, mmio, offset);
+}
+
+static bool handle_mmio_sgi_reg(struct kvm_vcpu *vcpu,
+				struct kvm_exit_mmio *mmio, phys_addr_t offset)
+{
+	u32 reg;
+
+	vgic_reg_access(mmio, &reg, offset,
+			ACCESS_READ_RAZ | ACCESS_WRITE_VALUE);
+	if (mmio->is_write) {
+		vgic_dispatch_sgi(vcpu, reg);
+		vgic_update_state(vcpu->kvm);
+		return true;
+	}
+
+	return false;
+}
+
+/* Handle reads of GICD_CPENDSGIRn and GICD_SPENDSGIRn */
+static bool read_set_clear_sgi_pend_reg(struct kvm_vcpu *vcpu,
+					struct kvm_exit_mmio *mmio,
+					phys_addr_t offset)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	int sgi;
+	int min_sgi = (offset & ~0x3);
+	int max_sgi = min_sgi + 3;
+	int vcpu_id = vcpu->vcpu_id;
+	u32 reg = 0;
+
+	/* Copy source SGIs from distributor side */
+	for (sgi = min_sgi; sgi <= max_sgi; sgi++) {
+		u8 sources = *vgic_get_sgi_sources(dist, vcpu_id, sgi);
+
+		reg |= ((u32)sources) << (8 * (sgi - min_sgi));
+	}
+
+	mmio_data_write(mmio, ~0, reg);
+	return false;
+}
+
+static bool write_set_clear_sgi_pend_reg(struct kvm_vcpu *vcpu,
+					 struct kvm_exit_mmio *mmio,
+					 phys_addr_t offset, bool set)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	int sgi;
+	int min_sgi = (offset & ~0x3);
+	int max_sgi = min_sgi + 3;
+	int vcpu_id = vcpu->vcpu_id;
+	u32 reg;
+	bool updated = false;
+
+	reg = mmio_data_read(mmio, ~0);
+
+	/* Clear pending SGIs on the distributor */
+	for (sgi = min_sgi; sgi <= max_sgi; sgi++) {
+		u8 mask = reg >> (8 * (sgi - min_sgi));
+		u8 *src = vgic_get_sgi_sources(dist, vcpu_id, sgi);
+
+		if (set) {
+			if ((*src & mask) != mask)
+				updated = true;
+			*src |= mask;
+		} else {
+			if (*src & mask)
+				updated = true;
+			*src &= ~mask;
+		}
+	}
+
+	if (updated)
+		vgic_update_state(vcpu->kvm);
+
+	return updated;
+}
+
+static bool handle_mmio_sgi_set(struct kvm_vcpu *vcpu,
+				struct kvm_exit_mmio *mmio,
+				phys_addr_t offset)
+{
+	if (!mmio->is_write)
+		return read_set_clear_sgi_pend_reg(vcpu, mmio, offset);
+	else
+		return write_set_clear_sgi_pend_reg(vcpu, mmio, offset, true);
+}
+
+static bool handle_mmio_sgi_clear(struct kvm_vcpu *vcpu,
+				  struct kvm_exit_mmio *mmio,
+				  phys_addr_t offset)
+{
+	if (!mmio->is_write)
+		return read_set_clear_sgi_pend_reg(vcpu, mmio, offset);
+	else
+		return write_set_clear_sgi_pend_reg(vcpu, mmio, offset, false);
+}
+
+static const struct mmio_range vgic_dist_ranges[] = {
+	{
+		.base		= GIC_DIST_CTRL,
+		.len		= 12,
+		.bits_per_irq	= 0,
+		.handle_mmio	= handle_mmio_misc,
+	},
+	{
+		.base		= GIC_DIST_IGROUP,
+		.len		= VGIC_MAX_IRQS / 8,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{
+		.base		= GIC_DIST_ENABLE_SET,
+		.len		= VGIC_MAX_IRQS / 8,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_set_enable_reg,
+	},
+	{
+		.base		= GIC_DIST_ENABLE_CLEAR,
+		.len		= VGIC_MAX_IRQS / 8,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_clear_enable_reg,
+	},
+	{
+		.base		= GIC_DIST_PENDING_SET,
+		.len		= VGIC_MAX_IRQS / 8,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_set_pending_reg,
+	},
+	{
+		.base		= GIC_DIST_PENDING_CLEAR,
+		.len		= VGIC_MAX_IRQS / 8,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_clear_pending_reg,
+	},
+	{
+		.base		= GIC_DIST_ACTIVE_SET,
+		.len		= VGIC_MAX_IRQS / 8,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{
+		.base		= GIC_DIST_ACTIVE_CLEAR,
+		.len		= VGIC_MAX_IRQS / 8,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{
+		.base		= GIC_DIST_PRI,
+		.len		= VGIC_MAX_IRQS,
+		.bits_per_irq	= 8,
+		.handle_mmio	= handle_mmio_priority_reg,
+	},
+	{
+		.base		= GIC_DIST_TARGET,
+		.len		= VGIC_MAX_IRQS,
+		.bits_per_irq	= 8,
+		.handle_mmio	= handle_mmio_target_reg,
+	},
+	{
+		.base		= GIC_DIST_CONFIG,
+		.len		= VGIC_MAX_IRQS / 4,
+		.bits_per_irq	= 2,
+		.handle_mmio	= handle_mmio_cfg_reg,
+	},
+	{
+		.base		= GIC_DIST_SOFTINT,
+		.len		= 4,
+		.handle_mmio	= handle_mmio_sgi_reg,
+	},
+	{
+		.base		= GIC_DIST_SGI_PENDING_CLEAR,
+		.len		= VGIC_NR_SGIS,
+		.handle_mmio	= handle_mmio_sgi_clear,
+	},
+	{
+		.base		= GIC_DIST_SGI_PENDING_SET,
+		.len		= VGIC_NR_SGIS,
+		.handle_mmio	= handle_mmio_sgi_set,
+	},
+	{}
+};
+
+static bool vgic_v2_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
+				struct kvm_exit_mmio *mmio)
+{
+	unsigned long base = vcpu->kvm->arch.vgic.vgic_dist_base;
+
+	if (!is_in_range(mmio->phys_addr, mmio->len, base,
+			 KVM_VGIC_V2_DIST_SIZE))
+		return false;
+
+	/* GICv2 does not support accesses wider than 32 bits */
+	if (mmio->len > 4) {
+		kvm_inject_dabt(vcpu, mmio->phys_addr);
+		return true;
+	}
+
+	return vgic_handle_mmio_range(vcpu, run, mmio, vgic_dist_ranges, base);
+}
+
+static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg)
+{
+	struct kvm *kvm = vcpu->kvm;
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	int nrcpus = atomic_read(&kvm->online_vcpus);
+	u8 target_cpus;
+	int sgi, mode, c, vcpu_id;
+
+	vcpu_id = vcpu->vcpu_id;
+
+	sgi = reg & 0xf;
+	target_cpus = (reg >> 16) & 0xff;
+	mode = (reg >> 24) & 3;
+
+	switch (mode) {
+	case 0:
+		if (!target_cpus)
+			return;
+		break;
+
+	case 1:
+		target_cpus = ((1 << nrcpus) - 1) & ~(1 << vcpu_id) & 0xff;
+		break;
+
+	case 2:
+		target_cpus = 1 << vcpu_id;
+		break;
+	}
+
+	kvm_for_each_vcpu(c, vcpu, kvm) {
+		if (target_cpus & 1) {
+			/* Flag the SGI as pending */
+			vgic_dist_irq_set_pending(vcpu, sgi);
+			*vgic_get_sgi_sources(dist, c, sgi) |= 1 << vcpu_id;
+			kvm_debug("SGI%d from CPU%d to CPU%d\n",
+				  sgi, vcpu_id, c);
+		}
+
+		target_cpus >>= 1;
+	}
+}
+
+static bool vgic_v2_queue_sgi(struct kvm_vcpu *vcpu, int irq)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	unsigned long sources;
+	int vcpu_id = vcpu->vcpu_id;
+	int c;
+
+	sources = *vgic_get_sgi_sources(dist, vcpu_id, irq);
+
+	for_each_set_bit(c, &sources, dist->nr_cpus) {
+		if (vgic_queue_irq(vcpu, c, irq))
+			clear_bit(c, &sources);
+	}
+
+	*vgic_get_sgi_sources(dist, vcpu_id, irq) = sources;
+
+	/*
+	 * If the sources bitmap has been cleared it means that we
+	 * could queue all the SGIs onto link registers (see the
+	 * clear_bit above), and therefore we are done with them in
+	 * our emulated gic and can get rid of them.
+	 */
+	if (!sources) {
+		vgic_dist_irq_clear_pending(vcpu, irq);
+		vgic_cpu_irq_clear(vcpu, irq);
+		return true;
+	}
+
+	return false;
+}
+
+static int vgic_v2_init(struct kvm *kvm, const struct vgic_params *params)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	int ret, i;
+
+	if (IS_VGIC_ADDR_UNDEF(dist->vgic_dist_base) ||
+	    IS_VGIC_ADDR_UNDEF(dist->vgic_cpu_base)) {
+		kvm_err("Need to set vgic distributor addresses first\n");
+		return -ENXIO;
+	}
+
+	ret = kvm_phys_addr_ioremap(kvm, dist->vgic_cpu_base,
+				    params->vcpu_base,
+				    KVM_VGIC_V2_CPU_SIZE, true);
+	if (ret) {
+		kvm_err("Unable to remap VGIC CPU to VCPU\n");
+		return ret;
+	}
+
+	for (i = VGIC_NR_PRIVATE_IRQS; i < dist->nr_irqs; i += 4)
+		vgic_set_target_reg(kvm, 0, i);
+
+	return 0;
+}
+
+static void vgic_v2_add_sgi_source(struct kvm_vcpu *vcpu, int irq, int source)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+	*vgic_get_sgi_sources(dist, vcpu->vcpu_id, irq) |= 1 << source;
+}
+
+bool vgic_v2_init_emulation_ops(struct kvm *kvm, int type)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+
+	switch (type) {
+	case KVM_DEV_TYPE_ARM_VGIC_V2:
+		dist->vm_ops.handle_mmio = vgic_v2_handle_mmio;
+		dist->vm_ops.queue_sgi = vgic_v2_queue_sgi;
+		dist->vm_ops.add_sgi_source = vgic_v2_add_sgi_source;
+		dist->vm_ops.vgic_init = vgic_v2_init;
+		return true;
+	}
+	return false;
+}
+
+static bool handle_cpu_mmio_misc(struct kvm_vcpu *vcpu,
+				 struct kvm_exit_mmio *mmio, phys_addr_t offset)
+{
+	bool updated = false;
+	struct vgic_vmcr vmcr;
+	u32 *vmcr_field;
+	u32 reg;
+
+	vgic_get_vmcr(vcpu, &vmcr);
+
+	switch (offset & ~0x3) {
+	case GIC_CPU_CTRL:
+		vmcr_field = &vmcr.ctlr;
+		break;
+	case GIC_CPU_PRIMASK:
+		vmcr_field = &vmcr.pmr;
+		break;
+	case GIC_CPU_BINPOINT:
+		vmcr_field = &vmcr.bpr;
+		break;
+	case GIC_CPU_ALIAS_BINPOINT:
+		vmcr_field = &vmcr.abpr;
+		break;
+	default:
+		BUG();
+	}
+
+	if (!mmio->is_write) {
+		reg = *vmcr_field;
+		mmio_data_write(mmio, ~0, reg);
+	} else {
+		reg = mmio_data_read(mmio, ~0);
+		if (reg != *vmcr_field) {
+			*vmcr_field = reg;
+			vgic_set_vmcr(vcpu, &vmcr);
+			updated = true;
+		}
+	}
+	return updated;
+}
+
+static bool handle_mmio_abpr(struct kvm_vcpu *vcpu,
+			     struct kvm_exit_mmio *mmio, phys_addr_t offset)
+{
+	return handle_cpu_mmio_misc(vcpu, mmio, GIC_CPU_ALIAS_BINPOINT);
+}
+
+static bool handle_cpu_mmio_ident(struct kvm_vcpu *vcpu,
+				  struct kvm_exit_mmio *mmio,
+				  phys_addr_t offset)
+{
+	u32 reg;
+
+	if (mmio->is_write)
+		return false;
+
+	/* GICC_IIDR */
+	reg = (PRODUCT_ID_KVM << 20) |
+	      (GICC_ARCH_VERSION_V2 << 16) |
+	      (IMPLEMENTER_ARM << 0);
+	mmio_data_write(mmio, ~0, reg);
+	return false;
+}
+
+/*
+ * CPU Interface Register accesses - these are not accessed by the VM, but by
+ * user space for saving and restoring VGIC state.
+ */
+static const struct mmio_range vgic_cpu_ranges[] = {
+	{
+		.base		= GIC_CPU_CTRL,
+		.len		= 12,
+		.handle_mmio	= handle_cpu_mmio_misc,
+	},
+	{
+		.base		= GIC_CPU_ALIAS_BINPOINT,
+		.len		= 4,
+		.handle_mmio	= handle_mmio_abpr,
+	},
+	{
+		.base		= GIC_CPU_ACTIVEPRIO,
+		.len		= 16,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{
+		.base		= GIC_CPU_IDENT,
+		.len		= 4,
+		.handle_mmio	= handle_cpu_mmio_ident,
+	},
+};
+
+static int vgic_attr_regs_access(struct kvm_device *dev,
+				 struct kvm_device_attr *attr,
+				 u32 *reg, bool is_write)
+{
+	const struct mmio_range *r = NULL, *ranges;
+	phys_addr_t offset;
+	int ret, cpuid, c;
+	struct kvm_vcpu *vcpu, *tmp_vcpu;
+	struct vgic_dist *vgic;
+	struct kvm_exit_mmio mmio;
+
+	offset = attr->attr & KVM_DEV_ARM_VGIC_OFFSET_MASK;
+	cpuid = (attr->attr & KVM_DEV_ARM_VGIC_CPUID_MASK) >>
+		KVM_DEV_ARM_VGIC_CPUID_SHIFT;
+
+	mutex_lock(&dev->kvm->lock);
+
+	ret = vgic_init_maps(dev->kvm);
+	if (ret)
+		goto out;
+
+	if (cpuid >= atomic_read(&dev->kvm->online_vcpus)) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	vcpu = kvm_get_vcpu(dev->kvm, cpuid);
+	vgic = &dev->kvm->arch.vgic;
+
+	mmio.len = 4;
+	mmio.is_write = is_write;
+	if (is_write)
+		mmio_data_write(&mmio, ~0, *reg);
+	switch (attr->group) {
+	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
+		mmio.phys_addr = vgic->vgic_dist_base + offset;
+		ranges = vgic_dist_ranges;
+		break;
+	case KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
+		mmio.phys_addr = vgic->vgic_cpu_base + offset;
+		ranges = vgic_cpu_ranges;
+		break;
+	default:
+		BUG();
+	}
+	r = vgic_find_matching_range(ranges, &mmio, offset);
+
+	if (unlikely(!r || !r->handle_mmio)) {
+		ret = -ENXIO;
+		goto out;
+	}
+
+
+	spin_lock(&vgic->lock);
+
+	/*
+	 * Ensure that no other VCPU is running by checking the vcpu->cpu
+	 * field.  If no other VPCUs are running we can safely access the VGIC
+	 * state, because even if another VPU is run after this point, that
+	 * VCPU will not touch the vgic state, because it will block on
+	 * getting the vgic->lock in kvm_vgic_sync_hwstate().
+	 */
+	kvm_for_each_vcpu(c, tmp_vcpu, dev->kvm) {
+		if (unlikely(tmp_vcpu->cpu != -1)) {
+			ret = -EBUSY;
+			goto out_vgic_unlock;
+		}
+	}
+
+	/*
+	 * Move all pending IRQs from the LRs on all VCPUs so the pending
+	 * state can be properly represented in the register state accessible
+	 * through this API.
+	 */
+	kvm_for_each_vcpu(c, tmp_vcpu, dev->kvm)
+		vgic_unqueue_irqs(tmp_vcpu);
+
+	offset -= r->base;
+	r->handle_mmio(vcpu, &mmio, offset);
+
+	if (!is_write)
+		*reg = mmio_data_read(&mmio, ~0);
+
+	ret = 0;
+out_vgic_unlock:
+	spin_unlock(&vgic->lock);
+out:
+	mutex_unlock(&dev->kvm->lock);
+	return ret;
+}
+
+static int vgic_v2_set_attr(struct kvm_device *dev,
+			    struct kvm_device_attr *attr)
+{
+	int ret;
+
+	ret = vgic_set_common_attr(dev, attr);
+	if (ret != -ENXIO)
+		return ret;
+
+	switch (attr->group) {
+	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
+	case KVM_DEV_ARM_VGIC_GRP_CPU_REGS: {
+		u32 __user *uaddr = (u32 __user *)(long)attr->addr;
+		u32 reg;
+
+		if (get_user(reg, uaddr))
+			return -EFAULT;
+
+		return vgic_attr_regs_access(dev, attr, &reg, true);
+	}
+
+	}
+
+	return -ENXIO;
+}
+
+static int vgic_v2_get_attr(struct kvm_device *dev,
+			    struct kvm_device_attr *attr)
+{
+	int ret;
+
+	ret = vgic_get_common_attr(dev, attr);
+	if (ret != -ENXIO)
+		return ret;
+
+	switch (attr->group) {
+	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
+	case KVM_DEV_ARM_VGIC_GRP_CPU_REGS: {
+		u32 __user *uaddr = (u32 __user *)(long)attr->addr;
+		u32 reg = 0;
+
+		ret = vgic_attr_regs_access(dev, attr, &reg, false);
+		if (ret)
+			return ret;
+		return put_user(reg, uaddr);
+	}
+
+	}
+
+	return -ENXIO;
+}
+
+static int vgic_v2_has_attr(struct kvm_device *dev,
+			    struct kvm_device_attr *attr)
+{
+	phys_addr_t offset;
+
+	switch (attr->group) {
+	case KVM_DEV_ARM_VGIC_GRP_ADDR:
+		switch (attr->attr) {
+		case KVM_VGIC_V2_ADDR_TYPE_DIST:
+		case KVM_VGIC_V2_ADDR_TYPE_CPU:
+			return 0;
+		}
+		break;
+	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
+		offset = attr->attr & KVM_DEV_ARM_VGIC_OFFSET_MASK;
+		return vgic_has_attr_regs(vgic_dist_ranges, offset);
+	case KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
+		offset = attr->attr & KVM_DEV_ARM_VGIC_OFFSET_MASK;
+		return vgic_has_attr_regs(vgic_cpu_ranges, offset);
+	case KVM_DEV_ARM_VGIC_GRP_NR_IRQS:
+		return 0;
+	}
+	return -ENXIO;
+}
+
+struct kvm_device_ops kvm_arm_vgic_v2_ops = {
+	.name = "kvm-arm-vgic-v2",
+	.create = vgic_create,
+	.destroy = vgic_destroy,
+	.set_attr = vgic_v2_set_attr,
+	.get_attr = vgic_v2_get_attr,
+	.has_attr = vgic_v2_has_attr,
+};
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 8f1e6ee..eda4cf5 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -81,8 +81,6 @@
 
 static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
 static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu);
-static u8 *vgic_get_sgi_sources(struct vgic_dist *dist, int vcpu_id, int sgi);
-static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg);
 static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr);
 static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc);
 
@@ -408,41 +406,6 @@ void vgic_reg_access(struct kvm_exit_mmio *mmio, u32 *reg,
 	}
 }
 
-static bool handle_mmio_misc(struct kvm_vcpu *vcpu,
-			     struct kvm_exit_mmio *mmio, phys_addr_t offset)
-{
-	u32 reg;
-	u32 word_offset = offset & 3;
-
-	switch (offset & ~3) {
-	case 0:			/* GICD_CTLR */
-		reg = vcpu->kvm->arch.vgic.enabled;
-		vgic_reg_access(mmio, &reg, word_offset,
-				ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
-		if (mmio->is_write) {
-			vcpu->kvm->arch.vgic.enabled = reg & 1;
-			vgic_update_state(vcpu->kvm);
-			return true;
-		}
-		break;
-
-	case 4:			/* GICD_TYPER */
-		reg  = (atomic_read(&vcpu->kvm->online_vcpus) - 1) << 5;
-		reg |= (vcpu->kvm->arch.vgic.nr_irqs >> 5) - 1;
-		vgic_reg_access(mmio, &reg, word_offset,
-				ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
-		break;
-
-	case 8:			/* GICD_IIDR */
-		reg = (PRODUCT_ID_KVM << 24) | (IMPLEMENTER_ARM << 0);
-		vgic_reg_access(mmio, &reg, word_offset,
-				ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
-		break;
-	}
-
-	return false;
-}
-
 bool handle_mmio_raz_wi(struct kvm_vcpu *vcpu, struct kvm_exit_mmio *mmio,
 			phys_addr_t offset)
 {
@@ -473,22 +436,6 @@ bool vgic_handle_enable_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
 	return false;
 }
 
-static bool handle_mmio_set_enable_reg(struct kvm_vcpu *vcpu,
-				       struct kvm_exit_mmio *mmio,
-				       phys_addr_t offset)
-{
-	return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
-				      vcpu->vcpu_id, ACCESS_WRITE_SETBIT);
-}
-
-static bool handle_mmio_clear_enable_reg(struct kvm_vcpu *vcpu,
-					 struct kvm_exit_mmio *mmio,
-					 phys_addr_t offset)
-{
-	return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
-				      vcpu->vcpu_id, ACCESS_WRITE_CLEARBIT);
-}
-
 bool vgic_handle_set_pending_reg(struct kvm *kvm,
 				 struct kvm_exit_mmio *mmio,
 				 phys_addr_t offset, int vcpu_id)
@@ -562,109 +509,6 @@ bool vgic_handle_clear_pending_reg(struct kvm *kvm,
 	return false;
 }
 
-static bool handle_mmio_set_pending_reg(struct kvm_vcpu *vcpu,
-					struct kvm_exit_mmio *mmio,
-					phys_addr_t offset)
-{
-	return vgic_handle_set_pending_reg(vcpu->kvm, mmio, offset,
-					   vcpu->vcpu_id);
-}
-
-static bool handle_mmio_clear_pending_reg(struct kvm_vcpu *vcpu,
-					  struct kvm_exit_mmio *mmio,
-					  phys_addr_t offset)
-{
-	return vgic_handle_clear_pending_reg(vcpu->kvm, mmio, offset,
-					     vcpu->vcpu_id);
-}
-
-static bool handle_mmio_priority_reg(struct kvm_vcpu *vcpu,
-				     struct kvm_exit_mmio *mmio,
-				     phys_addr_t offset)
-{
-	u32 *reg = vgic_bytemap_get_reg(&vcpu->kvm->arch.vgic.irq_priority,
-					vcpu->vcpu_id, offset);
-	vgic_reg_access(mmio, reg, offset,
-			ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
-	return false;
-}
-
-#define GICD_ITARGETSR_SIZE	32
-#define GICD_CPUTARGETS_BITS	8
-#define GICD_IRQS_PER_ITARGETSR	(GICD_ITARGETSR_SIZE / GICD_CPUTARGETS_BITS)
-static u32 vgic_get_target_reg(struct kvm *kvm, int irq)
-{
-	struct vgic_dist *dist = &kvm->arch.vgic;
-	int i;
-	u32 val = 0;
-
-	irq -= VGIC_NR_PRIVATE_IRQS;
-
-	for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++)
-		val |= 1 << (dist->irq_spi_cpu[irq + i] + i * 8);
-
-	return val;
-}
-
-static void vgic_set_target_reg(struct kvm *kvm, u32 val, int irq)
-{
-	struct vgic_dist *dist = &kvm->arch.vgic;
-	struct kvm_vcpu *vcpu;
-	int i, c;
-	unsigned long *bmap;
-	u32 target;
-
-	irq -= VGIC_NR_PRIVATE_IRQS;
-
-	/*
-	 * Pick the LSB in each byte. This ensures we target exactly
-	 * one vcpu per IRQ. If the byte is null, assume we target
-	 * CPU0.
-	 */
-	for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++) {
-		int shift = i * GICD_CPUTARGETS_BITS;
-		target = ffs((val >> shift) & 0xffU);
-		target = target ? (target - 1) : 0;
-		dist->irq_spi_cpu[irq + i] = target;
-		kvm_for_each_vcpu(c, vcpu, kvm) {
-			bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]);
-			if (c == target)
-				set_bit(irq + i, bmap);
-			else
-				clear_bit(irq + i, bmap);
-		}
-	}
-}
-
-static bool handle_mmio_target_reg(struct kvm_vcpu *vcpu,
-				   struct kvm_exit_mmio *mmio,
-				   phys_addr_t offset)
-{
-	u32 reg;
-
-	/* We treat the banked interrupts targets as read-only */
-	if (offset < 32) {
-		u32 roreg = 1 << vcpu->vcpu_id;
-		roreg |= roreg << 8;
-		roreg |= roreg << 16;
-
-		vgic_reg_access(mmio, &roreg, offset,
-				ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
-		return false;
-	}
-
-	reg = vgic_get_target_reg(vcpu->kvm, offset & ~3U);
-	vgic_reg_access(mmio, &reg, offset,
-			ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
-	if (mmio->is_write) {
-		vgic_set_target_reg(vcpu->kvm, reg, offset & ~3U);
-		vgic_update_state(vcpu->kvm);
-		return true;
-	}
-
-	return false;
-}
-
 static u32 vgic_cfg_expand(u16 val)
 {
 	u32 res = 0;
@@ -732,39 +576,6 @@ bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
 	return false;
 }
 
-static bool handle_mmio_cfg_reg(struct kvm_vcpu *vcpu,
-				struct kvm_exit_mmio *mmio, phys_addr_t offset)
-{
-	u32 *reg;
-
-	reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_cfg,
-				  vcpu->vcpu_id, offset >> 1);
-
-	return vgic_handle_cfg_reg(reg, mmio, offset);
-}
-
-static bool handle_mmio_sgi_reg(struct kvm_vcpu *vcpu,
-				struct kvm_exit_mmio *mmio, phys_addr_t offset)
-{
-	u32 reg;
-	vgic_reg_access(mmio, &reg, offset,
-			ACCESS_READ_RAZ | ACCESS_WRITE_VALUE);
-	if (mmio->is_write) {
-		vgic_dispatch_sgi(vcpu, reg);
-		vgic_update_state(vcpu->kvm);
-		return true;
-	}
-
-	return false;
-}
-
-static void vgic_v2_add_sgi_source(struct kvm_vcpu *vcpu, int irq, int source)
-{
-	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
-
-	*vgic_get_sgi_sources(dist, vcpu->vcpu_id, irq) |= 1 << source;
-}
-
 /**
  * vgic_unqueue_irqs - move pending IRQs from LRs to the distributor
  * @vgic_cpu: Pointer to the vgic_cpu struct holding the LRs
@@ -826,168 +637,6 @@ void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 	}
 }
 
-/* Handle reads of GICD_CPENDSGIRn and GICD_SPENDSGIRn */
-static bool read_set_clear_sgi_pend_reg(struct kvm_vcpu *vcpu,
-					struct kvm_exit_mmio *mmio,
-					phys_addr_t offset)
-{
-	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
-	int sgi;
-	int min_sgi = (offset & ~0x3);
-	int max_sgi = min_sgi + 3;
-	int vcpu_id = vcpu->vcpu_id;
-	u32 reg = 0;
-
-	/* Copy source SGIs from distributor side */
-	for (sgi = min_sgi; sgi <= max_sgi; sgi++) {
-		int shift = 8 * (sgi - min_sgi);
-		reg |= ((u32)*vgic_get_sgi_sources(dist, vcpu_id, sgi)) << shift;
-	}
-
-	mmio_data_write(mmio, ~0, reg);
-	return false;
-}
-
-static bool write_set_clear_sgi_pend_reg(struct kvm_vcpu *vcpu,
-					 struct kvm_exit_mmio *mmio,
-					 phys_addr_t offset, bool set)
-{
-	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
-	int sgi;
-	int min_sgi = (offset & ~0x3);
-	int max_sgi = min_sgi + 3;
-	int vcpu_id = vcpu->vcpu_id;
-	u32 reg;
-	bool updated = false;
-
-	reg = mmio_data_read(mmio, ~0);
-
-	/* Clear pending SGIs on the distributor */
-	for (sgi = min_sgi; sgi <= max_sgi; sgi++) {
-		u8 mask = reg >> (8 * (sgi - min_sgi));
-		u8 *src = vgic_get_sgi_sources(dist, vcpu_id, sgi);
-		if (set) {
-			if ((*src & mask) != mask)
-				updated = true;
-			*src |= mask;
-		} else {
-			if (*src & mask)
-				updated = true;
-			*src &= ~mask;
-		}
-	}
-
-	if (updated)
-		vgic_update_state(vcpu->kvm);
-
-	return updated;
-}
-
-static bool handle_mmio_sgi_set(struct kvm_vcpu *vcpu,
-				struct kvm_exit_mmio *mmio,
-				phys_addr_t offset)
-{
-	if (!mmio->is_write)
-		return read_set_clear_sgi_pend_reg(vcpu, mmio, offset);
-	else
-		return write_set_clear_sgi_pend_reg(vcpu, mmio, offset, true);
-}
-
-static bool handle_mmio_sgi_clear(struct kvm_vcpu *vcpu,
-				  struct kvm_exit_mmio *mmio,
-				  phys_addr_t offset)
-{
-	if (!mmio->is_write)
-		return read_set_clear_sgi_pend_reg(vcpu, mmio, offset);
-	else
-		return write_set_clear_sgi_pend_reg(vcpu, mmio, offset, false);
-}
-
-static const struct mmio_range vgic_dist_ranges[] = {
-	{
-		.base		= GIC_DIST_CTRL,
-		.len		= 12,
-		.bits_per_irq	= 0,
-		.handle_mmio	= handle_mmio_misc,
-	},
-	{
-		.base		= GIC_DIST_IGROUP,
-		.len		= VGIC_MAX_IRQS / 8,
-		.bits_per_irq	= 1,
-		.handle_mmio	= handle_mmio_raz_wi,
-	},
-	{
-		.base		= GIC_DIST_ENABLE_SET,
-		.len		= VGIC_MAX_IRQS / 8,
-		.bits_per_irq	= 1,
-		.handle_mmio	= handle_mmio_set_enable_reg,
-	},
-	{
-		.base		= GIC_DIST_ENABLE_CLEAR,
-		.len		= VGIC_MAX_IRQS / 8,
-		.bits_per_irq	= 1,
-		.handle_mmio	= handle_mmio_clear_enable_reg,
-	},
-	{
-		.base		= GIC_DIST_PENDING_SET,
-		.len		= VGIC_MAX_IRQS / 8,
-		.bits_per_irq	= 1,
-		.handle_mmio	= handle_mmio_set_pending_reg,
-	},
-	{
-		.base		= GIC_DIST_PENDING_CLEAR,
-		.len		= VGIC_MAX_IRQS / 8,
-		.bits_per_irq	= 1,
-		.handle_mmio	= handle_mmio_clear_pending_reg,
-	},
-	{
-		.base		= GIC_DIST_ACTIVE_SET,
-		.len		= VGIC_MAX_IRQS / 8,
-		.bits_per_irq	= 1,
-		.handle_mmio	= handle_mmio_raz_wi,
-	},
-	{
-		.base		= GIC_DIST_ACTIVE_CLEAR,
-		.len		= VGIC_MAX_IRQS / 8,
-		.bits_per_irq	= 1,
-		.handle_mmio	= handle_mmio_raz_wi,
-	},
-	{
-		.base		= GIC_DIST_PRI,
-		.len		= VGIC_MAX_IRQS,
-		.bits_per_irq	= 8,
-		.handle_mmio	= handle_mmio_priority_reg,
-	},
-	{
-		.base		= GIC_DIST_TARGET,
-		.len		= VGIC_MAX_IRQS,
-		.bits_per_irq	= 8,
-		.handle_mmio	= handle_mmio_target_reg,
-	},
-	{
-		.base		= GIC_DIST_CONFIG,
-		.len		= VGIC_MAX_IRQS / 4,
-		.bits_per_irq	= 2,
-		.handle_mmio	= handle_mmio_cfg_reg,
-	},
-	{
-		.base		= GIC_DIST_SOFTINT,
-		.len		= 4,
-		.handle_mmio	= handle_mmio_sgi_reg,
-	},
-	{
-		.base		= GIC_DIST_SGI_PENDING_CLEAR,
-		.len		= VGIC_NR_SGIS,
-		.handle_mmio	= handle_mmio_sgi_clear,
-	},
-	{
-		.base		= GIC_DIST_SGI_PENDING_SET,
-		.len		= VGIC_NR_SGIS,
-		.handle_mmio	= handle_mmio_sgi_set,
-	},
-	{}
-};
-
 const
 struct mmio_range *vgic_find_matching_range(const struct mmio_range *ranges,
 					    struct kvm_exit_mmio *mmio,
@@ -1110,24 +759,6 @@ bool vgic_handle_mmio_range(struct kvm_vcpu *vcpu, struct kvm_run *run,
 	return true;
 }
 
-static bool vgic_v2_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
-				struct kvm_exit_mmio *mmio)
-{
-	unsigned long base = vcpu->kvm->arch.vgic.vgic_dist_base;
-
-	if (!is_in_range(mmio->phys_addr, mmio->len, base,
-			 KVM_VGIC_V2_DIST_SIZE))
-		return false;
-
-	/* GICv2 does not support accesses wider than 32 bits */
-	if (mmio->len > 4) {
-		kvm_inject_dabt(vcpu, mmio->phys_addr);
-		return true;
-	}
-
-	return vgic_handle_mmio_range(vcpu, run, mmio, vgic_dist_ranges, base);
-}
-
 /**
  * vgic_handle_mmio - handle an in-kernel MMIO access for the GIC emulation
  * @vcpu:      pointer to the vcpu performing the access
@@ -1146,52 +777,6 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
 	return vgic_vm_op(vcpu->kvm, handle_mmio)(vcpu, run, mmio);
 }
 
-static u8 *vgic_get_sgi_sources(struct vgic_dist *dist, int vcpu_id, int sgi)
-{
-	return dist->irq_sgi_sources + vcpu_id * VGIC_NR_SGIS + sgi;
-}
-
-static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg)
-{
-	struct kvm *kvm = vcpu->kvm;
-	struct vgic_dist *dist = &kvm->arch.vgic;
-	int nrcpus = atomic_read(&kvm->online_vcpus);
-	u8 target_cpus;
-	int sgi, mode, c, vcpu_id;
-
-	vcpu_id = vcpu->vcpu_id;
-
-	sgi = reg & 0xf;
-	target_cpus = (reg >> 16) & 0xff;
-	mode = (reg >> 24) & 3;
-
-	switch (mode) {
-	case 0:
-		if (!target_cpus)
-			return;
-		break;
-
-	case 1:
-		target_cpus = ((1 << nrcpus) - 1) & ~(1 << vcpu_id) & 0xff;
-		break;
-
-	case 2:
-		target_cpus = 1 << vcpu_id;
-		break;
-	}
-
-	kvm_for_each_vcpu(c, vcpu, kvm) {
-		if (target_cpus & 1) {
-			/* Flag the SGI as pending */
-			vgic_dist_irq_set_pending(vcpu, sgi);
-			*vgic_get_sgi_sources(dist, c, sgi) |= 1 << vcpu_id;
-			kvm_debug("SGI%d from CPU%d to CPU%d\n", sgi, vcpu_id, c);
-		}
-
-		target_cpus >>= 1;
-	}
-}
-
 static int vgic_nr_shared_irqs(struct vgic_dist *dist)
 {
 	return dist->nr_irqs - VGIC_NR_PRIVATE_IRQS;
@@ -1345,6 +930,7 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
 /*
  * Queue an interrupt to a CPU virtual interface. Return true on success,
  * or false if it wasn't possible to queue it.
+ * sgi_source must be zero for any non-SGI interrupt.
  */
 bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 {
@@ -1395,37 +981,6 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 	return true;
 }
 
-static bool vgic_v2_queue_sgi(struct kvm_vcpu *vcpu, int irq)
-{
-	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
-	unsigned long sources;
-	int vcpu_id = vcpu->vcpu_id;
-	int c;
-
-	sources = *vgic_get_sgi_sources(dist, vcpu_id, irq);
-
-	for_each_set_bit(c, &sources, dist->nr_cpus) {
-		if (vgic_queue_irq(vcpu, c, irq))
-			clear_bit(c, &sources);
-	}
-
-	*vgic_get_sgi_sources(dist, vcpu_id, irq) = sources;
-
-	/*
-	 * If the sources bitmap has been cleared it means that we
-	 * could queue all the SGIs onto link registers (see the
-	 * clear_bit above), and therefore we are done with them in
-	 * our emulated gic and can get rid of them.
-	 */
-	if (!sources) {
-		vgic_dist_irq_clear_pending(vcpu, irq);
-		vgic_cpu_irq_clear(vcpu, irq);
-		return true;
-	}
-
-	return false;
-}
-
 static bool vgic_queue_hwirq(struct kvm_vcpu *vcpu, int irq)
 {
 	if (!vgic_can_sample_irq(vcpu, irq))
@@ -1932,31 +1487,6 @@ out:
 	return ret;
 }
 
-static int vgic_v2_init(struct kvm *kvm, const struct vgic_params *params)
-{
-	struct vgic_dist *dist = &kvm->arch.vgic;
-	int ret, i;
-
-	if (IS_VGIC_ADDR_UNDEF(dist->vgic_dist_base) ||
-	    IS_VGIC_ADDR_UNDEF(dist->vgic_cpu_base)) {
-		kvm_err("Need to set vgic distributor addresses first\n");
-		return -ENXIO;
-	}
-
-	ret = kvm_phys_addr_ioremap(kvm, dist->vgic_cpu_base,
-				    params->vcpu_base,
-				    KVM_VGIC_V2_CPU_SIZE, true);
-	if (ret) {
-		kvm_err("Unable to remap VGIC CPU to VCPU\n");
-		return ret;
-	}
-
-	for (i = VGIC_NR_PRIVATE_IRQS; i < dist->nr_irqs; i += 4)
-		vgic_set_target_reg(kvm, 0, i);
-
-	return 0;
-}
-
 /**
  * kvm_vgic_init - Initialize global VGIC state before running any VCPUs
  * @kvm: pointer to the kvm struct
@@ -2002,15 +1532,9 @@ out:
 
 static bool init_emulation_ops(struct kvm *kvm, int type)
 {
-	struct vgic_dist *dist = &kvm->arch.vgic;
-
 	switch (type) {
 	case KVM_DEV_TYPE_ARM_VGIC_V2:
-		dist->vm_ops.handle_mmio = vgic_v2_handle_mmio;
-		dist->vm_ops.queue_sgi = vgic_v2_queue_sgi;
-		dist->vm_ops.add_sgi_source = vgic_v2_add_sgi_source;
-		dist->vm_ops.vgic_init = vgic_v2_init;
-		return true;
+		return vgic_v2_init_emulation_ops(kvm, type);
 	}
 	return false;
 }
@@ -2152,188 +1676,6 @@ int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write)
 	return r;
 }
 
-static bool handle_cpu_mmio_misc(struct kvm_vcpu *vcpu,
-				 struct kvm_exit_mmio *mmio, phys_addr_t offset)
-{
-	bool updated = false;
-	struct vgic_vmcr vmcr;
-	u32 *vmcr_field;
-	u32 reg;
-
-	vgic_get_vmcr(vcpu, &vmcr);
-
-	switch (offset & ~0x3) {
-	case GIC_CPU_CTRL:
-		vmcr_field = &vmcr.ctlr;
-		break;
-	case GIC_CPU_PRIMASK:
-		vmcr_field = &vmcr.pmr;
-		break;
-	case GIC_CPU_BINPOINT:
-		vmcr_field = &vmcr.bpr;
-		break;
-	case GIC_CPU_ALIAS_BINPOINT:
-		vmcr_field = &vmcr.abpr;
-		break;
-	default:
-		BUG();
-	}
-
-	if (!mmio->is_write) {
-		reg = *vmcr_field;
-		mmio_data_write(mmio, ~0, reg);
-	} else {
-		reg = mmio_data_read(mmio, ~0);
-		if (reg != *vmcr_field) {
-			*vmcr_field = reg;
-			vgic_set_vmcr(vcpu, &vmcr);
-			updated = true;
-		}
-	}
-	return updated;
-}
-
-static bool handle_mmio_abpr(struct kvm_vcpu *vcpu,
-			     struct kvm_exit_mmio *mmio, phys_addr_t offset)
-{
-	return handle_cpu_mmio_misc(vcpu, mmio, GIC_CPU_ALIAS_BINPOINT);
-}
-
-static bool handle_cpu_mmio_ident(struct kvm_vcpu *vcpu,
-				  struct kvm_exit_mmio *mmio,
-				  phys_addr_t offset)
-{
-	u32 reg;
-
-	if (mmio->is_write)
-		return false;
-
-	/* GICC_IIDR */
-	reg = (PRODUCT_ID_KVM << 20) |
-	      (GICC_ARCH_VERSION_V2 << 16) |
-	      (IMPLEMENTER_ARM << 0);
-	mmio_data_write(mmio, ~0, reg);
-	return false;
-}
-
-/*
- * CPU Interface Register accesses - these are not accessed by the VM, but by
- * user space for saving and restoring VGIC state.
- */
-static const struct mmio_range vgic_cpu_ranges[] = {
-	{
-		.base		= GIC_CPU_CTRL,
-		.len		= 12,
-		.handle_mmio	= handle_cpu_mmio_misc,
-	},
-	{
-		.base		= GIC_CPU_ALIAS_BINPOINT,
-		.len		= 4,
-		.handle_mmio	= handle_mmio_abpr,
-	},
-	{
-		.base		= GIC_CPU_ACTIVEPRIO,
-		.len		= 16,
-		.handle_mmio	= handle_mmio_raz_wi,
-	},
-	{
-		.base		= GIC_CPU_IDENT,
-		.len		= 4,
-		.handle_mmio	= handle_cpu_mmio_ident,
-	},
-};
-
-static int vgic_attr_regs_access(struct kvm_device *dev,
-				 struct kvm_device_attr *attr,
-				 u32 *reg, bool is_write)
-{
-	const struct mmio_range *r = NULL, *ranges;
-	phys_addr_t offset;
-	int ret, cpuid, c;
-	struct kvm_vcpu *vcpu, *tmp_vcpu;
-	struct vgic_dist *vgic;
-	struct kvm_exit_mmio mmio;
-
-	offset = attr->attr & KVM_DEV_ARM_VGIC_OFFSET_MASK;
-	cpuid = (attr->attr & KVM_DEV_ARM_VGIC_CPUID_MASK) >>
-		KVM_DEV_ARM_VGIC_CPUID_SHIFT;
-
-	mutex_lock(&dev->kvm->lock);
-
-	ret = vgic_init_maps(dev->kvm);
-	if (ret)
-		goto out;
-
-	if (cpuid >= atomic_read(&dev->kvm->online_vcpus)) {
-		ret = -EINVAL;
-		goto out;
-	}
-
-	vcpu = kvm_get_vcpu(dev->kvm, cpuid);
-	vgic = &dev->kvm->arch.vgic;
-
-	mmio.len = 4;
-	mmio.is_write = is_write;
-	if (is_write)
-		mmio_data_write(&mmio, ~0, *reg);
-	switch (attr->group) {
-	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
-		mmio.phys_addr = vgic->vgic_dist_base + offset;
-		ranges = vgic_dist_ranges;
-		break;
-	case KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
-		mmio.phys_addr = vgic->vgic_cpu_base + offset;
-		ranges = vgic_cpu_ranges;
-		break;
-	default:
-		BUG();
-	}
-	r = vgic_find_matching_range(ranges, &mmio, offset);
-
-	if (unlikely(!r || !r->handle_mmio)) {
-		ret = -ENXIO;
-		goto out;
-	}
-
-
-	spin_lock(&vgic->lock);
-
-	/*
-	 * Ensure that no other VCPU is running by checking the vcpu->cpu
-	 * field.  If no other VPCUs are running we can safely access the VGIC
-	 * state, because even if another VPU is run after this point, that
-	 * VCPU will not touch the vgic state, because it will block on
-	 * getting the vgic->lock in kvm_vgic_sync_hwstate().
-	 */
-	kvm_for_each_vcpu(c, tmp_vcpu, dev->kvm) {
-		if (unlikely(tmp_vcpu->cpu != -1)) {
-			ret = -EBUSY;
-			goto out_vgic_unlock;
-		}
-	}
-
-	/*
-	 * Move all pending IRQs from the LRs on all VCPUs so the pending
-	 * state can be properly represented in the register state accessible
-	 * through this API.
-	 */
-	kvm_for_each_vcpu(c, tmp_vcpu, dev->kvm)
-		vgic_unqueue_irqs(tmp_vcpu);
-
-	offset -= r->base;
-	r->handle_mmio(vcpu, &mmio, offset);
-
-	if (!is_write)
-		*reg = mmio_data_read(&mmio, ~0);
-
-	ret = 0;
-out_vgic_unlock:
-	spin_unlock(&vgic->lock);
-out:
-	mutex_unlock(&dev->kvm->lock);
-	return ret;
-}
-
 int vgic_set_common_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 {
 	int r;
@@ -2386,31 +1728,6 @@ int vgic_set_common_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 	return -ENXIO;
 }
 
-static int vgic_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
-{
-	int ret;
-
-	ret = vgic_set_common_attr(dev, attr);
-	if (ret != -ENXIO)
-		return ret;
-
-	switch (attr->group) {
-	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
-	case KVM_DEV_ARM_VGIC_GRP_CPU_REGS: {
-		u32 __user *uaddr = (u32 __user *)(long)attr->addr;
-		u32 reg;
-
-		if (get_user(reg, uaddr))
-			return -EFAULT;
-
-		return vgic_attr_regs_access(dev, attr, &reg, true);
-	}
-
-	}
-
-	return -ENXIO;
-}
-
 int vgic_get_common_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 {
 	int r = -ENXIO;
@@ -2441,31 +1758,6 @@ int vgic_get_common_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 	return r;
 }
 
-static int vgic_get_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
-{
-	int ret;
-
-	ret = vgic_get_common_attr(dev, attr);
-	if (ret != -ENXIO)
-		return ret;
-
-	switch (attr->group) {
-	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
-	case KVM_DEV_ARM_VGIC_GRP_CPU_REGS: {
-		u32 __user *uaddr = (u32 __user *)(long)attr->addr;
-		u32 reg = 0;
-
-		ret = vgic_attr_regs_access(dev, attr, &reg, false);
-		if (ret)
-			return ret;
-		return put_user(reg, uaddr);
-	}
-
-	}
-
-	return -ENXIO;
-}
-
 int vgic_has_attr_regs(const struct mmio_range *ranges, phys_addr_t offset)
 {
 	struct kvm_exit_mmio dev_attr_mmio;
@@ -2477,30 +1769,6 @@ int vgic_has_attr_regs(const struct mmio_range *ranges, phys_addr_t offset)
 		return -ENXIO;
 }
 
-static int vgic_has_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
-{
-	phys_addr_t offset;
-
-	switch (attr->group) {
-	case KVM_DEV_ARM_VGIC_GRP_ADDR:
-		switch (attr->attr) {
-		case KVM_VGIC_V2_ADDR_TYPE_DIST:
-		case KVM_VGIC_V2_ADDR_TYPE_CPU:
-			return 0;
-		}
-		break;
-	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
-		offset = attr->attr & KVM_DEV_ARM_VGIC_OFFSET_MASK;
-		return vgic_has_attr_regs(vgic_dist_ranges, offset);
-	case KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
-		offset = attr->attr & KVM_DEV_ARM_VGIC_OFFSET_MASK;
-		return vgic_has_attr_regs(vgic_cpu_ranges, offset);
-	case KVM_DEV_ARM_VGIC_GRP_NR_IRQS:
-		return 0;
-	}
-	return -ENXIO;
-}
-
 void vgic_destroy(struct kvm_device *dev)
 {
 	kfree(dev);
@@ -2511,15 +1779,6 @@ int vgic_create(struct kvm_device *dev, u32 type)
 	return kvm_vgic_create(dev->kvm, type);
 }
 
-struct kvm_device_ops kvm_arm_vgic_v2_ops = {
-	.name = "kvm-arm-vgic",
-	.create = vgic_create,
-	.destroy = vgic_destroy,
-	.set_attr = vgic_set_attr,
-	.get_attr = vgic_get_attr,
-	.has_attr = vgic_has_attr,
-};
-
 static void vgic_init_maintenance_interrupt(void *info)
 {
 	enable_percpu_irq(vgic->maint_irq, 0);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v3 15/19] arm/arm64: KVM: add opaque private pointer to MMIO accessors
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
                   ` (13 preceding siblings ...)
  2014-10-31 17:26 ` [PATCH v3 14/19] arm/arm64: KVM: split GICv2 specific emulation code from vgic.c Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-04 15:44   ` Christoffer Dall
  2014-10-31 17:26 ` [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation Andre Przywara
                   ` (5 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

For a GICv2 there is always only one (v)CPU involved: the one that
does the access. On a GICv3 the access to a CPU redistributor is
memory-mapped, but not banked, so the (v)CPU affected is determined by
looking at the MMIO address region being accessed.
To allow passing the affected CPU into the accessors, extend them to
take an opaque private pointer parameter.
For the current GICv2 emulation we ignore it and simply pass NULL
in the call.
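
A minimal sketch (not part of this patch; the handler name and the
"target" variable are made up for illustration) of how an accessor can
use the new parameter, while GICv2 callers keep passing NULL - a later
patch passes the targeted redistributor's VCPU here:

static bool handle_mmio_example(struct kvm_vcpu *vcpu,
				struct kvm_exit_mmio *mmio,
				phys_addr_t offset, void *private)
{
	/* if a target VCPU was handed in, act on that one instead */
	struct kvm_vcpu *target = private ? private : vcpu;

	/* operate on target->vcpu_id rather than vcpu->vcpu_id */
	return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
				      target->vcpu_id, ACCESS_WRITE_SETBIT);
}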

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 virt/kvm/arm/vgic-v2-emul.c |   41 ++++++++++++++++++++++++-----------------
 virt/kvm/arm/vgic.c         |   15 ++++++++-------
 virt/kvm/arm/vgic.h         |    7 ++++---
 3 files changed, 36 insertions(+), 27 deletions(-)

diff --git a/virt/kvm/arm/vgic-v2-emul.c b/virt/kvm/arm/vgic-v2-emul.c
index e64f215..c2922e3 100644
--- a/virt/kvm/arm/vgic-v2-emul.c
+++ b/virt/kvm/arm/vgic-v2-emul.c
@@ -41,7 +41,8 @@ static u8 *vgic_get_sgi_sources(struct vgic_dist *dist, int vcpu_id, int sgi)
 }
 
 static bool handle_mmio_misc(struct kvm_vcpu *vcpu,
-			     struct kvm_exit_mmio *mmio, phys_addr_t offset)
+			     struct kvm_exit_mmio *mmio, phys_addr_t offset,
+			     void *private)
 {
 	u32 reg;
 	u32 word_offset = offset & 3;
@@ -77,7 +78,7 @@ static bool handle_mmio_misc(struct kvm_vcpu *vcpu,
 
 static bool handle_mmio_set_enable_reg(struct kvm_vcpu *vcpu,
 				       struct kvm_exit_mmio *mmio,
-				       phys_addr_t offset)
+				       phys_addr_t offset, void *private)
 {
 	return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
 				      vcpu->vcpu_id, ACCESS_WRITE_SETBIT);
@@ -85,7 +86,7 @@ static bool handle_mmio_set_enable_reg(struct kvm_vcpu *vcpu,
 
 static bool handle_mmio_clear_enable_reg(struct kvm_vcpu *vcpu,
 					 struct kvm_exit_mmio *mmio,
-					 phys_addr_t offset)
+					 phys_addr_t offset, void *private)
 {
 	return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
 				      vcpu->vcpu_id, ACCESS_WRITE_CLEARBIT);
@@ -93,7 +94,7 @@ static bool handle_mmio_clear_enable_reg(struct kvm_vcpu *vcpu,
 
 static bool handle_mmio_set_pending_reg(struct kvm_vcpu *vcpu,
 					struct kvm_exit_mmio *mmio,
-					phys_addr_t offset)
+					phys_addr_t offset, void *private)
 {
 	return vgic_handle_set_pending_reg(vcpu->kvm, mmio, offset,
 					   vcpu->vcpu_id);
@@ -101,7 +102,7 @@ static bool handle_mmio_set_pending_reg(struct kvm_vcpu *vcpu,
 
 static bool handle_mmio_clear_pending_reg(struct kvm_vcpu *vcpu,
 					  struct kvm_exit_mmio *mmio,
-					  phys_addr_t offset)
+					  phys_addr_t offset, void *private)
 {
 	return vgic_handle_clear_pending_reg(vcpu->kvm, mmio, offset,
 					     vcpu->vcpu_id);
@@ -109,7 +110,7 @@ static bool handle_mmio_clear_pending_reg(struct kvm_vcpu *vcpu,
 
 static bool handle_mmio_priority_reg(struct kvm_vcpu *vcpu,
 				     struct kvm_exit_mmio *mmio,
-				     phys_addr_t offset)
+				     phys_addr_t offset, void *private)
 {
 	u32 *reg = vgic_bytemap_get_reg(&vcpu->kvm->arch.vgic.irq_priority,
 					vcpu->vcpu_id, offset);
@@ -168,7 +169,7 @@ static void vgic_set_target_reg(struct kvm *kvm, u32 val, int irq)
 
 static bool handle_mmio_target_reg(struct kvm_vcpu *vcpu,
 				   struct kvm_exit_mmio *mmio,
-				   phys_addr_t offset)
+				   phys_addr_t offset, void *private)
 {
 	u32 reg;
 
@@ -198,7 +199,8 @@ static bool handle_mmio_target_reg(struct kvm_vcpu *vcpu,
 }
 
 static bool handle_mmio_cfg_reg(struct kvm_vcpu *vcpu,
-				struct kvm_exit_mmio *mmio, phys_addr_t offset)
+				struct kvm_exit_mmio *mmio, phys_addr_t offset,
+				void *private)
 {
 	u32 *reg;
 
@@ -209,7 +211,8 @@ static bool handle_mmio_cfg_reg(struct kvm_vcpu *vcpu,
 }
 
 static bool handle_mmio_sgi_reg(struct kvm_vcpu *vcpu,
-				struct kvm_exit_mmio *mmio, phys_addr_t offset)
+				struct kvm_exit_mmio *mmio, phys_addr_t offset,
+				void *private)
 {
 	u32 reg;
 
@@ -285,7 +288,7 @@ static bool write_set_clear_sgi_pend_reg(struct kvm_vcpu *vcpu,
 
 static bool handle_mmio_sgi_set(struct kvm_vcpu *vcpu,
 				struct kvm_exit_mmio *mmio,
-				phys_addr_t offset)
+				phys_addr_t offset, void *private)
 {
 	if (!mmio->is_write)
 		return read_set_clear_sgi_pend_reg(vcpu, mmio, offset);
@@ -295,7 +298,7 @@ static bool handle_mmio_sgi_set(struct kvm_vcpu *vcpu,
 
 static bool handle_mmio_sgi_clear(struct kvm_vcpu *vcpu,
 				  struct kvm_exit_mmio *mmio,
-				  phys_addr_t offset)
+				  phys_addr_t offset, void *private)
 {
 	if (!mmio->is_write)
 		return read_set_clear_sgi_pend_reg(vcpu, mmio, offset);
@@ -403,7 +406,8 @@ static bool vgic_v2_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		return true;
 	}
 
-	return vgic_handle_mmio_range(vcpu, run, mmio, vgic_dist_ranges, base);
+	return vgic_handle_mmio_range(vcpu, run, mmio,
+				      vgic_dist_ranges, base, NULL);
 }
 
 static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg)
@@ -527,7 +531,8 @@ bool vgic_v2_init_emulation_ops(struct kvm *kvm, int type)
 }
 
 static bool handle_cpu_mmio_misc(struct kvm_vcpu *vcpu,
-				 struct kvm_exit_mmio *mmio, phys_addr_t offset)
+				 struct kvm_exit_mmio *mmio, phys_addr_t offset,
+				 void *private)
 {
 	bool updated = false;
 	struct vgic_vmcr vmcr;
@@ -568,14 +573,16 @@ static bool handle_cpu_mmio_misc(struct kvm_vcpu *vcpu,
 }
 
 static bool handle_mmio_abpr(struct kvm_vcpu *vcpu,
-			     struct kvm_exit_mmio *mmio, phys_addr_t offset)
+			     struct kvm_exit_mmio *mmio, phys_addr_t offset,
+			     void *private)
 {
-	return handle_cpu_mmio_misc(vcpu, mmio, GIC_CPU_ALIAS_BINPOINT);
+	return handle_cpu_mmio_misc(vcpu, mmio, GIC_CPU_ALIAS_BINPOINT,
+				    private);
 }
 
 static bool handle_cpu_mmio_ident(struct kvm_vcpu *vcpu,
 				  struct kvm_exit_mmio *mmio,
-				  phys_addr_t offset)
+				  phys_addr_t offset, void *private)
 {
 	u32 reg;
 
@@ -695,7 +702,7 @@ static int vgic_attr_regs_access(struct kvm_device *dev,
 		vgic_unqueue_irqs(tmp_vcpu);
 
 	offset -= r->base;
-	r->handle_mmio(vcpu, &mmio, offset);
+	r->handle_mmio(vcpu, &mmio, offset, NULL);
 
 	if (!is_write)
 		*reg = mmio_data_read(&mmio, ~0);
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index eda4cf5..a54389b 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -407,7 +407,7 @@ void vgic_reg_access(struct kvm_exit_mmio *mmio, u32 *reg,
 }
 
 bool handle_mmio_raz_wi(struct kvm_vcpu *vcpu, struct kvm_exit_mmio *mmio,
-			phys_addr_t offset)
+			phys_addr_t offset, void *private)
 {
 	vgic_reg_access(mmio, NULL, offset,
 			ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
@@ -677,7 +677,7 @@ static bool vgic_validate_access(const struct vgic_dist *dist,
  */
 static bool call_range_handler(struct kvm_vcpu *vcpu,
 			       struct kvm_exit_mmio *mmio,
-			       unsigned long offset,
+			       unsigned long offset, void *private,
 			       const struct mmio_range *range)
 {
 	u32 *data32 = (void *)mmio->data;
@@ -685,7 +685,7 @@ static bool call_range_handler(struct kvm_vcpu *vcpu,
 	bool ret;
 
 	if (likely(mmio->len <= 4))
-		return range->handle_mmio(vcpu, mmio, offset);
+		return range->handle_mmio(vcpu, mmio, offset, private);
 
 	/*
 	 * Any access bigger than 4 bytes (that we currently handle in KVM)
@@ -698,14 +698,14 @@ static bool call_range_handler(struct kvm_vcpu *vcpu,
 	mmio32.phys_addr = mmio->phys_addr + 4;
 	if (mmio->is_write)
 		*(u32 *)mmio32.data = data32[1];
-	ret = range->handle_mmio(vcpu, &mmio32, offset + 4);
+	ret = range->handle_mmio(vcpu, &mmio32, offset + 4, private);
 	if (!mmio->is_write)
 		data32[1] = *(u32 *)mmio32.data;
 
 	mmio32.phys_addr = mmio->phys_addr;
 	if (mmio->is_write)
 		*(u32 *)mmio32.data = data32[0];
-	ret |= range->handle_mmio(vcpu, &mmio32, offset);
+	ret |= range->handle_mmio(vcpu, &mmio32, offset, private);
 	if (!mmio->is_write)
 		data32[0] = *(u32 *)mmio32.data;
 
@@ -725,7 +725,7 @@ static bool call_range_handler(struct kvm_vcpu *vcpu,
 bool vgic_handle_mmio_range(struct kvm_vcpu *vcpu, struct kvm_run *run,
 			    struct kvm_exit_mmio *mmio,
 			    const struct mmio_range *ranges,
-			    unsigned long mmio_base)
+			    unsigned long mmio_base, void *private)
 {
 	const struct mmio_range *range;
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
@@ -743,7 +743,8 @@ bool vgic_handle_mmio_range(struct kvm_vcpu *vcpu, struct kvm_run *run,
 	spin_lock(&vcpu->kvm->arch.vgic.lock);
 	offset -= range->base;
 	if (vgic_validate_access(dist, range, offset)) {
-		updated_state = call_range_handler(vcpu, mmio, offset, range);
+		updated_state = call_range_handler(vcpu, mmio, offset, private,
+						   range);
 	} else {
 		if (!mmio->is_write)
 			memset(mmio->data, 0, mmio->len);
diff --git a/virt/kvm/arm/vgic.h b/virt/kvm/arm/vgic.h
index f320333..f52db4e 100644
--- a/virt/kvm/arm/vgic.h
+++ b/virt/kvm/arm/vgic.h
@@ -58,7 +58,7 @@ void vgic_unqueue_irqs(struct kvm_vcpu *vcpu);
 void vgic_reg_access(struct kvm_exit_mmio *mmio, u32 *reg,
 		     phys_addr_t offset, int mode);
 bool handle_mmio_raz_wi(struct kvm_vcpu *vcpu, struct kvm_exit_mmio *mmio,
-			phys_addr_t offset);
+			phys_addr_t offset, void *private);
 
 static inline
 u32 mmio_data_read(struct kvm_exit_mmio *mmio, u32 mask)
@@ -77,7 +77,7 @@ struct mmio_range {
 	unsigned long len;
 	int bits_per_irq;
 	bool (*handle_mmio)(struct kvm_vcpu *vcpu, struct kvm_exit_mmio *mmio,
-			    phys_addr_t offset);
+			    phys_addr_t offset, void *private);
 };
 
 static inline bool is_in_range(phys_addr_t addr, unsigned long len,
@@ -96,7 +96,8 @@ struct mmio_range *vgic_find_matching_range(const struct mmio_range *ranges,
 bool vgic_handle_mmio_range(struct kvm_vcpu *vcpu, struct kvm_run *run,
 			    struct kvm_exit_mmio *mmio,
 			    const struct mmio_range *ranges,
-			    unsigned long mmio_base);
+			    unsigned long mmio_base,
+			    void *private);
 
 bool vgic_handle_enable_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
 			    phys_addr_t offset, int vcpu_id, int access);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
                   ` (14 preceding siblings ...)
  2014-10-31 17:26 ` [PATCH v3 15/19] arm/arm64: KVM: add opaque private pointer to MMIO accessors Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-07 14:30   ` Christoffer Dall
  2014-10-31 17:26 ` [PATCH v3 17/19] arm64: KVM: add SGI system register trapping Andre Przywara
                   ` (4 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

With everything separated and prepared, we implement a model of a
GICv3 distributor and redistributors by using the existing framework
to provide handler functions for each register group.
Currently we limit the emulation to a model enforcing a single
security state, with SRE==1 (forcing system register access) and
ARE==1 (allowing more than 8 VCPUs).
We share some of the functions provided for the GICv2 emulation, but take
the different ways of addressing (v)CPUs into account.
Save and restore is currently not implemented.

Similar to the split-off GICv2 specific code, the new emulation code
goes into a new file (vgic-v3-emul.c).
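
For orientation, a sketch (illustrative helper only, not part of the
patch) of the address arithmetic used below in vgic_v3_handle_mmio():
each VCPU owns one redistributor region of GIC_V3_REDIST_SIZE bytes,
with the per-IRQ SGI_base frame starting 0x10000 into that region, so
the targeted VCPU follows from the faulting address alone:

static struct kvm_vcpu *redist_addr_to_vcpu(struct kvm *kvm,
					    struct vgic_dist *dist,
					    phys_addr_t addr)
{
	/* one GIC_V3_REDIST_SIZE sized region per VCPU, in VCPU order */
	int vcpu_id = (addr - dist->vgic_redist_base) / GIC_V3_REDIST_SIZE;

	return kvm_get_vcpu(kvm, vcpu_id);
}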

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 arch/arm64/kvm/Makefile            |    1 +
 include/kvm/arm_vgic.h             |   10 +-
 include/linux/irqchip/arm-gic-v3.h |   26 ++
 include/linux/kvm_host.h           |    1 +
 include/uapi/linux/kvm.h           |    2 +
 virt/kvm/arm/vgic-v3-emul.c        |  891 ++++++++++++++++++++++++++++++++++++
 virt/kvm/arm/vgic.c                |   11 +-
 virt/kvm/arm/vgic.h                |    3 +
 8 files changed, 942 insertions(+), 3 deletions(-)
 create mode 100644 virt/kvm/arm/vgic-v3-emul.c

diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index d957353..4e6e09e 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -24,5 +24,6 @@ kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v2.o
 kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v2-emul.o
 kvm-$(CONFIG_KVM_ARM_VGIC) += vgic-v2-switch.o
 kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v3.o
+kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v3-emul.o
 kvm-$(CONFIG_KVM_ARM_VGIC) += vgic-v3-switch.o
 kvm-$(CONFIG_KVM_ARM_TIMER) += $(KVM)/arm/arch_timer.o
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 8827bc7..c303083 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -160,7 +160,11 @@ struct vgic_dist {
 
 	/* Distributor and vcpu interface mapping in the guest */
 	phys_addr_t		vgic_dist_base;
-	phys_addr_t		vgic_cpu_base;
+	/* GICv2 and GICv3 use different mapped register blocks */
+	union {
+		phys_addr_t		vgic_cpu_base;
+		phys_addr_t		vgic_redist_base;
+	};
 
 	/* Distributor enabled */
 	u32			enabled;
@@ -222,6 +226,9 @@ struct vgic_dist {
 	 */
 	struct vgic_bitmap	*irq_spi_target;
 
+	/* Target MPIDR for each IRQ (needed for GICv3 IROUTERn) only */
+	u32			*irq_spi_mpidr;
+
 	/* Bitmap indicating which CPU has something pending */
 	unsigned long		*irq_pending_on_cpu;
 
@@ -297,6 +304,7 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
 void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
 int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
 			bool level);
+void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
 int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
 bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		      struct kvm_exit_mmio *mmio);
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index 03a4ea3..6a649bc 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -33,6 +33,7 @@
 #define GICD_SETSPI_SR			0x0050
 #define GICD_CLRSPI_SR			0x0058
 #define GICD_SEIR			0x0068
+#define GICD_IGROUPR			0x0080
 #define GICD_ISENABLER			0x0100
 #define GICD_ICENABLER			0x0180
 #define GICD_ISPENDR			0x0200
@@ -41,14 +42,31 @@
 #define GICD_ICACTIVER			0x0380
 #define GICD_IPRIORITYR			0x0400
 #define GICD_ICFGR			0x0C00
+#define GICD_IGRPMODR			0x0D00
+#define GICD_NSACR			0x0E00
 #define GICD_IROUTER			0x6000
+#define GICD_IDREGS			0xFFD0
 #define GICD_PIDR2			0xFFE8
 
+/*
+ * Non-ARE distributor registers, needed to provide the RES0
+ * semantics for KVM's emulated GICv3
+ */
+#define GICD_ITARGETSR			0x0800
+#define GICD_SGIR			0x0F00
+#define GICD_CPENDSGIR			0x0F10
+#define GICD_SPENDSGIR			0x0F20
+
+
 #define GICD_CTLR_RWP			(1U << 31)
+#define GICD_CTLR_DS			(1U << 6)
 #define GICD_CTLR_ARE_NS		(1U << 4)
 #define GICD_CTLR_ENABLE_G1A		(1U << 1)
 #define GICD_CTLR_ENABLE_G1		(1U << 0)
 
+#define GICD_TYPER_LPIS			(1U << 17)
+#define GICD_TYPER_MBIS			(1U << 16)
+
 #define GICD_IROUTER_SPI_MODE_ONE	(0U << 31)
 #define GICD_IROUTER_SPI_MODE_ANY	(1U << 31)
 
@@ -56,6 +74,8 @@
 #define GIC_PIDR2_ARCH_GICv3		0x30
 #define GIC_PIDR2_ARCH_GICv4		0x40
 
+#define GIC_V3_DIST_SIZE		0x10000
+
 /*
  * Re-Distributor registers, offsets from RD_base
  */
@@ -74,6 +94,7 @@
 #define GICR_SYNCR			0x00C0
 #define GICR_MOVLPIR			0x0100
 #define GICR_MOVALLR			0x0110
+#define GICR_IDREGS			GICD_IDREGS
 #define GICR_PIDR2			GICD_PIDR2
 
 #define GICR_WAKER_ProcessorSleep	(1U << 1)
@@ -82,6 +103,7 @@
 /*
  * Re-Distributor registers, offsets from SGI_base
  */
+#define GICR_IGROUPR0			GICD_IGROUPR
 #define GICR_ISENABLER0			GICD_ISENABLER
 #define GICR_ICENABLER0			GICD_ICENABLER
 #define GICR_ISPENDR0			GICD_ISPENDR
@@ -90,10 +112,14 @@
 #define GICR_ICACTIVER0			GICD_ICACTIVER
 #define GICR_IPRIORITYR0		GICD_IPRIORITYR
 #define GICR_ICFGR0			GICD_ICFGR
+#define GICR_IGRPMODR0			GICD_IGRPMODR
+#define GICR_NSACR			GICD_NSACR
 
 #define GICR_TYPER_VLPIS		(1U << 1)
 #define GICR_TYPER_LAST			(1U << 4)
 
+#define GIC_V3_REDIST_SIZE		0x20000
+
 /*
  * CPU interface registers
  */
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 326ba7a..4a7798e 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1085,6 +1085,7 @@ void kvm_unregister_device_ops(u32 type);
 extern struct kvm_device_ops kvm_mpic_ops;
 extern struct kvm_device_ops kvm_xics_ops;
 extern struct kvm_device_ops kvm_arm_vgic_v2_ops;
+extern struct kvm_device_ops kvm_arm_vgic_v3_ops;
 
 #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 6076882..24cb129 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -960,6 +960,8 @@ enum kvm_device_type {
 #define KVM_DEV_TYPE_ARM_VGIC_V2	KVM_DEV_TYPE_ARM_VGIC_V2
 	KVM_DEV_TYPE_FLIC,
 #define KVM_DEV_TYPE_FLIC		KVM_DEV_TYPE_FLIC
+	KVM_DEV_TYPE_ARM_VGIC_V3,
+#define KVM_DEV_TYPE_ARM_VGIC_V3	KVM_DEV_TYPE_ARM_VGIC_V3
 	KVM_DEV_TYPE_MAX,
 };
 
diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
new file mode 100644
index 0000000..bcb5374
--- /dev/null
+++ b/virt/kvm/arm/vgic-v3-emul.c
@@ -0,0 +1,891 @@
+/*
+ * GICv3 distributor and redistributor emulation on GICv3 hardware
+ *
+ * able to run on a pure native host GICv3 (which forces ARE=1)
+ *
+ * forcing ARE=1 and DS=1, not covering LPIs yet (TYPER.LPIS=0)
+ *
+ * Copyright (C) 2014 ARM Ltd.
+ * Author: Andre Przywara <andre.przywara@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/cpu.h>
+#include <linux/kvm.h>
+#include <linux/kvm_host.h>
+#include <linux/interrupt.h>
+
+#include <linux/irqchip/arm-gic-v3.h>
+#include <kvm/arm_vgic.h>
+
+#include <asm/kvm_emulate.h>
+#include <asm/kvm_arm.h>
+#include <asm/kvm_mmu.h>
+
+#include "vgic.h"
+
+#define INTERRUPT_ID_BITS 10
+
+static bool handle_mmio_misc(struct kvm_vcpu *vcpu,
+			     struct kvm_exit_mmio *mmio, phys_addr_t offset,
+			     void *private)
+{
+	u32 reg = 0, val;
+	u32 word_offset = offset & 3;
+
+	switch (offset & ~3) {
+	case GICD_CTLR:
+		/*
+		 * Force ARE and DS to 1, the guest cannot change this.
+		 * For the time being we only support Group1 interrupts.
+		 */
+		if (vcpu->kvm->arch.vgic.enabled)
+			reg = GICD_CTLR_ENABLE_G1A;
+		reg |= GICD_CTLR_ARE_NS | GICD_CTLR_DS;
+
+		vgic_reg_access(mmio, &reg, word_offset,
+				ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
+		if (mmio->is_write) {
+			vcpu->kvm->arch.vgic.enabled = !!(reg & GICD_CTLR_ENABLE_G1A);
+			vgic_update_state(vcpu->kvm);
+			return true;
+		}
+		break;
+	case GICD_TYPER:
+		/*
+		 * as this implementation does not provide compatibility
+		 * with GICv2 (ARE==1), we report zero CPUs in the lower 5 bits.
+		 * Also TYPER.LPIS is 0 for now and TYPER.MBIS is not supported.
+		 */
+
+		/* claim we support at most 1024 (-4) SPIs via this interface */
+		val = min(vcpu->kvm->arch.vgic.nr_irqs, 1024);
+		reg |= (val >> 5) - 1;
+
+		reg |= (INTERRUPT_ID_BITS - 1) << 19;
+
+		vgic_reg_access(mmio, &reg, word_offset,
+				ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
+		break;
+	case GICD_IIDR:
+		reg = (PRODUCT_ID_KVM << 24) | (IMPLEMENTER_ARM << 0);
+		vgic_reg_access(mmio, &reg, word_offset,
+			ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
+		break;
+	default:
+		vgic_reg_access(mmio, NULL, word_offset,
+				ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
+		break;
+	}
+
+	return false;
+}
+
+static bool handle_mmio_set_enable_reg_dist(struct kvm_vcpu *vcpu,
+					    struct kvm_exit_mmio *mmio,
+					    phys_addr_t offset,
+					    void *private)
+{
+	if (likely(offset >= VGIC_NR_PRIVATE_IRQS / 8))
+		return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
+					      vcpu->vcpu_id,
+					      ACCESS_WRITE_SETBIT);
+
+	vgic_reg_access(mmio, NULL, offset & 3,
+			ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
+	return false;
+}
+
+static bool handle_mmio_clear_enable_reg_dist(struct kvm_vcpu *vcpu,
+					      struct kvm_exit_mmio *mmio,
+					      phys_addr_t offset,
+					      void *private)
+{
+	if (likely(offset >= VGIC_NR_PRIVATE_IRQS / 8))
+		return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
+					      vcpu->vcpu_id,
+					      ACCESS_WRITE_CLEARBIT);
+
+	vgic_reg_access(mmio, NULL, offset & 3,
+			ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
+	return false;
+}
+
+static bool handle_mmio_set_pending_reg_dist(struct kvm_vcpu *vcpu,
+					     struct kvm_exit_mmio *mmio,
+					     phys_addr_t offset,
+					     void *private)
+{
+	if (likely(offset >= VGIC_NR_PRIVATE_IRQS / 8))
+		return vgic_handle_set_pending_reg(vcpu->kvm, mmio, offset,
+						   vcpu->vcpu_id);
+
+	vgic_reg_access(mmio, NULL, offset & 3,
+			ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
+	return false;
+}
+
+static bool handle_mmio_clear_pending_reg_dist(struct kvm_vcpu *vcpu,
+					       struct kvm_exit_mmio *mmio,
+					       phys_addr_t offset,
+					       void *private)
+{
+	if (likely(offset >= VGIC_NR_PRIVATE_IRQS / 8))
+		return vgic_handle_clear_pending_reg(vcpu->kvm, mmio, offset,
+						     vcpu->vcpu_id);
+
+	vgic_reg_access(mmio, NULL, offset & 3,
+			ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
+	return false;
+}
+
+static bool handle_mmio_priority_reg_dist(struct kvm_vcpu *vcpu,
+					  struct kvm_exit_mmio *mmio,
+					  phys_addr_t offset,
+					  void *private)
+{
+	u32 *reg;
+
+	if (unlikely(offset < VGIC_NR_PRIVATE_IRQS)) {
+		vgic_reg_access(mmio, NULL, offset & 3,
+				ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
+		return false;
+	}
+
+	reg = vgic_bytemap_get_reg(&vcpu->kvm->arch.vgic.irq_priority,
+				   vcpu->vcpu_id, offset);
+	vgic_reg_access(mmio, reg, offset,
+		ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
+	return false;
+}
+
+static bool handle_mmio_cfg_reg_dist(struct kvm_vcpu *vcpu,
+				     struct kvm_exit_mmio *mmio,
+				     phys_addr_t offset,
+				     void *private)
+{
+	u32 *reg;
+
+	if (unlikely(offset < VGIC_NR_PRIVATE_IRQS / 4)) {
+		vgic_reg_access(mmio, NULL, offset & 3,
+				ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
+		return false;
+	}
+
+	reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_cfg,
+				  vcpu->vcpu_id, offset >> 1);
+
+	return vgic_handle_cfg_reg(reg, mmio, offset);
+}
+
+static u32 compress_mpidr(unsigned long mpidr)
+{
+	u32 ret;
+
+	ret = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	ret |= MPIDR_AFFINITY_LEVEL(mpidr, 1) << 8;
+	ret |= MPIDR_AFFINITY_LEVEL(mpidr, 2) << 16;
+	ret |= MPIDR_AFFINITY_LEVEL(mpidr, 3) << 24;
+
+	return ret;
+}
+
+static unsigned long uncompress_mpidr(u32 value)
+{
+	unsigned long mpidr;
+
+	mpidr = ((value >> 0) & 0xFF) << MPIDR_LEVEL_SHIFT(0);
+	mpidr |= ((value >> 8) & 0xFF) << MPIDR_LEVEL_SHIFT(1);
+	mpidr |= ((value >> 16) & 0xFF) << MPIDR_LEVEL_SHIFT(2);
+	mpidr |= (u64)((value >> 24) & 0xFF) << MPIDR_LEVEL_SHIFT(3);
+
+	return mpidr;
+}
+
+/*
+ * Lookup the given MPIDR value to get the vcpu_id (if there is one)
+ * and store that in the irq_spi_cpu[] array.
+ * This limits the number of VCPUs to 255 for now; extending the data
+ * type (or storing kvm_vcpu pointers) should lift the limit.
+ * Store the original MPIDR value in an extra array.
+ * Unallocated MPIDRs are translated to a special value and caught
+ * before any array accesses.
+ */
+static bool handle_mmio_route_reg(struct kvm_vcpu *vcpu,
+				  struct kvm_exit_mmio *mmio,
+				  phys_addr_t offset, void *private)
+{
+	struct kvm *kvm = vcpu->kvm;
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	int irq;
+	u32 reg;
+	int vcpu_id;
+	unsigned long *bmap, mpidr;
+	u32 word_offset = offset & 3;
+
+	/*
+	 * Private interrupts cannot be re-routed, so this register
+	 * is RES0 for any IRQ < 32.
+	 * Also the upper 32 bits of each 64 bit register are zero,
+	 * as we don't support Aff3 and that's the only value up there.
+	 */
+	if (unlikely(offset < VGIC_NR_PRIVATE_IRQS * 8) || (offset & 4) == 4) {
+		vgic_reg_access(mmio, NULL, word_offset,
+				ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
+		return false;
+	}
+
+	irq = (offset / 8) - VGIC_NR_PRIVATE_IRQS;
+
+	/* get the stored MPIDR for this IRQ */
+	mpidr = uncompress_mpidr(dist->irq_spi_mpidr[irq]);
+	mpidr &= MPIDR_HWID_BITMASK;
+	reg = mpidr;
+
+	vgic_reg_access(mmio, &reg, word_offset,
+			ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
+
+	if (!mmio->is_write)
+		return false;
+
+	/*
+	 * Now clear the currently assigned vCPU from the map, making room
+	 * for the new one to be written below
+	 */
+	vcpu = kvm_mpidr_to_vcpu(kvm, mpidr);
+	if (likely(vcpu)) {
+		vcpu_id = vcpu->vcpu_id;
+		bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[vcpu_id]);
+		clear_bit(irq, bmap);
+	}
+
+	dist->irq_spi_mpidr[irq] = compress_mpidr(reg);
+	vcpu = kvm_mpidr_to_vcpu(kvm, reg & MPIDR_HWID_BITMASK);
+
+	/*
+	 * The spec says that non-existent MPIDR values should not be
+	 * forwarded to any existing (v)CPU, but should be able to become
+	 * pending anyway. We simply keep the irq_spi_target[] array empty, so
+	 * the interrupt will never be injected.
+	 * irq_spi_cpu[irq] gets a magic value in this case.
+	 */
+	if (likely(vcpu)) {
+		vcpu_id = vcpu->vcpu_id;
+		dist->irq_spi_cpu[irq] = vcpu_id;
+		bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[vcpu_id]);
+		set_bit(irq, bmap);
+	} else
+		dist->irq_spi_cpu[irq] = VCPU_NOT_ALLOCATED;
+
+	vgic_update_state(kvm);
+
+	return true;
+}
+
+static bool handle_mmio_idregs(struct kvm_vcpu *vcpu,
+			       struct kvm_exit_mmio *mmio,
+			       phys_addr_t offset, void *private)
+{
+	u32 reg = 0;
+
+	switch (offset + GICD_IDREGS) {
+	case GICD_PIDR2:
+		reg = 0x3b;
+		break;
+	}
+
+	vgic_reg_access(mmio, &reg, offset & 3,
+			ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
+
+	return false;
+}
+
+static const struct mmio_range vgic_dist_ranges[] = {
+	{	/*
+		 * handling CTLR, TYPER, IIDR and STATUSR
+		 */
+		.base           = GICD_CTLR,
+		.len            = 20,
+		.bits_per_irq   = 0,
+		.handle_mmio    = handle_mmio_misc,
+	},
+	{
+		/* when DS=1, this is RAZ/WI */
+		.base		= GICD_SETSPI_SR,
+		.len		= 0x04,
+		.bits_per_irq	= 0,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{
+		/* when DS=1, this is RAZ/WI */
+		.base		= GICD_CLRSPI_SR,
+		.len		= 0x04,
+		.bits_per_irq	= 0,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{
+		.base		= GICD_IGROUPR,
+		.len		= 0x80,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{
+		.base		= GICD_ISENABLER,
+		.len		= 0x80,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_set_enable_reg_dist,
+	},
+	{
+		.base		= GICD_ICENABLER,
+		.len		= 0x80,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_clear_enable_reg_dist,
+	},
+	{
+		.base		= GICD_ISPENDR,
+		.len		= 0x80,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_set_pending_reg_dist,
+	},
+	{
+		.base		= GICD_ICPENDR,
+		.len		= 0x80,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_clear_pending_reg_dist,
+	},
+	{
+		.base		= GICD_ISACTIVER,
+		.len		= 0x80,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{
+		.base		= GICD_ICACTIVER,
+		.len		= 0x80,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{
+		.base		= GICD_IPRIORITYR,
+		.len		= 0x400,
+		.bits_per_irq	= 8,
+		.handle_mmio	= handle_mmio_priority_reg_dist,
+	},
+	{
+		/* TARGETSRn is RES0 when ARE=1 */
+		.base		= GICD_ITARGETSR,
+		.len		= 0x400,
+		.bits_per_irq	= 8,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{
+		.base		= GICD_ICFGR,
+		.len		= 0x100,
+		.bits_per_irq	= 2,
+		.handle_mmio	= handle_mmio_cfg_reg_dist,
+	},
+	{
+		/* this is RAZ/WI when DS=1 */
+		.base		= GICD_IGRPMODR,
+		.len		= 0x80,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{
+		/* with DS==1 this is RAZ/WI */
+		.base		= GICD_NSACR,
+		.len		= 0x100,
+		.bits_per_irq	= 2,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	/* the next three blocks are RES0 if ARE=1 */
+	{
+		.base		= GICD_SGIR,
+		.len		= 4,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{
+		.base		= GICD_CPENDSGIR,
+		.len		= 0x10,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{
+		.base           = GICD_SPENDSGIR,
+		.len            = 0x10,
+		.handle_mmio    = handle_mmio_raz_wi,
+	},
+	{
+		.base		= GICD_IROUTER,
+		.len		= 0x2000,
+		.bits_per_irq	= 64,
+		.handle_mmio	= handle_mmio_route_reg,
+	},
+	{
+		.base           = GICD_IDREGS,
+		.len            = 0x30,
+		.bits_per_irq   = 0,
+		.handle_mmio    = handle_mmio_idregs,
+	},
+	{},
+};
+
+static bool handle_mmio_set_enable_reg_redist(struct kvm_vcpu *vcpu,
+					      struct kvm_exit_mmio *mmio,
+					      phys_addr_t offset,
+					      void *private)
+{
+	struct kvm_vcpu *target_redist_vcpu = private;
+
+	return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
+				      target_redist_vcpu->vcpu_id,
+				      ACCESS_WRITE_SETBIT);
+}
+
+static bool handle_mmio_clear_enable_reg_redist(struct kvm_vcpu *vcpu,
+						struct kvm_exit_mmio *mmio,
+						phys_addr_t offset,
+						void *private)
+{
+	struct kvm_vcpu *target_redist_vcpu = private;
+
+	return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
+				      target_redist_vcpu->vcpu_id,
+				      ACCESS_WRITE_CLEARBIT);
+}
+
+static bool handle_mmio_set_pending_reg_redist(struct kvm_vcpu *vcpu,
+					       struct kvm_exit_mmio *mmio,
+					       phys_addr_t offset,
+					       void *private)
+{
+	struct kvm_vcpu *target_redist_vcpu = private;
+
+	return vgic_handle_set_pending_reg(vcpu->kvm, mmio, offset,
+					   target_redist_vcpu->vcpu_id);
+}
+
+static bool handle_mmio_clear_pending_reg_redist(struct kvm_vcpu *vcpu,
+						 struct kvm_exit_mmio *mmio,
+						 phys_addr_t offset,
+						 void *private)
+{
+	struct kvm_vcpu *target_redist_vcpu = private;
+
+	return vgic_handle_clear_pending_reg(vcpu->kvm, mmio, offset,
+					     target_redist_vcpu->vcpu_id);
+}
+
+static bool handle_mmio_priority_reg_redist(struct kvm_vcpu *vcpu,
+					    struct kvm_exit_mmio *mmio,
+					    phys_addr_t offset,
+					    void *private)
+{
+	struct kvm_vcpu *target_redist_vcpu = private;
+	u32 *reg;
+
+	reg = vgic_bytemap_get_reg(&vcpu->kvm->arch.vgic.irq_priority,
+				   target_redist_vcpu->vcpu_id, offset);
+	vgic_reg_access(mmio, reg, offset,
+			ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
+	return false;
+}
+
+static bool handle_mmio_cfg_reg_redist(struct kvm_vcpu *vcpu,
+				       struct kvm_exit_mmio *mmio,
+				       phys_addr_t offset,
+				       void *private)
+{
+	u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_cfg,
+				       *(int *)private, offset >> 1);
+
+	return vgic_handle_cfg_reg(reg, mmio, offset);
+}
+
+static const struct mmio_range vgic_redist_sgi_ranges[] = {
+	{
+		.base		= GICR_IGROUPR0,
+		.len		= 4,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{
+		.base		= GICR_ISENABLER0,
+		.len		= 4,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_set_enable_reg_redist,
+	},
+	{
+		.base		= GICR_ICENABLER0,
+		.len		= 4,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_clear_enable_reg_redist,
+	},
+	{
+		.base		= GICR_ISPENDR0,
+		.len		= 4,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_set_pending_reg_redist,
+	},
+	{
+		.base		= GICR_ICPENDR0,
+		.len		= 4,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_clear_pending_reg_redist,
+	},
+	{
+		.base		= GICR_ISACTIVER0,
+		.len		= 4,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{
+		.base		= GICR_ICACTIVER0,
+		.len		= 4,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{
+		.base		= GICR_IPRIORITYR0,
+		.len		= 32,
+		.bits_per_irq	= 8,
+		.handle_mmio	= handle_mmio_priority_reg_redist,
+	},
+	{
+		.base		= GICR_ICFGR0,
+		.len		= 8,
+		.bits_per_irq	= 2,
+		.handle_mmio	= handle_mmio_cfg_reg_redist,
+	},
+	{
+		.base		= GICR_IGRPMODR0,
+		.len		= 4,
+		.bits_per_irq	= 1,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{
+		.base		= GICR_NSACR,
+		.len		= 4,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{},
+};
+
+static bool handle_mmio_misc_redist(struct kvm_vcpu *vcpu,
+				    struct kvm_exit_mmio *mmio,
+				    phys_addr_t offset, void *private)
+{
+	u32 reg;
+	u32 word_offset = offset & 3;
+	u64 mpidr;
+	struct kvm_vcpu *target_redist_vcpu = private;
+	int target_vcpu_id = target_redist_vcpu->vcpu_id;
+
+	switch (offset & ~3) {
+	case GICR_CTLR:
+		/* since we don't support LPIs, this register is zero for now */
+		vgic_reg_access(mmio, &reg, word_offset,
+				ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
+		break;
+	case GICR_TYPER + 4:
+		mpidr = kvm_vcpu_get_mpidr(target_redist_vcpu);
+		reg = compress_mpidr(mpidr);
+
+		vgic_reg_access(mmio, &reg, word_offset,
+				ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
+		break;
+	case GICR_TYPER:
+		reg = target_redist_vcpu->vcpu_id << 8;
+		if (target_vcpu_id == atomic_read(&vcpu->kvm->online_vcpus) - 1)
+			reg |= GICR_TYPER_LAST;
+		vgic_reg_access(mmio, &reg, word_offset,
+				ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
+		break;
+	case GICR_IIDR:
+		reg = (PRODUCT_ID_KVM << 24) | (IMPLEMENTER_ARM << 0);
+		vgic_reg_access(mmio, &reg, word_offset,
+			ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
+		break;
+	default:
+		vgic_reg_access(mmio, NULL, word_offset,
+				ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
+		break;
+	}
+
+	return false;
+}
+
+static const struct mmio_range vgic_redist_ranges[] = {
+	{	/*
+		 * handling CTLR, IIDR, TYPER and STATUSR
+		 */
+		.base           = GICR_CTLR,
+		.len            = 20,
+		.bits_per_irq   = 0,
+		.handle_mmio    = handle_mmio_misc_redist,
+	},
+	{
+		.base           = GICR_WAKER,
+		.len            = 4,
+		.bits_per_irq   = 0,
+		.handle_mmio    = handle_mmio_raz_wi,
+	},
+	{
+		.base           = GICR_IDREGS,
+		.len            = 0x30,
+		.bits_per_irq   = 0,
+		.handle_mmio    = handle_mmio_idregs,
+	},
+	{},
+};
+
+/*
+ * This is the entry point for both distributor and redistributor MMIO
+ * exits on a GICv3. For redistributor accesses it derives the target
+ * vcpu_id from the MMIO address, so a VCPU other than the current one
+ * may be affected.
+ */
+static bool vgic_v3_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
+				struct kvm_exit_mmio *mmio)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	unsigned long dbase = dist->vgic_dist_base;
+	unsigned long rdbase = dist->vgic_redist_base;
+	int nrcpus = atomic_read(&vcpu->kvm->online_vcpus);
+	int vcpu_id;
+	struct kvm_vcpu *target_redist_vcpu;
+
+	if (is_in_range(mmio->phys_addr, mmio->len, dbase, GIC_V3_DIST_SIZE)) {
+		return vgic_handle_mmio_range(vcpu, run, mmio,
+					      vgic_dist_ranges, dbase, NULL);
+	}
+
+	if (!is_in_range(mmio->phys_addr, mmio->len, rdbase,
+	    GIC_V3_REDIST_SIZE * nrcpus))
+		return false;
+
+	vcpu_id = (mmio->phys_addr - rdbase) / GIC_V3_REDIST_SIZE;
+	rdbase += (vcpu_id * GIC_V3_REDIST_SIZE);
+	target_redist_vcpu = kvm_get_vcpu(vcpu->kvm, vcpu_id);
+
+	if (mmio->phys_addr >= rdbase + 0x10000)
+		return vgic_handle_mmio_range(vcpu, run, mmio,
+					      vgic_redist_sgi_ranges,
+					      rdbase + 0x10000,
+					      target_redist_vcpu);
+
+	return vgic_handle_mmio_range(vcpu, run, mmio, vgic_redist_ranges,
+				      rdbase, target_redist_vcpu);
+}
+
+static bool vgic_v3_queue_sgi(struct kvm_vcpu *vcpu, int irq)
+{
+	if (vgic_queue_irq(vcpu, 0, irq)) {
+		vgic_dist_irq_clear_pending(vcpu, irq);
+		vgic_cpu_irq_clear(vcpu, irq);
+		return true;
+	}
+
+	return false;
+}
+
+static int vgic_v3_init_maps(struct vgic_dist *dist)
+{
+	int nr_spis = dist->nr_irqs - VGIC_NR_PRIVATE_IRQS;
+
+	dist->irq_spi_mpidr = kcalloc(nr_spis, sizeof(dist->irq_spi_mpidr[0]),
+				      GFP_KERNEL);
+
+	if (!dist->irq_spi_mpidr)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static int vgic_v3_init(struct kvm *kvm, const struct vgic_params *params)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	int ret, i;
+	u32 mpidr;
+
+	if (IS_VGIC_ADDR_UNDEF(dist->vgic_dist_base) ||
+	    IS_VGIC_ADDR_UNDEF(dist->vgic_redist_base)) {
+		kvm_err("Need to set vgic distributor addresses first\n");
+		return -ENXIO;
+	}
+
+	/*
+	 * FIXME: this should be moved to init_maps time, and may bite
+	 * us when adding save/restore. Add a per-emulation hook?
+	 */
+	ret = vgic_v3_init_maps(dist);
+	if (ret) {
+		kvm_err("Unable to allocate maps\n");
+		return ret;
+	}
+
+	mpidr = compress_mpidr(kvm_vcpu_get_mpidr(kvm_get_vcpu(kvm, 0)));
+	for (i = VGIC_NR_PRIVATE_IRQS; i < dist->nr_irqs; i++) {
+		dist->irq_spi_cpu[i - VGIC_NR_PRIVATE_IRQS] = 0;
+		dist->irq_spi_mpidr[i - VGIC_NR_PRIVATE_IRQS] = mpidr;
+		vgic_bitmap_set_irq_val(dist->irq_spi_target, 0, i, 1);
+	}
+
+	return 0;
+}
+
+static void vgic_v3_add_sgi_source(struct kvm_vcpu *vcpu, int irq, int source)
+{
+}
+
+bool vgic_v3_init_emulation_ops(struct kvm *kvm, int type)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+
+	switch (type) {
+	case KVM_DEV_TYPE_ARM_VGIC_V3:
+		dist->vm_ops.handle_mmio = vgic_v3_handle_mmio;
+		dist->vm_ops.queue_sgi = vgic_v3_queue_sgi;
+		dist->vm_ops.add_sgi_source = vgic_v3_add_sgi_source;
+		dist->vm_ops.vgic_init = vgic_v3_init;
+		break;
+	default:
+		return false;
+	}
+	return true;
+}
+
+/*
+ * Triggered by a system register access trap and called from the
+ * sysreg handling code.
+ * The register contains the upper three affinity levels of the target
+ * processors as well as a bitmask of 16 Aff0 CPUs.
+ * Iterate over all VCPUs to check for matching ones or signal on
+ * all-but-self if the mode bit is set.
+ */
+void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg)
+{
+	struct kvm *kvm = vcpu->kvm;
+	struct kvm_vcpu *c_vcpu;
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	u16 target_cpus;
+	u64 mpidr, mpidr_h, mpidr_l;
+	int sgi, mode, c, vcpu_id;
+	int updated = 0;
+
+	vcpu_id = vcpu->vcpu_id;
+
+	sgi = (reg >> 24) & 0xf;
+	mode = (reg >> 40) & 0x1;
+	target_cpus = reg & 0xffff;
+	mpidr = ((reg >> 48) & 0xff) << MPIDR_LEVEL_SHIFT(3);
+	mpidr |= ((reg >> 32) & 0xff) << MPIDR_LEVEL_SHIFT(2);
+	mpidr |= ((reg >> 16) & 0xff) << MPIDR_LEVEL_SHIFT(1);
+	mpidr &= ~MPIDR_LEVEL_MASK;
+
+	/*
+	 * We take the dist lock here, because we come from the sysregs
+	 * code path and not from MMIO (where this is already done)
+	 */
+	spin_lock(&dist->lock);
+	kvm_for_each_vcpu(c, c_vcpu, kvm) {
+		if (!mode && target_cpus == 0)
+			break;
+		if (mode && c == vcpu_id)       /* not to myself */
+			continue;
+		if (!mode) {
+			mpidr_h = kvm_vcpu_get_mpidr(c_vcpu);
+			mpidr_l = MPIDR_AFFINITY_LEVEL(mpidr_h, 0);
+			mpidr_h &= ~MPIDR_LEVEL_MASK;
+			if (mpidr != mpidr_h)
+				continue;
+			if (!(target_cpus & BIT(mpidr_l)))
+				continue;
+			target_cpus &= ~BIT(mpidr_l);
+		}
+		/* Flag the SGI as pending */
+		vgic_dist_irq_set_pending(c_vcpu, sgi);
+		updated = 1;
+		kvm_debug("SGI%d from CPU%d to CPU%d\n", sgi, vcpu_id, c);
+	}
+	if (updated)
+		vgic_update_state(vcpu->kvm);
+	spin_unlock(&dist->lock);
+	if (updated)
+		vgic_kick_vcpus(vcpu->kvm);
+}
+
+
+static int vgic_v3_get_attr(struct kvm_device *dev,
+			    struct kvm_device_attr *attr)
+{
+	int ret;
+
+	ret = vgic_get_common_attr(dev, attr);
+	if (ret != -ENXIO)
+		return ret;
+
+	switch (attr->group) {
+	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
+	case KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
+		return -ENXIO;
+	}
+
+	return -ENXIO;
+}
+
+static int vgic_v3_set_attr(struct kvm_device *dev,
+			    struct kvm_device_attr *attr)
+{
+	int ret;
+
+	ret = vgic_set_common_attr(dev, attr);
+	if (ret != -ENXIO)
+		return ret;
+
+	switch (attr->group) {
+	case KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
+	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
+		return -ENXIO;
+	}
+
+	return -ENXIO;
+}
+
+static int vgic_v3_has_attr(struct kvm_device *dev,
+			    struct kvm_device_attr *attr)
+{
+	switch (attr->group) {
+	case KVM_DEV_ARM_VGIC_GRP_ADDR:
+		switch (attr->attr) {
+		case KVM_VGIC_V2_ADDR_TYPE_DIST:
+		case KVM_VGIC_V2_ADDR_TYPE_CPU:
+			return -ENXIO;
+		}
+		break;
+	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
+	case KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
+		return -ENXIO;
+	case KVM_DEV_ARM_VGIC_GRP_NR_IRQS:
+		return 0;
+	}
+	return -ENXIO;
+}
+
+struct kvm_device_ops kvm_arm_vgic_v3_ops = {
+	.name = "kvm-arm-vgic-v3",
+	.create = vgic_create,
+	.destroy = vgic_destroy,
+	.set_attr = vgic_v3_set_attr,
+	.get_attr = vgic_v3_get_attr,
+	.has_attr = vgic_v3_has_attr,
+};
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index a54389b..2867269d 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1228,7 +1228,7 @@ static bool vgic_update_irq_pending(struct kvm *kvm, int cpuid,
 	struct kvm_vcpu *vcpu;
 	int edge_triggered, level_triggered;
 	int enabled;
-	bool ret = true;
+	bool ret = true, can_inject = true;
 
 	spin_lock(&dist->lock);
 
@@ -1243,6 +1243,11 @@ static bool vgic_update_irq_pending(struct kvm *kvm, int cpuid,
 
 	if (irq_num >= VGIC_NR_PRIVATE_IRQS) {
 		cpuid = dist->irq_spi_cpu[irq_num - VGIC_NR_PRIVATE_IRQS];
+		if (cpuid == VCPU_NOT_ALLOCATED) {
+			/* Pretend we use CPU0, and prevent injection */
+			cpuid = 0;
+			can_inject = false;
+		}
 		vcpu = kvm_get_vcpu(kvm, cpuid);
 	}
 
@@ -1264,7 +1269,7 @@ static bool vgic_update_irq_pending(struct kvm *kvm, int cpuid,
 
 	enabled = vgic_irq_is_enabled(vcpu, irq_num);
 
-	if (!enabled) {
+	if (!enabled || !can_inject) {
 		ret = false;
 		goto out;
 	}
@@ -1406,6 +1411,7 @@ void kvm_vgic_destroy(struct kvm *kvm)
 	}
 	kfree(dist->irq_sgi_sources);
 	kfree(dist->irq_spi_cpu);
+	kfree(dist->irq_spi_mpidr);
 	kfree(dist->irq_spi_target);
 	kfree(dist->irq_pending_on_cpu);
 	dist->irq_sgi_sources = NULL;
@@ -1581,6 +1587,7 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
 	kvm->arch.vgic.vctrl_base = vgic->vctrl_base;
 	kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
 	kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
+	kvm->arch.vgic.vgic_redist_base = VGIC_ADDR_UNDEF;
 
 	if (!init_emulation_ops(kvm, type))
 		ret = -ENODEV;
diff --git a/virt/kvm/arm/vgic.h b/virt/kvm/arm/vgic.h
index f52db4e..42c20c1 100644
--- a/virt/kvm/arm/vgic.h
+++ b/virt/kvm/arm/vgic.h
@@ -35,6 +35,8 @@
 #define ACCESS_WRITE_VALUE	(3 << 1)
 #define ACCESS_WRITE_MASK(x)	((x) & (3 << 1))
 
+#define VCPU_NOT_ALLOCATED	((u8)-1)
+
 unsigned long *vgic_bitmap_get_shared_map(struct vgic_bitmap *x);
 
 void vgic_update_state(struct kvm *kvm);
@@ -121,5 +123,6 @@ int vgic_set_common_attr(struct kvm_device *dev, struct kvm_device_attr *attr);
 int vgic_get_common_attr(struct kvm_device *dev, struct kvm_device_attr *attr);
 
 bool vgic_v2_init_emulation_ops(struct kvm *kvm, int type);
+bool vgic_v3_init_emulation_ops(struct kvm *kvm, int type);
 
 #endif
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v3 17/19] arm64: KVM: add SGI system register trapping
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
                   ` (15 preceding siblings ...)
  2014-10-31 17:26 ` [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-07 15:07   ` Christoffer Dall
  2014-10-31 17:26 ` [PATCH v3 18/19] arm/arm64: KVM: enable kernel side of GICv3 emulation Andre Przywara
                   ` (3 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

While on a GICv2 a (virtual) inter-processor interrupt (SGI) is
injected by writing to an MMIO register, GICv3 uses system registers
to trigger SGIs.
Trap the appropriate registers on ARM64 hosts and call the SGI
handler function in the vGICv3 emulation code.
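For reference, the 64-bit value written to ICC_SGI1R_EL1 carries the SGI
number, the Aff0 target list and the upper affinity fields that
vgic_v3_dispatch_sgi() decodes. A minimal sketch of composing such a value
(a hypothetical helper, purely for illustration, not part of this series):

	/* sketch only: build an ICC_SGI1R_EL1 value for SGI <sgi>, aimed at
	 * the Aff0 CPUs in <targets> below the <aff3>.<aff2>.<aff1> cluster */
	static u64 make_sgi1r(u8 aff3, u8 aff2, u8 aff1, u16 targets, u8 sgi)
	{
		return ((u64)aff3 << 48) | ((u64)aff2 << 32) |
		       ((u64)sgi << 24) | ((u64)aff1 << 16) | targets;
	}

Setting bit 40 (the IRM mode bit) instead requests the "all but self"
broadcast mode handled by the dispatch function.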

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 arch/arm64/kvm/sys_regs.c |   26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index dcc5867..cf0452e 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -165,6 +165,27 @@ static bool access_sctlr(struct kvm_vcpu *vcpu,
 	return true;
 }
 
+/*
+ * Trapping on the GICv3 SGI system register.
+ * Forward the request to the VGIC emulation.
+ * The cp15_64 code makes sure this automatically works
+ * for both AArch64 and AArch32 accesses.
+ */
+static bool access_gic_sgi(struct kvm_vcpu *vcpu,
+			   const struct sys_reg_params *p,
+			   const struct sys_reg_desc *r)
+{
+	u64 val;
+
+	if (!p->is_write)
+		return read_from_write_only(vcpu, p);
+
+	val = *vcpu_reg(vcpu, p->Rt);
+	vgic_v3_dispatch_sgi(vcpu, val);
+
+	return true;
+}
+
 static bool trap_raz_wi(struct kvm_vcpu *vcpu,
 			const struct sys_reg_params *p,
 			const struct sys_reg_desc *r)
@@ -431,6 +452,9 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	/* VBAR_EL1 */
 	{ Op0(0b11), Op1(0b000), CRn(0b1100), CRm(0b0000), Op2(0b000),
 	  NULL, reset_val, VBAR_EL1, 0 },
+	/* ICC_SGI1R_EL1 */
+	{ Op0(0b11), Op1(0b000), CRn(0b1100), CRm(0b1011), Op2(0b101),
+	  access_gic_sgi },
 	/* CONTEXTIDR_EL1 */
 	{ Op0(0b11), Op1(0b000), CRn(0b1101), CRm(0b0000), Op2(0b001),
 	  access_vm_reg, reset_val, CONTEXTIDR_EL1, 0 },
@@ -659,6 +683,8 @@ static const struct sys_reg_desc cp14_64_regs[] = {
  * register).
  */
 static const struct sys_reg_desc cp15_regs[] = {
+	{ Op1( 0), CRn( 0), CRm(12), Op2( 0), access_gic_sgi },
+
 	{ Op1( 0), CRn( 1), CRm( 0), Op2( 0), access_sctlr, NULL, c1_SCTLR },
 	{ Op1( 0), CRn( 2), CRm( 0), Op2( 0), access_vm_reg, NULL, c2_TTBR0 },
 	{ Op1( 0), CRn( 2), CRm( 0), Op2( 1), access_vm_reg, NULL, c2_TTBR1 },
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v3 18/19] arm/arm64: KVM: enable kernel side of GICv3 emulation
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
                   ` (16 preceding siblings ...)
  2014-10-31 17:26 ` [PATCH v3 17/19] arm64: KVM: add SGI system register trapping Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-07 16:07   ` Christoffer Dall
  2014-10-31 17:26 ` [PATCH v3 19/19] arm/arm64: KVM: allow userland to request a virtual GICv3 Andre Przywara
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

With all the necessary GICv3 emulation code in place, we can now
connect the code to the GICv3 backend in the kernel.
The LR register handling is different depending on the emulated GIC
model, so provide different implementations for each.
Also allow non-v2-compatible GICv3 implementations (which don't
provide MMIO regions for the virtual CPU interface in the DT), but
restrict such hosts to running GICv3 guests only.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 virt/kvm/arm/vgic-v3.c |  168 ++++++++++++++++++++++++++++++++++++------------
 virt/kvm/arm/vgic.c    |    4 ++
 2 files changed, 130 insertions(+), 42 deletions(-)

diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
index ce50918..c0e901c 100644
--- a/virt/kvm/arm/vgic-v3.c
+++ b/virt/kvm/arm/vgic-v3.c
@@ -34,6 +34,7 @@
 #define GICH_LR_VIRTUALID		(0x3ffUL << 0)
 #define GICH_LR_PHYSID_CPUID_SHIFT	(10)
 #define GICH_LR_PHYSID_CPUID		(7UL << GICH_LR_PHYSID_CPUID_SHIFT)
+#define ICH_LR_VIRTUALID_MASK		(BIT_ULL(32) - 1)
 
 /*
  * LRs are stored in reverse order in memory. make sure we index them
@@ -43,7 +44,35 @@
 
 static u32 ich_vtr_el2;
 
-static struct vgic_lr vgic_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
+static u64 sync_lr_val(u8 state)
+{
+	u64 lr_val = 0;
+
+	if (state & LR_STATE_PENDING)
+		lr_val |= ICH_LR_PENDING_BIT;
+	if (state & LR_STATE_ACTIVE)
+		lr_val |= ICH_LR_ACTIVE_BIT;
+	if (state & LR_EOI_INT)
+		lr_val |= ICH_LR_EOI;
+
+	return lr_val;
+}
+
+static u8 sync_lr_state(u64 lr_val)
+{
+	u8 state = 0;
+
+	if (lr_val & ICH_LR_PENDING_BIT)
+		state |= LR_STATE_PENDING;
+	if (lr_val & ICH_LR_ACTIVE_BIT)
+		state |= LR_STATE_ACTIVE;
+	if (lr_val & ICH_LR_EOI)
+		state |= LR_EOI_INT;
+
+	return state;
+}
+
+static struct vgic_lr vgic_v2_on_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
 {
 	struct vgic_lr lr_desc;
 	u64 val = vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)];
@@ -53,30 +82,53 @@ static struct vgic_lr vgic_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
 		lr_desc.source	= (val >> GICH_LR_PHYSID_CPUID_SHIFT) & 0x7;
 	else
 		lr_desc.source = 0;
-	lr_desc.state	= 0;
+	lr_desc.state	= sync_lr_state(val);
 
-	if (val & ICH_LR_PENDING_BIT)
-		lr_desc.state |= LR_STATE_PENDING;
-	if (val & ICH_LR_ACTIVE_BIT)
-		lr_desc.state |= LR_STATE_ACTIVE;
-	if (val & ICH_LR_EOI)
-		lr_desc.state |= LR_EOI_INT;
+	return lr_desc;
+}
+
+static struct vgic_lr vgic_v3_on_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
+{
+	struct vgic_lr lr_desc;
+	u64 val = vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)];
+
+	lr_desc.irq	= val & ICH_LR_VIRTUALID_MASK;
+	lr_desc.source	= 0;
+	lr_desc.state	= sync_lr_state(val);
 
 	return lr_desc;
 }
 
-static void vgic_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
-			   struct vgic_lr lr_desc)
+static void vgic_v3_on_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
+				 struct vgic_lr lr_desc)
 {
-	u64 lr_val = (((u32)lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT) |
-		      lr_desc.irq);
+	u64 lr_val;
 
-	if (lr_desc.state & LR_STATE_PENDING)
-		lr_val |= ICH_LR_PENDING_BIT;
-	if (lr_desc.state & LR_STATE_ACTIVE)
-		lr_val |= ICH_LR_ACTIVE_BIT;
-	if (lr_desc.state & LR_EOI_INT)
-		lr_val |= ICH_LR_EOI;
+	lr_val = lr_desc.irq;
+
+	/*
+	 * currently all guest IRQs are Group1, as Group0 would result
+	 * in a FIQ in the guest, which it wouldn't expect.
+	 * Eventually we want to make this configurable, so we may revisit
+	 * this in the future.
+	 */
+	lr_val |= ICH_LR_GROUP;
+
+	lr_val |= sync_lr_val(lr_desc.state);
+
+	vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)] = lr_val;
+}
+
+static void vgic_v2_on_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
+				 struct vgic_lr lr_desc)
+{
+	u64 lr_val;
+
+	lr_val = lr_desc.irq;
+
+	lr_val |= (u32)lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT;
+
+	lr_val |= sync_lr_val(lr_desc.state);
 
 	vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)] = lr_val;
 }
@@ -145,9 +197,8 @@ static void vgic_v3_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcrp)
 
 static void vgic_v3_enable(struct kvm_vcpu *vcpu)
 {
-	struct vgic_v3_cpu_if *vgic_v3;
+	struct vgic_v3_cpu_if *vgic_v3 = &vcpu->arch.vgic_cpu.vgic_v3;
 
-	vgic_v3 = &vcpu->arch.vgic_cpu.vgic_v3;
 	/*
 	 * By forcing VMCR to zero, the GIC will restore the binary
 	 * points to their reset values. Anything else resets to zero
@@ -155,7 +206,14 @@ static void vgic_v3_enable(struct kvm_vcpu *vcpu)
 	 */
 	vgic_v3->vgic_vmcr = 0;
 
-	vgic_v3->vgic_sre = 0;
+	/*
+	 * Set the SRE_EL1 value depending on the configured
+	 * emulated vGIC model.
+	 */
+	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)
+		vgic_v3->vgic_sre = ICC_SRE_EL1_SRE;
+	else
+		vgic_v3->vgic_sre = 0;
 
 	/* Get the show on the road... */
 	vgic_v3->vgic_hcr = ICH_HCR_EN;
@@ -173,6 +231,15 @@ static const struct vgic_ops vgic_v3_ops = {
 	.enable			= vgic_v3_enable,
 };
 
+static void init_vgic_v3_emul(struct kvm *kvm)
+{
+	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
+
+	vm_ops->get_lr = vgic_v3_on_v3_get_lr;
+	vm_ops->set_lr = vgic_v3_on_v3_set_lr;
+	kvm->arch.max_vcpus = KVM_MAX_VCPUS;
+}
+
 static bool vgic_v3_init_emul_compat(struct kvm *kvm, int type)
 {
 	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
@@ -186,14 +253,28 @@ static bool vgic_v3_init_emul_compat(struct kvm *kvm, int type)
 			return false;
 		}
 
-		vm_ops->get_lr = vgic_v3_get_lr;
-		vm_ops->set_lr = vgic_v3_set_lr;
+		vm_ops->get_lr = vgic_v2_on_v3_get_lr;
+		vm_ops->set_lr = vgic_v2_on_v3_set_lr;
 		kvm->arch.max_vcpus = 8;
 		return true;
+	case KVM_DEV_TYPE_ARM_VGIC_V3:
+		init_vgic_v3_emul(kvm);
+		return true;
 	}
 	return false;
 }
 
+static bool vgic_v3_init_emul(struct kvm *kvm, int type)
+{
+	switch (type) {
+	case KVM_DEV_TYPE_ARM_VGIC_V3:
+		init_vgic_v3_emul(kvm);
+		return true;
+	}
+
+	return false;
+}
+
 static struct vgic_params vgic_v3_params;
 
 /**
@@ -235,29 +316,32 @@ int vgic_v3_probe(struct device_node *vgic_node,
 
 	gicv_idx += 3; /* Also skip GICD, GICC, GICH */
 	if (of_address_to_resource(vgic_node, gicv_idx, &vcpu_res)) {
-		kvm_err("Cannot obtain GICV region\n");
-		ret = -ENXIO;
-		goto out;
-	}
+		kvm_info("GICv3: GICv2 emulation not available\n");
+		vgic->vcpu_base = 0;
+		vgic->init_emul = vgic_v3_init_emul;
+	} else {
+		if (!PAGE_ALIGNED(vcpu_res.start)) {
+			kvm_err("GICV physical address 0x%llx not page aligned\n",
+				(unsigned long long)vcpu_res.start);
+			ret = -ENXIO;
+			goto out;
+		}
 
-	if (!PAGE_ALIGNED(vcpu_res.start)) {
-		kvm_err("GICV physical address 0x%llx not page aligned\n",
-			(unsigned long long)vcpu_res.start);
-		ret = -ENXIO;
-		goto out;
-	}
+		if (!PAGE_ALIGNED(resource_size(&vcpu_res))) {
+			kvm_err("GICV size 0x%llx not a multiple of page size 0x%lx\n",
+				(unsigned long long)resource_size(&vcpu_res),
+				PAGE_SIZE);
+			ret = -ENXIO;
+			goto out;
+		}
 
-	if (!PAGE_ALIGNED(resource_size(&vcpu_res))) {
-		kvm_err("GICV size 0x%llx not a multiple of page size 0x%lx\n",
-			(unsigned long long)resource_size(&vcpu_res),
-			PAGE_SIZE);
-		ret = -ENXIO;
-		goto out;
+		vgic->vcpu_base = vcpu_res.start;
+		vgic->init_emul = vgic_v3_init_emul_compat;
+		kvm_register_device_ops(&kvm_arm_vgic_v2_ops,
+					KVM_DEV_TYPE_ARM_VGIC_V2);
 	}
-	kvm_register_device_ops(&kvm_arm_vgic_v2_ops, KVM_DEV_TYPE_ARM_VGIC_V2);
+	kvm_register_device_ops(&kvm_arm_vgic_v3_ops, KVM_DEV_TYPE_ARM_VGIC_V3);
 
-	vgic->init_emul = vgic_v3_init_emul_compat;
-	vgic->vcpu_base = vcpu_res.start;
 	vgic->vctrl_base = NULL;
 	vgic->type = VGIC_V3;
 
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 2867269d..16d7c9d 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1542,6 +1542,10 @@ static bool init_emulation_ops(struct kvm *kvm, int type)
 	switch (type) {
 	case KVM_DEV_TYPE_ARM_VGIC_V2:
 		return vgic_v2_init_emulation_ops(kvm, type);
+#ifdef CONFIG_ARM_GIC_V3
+	case KVM_DEV_TYPE_ARM_VGIC_V3:
+		return vgic_v3_init_emulation_ops(kvm, type);
+#endif
 	}
 	return false;
 }
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v3 19/19] arm/arm64: KVM: allow userland to request a virtual GICv3
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
                   ` (17 preceding siblings ...)
  2014-10-31 17:26 ` [PATCH v3 18/19] arm/arm64: KVM: enable kernel side of GICv3 emulation Andre Przywara
@ 2014-10-31 17:26 ` Andre Przywara
  2014-11-07 16:15   ` Christoffer Dall
  2014-11-03 12:59 ` [PATCH v3 00/19] KVM GICv3 emulation Christoffer Dall
  2014-11-06 10:57 ` Christoffer Dall
  20 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-10-31 17:26 UTC (permalink / raw)
  To: linux-arm-kernel

With everything in place we can now allow userland to request that the
kernel use a virtual GICv3 in the guest, which finally lifts the 8 vCPU
limit for a guest.
Also provide the necessary support for userland to set the memory
addresses of the virtual distributor and the redistributors.
This requires userland code to actually make use of that feature and to
explicitly ask for a virtual GICv3.
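A minimal userland sketch of the intended usage (error handling trimmed;
the base addresses are made-up example values, and the KVM_VGIC_V3_*
constants come from the patched uapi headers):

	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	/* vm_fd is an already opened KVM VM file descriptor */
	static int create_vgic_v3(int vm_fd, __u64 dist_base, __u64 redist_base)
	{
		struct kvm_create_device cd = { .type = KVM_DEV_TYPE_ARM_VGIC_V3 };
		struct kvm_device_attr attr = { .group = KVM_DEV_ARM_VGIC_GRP_ADDR };

		if (ioctl(vm_fd, KVM_CREATE_DEVICE, &cd))
			return -1;

		attr.attr = KVM_VGIC_V3_ADDR_TYPE_DIST;
		attr.addr = (__u64)(unsigned long)&dist_base;
		if (ioctl(cd.fd, KVM_SET_DEVICE_ATTR, &attr))
			return -1;

		attr.attr = KVM_VGIC_V3_ADDR_TYPE_REDIST;
		attr.addr = (__u64)(unsigned long)&redist_base;
		return ioctl(cd.fd, KVM_SET_DEVICE_ATTR, &attr);
	}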

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 arch/arm64/include/uapi/asm/kvm.h |    7 ++++++
 include/kvm/arm_vgic.h            |    4 ++--
 virt/kvm/arm/vgic-v3-emul.c       |    3 +++
 virt/kvm/arm/vgic.c               |   46 ++++++++++++++++++++++++++-----------
 4 files changed, 45 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 8e38878..2ed873a 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -78,6 +78,13 @@ struct kvm_regs {
 #define KVM_VGIC_V2_DIST_SIZE		0x1000
 #define KVM_VGIC_V2_CPU_SIZE		0x2000
 
+/* Supported VGICv3 address types  */
+#define KVM_VGIC_V3_ADDR_TYPE_DIST	2
+#define KVM_VGIC_V3_ADDR_TYPE_REDIST	3
+
+#define KVM_VGIC_V3_DIST_SIZE		SZ_64K
+#define KVM_VGIC_V3_REDIST_SIZE		(2 * SZ_64K)
+
 #define KVM_ARM_VCPU_POWER_OFF		0 /* CPU is started in OFF state */
 #define KVM_ARM_VCPU_EL1_32BIT		1 /* CPU running a 32bit VM */
 #define KVM_ARM_VCPU_PSCI_0_2		2 /* CPU uses PSCI v0.2 */
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index c303083..e2e432c 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -35,8 +35,8 @@
 #define VGIC_MAX_IRQS		1024
 
 /* Sanity checks... */
-#if (KVM_MAX_VCPUS > 8)
-#error	Invalid number of CPU interfaces
+#if (KVM_MAX_VCPUS > 255)
+#error Too many KVM VCPUs, the VGIC only supports up to 255 VCPUs for now
 #endif
 
 #if (VGIC_NR_IRQS_LEGACY & 31)
diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
index bcb5374..ba6b0b5 100644
--- a/virt/kvm/arm/vgic-v3-emul.c
+++ b/virt/kvm/arm/vgic-v3-emul.c
@@ -870,6 +870,9 @@ static int vgic_v3_has_attr(struct kvm_device *dev,
 		case KVM_VGIC_V2_ADDR_TYPE_DIST:
 		case KVM_VGIC_V2_ADDR_TYPE_CPU:
 			return -ENXIO;
+		case KVM_VGIC_V3_ADDR_TYPE_DIST:
+		case KVM_VGIC_V3_ADDR_TYPE_REDIST:
+			return 0;
 		}
 		break;
 	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 16d7c9d..a5abef1 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1647,7 +1647,7 @@ static int vgic_ioaddr_assign(struct kvm *kvm, phys_addr_t *ioaddr,
 /**
  * kvm_vgic_addr - set or get vgic VM base addresses
  * @kvm:   pointer to the vm struct
- * @type:  the VGIC addr type, one of KVM_VGIC_V2_ADDR_TYPE_XXX
+ * @type:  the VGIC addr type, one of KVM_VGIC_V[23]_ADDR_TYPE_XXX
  * @addr:  pointer to address value
  * @write: if true set the address in the VM address space, if false read the
  *          address
@@ -1661,29 +1661,49 @@ int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write)
 {
 	int r = 0;
 	struct vgic_dist *vgic = &kvm->arch.vgic;
+	int type_needed;
+	phys_addr_t *addr_ptr, block_size;
 
 	mutex_lock(&kvm->lock);
 	switch (type) {
 	case KVM_VGIC_V2_ADDR_TYPE_DIST:
-		if (write) {
-			r = vgic_ioaddr_assign(kvm, &vgic->vgic_dist_base,
-					       *addr, KVM_VGIC_V2_DIST_SIZE);
-		} else {
-			*addr = vgic->vgic_dist_base;
-		}
+		type_needed = KVM_DEV_TYPE_ARM_VGIC_V2;
+		addr_ptr = &vgic->vgic_dist_base;
+		block_size = KVM_VGIC_V2_DIST_SIZE;
 		break;
 	case KVM_VGIC_V2_ADDR_TYPE_CPU:
-		if (write) {
-			r = vgic_ioaddr_assign(kvm, &vgic->vgic_cpu_base,
-					       *addr, KVM_VGIC_V2_CPU_SIZE);
-		} else {
-			*addr = vgic->vgic_cpu_base;
-		}
+		type_needed = KVM_DEV_TYPE_ARM_VGIC_V2;
+		addr_ptr = &vgic->vgic_cpu_base;
+		block_size = KVM_VGIC_V2_CPU_SIZE;
 		break;
+#ifdef CONFIG_ARM_GIC_V3
+	case KVM_VGIC_V3_ADDR_TYPE_DIST:
+		type_needed = KVM_DEV_TYPE_ARM_VGIC_V3;
+		addr_ptr = &vgic->vgic_dist_base;
+		block_size = KVM_VGIC_V3_DIST_SIZE;
+		break;
+	case KVM_VGIC_V3_ADDR_TYPE_REDIST:
+		type_needed = KVM_DEV_TYPE_ARM_VGIC_V3;
+		addr_ptr = &vgic->vgic_redist_base;
+		block_size = KVM_VGIC_V3_REDIST_SIZE;
+		break;
+#endif
 	default:
 		r = -ENODEV;
+		goto out;
+	}
+
+	if (vgic->vgic_model != type_needed) {
+		r = -ENODEV;
+		goto out;
 	}
 
+	if (write)
+		r = vgic_ioaddr_assign(kvm, addr_ptr, *addr, block_size);
+	else
+		*addr = *addr_ptr;
+
+out:
 	mutex_unlock(&kvm->lock);
 	return r;
 }
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v3 00/19] KVM GICv3 emulation
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
                   ` (18 preceding siblings ...)
  2014-10-31 17:26 ` [PATCH v3 19/19] arm/arm64: KVM: allow userland to request a virtual GICv3 Andre Przywara
@ 2014-11-03 12:59 ` Christoffer Dall
  2014-11-06 10:57 ` Christoffer Dall
  20 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-03 12:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:35PM +0000, Andre Przywara wrote:
> This is an updated version of the GICv3 guest emulation series.
> 
> This one is now based on v3.18-rc2, which makes this patch series
> independent now, as all formerly required patches are now upstream.
> 
> I addressed most of the comments from Christoffer's review (thanks
> for that!), this includes a split-up of two patches, so the new
> series now carries more patches to ease review.
> 
> There seem to be still endianess issues with this, so I don't claim
> this version to be compatible with anything other than LE on LE.
> I am about to debug this and will include fixes in the next version.

As I stated on the patch in v2, can we at least make a best effort not
to write code that breaks on a BE platform?  Thanks.

> 
> A git repo hosting all these patches lives in the kvm-gicv3/v3 branch
> of: http://www.linux-arm.org/git?p=linux-ap.git

Nit: In the future you may want to consider including actual git URLs in
your cover-letters, especially because the arm cgit thingy requires me
to click back, copy-paste twice, etc., to construct the git URL.

> -----
> 
> GICv3 is the ARM generic interrupt controller designed to overcome
> some limits of the prevalent GICv2. Most notably it lifts the 8-CPU
> limit. Though with recent patches from Marc there is support for
> hosts to use a GICv3, the CPU limitation still applies to KVM guests,
> since the current code emulates a GICv2 only.
> Also, GICv2 backward compatibility being optional in GICv3, a number
> of systems won't be able to run GICv2 guests.
> 
> This patch series provides code to emulate a GICv3 distributor and
> redistributor for any KVM guest. It requires a GICv3 in the host to
> work. With those patches one can run guests efficiently on any GICv3
> host. It has the following features:
> - Affinity routing (support for up to 255 VCPUs, more possible)
> - System registers (as opposed to MMIO access)
> - No ITS
> - No priority support (as the GICv2 emulation)
> - No save / restore support so far (will be added soon)
> 
> The first patches actually refactor the current VGIC code to make
> room for a different VGIC model to be dropped in with Patch 16.
> The remaining patches connect the new model to the kernel backend and
> the userland facing code.
> 
> The series goes on top of v3.18-rc2.
> The necessary patches for kvmtool to enable the guest's GICv3 have
> been posted here before [1], an updated version will follow soon.
> 
> There was some testing on the fast model with some I/O and interrupt
> affinity shuffling in a Linux guest with a varying number of VCPUs as
> well as some testing on a Juno board (GICv2 only, to spot
> regressions).
> 
> Please review and test.
> I would be grateful for people to test for GICv2 regressions also
> (so on a GICv2 host with current kvmtool/qemu), as there is quite
> some refactoring on that front.
> 
> Much of the code was inspired by MarcZ, also kudos to him for doing
> the rather painful rebase on top of v3.17-rc1.
> 
> Cheers,
> Andre.
> 
> [1] https://lists.cs.columbia.edu/pipermail/kvmarm/2014-June/010086.html
> 
> Changes v2 ... v3:
> * rebase to v3.18-rc2
> * adapt to new kvm_register_device() function
> * split up vm_ops patch and the GICv2 split-off patch to ease review
> * various smaller changes due to Christoffer's review
> * fix compilation for arm
> * remove support for trapping SGI sysreg accesses on arm hosts
> 
> Changes v1 ... v2:
> * rebase to v3.17-rc1, caused quite some changes to the init code
> * new 9/15 patch to make 10/15 smaller
> * fix wrongly ordered cp15 register trap entry (MarcZ)
> * fix SGI broadcast (thanks to wanghaibin for spotting)
> * fix broken bailout path in kvm_vgic_create (wanghaibin)
> * check return value of init_emulation_ops() (wanghaibin)
> * fix return value check in vgic_[sg]et_attr()
> * add header inclusion guards
> * remove double definition of VCPU_NOT_ALLOCATED
> * some code move-around
> * whitespace fixes
> 
> Andre Przywara (19):
>   arm/arm64: KVM: rework MPIDR assignment and add accessors
>   arm/arm64: KVM: pass down user space provided GIC type into vGIC code
>   arm/arm64: KVM: refactor vgic_handle_mmio() function
>   arm/arm64: KVM: wrap 64 bit MMIO accesses with two 32 bit ones
>   arm/arm64: KVM: introduce per-VM ops
>   arm/arm64: KVM: move [sg]et_lr into per-VM ops
>   arm/arm64: KVM: move kvm_register_device_ops() into vGIC probing
>   arm/arm64: KVM: dont rely on a valid GICH base address
>   arm/arm64: KVM: make the maximum number of vCPUs a per-VM value
>   arm/arm64: KVM: make the value of ICC_SRE_EL1 a per-VM variable
>   arm/arm64: KVM: refactor MMIO accessors
>   arm/arm64: KVM: refactor/wrap vgic_set/get_attr()
>   arm/arm64: KVM: add vgic.h header file
>   arm/arm64: KVM: split GICv2 specific emulation code from vgic.c
>   arm/arm64: KVM: add opaque private pointer to MMIO accessors
>   arm/arm64: KVM: add virtual GICv3 distributor emulation
>   arm64: KVM: add SGI system register trapping
>   arm/arm64: KVM: enable kernel side of GICv3 emulation
>   arm/arm64: KVM: allow userland to request a virtual GICv3
> 
>  arch/arm/include/asm/kvm_emulate.h   |    3 +-
>  arch/arm/include/asm/kvm_host.h      |    3 +
>  arch/arm/kvm/Makefile                |    1 +
>  arch/arm/kvm/arm.c                   |   23 +-
>  arch/arm/kvm/psci.c                  |   15 +-
>  arch/arm64/include/asm/kvm_emulate.h |    3 +-
>  arch/arm64/include/asm/kvm_host.h    |    5 +
>  arch/arm64/include/uapi/asm/kvm.h    |    7 +
>  arch/arm64/kernel/asm-offsets.c      |    1 +
>  arch/arm64/kvm/Makefile              |    2 +
>  arch/arm64/kvm/sys_regs.c            |   37 +-
>  arch/arm64/kvm/vgic-v3-switch.S      |   14 +-
>  include/kvm/arm_vgic.h               |   37 +-
>  include/linux/irqchip/arm-gic-v3.h   |   26 +
>  include/linux/kvm_host.h             |    2 +
>  include/uapi/linux/kvm.h             |    2 +
>  virt/kvm/arm/vgic-v2-emul.c          |  802 ++++++++++++++++++++++++++
>  virt/kvm/arm/vgic-v2.c               |   26 +-
>  virt/kvm/arm/vgic-v3-emul.c          |  894 +++++++++++++++++++++++++++++
>  virt/kvm/arm/vgic-v3.c               |  192 +++++--
>  virt/kvm/arm/vgic.c                  | 1018 +++++++---------------------------
>  virt/kvm/arm/vgic.h                  |  128 +++++
>  22 files changed, 2366 insertions(+), 875 deletions(-)
>  create mode 100644 virt/kvm/arm/vgic-v2-emul.c
>  create mode 100644 virt/kvm/arm/vgic-v3-emul.c
>  create mode 100644 virt/kvm/arm/vgic.h
> 
> -- 
> 1.7.9.5
> 

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v3 01/19] arm/arm64: KVM: rework MPIDR assignment and add accessors
  2014-10-31 17:26 ` [PATCH v3 01/19] arm/arm64: KVM: rework MPIDR assignment and add accessors Andre Przywara
@ 2014-11-03 13:13   ` Christoffer Dall
  0 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-03 13:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:36PM +0000, Andre Przywara wrote:
> The virtual MPIDR registers (containing topology information) for the
> guest are currently mapped linearily to the vcpu_id. Improve this
> mapping for arm64 by using three levels to not artificially limit the
> number of vCPUs. Also add an accessor to later allow easier access to
> a vCPU with a given MPIDR.
> Use this new accessor in the PSCI emulation.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  arch/arm/include/asm/kvm_emulate.h   |    3 ++-
>  arch/arm/include/asm/kvm_host.h      |    2 ++
>  arch/arm/kvm/arm.c                   |   15 +++++++++++++++
>  arch/arm/kvm/psci.c                  |   15 ++++-----------
>  arch/arm64/include/asm/kvm_emulate.h |    3 ++-
>  arch/arm64/include/asm/kvm_host.h    |    2 ++
>  arch/arm64/kvm/sys_regs.c            |   11 +++++++++--
>  7 files changed, 36 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
> index b9db269..bd54383 100644
> --- a/arch/arm/include/asm/kvm_emulate.h
> +++ b/arch/arm/include/asm/kvm_emulate.h
> @@ -23,6 +23,7 @@
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_mmio.h>
>  #include <asm/kvm_arm.h>
> +#include <asm/cputype.h>
>  
>  unsigned long *vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num);
>  unsigned long *vcpu_spsr(struct kvm_vcpu *vcpu);
> @@ -164,7 +165,7 @@ static inline u32 kvm_vcpu_hvc_get_imm(struct kvm_vcpu *vcpu)
>  
>  static inline unsigned long kvm_vcpu_get_mpidr(struct kvm_vcpu *vcpu)
>  {
> -	return vcpu->arch.cp15[c0_MPIDR];
> +	return vcpu->arch.cp15[c0_MPIDR] & MPIDR_HWID_BITMASK;
>  }

Continuing the discussion from the previous version: yes, please don't
call it get_mpidr() if it returns a masked-off version of the MPIDR;
call it get_mpidr_hwid() or something instead.
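I.e. (just a sketch of what I mean, arm side shown; the arm64 variant
would be analogous):

static inline unsigned long kvm_vcpu_get_mpidr_hwid(struct kvm_vcpu *vcpu)
{
	return vcpu->arch.cp15[c0_MPIDR] & MPIDR_HWID_BITMASK;
}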

>  
>  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 53036e2..b443dfe 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -236,6 +236,8 @@ static inline void vgic_arch_setup(const struct vgic_params *vgic)
>  int kvm_perf_init(void);
>  int kvm_perf_teardown(void);
>  
> +struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
> +
>  static inline void kvm_arch_hardware_disable(void) {}
>  static inline void kvm_arch_hardware_unsetup(void) {}
>  static inline void kvm_arch_sync_events(struct kvm *kvm) {}
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 9e193c8..61f13cc 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -977,6 +977,21 @@ static void check_kvm_target_cpu(void *ret)
>  	*(int *)ret = kvm_target_cpu();
>  }
>  
> +struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr)
> +{
> +	unsigned long c_mpidr;
> +	struct kvm_vcpu *vcpu;
> +	int i;
> +
> +	mpidr &= MPIDR_HWID_BITMASK;
> +	kvm_for_each_vcpu(i, vcpu, kvm) {
> +		c_mpidr = kvm_vcpu_get_mpidr(vcpu);
> +		if (c_mpidr == mpidr)
> +			return vcpu;

why do you need the c_mpidr variable at all?
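Something like this (untested) would read just as well without it:

	kvm_for_each_vcpu(i, vcpu, kvm) {
		if (kvm_vcpu_get_mpidr(vcpu) == mpidr)
			return vcpu;
	}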

> +	}
> +	return NULL;
> +}
> +
>  /**
>   * Initialize Hyp-mode and memory mappings on all CPUs.
>   */
> diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
> index 09cf377..49f0992 100644
> --- a/arch/arm/kvm/psci.c
> +++ b/arch/arm/kvm/psci.c
> @@ -21,6 +21,7 @@
>  #include <asm/cputype.h>
>  #include <asm/kvm_emulate.h>
>  #include <asm/kvm_psci.h>
> +#include <asm/kvm_host.h>
>  
>  /*
>   * This is an implementation of the Power State Coordination Interface
> @@ -65,25 +66,17 @@ static void kvm_psci_vcpu_off(struct kvm_vcpu *vcpu)
>  static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
>  {
>  	struct kvm *kvm = source_vcpu->kvm;
> -	struct kvm_vcpu *vcpu = NULL, *tmp;
> +	struct kvm_vcpu *vcpu = NULL;
>  	wait_queue_head_t *wq;
>  	unsigned long cpu_id;
>  	unsigned long context_id;
> -	unsigned long mpidr;
>  	phys_addr_t target_pc;
> -	int i;
>  
> -	cpu_id = *vcpu_reg(source_vcpu, 1);
> +	cpu_id = *vcpu_reg(source_vcpu, 1) & MPIDR_HWID_BITMASK;
>  	if (vcpu_mode_is_32bit(source_vcpu))
>  		cpu_id &= ~((u32) 0);
>  
> -	kvm_for_each_vcpu(i, tmp, kvm) {
> -		mpidr = kvm_vcpu_get_mpidr(tmp);
> -		if ((mpidr & MPIDR_HWID_BITMASK) == (cpu_id & MPIDR_HWID_BITMASK)) {
> -			vcpu = tmp;
> -			break;
> -		}
> -	}
> +	vcpu = kvm_mpidr_to_vcpu(kvm, cpu_id);
>  
>  	/*
>  	 * Make sure the caller requested a valid CPU and that the CPU is
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index 5674a55..37316dd 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -27,6 +27,7 @@
>  #include <asm/kvm_arm.h>
>  #include <asm/kvm_mmio.h>
>  #include <asm/ptrace.h>
> +#include <asm/cputype.h>
>  
>  unsigned long *vcpu_reg32(const struct kvm_vcpu *vcpu, u8 reg_num);
>  unsigned long *vcpu_spsr32(const struct kvm_vcpu *vcpu);
> @@ -184,7 +185,7 @@ static inline u8 kvm_vcpu_trap_get_fault_type(const struct kvm_vcpu *vcpu)
>  
>  static inline unsigned long kvm_vcpu_get_mpidr(struct kvm_vcpu *vcpu)
>  {
> -	return vcpu_sys_reg(vcpu, MPIDR_EL1);
> +	return vcpu_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK;
>  }
>  
>  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 2012c4b..286bb61 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -207,6 +207,8 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  int kvm_perf_init(void);
>  int kvm_perf_teardown(void);
>  
> +struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
> +
>  static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
>  				       phys_addr_t pgd_ptr,
>  				       unsigned long hyp_stack_ptr,
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 4cc3b71..dcc5867 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -252,10 +252,17 @@ static void reset_amair_el1(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
>  
>  static void reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
>  {
> +	u64 mpidr;
> +
>  	/*
> -	 * Simply map the vcpu_id into the Aff0 field of the MPIDR.
> +	 * Map the vcpu_id into the first three Aff fields of the MPIDR.
> +	 * Aff0 uses only 16 CPUs, since there is a SGI injection
> +	 * limitation of GICv3.

This last sentence is worded weirdly, so I suggested an alternative
version in my last review, which you missed/ignored.  Please address it.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v3 02/19] arm/arm64: KVM: pass down user space provided GIC type into vGIC code
  2014-10-31 17:26 ` [PATCH v3 02/19] arm/arm64: KVM: pass down user space provided GIC type into vGIC code Andre Przywara
@ 2014-11-03 13:14   ` Christoffer Dall
  2014-11-03 13:25     ` Andre Przywara
  0 siblings, 1 reply; 76+ messages in thread
From: Christoffer Dall @ 2014-11-03 13:14 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:37PM +0000, Andre Przywara wrote:
> With the introduction of a second emulated GIC model we need to let
> userspace specify the GIC model to use for each VM. Pass the
> userspace provided value down into the vGIC code and store it there
> to differentiate later.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

Did you change anything since v2?

If not, care to apply my ack from last time?

-Christoffer

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v3 03/19] arm/arm64: KVM: refactor vgic_handle_mmio() function
  2014-10-31 17:26 ` [PATCH v3 03/19] arm/arm64: KVM: refactor vgic_handle_mmio() function Andre Przywara
@ 2014-11-03 13:23   ` Christoffer Dall
  0 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-03 13:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:38PM +0000, Andre Przywara wrote:
> Currently we only need to deal with one MMIO region for the GIC
> emulation, but we soon need to extend this. Refactor the existing
> code to allow easier addition of different ranges without code
> duplication.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  virt/kvm/arm/vgic.c |   77 +++++++++++++++++++++++++++++++++++++--------------
>  1 file changed, 56 insertions(+), 21 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 2403d72..704be48 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -1032,37 +1032,28 @@ static bool vgic_validate_access(const struct vgic_dist *dist,
>  	return true;
>  }
>  
> -/**
> - * vgic_handle_mmio - handle an in-kernel MMIO access
> +/*
> + * vgic_handle_mmio_range - handle an in-kernel MMIO access
>   * @vcpu:	pointer to the vcpu performing the access
>   * @run:	pointer to the kvm_run structure
>   * @mmio:	pointer to the data describing the access
> + * @ranges:	pointer to the register defining structure
> + * @mmio_base:	base address for this mapping
>   *
> - * returns true if the MMIO access has been performed in kernel space,
> - * and false if it needs to be emulated in user space.
> + * returns true if the MMIO access could be performed
>   */
> -bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
> -		      struct kvm_exit_mmio *mmio)
> +static bool vgic_handle_mmio_range(struct kvm_vcpu *vcpu, struct kvm_run *run,
> +			    struct kvm_exit_mmio *mmio,
> +			    const struct mmio_range *ranges,
> +			    unsigned long mmio_base)
>  {
>  	const struct mmio_range *range;
>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> -	unsigned long base = dist->vgic_dist_base;
>  	bool updated_state;
>  	unsigned long offset;
>  
> -	if (!irqchip_in_kernel(vcpu->kvm) ||
> -	    mmio->phys_addr < base ||
> -	    (mmio->phys_addr + mmio->len) > (base + KVM_VGIC_V2_DIST_SIZE))
> -		return false;
> -
> -	/* We don't support ldrd / strd or ldm / stm to the emulated vgic */
> -	if (mmio->len > 4) {
> -		kvm_inject_dabt(vcpu, mmio->phys_addr);
> -		return true;
> -	}
> -
> -	offset = mmio->phys_addr - base;
> -	range = find_matching_range(vgic_dist_ranges, mmio, offset);
> +	offset = mmio->phys_addr - mmio_base;
> +	range = find_matching_range(ranges, mmio, offset);
>  	if (unlikely(!range || !range->handle_mmio)) {
>  		pr_warn("Unhandled access %d %08llx %d\n",
>  			mmio->is_write, mmio->phys_addr, mmio->len);
> @@ -1070,7 +1061,7 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  	}
>  
>  	spin_lock(&vcpu->kvm->arch.vgic.lock);
> -	offset = mmio->phys_addr - range->base - base;
> +	offset -= range->base;
>  	if (vgic_validate_access(dist, range, offset)) {
>  		updated_state = range->handle_mmio(vcpu, mmio, offset);
>  	} else {
> @@ -1088,6 +1079,50 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  	return true;
>  }
>  
> +static inline bool is_in_range(phys_addr_t addr, unsigned long len,
> +			       phys_addr_t baseaddr, unsigned long size)
> +{
> +	if (addr < baseaddr)
> +		return false;
> +	return addr + len <= baseaddr + size;

not sure, but this may be simpler as you had it before:

return addr >= baseaddr &&
	addr + len <= baseaddr + size;

> +}
> +
> +static bool vgic_v2_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
> +				struct kvm_exit_mmio *mmio)
> +{
> +	unsigned long base = vcpu->kvm->arch.vgic.vgic_dist_base;
> +
> +	if (!is_in_range(mmio->phys_addr, mmio->len, base,
> +			 KVM_VGIC_V2_DIST_SIZE))
> +		return false;
> +
> +	/* GICv2 does not support accesses wider than 32 bits */
> +	if (mmio->len > 4) {
> +		kvm_inject_dabt(vcpu, mmio->phys_addr);
> +		return true;
> +	}
> +
> +	return vgic_handle_mmio_range(vcpu, run, mmio, vgic_dist_ranges, base);
> +}
> +
> +/**
> + * vgic_handle_mmio - handle an in-kernel MMIO access for the GIC emulation
> + * @vcpu:      pointer to the vcpu performing the access
> + * @run:       pointer to the kvm_run structure
> + * @mmio:      pointer to the data describing the access
> + *
> + * returns true if the MMIO access has been performed in kernel space,
> + * and false if it needs to be emulated in user space.
> + */
> +bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
> +		      struct kvm_exit_mmio *mmio)
> +{
> +	if (!irqchip_in_kernel(vcpu->kvm))
> +		return false;
> +
> +	return vgic_v2_handle_mmio(vcpu, run, mmio);
> +}
> +
>  static u8 *vgic_get_sgi_sources(struct vgic_dist *dist, int vcpu_id, int sgi)
>  {
>  	return dist->irq_sgi_sources + vcpu_id * VGIC_NR_SGIS + sgi;
> -- 
> 1.7.9.5
> 

otherwise:

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v3 02/19] arm/arm64: KVM: pass down user space provided GIC type into vGIC code
  2014-11-03 13:14   ` Christoffer Dall
@ 2014-11-03 13:25     ` Andre Przywara
  2014-11-03 16:51       ` Christoffer Dall
  0 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-11-03 13:25 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Christoffer,

On 03/11/14 13:14, Christoffer Dall wrote:
> On Fri, Oct 31, 2014 at 05:26:37PM +0000, Andre Przywara wrote:
>> With the introduction of a second emulated GIC model we need to let
>> userspace specify the GIC model to use for each VM. Pass the
>> userspace provided value down into the vGIC code and store it there
>> to differentiate later.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> 
> Did you change anything since v2?

Yes (and that's why I dropped your ack):
I moved the line that stores the vgic_model from the "introduce
per-VM-ops" patch into here (plus the declaration of it):

	kvm->arch.vgic.vgic_model = type;

That was part of the split-up to make that bigger patch more readable.

Regards,
Andre.

> 
> If not, care to apply my ack from last time?
> 
> -Christoffer
> 

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v3 04/19] arm/arm64: KVM: wrap 64 bit MMIO accesses with two 32 bit ones
  2014-10-31 17:26 ` [PATCH v3 04/19] arm/arm64: KVM: wrap 64 bit MMIO accesses with two 32 bit ones Andre Przywara
@ 2014-11-03 13:25   ` Christoffer Dall
  2014-11-04 12:18     ` Andre Przywara
  0 siblings, 1 reply; 76+ messages in thread
From: Christoffer Dall @ 2014-11-03 13:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:39PM +0000, Andre Przywara wrote:
> Some GICv3 registers can and will be accessed as 64 bit registers.
> Currently the register handling code can only deal with 32 bit
> accesses, so we do two consecutive calls to cover this.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  virt/kvm/arm/vgic.c |   48 +++++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 45 insertions(+), 3 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 704be48..0cbdde9 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -1033,6 +1033,48 @@ static bool vgic_validate_access(const struct vgic_dist *dist,
>  }
>  
>  /*
> + * Call the respective handler function for the given range.
> + * We split up any 64 bit accesses into two consecutive 32 bit
> + * handler calls and merge the result afterwards.
> + */
> +static bool call_range_handler(struct kvm_vcpu *vcpu,
> +			       struct kvm_exit_mmio *mmio,
> +			       unsigned long offset,
> +			       const struct mmio_range *range)
> +{
> +	u32 *data32 = (void *)mmio->data;
> +	struct kvm_exit_mmio mmio32;
> +	bool ret;
> +
> +	if (likely(mmio->len <= 4))
> +		return range->handle_mmio(vcpu, mmio, offset);
> +
> +	/*
> +	 * Any access bigger than 4 bytes (that we currently handle in KVM)
> +	 * is actually 8 bytes long, caused by a 64-bit access
> +	 */
> +
> +	mmio32.len = 4;
> +	mmio32.is_write = mmio->is_write;
> +
> +	mmio32.phys_addr = mmio->phys_addr + 4;
> +	if (mmio->is_write)
> +		*(u32 *)mmio32.data = data32[1];
> +	ret = range->handle_mmio(vcpu, &mmio32, offset + 4);
> +	if (!mmio->is_write)
> +		data32[1] = *(u32 *)mmio32.data;
> +
> +	mmio32.phys_addr = mmio->phys_addr;
> +	if (mmio->is_write)
> +		*(u32 *)mmio32.data = data32[0];
> +	ret |= range->handle_mmio(vcpu, &mmio32, offset);
> +	if (!mmio->is_write)
> +		data32[0] = *(u32 *)mmio32.data;
> +
> +	return ret;
> +}

Please think about the endianness issues here.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v3 05/19] arm/arm64: KVM: introduce per-VM ops
  2014-10-31 17:26 ` [PATCH v3 05/19] arm/arm64: KVM: introduce per-VM ops Andre Przywara
@ 2014-11-03 13:59   ` Christoffer Dall
  2014-11-04 15:58     ` Andre Przywara
  0 siblings, 1 reply; 76+ messages in thread
From: Christoffer Dall @ 2014-11-03 13:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:40PM +0000, Andre Przywara wrote:
> Currently we only have one virtual GIC model supported, so all guests
> use the same emulation code. With the addition of another model we
> end up with different guests using potentially different vGIC models,
> so we have to split up some functions to be per VM.
> Introduce a vgic_vm_ops struct to hold function pointers for those
> functions that are different and provide the necessary code to
> initialize them.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  include/kvm/arm_vgic.h |   10 ++++++
>  virt/kvm/arm/vgic.c    |   81 +++++++++++++++++++++++++++++++++++-------------
>  2 files changed, 69 insertions(+), 22 deletions(-)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index dde5a00..bfb660a 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -134,6 +134,14 @@ struct vgic_params {
>  	void __iomem	*vctrl_base;
>  };
>  
> +struct vgic_vm_ops {
> +	bool	(*handle_mmio)(struct kvm_vcpu *, struct kvm_run *,
> +			       struct kvm_exit_mmio *);
> +	bool	(*queue_sgi)(struct kvm_vcpu *vcpu, int irq);
> +	void	(*add_sgi_source)(struct kvm_vcpu *vcpu, int irq, int source);
> +	int	(*vgic_init)(struct kvm *kvm, const struct vgic_params *params);
> +};
> +
>  struct vgic_dist {
>  #ifdef CONFIG_KVM_ARM_VGIC
>  	spinlock_t		lock;
> @@ -215,6 +223,8 @@ struct vgic_dist {
>  
>  	/* Bitmap indicating which CPU has something pending */
>  	unsigned long		*irq_pending_on_cpu;
> +
> +	struct vgic_vm_ops	vm_ops;
>  #endif
>  };
>  
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 0cbdde9..2c16684 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -105,6 +105,8 @@ static void vgic_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr);
>  static const struct vgic_ops *vgic_ops;
>  static const struct vgic_params *vgic;
>  
> +#define vgic_vm_op(kvm, fn) ((kvm)->arch.vgic.vm_ops.fn)
> +

another one?  why did you simply ignore my comment from the last review?

If it wasn't obvious last time around, YUCK, and no ;)
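Just as a sketch of what I'd find easier on the reader: spelling the call
out, e.g.

	return vcpu->kvm->arch.vgic.vm_ops.handle_mmio(vcpu, run, mmio);

(or hiding it behind a static inline wrapper) is at least obvious at the
call site.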

>  /*
>   * struct vgic_bitmap contains a bitmap made of unsigned longs, but
>   * extracts u32s out of them.
> @@ -761,6 +763,13 @@ static bool handle_mmio_sgi_reg(struct kvm_vcpu *vcpu,
>  	return false;
>  }
>  
> +static void vgic_v2_add_sgi_source(struct kvm_vcpu *vcpu, int irq, int source)
> +{
> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +
> +	*vgic_get_sgi_sources(dist, vcpu->vcpu_id, irq) |= 1 << source;
> +}
> +
>  /**
>   * vgic_unqueue_irqs - move pending IRQs from LRs to the distributor
>   * @vgic_cpu: Pointer to the vgic_cpu struct holding the LRs
> @@ -775,9 +784,7 @@ static bool handle_mmio_sgi_reg(struct kvm_vcpu *vcpu,
>   */
>  static void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
>  {
> -	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> -	int vcpu_id = vcpu->vcpu_id;
>  	int i;
>  
>  	for_each_set_bit(i, vgic_cpu->lr_used, vgic_cpu->nr_lr) {
> @@ -804,7 +811,8 @@ static void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
>  		 */
>  		vgic_dist_irq_set_pending(vcpu, lr.irq);
>  		if (lr.irq < VGIC_NR_SGIS)
> -			*vgic_get_sgi_sources(dist, vcpu_id, lr.irq) |= 1 << lr.source;
> +			vgic_vm_op(vcpu->kvm, add_sgi_source)(vcpu, lr.irq,
> +							      lr.source);
>  		lr.state &= ~LR_STATE_PENDING;
>  		vgic_set_lr(vcpu, i, lr);
>  
> @@ -1162,7 +1170,7 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  	if (!irqchip_in_kernel(vcpu->kvm))
>  		return false;
>  
> -	return vgic_v2_handle_mmio(vcpu, run, mmio);
> +	return vgic_vm_op(vcpu->kvm, handle_mmio)(vcpu, run, mmio);
>  }
>  
>  static u8 *vgic_get_sgi_sources(struct vgic_dist *dist, int vcpu_id, int sgi)
> @@ -1414,7 +1422,7 @@ static bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
>  	return true;
>  }
>  
> -static bool vgic_queue_sgi(struct kvm_vcpu *vcpu, int irq)
> +static bool vgic_v2_queue_sgi(struct kvm_vcpu *vcpu, int irq)
>  {
>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>  	unsigned long sources;
> @@ -1489,7 +1497,7 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
>  
>  	/* SGIs */
>  	for_each_set_bit(i, vgic_cpu->pending_percpu, VGIC_NR_SGIS) {
> -		if (!vgic_queue_sgi(vcpu, i))
> +		if (!vgic_vm_op(vcpu->kvm, queue_sgi)(vcpu, i))
>  			overflow = 1;
>  	}
>  
> @@ -1944,9 +1952,6 @@ static int vgic_init_maps(struct kvm *kvm)
>  		}
>  	}
>  
> -	for (i = VGIC_NR_PRIVATE_IRQS; i < dist->nr_irqs; i += 4)
> -		vgic_set_target_reg(kvm, 0, i);
> -

Remind me, why are we moving this chunk?

>  out:
>  	if (ret)
>  		kvm_vgic_destroy(kvm);
> @@ -1954,6 +1959,31 @@ out:
>  	return ret;
>  }
>  
> +static int vgic_v2_init(struct kvm *kvm, const struct vgic_params *params)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	int ret, i;
> +
> +	if (IS_VGIC_ADDR_UNDEF(dist->vgic_dist_base) ||
> +	    IS_VGIC_ADDR_UNDEF(dist->vgic_cpu_base)) {
> +		kvm_err("Need to set vgic distributor addresses first\n");
> +		return -ENXIO;
> +	}
> +
> +	ret = kvm_phys_addr_ioremap(kvm, dist->vgic_cpu_base,
> +				    params->vcpu_base,
> +				    KVM_VGIC_V2_CPU_SIZE, true);
> +	if (ret) {
> +		kvm_err("Unable to remap VGIC CPU to VCPU\n");
> +		return ret;
> +	}
> +
> +	for (i = VGIC_NR_PRIVATE_IRQS; i < dist->nr_irqs; i += 4)
> +		vgic_set_target_reg(kvm, 0, i);
> +
> +	return 0;
> +}
> +
>  /**
>   * kvm_vgic_init - Initialize global VGIC state before running any VCPUs
>   * @kvm: pointer to the kvm struct
> @@ -1976,26 +2006,15 @@ int kvm_vgic_init(struct kvm *kvm)
>  	if (vgic_initialized(kvm))
>  		goto out;
>  
> -	if (IS_VGIC_ADDR_UNDEF(kvm->arch.vgic.vgic_dist_base) ||
> -	    IS_VGIC_ADDR_UNDEF(kvm->arch.vgic.vgic_cpu_base)) {
> -		kvm_err("Need to set vgic cpu and dist addresses first\n");
> -		ret = -ENXIO;
> -		goto out;
> -	}
> -
>  	ret = vgic_init_maps(kvm);
>  	if (ret) {
>  		kvm_err("Unable to allocate maps\n");
>  		goto out;
>  	}
>  
> -	ret = kvm_phys_addr_ioremap(kvm, kvm->arch.vgic.vgic_cpu_base,
> -				    vgic->vcpu_base, KVM_VGIC_V2_CPU_SIZE,
> -				    true);
> -	if (ret) {
> -		kvm_err("Unable to remap VGIC CPU to VCPU\n");
> +	ret = vgic_vm_op(kvm, vgic_init)(kvm, vgic);
> +	if (ret)
>  		goto out;
> -	}
>  
>  	kvm_for_each_vcpu(i, vcpu, kvm)
>  		kvm_vgic_vcpu_init(vcpu);
> @@ -2008,6 +2027,21 @@ out:
>  	return ret;
>  }
>  
> +static bool init_emulation_ops(struct kvm *kvm, int type)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +
> +	switch (type) {
> +	case KVM_DEV_TYPE_ARM_VGIC_V2:
> +		dist->vm_ops.handle_mmio = vgic_v2_handle_mmio;
> +		dist->vm_ops.queue_sgi = vgic_v2_queue_sgi;
> +		dist->vm_ops.add_sgi_source = vgic_v2_add_sgi_source;
> +		dist->vm_ops.vgic_init = vgic_v2_init;
> +		return true;
> +	}
> +	return false;
> +}
> +
>  int kvm_vgic_create(struct kvm *kvm, u32 type)
>  {
>  	int i, vcpu_lock_idx = -1, ret = 0;
> @@ -2045,6 +2079,9 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
>  	kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
>  	kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
>  
> +	if (!init_emulation_ops(kvm, type))
> +		ret = -ENODEV;
> +
>  out_unlock:
>  	for (; vcpu_lock_idx >= 0; vcpu_lock_idx--) {
>  		vcpu = kvm_get_vcpu(kvm, vcpu_lock_idx);
> -- 
> 1.7.9.5
> 

-Christoffer

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v3 06/19] arm/arm64: KVM: move [sg]et_lr into per-VM ops
  2014-10-31 17:26 ` [PATCH v3 06/19] arm/arm64: KVM: move [sg]et_lr into " Andre Przywara
@ 2014-11-03 14:15   ` Christoffer Dall
  2014-11-04 16:30     ` Andre Przywara
  0 siblings, 1 reply; 76+ messages in thread
From: Christoffer Dall @ 2014-11-03 14:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:41PM +0000, Andre Przywara wrote:
> The function to set the VGIC's list registers are not only dependent
> on the host GIC model, but need to behave slightly different for
> the type of emulated guest GIC.
> So move the functions into the new struct vgic_vm_ops and initialize
> them properly to prepare for guest GICv3 support later.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  include/kvm/arm_vgic.h |    5 +++--
>  virt/kvm/arm/vgic-v2.c |   17 +++++++++++++++--
>  virt/kvm/arm/vgic-v3.c |   16 ++++++++++++++--
>  virt/kvm/arm/vgic.c    |    9 +++++++--
>  4 files changed, 39 insertions(+), 8 deletions(-)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index bfb660a..a6d41f1 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -108,8 +108,6 @@ struct vgic_vmcr {
>  };
>  
>  struct vgic_ops {
> -	struct vgic_lr	(*get_lr)(const struct kvm_vcpu *, int);
> -	void	(*set_lr)(struct kvm_vcpu *, int, struct vgic_lr);
>  	void	(*sync_lr_elrsr)(struct kvm_vcpu *, int, struct vgic_lr);
>  	u64	(*get_elrsr)(const struct kvm_vcpu *vcpu);
>  	u64	(*get_eisr)(const struct kvm_vcpu *vcpu);
> @@ -132,9 +130,12 @@ struct vgic_params {
>  	unsigned int	maint_irq;
>  	/* Virtual control interface base address */
>  	void __iomem	*vctrl_base;
> +	bool (*init_emul)(struct kvm *kvm, int type);
>  };
>  
>  struct vgic_vm_ops {
> +	struct vgic_lr	(*get_lr)(const struct kvm_vcpu *, int);
> +	void	(*set_lr)(struct kvm_vcpu *, int, struct vgic_lr);
>  	bool	(*handle_mmio)(struct kvm_vcpu *, struct kvm_run *,
>  			       struct kvm_exit_mmio *);
>  	bool	(*queue_sgi)(struct kvm_vcpu *vcpu, int irq);


This has now become incredibly confusing. What are your thoughts on
renaming vgic_ops to kvm_gic_ops, to make it clear that that structure is
about the ops managing the hardware, while vgic_vm_ops is about the vgic,
i.e. the virtual instance?

> diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
> index 2935405..bdc8d97 100644
> --- a/virt/kvm/arm/vgic-v2.c
> +++ b/virt/kvm/arm/vgic-v2.c
> @@ -143,8 +143,6 @@ static void vgic_v2_enable(struct kvm_vcpu *vcpu)
>  }
>  
>  static const struct vgic_ops vgic_v2_ops = {
> -	.get_lr			= vgic_v2_get_lr,
> -	.set_lr			= vgic_v2_set_lr,
>  	.sync_lr_elrsr		= vgic_v2_sync_lr_elrsr,
>  	.get_elrsr		= vgic_v2_get_elrsr,
>  	.get_eisr		= vgic_v2_get_eisr,
> @@ -158,6 +156,20 @@ static const struct vgic_ops vgic_v2_ops = {
>  
>  static struct vgic_params vgic_v2_params;
>  
> +static bool vgic_v2_init_emul(struct kvm *kvm, int type)
> +{
> +	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
> +
> +	switch (type) {
> +	case KVM_DEV_TYPE_ARM_VGIC_V2:
> +		vm_ops->get_lr = vgic_v2_get_lr;
> +		vm_ops->set_lr = vgic_v2_set_lr;
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
>  /**
>   * vgic_v2_probe - probe for a GICv2 compatible interrupt controller in DT
>   * @node:	pointer to the DT node
> @@ -196,6 +208,7 @@ int vgic_v2_probe(struct device_node *vgic_node,
>  		ret = -ENOMEM;
>  		goto out;
>  	}
> +	vgic->init_emul = vgic_v2_init_emul;
>  
>  	vgic->nr_lr = readl_relaxed(vgic->vctrl_base + GICH_VTR);
>  	vgic->nr_lr = (vgic->nr_lr & 0x3f) + 1;
> diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
> index 1c2c8ee..a38339e 100644
> --- a/virt/kvm/arm/vgic-v3.c
> +++ b/virt/kvm/arm/vgic-v3.c
> @@ -157,8 +157,6 @@ static void vgic_v3_enable(struct kvm_vcpu *vcpu)
>  }
>  
>  static const struct vgic_ops vgic_v3_ops = {
> -	.get_lr			= vgic_v3_get_lr,
> -	.set_lr			= vgic_v3_set_lr,
>  	.sync_lr_elrsr		= vgic_v3_sync_lr_elrsr,
>  	.get_elrsr		= vgic_v3_get_elrsr,
>  	.get_eisr		= vgic_v3_get_eisr,
> @@ -170,6 +168,19 @@ static const struct vgic_ops vgic_v3_ops = {
>  	.enable			= vgic_v3_enable,
>  };
>  
> +static bool vgic_v3_init_emul_compat(struct kvm *kvm, int type)
> +{
> +	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
> +
> +	switch (type) {
> +	case KVM_DEV_TYPE_ARM_VGIC_V2:
> +		vm_ops->get_lr = vgic_v3_get_lr;
> +		vm_ops->set_lr = vgic_v3_set_lr;
> +		return true;
> +	}
> +	return false;
> +}
> +
>  static struct vgic_params vgic_v3_params;
>  
>  /**
> @@ -231,6 +242,7 @@ int vgic_v3_probe(struct device_node *vgic_node,
>  		goto out;
>  	}
>  
> +	vgic->init_emul = vgic_v3_init_emul_compat;
>  	vgic->vcpu_base = vcpu_res.start;
>  	vgic->vctrl_base = NULL;
>  	vgic->type = VGIC_V3;
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 2c16684..8c2e707 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -1278,13 +1278,13 @@ static void vgic_update_state(struct kvm *kvm)
>  
>  static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr)
>  {
> -	return vgic_ops->get_lr(vcpu, lr);
> +	return vgic_vm_op(vcpu->kvm, get_lr)(vcpu, lr);
>  }
>  
>  static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr,
>  			       struct vgic_lr vlr)
>  {
> -	vgic_ops->set_lr(vcpu, lr, vlr);
> +	return vgic_vm_op(vcpu->kvm, set_lr)(vcpu, lr, vlr);
>  }
>  
>  static void vgic_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
> @@ -2072,6 +2072,11 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
>  		}
>  	}
>  
> +	if (!vgic->init_emul(kvm, type)) {
> +		ret = -ENODEV;
> +		goto out_unlock;
> +	}
> +
>  	spin_lock_init(&kvm->arch.vgic.lock);
>  	kvm->arch.vgic.in_kernel = true;
>  	kvm->arch.vgic.vgic_model = type;
> -- 
> 1.7.9.5
> 

Thanks for splitting up the patches, that certainly makes them easier to review.

However, my question from the last round still stands.  What you're
doing here is setting a sh*tload of function pointers through an amazing
number of abstractions to avoid something like

void vgic_v2_set_lr(struct kvm_vgic *vgic)
{
	switch (vgic->type) {
	case KVM_DEV_TYPE_ARM_VGIC_V2:
		foo();
		break;
	case KVM_DEV_TYPE_ARM_VGIC_V3:
		bar();
		break;
	}
}

So I have to ask: What's the benefit? That you'll have fewer
conditionals?  But god have mercy on the poor people having to debug
some issue and figure out which function the code actually calls when it
(inside another complicated piece of logic) sets an LR.

This just feels like we're doing something incredibly wrong...

Thoughts?

-Christoffer

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v3 02/19] arm/arm64: KVM: pass down user space provided GIC type into vGIC code
  2014-11-03 13:25     ` Andre Przywara
@ 2014-11-03 16:51       ` Christoffer Dall
  0 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-03 16:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 03, 2014 at 01:25:11PM +0000, Andre Przywara wrote:
> Hi Christoffer,
> 
> On 03/11/14 13:14, Christoffer Dall wrote:
> > On Fri, Oct 31, 2014 at 05:26:37PM +0000, Andre Przywara wrote:
> >> With the introduction of a second emulated GIC model we need to let
> >> userspace specify the GIC model to use for each VM. Pass the
> >> userspace provided value down into the vGIC code and store it there
> >> to differentiate later.
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > 
> > Did you change anything since v2?
> 
> Yes (and that's why I dropped your ack):
> I moved the line that stores the vgic_model from the "introduce
> per-VM-ops" patch into here (plus the declaration of it):
> 
> 	kvm->arch.vgic.vgic_model = type;
> 
> That was part of the split-up to make that bigger patch more readable.
> 
ok, it fits perfectly here:

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>


* [PATCH v3 10/19] arm/arm64: KVM: make the value of ICC_SRE_EL1 a per-VM variable
  2014-10-31 17:26 ` [PATCH v3 10/19] arm/arm64: KVM: make the value of ICC_SRE_EL1 a per-VM variable Andre Przywara
@ 2014-11-03 20:04   ` Christoffer Dall
  2014-11-03 20:17     ` Marc Zyngier
  0 siblings, 1 reply; 76+ messages in thread
From: Christoffer Dall @ 2014-11-03 20:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:45PM +0000, Andre Przywara wrote:
> ICC_SRE_EL1 is a system register allowing msr/mrs accesses to the
> GIC CPU interface for EL1 (guests). Currently we force it to 0, but
> for proper GICv3 support we have to allow guests to use it (depending
> on their selected virtual GIC model).
> So add ICC_SRE_EL1 to the list of saved/restored registers on a
> world switch, but actually disallow a guest to change it by only
> restoring a fixed, once-initialized value.
> This value depends on the GIC model userland has chosen for a guest.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
> ---
>  arch/arm64/kernel/asm-offsets.c |    1 +
>  arch/arm64/kvm/vgic-v3-switch.S |   14 +++++++++-----
>  include/kvm/arm_vgic.h          |    1 +
>  virt/kvm/arm/vgic-v3.c          |    9 +++++++--
>  4 files changed, 18 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> index 9a9fce0..9d34486 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -140,6 +140,7 @@ int main(void)
>    DEFINE(VGIC_V2_CPU_ELRSR,	offsetof(struct vgic_cpu, vgic_v2.vgic_elrsr));
>    DEFINE(VGIC_V2_CPU_APR,	offsetof(struct vgic_cpu, vgic_v2.vgic_apr));
>    DEFINE(VGIC_V2_CPU_LR,	offsetof(struct vgic_cpu, vgic_v2.vgic_lr));
> +  DEFINE(VGIC_V3_CPU_SRE,	offsetof(struct vgic_cpu, vgic_v3.vgic_sre));
>    DEFINE(VGIC_V3_CPU_HCR,	offsetof(struct vgic_cpu, vgic_v3.vgic_hcr));
>    DEFINE(VGIC_V3_CPU_VMCR,	offsetof(struct vgic_cpu, vgic_v3.vgic_vmcr));
>    DEFINE(VGIC_V3_CPU_MISR,	offsetof(struct vgic_cpu, vgic_v3.vgic_misr));
> diff --git a/arch/arm64/kvm/vgic-v3-switch.S b/arch/arm64/kvm/vgic-v3-switch.S
> index d160469..617a012 100644
> --- a/arch/arm64/kvm/vgic-v3-switch.S
> +++ b/arch/arm64/kvm/vgic-v3-switch.S
> @@ -148,17 +148,18 @@
>   * x0: Register pointing to VCPU struct
>   */
>  .macro	restore_vgic_v3_state
> -	// Disable SRE_EL1 access. Necessary, otherwise
> -	// ICH_VMCR_EL2.VFIQEn becomes one, and FIQ happens...
> -	msr_s	ICC_SRE_EL1, xzr
> -	isb
> -

I know I reviewed this once, but now I'm forgetting how it works with
this comment above.  First, I don't fully understand the comment.
Second, now we're restoring a value that may potentially have SRE_EL1
access enabled, but FIQ doesn't happen.  Can you clarify this for me?

Thanks,
-Christoffer

>  	// Compute the address of struct vgic_cpu
>  	add	x3, x0, #VCPU_VGIC_CPU
>  
>  	// Restore all interesting registers
>  	ldr	w4, [x3, #VGIC_V3_CPU_HCR]
>  	ldr	w5, [x3, #VGIC_V3_CPU_VMCR]
> +	ldr	w25, [x3, #VGIC_V3_CPU_SRE]
> +
> +	msr_s	ICC_SRE_EL1, x25
> +
> +	// make sure SRE is valid before writing the other registers
> +	isb
>  
>  	msr_s	ICH_HCR_EL2, x4
>  	msr_s	ICH_VMCR_EL2, x5
> @@ -244,9 +245,12 @@
>  	dsb	sy
>  
>  	// Prevent the guest from touching the GIC system registers
> +	// if SRE isn't enabled for GICv3 emulation
> +	cbnz	x25, 1f
>  	mrs_s	x5, ICC_SRE_EL2
>  	and	x5, x5, #~ICC_SRE_EL2_ENABLE
>  	msr_s	ICC_SRE_EL2, x5
> +1:
>  .endm
>  
>  ENTRY(__save_vgic_v3_state)
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index a6d41f1..8827bc7 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -243,6 +243,7 @@ struct vgic_v3_cpu_if {
>  #ifdef CONFIG_ARM_GIC_V3
>  	u32		vgic_hcr;
>  	u32		vgic_vmcr;
> +	u32		vgic_sre;	/* Restored only, change ignored */
>  	u32		vgic_misr;	/* Saved only */
>  	u32		vgic_eisr;	/* Saved only */
>  	u32		vgic_elrsr;	/* Saved only */
> diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
> index fc4d628..ce50918 100644
> --- a/virt/kvm/arm/vgic-v3.c
> +++ b/virt/kvm/arm/vgic-v3.c
> @@ -145,15 +145,20 @@ static void vgic_v3_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcrp)
>  
>  static void vgic_v3_enable(struct kvm_vcpu *vcpu)
>  {
> +	struct vgic_v3_cpu_if *vgic_v3;
> +
> +	vgic_v3 = &vcpu->arch.vgic_cpu.vgic_v3;
>  	/*
>  	 * By forcing VMCR to zero, the GIC will restore the binary
>  	 * points to their reset values. Anything else resets to zero
>  	 * anyway.
>  	 */
> -	vcpu->arch.vgic_cpu.vgic_v3.vgic_vmcr = 0;
> +	vgic_v3->vgic_vmcr = 0;
> +
> +	vgic_v3->vgic_sre = 0;
>  
>  	/* Get the show on the road... */
> -	vcpu->arch.vgic_cpu.vgic_v3.vgic_hcr = ICH_HCR_EN;
> +	vgic_v3->vgic_hcr = ICH_HCR_EN;
>  }
>  
>  static const struct vgic_ops vgic_v3_ops = {
> -- 
> 1.7.9.5
> 


* [PATCH v3 07/19] arm/arm64: KVM: move kvm_register_device_ops() into vGIC probing
  2014-10-31 17:26 ` [PATCH v3 07/19] arm/arm64: KVM: move kvm_register_device_ops() into vGIC probing Andre Przywara
@ 2014-11-03 20:05   ` Christoffer Dall
  0 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-03 20:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:42PM +0000, Andre Przywara wrote:
> Currently we unconditionally register the GICv2 emulation device
> during the host's KVM initialization. Since with GICv3 support we
> may end up with only v2 or only v3 or both supported, we move the
> registration into the GIC probing function, where we will later know
> which combination is valid.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>


* [PATCH v3 08/19] arm/arm64: KVM: dont rely on a valid GICH base address
  2014-10-31 17:26 ` [PATCH v3 08/19] arm/arm64: KVM: dont rely on a valid GICH base address Andre Przywara
@ 2014-11-03 20:05   ` Christoffer Dall
  0 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-03 20:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:43PM +0000, Andre Przywara wrote:
> To check whether the vGIC was already initialized, we currently check
> the GICH base address for not being NULL. Since with GICv3 we may
> get along without this address, let's use the irqchip_in_kernel()
> function to detect an already initialized vGIC.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>


* [PATCH v3 09/19] arm/arm64: KVM: make the maximum number of vCPUs a per-VM value
  2014-10-31 17:26 ` [PATCH v3 09/19] arm/arm64: KVM: make the maximum number of vCPUs a per-VM value Andre Przywara
@ 2014-11-03 20:06   ` Christoffer Dall
  0 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-03 20:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:44PM +0000, Andre Przywara wrote:
> Currently the maximum number of vCPUs supported is a global value
> limited by the used GIC model. GICv3 will lift this limit, but we
> still need to observe it for guests using GICv2.
> So the maximum number of vCPUs is a per-VM value, depending on the
> GIC model the guest uses.
> Store and check the value in struct kvm_arch, but keep it down to
> 8 for now.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  arch/arm/include/asm/kvm_host.h   |    1 +
>  arch/arm/kvm/arm.c                |    6 ++++++
>  arch/arm64/include/asm/kvm_host.h |    3 +++
>  virt/kvm/arm/vgic-v2.c            |    7 +++++++
>  virt/kvm/arm/vgic-v3.c            |    8 ++++++++
>  5 files changed, 25 insertions(+)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index b443dfe..7969e6e 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -68,6 +68,7 @@ struct kvm_arch {
>  
>  	/* Interrupt controller */
>  	struct vgic_dist	vgic;
> +	int max_vcpus;
>  };
>  
>  #define KVM_NR_MEM_OBJS     40
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 60c7997..ac0aa7f 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -131,6 +131,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  
>  	/* Mark the initial VMID generation invalid */
>  	kvm->arch.vmid_gen = 0;
> +	kvm->arch.max_vcpus = CONFIG_KVM_ARM_MAX_VCPUS;
>  
>  	return ret;
>  out_free_stage2_pgd:
> @@ -213,6 +214,11 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id)
>  	int err;
>  	struct kvm_vcpu *vcpu;
>  
> +	if (id >= kvm->arch.max_vcpus) {
> +		err = -EINVAL;
> +		goto out;
> +	}
> +
>  	vcpu = kmem_cache_zalloc(kvm_vcpu_cache, GFP_KERNEL);
>  	if (!vcpu) {
>  		err = -ENOMEM;
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 286bb61..f9e130d 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -59,6 +59,9 @@ struct kvm_arch {
>  	/* VTTBR value associated with above pgd and vmid */
>  	u64    vttbr;
>  
> +	/* The maximum number of vCPUs depends on the used GIC model */
> +	int max_vcpus;
> +
>  	/* Interrupt controller */
>  	struct vgic_dist	vgic;
>  
> diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
> index 417ecaa..c92ac33 100644
> --- a/virt/kvm/arm/vgic-v2.c
> +++ b/virt/kvm/arm/vgic-v2.c
> @@ -159,11 +159,18 @@ static struct vgic_params vgic_v2_params;
>  static bool vgic_v2_init_emul(struct kvm *kvm, int type)
>  {
>  	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
> +	int nr_vcpus;
>  
>  	switch (type) {
>  	case KVM_DEV_TYPE_ARM_VGIC_V2:
> +		nr_vcpus = atomic_read(&kvm->online_vcpus);
> +		if (nr_vcpus > 8) {
> +			pr_warn_ratelimited("VGICv2 only supports up to 8 vCPUs\n");
> +			return false;
> +		}
>  		vm_ops->get_lr = vgic_v2_get_lr;
>  		vm_ops->set_lr = vgic_v2_set_lr;
> +		kvm->arch.max_vcpus = 8;
>  		return true;
>  	}
>  
> diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
> index 6825c71..fc4d628 100644
> --- a/virt/kvm/arm/vgic-v3.c
> +++ b/virt/kvm/arm/vgic-v3.c
> @@ -171,11 +171,19 @@ static const struct vgic_ops vgic_v3_ops = {
>  static bool vgic_v3_init_emul_compat(struct kvm *kvm, int type)
>  {
>  	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
> +	int nr_vcpus;
>  
>  	switch (type) {
>  	case KVM_DEV_TYPE_ARM_VGIC_V2:
> +		nr_vcpus = atomic_read(&kvm->online_vcpus);
> +		if (nr_vcpus > 8) {
> +			pr_warn_ratelimited("VGICv2 supports only up to 8 vCPUs\n");
> +			return false;
> +		}
> +
>  		vm_ops->get_lr = vgic_v3_get_lr;
>  		vm_ops->set_lr = vgic_v3_set_lr;
> +		kvm->arch.max_vcpus = 8;

This is exactly the same code as above, and it feels a bit weird to have
this code be dependent on which host gic you have, unless you plan on
supporting a frankenvgicv2 in the guest with more than 8 vcpus?

Don't we have some common place for gicv2 emulation where we can stick
this, set the max_vcpus (perhaps using a #define) and then compare the
nr_vcpus to the max_vcpus?
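
Something like the below is what I have in mind - just a sketch, the
helper name, its placement and the #define are made up:

#define VGIC_V2_MAX_CPUS	8

/* common for any guest using the GICv2 model, whatever the host GIC is */
static bool vgic_v2_common_emul_init(struct kvm *kvm)
{
	int nr_vcpus = atomic_read(&kvm->online_vcpus);

	if (nr_vcpus > VGIC_V2_MAX_CPUS) {
		pr_warn_ratelimited("VGICv2 only supports up to %d vCPUs\n",
				    VGIC_V2_MAX_CPUS);
		return false;
	}

	kvm->arch.max_vcpus = VGIC_V2_MAX_CPUS;
	return true;
}

Then the two init_emul implementations would only differ in which
get_lr/set_lr pointers they install.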

>  		return true;
>  	}
>  	return false;
> -- 
> 1.7.9.5
> 

Thanks,
-Christoffer


* [PATCH v3 10/19] arm/arm64: KVM: make the value of ICC_SRE_EL1 a per-VM variable
  2014-11-03 20:04   ` Christoffer Dall
@ 2014-11-03 20:17     ` Marc Zyngier
  2014-11-07 19:18       ` Christoffer Dall
  0 siblings, 1 reply; 76+ messages in thread
From: Marc Zyngier @ 2014-11-03 20:17 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Christoffer,

On 03/11/14 20:04, Christoffer Dall wrote:
> On Fri, Oct 31, 2014 at 05:26:45PM +0000, Andre Przywara wrote:
>> ICC_SRE_EL1 is a system register allowing msr/mrs accesses to the
>> GIC CPU interface for EL1 (guests). Currently we force it to 0, but
>> for proper GICv3 support we have to allow guests to use it (depending
>> on their selected virtual GIC model).
>> So add ICC_SRE_EL1 to the list of saved/restored registers on a
>> world switch, but actually disallow a guest to change it by only
>> restoring a fixed, once-initialized value.
>> This value depends on the GIC model userland has chosen for a guest.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
>> ---
>>  arch/arm64/kernel/asm-offsets.c |    1 +
>>  arch/arm64/kvm/vgic-v3-switch.S |   14 +++++++++-----
>>  include/kvm/arm_vgic.h          |    1 +
>>  virt/kvm/arm/vgic-v3.c          |    9 +++++++--
>>  4 files changed, 18 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
>> index 9a9fce0..9d34486 100644
>> --- a/arch/arm64/kernel/asm-offsets.c
>> +++ b/arch/arm64/kernel/asm-offsets.c
>> @@ -140,6 +140,7 @@ int main(void)
>>    DEFINE(VGIC_V2_CPU_ELRSR,	offsetof(struct vgic_cpu, vgic_v2.vgic_elrsr));
>>    DEFINE(VGIC_V2_CPU_APR,	offsetof(struct vgic_cpu, vgic_v2.vgic_apr));
>>    DEFINE(VGIC_V2_CPU_LR,	offsetof(struct vgic_cpu, vgic_v2.vgic_lr));
>> +  DEFINE(VGIC_V3_CPU_SRE,	offsetof(struct vgic_cpu, vgic_v3.vgic_sre));
>>    DEFINE(VGIC_V3_CPU_HCR,	offsetof(struct vgic_cpu, vgic_v3.vgic_hcr));
>>    DEFINE(VGIC_V3_CPU_VMCR,	offsetof(struct vgic_cpu, vgic_v3.vgic_vmcr));
>>    DEFINE(VGIC_V3_CPU_MISR,	offsetof(struct vgic_cpu, vgic_v3.vgic_misr));
>> diff --git a/arch/arm64/kvm/vgic-v3-switch.S b/arch/arm64/kvm/vgic-v3-switch.S
>> index d160469..617a012 100644
>> --- a/arch/arm64/kvm/vgic-v3-switch.S
>> +++ b/arch/arm64/kvm/vgic-v3-switch.S
>> @@ -148,17 +148,18 @@
>>   * x0: Register pointing to VCPU struct
>>   */
>>  .macro	restore_vgic_v3_state
>> -	// Disable SRE_EL1 access. Necessary, otherwise
>> -	// ICH_VMCR_EL2.VFIQEn becomes one, and FIQ happens...
>> -	msr_s	ICC_SRE_EL1, xzr
>> -	isb
>> -
> 
> I know I reviewed this once, but now I'm forgetting how it works with
> this comment above.  First, I don't fully understand the comment.

If you write to ICH_VMCR_EL2 with SRE==1, the architecture forces VFIQEn
to 1, which causes interesting effects when you inject a Group0
interrupt (as we do for GICv2 emulation).

You end up spending days debugging this, mostly blaming the model for
all these FIQs appearing in your guest, until you read that small gem
hidden in the architecture spec. Bad memories, let's not go there.

That's why we must make sure to set ICC_SRE_EL1 *before* writing to
ICH_VMCR_EL2.

> Second, now we're restoring a value that may potentially have SRE_EL1
> access enabled, but FIQ doesn't happen.  Can you clarify this for me?

That's a side effect of how we inject interrupts with GICv3. They are
Group1, always. A Group0 interrupt would definitely be delivered as a
FIQ, but we currently don't offer a way to support that.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...


* [PATCH v3 11/19] arm/arm64: KVM: refactor MMIO accessors
  2014-10-31 17:26 ` [PATCH v3 11/19] arm/arm64: KVM: refactor MMIO accessors Andre Przywara
@ 2014-11-04 11:55   ` Christoffer Dall
  2014-11-04 12:25     ` Andre Przywara
  0 siblings, 1 reply; 76+ messages in thread
From: Christoffer Dall @ 2014-11-04 11:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:46PM +0000, Andre Przywara wrote:
> The MMIO accessors for GICD_I[CS]ENABLER, GICD_I[CS]PENDR and
> GICD_ICFGR behave very similarly in GICv3, although the way the
> affected vCPU is determined differs.

They behave similarly to each other (the registers) or similarly to how
they are implemented for GICv2?

> Factor out a generic, backend-facing implementation and use small
> wrappers in the current GICv2 emulation to ease code sharing later.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

I can't really understand the motivation from your commit message, but
the code looks fine and I suppose I'll realize the motivation later:

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>


* [PATCH v3 04/19] arm/arm64: KVM: wrap 64 bit MMIO accesses with two 32 bit ones
  2014-11-03 13:25   ` Christoffer Dall
@ 2014-11-04 12:18     ` Andre Przywara
  2014-11-04 13:24       ` Christoffer Dall
  0 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-11-04 12:18 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Christoffer,

On 03/11/14 13:25, Christoffer Dall wrote:
> On Fri, Oct 31, 2014 at 05:26:39PM +0000, Andre Przywara wrote:
>> Some GICv3 registers can and will be accessed as 64 bit registers.
>> Currently the register handling code can only deal with 32 bit
>> accesses, so we do two consecutive calls to cover this.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  virt/kvm/arm/vgic.c |   48 +++++++++++++++++++++++++++++++++++++++++++++---
>>  1 file changed, 45 insertions(+), 3 deletions(-)
>>
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index 704be48..0cbdde9 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -1033,6 +1033,48 @@ static bool vgic_validate_access(const struct vgic_dist *dist,
>>  }
>>  
>>  /*
>> + * Call the respective handler function for the given range.
>> + * We split up any 64 bit accesses into two consecutive 32 bit
>> + * handler calls and merge the result afterwards.
>> + */
>> +static bool call_range_handler(struct kvm_vcpu *vcpu,
>> +			       struct kvm_exit_mmio *mmio,
>> +			       unsigned long offset,
>> +			       const struct mmio_range *range)
>> +{
>> +	u32 *data32 = (void *)mmio->data;
>> +	struct kvm_exit_mmio mmio32;
>> +	bool ret;
>> +
>> +	if (likely(mmio->len <= 4))
>> +		return range->handle_mmio(vcpu, mmio, offset);
>> +
>> +	/*
>> +	 * Any access bigger than 4 bytes (that we currently handle in KVM)
>> +	 * is actually 8 bytes long, caused by a 64-bit access
>> +	 */
>> +
>> +	mmio32.len = 4;
>> +	mmio32.is_write = mmio->is_write;
>> +
>> +	mmio32.phys_addr = mmio->phys_addr + 4;
>> +	if (mmio->is_write)
>> +		*(u32 *)mmio32.data = data32[1];
>> +	ret = range->handle_mmio(vcpu, &mmio32, offset + 4);
>> +	if (!mmio->is_write)
>> +		data32[1] = *(u32 *)mmio32.data;
>> +
>> +	mmio32.phys_addr = mmio->phys_addr;
>> +	if (mmio->is_write)
>> +		*(u32 *)mmio32.data = data32[0];
>> +	ret |= range->handle_mmio(vcpu, &mmio32, offset);
>> +	if (!mmio->is_write)
>> +		data32[0] = *(u32 *)mmio32.data;
>> +
>> +	return ret;
>> +}
> 
> Please think about the endianness issues here.

I didn't only think about it, I traced the code and tested it:
So it works as written above (I actually had a hiccup in my kvmtool
setup that denied booting the bigendian initrds, so I thought that BE
was broken).

So the GIC is always LE, that's why we swap the bytes to LE in any
32-bit register in mmio_data_{write,read}, which gets called for each
vGIC register access via the vgic_reg_access() function.

So the memory order that the actual register handler functions
implicitly expect is always LE, regardless of the guest or host
endianness. vgic_reg_access() makes this transparent for the host code.

Now if we eventually assemble the 64-bit value from the two 32-bit
values, we also have to always do this in LE fashion. Hence the
hardcoded LE assignment here. Eventually this LE value will be copied
into the guest, which will access it through readq, which uses
le64_to_cpu() to convert it to the CPU native value.

So the branch as posted (or present in the repo) works fine (boot-tested
only so far) with all 8 combinations of (host endianness, guest
endianness, guest v2/v3 GIC).

I will add a comment to the function explaining this.
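
Roughly, what the handlers end up seeing is this (just a sketch of the
data layout, not the actual patch code):

	/*
	 * mmio->data always holds the register value in LE byte order,
	 * so the 64-bit value is simply the LE concatenation of the two
	 * 32-bit halves, independent of host and guest endianness:
	 */
	u32 lo = le32_to_cpu(((__le32 *)mmio->data)[0]);
	u32 hi = le32_to_cpu(((__le32 *)mmio->data)[1]);
	u64 reg = ((u64)hi << 32) | lo;

	/* ... the same value the guest's readq()/le64_to_cpu() will see */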

Regards,
Andre.


* [PATCH v3 11/19] arm/arm64: KVM: refactor MMIO accessors
  2014-11-04 11:55   ` Christoffer Dall
@ 2014-11-04 12:25     ` Andre Przywara
  0 siblings, 0 replies; 76+ messages in thread
From: Andre Przywara @ 2014-11-04 12:25 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 04/11/14 11:55, Christoffer Dall wrote:
> On Fri, Oct 31, 2014 at 05:26:46PM +0000, Andre Przywara wrote:
>> The MMIO accessors for GICD_I[CS]ENABLER, GICD_I[CS]PENDR and
>> GICD_ICFGR behave very similarly in GICv3, although the way the
>> affected vCPU is determined differs.
> 
> They behave similarly to each other (the registers) or similarly to how
> they are implemented for GICv2?

Similarly to GICv2. Actually we have _three_ places where we need to
handle them: for the GICv2 distributor, for the GICv3 distributor
(handling only SPIs) and for the GICv3 redistributor (caring about PPIs
and SGIs). So I didn't want to have very similar code in three places,
thus the refactoring.
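
As a rough sketch of what the split looks like (simplified, the
clear-enable/retire handling trimmed, signatures from memory):

/* shared, backend-facing part: the caller has already decoded the vCPU */
static bool vgic_handle_enable_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
				   phys_addr_t offset, int vcpu_id, int access)
{
	u32 *reg = vgic_bitmap_get_reg(&kvm->arch.vgic.irq_enabled,
				       vcpu_id, offset);

	vgic_reg_access(mmio, reg, offset, ACCESS_READ_VALUE | access);
	if (mmio->is_write) {
		vgic_update_state(kvm);
		return true;
	}
	return false;
}

/* thin GICv2 wrapper: the affected vCPU is the one doing the access */
static bool handle_mmio_set_enable_reg(struct kvm_vcpu *vcpu,
				       struct kvm_exit_mmio *mmio,
				       phys_addr_t offset)
{
	return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
				      vcpu->vcpu_id, ACCESS_WRITE_SETBIT);
}

The GICv3 distributor and redistributor wrappers then only differ in how
they come up with the vcpu_id.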

>> Factor out a generic, backend-facing implementation and use small
>> wrappers in the current GICv2 emulation to ease code sharing later.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> 
> I can't really understand the motivation from your commit message, but
> the code looks fine and I suppose I'll realize the motivation later:

Hopefully the actual GICv3 emulation code will provide that insight. I
can add the above explanation to the commit message to make this more
obvious.

> 
> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

Thanks!
Andre


* [PATCH v3 04/19] arm/arm64: KVM: wrap 64 bit MMIO accesses with two 32 bit ones
  2014-11-04 12:18     ` Andre Przywara
@ 2014-11-04 13:24       ` Christoffer Dall
  0 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-04 13:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 04, 2014 at 12:18:16PM +0000, Andre Przywara wrote:
> Hi Christoffer,
> 
> On 03/11/14 13:25, Christoffer Dall wrote:
> > On Fri, Oct 31, 2014 at 05:26:39PM +0000, Andre Przywara wrote:
> >> Some GICv3 registers can and will be accessed as 64 bit registers.
> >> Currently the register handling code can only deal with 32 bit
> >> accesses, so we do two consecutive calls to cover this.
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >> ---
> >>  virt/kvm/arm/vgic.c |   48 +++++++++++++++++++++++++++++++++++++++++++++---
> >>  1 file changed, 45 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> >> index 704be48..0cbdde9 100644
> >> --- a/virt/kvm/arm/vgic.c
> >> +++ b/virt/kvm/arm/vgic.c
> >> @@ -1033,6 +1033,48 @@ static bool vgic_validate_access(const struct vgic_dist *dist,
> >>  }
> >>  
> >>  /*
> >> + * Call the respective handler function for the given range.
> >> + * We split up any 64 bit accesses into two consecutive 32 bit
> >> + * handler calls and merge the result afterwards.
> >> + */
> >> +static bool call_range_handler(struct kvm_vcpu *vcpu,
> >> +			       struct kvm_exit_mmio *mmio,
> >> +			       unsigned long offset,
> >> +			       const struct mmio_range *range)
> >> +{
> >> +	u32 *data32 = (void *)mmio->data;
> >> +	struct kvm_exit_mmio mmio32;
> >> +	bool ret;
> >> +
> >> +	if (likely(mmio->len <= 4))
> >> +		return range->handle_mmio(vcpu, mmio, offset);
> >> +
> >> +	/*
> >> +	 * Any access bigger than 4 bytes (that we currently handle in KVM)
> >> +	 * is actually 8 bytes long, caused by a 64-bit access
> >> +	 */
> >> +
> >> +	mmio32.len = 4;
> >> +	mmio32.is_write = mmio->is_write;
> >> +
> >> +	mmio32.phys_addr = mmio->phys_addr + 4;
> >> +	if (mmio->is_write)
> >> +		*(u32 *)mmio32.data = data32[1];
> >> +	ret = range->handle_mmio(vcpu, &mmio32, offset + 4);
> >> +	if (!mmio->is_write)
> >> +		data32[1] = *(u32 *)mmio32.data;
> >> +
> >> +	mmio32.phys_addr = mmio->phys_addr;
> >> +	if (mmio->is_write)
> >> +		*(u32 *)mmio32.data = data32[0];
> >> +	ret |= range->handle_mmio(vcpu, &mmio32, offset);
> >> +	if (!mmio->is_write)
> >> +		data32[0] = *(u32 *)mmio32.data;
> >> +
> >> +	return ret;
> >> +}
> > 
> > Please think about the endianness issues here.
> 
> I didn't only think about it, I traced the code and tested it:
> So it works as written above (I actually had a hiccup in my kvmtool
> setup that denied booting the bigendian initrds, so I thought that BE
> was broken).
> 
> So the GIC is always LE, that's why we swap the bytes to LE in any
> 32-bit register in mmio_data_{write,read}, which gets called for each
> vGIC register access via the vgic_reg_access() function.
> 
> So the memory order that the actual register handler functions
> implicitly expect is always LE, regardless of the guest or host
> endianness. vgic_reg_access() makes this transparent for the host code.
> 
> Now if we eventually assemble the 64-bit value from the two 32-bit
> values, we also have to always do this in LE fashion. Hence the
> hardcoded LE assignment here. Eventually this LE value will be copied
> into the guest, which will access it through readq, which uses
> le64_to_cpu() to convert it to the CPU native value.
> 
> So the branch as posted (or present in the repo) works fine (boot-tested
> only so far) with all 8 combinations of (host endianness, guest
> endianness, guest v2/v3 GIC).
> 
> I will add a comment to the function explaining this.
> 
Yes, you're right.  Thanks for the explanation.  I think the key to
understanding that this works is the fact that mmio_data is always
written in LE in memory.

I was thrown off by the conversion you were making to a u32*, which you
don't really use, except for index manipulation and to copy the data, but
that's fine.

Thanks for explaining this.

-Christoffer


* [PATCH v3 15/19] arm/arm64: KVM: add opaque private pointer to MMIO accessors
  2014-10-31 17:26 ` [PATCH v3 15/19] arm/arm64: KVM: add opaque private pointer to MMIO accessors Andre Przywara
@ 2014-11-04 15:44   ` Christoffer Dall
  2014-11-04 17:24     ` Andre Przywara
  0 siblings, 1 reply; 76+ messages in thread
From: Christoffer Dall @ 2014-11-04 15:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:50PM +0000, Andre Przywara wrote:
> For a GICv2 there is always only one (v)CPU involved: the one that
> does the access. On a GICv3 the access to a CPU redistributor is
> memory-mapped, but not banked, so the (v)CPU affected is determined by
> looking at the MMIO address region being accessed.
> To allow passing the affected CPU into the accessors, extend them to
> take an opaque private pointer parameter.
> For the current GICv2 emulation we ignore it and simply pass NULL
> on the call.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

Why does it have to be an opaque private pointer?  Would it not always
be a struct vcpu * or a vcpu_id then?

Thanks,
-Christoffer


* [PATCH v3 05/19] arm/arm64: KVM: introduce per-VM ops
  2014-11-03 13:59   ` Christoffer Dall
@ 2014-11-04 15:58     ` Andre Przywara
  2014-11-04 19:03       ` Christoffer Dall
  0 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-11-04 15:58 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Christoffer,

as hinted on IRC, an earlier reply on this one got lost on my machine,
so please excuse my apparent ignorance on your previous comments.

On 03/11/14 13:59, Christoffer Dall wrote:
> On Fri, Oct 31, 2014 at 05:26:40PM +0000, Andre Przywara wrote:
>> Currently we only have one virtual GIC model supported, so all guests
>> use the same emulation code. With the addition of another model we
>> end up with different guests using potentially different vGIC models,
>> so we have to split up some functions to be per VM.
>> Introduce a vgic_vm_ops struct to hold function pointers for those
>> functions that are different and provide the necessary code to
>> initialize them.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  include/kvm/arm_vgic.h |   10 ++++++
>>  virt/kvm/arm/vgic.c    |   81 +++++++++++++++++++++++++++++++++++-------------
>>  2 files changed, 69 insertions(+), 22 deletions(-)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index dde5a00..bfb660a 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -134,6 +134,14 @@ struct vgic_params {
>>  	void __iomem	*vctrl_base;
>>  };
>>  
>> +struct vgic_vm_ops {
>> +	bool	(*handle_mmio)(struct kvm_vcpu *, struct kvm_run *,
>> +			       struct kvm_exit_mmio *);
>> +	bool	(*queue_sgi)(struct kvm_vcpu *vcpu, int irq);
>> +	void	(*add_sgi_source)(struct kvm_vcpu *vcpu, int irq, int source);
>> +	int	(*vgic_init)(struct kvm *kvm, const struct vgic_params *params);
>> +};
>> +
>>  struct vgic_dist {
>>  #ifdef CONFIG_KVM_ARM_VGIC
>>  	spinlock_t		lock;
>> @@ -215,6 +223,8 @@ struct vgic_dist {
>>  
>>  	/* Bitmap indicating which CPU has something pending */
>>  	unsigned long		*irq_pending_on_cpu;
>> +
>> +	struct vgic_vm_ops	vm_ops;
>>  #endif
>>  };
>>  
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index 0cbdde9..2c16684 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -105,6 +105,8 @@ static void vgic_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr);
>>  static const struct vgic_ops *vgic_ops;
>>  static const struct vgic_params *vgic;
>>  
>> +#define vgic_vm_op(kvm, fn) ((kvm)->arch.vgic.vm_ops.fn)
>> +
> 
> another one?  why did you simply ignore my comment from the last review?
> 
> If it wasn't obvious last time around, YUCK, and no ;)

OK, which version would you like?

1) Actually my first solution was to decode it on each call-site directly:

vcpu->kvm->arch.vgic.vm_ops.add_sgi_source(vcpu, lr.irq, lr.source);

However one reviewer suggested to wrap it with the macro you see above.

2) Provide a static inline for each seems like overkill, since there is
only one caller for each of them, but it would look like this:

static void add_sgi_source(struct kvm_vcpu *vcpu, int irq, int source)
{
        vcpu->kvm->arch.vgic.vm_ops.add_sgi_source(vcpu, irq, source);
}

Both don't look very convincing to me, so if you see other
colors^Wsolutions, please let me know ;-)

We have to choose between them at _runtime_, because there could be two
guests with different vGIC models running at the same time.

>>  /*
>>   * struct vgic_bitmap contains a bitmap made of unsigned longs, but
>>   * extracts u32s out of them.
>> @@ -761,6 +763,13 @@ static bool handle_mmio_sgi_reg(struct kvm_vcpu *vcpu,
>>  	return false;
>>  }
>>  
>> +static void vgic_v2_add_sgi_source(struct kvm_vcpu *vcpu, int irq, int source)
>> +{
>> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>> +
>> +	*vgic_get_sgi_sources(dist, vcpu->vcpu_id, irq) |= 1 << source;
>> +}
>> +
>>  /**
>>   * vgic_unqueue_irqs - move pending IRQs from LRs to the distributor
>>   * @vgic_cpu: Pointer to the vgic_cpu struct holding the LRs
>> @@ -775,9 +784,7 @@ static bool handle_mmio_sgi_reg(struct kvm_vcpu *vcpu,
>>   */
>>  static void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
>>  {
>> -	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> -	int vcpu_id = vcpu->vcpu_id;
>>  	int i;
>>  
>>  	for_each_set_bit(i, vgic_cpu->lr_used, vgic_cpu->nr_lr) {
>> @@ -804,7 +811,8 @@ static void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
>>  		 */
>>  		vgic_dist_irq_set_pending(vcpu, lr.irq);
>>  		if (lr.irq < VGIC_NR_SGIS)
>> -			*vgic_get_sgi_sources(dist, vcpu_id, lr.irq) |= 1 << lr.source;
>> +			vgic_vm_op(vcpu->kvm, add_sgi_source)(vcpu, lr.irq,
>> +							      lr.source);
>>  		lr.state &= ~LR_STATE_PENDING;
>>  		vgic_set_lr(vcpu, i, lr);
>>  
>> @@ -1162,7 +1170,7 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>>  	if (!irqchip_in_kernel(vcpu->kvm))
>>  		return false;
>>  
>> -	return vgic_v2_handle_mmio(vcpu, run, mmio);
>> +	return vgic_vm_op(vcpu->kvm, handle_mmio)(vcpu, run, mmio);
>>  }
>>  
>>  static u8 *vgic_get_sgi_sources(struct vgic_dist *dist, int vcpu_id, int sgi)
>> @@ -1414,7 +1422,7 @@ static bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
>>  	return true;
>>  }
>>  
>> -static bool vgic_queue_sgi(struct kvm_vcpu *vcpu, int irq)
>> +static bool vgic_v2_queue_sgi(struct kvm_vcpu *vcpu, int irq)
>>  {
>>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>  	unsigned long sources;
>> @@ -1489,7 +1497,7 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
>>  
>>  	/* SGIs */
>>  	for_each_set_bit(i, vgic_cpu->pending_percpu, VGIC_NR_SGIS) {
>> -		if (!vgic_queue_sgi(vcpu, i))
>> +		if (!vgic_vm_op(vcpu->kvm, queue_sgi)(vcpu, i))
>>  			overflow = 1;
>>  	}
>>  
>> @@ -1944,9 +1952,6 @@ static int vgic_init_maps(struct kvm *kvm)
>>  		}
>>  	}
>>  
>> -	for (i = VGIC_NR_PRIVATE_IRQS; i < dist->nr_irqs; i += 4)
>> -		vgic_set_target_reg(kvm, 0, i);
>> -
> 
> Remind me, why are we moving this chunk?

The target registers are only valid for vGICv2 (we have other means for
GICv3), so this belongs now into the vGICv2 specific code.

Cheers,
Andre.

>>  out:
>>  	if (ret)
>>  		kvm_vgic_destroy(kvm);
>> @@ -1954,6 +1959,31 @@ out:
>>  	return ret;
>>  }
>>  
>> +static int vgic_v2_init(struct kvm *kvm, const struct vgic_params *params)
>> +{
>> +	struct vgic_dist *dist = &kvm->arch.vgic;
>> +	int ret, i;
>> +
>> +	if (IS_VGIC_ADDR_UNDEF(dist->vgic_dist_base) ||
>> +	    IS_VGIC_ADDR_UNDEF(dist->vgic_cpu_base)) {
>> +		kvm_err("Need to set vgic distributor addresses first\n");
>> +		return -ENXIO;
>> +	}
>> +
>> +	ret = kvm_phys_addr_ioremap(kvm, dist->vgic_cpu_base,
>> +				    params->vcpu_base,
>> +				    KVM_VGIC_V2_CPU_SIZE, true);
>> +	if (ret) {
>> +		kvm_err("Unable to remap VGIC CPU to VCPU\n");
>> +		return ret;
>> +	}
>> +
>> +	for (i = VGIC_NR_PRIVATE_IRQS; i < dist->nr_irqs; i += 4)
>> +		vgic_set_target_reg(kvm, 0, i);
>> +
>> +	return 0;
>> +}
>> +
>>  /**
>>   * kvm_vgic_init - Initialize global VGIC state before running any VCPUs
>>   * @kvm: pointer to the kvm struct
>> @@ -1976,26 +2006,15 @@ int kvm_vgic_init(struct kvm *kvm)
>>  	if (vgic_initialized(kvm))
>>  		goto out;
>>  
>> -	if (IS_VGIC_ADDR_UNDEF(kvm->arch.vgic.vgic_dist_base) ||
>> -	    IS_VGIC_ADDR_UNDEF(kvm->arch.vgic.vgic_cpu_base)) {
>> -		kvm_err("Need to set vgic cpu and dist addresses first\n");
>> -		ret = -ENXIO;
>> -		goto out;
>> -	}
>> -
>>  	ret = vgic_init_maps(kvm);
>>  	if (ret) {
>>  		kvm_err("Unable to allocate maps\n");
>>  		goto out;
>>  	}
>>  
>> -	ret = kvm_phys_addr_ioremap(kvm, kvm->arch.vgic.vgic_cpu_base,
>> -				    vgic->vcpu_base, KVM_VGIC_V2_CPU_SIZE,
>> -				    true);
>> -	if (ret) {
>> -		kvm_err("Unable to remap VGIC CPU to VCPU\n");
>> +	ret = vgic_vm_op(kvm, vgic_init)(kvm, vgic);
>> +	if (ret)
>>  		goto out;
>> -	}
>>  
>>  	kvm_for_each_vcpu(i, vcpu, kvm)
>>  		kvm_vgic_vcpu_init(vcpu);
>> @@ -2008,6 +2027,21 @@ out:
>>  	return ret;
>>  }
>>  
>> +static bool init_emulation_ops(struct kvm *kvm, int type)
>> +{
>> +	struct vgic_dist *dist = &kvm->arch.vgic;
>> +
>> +	switch (type) {
>> +	case KVM_DEV_TYPE_ARM_VGIC_V2:
>> +		dist->vm_ops.handle_mmio = vgic_v2_handle_mmio;
>> +		dist->vm_ops.queue_sgi = vgic_v2_queue_sgi;
>> +		dist->vm_ops.add_sgi_source = vgic_v2_add_sgi_source;
>> +		dist->vm_ops.vgic_init = vgic_v2_init;
>> +		return true;
>> +	}
>> +	return false;
>> +}
>> +
>>  int kvm_vgic_create(struct kvm *kvm, u32 type)
>>  {
>>  	int i, vcpu_lock_idx = -1, ret = 0;
>> @@ -2045,6 +2079,9 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
>>  	kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
>>  	kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
>>  
>> +	if (!init_emulation_ops(kvm, type))
>> +		ret = -ENODEV;
>> +
>>  out_unlock:
>>  	for (; vcpu_lock_idx >= 0; vcpu_lock_idx--) {
>>  		vcpu = kvm_get_vcpu(kvm, vcpu_lock_idx);
>> -- 
>> 1.7.9.5
>>
> 
> -Christoffer
> 


* [PATCH v3 06/19] arm/arm64: KVM: move [sg]et_lr into per-VM ops
  2014-11-03 14:15   ` Christoffer Dall
@ 2014-11-04 16:30     ` Andre Przywara
  2014-11-04 19:12       ` Christoffer Dall
  0 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-11-04 16:30 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Christoffer,

On 03/11/14 14:15, Christoffer Dall wrote:
> On Fri, Oct 31, 2014 at 05:26:41PM +0000, Andre Przywara wrote:
>> The function to set the VGIC's list registers are not only dependent
>> on the host GIC model, but need to behave slightly different for
>> the type of emulated guest GIC.
>> So move the functions into the new struct vgic_vm_ops and initialize
>> them properly to prepare for guest GICv3 support later.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  include/kvm/arm_vgic.h |    5 +++--
>>  virt/kvm/arm/vgic-v2.c |   17 +++++++++++++++--
>>  virt/kvm/arm/vgic-v3.c |   16 ++++++++++++++--
>>  virt/kvm/arm/vgic.c    |    9 +++++++--
>>  4 files changed, 39 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index bfb660a..a6d41f1 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -108,8 +108,6 @@ struct vgic_vmcr {
>>  };
>>  
>>  struct vgic_ops {
>> -	struct vgic_lr	(*get_lr)(const struct kvm_vcpu *, int);
>> -	void	(*set_lr)(struct kvm_vcpu *, int, struct vgic_lr);
>>  	void	(*sync_lr_elrsr)(struct kvm_vcpu *, int, struct vgic_lr);
>>  	u64	(*get_elrsr)(const struct kvm_vcpu *vcpu);
>>  	u64	(*get_eisr)(const struct kvm_vcpu *vcpu);
>> @@ -132,9 +130,12 @@ struct vgic_params {
>>  	unsigned int	maint_irq;
>>  	/* Virtual control interface base address */
>>  	void __iomem	*vctrl_base;
>> +	bool (*init_emul)(struct kvm *kvm, int type);
>>  };
>>  
>>  struct vgic_vm_ops {
>> +	struct vgic_lr	(*get_lr)(const struct kvm_vcpu *, int);
>> +	void	(*set_lr)(struct kvm_vcpu *, int, struct vgic_lr);
>>  	bool	(*handle_mmio)(struct kvm_vcpu *, struct kvm_run *,
>>  			       struct kvm_exit_mmio *);
>>  	bool	(*queue_sgi)(struct kvm_vcpu *vcpu, int irq);
> 
> 
> this has now become incredibly confusing, what are your thoughts on
> renaming vgic_ops to kvm_gic_ops to make it clear that this structure is
> about hardware-managing ops and vgic_vm_ops is about the vgic, the
> virtual instance?

Mmh, makes some sense, but I am a bit reluctant to rename existing
identifiers. I get about 20 hits for vgic_ops, so if you will ack this
rename, I can go ahead with it.

> 
>> diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
>> index 2935405..bdc8d97 100644
>> --- a/virt/kvm/arm/vgic-v2.c
>> +++ b/virt/kvm/arm/vgic-v2.c
>> @@ -143,8 +143,6 @@ static void vgic_v2_enable(struct kvm_vcpu *vcpu)
>>  }
>>  
>>  static const struct vgic_ops vgic_v2_ops = {
>> -	.get_lr			= vgic_v2_get_lr,
>> -	.set_lr			= vgic_v2_set_lr,
>>  	.sync_lr_elrsr		= vgic_v2_sync_lr_elrsr,
>>  	.get_elrsr		= vgic_v2_get_elrsr,
>>  	.get_eisr		= vgic_v2_get_eisr,
>> @@ -158,6 +156,20 @@ static const struct vgic_ops vgic_v2_ops = {
>>  
>>  static struct vgic_params vgic_v2_params;
>>  
>> +static bool vgic_v2_init_emul(struct kvm *kvm, int type)
>> +{
>> +	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
>> +
>> +	switch (type) {
>> +	case KVM_DEV_TYPE_ARM_VGIC_V2:
>> +		vm_ops->get_lr = vgic_v2_get_lr;
>> +		vm_ops->set_lr = vgic_v2_set_lr;
>> +		return true;
>> +	}
>> +
>> +	return false;
>> +}
>> +
>>  /**
>>   * vgic_v2_probe - probe for a GICv2 compatible interrupt controller in DT
>>   * @node:	pointer to the DT node
>> @@ -196,6 +208,7 @@ int vgic_v2_probe(struct device_node *vgic_node,
>>  		ret = -ENOMEM;
>>  		goto out;
>>  	}
>> +	vgic->init_emul = vgic_v2_init_emul;
>>  
>>  	vgic->nr_lr = readl_relaxed(vgic->vctrl_base + GICH_VTR);
>>  	vgic->nr_lr = (vgic->nr_lr & 0x3f) + 1;
>> diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
>> index 1c2c8ee..a38339e 100644
>> --- a/virt/kvm/arm/vgic-v3.c
>> +++ b/virt/kvm/arm/vgic-v3.c
>> @@ -157,8 +157,6 @@ static void vgic_v3_enable(struct kvm_vcpu *vcpu)
>>  }
>>  
>>  static const struct vgic_ops vgic_v3_ops = {
>> -	.get_lr			= vgic_v3_get_lr,
>> -	.set_lr			= vgic_v3_set_lr,
>>  	.sync_lr_elrsr		= vgic_v3_sync_lr_elrsr,
>>  	.get_elrsr		= vgic_v3_get_elrsr,
>>  	.get_eisr		= vgic_v3_get_eisr,
>> @@ -170,6 +168,19 @@ static const struct vgic_ops vgic_v3_ops = {
>>  	.enable			= vgic_v3_enable,
>>  };
>>  
>> +static bool vgic_v3_init_emul_compat(struct kvm *kvm, int type)
>> +{
>> +	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
>> +
>> +	switch (type) {
>> +	case KVM_DEV_TYPE_ARM_VGIC_V2:
>> +		vm_ops->get_lr = vgic_v3_get_lr;
>> +		vm_ops->set_lr = vgic_v3_set_lr;
>> +		return true;
>> +	}
>> +	return false;
>> +}
>> +
>>  static struct vgic_params vgic_v3_params;
>>  
>>  /**
>> @@ -231,6 +242,7 @@ int vgic_v3_probe(struct device_node *vgic_node,
>>  		goto out;
>>  	}
>>  
>> +	vgic->init_emul = vgic_v3_init_emul_compat;
>>  	vgic->vcpu_base = vcpu_res.start;
>>  	vgic->vctrl_base = NULL;
>>  	vgic->type = VGIC_V3;
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index 2c16684..8c2e707 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -1278,13 +1278,13 @@ static void vgic_update_state(struct kvm *kvm)
>>  
>>  static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr)
>>  {
>> -	return vgic_ops->get_lr(vcpu, lr);
>> +	return vgic_vm_op(vcpu->kvm, get_lr)(vcpu, lr);
>>  }
>>  
>>  static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr,
>>  			       struct vgic_lr vlr)
>>  {
>> -	vgic_ops->set_lr(vcpu, lr, vlr);
>> +	return vgic_vm_op(vcpu->kvm, set_lr)(vcpu, lr, vlr);
>>  }
>>  
>>  static void vgic_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
>> @@ -2072,6 +2072,11 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
>>  		}
>>  	}
>>  
>> +	if (!vgic->init_emul(kvm, type)) {
>> +		ret = -ENODEV;
>> +		goto out_unlock;
>> +	}
>> +
>>  	spin_lock_init(&kvm->arch.vgic.lock);
>>  	kvm->arch.vgic.in_kernel = true;
>>  	kvm->arch.vgic.vgic_model = type;
>> -- 
>> 1.7.9.5
>>
> 
> Thanks for splitting up the patches, it's certainly better to review.
> 
> However, my question from the last round still stands.  What you're
> doing here is setting a sh*tload of function pointers through an amazing
> amount of abstractions to avoid something like
> 
> void vgic_v2_set_lr(struct kvm_vgic *vgic)
> {
> 	switch (vgic->type) {
> 	case KVM_DEV_TYPE_ARM_VGIC_V2:
> 		foo();
> 		break;
> 	case KVM_DEV_TYPE_ARM_VGIC_V3:
> 		bar();
> 		break;
> 	}
> }
> 
> So I have to ask: What's the benefit? That you'll have fewer
> conditionals?  But god have mercy on the poor people having to debug
> some issue and figure out which function the code actually calls when it
> (inside another complicated piece of logic) sets a LR.

So the big aim here is to separate the stages cleanly. We have the
backend (vgic-v2.c and vgic-v3.c), which cares about the host hardware
specific functions. Then we have the frontend (vgic-v2-emul.c and
vgic-v3-emul.c), which cares about the guest-facing emulation part.
vgic.c is now just the "middle end", connecting one of the front-ends
with one of the back-ends - depending on both the host's hardware
(back-end) and the user's choices at VM creation time (front-end).
Ideally any new hardware or emulation model would just require an extra
file with little or no changes to vgic.c.
Not sure whether we will see many more instances of either the front- or
back-end, but I like the possibility to add one later - this may be
useful already for the ITS emulation.

So my goal was to avoid any emulation- or host-specific calls in vgic.c
and just rely on proper initialization in an init() function.
Your example above, in contrast, would require exporting the respective
functions, putting their prototypes in the header file and enumerating
all combinations in vgic.c, which I don't like very much.
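
Just to make that concrete, the alternative would be something like this
in vgic.c for every per-VM function (sketch only; vgic_v3_queue_sgi does
not even exist at this point in the series):

static bool vgic_queue_sgi(struct kvm_vcpu *vcpu, int irq)
{
	switch (vcpu->kvm->arch.vgic.vgic_model) {
	case KVM_DEV_TYPE_ARM_VGIC_V2:
		return vgic_v2_queue_sgi(vcpu, irq);
	case KVM_DEV_TYPE_ARM_VGIC_V3:
		return vgic_v3_queue_sgi(vcpu, irq);
	default:
		return false;
	}
}

...with all the frontends' prototypes exported in a shared header.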

I can give it a try anyway to see how it looks and what it gives us.

> This just feels like we're doing something incredibly wrong...

TBH I have this feeling sometimes when reading the VGIC code (especially
the endianness code), which is probably just due to the nature of the
GIC: it is a beast in itself, and emulating it is not a walk in the park.
That applies even more to the GICv3, and to emulating a GICv2 on top of a
hardware GICv3, for instance.

Regards,
Andre.


* [PATCH v3 15/19] arm/arm64: KVM: add opaque private pointer to MMIO accessors
  2014-11-04 15:44   ` Christoffer Dall
@ 2014-11-04 17:24     ` Andre Przywara
  2014-11-04 18:05       ` Marc Zyngier
  0 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-11-04 17:24 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 04/11/14 15:44, Christoffer Dall wrote:
> On Fri, Oct 31, 2014 at 05:26:50PM +0000, Andre Przywara wrote:
>> For a GICv2 there is always only one (v)CPU involved: the one that
>> does the access. On a GICv3 the access to a CPU redistributor is
>> memory-mapped, but not banked, so the (v)CPU affected is determined by
>> looking at the MMIO address region being accessed.
>> To allow passing the affected CPU into the accessors, extend them to
>> take an opaque private pointer parameter.
>> For the current GICv2 emulation we ignore it and simply pass NULL
>> on the call.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> 
> Why does it have to be an opaque private pointer?  Would it not always
> be a struct vcpu * or a vcpu_id then?

IIRC Marc suggested this once, to be more future-proof. Also a pointer makes
it easier to pass NULL in the GICv2 parts of the code, which makes it
more obvious that this value is not used in this case.

Marc, did I miss some more rationale?
Does that still hold?

Cheers,
Andre.


* [PATCH v3 15/19] arm/arm64: KVM: add opaque private pointer to MMIO accessors
  2014-11-04 17:24     ` Andre Przywara
@ 2014-11-04 18:05       ` Marc Zyngier
  2014-11-04 19:18         ` Christoffer Dall
  0 siblings, 1 reply; 76+ messages in thread
From: Marc Zyngier @ 2014-11-04 18:05 UTC (permalink / raw)
  To: linux-arm-kernel

On 04/11/14 17:24, Andre Przywara wrote:
> Hi,
> 
> On 04/11/14 15:44, Christoffer Dall wrote:
>> On Fri, Oct 31, 2014 at 05:26:50PM +0000, Andre Przywara wrote:
>>> For a GICv2 there is always only one (v)CPU involved: the one that
>>> does the access. On a GICv3 the access to a CPU redistributor is
>>> memory-mapped, but not banked, so the (v)CPU affected is determined by
>>> looking at the MMIO address region being accessed.
>>> To allow passing the affected CPU into the accessors, extend them to
>>> take an opaque private pointer parameter.
>>> For the current GICv2 emulation we ignore it and simply pass NULL
>>> on the call.
>>>
>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>>
>> Why does it have to be an opaque private pointer?  Would it not always
>> be a struct vcpu * or a vcpu_id then?
> 
> IIRC Marc suggested this once, to be more future-proof. Also a pointer makes
> it easier to pass NULL in the GICv2 parts of the code, which makes it
> more obvious that this value is not used in this case.
> 
> Marc, did I miss some more rationale?
> Does that still hold?

The main idea was to have a general purpose pointer that you can
associate with the decoded region. Some form of private context, just
like we have for a lot of other kernel structures.

Now, I think having that as an explicit pointer looks truly awful. Can't
that be folded into struct kvm_exit_mmio that is already passed around?
It would make some sense that the private context is associated with the
actual access... I haven't seen how that interacts with the GICv3 code
though.
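
Something like this is what I mean (sketch, field name invented, struct
quoted from memory):

struct kvm_exit_mmio {
	phys_addr_t	phys_addr;
	u8		data[8];
	u32		len;
	bool		is_write;
	void		*private;	/* e.g. the targeted redistributor vCPU */
};

The redistributor dispatch code would then set mmio->private to the
decoded vCPU before calling the range handler, while the GICv2 paths
would simply leave it NULL.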

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...


* [PATCH v3 05/19] arm/arm64: KVM: introduce per-VM ops
  2014-11-04 15:58     ` Andre Przywara
@ 2014-11-04 19:03       ` Christoffer Dall
  0 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-04 19:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 04, 2014 at 03:58:27PM +0000, Andre Przywara wrote:
> Hi Christoffer,
> 
> as hinted on IRC, an earlier reply on this one got lost on my machine,
> so please excuse my apparent ignorance on your previous comments.
> 
> On 03/11/14 13:59, Christoffer Dall wrote:
> > On Fri, Oct 31, 2014 at 05:26:40PM +0000, Andre Przywara wrote:
> >> Currently we only have one virtual GIC model supported, so all guests
> >> use the same emulation code. With the addition of another model we
> >> end up with different guests using potentially different vGIC models,
> >> so we have to split up some functions to be per VM.
> >> Introduce a vgic_vm_ops struct to hold function pointers for those
> >> functions that are different and provide the necessary code to
> >> initialize them.
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >> ---
> >>  include/kvm/arm_vgic.h |   10 ++++++
> >>  virt/kvm/arm/vgic.c    |   81 +++++++++++++++++++++++++++++++++++-------------
> >>  2 files changed, 69 insertions(+), 22 deletions(-)
> >>
> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >> index dde5a00..bfb660a 100644
> >> --- a/include/kvm/arm_vgic.h
> >> +++ b/include/kvm/arm_vgic.h
> >> @@ -134,6 +134,14 @@ struct vgic_params {
> >>  	void __iomem	*vctrl_base;
> >>  };
> >>  
> >> +struct vgic_vm_ops {
> >> +	bool	(*handle_mmio)(struct kvm_vcpu *, struct kvm_run *,
> >> +			       struct kvm_exit_mmio *);
> >> +	bool	(*queue_sgi)(struct kvm_vcpu *vcpu, int irq);
> >> +	void	(*add_sgi_source)(struct kvm_vcpu *vcpu, int irq, int source);
> >> +	int	(*vgic_init)(struct kvm *kvm, const struct vgic_params *params);
> >> +};
> >> +
> >>  struct vgic_dist {
> >>  #ifdef CONFIG_KVM_ARM_VGIC
> >>  	spinlock_t		lock;
> >> @@ -215,6 +223,8 @@ struct vgic_dist {
> >>  
> >>  	/* Bitmap indicating which CPU has something pending */
> >>  	unsigned long		*irq_pending_on_cpu;
> >> +
> >> +	struct vgic_vm_ops	vm_ops;
> >>  #endif
> >>  };
> >>  
> >> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> >> index 0cbdde9..2c16684 100644
> >> --- a/virt/kvm/arm/vgic.c
> >> +++ b/virt/kvm/arm/vgic.c
> >> @@ -105,6 +105,8 @@ static void vgic_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr);
> >>  static const struct vgic_ops *vgic_ops;
> >>  static const struct vgic_params *vgic;
> >>  
> >> +#define vgic_vm_op(kvm, fn) ((kvm)->arch.vgic.vm_ops.fn)
> >> +
> > 
> > another one?  why did you simply ignore my comment from the last review?
> > 
> > If it wasn't obvious last time around, YUCK, and no ;)
> 
> OK, which version would you like?
> 
> 1) Actually my first solution was to decode it on each call-site directly:
> 
> vcpu->kvm->arch.vgic.vm_ops.add_sgi_source(vcpu, lr.irq, lr.source);
> 
> However one reviewer suggested to wrap it with the macro you see above.
> 

yeah, those lines did become very long.

> 2) Provide a static inline for each seems like overkill, since there is
> only one caller for each of them, but it would look like this:
> 
> static void add_sgi_source(struct kvm_vcpu *vcpu, int irq, int source)
> {
>         vcpu->kvm->arch.vgic.vm_ops.add_sgi_source(vcpu, irq, source);
> }
> 
> Both don't look very convincing to me, so if you see other
> colors^Wsolutions, please let me know ;-)

I strongly prefer the static inline version, I don't think it's that
bad.  You could also stick with a macro solution, but then you should do
something like:

#define add_sgi_source(vcpu, irq, source) \
	vcpu->kvm->arch.vgic.vm_ops.add_sgi_source(vcpu, irq, source)

There's only a handful or so of these, right, so I really don't see a
big problem having a number of static inlines.

> 
> We have to choose between them at _runtime_, because there could be two
> guests with different vGIC models running at the same time.
> 
> >>  /*
> >>   * struct vgic_bitmap contains a bitmap made of unsigned longs, but
> >>   * extracts u32s out of them.
> >> @@ -761,6 +763,13 @@ static bool handle_mmio_sgi_reg(struct kvm_vcpu *vcpu,
> >>  	return false;
> >>  }
> >>  
> >> +static void vgic_v2_add_sgi_source(struct kvm_vcpu *vcpu, int irq, int source)
> >> +{
> >> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> >> +
> >> +	*vgic_get_sgi_sources(dist, vcpu->vcpu_id, irq) |= 1 << source;
> >> +}
> >> +
> >>  /**
> >>   * vgic_unqueue_irqs - move pending IRQs from LRs to the distributor
> >>   * @vgic_cpu: Pointer to the vgic_cpu struct holding the LRs
> >> @@ -775,9 +784,7 @@ static bool handle_mmio_sgi_reg(struct kvm_vcpu *vcpu,
> >>   */
> >>  static void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
> >>  {
> >> -	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> >>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> >> -	int vcpu_id = vcpu->vcpu_id;
> >>  	int i;
> >>  
> >>  	for_each_set_bit(i, vgic_cpu->lr_used, vgic_cpu->nr_lr) {
> >> @@ -804,7 +811,8 @@ static void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
> >>  		 */
> >>  		vgic_dist_irq_set_pending(vcpu, lr.irq);
> >>  		if (lr.irq < VGIC_NR_SGIS)
> >> -			*vgic_get_sgi_sources(dist, vcpu_id, lr.irq) |= 1 << lr.source;
> >> +			vgic_vm_op(vcpu->kvm, add_sgi_source)(vcpu, lr.irq,
> >> +							      lr.source);
> >>  		lr.state &= ~LR_STATE_PENDING;
> >>  		vgic_set_lr(vcpu, i, lr);
> >>  
> >> @@ -1162,7 +1170,7 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >>  	if (!irqchip_in_kernel(vcpu->kvm))
> >>  		return false;
> >>  
> >> -	return vgic_v2_handle_mmio(vcpu, run, mmio);
> >> +	return vgic_vm_op(vcpu->kvm, handle_mmio)(vcpu, run, mmio);
> >>  }
> >>  
> >>  static u8 *vgic_get_sgi_sources(struct vgic_dist *dist, int vcpu_id, int sgi)
> >> @@ -1414,7 +1422,7 @@ static bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
> >>  	return true;
> >>  }
> >>  
> >> -static bool vgic_queue_sgi(struct kvm_vcpu *vcpu, int irq)
> >> +static bool vgic_v2_queue_sgi(struct kvm_vcpu *vcpu, int irq)
> >>  {
> >>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> >>  	unsigned long sources;
> >> @@ -1489,7 +1497,7 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
> >>  
> >>  	/* SGIs */
> >>  	for_each_set_bit(i, vgic_cpu->pending_percpu, VGIC_NR_SGIS) {
> >> -		if (!vgic_queue_sgi(vcpu, i))
> >> +		if (!vgic_vm_op(vcpu->kvm, queue_sgi)(vcpu, i))
> >>  			overflow = 1;
> >>  	}
> >>  
> >> @@ -1944,9 +1952,6 @@ static int vgic_init_maps(struct kvm *kvm)
> >>  		}
> >>  	}
> >>  
> >> -	for (i = VGIC_NR_PRIVATE_IRQS; i < dist->nr_irqs; i += 4)
> >> -		vgic_set_target_reg(kvm, 0, i);
> >> -
> > 
> > Remind me, why are we moving this chunk?
> 
> The target registers are only valid for vGICv2 (we have other means for
> GICv3), so this belongs now into the vGICv2 specific code.
> 

ah, right, obvious ;)

Thanks,
-Christoffer


* [PATCH v3 06/19] arm/arm64: KVM: move [sg]et_lr into per-VM ops
  2014-11-04 16:30     ` Andre Przywara
@ 2014-11-04 19:12       ` Christoffer Dall
  0 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-04 19:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 04, 2014 at 04:30:42PM +0000, Andre Przywara wrote:
> Hi Christoffer,
> 
> On 03/11/14 14:15, Christoffer Dall wrote:
> > On Fri, Oct 31, 2014 at 05:26:41PM +0000, Andre Przywara wrote:
> >> The function to set the VGIC's list registers are not only dependent
> >> on the host GIC model, but need to behave slightly different for
> >> the type of emulated guest GIC.
> >> So move the functions into the new struct vgic_vm_ops and initialize
> >> them properly to prepare for guest GICv3 support later.
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >> ---
> >>  include/kvm/arm_vgic.h |    5 +++--
> >>  virt/kvm/arm/vgic-v2.c |   17 +++++++++++++++--
> >>  virt/kvm/arm/vgic-v3.c |   16 ++++++++++++++--
> >>  virt/kvm/arm/vgic.c    |    9 +++++++--
> >>  4 files changed, 39 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >> index bfb660a..a6d41f1 100644
> >> --- a/include/kvm/arm_vgic.h
> >> +++ b/include/kvm/arm_vgic.h
> >> @@ -108,8 +108,6 @@ struct vgic_vmcr {
> >>  };
> >>  
> >>  struct vgic_ops {
> >> -	struct vgic_lr	(*get_lr)(const struct kvm_vcpu *, int);
> >> -	void	(*set_lr)(struct kvm_vcpu *, int, struct vgic_lr);
> >>  	void	(*sync_lr_elrsr)(struct kvm_vcpu *, int, struct vgic_lr);
> >>  	u64	(*get_elrsr)(const struct kvm_vcpu *vcpu);
> >>  	u64	(*get_eisr)(const struct kvm_vcpu *vcpu);
> >> @@ -132,9 +130,12 @@ struct vgic_params {
> >>  	unsigned int	maint_irq;
> >>  	/* Virtual control interface base address */
> >>  	void __iomem	*vctrl_base;
> >> +	bool (*init_emul)(struct kvm *kvm, int type);
> >>  };
> >>  
> >>  struct vgic_vm_ops {
> >> +	struct vgic_lr	(*get_lr)(const struct kvm_vcpu *, int);
> >> +	void	(*set_lr)(struct kvm_vcpu *, int, struct vgic_lr);
> >>  	bool	(*handle_mmio)(struct kvm_vcpu *, struct kvm_run *,
> >>  			       struct kvm_exit_mmio *);
> >>  	bool	(*queue_sgi)(struct kvm_vcpu *vcpu, int irq);
> > 
> > 
> > this has now become incredibly confusing.  What are your thoughts on
> > renaming vgic_ops to kvm_gic_ops, to make it clear that this structure is
> > about the hardware-managing ops while vgic_vm_ops is about the vgic, the
> > virtual instance?
> 
> Mmh, makes some sense, but I am a bit reluctant to rename existing
> identifiers. I get about 20 hits for vgic_ops, so if you ack this
> rename, I can go ahead with it.
> 

Just something I wanted to throw out there for both you and Marc.  We
can change it later if needed; let's focus on getting these patches into
shape for now.

> > 
> >> diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
> >> index 2935405..bdc8d97 100644
> >> --- a/virt/kvm/arm/vgic-v2.c
> >> +++ b/virt/kvm/arm/vgic-v2.c
> >> @@ -143,8 +143,6 @@ static void vgic_v2_enable(struct kvm_vcpu *vcpu)
> >>  }
> >>  
> >>  static const struct vgic_ops vgic_v2_ops = {
> >> -	.get_lr			= vgic_v2_get_lr,
> >> -	.set_lr			= vgic_v2_set_lr,
> >>  	.sync_lr_elrsr		= vgic_v2_sync_lr_elrsr,
> >>  	.get_elrsr		= vgic_v2_get_elrsr,
> >>  	.get_eisr		= vgic_v2_get_eisr,
> >> @@ -158,6 +156,20 @@ static const struct vgic_ops vgic_v2_ops = {
> >>  
> >>  static struct vgic_params vgic_v2_params;
> >>  
> >> +static bool vgic_v2_init_emul(struct kvm *kvm, int type)
> >> +{
> >> +	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
> >> +
> >> +	switch (type) {
> >> +	case KVM_DEV_TYPE_ARM_VGIC_V2:
> >> +		vm_ops->get_lr = vgic_v2_get_lr;
> >> +		vm_ops->set_lr = vgic_v2_set_lr;
> >> +		return true;
> >> +	}
> >> +
> >> +	return false;
> >> +}
> >> +
> >>  /**
> >>   * vgic_v2_probe - probe for a GICv2 compatible interrupt controller in DT
> >>   * @node:	pointer to the DT node
> >> @@ -196,6 +208,7 @@ int vgic_v2_probe(struct device_node *vgic_node,
> >>  		ret = -ENOMEM;
> >>  		goto out;
> >>  	}
> >> +	vgic->init_emul = vgic_v2_init_emul;
> >>  
> >>  	vgic->nr_lr = readl_relaxed(vgic->vctrl_base + GICH_VTR);
> >>  	vgic->nr_lr = (vgic->nr_lr & 0x3f) + 1;
> >> diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
> >> index 1c2c8ee..a38339e 100644
> >> --- a/virt/kvm/arm/vgic-v3.c
> >> +++ b/virt/kvm/arm/vgic-v3.c
> >> @@ -157,8 +157,6 @@ static void vgic_v3_enable(struct kvm_vcpu *vcpu)
> >>  }
> >>  
> >>  static const struct vgic_ops vgic_v3_ops = {
> >> -	.get_lr			= vgic_v3_get_lr,
> >> -	.set_lr			= vgic_v3_set_lr,
> >>  	.sync_lr_elrsr		= vgic_v3_sync_lr_elrsr,
> >>  	.get_elrsr		= vgic_v3_get_elrsr,
> >>  	.get_eisr		= vgic_v3_get_eisr,
> >> @@ -170,6 +168,19 @@ static const struct vgic_ops vgic_v3_ops = {
> >>  	.enable			= vgic_v3_enable,
> >>  };
> >>  
> >> +static bool vgic_v3_init_emul_compat(struct kvm *kvm, int type)
> >> +{
> >> +	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
> >> +
> >> +	switch (type) {
> >> +	case KVM_DEV_TYPE_ARM_VGIC_V2:
> >> +		vm_ops->get_lr = vgic_v3_get_lr;
> >> +		vm_ops->set_lr = vgic_v3_set_lr;
> >> +		return true;
> >> +	}
> >> +	return false;
> >> +}
> >> +
> >>  static struct vgic_params vgic_v3_params;
> >>  
> >>  /**
> >> @@ -231,6 +242,7 @@ int vgic_v3_probe(struct device_node *vgic_node,
> >>  		goto out;
> >>  	}
> >>  
> >> +	vgic->init_emul = vgic_v3_init_emul_compat;
> >>  	vgic->vcpu_base = vcpu_res.start;
> >>  	vgic->vctrl_base = NULL;
> >>  	vgic->type = VGIC_V3;
> >> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> >> index 2c16684..8c2e707 100644
> >> --- a/virt/kvm/arm/vgic.c
> >> +++ b/virt/kvm/arm/vgic.c
> >> @@ -1278,13 +1278,13 @@ static void vgic_update_state(struct kvm *kvm)
> >>  
> >>  static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr)
> >>  {
> >> -	return vgic_ops->get_lr(vcpu, lr);
> >> +	return vgic_vm_op(vcpu->kvm, get_lr)(vcpu, lr);
> >>  }
> >>  
> >>  static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr,
> >>  			       struct vgic_lr vlr)
> >>  {
> >> -	vgic_ops->set_lr(vcpu, lr, vlr);
> >> +	return vgic_vm_op(vcpu->kvm, set_lr)(vcpu, lr, vlr);
> >>  }
> >>  
> >>  static void vgic_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
> >> @@ -2072,6 +2072,11 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
> >>  		}
> >>  	}
> >>  
> >> +	if (!vgic->init_emul(kvm, type)) {
> >> +		ret = -ENODEV;
> >> +		goto out_unlock;
> >> +	}
> >> +
> >>  	spin_lock_init(&kvm->arch.vgic.lock);
> >>  	kvm->arch.vgic.in_kernel = true;
> >>  	kvm->arch.vgic.vgic_model = type;
> >> -- 
> >> 1.7.9.5
> >>
> > 
> > Thanks for splitting up the patches, it's certainly better to review.
> > 
> > However, my question from the last round still stands.  What you're
> > doing here is setting a sh*tload of function pointers through an amazing
> > amount of abstractions to avoid something like
> > 
> > void vgic_v2_set_lr(struct kvm_vgic *vgic)
> > {
> > 	switch (vgic->type) {
> > 	case KVM_DEV_TYPE_ARM_VGIC_V2:
> > 		foo();
> > 		break;
> > 	case KVM_DEV_TYPE_ARM_VGIC_V3:
> > 		bar();
> > 		break;
> > 	}
> > }
> > 
> > So I have to ask: What's the benefit? That you'll have fewer
> > conditionals?  But god have mercy on the poor people having to debug
> > some issue and figure out which function the code actually calls when it
> > (inside another complicated piece of logic) sets a LR.
> 
> So the big aim here is to separate the stages cleanly. We have the
> backend (vgic-v2.c and vgic-v3.c), which cares about the host hardware
> specific functions. Then we have the frontend (vgic-v2-emul.c and
> vgic-v3-emul.c), which cares about the guest-facing emulation part.
> vgic.c is now just the "middle end", connecting one of the front-ends
> with one of the back-ends - depending on both the host's hardware
> (back-end) and the user's choices at VM creation time (front-end).
> Ideally any new hardware or emulation model would just require an extra
> file with little or no changes to vgic.c.
> Not sure whether we will see many more instances of either the front- or
> back-end, but I like the possibility of adding one later - this may
> already be useful for the ITS emulation.
> 
> So my goal was to avoid any emulation- or host-specific calls in vgic.c
> and just rely on proper initialization in an init() function.
> Your example above, for instance, would require exporting the respective
> functions, putting their prototypes in the header file, and enumerating
> all combinations in vgic.c, which I don't like very much.

I accept the point that you set things up cleanly and thus you only have
to make a single call that abstracts something away underneath to get a
cleaner middle-end implementation.
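
(And to be fair, the indirection itself is tiny - if I read the earlier
patch correctly it boils down to something like

	#define vgic_vm_op(kvm, fn)	((kvm)->arch.vgic.vm_ops.fn)

so the call sites themselves stay readable enough.)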

However, in my experience the things we're debugging with this type of
code require you to go through the code and carefully consider what the
hardware registers and state end up looking like, and the thing is, now
you see some function pointer being called, but there's really no easy
way to resolve that into an actual function implementation.

The way you've written the code requires you to go through 4 files even
without GICv3 emulation in place (vgic.c, vgic-v2.c, vgic-v3.c, and
vgic-v2-emul.c), find all the init functions, keep all the settings in
your head, remember what function some pointer was configured to use, go
back to where you came from, see which arguments are passed to that
function, look up that function, and continue debugging.  I think that's
a high price to pay for some theoretically cleaner abstraction.

There's just something too complicated about the way it is now, so we
need to do something.

> 
> I can give it a try anyway to check what it looks like and what it gives us.
> 
> > This just feels like we're doing something incredibly wrong...
> 
> TBH I have this feeling sometimes when reading the VGIC code (especially
> the endianness code), which is probably just due to the nature of the
> GIC being a beast in itself - emulating it is not a walk in the park.
> That applies even more to the GICv3, and to emulating a GICv2 on top of
> a hardware GICv3, for instance.
> 

Marc and I talked about that very feeling during KVM Forum.  I think we
could address some of that, sacrificing a bit of performance for reduced
state and less code, but overall yes, the GIC is complicated and
incompatible with regular humans.

-Christoffer


* [PATCH v3 15/19] arm/arm64: KVM: add opaque private pointer to MMIO accessors
  2014-11-04 18:05       ` Marc Zyngier
@ 2014-11-04 19:18         ` Christoffer Dall
  2014-11-04 20:17           ` Marc Zyngier
  0 siblings, 1 reply; 76+ messages in thread
From: Christoffer Dall @ 2014-11-04 19:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 04, 2014 at 06:05:17PM +0000, Marc Zyngier wrote:
> On 04/11/14 17:24, Andre Przywara wrote:
> > Hi,
> > 
> > On 04/11/14 15:44, Christoffer Dall wrote:
> >> On Fri, Oct 31, 2014 at 05:26:50PM +0000, Andre Przywara wrote:
> >>> For a GICv2 there is always only one (v)CPU involved: the one that
> >>> does the access. On a GICv3 the access to a CPU redistributor is
> >>> memory-mapped, but not banked, so the (v)CPU affected is determined by
> >>> looking at the MMIO address region being accessed.
> >>> To allow passing the affected CPU into the accessors, extend them to
> >>> take an opaque private pointer parameter.
> >>> For the current GICv2 emulation we ignore it and simply pass NULL
> >>> on the call.
> >>>
> >>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >>
> >> Why does it have to be an opaque private pointer?  Would it not always
> >> be a struct vcpu * or a vcpu_id then?
> > 
> > IIRC Marc suggested this once to be more future-proof. Also a pointer makes
> > it easier to pass NULL in the GICv2 parts of the code, which makes it
> > more obvious that this value is not used in this case.
> > 
> > Marc, did I miss some more rationale?
> > Does that still hold?
> 
> The main idea was to have a general purpose pointer that you can
> associate with the decoded region. Some form of private context, just
> like we have for a lot of other kernel structures.
> 
> Now, I think having that as an explicit pointer looks truly awful. Can't
> that be folded into struct kvm_exit_mmio that is already passed around?
> It would make some sense that the private context is associated with the
> actual access... I haven't seen how that interacts with the GICv3 code
> though.
> 
Well, the idea with a (void *private) is to have something *generic*
that is reusable and extendable, no argument there.

So my question is, are we implementing some generic feature, where
having that extendability makes things better and clearer, or are we
just wrapping an int in a (void *) so we don't have to add another
parameter if sometime in the unknown future we need another additional
piece of information.

There are plenty of examples where you just pass NULL to a typed pointer
or 0 to an int parameter as well.

I'm not trying to fight the idea of a private pointer, I just want to
make sure we do what we can to keep this code somewhat sane, so if we
have a set of functions where we in 75% of the cases pass a vcpu * and
in the other cases don't, then I really think we want a vcpu *
parameter.
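
Just to make the two alternatives concrete (rough sketch from memory,
the redist_vcpu name is made up):

	/* explicit, typed parameter */
	bool (*handle_mmio)(struct kvm_vcpu *vcpu, struct kvm_exit_mmio *mmio,
			    phys_addr_t offset, struct kvm_vcpu *redist_vcpu);

	/* vs. folding the context into the access itself, as Marc suggests */
	struct kvm_exit_mmio {
		/* existing fields: phys_addr, data, len, is_write */
		void		*private;
	};

where the second variant would leave the handler prototypes alone.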

-Christoffer


* [PATCH v3 12/19] arm/arm64: KVM: refactor/wrap vgic_set/get_attr()
  2014-10-31 17:26 ` [PATCH v3 12/19] arm/arm64: KVM: refactor/wrap vgic_set/get_attr() Andre Przywara
@ 2014-11-04 19:30   ` Christoffer Dall
  2014-11-05 10:27     ` Andre Przywara
  0 siblings, 1 reply; 76+ messages in thread
From: Christoffer Dall @ 2014-11-04 19:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:47PM +0000, Andre Przywara wrote:
> vgic_set_attr() and vgic_get_attr() contain both code specific for
> the emulated GIC as well as code for the userland facing, generic
> part of the GIC.
> Split the guest GIC facing code of from the generic part to allow
> easier splitting later.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

It's not really clear to me which data is specific to the emulated gic
and which is not or why you have to do this (yet), for example, the
_common function is now dealing with the GRP_ADDR case which is very
GICv2 specific (so far).  But I assume this will make sense as I
progress through the series.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>


* [PATCH v3 13/19] arm/arm64: KVM: add vgic.h header file
  2014-10-31 17:26 ` [PATCH v3 13/19] arm/arm64: KVM: add vgic.h header file Andre Przywara
@ 2014-11-04 19:30   ` Christoffer Dall
  0 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-04 19:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:48PM +0000, Andre Przywara wrote:
> vgic.c is currently a mixture of generic vGIC emulation code and
> functions specific to emulating a GICv2. To ease the addition of
> GICv3 later, we create a new header file vgic.h, which holds constants
> and prototypes of commonly used functions.
> I removed the long-standing comment about using the kvm_io_bus API
> to tackle the GIC register ranges, as it wouldn't be a win for us
> anymore.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> 
> -------
> As the diff isn't always obvious here (and to aid eventual rebases),
> here is a list of high-level changes done to the code:
> * moved definitions and prototypes from vgic.c to vgic.h:
>   - VGIC_ADDR_UNDEF
>   - ACCESS_{READ,WRITE}_*
>   - vgic_update_state()
>   - vgic_kick_vcpus()
>   - vgic_get_vmcr()
>   - vgic_set_vmcr()
>   - struct mmio_range {}
>   - IS_IN_RANGE() macro

should we worry about generic names now being exported and think about
renaming to things like kvm_mmio_range ?

(For the record, I'm not a strong proponent of this idea, just thought
it better to raise the issue now than later.)

Otherwise:

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>


* [PATCH v3 14/19] arm/arm64: KVM: split GICv2 specific emulation code from vgic.c
  2014-10-31 17:26 ` [PATCH v3 14/19] arm/arm64: KVM: split GICv2 specific emulation code from vgic.c Andre Przywara
@ 2014-11-04 19:30   ` Christoffer Dall
  0 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-04 19:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:49PM +0000, Andre Przywara wrote:
> vgic.c is currently a mixture of generic vGIC emulation code and
> functions specific to emulating a GICv2. To ease the addition of
> GICv3, split off strictly v2 specific parts into a new file
> vgic-v2-emul.c.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> 
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>


* [PATCH v3 15/19] arm/arm64: KVM: add opaque private pointer to MMIO accessors
  2014-11-04 19:18         ` Christoffer Dall
@ 2014-11-04 20:17           ` Marc Zyngier
  2014-11-05  9:49             ` Christoffer Dall
  0 siblings, 1 reply; 76+ messages in thread
From: Marc Zyngier @ 2014-11-04 20:17 UTC (permalink / raw)
  To: linux-arm-kernel

On 04/11/14 19:18, Christoffer Dall wrote:
> On Tue, Nov 04, 2014 at 06:05:17PM +0000, Marc Zyngier wrote:
>> On 04/11/14 17:24, Andre Przywara wrote:
>>> Hi,
>>>
>>> On 04/11/14 15:44, Christoffer Dall wrote:
>>>> On Fri, Oct 31, 2014 at 05:26:50PM +0000, Andre Przywara wrote:
>>>>> For a GICv2 there is always only one (v)CPU involved: the one that
>>>>> does the access. On a GICv3 the access to a CPU redistributor is
>>>>> memory-mapped, but not banked, so the (v)CPU affected is determined by
>>>>> looking at the MMIO address region being accessed.
>>>>> To allow passing the affected CPU into the accessors, extend them to
>>>>> take an opaque private pointer parameter.
>>>>> For the current GICv2 emulation we ignore it and simply pass NULL
>>>>> on the call.
>>>>>
>>>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>>>>
>>>> Why does it have to be an opaque private pointer?  Would it not always
>>>> be a struct vcpu * or a vcpu_id then?
>>>
>>> IIRC Marc suggested this once to be more future-proof. Also a pointer makes
>>> it easier to pass NULL in the GICv2 parts of the code, which makes it
>>> more obvious that this value is not used in this case.
>>>
>>> Marc, did I miss some more rationale?
>>> Does that still hold?
>>
>> The main idea was to have a general purpose pointer that you can
>> associate with the decoded region. Some form of private context, just
>> like we have for a lot of other kernel structures.
>>
>> Now, I think having that as an explicit pointer looks truly awful. Can't
>> that be folded into struct kvm_exit_mmio that is already passed around?
>> It would make some sense that the private context is associated with the
>> actual access... I haven't seen how that interacts with the GICv3 code
>> though.
>>
> Well, the idea with a (void *private) is to have something *generic*
> that is reusable and extendable, no argument there.
> 
> So my question is, are we implementing some generic feature, where
> having that extendability makes things better and clearer, or are we
> just wrapping an int in a (void *) so we don't have to add another
> parameter if sometime in the unknown future we need another additional
> piece of information.
> 
> There are plenty of examples where you just pass NULL to a typed pointer
> or 0 to an int parameter as well.
> 
> I'm not trying to fight the idea of a private pointer, I just want to
> make sure we do what we can to keep this code somewhat sane, so if we
> have a set of functions where we in 75% of the cases pass a vcpu * and
> in the other cases don't, then I really think we want a vcpu *
> parameter.

For the time being, I don't see any other use than a vcpu pointer for
the GICv3 case. Now, none of the MMIO decoding framework is GICv3
specific, and it feels a bit weird to hardcode the idea of a vcpu
pointer being passed around for code that doesn't really care about it
(GICv2).

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...


* [PATCH v3 15/19] arm/arm64: KVM: add opaque private pointer to MMIO accessors
  2014-11-04 20:17           ` Marc Zyngier
@ 2014-11-05  9:49             ` Christoffer Dall
  0 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-05  9:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 04, 2014 at 08:17:07PM +0000, Marc Zyngier wrote:
> On 04/11/14 19:18, Christoffer Dall wrote:
> > On Tue, Nov 04, 2014 at 06:05:17PM +0000, Marc Zyngier wrote:
> >> On 04/11/14 17:24, Andre Przywara wrote:
> >>> Hi,
> >>>
> >>> On 04/11/14 15:44, Christoffer Dall wrote:
> >>>> On Fri, Oct 31, 2014 at 05:26:50PM +0000, Andre Przywara wrote:
> >>>>> For a GICv2 there is always only one (v)CPU involved: the one that
> >>>>> does the access. On a GICv3 the access to a CPU redistributor is
> >>>>> memory-mapped, but not banked, so the (v)CPU affected is determined by
> >>>>> looking at the MMIO address region being accessed.
> >>>>> To allow passing the affected CPU into the accessors, extend them to
> >>>>> take an opaque private pointer parameter.
> >>>>> For the current GICv2 emulation we ignore it and simply pass NULL
> >>>>> on the call.
> >>>>>
> >>>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >>>>
> >>>> Why does it have to be an opaque private pointer?  Would it not always
> >>>> be a struct vcpu * or a vcpu_id then?
> >>>
> >>> IIRC Marc suggested this once to be more future-proof. Also a pointer makes
> >>> it easier to pass NULL in the GICv2 parts of the code, which makes it
> >>> more obvious that this value is not used in this case.
> >>>
> >>> Marc, did I miss some more rationale?
> >>> Does that still hold?
> >>
> >> The main idea was to have a general purpose pointer that you can
> >> associate with the decoded region. Some form of private context, just
> >> like we have for a lot of other kernel structures.
> >>
> >> Now, I think having that as an explicit pointer looks truly awful. Can't
> >> that be folded into struct kvm_exit_mmio that is already passed around?
> >> It would make some sense that the private context is associated with the
> >> actual access... I haven't seen how that interacts with the GICv3 code
> >> though.
> >>
> > Well, the idea with a (void *private) is to have something *generic*
> > that is reusable and extendable, no argument there.
> > 
> > So my question is, are we implementing some generic feature, where
> > having that extendability makes things better and clearer, or are we
> > just wrapping an int in a (void *) so we don't have to add another
> > parameter if sometime in the unknown future we need another additional
> > piece of information.
> > 
> > There are plenty of examples where you just pass NULL to a typed pointer
> > or 0 to an int parameter as well.
> > 
> > I'm not trying to fight the idea of a private pointer, I just want to
> > make sure we do what we can to keep this code somewhat sane, so if we
> > have a set of functions where we in 75% of the cases pass a vcpu * and
> > in the other cases don't, then I really think we want a vcpu *
> > parameter.
> 
> For the time being, I don't see any other use than a vcpu pointer for
> the GICv3 case. Now, none of the MMIO decoding framework is GICv3
> specific, and it feels a bit weird to hardcode the idea of a vcpu
> pointer being passed around for code that doesn't really care about it
> (GICv2).
> 
I don't think it's that bad.  It would be just like pud_free() and
friends which ignore the struct mm * parameter.  But anyhow, if you
feel strongly about one way or the other, then go with it.  I've said
my piece.

-Christoffer


* [PATCH v3 12/19] arm/arm64: KVM: refactor/wrap vgic_set/get_attr()
  2014-11-04 19:30   ` Christoffer Dall
@ 2014-11-05 10:27     ` Andre Przywara
  2014-11-05 10:37       ` Andre Przywara
  2014-11-05 12:57       ` Christoffer Dall
  0 siblings, 2 replies; 76+ messages in thread
From: Andre Przywara @ 2014-11-05 10:27 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Christoffer,

On 04/11/14 19:30, Christoffer Dall wrote:
> On Fri, Oct 31, 2014 at 05:26:47PM +0000, Andre Przywara wrote:
>> vgic_set_attr() and vgic_get_attr() contain both code specific for
>> the emulated GIC as well as code for the userland facing, generic
>> part of the GIC.
> >> Split the guest GIC facing code off from the generic part to allow
>> easier splitting later.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> 
> It's not really clear to me which data is specific to the emulated gic
> and which is not or why you have to do this (yet), for example, the
> _common function is now dealing with the GRP_ADDR case which is very
> GICv2 specific (so far).  But I assume this will make sense as I
> progress through the series.

Admittedly this is somewhat of a corner case. Actually I tried to keep
as much code common (in vgic.c) as possible, and it was possible without
much pain for GRP_ADDR and kvm_vgic_addr.
Also I consider this call part of the switching and connecting
functionality of the VGIC.
Looking at the code again I think I had it in -emul.c before, but
decided to move it back for some reason (probably some other code
dependency which needed to be exposed). So unless I find some time ;-)
and a good reason to move it I tend to keep it here.

Cheers,
Andre.


* [PATCH v3 12/19] arm/arm64: KVM: refactor/wrap vgic_set/get_attr()
  2014-11-05 10:27     ` Andre Przywara
@ 2014-11-05 10:37       ` Andre Przywara
  2014-11-05 12:57       ` Christoffer Dall
  1 sibling, 0 replies; 76+ messages in thread
From: Andre Przywara @ 2014-11-05 10:37 UTC (permalink / raw)
  To: linux-arm-kernel

On 05/11/14 10:27, Andre Przywara wrote:
> Hi Christoffer,
>
> On 04/11/14 19:30, Christoffer Dall wrote:
>> On Fri, Oct 31, 2014 at 05:26:47PM +0000, Andre Przywara wrote:
>>> vgic_set_attr() and vgic_get_attr() contain both code specific for
>>> the emulated GIC as well as code for the userland facing, generic
>>> part of the GIC.
>>> Split the guest GIC facing code off from the generic part to allow
>>> easier splitting later.
>>>
>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>>
>> It's not really clear to me which data is specific to the emulated gic
>> and which is not or why you have to do this (yet), for example, the
>> _common function is now dealing with the GRP_ADDR case which is very
>> GICv2 specific (so far).  But I assume this will make sense as I
>> progress through the series.
>
> Admittedly this is somewhat of a corner case. Actually I tried to keep
> as much code common (in vgic.c) as possible, and it was possible without
> much pain for GRP_ADDR and kvm_vgic_addr.
> Also I consider this call part of the switching and connecting
> functionality of the VGIC.
> Looking at the code again I think I had it in -emul.c before, but
> decided to move it back for some reason (probably some other code
> dependency which needed to be exposed). So unless I find some time ;-)
> and a good reason to move it I tend to keep it here.

... just found that kvm_vgic_addr() is not static, but is also called from
the (now legacy) KVM_ARM_SET_DEVICE_ADDR ioctl. So moving it would have
caused more churn than it gained us in clean separation.

Cheers,
Andre.



* [PATCH v3 12/19] arm/arm64: KVM: refactor/wrap vgic_set/get_attr()
  2014-11-05 10:27     ` Andre Przywara
  2014-11-05 10:37       ` Andre Przywara
@ 2014-11-05 12:57       ` Christoffer Dall
  1 sibling, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-05 12:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 05, 2014 at 10:27:43AM +0000, Andre Przywara wrote:
> Hi Christoffer,
> 
> On 04/11/14 19:30, Christoffer Dall wrote:
> > On Fri, Oct 31, 2014 at 05:26:47PM +0000, Andre Przywara wrote:
> >> vgic_set_attr() and vgic_get_attr() contain both code specific for
> >> the emulated GIC as well as code for the userland facing, generic
> >> part of the GIC.
> >> Split the guest GIC facing code off from the generic part to allow
> >> easier splitting later.
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > 
> > It's not really clear to me which data is specific to the emulated gic
> > and which is not or why you have to do this (yet), for example, the
> > _common function is now dealing with the GRP_ADDR case which is very
> > GICv2 specific (so far).  But I assume this will make sense as I
> > progress through the series.
> 
> Admittedly this is somewhat of a corner case. Actually I tried to keep
> as much code common (in vgic.c) as possible, and it was possible without
> much pain for GRP_ADDR and kvm_vgic_addr.
> Also I consider this call part of the switching and connecting
> functionality of the VGIC.
> Looking at the code again I think I had it in -emul.c before, but
> decided to move it back for some reason (probably some other code
> dependency which needed to be exposed). So unless I find some time ;-)
> and a good reason to move it I tend to keep it here.
> 
That's fine, my comment was more directed at the commit message than the
code itself; the patch itself looks ok.

-Christoffer


* [PATCH v3 00/19] KVM GICv3 emulation
  2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
                   ` (19 preceding siblings ...)
  2014-11-03 12:59 ` [PATCH v3 00/19] KVM GICv3 emulation Christoffer Dall
@ 2014-11-06 10:57 ` Christoffer Dall
  2014-11-06 11:21   ` Christoffer Dall
  20 siblings, 1 reply; 76+ messages in thread
From: Christoffer Dall @ 2014-11-06 10:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:35PM +0000, Andre Przywara wrote:

[...]

> 
> Please review and test.
> I would be grateful for people to test for GICv2 regressions also
> (so on a GICv2 host with current kvmtool/qemu), as there is quite
> some refactoring on that front.
> 
So looking at the final result, we have a very strange flow with the
vgic_create() and kvm_vgic_create() functions.  I lost track in all the
rewrite patches how this happened exactly, but what I think you want to
end up with is:

one exported function:
int kvm_vgic_create(struct kvm_device *dev, u32 type);

which calls a static function:
static int vgic_create(struct kvm *kvm, u32 type);

or simply inline the static one in the exported one, I can't seem to
find other callers.

Can you take a look at this?

Thanks,
-Christoffer


* [PATCH v3 00/19] KVM GICv3 emulation
  2014-11-06 10:57 ` Christoffer Dall
@ 2014-11-06 11:21   ` Christoffer Dall
  2014-11-06 15:13     ` Andre Przywara
  0 siblings, 1 reply; 76+ messages in thread
From: Christoffer Dall @ 2014-11-06 11:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 06, 2014 at 11:57:51AM +0100, Christoffer Dall wrote:
> On Fri, Oct 31, 2014 at 05:26:35PM +0000, Andre Przywara wrote:
> 
> [...]
> 
> > 
> > Please review and test.
> > I would be grateful for people to test for GICv2 regressions also
> > (so on a GICv2 host with current kvmtool/qemu), as there is quite
> > some refactoring on that front.
> > 
> So looking at the final result, we have a very strange flow with the
> vgic_create() and kvm_vgic_create() functions.  I lost track in all the
> rewrite patches how this happened exactly, but what I think you want to
> end up with is:
> 
> one exported function:
> int kvm_vgic_create(struct kvm_device *dev, u32 type);
> 
> which calls a static function:
> static int vgic_create(struct kvm *kvm, u32 type);
> 
> or simply inline the static one in the exported one, I can't seem to
> find other callers.
> 
Strike that, my cscope setup was messed up.

What I think you want is a static vgic_v3_create in vgic-v3-emul.c that
digs out the struct kvm pointer from the struct kvm_device and calls
kvm_vgic_create() and also just copy that single kfree(dev) line into
vgic_v3_destroy in vgic-v3-emul.c and the same for v2.
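
Roughly (untested, just to illustrate what I mean):

	static int vgic_v3_create(struct kvm_device *dev, u32 type)
	{
		return kvm_vgic_create(dev->kvm, type);
	}

	static void vgic_v3_destroy(struct kvm_device *dev)
	{
		kfree(dev);
	}

and the same pair for v2, so the kvm_device_ops can point at file-local
functions.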

Thanks,
-Christoffer


* [PATCH v3 00/19] KVM GICv3 emulation
  2014-11-06 11:21   ` Christoffer Dall
@ 2014-11-06 15:13     ` Andre Przywara
  2014-11-06 18:09       ` Christoffer Dall
  0 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-11-06 15:13 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Christoffer,

On 06/11/14 11:21, Christoffer Dall wrote:
> On Thu, Nov 06, 2014 at 11:57:51AM +0100, Christoffer Dall wrote:
>> On Fri, Oct 31, 2014 at 05:26:35PM +0000, Andre Przywara wrote:
>>
>> [...]
>>
>>>
>>> Please review and test.
>>> I would be grateful for people to test for GICv2 regressions also
>>> (so on a GICv2 host with current kvmtool/qemu), as there is quite
>>> some refactoring on that front.
>>>
>> So looking at the final result, we have a very strange flow with the
>> vgic_create() and kvm_vgic_create() functions.  I lost track in all the
>> rewrite patches how this happened exactly, but what I think you want to
>> end up with is:
>>
>> one exported function:
>> int kvm_vgic_create(struct kvm_device *dev, u32 type);
>>
>> which calls a static function:
>> static int vgic_create(struct kvm *kvm, u32 type);
>>
>> or simply inline the static one in the exported one, I can't seem to
>> find other callers.
>>
> Strike that, my cscope setup was messed up.
> 
> What I think you want is a static vgic_v3_create in vgic-v3-emul.c that
> digs out the struct kvm pointer from the struct kvm_device and calls
> kvm_vgic_create() and also just copy that single kfree(dev) line into
> vgic_v3_destroy in vgic-v3-emul.c and the same for v2.

So you want to remove vgic_destroy() and vgic_create() from vgic.c and
vgic.h and use a static version of it in vgic-v[23]-emul.c instead?
To avoid the two prototypes and make the declaration of the struct
kvm_device_ops look nicer? Did I get this right?

I did that now, it looks a bit neater, at the cost of small code
duplication of admittedly trivial code. So as long as there isn't
something added to those functions, that's probably fine.

So I will include it in the next version.

Cheers,
Andre.


* [PATCH v3 00/19] KVM GICv3 emulation
  2014-11-06 15:13     ` Andre Przywara
@ 2014-11-06 18:09       ` Christoffer Dall
  0 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-06 18:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 6, 2014 at 4:13 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> Hi Christoffer,
>
> On 06/11/14 11:21, Christoffer Dall wrote:
>> On Thu, Nov 06, 2014 at 11:57:51AM +0100, Christoffer Dall wrote:
>>> On Fri, Oct 31, 2014 at 05:26:35PM +0000, Andre Przywara wrote:
>>>
>>> [...]
>>>
>>>>
>>>> Please review and test.
>>>> I would be grateful for people to test for GICv2 regressions also
>>>> (so on a GICv2 host with current kvmtool/qemu), as there is quite
>>>> some refactoring on that front.
>>>>
>>> So looking at the final result, we have a very strange flow with the
>>> vgic_create() and kvm_vgic_create() functions.  I lost track in all the
>>> rewrite patches how this happened exactly, but what I think you want to
>>> end up with is:
>>>
>>> one exported function:
>>> int kvm_vgic_create(struct kvm_device *dev, u32 type);
>>>
>>> which calls a static function:
>>> static int vgic_create(struct kvm *kvm, u32 type);
>>>
>>> or simply inline the static one in the exported one, I can't seem to
>>> find other callers.
>>>
>> Strike that, my cscope setup was messed up.
>>
>> What I think you want is a static vgic_v3_create in vgic-v3-emul.c that
>> digs out the struct kvm pointer from the struct kvm_device and calls
>> kvm_vgic_create() and also just copy that single kfree(dev) line into
>> vgic_v3_destroy in vgic-v3-emul.c and the same for v2.
>
> So you want to remove vgic_destroy() and vgic_create() from vgic.c and
> vgic.h and use a static version of it in vgic-v[23]-emul.c instead?
> To avoid the two prototypes and make the declaration of the struct
> kvm_device_ops look nicer? Did I get this right?
>
> I did that now, it looks a bit neater, at the cost of small code
> duplication of admittedly trivial code. So as long as there isn't
> something added to those functions, that's probably fine.

Exactly!

>
> So I will include it in the next version.
>
Thanks,
-Christoffer


* [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation
  2014-10-31 17:26 ` [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation Andre Przywara
@ 2014-11-07 14:30   ` Christoffer Dall
  2014-11-10 17:30     ` [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation / PART 1 Andre Przywara
  2014-11-12 12:39     ` [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation / PART 2 Andre Przywara
  0 siblings, 2 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-07 14:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:51PM +0000, Andre Przywara wrote:
> With everything separated and prepared, we implement a model of a
> GICv3 distributor and redistributors by using the existing framework
> to provide handler functions for each register group.

new paragraph

> Currently we limit the emulation to a model enforcing a single
> security state, with SRE==1 (forcing system register access) and
> ARE==1 (allowing more than 8 VCPUs).

new paragraph

> We share some of the functions provided for GICv2 emulation, but take
> the different ways of addressing (v)CPUs into account.
> Save and restore is currently not implemented.
> 
> Similar to the split-off GICv2 specific code, the new emulation code
> goes into a new file (vgic-v3-emul.c).
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  arch/arm64/kvm/Makefile            |    1 +
>  include/kvm/arm_vgic.h             |   10 +-
>  include/linux/irqchip/arm-gic-v3.h |   26 ++
>  include/linux/kvm_host.h           |    1 +
>  include/uapi/linux/kvm.h           |    2 +
>  virt/kvm/arm/vgic-v3-emul.c        |  891 ++++++++++++++++++++++++++++++++++++
>  virt/kvm/arm/vgic.c                |   11 +-
>  virt/kvm/arm/vgic.h                |    3 +
>  8 files changed, 942 insertions(+), 3 deletions(-)
>  create mode 100644 virt/kvm/arm/vgic-v3-emul.c
> 
> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
> index d957353..4e6e09e 100644
> --- a/arch/arm64/kvm/Makefile
> +++ b/arch/arm64/kvm/Makefile
> @@ -24,5 +24,6 @@ kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v2.o
>  kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v2-emul.o
>  kvm-$(CONFIG_KVM_ARM_VGIC) += vgic-v2-switch.o
>  kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v3.o
> +kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v3-emul.o
>  kvm-$(CONFIG_KVM_ARM_VGIC) += vgic-v3-switch.o
>  kvm-$(CONFIG_KVM_ARM_TIMER) += $(KVM)/arm/arch_timer.o
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 8827bc7..c303083 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -160,7 +160,11 @@ struct vgic_dist {
>  
>  	/* Distributor and vcpu interface mapping in the guest */
>  	phys_addr_t		vgic_dist_base;
> -	phys_addr_t		vgic_cpu_base;
> +	/* GICv2 and GICv3 use different mapped register blocks */
> +	union {
> +		phys_addr_t		vgic_cpu_base;
> +		phys_addr_t		vgic_redist_base;
> +	};
>  
>  	/* Distributor enabled */
>  	u32			enabled;
> @@ -222,6 +226,9 @@ struct vgic_dist {
>  	 */
>  	struct vgic_bitmap	*irq_spi_target;
>  
> +	/* Target MPIDR for each IRQ (needed for GICv3 IROUTERn) only */
> +	u32			*irq_spi_mpidr;
> +
>  	/* Bitmap indicating which CPU has something pending */
>  	unsigned long		*irq_pending_on_cpu;
>  
> @@ -297,6 +304,7 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>  			bool level);
> +void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>  bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  		      struct kvm_exit_mmio *mmio);
> diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
> index 03a4ea3..6a649bc 100644
> --- a/include/linux/irqchip/arm-gic-v3.h
> +++ b/include/linux/irqchip/arm-gic-v3.h
> @@ -33,6 +33,7 @@
>  #define GICD_SETSPI_SR			0x0050
>  #define GICD_CLRSPI_SR			0x0058
>  #define GICD_SEIR			0x0068
> +#define GICD_IGROUPR			0x0080
>  #define GICD_ISENABLER			0x0100
>  #define GICD_ICENABLER			0x0180
>  #define GICD_ISPENDR			0x0200
> @@ -41,14 +42,31 @@
>  #define GICD_ICACTIVER			0x0380
>  #define GICD_IPRIORITYR			0x0400
>  #define GICD_ICFGR			0x0C00
> +#define GICD_IGRPMODR			0x0D00
> +#define GICD_NSACR			0x0E00
>  #define GICD_IROUTER			0x6000
> +#define GICD_IDREGS			0xFFD0
>  #define GICD_PIDR2			0xFFE8
>  
> +/*
> + * Non-ARE distributor registers, needed to provide the RES0
> + * semantics for KVM's emulated GICv3
> + */

huh?  I think this comment has to do a better job of explaining this, or
just go away.

Why are we re-defining these registers?  Is it just a coincidence that
the offsets happen to be the same as for GICv2 so it would be
semantically incorrect to reuse the defines, or?

> +#define GICD_ITARGETSR			0x0800
> +#define GICD_SGIR			0x0F00
> +#define GICD_CPENDSGIR			0x0F10
> +#define GICD_SPENDSGIR			0x0F20
> +
> +
>  #define GICD_CTLR_RWP			(1U << 31)
> +#define GICD_CTLR_DS			(1U << 6)
>  #define GICD_CTLR_ARE_NS		(1U << 4)
>  #define GICD_CTLR_ENABLE_G1A		(1U << 1)
>  #define GICD_CTLR_ENABLE_G1		(1U << 0)
>  
> +#define GICD_TYPER_LPIS			(1U << 17)
> +#define GICD_TYPER_MBIS			(1U << 16)
> +
>  #define GICD_IROUTER_SPI_MODE_ONE	(0U << 31)
>  #define GICD_IROUTER_SPI_MODE_ANY	(1U << 31)
>  
> @@ -56,6 +74,8 @@
>  #define GIC_PIDR2_ARCH_GICv3		0x30
>  #define GIC_PIDR2_ARCH_GICv4		0x40
>  
> +#define GIC_V3_DIST_SIZE		0x10000
> +
>  /*
>   * Re-Distributor registers, offsets from RD_base
>   */
> @@ -74,6 +94,7 @@
>  #define GICR_SYNCR			0x00C0
>  #define GICR_MOVLPIR			0x0100
>  #define GICR_MOVALLR			0x0110
> +#define GICR_IDREGS			GICD_IDREGS
>  #define GICR_PIDR2			GICD_PIDR2
>  
>  #define GICR_WAKER_ProcessorSleep	(1U << 1)
> @@ -82,6 +103,7 @@
>  /*
>   * Re-Distributor registers, offsets from SGI_base
>   */
> +#define GICR_IGROUPR0			GICD_IGROUPR
>  #define GICR_ISENABLER0			GICD_ISENABLER
>  #define GICR_ICENABLER0			GICD_ICENABLER
>  #define GICR_ISPENDR0			GICD_ISPENDR
> @@ -90,10 +112,14 @@
>  #define GICR_ICACTIVER0			GICD_ICACTIVER
>  #define GICR_IPRIORITYR0		GICD_IPRIORITYR
>  #define GICR_ICFGR0			GICD_ICFGR
> +#define GICR_IGRPMODR0			GICD_IGRPMODR
> +#define GICR_NSACR			GICD_NSACR
>  
>  #define GICR_TYPER_VLPIS		(1U << 1)
>  #define GICR_TYPER_LAST			(1U << 4)
>  
> +#define GIC_V3_REDIST_SIZE		0x20000
> +
>  /*
>   * CPU interface registers
>   */
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 326ba7a..4a7798e 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -1085,6 +1085,7 @@ void kvm_unregister_device_ops(u32 type);
>  extern struct kvm_device_ops kvm_mpic_ops;
>  extern struct kvm_device_ops kvm_xics_ops;
>  extern struct kvm_device_ops kvm_arm_vgic_v2_ops;
> +extern struct kvm_device_ops kvm_arm_vgic_v3_ops;
>  
>  #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
>  
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 6076882..24cb129 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -960,6 +960,8 @@ enum kvm_device_type {
>  #define KVM_DEV_TYPE_ARM_VGIC_V2	KVM_DEV_TYPE_ARM_VGIC_V2
>  	KVM_DEV_TYPE_FLIC,
>  #define KVM_DEV_TYPE_FLIC		KVM_DEV_TYPE_FLIC
> +	KVM_DEV_TYPE_ARM_VGIC_V3,
> +#define KVM_DEV_TYPE_ARM_VGIC_V3	KVM_DEV_TYPE_ARM_VGIC_V3

You need to document this device type in
Documentation/virtual/kvm/devices/ (probably in arm-vgic.txt).

That goes for patch 19 as well, but I'll remind you when I look at that
patch more closely.

>  	KVM_DEV_TYPE_MAX,
>  };
>  
> diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
> new file mode 100644
> index 0000000..bcb5374
> --- /dev/null
> +++ b/virt/kvm/arm/vgic-v3-emul.c
> @@ -0,0 +1,891 @@
> +/*
> + * GICv3 distributor and redistributor emulation on GICv3 hardware
> + *
> + * able to run on a pure native host GICv3 (which forces ARE=1)
> + *
> + * forcing ARE=1 and DS=1, not covering LPIs yet (TYPER.LPIS=0)

I think the above two lines require rewriting, may I suggest:

GICv3 emulation is currently only supported on a GICv3 host, but
supports hardware both with and without the optional GICv2 backwards
compatibility features.

We emulate a GICv3 without the backwards compatibility features (meaning
the emulated GICD_CTLR.ARE resets to 1 and is RAO/WI) and with only a
single security state (the emulated GICD_CTLR.DS=1, RAO/WI).  This
emulated GICv3 does not yet include support for LPIs (TYPER.LPIS=0,
RAZ/WI).

But pay particular attention to the bit about us emulating a GICv3 with
only a single security state, because you're implementing GICD_IGROUPR
and GICR_IGROUPR as RAZ/WI, which is then a limitation of the emulated
GIC (just like we don't emulate priorities), which is fine, but let's
then state that as such.

> + *
> + * Copyright (C) 2014 ARM Ltd.
> + * Author: Andre Przywara <andre.przywara@arm.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <linux/cpu.h>
> +#include <linux/kvm.h>
> +#include <linux/kvm_host.h>
> +#include <linux/interrupt.h>
> +
> +#include <linux/irqchip/arm-gic-v3.h>
> +#include <kvm/arm_vgic.h>
> +
> +#include <asm/kvm_emulate.h>
> +#include <asm/kvm_arm.h>
> +#include <asm/kvm_mmu.h>
> +
> +#include "vgic.h"
> +
> +#define INTERRUPT_ID_BITS 10
> +
> +static bool handle_mmio_misc(struct kvm_vcpu *vcpu,
> +			     struct kvm_exit_mmio *mmio, phys_addr_t offset,
> +			     void *private)
> +{
> +	u32 reg = 0, val;
> +	u32 word_offset = offset & 3;
> +
> +	switch (offset & ~3) {
> +	case GICD_CTLR:
> +		/*
> +		 * Force ARE and DS to 1, the guest cannot change this.
> +		 * For the time being we only support Group1 interrupts.
> +		 */
> +		if (vcpu->kvm->arch.vgic.enabled)
> +			reg = GICD_CTLR_ENABLE_G1A;
> +		reg |= GICD_CTLR_ARE_NS | GICD_CTLR_DS;
> +
> +		vgic_reg_access(mmio, &reg, word_offset,
> +				ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
> +		if (mmio->is_write) {
> +			vcpu->kvm->arch.vgic.enabled = !!(reg & GICD_CTLR_ENABLE_G1A);

> +			vgic_update_state(vcpu->kvm);
> +			return true;
> +		}
> +		break;

so we don't implement read-as-written for this register; should we at
least print a warning or something if the guest tries to enable group 0
interrupts?

> +	case GICD_TYPER:
> +		/*
> +		 * as this implementation does not provide compatibility

       Upper-case  ^

> +		 * with GICv2 (ARE==1), we report zero CPUs in the lower 5 bits.

lower 5 bits?  You mean we report bits [7:5] as 000 right?

> +		 * Also TYPER.LPIS is 0 for now and TYPER.MBIS is not supported.

drop the 'for now', just say we report TYPER.LPIS=0 and TYPER.MBIS=0,
because we don't support LPIs or MBIs.

> +		 */
> +
> +		/* claim we support at most 1024 (-4) SPIs via this interface */

claim?  Does this not hold in reality?  It doesn't seem to be what the
code does.  I'm doubting the usefulness of this comment.

> +		val = min(vcpu->kvm->arch.vgic.nr_irqs, 1024);
> +		reg |= (val >> 5) - 1;
> +
> +		reg |= (INTERRUPT_ID_BITS - 1) << 19;

but then we have no explanation of the arbitrarily chosen
10 bits?

> +
> +		vgic_reg_access(mmio, &reg, word_offset,
> +				ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
> +		break;
> +	case GICD_IIDR:
> +		reg = (PRODUCT_ID_KVM << 24) | (IMPLEMENTER_ARM << 0);
> +		vgic_reg_access(mmio, &reg, word_offset,
> +			ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
> +		break;
> +	default:
> +		vgic_reg_access(mmio, NULL, word_offset,
> +				ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> +		break;
> +	}

I'm getting increasingly skeptical about the value of combining these
registers into a single misc function?
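
i.e. (sketch only, the handler names are made up) something like

	{
		.base		= GICD_CTLR,
		.len		= 4,
		.handle_mmio	= handle_mmio_ctlr,
	},
	{
		.base		= GICD_TYPER,
		.len		= 4,
		.handle_mmio	= handle_mmio_typer,
	},
	{
		.base		= GICD_IIDR,
		.len		= 4,
		.handle_mmio	= handle_mmio_iidr,
	},

in the ranges table, with one small handler per register.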

> +
> +	return false;
> +}
> +
> +static bool handle_mmio_set_enable_reg_dist(struct kvm_vcpu *vcpu,
> +					    struct kvm_exit_mmio *mmio,
> +					    phys_addr_t offset,
> +					    void *private)
> +{
> +	if (likely(offset >= VGIC_NR_PRIVATE_IRQS / 8))
> +		return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
> +					      vcpu->vcpu_id,
> +					      ACCESS_WRITE_SETBIT);
> +
> +	vgic_reg_access(mmio, NULL, offset & 3,
> +			ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);

Somewhat general question:

This made me wonder whether we check for unaligned accesses anywhere, or
could the guest get away with (offset & 3) == 2 and mmio->len == 4?  Then
the semantics for this would start being weird...

> +	return false;
> +}
> +
> +static bool handle_mmio_clear_enable_reg_dist(struct kvm_vcpu *vcpu,
> +					      struct kvm_exit_mmio *mmio,
> +					      phys_addr_t offset,
> +					      void *private)
> +{
> +	if (likely(offset >= VGIC_NR_PRIVATE_IRQS / 8))
> +		return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
> +					      vcpu->vcpu_id,
> +					      ACCESS_WRITE_CLEARBIT);
> +
> +	vgic_reg_access(mmio, NULL, offset & 3,
> +			ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> +	return false;
> +}
> +
> +static bool handle_mmio_set_pending_reg_dist(struct kvm_vcpu *vcpu,
> +					     struct kvm_exit_mmio *mmio,
> +					     phys_addr_t offset,
> +					     void *private)
> +{
> +	if (likely(offset >= VGIC_NR_PRIVATE_IRQS / 8))
> +		return vgic_handle_set_pending_reg(vcpu->kvm, mmio, offset,
> +						   vcpu->vcpu_id);
> +
> +	vgic_reg_access(mmio, NULL, offset & 3,
> +			ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> +	return false;
> +}
> +
> +static bool handle_mmio_clear_pending_reg_dist(struct kvm_vcpu *vcpu,
> +					       struct kvm_exit_mmio *mmio,
> +					       phys_addr_t offset,
> +					       void *private)
> +{
> +	if (likely(offset >= VGIC_NR_PRIVATE_IRQS / 8))
> +		return vgic_handle_clear_pending_reg(vcpu->kvm, mmio, offset,
> +						     vcpu->vcpu_id);
> +
> +	vgic_reg_access(mmio, NULL, offset & 3,
> +			ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> +	return false;
> +}
> +
> +static bool handle_mmio_priority_reg_dist(struct kvm_vcpu *vcpu,
> +					  struct kvm_exit_mmio *mmio,
> +					  phys_addr_t offset,
> +					  void *private)
> +{
> +	u32 *reg;
> +
> +	if (unlikely(offset < VGIC_NR_PRIVATE_IRQS)) {
> +		vgic_reg_access(mmio, NULL, offset & 3,

Just noticed: you don't need to mask off the upper bits in all these
places, do you?

I think it should be consistent with what we do in the v2 emulation.

The only place you may need to do that is in the handle_mmio_misc function.

> +				ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> +		return false;
> +	}
> +
> +	reg = vgic_bytemap_get_reg(&vcpu->kvm->arch.vgic.irq_priority,
> +				   vcpu->vcpu_id, offset);
> +	vgic_reg_access(mmio, reg, offset,
> +		ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
> +	return false;
> +}
> +
> +static bool handle_mmio_cfg_reg_dist(struct kvm_vcpu *vcpu,
> +				     struct kvm_exit_mmio *mmio,
> +				     phys_addr_t offset,
> +				     void *private)
> +{
> +	u32 *reg;
> +
> +	if (unlikely(offset < VGIC_NR_PRIVATE_IRQS / 4)) {
> +		vgic_reg_access(mmio, NULL, offset & 3,
> +				ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> +		return false;
> +	}
> +
> +	reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_cfg,
> +				  vcpu->vcpu_id, offset >> 1);
> +
> +	return vgic_handle_cfg_reg(reg, mmio, offset);
> +}
> +
> +static u32 compress_mpidr(unsigned long mpidr)

can you add a comment to this function, saying which format it returns
and in which context that is useful?
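
Something along the lines of

	/*
	 * Pack the four 8-bit affinity levels (Aff0..Aff3) of an MPIDR
	 * value into a single u32, so it can be stored in the
	 * irq_spi_mpidr[] array.
	 */

would do, assuming that is indeed all it is used for.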

> +{
> +	u32 ret;
> +
> +	ret = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	ret |= MPIDR_AFFINITY_LEVEL(mpidr, 1) << 8;
> +	ret |= MPIDR_AFFINITY_LEVEL(mpidr, 2) << 16;
> +	ret |= MPIDR_AFFINITY_LEVEL(mpidr, 3) << 24;
> +
> +	return ret;
> +}
> +
> +static unsigned long uncompress_mpidr(u32 value)
> +{
> +	unsigned long mpidr;
> +
> +	mpidr = ((value >> 0) & 0xFF) << MPIDR_LEVEL_SHIFT(0);
> +	mpidr |= ((value >> 8) & 0xFF) << MPIDR_LEVEL_SHIFT(1);
> +	mpidr |= ((value >> 16) & 0xFF) << MPIDR_LEVEL_SHIFT(2);
> +	mpidr |= (u64)((value >> 24) & 0xFF) << MPIDR_LEVEL_SHIFT(3);
> +
> +	return mpidr;
> +}
> +
> +/*
> + * Lookup the given MPIDR value to get the vcpu_id (if there is one)
> + * and store that in the irq_spi_cpu[] array.
> + * This limits the number of VCPUs to 255 for now, extending the data
> + * type (or storing kvm_vcpu poiners) should lift the limit.
> + * Store the original MPIDR value in an extra array.

why?  To maintain read-as-written?

> + * Unallocated MPIDRs are translated to a special value and catched

s/catched/caught/

> + * before any array accesses.
> + */
> +static bool handle_mmio_route_reg(struct kvm_vcpu *vcpu,
> +				  struct kvm_exit_mmio *mmio,
> +				  phys_addr_t offset, void *private)
> +{
> +	struct kvm *kvm = vcpu->kvm;
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	int irq;
> +	u32 reg;
> +	int vcpu_id;
> +	unsigned long *bmap, mpidr;
> +	u32 word_offset = offset & 3;
> +
> +	/*
> +	 * Private interrupts cannot be re-routed, so this register
> +	 * is RES0 for any IRQ < 32.
> +	 * Also the upper 32 bits of each 64 bit register are zero,
> +	 * as we don't support Aff3 and that's the only value up there.

drop the rest of the sentence after Aff3.

> +	 */
> +	if (unlikely(offset < VGIC_NR_PRIVATE_IRQS * 8) || (offset & 4) == 4) {

you don't need the '== 4' part.

> +		vgic_reg_access(mmio, NULL, word_offset,
> +				ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> +		return false;
> +	}
> +
> +	irq = (offset / 8) - VGIC_NR_PRIVATE_IRQS;

can we not call this irq? spi instead maybe?

> +
> +	/* get the stored MPIDR for this IRQ */
> +	mpidr = uncompress_mpidr(dist->irq_spi_mpidr[irq]);
> +	mpidr &= MPIDR_HWID_BITMASK;
> +	reg = mpidr;
> +
> +	vgic_reg_access(mmio, &reg, word_offset,
> +			ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
> +
> +	if (!mmio->is_write)
> +		return false;
> +
> +	/*
> +	 * Now clear the currently assigned vCPU from the map, making room
> +	 * for the new one to be written below
> +	 */
> +	vcpu = kvm_mpidr_to_vcpu(kvm, mpidr);
> +	if (likely(vcpu)) {
> +		vcpu_id = vcpu->vcpu_id;
> +		bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[vcpu_id]);
> +		clear_bit(irq, bmap);

this is the atomic version, right?  is it known to be faster on arm64
because it's written in assembly and that's why we're using it instead
of __clear_bit?

> +	}
> +
> +	dist->irq_spi_mpidr[irq] = compress_mpidr(reg);
> +	vcpu = kvm_mpidr_to_vcpu(kvm, reg & MPIDR_HWID_BITMASK);
> +
> +	/*
> +	 * The spec says that non-existent MPIDR values should not be
> +	 * forwarded to any existent (v)CPU, but should be able to become
> +	 * pending anyway. We simply keep the irq_spi_target[] array empty, so
> +	 * the interrupt will never be injected.
> +	 * irq_spi_cpu[irq] gets a magic value in this case.
> +	 */
> +	if (likely(vcpu)) {
> +		vcpu_id = vcpu->vcpu_id;
> +		dist->irq_spi_cpu[irq] = vcpu_id;
> +		bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[vcpu_id]);
> +		set_bit(irq, bmap);

__set_bit ?

> +	} else
> +		dist->irq_spi_cpu[irq] = VCPU_NOT_ALLOCATED;

according to the CodingStyle (and me) this wants braces.

> +
> +	vgic_update_state(kvm);
> +
> +	return true;
> +}
> +
> +static bool handle_mmio_idregs(struct kvm_vcpu *vcpu,
> +			       struct kvm_exit_mmio *mmio,
> +			       phys_addr_t offset, void *private)
> +{
> +	u32 reg = 0;
> +
> +	switch (offset + GICD_IDREGS) {
> +	case GICD_PIDR2:
> +		reg = 0x3b;
> +		break;
> +	}
> +
> +	vgic_reg_access(mmio, &reg, offset & 3,
> +			ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
> +
> +	return false;
> +}
> +
> +static const struct mmio_range vgic_dist_ranges[] = {

can we call this vgic_v3_dist_ranges ?

> +	{	/*
> +		 * handling CTLR, TYPER, IIDR and STATUSR
> +		 */

this one doesn't need wings (and you're not doing that below)

> +		.base           = GICD_CTLR,
> +		.len            = 20,

nit: why do we specify this len as decimal and the others in hex?

> +		.bits_per_irq   = 0,
> +		.handle_mmio    = handle_mmio_misc,
> +	},

are we not mentioning the status register here because it's optional?

> +	{
> +		/* when DS=1, this is RAZ/WI */
> +		.base		= GICD_SETSPI_SR,
> +		.len		= 0x04,
> +		.bits_per_irq	= 0,
> +		.handle_mmio	= handle_mmio_raz_wi,
> +	},
> +	{
> +		/* when DS=1, this is RAZ/WI */
> +		.base		= GICD_CLRSPI_SR,
> +		.len		= 0x04,
> +		.bits_per_irq	= 0,
> +		.handle_mmio	= handle_mmio_raz_wi,
> +	},

why are we only listing the _SR versions and not the _NSR versions?

> +	{
> +		.base		= GICD_IGROUPR,
> +		.len		= 0x80,
> +		.bits_per_irq	= 1,
> +		.handle_mmio	= handle_mmio_raz_wi,
> +	},

this one may warrant a TODO: Group 0 interrupts not yet supported.

> +	{
> +		.base		= GICD_ISENABLER,
> +		.len		= 0x80,
> +		.bits_per_irq	= 1,
> +		.handle_mmio	= handle_mmio_set_enable_reg_dist,
> +	},
> +	{
> +		.base		= GICD_ICENABLER,
> +		.len		= 0x80,
> +		.bits_per_irq	= 1,
> +		.handle_mmio	= handle_mmio_clear_enable_reg_dist,
> +	},
> +	{
> +		.base		= GICD_ISPENDR,
> +		.len		= 0x80,
> +		.bits_per_irq	= 1,
> +		.handle_mmio	= handle_mmio_set_pending_reg_dist,
> +	},
> +	{
> +		.base		= GICD_ICPENDR,
> +		.len		= 0x80,
> +		.bits_per_irq	= 1,
> +		.handle_mmio	= handle_mmio_clear_pending_reg_dist,
> +	},
> +	{
> +		.base		= GICD_ISACTIVER,
> +		.len		= 0x80,
> +		.bits_per_irq	= 1,
> +		.handle_mmio	= handle_mmio_raz_wi,
> +	},
> +	{
> +		.base		= GICD_ICACTIVER,
> +		.len		= 0x80,
> +		.bits_per_irq	= 1,
> +		.handle_mmio	= handle_mmio_raz_wi,
> +	},
> +	{
> +		.base		= GICD_IPRIORITYR,
> +		.len		= 0x400,
> +		.bits_per_irq	= 8,
> +		.handle_mmio	= handle_mmio_priority_reg_dist,
> +	},
> +	{
> +		/* TARGETSRn is RES0 when ARE=1 */
> +		.base		= GICD_ITARGETSR,
> +		.len		= 0x400,
> +		.bits_per_irq	= 8,
> +		.handle_mmio	= handle_mmio_raz_wi,
> +	},
> +	{
> +		.base		= GICD_ICFGR,
> +		.len		= 0x100,
> +		.bits_per_irq	= 2,
> +		.handle_mmio	= handle_mmio_cfg_reg_dist,
> +	},
> +	{
> +		/* this is RAZ/WI when DS=1 */
> +		.base		= GICD_IGRPMODR,
> +		.len		= 0x80,
> +		.bits_per_irq	= 1,
> +		.handle_mmio	= handle_mmio_raz_wi,
> +	},
> +	{
> +		/* with DS==1 this is RAZ/WI */

any reason why the two comments above are not identical?  (I know, I
have OCD).

> +		.base		= GICD_NSACR,
> +		.len		= 0x100,
> +		.bits_per_irq	= 2,
> +		.handle_mmio	= handle_mmio_raz_wi,
> +	},
> +	/* the next three blocks are RES0 if ARE=1 */

probably nicer to just have a comment for each register where this
applies.

> +	{
> +		.base		= GICD_SGIR,
> +		.len		= 4,
> +		.handle_mmio	= handle_mmio_raz_wi,
> +	},
> +	{
> +		.base		= GICD_CPENDSGIR,
> +		.len		= 0x10,
> +		.handle_mmio	= handle_mmio_raz_wi,
> +	},
> +	{
> +		.base           = GICD_SPENDSGIR,
> +		.len            = 0x10,
> +		.handle_mmio    = handle_mmio_raz_wi,
> +	},
> +	{
> +		.base		= GICD_IROUTER,
> +		.len		= 0x2000,

shouldn't this be 0x1ee0?

> +		.bits_per_irq	= 64,
> +		.handle_mmio	= handle_mmio_route_reg,
> +	},
> +	{
> +		.base           = GICD_IDREGS,
> +		.len            = 0x30,
> +		.bits_per_irq   = 0,
> +		.handle_mmio    = handle_mmio_idregs,
> +	},
> +	{},
> +};
> +
> +static bool handle_mmio_set_enable_reg_redist(struct kvm_vcpu *vcpu,
> +					      struct kvm_exit_mmio *mmio,
> +					      phys_addr_t offset,
> +					      void *private)
> +{
> +	struct kvm_vcpu *target_redist_vcpu = private;
> +
> +	return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
> +				      target_redist_vcpu->vcpu_id,
> +				      ACCESS_WRITE_SETBIT);
> +}
> +
> +static bool handle_mmio_clear_enable_reg_redist(struct kvm_vcpu *vcpu,
> +						struct kvm_exit_mmio *mmio,
> +						phys_addr_t offset,
> +						void *private)
> +{
> +	struct kvm_vcpu *target_redist_vcpu = private;
> +
> +	return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
> +				      target_redist_vcpu->vcpu_id,
> +				      ACCESS_WRITE_CLEARBIT);
> +}
> +
> +static bool handle_mmio_set_pending_reg_redist(struct kvm_vcpu *vcpu,
> +					       struct kvm_exit_mmio *mmio,
> +					       phys_addr_t offset,
> +					       void *private)
> +{
> +	struct kvm_vcpu *target_redist_vcpu = private;
> +
> +	return vgic_handle_set_pending_reg(vcpu->kvm, mmio, offset,
> +					   target_redist_vcpu->vcpu_id);
> +}
> +
> +static bool handle_mmio_clear_pending_reg_redist(struct kvm_vcpu *vcpu,
> +						 struct kvm_exit_mmio *mmio,
> +						 phys_addr_t offset,
> +						 void *private)
> +{
> +	struct kvm_vcpu *target_redist_vcpu = private;
> +
> +	return vgic_handle_clear_pending_reg(vcpu->kvm, mmio, offset,
> +					     target_redist_vcpu->vcpu_id);
> +}
> +
> +static bool handle_mmio_priority_reg_redist(struct kvm_vcpu *vcpu,
> +					    struct kvm_exit_mmio *mmio,
> +					    phys_addr_t offset,
> +					    void *private)
> +{
> +	struct kvm_vcpu *target_redist_vcpu = private;
> +	u32 *reg;
> +
> +	reg = vgic_bytemap_get_reg(&vcpu->kvm->arch.vgic.irq_priority,
> +				   target_redist_vcpu->vcpu_id, offset);
> +	vgic_reg_access(mmio, reg, offset,
> +			ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
> +	return false;
> +}
> +
> +static bool handle_mmio_cfg_reg_redist(struct kvm_vcpu *vcpu,
> +				       struct kvm_exit_mmio *mmio,
> +				       phys_addr_t offset,
> +				       void *private)
> +{
> +	u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_cfg,
> +				       *(int *)private, offset >> 1);
> +
> +	return vgic_handle_cfg_reg(reg, mmio, offset);
> +}
> +
> +static const struct mmio_range vgic_redist_sgi_ranges[] = {
> +	{
> +		.base		= GICR_IGROUPR0,
> +		.len		= 4,
> +		.bits_per_irq	= 1,
> +		.handle_mmio	= handle_mmio_raz_wi,

shouldn't these be RAO/WI instead?

> +	},
> +	{
> +		.base		= GICR_ISENABLER0,
> +		.len		= 4,
> +		.bits_per_irq	= 1,
> +		.handle_mmio	= handle_mmio_set_enable_reg_redist,
> +	},
> +	{
> +		.base		= GICR_ICENABLER0,
> +		.len		= 4,
> +		.bits_per_irq	= 1,
> +		.handle_mmio	= handle_mmio_clear_enable_reg_redist,
> +	},
> +	{
> +		.base		= GICR_ISPENDR0,
> +		.len		= 4,
> +		.bits_per_irq	= 1,
> +		.handle_mmio	= handle_mmio_set_pending_reg_redist,
> +	},
> +	{
> +		.base		= GICR_ICPENDR0,
> +		.len		= 4,
> +		.bits_per_irq	= 1,
> +		.handle_mmio	= handle_mmio_clear_pending_reg_redist,
> +	},
> +	{
> +		.base		= GICR_ISACTIVER0,
> +		.len		= 4,
> +		.bits_per_irq	= 1,
> +		.handle_mmio	= handle_mmio_raz_wi,
> +	},
> +	{
> +		.base		= GICR_ICACTIVER0,
> +		.len		= 4,
> +		.bits_per_irq	= 1,
> +		.handle_mmio	= handle_mmio_raz_wi,
> +	},
> +	{
> +		.base		= GICR_IPRIORITYR0,
> +		.len		= 32,
> +		.bits_per_irq	= 8,
> +		.handle_mmio	= handle_mmio_priority_reg_redist,
> +	},
> +	{
> +		.base		= GICR_ICFGR0,
> +		.len		= 8,
> +		.bits_per_irq	= 2,
> +		.handle_mmio	= handle_mmio_cfg_reg_redist,
> +	},
> +	{
> +		.base		= GICR_IGRPMODR0,
> +		.len		= 4,
> +		.bits_per_irq	= 1,
> +		.handle_mmio	= handle_mmio_raz_wi,
> +	},
> +	{
> +		.base		= GICR_NSACR,
> +		.len		= 4,
> +		.handle_mmio	= handle_mmio_raz_wi,
> +	},
> +	{},
> +};
> +
> +static bool handle_mmio_misc_redist(struct kvm_vcpu *vcpu,
> +				    struct kvm_exit_mmio *mmio,
> +				    phys_addr_t offset, void *private)
> +{
> +	u32 reg;
> +	u32 word_offset = offset & 3;
> +	u64 mpidr;
> +	struct kvm_vcpu *target_redist_vcpu = private;
> +	int target_vcpu_id = target_redist_vcpu->vcpu_id;
> +
> +	switch (offset & ~3) {
> +	case GICR_CTLR:
> +		/* since we don't support LPIs, this register is zero for now */
> +		vgic_reg_access(mmio, &reg, word_offset,
> +				ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> +		break;
> +	case GICR_TYPER + 4:
> +		mpidr = kvm_vcpu_get_mpidr(target_redist_vcpu);
> +		reg = compress_mpidr(mpidr);
> +
> +		vgic_reg_access(mmio, &reg, word_offset,
> +				ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
> +		break;
> +	case GICR_TYPER:
> +		reg = target_redist_vcpu->vcpu_id << 8;
> +		if (target_vcpu_id == atomic_read(&vcpu->kvm->online_vcpus) - 1)
> +			reg |= GICR_TYPER_LAST;
> +		vgic_reg_access(mmio, &reg, word_offset,
> +				ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
> +		break;
> +	case GICR_IIDR:
> +		reg = (PRODUCT_ID_KVM << 24) | (IMPLEMENTER_ARM << 0);
> +		vgic_reg_access(mmio, &reg, word_offset,
> +			ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
> +		break;

The fact that you could reuse handle_mmio_iidr directly here, and that
GICR_TYPER reads funny, indicates to me that we should once again
split this up into smaller functions.
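
For example, GICR_TYPER (both halves) could get its own handler along
these lines -- completely untested sketch, names just for illustration:

static bool handle_mmio_typer_redist(struct kvm_vcpu *vcpu,
                                     struct kvm_exit_mmio *mmio,
                                     phys_addr_t offset, void *private)
{
        struct kvm_vcpu *redist_vcpu = private;
        u32 reg;

        if (offset & 4) {
                /* upper word: the compressed MPIDR of this redistributor */
                reg = compress_mpidr(kvm_vcpu_get_mpidr(redist_vcpu));
        } else {
                reg = redist_vcpu->vcpu_id << 8;
                if (redist_vcpu->vcpu_id ==
                    atomic_read(&vcpu->kvm->online_vcpus) - 1)
                        reg |= GICR_TYPER_LAST;
        }

        vgic_reg_access(mmio, &reg, offset & 3,
                        ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
        return false;
}

and CTLR/IIDR/STATUSR would then be handled (or reused) separately.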

> +	default:
> +		vgic_reg_access(mmio, NULL, word_offset,
> +				ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> +		break;
> +	}
> +
> +	return false;
> +}
> +
> +static const struct mmio_range vgic_redist_ranges[] = {
> +	{	/*
> +		 * handling CTLR, IIDR, TYPER and STATUSR
> +		 */
> +		.base           = GICR_CTLR,
> +		.len            = 20,
> +		.bits_per_irq   = 0,
> +		.handle_mmio    = handle_mmio_misc_redist,
> +	},
> +	{
> +		.base           = GICR_WAKER,
> +		.len            = 4,
> +		.bits_per_irq   = 0,
> +		.handle_mmio    = handle_mmio_raz_wi,
> +	},
> +	{
> +		.base           = GICR_IDREGS,
> +		.len            = 0x30,
> +		.bits_per_irq   = 0,
> +		.handle_mmio    = handle_mmio_idregs,
> +	},
> +	{},
> +};
> +
> +/*
> + * this is the stub handling both dist and redist MMIO exits for v3
      This 

Is this really a stub?

I would suggest spelling out distributor and re-distributor and GICv3.
Full stop after GICv3.

> + * does some vcpu_id calculation on the redist MMIO to use a possibly
> + * different VCPU than the current one

"some vcpu_id calculation" is not very helpful, either explain the magic
sauce, or just say in which way a "different" VCPU is something we need
to pay special attention to.

If I read the code correctly, the comment should simply be:

The GICv3 spec allows any CPU to access any redistributor through the
memory-mapped redistributor registers.  We can therefore determine which
redistributor is being accessed by simply looking at the faulting IPA.

> + */
> +static bool vgic_v3_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
> +				struct kvm_exit_mmio *mmio)
> +{
> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +	unsigned long dbase = dist->vgic_dist_base;
> +	unsigned long rdbase = dist->vgic_redist_base;

I'm not crazy about these 'shortcuts', especially given that RD_base is
the base of a specific redistributor, but ok.

> +	int nrcpus = atomic_read(&vcpu->kvm->online_vcpus);
> +	int vcpu_id;
> +	struct kvm_vcpu *target_redist_vcpu;
> +
> +	if (is_in_range(mmio->phys_addr, mmio->len, dbase, GIC_V3_DIST_SIZE)) {
> +		return vgic_handle_mmio_range(vcpu, run, mmio,
> +					      vgic_dist_ranges, dbase, NULL);
> +	}
> +
> +	if (!is_in_range(mmio->phys_addr, mmio->len, rdbase,
> +	    GIC_V3_REDIST_SIZE * nrcpus))
> +		return false;

so this implies that all redistributors will always be in contiguous IPA
space, is this reasonable?

> +
> +	vcpu_id = (mmio->phys_addr - rdbase) / GIC_V3_REDIST_SIZE;
> +	rdbase += (vcpu_id * GIC_V3_REDIST_SIZE);
> +	target_redist_vcpu = kvm_get_vcpu(vcpu->kvm, vcpu_id);

redist_vcpu should be enough

> +
> +	if (mmio->phys_addr >= rdbase + 0x10000)
> +		return vgic_handle_mmio_range(vcpu, run, mmio,
> +					      vgic_redist_sgi_ranges,
> +					      rdbase + 0x10000,
> +					      target_redist_vcpu);

The 0x10000 magic number is used twice; give it a name,
GICV3_REDIST_SGI_PAGE_OFFSET or something shorter.

perhaps it is nicer to just adjust rdbase and set a range variable above
and only have a single call to vgic_handle_mmio_range().
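
Roughly like this (untested; GICV3_REDIST_SGI_PAGE_OFFSET and the extra
'ranges' local are made up here):

#define GICV3_REDIST_SGI_PAGE_OFFSET    SZ_64K

        const struct mmio_range *ranges;

        vcpu_id = (mmio->phys_addr - rdbase) / GIC_V3_REDIST_SIZE;
        rdbase += vcpu_id * GIC_V3_REDIST_SIZE;
        redist_vcpu = kvm_get_vcpu(vcpu->kvm, vcpu_id);

        /* the second 64K page of each redistributor holds the SGI/PPI regs */
        if (mmio->phys_addr >= rdbase + GICV3_REDIST_SGI_PAGE_OFFSET) {
                rdbase += GICV3_REDIST_SGI_PAGE_OFFSET;
                ranges = vgic_redist_sgi_ranges;
        } else {
                ranges = vgic_redist_ranges;
        }

        return vgic_handle_mmio_range(vcpu, run, mmio, ranges, rdbase,
                                      redist_vcpu);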

> +
> +	return vgic_handle_mmio_range(vcpu, run, mmio, vgic_redist_ranges,
> +				      rdbase, target_redist_vcpu);
> +}
> +
> +static bool vgic_v3_queue_sgi(struct kvm_vcpu *vcpu, int irq)
> +{
> +	if (vgic_queue_irq(vcpu, 0, irq)) {
> +		vgic_dist_irq_clear_pending(vcpu, irq);
> +		vgic_cpu_irq_clear(vcpu, irq);
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
> +static int vgic_v3_init_maps(struct vgic_dist *dist)
> +{
> +	int nr_spis = dist->nr_irqs - VGIC_NR_PRIVATE_IRQS;
> +
> +	dist->irq_spi_mpidr = kcalloc(nr_spis, sizeof(dist->irq_spi_mpidr[0]),
> +				      GFP_KERNEL);
> +
> +	if (!dist->irq_spi_mpidr)
> +		return -ENOMEM;
> +
> +	return 0;
> +}
> +
> +static int vgic_v3_init(struct kvm *kvm, const struct vgic_params *params)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	int ret, i;
> +	u32 mpidr;
> +
> +	if (IS_VGIC_ADDR_UNDEF(dist->vgic_dist_base) ||
> +	    IS_VGIC_ADDR_UNDEF(dist->vgic_redist_base)) {
> +		kvm_err("Need to set vgic distributor addresses first\n");
> +		return -ENXIO;
> +	}
> +
> +	/*
> +	 * FIXME: this should be moved to init_maps time, and may bite
> +	 * us when adding save/restore. Add a per-emulation hook?
> +	 */

What is the plan for this?  Can we move it into init_maps or does that
require some more work?

Why can't we do what the gicv2 emulation does?

Not sure what the "Add a per-emulation hook?" question is asking...

> +	ret = vgic_v3_init_maps(dist);
> +	if (ret) {
> +		kvm_err("Unable to allocate maps\n");
> +		return ret;
> +	}
> +
> +	mpidr = compress_mpidr(kvm_vcpu_get_mpidr(kvm_get_vcpu(kvm, 0)));
> +	for (i = VGIC_NR_PRIVATE_IRQS; i < dist->nr_irqs; i++) {
> +		dist->irq_spi_cpu[i - VGIC_NR_PRIVATE_IRQS] = 0;
> +		dist->irq_spi_mpidr[i - VGIC_NR_PRIVATE_IRQS] = mpidr;
> +		vgic_bitmap_set_irq_val(dist->irq_spi_target, 0, i, 1);

why do we need 3 different copies of the same value now?  ok, we had two
before because of the bitmap "optimization" thingy, but now we have two
other sets of state for the same thing...

> +	}
> +
> +	return 0;
> +}
> +
> +static void vgic_v3_add_sgi_source(struct kvm_vcpu *vcpu, int irq, int source)
> +{

can you put a one line comment here:

/* The GICv3 spec does away with keeping track of SGI sources */

> +}
> +
> +bool vgic_v3_init_emulation_ops(struct kvm *kvm, int type)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +
> +	switch (type) {
> +	case KVM_DEV_TYPE_ARM_VGIC_V3:
> +		dist->vm_ops.handle_mmio = vgic_v3_handle_mmio;
> +		dist->vm_ops.queue_sgi = vgic_v3_queue_sgi;
> +		dist->vm_ops.add_sgi_source = vgic_v3_add_sgi_source;
> +		dist->vm_ops.vgic_init = vgic_v3_init;
> +		break;
> +	default:
> +		return false;
> +	}
> +	return true;
> +}
> +
> +/*
> + * triggered by a system register access trap, called from the sysregs

      Triggered

> + * handling code there.

                    ^^^ there, where, here, and everywhere ?

> + * The register contains the upper three affinity levels of the target

          ^^^ which register?  @reg ?

> + * processors as well as a bitmask of 16 Aff0 CPUs.

Does @reg follow the format from something in the spec?  That would be
useful to know...

> + * Iterate over all VCPUs to check for matching ones or signal on
> + * all-but-self if the mode bit is set.

an all-but-self IPI?  Is that the architectural term?  Otherwise I would
suggest something like:  If no VCPUs are found which match reg (in some
way), then send the IPI to all VCPUs in the VM, except the one
performing the system register access.

> + */

Also, please use the kdocs format here like the rest of the kvm/arm code.
Begin sentences with upper-case, etc.:

/**
* vgic_v3_dispatch_sgi - This function does something with SGIs
* @vcpu: The vcpu pointer
* @reg: Magic
*
* Some nicer version of what you have above.
*/
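
e.g. something like this (just a suggestion for the wording, adjust as
you see fit):

/**
 * vgic_v3_dispatch_sgi - mark an SGI pending on the targeted VCPUs
 * @vcpu: The VCPU that performed the ICC_SGI1R_EL1 write
 * @reg: The value the guest wrote to ICC_SGI1R_EL1
 *
 * @reg carries the upper three affinity levels of the targets, an Aff0
 * target list and the routing mode bit.  Walk all VCPUs, flag the SGI
 * pending on every matching one (or on all but the requesting VCPU if
 * the routing mode bit is set) and kick the VCPUs if anything changed.
 */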

> +void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg)
> +{

It's a bit hard to review this when I cannot see how it is called, I'm
assuming that this is on writes to ICC_SGI1R_EL1 and reg is what the
guest tried to write to that register.

I have a feeling that you may want to add this function in a separate patch.

> +	struct kvm *kvm = vcpu->kvm;
> +	struct kvm_vcpu *c_vcpu;
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	u16 target_cpus;
> +	u64 mpidr, mpidr_h, mpidr_l;
> +	int sgi, mode, c, vcpu_id;
> +	int updated = 0;
> +
> +	vcpu_id = vcpu->vcpu_id;
> +
> +	sgi = (reg >> 24) & 0xf;
> +	mode = (reg >> 40) & 0x1;

perhaps we can call this 'targeted' or something to make it a bit more
clear.

> +	target_cpus = reg & 0xffff;

Can you add some defines for these magic shifts?  Are there not some
already for the GICv3 host driver we can reuse?
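
Something along these lines would already help readability (the names
here are made up, reuse whatever the host driver provides if it has
them):

#define ICC_SGI1R_TARGET_LIST_MASK      0xffff
#define ICC_SGI1R_AFFINITY_1_SHIFT      16
#define ICC_SGI1R_SGI_ID_SHIFT          24
#define ICC_SGI1R_AFFINITY_2_SHIFT      32
#define ICC_SGI1R_IRQ_ROUTING_MODE_BIT  BIT_ULL(40)
#define ICC_SGI1R_AFFINITY_3_SHIFT      48

        sgi = (reg >> ICC_SGI1R_SGI_ID_SHIFT) & 0xf;
        mode = !!(reg & ICC_SGI1R_IRQ_ROUTING_MODE_BIT);
        target_cpus = reg & ICC_SGI1R_TARGET_LIST_MASK;
        mpidr = ((reg >> ICC_SGI1R_AFFINITY_3_SHIFT) & 0xff) << MPIDR_LEVEL_SHIFT(3);
        mpidr |= ((reg >> ICC_SGI1R_AFFINITY_2_SHIFT) & 0xff) << MPIDR_LEVEL_SHIFT(2);
        mpidr |= ((reg >> ICC_SGI1R_AFFINITY_1_SHIFT) & 0xff) << MPIDR_LEVEL_SHIFT(1);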

> +	mpidr = ((reg >> 48) & 0xff) << MPIDR_LEVEL_SHIFT(3);
> +	mpidr |= ((reg >> 32) & 0xff) << MPIDR_LEVEL_SHIFT(2);
> +	mpidr |= ((reg >> 16) & 0xff) << MPIDR_LEVEL_SHIFT(1);
> +	mpidr &= ~MPIDR_LEVEL_MASK;

(**) note the comment a few lines down.

> +
> +	/*
> +	 * We take the dist lock here, because we come from the sysregs
> +	 * code path and not from MMIO (where this is already done)

					which already takes the lock).

> +	 */
> +	spin_lock(&dist->lock);
> +	kvm_for_each_vcpu(c, c_vcpu, kvm) {

I think it would be helpful to document this loop, something like:

We loop through every possible vCPU and check if we need to send an SGI
to that vCPU.  If targeting specific vCPUs, we check if the candidate
vCPU is in the target list and if it is, we send an SGI and clear the
bit in the target list.  When the target list is empty and we are
targeting specific vCPUs, we are done.

Maybe too verbose, you can tweak it as you like.

> +		if (!mode && target_cpus == 0)
> +			break;
> +		if (mode && c == vcpu_id)       /* not to myself */
> +			continue;
> +		if (!mode) {
> +			mpidr_h = kvm_vcpu_get_mpidr(c_vcpu);
> +			mpidr_l = MPIDR_AFFINITY_LEVEL(mpidr_h, 0);
> +			mpidr_h &= ~MPIDR_LEVEL_MASK;

this is *really* confusing. _h and _l are high and low?

Can you factor this out into a static inline and get rid of that mpidr
mask above (**) ?
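
Something like this perhaps (untested, just mirroring what the loop
below does):

static inline bool sgi_target_match(struct kvm_vcpu *vcpu, unsigned long mpidr,
                                    u16 *target_cpus)
{
        unsigned long affinity = kvm_vcpu_get_mpidr(vcpu);
        int level0 = MPIDR_AFFINITY_LEVEL(affinity, 0);

        /* compare Aff3.Aff2.Aff1 only, Aff0 is covered by the target list */
        if ((affinity & ~MPIDR_LEVEL_MASK) != mpidr)
                return false;

        if (!(*target_cpus & BIT(level0)))
                return false;

        *target_cpus &= ~BIT(level0);
        return true;
}

so the loop body boils down to

        if (!mode && !sgi_target_match(c_vcpu, mpidr, &target_cpus))
                continue;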

> +			if (mpidr != mpidr_h)
> +				continue;
> +			if (!(target_cpus & BIT(mpidr_l)))
> +				continue;
> +			target_cpus &= ~BIT(mpidr_l);
> +		}
> +		/* Flag the SGI as pending */
> +		vgic_dist_irq_set_pending(c_vcpu, sgi);
> +		updated = 1;
> +		kvm_debug("SGI%d from CPU%d to CPU%d\n", sgi, vcpu_id, c);
> +	}
> +	if (updated)
> +		vgic_update_state(vcpu->kvm);
> +	spin_unlock(&dist->lock);
> +	if (updated)
> +		vgic_kick_vcpus(vcpu->kvm);
> +}
> +
> +
> +static int vgic_v3_get_attr(struct kvm_device *dev,
> +			    struct kvm_device_attr *attr)
> +{
> +	int ret;
> +
> +	ret = vgic_get_common_attr(dev, attr);

So this means we can get the KVM_VGIC_V2_ADDR_TYPE_DIST and
KVM_VGIC_V2_ADDR_TYPE_CPU from an emulated GICv3 without the GICv2
backwards compatibility features?

> +	if (ret != -ENXIO)
> +		return ret;
> +
> +	switch (attr->group) {
> +	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
> +	case KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
> +		return -ENXIO;
> +	}
> +
> +	return -ENXIO;
> +}
> +
> +static int vgic_v3_set_attr(struct kvm_device *dev,
> +			    struct kvm_device_attr *attr)
> +{
> +	int ret;
> +
> +	ret = vgic_set_common_attr(dev, attr);

same as above?

> +	if (ret != -ENXIO)
> +		return ret;
> +
> +	switch (attr->group) {
> +	case KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
> +	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
> +		return -ENXIO;
> +	}
> +
> +	return -ENXIO;
> +}
> +
> +static int vgic_v3_has_attr(struct kvm_device *dev,
> +			    struct kvm_device_attr *attr)
> +{
> +	switch (attr->group) {
> +	case KVM_DEV_ARM_VGIC_GRP_ADDR:
> +		switch (attr->attr) {
> +		case KVM_VGIC_V2_ADDR_TYPE_DIST:
> +		case KVM_VGIC_V2_ADDR_TYPE_CPU:
> +			return -ENXIO;
> +		}
> +		break;
> +	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
> +	case KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
> +		return -ENXIO;
> +	case KVM_DEV_ARM_VGIC_GRP_NR_IRQS:
> +		return 0;
> +	}
> +	return -ENXIO;
> +}
> +
> +struct kvm_device_ops kvm_arm_vgic_v3_ops = {
> +	.name = "kvm-arm-vgic-v3",
> +	.create = vgic_create,
> +	.destroy = vgic_destroy,
> +	.set_attr = vgic_v3_set_attr,
> +	.get_attr = vgic_v3_get_attr,
> +	.has_attr = vgic_v3_has_attr,

nit: you could reorder set/get so they're set in the same order they
appear in the code.

> +};
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index a54389b..2867269d 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -1228,7 +1228,7 @@ static bool vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>  	struct kvm_vcpu *vcpu;
>  	int edge_triggered, level_triggered;
>  	int enabled;
> -	bool ret = true;
> +	bool ret = true, can_inject = true;
>  
>  	spin_lock(&dist->lock);
>  
> @@ -1243,6 +1243,11 @@ static bool vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>  
>  	if (irq_num >= VGIC_NR_PRIVATE_IRQS) {
>  		cpuid = dist->irq_spi_cpu[irq_num - VGIC_NR_PRIVATE_IRQS];
> +		if (cpuid == VCPU_NOT_ALLOCATED) {
> +			/* Pretend we use CPU0, and prevent injection */
> +			cpuid = 0;
> +			can_inject = false;
> +		}
>  		vcpu = kvm_get_vcpu(kvm, cpuid);
>  	}
>  
> @@ -1264,7 +1269,7 @@ static bool vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>  
>  	enabled = vgic_irq_is_enabled(vcpu, irq_num);
>  
> -	if (!enabled) {
> +	if (!enabled || !can_inject) {

don't you also need to handle the vgic_dist_irq_set_pending() call and
its friends above?

>  		ret = false;
>  		goto out;
>  	}
> @@ -1406,6 +1411,7 @@ void kvm_vgic_destroy(struct kvm *kvm)
>  	}
>  	kfree(dist->irq_sgi_sources);
>  	kfree(dist->irq_spi_cpu);
> +	kfree(dist->irq_spi_mpidr);
>  	kfree(dist->irq_spi_target);
>  	kfree(dist->irq_pending_on_cpu);
>  	dist->irq_sgi_sources = NULL;
> @@ -1581,6 +1587,7 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
>  	kvm->arch.vgic.vctrl_base = vgic->vctrl_base;
>  	kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
>  	kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
> +	kvm->arch.vgic.vgic_redist_base = VGIC_ADDR_UNDEF;

sure, we can write to the same memory twice, why not, it's fun.

>  
>  	if (!init_emulation_ops(kvm, type))
>  		ret = -ENODEV;
> diff --git a/virt/kvm/arm/vgic.h b/virt/kvm/arm/vgic.h
> index f52db4e..42c20c1 100644
> --- a/virt/kvm/arm/vgic.h
> +++ b/virt/kvm/arm/vgic.h
> @@ -35,6 +35,8 @@
>  #define ACCESS_WRITE_VALUE	(3 << 1)
>  #define ACCESS_WRITE_MASK(x)	((x) & (3 << 1))
>  
> +#define VCPU_NOT_ALLOCATED	((u8)-1)
> +
>  unsigned long *vgic_bitmap_get_shared_map(struct vgic_bitmap *x);
>  
>  void vgic_update_state(struct kvm *kvm);
> @@ -121,5 +123,6 @@ int vgic_set_common_attr(struct kvm_device *dev, struct kvm_device_attr *attr);
>  int vgic_get_common_attr(struct kvm_device *dev, struct kvm_device_attr *attr);
>  
>  bool vgic_v2_init_emulation_ops(struct kvm *kvm, int type);
> +bool vgic_v3_init_emulation_ops(struct kvm *kvm, int type);
>  
>  #endif
> -- 
> 1.7.9.5
> 


* [PATCH v3 17/19] arm64: KVM: add SGI system register trapping
  2014-10-31 17:26 ` [PATCH v3 17/19] arm64: KVM: add SGI system register trapping Andre Przywara
@ 2014-11-07 15:07   ` Christoffer Dall
  2014-11-10 11:31     ` Andre Przywara
  0 siblings, 1 reply; 76+ messages in thread
From: Christoffer Dall @ 2014-11-07 15:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:52PM +0000, Andre Przywara wrote:
> While the injection of a (virtual) inter-processor interrupt (SGI)
> on a GICv2 works by writing to a MMIO register, GICv3 uses system
> registers to trigger them.
> Trap the appropriate registers on ARM64 hosts and call the SGI

Are you actually enabling the trapping here or just putting the trap
handler in place?  As I understood so far, we still configure the guest
at this point to raise an unexpected exception in the guest if it tries
to access the system registers; did I get this wrong?

> handler function in the vGICv3 emulation code.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  arch/arm64/kvm/sys_regs.c |   26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index dcc5867..cf0452e 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -165,6 +165,27 @@ static bool access_sctlr(struct kvm_vcpu *vcpu,
>  	return true;
>  }
>  
> +/*
> + * Trapping on the GICv3 SGI system register.

Use the architecture name for the register here.

> + * Forward the request to the VGIC emulation.
> + * The cp15_64 code makes sure this automatically works
> + * for both AArch64 and AArch32 accesses.
> + */
> +static bool access_gic_sgi(struct kvm_vcpu *vcpu,
> +			   const struct sys_reg_params *p,
> +			   const struct sys_reg_desc *r)
> +{
> +	u64 val;
> +
> +	if (!p->is_write)
> +		return read_from_write_only(vcpu, p);
> +
> +	val = *vcpu_reg(vcpu, p->Rt);
> +	vgic_v3_dispatch_sgi(vcpu, val);

So do we guarantee somehow that we'll never get here if userspace didn't
successfully create a virtual GICv3?

> +
> +	return true;
> +}
> +
>  static bool trap_raz_wi(struct kvm_vcpu *vcpu,
>  			const struct sys_reg_params *p,
>  			const struct sys_reg_desc *r)
> @@ -431,6 +452,9 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>  	/* VBAR_EL1 */
>  	{ Op0(0b11), Op1(0b000), CRn(0b1100), CRm(0b0000), Op2(0b000),
>  	  NULL, reset_val, VBAR_EL1, 0 },
> +	/* ICC_SGI1R_EL1 */
> +	{ Op0(0b11), Op1(0b000), CRn(0b1100), CRm(0b1011), Op2(0b101),
> +	  access_gic_sgi },
>  	/* CONTEXTIDR_EL1 */
>  	{ Op0(0b11), Op1(0b000), CRn(0b1101), CRm(0b0000), Op2(0b001),
>  	  access_vm_reg, reset_val, CONTEXTIDR_EL1, 0 },
> @@ -659,6 +683,8 @@ static const struct sys_reg_desc cp14_64_regs[] = {
>   * register).
>   */
>  static const struct sys_reg_desc cp15_regs[] = {
> +	{ Op1( 0), CRn( 0), CRm(12), Op2( 0), access_gic_sgi },
> +
>  	{ Op1( 0), CRn( 1), CRm( 0), Op2( 0), access_sctlr, NULL, c1_SCTLR },
>  	{ Op1( 0), CRn( 2), CRm( 0), Op2( 0), access_vm_reg, NULL, c2_TTBR0 },
>  	{ Op1( 0), CRn( 2), CRm( 0), Op2( 1), access_vm_reg, NULL, c2_TTBR1 },
> -- 
> 1.7.9.5
> 


* [PATCH v3 18/19] arm/arm64: KVM: enable kernel side of GICv3 emulation
  2014-10-31 17:26 ` [PATCH v3 18/19] arm/arm64: KVM: enable kernel side of GICv3 emulation Andre Przywara
@ 2014-11-07 16:07   ` Christoffer Dall
  2014-11-10 12:19     ` Andre Przywara
  0 siblings, 1 reply; 76+ messages in thread
From: Christoffer Dall @ 2014-11-07 16:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:53PM +0000, Andre Przywara wrote:
> With all the necessary GICv3 emulation code in place, we can now
> connect the code to the GICv3 backend in the kernel.
> The LR register handling is different depending on the emulated GIC
> model, so provide different implementations for each.
> Also allow non-v2-compatible GICv3 implementations (which don't
> provide MMIO regions for the virtual CPU interface in the DT), but
> restrict those hosts to use GICv3 guests only.

s/use/support/

> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  virt/kvm/arm/vgic-v3.c |  168 ++++++++++++++++++++++++++++++++++++------------
>  virt/kvm/arm/vgic.c    |    4 ++
>  2 files changed, 130 insertions(+), 42 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
> index ce50918..c0e901c 100644
> --- a/virt/kvm/arm/vgic-v3.c
> +++ b/virt/kvm/arm/vgic-v3.c
> @@ -34,6 +34,7 @@
>  #define GICH_LR_VIRTUALID		(0x3ffUL << 0)
>  #define GICH_LR_PHYSID_CPUID_SHIFT	(10)
>  #define GICH_LR_PHYSID_CPUID		(7UL << GICH_LR_PHYSID_CPUID_SHIFT)
> +#define ICH_LR_VIRTUALID_MASK		(BIT_ULL(32) - 1)
>  
>  /*
>   * LRs are stored in reverse order in memory. make sure we index them
> @@ -43,7 +44,35 @@
>  
>  static u32 ich_vtr_el2;
>  
> -static struct vgic_lr vgic_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
> +static u64 sync_lr_val(u8 state)

is this lr_state_to_val ?

> +{
> +	u64 lr_val = 0;
> +
> +	if (state & LR_STATE_PENDING)
> +		lr_val |= ICH_LR_PENDING_BIT;
> +	if (state & LR_STATE_ACTIVE)
> +		lr_val |= ICH_LR_ACTIVE_BIT;
> +	if (state & LR_EOI_INT)
> +		lr_val |= ICH_LR_EOI;
> +
> +	return lr_val;
> +}
> +
> +static u8 sync_lr_state(u64 lr_val)

and lr_val_to_state ?

at least these sync names don't make much sense to me...

> +{
> +	u8 state = 0;
> +
> +	if (lr_val & ICH_LR_PENDING_BIT)
> +		state |= LR_STATE_PENDING;
> +	if (lr_val & ICH_LR_ACTIVE_BIT)
> +		state |= LR_STATE_ACTIVE;
> +	if (lr_val & ICH_LR_EOI)
> +		state |= LR_EOI_INT;
> +
> +	return state;
> +}
> +
> +static struct vgic_lr vgic_v2_on_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
>  {
>  	struct vgic_lr lr_desc;
>  	u64 val = vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)];
> @@ -53,30 +82,53 @@ static struct vgic_lr vgic_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
>  		lr_desc.source	= (val >> GICH_LR_PHYSID_CPUID_SHIFT) & 0x7;
>  	else
>  		lr_desc.source = 0;
> -	lr_desc.state	= 0;
> +	lr_desc.state	= sync_lr_state(val);
>  
> -	if (val & ICH_LR_PENDING_BIT)
> -		lr_desc.state |= LR_STATE_PENDING;
> -	if (val & ICH_LR_ACTIVE_BIT)
> -		lr_desc.state |= LR_STATE_ACTIVE;
> -	if (val & ICH_LR_EOI)
> -		lr_desc.state |= LR_EOI_INT;
> +	return lr_desc;
> +}
> +
> +static struct vgic_lr vgic_v3_on_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
> +{
> +	struct vgic_lr lr_desc;
> +	u64 val = vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)];
> +
> +	lr_desc.irq	= val & ICH_LR_VIRTUALID_MASK;
> +	lr_desc.source	= 0;
> +	lr_desc.state	= sync_lr_state(val);
>  
>  	return lr_desc;
>  }
>  
> -static void vgic_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
> -			   struct vgic_lr lr_desc)
> +static void vgic_v3_on_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
> +				 struct vgic_lr lr_desc)
>  {
> -	u64 lr_val = (((u32)lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT) |
> -		      lr_desc.irq);
> +	u64 lr_val;
>  
> -	if (lr_desc.state & LR_STATE_PENDING)
> -		lr_val |= ICH_LR_PENDING_BIT;
> -	if (lr_desc.state & LR_STATE_ACTIVE)
> -		lr_val |= ICH_LR_ACTIVE_BIT;
> -	if (lr_desc.state & LR_EOI_INT)
> -		lr_val |= ICH_LR_EOI;
> +	lr_val = lr_desc.irq;
> +
> +	/*
> +	 * currently all guest IRQs are Group1, as Group0 would result

Can you guess my comment here?

> +	 * in a FIQ in the guest, which it wouldn't expect.
> +	 * Eventually we want to make this configurable, so we may revisit
> +	 * this in the future.
> +	 */
> +	lr_val |= ICH_LR_GROUP;
> +
> +	lr_val |= sync_lr_val(lr_desc.state);
> +
> +	vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)] = lr_val;
> +}
> +
> +static void vgic_v2_on_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
> +				 struct vgic_lr lr_desc)
> +{
> +	u64 lr_val;
> +
> +	lr_val = lr_desc.irq;
> +
> +	lr_val |= (u32)lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT;
> +
> +	lr_val |= sync_lr_val(lr_desc.state);
>  
>  	vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)] = lr_val;
>  }
> @@ -145,9 +197,8 @@ static void vgic_v3_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcrp)
>  
>  static void vgic_v3_enable(struct kvm_vcpu *vcpu)
>  {
> -	struct vgic_v3_cpu_if *vgic_v3;
> +	struct vgic_v3_cpu_if *vgic_v3 = &vcpu->arch.vgic_cpu.vgic_v3;
>  
> -	vgic_v3 = &vcpu->arch.vgic_cpu.vgic_v3;

unnecessary change?

>  	/*
>  	 * By forcing VMCR to zero, the GIC will restore the binary
>  	 * points to their reset values. Anything else resets to zero
> @@ -155,7 +206,14 @@ static void vgic_v3_enable(struct kvm_vcpu *vcpu)
>  	 */
>  	vgic_v3->vgic_vmcr = 0;
>  
> -	vgic_v3->vgic_sre = 0;
> +	/*
> +	 * Set the SRE_EL1 value depending on the configured
> +	 * emulated vGIC model.
> +	 */
> +	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)
> +		vgic_v3->vgic_sre = ICC_SRE_EL1_SRE;

If we're on hardware with the GICv2 backwards compatibility can the
guest not actually set ICC_SRE_EL1_SRE to 0 but then we are not
preserving this because we wouldn't be trapping such accesses anymore?

Also, this is really about the reset value of that field, which the
comment above is not being specific about.

Further, that would mean that the field is NOT actually RAO/WI, and thus
the spec dictates that the field should reset to zero if EL1 is the
highest implemented exception level and the field is not RAO/WI, which
is what the guest expects, no?

> +	else
> +		vgic_v3->vgic_sre = 0;
>  
>  	/* Get the show on the road... */
>  	vgic_v3->vgic_hcr = ICH_HCR_EN;
> @@ -173,6 +231,15 @@ static const struct vgic_ops vgic_v3_ops = {
>  	.enable			= vgic_v3_enable,
>  };
>  
> +static void init_vgic_v3_emul(struct kvm *kvm)
> +{
> +	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
> +
> +	vm_ops->get_lr = vgic_v3_on_v3_get_lr;
> +	vm_ops->set_lr = vgic_v3_on_v3_set_lr;
> +	kvm->arch.max_vcpus = KVM_MAX_VCPUS;
> +}

why do you need this indirection?  Just move vgic_v3_init_emul() up here
and call that?

> +
>  static bool vgic_v3_init_emul_compat(struct kvm *kvm, int type)
>  {
>  	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
> @@ -186,14 +253,28 @@ static bool vgic_v3_init_emul_compat(struct kvm *kvm, int type)
>  			return false;
>  		}
>  
> -		vm_ops->get_lr = vgic_v3_get_lr;
> -		vm_ops->set_lr = vgic_v3_set_lr;
> +		vm_ops->get_lr = vgic_v2_on_v3_get_lr;
> +		vm_ops->set_lr = vgic_v2_on_v3_set_lr;
>  		kvm->arch.max_vcpus = 8;
>  		return true;
> +	case KVM_DEV_TYPE_ARM_VGIC_V3:
> +		init_vgic_v3_emul(kvm);
> +		return true;
>  	}
>  	return false;
>  }
>  
> +static bool vgic_v3_init_emul(struct kvm *kvm, int type)
> +{
> +	switch (type) {
> +	case KVM_DEV_TYPE_ARM_VGIC_V3:
> +		init_vgic_v3_emul(kvm);
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
>  static struct vgic_params vgic_v3_params;
>  
>  /**
> @@ -235,29 +316,32 @@ int vgic_v3_probe(struct device_node *vgic_node,
>  
>  	gicv_idx += 3; /* Also skip GICD, GICC, GICH */
>  	if (of_address_to_resource(vgic_node, gicv_idx, &vcpu_res)) {
> -		kvm_err("Cannot obtain GICV region\n");
> -		ret = -ENXIO;
> -		goto out;
> -	}
> +		kvm_info("GICv3: GICv2 emulation not available\n");
> +		vgic->vcpu_base = 0;
> +		vgic->init_emul = vgic_v3_init_emul;
> +	} else {
> +		if (!PAGE_ALIGNED(vcpu_res.start)) {
> +			kvm_err("GICV physical address 0x%llx not page aligned\n",
> +				(unsigned long long)vcpu_res.start);
> +			ret = -ENXIO;
> +			goto out;
> +		}
>  
> -	if (!PAGE_ALIGNED(vcpu_res.start)) {
> -		kvm_err("GICV physical address 0x%llx not page aligned\n",
> -			(unsigned long long)vcpu_res.start);
> -		ret = -ENXIO;
> -		goto out;
> -	}
> +		if (!PAGE_ALIGNED(resource_size(&vcpu_res))) {
> +			kvm_err("GICV size 0x%llx not a multiple of page size 0x%lx\n",
> +				(unsigned long long)resource_size(&vcpu_res),
> +				PAGE_SIZE);
> +			ret = -ENXIO;
> +			goto out;
> +		}
>  
> -	if (!PAGE_ALIGNED(resource_size(&vcpu_res))) {
> -		kvm_err("GICV size 0x%llx not a multiple of page size 0x%lx\n",
> -			(unsigned long long)resource_size(&vcpu_res),
> -			PAGE_SIZE);
> -		ret = -ENXIO;
> -		goto out;
> +		vgic->vcpu_base = vcpu_res.start;
> +		vgic->init_emul = vgic_v3_init_emul_compat;
> +		kvm_register_device_ops(&kvm_arm_vgic_v2_ops,
> +					KVM_DEV_TYPE_ARM_VGIC_V2);
>  	}
> -	kvm_register_device_ops(&kvm_arm_vgic_v2_ops, KVM_DEV_TYPE_ARM_VGIC_V2);
> +	kvm_register_device_ops(&kvm_arm_vgic_v3_ops, KVM_DEV_TYPE_ARM_VGIC_V3);
>  
> -	vgic->init_emul = vgic_v3_init_emul_compat;
> -	vgic->vcpu_base = vcpu_res.start;
>  	vgic->vctrl_base = NULL;
>  	vgic->type = VGIC_V3;
>  
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 2867269d..16d7c9d 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -1542,6 +1542,10 @@ static bool init_emulation_ops(struct kvm *kvm, int type)
>  	switch (type) {
>  	case KVM_DEV_TYPE_ARM_VGIC_V2:
>  		return vgic_v2_init_emulation_ops(kvm, type);
> +#ifdef CONFIG_ARM_GIC_V3
> +	case KVM_DEV_TYPE_ARM_VGIC_V3:
> +		return vgic_v3_init_emulation_ops(kvm, type);

needs Documentation

> +#endif
>  	}
>  	return false;
>  }
> -- 
> 1.7.9.5
> 


* [PATCH v3 19/19] arm/arm64: KVM: allow userland to request a virtual GICv3
  2014-10-31 17:26 ` [PATCH v3 19/19] arm/arm64: KVM: allow userland to request a virtual GICv3 Andre Przywara
@ 2014-11-07 16:15   ` Christoffer Dall
  2014-11-10 12:26     ` Andre Przywara
  0 siblings, 1 reply; 76+ messages in thread
From: Christoffer Dall @ 2014-11-07 16:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 31, 2014 at 05:26:54PM +0000, Andre Przywara wrote:
> With everything in place we allow userland to request the kernel
> using a virtual GICv3 in the guest, which finally lifts the 8 vCPU
> limit for a guest.

You're actually not explicitly allowing this in this patch; you're
implicitly allowing it because init would fail without the vgic
distributor base address being set already.

Either re-arrange your patches or fix the commit message.

> Also we provide the necessary support for guests setting the memory
> addresses for the virtual distributor and redistributors.
> This requires some userland code to make use of that feature and
> explicitly ask for a virtual GICv3.

You need to add documentation for this new device type and the userspace
ABI.

> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  arch/arm64/include/uapi/asm/kvm.h |    7 ++++++
>  include/kvm/arm_vgic.h            |    4 ++--
>  virt/kvm/arm/vgic-v3-emul.c       |    3 +++
>  virt/kvm/arm/vgic.c               |   46 ++++++++++++++++++++++++++-----------
>  4 files changed, 45 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 8e38878..2ed873a 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -78,6 +78,13 @@ struct kvm_regs {
>  #define KVM_VGIC_V2_DIST_SIZE		0x1000
>  #define KVM_VGIC_V2_CPU_SIZE		0x2000
>  
> +/* Supported VGICv3 address types  */
> +#define KVM_VGIC_V3_ADDR_TYPE_DIST	2
> +#define KVM_VGIC_V3_ADDR_TYPE_REDIST	3
> +
> +#define KVM_VGIC_V3_DIST_SIZE		SZ_64K
> +#define KVM_VGIC_V3_REDIST_SIZE		(2 * SZ_64K)
> +
>  #define KVM_ARM_VCPU_POWER_OFF		0 /* CPU is started in OFF state */
>  #define KVM_ARM_VCPU_EL1_32BIT		1 /* CPU running a 32bit VM */
>  #define KVM_ARM_VCPU_PSCI_0_2		2 /* CPU uses PSCI v0.2 */
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index c303083..e2e432c 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -35,8 +35,8 @@
>  #define VGIC_MAX_IRQS		1024
>  
>  /* Sanity checks... */
> -#if (KVM_MAX_VCPUS > 8)
> -#error	Invalid number of CPU interfaces
> +#if (KVM_MAX_VCPUS > 255)
> +#error Too many KVM VCPUs, the VGIC only supports up to 255 VCPUs for now

what happens now if you add more vcpus after having created a GICv2 with
8 vcpus?

>  #endif
>  
>  #if (VGIC_NR_IRQS_LEGACY & 31)
> diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
> index bcb5374..ba6b0b5 100644
> --- a/virt/kvm/arm/vgic-v3-emul.c
> +++ b/virt/kvm/arm/vgic-v3-emul.c
> @@ -870,6 +870,9 @@ static int vgic_v3_has_attr(struct kvm_device *dev,
>  		case KVM_VGIC_V2_ADDR_TYPE_DIST:
>  		case KVM_VGIC_V2_ADDR_TYPE_CPU:
>  			return -ENXIO;
> +		case KVM_VGIC_V3_ADDR_TYPE_DIST:
> +		case KVM_VGIC_V3_ADDR_TYPE_REDIST:
> +			return 0;
>  		}
>  		break;
>  	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 16d7c9d..a5abef1 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -1647,7 +1647,7 @@ static int vgic_ioaddr_assign(struct kvm *kvm, phys_addr_t *ioaddr,
>  /**
>   * kvm_vgic_addr - set or get vgic VM base addresses
>   * @kvm:   pointer to the vm struct
> - * @type:  the VGIC addr type, one of KVM_VGIC_V2_ADDR_TYPE_XXX
> + * @type:  the VGIC addr type, one of KVM_VGIC_V[23]_ADDR_TYPE_XXX
>   * @addr:  pointer to address value
>   * @write: if true set the address in the VM address space, if false read the
>   *          address
> @@ -1661,29 +1661,49 @@ int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write)
>  {
>  	int r = 0;
>  	struct vgic_dist *vgic = &kvm->arch.vgic;
> +	int type_needed;
> +	phys_addr_t *addr_ptr, block_size;
>  
>  	mutex_lock(&kvm->lock);
>  	switch (type) {
>  	case KVM_VGIC_V2_ADDR_TYPE_DIST:
> -		if (write) {
> -			r = vgic_ioaddr_assign(kvm, &vgic->vgic_dist_base,
> -					       *addr, KVM_VGIC_V2_DIST_SIZE);
> -		} else {
> -			*addr = vgic->vgic_dist_base;
> -		}
> +		type_needed = KVM_DEV_TYPE_ARM_VGIC_V2;
> +		addr_ptr = &vgic->vgic_dist_base;
> +		block_size = KVM_VGIC_V2_DIST_SIZE;
>  		break;
>  	case KVM_VGIC_V2_ADDR_TYPE_CPU:
> -		if (write) {
> -			r = vgic_ioaddr_assign(kvm, &vgic->vgic_cpu_base,
> -					       *addr, KVM_VGIC_V2_CPU_SIZE);
> -		} else {
> -			*addr = vgic->vgic_cpu_base;
> -		}
> +		type_needed = KVM_DEV_TYPE_ARM_VGIC_V2;
> +		addr_ptr = &vgic->vgic_cpu_base;
> +		block_size = KVM_VGIC_V2_CPU_SIZE;
>  		break;
> +#ifdef CONFIG_ARM_GIC_V3
> +	case KVM_VGIC_V3_ADDR_TYPE_DIST:
> +		type_needed = KVM_DEV_TYPE_ARM_VGIC_V3;
> +		addr_ptr = &vgic->vgic_dist_base;
> +		block_size = KVM_VGIC_V3_DIST_SIZE;
> +		break;
> +	case KVM_VGIC_V3_ADDR_TYPE_REDIST:
> +		type_needed = KVM_DEV_TYPE_ARM_VGIC_V3;
> +		addr_ptr = &vgic->vgic_redist_base;
> +		block_size = KVM_VGIC_V3_REDIST_SIZE;
> +		break;
> +#endif
>  	default:
>  		r = -ENODEV;
> +		goto out;
> +	}
> +
> +	if (vgic->vgic_model != type_needed) {
> +		r = -ENODEV;
> +		goto out;
>  	}
>  
> +	if (write)
> +		r = vgic_ioaddr_assign(kvm, addr_ptr, *addr, block_size);
> +	else
> +		*addr = *addr_ptr;
> +
> +out:
>  	mutex_unlock(&kvm->lock);
>  	return r;
>  }
> -- 
> 1.7.9.5
> 

Otherwise looks good to me.

Thanks,
-Christoffer


* [PATCH v3 10/19] arm/arm64: KVM: make the value of ICC_SRE_EL1 a per-VM variable
  2014-11-03 20:17     ` Marc Zyngier
@ 2014-11-07 19:18       ` Christoffer Dall
  0 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-07 19:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 03, 2014 at 08:17:36PM +0000, Marc Zyngier wrote:
> Hi Christoffer,
> 
> On 03/11/14 20:04, Christoffer Dall wrote:
> > On Fri, Oct 31, 2014 at 05:26:45PM +0000, Andre Przywara wrote:
> >> ICC_SRE_EL1 is a system register allowing msr/mrs accesses to the
> >> GIC CPU interface for EL1 (guests). Currently we force it to 0, but
> >> for proper GICv3 support we have to allow guests to use it (depending
> >> on their selected virtual GIC model).
> >> So add ICC_SRE_EL1 to the list of saved/restored registers on a
> >> world switch, but actually disallow a guest to change it by only
> >> restoring a fixed, once-initialized value.
> >> This value depends on the GIC model userland has chosen for a guest.
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
> >> ---
> >>  arch/arm64/kernel/asm-offsets.c |    1 +
> >>  arch/arm64/kvm/vgic-v3-switch.S |   14 +++++++++-----
> >>  include/kvm/arm_vgic.h          |    1 +
> >>  virt/kvm/arm/vgic-v3.c          |    9 +++++++--
> >>  4 files changed, 18 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> >> index 9a9fce0..9d34486 100644
> >> --- a/arch/arm64/kernel/asm-offsets.c
> >> +++ b/arch/arm64/kernel/asm-offsets.c
> >> @@ -140,6 +140,7 @@ int main(void)
> >>    DEFINE(VGIC_V2_CPU_ELRSR,	offsetof(struct vgic_cpu, vgic_v2.vgic_elrsr));
> >>    DEFINE(VGIC_V2_CPU_APR,	offsetof(struct vgic_cpu, vgic_v2.vgic_apr));
> >>    DEFINE(VGIC_V2_CPU_LR,	offsetof(struct vgic_cpu, vgic_v2.vgic_lr));
> >> +  DEFINE(VGIC_V3_CPU_SRE,	offsetof(struct vgic_cpu, vgic_v3.vgic_sre));
> >>    DEFINE(VGIC_V3_CPU_HCR,	offsetof(struct vgic_cpu, vgic_v3.vgic_hcr));
> >>    DEFINE(VGIC_V3_CPU_VMCR,	offsetof(struct vgic_cpu, vgic_v3.vgic_vmcr));
> >>    DEFINE(VGIC_V3_CPU_MISR,	offsetof(struct vgic_cpu, vgic_v3.vgic_misr));
> >> diff --git a/arch/arm64/kvm/vgic-v3-switch.S b/arch/arm64/kvm/vgic-v3-switch.S
> >> index d160469..617a012 100644
> >> --- a/arch/arm64/kvm/vgic-v3-switch.S
> >> +++ b/arch/arm64/kvm/vgic-v3-switch.S
> >> @@ -148,17 +148,18 @@
> >>   * x0: Register pointing to VCPU struct
> >>   */
> >>  .macro	restore_vgic_v3_state
> >> -	// Disable SRE_EL1 access. Necessary, otherwise
> >> -	// ICH_VMCR_EL2.VFIQEn becomes one, and FIQ happens...
> >> -	msr_s	ICC_SRE_EL1, xzr
> >> -	isb
> >> -
> > 
> > I know I reviewed this once, but now I'm forgetting how it works with
> > this comment above.  First, I don't fully understand the comment.
> 
> If you write to ICH_VMCR_EL2 with SRE==1, the architecture forces VFIQEn
> to 1, which causes interesting effects when you inject an Group0
> interrupt (as we do for GICv2 emulation).
> 
> You end-up spending days debugging this, mostly blaming the model for
> all these FIQs appearing in your guest, until you read that small gem
> hidden in the architecture spec. Bad memories, let's not go there.
> 
> That's why we must make sure to set ICC_SRE_EL1 *before* writing to
> ICH_VMCR_EL2.
> 
> > Second, now we're restoring a value that may potentially have SRE_EL1
> > access enabled, but FIQ doesn't happen.  Can you clarify this for me?
> 
> That's a side effect of how we inject interrupts with GICv3. They are
> Group1, always. A Group0 interrupt would definitely be delivered as a
> FIQ, but we currently don't offer a way to support that.
> 
Realized I never responded to this.

Thanks for the clarification, this must have been dreadful to debug.

-Christoffer


* [PATCH v3 17/19] arm64: KVM: add SGI system register trapping
  2014-11-07 15:07   ` Christoffer Dall
@ 2014-11-10 11:31     ` Andre Przywara
  2014-11-10 12:45       ` Christoffer Dall
  0 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-11-10 11:31 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Christoffer,

On 07/11/14 15:07, Christoffer Dall wrote:
> On Fri, Oct 31, 2014 at 05:26:52PM +0000, Andre Przywara wrote:
>> While the injection of a (virtual) inter-processor interrupt (SGI)
>> on a GICv2 works by writing to a MMIO register, GICv3 uses system
>> registers to trigger them.
>> Trap the appropriate registers on ARM64 hosts and call the SGI
>
> Are you actually enabling the trapping here or just putting the trap
> handler in place?  As I understood so far, we still configure the guest
> at this point to raise an unexpected exception in the guest if it tries
> to access the system registers; did I get this wrong?

You are right, the changes in the patch series are not yet visible to
userland (and hence the guest) at this point, so any guest access to any
kind of GICv3 registers (MMIO or sysreg) should still fail.
So a guest Linux GICv3 driver will never issue those MSRs if there is no
DT node present, but any attempt would still fail, since the GICv3
structures are not properly initialized.

>> handler function in the vGICv3 emulation code.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  arch/arm64/kvm/sys_regs.c |   26 ++++++++++++++++++++++++++
>>  1 file changed, 26 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
>> index dcc5867..cf0452e 100644
>> --- a/arch/arm64/kvm/sys_regs.c
>> +++ b/arch/arm64/kvm/sys_regs.c
>> @@ -165,6 +165,27 @@ static bool access_sctlr(struct kvm_vcpu *vcpu,
>>      return true;
>>  }
>>
>> +/*
>> + * Trapping on the GICv3 SGI system register.
>
> Use the architecture name for the register here.
>
>> + * Forward the request to the VGIC emulation.
>> + * The cp15_64 code makes sure this automatically works
>> + * for both AArch64 and AArch32 accesses.
>> + */
>> +static bool access_gic_sgi(struct kvm_vcpu *vcpu,
>> +                       const struct sys_reg_params *p,
>> +                       const struct sys_reg_desc *r)
>> +{
>> +    u64 val;
>> +
>> +    if (!p->is_write)
>> +            return read_from_write_only(vcpu, p);
>> +
>> +    val = *vcpu_reg(vcpu, p->Rt);
>> +    vgic_v3_dispatch_sgi(vcpu, val);
>
> So do we guarantee somehow that we'll never get here if userspace didn't
> successfully create a virtual GICv3?

No :-( Nothing prevents a guest from writing to this architectural
sysreg, but it shouldn't do so, since nothing has told it about a GICv3 yet.

What about just introducing the handler functions in this patch and
wiring them up in the sys_reg_descs struct later with the final
enablement patch?
This would provoke a compile warning though, due to the unused static
functions. Is it worth declaring them as non-static until they are
referenced in the later patch?

Is there any other trick to avoid this warning or to work around this issue?

Cheers,
Andre.

>
>> +
>> +    return true;
>> +}
>> +
>>  static bool trap_raz_wi(struct kvm_vcpu *vcpu,
>>                      const struct sys_reg_params *p,
>>                      const struct sys_reg_desc *r)
>> @@ -431,6 +452,9 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>>      /* VBAR_EL1 */
>>      { Op0(0b11), Op1(0b000), CRn(0b1100), CRm(0b0000), Op2(0b000),
>>        NULL, reset_val, VBAR_EL1, 0 },
>> +    /* ICC_SGI1R_EL1 */
>> +    { Op0(0b11), Op1(0b000), CRn(0b1100), CRm(0b1011), Op2(0b101),
>> +      access_gic_sgi },
>>      /* CONTEXTIDR_EL1 */
>>      { Op0(0b11), Op1(0b000), CRn(0b1101), CRm(0b0000), Op2(0b001),
>>        access_vm_reg, reset_val, CONTEXTIDR_EL1, 0 },
>> @@ -659,6 +683,8 @@ static const struct sys_reg_desc cp14_64_regs[] = {
>>   * register).
>>   */
>>  static const struct sys_reg_desc cp15_regs[] = {
>> +    { Op1( 0), CRn( 0), CRm(12), Op2( 0), access_gic_sgi },
>> +
>>      { Op1( 0), CRn( 1), CRm( 0), Op2( 0), access_sctlr, NULL, c1_SCTLR },
>>      { Op1( 0), CRn( 2), CRm( 0), Op2( 0), access_vm_reg, NULL, c2_TTBR0 },
>>      { Op1( 0), CRn( 2), CRm( 0), Op2( 1), access_vm_reg, NULL, c2_TTBR1 },
>> --
>> 1.7.9.5
>>
>



* [PATCH v3 18/19] arm/arm64: KVM: enable kernel side of GICv3 emulation
  2014-11-07 16:07   ` Christoffer Dall
@ 2014-11-10 12:19     ` Andre Przywara
  2014-11-10 13:24       ` Christoffer Dall
  0 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-11-10 12:19 UTC (permalink / raw)
  To: linux-arm-kernel

Hej Christoffer,

On 07/11/14 16:07, Christoffer Dall wrote:
> On Fri, Oct 31, 2014 at 05:26:53PM +0000, Andre Przywara wrote:
>> With all the necessary GICv3 emulation code in place, we can now
>> connect the code to the GICv3 backend in the kernel.
>> The LR register handling is different depending on the emulated GIC
>> model, so provide different implementations for each.
>> Also allow non-v2-compatible GICv3 implementations (which don't
>> provide MMIO regions for the virtual CPU interface in the DT), but
>> restrict those hosts to use GICv3 guests only.
> 
> s/use/support/
> 
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  virt/kvm/arm/vgic-v3.c |  168 ++++++++++++++++++++++++++++++++++++------------
>>  virt/kvm/arm/vgic.c    |    4 ++
>>  2 files changed, 130 insertions(+), 42 deletions(-)
>>
>> diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
>> index ce50918..c0e901c 100644
>> --- a/virt/kvm/arm/vgic-v3.c
>> +++ b/virt/kvm/arm/vgic-v3.c
>> @@ -34,6 +34,7 @@
>>  #define GICH_LR_VIRTUALID		(0x3ffUL << 0)
>>  #define GICH_LR_PHYSID_CPUID_SHIFT	(10)
>>  #define GICH_LR_PHYSID_CPUID		(7UL << GICH_LR_PHYSID_CPUID_SHIFT)
>> +#define ICH_LR_VIRTUALID_MASK		(BIT_ULL(32) - 1)
>>  
>>  /*
>>   * LRs are stored in reverse order in memory. make sure we index them
>> @@ -43,7 +44,35 @@
>>  
>>  static u32 ich_vtr_el2;
>>  
>> -static struct vgic_lr vgic_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
>> +static u64 sync_lr_val(u8 state)
> 
> is this lr_state_to_val ?
> 
>> +{
>> +	u64 lr_val = 0;
>> +
>> +	if (state & LR_STATE_PENDING)
>> +		lr_val |= ICH_LR_PENDING_BIT;
>> +	if (state & LR_STATE_ACTIVE)
>> +		lr_val |= ICH_LR_ACTIVE_BIT;
>> +	if (state & LR_EOI_INT)
>> +		lr_val |= ICH_LR_EOI;
>> +
>> +	return lr_val;
>> +}
>> +
>> +static u8 sync_lr_state(u64 lr_val)
> 
> and lr_val_to_state ?
> 
> at least these sync names don't make much sense to me...
> 
>> +{
>> +	u8 state = 0;
>> +
>> +	if (lr_val & ICH_LR_PENDING_BIT)
>> +		state |= LR_STATE_PENDING;
>> +	if (lr_val & ICH_LR_ACTIVE_BIT)
>> +		state |= LR_STATE_ACTIVE;
>> +	if (lr_val & ICH_LR_EOI)
>> +		state |= LR_EOI_INT;
>> +
>> +	return state;
>> +}
>> +
>> +static struct vgic_lr vgic_v2_on_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
>>  {
>>  	struct vgic_lr lr_desc;
>>  	u64 val = vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)];
>> @@ -53,30 +82,53 @@ static struct vgic_lr vgic_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
>>  		lr_desc.source	= (val >> GICH_LR_PHYSID_CPUID_SHIFT) & 0x7;
>>  	else
>>  		lr_desc.source = 0;
>> -	lr_desc.state	= 0;
>> +	lr_desc.state	= sync_lr_state(val);
>>  
>> -	if (val & ICH_LR_PENDING_BIT)
>> -		lr_desc.state |= LR_STATE_PENDING;
>> -	if (val & ICH_LR_ACTIVE_BIT)
>> -		lr_desc.state |= LR_STATE_ACTIVE;
>> -	if (val & ICH_LR_EOI)
>> -		lr_desc.state |= LR_EOI_INT;
>> +	return lr_desc;
>> +}
>> +
>> +static struct vgic_lr vgic_v3_on_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
>> +{
>> +	struct vgic_lr lr_desc;
>> +	u64 val = vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)];
>> +
>> +	lr_desc.irq	= val & ICH_LR_VIRTUALID_MASK;
>> +	lr_desc.source	= 0;
>> +	lr_desc.state	= sync_lr_state(val);
>>  
>>  	return lr_desc;
>>  }
>>  
>> -static void vgic_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
>> -			   struct vgic_lr lr_desc)
>> +static void vgic_v3_on_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
>> +				 struct vgic_lr lr_desc)
>>  {
>> -	u64 lr_val = (((u32)lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT) |
>> -		      lr_desc.irq);
>> +	u64 lr_val;
>>  
>> -	if (lr_desc.state & LR_STATE_PENDING)
>> -		lr_val |= ICH_LR_PENDING_BIT;
>> -	if (lr_desc.state & LR_STATE_ACTIVE)
>> -		lr_val |= ICH_LR_ACTIVE_BIT;
>> -	if (lr_desc.state & LR_EOI_INT)
>> -		lr_val |= ICH_LR_EOI;
>> +	lr_val = lr_desc.irq;
>> +
>> +	/*
>> +	 * currently all guest IRQs are Group1, as Group0 would result
> 
> Can you guess my comment here?
> 
>> +	 * in a FIQ in the guest, which it wouldn't expect.
>> +	 * Eventually we want to make this configurable, so we may revisit
>> +	 * this in the future.
>> +	 */
>> +	lr_val |= ICH_LR_GROUP;
>> +
>> +	lr_val |= sync_lr_val(lr_desc.state);
>> +
>> +	vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)] = lr_val;
>> +}
>> +
>> +static void vgic_v2_on_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
>> +				 struct vgic_lr lr_desc)
>> +{
>> +	u64 lr_val;
>> +
>> +	lr_val = lr_desc.irq;
>> +
>> +	lr_val |= (u32)lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT;
>> +
>> +	lr_val |= sync_lr_val(lr_desc.state);
>>  
>>  	vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)] = lr_val;
>>  }
>> @@ -145,9 +197,8 @@ static void vgic_v3_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcrp)
>>  
>>  static void vgic_v3_enable(struct kvm_vcpu *vcpu)
>>  {
>> -	struct vgic_v3_cpu_if *vgic_v3;
>> +	struct vgic_v3_cpu_if *vgic_v3 = &vcpu->arch.vgic_cpu.vgic_v3;
>>  
>> -	vgic_v3 = &vcpu->arch.vgic_cpu.vgic_v3;
> 
> unnecessary change?
> 
>>  	/*
>>  	 * By forcing VMCR to zero, the GIC will restore the binary
>>  	 * points to their reset values. Anything else resets to zero

So most of the code above is gone now in this form, since I dropped
init_emul and friends earlier last week in response to your comments.
So I will skip those comments for now (or better: try to translate them
to the new code structure if possible) and eagerly wait for them to
reappear in a different form in the v4 comments ;-)

>> @@ -155,7 +206,14 @@ static void vgic_v3_enable(struct kvm_vcpu *vcpu)
>>  	 */
>>  	vgic_v3->vgic_vmcr = 0;
>>  
>> -	vgic_v3->vgic_sre = 0;
>> +	/*
>> +	 * Set the SRE_EL1 value depending on the configured
>> +	 * emulated vGIC model.
>> +	 */
>> +	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)
>> +		vgic_v3->vgic_sre = ICC_SRE_EL1_SRE;
> 
> If we're on hardware with the GICv2 backwards compatibility can the
> guest not actually set ICC_SRE_EL1_SRE to 0 but then we are not
> preserving this because we wouldn't be trapping such accesses anymore?
> 
> Also, this is really about the reset value of that field, which the
> comment above is not being specific about.
> 
> Further, that would mean that the field is NOT actually RAO/WI, and thus
> the spec dictates that the field should reset to zero if EL1 is the
> highest implemented exception level and the field is not RAO/WI, which
> is what the guest expects, no?

I am still thinking about this (because it is probably true). Need to
discuss with Marc what we can do about this.

Thanks,
Andre.

> 
>> +	else
>> +		vgic_v3->vgic_sre = 0;
>>  
>>  	/* Get the show on the road... */
>>  	vgic_v3->vgic_hcr = ICH_HCR_EN;
>> @@ -173,6 +231,15 @@ static const struct vgic_ops vgic_v3_ops = {
>>  	.enable			= vgic_v3_enable,
>>  };
>>  
>> +static void init_vgic_v3_emul(struct kvm *kvm)
>> +{
>> +	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
>> +
>> +	vm_ops->get_lr = vgic_v3_on_v3_get_lr;
>> +	vm_ops->set_lr = vgic_v3_on_v3_set_lr;
>> +	kvm->arch.max_vcpus = KVM_MAX_VCPUS;
>> +}
> 
> why do you need this indirection?  Just move vgic_v3_init_emul() up here
> and call that?
> 
>> +
>>  static bool vgic_v3_init_emul_compat(struct kvm *kvm, int type)
>>  {
>>  	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
>> @@ -186,14 +253,28 @@ static bool vgic_v3_init_emul_compat(struct kvm *kvm, int type)
>>  			return false;
>>  		}
>>  
>> -		vm_ops->get_lr = vgic_v3_get_lr;
>> -		vm_ops->set_lr = vgic_v3_set_lr;
>> +		vm_ops->get_lr = vgic_v2_on_v3_get_lr;
>> +		vm_ops->set_lr = vgic_v2_on_v3_set_lr;
>>  		kvm->arch.max_vcpus = 8;
>>  		return true;
>> +	case KVM_DEV_TYPE_ARM_VGIC_V3:
>> +		init_vgic_v3_emul(kvm);
>> +		return true;
>>  	}
>>  	return false;
>>  }
>>  
>> +static bool vgic_v3_init_emul(struct kvm *kvm, int type)
>> +{
>> +	switch (type) {
>> +	case KVM_DEV_TYPE_ARM_VGIC_V3:
>> +		init_vgic_v3_emul(kvm);
>> +		return true;
>> +	}
>> +
>> +	return false;
>> +}
>> +
>>  static struct vgic_params vgic_v3_params;
>>  
>>  /**
>> @@ -235,29 +316,32 @@ int vgic_v3_probe(struct device_node *vgic_node,
>>  
>>  	gicv_idx += 3; /* Also skip GICD, GICC, GICH */
>>  	if (of_address_to_resource(vgic_node, gicv_idx, &vcpu_res)) {
>> -		kvm_err("Cannot obtain GICV region\n");
>> -		ret = -ENXIO;
>> -		goto out;
>> -	}
>> +		kvm_info("GICv3: GICv2 emulation not available\n");
>> +		vgic->vcpu_base = 0;
>> +		vgic->init_emul = vgic_v3_init_emul;
>> +	} else {
>> +		if (!PAGE_ALIGNED(vcpu_res.start)) {
>> +			kvm_err("GICV physical address 0x%llx not page aligned\n",
>> +				(unsigned long long)vcpu_res.start);
>> +			ret = -ENXIO;
>> +			goto out;
>> +		}
>>  
>> -	if (!PAGE_ALIGNED(vcpu_res.start)) {
>> -		kvm_err("GICV physical address 0x%llx not page aligned\n",
>> -			(unsigned long long)vcpu_res.start);
>> -		ret = -ENXIO;
>> -		goto out;
>> -	}
>> +		if (!PAGE_ALIGNED(resource_size(&vcpu_res))) {
>> +			kvm_err("GICV size 0x%llx not a multiple of page size 0x%lx\n",
>> +				(unsigned long long)resource_size(&vcpu_res),
>> +				PAGE_SIZE);
>> +			ret = -ENXIO;
>> +			goto out;
>> +		}
>>  
>> -	if (!PAGE_ALIGNED(resource_size(&vcpu_res))) {
>> -		kvm_err("GICV size 0x%llx not a multiple of page size 0x%lx\n",
>> -			(unsigned long long)resource_size(&vcpu_res),
>> -			PAGE_SIZE);
>> -		ret = -ENXIO;
>> -		goto out;
>> +		vgic->vcpu_base = vcpu_res.start;
>> +		vgic->init_emul = vgic_v3_init_emul_compat;
>> +		kvm_register_device_ops(&kvm_arm_vgic_v2_ops,
>> +					KVM_DEV_TYPE_ARM_VGIC_V2);
>>  	}
>> -	kvm_register_device_ops(&kvm_arm_vgic_v2_ops, KVM_DEV_TYPE_ARM_VGIC_V2);
>> +	kvm_register_device_ops(&kvm_arm_vgic_v3_ops, KVM_DEV_TYPE_ARM_VGIC_V3);
>>  
>> -	vgic->init_emul = vgic_v3_init_emul_compat;
>> -	vgic->vcpu_base = vcpu_res.start;
>>  	vgic->vctrl_base = NULL;
>>  	vgic->type = VGIC_V3;
>>  
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index 2867269d..16d7c9d 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -1542,6 +1542,10 @@ static bool init_emulation_ops(struct kvm *kvm, int type)
>>  	switch (type) {
>>  	case KVM_DEV_TYPE_ARM_VGIC_V2:
>>  		return vgic_v2_init_emulation_ops(kvm, type);
>> +#ifdef CONFIG_ARM_GIC_V3
>> +	case KVM_DEV_TYPE_ARM_VGIC_V3:
>> +		return vgic_v3_init_emulation_ops(kvm, type);
> 
> needs Documentation
> 
>> +#endif
>>  	}
>>  	return false;
>>  }
>> -- 
>> 1.7.9.5
>>
> 

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v3 19/19] arm/arm64: KVM: allow userland to request a virtual GICv3
  2014-11-07 16:15   ` Christoffer Dall
@ 2014-11-10 12:26     ` Andre Przywara
  2014-11-10 13:25       ` Christoffer Dall
  0 siblings, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-11-10 12:26 UTC (permalink / raw)
  To: linux-arm-kernel

Hej Christoffer,

On 07/11/14 16:15, Christoffer Dall wrote:
> On Fri, Oct 31, 2014 at 05:26:54PM +0000, Andre Przywara wrote:
>> With everything in place we allow userland to request the kernel
>> using a virtual GICv3 in the guest, which finally lifts the 8 vCPU
>> limit for a guest.
> 
> You're actually not explicitly allowing this in this patch, you're
> implicitly allowing it because init would fail without the vgic
> distributor base address being set already.
> 
> Either re-arrange your patches or fix the commit message.

The latter ;-)

>> Also we provide the necessary support for guests setting the memory
>> addresses for the virtual distributor and redistributors.
>> This requires some userland code to make use of that feature and
>> explicitly ask for a virtual GICv3.
> 
> You need to add documentation for this new device type and the userspace
> ABI.

Will do.

>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  arch/arm64/include/uapi/asm/kvm.h |    7 ++++++
>>  include/kvm/arm_vgic.h            |    4 ++--
>>  virt/kvm/arm/vgic-v3-emul.c       |    3 +++
>>  virt/kvm/arm/vgic.c               |   46 ++++++++++++++++++++++++++-----------
>>  4 files changed, 45 insertions(+), 15 deletions(-)
>>
>> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
>> index 8e38878..2ed873a 100644
>> --- a/arch/arm64/include/uapi/asm/kvm.h
>> +++ b/arch/arm64/include/uapi/asm/kvm.h
>> @@ -78,6 +78,13 @@ struct kvm_regs {
>>  #define KVM_VGIC_V2_DIST_SIZE		0x1000
>>  #define KVM_VGIC_V2_CPU_SIZE		0x2000
>>  
>> +/* Supported VGICv3 address types  */
>> +#define KVM_VGIC_V3_ADDR_TYPE_DIST	2
>> +#define KVM_VGIC_V3_ADDR_TYPE_REDIST	3
>> +
>> +#define KVM_VGIC_V3_DIST_SIZE		SZ_64K
>> +#define KVM_VGIC_V3_REDIST_SIZE		(2 * SZ_64K)
>> +
>>  #define KVM_ARM_VCPU_POWER_OFF		0 /* CPU is started in OFF state */
>>  #define KVM_ARM_VCPU_EL1_32BIT		1 /* CPU running a 32bit VM */
>>  #define KVM_ARM_VCPU_PSCI_0_2		2 /* CPU uses PSCI v0.2 */
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index c303083..e2e432c 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -35,8 +35,8 @@
>>  #define VGIC_MAX_IRQS		1024
>>  
>>  /* Sanity checks... */
>> -#if (KVM_MAX_VCPUS > 8)
>> -#error	Invalid number of CPU interfaces
>> +#if (KVM_MAX_VCPUS > 255)
>> +#error Too many KVM VCPUs, the VGIC only supports up to 255 VCPUs for now
> 
> what happens now if you add more vcpus after having created a GICv2 with
> 8 vcpus?

On adding a VCPU we check the number of allowed VCPUs for this
particular guest (see arch/arm/kvm/arm.c:kvm_arch_vcpu_create() in patch
09/19). On creating a virtual GICv2 we set the limit to 8, so any
KVM_VCPU_CREATE afterwards will fail.
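
For illustration only -- this is just a sketch of the idea, not the
actual hunk from patch 09/19, and the error code here is made up --
the check amounts to something like:

	/* in kvm_arch_vcpu_create(); kvm->arch.max_vcpus is set to 8
	 * for a virtual GICv2 and to KVM_MAX_VCPUS for a virtual GICv3 */
	if (id >= kvm->arch.max_vcpus)
		return ERR_PTR(-EINVAL);	/* no more VCPUs for this VM */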

But indeed I found other issues in this sequence of VCPU/VGIC init,
which dissolved "magically" by the rework around (or actually the drop
of) init_emul() and friends.

Thanks,
Andre.

>>  #endif
>>  
>>  #if (VGIC_NR_IRQS_LEGACY & 31)
>> diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
>> index bcb5374..ba6b0b5 100644
>> --- a/virt/kvm/arm/vgic-v3-emul.c
>> +++ b/virt/kvm/arm/vgic-v3-emul.c
>> @@ -870,6 +870,9 @@ static int vgic_v3_has_attr(struct kvm_device *dev,
>>  		case KVM_VGIC_V2_ADDR_TYPE_DIST:
>>  		case KVM_VGIC_V2_ADDR_TYPE_CPU:
>>  			return -ENXIO;
>> +		case KVM_VGIC_V3_ADDR_TYPE_DIST:
>> +		case KVM_VGIC_V3_ADDR_TYPE_REDIST:
>> +			return 0;
>>  		}
>>  		break;
>>  	case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index 16d7c9d..a5abef1 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -1647,7 +1647,7 @@ static int vgic_ioaddr_assign(struct kvm *kvm, phys_addr_t *ioaddr,
>>  /**
>>   * kvm_vgic_addr - set or get vgic VM base addresses
>>   * @kvm:   pointer to the vm struct
>> - * @type:  the VGIC addr type, one of KVM_VGIC_V2_ADDR_TYPE_XXX
>> + * @type:  the VGIC addr type, one of KVM_VGIC_V[23]_ADDR_TYPE_XXX
>>   * @addr:  pointer to address value
>>   * @write: if true set the address in the VM address space, if false read the
>>   *          address
>> @@ -1661,29 +1661,49 @@ int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write)
>>  {
>>  	int r = 0;
>>  	struct vgic_dist *vgic = &kvm->arch.vgic;
>> +	int type_needed;
>> +	phys_addr_t *addr_ptr, block_size;
>>  
>>  	mutex_lock(&kvm->lock);
>>  	switch (type) {
>>  	case KVM_VGIC_V2_ADDR_TYPE_DIST:
>> -		if (write) {
>> -			r = vgic_ioaddr_assign(kvm, &vgic->vgic_dist_base,
>> -					       *addr, KVM_VGIC_V2_DIST_SIZE);
>> -		} else {
>> -			*addr = vgic->vgic_dist_base;
>> -		}
>> +		type_needed = KVM_DEV_TYPE_ARM_VGIC_V2;
>> +		addr_ptr = &vgic->vgic_dist_base;
>> +		block_size = KVM_VGIC_V2_DIST_SIZE;
>>  		break;
>>  	case KVM_VGIC_V2_ADDR_TYPE_CPU:
>> -		if (write) {
>> -			r = vgic_ioaddr_assign(kvm, &vgic->vgic_cpu_base,
>> -					       *addr, KVM_VGIC_V2_CPU_SIZE);
>> -		} else {
>> -			*addr = vgic->vgic_cpu_base;
>> -		}
>> +		type_needed = KVM_DEV_TYPE_ARM_VGIC_V2;
>> +		addr_ptr = &vgic->vgic_cpu_base;
>> +		block_size = KVM_VGIC_V2_CPU_SIZE;
>>  		break;
>> +#ifdef CONFIG_ARM_GIC_V3
>> +	case KVM_VGIC_V3_ADDR_TYPE_DIST:
>> +		type_needed = KVM_DEV_TYPE_ARM_VGIC_V3;
>> +		addr_ptr = &vgic->vgic_dist_base;
>> +		block_size = KVM_VGIC_V3_DIST_SIZE;
>> +		break;
>> +	case KVM_VGIC_V3_ADDR_TYPE_REDIST:
>> +		type_needed = KVM_DEV_TYPE_ARM_VGIC_V3;
>> +		addr_ptr = &vgic->vgic_redist_base;
>> +		block_size = KVM_VGIC_V3_REDIST_SIZE;
>> +		break;
>> +#endif
>>  	default:
>>  		r = -ENODEV;
>> +		goto out;
>> +	}
>> +
>> +	if (vgic->vgic_model != type_needed) {
>> +		r = -ENODEV;
>> +		goto out;
>>  	}
>>  
>> +	if (write)
>> +		r = vgic_ioaddr_assign(kvm, addr_ptr, *addr, block_size);
>> +	else
>> +		*addr = *addr_ptr;
>> +
>> +out:
>>  	mutex_unlock(&kvm->lock);
>>  	return r;
>>  }
>> -- 
>> 1.7.9.5
>>
> 
> Otherwise looks good to me.
> 
> Thanks,
> -Christoffer
> 

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v3 17/19] arm64: KVM: add SGI system register trapping
  2014-11-10 11:31     ` Andre Przywara
@ 2014-11-10 12:45       ` Christoffer Dall
  0 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-10 12:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 10, 2014 at 11:31:23AM +0000, Andre Przywara wrote:
> Hi Christoffer,
> 
> On 07/11/14 15:07, Christoffer Dall wrote:
> > On Fri, Oct 31, 2014 at 05:26:52PM +0000, Andre Przywara wrote:
> >> While the injection of a (virtual) inter-processor interrupt (SGI)
> >> on a GICv2 works by writing to a MMIO register, GICv3 uses system
> >> registers to trigger them.
> >> Trap the appropriate registers on ARM64 hosts and call the SGI
> >
> > Are you actually enabling the trapping here or just putting the trap
> > handler in place?  As I understood so far, we still configure the guest
> > at this point to raise an unexpected exception in the guest if it tries
> > to access the system registers; did I get this wrong?
> 
> You are right, the changes in the patch series at this point are not yet
> visible to userland (and hence the guest), so any guest access to any
> kind of GICv3 register (MMIO or sysreg) should still fail.
> So a guest Linux GICv3 driver will never issue those MSRs if there is no
> DT node present, but any such attempt would fail anyway, since
> the GICv3 structures are not properly initialized.

Shouldn't any guest accesses to these registers just raise an undef
exception in the guest because we're not yet setting SRE?

In any case, it seems your commit message is misleading and should be
rewritten.


> 
> >> handler function in the vGICv3 emulation code.
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >> ---
> >>  arch/arm64/kvm/sys_regs.c |   26 ++++++++++++++++++++++++++
> >>  1 file changed, 26 insertions(+)
> >>
> >> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> >> index dcc5867..cf0452e 100644
> >> --- a/arch/arm64/kvm/sys_regs.c
> >> +++ b/arch/arm64/kvm/sys_regs.c
> >> @@ -165,6 +165,27 @@ static bool access_sctlr(struct kvm_vcpu *vcpu,
> >>      return true;
> >>  }
> >>
> >> +/*
> >> + * Trapping on the GICv3 SGI system register.
> >
> > Use the architecture name for the register here.
> >
> >> + * Forward the request to the VGIC emulation.
> >> + * The cp15_64 code makes sure this automatically works
> >> + * for both AArch64 and AArch32 accesses.
> >> + */
> >> +static bool access_gic_sgi(struct kvm_vcpu *vcpu,
> >> +                       const struct sys_reg_params *p,
> >> +                       const struct sys_reg_desc *r)
> >> +{
> >> +    u64 val;
> >> +
> >> +    if (!p->is_write)
> >> +            return read_from_write_only(vcpu, p);
> >> +
> >> +    val = *vcpu_reg(vcpu, p->Rt);
> >> +    vgic_v3_dispatch_sgi(vcpu, val);
> >
> > So do we guarantee somehow that we'll never get here if userspace didn't
> > successfully create a virtual GICv3?
> 
> No :-( Nothing prevents a guest from writing to this architectural
> sysreg, but it shouldn't, since nothing tells it about a GICv3 yet.

I really don't care whether the guest should or should not do something;
if something is possible, we need to handle it.

> 
> What about just introducing the handler functions in this patch and
> wiring them up in the sys_reg_descs struct later with the final
> enablement patch?

yes, but that's not what this comment is about.

> This would provoke a compile warning though, due to the unused static
> functions. Is it worth declaring them as non-static until they are
> referenced in the later patch?
> 
> Is there any other trick to avoid this warning or to work around this issue?
> 
Hmmm, my concern is that you're calling vgic_v3_dispatch_sgi(), but
you're not doing anything to check if irqchip_in_kernel(), so I just
didn't manage to think through the entire flow, in the sense of whether
we've excluded this function from ever being called if the gicv3 is not
created (because we never set SRE, for example).

I'd like to avoid a host NULL pointer dereference just because the guest
is being a little naughty.
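
Just to illustrate the kind of guard I have in mind (a sketch only, not
code from your series; the exact bail-out path is just an example):

	static bool access_gic_sgi(struct kvm_vcpu *vcpu,
				   const struct sys_reg_params *p,
				   const struct sys_reg_desc *r)
	{
		if (!p->is_write)
			return read_from_write_only(vcpu, p);

		/* no in-kernel vGIC: treat this as an unhandled access */
		if (!irqchip_in_kernel(vcpu->kvm))
			return false;

		vgic_v3_dispatch_sgi(vcpu, *vcpu_reg(vcpu, p->Rt));

		return true;
	}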

-Christoffer

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v3 18/19] arm/arm64: KVM: enable kernel side of GICv3 emulation
  2014-11-10 12:19     ` Andre Przywara
@ 2014-11-10 13:24       ` Christoffer Dall
  0 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-10 13:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 10, 2014 at 12:19:09PM +0000, Andre Przywara wrote:
> Hej Christoffer,

  ^^^ Nice, Hej Andre,

> 
> On 07/11/14 16:07, Christoffer Dall wrote:
> > On Fri, Oct 31, 2014 at 05:26:53PM +0000, Andre Przywara wrote:
> >> With all the necessary GICv3 emulation code in place, we can now
> >> connect the code to the GICv3 backend in the kernel.
> >> The LR register handling is different depending on the emulated GIC
> >> model, so provide different implementations for each.
> >> Also allow non-v2-compatible GICv3 implementations (which don't
> >> provide MMIO regions for the virtual CPU interface in the DT), but
> >> restrict those hosts to use GICv3 guests only.
> > 
> > s/use/support/
> > 
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >> ---
> >>  virt/kvm/arm/vgic-v3.c |  168 ++++++++++++++++++++++++++++++++++++------------
> >>  virt/kvm/arm/vgic.c    |    4 ++
> >>  2 files changed, 130 insertions(+), 42 deletions(-)
> >>
> >> diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
> >> index ce50918..c0e901c 100644
> >> --- a/virt/kvm/arm/vgic-v3.c
> >> +++ b/virt/kvm/arm/vgic-v3.c
> >> @@ -34,6 +34,7 @@
> >>  #define GICH_LR_VIRTUALID		(0x3ffUL << 0)
> >>  #define GICH_LR_PHYSID_CPUID_SHIFT	(10)
> >>  #define GICH_LR_PHYSID_CPUID		(7UL << GICH_LR_PHYSID_CPUID_SHIFT)
> >> +#define ICH_LR_VIRTUALID_MASK		(BIT_ULL(32) - 1)
> >>  
> >>  /*
> >>   * LRs are stored in reverse order in memory. make sure we index them
> >> @@ -43,7 +44,35 @@
> >>  
> >>  static u32 ich_vtr_el2;
> >>  
> >> -static struct vgic_lr vgic_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
> >> +static u64 sync_lr_val(u8 state)
> > 
> > is this lr_state_to_val ?
> > 
> >> +{
> >> +	u64 lr_val = 0;
> >> +
> >> +	if (state & LR_STATE_PENDING)
> >> +		lr_val |= ICH_LR_PENDING_BIT;
> >> +	if (state & LR_STATE_ACTIVE)
> >> +		lr_val |= ICH_LR_ACTIVE_BIT;
> >> +	if (state & LR_EOI_INT)
> >> +		lr_val |= ICH_LR_EOI;
> >> +
> >> +	return lr_val;
> >> +}
> >> +
> >> +static u8 sync_lr_state(u64 lr_val)
> > 
> > and lr_val_to_state ?
> > 
> > at least these sync names don't make much sense to me...
> > 
> >> +{
> >> +	u8 state = 0;
> >> +
> >> +	if (lr_val & ICH_LR_PENDING_BIT)
> >> +		state |= LR_STATE_PENDING;
> >> +	if (lr_val & ICH_LR_ACTIVE_BIT)
> >> +		state |= LR_STATE_ACTIVE;
> >> +	if (lr_val & ICH_LR_EOI)
> >> +		state |= LR_EOI_INT;
> >> +
> >> +	return state;
> >> +}
> >> +
> >> +static struct vgic_lr vgic_v2_on_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
> >>  {
> >>  	struct vgic_lr lr_desc;
> >>  	u64 val = vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)];
> >> @@ -53,30 +82,53 @@ static struct vgic_lr vgic_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
> >>  		lr_desc.source	= (val >> GICH_LR_PHYSID_CPUID_SHIFT) & 0x7;
> >>  	else
> >>  		lr_desc.source = 0;
> >> -	lr_desc.state	= 0;
> >> +	lr_desc.state	= sync_lr_state(val);
> >>  
> >> -	if (val & ICH_LR_PENDING_BIT)
> >> -		lr_desc.state |= LR_STATE_PENDING;
> >> -	if (val & ICH_LR_ACTIVE_BIT)
> >> -		lr_desc.state |= LR_STATE_ACTIVE;
> >> -	if (val & ICH_LR_EOI)
> >> -		lr_desc.state |= LR_EOI_INT;
> >> +	return lr_desc;
> >> +}
> >> +
> >> +static struct vgic_lr vgic_v3_on_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
> >> +{
> >> +	struct vgic_lr lr_desc;
> >> +	u64 val = vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)];
> >> +
> >> +	lr_desc.irq	= val & ICH_LR_VIRTUALID_MASK;
> >> +	lr_desc.source	= 0;
> >> +	lr_desc.state	= sync_lr_state(val);
> >>  
> >>  	return lr_desc;
> >>  }
> >>  
> >> -static void vgic_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
> >> -			   struct vgic_lr lr_desc)
> >> +static void vgic_v3_on_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
> >> +				 struct vgic_lr lr_desc)
> >>  {
> >> -	u64 lr_val = (((u32)lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT) |
> >> -		      lr_desc.irq);
> >> +	u64 lr_val;
> >>  
> >> -	if (lr_desc.state & LR_STATE_PENDING)
> >> -		lr_val |= ICH_LR_PENDING_BIT;
> >> -	if (lr_desc.state & LR_STATE_ACTIVE)
> >> -		lr_val |= ICH_LR_ACTIVE_BIT;
> >> -	if (lr_desc.state & LR_EOI_INT)
> >> -		lr_val |= ICH_LR_EOI;
> >> +	lr_val = lr_desc.irq;
> >> +
> >> +	/*
> >> +	 * currently all guest IRQs are Group1, as Group0 would result
> > 
> > Can you guess my comment here?
> > 
> >> +	 * in a FIQ in the guest, which it wouldn't expect.
> >> +	 * Eventually we want to make this configurable, so we may revisit
> >> +	 * this in the future.
> >> +	 */
> >> +	lr_val |= ICH_LR_GROUP;
> >> +
> >> +	lr_val |= sync_lr_val(lr_desc.state);
> >> +
> >> +	vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)] = lr_val;
> >> +}
> >> +
> >> +static void vgic_v2_on_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
> >> +				 struct vgic_lr lr_desc)
> >> +{
> >> +	u64 lr_val;
> >> +
> >> +	lr_val = lr_desc.irq;
> >> +
> >> +	lr_val |= (u32)lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT;
> >> +
> >> +	lr_val |= sync_lr_val(lr_desc.state);
> >>  
> >>  	vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)] = lr_val;
> >>  }
> >> @@ -145,9 +197,8 @@ static void vgic_v3_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcrp)
> >>  
> >>  static void vgic_v3_enable(struct kvm_vcpu *vcpu)
> >>  {
> >> -	struct vgic_v3_cpu_if *vgic_v3;
> >> +	struct vgic_v3_cpu_if *vgic_v3 = &vcpu->arch.vgic_cpu.vgic_v3;
> >>  
> >> -	vgic_v3 = &vcpu->arch.vgic_cpu.vgic_v3;
> > 
> > unnecessary change?
> > 
> >>  	/*
> >>  	 * By forcing VMCR to zero, the GIC will restore the binary
> >>  	 * points to their reset values. Anything else resets to zero
> 
> So most of the code above is gone now in this form due to the drop of
> init_emul and friends I did earlier last week based on your comments.
> So I will skip those comments for now (or better: try to translate them
> to the new code structure if possible) and eagerly wait for them to
> reappear in a different form in the v4 comments ;-)

sounds good, I'll go hunt again in the v4 :)

> 
> >> @@ -155,7 +206,14 @@ static void vgic_v3_enable(struct kvm_vcpu *vcpu)
> >>  	 */
> >>  	vgic_v3->vgic_vmcr = 0;
> >>  
> >> -	vgic_v3->vgic_sre = 0;
> >> +	/*
> >> +	 * Set the SRE_EL1 value depending on the configured
> >> +	 * emulated vGIC model.
> >> +	 */
> >> +	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)
> >> +		vgic_v3->vgic_sre = ICC_SRE_EL1_SRE;
> > 
> > If we're on hardware with the GICv2 backwards compatibility can the
> > guest not actually set ICC_SRE_EL1_SRE to 0 but then we are not
> > preserving this because we wouldn't be trapping such accesses anymore?
> > 
> > Also, this is really about the reset value of that field, which the
> > comment above is not being specific about.
> > 
> > Further, that would mean that the field is NOT actually RAO/WI, and thus
> > the spec dictates that the field should reset to zero if EL1 is the
> > highest implemented exception level and the field is not RAO/WI, which
> > is what the guest expects, no?
> 
> I am still thinking about this (because it is probably true). Need to
> discuss with Marc what we can do about this.
> 

Kudos for understanding my cryptic comment.

-Christoffer

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v3 19/19] arm/arm64: KVM: allow userland to request a virtual GICv3
  2014-11-10 12:26     ` Andre Przywara
@ 2014-11-10 13:25       ` Christoffer Dall
  0 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-10 13:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 10, 2014 at 12:26:28PM +0000, Andre Przywara wrote:
> Hej Christoffer,
> 
> On 07/11/14 16:15, Christoffer Dall wrote:
> > On Fri, Oct 31, 2014 at 05:26:54PM +0000, Andre Przywara wrote:
> >> With everything in place we allow userland to request the kernel
> >> using a virtual GICv3 in the guest, which finally lifts the 8 vCPU
> >> limit for a guest.
> > 
> > You're actually not explicitly allowing this in this patch, you're
> > implicitly allowing it because init would fail without the vgic
> > distributor base address being set already.
> > 
> > Either re-arrange your patches or fix the commit message.
> 
> The latter ;-)
> 
> >> Also we provide the necessary support for guests setting the memory
> >> addresses for the virtual distributor and redistributors.
> >> This requires some userland code to make use of that feature and
> >> explicitly ask for a virtual GICv3.
> > 
> > You need to add documentation for this new device type and the userspace
> > ABI.
> 
> Will do.
> 
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >> ---
> >>  arch/arm64/include/uapi/asm/kvm.h |    7 ++++++
> >>  include/kvm/arm_vgic.h            |    4 ++--
> >>  virt/kvm/arm/vgic-v3-emul.c       |    3 +++
> >>  virt/kvm/arm/vgic.c               |   46 ++++++++++++++++++++++++++-----------
> >>  4 files changed, 45 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> >> index 8e38878..2ed873a 100644
> >> --- a/arch/arm64/include/uapi/asm/kvm.h
> >> +++ b/arch/arm64/include/uapi/asm/kvm.h
> >> @@ -78,6 +78,13 @@ struct kvm_regs {
> >>  #define KVM_VGIC_V2_DIST_SIZE		0x1000
> >>  #define KVM_VGIC_V2_CPU_SIZE		0x2000
> >>  
> >> +/* Supported VGICv3 address types  */
> >> +#define KVM_VGIC_V3_ADDR_TYPE_DIST	2
> >> +#define KVM_VGIC_V3_ADDR_TYPE_REDIST	3
> >> +
> >> +#define KVM_VGIC_V3_DIST_SIZE		SZ_64K
> >> +#define KVM_VGIC_V3_REDIST_SIZE		(2 * SZ_64K)
> >> +
> >>  #define KVM_ARM_VCPU_POWER_OFF		0 /* CPU is started in OFF state */
> >>  #define KVM_ARM_VCPU_EL1_32BIT		1 /* CPU running a 32bit VM */
> >>  #define KVM_ARM_VCPU_PSCI_0_2		2 /* CPU uses PSCI v0.2 */
> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >> index c303083..e2e432c 100644
> >> --- a/include/kvm/arm_vgic.h
> >> +++ b/include/kvm/arm_vgic.h
> >> @@ -35,8 +35,8 @@
> >>  #define VGIC_MAX_IRQS		1024
> >>  
> >>  /* Sanity checks... */
> >> -#if (KVM_MAX_VCPUS > 8)
> >> -#error	Invalid number of CPU interfaces
> >> +#if (KVM_MAX_VCPUS > 255)
> >> +#error Too many KVM VCPUs, the VGIC only supports up to 255 VCPUs for now
> > 
> > what happens now if you add more vcpus after having created a GICv2 with
> > 8 vcpus?
> 
> On adding a VCPU we check the number of allowed VCPUs for this
> particular guest (see arch/arm/kvm/arm.c:kvm_arch_vcpu_create() in patch
> 09/19). On creating a virtual GICv2 we set the limit to 8, so any
> KVM_VCPU_CREATE afterwards will fail.
> 
> But indeed I found other issues in this sequence of VCPU/VGIC init,
> which dissolved "magically" by the rework around (or actually the drop
> of) init_emul() and friends.
> 
I see, thanks.
-Christoffer

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation / PART 1
  2014-11-07 14:30   ` Christoffer Dall
@ 2014-11-10 17:30     ` Andre Przywara
  2014-11-11 13:48       ` Christoffer Dall
  2014-11-12 12:39     ` [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation / PART 2 Andre Przywara
  1 sibling, 1 reply; 76+ messages in thread
From: Andre Przywara @ 2014-11-10 17:30 UTC (permalink / raw)
  To: linux-arm-kernel

Hej,

I split the reply in two mails to make it more accessible and reduce
the latency.
Would it make any sense to split the patch, too? Maybe distributor /
redistri

On 07/11/14 14:30, Christoffer Dall wrote:
> On Fri, Oct 31, 2014 at 05:26:51PM +0000, Andre Przywara wrote:
>> With everything separated and prepared, we implement a model of a
>> GICv3 distributor and redistributors by using the existing framework
>> to provide handler functions for each register group.
> 
> new paragraph
> 
>> Currently we limit the emulation to a model enforcing a single
>> security state, with SRE==1 (forcing system register access) and
>> ARE==1 (allowing more than 8 VCPUs).
> 
> new paragraph
> 
>> We share some of functions provided for GICv2 emulation, but take
>> the different ways of addressing (v)CPUs into account.
>> Save and restore is currently not implemented.
>>
>> Similar to the split-off GICv2 specific code, the new emulation code
>> goes into a new file (vgic-v3-emul.c).
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  arch/arm64/kvm/Makefile            |    1 +
>>  include/kvm/arm_vgic.h             |   10 +-
>>  include/linux/irqchip/arm-gic-v3.h |   26 ++
>>  include/linux/kvm_host.h           |    1 +
>>  include/uapi/linux/kvm.h           |    2 +
>>  virt/kvm/arm/vgic-v3-emul.c        |  891 ++++++++++++++++++++++++++++++++++++
>>  virt/kvm/arm/vgic.c                |   11 +-
>>  virt/kvm/arm/vgic.h                |    3 +
>>  8 files changed, 942 insertions(+), 3 deletions(-)
>>  create mode 100644 virt/kvm/arm/vgic-v3-emul.c
>>
>> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
>> index d957353..4e6e09e 100644
>> --- a/arch/arm64/kvm/Makefile
>> +++ b/arch/arm64/kvm/Makefile
>> @@ -24,5 +24,6 @@ kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v2.o
>>  kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v2-emul.o
>>  kvm-$(CONFIG_KVM_ARM_VGIC) += vgic-v2-switch.o
>>  kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v3.o
>> +kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v3-emul.o
>>  kvm-$(CONFIG_KVM_ARM_VGIC) += vgic-v3-switch.o
>>  kvm-$(CONFIG_KVM_ARM_TIMER) += $(KVM)/arm/arch_timer.o
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index 8827bc7..c303083 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -160,7 +160,11 @@ struct vgic_dist {
>>
>>       /* Distributor and vcpu interface mapping in the guest */
>>       phys_addr_t             vgic_dist_base;
>> -     phys_addr_t             vgic_cpu_base;
>> +     /* GICv2 and GICv3 use different mapped register blocks */
>> +     union {
>> +             phys_addr_t             vgic_cpu_base;
>> +             phys_addr_t             vgic_redist_base;
>> +     };
>>
>>       /* Distributor enabled */
>>       u32                     enabled;
>> @@ -222,6 +226,9 @@ struct vgic_dist {
>>        */
>>       struct vgic_bitmap      *irq_spi_target;
>>
>> +     /* Target MPIDR for each IRQ (needed for GICv3 IROUTERn) only */
>> +     u32                     *irq_spi_mpidr;
>> +
>>       /* Bitmap indicating which CPU has something pending */
>>       unsigned long           *irq_pending_on_cpu;
>>
>> @@ -297,6 +304,7 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
>>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
>>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>                       bool level);
>> +void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>  bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>>                     struct kvm_exit_mmio *mmio);
>> diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
>> index 03a4ea3..6a649bc 100644
>> --- a/include/linux/irqchip/arm-gic-v3.h
>> +++ b/include/linux/irqchip/arm-gic-v3.h
>> @@ -33,6 +33,7 @@
>>  #define GICD_SETSPI_SR                       0x0050
>>  #define GICD_CLRSPI_SR                       0x0058
>>  #define GICD_SEIR                    0x0068
>> +#define GICD_IGROUPR                 0x0080
>>  #define GICD_ISENABLER                       0x0100
>>  #define GICD_ICENABLER                       0x0180
>>  #define GICD_ISPENDR                 0x0200
>> @@ -41,14 +42,31 @@
>>  #define GICD_ICACTIVER                       0x0380
>>  #define GICD_IPRIORITYR                      0x0400
>>  #define GICD_ICFGR                   0x0C00
>> +#define GICD_IGRPMODR                        0x0D00
>> +#define GICD_NSACR                   0x0E00
>>  #define GICD_IROUTER                 0x6000
>> +#define GICD_IDREGS                  0xFFD0
>>  #define GICD_PIDR2                   0xFFE8
>>
>> +/*
>> + * Non-ARE distributor registers, needed to provide the RES0
>> + * semantics for KVM's emulated GICv3
>> + */
> 
> huh?  I think this comment has to do a better job at explaining this, or,
> just go away.
> 
> Why are we re-defining these registers?  Is it just a coincidence that
> the offsets happen to be the same as for GICv2 so it would be
> semantically incorrect to reuse the defines, or?

The header files for GICv2 and v3 are distinct, and v3 does not include
v2. This is what we do in the backend (vgic-v2.c and vgic-v3.c), so I
repeated this here. AFAICT we cannot reuse the v2 definitions easily
other than copying them.
The comment is there because we don't implement the actual GICv2
semantics of these registers, only their RAZ/WI behaviour.
Will reword the comment to make this clearer.
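(Probably something along the lines of:

	/*
	 * GICv2-specific registers, re-defined here only so that the
	 * GICv3 emulation can treat them as RAZ/WI; their actual GICv2
	 * semantics are not emulated.
	 */

just a sketch of the wording, of course.)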

>> +#define GICD_ITARGETSR                       0x0800
>> +#define GICD_SGIR                    0x0F00
>> +#define GICD_CPENDSGIR                       0x0F10
>> +#define GICD_SPENDSGIR                       0x0F20
>> +
>> +
>>  #define GICD_CTLR_RWP                        (1U << 31)
>> +#define GICD_CTLR_DS                 (1U << 6)
>>  #define GICD_CTLR_ARE_NS             (1U << 4)
>>  #define GICD_CTLR_ENABLE_G1A         (1U << 1)
>>  #define GICD_CTLR_ENABLE_G1          (1U << 0)
>>
>> +#define GICD_TYPER_LPIS                      (1U << 17)
>> +#define GICD_TYPER_MBIS                      (1U << 16)
>> +
>>  #define GICD_IROUTER_SPI_MODE_ONE    (0U << 31)
>>  #define GICD_IROUTER_SPI_MODE_ANY    (1U << 31)
>>
>> @@ -56,6 +74,8 @@
>>  #define GIC_PIDR2_ARCH_GICv3         0x30
>>  #define GIC_PIDR2_ARCH_GICv4         0x40
>>
>> +#define GIC_V3_DIST_SIZE             0x10000
>> +
>>  /*
>>   * Re-Distributor registers, offsets from RD_base
>>   */
>> @@ -74,6 +94,7 @@
>>  #define GICR_SYNCR                   0x00C0
>>  #define GICR_MOVLPIR                 0x0100
>>  #define GICR_MOVALLR                 0x0110
>> +#define GICR_IDREGS                  GICD_IDREGS
>>  #define GICR_PIDR2                   GICD_PIDR2
>>
>>  #define GICR_WAKER_ProcessorSleep    (1U << 1)
>> @@ -82,6 +103,7 @@
>>  /*
>>   * Re-Distributor registers, offsets from SGI_base
>>   */
>> +#define GICR_IGROUPR0                        GICD_IGROUPR
>>  #define GICR_ISENABLER0                      GICD_ISENABLER
>>  #define GICR_ICENABLER0                      GICD_ICENABLER
>>  #define GICR_ISPENDR0                        GICD_ISPENDR
>> @@ -90,10 +112,14 @@
>>  #define GICR_ICACTIVER0                      GICD_ICACTIVER
>>  #define GICR_IPRIORITYR0             GICD_IPRIORITYR
>>  #define GICR_ICFGR0                  GICD_ICFGR
>> +#define GICR_IGRPMODR0                       GICD_IGRPMODR
>> +#define GICR_NSACR                   GICD_NSACR
>>
>>  #define GICR_TYPER_VLPIS             (1U << 1)
>>  #define GICR_TYPER_LAST                      (1U << 4)
>>
>> +#define GIC_V3_REDIST_SIZE           0x20000
>> +
>>  /*
>>   * CPU interface registers
>>   */
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index 326ba7a..4a7798e 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -1085,6 +1085,7 @@ void kvm_unregister_device_ops(u32 type);
>>  extern struct kvm_device_ops kvm_mpic_ops;
>>  extern struct kvm_device_ops kvm_xics_ops;
>>  extern struct kvm_device_ops kvm_arm_vgic_v2_ops;
>> +extern struct kvm_device_ops kvm_arm_vgic_v3_ops;
>>
>>  #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
>>
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index 6076882..24cb129 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -960,6 +960,8 @@ enum kvm_device_type {
>>  #define KVM_DEV_TYPE_ARM_VGIC_V2     KVM_DEV_TYPE_ARM_VGIC_V2
>>       KVM_DEV_TYPE_FLIC,
>>  #define KVM_DEV_TYPE_FLIC            KVM_DEV_TYPE_FLIC
>> +     KVM_DEV_TYPE_ARM_VGIC_V3,
>> +#define KVM_DEV_TYPE_ARM_VGIC_V3     KVM_DEV_TYPE_ARM_VGIC_V3
> 
> You need to document this device type in
> Documentation/virtual/kvm/devices/ (probably in arm-vgic.txt).
> 
> That goes for patch 19 as well, but I'll remind you when I look at that
> patch more closely.

Ah right, thanks for the pointer.

>>       KVM_DEV_TYPE_MAX,
>>  };
>>
>> diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
>> new file mode 100644
>> index 0000000..bcb5374
>> --- /dev/null
>> +++ b/virt/kvm/arm/vgic-v3-emul.c
>> @@ -0,0 +1,891 @@
>> +/*
>> + * GICv3 distributor and redistributor emulation on GICv3 hardware
>> + *
>> + * able to run on a pure native host GICv3 (which forces ARE=1)
>> + *
>> + * forcing ARE=1 and DS=1, not covering LPIs yet (TYPER.LPIS=0)
> 
> I think the above two lines require rewriting, may I suggest:

You may ... ;-)

> GICv3 emulation is currently only supported on a GICv3 host, but
> supports both hardware with or without the optional GICv2 backwards
> compatibility features.
> 
> We emulate a GICv3 without the backwards compatibility features (meaning
> the emulated GICD_CTLR.ARE resets to 1 and is RAO/WI) and with only a
> single security state (the emulated GICD_CTLR.DS=1, RAO/WI).  This
> emulated GICv3 does not yet include support for LPIs (TYPER.LPIS=0,
> RAZ/WI).
> 
> But pay particular attention to the bit about us emulating a GICv3 with
> only a single security state, because you're implementing GICD_IGROUPR
> and GICR_IGROUPR as RAZ/WI, which is then a limitation of the emulated
> GIC (just like we don't emulate priorities), which is fine, but let's
> then state that as such.
> 
>> + *
>> + * Copyright (C) 2014 ARM Ltd.
>> + * Author: Andre Przywara <andre.przywara@arm.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include <linux/cpu.h>
>> +#include <linux/kvm.h>
>> +#include <linux/kvm_host.h>
>> +#include <linux/interrupt.h>
>> +
>> +#include <linux/irqchip/arm-gic-v3.h>
>> +#include <kvm/arm_vgic.h>
>> +
>> +#include <asm/kvm_emulate.h>
>> +#include <asm/kvm_arm.h>
>> +#include <asm/kvm_mmu.h>
>> +
>> +#include "vgic.h"
>> +
>> +#define INTERRUPT_ID_BITS 10
>> +
>> +static bool handle_mmio_misc(struct kvm_vcpu *vcpu,
>> +                          struct kvm_exit_mmio *mmio, phys_addr_t offset,
>> +                          void *private)
>> +{
>> +     u32 reg = 0, val;
>> +     u32 word_offset = offset & 3;
>> +
>> +     switch (offset & ~3) {
>> +     case GICD_CTLR:
>> +             /*
>> +              * Force ARE and DS to 1, the guest cannot change this.
>> +              * For the time being we only support Group1 interrupts.
>> +              */
>> +             if (vcpu->kvm->arch.vgic.enabled)
>> +                     reg = GICD_CTLR_ENABLE_G1A;
>> +             reg |= GICD_CTLR_ARE_NS | GICD_CTLR_DS;
>> +
>> +             vgic_reg_access(mmio, &reg, word_offset,
>> +                             ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
>> +             if (mmio->is_write) {
>> +                     vcpu->kvm->arch.vgic.enabled = !!(reg & GICD_CTLR_ENABLE_G1A);
> 
>> +                     vgic_update_state(vcpu->kvm);
>> +                     return true;
>> +             }
>> +             break;
> 
> so we don't implement read-as-written for this register, should we at
> least print a warning or something if the guest tries to enable group 0
> interrupts?

Good suggestion.

>> +     case GICD_TYPER:
>> +             /*
>> +              * as this implementation does not provide compatibility
> 
>        Upper-case  ^
> 
>> +              * with GICv2 (ARE==1), we report zero CPUs in the lower 5 bits.
> 
> lower 5 bits?  You mean we report bits [7:5] as 000 right?

Right, thanks for spotting this.

> 
>> +              * Also TYPER.LPIS is 0 for now and TYPER.MBIS is not supported.
> 
> drop the 'for now' just say we report TYPER.LPIS=0 and TYPER.MBIS=0;
> because we don't support LPIs or MBIs.
> 
>> +              */
>> +
>> +             /* claim we support at most 1024 (-4) SPIs via this interface */
> 
> claim?  Does this not hold in reality?  It doesn't seem to be what the
> code does.  I'm doubting the usefulness of this comment.

I had the ITS already in mind, so nr_irqs could potentially be much higher,
but we only report up to 1024/1020 via _this interface_. Not sure if we
would actually use nr_irqs for including LPIs later, but better safe
than sorry.
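
(For reference, the arithmetic behind the ">> 5" computation below:
GICD_TYPER.ITLinesNumber encodes 32*(N+1) INTIDs, so with nr_irqs
capped at 1024 we report (1024 >> 5) - 1 = 31 in the lower 5 bits,
i.e. the architectural maximum of 1020 usable interrupt IDs via the
distributor.)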

>> +             val = min(vcpu->kvm->arch.vgic.nr_irqs, 1024);
>> +             reg |= (val >> 5) - 1;
>> +
>> +             reg |= (INTERRUPT_ID_BITS - 1) << 19;
> 
> but it happens that we have no explanation about the arbitrarily chosen
> 10 bits?

To cover the 1020 interrupts that we support at most for now. Will add a
comment about that.
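
(IDbits holds the number of interrupt identifier bits minus one, so
INTERRUPT_ID_BITS == 10 covers INTIDs 0..1023, of which 1020 are
usable -- that's where the value comes from.)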

>> +
>> +             vgic_reg_access(mmio, &reg, word_offset,
>> +                             ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
>> +             break;
>> +     case GICD_IIDR:
>> +             reg = (PRODUCT_ID_KVM << 24) | (IMPLEMENTER_ARM << 0);
>> +             vgic_reg_access(mmio, &reg, word_offset,
>> +                     ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
>> +             break;
>> +     default:
>> +             vgic_reg_access(mmio, NULL, word_offset,
>> +                             ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
>> +             break;
>> +     }
> 
> I'm getting increasingly skeptic about the value of combining these
> registers into a single misc function?

Indeed. That started with a GICv2 copy originally ...

>> +
>> +     return false;
>> +}
>> +
>> +static bool handle_mmio_set_enable_reg_dist(struct kvm_vcpu *vcpu,
>> +                                         struct kvm_exit_mmio *mmio,
>> +                                         phys_addr_t offset,
>> +                                         void *private)
>> +{
>> +     if (likely(offset >= VGIC_NR_PRIVATE_IRQS / 8))
>> +             return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
>> +                                           vcpu->vcpu_id,
>> +                                           ACCESS_WRITE_SETBIT);
>> +
>> +     vgic_reg_access(mmio, NULL, offset & 3,
>> +                     ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> 
> Somewhat general question:
> 
> This made me wonder if we check for unaligned accesses anywhere or could
> the guest get away with (offset & 3) = 2 and mmio->len = 4?  Then
> semantics for this would start being weird...

AFAIK non-naturally-aligned accesses to the GIC are not allowed and
raise an alignment exception before trapping on the MMIO access.
This is what the comment in vgic_reg_access() says.

>> +     return false;
>> +}
>> +
>> +static bool handle_mmio_clear_enable_reg_dist(struct kvm_vcpu *vcpu,
>> +                                           struct kvm_exit_mmio *mmio,
>> +                                           phys_addr_t offset,
>> +                                           void *private)
>> +{
>> +     if (likely(offset >= VGIC_NR_PRIVATE_IRQS / 8))
>> +             return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
>> +                                           vcpu->vcpu_id,
>> +                                           ACCESS_WRITE_CLEARBIT);
>> +
>> +     vgic_reg_access(mmio, NULL, offset & 3,
>> +                     ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
>> +     return false;
>> +}
>> +
>> +static bool handle_mmio_set_pending_reg_dist(struct kvm_vcpu *vcpu,
>> +                                          struct kvm_exit_mmio *mmio,
>> +                                          phys_addr_t offset,
>> +                                          void *private)
>> +{
>> +     if (likely(offset >= VGIC_NR_PRIVATE_IRQS / 8))
>> +             return vgic_handle_set_pending_reg(vcpu->kvm, mmio, offset,
>> +                                                vcpu->vcpu_id);
>> +
>> +     vgic_reg_access(mmio, NULL, offset & 3,
>> +                     ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
>> +     return false;
>> +}
>> +
>> +static bool handle_mmio_clear_pending_reg_dist(struct kvm_vcpu *vcpu,
>> +                                            struct kvm_exit_mmio *mmio,
>> +                                            phys_addr_t offset,
>> +                                            void *private)
>> +{
>> +     if (likely(offset >= VGIC_NR_PRIVATE_IRQS / 8))
>> +             return vgic_handle_clear_pending_reg(vcpu->kvm, mmio, offset,
>> +                                                  vcpu->vcpu_id);
>> +
>> +     vgic_reg_access(mmio, NULL, offset & 3,
>> +                     ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
>> +     return false;
>> +}
>> +
>> +static bool handle_mmio_priority_reg_dist(struct kvm_vcpu *vcpu,
>> +                                       struct kvm_exit_mmio *mmio,
>> +                                       phys_addr_t offset,
>> +                                       void *private)
>> +{
>> +     u32 *reg;
>> +
>> +     if (unlikely(offset < VGIC_NR_PRIVATE_IRQS)) {
>> +             vgic_reg_access(mmio, NULL, offset & 3,
> 
> Just noticed, you don't need to mask off the upper bits all these places, do you?
> 
> I think it should be consistent with what we do in the v2 emulation.

Right, that's done in vgic_reg_access(). Will remove that.

> The only place you may need to do that is in the handle_mmio_misc function.
> 
>> +                             ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
>> +             return false;
>> +     }
>> +
>> +     reg = vgic_bytemap_get_reg(&vcpu->kvm->arch.vgic.irq_priority,
>> +                                vcpu->vcpu_id, offset);
>> +     vgic_reg_access(mmio, reg, offset,
>> +             ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
>> +     return false;
>> +}
>> +
>> +static bool handle_mmio_cfg_reg_dist(struct kvm_vcpu *vcpu,
>> +                                  struct kvm_exit_mmio *mmio,
>> +                                  phys_addr_t offset,
>> +                                  void *private)
>> +{
>> +     u32 *reg;
>> +
>> +     if (unlikely(offset < VGIC_NR_PRIVATE_IRQS / 4)) {
>> +             vgic_reg_access(mmio, NULL, offset & 3,
>> +                             ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
>> +             return false;
>> +     }
>> +
>> +     reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_cfg,
>> +                               vcpu->vcpu_id, offset >> 1);
>> +
>> +     return vgic_handle_cfg_reg(reg, mmio, offset);
>> +}
>> +
>> +static u32 compress_mpidr(unsigned long mpidr)
> 
> can you comment on this function which format it returns and which
> context that's useful in?

Yes.

>> +{
>> +     u32 ret;
>> +
>> +     ret = MPIDR_AFFINITY_LEVEL(mpidr, 0);
>> +     ret |= MPIDR_AFFINITY_LEVEL(mpidr, 1) << 8;
>> +     ret |= MPIDR_AFFINITY_LEVEL(mpidr, 2) << 16;
>> +     ret |= MPIDR_AFFINITY_LEVEL(mpidr, 3) << 24;
>> +
>> +     return ret;
>> +}
>> +
>> +static unsigned long uncompress_mpidr(u32 value)
>> +{
>> +     unsigned long mpidr;
>> +
>> +     mpidr = ((value >> 0) & 0xFF) << MPIDR_LEVEL_SHIFT(0);
>> +     mpidr |= ((value >> 8) & 0xFF) << MPIDR_LEVEL_SHIFT(1);
>> +     mpidr |= ((value >> 16) & 0xFF) << MPIDR_LEVEL_SHIFT(2);
>> +     mpidr |= (u64)((value >> 24) & 0xFF) << MPIDR_LEVEL_SHIFT(3);
>> +
>> +     return mpidr;
>> +}
>> +
>> +/*
>> + * Lookup the given MPIDR value to get the vcpu_id (if there is one)
>> + * and store that in the irq_spi_cpu[] array.
>> + * This limits the number of VCPUs to 255 for now, extending the data
>> + * type (or storing kvm_vcpu poiners) should lift the limit.
>> + * Store the original MPIDR value in an extra array.
> 
> why?  To maintain read-as-written?

Yes. Marc mentioned some use case, I think it was about hot-(un)plugging
CPUs.

>> + * Unallocated MPIDRs are translated to a special value and catched
> 
> s/catched/caught/
> 
>> + * before any array accesses.
>> + */
>> +static bool handle_mmio_route_reg(struct kvm_vcpu *vcpu,
>> +                               struct kvm_exit_mmio *mmio,
>> +                               phys_addr_t offset, void *private)
>> +{
>> +     struct kvm *kvm = vcpu->kvm;
>> +     struct vgic_dist *dist = &kvm->arch.vgic;
>> +     int irq;
>> +     u32 reg;
>> +     int vcpu_id;
>> +     unsigned long *bmap, mpidr;
>> +     u32 word_offset = offset & 3;
>> +
>> +     /*
>> +      * Private interrupts cannot be re-routed, so this register
>> +      * is RES0 for any IRQ < 32.
>> +      * Also the upper 32 bits of each 64 bit register are zero,
>> +      * as we don't support Aff3 and that's the only value up there.
> 
> drop the rest of the sentence after Aff3.
> 
>> +      */
>> +     if (unlikely(offset < VGIC_NR_PRIVATE_IRQS * 8) || (offset & 4) == 4) {
> 
> you don't need the '== 4' part.
> 
>> +             vgic_reg_access(mmio, NULL, word_offset,
>> +                             ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
>> +             return false;
>> +     }
>> +
>> +     irq = (offset / 8) - VGIC_NR_PRIVATE_IRQS;
> 
> can we not call this irq? spi instead maybe?

Yes (to all four comments above).

>> +
>> +     /* get the stored MPIDR for this IRQ */
>> +     mpidr = uncompress_mpidr(dist->irq_spi_mpidr[irq]);
>> +     mpidr &= MPIDR_HWID_BITMASK;
>> +     reg = mpidr;
>> +
>> +     vgic_reg_access(mmio, &reg, word_offset,
>> +                     ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
>> +
>> +     if (!mmio->is_write)
>> +             return false;
>> +
>> +     /*
>> +      * Now clear the currently assigned vCPU from the map, making room
>> +      * for the new one to be written below
>> +      */
>> +     vcpu = kvm_mpidr_to_vcpu(kvm, mpidr);
>> +     if (likely(vcpu)) {
>> +             vcpu_id = vcpu->vcpu_id;
>> +             bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[vcpu_id]);
>> +             clear_bit(irq, bmap);
> 
> this is the atomic version, right?  is it known to be faster on arm64
> because it's written in assembly and that's why we're using it instead
> of __clear_bit?

No, because I was unaware of the difference when writing this and was
assuming that the non-underscore version is the canonical one.
Inside the vgic lock, __clear_bit and __set_bit are safe, right?
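
In other words (just a sketch of what I mean, assuming the MMIO
handlers really do run with the distributor lock held; spi, old_id and
new_id are placeholder names):

	spin_lock(&dist->lock);
	/* the lock serialises updates, so non-atomic helpers suffice */
	__clear_bit(spi, vgic_bitmap_get_shared_map(&dist->irq_spi_target[old_id]));
	__set_bit(spi, vgic_bitmap_get_shared_map(&dist->irq_spi_target[new_id]));
	spin_unlock(&dist->lock);

	/* the atomic set_bit()/clear_bit() would only be needed if these
	 * bitmaps were also modified outside that lock */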

>> +     }
>> +
>> +     dist->irq_spi_mpidr[irq] = compress_mpidr(reg);
>> +     vcpu = kvm_mpidr_to_vcpu(kvm, reg & MPIDR_HWID_BITMASK);
>> +
>> +     /*
>> +      * The spec says that non-existent MPIDR values should not be
>> +      * forwarded to any existent (v)CPU, but should be able to become
>> +      * pending anyway. We simply keep the irq_spi_target[] array empty, so
>> +      * the interrupt will never be injected.
>> +      * irq_spi_cpu[irq] gets a magic value in this case.
>> +      */
>> +     if (likely(vcpu)) {
>> +             vcpu_id = vcpu->vcpu_id;
>> +             dist->irq_spi_cpu[irq] = vcpu_id;
>> +             bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[vcpu_id]);
>> +             set_bit(irq, bmap);
> 
> __set_bit ?
> 
>> +     } else
>> +             dist->irq_spi_cpu[irq] = VCPU_NOT_ALLOCATED;
> 
> according to the CodingStyle (and me) this wants braces.
> 
>> +
>> +     vgic_update_state(kvm);
>> +
>> +     return true;
>> +}
>> +
>> +static bool handle_mmio_idregs(struct kvm_vcpu *vcpu,
>> +                            struct kvm_exit_mmio *mmio,
>> +                            phys_addr_t offset, void *private)
>> +{
>> +     u32 reg = 0;
>> +
>> +     switch (offset + GICD_IDREGS) {
>> +     case GICD_PIDR2:
>> +             reg = 0x3b;
>> +             break;
>> +     }
>> +
>> +     vgic_reg_access(mmio, &reg, offset & 3,
>> +                     ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
>> +
>> +     return false;
>> +}
>> +
>> +static const struct mmio_range vgic_dist_ranges[] = {
> 
> can we call this vgic_v3_dist_ranges ?

If that doesn't break the 80-character limit somewhere: Yes! ;-)

>> +     {       /*
>> +              * handling CTLR, TYPER, IIDR and STATUSR
>> +              */
> 
> this one doesn't need wings (and you're not doing that below)
> 
>> +             .base           = GICD_CTLR,
>> +             .len            = 20,
> 
> nit: why do we specify this len as decimal and the others in hex?
> 
>> +             .bits_per_irq   = 0,
>> +             .handle_mmio    = handle_mmio_misc,
>> +     },
> 
> are we not mentioning the status register here because it's optional?

We will be mentioning this fact in a comment ...

>> +     {
>> +             /* when DS=1, this is RAZ/WI */
>> +             .base           = GICD_SETSPI_SR,
>> +             .len            = 0x04,
>> +             .bits_per_irq   = 0,
>> +             .handle_mmio    = handle_mmio_raz_wi,
>> +     },
>> +     {
>> +             /* when DS=1, this is RAZ/WI */
>> +             .base           = GICD_CLRSPI_SR,
>> +             .len            = 0x04,
>> +             .bits_per_irq   = 0,
>> +             .handle_mmio    = handle_mmio_raz_wi,
>> +     },
> 
> why are we only listing the _SR versions and not the _NSR versions?

Section 5.3.18 of the GICv3 spec states that this register is RAZ/WI if
GICD_CTLR.DS is one. As we do not implement MSIs from the guest, we omit
the _NSR registers from this list to provoke an MMIO error when they are
used (we have GICD_TYPER.MBIS == 0). The spec does not mention what
happens on an access in this case, though. Will check back with Marc.

>> +     {
>> +             .base           = GICD_IGROUPR,
>> +             .len            = 0x80,
>> +             .bits_per_irq   = 1,
>> +             .handle_mmio    = handle_mmio_raz_wi,
>> +     },
> 
> this one may warrant a TODO: Group 0 interrupts not yet supported.

Sure.

>> +     {
>> +             .base           = GICD_ISENABLER,
>> +             .len            = 0x80,
>> +             .bits_per_irq   = 1,
>> +             .handle_mmio    = handle_mmio_set_enable_reg_dist,
>> +     },
>> +     {
>> +             .base           = GICD_ICENABLER,
>> +             .len            = 0x80,
>> +             .bits_per_irq   = 1,
>> +             .handle_mmio    = handle_mmio_clear_enable_reg_dist,
>> +     },
>> +     {
>> +             .base           = GICD_ISPENDR,
>> +             .len            = 0x80,
>> +             .bits_per_irq   = 1,
>> +             .handle_mmio    = handle_mmio_set_pending_reg_dist,
>> +     },
>> +     {
>> +             .base           = GICD_ICPENDR,
>> +             .len            = 0x80,
>> +             .bits_per_irq   = 1,
>> +             .handle_mmio    = handle_mmio_clear_pending_reg_dist,
>> +     },
>> +     {
>> +             .base           = GICD_ISACTIVER,
>> +             .len            = 0x80,
>> +             .bits_per_irq   = 1,
>> +             .handle_mmio    = handle_mmio_raz_wi,
>> +     },
>> +     {
>> +             .base           = GICD_ICACTIVER,
>> +             .len            = 0x80,
>> +             .bits_per_irq   = 1,
>> +             .handle_mmio    = handle_mmio_raz_wi,
>> +     },
>> +     {
>> +             .base           = GICD_IPRIORITYR,
>> +             .len            = 0x400,
>> +             .bits_per_irq   = 8,
>> +             .handle_mmio    = handle_mmio_priority_reg_dist,
>> +     },
>> +     {
>> +             /* TARGETSRn is RES0 when ARE=1 */
>> +             .base           = GICD_ITARGETSR,
>> +             .len            = 0x400,
>> +             .bits_per_irq   = 8,
>> +             .handle_mmio    = handle_mmio_raz_wi,
>> +     },
>> +     {
>> +             .base           = GICD_ICFGR,
>> +             .len            = 0x100,
>> +             .bits_per_irq   = 2,
>> +             .handle_mmio    = handle_mmio_cfg_reg_dist,
>> +     },
>> +     {
>> +             /* this is RAZ/WI when DS=1 */
>> +             .base           = GICD_IGRPMODR,
>> +             .len            = 0x80,
>> +             .bits_per_irq   = 1,
>> +             .handle_mmio    = handle_mmio_raz_wi,
>> +     },
>> +     {
>> +             /* with DS==1 this is RAZ/WI */
> 
> any reason why the two comments above are not identical?  (I know, I
> have OCD).

That was left in to check whether you would actually read the middle of
the patch ;-)

Which tempts me to split up the reply here. Will answer the rest of the
comments in a second mail.

Hej Hej,
Andre

P.S.: now you witnessed about 20% of my Danish, the rest is about
various food items you can buy in Marielyst ;-)

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation / PART 1
  2014-11-10 17:30     ` [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation / PART 1 Andre Przywara
@ 2014-11-11 13:48       ` Christoffer Dall
  0 siblings, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-11 13:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 10, 2014 at 05:30:11PM +0000, Andre Przywara wrote:
> Hej,
> 
> I split the reply in two mails to make it more accessible and reduce
> the latency.
> Would it make any sense to split the patch, too? Maybe distributor /
> redistri

It wouldn't have hurt to do that from the start, but I think I'd
recommend keeping the split as it is now to make it easier to follow up
on review comments etc.

> 
> On 07/11/14 14:30, Christoffer Dall wrote:
> > On Fri, Oct 31, 2014 at 05:26:51PM +0000, Andre Przywara wrote:
> >> With everything separated and prepared, we implement a model of a
> >> GICv3 distributor and redistributors by using the existing framework
> >> to provide handler functions for each register group.
> > 
> > new paragraph
> > 
> >> Currently we limit the emulation to a model enforcing a single
> >> security state, with SRE==1 (forcing system register access) and
> >> ARE==1 (allowing more than 8 VCPUs).
> > 
> > new paragraph
> > 
> >> We share some of the functions provided for GICv2 emulation, but take
> >> the different ways of addressing (v)CPUs into account.
> >> Save and restore is currently not implemented.
> >>
> >> Similar to the split-off GICv2 specific code, the new emulation code
> >> goes into a new file (vgic-v3-emul.c).
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >> ---
> >>  arch/arm64/kvm/Makefile            |    1 +
> >>  include/kvm/arm_vgic.h             |   10 +-
> >>  include/linux/irqchip/arm-gic-v3.h |   26 ++
> >>  include/linux/kvm_host.h           |    1 +
> >>  include/uapi/linux/kvm.h           |    2 +
> >>  virt/kvm/arm/vgic-v3-emul.c        |  891 ++++++++++++++++++++++++++++++++++++
> >>  virt/kvm/arm/vgic.c                |   11 +-
> >>  virt/kvm/arm/vgic.h                |    3 +
> >>  8 files changed, 942 insertions(+), 3 deletions(-)
> >>  create mode 100644 virt/kvm/arm/vgic-v3-emul.c
> >>
> >> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
> >> index d957353..4e6e09e 100644
> >> --- a/arch/arm64/kvm/Makefile
> >> +++ b/arch/arm64/kvm/Makefile
> >> @@ -24,5 +24,6 @@ kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v2.o
> >>  kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v2-emul.o
> >>  kvm-$(CONFIG_KVM_ARM_VGIC) += vgic-v2-switch.o
> >>  kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v3.o
> >> +kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v3-emul.o
> >>  kvm-$(CONFIG_KVM_ARM_VGIC) += vgic-v3-switch.o
> >>  kvm-$(CONFIG_KVM_ARM_TIMER) += $(KVM)/arm/arch_timer.o
> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >> index 8827bc7..c303083 100644
> >> --- a/include/kvm/arm_vgic.h
> >> +++ b/include/kvm/arm_vgic.h
> >> @@ -160,7 +160,11 @@ struct vgic_dist {
> >>
> >>       /* Distributor and vcpu interface mapping in the guest */
> >>       phys_addr_t             vgic_dist_base;
> >> -     phys_addr_t             vgic_cpu_base;
> >> +     /* GICv2 and GICv3 use different mapped register blocks */
> >> +     union {
> >> +             phys_addr_t             vgic_cpu_base;
> >> +             phys_addr_t             vgic_redist_base;
> >> +     };
> >>
> >>       /* Distributor enabled */
> >>       u32                     enabled;
> >> @@ -222,6 +226,9 @@ struct vgic_dist {
> >>        */
> >>       struct vgic_bitmap      *irq_spi_target;
> >>
> >> +     /* Target MPIDR for each IRQ (needed for GICv3 IROUTERn) only */
> >> +     u32                     *irq_spi_mpidr;
> >> +
> >>       /* Bitmap indicating which CPU has something pending */
> >>       unsigned long           *irq_pending_on_cpu;
> >>
> >> @@ -297,6 +304,7 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
> >>  void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
> >>  int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
> >>                       bool level);
> >> +void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
> >>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
> >>  bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >>                     struct kvm_exit_mmio *mmio);
> >> diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
> >> index 03a4ea3..6a649bc 100644
> >> --- a/include/linux/irqchip/arm-gic-v3.h
> >> +++ b/include/linux/irqchip/arm-gic-v3.h
> >> @@ -33,6 +33,7 @@
> >>  #define GICD_SETSPI_SR                       0x0050
> >>  #define GICD_CLRSPI_SR                       0x0058
> >>  #define GICD_SEIR                    0x0068
> >> +#define GICD_IGROUPR                 0x0080
> >>  #define GICD_ISENABLER                       0x0100
> >>  #define GICD_ICENABLER                       0x0180
> >>  #define GICD_ISPENDR                 0x0200
> >> @@ -41,14 +42,31 @@
> >>  #define GICD_ICACTIVER                       0x0380
> >>  #define GICD_IPRIORITYR                      0x0400
> >>  #define GICD_ICFGR                   0x0C00
> >> +#define GICD_IGRPMODR                        0x0D00
> >> +#define GICD_NSACR                   0x0E00
> >>  #define GICD_IROUTER                 0x6000
> >> +#define GICD_IDREGS                  0xFFD0
> >>  #define GICD_PIDR2                   0xFFE8
> >>
> >> +/*
> >> + * Non-ARE distributor registers, needed to provide the RES0
> >> + * semantics for KVM's emulated GICv3
> >> + */
> > 
> > huh?  I think this comment as to do a better job at explaining this, or,
> > just go away.
> > 
> > Why are we re-defining these registers?  Is it just a conincidence that
> > the offsets happen to be the same as for GICv2 so it would be
> > semantically incorrect to reuse the defines, or?
> 
> The header files for GICv2 and v3 are distinct, and v3 does not include
> v2. This is what we do in the backend (vgic-v2.c and vgic-v3.c), so I
> repeated this here. AFAICT we cannot reuse the v2 definitions easily
> other than copying them.
> The comment is there because we don't implement the actual GICv2
> semantics of these registers, but just the RAZ/WI one.
> Will reword the comment to make this more clear.
> 
> >> +#define GICD_ITARGETSR                       0x0800
> >> +#define GICD_SGIR                    0x0F00
> >> +#define GICD_CPENDSGIR                       0x0F10
> >> +#define GICD_SPENDSGIR                       0x0F20
> >> +
> >> +
> >>  #define GICD_CTLR_RWP                        (1U << 31)
> >> +#define GICD_CTLR_DS                 (1U << 6)
> >>  #define GICD_CTLR_ARE_NS             (1U << 4)
> >>  #define GICD_CTLR_ENABLE_G1A         (1U << 1)
> >>  #define GICD_CTLR_ENABLE_G1          (1U << 0)
> >>
> >> +#define GICD_TYPER_LPIS                      (1U << 17)
> >> +#define GICD_TYPER_MBIS                      (1U << 16)
> >> +
> >>  #define GICD_IROUTER_SPI_MODE_ONE    (0U << 31)
> >>  #define GICD_IROUTER_SPI_MODE_ANY    (1U << 31)
> >>
> >> @@ -56,6 +74,8 @@
> >>  #define GIC_PIDR2_ARCH_GICv3         0x30
> >>  #define GIC_PIDR2_ARCH_GICv4         0x40
> >>
> >> +#define GIC_V3_DIST_SIZE             0x10000
> >> +
> >>  /*
> >>   * Re-Distributor registers, offsets from RD_base
> >>   */
> >> @@ -74,6 +94,7 @@
> >>  #define GICR_SYNCR                   0x00C0
> >>  #define GICR_MOVLPIR                 0x0100
> >>  #define GICR_MOVALLR                 0x0110
> >> +#define GICR_IDREGS                  GICD_IDREGS
> >>  #define GICR_PIDR2                   GICD_PIDR2
> >>
> >>  #define GICR_WAKER_ProcessorSleep    (1U << 1)
> >> @@ -82,6 +103,7 @@
> >>  /*
> >>   * Re-Distributor registers, offsets from SGI_base
> >>   */
> >> +#define GICR_IGROUPR0                        GICD_IGROUPR
> >>  #define GICR_ISENABLER0                      GICD_ISENABLER
> >>  #define GICR_ICENABLER0                      GICD_ICENABLER
> >>  #define GICR_ISPENDR0                        GICD_ISPENDR
> >> @@ -90,10 +112,14 @@
> >>  #define GICR_ICACTIVER0                      GICD_ICACTIVER
> >>  #define GICR_IPRIORITYR0             GICD_IPRIORITYR
> >>  #define GICR_ICFGR0                  GICD_ICFGR
> >> +#define GICR_IGRPMODR0                       GICD_IGRPMODR
> >> +#define GICR_NSACR                   GICD_NSACR
> >>
> >>  #define GICR_TYPER_VLPIS             (1U << 1)
> >>  #define GICR_TYPER_LAST                      (1U << 4)
> >>
> >> +#define GIC_V3_REDIST_SIZE           0x20000
> >> +
> >>  /*
> >>   * CPU interface registers
> >>   */
> >> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> >> index 326ba7a..4a7798e 100644
> >> --- a/include/linux/kvm_host.h
> >> +++ b/include/linux/kvm_host.h
> >> @@ -1085,6 +1085,7 @@ void kvm_unregister_device_ops(u32 type);
> >>  extern struct kvm_device_ops kvm_mpic_ops;
> >>  extern struct kvm_device_ops kvm_xics_ops;
> >>  extern struct kvm_device_ops kvm_arm_vgic_v2_ops;
> >> +extern struct kvm_device_ops kvm_arm_vgic_v3_ops;
> >>
> >>  #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
> >>
> >> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> >> index 6076882..24cb129 100644
> >> --- a/include/uapi/linux/kvm.h
> >> +++ b/include/uapi/linux/kvm.h
> >> @@ -960,6 +960,8 @@ enum kvm_device_type {
> >>  #define KVM_DEV_TYPE_ARM_VGIC_V2     KVM_DEV_TYPE_ARM_VGIC_V2
> >>       KVM_DEV_TYPE_FLIC,
> >>  #define KVM_DEV_TYPE_FLIC            KVM_DEV_TYPE_FLIC
> >> +     KVM_DEV_TYPE_ARM_VGIC_V3,
> >> +#define KVM_DEV_TYPE_ARM_VGIC_V3     KVM_DEV_TYPE_ARM_VGIC_V3
> > 
> > You need to document this device type in
> > Documentation/virtual/kvm/devices/ (probably in arm-vgic.txt).
> > 
> > That goes for patch 19 as well, but I'll remind you when I look at that
> > patch more closely.
> 
> Ah right, thanks for the pointer.
> 
> >>       KVM_DEV_TYPE_MAX,
> >>  };
> >>
> >> diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
> >> new file mode 100644
> >> index 0000000..bcb5374
> >> --- /dev/null
> >> +++ b/virt/kvm/arm/vgic-v3-emul.c
> >> @@ -0,0 +1,891 @@
> >> +/*
> >> + * GICv3 distributor and redistributor emulation on GICv3 hardware
> >> + *
> >> + * able to run on a pure native host GICv3 (which forces ARE=1)
> >> + *
> >> + * forcing ARE=1 and DS=1, not covering LPIs yet (TYPER.LPIS=0)
> > 
> > I think the above two lines require rewriting, may I suggest:
> 
> You may ... ;-)
> 
> > GICv3 emulation is currently only supported on a GICv3 host, but
> > supports both hardware with or without the optional GICv2 backwards
> > compatibility features.
> > 
> > We emulate a GICv3 without the backwards compatibility features (meaning
> > the emulated GICD_CTLR.ARE resets to 1 and is RAO/WI) and with only a
> > single security state (the emulated GICD_CTLR.DS=1, RAO/WI).  This
> > emulated GICv3 does not yet include support for LPIs (TYPER.LIPS=0,
> > RAZ/WI).
> > 
> > But pay particular attention to the bit about us emulating a GICv3 with
> > only a single security state, because you're implementing GICD_IGROUPR
> > and GICR_IGROUPR as RAZ/WI, which is then a limitation of the emulated
> > GIC (just like we don't emulate priorities), which is fine, but let's
> > then state that as such.
> > 
> >> + *
> >> + * Copyright (C) 2014 ARM Ltd.
> >> + * Author: Andre Przywara <andre.przywara@arm.com>
> >> + *
> >> + * This program is free software; you can redistribute it and/or modify
> >> + * it under the terms of the GNU General Public License version 2 as
> >> + * published by the Free Software Foundation.
> >> + *
> >> + * This program is distributed in the hope that it will be useful,
> >> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >> + * GNU General Public License for more details.
> >> + *
> >> + * You should have received a copy of the GNU General Public License
> >> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> >> + */
> >> +
> >> +#include <linux/cpu.h>
> >> +#include <linux/kvm.h>
> >> +#include <linux/kvm_host.h>
> >> +#include <linux/interrupt.h>
> >> +
> >> +#include <linux/irqchip/arm-gic-v3.h>
> >> +#include <kvm/arm_vgic.h>
> >> +
> >> +#include <asm/kvm_emulate.h>
> >> +#include <asm/kvm_arm.h>
> >> +#include <asm/kvm_mmu.h>
> >> +
> >> +#include "vgic.h"
> >> +
> >> +#define INTERRUPT_ID_BITS 10
> >> +
> >> +static bool handle_mmio_misc(struct kvm_vcpu *vcpu,
> >> +                          struct kvm_exit_mmio *mmio, phys_addr_t offset,
> >> +                          void *private)
> >> +{
> >> +     u32 reg = 0, val;
> >> +     u32 word_offset = offset & 3;
> >> +
> >> +     switch (offset & ~3) {
> >> +     case GICD_CTLR:
> >> +             /*
> >> +              * Force ARE and DS to 1, the guest cannot change this.
> >> +              * For the time being we only support Group1 interrupts.
> >> +              */
> >> +             if (vcpu->kvm->arch.vgic.enabled)
> >> +                     reg = GICD_CTLR_ENABLE_G1A;
> >> +             reg |= GICD_CTLR_ARE_NS | GICD_CTLR_DS;
> >> +
> >> +             vgic_reg_access(mmio, &reg, word_offset,
> >> +                             ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
> >> +             if (mmio->is_write) {
> >> +                     vcpu->kvm->arch.vgic.enabled = !!(reg & GICD_CTLR_ENABLE_G1A);
> > 
> >> +                     vgic_update_state(vcpu->kvm);
> >> +                     return true;
> >> +             }
> >> +             break;
> > 
> > so we don't implement read-as-written for this register, should we at
> > least print a warning or something if the guest tries to enable group 0
> > interrupts?
> 
> Good suggestion.
> 
> >> +     case GICD_TYPER:
> >> +             /*
> >> +              * as this implementation does not provide compatibility
> > 
> >        Upper-case  ^
> > 
> >> +              * with GICv2 (ARE==1), we report zero CPUs in the lower 5 bits.
> > 
> > lower 5 bits?  You mean we report bits [7:5] as 000 right?
> 
> Right, thanks for spotting this.
> 
> > 
> >> +              * Also TYPER.LPIS is 0 for now and TYPER.MBIS is not supported.
> > 
> > drop the 'for now' just say we report TYPER.LPIS=0 and TYPER.MBIS=0;
> > because we don't support LBIs or MBIs.
> > 
> >> +              */
> >> +
> >> +             /* claim we support at most 1024 (-4) SPIs via this interface */
> > 
> > claim?  Does this not hold in reality?  It doesn't seem to be what the
> > code does.  I'm doubting the usefulnes of this comment.
> 
> I had ITS already in mind, so nr_irqs could be potentially much higher,
> but we only report up to 1024/1020 via _this interface_. Not sure if we
> would actually use nr_irqs for including LPIs later, but better safe
> than sorry.

that should probably be clarified, the wording of the comment suggests
a hack, or some uncertainty.

> 
> >> +             val = min(vcpu->kvm->arch.vgic.nr_irqs, 1024);
> >> +             reg |= (val >> 5) - 1;
> >> +
> >> +             reg |= (INTERRUPT_ID_BITS - 1) << 19;
> > 
> > but it happens that we have no explanation about the arbitrarily chosen
> > 10 bits?
> 
> To cover the 1020 interrupts that we support at most for now. Will add a
> comment about that.
> 

ah, didn't realize.  or maybe I did, don't remember.
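
For the comment, maybe just spell it out next to the computation, something
along these lines (only a rough sketch, tweak the wording as you see fit):

	/*
	 * ITLinesNumber: we advertise up to 1024 interrupt IDs through
	 * this interface, encoded as 32 * (N + 1), hence (val / 32) - 1.
	 */
	val = min(vcpu->kvm->arch.vgic.nr_irqs, 1024);
	reg |= (val >> 5) - 1;

	/*
	 * IDbits: 10 bits of interrupt ID are enough for the at most
	 * 1020 SGIs/PPIs/SPIs we can expose here (no LPIs yet).
	 */
	reg |= (INTERRUPT_ID_BITS - 1) << 19;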

> >> +
> >> +             vgic_reg_access(mmio, &reg, word_offset,
> >> +                             ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
> >> +             break;
> >> +     case GICD_IIDR:
> >> +             reg = (PRODUCT_ID_KVM << 24) | (IMPLEMENTER_ARM << 0);
> >> +             vgic_reg_access(mmio, &reg, word_offset,
> >> +                     ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
> >> +             break;
> >> +     default:
> >> +             vgic_reg_access(mmio, NULL, word_offset,
> >> +                             ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> >> +             break;
> >> +     }
> > 
> > I'm getting increasingly skeptic about the value of combining these
> > registers into a single misc function?
> 
> Indeed. That started with a GICv2 copy originally ...
> 
> >> +
> >> +     return false;
> >> +}
> >> +
> >> +static bool handle_mmio_set_enable_reg_dist(struct kvm_vcpu *vcpu,
> >> +                                         struct kvm_exit_mmio *mmio,
> >> +                                         phys_addr_t offset,
> >> +                                         void *private)
> >> +{
> >> +     if (likely(offset >= VGIC_NR_PRIVATE_IRQS / 8))
> >> +             return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
> >> +                                           vcpu->vcpu_id,
> >> +                                           ACCESS_WRITE_SETBIT);
> >> +
> >> +     vgic_reg_access(mmio, NULL, offset & 3,
> >> +                     ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> > 
> > Somewhat general question:
> > 
> > This made me wonder if we check for unaligned accesses anywhere or could
> > the guest get away with (offset & 3) = 2 and mmio->len = 4?  Then
> > semantics for this would start being weird...
> 
> AFAIK non-naturally aligned accesses to the GIC are not allowed and
> raise an alignment exception before trapping on the MMIO access.
> This is what the comment in vgic_reg_access() says.
> 

right, that's where we guarantee it, I got to that conclusion when
reviewing gicv3 host support, but forgot about it again.  Sorry for
making you chase it down.

> >> +     return false;
> >> +}
> >> +
> >> +static bool handle_mmio_clear_enable_reg_dist(struct kvm_vcpu *vcpu,
> >> +                                           struct kvm_exit_mmio *mmio,
> >> +                                           phys_addr_t offset,
> >> +                                           void *private)
> >> +{
> >> +     if (likely(offset >= VGIC_NR_PRIVATE_IRQS / 8))
> >> +             return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
> >> +                                           vcpu->vcpu_id,
> >> +                                           ACCESS_WRITE_CLEARBIT);
> >> +
> >> +     vgic_reg_access(mmio, NULL, offset & 3,
> >> +                     ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> >> +     return false;
> >> +}
> >> +
> >> +static bool handle_mmio_set_pending_reg_dist(struct kvm_vcpu *vcpu,
> >> +                                          struct kvm_exit_mmio *mmio,
> >> +                                          phys_addr_t offset,
> >> +                                          void *private)
> >> +{
> >> +     if (likely(offset >= VGIC_NR_PRIVATE_IRQS / 8))
> >> +             return vgic_handle_set_pending_reg(vcpu->kvm, mmio, offset,
> >> +                                                vcpu->vcpu_id);
> >> +
> >> +     vgic_reg_access(mmio, NULL, offset & 3,
> >> +                     ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> >> +     return false;
> >> +}
> >> +
> >> +static bool handle_mmio_clear_pending_reg_dist(struct kvm_vcpu *vcpu,
> >> +                                            struct kvm_exit_mmio *mmio,
> >> +                                            phys_addr_t offset,
> >> +                                            void *private)
> >> +{
> >> +     if (likely(offset >= VGIC_NR_PRIVATE_IRQS / 8))
> >> +             return vgic_handle_clear_pending_reg(vcpu->kvm, mmio, offset,
> >> +                                                  vcpu->vcpu_id);
> >> +
> >> +     vgic_reg_access(mmio, NULL, offset & 3,
> >> +                     ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> >> +     return false;
> >> +}
> >> +
> >> +static bool handle_mmio_priority_reg_dist(struct kvm_vcpu *vcpu,
> >> +                                       struct kvm_exit_mmio *mmio,
> >> +                                       phys_addr_t offset,
> >> +                                       void *private)
> >> +{
> >> +     u32 *reg;
> >> +
> >> +     if (unlikely(offset < VGIC_NR_PRIVATE_IRQS)) {
> >> +             vgic_reg_access(mmio, NULL, offset & 3,
> > 
> > Just noticed, you don't need to mask off the upper bits all these places, do you?
> > 
> > I think it should be consistent with what we do in the v2 emulation.
> 
> Right, that's done in vgic_reg_access(). Will remove that.
> 
> > The only place you may need to do that is in the handle_mmio_misc function.
> > 
> >> +                             ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> >> +             return false;
> >> +     }
> >> +
> >> +     reg = vgic_bytemap_get_reg(&vcpu->kvm->arch.vgic.irq_priority,
> >> +                                vcpu->vcpu_id, offset);
> >> +     vgic_reg_access(mmio, reg, offset,
> >> +             ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
> >> +     return false;
> >> +}
> >> +
> >> +static bool handle_mmio_cfg_reg_dist(struct kvm_vcpu *vcpu,
> >> +                                  struct kvm_exit_mmio *mmio,
> >> +                                  phys_addr_t offset,
> >> +                                  void *private)
> >> +{
> >> +     u32 *reg;
> >> +
> >> +     if (unlikely(offset < VGIC_NR_PRIVATE_IRQS / 4)) {
> >> +             vgic_reg_access(mmio, NULL, offset & 3,
> >> +                             ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> >> +             return false;
> >> +     }
> >> +
> >> +     reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_cfg,
> >> +                               vcpu->vcpu_id, offset >> 1);
> >> +
> >> +     return vgic_handle_cfg_reg(reg, mmio, offset);
> >> +}
> >> +
> >> +static u32 compress_mpidr(unsigned long mpidr)
> > 
> > can you comment on this function which format it returns and which
> > context that's useful in?
> 
> Yes.
> 
> >> +{
> >> +     u32 ret;
> >> +
> >> +     ret = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> >> +     ret |= MPIDR_AFFINITY_LEVEL(mpidr, 1) << 8;
> >> +     ret |= MPIDR_AFFINITY_LEVEL(mpidr, 2) << 16;
> >> +     ret |= MPIDR_AFFINITY_LEVEL(mpidr, 3) << 24;
> >> +
> >> +     return ret;
> >> +}
> >> +
> >> +static unsigned long uncompress_mpidr(u32 value)
> >> +{
> >> +     unsigned long mpidr;
> >> +
> >> +     mpidr = ((value >> 0) & 0xFF) << MPIDR_LEVEL_SHIFT(0);
> >> +     mpidr |= ((value >> 8) & 0xFF) << MPIDR_LEVEL_SHIFT(1);
> >> +     mpidr |= ((value >> 16) & 0xFF) << MPIDR_LEVEL_SHIFT(2);
> >> +     mpidr |= (u64)((value >> 24) & 0xFF) << MPIDR_LEVEL_SHIFT(3);
> >> +
> >> +     return mpidr;
> >> +}
> >> +
> >> +/*
> >> + * Lookup the given MPIDR value to get the vcpu_id (if there is one)
> >> + * and store that in the irq_spi_cpu[] array.
> >> + * This limits the number of VCPUs to 255 for now, extending the data
> >> + * type (or storing kvm_vcpu pointers) should lift the limit.
> >> + * Store the original MPIDR value in an extra array.
> > 
> > why?  To maintain read-as-written?
> 
> Yes. Marc mentioned some use case, I think it was about hot-(un)plugging
> CPUs.
> 

I thought I figured out there was some reason why we couldn't just
construct the MPIDR based on the vcpu_id when reading back the value,
but now I can't remember.  We should probably document why we're doing
this.
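
Maybe a short comment above the array would do, something like (wording is
just a suggestion):

	/*
	 * GICD_IROUTERn must read back as written, even if the guest
	 * programs an MPIDR that matches no vCPU, so we cannot simply
	 * reconstruct the value from irq_spi_cpu[].
	 */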

> >> + * Unallocated MPIDRs are translated to a special value and catched
> > 
> > s/catched/caught/
> > 
> >> + * before any array accesses.
> >> + */
> >> +static bool handle_mmio_route_reg(struct kvm_vcpu *vcpu,
> >> +                               struct kvm_exit_mmio *mmio,
> >> +                               phys_addr_t offset, void *private)
> >> +{
> >> +     struct kvm *kvm = vcpu->kvm;
> >> +     struct vgic_dist *dist = &kvm->arch.vgic;
> >> +     int irq;
> >> +     u32 reg;
> >> +     int vcpu_id;
> >> +     unsigned long *bmap, mpidr;
> >> +     u32 word_offset = offset & 3;
> >> +
> >> +     /*
> >> +      * Private interrupts cannot be re-routed, so this register
> >> +      * is RES0 for any IRQ < 32.
> >> +      * Also the upper 32 bits of each 64 bit register are zero,
> >> +      * as we don't support Aff3 and that's the only value up there.
> > 
> > drop the rest of the sentence after Aff3.
> > 
> >> +      */
> >> +     if (unlikely(offset < VGIC_NR_PRIVATE_IRQS * 8) || (offset & 4) == 4) {
> > 
> > you don't need the '== 4' part.
> > 
> >> +             vgic_reg_access(mmio, NULL, word_offset,
> >> +                             ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> >> +             return false;
> >> +     }
> >> +
> >> +     irq = (offset / 8) - VGIC_NR_PRIVATE_IRQS;
> > 
> > can we not call this irq? spi instead maybe?
> 
> Yes (to all four comments above).
> 
> >> +
> >> +     /* get the stored MPIDR for this IRQ */
> >> +     mpidr = uncompress_mpidr(dist->irq_spi_mpidr[irq]);
> >> +     mpidr &= MPIDR_HWID_BITMASK;
> >> +     reg = mpidr;
> >> +
> >> +     vgic_reg_access(mmio, &reg, word_offset,
> >> +                     ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
> >> +
> >> +     if (!mmio->is_write)
> >> +             return false;
> >> +
> >> +     /*
> >> +      * Now clear the currently assigned vCPU from the map, making room
> >> +      * for the new one to be written below
> >> +      */
> >> +     vcpu = kvm_mpidr_to_vcpu(kvm, mpidr);
> >> +     if (likely(vcpu)) {
> >> +             vcpu_id = vcpu->vcpu_id;
> >> +             bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[vcpu_id]);
> >> +             clear_bit(irq, bmap);
> > 
> > this is the atomic version, right?  is it known to be faster on arm64
> > because it's written in assembly and that's why we're using it instead
> > of __clear_bit?
> 
> No, because I was unaware of the difference when writing this and was
> assuming that the non-underscore version is the canonical one.
> Inside the vgic lock __clear_bit and __set_bit are safe, right?
> 

yes (we're doing an awful lot holding those spinlocks, so we should
probably look at the contention on those some time).

> >> +     }
> >> +
> >> +     dist->irq_spi_mpidr[irq] = compress_mpidr(reg);
> >> +     vcpu = kvm_mpidr_to_vcpu(kvm, reg & MPIDR_HWID_BITMASK);
> >> +
> >> +     /*
> >> +      * The spec says that non-existent MPIDR values should not be
> >> +      * forwarded to any existent (v)CPU, but should be able to become
> >> +      * pending anyway. We simply keep the irq_spi_target[] array empty, so
> >> +      * the interrupt will never be injected.
> >> +      * irq_spi_cpu[irq] gets a magic value in this case.
> >> +      */
> >> +     if (likely(vcpu)) {
> >> +             vcpu_id = vcpu->vcpu_id;
> >> +             dist->irq_spi_cpu[irq] = vcpu_id;
> >> +             bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[vcpu_id]);
> >> +             set_bit(irq, bmap);
> > 
> > __set_bit ?
> > 
> >> +     } else
> >> +             dist->irq_spi_cpu[irq] = VCPU_NOT_ALLOCATED;
> > 
> > according to the CodingStyle (and me) this wants braces.
> > 
> >> +
> >> +     vgic_update_state(kvm);
> >> +
> >> +     return true;
> >> +}
> >> +
> >> +static bool handle_mmio_idregs(struct kvm_vcpu *vcpu,
> >> +                            struct kvm_exit_mmio *mmio,
> >> +                            phys_addr_t offset, void *private)
> >> +{
> >> +     u32 reg = 0;
> >> +
> >> +     switch (offset + GICD_IDREGS) {
> >> +     case GICD_PIDR2:
> >> +             reg = 0x3b;
> >> +             break;
> >> +     }
> >> +
> >> +     vgic_reg_access(mmio, &reg, offset & 3,
> >> +                     ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
> >> +
> >> +     return false;
> >> +}
> >> +
> >> +static const struct mmio_range vgic_dist_ranges[] = {
> > 
> > can we call this vgic_v3_dist_ranges ?
> 
> If that doesn't break the 80 characters limit somewhere: Yes! ;-)
> 
> >> +     {       /*
> >> +              * handling CTLR, TYPER, IIDR and STATUSR
> >> +              */
> > 
> > this one doesn't need wings (and you're not doing that below)
> > 
> >> +             .base           = GICD_CTLR,
> >> +             .len            = 20,
> > 
> > nit: why do we specify this len as decimal and the others in hex?
> > 
> >> +             .bits_per_irq   = 0,
> >> +             .handle_mmio    = handle_mmio_misc,
> >> +     },
> > 
> > are we not mentioning the status register here because it's optional?
> 
> We will be mentioning this fact in a comment ...
> 
> >> +     {
> >> +             /* when DS=1, this is RAZ/WI */
> >> +             .base           = GICD_SETSPI_SR,
> >> +             .len            = 0x04,
> >> +             .bits_per_irq   = 0,
> >> +             .handle_mmio    = handle_mmio_raz_wi,
> >> +     },
> >> +     {
> >> +             /* when DS=1, this is RAZ/WI */
> >> +             .base           = GICD_CLRSPI_SR,
> >> +             .len            = 0x04,
> >> +             .bits_per_irq   = 0,
> >> +             .handle_mmio    = handle_mmio_raz_wi,
> >> +     },
> > 
> > why are we only listing the _SR versions and not the _NSR versions?
> 
> Section 5.3.18 of the GICv3 spec states that this register is RAZ/WI if
> GICD_CTLR.DS is one. As we do not implement MSIs from the guest, we omit
> the _NSR registers from this list to provoke a MMIO error when they are
> used (we have GICD_TYPER.MBIS == 0). The spec does not mention what
> happens on an access in this case, though. Will check back with Marc.

I can't find this, it seems to me the NSR versions should be properly
implemented and the SR versions are write-only and should generate some
kind of error when being read, or?

> 
> >> +     {
> >> +             .base           = GICD_IGROUPR,
> >> +             .len            = 0x80,
> >> +             .bits_per_irq   = 1,
> >> +             .handle_mmio    = handle_mmio_raz_wi,
> >> +     },
> > 
> > this one may warrant a TODO: Group 0 interrupts not yet supported.
> 
> Sure.
> 
> >> +     {
> >> +             .base           = GICD_ISENABLER,
> >> +             .len            = 0x80,
> >> +             .bits_per_irq   = 1,
> >> +             .handle_mmio    = handle_mmio_set_enable_reg_dist,
> >> +     },
> >> +     {
> >> +             .base           = GICD_ICENABLER,
> >> +             .len            = 0x80,
> >> +             .bits_per_irq   = 1,
> >> +             .handle_mmio    = handle_mmio_clear_enable_reg_dist,
> >> +     },
> >> +     {
> >> +             .base           = GICD_ISPENDR,
> >> +             .len            = 0x80,
> >> +             .bits_per_irq   = 1,
> >> +             .handle_mmio    = handle_mmio_set_pending_reg_dist,
> >> +     },
> >> +     {
> >> +             .base           = GICD_ICPENDR,
> >> +             .len            = 0x80,
> >> +             .bits_per_irq   = 1,
> >> +             .handle_mmio    = handle_mmio_clear_pending_reg_dist,
> >> +     },
> >> +     {
> >> +             .base           = GICD_ISACTIVER,
> >> +             .len            = 0x80,
> >> +             .bits_per_irq   = 1,
> >> +             .handle_mmio    = handle_mmio_raz_wi,
> >> +     },
> >> +     {
> >> +             .base           = GICD_ICACTIVER,
> >> +             .len            = 0x80,
> >> +             .bits_per_irq   = 1,
> >> +             .handle_mmio    = handle_mmio_raz_wi,
> >> +     },
> >> +     {
> >> +             .base           = GICD_IPRIORITYR,
> >> +             .len            = 0x400,
> >> +             .bits_per_irq   = 8,
> >> +             .handle_mmio    = handle_mmio_priority_reg_dist,
> >> +     },
> >> +     {
> >> +             /* TARGETSRn is RES0 when ARE=1 */
> >> +             .base           = GICD_ITARGETSR,
> >> +             .len            = 0x400,
> >> +             .bits_per_irq   = 8,
> >> +             .handle_mmio    = handle_mmio_raz_wi,
> >> +     },
> >> +     {
> >> +             .base           = GICD_ICFGR,
> >> +             .len            = 0x100,
> >> +             .bits_per_irq   = 2,
> >> +             .handle_mmio    = handle_mmio_cfg_reg_dist,
> >> +     },
> >> +     {
> >> +             /* this is RAZ/WI when DS=1 */
> >> +             .base           = GICD_IGRPMODR,
> >> +             .len            = 0x80,
> >> +             .bits_per_irq   = 1,
> >> +             .handle_mmio    = handle_mmio_raz_wi,
> >> +     },
> >> +     {
> >> +             /* with DS==1 this is RAZ/WI */
> > 
> > any reason why the two comments above are not identical?  (I know, I
> > have OCD).
> 
> That was left in to check whether you would actually read the middle of
> the patch ;-)
> 
> Which tempts me to split up the reply here. Will answer the rest of the
> comments in a second mail.
> 
> Hej Hej,
> Andre
> 
> P.S.: now you witnessed about 20% of my Danish, the rest is about
> various food items you can buy in Marielyst ;-)

Still impressed, thanks!
-Christoffer

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation / PART 2
  2014-11-07 14:30   ` Christoffer Dall
  2014-11-10 17:30     ` [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation / PART 1 Andre Przywara
@ 2014-11-12 12:39     ` Andre Przywara
  2014-11-12 19:51       ` Christoffer Dall
  2014-11-13 11:18       ` Christoffer Dall
  1 sibling, 2 replies; 76+ messages in thread
From: Andre Przywara @ 2014-11-12 12:39 UTC (permalink / raw)
  To: linux-arm-kernel

Hej Christoffer,

the promised part 2 of the reply:

On 07/11/14 14:30, Christoffer Dall wrote:
> On Fri, Oct 31, 2014 at 05:26:51PM +0000, Andre Przywara wrote:
>> With everything separated and prepared, we implement a model of a
>> GICv3 distributor and redistributors by using the existing framework
>> to provide handler functions for each register group.

[...]

>> +
>> +static const struct mmio_range vgic_dist_ranges[] = {

[...]

>> +     /* the next three blocks are RES0 if ARE=1 */
> 
> probably nicer to just have a comment for each register where this
> applies.

Done.

> 
>> +     {
>> +             .base           = GICD_SGIR,
>> +             .len            = 4,
>> +             .handle_mmio    = handle_mmio_raz_wi,
>> +     },
>> +     {
>> +             .base           = GICD_CPENDSGIR,
>> +             .len            = 0x10,
>> +             .handle_mmio    = handle_mmio_raz_wi,
>> +     },
>> +     {
>> +             .base           = GICD_SPENDSGIR,
>> +             .len            = 0x10,
>> +             .handle_mmio    = handle_mmio_raz_wi,
>> +     },
>> +     {
>> +             .base           = GICD_IROUTER,
>> +             .len            = 0x2000,
> 
> shouldn't this be 0x1ee0?

The limit of 0x7FD8 in the spec seems to come from 1020 - 32 SPIs.
However all the other registers always claim 1024 IRQs supported (with
non-implemented SPIs being RAZ/WI anyway).
So I wonder if this is just an inconsistency in the spec.
Marc, can you comment?

And we cover the 32 private IRQs also with this function (spec demands
RES0 for those), this is handled in handle_mmio_route_reg().

So I tend to leave this at 8KB, as this is what the spec talks about in
section 5.3.4.
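
(For the arithmetic: GICD_IROUTER<n> lives at 0x6000 + 8 * n, so the spec's
upper limit of 0x7FD8 corresponds to INTID 1019; a full 0x2000 window covers
INTIDs 0..1023, while 0x1ee0 is (1020 - 32) * 8, i.e. only the 988 SPI
entries.)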

>> +             .bits_per_irq   = 64,
>> +             .handle_mmio    = handle_mmio_route_reg,
>> +     },
>> +     {
>> +             .base           = GICD_IDREGS,
>> +             .len            = 0x30,
>> +             .bits_per_irq   = 0,
>> +             .handle_mmio    = handle_mmio_idregs,
>> +     },
>> +     {},
>> +};
>> +
>> +static bool handle_mmio_set_enable_reg_redist(struct kvm_vcpu *vcpu,
>> +                                           struct kvm_exit_mmio *mmio,
>> +                                           phys_addr_t offset,
>> +                                           void *private)
>> +{
>> +     struct kvm_vcpu *target_redist_vcpu = private;
>> +
>> +     return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
>> +                                   target_redist_vcpu->vcpu_id,
>> +                                   ACCESS_WRITE_SETBIT);
>> +}
>> +
>> +static bool handle_mmio_clear_enable_reg_redist(struct kvm_vcpu *vcpu,
>> +                                             struct kvm_exit_mmio *mmio,
>> +                                             phys_addr_t offset,
>> +                                             void *private)
>> +{
>> +     struct kvm_vcpu *target_redist_vcpu = private;
>> +
>> +     return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
>> +                                   target_redist_vcpu->vcpu_id,
>> +                                   ACCESS_WRITE_CLEARBIT);
>> +}
>> +
>> +static bool handle_mmio_set_pending_reg_redist(struct kvm_vcpu *vcpu,
>> +                                            struct kvm_exit_mmio *mmio,
>> +                                            phys_addr_t offset,
>> +                                            void *private)
>> +{
>> +     struct kvm_vcpu *target_redist_vcpu = private;
>> +
>> +     return vgic_handle_set_pending_reg(vcpu->kvm, mmio, offset,
>> +                                        target_redist_vcpu->vcpu_id);
>> +}
>> +
>> +static bool handle_mmio_clear_pending_reg_redist(struct kvm_vcpu *vcpu,
>> +                                              struct kvm_exit_mmio *mmio,
>> +                                              phys_addr_t offset,
>> +                                              void *private)
>> +{
>> +     struct kvm_vcpu *target_redist_vcpu = private;
>> +
>> +     return vgic_handle_clear_pending_reg(vcpu->kvm, mmio, offset,
>> +                                          target_redist_vcpu->vcpu_id);
>> +}
>> +
>> +static bool handle_mmio_priority_reg_redist(struct kvm_vcpu *vcpu,
>> +                                         struct kvm_exit_mmio *mmio,
>> +                                         phys_addr_t offset,
>> +                                         void *private)
>> +{
>> +     struct kvm_vcpu *target_redist_vcpu = private;
>> +     u32 *reg;
>> +
>> +     reg = vgic_bytemap_get_reg(&vcpu->kvm->arch.vgic.irq_priority,
>> +                                target_redist_vcpu->vcpu_id, offset);
>> +     vgic_reg_access(mmio, reg, offset,
>> +                     ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
>> +     return false;
>> +}
>> +
>> +static bool handle_mmio_cfg_reg_redist(struct kvm_vcpu *vcpu,
>> +                                    struct kvm_exit_mmio *mmio,
>> +                                    phys_addr_t offset,
>> +                                    void *private)
>> +{
>> +     u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_cfg,
>> +                                    *(int *)private, offset >> 1);
>> +
>> +     return vgic_handle_cfg_reg(reg, mmio, offset);
>> +}
>> +
>> +static const struct mmio_range vgic_redist_sgi_ranges[] = {
>> +     {
>> +             .base           = GICR_IGROUPR0,
>> +             .len            = 4,
>> +             .bits_per_irq   = 1,
>> +             .handle_mmio    = handle_mmio_raz_wi,
> 
> shouldn't these RAO/WI instead?

Mmmh, looks like it. I added a simple handle_mmio_rao_wi()
implementation for this.
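
It is basically the RAZ/WI handler with all bits set on reads; just a sketch
(modulo the final naming in v4):

static bool handle_mmio_rao_wi(struct kvm_vcpu *vcpu,
			       struct kvm_exit_mmio *mmio,
			       phys_addr_t offset, void *private)
{
	u32 reg = 0xffffffff;

	/* reads return all ones, writes are ignored */
	vgic_reg_access(mmio, &reg, offset,
			ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);

	return false;
}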

>> +     },
>> +     {
>> +             .base           = GICR_ISENABLER0,
>> +             .len            = 4,
>> +             .bits_per_irq   = 1,
>> +             .handle_mmio    = handle_mmio_set_enable_reg_redist,
>> +     },
>> +     {
>> +             .base           = GICR_ICENABLER0,
>> +             .len            = 4,
>> +             .bits_per_irq   = 1,
>> +             .handle_mmio    = handle_mmio_clear_enable_reg_redist,
>> +     },
>> +     {
>> +             .base           = GICR_ISPENDR0,
>> +             .len            = 4,
>> +             .bits_per_irq   = 1,
>> +             .handle_mmio    = handle_mmio_set_pending_reg_redist,
>> +     },
>> +     {
>> +             .base           = GICR_ICPENDR0,
>> +             .len            = 4,
>> +             .bits_per_irq   = 1,
>> +             .handle_mmio    = handle_mmio_clear_pending_reg_redist,
>> +     },
>> +     {
>> +             .base           = GICR_ISACTIVER0,
>> +             .len            = 4,
>> +             .bits_per_irq   = 1,
>> +             .handle_mmio    = handle_mmio_raz_wi,
>> +     },
>> +     {
>> +             .base           = GICR_ICACTIVER0,
>> +             .len            = 4,
>> +             .bits_per_irq   = 1,
>> +             .handle_mmio    = handle_mmio_raz_wi,
>> +     },
>> +     {
>> +             .base           = GICR_IPRIORITYR0,
>> +             .len            = 32,
>> +             .bits_per_irq   = 8,
>> +             .handle_mmio    = handle_mmio_priority_reg_redist,
>> +     },
>> +     {
>> +             .base           = GICR_ICFGR0,
>> +             .len            = 8,
>> +             .bits_per_irq   = 2,
>> +             .handle_mmio    = handle_mmio_cfg_reg_redist,
>> +     },
>> +     {
>> +             .base           = GICR_IGRPMODR0,
>> +             .len            = 4,
>> +             .bits_per_irq   = 1,
>> +             .handle_mmio    = handle_mmio_raz_wi,
>> +     },
>> +     {
>> +             .base           = GICR_NSACR,
>> +             .len            = 4,
>> +             .handle_mmio    = handle_mmio_raz_wi,
>> +     },
>> +     {},
>> +};
>> +
>> +static bool handle_mmio_misc_redist(struct kvm_vcpu *vcpu,
>> +                                 struct kvm_exit_mmio *mmio,
>> +                                 phys_addr_t offset, void *private)
>> +{
>> +     u32 reg;
>> +     u32 word_offset = offset & 3;
>> +     u64 mpidr;
>> +     struct kvm_vcpu *target_redist_vcpu = private;
>> +     int target_vcpu_id = target_redist_vcpu->vcpu_id;
>> +
>> +     switch (offset & ~3) {
>> +     case GICR_CTLR:
>> +             /* since we don't support LPIs, this register is zero for now */
>> +             vgic_reg_access(mmio, &reg, word_offset,
>> +                             ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
>> +             break;
>> +     case GICR_TYPER + 4:
>> +             mpidr = kvm_vcpu_get_mpidr(target_redist_vcpu);
>> +             reg = compress_mpidr(mpidr);
>> +
>> +             vgic_reg_access(mmio, &reg, word_offset,
>> +                             ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
>> +             break;
>> +     case GICR_TYPER:
>> +             reg = target_redist_vcpu->vcpu_id << 8;
>> +             if (target_vcpu_id == atomic_read(&vcpu->kvm->online_vcpus) - 1)
>> +                     reg |= GICR_TYPER_LAST;
>> +             vgic_reg_access(mmio, &reg, word_offset,
>> +                             ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
>> +             break;
>> +     case GICR_IIDR:
>> +             reg = (PRODUCT_ID_KVM << 24) | (IMPLEMENTER_ARM << 0);
>> +             vgic_reg_access(mmio, &reg, word_offset,
>> +                     ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
>> +             break;
> 
> the fact that you could reuse handle_mmio_iidr directly here and that
> GICR_TYPER reads funny here, indicates to me that we should once again
> split this up into smaller functions.

Yeah, done that. Looks indeed better now.
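
For instance the GICR_TYPER read is now a little handler of its own, roughly
like this (sketch only):

static bool handle_mmio_typer_redist(struct kvm_vcpu *vcpu,
				     struct kvm_exit_mmio *mmio,
				     phys_addr_t offset, void *private)
{
	struct kvm_vcpu *redist_vcpu = private;
	u32 reg;

	/* Processor_Number, plus the Last bit on the final redistributor */
	reg = redist_vcpu->vcpu_id << 8;
	if (redist_vcpu->vcpu_id == atomic_read(&vcpu->kvm->online_vcpus) - 1)
		reg |= GICR_TYPER_LAST;

	vgic_reg_access(mmio, &reg, offset & 3,
			ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);

	return false;
}

and the IIDR case just reuses the distributor's IIDR handler, as you suggested.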

>> +     default:
>> +             vgic_reg_access(mmio, NULL, word_offset,
>> +                             ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
>> +             break;
>> +     }
>> +
>> +     return false;
>> +}
>> +
>> +static const struct mmio_range vgic_redist_ranges[] = {
>> +     {       /*
>> +              * handling CTLR, IIDR, TYPER and STATUSR
>> +              */
>> +             .base           = GICR_CTLR,
>> +             .len            = 20,
>> +             .bits_per_irq   = 0,
>> +             .handle_mmio    = handle_mmio_misc_redist,
>> +     },
>> +     {
>> +             .base           = GICR_WAKER,
>> +             .len            = 4,
>> +             .bits_per_irq   = 0,
>> +             .handle_mmio    = handle_mmio_raz_wi,
>> +     },
>> +     {
>> +             .base           = GICR_IDREGS,
>> +             .len            = 0x30,
>> +             .bits_per_irq   = 0,
>> +             .handle_mmio    = handle_mmio_idregs,
>> +     },
>> +     {},
>> +};
>> +
>> +/*
>> + * this is the stub handling both dist and redist MMIO exits for v3
>       This
> 
> Is this really a stub?
> 
> I would suggest spelling out distributor and re-distributor and GICv3.
> Full stop after GICv3.
> 
>> + * does some vcpu_id calculation on the redist MMIO to use a possibly
>> + * different VCPU than the current one
> 
> "some vcpu_id calculation" is not very helpful, either explain the magic
> sauce, or just say in which way a "different" VCPU is something we need
> to pay special attention to.
> 
> If I read the code correctly, the comment shoudl simply be:
> 
> The GICv3 spec allows any CPU to access any redistributor through the
> memory-mapped redistributor registers.  We can therefore determine which
> reditributor is being accesses by simply looking at the faulting IPA.
> 

Yeah, admittedly this comment was total crap. Changed it to something
closer to yours.

>> + */
>> +static bool vgic_v3_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>> +                             struct kvm_exit_mmio *mmio)
>> +{
>> +     struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>> +     unsigned long dbase = dist->vgic_dist_base;
>> +     unsigned long rdbase = dist->vgic_redist_base;
> 
> I'm not crazy about these 'shortcuts', especially given that RD_base is
> the base of a specific redistributor, but ok.

Well, I change rdbase below, so at least this one has to stay as a variable.

>> +     int nrcpus = atomic_read(&vcpu->kvm->online_vcpus);
>> +     int vcpu_id;
>> +     struct kvm_vcpu *target_redist_vcpu;
>> +
>> +     if (is_in_range(mmio->phys_addr, mmio->len, dbase, GIC_V3_DIST_SIZE)) {
>> +             return vgic_handle_mmio_range(vcpu, run, mmio,
>> +                                           vgic_dist_ranges, dbase, NULL);
>> +     }
>> +
>> +     if (!is_in_range(mmio->phys_addr, mmio->len, rdbase,
>> +         GIC_V3_REDIST_SIZE * nrcpus))
>> +             return false;
> 
> so this implies that all redistributors will always be in contiguous IPA
> space, is this reasonable?

As far as I read the spec, this is mandated there. And as the "GIC
implementors" we define that anyway, right?

>> +
>> +     vcpu_id = (mmio->phys_addr - rdbase) / GIC_V3_REDIST_SIZE;
>> +     rdbase += (vcpu_id * GIC_V3_REDIST_SIZE);
>> +     target_redist_vcpu = kvm_get_vcpu(vcpu->kvm, vcpu_id);
> 
> redist_vcpu should be enough

fixed.

>> +
>> +     if (mmio->phys_addr >= rdbase + 0x10000)
>> +             return vgic_handle_mmio_range(vcpu, run, mmio,
>> +                                           vgic_redist_sgi_ranges,
>> +                                           rdbase + 0x10000,
>> +                                           target_redist_vcpu);
> 
> 0x10000 magic number used twice,  GICV3_REDIST_SGI_PAGE_OFFSET or
> something shorter.

Done that.

> perhaps it is nicer to just adjust rdbase and set a range variable above
> and only have a single call to vgic_handle_mmio_range().

Yup.
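
With a "const struct mmio_range *ranges" declared at the top, the tail of
vgic_v3_handle_mmio() now reads roughly like this (sketch, modulo the final
name of the SGI page offset define):

	vcpu_id = (mmio->phys_addr - rdbase) / GIC_V3_REDIST_SIZE;
	rdbase += vcpu_id * GIC_V3_REDIST_SIZE;
	redist_vcpu = kvm_get_vcpu(vcpu->kvm, vcpu_id);

	if (mmio->phys_addr >= rdbase + GICV3_REDIST_SGI_PAGE_OFFSET) {
		rdbase += GICV3_REDIST_SGI_PAGE_OFFSET;
		ranges = vgic_redist_sgi_ranges;
	} else {
		ranges = vgic_redist_ranges;
	}

	return vgic_handle_mmio_range(vcpu, run, mmio, ranges, rdbase,
				      redist_vcpu);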

>> +
>> +     return vgic_handle_mmio_range(vcpu, run, mmio, vgic_redist_ranges,
>> +                                   rdbase, target_redist_vcpu);
>> +}
>> +
>> +static bool vgic_v3_queue_sgi(struct kvm_vcpu *vcpu, int irq)
>> +{
>> +     if (vgic_queue_irq(vcpu, 0, irq)) {
>> +             vgic_dist_irq_clear_pending(vcpu, irq);
>> +             vgic_cpu_irq_clear(vcpu, irq);
>> +             return true;
>> +     }
>> +
>> +     return false;
>> +}
>> +
>> +static int vgic_v3_init_maps(struct vgic_dist *dist)
>> +{
>> +     int nr_spis = dist->nr_irqs - VGIC_NR_PRIVATE_IRQS;
>> +
>> +     dist->irq_spi_mpidr = kcalloc(nr_spis, sizeof(dist->irq_spi_mpidr[0]),
>> +                                   GFP_KERNEL);
>> +
>> +     if (!dist->irq_spi_mpidr)
>> +             return -ENOMEM;
>> +
>> +     return 0;
>> +}
>> +
>> +static int vgic_v3_init(struct kvm *kvm, const struct vgic_params *params)
>> +{
>> +     struct vgic_dist *dist = &kvm->arch.vgic;
>> +     int ret, i;
>> +     u32 mpidr;
>> +
>> +     if (IS_VGIC_ADDR_UNDEF(dist->vgic_dist_base) ||
>> +         IS_VGIC_ADDR_UNDEF(dist->vgic_redist_base)) {
>> +             kvm_err("Need to set vgic distributor addresses first\n");
>> +             return -ENXIO;
>> +     }
>> +
>> +     /*
>> +      * FIXME: this should be moved to init_maps time, and may bite
>> +      * us when adding save/restore. Add a per-emulation hook?
>> +      */
> 
> What is the plan for this?  Can we move it into init_maps or does that
> require some more work?

This comment is from Marc, when he once rebased these patches on top of
his rebased and reworked vgic_dyn patches.
Looks like I have to take a closer look at this now ...

> Why can't we do what the gicv2 emulation does?
> 
> Not sure what the "Add a per-emulation hook?" question is asking...

The point is that this allocation is guest GIC model dependent.
Per-emulation hook means to differentiate between the possible guest
model code by using a function pointer.
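
So the rough idea (sketch only) would be to add something like

	int (*init_maps)(struct kvm *kvm);

to the per-VM vm_ops and have the common init path call it:

	if (dist->vm_ops.init_maps)
		ret = dist->vm_ops.init_maps(kvm);

so that only the GICv3 variant allocates irq_spi_mpidr[] there.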

>> +     ret = vgic_v3_init_maps(dist);
>> +     if (ret) {
>> +             kvm_err("Unable to allocate maps\n");
>> +             return ret;
>> +     }
>> +
>> +     mpidr = compress_mpidr(kvm_vcpu_get_mpidr(kvm_get_vcpu(kvm, 0)));
>> +     for (i = VGIC_NR_PRIVATE_IRQS; i < dist->nr_irqs; i++) {
>> +             dist->irq_spi_cpu[i - VGIC_NR_PRIVATE_IRQS] = 0;
>> +             dist->irq_spi_mpidr[i - VGIC_NR_PRIVATE_IRQS] = mpidr;
>> +             vgic_bitmap_set_irq_val(dist->irq_spi_target, 0, i, 1);
> 
> why do we need 3 different copies of the same value now?  ok, we had two
> before because of the bitmap "optimization" thingy, but now we have two
> other sets of state for the same thing...

Mmmh, we use irq_spi_cpu[] and irq_spi_target[] to be able to reuse the
existing code. irq_spi_mpidr[] is just there to allow read-as-written.

>> +     }
>> +
>> +     return 0;
>> +}
>> +
>> +static void vgic_v3_add_sgi_source(struct kvm_vcpu *vcpu, int irq, int source)
>> +{
> 
> can you put a one line comment here:
> 
> /* The GICv3 spec does away with keeping track of SGI sources */

Sure.

>> +}
>> +
>> +bool vgic_v3_init_emulation_ops(struct kvm *kvm, int type)
>> +{
>> +     struct vgic_dist *dist = &kvm->arch.vgic;
>> +
>> +     switch (type) {
>> +     case KVM_DEV_TYPE_ARM_VGIC_V3:
>> +             dist->vm_ops.handle_mmio = vgic_v3_handle_mmio;
>> +             dist->vm_ops.queue_sgi = vgic_v3_queue_sgi;
>> +             dist->vm_ops.add_sgi_source = vgic_v3_add_sgi_source;
>> +             dist->vm_ops.vgic_init = vgic_v3_init;
>> +             break;
>> +     default:
>> +             return false;
>> +     }
>> +     return true;
>> +}
>> +
>> +/*
>> + * triggered by a system register access trap, called from the sysregs
> 
>       Triggered
> 
>> + * handling code there.
> 
>                     ^^^ there, where, here, and everywhere ?
> 
>> + * The register contains the upper three affinity levels of the target
> 
>           ^^^ which register?  @reg ?
> 
>> + * processors as well as a bitmask of 16 Aff0 CPUs.
> 
> Does @reg follow the format from something in the spec?  That would be
> useful to know...
> 
>> + * Iterate over all VCPUs to check for matching ones or signal on
>> + * all-but-self if the mode bit is set.
> 
> an all-but-self IPI?  Is that the architectural term?  Otherwise I would
> suggest something like:  If not VCPUs are found which match reg (in some
> way), then send the IPI to all VCPUs in the VM, except the one
> performing the system register acces.

I totally reworked the comment. Admittedly this was more targeted at
Marc ;-)

>> + */
> 
> Also, please use the kdocs format here like the rest of the kvm/arm code.
> Begin sentences with upper-case, etc.:
> 
> /**
> * vgic_v3_dispatch_sgi - This function does something with SGIs
> * @vcpu: The vcpu pointer
> * @reg: Magic
> *
> * Some nicer version of what you have above.
> */
> 
>> +void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg)
>> +{
> 
> It's a bit hard to review this when I cannot see how it is called, I'm
> assuming that this is on writes to ICC_SGI1R_EL1 and reg is what the
> guest tried to write to that register.
> 
> I have a feeling that you may want to add this function in a separate patch.

I think I had it that way before and there was some other issue with
this split-up. Will give it a try again.

>> +     struct kvm *kvm = vcpu->kvm;
>> +     struct kvm_vcpu *c_vcpu;
>> +     struct vgic_dist *dist = &kvm->arch.vgic;
>> +     u16 target_cpus;
>> +     u64 mpidr, mpidr_h, mpidr_l;
>> +     int sgi, mode, c, vcpu_id;
>> +     int updated = 0;
>> +
>> +     vcpu_id = vcpu->vcpu_id;
>> +
>> +     sgi = (reg >> 24) & 0xf;
>> +     mode = (reg >> 40) & 0x1;
> 
> perhaps we can call this 'targeted' or something to make it a bit more
> clear.

I use broadcast now, that is even more readable in the code below.

>> +     target_cpus = reg & 0xffff;
> 
> Can you add some defines for these magic shifts?  Are there not some
> already for the GICv3 host driver we can reuse?

No, the host driver uses magic shift values directly.
I added appropriate defines in arm-gic-v3.h and use them in both places
now. It is a bit messy though, since the names tend to get quite long
and I needed to define wrapper macros to make it not totally unreadable.
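
For reference, the new defines look roughly like this (sketch, the final
names may still change):

#define ICC_SGI1R_TARGET_LIST_MASK	0xffff
#define ICC_SGI1R_AFFINITY_1_SHIFT	16
#define ICC_SGI1R_SGI_ID_SHIFT		24
#define ICC_SGI1R_AFFINITY_2_SHIFT	32
#define ICC_SGI1R_IRQ_ROUTING_MODE_BIT	40
#define ICC_SGI1R_AFFINITY_3_SHIFT	48

plus small wrapper macros to extract the three affinity fields from the
register value.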

>> +     mpidr = ((reg >> 48) & 0xff) << MPIDR_LEVEL_SHIFT(3);
>> +     mpidr |= ((reg >> 32) & 0xff) << MPIDR_LEVEL_SHIFT(2);
>> +     mpidr |= ((reg >> 16) & 0xff) << MPIDR_LEVEL_SHIFT(1);
>> +     mpidr &= ~MPIDR_LEVEL_MASK;
> 
> (**) note the comment a few lines down.
> 
>> +
>> +     /*
>> +      * We take the dist lock here, because we come from the sysregs
>> +      * code path and not from MMIO (where this is already done)
> 
>                                         which already takes the lock).
> 
>> +      */
>> +     spin_lock(&dist->lock);
>> +     kvm_for_each_vcpu(c, c_vcpu, kvm) {
> 
> I think it would be helpful to document this loop, something like:
> 
> We loop through every possible vCPU and check if we need to send an SGI
> to that vCPU.  If targeting specific vCPUS, we check if the candidate
> vCPU is in the target list and if it is, we send an SGI and clear the
> bit in the target list.  When the target list is empty and we are
> targeting specific vCPUs, we are done.
> 
> Maybe too verbose, you can tweak it as you like.
> 
>> +             if (!mode && target_cpus == 0)
>> +                     break;
>> +             if (mode && c == vcpu_id)       /* not to myself */
>> +                     continue;
>> +             if (!mode) {
>> +                     mpidr_h = kvm_vcpu_get_mpidr(c_vcpu);
>> +                     mpidr_l = MPIDR_AFFINITY_LEVEL(mpidr_h, 0);
>> +                     mpidr_h &= ~MPIDR_LEVEL_MASK;
> 
> this is *really* confusing. _h and _l are high and low?
> 
> Can you factor this out into a static inline and get rid of that mpidr
> mask above (**) ?

So I reworked the whole function and commented it heavily. The algorithm
stays the same, but it is much more readable now, hopefully.

>> +                     if (mpidr != mpidr_h)
>> +                             continue;
>> +                     if (!(target_cpus & BIT(mpidr_l)))
>> +                             continue;
>> +                     target_cpus &= ~BIT(mpidr_l);
>> +             }
>> +             /* Flag the SGI as pending */
>> +             vgic_dist_irq_set_pending(c_vcpu, sgi);
>> +             updated = 1;
>> +             kvm_debug("SGI%d from CPU%d to CPU%d\n", sgi, vcpu_id, c);
>> +     }
>> +     if (updated)
>> +             vgic_update_state(vcpu->kvm);
>> +     spin_unlock(&dist->lock);
>> +     if (updated)
>> +             vgic_kick_vcpus(vcpu->kvm);
>> +}
>> +
>> +
>> +static int vgic_v3_get_attr(struct kvm_device *dev,
>> +                         struct kvm_device_attr *attr)
>> +{
>> +     int ret;
>> +
>> +     ret = vgic_get_common_attr(dev, attr);
> 
> So this means we can get the KVM_VGIC_V2_ADDR_TYPE_DIST and
> KVM_VGIC_V2_ADDR_TYPE_CPU from an emualted gicv3 without the GICv2
> backwards compatibility features?

Mmh, below has_attr() doesn't claim to support it. So accessing them
would break the KVM protocol?
Also, kvm_vgic_addr() explicitly tries to avoid this (check type_needed).
Feel free to correct if I am wrong on this ...

>> +     if (ret != -ENXIO)
>> +             return ret;
>> +
>> +     switch (attr->group) {
>> +     case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
>> +     case KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
>> +             return -ENXIO;
>> +     }
>> +
>> +     return -ENXIO;
>> +}
>> +
>> +static int vgic_v3_set_attr(struct kvm_device *dev,
>> +                         struct kvm_device_attr *attr)
>> +{
>> +     int ret;
>> +
>> +     ret = vgic_set_common_attr(dev, attr);
> 
> same as above?
> 
>> +     if (ret != -ENXIO)
>> +             return ret;
>> +
>> +     switch (attr->group) {
>> +     case KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
>> +     case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
>> +             return -ENXIO;
>> +     }
>> +
>> +     return -ENXIO;
>> +}
>> +
>> +static int vgic_v3_has_attr(struct kvm_device *dev,
>> +                         struct kvm_device_attr *attr)
>> +{
>> +     switch (attr->group) {
>> +     case KVM_DEV_ARM_VGIC_GRP_ADDR:
>> +             switch (attr->attr) {
>> +             case KVM_VGIC_V2_ADDR_TYPE_DIST:
>> +             case KVM_VGIC_V2_ADDR_TYPE_CPU:
>> +                     return -ENXIO;
>> +             }
>> +             break;
>> +     case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
>> +     case KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
>> +             return -ENXIO;
>> +     case KVM_DEV_ARM_VGIC_GRP_NR_IRQS:
>> +             return 0;
>> +     }
>> +     return -ENXIO;
>> +}
>> +
>> +struct kvm_device_ops kvm_arm_vgic_v3_ops = {
>> +     .name = "kvm-arm-vgic-v3",
>> +     .create = vgic_create,
>> +     .destroy = vgic_destroy,
>> +     .set_attr = vgic_v3_set_attr,
>> +     .get_attr = vgic_v3_get_attr,
>> +     .has_attr = vgic_v3_has_attr,
> 
> nit: you could reorder set/get so they're set in the same order they
> appear in the code.

Why not...
Have you thought about an OCD treatment, btw? ;-)

>> +};
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index a54389b..2867269d 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -1228,7 +1228,7 @@ static bool vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>       struct kvm_vcpu *vcpu;
>>       int edge_triggered, level_triggered;
>>       int enabled;
>> -     bool ret = true;
>> +     bool ret = true, can_inject = true;
>>
>>       spin_lock(&dist->lock);
>>
>> @@ -1243,6 +1243,11 @@ static bool vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>
>>       if (irq_num >= VGIC_NR_PRIVATE_IRQS) {
>>               cpuid = dist->irq_spi_cpu[irq_num - VGIC_NR_PRIVATE_IRQS];
>> +             if (cpuid == VCPU_NOT_ALLOCATED) {
>> +                     /* Pretend we use CPU0, and prevent injection */
>> +                     cpuid = 0;
>> +                     can_inject = false;
>> +             }
>>               vcpu = kvm_get_vcpu(kvm, cpuid);
>>       }
>>
>> @@ -1264,7 +1269,7 @@ static bool vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>
>>       enabled = vgic_irq_is_enabled(vcpu, irq_num);
>>
>> -     if (!enabled) {
>> +     if (!enabled || !can_inject) {
> 
> don't you also need to handle the vgic_dist_irq_set_pending() call and
> its friends above?

Not sure I understand what you mean here.
can_inject is there to check accesses to the irq_spi_cpu[] array and
detect the "undefined CPU" special case caused by writing an unknown
MPIDR into the GICD_IROUTERn register.
AFAICT vgic_update_irq_pending() is the only function using this array,
right? Or have I got something wrong here?

>>               ret = false;
>>               goto out;
>>       }
>> @@ -1406,6 +1411,7 @@ void kvm_vgic_destroy(struct kvm *kvm)
>>       }
>>       kfree(dist->irq_sgi_sources);
>>       kfree(dist->irq_spi_cpu);
>> +     kfree(dist->irq_spi_mpidr);
>>       kfree(dist->irq_spi_target);
>>       kfree(dist->irq_pending_on_cpu);
>>       dist->irq_sgi_sources = NULL;
>> @@ -1581,6 +1587,7 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
>>       kvm->arch.vgic.vctrl_base = vgic->vctrl_base;
>>       kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
>>       kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
>> +     kvm->arch.vgic.vgic_redist_base = VGIC_ADDR_UNDEF;
> 
> sure, we can write to the same memory twice, why not, it's fun.

It does you credit that you remembered that this is defined as a union, but
I consider this an implementation detail (which may change in the
future) and the "casual" reader may not know this, so for the sake of
completeness let's initialize all of them. Either the compiler, the L1D
or the out-of-order engine in the CPU should drop this redundancy in
practice, hopefully.

>>
>>       if (!init_emulation_ops(kvm, type))
>>               ret = -ENODEV;
>> diff --git a/virt/kvm/arm/vgic.h b/virt/kvm/arm/vgic.h
>> index f52db4e..42c20c1 100644
>> --- a/virt/kvm/arm/vgic.h
>> +++ b/virt/kvm/arm/vgic.h
>> @@ -35,6 +35,8 @@
>>  #define ACCESS_WRITE_VALUE   (3 << 1)
>>  #define ACCESS_WRITE_MASK(x) ((x) & (3 << 1))
>>
>> +#define VCPU_NOT_ALLOCATED   ((u8)-1)
>> +
>>  unsigned long *vgic_bitmap_get_shared_map(struct vgic_bitmap *x);
>>
>>  void vgic_update_state(struct kvm *kvm);
>> @@ -121,5 +123,6 @@ int vgic_set_common_attr(struct kvm_device *dev, struct kvm_device_attr *attr);
>>  int vgic_get_common_attr(struct kvm_device *dev, struct kvm_device_attr *attr);
>>
>>  bool vgic_v2_init_emulation_ops(struct kvm *kvm, int type);
>> +bool vgic_v3_init_emulation_ops(struct kvm *kvm, int type);
>>
>>  #endif
>> --
>> 1.7.9.5
>>

Puh, that's it. I'm on to merging the changes in the respective patches
and will send out a v4 ASAP.

Tak!
Andre.


* [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation / PART 2
  2014-11-12 12:39     ` [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation / PART 2 Andre Przywara
@ 2014-11-12 19:51       ` Christoffer Dall
  2014-11-13 11:18       ` Christoffer Dall
  1 sibling, 0 replies; 76+ messages in thread
From: Christoffer Dall @ 2014-11-12 19:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 12, 2014 at 12:39:05PM +0000, Andre Przywara wrote:
> Hej Christoffer,
> 

[...]

> 
> Puh, that's it. I'm on to merging the changes in the respective patches
> and will send out a v4 ASAP.
> 

There are a few of your comments that I need to think about and reply
to.  Will do so in the morning; it may be better to hold off on sending out
a v4 until we agree on the points here.

-Christoffer


* [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation / PART 2
  2014-11-12 12:39     ` [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation / PART 2 Andre Przywara
  2014-11-12 19:51       ` Christoffer Dall
@ 2014-11-13 11:18       ` Christoffer Dall
  2014-11-13 11:45         ` Marc Zyngier
  1 sibling, 1 reply; 76+ messages in thread
From: Christoffer Dall @ 2014-11-13 11:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 12, 2014 at 12:39:05PM +0000, Andre Przywara wrote:
> Hej Christoffer,
> 
> the promised part 2 of the reply:
> 
> On 07/11/14 14:30, Christoffer Dall wrote:
> > On Fri, Oct 31, 2014 at 05:26:51PM +0000, Andre Przywara wrote:
> >> With everything separated and prepared, we implement a model of a
> >> GICv3 distributor and redistributors by using the existing framework
> >> to provide handler functions for each register group.
> 
> [...]
> 
> >> +
> >> +static const struct mmio_range vgic_dist_ranges[] = {
> 
> [...]
> 
> >> +     /* the next three blocks are RES0 if ARE=1 */
> > 
> > probably nicer to just have a comment for each register where this
> > applies.
> 
> Done.
> 
> > 
> >> +     {
> >> +             .base           = GICD_SGIR,
> >> +             .len            = 4,
> >> +             .handle_mmio    = handle_mmio_raz_wi,
> >> +     },
> >> +     {
> >> +             .base           = GICD_CPENDSGIR,
> >> +             .len            = 0x10,
> >> +             .handle_mmio    = handle_mmio_raz_wi,
> >> +     },
> >> +     {
> >> +             .base           = GICD_SPENDSGIR,
> >> +             .len            = 0x10,
> >> +             .handle_mmio    = handle_mmio_raz_wi,
> >> +     },
> >> +     {
> >> +             .base           = GICD_IROUTER,
> >> +             .len            = 0x2000,
> > 
> > shouldn't this be 0x1ee0?
> 
> The limit of 0x7FD8 in the spec seems to come from 1020 - 32 SPIs.
> However all the other registers always claim 1024 IRQs supported (with
> non-implemented SPIs being RAZ/WI anyway).
> So I wonder if this is just an inconsistency in the spec.
> Marc, can you comment?

The spec's memory map clearly indicates that the space at 0x6100 +
0x1edc and onwards is reserved, so it feels weird to define IROUTER
registers here.

Indeed you guys should check what the true intention is.
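
A minimal sketch of the arithmetic behind the two numbers, assuming the
architectural 8 bytes per GICD_IROUTER<n> entry:

	988 SPIs (INTIDs 32..1019) * 8 bytes = 0x1ee0	/* the suggested length  */
	1024 INTIDs (0..1023)      * 8 bytes = 0x2000	/* the .len in the patch */

So 0x1ee0 covers exactly the architecturally defined IROUTER registers,
while 0x2000 also spans the space the memory map lists as reserved.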

> 
> And we also cover the 32 private IRQs with this function (spec demands
> RES0 for those); this is handled in handle_mmio_route_reg().
> 
> So I tend to leave this at 8KB, as this is what the spec talks about in
> section 5.3.4.
> 
> >> +             .bits_per_irq   = 64,
> >> +             .handle_mmio    = handle_mmio_route_reg,
> >> +     },
> >> +     {
> >> +             .base           = GICD_IDREGS,
> >> +             .len            = 0x30,
> >> +             .bits_per_irq   = 0,
> >> +             .handle_mmio    = handle_mmio_idregs,
> >> +     },
> >> +     {},
> >> +};
> >> +
> >> +static bool handle_mmio_set_enable_reg_redist(struct kvm_vcpu *vcpu,
> >> +                                           struct kvm_exit_mmio *mmio,
> >> +                                           phys_addr_t offset,
> >> +                                           void *private)
> >> +{
> >> +     struct kvm_vcpu *target_redist_vcpu = private;
> >> +
> >> +     return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
> >> +                                   target_redist_vcpu->vcpu_id,
> >> +                                   ACCESS_WRITE_SETBIT);
> >> +}
> >> +
> >> +static bool handle_mmio_clear_enable_reg_redist(struct kvm_vcpu *vcpu,
> >> +                                             struct kvm_exit_mmio *mmio,
> >> +                                             phys_addr_t offset,
> >> +                                             void *private)
> >> +{
> >> +     struct kvm_vcpu *target_redist_vcpu = private;
> >> +
> >> +     return vgic_handle_enable_reg(vcpu->kvm, mmio, offset,
> >> +                                   target_redist_vcpu->vcpu_id,
> >> +                                   ACCESS_WRITE_CLEARBIT);
> >> +}
> >> +
> >> +static bool handle_mmio_set_pending_reg_redist(struct kvm_vcpu *vcpu,
> >> +                                            struct kvm_exit_mmio *mmio,
> >> +                                            phys_addr_t offset,
> >> +                                            void *private)
> >> +{
> >> +     struct kvm_vcpu *target_redist_vcpu = private;
> >> +
> >> +     return vgic_handle_set_pending_reg(vcpu->kvm, mmio, offset,
> >> +                                        target_redist_vcpu->vcpu_id);
> >> +}
> >> +
> >> +static bool handle_mmio_clear_pending_reg_redist(struct kvm_vcpu *vcpu,
> >> +                                              struct kvm_exit_mmio *mmio,
> >> +                                              phys_addr_t offset,
> >> +                                              void *private)
> >> +{
> >> +     struct kvm_vcpu *target_redist_vcpu = private;
> >> +
> >> +     return vgic_handle_clear_pending_reg(vcpu->kvm, mmio, offset,
> >> +                                          target_redist_vcpu->vcpu_id);
> >> +}
> >> +
> >> +static bool handle_mmio_priority_reg_redist(struct kvm_vcpu *vcpu,
> >> +                                         struct kvm_exit_mmio *mmio,
> >> +                                         phys_addr_t offset,
> >> +                                         void *private)
> >> +{
> >> +     struct kvm_vcpu *target_redist_vcpu = private;
> >> +     u32 *reg;
> >> +
> >> +     reg = vgic_bytemap_get_reg(&vcpu->kvm->arch.vgic.irq_priority,
> >> +                                target_redist_vcpu->vcpu_id, offset);
> >> +     vgic_reg_access(mmio, reg, offset,
> >> +                     ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
> >> +     return false;
> >> +}
> >> +
> >> +static bool handle_mmio_cfg_reg_redist(struct kvm_vcpu *vcpu,
> >> +                                    struct kvm_exit_mmio *mmio,
> >> +                                    phys_addr_t offset,
> >> +                                    void *private)
> >> +{
> >> +     u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_cfg,
> >> +                                    *(int *)private, offset >> 1);
> >> +
> >> +     return vgic_handle_cfg_reg(reg, mmio, offset);
> >> +}
> >> +
> >> +static const struct mmio_range vgic_redist_sgi_ranges[] = {
> >> +     {
> >> +             .base           = GICR_IGROUPR0,
> >> +             .len            = 4,
> >> +             .bits_per_irq   = 1,
> >> +             .handle_mmio    = handle_mmio_raz_wi,
> > 
> > shouldn't these be RAO/WI instead?
> 
> Mmmh, looks like it. I added a simple handle_mmio_rao_wi()
> implementation for this.
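
A minimal sketch of what such a read-as-one/write-ignored handler could
look like, mirroring the raz_wi handlers in the quoted code (the actual
v4 implementation may differ):

static bool handle_mmio_rao_wi(struct kvm_vcpu *vcpu,
                               struct kvm_exit_mmio *mmio,
                               phys_addr_t offset, void *private)
{
        /* every bit reads as one, writes are ignored */
        u32 reg = 0xffffffff;

        vgic_reg_access(mmio, &reg, offset,
                        ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
        return false;
}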
> 
> >> +     },
> >> +     {
> >> +             .base           = GICR_ISENABLER0,
> >> +             .len            = 4,
> >> +             .bits_per_irq   = 1,
> >> +             .handle_mmio    = handle_mmio_set_enable_reg_redist,
> >> +     },
> >> +     {
> >> +             .base           = GICR_ICENABLER0,
> >> +             .len            = 4,
> >> +             .bits_per_irq   = 1,
> >> +             .handle_mmio    = handle_mmio_clear_enable_reg_redist,
> >> +     },
> >> +     {
> >> +             .base           = GICR_ISPENDR0,
> >> +             .len            = 4,
> >> +             .bits_per_irq   = 1,
> >> +             .handle_mmio    = handle_mmio_set_pending_reg_redist,
> >> +     },
> >> +     {
> >> +             .base           = GICR_ICPENDR0,
> >> +             .len            = 4,
> >> +             .bits_per_irq   = 1,
> >> +             .handle_mmio    = handle_mmio_clear_pending_reg_redist,
> >> +     },
> >> +     {
> >> +             .base           = GICR_ISACTIVER0,
> >> +             .len            = 4,
> >> +             .bits_per_irq   = 1,
> >> +             .handle_mmio    = handle_mmio_raz_wi,
> >> +     },
> >> +     {
> >> +             .base           = GICR_ICACTIVER0,
> >> +             .len            = 4,
> >> +             .bits_per_irq   = 1,
> >> +             .handle_mmio    = handle_mmio_raz_wi,
> >> +     },
> >> +     {
> >> +             .base           = GICR_IPRIORITYR0,
> >> +             .len            = 32,
> >> +             .bits_per_irq   = 8,
> >> +             .handle_mmio    = handle_mmio_priority_reg_redist,
> >> +     },
> >> +     {
> >> +             .base           = GICR_ICFGR0,
> >> +             .len            = 8,
> >> +             .bits_per_irq   = 2,
> >> +             .handle_mmio    = handle_mmio_cfg_reg_redist,
> >> +     },
> >> +     {
> >> +             .base           = GICR_IGRPMODR0,
> >> +             .len            = 4,
> >> +             .bits_per_irq   = 1,
> >> +             .handle_mmio    = handle_mmio_raz_wi,
> >> +     },
> >> +     {
> >> +             .base           = GICR_NSACR,
> >> +             .len            = 4,
> >> +             .handle_mmio    = handle_mmio_raz_wi,
> >> +     },
> >> +     {},
> >> +};
> >> +
> >> +static bool handle_mmio_misc_redist(struct kvm_vcpu *vcpu,
> >> +                                 struct kvm_exit_mmio *mmio,
> >> +                                 phys_addr_t offset, void *private)
> >> +{
> >> +     u32 reg;
> >> +     u32 word_offset = offset & 3;
> >> +     u64 mpidr;
> >> +     struct kvm_vcpu *target_redist_vcpu = private;
> >> +     int target_vcpu_id = target_redist_vcpu->vcpu_id;
> >> +
> >> +     switch (offset & ~3) {
> >> +     case GICR_CTLR:
> >> +             /* since we don't support LPIs, this register is zero for now */
> >> +             vgic_reg_access(mmio, &reg, word_offset,
> >> +                             ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> >> +             break;
> >> +     case GICR_TYPER + 4:
> >> +             mpidr = kvm_vcpu_get_mpidr(target_redist_vcpu);
> >> +             reg = compress_mpidr(mpidr);
> >> +
> >> +             vgic_reg_access(mmio, &reg, word_offset,
> >> +                             ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
> >> +             break;
> >> +     case GICR_TYPER:
> >> +             reg = target_redist_vcpu->vcpu_id << 8;
> >> +             if (target_vcpu_id == atomic_read(&vcpu->kvm->online_vcpus) - 1)
> >> +                     reg |= GICR_TYPER_LAST;
> >> +             vgic_reg_access(mmio, &reg, word_offset,
> >> +                             ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
> >> +             break;
> >> +     case GICR_IIDR:
> >> +             reg = (PRODUCT_ID_KVM << 24) | (IMPLEMENTER_ARM << 0);
> >> +             vgic_reg_access(mmio, &reg, word_offset,
> >> +                     ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
> >> +             break;
> > 
> > the fact that you could reuse handle_mmio_iidr directly here and that
> > GICR_TYPER reads funny here, indicates to me that we should once again
> > split this up into smaller functions.
> 
> Yeah, done that. Looks indeed better now.
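
One possible shape for such a split-out GICR_TYPER handler, built only
from the pieces visible in the quoted code (name and details are
illustrative, not necessarily what v4 ends up with):

static bool handle_mmio_typer_redist(struct kvm_vcpu *vcpu,
                                     struct kvm_exit_mmio *mmio,
                                     phys_addr_t offset, void *private)
{
        struct kvm_vcpu *redist_vcpu = private;
        u32 reg;

        if (offset & 4) {
                /* upper word: the compressed MPIDR of this redistributor */
                reg = compress_mpidr(kvm_vcpu_get_mpidr(redist_vcpu));
        } else {
                /* lower word: processor number, plus the "last" flag */
                reg = redist_vcpu->vcpu_id << 8;
                if (redist_vcpu->vcpu_id ==
                    atomic_read(&vcpu->kvm->online_vcpus) - 1)
                        reg |= GICR_TYPER_LAST;
        }
        vgic_reg_access(mmio, &reg, offset & 3,
                        ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
        return false;
}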
> 
> >> +     default:
> >> +             vgic_reg_access(mmio, NULL, word_offset,
> >> +                             ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> >> +             break;
> >> +     }
> >> +
> >> +     return false;
> >> +}
> >> +
> >> +static const struct mmio_range vgic_redist_ranges[] = {
> >> +     {       /*
> >> +              * handling CTLR, IIDR, TYPER and STATUSR
> >> +              */
> >> +             .base           = GICR_CTLR,
> >> +             .len            = 20,
> >> +             .bits_per_irq   = 0,
> >> +             .handle_mmio    = handle_mmio_misc_redist,
> >> +     },
> >> +     {
> >> +             .base           = GICR_WAKER,
> >> +             .len            = 4,
> >> +             .bits_per_irq   = 0,
> >> +             .handle_mmio    = handle_mmio_raz_wi,
> >> +     },
> >> +     {
> >> +             .base           = GICR_IDREGS,
> >> +             .len            = 0x30,
> >> +             .bits_per_irq   = 0,
> >> +             .handle_mmio    = handle_mmio_idregs,
> >> +     },
> >> +     {},
> >> +};
> >> +
> >> +/*
> >> + * this is the stub handling both dist and redist MMIO exits for v3
> >       This
> > 
> > Is this really a stub?
> > 
> > I would suggest spelling out distributor and re-distributor and GICv3.
> > Full stop after GICv3.
> > 
> >> + * does some vcpu_id calculation on the redist MMIO to use a possibly
> >> + * different VCPU than the current one
> > 
> > "some vcpu_id calculation" is not very helpful, either explain the magic
> > sauce, or just say in which way a "different" VCPU is something we need
> > to pay special attention to.
> > 
> > If I read the code correctly, the comment should simply be:
> > 
> > The GICv3 spec allows any CPU to access any redistributor through the
> > memory-mapped redistributor registers.  We can therefore determine which
> > redistributor is being accessed by simply looking at the faulting IPA.
> > 
> 
> Yeah, admittedly this comment was total crap. Changed it to something
> closer to yours.
> 
> >> + */
> >> +static bool vgic_v3_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >> +                             struct kvm_exit_mmio *mmio)
> >> +{
> >> +     struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> >> +     unsigned long dbase = dist->vgic_dist_base;
> >> +     unsigned long rdbase = dist->vgic_redist_base;
> > 
> > I'm not crazy about these 'shortcuts', especially given that RD_base is
> > the base of a specific redistributor, but ok.
> 
> Well, I change rdbase below, so at least this one has to stay as a variable.
> 

yeah, I know we did that in the other code too, but I would use the
values directly and only assign rdbase to the specific vcpu's rdbase.
Bah, do whatever you like.

> >> +     int nrcpus = atomic_read(&vcpu->kvm->online_vcpus);
> >> +     int vcpu_id;
> >> +     struct kvm_vcpu *target_redist_vcpu;
> >> +
> >> +     if (is_in_range(mmio->phys_addr, mmio->len, dbase, GIC_V3_DIST_SIZE)) {
> >> +             return vgic_handle_mmio_range(vcpu, run, mmio,
> >> +                                           vgic_dist_ranges, dbase, NULL);
> >> +     }
> >> +
> >> +     if (!is_in_range(mmio->phys_addr, mmio->len, rdbase,
> >> +         GIC_V3_REDIST_SIZE * nrcpus))
> >> +             return false;
> > 
> > so this implies that all redistributors will always be in contiguous IPA
> > space, is this reasonable?
> 
> As far as I read the spec, this is mandated there. And as the "GIC
> implementors" we define that anyway, right?
> 

Where is this mandated in the spec?  I looked for it, but couldn't find
it.

Yes, we can probably define that, hence my question whether this is
reasonable.  Imposing unnecessary physically contiguous allocation
requirements (in the guest IPA) is probably something we should avoid.

> >> +
> >> +     vcpu_id = (mmio->phys_addr - rdbase) / GIC_V3_REDIST_SIZE;
> >> +     rdbase += (vcpu_id * GIC_V3_REDIST_SIZE);
> >> +     target_redist_vcpu = kvm_get_vcpu(vcpu->kvm, vcpu_id);
> > 
> > redist_vcpu should be enough
> 
> fixed.
> 
> >> +
> >> +     if (mmio->phys_addr >= rdbase + 0x10000)
> >> +             return vgic_handle_mmio_range(vcpu, run, mmio,
> >> +                                           vgic_redist_sgi_ranges,
> >> +                                           rdbase + 0x10000,
> >> +                                           target_redist_vcpu);
> > 
> > 0x10000 magic number used twice,  GICV3_REDIST_SGI_PAGE_OFFSET or
> > something shorter.
> 
> Done that.
> 
> > perhaps it is nicer to just adjust rdbase and set a range variable above
> > and only have a single call to vgic_handle_mmio_range().
> 
> Yup.
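
Roughly what that single-call variant could look like, with
GICV3_REDIST_SGI_PAGE_OFFSET standing in for the 0x10000 magic number
(sketch only, assuming the surrounding code from the hunk above):

        const struct mmio_range *ranges = vgic_redist_ranges;

        /* the SGI/PPI page of each redistributor sits one 64K page up */
        if (mmio->phys_addr >= rdbase + GICV3_REDIST_SGI_PAGE_OFFSET) {
                rdbase += GICV3_REDIST_SGI_PAGE_OFFSET;
                ranges = vgic_redist_sgi_ranges;
        }

        return vgic_handle_mmio_range(vcpu, run, mmio, ranges,
                                      rdbase, redist_vcpu);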
> 
> >> +
> >> +     return vgic_handle_mmio_range(vcpu, run, mmio, vgic_redist_ranges,
> >> +                                   rdbase, target_redist_vcpu);
> >> +}
> >> +
> >> +static bool vgic_v3_queue_sgi(struct kvm_vcpu *vcpu, int irq)
> >> +{
> >> +     if (vgic_queue_irq(vcpu, 0, irq)) {
> >> +             vgic_dist_irq_clear_pending(vcpu, irq);
> >> +             vgic_cpu_irq_clear(vcpu, irq);
> >> +             return true;
> >> +     }
> >> +
> >> +     return false;
> >> +}
> >> +
> >> +static int vgic_v3_init_maps(struct vgic_dist *dist)
> >> +{
> >> +     int nr_spis = dist->nr_irqs - VGIC_NR_PRIVATE_IRQS;
> >> +
> >> +     dist->irq_spi_mpidr = kcalloc(nr_spis, sizeof(dist->irq_spi_mpidr[0]),
> >> +                                   GFP_KERNEL);
> >> +
> >> +     if (!dist->irq_spi_mpidr)
> >> +             return -ENOMEM;
> >> +
> >> +     return 0;
> >> +}
> >> +
> >> +static int vgic_v3_init(struct kvm *kvm, const struct vgic_params *params)
> >> +{
> >> +     struct vgic_dist *dist = &kvm->arch.vgic;
> >> +     int ret, i;
> >> +     u32 mpidr;
> >> +
> >> +     if (IS_VGIC_ADDR_UNDEF(dist->vgic_dist_base) ||
> >> +         IS_VGIC_ADDR_UNDEF(dist->vgic_redist_base)) {
> >> +             kvm_err("Need to set vgic distributor addresses first\n");
> >> +             return -ENXIO;
> >> +     }
> >> +
> >> +     /*
> >> +      * FIXME: this should be moved to init_maps time, and may bite
> >> +      * us when adding save/restore. Add a per-emulation hook?
> >> +      */
> > 
> > What is the plan for this?  Can we move it into init_maps or does that
> > require some more work?
> 
> This comment is from Marc, when he once rebased these patches on top of
> his rebased and reworked vgic_dyn patches.
> Looks like I have to take a closer look at this now ...
> 
> > Why can't we do what the gicv2 emulation does?
> > 
> > Not sure what the "Add a per-emulation hook?" question is asking...
> 
> The point is that this allocation is guest GIC model dependent.
> Per-emulation hook means to differentiate between the possible guest
> model code by using a function pointer.
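
As a sketch of what that could mean in practice, reusing only names
already visible in this patch (the ".init_maps" member itself is
hypothetical):

        /* in vgic_v3_init_emulation_ops(), next to the existing hooks: */
        dist->vm_ops.init_maps = vgic_v3_init_maps;

        /* ...and in the common allocation path, something like: */
        if (dist->vm_ops.init_maps) {
                ret = dist->vm_ops.init_maps(dist);
                if (ret)
                        return ret;
        }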
> 

ah, I see.  Yeah, take a look at it and note in the changelog what you
did here and I'll look at it in the new version ;)

> >> +     ret = vgic_v3_init_maps(dist);
> >> +     if (ret) {
> >> +             kvm_err("Unable to allocate maps\n");
> >> +             return ret;
> >> +     }
> >> +
> >> +     mpidr = compress_mpidr(kvm_vcpu_get_mpidr(kvm_get_vcpu(kvm, 0)));
> >> +     for (i = VGIC_NR_PRIVATE_IRQS; i < dist->nr_irqs; i++) {
> >> +             dist->irq_spi_cpu[i - VGIC_NR_PRIVATE_IRQS] = 0;
> >> +             dist->irq_spi_mpidr[i - VGIC_NR_PRIVATE_IRQS] = mpidr;
> >> +             vgic_bitmap_set_irq_val(dist->irq_spi_target, 0, i, 1);
> > 
> > why do we need 3 different copies of the same value now?  ok, we had two
> > before because of the bitmap "optimization" thingy, but now we have two
> > other sets of state for the same thing...
> 
> Mmmh, we use irq_spi_cpu[] and irq_spi_target[] to be able to reuse the
> existing code. irq_spi_mpidr[] is just there to allow read-as-written.
> 

why can't we re-create the mpidr based on the value in the
irq_spi_target[] on reads of the register?
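
Re-creating it would presumably boil down to something like the
following on the GICD_IROUTERn read path (sketch only, with 'spi' as a
hypothetical local; note it cannot reproduce a write that named a
non-existent MPIDR, which is what the extra irq_spi_mpidr[] copy
preserves):

        int cpu = dist->irq_spi_cpu[spi];
        u32 reg;

        if (cpu == VCPU_NOT_ALLOCATED)
                reg = 0;        /* or whatever such a route should read as */
        else
                reg = compress_mpidr(kvm_vcpu_get_mpidr(kvm_get_vcpu(kvm, cpu)));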

> >> +     }
> >> +
> >> +     return 0;
> >> +}
> >> +
> >> +static void vgic_v3_add_sgi_source(struct kvm_vcpu *vcpu, int irq, int source)
> >> +{
> > 
> > can you put a one line comment here:
> > 
> > /* The GICv3 spec does away with keeping track of SGI sources */
> 
> Sure.
> 
> >> +}
> >> +
> >> +bool vgic_v3_init_emulation_ops(struct kvm *kvm, int type)
> >> +{
> >> +     struct vgic_dist *dist = &kvm->arch.vgic;
> >> +
> >> +     switch (type) {
> >> +     case KVM_DEV_TYPE_ARM_VGIC_V3:
> >> +             dist->vm_ops.handle_mmio = vgic_v3_handle_mmio;
> >> +             dist->vm_ops.queue_sgi = vgic_v3_queue_sgi;
> >> +             dist->vm_ops.add_sgi_source = vgic_v3_add_sgi_source;
> >> +             dist->vm_ops.vgic_init = vgic_v3_init;
> >> +             break;
> >> +     default:
> >> +             return false;
> >> +     }
> >> +     return true;
> >> +}
> >> +
> >> +/*
> >> + * triggered by a system register access trap, called from the sysregs
> > 
> >       Triggered
> > 
> >> + * handling code there.
> > 
> >                     ^^^ there, where, here, and everywhere ?
> > 
> >> + * The register contains the upper three affinity levels of the target
> > 
> >           ^^^ which register?  @reg ?
> > 
> >> + * processors as well as a bitmask of 16 Aff0 CPUs.
> > 
> > Does @reg follow the format from something in the spec?  That would be
> > useful to know...
> > 
> >> + * Iterate over all VCPUs to check for matching ones or signal on
> >> + * all-but-self if the mode bit is set.
> > 
> > an all-but-self IPI?  Is that the architectural term?  Otherwise I would
> > suggest something like:  If no VCPUs are found which match reg (in some
> > way), then send the IPI to all VCPUs in the VM, except the one
> > performing the system register access.
> 
> I totally reworked the comment. Admittedly this was more targeted to
> Marc ;-)
> 
> >> + */
> > 
> > Also, please use the kdocs format here like the rest of the kvm/arm code.
> > Begin sentences with upper-case, etc.:
> > 
> > /**
> > * vgic_v3_dispatch_sgi - This function does something with SGIs
> > * @vcpu: The vcpu pointer
> > * @reg: Magic
> > *
> > * Some nicer version of what you have above.
> > */
> > 
> >> +void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg)
> >> +{
> > 
> > It's a bit hard to review this when I cannot see how it is called; I'm
> > assuming that this is on writes to ICC_SGI1R_EL1 and reg is what the
> > guest tried to write to that register.
> > 
> > I have a feeling that you may want to add this function in a separate patch.
> 
> I think I had it that way before and there was some other issue with
> this split-up. Will give it a try again.
> 

that or document the function clearly enough, whatever you find easiest
at this point.

> >> +     struct kvm *kvm = vcpu->kvm;
> >> +     struct kvm_vcpu *c_vcpu;
> >> +     struct vgic_dist *dist = &kvm->arch.vgic;
> >> +     u16 target_cpus;
> >> +     u64 mpidr, mpidr_h, mpidr_l;
> >> +     int sgi, mode, c, vcpu_id;
> >> +     int updated = 0;
> >> +
> >> +     vcpu_id = vcpu->vcpu_id;
> >> +
> >> +     sgi = (reg >> 24) & 0xf;
> >> +     mode = (reg >> 40) & 0x1;
> > 
> > perhaps we can call this 'targeted' or something to make it a bit more
> > clear.
> 
> I use 'broadcast' now, which is even more readable in the code below.
> 
> >> +     target_cpus = reg & 0xffff;
> > 
> > Can you add some defines for these magic shifts?  Are there not some
> > already for the GICv3 host driver we can reuse?
> 
> No, the host driver uses magic shift values directly.
> I added appropriate defines in arm-gic-v3.h and use them in both places
> now. It is a bit messy though, since the names tend to get quite long
> and I needed to define wrapper macros to make it not totally unreadable.
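
For illustration, such defines would mirror the ICC_SGI1R_EL1 field
layout already visible in the quoted code; the names below are only a
guess at what ends up in arm-gic-v3.h:

#define ICC_SGI1R_TARGET_LIST_SHIFT     0
#define ICC_SGI1R_TARGET_LIST_MASK      (0xffff << ICC_SGI1R_TARGET_LIST_SHIFT)
#define ICC_SGI1R_AFFINITY_1_SHIFT      16
#define ICC_SGI1R_SGI_ID_SHIFT          24
#define ICC_SGI1R_AFFINITY_2_SHIFT      32
#define ICC_SGI1R_IRQ_ROUTING_MODE_BIT  40
#define ICC_SGI1R_AFFINITY_3_SHIFT      48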
> 

ok, then maybe not worth it, if it looks worse.

> >> +     mpidr = ((reg >> 48) & 0xff) << MPIDR_LEVEL_SHIFT(3);
> >> +     mpidr |= ((reg >> 32) & 0xff) << MPIDR_LEVEL_SHIFT(2);
> >> +     mpidr |= ((reg >> 16) & 0xff) << MPIDR_LEVEL_SHIFT(1);
> >> +     mpidr &= ~MPIDR_LEVEL_MASK;
> > 
> > (**) note the comment a few lines down.
> > 
> >> +
> >> +     /*
> >> +      * We take the dist lock here, because we come from the sysregs
> >> +      * code path and not from MMIO (where this is already done)
> > 
> >                                         which already takes the lock).
> > 
> >> +      */
> >> +     spin_lock(&dist->lock);
> >> +     kvm_for_each_vcpu(c, c_vcpu, kvm) {
> > 
> > I think it would be helpful to document this loop, something like:
> > 
> > We loop through every possible vCPU and check if we need to send an SGI
> > to that vCPU.  If targeting specific vCPUs, we check if the candidate
> > vCPU is in the target list and if it is, we send an SGI and clear the
> > bit in the target list.  When the target list is empty and we are
> > targeting specific vCPUs, we are done.
> > 
> > Maybe too verbose, you can tweak it as you like.
> > 
> >> +             if (!mode && target_cpus == 0)
> >> +                     break;
> >> +             if (mode && c == vcpu_id)       /* not to myself */
> >> +                     continue;
> >> +             if (!mode) {
> >> +                     mpidr_h = kvm_vcpu_get_mpidr(c_vcpu);
> >> +                     mpidr_l = MPIDR_AFFINITY_LEVEL(mpidr_h, 0);
> >> +                     mpidr_h &= ~MPIDR_LEVEL_MASK;
> > 
> > this is *really* confusing. _h and _l are high and low?
> > 
> > Can you factor this out into a static inline and get rid of that mpidr
> > mask above (**) ?
> 
> So I reworked the whole function and commented it heavily. The algorithm
> stays the same, but it is much more readable, hopefully.
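
The kind of static inline being asked for might look like this, lifted
straight from the lines quoted above (hypothetical name, sketch only):

static inline u64 sgi_dest_affinity(u64 reg)
{
        u64 mpidr;

        /* Aff3.Aff2.Aff1 of the addressed group, Aff0 left at zero */
        mpidr  = ((reg >> 48) & 0xff) << MPIDR_LEVEL_SHIFT(3);
        mpidr |= ((reg >> 32) & 0xff) << MPIDR_LEVEL_SHIFT(2);
        mpidr |= ((reg >> 16) & 0xff) << MPIDR_LEVEL_SHIFT(1);

        return mpidr;
}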
> 
> >> +                     if (mpidr != mpidr_h)
> >> +                             continue;
> >> +                     if (!(target_cpus & BIT(mpidr_l)))
> >> +                             continue;
> >> +                     target_cpus &= ~BIT(mpidr_l);
> >> +             }
> >> +             /* Flag the SGI as pending */
> >> +             vgic_dist_irq_set_pending(c_vcpu, sgi);
> >> +             updated = 1;
> >> +             kvm_debug("SGI%d from CPU%d to CPU%d\n", sgi, vcpu_id, c);
> >> +     }
> >> +     if (updated)
> >> +             vgic_update_state(vcpu->kvm);
> >> +     spin_unlock(&dist->lock);
> >> +     if (updated)
> >> +             vgic_kick_vcpus(vcpu->kvm);
> >> +}
> >> +
> >> +
> >> +static int vgic_v3_get_attr(struct kvm_device *dev,
> >> +                         struct kvm_device_attr *attr)
> >> +{
> >> +     int ret;
> >> +
> >> +     ret = vgic_get_common_attr(dev, attr);
> > 
> > So this means we can get the KVM_VGIC_V2_ADDR_TYPE_DIST and
> > KVM_VGIC_V2_ADDR_TYPE_CPU from an emulated GICv3 without the GICv2
> > backwards compatibility features?
> 
> Mmh, below has_attr() doesn't claim to support it. So accessing them
> would break the KVM protocol?

Regardless, the kernel shouldn't be exposing the values if they're not
supported for the device in question.

> Also, kvm_vgic_addr() explicitly tries to avoid this (check type_needed).
> Feel free to correct if I am wrong on this ...
> 

ah, I missed that check.  It may be easier to just move the details of
kvm_vgic_addr into the specific -emul.c files, because you're currently
only sharing two lines + lock/unlock.  If you move the function you can
get rid of all the type_needed stuff.


> >> +     if (ret != -ENXIO)
> >> +             return ret;
> >> +
> >> +     switch (attr->group) {
> >> +     case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
> >> +     case KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
> >> +             return -ENXIO;
> >> +     }
> >> +
> >> +     return -ENXIO;
> >> +}
> >> +
> >> +static int vgic_v3_set_attr(struct kvm_device *dev,
> >> +                         struct kvm_device_attr *attr)
> >> +{
> >> +     int ret;
> >> +
> >> +     ret = vgic_set_common_attr(dev, attr);
> > 
> > same as above?
> > 
> >> +     if (ret != -ENXIO)
> >> +             return ret;
> >> +
> >> +     switch (attr->group) {
> >> +     case KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
> >> +     case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
> >> +             return -ENXIO;
> >> +     }
> >> +
> >> +     return -ENXIO;
> >> +}
> >> +
> >> +static int vgic_v3_has_attr(struct kvm_device *dev,
> >> +                         struct kvm_device_attr *attr)
> >> +{
> >> +     switch (attr->group) {
> >> +     case KVM_DEV_ARM_VGIC_GRP_ADDR:
> >> +             switch (attr->attr) {
> >> +             case KVM_VGIC_V2_ADDR_TYPE_DIST:
> >> +             case KVM_VGIC_V2_ADDR_TYPE_CPU:
> >> +                     return -ENXIO;
> >> +             }
> >> +             break;
> >> +     case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
> >> +     case KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
> >> +             return -ENXIO;
> >> +     case KVM_DEV_ARM_VGIC_GRP_NR_IRQS:
> >> +             return 0;
> >> +     }
> >> +     return -ENXIO;
> >> +}
> >> +
> >> +struct kvm_device_ops kvm_arm_vgic_v3_ops = {
> >> +     .name = "kvm-arm-vgic-v3",
> >> +     .create = vgic_create,
> >> +     .destroy = vgic_destroy,
> >> +     .set_attr = vgic_v3_set_attr,
> >> +     .get_attr = vgic_v3_get_attr,
> >> +     .has_attr = vgic_v3_has_attr,
> > 
> > nit: you could reorder set/get so they're set in the same order they
> > appear in the code.
> 
> Why not...
> Have you thought about an OCD treatment, btw? ;-)
> 

This is it, right here ;)

> >> +};
> >> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> >> index a54389b..2867269d 100644
> >> --- a/virt/kvm/arm/vgic.c
> >> +++ b/virt/kvm/arm/vgic.c
> >> @@ -1228,7 +1228,7 @@ static bool vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> >>       struct kvm_vcpu *vcpu;
> >>       int edge_triggered, level_triggered;
> >>       int enabled;
> >> -     bool ret = true;
> >> +     bool ret = true, can_inject = true;
> >>
> >>       spin_lock(&dist->lock);
> >>
> >> @@ -1243,6 +1243,11 @@ static bool vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> >>
> >>       if (irq_num >= VGIC_NR_PRIVATE_IRQS) {
> >>               cpuid = dist->irq_spi_cpu[irq_num - VGIC_NR_PRIVATE_IRQS];
> >> +             if (cpuid == VCPU_NOT_ALLOCATED) {
> >> +                     /* Pretend we use CPU0, and prevent injection */
> >> +                     cpuid = 0;
> >> +                     can_inject = false;
> >> +             }
> >>               vcpu = kvm_get_vcpu(kvm, cpuid);
> >>       }
> >>
> >> @@ -1264,7 +1269,7 @@ static bool vgic_update_irq_pending(struct kvm *kvm, int cpuid,
> >>
> >>       enabled = vgic_irq_is_enabled(vcpu, irq_num);
> >>
> >> -     if (!enabled) {
> >> +     if (!enabled || !can_inject) {
> > 
> > don't you also need to handle the vgic_dist_irq_set_pending() call and
> > its friends above?
> 
> Not sure I understand what you mean here.
> can_inject is there to check accesses to the irq_spi_cpu[] array and
> detect the "undefined CPU" special case caused by writing an unknown
> MPIDR into the GICD_IROUTERn register.
> AFAICT vgic_update_irq_pending() is the only function using this array,
> right? Or have I got something wrong here?
> 

This is another indication that Marc is right; we should really think
about rewriting all of this.

Firstly, that comment about "Pretend we use CPU0" doesn't make any sense
unless you look at the comment in handle_mmio_route_reg() in
vgic-v3-emul.c.  That comment (about non-existing MPIDR values) tells us
that if the IROUTER is programmed to a non-existing MPIDR, the IRQ
should become pending on the distributor, but not be forwarded to the
CPU interface, which makes this look like it's correct.  However...

Secondly, this appears to be correct, but the fact that you skip the
call to vgic_cpu_irq_set() and do not set the bit in irq_pending_on_cpu
doesn't really *change* anything, it just *delays* it.  Because whenever
someone calls vgic_update_state() and subsequently
compute_pending_for_cpu() those two values will be updated and this
interrupt will now end up being forwarded to CPU0, which was clearly not
the intention.

So, you need to add your can_inject check to compute_pending_for_cpu(),
which is just horrible, and not something you want to do.

So your alternative is to not actually mark the interrupt as pending on
the distributor.  However, that probably breaks semantics for a guest
reading the GICD_I{SC}PENDR registers.

So unless I managed to confuse myself too much, to get this right, you
have to re-architect something more substantially, and this really
sucks.

So I suggest you just move this check to vgic_validate_injection() and
add a big fat comment in the other places where this breaks, and we make
this the main focus of a KVM/ARM sprint some time in the future.
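
For reference, the check in question is essentially the one from the
hunk above; hoisted into the validation path it would presumably read
something like this (assuming the helper can see dist and irq_num):

        /* SPIs routed to a non-existent MPIDR must not reach a CPU interface */
        if (irq_num >= VGIC_NR_PRIVATE_IRQS &&
            dist->irq_spi_cpu[irq_num - VGIC_NR_PRIVATE_IRQS] == VCPU_NOT_ALLOCATED)
                return false;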


> >>               ret = false;
> >>               goto out;
> >>       }
> >> @@ -1406,6 +1411,7 @@ void kvm_vgic_destroy(struct kvm *kvm)
> >>       }
> >>       kfree(dist->irq_sgi_sources);
> >>       kfree(dist->irq_spi_cpu);
> >> +     kfree(dist->irq_spi_mpidr);
> >>       kfree(dist->irq_spi_target);
> >>       kfree(dist->irq_pending_on_cpu);
> >>       dist->irq_sgi_sources = NULL;
> >> @@ -1581,6 +1587,7 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
> >>       kvm->arch.vgic.vctrl_base = vgic->vctrl_base;
> >>       kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
> >>       kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
> >> +     kvm->arch.vgic.vgic_redist_base = VGIC_ADDR_UNDEF;
> > 
> > sure, we can write to the same memory twice, why not, it's fun.
> 
> It does you credit that you remembered that this is defined as a union, but
> I consider this an implementation detail (which may change in the
> future) and the "casual" reader may not know this, so for the sake of
> completeness let's initialize all of them. Either the compiler, the L1D
> or the out-of-order engine in the CPU should drop this redundancy in
> practice, hopefully.
> 

ha, do as you like, but I think this is so tightly coupled anyhow that
your argument doesn't really hold water.

> >>
> >>       if (!init_emulation_ops(kvm, type))
> >>               ret = -ENODEV;
> >> diff --git a/virt/kvm/arm/vgic.h b/virt/kvm/arm/vgic.h
> >> index f52db4e..42c20c1 100644
> >> --- a/virt/kvm/arm/vgic.h
> >> +++ b/virt/kvm/arm/vgic.h
> >> @@ -35,6 +35,8 @@
> >>  #define ACCESS_WRITE_VALUE   (3 << 1)
> >>  #define ACCESS_WRITE_MASK(x) ((x) & (3 << 1))
> >>
> >> +#define VCPU_NOT_ALLOCATED   ((u8)-1)
> >> +
> >>  unsigned long *vgic_bitmap_get_shared_map(struct vgic_bitmap *x);
> >>
> >>  void vgic_update_state(struct kvm *kvm);
> >> @@ -121,5 +123,6 @@ int vgic_set_common_attr(struct kvm_device *dev, struct kvm_device_attr *attr);
> >>  int vgic_get_common_attr(struct kvm_device *dev, struct kvm_device_attr *attr);
> >>
> >>  bool vgic_v2_init_emulation_ops(struct kvm *kvm, int type);
> >> +bool vgic_v3_init_emulation_ops(struct kvm *kvm, int type);
> >>
> >>  #endif
> >> --
> >> 1.7.9.5
> >>
> 
> Puh, that's it. I'm on to merging the changes in the respective patches
> and will send out a v4 ASAP.
> 

Yes, puh indeed.

Vielen Dank for dealing with my OCD.
-Christoffer


* [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation / PART 2
  2014-11-13 11:18       ` Christoffer Dall
@ 2014-11-13 11:45         ` Marc Zyngier
  2014-11-13 12:01           ` Andre Przywara
  0 siblings, 1 reply; 76+ messages in thread
From: Marc Zyngier @ 2014-11-13 11:45 UTC (permalink / raw)
  To: linux-arm-kernel

On 13/11/14 11:18, Christoffer Dall wrote:
> On Wed, Nov 12, 2014 at 12:39:05PM +0000, Andre Przywara wrote:
>> Hej Christoffer,
>>
>> the promised part 2 of the reply:
>>
>> On 07/11/14 14:30, Christoffer Dall wrote:
>>> On Fri, Oct 31, 2014 at 05:26:51PM +0000, Andre Przywara wrote:
>>>> With everything separated and prepared, we implement a model of a
>>>> GICv3 distributor and redistributors by using the existing framework
>>>> to provide handler functions for each register group.
>>
>> [...]
>>
>>>> +
>>>> +static const struct mmio_range vgic_dist_ranges[] = {
>>
>> [...]
>>
>>>> +     /* the next three blocks are RES0 if ARE=1 */
>>>
>>> probably nicer to just have a comment for each register where this
>>> applies.
>>
>> Done.
>>
>>>
>>>> +     {
>>>> +             .base           = GICD_SGIR,
>>>> +             .len            = 4,
>>>> +             .handle_mmio    = handle_mmio_raz_wi,
>>>> +     },
>>>> +     {
>>>> +             .base           = GICD_CPENDSGIR,
>>>> +             .len            = 0x10,
>>>> +             .handle_mmio    = handle_mmio_raz_wi,
>>>> +     },
>>>> +     {
>>>> +             .base           = GICD_SPENDSGIR,
>>>> +             .len            = 0x10,
>>>> +             .handle_mmio    = handle_mmio_raz_wi,
>>>> +     },
>>>> +     {
>>>> +             .base           = GICD_IROUTER,
>>>> +             .len            = 0x2000,
>>>
>>> shouldn't this be 0x1ee0?
>>
>> The limit of 0x7FD8 in the spec seems to come from 1020 - 32 SPIs.
>> However all the other registers always claim 1024 IRQs supported (with
>> non-implemented SPIs being RAZ/WI anyway).
>> So I wonder if this is just an inconsistency in the spec.
>> Marc, can you comment?
> 
> The spec's memory map clearly indicates that the space at 0x6100 +
> 0x1edc and onwards is reserved, so it feels weird to define IROUTER
> registers here.
> 
> Indeed you guys should check what the true intention is.

Indeed, the spec is very clear that the range 0x7fdc-0xbffc is
reserved. Andre, can you please update this?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...


* [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation / PART 2
  2014-11-13 11:45         ` Marc Zyngier
@ 2014-11-13 12:01           ` Andre Przywara
  0 siblings, 0 replies; 76+ messages in thread
From: Andre Przywara @ 2014-11-13 12:01 UTC (permalink / raw)
  To: linux-arm-kernel

On 13/11/14 11:45, Marc Zyngier wrote:
> On 13/11/14 11:18, Christoffer Dall wrote:
>> On Wed, Nov 12, 2014 at 12:39:05PM +0000, Andre Przywara wrote:
>>> Hej Christoffer,
>>>
>>> the promised part 2 of the reply:
>>>
>>> On 07/11/14 14:30, Christoffer Dall wrote:
>>>> On Fri, Oct 31, 2014 at 05:26:51PM +0000, Andre Przywara wrote:
>>>>> With everything separated and prepared, we implement a model of a
>>>>> GICv3 distributor and redistributors by using the existing framework
>>>>> to provide handler functions for each register group.
>>>
>>> [...]
>>>
>>>>> +
>>>>> +static const struct mmio_range vgic_dist_ranges[] = {
>>>
>>> [...]
>>>
>>>>> +     /* the next three blocks are RES0 if ARE=1 */
>>>>
>>>> probably nicer to just have a comment for each register where this
>>>> applies.
>>>
>>> Done.
>>>
>>>>
>>>>> +     {
>>>>> +             .base           = GICD_SGIR,
>>>>> +             .len            = 4,
>>>>> +             .handle_mmio    = handle_mmio_raz_wi,
>>>>> +     },
>>>>> +     {
>>>>> +             .base           = GICD_CPENDSGIR,
>>>>> +             .len            = 0x10,
>>>>> +             .handle_mmio    = handle_mmio_raz_wi,
>>>>> +     },
>>>>> +     {
>>>>> +             .base           = GICD_SPENDSGIR,
>>>>> +             .len            = 0x10,
>>>>> +             .handle_mmio    = handle_mmio_raz_wi,
>>>>> +     },
>>>>> +     {
>>>>> +             .base           = GICD_IROUTER,
>>>>> +             .len            = 0x2000,
>>>>
>>>> shouldn't this be 0x1ee0?
>>>
>>> The limit of 0x7FD8 in the spec seems to come from 1020 - 32 SPIs.
>>> However all the other registers always claim 1024 IRQs supported (with
>>> non-implemented SPIs being RAZ/WI anyway).
>>> So I wonder if this is just an inconsistency in the spec.
>>> Marc, can you comment?
>>
>> The spec's memory map clearly indicates that the space at 0x6100 +
>> 0x1edc and onwards is reserved, so it feels weird to define IROUTER
>> registers here.
>>
>> Indeed you guys should check what the true intention is.
> 
> Indeed, the spec is very clear that the range 0x7fdc-0xbffc is
> reserved. Andre, can you please update this?

Ah, yes I skipped this, sorry. Will fix it.

Regards,
Andre.


end of thread, other threads:[~2014-11-13 12:01 UTC | newest]

Thread overview: 76+ messages
2014-10-31 17:26 [PATCH v3 00/19] KVM GICv3 emulation Andre Przywara
2014-10-31 17:26 ` [PATCH v3 01/19] arm/arm64: KVM: rework MPIDR assignment and add accessors Andre Przywara
2014-11-03 13:13   ` Christoffer Dall
2014-10-31 17:26 ` [PATCH v3 02/19] arm/arm64: KVM: pass down user space provided GIC type into vGIC code Andre Przywara
2014-11-03 13:14   ` Christoffer Dall
2014-11-03 13:25     ` Andre Przywara
2014-11-03 16:51       ` Christoffer Dall
2014-10-31 17:26 ` [PATCH v3 03/19] arm/arm64: KVM: refactor vgic_handle_mmio() function Andre Przywara
2014-11-03 13:23   ` Christoffer Dall
2014-10-31 17:26 ` [PATCH v3 04/19] arm/arm64: KVM: wrap 64 bit MMIO accesses with two 32 bit ones Andre Przywara
2014-11-03 13:25   ` Christoffer Dall
2014-11-04 12:18     ` Andre Przywara
2014-11-04 13:24       ` Christoffer Dall
2014-10-31 17:26 ` [PATCH v3 05/19] arm/arm64: KVM: introduce per-VM ops Andre Przywara
2014-11-03 13:59   ` Christoffer Dall
2014-11-04 15:58     ` Andre Przywara
2014-11-04 19:03       ` Christoffer Dall
2014-10-31 17:26 ` [PATCH v3 06/19] arm/arm64: KVM: move [sg]et_lr into " Andre Przywara
2014-11-03 14:15   ` Christoffer Dall
2014-11-04 16:30     ` Andre Przywara
2014-11-04 19:12       ` Christoffer Dall
2014-10-31 17:26 ` [PATCH v3 07/19] arm/arm64: KVM: move kvm_register_device_ops() into vGIC probing Andre Przywara
2014-11-03 20:05   ` Christoffer Dall
2014-10-31 17:26 ` [PATCH v3 08/19] arm/arm64: KVM: dont rely on a valid GICH base address Andre Przywara
2014-11-03 20:05   ` Christoffer Dall
2014-10-31 17:26 ` [PATCH v3 09/19] arm/arm64: KVM: make the maximum number of vCPUs a per-VM value Andre Przywara
2014-11-03 20:06   ` Christoffer Dall
2014-10-31 17:26 ` [PATCH v3 10/19] arm/arm64: KVM: make the value of ICC_SRE_EL1 a per-VM variable Andre Przywara
2014-11-03 20:04   ` Christoffer Dall
2014-11-03 20:17     ` Marc Zyngier
2014-11-07 19:18       ` Christoffer Dall
2014-10-31 17:26 ` [PATCH v3 11/19] arm/arm64: KVM: refactor MMIO accessors Andre Przywara
2014-11-04 11:55   ` Christoffer Dall
2014-11-04 12:25     ` Andre Przywara
2014-10-31 17:26 ` [PATCH v3 12/19] arm/arm64: KVM: refactor/wrap vgic_set/get_attr() Andre Przywara
2014-11-04 19:30   ` Christoffer Dall
2014-11-05 10:27     ` Andre Przywara
2014-11-05 10:37       ` Andre Przywara
2014-11-05 12:57       ` Christoffer Dall
2014-10-31 17:26 ` [PATCH v3 13/19] arm/arm64: KVM: add vgic.h header file Andre Przywara
2014-11-04 19:30   ` Christoffer Dall
2014-10-31 17:26 ` [PATCH v3 14/19] arm/arm64: KVM: split GICv2 specific emulation code from vgic.c Andre Przywara
2014-11-04 19:30   ` Christoffer Dall
2014-10-31 17:26 ` [PATCH v3 15/19] arm/arm64: KVM: add opaque private pointer to MMIO accessors Andre Przywara
2014-11-04 15:44   ` Christoffer Dall
2014-11-04 17:24     ` Andre Przywara
2014-11-04 18:05       ` Marc Zyngier
2014-11-04 19:18         ` Christoffer Dall
2014-11-04 20:17           ` Marc Zyngier
2014-11-05  9:49             ` Christoffer Dall
2014-10-31 17:26 ` [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation Andre Przywara
2014-11-07 14:30   ` Christoffer Dall
2014-11-10 17:30     ` [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation / PART 1 Andre Przywara
2014-11-11 13:48       ` Christoffer Dall
2014-11-12 12:39     ` [PATCH v3 16/19] arm/arm64: KVM: add virtual GICv3 distributor emulation / PART 2 Andre Przywara
2014-11-12 19:51       ` Christoffer Dall
2014-11-13 11:18       ` Christoffer Dall
2014-11-13 11:45         ` Marc Zyngier
2014-11-13 12:01           ` Andre Przywara
2014-10-31 17:26 ` [PATCH v3 17/19] arm64: KVM: add SGI system register trapping Andre Przywara
2014-11-07 15:07   ` Christoffer Dall
2014-11-10 11:31     ` Andre Przywara
2014-11-10 12:45       ` Christoffer Dall
2014-10-31 17:26 ` [PATCH v3 18/19] arm/arm64: KVM: enable kernel side of GICv3 emulation Andre Przywara
2014-11-07 16:07   ` Christoffer Dall
2014-11-10 12:19     ` Andre Przywara
2014-11-10 13:24       ` Christoffer Dall
2014-10-31 17:26 ` [PATCH v3 19/19] arm/arm64: KVM: allow userland to request a virtual GICv3 Andre Przywara
2014-11-07 16:15   ` Christoffer Dall
2014-11-10 12:26     ` Andre Przywara
2014-11-10 13:25       ` Christoffer Dall
2014-11-03 12:59 ` [PATCH v3 00/19] KVM GICv3 emulation Christoffer Dall
2014-11-06 10:57 ` Christoffer Dall
2014-11-06 11:21   ` Christoffer Dall
2014-11-06 15:13     ` Andre Przywara
2014-11-06 18:09       ` Christoffer Dall
