* [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
From: Andre Przywara @ 2015-10-07 14:55 UTC
  To: marc.zyngier, christoffer.dall
  Cc: eric.auger, p.fedin, kvmarm, linux-arm-kernel, kvm

Hi,

this is another respin and rebase of the ITS emulation series.
The major changes compared to v2 (besides minor things like added
comments and function renames) are the rebase onto and adaptation to
4.3-rc and Christoffer's timer rework series. The locking has also been
reworked to cope with the dependencies between the ITS lock and the
dist lock in connection with the PROPBASER/PENDBASER registers and the
command handling.
For a more detailed changelog see below or look at the respective
commit messages.

This should address most of the comments I got on the list.
Many thanks to the diligent reviewers!
I didn't bother to fine-tune patch 01/16 too much, as I guess there
will be more discussion around this based on Pavel's latest post.

These patches go on top of Christoffer's timer rework series [1],
which itself is on top of 4.3-rc2.
You can find all of this code in the its-emul/v3 branch of my
repository [2].

Cheers,
Andre.

Changelog v2..v3:
- adapt to 4.3-rc and Christoffer's timer rework
- adapt spin locks on handling PROPBASER/PENDBASER registers
- rework locking in ITS command handling (dropping dist where needed)
- only clear LPI pending bit if LPI could actually be queued
- simplify GICR_CTLR handling
- properly free ITTEs (including our pending bitmap)
- fix corner cases with unmapped collections
- keep retire_lr() around
- rename vgic_handle_base_register to vgic_reg64_access()
- use kcalloc instead of kmalloc
- minor fixes, renames and added comments

Changelog v1..v2:
- fix issues when using non-ITS GICv3 emulation
- streamline frame address initialization (new patch 05/15)
- preallocate buffer memory for reading from guest's memory
- move locking into the actual command handlers
- preallocate memory for new structures if needed
- use non-atomic __set_bit() and __clear_bit() when under the lock
- add INT command handler to allow LPI injection from the guest
- rewrite CWRITER handler to align with new locking scheme
- remove unneeded CONFIG_HAVE_KVM_MSI #ifdefs
- check memory table size against our LPI limit (65536 interrupts)
- observe initial gap of 1024 interrupts in pending table
- use term "configuration table" to be in line with the spec
- clarify and extend documentation on API extensions
- introduce new KVM_CAP_MSI_DEVID capability to advertise device ID requirement
- update, fix and add many comments
- minor style changes as requested by reviewers

---------------

The GICv3 ITS (Interrupt Translation Service) is a part of the
ARM GICv3 interrupt controller [4] used for implementing MSIs.
It specifies a new kind of interrupt (LPIs), whose mappings
establish a connection between a device, its MSI payload value and
the target processor the IRQ is eventually delivered to.
In order to allow using MSIs in an ARM64 KVM guest, we emulate this
ITS widget in the kernel.
The ITS works by reading commands written by software (from the guest
in our case) into a (guest allocated) memory region and establishing
the mapping between a device, the MSI payload and the target CPU.
We parse these commands and update our internal data structures to
reflect those changes. On an MSI injection we iterate those
structures to learn the LPI number we have to inject.
For the time being we use simple lists to hold the data; this is
good enough for the small number of entries each of the components
currently has. Should this become a performance bottleneck in the
future, the lists can be replaced by arrays or trees.
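
As a rough illustration of the list-based lookup described above, here
is a minimal sketch. All structure and function names are hypothetical
and heavily simplified; they are not the ones used in its-emul.c. On an
MSI injection the device list is walked first, then that device's
translation entries, to find the LPI for a (device ID, event ID) pair.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the emulation's structures. */
struct itte_sketch {
	struct itte_sketch *next;
	uint32_t event_id;	/* MSI payload written by the device */
	uint32_t lpi;		/* LPI number to inject */
	int target_cpu;		/* vCPU the mapping currently points to */
};

struct its_device_sketch {
	struct its_device_sketch *next;
	uint32_t device_id;
	struct itte_sketch *ittes;
};

/*
 * Walk the device list, then that device's translation entries.
 * Returns the matching entry, or NULL if the guest never mapped
 * this (device ID, event ID) pair.
 */
static const struct itte_sketch *
find_itte_sketch(const struct its_device_sketch *devices,
		 uint32_t device_id, uint32_t event_id)
{
	const struct its_device_sketch *dev;
	const struct itte_sketch *itte;

	for (dev = devices; dev; dev = dev->next) {
		if (dev->device_id != device_id)
			continue;
		for (itte = dev->ittes; itte; itte = itte->next)
			if (itte->event_id == event_id)
				return itte;
		return NULL;	/* device known, event not mapped */
	}
	return NULL;		/* device not mapped */
}
```

The lookup cost is linear in the number of mapped devices plus the
number of events of the matching device, which is why this only scales
to small tables.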

Most of the code lives in a separate source file (its-emul.c), though
there are some changes necessary both in vgic.c and vgic-v3-emul.c.

Patch 01/16 gets rid of the internal tracking of the used LR for
an injected IRQ, see the commit message for more details.
Patch 03/16 extends the KVM MSI ioctl to hold a device ID.
Patches 04-06 make small changes to the existing VGIC code which make
later adaptations to the ITS easier.
The rest of the patches implement the ITS functionality step by step.
For more details see the respective commit messages.

For the time being this series gives us the ability to use emulated
PCI devices that can use MSIs in the guest. Those have to be
triggered by letting the userland device emulation simulate the MSI
write with the KVM_SIGNAL_MSI ioctl. This will be translated into
the proper LPI by the ITS emulation and injected into the guest in
the usual way (just with a higher IRQ number).
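
For userland, filling in the MSI descriptor could look roughly like the
sketch below. It assumes the field and flag names of the mainline uapi
header (devid, KVM_MSI_VALID_DEVID); the names in this series may
differ, so check patch 03/16. The helper name and the example values
are made up for illustration.

```c
#include <string.h>
#include <stdint.h>
#include <linux/kvm.h>

/*
 * Fill a struct kvm_msi for an MSI write to a doorbell address.
 * KVM_MSI_VALID_DEVID and the devid field are the names used in the
 * mainline uapi header; this series may name them differently.
 */
static void fill_msi_sketch(struct kvm_msi *msi, uint64_t doorbell,
			    uint32_t payload, uint32_t device_id)
{
	memset(msi, 0, sizeof(*msi));
	msi->address_lo = (uint32_t)doorbell;
	msi->address_hi = (uint32_t)(doorbell >> 32);
	msi->data = payload;
	msi->flags = KVM_MSI_VALID_DEVID;	/* devid field is valid */
	msi->devid = device_id;
}
```

The device emulation would then pass the filled structure to the
KVM_SIGNAL_MSI ioctl on the VM file descriptor.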

This series is based on 4.3-rc2 and can be found at the its-emul/v3
branch of this repository [2].
For this to be used you need a GICv3 host machine (a fast model would
do), though it does not rely on any host ITS bits (neither in hardware
nor in software).

To test this you can use the kvmtool patches available in the "its"
branch here [3].
Start a guest with: "$ lkvm run --irqchip=gicv3-its --force-pci"
and see the ITS being used for instance by the virtio devices.

[1]: https://git.linaro.org/people/christoffer.dall/linux-kvm-arm.git/shortlog/refs/heads/timer-rework-v3
[2]: git://linux-arm.org/linux-ap.git
     http://www.linux-arm.org/git?p=linux-ap.git;a=log;h=refs/heads/its-emul/v3
[3]: git://linux-arm.org/kvmtool.git
     http://www.linux-arm.org/git?p=kvmtool.git;a=log;h=refs/heads/its
[4]: http://arminfo.emea.arm.com/help/topic/com.arm.doc.ihi0069a/IHI0069A_gic_architecture_specification.pdf

Andre Przywara (16):
  KVM: arm/arm64: VGIC: don't track used LRs in the distributor
  KVM: arm/arm64: remove now unused code after stay-in-LR rework
  KVM: extend struct kvm_msi to hold a 32-bit device ID
  KVM: arm/arm64: add emulation model specific destroy function
  KVM: arm/arm64: extend arch CAP checks to allow per-VM capabilities
  KVM: arm/arm64: make GIC frame address initialization model specific
  KVM: arm64: Introduce new MMIO region for the ITS base address
  KVM: arm64: handle ITS related GICv3 redistributor registers
  KVM: arm64: introduce ITS emulation file with stub functions
  KVM: arm64: implement basic ITS register handlers
  KVM: arm64: add data structures to model ITS interrupt translation
  KVM: arm64: handle pending bit for LPIs in ITS emulation
  KVM: arm64: sync LPI configuration and pending tables
  KVM: arm64: implement ITS command queue command handlers
  KVM: arm64: implement MSI injection in ITS emulation
  KVM: arm64: enable ITS emulation as a virtual MSI controller

 Documentation/virtual/kvm/api.txt              |   14 +-
 Documentation/virtual/kvm/devices/arm-vgic.txt |    9 +
 arch/arm/include/asm/kvm_host.h                |    2 +-
 arch/arm/kvm/arm.c                             |    2 +-
 arch/arm64/include/asm/kvm_host.h              |    2 +-
 arch/arm64/include/uapi/asm/kvm.h              |    2 +
 arch/arm64/kvm/Kconfig                         |    1 +
 arch/arm64/kvm/Makefile                        |    1 +
 arch/arm64/kvm/reset.c                         |    8 +-
 include/kvm/arm_vgic.h                         |   43 +-
 include/linux/irqchip/arm-gic-v3.h             |   14 +-
 include/uapi/linux/kvm.h                       |    5 +-
 virt/kvm/arm/its-emul.c                        | 1187 ++++++++++++++++++++++++
 virt/kvm/arm/its-emul.h                        |   55 ++
 virt/kvm/arm/vgic-v2-emul.c                    |    3 +
 virt/kvm/arm/vgic-v2.c                         |    1 +
 virt/kvm/arm/vgic-v3-emul.c                    |  101 +-
 virt/kvm/arm/vgic-v3.c                         |    1 +
 virt/kvm/arm/vgic.c                            |  292 +++---
 virt/kvm/arm/vgic.h                            |    3 +
 20 files changed, 1601 insertions(+), 145 deletions(-)
 create mode 100644 virt/kvm/arm/its-emul.c
 create mode 100644 virt/kvm/arm/its-emul.h

-- 
2.5.1


* [PATCH v3 01/16] KVM: arm/arm64: VGIC: don't track used LRs in the distributor
From: Andre Przywara @ 2015-10-07 14:55 UTC
  To: marc.zyngier, christoffer.dall
  Cc: eric.auger, p.fedin, kvmarm, linux-arm-kernel, kvm

Currently we track which IRQ has been mapped to which VGIC list
register and also have to synchronize both. We used to do this
to hold some extra state (for instance the active bit).
It turns out that this extra state in the LRs is no longer needed and
this extra tracking causes some pain later.
Remove the tracking feature (lr_map and lr_used) and get rid of
quite a bit of code along the way.
In places where we scan LRs we now use our shadow copy of the ELRSR
register directly.
This code change means we lose the "piggy-back" optimization, which
would re-use an active-only LR to inject the pending state on top of
it. Tracing with various workloads shows that this actually occurred
very rarely; the ballpark figure is about once every 10,000 exits
in a disk I/O heavy workload. Also, the list registers don't seem to
be as scarce as assumed, with all 4 LRs on the popular implementations
in use less than once every 100,000 exits.
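
To illustrate how the ELRSR shadow copy replaces the lr_used bitmap,
here is a small standalone sketch. The helper names are hypothetical;
the patch itself uses the kernel's find_first_bit() and
for_each_clear_bit() on the bitmask instead. A set bit in the ELRSR
means the corresponding list register is empty, so used LRs are the
clear bits.

```c
#include <stdint.h>

/*
 * In the ELRSR a set bit means "this list register is empty".
 * Return the index of the first empty LR, or nr_lr if all are in use.
 */
static int first_free_lr_sketch(uint64_t elrsr, int nr_lr)
{
	int lr;

	for (lr = 0; lr < nr_lr; lr++)
		if (elrsr & (UINT64_C(1) << lr))
			return lr;
	return nr_lr;
}

/* Count the LRs that currently hold an interrupt (clear ELRSR bits). */
static int count_used_lrs_sketch(uint64_t elrsr, int nr_lr)
{
	int lr, used = 0;

	for (lr = 0; lr < nr_lr; lr++)
		if (!(elrsr & (UINT64_C(1) << lr)))
			used++;
	return used;
}
```

This is also why the enable paths initialize vgic_elrsr to ~0: with no
interrupt queued yet, every LR has to read as empty.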

This has been briefly tested on Midway, Juno and the model (the latter
both with GICv2 and GICv3 guests).

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
Changelog v2..v3:
- adapt to 4.3-rc
- keep, but change retire_lr to drop now unused parameter

 include/kvm/arm_vgic.h |   6 ---
 virt/kvm/arm/vgic-v2.c |   1 +
 virt/kvm/arm/vgic-v3.c |   1 +
 virt/kvm/arm/vgic.c    | 137 +++++++++++++++++++++----------------------------
 4 files changed, 61 insertions(+), 84 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 7bc5d02..926d67c 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -295,9 +295,6 @@ struct vgic_v3_cpu_if {
 };
 
 struct vgic_cpu {
-	/* per IRQ to LR mapping */
-	u8		*vgic_irq_lr_map;
-
 	/* Pending/active/both interrupts on this VCPU */
 	DECLARE_BITMAP(	pending_percpu, VGIC_NR_PRIVATE_IRQS);
 	DECLARE_BITMAP(	active_percpu, VGIC_NR_PRIVATE_IRQS);
@@ -308,9 +305,6 @@ struct vgic_cpu {
 	unsigned long   *active_shared;
 	unsigned long   *pend_act_shared;
 
-	/* Bitmap of used/free list registers */
-	DECLARE_BITMAP(	lr_used, VGIC_V2_MAX_LRS);
-
 	/* Number of list registers on this CPU */
 	int		nr_lr;
 
diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
index 8d7b04d..c0f5d7f 100644
--- a/virt/kvm/arm/vgic-v2.c
+++ b/virt/kvm/arm/vgic-v2.c
@@ -158,6 +158,7 @@ static void vgic_v2_enable(struct kvm_vcpu *vcpu)
 	 * anyway.
 	 */
 	vcpu->arch.vgic_cpu.vgic_v2.vgic_vmcr = 0;
+	vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr = ~0;
 
 	/* Get the show on the road... */
 	vcpu->arch.vgic_cpu.vgic_v2.vgic_hcr = GICH_HCR_EN;
diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
index 7dd5d62..92003cb 100644
--- a/virt/kvm/arm/vgic-v3.c
+++ b/virt/kvm/arm/vgic-v3.c
@@ -193,6 +193,7 @@ static void vgic_v3_enable(struct kvm_vcpu *vcpu)
 	 * anyway.
 	 */
 	vgic_v3->vgic_vmcr = 0;
+	vgic_v3->vgic_elrsr = ~0;
 
 	/*
 	 * If we are emulating a GICv3, we do it in an non-GICv2-compatible
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index f3e76e5..da0a866 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -102,7 +102,7 @@
 #include "vgic.h"
 
 static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
-static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu);
+static void vgic_retire_lr(int lr_nr, struct kvm_vcpu *vcpu);
 static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr);
 static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc);
 static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
@@ -672,6 +672,17 @@ bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
 	return false;
 }
 
+static void vgic_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
+			       struct vgic_lr vlr)
+{
+	vgic_ops->sync_lr_elrsr(vcpu, lr, vlr);
+}
+
+static inline u64 vgic_get_elrsr(struct kvm_vcpu *vcpu)
+{
+	return vgic_ops->get_elrsr(vcpu);
+}
+
 /**
  * vgic_unqueue_irqs - move pending/active IRQs from LRs to the distributor
  * @vgic_cpu: Pointer to the vgic_cpu struct holding the LRs
@@ -683,9 +694,11 @@ bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
 void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+	u64 elrsr = vgic_get_elrsr(vcpu);
+	unsigned long *elrsr_ptr = u64_to_bitmask(&elrsr);
 	int i;
 
-	for_each_set_bit(i, vgic_cpu->lr_used, vgic_cpu->nr_lr) {
+	for_each_clear_bit(i, elrsr_ptr, vgic_cpu->nr_lr) {
 		struct vgic_lr lr = vgic_get_lr(vcpu, i);
 
 		/*
@@ -728,7 +741,7 @@ void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 		 * Mark the LR as free for other use.
 		 */
 		BUG_ON(lr.state & LR_STATE_MASK);
-		vgic_retire_lr(i, lr.irq, vcpu);
+		vgic_retire_lr(i, vcpu);
 		vgic_irq_clear_queued(vcpu, lr.irq);
 
 		/* Finally update the VGIC state. */
@@ -1036,17 +1049,6 @@ static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr,
 	vgic_ops->set_lr(vcpu, lr, vlr);
 }
 
-static void vgic_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
-			       struct vgic_lr vlr)
-{
-	vgic_ops->sync_lr_elrsr(vcpu, lr, vlr);
-}
-
-static inline u64 vgic_get_elrsr(struct kvm_vcpu *vcpu)
-{
-	return vgic_ops->get_elrsr(vcpu);
-}
-
 static inline u64 vgic_get_eisr(struct kvm_vcpu *vcpu)
 {
 	return vgic_ops->get_eisr(vcpu);
@@ -1087,15 +1089,13 @@ static inline void vgic_enable(struct kvm_vcpu *vcpu)
 	vgic_ops->enable(vcpu);
 }
 
-static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu)
+static void vgic_retire_lr(int lr_nr, struct kvm_vcpu *vcpu)
 {
-	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_lr vlr = vgic_get_lr(vcpu, lr_nr);
 
 	vlr.state = 0;
+	vlr.hwirq = 0;
 	vgic_set_lr(vcpu, lr_nr, vlr);
-	clear_bit(lr_nr, vgic_cpu->lr_used);
-	vgic_cpu->vgic_irq_lr_map[irq] = LR_EMPTY;
 	vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
 
@@ -1110,14 +1110,15 @@ static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu)
  */
 static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
 {
-	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+	u64 elrsr = vgic_get_elrsr(vcpu);
+	unsigned long *elrsr_ptr = u64_to_bitmask(&elrsr);
 	int lr;
 
-	for_each_set_bit(lr, vgic_cpu->lr_used, vgic->nr_lr) {
+	for_each_clear_bit(lr, elrsr_ptr, vgic->nr_lr) {
 		struct vgic_lr vlr = vgic_get_lr(vcpu, lr);
 
 		if (!vgic_irq_is_enabled(vcpu, vlr.irq)) {
-			vgic_retire_lr(lr, vlr.irq, vcpu);
+			vgic_retire_lr(lr, vcpu);
 			if (vgic_irq_is_queued(vcpu, vlr.irq))
 				vgic_irq_clear_queued(vcpu, vlr.irq);
 		}
@@ -1169,9 +1170,10 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
  */
 bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 {
-	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 	struct vgic_lr vlr;
+	u64 elrsr = vgic_get_elrsr(vcpu);
+	unsigned long *elrsr_ptr = u64_to_bitmask(&elrsr);
 	int lr;
 
 	/* Sanitize the input... */
@@ -1181,28 +1183,12 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 
 	kvm_debug("Queue IRQ%d\n", irq);
 
-	lr = vgic_cpu->vgic_irq_lr_map[irq];
+	lr = find_first_bit(elrsr_ptr, vgic->nr_lr);
 
-	/* Do we have an active interrupt for the same CPUID? */
-	if (lr != LR_EMPTY) {
-		vlr = vgic_get_lr(vcpu, lr);
-		if (vlr.source == sgi_source_id) {
-			kvm_debug("LR%d piggyback for IRQ%d\n", lr, vlr.irq);
-			BUG_ON(!test_bit(lr, vgic_cpu->lr_used));
-			vgic_queue_irq_to_lr(vcpu, irq, lr, vlr);
-			return true;
-		}
-	}
-
-	/* Try to use another LR for this interrupt */
-	lr = find_first_zero_bit((unsigned long *)vgic_cpu->lr_used,
-			       vgic->nr_lr);
 	if (lr >= vgic->nr_lr)
 		return false;
 
 	kvm_debug("LR%d allocated for IRQ%d %x\n", lr, irq, sgi_source_id);
-	vgic_cpu->vgic_irq_lr_map[irq] = lr;
-	set_bit(lr, vgic_cpu->lr_used);
 
 	vlr.irq = irq;
 	vlr.source = sgi_source_id;
@@ -1214,9 +1200,6 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 
 static bool vgic_queue_hwirq(struct kvm_vcpu *vcpu, int irq)
 {
-	if (!vgic_can_sample_irq(vcpu, irq))
-		return true; /* level interrupt, already queued */
-
 	if (vgic_queue_irq(vcpu, 0, irq)) {
 		if (vgic_irq_is_edge(vcpu, irq)) {
 			vgic_dist_irq_clear_pending(vcpu, irq);
@@ -1299,9 +1282,6 @@ epilog:
 	for (lr = 0; lr < vgic->nr_lr; lr++) {
 		struct vgic_lr vlr;
 
-		if (!test_bit(lr, vgic_cpu->lr_used))
-			continue;
-
 		vlr = vgic_get_lr(vcpu, lr);
 
 		/*
@@ -1363,11 +1343,7 @@ static int process_queued_irq(struct kvm_vcpu *vcpu,
 	 * Despite being EOIed, the LR may not have
 	 * been marked as empty.
 	 */
-	vlr.state = 0;
-	vlr.hwirq = 0;
-	vgic_set_lr(vcpu, lr, vlr);
-
-	vgic_sync_lr_elrsr(vcpu, lr, vlr);
+	vgic_retire_lr(lr, vcpu);
 
 	return pending;
 }
@@ -1432,7 +1408,6 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
  */
 static bool vgic_sync_hwirq(struct kvm_vcpu *vcpu, int lr, struct vgic_lr vlr)
 {
-	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 	struct irq_phys_map *map;
 	bool phys_active;
 	bool level_pending;
@@ -1464,50 +1439,62 @@ static bool vgic_sync_hwirq(struct kvm_vcpu *vcpu, int lr, struct vgic_lr vlr)
 		return false;
 	}
 
-	spin_lock(&dist->lock);
 	level_pending = process_queued_irq(vcpu, lr, vlr);
-	spin_unlock(&dist->lock);
 	return level_pending;
 }
 
 /* Sync back the VGIC state after a guest run */
 static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
 {
-	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 	u64 elrsr;
 	unsigned long *elrsr_ptr;
-	int lr, pending;
-	bool level_pending;
+	bool pending;
+	int lr;
 
-	level_pending = vgic_process_maintenance(vcpu);
+	pending = vgic_process_maintenance(vcpu);
 	elrsr = vgic_get_elrsr(vcpu);
 	elrsr_ptr = u64_to_bitmask(&elrsr);
 
-	/* Deal with HW interrupts, and clear mappings for empty LRs */
+	spin_lock(&dist->lock);
+	/* Put all state from the LRs back into our emulation. */
 	for (lr = 0; lr < vgic->nr_lr; lr++) {
-		struct vgic_lr vlr;
-
-		if (!test_bit(lr, vgic_cpu->lr_used))
-			continue;
+		struct vgic_lr vlr = vgic_get_lr(vcpu, lr);
 
-		vlr = vgic_get_lr(vcpu, lr);
+		/* Deal with deactivated HW interrupts */
 		if (vgic_sync_hwirq(vcpu, lr, vlr))
-			level_pending = true;
+			pending = true;
 
-		if (!test_bit(lr, elrsr_ptr))
+		if (test_bit(lr, elrsr_ptr))
 			continue;
 
-		clear_bit(lr, vgic_cpu->lr_used);
+		/* Reestablish SGI source for pending and active SGIs */
+		if (vlr.irq < VGIC_NR_SGIS)
+			add_sgi_source(vcpu, vlr.irq, vlr.source);
+
+		if (vlr.state & LR_STATE_PENDING)
+			vgic_dist_irq_set_pending(vcpu, vlr.irq);
 
-		BUG_ON(vlr.irq >= dist->nr_irqs);
-		vgic_cpu->vgic_irq_lr_map[vlr.irq] = LR_EMPTY;
+		if (vlr.state & LR_STATE_ACTIVE) {
+			if (vlr.state & LR_STATE_PENDING) {
+				vgic_irq_set_active(vcpu, vlr.irq);
+			} else {
+				/* Active-only IRQs stay in the LR */
+				pending = true;
+				continue;
+			}
+		}
+
+		pending = true;
+
+		/* Mark this LR as empty now. */
+		vgic_retire_lr(lr, vcpu);
 	}
+	vgic_update_state(vcpu->kvm);
 
-	/* Check if we still have something up our sleeve... */
-	pending = find_first_zero_bit(elrsr_ptr, vgic->nr_lr);
-	if (level_pending || pending < vgic->nr_lr)
+	if (pending)
 		set_bit(vcpu->vcpu_id, dist->irq_pending_on_cpu);
+	spin_unlock(&dist->lock);
 }
 
 void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
@@ -1923,12 +1910,10 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
 	kfree(vgic_cpu->pending_shared);
 	kfree(vgic_cpu->active_shared);
 	kfree(vgic_cpu->pend_act_shared);
-	kfree(vgic_cpu->vgic_irq_lr_map);
 	vgic_destroy_irq_phys_map(vcpu->kvm, &vgic_cpu->irq_phys_map_list);
 	vgic_cpu->pending_shared = NULL;
 	vgic_cpu->active_shared = NULL;
 	vgic_cpu->pend_act_shared = NULL;
-	vgic_cpu->vgic_irq_lr_map = NULL;
 }
 
 static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
@@ -1939,18 +1924,14 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
 	vgic_cpu->pending_shared = kzalloc(sz, GFP_KERNEL);
 	vgic_cpu->active_shared = kzalloc(sz, GFP_KERNEL);
 	vgic_cpu->pend_act_shared = kzalloc(sz, GFP_KERNEL);
-	vgic_cpu->vgic_irq_lr_map = kmalloc(nr_irqs, GFP_KERNEL);
 
 	if (!vgic_cpu->pending_shared
 		|| !vgic_cpu->active_shared
-		|| !vgic_cpu->pend_act_shared
-		|| !vgic_cpu->vgic_irq_lr_map) {
+		|| !vgic_cpu->pend_act_shared) {
 		kvm_vgic_vcpu_destroy(vcpu);
 		return -ENOMEM;
 	}
 
-	memset(vgic_cpu->vgic_irq_lr_map, LR_EMPTY, nr_irqs);
-
 	/*
 	 * Store the number of LRs per vcpu, so we don't have to go
 	 * all the way to the distributor structure to find out. Only
-- 
2.5.1


 	vgic_ops->set_lr(vcpu, lr, vlr);
 }
 
-static void vgic_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
-			       struct vgic_lr vlr)
-{
-	vgic_ops->sync_lr_elrsr(vcpu, lr, vlr);
-}
-
-static inline u64 vgic_get_elrsr(struct kvm_vcpu *vcpu)
-{
-	return vgic_ops->get_elrsr(vcpu);
-}
-
 static inline u64 vgic_get_eisr(struct kvm_vcpu *vcpu)
 {
 	return vgic_ops->get_eisr(vcpu);
@@ -1087,15 +1089,13 @@ static inline void vgic_enable(struct kvm_vcpu *vcpu)
 	vgic_ops->enable(vcpu);
 }
 
-static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu)
+static void vgic_retire_lr(int lr_nr, struct kvm_vcpu *vcpu)
 {
-	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_lr vlr = vgic_get_lr(vcpu, lr_nr);
 
 	vlr.state = 0;
+	vlr.hwirq = 0;
 	vgic_set_lr(vcpu, lr_nr, vlr);
-	clear_bit(lr_nr, vgic_cpu->lr_used);
-	vgic_cpu->vgic_irq_lr_map[irq] = LR_EMPTY;
 	vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
 
@@ -1110,14 +1110,15 @@ static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu)
  */
 static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
 {
-	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+	u64 elrsr = vgic_get_elrsr(vcpu);
+	unsigned long *elrsr_ptr = u64_to_bitmask(&elrsr);
 	int lr;
 
-	for_each_set_bit(lr, vgic_cpu->lr_used, vgic->nr_lr) {
+	for_each_clear_bit(lr, elrsr_ptr, vgic->nr_lr) {
 		struct vgic_lr vlr = vgic_get_lr(vcpu, lr);
 
 		if (!vgic_irq_is_enabled(vcpu, vlr.irq)) {
-			vgic_retire_lr(lr, vlr.irq, vcpu);
+			vgic_retire_lr(lr, vcpu);
 			if (vgic_irq_is_queued(vcpu, vlr.irq))
 				vgic_irq_clear_queued(vcpu, vlr.irq);
 		}
@@ -1169,9 +1170,10 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
  */
 bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 {
-	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 	struct vgic_lr vlr;
+	u64 elrsr = vgic_get_elrsr(vcpu);
+	unsigned long *elrsr_ptr = u64_to_bitmask(&elrsr);
 	int lr;
 
 	/* Sanitize the input... */
@@ -1181,28 +1183,12 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 
 	kvm_debug("Queue IRQ%d\n", irq);
 
-	lr = vgic_cpu->vgic_irq_lr_map[irq];
+	lr = find_first_bit(elrsr_ptr, vgic->nr_lr);
 
-	/* Do we have an active interrupt for the same CPUID? */
-	if (lr != LR_EMPTY) {
-		vlr = vgic_get_lr(vcpu, lr);
-		if (vlr.source == sgi_source_id) {
-			kvm_debug("LR%d piggyback for IRQ%d\n", lr, vlr.irq);
-			BUG_ON(!test_bit(lr, vgic_cpu->lr_used));
-			vgic_queue_irq_to_lr(vcpu, irq, lr, vlr);
-			return true;
-		}
-	}
-
-	/* Try to use another LR for this interrupt */
-	lr = find_first_zero_bit((unsigned long *)vgic_cpu->lr_used,
-			       vgic->nr_lr);
 	if (lr >= vgic->nr_lr)
 		return false;
 
 	kvm_debug("LR%d allocated for IRQ%d %x\n", lr, irq, sgi_source_id);
-	vgic_cpu->vgic_irq_lr_map[irq] = lr;
-	set_bit(lr, vgic_cpu->lr_used);
 
 	vlr.irq = irq;
 	vlr.source = sgi_source_id;
@@ -1214,9 +1200,6 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 
 static bool vgic_queue_hwirq(struct kvm_vcpu *vcpu, int irq)
 {
-	if (!vgic_can_sample_irq(vcpu, irq))
-		return true; /* level interrupt, already queued */
-
 	if (vgic_queue_irq(vcpu, 0, irq)) {
 		if (vgic_irq_is_edge(vcpu, irq)) {
 			vgic_dist_irq_clear_pending(vcpu, irq);
@@ -1299,9 +1282,6 @@ epilog:
 	for (lr = 0; lr < vgic->nr_lr; lr++) {
 		struct vgic_lr vlr;
 
-		if (!test_bit(lr, vgic_cpu->lr_used))
-			continue;
-
 		vlr = vgic_get_lr(vcpu, lr);
 
 		/*
@@ -1363,11 +1343,7 @@ static int process_queued_irq(struct kvm_vcpu *vcpu,
 	 * Despite being EOIed, the LR may not have
 	 * been marked as empty.
 	 */
-	vlr.state = 0;
-	vlr.hwirq = 0;
-	vgic_set_lr(vcpu, lr, vlr);
-
-	vgic_sync_lr_elrsr(vcpu, lr, vlr);
+	vgic_retire_lr(lr, vcpu);
 
 	return pending;
 }
@@ -1432,7 +1408,6 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
  */
 static bool vgic_sync_hwirq(struct kvm_vcpu *vcpu, int lr, struct vgic_lr vlr)
 {
-	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 	struct irq_phys_map *map;
 	bool phys_active;
 	bool level_pending;
@@ -1464,50 +1439,62 @@ static bool vgic_sync_hwirq(struct kvm_vcpu *vcpu, int lr, struct vgic_lr vlr)
 		return false;
 	}
 
-	spin_lock(&dist->lock);
 	level_pending = process_queued_irq(vcpu, lr, vlr);
-	spin_unlock(&dist->lock);
 	return level_pending;
 }
 
 /* Sync back the VGIC state after a guest run */
 static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
 {
-	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 	u64 elrsr;
 	unsigned long *elrsr_ptr;
-	int lr, pending;
-	bool level_pending;
+	bool pending;
+	int lr;
 
-	level_pending = vgic_process_maintenance(vcpu);
+	pending = vgic_process_maintenance(vcpu);
 	elrsr = vgic_get_elrsr(vcpu);
 	elrsr_ptr = u64_to_bitmask(&elrsr);
 
-	/* Deal with HW interrupts, and clear mappings for empty LRs */
+	spin_lock(&dist->lock);
+	/* Put all state from the LRs back into our emulation. */
 	for (lr = 0; lr < vgic->nr_lr; lr++) {
-		struct vgic_lr vlr;
-
-		if (!test_bit(lr, vgic_cpu->lr_used))
-			continue;
+		struct vgic_lr vlr = vgic_get_lr(vcpu, lr);
 
-		vlr = vgic_get_lr(vcpu, lr);
+		/* Deal with deactivated HW interrupts */
 		if (vgic_sync_hwirq(vcpu, lr, vlr))
-			level_pending = true;
+			pending = true;
 
-		if (!test_bit(lr, elrsr_ptr))
+		if (test_bit(lr, elrsr_ptr))
 			continue;
 
-		clear_bit(lr, vgic_cpu->lr_used);
+		/* Reestablish SGI source for pending and active SGIs */
+		if (vlr.irq < VGIC_NR_SGIS)
+			add_sgi_source(vcpu, vlr.irq, vlr.source);
+
+		if (vlr.state & LR_STATE_PENDING)
+			vgic_dist_irq_set_pending(vcpu, vlr.irq);
 
-		BUG_ON(vlr.irq >= dist->nr_irqs);
-		vgic_cpu->vgic_irq_lr_map[vlr.irq] = LR_EMPTY;
+		if (vlr.state & LR_STATE_ACTIVE) {
+			if (vlr.state & LR_STATE_PENDING) {
+				vgic_irq_set_active(vcpu, vlr.irq);
+			} else {
+				/* Active-only IRQs stay in the LR */
+				pending = true;
+				continue;
+			}
+		}
+
+		pending = true;
+
+		/* Mark this LR as empty now. */
+		vgic_retire_lr(lr, vcpu);
 	}
+	vgic_update_state(vcpu->kvm);
 
-	/* Check if we still have something up our sleeve... */
-	pending = find_first_zero_bit(elrsr_ptr, vgic->nr_lr);
-	if (level_pending || pending < vgic->nr_lr)
+	if (pending)
 		set_bit(vcpu->vcpu_id, dist->irq_pending_on_cpu);
+	spin_unlock(&dist->lock);
 }
 
 void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
@@ -1923,12 +1910,10 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
 	kfree(vgic_cpu->pending_shared);
 	kfree(vgic_cpu->active_shared);
 	kfree(vgic_cpu->pend_act_shared);
-	kfree(vgic_cpu->vgic_irq_lr_map);
 	vgic_destroy_irq_phys_map(vcpu->kvm, &vgic_cpu->irq_phys_map_list);
 	vgic_cpu->pending_shared = NULL;
 	vgic_cpu->active_shared = NULL;
 	vgic_cpu->pend_act_shared = NULL;
-	vgic_cpu->vgic_irq_lr_map = NULL;
 }
 
 static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
@@ -1939,18 +1924,14 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
 	vgic_cpu->pending_shared = kzalloc(sz, GFP_KERNEL);
 	vgic_cpu->active_shared = kzalloc(sz, GFP_KERNEL);
 	vgic_cpu->pend_act_shared = kzalloc(sz, GFP_KERNEL);
-	vgic_cpu->vgic_irq_lr_map = kmalloc(nr_irqs, GFP_KERNEL);
 
 	if (!vgic_cpu->pending_shared
 		|| !vgic_cpu->active_shared
-		|| !vgic_cpu->pend_act_shared
-		|| !vgic_cpu->vgic_irq_lr_map) {
+		|| !vgic_cpu->pend_act_shared) {
 		kvm_vgic_vcpu_destroy(vcpu);
 		return -ENOMEM;
 	}
 
-	memset(vgic_cpu->vgic_irq_lr_map, LR_EMPTY, nr_irqs);
-
 	/*
 	 * Store the number of LRs per vcpu, so we don't have to go
 	 * all the way to the distributor structure to find out. Only
-- 
2.5.1


* [PATCH v3 02/16] KVM: arm/arm64: remove now unused code after stay-in-LR rework
  2015-10-07 14:55 ` Andre Przywara
@ 2015-10-07 14:55   ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-07 14:55 UTC (permalink / raw)
  To: marc.zyngier, christoffer.dall
  Cc: eric.auger, p.fedin, kvmarm, linux-arm-kernel, kvm

Now that we synchronize the LR state into our emulation upon guest
exit, there is no need to take extra care of disabled IRQs.
Remove that code.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
Changelog v2..v3:
- new patch

 virt/kvm/arm/vgic.c | 29 -----------------------------
 1 file changed, 29 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index da0a866..a5360b7 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -101,7 +101,6 @@
 
 #include "vgic.h"
 
-static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
 static void vgic_retire_lr(int lr_nr, struct kvm_vcpu *vcpu);
 static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr);
 static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc);
@@ -477,7 +476,6 @@ bool vgic_handle_enable_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
 {
 	u32 *reg;
 	int mode = ACCESS_READ_VALUE | access;
-	struct kvm_vcpu *target_vcpu = kvm_get_vcpu(kvm, vcpu_id);
 
 	reg = vgic_bitmap_get_reg(&kvm->arch.vgic.irq_enabled, vcpu_id, offset);
 	vgic_reg_access(mmio, reg, offset, mode);
@@ -485,7 +483,6 @@ bool vgic_handle_enable_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
 		if (access & ACCESS_WRITE_CLEARBIT) {
 			if (offset < 4) /* Force SGI enabled */
 				*reg |= 0xffff;
-			vgic_retire_disabled_irqs(target_vcpu);
 		}
 		vgic_update_state(kvm);
 		return true;
@@ -1099,32 +1096,6 @@ static void vgic_retire_lr(int lr_nr, struct kvm_vcpu *vcpu)
 	vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
 
-/*
- * An interrupt may have been disabled after being made pending on the
- * CPU interface (the classic case is a timer running while we're
- * rebooting the guest - the interrupt would kick as soon as the CPU
- * interface gets enabled, with deadly consequences).
- *
- * The solution is to examine already active LRs, and check the
- * interrupt is still enabled. If not, just retire it.
- */
-static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
-{
-	u64 elrsr = vgic_get_elrsr(vcpu);
-	unsigned long *elrsr_ptr = u64_to_bitmask(&elrsr);
-	int lr;
-
-	for_each_clear_bit(lr, elrsr_ptr, vgic->nr_lr) {
-		struct vgic_lr vlr = vgic_get_lr(vcpu, lr);
-
-		if (!vgic_irq_is_enabled(vcpu, vlr.irq)) {
-			vgic_retire_lr(lr, vcpu);
-			if (vgic_irq_is_queued(vcpu, vlr.irq))
-				vgic_irq_clear_queued(vcpu, vlr.irq);
-		}
-	}
-}
-
 static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
 				 int lr_nr, struct vgic_lr vlr)
 {
-- 
2.5.1



* [PATCH v3 03/16] KVM: extend struct kvm_msi to hold a 32-bit device ID
  2015-10-07 14:55 ` Andre Przywara
@ 2015-10-07 14:55   ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-07 14:55 UTC (permalink / raw)
  To: marc.zyngier, christoffer.dall; +Cc: kvm, kvmarm, linux-arm-kernel

The ARM GICv3 ITS MSI controller requires a device ID to be able to
assign the proper interrupt vector. On real hardware, this ID is
sampled from the bus. To be able to emulate an ITS controller, extend
the KVM MSI interface to let userspace provide such a device ID. For
PCI devices, the device ID is simply the 16-bit bus-device-function
triplet, which should be easily available to the userland tool.

Also, there is a new KVM capability which advertises whether the
current VM requires a device ID to be set along with the MSI data.
This capability is still reported as unavailable everywhere; it will
be enabled later, when ITS emulation is used.
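
As a rough illustration of how userland might fill the extended
structure, here is a self-contained sketch. It mirrors the layout from
the diff below in a local struct so that it builds without kernel
headers; the sketch_ names, pci_devid() and make_msi() are hypothetical
helpers, not part of this series. Real code would include
<linux/kvm.h>, check KVM_CAP_MSI_DEVID first, and hand the structure to
the KVM_SIGNAL_MSI ioctl.

```c
#include <stdint.h>
#include <string.h>

/* Local mirror of the proposed layout, for illustration only. */
#define SKETCH_MSI_VALID_DEVID	(1U << 0)

struct sketch_kvm_msi {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
	uint32_t flags;
	uint32_t devid;
	uint8_t  pad[12];
};

/* devid is carved out of the padding, so the size stays at 32 bytes. */
_Static_assert(sizeof(struct sketch_kvm_msi) == 32, "size must not change");

/* Pack a PCI bus/device/function triplet into a 16-bit device ID. */
static uint32_t pci_devid(uint8_t bus, uint8_t dev, uint8_t fn)
{
	return ((uint32_t)bus << 8) | ((uint32_t)(dev & 0x1f) << 3) | (fn & 0x7);
}

/* Fill the message; set the flag and devid only if the VM needs them. */
static struct sketch_kvm_msi make_msi(uint64_t doorbell, uint32_t data,
				      int need_devid, uint32_t devid)
{
	struct sketch_kvm_msi msi;

	memset(&msi, 0, sizeof(msi));
	msi.address_lo = (uint32_t)doorbell;
	msi.address_hi = (uint32_t)(doorbell >> 32);
	msi.data = data;
	if (need_devid) {
		msi.flags |= SKETCH_MSI_VALID_DEVID;
		msi.devid = devid;
	}
	return msi;
}
```

Leaving flags and devid at zero when the capability is absent keeps the
message identical to what existing userland sends today.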

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
---
Changelog v2..v3:
- adjust KVM_CAP number to not clash with upstream

 Documentation/virtual/kvm/api.txt | 12 ++++++++++--
 include/uapi/linux/kvm.h          |  5 ++++-
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index d9eccee..a302e0a 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2147,10 +2147,18 @@ struct kvm_msi {
 	__u32 address_hi;
 	__u32 data;
 	__u32 flags;
-	__u8  pad[16];
+	__u32 devid;
+	__u8  pad[12];
 };
 
-No flags are defined so far. The corresponding field must be 0.
+flags: KVM_MSI_VALID_DEVID: devid contains a valid value
+devid: If KVM_MSI_VALID_DEVID is set, contains a unique device identifier
+       for the device that wrote the MSI message.
+       For PCI, this is usually a BDF identifier in the lower 16 bits.
+
+The per-VM KVM_CAP_MSI_DEVID capability advertises the need to provide
+the device ID. If this capability is not set, userland cannot rely on
+the kernel to allow the KVM_MSI_VALID_DEVID flag being set.
 
 
 4.71 KVM_CREATE_PIT2
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index a9256f0..eae9ba1 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -824,6 +824,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_MULTI_ADDRESS_SPACE 118
 #define KVM_CAP_GUEST_DEBUG_HW_BPS 119
 #define KVM_CAP_GUEST_DEBUG_HW_WPS 120
+#define KVM_CAP_MSI_DEVID 121
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -975,12 +976,14 @@ struct kvm_one_reg {
 	__u64 addr;
 };
 
+#define KVM_MSI_VALID_DEVID	(1U << 0)
 struct kvm_msi {
 	__u32 address_lo;
 	__u32 address_hi;
 	__u32 data;
 	__u32 flags;
-	__u8  pad[16];
+	__u32 devid;
+	__u8  pad[12];
 };
 
 struct kvm_arm_device_addr {
-- 
2.5.1


* [PATCH v3 04/16] KVM: arm/arm64: add emulation model specific destroy function
  2015-10-07 14:55 ` Andre Przywara
@ 2015-10-07 14:55   ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-07 14:55 UTC (permalink / raw)
  To: marc.zyngier, christoffer.dall
  Cc: eric.auger, p.fedin, kvmarm, linux-arm-kernel, kvm

Currently we destroy the VGIC emulation in one function that takes care
of all emulated models. To be on par with init_model (which is model
specific), let's introduce a per-emulation-model destroy method, too.
Use it for a small piece of GICv3-specific code already; later it will
be handy for the ITS emulation.
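
The pattern is an optional callback in the per-model ops table, called
before the common teardown. A stand-alone sketch of that idea follows;
all sketch_ types and fields are hypothetical stand-ins for struct kvm
and struct vgic_vm_ops, and model_private plays the role that
irq_spi_mpidr has in the diff.

```c
#include <stddef.h>
#include <stdlib.h>

struct sketch_vm;

/* Per-model operations; destroy_model is optional and may be NULL. */
struct sketch_vm_ops {
	int  (*init_model)(struct sketch_vm *vm);
	void (*destroy_model)(struct sketch_vm *vm);
};

struct sketch_vm {
	struct sketch_vm_ops ops;
	unsigned char *model_private;	/* owned by the model */
	unsigned char *common_state;	/* owned by the common code */
};

static int v3_init_model(struct sketch_vm *vm)
{
	vm->model_private = calloc(16, 1);
	return vm->model_private ? 0 : -1;
}

static void v3_destroy_model(struct sketch_vm *vm)
{
	free(vm->model_private);
	vm->model_private = NULL;
}

/* Common teardown: let the model clean up first, if it wants to. */
static void sketch_vm_destroy(struct sketch_vm *vm)
{
	if (vm->ops.destroy_model)
		vm->ops.destroy_model(vm);
	free(vm->common_state);
	vm->common_state = NULL;
}
```

A model that allocates nothing of its own simply leaves destroy_model
unset, which is why the wrapper checks the pointer before calling it.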

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
---
Changelog v2..v3:
- none

 include/kvm/arm_vgic.h      |  1 +
 virt/kvm/arm/vgic-v3-emul.c |  9 +++++++++
 virt/kvm/arm/vgic.c         | 11 ++++++++++-
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 926d67c..2c10082 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -144,6 +144,7 @@ struct vgic_vm_ops {
 	bool	(*queue_sgi)(struct kvm_vcpu *, int irq);
 	void	(*add_sgi_source)(struct kvm_vcpu *, int irq, int source);
 	int	(*init_model)(struct kvm *);
+	void	(*destroy_model)(struct kvm *);
 	int	(*map_resources)(struct kvm *, const struct vgic_params *);
 };
 
diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
index e661e7f..d2eeb20 100644
--- a/virt/kvm/arm/vgic-v3-emul.c
+++ b/virt/kvm/arm/vgic-v3-emul.c
@@ -862,6 +862,14 @@ static int vgic_v3_init_model(struct kvm *kvm)
 	return 0;
 }
 
+static void vgic_v3_destroy_model(struct kvm *kvm)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+
+	kfree(dist->irq_spi_mpidr);
+	dist->irq_spi_mpidr = NULL;
+}
+
 /* GICv3 does not keep track of SGI sources anymore. */
 static void vgic_v3_add_sgi_source(struct kvm_vcpu *vcpu, int irq, int source)
 {
@@ -874,6 +882,7 @@ void vgic_v3_init_emulation(struct kvm *kvm)
 	dist->vm_ops.queue_sgi = vgic_v3_queue_sgi;
 	dist->vm_ops.add_sgi_source = vgic_v3_add_sgi_source;
 	dist->vm_ops.init_model = vgic_v3_init_model;
+	dist->vm_ops.destroy_model = vgic_v3_destroy_model;
 	dist->vm_ops.map_resources = vgic_v3_map_resources;
 
 	kvm->arch.max_vcpus = KVM_MAX_VCPUS;
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index a5360b7..b71f627 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -125,6 +125,14 @@ int kvm_vgic_map_resources(struct kvm *kvm)
 	return kvm->arch.vgic.vm_ops.map_resources(kvm, vgic);
 }
 
+static void vgic_destroy_model(struct kvm *kvm)
+{
+	struct vgic_vm_ops *vm_ops = &kvm->arch.vgic.vm_ops;
+
+	if (vm_ops->destroy_model)
+		vm_ops->destroy_model(kvm);
+}
+
 /*
  * struct vgic_bitmap contains a bitmap made of unsigned longs, but
  * extracts u32s out of them.
@@ -1941,6 +1949,8 @@ void kvm_vgic_destroy(struct kvm *kvm)
 	struct kvm_vcpu *vcpu;
 	int i;
 
+	vgic_destroy_model(kvm);
+
 	kvm_for_each_vcpu(i, vcpu, kvm)
 		kvm_vgic_vcpu_destroy(vcpu);
 
@@ -1957,7 +1967,6 @@ void kvm_vgic_destroy(struct kvm *kvm)
 	}
 	kfree(dist->irq_sgi_sources);
 	kfree(dist->irq_spi_cpu);
-	kfree(dist->irq_spi_mpidr);
 	kfree(dist->irq_spi_target);
 	kfree(dist->irq_pending_on_cpu);
 	kfree(dist->irq_active_on_cpu);
-- 
2.5.1



* [PATCH v3 05/16] KVM: arm/arm64: extend arch CAP checks to allow per-VM capabilities
  2015-10-07 14:55 ` Andre Przywara
@ 2015-10-07 14:55   ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-07 14:55 UTC (permalink / raw)
  To: marc.zyngier, christoffer.dall
  Cc: eric.auger, p.fedin, kvmarm, linux-arm-kernel, kvm

KVM capabilities can be a per-VM property, though ARM/ARM64 currently
does not pass on the VM pointer to the architecture-specific
capability handlers.
Add a "struct kvm *" parameter to those functions to later allow proper
per-VM capability reporting.
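
To show what the extra parameter makes possible, here is a small
sketch of a handler that answers differently per VM. It is not the
code a later patch adds: struct sketch_kvm and its its_enabled field
are invented for the example, and treating a NULL pointer as a
system-wide query is an assumption. The capability number is the one
proposed in patch 03/16.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_CAP_MSI_DEVID	121	/* number proposed in patch 03/16 */

/* Hypothetical per-VM state, standing in for struct kvm. */
struct sketch_kvm {
	bool its_enabled;
};

/*
 * Report a capability for one VM. A NULL kvm is assumed to mean a
 * system-wide query, for which no per-VM answer can be given.
 */
static int sketch_check_extension(const struct sketch_kvm *kvm, long ext)
{
	switch (ext) {
	case SKETCH_CAP_MSI_DEVID:
		return kvm && kvm->its_enabled;
	default:
		return 0;
	}
}
```

Without the VM pointer the handler could only return one answer for
the whole host, which is the limitation this patch removes.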

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
Changelog v2..v3:
- none

 arch/arm/include/asm/kvm_host.h   | 2 +-
 arch/arm/kvm/arm.c                | 2 +-
 arch/arm64/include/asm/kvm_host.h | 2 +-
 arch/arm64/kvm/reset.c            | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 3df1e97..88e84db 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -210,7 +210,7 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
 }
 
-static inline int kvm_arch_dev_ioctl_check_extension(long ext)
+static inline int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
 {
 	return 0;
 }
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 6c7f4520..bdbefcd 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -197,7 +197,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		r = KVM_MAX_VCPUS;
 		break;
 	default:
-		r = kvm_arch_dev_ioctl_check_extension(ext);
+		r = kvm_arch_dev_ioctl_check_extension(kvm, ext);
 		break;
 	}
 	return r;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 4562459..c41e613 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -43,7 +43,7 @@
 
 int __attribute_const__ kvm_target_cpu(void);
 int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
-int kvm_arch_dev_ioctl_check_extension(long ext);
+int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext);
 
 struct kvm_arch {
 	/* The VMID generation used for the virt. memory system */
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 91cf535..4d7f78b4 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -63,7 +63,7 @@ static bool cpu_has_32bit_el1(void)
  * We currently assume that the number of HW registers is uniform
  * across all CPUs (see cpuinfo_sanity_check).
  */
-int kvm_arch_dev_ioctl_check_extension(long ext)
+int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
 {
 	int r;
 
-- 
2.5.1



* [PATCH v3 06/16] KVM: arm/arm64: make GIC frame address initialization model specific
  2015-10-07 14:55 ` Andre Przywara
@ 2015-10-07 14:55   ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-07 14:55 UTC (permalink / raw)
  To: marc.zyngier, christoffer.dall; +Cc: kvm, kvmarm, linux-arm-kernel

Currently we initialize all the possible GIC frame addresses in one
function, without looking at the specific GIC model we instantiate
for the guest.
As this gets confusing when adding another VGIC model later, let's
move these initializations into the respective model's emulation
init functions.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
Changelog v2..v3:
- none

 virt/kvm/arm/vgic-v2-emul.c | 3 +++
 virt/kvm/arm/vgic-v3-emul.c | 3 +++
 virt/kvm/arm/vgic.c         | 3 ---
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/arm/vgic-v2-emul.c b/virt/kvm/arm/vgic-v2-emul.c
index 1390797..8faa28c 100644
--- a/virt/kvm/arm/vgic-v2-emul.c
+++ b/virt/kvm/arm/vgic-v2-emul.c
@@ -567,6 +567,9 @@ void vgic_v2_init_emulation(struct kvm *kvm)
 	dist->vm_ops.init_model = vgic_v2_init_model;
 	dist->vm_ops.map_resources = vgic_v2_map_resources;
 
+	dist->vgic_cpu_base = VGIC_ADDR_UNDEF;
+	dist->vgic_dist_base = VGIC_ADDR_UNDEF;
+
 	kvm->arch.max_vcpus = VGIC_V2_MAX_CPUS;
 }
 
diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
index d2eeb20..1f42348 100644
--- a/virt/kvm/arm/vgic-v3-emul.c
+++ b/virt/kvm/arm/vgic-v3-emul.c
@@ -885,6 +885,9 @@ void vgic_v3_init_emulation(struct kvm *kvm)
 	dist->vm_ops.destroy_model = vgic_v3_destroy_model;
 	dist->vm_ops.map_resources = vgic_v3_map_resources;
 
+	dist->vgic_dist_base = VGIC_ADDR_UNDEF;
+	dist->vgic_redist_base = VGIC_ADDR_UNDEF;
+
 	kvm->arch.max_vcpus = KVM_MAX_VCPUS;
 }
 
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index b71f627..1dd79e1 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -2160,9 +2160,6 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
 	kvm->arch.vgic.in_kernel = true;
 	kvm->arch.vgic.vgic_model = type;
 	kvm->arch.vgic.vctrl_base = vgic->vctrl_base;
-	kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
-	kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
-	kvm->arch.vgic.vgic_redist_base = VGIC_ADDR_UNDEF;
 
 out_unlock:
 	for (; vcpu_lock_idx >= 0; vcpu_lock_idx--) {
-- 
2.5.1

^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH v3 07/16] KVM: arm64: Introduce new MMIO region for the ITS base address
  2015-10-07 14:55 ` Andre Przywara
@ 2015-10-07 14:55   ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-07 14:55 UTC (permalink / raw)
  To: marc.zyngier, christoffer.dall
  Cc: eric.auger, p.fedin, kvmarm, linux-arm-kernel, kvm

The ARM GICv3 ITS controller requires a separate register frame to
cover ITS specific registers. Add a new VGIC address type and store
the address in a field in the vgic_dist structure.
Provide a function to check whether userland has provided the address,
so ITS functionality can be guarded by that check.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
---
Changelog v2..v3:
- none
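
The address handling described above can be sketched in isolation. This is
a minimal userspace model, not the kernel code: struct its_frame, the
its_frame_*() helpers, ADDR_UNDEF and SKETCH_SZ_64K are hypothetical names,
and the choice of -EINVAL and -EEXIST as error codes is an assumption made
for illustration. It only shows the idea of an "undefined" sentinel, a 64K
alignment check, and a "has userland set the address" predicate.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for VGIC_ADDR_UNDEF; the actual value is an assumption. */
#define ADDR_UNDEF	UINT64_MAX
#define SKETCH_SZ_64K	0x10000ULL

/* Hypothetical container for the ITS control frame base address. */
struct its_frame {
	uint64_t base;
};

/* Mark the frame address as "not provided by userland yet". */
static void its_frame_init(struct its_frame *f)
{
	assert(f != NULL);
	f->base = ADDR_UNDEF;
}

/* Accept a base address once, and only if it is 64K aligned. */
static int its_frame_set_base(struct its_frame *f, uint64_t addr)
{
	assert(f != NULL);
	if (addr & (SKETCH_SZ_64K - 1))
		return -EINVAL;		/* not 64K aligned */
	if (f->base != ADDR_UNDEF)
		return -EEXIST;		/* already set */
	f->base = addr;
	return 0;
}

/* Mirrors the idea behind vgic_has_its(): is the address defined? */
static bool its_frame_is_set(const struct its_frame *f)
{
	return f->base != ADDR_UNDEF;
}
```

Keeping the predicate separate from the setter lets later code guard every
ITS code path with a single cheap check.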

 Documentation/virtual/kvm/devices/arm-vgic.txt |  9 +++++++++
 arch/arm64/include/uapi/asm/kvm.h              |  2 ++
 include/kvm/arm_vgic.h                         |  3 +++
 virt/kvm/arm/vgic-v3-emul.c                    |  2 ++
 virt/kvm/arm/vgic.c                            | 16 ++++++++++++++++
 virt/kvm/arm/vgic.h                            |  1 +
 6 files changed, 33 insertions(+)

diff --git a/Documentation/virtual/kvm/devices/arm-vgic.txt b/Documentation/virtual/kvm/devices/arm-vgic.txt
index 3fb9054..ec715f9e 100644
--- a/Documentation/virtual/kvm/devices/arm-vgic.txt
+++ b/Documentation/virtual/kvm/devices/arm-vgic.txt
@@ -39,6 +39,15 @@ Groups:
       Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
       This address needs to be 64K aligned.
 
+    KVM_VGIC_V3_ADDR_TYPE_ITS (rw, 64-bit)
+      Base address in the guest physical address space of the GICv3 ITS
+      control register frame. The ITS allows MSI(-X) interrupts to be
+      injected into guests. This extension is optional, if the kernel
+      does not support the ITS, the call returns -ENODEV.
+      This memory is solely for the guest to access the ITS control
+      registers and does not cover the ITS translation register.
+      Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
+      This address needs to be 64K aligned and the region covers 64 KByte.
 
   KVM_DEV_ARM_VGIC_GRP_DIST_REGS
   Attributes:
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 0cd7b59..99e4006 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -87,9 +87,11 @@ struct kvm_regs {
 /* Supported VGICv3 address types  */
 #define KVM_VGIC_V3_ADDR_TYPE_DIST	2
 #define KVM_VGIC_V3_ADDR_TYPE_REDIST	3
+#define KVM_VGIC_V3_ADDR_TYPE_ITS	4
 
 #define KVM_VGIC_V3_DIST_SIZE		SZ_64K
 #define KVM_VGIC_V3_REDIST_SIZE		(2 * SZ_64K)
+#define KVM_VGIC_V3_ITS_SIZE		SZ_64K
 
 #define KVM_ARM_VCPU_POWER_OFF		0 /* CPU is started in OFF state */
 #define KVM_ARM_VCPU_EL1_32BIT		1 /* CPU running a 32bit VM */
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 2c10082..067ad09 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -190,6 +190,9 @@ struct vgic_dist {
 		phys_addr_t		vgic_redist_base;
 	};
 
+	/* The base address of the ITS control register frame */
+	phys_addr_t		vgic_its_base;
+
 	/* Distributor enabled */
 	u32			enabled;
 
diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
index 1f42348..a8cf669 100644
--- a/virt/kvm/arm/vgic-v3-emul.c
+++ b/virt/kvm/arm/vgic-v3-emul.c
@@ -887,6 +887,7 @@ void vgic_v3_init_emulation(struct kvm *kvm)
 
 	dist->vgic_dist_base = VGIC_ADDR_UNDEF;
 	dist->vgic_redist_base = VGIC_ADDR_UNDEF;
+	dist->vgic_its_base = VGIC_ADDR_UNDEF;
 
 	kvm->arch.max_vcpus = KVM_MAX_VCPUS;
 }
@@ -1059,6 +1060,7 @@ static int vgic_v3_has_attr(struct kvm_device *dev,
 			return -ENXIO;
 		case KVM_VGIC_V3_ADDR_TYPE_DIST:
 		case KVM_VGIC_V3_ADDR_TYPE_REDIST:
+		case KVM_VGIC_V3_ADDR_TYPE_ITS:
 			return 0;
 		}
 		break;
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 1dd79e1..4219f22 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -953,6 +953,16 @@ int vgic_register_kvm_io_dev(struct kvm *kvm, gpa_t base, int len,
 	return ret;
 }
 
+bool vgic_has_its(struct kvm *kvm)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+
+	if (dist->vgic_model != KVM_DEV_TYPE_ARM_VGIC_V3)
+		return false;
+
+	return !IS_VGIC_ADDR_UNDEF(dist->vgic_its_base);
+}
+
 static int vgic_nr_shared_irqs(struct vgic_dist *dist)
 {
 	return dist->nr_irqs - VGIC_NR_PRIVATE_IRQS;
@@ -2257,6 +2267,12 @@ int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write)
 		block_size = KVM_VGIC_V3_REDIST_SIZE;
 		alignment = SZ_64K;
 		break;
+	case KVM_VGIC_V3_ADDR_TYPE_ITS:
+		type_needed = KVM_DEV_TYPE_ARM_VGIC_V3;
+		addr_ptr = &vgic->vgic_its_base;
+		block_size = KVM_VGIC_V3_ITS_SIZE;
+		alignment = SZ_64K;
+		break;
 #endif
 	default:
 		r = -ENODEV;
diff --git a/virt/kvm/arm/vgic.h b/virt/kvm/arm/vgic.h
index 0df74cb..a093f5c 100644
--- a/virt/kvm/arm/vgic.h
+++ b/virt/kvm/arm/vgic.h
@@ -136,5 +136,6 @@ int vgic_get_common_attr(struct kvm_device *dev, struct kvm_device_attr *attr);
 int vgic_init(struct kvm *kvm);
 void vgic_v2_init_emulation(struct kvm *kvm);
 void vgic_v3_init_emulation(struct kvm *kvm);
+bool vgic_has_its(struct kvm *kvm);
 
 #endif
-- 
2.5.1


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH v3 08/16] KVM: arm64: handle ITS related GICv3 redistributor registers
  2015-10-07 14:55 ` Andre Przywara
@ 2015-10-07 14:55   ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-07 14:55 UTC (permalink / raw)
  To: marc.zyngier, christoffer.dall; +Cc: kvm, kvmarm, linux-arm-kernel

In the GICv3 redistributor there are the PENDBASER and PROPBASER
registers which we did not emulate so far, as they only make sense
when having an ITS. In preparation for that, emulate those MMIO
accesses by storing the 64-bit value written to each register in a
variable which we later read in the ITS emulation.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
Changelog v2..v3:
- rename vgic_handle_base_register to vgic_reg64_access()
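
The core idea of handling a 64-bit register through 32-bit accesses can be
shown standalone. The sketch below is a simplified userspace model, not the
code from this patch: reg64_write_half() and reg64_read_half() are
hypothetical helpers, they assume aligned 32-bit accesses at byte offset 0
or 4 only, and they leave out the "write ignored while LPIs are enabled"
mode handling.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: merge a 32-bit write into one half of a 64-bit
 * register. Offset 0 selects the lower word, offset 4 the upper word.
 */
static uint64_t reg64_write_half(uint64_t reg, unsigned int offset,
				 uint32_t val)
{
	assert(offset == 0 || offset == 4);

	if (offset == 0)
		return (reg & 0xffffffff00000000ULL) | val;

	return (reg & 0x00000000ffffffffULL) | ((uint64_t)val << 32);
}

/* Hypothetical helper: read back one 32-bit half of the register. */
static uint32_t reg64_read_half(uint64_t reg, unsigned int offset)
{
	assert(offset == 0 || offset == 4);

	return offset == 0 ? (uint32_t)reg : (uint32_t)(reg >> 32);
}
```

Writing the two halves in either order yields the same final 64-bit value,
which is what a guest doing two 32-bit stores would expect.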

 include/kvm/arm_vgic.h      |  8 ++++++++
 virt/kvm/arm/vgic-v3-emul.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 virt/kvm/arm/vgic.c         | 31 +++++++++++++++++++++++++++++++
 virt/kvm/arm/vgic.h         |  2 ++
 4 files changed, 85 insertions(+)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 067ad09..06c33bc 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -272,6 +272,14 @@ struct vgic_dist {
 	/* Virtual irq to hwirq mapping */
 	spinlock_t		irq_phys_map_lock;
 	struct list_head	irq_phys_map_list;
+
+	/* Address of LPI configuration table shared by all redistributors */
+	u64			propbaser;
+
+	/* Addresses of LPI pending tables per redistributor */
+	u64			*pendbaser;
+
+	bool			lpis_enabled;
 };
 
 struct vgic_v2_cpu_if {
diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
index a8cf669..6939f7c 100644
--- a/virt/kvm/arm/vgic-v3-emul.c
+++ b/virt/kvm/arm/vgic-v3-emul.c
@@ -651,6 +651,38 @@ static bool handle_mmio_cfg_reg_redist(struct kvm_vcpu *vcpu,
 	return vgic_handle_cfg_reg(reg, mmio, offset);
 }
 
+/* We don't trigger any actions here, just store the register value */
+static bool handle_mmio_propbaser_redist(struct kvm_vcpu *vcpu,
+					 struct kvm_exit_mmio *mmio,
+					 phys_addr_t offset)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	int mode = ACCESS_READ_VALUE;
+
+	/* Storing a value with LPIs already enabled is undefined */
+	mode |= dist->lpis_enabled ? ACCESS_WRITE_IGNORED : ACCESS_WRITE_VALUE;
+	vgic_reg64_access(mmio, offset, &dist->propbaser, mode);
+
+	return false;
+}
+
+/* We don't trigger any actions here, just store the register value */
+static bool handle_mmio_pendbaser_redist(struct kvm_vcpu *vcpu,
+					 struct kvm_exit_mmio *mmio,
+					 phys_addr_t offset)
+{
+	struct kvm_vcpu *rdvcpu = mmio->private;
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	int mode = ACCESS_READ_VALUE;
+
+	/* Storing a value with LPIs already enabled is undefined */
+	mode |= dist->lpis_enabled ? ACCESS_WRITE_IGNORED : ACCESS_WRITE_VALUE;
+	vgic_reg64_access(mmio, offset,
+			  &dist->pendbaser[rdvcpu->vcpu_id], mode);
+
+	return false;
+}
+
 #define SGI_base(x) ((x) + SZ_64K)
 
 static const struct vgic_io_range vgic_redist_ranges[] = {
@@ -679,6 +711,18 @@ static const struct vgic_io_range vgic_redist_ranges[] = {
 		.handle_mmio    = handle_mmio_raz_wi,
 	},
 	{
+		.base		= GICR_PENDBASER,
+		.len		= 0x08,
+		.bits_per_irq	= 0,
+		.handle_mmio	= handle_mmio_pendbaser_redist,
+	},
+	{
+		.base		= GICR_PROPBASER,
+		.len		= 0x08,
+		.bits_per_irq	= 0,
+		.handle_mmio	= handle_mmio_propbaser_redist,
+	},
+	{
 		.base           = GICR_IDREGS,
 		.len            = 0x30,
 		.bits_per_irq   = 0,
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 4219f22..11bf692 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -471,6 +471,37 @@ void vgic_reg_access(struct kvm_exit_mmio *mmio, u32 *reg,
 	}
 }
 
+/* handle a 64-bit register access */
+void vgic_reg64_access(struct kvm_exit_mmio *mmio, phys_addr_t offset,
+		       u64 *basereg, int mode)
+{
+	u32 reg;
+	u64 breg;
+
+	switch (offset & ~3) {
+	case 0x00:
+		breg = *basereg;
+		reg = lower_32_bits(breg);
+		vgic_reg_access(mmio, &reg, offset & 3, mode);
+		if (mmio->is_write && (mode & ACCESS_WRITE_VALUE)) {
+			breg &= GENMASK_ULL(63, 32);
+			breg |= reg;
+			*basereg = breg;
+		}
+		break;
+	case 0x04:
+		breg = *basereg;
+		reg = upper_32_bits(breg);
+		vgic_reg_access(mmio, &reg, offset & 3, mode);
+		if (mmio->is_write && (mode & ACCESS_WRITE_VALUE)) {
+			breg  = lower_32_bits(breg);
+			breg |= (u64)reg << 32;
+			*basereg = breg;
+		}
+		break;
+	}
+}
+
 bool handle_mmio_raz_wi(struct kvm_vcpu *vcpu, struct kvm_exit_mmio *mmio,
 			phys_addr_t offset)
 {
diff --git a/virt/kvm/arm/vgic.h b/virt/kvm/arm/vgic.h
index a093f5c..104f780 100644
--- a/virt/kvm/arm/vgic.h
+++ b/virt/kvm/arm/vgic.h
@@ -71,6 +71,8 @@ void vgic_reg_access(struct kvm_exit_mmio *mmio, u32 *reg,
 		     phys_addr_t offset, int mode);
 bool handle_mmio_raz_wi(struct kvm_vcpu *vcpu, struct kvm_exit_mmio *mmio,
 			phys_addr_t offset);
+void vgic_reg64_access(struct kvm_exit_mmio *mmio, phys_addr_t offset,
+		       u64 *basereg, int mode);
 
 static inline
 u32 mmio_data_read(struct kvm_exit_mmio *mmio, u32 mask)
-- 
2.5.1

^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH v3 09/16] KVM: arm64: introduce ITS emulation file with stub functions
  2015-10-07 14:55 ` Andre Przywara
@ 2015-10-07 14:55   ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-07 14:55 UTC (permalink / raw)
  To: marc.zyngier, christoffer.dall
  Cc: eric.auger, p.fedin, kvmarm, linux-arm-kernel, kvm

The ARM GICv3 ITS emulation code goes into a separate file, but
needs to be connected to the GICv3 emulation, of which it is an
option.
Introduce the skeleton with function stubs to be filled later.
Introduce the basic ITS data structure and initialize it, but don't
return any success yet, as we are not yet ready for the show.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
---
Changelog v2..v3:
- drop ITS check before doing GICR_CTLR access
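
For readers unfamiliar with the range-table approach used for the stubs, the
lookup can be modelled in a few lines. This is an illustrative userspace
sketch under stated assumptions, not the VGIC dispatch code: struct io_range
and find_range() are hypothetical, and the table only mirrors the bases and
lengths of the ITS ranges introduced here (the GITS_CTLR and GITS_CBASER
offsets of 0x0000 and 0x0080 are taken from the GICv3 register map rather
than from this patch).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced version of a register range descriptor. */
struct io_range {
	uint32_t base;	/* offset of the range within the ITS frame */
	uint32_t len;	/* length of the range in bytes */
};

static const struct io_range its_ranges[] = {
	{ 0x0000, 0x10 },	/* GITS_CTLR and neighbours */
	{ 0x0080, 0x08 },	/* GITS_CBASER */
	{ 0x0088, 0x08 },	/* GITS_CWRITER */
	{ 0x0090, 0x08 },	/* GITS_CREADR */
	{ 0x0100, 0x40 },	/* GITS_BASER<n> */
	{ 0xffd0, 0x30 },	/* ID registers */
};

#define NR_RANGES (sizeof(its_ranges) / sizeof(its_ranges[0]))

/* Sanity check: the table is sorted and the ranges do not overlap. */
static void check_ranges(void)
{
	size_t i;

	for (i = 1; i < NR_RANGES; i++)
		assert(its_ranges[i - 1].base + its_ranges[i - 1].len <=
		       its_ranges[i].base);
}

/* Return the range covering a frame offset, or NULL if none does. */
static const struct io_range *find_range(uint32_t offset)
{
	size_t i;

	for (i = 0; i < NR_RANGES; i++) {
		const struct io_range *r = &its_ranges[i];

		if (offset >= r->base && offset - r->base < r->len)
			return r;
	}

	return NULL;
}
```

An access that hits no range returns NULL here; how the real emulation
treats such accesses is outside the scope of this sketch.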

 arch/arm64/kvm/Makefile            |   1 +
 include/kvm/arm_vgic.h             |   6 ++
 include/linux/irqchip/arm-gic-v3.h |   1 +
 virt/kvm/arm/its-emul.c            | 125 +++++++++++++++++++++++++++++++++++++
 virt/kvm/arm/its-emul.h            |  35 +++++++++++
 virt/kvm/arm/vgic-v3-emul.c        |  20 +++++-
 6 files changed, 185 insertions(+), 3 deletions(-)
 create mode 100644 virt/kvm/arm/its-emul.c
 create mode 100644 virt/kvm/arm/its-emul.h

diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 1949fe5..75069a9 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -25,5 +25,6 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic-v2-emul.o
 kvm-$(CONFIG_KVM_ARM_HOST) += vgic-v2-switch.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic-v3.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic-v3-emul.o
+kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/its-emul.o
 kvm-$(CONFIG_KVM_ARM_HOST) += vgic-v3-switch.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arch_timer.o
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 06c33bc..c8c48e3 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -168,6 +168,11 @@ struct irq_phys_map_entry {
 	struct irq_phys_map	map;
 };
 
+struct vgic_its {
+	bool			enabled;
+	spinlock_t		lock;
+};
+
 struct vgic_dist {
 	spinlock_t		lock;
 	bool			in_kernel;
@@ -280,6 +285,7 @@ struct vgic_dist {
 	u64			*pendbaser;
 
 	bool			lpis_enabled;
+	struct vgic_its		its;
 };
 
 struct vgic_v2_cpu_if {
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index 9eeeb95..70e9539 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -179,6 +179,7 @@
 #define GITS_CWRITER			0x0088
 #define GITS_CREADR			0x0090
 #define GITS_BASER			0x0100
+#define GITS_IDREGS_BASE		0xffd0
 #define GITS_PIDR2			GICR_PIDR2
 
 #define GITS_TRANSLATER			0x10040
diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
new file mode 100644
index 0000000..659dd39
--- /dev/null
+++ b/virt/kvm/arm/its-emul.c
@@ -0,0 +1,125 @@
+/*
+ * GICv3 ITS emulation
+ *
+ * Copyright (C) 2015 ARM Ltd.
+ * Author: Andre Przywara <andre.przywara@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/cpu.h>
+#include <linux/kvm.h>
+#include <linux/kvm_host.h>
+#include <linux/interrupt.h>
+
+#include <linux/irqchip/arm-gic-v3.h>
+#include <kvm/arm_vgic.h>
+
+#include <asm/kvm_emulate.h>
+#include <asm/kvm_arm.h>
+#include <asm/kvm_mmu.h>
+
+#include "vgic.h"
+#include "its-emul.h"
+
+static bool handle_mmio_misc_gits(struct kvm_vcpu *vcpu,
+				  struct kvm_exit_mmio *mmio,
+				  phys_addr_t offset)
+{
+	return false;
+}
+
+static bool handle_mmio_gits_idregs(struct kvm_vcpu *vcpu,
+				    struct kvm_exit_mmio *mmio,
+				    phys_addr_t offset)
+{
+	return false;
+}
+
+static bool handle_mmio_gits_cbaser(struct kvm_vcpu *vcpu,
+				    struct kvm_exit_mmio *mmio,
+				    phys_addr_t offset)
+{
+	return false;
+}
+
+static bool handle_mmio_gits_cwriter(struct kvm_vcpu *vcpu,
+				     struct kvm_exit_mmio *mmio,
+				     phys_addr_t offset)
+{
+	return false;
+}
+
+static bool handle_mmio_gits_creadr(struct kvm_vcpu *vcpu,
+				    struct kvm_exit_mmio *mmio,
+				    phys_addr_t offset)
+{
+	return false;
+}
+
+static const struct vgic_io_range vgicv3_its_ranges[] = {
+	{
+		.base		= GITS_CTLR,
+		.len		= 0x10,
+		.bits_per_irq	= 0,
+		.handle_mmio	= handle_mmio_misc_gits,
+	},
+	{
+		.base		= GITS_CBASER,
+		.len		= 0x08,
+		.bits_per_irq	= 0,
+		.handle_mmio	= handle_mmio_gits_cbaser,
+	},
+	{
+		.base		= GITS_CWRITER,
+		.len		= 0x08,
+		.bits_per_irq	= 0,
+		.handle_mmio	= handle_mmio_gits_cwriter,
+	},
+	{
+		.base		= GITS_CREADR,
+		.len		= 0x08,
+		.bits_per_irq	= 0,
+		.handle_mmio	= handle_mmio_gits_creadr,
+	},
+	{
+		/* We don't need any memory from the guest. */
+		.base		= GITS_BASER,
+		.len		= 0x40,
+		.bits_per_irq	= 0,
+		.handle_mmio	= handle_mmio_raz_wi,
+	},
+	{
+		.base		= GITS_IDREGS_BASE,
+		.len		= 0x30,
+		.bits_per_irq	= 0,
+		.handle_mmio	= handle_mmio_gits_idregs,
+	},
+};
+
+/* This is called on setting the LPI enable bit in the redistributor. */
+void vgic_enable_lpis(struct kvm_vcpu *vcpu)
+{
+}
+
+int vits_init(struct kvm *kvm)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	struct vgic_its *its = &dist->its;
+
+	spin_lock_init(&its->lock);
+
+	its->enabled = false;
+
+	return -ENXIO;
+}
diff --git a/virt/kvm/arm/its-emul.h b/virt/kvm/arm/its-emul.h
new file mode 100644
index 0000000..5dc8e2f
--- /dev/null
+++ b/virt/kvm/arm/its-emul.h
@@ -0,0 +1,35 @@
+/*
+ * GICv3 ITS emulation definitions
+ *
+ * Copyright (C) 2015 ARM Ltd.
+ * Author: Andre Przywara <andre.przywara@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __KVM_ITS_EMUL_H__
+#define __KVM_ITS_EMUL_H__
+
+#include <linux/kvm.h>
+#include <linux/kvm_host.h>
+
+#include <asm/kvm_emulate.h>
+#include <asm/kvm_arm.h>
+#include <asm/kvm_mmu.h>
+
+#include "vgic.h"
+
+void vgic_enable_lpis(struct kvm_vcpu *vcpu);
+int vits_init(struct kvm *kvm);
+
+#endif
diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
index 6939f7c..a7b60bdb 100644
--- a/virt/kvm/arm/vgic-v3-emul.c
+++ b/virt/kvm/arm/vgic-v3-emul.c
@@ -48,6 +48,7 @@
 #include <asm/kvm_mmu.h>
 
 #include "vgic.h"
+#include "its-emul.h"
 
 static bool handle_mmio_rao_wi(struct kvm_vcpu *vcpu,
 			       struct kvm_exit_mmio *mmio, phys_addr_t offset)
@@ -530,9 +531,16 @@ static bool handle_mmio_ctlr_redist(struct kvm_vcpu *vcpu,
 				    struct kvm_exit_mmio *mmio,
 				    phys_addr_t offset)
 {
-	/* since we don't support LPIs, this register is zero for now */
-	vgic_reg_access(mmio, NULL, offset,
-			ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	u32 reg;
+
+	reg = dist->lpis_enabled ? GICR_CTLR_ENABLE_LPIS : 0;
+	vgic_reg_access(mmio, &reg, offset,
+			ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
+	if (vgic_has_its(vcpu->kvm) && !dist->lpis_enabled &&
+	    (reg & GICR_CTLR_ENABLE_LPIS)) {
+		/* Eventually do something */
+	}
 	return false;
 }
 
@@ -861,6 +869,12 @@ static int vgic_v3_map_resources(struct kvm *kvm,
 		rdbase += GIC_V3_REDIST_SIZE;
 	}
 
+	if (vgic_has_its(kvm)) {
+		ret = vits_init(kvm);
+		if (ret)
+			goto out_unregister;
+	}
+
 	dist->redist_iodevs = iodevs;
 	dist->ready = true;
 	goto out;
-- 
2.5.1



* [PATCH v3 10/16] KVM: arm64: implement basic ITS register handlers
  2015-10-07 14:55 ` Andre Przywara
@ 2015-10-07 14:55   ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-07 14:55 UTC (permalink / raw)
  To: marc.zyngier, christoffer.dall
  Cc: eric.auger, p.fedin, kvmarm, linux-arm-kernel, kvm

Add emulation for some basic MMIO registers used in the ITS emulation.
This includes:
- GITS_{CTLR,TYPER,IIDR}
- ID registers
- GITS_{CBASER,CREADR,CWRITER}
  these implement the ITS command buffer handling
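
As a rough illustration of the command buffer registers, the user-space
sketch below decodes a CBASER value into ring size and base address and
advances a read offset with wrap-around. The masks and the 32-byte command
size follow the handlers in this patch; the helper names (cmd_ring_bytes,
cmd_ring_base, next_creadr) are made up for this example and are not part
of the kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define CBASER_SIZE_MASK	0xffULL			/* number of 4K pages minus one */
#define CBASER_ADDR_MASK	0xfffffffff000ULL	/* bits [47:12]: base address */
#define ITS_CMD_SIZE		32u			/* one ITS command is 32 bytes */

/* Size of the command ring in bytes: (Size field + 1) pages of 4K. */
static uint32_t cmd_ring_bytes(uint64_t cbaser)
{
	return (uint32_t)((cbaser & CBASER_SIZE_MASK) + 1) << 12;
}

/* Guest physical base address of the command ring. */
static uint64_t cmd_ring_base(uint64_t cbaser)
{
	return cbaser & CBASER_ADDR_MASK;
}

/* Advance the read offset by one command, wrapping at the ring end. */
static uint32_t next_creadr(uint32_t creadr, uint64_t cbaser)
{
	uint32_t size = cmd_ring_bytes(cbaser);

	assert(creadr < size && creadr % ITS_CMD_SIZE == 0);

	creadr += ITS_CMD_SIZE;
	return creadr == size ? 0 : creadr;
}
```

With a CBASER value that has the Valid bit set, a base of 0x80000000 and a
Size field of 1, this gives an 8K ring; a read offset of 8160 then wraps
back to 0 after one more command.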

Most of the handlers are pretty straightforward, but CWRITER goes
the extra mile to allow fine-grained locking. The idea here
is to let only the first instance iterate through the command ring
buffer; CWRITER accesses on other VCPUs in the meantime will be picked up
by that first instance and handled as well. The ITS lock is thus only
held for very short periods of time and is dropped before the actual
command handler is called.
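
The hand-off described above can be modelled in user space roughly as
follows. This is only a sketch under assumptions: struct its_model,
model_cwriter_write(), second_vcpu() and demo_handoff() are hypothetical
names, a C11 atomic_flag stands in for the ITS spinlock, and the guest
memory read is left out. Unlike the handler in the patch, the sketch
re-checks the offsets after storing the new write offset, so a write that
does not move the pointer processes nothing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define CMD_SIZE 32u	/* each ITS command is 32 bytes */

struct its_model {
	atomic_flag lock;	/* stands in for its->lock */
	uint32_t ring_size;	/* command ring size in bytes */
	uint32_t creadr;	/* read offset, only advanced by the drainer */
	uint32_t cwriter;	/* write offset, set by any writer */
	unsigned int handled;	/* number of commands processed so far */
	/* hypothetical hook standing in for the real command handler */
	void (*handle)(struct its_model *its, uint32_t offset);
};

static void its_lock(struct its_model *its)
{
	while (atomic_flag_test_and_set_explicit(&its->lock,
						 memory_order_acquire))
		;
}

static void its_unlock(struct its_model *its)
{
	atomic_flag_clear_explicit(&its->lock, memory_order_release);
}

/* Returns true if this caller drained the ring itself. */
static bool model_cwriter_write(struct its_model *its, uint32_t new_cwriter)
{
	bool busy, more;

	assert(new_cwriter < its->ring_size && new_cwriter % CMD_SIZE == 0);

	its_lock(its);
	/* Differing offsets mean an earlier writer is still draining. */
	busy = (its->creadr != its->cwriter);
	its->cwriter = new_cwriter;
	more = (its->creadr != its->cwriter);
	its_unlock(its);

	if (busy)
		return false;	/* the active drainer picks up our commands */

	while (more) {
		/* The lock is not held while the command is handled. */
		if (its->handle)
			its->handle(its, its->creadr);

		its_lock(its);
		its->handled++;
		its->creadr += CMD_SIZE;
		if (its->creadr == its->ring_size)
			its->creadr = 0;
		more = (its->creadr != its->cwriter);
		its_unlock(its);
	}
	return true;
}

static bool nested_result = true;

/* Simulates a second VCPU writing CWRITER while command 0 is handled. */
static void second_vcpu(struct its_model *its, uint32_t offset)
{
	if (offset == 0)
		nested_result = model_cwriter_write(its, 96);
}

/* Queue two commands; a third one arrives while the first is handled. */
static unsigned int demo_handoff(void)
{
	struct its_model its = {
		.lock = ATOMIC_FLAG_INIT,
		.ring_size = 4096,
		.handle = second_vcpu,
	};

	model_cwriter_write(&its, 64);
	return its.handled;
}
```

In demo_handoff() the nested write returns without draining anything, and
the first caller ends up processing all three commands, which is the
behaviour this patch aims for with real VCPUs.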

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
Changelog v2..v3:
- use new renamed vgic_reg64_access() function
- rework locking in CWRITER handling
- use kcalloc instead of kmalloc

 include/kvm/arm_vgic.h             |   3 +
 include/linux/irqchip/arm-gic-v3.h |   8 ++
 virt/kvm/arm/its-emul.c            | 215 +++++++++++++++++++++++++++++++++++++
 virt/kvm/arm/its-emul.h            |   1 +
 virt/kvm/arm/vgic-v3-emul.c        |   2 +
 5 files changed, 229 insertions(+)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index c8c48e3..9ac850d 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -171,6 +171,9 @@ struct irq_phys_map_entry {
 struct vgic_its {
 	bool			enabled;
 	spinlock_t		lock;
+	u64			cbaser;
+	int			creadr;
+	int			cwriter;
 };
 
 struct vgic_dist {
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index 70e9539..ef274a9 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -181,15 +181,23 @@
 #define GITS_BASER			0x0100
 #define GITS_IDREGS_BASE		0xffd0
 #define GITS_PIDR2			GICR_PIDR2
+#define GITS_PIDR4			0xffd0
+#define GITS_CIDR0			0xfff0
+#define GITS_CIDR1			0xfff4
+#define GITS_CIDR2			0xfff8
+#define GITS_CIDR3			0xfffc
 
 #define GITS_TRANSLATER			0x10040
 
 #define GITS_CTLR_ENABLE		(1U << 0)
 #define GITS_CTLR_QUIESCENT		(1U << 31)
 
+#define GITS_TYPER_PLPIS		(1UL << 0)
+#define GITS_TYPER_IDBITS_SHIFT		8
 #define GITS_TYPER_DEVBITS_SHIFT	13
 #define GITS_TYPER_DEVBITS(r)		((((r) >> GITS_TYPER_DEVBITS_SHIFT) & 0x1f) + 1)
 #define GITS_TYPER_PTA			(1UL << 19)
+#define GITS_TYPER_HWCOLLCNT_SHIFT	24
 
 #define GITS_CBASER_VALID		(1UL << 63)
 #define GITS_CBASER_nCnB		(0UL << 59)
diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
index 659dd39..9bbed86 100644
--- a/virt/kvm/arm/its-emul.c
+++ b/virt/kvm/arm/its-emul.c
@@ -32,10 +32,62 @@
 #include "vgic.h"
 #include "its-emul.h"
 
+#define BASER_BASE_ADDRESS(x) ((x) & 0xfffffffff000ULL)
+
+/* The distributor lock is held by the VGIC MMIO handler. */
 static bool handle_mmio_misc_gits(struct kvm_vcpu *vcpu,
 				  struct kvm_exit_mmio *mmio,
 				  phys_addr_t offset)
 {
+	struct vgic_its *its = &vcpu->kvm->arch.vgic.its;
+	u32 reg;
+	bool was_enabled;
+
+	switch (offset & ~3) {
+	case 0x00:		/* GITS_CTLR */
+		/* We never defer any command execution. */
+		reg = GITS_CTLR_QUIESCENT;
+		if (its->enabled)
+			reg |= GITS_CTLR_ENABLE;
+		was_enabled = its->enabled;
+		vgic_reg_access(mmio, &reg, offset & 3,
+				ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
+		its->enabled = !!(reg & GITS_CTLR_ENABLE);
+		return !was_enabled && its->enabled;
+	case 0x04:		/* GITS_IIDR */
+		reg = (PRODUCT_ID_KVM << 24) | (IMPLEMENTER_ARM << 0);
+		vgic_reg_access(mmio, &reg, offset & 3,
+				ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
+		break;
+	case 0x08:		/* GITS_TYPER */
+		/*
+		 * We use linear CPU numbers for redistributor addressing,
+		 * so GITS_TYPER.PTA is 0.
+		 * To avoid memory waste on the guest side, we keep the
+		 * number of IDBits and DevBits low for the time being.
+		 * This could later be made configurable by userland.
+		 * Since we have all collections in a linked list, we claim
+		 * that we can hold all of the collection tables in our
+		 * own memory and that the ITT entry size is 1 byte (the
+		 * smallest possible one).
+		 */
+		reg = GITS_TYPER_PLPIS;
+		reg |= 0xff << GITS_TYPER_HWCOLLCNT_SHIFT;
+		reg |= 0x0f << GITS_TYPER_DEVBITS_SHIFT;
+		reg |= 0x0f << GITS_TYPER_IDBITS_SHIFT;
+		vgic_reg_access(mmio, &reg, offset & 3,
+				ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
+		break;
+	case 0x0c:
+		/* The upper 32bits of TYPER are all 0 for the time being.
+		 * Should we need more than 256 collections, we can enable
+		 * some bits in here.
+		 */
+		vgic_reg_access(mmio, NULL, offset & 3,
+				ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
+		break;
+	}
+
 	return false;
 }
 
@@ -43,20 +95,152 @@ static bool handle_mmio_gits_idregs(struct kvm_vcpu *vcpu,
 				    struct kvm_exit_mmio *mmio,
 				    phys_addr_t offset)
 {
+	u32 reg = 0;
+	int idreg = (offset & ~3) + GITS_IDREGS_BASE;
+
+	switch (idreg) {
+	case GITS_PIDR2:
+		reg = GIC_PIDR2_ARCH_GICv3;
+		break;
+	case GITS_PIDR4:
+		/* This is a 64K software visible page */
+		reg = 0x40;
+		break;
+	/* These are the ID registers for (any) GIC. */
+	case GITS_CIDR0:
+		reg = 0x0d;
+		break;
+	case GITS_CIDR1:
+		reg = 0xf0;
+		break;
+	case GITS_CIDR2:
+		reg = 0x05;
+		break;
+	case GITS_CIDR3:
+		reg = 0xb1;
+		break;
+	}
+	vgic_reg_access(mmio, &reg, offset & 3,
+			ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
 	return false;
 }
 
+/*
+ * This function is called with both the ITS and the distributor lock dropped,
+ * so the actual command handlers must take the respective locks when needed.
+ */
+static int vits_handle_command(struct kvm_vcpu *vcpu, u64 *its_cmd)
+{
+	return -ENODEV;
+}
+
 static bool handle_mmio_gits_cbaser(struct kvm_vcpu *vcpu,
 				    struct kvm_exit_mmio *mmio,
 				    phys_addr_t offset)
 {
+	struct vgic_its *its = &vcpu->kvm->arch.vgic.its;
+	int mode = ACCESS_READ_VALUE;
+
+	mode |= its->enabled ? ACCESS_WRITE_IGNORED : ACCESS_WRITE_VALUE;
+
+	vgic_reg64_access(mmio, offset, &its->cbaser, mode);
+
+	/* Writing CBASER resets the read pointer. */
+	if (mmio->is_write)
+		its->creadr = 0;
+
 	return false;
 }
 
+static int its_cmd_buffer_size(struct kvm *kvm)
+{
+	struct vgic_its *its = &kvm->arch.vgic.its;
+
+	return ((its->cbaser & 0xff) + 1) << 12;
+}
+
+static gpa_t its_cmd_buffer_base(struct kvm *kvm)
+{
+	struct vgic_its *its = &kvm->arch.vgic.its;
+
+	return BASER_BASE_ADDRESS(its->cbaser);
+}
+
+/*
+ * By writing to CWRITER the guest announces new commands to be processed.
+ * Since we cannot read from guest memory inside the ITS spinlock, we
+ * iterate over the command buffer (with the lock dropped) until the read
+ * pointer matches the write pointer. Other VCPUs writing this register in the
+ * meantime will just update the write pointer, leaving the command
+ * processing to the first instance of the function.
+ */
 static bool handle_mmio_gits_cwriter(struct kvm_vcpu *vcpu,
 				     struct kvm_exit_mmio *mmio,
 				     phys_addr_t offset)
 {
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	struct vgic_its *its = &dist->its;
+	gpa_t cbaser = its_cmd_buffer_base(vcpu->kvm);
+	u64 cmd_buf[4];
+	u32 reg;
+	bool finished;
+
+	/* The upper 32 bits are RES0 */
+	if ((offset & ~3) == 0x04) {
+		vgic_reg_access(mmio, NULL, offset & 3,
+				ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
+		return false;
+	}
+
+	reg = its->cwriter & 0xfffe0;
+	vgic_reg_access(mmio, &reg, offset & 3,
+			ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
+	if (!mmio->is_write)
+		return false;
+
+	reg &= 0xfffe0;
+	if (reg >= its_cmd_buffer_size(vcpu->kvm))
+		return false;
+
+	spin_unlock(&dist->lock);
+
+	spin_lock(&its->lock);
+
+	/*
+	 * If there is still another VCPU handling commands, let this
+	 * one pick up the new CWRITER and process "our" new commands as well.
+	 */
+	finished = (its->cwriter != its->creadr);
+	its->cwriter = reg;
+
+	spin_unlock(&its->lock);
+
+	while (!finished) {
+		int ret = kvm_read_guest(vcpu->kvm, cbaser + its->creadr,
+					 cmd_buf, 32);
+		if (ret) {
+			/*
+			 * Gah, we are screwed. Reset CWRITER to that command
+			 * that we have finished processing and return.
+			 */
+			spin_lock(&its->lock);
+			its->cwriter = its->creadr;
+			spin_unlock(&its->lock);
+			break;
+		}
+		vits_handle_command(vcpu, cmd_buf);
+
+		spin_lock(&its->lock);
+		its->creadr += 32;
+		if (its->creadr == its_cmd_buffer_size(vcpu->kvm))
+			its->creadr = 0;
+		finished = (its->creadr == its->cwriter);
+		spin_unlock(&its->lock);
+	}
+
+	/* The caller expects the lock to be still held. */
+	spin_lock(&dist->lock);
+
 	return false;
 }
 
@@ -64,6 +248,20 @@ static bool handle_mmio_gits_creadr(struct kvm_vcpu *vcpu,
 				    struct kvm_exit_mmio *mmio,
 				    phys_addr_t offset)
 {
+	struct vgic_its *its = &vcpu->kvm->arch.vgic.its;
+	u32 reg;
+
+	switch (offset & ~3) {
+	case 0x00:
+		reg = its->creadr & 0xfffe0;
+		vgic_reg_access(mmio, &reg, offset & 3,
+				ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
+		break;
+	case 0x04:
+		vgic_reg_access(mmio, NULL, offset & 3,
+				ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
+		break;
+	}
 	return false;
 }
 
@@ -117,9 +315,26 @@ int vits_init(struct kvm *kvm)
 	struct vgic_dist *dist = &kvm->arch.vgic;
 	struct vgic_its *its = &dist->its;
 
+	dist->pendbaser = kcalloc(dist->nr_cpus, sizeof(u64), GFP_KERNEL);
+	if (!dist->pendbaser)
+		return -ENOMEM;
+
 	spin_lock_init(&its->lock);
 
 	its->enabled = false;
 
 	return -ENXIO;
 }
+
+void vits_destroy(struct kvm *kvm)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	struct vgic_its *its = &dist->its;
+
+	if (!vgic_has_its(kvm))
+		return;
+
+	kfree(dist->pendbaser);
+
+	its->enabled = false;
+}
diff --git a/virt/kvm/arm/its-emul.h b/virt/kvm/arm/its-emul.h
index 5dc8e2f..472a6d0 100644
--- a/virt/kvm/arm/its-emul.h
+++ b/virt/kvm/arm/its-emul.h
@@ -31,5 +31,6 @@
 
 void vgic_enable_lpis(struct kvm_vcpu *vcpu);
 int vits_init(struct kvm *kvm);
+void vits_destroy(struct kvm *kvm);
 
 #endif
diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
index a7b60bdb..e9aa29e 100644
--- a/virt/kvm/arm/vgic-v3-emul.c
+++ b/virt/kvm/arm/vgic-v3-emul.c
@@ -924,6 +924,8 @@ static void vgic_v3_destroy_model(struct kvm *kvm)
 {
 	struct vgic_dist *dist = &kvm->arch.vgic;
 
+	vits_destroy(kvm);
+
 	kfree(dist->irq_spi_mpidr);
 	dist->irq_spi_mpidr = NULL;
 }
-- 
2.5.1



* [PATCH v3 11/16] KVM: arm64: add data structures to model ITS interrupt translation
  2015-10-07 14:55 ` Andre Przywara
@ 2015-10-07 14:55   ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-07 14:55 UTC (permalink / raw)
  To: marc.zyngier, christoffer.dall
  Cc: eric.auger, p.fedin, kvmarm, linux-arm-kernel, kvm

The GICv3 Interrupt Translation Service (ITS) uses tables in memory
to allow sophisticated interrupt routing. It features device tables,
an interrupt table per device and a table connecting "collections" to
actual CPUs (a.k.a. redistributors in GICv3 lingo).
Since the interrupt numbers for LPIs are allocated quite sparsely
and the range can be huge (8192 LPIs being the minimum), using
bitmaps or arrays for storing information is a waste of memory.
We use linked lists instead, which we iterate linearly. This works
well as long as the actual number of LPIs/MSIs in the guest is
quite low. Should the number of LPIs exceed the point where iterating
through lists is acceptable, we can later revisit this and use more
efficient data structures.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
Changelog v2..v3:
- add a comment
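
For illustration only (not part of this patch): a minimal user-space
sketch of the list-based lookup described above. It assumes plain
`next` pointers in place of the kernel's struct list_head, and every
`sketch_*` name is hypothetical. A device is found by walking the device
list, then its events are walked to find the matching entry.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct its_itte: one mapped event. */
struct sketch_itte {
	struct sketch_itte *next;
	uint32_t event_id;
	uint32_t lpi;
};

/* Hypothetical stand-in for struct its_device. */
struct sketch_device {
	struct sketch_device *next;
	struct sketch_itte *itt;	/* head of this device's ITTE list */
	uint32_t device_id;
};

/* Prepend a device to the list; returns it, or NULL if out of memory. */
static struct sketch_device *sketch_add_device(struct sketch_device **head,
					       uint32_t device_id)
{
	struct sketch_device *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->device_id = device_id;
	dev->next = *head;
	*head = dev;
	return dev;
}

/* Map one event of a device to an LPI number; returns 0 on success. */
static int sketch_map_event(struct sketch_device *dev, uint32_t event_id,
			    uint32_t lpi)
{
	struct sketch_itte *itte;

	assert(dev);
	itte = calloc(1, sizeof(*itte));
	if (!itte)
		return -1;
	itte->event_id = event_id;
	itte->lpi = lpi;
	itte->next = dev->itt;
	dev->itt = itte;
	return 0;
}

/* Linear walk over the devices, then over the matching device's events. */
static struct sketch_itte *sketch_find_itte(struct sketch_device *head,
					    uint32_t device_id,
					    uint32_t event_id)
{
	struct sketch_device *dev;
	struct sketch_itte *itte;

	for (dev = head; dev; dev = dev->next) {
		if (dev->device_id != device_id)
			continue;
		for (itte = dev->itt; itte; itte = itte->next)
			if (itte->event_id == event_id)
				return itte;
		return NULL;	/* device found, event not mapped */
	}
	return NULL;
}

/* Tear down every device together with its ITTEs. */
static void sketch_free_all(struct sketch_device **head)
{
	while (*head) {
		struct sketch_device *dev = *head;

		while (dev->itt) {
			struct sketch_itte *itte = dev->itt;

			dev->itt = itte->next;
			free(itte);
		}
		*head = dev->next;
		free(dev);
	}
}
```

Memory use grows with the number of mapped events rather than with the
size of the LPI number space, at the price of a linear search per lookup.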

 include/kvm/arm_vgic.h  |  3 +++
 virt/kvm/arm/its-emul.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 69 insertions(+)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 9ac850d..c3eb414 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -25,6 +25,7 @@
 #include <linux/spinlock.h>
 #include <linux/types.h>
 #include <kvm/iodev.h>
+#include <linux/list.h>
 
 #define VGIC_NR_IRQS_LEGACY	256
 #define VGIC_NR_SGIS		16
@@ -174,6 +175,8 @@ struct vgic_its {
 	u64			cbaser;
 	int			creadr;
 	int			cwriter;
+	struct list_head	device_list;
+	struct list_head	collection_list;
 };
 
 struct vgic_dist {
diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
index 9bbed86..bab8033 100644
--- a/virt/kvm/arm/its-emul.c
+++ b/virt/kvm/arm/its-emul.c
@@ -21,6 +21,7 @@
 #include <linux/kvm.h>
 #include <linux/kvm_host.h>
 #include <linux/interrupt.h>
+#include <linux/list.h>
 
 #include <linux/irqchip/arm-gic-v3.h>
 #include <kvm/arm_vgic.h>
@@ -32,6 +33,34 @@
 #include "vgic.h"
 #include "its-emul.h"
 
+struct its_device {
+	struct list_head dev_list;
+
+	/* the head for the list of ITTEs */
+	struct list_head itt;
+	u32 device_id;
+};
+
+#define COLLECTION_NOT_MAPPED ((u32)-1)
+
+struct its_collection {
+	struct list_head coll_list;
+
+	u32 collection_id;
+	u32 target_addr;
+};
+
+#define its_is_collection_mapped(coll) ((coll) && \
+				((coll)->target_addr != COLLECTION_NOT_MAPPED))
+
+struct its_itte {
+	struct list_head itte_list;
+
+	struct its_collection *collection;
+	u32 lpi;
+	u32 event_id;
+};
+
 #define BASER_BASE_ADDRESS(x) ((x) & 0xfffffffff000ULL)
 
 /* The distributor lock is held by the VGIC MMIO handler. */
@@ -125,6 +154,12 @@ static bool handle_mmio_gits_idregs(struct kvm_vcpu *vcpu,
 	return false;
 }
 
+static void its_free_itte(struct its_itte *itte)
+{
+	list_del(&itte->itte_list);
+	kfree(itte);
+}
+
 /*
  * This function is called with both the ITS and the distributor lock dropped,
  * so the actual command handlers must take the respective locks when needed.
@@ -321,6 +356,9 @@ int vits_init(struct kvm *kvm)
 
 	spin_lock_init(&its->lock);
 
+	INIT_LIST_HEAD(&its->device_list);
+	INIT_LIST_HEAD(&its->collection_list);
+
 	its->enabled = false;
 
 	return -ENXIO;
@@ -330,11 +368,39 @@ void vits_destroy(struct kvm *kvm)
 {
 	struct vgic_dist *dist = &kvm->arch.vgic;
 	struct vgic_its *its = &dist->its;
+	struct its_device *dev;
+	struct its_itte *itte;
+	struct list_head *dev_cur, *dev_temp;
+	struct list_head *cur, *temp;
 
 	if (!vgic_has_its(kvm))
 		return;
 
+	/*
+	 * We may end up here without the lists ever having been initialized.
+	 * Check this and bail out early to avoid dereferencing a NULL pointer.
+	 */
+	if (!its->device_list.next)
+		return;
+
+	spin_lock(&its->lock);
+	list_for_each_safe(dev_cur, dev_temp, &its->device_list) {
+		dev = container_of(dev_cur, struct its_device, dev_list);
+		list_for_each_safe(cur, temp, &dev->itt) {
+			itte = (container_of(cur, struct its_itte, itte_list));
+			its_free_itte(itte);
+		}
+		list_del(dev_cur);
+		kfree(dev);
+	}
+
+	list_for_each_safe(cur, temp, &its->collection_list) {
+		list_del(cur);
+		kfree(container_of(cur, struct its_collection, coll_list));
+	}
+
 	kfree(dist->pendbaser);
 
 	its->enabled = false;
+	spin_unlock(&its->lock);
 }
-- 
2.5.1


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH v3 12/16] KVM: arm64: handle pending bit for LPIs in ITS emulation
  2015-10-07 14:55 ` Andre Przywara
@ 2015-10-07 14:55   ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-07 14:55 UTC (permalink / raw)
  To: marc.zyngier, christoffer.dall; +Cc: kvm, kvmarm, linux-arm-kernel

As the actual LPI number in a guest can be quite high, but is mostly
assigned using a very sparse allocation scheme, bitmaps and arrays
for storing the virtual interrupt status are a waste of memory.
We use our equivalent of the "Interrupt Translation Table Entry"
(ITTE) to hold this extra status information for a virtual LPI.
As the normal VGIC code cannot use its fancy bitmaps to manage
pending interrupts, we provide a hook in the VGIC code to let the
ITS emulation handle the list register queueing itself.
LPIs are located in a separate number range (>=8192), so
distinguishing them is easy. With LPIs being only edge-triggered, we
get away with less complex IRQ handling.
We extend the number of bits for storing the IRQ number in our
LR struct to 16 to cover the LPI numbers we support as well.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
Changelog v2..v3:
- extend LR data structure to hold 16-bit wide IRQ IDs
- only clear pending bit if IRQ could be queued
- adapt __kvm_vgic_sync_hwstate() to upstream changes
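
For illustration only (not part of this patch): a minimal user-space
sketch of the pending-bit handling described above. It assumes a fixed
array of LPIs instead of the ITTE lists, a 64-bit word as the per-VCPU
pending bitmap, and only two list registers where 0 marks an empty one.
All `sketch_*` names are hypothetical. The pending bit is cleared only
when the LPI actually got a list register, and it is set again when a
still-pending LPI is taken back out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_LRS 2		/* assumed, deliberately small */

struct sketch_lpi {
	uint32_t lpi;
	bool enabled;
	uint64_t pending;	/* one bit per VCPU (up to 64 VCPUs) */
};

struct sketch_vcpu {
	int id;
	uint32_t lr[SKETCH_NR_LRS];	/* 0 means "this LR is empty" */
};

/* Put an IRQ into the first free list register, if there is one. */
static bool sketch_queue_irq(struct sketch_vcpu *vcpu, uint32_t irq)
{
	size_t i;

	for (i = 0; i < SKETCH_NR_LRS; i++) {
		if (vcpu->lr[i] == 0) {
			vcpu->lr[i] = irq;
			return true;
		}
	}
	return false;
}

/* Queue all enabled and pending LPIs; false means some did not fit. */
static bool sketch_queue_lpis(struct sketch_vcpu *vcpu,
			      struct sketch_lpi *lpis, size_t nr)
{
	uint64_t bit = 1ULL << vcpu->id;
	bool all_queued = true;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (!lpis[i].enabled || !(lpis[i].pending & bit))
			continue;
		if (sketch_queue_irq(vcpu, lpis[i].lpi))
			lpis[i].pending &= ~bit;	/* now lives in the LR */
		else
			all_queued = false;		/* stays pending */
	}
	return all_queued;
}

/* Put the pending state of a still-pending LPI back into its entry. */
static void sketch_unqueue_lpi(struct sketch_vcpu *vcpu,
			       struct sketch_lpi *lpis, size_t nr,
			       uint32_t lpi)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (lpis[i].lpi == lpi) {
			lpis[i].pending |= 1ULL << vcpu->id;
			return;
		}
	}
}
```

With three pending LPIs and two list registers, the first two lose their
pending bit and the third keeps it, so the caller can flag an overflow.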

 include/kvm/arm_vgic.h      |  4 +-
 virt/kvm/arm/its-emul.c     | 75 ++++++++++++++++++++++++++++++++++++
 virt/kvm/arm/its-emul.h     |  3 ++
 virt/kvm/arm/vgic-v3-emul.c |  2 +
 virt/kvm/arm/vgic.c         | 93 +++++++++++++++++++++++++++++++--------------
 5 files changed, 148 insertions(+), 29 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index c3eb414..035911f 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -95,7 +95,7 @@ enum vgic_type {
 #define LR_HW			(1 << 3)
 
 struct vgic_lr {
-	unsigned irq:10;
+	unsigned irq:16;
 	union {
 		unsigned hwirq:10;
 		unsigned source:3;
@@ -147,6 +147,8 @@ struct vgic_vm_ops {
 	int	(*init_model)(struct kvm *);
 	void	(*destroy_model)(struct kvm *);
 	int	(*map_resources)(struct kvm *, const struct vgic_params *);
+	bool	(*queue_lpis)(struct kvm_vcpu *);
+	void	(*unqueue_lpi)(struct kvm_vcpu *, int irq);
 };
 
 struct vgic_io_device {
diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
index bab8033..8349970 100644
--- a/virt/kvm/arm/its-emul.c
+++ b/virt/kvm/arm/its-emul.c
@@ -59,8 +59,27 @@ struct its_itte {
 	struct its_collection *collection;
 	u32 lpi;
 	u32 event_id;
+	bool enabled;
+	unsigned long *pending;
 };
 
+/* To be used as an iterator this macro misses the enclosing parentheses */
+#define for_each_lpi(dev, itte, kvm) \
+	list_for_each_entry(dev, &(kvm)->arch.vgic.its.device_list, dev_list) \
+		list_for_each_entry(itte, &(dev)->itt, itte_list)
+
+static struct its_itte *find_itte_by_lpi(struct kvm *kvm, int lpi)
+{
+	struct its_device *device;
+	struct its_itte *itte;
+
+	for_each_lpi(device, itte, kvm) {
+		if (itte->lpi == lpi)
+			return itte;
+	}
+	return NULL;
+}
+
 #define BASER_BASE_ADDRESS(x) ((x) & 0xfffffffff000ULL)
 
 /* The distributor lock is held by the VGIC MMIO handler. */
@@ -154,9 +173,65 @@ static bool handle_mmio_gits_idregs(struct kvm_vcpu *vcpu,
 	return false;
 }
 
+/*
+ * Find all enabled and pending LPIs and queue them into the list
+ * registers.
+ * The dist lock is held by the caller.
+ */
+bool vits_queue_lpis(struct kvm_vcpu *vcpu)
+{
+	struct vgic_its *its = &vcpu->kvm->arch.vgic.its;
+	struct its_device *device;
+	struct its_itte *itte;
+	bool ret = true;
+
+	if (!vgic_has_its(vcpu->kvm))
+		return true;
+	if (!its->enabled || !vcpu->kvm->arch.vgic.lpis_enabled)
+		return true;
+
+	spin_lock(&its->lock);
+	for_each_lpi(device, itte, vcpu->kvm) {
+		if (!itte->enabled || !test_bit(vcpu->vcpu_id, itte->pending))
+			continue;
+
+		if (!itte->collection)
+			continue;
+
+		if (itte->collection->target_addr != vcpu->vcpu_id)
+			continue;
+
+
+		if (vgic_queue_irq(vcpu, 0, itte->lpi))
+			__clear_bit(vcpu->vcpu_id, itte->pending);
+		else
+			ret = false;
+	}
+
+	spin_unlock(&its->lock);
+	return ret;
+}
+
+/* Called with the distributor lock held by the caller. */
+void vits_unqueue_lpi(struct kvm_vcpu *vcpu, int lpi)
+{
+	struct vgic_its *its = &vcpu->kvm->arch.vgic.its;
+	struct its_itte *itte;
+
+	spin_lock(&its->lock);
+
+	/* Find the right ITTE and put the pending state back in there */
+	itte = find_itte_by_lpi(vcpu->kvm, lpi);
+	if (itte)
+		__set_bit(vcpu->vcpu_id, itte->pending);
+
+	spin_unlock(&its->lock);
+}
+
 static void its_free_itte(struct its_itte *itte)
 {
 	list_del(&itte->itte_list);
+	kfree(itte->pending);
 	kfree(itte);
 }
 
diff --git a/virt/kvm/arm/its-emul.h b/virt/kvm/arm/its-emul.h
index 472a6d0..cc5d5ff 100644
--- a/virt/kvm/arm/its-emul.h
+++ b/virt/kvm/arm/its-emul.h
@@ -33,4 +33,7 @@ void vgic_enable_lpis(struct kvm_vcpu *vcpu);
 int vits_init(struct kvm *kvm);
 void vits_destroy(struct kvm *kvm);
 
+bool vits_queue_lpis(struct kvm_vcpu *vcpu);
+void vits_unqueue_lpi(struct kvm_vcpu *vcpu, int irq);
+
 #endif
diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
index e9aa29e..f482e34 100644
--- a/virt/kvm/arm/vgic-v3-emul.c
+++ b/virt/kvm/arm/vgic-v3-emul.c
@@ -944,6 +944,8 @@ void vgic_v3_init_emulation(struct kvm *kvm)
 	dist->vm_ops.init_model = vgic_v3_init_model;
 	dist->vm_ops.destroy_model = vgic_v3_destroy_model;
 	dist->vm_ops.map_resources = vgic_v3_map_resources;
+	dist->vm_ops.queue_lpis = vits_queue_lpis;
+	dist->vm_ops.unqueue_lpi = vits_unqueue_lpi;
 
 	dist->vgic_dist_base = VGIC_ADDR_UNDEF;
 	dist->vgic_redist_base = VGIC_ADDR_UNDEF;
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 11bf692..9ee87d3 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -120,6 +120,20 @@ static bool queue_sgi(struct kvm_vcpu *vcpu, int irq)
 	return vcpu->kvm->arch.vgic.vm_ops.queue_sgi(vcpu, irq);
 }
 
+static bool vgic_queue_lpis(struct kvm_vcpu *vcpu)
+{
+	if (vcpu->kvm->arch.vgic.vm_ops.queue_lpis)
+		return vcpu->kvm->arch.vgic.vm_ops.queue_lpis(vcpu);
+	else
+		return true;
+}
+
+static void vgic_unqueue_lpi(struct kvm_vcpu *vcpu, int irq)
+{
+	if (vcpu->kvm->arch.vgic.vm_ops.unqueue_lpi)
+		vcpu->kvm->arch.vgic.vm_ops.unqueue_lpi(vcpu, irq);
+}
+
 int kvm_vgic_map_resources(struct kvm *kvm)
 {
 	return kvm->arch.vgic.vm_ops.map_resources(kvm, vgic);
@@ -1148,18 +1162,28 @@ static void vgic_retire_lr(int lr_nr, struct kvm_vcpu *vcpu)
 static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
 				 int lr_nr, struct vgic_lr vlr)
 {
-	if (vgic_irq_is_active(vcpu, irq)) {
-		vlr.state |= LR_STATE_ACTIVE;
-		kvm_debug("Set active, clear distributor: 0x%x\n", vlr.state);
-		vgic_irq_clear_active(vcpu, irq);
-		vgic_update_state(vcpu->kvm);
-	} else if (vgic_dist_irq_is_pending(vcpu, irq)) {
-		vlr.state |= LR_STATE_PENDING;
-		kvm_debug("Set pending: 0x%x\n", vlr.state);
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+	/* We care only about state for SGIs/PPIs/SPIs, not for LPIs */
+	if (irq < dist->nr_irqs) {
+		if (vgic_irq_is_active(vcpu, irq)) {
+			vlr.state |= LR_STATE_ACTIVE;
+			kvm_debug("Set active, clear distributor: 0x%x\n",
+				  vlr.state);
+			vgic_irq_clear_active(vcpu, irq);
+			vgic_update_state(vcpu->kvm);
+		} else if (vgic_dist_irq_is_pending(vcpu, irq)) {
+			vlr.state |= LR_STATE_PENDING;
+			kvm_debug("Set pending: 0x%x\n", vlr.state);
+		}
+		if (!vgic_irq_is_edge(vcpu, irq))
+			vlr.state |= LR_EOI_INT;
+	} else {
+		/* If this is an LPI, it can only be pending */
+		if (irq >= 8192)
+			vlr.state |= LR_STATE_PENDING;
 	}
 
-	if (!vgic_irq_is_edge(vcpu, irq))
-		vlr.state |= LR_EOI_INT;
 
 	if (vlr.irq >= VGIC_NR_SGIS) {
 		struct irq_phys_map *map;
@@ -1190,16 +1214,14 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
  */
 bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 {
-	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
-	struct vgic_lr vlr;
 	u64 elrsr = vgic_get_elrsr(vcpu);
 	unsigned long *elrsr_ptr = u64_to_bitmask(&elrsr);
+	struct vgic_lr vlr;
 	int lr;
 
 	/* Sanitize the input... */
 	BUG_ON(sgi_source_id & ~7);
 	BUG_ON(sgi_source_id && irq >= VGIC_NR_SGIS);
-	BUG_ON(irq >= dist->nr_irqs);
 
 	kvm_debug("Queue IRQ%d\n", irq);
 
@@ -1282,8 +1304,12 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
 			overflow = 1;
 	}
 
-
-
+	/*
+	 * LPIs are not mapped in our bitmaps, so we leave the iteration
+	 * to the ITS emulation code.
+	 */
+	if (!vgic_queue_lpis(vcpu))
+		overflow = 1;
 
 epilog:
 	if (overflow) {
@@ -1488,20 +1514,30 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
 		if (test_bit(lr, elrsr_ptr))
 			continue;
 
-		/* Reestablish SGI source for pending and active SGIs */
-		if (vlr.irq < VGIC_NR_SGIS)
-			add_sgi_source(vcpu, vlr.irq, vlr.source);
-
-		if (vlr.state & LR_STATE_PENDING)
-			vgic_dist_irq_set_pending(vcpu, vlr.irq);
-
-		if (vlr.state & LR_STATE_ACTIVE) {
-			if (vlr.state & LR_STATE_PENDING) {
-				vgic_irq_set_active(vcpu, vlr.irq);
-			} else {
-				/* Active-only IRQs stay in the LR */
-				pending = true;
+		/* LPIs are handled separately */
+		if (vlr.irq >= 8192) {
+			/* We just need to take care of still pending LPIs */
+			if (!(vlr.state & LR_STATE_PENDING))
 				continue;
+			vgic_unqueue_lpi(vcpu, vlr.irq);
+		} else {
+			BUG_ON(!(vlr.state & LR_STATE_MASK));
+
+			/* Reestablish SGI source for pending and active SGIs */
+			if (vlr.irq < VGIC_NR_SGIS)
+				add_sgi_source(vcpu, vlr.irq, vlr.source);
+
+			if (vlr.state & LR_STATE_PENDING)
+				vgic_dist_irq_set_pending(vcpu, vlr.irq);
+
+			if (vlr.state & LR_STATE_ACTIVE) {
+				if (vlr.state & LR_STATE_PENDING) {
+					vgic_irq_set_active(vcpu, vlr.irq);
+				} else {
+					/* Active-only IRQs stay in the LR */
+					pending = true;
+					continue;
+				}
 			}
 		}
 
@@ -1512,6 +1548,7 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
 	}
 	vgic_update_state(vcpu->kvm);
 
+	/* vgic_update_state would not cover only-active IRQs or LPIs */
 	if (pending)
 		set_bit(vcpu->vcpu_id, dist->irq_pending_on_cpu);
 	spin_unlock(&dist->lock);
-- 
2.5.1

^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH v3 13/16] KVM: arm64: sync LPI configuration and pending tables
  2015-10-07 14:55 ` Andre Przywara
@ 2015-10-07 14:55   ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-07 14:55 UTC (permalink / raw)
  To: marc.zyngier, christoffer.dall; +Cc: kvm, kvmarm, linux-arm-kernel

The configuration and pending tables of the GICv3 LPIs are held
in (guest) memory. To achieve reasonable performance, we
cache this data in our own data structures, so we need to sync those
two views from time to time. This behaviour is well described in the
GICv3 spec and is also exercised by hardware, so the sync points are
well known.

Provide functions that read the guest memory and store the
information from the configuration and pending tables in the kernel.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
Changelog v2..v3:
- rework functions to avoid propbaser/pendbaser accesses inside lock
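
For illustration only (not part of this patch): a minimal user-space
sketch of syncing the cached LPI configuration from the property table.
It assumes the table is already available as a plain byte array (the
kernel code has to read it from guest memory chunk by chunk) and that
the tracked LPIs sit in a small array. All `sketch_*` names are
hypothetical. Each LPI owns one byte, indexed by (LPI number - 8192),
with the enable bit in bit 0 and the priority in bits [7:2].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LPI_OFFSET	8192
#define SKETCH_CHUNK_SIZE	4096u
#define SKETCH_PROP_ENABLED	0x01u
#define SKETCH_PROP_PRIORITY	0xfcu

struct sketch_lpi_cfg {
	uint32_t lpi;
	uint8_t priority;
	bool enabled;
};

/* Decode one property byte into the cached configuration. */
static void sketch_update_cfg(struct sketch_lpi_cfg *cfg, uint8_t prop)
{
	cfg->priority = prop & SKETCH_PROP_PRIORITY;
	cfg->enabled = prop & SKETCH_PROP_ENABLED;
}

/*
 * Walk the table one chunk at a time and refresh every tracked LPI whose
 * byte lies in the current chunk. Returns the number of LPIs updated.
 */
static size_t sketch_sync_config(const uint8_t *table, size_t table_size,
				 struct sketch_lpi_cfg *cfgs, size_t nr)
{
	size_t offset = 0, updated = 0;

	while (offset < table_size) {
		size_t chunk = table_size - offset;
		size_t i;

		if (chunk > SKETCH_CHUNK_SIZE)
			chunk = SKETCH_CHUNK_SIZE;

		/* A real implementation would fetch this chunk here. */
		for (i = 0; i < nr; i++) {
			size_t idx;

			if (cfgs[i].lpi < SKETCH_LPI_OFFSET)
				continue;
			idx = cfgs[i].lpi - SKETCH_LPI_OFFSET;
			if (idx < offset || idx >= offset + chunk)
				continue;
			sketch_update_cfg(&cfgs[i], table[idx]);
			updated++;
		}
		offset += chunk;
	}
	return updated;
}
```

Only LPIs that are already tracked get updated, which mirrors the note
in the patch that the scan relies on the LPI having been mapped before.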

 include/kvm/arm_vgic.h  |   2 +
 virt/kvm/arm/its-emul.c | 133 ++++++++++++++++++++++++++++++++++++++++++++++++
 virt/kvm/arm/its-emul.h |   3 ++
 3 files changed, 138 insertions(+)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 035911f..4ea023c 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -179,6 +179,8 @@ struct vgic_its {
 	int			cwriter;
 	struct list_head	device_list;
 	struct list_head	collection_list;
+	/* memory used for buffering guest's memory */
+	void			*buffer_page;
 };
 
 struct vgic_dist {
diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
index 8349970..7a8c5db 100644
--- a/virt/kvm/arm/its-emul.c
+++ b/virt/kvm/arm/its-emul.c
@@ -59,6 +59,7 @@ struct its_itte {
 	struct its_collection *collection;
 	u32 lpi;
 	u32 event_id;
+	u8 priority;
 	bool enabled;
 	unsigned long *pending;
 };
@@ -80,8 +81,124 @@ static struct its_itte *find_itte_by_lpi(struct kvm *kvm, int lpi)
 	return NULL;
 }
 
+#define LPI_PROP_ENABLE_BIT(p)	((p) & LPI_PROP_ENABLED)
+#define LPI_PROP_PRIORITY(p)	((p) & 0xfc)
+
+/* stores the priority and enable bit for a given LPI */
+static void update_lpi_config(struct kvm *kvm, struct its_itte *itte, u8 prop)
+{
+	itte->priority = LPI_PROP_PRIORITY(prop);
+	itte->enabled  = LPI_PROP_ENABLE_BIT(prop);
+}
+
+#define GIC_LPI_OFFSET 8192
+
+/* We scan the table in chunks the size of the smallest page size */
+#define CHUNK_SIZE 4096U
+
 #define BASER_BASE_ADDRESS(x) ((x) & 0xfffffffff000ULL)
 
+static int nr_idbits_propbase(u64 propbaser)
+{
+	int nr_idbits = (1U << (propbaser & 0x1f)) + 1;
+
+	return max(nr_idbits, INTERRUPT_ID_BITS_ITS);
+}
+
+/*
+ * Scan the whole LPI configuration table and put the LPI configuration
+ * data in our own data structures. This relies on the LPI being
+ * mapped before.
+ */
+static bool its_update_lpis_configuration(struct kvm *kvm, u64 prop_base_reg)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	u8 *prop = dist->its.buffer_page;
+	u32 tsize;
+	gpa_t propbase;
+	int lpi = GIC_LPI_OFFSET;
+	struct its_itte *itte;
+	struct its_device *device;
+	int ret;
+
+	propbase = BASER_BASE_ADDRESS(prop_base_reg);
+	tsize = nr_idbits_propbase(prop_base_reg);
+
+	while (tsize > 0) {
+		int chunksize = min(tsize, CHUNK_SIZE);
+
+		ret = kvm_read_guest(kvm, propbase, prop, chunksize);
+		if (ret)
+			return false;
+
+		spin_lock(&dist->its.lock);
+		/*
+		 * Updating the status for all allocated LPIs. We catch
+		 * those LPIs that get disabled. We really don't care
+		 * about unmapped LPIs, as they need to be updated
+		 * later manually anyway once they get mapped.
+		 */
+		for_each_lpi(device, itte, kvm) {
+			if (itte->lpi < lpi || itte->lpi >= lpi + chunksize)
+				continue;
+
+			update_lpi_config(kvm, itte, prop[itte->lpi - lpi]);
+		}
+		spin_unlock(&dist->its.lock);
+		tsize -= chunksize;
+		lpi += chunksize;
+		propbase += chunksize;
+	}
+
+	return true;
+}
+
+/*
+ * Scan the whole LPI pending table and sync the pending bit in there
+ * with our own data structures. This relies on the LPI being
+ * mapped before.
+ */
+static bool its_sync_lpi_pending_table(struct kvm_vcpu *vcpu, u64 base_addr_reg)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	unsigned long *pendmask = dist->its.buffer_page;
+	u32 nr_lpis = VITS_NR_LPIS;
+	gpa_t pendbase;
+	int lpi = 0;
+	struct its_itte *itte;
+	struct its_device *device;
+	int ret;
+	int lpi_bit, nr_bits;
+
+	pendbase = BASER_BASE_ADDRESS(base_addr_reg);
+
+	while (nr_lpis > 0) {
+		nr_bits = min(nr_lpis, CHUNK_SIZE * 8);
+
+		ret = kvm_read_guest(vcpu->kvm, pendbase, pendmask,
+				     nr_bits / 8);
+		if (ret)
+			return false;
+
+		spin_lock(&dist->its.lock);
+		for_each_lpi(device, itte, vcpu->kvm) {
+			lpi_bit = itte->lpi - lpi;
+			if (lpi_bit < 0 || lpi_bit >= nr_bits)
+				continue;
+			if (test_bit(lpi_bit, pendmask))
+				__set_bit(vcpu->vcpu_id, itte->pending);
+			else
+				__clear_bit(vcpu->vcpu_id, itte->pending);
+		}
+		spin_unlock(&dist->its.lock);
+		nr_lpis -= nr_bits;
+		lpi += nr_bits;
+		pendbase += nr_bits / 8;
+	}
+
+	return true;
+}
+
 /* The distributor lock is held by the VGIC MMIO handler. */
 static bool handle_mmio_misc_gits(struct kvm_vcpu *vcpu,
 				  struct kvm_exit_mmio *mmio,
@@ -418,6 +535,17 @@ static const struct vgic_io_range vgicv3_its_ranges[] = {
 /* This is called on setting the LPI enable bit in the redistributor. */
 void vgic_enable_lpis(struct kvm_vcpu *vcpu)
 {
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	u64 prop_base_reg, pend_base_reg;
+
+	pend_base_reg = dist->pendbaser[vcpu->vcpu_id];
+	prop_base_reg = dist->propbaser;
+	spin_unlock(&dist->lock);
+
+	its_update_lpis_configuration(vcpu->kvm, prop_base_reg);
+	its_sync_lpi_pending_table(vcpu, pend_base_reg);
+
+	spin_lock(&dist->lock);
 }
 
 int vits_init(struct kvm *kvm)
@@ -429,6 +557,10 @@ int vits_init(struct kvm *kvm)
 	if (!dist->pendbaser)
 		return -ENOMEM;
 
+	its->buffer_page = kmalloc(CHUNK_SIZE, GFP_KERNEL);
+	if (!its->buffer_page)
+		return -ENOMEM;
+
 	spin_lock_init(&its->lock);
 
 	INIT_LIST_HEAD(&its->device_list);
@@ -474,6 +606,7 @@ void vits_destroy(struct kvm *kvm)
 		kfree(container_of(cur, struct its_collection, coll_list));
 	}
 
+	kfree(its->buffer_page);
 	kfree(dist->pendbaser);
 
 	its->enabled = false;
diff --git a/virt/kvm/arm/its-emul.h b/virt/kvm/arm/its-emul.h
index cc5d5ff..cbc3877 100644
--- a/virt/kvm/arm/its-emul.h
+++ b/virt/kvm/arm/its-emul.h
@@ -29,6 +29,9 @@
 
 #include "vgic.h"
 
+#define INTERRUPT_ID_BITS_ITS 16
+#define VITS_NR_LPIS (1U << INTERRUPT_ID_BITS_ITS)
+
 void vgic_enable_lpis(struct kvm_vcpu *vcpu);
 int vits_init(struct kvm *kvm);
 void vits_destroy(struct kvm *kvm);
-- 
2.5.1

^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH v3 14/16] KVM: arm64: implement ITS command queue command handlers
  2015-10-07 14:55 ` Andre Przywara
@ 2015-10-07 14:55   ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-07 14:55 UTC (permalink / raw)
  To: marc.zyngier, christoffer.dall
  Cc: eric.auger, p.fedin, kvmarm, linux-arm-kernel, kvm

The connection between a device, an event ID, the LPI number and the
allocated CPU is stored in in-memory tables in a GICv3, but their
format is not specified by the spec. Instead software uses a command
queue in a ring buffer to let the ITS implementation use its own
format.
Implement handlers for the various ITS commands and let them store
the requested relation into our own data structures.
To avoid kmallocs inside the ITS spinlock, we preallocate possibly
needed memory outside of the lock and free it if it turns out not to
be needed (mostly error handling).
Error handling is very basic at this point, as we don't have a good
way of communicating errors to the guest (usually an SError).
The INT command handler is missing at this point, as we gain the
capability of actually injecting MSIs into the guest only later on.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
Changelog v2..v3:
- adjust handlers to new pendbaser/propbaser locking scheme
- properly free ITTEs (including pending bitmap)
- fix handling of unmapped collections
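
As an illustration of how the its_cmd_get_*() accessors slice a
command, here is a small userspace sketch of the field extraction.
cmd_field() is a hypothetical name and not part of the patch; it
assumes a command is four 64-bit words that are already in host byte
order (the kernel helper converts with le64_to_cpu() first). The field
positions in the usage example are the ones the macros in this patch
use.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the `size`-bit wide field that starts at bit `shift` of
 * 64-bit word `word` of an ITS command. A command is assumed to
 * consist of four 64-bit words in host byte order.
 */
static uint64_t cmd_field(const uint64_t *cmd, int word, int shift, int size)
{
	assert(word >= 0 && word < 4);
	assert(shift >= 0 && size > 0 && size < 64 && shift + size <= 64);

	return (cmd[word] >> shift) & ((UINT64_C(1) << size) - 1);
}
```

For example, in a MAPTI command the accessors read the command number
from word 0 bits [7:0], the device ID from word 0 bits [63:32], the
event ID from word 1 bits [31:0], the LPI number from word 1 bits
[63:32] and the collection ID from word 2 bits [15:0], so
cmd_field(cmd, 1, 32, 32) would yield the LPI number.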

 include/linux/irqchip/arm-gic-v3.h |   5 +-
 virt/kvm/arm/its-emul.c            | 502 ++++++++++++++++++++++++++++++++++++-
 virt/kvm/arm/its-emul.h            |  11 +
 3 files changed, 516 insertions(+), 2 deletions(-)

diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index ef274a9..27c0e75 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -255,7 +255,10 @@
  */
 #define GITS_CMD_MAPD			0x08
 #define GITS_CMD_MAPC			0x09
-#define GITS_CMD_MAPVI			0x0a
+#define GITS_CMD_MAPTI			0x0a
+/* older GIC documentation used MAPVI for this command */
+#define GITS_CMD_MAPVI			GITS_CMD_MAPTI
+#define GITS_CMD_MAPI			0x0b
 #define GITS_CMD_MOVI			0x01
 #define GITS_CMD_DISCARD		0x0f
 #define GITS_CMD_INV			0x0c
diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
index 7a8c5db..642effb 100644
--- a/virt/kvm/arm/its-emul.c
+++ b/virt/kvm/arm/its-emul.c
@@ -22,6 +22,7 @@
 #include <linux/kvm_host.h>
 #include <linux/interrupt.h>
 #include <linux/list.h>
+#include <linux/slab.h>
 
 #include <linux/irqchip/arm-gic-v3.h>
 #include <kvm/arm_vgic.h>
@@ -64,6 +65,34 @@ struct its_itte {
 	unsigned long *pending;
 };
 
+static struct its_device *find_its_device(struct kvm *kvm, u32 device_id)
+{
+	struct vgic_its *its = &kvm->arch.vgic.its;
+	struct its_device *device;
+
+	list_for_each_entry(device, &its->device_list, dev_list)
+		if (device_id == device->device_id)
+			return device;
+
+	return NULL;
+}
+
+static struct its_itte *find_itte(struct kvm *kvm, u32 device_id, u32 event_id)
+{
+	struct its_device *device;
+	struct its_itte *itte;
+
+	device = find_its_device(kvm, device_id);
+	if (device == NULL)
+		return NULL;
+
+	list_for_each_entry(itte, &device->itt, itte_list)
+		if (itte->event_id == event_id)
+			return itte;
+
+	return NULL;
+}
+
 /* To be used as an iterator this macro misses the enclosing parentheses */
 #define for_each_lpi(dev, itte, kvm) \
 	list_for_each_entry(dev, &(kvm)->arch.vgic.its.device_list, dev_list) \
@@ -81,6 +110,19 @@ static struct its_itte *find_itte_by_lpi(struct kvm *kvm, int lpi)
 	return NULL;
 }
 
+static struct its_collection *find_collection(struct kvm *kvm, int coll_id)
+{
+	struct its_collection *collection;
+
+	list_for_each_entry(collection, &kvm->arch.vgic.its.collection_list,
+			    coll_list) {
+		if (coll_id == collection->collection_id)
+			return collection;
+	}
+
+	return NULL;
+}
+
 #define LPI_PROP_ENABLE_BIT(p)	((p) & LPI_PROP_ENABLED)
 #define LPI_PROP_PRIORITY(p)	((p) & 0xfc)
 
@@ -352,13 +394,471 @@ static void its_free_itte(struct its_itte *itte)
 	kfree(itte);
 }
 
+static u64 its_cmd_mask_field(u64 *its_cmd, int word, int shift, int size)
+{
+	return (le64_to_cpu(its_cmd[word]) >> shift) & (BIT_ULL(size) - 1);
+}
+
+#define its_cmd_get_command(cmd)	its_cmd_mask_field(cmd, 0,  0,  8)
+#define its_cmd_get_deviceid(cmd)	its_cmd_mask_field(cmd, 0, 32, 32)
+#define its_cmd_get_id(cmd)		its_cmd_mask_field(cmd, 1,  0, 32)
+#define its_cmd_get_physical_id(cmd)	its_cmd_mask_field(cmd, 1, 32, 32)
+#define its_cmd_get_collection(cmd)	its_cmd_mask_field(cmd, 2,  0, 16)
+#define its_cmd_get_target_addr(cmd)	its_cmd_mask_field(cmd, 2, 16, 32)
+#define its_cmd_get_validbit(cmd)	its_cmd_mask_field(cmd, 2, 63,  1)
+
+/* The DISCARD command frees an Interrupt Translation Table Entry (ITTE). */
+static int vits_cmd_handle_discard(struct kvm *kvm, u64 *its_cmd)
+{
+	struct vgic_its *its = &kvm->arch.vgic.its;
+	u32 device_id;
+	u32 event_id;
+	struct its_itte *itte;
+	int ret = E_ITS_DISCARD_UNMAPPED_INTERRUPT;
+
+	device_id = its_cmd_get_deviceid(its_cmd);
+	event_id = its_cmd_get_id(its_cmd);
+
+	spin_lock(&its->lock);
+	itte = find_itte(kvm, device_id, event_id);
+	if (itte && itte->collection) {
+		/*
+		 * Though the spec talks about removing the pending state, we
+		 * don't bother here since we clear the ITTE anyway and the
+		 * pending state is a property of the ITTE struct.
+		 */
+		its_free_itte(itte);
+		ret = 0;
+	}
+
+	spin_unlock(&its->lock);
+	return ret;
+}
+
+/* The MOVI command moves an ITTE to a different collection. */
+static int vits_cmd_handle_movi(struct kvm *kvm, u64 *its_cmd)
+{
+	struct vgic_its *its = &kvm->arch.vgic.its;
+	u32 device_id = its_cmd_get_deviceid(its_cmd);
+	u32 event_id = its_cmd_get_id(its_cmd);
+	u32 coll_id = its_cmd_get_collection(its_cmd);
+	struct its_itte *itte;
+	struct its_collection *collection;
+	int ret = 0;
+
+	spin_lock(&its->lock);
+	itte = find_itte(kvm, device_id, event_id);
+	if (!itte) {
+		ret = E_ITS_MOVI_UNMAPPED_INTERRUPT;
+		goto out_unlock;
+	}
+	if (!its_is_collection_mapped(itte->collection)) {
+		ret = E_ITS_MOVI_UNMAPPED_COLLECTION;
+		goto out_unlock;
+	}
+
+	collection = find_collection(kvm, coll_id);
+	if (!its_is_collection_mapped(collection)) {
+		ret = E_ITS_MOVI_UNMAPPED_COLLECTION;
+		goto out_unlock;
+	}
+
+	if (test_and_clear_bit(itte->collection->target_addr, itte->pending))
+		__set_bit(collection->target_addr, itte->pending);
+
+	itte->collection = collection;
+out_unlock:
+	spin_unlock(&its->lock);
+	return ret;
+}
+
+static void vits_init_collection(struct kvm *kvm,
+				 struct its_collection *collection,
+				 u32 coll_id)
+{
+	collection->collection_id = coll_id;
+	collection->target_addr = COLLECTION_NOT_MAPPED;
+
+	list_add_tail(&collection->coll_list,
+		&kvm->arch.vgic.its.collection_list);
+}
+
+/* The MAPTI and MAPI commands map LPIs to ITTEs. */
+static int vits_cmd_handle_mapi(struct kvm *kvm, u64 *its_cmd, u8 cmd)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	u32 device_id = its_cmd_get_deviceid(its_cmd);
+	u32 event_id = its_cmd_get_id(its_cmd);
+	u32 coll_id = its_cmd_get_collection(its_cmd);
+	struct its_itte *itte, *new_itte;
+	struct its_device *device;
+	struct its_collection *collection, *new_coll;
+	int lpi_nr;
+	int ret = 0;
+
+	/* Preallocate possibly needed memory here outside of the lock */
+	new_coll = kmalloc(sizeof(struct its_collection), GFP_KERNEL);
+	new_itte = kzalloc(sizeof(struct its_itte), GFP_KERNEL);
+	if (new_itte)
+		new_itte->pending = kcalloc(BITS_TO_LONGS(dist->nr_cpus),
+					    sizeof(long), GFP_KERNEL);
+
+	spin_lock(&dist->its.lock);
+
+	device = find_its_device(kvm, device_id);
+	if (!device) {
+		ret = E_ITS_MAPTI_UNMAPPED_DEVICE;
+		goto out_unlock;
+	}
+
+	collection = find_collection(kvm, coll_id);
+	if (!collection && !new_coll) {
+		ret = -ENOMEM;
+		goto out_unlock;
+	}
+
+	if (cmd == GITS_CMD_MAPTI)
+		lpi_nr = its_cmd_get_physical_id(its_cmd);
+	else
+		lpi_nr = event_id;
+	if (lpi_nr < GIC_LPI_OFFSET ||
+	    lpi_nr >= nr_idbits_propbase(dist->propbaser)) {
+		ret = E_ITS_MAPTI_PHYSICALID_OOR;
+		goto out_unlock;
+	}
+
+	itte = find_itte(kvm, device_id, event_id);
+	if (!itte) {
+		if (!new_itte || !new_itte->pending) {
+			ret = -ENOMEM;
+			goto out_unlock;
+		}
+		itte = new_itte;
+
+		itte->event_id	= event_id;
+		list_add_tail(&itte->itte_list, &device->itt);
+	} else {
+		if (new_itte)
+			kfree(new_itte->pending);
+		kfree(new_itte);
+	}
+
+	if (!collection) {
+		collection = new_coll;
+		vits_init_collection(kvm, collection, coll_id);
+	} else {
+		kfree(new_coll);
+	}
+
+	itte->collection = collection;
+	itte->lpi = lpi_nr;
+
+out_unlock:
+	spin_unlock(&dist->its.lock);
+	if (ret) {
+		kfree(new_coll);
+		if (new_itte)
+			kfree(new_itte->pending);
+		kfree(new_itte);
+	}
+	return ret;
+}
+
+static void vits_unmap_device(struct kvm *kvm, struct its_device *device)
+{
+	struct its_itte *itte, *temp;
+
+	/*
+	 * The spec says that unmapping a device with still valid
+	 * ITTEs associated is UNPREDICTABLE. We remove all ITTEs,
+	 * since we cannot leave the memory unreferenced.
+	 */
+	list_for_each_entry_safe(itte, temp, &device->itt, itte_list)
+		its_free_itte(itte);
+
+	list_del(&device->dev_list);
+	kfree(device);
+}
+
+/* MAPD maps or unmaps a device ID to Interrupt Translation Tables (ITTs). */
+static int vits_cmd_handle_mapd(struct kvm *kvm, u64 *its_cmd)
+{
+	struct vgic_its *its = &kvm->arch.vgic.its;
+	bool valid = its_cmd_get_validbit(its_cmd);
+	u32 device_id = its_cmd_get_deviceid(its_cmd);
+	struct its_device *device, *new_device = NULL;
+
+	/* We preallocate memory outside of the lock here */
+	if (valid) {
+		new_device = kzalloc(sizeof(struct its_device), GFP_KERNEL);
+		if (!new_device)
+			return -ENOMEM;
+	}
+
+	spin_lock(&its->lock);
+
+	device = find_its_device(kvm, device_id);
+	if (device)
+		vits_unmap_device(kvm, device);
+
+	/*
+	 * The spec does not say whether unmapping a not-mapped device
+	 * is an error, so we are done in any case.
+	 */
+	if (!valid)
+		goto out_unlock;
+
+	device = new_device;
+
+	device->device_id = device_id;
+	INIT_LIST_HEAD(&device->itt);
+
+	list_add_tail(&device->dev_list,
+		      &kvm->arch.vgic.its.device_list);
+
+out_unlock:
+	spin_unlock(&its->lock);
+	return 0;
+}
+
+/* The MAPC command maps collection IDs to redistributors. */
+static int vits_cmd_handle_mapc(struct kvm *kvm, u64 *its_cmd)
+{
+	struct vgic_its *its = &kvm->arch.vgic.its;
+	u16 coll_id;
+	u32 target_addr;
+	struct its_collection *collection, *new_coll = NULL;
+	bool valid;
+
+	valid = its_cmd_get_validbit(its_cmd);
+	coll_id = its_cmd_get_collection(its_cmd);
+	target_addr = its_cmd_get_target_addr(its_cmd);
+
+	if (target_addr >= atomic_read(&kvm->online_vcpus))
+		return E_ITS_MAPC_PROCNUM_OOR;
+
+	/* We preallocate memory outside of the lock here */
+	if (valid) {
+		new_coll = kmalloc(sizeof(struct its_collection), GFP_KERNEL);
+		if (!new_coll)
+			return -ENOMEM;
+	}
+
+	spin_lock(&its->lock);
+	collection = find_collection(kvm, coll_id);
+
+	if (!valid) {
+		struct its_device *device;
+		struct its_itte *itte;
+		/*
+		 * Clearing the mapping for that collection ID removes the
+		 * entry from the list. If there wasn't any before, we can
+		 * go home early.
+		 */
+		if (!collection)
+			goto out_unlock;
+
+		for_each_lpi(device, itte, kvm)
+			if (itte->collection &&
+			    itte->collection->collection_id == coll_id)
+				itte->collection = NULL;
+
+		list_del(&collection->coll_list);
+		kfree(collection);
+	} else {
+		if (!collection)
+			collection = new_coll;
+		else
+			kfree(new_coll);
+
+		vits_init_collection(kvm, collection, coll_id);
+		collection->target_addr = target_addr;
+	}
+
+out_unlock:
+	spin_unlock(&its->lock);
+	return 0;
+}
+
+/* The CLEAR command removes the pending state for a particular LPI. */
+static int vits_cmd_handle_clear(struct kvm *kvm, u64 *its_cmd)
+{
+	struct vgic_its *its = &kvm->arch.vgic.its;
+	u32 device_id;
+	u32 event_id;
+	struct its_itte *itte;
+	int ret = 0;
+
+	device_id = its_cmd_get_deviceid(its_cmd);
+	event_id = its_cmd_get_id(its_cmd);
+
+	spin_lock(&its->lock);
+
+	itte = find_itte(kvm, device_id, event_id);
+	if (!itte) {
+		ret = E_ITS_CLEAR_UNMAPPED_INTERRUPT;
+		goto out_unlock;
+	}
+
+	if (its_is_collection_mapped(itte->collection))
+		__clear_bit(itte->collection->target_addr, itte->pending);
+
+out_unlock:
+	spin_unlock(&its->lock);
+	return ret;
+}
+
+/* The INV command syncs the configuration bits from the memory tables. */
+static int vits_cmd_handle_inv(struct kvm *kvm, u64 *its_cmd)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	u32 device_id;
+	u32 event_id;
+	struct its_itte *itte, *new_itte;
+	gpa_t propbase;
+	int ret;
+	u8 prop;
+
+	device_id = its_cmd_get_deviceid(its_cmd);
+	event_id = its_cmd_get_id(its_cmd);
+
+	spin_lock(&dist->its.lock);
+	itte = find_itte(kvm, device_id, event_id);
+	spin_unlock(&dist->its.lock);
+	if (!itte)
+		return E_ITS_INV_UNMAPPED_INTERRUPT;
+
+	/*
+	 * We cannot read from guest memory inside the spinlock, so we
+	 * need to re-read our tables to learn whether the LPI number we are
+	 * using is still valid.
+	 */
+	do {
+		propbase = BASER_BASE_ADDRESS(dist->propbaser);
+		ret = kvm_read_guest(kvm, propbase + itte->lpi - GIC_LPI_OFFSET,
+				     &prop, 1);
+		if (ret)
+			return ret;
+
+		spin_lock(&dist->its.lock);
+		new_itte = find_itte(kvm, device_id, event_id);
+		if (new_itte->lpi != itte->lpi) {
+			itte = new_itte;
+			spin_unlock(&dist->its.lock);
+			continue;
+		}
+		update_lpi_config(kvm, itte, prop);
+		spin_unlock(&dist->its.lock);
+	} while (0);
+	return 0;
+}
+
+/* The INVALL command requests flushing of all IRQ data in this collection. */
+static int vits_cmd_handle_invall(struct kvm *kvm, u64 *its_cmd)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	u64 prop_base_reg, pend_base_reg;
+	u32 coll_id = its_cmd_get_collection(its_cmd);
+	struct its_collection *collection;
+	struct kvm_vcpu *vcpu;
+
+	collection = find_collection(kvm, coll_id);
+	if (!its_is_collection_mapped(collection))
+		return E_ITS_INVALL_UNMAPPED_COLLECTION;
+
+	vcpu = kvm_get_vcpu(kvm, collection->target_addr);
+
+	spin_lock(&dist->lock);
+	pend_base_reg = dist->pendbaser[vcpu->vcpu_id];
+	prop_base_reg = dist->propbaser;
+	spin_unlock(&dist->lock);
+
+	its_update_lpis_configuration(kvm, prop_base_reg);
+	its_sync_lpi_pending_table(vcpu, pend_base_reg);
+
+	return 0;
+}
+
+/* The MOVALL command moves all IRQs from one redistributor to another. */
+static int vits_cmd_handle_movall(struct kvm *kvm, u64 *its_cmd)
+{
+	struct vgic_its *its = &kvm->arch.vgic.its;
+	u32 target1_addr = its_cmd_get_target_addr(its_cmd);
+	u32 target2_addr = its_cmd_mask_field(its_cmd, 3, 16, 32);
+	struct its_collection *collection;
+	struct its_device *device;
+	struct its_itte *itte;
+
+	if (target1_addr >= atomic_read(&kvm->online_vcpus) ||
+	    target2_addr >= atomic_read(&kvm->online_vcpus))
+		return E_ITS_MOVALL_PROCNUM_OOR;
+
+	if (target1_addr == target2_addr)
+		return 0;
+
+	spin_lock(&its->lock);
+	for_each_lpi(device, itte, kvm) {
+		/* remap all collections mapped to target address 1 */
+		collection = itte->collection;
+		if (collection && collection->target_addr == target1_addr)
+			collection->target_addr = target2_addr;
+
+		/* move pending state if LPI is affected */
+		if (test_and_clear_bit(target1_addr, itte->pending))
+			__set_bit(target2_addr, itte->pending);
+	}
+
+	spin_unlock(&its->lock);
+	return 0;
+}
+
 /*
  * This function is called with both the ITS and the distributor lock dropped,
  * so the actual command handlers must take the respective locks when needed.
  */
 static int vits_handle_command(struct kvm_vcpu *vcpu, u64 *its_cmd)
 {
-	return -ENODEV;
+	u8 cmd = its_cmd_get_command(its_cmd);
+	int ret = -ENODEV;
+
+	switch (cmd) {
+	case GITS_CMD_MAPD:
+		ret = vits_cmd_handle_mapd(vcpu->kvm, its_cmd);
+		break;
+	case GITS_CMD_MAPC:
+		ret = vits_cmd_handle_mapc(vcpu->kvm, its_cmd);
+		break;
+	case GITS_CMD_MAPI:
+		ret = vits_cmd_handle_mapi(vcpu->kvm, its_cmd, cmd);
+		break;
+	case GITS_CMD_MAPTI:
+		ret = vits_cmd_handle_mapi(vcpu->kvm, its_cmd, cmd);
+		break;
+	case GITS_CMD_MOVI:
+		ret = vits_cmd_handle_movi(vcpu->kvm, its_cmd);
+		break;
+	case GITS_CMD_DISCARD:
+		ret = vits_cmd_handle_discard(vcpu->kvm, its_cmd);
+		break;
+	case GITS_CMD_CLEAR:
+		ret = vits_cmd_handle_clear(vcpu->kvm, its_cmd);
+		break;
+	case GITS_CMD_MOVALL:
+		ret = vits_cmd_handle_movall(vcpu->kvm, its_cmd);
+		break;
+	case GITS_CMD_INV:
+		ret = vits_cmd_handle_inv(vcpu->kvm, its_cmd);
+		break;
+	case GITS_CMD_INVALL:
+		ret = vits_cmd_handle_invall(vcpu->kvm, its_cmd);
+		break;
+	case GITS_CMD_SYNC:
+		/* we ignore this command: we are in sync all of the time */
+		ret = 0;
+		break;
+	}
+
+	return ret;
 }
 
 static bool handle_mmio_gits_cbaser(struct kvm_vcpu *vcpu,
diff --git a/virt/kvm/arm/its-emul.h b/virt/kvm/arm/its-emul.h
index cbc3877..830524a 100644
--- a/virt/kvm/arm/its-emul.h
+++ b/virt/kvm/arm/its-emul.h
@@ -39,4 +39,15 @@ void vits_destroy(struct kvm *kvm);
 bool vits_queue_lpis(struct kvm_vcpu *vcpu);
 void vits_unqueue_lpi(struct kvm_vcpu *vcpu, int irq);
 
+#define E_ITS_MOVI_UNMAPPED_INTERRUPT		0x010107
+#define E_ITS_MOVI_UNMAPPED_COLLECTION		0x010109
+#define E_ITS_CLEAR_UNMAPPED_INTERRUPT		0x010507
+#define E_ITS_MAPC_PROCNUM_OOR			0x010902
+#define E_ITS_MAPTI_UNMAPPED_DEVICE		0x010a04
+#define E_ITS_MAPTI_PHYSICALID_OOR		0x010a06
+#define E_ITS_INV_UNMAPPED_INTERRUPT		0x010c07
+#define E_ITS_INVALL_UNMAPPED_COLLECTION	0x010d09
+#define E_ITS_MOVALL_PROCNUM_OOR		0x010e01
+#define E_ITS_DISCARD_UNMAPPED_INTERRUPT	0x010f07
+
 #endif
-- 
2.5.1


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH v3 14/16] KVM: arm64: implement ITS command queue command handlers
@ 2015-10-07 14:55   ` Andre Przywara
  0 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-07 14:55 UTC (permalink / raw)
  To: linux-arm-kernel

The connection between a device, an event ID, the LPI number and the
allocated CPU is stored in in-memory tables in a GICv3, but their
format is not specified by the spec. Instead software uses a command
queue in a ring buffer to let the ITS implementation use its own
format.
Implement handlers for the various ITS commands and let them store
the requested relation into our own data structures.
To avoid kmallocs inside the ITS spinlock, we preallocate possibly
needed memory outside of the lock and free it if it turns out not to
be needed (mostly error handling).
Error handling is very basic at this point, as we don't have a good
way of communicating errors to the guest (usually an SError).
The INT command handler is missing at this point, as we gain the
capability of actually injecting MSIs into the guest only later on.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
Changelog v2..v3:
- adjust handlers to new pendbaser/propbaser locking scheme
- properly free ITTEs (including pending bitmap)
- fix handling of unmapped collections

 include/linux/irqchip/arm-gic-v3.h |   5 +-
 virt/kvm/arm/its-emul.c            | 502 ++++++++++++++++++++++++++++++++++++-
 virt/kvm/arm/its-emul.h            |  11 +
 3 files changed, 516 insertions(+), 2 deletions(-)

diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index ef274a9..27c0e75 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -255,7 +255,10 @@
  */
 #define GITS_CMD_MAPD			0x08
 #define GITS_CMD_MAPC			0x09
-#define GITS_CMD_MAPVI			0x0a
+#define GITS_CMD_MAPTI			0x0a
+/* older GIC documentation used MAPVI for this command */
+#define GITS_CMD_MAPVI			GITS_CMD_MAPTI
+#define GITS_CMD_MAPI			0x0b
 #define GITS_CMD_MOVI			0x01
 #define GITS_CMD_DISCARD		0x0f
 #define GITS_CMD_INV			0x0c
diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
index 7a8c5db..642effb 100644
--- a/virt/kvm/arm/its-emul.c
+++ b/virt/kvm/arm/its-emul.c
@@ -22,6 +22,7 @@
 #include <linux/kvm_host.h>
 #include <linux/interrupt.h>
 #include <linux/list.h>
+#include <linux/slab.h>
 
 #include <linux/irqchip/arm-gic-v3.h>
 #include <kvm/arm_vgic.h>
@@ -64,6 +65,34 @@ struct its_itte {
 	unsigned long *pending;
 };
 
+static struct its_device *find_its_device(struct kvm *kvm, u32 device_id)
+{
+	struct vgic_its *its = &kvm->arch.vgic.its;
+	struct its_device *device;
+
+	list_for_each_entry(device, &its->device_list, dev_list)
+		if (device_id == device->device_id)
+			return device;
+
+	return NULL;
+}
+
+static struct its_itte *find_itte(struct kvm *kvm, u32 device_id, u32 event_id)
+{
+	struct its_device *device;
+	struct its_itte *itte;
+
+	device = find_its_device(kvm, device_id);
+	if (device == NULL)
+		return NULL;
+
+	list_for_each_entry(itte, &device->itt, itte_list)
+		if (itte->event_id == event_id)
+			return itte;
+
+	return NULL;
+}
+
 /* To be used as an iterator this macro misses the enclosing parentheses */
 #define for_each_lpi(dev, itte, kvm) \
 	list_for_each_entry(dev, &(kvm)->arch.vgic.its.device_list, dev_list) \
@@ -81,6 +110,19 @@ static struct its_itte *find_itte_by_lpi(struct kvm *kvm, int lpi)
 	return NULL;
 }
 
+static struct its_collection *find_collection(struct kvm *kvm, int coll_id)
+{
+	struct its_collection *collection;
+
+	list_for_each_entry(collection, &kvm->arch.vgic.its.collection_list,
+			    coll_list) {
+		if (coll_id == collection->collection_id)
+			return collection;
+	}
+
+	return NULL;
+}
+
 #define LPI_PROP_ENABLE_BIT(p)	((p) & LPI_PROP_ENABLED)
 #define LPI_PROP_PRIORITY(p)	((p) & 0xfc)
 
@@ -352,13 +394,471 @@ static void its_free_itte(struct its_itte *itte)
 	kfree(itte);
 }
 
+static u64 its_cmd_mask_field(u64 *its_cmd, int word, int shift, int size)
+{
+	return (le64_to_cpu(its_cmd[word]) >> shift) & (BIT_ULL(size) - 1);
+}
+
+#define its_cmd_get_command(cmd)	its_cmd_mask_field(cmd, 0,  0,  8)
+#define its_cmd_get_deviceid(cmd)	its_cmd_mask_field(cmd, 0, 32, 32)
+#define its_cmd_get_id(cmd)		its_cmd_mask_field(cmd, 1,  0, 32)
+#define its_cmd_get_physical_id(cmd)	its_cmd_mask_field(cmd, 1, 32, 32)
+#define its_cmd_get_collection(cmd)	its_cmd_mask_field(cmd, 2,  0, 16)
+#define its_cmd_get_target_addr(cmd)	its_cmd_mask_field(cmd, 2, 16, 32)
+#define its_cmd_get_validbit(cmd)	its_cmd_mask_field(cmd, 2, 63,  1)
+
+/* The DISCARD command frees an Interrupt Translation Table Entry (ITTE). */
+static int vits_cmd_handle_discard(struct kvm *kvm, u64 *its_cmd)
+{
+	struct vgic_its *its = &kvm->arch.vgic.its;
+	u32 device_id;
+	u32 event_id;
+	struct its_itte *itte;
+	int ret = E_ITS_DISCARD_UNMAPPED_INTERRUPT;
+
+	device_id = its_cmd_get_deviceid(its_cmd);
+	event_id = its_cmd_get_id(its_cmd);
+
+	spin_lock(&its->lock);
+	itte = find_itte(kvm, device_id, event_id);
+	if (itte && itte->collection) {
+		/*
+		 * Though the spec talks about removing the pending state, we
+		 * don't bother here since we clear the ITTE anyway and the
+		 * pending state is a property of the ITTE struct.
+		 */
+		its_free_itte(itte);
+		ret = 0;
+	}
+
+	spin_unlock(&its->lock);
+	return ret;
+}
+
+/* The MOVI command moves an ITTE to a different collection. */
+static int vits_cmd_handle_movi(struct kvm *kvm, u64 *its_cmd)
+{
+	struct vgic_its *its = &kvm->arch.vgic.its;
+	u32 device_id = its_cmd_get_deviceid(its_cmd);
+	u32 event_id = its_cmd_get_id(its_cmd);
+	u32 coll_id = its_cmd_get_collection(its_cmd);
+	struct its_itte *itte;
+	struct its_collection *collection;
+	int ret = 0;
+
+	spin_lock(&its->lock);
+	itte = find_itte(kvm, device_id, event_id);
+	if (!itte) {
+		ret = E_ITS_MOVI_UNMAPPED_INTERRUPT;
+		goto out_unlock;
+	}
+	if (!its_is_collection_mapped(itte->collection)) {
+		ret = E_ITS_MOVI_UNMAPPED_COLLECTION;
+		goto out_unlock;
+	}
+
+	collection = find_collection(kvm, coll_id);
+	if (!its_is_collection_mapped(collection)) {
+		ret = E_ITS_MOVI_UNMAPPED_COLLECTION;
+		goto out_unlock;
+	}
+
+	if (test_and_clear_bit(itte->collection->target_addr, itte->pending))
+		__set_bit(collection->target_addr, itte->pending);
+
+	itte->collection = collection;
+out_unlock:
+	spin_unlock(&its->lock);
+	return ret;
+}
+
+static void vits_init_collection(struct kvm *kvm,
+				 struct its_collection *collection,
+				 u32 coll_id)
+{
+	collection->collection_id = coll_id;
+	collection->target_addr = COLLECTION_NOT_MAPPED;
+
+	list_add_tail(&collection->coll_list,
+		&kvm->arch.vgic.its.collection_list);
+}
+
+/* The MAPTI and MAPI commands map LPIs to ITTEs. */
+static int vits_cmd_handle_mapi(struct kvm *kvm, u64 *its_cmd, u8 cmd)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	u32 device_id = its_cmd_get_deviceid(its_cmd);
+	u32 event_id = its_cmd_get_id(its_cmd);
+	u32 coll_id = its_cmd_get_collection(its_cmd);
+	struct its_itte *itte, *new_itte;
+	struct its_device *device;
+	struct its_collection *collection, *new_coll;
+	int lpi_nr;
+	int ret = 0;
+
+	/* Preallocate possibly needed memory here outside of the lock */
+	new_coll = kmalloc(sizeof(struct its_collection), GFP_KERNEL);
+	new_itte = kzalloc(sizeof(struct its_itte), GFP_KERNEL);
+	if (new_itte)
+		new_itte->pending = kcalloc(BITS_TO_LONGS(dist->nr_cpus),
+					    sizeof(long), GFP_KERNEL);
+
+	spin_lock(&dist->its.lock);
+
+	device = find_its_device(kvm, device_id);
+	if (!device) {
+		ret = E_ITS_MAPTI_UNMAPPED_DEVICE;
+		goto out_unlock;
+	}
+
+	collection = find_collection(kvm, coll_id);
+	if (!collection && !new_coll) {
+		ret = -ENOMEM;
+		goto out_unlock;
+	}
+
+	if (cmd == GITS_CMD_MAPTI)
+		lpi_nr = its_cmd_get_physical_id(its_cmd);
+	else
+		lpi_nr = event_id;
+	if (lpi_nr < GIC_LPI_OFFSET ||
+	    lpi_nr >= nr_idbits_propbase(dist->propbaser)) {
+		ret = E_ITS_MAPTI_PHYSICALID_OOR;
+		goto out_unlock;
+	}
+
+	itte = find_itte(kvm, device_id, event_id);
+	if (!itte) {
+		if (!new_itte || !new_itte->pending) {
+			ret = -ENOMEM;
+			goto out_unlock;
+		}
+		itte = new_itte;
+
+		itte->event_id	= event_id;
+		list_add_tail(&itte->itte_list, &device->itt);
+	} else {
+		if (new_itte)
+			kfree(new_itte->pending);
+		kfree(new_itte);
+	}
+
+	if (!collection) {
+		collection = new_coll;
+		vits_init_collection(kvm, collection, coll_id);
+	} else {
+		kfree(new_coll);
+	}
+
+	itte->collection = collection;
+	itte->lpi = lpi_nr;
+
+out_unlock:
+	spin_unlock(&dist->its.lock);
+	if (ret) {
+		kfree(new_coll);
+		if (new_itte)
+			kfree(new_itte->pending);
+		kfree(new_itte);
+	}
+	return ret;
+}
+
+static void vits_unmap_device(struct kvm *kvm, struct its_device *device)
+{
+	struct its_itte *itte, *temp;
+
+	/*
+	 * The spec says that unmapping a device with still valid
+	 * ITTEs associated is UNPREDICTABLE. We remove all ITTEs,
+	 * since we cannot leave the memory unreferenced.
+	 */
+	list_for_each_entry_safe(itte, temp, &device->itt, itte_list)
+		its_free_itte(itte);
+
+	list_del(&device->dev_list);
+	kfree(device);
+}
+
+/* MAPD maps or unmaps a device ID to Interrupt Translation Tables (ITTs). */
+static int vits_cmd_handle_mapd(struct kvm *kvm, u64 *its_cmd)
+{
+	struct vgic_its *its = &kvm->arch.vgic.its;
+	bool valid = its_cmd_get_validbit(its_cmd);
+	u32 device_id = its_cmd_get_deviceid(its_cmd);
+	struct its_device *device, *new_device = NULL;
+
+	/* We preallocate memory outside of the lock here */
+	if (valid) {
+		new_device = kzalloc(sizeof(struct its_device), GFP_KERNEL);
+		if (!new_device)
+			return -ENOMEM;
+	}
+
+	spin_lock(&its->lock);
+
+	device = find_its_device(kvm, device_id);
+	if (device)
+		vits_unmap_device(kvm, device);
+
+	/*
+	 * The spec does not say whether unmapping a not-mapped device
+	 * is an error, so we are done in any case.
+	 */
+	if (!valid)
+		goto out_unlock;
+
+	device = new_device;
+
+	device->device_id = device_id;
+	INIT_LIST_HEAD(&device->itt);
+
+	list_add_tail(&device->dev_list,
+		      &kvm->arch.vgic.its.device_list);
+
+out_unlock:
+	spin_unlock(&its->lock);
+	return 0;
+}
+
+/* The MAPC command maps collection IDs to redistributors. */
+static int vits_cmd_handle_mapc(struct kvm *kvm, u64 *its_cmd)
+{
+	struct vgic_its *its = &kvm->arch.vgic.its;
+	u16 coll_id;
+	u32 target_addr;
+	struct its_collection *collection, *new_coll = NULL;
+	bool valid;
+
+	valid = its_cmd_get_validbit(its_cmd);
+	coll_id = its_cmd_get_collection(its_cmd);
+	target_addr = its_cmd_get_target_addr(its_cmd);
+
+	if (target_addr >= atomic_read(&kvm->online_vcpus))
+		return E_ITS_MAPC_PROCNUM_OOR;
+
+	/* We preallocate memory outside of the lock here */
+	if (valid) {
+		new_coll = kmalloc(sizeof(struct its_collection), GFP_KERNEL);
+		if (!new_coll)
+			return -ENOMEM;
+	}
+
+	spin_lock(&its->lock);
+	collection = find_collection(kvm, coll_id);
+
+	if (!valid) {
+		struct its_device *device;
+		struct its_itte *itte;
+		/*
+		 * Clearing the mapping for that collection ID removes the
+		 * entry from the list. If there wasn't any before, we can
+		 * go home early.
+		 */
+		if (!collection)
+			goto out_unlock;
+
+		for_each_lpi(device, itte, kvm)
+			if (itte->collection &&
+			    itte->collection->collection_id == coll_id)
+				itte->collection = NULL;
+
+		list_del(&collection->coll_list);
+		kfree(collection);
+	} else {
+		if (!collection)
+			collection = new_coll;
+		else
+			kfree(new_coll);
+
+		vits_init_collection(kvm, collection, coll_id);
+		collection->target_addr = target_addr;
+	}
+
+out_unlock:
+	spin_unlock(&its->lock);
+	return 0;
+}
+
+/* The CLEAR command removes the pending state for a particular LPI. */
+static int vits_cmd_handle_clear(struct kvm *kvm, u64 *its_cmd)
+{
+	struct vgic_its *its = &kvm->arch.vgic.its;
+	u32 device_id;
+	u32 event_id;
+	struct its_itte *itte;
+	int ret = 0;
+
+	device_id = its_cmd_get_deviceid(its_cmd);
+	event_id = its_cmd_get_id(its_cmd);
+
+	spin_lock(&its->lock);
+
+	itte = find_itte(kvm, device_id, event_id);
+	if (!itte) {
+		ret = E_ITS_CLEAR_UNMAPPED_INTERRUPT;
+		goto out_unlock;
+	}
+
+	if (its_is_collection_mapped(itte->collection))
+		__clear_bit(itte->collection->target_addr, itte->pending);
+
+out_unlock:
+	spin_unlock(&its->lock);
+	return ret;
+}
+
+/* The INV command syncs the configuration bits from the memory tables. */
+static int vits_cmd_handle_inv(struct kvm *kvm, u64 *its_cmd)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	u32 device_id;
+	u32 event_id;
+	struct its_itte *itte, *new_itte;
+	gpa_t propbase;
+	int ret;
+	u8 prop;
+
+	device_id = its_cmd_get_deviceid(its_cmd);
+	event_id = its_cmd_get_id(its_cmd);
+
+	spin_lock(&dist->its.lock);
+	itte = find_itte(kvm, device_id, event_id);
+	spin_unlock(&dist->its.lock);
+	if (!itte)
+		return E_ITS_INV_UNMAPPED_INTERRUPT;
+
+	/*
+	 * We cannot read from guest memory inside the spinlock, so we
+	 * need to re-read our tables to learn whether the LPI number we are
+	 * using is still valid.
+	 */
+	do {
+		propbase = BASER_BASE_ADDRESS(dist->propbaser);
+		ret = kvm_read_guest(kvm, propbase + itte->lpi - GIC_LPI_OFFSET,
+				     &prop, 1);
+		if (ret)
+			return ret;
+
+		spin_lock(&dist->its.lock);
+		new_itte = find_itte(kvm, device_id, event_id);
+		if (!new_itte || new_itte->lpi != itte->lpi) {
+			itte = new_itte;
+			spin_unlock(&dist->its.lock);
+			continue;
+		}
+		update_lpi_config(kvm, itte, prop);
+		spin_unlock(&dist->its.lock);
+	} while (0);
+	return 0;
+}
+
+/* The INVALL command requests flushing of all IRQ data in this collection. */
+static int vits_cmd_handle_invall(struct kvm *kvm, u64 *its_cmd)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	u64 prop_base_reg, pend_base_reg;
+	u32 coll_id = its_cmd_get_collection(its_cmd);
+	struct its_collection *collection;
+	struct kvm_vcpu *vcpu;
+
+	collection = find_collection(kvm, coll_id);
+	if (!its_is_collection_mapped(collection))
+		return E_ITS_INVALL_UNMAPPED_COLLECTION;
+
+	vcpu = kvm_get_vcpu(kvm, collection->target_addr);
+
+	spin_lock(&dist->lock);
+	pend_base_reg = dist->pendbaser[vcpu->vcpu_id];
+	prop_base_reg = dist->propbaser;
+	spin_unlock(&dist->lock);
+
+	its_update_lpis_configuration(kvm, prop_base_reg);
+	its_sync_lpi_pending_table(vcpu, pend_base_reg);
+
+	return 0;
+}
+
+/* The MOVALL command moves all IRQs from one redistributor to another. */
+static int vits_cmd_handle_movall(struct kvm *kvm, u64 *its_cmd)
+{
+	struct vgic_its *its = &kvm->arch.vgic.its;
+	u32 target1_addr = its_cmd_get_target_addr(its_cmd);
+	u32 target2_addr = its_cmd_mask_field(its_cmd, 3, 16, 32);
+	struct its_collection *collection;
+	struct its_device *device;
+	struct its_itte *itte;
+
+	if (target1_addr >= atomic_read(&kvm->online_vcpus) ||
+	    target2_addr >= atomic_read(&kvm->online_vcpus))
+		return E_ITS_MOVALL_PROCNUM_OOR;
+
+	if (target1_addr == target2_addr)
+		return 0;
+
+	spin_lock(&its->lock);
+	for_each_lpi(device, itte, kvm) {
+		/* remap all collections mapped to target address 1 */
+		collection = itte->collection;
+		if (collection && collection->target_addr == target1_addr)
+			collection->target_addr = target2_addr;
+
+		/* move pending state if LPI is affected */
+		if (test_and_clear_bit(target1_addr, itte->pending))
+			__set_bit(target2_addr, itte->pending);
+	}
+
+	spin_unlock(&its->lock);
+	return 0;
+}
+
 /*
  * This function is called with both the ITS and the distributor lock dropped,
  * so the actual command handlers must take the respective locks when needed.
  */
 static int vits_handle_command(struct kvm_vcpu *vcpu, u64 *its_cmd)
 {
-	return -ENODEV;
+	u8 cmd = its_cmd_get_command(its_cmd);
+	int ret = -ENODEV;
+
+	switch (cmd) {
+	case GITS_CMD_MAPD:
+		ret = vits_cmd_handle_mapd(vcpu->kvm, its_cmd);
+		break;
+	case GITS_CMD_MAPC:
+		ret = vits_cmd_handle_mapc(vcpu->kvm, its_cmd);
+		break;
+	case GITS_CMD_MAPI:
+		ret = vits_cmd_handle_mapi(vcpu->kvm, its_cmd, cmd);
+		break;
+	case GITS_CMD_MAPTI:
+		ret = vits_cmd_handle_mapi(vcpu->kvm, its_cmd, cmd);
+		break;
+	case GITS_CMD_MOVI:
+		ret = vits_cmd_handle_movi(vcpu->kvm, its_cmd);
+		break;
+	case GITS_CMD_DISCARD:
+		ret = vits_cmd_handle_discard(vcpu->kvm, its_cmd);
+		break;
+	case GITS_CMD_CLEAR:
+		ret = vits_cmd_handle_clear(vcpu->kvm, its_cmd);
+		break;
+	case GITS_CMD_MOVALL:
+		ret = vits_cmd_handle_movall(vcpu->kvm, its_cmd);
+		break;
+	case GITS_CMD_INV:
+		ret = vits_cmd_handle_inv(vcpu->kvm, its_cmd);
+		break;
+	case GITS_CMD_INVALL:
+		ret = vits_cmd_handle_invall(vcpu->kvm, its_cmd);
+		break;
+	case GITS_CMD_SYNC:
+		/* we ignore this command: we are in sync all of the time */
+		ret = 0;
+		break;
+	}
+
+	return ret;
 }
 
 static bool handle_mmio_gits_cbaser(struct kvm_vcpu *vcpu,
diff --git a/virt/kvm/arm/its-emul.h b/virt/kvm/arm/its-emul.h
index cbc3877..830524a 100644
--- a/virt/kvm/arm/its-emul.h
+++ b/virt/kvm/arm/its-emul.h
@@ -39,4 +39,15 @@ void vits_destroy(struct kvm *kvm);
 bool vits_queue_lpis(struct kvm_vcpu *vcpu);
 void vits_unqueue_lpi(struct kvm_vcpu *vcpu, int irq);
 
+#define E_ITS_MOVI_UNMAPPED_INTERRUPT		0x010107
+#define E_ITS_MOVI_UNMAPPED_COLLECTION		0x010109
+#define E_ITS_CLEAR_UNMAPPED_INTERRUPT		0x010507
+#define E_ITS_MAPC_PROCNUM_OOR			0x010902
+#define E_ITS_MAPTI_UNMAPPED_DEVICE		0x010a04
+#define E_ITS_MAPTI_PHYSICALID_OOR		0x010a06
+#define E_ITS_INV_UNMAPPED_INTERRUPT		0x010c07
+#define E_ITS_INVALL_UNMAPPED_COLLECTION	0x010d09
+#define E_ITS_MOVALL_PROCNUM_OOR		0x010e01
+#define E_ITS_DISCARD_UNMAPPED_INTERRUPT	0x010f07
+
 #endif
-- 
2.5.1

^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH v3 15/16] KVM: arm64: implement MSI injection in ITS emulation
  2015-10-07 14:55 ` Andre Przywara
@ 2015-10-07 14:55   ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-07 14:55 UTC (permalink / raw)
  To: marc.zyngier, christoffer.dall
  Cc: eric.auger, p.fedin, kvmarm, linux-arm-kernel, kvm

When userland wants to inject an MSI into the guest, we have to use
our data structures to find the LPI number and the VCPU to receive
the interrupt.
Use the wrapper functions to iterate the linked lists and find the
proper Interrupt Translation Table Entry. Then set the pending bit
in this ITTE to be later picked up by the LR handling code. Kick
the VCPU which is meant to handle this interrupt.
We provide a VGIC emulation model specific routine for the actual
MSI injection. The wrapper functions return an error for models not
(yet) implementing MSIs (like the GICv2 emulation).
We also provide the handler for the ITS "INT" command, which allows a
guest to trigger an MSI via the ITS command queue.
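
The translation described above can be sketched in plain C as follows.
This is a simplified model under stated assumptions, not the kernel code:
a flat array replaces the linked lists, a 64-bit word replaces the
per-ITTE pending bitmap (so at most 64 VCPUs), there is no locking, and
NOT_MAPPED, struct itte and inject_msi() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NOT_MAPPED UINT32_MAX	/* hypothetical "collection unmapped" marker */

struct coll {
	uint32_t target_cpu;
};

struct itte {
	uint32_t devid, evid, lpi;
	bool enabled;
	uint64_t pending;	/* one bit per VCPU in this sketch */
	struct coll *coll;
};

/* Returns the VCPU to kick, or -1 if nothing has to be delivered now. */
static int inject_msi(struct itte *tbl, size_t n, uint32_t devid, uint32_t evid)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct itte *e = &tbl[i];

		if (e->devid != devid || e->evid != evid)
			continue;
		/* An unmapped interrupt or collection is silently dropped. */
		if (!e->coll || e->coll->target_cpu == NOT_MAPPED)
			return -1;
		assert(e->coll->target_cpu < 64);
		/* Remember the pending state even if the LPI is disabled. */
		e->pending |= UINT64_C(1) << e->coll->target_cpu;
		return e->enabled ? (int)e->coll->target_cpu : -1;
	}
	return -1;
}
```

Note that a disabled LPI still becomes pending in this model; only the
kick of the target VCPU is skipped, which matches the "inject" flag in
the patch.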

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
Changelog v2..v3:
- proper checking for unmapped collections

 include/kvm/arm_vgic.h      |  1 +
 virt/kvm/arm/its-emul.c     | 65 +++++++++++++++++++++++++++++++++++++++++++++
 virt/kvm/arm/its-emul.h     |  2 ++
 virt/kvm/arm/vgic-v3-emul.c |  1 +
 4 files changed, 69 insertions(+)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 4ea023c..7911059 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -149,6 +149,7 @@ struct vgic_vm_ops {
 	int	(*map_resources)(struct kvm *, const struct vgic_params *);
 	bool	(*queue_lpis)(struct kvm_vcpu *);
 	void	(*unqueue_lpi)(struct kvm_vcpu *, int irq);
+	int	(*inject_msi)(struct kvm *, struct kvm_msi *);
 };
 
 struct vgic_io_device {
diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
index 642effb..cd8526a 100644
--- a/virt/kvm/arm/its-emul.c
+++ b/virt/kvm/arm/its-emul.c
@@ -333,6 +333,55 @@ static bool handle_mmio_gits_idregs(struct kvm_vcpu *vcpu,
 }
 
 /*
+ * Translates an incoming MSI request into the redistributor (=VCPU) and
+ * the associated LPI number. Sets the LPI pending bit and also marks the
+ * VCPU as having a pending interrupt.
+ */
+int vits_inject_msi(struct kvm *kvm, struct kvm_msi *msi)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	struct vgic_its *its = &dist->its;
+	struct its_itte *itte;
+	int cpuid;
+	bool inject = false;
+	int ret = 0;
+
+	if (!vgic_has_its(kvm))
+		return -ENODEV;
+
+	if (!(msi->flags & KVM_MSI_VALID_DEVID))
+		return -EINVAL;
+
+	spin_lock(&its->lock);
+
+	if (!its->enabled || !dist->lpis_enabled) {
+		ret = -EAGAIN;
+		goto out_unlock;
+	}
+
+	itte = find_itte(kvm, msi->devid, msi->data);
+	/* Triggering an unmapped IRQ gets silently dropped. */
+	if (!itte || !its_is_collection_mapped(itte->collection))
+		goto out_unlock;
+
+	cpuid = itte->collection->target_addr;
+	__set_bit(cpuid, itte->pending);
+	inject = itte->enabled;
+
+out_unlock:
+	spin_unlock(&its->lock);
+
+	if (inject) {
+		spin_lock(&dist->lock);
+		__set_bit(cpuid, dist->irq_pending_on_cpu);
+		spin_unlock(&dist->lock);
+		kvm_vcpu_kick(kvm_get_vcpu(kvm, cpuid));
+	}
+
+	return ret;
+}
+
+/*
  * Find all enabled and pending LPIs and queue them into the list
  * registers.
  * The dist lock is held by the caller.
@@ -812,6 +861,19 @@ static int vits_cmd_handle_movall(struct kvm *kvm, u64 *its_cmd)
 	return 0;
 }
 
+/* The INT command injects the LPI associated with that DevID/EvID pair. */
+static int vits_cmd_handle_int(struct kvm *kvm, u64 *its_cmd)
+{
+	struct kvm_msi msi = {
+		.data = its_cmd_get_id(its_cmd),
+		.devid = its_cmd_get_deviceid(its_cmd),
+		.flags = KVM_MSI_VALID_DEVID,
+	};
+
+	vits_inject_msi(kvm, &msi);
+	return 0;
+}
+
 /*
  * This function is called with both the ITS and the distributor lock dropped,
  * so the actual command handlers must take the respective locks when needed.
@@ -846,6 +908,9 @@ static int vits_handle_command(struct kvm_vcpu *vcpu, u64 *its_cmd)
 	case GITS_CMD_MOVALL:
 		ret = vits_cmd_handle_movall(vcpu->kvm, its_cmd);
 		break;
+	case GITS_CMD_INT:
+		ret = vits_cmd_handle_int(vcpu->kvm, its_cmd);
+		break;
 	case GITS_CMD_INV:
 		ret = vits_cmd_handle_inv(vcpu->kvm, its_cmd);
 		break;
diff --git a/virt/kvm/arm/its-emul.h b/virt/kvm/arm/its-emul.h
index 830524a..95e56a7 100644
--- a/virt/kvm/arm/its-emul.h
+++ b/virt/kvm/arm/its-emul.h
@@ -36,6 +36,8 @@ void vgic_enable_lpis(struct kvm_vcpu *vcpu);
 int vits_init(struct kvm *kvm);
 void vits_destroy(struct kvm *kvm);
 
+int vits_inject_msi(struct kvm *kvm, struct kvm_msi *msi);
+
 bool vits_queue_lpis(struct kvm_vcpu *vcpu);
 void vits_unqueue_lpi(struct kvm_vcpu *vcpu, int irq);
 
diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
index f482e34..90f3628 100644
--- a/virt/kvm/arm/vgic-v3-emul.c
+++ b/virt/kvm/arm/vgic-v3-emul.c
@@ -944,6 +944,7 @@ void vgic_v3_init_emulation(struct kvm *kvm)
 	dist->vm_ops.init_model = vgic_v3_init_model;
 	dist->vm_ops.destroy_model = vgic_v3_destroy_model;
 	dist->vm_ops.map_resources = vgic_v3_map_resources;
+	dist->vm_ops.inject_msi = vits_inject_msi;
 	dist->vm_ops.queue_lpis = vits_queue_lpis;
 	dist->vm_ops.unqueue_lpi = vits_unqueue_lpi;
 
-- 
2.5.1


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [PATCH v3 16/16] KVM: arm64: enable ITS emulation as a virtual MSI controller
  2015-10-07 14:55 ` Andre Przywara
@ 2015-10-07 14:55   ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-07 14:55 UTC (permalink / raw)
  To: marc.zyngier, christoffer.dall
  Cc: eric.auger, p.fedin, kvmarm, linux-arm-kernel, kvm

If userspace has provided a base address for the ITS register frame,
we enable the bits that advertise LPIs in the GICv3.
When the guest has enabled LPIs and the ITS, we enable the emulation
part by initializing the ITS data structures and trapping on ITS
register frame accesses by the guest.
Also we enable the KVM_SIGNAL_MSI feature to allow userland to inject
MSIs into the guest. Without the ITS emulation enabled, trying to
inject an MSI fails with -ENODEV.
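
From the userland side, an injection might look roughly like the sketch
below. It relies on KVM_SIGNAL_MSI, struct kvm_msi and
KVM_MSI_VALID_DEVID as found in <linux/kvm.h> of later mainline kernels;
signal_its_msi() and its parameters are hypothetical names, and filling
the address fields with the guest-physical address of the ITS doorbell
is an assumption that this series does not appear to check.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Ask KVM to inject an MSI for (devid, eventid) into the VM behind vm_fd.
 * Returns the ioctl result (>= 0) or a negative errno value, for example
 * -ENODEV when the VM has no ITS emulation.
 */
static int signal_its_msi(int vm_fd, uint64_t doorbell, uint32_t devid,
			  uint32_t eventid)
{
	struct kvm_msi msi;
	int ret;

	memset(&msi, 0, sizeof(msi));
	msi.address_lo = (uint32_t)doorbell;
	msi.address_hi = (uint32_t)(doorbell >> 32);
	msi.data = eventid;		/* the event ID is the MSI payload */
	msi.devid = devid;
	msi.flags = KVM_MSI_VALID_DEVID;

	ret = ioctl(vm_fd, KVM_SIGNAL_MSI, &msi);
	return ret < 0 ? -errno : ret;
}
```

A VMM would presumably first query KVM_CAP_MSI_DEVID on the VM file
descriptor to learn whether the device ID has to be supplied.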

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
Changelog v2..v3:
- replace kmalloc with kcalloc
- adjust number of supported LPIs in comment
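
For reference, the LPI count quoted in the adjusted comment and the
GICD_TYPER value built in handle_mmio_typer() can be reproduced with a
small standalone sketch. The constants are assumptions made for the
illustration: 16 ID bits with an ITS (taken from the comment in the
diff), LPIS as bit 17 and IDbits starting at bit 19; nr_lpis() and
build_typer() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TYPER_LPIS		(UINT32_C(1) << 17)	/* GICD_TYPER.LPIS */
#define TYPER_IDBITS_SHIFT	19
#define ID_BITS_SPIS		10	/* 2**10 = 1024 interrupt IDs */
#define ID_BITS_ITS		16	/* assumed value with an ITS */
#define LPI_OFFSET		8192	/* first LPI interrupt ID */

/* Number of LPIs available with the given number of interrupt ID bits. */
static uint32_t nr_lpis(unsigned int id_bits)
{
	assert(id_bits >= 14 && id_bits < 32);
	return (UINT32_C(1) << id_bits) - LPI_OFFSET;
}

/* Compose the GICD_TYPER value along the lines of the patch. */
static uint32_t build_typer(unsigned int nr_irqs, bool has_its)
{
	unsigned int lines = nr_irqs < 1024 ? nr_irqs : 1024;
	uint32_t reg;

	assert(nr_irqs >= 32);
	reg = (lines >> 5) - 1;		/* ITLinesNumber */

	if (has_its) {
		reg |= TYPER_LPIS;
		reg |= UINT32_C(ID_BITS_ITS - 1) << TYPER_IDBITS_SHIFT;
	} else {
		reg |= UINT32_C(ID_BITS_SPIS - 1) << TYPER_IDBITS_SHIFT;
	}
	return reg;
}
```

With 16 ID bits this gives 65536 - 8192 = 57344 LPIs, the number
mentioned in the updated comment.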

 Documentation/virtual/kvm/api.txt |  2 +-
 arch/arm64/kvm/Kconfig            |  1 +
 arch/arm64/kvm/reset.c            |  6 ++++++
 include/kvm/arm_vgic.h            |  6 ++++++
 virt/kvm/arm/its-emul.c           | 10 +++++++++-
 virt/kvm/arm/vgic-v3-emul.c       | 20 ++++++++++++++------
 virt/kvm/arm/vgic.c               |  8 ++++++++
 7 files changed, 45 insertions(+), 8 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index a302e0a..047e4e7 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2134,7 +2134,7 @@ after pausing the vcpu, but before it is resumed.
 4.71 KVM_SIGNAL_MSI
 
 Capability: KVM_CAP_SIGNAL_MSI
-Architectures: x86
+Architectures: x86 arm64
 Type: vm ioctl
 Parameters: struct kvm_msi (in)
 Returns: >0 on delivery, 0 if guest blocked the MSI, and -1 on error
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 5c7e920..e8d77f4 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -31,6 +31,7 @@ config KVM
 	select KVM_VFIO
 	select HAVE_KVM_EVENTFD
 	select HAVE_KVM_IRQFD
+	select HAVE_KVM_MSI
 	---help---
 	  Support hosting virtualized guest machines.
 
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 4d7f78b4..a490f67 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -80,6 +80,12 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_SET_GUEST_DEBUG:
 		r = 1;
 		break;
+	case KVM_CAP_MSI_DEVID:
+		if (!kvm)
+			r = -EINVAL;
+		else
+			r = kvm->arch.vgic.msis_require_devid;
+		break;
 	default:
 		r = 0;
 	}
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 7911059..35657f9 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -174,6 +174,7 @@ struct irq_phys_map_entry {
 
 struct vgic_its {
 	bool			enabled;
+	struct vgic_io_device	iodev;
 	spinlock_t		lock;
 	u64			cbaser;
 	int			creadr;
@@ -192,6 +193,9 @@ struct vgic_dist {
 	/* vGIC model the kernel emulates for the guest (GICv2 or GICv3) */
 	u32			vgic_model;
 
+	/* Do injected MSIs require an additional device ID? */
+	bool			msis_require_devid;
+
 	int			nr_cpus;
 	int			nr_irqs;
 
@@ -397,4 +401,6 @@ static inline int vgic_v3_probe(struct device_node *vgic_node,
 }
 #endif
 
+int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi);
+
 #endif
diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
index cd8526a..b40a7fc 100644
--- a/virt/kvm/arm/its-emul.c
+++ b/virt/kvm/arm/its-emul.c
@@ -1117,6 +1117,7 @@ int vits_init(struct kvm *kvm)
 {
 	struct vgic_dist *dist = &kvm->arch.vgic;
 	struct vgic_its *its = &dist->its;
+	int ret;
 
 	dist->pendbaser = kcalloc(dist->nr_cpus, sizeof(u64), GFP_KERNEL);
 	if (!dist->pendbaser)
@@ -1131,9 +1132,16 @@ int vits_init(struct kvm *kvm)
 	INIT_LIST_HEAD(&its->device_list);
 	INIT_LIST_HEAD(&its->collection_list);
 
+	ret = vgic_register_kvm_io_dev(kvm, dist->vgic_its_base,
+				       KVM_VGIC_V3_ITS_SIZE, vgicv3_its_ranges,
+				       -1, &its->iodev);
+	if (ret)
+		return ret;
+
 	its->enabled = false;
+	dist->msis_require_devid = true;
 
-	return -ENXIO;
+	return 0;
 }
 
 void vits_destroy(struct kvm *kvm)
diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
index 90f3628..311b3ea 100644
--- a/virt/kvm/arm/vgic-v3-emul.c
+++ b/virt/kvm/arm/vgic-v3-emul.c
@@ -8,7 +8,6 @@
  *
  * Limitations of the emulation:
  * (RAZ/WI: read as zero, write ignore, RAO/WI: read as one, write ignore)
- * - We do not support LPIs (yet). TYPER.LPIS is reported as 0 and is RAZ/WI.
  * - We do not support the message based interrupts (MBIs) triggered by
  *   writes to the GICD_{SET,CLR}SPI_* registers. TYPER.MBIS is reported as 0.
  * - We do not support the (optional) backwards compatibility feature.
@@ -87,10 +86,10 @@ static bool handle_mmio_ctlr(struct kvm_vcpu *vcpu,
 /*
  * As this implementation does not provide compatibility
  * with GICv2 (ARE==1), we report zero CPUs in bits [5..7].
- * Also LPIs and MBIs are not supported, so we set the respective bits to 0.
- * Also we report at most 2**10=1024 interrupt IDs (to match 1024 SPIs).
+ * Also we report at most 2**10=1024 interrupt IDs (to match 1024 SPIs)
+ * and provide 16 bits worth of LPI number space (to give 57344 LPIs).
  */
-#define INTERRUPT_ID_BITS 10
+#define INTERRUPT_ID_BITS_SPIS 10
 static bool handle_mmio_typer(struct kvm_vcpu *vcpu,
 			      struct kvm_exit_mmio *mmio, phys_addr_t offset)
 {
@@ -98,7 +97,12 @@ static bool handle_mmio_typer(struct kvm_vcpu *vcpu,
 
 	reg = (min(vcpu->kvm->arch.vgic.nr_irqs, 1024) >> 5) - 1;
 
-	reg |= (INTERRUPT_ID_BITS - 1) << 19;
+	if (vgic_has_its(vcpu->kvm)) {
+		reg |= GICD_TYPER_LPIS;
+		reg |= (INTERRUPT_ID_BITS_ITS - 1) << 19;
+	} else {
+		reg |= (INTERRUPT_ID_BITS_SPIS - 1) << 19;
+	}
 
 	vgic_reg_access(mmio, &reg, offset,
 			ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
@@ -539,7 +543,9 @@ static bool handle_mmio_ctlr_redist(struct kvm_vcpu *vcpu,
 			ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
 	if (vgic_has_its(vcpu->kvm) && !dist->lpis_enabled &&
 	    (reg & GICR_CTLR_ENABLE_LPIS)) {
-		/* Eventually do something */
+		vgic_enable_lpis(vcpu);
+		dist->lpis_enabled = true;
+		return true;
 	}
 	return false;
 }
@@ -566,6 +572,8 @@ static bool handle_mmio_typer_redist(struct kvm_vcpu *vcpu,
 	reg = redist_vcpu->vcpu_id << 8;
 	if (target_vcpu_id == atomic_read(&vcpu->kvm->online_vcpus) - 1)
 		reg |= GICR_TYPER_LAST;
+	if (vgic_has_its(vcpu->kvm))
+		reg |= GICR_TYPER_PLPIS;
 	vgic_reg_access(mmio, &reg, offset,
 			ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
 	return false;
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 9ee87d3..372cb20 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -2571,3 +2571,11 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
 {
 	return 0;
 }
+
+int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi)
+{
+	if (kvm->arch.vgic.vm_ops.inject_msi)
+		return kvm->arch.vgic.vm_ops.inject_msi(kvm, msi);
+	else
+		return -ENODEV;
+}
-- 
2.5.1
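
As a worked example of the TYPER encoding in the handle_mmio_typer() hunk
above, here is a minimal stand-alone sketch. It is not code from the series.
It assumes INTERRUPT_ID_BITS_ITS is 16 (taken from the "16 bits worth of LPI
number space" comment) and that GICD_TYPER_LPIS is bit 17; typer_value() and
nr_lpis() are hypothetical helper names. LPI numbers start at 8192, which is
where the 57344 in the comment comes from: 2**16 - 8192 = 57344.

```c
#include <assert.h>
#include <stdint.h>

#define INTERRUPT_ID_BITS_SPIS	10
#define INTERRUPT_ID_BITS_ITS	16		/* assumed from the patch comment */
#define GICD_TYPER_LPIS		(1U << 17)	/* assumed bit position */
#define LPI_BASE		8192U		/* first interrupt ID used for LPIs */

/* Mirrors the register composition in the hunk above (hypothetical helper). */
static uint32_t typer_value(unsigned int nr_irqs, int has_its)
{
	unsigned int lines = nr_irqs < 1024 ? nr_irqs : 1024;
	uint32_t reg = (lines >> 5) - 1;	/* number of 32-interrupt blocks, minus one */

	if (has_its) {
		reg |= GICD_TYPER_LPIS;
		reg |= (uint32_t)(INTERRUPT_ID_BITS_ITS - 1) << 19;
	} else {
		reg |= (uint32_t)(INTERRUPT_ID_BITS_SPIS - 1) << 19;
	}
	return reg;
}

/* Number of LPIs that fit into an ID space of id_bits bits. */
static unsigned int nr_lpis(unsigned int id_bits)
{
	assert(id_bits >= 14 && id_bits < 32);	/* the ID space must extend past LPI_BASE */
	return (1U << id_bits) - LPI_BASE;
}
```

For a guest with 256 interrupts, typer_value(256, 0) is 0x00480007 and
typer_value(256, 1) is 0x007a0007; nr_lpis(16) is 57344.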


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* RE: [PATCH v3 12/16] KVM: arm64: handle pending bit for LPIs in ITS emulation
  2015-10-07 14:55   ` Andre Przywara
@ 2015-10-07 15:10     ` Pavel Fedin
  -1 siblings, 0 replies; 101+ messages in thread
From: Pavel Fedin @ 2015-10-07 15:10 UTC (permalink / raw)
  To: 'Andre Przywara', marc.zyngier, christoffer.dall
  Cc: eric.auger, kvmarm, linux-arm-kernel, kvm

 Hello!

> As the actual LPI number in a guest can be quite high, but is mostly
> assigned using a very sparse allocation scheme, bitmaps and arrays
> for storing the virtual interrupt status are a waste of memory.
> We use our equivalent of the "Interrupt Translation Table Entry"
> (ITTE) to hold this extra status information for a virtual LPI.

 You know, it's not that I'm strongly against the current approach or want you to redo everything
once again, but... Is it architecturally correct to intertwine LPIs and the ITS so much? As far as I
understand the arch manual, it is possible to have LPIs without an ITS (triggered by something else?).
Shouldn't we do the same: first add LPI support to our redistributors, and then proceed with the
ITS?
 As to memory consumption, do we really need to store our own copy of the tables? After all, it's
just memory. What if we map a pointer directly into the guest's memory (the address it writes to
PROPBASER/PENDBASER), and just keep it? There would be no issues with caching and synchronization at
all.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v3 12/16] KVM: arm64: handle pending bit for LPIs in ITS emulation
  2015-10-07 15:10     ` Pavel Fedin
@ 2015-10-07 15:35       ` Marc Zyngier
  -1 siblings, 0 replies; 101+ messages in thread
From: Marc Zyngier @ 2015-10-07 15:35 UTC (permalink / raw)
  To: Pavel Fedin, 'Andre Przywara', christoffer.dall
  Cc: eric.auger, kvmarm, linux-arm-kernel, kvm

On 07/10/15 16:10, Pavel Fedin wrote:
>  Hello!
> 
>> As the actual LPI number in a guest can be quite high, but is mostly
>> assigned using a very sparse allocation scheme, bitmaps and arrays
>> for storing the virtual interrupt status are a waste of memory.
>> We use our equivalent of the "Interrupt Translation Table Entry"
>> (ITTE) to hold this extra status information for a virtual LPI.
> 
> You know, not that i'm strongly against current approach and want you
> to redo everything once again, but... Is it architecturally correct
> to intertwine LPIs and ITS so much? As far as i

Yes it is.

> understand arch manual, it is possible to have LPIs without ITS
> (triggered by something else?). Shouldn't we do the same, and just
> add LPI support to our redistributors, and then proceed with the 
> ITS?

No. We're implementing a monolithic GICv3 that doesn't offer writing to
the redistributors directly from a device. And frankly, that's good enough.

> As to memory consumption, do we really need to store own copy of
> tables? After all, it's just a memory. What if we map a pointer
> directly into guest's memory (which it writes to 
> PROPBASER/PENDBASER), and just keep it? There will be no issues with
> caching and synchronization at all.

Sure. And you then have to parse and validate all the tables each and
every time you're going to inject an interrupt (because the guest can
change the table content behind your back). You are quickly going to
notice that your performance is abysmal.

At that point, you're going to start being clever, and add a cache. And
guess what, that's what the HW does too. And then you'll make your cache
a convenient structure to be able to quickly inject interrupts. And
that's what the HW does too. And finally, you're going to realize that
populating a cache sucks, and you're going to keep all the state where
it is convenient, when it is convenient (and that's basically always).

The HW can't do that, but we can.
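
The trade-off described above can be sketched in a few lines of C. This is
not code from the series: struct shadow_itte, shadow_map(), shadow_find() and
shadow_inject() are hypothetical names, and the linked list merely stands in
for whatever structure the emulation keeps. The point is only that a mapping
is validated once, when the guest issues the mapping command, so injection
becomes a lookup in host-owned state that the guest cannot rewrite.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side record for one mapped (device, event) pair. */
struct shadow_itte {
	struct shadow_itte *next;
	uint32_t devid;
	uint32_t event;
	uint32_t lpi;
	bool enabled;
	bool pending;
};

static struct shadow_itte *shadow_find(struct shadow_itte *head,
				       uint32_t devid, uint32_t event)
{
	for (; head; head = head->next)
		if (head->devid == devid && head->event == event)
			return head;
	return NULL;
}

/* Called once per mapping command: all validation happens here. */
static bool shadow_map(struct shadow_itte **head, struct shadow_itte *e,
		       uint32_t devid, uint32_t event, uint32_t lpi)
{
	assert(head && e);
	if (lpi < 8192 || shadow_find(*head, devid, event))
		return false;	/* not an LPI number, or already mapped */

	*e = (struct shadow_itte){ .next = *head, .devid = devid,
				   .event = event, .lpi = lpi, .enabled = true };
	*head = e;
	return true;
}

/* Returns the LPI made pending, or -1 if the MSI has no usable mapping. */
static int64_t shadow_inject(struct shadow_itte *head,
			     uint32_t devid, uint32_t event)
{
	struct shadow_itte *e = shadow_find(head, devid, event);

	if (!e || !e->enabled)
		return -1;
	e->pending = true;
	return e->lpi;
}
```

Nothing in shadow_inject() touches guest memory, so there is nothing to
re-parse or re-validate on the injection path.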

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 101+ messages in thread

* RE: [PATCH v3 12/16] KVM: arm64: handle pending bit for LPIs in ITS emulation
  2015-10-07 15:35       ` Marc Zyngier
@ 2015-10-07 15:46         ` Pavel Fedin
  -1 siblings, 0 replies; 101+ messages in thread
From: Pavel Fedin @ 2015-10-07 15:46 UTC (permalink / raw)
  To: 'Marc Zyngier', 'Andre Przywara', christoffer.dall
  Cc: eric.auger, kvmarm, linux-arm-kernel, kvm

 Hello!

> Sure. And you then have to parse and validate all the tables each and
> every time you're going to inject an interrupt (because the guest can
> change the table content behind your back). You are quickly going to
> notice that your performance is abysmal.

 I don't see any real problems, at least with the LPI tables. If the guest changes something, it will
be immediately visible to us. I don't see any need to seriously validate anything, at least here.
The pending bit is just a pending bit, and the configuration is just a priority value plus an enable bit.
 But, well, thinking about it a bit more: in the case of pending bit modification, the operations on
both the guest and host side have to be atomic, otherwise we can clobber the table if, for example,
both host and guest modify adjacent bits. And there's no way to interlock with the guest. So, OK, I
accept your point.
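
The clobbering scenario can be made concrete with a deterministic sketch
(illustrative only, not code from the series; pending_byte and
replay_lost_update() are hypothetical names). Host and guest each do a plain
read-modify-write on the same byte of a shared pending table, and the bad
interleaving is written out by hand so that the lost update is reproducible.

```c
#include <assert.h>
#include <stdint.h>

/* One byte of a pending table shared between host and guest. */
static uint8_t pending_byte;

/*
 * Replays one bad interleaving of two non-atomic read-modify-write
 * sequences on pending_byte and returns the final value of the byte.
 */
static uint8_t replay_lost_update(unsigned int host_bit, unsigned int guest_bit)
{
	uint8_t host_tmp, guest_tmp;

	assert(host_bit < 8 && guest_bit < 8);
	pending_byte = 0;

	host_tmp = pending_byte;			/* host:  load   */
	guest_tmp = pending_byte;			/* guest: load   */
	guest_tmp |= (uint8_t)(1u << guest_bit);	/* guest: modify */
	pending_byte = guest_tmp;			/* guest: store  */
	host_tmp |= (uint8_t)(1u << host_bit);		/* host:  modify */
	pending_byte = host_tmp;			/* host:  store, drops the guest's bit */

	return pending_byte;
}
```

With host_bit = 2 and guest_bit = 3 the byte ends up as 0x04 instead of
0x0c: the guest's update to the adjacent bit is gone.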

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v3 12/16] KVM: arm64: handle pending bit for LPIs in ITS emulation
  2015-10-07 15:46         ` Pavel Fedin
@ 2015-10-07 15:49           ` Marc Zyngier
  -1 siblings, 0 replies; 101+ messages in thread
From: Marc Zyngier @ 2015-10-07 15:49 UTC (permalink / raw)
  To: Pavel Fedin, 'Andre Przywara', christoffer.dall
  Cc: eric.auger, kvmarm, linux-arm-kernel, kvm

On 07/10/15 16:46, Pavel Fedin wrote:
>  Hello!
> 
>> Sure. And you then have to parse and validate all the tables each and
>> every time you're going to inject an interrupt (because the guest can
>> change the table content behind your back). You are quickly going to
>> notice that your performance is abysmal.
> 
>  I don't see any real problems, at least with LPI tables. If the guest changes something, it will be
> immediately available to us. I don't see any need to seriously validate something, at least here.
> Pending bit is just pending bit, and configuration is just priority value plus enable bit.
>  But, well, if we think a bit better, in case of pending bit modification, the operations on both
> guest and host side have to be atomic, otherwise we can clobber our table if, for example, both host
> and guest modify adjacent bits. And there's no way to interlock with the guest. So, OK, i accept
> your point.

The pending table is the least of our concerns. Device table, ITTs,
collections. That's the real problem.

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 101+ messages in thread

* RE: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2015-10-07 14:55 ` Andre Przywara
@ 2015-10-07 16:05   ` Pavel Fedin
  -1 siblings, 0 replies; 101+ messages in thread
From: Pavel Fedin @ 2015-10-07 16:05 UTC (permalink / raw)
  To: 'Andre Przywara', marc.zyngier, christoffer.dall
  Cc: eric.auger, kvmarm, linux-arm-kernel, kvm

 Hello!

 One more concern about the whole thing. I already replied to the previous series, but it looks like
my reply was missed.
 Your implementation does not address live migration at all, and there's one fundamental issue
with it. In the redistributor LPIs can only be pending, but in the CPU interface they can still be
active. And they have priorities, therefore they can be preempted, so we can even have more than one
active LPI at once. How do we migrate this state?
 I am trying to prototype this by leaving active interrupts in the LRs and allowing userland to
read/write them. This looks a bit stupid, and additionally it will create problems if we are, e.g.,
migrating from a host with 8 LRs to a host with 4 LRs while having 6 active LPIs. Can anybody suggest
a better solution?
 Technically the LPI pending table has unused bits from 0 to 8191, and we have 8192 LPIs, so we could
push the active state there, just for migration. Would this be a big violation of the specification?
It says nothing about these bits at all.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2015-10-07 16:05   ` Pavel Fedin
@ 2015-10-07 16:22     ` Marc Zyngier
  -1 siblings, 0 replies; 101+ messages in thread
From: Marc Zyngier @ 2015-10-07 16:22 UTC (permalink / raw)
  To: Pavel Fedin, 'Andre Przywara', christoffer.dall
  Cc: eric.auger, kvmarm, linux-arm-kernel, kvm

On 07/10/15 17:05, Pavel Fedin wrote:
>  Hello!
> 
>  One more concern about the whole thing. I already replied to the previous series, but looks like my
> reply was missed.
>  Your implementation does not care about live migration at all. And there's one fundamental issue
> with it. In the redistributor LPIs can be only pending, but in the CPU interface they still can be
> active. And they have priorities, therefore they can be preempted, so we can have even more than one
> active LPI at once. How to migrate this state?
>  Here i am trying to prototype this by leaving active interrupts in LRs and allowing the userland to
> read/write them. This looks a bit stupid, additionally this will create problems if we are e. g.
> migrating from host with 8 LRs to host with 4 LRs, while having 6 active LPIs. Can anybody suggest
> better solution?
>  Technically LPI pending table has unused bits from 0 to 8191, and we have 8192 LPIs, so we could
> push active state there, just for migration. Would this be a big violation of specification? It says
> nothing about these bits at all.

LPIs do not have an active state, at the redistributor or otherwise.

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 101+ messages in thread

* RE: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2015-10-07 16:22     ` Marc Zyngier
@ 2015-10-07 18:09       ` Pavel Fedin
  -1 siblings, 0 replies; 101+ messages in thread
From: Pavel Fedin @ 2015-10-07 18:09 UTC (permalink / raw)
  To: 'Marc Zyngier', 'Andre Przywara', christoffer.dall
  Cc: eric.auger, kvmarm, linux-arm-kernel, kvm

 Hello!

> LPIs do not have an active state, at the redistributor or otherwise.

 Then what do they become after they have been ACK'ed and before they are EOI'ed?
 I tried to google this, and came up with this email:
http://www.spinics.net/lists/kvm-arm/msg16032.html. It says that "SW must issue a write to EOI to
clear the active priorities register, hence the CPU interface still requires an active state for
LPIs". It links to some document which seems to be top-secret and never published, because
my arch reference manual does not have a section 4.8.3 named "Properties of LPI".
 Another thread, http://lists.xen.org/archives/html/xen-devel/2014-09/msg01141.html, says that
virtual LPIs actually do have an active state in the LR.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2015-10-07 18:09       ` Pavel Fedin
  (?)
@ 2015-10-07 19:48         ` Marc Zyngier
  -1 siblings, 0 replies; 101+ messages in thread
From: Marc Zyngier @ 2015-10-07 19:48 UTC (permalink / raw)
  To: Pavel Fedin
  Cc: 'Andre Przywara',
	christoffer.dall, eric.auger, kvmarm, linux-arm-kernel, kvm

On Wed, 7 Oct 2015 21:09:07 +0300
Pavel Fedin <p.fedin@samsung.com> wrote:

>  Hello!
> 
> > LPIs do not have an active state, at the redistributor or otherwise.
> 
>  Then what do they become after they were ACK'ed and before EOI'ed?

Nothing. They are gone. What is left at the CPU interface is the active
priority.

>  I tried to google up this thing, and came up with this email:
> http://www.spinics.net/lists/kvm-arm/msg16032.html. It says that "SW must issue a write to EOI to
> clear the active priorities register, hence the CPU interface still requires an active state for
> LPIs". They give a link to some document which seems to be top-secret and never published, because
> my arch reference manual does not have section 4.8.3 named "Properties of LPI".

Your architecture document has a section 1.2.1 which contains the
sentence: "LPIs do not have an active state, and therefore do not
require explicit deactivation.". It also has 1.2.2 ("Interrupt states")
that repeatedly states the same thing. Finally, the email you quote is
about priority drop vs deactivation, not about the active state of an
LPI.
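
A toy model may help separate the two notions. This is a sketch under
simplifying assumptions, not code from the series or from the architecture
spec, and all names in it are hypothetical. Acknowledging an LPI clears its
pending state and records only its priority in a bitmap of active
priorities; an EOI write performs the priority drop by clearing the highest
active priority. There is no per-LPI active bit anywhere in the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy CPU interface: bit n set means priority group n is active (0 is highest). */
struct toy_cpu_if {
	uint32_t active_prios;
};

struct toy_lpi {
	bool pending;
	unsigned int prio;	/* priority group, 0..31 */
};

/* Acknowledge: the LPI stops being pending; only its priority is remembered. */
static void toy_ack(struct toy_cpu_if *cpu, struct toy_lpi *lpi)
{
	assert(lpi->pending && lpi->prio < 32);
	lpi->pending = false;
	cpu->active_prios |= 1u << lpi->prio;
}

/* EOI: priority drop, clears the highest active priority (lowest set bit). */
static void toy_eoi(struct toy_cpu_if *cpu)
{
	cpu->active_prios &= cpu->active_prios - 1;
}
```

If a priority-5 LPI is acknowledged and then preempted by a priority-2 LPI,
the model holds two bits in active_prios and no LPI state at all; each EOI
clears one bit, the higher priority first.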

>  And another thread,
> http://lists.xen.org/archives/html/xen-devel/2014-09/msg01141.html,
> says that virtual LPIs actually do have active state in LR.

Or not. Read again. The only case where something vaguely relevant
happens is when you inject a virtual SPI backed by a HW LPI. In that
case, the LR does have an active state (of course, this is an SPI). Or
when you inject a virtual LPI backed by a HW SPI (in which case the
relevant active state is in the physical distributor, not in the LR).

I'd appreciate if you could try to read and understand the architecture
spec instead of randomly googling and quoting various bits of
irrelevant information.

If something is unclear in the architecture specification (yes, this
is complicated and sometimes confusing), please ask relevant questions.
At the moment, you're just asserting fallacies, and I'd rather spend
time doing something useful instead of setting the record straight
again and again.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* RE: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2015-10-07 19:48         ` Marc Zyngier
@ 2015-10-08  8:41           ` Pavel Fedin
  -1 siblings, 0 replies; 101+ messages in thread
From: Pavel Fedin @ 2015-10-08  8:41 UTC (permalink / raw)
  To: 'Marc Zyngier'
  Cc: kvm, 'Andre Przywara', kvmarm, linux-arm-kernel

 Hello!

 Sorry for taking up your time, and thank you very much for the explanation.

> I'd appreciate if you could try to read and understand the architecture
> spec instead of randomly googling and quoting various bits of
> irrelevant information.

 I apologize for not having had time to read the whole spec from beginning to end. I can only add
that it is quite odd to find these important things in the "Terminology" section. I would
expect them to be in 6.1, for example. That was the part I read, but I failed to find the exact
answer:
--- cut ---
LPIs do not have an active state, and transition to the inactive state on being acknowledged by a PE
--- cut ---

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2015-10-07 14:55 ` Andre Przywara
@ 2015-10-10 15:37   ` Christoffer Dall
  -1 siblings, 0 replies; 101+ messages in thread
From: Christoffer Dall @ 2015-10-10 15:37 UTC (permalink / raw)
  To: Andre Przywara
  Cc: marc.zyngier, eric.auger, p.fedin, kvmarm, linux-arm-kernel, kvm

Hi Andre,


On Wed, Oct 07, 2015 at 03:55:10PM +0100, Andre Przywara wrote:
> Hi,
> 
> another respin and rebase of the ITS emulation series.
> Major changes compared to v2 (beside some minor things like added
> comments and function renames) are the rebasing and adaption to 4.3-rc
> and Christoffer's timer rework series. Also the locking has been
> reworked to cope with the dependencies of the its and the dist lock
> in connection with the PROPBASER/PENDBASER and the command handling.
> For a more detailed changelog see below or look at the respective
> commit messages.
> 
> This should address most of the comments I got on the list.
> Many thanks to the diligent reviewers!
> I didn't bother to fine-tune patch 01/16 too much, as I guess there
> will be more discussion around this based on Pavel's latest post.
> 
> These patches go on top of Christoffer's timer rework series [1],
> which itself is on top of 4.3-rc2.
> You can find all of this code in the its-emul/v3 branch of my
> repository [2].

Thanks for rebasing the series!

Just a heads up that I may not be able to review this series for the
next 1-2 weeks, so I'm afraid it's not going to make it in for v4.4,
sorry.

Please let me know if this breaks anyone's expectations.

Otherwise, I will try to review it with due diligence so it makes it in for
v4.5.

Best,
-Christoffer

> 
> Changelog v2..v3:
> - adapt to 4.3-rc and Christoffer's timer rework
> - adapt spin locks on handling PROPBASER/PENDBASER registers
> - rework locking in ITS command handling (dropping dist where needed)
> - only clear LPI pending bit if LPI could actually be queued
> - simplify GICR_CTLR handling
> - properly free ITTEs (including our pending bitmap)
> - fix corner cases with unmapped collections
> - keep retire_lr() around
> - rename vgic_handle_base_register to vgic_reg64_access()
> - use kcalloc instead of kmalloc
> - minor fixes, renames and added comments
> 
> Changelog v1..v2
> - fix issues when using non-ITS GICv3 emulation
> - streamline frame address initialization (new patch 05/15)
> - preallocate buffer memory for reading from guest's memory
> - move locking into the actual command handlers
> -   preallocate memory for new structures if needed
> - use non-atomic __set_bit() and __clear_bit() when under the lock
> - add INT command handler to allow LPI injection from the guest
> - rewrite CWRITER handler to align with new locking scheme
> - remove unneeded CONFIG_HAVE_KVM_MSI #ifdefs
> - check memory table size against our LPI limit (65536 interrupts)
> - observe initial gap of 1024 interrupts in pending table
> - use term "configuration table" to be in line with the spec
> - clarify and extend documentation on API extensions
> - introduce new KVM_CAP_MSI_DEVID capability to advertise device ID requirement
> - update, fix and add many comments
> - minor style changes as requested by reviewers
> 
> ---------------
> 
> The GICv3 ITS (Interrupt Translation Service) is a part of the
> ARM GICv3 interrupt controller [4] used for implementing MSIs.
> It specifies a new kind of interrupts (LPIs), which are mapped to
> establish a connection between a device, its MSI payload value and
> the target processor the IRQ is eventually delivered to.
> In order to allow using MSIs in an ARM64 KVM guest, we emulate this
> ITS widget in the kernel.
> The ITS works by reading commands written by software (from the guest
> in our case) into a (guest allocated) memory region and establishing
> the mapping between a device, the MSI payload and the target CPU.
> We parse these commands and update our internal data structures to
> reflect those changes. On an MSI injection we iterate those
> structures to learn the LPI number we have to inject.
> For the time being we use simple lists to hold the data, this is
> good enough for the small number of entries each of the components
> currently have. Should this become a performance bottleneck in the
> future, those can be extended to arrays or trees if needed.
> 
> Most of the code lives in a separate source file (its-emul.c), though
> there are some changes necessary both in vgic.c and vgic-v3-emul.c.
> 
> Patch 01/16 gets rid of the internal tracking of the used LR for
> an injected IRQ, see the commit message for more details.
> Patch 03/16 extends the KVM MSI ioctl to hold a device ID.
> Patch 04-06 make small changes to the existing VGIC code which make
> adaptions to the ITS later easier.
> The rest of the patches implement the ITS functionality step by step.
> For more details see the respective commit messages.
> 
> For the time being this series gives us the ability to use emulated
> PCI devices that can use MSIs in the guest. Those have to be
> triggered by letting the userland device emulation simulate the MSI
> write with the KVM_SIGNAL_MSI ioctl. This will be translated into
> the proper LPI by the ITS emulation and injected into the guest in
> the usual way (just with a higher IRQ number).
> 
> This series is based on 4.3-rc2 and can be found at the its-emul/v3
> branch of this repository [2].
> For this to be used you need a GICv3 host machine (a fast model would
> do), though it does not rely on any host ITS bits (neither in hardware
> or software).
> 
> To test this you can use the kvmtool patches available in the "its"
> branch here [3].
> Start a guest with: "$ lkvm run --irqchip=gicv3-its --force-pci"
> and see the ITS being used for instance by the virtio devices.
> 
> [1]: https://git.linaro.org/people/christoffer.dall/linux-kvm-arm.git/shortlog/refs/heads/timer-rework-v3
> [2]: git://linux-arm.org/linux-ap.git
>      http://www.linux-arm.org/git?p=linux-ap.git;a=log;h=refs/heads/its-emul/v3
> [3]: git://linux-arm.org/kvmtool.git
>      http://www.linux-arm.org/git?p=kvmtool.git;a=log;h=refs/heads/its
> [4]: http://arminfo.emea.arm.com/help/topic/com.arm.doc.ihi0069a/IHI0069A_gic_architecture_specification.pdf
> 
> Andre Przywara (16):
>   KVM: arm/arm64: VGIC: don't track used LRs in the distributor
>   KVM: arm/arm64: remove now unused code after stay-in-LR rework
>   KVM: extend struct kvm_msi to hold a 32-bit device ID
>   KVM: arm/arm64: add emulation model specific destroy function
>   KVM: arm/arm64: extend arch CAP checks to allow per-VM capabilities
>   KVM: arm/arm64: make GIC frame address initialization model specific
>   KVM: arm64: Introduce new MMIO region for the ITS base address
>   KVM: arm64: handle ITS related GICv3 redistributor registers
>   KVM: arm64: introduce ITS emulation file with stub functions
>   KVM: arm64: implement basic ITS register handlers
>   KVM: arm64: add data structures to model ITS interrupt translation
>   KVM: arm64: handle pending bit for LPIs in ITS emulation
>   KVM: arm64: sync LPI configuration and pending tables
>   KVM: arm64: implement ITS command queue command handlers
>   KVM: arm64: implement MSI injection in ITS emulation
>   KVM: arm64: enable ITS emulation as a virtual MSI controller
> 
>  Documentation/virtual/kvm/api.txt              |   14 +-
>  Documentation/virtual/kvm/devices/arm-vgic.txt |    9 +
>  arch/arm/include/asm/kvm_host.h                |    2 +-
>  arch/arm/kvm/arm.c                             |    2 +-
>  arch/arm64/include/asm/kvm_host.h              |    2 +-
>  arch/arm64/include/uapi/asm/kvm.h              |    2 +
>  arch/arm64/kvm/Kconfig                         |    1 +
>  arch/arm64/kvm/Makefile                        |    1 +
>  arch/arm64/kvm/reset.c                         |    8 +-
>  include/kvm/arm_vgic.h                         |   43 +-
>  include/linux/irqchip/arm-gic-v3.h             |   14 +-
>  include/uapi/linux/kvm.h                       |    5 +-
>  virt/kvm/arm/its-emul.c                        | 1187 ++++++++++++++++++++++++
>  virt/kvm/arm/its-emul.h                        |   55 ++
>  virt/kvm/arm/vgic-v2-emul.c                    |    3 +
>  virt/kvm/arm/vgic-v2.c                         |    1 +
>  virt/kvm/arm/vgic-v3-emul.c                    |  101 +-
>  virt/kvm/arm/vgic-v3.c                         |    1 +
>  virt/kvm/arm/vgic.c                            |  292 +++---
>  virt/kvm/arm/vgic.h                            |    3 +
>  20 files changed, 1601 insertions(+), 145 deletions(-)
>  create mode 100644 virt/kvm/arm/its-emul.c
>  create mode 100644 virt/kvm/arm/its-emul.h
> 
> -- 
> 2.5.1
> 

^ permalink raw reply	[flat|nested] 101+ messages in thread

* RE: [PATCH v3 12/16] KVM: arm64: handle pending bit for LPIs in ITS emulation
  2015-10-07 14:55   ` Andre Przywara
@ 2015-10-12  7:40     ` Pavel Fedin
  -1 siblings, 0 replies; 101+ messages in thread
From: Pavel Fedin @ 2015-10-12  7:40 UTC (permalink / raw)
  To: 'Andre Przywara', marc.zyngier, christoffer.dall
  Cc: eric.auger, kvmarm, linux-arm-kernel, kvm

 Hello!

> -----Original Message-----
> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] On Behalf Of Andre Przywara
> Sent: Wednesday, October 07, 2015 5:55 PM
> To: marc.zyngier@arm.com; christoffer.dall@linaro.org
> Cc: eric.auger@linaro.org; p.fedin@samsung.com; kvmarm@lists.cs.columbia.edu; linux-arm-
> kernel@lists.infradead.org; kvm@vger.kernel.org
> Subject: [PATCH v3 12/16] KVM: arm64: handle pending bit for LPIs in ITS emulation
> 
> As the actual LPI number in a guest can be quite high, but is mostly
> assigned using a very sparse allocation scheme, bitmaps and arrays
> for storing the virtual interrupt status are a waste of memory.
> We use our equivalent of the "Interrupt Translation Table Entry"
> (ITTE) to hold this extra status information for a virtual LPI.
> As the normal VGIC code cannot use its fancy bitmaps to manage
> pending interrupts, we provide a hook in the VGIC code to let the
> ITS emulation handle the list register queueing itself.
> LPIs are located in a separate number range (>=8192), so
> distinguishing them is easy. With LPIs being only edge-triggered, we
> get away with a less complex IRQ handling.
> We extend the number of bits for storing the IRQ number in our
> LR struct to 16 to cover the LPI numbers we support as well.
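
 As a quick illustration of why the field has to grow: a 10-bit field
holds 0..1023, so no LPI number (>= 8192) survives the assignment,
while 16 bits cover everything up to 65535. The sketch below uses
made-up names (toy_lr_old, toy_lr_new, toy_fits_old, toy_fits_new) and
only mirrors the width of the irq member, not the real struct vgic_lr.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins that mirror only the width of the irq field. */
struct toy_lr_old { unsigned irq:10; };	/* before the patch */
struct toy_lr_new { unsigned irq:16; };	/* after the patch */

/* True if intid survives a round trip through the 10-bit field. */
static inline bool toy_fits_old(unsigned intid)
{
	struct toy_lr_old lr;

	lr.irq = intid;	/* unsigned bit-field: stored modulo 2^10 */
	return (unsigned)lr.irq == intid;
}

/* True if intid survives a round trip through the 16-bit field. */
static inline bool toy_fits_new(unsigned intid)
{
	struct toy_lr_new lr;

	lr.irq = intid;	/* unsigned bit-field: stored modulo 2^16 */
	return (unsigned)lr.irq == intid;
}
```

 With these helpers, toy_fits_old(8192) is false while
toy_fits_new(8192) is true, and 65535 is the largest value that fits
in the new field.
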
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
> Changelog v2..v3:
> - extend LR data structure to hold 16-bit wide IRQ IDs
> - only clear pending bit if IRQ could be queued
> - adapt __kvm_vgic_sync_hwstate() to upstream changes
> 
>  include/kvm/arm_vgic.h      |  4 +-
>  virt/kvm/arm/its-emul.c     | 75 ++++++++++++++++++++++++++++++++++++
>  virt/kvm/arm/its-emul.h     |  3 ++
>  virt/kvm/arm/vgic-v3-emul.c |  2 +
>  virt/kvm/arm/vgic.c         | 93 +++++++++++++++++++++++++++++++--------------
>  5 files changed, 148 insertions(+), 29 deletions(-)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index c3eb414..035911f 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -95,7 +95,7 @@ enum vgic_type {
>  #define LR_HW			(1 << 3)
> 
>  struct vgic_lr {
> -	unsigned irq:10;
> +	unsigned irq:16;
>  	union {
>  		unsigned hwirq:10;
>  		unsigned source:3;
> @@ -147,6 +147,8 @@ struct vgic_vm_ops {
>  	int	(*init_model)(struct kvm *);
>  	void	(*destroy_model)(struct kvm *);
>  	int	(*map_resources)(struct kvm *, const struct vgic_params *);
> +	bool	(*queue_lpis)(struct kvm_vcpu *);
> +	void	(*unqueue_lpi)(struct kvm_vcpu *, int irq);
>  };
> 
>  struct vgic_io_device {
> diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
> index bab8033..8349970 100644
> --- a/virt/kvm/arm/its-emul.c
> +++ b/virt/kvm/arm/its-emul.c
> @@ -59,8 +59,27 @@ struct its_itte {
>  	struct its_collection *collection;
>  	u32 lpi;
>  	u32 event_id;
> +	bool enabled;
> +	unsigned long *pending;
>  };
> 
> +/* To be used as an iterator this macro misses the enclosing parentheses */
> +#define for_each_lpi(dev, itte, kvm) \
> +	list_for_each_entry(dev, &(kvm)->arch.vgic.its.device_list, dev_list) \
> +		list_for_each_entry(itte, &(dev)->itt, itte_list)
> +
> +static struct its_itte *find_itte_by_lpi(struct kvm *kvm, int lpi)
> +{
> +	struct its_device *device;
> +	struct its_itte *itte;
> +
> +	for_each_lpi(device, itte, kvm) {
> +		if (itte->lpi == lpi)
> +			return itte;
> +	}
> +	return NULL;
> +}
> +
>  #define BASER_BASE_ADDRESS(x) ((x) & 0xfffffffff000ULL)
> 
>  /* The distributor lock is held by the VGIC MMIO handler. */
> @@ -154,9 +173,65 @@ static bool handle_mmio_gits_idregs(struct kvm_vcpu *vcpu,
>  	return false;
>  }
> 
> +/*
> + * Find all enabled and pending LPIs and queue them into the list
> + * registers.
> + * The dist lock is held by the caller.
> + */
> +bool vits_queue_lpis(struct kvm_vcpu *vcpu)
> +{
> +	struct vgic_its *its = &vcpu->kvm->arch.vgic.its;
> +	struct its_device *device;
> +	struct its_itte *itte;
> +	bool ret = true;
> +
> +	if (!vgic_has_its(vcpu->kvm))
> +		return true;
> +	if (!its->enabled || !vcpu->kvm->arch.vgic.lpis_enabled)
> +		return true;
> +
> +	spin_lock(&its->lock);
> +	for_each_lpi(device, itte, vcpu->kvm) {
> +		if (!itte->enabled || !test_bit(vcpu->vcpu_id, itte->pending))
> +			continue;
> +
> +		if (!itte->collection)
> +			continue;
> +
> +		if (itte->collection->target_addr != vcpu->vcpu_id)
> +			continue;
> +
> +
> +		if (vgic_queue_irq(vcpu, 0, itte->lpi))
> +			__clear_bit(vcpu->vcpu_id, itte->pending);
> +		else
> +			ret = false;

 Shouldn't we also have 'break' here? If vgic_queue_irq() returns false, this means we have no more
LRs to use, therefore it makes no sense to keep iterating.
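
 A minimal sketch of what I mean, written as a self-contained toy
rather than against the real VGIC code. It assumes that a failed queue
attempt always means the list registers are exhausted, which may not
hold for every case in vgic_queue_irq(). All names here (toy_vcpu,
toy_queue_irq, toy_queue_lpis, TOY_NR_LR) are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_LR 4	/* hypothetical number of list registers */

struct toy_vcpu {
	int lr[TOY_NR_LR];	/* LPI numbers currently held in LRs */
	int used;		/* how many LRs are occupied */
	int attempts;		/* counts calls to toy_queue_irq() */
};

/* Stand-in for vgic_queue_irq(): fails once all LRs are in use. */
static inline bool toy_queue_irq(struct toy_vcpu *vcpu, int lpi)
{
	vcpu->attempts++;
	if (vcpu->used == TOY_NR_LR)
		return false;
	vcpu->lr[vcpu->used++] = lpi;
	return true;
}

/*
 * Queue all pending LPIs, stopping at the first failure.
 * pending[i] is cleared only if lpis[i] was actually queued.
 * Returns true if every pending LPI found a list register.
 */
static inline bool toy_queue_lpis(struct toy_vcpu *vcpu, const int *lpis,
				  bool *pending, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!pending[i])
			continue;
		if (!toy_queue_irq(vcpu, lpis[i]))
			return false;	/* same effect as ret = false; break; */
		pending[i] = false;
	}
	return true;
}
```

 With six pending LPIs and four LRs, the loop above makes five queue
attempts instead of six, and the two LPIs that were not queued keep
their pending bits.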

> +	}
> +
> +	spin_unlock(&its->lock);
> +	return ret;
> +}
> +
> +/* Called with the distributor lock held by the caller. */
> +void vits_unqueue_lpi(struct kvm_vcpu *vcpu, int lpi)
> +{
> +	struct vgic_its *its = &vcpu->kvm->arch.vgic.its;
> +	struct its_itte *itte;
> +
> +	spin_lock(&its->lock);
> +
> +	/* Find the right ITTE and put the pending state back in there */
> +	itte = find_itte_by_lpi(vcpu->kvm, lpi);
> +	if (itte)
> +		__set_bit(vcpu->vcpu_id, itte->pending);
> +
> +	spin_unlock(&its->lock);
> +}
> +
>  static void its_free_itte(struct its_itte *itte)
>  {
>  	list_del(&itte->itte_list);
> +	kfree(itte->pending);
>  	kfree(itte);
>  }
> 
> diff --git a/virt/kvm/arm/its-emul.h b/virt/kvm/arm/its-emul.h
> index 472a6d0..cc5d5ff 100644
> --- a/virt/kvm/arm/its-emul.h
> +++ b/virt/kvm/arm/its-emul.h
> @@ -33,4 +33,7 @@ void vgic_enable_lpis(struct kvm_vcpu *vcpu);
>  int vits_init(struct kvm *kvm);
>  void vits_destroy(struct kvm *kvm);
> 
> +bool vits_queue_lpis(struct kvm_vcpu *vcpu);
> +void vits_unqueue_lpi(struct kvm_vcpu *vcpu, int irq);
> +
>  #endif
> diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
> index e9aa29e..f482e34 100644
> --- a/virt/kvm/arm/vgic-v3-emul.c
> +++ b/virt/kvm/arm/vgic-v3-emul.c
> @@ -944,6 +944,8 @@ void vgic_v3_init_emulation(struct kvm *kvm)
>  	dist->vm_ops.init_model = vgic_v3_init_model;
>  	dist->vm_ops.destroy_model = vgic_v3_destroy_model;
>  	dist->vm_ops.map_resources = vgic_v3_map_resources;
> +	dist->vm_ops.queue_lpis = vits_queue_lpis;
> +	dist->vm_ops.unqueue_lpi = vits_unqueue_lpi;
> 
>  	dist->vgic_dist_base = VGIC_ADDR_UNDEF;
>  	dist->vgic_redist_base = VGIC_ADDR_UNDEF;
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 11bf692..9ee87d3 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -120,6 +120,20 @@ static bool queue_sgi(struct kvm_vcpu *vcpu, int irq)
>  	return vcpu->kvm->arch.vgic.vm_ops.queue_sgi(vcpu, irq);
>  }
> 
> +static bool vgic_queue_lpis(struct kvm_vcpu *vcpu)
> +{
> +	if (vcpu->kvm->arch.vgic.vm_ops.queue_lpis)
> +		return vcpu->kvm->arch.vgic.vm_ops.queue_lpis(vcpu);
> +	else
> +		return true;
> +}
> +
> +static void vgic_unqueue_lpi(struct kvm_vcpu *vcpu, int irq)
> +{
> +	if (vcpu->kvm->arch.vgic.vm_ops.unqueue_lpi)
> +		vcpu->kvm->arch.vgic.vm_ops.unqueue_lpi(vcpu, irq);
> +}
> +
>  int kvm_vgic_map_resources(struct kvm *kvm)
>  {
>  	return kvm->arch.vgic.vm_ops.map_resources(kvm, vgic);
> @@ -1148,18 +1162,28 @@ static void vgic_retire_lr(int lr_nr, struct kvm_vcpu *vcpu)
>  static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
>  				 int lr_nr, struct vgic_lr vlr)
>  {
> -	if (vgic_irq_is_active(vcpu, irq)) {
> -		vlr.state |= LR_STATE_ACTIVE;
> -		kvm_debug("Set active, clear distributor: 0x%x\n", vlr.state);
> -		vgic_irq_clear_active(vcpu, irq);
> -		vgic_update_state(vcpu->kvm);
> -	} else if (vgic_dist_irq_is_pending(vcpu, irq)) {
> -		vlr.state |= LR_STATE_PENDING;
> -		kvm_debug("Set pending: 0x%x\n", vlr.state);
> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +
> +	/* We care only about state for SGIs/PPIs/SPIs, not for LPIs */
> +	if (irq < dist->nr_irqs) {
> +		if (vgic_irq_is_active(vcpu, irq)) {
> +			vlr.state |= LR_STATE_ACTIVE;
> +			kvm_debug("Set active, clear distributor: 0x%x\n",
> +				  vlr.state);
> +			vgic_irq_clear_active(vcpu, irq);
> +			vgic_update_state(vcpu->kvm);
> +		} else if (vgic_dist_irq_is_pending(vcpu, irq)) {
> +			vlr.state |= LR_STATE_PENDING;
> +			kvm_debug("Set pending: 0x%x\n", vlr.state);
> +		}
> +		if (!vgic_irq_is_edge(vcpu, irq))
> +			vlr.state |= LR_EOI_INT;
> +	} else {
> +		/* If this is an LPI, it can only be pending */
> +		if (irq >= 8192)
> +			vlr.state |= LR_STATE_PENDING;
>  	}
> 
> -	if (!vgic_irq_is_edge(vcpu, irq))
> -		vlr.state |= LR_EOI_INT;
> 
>  	if (vlr.irq >= VGIC_NR_SGIS) {
>  		struct irq_phys_map *map;
> @@ -1190,16 +1214,14 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
>   */
>  bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
>  {
> -	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> -	struct vgic_lr vlr;
>  	u64 elrsr = vgic_get_elrsr(vcpu);
>  	unsigned long *elrsr_ptr = u64_to_bitmask(&elrsr);
> +	struct vgic_lr vlr;
>  	int lr;
> 
>  	/* Sanitize the input... */
>  	BUG_ON(sgi_source_id & ~7);
>  	BUG_ON(sgi_source_id && irq >= VGIC_NR_SGIS);
> -	BUG_ON(irq >= dist->nr_irqs);
> 
>  	kvm_debug("Queue IRQ%d\n", irq);
> 
> @@ -1282,8 +1304,12 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
>  			overflow = 1;
>  	}
> 
> -
> -
> +	/*
> +	 * LPIs are not mapped in our bitmaps, so we leave the iteration
> +	 * to the ITS emulation code.
> +	 */
> +	if (!vgic_queue_lpis(vcpu))
> +		overflow = 1;
> 
>  epilog:
>  	if (overflow) {
> @@ -1488,20 +1514,30 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
>  		if (test_bit(lr, elrsr_ptr))
>  			continue;
> 
> -		/* Reestablish SGI source for pending and active SGIs */
> -		if (vlr.irq < VGIC_NR_SGIS)
> -			add_sgi_source(vcpu, vlr.irq, vlr.source);
> -
> -		if (vlr.state & LR_STATE_PENDING)
> -			vgic_dist_irq_set_pending(vcpu, vlr.irq);
> -
> -		if (vlr.state & LR_STATE_ACTIVE) {
> -			if (vlr.state & LR_STATE_PENDING) {
> -				vgic_irq_set_active(vcpu, vlr.irq);
> -			} else {
> -				/* Active-only IRQs stay in the LR */
> -				pending = true;
> +		/* LPIs are handled separately */
> +		if (vlr.irq >= 8192) {
> +			/* We just need to take care about still pending LPIs */
> +			if (!(vlr.state & LR_STATE_PENDING))
>  				continue;
> +			vgic_unqueue_lpi(vcpu, vlr.irq);
> +		} else {
> +			BUG_ON(!(vlr.state & LR_STATE_MASK));
> +
> +			/* Reestablish SGI source for pending and active SGIs */
> +			if (vlr.irq < VGIC_NR_SGIS)
> +				add_sgi_source(vcpu, vlr.irq, vlr.source);
> +
> +			if (vlr.state & LR_STATE_PENDING)
> +				vgic_dist_irq_set_pending(vcpu, vlr.irq);
> +
> +			if (vlr.state & LR_STATE_ACTIVE) {
> +				if (vlr.state & LR_STATE_PENDING) {
> +					vgic_irq_set_active(vcpu, vlr.irq);
> +				} else {
> +					/* Active-only IRQs stay in the LR */
> +					pending = true;
> +					continue;
> +				}
>  			}
>  		}
> 
> @@ -1512,6 +1548,7 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
>  	}
>  	vgic_update_state(vcpu->kvm);
> 
> +	/* vgic_update_state would not cover only-active IRQs or LPIs */
>  	if (pending)
>  		set_bit(vcpu->vcpu_id, dist->irq_pending_on_cpu);
>  	spin_unlock(&dist->lock);
> --
> 2.5.1

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



^ permalink raw reply	[flat|nested] 101+ messages in thread

* RE: [PATCH v3 12/16] KVM: arm64: handle pending bit for LPIs in ITS emulation
  2015-10-12  7:40     ` Pavel Fedin
@ 2015-10-12 11:39       ` Pavel Fedin
  -1 siblings, 0 replies; 101+ messages in thread
From: Pavel Fedin @ 2015-10-12 11:39 UTC (permalink / raw)
  To: 'Andre Przywara', marc.zyngier, christoffer.dall
  Cc: eric.auger, kvmarm, linux-arm-kernel, kvm

 Hello!

>  Shouldn't we also have 'break' here? If vgic_queue_irq() returns false, this means we have no more
> LRs to use, therefore it makes no sense to keep iterating.

 No, don't listen to me. :) Because of piggyback, we indeed have to recheck all the interrupts.
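
The piggyback case can be sketched with a toy model. This is only a minimal illustration, not the kernel code: the names toy_lrs, toy_queue_irq() and toy_queue_all(), as well as the two-entry list register array, are assumptions made for the example. It shows how an interrupt that already sits in a list register can still be queued after an earlier interrupt failed for lack of a free LR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_LR	2	/* hypothetical number of list registers */
#define TOY_LR_EMPTY	(-1)

/* Toy model of a VCPU's list registers; not the kernel's data structure. */
struct toy_lrs {
	int irq[TOY_NR_LR];
};

void toy_lrs_init(struct toy_lrs *lrs)
{
	size_t i;

	for (i = 0; i < TOY_NR_LR; i++)
		lrs->irq[i] = TOY_LR_EMPTY;
}

/* Simplified stand-in for vgic_queue_irq(): true if irq ends up in an LR. */
bool toy_queue_irq(struct toy_lrs *lrs, int irq)
{
	size_t i;

	assert(irq >= 0);

	/* Piggyback: the IRQ already occupies an LR, so reuse that one. */
	for (i = 0; i < TOY_NR_LR; i++)
		if (lrs->irq[i] == irq)
			return true;

	/* Otherwise claim the first free LR, if any. */
	for (i = 0; i < TOY_NR_LR; i++) {
		if (lrs->irq[i] == TOY_LR_EMPTY) {
			lrs->irq[i] = irq;
			return true;
		}
	}

	return false;	/* no LR available for this particular IRQ */
}

/* Try every candidate without an early break; returns how many were queued. */
int toy_queue_all(struct toy_lrs *lrs, const int *irqs, size_t n)
{
	int queued = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (toy_queue_irq(lrs, irqs[i]))
			queued++;

	return queued;
}
```

With 8193 already in an LR, trying 8200, 8201 and 8193 in that order puts 8200 into the last free LR, fails for 8201, and still succeeds for 8193. Stopping at the first failure would have skipped it.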

 P.S. I still sometimes lose LPIs, and this is not related to the spurious injection fix, because I
tried omitting the reset of the irq_pending_on_cpu bit and still lost some LPIs. I will try to compare
with v2, because I don't remember this problem with v2.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2015-10-10 15:37   ` Christoffer Dall
@ 2015-10-12 14:12     ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-12 14:12 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: kvm, marc.zyngier, kvmarm, linux-arm-kernel

Hi,

On 10/10/15 16:37, Christoffer Dall wrote:
> Hi Andre,
> 
> 
> On Wed, Oct 07, 2015 at 03:55:10PM +0100, Andre Przywara wrote:
>> Hi,
>>
>> another respin and rebase of the ITS emulation series.
>> Major changes compared to v2 (beside some minor things like added
>> comments and function renames) are the rebasing and adaption to 4.3-rc
>> and Christoffer's timer rework series. Also the locking has been
>> reworked to cope with the dependencies of the its and the dist lock
>> in connection with the PROPBASER/PENDBASER and the command handling.
>> For a more detailed changelog see below or look at the respective
>> commit messages.
>>
>> This should address most of the comments I got on the list.
>> Many thanks to the diligent reviewers!
>> I didn't bother to fine-tune patch 01/16 too much, as I guess there
>> will be more discussion around this based on Pavel's latest post.
>>
>> These patches go on top of Christoffer's timer rework series [1],
>> which itself is on top of 4.3-rc2.
>> You can find all of this code in the its-emul/v3 branch of my
>> repository [2].
> 
> Thanks for rebasing the series!
> 
> Just a heads up that I may not be able to review this series for the
> next 1-2 weeks, so I'm afraid it's not going to make it in for v4.4,
> sorry.
> 
> Please let me know if this breaks expectations from everyone.

No worries, I wasn't expecting this for 4.4 anyway.
I'd rather see the prerequisites like your timer series go upstream
first; I will then rebase this series on top of 4.4-rc1 (with fixes from
newer review comments incorporated).
Maybe we can take Pavel's cleanup (replacing my 1/16 and 2/16) for 4.4
already? (I will reply to those soon.)
Also, what is the status of Eric's IRQ routing support? Should this go in
first now?

Cheers,
Andre.

> Otherwise, I will try to review it with due diligence so it makes it in for
> v4.5.
> 
> Best,
> -Christoffer

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v3 12/16] KVM: arm64: handle pending bit for LPIs in ITS emulation
  2015-10-12  7:40     ` Pavel Fedin
@ 2015-10-12 14:17       ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-12 14:17 UTC (permalink / raw)
  To: Pavel Fedin; +Cc: kvm, marc.zyngier, kvmarm, linux-arm-kernel

Hi Pavel,

On 12/10/15 08:40, Pavel Fedin wrote:
>  Hello!
> 
>> -----Original Message-----
>> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] On Behalf Of Andre Przywara
>> Sent: Wednesday, October 07, 2015 5:55 PM
>> To: marc.zyngier@arm.com; christoffer.dall@linaro.org
>> Cc: eric.auger@linaro.org; p.fedin@samsung.com; kvmarm@lists.cs.columbia.edu; linux-arm-
>> kernel@lists.infradead.org; kvm@vger.kernel.org
>> Subject: [PATCH v3 12/16] KVM: arm64: handle pending bit for LPIs in ITS emulation
>>
>> As the actual LPI number in a guest can be quite high, but is mostly
>> assigned using a very sparse allocation scheme, bitmaps and arrays
>> for storing the virtual interrupt status are a waste of memory.
>> We use our equivalent of the "Interrupt Translation Table Entry"
>> (ITTE) to hold this extra status information for a virtual LPI.
>> As the normal VGIC code cannot use its fancy bitmaps to manage
>> pending interrupts, we provide a hook in the VGIC code to let the
>> ITS emulation handle the list register queueing itself.
>> LPIs are located in a separate number range (>=8192), so
>> distinguishing them is easy. With LPIs being only edge-triggered, we
>> get away with a less complex IRQ handling.
>> We extend the number of bits for storing the IRQ number in our
>> LR struct to 16 to cover the LPI numbers we support as well.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>> Changelog v2..v3:
>> - extend LR data structure to hold 16-bit wide IRQ IDs
>> - only clear pending bit if IRQ could be queued
>> - adapt __kvm_vgic_sync_hwstate() to upstream changes
>>
>>  include/kvm/arm_vgic.h      |  4 +-
>>  virt/kvm/arm/its-emul.c     | 75 ++++++++++++++++++++++++++++++++++++
>>  virt/kvm/arm/its-emul.h     |  3 ++
>>  virt/kvm/arm/vgic-v3-emul.c |  2 +
>>  virt/kvm/arm/vgic.c         | 93 +++++++++++++++++++++++++++++++--------------
>>  5 files changed, 148 insertions(+), 29 deletions(-)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index c3eb414..035911f 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -95,7 +95,7 @@ enum vgic_type {
>>  #define LR_HW			(1 << 3)
>>
>>  struct vgic_lr {
>> -	unsigned irq:10;
>> +	unsigned irq:16;
>>  	union {
>>  		unsigned hwirq:10;
>>  		unsigned source:3;
>> @@ -147,6 +147,8 @@ struct vgic_vm_ops {
>>  	int	(*init_model)(struct kvm *);
>>  	void	(*destroy_model)(struct kvm *);
>>  	int	(*map_resources)(struct kvm *, const struct vgic_params *);
>> +	bool	(*queue_lpis)(struct kvm_vcpu *);
>> +	void	(*unqueue_lpi)(struct kvm_vcpu *, int irq);
>>  };
>>
>>  struct vgic_io_device {
>> diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
>> index bab8033..8349970 100644
>> --- a/virt/kvm/arm/its-emul.c
>> +++ b/virt/kvm/arm/its-emul.c
>> @@ -59,8 +59,27 @@ struct its_itte {
>>  	struct its_collection *collection;
>>  	u32 lpi;
>>  	u32 event_id;
>> +	bool enabled;
>> +	unsigned long *pending;
>>  };
>>
>> +/* To be used as an iterator this macro misses the enclosing parentheses */
>> +#define for_each_lpi(dev, itte, kvm) \
>> +	list_for_each_entry(dev, &(kvm)->arch.vgic.its.device_list, dev_list) \
>> +		list_for_each_entry(itte, &(dev)->itt, itte_list)
>> +
>> +static struct its_itte *find_itte_by_lpi(struct kvm *kvm, int lpi)
>> +{
>> +	struct its_device *device;
>> +	struct its_itte *itte;
>> +
>> +	for_each_lpi(device, itte, kvm) {
>> +		if (itte->lpi == lpi)
>> +			return itte;
>> +	}
>> +	return NULL;
>> +}
>> +
>>  #define BASER_BASE_ADDRESS(x) ((x) & 0xfffffffff000ULL)
>>
>>  /* The distributor lock is held by the VGIC MMIO handler. */
>> @@ -154,9 +173,65 @@ static bool handle_mmio_gits_idregs(struct kvm_vcpu *vcpu,
>>  	return false;
>>  }
>>
>> +/*
>> + * Find all enabled and pending LPIs and queue them into the list
>> + * registers.
>> + * The dist lock is held by the caller.
>> + */
>> +bool vits_queue_lpis(struct kvm_vcpu *vcpu)
>> +{
>> +	struct vgic_its *its = &vcpu->kvm->arch.vgic.its;
>> +	struct its_device *device;
>> +	struct its_itte *itte;
>> +	bool ret = true;
>> +
>> +	if (!vgic_has_its(vcpu->kvm))
>> +		return true;
>> +	if (!its->enabled || !vcpu->kvm->arch.vgic.lpis_enabled)
>> +		return true;
>> +
>> +	spin_lock(&its->lock);
>> +	for_each_lpi(device, itte, vcpu->kvm) {
>> +		if (!itte->enabled || !test_bit(vcpu->vcpu_id, itte->pending))
>> +			continue;
>> +
>> +		if (!itte->collection)
>> +			continue;
>> +
>> +		if (itte->collection->target_addr != vcpu->vcpu_id)
>> +			continue;
>> +
>> +
>> +		if (vgic_queue_irq(vcpu, 0, itte->lpi))
>> +			__clear_bit(vcpu->vcpu_id, itte->pending);
>> +		else
>> +			ret = false;
> 
>  Shouldn't we also have 'break' here? If vgic_queue_irq() returns false, this means we have no more
> LRs to use, therefore it makes no sense to keep iterating.

I consider this premature optimization at this point.
vgic_queue_irq() only reports success for this particular interrupt, and
I'd rather not make assumptions about other IRQs (we could piggy-back
those, for instance).
Even if not, I'd prefer to not break abstraction here.
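
To illustrate treating the return value as a per-interrupt result, here is a minimal sketch of such a caller. It is not the code from the patch: struct toy_lpi, toy_queue_fn, toy_queue_lpis() and toy_accept_even() are hypothetical names, and the queueing primitive is passed in as a callback so that the loop cannot depend on why a particular interrupt was rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-interrupt state, loosely modelled on struct its_itte. */
struct toy_lpi {
	int  lpi;
	bool enabled;
	bool pending;
};

/* Opaque queueing primitive: reports success for one interrupt only. */
typedef bool (*toy_queue_fn)(int irq);

/*
 * Walk all LPIs and clear the pending flag only for those that were queued.
 * Returns false if at least one interrupt could not be queued (overflow).
 */
bool toy_queue_lpis(struct toy_lpi *lpis, size_t n, toy_queue_fn queue)
{
	bool ret = true;
	size_t i;

	assert(queue != NULL);

	for (i = 0; i < n; i++) {
		if (!lpis[i].enabled || !lpis[i].pending)
			continue;

		if (queue(lpis[i].lpi))
			lpis[i].pending = false;
		else
			ret = false;	/* keep going: the next one may still fit */
	}

	return ret;
}

/* Example primitive for experiments: accepts even IRQ numbers only. */
bool toy_accept_even(int irq)
{
	return (irq % 2) == 0;
}
```

The caller only records that some interrupt was left over and keeps the pending flag for exactly those interrupts, so a later pass can retry them regardless of how the primitive decides.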

Cheers,
Andre.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* [PATCH v3 12/16] KVM: arm64: handle pending bit for LPIs in ITS emulation
@ 2015-10-12 14:17       ` Andre Przywara
  0 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2015-10-12 14:17 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Pavel,

On 12/10/15 08:40, Pavel Fedin wrote:
>  Hello!
> 
>> -----Original Message-----
>> From: kvm-owner at vger.kernel.org [mailto:kvm-owner at vger.kernel.org] On Behalf Of Andre Przywara
>> Sent: Wednesday, October 07, 2015 5:55 PM
>> To: marc.zyngier at arm.com; christoffer.dall at linaro.org
>> Cc: eric.auger at linaro.org; p.fedin at samsung.com; kvmarm at lists.cs.columbia.edu; linux-arm-
>> kernel at lists.infradead.org; kvm at vger.kernel.org
>> Subject: [PATCH v3 12/16] KVM: arm64: handle pending bit for LPIs in ITS emulation
>>
>> As the actual LPI number in a guest can be quite high, but is mostly
>> assigned using a very sparse allocation scheme, bitmaps and arrays
>> for storing the virtual interrupt status are a waste of memory.
>> We use our equivalent of the "Interrupt Translation Table Entry"
>> (ITTE) to hold this extra status information for a virtual LPI.
>> As the normal VGIC code cannot use its fancy bitmaps to manage
>> pending interrupts, we provide a hook in the VGIC code to let the
>> ITS emulation handle the list register queueing itself.
>> LPIs are located in a separate number range (>=8192), so
>> distinguishing them is easy. With LPIs being only edge-triggered, we
>> get away with a less complex IRQ handling.
>> We extend the number of bits for storing the IRQ number in our
>> LR struct to 16 to cover the LPI numbers we support as well.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>> Changelog v2..v3:
>> - extend LR data structure to hold 16-bit wide IRQ IDs
>> - only clear pending bit if IRQ could be queued
>> - adapt __kvm_vgic_sync_hwstate() to upstream changes
>>
>>  include/kvm/arm_vgic.h      |  4 +-
>>  virt/kvm/arm/its-emul.c     | 75 ++++++++++++++++++++++++++++++++++++
>>  virt/kvm/arm/its-emul.h     |  3 ++
>>  virt/kvm/arm/vgic-v3-emul.c |  2 +
>>  virt/kvm/arm/vgic.c         | 93 +++++++++++++++++++++++++++++++--------------
>>  5 files changed, 148 insertions(+), 29 deletions(-)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index c3eb414..035911f 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -95,7 +95,7 @@ enum vgic_type {
>>  #define LR_HW			(1 << 3)
>>
>>  struct vgic_lr {
>> -	unsigned irq:10;
>> +	unsigned irq:16;
>>  	union {
>>  		unsigned hwirq:10;
>>  		unsigned source:3;
>> @@ -147,6 +147,8 @@ struct vgic_vm_ops {
>>  	int	(*init_model)(struct kvm *);
>>  	void	(*destroy_model)(struct kvm *);
>>  	int	(*map_resources)(struct kvm *, const struct vgic_params *);
>> +	bool	(*queue_lpis)(struct kvm_vcpu *);
>> +	void	(*unqueue_lpi)(struct kvm_vcpu *, int irq);
>>  };
>>
>>  struct vgic_io_device {
>> diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
>> index bab8033..8349970 100644
>> --- a/virt/kvm/arm/its-emul.c
>> +++ b/virt/kvm/arm/its-emul.c
>> @@ -59,8 +59,27 @@ struct its_itte {
>>  	struct its_collection *collection;
>>  	u32 lpi;
>>  	u32 event_id;
>> +	bool enabled;
>> +	unsigned long *pending;
>>  };
>>
>> +/* To be used as an iterator this macro misses the enclosing parentheses */
>> +#define for_each_lpi(dev, itte, kvm) \
>> +	list_for_each_entry(dev, &(kvm)->arch.vgic.its.device_list, dev_list) \
>> +		list_for_each_entry(itte, &(dev)->itt, itte_list)
>> +
>> +static struct its_itte *find_itte_by_lpi(struct kvm *kvm, int lpi)
>> +{
>> +	struct its_device *device;
>> +	struct its_itte *itte;
>> +
>> +	for_each_lpi(device, itte, kvm) {
>> +		if (itte->lpi == lpi)
>> +			return itte;
>> +	}
>> +	return NULL;
>> +}
>> +
>>  #define BASER_BASE_ADDRESS(x) ((x) & 0xfffffffff000ULL)
>>
>>  /* The distributor lock is held by the VGIC MMIO handler. */
>> @@ -154,9 +173,65 @@ static bool handle_mmio_gits_idregs(struct kvm_vcpu *vcpu,
>>  	return false;
>>  }
>>
>> +/*
>> + * Find all enabled and pending LPIs and queue them into the list
>> + * registers.
>> + * The dist lock is held by the caller.
>> + */
>> +bool vits_queue_lpis(struct kvm_vcpu *vcpu)
>> +{
>> +	struct vgic_its *its = &vcpu->kvm->arch.vgic.its;
>> +	struct its_device *device;
>> +	struct its_itte *itte;
>> +	bool ret = true;
>> +
>> +	if (!vgic_has_its(vcpu->kvm))
>> +		return true;
>> +	if (!its->enabled || !vcpu->kvm->arch.vgic.lpis_enabled)
>> +		return true;
>> +
>> +	spin_lock(&its->lock);
>> +	for_each_lpi(device, itte, vcpu->kvm) {
>> +		if (!itte->enabled || !test_bit(vcpu->vcpu_id, itte->pending))
>> +			continue;
>> +
>> +		if (!itte->collection)
>> +			continue;
>> +
>> +		if (itte->collection->target_addr != vcpu->vcpu_id)
>> +			continue;
>> +
>> +
>> +		if (vgic_queue_irq(vcpu, 0, itte->lpi))
>> +			__clear_bit(vcpu->vcpu_id, itte->pending);
>> +		else
>> +			ret = false;
> 
>  Shouldn't we also have 'break' here? If vgic_queue_irq() returns false, this means we have no more
> LRs to use, therefore it makes no sense to keep iterating.

I consider this too much optimization at this point.
vgic_queue_irq() only reports success for this one interrupt, so I'd
rather not make assumptions about other IRQs (we could piggy-back those,
for instance).
Even if not, I'd prefer not to break the abstraction here.
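
To make the trade-off concrete, here is a minimal user-space sketch of
the loop shape under discussion. Everything in it is hypothetical
(struct sketch_lpi, struct sketch_lrs, try_queue(), queue_all(), the
fixed slot count): it only models "keep iterating and remember that
something did not fit", with piggy-backing assumed to mean reusing a
slot that already holds the same interrupt. It is not the real
vgic_queue_irq().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NR_SLOTS 2	/* hypothetical number of list registers */

/* Hypothetical stand-in for an ITTE: one LPI with a pending flag. */
struct sketch_lpi {
	int  intid;
	bool pending;
};

/* Hypothetical stand-in for the list registers. */
struct sketch_lrs {
	int intid[SKETCH_NR_SLOTS];
	int used;
};

/*
 * Reports success for this one interrupt only. Assumption: an
 * interrupt that already sits in a slot can be piggy-backed onto it
 * and does not need a free slot.
 */
static bool try_queue(struct sketch_lrs *lrs, int intid)
{
	int i;

	assert(lrs != NULL);
	for (i = 0; i < lrs->used; i++)
		if (lrs->intid[i] == intid)
			return true;	/* piggy-back on the existing slot */
	if (lrs->used >= SKETCH_NR_SLOTS)
		return false;		/* no free slot for this one */
	lrs->intid[lrs->used++] = intid;
	return true;
}

/*
 * Walk all LPIs without breaking out on the first failure. Returns
 * true only if every pending LPI could be queued.
 */
static bool queue_all(struct sketch_lrs *lrs, struct sketch_lpi *lpis,
		      size_t n)
{
	bool all_queued = true;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!lpis[i].pending)
			continue;
		if (try_queue(lrs, lpis[i].intid))
			lpis[i].pending = false;
		else
			all_queued = false;	/* remember, keep going */
	}
	return all_queued;
}
```

With both slots full, a first LPI that needs a fresh slot fails, but a
later LPI that is already queued still succeeds. An early break would
leave that second one pending, which is the assumption about other
IRQs that the loop avoids making.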

Cheers,
Andre.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* RE: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2015-10-12 14:12     ` Andre Przywara
@ 2015-10-12 15:18       ` Pavel Fedin
  -1 siblings, 0 replies; 101+ messages in thread
From: Pavel Fedin @ 2015-10-12 15:18 UTC (permalink / raw)
  To: 'Andre Przywara', 'Christoffer Dall'
  Cc: marc.zyngier, eric.auger, kvmarm, linux-arm-kernel, kvm

 Hello!

> Also what is the status of Eric's IRQ routing support? Should this go in
> first now?

 I'd say without vITS there's nothing to use IRQ routing with. It could go in and just lie around
silently, so that it's not forgotten, but for example current qemu just knows that with GICv2m it
should use a hardcoded linear MSI->SPI mapping.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



^ permalink raw reply	[flat|nested] 101+ messages in thread

* RE: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2015-10-07 14:55 ` Andre Przywara
@ 2015-10-13 15:46   ` Pavel Fedin
  -1 siblings, 0 replies; 101+ messages in thread
From: Pavel Fedin @ 2015-10-13 15:46 UTC (permalink / raw)
  To: 'Andre Przywara', marc.zyngier, christoffer.dall
  Cc: kvm, kvmarm, linux-arm-kernel

 Hello!

 I already suggested one set of fixes on top of the vITS series, and here is another one. It reconciles the
series with the spurious interrupt fix, and adds the missing check in vgic_retire_disabled_irqs(), which was
removed in the original v3 series.
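
 For readers following the vgic_retire_disabled_irqs() part: the check relies on the GICv3 INTID layout,
where LPIs start at INTID 8192. Below is a minimal sketch of that numbering. The names (SKETCH_*,
classify_intid(), intid_is_lpi()) are hypothetical and do not exist in the kernel; the sketch only
covers the base GICv3 ranges.

```c
#include <stdbool.h>

/* Base GICv3 INTID ranges. All identifiers here are hypothetical. */
#define SKETCH_NR_SGIS		16	/* INTIDs 0..15 */
#define SKETCH_NR_PRIVATE	32	/* INTIDs 16..31 are PPIs */
#define SKETCH_SPI_END		1020	/* INTIDs 32..1019 are SPIs */
#define SKETCH_LPI_OFFSET	8192	/* LPIs start here */

enum intid_class {
	INTID_SGI,
	INTID_PPI,
	INTID_SPI,
	INTID_OTHER,	/* special or reserved in this numbering */
	INTID_LPI,
};

static enum intid_class classify_intid(unsigned int intid)
{
	if (intid < SKETCH_NR_SGIS)
		return INTID_SGI;
	if (intid < SKETCH_NR_PRIVATE)
		return INTID_PPI;
	if (intid < SKETCH_SPI_END)
		return INTID_SPI;
	if (intid < SKETCH_LPI_OFFSET)
		return INTID_OTHER;
	return INTID_LPI;
}

/* Same comparison as the "vlr.irq >= 8192" test in the patch. */
static bool intid_is_lpi(unsigned int intid)
{
	return classify_intid(intid) == INTID_LPI;
}
```

 Skipping such INTIDs early keeps LPI numbers away from helpers that are presumably only sized for SGIs,
PPIs and SPIs; that motivation is an assumption on top of what the patch states.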
---
From bdbedc35a4dc9bc258b21792cf734aa3b2383dff Mon Sep 17 00:00:00 2001
From: Pavel Fedin <p.fedin@samsung.com>
Date: Tue, 13 Oct 2015 15:24:19 +0300
Subject: [PATCH] KVM: arm/arm64: Fix LPI loss

compute_pending_for_cpu() should return true if there's something pending
on the given vCPU. This is used in order to correctly set the
dist->irq_pending_on_cpu flag. However, the function knows nothing about
LPIs, which can contribute to LPI loss.

This patch fixes it by introducing the vits_check_lpis() function, which
returns true if there's any pending LPI. Also, some refactoring is done,
wrapping some repeated checks into helper functions.

Additionally, vgic_retire_disabled_irqs() is fixed to correctly skip LPIs.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
---
 include/kvm/arm_vgic.h      |  1 +
 virt/kvm/arm/its-emul.c     | 46 +++++++++++++++++++++++++++++++++++----------
 virt/kvm/arm/its-emul.h     |  1 +
 virt/kvm/arm/vgic-v3-emul.c |  1 +
 virt/kvm/arm/vgic.c         | 19 +++++++++++++++++--
 5 files changed, 56 insertions(+), 12 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 39113b9..21c8427 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -148,6 +148,7 @@ struct vgic_vm_ops {
 	int	(*map_resources)(struct kvm *, const struct vgic_params *);
 	bool	(*queue_lpis)(struct kvm_vcpu *);
 	void	(*unqueue_lpi)(struct kvm_vcpu *, int irq);
+	bool	(*check_lpis)(struct kvm_vcpu *);
 	int	(*inject_msi)(struct kvm *, struct kvm_msi *);
 };
 
diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
index b1d61df..2fcd844 100644
--- a/virt/kvm/arm/its-emul.c
+++ b/virt/kvm/arm/its-emul.c
@@ -381,6 +381,18 @@ out_unlock:
 	return ret;
 }
 
+static bool its_is_enabled(struct kvm *kvm)
+{
+	return vgic_has_its(kvm) && kvm->arch.vgic.its.enabled &&
+	       kvm->arch.vgic.lpis_enabled;
+}
+
+static bool lpi_is_pending(struct its_itte *itte, u32 vcpu_id)
+{
+	return itte->enabled && test_bit(vcpu_id, itte->pending) &&
+	       itte->collection && (itte->collection->target_addr == vcpu_id);
+}
+
 /*
  * Find all enabled and pending LPIs and queue them into the list
  * registers.
@@ -393,20 +405,12 @@ bool vits_queue_lpis(struct kvm_vcpu *vcpu)
 	struct its_itte *itte;
 	bool ret = true;
 
-	if (!vgic_has_its(vcpu->kvm))
-		return true;
-	if (!its->enabled || !vcpu->kvm->arch.vgic.lpis_enabled)
+	if (!its_is_enabled(vcpu->kvm))
 		return true;
 
 	spin_lock(&its->lock);
 	for_each_lpi(device, itte, vcpu->kvm) {
-		if (!itte->enabled || !test_bit(vcpu->vcpu_id, itte->pending))
-			continue;
-
-		if (!itte->collection)
-			continue;
-
-		if (itte->collection->target_addr != vcpu->vcpu_id)
+		if (!lpi_is_pending(itte, vcpu->vcpu_id))
 			continue;
 
 
@@ -436,6 +440,28 @@ void vits_unqueue_lpi(struct kvm_vcpu *vcpu, int lpi)
 	spin_unlock(&its->lock);
 }
 
+bool vits_check_lpis(struct kvm_vcpu *vcpu)
+{
+	struct vgic_its *its = &vcpu->kvm->arch.vgic.its;
+	struct its_device *device;
+	struct its_itte *itte;
+	bool ret = false;
+
+	if (!its_is_enabled(vcpu->kvm))
+		return false;
+
+	spin_lock(&its->lock);
+	for_each_lpi(device, itte, vcpu->kvm) {
+		ret = lpi_is_pending(itte, vcpu->vcpu_id);
+		if (ret)
+			goto out;
+	}
+
+out:
+	spin_unlock(&its->lock);
+	return ret;
+}
+
 static void its_free_itte(struct its_itte *itte)
 {
 	list_del(&itte->itte_list);
diff --git a/virt/kvm/arm/its-emul.h b/virt/kvm/arm/its-emul.h
index 236f153..f7fa5f8 100644
--- a/virt/kvm/arm/its-emul.h
+++ b/virt/kvm/arm/its-emul.h
@@ -41,6 +41,7 @@ int vits_inject_msi(struct kvm *kvm, struct kvm_msi *msi);
 
 bool vits_queue_lpis(struct kvm_vcpu *vcpu);
 void vits_unqueue_lpi(struct kvm_vcpu *vcpu, int irq);
+bool vits_check_lpis(struct kvm_vcpu *vcpu);
 
 #define E_ITS_MOVI_UNMAPPED_INTERRUPT		0x010107
 #define E_ITS_MOVI_UNMAPPED_COLLECTION		0x010109
diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
index 798f256..25463d0 100644
--- a/virt/kvm/arm/vgic-v3-emul.c
+++ b/virt/kvm/arm/vgic-v3-emul.c
@@ -966,6 +966,7 @@ void vgic_v3_init_emulation(struct kvm *kvm)
 	dist->vm_ops.inject_msi = vits_inject_msi;
 	dist->vm_ops.queue_lpis = vits_queue_lpis;
 	dist->vm_ops.unqueue_lpi = vits_unqueue_lpi;
+	dist->vm_ops.check_lpis = vits_check_lpis;
 
 	dist->vgic_dist_base = VGIC_ADDR_UNDEF;
 	dist->vgic_redist_base = VGIC_ADDR_UNDEF;
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 5d0f6ee..796964a 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -111,6 +111,14 @@ static void vgic_unqueue_lpi(struct kvm_vcpu *vcpu, int irq)
 		vcpu->kvm->arch.vgic.vm_ops.unqueue_lpi(vcpu, irq);
 }
 
+static bool vgic_check_lpis(struct kvm_vcpu *vcpu)
+{
+	if (vcpu->kvm->arch.vgic.vm_ops.check_lpis)
+		return vcpu->kvm->arch.vgic.vm_ops.check_lpis(vcpu);
+	else
+		return false;
+}
+
 int kvm_vgic_map_resources(struct kvm *kvm)
 {
 	return kvm->arch.vgic.vm_ops.map_resources(kvm, vgic);
@@ -1036,8 +1044,11 @@ static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
 
 	pending_private = find_first_bit(pend_percpu, VGIC_NR_PRIVATE_IRQS);
 	pending_shared = find_first_bit(pend_shared, nr_shared);
-	return (pending_private < VGIC_NR_PRIVATE_IRQS ||
-		pending_shared < vgic_nr_shared_irqs(dist));
+	if (pending_private < VGIC_NR_PRIVATE_IRQS ||
+		pending_shared < vgic_nr_shared_irqs(dist))
+		return true;
+
+	return vgic_check_lpis(vcpu);
 }
 
 /*
@@ -1148,6 +1159,10 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
 	for_each_clear_bit(lr, elrsr_ptr, vgic->nr_lr) {
 		struct vgic_lr vlr = vgic_get_lr(vcpu, lr);
 
+		/* We don't care about LPIs here */
+		if (vlr.irq >= 8192)
+			continue;
+
 		if (!vgic_irq_is_enabled(vcpu, vlr.irq)) {
 			vgic_retire_lr(lr, vcpu);
 			if (vgic_irq_is_queued(vcpu, vlr.irq))
-- 
2.4.4
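
 The decision logic the patch adds to compute_pending_for_cpu() can be sketched in user-space C as
follows. This is a simplified model under stated assumptions: the sketch_* names are hypothetical, the
pending bitmaps are reduced to plain integers, the ITTE list is a flat array, and no locking is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, flattened view of one ITTE. */
struct sketch_lpi {
	bool     enabled;
	bool     mapped;	/* has a collection */
	uint32_t target_vcpu;	/* collection's target vCPU */
	uint64_t pending_mask;	/* bit n set: pending on vCPU n */
};

/* Mirrors the four conditions folded into lpi_is_pending(). */
static bool sketch_lpi_is_pending(const struct sketch_lpi *lpi,
				  uint32_t vcpu_id)
{
	assert(lpi != NULL && vcpu_id < 64);
	return lpi->enabled && lpi->mapped &&
	       lpi->target_vcpu == vcpu_id &&
	       (lpi->pending_mask & (UINT64_C(1) << vcpu_id)) != 0;
}

/* Stops at the first pending LPI, like vits_check_lpis(). */
static bool sketch_check_lpis(const struct sketch_lpi *lpis, size_t n,
			      uint32_t vcpu_id)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (sketch_lpi_is_pending(&lpis[i], vcpu_id))
			return true;
	return false;
}

/* Private and shared interrupts first, LPIs only as a fallback. */
static bool sketch_compute_pending(uint32_t pend_private,
				   uint64_t pend_shared,
				   const struct sketch_lpi *lpis, size_t n,
				   uint32_t vcpu_id)
{
	if (pend_private || pend_shared)
		return true;
	return sketch_check_lpis(lpis, n, vcpu_id);
}
```

 The LPI scan is only reached when nothing else is pending, so in this model the list walk is skipped
whenever a private or shared interrupt already answers the question.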


Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

^ permalink raw reply related	[flat|nested] 101+ messages in thread

* Re: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2015-10-12 15:18       ` Pavel Fedin
@ 2015-10-14  8:48         ` Eric Auger
  -1 siblings, 0 replies; 101+ messages in thread
From: Eric Auger @ 2015-10-14  8:48 UTC (permalink / raw)
  To: Pavel Fedin, 'Andre Przywara', 'Christoffer Dall'
  Cc: marc.zyngier, kvmarm, linux-arm-kernel, kvm

Hi Andre, Pavel
On 10/12/2015 05:18 PM, Pavel Fedin wrote:
>  Hello!
> 
>> Also what is the status of Eric's IRQ routing support? Should this go in
>> first now?
> 
>  I'd say without vITS there's nothing to use IRQ routing with. It could go in and just lay around
> silently, so that it's not forgotten, but for example current qemu just knows that with GICv2m it
> should use hardcoded linear MSI->SPI mapping.
Currently the gsi routing applies on top of ITS emulation series. I am
going to rebase it soon. It can go in 4.5 with ITS emulation series.

Best Regards

Eric
> 
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
> 
> 


^ permalink raw reply	[flat|nested] 101+ messages in thread

* RE: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2015-10-14  8:48         ` Eric Auger
@ 2015-10-14  8:50           ` Pavel Fedin
  -1 siblings, 0 replies; 101+ messages in thread
From: Pavel Fedin @ 2015-10-14  8:50 UTC (permalink / raw)
  To: 'Eric Auger', 'Andre Przywara',
	'Christoffer Dall'
  Cc: marc.zyngier, kvmarm, linux-arm-kernel, kvm

 Hello!

> Currently the gsi routing applies on top of ITS emulation series. I am
> going to rebase it soon. It can go in 4.5 with ITS emulation series.

 Ah, yes, of course, because it reuses API definitions. I forgot this.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



^ permalink raw reply	[flat|nested] 101+ messages in thread

* RE: [PATCH v3 14/16] KVM: arm64: implement ITS command queue command handlers
  2015-10-07 14:55   ` Andre Przywara
@ 2015-10-14 12:26     ` Pavel Fedin
  -1 siblings, 0 replies; 101+ messages in thread
From: Pavel Fedin @ 2015-10-14 12:26 UTC (permalink / raw)
  To: 'Andre Przywara', marc.zyngier, christoffer.dall
  Cc: eric.auger, kvmarm, linux-arm-kernel, kvm

 Hello!

> -----Original Message-----
> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] On Behalf Of Andre Przywara
> Sent: Wednesday, October 07, 2015 5:55 PM
> To: marc.zyngier@arm.com; christoffer.dall@linaro.org
> Cc: eric.auger@linaro.org; p.fedin@samsung.com; kvmarm@lists.cs.columbia.edu; linux-arm-
> kernel@lists.infradead.org; kvm@vger.kernel.org
> Subject: [PATCH v3 14/16] KVM: arm64: implement ITS command queue command handlers
> 
> The connection between a device, an event ID, the LPI number and the
> allocated CPU is stored in in-memory tables in a GICv3, but their
> format is not specified by the spec. Instead software uses a command
> queue in a ring buffer to let the ITS implementation use their own
> format.
> Implement handlers for the various ITS commands and let them store
> the requested relation into our own data structures.
> To avoid kmallocs inside the ITS spinlock, we preallocate possibly
> needed memory outside of the lock and free that if it turns out to
> be not needed (mostly error handling).
> Error handling is very basic at this point, as we don't have a good
> way of communicating errors to the guest (usually a SError).
> The INT command handler is missing at this point, as we gain the
> capability of actually injecting MSIs into the guest only later on.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
> Changelog v2..v3:
> - adjust handlers to new pendbaser/propbaser locking scheme
> - properly free ITTEs (including pending bitmap)
> - fix handling of unmapped collections
> 
>  include/linux/irqchip/arm-gic-v3.h |   5 +-
>  virt/kvm/arm/its-emul.c            | 502 ++++++++++++++++++++++++++++++++++++-
>  virt/kvm/arm/its-emul.h            |  11 +
>  3 files changed, 516 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
> index ef274a9..27c0e75 100644
> --- a/include/linux/irqchip/arm-gic-v3.h
> +++ b/include/linux/irqchip/arm-gic-v3.h
> @@ -255,7 +255,10 @@
>   */
>  #define GITS_CMD_MAPD			0x08
>  #define GITS_CMD_MAPC			0x09
> -#define GITS_CMD_MAPVI			0x0a
> +#define GITS_CMD_MAPTI			0x0a
> +/* older GIC documentation used MAPVI for this command */
> +#define GITS_CMD_MAPVI			GITS_CMD_MAPTI
> +#define GITS_CMD_MAPI			0x0b
>  #define GITS_CMD_MOVI			0x01
>  #define GITS_CMD_DISCARD		0x0f
>  #define GITS_CMD_INV			0x0c
> diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
> index 7a8c5db..642effb 100644
> --- a/virt/kvm/arm/its-emul.c
> +++ b/virt/kvm/arm/its-emul.c
> @@ -22,6 +22,7 @@
>  #include <linux/kvm_host.h>
>  #include <linux/interrupt.h>
>  #include <linux/list.h>
> +#include <linux/slab.h>
> 
>  #include <linux/irqchip/arm-gic-v3.h>
>  #include <kvm/arm_vgic.h>
> @@ -64,6 +65,34 @@ struct its_itte {
>  	unsigned long *pending;
>  };
> 
> +static struct its_device *find_its_device(struct kvm *kvm, u32 device_id)
> +{
> +	struct vgic_its *its = &kvm->arch.vgic.its;
> +	struct its_device *device;
> +
> +	list_for_each_entry(device, &its->device_list, dev_list)
> +		if (device_id == device->device_id)
> +			return device;
> +
> +	return NULL;
> +}
> +
> +static struct its_itte *find_itte(struct kvm *kvm, u32 device_id, u32 event_id)
> +{
> +	struct its_device *device;
> +	struct its_itte *itte;
> +
> +	device = find_its_device(kvm, device_id);
> +	if (device == NULL)
> +		return NULL;
> +
> +	list_for_each_entry(itte, &device->itt, itte_list)
> +		if (itte->event_id == event_id)
> +			return itte;
> +
> +	return NULL;
> +}
> +
>  /* To be used as an iterator this macro misses the enclosing parentheses */
>  #define for_each_lpi(dev, itte, kvm) \
>  	list_for_each_entry(dev, &(kvm)->arch.vgic.its.device_list, dev_list) \
> @@ -81,6 +110,19 @@ static struct its_itte *find_itte_by_lpi(struct kvm *kvm, int lpi)
>  	return NULL;
>  }
> 
> +static struct its_collection *find_collection(struct kvm *kvm, int coll_id)
> +{
> +	struct its_collection *collection;
> +
> +	list_for_each_entry(collection, &kvm->arch.vgic.its.collection_list,
> +			    coll_list) {
> +		if (coll_id == collection->collection_id)
> +			return collection;
> +	}
> +
> +	return NULL;
> +}
> +
>  #define LPI_PROP_ENABLE_BIT(p)	((p) & LPI_PROP_ENABLED)
>  #define LPI_PROP_PRIORITY(p)	((p) & 0xfc)
> 
> @@ -352,13 +394,471 @@ static void its_free_itte(struct its_itte *itte)
>  	kfree(itte);
>  }
> 
> +static u64 its_cmd_mask_field(u64 *its_cmd, int word, int shift, int size)
> +{
> +	return (le64_to_cpu(its_cmd[word]) >> shift) & (BIT_ULL(size) - 1);
> +}
> +
> +#define its_cmd_get_command(cmd)	its_cmd_mask_field(cmd, 0,  0,  8)
> +#define its_cmd_get_deviceid(cmd)	its_cmd_mask_field(cmd, 0, 32, 32)
> +#define its_cmd_get_id(cmd)		its_cmd_mask_field(cmd, 1,  0, 32)
> +#define its_cmd_get_physical_id(cmd)	its_cmd_mask_field(cmd, 1, 32, 32)
> +#define its_cmd_get_collection(cmd)	its_cmd_mask_field(cmd, 2,  0, 16)
> +#define its_cmd_get_target_addr(cmd)	its_cmd_mask_field(cmd, 2, 16, 32)
> +#define its_cmd_get_validbit(cmd)	its_cmd_mask_field(cmd, 2, 63,  1)
> +
> +/* The DISCARD command frees an Interrupt Translation Table Entry (ITTE). */
> +static int vits_cmd_handle_discard(struct kvm *kvm, u64 *its_cmd)
> +{
> +	struct vgic_its *its = &kvm->arch.vgic.its;
> +	u32 device_id;
> +	u32 event_id;
> +	struct its_itte *itte;
> +	int ret = E_ITS_DISCARD_UNMAPPED_INTERRUPT;
> +
> +	device_id = its_cmd_get_deviceid(its_cmd);
> +	event_id = its_cmd_get_id(its_cmd);
> +
> +	spin_lock(&its->lock);
> +	itte = find_itte(kvm, device_id, event_id);
> +	if (itte && itte->collection) {
> +		/*
> +		 * Though the spec talks about removing the pending state, we
> +		 * don't bother here since we clear the ITTE anyway and the
> +		 * pending state is a property of the ITTE struct.
> +		 */
> +		its_free_itte(itte);
> +		ret = 0;
> +	}

 Are you sure that DISCARD should remove the entry? The doc says in 6.3.4:
--- cut ---
This command translates the event defined by EventID and DeviceID and instructs the appropriate Redistributor to
remove the pending state of the interrupt. It also ensures that any caching in the Redistributors associated with a
specific EventID is consistent with the configuration held in memory.
--- cut ---
 So, it seems to behave like CLEAR + INV.
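
 To spell out the difference between the two readings, here is a minimal sketch. All names are
hypothetical (struct sketch_entry, discard_as_clear_inv(), discard_as_unmap(), deliver_event()), and
the entry is reduced to three fields. The sketch takes no side on which reading the architecture
intends; it only shows where the observable behaviour diverges.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified entry for one (DeviceID, EventID) pair. */
struct sketch_entry {
	bool    mapped;		/* translation exists */
	bool    pending;	/* interrupt is pending */
	uint8_t cached_prop;	/* cached copy of the configuration byte */
};

/* Reading 1 (the quoted paragraph): clear pending, resync the cache. */
static void discard_as_clear_inv(struct sketch_entry *e,
				 uint8_t prop_in_memory)
{
	assert(e != NULL);
	if (!e->mapped)
		return;
	e->pending = false;			/* CLEAR part */
	e->cached_prop = prop_in_memory;	/* INV part */
}

/* Reading 2 (the patch): drop the translation and its pending state. */
static void discard_as_unmap(struct sketch_entry *e)
{
	assert(e != NULL);
	e->pending = false;
	e->mapped = false;
}

/* A later event is only delivered if the translation still exists. */
static bool deliver_event(struct sketch_entry *e)
{
	assert(e != NULL);
	if (!e->mapped)
		return false;
	e->pending = true;
	return true;
}
```

 Under the first reading a later event with the same EventID is still translated and delivered; under
the second it is dropped until the guest maps the event again.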

> +
> +	spin_unlock(&its->lock);
> +	return ret;
> +}
> +
> +/* The MOVI command moves an ITTE to a different collection. */
> +static int vits_cmd_handle_movi(struct kvm *kvm, u64 *its_cmd)
> +{
> +	struct vgic_its *its = &kvm->arch.vgic.its;
> +	u32 device_id = its_cmd_get_deviceid(its_cmd);
> +	u32 event_id = its_cmd_get_id(its_cmd);
> +	u32 coll_id = its_cmd_get_collection(its_cmd);
> +	struct its_itte *itte;
> +	struct its_collection *collection;
> +	int ret = 0;
> +
> +	spin_lock(&its->lock);
> +	itte = find_itte(kvm, device_id, event_id);
> +	if (!itte) {
> +		ret = E_ITS_MOVI_UNMAPPED_INTERRUPT;
> +		goto out_unlock;
> +	}
> +	if (!its_is_collection_mapped(itte->collection)) {
> +		ret = E_ITS_MOVI_UNMAPPED_COLLECTION;
> +		goto out_unlock;
> +	}
> +
> +	collection = find_collection(kvm, coll_id);
> +	if (!its_is_collection_mapped(collection)) {
> +		ret = E_ITS_MOVI_UNMAPPED_COLLECTION;
> +		goto out_unlock;
> +	}
> +
> +	if (test_and_clear_bit(itte->collection->target_addr, itte->pending))
> +		__set_bit(collection->target_addr, itte->pending);
> +
> +	itte->collection = collection;
> +out_unlock:
> +	spin_unlock(&its->lock);
> +	return ret;
> +}
> +
> +static void vits_init_collection(struct kvm *kvm,
> +				 struct its_collection *collection,
> +				 u32 coll_id)
> +{
> +	collection->collection_id = coll_id;
> +	collection->target_addr = COLLECTION_NOT_MAPPED;
> +
> +	list_add_tail(&collection->coll_list,
> +		&kvm->arch.vgic.its.collection_list);
> +}
> +
> +/* The MAPTI and MAPI commands map LPIs to ITTEs. */
> +static int vits_cmd_handle_mapi(struct kvm *kvm, u64 *its_cmd, u8 cmd)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	u32 device_id = its_cmd_get_deviceid(its_cmd);
> +	u32 event_id = its_cmd_get_id(its_cmd);
> +	u32 coll_id = its_cmd_get_collection(its_cmd);
> +	struct its_itte *itte, *new_itte;
> +	struct its_device *device;
> +	struct its_collection *collection, *new_coll;
> +	int lpi_nr;
> +	int ret = 0;
> +
> +	/* Preallocate possibly needed memory here outside of the lock */
> +	new_coll = kmalloc(sizeof(struct its_collection), GFP_KERNEL);
> +	new_itte = kzalloc(sizeof(struct its_itte), GFP_KERNEL);
> +	if (new_itte)
> +		new_itte->pending = kcalloc(BITS_TO_LONGS(dist->nr_cpus),
> +					    sizeof(long), GFP_KERNEL);
> +
> +	spin_lock(&dist->its.lock);
> +
> +	device = find_its_device(kvm, device_id);
> +	if (!device) {
> +		ret = E_ITS_MAPTI_UNMAPPED_DEVICE;
> +		goto out_unlock;
> +	}
> +
> +	collection = find_collection(kvm, coll_id);
> +	if (!collection && !new_coll) {
> +		ret = -ENOMEM;
> +		goto out_unlock;
> +	}
> +
> +	if (cmd == GITS_CMD_MAPTI)
> +		lpi_nr = its_cmd_get_physical_id(its_cmd);
> +	else
> +		lpi_nr = event_id;
> +	if (lpi_nr < GIC_LPI_OFFSET ||
> +	    lpi_nr >= nr_idbits_propbase(dist->propbaser)) {
> +		ret = E_ITS_MAPTI_PHYSICALID_OOR;
> +		goto out_unlock;
> +	}
> +
> +	itte = find_itte(kvm, device_id, event_id);
> +	if (!itte) {
> +		if (!new_itte || !new_itte->pending) {
> +			ret = -ENOMEM;
> +			goto out_unlock;
> +		}
> +		itte = new_itte;
> +
> +		itte->event_id	= event_id;
> +		list_add_tail(&itte->itte_list, &device->itt);
> +	} else {
> +		if (new_itte)
> +			kfree(new_itte->pending);
> +		kfree(new_itte);
> +	}
> +
> +	if (!collection) {
> +		collection = new_coll;
> +		vits_init_collection(kvm, collection, coll_id);
> +	} else {
> +		kfree(new_coll);
> +	}
> +
> +	itte->collection = collection;
> +	itte->lpi = lpi_nr;
> +
> +out_unlock:
> +	spin_unlock(&dist->its.lock);
> +	if (ret) {
> +		kfree(new_coll);
> +		if (new_itte)
> +			kfree(new_itte->pending);
> +		kfree(new_itte);
> +	}
> +	return ret;
> +}
> +
> +static void vits_unmap_device(struct kvm *kvm, struct its_device *device)
> +{
> +	struct its_itte *itte, *temp;
> +
> +	/*
> +	 * The spec says that unmapping a device with still valid
> +	 * ITTEs associated is UNPREDICTABLE. We remove all ITTEs,
> +	 * since we cannot leave the memory unreferenced.
> +	 */
> +	list_for_each_entry_safe(itte, temp, &device->itt, itte_list)
> +		its_free_itte(itte);
> +
> +	list_del(&device->dev_list);
> +	kfree(device);
> +}
> +
> +/* MAPD maps or unmaps a device ID to Interrupt Translation Tables (ITTs). */
> +static int vits_cmd_handle_mapd(struct kvm *kvm, u64 *its_cmd)
> +{
> +	struct vgic_its *its = &kvm->arch.vgic.its;
> +	bool valid = its_cmd_get_validbit(its_cmd);
> +	u32 device_id = its_cmd_get_deviceid(its_cmd);
> +	struct its_device *device, *new_device = NULL;
> +
> +	/* We preallocate memory outside of the lock here */
> +	if (valid) {
> +		new_device = kzalloc(sizeof(struct its_device), GFP_KERNEL);
> +		if (!new_device)
> +			return -ENOMEM;
> +	}
> +
> +	spin_lock(&its->lock);
> +
> +	device = find_its_device(kvm, device_id);
> +	if (device)
> +		vits_unmap_device(kvm, device);
> +
> +	/*
> +	 * The spec does not say whether unmapping a not-mapped device
> +	 * is an error, so we are done in any case.
> +	 */
> +	if (!valid)
> +		goto out_unlock;
> +
> +	device = new_device;
> +
> +	device->device_id = device_id;
> +	INIT_LIST_HEAD(&device->itt);
> +
> +	list_add_tail(&device->dev_list,
> +		      &kvm->arch.vgic.its.device_list);
> +
> +out_unlock:
> +	spin_unlock(&its->lock);
> +	return 0;
> +}
> +
> +/* The MAPC command maps collection IDs to redistributors. */
> +static int vits_cmd_handle_mapc(struct kvm *kvm, u64 *its_cmd)
> +{
> +	struct vgic_its *its = &kvm->arch.vgic.its;
> +	u16 coll_id;
> +	u32 target_addr;
> +	struct its_collection *collection, *new_coll = NULL;
> +	bool valid;
> +
> +	valid = its_cmd_get_validbit(its_cmd);
> +	coll_id = its_cmd_get_collection(its_cmd);
> +	target_addr = its_cmd_get_target_addr(its_cmd);
> +
> +	if (target_addr >= atomic_read(&kvm->online_vcpus))
> +		return E_ITS_MAPC_PROCNUM_OOR;
> +
> +	/* We preallocate memory outside of the lock here */
> +	if (valid) {
> +		new_coll = kmalloc(sizeof(struct its_collection), GFP_KERNEL);
> +		if (!new_coll)
> +			return -ENOMEM;
> +	}
> +
> +	spin_lock(&its->lock);
> +	collection = find_collection(kvm, coll_id);
> +
> +	if (!valid) {
> +		struct its_device *device;
> +		struct its_itte *itte;
> +		/*
> +		 * Clearing the mapping for that collection ID removes the
> +		 * entry from the list. If there wasn't any before, we can
> +		 * go home early.
> +		 */
> +		if (!collection)
> +			goto out_unlock;
> +
> +		for_each_lpi(device, itte, kvm)
> +			if (itte->collection &&
> +			    itte->collection->collection_id == coll_id)
> +				itte->collection = NULL;
> +
> +		list_del(&collection->coll_list);
> +		kfree(collection);
> +	} else {
> +		if (!collection)
> +			collection = new_coll;
> +		else
> +			kfree(new_coll);
> +
> +		vits_init_collection(kvm, collection, coll_id);
> +		collection->target_addr = target_addr;
> +	}
> +
> +out_unlock:
> +	spin_unlock(&its->lock);
> +	return 0;
> +}
> +
> +/* The CLEAR command removes the pending state for a particular LPI. */
> +static int vits_cmd_handle_clear(struct kvm *kvm, u64 *its_cmd)
> +{
> +	struct vgic_its *its = &kvm->arch.vgic.its;
> +	u32 device_id;
> +	u32 event_id;
> +	struct its_itte *itte;
> +	int ret = 0;
> +
> +	device_id = its_cmd_get_deviceid(its_cmd);
> +	event_id = its_cmd_get_id(its_cmd);
> +
> +	spin_lock(&its->lock);
> +
> +	itte = find_itte(kvm, device_id, event_id);
> +	if (!itte) {
> +		ret = E_ITS_CLEAR_UNMAPPED_INTERRUPT;
> +		goto out_unlock;
> +	}
> +
> +	if (its_is_collection_mapped(itte->collection))
> +		__clear_bit(itte->collection->target_addr, itte->pending);
> +
> +out_unlock:
> +	spin_unlock(&its->lock);
> +	return ret;
> +}
> +
> +/* The INV command syncs the configuration bits from the memory tables. */
> +static int vits_cmd_handle_inv(struct kvm *kvm, u64 *its_cmd)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	u32 device_id;
> +	u32 event_id;
> +	struct its_itte *itte, *new_itte;
> +	gpa_t propbase;
> +	int ret;
> +	u8 prop;
> +
> +	device_id = its_cmd_get_deviceid(its_cmd);
> +	event_id = its_cmd_get_id(its_cmd);
> +
> +	spin_lock(&dist->its.lock);
> +	itte = find_itte(kvm, device_id, event_id);
> +	spin_unlock(&dist->its.lock);
> +	if (!itte)
> +		return E_ITS_INV_UNMAPPED_INTERRUPT;
> +
> +	/*
> +	 * We cannot read from guest memory inside the spinlock, so we
> +	 * need to re-read our tables to learn whether the LPI number we are
> +	 * using is still valid.
> +	 */
> +	do {
> +		propbase = BASER_BASE_ADDRESS(dist->propbaser);
> +		ret = kvm_read_guest(kvm, propbase + itte->lpi - GIC_LPI_OFFSET,
> +				     &prop, 1);
> +		if (ret)
> +			return ret;
> +
> +		spin_lock(&dist->its.lock);
> +		new_itte = find_itte(kvm, device_id, event_id);
> +		if (new_itte->lpi != itte->lpi) {
> +			itte = new_itte;
> +			spin_unlock(&dist->its.lock);
> +			continue;
> +		}
> +		update_lpi_config(kvm, itte, prop);
> +		spin_unlock(&dist->its.lock);
> +	} while (0);
> +	return 0;
> +}
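
 A side note on the loop above, offered as an observation rather than a definitive diagnosis: in C, "continue" inside "do { ... } while (0)" jumps to the controlling expression, which is 0, so the body never runs a second time. If the intent is to retry when the LPI number has changed, the loop needs a condition that can become true. The user-space sketch below only counts how often the body runs; attempts_with_while0() and attempts_with_retry() are hypothetical names, not part of the series.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Mirrors the shape of the loop in vits_cmd_handle_inv(): the body asks
 * for a retry via "continue" on its first pass. Returns how many times
 * the body actually ran.
 */
static int attempts_with_while0(void)
{
	int attempts = 0;

	do {
		attempts++;
		if (attempts == 1)
			continue;	/* jumps to "while (0)", loop ends */
	} while (0);

	return attempts;
}

/* Same body, but with an explicit retry flag as the loop condition. */
static int attempts_with_retry(void)
{
	int attempts = 0;
	bool retry;

	do {
		retry = false;
		attempts++;
		assert(attempts < 3);	/* the retry must terminate */
		if (attempts == 1) {
			retry = true;	/* state changed, go around again */
			continue;
		}
	} while (retry);

	return attempts;
}
```

 The first variant runs its body once, the second one twice.
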
> +
> +/* The INVALL command requests flushing of all IRQ data in this collection. */
> +static int vits_cmd_handle_invall(struct kvm *kvm, u64 *its_cmd)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	u64 prop_base_reg, pend_base_reg;
> +	u32 coll_id = its_cmd_get_collection(its_cmd);
> +	struct its_collection *collection;
> +	struct kvm_vcpu *vcpu;
> +
> +	collection = find_collection(kvm, coll_id);
> +	if (!its_is_collection_mapped(collection))
> +		return E_ITS_INVALL_UNMAPPED_COLLECTION;
> +
> +	vcpu = kvm_get_vcpu(kvm, collection->target_addr);
> +
> +	spin_lock(&dist->lock);
> +	pend_base_reg = dist->pendbaser[vcpu->vcpu_id];
> +	prop_base_reg = dist->propbaser;
> +	spin_unlock(&dist->lock);
> +
> +	its_update_lpis_configuration(kvm, prop_base_reg);
> +	its_sync_lpi_pending_table(vcpu, pend_base_reg);
> +
> +	return 0;
> +}
> +
> +/* The MOVALL command moves all IRQs from one redistributor to another. */
> +static int vits_cmd_handle_movall(struct kvm *kvm, u64 *its_cmd)
> +{
> +	struct vgic_its *its = &kvm->arch.vgic.its;
> +	u32 target1_addr = its_cmd_get_target_addr(its_cmd);
> +	u32 target2_addr = its_cmd_mask_field(its_cmd, 3, 16, 32);
> +	struct its_collection *collection;
> +	struct its_device *device;
> +	struct its_itte *itte;
> +
> +	if (target1_addr >= atomic_read(&kvm->online_vcpus) ||
> +	    target2_addr >= atomic_read(&kvm->online_vcpus))
> +		return E_ITS_MOVALL_PROCNUM_OOR;
> +
> +	if (target1_addr == target2_addr)
> +		return 0;
> +
> +	spin_lock(&its->lock);
> +	for_each_lpi(device, itte, kvm) {
> +		/* remap all collections mapped to target address 1 */
> +		collection = itte->collection;
> +		if (collection && collection->target_addr == target1_addr)
> +			collection->target_addr = target2_addr;
> +
> +		/* move pending state if LPI is affected */
> +		if (test_and_clear_bit(target1_addr, itte->pending))
> +			__set_bit(target2_addr, itte->pending);
> +	}
> +
> +	spin_unlock(&its->lock);
> +	return 0;
> +}
> +
>  /*
>   * This function is called with both the ITS and the distributor lock dropped,
>   * so the actual command handlers must take the respective locks when needed.
>   */
>  static int vits_handle_command(struct kvm_vcpu *vcpu, u64 *its_cmd)
>  {
> -	return -ENODEV;
> +	u8 cmd = its_cmd_get_command(its_cmd);
> +	int ret = -ENODEV;
> +
> +	switch (cmd) {
> +	case GITS_CMD_MAPD:
> +		ret = vits_cmd_handle_mapd(vcpu->kvm, its_cmd);
> +		break;
> +	case GITS_CMD_MAPC:
> +		ret = vits_cmd_handle_mapc(vcpu->kvm, its_cmd);
> +		break;
> +	case GITS_CMD_MAPI:
> +		ret = vits_cmd_handle_mapi(vcpu->kvm, its_cmd, cmd);
> +		break;
> +	case GITS_CMD_MAPTI:
> +		ret = vits_cmd_handle_mapi(vcpu->kvm, its_cmd, cmd);
> +		break;
> +	case GITS_CMD_MOVI:
> +		ret = vits_cmd_handle_movi(vcpu->kvm, its_cmd);
> +		break;
> +	case GITS_CMD_DISCARD:
> +		ret = vits_cmd_handle_discard(vcpu->kvm, its_cmd);
> +		break;
> +	case GITS_CMD_CLEAR:
> +		ret = vits_cmd_handle_clear(vcpu->kvm, its_cmd);
> +		break;
> +	case GITS_CMD_MOVALL:
> +		ret = vits_cmd_handle_movall(vcpu->kvm, its_cmd);
> +		break;
> +	case GITS_CMD_INV:
> +		ret = vits_cmd_handle_inv(vcpu->kvm, its_cmd);
> +		break;
> +	case GITS_CMD_INVALL:
> +		ret = vits_cmd_handle_invall(vcpu->kvm, its_cmd);
> +		break;
> +	case GITS_CMD_SYNC:
> +		/* we ignore this command: we are in sync all of the time */
> +		ret = 0;
> +		break;
> +	}
> +
> +	return ret;
>  }
> 
>  static bool handle_mmio_gits_cbaser(struct kvm_vcpu *vcpu,
> diff --git a/virt/kvm/arm/its-emul.h b/virt/kvm/arm/its-emul.h
> index cbc3877..830524a 100644
> --- a/virt/kvm/arm/its-emul.h
> +++ b/virt/kvm/arm/its-emul.h
> @@ -39,4 +39,15 @@ void vits_destroy(struct kvm *kvm);
>  bool vits_queue_lpis(struct kvm_vcpu *vcpu);
>  void vits_unqueue_lpi(struct kvm_vcpu *vcpu, int irq);
> 
> +#define E_ITS_MOVI_UNMAPPED_INTERRUPT		0x010107
> +#define E_ITS_MOVI_UNMAPPED_COLLECTION		0x010109
> +#define E_ITS_CLEAR_UNMAPPED_INTERRUPT		0x010507
> +#define E_ITS_MAPC_PROCNUM_OOR			0x010902
> +#define E_ITS_MAPTI_UNMAPPED_DEVICE		0x010a04
> +#define E_ITS_MAPTI_PHYSICALID_OOR		0x010a06
> +#define E_ITS_INV_UNMAPPED_INTERRUPT		0x010c07
> +#define E_ITS_INVALL_UNMAPPED_COLLECTION	0x010d09
> +#define E_ITS_MOVALL_PROCNUM_OOR		0x010e01
> +#define E_ITS_DISCARD_UNMAPPED_INTERRUPT	0x010f07

 Did you just invent PROCNUM_OOR? Also, you use two different suffixes for it (0x02 for MAPC, 0x01 for MOVALL), which goes against the concept.
 Actually, your error handling is IMHO too stripped down. I suggest squashing in the patch below; it significantly
improves error recognition. Yes, I know that the status register isn't implemented (or at least the error code goes nowhere).
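
 To illustrate what I mean by "the concept": going by the values in the header above, each per-command code appears to consist of a generic error class in the low byte, the command number in bits 15...8 and a fixed 0x01 in bits 23...16. A minimal sketch, assuming that layout; ITS_ERR_* and its_full_error() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Command numbers as defined in include/linux/irqchip/arm-gic-v3.h */
#define GITS_CMD_MOVI		0x01
#define GITS_CMD_INV		0x0c
#define GITS_CMD_DISCARD	0x0f

/* Hypothetical generic codes: error class only, command byte left zero */
#define ITS_ERR_UNMAPPED_INTERRUPT	0x010007u
#define ITS_ERR_UNMAPPED_COLLECTION	0x010009u

/* Echo the 8-bit command number back in bits 15...8 of a generic code. */
static uint32_t its_full_error(uint32_t generic, uint8_t cmd)
{
	assert((generic & 0xff00u) == 0);	/* command byte must be free */
	return generic | ((uint32_t)cmd << 8);
}
```

 With this, its_full_error(ITS_ERR_UNMAPPED_INTERRUPT, GITS_CMD_MOVI) gives 0x010107, the same value as E_ITS_MOVI_UNMAPPED_INTERRUPT above.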

> +
>  #endif
> --

---
From 6da01d44c5b3753610b2a87724aedc2b05f42e06 Mon Sep 17 00:00:00 2001
From: Pavel Fedin <p.fedin@samsung.com>
Date: Wed, 14 Oct 2015 15:14:07 +0300
Subject: [PATCH] KVM: arm64: Improve ITS error handling

Remember the ITT size and check event IDs against it. Also, factor out
find_itte_in_device() for convenience.

Error codes are generalized across commands. Once the status register is
implemented, the command code should be echoed back in bits 15...8 in
order to complete the error code.
---
 virt/kvm/arm/its-emul.c | 86 +++++++++++++++++++++++++++++++------------------
 virt/kvm/arm/its-emul.h | 15 +++------
 2 files changed, 59 insertions(+), 42 deletions(-)

diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
index 2fcd844..181bec8 100644
--- a/virt/kvm/arm/its-emul.c
+++ b/virt/kvm/arm/its-emul.c
@@ -40,6 +40,7 @@ struct its_device {
 	/* the head for the list of ITTEs */
 	struct list_head itt;
 	u32 device_id;
+	u32 itt_size;
 };
 
 #define COLLECTION_NOT_MAPPED ((u32)-1)
@@ -77,15 +78,11 @@ static struct its_device *find_its_device(struct kvm *kvm, u32 device_id)
 	return NULL;
 }
 
-static struct its_itte *find_itte(struct kvm *kvm, u32 device_id, u32 event_id)
+static struct its_itte *find_itte_in_device(struct its_device *device,
+					    u32 event_id)
 {
-	struct its_device *device;
 	struct its_itte *itte;
 
-	device = find_its_device(kvm, device_id);
-	if (device == NULL)
-		return NULL;
-
 	list_for_each_entry(itte, &device->itt, itte_list)
 		if (itte->event_id == event_id)
 			return itte;
@@ -93,6 +90,28 @@ static struct its_itte *find_itte(struct kvm *kvm, u32 device_id, u32 event_id)
 	return NULL;
 }
 
+static struct its_itte *find_itte(struct kvm *kvm, u32 device_id,
+				  u32 event_id, int *err)
+{
+	struct its_device *device;
+	struct its_itte *itte;
+
+	device = find_its_device(kvm, device_id);
+	if (device == NULL) {
+		*err = E_ITS_UNMAPPED_DEVICE;
+		return NULL;
+	}
+
+	if (event_id >= BIT_ULL(device->itt_size + 1)) {
+		*err = E_ITS_ID_OOR;
+		return NULL;
+	}
+
+	itte = find_itte_in_device(device, event_id);
+	*err = itte ? 0 : E_ITS_UNMAPPED_INTERRUPT;
+	return itte;
+}
+
 /* To be used as an iterator this macro misses the enclosing parentheses */
 #define for_each_lpi(dev, itte, kvm) \
 	list_for_each_entry(dev, &(kvm)->arch.vgic.its.device_list, dev_list) \
@@ -359,10 +378,12 @@ int vits_inject_msi(struct kvm *kvm, struct kvm_msi *msi)
 		goto out_unlock;
 	}
 
-	itte = find_itte(kvm, msi->devid, msi->data);
+	itte = find_itte(kvm, msi->devid, msi->data, &ret);
 	/* Triggering an unmapped IRQ gets silently dropped. */
-	if (!itte || !its_is_collection_mapped(itte->collection))
+	if (!itte || !its_is_collection_mapped(itte->collection)) {
+		ret = 0;
 		goto out_unlock;
+	}
 
 	cpuid = itte->collection->target_addr;
 	__set_bit(cpuid, itte->pending);
@@ -476,6 +497,7 @@ static u64 its_cmd_mask_field(u64 *its_cmd, int word, int shift, int size)
 
 #define its_cmd_get_command(cmd)	its_cmd_mask_field(cmd, 0,  0,  8)
 #define its_cmd_get_deviceid(cmd)	its_cmd_mask_field(cmd, 0, 32, 32)
+#define its_cmd_get_size(cmd)		its_cmd_mask_field(cmd, 1,  0,  5)
 #define its_cmd_get_id(cmd)		its_cmd_mask_field(cmd, 1,  0, 32)
 #define its_cmd_get_physical_id(cmd)	its_cmd_mask_field(cmd, 1, 32, 32)
 #define its_cmd_get_collection(cmd)	its_cmd_mask_field(cmd, 2,  0, 16)
@@ -489,21 +511,23 @@ static int vits_cmd_handle_discard(struct kvm *kvm, u64 *its_cmd)
 	u32 device_id;
 	u32 event_id;
 	struct its_itte *itte;
-	int ret = E_ITS_DISCARD_UNMAPPED_INTERRUPT;
+	int ret;
 
 	device_id = its_cmd_get_deviceid(its_cmd);
 	event_id = its_cmd_get_id(its_cmd);
 
 	spin_lock(&its->lock);
-	itte = find_itte(kvm, device_id, event_id);
-	if (itte && itte->collection) {
+	itte = find_itte(kvm, device_id, event_id, &ret);
+	if (itte) {
 		/*
 		 * Though the spec talks about removing the pending state, we
 		 * don't bother here since we clear the ITTE anyway and the
 		 * pending state is a property of the ITTE struct.
 		 */
-		its_free_itte(itte);
-		ret = 0;
+		if (itte->collection)
+			its_free_itte(itte);
+		else
+			ret = E_ITS_UNMAPPED_INTERRUPT;
 	}
 
 	spin_unlock(&its->lock);
@@ -522,19 +546,18 @@ static int vits_cmd_handle_movi(struct kvm *kvm, u64 *its_cmd)
 	int ret;
 
 	spin_lock(&its->lock);
-	itte = find_itte(kvm, device_id, event_id);
-	if (!itte) {
-		ret = E_ITS_MOVI_UNMAPPED_INTERRUPT;
+	itte = find_itte(kvm, device_id, event_id, &ret);
+	if (!itte)
 		goto out_unlock;
-	}
+
 	if (!its_is_collection_mapped(itte->collection)) {
-		ret = E_ITS_MOVI_UNMAPPED_COLLECTION;
+		ret = E_ITS_UNMAPPED_COLLECTION;
 		goto out_unlock;
 	}
 
 	collection = find_collection(kvm, coll_id);
 	if (!its_is_collection_mapped(collection)) {
-		ret = E_ITS_MOVI_UNMAPPED_COLLECTION;
+		ret = E_ITS_UNMAPPED_COLLECTION;
 		goto out_unlock;
 	}
 
@@ -582,7 +605,7 @@ static int vits_cmd_handle_mapi(struct kvm *kvm, u64 *its_cmd, u8 cmd)
 
 	device = find_its_device(kvm, device_id);
 	if (!device) {
-		ret = E_ITS_MAPTI_UNMAPPED_DEVICE;
+		ret = E_ITS_UNMAPPED_DEVICE;
 		goto out_unlock;
 	}
 
@@ -598,11 +621,11 @@ static int vits_cmd_handle_mapi(struct kvm *kvm, u64 *its_cmd, u8 cmd)
 		lpi_nr = event_id;
 	if (lpi_nr < GIC_LPI_OFFSET ||
 	    lpi_nr >= nr_idbits_propbase(dist->propbaser)) {
-		ret = E_ITS_MAPTI_PHYSICALID_OOR;
+		ret = E_ITS_PHYSICALID_OOR;
 		goto out_unlock;
 	}
 
-	itte = find_itte(kvm, device_id, event_id);
+	itte = find_itte_in_device(device, event_id);
 	if (!itte) {
 		if (!new_itte || !new_itte->pending) {
 			ret = -ENOMEM;
@@ -686,6 +709,7 @@ static int vits_cmd_handle_mapd(struct kvm *kvm, u64 *its_cmd)
 	device = new_device;
 
 	device->device_id = device_id;
+	device->itt_size = its_cmd_get_size(its_cmd);
 	INIT_LIST_HEAD(&device->itt);
 
 	list_add_tail(&device->dev_list,
@@ -710,7 +734,7 @@ static int vits_cmd_handle_mapc(struct kvm *kvm, u64 *its_cmd)
 	target_addr = its_cmd_get_target_addr(its_cmd);
 
 	if (target_addr >= atomic_read(&kvm->online_vcpus))
-		return E_ITS_MAPC_PROCNUM_OOR;
+		return -EINVAL;
 
 	/* We preallocate memory outside of the lock here */
 	if (valid) {
@@ -769,11 +793,9 @@ static int vits_cmd_handle_clear(struct kvm *kvm, u64 *its_cmd)
 
 	spin_lock(&its->lock);
 
-	itte = find_itte(kvm, device_id, event_id);
-	if (!itte) {
-		ret = E_ITS_CLEAR_UNMAPPED_INTERRUPT;
+	itte = find_itte(kvm, device_id, event_id, &ret);
+	if (!itte)
 		goto out_unlock;
-	}
 
 	if (its_is_collection_mapped(itte->collection))
 		__clear_bit(itte->collection->target_addr, itte->pending);
@@ -798,10 +820,10 @@ static int vits_cmd_handle_inv(struct kvm *kvm, u64 *its_cmd)
 	event_id = its_cmd_get_id(its_cmd);
 
 	spin_lock(&dist->its.lock);
-	itte = find_itte(kvm, device_id, event_id);
+	itte = find_itte(kvm, device_id, event_id, &ret);
 	spin_unlock(&dist->its.lock);
 	if (!itte)
-		return E_ITS_INV_UNMAPPED_INTERRUPT;
+		return ret;
 
 	/*
 	 * We cannot read from guest memory inside the spinlock, so we
@@ -816,7 +838,7 @@ static int vits_cmd_handle_inv(struct kvm *kvm, u64 *its_cmd)
 			return ret;
 
 		spin_lock(&dist->its.lock);
-		new_itte = find_itte(kvm, device_id, event_id);
+		new_itte = find_itte(kvm, device_id, event_id, &ret);
 		if (new_itte->lpi != itte->lpi) {
 			itte = new_itte;
 			spin_unlock(&dist->its.lock);
@@ -839,7 +861,7 @@ static int vits_cmd_handle_invall(struct kvm *kvm, u64 *its_cmd)
 
 	collection = find_collection(kvm, coll_id);
 	if (!its_is_collection_mapped(collection))
-		return E_ITS_INVALL_UNMAPPED_COLLECTION;
+		return E_ITS_UNMAPPED_COLLECTION;
 
 	vcpu = kvm_get_vcpu(kvm, collection->target_addr);
 
@@ -866,7 +888,7 @@ static int vits_cmd_handle_movall(struct kvm *kvm, u64 *its_cmd)
 
 	if (target1_addr >= atomic_read(&kvm->online_vcpus) ||
 	    target2_addr >= atomic_read(&kvm->online_vcpus))
-		return E_ITS_MOVALL_PROCNUM_OOR;
+		return -EINVAL;
 
 	if (target1_addr == target2_addr)
 		return 0;
diff --git a/virt/kvm/arm/its-emul.h b/virt/kvm/arm/its-emul.h
index f7fa5f8..56161d9 100644
--- a/virt/kvm/arm/its-emul.h
+++ b/virt/kvm/arm/its-emul.h
@@ -43,15 +43,10 @@ bool vits_queue_lpis(struct kvm_vcpu *vcpu);
 void vits_unqueue_lpi(struct kvm_vcpu *vcpu, int irq);
 bool vits_check_lpis(struct kvm_vcpu *vcpu);
 
-#define E_ITS_MOVI_UNMAPPED_INTERRUPT		0x010107
-#define E_ITS_MOVI_UNMAPPED_COLLECTION		0x010109
-#define E_ITS_CLEAR_UNMAPPED_INTERRUPT		0x010507
-#define E_ITS_MAPC_PROCNUM_OOR			0x010902
-#define E_ITS_MAPTI_UNMAPPED_DEVICE		0x010a04
-#define E_ITS_MAPTI_PHYSICALID_OOR		0x010a06
-#define E_ITS_INV_UNMAPPED_INTERRUPT		0x010c07
-#define E_ITS_INVALL_UNMAPPED_COLLECTION	0x010d09
-#define E_ITS_MOVALL_PROCNUM_OOR		0x010e01
-#define E_ITS_DISCARD_UNMAPPED_INTERRUPT	0x010f07
+#define E_ITS_UNMAPPED_DEVICE		0x010004
+#define E_ITS_ID_OOR			0x010005
+#define E_ITS_PHYSICALID_OOR		0x010006
+#define E_ITS_UNMAPPED_INTERRUPT	0x010007
+#define E_ITS_UNMAPPED_COLLECTION	0x010009
 
 #endif
-- 
2.4.4
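
 For reference, the EventID range check in find_itte() can be tried out in user space. This is only a sketch, under the assumption that the MAPD Size field encodes the number of EventID bits minus one, as the check in the patch suggests; its_event_id_in_range() is a hypothetical helper. It does the shift in 64 bits, because the 5-bit field can be 31 and the resulting limit, 2^32, does not fit into 32 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * size_field is the 5-bit Size value from MAPD, assumed to mean
 * "number of EventID bits minus one", so valid IDs are
 * 0 .. 2^(size_field + 1) - 1.
 */
static bool its_event_id_in_range(uint32_t event_id, uint32_t size_field)
{
	assert(size_field < 32);	/* the field is only 5 bits wide */
	return event_id < (UINT64_C(2) << size_field);
}
```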


Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




* [PATCH v3 14/16] KVM: arm64: implement ITS command queue command handlers
@ 2015-10-14 12:26     ` Pavel Fedin
  0 siblings, 0 replies; 101+ messages in thread
From: Pavel Fedin @ 2015-10-14 12:26 UTC (permalink / raw)
  To: linux-arm-kernel

 Hello!

> -----Original Message-----
> From: kvm-owner at vger.kernel.org [mailto:kvm-owner at vger.kernel.org] On Behalf Of Andre Przywara
> Sent: Wednesday, October 07, 2015 5:55 PM
> To: marc.zyngier at arm.com; christoffer.dall at linaro.org
> Cc: eric.auger at linaro.org; p.fedin at samsung.com; kvmarm at lists.cs.columbia.edu; linux-arm-
> kernel at lists.infradead.org; kvm at vger.kernel.org
> Subject: [PATCH v3 14/16] KVM: arm64: implement ITS command queue command handlers
> 
> The connection between a device, an event ID, the LPI number and the
> allocated CPU is stored in in-memory tables in a GICv3, but their
> format is not specified by the spec. Instead software uses a command
> queue in a ring buffer to let the ITS implementation use their own
> format.
> Implement handlers for the various ITS commands and let them store
> the requested relation into our own data structures.
> To avoid kmallocs inside the ITS spinlock, we preallocate possibly
> needed memory outside of the lock and free that if it turns out to
> be not needed (mostly error handling).
> Error handling is very basic at this point, as we don't have a good
> way of communicating errors to the guest (usually a SError).
> The INT command handler is missing at this point, as we gain the
> capability of actually injecting MSIs into the guest only later on.
> 
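
 The preallocation scheme described above can be modelled in user space. The sketch below is not the kernel code: it uses a C11 atomic_flag as a stand-in for the ITS spinlock and calloc()/free() in place of kzalloc()/kfree(), and all names (toy_device, toy_find_device(), toy_map_device()) are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_device {
	struct toy_device *next;
	uint32_t device_id;
};

static atomic_flag toy_lock = ATOMIC_FLAG_INIT;	/* stand-in spinlock */
static struct toy_device *toy_devices;		/* protected by toy_lock */

/* Caller is expected to hold toy_lock when other threads are around. */
static struct toy_device *toy_find_device(uint32_t device_id)
{
	struct toy_device *dev;

	for (dev = toy_devices; dev; dev = dev->next)
		if (dev->device_id == device_id)
			return dev;
	return NULL;
}

/* Returns 0 on success (new or already mapped), -1 if out of memory. */
static int toy_map_device(uint32_t device_id)
{
	/* Allocate before taking the lock: the allocator may block. */
	struct toy_device *new_dev = calloc(1, sizeof(*new_dev));

	if (!new_dev)
		return -1;

	while (atomic_flag_test_and_set(&toy_lock))
		;	/* spin */

	if (!toy_find_device(device_id)) {
		new_dev->device_id = device_id;
		new_dev->next = toy_devices;
		toy_devices = new_dev;
		new_dev = NULL;		/* ownership moved to the list */
	}

	atomic_flag_clear(&toy_lock);

	free(new_dev);	/* not needed after all; free(NULL) is a no-op */
	return 0;
}
```

 The cost of this pattern is an allocation that is sometimes thrown away; the benefit is that nothing that may sleep runs while the lock is held.
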
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
> Changelog v2..v3:
> - adjust handlers to new pendbaser/propbaser locking scheme
> - properly free ITTEs (including pending bitmap)
> - fix handling of unmapped collections
> 
>  include/linux/irqchip/arm-gic-v3.h |   5 +-
>  virt/kvm/arm/its-emul.c            | 502 ++++++++++++++++++++++++++++++++++++-
>  virt/kvm/arm/its-emul.h            |  11 +
>  3 files changed, 516 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
> index ef274a9..27c0e75 100644
> --- a/include/linux/irqchip/arm-gic-v3.h
> +++ b/include/linux/irqchip/arm-gic-v3.h
> @@ -255,7 +255,10 @@
>   */
>  #define GITS_CMD_MAPD			0x08
>  #define GITS_CMD_MAPC			0x09
> -#define GITS_CMD_MAPVI			0x0a
> +#define GITS_CMD_MAPTI			0x0a
> +/* older GIC documentation used MAPVI for this command */
> +#define GITS_CMD_MAPVI			GITS_CMD_MAPTI
> +#define GITS_CMD_MAPI			0x0b
>  #define GITS_CMD_MOVI			0x01
>  #define GITS_CMD_DISCARD		0x0f
>  #define GITS_CMD_INV			0x0c
> diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
> index 7a8c5db..642effb 100644
> --- a/virt/kvm/arm/its-emul.c
> +++ b/virt/kvm/arm/its-emul.c
> @@ -22,6 +22,7 @@
>  #include <linux/kvm_host.h>
>  #include <linux/interrupt.h>
>  #include <linux/list.h>
> +#include <linux/slab.h>
> 
>  #include <linux/irqchip/arm-gic-v3.h>
>  #include <kvm/arm_vgic.h>
> @@ -64,6 +65,34 @@ struct its_itte {
>  	unsigned long *pending;
>  };
> 
> +static struct its_device *find_its_device(struct kvm *kvm, u32 device_id)
> +{
> +	struct vgic_its *its = &kvm->arch.vgic.its;
> +	struct its_device *device;
> +
> +	list_for_each_entry(device, &its->device_list, dev_list)
> +		if (device_id == device->device_id)
> +			return device;
> +
> +	return NULL;
> +}
> +
> +static struct its_itte *find_itte(struct kvm *kvm, u32 device_id, u32 event_id)
> +{
> +	struct its_device *device;
> +	struct its_itte *itte;
> +
> +	device = find_its_device(kvm, device_id);
> +	if (device == NULL)
> +		return NULL;
> +
> +	list_for_each_entry(itte, &device->itt, itte_list)
> +		if (itte->event_id == event_id)
> +			return itte;
> +
> +	return NULL;
> +}
> +
>  /* To be used as an iterator this macro misses the enclosing parentheses */
>  #define for_each_lpi(dev, itte, kvm) \
>  	list_for_each_entry(dev, &(kvm)->arch.vgic.its.device_list, dev_list) \
> @@ -81,6 +110,19 @@ static struct its_itte *find_itte_by_lpi(struct kvm *kvm, int lpi)
>  	return NULL;
>  }
> 
> +static struct its_collection *find_collection(struct kvm *kvm, int coll_id)
> +{
> +	struct its_collection *collection;
> +
> +	list_for_each_entry(collection, &kvm->arch.vgic.its.collection_list,
> +			    coll_list) {
> +		if (coll_id == collection->collection_id)
> +			return collection;
> +	}
> +
> +	return NULL;
> +}
> +
>  #define LPI_PROP_ENABLE_BIT(p)	((p) & LPI_PROP_ENABLED)
>  #define LPI_PROP_PRIORITY(p)	((p) & 0xfc)
> 
> @@ -352,13 +394,471 @@ static void its_free_itte(struct its_itte *itte)
>  	kfree(itte);
>  }
> 
> +static u64 its_cmd_mask_field(u64 *its_cmd, int word, int shift, int size)
> +{
> +	return (le64_to_cpu(its_cmd[word]) >> shift) & (BIT_ULL(size) - 1);
> +}
> +
> +#define its_cmd_get_command(cmd)	its_cmd_mask_field(cmd, 0,  0,  8)
> +#define its_cmd_get_deviceid(cmd)	its_cmd_mask_field(cmd, 0, 32, 32)
> +#define its_cmd_get_id(cmd)		its_cmd_mask_field(cmd, 1,  0, 32)
> +#define its_cmd_get_physical_id(cmd)	its_cmd_mask_field(cmd, 1, 32, 32)
> +#define its_cmd_get_collection(cmd)	its_cmd_mask_field(cmd, 2,  0, 16)
> +#define its_cmd_get_target_addr(cmd)	its_cmd_mask_field(cmd, 2, 16, 32)
> +#define its_cmd_get_validbit(cmd)	its_cmd_mask_field(cmd, 2, 63,  1)
> +
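
 The accessor macros above all reduce to a shift and a mask on one of the four 64-bit command words. A user-space sketch of the same idea, assuming the words are already in host byte order (the kernel helper additionally applies le64_to_cpu()); cmd_field() and the cmd_get_*() macros are hypothetical stand-ins for its_cmd_mask_field() and its_cmd_get_*().

```c
#include <assert.h>
#include <stdint.h>

/* Extract "size" bits starting at bit "shift" from command word "word". */
static uint64_t cmd_field(const uint64_t *its_cmd, int word, int shift,
			  int size)
{
	assert(word >= 0 && word < 4);		/* a command is 4 x 64 bit */
	assert(size > 0 && size < 64 && shift >= 0 && shift + size <= 64);
	return (its_cmd[word] >> shift) & ((UINT64_C(1) << size) - 1);
}

/* Same field positions as the its_cmd_get_*() macros in the patch */
#define cmd_get_command(cmd)	cmd_field(cmd, 0,  0,  8)
#define cmd_get_deviceid(cmd)	cmd_field(cmd, 0, 32, 32)
#define cmd_get_validbit(cmd)	cmd_field(cmd, 2, 63,  1)
```

 For a command whose word 0 is (0x1234 << 32) | 0x08 and whose word 2 has bit 63 set, these return command 0x08 (MAPD), device ID 0x1234 and valid bit 1.
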
> +/* The DISCARD command frees an Interrupt Translation Table Entry (ITTE). */
> +static int vits_cmd_handle_discard(struct kvm *kvm, u64 *its_cmd)
> +{
> +	struct vgic_its *its = &kvm->arch.vgic.its;
> +	u32 device_id;
> +	u32 event_id;
> +	struct its_itte *itte;
> +	int ret = E_ITS_DISCARD_UNMAPPED_INTERRUPT;
> +
> +	device_id = its_cmd_get_deviceid(its_cmd);
> +	event_id = its_cmd_get_id(its_cmd);
> +
> +	spin_lock(&its->lock);
> +	itte = find_itte(kvm, device_id, event_id);
> +	if (itte && itte->collection) {
> +		/*
> +		 * Though the spec talks about removing the pending state, we
> +		 * don't bother here since we clear the ITTE anyway and the
> +		 * pending state is a property of the ITTE struct.
> +		 */
> +		its_free_itte(itte);
> +		ret = 0;
> +	}

 Are you sure that DISCARD should remove the entry? The doc says in 6.3.4:
--- cut ---
This command translates the event defined by EventID and DeviceID and instructs the appropriate Redistributor to
remove the pending state of the interrupt. It also ensures that any caching in the Redistributors associated with a
specific EventID is consistent with the configuration held in memory.
--- cut ---
 So it seems to behave like CLEAR + INV rather than freeing the entry.
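
 To make the difference concrete, here is a toy model of that reading, in which DISCARD clears the pending state and refreshes the cached configuration but leaves the mapping in place. It is only a sketch of the interpretation quoted above, not of the patch; all names are hypothetical and the "property table" is a plain array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_itte {
	bool mapped;		/* translation entry exists */
	bool pending;		/* pending state of the LPI */
	uint8_t cached_prop;	/* cached copy of the config byte */
	unsigned int lpi_index;	/* index into the property table */
};

/* CLEAR: drop the pending state only. */
static void toy_clear(struct toy_itte *itte)
{
	itte->pending = false;
}

/* INV: reload the cached configuration from the in-memory table. */
static void toy_inv(struct toy_itte *itte, const uint8_t *prop_table)
{
	itte->cached_prop = prop_table[itte->lpi_index];
}

/* DISCARD under the "CLEAR + INV" reading: the mapping survives. */
static void toy_discard(struct toy_itte *itte, const uint8_t *prop_table)
{
	assert(itte->mapped);
	toy_clear(itte);
	toy_inv(itte, prop_table);
}
```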

> +
> +	spin_unlock(&its->lock);
> +	return ret;
> +}
> +
> +/* The MOVI command moves an ITTE to a different collection. */
> +static int vits_cmd_handle_movi(struct kvm *kvm, u64 *its_cmd)
> +{
> +	struct vgic_its *its = &kvm->arch.vgic.its;
> +	u32 device_id = its_cmd_get_deviceid(its_cmd);
> +	u32 event_id = its_cmd_get_id(its_cmd);
> +	u32 coll_id = its_cmd_get_collection(its_cmd);
> +	struct its_itte *itte;
> +	struct its_collection *collection;
> +	int ret;
> +
> +	spin_lock(&its->lock);
> +	itte = find_itte(kvm, device_id, event_id);
> +	if (!itte) {
> +		ret = E_ITS_MOVI_UNMAPPED_INTERRUPT;
> +		goto out_unlock;
> +	}
> +	if (!its_is_collection_mapped(itte->collection)) {
> +		ret = E_ITS_MOVI_UNMAPPED_COLLECTION;
> +		goto out_unlock;
> +	}
> +
> +	collection = find_collection(kvm, coll_id);
> +	if (!its_is_collection_mapped(collection)) {
> +		ret = E_ITS_MOVI_UNMAPPED_COLLECTION;
> +		goto out_unlock;
> +	}
> +
> +	if (test_and_clear_bit(itte->collection->target_addr, itte->pending))
> +		__set_bit(collection->target_addr, itte->pending);
> +
> +	itte->collection = collection;
> +out_unlock:
> +	spin_unlock(&its->lock);
> +	return ret;
> +}
> +
> +static void vits_init_collection(struct kvm *kvm,
> +				 struct its_collection *collection,
> +				 u32 coll_id)
> +{
> +	collection->collection_id = coll_id;
> +	collection->target_addr = COLLECTION_NOT_MAPPED;
> +
> +	list_add_tail(&collection->coll_list,
> +		&kvm->arch.vgic.its.collection_list);
> +}
> +
> +/* The MAPTI and MAPI commands map LPIs to ITTEs. */
> +static int vits_cmd_handle_mapi(struct kvm *kvm, u64 *its_cmd, u8 cmd)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	u32 device_id = its_cmd_get_deviceid(its_cmd);
> +	u32 event_id = its_cmd_get_id(its_cmd);
> +	u32 coll_id = its_cmd_get_collection(its_cmd);
> +	struct its_itte *itte, *new_itte;
> +	struct its_device *device;
> +	struct its_collection *collection, *new_coll;
> +	int lpi_nr;
> +	int ret = 0;
> +
> +	/* Preallocate possibly needed memory here outside of the lock */
> +	new_coll = kmalloc(sizeof(struct its_collection), GFP_KERNEL);
> +	new_itte = kzalloc(sizeof(struct its_itte), GFP_KERNEL);
> +	if (new_itte)
> +		new_itte->pending = kcalloc(BITS_TO_LONGS(dist->nr_cpus),
> +					    sizeof(long), GFP_KERNEL);
> +
> +	spin_lock(&dist->its.lock);
> +
> +	device = find_its_device(kvm, device_id);
> +	if (!device) {
> +		ret = E_ITS_MAPTI_UNMAPPED_DEVICE;
> +		goto out_unlock;
> +	}
> +
> +	collection = find_collection(kvm, coll_id);
> +	if (!collection && !new_coll) {
> +		ret = -ENOMEM;
> +		goto out_unlock;
> +	}
> +
> +	if (cmd == GITS_CMD_MAPTI)
> +		lpi_nr = its_cmd_get_physical_id(its_cmd);
> +	else
> +		lpi_nr = event_id;
> +	if (lpi_nr < GIC_LPI_OFFSET ||
> +	    lpi_nr >= nr_idbits_propbase(dist->propbaser)) {
> +		ret = E_ITS_MAPTI_PHYSICALID_OOR;
> +		goto out_unlock;
> +	}
> +
> +	itte = find_itte(kvm, device_id, event_id);
> +	if (!itte) {
> +		if (!new_itte || !new_itte->pending) {
> +			ret = -ENOMEM;
> +			goto out_unlock;
> +		}
> +		itte = new_itte;
> +
> +		itte->event_id	= event_id;
> +		list_add_tail(&itte->itte_list, &device->itt);
> +	} else {
> +		if (new_itte)
> +			kfree(new_itte->pending);
> +		kfree(new_itte);
> +	}
> +
> +	if (!collection) {
> +		collection = new_coll;
> +		vits_init_collection(kvm, collection, coll_id);
> +	} else {
> +		kfree(new_coll);
> +	}
> +
> +	itte->collection = collection;
> +	itte->lpi = lpi_nr;
> +
> +out_unlock:
> +	spin_unlock(&dist->its.lock);
> +	if (ret) {
> +		kfree(new_coll);
> +		if (new_itte)
> +			kfree(new_itte->pending);
> +		kfree(new_itte);
> +	}
> +	return ret;
> +}
> +
> +static void vits_unmap_device(struct kvm *kvm, struct its_device *device)
> +{
> +	struct its_itte *itte, *temp;
> +
> +	/*
> +	 * The spec says that unmapping a device with still valid
> +	 * ITTEs associated is UNPREDICTABLE. We remove all ITTEs,
> +	 * since we cannot leave the memory unreferenced.
> +	 */
> +	list_for_each_entry_safe(itte, temp, &device->itt, itte_list)
> +		its_free_itte(itte);
> +
> +	list_del(&device->dev_list);
> +	kfree(device);
> +}
> +
> +/* MAPD maps or unmaps a device ID to Interrupt Translation Tables (ITTs). */
> +static int vits_cmd_handle_mapd(struct kvm *kvm, u64 *its_cmd)
> +{
> +	struct vgic_its *its = &kvm->arch.vgic.its;
> +	bool valid = its_cmd_get_validbit(its_cmd);
> +	u32 device_id = its_cmd_get_deviceid(its_cmd);
> +	struct its_device *device, *new_device = NULL;
> +
> +	/* We preallocate memory outside of the lock here */
> +	if (valid) {
> +		new_device = kzalloc(sizeof(struct its_device), GFP_KERNEL);
> +		if (!new_device)
> +			return -ENOMEM;
> +	}
> +
> +	spin_lock(&its->lock);
> +
> +	device = find_its_device(kvm, device_id);
> +	if (device)
> +		vits_unmap_device(kvm, device);
> +
> +	/*
> +	 * The spec does not say whether unmapping a not-mapped device
> +	 * is an error, so we are done in any case.
> +	 */
> +	if (!valid)
> +		goto out_unlock;
> +
> +	device = new_device;
> +
> +	device->device_id = device_id;
> +	INIT_LIST_HEAD(&device->itt);
> +
> +	list_add_tail(&device->dev_list,
> +		      &kvm->arch.vgic.its.device_list);
> +
> +out_unlock:
> +	spin_unlock(&its->lock);
> +	return 0;
> +}
> +
> +/* The MAPC command maps collection IDs to redistributors. */
> +static int vits_cmd_handle_mapc(struct kvm *kvm, u64 *its_cmd)
> +{
> +	struct vgic_its *its = &kvm->arch.vgic.its;
> +	u16 coll_id;
> +	u32 target_addr;
> +	struct its_collection *collection, *new_coll = NULL;
> +	bool valid;
> +
> +	valid = its_cmd_get_validbit(its_cmd);
> +	coll_id = its_cmd_get_collection(its_cmd);
> +	target_addr = its_cmd_get_target_addr(its_cmd);
> +
> +	if (target_addr >= atomic_read(&kvm->online_vcpus))
> +		return E_ITS_MAPC_PROCNUM_OOR;
> +
> +	/* We preallocate memory outside of the lock here */
> +	if (valid) {
> +		new_coll = kmalloc(sizeof(struct its_collection), GFP_KERNEL);
> +		if (!new_coll)
> +			return -ENOMEM;
> +	}
> +
> +	spin_lock(&its->lock);
> +	collection = find_collection(kvm, coll_id);
> +
> +	if (!valid) {
> +		struct its_device *device;
> +		struct its_itte *itte;
> +		/*
> +		 * Clearing the mapping for that collection ID removes the
> +		 * entry from the list. If there wasn't any before, we can
> +		 * go home early.
> +		 */
> +		if (!collection)
> +			goto out_unlock;
> +
> +		for_each_lpi(device, itte, kvm)
> +			if (itte->collection &&
> +			    itte->collection->collection_id == coll_id)
> +				itte->collection = NULL;
> +
> +		list_del(&collection->coll_list);
> +		kfree(collection);
> +	} else {
> +		if (!collection)
> +			collection = new_coll;
> +		else
> +			kfree(new_coll);
> +
> +		vits_init_collection(kvm, collection, coll_id);
> +		collection->target_addr = target_addr;
> +	}
> +
> +out_unlock:
> +	spin_unlock(&its->lock);
> +	return 0;
> +}
> +
> +/* The CLEAR command removes the pending state for a particular LPI. */
> +static int vits_cmd_handle_clear(struct kvm *kvm, u64 *its_cmd)
> +{
> +	struct vgic_its *its = &kvm->arch.vgic.its;
> +	u32 device_id;
> +	u32 event_id;
> +	struct its_itte *itte;
> +	int ret = 0;
> +
> +	device_id = its_cmd_get_deviceid(its_cmd);
> +	event_id = its_cmd_get_id(its_cmd);
> +
> +	spin_lock(&its->lock);
> +
> +	itte = find_itte(kvm, device_id, event_id);
> +	if (!itte) {
> +		ret = E_ITS_CLEAR_UNMAPPED_INTERRUPT;
> +		goto out_unlock;
> +	}
> +
> +	if (its_is_collection_mapped(itte->collection))
> +		__clear_bit(itte->collection->target_addr, itte->pending);
> +
> +out_unlock:
> +	spin_unlock(&its->lock);
> +	return ret;
> +}
> +
> +/* The INV command syncs the configuration bits from the memory tables. */
> +static int vits_cmd_handle_inv(struct kvm *kvm, u64 *its_cmd)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	u32 device_id;
> +	u32 event_id;
> +	struct its_itte *itte, *new_itte;
> +	gpa_t propbase;
> +	int ret;
> +	u8 prop;
> +
> +	device_id = its_cmd_get_deviceid(its_cmd);
> +	event_id = its_cmd_get_id(its_cmd);
> +
> +	spin_lock(&dist->its.lock);
> +	itte = find_itte(kvm, device_id, event_id);
> +	spin_unlock(&dist->its.lock);
> +	if (!itte)
> +		return E_ITS_INV_UNMAPPED_INTERRUPT;
> +
> +	/*
> +	 * We cannot read from guest memory inside the spinlock, so we
> +	 * need to re-read our tables to learn whether the LPI number we are
> +	 * using is still valid.
> +	 */
> +	do {
> +		propbase = BASER_BASE_ADDRESS(dist->propbaser);
> +		ret = kvm_read_guest(kvm, propbase + itte->lpi - GIC_LPI_OFFSET,
> +				     &prop, 1);
> +		if (ret)
> +			return ret;
> +
> +		spin_lock(&dist->its.lock);
> +		new_itte = find_itte(kvm, device_id, event_id);
> +		if (new_itte->lpi != itte->lpi) {
> +			itte = new_itte;
> +			spin_unlock(&dist->its.lock);
> +			continue;
> +		}
> +		update_lpi_config(kvm, itte, prop);
> +		spin_unlock(&dist->its.lock);
> +	} while (0);
> +	return 0;
> +}
> +
> +/* The INVALL command requests flushing of all IRQ data in this collection. */
> +static int vits_cmd_handle_invall(struct kvm *kvm, u64 *its_cmd)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	u64 prop_base_reg, pend_base_reg;
> +	u32 coll_id = its_cmd_get_collection(its_cmd);
> +	struct its_collection *collection;
> +	struct kvm_vcpu *vcpu;
> +
> +	collection = find_collection(kvm, coll_id);
> +	if (!its_is_collection_mapped(collection))
> +		return E_ITS_INVALL_UNMAPPED_COLLECTION;
> +
> +	vcpu = kvm_get_vcpu(kvm, collection->target_addr);
> +
> +	spin_lock(&dist->lock);
> +	pend_base_reg = dist->pendbaser[vcpu->vcpu_id];
> +	prop_base_reg = dist->propbaser;
> +	spin_unlock(&dist->lock);
> +
> +	its_update_lpis_configuration(kvm, prop_base_reg);
> +	its_sync_lpi_pending_table(vcpu, pend_base_reg);
> +
> +	return 0;
> +}
> +
> +/* The MOVALL command moves all IRQs from one redistributor to another. */
> +static int vits_cmd_handle_movall(struct kvm *kvm, u64 *its_cmd)
> +{
> +	struct vgic_its *its = &kvm->arch.vgic.its;
> +	u32 target1_addr = its_cmd_get_target_addr(its_cmd);
> +	u32 target2_addr = its_cmd_mask_field(its_cmd, 3, 16, 32);
> +	struct its_collection *collection;
> +	struct its_device *device;
> +	struct its_itte *itte;
> +
> +	if (target1_addr >= atomic_read(&kvm->online_vcpus) ||
> +	    target2_addr >= atomic_read(&kvm->online_vcpus))
> +		return E_ITS_MOVALL_PROCNUM_OOR;
> +
> +	if (target1_addr == target2_addr)
> +		return 0;
> +
> +	spin_lock(&its->lock);
> +	for_each_lpi(device, itte, kvm) {
> +		/* remap all collections mapped to target address 1 */
> +		collection = itte->collection;
> +		if (collection && collection->target_addr == target1_addr)
> +			collection->target_addr = target2_addr;
> +
> +		/* move pending state if LPI is affected */
> +		if (test_and_clear_bit(target1_addr, itte->pending))
> +			__set_bit(target2_addr, itte->pending);
> +	}
> +
> +	spin_unlock(&its->lock);
> +	return 0;
> +}
> +
>  /*
>   * This function is called with both the ITS and the distributor lock dropped,
>   * so the actual command handlers must take the respective locks when needed.
>   */
>  static int vits_handle_command(struct kvm_vcpu *vcpu, u64 *its_cmd)
>  {
> -	return -ENODEV;
> +	u8 cmd = its_cmd_get_command(its_cmd);
> +	int ret = -ENODEV;
> +
> +	switch (cmd) {
> +	case GITS_CMD_MAPD:
> +		ret = vits_cmd_handle_mapd(vcpu->kvm, its_cmd);
> +		break;
> +	case GITS_CMD_MAPC:
> +		ret = vits_cmd_handle_mapc(vcpu->kvm, its_cmd);
> +		break;
> +	case GITS_CMD_MAPI:
> +		ret = vits_cmd_handle_mapi(vcpu->kvm, its_cmd, cmd);
> +		break;
> +	case GITS_CMD_MAPTI:
> +		ret = vits_cmd_handle_mapi(vcpu->kvm, its_cmd, cmd);
> +		break;
> +	case GITS_CMD_MOVI:
> +		ret = vits_cmd_handle_movi(vcpu->kvm, its_cmd);
> +		break;
> +	case GITS_CMD_DISCARD:
> +		ret = vits_cmd_handle_discard(vcpu->kvm, its_cmd);
> +		break;
> +	case GITS_CMD_CLEAR:
> +		ret = vits_cmd_handle_clear(vcpu->kvm, its_cmd);
> +		break;
> +	case GITS_CMD_MOVALL:
> +		ret = vits_cmd_handle_movall(vcpu->kvm, its_cmd);
> +		break;
> +	case GITS_CMD_INV:
> +		ret = vits_cmd_handle_inv(vcpu->kvm, its_cmd);
> +		break;
> +	case GITS_CMD_INVALL:
> +		ret = vits_cmd_handle_invall(vcpu->kvm, its_cmd);
> +		break;
> +	case GITS_CMD_SYNC:
> +		/* we ignore this command: we are in sync all of the time */
> +		ret = 0;
> +		break;
> +	}
> +
> +	return ret;
>  }
> 
>  static bool handle_mmio_gits_cbaser(struct kvm_vcpu *vcpu,
> diff --git a/virt/kvm/arm/its-emul.h b/virt/kvm/arm/its-emul.h
> index cbc3877..830524a 100644
> --- a/virt/kvm/arm/its-emul.h
> +++ b/virt/kvm/arm/its-emul.h
> @@ -39,4 +39,15 @@ void vits_destroy(struct kvm *kvm);
>  bool vits_queue_lpis(struct kvm_vcpu *vcpu);
>  void vits_unqueue_lpi(struct kvm_vcpu *vcpu, int irq);
> 
> +#define E_ITS_MOVI_UNMAPPED_INTERRUPT		0x010107
> +#define E_ITS_MOVI_UNMAPPED_COLLECTION		0x010109
> +#define E_ITS_CLEAR_UNMAPPED_INTERRUPT		0x010507
> +#define E_ITS_MAPC_PROCNUM_OOR			0x010902
> +#define E_ITS_MAPTI_UNMAPPED_DEVICE		0x010a04
> +#define E_ITS_MAPTI_PHYSICALID_OOR		0x010a06
> +#define E_ITS_INV_UNMAPPED_INTERRUPT		0x010c07
> +#define E_ITS_INVALL_UNMAPPED_COLLECTION	0x010d09
> +#define E_ITS_MOVALL_PROCNUM_OOR		0x010e01
> +#define E_ITS_DISCARD_UNMAPPED_INTERRUPT	0x010f07

 Did you just invent PROCNUM_OOR? Additionally, you use two different suffixes for it (0x02 for MAPC, 0x01 for MOVALL), which goes
against the concept of these error codes.
 Actually, your error handling is IMHO too stripped down. I suggest squashing in the patch below; it significantly improves error
recognition. Yes, I know that the status register isn't implemented (or at least the error code goes nowhere).
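
 To illustrate the idea of generic error codes with the command code echoed back in bits 15...8, here is a minimal sketch. The helper
its_full_error() is hypothetical and is not part of either patch; the command bytes used in the usage note are simply read off the
per-command constants in the quoted header.

```c
#include <stdint.h>

/* Generic error codes as proposed in the patch below (bits 15..8 are zero). */
#define E_ITS_UNMAPPED_DEVICE		0x010004
#define E_ITS_UNMAPPED_INTERRUPT	0x010007
#define E_ITS_UNMAPPED_COLLECTION	0x010009

/*
 * Hypothetical helper, not part of either patch: place the command
 * code into bits 15..8 of a generic error code.
 */
static inline uint32_t its_full_error(uint32_t generic_err, uint8_t cmd)
{
	return (generic_err & ~0xff00u) | ((uint32_t)cmd << 8);
}
```

 With this, its_full_error(E_ITS_UNMAPPED_INTERRUPT, 0x01) gives back the old E_ITS_MOVI_UNMAPPED_INTERRUPT value (0x010107), so
no information is lost by generalizing the codes.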

> +
>  #endif
> --

---
From 6da01d44c5b3753610b2a87724aedc2b05f42e06 Mon Sep 17 00:00:00 2001
From: Pavel Fedin <p.fedin@samsung.com>
Date: Wed, 14 Oct 2015 15:14:07 +0300
Subject: [PATCH] KVM: arm64: Improve ITS error handling

Remember the ITT size and check event IDs against it. Also, factor out
find_itte_in_device() for convenience.

Error codes are generalized across commands. Once the status register is
implemented, the command code should be echoed back in bits 15...8 to
complete the error code.
---
 virt/kvm/arm/its-emul.c | 86 +++++++++++++++++++++++++++++++------------------
 virt/kvm/arm/its-emul.h | 15 +++------
 2 files changed, 59 insertions(+), 42 deletions(-)

diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
index 2fcd844..181bec8 100644
--- a/virt/kvm/arm/its-emul.c
+++ b/virt/kvm/arm/its-emul.c
@@ -40,6 +40,7 @@ struct its_device {
 	/* the head for the list of ITTEs */
 	struct list_head itt;
 	u32 device_id;
+	u32 itt_size;
 };
 
 #define COLLECTION_NOT_MAPPED ((u32)-1)
@@ -77,15 +78,11 @@ static struct its_device *find_its_device(struct kvm *kvm, u32 device_id)
 	return NULL;
 }
 
-static struct its_itte *find_itte(struct kvm *kvm, u32 device_id, u32 event_id)
+static struct its_itte *find_itte_in_device(struct its_device *device,
+					    u32 event_id)
 {
-	struct its_device *device;
 	struct its_itte *itte;
 
-	device = find_its_device(kvm, device_id);
-	if (device == NULL)
-		return NULL;
-
 	list_for_each_entry(itte, &device->itt, itte_list)
 		if (itte->event_id == event_id)
 			return itte;
@@ -93,6 +90,28 @@ static struct its_itte *find_itte(struct kvm *kvm, u32 device_id, u32 event_id)
 	return NULL;
 }
 
+static struct its_itte *find_itte(struct kvm *kvm, u32 device_id,
+				  u32 event_id, int *err)
+{
+	struct its_device *device;
+	struct its_itte *itte;
+
+	device = find_its_device(kvm, device_id);
+	if (device == NULL) {
+		*err = E_ITS_UNMAPPED_DEVICE;
+		return NULL;
+	}
+
+	if (event_id >= 2 << device->itt_size) {
+		*err = E_ITS_ID_OOR;
+		return NULL;
+	}
+
+	itte = find_itte_in_device(device, event_id);
+	*err = itte ? 0 : E_ITS_UNMAPPED_INTERRUPT;
+	return itte;
+}
+
 /* To be used as an iterator this macro misses the enclosing parentheses */
 #define for_each_lpi(dev, itte, kvm) \
 	list_for_each_entry(dev, &(kvm)->arch.vgic.its.device_list, dev_list) \
@@ -359,10 +378,12 @@ int vits_inject_msi(struct kvm *kvm, struct kvm_msi *msi)
 		goto out_unlock;
 	}
 
-	itte = find_itte(kvm, msi->devid, msi->data);
+	itte = find_itte(kvm, msi->devid, msi->data, &ret);
 	/* Triggering an unmapped IRQ gets silently dropped. */
-	if (!itte || !its_is_collection_mapped(itte->collection))
+	if (!itte || !its_is_collection_mapped(itte->collection)) {
+		ret = 0;
 		goto out_unlock;
+	}
 
 	cpuid = itte->collection->target_addr;
 	__set_bit(cpuid, itte->pending);
@@ -476,6 +497,7 @@ static u64 its_cmd_mask_field(u64 *its_cmd, int word, int shift, int size)
 
 #define its_cmd_get_command(cmd)	its_cmd_mask_field(cmd, 0,  0,  8)
 #define its_cmd_get_deviceid(cmd)	its_cmd_mask_field(cmd, 0, 32, 32)
+#define its_cmd_get_size(cmd)		its_cmd_mask_field(cmd, 1,  0,  5)
 #define its_cmd_get_id(cmd)		its_cmd_mask_field(cmd, 1,  0, 32)
 #define its_cmd_get_physical_id(cmd)	its_cmd_mask_field(cmd, 1, 32, 32)
 #define its_cmd_get_collection(cmd)	its_cmd_mask_field(cmd, 2,  0, 16)
@@ -489,21 +511,23 @@ static int vits_cmd_handle_discard(struct kvm *kvm, u64 *its_cmd)
 	u32 device_id;
 	u32 event_id;
 	struct its_itte *itte;
-	int ret = E_ITS_DISCARD_UNMAPPED_INTERRUPT;
+	int ret;
 
 	device_id = its_cmd_get_deviceid(its_cmd);
 	event_id = its_cmd_get_id(its_cmd);
 
 	spin_lock(&its->lock);
-	itte = find_itte(kvm, device_id, event_id);
-	if (itte && itte->collection) {
+	itte = find_itte(kvm, device_id, event_id, &ret);
+	if (itte) {
 		/*
 		 * Though the spec talks about removing the pending state, we
 		 * don't bother here since we clear the ITTE anyway and the
 		 * pending state is a property of the ITTE struct.
 		 */
-		its_free_itte(itte);
-		ret = 0;
+		if (itte->collection)
+			its_free_itte(itte);
+		else
+			ret = E_ITS_UNMAPPED_INTERRUPT;
 	}
 
 	spin_unlock(&its->lock);
@@ -522,19 +546,18 @@ static int vits_cmd_handle_movi(struct kvm *kvm, u64 *its_cmd)
 	int ret;
 
 	spin_lock(&its->lock);
-	itte = find_itte(kvm, device_id, event_id);
-	if (!itte) {
-		ret = E_ITS_MOVI_UNMAPPED_INTERRUPT;
+	itte = find_itte(kvm, device_id, event_id, &ret);
+	if (!itte)
 		goto out_unlock;
-	}
+
 	if (!its_is_collection_mapped(itte->collection)) {
-		ret = E_ITS_MOVI_UNMAPPED_COLLECTION;
+		ret = E_ITS_UNMAPPED_COLLECTION;
 		goto out_unlock;
 	}
 
 	collection = find_collection(kvm, coll_id);
 	if (!its_is_collection_mapped(collection)) {
-		ret = E_ITS_MOVI_UNMAPPED_COLLECTION;
+		ret = E_ITS_UNMAPPED_COLLECTION;
 		goto out_unlock;
 	}
 
@@ -582,7 +605,7 @@ static int vits_cmd_handle_mapi(struct kvm *kvm, u64 *its_cmd, u8 cmd)
 
 	device = find_its_device(kvm, device_id);
 	if (!device) {
-		ret = E_ITS_MAPTI_UNMAPPED_DEVICE;
+		ret = E_ITS_UNMAPPED_DEVICE;
 		goto out_unlock;
 	}
 
@@ -598,11 +621,11 @@ static int vits_cmd_handle_mapi(struct kvm *kvm, u64 *its_cmd, u8 cmd)
 		lpi_nr = event_id;
 	if (lpi_nr < GIC_LPI_OFFSET ||
 	    lpi_nr >= nr_idbits_propbase(dist->propbaser)) {
-		ret = E_ITS_MAPTI_PHYSICALID_OOR;
+		ret = E_ITS_PHYSICALID_OOR;
 		goto out_unlock;
 	}
 
-	itte = find_itte(kvm, device_id, event_id);
+	itte = find_itte_in_device(device, event_id);
 	if (!itte) {
 		if (!new_itte || !new_itte->pending) {
 			ret = -ENOMEM;
@@ -686,6 +709,7 @@ static int vits_cmd_handle_mapd(struct kvm *kvm, u64 *its_cmd)
 	device = new_device;
 
 	device->device_id = device_id;
+	device->itt_size = its_cmd_get_size(its_cmd);
 	INIT_LIST_HEAD(&device->itt);
 
 	list_add_tail(&device->dev_list,
@@ -710,7 +734,7 @@ static int vits_cmd_handle_mapc(struct kvm *kvm, u64 *its_cmd)
 	target_addr = its_cmd_get_target_addr(its_cmd);
 
 	if (target_addr >= atomic_read(&kvm->online_vcpus))
-		return E_ITS_MAPC_PROCNUM_OOR;
+		return -EINVAL;
 
 	/* We preallocate memory outside of the lock here */
 	if (valid) {
@@ -769,11 +793,9 @@ static int vits_cmd_handle_clear(struct kvm *kvm, u64 *its_cmd)
 
 	spin_lock(&its->lock);
 
-	itte = find_itte(kvm, device_id, event_id);
-	if (!itte) {
-		ret = E_ITS_CLEAR_UNMAPPED_INTERRUPT;
+	itte = find_itte(kvm, device_id, event_id, &ret);
+	if (!itte)
 		goto out_unlock;
-	}
 
 	if (its_is_collection_mapped(itte->collection))
 		__clear_bit(itte->collection->target_addr, itte->pending);
@@ -798,10 +820,10 @@ static int vits_cmd_handle_inv(struct kvm *kvm, u64 *its_cmd)
 	event_id = its_cmd_get_id(its_cmd);
 
 	spin_lock(&dist->its.lock);
-	itte = find_itte(kvm, device_id, event_id);
+	itte = find_itte(kvm, device_id, event_id, &ret);
 	spin_unlock(&dist->its.lock);
 	if (!itte)
-		return E_ITS_INV_UNMAPPED_INTERRUPT;
+		return ret;
 
 	/*
 	 * We cannot read from guest memory inside the spinlock, so we
@@ -816,7 +838,7 @@ static int vits_cmd_handle_inv(struct kvm *kvm, u64 *its_cmd)
 			return ret;
 
 		spin_lock(&dist->its.lock);
-		new_itte = find_itte(kvm, device_id, event_id);
+		new_itte = find_itte(kvm, device_id, event_id, &ret);
 		if (new_itte->lpi != itte->lpi) {
 			itte = new_itte;
 			spin_unlock(&dist->its.lock);
@@ -839,7 +861,7 @@ static int vits_cmd_handle_invall(struct kvm *kvm, u64 *its_cmd)
 
 	collection = find_collection(kvm, coll_id);
 	if (!its_is_collection_mapped(collection))
-		return E_ITS_INVALL_UNMAPPED_COLLECTION;
+		return E_ITS_UNMAPPED_COLLECTION;
 
 	vcpu = kvm_get_vcpu(kvm, collection->target_addr);
 
@@ -866,7 +888,7 @@ static int vits_cmd_handle_movall(struct kvm *kvm, u64 *its_cmd)
 
 	if (target1_addr >= atomic_read(&kvm->online_vcpus) ||
 	    target2_addr >= atomic_read(&kvm->online_vcpus))
-		return E_ITS_MOVALL_PROCNUM_OOR;
+		return -EINVAL;
 
 	if (target1_addr == target2_addr)
 		return 0;
diff --git a/virt/kvm/arm/its-emul.h b/virt/kvm/arm/its-emul.h
index f7fa5f8..56161d9 100644
--- a/virt/kvm/arm/its-emul.h
+++ b/virt/kvm/arm/its-emul.h
@@ -43,15 +43,10 @@ bool vits_queue_lpis(struct kvm_vcpu *vcpu);
 void vits_unqueue_lpi(struct kvm_vcpu *vcpu, int irq);
 bool vits_check_lpis(struct kvm_vcpu *vcpu);
 
-#define E_ITS_MOVI_UNMAPPED_INTERRUPT		0x010107
-#define E_ITS_MOVI_UNMAPPED_COLLECTION		0x010109
-#define E_ITS_CLEAR_UNMAPPED_INTERRUPT		0x010507
-#define E_ITS_MAPC_PROCNUM_OOR			0x010902
-#define E_ITS_MAPTI_UNMAPPED_DEVICE		0x010a04
-#define E_ITS_MAPTI_PHYSICALID_OOR		0x010a06
-#define E_ITS_INV_UNMAPPED_INTERRUPT		0x010c07
-#define E_ITS_INVALL_UNMAPPED_COLLECTION	0x010d09
-#define E_ITS_MOVALL_PROCNUM_OOR		0x010e01
-#define E_ITS_DISCARD_UNMAPPED_INTERRUPT	0x010f07
+#define E_ITS_UNMAPPED_DEVICE		0x010004
+#define E_ITS_ID_OOR			0x010005
+#define E_ITS_PHYSICALID_OOR		0x010006
+#define E_ITS_UNMAPPED_INTERRUPT	0x010007
+#define E_ITS_UNMAPPED_COLLECTION	0x010009
 
 #endif
-- 
2.4.4


Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

^ permalink raw reply related	[flat|nested] 101+ messages in thread

* RE: [PATCH v3 13/16] KVM: arm64: sync LPI configuration and pending tables
  2015-10-07 14:55   ` Andre Przywara
@ 2015-10-21 11:29     ` Pavel Fedin
  -1 siblings, 0 replies; 101+ messages in thread
From: Pavel Fedin @ 2015-10-21 11:29 UTC (permalink / raw)
  To: 'Andre Przywara', marc.zyngier, christoffer.dall
  Cc: eric.auger, kvmarm, linux-arm-kernel, kvm

 Hello!

> -----Original Message-----
> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] On Behalf Of Andre Przywara
> Sent: Wednesday, October 07, 2015 5:55 PM
> To: marc.zyngier@arm.com; christoffer.dall@linaro.org
> Cc: eric.auger@linaro.org; p.fedin@samsung.com; kvmarm@lists.cs.columbia.edu; linux-arm-
> kernel@lists.infradead.org; kvm@vger.kernel.org
> Subject: [PATCH v3 13/16] KVM: arm64: sync LPI configuration and pending tables
> 
> The LPI configuration and pending tables of the GICv3 LPIs are held
> in tables in (guest) memory. To achieve reasonable performance, we
> cache this data in our own data structures, so we need to sync those
> two views from time to time. This behaviour is well described in the
> GICv3 spec and is also exercised by hardware, so the sync points are
> well known.
> 
> Provide functions that read the guest memory and store the
> information from the configuration and pending tables in the kernel.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
> Changelog v2..v3:
> - rework functions to avoid propbaser/pendbaser accesses inside lock
> 
>  include/kvm/arm_vgic.h  |   2 +
>  virt/kvm/arm/its-emul.c | 133 ++++++++++++++++++++++++++++++++++++++++++++++++
>  virt/kvm/arm/its-emul.h |   3 ++
>  3 files changed, 138 insertions(+)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 035911f..4ea023c 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -179,6 +179,8 @@ struct vgic_its {
>  	int			cwriter;
>  	struct list_head	device_list;
>  	struct list_head	collection_list;
> +	/* memory used for buffering guest's memory */
> +	void			*buffer_page;
>  };
> 
>  struct vgic_dist {
> diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
> index 8349970..7a8c5db 100644
> --- a/virt/kvm/arm/its-emul.c
> +++ b/virt/kvm/arm/its-emul.c
> @@ -59,6 +59,7 @@ struct its_itte {
>  	struct its_collection *collection;
>  	u32 lpi;
>  	u32 event_id;
> +	u8 priority;
>  	bool enabled;
>  	unsigned long *pending;
>  };
> @@ -80,8 +81,124 @@ static struct its_itte *find_itte_by_lpi(struct kvm *kvm, int lpi)
>  	return NULL;
>  }
> 
> +#define LPI_PROP_ENABLE_BIT(p)	((p) & LPI_PROP_ENABLED)
> +#define LPI_PROP_PRIORITY(p)	((p) & 0xfc)
> +
> +/* stores the priority and enable bit for a given LPI */
> +static void update_lpi_config(struct kvm *kvm, struct its_itte *itte, u8 prop)
> +{
> +	itte->priority = LPI_PROP_PRIORITY(prop);
> +	itte->enabled  = LPI_PROP_ENABLE_BIT(prop);
> +}
> +
> +#define GIC_LPI_OFFSET 8192
> +
> +/* We scan the table in chunks the size of the smallest page size */
> +#define CHUNK_SIZE 4096U
> +
>  #define BASER_BASE_ADDRESS(x) ((x) & 0xfffffffff000ULL)
> 
> +static int nr_idbits_propbase(u64 propbaser)
> +{
> +	int nr_idbits = (1U << (propbaser & 0x1f)) + 1;
> +
> +	return max(nr_idbits, INTERRUPT_ID_BITS_ITS);
> +}
> +
> +/*
> + * Scan the whole LPI configuration table and put the LPI configuration
> + * data in our own data structures. This relies on the LPI being
> + * mapped before.
> + */
> +static bool its_update_lpis_configuration(struct kvm *kvm, u64 prop_base_reg)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	u8 *prop = dist->its.buffer_page;
> +	u32 tsize;
> +	gpa_t propbase;
> +	int lpi = GIC_LPI_OFFSET;
> +	struct its_itte *itte;
> +	struct its_device *device;
> +	int ret;
> +
> +	propbase = BASER_BASE_ADDRESS(prop_base_reg);
> +	tsize = nr_idbits_propbase(prop_base_reg);
> +
> +	while (tsize > 0) {
> +		int chunksize = min(tsize, CHUNK_SIZE);
> +
> +		ret = kvm_read_guest(kvm, propbase, prop, chunksize);
> +		if (ret)
> +			return false;

 I think it would be more convenient to return 'ret' here, and 0 on success. I see that nobody consumes the error code at the
moment, but this may change with live migration. The same applies to its_sync_lpi_pending_table().
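
 A minimal userspace sketch of what I mean, assuming a reader callback that stands in for kvm_read_guest(). The names scan_table(),
read_ok() and read_fault() are made up for illustration only.

```c
#include <errno.h>
#include <stddef.h>

#define CHUNK_SIZE 4096U

/* Hypothetical stand-in for kvm_read_guest(): returns 0 or a negative errno. */
typedef int (*read_fn)(unsigned long addr, void *buf, size_t len);

/*
 * Walk a table of 'size' bytes in chunks. Returns 0 on success or the
 * first error reported by the reader, instead of collapsing it to a bool.
 */
static int scan_table(read_fn reader, unsigned long base, size_t size,
		      void *buf)
{
	while (size > 0) {
		size_t chunk = size < CHUNK_SIZE ? size : CHUNK_SIZE;
		int ret = reader(base, buf, chunk);

		if (ret)
			return ret;

		/* ... process the chunk here ... */
		size -= chunk;
		base += chunk;
	}

	return 0;
}

/* Toy reader that always succeeds. */
static int read_ok(unsigned long addr, void *buf, size_t len)
{
	(void)addr; (void)buf; (void)len;
	return 0;
}

/* Toy reader that always fails, like a read from an unbacked guest address. */
static int read_fault(unsigned long addr, void *buf, size_t len)
{
	(void)addr; (void)buf; (void)len;
	return -EFAULT;
}
```

 A caller can then tell a failed guest read (-EFAULT from read_fault) apart from success (0), which a plain bool would hide.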

> +
> +		spin_lock(&dist->its.lock);
> +		/*
> +		 * Updating the status for all allocated LPIs. We catch
> +		 * those LPIs that get disabled. We really don't care
> +		 * about unmapped LPIs, as they need to be updated
> +		 * later manually anyway once they get mapped.
> +		 */
> +		for_each_lpi(device, itte, kvm) {
> +			if (itte->lpi < lpi || itte->lpi >= lpi + chunksize)
> +				continue;
> +
> +			update_lpi_config(kvm, itte, prop[itte->lpi - lpi]);
> +		}
> +		spin_unlock(&dist->its.lock);
> +		tsize -= chunksize;
> +		lpi += chunksize;
> +		propbase += chunksize;
> +	}
> +
> +	return true;
> +}
> +
> +/*
> + * Scan the whole LPI pending table and sync the pending bit in there
> + * with our own data structures. This relies on the LPI being
> + * mapped before.
> + */
> +static bool its_sync_lpi_pending_table(struct kvm_vcpu *vcpu, u64 base_addr_reg)
> +{
> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +	unsigned long *pendmask = dist->its.buffer_page;
> +	u32 nr_lpis = VITS_NR_LPIS;
> +	gpa_t pendbase;
> +	int lpi = 0;
> +	struct its_itte *itte;
> +	struct its_device *device;
> +	int ret;
> +	int lpi_bit, nr_bits;
> +
> +	pendbase = BASER_BASE_ADDRESS(base_addr_reg);
> +
> +	while (nr_lpis > 0) {
> +		nr_bits = min(nr_lpis, CHUNK_SIZE * 8);
> +
> +		ret = kvm_read_guest(vcpu->kvm, pendbase, pendmask,
> +				     nr_bits / 8);
> +		if (ret)
> +			return false;
> +
> +		spin_lock(&dist->its.lock);
> +		for_each_lpi(device, itte, vcpu->kvm) {
> +			lpi_bit = itte->lpi - lpi;
> +			if (lpi_bit < 0 || lpi_bit >= nr_bits)
> +				continue;
> +			if (test_bit(lpi_bit, pendmask))
> +				__set_bit(vcpu->vcpu_id, itte->pending);
> +			else
> +				__clear_bit(vcpu->vcpu_id, itte->pending);
> +		}
> +		spin_unlock(&dist->its.lock);
> +		nr_lpis -= nr_bits;
> +		lpi += nr_bits;
> +		pendbase += nr_bits / 8;
> +	}
> +
> +	return true;
> +}
> +
>  /* The distributor lock is held by the VGIC MMIO handler. */
>  static bool handle_mmio_misc_gits(struct kvm_vcpu *vcpu,
>  				  struct kvm_exit_mmio *mmio,
> @@ -418,6 +535,17 @@ static const struct vgic_io_range vgicv3_its_ranges[] = {
>  /* This is called on setting the LPI enable bit in the redistributor. */
>  void vgic_enable_lpis(struct kvm_vcpu *vcpu)
>  {
> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +	u64 prop_base_reg, pend_base_reg;
> +
> +	pend_base_reg = dist->pendbaser[vcpu->vcpu_id];
> +	prop_base_reg = dist->propbaser;
> +	spin_unlock(&dist->lock);
> +
> +	its_update_lpis_configuration(vcpu->kvm, prop_base_reg);
> +	its_sync_lpi_pending_table(vcpu, pend_base_reg);
> +
> +	spin_lock(&dist->lock);
>  }
> 
>  int vits_init(struct kvm *kvm)
> @@ -429,6 +557,10 @@ int vits_init(struct kvm *kvm)
>  	if (!dist->pendbaser)
>  		return -ENOMEM;
> 
> +	its->buffer_page = kmalloc(CHUNK_SIZE, GFP_KERNEL);
> +	if (!its->buffer_page)
> +		return -ENOMEM;
> +
>  	spin_lock_init(&its->lock);
> 
>  	INIT_LIST_HEAD(&its->device_list);
> @@ -474,6 +606,7 @@ void vits_destroy(struct kvm *kvm)
>  		kfree(container_of(cur, struct its_collection, coll_list));
>  	}
> 
> +	kfree(its->buffer_page);
>  	kfree(dist->pendbaser);
> 
>  	its->enabled = false;
> diff --git a/virt/kvm/arm/its-emul.h b/virt/kvm/arm/its-emul.h
> index cc5d5ff..cbc3877 100644
> --- a/virt/kvm/arm/its-emul.h
> +++ b/virt/kvm/arm/its-emul.h
> @@ -29,6 +29,9 @@
> 
>  #include "vgic.h"
> 
> +#define INTERRUPT_ID_BITS_ITS 16
> +#define VITS_NR_LPIS (1U << INTERRUPT_ID_BITS_ITS)
> +
>  void vgic_enable_lpis(struct kvm_vcpu *vcpu);
>  int vits_init(struct kvm *kvm);
>  void vits_destroy(struct kvm *kvm);
> --
> 2.5.1

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



^ permalink raw reply	[flat|nested] 101+ messages in thread

* RE: [PATCH v3 08/16] KVM: arm64: handle ITS related GICv3 redistributor registers
  2015-10-07 14:55   ` Andre Przywara
@ 2015-10-22 15:46     ` Pavel Fedin
  -1 siblings, 0 replies; 101+ messages in thread
From: Pavel Fedin @ 2015-10-22 15:46 UTC (permalink / raw)
  To: 'Andre Przywara', marc.zyngier, christoffer.dall
  Cc: eric.auger, kvmarm, linux-arm-kernel, kvm

 Hello!

 During my work on live migration I found a big bug in your implementation.

> -----Original Message-----
> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] On Behalf Of Andre Przywara
> Sent: Wednesday, October 07, 2015 5:55 PM
> To: marc.zyngier@arm.com; christoffer.dall@linaro.org
> Cc: eric.auger@linaro.org; p.fedin@samsung.com; kvmarm@lists.cs.columbia.edu; linux-arm-
> kernel@lists.infradead.org; kvm@vger.kernel.org
> Subject: [PATCH v3 08/16] KVM: arm64: handle ITS related GICv3 redistributor registers
> 
> In the GICv3 redistributor there are the PENDBASER and PROPBASER
> registers which we did not emulate so far, as they only make sense
> when having an ITS. In preparation for that emulate those MMIO
> accesses by storing the 64-bit data written into it into a variable
> which we later read in the ITS emulation.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
> Changelog v2..v3:
> - rename vgic_handle_base_register to vgic_reg64_access()
> 
>  include/kvm/arm_vgic.h      |  8 ++++++++
>  virt/kvm/arm/vgic-v3-emul.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
>  virt/kvm/arm/vgic.c         | 31 +++++++++++++++++++++++++++++++
>  virt/kvm/arm/vgic.h         |  2 ++
>  4 files changed, 85 insertions(+)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 067ad09..06c33bc 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -272,6 +272,14 @@ struct vgic_dist {
>  	/* Virtual irq to hwirq mapping */
>  	spinlock_t		irq_phys_map_lock;
>  	struct list_head	irq_phys_map_list;
> +
> +	/* Address of LPI configuration table shared by all redistributors */
> +	u64			propbaser;
> +
> +	/* Addresses of LPI pending tables per redistributor */
> +	u64			*pendbaser;
> +
> +	bool			lpis_enabled;
>  };
> 
>  struct vgic_v2_cpu_if {
> diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
> index a8cf669..6939f7c 100644
> --- a/virt/kvm/arm/vgic-v3-emul.c
> +++ b/virt/kvm/arm/vgic-v3-emul.c
> @@ -651,6 +651,38 @@ static bool handle_mmio_cfg_reg_redist(struct kvm_vcpu *vcpu,
>  	return vgic_handle_cfg_reg(reg, mmio, offset);
>  }
> 
> +/* We don't trigger any actions here, just store the register value */
> +static bool handle_mmio_propbaser_redist(struct kvm_vcpu *vcpu,
> +					 struct kvm_exit_mmio *mmio,
> +					 phys_addr_t offset)
> +{
> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +	int mode = ACCESS_READ_VALUE;
> +
> +	/* Storing a value with LPIs already enabled is undefined */
> +	mode |= dist->lpis_enabled ? ACCESS_WRITE_IGNORED : ACCESS_WRITE_VALUE;
> +	vgic_reg64_access(mmio, offset, &dist->propbaser, mode);
> +
> +	return false;
> +}
> +
> +/* We don't trigger any actions here, just store the register value */
> +static bool handle_mmio_pendbaser_redist(struct kvm_vcpu *vcpu,
> +					 struct kvm_exit_mmio *mmio,
> +					 phys_addr_t offset)
> +{
> +	struct kvm_vcpu *rdvcpu = mmio->private;
> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +	int mode = ACCESS_READ_VALUE;
> +
> +	/* Storing a value with LPIs already enabled is undefined */
> +	mode |= dist->lpis_enabled ? ACCESS_WRITE_IGNORED : ACCESS_WRITE_VALUE;

 Here you store lpis_enabled globally, and this is plain wrong.

 The Linux kernel programs PENDBASER and enables LPIs separately on every CPU. Therefore, after CPU #0 is initialized (this happens much
earlier than everything else), dist->lpis_enabled is set to true, and subsequent PENDBASER writes, even for different
redistributors, will be ignored. As a result, dist->pendbaser[n] stays 0 forever for every n > 0, and your
its_sync_lpi_pending_table() actually reads garbage from physical address 0 of the guest.

 Attempts to write data to that region silently corrupt random QEMU data during migration; that is how I discovered it.
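
 The failure mode described above can be reproduced in a small userspace model. This is only a sketch: struct model_dist,
model_write_pendbaser() and model_boot_all_cpus() are made-up names, the table addresses are arbitrary, and the boot order
(program PENDBASER, then enable LPIs, one CPU after another) is assumed from the description above rather than taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VCPUS 4

/* Simplified stand-in for the relevant vgic_dist fields (model only). */
struct model_dist {
	uint64_t pendbaser[NR_VCPUS];
	bool lpis_enabled;	/* one flag shared by all redistributors */
};

/* Mirrors the quoted handler: the write is dropped once the flag is set. */
static void model_write_pendbaser(struct model_dist *dist, int cpu, uint64_t val)
{
	assert(cpu >= 0 && cpu < NR_VCPUS);
	if (!dist->lpis_enabled)
		dist->pendbaser[cpu] = val;
}

/* Each CPU programs its own PENDBASER, then enables LPIs, as a guest would. */
static void model_boot_all_cpus(struct model_dist *dist)
{
	for (int cpu = 0; cpu < NR_VCPUS; cpu++) {
		model_write_pendbaser(dist, cpu, 0x80000000ULL + 0x10000ULL * cpu);
		dist->lpis_enabled = true;	/* EnableLPIs written on this CPU */
	}
}
```

 With a zero-initialized model_dist, only pendbaser[0] holds an address after model_boot_all_cpus(); the entries for CPUs 1..3
are still 0, which matches the symptom reported here.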

> +	vgic_reg64_access(mmio, offset,
> +			  &dist->pendbaser[rdvcpu->vcpu_id], mode);
> +
> +	return false;
> +}
> +
>  #define SGI_base(x) ((x) + SZ_64K)
> 
>  static const struct vgic_io_range vgic_redist_ranges[] = {
> @@ -679,6 +711,18 @@ static const struct vgic_io_range vgic_redist_ranges[] = {
>  		.handle_mmio    = handle_mmio_raz_wi,
>  	},
>  	{
> +		.base		= GICR_PENDBASER,
> +		.len		= 0x08,
> +		.bits_per_irq	= 0,
> +		.handle_mmio	= handle_mmio_pendbaser_redist,
> +	},
> +	{
> +		.base		= GICR_PROPBASER,
> +		.len		= 0x08,
> +		.bits_per_irq	= 0,
> +		.handle_mmio	= handle_mmio_propbaser_redist,
> +	},
> +	{
>  		.base           = GICR_IDREGS,
>  		.len            = 0x30,
>  		.bits_per_irq   = 0,
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 4219f22..11bf692 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -471,6 +471,37 @@ void vgic_reg_access(struct kvm_exit_mmio *mmio, u32 *reg,
>  	}
>  }
> 
> +/* handle a 64-bit register access */
> +void vgic_reg64_access(struct kvm_exit_mmio *mmio, phys_addr_t offset,
> +		       u64 *basereg, int mode)
> +{
> +	u32 reg;
> +	u64 breg;
> +
> +	switch (offset & ~3) {
> +	case 0x00:
> +		breg = *basereg;
> +		reg = lower_32_bits(breg);
> +		vgic_reg_access(mmio, &reg, offset & 3, mode);
> +		if (mmio->is_write && (mode & ACCESS_WRITE_VALUE)) {
> +			breg &= GENMASK_ULL(63, 32);
> +			breg |= reg;
> +			*basereg = breg;
> +		}
> +		break;
> +	case 0x04:
> +		breg = *basereg;
> +		reg = upper_32_bits(breg);
> +		vgic_reg_access(mmio, &reg, offset & 3, mode);
> +		if (mmio->is_write && (mode & ACCESS_WRITE_VALUE)) {
> +			breg  = lower_32_bits(breg);
> +			breg |= (u64)reg << 32;
> +			*basereg = breg;
> +		}
> +		break;
> +	}
> +}
> +
>  bool handle_mmio_raz_wi(struct kvm_vcpu *vcpu, struct kvm_exit_mmio *mmio,
>  			phys_addr_t offset)
>  {
> diff --git a/virt/kvm/arm/vgic.h b/virt/kvm/arm/vgic.h
> index a093f5c..104f780 100644
> --- a/virt/kvm/arm/vgic.h
> +++ b/virt/kvm/arm/vgic.h
> @@ -71,6 +71,8 @@ void vgic_reg_access(struct kvm_exit_mmio *mmio, u32 *reg,
>  		     phys_addr_t offset, int mode);
>  bool handle_mmio_raz_wi(struct kvm_vcpu *vcpu, struct kvm_exit_mmio *mmio,
>  			phys_addr_t offset);
> +void vgic_reg64_access(struct kvm_exit_mmio *mmio, phys_addr_t offset,
> +		       u64 *basereg, int mode);
> 
>  static inline
>  u32 mmio_data_read(struct kvm_exit_mmio *mmio, u32 mask)
> --
> 2.5.1
> 

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



^ permalink raw reply	[flat|nested] 101+ messages in thread

* RE: [PATCH v3 08/16] KVM: arm64: handle ITS related GICv3 redistributor registers
  2015-10-22 15:46     ` Pavel Fedin
@ 2015-10-22 15:55       ` Pavel Fedin
  -1 siblings, 0 replies; 101+ messages in thread
From: Pavel Fedin @ 2015-10-22 15:55 UTC (permalink / raw)
  To: 'Andre Przywara', marc.zyngier, christoffer.dall
  Cc: eric.auger, kvmarm, linux-arm-kernel, kvm

 Hello!

 One more idea...

>  Here you store lpis_enabled globally, and this is plain wrong.

 By the way, maybe we should move this flag, together with the pendbaser array, into struct vgic_cpu? Then we would not have to
allocate them manually.
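
 A rough sketch of what such a layout could look like, as a userspace model. All names here (struct model_vgic_cpu,
struct model_vm, model_write_pendbaser(), model_enable_lpis()) are hypothetical and only illustrate the idea; the real
struct vgic_cpu has many more fields, and propbaser is assumed to stay shared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VCPUS 4

/* Hypothetical per-redistributor state, as it might sit in struct vgic_cpu. */
struct model_vgic_cpu {
	uint64_t pendbaser;
	bool lpis_enabled;
};

struct model_vm {
	uint64_t propbaser;			/* still shared by all redistributors */
	struct model_vgic_cpu cpu[NR_VCPUS];	/* embedded, no separate allocation */
};

/* A write is only ignored if LPIs are already enabled on *this* redistributor. */
static bool model_write_pendbaser(struct model_vm *vm, int cpu, uint64_t val)
{
	assert(cpu >= 0 && cpu < NR_VCPUS);
	if (vm->cpu[cpu].lpis_enabled)
		return false;
	vm->cpu[cpu].pendbaser = val;
	return true;
}

static void model_enable_lpis(struct model_vm *vm, int cpu)
{
	assert(cpu >= 0 && cpu < NR_VCPUS);
	vm->cpu[cpu].lpis_enabled = true;
}
```

 In this model, enabling LPIs on CPU 0 no longer blocks the PENDBASER write for CPU 1, while a later write to CPU 0's own
register is still ignored.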

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



^ permalink raw reply	[flat|nested] 101+ messages in thread

* RE: [PATCH v3 15/16] KVM: arm64: implement MSI injection in ITS emulation
  2015-10-07 14:55   ` Andre Przywara
@ 2015-11-25 13:28     ` Pavel Fedin
  -1 siblings, 0 replies; 101+ messages in thread
From: Pavel Fedin @ 2015-11-25 13:28 UTC (permalink / raw)
  To: 'Andre Przywara', marc.zyngier, christoffer.dall
  Cc: kvm, kvmarm, linux-arm-kernel

 Hello!

 I have discovered one more issue, and it is a major one. It is triggered by VFIO. See inline.

 P.S. What is the overall current status? A long time has passed since the last email...

> -----Original Message-----
> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] On Behalf Of Andre Przywara
> Sent: Wednesday, October 07, 2015 5:55 PM
> To: marc.zyngier@arm.com; christoffer.dall@linaro.org
> Cc: eric.auger@linaro.org; p.fedin@samsung.com; kvmarm@lists.cs.columbia.edu; linux-arm-
> kernel@lists.infradead.org; kvm@vger.kernel.org
> Subject: [PATCH v3 15/16] KVM: arm64: implement MSI injection in ITS emulation
> 
> When userland wants to inject a MSI into the guest, we have to use
> our data structures to find the LPI number and the VCPU to receive
> the interrupt.
> Use the wrapper functions to iterate the linked lists and find the
> proper Interrupt Translation Table Entry. Then set the pending bit
> in this ITTE to be later picked up by the LR handling code. Kick
> the VCPU which is meant to handle this interrupt.
> We provide a VGIC emulation model specific routine for the actual
> MSI injection. The wrapper functions return an error for models not
> (yet) implementing MSIs (like the GICv2 emulation).
> We also provide the handler for the ITS "INT" command, which allows a
> guest to trigger an MSI via the ITS command queue.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
> Changelog v2..v3:
> - proper checking for unmapped collections
> 
>  include/kvm/arm_vgic.h      |  1 +
>  virt/kvm/arm/its-emul.c     | 65 +++++++++++++++++++++++++++++++++++++++++++++
>  virt/kvm/arm/its-emul.h     |  2 ++
>  virt/kvm/arm/vgic-v3-emul.c |  1 +
>  4 files changed, 69 insertions(+)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 4ea023c..7911059 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -149,6 +149,7 @@ struct vgic_vm_ops {
>  	int	(*map_resources)(struct kvm *, const struct vgic_params *);
>  	bool	(*queue_lpis)(struct kvm_vcpu *);
>  	void	(*unqueue_lpi)(struct kvm_vcpu *, int irq);
> +	int	(*inject_msi)(struct kvm *, struct kvm_msi *);
>  };
> 
>  struct vgic_io_device {
> diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
> index 642effb..cd8526a 100644
> --- a/virt/kvm/arm/its-emul.c
> +++ b/virt/kvm/arm/its-emul.c
> @@ -333,6 +333,55 @@ static bool handle_mmio_gits_idregs(struct kvm_vcpu *vcpu,
>  }
> 
>  /*
> + * Translates an incoming MSI request into the redistributor (=VCPU) and
> + * the associated LPI number. Sets the LPI pending bit and also marks the
> + * VCPU as having a pending interrupt.
> + */
> +int vits_inject_msi(struct kvm *kvm, struct kvm_msi *msi)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	struct vgic_its *its = &dist->its;
> +	struct its_itte *itte;
> +	int cpuid;
> +	bool inject = false;
> +	int ret = 0;
> +
> +	if (!vgic_has_its(kvm))
> +		return -ENODEV;
> +
> +	if (!(msi->flags & KVM_MSI_VALID_DEVID))
> +		return -EINVAL;
> +
> +	spin_lock(&its->lock);
> +
> +	if (!its->enabled || !dist->lpis_enabled) {
> +		ret = -EAGAIN;
> +		goto out_unlock;
> +	}
> +
> +	itte = find_itte(kvm, msi->devid, msi->data);
> +	/* Triggering an unmapped IRQ gets silently dropped. */
> +	if (!itte || !its_is_collection_mapped(itte->collection))
> +		goto out_unlock;
> +
> +	cpuid = itte->collection->target_addr;
> +	__set_bit(cpuid, itte->pending);
> +	inject = itte->enabled;
> +
> +out_unlock:
> +	spin_unlock(&its->lock);
> +
> +	if (inject) {
> +		spin_lock(&dist->lock);

 A deadlock is possible at this point, because dist->lock is taken in many places in KVM. If we are forwarding a VFIO IRQ
using irqfds, irqfd_wakeup() calls kvm_set_msi() directly, which ends up here. But interrupts from VFIO devices can arrive
at any moment, including while dist->lock is held by the KVM status update code.
 For now I have added a simple workaround that disables the MSI fast path for KVM_ARM_HOST, but I don't think it is a good solution.
Can we do better?
 OTOH, I know that direct IRQ forwarding is on the way.
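
 The scenario can be illustrated with a toy single-CPU model. It is a sketch under assumptions: the lock is a plain flag
rather than a real spinlock, all model_* names are invented, and the interrupt is assumed to arrive on the same CPU that
already holds the lock. A failed trylock in the model stands for a spin_lock() that would never return.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: a second acquire on the same CPU can never succeed. */
struct model_lock {
	bool held;
};

static bool model_trylock(struct model_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void model_unlock(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
}

/*
 * Stand-in for the injection path reached from irqfd_wakeup(). It reports
 * whether taking the lock would have succeeded; with a real spin_lock() a
 * false result means spinning forever on the interrupted CPU.
 */
static bool model_inject_from_irq(struct model_lock *dist_lock)
{
	if (!model_trylock(dist_lock))
		return false;
	/* ... mark the interrupt pending on the target CPU ... */
	model_unlock(dist_lock);
	return true;
}

/* Stand-in for KVM code that holds the lock when the device interrupt fires. */
static bool model_update_with_irq(struct model_lock *dist_lock)
{
	bool irq_ok;

	if (!model_trylock(dist_lock))	/* lock taken with interrupts enabled */
		return false;
	irq_ok = model_inject_from_irq(dist_lock);	/* interrupt arrives now */
	model_unlock(dist_lock);
	return irq_ok;
}
```

 model_inject_from_irq() succeeds on an idle lock, but model_update_with_irq() returns false: the nested acquire is the one
that would hang.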

> +		__set_bit(cpuid, dist->irq_pending_on_cpu);
> +		spin_unlock(&dist->lock);
> +		kvm_vcpu_kick(kvm_get_vcpu(kvm, cpuid));
> +	}
> +
> +	return ret;
> +}
> +
> +/*
>   * Find all enabled and pending LPIs and queue them into the list
>   * registers.
>   * The dist lock is held by the caller.
> @@ -812,6 +861,19 @@ static int vits_cmd_handle_movall(struct kvm *kvm, u64 *its_cmd)
>  	return 0;
>  }
> 
> +/* The INT command injects the LPI associated with that DevID/EvID pair. */
> +static int vits_cmd_handle_int(struct kvm *kvm, u64 *its_cmd)
> +{
> +	struct kvm_msi msi = {
> +		.data = its_cmd_get_id(its_cmd),
> +		.devid = its_cmd_get_deviceid(its_cmd),
> +		.flags = KVM_MSI_VALID_DEVID,
> +	};
> +
> +	vits_inject_msi(kvm, &msi);
> +	return 0;
> +}
> +
>  /*
>   * This function is called with both the ITS and the distributor lock dropped,
>   * so the actual command handlers must take the respective locks when needed.
> @@ -846,6 +908,9 @@ static int vits_handle_command(struct kvm_vcpu *vcpu, u64 *its_cmd)
>  	case GITS_CMD_MOVALL:
>  		ret = vits_cmd_handle_movall(vcpu->kvm, its_cmd);
>  		break;
> +	case GITS_CMD_INT:
> +		ret = vits_cmd_handle_int(vcpu->kvm, its_cmd);
> +		break;
>  	case GITS_CMD_INV:
>  		ret = vits_cmd_handle_inv(vcpu->kvm, its_cmd);
>  		break;
> diff --git a/virt/kvm/arm/its-emul.h b/virt/kvm/arm/its-emul.h
> index 830524a..95e56a7 100644
> --- a/virt/kvm/arm/its-emul.h
> +++ b/virt/kvm/arm/its-emul.h
> @@ -36,6 +36,8 @@ void vgic_enable_lpis(struct kvm_vcpu *vcpu);
>  int vits_init(struct kvm *kvm);
>  void vits_destroy(struct kvm *kvm);
> 
> +int vits_inject_msi(struct kvm *kvm, struct kvm_msi *msi);
> +
>  bool vits_queue_lpis(struct kvm_vcpu *vcpu);
>  void vits_unqueue_lpi(struct kvm_vcpu *vcpu, int irq);
> 
> diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
> index f482e34..90f3628 100644
> --- a/virt/kvm/arm/vgic-v3-emul.c
> +++ b/virt/kvm/arm/vgic-v3-emul.c
> @@ -944,6 +944,7 @@ void vgic_v3_init_emulation(struct kvm *kvm)
>  	dist->vm_ops.init_model = vgic_v3_init_model;
>  	dist->vm_ops.destroy_model = vgic_v3_destroy_model;
>  	dist->vm_ops.map_resources = vgic_v3_map_resources;
> +	dist->vm_ops.inject_msi = vits_inject_msi;
>  	dist->vm_ops.queue_lpis = vits_queue_lpis;
>  	dist->vm_ops.unqueue_lpi = vits_unqueue_lpi;
> 
> --
> 2.5.1

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2015-10-07 14:55 ` Andre Przywara
@ 2016-03-09 11:35   ` Tomasz Nowicki
  -1 siblings, 0 replies; 101+ messages in thread
From: Tomasz Nowicki @ 2016-03-09 11:35 UTC (permalink / raw)
  To: Andre Przywara, marc.zyngier, christoffer.dall
  Cc: linux-arm-kernel, kvmarm, kvm

Hi Andre,

Forgive me if anybody already asked this question for previous series 
versions.

The review is still pending, so it is worth asking: what is your idea for 
saving and restoring vITS state? I notice the device, itte and collection 
linked lists, which are essential for vITS state. Of course it is not 
feasible to transfer these lists to e.g. QEMU using 
KVM_{GET|SET}_DEVICE_ATTR.

Regards,
Tomasz

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2016-03-09 11:35   ` Tomasz Nowicki
@ 2016-03-13 18:16     ` Christoffer Dall
  -1 siblings, 0 replies; 101+ messages in thread
From: Christoffer Dall @ 2016-03-13 18:16 UTC (permalink / raw)
  To: Tomasz Nowicki
  Cc: Andre Przywara, marc.zyngier, kvm, kvmarm, linux-arm-kernel

On Wed, Mar 09, 2016 at 12:35:26PM +0100, Tomasz Nowicki wrote:
> Hi Andre,
> 
> Forgive me if anybody already asked this question for previous
> series versions.
> 
> The review is still pending so it is worth to ask. What is your idea
> for saving and restoring vITS state? I notice device, itte and
> collection linked lists which are essential for vITS state. Of
> course it is not feasible to transfer these list to e.g. QEMU using
> KVM_{GET|SET}_DEVICE_ATTR.
> 
If I recall correctly these items are the ones stored in memory on real
hardware, and not in hardware registers.

We had an idea where userspace asks the kernel vgic to flush its
internal cache into the memory allocated by the guest driver for the
vITS data structures and then the state would be transferred across to
the new VM via the memory transfer mechanism.

The only caveat there, I think, was that we had to decide on a storage
format in those memory regions, to allow QEMU to understand the state and
to ensure back/forwards compatibility between KVM versions.

-Christoffer

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2016-03-13 18:16     ` Christoffer Dall
@ 2016-03-14 11:13       ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2016-03-14 11:13 UTC (permalink / raw)
  To: Christoffer Dall, Tomasz Nowicki
  Cc: marc.zyngier, kvmarm, kvm, linux-arm-kernel

Hi,

On 13/03/16 18:16, Christoffer Dall wrote:
> On Wed, Mar 09, 2016 at 12:35:26PM +0100, Tomasz Nowicki wrote:
>> Hi Andre,
>>
>> Forgive me if anybody already asked this question for previous
>> series versions.
>>
>> The review is still pending so it is worth to ask. What is your idea
>> for saving and restoring vITS state? I notice device, itte and
>> collection linked lists which are essential for vITS state. Of
>> course it is not feasible to transfer these list to e.g. QEMU using
>> KVM_{GET|SET}_DEVICE_ATTR.
>>
> If I recall correctly these items are the ones stored in memory on real
> hardware, and not in hardware registers.

Potentially, but not necessarily.

> We had an idea where userspace asks the kernel vgic to flush its
> internal cache into the memory allocated by the guest driver for the
> vITS data structures and then the state would be transferred across to
> the new VM via the memory transfer mechanism.

The problem with this idea is that we currently don't use guest memory
to hold those data structures. As we report 0 on reads for all BASER<n>
registers, this includes Type=0 for each register, which translates into
"Unimplemented", so a guest OS would never allocate memory for it.
Instead we claim to hold all information in our "cache" (a.k.a. host
memory). This has several advantages, but obviously breaks this
save/restore approach.
So I see two ways to fix this:
1.) we find a KVM-specific way of letting userland save and restore the
ITS tables directly
2.) we implement the BASER<n> registers, but still use our "cache" for
normal operations. On demand we would serialize KVM's virtual ITS data
structures and put them into the guest's memory, so they could be
saved/restored from there.
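
To make the BASER<n> point concrete, here is a small sketch of how the
Type and Entry_Size fields could be decoded, and what a non-zero reset
value for option 2 might look like. The field positions and type
encodings follow GITS_BASER<n> as I understand the GICv3 architecture;
the helper names and the choice of 8-byte entries are assumptions for
illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions as in GITS_BASER<n> of the GICv3 architecture. */
#define BASER_TYPE_SHIFT        56
#define BASER_TYPE_MASK         0x7ULL
#define BASER_ENTRY_SIZE_SHIFT  48
#define BASER_ENTRY_SIZE_MASK   0x1fULL

enum baser_type {
	BASER_TYPE_NONE       = 0,  /* "Unimplemented" */
	BASER_TYPE_DEVICE     = 1,
	BASER_TYPE_COLLECTION = 4,
};

static unsigned int baser_type(uint64_t baser)
{
	return (baser >> BASER_TYPE_SHIFT) & BASER_TYPE_MASK;
}

/* Entry_Size holds the size of one table entry in bytes, minus one. */
static unsigned int baser_entry_bytes(uint64_t baser)
{
	return ((baser >> BASER_ENTRY_SIZE_SHIFT) & BASER_ENTRY_SIZE_MASK) + 1;
}

/* Hypothetical helper: reset value advertising a table with 8-byte entries. */
static uint64_t baser_reset_value(enum baser_type type)
{
	assert(type == BASER_TYPE_DEVICE || type == BASER_TYPE_COLLECTION);

	return ((uint64_t)type << BASER_TYPE_SHIFT) |
	       ((uint64_t)(8 - 1) << BASER_ENTRY_SIZE_SHIFT);
}
```

A register that reads as 0 decodes to BASER_TYPE_NONE, which is why a
guest driver skips it and never allocates memory for the table.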

> Only caveat there I think was that we had to decide on a storage format
> in those memory regions, to allow QEMU to understand the state and to
> ensure back/forwards compatibility between KVM versions.

Do we need QEMU to actually understand this? Can't we just leave this
all to the kernel and QEMU just passes on the data? That would still
require some ABI stability between kernel versions in this respect, but
it's less problematic than exposing the data format to userland at all.

Cheers,
Andre.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2016-03-14 11:13       ` Andre Przywara
@ 2016-03-14 17:29         ` Peter Maydell
  -1 siblings, 0 replies; 101+ messages in thread
From: Peter Maydell @ 2016-03-14 17:29 UTC (permalink / raw)
  To: Andre Przywara; +Cc: kvm-devel, Marc Zyngier, kvmarm, arm-mail-list

On 14 March 2016 at 11:13, Andre Przywara <andre.przywara@arm.com> wrote:
> So I see two ways to fix this:
> 1.) we find a KVM specific way of letting userland save and restore the
> ITS tables directly
> 2.) we implement the BASER<n> registers, but still use our "cache" for
> normal operations. On demand we would serialize KVM's virtual ITS data
> structures and put them into the guest's memory, so they could be
> saved/restored from there.

I feel like we're rehashing a bunch of design choices we talked
through way back in the last-but-one Connect. I don't suppose
anybody wrote down our rationales from back then?

(In particular I forget whether we decided the ITS tables were
large enough to need to allow some sort of before-the-VM-stops
migration of the data, which would be relatively doable with
option 2 but painful under option 1.)

>> Only caveat there I think was that we had to decide on a storage format
>> in those memory regions, to allow QEMU to understand the state and to
>> ensure back/forwards compatibility between KVM versions.
>
> Do we need QEMU to actually understand this? Can't we just leave this
> all to the kernel and QEMU just passes on the data? That would still
> require some ABI stability between kernel versions in this respect, but
> it's less problematic than exposing the data format to userland at all.

This would preclude ever being able to migrate a VM from KVM to
TCG QEMU, which seems a shame. (That doesn't work right now, but
I'm a bit wary of shutting the door to it forever.)

thanks
-- PMM

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2016-03-14 17:29         ` Peter Maydell
@ 2016-03-14 17:54           ` Marc Zyngier
  -1 siblings, 0 replies; 101+ messages in thread
From: Marc Zyngier @ 2016-03-14 17:54 UTC (permalink / raw)
  To: Peter Maydell, Andre Przywara; +Cc: arm-mail-list, kvmarm, kvm-devel

On 14/03/16 17:29, Peter Maydell wrote:
> On 14 March 2016 at 11:13, Andre Przywara <andre.przywara@arm.com> wrote:
>> So I see two ways to fix this:
>> 1.) we find a KVM specific way of letting userland save and restore the
>> ITS tables directly
>> 2.) we implement the BASER<n> registers, but still use our "cache" for
>> normal operations. On demand we would serialize KVM's virtual ITS data
>> structures and put them into the guest's memory, so they could be
>> saved/restored from there.
> 
> I feel like we're rehashing a bunch of design choices we talked
> through way back in the last-but-one Connect. I don't suppose
> anybody wrote down our rationales from back then?
> 
> (In particular I forget whether we decided the ITS tables were
> large enough to need to allow some sort of before-the-VM-stops
> migration of the data, which would be relatively doable with
> option 2 but painful under option 1.)

I think only option 2 is valid here, and we must be able to shove most
of the routing information in the device/collection/IT tables. Common HW
seems to use 64bit of data per entry per table, so we should be able to
do the same with KVM.

> 
>>> Only caveat there I think was that we had to decide on a storage format
>>> in those memory regions, to allow QEMU to understand the state and to
>>> ensure back/forwards compatibility between KVM versions.
>>
>> Do we need QEMU to actually understand this? Can't we just leave this
>> all to the kernel and QEMU just passes on the data? That would still
>> require some ABI stability between kernel versions in this respect, but
>> it's less problematic than exposing the data format to userland at all.
> 
> This would preclude ever being able to migrate a VM from KVM to
> TCG QEMU, which seems a shame. (That doesn't work right now, but
> I'm a bit wary of shutting the door to it forever.)

If the format of the migrated tables becomes ABI for KVM, it also
becomes ABI for userspace (anything that comes out of the kernel *is*
ABI). Andre, can you please explain what you mean?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2016-03-14 17:54           ` Marc Zyngier
@ 2016-03-14 18:20             ` Andre Przywara
  -1 siblings, 0 replies; 101+ messages in thread
From: Andre Przywara @ 2016-03-14 18:20 UTC (permalink / raw)
  To: Marc Zyngier, Peter Maydell; +Cc: arm-mail-list, kvmarm, kvm-devel

Hi,

On 14/03/16 17:54, Marc Zyngier wrote:
> On 14/03/16 17:29, Peter Maydell wrote:
>> On 14 March 2016 at 11:13, Andre Przywara <andre.przywara@arm.com> wrote:
>>> So I see two ways to fix this:
>>> 1.) we find a KVM specific way of letting userland save and restore the
>>> ITS tables directly
>>> 2.) we implement the BASER<n> registers, but still use our "cache" for
>>> normal operations. On demand we would serialize KVM's virtual ITS data
>>> structures and put them into the guest's memory, so they could be
>>> saved/restored from there.
>>
>> I feel like we're rehashing a bunch of design choices we talked
>> through way back in the last-but-one Connect. I don't suppose
>> anybody wrote down our rationales from back then?
>>
>> (In particular I forget whether we decided the ITS tables were
>> large enough to need to allow some sort of before-the-VM-stops
>> migration of the data, which would be relatively doable with
>> option 2 but painful under option 1.)
> 
> I think only option 2 is valid here, and we must be able to shove most
> of the routing information in the device/collection/IT tables. Common HW
> seems to use 64bit of data per entry per table, so we should be able to
> do the same with KVM.

All right, just skimmed over this and it looks doable.
For the collection table we will most likely even get away with 32 bits
per entry (compressed MPIDR or even VCPUIDs).
Would the IPA of the ITTE suffice for each device table entry?
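
For the "compressed MPIDR" variant, a minimal sketch of what I have in
mind: the four affinity fields of MPIDR_EL1 (Aff3 in bits [39:32],
Aff2..Aff0 in bits [23:0]) are squeezed into one 32-bit word. The
function names are made up for this example, and the sketch leaves no
room for a valid flag, which a real table format would still need.

```c
#include <assert.h>
#include <stdint.h>

/* Aff3 lives in bits [39:32], Aff2..Aff0 in bits [23:0] of MPIDR_EL1. */
#define MPIDR_AFF_MASK 0xff00ffffffULL

/* Rebuild the affinity bits of an MPIDR from the packed 32-bit form. */
static uint64_t mpidr_expand(uint32_t packed)
{
	return ((uint64_t)(packed >> 24) << 32) | (packed & 0xffffff);
}

/* Pack Aff3..Aff0 into 32 bits, dropping all non-affinity bits. */
static uint32_t mpidr_compress(uint64_t mpidr)
{
	uint32_t packed = (uint32_t)(((mpidr >> 32) & 0xff) << 24) |
			  (uint32_t)(mpidr & 0xffffff);

	/* The packing must be lossless for the affinity fields. */
	assert(mpidr_expand(packed) == (mpidr & MPIDR_AFF_MASK));
	return packed;
}
```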

I will work out the details later.

>>>> Only caveat there I think was that we had to decide on a storage format
>>>> in those memory regions, to allow QEMU to understand the state and to
>>>> ensure back/forwards compatibility between KVM versions.
>>>
>>> Do we need QEMU to actually understand this? Can't we just leave this
>>> all to the kernel and QEMU just passes on the data? That would still
>>> require some ABI stability between kernel versions in this respect, but
>>> it's less problematic than exposing the data format to userland at all.
>>
>> This would preclude ever being able to migrate a VM from KVM to
>> TCG QEMU, which seems a shame. (That doesn't work right now, but
>> I'm a bit wary of shutting the door to it forever.)
> 
> If the format of the migrated tables becomes ABI for KVM, it also
> becomes ABI for userspace (anything that comes out of the kernel *is*
> ABI). Andre, can you please explain what you mean?

Well, probably there is not so much difference. I was just wondering if
it would be easier to treat that data as an opaque blob.
But you are probably right that it would just mean the difference
between documenting the format or not.

Cheers,
Andre.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2016-03-14 18:20             ` Andre Przywara
@ 2016-03-14 18:36               ` Marc Zyngier
  -1 siblings, 0 replies; 101+ messages in thread
From: Marc Zyngier @ 2016-03-14 18:36 UTC (permalink / raw)
  To: Andre Przywara, Peter Maydell
  Cc: Christoffer Dall, Tomasz Nowicki, kvmarm, kvm-devel, arm-mail-list

On 14/03/16 18:20, Andre Przywara wrote:
> Hi,
> 
> On 14/03/16 17:54, Marc Zyngier wrote:
>> On 14/03/16 17:29, Peter Maydell wrote:
>>> On 14 March 2016 at 11:13, Andre Przywara <andre.przywara@arm.com> wrote:
>>>> So I see two ways to fix this:
>>>> 1.) we find a KVM specific way of letting userland save and restore the
>>>> ITS tables directly
>>>> 2.) we implement the BASER<n> registers, but still use our "cache" for
>>>> normal operations. On demand we would serialize KVM's virtual ITS data
>>>> structures and put them into the guest's memory, so they could be
>>>> saved/restored from there.
>>>
>>> I feel like we're rehashing a bunch of design choices we talked
>>> through way back in the last-but-one Connect. I don't suppose
>>> anybody wrote down our rationales from back then?
>>>
>>> (In particular I forget whether we decided the ITS tables were
>>> large enough to need to allow some sort of before-the-VM-stops
>>> migration of the data, which would be relatively doable with
>>> option 2 but painful under option 1.)
>>
>> I think only option 2 is valid here, and we must be able to shove most
>> of the routing information in the device/collection/IT tables. Common HW
>> seems to use 64bit of data per entry per table, so we should be able to
>> do the same with KVM.
> 
> All right, just skimmed over this and it looks doable.
> For the collection table we will most likely even get away with 32 bits
> per entry (compressed MPIDR or even VCPUIDs).
> Would the IPA of the ITTE suffice for each device table entry?

Yup. You can even lose the low 8 bits, as this is guaranteed to be
256-byte aligned. So for a 48bit IPA and 32bit of EventID, you end up only
using 45 bits, which leaves quite a few to spare, should we ever want a
larger IPA. Ideally, this should contain the relevant fields of the MAPD
command, with similar sizes.
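
One reading of that arithmetic, as a sketch: a 48-bit IPA with the low
8 bits dropped needs 40 bits, and an EventID width of up to 32 bits
fits a 5-bit "width minus one" field like the Size field of MAPD, so
40 + 5 = 45 bits. The bit positions and function names below are
assumptions made for this example, not a proposed ABI.

```c
#include <assert.h>
#include <stdint.h>

#define ITT_ALIGN_BITS  8    /* ITT base is 256-byte aligned */
#define ITT_ADDR_BITS   40   /* 48-bit IPA minus the 8 aligned bits */
#define ITT_SIZE_BITS   5    /* EventID width minus one, 0..31 */

/* Pack the ITT address and the EventID width into one 64-bit entry. */
static uint64_t dte_pack(uint64_t itt_ipa, unsigned int eventid_bits)
{
	assert((itt_ipa & ((1ULL << ITT_ALIGN_BITS) - 1)) == 0);
	assert(itt_ipa < (1ULL << 48));
	assert(eventid_bits >= 1 && eventid_bits <= 32);

	return (itt_ipa >> ITT_ALIGN_BITS) |
	       ((uint64_t)(eventid_bits - 1) << ITT_ADDR_BITS);
}

/* Recover the full ITT address from a packed entry. */
static uint64_t dte_itt_ipa(uint64_t dte)
{
	return (dte & ((1ULL << ITT_ADDR_BITS) - 1)) << ITT_ALIGN_BITS;
}

/* Recover the EventID width in bits from a packed entry. */
static unsigned int dte_eventid_bits(uint64_t dte)
{
	return ((dte >> ITT_ADDR_BITS) & ((1ULL << ITT_SIZE_BITS) - 1)) + 1;
}
```

With this layout bits [63:45] stay free, for example for a valid flag
or a larger IPA later on.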

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2016-03-14 17:29         ` Peter Maydell
@ 2016-03-18  9:38           ` Christoffer Dall
  -1 siblings, 0 replies; 101+ messages in thread
From: Christoffer Dall @ 2016-03-18  9:38 UTC (permalink / raw)
  To: Peter Maydell
  Cc: kvm-devel, Marc Zyngier, Andre Przywara, kvmarm, arm-mail-list

On Mon, Mar 14, 2016 at 05:29:44PM +0000, Peter Maydell wrote:
> On 14 March 2016 at 11:13, Andre Przywara <andre.przywara@arm.com> wrote:
> > So I see two ways to fix this:
> > 1.) we find a KVM specific way of letting userland save and restore the
> > ITS tables directly
> > 2.) we implement the BASER<n> registers, but still use our "cache" for
> > normal operations. On demand we would serialize KVM's virtual ITS data
> > structures and put them into the guest's memory, so they could be
> > saved/restored from there.
> 
> I feel like we're rehashing a bunch of design choices we talked
> through way back in the last-but-one Connect. I don't suppose
> anybody wrote down our rationales from back then?

Someone (not me) had the task to write it down, I don't recall if that
happened or not :)

> 
> (In particular I forget whether we decided the ITS tables were
> large enough to need to allow some sort of before-the-VM-stops
> migration of the data, which would be relatively doable with
> option 2 but painful under option 1.)

I think we concluded that it's not so much data that applying dirty
bitmaps stuff on there is strictly necessary, but that being able to do
this was probably a plus, and not very hard to do.

I am quite sure that we dismissed option 1 and had decided on option 2,
though.

-Christoffer

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2016-03-14 18:20             ` Andre Przywara
@ 2016-03-18  9:40               ` Christoffer Dall
  -1 siblings, 0 replies; 101+ messages in thread
From: Christoffer Dall @ 2016-03-18  9:40 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Marc Zyngier, Peter Maydell, Tomasz Nowicki, kvmarm, kvm-devel,
	arm-mail-list

On Mon, Mar 14, 2016 at 06:20:36PM +0000, Andre Przywara wrote:
> Hi,
> 
> On 14/03/16 17:54, Marc Zyngier wrote:
> > On 14/03/16 17:29, Peter Maydell wrote:
> >> On 14 March 2016 at 11:13, Andre Przywara <andre.przywara@arm.com> wrote:
> >>> So I see two ways to fix this:
> >>> 1.) we find a KVM specific way of letting userland save and restore the
> >>> ITS tables directly
> >>> 2.) we implement the BASER<n> registers, but still use our "cache" for
> >>> normal operations. On demand we would serialize KVM's virtual ITS data
> >>> structures and put them into the guest's memory, so they could be
> >>> saved/restored from there.
> >>
> >> I feel like we're rehashing a bunch of design choices we talked
> >> through way back in the last-but-one Connect. I don't suppose
> >> anybody wrote down our rationales from back then?
> >>
> >> (In particular I forget whether we decided the ITS tables were
> >> large enough to need to allow some sort of before-the-VM-stops
> >> migration of the data, which would be relatively doable with
> >> option 2 but painful under option 1.)
> > 
> > I think only option 2 is valid here, and we must be able to shove most
> > of the routing information in the device/collection/IT tables. Common HW
> > seems to use 64bit of data per entry per table, so we should be able to
> > do the same with KVM.
> 
> All right, just skimmed over this and it looks doable.
> For the collection table we will most likely even get away with 32 bits
> per entry (compressed MPIDR or even VCPUIDs).
> Would the IPA of the ITTE suffice for each device table entry?
> 
> I will work out the details later.
> 
> >>>> Only caveat there I think was that we had to decide on a storage format
> >>>> in those memory regions, to allow QEMU to understand the state and to
> >>>> ensure back/forwards compatibility between KVM versions.
> >>>
> >>> Do we need QEMU to actually understand this? Can't we just leave this
> >>> all to the kernel and QEMU just passes on the data? That would still
> >>> require some ABI stability between kernel versions in this respect, but
> >>> it's less problematic than exposing the data format to userland at all.
> >>
> >> This would preclude ever being able to migrate a VM from KVM to
> >> TCG QEMU, which seems a shame. (That doesn't work right now, but
> >> I'm a bit wary of shutting the door to it forever.)
> > 
> > If the format of the migrated tables becomes ABI for KVM, it also
> > becomes ABI for userspace (anything that comes out of the kernel *is*
> > ABI). Andre, can you please explain what you mean?
> 
> Well, probably there is not so much difference. I was just wondering if
> it would be easier to treat that data as an opaque blob.
> But you are probably right that it would just mean the difference
> between documenting the format or not.
> 

Even ignoring the migrate-to-TCG case, you cannot treat it as a blob,
because you want to be able to migrate between KVM on kernel version X
and version Y.

-Christoffer

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation
  2016-03-18  9:40               ` Christoffer Dall
@ 2016-03-18 17:14                 ` Peter Maydell
  -1 siblings, 0 replies; 101+ messages in thread
From: Peter Maydell @ 2016-03-18 17:14 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Andre Przywara, Marc Zyngier, Tomasz Nowicki, kvmarm, kvm-devel,
	arm-mail-list

On 18 March 2016 at 09:40, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> On Mon, Mar 14, 2016 at 06:20:36PM +0000, Andre Przywara wrote:
>> Well, probably there is not so much difference. I was just wondering if
>> it would be easier to treat that data as an opaque blob.
>> But you are probably right that it would just mean the difference
>> between documenting the format or not.

> Even ignoring the migrate-to-TCG case, you cannot treat it as a blob,
> because you want to be able to migrate between KVM on kernel version X
> and version Y.

You could require userspace to treat it as an opaque blob, and
transparently handle any version-upgrade within the kernel.
I think having it be documented-to-userspace ABI makes it
clearer that any format changes are a Big Deal, though.
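
To make the opaque-blob variant concrete, here is a minimal sketch in C
of what "handle any version-upgrade within the kernel" could look like.
Everything in it is an assumption made for illustration: the header
layout, the magic value, the two revisions (a 32-bit v1 entry holding a
VCPU id, a 64-bit v2 entry with a valid bit) and the function name
its_blob_upgrade() are hypothetical and not taken from KVM or from this
series. Entries are in host byte order to keep the sketch short.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Everything below is hypothetical; none of it is taken from KVM. */
#define ITS_BLOB_MAGIC   0x49545342u /* made-up tag for this sketch */
#define ITS_BLOB_VERSION 2u          /* newest revision this side writes */
#define ITS_ENTRY_VALID  (1ULL << 63)

struct its_blob_hdr {
	uint32_t magic;
	uint32_t version;    /* revision written by the source kernel */
	uint32_t nr_entries;
	uint32_t entry_size; /* bytes per entry in that revision */
};

/*
 * Convert a saved blob of any known revision into current 64-bit entries.
 * Returns the number of entries written to out, or -1 if the blob is
 * malformed, too large for out, or from a revision we do not understand.
 */
static int its_blob_upgrade(const void *blob, size_t len,
			    uint64_t *out, size_t out_cap)
{
	const unsigned char *p = blob;
	struct its_blob_hdr hdr;
	uint32_t i;

	if (len < sizeof(hdr))
		return -1;
	memcpy(&hdr, p, sizeof(hdr));
	p += sizeof(hdr);
	len -= sizeof(hdr);

	if (hdr.magic != ITS_BLOB_MAGIC || hdr.nr_entries > out_cap ||
	    hdr.nr_entries > (uint32_t)INT32_MAX)
		return -1;

	switch (hdr.version) {
	case 1: /* v1: 32-bit entry holding only a VCPU id */
		if (hdr.entry_size != sizeof(uint32_t) ||
		    len / sizeof(uint32_t) < hdr.nr_entries)
			return -1;
		for (i = 0; i < hdr.nr_entries; i++) {
			uint32_t v1;

			memcpy(&v1, p + (size_t)i * sizeof(v1), sizeof(v1));
			out[i] = ITS_ENTRY_VALID | v1; /* widen to v2 */
		}
		break;
	case ITS_BLOB_VERSION: /* v2: already 64-bit, copy as-is */
		if (hdr.entry_size != sizeof(uint64_t) ||
		    len / sizeof(uint64_t) < hdr.nr_entries)
			return -1;
		memcpy(out, p, (size_t)hdr.nr_entries * sizeof(uint64_t));
		break;
	default: /* newer than this side understands: refuse, do not guess */
		return -1;
	}
	return (int)hdr.nr_entries;
}
```

In this sketch userspace only ever carries the bytes around; the
receiving side looks at the version field and either converts or
rejects. The cost is that every older revision has to stay convertible
forever, which is the same ABI commitment as a documented format, just
less visible.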

thanks
-- PMM

end of thread, 2016-03-18 17:14 UTC

Thread overview: 101+ messages
2015-10-07 14:55 [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation Andre Przywara
2015-10-07 14:55 ` [PATCH v3 01/16] KVM: arm/arm64: VGIC: don't track used LRs in the distributor Andre Przywara
2015-10-07 14:55 ` [PATCH v3 02/16] KVM: arm/arm64: remove now unused code after stay-in-LR rework Andre Przywara
2015-10-07 14:55 ` [PATCH v3 03/16] KVM: extend struct kvm_msi to hold a 32-bit device ID Andre Przywara
2015-10-07 14:55 ` [PATCH v3 04/16] KVM: arm/arm64: add emulation model specific destroy function Andre Przywara
2015-10-07 14:55 ` [PATCH v3 05/16] KVM: arm/arm64: extend arch CAP checks to allow per-VM capabilities Andre Przywara
2015-10-07 14:55 ` [PATCH v3 06/16] KVM: arm/arm64: make GIC frame address initialization model specific Andre Przywara
2015-10-07 14:55 ` [PATCH v3 07/16] KVM: arm64: Introduce new MMIO region for the ITS base address Andre Przywara
2015-10-07 14:55 ` [PATCH v3 08/16] KVM: arm64: handle ITS related GICv3 redistributor registers Andre Przywara
2015-10-22 15:46   ` Pavel Fedin
2015-10-22 15:55     ` Pavel Fedin
2015-10-07 14:55 ` [PATCH v3 09/16] KVM: arm64: introduce ITS emulation file with stub functions Andre Przywara
2015-10-07 14:55 ` [PATCH v3 10/16] KVM: arm64: implement basic ITS register handlers Andre Przywara
2015-10-07 14:55 ` [PATCH v3 11/16] KVM: arm64: add data structures to model ITS interrupt translation Andre Przywara
2015-10-07 14:55 ` [PATCH v3 12/16] KVM: arm64: handle pending bit for LPIs in ITS emulation Andre Przywara
2015-10-07 15:10   ` Pavel Fedin
2015-10-07 15:35     ` Marc Zyngier
2015-10-07 15:46       ` Pavel Fedin
2015-10-07 15:49         ` Marc Zyngier
2015-10-12  7:40   ` Pavel Fedin
2015-10-12 11:39     ` Pavel Fedin
2015-10-12 14:17     ` Andre Przywara
2015-10-07 14:55 ` [PATCH v3 13/16] KVM: arm64: sync LPI configuration and pending tables Andre Przywara
2015-10-21 11:29   ` Pavel Fedin
2015-10-07 14:55 ` [PATCH v3 14/16] KVM: arm64: implement ITS command queue command handlers Andre Przywara
2015-10-14 12:26   ` Pavel Fedin
2015-10-07 14:55 ` [PATCH v3 15/16] KVM: arm64: implement MSI injection in ITS emulation Andre Przywara
2015-11-25 13:28   ` Pavel Fedin
2015-10-07 14:55 ` [PATCH v3 16/16] KVM: arm64: enable ITS emulation as a virtual MSI controller Andre Przywara
2015-10-07 16:05 ` [PATCH v3 00/16] KVM: arm64: GICv3 ITS emulation Pavel Fedin
2015-10-07 16:22   ` Marc Zyngier
2015-10-07 18:09     ` Pavel Fedin
2015-10-07 19:48       ` Marc Zyngier
2015-10-08  8:41         ` Pavel Fedin
2015-10-10 15:37 ` Christoffer Dall
2015-10-12 14:12   ` Andre Przywara
2015-10-12 15:18     ` Pavel Fedin
2015-10-14  8:48       ` Eric Auger
2015-10-14  8:50         ` Pavel Fedin
2015-10-13 15:46 ` Pavel Fedin
2016-03-09 11:35 ` Tomasz Nowicki
2016-03-13 18:16   ` Christoffer Dall
2016-03-14 11:13     ` Andre Przywara
2016-03-14 17:29       ` Peter Maydell
2016-03-14 17:54         ` Marc Zyngier
2016-03-14 18:20           ` Andre Przywara
2016-03-14 18:36             ` Marc Zyngier
2016-03-18  9:40             ` Christoffer Dall
2016-03-18 17:14               ` Peter Maydell
2016-03-18  9:38         ` Christoffer Dall
