All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/5] KVM ARM64 Migration Fixes
@ 2015-03-02 13:28 ` Alex Bennée
  0 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-02 13:28 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier
  Cc: Alex Bennée

An update to the patch posted earlier this week. Result is textually
identical but

  - split out the vgic_queue_irq_to_lr re-factoring
  - merged doc changes into earlier patch
  - added Acked-by tags

Christoffer,

Are you ok with the split?

The branch I've been working with can be found at:

  http://git.linaro.org/people/alex.bennee/linux.git
  branch: migration/kvmarm-fixes-for-4.0-v2

It includes the two patches from the current kvmarm/master branch.

Alex Bennée (2):
  arm: KVM: export vcpi->pause state via MP_STATE ioctls
  arm: KVM: add a common vgic_queue_irq_to_lr fn

Christoffer Dall (3):
  arm/arm64: KVM: Implement support for unqueueing active IRQs
  arm/arm64: KVM: Fix migration race in the arch timer
  arm/arm64: KVM: Keep elrsr/aisr in sync with software model

 Documentation/virtual/kvm/api.txt |  24 +++-
 arch/arm/kvm/arm.c                |  23 +++-
 include/kvm/arm_arch_timer.h      |   7 ++
 include/kvm/arm_vgic.h            |  16 ++-
 virt/kvm/arm/arch_timer.c         |  45 +++++--
 virt/kvm/arm/vgic-v2-emul.c       |  20 +++-
 virt/kvm/arm/vgic-v2.c            |   8 ++
 virt/kvm/arm/vgic-v3.c            |   8 ++
 virt/kvm/arm/vgic.c               | 245 +++++++++++++++++++++++++++++++-------
 virt/kvm/arm/vgic.h               |   8 ++
 10 files changed, 342 insertions(+), 62 deletions(-)

-- 
2.3.0


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2 0/5] KVM ARM64 Migration Fixes
@ 2015-03-02 13:28 ` Alex Bennée
  0 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-02 13:28 UTC (permalink / raw)
  To: linux-arm-kernel

An update to the patch posted earlier this week. Result is textually
identical but

  - split out the vgic_queue_irq_to_lr re-factoring
  - merged doc changes into earlier patch
  - added Acked-by tags

Christoffer,

Are you ok with the split?

The branch I've been working with can be found at:

  http://git.linaro.org/people/alex.bennee/linux.git
  branch: migration/kvmarm-fixes-for-4.0-v2

It includes the two patches from the current kvmarm/master branch.

Alex Benn?e (2):
  arm: KVM: export vcpi->pause state via MP_STATE ioctls
  arm: KVM: add a common vgic_queue_irq_to_lr fn

Christoffer Dall (3):
  arm/arm64: KVM: Implement support for unqueueing active IRQs
  arm/arm64: KVM: Fix migration race in the arch timer
  arm/arm64: KVM: Keep elrsr/aisr in sync with software model

 Documentation/virtual/kvm/api.txt |  24 +++-
 arch/arm/kvm/arm.c                |  23 +++-
 include/kvm/arm_arch_timer.h      |   7 ++
 include/kvm/arm_vgic.h            |  16 ++-
 virt/kvm/arm/arch_timer.c         |  45 +++++--
 virt/kvm/arm/vgic-v2-emul.c       |  20 +++-
 virt/kvm/arm/vgic-v2.c            |   8 ++
 virt/kvm/arm/vgic-v3.c            |   8 ++
 virt/kvm/arm/vgic.c               | 245 +++++++++++++++++++++++++++++++-------
 virt/kvm/arm/vgic.h               |   8 ++
 10 files changed, 342 insertions(+), 62 deletions(-)

-- 
2.3.0

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2 1/5] arm: KVM: export vcpi->pause state via MP_STATE ioctls
  2015-03-02 13:28 ` Alex Bennée
  (?)
@ 2015-03-02 13:29   ` Alex Bennée
  -1 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-02 13:29 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier
  Cc: Alex Bennée, Gleb Natapov, Paolo Bonzini, Jonathan Corbet,
	Russell King, open list:DOCUMENTATION, open list

To cleanly restore an SMP VM we need to ensure that the current pause
state of each vcpu is correctly recorded. Things could get confused if
the CPU starts running after migration restore completes when it was
paused before it state was captured.

We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64
interface is a lot simpler as the only valid states are
KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_HALTED.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index b112efc..602156f 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -997,7 +997,7 @@ for vm-wide capabilities.
 4.38 KVM_GET_MP_STATE
 
 Capability: KVM_CAP_MP_STATE
-Architectures: x86, s390
+Architectures: x86, s390, arm, arm64
 Type: vcpu ioctl
 Parameters: struct kvm_mp_state (out)
 Returns: 0 on success; -1 on error
@@ -1027,15 +1027,21 @@ Possible values are:
  - KVM_MP_STATE_LOAD:            the vcpu is in a special load/startup state
                                  [s390]
 
-On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
-in-kernel irqchip, the multiprocessing state must be maintained by userspace on
+For x86:
+
+This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
+irqchip, the multiprocessing state must be maintained by userspace on
 these architectures.
 
+For arm/arm64:
+
+The only states that are valid are KVM_MP_STATE_HALTED and
+KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
 
 4.39 KVM_SET_MP_STATE
 
 Capability: KVM_CAP_MP_STATE
-Architectures: x86, s390
+Architectures: x86, s390, arm, arm64
 Type: vcpu ioctl
 Parameters: struct kvm_mp_state (in)
 Returns: 0 on success; -1 on error
@@ -1043,10 +1049,16 @@ Returns: 0 on success; -1 on error
 Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
 arguments.
 
-On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
-in-kernel irqchip, the multiprocessing state must be maintained by userspace on
+For x86:
+
+This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
+irqchip, the multiprocessing state must be maintained by userspace on
 these architectures.
 
+For arm/arm64:
+
+The only states that are valid are KVM_MP_STATE_HALTED and
+KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.
 
 4.40 KVM_SET_IDENTITY_MAP_ADDR
 
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 5560f74..8531536 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -183,6 +183,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_ARM_PSCI:
 	case KVM_CAP_ARM_PSCI_0_2:
 	case KVM_CAP_READONLY_MEM:
+	case KVM_CAP_MP_STATE:
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
@@ -313,13 +314,29 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
 int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
 				    struct kvm_mp_state *mp_state)
 {
-	return -EINVAL;
+	if (vcpu->arch.pause)
+		mp_state->mp_state = KVM_MP_STATE_HALTED;
+	else
+		mp_state->mp_state = KVM_MP_STATE_RUNNABLE;
+
+	return 0;
 }
 
 int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
 				    struct kvm_mp_state *mp_state)
 {
-	return -EINVAL;
+	switch (mp_state->mp_state) {
+	case KVM_MP_STATE_RUNNABLE:
+		vcpu->arch.pause = false;
+		break;
+	case KVM_MP_STATE_HALTED:
+		vcpu->arch.pause = true;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
 }
 
 /**
-- 
2.3.0


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 1/5] arm: KVM: export vcpi->pause state via MP_STATE ioctls
@ 2015-03-02 13:29   ` Alex Bennée
  0 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-02 13:29 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier
  Cc: Russell King, open list:DOCUMENTATION, Gleb Natapov,
	Jonathan Corbet, open list, Paolo Bonzini

To cleanly restore an SMP VM we need to ensure that the current pause
state of each vcpu is correctly recorded. Things could get confused if
the CPU starts running after migration restore completes when it was
paused before it state was captured.

We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64
interface is a lot simpler as the only valid states are
KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_HALTED.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index b112efc..602156f 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -997,7 +997,7 @@ for vm-wide capabilities.
 4.38 KVM_GET_MP_STATE
 
 Capability: KVM_CAP_MP_STATE
-Architectures: x86, s390
+Architectures: x86, s390, arm, arm64
 Type: vcpu ioctl
 Parameters: struct kvm_mp_state (out)
 Returns: 0 on success; -1 on error
@@ -1027,15 +1027,21 @@ Possible values are:
  - KVM_MP_STATE_LOAD:            the vcpu is in a special load/startup state
                                  [s390]
 
-On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
-in-kernel irqchip, the multiprocessing state must be maintained by userspace on
+For x86:
+
+This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
+irqchip, the multiprocessing state must be maintained by userspace on
 these architectures.
 
+For arm/arm64:
+
+The only states that are valid are KVM_MP_STATE_HALTED and
+KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
 
 4.39 KVM_SET_MP_STATE
 
 Capability: KVM_CAP_MP_STATE
-Architectures: x86, s390
+Architectures: x86, s390, arm, arm64
 Type: vcpu ioctl
 Parameters: struct kvm_mp_state (in)
 Returns: 0 on success; -1 on error
@@ -1043,10 +1049,16 @@ Returns: 0 on success; -1 on error
 Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
 arguments.
 
-On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
-in-kernel irqchip, the multiprocessing state must be maintained by userspace on
+For x86:
+
+This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
+irqchip, the multiprocessing state must be maintained by userspace on
 these architectures.
 
+For arm/arm64:
+
+The only states that are valid are KVM_MP_STATE_HALTED and
+KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.
 
 4.40 KVM_SET_IDENTITY_MAP_ADDR
 
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 5560f74..8531536 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -183,6 +183,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_ARM_PSCI:
 	case KVM_CAP_ARM_PSCI_0_2:
 	case KVM_CAP_READONLY_MEM:
+	case KVM_CAP_MP_STATE:
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
@@ -313,13 +314,29 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
 int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
 				    struct kvm_mp_state *mp_state)
 {
-	return -EINVAL;
+	if (vcpu->arch.pause)
+		mp_state->mp_state = KVM_MP_STATE_HALTED;
+	else
+		mp_state->mp_state = KVM_MP_STATE_RUNNABLE;
+
+	return 0;
 }
 
 int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
 				    struct kvm_mp_state *mp_state)
 {
-	return -EINVAL;
+	switch (mp_state->mp_state) {
+	case KVM_MP_STATE_RUNNABLE:
+		vcpu->arch.pause = false;
+		break;
+	case KVM_MP_STATE_HALTED:
+		vcpu->arch.pause = true;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
 }
 
 /**
-- 
2.3.0

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 1/5] arm: KVM: export vcpi->pause state via MP_STATE ioctls
@ 2015-03-02 13:29   ` Alex Bennée
  0 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-02 13:29 UTC (permalink / raw)
  To: linux-arm-kernel

To cleanly restore an SMP VM we need to ensure that the current pause
state of each vcpu is correctly recorded. Things could get confused if
the CPU starts running after migration restore completes when it was
paused before it state was captured.

We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64
interface is a lot simpler as the only valid states are
KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_HALTED.

Signed-off-by: Alex Benn?e <alex.bennee@linaro.org>

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index b112efc..602156f 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -997,7 +997,7 @@ for vm-wide capabilities.
 4.38 KVM_GET_MP_STATE
 
 Capability: KVM_CAP_MP_STATE
-Architectures: x86, s390
+Architectures: x86, s390, arm, arm64
 Type: vcpu ioctl
 Parameters: struct kvm_mp_state (out)
 Returns: 0 on success; -1 on error
@@ -1027,15 +1027,21 @@ Possible values are:
  - KVM_MP_STATE_LOAD:            the vcpu is in a special load/startup state
                                  [s390]
 
-On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
-in-kernel irqchip, the multiprocessing state must be maintained by userspace on
+For x86:
+
+This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
+irqchip, the multiprocessing state must be maintained by userspace on
 these architectures.
 
+For arm/arm64:
+
+The only states that are valid are KVM_MP_STATE_HALTED and
+KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
 
 4.39 KVM_SET_MP_STATE
 
 Capability: KVM_CAP_MP_STATE
-Architectures: x86, s390
+Architectures: x86, s390, arm, arm64
 Type: vcpu ioctl
 Parameters: struct kvm_mp_state (in)
 Returns: 0 on success; -1 on error
@@ -1043,10 +1049,16 @@ Returns: 0 on success; -1 on error
 Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
 arguments.
 
-On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
-in-kernel irqchip, the multiprocessing state must be maintained by userspace on
+For x86:
+
+This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
+irqchip, the multiprocessing state must be maintained by userspace on
 these architectures.
 
+For arm/arm64:
+
+The only states that are valid are KVM_MP_STATE_HALTED and
+KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.
 
 4.40 KVM_SET_IDENTITY_MAP_ADDR
 
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 5560f74..8531536 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -183,6 +183,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_ARM_PSCI:
 	case KVM_CAP_ARM_PSCI_0_2:
 	case KVM_CAP_READONLY_MEM:
+	case KVM_CAP_MP_STATE:
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
@@ -313,13 +314,29 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
 int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
 				    struct kvm_mp_state *mp_state)
 {
-	return -EINVAL;
+	if (vcpu->arch.pause)
+		mp_state->mp_state = KVM_MP_STATE_HALTED;
+	else
+		mp_state->mp_state = KVM_MP_STATE_RUNNABLE;
+
+	return 0;
 }
 
 int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
 				    struct kvm_mp_state *mp_state)
 {
-	return -EINVAL;
+	switch (mp_state->mp_state) {
+	case KVM_MP_STATE_RUNNABLE:
+		vcpu->arch.pause = false;
+		break;
+	case KVM_MP_STATE_HALTED:
+		vcpu->arch.pause = true;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
 }
 
 /**
-- 
2.3.0

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 2/5] arm/arm64: KVM: Implement support for unqueueing active IRQs
  2015-03-02 13:28 ` Alex Bennée
  (?)
@ 2015-03-02 13:29   ` Alex Bennée
  -1 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-02 13:29 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier
  Cc: Alex Bennée, Gleb Natapov, Paolo Bonzini, open list

From: Christoffer Dall <christoffer.dall@linaro.org>

Migrating active interrupts causes the active state to be lost
completely. This implements some additional bitmaps to track the active
state on the distributor and export this to user space.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
AJB:
   - fixed merge conflicts
   - moved additional shared bitmaps to be dynamically allocated
   - make irq_active_on_cpu dynamically allocated as well
   - in vgic_queue_irq don't queue pending if already active
   - in __kvm_vgic_flush_hwstate use pr_shared when checking SPIs
   - vgic: clear active on CPU bit
   - checkpatch, remove extraneous braces
   - checkpatch, remove debug, fix overflows
   - move register access fns to re-factored vgic-v2-emul.c
v2
   - doc: unqueue and update_state

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 7c55dd5..7042251 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -196,6 +196,9 @@ struct vgic_dist {
 	/* Level-triggered interrupt queued on VCPU interface */
 	struct vgic_bitmap	irq_queued;
 
+	/* Interrupt was active when unqueue from VCPU interface */
+	struct vgic_bitmap	irq_active;
+
 	/* Interrupt priority. Not used yet. */
 	struct vgic_bytemap	irq_priority;
 
@@ -236,6 +239,9 @@ struct vgic_dist {
 	/* Bitmap indicating which CPU has something pending */
 	unsigned long		*irq_pending_on_cpu;
 
+	/* Bitmap indicating which CPU has active IRQs */
+	unsigned long		*irq_active_on_cpu;
+
 	struct vgic_vm_ops	vm_ops;
 #endif
 };
@@ -269,9 +275,15 @@ struct vgic_cpu {
 	/* per IRQ to LR mapping */
 	u8		*vgic_irq_lr_map;
 
-	/* Pending interrupts on this VCPU */
+	/* Pending/active/both interrupts on this VCPU */
 	DECLARE_BITMAP(	pending_percpu, VGIC_NR_PRIVATE_IRQS);
+	DECLARE_BITMAP(	active_percpu, VGIC_NR_PRIVATE_IRQS);
+	DECLARE_BITMAP(	pend_act_percpu, VGIC_NR_PRIVATE_IRQS);
+
+	/* Pending/active/both shared interrupts, dynamically sized */
 	unsigned long	*pending_shared;
+	unsigned long   *active_shared;
+	unsigned long   *pend_act_shared;
 
 	/* Bitmap of used/free list registers */
 	DECLARE_BITMAP(	lr_used, VGIC_V2_MAX_LRS);
@@ -311,6 +323,7 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
 			bool level);
 void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
 int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
+int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
 bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		      struct kvm_exit_mmio *mmio);
 
diff --git a/virt/kvm/arm/vgic-v2-emul.c b/virt/kvm/arm/vgic-v2-emul.c
index 19c6210..c818662 100644
--- a/virt/kvm/arm/vgic-v2-emul.c
+++ b/virt/kvm/arm/vgic-v2-emul.c
@@ -107,6 +107,22 @@ static bool handle_mmio_clear_pending_reg(struct kvm_vcpu *vcpu,
 					     vcpu->vcpu_id);
 }
 
+static bool handle_mmio_set_active_reg(struct kvm_vcpu *vcpu,
+				       struct kvm_exit_mmio *mmio,
+				       phys_addr_t offset)
+{
+	return vgic_handle_set_active_reg(vcpu->kvm, mmio, offset,
+					  vcpu->vcpu_id);
+}
+
+static bool handle_mmio_clear_active_reg(struct kvm_vcpu *vcpu,
+					 struct kvm_exit_mmio *mmio,
+					 phys_addr_t offset)
+{
+	return vgic_handle_clear_active_reg(vcpu->kvm, mmio, offset,
+					    vcpu->vcpu_id);
+}
+
 static bool handle_mmio_priority_reg(struct kvm_vcpu *vcpu,
 				     struct kvm_exit_mmio *mmio,
 				     phys_addr_t offset)
@@ -344,13 +360,13 @@ static const struct kvm_mmio_range vgic_dist_ranges[] = {
 		.base		= GIC_DIST_ACTIVE_SET,
 		.len		= VGIC_MAX_IRQS / 8,
 		.bits_per_irq	= 1,
-		.handle_mmio	= handle_mmio_raz_wi,
+		.handle_mmio	= handle_mmio_set_active_reg,
 	},
 	{
 		.base		= GIC_DIST_ACTIVE_CLEAR,
 		.len		= VGIC_MAX_IRQS / 8,
 		.bits_per_irq	= 1,
-		.handle_mmio	= handle_mmio_raz_wi,
+		.handle_mmio	= handle_mmio_clear_active_reg,
 	},
 	{
 		.base		= GIC_DIST_PRI,
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 0cc6ab6..bfb6fbb 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -277,6 +277,14 @@ static void vgic_irq_clear_queued(struct kvm_vcpu *vcpu, int irq)
 	vgic_bitmap_set_irq_val(&dist->irq_queued, vcpu->vcpu_id, irq, 0);
 }
 
+
+static void vgic_irq_set_active(struct kvm_vcpu *vcpu, int irq)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+	vgic_bitmap_set_irq_val(&dist->irq_active, vcpu->vcpu_id, irq, 1);
+}
+
 static int vgic_dist_irq_get_level(struct kvm_vcpu *vcpu, int irq)
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
@@ -520,6 +528,44 @@ bool vgic_handle_clear_pending_reg(struct kvm *kvm,
 	return false;
 }
 
+bool vgic_handle_set_active_reg(struct kvm *kvm,
+				struct kvm_exit_mmio *mmio,
+				phys_addr_t offset, int vcpu_id)
+{
+	u32 *reg;
+	struct vgic_dist *dist = &kvm->arch.vgic;
+
+	reg = vgic_bitmap_get_reg(&dist->irq_active, vcpu_id, offset);
+	vgic_reg_access(mmio, reg, offset,
+			ACCESS_READ_VALUE | ACCESS_WRITE_SETBIT);
+
+	if (mmio->is_write) {
+		vgic_update_state(kvm);
+		return true;
+	}
+
+	return false;
+}
+
+bool vgic_handle_clear_active_reg(struct kvm *kvm,
+				  struct kvm_exit_mmio *mmio,
+				  phys_addr_t offset, int vcpu_id)
+{
+	u32 *reg;
+	struct vgic_dist *dist = &kvm->arch.vgic;
+
+	reg = vgic_bitmap_get_reg(&dist->irq_active, vcpu_id, offset);
+	vgic_reg_access(mmio, reg, offset,
+			ACCESS_READ_VALUE | ACCESS_WRITE_CLEARBIT);
+
+	if (mmio->is_write) {
+		vgic_update_state(kvm);
+		return true;
+	}
+
+	return false;
+}
+
 static u32 vgic_cfg_expand(u16 val)
 {
 	u32 res = 0;
@@ -588,16 +634,12 @@ bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
 }
 
 /**
- * vgic_unqueue_irqs - move pending IRQs from LRs to the distributor
+ * vgic_unqueue_irqs - move pending/active IRQs from LRs to the distributor
  * @vgic_cpu: Pointer to the vgic_cpu struct holding the LRs
  *
- * Move any pending IRQs that have already been assigned to LRs back to the
+ * Move any IRQs that have already been assigned to LRs back to the
  * emulated distributor state so that the complete emulated state can be read
  * from the main emulation structures without investigating the LRs.
- *
- * Note that IRQs in the active state in the LRs get their pending state moved
- * to the distributor but the active state stays in the LRs, because we don't
- * track the active state on the distributor side.
  */
 void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 {
@@ -613,12 +655,22 @@ void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 		 * 01: pending
 		 * 10: active
 		 * 11: pending and active
-		 *
-		 * If the LR holds only an active interrupt (not pending) then
-		 * just leave it alone.
 		 */
-		if ((lr.state & LR_STATE_MASK) == LR_STATE_ACTIVE)
-			continue;
+		BUG_ON(!(lr.state & LR_STATE_MASK));
+
+		/* Reestablish SGI source for pending and active IRQs */
+		if (lr.irq < VGIC_NR_SGIS)
+			add_sgi_source(vcpu, lr.irq, lr.source);
+
+		/*
+		 * If the LR holds an active (10) or a pending and active (11)
+		 * interrupt then move this state to the distributor tracking
+		 * bit.
+		 */
+		if (lr.state & LR_STATE_ACTIVE) {
+			vgic_irq_set_active(vcpu, lr.irq);
+			lr.state &= ~LR_STATE_ACTIVE;
+		}
 
 		/*
 		 * Reestablish the pending state on the distributor and the
@@ -626,21 +678,19 @@ void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 		 * is fine, then we are only setting a few bits that were
 		 * already set.
 		 */
-		vgic_dist_irq_set_pending(vcpu, lr.irq);
-		if (lr.irq < VGIC_NR_SGIS)
-			add_sgi_source(vcpu, lr.irq, lr.source);
-		lr.state &= ~LR_STATE_PENDING;
+		if (lr.state & LR_STATE_PENDING) {
+			vgic_dist_irq_set_pending(vcpu, lr.irq);
+			lr.state &= ~LR_STATE_PENDING;
+		}
+
 		vgic_set_lr(vcpu, i, lr);
 
 		/*
-		 * If there's no state left on the LR (it could still be
-		 * active), then the LR does not hold any useful info and can
-		 * be marked as free for other use.
+		 * Marked the LR as free for other use.
 		 */
-		if (!(lr.state & LR_STATE_MASK)) {
-			vgic_retire_lr(i, lr.irq, vcpu);
-			vgic_irq_clear_queued(vcpu, lr.irq);
-		}
+		BUG_ON(lr.state & LR_STATE_MASK);
+		vgic_retire_lr(i, lr.irq, vcpu);
+		vgic_irq_clear_queued(vcpu, lr.irq);
 
 		/* Finally update the VGIC state. */
 		vgic_update_state(vcpu->kvm);
@@ -804,6 +854,36 @@ static int vgic_nr_shared_irqs(struct vgic_dist *dist)
 	return dist->nr_irqs - VGIC_NR_PRIVATE_IRQS;
 }
 
+static int compute_active_for_cpu(struct kvm_vcpu *vcpu)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	unsigned long *active, *enabled, *act_percpu, *act_shared;
+	unsigned long active_private, active_shared;
+	int nr_shared = vgic_nr_shared_irqs(dist);
+	int vcpu_id;
+
+	vcpu_id = vcpu->vcpu_id;
+	act_percpu = vcpu->arch.vgic_cpu.active_percpu;
+	act_shared = vcpu->arch.vgic_cpu.active_shared;
+
+	active = vgic_bitmap_get_cpu_map(&dist->irq_active, vcpu_id);
+	enabled = vgic_bitmap_get_cpu_map(&dist->irq_enabled, vcpu_id);
+	bitmap_and(act_percpu, active, enabled, VGIC_NR_PRIVATE_IRQS);
+
+	active = vgic_bitmap_get_shared_map(&dist->irq_active);
+	enabled = vgic_bitmap_get_shared_map(&dist->irq_enabled);
+	bitmap_and(act_shared, active, enabled, nr_shared);
+	bitmap_and(act_shared, act_shared,
+		   vgic_bitmap_get_shared_map(&dist->irq_spi_target[vcpu_id]),
+		   nr_shared);
+
+	active_private = find_first_bit(act_percpu, VGIC_NR_PRIVATE_IRQS);
+	active_shared = find_first_bit(act_shared, nr_shared);
+
+	return (active_private < VGIC_NR_PRIVATE_IRQS ||
+		active_shared < nr_shared);
+}
+
 static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
@@ -835,7 +915,7 @@ static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
 
 /*
  * Update the interrupt state and determine which CPUs have pending
- * interrupts. Must be called with distributor lock held.
+ * or active interrupts. Must be called with distributor lock held.
  */
 void vgic_update_state(struct kvm *kvm)
 {
@@ -849,10 +929,13 @@ void vgic_update_state(struct kvm *kvm)
 	}
 
 	kvm_for_each_vcpu(c, vcpu, kvm) {
-		if (compute_pending_for_cpu(vcpu)) {
-			pr_debug("CPU%d has pending interrupts\n", c);
+		if (compute_pending_for_cpu(vcpu))
 			set_bit(c, dist->irq_pending_on_cpu);
-		}
+
+		if (compute_active_for_cpu(vcpu))
+			set_bit(c, dist->irq_active_on_cpu);
+		else
+			clear_bit(c, dist->irq_active_on_cpu);
 	}
 }
 
@@ -1030,39 +1113,49 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	unsigned long *pa_percpu, *pa_shared;
 	int i, vcpu_id;
 	int overflow = 0;
+	int nr_shared = vgic_nr_shared_irqs(dist);
 
 	vcpu_id = vcpu->vcpu_id;
 
+	pa_percpu = vcpu->arch.vgic_cpu.pend_act_percpu;
+	pa_shared = vcpu->arch.vgic_cpu.pend_act_shared;
+
+	bitmap_or(pa_percpu, vgic_cpu->pending_percpu, vgic_cpu->active_percpu,
+		  VGIC_NR_PRIVATE_IRQS);
+	bitmap_or(pa_shared, vgic_cpu->pending_shared, vgic_cpu->active_shared,
+		  nr_shared);
 	/*
 	 * We may not have any pending interrupt, or the interrupts
 	 * may have been serviced from another vcpu. In all cases,
 	 * move along.
 	 */
-	if (!kvm_vgic_vcpu_pending_irq(vcpu)) {
-		pr_debug("CPU%d has no pending interrupt\n", vcpu_id);
+	if (!kvm_vgic_vcpu_pending_irq(vcpu) && !kvm_vgic_vcpu_active_irq(vcpu))
 		goto epilog;
-	}
 
 	/* SGIs */
-	for_each_set_bit(i, vgic_cpu->pending_percpu, VGIC_NR_SGIS) {
+	for_each_set_bit(i, pa_percpu, VGIC_NR_SGIS) {
 		if (!queue_sgi(vcpu, i))
 			overflow = 1;
 	}
 
 	/* PPIs */
-	for_each_set_bit_from(i, vgic_cpu->pending_percpu, VGIC_NR_PRIVATE_IRQS) {
+	for_each_set_bit_from(i, pa_percpu, VGIC_NR_PRIVATE_IRQS) {
 		if (!vgic_queue_hwirq(vcpu, i))
 			overflow = 1;
 	}
 
 	/* SPIs */
-	for_each_set_bit(i, vgic_cpu->pending_shared, vgic_nr_shared_irqs(dist)) {
+	for_each_set_bit(i, pa_shared, nr_shared) {
 		if (!vgic_queue_hwirq(vcpu, i + VGIC_NR_PRIVATE_IRQS))
 			overflow = 1;
 	}
 
+
+
+
 epilog:
 	if (overflow) {
 		vgic_enable_underflow(vcpu);
@@ -1209,6 +1302,17 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
 	return test_bit(vcpu->vcpu_id, dist->irq_pending_on_cpu);
 }
 
+int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+	if (!irqchip_in_kernel(vcpu->kvm))
+		return 0;
+
+	return test_bit(vcpu->vcpu_id, dist->irq_active_on_cpu);
+}
+
+
 void vgic_kick_vcpus(struct kvm *kvm)
 {
 	struct kvm_vcpu *vcpu;
@@ -1381,8 +1485,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 
 	kfree(vgic_cpu->pending_shared);
+	kfree(vgic_cpu->active_shared);
+	kfree(vgic_cpu->pend_act_shared);
 	kfree(vgic_cpu->vgic_irq_lr_map);
 	vgic_cpu->pending_shared = NULL;
+	vgic_cpu->active_shared = NULL;
+	vgic_cpu->pend_act_shared = NULL;
 	vgic_cpu->vgic_irq_lr_map = NULL;
 }
 
@@ -1392,9 +1500,14 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
 
 	int sz = (nr_irqs - VGIC_NR_PRIVATE_IRQS) / 8;
 	vgic_cpu->pending_shared = kzalloc(sz, GFP_KERNEL);
+	vgic_cpu->active_shared = kzalloc(sz, GFP_KERNEL);
+	vgic_cpu->pend_act_shared = kzalloc(sz, GFP_KERNEL);
 	vgic_cpu->vgic_irq_lr_map = kmalloc(nr_irqs, GFP_KERNEL);
 
-	if (!vgic_cpu->pending_shared || !vgic_cpu->vgic_irq_lr_map) {
+	if (!vgic_cpu->pending_shared
+		|| !vgic_cpu->active_shared
+		|| !vgic_cpu->pend_act_shared
+		|| !vgic_cpu->vgic_irq_lr_map) {
 		kvm_vgic_vcpu_destroy(vcpu);
 		return -ENOMEM;
 	}
@@ -1447,10 +1560,12 @@ void kvm_vgic_destroy(struct kvm *kvm)
 	kfree(dist->irq_spi_mpidr);
 	kfree(dist->irq_spi_target);
 	kfree(dist->irq_pending_on_cpu);
+	kfree(dist->irq_active_on_cpu);
 	dist->irq_sgi_sources = NULL;
 	dist->irq_spi_cpu = NULL;
 	dist->irq_spi_target = NULL;
 	dist->irq_pending_on_cpu = NULL;
+	dist->irq_active_on_cpu = NULL;
 	dist->nr_cpus = 0;
 }
 
@@ -1486,6 +1601,7 @@ int vgic_init(struct kvm *kvm)
 	ret |= vgic_init_bitmap(&dist->irq_pending, nr_cpus, nr_irqs);
 	ret |= vgic_init_bitmap(&dist->irq_soft_pend, nr_cpus, nr_irqs);
 	ret |= vgic_init_bitmap(&dist->irq_queued, nr_cpus, nr_irqs);
+	ret |= vgic_init_bitmap(&dist->irq_active, nr_cpus, nr_irqs);
 	ret |= vgic_init_bitmap(&dist->irq_cfg, nr_cpus, nr_irqs);
 	ret |= vgic_init_bytemap(&dist->irq_priority, nr_cpus, nr_irqs);
 
@@ -1498,10 +1614,13 @@ int vgic_init(struct kvm *kvm)
 				       GFP_KERNEL);
 	dist->irq_pending_on_cpu = kzalloc(BITS_TO_LONGS(nr_cpus) * sizeof(long),
 					   GFP_KERNEL);
+	dist->irq_active_on_cpu = kzalloc(BITS_TO_LONGS(nr_cpus) * sizeof(long),
+					   GFP_KERNEL);
 	if (!dist->irq_sgi_sources ||
 	    !dist->irq_spi_cpu ||
 	    !dist->irq_spi_target ||
-	    !dist->irq_pending_on_cpu) {
+	    !dist->irq_pending_on_cpu ||
+	    !dist->irq_active_on_cpu) {
 		ret = -ENOMEM;
 		goto out;
 	}
diff --git a/virt/kvm/arm/vgic.h b/virt/kvm/arm/vgic.h
index 1e83bdf..1e5a381 100644
--- a/virt/kvm/arm/vgic.h
+++ b/virt/kvm/arm/vgic.h
@@ -107,6 +107,14 @@ bool vgic_handle_set_pending_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
 bool vgic_handle_clear_pending_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
 				   phys_addr_t offset, int vcpu_id);
 
+bool vgic_handle_set_active_reg(struct kvm *kvm,
+				struct kvm_exit_mmio *mmio,
+				phys_addr_t offset, int vcpu_id);
+
+bool vgic_handle_clear_active_reg(struct kvm *kvm,
+				  struct kvm_exit_mmio *mmio,
+				  phys_addr_t offset, int vcpu_id);
+
 bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
 			 phys_addr_t offset);
 
-- 
2.3.0


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 2/5] arm/arm64: KVM: Implement support for unqueueing active IRQs
@ 2015-03-02 13:29   ` Alex Bennée
  0 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-02 13:29 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier
  Cc: Gleb Natapov, Paolo Bonzini, open list

From: Christoffer Dall <christoffer.dall@linaro.org>

Migrating active interrupts causes the active state to be lost
completely. This implements some additional bitmaps to track the active
state on the distributor and export this to user space.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
AJB:
   - fixed merge conflicts
   - moved additional shared bitmaps to be dynamically allocated
   - make irq_active_on_cpu dynamically allocated as well
   - in vgic_queue_irq don't queue pending if already active
   - in __kvm_vgic_flush_hwstate use pr_shared when checking SPIs
   - vgic: clear active on CPU bit
   - checkpatch, remove extraneous braces
   - checkpatch, remove debug, fix overflows
   - move register access fns to re-factored vgic-v2-emul.c
v2
   - doc: unqueue and update_state

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 7c55dd5..7042251 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -196,6 +196,9 @@ struct vgic_dist {
 	/* Level-triggered interrupt queued on VCPU interface */
 	struct vgic_bitmap	irq_queued;
 
+	/* Interrupt was active when unqueue from VCPU interface */
+	struct vgic_bitmap	irq_active;
+
 	/* Interrupt priority. Not used yet. */
 	struct vgic_bytemap	irq_priority;
 
@@ -236,6 +239,9 @@ struct vgic_dist {
 	/* Bitmap indicating which CPU has something pending */
 	unsigned long		*irq_pending_on_cpu;
 
+	/* Bitmap indicating which CPU has active IRQs */
+	unsigned long		*irq_active_on_cpu;
+
 	struct vgic_vm_ops	vm_ops;
 #endif
 };
@@ -269,9 +275,15 @@ struct vgic_cpu {
 	/* per IRQ to LR mapping */
 	u8		*vgic_irq_lr_map;
 
-	/* Pending interrupts on this VCPU */
+	/* Pending/active/both interrupts on this VCPU */
 	DECLARE_BITMAP(	pending_percpu, VGIC_NR_PRIVATE_IRQS);
+	DECLARE_BITMAP(	active_percpu, VGIC_NR_PRIVATE_IRQS);
+	DECLARE_BITMAP(	pend_act_percpu, VGIC_NR_PRIVATE_IRQS);
+
+	/* Pending/active/both shared interrupts, dynamically sized */
 	unsigned long	*pending_shared;
+	unsigned long   *active_shared;
+	unsigned long   *pend_act_shared;
 
 	/* Bitmap of used/free list registers */
 	DECLARE_BITMAP(	lr_used, VGIC_V2_MAX_LRS);
@@ -311,6 +323,7 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
 			bool level);
 void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
 int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
+int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
 bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		      struct kvm_exit_mmio *mmio);
 
diff --git a/virt/kvm/arm/vgic-v2-emul.c b/virt/kvm/arm/vgic-v2-emul.c
index 19c6210..c818662 100644
--- a/virt/kvm/arm/vgic-v2-emul.c
+++ b/virt/kvm/arm/vgic-v2-emul.c
@@ -107,6 +107,22 @@ static bool handle_mmio_clear_pending_reg(struct kvm_vcpu *vcpu,
 					     vcpu->vcpu_id);
 }
 
+static bool handle_mmio_set_active_reg(struct kvm_vcpu *vcpu,
+				       struct kvm_exit_mmio *mmio,
+				       phys_addr_t offset)
+{
+	return vgic_handle_set_active_reg(vcpu->kvm, mmio, offset,
+					  vcpu->vcpu_id);
+}
+
+static bool handle_mmio_clear_active_reg(struct kvm_vcpu *vcpu,
+					 struct kvm_exit_mmio *mmio,
+					 phys_addr_t offset)
+{
+	return vgic_handle_clear_active_reg(vcpu->kvm, mmio, offset,
+					    vcpu->vcpu_id);
+}
+
 static bool handle_mmio_priority_reg(struct kvm_vcpu *vcpu,
 				     struct kvm_exit_mmio *mmio,
 				     phys_addr_t offset)
@@ -344,13 +360,13 @@ static const struct kvm_mmio_range vgic_dist_ranges[] = {
 		.base		= GIC_DIST_ACTIVE_SET,
 		.len		= VGIC_MAX_IRQS / 8,
 		.bits_per_irq	= 1,
-		.handle_mmio	= handle_mmio_raz_wi,
+		.handle_mmio	= handle_mmio_set_active_reg,
 	},
 	{
 		.base		= GIC_DIST_ACTIVE_CLEAR,
 		.len		= VGIC_MAX_IRQS / 8,
 		.bits_per_irq	= 1,
-		.handle_mmio	= handle_mmio_raz_wi,
+		.handle_mmio	= handle_mmio_clear_active_reg,
 	},
 	{
 		.base		= GIC_DIST_PRI,
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 0cc6ab6..bfb6fbb 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -277,6 +277,14 @@ static void vgic_irq_clear_queued(struct kvm_vcpu *vcpu, int irq)
 	vgic_bitmap_set_irq_val(&dist->irq_queued, vcpu->vcpu_id, irq, 0);
 }
 
+
+static void vgic_irq_set_active(struct kvm_vcpu *vcpu, int irq)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+	vgic_bitmap_set_irq_val(&dist->irq_active, vcpu->vcpu_id, irq, 1);
+}
+
 static int vgic_dist_irq_get_level(struct kvm_vcpu *vcpu, int irq)
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
@@ -520,6 +528,44 @@ bool vgic_handle_clear_pending_reg(struct kvm *kvm,
 	return false;
 }
 
+bool vgic_handle_set_active_reg(struct kvm *kvm,
+				struct kvm_exit_mmio *mmio,
+				phys_addr_t offset, int vcpu_id)
+{
+	u32 *reg;
+	struct vgic_dist *dist = &kvm->arch.vgic;
+
+	reg = vgic_bitmap_get_reg(&dist->irq_active, vcpu_id, offset);
+	vgic_reg_access(mmio, reg, offset,
+			ACCESS_READ_VALUE | ACCESS_WRITE_SETBIT);
+
+	if (mmio->is_write) {
+		vgic_update_state(kvm);
+		return true;
+	}
+
+	return false;
+}
+
+bool vgic_handle_clear_active_reg(struct kvm *kvm,
+				  struct kvm_exit_mmio *mmio,
+				  phys_addr_t offset, int vcpu_id)
+{
+	u32 *reg;
+	struct vgic_dist *dist = &kvm->arch.vgic;
+
+	reg = vgic_bitmap_get_reg(&dist->irq_active, vcpu_id, offset);
+	vgic_reg_access(mmio, reg, offset,
+			ACCESS_READ_VALUE | ACCESS_WRITE_CLEARBIT);
+
+	if (mmio->is_write) {
+		vgic_update_state(kvm);
+		return true;
+	}
+
+	return false;
+}
+
 static u32 vgic_cfg_expand(u16 val)
 {
 	u32 res = 0;
@@ -588,16 +634,12 @@ bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
 }
 
 /**
- * vgic_unqueue_irqs - move pending IRQs from LRs to the distributor
+ * vgic_unqueue_irqs - move pending/active IRQs from LRs to the distributor
  * @vgic_cpu: Pointer to the vgic_cpu struct holding the LRs
  *
- * Move any pending IRQs that have already been assigned to LRs back to the
+ * Move any IRQs that have already been assigned to LRs back to the
  * emulated distributor state so that the complete emulated state can be read
  * from the main emulation structures without investigating the LRs.
- *
- * Note that IRQs in the active state in the LRs get their pending state moved
- * to the distributor but the active state stays in the LRs, because we don't
- * track the active state on the distributor side.
  */
 void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 {
@@ -613,12 +655,22 @@ void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 		 * 01: pending
 		 * 10: active
 		 * 11: pending and active
-		 *
-		 * If the LR holds only an active interrupt (not pending) then
-		 * just leave it alone.
 		 */
-		if ((lr.state & LR_STATE_MASK) == LR_STATE_ACTIVE)
-			continue;
+		BUG_ON(!(lr.state & LR_STATE_MASK));
+
+		/* Reestablish SGI source for pending and active IRQs */
+		if (lr.irq < VGIC_NR_SGIS)
+			add_sgi_source(vcpu, lr.irq, lr.source);
+
+		/*
+		 * If the LR holds an active (10) or a pending and active (11)
+		 * interrupt then move this state to the distributor tracking
+		 * bit.
+		 */
+		if (lr.state & LR_STATE_ACTIVE) {
+			vgic_irq_set_active(vcpu, lr.irq);
+			lr.state &= ~LR_STATE_ACTIVE;
+		}
 
 		/*
 		 * Reestablish the pending state on the distributor and the
@@ -626,21 +678,19 @@ void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 		 * is fine, then we are only setting a few bits that were
 		 * already set.
 		 */
-		vgic_dist_irq_set_pending(vcpu, lr.irq);
-		if (lr.irq < VGIC_NR_SGIS)
-			add_sgi_source(vcpu, lr.irq, lr.source);
-		lr.state &= ~LR_STATE_PENDING;
+		if (lr.state & LR_STATE_PENDING) {
+			vgic_dist_irq_set_pending(vcpu, lr.irq);
+			lr.state &= ~LR_STATE_PENDING;
+		}
+
 		vgic_set_lr(vcpu, i, lr);
 
 		/*
-		 * If there's no state left on the LR (it could still be
-		 * active), then the LR does not hold any useful info and can
-		 * be marked as free for other use.
+		 * Marked the LR as free for other use.
 		 */
-		if (!(lr.state & LR_STATE_MASK)) {
-			vgic_retire_lr(i, lr.irq, vcpu);
-			vgic_irq_clear_queued(vcpu, lr.irq);
-		}
+		BUG_ON(lr.state & LR_STATE_MASK);
+		vgic_retire_lr(i, lr.irq, vcpu);
+		vgic_irq_clear_queued(vcpu, lr.irq);
 
 		/* Finally update the VGIC state. */
 		vgic_update_state(vcpu->kvm);
@@ -804,6 +854,36 @@ static int vgic_nr_shared_irqs(struct vgic_dist *dist)
 	return dist->nr_irqs - VGIC_NR_PRIVATE_IRQS;
 }
 
+static int compute_active_for_cpu(struct kvm_vcpu *vcpu)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	unsigned long *active, *enabled, *act_percpu, *act_shared;
+	unsigned long active_private, active_shared;
+	int nr_shared = vgic_nr_shared_irqs(dist);
+	int vcpu_id;
+
+	vcpu_id = vcpu->vcpu_id;
+	act_percpu = vcpu->arch.vgic_cpu.active_percpu;
+	act_shared = vcpu->arch.vgic_cpu.active_shared;
+
+	active = vgic_bitmap_get_cpu_map(&dist->irq_active, vcpu_id);
+	enabled = vgic_bitmap_get_cpu_map(&dist->irq_enabled, vcpu_id);
+	bitmap_and(act_percpu, active, enabled, VGIC_NR_PRIVATE_IRQS);
+
+	active = vgic_bitmap_get_shared_map(&dist->irq_active);
+	enabled = vgic_bitmap_get_shared_map(&dist->irq_enabled);
+	bitmap_and(act_shared, active, enabled, nr_shared);
+	bitmap_and(act_shared, act_shared,
+		   vgic_bitmap_get_shared_map(&dist->irq_spi_target[vcpu_id]),
+		   nr_shared);
+
+	active_private = find_first_bit(act_percpu, VGIC_NR_PRIVATE_IRQS);
+	active_shared = find_first_bit(act_shared, nr_shared);
+
+	return (active_private < VGIC_NR_PRIVATE_IRQS ||
+		active_shared < nr_shared);
+}
+
 static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
@@ -835,7 +915,7 @@ static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
 
 /*
  * Update the interrupt state and determine which CPUs have pending
- * interrupts. Must be called with distributor lock held.
+ * or active interrupts. Must be called with distributor lock held.
  */
 void vgic_update_state(struct kvm *kvm)
 {
@@ -849,10 +929,13 @@ void vgic_update_state(struct kvm *kvm)
 	}
 
 	kvm_for_each_vcpu(c, vcpu, kvm) {
-		if (compute_pending_for_cpu(vcpu)) {
-			pr_debug("CPU%d has pending interrupts\n", c);
+		if (compute_pending_for_cpu(vcpu))
 			set_bit(c, dist->irq_pending_on_cpu);
-		}
+
+		if (compute_active_for_cpu(vcpu))
+			set_bit(c, dist->irq_active_on_cpu);
+		else
+			clear_bit(c, dist->irq_active_on_cpu);
 	}
 }
 
@@ -1030,39 +1113,49 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	unsigned long *pa_percpu, *pa_shared;
 	int i, vcpu_id;
 	int overflow = 0;
+	int nr_shared = vgic_nr_shared_irqs(dist);
 
 	vcpu_id = vcpu->vcpu_id;
 
+	pa_percpu = vcpu->arch.vgic_cpu.pend_act_percpu;
+	pa_shared = vcpu->arch.vgic_cpu.pend_act_shared;
+
+	bitmap_or(pa_percpu, vgic_cpu->pending_percpu, vgic_cpu->active_percpu,
+		  VGIC_NR_PRIVATE_IRQS);
+	bitmap_or(pa_shared, vgic_cpu->pending_shared, vgic_cpu->active_shared,
+		  nr_shared);
 	/*
 	 * We may not have any pending interrupt, or the interrupts
 	 * may have been serviced from another vcpu. In all cases,
 	 * move along.
 	 */
-	if (!kvm_vgic_vcpu_pending_irq(vcpu)) {
-		pr_debug("CPU%d has no pending interrupt\n", vcpu_id);
+	if (!kvm_vgic_vcpu_pending_irq(vcpu) && !kvm_vgic_vcpu_active_irq(vcpu))
 		goto epilog;
-	}
 
 	/* SGIs */
-	for_each_set_bit(i, vgic_cpu->pending_percpu, VGIC_NR_SGIS) {
+	for_each_set_bit(i, pa_percpu, VGIC_NR_SGIS) {
 		if (!queue_sgi(vcpu, i))
 			overflow = 1;
 	}
 
 	/* PPIs */
-	for_each_set_bit_from(i, vgic_cpu->pending_percpu, VGIC_NR_PRIVATE_IRQS) {
+	for_each_set_bit_from(i, pa_percpu, VGIC_NR_PRIVATE_IRQS) {
 		if (!vgic_queue_hwirq(vcpu, i))
 			overflow = 1;
 	}
 
 	/* SPIs */
-	for_each_set_bit(i, vgic_cpu->pending_shared, vgic_nr_shared_irqs(dist)) {
+	for_each_set_bit(i, pa_shared, nr_shared) {
 		if (!vgic_queue_hwirq(vcpu, i + VGIC_NR_PRIVATE_IRQS))
 			overflow = 1;
 	}
 
+
+
+
 epilog:
 	if (overflow) {
 		vgic_enable_underflow(vcpu);
@@ -1209,6 +1302,17 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
 	return test_bit(vcpu->vcpu_id, dist->irq_pending_on_cpu);
 }
 
+int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+	if (!irqchip_in_kernel(vcpu->kvm))
+		return 0;
+
+	return test_bit(vcpu->vcpu_id, dist->irq_active_on_cpu);
+}
+
+
 void vgic_kick_vcpus(struct kvm *kvm)
 {
 	struct kvm_vcpu *vcpu;
@@ -1381,8 +1485,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 
 	kfree(vgic_cpu->pending_shared);
+	kfree(vgic_cpu->active_shared);
+	kfree(vgic_cpu->pend_act_shared);
 	kfree(vgic_cpu->vgic_irq_lr_map);
 	vgic_cpu->pending_shared = NULL;
+	vgic_cpu->active_shared = NULL;
+	vgic_cpu->pend_act_shared = NULL;
 	vgic_cpu->vgic_irq_lr_map = NULL;
 }
 
@@ -1392,9 +1500,14 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
 
 	int sz = (nr_irqs - VGIC_NR_PRIVATE_IRQS) / 8;
 	vgic_cpu->pending_shared = kzalloc(sz, GFP_KERNEL);
+	vgic_cpu->active_shared = kzalloc(sz, GFP_KERNEL);
+	vgic_cpu->pend_act_shared = kzalloc(sz, GFP_KERNEL);
 	vgic_cpu->vgic_irq_lr_map = kmalloc(nr_irqs, GFP_KERNEL);
 
-	if (!vgic_cpu->pending_shared || !vgic_cpu->vgic_irq_lr_map) {
+	if (!vgic_cpu->pending_shared
+		|| !vgic_cpu->active_shared
+		|| !vgic_cpu->pend_act_shared
+		|| !vgic_cpu->vgic_irq_lr_map) {
 		kvm_vgic_vcpu_destroy(vcpu);
 		return -ENOMEM;
 	}
@@ -1447,10 +1560,12 @@ void kvm_vgic_destroy(struct kvm *kvm)
 	kfree(dist->irq_spi_mpidr);
 	kfree(dist->irq_spi_target);
 	kfree(dist->irq_pending_on_cpu);
+	kfree(dist->irq_active_on_cpu);
 	dist->irq_sgi_sources = NULL;
 	dist->irq_spi_cpu = NULL;
 	dist->irq_spi_target = NULL;
 	dist->irq_pending_on_cpu = NULL;
+	dist->irq_active_on_cpu = NULL;
 	dist->nr_cpus = 0;
 }
 
@@ -1486,6 +1601,7 @@ int vgic_init(struct kvm *kvm)
 	ret |= vgic_init_bitmap(&dist->irq_pending, nr_cpus, nr_irqs);
 	ret |= vgic_init_bitmap(&dist->irq_soft_pend, nr_cpus, nr_irqs);
 	ret |= vgic_init_bitmap(&dist->irq_queued, nr_cpus, nr_irqs);
+	ret |= vgic_init_bitmap(&dist->irq_active, nr_cpus, nr_irqs);
 	ret |= vgic_init_bitmap(&dist->irq_cfg, nr_cpus, nr_irqs);
 	ret |= vgic_init_bytemap(&dist->irq_priority, nr_cpus, nr_irqs);
 
@@ -1498,10 +1614,13 @@ int vgic_init(struct kvm *kvm)
 				       GFP_KERNEL);
 	dist->irq_pending_on_cpu = kzalloc(BITS_TO_LONGS(nr_cpus) * sizeof(long),
 					   GFP_KERNEL);
+	dist->irq_active_on_cpu = kzalloc(BITS_TO_LONGS(nr_cpus) * sizeof(long),
+					   GFP_KERNEL);
 	if (!dist->irq_sgi_sources ||
 	    !dist->irq_spi_cpu ||
 	    !dist->irq_spi_target ||
-	    !dist->irq_pending_on_cpu) {
+	    !dist->irq_pending_on_cpu ||
+	    !dist->irq_active_on_cpu) {
 		ret = -ENOMEM;
 		goto out;
 	}
diff --git a/virt/kvm/arm/vgic.h b/virt/kvm/arm/vgic.h
index 1e83bdf..1e5a381 100644
--- a/virt/kvm/arm/vgic.h
+++ b/virt/kvm/arm/vgic.h
@@ -107,6 +107,14 @@ bool vgic_handle_set_pending_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
 bool vgic_handle_clear_pending_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
 				   phys_addr_t offset, int vcpu_id);
 
+bool vgic_handle_set_active_reg(struct kvm *kvm,
+				struct kvm_exit_mmio *mmio,
+				phys_addr_t offset, int vcpu_id);
+
+bool vgic_handle_clear_active_reg(struct kvm *kvm,
+				  struct kvm_exit_mmio *mmio,
+				  phys_addr_t offset, int vcpu_id);
+
 bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
 			 phys_addr_t offset);
 
-- 
2.3.0

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 2/5] arm/arm64: KVM: Implement support for unqueueing active IRQs
@ 2015-03-02 13:29   ` Alex Bennée
  0 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-02 13:29 UTC (permalink / raw)
  To: linux-arm-kernel

From: Christoffer Dall <christoffer.dall@linaro.org>

Migrating active interrupts causes the active state to be lost
completely. This implements some additional bitmaps to track the active
state on the distributor and export this to user space.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Benn?e <alex.bennee@linaro.org>

---
AJB:
   - fixed merge conflicts
   - moved additional shared bitmaps to be dynamically allocated
   - make irq_active_on_cpu dynamically allocated as well
   - in vgic_queue_irq don't queue pending if already active
   - in __kvm_vgic_flush_hwstate use pr_shared when checking SPIs
   - vgic: clear active on CPU bit
   - checkpatch, remove extraneous braces
   - checkpatch, remove debug, fix overflows
   - move register access fns to re-factored vgic-v2-emul.c
v2
   - doc: unqueue and update_state

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 7c55dd5..7042251 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -196,6 +196,9 @@ struct vgic_dist {
 	/* Level-triggered interrupt queued on VCPU interface */
 	struct vgic_bitmap	irq_queued;
 
+	/* Interrupt was active when unqueue from VCPU interface */
+	struct vgic_bitmap	irq_active;
+
 	/* Interrupt priority. Not used yet. */
 	struct vgic_bytemap	irq_priority;
 
@@ -236,6 +239,9 @@ struct vgic_dist {
 	/* Bitmap indicating which CPU has something pending */
 	unsigned long		*irq_pending_on_cpu;
 
+	/* Bitmap indicating which CPU has active IRQs */
+	unsigned long		*irq_active_on_cpu;
+
 	struct vgic_vm_ops	vm_ops;
 #endif
 };
@@ -269,9 +275,15 @@ struct vgic_cpu {
 	/* per IRQ to LR mapping */
 	u8		*vgic_irq_lr_map;
 
-	/* Pending interrupts on this VCPU */
+	/* Pending/active/both interrupts on this VCPU */
 	DECLARE_BITMAP(	pending_percpu, VGIC_NR_PRIVATE_IRQS);
+	DECLARE_BITMAP(	active_percpu, VGIC_NR_PRIVATE_IRQS);
+	DECLARE_BITMAP(	pend_act_percpu, VGIC_NR_PRIVATE_IRQS);
+
+	/* Pending/active/both shared interrupts, dynamically sized */
 	unsigned long	*pending_shared;
+	unsigned long   *active_shared;
+	unsigned long   *pend_act_shared;
 
 	/* Bitmap of used/free list registers */
 	DECLARE_BITMAP(	lr_used, VGIC_V2_MAX_LRS);
@@ -311,6 +323,7 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
 			bool level);
 void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
 int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
+int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
 bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		      struct kvm_exit_mmio *mmio);
 
diff --git a/virt/kvm/arm/vgic-v2-emul.c b/virt/kvm/arm/vgic-v2-emul.c
index 19c6210..c818662 100644
--- a/virt/kvm/arm/vgic-v2-emul.c
+++ b/virt/kvm/arm/vgic-v2-emul.c
@@ -107,6 +107,22 @@ static bool handle_mmio_clear_pending_reg(struct kvm_vcpu *vcpu,
 					     vcpu->vcpu_id);
 }
 
+static bool handle_mmio_set_active_reg(struct kvm_vcpu *vcpu,
+				       struct kvm_exit_mmio *mmio,
+				       phys_addr_t offset)
+{
+	return vgic_handle_set_active_reg(vcpu->kvm, mmio, offset,
+					  vcpu->vcpu_id);
+}
+
+static bool handle_mmio_clear_active_reg(struct kvm_vcpu *vcpu,
+					 struct kvm_exit_mmio *mmio,
+					 phys_addr_t offset)
+{
+	return vgic_handle_clear_active_reg(vcpu->kvm, mmio, offset,
+					    vcpu->vcpu_id);
+}
+
 static bool handle_mmio_priority_reg(struct kvm_vcpu *vcpu,
 				     struct kvm_exit_mmio *mmio,
 				     phys_addr_t offset)
@@ -344,13 +360,13 @@ static const struct kvm_mmio_range vgic_dist_ranges[] = {
 		.base		= GIC_DIST_ACTIVE_SET,
 		.len		= VGIC_MAX_IRQS / 8,
 		.bits_per_irq	= 1,
-		.handle_mmio	= handle_mmio_raz_wi,
+		.handle_mmio	= handle_mmio_set_active_reg,
 	},
 	{
 		.base		= GIC_DIST_ACTIVE_CLEAR,
 		.len		= VGIC_MAX_IRQS / 8,
 		.bits_per_irq	= 1,
-		.handle_mmio	= handle_mmio_raz_wi,
+		.handle_mmio	= handle_mmio_clear_active_reg,
 	},
 	{
 		.base		= GIC_DIST_PRI,
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 0cc6ab6..bfb6fbb 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -277,6 +277,14 @@ static void vgic_irq_clear_queued(struct kvm_vcpu *vcpu, int irq)
 	vgic_bitmap_set_irq_val(&dist->irq_queued, vcpu->vcpu_id, irq, 0);
 }
 
+
+static void vgic_irq_set_active(struct kvm_vcpu *vcpu, int irq)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+	vgic_bitmap_set_irq_val(&dist->irq_active, vcpu->vcpu_id, irq, 1);
+}
+
 static int vgic_dist_irq_get_level(struct kvm_vcpu *vcpu, int irq)
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
@@ -520,6 +528,44 @@ bool vgic_handle_clear_pending_reg(struct kvm *kvm,
 	return false;
 }
 
+bool vgic_handle_set_active_reg(struct kvm *kvm,
+				struct kvm_exit_mmio *mmio,
+				phys_addr_t offset, int vcpu_id)
+{
+	u32 *reg;
+	struct vgic_dist *dist = &kvm->arch.vgic;
+
+	reg = vgic_bitmap_get_reg(&dist->irq_active, vcpu_id, offset);
+	vgic_reg_access(mmio, reg, offset,
+			ACCESS_READ_VALUE | ACCESS_WRITE_SETBIT);
+
+	if (mmio->is_write) {
+		vgic_update_state(kvm);
+		return true;
+	}
+
+	return false;
+}
+
+bool vgic_handle_clear_active_reg(struct kvm *kvm,
+				  struct kvm_exit_mmio *mmio,
+				  phys_addr_t offset, int vcpu_id)
+{
+	u32 *reg;
+	struct vgic_dist *dist = &kvm->arch.vgic;
+
+	reg = vgic_bitmap_get_reg(&dist->irq_active, vcpu_id, offset);
+	vgic_reg_access(mmio, reg, offset,
+			ACCESS_READ_VALUE | ACCESS_WRITE_CLEARBIT);
+
+	if (mmio->is_write) {
+		vgic_update_state(kvm);
+		return true;
+	}
+
+	return false;
+}
+
 static u32 vgic_cfg_expand(u16 val)
 {
 	u32 res = 0;
@@ -588,16 +634,12 @@ bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
 }
 
 /**
- * vgic_unqueue_irqs - move pending IRQs from LRs to the distributor
+ * vgic_unqueue_irqs - move pending/active IRQs from LRs to the distributor
  * @vgic_cpu: Pointer to the vgic_cpu struct holding the LRs
  *
- * Move any pending IRQs that have already been assigned to LRs back to the
+ * Move any IRQs that have already been assigned to LRs back to the
  * emulated distributor state so that the complete emulated state can be read
  * from the main emulation structures without investigating the LRs.
- *
- * Note that IRQs in the active state in the LRs get their pending state moved
- * to the distributor but the active state stays in the LRs, because we don't
- * track the active state on the distributor side.
  */
 void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 {
@@ -613,12 +655,22 @@ void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 		 * 01: pending
 		 * 10: active
 		 * 11: pending and active
-		 *
-		 * If the LR holds only an active interrupt (not pending) then
-		 * just leave it alone.
 		 */
-		if ((lr.state & LR_STATE_MASK) == LR_STATE_ACTIVE)
-			continue;
+		BUG_ON(!(lr.state & LR_STATE_MASK));
+
+		/* Reestablish SGI source for pending and active IRQs */
+		if (lr.irq < VGIC_NR_SGIS)
+			add_sgi_source(vcpu, lr.irq, lr.source);
+
+		/*
+		 * If the LR holds an active (10) or a pending and active (11)
+		 * interrupt then move this state to the distributor tracking
+		 * bit.
+		 */
+		if (lr.state & LR_STATE_ACTIVE) {
+			vgic_irq_set_active(vcpu, lr.irq);
+			lr.state &= ~LR_STATE_ACTIVE;
+		}
 
 		/*
 		 * Reestablish the pending state on the distributor and the
@@ -626,21 +678,19 @@ void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 		 * is fine, then we are only setting a few bits that were
 		 * already set.
 		 */
-		vgic_dist_irq_set_pending(vcpu, lr.irq);
-		if (lr.irq < VGIC_NR_SGIS)
-			add_sgi_source(vcpu, lr.irq, lr.source);
-		lr.state &= ~LR_STATE_PENDING;
+		if (lr.state & LR_STATE_PENDING) {
+			vgic_dist_irq_set_pending(vcpu, lr.irq);
+			lr.state &= ~LR_STATE_PENDING;
+		}
+
 		vgic_set_lr(vcpu, i, lr);
 
 		/*
-		 * If there's no state left on the LR (it could still be
-		 * active), then the LR does not hold any useful info and can
-		 * be marked as free for other use.
+		 * Marked the LR as free for other use.
 		 */
-		if (!(lr.state & LR_STATE_MASK)) {
-			vgic_retire_lr(i, lr.irq, vcpu);
-			vgic_irq_clear_queued(vcpu, lr.irq);
-		}
+		BUG_ON(lr.state & LR_STATE_MASK);
+		vgic_retire_lr(i, lr.irq, vcpu);
+		vgic_irq_clear_queued(vcpu, lr.irq);
 
 		/* Finally update the VGIC state. */
 		vgic_update_state(vcpu->kvm);
@@ -804,6 +854,36 @@ static int vgic_nr_shared_irqs(struct vgic_dist *dist)
 	return dist->nr_irqs - VGIC_NR_PRIVATE_IRQS;
 }
 
+static int compute_active_for_cpu(struct kvm_vcpu *vcpu)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	unsigned long *active, *enabled, *act_percpu, *act_shared;
+	unsigned long active_private, active_shared;
+	int nr_shared = vgic_nr_shared_irqs(dist);
+	int vcpu_id;
+
+	vcpu_id = vcpu->vcpu_id;
+	act_percpu = vcpu->arch.vgic_cpu.active_percpu;
+	act_shared = vcpu->arch.vgic_cpu.active_shared;
+
+	active = vgic_bitmap_get_cpu_map(&dist->irq_active, vcpu_id);
+	enabled = vgic_bitmap_get_cpu_map(&dist->irq_enabled, vcpu_id);
+	bitmap_and(act_percpu, active, enabled, VGIC_NR_PRIVATE_IRQS);
+
+	active = vgic_bitmap_get_shared_map(&dist->irq_active);
+	enabled = vgic_bitmap_get_shared_map(&dist->irq_enabled);
+	bitmap_and(act_shared, active, enabled, nr_shared);
+	bitmap_and(act_shared, act_shared,
+		   vgic_bitmap_get_shared_map(&dist->irq_spi_target[vcpu_id]),
+		   nr_shared);
+
+	active_private = find_first_bit(act_percpu, VGIC_NR_PRIVATE_IRQS);
+	active_shared = find_first_bit(act_shared, nr_shared);
+
+	return (active_private < VGIC_NR_PRIVATE_IRQS ||
+		active_shared < nr_shared);
+}
+
 static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
@@ -835,7 +915,7 @@ static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
 
 /*
  * Update the interrupt state and determine which CPUs have pending
- * interrupts. Must be called with distributor lock held.
+ * or active interrupts. Must be called with distributor lock held.
  */
 void vgic_update_state(struct kvm *kvm)
 {
@@ -849,10 +929,13 @@ void vgic_update_state(struct kvm *kvm)
 	}
 
 	kvm_for_each_vcpu(c, vcpu, kvm) {
-		if (compute_pending_for_cpu(vcpu)) {
-			pr_debug("CPU%d has pending interrupts\n", c);
+		if (compute_pending_for_cpu(vcpu))
 			set_bit(c, dist->irq_pending_on_cpu);
-		}
+
+		if (compute_active_for_cpu(vcpu))
+			set_bit(c, dist->irq_active_on_cpu);
+		else
+			clear_bit(c, dist->irq_active_on_cpu);
 	}
 }
 
@@ -1030,39 +1113,49 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+	unsigned long *pa_percpu, *pa_shared;
 	int i, vcpu_id;
 	int overflow = 0;
+	int nr_shared = vgic_nr_shared_irqs(dist);
 
 	vcpu_id = vcpu->vcpu_id;
 
+	pa_percpu = vcpu->arch.vgic_cpu.pend_act_percpu;
+	pa_shared = vcpu->arch.vgic_cpu.pend_act_shared;
+
+	bitmap_or(pa_percpu, vgic_cpu->pending_percpu, vgic_cpu->active_percpu,
+		  VGIC_NR_PRIVATE_IRQS);
+	bitmap_or(pa_shared, vgic_cpu->pending_shared, vgic_cpu->active_shared,
+		  nr_shared);
 	/*
 	 * We may not have any pending interrupt, or the interrupts
 	 * may have been serviced from another vcpu. In all cases,
 	 * move along.
 	 */
-	if (!kvm_vgic_vcpu_pending_irq(vcpu)) {
-		pr_debug("CPU%d has no pending interrupt\n", vcpu_id);
+	if (!kvm_vgic_vcpu_pending_irq(vcpu) && !kvm_vgic_vcpu_active_irq(vcpu))
 		goto epilog;
-	}
 
 	/* SGIs */
-	for_each_set_bit(i, vgic_cpu->pending_percpu, VGIC_NR_SGIS) {
+	for_each_set_bit(i, pa_percpu, VGIC_NR_SGIS) {
 		if (!queue_sgi(vcpu, i))
 			overflow = 1;
 	}
 
 	/* PPIs */
-	for_each_set_bit_from(i, vgic_cpu->pending_percpu, VGIC_NR_PRIVATE_IRQS) {
+	for_each_set_bit_from(i, pa_percpu, VGIC_NR_PRIVATE_IRQS) {
 		if (!vgic_queue_hwirq(vcpu, i))
 			overflow = 1;
 	}
 
 	/* SPIs */
-	for_each_set_bit(i, vgic_cpu->pending_shared, vgic_nr_shared_irqs(dist)) {
+	for_each_set_bit(i, pa_shared, nr_shared) {
 		if (!vgic_queue_hwirq(vcpu, i + VGIC_NR_PRIVATE_IRQS))
 			overflow = 1;
 	}
 
+
+
+
 epilog:
 	if (overflow) {
 		vgic_enable_underflow(vcpu);
@@ -1209,6 +1302,17 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
 	return test_bit(vcpu->vcpu_id, dist->irq_pending_on_cpu);
 }
 
+int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+	if (!irqchip_in_kernel(vcpu->kvm))
+		return 0;
+
+	return test_bit(vcpu->vcpu_id, dist->irq_active_on_cpu);
+}
+
+
 void vgic_kick_vcpus(struct kvm *kvm)
 {
 	struct kvm_vcpu *vcpu;
@@ -1381,8 +1485,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 
 	kfree(vgic_cpu->pending_shared);
+	kfree(vgic_cpu->active_shared);
+	kfree(vgic_cpu->pend_act_shared);
 	kfree(vgic_cpu->vgic_irq_lr_map);
 	vgic_cpu->pending_shared = NULL;
+	vgic_cpu->active_shared = NULL;
+	vgic_cpu->pend_act_shared = NULL;
 	vgic_cpu->vgic_irq_lr_map = NULL;
 }
 
@@ -1392,9 +1500,14 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
 
 	int sz = (nr_irqs - VGIC_NR_PRIVATE_IRQS) / 8;
 	vgic_cpu->pending_shared = kzalloc(sz, GFP_KERNEL);
+	vgic_cpu->active_shared = kzalloc(sz, GFP_KERNEL);
+	vgic_cpu->pend_act_shared = kzalloc(sz, GFP_KERNEL);
 	vgic_cpu->vgic_irq_lr_map = kmalloc(nr_irqs, GFP_KERNEL);
 
-	if (!vgic_cpu->pending_shared || !vgic_cpu->vgic_irq_lr_map) {
+	if (!vgic_cpu->pending_shared
+		|| !vgic_cpu->active_shared
+		|| !vgic_cpu->pend_act_shared
+		|| !vgic_cpu->vgic_irq_lr_map) {
 		kvm_vgic_vcpu_destroy(vcpu);
 		return -ENOMEM;
 	}
@@ -1447,10 +1560,12 @@ void kvm_vgic_destroy(struct kvm *kvm)
 	kfree(dist->irq_spi_mpidr);
 	kfree(dist->irq_spi_target);
 	kfree(dist->irq_pending_on_cpu);
+	kfree(dist->irq_active_on_cpu);
 	dist->irq_sgi_sources = NULL;
 	dist->irq_spi_cpu = NULL;
 	dist->irq_spi_target = NULL;
 	dist->irq_pending_on_cpu = NULL;
+	dist->irq_active_on_cpu = NULL;
 	dist->nr_cpus = 0;
 }
 
@@ -1486,6 +1601,7 @@ int vgic_init(struct kvm *kvm)
 	ret |= vgic_init_bitmap(&dist->irq_pending, nr_cpus, nr_irqs);
 	ret |= vgic_init_bitmap(&dist->irq_soft_pend, nr_cpus, nr_irqs);
 	ret |= vgic_init_bitmap(&dist->irq_queued, nr_cpus, nr_irqs);
+	ret |= vgic_init_bitmap(&dist->irq_active, nr_cpus, nr_irqs);
 	ret |= vgic_init_bitmap(&dist->irq_cfg, nr_cpus, nr_irqs);
 	ret |= vgic_init_bytemap(&dist->irq_priority, nr_cpus, nr_irqs);
 
@@ -1498,10 +1614,13 @@ int vgic_init(struct kvm *kvm)
 				       GFP_KERNEL);
 	dist->irq_pending_on_cpu = kzalloc(BITS_TO_LONGS(nr_cpus) * sizeof(long),
 					   GFP_KERNEL);
+	dist->irq_active_on_cpu = kzalloc(BITS_TO_LONGS(nr_cpus) * sizeof(long),
+					   GFP_KERNEL);
 	if (!dist->irq_sgi_sources ||
 	    !dist->irq_spi_cpu ||
 	    !dist->irq_spi_target ||
-	    !dist->irq_pending_on_cpu) {
+	    !dist->irq_pending_on_cpu ||
+	    !dist->irq_active_on_cpu) {
 		ret = -ENOMEM;
 		goto out;
 	}
diff --git a/virt/kvm/arm/vgic.h b/virt/kvm/arm/vgic.h
index 1e83bdf..1e5a381 100644
--- a/virt/kvm/arm/vgic.h
+++ b/virt/kvm/arm/vgic.h
@@ -107,6 +107,14 @@ bool vgic_handle_set_pending_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
 bool vgic_handle_clear_pending_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
 				   phys_addr_t offset, int vcpu_id);
 
+bool vgic_handle_set_active_reg(struct kvm *kvm,
+				struct kvm_exit_mmio *mmio,
+				phys_addr_t offset, int vcpu_id);
+
+bool vgic_handle_clear_active_reg(struct kvm *kvm,
+				  struct kvm_exit_mmio *mmio,
+				  phys_addr_t offset, int vcpu_id);
+
 bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
 			 phys_addr_t offset);
 
-- 
2.3.0

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 3/5] arm: KVM: add a common vgic_queue_irq_to_lr fn
  2015-03-02 13:28 ` Alex Bennée
  (?)
@ 2015-03-02 13:29   ` Alex Bennée
  -1 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-02 13:29 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier
  Cc: Alex Bennée, Gleb Natapov, Paolo Bonzini, open list

This helps re-factor away some of the repetitive code and makes the code
flow more nicely.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index bfb6fbb..3b4ded2 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -263,6 +263,13 @@ static int vgic_irq_is_queued(struct kvm_vcpu *vcpu, int irq)
 	return vgic_bitmap_get_irq_val(&dist->irq_queued, vcpu->vcpu_id, irq);
 }
 
+static int vgic_irq_is_active(struct kvm_vcpu *vcpu, int irq)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+	return vgic_bitmap_get_irq_val(&dist->irq_active, vcpu->vcpu_id, irq);
+}
+
 static void vgic_irq_set_queued(struct kvm_vcpu *vcpu, int irq)
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
@@ -285,6 +292,13 @@ static void vgic_irq_set_active(struct kvm_vcpu *vcpu, int irq)
 	vgic_bitmap_set_irq_val(&dist->irq_active, vcpu->vcpu_id, irq, 1);
 }
 
+static void vgic_irq_clear_active(struct kvm_vcpu *vcpu, int irq)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+	vgic_bitmap_set_irq_val(&dist->irq_active, vcpu->vcpu_id, irq, 0);
+}
+
 static int vgic_dist_irq_get_level(struct kvm_vcpu *vcpu, int irq)
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
@@ -1032,6 +1046,25 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
 	}
 }
 
+static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
+				 int lr_nr, struct vgic_lr vlr)
+{
+	if (vgic_irq_is_active(vcpu, irq)) {
+		vlr.state |= LR_STATE_ACTIVE;
+		kvm_debug("Set active, clear distributor: 0x%x\n", vlr.state);
+		vgic_irq_clear_active(vcpu, irq);
+		vgic_update_state(vcpu->kvm);
+	} else if (vgic_dist_irq_is_pending(vcpu, irq)) {
+		vlr.state |= LR_STATE_PENDING;
+		kvm_debug("Set pending: 0x%x\n", vlr.state);
+	}
+
+	if (!vgic_irq_is_edge(vcpu, irq))
+		vlr.state |= LR_EOI_INT;
+
+	vgic_set_lr(vcpu, lr_nr, vlr);
+}
+
 /*
  * Queue an interrupt to a CPU virtual interface. Return true on success,
  * or false if it wasn't possible to queue it.
@@ -1059,8 +1092,7 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 		if (vlr.source == sgi_source_id) {
 			kvm_debug("LR%d piggyback for IRQ%d\n", lr, vlr.irq);
 			BUG_ON(!test_bit(lr, vgic_cpu->lr_used));
-			vlr.state |= LR_STATE_PENDING;
-			vgic_set_lr(vcpu, lr, vlr);
+			vgic_queue_irq_to_lr(vcpu, irq, lr, vlr);
 			return true;
 		}
 	}
@@ -1077,11 +1109,8 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 
 	vlr.irq = irq;
 	vlr.source = sgi_source_id;
-	vlr.state = LR_STATE_PENDING;
-	if (!vgic_irq_is_edge(vcpu, irq))
-		vlr.state |= LR_EOI_INT;
-
-	vgic_set_lr(vcpu, lr, vlr);
+	vlr.state = 0;
+	vgic_queue_irq_to_lr(vcpu, irq, lr, vlr);
 
 	return true;
 }
-- 
2.3.0


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 3/5] arm: KVM: add a common vgic_queue_irq_to_lr fn
@ 2015-03-02 13:29   ` Alex Bennée
  0 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-02 13:29 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier
  Cc: Gleb Natapov, Paolo Bonzini, open list

This helps re-factor away some of the repetitive code and makes the code
flow more nicely.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index bfb6fbb..3b4ded2 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -263,6 +263,13 @@ static int vgic_irq_is_queued(struct kvm_vcpu *vcpu, int irq)
 	return vgic_bitmap_get_irq_val(&dist->irq_queued, vcpu->vcpu_id, irq);
 }
 
+static int vgic_irq_is_active(struct kvm_vcpu *vcpu, int irq)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+	return vgic_bitmap_get_irq_val(&dist->irq_active, vcpu->vcpu_id, irq);
+}
+
 static void vgic_irq_set_queued(struct kvm_vcpu *vcpu, int irq)
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
@@ -285,6 +292,13 @@ static void vgic_irq_set_active(struct kvm_vcpu *vcpu, int irq)
 	vgic_bitmap_set_irq_val(&dist->irq_active, vcpu->vcpu_id, irq, 1);
 }
 
+static void vgic_irq_clear_active(struct kvm_vcpu *vcpu, int irq)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+	vgic_bitmap_set_irq_val(&dist->irq_active, vcpu->vcpu_id, irq, 0);
+}
+
 static int vgic_dist_irq_get_level(struct kvm_vcpu *vcpu, int irq)
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
@@ -1032,6 +1046,25 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
 	}
 }
 
+static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
+				 int lr_nr, struct vgic_lr vlr)
+{
+	if (vgic_irq_is_active(vcpu, irq)) {
+		vlr.state |= LR_STATE_ACTIVE;
+		kvm_debug("Set active, clear distributor: 0x%x\n", vlr.state);
+		vgic_irq_clear_active(vcpu, irq);
+		vgic_update_state(vcpu->kvm);
+	} else if (vgic_dist_irq_is_pending(vcpu, irq)) {
+		vlr.state |= LR_STATE_PENDING;
+		kvm_debug("Set pending: 0x%x\n", vlr.state);
+	}
+
+	if (!vgic_irq_is_edge(vcpu, irq))
+		vlr.state |= LR_EOI_INT;
+
+	vgic_set_lr(vcpu, lr_nr, vlr);
+}
+
 /*
  * Queue an interrupt to a CPU virtual interface. Return true on success,
  * or false if it wasn't possible to queue it.
@@ -1059,8 +1092,7 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 		if (vlr.source == sgi_source_id) {
 			kvm_debug("LR%d piggyback for IRQ%d\n", lr, vlr.irq);
 			BUG_ON(!test_bit(lr, vgic_cpu->lr_used));
-			vlr.state |= LR_STATE_PENDING;
-			vgic_set_lr(vcpu, lr, vlr);
+			vgic_queue_irq_to_lr(vcpu, irq, lr, vlr);
 			return true;
 		}
 	}
@@ -1077,11 +1109,8 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 
 	vlr.irq = irq;
 	vlr.source = sgi_source_id;
-	vlr.state = LR_STATE_PENDING;
-	if (!vgic_irq_is_edge(vcpu, irq))
-		vlr.state |= LR_EOI_INT;
-
-	vgic_set_lr(vcpu, lr, vlr);
+	vlr.state = 0;
+	vgic_queue_irq_to_lr(vcpu, irq, lr, vlr);
 
 	return true;
 }
-- 
2.3.0

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 3/5] arm: KVM: add a common vgic_queue_irq_to_lr fn
@ 2015-03-02 13:29   ` Alex Bennée
  0 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-02 13:29 UTC (permalink / raw)
  To: linux-arm-kernel

This helps re-factor away some of the repetitive code and makes the code
flow more nicely.

Signed-off-by: Alex Benn?e <alex.bennee@linaro.org>

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index bfb6fbb..3b4ded2 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -263,6 +263,13 @@ static int vgic_irq_is_queued(struct kvm_vcpu *vcpu, int irq)
 	return vgic_bitmap_get_irq_val(&dist->irq_queued, vcpu->vcpu_id, irq);
 }
 
+static int vgic_irq_is_active(struct kvm_vcpu *vcpu, int irq)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+	return vgic_bitmap_get_irq_val(&dist->irq_active, vcpu->vcpu_id, irq);
+}
+
 static void vgic_irq_set_queued(struct kvm_vcpu *vcpu, int irq)
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
@@ -285,6 +292,13 @@ static void vgic_irq_set_active(struct kvm_vcpu *vcpu, int irq)
 	vgic_bitmap_set_irq_val(&dist->irq_active, vcpu->vcpu_id, irq, 1);
 }
 
+static void vgic_irq_clear_active(struct kvm_vcpu *vcpu, int irq)
+{
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+	vgic_bitmap_set_irq_val(&dist->irq_active, vcpu->vcpu_id, irq, 0);
+}
+
 static int vgic_dist_irq_get_level(struct kvm_vcpu *vcpu, int irq)
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
@@ -1032,6 +1046,25 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
 	}
 }
 
+static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
+				 int lr_nr, struct vgic_lr vlr)
+{
+	if (vgic_irq_is_active(vcpu, irq)) {
+		vlr.state |= LR_STATE_ACTIVE;
+		kvm_debug("Set active, clear distributor: 0x%x\n", vlr.state);
+		vgic_irq_clear_active(vcpu, irq);
+		vgic_update_state(vcpu->kvm);
+	} else if (vgic_dist_irq_is_pending(vcpu, irq)) {
+		vlr.state |= LR_STATE_PENDING;
+		kvm_debug("Set pending: 0x%x\n", vlr.state);
+	}
+
+	if (!vgic_irq_is_edge(vcpu, irq))
+		vlr.state |= LR_EOI_INT;
+
+	vgic_set_lr(vcpu, lr_nr, vlr);
+}
+
 /*
  * Queue an interrupt to a CPU virtual interface. Return true on success,
  * or false if it wasn't possible to queue it.
@@ -1059,8 +1092,7 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 		if (vlr.source == sgi_source_id) {
 			kvm_debug("LR%d piggyback for IRQ%d\n", lr, vlr.irq);
 			BUG_ON(!test_bit(lr, vgic_cpu->lr_used));
-			vlr.state |= LR_STATE_PENDING;
-			vgic_set_lr(vcpu, lr, vlr);
+			vgic_queue_irq_to_lr(vcpu, irq, lr, vlr);
 			return true;
 		}
 	}
@@ -1077,11 +1109,8 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 
 	vlr.irq = irq;
 	vlr.source = sgi_source_id;
-	vlr.state = LR_STATE_PENDING;
-	if (!vgic_irq_is_edge(vcpu, irq))
-		vlr.state |= LR_EOI_INT;
-
-	vgic_set_lr(vcpu, lr, vlr);
+	vlr.state = 0;
+	vgic_queue_irq_to_lr(vcpu, irq, lr, vlr);
 
 	return true;
 }
-- 
2.3.0

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 4/5] arm/arm64: KVM: Fix migration race in the arch timer
  2015-03-02 13:28 ` Alex Bennée
  (?)
@ 2015-03-02 13:29   ` Alex Bennée
  -1 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-02 13:29 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier
  Cc: Alex Bennée, Gleb Natapov, Paolo Bonzini, Russell King, open list

From: Christoffer Dall <christoffer.dall@linaro.org>

When a VCPU is no longer running, we currently check to see if it has a
timer scheduled in the future, and if it does, we schedule a host
hrtimer to notify is in case the timer expires while the VCPU is still
not running.  When the hrtimer fires, we mask the guest's timer and
inject the timer IRQ (still relying on the guest unmasking the time when
it receives the IRQ).

This is all good and fine, but when migration a VM (checkpoint/restore)
this introduces a race.  It is unlikely, but possible, for the following
sequence of events to happen:

 1. Userspace stops the VM
 2. Hrtimer for VCPU is scheduled
 3. Userspace checkpoints the VGIC state (no pending timer interrupts)
 4. The hrtimer fires, schedules work in a workqueue
 5. Workqueue function runs, masks the timer and injects timer interrupt
 6. Userspace checkpoints the timer state (timer masked)

At restore time, you end up with a masked timer without any timer
interrupts and your guest halts never receiving timer interrupts.

Fix this by only kicking the VCPU in the workqueue function, and sample
the expired state of the timer when entering the guest again and inject
the interrupt and mask the timer only then.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 8531536..f7fd76e 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -269,7 +269,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 
 int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
 {
-	return 0;
+	return kvm_timer_should_fire(vcpu);
 }
 
 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index b3f45a5..98cc9f4 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -72,6 +72,8 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu);
 u64 kvm_arm_timer_get_reg(struct kvm_vcpu *, u64 regid);
 int kvm_arm_timer_set_reg(struct kvm_vcpu *, u64 regid, u64 value);
 
+bool kvm_timer_should_fire(struct kvm_vcpu *vcpu);
+
 #else
 static inline int kvm_timer_hyp_init(void)
 {
@@ -96,6 +98,11 @@ static inline u64 kvm_arm_timer_get_reg(struct kvm_vcpu *vcpu, u64 regid)
 {
 	return 0;
 }
+
+static inline bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
+{
+	return false;
+}
 #endif
 
 #endif
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 6e54f35..98c95f2 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -85,13 +85,22 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
+/*
+ * Work function for handling the backup timer that we schedule when a vcpu is
+ * no longer running, but had a timer programmed to fire in the future.
+ */
 static void kvm_timer_inject_irq_work(struct work_struct *work)
 {
 	struct kvm_vcpu *vcpu;
 
 	vcpu = container_of(work, struct kvm_vcpu, arch.timer_cpu.expired);
 	vcpu->arch.timer_cpu.armed = false;
-	kvm_timer_inject_irq(vcpu);
+
+	/*
+	 * If the vcpu is blocked we want to wake it up so that it will see
+	 * the timer has expired when entering the guest.
+	 */
+	kvm_vcpu_kick(vcpu);
 }
 
 static enum hrtimer_restart kvm_timer_expire(struct hrtimer *hrt)
@@ -102,6 +111,21 @@ static enum hrtimer_restart kvm_timer_expire(struct hrtimer *hrt)
 	return HRTIMER_NORESTART;
 }
 
+bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
+{
+	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
+	cycle_t cval, now;
+
+	if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) ||
+		!(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE))
+		return false;
+
+	cval = timer->cntv_cval;
+	now = kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff;
+
+	return cval <= now;
+}
+
 /**
  * kvm_timer_flush_hwstate - prepare to move the virt timer to the cpu
  * @vcpu: The vcpu pointer
@@ -119,6 +143,13 @@ void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu)
 	 * populate the CPU timer again.
 	 */
 	timer_disarm(timer);
+
+	/*
+	 * If the timer expired while we were not scheduled, now is the time
+	 * to inject it.
+	 */
+	if (kvm_timer_should_fire(vcpu))
+		kvm_timer_inject_irq(vcpu);
 }
 
 /**
@@ -134,16 +165,9 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
 	cycle_t cval, now;
 	u64 ns;
 
-	if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) ||
-		!(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE))
-		return;
-
-	cval = timer->cntv_cval;
-	now = kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff;
-
 	BUG_ON(timer_is_armed(timer));
 
-	if (cval <= now) {
+	if (kvm_timer_should_fire(vcpu)) {
 		/*
 		 * Timer has already expired while we were not
 		 * looking. Inject the interrupt and carry on.
@@ -152,6 +176,9 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
 		return;
 	}
 
+	cval = timer->cntv_cval;
+	now = kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff;
+
 	ns = cyclecounter_cyc2ns(timecounter->cc, cval - now, timecounter->mask,
 				 &timecounter->frac);
 	timer_arm(timer, ns);
-- 
2.3.0


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 4/5] arm/arm64: KVM: Fix migration race in the arch timer
@ 2015-03-02 13:29   ` Alex Bennée
  0 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-02 13:29 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier
  Cc: Gleb Natapov, Paolo Bonzini, open list, Russell King

From: Christoffer Dall <christoffer.dall@linaro.org>

When a VCPU is no longer running, we currently check to see if it has a
timer scheduled in the future, and if it does, we schedule a host
hrtimer to notify is in case the timer expires while the VCPU is still
not running.  When the hrtimer fires, we mask the guest's timer and
inject the timer IRQ (still relying on the guest unmasking the time when
it receives the IRQ).

This is all good and fine, but when migration a VM (checkpoint/restore)
this introduces a race.  It is unlikely, but possible, for the following
sequence of events to happen:

 1. Userspace stops the VM
 2. Hrtimer for VCPU is scheduled
 3. Userspace checkpoints the VGIC state (no pending timer interrupts)
 4. The hrtimer fires, schedules work in a workqueue
 5. Workqueue function runs, masks the timer and injects timer interrupt
 6. Userspace checkpoints the timer state (timer masked)

At restore time, you end up with a masked timer without any timer
interrupts and your guest halts never receiving timer interrupts.

Fix this by only kicking the VCPU in the workqueue function, and sample
the expired state of the timer when entering the guest again and inject
the interrupt and mask the timer only then.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 8531536..f7fd76e 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -269,7 +269,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 
 int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
 {
-	return 0;
+	return kvm_timer_should_fire(vcpu);
 }
 
 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index b3f45a5..98cc9f4 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -72,6 +72,8 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu);
 u64 kvm_arm_timer_get_reg(struct kvm_vcpu *, u64 regid);
 int kvm_arm_timer_set_reg(struct kvm_vcpu *, u64 regid, u64 value);
 
+bool kvm_timer_should_fire(struct kvm_vcpu *vcpu);
+
 #else
 static inline int kvm_timer_hyp_init(void)
 {
@@ -96,6 +98,11 @@ static inline u64 kvm_arm_timer_get_reg(struct kvm_vcpu *vcpu, u64 regid)
 {
 	return 0;
 }
+
+static inline bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
+{
+	return false;
+}
 #endif
 
 #endif
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 6e54f35..98c95f2 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -85,13 +85,22 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
+/*
+ * Work function for handling the backup timer that we schedule when a vcpu is
+ * no longer running, but had a timer programmed to fire in the future.
+ */
 static void kvm_timer_inject_irq_work(struct work_struct *work)
 {
 	struct kvm_vcpu *vcpu;
 
 	vcpu = container_of(work, struct kvm_vcpu, arch.timer_cpu.expired);
 	vcpu->arch.timer_cpu.armed = false;
-	kvm_timer_inject_irq(vcpu);
+
+	/*
+	 * If the vcpu is blocked we want to wake it up so that it will see
+	 * the timer has expired when entering the guest.
+	 */
+	kvm_vcpu_kick(vcpu);
 }
 
 static enum hrtimer_restart kvm_timer_expire(struct hrtimer *hrt)
@@ -102,6 +111,21 @@ static enum hrtimer_restart kvm_timer_expire(struct hrtimer *hrt)
 	return HRTIMER_NORESTART;
 }
 
+bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
+{
+	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
+	cycle_t cval, now;
+
+	if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) ||
+		!(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE))
+		return false;
+
+	cval = timer->cntv_cval;
+	now = kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff;
+
+	return cval <= now;
+}
+
 /**
  * kvm_timer_flush_hwstate - prepare to move the virt timer to the cpu
  * @vcpu: The vcpu pointer
@@ -119,6 +143,13 @@ void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu)
 	 * populate the CPU timer again.
 	 */
 	timer_disarm(timer);
+
+	/*
+	 * If the timer expired while we were not scheduled, now is the time
+	 * to inject it.
+	 */
+	if (kvm_timer_should_fire(vcpu))
+		kvm_timer_inject_irq(vcpu);
 }
 
 /**
@@ -134,16 +165,9 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
 	cycle_t cval, now;
 	u64 ns;
 
-	if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) ||
-		!(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE))
-		return;
-
-	cval = timer->cntv_cval;
-	now = kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff;
-
 	BUG_ON(timer_is_armed(timer));
 
-	if (cval <= now) {
+	if (kvm_timer_should_fire(vcpu)) {
 		/*
 		 * Timer has already expired while we were not
 		 * looking. Inject the interrupt and carry on.
@@ -152,6 +176,9 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
 		return;
 	}
 
+	cval = timer->cntv_cval;
+	now = kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff;
+
 	ns = cyclecounter_cyc2ns(timecounter->cc, cval - now, timecounter->mask,
 				 &timecounter->frac);
 	timer_arm(timer, ns);
-- 
2.3.0

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 4/5] arm/arm64: KVM: Fix migration race in the arch timer
@ 2015-03-02 13:29   ` Alex Bennée
  0 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-02 13:29 UTC (permalink / raw)
  To: linux-arm-kernel

From: Christoffer Dall <christoffer.dall@linaro.org>

When a VCPU is no longer running, we currently check to see if it has a
timer scheduled in the future, and if it does, we schedule a host
hrtimer to notify is in case the timer expires while the VCPU is still
not running.  When the hrtimer fires, we mask the guest's timer and
inject the timer IRQ (still relying on the guest unmasking the time when
it receives the IRQ).

This is all good and fine, but when migration a VM (checkpoint/restore)
this introduces a race.  It is unlikely, but possible, for the following
sequence of events to happen:

 1. Userspace stops the VM
 2. Hrtimer for VCPU is scheduled
 3. Userspace checkpoints the VGIC state (no pending timer interrupts)
 4. The hrtimer fires, schedules work in a workqueue
 5. Workqueue function runs, masks the timer and injects timer interrupt
 6. Userspace checkpoints the timer state (timer masked)

At restore time, you end up with a masked timer without any timer
interrupts and your guest halts never receiving timer interrupts.

Fix this by only kicking the VCPU in the workqueue function, and sample
the expired state of the timer when entering the guest again and inject
the interrupt and mask the timer only then.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Benn?e <alex.bennee@linaro.org>

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 8531536..f7fd76e 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -269,7 +269,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 
 int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
 {
-	return 0;
+	return kvm_timer_should_fire(vcpu);
 }
 
 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index b3f45a5..98cc9f4 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -72,6 +72,8 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu);
 u64 kvm_arm_timer_get_reg(struct kvm_vcpu *, u64 regid);
 int kvm_arm_timer_set_reg(struct kvm_vcpu *, u64 regid, u64 value);
 
+bool kvm_timer_should_fire(struct kvm_vcpu *vcpu);
+
 #else
 static inline int kvm_timer_hyp_init(void)
 {
@@ -96,6 +98,11 @@ static inline u64 kvm_arm_timer_get_reg(struct kvm_vcpu *vcpu, u64 regid)
 {
 	return 0;
 }
+
+static inline bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
+{
+	return false;
+}
 #endif
 
 #endif
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 6e54f35..98c95f2 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -85,13 +85,22 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
+/*
+ * Work function for handling the backup timer that we schedule when a vcpu is
+ * no longer running, but had a timer programmed to fire in the future.
+ */
 static void kvm_timer_inject_irq_work(struct work_struct *work)
 {
 	struct kvm_vcpu *vcpu;
 
 	vcpu = container_of(work, struct kvm_vcpu, arch.timer_cpu.expired);
 	vcpu->arch.timer_cpu.armed = false;
-	kvm_timer_inject_irq(vcpu);
+
+	/*
+	 * If the vcpu is blocked we want to wake it up so that it will see
+	 * the timer has expired when entering the guest.
+	 */
+	kvm_vcpu_kick(vcpu);
 }
 
 static enum hrtimer_restart kvm_timer_expire(struct hrtimer *hrt)
@@ -102,6 +111,21 @@ static enum hrtimer_restart kvm_timer_expire(struct hrtimer *hrt)
 	return HRTIMER_NORESTART;
 }
 
+bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
+{
+	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
+	cycle_t cval, now;
+
+	if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) ||
+		!(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE))
+		return false;
+
+	cval = timer->cntv_cval;
+	now = kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff;
+
+	return cval <= now;
+}
+
 /**
  * kvm_timer_flush_hwstate - prepare to move the virt timer to the cpu
  * @vcpu: The vcpu pointer
@@ -119,6 +143,13 @@ void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu)
 	 * populate the CPU timer again.
 	 */
 	timer_disarm(timer);
+
+	/*
+	 * If the timer expired while we were not scheduled, now is the time
+	 * to inject it.
+	 */
+	if (kvm_timer_should_fire(vcpu))
+		kvm_timer_inject_irq(vcpu);
 }
 
 /**
@@ -134,16 +165,9 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
 	cycle_t cval, now;
 	u64 ns;
 
-	if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) ||
-		!(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE))
-		return;
-
-	cval = timer->cntv_cval;
-	now = kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff;
-
 	BUG_ON(timer_is_armed(timer));
 
-	if (cval <= now) {
+	if (kvm_timer_should_fire(vcpu)) {
 		/*
 		 * Timer has already expired while we were not
 		 * looking. Inject the interrupt and carry on.
@@ -152,6 +176,9 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
 		return;
 	}
 
+	cval = timer->cntv_cval;
+	now = kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff;
+
 	ns = cyclecounter_cyc2ns(timecounter->cc, cval - now, timecounter->mask,
 				 &timecounter->frac);
 	timer_arm(timer, ns);
-- 
2.3.0

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 5/5] arm/arm64: KVM: Keep elrsr/aisr in sync with software model
  2015-03-02 13:28 ` Alex Bennée
  (?)
@ 2015-03-02 13:29   ` Alex Bennée
  -1 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-02 13:29 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier
  Cc: Alex Bennée, Gleb Natapov, Paolo Bonzini, open list

From: Christoffer Dall <christoffer.dall@linaro.org>

There is an interesting bug in the vgic code, which manifests itself
when the KVM run loop has a signal pending or needs a vmid generation
rollover after having disabled interrupts but before actually switching
to the guest.

In this case, we flush the vgic as usual, but we sync back the vgic
state and exit to userspace before entering the guest.  The consequence
is that we will be syncing the list registers back to the software model
using the GICH_ELRSR and GICH_EISR from the last execution of the guest,
potentially overwriting a list register containing an interrupt.

This showed up during migration testing where we would capture a state
where the VM has masked the arch timer but there were no interrupts,
resulting in a hung test.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Reported-by: Alex Bennee <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 7042251..e2a676e 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -114,6 +114,7 @@ struct vgic_ops {
 	void	(*sync_lr_elrsr)(struct kvm_vcpu *, int, struct vgic_lr);
 	u64	(*get_elrsr)(const struct kvm_vcpu *vcpu);
 	u64	(*get_eisr)(const struct kvm_vcpu *vcpu);
+	void	(*clear_eisr)(struct kvm_vcpu *vcpu);
 	u32	(*get_interrupt_status)(const struct kvm_vcpu *vcpu);
 	void	(*enable_underflow)(struct kvm_vcpu *vcpu);
 	void	(*disable_underflow)(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
index a0a7b5d..f9b9c7c 100644
--- a/virt/kvm/arm/vgic-v2.c
+++ b/virt/kvm/arm/vgic-v2.c
@@ -72,6 +72,8 @@ static void vgic_v2_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
 {
 	if (!(lr_desc.state & LR_STATE_MASK))
 		vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr |= (1ULL << lr);
+	else
+		vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr &= ~(1ULL << lr);
 }
 
 static u64 vgic_v2_get_elrsr(const struct kvm_vcpu *vcpu)
@@ -84,6 +86,11 @@ static u64 vgic_v2_get_eisr(const struct kvm_vcpu *vcpu)
 	return vcpu->arch.vgic_cpu.vgic_v2.vgic_eisr;
 }
 
+static void vgic_v2_clear_eisr(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.vgic_cpu.vgic_v2.vgic_eisr = 0;
+}
+
 static u32 vgic_v2_get_interrupt_status(const struct kvm_vcpu *vcpu)
 {
 	u32 misr = vcpu->arch.vgic_cpu.vgic_v2.vgic_misr;
@@ -148,6 +155,7 @@ static const struct vgic_ops vgic_v2_ops = {
 	.sync_lr_elrsr		= vgic_v2_sync_lr_elrsr,
 	.get_elrsr		= vgic_v2_get_elrsr,
 	.get_eisr		= vgic_v2_get_eisr,
+	.clear_eisr		= vgic_v2_clear_eisr,
 	.get_interrupt_status	= vgic_v2_get_interrupt_status,
 	.enable_underflow	= vgic_v2_enable_underflow,
 	.disable_underflow	= vgic_v2_disable_underflow,
diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
index 3a62d8a..dff0602 100644
--- a/virt/kvm/arm/vgic-v3.c
+++ b/virt/kvm/arm/vgic-v3.c
@@ -104,6 +104,8 @@ static void vgic_v3_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
 {
 	if (!(lr_desc.state & LR_STATE_MASK))
 		vcpu->arch.vgic_cpu.vgic_v3.vgic_elrsr |= (1U << lr);
+	else
+		vcpu->arch.vgic_cpu.vgic_v3.vgic_elrsr &= ~(1U << lr);
 }
 
 static u64 vgic_v3_get_elrsr(const struct kvm_vcpu *vcpu)
@@ -116,6 +118,11 @@ static u64 vgic_v3_get_eisr(const struct kvm_vcpu *vcpu)
 	return vcpu->arch.vgic_cpu.vgic_v3.vgic_eisr;
 }
 
+static void vgic_v3_clear_eisr(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.vgic_cpu.vgic_v3.vgic_eisr = 0;
+}
+
 static u32 vgic_v3_get_interrupt_status(const struct kvm_vcpu *vcpu)
 {
 	u32 misr = vcpu->arch.vgic_cpu.vgic_v3.vgic_misr;
@@ -192,6 +199,7 @@ static const struct vgic_ops vgic_v3_ops = {
 	.sync_lr_elrsr		= vgic_v3_sync_lr_elrsr,
 	.get_elrsr		= vgic_v3_get_elrsr,
 	.get_eisr		= vgic_v3_get_eisr,
+	.clear_eisr		= vgic_v3_clear_eisr,
 	.get_interrupt_status	= vgic_v3_get_interrupt_status,
 	.enable_underflow	= vgic_v3_enable_underflow,
 	.disable_underflow	= vgic_v3_disable_underflow,
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 3b4ded2..3690c1e 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -980,6 +980,11 @@ static inline u64 vgic_get_eisr(struct kvm_vcpu *vcpu)
 	return vgic_ops->get_eisr(vcpu);
 }
 
+static inline void vgic_clear_eisr(struct kvm_vcpu *vcpu)
+{
+	vgic_ops->clear_eisr(vcpu);
+}
+
 static inline u32 vgic_get_interrupt_status(struct kvm_vcpu *vcpu)
 {
 	return vgic_ops->get_interrupt_status(vcpu);
@@ -1019,6 +1024,7 @@ static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu)
 	vgic_set_lr(vcpu, lr_nr, vlr);
 	clear_bit(lr_nr, vgic_cpu->lr_used);
 	vgic_cpu->vgic_irq_lr_map[irq] = LR_EMPTY;
+	vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
 
 /*
@@ -1063,6 +1069,7 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
 		vlr.state |= LR_EOI_INT;
 
 	vgic_set_lr(vcpu, lr_nr, vlr);
+	vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
 
 /*
@@ -1258,6 +1265,14 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
 	if (status & INT_STATUS_UNDERFLOW)
 		vgic_disable_underflow(vcpu);
 
+	/*
+	 * In the next iterations of the vcpu loop, if we sync the vgic state
+	 * after flushing it, but before entering the guest (this happens for
+	 * pending signals and vmid rollovers), then make sure we don't pick
+	 * up any old maintenance interrupts here.
+	 */
+	vgic_clear_eisr(vcpu);
+
 	return level_pending;
 }
 
-- 
2.3.0


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 5/5] arm/arm64: KVM: Keep elrsr/aisr in sync with software model
@ 2015-03-02 13:29   ` Alex Bennée
  0 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-02 13:29 UTC (permalink / raw)
  To: kvm, linux-arm-kernel, kvmarm, christoffer.dall, marc.zyngier
  Cc: Gleb Natapov, Paolo Bonzini, open list

From: Christoffer Dall <christoffer.dall@linaro.org>

There is an interesting bug in the vgic code, which manifests itself
when the KVM run loop has a signal pending or needs a vmid generation
rollover after having disabled interrupts but before actually switching
to the guest.

In this case, we flush the vgic as usual, but we sync back the vgic
state and exit to userspace before entering the guest.  The consequence
is that we will be syncing the list registers back to the software model
using the GICH_ELRSR and GICH_EISR from the last execution of the guest,
potentially overwriting a list register containing an interrupt.

This showed up during migration testing where we would capture a state
where the VM has masked the arch timer but there were no interrupts,
resulting in a hung test.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Reported-by: Alex Bennee <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 7042251..e2a676e 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -114,6 +114,7 @@ struct vgic_ops {
 	void	(*sync_lr_elrsr)(struct kvm_vcpu *, int, struct vgic_lr);
 	u64	(*get_elrsr)(const struct kvm_vcpu *vcpu);
 	u64	(*get_eisr)(const struct kvm_vcpu *vcpu);
+	void	(*clear_eisr)(struct kvm_vcpu *vcpu);
 	u32	(*get_interrupt_status)(const struct kvm_vcpu *vcpu);
 	void	(*enable_underflow)(struct kvm_vcpu *vcpu);
 	void	(*disable_underflow)(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
index a0a7b5d..f9b9c7c 100644
--- a/virt/kvm/arm/vgic-v2.c
+++ b/virt/kvm/arm/vgic-v2.c
@@ -72,6 +72,8 @@ static void vgic_v2_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
 {
 	if (!(lr_desc.state & LR_STATE_MASK))
 		vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr |= (1ULL << lr);
+	else
+		vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr &= ~(1ULL << lr);
 }
 
 static u64 vgic_v2_get_elrsr(const struct kvm_vcpu *vcpu)
@@ -84,6 +86,11 @@ static u64 vgic_v2_get_eisr(const struct kvm_vcpu *vcpu)
 	return vcpu->arch.vgic_cpu.vgic_v2.vgic_eisr;
 }
 
+static void vgic_v2_clear_eisr(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.vgic_cpu.vgic_v2.vgic_eisr = 0;
+}
+
 static u32 vgic_v2_get_interrupt_status(const struct kvm_vcpu *vcpu)
 {
 	u32 misr = vcpu->arch.vgic_cpu.vgic_v2.vgic_misr;
@@ -148,6 +155,7 @@ static const struct vgic_ops vgic_v2_ops = {
 	.sync_lr_elrsr		= vgic_v2_sync_lr_elrsr,
 	.get_elrsr		= vgic_v2_get_elrsr,
 	.get_eisr		= vgic_v2_get_eisr,
+	.clear_eisr		= vgic_v2_clear_eisr,
 	.get_interrupt_status	= vgic_v2_get_interrupt_status,
 	.enable_underflow	= vgic_v2_enable_underflow,
 	.disable_underflow	= vgic_v2_disable_underflow,
diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
index 3a62d8a..dff0602 100644
--- a/virt/kvm/arm/vgic-v3.c
+++ b/virt/kvm/arm/vgic-v3.c
@@ -104,6 +104,8 @@ static void vgic_v3_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
 {
 	if (!(lr_desc.state & LR_STATE_MASK))
 		vcpu->arch.vgic_cpu.vgic_v3.vgic_elrsr |= (1U << lr);
+	else
+		vcpu->arch.vgic_cpu.vgic_v3.vgic_elrsr &= ~(1U << lr);
 }
 
 static u64 vgic_v3_get_elrsr(const struct kvm_vcpu *vcpu)
@@ -116,6 +118,11 @@ static u64 vgic_v3_get_eisr(const struct kvm_vcpu *vcpu)
 	return vcpu->arch.vgic_cpu.vgic_v3.vgic_eisr;
 }
 
+static void vgic_v3_clear_eisr(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.vgic_cpu.vgic_v3.vgic_eisr = 0;
+}
+
 static u32 vgic_v3_get_interrupt_status(const struct kvm_vcpu *vcpu)
 {
 	u32 misr = vcpu->arch.vgic_cpu.vgic_v3.vgic_misr;
@@ -192,6 +199,7 @@ static const struct vgic_ops vgic_v3_ops = {
 	.sync_lr_elrsr		= vgic_v3_sync_lr_elrsr,
 	.get_elrsr		= vgic_v3_get_elrsr,
 	.get_eisr		= vgic_v3_get_eisr,
+	.clear_eisr		= vgic_v3_clear_eisr,
 	.get_interrupt_status	= vgic_v3_get_interrupt_status,
 	.enable_underflow	= vgic_v3_enable_underflow,
 	.disable_underflow	= vgic_v3_disable_underflow,
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 3b4ded2..3690c1e 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -980,6 +980,11 @@ static inline u64 vgic_get_eisr(struct kvm_vcpu *vcpu)
 	return vgic_ops->get_eisr(vcpu);
 }
 
+static inline void vgic_clear_eisr(struct kvm_vcpu *vcpu)
+{
+	vgic_ops->clear_eisr(vcpu);
+}
+
 static inline u32 vgic_get_interrupt_status(struct kvm_vcpu *vcpu)
 {
 	return vgic_ops->get_interrupt_status(vcpu);
@@ -1019,6 +1024,7 @@ static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu)
 	vgic_set_lr(vcpu, lr_nr, vlr);
 	clear_bit(lr_nr, vgic_cpu->lr_used);
 	vgic_cpu->vgic_irq_lr_map[irq] = LR_EMPTY;
+	vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
 
 /*
@@ -1063,6 +1069,7 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
 		vlr.state |= LR_EOI_INT;
 
 	vgic_set_lr(vcpu, lr_nr, vlr);
+	vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
 
 /*
@@ -1258,6 +1265,14 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
 	if (status & INT_STATUS_UNDERFLOW)
 		vgic_disable_underflow(vcpu);
 
+	/*
+	 * In the next iterations of the vcpu loop, if we sync the vgic state
+	 * after flushing it, but before entering the guest (this happens for
+	 * pending signals and vmid rollovers), then make sure we don't pick
+	 * up any old maintenance interrupts here.
+	 */
+	vgic_clear_eisr(vcpu);
+
 	return level_pending;
 }
 
-- 
2.3.0

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 5/5] arm/arm64: KVM: Keep elrsr/aisr in sync with software model
@ 2015-03-02 13:29   ` Alex Bennée
  0 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-02 13:29 UTC (permalink / raw)
  To: linux-arm-kernel

From: Christoffer Dall <christoffer.dall@linaro.org>

There is an interesting bug in the vgic code, which manifests itself
when the KVM run loop has a signal pending or needs a vmid generation
rollover after having disabled interrupts but before actually switching
to the guest.

In this case, we flush the vgic as usual, but we sync back the vgic
state and exit to userspace before entering the guest.  The consequence
is that we will be syncing the list registers back to the software model
using the GICH_ELRSR and GICH_EISR from the last execution of the guest,
potentially overwriting a list register containing an interrupt.

This showed up during migration testing where we would capture a state
where the VM has masked the arch timer but there were no interrupts,
resulting in a hung test.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Reported-by: Alex Bennee <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Benn?e <alex.bennee@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 7042251..e2a676e 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -114,6 +114,7 @@ struct vgic_ops {
 	void	(*sync_lr_elrsr)(struct kvm_vcpu *, int, struct vgic_lr);
 	u64	(*get_elrsr)(const struct kvm_vcpu *vcpu);
 	u64	(*get_eisr)(const struct kvm_vcpu *vcpu);
+	void	(*clear_eisr)(struct kvm_vcpu *vcpu);
 	u32	(*get_interrupt_status)(const struct kvm_vcpu *vcpu);
 	void	(*enable_underflow)(struct kvm_vcpu *vcpu);
 	void	(*disable_underflow)(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
index a0a7b5d..f9b9c7c 100644
--- a/virt/kvm/arm/vgic-v2.c
+++ b/virt/kvm/arm/vgic-v2.c
@@ -72,6 +72,8 @@ static void vgic_v2_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
 {
 	if (!(lr_desc.state & LR_STATE_MASK))
 		vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr |= (1ULL << lr);
+	else
+		vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr &= ~(1ULL << lr);
 }
 
 static u64 vgic_v2_get_elrsr(const struct kvm_vcpu *vcpu)
@@ -84,6 +86,11 @@ static u64 vgic_v2_get_eisr(const struct kvm_vcpu *vcpu)
 	return vcpu->arch.vgic_cpu.vgic_v2.vgic_eisr;
 }
 
+static void vgic_v2_clear_eisr(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.vgic_cpu.vgic_v2.vgic_eisr = 0;
+}
+
 static u32 vgic_v2_get_interrupt_status(const struct kvm_vcpu *vcpu)
 {
 	u32 misr = vcpu->arch.vgic_cpu.vgic_v2.vgic_misr;
@@ -148,6 +155,7 @@ static const struct vgic_ops vgic_v2_ops = {
 	.sync_lr_elrsr		= vgic_v2_sync_lr_elrsr,
 	.get_elrsr		= vgic_v2_get_elrsr,
 	.get_eisr		= vgic_v2_get_eisr,
+	.clear_eisr		= vgic_v2_clear_eisr,
 	.get_interrupt_status	= vgic_v2_get_interrupt_status,
 	.enable_underflow	= vgic_v2_enable_underflow,
 	.disable_underflow	= vgic_v2_disable_underflow,
diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
index 3a62d8a..dff0602 100644
--- a/virt/kvm/arm/vgic-v3.c
+++ b/virt/kvm/arm/vgic-v3.c
@@ -104,6 +104,8 @@ static void vgic_v3_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
 {
 	if (!(lr_desc.state & LR_STATE_MASK))
 		vcpu->arch.vgic_cpu.vgic_v3.vgic_elrsr |= (1U << lr);
+	else
+		vcpu->arch.vgic_cpu.vgic_v3.vgic_elrsr &= ~(1U << lr);
 }
 
 static u64 vgic_v3_get_elrsr(const struct kvm_vcpu *vcpu)
@@ -116,6 +118,11 @@ static u64 vgic_v3_get_eisr(const struct kvm_vcpu *vcpu)
 	return vcpu->arch.vgic_cpu.vgic_v3.vgic_eisr;
 }
 
+static void vgic_v3_clear_eisr(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.vgic_cpu.vgic_v3.vgic_eisr = 0;
+}
+
 static u32 vgic_v3_get_interrupt_status(const struct kvm_vcpu *vcpu)
 {
 	u32 misr = vcpu->arch.vgic_cpu.vgic_v3.vgic_misr;
@@ -192,6 +199,7 @@ static const struct vgic_ops vgic_v3_ops = {
 	.sync_lr_elrsr		= vgic_v3_sync_lr_elrsr,
 	.get_elrsr		= vgic_v3_get_elrsr,
 	.get_eisr		= vgic_v3_get_eisr,
+	.clear_eisr		= vgic_v3_clear_eisr,
 	.get_interrupt_status	= vgic_v3_get_interrupt_status,
 	.enable_underflow	= vgic_v3_enable_underflow,
 	.disable_underflow	= vgic_v3_disable_underflow,
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 3b4ded2..3690c1e 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -980,6 +980,11 @@ static inline u64 vgic_get_eisr(struct kvm_vcpu *vcpu)
 	return vgic_ops->get_eisr(vcpu);
 }
 
+static inline void vgic_clear_eisr(struct kvm_vcpu *vcpu)
+{
+	vgic_ops->clear_eisr(vcpu);
+}
+
 static inline u32 vgic_get_interrupt_status(struct kvm_vcpu *vcpu)
 {
 	return vgic_ops->get_interrupt_status(vcpu);
@@ -1019,6 +1024,7 @@ static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu)
 	vgic_set_lr(vcpu, lr_nr, vlr);
 	clear_bit(lr_nr, vgic_cpu->lr_used);
 	vgic_cpu->vgic_irq_lr_map[irq] = LR_EMPTY;
+	vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
 
 /*
@@ -1063,6 +1069,7 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
 		vlr.state |= LR_EOI_INT;
 
 	vgic_set_lr(vcpu, lr_nr, vlr);
+	vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
 
 /*
@@ -1258,6 +1265,14 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
 	if (status & INT_STATUS_UNDERFLOW)
 		vgic_disable_underflow(vcpu);
 
+	/*
+	 * In the next iterations of the vcpu loop, if we sync the vgic state
+	 * after flushing it, but before entering the guest (this happens for
+	 * pending signals and vmid rollovers), then make sure we don't pick
+	 * up any old maintenance interrupts here.
+	 */
+	vgic_clear_eisr(vcpu);
+
 	return level_pending;
 }
 
-- 
2.3.0

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 1/5] arm: KVM: export vcpi->pause state via MP_STATE ioctls
  2015-03-02 13:29   ` Alex Bennée
  (?)
@ 2015-03-09 13:50     ` Christoffer Dall
  -1 siblings, 0 replies; 35+ messages in thread
From: Christoffer Dall @ 2015-03-09 13:50 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, Gleb Natapov,
	Paolo Bonzini, Jonathan Corbet, Russell King,
	open list:DOCUMENTATION, open list

Hi Alex,

The subject of this change has a typo, and I also think it's not about
exposing the pause state (that's just an internal name/concept), but
about exposing the PSCI state, or simply the VCPU power state.

On Mon, Mar 02, 2015 at 01:29:00PM +0000, Alex Bennée wrote:
> To cleanly restore an SMP VM we need to ensure that the current pause
> state of each vcpu is correctly recorded. Things could get confused if
> the CPU starts running after migration restore completes when it was
> paused before it state was captured.
> 
> We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64
> interface is a lot simpler as the only valid states are
> KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_HALTED.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> 
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index b112efc..602156f 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -997,7 +997,7 @@ for vm-wide capabilities.
>  4.38 KVM_GET_MP_STATE
>  
>  Capability: KVM_CAP_MP_STATE
> -Architectures: x86, s390
> +Architectures: x86, s390, arm, arm64
>  Type: vcpu ioctl
>  Parameters: struct kvm_mp_state (out)
>  Returns: 0 on success; -1 on error
> @@ -1027,15 +1027,21 @@ Possible values are:
>   - KVM_MP_STATE_LOAD:            the vcpu is in a special load/startup state
>                                   [s390]
>  
> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on
> +For x86:
> +
> +This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
> +irqchip, the multiprocessing state must be maintained by userspace on

Nit: I would not taint the git log with this change but instead just
introduce a paragraph below starting with "On arm/arm64, " and you would
get the same effect.

>  these architectures.
>  
> +For arm/arm64:
> +
> +The only states that are valid are KVM_MP_STATE_HALTED and
> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.

As suggested on the QEMU series, HALTED is probably not the right thing
to use.

>  
>  4.39 KVM_SET_MP_STATE
>  
>  Capability: KVM_CAP_MP_STATE
> -Architectures: x86, s390
> +Architectures: x86, s390, arm, arm64
>  Type: vcpu ioctl
>  Parameters: struct kvm_mp_state (in)
>  Returns: 0 on success; -1 on error
> @@ -1043,10 +1049,16 @@ Returns: 0 on success; -1 on error
>  Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
>  arguments.
>  
> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on
> +For x86:
> +
> +This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
> +irqchip, the multiprocessing state must be maintained by userspace on
>  these architectures.
>  
> +For arm/arm64:
> +
> +The only states that are valid are KVM_MP_STATE_HALTED and
> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.

same as above

>  
>  4.40 KVM_SET_IDENTITY_MAP_ADDR
>  
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 5560f74..8531536 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -183,6 +183,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_ARM_PSCI:
>  	case KVM_CAP_ARM_PSCI_0_2:
>  	case KVM_CAP_READONLY_MEM:
> +	case KVM_CAP_MP_STATE:
>  		r = 1;
>  		break;
>  	case KVM_CAP_COALESCED_MMIO:
> @@ -313,13 +314,29 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
>  int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
>  				    struct kvm_mp_state *mp_state)
>  {
> -	return -EINVAL;
> +	if (vcpu->arch.pause)
> +		mp_state->mp_state = KVM_MP_STATE_HALTED;
> +	else
> +		mp_state->mp_state = KVM_MP_STATE_RUNNABLE;
> +
> +	return 0;
>  }
>  
>  int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
>  				    struct kvm_mp_state *mp_state)
>  {
> -	return -EINVAL;
> +	switch (mp_state->mp_state) {
> +	case KVM_MP_STATE_RUNNABLE:
> +		vcpu->arch.pause = false;
> +		break;
> +	case KVM_MP_STATE_HALTED:
> +		vcpu->arch.pause = true;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	return 0;
>  }
>  
>  /**

Are we capturing the vcpu features in some way or do we expect userspace
to migrate these on its own?  The reason I'm asking, is if you create
multiple VCPUs, where all the non-primary VCPUs are started in power off
mode, and then you boot your guest which powers on all VCPUs, and then
you restart your guest (with PSCI RESET), the system will not power on
all the non-primary VCPUs but hold them in power-off.

We need to make sure this behavior is preserved for a reboot across a
migration.  Is it?

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 1/5] arm: KVM: export vcpi->pause state via MP_STATE ioctls
@ 2015-03-09 13:50     ` Christoffer Dall
  0 siblings, 0 replies; 35+ messages in thread
From: Christoffer Dall @ 2015-03-09 13:50 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Russell King, kvm, Jonathan Corbet, marc.zyngier,
	open list:DOCUMENTATION, open list, Gleb Natapov, Paolo Bonzini,
	kvmarm, linux-arm-kernel

Hi Alex,

The subject of this change has a typo, and I also think it's not about
exposing the pause state (that's just an internal name/concept), but
about exposing the PSCI state, or simply the VCPU power state.

On Mon, Mar 02, 2015 at 01:29:00PM +0000, Alex Bennée wrote:
> To cleanly restore an SMP VM we need to ensure that the current pause
> state of each vcpu is correctly recorded. Things could get confused if
> the CPU starts running after migration restore completes when it was
> paused before it state was captured.
> 
> We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64
> interface is a lot simpler as the only valid states are
> KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_HALTED.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> 
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index b112efc..602156f 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -997,7 +997,7 @@ for vm-wide capabilities.
>  4.38 KVM_GET_MP_STATE
>  
>  Capability: KVM_CAP_MP_STATE
> -Architectures: x86, s390
> +Architectures: x86, s390, arm, arm64
>  Type: vcpu ioctl
>  Parameters: struct kvm_mp_state (out)
>  Returns: 0 on success; -1 on error
> @@ -1027,15 +1027,21 @@ Possible values are:
>   - KVM_MP_STATE_LOAD:            the vcpu is in a special load/startup state
>                                   [s390]
>  
> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on
> +For x86:
> +
> +This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
> +irqchip, the multiprocessing state must be maintained by userspace on

Nit: I would not taint the git log with this change but instead just
introduce a paragraph below starting with "On arm/arm64, " and you would
get the same effect.

>  these architectures.
>  
> +For arm/arm64:
> +
> +The only states that are valid are KVM_MP_STATE_HALTED and
> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.

As suggested on the QEMU series, HALTED is probably not the right thing
to use.

>  
>  4.39 KVM_SET_MP_STATE
>  
>  Capability: KVM_CAP_MP_STATE
> -Architectures: x86, s390
> +Architectures: x86, s390, arm, arm64
>  Type: vcpu ioctl
>  Parameters: struct kvm_mp_state (in)
>  Returns: 0 on success; -1 on error
> @@ -1043,10 +1049,16 @@ Returns: 0 on success; -1 on error
>  Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
>  arguments.
>  
> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on
> +For x86:
> +
> +This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
> +irqchip, the multiprocessing state must be maintained by userspace on
>  these architectures.
>  
> +For arm/arm64:
> +
> +The only states that are valid are KVM_MP_STATE_HALTED and
> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.

same as above

>  
>  4.40 KVM_SET_IDENTITY_MAP_ADDR
>  
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 5560f74..8531536 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -183,6 +183,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_ARM_PSCI:
>  	case KVM_CAP_ARM_PSCI_0_2:
>  	case KVM_CAP_READONLY_MEM:
> +	case KVM_CAP_MP_STATE:
>  		r = 1;
>  		break;
>  	case KVM_CAP_COALESCED_MMIO:
> @@ -313,13 +314,29 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
>  int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
>  				    struct kvm_mp_state *mp_state)
>  {
> -	return -EINVAL;
> +	if (vcpu->arch.pause)
> +		mp_state->mp_state = KVM_MP_STATE_HALTED;
> +	else
> +		mp_state->mp_state = KVM_MP_STATE_RUNNABLE;
> +
> +	return 0;
>  }
>  
>  int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
>  				    struct kvm_mp_state *mp_state)
>  {
> -	return -EINVAL;
> +	switch (mp_state->mp_state) {
> +	case KVM_MP_STATE_RUNNABLE:
> +		vcpu->arch.pause = false;
> +		break;
> +	case KVM_MP_STATE_HALTED:
> +		vcpu->arch.pause = true;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	return 0;
>  }
>  
>  /**

Are we capturing the vcpu features in some way or do we expect userspace
to migrate these on its own?  The reason I'm asking, is if you create
multiple VCPUs, where all the non-primary VCPUs are started in power off
mode, and then you boot your guest which powers on all VCPUs, and then
you restart your guest (with PSCI RESET), the system will not power on
all the non-primary VCPUs but hold them in power-off.

We need to make sure this behavior is preserved for a reboot across a
migration.  Is it?

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2 1/5] arm: KVM: export vcpi->pause state via MP_STATE ioctls
@ 2015-03-09 13:50     ` Christoffer Dall
  0 siblings, 0 replies; 35+ messages in thread
From: Christoffer Dall @ 2015-03-09 13:50 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Alex,

The subject of this change has a typo, and I also think it's not about
exposing the pause state (that's just an internal name/concept), but
about exposing the PSCI state, or simply the VCPU power state.

On Mon, Mar 02, 2015 at 01:29:00PM +0000, Alex Benn?e wrote:
> To cleanly restore an SMP VM we need to ensure that the current pause
> state of each vcpu is correctly recorded. Things could get confused if
> the CPU starts running after migration restore completes when it was
> paused before it state was captured.
> 
> We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64
> interface is a lot simpler as the only valid states are
> KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_HALTED.
> 
> Signed-off-by: Alex Benn?e <alex.bennee@linaro.org>
> 
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index b112efc..602156f 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -997,7 +997,7 @@ for vm-wide capabilities.
>  4.38 KVM_GET_MP_STATE
>  
>  Capability: KVM_CAP_MP_STATE
> -Architectures: x86, s390
> +Architectures: x86, s390, arm, arm64
>  Type: vcpu ioctl
>  Parameters: struct kvm_mp_state (out)
>  Returns: 0 on success; -1 on error
> @@ -1027,15 +1027,21 @@ Possible values are:
>   - KVM_MP_STATE_LOAD:            the vcpu is in a special load/startup state
>                                   [s390]
>  
> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on
> +For x86:
> +
> +This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
> +irqchip, the multiprocessing state must be maintained by userspace on

Nit: I would not taint the git log with this change but instead just
introduce a paragraph below starting with "On arm/arm64, " and you would
get the same effect.

>  these architectures.
>  
> +For arm/arm64:
> +
> +The only states that are valid are KVM_MP_STATE_HALTED and
> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.

As suggested on the QEMU series, HALTED is probably not the right thing
to use.

>  
>  4.39 KVM_SET_MP_STATE
>  
>  Capability: KVM_CAP_MP_STATE
> -Architectures: x86, s390
> +Architectures: x86, s390, arm, arm64
>  Type: vcpu ioctl
>  Parameters: struct kvm_mp_state (in)
>  Returns: 0 on success; -1 on error
> @@ -1043,10 +1049,16 @@ Returns: 0 on success; -1 on error
>  Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
>  arguments.
>  
> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on
> +For x86:
> +
> +This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
> +irqchip, the multiprocessing state must be maintained by userspace on
>  these architectures.
>  
> +For arm/arm64:
> +
> +The only states that are valid are KVM_MP_STATE_HALTED and
> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.

same as above

>  
>  4.40 KVM_SET_IDENTITY_MAP_ADDR
>  
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 5560f74..8531536 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -183,6 +183,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_ARM_PSCI:
>  	case KVM_CAP_ARM_PSCI_0_2:
>  	case KVM_CAP_READONLY_MEM:
> +	case KVM_CAP_MP_STATE:
>  		r = 1;
>  		break;
>  	case KVM_CAP_COALESCED_MMIO:
> @@ -313,13 +314,29 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
>  int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
>  				    struct kvm_mp_state *mp_state)
>  {
> -	return -EINVAL;
> +	if (vcpu->arch.pause)
> +		mp_state->mp_state = KVM_MP_STATE_HALTED;
> +	else
> +		mp_state->mp_state = KVM_MP_STATE_RUNNABLE;
> +
> +	return 0;
>  }
>  
>  int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
>  				    struct kvm_mp_state *mp_state)
>  {
> -	return -EINVAL;
> +	switch (mp_state->mp_state) {
> +	case KVM_MP_STATE_RUNNABLE:
> +		vcpu->arch.pause = false;
> +		break;
> +	case KVM_MP_STATE_HALTED:
> +		vcpu->arch.pause = true;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	return 0;
>  }
>  
>  /**

Are we capturing the vcpu features in some way or do we expect userspace
to migrate these on its own?  The reason I'm asking, is if you create
multiple VCPUs, where all the non-primary VCPUs are started in power off
mode, and then you boot your guest which powers on all VCPUs, and then
you restart your guest (with PSCI RESET), the system will not power on
all the non-primary VCPUs but hold them in power-off.

We need to make sure this behavior is preserved for a reboot across a
migration.  Is it?

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 2/5] arm/arm64: KVM: Implement support for unqueueing active IRQs
  2015-03-02 13:29   ` Alex Bennée
@ 2015-03-09 14:41     ` Christoffer Dall
  -1 siblings, 0 replies; 35+ messages in thread
From: Christoffer Dall @ 2015-03-09 14:41 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, Gleb Natapov,
	Paolo Bonzini, open list

On Mon, Mar 02, 2015 at 01:29:01PM +0000, Alex Bennée wrote:
> From: Christoffer Dall <christoffer.dall@linaro.org>
> 
> Migrating active interrupts causes the active state to be lost
> completely. This implements some additional bitmaps to track the active
> state on the distributor and export this to user space.
> 
> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> 
> ---
> AJB:
>    - fixed merge conflicts
>    - moved additional shared bitmaps to be dynamically allocated
>    - make irq_active_on_cpu dynamically allocated as well
>    - in vgic_queue_irq don't queue pending if already active
>    - in __kvm_vgic_flush_hwstate use pr_shared when checking SPIs
>    - vgic: clear active on CPU bit
>    - checkpatch, remove extraneous braces
>    - checkpatch, remove debug, fix overflows
>    - move register access fns to re-factored vgic-v2-emul.c
> v2
>    - doc: unqueue and update_state
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 7c55dd5..7042251 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -196,6 +196,9 @@ struct vgic_dist {
>  	/* Level-triggered interrupt queued on VCPU interface */
>  	struct vgic_bitmap	irq_queued;
>  
> +	/* Interrupt was active when unqueue from VCPU interface */
> +	struct vgic_bitmap	irq_active;
> +
>  	/* Interrupt priority. Not used yet. */
>  	struct vgic_bytemap	irq_priority;
>  
> @@ -236,6 +239,9 @@ struct vgic_dist {
>  	/* Bitmap indicating which CPU has something pending */
>  	unsigned long		*irq_pending_on_cpu;
>  
> +	/* Bitmap indicating which CPU has active IRQs */
> +	unsigned long		*irq_active_on_cpu;
> +
>  	struct vgic_vm_ops	vm_ops;
>  #endif
>  };
> @@ -269,9 +275,15 @@ struct vgic_cpu {
>  	/* per IRQ to LR mapping */
>  	u8		*vgic_irq_lr_map;
>  
> -	/* Pending interrupts on this VCPU */
> +	/* Pending/active/both interrupts on this VCPU */
>  	DECLARE_BITMAP(	pending_percpu, VGIC_NR_PRIVATE_IRQS);
> +	DECLARE_BITMAP(	active_percpu, VGIC_NR_PRIVATE_IRQS);
> +	DECLARE_BITMAP(	pend_act_percpu, VGIC_NR_PRIVATE_IRQS);
> +
> +	/* Pending/active/both shared interrupts, dynamically sized */
>  	unsigned long	*pending_shared;
> +	unsigned long   *active_shared;
> +	unsigned long   *pend_act_shared;
>  
>  	/* Bitmap of used/free list registers */
>  	DECLARE_BITMAP(	lr_used, VGIC_V2_MAX_LRS);
> @@ -311,6 +323,7 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>  			bool level);
>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
> +int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>  bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  		      struct kvm_exit_mmio *mmio);
>  
> diff --git a/virt/kvm/arm/vgic-v2-emul.c b/virt/kvm/arm/vgic-v2-emul.c
> index 19c6210..c818662 100644
> --- a/virt/kvm/arm/vgic-v2-emul.c
> +++ b/virt/kvm/arm/vgic-v2-emul.c
> @@ -107,6 +107,22 @@ static bool handle_mmio_clear_pending_reg(struct kvm_vcpu *vcpu,
>  					     vcpu->vcpu_id);
>  }
>  
> +static bool handle_mmio_set_active_reg(struct kvm_vcpu *vcpu,
> +				       struct kvm_exit_mmio *mmio,
> +				       phys_addr_t offset)
> +{
> +	return vgic_handle_set_active_reg(vcpu->kvm, mmio, offset,
> +					  vcpu->vcpu_id);
> +}
> +
> +static bool handle_mmio_clear_active_reg(struct kvm_vcpu *vcpu,
> +					 struct kvm_exit_mmio *mmio,
> +					 phys_addr_t offset)
> +{
> +	return vgic_handle_clear_active_reg(vcpu->kvm, mmio, offset,
> +					    vcpu->vcpu_id);
> +}
> +
>  static bool handle_mmio_priority_reg(struct kvm_vcpu *vcpu,
>  				     struct kvm_exit_mmio *mmio,
>  				     phys_addr_t offset)
> @@ -344,13 +360,13 @@ static const struct kvm_mmio_range vgic_dist_ranges[] = {
>  		.base		= GIC_DIST_ACTIVE_SET,
>  		.len		= VGIC_MAX_IRQS / 8,
>  		.bits_per_irq	= 1,
> -		.handle_mmio	= handle_mmio_raz_wi,
> +		.handle_mmio	= handle_mmio_set_active_reg,
>  	},
>  	{
>  		.base		= GIC_DIST_ACTIVE_CLEAR,
>  		.len		= VGIC_MAX_IRQS / 8,
>  		.bits_per_irq	= 1,
> -		.handle_mmio	= handle_mmio_raz_wi,
> +		.handle_mmio	= handle_mmio_clear_active_reg,
>  	},
>  	{
>  		.base		= GIC_DIST_PRI,
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 0cc6ab6..bfb6fbb 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -277,6 +277,14 @@ static void vgic_irq_clear_queued(struct kvm_vcpu *vcpu, int irq)
>  	vgic_bitmap_set_irq_val(&dist->irq_queued, vcpu->vcpu_id, irq, 0);
>  }
>  
> +
> +static void vgic_irq_set_active(struct kvm_vcpu *vcpu, int irq)
> +{
> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +
> +	vgic_bitmap_set_irq_val(&dist->irq_active, vcpu->vcpu_id, irq, 1);
> +}
> +
>  static int vgic_dist_irq_get_level(struct kvm_vcpu *vcpu, int irq)
>  {
>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> @@ -520,6 +528,44 @@ bool vgic_handle_clear_pending_reg(struct kvm *kvm,
>  	return false;
>  }
>  
> +bool vgic_handle_set_active_reg(struct kvm *kvm,
> +				struct kvm_exit_mmio *mmio,
> +				phys_addr_t offset, int vcpu_id)
> +{
> +	u32 *reg;
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +
> +	reg = vgic_bitmap_get_reg(&dist->irq_active, vcpu_id, offset);
> +	vgic_reg_access(mmio, reg, offset,
> +			ACCESS_READ_VALUE | ACCESS_WRITE_SETBIT);
> +
> +	if (mmio->is_write) {
> +		vgic_update_state(kvm);
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
> +bool vgic_handle_clear_active_reg(struct kvm *kvm,
> +				  struct kvm_exit_mmio *mmio,
> +				  phys_addr_t offset, int vcpu_id)
> +{
> +	u32 *reg;
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +
> +	reg = vgic_bitmap_get_reg(&dist->irq_active, vcpu_id, offset);
> +	vgic_reg_access(mmio, reg, offset,
> +			ACCESS_READ_VALUE | ACCESS_WRITE_CLEARBIT);
> +
> +	if (mmio->is_write) {
> +		vgic_update_state(kvm);
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
>  static u32 vgic_cfg_expand(u16 val)
>  {
>  	u32 res = 0;
> @@ -588,16 +634,12 @@ bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
>  }
>  
>  /**
> - * vgic_unqueue_irqs - move pending IRQs from LRs to the distributor
> + * vgic_unqueue_irqs - move pending/active IRQs from LRs to the distributor
>   * @vgic_cpu: Pointer to the vgic_cpu struct holding the LRs
>   *
> - * Move any pending IRQs that have already been assigned to LRs back to the
> + * Move any IRQs that have already been assigned to LRs back to the
>   * emulated distributor state so that the complete emulated state can be read
>   * from the main emulation structures without investigating the LRs.
> - *
> - * Note that IRQs in the active state in the LRs get their pending state moved
> - * to the distributor but the active state stays in the LRs, because we don't
> - * track the active state on the distributor side.
>   */
>  void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
>  {
> @@ -613,12 +655,22 @@ void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
>  		 * 01: pending
>  		 * 10: active
>  		 * 11: pending and active
> -		 *
> -		 * If the LR holds only an active interrupt (not pending) then
> -		 * just leave it alone.
>  		 */
> -		if ((lr.state & LR_STATE_MASK) == LR_STATE_ACTIVE)
> -			continue;
> +		BUG_ON(!(lr.state & LR_STATE_MASK));
> +
> +		/* Reestablish SGI source for pending and active IRQs */
> +		if (lr.irq < VGIC_NR_SGIS)
> +			add_sgi_source(vcpu, lr.irq, lr.source);
> +
> +		/*
> +		 * If the LR holds an active (10) or a pending and active (11)
> +		 * interrupt then move this state to the distributor tracking

if you respin:                         the active state

> +		 * bit.
> +		 */
> +		if (lr.state & LR_STATE_ACTIVE) {
> +			vgic_irq_set_active(vcpu, lr.irq);
> +			lr.state &= ~LR_STATE_ACTIVE;
> +		}
>  
>  		/*
>  		 * Reestablish the pending state on the distributor and the
> @@ -626,21 +678,19 @@ void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
>  		 * is fine, then we are only setting a few bits that were
>  		 * already set.
>  		 */
> -		vgic_dist_irq_set_pending(vcpu, lr.irq);
> -		if (lr.irq < VGIC_NR_SGIS)
> -			add_sgi_source(vcpu, lr.irq, lr.source);
> -		lr.state &= ~LR_STATE_PENDING;
> +		if (lr.state & LR_STATE_PENDING) {
> +			vgic_dist_irq_set_pending(vcpu, lr.irq);
> +			lr.state &= ~LR_STATE_PENDING;
> +		}
> +
>  		vgic_set_lr(vcpu, i, lr);
>  
>  		/*
> -		 * If there's no state left on the LR (it could still be
> -		 * active), then the LR does not hold any useful info and can
> -		 * be marked as free for other use.
> +		 * Marked the LR as free for other use.

if you respin:     Mark the

>  		 */
> -		if (!(lr.state & LR_STATE_MASK)) {
> -			vgic_retire_lr(i, lr.irq, vcpu);
> -			vgic_irq_clear_queued(vcpu, lr.irq);
> -		}
> +		BUG_ON(lr.state & LR_STATE_MASK);
> +		vgic_retire_lr(i, lr.irq, vcpu);
> +		vgic_irq_clear_queued(vcpu, lr.irq);
>  
>  		/* Finally update the VGIC state. */
>  		vgic_update_state(vcpu->kvm);
> @@ -804,6 +854,36 @@ static int vgic_nr_shared_irqs(struct vgic_dist *dist)
>  	return dist->nr_irqs - VGIC_NR_PRIVATE_IRQS;
>  }
>  
> +static int compute_active_for_cpu(struct kvm_vcpu *vcpu)
> +{
> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +	unsigned long *active, *enabled, *act_percpu, *act_shared;
> +	unsigned long active_private, active_shared;
> +	int nr_shared = vgic_nr_shared_irqs(dist);
> +	int vcpu_id;
> +
> +	vcpu_id = vcpu->vcpu_id;
> +	act_percpu = vcpu->arch.vgic_cpu.active_percpu;
> +	act_shared = vcpu->arch.vgic_cpu.active_shared;
> +
> +	active = vgic_bitmap_get_cpu_map(&dist->irq_active, vcpu_id);
> +	enabled = vgic_bitmap_get_cpu_map(&dist->irq_enabled, vcpu_id);
> +	bitmap_and(act_percpu, active, enabled, VGIC_NR_PRIVATE_IRQS);
> +
> +	active = vgic_bitmap_get_shared_map(&dist->irq_active);
> +	enabled = vgic_bitmap_get_shared_map(&dist->irq_enabled);
> +	bitmap_and(act_shared, active, enabled, nr_shared);
> +	bitmap_and(act_shared, act_shared,
> +		   vgic_bitmap_get_shared_map(&dist->irq_spi_target[vcpu_id]),
> +		   nr_shared);
> +
> +	active_private = find_first_bit(act_percpu, VGIC_NR_PRIVATE_IRQS);
> +	active_shared = find_first_bit(act_shared, nr_shared);
> +
> +	return (active_private < VGIC_NR_PRIVATE_IRQS ||
> +		active_shared < nr_shared);
> +}
> +
>  static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
>  {
>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> @@ -835,7 +915,7 @@ static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
>  
>  /*
>   * Update the interrupt state and determine which CPUs have pending
> - * interrupts. Must be called with distributor lock held.
> + * or active interrupts. Must be called with distributor lock held.
>   */
>  void vgic_update_state(struct kvm *kvm)
>  {
> @@ -849,10 +929,13 @@ void vgic_update_state(struct kvm *kvm)
>  	}
>  
>  	kvm_for_each_vcpu(c, vcpu, kvm) {
> -		if (compute_pending_for_cpu(vcpu)) {
> -			pr_debug("CPU%d has pending interrupts\n", c);
> +		if (compute_pending_for_cpu(vcpu))
>  			set_bit(c, dist->irq_pending_on_cpu);
> -		}
> +
> +		if (compute_active_for_cpu(vcpu))
> +			set_bit(c, dist->irq_active_on_cpu);
> +		else
> +			clear_bit(c, dist->irq_active_on_cpu);
>  	}
>  }
>  
> @@ -1030,39 +1113,49 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
>  {
>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +	unsigned long *pa_percpu, *pa_shared;
>  	int i, vcpu_id;
>  	int overflow = 0;
> +	int nr_shared = vgic_nr_shared_irqs(dist);
>  
>  	vcpu_id = vcpu->vcpu_id;
>  
> +	pa_percpu = vcpu->arch.vgic_cpu.pend_act_percpu;
> +	pa_shared = vcpu->arch.vgic_cpu.pend_act_shared;
> +
> +	bitmap_or(pa_percpu, vgic_cpu->pending_percpu, vgic_cpu->active_percpu,
> +		  VGIC_NR_PRIVATE_IRQS);
> +	bitmap_or(pa_shared, vgic_cpu->pending_shared, vgic_cpu->active_shared,
> +		  nr_shared);
>  	/*
>  	 * We may not have any pending interrupt, or the interrupts
>  	 * may have been serviced from another vcpu. In all cases,
>  	 * move along.
>  	 */
> -	if (!kvm_vgic_vcpu_pending_irq(vcpu)) {
> -		pr_debug("CPU%d has no pending interrupt\n", vcpu_id);
> +	if (!kvm_vgic_vcpu_pending_irq(vcpu) && !kvm_vgic_vcpu_active_irq(vcpu))
>  		goto epilog;
> -	}
>  
>  	/* SGIs */
> -	for_each_set_bit(i, vgic_cpu->pending_percpu, VGIC_NR_SGIS) {
> +	for_each_set_bit(i, pa_percpu, VGIC_NR_SGIS) {
>  		if (!queue_sgi(vcpu, i))
>  			overflow = 1;
>  	}
>  
>  	/* PPIs */
> -	for_each_set_bit_from(i, vgic_cpu->pending_percpu, VGIC_NR_PRIVATE_IRQS) {
> +	for_each_set_bit_from(i, pa_percpu, VGIC_NR_PRIVATE_IRQS) {
>  		if (!vgic_queue_hwirq(vcpu, i))
>  			overflow = 1;
>  	}
>  
>  	/* SPIs */
> -	for_each_set_bit(i, vgic_cpu->pending_shared, vgic_nr_shared_irqs(dist)) {
> +	for_each_set_bit(i, pa_shared, nr_shared) {
>  		if (!vgic_queue_hwirq(vcpu, i + VGIC_NR_PRIVATE_IRQS))
>  			overflow = 1;
>  	}

meh, these changes will now produce a quite strange result if one
bisects at this specific patch, since we will be setting pending
interrupts to active....  I thought it was more clear as one patch
containing the changes to vgic_queue_... imho.

Thoughts?

>  
> +
> +
> +
>  epilog:
>  	if (overflow) {
>  		vgic_enable_underflow(vcpu);
> @@ -1209,6 +1302,17 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
>  	return test_bit(vcpu->vcpu_id, dist->irq_pending_on_cpu);
>  }
>  
> +int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu)
> +{
> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +
> +	if (!irqchip_in_kernel(vcpu->kvm))
> +		return 0;
> +
> +	return test_bit(vcpu->vcpu_id, dist->irq_active_on_cpu);
> +}
> +
> +
>  void vgic_kick_vcpus(struct kvm *kvm)
>  {
>  	struct kvm_vcpu *vcpu;
> @@ -1381,8 +1485,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>  
>  	kfree(vgic_cpu->pending_shared);
> +	kfree(vgic_cpu->active_shared);
> +	kfree(vgic_cpu->pend_act_shared);
>  	kfree(vgic_cpu->vgic_irq_lr_map);
>  	vgic_cpu->pending_shared = NULL;
> +	vgic_cpu->active_shared = NULL;
> +	vgic_cpu->pend_act_shared = NULL;
>  	vgic_cpu->vgic_irq_lr_map = NULL;
>  }
>  
> @@ -1392,9 +1500,14 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
>  
>  	int sz = (nr_irqs - VGIC_NR_PRIVATE_IRQS) / 8;
>  	vgic_cpu->pending_shared = kzalloc(sz, GFP_KERNEL);
> +	vgic_cpu->active_shared = kzalloc(sz, GFP_KERNEL);
> +	vgic_cpu->pend_act_shared = kzalloc(sz, GFP_KERNEL);
>  	vgic_cpu->vgic_irq_lr_map = kmalloc(nr_irqs, GFP_KERNEL);
>  
> -	if (!vgic_cpu->pending_shared || !vgic_cpu->vgic_irq_lr_map) {
> +	if (!vgic_cpu->pending_shared
> +		|| !vgic_cpu->active_shared
> +		|| !vgic_cpu->pend_act_shared
> +		|| !vgic_cpu->vgic_irq_lr_map) {
>  		kvm_vgic_vcpu_destroy(vcpu);
>  		return -ENOMEM;
>  	}
> @@ -1447,10 +1560,12 @@ void kvm_vgic_destroy(struct kvm *kvm)
>  	kfree(dist->irq_spi_mpidr);
>  	kfree(dist->irq_spi_target);
>  	kfree(dist->irq_pending_on_cpu);
> +	kfree(dist->irq_active_on_cpu);
>  	dist->irq_sgi_sources = NULL;
>  	dist->irq_spi_cpu = NULL;
>  	dist->irq_spi_target = NULL;
>  	dist->irq_pending_on_cpu = NULL;
> +	dist->irq_active_on_cpu = NULL;
>  	dist->nr_cpus = 0;
>  }
>  
> @@ -1486,6 +1601,7 @@ int vgic_init(struct kvm *kvm)
>  	ret |= vgic_init_bitmap(&dist->irq_pending, nr_cpus, nr_irqs);
>  	ret |= vgic_init_bitmap(&dist->irq_soft_pend, nr_cpus, nr_irqs);
>  	ret |= vgic_init_bitmap(&dist->irq_queued, nr_cpus, nr_irqs);
> +	ret |= vgic_init_bitmap(&dist->irq_active, nr_cpus, nr_irqs);
>  	ret |= vgic_init_bitmap(&dist->irq_cfg, nr_cpus, nr_irqs);
>  	ret |= vgic_init_bytemap(&dist->irq_priority, nr_cpus, nr_irqs);
>  
> @@ -1498,10 +1614,13 @@ int vgic_init(struct kvm *kvm)
>  				       GFP_KERNEL);
>  	dist->irq_pending_on_cpu = kzalloc(BITS_TO_LONGS(nr_cpus) * sizeof(long),
>  					   GFP_KERNEL);
> +	dist->irq_active_on_cpu = kzalloc(BITS_TO_LONGS(nr_cpus) * sizeof(long),
> +					   GFP_KERNEL);
>  	if (!dist->irq_sgi_sources ||
>  	    !dist->irq_spi_cpu ||
>  	    !dist->irq_spi_target ||
> -	    !dist->irq_pending_on_cpu) {
> +	    !dist->irq_pending_on_cpu ||
> +	    !dist->irq_active_on_cpu) {
>  		ret = -ENOMEM;
>  		goto out;
>  	}
> diff --git a/virt/kvm/arm/vgic.h b/virt/kvm/arm/vgic.h
> index 1e83bdf..1e5a381 100644
> --- a/virt/kvm/arm/vgic.h
> +++ b/virt/kvm/arm/vgic.h
> @@ -107,6 +107,14 @@ bool vgic_handle_set_pending_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
>  bool vgic_handle_clear_pending_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
>  				   phys_addr_t offset, int vcpu_id);
>  
> +bool vgic_handle_set_active_reg(struct kvm *kvm,
> +				struct kvm_exit_mmio *mmio,
> +				phys_addr_t offset, int vcpu_id);
> +
> +bool vgic_handle_clear_active_reg(struct kvm *kvm,
> +				  struct kvm_exit_mmio *mmio,
> +				  phys_addr_t offset, int vcpu_id);
> +
>  bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
>  			 phys_addr_t offset);
>  
> -- 
> 2.3.0
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2 2/5] arm/arm64: KVM: Implement support for unqueueing active IRQs
@ 2015-03-09 14:41     ` Christoffer Dall
  0 siblings, 0 replies; 35+ messages in thread
From: Christoffer Dall @ 2015-03-09 14:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Mar 02, 2015 at 01:29:01PM +0000, Alex Benn?e wrote:
> From: Christoffer Dall <christoffer.dall@linaro.org>
> 
> Migrating active interrupts causes the active state to be lost
> completely. This implements some additional bitmaps to track the active
> state on the distributor and export this to user space.
> 
> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> Signed-off-by: Alex Benn?e <alex.bennee@linaro.org>
> 
> ---
> AJB:
>    - fixed merge conflicts
>    - moved additional shared bitmaps to be dynamically allocated
>    - make irq_active_on_cpu dynamically allocated as well
>    - in vgic_queue_irq don't queue pending if already active
>    - in __kvm_vgic_flush_hwstate use pr_shared when checking SPIs
>    - vgic: clear active on CPU bit
>    - checkpatch, remove extraneous braces
>    - checkpatch, remove debug, fix overflows
>    - move register access fns to re-factored vgic-v2-emul.c
> v2
>    - doc: unqueue and update_state
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 7c55dd5..7042251 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -196,6 +196,9 @@ struct vgic_dist {
>  	/* Level-triggered interrupt queued on VCPU interface */
>  	struct vgic_bitmap	irq_queued;
>  
> +	/* Interrupt was active when unqueue from VCPU interface */
> +	struct vgic_bitmap	irq_active;
> +
>  	/* Interrupt priority. Not used yet. */
>  	struct vgic_bytemap	irq_priority;
>  
> @@ -236,6 +239,9 @@ struct vgic_dist {
>  	/* Bitmap indicating which CPU has something pending */
>  	unsigned long		*irq_pending_on_cpu;
>  
> +	/* Bitmap indicating which CPU has active IRQs */
> +	unsigned long		*irq_active_on_cpu;
> +
>  	struct vgic_vm_ops	vm_ops;
>  #endif
>  };
> @@ -269,9 +275,15 @@ struct vgic_cpu {
>  	/* per IRQ to LR mapping */
>  	u8		*vgic_irq_lr_map;
>  
> -	/* Pending interrupts on this VCPU */
> +	/* Pending/active/both interrupts on this VCPU */
>  	DECLARE_BITMAP(	pending_percpu, VGIC_NR_PRIVATE_IRQS);
> +	DECLARE_BITMAP(	active_percpu, VGIC_NR_PRIVATE_IRQS);
> +	DECLARE_BITMAP(	pend_act_percpu, VGIC_NR_PRIVATE_IRQS);
> +
> +	/* Pending/active/both shared interrupts, dynamically sized */
>  	unsigned long	*pending_shared;
> +	unsigned long   *active_shared;
> +	unsigned long   *pend_act_shared;
>  
>  	/* Bitmap of used/free list registers */
>  	DECLARE_BITMAP(	lr_used, VGIC_V2_MAX_LRS);
> @@ -311,6 +323,7 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>  			bool level);
>  void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
> +int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>  bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  		      struct kvm_exit_mmio *mmio);
>  
> diff --git a/virt/kvm/arm/vgic-v2-emul.c b/virt/kvm/arm/vgic-v2-emul.c
> index 19c6210..c818662 100644
> --- a/virt/kvm/arm/vgic-v2-emul.c
> +++ b/virt/kvm/arm/vgic-v2-emul.c
> @@ -107,6 +107,22 @@ static bool handle_mmio_clear_pending_reg(struct kvm_vcpu *vcpu,
>  					     vcpu->vcpu_id);
>  }
>  
> +static bool handle_mmio_set_active_reg(struct kvm_vcpu *vcpu,
> +				       struct kvm_exit_mmio *mmio,
> +				       phys_addr_t offset)
> +{
> +	return vgic_handle_set_active_reg(vcpu->kvm, mmio, offset,
> +					  vcpu->vcpu_id);
> +}
> +
> +static bool handle_mmio_clear_active_reg(struct kvm_vcpu *vcpu,
> +					 struct kvm_exit_mmio *mmio,
> +					 phys_addr_t offset)
> +{
> +	return vgic_handle_clear_active_reg(vcpu->kvm, mmio, offset,
> +					    vcpu->vcpu_id);
> +}
> +
>  static bool handle_mmio_priority_reg(struct kvm_vcpu *vcpu,
>  				     struct kvm_exit_mmio *mmio,
>  				     phys_addr_t offset)
> @@ -344,13 +360,13 @@ static const struct kvm_mmio_range vgic_dist_ranges[] = {
>  		.base		= GIC_DIST_ACTIVE_SET,
>  		.len		= VGIC_MAX_IRQS / 8,
>  		.bits_per_irq	= 1,
> -		.handle_mmio	= handle_mmio_raz_wi,
> +		.handle_mmio	= handle_mmio_set_active_reg,
>  	},
>  	{
>  		.base		= GIC_DIST_ACTIVE_CLEAR,
>  		.len		= VGIC_MAX_IRQS / 8,
>  		.bits_per_irq	= 1,
> -		.handle_mmio	= handle_mmio_raz_wi,
> +		.handle_mmio	= handle_mmio_clear_active_reg,
>  	},
>  	{
>  		.base		= GIC_DIST_PRI,
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 0cc6ab6..bfb6fbb 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -277,6 +277,14 @@ static void vgic_irq_clear_queued(struct kvm_vcpu *vcpu, int irq)
>  	vgic_bitmap_set_irq_val(&dist->irq_queued, vcpu->vcpu_id, irq, 0);
>  }
>  
> +
> +static void vgic_irq_set_active(struct kvm_vcpu *vcpu, int irq)
> +{
> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +
> +	vgic_bitmap_set_irq_val(&dist->irq_active, vcpu->vcpu_id, irq, 1);
> +}
> +
>  static int vgic_dist_irq_get_level(struct kvm_vcpu *vcpu, int irq)
>  {
>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> @@ -520,6 +528,44 @@ bool vgic_handle_clear_pending_reg(struct kvm *kvm,
>  	return false;
>  }
>  
> +bool vgic_handle_set_active_reg(struct kvm *kvm,
> +				struct kvm_exit_mmio *mmio,
> +				phys_addr_t offset, int vcpu_id)
> +{
> +	u32 *reg;
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +
> +	reg = vgic_bitmap_get_reg(&dist->irq_active, vcpu_id, offset);
> +	vgic_reg_access(mmio, reg, offset,
> +			ACCESS_READ_VALUE | ACCESS_WRITE_SETBIT);
> +
> +	if (mmio->is_write) {
> +		vgic_update_state(kvm);
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
> +bool vgic_handle_clear_active_reg(struct kvm *kvm,
> +				  struct kvm_exit_mmio *mmio,
> +				  phys_addr_t offset, int vcpu_id)
> +{
> +	u32 *reg;
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +
> +	reg = vgic_bitmap_get_reg(&dist->irq_active, vcpu_id, offset);
> +	vgic_reg_access(mmio, reg, offset,
> +			ACCESS_READ_VALUE | ACCESS_WRITE_CLEARBIT);
> +
> +	if (mmio->is_write) {
> +		vgic_update_state(kvm);
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
>  static u32 vgic_cfg_expand(u16 val)
>  {
>  	u32 res = 0;
> @@ -588,16 +634,12 @@ bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
>  }
>  
>  /**
> - * vgic_unqueue_irqs - move pending IRQs from LRs to the distributor
> + * vgic_unqueue_irqs - move pending/active IRQs from LRs to the distributor
>   * @vgic_cpu: Pointer to the vgic_cpu struct holding the LRs
>   *
> - * Move any pending IRQs that have already been assigned to LRs back to the
> + * Move any IRQs that have already been assigned to LRs back to the
>   * emulated distributor state so that the complete emulated state can be read
>   * from the main emulation structures without investigating the LRs.
> - *
> - * Note that IRQs in the active state in the LRs get their pending state moved
> - * to the distributor but the active state stays in the LRs, because we don't
> - * track the active state on the distributor side.
>   */
>  void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
>  {
> @@ -613,12 +655,22 @@ void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
>  		 * 01: pending
>  		 * 10: active
>  		 * 11: pending and active
> -		 *
> -		 * If the LR holds only an active interrupt (not pending) then
> -		 * just leave it alone.
>  		 */
> -		if ((lr.state & LR_STATE_MASK) == LR_STATE_ACTIVE)
> -			continue;
> +		BUG_ON(!(lr.state & LR_STATE_MASK));
> +
> +		/* Reestablish SGI source for pending and active IRQs */
> +		if (lr.irq < VGIC_NR_SGIS)
> +			add_sgi_source(vcpu, lr.irq, lr.source);
> +
> +		/*
> +		 * If the LR holds an active (10) or a pending and active (11)
> +		 * interrupt then move this state to the distributor tracking

if you respin:                         the active state

> +		 * bit.
> +		 */
> +		if (lr.state & LR_STATE_ACTIVE) {
> +			vgic_irq_set_active(vcpu, lr.irq);
> +			lr.state &= ~LR_STATE_ACTIVE;
> +		}
>  
>  		/*
>  		 * Reestablish the pending state on the distributor and the
> @@ -626,21 +678,19 @@ void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
>  		 * is fine, then we are only setting a few bits that were
>  		 * already set.
>  		 */
> -		vgic_dist_irq_set_pending(vcpu, lr.irq);
> -		if (lr.irq < VGIC_NR_SGIS)
> -			add_sgi_source(vcpu, lr.irq, lr.source);
> -		lr.state &= ~LR_STATE_PENDING;
> +		if (lr.state & LR_STATE_PENDING) {
> +			vgic_dist_irq_set_pending(vcpu, lr.irq);
> +			lr.state &= ~LR_STATE_PENDING;
> +		}
> +
>  		vgic_set_lr(vcpu, i, lr);
>  
>  		/*
> -		 * If there's no state left on the LR (it could still be
> -		 * active), then the LR does not hold any useful info and can
> -		 * be marked as free for other use.
> +		 * Marked the LR as free for other use.

if you respin:     Mark the

>  		 */
> -		if (!(lr.state & LR_STATE_MASK)) {
> -			vgic_retire_lr(i, lr.irq, vcpu);
> -			vgic_irq_clear_queued(vcpu, lr.irq);
> -		}
> +		BUG_ON(lr.state & LR_STATE_MASK);
> +		vgic_retire_lr(i, lr.irq, vcpu);
> +		vgic_irq_clear_queued(vcpu, lr.irq);
>  
>  		/* Finally update the VGIC state. */
>  		vgic_update_state(vcpu->kvm);
> @@ -804,6 +854,36 @@ static int vgic_nr_shared_irqs(struct vgic_dist *dist)
>  	return dist->nr_irqs - VGIC_NR_PRIVATE_IRQS;
>  }
>  
> +static int compute_active_for_cpu(struct kvm_vcpu *vcpu)
> +{
> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +	unsigned long *active, *enabled, *act_percpu, *act_shared;
> +	unsigned long active_private, active_shared;
> +	int nr_shared = vgic_nr_shared_irqs(dist);
> +	int vcpu_id;
> +
> +	vcpu_id = vcpu->vcpu_id;
> +	act_percpu = vcpu->arch.vgic_cpu.active_percpu;
> +	act_shared = vcpu->arch.vgic_cpu.active_shared;
> +
> +	active = vgic_bitmap_get_cpu_map(&dist->irq_active, vcpu_id);
> +	enabled = vgic_bitmap_get_cpu_map(&dist->irq_enabled, vcpu_id);
> +	bitmap_and(act_percpu, active, enabled, VGIC_NR_PRIVATE_IRQS);
> +
> +	active = vgic_bitmap_get_shared_map(&dist->irq_active);
> +	enabled = vgic_bitmap_get_shared_map(&dist->irq_enabled);
> +	bitmap_and(act_shared, active, enabled, nr_shared);
> +	bitmap_and(act_shared, act_shared,
> +		   vgic_bitmap_get_shared_map(&dist->irq_spi_target[vcpu_id]),
> +		   nr_shared);
> +
> +	active_private = find_first_bit(act_percpu, VGIC_NR_PRIVATE_IRQS);
> +	active_shared = find_first_bit(act_shared, nr_shared);
> +
> +	return (active_private < VGIC_NR_PRIVATE_IRQS ||
> +		active_shared < nr_shared);
> +}
> +
>  static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
>  {
>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> @@ -835,7 +915,7 @@ static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
>  
>  /*
>   * Update the interrupt state and determine which CPUs have pending
> - * interrupts. Must be called with distributor lock held.
> + * or active interrupts. Must be called with distributor lock held.
>   */
>  void vgic_update_state(struct kvm *kvm)
>  {
> @@ -849,10 +929,13 @@ void vgic_update_state(struct kvm *kvm)
>  	}
>  
>  	kvm_for_each_vcpu(c, vcpu, kvm) {
> -		if (compute_pending_for_cpu(vcpu)) {
> -			pr_debug("CPU%d has pending interrupts\n", c);
> +		if (compute_pending_for_cpu(vcpu))
>  			set_bit(c, dist->irq_pending_on_cpu);
> -		}
> +
> +		if (compute_active_for_cpu(vcpu))
> +			set_bit(c, dist->irq_active_on_cpu);
> +		else
> +			clear_bit(c, dist->irq_active_on_cpu);
>  	}
>  }
>  
> @@ -1030,39 +1113,49 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
>  {
>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +	unsigned long *pa_percpu, *pa_shared;
>  	int i, vcpu_id;
>  	int overflow = 0;
> +	int nr_shared = vgic_nr_shared_irqs(dist);
>  
>  	vcpu_id = vcpu->vcpu_id;
>  
> +	pa_percpu = vcpu->arch.vgic_cpu.pend_act_percpu;
> +	pa_shared = vcpu->arch.vgic_cpu.pend_act_shared;
> +
> +	bitmap_or(pa_percpu, vgic_cpu->pending_percpu, vgic_cpu->active_percpu,
> +		  VGIC_NR_PRIVATE_IRQS);
> +	bitmap_or(pa_shared, vgic_cpu->pending_shared, vgic_cpu->active_shared,
> +		  nr_shared);
>  	/*
>  	 * We may not have any pending interrupt, or the interrupts
>  	 * may have been serviced from another vcpu. In all cases,
>  	 * move along.
>  	 */
> -	if (!kvm_vgic_vcpu_pending_irq(vcpu)) {
> -		pr_debug("CPU%d has no pending interrupt\n", vcpu_id);
> +	if (!kvm_vgic_vcpu_pending_irq(vcpu) && !kvm_vgic_vcpu_active_irq(vcpu))
>  		goto epilog;
> -	}
>  
>  	/* SGIs */
> -	for_each_set_bit(i, vgic_cpu->pending_percpu, VGIC_NR_SGIS) {
> +	for_each_set_bit(i, pa_percpu, VGIC_NR_SGIS) {
>  		if (!queue_sgi(vcpu, i))
>  			overflow = 1;
>  	}
>  
>  	/* PPIs */
> -	for_each_set_bit_from(i, vgic_cpu->pending_percpu, VGIC_NR_PRIVATE_IRQS) {
> +	for_each_set_bit_from(i, pa_percpu, VGIC_NR_PRIVATE_IRQS) {
>  		if (!vgic_queue_hwirq(vcpu, i))
>  			overflow = 1;
>  	}
>  
>  	/* SPIs */
> -	for_each_set_bit(i, vgic_cpu->pending_shared, vgic_nr_shared_irqs(dist)) {
> +	for_each_set_bit(i, pa_shared, nr_shared) {
>  		if (!vgic_queue_hwirq(vcpu, i + VGIC_NR_PRIVATE_IRQS))
>  			overflow = 1;
>  	}

meh, these changes will now produce a quite strange result if one
bisects at this specific patch, since we will be setting pending
interrupts to active....  I thought it was more clear as one patch
containing the changes to vgic_queue_... imho.

Thoughts?

>  
> +
> +
> +
>  epilog:
>  	if (overflow) {
>  		vgic_enable_underflow(vcpu);
> @@ -1209,6 +1302,17 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
>  	return test_bit(vcpu->vcpu_id, dist->irq_pending_on_cpu);
>  }
>  
> +int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu)
> +{
> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> +
> +	if (!irqchip_in_kernel(vcpu->kvm))
> +		return 0;
> +
> +	return test_bit(vcpu->vcpu_id, dist->irq_active_on_cpu);
> +}
> +
> +
>  void vgic_kick_vcpus(struct kvm *kvm)
>  {
>  	struct kvm_vcpu *vcpu;
> @@ -1381,8 +1485,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>  
>  	kfree(vgic_cpu->pending_shared);
> +	kfree(vgic_cpu->active_shared);
> +	kfree(vgic_cpu->pend_act_shared);
>  	kfree(vgic_cpu->vgic_irq_lr_map);
>  	vgic_cpu->pending_shared = NULL;
> +	vgic_cpu->active_shared = NULL;
> +	vgic_cpu->pend_act_shared = NULL;
>  	vgic_cpu->vgic_irq_lr_map = NULL;
>  }
>  
> @@ -1392,9 +1500,14 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
>  
>  	int sz = (nr_irqs - VGIC_NR_PRIVATE_IRQS) / 8;
>  	vgic_cpu->pending_shared = kzalloc(sz, GFP_KERNEL);
> +	vgic_cpu->active_shared = kzalloc(sz, GFP_KERNEL);
> +	vgic_cpu->pend_act_shared = kzalloc(sz, GFP_KERNEL);
>  	vgic_cpu->vgic_irq_lr_map = kmalloc(nr_irqs, GFP_KERNEL);
>  
> -	if (!vgic_cpu->pending_shared || !vgic_cpu->vgic_irq_lr_map) {
> +	if (!vgic_cpu->pending_shared
> +		|| !vgic_cpu->active_shared
> +		|| !vgic_cpu->pend_act_shared
> +		|| !vgic_cpu->vgic_irq_lr_map) {
>  		kvm_vgic_vcpu_destroy(vcpu);
>  		return -ENOMEM;
>  	}
> @@ -1447,10 +1560,12 @@ void kvm_vgic_destroy(struct kvm *kvm)
>  	kfree(dist->irq_spi_mpidr);
>  	kfree(dist->irq_spi_target);
>  	kfree(dist->irq_pending_on_cpu);
> +	kfree(dist->irq_active_on_cpu);
>  	dist->irq_sgi_sources = NULL;
>  	dist->irq_spi_cpu = NULL;
>  	dist->irq_spi_target = NULL;
>  	dist->irq_pending_on_cpu = NULL;
> +	dist->irq_active_on_cpu = NULL;
>  	dist->nr_cpus = 0;
>  }
>  
> @@ -1486,6 +1601,7 @@ int vgic_init(struct kvm *kvm)
>  	ret |= vgic_init_bitmap(&dist->irq_pending, nr_cpus, nr_irqs);
>  	ret |= vgic_init_bitmap(&dist->irq_soft_pend, nr_cpus, nr_irqs);
>  	ret |= vgic_init_bitmap(&dist->irq_queued, nr_cpus, nr_irqs);
> +	ret |= vgic_init_bitmap(&dist->irq_active, nr_cpus, nr_irqs);
>  	ret |= vgic_init_bitmap(&dist->irq_cfg, nr_cpus, nr_irqs);
>  	ret |= vgic_init_bytemap(&dist->irq_priority, nr_cpus, nr_irqs);
>  
> @@ -1498,10 +1614,13 @@ int vgic_init(struct kvm *kvm)
>  				       GFP_KERNEL);
>  	dist->irq_pending_on_cpu = kzalloc(BITS_TO_LONGS(nr_cpus) * sizeof(long),
>  					   GFP_KERNEL);
> +	dist->irq_active_on_cpu = kzalloc(BITS_TO_LONGS(nr_cpus) * sizeof(long),
> +					   GFP_KERNEL);
>  	if (!dist->irq_sgi_sources ||
>  	    !dist->irq_spi_cpu ||
>  	    !dist->irq_spi_target ||
> -	    !dist->irq_pending_on_cpu) {
> +	    !dist->irq_pending_on_cpu ||
> +	    !dist->irq_active_on_cpu) {
>  		ret = -ENOMEM;
>  		goto out;
>  	}
> diff --git a/virt/kvm/arm/vgic.h b/virt/kvm/arm/vgic.h
> index 1e83bdf..1e5a381 100644
> --- a/virt/kvm/arm/vgic.h
> +++ b/virt/kvm/arm/vgic.h
> @@ -107,6 +107,14 @@ bool vgic_handle_set_pending_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
>  bool vgic_handle_clear_pending_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
>  				   phys_addr_t offset, int vcpu_id);
>  
> +bool vgic_handle_set_active_reg(struct kvm *kvm,
> +				struct kvm_exit_mmio *mmio,
> +				phys_addr_t offset, int vcpu_id);
> +
> +bool vgic_handle_clear_active_reg(struct kvm *kvm,
> +				  struct kvm_exit_mmio *mmio,
> +				  phys_addr_t offset, int vcpu_id);
> +
>  bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
>  			 phys_addr_t offset);
>  
> -- 
> 2.3.0
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 3/5] arm: KVM: add a common vgic_queue_irq_to_lr fn
  2015-03-02 13:29   ` Alex Bennée
  (?)
@ 2015-03-09 14:47     ` Christoffer Dall
  -1 siblings, 0 replies; 35+ messages in thread
From: Christoffer Dall @ 2015-03-09 14:47 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, Gleb Natapov,
	Paolo Bonzini, open list

On Mon, Mar 02, 2015 at 01:29:02PM +0000, Alex Bennée wrote:
> This helps re-factor away some of the repetitive code and makes the code
> flow more nicely.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> 
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 3/5] arm: KVM: add a common vgic_queue_irq_to_lr fn
@ 2015-03-09 14:47     ` Christoffer Dall
  0 siblings, 0 replies; 35+ messages in thread
From: Christoffer Dall @ 2015-03-09 14:47 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvm, marc.zyngier, open list, Gleb Natapov, Paolo Bonzini,
	kvmarm, linux-arm-kernel

On Mon, Mar 02, 2015 at 01:29:02PM +0000, Alex Bennée wrote:
> This helps re-factor away some of the repetitive code and makes the code
> flow more nicely.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> 
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2 3/5] arm: KVM: add a common vgic_queue_irq_to_lr fn
@ 2015-03-09 14:47     ` Christoffer Dall
  0 siblings, 0 replies; 35+ messages in thread
From: Christoffer Dall @ 2015-03-09 14:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Mar 02, 2015 at 01:29:02PM +0000, Alex Benn?e wrote:
> This helps re-factor away some of the repetitive code and makes the code
> flow more nicely.
> 
> Signed-off-by: Alex Benn?e <alex.bennee@linaro.org>
> 
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 1/5] arm: KVM: export vcpi->pause state via MP_STATE ioctls
  2015-03-09 13:50     ` Christoffer Dall
  (?)
@ 2015-03-09 16:34       ` Alex Bennée
  -1 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-09 16:34 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, Gleb Natapov,
	Paolo Bonzini, Jonathan Corbet, Russell King,
	open list:DOCUMENTATION, open list


Christoffer Dall <christoffer.dall@linaro.org> writes:

> Hi Alex,
>
> The subject of this change has a typo, and I also think it's not about
> exposing the pause state (that's just an internal name/concept), but
> about exposing the PSCI state, or simply the VCPU power state.

arm: KVM: export VCPU power state via MP_STATE ioctl?

>
> On Mon, Mar 02, 2015 at 01:29:00PM +0000, Alex Bennée wrote:
>> To cleanly restore an SMP VM we need to ensure that the current pause
>> state of each vcpu is correctly recorded. Things could get confused if
>> the CPU starts running after migration restore completes when it was
>> paused before it state was captured.
>> 
>> We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64
>> interface is a lot simpler as the only valid states are
>> KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_HALTED.
>> 
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> 
>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
>> index b112efc..602156f 100644
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -997,7 +997,7 @@ for vm-wide capabilities.
>>  4.38 KVM_GET_MP_STATE
>>  
>>  Capability: KVM_CAP_MP_STATE
>> -Architectures: x86, s390
>> +Architectures: x86, s390, arm, arm64
>>  Type: vcpu ioctl
>>  Parameters: struct kvm_mp_state (out)
>>  Returns: 0 on success; -1 on error
>> @@ -1027,15 +1027,21 @@ Possible values are:
>>   - KVM_MP_STATE_LOAD:            the vcpu is in a special load/startup state
>>                                   [s390]
>>  
>> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
>> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on
>> +For x86:
>> +
>> +This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
>> +irqchip, the multiprocessing state must be maintained by userspace on
>
> Nit: I would not taint the git log with this change but instead just
> introduce a paragraph below starting with "On arm/arm64, " and you would
> get the same effect.
>
>>  these architectures.
>>  
>> +For arm/arm64:
>> +
>> +The only states that are valid are KVM_MP_STATE_HALTED and
>> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
>
> As suggested on the QEMU series, HALTED is probably not the right thing
> to use.

KVM_MP_STATE_STOPPED is currently only used for s390 but seems to fit.
I'm wary of adding yet another define.

>
>>  
>>  4.39 KVM_SET_MP_STATE
>>  
>>  Capability: KVM_CAP_MP_STATE
>> -Architectures: x86, s390
>> +Architectures: x86, s390, arm, arm64
>>  Type: vcpu ioctl
>>  Parameters: struct kvm_mp_state (in)
>>  Returns: 0 on success; -1 on error
>> @@ -1043,10 +1049,16 @@ Returns: 0 on success; -1 on error
>>  Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
>>  arguments.
>>  
>> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
>> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on
>> +For x86:
>> +
>> +This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
>> +irqchip, the multiprocessing state must be maintained by userspace on
>>  these architectures.
>>  
>> +For arm/arm64:
>> +
>> +The only states that are valid are KVM_MP_STATE_HALTED and
>> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.
>
> same as above
>
>>  
>>  4.40 KVM_SET_IDENTITY_MAP_ADDR
>>  
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index 5560f74..8531536 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -183,6 +183,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>  	case KVM_CAP_ARM_PSCI:
>>  	case KVM_CAP_ARM_PSCI_0_2:
>>  	case KVM_CAP_READONLY_MEM:
>> +	case KVM_CAP_MP_STATE:
>>  		r = 1;
>>  		break;
>>  	case KVM_CAP_COALESCED_MMIO:
>> @@ -313,13 +314,29 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
>>  int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
>>  				    struct kvm_mp_state *mp_state)
>>  {
>> -	return -EINVAL;
>> +	if (vcpu->arch.pause)
>> +		mp_state->mp_state = KVM_MP_STATE_HALTED;
>> +	else
>> +		mp_state->mp_state = KVM_MP_STATE_RUNNABLE;
>> +
>> +	return 0;
>>  }
>>  
>>  int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
>>  				    struct kvm_mp_state *mp_state)
>>  {
>> -	return -EINVAL;
>> +	switch (mp_state->mp_state) {
>> +	case KVM_MP_STATE_RUNNABLE:
>> +		vcpu->arch.pause = false;
>> +		break;
>> +	case KVM_MP_STATE_HALTED:
>> +		vcpu->arch.pause = true;
>> +		break;
>> +	default:
>> +		return -EINVAL;
>> +	}
>> +
>> +	return 0;
>>  }
>>  
>>  /**
>
> Are we capturing the vcpu features in some way or do we expect userspace
> to migrate these on its own?  The reason I'm asking, is if you create
> multiple VCPUs, where all the non-primary VCPUs are started in power off
> mode, and then you boot your guest which powers on all VCPUs, and then
> you restart your guest (with PSCI RESET), the system will not power on
> all the non-primary VCPUs but hold them in power-off.
>
> We need to make sure this behavior is preserved for a reboot across a
> migration.  Is it?

Isn't that behaviour orthogonal to the migration case?

- Boot
- Power on secondary CPUs
- Power off one secondary CPU
- Migrate to file (cpu_powered reflects state of each CPU)

- Start fresh QEMU
- Restore from file (cpu_powered -> vcpu->paused via ioctl)
- Run (we continue with the same power state pre-migrate)

- PSCI RESET
- Does what it does, power all secondaries down?
- Kernel boots, turns them on?



>
> Thanks,
> -Christoffer

-- 
Alex Bennée

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 1/5] arm: KVM: export vcpi->pause state via MP_STATE ioctls
@ 2015-03-09 16:34       ` Alex Bennée
  0 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-09 16:34 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Russell King, kvm, Jonathan Corbet, marc.zyngier,
	open list:DOCUMENTATION, open list, Gleb Natapov, Paolo Bonzini,
	kvmarm, linux-arm-kernel


Christoffer Dall <christoffer.dall@linaro.org> writes:

> Hi Alex,
>
> The subject of this change has a typo, and I also think it's not about
> exposing the pause state (that's just an internal name/concept), but
> about exposing the PSCI state, or simply the VCPU power state.

arm: KVM: export VCPU power state via MP_STATE ioctl?

>
> On Mon, Mar 02, 2015 at 01:29:00PM +0000, Alex Bennée wrote:
>> To cleanly restore an SMP VM we need to ensure that the current pause
>> state of each vcpu is correctly recorded. Things could get confused if
>> the CPU starts running after migration restore completes when it was
>> paused before it state was captured.
>> 
>> We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64
>> interface is a lot simpler as the only valid states are
>> KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_HALTED.
>> 
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> 
>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
>> index b112efc..602156f 100644
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -997,7 +997,7 @@ for vm-wide capabilities.
>>  4.38 KVM_GET_MP_STATE
>>  
>>  Capability: KVM_CAP_MP_STATE
>> -Architectures: x86, s390
>> +Architectures: x86, s390, arm, arm64
>>  Type: vcpu ioctl
>>  Parameters: struct kvm_mp_state (out)
>>  Returns: 0 on success; -1 on error
>> @@ -1027,15 +1027,21 @@ Possible values are:
>>   - KVM_MP_STATE_LOAD:            the vcpu is in a special load/startup state
>>                                   [s390]
>>  
>> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
>> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on
>> +For x86:
>> +
>> +This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
>> +irqchip, the multiprocessing state must be maintained by userspace on
>
> Nit: I would not taint the git log with this change but instead just
> introduce a paragraph below starting with "On arm/arm64, " and you would
> get the same effect.
>
>>  these architectures.
>>  
>> +For arm/arm64:
>> +
>> +The only states that are valid are KVM_MP_STATE_HALTED and
>> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
>
> As suggested on the QEMU series, HALTED is probably not the right thing
> to use.

KVM_MP_STATE_STOPPED is currently only used for s390 but seems to fit.
I'm wary of adding yet another define.

>
>>  
>>  4.39 KVM_SET_MP_STATE
>>  
>>  Capability: KVM_CAP_MP_STATE
>> -Architectures: x86, s390
>> +Architectures: x86, s390, arm, arm64
>>  Type: vcpu ioctl
>>  Parameters: struct kvm_mp_state (in)
>>  Returns: 0 on success; -1 on error
>> @@ -1043,10 +1049,16 @@ Returns: 0 on success; -1 on error
>>  Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
>>  arguments.
>>  
>> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
>> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on
>> +For x86:
>> +
>> +This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
>> +irqchip, the multiprocessing state must be maintained by userspace on
>>  these architectures.
>>  
>> +For arm/arm64:
>> +
>> +The only states that are valid are KVM_MP_STATE_HALTED and
>> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.
>
> same as above
>
>>  
>>  4.40 KVM_SET_IDENTITY_MAP_ADDR
>>  
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index 5560f74..8531536 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -183,6 +183,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>  	case KVM_CAP_ARM_PSCI:
>>  	case KVM_CAP_ARM_PSCI_0_2:
>>  	case KVM_CAP_READONLY_MEM:
>> +	case KVM_CAP_MP_STATE:
>>  		r = 1;
>>  		break;
>>  	case KVM_CAP_COALESCED_MMIO:
>> @@ -313,13 +314,29 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
>>  int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
>>  				    struct kvm_mp_state *mp_state)
>>  {
>> -	return -EINVAL;
>> +	if (vcpu->arch.pause)
>> +		mp_state->mp_state = KVM_MP_STATE_HALTED;
>> +	else
>> +		mp_state->mp_state = KVM_MP_STATE_RUNNABLE;
>> +
>> +	return 0;
>>  }
>>  
>>  int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
>>  				    struct kvm_mp_state *mp_state)
>>  {
>> -	return -EINVAL;
>> +	switch (mp_state->mp_state) {
>> +	case KVM_MP_STATE_RUNNABLE:
>> +		vcpu->arch.pause = false;
>> +		break;
>> +	case KVM_MP_STATE_HALTED:
>> +		vcpu->arch.pause = true;
>> +		break;
>> +	default:
>> +		return -EINVAL;
>> +	}
>> +
>> +	return 0;
>>  }
>>  
>>  /**
>
> Are we capturing the vcpu features in some way or do we expect userspace
> to migrate these on its own?  The reason I'm asking, is if you create
> multiple VCPUs, where all the non-primary VCPUs are started in power off
> mode, and then you boot your guest which powers on all VCPUs, and then
> you restart your guest (with PSCI RESET), the system will not power on
> all the non-primary VCPUs but hold them in power-off.
>
> We need to make sure this behavior is preserved for a reboot across a
> migration.  Is it?

Isn't that behaviour orthogonal to the migration case?

- Boot
- Power on secondary CPUs
- Power off one secondary CPU
- Migrate to file (cpu_powered reflects state of each CPU)

- Start fresh QEMU
- Restore from file (cpu_powered -> vcpu->paused via ioctl)
- Run (we continue with the same power state pre-migrate)

- PSCI RESET
- Does what it does, power all secondaries down?
- Kernel boots, turns them on?



>
> Thanks,
> -Christoffer

-- 
Alex Bennée
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2 1/5] arm: KVM: export vcpi->pause state via MP_STATE ioctls
@ 2015-03-09 16:34       ` Alex Bennée
  0 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-09 16:34 UTC (permalink / raw)
  To: linux-arm-kernel


Christoffer Dall <christoffer.dall@linaro.org> writes:

> Hi Alex,
>
> The subject of this change has a typo, and I also think it's not about
> exposing the pause state (that's just an internal name/concept), but
> about exposing the PSCI state, or simply the VCPU power state.

arm: KVM: export VCPU power state via MP_STATE ioctl?

>
> On Mon, Mar 02, 2015 at 01:29:00PM +0000, Alex Benn?e wrote:
>> To cleanly restore an SMP VM we need to ensure that the current pause
>> state of each vcpu is correctly recorded. Things could get confused if
>> the CPU starts running after migration restore completes when it was
>> paused before it state was captured.
>> 
>> We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64
>> interface is a lot simpler as the only valid states are
>> KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_HALTED.
>> 
>> Signed-off-by: Alex Benn?e <alex.bennee@linaro.org>
>> 
>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
>> index b112efc..602156f 100644
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -997,7 +997,7 @@ for vm-wide capabilities.
>>  4.38 KVM_GET_MP_STATE
>>  
>>  Capability: KVM_CAP_MP_STATE
>> -Architectures: x86, s390
>> +Architectures: x86, s390, arm, arm64
>>  Type: vcpu ioctl
>>  Parameters: struct kvm_mp_state (out)
>>  Returns: 0 on success; -1 on error
>> @@ -1027,15 +1027,21 @@ Possible values are:
>>   - KVM_MP_STATE_LOAD:            the vcpu is in a special load/startup state
>>                                   [s390]
>>  
>> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
>> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on
>> +For x86:
>> +
>> +This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
>> +irqchip, the multiprocessing state must be maintained by userspace on
>
> Nit: I would not taint the git log with this change but instead just
> introduce a paragraph below starting with "On arm/arm64, " and you would
> get the same effect.
>
>>  these architectures.
>>  
>> +For arm/arm64:
>> +
>> +The only states that are valid are KVM_MP_STATE_HALTED and
>> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
>
> As suggested on the QEMU series, HALTED is probably not the right thing
> to use.

KVM_MP_STATE_STOPPED is currently only used for s390 but seems to fit.
I'm wary of adding yet another define.

>
>>  
>>  4.39 KVM_SET_MP_STATE
>>  
>>  Capability: KVM_CAP_MP_STATE
>> -Architectures: x86, s390
>> +Architectures: x86, s390, arm, arm64
>>  Type: vcpu ioctl
>>  Parameters: struct kvm_mp_state (in)
>>  Returns: 0 on success; -1 on error
>> @@ -1043,10 +1049,16 @@ Returns: 0 on success; -1 on error
>>  Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
>>  arguments.
>>  
>> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
>> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on
>> +For x86:
>> +
>> +This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
>> +irqchip, the multiprocessing state must be maintained by userspace on
>>  these architectures.
>>  
>> +For arm/arm64:
>> +
>> +The only states that are valid are KVM_MP_STATE_HALTED and
>> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.
>
> same as above
>
>>  
>>  4.40 KVM_SET_IDENTITY_MAP_ADDR
>>  
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index 5560f74..8531536 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -183,6 +183,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>  	case KVM_CAP_ARM_PSCI:
>>  	case KVM_CAP_ARM_PSCI_0_2:
>>  	case KVM_CAP_READONLY_MEM:
>> +	case KVM_CAP_MP_STATE:
>>  		r = 1;
>>  		break;
>>  	case KVM_CAP_COALESCED_MMIO:
>> @@ -313,13 +314,29 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
>>  int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
>>  				    struct kvm_mp_state *mp_state)
>>  {
>> -	return -EINVAL;
>> +	if (vcpu->arch.pause)
>> +		mp_state->mp_state = KVM_MP_STATE_HALTED;
>> +	else
>> +		mp_state->mp_state = KVM_MP_STATE_RUNNABLE;
>> +
>> +	return 0;
>>  }
>>  
>>  int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
>>  				    struct kvm_mp_state *mp_state)
>>  {
>> -	return -EINVAL;
>> +	switch (mp_state->mp_state) {
>> +	case KVM_MP_STATE_RUNNABLE:
>> +		vcpu->arch.pause = false;
>> +		break;
>> +	case KVM_MP_STATE_HALTED:
>> +		vcpu->arch.pause = true;
>> +		break;
>> +	default:
>> +		return -EINVAL;
>> +	}
>> +
>> +	return 0;
>>  }
>>  
>>  /**
>
> Are we capturing the vcpu features in some way or do we expect userspace
> to migrate these on its own?  The reason I'm asking, is if you create
> multiple VCPUs, where all the non-primary VCPUs are started in power off
> mode, and then you boot your guest which powers on all VCPUs, and then
> you restart your guest (with PSCI RESET), the system will not power on
> all the non-primary VCPUs but hold them in power-off.
>
> We need to make sure this behavior is preserved for a reboot across a
> migration.  Is it?

Isn't that behaviour orthogonal to the migration case?

- Boot
- Power on secondary CPUs
- Power off one secondary CPU
- Migrate to file (cpu_powered reflects state of each CPU)

- Start fresh QEMU
- Restore from file (cpu_powered -> vcpu->paused via ioctl)
- Run (we continue with the same power state pre-migrate)

- PSCI RESET
- Does what it does, power all secondaries down?
- Kernel boots, turns them on?



>
> Thanks,
> -Christoffer

-- 
Alex Benn?e

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 2/5] arm/arm64: KVM: Implement support for unqueueing active IRQs
  2015-03-09 14:41     ` Christoffer Dall
@ 2015-03-09 16:40       ` Alex Bennée
  -1 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-09 16:40 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, Gleb Natapov,
	Paolo Bonzini, open list


Christoffer Dall <christoffer.dall@linaro.org> writes:

> On Mon, Mar 02, 2015 at 01:29:01PM +0000, Alex Bennée wrote:
>> From: Christoffer Dall <christoffer.dall@linaro.org>
>> 
>> Migrating active interrupts causes the active state to be lost
>> completely. This implements some additional bitmaps to track the active
>> state on the distributor and export this to user space.
>> 
>> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> 
>> ---
>> AJB:
>>    - fixed merge conflicts
>>    - moved additional shared bitmaps to be dynamically allocated
>>    - make irq_active_on_cpu dynamically allocated as well
>>    - in vgic_queue_irq don't queue pending if already active
>>    - in __kvm_vgic_flush_hwstate use pr_shared when checking SPIs
>>    - vgic: clear active on CPU bit
>>    - checkpatch, remove extraneous braces
>>    - checkpatch, remove debug, fix overflows
>>    - move register access fns to re-factored vgic-v2-emul.c
>> v2
>>    - doc: unqueue and update_state
>> 
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index 7c55dd5..7042251 100644
<snip>
> if you respin:                         the active state
> if you respin:     Mark the

OK

<snip>
>>  
>> @@ -1030,39 +1113,49 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
>>  {
>>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>> +	unsigned long *pa_percpu, *pa_shared;
>>  	int i, vcpu_id;
>>  	int overflow = 0;
>> +	int nr_shared = vgic_nr_shared_irqs(dist);
>>  
>>  	vcpu_id = vcpu->vcpu_id;
>>  
>> +	pa_percpu = vcpu->arch.vgic_cpu.pend_act_percpu;
>> +	pa_shared = vcpu->arch.vgic_cpu.pend_act_shared;
>> +
>> +	bitmap_or(pa_percpu, vgic_cpu->pending_percpu, vgic_cpu->active_percpu,
>> +		  VGIC_NR_PRIVATE_IRQS);
>> +	bitmap_or(pa_shared, vgic_cpu->pending_shared, vgic_cpu->active_shared,
>> +		  nr_shared);
>>  	/*
>>  	 * We may not have any pending interrupt, or the interrupts
>>  	 * may have been serviced from another vcpu. In all cases,
>>  	 * move along.
>>  	 */
>> -	if (!kvm_vgic_vcpu_pending_irq(vcpu)) {
>> -		pr_debug("CPU%d has no pending interrupt\n", vcpu_id);
>> +	if (!kvm_vgic_vcpu_pending_irq(vcpu) && !kvm_vgic_vcpu_active_irq(vcpu))
>>  		goto epilog;
>> -	}
>>  
>>  	/* SGIs */
>> -	for_each_set_bit(i, vgic_cpu->pending_percpu, VGIC_NR_SGIS) {
>> +	for_each_set_bit(i, pa_percpu, VGIC_NR_SGIS) {
>>  		if (!queue_sgi(vcpu, i))
>>  			overflow = 1;
>>  	}
>>  
>>  	/* PPIs */
>> -	for_each_set_bit_from(i, vgic_cpu->pending_percpu, VGIC_NR_PRIVATE_IRQS) {
>> +	for_each_set_bit_from(i, pa_percpu, VGIC_NR_PRIVATE_IRQS) {
>>  		if (!vgic_queue_hwirq(vcpu, i))
>>  			overflow = 1;
>>  	}
>>  
>>  	/* SPIs */
>> -	for_each_set_bit(i, vgic_cpu->pending_shared, vgic_nr_shared_irqs(dist)) {
>> +	for_each_set_bit(i, pa_shared, nr_shared) {
>>  		if (!vgic_queue_hwirq(vcpu, i + VGIC_NR_PRIVATE_IRQS))
>>  			overflow = 1;
>>  	}
>
> meh, these changes will now produce a quite strange result if one
> bisects at this specific patch, since we will be setting pending
> interrupts to active....  I thought it was more clear as one patch
> containing the changes to vgic_queue_... imho.

I split the re-factor off by Marc's request but your right it becomes a
little weird. Now the horror of dealing with the merge conflicts is done
I could split the next patch and add it before this one:

* add a common vgic_queue_irq_to_lr fn (just deal with pending)
* Implement support for unqueuing active IRQs (expand vgic_queue_irq_to_lr)

>
> Thoughts?
>
>>  
>> +
>> +
>> +
>>  epilog:
>>  	if (overflow) {
>>  		vgic_enable_underflow(vcpu);
>> @@ -1209,6 +1302,17 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
>>  	return test_bit(vcpu->vcpu_id, dist->irq_pending_on_cpu);
>>  }
>>  
>> +int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu)
>> +{
>> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>> +
>> +	if (!irqchip_in_kernel(vcpu->kvm))
>> +		return 0;
>> +
>> +	return test_bit(vcpu->vcpu_id, dist->irq_active_on_cpu);
>> +}
>> +
>> +
>>  void vgic_kick_vcpus(struct kvm *kvm)
>>  {
>>  	struct kvm_vcpu *vcpu;
>> @@ -1381,8 +1485,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>  
>>  	kfree(vgic_cpu->pending_shared);
>> +	kfree(vgic_cpu->active_shared);
>> +	kfree(vgic_cpu->pend_act_shared);
>>  	kfree(vgic_cpu->vgic_irq_lr_map);
>>  	vgic_cpu->pending_shared = NULL;
>> +	vgic_cpu->active_shared = NULL;
>> +	vgic_cpu->pend_act_shared = NULL;
>>  	vgic_cpu->vgic_irq_lr_map = NULL;
>>  }
>>  
>> @@ -1392,9 +1500,14 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
>>  
>>  	int sz = (nr_irqs - VGIC_NR_PRIVATE_IRQS) / 8;
>>  	vgic_cpu->pending_shared = kzalloc(sz, GFP_KERNEL);
>> +	vgic_cpu->active_shared = kzalloc(sz, GFP_KERNEL);
>> +	vgic_cpu->pend_act_shared = kzalloc(sz, GFP_KERNEL);
>>  	vgic_cpu->vgic_irq_lr_map = kmalloc(nr_irqs, GFP_KERNEL);
>>  
>> -	if (!vgic_cpu->pending_shared || !vgic_cpu->vgic_irq_lr_map) {
>> +	if (!vgic_cpu->pending_shared
>> +		|| !vgic_cpu->active_shared
>> +		|| !vgic_cpu->pend_act_shared
>> +		|| !vgic_cpu->vgic_irq_lr_map) {
>>  		kvm_vgic_vcpu_destroy(vcpu);
>>  		return -ENOMEM;
>>  	}
>> @@ -1447,10 +1560,12 @@ void kvm_vgic_destroy(struct kvm *kvm)
>>  	kfree(dist->irq_spi_mpidr);
>>  	kfree(dist->irq_spi_target);
>>  	kfree(dist->irq_pending_on_cpu);
>> +	kfree(dist->irq_active_on_cpu);
>>  	dist->irq_sgi_sources = NULL;
>>  	dist->irq_spi_cpu = NULL;
>>  	dist->irq_spi_target = NULL;
>>  	dist->irq_pending_on_cpu = NULL;
>> +	dist->irq_active_on_cpu = NULL;
>>  	dist->nr_cpus = 0;
>>  }
>>  
>> @@ -1486,6 +1601,7 @@ int vgic_init(struct kvm *kvm)
>>  	ret |= vgic_init_bitmap(&dist->irq_pending, nr_cpus, nr_irqs);
>>  	ret |= vgic_init_bitmap(&dist->irq_soft_pend, nr_cpus, nr_irqs);
>>  	ret |= vgic_init_bitmap(&dist->irq_queued, nr_cpus, nr_irqs);
>> +	ret |= vgic_init_bitmap(&dist->irq_active, nr_cpus, nr_irqs);
>>  	ret |= vgic_init_bitmap(&dist->irq_cfg, nr_cpus, nr_irqs);
>>  	ret |= vgic_init_bytemap(&dist->irq_priority, nr_cpus, nr_irqs);
>>  
>> @@ -1498,10 +1614,13 @@ int vgic_init(struct kvm *kvm)
>>  				       GFP_KERNEL);
>>  	dist->irq_pending_on_cpu = kzalloc(BITS_TO_LONGS(nr_cpus) * sizeof(long),
>>  					   GFP_KERNEL);
>> +	dist->irq_active_on_cpu = kzalloc(BITS_TO_LONGS(nr_cpus) * sizeof(long),
>> +					   GFP_KERNEL);
>>  	if (!dist->irq_sgi_sources ||
>>  	    !dist->irq_spi_cpu ||
>>  	    !dist->irq_spi_target ||
>> -	    !dist->irq_pending_on_cpu) {
>> +	    !dist->irq_pending_on_cpu ||
>> +	    !dist->irq_active_on_cpu) {
>>  		ret = -ENOMEM;
>>  		goto out;
>>  	}
>> diff --git a/virt/kvm/arm/vgic.h b/virt/kvm/arm/vgic.h
>> index 1e83bdf..1e5a381 100644
>> --- a/virt/kvm/arm/vgic.h
>> +++ b/virt/kvm/arm/vgic.h
>> @@ -107,6 +107,14 @@ bool vgic_handle_set_pending_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
>>  bool vgic_handle_clear_pending_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
>>  				   phys_addr_t offset, int vcpu_id);
>>  
>> +bool vgic_handle_set_active_reg(struct kvm *kvm,
>> +				struct kvm_exit_mmio *mmio,
>> +				phys_addr_t offset, int vcpu_id);
>> +
>> +bool vgic_handle_clear_active_reg(struct kvm *kvm,
>> +				  struct kvm_exit_mmio *mmio,
>> +				  phys_addr_t offset, int vcpu_id);
>> +
>>  bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
>>  			 phys_addr_t offset);
>>  
>> -- 
>> 2.3.0
>> 

-- 
Alex Bennée

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2 2/5] arm/arm64: KVM: Implement support for unqueueing active IRQs
@ 2015-03-09 16:40       ` Alex Bennée
  0 siblings, 0 replies; 35+ messages in thread
From: Alex Bennée @ 2015-03-09 16:40 UTC (permalink / raw)
  To: linux-arm-kernel


Christoffer Dall <christoffer.dall@linaro.org> writes:

> On Mon, Mar 02, 2015 at 01:29:01PM +0000, Alex Benn?e wrote:
>> From: Christoffer Dall <christoffer.dall@linaro.org>
>> 
>> Migrating active interrupts causes the active state to be lost
>> completely. This implements some additional bitmaps to track the active
>> state on the distributor and export this to user space.
>> 
>> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
>> Signed-off-by: Alex Benn?e <alex.bennee@linaro.org>
>> 
>> ---
>> AJB:
>>    - fixed merge conflicts
>>    - moved additional shared bitmaps to be dynamically allocated
>>    - make irq_active_on_cpu dynamically allocated as well
>>    - in vgic_queue_irq don't queue pending if already active
>>    - in __kvm_vgic_flush_hwstate use pr_shared when checking SPIs
>>    - vgic: clear active on CPU bit
>>    - checkpatch, remove extraneous braces
>>    - checkpatch, remove debug, fix overflows
>>    - move register access fns to re-factored vgic-v2-emul.c
>> v2
>>    - doc: unqueue and update_state
>> 
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index 7c55dd5..7042251 100644
<snip>
> if you respin:                         the active state
> if you respin:     Mark the

OK

<snip>
>>  
>> @@ -1030,39 +1113,49 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
>>  {
>>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>> +	unsigned long *pa_percpu, *pa_shared;
>>  	int i, vcpu_id;
>>  	int overflow = 0;
>> +	int nr_shared = vgic_nr_shared_irqs(dist);
>>  
>>  	vcpu_id = vcpu->vcpu_id;
>>  
>> +	pa_percpu = vcpu->arch.vgic_cpu.pend_act_percpu;
>> +	pa_shared = vcpu->arch.vgic_cpu.pend_act_shared;
>> +
>> +	bitmap_or(pa_percpu, vgic_cpu->pending_percpu, vgic_cpu->active_percpu,
>> +		  VGIC_NR_PRIVATE_IRQS);
>> +	bitmap_or(pa_shared, vgic_cpu->pending_shared, vgic_cpu->active_shared,
>> +		  nr_shared);
>>  	/*
>>  	 * We may not have any pending interrupt, or the interrupts
>>  	 * may have been serviced from another vcpu. In all cases,
>>  	 * move along.
>>  	 */
>> -	if (!kvm_vgic_vcpu_pending_irq(vcpu)) {
>> -		pr_debug("CPU%d has no pending interrupt\n", vcpu_id);
>> +	if (!kvm_vgic_vcpu_pending_irq(vcpu) && !kvm_vgic_vcpu_active_irq(vcpu))
>>  		goto epilog;
>> -	}
>>  
>>  	/* SGIs */
>> -	for_each_set_bit(i, vgic_cpu->pending_percpu, VGIC_NR_SGIS) {
>> +	for_each_set_bit(i, pa_percpu, VGIC_NR_SGIS) {
>>  		if (!queue_sgi(vcpu, i))
>>  			overflow = 1;
>>  	}
>>  
>>  	/* PPIs */
>> -	for_each_set_bit_from(i, vgic_cpu->pending_percpu, VGIC_NR_PRIVATE_IRQS) {
>> +	for_each_set_bit_from(i, pa_percpu, VGIC_NR_PRIVATE_IRQS) {
>>  		if (!vgic_queue_hwirq(vcpu, i))
>>  			overflow = 1;
>>  	}
>>  
>>  	/* SPIs */
>> -	for_each_set_bit(i, vgic_cpu->pending_shared, vgic_nr_shared_irqs(dist)) {
>> +	for_each_set_bit(i, pa_shared, nr_shared) {
>>  		if (!vgic_queue_hwirq(vcpu, i + VGIC_NR_PRIVATE_IRQS))
>>  			overflow = 1;
>>  	}
>
> meh, these changes will now produce a quite strange result if one
> bisects at this specific patch, since we will be setting pending
> interrupts to active....  I thought it was more clear as one patch
> containing the changes to vgic_queue_... imho.

I split the re-factor off by Marc's request but your right it becomes a
little weird. Now the horror of dealing with the merge conflicts is done
I could split the next patch and add it before this one:

* add a common vgic_queue_irq_to_lr fn (just deal with pending)
* Implement support for unqueuing active IRQs (expand vgic_queue_irq_to_lr)

>
> Thoughts?
>
>>  
>> +
>> +
>> +
>>  epilog:
>>  	if (overflow) {
>>  		vgic_enable_underflow(vcpu);
>> @@ -1209,6 +1302,17 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
>>  	return test_bit(vcpu->vcpu_id, dist->irq_pending_on_cpu);
>>  }
>>  
>> +int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu)
>> +{
>> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>> +
>> +	if (!irqchip_in_kernel(vcpu->kvm))
>> +		return 0;
>> +
>> +	return test_bit(vcpu->vcpu_id, dist->irq_active_on_cpu);
>> +}
>> +
>> +
>>  void vgic_kick_vcpus(struct kvm *kvm)
>>  {
>>  	struct kvm_vcpu *vcpu;
>> @@ -1381,8 +1485,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>  
>>  	kfree(vgic_cpu->pending_shared);
>> +	kfree(vgic_cpu->active_shared);
>> +	kfree(vgic_cpu->pend_act_shared);
>>  	kfree(vgic_cpu->vgic_irq_lr_map);
>>  	vgic_cpu->pending_shared = NULL;
>> +	vgic_cpu->active_shared = NULL;
>> +	vgic_cpu->pend_act_shared = NULL;
>>  	vgic_cpu->vgic_irq_lr_map = NULL;
>>  }
>>  
>> @@ -1392,9 +1500,14 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
>>  
>>  	int sz = (nr_irqs - VGIC_NR_PRIVATE_IRQS) / 8;
>>  	vgic_cpu->pending_shared = kzalloc(sz, GFP_KERNEL);
>> +	vgic_cpu->active_shared = kzalloc(sz, GFP_KERNEL);
>> +	vgic_cpu->pend_act_shared = kzalloc(sz, GFP_KERNEL);
>>  	vgic_cpu->vgic_irq_lr_map = kmalloc(nr_irqs, GFP_KERNEL);
>>  
>> -	if (!vgic_cpu->pending_shared || !vgic_cpu->vgic_irq_lr_map) {
>> +	if (!vgic_cpu->pending_shared
>> +		|| !vgic_cpu->active_shared
>> +		|| !vgic_cpu->pend_act_shared
>> +		|| !vgic_cpu->vgic_irq_lr_map) {
>>  		kvm_vgic_vcpu_destroy(vcpu);
>>  		return -ENOMEM;
>>  	}
>> @@ -1447,10 +1560,12 @@ void kvm_vgic_destroy(struct kvm *kvm)
>>  	kfree(dist->irq_spi_mpidr);
>>  	kfree(dist->irq_spi_target);
>>  	kfree(dist->irq_pending_on_cpu);
>> +	kfree(dist->irq_active_on_cpu);
>>  	dist->irq_sgi_sources = NULL;
>>  	dist->irq_spi_cpu = NULL;
>>  	dist->irq_spi_target = NULL;
>>  	dist->irq_pending_on_cpu = NULL;
>> +	dist->irq_active_on_cpu = NULL;
>>  	dist->nr_cpus = 0;
>>  }
>>  
>> @@ -1486,6 +1601,7 @@ int vgic_init(struct kvm *kvm)
>>  	ret |= vgic_init_bitmap(&dist->irq_pending, nr_cpus, nr_irqs);
>>  	ret |= vgic_init_bitmap(&dist->irq_soft_pend, nr_cpus, nr_irqs);
>>  	ret |= vgic_init_bitmap(&dist->irq_queued, nr_cpus, nr_irqs);
>> +	ret |= vgic_init_bitmap(&dist->irq_active, nr_cpus, nr_irqs);
>>  	ret |= vgic_init_bitmap(&dist->irq_cfg, nr_cpus, nr_irqs);
>>  	ret |= vgic_init_bytemap(&dist->irq_priority, nr_cpus, nr_irqs);
>>  
>> @@ -1498,10 +1614,13 @@ int vgic_init(struct kvm *kvm)
>>  				       GFP_KERNEL);
>>  	dist->irq_pending_on_cpu = kzalloc(BITS_TO_LONGS(nr_cpus) * sizeof(long),
>>  					   GFP_KERNEL);
>> +	dist->irq_active_on_cpu = kzalloc(BITS_TO_LONGS(nr_cpus) * sizeof(long),
>> +					   GFP_KERNEL);
>>  	if (!dist->irq_sgi_sources ||
>>  	    !dist->irq_spi_cpu ||
>>  	    !dist->irq_spi_target ||
>> -	    !dist->irq_pending_on_cpu) {
>> +	    !dist->irq_pending_on_cpu ||
>> +	    !dist->irq_active_on_cpu) {
>>  		ret = -ENOMEM;
>>  		goto out;
>>  	}
>> diff --git a/virt/kvm/arm/vgic.h b/virt/kvm/arm/vgic.h
>> index 1e83bdf..1e5a381 100644
>> --- a/virt/kvm/arm/vgic.h
>> +++ b/virt/kvm/arm/vgic.h
>> @@ -107,6 +107,14 @@ bool vgic_handle_set_pending_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
>>  bool vgic_handle_clear_pending_reg(struct kvm *kvm, struct kvm_exit_mmio *mmio,
>>  				   phys_addr_t offset, int vcpu_id);
>>  
>> +bool vgic_handle_set_active_reg(struct kvm *kvm,
>> +				struct kvm_exit_mmio *mmio,
>> +				phys_addr_t offset, int vcpu_id);
>> +
>> +bool vgic_handle_clear_active_reg(struct kvm *kvm,
>> +				  struct kvm_exit_mmio *mmio,
>> +				  phys_addr_t offset, int vcpu_id);
>> +
>>  bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
>>  			 phys_addr_t offset);
>>  
>> -- 
>> 2.3.0
>> 

-- 
Alex Benn?e

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 1/5] arm: KVM: export vcpi->pause state via MP_STATE ioctls
  2015-03-09 16:34       ` Alex Bennée
  (?)
@ 2015-03-09 19:29         ` Christoffer Dall
  -1 siblings, 0 replies; 35+ messages in thread
From: Christoffer Dall @ 2015-03-09 19:29 UTC (permalink / raw)
  To: Alex Bennée
  Cc: kvm, linux-arm-kernel, kvmarm, marc.zyngier, Gleb Natapov,
	Paolo Bonzini, Jonathan Corbet, Russell King,
	open list:DOCUMENTATION, open list

On Mon, Mar 09, 2015 at 04:34:21PM +0000, Alex Bennée wrote:
> 
> Christoffer Dall <christoffer.dall@linaro.org> writes:
> 
> > Hi Alex,
> >
> > The subject of this change has a typo, and I also think it's not about
> > exposing the pause state (that's just an internal name/concept), but
> > about exposing the PSCI state, or simply the VCPU power state.
> 
> arm: KVM: export VCPU power state via MP_STATE ioctl?
> 

arm/arm64: KVM: 

otherwise looks good to me ;)


> >
> > On Mon, Mar 02, 2015 at 01:29:00PM +0000, Alex Bennée wrote:
> >> To cleanly restore an SMP VM we need to ensure that the current pause
> >> state of each vcpu is correctly recorded. Things could get confused if
> >> the CPU starts running after migration restore completes when it was
> >> paused before it state was captured.
> >> 
> >> We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64
> >> interface is a lot simpler as the only valid states are
> >> KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_HALTED.
> >> 
> >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> >> 
> >> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> >> index b112efc..602156f 100644
> >> --- a/Documentation/virtual/kvm/api.txt
> >> +++ b/Documentation/virtual/kvm/api.txt
> >> @@ -997,7 +997,7 @@ for vm-wide capabilities.
> >>  4.38 KVM_GET_MP_STATE
> >>  
> >>  Capability: KVM_CAP_MP_STATE
> >> -Architectures: x86, s390
> >> +Architectures: x86, s390, arm, arm64
> >>  Type: vcpu ioctl
> >>  Parameters: struct kvm_mp_state (out)
> >>  Returns: 0 on success; -1 on error
> >> @@ -1027,15 +1027,21 @@ Possible values are:
> >>   - KVM_MP_STATE_LOAD:            the vcpu is in a special load/startup state
> >>                                   [s390]
> >>  
> >> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
> >> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on
> >> +For x86:
> >> +
> >> +This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
> >> +irqchip, the multiprocessing state must be maintained by userspace on
> >
> > Nit: I would not taint the git log with this change but instead just
> > introduce a paragraph below starting with "On arm/arm64, " and you would
> > get the same effect.
> >
> >>  these architectures.
> >>  
> >> +For arm/arm64:
> >> +
> >> +The only states that are valid are KVM_MP_STATE_HALTED and
> >> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
> >
> > As suggested on the QEMU series, HALTED is probably not the right thing
> > to use.
> 
> KVM_MP_STATE_STOPPED is currently only used for s390 but seems to fit.
> I'm wary of adding yet another define.
> 

sounds fine, as long as it doesn't have some inherently different
meaning with s390.

> >
> >>  
> >>  4.39 KVM_SET_MP_STATE
> >>  
> >>  Capability: KVM_CAP_MP_STATE
> >> -Architectures: x86, s390
> >> +Architectures: x86, s390, arm, arm64
> >>  Type: vcpu ioctl
> >>  Parameters: struct kvm_mp_state (in)
> >>  Returns: 0 on success; -1 on error
> >> @@ -1043,10 +1049,16 @@ Returns: 0 on success; -1 on error
> >>  Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
> >>  arguments.
> >>  
> >> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
> >> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on
> >> +For x86:
> >> +
> >> +This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
> >> +irqchip, the multiprocessing state must be maintained by userspace on
> >>  these architectures.
> >>  
> >> +For arm/arm64:
> >> +
> >> +The only states that are valid are KVM_MP_STATE_HALTED and
> >> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.
> >
> > same as above
> >
> >>  
> >>  4.40 KVM_SET_IDENTITY_MAP_ADDR
> >>  
> >> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> >> index 5560f74..8531536 100644
> >> --- a/arch/arm/kvm/arm.c
> >> +++ b/arch/arm/kvm/arm.c
> >> @@ -183,6 +183,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> >>  	case KVM_CAP_ARM_PSCI:
> >>  	case KVM_CAP_ARM_PSCI_0_2:
> >>  	case KVM_CAP_READONLY_MEM:
> >> +	case KVM_CAP_MP_STATE:
> >>  		r = 1;
> >>  		break;
> >>  	case KVM_CAP_COALESCED_MMIO:
> >> @@ -313,13 +314,29 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
> >>  int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
> >>  				    struct kvm_mp_state *mp_state)
> >>  {
> >> -	return -EINVAL;
> >> +	if (vcpu->arch.pause)
> >> +		mp_state->mp_state = KVM_MP_STATE_HALTED;
> >> +	else
> >> +		mp_state->mp_state = KVM_MP_STATE_RUNNABLE;
> >> +
> >> +	return 0;
> >>  }
> >>  
> >>  int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
> >>  				    struct kvm_mp_state *mp_state)
> >>  {
> >> -	return -EINVAL;
> >> +	switch (mp_state->mp_state) {
> >> +	case KVM_MP_STATE_RUNNABLE:
> >> +		vcpu->arch.pause = false;
> >> +		break;
> >> +	case KVM_MP_STATE_HALTED:
> >> +		vcpu->arch.pause = true;
> >> +		break;
> >> +	default:
> >> +		return -EINVAL;
> >> +	}
> >> +
> >> +	return 0;
> >>  }
> >>  
> >>  /**
> >
> > Are we capturing the vcpu features in some way or do we expect userspace
> > to migrate these on its own?  The reason I'm asking, is if you create
> > multiple VCPUs, where all the non-primary VCPUs are started in power off
> > mode, and then you boot your guest which powers on all VCPUs, and then
> > you restart your guest (with PSCI RESET), the system will not power on
> > all the non-primary VCPUs but hold them in power-off.
> >
> > We need to make sure this behavior is preserved for a reboot across a
> > migration.  Is it?
> 
> Isn't that behaviour orthogonal to the migration case?
> 
> - Boot
> - Power on secondary CPUs
> - Power off one secondary CPU
> - Migrate to file (cpu_powered reflects state of each CPU)
> 
> - Start fresh QEMU
> - Restore from file (cpu_powered -> vcpu->paused via ioctl)
> - Run (we continue with the same power state pre-migrate)
> 
> - PSCI RESET
> - Does what it does, power all secondaries down?
> - Kernel boots, turns them on?
> 
Hmmm. As long as QEMU always inits all VCPUs in the exact same way
(including the KVM_ARM_VCPU_POWER_OFF feature bit) I guess it works and
that's probably a reasonable requirement for migration.

-Christoffer

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 1/5] arm: KVM: export vcpi->pause state via MP_STATE ioctls
@ 2015-03-09 19:29         ` Christoffer Dall
  0 siblings, 0 replies; 35+ messages in thread
From: Christoffer Dall @ 2015-03-09 19:29 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Russell King, kvm, Jonathan Corbet, marc.zyngier,
	open list:DOCUMENTATION, open list, Gleb Natapov, Paolo Bonzini,
	kvmarm, linux-arm-kernel

On Mon, Mar 09, 2015 at 04:34:21PM +0000, Alex Bennée wrote:
> 
> Christoffer Dall <christoffer.dall@linaro.org> writes:
> 
> > Hi Alex,
> >
> > The subject of this change has a typo, and I also think it's not about
> > exposing the pause state (that's just an internal name/concept), but
> > about exposing the PSCI state, or simply the VCPU power state.
> 
> arm: KVM: export VCPU power state via MP_STATE ioctl?
> 

arm/arm64: KVM: 

otherwise looks good to me ;)


> >
> > On Mon, Mar 02, 2015 at 01:29:00PM +0000, Alex Bennée wrote:
> >> To cleanly restore an SMP VM we need to ensure that the current pause
> >> state of each vcpu is correctly recorded. Things could get confused if
> >> the CPU starts running after migration restore completes when it was
> >> paused before it state was captured.
> >> 
> >> We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64
> >> interface is a lot simpler as the only valid states are
> >> KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_HALTED.
> >> 
> >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> >> 
> >> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> >> index b112efc..602156f 100644
> >> --- a/Documentation/virtual/kvm/api.txt
> >> +++ b/Documentation/virtual/kvm/api.txt
> >> @@ -997,7 +997,7 @@ for vm-wide capabilities.
> >>  4.38 KVM_GET_MP_STATE
> >>  
> >>  Capability: KVM_CAP_MP_STATE
> >> -Architectures: x86, s390
> >> +Architectures: x86, s390, arm, arm64
> >>  Type: vcpu ioctl
> >>  Parameters: struct kvm_mp_state (out)
> >>  Returns: 0 on success; -1 on error
> >> @@ -1027,15 +1027,21 @@ Possible values are:
> >>   - KVM_MP_STATE_LOAD:            the vcpu is in a special load/startup state
> >>                                   [s390]
> >>  
> >> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
> >> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on
> >> +For x86:
> >> +
> >> +This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
> >> +irqchip, the multiprocessing state must be maintained by userspace on
> >
> > Nit: I would not taint the git log with this change but instead just
> > introduce a paragraph below starting with "On arm/arm64, " and you would
> > get the same effect.
> >
> >>  these architectures.
> >>  
> >> +For arm/arm64:
> >> +
> >> +The only states that are valid are KVM_MP_STATE_HALTED and
> >> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
> >
> > As suggested on the QEMU series, HALTED is probably not the right thing
> > to use.
> 
> KVM_MP_STATE_STOPPED is currently only used for s390 but seems to fit.
> I'm wary of adding yet another define.
> 

sounds fine, as long as it doesn't have some inherently different
meaning with s390.

> >
> >>  
> >>  4.39 KVM_SET_MP_STATE
> >>  
> >>  Capability: KVM_CAP_MP_STATE
> >> -Architectures: x86, s390
> >> +Architectures: x86, s390, arm, arm64
> >>  Type: vcpu ioctl
> >>  Parameters: struct kvm_mp_state (in)
> >>  Returns: 0 on success; -1 on error
> >> @@ -1043,10 +1049,16 @@ Returns: 0 on success; -1 on error
> >>  Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
> >>  arguments.
> >>  
> >> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
> >> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on
> >> +For x86:
> >> +
> >> +This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
> >> +irqchip, the multiprocessing state must be maintained by userspace on
> >>  these architectures.
> >>  
> >> +For arm/arm64:
> >> +
> >> +The only states that are valid are KVM_MP_STATE_HALTED and
> >> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.
> >
> > same as above
> >
> >>  
> >>  4.40 KVM_SET_IDENTITY_MAP_ADDR
> >>  
> >> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> >> index 5560f74..8531536 100644
> >> --- a/arch/arm/kvm/arm.c
> >> +++ b/arch/arm/kvm/arm.c
> >> @@ -183,6 +183,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> >>  	case KVM_CAP_ARM_PSCI:
> >>  	case KVM_CAP_ARM_PSCI_0_2:
> >>  	case KVM_CAP_READONLY_MEM:
> >> +	case KVM_CAP_MP_STATE:
> >>  		r = 1;
> >>  		break;
> >>  	case KVM_CAP_COALESCED_MMIO:
> >> @@ -313,13 +314,29 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
> >>  int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
> >>  				    struct kvm_mp_state *mp_state)
> >>  {
> >> -	return -EINVAL;
> >> +	if (vcpu->arch.pause)
> >> +		mp_state->mp_state = KVM_MP_STATE_HALTED;
> >> +	else
> >> +		mp_state->mp_state = KVM_MP_STATE_RUNNABLE;
> >> +
> >> +	return 0;
> >>  }
> >>  
> >>  int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
> >>  				    struct kvm_mp_state *mp_state)
> >>  {
> >> -	return -EINVAL;
> >> +	switch (mp_state->mp_state) {
> >> +	case KVM_MP_STATE_RUNNABLE:
> >> +		vcpu->arch.pause = false;
> >> +		break;
> >> +	case KVM_MP_STATE_HALTED:
> >> +		vcpu->arch.pause = true;
> >> +		break;
> >> +	default:
> >> +		return -EINVAL;
> >> +	}
> >> +
> >> +	return 0;
> >>  }
> >>  
> >>  /**
> >
> > Are we capturing the vcpu features in some way or do we expect userspace
> > to migrate these on its own?  The reason I'm asking, is if you create
> > multiple VCPUs, where all the non-primary VCPUs are started in power off
> > mode, and then you boot your guest which powers on all VCPUs, and then
> > you restart your guest (with PSCI RESET), the system will not power on
> > all the non-primary VCPUs but hold them in power-off.
> >
> > We need to make sure this behavior is preserved for a reboot across a
> > migration.  Is it?
> 
> Isn't that behaviour orthogonal to the migration case?
> 
> - Boot
> - Power on secondary CPUs
> - Power off one secondary CPU
> - Migrate to file (cpu_powered reflects state of each CPU)
> 
> - Start fresh QEMU
> - Restore from file (cpu_powered -> vcpu->paused via ioctl)
> - Run (we continue with the same power state pre-migrate)
> 
> - PSCI RESET
> - Does what it does, power all secondaries down?
> - Kernel boots, turns them on?
> 
Hmmm. As long as QEMU always inits all VCPUs in the exact same way
(including the KVM_ARM_VCPU_POWER_OFF feature bit) I guess it works and
that's probably a reasonable requirement for migration.

-Christoffer

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2 1/5] arm: KVM: export vcpi->pause state via MP_STATE ioctls
@ 2015-03-09 19:29         ` Christoffer Dall
  0 siblings, 0 replies; 35+ messages in thread
From: Christoffer Dall @ 2015-03-09 19:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Mar 09, 2015 at 04:34:21PM +0000, Alex Benn?e wrote:
> 
> Christoffer Dall <christoffer.dall@linaro.org> writes:
> 
> > Hi Alex,
> >
> > The subject of this change has a typo, and I also think it's not about
> > exposing the pause state (that's just an internal name/concept), but
> > about exposing the PSCI state, or simply the VCPU power state.
> 
> arm: KVM: export VCPU power state via MP_STATE ioctl?
> 

arm/arm64: KVM: 

otherwise looks good to me ;)


> >
> > On Mon, Mar 02, 2015 at 01:29:00PM +0000, Alex Benn?e wrote:
> >> To cleanly restore an SMP VM we need to ensure that the current pause
> >> state of each vcpu is correctly recorded. Things could get confused if
> >> the CPU starts running after migration restore completes when it was
> >> paused before it state was captured.
> >> 
> >> We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64
> >> interface is a lot simpler as the only valid states are
> >> KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_HALTED.
> >> 
> >> Signed-off-by: Alex Benn?e <alex.bennee@linaro.org>
> >> 
> >> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> >> index b112efc..602156f 100644
> >> --- a/Documentation/virtual/kvm/api.txt
> >> +++ b/Documentation/virtual/kvm/api.txt
> >> @@ -997,7 +997,7 @@ for vm-wide capabilities.
> >>  4.38 KVM_GET_MP_STATE
> >>  
> >>  Capability: KVM_CAP_MP_STATE
> >> -Architectures: x86, s390
> >> +Architectures: x86, s390, arm, arm64
> >>  Type: vcpu ioctl
> >>  Parameters: struct kvm_mp_state (out)
> >>  Returns: 0 on success; -1 on error
> >> @@ -1027,15 +1027,21 @@ Possible values are:
> >>   - KVM_MP_STATE_LOAD:            the vcpu is in a special load/startup state
> >>                                   [s390]
> >>  
> >> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
> >> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on
> >> +For x86:
> >> +
> >> +This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
> >> +irqchip, the multiprocessing state must be maintained by userspace on
> >
> > Nit: I would not taint the git log with this change but instead just
> > introduce a paragraph below starting with "On arm/arm64, " and you would
> > get the same effect.
> >
> >>  these architectures.
> >>  
> >> +For arm/arm64:
> >> +
> >> +The only states that are valid are KVM_MP_STATE_HALTED and
> >> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
> >
> > As suggested on the QEMU series, HALTED is probably not the right thing
> > to use.
> 
> KVM_MP_STATE_STOPPED is currently only used for s390 but seems to fit.
> I'm wary of adding yet another define.
> 

sounds fine, as long as it doesn't have some inherently different
meaning with s390.

> >
> >>  
> >>  4.39 KVM_SET_MP_STATE
> >>  
> >>  Capability: KVM_CAP_MP_STATE
> >> -Architectures: x86, s390
> >> +Architectures: x86, s390, arm, arm64
> >>  Type: vcpu ioctl
> >>  Parameters: struct kvm_mp_state (in)
> >>  Returns: 0 on success; -1 on error
> >> @@ -1043,10 +1049,16 @@ Returns: 0 on success; -1 on error
> >>  Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
> >>  arguments.
> >>  
> >> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
> >> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on
> >> +For x86:
> >> +
> >> +This ioctl is only useful after KVM_CREATE_IRQCHIP.  Without an in-kernel
> >> +irqchip, the multiprocessing state must be maintained by userspace on
> >>  these architectures.
> >>  
> >> +For arm/arm64:
> >> +
> >> +The only states that are valid are KVM_MP_STATE_HALTED and
> >> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.
> >
> > same as above
> >
> >>  
> >>  4.40 KVM_SET_IDENTITY_MAP_ADDR
> >>  
> >> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> >> index 5560f74..8531536 100644
> >> --- a/arch/arm/kvm/arm.c
> >> +++ b/arch/arm/kvm/arm.c
> >> @@ -183,6 +183,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> >>  	case KVM_CAP_ARM_PSCI:
> >>  	case KVM_CAP_ARM_PSCI_0_2:
> >>  	case KVM_CAP_READONLY_MEM:
> >> +	case KVM_CAP_MP_STATE:
> >>  		r = 1;
> >>  		break;
> >>  	case KVM_CAP_COALESCED_MMIO:
> >> @@ -313,13 +314,29 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
> >>  int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
> >>  				    struct kvm_mp_state *mp_state)
> >>  {
> >> -	return -EINVAL;
> >> +	if (vcpu->arch.pause)
> >> +		mp_state->mp_state = KVM_MP_STATE_HALTED;
> >> +	else
> >> +		mp_state->mp_state = KVM_MP_STATE_RUNNABLE;
> >> +
> >> +	return 0;
> >>  }
> >>  
> >>  int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
> >>  				    struct kvm_mp_state *mp_state)
> >>  {
> >> -	return -EINVAL;
> >> +	switch (mp_state->mp_state) {
> >> +	case KVM_MP_STATE_RUNNABLE:
> >> +		vcpu->arch.pause = false;
> >> +		break;
> >> +	case KVM_MP_STATE_HALTED:
> >> +		vcpu->arch.pause = true;
> >> +		break;
> >> +	default:
> >> +		return -EINVAL;
> >> +	}
> >> +
> >> +	return 0;
> >>  }
> >>  
> >>  /**
> >
> > Are we capturing the vcpu features in some way or do we expect userspace
> > to migrate these on its own?  The reason I'm asking, is if you create
> > multiple VCPUs, where all the non-primary VCPUs are started in power off
> > mode, and then you boot your guest which powers on all VCPUs, and then
> > you restart your guest (with PSCI RESET), the system will not power on
> > all the non-primary VCPUs but hold them in power-off.
> >
> > We need to make sure this behavior is preserved for a reboot across a
> > migration.  Is it?
> 
> Isn't that behaviour orthogonal to the migration case?
> 
> - Boot
> - Power on secondary CPUs
> - Power off one secondary CPU
> - Migrate to file (cpu_powered reflects state of each CPU)
> 
> - Start fresh QEMU
> - Restore from file (cpu_powered -> vcpu->paused via ioctl)
> - Run (we continue with the same power state pre-migrate)
> 
> - PSCI RESET
> - Does what it does, power all secondaries down?
> - Kernel boots, turns them on?
> 
Hmmm. As long as QEMU always inits all VCPUs in the exact same way
(including the KVM_ARM_VCPU_POWER_OFF feature bit) I guess it works and
that's probably a reasonable requirement for migration.

-Christoffer

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 1/5] arm: KVM: export vcpi->pause state via MP_STATE ioctls
  2015-03-09 19:29         ` Christoffer Dall
@ 2015-03-09 19:55           ` Peter Maydell
  -1 siblings, 0 replies; 35+ messages in thread
From: Peter Maydell @ 2015-03-09 19:55 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Alex Bennée, Russell King, kvm-devel, Jonathan Corbet,
	Marc Zyngier, open list:DOCUMENTATION, open list, Gleb Natapov,
	Paolo Bonzini, kvmarm, arm-mail-list

On 10 March 2015 at 04:29, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> On Mon, Mar 09, 2015 at 04:34:21PM +0000, Alex Bennée wrote:
>> - Boot
>> - Power on secondary CPUs
>> - Power off one secondary CPU
>> - Migrate to file (cpu_powered reflects state of each CPU)
>>
>> - Start fresh QEMU
>> - Restore from file (cpu_powered -> vcpu->paused via ioctl)
>> - Run (we continue with the same power state pre-migrate)
>>
>> - PSCI RESET
>> - Does what it does, power all secondaries down?
>> - Kernel boots, turns them on?
>>
> Hmmm. As long as QEMU always inits all VCPUs in the exact same way
> (including the KVM_ARM_VCPU_POWER_OFF feature bit) I guess it works and
> that's probably a reasonable requirement for migration.

We init the VCPUs with the POWER_OFF flag set for exactly the
set of CPUs that we want to start powered off. Typically that
means that the first CPU is inited with POWER_OFF clear and the
rest are inited with it set.

Regardless, if we're doing an incoming migration, then the
incoming data will be used to set the VCPU on/off state
appropriately for resuming the VM. (This might include
powering on a VCPU that's been inited in power-off and never
run, or powering down a VCPU that was inited power-on but
never ran. In the 'migration moves a vcpu to powered-on'
case we'll also set the vcpu's PC and other registers so
when it is run it will DTRT.)

If the resumed guest subsequently does a PSCI reset then
QEMU will re-init the VCPUs with the set of feature bits that
the machine model/etc defines for a reset condition for this
board, which will be the same "first CPU powered on, all the
rest powered off" config we started with.

(It's the user's responsibility to ensure that when doing a
migration the QEMUs at both ends are identically configured,
incidentally.)

-- PMM

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2 1/5] arm: KVM: export vcpi->pause state via MP_STATE ioctls
@ 2015-03-09 19:55           ` Peter Maydell
  0 siblings, 0 replies; 35+ messages in thread
From: Peter Maydell @ 2015-03-09 19:55 UTC (permalink / raw)
  To: linux-arm-kernel

On 10 March 2015 at 04:29, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> On Mon, Mar 09, 2015 at 04:34:21PM +0000, Alex Benn?e wrote:
>> - Boot
>> - Power on secondary CPUs
>> - Power off one secondary CPU
>> - Migrate to file (cpu_powered reflects state of each CPU)
>>
>> - Start fresh QEMU
>> - Restore from file (cpu_powered -> vcpu->paused via ioctl)
>> - Run (we continue with the same power state pre-migrate)
>>
>> - PSCI RESET
>> - Does what it does, power all secondaries down?
>> - Kernel boots, turns them on?
>>
> Hmmm. As long as QEMU always inits all VCPUs in the exact same way
> (including the KVM_ARM_VCPU_POWER_OFF feature bit) I guess it works and
> that's probably a reasonable requirement for migration.

We init the VCPUs with the POWER_OFF flag set for exactly the
set of CPUs that we want to start powered off. Typically that
means that the first CPU is inited with POWER_OFF clear and the
rest are inited with it set.

Regardless, if we're doing an incoming migration, then the
incoming data will be used to set the VCPU on/off state
appropriately for resuming the VM. (This might include
powering on a VCPU that's been inited in power-off and never
run, or powering down a VCPU that was inited power-on but
never ran. In the 'migration moves a vcpu to powered-on'
case we'll also set the vcpu's PC and other registers so
when it is run it will DTRT.)

If the resumed guest subsequently does a PSCI reset then
QEMU will re-init the VCPUs with the set of feature bits that
the machine model/etc defines for a reset condition for this
board, which will be the same "first CPU powered on, all the
rest powered off" config we started with.

(It's the user's responsibility to ensure that when doing a
migration the QEMUs at both ends are identically configured,
incidentally.)

-- PMM

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2015-03-09 19:55 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-02 13:28 [PATCH v2 0/5] KVM ARM64 Migration Fixes Alex Bennée
2015-03-02 13:28 ` Alex Bennée
2015-03-02 13:29 ` [PATCH v2 1/5] arm: KVM: export vcpi->pause state via MP_STATE ioctls Alex Bennée
2015-03-02 13:29   ` Alex Bennée
2015-03-02 13:29   ` Alex Bennée
2015-03-09 13:50   ` Christoffer Dall
2015-03-09 13:50     ` Christoffer Dall
2015-03-09 13:50     ` Christoffer Dall
2015-03-09 16:34     ` Alex Bennée
2015-03-09 16:34       ` Alex Bennée
2015-03-09 16:34       ` Alex Bennée
2015-03-09 19:29       ` Christoffer Dall
2015-03-09 19:29         ` Christoffer Dall
2015-03-09 19:29         ` Christoffer Dall
2015-03-09 19:55         ` Peter Maydell
2015-03-09 19:55           ` Peter Maydell
2015-03-02 13:29 ` [PATCH v2 2/5] arm/arm64: KVM: Implement support for unqueueing active IRQs Alex Bennée
2015-03-02 13:29   ` Alex Bennée
2015-03-02 13:29   ` Alex Bennée
2015-03-09 14:41   ` Christoffer Dall
2015-03-09 14:41     ` Christoffer Dall
2015-03-09 16:40     ` Alex Bennée
2015-03-09 16:40       ` Alex Bennée
2015-03-02 13:29 ` [PATCH v2 3/5] arm: KVM: add a common vgic_queue_irq_to_lr fn Alex Bennée
2015-03-02 13:29   ` Alex Bennée
2015-03-02 13:29   ` Alex Bennée
2015-03-09 14:47   ` Christoffer Dall
2015-03-09 14:47     ` Christoffer Dall
2015-03-09 14:47     ` Christoffer Dall
2015-03-02 13:29 ` [PATCH v2 4/5] arm/arm64: KVM: Fix migration race in the arch timer Alex Bennée
2015-03-02 13:29   ` Alex Bennée
2015-03-02 13:29   ` Alex Bennée
2015-03-02 13:29 ` [PATCH v2 5/5] arm/arm64: KVM: Keep elrsr/aisr in sync with software model Alex Bennée
2015-03-02 13:29   ` Alex Bennée
2015-03-02 13:29   ` Alex Bennée

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.