linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 00/10] ARM IRQ forward control based on IRQ bypass manager
@ 2015-08-10 13:20 Eric Auger
  2015-08-10 13:20 ` [PATCH v3 01/10] VFIO: platform: registration of a dummy IRQ bypass producer Eric Auger
                   ` (10 more replies)
  0 siblings, 11 replies; 35+ messages in thread
From: Eric Auger @ 2015-08-10 13:20 UTC (permalink / raw)
  To: eric.auger, eric.auger, linux-arm-kernel, kvmarm, kvm,
	christoffer.dall, marc.zyngier, alex.williamson, feng.wu
  Cc: linux-kernel, patches, pbonzini

This series allows to set ARM IRQ forwarding between a VFIO platform
device physical IRQ and a guest virtual IRQ. The link is coordinated
by the IRQ bypass manager.

The principle is the VFIO platform driver registers an IRQ bypass producer
struct on VFIO_IRQ_SET_ACTION_TRIGGER while KVM irqfd registers a consumer
struct on the irqfd assignment. This leads to a handshake based on the
eventfd context (used as token) match. When either of the producer/consumer
disappears, an unregistration occurs and the link is disconnected.

This kernel integration deprecates the former kvm-vfio approach:
https://lkml.org/lkml/2015/4/13/353. Some rationale about that change can
be found in IRQ bypass manager thread: https://lkml.org/lkml/2015/6/29/268

Dependencies:
1- [PATCH v4 00/11] arm/arm64: KVM: Active interrupt state switching
   for shared devices (http://www.spinics.net/lists/arm-kernel/msg437884.html)
   except PATCH 11
2- [PATCH 0/6] irqchip: GICv2/v3: Add support for irq_vcpu_affinity
3- [PATCH v4] virt: IRQ bypass manager (https://lkml.org/lkml/2015/8/6/526)
4- [PATCH 0/2] KVM: arm/arm64: Guest synchronous halt/resume
   (https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg950942.html)

All those pieces can be found at:
https://git.linaro.org/people/eric.auger/linux.git/shortlog/refs/heads/v4.2-rc6-irq-forward-v3

More backgroung on ARM IRQ forwarding in the text below and at
http://www.linux-kvm.org/images/a/a8/01x04-ARMdevice.pdf.

A forwarded IRQ is deactivated by the guest and not by the host.
When the guest deactivates the associated virtual IRQ, the interrupt
controller automatically completes the physical IRQ. Obviously
this requires some HW support in the interrupt controller. This is
the case for ARM GICv2.

The direct benefit is that, for a level sensitive IRQ, a VM exit
can be avoided on forwarded IRQ completion.

When the IRQ is forwarded, the VFIO platform driver does not need to
mask the physical IRQ anymore before signaling the eventfd. Indeed
genirq lowers the running priority, enabling other physical IRQ to hit
except that one.

Besides, the injection still is based on irqfd triggering. The only
impact on irqfd process is resamplefd is not called anymore on
virtual IRQ completion since deactivation is not trapped by KVM.

This was tested on Calxeda Midway, assigning the xgmac main IRQ

History:

v2 (RFC) -> v3(PATCH):
- all dependencies now have a patch status
- we dropped the producer active boolean exchanged between the
  VFIO producer and irqfd arm consumer. As a consequence, on
  unforward, if the IRQ is active, this latter is deactivated
  without VFIO-masking it. So we do not exactly come back to the
  exact state where we would be in unforwarded state. A new
  physical IRQ can hit while the previous virtual IRQ is under
  treatment. Its injection in the guest may be rejected thanks
  to the VGIC state machine. This IRQ will be lost but I don't
  think this is a severe issue. In case no new IRQ hits, the
  guest deactivation of the virtual IRQ will trigger the resamplefd
  which will VFIO unmask the non-masked IRQ. This also has no
  consequence.
- VFIO platform driver consumer_add now can fail. It rejects the
  transition for forwarding state in case the IRQ is active
- the series is rebased on new irq_vcpu_affinity series
- no dependency anymore on "chip/vgic adaptations for forwarded irq"
  which was partially integrated into Marc's series. A fix is still
  needed through.
- Guest synchronous halt/resume patch re-integrated into this series
- integrate a new patch file coming mixing
  [PATCH v4 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for
                   non-shared devices &
  [RFC v2 2/4] KVM: arm: vgic: fix state machine for forwarded IRQ

v1 -> v2:
- irq bypass manager and irqfd consumer moved in a separate patch
- kvm_arm_[halt,resume]_guest moved in a separate patch
- remove VFIO external functions since we do not need them anymore
- apply container_of strategy advised by Paolo. Only active field
  remains and discussions will tell whether we get rid of it.
- renamed kvm_arch functions

- kvm-vfio v6 -> RFC v1 based on IRQ bypass manager
  see previous history in https://lkml.org/lkml/2015/4/13/353).

Best Regards

Eric


Eric Auger (9):
  VFIO: platform: registration of a dummy IRQ bypass producer
  VFIO: platform: test forwarded state when selecting the IRQ handler
  VFIO: platform: single handler using function pointer
  VFIO: platform: add vfio_platform_set_automasked
  VFIO: platform: add vfio_platform_is_active
  VFIO: platform: add irq bypass producer management
  KVM: arm/arm64: vgic: support irqfd injection of a forwarded IRQ
  KVM: arm/arm64: vgic: forwarding control
  KVM: arm/arm64: implement IRQ bypass consumer functions

Marc Zyngier (1):
  KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices

 arch/arm/kvm/Kconfig                          |   1 +
 arch/arm/kvm/arm.c                            |  35 +++++
 arch/arm64/kvm/Kconfig                        |   1 +
 drivers/vfio/platform/vfio_platform_irq.c     | 113 +++++++++++++-
 drivers/vfio/platform/vfio_platform_private.h |   4 +
 include/kvm/arm_vgic.h                        |  12 +-
 virt/kvm/arm/arch_timer.c                     |   3 +-
 virt/kvm/arm/vgic.c                           | 215 ++++++++++++++++++++++++--
 8 files changed, 358 insertions(+), 26 deletions(-)

-- 
1.9.1


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v3 01/10] VFIO: platform: registration of a dummy IRQ bypass producer
  2015-08-10 13:20 [PATCH v3 00/10] ARM IRQ forward control based on IRQ bypass manager Eric Auger
@ 2015-08-10 13:20 ` Eric Auger
  2015-08-12 18:56   ` Alex Williamson
  2015-08-10 13:20 ` [PATCH v3 02/10] VFIO: platform: test forwarded state when selecting the IRQ handler Eric Auger
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 35+ messages in thread
From: Eric Auger @ 2015-08-10 13:20 UTC (permalink / raw)
  To: eric.auger, eric.auger, linux-arm-kernel, kvmarm, kvm,
	christoffer.dall, marc.zyngier, alex.williamson, feng.wu
  Cc: linux-kernel, patches, pbonzini

Register a dummy producer with void callbacks

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v2 -> v3:
- rename vfio_platform_irq_bypass_resume into *_start
---
 drivers/vfio/platform/vfio_platform_irq.c     | 32 +++++++++++++++++++++++++++
 drivers/vfio/platform/vfio_platform_private.h |  2 ++
 2 files changed, 34 insertions(+)

diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index 88bba57..b5cb8c7 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -20,6 +20,7 @@
 #include <linux/types.h>
 #include <linux/vfio.h>
 #include <linux/irq.h>
+#include <linux/irqbypass.h>
 
 #include "vfio_platform_private.h"
 
@@ -177,6 +178,27 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
+static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
+{
+}
+
+static void vfio_platform_irq_bypass_start(struct irq_bypass_producer *prod)
+{
+}
+
+static int vfio_platform_irq_bypass_add_consumer(
+			struct irq_bypass_producer *prod,
+			struct irq_bypass_consumer *cons)
+{
+	return 0;
+}
+
+static void vfio_platform_irq_bypass_del_consumer(
+			struct irq_bypass_producer *prod,
+			struct irq_bypass_consumer *cons)
+{
+}
+
 static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
 			    int fd, irq_handler_t handler)
 {
@@ -186,6 +208,7 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
 
 	if (irq->trigger) {
 		free_irq(irq->hwirq, irq);
+		irq_bypass_unregister_producer(&irq->producer);
 		kfree(irq->name);
 		eventfd_ctx_put(irq->trigger);
 		irq->trigger = NULL;
@@ -216,6 +239,15 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
 		return ret;
 	}
 
+	irq->producer.token = (void *)trigger;
+	irq->producer.irq = irq->hwirq;
+	irq->producer.add_consumer = vfio_platform_irq_bypass_add_consumer;
+	irq->producer.del_consumer = vfio_platform_irq_bypass_del_consumer;
+	irq->producer.stop = vfio_platform_irq_bypass_stop;
+	irq->producer.start = vfio_platform_irq_bypass_start;
+	ret = irq_bypass_register_producer(&irq->producer);
+	WARN_ON(ret);
+
 	if (!irq->masked)
 		enable_irq(irq->hwirq);
 
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 1c9b3d5..1d2d4d6 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -17,6 +17,7 @@
 
 #include <linux/types.h>
 #include <linux/interrupt.h>
+#include <linux/irqbypass.h>
 
 #define VFIO_PLATFORM_OFFSET_SHIFT   40
 #define VFIO_PLATFORM_OFFSET_MASK (((u64)(1) << VFIO_PLATFORM_OFFSET_SHIFT) - 1)
@@ -37,6 +38,7 @@ struct vfio_platform_irq {
 	spinlock_t		lock;
 	struct virqfd		*unmask;
 	struct virqfd		*mask;
+	struct irq_bypass_producer producer;
 };
 
 struct vfio_platform_region {
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 02/10] VFIO: platform: test forwarded state when selecting the IRQ handler
  2015-08-10 13:20 [PATCH v3 00/10] ARM IRQ forward control based on IRQ bypass manager Eric Auger
  2015-08-10 13:20 ` [PATCH v3 01/10] VFIO: platform: registration of a dummy IRQ bypass producer Eric Auger
@ 2015-08-10 13:20 ` Eric Auger
  2015-08-10 13:20 ` [PATCH v3 03/10] VFIO: platform: single handler using function pointer Eric Auger
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 35+ messages in thread
From: Eric Auger @ 2015-08-10 13:20 UTC (permalink / raw)
  To: eric.auger, eric.auger, linux-arm-kernel, kvmarm, kvm,
	christoffer.dall, marc.zyngier, alex.williamson, feng.wu
  Cc: linux-kernel, patches, pbonzini

Add a new forwarded flag in vfio_platform_irq.  In case the IRQ
is forwarded, the VFIO platform IRQ handler does not need to
disable the IRQ anymore.

When setting the IRQ handler we now also test the forwarded state. In
case the IRQ is forwarded we select the vfio_irq_handler.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v2:
- add a new forwarded flag and do not use irqd_irq_forwarded anymore
---
 drivers/vfio/platform/vfio_platform_irq.c     | 3 ++-
 drivers/vfio/platform/vfio_platform_private.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index b5cb8c7..40f057a 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -262,7 +262,8 @@ static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev,
 	struct vfio_platform_irq *irq = &vdev->irqs[index];
 	irq_handler_t handler;
 
-	if (vdev->irqs[index].flags & VFIO_IRQ_INFO_AUTOMASKED)
+	if (vdev->irqs[index].flags & VFIO_IRQ_INFO_AUTOMASKED &&
+			!irq->forwarded)
 		handler = vfio_automasked_irq_handler;
 	else
 		handler = vfio_irq_handler;
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 1d2d4d6..8b4f814 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -39,6 +39,7 @@ struct vfio_platform_irq {
 	struct virqfd		*unmask;
 	struct virqfd		*mask;
 	struct irq_bypass_producer producer;
+	bool			forwarded;
 };
 
 struct vfio_platform_region {
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 03/10] VFIO: platform: single handler using function pointer
  2015-08-10 13:20 [PATCH v3 00/10] ARM IRQ forward control based on IRQ bypass manager Eric Auger
  2015-08-10 13:20 ` [PATCH v3 01/10] VFIO: platform: registration of a dummy IRQ bypass producer Eric Auger
  2015-08-10 13:20 ` [PATCH v3 02/10] VFIO: platform: test forwarded state when selecting the IRQ handler Eric Auger
@ 2015-08-10 13:20 ` Eric Auger
  2015-08-12 18:56   ` Alex Williamson
  2015-08-10 13:20 ` [PATCH v3 04/10] VFIO: platform: add vfio_platform_set_automasked Eric Auger
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 35+ messages in thread
From: Eric Auger @ 2015-08-10 13:20 UTC (permalink / raw)
  To: eric.auger, eric.auger, linux-arm-kernel, kvmarm, kvm,
	christoffer.dall, marc.zyngier, alex.williamson, feng.wu
  Cc: linux-kernel, patches, pbonzini

A single handler now is registered whatever the use case: automasked
or not. A function pointer is set according to the wished behavior
and the handler calls this function.

The irq lock is taken/released in the root handler. eventfd_signal can
be called in regions not allowed to sleep.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v4: creation
---
 drivers/vfio/platform/vfio_platform_irq.c     | 21 +++++++++++++++------
 drivers/vfio/platform/vfio_platform_private.h |  1 +
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index 40f057a..b31b1f0 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -148,11 +148,8 @@ static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev,
 static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
 {
 	struct vfio_platform_irq *irq_ctx = dev_id;
-	unsigned long flags;
 	int ret = IRQ_NONE;
 
-	spin_lock_irqsave(&irq_ctx->lock, flags);
-
 	if (!irq_ctx->masked) {
 		ret = IRQ_HANDLED;
 
@@ -161,8 +158,6 @@ static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
 		irq_ctx->masked = true;
 	}
 
-	spin_unlock_irqrestore(&irq_ctx->lock, flags);
-
 	if (ret == IRQ_HANDLED)
 		eventfd_signal(irq_ctx->trigger, 1);
 
@@ -178,6 +173,19 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
+static irqreturn_t vfio_handler(int irq, void *dev_id)
+{
+	struct vfio_platform_irq *irq_ctx = dev_id;
+	unsigned long flags;
+	irqreturn_t ret;
+
+	spin_lock_irqsave(&irq_ctx->lock, flags);
+	ret = irq_ctx->handler(irq, dev_id);
+	spin_unlock_irqrestore(&irq_ctx->lock, flags);
+
+	return ret;
+}
+
 static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
 {
 }
@@ -229,9 +237,10 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
 	}
 
 	irq->trigger = trigger;
+	irq->handler = handler;
 
 	irq_set_status_flags(irq->hwirq, IRQ_NOAUTOEN);
-	ret = request_irq(irq->hwirq, handler, 0, irq->name, irq);
+	ret = request_irq(irq->hwirq, vfio_handler, 0, irq->name, irq);
 	if (ret) {
 		kfree(irq->name);
 		eventfd_ctx_put(trigger);
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 8b4f814..f848a6b 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -40,6 +40,7 @@ struct vfio_platform_irq {
 	struct virqfd		*mask;
 	struct irq_bypass_producer producer;
 	bool			forwarded;
+	irqreturn_t (*handler)(int irq, void *dev_id);
 };
 
 struct vfio_platform_region {
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 04/10] VFIO: platform: add vfio_platform_set_automasked
  2015-08-10 13:20 [PATCH v3 00/10] ARM IRQ forward control based on IRQ bypass manager Eric Auger
                   ` (2 preceding siblings ...)
  2015-08-10 13:20 ` [PATCH v3 03/10] VFIO: platform: single handler using function pointer Eric Auger
@ 2015-08-10 13:20 ` Eric Auger
  2015-08-12 18:56   ` Alex Williamson
  2015-08-10 13:20 ` [PATCH v3 05/10] VFIO: platform: add vfio_platform_is_active Eric Auger
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 35+ messages in thread
From: Eric Auger @ 2015-08-10 13:20 UTC (permalink / raw)
  To: eric.auger, eric.auger, linux-arm-kernel, kvmarm, kvm,
	christoffer.dall, marc.zyngier, alex.williamson, feng.wu
  Cc: linux-kernel, patches, pbonzini

This function makes possible to change the automasked mode.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v1 -> v2:
- set forwarded flag
---
 drivers/vfio/platform/vfio_platform_irq.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index b31b1f0..a285384 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -186,6 +186,25 @@ static irqreturn_t vfio_handler(int irq, void *dev_id)
 	return ret;
 }
 
+static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
+					   bool automasked)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&irq->lock, flags);
+	if (automasked) {
+		irq->forwarded = true;
+		irq->flags |= VFIO_IRQ_INFO_AUTOMASKED;
+		irq->handler = vfio_automasked_irq_handler;
+	} else {
+		irq->forwarded = false;
+		irq->flags &= ~VFIO_IRQ_INFO_AUTOMASKED;
+		irq->handler = vfio_irq_handler;
+	}
+	spin_unlock_irqrestore(&irq->lock, flags);
+	return 0;
+}
+
 static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
 {
 }
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 05/10] VFIO: platform: add vfio_platform_is_active
  2015-08-10 13:20 [PATCH v3 00/10] ARM IRQ forward control based on IRQ bypass manager Eric Auger
                   ` (3 preceding siblings ...)
  2015-08-10 13:20 ` [PATCH v3 04/10] VFIO: platform: add vfio_platform_set_automasked Eric Auger
@ 2015-08-10 13:20 ` Eric Auger
  2015-08-12 18:56   ` Alex Williamson
  2015-09-02 19:29   ` Christoffer Dall
  2015-08-10 13:21 ` [PATCH v3 06/10] VFIO: platform: add irq bypass producer management Eric Auger
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 35+ messages in thread
From: Eric Auger @ 2015-08-10 13:20 UTC (permalink / raw)
  To: eric.auger, eric.auger, linux-arm-kernel, kvmarm, kvm,
	christoffer.dall, marc.zyngier, alex.williamson, feng.wu
  Cc: linux-kernel, patches, pbonzini

This function returns whether the IRQ is active at irqchip level or
VFIO masked. If either is true, it is considered the IRQ is active.
Currently there is no way to differentiate userspace masked IRQ from
automasked IRQ. There might be false detection of activity. However
it is currently acceptable to have false detection.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
---
 drivers/vfio/platform/vfio_platform_irq.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index a285384..efaee58 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -205,6 +205,23 @@ static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
 	return 0;
 }
 
+static int vfio_platform_is_active(struct vfio_platform_irq *irq)
+{
+	unsigned long flags;
+	bool active, masked, outstanding;
+	int ret;
+
+	spin_lock_irqsave(&irq->lock, flags);
+
+	ret = irq_get_irqchip_state(irq->hwirq, IRQCHIP_STATE_ACTIVE, &active);
+	BUG_ON(ret);
+	masked = irq->masked;
+	outstanding = active || masked;
+
+	spin_unlock_irqrestore(&irq->lock, flags);
+	return outstanding;
+}
+
 static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
 {
 }
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 06/10] VFIO: platform: add irq bypass producer management
  2015-08-10 13:20 [PATCH v3 00/10] ARM IRQ forward control based on IRQ bypass manager Eric Auger
                   ` (4 preceding siblings ...)
  2015-08-10 13:20 ` [PATCH v3 05/10] VFIO: platform: add vfio_platform_is_active Eric Auger
@ 2015-08-10 13:21 ` Eric Auger
  2015-08-12 18:56   ` Alex Williamson
  2015-09-02 19:32   ` Christoffer Dall
  2015-08-10 13:21 ` [PATCH v3 07/10] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices Eric Auger
                   ` (4 subsequent siblings)
  10 siblings, 2 replies; 35+ messages in thread
From: Eric Auger @ 2015-08-10 13:21 UTC (permalink / raw)
  To: eric.auger, eric.auger, linux-arm-kernel, kvmarm, kvm,
	christoffer.dall, marc.zyngier, alex.williamson, feng.wu
  Cc: linux-kernel, patches, pbonzini

This patch populates the IRQ bypass callacks:
- stop/start producer simply consist in disabling/enabling the host irq
- add/del consumer: basically set the automasked flag to false/true

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v2 -> v3:
- vfio_platform_irq_bypass_add_consumer now returns an error in case
  the IRQ is recognized as active
- active boolean not set anymore
- do not VFIO mask the IRQ anymore on unforward

v1 -> v2:
- device handle caching in vfio_platform_device is introduced in a
  separate patch
- use container_of
---
 drivers/vfio/platform/vfio_platform_irq.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index efaee58..400a188 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -224,23 +224,44 @@ static int vfio_platform_is_active(struct vfio_platform_irq *irq)
 
 static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
 {
+	disable_irq(prod->irq);
 }
 
 static void vfio_platform_irq_bypass_start(struct irq_bypass_producer *prod)
 {
+	enable_irq(prod->irq);
 }
 
 static int vfio_platform_irq_bypass_add_consumer(
 			struct irq_bypass_producer *prod,
 			struct irq_bypass_consumer *cons)
 {
-	return 0;
+	struct vfio_platform_irq *irq =
+		container_of(prod, struct vfio_platform_irq, producer);
+
+	/*
+	 * if the IRQ is active at irqchip level or VFIO (auto)masked
+	 * this means the host IRQ is already under injection in the
+	 * guest and this not safe to change the forwarding state at
+	 * that stage.
+	 * It is not possible to differentiate user-space masking
+	 * from auto-masking, leading to possible false detection of
+	 * active state.
+	 */
+	if (vfio_platform_is_active(irq))
+		return -EAGAIN;
+
+	return vfio_platform_set_automasked(irq, false);
 }
 
 static void vfio_platform_irq_bypass_del_consumer(
 			struct irq_bypass_producer *prod,
 			struct irq_bypass_consumer *cons)
 {
+	struct vfio_platform_irq *irq =
+		container_of(prod, struct vfio_platform_irq, producer);
+
+	vfio_platform_set_automasked(irq, true);
 }
 
 static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 07/10] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices
  2015-08-10 13:20 [PATCH v3 00/10] ARM IRQ forward control based on IRQ bypass manager Eric Auger
                   ` (5 preceding siblings ...)
  2015-08-10 13:21 ` [PATCH v3 06/10] VFIO: platform: add irq bypass producer management Eric Auger
@ 2015-08-10 13:21 ` Eric Auger
  2015-09-02 19:42   ` Christoffer Dall
  2015-08-10 13:21 ` [PATCH v3 08/10] KVM: arm/arm64: vgic: support irqfd injection of a forwarded IRQ Eric Auger
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 35+ messages in thread
From: Eric Auger @ 2015-08-10 13:21 UTC (permalink / raw)
  To: eric.auger, eric.auger, linux-arm-kernel, kvmarm, kvm,
	christoffer.dall, marc.zyngier, alex.williamson, feng.wu
  Cc: linux-kernel, patches, pbonzini

From: Marc Zyngier <marc.zyngier@arm.com>

So far, the only use of the HW interrupt facility was the timer,
implying that the active state is context-switched for each vcpu,
as the device is is shared across all vcpus.

This does not work for a device that has been assigned to a VM,
as the guest is entierely in control of that device (the HW is
not shared). In that case, it makes sense to bypass the whole
active state switching.

Also the VGIC state machine is adapted to support those assigned
(non shared) HW IRQs:
- nly can be sampled when it is pending
- when queueing the IRQ (programming the LR), the pending state is
  removed as for edge sensitive IRQs
- queued state is not modelled. Level state is not modelled
- its injection always is valid since steming from the HW.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

- a mix of
  [PATCH v4 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for
                   non-shared devices
  [RFC v2 2/4] KVM: arm: vgic: fix state machine for forwarded IRQ
---
 include/kvm/arm_vgic.h    |  6 +++--
 virt/kvm/arm/arch_timer.c |  3 ++-
 virt/kvm/arm/vgic.c       | 58 +++++++++++++++++++++++++++++++++++------------
 3 files changed, 49 insertions(+), 18 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index d901f1a..7ef9ce0 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -163,7 +163,8 @@ struct irq_phys_map {
 	u32			virt_irq;
 	u32			phys_irq;
 	u32			irq;
-	bool			active;
+	bool			shared;
+	bool			active; /* Only valid if shared */
 };
 
 struct irq_phys_map_entry {
@@ -356,7 +357,8 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
 int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
 int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
 struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
-					   int virt_irq, int irq);
+					   int virt_irq, int irq,
+					   bool shared);
 int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
 bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
 void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 76e38d2..db21d8f 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -203,7 +203,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
 	 * Tell the VGIC that the virtual interrupt is tied to a
 	 * physical interrupt. We do that once per VCPU.
 	 */
-	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
+	map = kvm_vgic_map_phys_irq(vcpu, irq->irq,
+				    host_vtimer_irq, true);
 	if (WARN_ON(IS_ERR(map)))
 		return PTR_ERR(map);
 
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 9eb489a..fbd5ba5 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -400,7 +400,11 @@ void vgic_cpu_irq_clear(struct kvm_vcpu *vcpu, int irq)
 
 static bool vgic_can_sample_irq(struct kvm_vcpu *vcpu, int irq)
 {
-	return !vgic_irq_is_queued(vcpu, irq);
+	struct irq_phys_map *map = vgic_irq_map_search(vcpu, irq);
+	bool shared_hw = map && !map->shared;
+
+	return !vgic_irq_is_queued(vcpu, irq) ||
+			(shared_hw && vgic_dist_irq_is_pending(vcpu, irq));
 }
 
 /**
@@ -1150,19 +1154,26 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
 		 * active in the physical world. Otherwise the
 		 * physical interrupt will fire and the guest will
 		 * exit before processing the virtual interrupt.
+		 *
+		 * This is of course only valid for a shared
+		 * interrupt. A non shared interrupt should already be
+		 * active.
 		 */
 		if (map) {
-			int ret;
-
-			BUG_ON(!map->active);
 			vlr.hwirq = map->phys_irq;
 			vlr.state |= LR_HW;
 			vlr.state &= ~LR_EOI_INT;
 
-			ret = irq_set_irqchip_state(map->irq,
-						    IRQCHIP_STATE_ACTIVE,
-						    true);
-			WARN_ON(ret);
+			if (map->shared) {
+				int ret;
+
+				BUG_ON(!map->active);
+				ret = irq_set_irqchip_state(
+						map->irq,
+						IRQCHIP_STATE_ACTIVE,
+						true);
+				WARN_ON(ret);
+			}
 
 			/*
 			 * Make sure we're not going to sample this
@@ -1229,10 +1240,13 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 
 static bool vgic_queue_hwirq(struct kvm_vcpu *vcpu, int irq)
 {
+	struct irq_phys_map *map = vgic_irq_map_search(vcpu, irq);
+	bool shared_hw = map && !map->shared;
+
 	if (!vgic_can_sample_irq(vcpu, irq))
 		return true; /* level interrupt, already queued */
 
-	if (vgic_queue_irq(vcpu, 0, irq)) {
+	if (vgic_queue_irq(vcpu, 0, irq) || shared_hw) {
 		if (vgic_irq_is_edge(vcpu, irq)) {
 			vgic_dist_irq_clear_pending(vcpu, irq);
 			vgic_cpu_irq_clear(vcpu, irq);
@@ -1411,7 +1425,12 @@ static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
 		return 0;
 
 	map = vgic_irq_map_search(vcpu, vlr.irq);
-	BUG_ON(!map || !map->active);
+	BUG_ON(!map);
+
+	if (!map->shared)
+		return 0;
+
+	BUG_ON(map->shared && !map->active);
 
 	ret = irq_get_irqchip_state(map->irq,
 				    IRQCHIP_STATE_ACTIVE,
@@ -1563,6 +1582,7 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
 	int edge_triggered, level_triggered;
 	int enabled;
 	bool ret = true, can_inject = true;
+	bool shared_hw = map && !map->shared;
 
 	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
 		return -EINVAL;
@@ -1573,7 +1593,8 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
 	edge_triggered = vgic_irq_is_edge(vcpu, irq_num);
 	level_triggered = !edge_triggered;
 
-	if (!vgic_validate_injection(vcpu, irq_num, level)) {
+	if (!vgic_validate_injection(vcpu, irq_num, level) &&
+		!shared_hw) {
 		ret = false;
 		goto out;
 	}
@@ -1742,16 +1763,21 @@ static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
  * @vcpu: The VCPU pointer
  * @virt_irq: The virtual irq number
  * @irq: The Linux IRQ number
+ * @shared: Indicates if the interrupt has to be context-switched or
+ *          if it is private to a VM
  *
  * Establish a mapping between a guest visible irq (@virt_irq) and a
  * Linux irq (@irq). On injection, @virt_irq will be associated with
  * the physical interrupt represented by @irq. This mapping can be
  * established multiple times as long as the parameters are the same.
+ * If @shared is true, the active state of the interrupt will be
+ * context-switched.
  *
  * Returns a valid pointer on success, and an error pointer otherwise
  */
 struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
-					   int virt_irq, int irq)
+					   int virt_irq, int irq,
+					   bool shared)
 {
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 	struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
@@ -1785,7 +1811,8 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
 	if (map) {
 		/* Make sure this mapping matches */
 		if (map->phys_irq != phys_irq	||
-		    map->irq      != irq)
+		    map->irq      != irq	||
+		    map->shared   != shared)
 			map = ERR_PTR(-EINVAL);
 
 		/* Found an existing, valid mapping */
@@ -1796,6 +1823,7 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
 	map->virt_irq = virt_irq;
 	map->phys_irq = phys_irq;
 	map->irq      = irq;
+	map->shared   = shared;
 
 	list_add_tail_rcu(&entry->entry, root);
 
@@ -1846,7 +1874,7 @@ static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
  */
 bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
 {
-	BUG_ON(!map);
+	BUG_ON(!map || !map->shared);
 	return map->active;
 }
 
@@ -1858,7 +1886,7 @@ bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
  */
 void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active)
 {
-	BUG_ON(!map);
+	BUG_ON(!map || !map->shared);
 	map->active = active;
 }
 
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 08/10] KVM: arm/arm64: vgic: support irqfd injection of a forwarded IRQ
  2015-08-10 13:20 [PATCH v3 00/10] ARM IRQ forward control based on IRQ bypass manager Eric Auger
                   ` (6 preceding siblings ...)
  2015-08-10 13:21 ` [PATCH v3 07/10] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices Eric Auger
@ 2015-08-10 13:21 ` Eric Auger
  2015-08-10 13:21 ` [PATCH v3 09/10] KVM: arm/arm64: vgic: forwarding control Eric Auger
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 35+ messages in thread
From: Eric Auger @ 2015-08-10 13:21 UTC (permalink / raw)
  To: eric.auger, eric.auger, linux-arm-kernel, kvmarm, kvm,
	christoffer.dall, marc.zyngier, alex.williamson, feng.wu
  Cc: linux-kernel, patches, pbonzini

Currently irqfd injection relies on kvm_vgic_inject_irq function.
However this function cannot be used anymore for mapped IRQs. So
let's change the implementation to use kvm_vgic_inject_mapped_irq
when the IRQ is forwarded.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
 virt/kvm/arm/vgic.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index fbd5ba5..03a85b3 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -2529,13 +2529,19 @@ int kvm_irq_map_chip_pin(struct kvm *kvm, unsigned irqchip, unsigned pin)
 int kvm_set_irq(struct kvm *kvm, int irq_source_id,
 		u32 irq, int level, bool line_status)
 {
+	struct irq_phys_map *map;
 	unsigned int spi = irq + VGIC_NR_PRIVATE_IRQS;
 
 	trace_kvm_set_irq(irq, level, irq_source_id);
 
 	BUG_ON(!vgic_initialized(kvm));
 
-	return kvm_vgic_inject_irq(kvm, 0, spi, level);
+	map = vgic_irq_map_search(kvm_get_vcpu(kvm, 0), spi);
+
+	if (!map)
+		return kvm_vgic_inject_irq(kvm, 0, spi, level);
+	else
+		return kvm_vgic_inject_mapped_irq(kvm, 0, map, level);
 }
 
 /* MSI not implemented yet */
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 09/10] KVM: arm/arm64: vgic: forwarding control
  2015-08-10 13:20 [PATCH v3 00/10] ARM IRQ forward control based on IRQ bypass manager Eric Auger
                   ` (7 preceding siblings ...)
  2015-08-10 13:21 ` [PATCH v3 08/10] KVM: arm/arm64: vgic: support irqfd injection of a forwarded IRQ Eric Auger
@ 2015-08-10 13:21 ` Eric Auger
  2015-09-02 19:58   ` Christoffer Dall
  2015-08-10 13:21 ` [PATCH v3 10/10] KVM: arm/arm64: implement IRQ bypass consumer functions Eric Auger
  2015-09-02 19:58 ` [PATCH v3 00/10] ARM IRQ forward control based on IRQ bypass manager Christoffer Dall
  10 siblings, 1 reply; 35+ messages in thread
From: Eric Auger @ 2015-08-10 13:21 UTC (permalink / raw)
  To: eric.auger, eric.auger, linux-arm-kernel, kvmarm, kvm,
	christoffer.dall, marc.zyngier, alex.williamson, feng.wu
  Cc: linux-kernel, patches, pbonzini

Implements kvm_vgic_[set|unset]_forward.

Handle low-level VGIC programming: physical IRQ/guest IRQ mapping,
list register cleanup, VGIC state machine. Also interacts with
the irqchip.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v2 -> v3:
- on unforward, we do not compute & output the active state anymore.
  This means if the unforward happens while the physical IRQ is
  active, we will not VFIO mask the IRQ while deactiving it. If a
  new physical IRQ hits, the corresponding virtual IRQ might not
  be injected (hence lost) due to VGIC state machine.

bypass rfc v2:
- use irq_set_vcpu_affinity API
- use irq_set_irqchip_state instead of chip->irq_eoi

bypass rfc:
- rename kvm_arch_{set|unset}_forward into
  kvm_vgic_{set|unset}_forward. Remove __KVM_HAVE_ARCH_HALT_GUEST.
  The function is bound to be called by ARM code only.

v4 -> v5:
- fix arm64 compilation issues, ie. also defines
  __KVM_HAVE_ARCH_HALT_GUEST for arm64

v3 -> v4:
- code originally located in kvm_vfio_arm.c
- kvm_arch_vfio_{set|unset}_forward renamed into
  kvm_arch_{set|unset}_forward
- split into 2 functions (set/unset) since unset does not fail anymore
- unset can be invoked at whatever time. Extra care is taken to handle
  transition in VGIC state machine, LR cleanup, ...

v2 -> v3:
- renaming of kvm_arch_set_fwd_state into kvm_arch_vfio_set_forward
- takes a bool arg instead of kvm_fwd_irq_action enum
- removal of KVM_VFIO_IRQ_CLEANUP
- platform device check now happens here
- more precise errors returned
- irq_eoi handled externally to this patch (VGIC)
- correct enable_irq bug done twice
- reword the commit message
- correct check of platform_bus_type
- use raw_spin_lock_irqsave and check the validity of the handler
---
 include/kvm/arm_vgic.h |   6 ++
 virt/kvm/arm/vgic.c    | 149 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 155 insertions(+)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 7ef9ce0..409ac0f 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -363,6 +363,12 @@ int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
 bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
 void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
 
+int kvm_vgic_set_forward(struct kvm *kvm,
+			 unsigned int host_irq, unsigned int guest_irq);
+
+void kvm_vgic_unset_forward(struct kvm *kvm,
+			    unsigned int host_irq, unsigned int guest_irq);
+
 #define irqchip_in_kernel(k)	(!!((k)->arch.vgic.in_kernel))
 #define vgic_initialized(k)	(!!((k)->arch.vgic.nr_cpus))
 #define vgic_ready(k)		((k)->arch.vgic.ready)
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 03a85b3..b15999a 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -2551,3 +2551,152 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
 {
 	return 0;
 }
+
+/**
+ * kvm_vgic_set_forward - Set IRQ forwarding
+ *
+ * @kvm: handle to the VM
+ * @host_irq: physical IRQ number
+ * @guest_irq: virtual IRQ number
+ *
+ * This function is supposed to be called only if the IRQ
+ * is not in progress: ie. not active at GIC level and not
+ * currently under injection in the guest. The physical IRQ must
+ * also be disabled and the guest must have been exited and
+ * prevented from being re-entered.
+ */
+int kvm_vgic_set_forward(struct kvm *kvm,
+			 unsigned int host_irq,
+			 unsigned int guest_irq)
+{
+	struct irq_phys_map *map = NULL;
+	struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, 0);
+	int spi_id = guest_irq + VGIC_NR_PRIVATE_IRQS;
+
+	kvm_debug("%s host_irq=%d guest_irq=%d\n",
+		__func__, host_irq, guest_irq);
+
+	if (!vcpu)
+		return 0;
+
+	irq_set_vcpu_affinity(host_irq, vcpu);
+	/*
+	 * next physical IRQ will be be handled as forwarded
+	 * by the host (priority drop only)
+	 */
+
+	map = kvm_vgic_map_phys_irq(vcpu, spi_id, host_irq, false);
+	/*
+	 * next guest_irq injection will be considered as
+	 * forwarded and next flush will program LR
+	 * without maintenance IRQ but with HW bit set
+	 */
+	return !map;
+}
+
+/**
+ * kvm_vgic_unset_forward - Unset IRQ forwarding
+ *
+ * @kvm: handle to the VM
+ * @host_irq: physical IRQ number
+ * @guest_irq: virtual IRQ number
+ *
+ * This function must be called when the host_irq is disabled
+ * and guest has been exited and prevented from being re-entered.
+ *
+ */
+void kvm_vgic_unset_forward(struct kvm *kvm,
+			    unsigned int host_irq,
+			    unsigned int guest_irq)
+{
+	struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, 0);
+	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	int ret, lr;
+	struct vgic_lr vlr;
+	int spi_id = guest_irq + VGIC_NR_PRIVATE_IRQS;
+	bool queued = false, needs_deactivate = true;
+	struct irq_phys_map *map;
+	bool active;
+
+	kvm_debug("%s host_irq=%d guest_irq=%d\n",
+		__func__, host_irq, guest_irq);
+
+	spin_lock(&dist->lock);
+
+	irq_get_irqchip_state(host_irq, IRQCHIP_STATE_ACTIVE, &active);
+
+	if (!vcpu)
+		goto out;
+
+	map = vgic_irq_map_search(vcpu, spi_id);
+	BUG_ON(!map);
+	ret = kvm_vgic_unmap_phys_irq(vcpu, map);
+	BUG_ON(ret);
+	/*
+	 * subsequent update_irq_pending or flush will handle this
+	 * irq as not forwarded
+	 */
+	if (likely(!(active))) {
+		/*
+		 * the IRQ was not active. let's simply prepare the states
+		 * for subsequent non forwarded injection.
+		 */
+		vgic_dist_irq_clear_level(vcpu, spi_id);
+		vgic_dist_irq_clear_pending(vcpu, spi_id);
+		vgic_irq_clear_queued(vcpu, spi_id);
+		needs_deactivate = false;
+		goto out;
+	}
+
+	/* is there any list register with valid state? */
+	lr = vgic_cpu->vgic_irq_lr_map[spi_id];
+	if (lr != LR_EMPTY) {
+		vlr = vgic_get_lr(vcpu, lr);
+		if (vlr.state & LR_STATE_MASK)
+			queued = true;
+	}
+
+	if (!queued) {
+		vgic_irq_clear_queued(vcpu, spi_id);
+		if (vgic_dist_irq_is_pending(vcpu, spi_id)) {
+			/*
+			 * IRQ is injected but not yet queued. LR will be
+			 * written with EOI_INT and process_maintenance will
+			 * reset the states: queued, level(resampler). Pending
+			 * will be reset on flush.
+			 */
+			vgic_dist_irq_set_level(vcpu, spi_id);
+		} else {
+			/*
+			 * We are somewhere before the update_irq_pending.
+			 * we can't be sure the virtual IRQ will ever be
+			 * injected (due to previous disable_irq).
+			 * Let's simply clear the level which was not correctly
+			 * modelled in forwarded state.
+			 */
+			vgic_dist_irq_clear_level(vcpu, spi_id);
+		}
+		goto out;
+	}
+
+	/*
+	 * the virtual IRQ is queued and a valid LR exists, let's patch it so
+	 * that when EOI happens a maintenance IRQ gets triggered
+	 */
+	vlr.state |= LR_EOI_INT;
+	vgic_set_lr(vcpu, lr, vlr);
+
+	vgic_dist_irq_set_level(vcpu, spi_id);
+	vgic_dist_irq_set_pending(vcpu, spi_id);
+	vgic_irq_set_queued(vcpu, spi_id);
+	 /* The maintenance IRQ will reset all states above */
+
+out:
+	irq_set_irqchip_state(host_irq, IRQCHIP_STATE_ACTIVE, false);
+	irq_set_vcpu_affinity(host_irq, NULL);
+	/* next occurrence will be deactivated by the host */
+
+	spin_unlock(&dist->lock);
+}
+
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 10/10] KVM: arm/arm64: implement IRQ bypass consumer functions
  2015-08-10 13:20 [PATCH v3 00/10] ARM IRQ forward control based on IRQ bypass manager Eric Auger
                   ` (8 preceding siblings ...)
  2015-08-10 13:21 ` [PATCH v3 09/10] KVM: arm/arm64: vgic: forwarding control Eric Auger
@ 2015-08-10 13:21 ` Eric Auger
  2015-09-02 19:58 ` [PATCH v3 00/10] ARM IRQ forward control based on IRQ bypass manager Christoffer Dall
  10 siblings, 0 replies; 35+ messages in thread
From: Eric Auger @ 2015-08-10 13:21 UTC (permalink / raw)
  To: eric.auger, eric.auger, linux-arm-kernel, kvmarm, kvm,
	christoffer.dall, marc.zyngier, alex.williamson, feng.wu
  Cc: linux-kernel, patches, pbonzini

Implement IRQ bypass callbacks for arm/arm64 IRQ forwarding:
- kvm_arch_irq_bypass_add_producer: perform VGIC/irqchip
  settings for forwarding
- kvm_arch_irq_bypass_del_producer: same for inverse operation
- kvm_arch_irq_bypass_stop: halt guest execution
- kvm_arch_irq_bypass_start: resume guest execution

and set CONFIG_HAVE_KVM_IRQ_BYPASS for arm/arm64.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v2 -> v3:
- kvm_arch_irq_bypass_resume renamed into kvm_arch_irq_bypass_start
- kvm_vgic_unset_forward does not take the active bool param anymore
- kvm_arch_irq_bypass_add_producer now returns an error value
- remove kvm_arch_irq_bypass_update

v1 -> v2:
- struct kvm_kernel_irqfd is retrieved with container_of
- function names changed
---
 arch/arm/kvm/Kconfig   |  1 +
 arch/arm/kvm/arm.c     | 35 +++++++++++++++++++++++++++++++++++
 arch/arm64/kvm/Kconfig |  1 +
 3 files changed, 37 insertions(+)

diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index 3c565b9..655d277 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -33,6 +33,7 @@ config KVM
 	select HAVE_KVM_EVENTFD
 	select HAVE_KVM_IRQFD
 	select IRQ_BYPASS_MANAGER
+	select HAVE_KVM_IRQ_BYPASS
 	depends on ARM_VIRT_EXT && ARM_LPAE && ARM_ARCH_TIMER
 	---help---
 	  Support hosting virtualized guest machines.
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 0529b38..7cfc5dc 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -27,6 +27,8 @@
 #include <linux/mman.h>
 #include <linux/sched.h>
 #include <linux/kvm.h>
+#include <linux/kvm_irqfd.h>
+#include <linux/irqbypass.h>
 #include <trace/events/kvm.h>
 
 #define CREATE_TRACE_POINTS
@@ -1149,6 +1151,39 @@ struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr)
 	return NULL;
 }
 
+int kvm_arch_irq_bypass_add_producer(struct irq_bypass_consumer *cons,
+				      struct irq_bypass_producer *prod)
+{
+	struct kvm_kernel_irqfd *irqfd =
+		container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+	return kvm_vgic_set_forward(irqfd->kvm, prod->irq, irqfd->gsi);
+}
+void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *cons,
+				      struct irq_bypass_producer *prod)
+{
+	struct kvm_kernel_irqfd *irqfd =
+		container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+	kvm_vgic_unset_forward(irqfd->kvm, prod->irq, irqfd->gsi);
+}
+
+void kvm_arch_irq_bypass_stop(struct irq_bypass_consumer *cons)
+{
+	struct kvm_kernel_irqfd *irqfd =
+		container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+	kvm_arm_halt_guest(irqfd->kvm);
+}
+
+void kvm_arch_irq_bypass_start(struct irq_bypass_consumer *cons)
+{
+	struct kvm_kernel_irqfd *irqfd =
+		container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+	kvm_arm_resume_guest(irqfd->kvm);
+}
+
 /**
  * Initialize Hyp-mode and memory mappings on all CPUs.
  */
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 2509539..6f6e7a5 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -33,6 +33,7 @@ config KVM
 	select HAVE_KVM_EVENTFD
 	select HAVE_KVM_IRQFD
 	select IRQ_BYPASS_MANAGER
+	select HAVE_KVM_IRQ_BYPASS
 	---help---
 	  Support hosting virtualized guest machines.
 
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 01/10] VFIO: platform: registration of a dummy IRQ bypass producer
  2015-08-10 13:20 ` [PATCH v3 01/10] VFIO: platform: registration of a dummy IRQ bypass producer Eric Auger
@ 2015-08-12 18:56   ` Alex Williamson
  2015-08-17 15:17     ` Eric Auger
  0 siblings, 1 reply; 35+ messages in thread
From: Alex Williamson @ 2015-08-12 18:56 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, christoffer.dall,
	marc.zyngier, feng.wu, linux-kernel, patches, pbonzini

On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
> Register a dummy producer with void callbacks
> 
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> 
> ---
> 
> v2 -> v3:
> - rename vfio_platform_irq_bypass_resume into *_start
> ---
>  drivers/vfio/platform/vfio_platform_irq.c     | 32 +++++++++++++++++++++++++++
>  drivers/vfio/platform/vfio_platform_private.h |  2 ++
>  2 files changed, 34 insertions(+)
> 
> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> index 88bba57..b5cb8c7 100644
> --- a/drivers/vfio/platform/vfio_platform_irq.c
> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> @@ -20,6 +20,7 @@
>  #include <linux/types.h>
>  #include <linux/vfio.h>
>  #include <linux/irq.h>
> +#include <linux/irqbypass.h>
>  
>  #include "vfio_platform_private.h"
>  
> @@ -177,6 +178,27 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
>  	return IRQ_HANDLED;
>  }
>  
> +static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
> +{
> +}
> +
> +static void vfio_platform_irq_bypass_start(struct irq_bypass_producer *prod)
> +{
> +}
> +
> +static int vfio_platform_irq_bypass_add_consumer(
> +			struct irq_bypass_producer *prod,
> +			struct irq_bypass_consumer *cons)
> +{
> +	return 0;
> +}
> +
> +static void vfio_platform_irq_bypass_del_consumer(
> +			struct irq_bypass_producer *prod,
> +			struct irq_bypass_consumer *cons)
> +{
> +}
> +
>  static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>  			    int fd, irq_handler_t handler)
>  {
> @@ -186,6 +208,7 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>  
>  	if (irq->trigger) {
>  		free_irq(irq->hwirq, irq);
> +		irq_bypass_unregister_producer(&irq->producer);
>  		kfree(irq->name);
>  		eventfd_ctx_put(irq->trigger);
>  		irq->trigger = NULL;
> @@ -216,6 +239,15 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>  		return ret;
>  	}
>  
> +	irq->producer.token = (void *)trigger;
> +	irq->producer.irq = irq->hwirq;
> +	irq->producer.add_consumer = vfio_platform_irq_bypass_add_consumer;
> +	irq->producer.del_consumer = vfio_platform_irq_bypass_del_consumer;
> +	irq->producer.stop = vfio_platform_irq_bypass_stop;
> +	irq->producer.start = vfio_platform_irq_bypass_start;
> +	ret = irq_bypass_register_producer(&irq->producer);
> +	WARN_ON(ret);

For what purpose?

> +
>  	if (!irq->masked)
>  		enable_irq(irq->hwirq);
>  
> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
> index 1c9b3d5..1d2d4d6 100644
> --- a/drivers/vfio/platform/vfio_platform_private.h
> +++ b/drivers/vfio/platform/vfio_platform_private.h
> @@ -17,6 +17,7 @@
>  
>  #include <linux/types.h>
>  #include <linux/interrupt.h>
> +#include <linux/irqbypass.h>
>  
>  #define VFIO_PLATFORM_OFFSET_SHIFT   40
>  #define VFIO_PLATFORM_OFFSET_MASK (((u64)(1) << VFIO_PLATFORM_OFFSET_SHIFT) - 1)
> @@ -37,6 +38,7 @@ struct vfio_platform_irq {
>  	spinlock_t		lock;
>  	struct virqfd		*unmask;
>  	struct virqfd		*mask;
> +	struct irq_bypass_producer producer;
>  };
>  
>  struct vfio_platform_region {




^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 03/10] VFIO: platform: single handler using function pointer
  2015-08-10 13:20 ` [PATCH v3 03/10] VFIO: platform: single handler using function pointer Eric Auger
@ 2015-08-12 18:56   ` Alex Williamson
  2015-08-17 15:25     ` Eric Auger
  0 siblings, 1 reply; 35+ messages in thread
From: Alex Williamson @ 2015-08-12 18:56 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, christoffer.dall,
	marc.zyngier, feng.wu, linux-kernel, patches, pbonzini

On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
> A single handler now is registered whatever the use case: automasked
> or not. A function pointer is set according to the wished behavior
> and the handler calls this function.
> 
> The irq lock is taken/released in the root handler. eventfd_signal can
> be called in regions not allowed to sleep.
> 
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> 
> ---
> 
> v4: creation
> ---
>  drivers/vfio/platform/vfio_platform_irq.c     | 21 +++++++++++++++------
>  drivers/vfio/platform/vfio_platform_private.h |  1 +
>  2 files changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> index 40f057a..b31b1f0 100644
> --- a/drivers/vfio/platform/vfio_platform_irq.c
> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> @@ -148,11 +148,8 @@ static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev,
>  static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
>  {
>  	struct vfio_platform_irq *irq_ctx = dev_id;
> -	unsigned long flags;
>  	int ret = IRQ_NONE;
>  
> -	spin_lock_irqsave(&irq_ctx->lock, flags);
> -
>  	if (!irq_ctx->masked) {
>  		ret = IRQ_HANDLED;
>  
> @@ -161,8 +158,6 @@ static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
>  		irq_ctx->masked = true;
>  	}
>  
> -	spin_unlock_irqrestore(&irq_ctx->lock, flags);
> -
>  	if (ret == IRQ_HANDLED)
>  		eventfd_signal(irq_ctx->trigger, 1);

Has this been run with lockdep to check whether this is safe to call
with spinlock_irqsave held?

>  
> @@ -178,6 +173,19 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
>  	return IRQ_HANDLED;
>  }
>  
> +static irqreturn_t vfio_handler(int irq, void *dev_id)
> +{
> +	struct vfio_platform_irq *irq_ctx = dev_id;
> +	unsigned long flags;
> +	irqreturn_t ret;
> +
> +	spin_lock_irqsave(&irq_ctx->lock, flags);
> +	ret = irq_ctx->handler(irq, dev_id);
> +	spin_unlock_irqrestore(&irq_ctx->lock, flags);
> +
> +	return ret;
> +}
> +
>  static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
>  {
>  }
> @@ -229,9 +237,10 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>  	}
>  
>  	irq->trigger = trigger;
> +	irq->handler = handler;
>  
>  	irq_set_status_flags(irq->hwirq, IRQ_NOAUTOEN);
> -	ret = request_irq(irq->hwirq, handler, 0, irq->name, irq);
> +	ret = request_irq(irq->hwirq, vfio_handler, 0, irq->name, irq);
>  	if (ret) {
>  		kfree(irq->name);
>  		eventfd_ctx_put(trigger);
> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
> index 8b4f814..f848a6b 100644
> --- a/drivers/vfio/platform/vfio_platform_private.h
> +++ b/drivers/vfio/platform/vfio_platform_private.h
> @@ -40,6 +40,7 @@ struct vfio_platform_irq {
>  	struct virqfd		*mask;
>  	struct irq_bypass_producer producer;
>  	bool			forwarded;
> +	irqreturn_t (*handler)(int irq, void *dev_id);
>  };
>  
>  struct vfio_platform_region {




^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 04/10] VFIO: platform: add vfio_platform_set_automasked
  2015-08-10 13:20 ` [PATCH v3 04/10] VFIO: platform: add vfio_platform_set_automasked Eric Auger
@ 2015-08-12 18:56   ` Alex Williamson
  2015-08-17 15:38     ` Eric Auger
  0 siblings, 1 reply; 35+ messages in thread
From: Alex Williamson @ 2015-08-12 18:56 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, christoffer.dall,
	marc.zyngier, feng.wu, linux-kernel, patches, pbonzini

On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
> This function makes possible to change the automasked mode.
> 
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> 
> ---
> 
> v1 -> v2:
> - set forwarded flag
> ---
>  drivers/vfio/platform/vfio_platform_irq.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> index b31b1f0..a285384 100644
> --- a/drivers/vfio/platform/vfio_platform_irq.c
> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> @@ -186,6 +186,25 @@ static irqreturn_t vfio_handler(int irq, void *dev_id)
>  	return ret;
>  }
>  
> +static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
> +					   bool automasked)
> +{
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&irq->lock, flags);
> +	if (automasked) {
> +		irq->forwarded = true;
> +		irq->flags |= VFIO_IRQ_INFO_AUTOMASKED;
> +		irq->handler = vfio_automasked_irq_handler;
> +	} else {
> +		irq->forwarded = false;
> +		irq->flags &= ~VFIO_IRQ_INFO_AUTOMASKED;
> +		irq->handler = vfio_irq_handler;
> +	}
> +	spin_unlock_irqrestore(&irq->lock, flags);
> +	return 0;

In vfio-speak, automasked means level and we're not magically changing
the IRQ from level to edge, we're simply able to handle level
differently based on a hardware optimization.  Should the user visible
flags therefore change based on this?  Aren't we really setting the
forwarded state rather than the automasked state?

> +}
> +
>  static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
>  {
>  }




^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 05/10] VFIO: platform: add vfio_platform_is_active
  2015-08-10 13:20 ` [PATCH v3 05/10] VFIO: platform: add vfio_platform_is_active Eric Auger
@ 2015-08-12 18:56   ` Alex Williamson
  2015-08-17 15:39     ` Eric Auger
  2015-09-02 19:29   ` Christoffer Dall
  1 sibling, 1 reply; 35+ messages in thread
From: Alex Williamson @ 2015-08-12 18:56 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, christoffer.dall,
	marc.zyngier, feng.wu, linux-kernel, patches, pbonzini

On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
> This function returns whether the IRQ is active at irqchip level or
> VFIO masked. If either is true, it is considered the IRQ is active.
> Currently there is no way to differentiate userspace masked IRQ from
> automasked IRQ. There might be false detection of activity. However
> it is currently acceptable to have false detection.
> 
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> 
> ---
> ---
>  drivers/vfio/platform/vfio_platform_irq.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> index a285384..efaee58 100644
> --- a/drivers/vfio/platform/vfio_platform_irq.c
> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> @@ -205,6 +205,23 @@ static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
>  	return 0;
>  }
>  
> +static int vfio_platform_is_active(struct vfio_platform_irq *irq)

vfio_platform_irq_is_active()?

> +{
> +	unsigned long flags;
> +	bool active, masked, outstanding;
> +	int ret;
> +
> +	spin_lock_irqsave(&irq->lock, flags);
> +
> +	ret = irq_get_irqchip_state(irq->hwirq, IRQCHIP_STATE_ACTIVE, &active);
> +	BUG_ON(ret);

Why can't we propagate this error to the caller and let them decide?

> +	masked = irq->masked;
> +	outstanding = active || masked;
> +
> +	spin_unlock_irqrestore(&irq->lock, flags);
> +	return outstanding;
> +}
> +
>  static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
>  {
>  }




^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 06/10] VFIO: platform: add irq bypass producer management
  2015-08-10 13:21 ` [PATCH v3 06/10] VFIO: platform: add irq bypass producer management Eric Auger
@ 2015-08-12 18:56   ` Alex Williamson
  2015-08-17 15:51     ` Eric Auger
  2015-09-02 19:32   ` Christoffer Dall
  1 sibling, 1 reply; 35+ messages in thread
From: Alex Williamson @ 2015-08-12 18:56 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, christoffer.dall,
	marc.zyngier, feng.wu, linux-kernel, patches, pbonzini

On Mon, 2015-08-10 at 15:21 +0200, Eric Auger wrote:
> This patch populates the IRQ bypass callacks:
> - stop/start producer simply consist in disabling/enabling the host irq
> - add/del consumer: basically set the automasked flag to false/true
> 
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> 
> ---
> v2 -> v3:
> - vfio_platform_irq_bypass_add_consumer now returns an error in case
>   the IRQ is recognized as active
> - active boolean not set anymore
> - do not VFIO mask the IRQ anymore on unforward
> 
> v1 -> v2:
> - device handle caching in vfio_platform_device is introduced in a
>   separate patch
> - use container_of
> ---
>  drivers/vfio/platform/vfio_platform_irq.c | 23 ++++++++++++++++++++++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> index efaee58..400a188 100644
> --- a/drivers/vfio/platform/vfio_platform_irq.c
> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> @@ -224,23 +224,44 @@ static int vfio_platform_is_active(struct vfio_platform_irq *irq)
>  
>  static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
>  {
> +	disable_irq(prod->irq);
>  }
>  
>  static void vfio_platform_irq_bypass_start(struct irq_bypass_producer *prod)
>  {
> +	enable_irq(prod->irq);
>  }
>  
>  static int vfio_platform_irq_bypass_add_consumer(
>  			struct irq_bypass_producer *prod,
>  			struct irq_bypass_consumer *cons)
>  {
> -	return 0;
> +	struct vfio_platform_irq *irq =
> +		container_of(prod, struct vfio_platform_irq, producer);
> +
> +	/*
> +	 * if the IRQ is active at irqchip level or VFIO (auto)masked
> +	 * this means the host IRQ is already under injection in the
> +	 * guest and this not safe to change the forwarding state at
> +	 * that stage.
> +	 * It is not possible to differentiate user-space masking
> +	 * from auto-masking, leading to possible false detection of
> +	 * active state.
> +	 */
> +	if (vfio_platform_is_active(irq))
> +		return -EAGAIN;

Here's an example of why we don't want WARN_ON if a registration fails,
this is effectively random.  When and how is a re-try going to happen?

> +
> +	return vfio_platform_set_automasked(irq, false);

set_forwarded just seems so much more logical here.

>  }
>  
>  static void vfio_platform_irq_bypass_del_consumer(
>  			struct irq_bypass_producer *prod,
>  			struct irq_bypass_consumer *cons)
>  {
> +	struct vfio_platform_irq *irq =
> +		container_of(prod, struct vfio_platform_irq, producer);
> +
> +	vfio_platform_set_automasked(irq, true);
>  }
>  
>  static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,




^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 01/10] VFIO: platform: registration of a dummy IRQ bypass producer
  2015-08-12 18:56   ` Alex Williamson
@ 2015-08-17 15:17     ` Eric Auger
  0 siblings, 0 replies; 35+ messages in thread
From: Eric Auger @ 2015-08-17 15:17 UTC (permalink / raw)
  To: Alex Williamson
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, christoffer.dall,
	marc.zyngier, feng.wu, linux-kernel, patches, pbonzini

Hi Alex,
On 08/12/2015 08:56 PM, Alex Williamson wrote:
> On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
>> Register a dummy producer with void callbacks
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>>
>> v2 -> v3:
>> - rename vfio_platform_irq_bypass_resume into *_start
>> ---
>>  drivers/vfio/platform/vfio_platform_irq.c     | 32 +++++++++++++++++++++++++++
>>  drivers/vfio/platform/vfio_platform_private.h |  2 ++
>>  2 files changed, 34 insertions(+)
>>
>> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
>> index 88bba57..b5cb8c7 100644
>> --- a/drivers/vfio/platform/vfio_platform_irq.c
>> +++ b/drivers/vfio/platform/vfio_platform_irq.c
>> @@ -20,6 +20,7 @@
>>  #include <linux/types.h>
>>  #include <linux/vfio.h>
>>  #include <linux/irq.h>
>> +#include <linux/irqbypass.h>
>>  
>>  #include "vfio_platform_private.h"
>>  
>> @@ -177,6 +178,27 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
>>  	return IRQ_HANDLED;
>>  }
>>  
>> +static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
>> +{
>> +}
>> +
>> +static void vfio_platform_irq_bypass_start(struct irq_bypass_producer *prod)
>> +{
>> +}
>> +
>> +static int vfio_platform_irq_bypass_add_consumer(
>> +			struct irq_bypass_producer *prod,
>> +			struct irq_bypass_consumer *cons)
>> +{
>> +	return 0;
>> +}
>> +
>> +static void vfio_platform_irq_bypass_del_consumer(
>> +			struct irq_bypass_producer *prod,
>> +			struct irq_bypass_consumer *cons)
>> +{
>> +}
>> +
>>  static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>>  			    int fd, irq_handler_t handler)
>>  {
>> @@ -186,6 +208,7 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>>  
>>  	if (irq->trigger) {
>>  		free_irq(irq->hwirq, irq);
>> +		irq_bypass_unregister_producer(&irq->producer);
>>  		kfree(irq->name);
>>  		eventfd_ctx_put(irq->trigger);
>>  		irq->trigger = NULL;
>> @@ -216,6 +239,15 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>>  		return ret;
>>  	}
>>  
>> +	irq->producer.token = (void *)trigger;
>> +	irq->producer.irq = irq->hwirq;
>> +	irq->producer.add_consumer = vfio_platform_irq_bypass_add_consumer;
>> +	irq->producer.del_consumer = vfio_platform_irq_bypass_del_consumer;
>> +	irq->producer.stop = vfio_platform_irq_bypass_stop;
>> +	irq->producer.start = vfio_platform_irq_bypass_start;
>> +	ret = irq_bypass_register_producer(&irq->producer);
>> +	WARN_ON(ret);
> 
> For what purpose?
Yes. I will replace by a simple pr_info as done in eventfd.c.

Best Regards

Eric
> 
>> +
>>  	if (!irq->masked)
>>  		enable_irq(irq->hwirq);
>>  
>> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
>> index 1c9b3d5..1d2d4d6 100644
>> --- a/drivers/vfio/platform/vfio_platform_private.h
>> +++ b/drivers/vfio/platform/vfio_platform_private.h
>> @@ -17,6 +17,7 @@
>>  
>>  #include <linux/types.h>
>>  #include <linux/interrupt.h>
>> +#include <linux/irqbypass.h>
>>  
>>  #define VFIO_PLATFORM_OFFSET_SHIFT   40
>>  #define VFIO_PLATFORM_OFFSET_MASK (((u64)(1) << VFIO_PLATFORM_OFFSET_SHIFT) - 1)
>> @@ -37,6 +38,7 @@ struct vfio_platform_irq {
>>  	spinlock_t		lock;
>>  	struct virqfd		*unmask;
>>  	struct virqfd		*mask;
>> +	struct irq_bypass_producer producer;
>>  };
>>  
>>  struct vfio_platform_region {
> 
> 
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 03/10] VFIO: platform: single handler using function pointer
  2015-08-12 18:56   ` Alex Williamson
@ 2015-08-17 15:25     ` Eric Auger
  0 siblings, 0 replies; 35+ messages in thread
From: Eric Auger @ 2015-08-17 15:25 UTC (permalink / raw)
  To: Alex Williamson
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, christoffer.dall,
	marc.zyngier, feng.wu, linux-kernel, patches, pbonzini

Alex,
On 08/12/2015 08:56 PM, Alex Williamson wrote:
> On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
>> A single handler now is registered whatever the use case: automasked
>> or not. A function pointer is set according to the wished behavior
>> and the handler calls this function.
>>
>> The irq lock is taken/released in the root handler. eventfd_signal can
>> be called in regions not allowed to sleep.
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>>
>> v4: creation
>> ---
>>  drivers/vfio/platform/vfio_platform_irq.c     | 21 +++++++++++++++------
>>  drivers/vfio/platform/vfio_platform_private.h |  1 +
>>  2 files changed, 16 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
>> index 40f057a..b31b1f0 100644
>> --- a/drivers/vfio/platform/vfio_platform_irq.c
>> +++ b/drivers/vfio/platform/vfio_platform_irq.c
>> @@ -148,11 +148,8 @@ static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev,
>>  static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
>>  {
>>  	struct vfio_platform_irq *irq_ctx = dev_id;
>> -	unsigned long flags;
>>  	int ret = IRQ_NONE;
>>  
>> -	spin_lock_irqsave(&irq_ctx->lock, flags);
>> -
>>  	if (!irq_ctx->masked) {
>>  		ret = IRQ_HANDLED;
>>  
>> @@ -161,8 +158,6 @@ static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
>>  		irq_ctx->masked = true;
>>  	}
>>  
>> -	spin_unlock_irqrestore(&irq_ctx->lock, flags);
>> -
>>  	if (ret == IRQ_HANDLED)
>>  		eventfd_signal(irq_ctx->trigger, 1);
> 
> Has this been run with lockdep to check whether this is safe to call
> with spinlock_irqsave held?

No I did not check with lockdep and I will do. There is a comment in
fs/eventfd.c in eventfd_signal function comments that says:

"This function is supposed to be called by the kernel in paths that do
not allow sleeping. In this function we allow the counter to reach the
ULLONG_MAX value, and we signal this as overflow condition by returining
a POLLERR to poll(2)."

so I understood from this it is safe.

Best Regards

Eric

> 
>>  
>> @@ -178,6 +173,19 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
>>  	return IRQ_HANDLED;
>>  }
>>  
>> +static irqreturn_t vfio_handler(int irq, void *dev_id)
>> +{
>> +	struct vfio_platform_irq *irq_ctx = dev_id;
>> +	unsigned long flags;
>> +	irqreturn_t ret;
>> +
>> +	spin_lock_irqsave(&irq_ctx->lock, flags);
>> +	ret = irq_ctx->handler(irq, dev_id);
>> +	spin_unlock_irqrestore(&irq_ctx->lock, flags);
>> +
>> +	return ret;
>> +}
>> +
>>  static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
>>  {
>>  }
>> @@ -229,9 +237,10 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>>  	}
>>  
>>  	irq->trigger = trigger;
>> +	irq->handler = handler;
>>  
>>  	irq_set_status_flags(irq->hwirq, IRQ_NOAUTOEN);
>> -	ret = request_irq(irq->hwirq, handler, 0, irq->name, irq);
>> +	ret = request_irq(irq->hwirq, vfio_handler, 0, irq->name, irq);
>>  	if (ret) {
>>  		kfree(irq->name);
>>  		eventfd_ctx_put(trigger);
>> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
>> index 8b4f814..f848a6b 100644
>> --- a/drivers/vfio/platform/vfio_platform_private.h
>> +++ b/drivers/vfio/platform/vfio_platform_private.h
>> @@ -40,6 +40,7 @@ struct vfio_platform_irq {
>>  	struct virqfd		*mask;
>>  	struct irq_bypass_producer producer;
>>  	bool			forwarded;
>> +	irqreturn_t (*handler)(int irq, void *dev_id);
>>  };
>>  
>>  struct vfio_platform_region {
> 
> 
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 04/10] VFIO: platform: add vfio_platform_set_automasked
  2015-08-12 18:56   ` Alex Williamson
@ 2015-08-17 15:38     ` Eric Auger
  2015-08-18 17:44       ` Alex Williamson
  0 siblings, 1 reply; 35+ messages in thread
From: Eric Auger @ 2015-08-17 15:38 UTC (permalink / raw)
  To: Alex Williamson
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, christoffer.dall,
	marc.zyngier, feng.wu, linux-kernel, patches, pbonzini

On 08/12/2015 08:56 PM, Alex Williamson wrote:
> On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
>> This function makes possible to change the automasked mode.
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>>
>> v1 -> v2:
>> - set forwarded flag
>> ---
>>  drivers/vfio/platform/vfio_platform_irq.c | 19 +++++++++++++++++++
>>  1 file changed, 19 insertions(+)
>>
>> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
>> index b31b1f0..a285384 100644
>> --- a/drivers/vfio/platform/vfio_platform_irq.c
>> +++ b/drivers/vfio/platform/vfio_platform_irq.c
>> @@ -186,6 +186,25 @@ static irqreturn_t vfio_handler(int irq, void *dev_id)
>>  	return ret;
>>  }
>>  
>> +static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
>> +					   bool automasked)
>> +{
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&irq->lock, flags);
>> +	if (automasked) {
>> +		irq->forwarded = true;
>> +		irq->flags |= VFIO_IRQ_INFO_AUTOMASKED;
>> +		irq->handler = vfio_automasked_irq_handler;
>> +	} else {
>> +		irq->forwarded = false;
>> +		irq->flags &= ~VFIO_IRQ_INFO_AUTOMASKED;
>> +		irq->handler = vfio_irq_handler;
>> +	}
>> +	spin_unlock_irqrestore(&irq->lock, flags);
>> +	return 0;
> 
> In vfio-speak, automasked means level and we're not magically changing
> the IRQ from level to edge, we're simply able to handle level
> differently based on a hardware optimization.  Should the user visible
> flags therefore change based on this?  Aren't we really setting the
> forwarded state rather than the automasked state?

Well actually this was following the discussion we had a long time ago
about that topic:

http://lkml.iu.edu/hypermail/linux/kernel/1409.1/03659.html

I did not really know how to conclude ...

If it is preferred I can hide this to the userspace, no problem.

Eric


> 
>> +}
>> +
>>  static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
>>  {
>>  }
> 
> 
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 05/10] VFIO: platform: add vfio_platform_is_active
  2015-08-12 18:56   ` Alex Williamson
@ 2015-08-17 15:39     ` Eric Auger
  0 siblings, 0 replies; 35+ messages in thread
From: Eric Auger @ 2015-08-17 15:39 UTC (permalink / raw)
  To: Alex Williamson
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, christoffer.dall,
	marc.zyngier, feng.wu, linux-kernel, patches, pbonzini

On 08/12/2015 08:56 PM, Alex Williamson wrote:
> On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
>> This function returns whether the IRQ is active at irqchip level or
>> VFIO masked. If either is true, it is considered the IRQ is active.
>> Currently there is no way to differentiate userspace masked IRQ from
>> automasked IRQ. There might be false detection of activity. However
>> it is currently acceptable to have false detection.
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>> ---
>>  drivers/vfio/platform/vfio_platform_irq.c | 17 +++++++++++++++++
>>  1 file changed, 17 insertions(+)
>>
>> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
>> index a285384..efaee58 100644
>> --- a/drivers/vfio/platform/vfio_platform_irq.c
>> +++ b/drivers/vfio/platform/vfio_platform_irq.c
>> @@ -205,6 +205,23 @@ static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
>>  	return 0;
>>  }
>>  
>> +static int vfio_platform_is_active(struct vfio_platform_irq *irq)
> 
> vfio_platform_irq_is_active()?
OK
> 
>> +{
>> +	unsigned long flags;
>> +	bool active, masked, outstanding;
>> +	int ret;
>> +
>> +	spin_lock_irqsave(&irq->lock, flags);
>> +
>> +	ret = irq_get_irqchip_state(irq->hwirq, IRQCHIP_STATE_ACTIVE, &active);
>> +	BUG_ON(ret);
> 
> Why can't we propagate this error to the caller and let them decide?
sure

Eric
> 
>> +	masked = irq->masked;
>> +	outstanding = active || masked;
>> +
>> +	spin_unlock_irqrestore(&irq->lock, flags);
>> +	return outstanding;
>> +}
>> +
>>  static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
>>  {
>>  }
> 
> 
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 06/10] VFIO: platform: add irq bypass producer management
  2015-08-12 18:56   ` Alex Williamson
@ 2015-08-17 15:51     ` Eric Auger
  0 siblings, 0 replies; 35+ messages in thread
From: Eric Auger @ 2015-08-17 15:51 UTC (permalink / raw)
  To: Alex Williamson
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, christoffer.dall,
	marc.zyngier, feng.wu, linux-kernel, patches, pbonzini

Hi Alex,
On 08/12/2015 08:56 PM, Alex Williamson wrote:
> On Mon, 2015-08-10 at 15:21 +0200, Eric Auger wrote:
>> This patch populates the IRQ bypass callacks:
>> - stop/start producer simply consist in disabling/enabling the host irq
>> - add/del consumer: basically set the automasked flag to false/true
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>> v2 -> v3:
>> - vfio_platform_irq_bypass_add_consumer now returns an error in case
>>   the IRQ is recognized as active
>> - active boolean not set anymore
>> - do not VFIO mask the IRQ anymore on unforward
>>
>> v1 -> v2:
>> - device handle caching in vfio_platform_device is introduced in a
>>   separate patch
>> - use container_of
>> ---
>>  drivers/vfio/platform/vfio_platform_irq.c | 23 ++++++++++++++++++++++-
>>  1 file changed, 22 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
>> index efaee58..400a188 100644
>> --- a/drivers/vfio/platform/vfio_platform_irq.c
>> +++ b/drivers/vfio/platform/vfio_platform_irq.c
>> @@ -224,23 +224,44 @@ static int vfio_platform_is_active(struct vfio_platform_irq *irq)
>>  
>>  static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
>>  {
>> +	disable_irq(prod->irq);
>>  }
>>  
>>  static void vfio_platform_irq_bypass_start(struct irq_bypass_producer *prod)
>>  {
>> +	enable_irq(prod->irq);
>>  }
>>  
>>  static int vfio_platform_irq_bypass_add_consumer(
>>  			struct irq_bypass_producer *prod,
>>  			struct irq_bypass_consumer *cons)
>>  {
>> -	return 0;
>> +	struct vfio_platform_irq *irq =
>> +		container_of(prod, struct vfio_platform_irq, producer);
>> +
>> +	/*
>> +	 * if the IRQ is active at irqchip level or VFIO (auto)masked
>> +	 * this means the host IRQ is already under injection in the
>> +	 * guest and this not safe to change the forwarding state at
>> +	 * that stage.
>> +	 * It is not possible to differentiate user-space masking
>> +	 * from auto-masking, leading to possible false detection of
>> +	 * active state.
>> +	 */
>> +	if (vfio_platform_is_active(irq))
>> +		return -EAGAIN;
> 
> Here's an example of why we don't want WARN_ON if a registration fails,
> this is effectively random.  When and how is a re-try going to happen?
To be honest I did not intend to implement any retry mechanism. I rather
intended to change the user-side which currently is not adapted to that
start-up. Typically with current QEMU code we have this 2 phase IRQ
signal startup, first set up eventfd signaling, then VFIO mask, then
turn irqfd on. With such sequence forwarding setup will fail because the
IRQ is seen as masked (false detection of activity).

Do you mandate such a retry mechanism? If forwarding fails we resort on
standard irqfd ...

I think forwarding setup should be as much static as possible (I think
this opinion also is shared by Marc?).

Please let me know your opinion on this.

Best Regards

Eric

> 
>> +
>> +	return vfio_platform_set_automasked(irq, false);
> 
> set_forwarded just seems so much more logical here.
> 
>>  }
>>  
>>  static void vfio_platform_irq_bypass_del_consumer(
>>  			struct irq_bypass_producer *prod,
>>  			struct irq_bypass_consumer *cons)
>>  {
>> +	struct vfio_platform_irq *irq =
>> +		container_of(prod, struct vfio_platform_irq, producer);
>> +
>> +	vfio_platform_set_automasked(irq, true);
>>  }
>>  
>>  static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
> 
> 
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 04/10] VFIO: platform: add vfio_platform_set_automasked
  2015-08-17 15:38     ` Eric Auger
@ 2015-08-18 17:44       ` Alex Williamson
  2015-08-31 11:43         ` Antonios Motakis
  0 siblings, 1 reply; 35+ messages in thread
From: Alex Williamson @ 2015-08-18 17:44 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, christoffer.dall,
	marc.zyngier, feng.wu, linux-kernel, patches, pbonzini

On Mon, 2015-08-17 at 17:38 +0200, Eric Auger wrote:
> On 08/12/2015 08:56 PM, Alex Williamson wrote:
> > On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
> >> This function makes possible to change the automasked mode.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> >>
> >> ---
> >>
> >> v1 -> v2:
> >> - set forwarded flag
> >> ---
> >>  drivers/vfio/platform/vfio_platform_irq.c | 19 +++++++++++++++++++
> >>  1 file changed, 19 insertions(+)
> >>
> >> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> >> index b31b1f0..a285384 100644
> >> --- a/drivers/vfio/platform/vfio_platform_irq.c
> >> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> >> @@ -186,6 +186,25 @@ static irqreturn_t vfio_handler(int irq, void *dev_id)
> >>  	return ret;
> >>  }
> >>  
> >> +static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
> >> +					   bool automasked)
> >> +{
> >> +	unsigned long flags;
> >> +
> >> +	spin_lock_irqsave(&irq->lock, flags);
> >> +	if (automasked) {
> >> +		irq->forwarded = true;
> >> +		irq->flags |= VFIO_IRQ_INFO_AUTOMASKED;
> >> +		irq->handler = vfio_automasked_irq_handler;
> >> +	} else {
> >> +		irq->forwarded = false;
> >> +		irq->flags &= ~VFIO_IRQ_INFO_AUTOMASKED;
> >> +		irq->handler = vfio_irq_handler;
> >> +	}
> >> +	spin_unlock_irqrestore(&irq->lock, flags);
> >> +	return 0;
> > 
> > In vfio-speak, automasked means level and we're not magically changing
> > the IRQ from level to edge, we're simply able to handle level
> > differently based on a hardware optimization.  Should the user visible
> > flags therefore change based on this?  Aren't we really setting the
> > forwarded state rather than the automasked state?
> 
> Well actually this was following the discussion we had a long time ago
> about that topic:
> 
> http://lkml.iu.edu/hypermail/linux/kernel/1409.1/03659.html
> 
> I did not really know how to conclude ...
> 
> If it is preferred I can hide this to the userspace, no problem.

I think that was based on the user being involved in enabling forwarding
though, now that it's hidden and automatic, it doesn't make much sense
to me to toggle any of the interrupt info details based on the state of
the forward.  The user always needs to handle the interrupt as level
since the bypass can be torn down at any point in time.  We're taking
advantage of the in-kernel path to make further optimizations, which
seems like they should be transparent to the user.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 04/10] VFIO: platform: add vfio_platform_set_automasked
  2015-08-18 17:44       ` Alex Williamson
@ 2015-08-31 11:43         ` Antonios Motakis
  2015-08-31 14:54           ` Alex Williamson
  0 siblings, 1 reply; 35+ messages in thread
From: Antonios Motakis @ 2015-08-31 11:43 UTC (permalink / raw)
  To: Alex Williamson, Eric Auger
  Cc: eric.auger, kvm, patches, marc.zyngier, linux-kernel,
	linux-arm-kernel, pbonzini, feng.wu, kvmarm



On 18-Aug-15 19:44, Alex Williamson wrote:
> On Mon, 2015-08-17 at 17:38 +0200, Eric Auger wrote:
>> On 08/12/2015 08:56 PM, Alex Williamson wrote:
>>> On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
>>>> This function makes possible to change the automasked mode.
>>>>
>>>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>>>
>>>> ---
>>>>
>>>> v1 -> v2:
>>>> - set forwarded flag
>>>> ---
>>>>  drivers/vfio/platform/vfio_platform_irq.c | 19 +++++++++++++++++++
>>>>  1 file changed, 19 insertions(+)
>>>>
>>>> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
>>>> index b31b1f0..a285384 100644
>>>> --- a/drivers/vfio/platform/vfio_platform_irq.c
>>>> +++ b/drivers/vfio/platform/vfio_platform_irq.c
>>>> @@ -186,6 +186,25 @@ static irqreturn_t vfio_handler(int irq, void *dev_id)
>>>>  	return ret;
>>>>  }
>>>>  
>>>> +static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
>>>> +					   bool automasked)
>>>> +{
>>>> +	unsigned long flags;
>>>> +
>>>> +	spin_lock_irqsave(&irq->lock, flags);
>>>> +	if (automasked) {
>>>> +		irq->forwarded = true;
>>>> +		irq->flags |= VFIO_IRQ_INFO_AUTOMASKED;
>>>> +		irq->handler = vfio_automasked_irq_handler;
>>>> +	} else {
>>>> +		irq->forwarded = false;
>>>> +		irq->flags &= ~VFIO_IRQ_INFO_AUTOMASKED;
>>>> +		irq->handler = vfio_irq_handler;
>>>> +	}
>>>> +	spin_unlock_irqrestore(&irq->lock, flags);
>>>> +	return 0;
>>>
>>> In vfio-speak, automasked means level and we're not magically changing
>>> the IRQ from level to edge, we're simply able to handle level
>>> differently based on a hardware optimization.  Should the user visible
>>> flags therefore change based on this?  Aren't we really setting the
>>> forwarded state rather than the automasked state?
>>
>> Well actually this was following the discussion we had a long time ago
>> about that topic:
>>
>> http://lkml.iu.edu/hypermail/linux/kernel/1409.1/03659.html
>>
>> I did not really know how to conclude ...
>>
>> If it is preferred I can hide this to the userspace, no problem.
> 
> I think that was based on the user being involved in enabling forwarding
> though, now that it's hidden and automatic, it doesn't make much sense
> to me to toggle any of the interrupt info details based on the state of
> the forward.  The user always needs to handle the interrupt as level
> since the bypass can be torn down at any point in time.  We're taking
> advantage of the in-kernel path to make further optimizations, which
> seems like they should be transparent to the user.  Thanks,

I wonder if it makes sense to rename VFIO_IRQ_INFO_AUTOMASKED to
VFIO_IRQ_INFO_LEVEL_TRIGGERED, and reintroduce VFIO_IRQ_INFO_AUTOMASKED as
an alias, so compatibility with user space can be maintained? This way
this semantic misunderstanding could be left behind.

Cheers,
Antonios

> 
> Alex
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> 

-- 
Antonios Motakis
Virtualization Engineer
Huawei Technologies Duesseldorf GmbH
European Research Center
Riesstrasse 25, 80992 München


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 04/10] VFIO: platform: add vfio_platform_set_automasked
  2015-08-31 11:43         ` Antonios Motakis
@ 2015-08-31 14:54           ` Alex Williamson
  0 siblings, 0 replies; 35+ messages in thread
From: Alex Williamson @ 2015-08-31 14:54 UTC (permalink / raw)
  To: Antonios Motakis
  Cc: Eric Auger, eric.auger, kvm, patches, marc.zyngier, linux-kernel,
	linux-arm-kernel, pbonzini, feng.wu, kvmarm

On Mon, 2015-08-31 at 13:43 +0200, Antonios Motakis wrote:
> 
> On 18-Aug-15 19:44, Alex Williamson wrote:
> > On Mon, 2015-08-17 at 17:38 +0200, Eric Auger wrote:
> >> On 08/12/2015 08:56 PM, Alex Williamson wrote:
> >>> On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
> >>>> This function makes possible to change the automasked mode.
> >>>>
> >>>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> >>>>
> >>>> ---
> >>>>
> >>>> v1 -> v2:
> >>>> - set forwarded flag
> >>>> ---
> >>>>  drivers/vfio/platform/vfio_platform_irq.c | 19 +++++++++++++++++++
> >>>>  1 file changed, 19 insertions(+)
> >>>>
> >>>> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> >>>> index b31b1f0..a285384 100644
> >>>> --- a/drivers/vfio/platform/vfio_platform_irq.c
> >>>> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> >>>> @@ -186,6 +186,25 @@ static irqreturn_t vfio_handler(int irq, void *dev_id)
> >>>>  	return ret;
> >>>>  }
> >>>>  
> >>>> +static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
> >>>> +					   bool automasked)
> >>>> +{
> >>>> +	unsigned long flags;
> >>>> +
> >>>> +	spin_lock_irqsave(&irq->lock, flags);
> >>>> +	if (automasked) {
> >>>> +		irq->forwarded = true;
> >>>> +		irq->flags |= VFIO_IRQ_INFO_AUTOMASKED;
> >>>> +		irq->handler = vfio_automasked_irq_handler;
> >>>> +	} else {
> >>>> +		irq->forwarded = false;
> >>>> +		irq->flags &= ~VFIO_IRQ_INFO_AUTOMASKED;
> >>>> +		irq->handler = vfio_irq_handler;
> >>>> +	}
> >>>> +	spin_unlock_irqrestore(&irq->lock, flags);
> >>>> +	return 0;
> >>>
> >>> In vfio-speak, automasked means level and we're not magically changing
> >>> the IRQ from level to edge, we're simply able to handle level
> >>> differently based on a hardware optimization.  Should the user visible
> >>> flags therefore change based on this?  Aren't we really setting the
> >>> forwarded state rather than the automasked state?
> >>
> >> Well actually this was following the discussion we had a long time ago
> >> about that topic:
> >>
> >> http://lkml.iu.edu/hypermail/linux/kernel/1409.1/03659.html
> >>
> >> I did not really know how to conclude ...
> >>
> >> If it is preferred I can hide this to the userspace, no problem.
> > 
> > I think that was based on the user being involved in enabling forwarding
> > though, now that it's hidden and automatic, it doesn't make much sense
> > to me to toggle any of the interrupt info details based on the state of
> > the forward.  The user always needs to handle the interrupt as level
> > since the bypass can be torn down at any point in time.  We're taking
> > advantage of the in-kernel path to make further optimizations, which
> > seems like they should be transparent to the user.  Thanks,
> 
> I wonder if it makes sense to rename VFIO_IRQ_INFO_AUTOMASKED to
> VFIO_IRQ_INFO_LEVEL_TRIGGERED, and reintroduce VFIO_IRQ_INFO_AUTOMASKED as
> an alias, so compatibility with user space can be maintained? This way
> this semantic misunderstanding could be left behind.

Is there really a misunderstanding here?  I think the change is that the
user was previously involved and updating the flag reinforced that.  Now
the user is not involved and the flag changing is just unexpected noise.
"automasked" is intentionally not "level" because being level triggered
may not be the only reason we'd need to mask an interrupt upon receiving
it.  We could actually have automasked edge triggered interrupts if we
wanted.  We don't do that, so we can equate automasked to level
currently, but technically automasked simply indicates that an unmask is
required to get the next interrupt.  In this case the VM is able to
effectively do the unmask itself.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 05/10] VFIO: platform: add vfio_platform_is_active
  2015-08-10 13:20 ` [PATCH v3 05/10] VFIO: platform: add vfio_platform_is_active Eric Auger
  2015-08-12 18:56   ` Alex Williamson
@ 2015-09-02 19:29   ` Christoffer Dall
  1 sibling, 0 replies; 35+ messages in thread
From: Christoffer Dall @ 2015-09-02 19:29 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, marc.zyngier,
	alex.williamson, feng.wu, linux-kernel, patches, pbonzini

On Mon, Aug 10, 2015 at 03:20:59PM +0200, Eric Auger wrote:
> This function returns whether the IRQ is active at irqchip level or
> VFIO masked. If either is true, it is considered the IRQ is active.
> Currently there is no way to differentiate userspace masked IRQ from
> automasked IRQ. There might be false detection of activity. However
> it is currently acceptable to have false detection.
> 
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> 
> ---
> ---
>  drivers/vfio/platform/vfio_platform_irq.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> index a285384..efaee58 100644
> --- a/drivers/vfio/platform/vfio_platform_irq.c
> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> @@ -205,6 +205,23 @@ static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
>  	return 0;
>  }
>  
> +static int vfio_platform_is_active(struct vfio_platform_irq *irq)
> +{
> +	unsigned long flags;
> +	bool active, masked, outstanding;
> +	int ret;
> +
> +	spin_lock_irqsave(&irq->lock, flags);
> +
> +	ret = irq_get_irqchip_state(irq->hwirq, IRQCHIP_STATE_ACTIVE, &active);
> +	BUG_ON(ret);
> +	masked = irq->masked;
> +	outstanding = active || masked;

outstanding?

why not just return active || masked ?

-Christoffer

> +
> +	spin_unlock_irqrestore(&irq->lock, flags);
> +	return outstanding;
> +}
> +
>  static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
>  {
>  }
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 06/10] VFIO: platform: add irq bypass producer management
  2015-08-10 13:21 ` [PATCH v3 06/10] VFIO: platform: add irq bypass producer management Eric Auger
  2015-08-12 18:56   ` Alex Williamson
@ 2015-09-02 19:32   ` Christoffer Dall
  1 sibling, 0 replies; 35+ messages in thread
From: Christoffer Dall @ 2015-09-02 19:32 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, marc.zyngier,
	alex.williamson, feng.wu, linux-kernel, patches, pbonzini

On Mon, Aug 10, 2015 at 03:21:00PM +0200, Eric Auger wrote:
> This patch populates the IRQ bypass callacks:
> - stop/start producer simply consist in disabling/enabling the host irq
> - add/del consumer: basically set the automasked flag to false/true
> 
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> 
> ---
> v2 -> v3:
> - vfio_platform_irq_bypass_add_consumer now returns an error in case
>   the IRQ is recognized as active
> - active boolean not set anymore
> - do not VFIO mask the IRQ anymore on unforward
> 
> v1 -> v2:
> - device handle caching in vfio_platform_device is introduced in a
>   separate patch
> - use container_of
> ---
>  drivers/vfio/platform/vfio_platform_irq.c | 23 ++++++++++++++++++++++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> index efaee58..400a188 100644
> --- a/drivers/vfio/platform/vfio_platform_irq.c
> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> @@ -224,23 +224,44 @@ static int vfio_platform_is_active(struct vfio_platform_irq *irq)
>  
>  static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
>  {
> +	disable_irq(prod->irq);
>  }
>  
>  static void vfio_platform_irq_bypass_start(struct irq_bypass_producer *prod)
>  {
> +	enable_irq(prod->irq);
>  }
>  
>  static int vfio_platform_irq_bypass_add_consumer(
>  			struct irq_bypass_producer *prod,
>  			struct irq_bypass_consumer *cons)
>  {
> -	return 0;
> +	struct vfio_platform_irq *irq =
> +		container_of(prod, struct vfio_platform_irq, producer);
> +
> +	/*
> +	 * if the IRQ is active at irqchip level or VFIO (auto)masked
           If                                       masked by VFIO?
> +	 * this means the host IRQ is already under injection in the
> +	 * guest and this not safe to change the forwarding state at
> +	 * that stage.

is this really specifically bound to guests?

> +	 * It is not possible to differentiate user-space masking
> +	 * from auto-masking, leading to possible false detection of
> +	 * active state.

ok, what is the consequence of that?

> +	 */
> +	if (vfio_platform_is_active(irq))
> +		return -EAGAIN;
> +
> +	return vfio_platform_set_automasked(irq, false);
>  }
>  
>  static void vfio_platform_irq_bypass_del_consumer(
>  			struct irq_bypass_producer *prod,
>  			struct irq_bypass_consumer *cons)
>  {
> +	struct vfio_platform_irq *irq =
> +		container_of(prod, struct vfio_platform_irq, producer);
> +
> +	vfio_platform_set_automasked(irq, true);
>  }
>  
>  static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 07/10] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices
  2015-08-10 13:21 ` [PATCH v3 07/10] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices Eric Auger
@ 2015-09-02 19:42   ` Christoffer Dall
  2015-09-08 12:04     ` Eric Auger
  2015-09-09  8:41     ` Eric Auger
  0 siblings, 2 replies; 35+ messages in thread
From: Christoffer Dall @ 2015-09-02 19:42 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, marc.zyngier,
	alex.williamson, feng.wu, linux-kernel, patches, pbonzini

On Mon, Aug 10, 2015 at 03:21:01PM +0200, Eric Auger wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
> 
> So far, the only use of the HW interrupt facility was the timer,
> implying that the active state is context-switched for each vcpu,
> as the device is is shared across all vcpus.
> 
> This does not work for a device that has been assigned to a VM,
> as the guest is entierely in control of that device (the HW is
> not shared). In that case, it makes sense to bypass the whole
> active state switching.
> 
> Also the VGIC state machine is adapted to support those assigned
> (non shared) HW IRQs:
> - nly can be sampled when it is pending
> - when queueing the IRQ (programming the LR), the pending state is
>   removed as for edge sensitive IRQs
> - queued state is not modelled. Level state is not modelled
> - its injection always is valid since steming from the HW.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> 
> ---
> 
> - a mix of
>   [PATCH v4 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for
>                    non-shared devices
>   [RFC v2 2/4] KVM: arm: vgic: fix state machine for forwarded IRQ
> ---
>  include/kvm/arm_vgic.h    |  6 +++--
>  virt/kvm/arm/arch_timer.c |  3 ++-
>  virt/kvm/arm/vgic.c       | 58 +++++++++++++++++++++++++++++++++++------------
>  3 files changed, 49 insertions(+), 18 deletions(-)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index d901f1a..7ef9ce0 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -163,7 +163,8 @@ struct irq_phys_map {
>  	u32			virt_irq;
>  	u32			phys_irq;
>  	u32			irq;
> -	bool			active;
> +	bool			shared;
> +	bool			active; /* Only valid if shared */
>  };
>  
>  struct irq_phys_map_entry {
> @@ -356,7 +357,8 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>  struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
> -					   int virt_irq, int irq);
> +					   int virt_irq, int irq,
> +					   bool shared);
>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index 76e38d2..db21d8f 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -203,7 +203,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>  	 * Tell the VGIC that the virtual interrupt is tied to a
>  	 * physical interrupt. We do that once per VCPU.
>  	 */
> -	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
> +	map = kvm_vgic_map_phys_irq(vcpu, irq->irq,
> +				    host_vtimer_irq, true);
>  	if (WARN_ON(IS_ERR(map)))
>  		return PTR_ERR(map);
>  
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 9eb489a..fbd5ba5 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -400,7 +400,11 @@ void vgic_cpu_irq_clear(struct kvm_vcpu *vcpu, int irq)
>  
>  static bool vgic_can_sample_irq(struct kvm_vcpu *vcpu, int irq)
>  {
> -	return !vgic_irq_is_queued(vcpu, irq);
> +	struct irq_phys_map *map = vgic_irq_map_search(vcpu, irq);
> +	bool shared_hw = map && !map->shared;

why is shared true when map->shared is false?

> +
> +	return !vgic_irq_is_queued(vcpu, irq) ||
> +			(shared_hw && vgic_dist_irq_is_pending(vcpu, irq));

so for forwarded, non-shared, level-triggered IRQs, we always sample the
line if it's pending?  Why?

>  }
>  
>  /**
> @@ -1150,19 +1154,26 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
>  		 * active in the physical world. Otherwise the
>  		 * physical interrupt will fire and the guest will
>  		 * exit before processing the virtual interrupt.
> +		 *
> +		 * This is of course only valid for a shared
> +		 * interrupt. A non shared interrupt should already be
> +		 * active.
>  		 */
>  		if (map) {
> -			int ret;
> -
> -			BUG_ON(!map->active);
>  			vlr.hwirq = map->phys_irq;
>  			vlr.state |= LR_HW;
>  			vlr.state &= ~LR_EOI_INT;
>  
> -			ret = irq_set_irqchip_state(map->irq,
> -						    IRQCHIP_STATE_ACTIVE,
> -						    true);
> -			WARN_ON(ret);
> +			if (map->shared) {
> +				int ret;
> +
> +				BUG_ON(!map->active);
> +				ret = irq_set_irqchip_state(
> +						map->irq,
> +						IRQCHIP_STATE_ACTIVE,
> +						true);
> +				WARN_ON(ret);
> +			}

this stuff all needs to be rebased onto my latest timer/active state
rework series.

>  
>  			/*
>  			 * Make sure we're not going to sample this
> @@ -1229,10 +1240,13 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
>  
>  static bool vgic_queue_hwirq(struct kvm_vcpu *vcpu, int irq)
>  {
> +	struct irq_phys_map *map = vgic_irq_map_search(vcpu, irq);
> +	bool shared_hw = map && !map->shared;

same question as above?

> +
>  	if (!vgic_can_sample_irq(vcpu, irq))
>  		return true; /* level interrupt, already queued */
>  
> -	if (vgic_queue_irq(vcpu, 0, irq)) {
> +	if (vgic_queue_irq(vcpu, 0, irq) || shared_hw) {
>  		if (vgic_irq_is_edge(vcpu, irq)) {
>  			vgic_dist_irq_clear_pending(vcpu, irq);
>  			vgic_cpu_irq_clear(vcpu, irq);
> @@ -1411,7 +1425,12 @@ static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
>  		return 0;
>  
>  	map = vgic_irq_map_search(vcpu, vlr.irq);
> -	BUG_ON(!map || !map->active);
> +	BUG_ON(!map);
> +
> +	if (!map->shared)
> +		return 0;
> +
> +	BUG_ON(map->shared && !map->active);
>  
>  	ret = irq_get_irqchip_state(map->irq,
>  				    IRQCHIP_STATE_ACTIVE,
> @@ -1563,6 +1582,7 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>  	int edge_triggered, level_triggered;
>  	int enabled;
>  	bool ret = true, can_inject = true;
> +	bool shared_hw = map && !map->shared;

same

>  
>  	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
>  		return -EINVAL;
> @@ -1573,7 +1593,8 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>  	edge_triggered = vgic_irq_is_edge(vcpu, irq_num);
>  	level_triggered = !edge_triggered;
>  
> -	if (!vgic_validate_injection(vcpu, irq_num, level)) {
> +	if (!vgic_validate_injection(vcpu, irq_num, level) &&
> +		!shared_hw) {
>  		ret = false;
>  		goto out;
>  	}
> @@ -1742,16 +1763,21 @@ static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
>   * @vcpu: The VCPU pointer
>   * @virt_irq: The virtual irq number
>   * @irq: The Linux IRQ number
> + * @shared: Indicates if the interrupt has to be context-switched or
> + *          if it is private to a VM
>   *
>   * Establish a mapping between a guest visible irq (@virt_irq) and a
>   * Linux irq (@irq). On injection, @virt_irq will be associated with
>   * the physical interrupt represented by @irq. This mapping can be
>   * established multiple times as long as the parameters are the same.
> + * If @shared is true, the active state of the interrupt will be
> + * context-switched.
>   *
>   * Returns a valid pointer on success, and an error pointer otherwise
>   */
>  struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
> -					   int virt_irq, int irq)
> +					   int virt_irq, int irq,
> +					   bool shared)
>  {
>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>  	struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
> @@ -1785,7 +1811,8 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>  	if (map) {
>  		/* Make sure this mapping matches */
>  		if (map->phys_irq != phys_irq	||
> -		    map->irq      != irq)
> +		    map->irq      != irq	||
> +		    map->shared   != shared)
>  			map = ERR_PTR(-EINVAL);
>  
>  		/* Found an existing, valid mapping */
> @@ -1796,6 +1823,7 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>  	map->virt_irq = virt_irq;
>  	map->phys_irq = phys_irq;
>  	map->irq      = irq;
> +	map->shared   = shared;
>  
>  	list_add_tail_rcu(&entry->entry, root);
>  
> @@ -1846,7 +1874,7 @@ static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
>   */
>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
>  {
> -	BUG_ON(!map);
> +	BUG_ON(!map || !map->shared);
>  	return map->active;
>  }
>  
> @@ -1858,7 +1886,7 @@ bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
>   */
>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active)
>  {
> -	BUG_ON(!map);
> +	BUG_ON(!map || !map->shared);
>  	map->active = active;
>  }
>  
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 09/10] KVM: arm/arm64: vgic: forwarding control
  2015-08-10 13:21 ` [PATCH v3 09/10] KVM: arm/arm64: vgic: forwarding control Eric Auger
@ 2015-09-02 19:58   ` Christoffer Dall
  2015-09-14  9:29     ` Eric Auger
  0 siblings, 1 reply; 35+ messages in thread
From: Christoffer Dall @ 2015-09-02 19:58 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, marc.zyngier,
	alex.williamson, feng.wu, linux-kernel, patches, pbonzini

On Mon, Aug 10, 2015 at 03:21:03PM +0200, Eric Auger wrote:
> Implements kvm_vgic_[set|unset]_forward.
> 
> Handle low-level VGIC programming: physical IRQ/guest IRQ mapping,
> list register cleanup, VGIC state machine. Also interacts with
> the irqchip.
> 
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> 
> ---
> 
> v2 -> v3:
> - on unforward, we do not compute & output the active state anymore.
>   This means if the unforward happens while the physical IRQ is
>   active, we will not VFIO mask the IRQ while deactiving it. If a
>   new physical IRQ hits, the corresponding virtual IRQ might not
>   be injected (hence lost) due to VGIC state machine.
> 
> bypass rfc v2:
> - use irq_set_vcpu_affinity API
> - use irq_set_irqchip_state instead of chip->irq_eoi
> 
> bypass rfc:
> - rename kvm_arch_{set|unset}_forward into
>   kvm_vgic_{set|unset}_forward. Remove __KVM_HAVE_ARCH_HALT_GUEST.
>   The function is bound to be called by ARM code only.
> 
> v4 -> v5:
> - fix arm64 compilation issues, ie. also defines
>   __KVM_HAVE_ARCH_HALT_GUEST for arm64
> 
> v3 -> v4:
> - code originally located in kvm_vfio_arm.c
> - kvm_arch_vfio_{set|unset}_forward renamed into
>   kvm_arch_{set|unset}_forward
> - split into 2 functions (set/unset) since unset does not fail anymore
> - unset can be invoked at whatever time. Extra care is taken to handle
>   transition in VGIC state machine, LR cleanup, ...
> 
> v2 -> v3:
> - renaming of kvm_arch_set_fwd_state into kvm_arch_vfio_set_forward
> - takes a bool arg instead of kvm_fwd_irq_action enum
> - removal of KVM_VFIO_IRQ_CLEANUP
> - platform device check now happens here
> - more precise errors returned
> - irq_eoi handled externally to this patch (VGIC)
> - correct enable_irq bug done twice
> - reword the commit message
> - correct check of platform_bus_type
> - use raw_spin_lock_irqsave and check the validity of the handler
> ---
>  include/kvm/arm_vgic.h |   6 ++
>  virt/kvm/arm/vgic.c    | 149 +++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 155 insertions(+)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 7ef9ce0..409ac0f 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -363,6 +363,12 @@ int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
>  
> +int kvm_vgic_set_forward(struct kvm *kvm,
> +			 unsigned int host_irq, unsigned int guest_irq);
> +
> +void kvm_vgic_unset_forward(struct kvm *kvm,
> +			    unsigned int host_irq, unsigned int guest_irq);
> +
>  #define irqchip_in_kernel(k)	(!!((k)->arch.vgic.in_kernel))
>  #define vgic_initialized(k)	(!!((k)->arch.vgic.nr_cpus))
>  #define vgic_ready(k)		((k)->arch.vgic.ready)
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 03a85b3..b15999a 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -2551,3 +2551,152 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
>  {
>  	return 0;
>  }
> +
> +/**
> + * kvm_vgic_set_forward - Set IRQ forwarding
> + *
> + * @kvm: handle to the VM
> + * @host_irq: physical IRQ number
> + * @guest_irq: virtual IRQ number
> + *
> + * This function is supposed to be called only if the IRQ
> + * is not in progress: ie. not active at GIC level and not
> + * currently under injection in the guest. The physical IRQ must
> + * also be disabled and the guest must have been exited and

when you say the guest you mean all VCPUs, right?

> + * prevented from being re-entered.
> + */
> +int kvm_vgic_set_forward(struct kvm *kvm,
> +			 unsigned int host_irq,
> +			 unsigned int guest_irq)
> +{
> +	struct irq_phys_map *map = NULL;
> +	struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, 0);
> +	int spi_id = guest_irq + VGIC_NR_PRIVATE_IRQS;
> +
> +	kvm_debug("%s host_irq=%d guest_irq=%d\n",
> +		__func__, host_irq, guest_irq);
> +
> +	if (!vcpu)
> +		return 0;
> +
> +	irq_set_vcpu_affinity(host_irq, vcpu);

why are we hard-coding this to vcpu 0 ?

missing white space before the code block.  Where does the code block
belong exactly?

> +	/*
> +	 * next physical IRQ will be be handled as forwarded

what do you mean with 'next' here?

> +	 * by the host (priority drop only)
> +	 */
> +
> +	map = kvm_vgic_map_phys_irq(vcpu, spi_id, host_irq, false);
> +	/*
> +	 * next guest_irq injection will be considered as
> +	 * forwarded and next flush will program LR
> +	 * without maintenance IRQ but with HW bit set
> +	 */

also here, I don't understand what you mean by next.

Perhaps these comments were supposed to come before the function calls?

> +	return !map;
> +}
> +
> +/**
> + * kvm_vgic_unset_forward - Unset IRQ forwarding
> + *
> + * @kvm: handle to the VM
> + * @host_irq: physical IRQ number
> + * @guest_irq: virtual IRQ number
> + *
> + * This function must be called when the host_irq is disabled
> + * and guest has been exited and prevented from being re-entered.
> + *

extra white space here

> + */
> +void kvm_vgic_unset_forward(struct kvm *kvm,
> +			    unsigned int host_irq,
> +			    unsigned int guest_irq)
> +{
> +	struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, 0);
> +	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	int ret, lr;
> +	struct vgic_lr vlr;
> +	int spi_id = guest_irq + VGIC_NR_PRIVATE_IRQS;
> +	bool queued = false, needs_deactivate = true;
> +	struct irq_phys_map *map;
> +	bool active;
> +
> +	kvm_debug("%s host_irq=%d guest_irq=%d\n",
> +		__func__, host_irq, guest_irq);
> +
> +	spin_lock(&dist->lock);
> +
> +	irq_get_irqchip_state(host_irq, IRQCHIP_STATE_ACTIVE, &active);
> +
> +	if (!vcpu)
> +		goto out;
> +
> +	map = vgic_irq_map_search(vcpu, spi_id);
> +	BUG_ON(!map);
> +	ret = kvm_vgic_unmap_phys_irq(vcpu, map);
> +	BUG_ON(ret);
> +	/*
> +	 * subsequent update_irq_pending or flush will handle this
> +	 * irq as not forwarded
> +	 */

missing white space before this comment block as well, also here with
'subsequent' do you mean "At this point" ?

> +	if (likely(!(active))) {
> +		/*
> +		 * the IRQ was not active. let's simply prepare the states
> +		 * for subsequent non forwarded injection.
> +		 */
> +		vgic_dist_irq_clear_level(vcpu, spi_id);
> +		vgic_dist_irq_clear_pending(vcpu, spi_id);
> +		vgic_irq_clear_queued(vcpu, spi_id);
> +		needs_deactivate = false;
> +		goto out;
> +	}
> +
> +	/* is there any list register with valid state? */
> +	lr = vgic_cpu->vgic_irq_lr_map[spi_id];
> +	if (lr != LR_EMPTY) {
> +		vlr = vgic_get_lr(vcpu, lr);
> +		if (vlr.state & LR_STATE_MASK)
> +			queued = true;
> +	}
> +
> +	if (!queued) {
> +		vgic_irq_clear_queued(vcpu, spi_id);
> +		if (vgic_dist_irq_is_pending(vcpu, spi_id)) {
> +			/*
> +			 * IRQ is injected but not yet queued. LR will be
> +			 * written with EOI_INT and process_maintenance will
> +			 * reset the states: queued, level(resampler). Pending
> +			 * will be reset on flush.
> +			 */
> +			vgic_dist_irq_set_level(vcpu, spi_id);

so this is all only valid for level-triggered interrupts?  Do we check
this somewhere along the call path?

> +		} else {
> +			/*
> +			 * We are somewhere before the update_irq_pending.
> +			 * we can't be sure the virtual IRQ will ever be
> +			 * injected (due to previous disable_irq).

I don't understand this.  Do we somehow know at this point that there is
a pending IRQ to inject as a virtual IRQ?

> +			 * Let's simply clear the level which was not correctly
> +			 * modelled in forwarded state.
> +			 */
> +			vgic_dist_irq_clear_level(vcpu, spi_id);
> +		}
> +		goto out;
> +	}
> +
> +	/*
> +	 * the virtual IRQ is queued and a valid LR exists, let's patch it so
> +	 * that when EOI happens a maintenance IRQ gets triggered
> +	 */
> +	vlr.state |= LR_EOI_INT;
> +	vgic_set_lr(vcpu, lr, vlr);
> +
> +	vgic_dist_irq_set_level(vcpu, spi_id);
> +	vgic_dist_irq_set_pending(vcpu, spi_id);

how do you know this is the case?  couldn't it be active in the LR?

> +	vgic_irq_set_queued(vcpu, spi_id);
> +	 /* The maintenance IRQ will reset all states above */

then why do we bother setting them to anything here?

> +
> +out:
> +	irq_set_irqchip_state(host_irq, IRQCHIP_STATE_ACTIVE, false);
> +	irq_set_vcpu_affinity(host_irq, NULL);
> +	/* next occurrence will be deactivated by the host */

I'm beginning to understand what you mean by 'next' here.

Can you make it more explicit by saying "After this function returns, a
physical IRQ will be..." ?

> +
> +	spin_unlock(&dist->lock);
> +}
> +
> -- 
> 1.9.1
> 
Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 00/10] ARM IRQ forward control based on IRQ bypass manager
  2015-08-10 13:20 [PATCH v3 00/10] ARM IRQ forward control based on IRQ bypass manager Eric Auger
                   ` (9 preceding siblings ...)
  2015-08-10 13:21 ` [PATCH v3 10/10] KVM: arm/arm64: implement IRQ bypass consumer functions Eric Auger
@ 2015-09-02 19:58 ` Christoffer Dall
  10 siblings, 0 replies; 35+ messages in thread
From: Christoffer Dall @ 2015-09-02 19:58 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, marc.zyngier,
	alex.williamson, feng.wu, linux-kernel, patches, pbonzini

On Mon, Aug 10, 2015 at 03:20:54PM +0200, Eric Auger wrote:
> This series allows to set ARM IRQ forwarding between a VFIO platform
> device physical IRQ and a guest virtual IRQ. The link is coordinated
> by the IRQ bypass manager.
> 
> The principle is the VFIO platform driver registers an IRQ bypass producer
> struct on VFIO_IRQ_SET_ACTION_TRIGGER while KVM irqfd registers a consumer
> struct on the irqfd assignment. This leads to a handshake based on the
> eventfd context (used as token) match. When either of the producer/consumer
> disappears, an unregistration occurs and the link is disconnected.
> 
> This kernel integration deprecates the former kvm-vfio approach:
> https://lkml.org/lkml/2015/4/13/353. Some rationale about that change can
> be found in IRQ bypass manager thread: https://lkml.org/lkml/2015/6/29/268
> 
> Dependencies:
> 1- [PATCH v4 00/11] arm/arm64: KVM: Active interrupt state switching
>    for shared devices (http://www.spinics.net/lists/arm-kernel/msg437884.html)
>    except PATCH 11
> 2- [PATCH 0/6] irqchip: GICv2/v3: Add support for irq_vcpu_affinity
> 3- [PATCH v4] virt: IRQ bypass manager (https://lkml.org/lkml/2015/8/6/526)
> 4- [PATCH 0/2] KVM: arm/arm64: Guest synchronous halt/resume
>    (https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg950942.html)
> 
> All those pieces can be found at:
> https://git.linaro.org/people/eric.auger/linux.git/shortlog/refs/heads/v4.2-rc6-irq-forward-v3
> 
> More backgroung on ARM IRQ forwarding in the text below and at
> http://www.linux-kvm.org/images/a/a8/01x04-ARMdevice.pdf.
> 
> A forwarded IRQ is deactivated by the guest and not by the host.
> When the guest deactivates the associated virtual IRQ, the interrupt
> controller automatically completes the physical IRQ. Obviously
> this requires some HW support in the interrupt controller. This is
> the case for ARM GICv2.
> 
> The direct benefit is that, for a level sensitive IRQ, a VM exit
> can be avoided on forwarded IRQ completion.

Would there be any benefit for edge-triggered IRQs in that another
physical IRQ won't hit before the guest is done processing the original
IRQ, potentially reducing guest interrupt handling latency?

-Christoffer

> 
> When the IRQ is forwarded, the VFIO platform driver does not need to
> mask the physical IRQ anymore before signaling the eventfd. Indeed
> genirq lowers the running priority, enabling other physical IRQ to hit
> except that one.
> 
> Besides, the injection still is based on irqfd triggering. The only
> impact on irqfd process is resamplefd is not called anymore on
> virtual IRQ completion since deactivation is not trapped by KVM.
> 
> This was tested on Calxeda Midway, assigning the xgmac main IRQ
> 
> History:
> 
> v2 (RFC) -> v3(PATCH):
> - all dependencies now have a patch status
> - we dropped the producer active boolean exchanged between the
>   VFIO producer and irqfd arm consumer. As a consequence, on
>   unforward, if the IRQ is active, this latter is deactivated
>   without VFIO-masking it. So we do not exactly come back to the
>   exact state where we would be in unforwarded state. A new
>   physical IRQ can hit while the previous virtual IRQ is under
>   treatment. Its injection in the guest may be rejected thanks
>   to the VGIC state machine. This IRQ will be lost but I don't
>   think this is a severe issue. In case no new IRQ hits, the
>   guest deactivation of the virtual IRQ will trigger the resamplefd
>   which will VFIO unmask the non-masked IRQ. This also has no
>   consequence.
> - VFIO platform driver consumer_add now can fail. It rejects the
>   transition for forwarding state in case the IRQ is active
> - the series is rebased on new irq_vcpu_affinity series
> - no dependency anymore on "chip/vgic adaptations for forwarded irq"
>   which was partially integrated into Marc's series. A fix is still
>   needed through.
> - Guest synchronous halt/resume patch re-integrated into this series
> - integrate a new patch file coming mixing
>   [PATCH v4 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for
>                    non-shared devices &
>   [RFC v2 2/4] KVM: arm: vgic: fix state machine for forwarded IRQ
> 
> v1 -> v2:
> - irq bypass manager and irqfd consumer moved in a separate patch
> - kvm_arm_[halt,resume]_guest moved in a separate patch
> - remove VFIO external functions since we do not need them anymore
> - apply container_of strategy advised by Paolo. Only active field
>   remains and discussions will tell whether we get rid of it.
> - renamed kvm_arch functions
> 
> - kvm-vfio v6 -> RFC v1 based on IRQ bypass manager
>   see previous history in https://lkml.org/lkml/2015/4/13/353).
> 
> Best Regards
> 
> Eric
> 
> 
> Eric Auger (9):
>   VFIO: platform: registration of a dummy IRQ bypass producer
>   VFIO: platform: test forwarded state when selecting the IRQ handler
>   VFIO: platform: single handler using function pointer
>   VFIO: platform: add vfio_platform_set_automasked
>   VFIO: platform: add vfio_platform_is_active
>   VFIO: platform: add irq bypass producer management
>   KVM: arm/arm64: vgic: support irqfd injection of a forwarded IRQ
>   KVM: arm/arm64: vgic: forwarding control
>   KVM: arm/arm64: implement IRQ bypass consumer functions
> 
> Marc Zyngier (1):
>   KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices
> 
>  arch/arm/kvm/Kconfig                          |   1 +
>  arch/arm/kvm/arm.c                            |  35 +++++
>  arch/arm64/kvm/Kconfig                        |   1 +
>  drivers/vfio/platform/vfio_platform_irq.c     | 113 +++++++++++++-
>  drivers/vfio/platform/vfio_platform_private.h |   4 +
>  include/kvm/arm_vgic.h                        |  12 +-
>  virt/kvm/arm/arch_timer.c                     |   3 +-
>  virt/kvm/arm/vgic.c                           | 215 ++++++++++++++++++++++++--
>  8 files changed, 358 insertions(+), 26 deletions(-)
> 
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 07/10] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices
  2015-09-02 19:42   ` Christoffer Dall
@ 2015-09-08 12:04     ` Eric Auger
  2015-09-14 10:59       ` Christoffer Dall
  2015-09-09  8:41     ` Eric Auger
  1 sibling, 1 reply; 35+ messages in thread
From: Eric Auger @ 2015-09-08 12:04 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, marc.zyngier,
	alex.williamson, feng.wu, linux-kernel, patches, pbonzini

Hi Christoffer,
On 09/02/2015 09:42 PM, Christoffer Dall wrote:
> On Mon, Aug 10, 2015 at 03:21:01PM +0200, Eric Auger wrote:
>> From: Marc Zyngier <marc.zyngier@arm.com>
>>
>> So far, the only use of the HW interrupt facility was the timer,
>> implying that the active state is context-switched for each vcpu,
>> as the device is is shared across all vcpus.
>>
>> This does not work for a device that has been assigned to a VM,
>> as the guest is entierely in control of that device (the HW is
>> not shared). In that case, it makes sense to bypass the whole
>> active state switching.
>>
>> Also the VGIC state machine is adapted to support those assigned
>> (non shared) HW IRQs:
>> - nly can be sampled when it is pending
>> - when queueing the IRQ (programming the LR), the pending state is
>>   removed as for edge sensitive IRQs
>> - queued state is not modelled. Level state is not modelled
>> - its injection always is valid since steming from the HW.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>>
>> - a mix of
>>   [PATCH v4 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for
>>                    non-shared devices
>>   [RFC v2 2/4] KVM: arm: vgic: fix state machine for forwarded IRQ
>> ---
>>  include/kvm/arm_vgic.h    |  6 +++--
>>  virt/kvm/arm/arch_timer.c |  3 ++-
>>  virt/kvm/arm/vgic.c       | 58 +++++++++++++++++++++++++++++++++++------------
>>  3 files changed, 49 insertions(+), 18 deletions(-)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index d901f1a..7ef9ce0 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -163,7 +163,8 @@ struct irq_phys_map {
>>  	u32			virt_irq;
>>  	u32			phys_irq;
>>  	u32			irq;
>> -	bool			active;
>> +	bool			shared;
>> +	bool			active; /* Only valid if shared */
>>  };
>>  
>>  struct irq_phys_map_entry {
>> @@ -356,7 +357,8 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>>  struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>> -					   int virt_irq, int irq);
>> +					   int virt_irq, int irq,
>> +					   bool shared);
>>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
>>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
>>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>> index 76e38d2..db21d8f 100644
>> --- a/virt/kvm/arm/arch_timer.c
>> +++ b/virt/kvm/arm/arch_timer.c
>> @@ -203,7 +203,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>>  	 * Tell the VGIC that the virtual interrupt is tied to a
>>  	 * physical interrupt. We do that once per VCPU.
>>  	 */
>> -	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
>> +	map = kvm_vgic_map_phys_irq(vcpu, irq->irq,
>> +				    host_vtimer_irq, true);
>>  	if (WARN_ON(IS_ERR(map)))
>>  		return PTR_ERR(map);
>>  
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index 9eb489a..fbd5ba5 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -400,7 +400,11 @@ void vgic_cpu_irq_clear(struct kvm_vcpu *vcpu, int irq)
>>  
>>  static bool vgic_can_sample_irq(struct kvm_vcpu *vcpu, int irq)
>>  {
>> -	return !vgic_irq_is_queued(vcpu, irq);
>> +	struct irq_phys_map *map = vgic_irq_map_search(vcpu, irq);
>> +	bool shared_hw = map && !map->shared;
> 
> why is shared true when map->shared is false?
definitively upside down
> 
>> +
>> +	return !vgic_irq_is_queued(vcpu, irq) ||
>> +			(shared_hw && vgic_dist_irq_is_pending(vcpu, irq));
> 
> so for forwarded, non-shared, level-triggered IRQs, we always sample the
> line if it's pending?  Why?

No we only sampled it if it was pending. The pending state was reset
when programming the LR.

Now that we model the queued state for mapped IRQ I will use that instead
> 
>>  }
>>  
>>  /**
>> @@ -1150,19 +1154,26 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
>>  		 * active in the physical world. Otherwise the
>>  		 * physical interrupt will fire and the guest will
>>  		 * exit before processing the virtual interrupt.
>> +		 *
>> +		 * This is of course only valid for a shared
>> +		 * interrupt. A non shared interrupt should already be
>> +		 * active.
>>  		 */
>>  		if (map) {
>> -			int ret;
>> -
>> -			BUG_ON(!map->active);
>>  			vlr.hwirq = map->phys_irq;
>>  			vlr.state |= LR_HW;
>>  			vlr.state &= ~LR_EOI_INT;
>>  
>> -			ret = irq_set_irqchip_state(map->irq,
>> -						    IRQCHIP_STATE_ACTIVE,
>> -						    true);
>> -			WARN_ON(ret);
>> +			if (map->shared) {
>> +				int ret;
>> +
>> +				BUG_ON(!map->active);
>> +				ret = irq_set_irqchip_state(
>> +						map->irq,
>> +						IRQCHIP_STATE_ACTIVE,
>> +						true);
>> +				WARN_ON(ret);
>> +			}
> 
> this stuff all needs to be rebased onto my latest timer/active state
> rework series.
sure, ongoing ...

Thanks

Eric
> 
>>  
>>  			/*
>>  			 * Make sure we're not going to sample this
>> @@ -1229,10 +1240,13 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
>>  
>>  static bool vgic_queue_hwirq(struct kvm_vcpu *vcpu, int irq)
>>  {
>> +	struct irq_phys_map *map = vgic_irq_map_search(vcpu, irq);
>> +	bool shared_hw = map && !map->shared;
> 
> same question as above?
> 
>> +
>>  	if (!vgic_can_sample_irq(vcpu, irq))
>>  		return true; /* level interrupt, already queued */
>>  
>> -	if (vgic_queue_irq(vcpu, 0, irq)) {
>> +	if (vgic_queue_irq(vcpu, 0, irq) || shared_hw) {
>>  		if (vgic_irq_is_edge(vcpu, irq)) {
>>  			vgic_dist_irq_clear_pending(vcpu, irq);
>>  			vgic_cpu_irq_clear(vcpu, irq);
>> @@ -1411,7 +1425,12 @@ static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
>>  		return 0;
>>  
>>  	map = vgic_irq_map_search(vcpu, vlr.irq);
>> -	BUG_ON(!map || !map->active);
>> +	BUG_ON(!map);
>> +
>> +	if (!map->shared)
>> +		return 0;
>> +
>> +	BUG_ON(map->shared && !map->active);
>>  
>>  	ret = irq_get_irqchip_state(map->irq,
>>  				    IRQCHIP_STATE_ACTIVE,
>> @@ -1563,6 +1582,7 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>  	int edge_triggered, level_triggered;
>>  	int enabled;
>>  	bool ret = true, can_inject = true;
>> +	bool shared_hw = map && !map->shared;
> 
> same
> 
>>  
>>  	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
>>  		return -EINVAL;
>> @@ -1573,7 +1593,8 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>  	edge_triggered = vgic_irq_is_edge(vcpu, irq_num);
>>  	level_triggered = !edge_triggered;
>>  
>> -	if (!vgic_validate_injection(vcpu, irq_num, level)) {
>> +	if (!vgic_validate_injection(vcpu, irq_num, level) &&
>> +		!shared_hw) {
>>  		ret = false;
>>  		goto out;
>>  	}
>> @@ -1742,16 +1763,21 @@ static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
>>   * @vcpu: The VCPU pointer
>>   * @virt_irq: The virtual irq number
>>   * @irq: The Linux IRQ number
>> + * @shared: Indicates if the interrupt has to be context-switched or
>> + *          if it is private to a VM
>>   *
>>   * Establish a mapping between a guest visible irq (@virt_irq) and a
>>   * Linux irq (@irq). On injection, @virt_irq will be associated with
>>   * the physical interrupt represented by @irq. This mapping can be
>>   * established multiple times as long as the parameters are the same.
>> + * If @shared is true, the active state of the interrupt will be
>> + * context-switched.
>>   *
>>   * Returns a valid pointer on success, and an error pointer otherwise
>>   */
>>  struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>> -					   int virt_irq, int irq)
>> +					   int virt_irq, int irq,
>> +					   bool shared)
>>  {
>>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>  	struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
>> @@ -1785,7 +1811,8 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>>  	if (map) {
>>  		/* Make sure this mapping matches */
>>  		if (map->phys_irq != phys_irq	||
>> -		    map->irq      != irq)
>> +		    map->irq      != irq	||
>> +		    map->shared   != shared)
>>  			map = ERR_PTR(-EINVAL);
>>  
>>  		/* Found an existing, valid mapping */
>> @@ -1796,6 +1823,7 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>>  	map->virt_irq = virt_irq;
>>  	map->phys_irq = phys_irq;
>>  	map->irq      = irq;
>> +	map->shared   = shared;
>>  
>>  	list_add_tail_rcu(&entry->entry, root);
>>  
>> @@ -1846,7 +1874,7 @@ static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
>>   */
>>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
>>  {
>> -	BUG_ON(!map);
>> +	BUG_ON(!map || !map->shared);
>>  	return map->active;
>>  }
>>  
>> @@ -1858,7 +1886,7 @@ bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
>>   */
>>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active)
>>  {
>> -	BUG_ON(!map);
>> +	BUG_ON(!map || !map->shared);
>>  	map->active = active;
>>  }
>>  
>> -- 
>> 1.9.1
>>


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 07/10] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices
  2015-09-02 19:42   ` Christoffer Dall
  2015-09-08 12:04     ` Eric Auger
@ 2015-09-09  8:41     ` Eric Auger
  2015-09-14 11:11       ` Christoffer Dall
  1 sibling, 1 reply; 35+ messages in thread
From: Eric Auger @ 2015-09-09  8:41 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, marc.zyngier,
	alex.williamson, feng.wu, linux-kernel, patches, pbonzini

Hi Christoffer,
On 09/02/2015 09:42 PM, Christoffer Dall wrote:
> On Mon, Aug 10, 2015 at 03:21:01PM +0200, Eric Auger wrote:
>> From: Marc Zyngier <marc.zyngier@arm.com>
>>
>> So far, the only use of the HW interrupt facility was the timer,
>> implying that the active state is context-switched for each vcpu,
>> as the device is is shared across all vcpus.
>>
>> This does not work for a device that has been assigned to a VM,
>> as the guest is entierely in control of that device (the HW is
>> not shared). In that case, it makes sense to bypass the whole
>> active state switching.
>>
>> Also the VGIC state machine is adapted to support those assigned
>> (non shared) HW IRQs:
>> - nly can be sampled when it is pending
>> - when queueing the IRQ (programming the LR), the pending state is
>>   removed as for edge sensitive IRQs
>> - queued state is not modelled. Level state is not modelled
>> - its injection always is valid since steming from the HW.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>>
>> - a mix of
>>   [PATCH v4 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for
>>                    non-shared devices
>>   [RFC v2 2/4] KVM: arm: vgic: fix state machine for forwarded IRQ
>> ---
>>  include/kvm/arm_vgic.h    |  6 +++--
>>  virt/kvm/arm/arch_timer.c |  3 ++-
>>  virt/kvm/arm/vgic.c       | 58 +++++++++++++++++++++++++++++++++++------------
>>  3 files changed, 49 insertions(+), 18 deletions(-)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index d901f1a..7ef9ce0 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -163,7 +163,8 @@ struct irq_phys_map {
>>  	u32			virt_irq;
>>  	u32			phys_irq;
>>  	u32			irq;
>> -	bool			active;
>> +	bool			shared;
>> +	bool			active; /* Only valid if shared */
>>  };
>>  
>>  struct irq_phys_map_entry {
>> @@ -356,7 +357,8 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>>  struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>> -					   int virt_irq, int irq);
>> +					   int virt_irq, int irq,
>> +					   bool shared);
>>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
>>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
>>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>> index 76e38d2..db21d8f 100644
>> --- a/virt/kvm/arm/arch_timer.c
>> +++ b/virt/kvm/arm/arch_timer.c
>> @@ -203,7 +203,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>>  	 * Tell the VGIC that the virtual interrupt is tied to a
>>  	 * physical interrupt. We do that once per VCPU.
>>  	 */
>> -	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
>> +	map = kvm_vgic_map_phys_irq(vcpu, irq->irq,
>> +				    host_vtimer_irq, true);
>>  	if (WARN_ON(IS_ERR(map)))
>>  		return PTR_ERR(map);
>>  
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index 9eb489a..fbd5ba5 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -400,7 +400,11 @@ void vgic_cpu_irq_clear(struct kvm_vcpu *vcpu, int irq)
>>  
>>  static bool vgic_can_sample_irq(struct kvm_vcpu *vcpu, int irq)
>>  {
>> -	return !vgic_irq_is_queued(vcpu, irq);
>> +	struct irq_phys_map *map = vgic_irq_map_search(vcpu, irq);
>> +	bool shared_hw = map && !map->shared;
> 
> why is shared true when map->shared is false?
> 
>> +
>> +	return !vgic_irq_is_queued(vcpu, irq) ||
>> +			(shared_hw && vgic_dist_irq_is_pending(vcpu, irq));
> 
> so for forwarded, non-shared, level-triggered IRQs, we always sample the
> line if it's pending?  Why?
I tried to integrate into the updated state machine for non shared
mapped IRQ but I fail.

1) The first problem encountered is how to reset the level of the IRQ
(since its completion is not trapped). I added this reset in
process_queued_irq. I think this was the most natural place since at
sink time we get aware the IRQ is deactivated at physical distributor
level. However I observe  failures in vgic_validate_injection. I think
there is due to a race between update_irq_pending and sync. As soon as
the guest EOI's the virtual IRQ (and also the pIRQ), a new physical IRQ
hits and gets injected by irqfd. This injection can happen before the
sync. So I would be tempted to keep my current strategy of ignoring the
validate_injection in case of non-shared mapped IRQ and not model the
level state. The vIRQ directly comes from the HW so it must be valid
(guest deactivated the previous occurence).

2) can_sample.
once the above problem fixed, next issue is the can_sample failure.
Queued state also is reset in process_queued_irq and can_sample fails.
Same punishment. So currently the only manner I found to make this work
and sample the IRQ only once is to use the pending state which I reset
when I queue the IRQ.

So currently I don't think non-shared mapped IRQ fit the updated state
machine.

Any Thoughts?

Eric
> 
>>  }
>>  
>>  /**
>> @@ -1150,19 +1154,26 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
>>  		 * active in the physical world. Otherwise the
>>  		 * physical interrupt will fire and the guest will
>>  		 * exit before processing the virtual interrupt.
>> +		 *
>> +		 * This is of course only valid for a shared
>> +		 * interrupt. A non shared interrupt should already be
>> +		 * active.
>>  		 */
>>  		if (map) {
>> -			int ret;
>> -
>> -			BUG_ON(!map->active);
>>  			vlr.hwirq = map->phys_irq;
>>  			vlr.state |= LR_HW;
>>  			vlr.state &= ~LR_EOI_INT;
>>  
>> -			ret = irq_set_irqchip_state(map->irq,
>> -						    IRQCHIP_STATE_ACTIVE,
>> -						    true);
>> -			WARN_ON(ret);
>> +			if (map->shared) {
>> +				int ret;
>> +
>> +				BUG_ON(!map->active);
>> +				ret = irq_set_irqchip_state(
>> +						map->irq,
>> +						IRQCHIP_STATE_ACTIVE,
>> +						true);
>> +				WARN_ON(ret);
>> +			}
> 
> this stuff all needs to be rebased onto my latest timer/active state
> rework series.
> 
>>  
>>  			/*
>>  			 * Make sure we're not going to sample this
>> @@ -1229,10 +1240,13 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
>>  
>>  static bool vgic_queue_hwirq(struct kvm_vcpu *vcpu, int irq)
>>  {
>> +	struct irq_phys_map *map = vgic_irq_map_search(vcpu, irq);
>> +	bool shared_hw = map && !map->shared;
> 
> same question as above?
> 
>> +
>>  	if (!vgic_can_sample_irq(vcpu, irq))
>>  		return true; /* level interrupt, already queued */
>>  
>> -	if (vgic_queue_irq(vcpu, 0, irq)) {
>> +	if (vgic_queue_irq(vcpu, 0, irq) || shared_hw) {
>>  		if (vgic_irq_is_edge(vcpu, irq)) {
>>  			vgic_dist_irq_clear_pending(vcpu, irq);
>>  			vgic_cpu_irq_clear(vcpu, irq);
>> @@ -1411,7 +1425,12 @@ static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
>>  		return 0;
>>  
>>  	map = vgic_irq_map_search(vcpu, vlr.irq);
>> -	BUG_ON(!map || !map->active);
>> +	BUG_ON(!map);
>> +
>> +	if (!map->shared)
>> +		return 0;
>> +
>> +	BUG_ON(map->shared && !map->active);
>>  
>>  	ret = irq_get_irqchip_state(map->irq,
>>  				    IRQCHIP_STATE_ACTIVE,
>> @@ -1563,6 +1582,7 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>  	int edge_triggered, level_triggered;
>>  	int enabled;
>>  	bool ret = true, can_inject = true;
>> +	bool shared_hw = map && !map->shared;
> 
> same
> 
>>  
>>  	if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
>>  		return -EINVAL;
>> @@ -1573,7 +1593,8 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
>>  	edge_triggered = vgic_irq_is_edge(vcpu, irq_num);
>>  	level_triggered = !edge_triggered;
>>  
>> -	if (!vgic_validate_injection(vcpu, irq_num, level)) {
>> +	if (!vgic_validate_injection(vcpu, irq_num, level) &&
>> +		!shared_hw) {
>>  		ret = false;
>>  		goto out;
>>  	}
>> @@ -1742,16 +1763,21 @@ static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
>>   * @vcpu: The VCPU pointer
>>   * @virt_irq: The virtual irq number
>>   * @irq: The Linux IRQ number
>> + * @shared: Indicates if the interrupt has to be context-switched or
>> + *          if it is private to a VM
>>   *
>>   * Establish a mapping between a guest visible irq (@virt_irq) and a
>>   * Linux irq (@irq). On injection, @virt_irq will be associated with
>>   * the physical interrupt represented by @irq. This mapping can be
>>   * established multiple times as long as the parameters are the same.
>> + * If @shared is true, the active state of the interrupt will be
>> + * context-switched.
>>   *
>>   * Returns a valid pointer on success, and an error pointer otherwise
>>   */
>>  struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>> -					   int virt_irq, int irq)
>> +					   int virt_irq, int irq,
>> +					   bool shared)
>>  {
>>  	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>  	struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
>> @@ -1785,7 +1811,8 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>>  	if (map) {
>>  		/* Make sure this mapping matches */
>>  		if (map->phys_irq != phys_irq	||
>> -		    map->irq      != irq)
>> +		    map->irq      != irq	||
>> +		    map->shared   != shared)
>>  			map = ERR_PTR(-EINVAL);
>>  
>>  		/* Found an existing, valid mapping */
>> @@ -1796,6 +1823,7 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>>  	map->virt_irq = virt_irq;
>>  	map->phys_irq = phys_irq;
>>  	map->irq      = irq;
>> +	map->shared   = shared;
>>  
>>  	list_add_tail_rcu(&entry->entry, root);
>>  
>> @@ -1846,7 +1874,7 @@ static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
>>   */
>>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
>>  {
>> -	BUG_ON(!map);
>> +	BUG_ON(!map || !map->shared);
>>  	return map->active;
>>  }
>>  
>> @@ -1858,7 +1886,7 @@ bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
>>   */
>>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active)
>>  {
>> -	BUG_ON(!map);
>> +	BUG_ON(!map || !map->shared);
>>  	map->active = active;
>>  }
>>  
>> -- 
>> 1.9.1
>>


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 09/10] KVM: arm/arm64: vgic: forwarding control
  2015-09-02 19:58   ` Christoffer Dall
@ 2015-09-14  9:29     ` Eric Auger
  2015-09-14 11:24       ` Christoffer Dall
  0 siblings, 1 reply; 35+ messages in thread
From: Eric Auger @ 2015-09-14  9:29 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, marc.zyngier,
	alex.williamson, feng.wu, linux-kernel, patches, pbonzini

Christoffer,
On 09/02/2015 09:58 PM, Christoffer Dall wrote:
> On Mon, Aug 10, 2015 at 03:21:03PM +0200, Eric Auger wrote:
>> Implements kvm_vgic_[set|unset]_forward.
>>
>> Handle low-level VGIC programming: physical IRQ/guest IRQ mapping,
>> list register cleanup, VGIC state machine. Also interacts with
>> the irqchip.
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>>
>> v2 -> v3:
>> - on unforward, we do not compute & output the active state anymore.
>>   This means if the unforward happens while the physical IRQ is
>>   active, we will not VFIO mask the IRQ while deactiving it. If a
>>   new physical IRQ hits, the corresponding virtual IRQ might not
>>   be injected (hence lost) due to VGIC state machine.
>>
>> bypass rfc v2:
>> - use irq_set_vcpu_affinity API
>> - use irq_set_irqchip_state instead of chip->irq_eoi
>>
>> bypass rfc:
>> - rename kvm_arch_{set|unset}_forward into
>>   kvm_vgic_{set|unset}_forward. Remove __KVM_HAVE_ARCH_HALT_GUEST.
>>   The function is bound to be called by ARM code only.
>>
>> v4 -> v5:
>> - fix arm64 compilation issues, ie. also defines
>>   __KVM_HAVE_ARCH_HALT_GUEST for arm64
>>
>> v3 -> v4:
>> - code originally located in kvm_vfio_arm.c
>> - kvm_arch_vfio_{set|unset}_forward renamed into
>>   kvm_arch_{set|unset}_forward
>> - split into 2 functions (set/unset) since unset does not fail anymore
>> - unset can be invoked at whatever time. Extra care is taken to handle
>>   transition in VGIC state machine, LR cleanup, ...
>>
>> v2 -> v3:
>> - renaming of kvm_arch_set_fwd_state into kvm_arch_vfio_set_forward
>> - takes a bool arg instead of kvm_fwd_irq_action enum
>> - removal of KVM_VFIO_IRQ_CLEANUP
>> - platform device check now happens here
>> - more precise errors returned
>> - irq_eoi handled externally to this patch (VGIC)
>> - correct enable_irq bug done twice
>> - reword the commit message
>> - correct check of platform_bus_type
>> - use raw_spin_lock_irqsave and check the validity of the handler
>> ---
>>  include/kvm/arm_vgic.h |   6 ++
>>  virt/kvm/arm/vgic.c    | 149 +++++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 155 insertions(+)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index 7ef9ce0..409ac0f 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -363,6 +363,12 @@ int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
>>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
>>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
>>  
>> +int kvm_vgic_set_forward(struct kvm *kvm,
>> +			 unsigned int host_irq, unsigned int guest_irq);
>> +
>> +void kvm_vgic_unset_forward(struct kvm *kvm,
>> +			    unsigned int host_irq, unsigned int guest_irq);
>> +
>>  #define irqchip_in_kernel(k)	(!!((k)->arch.vgic.in_kernel))
>>  #define vgic_initialized(k)	(!!((k)->arch.vgic.nr_cpus))
>>  #define vgic_ready(k)		((k)->arch.vgic.ready)
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index 03a85b3..b15999a 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -2551,3 +2551,152 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
>>  {
>>  	return 0;
>>  }
>> +
>> +/**
>> + * kvm_vgic_set_forward - Set IRQ forwarding
>> + *
>> + * @kvm: handle to the VM
>> + * @host_irq: physical IRQ number
>> + * @guest_irq: virtual IRQ number
>> + *
>> + * This function is supposed to be called only if the IRQ
>> + * is not in progress: ie. not active at GIC level and not
>> + * currently under injection in the guest. The physical IRQ must
>> + * also be disabled and the guest must have been exited and
> 
> when you say the guest you mean all VCPUs, right?

yes that's correct.
> 
>> + * prevented from being re-entered.
>> + */
>> +int kvm_vgic_set_forward(struct kvm *kvm,
>> +			 unsigned int host_irq,
>> +			 unsigned int guest_irq)
>> +{
>> +	struct irq_phys_map *map = NULL;
>> +	struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, 0);
>> +	int spi_id = guest_irq + VGIC_NR_PRIVATE_IRQS;
>> +
>> +	kvm_debug("%s host_irq=%d guest_irq=%d\n",
>> +		__func__, host_irq, guest_irq);
>> +
>> +	if (!vcpu)
>> +		return 0;
>> +
>> +	irq_set_vcpu_affinity(host_irq, vcpu);
> 
> why are we hard-coding this to vcpu 0 ?
is that OK if I use dist->irq_spi_target[guest_irq]
> 
> missing white space before the code block.  Where does the code block
> belong exactly?
> 
>> +	/*
>> +	 * next physical IRQ will be be handled as forwarded
> 
> what do you mean with 'next' here?
I mean the next occurrence of the same physical IRQ
> 
>> +	 * by the host (priority drop only)
>> +	 */
>> +
>> +	map = kvm_vgic_map_phys_irq(vcpu, spi_id, host_irq, false);
>> +	/*
>> +	 * next guest_irq injection will be considered as
>> +	 * forwarded and next flush will program LR
>> +	 * without maintenance IRQ but with HW bit set
>> +	 */
> 
> also here, I don't understand what you mean by next.
I meant the next occurrence of the same IRQ injection into the guest
> 
> Perhaps these comments were supposed to come before the function calls?
OK I see what you mean, I will rephrase to set comments before the
function calls.
> 
>> +	return !map;
>> +}
>> +
>> +/**
>> + * kvm_vgic_unset_forward - Unset IRQ forwarding
>> + *
>> + * @kvm: handle to the VM
>> + * @host_irq: physical IRQ number
>> + * @guest_irq: virtual IRQ number
>> + *
>> + * This function must be called when the host_irq is disabled
>> + * and guest has been exited and prevented from being re-entered.
>> + *
> 
> extra white space here
OK
> 
>> + */
>> +void kvm_vgic_unset_forward(struct kvm *kvm,
>> +			    unsigned int host_irq,
>> +			    unsigned int guest_irq)
>> +{
>> +	struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, 0);
>> +	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> +	struct vgic_dist *dist = &kvm->arch.vgic;
>> +	int ret, lr;
>> +	struct vgic_lr vlr;
>> +	int spi_id = guest_irq + VGIC_NR_PRIVATE_IRQS;
>> +	bool queued = false, needs_deactivate = true;
>> +	struct irq_phys_map *map;
>> +	bool active;
>> +
>> +	kvm_debug("%s host_irq=%d guest_irq=%d\n",
>> +		__func__, host_irq, guest_irq);
>> +
>> +	spin_lock(&dist->lock);
>> +
>> +	irq_get_irqchip_state(host_irq, IRQCHIP_STATE_ACTIVE, &active);
>> +
>> +	if (!vcpu)
>> +		goto out;
>> +
>> +	map = vgic_irq_map_search(vcpu, spi_id);
>> +	BUG_ON(!map);
>> +	ret = kvm_vgic_unmap_phys_irq(vcpu, map);
>> +	BUG_ON(ret);
>> +	/*
>> +	 * subsequent update_irq_pending or flush will handle this
>> +	 * irq as not forwarded
>> +	 */
> 
> missing white space before this comment block as well, also here with
> 'subsequent' do you mean "At this point" ?
I mean any new execution of update_irq_pending/flush will handle this
irq as not forwarded. Since the state machine for forwarded IRQ and non
forwarded IRQ is different, we need to convert the states into the new
state machine's ones.
> 
>> +	if (likely(!(active))) {
>> +		/*
>> +		 * the IRQ was not active. let's simply prepare the states
>> +		 * for subsequent non forwarded injection.
>> +		 */
>> +		vgic_dist_irq_clear_level(vcpu, spi_id);
>> +		vgic_dist_irq_clear_pending(vcpu, spi_id);
>> +		vgic_irq_clear_queued(vcpu, spi_id);
>> +		needs_deactivate = false;
>> +		goto out;
>> +	}
>> +
>> +	/* is there any list register with valid state? */
>> +	lr = vgic_cpu->vgic_irq_lr_map[spi_id];
>> +	if (lr != LR_EMPTY) {
>> +		vlr = vgic_get_lr(vcpu, lr);
>> +		if (vlr.state & LR_STATE_MASK)
>> +			queued = true;
>> +	}
>> +
>> +	if (!queued) {
>> +		vgic_irq_clear_queued(vcpu, spi_id);
>> +		if (vgic_dist_irq_is_pending(vcpu, spi_id)) {
>> +			/*
>> +			 * IRQ is injected but not yet queued. LR will be
>> +			 * written with EOI_INT and process_maintenance will
>> +			 * reset the states: queued, level(resampler). Pending
>> +			 * will be reset on flush.
>> +			 */
>> +			vgic_dist_irq_set_level(vcpu, spi_id);
> 
> so this is all only valid for level-triggered interrupts?  Do we check
> this somewhere along the call path?
Up to now I only tested with level-sensitive IRQs. Now I have the means
I will also test with non shared edge-sensitive mapped IRQs.
> 
>> +		} else {
>> +			/*
>> +			 * We are somewhere before the update_irq_pending.
>> +			 * we can't be sure the virtual IRQ will ever be
>> +			 * injected (due to previous disable_irq).
> 
> I don't understand this.  Do we somehow know at this point that there is
> a pending IRQ to inject as a virtual IRQ?
No we just know the physical IRQ is active at physical distributor level.

There can be a long path before the virtual IRQ gets injected into the
guest:
phys IRQ hits (1) -> genirq handler(2) -> VFIO handler(3) ->
update_irq_pending in irqfd thread (4) -> flush (LR programming,5) ->
guest deactivates the virq/pirq (6)

There is no valid state LR. There is no pending state recorded. So we
are between (1) an (4) excluded.


However we cannot be sure a virtual IRQ will ever be injected. Typically
since the physical IRQ was disabled previously, it might happen that the
VFIO physical handler is never reached and irqfd is never signalled.
(genirq handler never calls the VFIO handler)
> 
>> +			 * Let's simply clear the level which was not correctly
>> +			 * modelled in forwarded state.
>> +			 */
>> +			vgic_dist_irq_clear_level(vcpu, spi_id);
>> +		}
>> +		goto out;
>> +	}
>> +
>> +	/*
>> +	 * the virtual IRQ is queued and a valid LR exists, let's patch it so
>> +	 * that when EOI happens a maintenance IRQ gets triggered
>> +	 */
>> +	vlr.state |= LR_EOI_INT;
>> +	vgic_set_lr(vcpu, lr, vlr);
>> +
>> +	vgic_dist_irq_set_level(vcpu, spi_id);
>> +	vgic_dist_irq_set_pending(vcpu, spi_id);
> 
> how do you know this is the case?  couldn't it be active in the LR?
for a level sensitive IRQ the pending state stays active until the EOI
maintenance gets called right? pending is set by vgic_queue_hwirq and
reset by process_queued_irq?
> 
>> +	vgic_irq_set_queued(vcpu, spi_id);
>> +	 /* The maintenance IRQ will reset all states above */
> 
> then why do we bother setting them to anything here?
well isn't it cleaner? What if somebody else queries those values before
the EOI maintenance gets called, a new injection for instance?
> 
>> +
>> +out:
>> +	irq_set_irqchip_state(host_irq, IRQCHIP_STATE_ACTIVE, false);
>> +	irq_set_vcpu_affinity(host_irq, NULL);
>> +	/* next occurrence will be deactivated by the host */
> 
> I'm beginning to understand what you mean by 'next' here.

> 
> Can you make it more explicit by saying "After this function returns, a
> physical IRQ will be..." ?

yes. I was failing finding the appropriate wording I aknowledge.

Thanks

Eric
> 
>> +
>> +	spin_unlock(&dist->lock);
>> +}
>> +
>> -- 
>> 1.9.1
>>
> Thanks,
> -Christoffer
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 07/10] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices
  2015-09-08 12:04     ` Eric Auger
@ 2015-09-14 10:59       ` Christoffer Dall
  0 siblings, 0 replies; 35+ messages in thread
From: Christoffer Dall @ 2015-09-14 10:59 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, marc.zyngier,
	alex.williamson, feng.wu, linux-kernel, patches, pbonzini

On Tue, Sep 08, 2015 at 02:04:15PM +0200, Eric Auger wrote:
> Hi Christoffer,
> On 09/02/2015 09:42 PM, Christoffer Dall wrote:
> > On Mon, Aug 10, 2015 at 03:21:01PM +0200, Eric Auger wrote:
> >> From: Marc Zyngier <marc.zyngier@arm.com>
> >>
> >> So far, the only use of the HW interrupt facility was the timer,
> >> implying that the active state is context-switched for each vcpu,
> >> as the device is is shared across all vcpus.
> >>
> >> This does not work for a device that has been assigned to a VM,
> >> as the guest is entierely in control of that device (the HW is
> >> not shared). In that case, it makes sense to bypass the whole
> >> active state switching.
> >>
> >> Also the VGIC state machine is adapted to support those assigned
> >> (non shared) HW IRQs:
> >> - nly can be sampled when it is pending
> >> - when queueing the IRQ (programming the LR), the pending state is
> >>   removed as for edge sensitive IRQs
> >> - queued state is not modelled. Level state is not modelled
> >> - its injection always is valid since steming from the HW.
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> >>
> >> ---
> >>
> >> - a mix of
> >>   [PATCH v4 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for
> >>                    non-shared devices
> >>   [RFC v2 2/4] KVM: arm: vgic: fix state machine for forwarded IRQ
> >> ---
> >>  include/kvm/arm_vgic.h    |  6 +++--
> >>  virt/kvm/arm/arch_timer.c |  3 ++-
> >>  virt/kvm/arm/vgic.c       | 58 +++++++++++++++++++++++++++++++++++------------
> >>  3 files changed, 49 insertions(+), 18 deletions(-)
> >>
> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >> index d901f1a..7ef9ce0 100644
> >> --- a/include/kvm/arm_vgic.h
> >> +++ b/include/kvm/arm_vgic.h
> >> @@ -163,7 +163,8 @@ struct irq_phys_map {
> >>  	u32			virt_irq;
> >>  	u32			phys_irq;
> >>  	u32			irq;
> >> -	bool			active;
> >> +	bool			shared;
> >> +	bool			active; /* Only valid if shared */
> >>  };
> >>  
> >>  struct irq_phys_map_entry {
> >> @@ -356,7 +357,8 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
> >>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
> >>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
> >>  struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
> >> -					   int virt_irq, int irq);
> >> +					   int virt_irq, int irq,
> >> +					   bool shared);
> >>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
> >>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
> >>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
> >> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> >> index 76e38d2..db21d8f 100644
> >> --- a/virt/kvm/arm/arch_timer.c
> >> +++ b/virt/kvm/arm/arch_timer.c
> >> @@ -203,7 +203,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
> >>  	 * Tell the VGIC that the virtual interrupt is tied to a
> >>  	 * physical interrupt. We do that once per VCPU.
> >>  	 */
> >> -	map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
> >> +	map = kvm_vgic_map_phys_irq(vcpu, irq->irq,
> >> +				    host_vtimer_irq, true);
> >>  	if (WARN_ON(IS_ERR(map)))
> >>  		return PTR_ERR(map);
> >>  
> >> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> >> index 9eb489a..fbd5ba5 100644
> >> --- a/virt/kvm/arm/vgic.c
> >> +++ b/virt/kvm/arm/vgic.c
> >> @@ -400,7 +400,11 @@ void vgic_cpu_irq_clear(struct kvm_vcpu *vcpu, int irq)
> >>  
> >>  static bool vgic_can_sample_irq(struct kvm_vcpu *vcpu, int irq)
> >>  {
> >> -	return !vgic_irq_is_queued(vcpu, irq);
> >> +	struct irq_phys_map *map = vgic_irq_map_search(vcpu, irq);
> >> +	bool shared_hw = map && !map->shared;
> > 
> > why is shared true when map->shared is false?
> definitively upside down
> > 
> >> +
> >> +	return !vgic_irq_is_queued(vcpu, irq) ||
> >> +			(shared_hw && vgic_dist_irq_is_pending(vcpu, irq));
> > 
> > so for forwarded, non-shared, level-triggered IRQs, we always sample the
> > line if it's pending?  Why?
> 
> No we only sampled it if it was pending. The pending state was reset
> when programming the LR.
> 
> Now that we model the queued state for mapped IRQ I will use that instead

ok, it is certainly very counterintuitive to have to check the pending
state only if the pending state is set.

Not sure how this will look after your rework, but it may deserve a
comment in either case.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 07/10] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices
  2015-09-09  8:41     ` Eric Auger
@ 2015-09-14 11:11       ` Christoffer Dall
  0 siblings, 0 replies; 35+ messages in thread
From: Christoffer Dall @ 2015-09-14 11:11 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, marc.zyngier,
	alex.williamson, feng.wu, linux-kernel, patches, pbonzini

Hi Eric,

On Wed, Sep 09, 2015 at 10:41:32AM +0200, Eric Auger wrote:

[...]

> I tried to integrate into the updated state machine for non shared
> mapped IRQ but I fail.

What exactly do you mean when you refer to 'updated state machine' ?

> 
> 1) The first problem encountered is how to reset the level of the IRQ
> (since its completion is not trapped). I added this reset in
> process_queued_irq. I think this was the most natural place since at
> sink time we get aware the IRQ is deactivated at physical distributor
> level. However I observe  failures in vgic_validate_injection. I think
> there is due to a race between update_irq_pending and sync. As soon as
> the guest EOI's the virtual IRQ (and also the pIRQ), a new physical IRQ
> hits and gets injected by irqfd. This injection can happen before the
> sync. So I would be tempted to keep my current strategy of ignoring the
> validate_injection in case of non-shared mapped IRQ and not model the
> level state. The vIRQ directly comes from the HW so it must be valid
> (guest deactivated the previous occurence).

I'm a bit confused about what we are talking about here?

Are you asking about how we should deal with forwarded, level-triggered,
non-shared interrupts?

If that's the case, here are my high-level thoughts:
 - We should not maintain a virtual line state, because this is
   maintained in hardware
 - Ideally we sample the physical distributor pending state at the
   important points (sync/flush).
 - Sampling the physical state may be slow/difficult, so we may
   choose an optimization, for example cache the pending state when the
   host ISR runs and clear the pending state when the interrupt is
   injected and not care about spurious interrupts if the signal is
   lowered after it's raised.

> 
> 2) can_sample.
> once the above problem fixed, next issue is the can_sample failure.
> Queued state also is reset in process_queued_irq and can_sample fails.
> Same punishment. So currently the only manner I found to make this work
> and sample the IRQ only once is to use the pending state which I reset
> when I queue the IRQ.

So queued means that we don't sample the virtual line state, yes?  If
so, then it would make sense to always set the state as queued for
injected forwarded non-shared level-triggered interrupts.

Does this help?

-Christoffer

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 09/10] KVM: arm/arm64: vgic: forwarding control
  2015-09-14  9:29     ` Eric Auger
@ 2015-09-14 11:24       ` Christoffer Dall
  0 siblings, 0 replies; 35+ messages in thread
From: Christoffer Dall @ 2015-09-14 11:24 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, linux-arm-kernel, kvmarm, kvm, marc.zyngier,
	alex.williamson, feng.wu, linux-kernel, patches, pbonzini

On Mon, Sep 14, 2015 at 11:29:34AM +0200, Eric Auger wrote:
> Christoffer,
> On 09/02/2015 09:58 PM, Christoffer Dall wrote:
> > On Mon, Aug 10, 2015 at 03:21:03PM +0200, Eric Auger wrote:
> >> Implements kvm_vgic_[set|unset]_forward.
> >>
> >> Handle low-level VGIC programming: physical IRQ/guest IRQ mapping,
> >> list register cleanup, VGIC state machine. Also interacts with
> >> the irqchip.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> >>
> >> ---
> >>
> >> v2 -> v3:
> >> - on unforward, we do not compute & output the active state anymore.
> >>   This means if the unforward happens while the physical IRQ is
> >>   active, we will not VFIO mask the IRQ while deactiving it. If a
> >>   new physical IRQ hits, the corresponding virtual IRQ might not
> >>   be injected (hence lost) due to VGIC state machine.
> >>
> >> bypass rfc v2:
> >> - use irq_set_vcpu_affinity API
> >> - use irq_set_irqchip_state instead of chip->irq_eoi
> >>
> >> bypass rfc:
> >> - rename kvm_arch_{set|unset}_forward into
> >>   kvm_vgic_{set|unset}_forward. Remove __KVM_HAVE_ARCH_HALT_GUEST.
> >>   The function is bound to be called by ARM code only.
> >>
> >> v4 -> v5:
> >> - fix arm64 compilation issues, ie. also defines
> >>   __KVM_HAVE_ARCH_HALT_GUEST for arm64
> >>
> >> v3 -> v4:
> >> - code originally located in kvm_vfio_arm.c
> >> - kvm_arch_vfio_{set|unset}_forward renamed into
> >>   kvm_arch_{set|unset}_forward
> >> - split into 2 functions (set/unset) since unset does not fail anymore
> >> - unset can be invoked at whatever time. Extra care is taken to handle
> >>   transition in VGIC state machine, LR cleanup, ...
> >>
> >> v2 -> v3:
> >> - renaming of kvm_arch_set_fwd_state into kvm_arch_vfio_set_forward
> >> - takes a bool arg instead of kvm_fwd_irq_action enum
> >> - removal of KVM_VFIO_IRQ_CLEANUP
> >> - platform device check now happens here
> >> - more precise errors returned
> >> - irq_eoi handled externally to this patch (VGIC)
> >> - correct enable_irq bug done twice
> >> - reword the commit message
> >> - correct check of platform_bus_type
> >> - use raw_spin_lock_irqsave and check the validity of the handler
> >> ---
> >>  include/kvm/arm_vgic.h |   6 ++
> >>  virt/kvm/arm/vgic.c    | 149 +++++++++++++++++++++++++++++++++++++++++++++++++
> >>  2 files changed, 155 insertions(+)
> >>
> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >> index 7ef9ce0..409ac0f 100644
> >> --- a/include/kvm/arm_vgic.h
> >> +++ b/include/kvm/arm_vgic.h
> >> @@ -363,6 +363,12 @@ int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
> >>  bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
> >>  void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
> >>  
> >> +int kvm_vgic_set_forward(struct kvm *kvm,
> >> +			 unsigned int host_irq, unsigned int guest_irq);
> >> +
> >> +void kvm_vgic_unset_forward(struct kvm *kvm,
> >> +			    unsigned int host_irq, unsigned int guest_irq);
> >> +
> >>  #define irqchip_in_kernel(k)	(!!((k)->arch.vgic.in_kernel))
> >>  #define vgic_initialized(k)	(!!((k)->arch.vgic.nr_cpus))
> >>  #define vgic_ready(k)		((k)->arch.vgic.ready)
> >> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> >> index 03a85b3..b15999a 100644
> >> --- a/virt/kvm/arm/vgic.c
> >> +++ b/virt/kvm/arm/vgic.c
> >> @@ -2551,3 +2551,152 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
> >>  {
> >>  	return 0;
> >>  }
> >> +
> >> +/**
> >> + * kvm_vgic_set_forward - Set IRQ forwarding
> >> + *
> >> + * @kvm: handle to the VM
> >> + * @host_irq: physical IRQ number
> >> + * @guest_irq: virtual IRQ number
> >> + *
> >> + * This function is supposed to be called only if the IRQ
> >> + * is not in progress: ie. not active at GIC level and not
> >> + * currently under injection in the guest. The physical IRQ must
> >> + * also be disabled and the guest must have been exited and
> > 
> > when you say the guest you mean all VCPUs, right?
> 
> yes that's correct.
> > 
> >> + * prevented from being re-entered.
> >> + */
> >> +int kvm_vgic_set_forward(struct kvm *kvm,
> >> +			 unsigned int host_irq,
> >> +			 unsigned int guest_irq)
> >> +{
> >> +	struct irq_phys_map *map = NULL;
> >> +	struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, 0);
> >> +	int spi_id = guest_irq + VGIC_NR_PRIVATE_IRQS;
> >> +
> >> +	kvm_debug("%s host_irq=%d guest_irq=%d\n",
> >> +		__func__, host_irq, guest_irq);
> >> +
> >> +	if (!vcpu)
> >> +		return 0;
> >> +
> >> +	irq_set_vcpu_affinity(host_irq, vcpu);
> > 
> > why are we hard-coding this to vcpu 0 ?
> is that OK if I use dist->irq_spi_target[guest_irq]

I would think so, yes.  Be careful about locking/barriers and
initialization, though.

> > 
> > missing white space before the code block.  Where does the code block
> > belong exactly?
> > 
> >> +	/*
> >> +	 * next physical IRQ will be be handled as forwarded
> > 
> > what do you mean with 'next' here?
> I mean the next occurrence of the same physical IRQ

Then I suggest wording it something like:

>From this point, this physical IRQ will be handled as forwarded.

> > 
> >> +	 * by the host (priority drop only)
> >> +	 */
> >> +
> >> +	map = kvm_vgic_map_phys_irq(vcpu, spi_id, host_irq, false);
> >> +	/*
> >> +	 * next guest_irq injection will be considered as
> >> +	 * forwarded and next flush will program LR
> >> +	 * without maintenance IRQ but with HW bit set
> >> +	 */
> > 
> > also here, I don't understand what you mean by next.
> I meant the next occurrence of the same IRQ injection into the guest

hmm, I find it a little hard to understand.

> > 
> > Perhaps these comments were supposed to come before the function calls?
> OK I see what you mean, I will rephrase to set comments before the
> function calls.

Sounds good, I'll have a look when it has been reworked.

> > 
> >> +	return !map;
> >> +}
> >> +
> >> +/**
> >> + * kvm_vgic_unset_forward - Unset IRQ forwarding
> >> + *
> >> + * @kvm: handle to the VM
> >> + * @host_irq: physical IRQ number
> >> + * @guest_irq: virtual IRQ number
> >> + *
> >> + * This function must be called when the host_irq is disabled
> >> + * and guest has been exited and prevented from being re-entered.
> >> + *
> > 
> > extra white space here
> OK
> > 
> >> + */
> >> +void kvm_vgic_unset_forward(struct kvm *kvm,
> >> +			    unsigned int host_irq,
> >> +			    unsigned int guest_irq)
> >> +{
> >> +	struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, 0);
> >> +	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> >> +	struct vgic_dist *dist = &kvm->arch.vgic;
> >> +	int ret, lr;
> >> +	struct vgic_lr vlr;
> >> +	int spi_id = guest_irq + VGIC_NR_PRIVATE_IRQS;
> >> +	bool queued = false, needs_deactivate = true;
> >> +	struct irq_phys_map *map;
> >> +	bool active;
> >> +
> >> +	kvm_debug("%s host_irq=%d guest_irq=%d\n",
> >> +		__func__, host_irq, guest_irq);
> >> +
> >> +	spin_lock(&dist->lock);
> >> +
> >> +	irq_get_irqchip_state(host_irq, IRQCHIP_STATE_ACTIVE, &active);
> >> +
> >> +	if (!vcpu)
> >> +		goto out;
> >> +
> >> +	map = vgic_irq_map_search(vcpu, spi_id);
> >> +	BUG_ON(!map);
> >> +	ret = kvm_vgic_unmap_phys_irq(vcpu, map);
> >> +	BUG_ON(ret);
> >> +	/*
> >> +	 * subsequent update_irq_pending or flush will handle this
> >> +	 * irq as not forwarded
> >> +	 */
> > 
> > missing white space before this comment block as well, also here with
> > 'subsequent' do you mean "At this point" ?
> I mean any new execution of update_irq_pending/flush will handle this
> irq as not forwarded. Since the state machine for forwarded IRQ and non
> forwarded IRQ is different, we need to convert the states into the new
> state machine's ones.

Ok, perhaps this comment is more clear if it simply states:

At this point the IRQ is marked as non-forwarded.

> > 
> >> +	if (likely(!(active))) {
> >> +		/*
> >> +		 * the IRQ was not active. let's simply prepare the states
> >> +		 * for subsequent non forwarded injection.
> >> +		 */
> >> +		vgic_dist_irq_clear_level(vcpu, spi_id);
> >> +		vgic_dist_irq_clear_pending(vcpu, spi_id);
> >> +		vgic_irq_clear_queued(vcpu, spi_id);
> >> +		needs_deactivate = false;
> >> +		goto out;
> >> +	}
> >> +
> >> +	/* is there any list register with valid state? */
> >> +	lr = vgic_cpu->vgic_irq_lr_map[spi_id];
> >> +	if (lr != LR_EMPTY) {
> >> +		vlr = vgic_get_lr(vcpu, lr);
> >> +		if (vlr.state & LR_STATE_MASK)
> >> +			queued = true;
> >> +	}
> >> +
> >> +	if (!queued) {
> >> +		vgic_irq_clear_queued(vcpu, spi_id);
> >> +		if (vgic_dist_irq_is_pending(vcpu, spi_id)) {
> >> +			/*
> >> +			 * IRQ is injected but not yet queued. LR will be
> >> +			 * written with EOI_INT and process_maintenance will
> >> +			 * reset the states: queued, level(resampler). Pending
> >> +			 * will be reset on flush.
> >> +			 */
> >> +			vgic_dist_irq_set_level(vcpu, spi_id);
> > 
> > so this is all only valid for level-triggered interrupts?  Do we check
> > this somewhere along the call path?
> Up to now I only tested with level-sensitive IRQs. Now I have the means
> I will also test with non shared edge-sensitive mapped IRQs.

I don't think we should ever set the level field for edge-triggered
IRQs.

> > 
> >> +		} else {
> >> +			/*
> >> +			 * We are somewhere before the update_irq_pending.
> >> +			 * we can't be sure the virtual IRQ will ever be
> >> +			 * injected (due to previous disable_irq).
> > 
> > I don't understand this.  Do we somehow know at this point that there is
> > a pending IRQ to inject as a virtual IRQ?
> No we just know the physical IRQ is active at physical distributor level.

how do we know this?

> 
> There can be a long path before the virtual IRQ gets injected into the
> guest:
> phys IRQ hits (1) -> genirq handler(2) -> VFIO handler(3) ->
> update_irq_pending in irqfd thread (4) -> flush (LR programming,5) ->
> guest deactivates the virq/pirq (6)
> 
> There is no valid state LR. There is no pending state recorded. So we
> are between (1) an (4) excluded.

you can also be before (1), no ?

> 
> 
> However we cannot be sure a virtual IRQ will ever be injected. Typically
> since the physical IRQ was disabled previously, it might happen that the
> VFIO physical handler is never reached and irqfd is never signalled.
> (genirq handler never calls the VFIO handler)

Isn't the point here simply that if you disconnect the virtual IRQ from
the that of the physical device, then you reset the state by clearing
the LR and if this virtual IRQ is to be used again as a purely virtual
IRQ or forwarding is configured again, then it must be re-injected?

If so, I think the comment can simply be:

Since we are changing this IRQ from fowarded to non-fowarded, reset its
state.

> > 
> >> +			 * Let's simply clear the level which was not correctly
> >> +			 * modelled in forwarded state.
> >> +			 */
> >> +			vgic_dist_irq_clear_level(vcpu, spi_id);
> >> +		}
> >> +		goto out;
> >> +	}
> >> +
> >> +	/*
> >> +	 * the virtual IRQ is queued and a valid LR exists, let's patch it so
> >> +	 * that when EOI happens a maintenance IRQ gets triggered
> >> +	 */
> >> +	vlr.state |= LR_EOI_INT;
> >> +	vgic_set_lr(vcpu, lr, vlr);
> >> +
> >> +	vgic_dist_irq_set_level(vcpu, spi_id);
> >> +	vgic_dist_irq_set_pending(vcpu, spi_id);
> > 
> > how do you know this is the case?  couldn't it be active in the LR?
> for a level sensitive IRQ the pending state stays active until the EOI
> maintenance gets called right? pending is set by vgic_queue_hwirq and
> reset by process_queued_irq?

ok, fair enough for the pending state (deserves a comment though), but I
don't see why you'd need to set the level field?  If there is no
connected virtual device that lowers the line afterwards, you'll keep
injecting this IRQ indefinitely I think.

> > 
> >> +	vgic_irq_set_queued(vcpu, spi_id);
> >> +	 /* The maintenance IRQ will reset all states above */

is this actually true for the level field?  It shouldn't, should it?

> > 
> > then why do we bother setting them to anything here?
> well isn't it cleaner? What if somebody else queries those values before
> the EOI maintenance gets called, a new injection for instance?

I guess, but then the comment could say that, "make sure this looks like
a regular virtual interrupt" or something like that.

> > 
> >> +
> >> +out:
> >> +	irq_set_irqchip_state(host_irq, IRQCHIP_STATE_ACTIVE, false);
> >> +	irq_set_vcpu_affinity(host_irq, NULL);
> >> +	/* next occurrence will be deactivated by the host */
> > 
> > I'm beginning to understand what you mean by 'next' here.
> 
> > 
> > Can you make it more explicit by saying "After this function returns, a
> > physical IRQ will be..." ?
> 
> yes. I was failing finding the appropriate wording I aknowledge.
> 
me too, it's hard.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2015-09-14 11:22 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-10 13:20 [PATCH v3 00/10] ARM IRQ forward control based on IRQ bypass manager Eric Auger
2015-08-10 13:20 ` [PATCH v3 01/10] VFIO: platform: registration of a dummy IRQ bypass producer Eric Auger
2015-08-12 18:56   ` Alex Williamson
2015-08-17 15:17     ` Eric Auger
2015-08-10 13:20 ` [PATCH v3 02/10] VFIO: platform: test forwarded state when selecting the IRQ handler Eric Auger
2015-08-10 13:20 ` [PATCH v3 03/10] VFIO: platform: single handler using function pointer Eric Auger
2015-08-12 18:56   ` Alex Williamson
2015-08-17 15:25     ` Eric Auger
2015-08-10 13:20 ` [PATCH v3 04/10] VFIO: platform: add vfio_platform_set_automasked Eric Auger
2015-08-12 18:56   ` Alex Williamson
2015-08-17 15:38     ` Eric Auger
2015-08-18 17:44       ` Alex Williamson
2015-08-31 11:43         ` Antonios Motakis
2015-08-31 14:54           ` Alex Williamson
2015-08-10 13:20 ` [PATCH v3 05/10] VFIO: platform: add vfio_platform_is_active Eric Auger
2015-08-12 18:56   ` Alex Williamson
2015-08-17 15:39     ` Eric Auger
2015-09-02 19:29   ` Christoffer Dall
2015-08-10 13:21 ` [PATCH v3 06/10] VFIO: platform: add irq bypass producer management Eric Auger
2015-08-12 18:56   ` Alex Williamson
2015-08-17 15:51     ` Eric Auger
2015-09-02 19:32   ` Christoffer Dall
2015-08-10 13:21 ` [PATCH v3 07/10] KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices Eric Auger
2015-09-02 19:42   ` Christoffer Dall
2015-09-08 12:04     ` Eric Auger
2015-09-14 10:59       ` Christoffer Dall
2015-09-09  8:41     ` Eric Auger
2015-09-14 11:11       ` Christoffer Dall
2015-08-10 13:21 ` [PATCH v3 08/10] KVM: arm/arm64: vgic: support irqfd injection of a forwarded IRQ Eric Auger
2015-08-10 13:21 ` [PATCH v3 09/10] KVM: arm/arm64: vgic: forwarding control Eric Auger
2015-09-02 19:58   ` Christoffer Dall
2015-09-14  9:29     ` Eric Auger
2015-09-14 11:24       ` Christoffer Dall
2015-08-10 13:21 ` [PATCH v3 10/10] KVM: arm/arm64: implement IRQ bypass consumer functions Eric Auger
2015-09-02 19:58 ` [PATCH v3 00/10] ARM IRQ forward control based on IRQ bypass manager Christoffer Dall

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).