* [PATCH 00/10] ARM/ARM64 Direct EOI setup for VFIO wired interrupts
@ 2017-05-24 20:13 Eric Auger
  2017-05-24 20:13   ` Eric Auger
                   ` (9 more replies)
  0 siblings, 10 replies; 69+ messages in thread
From: Eric Auger @ 2017-05-24 20:13 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, linux-kernel, kvm, kvmarm,
	alex.williamson, pbonzini, marc.zyngier, christoffer.dall
  Cc: drjones, wei

This series optimizes the deactivation of virtual interrupts
associated with wired (non-MSI) physical interrupts of VFIO devices.

This is a revival of "[PATCH v4 00/13] ARM IRQ forward control based on
IRQ bypass manager" (https://lkml.org/lkml/2015/11/19/351), whose development
stalled because it depended on the new VGIC design and was deprioritized.

Without this optimization, the physical IRQ is deactivated by the host.
In addition, for level-sensitive interrupts, the VFIO driver disables
the physical IRQ. The deactivation of the virtual IRQ by the guest is trapped,
and the physical IRQ is re-enabled at that time.

The ARM GIC supports direct EOI for virtual interrupts directly mapped
to physical interrupts. When this mode is set, the host does not
deactivate the physical interrupt anymore, but simply drops the
interrupt priority on EOI. When the guest deactivates the virtual IRQ,
the GIC automatically deactivates the physical IRQ. This avoids a world
switch on deactivation.

This series sets direct EOI mode on ARM/ARM64 for shared peripheral
interrupts. This relies on a negotiation between the VFIO driver and
KVM/irqfd through the irq bypass manager.

The setup sequence is:

preamble:
- disable the physical IRQ
- halt guest execution
forwarding setting:
- program the VFIO driver for forwarding (select the right physical
  interrupt handler)
- program the VGIC and IRQCHIP for forwarding
postamble:
- resume guest execution
- enable the physical IRQ

When destroying the optimized path the following sequence is executed:
- preamble
- unset forwarding at VGIC and IRQCHIP level
- unset forwarding at VFIO level
- postamble
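
The ordering of the two sequences above can be sketched as a small
user-space model. This is only an illustration of the ordering, not code
from the series: struct fwd_model and every function name below are
hypothetical, and each real operation (disabling the IRQ, halting the
guest, programming VFIO/VGIC) is reduced to a boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct fwd_model {
	bool phys_irq_enabled;	/* physical IRQ enabled on the host */
	bool guest_running;	/* guest is executing */
	bool vfio_forwarded;	/* VFIO uses the forwarding handler */
	bool vgic_forwarded;	/* VGIC and irqchip set for forwarding */
};

static void preamble(struct fwd_model *m)
{
	m->phys_irq_enabled = false;	/* disable the physical IRQ */
	m->guest_running = false;	/* halt guest execution */
}

static void postamble(struct fwd_model *m)
{
	m->guest_running = true;	/* resume guest execution */
	m->phys_irq_enabled = true;	/* enable the physical IRQ */
}

/* Forwarding state only changes while the IRQ is off and the guest halted. */
static bool quiesced(const struct fwd_model *m)
{
	return !m->phys_irq_enabled && !m->guest_running;
}

static void set_forwarding(struct fwd_model *m)
{
	preamble(m);
	assert(quiesced(m));
	m->vfio_forwarded = true;	/* VFIO driver first */
	m->vgic_forwarded = true;	/* then VGIC and irqchip */
	postamble(m);
}

static void unset_forwarding(struct fwd_model *m)
{
	preamble(m);
	assert(quiesced(m));
	m->vgic_forwarded = false;	/* VGIC and irqchip first */
	m->vfio_forwarded = false;	/* then VFIO driver */
	postamble(m);
}
```

Calling set_forwarding() and then unset_forwarding() on a model that
starts with the IRQ enabled and the guest running returns it to its
initial state; teardown undoes the two forwarding steps in reverse order.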

The injection is still based on irqfd triggering. For level-sensitive
interrupts, though, the resamplefd is no longer triggered, since
deactivation is not trapped by KVM.

This was tested with:
- AMD Seattle xgmac platform device assignment
- Cavium ThunderX PCIe device assignment with pci=nomsi (INTx)
- MSI was also tested on ARM to check for regressions

The series can be found at:
https://github.com/eauger/linux/tree/v4.12-rc2-deoi-v5

It is based on 4.12-rc2

Best Regards

Eric

Eric Auger (10):
  vfio: platform: Add automasked field to vfio_platform_irq
  VFIO: platform: Introduce direct EOI interrupt handler
  VFIO: platform: Direct EOI irq bypass for ARM/ARM64
  VFIO: pci: Add automasked field to vfio_pci_irq_ctx
  VFIO: pci: Introduce direct EOI INTx interrupt handler
  irqbypass: Add a private field in the producer
  VFIO: pci: Direct EOI irq bypass for ARM/ARM64
  KVM: arm/arm64: vgic: Handle unshared mapped interrupts
  KVM: arm/arm64: vgic: Implement forwarding setting
  KVM: arm/arm64: register DEOI irq bypass consumer on ARM/ARM64

 arch/arm/kvm/Kconfig                             |   3 +
 arch/arm64/kvm/Kconfig                           |   3 +
 drivers/vfio/pci/Kconfig                         |   4 +
 drivers/vfio/pci/Makefile                        |   1 +
 drivers/vfio/pci/vfio_pci_intrs.c                |  78 +++++++++---
 drivers/vfio/pci/vfio_pci_irq_bypass.c           | 134 ++++++++++++++++++++
 drivers/vfio/pci/vfio_pci_private.h              |  35 ++++++
 drivers/vfio/platform/Kconfig                    |   5 +
 drivers/vfio/platform/Makefile                   |   2 +-
 drivers/vfio/platform/vfio_platform_irq.c        |  53 ++++++--
 drivers/vfio/platform/vfio_platform_irq_bypass.c | 114 +++++++++++++++++
 drivers/vfio/platform/vfio_platform_private.h    |  26 ++++
 include/kvm/arm_vgic.h                           |   9 +-
 include/linux/irqbypass.h                        |   2 +
 virt/kvm/arm/arch_timer.c                        |   3 +-
 virt/kvm/arm/arm.c                               |  42 +++++++
 virt/kvm/arm/vgic/vgic.c                         | 149 ++++++++++++++++++++++-
 virt/kvm/arm/vgic/vgic.h                         |   9 +-
 18 files changed, 632 insertions(+), 40 deletions(-)
 create mode 100644 drivers/vfio/pci/vfio_pci_irq_bypass.c
 create mode 100644 drivers/vfio/platform/vfio_platform_irq_bypass.c

-- 
2.5.5

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH 01/10] vfio: platform: Add automasked field to vfio_platform_irq
  2017-05-24 20:13 [PATCH 00/10] ARM/ARM64 Direct EOI setup for VFIO wired interrupts Eric Auger
@ 2017-05-24 20:13   ` Eric Auger
  2017-05-24 20:13   ` Eric Auger
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Eric Auger @ 2017-05-24 20:13 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, linux-kernel, kvm, kvmarm,
	alex.williamson, pbonzini, marc.zyngier, christoffer.dall
  Cc: drjones, wei

For the direct EOI mode we will need to distinguish masking requested
by userspace from auto-masking done by the IRQ handler.
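
As a rough illustration of the two flags, here is a user-space model of
the state handling after this patch. It is a sketch only: struct
irq_model and the model_* functions are hypothetical names, the spinlock
is omitted, and disable_irq_nosync()/enable_irq() are replaced by a
counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for struct vfio_platform_irq (locking omitted). */
struct irq_model {
	bool masked;		/* masked on request from userspace */
	bool automasked;	/* masked by the IRQ handler itself */
	int disable_depth;	/* stands in for disable_irq_nosync()/enable_irq() */
};

/* Userspace mask: only acts if the IRQ is not already masked either way. */
static void model_mask(struct irq_model *irq)
{
	if (!irq->masked && !irq->automasked) {
		irq->disable_depth++;
		irq->masked = true;
	}
}

/* Unmask clears both kinds of masking with a single enable. */
static void model_unmask(struct irq_model *irq)
{
	if (irq->masked || irq->automasked) {
		irq->disable_depth--;
		irq->masked = false;
		irq->automasked = false;
	}
}

/* Handler path: returns true when the interrupt was handled. */
static bool model_automasked_handler(struct irq_model *irq)
{
	if (!irq->masked && !irq->automasked) {
		irq->disable_depth++;
		irq->automasked = true;	/* recorded separately from masked */
		return true;
	}
	return false;
}
```

In this model a userspace mask that arrives while the IRQ is automasked
is a no-op, and a single unmask clears either state, so the disable
count stays balanced.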

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 drivers/vfio/platform/vfio_platform_irq.c     | 10 ++++++----
 drivers/vfio/platform/vfio_platform_private.h |  1 +
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index 46d4750..831f0b0 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -29,7 +29,7 @@ static void vfio_platform_mask(struct vfio_platform_irq *irq_ctx)
 
 	spin_lock_irqsave(&irq_ctx->lock, flags);
 
-	if (!irq_ctx->masked) {
+	if (!irq_ctx->masked && !irq_ctx->automasked) {
 		disable_irq_nosync(irq_ctx->hwirq);
 		irq_ctx->masked = true;
 	}
@@ -89,9 +89,10 @@ static void vfio_platform_unmask(struct vfio_platform_irq *irq_ctx)
 
 	spin_lock_irqsave(&irq_ctx->lock, flags);
 
-	if (irq_ctx->masked) {
+	if (irq_ctx->masked || irq_ctx->automasked) {
 		enable_irq(irq_ctx->hwirq);
 		irq_ctx->masked = false;
+		irq_ctx->automasked = false;
 	}
 
 	spin_unlock_irqrestore(&irq_ctx->lock, flags);
@@ -152,12 +153,12 @@ static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
 
 	spin_lock_irqsave(&irq_ctx->lock, flags);
 
-	if (!irq_ctx->masked) {
+	if (!irq_ctx->masked && !irq_ctx->automasked) {
 		ret = IRQ_HANDLED;
 
 		/* automask maskable interrupts */
 		disable_irq_nosync(irq_ctx->hwirq);
-		irq_ctx->masked = true;
+		irq_ctx->automasked = true;
 	}
 
 	spin_unlock_irqrestore(&irq_ctx->lock, flags);
@@ -315,6 +316,7 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev)
 		vdev->irqs[i].count = 1;
 		vdev->irqs[i].hwirq = hwirq;
 		vdev->irqs[i].masked = false;
+		vdev->irqs[i].automasked = false;
 	}
 
 	vdev->num_irqs = cnt;
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 85ffe5d..8a3cfa9 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -34,6 +34,7 @@ struct vfio_platform_irq {
 	char			*name;
 	struct eventfd_ctx	*trigger;
 	bool			masked;
+	bool			automasked;
 	spinlock_t		lock;
 	struct virqfd		*unmask;
 	struct virqfd		*mask;
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 02/10] VFIO: platform: Introduce direct EOI interrupt handler
  2017-05-24 20:13 [PATCH 00/10] ARM/ARM64 Direct EOI setup for VFIO wired interrupts Eric Auger
@ 2017-05-24 20:13   ` Eric Auger
  2017-05-24 20:13   ` Eric Auger
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Eric Auger @ 2017-05-24 20:13 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, linux-kernel, kvm, kvmarm,
	alex.williamson, pbonzini, marc.zyngier, christoffer.dall
  Cc: drjones, wei

We add two new fields in vfio_platform_irq: deoi and handler.

If deoi is set, this means the physical IRQ attached to the virtual
IRQ is directly deactivated by the guest and the VFIO driver does
not need to disable the physical IRQ and mask it at VFIO level.

The handler pointer points to either the automasked or the "deoi" handler.
The latter is the same as the one used for edge-sensitive IRQs.
A wrapper handler is introduced that calls the chosen handler function.

The irq lock is now taken and released in the wrapper handler, so
eventfd_signal is called with the lock held. This is fine because
eventfd_signal can be called in regions not allowed to sleep.
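
The dispatch can be pictured with the following user-space sketch. It is
an illustration under assumptions, not the patch itself: the spinlock is
modelled by a lock_held flag, eventfd_signal() by a counter, and all
names (struct irq_model, select_handler, ...) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum model_ret { MODEL_IRQ_NONE, MODEL_IRQ_HANDLED };

struct irq_model;
typedef enum model_ret (*model_handler_t)(struct irq_model *irq);

struct irq_model {
	bool lock_held;		/* stands in for irq_ctx->lock */
	bool masked;
	bool automasked;
	bool deoi;		/* direct EOI mode selected */
	bool flag_automasked;	/* stands in for VFIO_IRQ_INFO_AUTOMASKED */
	int signals;		/* stands in for eventfd_signal() calls */
	model_handler_t handler;
};

/* Level-sensitive path without direct EOI: mask, then signal. */
static enum model_ret automasked_handler(struct irq_model *irq)
{
	assert(irq->lock_held);	/* lock is owned by the wrapper */
	if (irq->masked || irq->automasked)
		return MODEL_IRQ_NONE;
	irq->automasked = true;
	irq->signals++;
	return MODEL_IRQ_HANDLED;
}

/* Edge-sensitive and direct EOI path: signal only, no masking. */
static enum model_ret plain_handler(struct irq_model *irq)
{
	assert(irq->lock_held);
	irq->signals++;
	return MODEL_IRQ_HANDLED;
}

static void select_handler(struct irq_model *irq)
{
	if (irq->flag_automasked && !irq->deoi)
		irq->handler = automasked_handler;
	else
		irq->handler = plain_handler;
}

/* The only function registered for the IRQ; it takes the lock and dispatches. */
static enum model_ret wrapper_handler(struct irq_model *irq)
{
	enum model_ret ret;

	irq->lock_held = true;
	ret = irq->handler(irq);
	irq->lock_held = false;
	return ret;
}
```

Keeping the lock in the wrapper means the handler pointer can later be
swapped under the same lock without racing against a running handler,
which is presumably why the locking moves out of the individual handlers.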

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
---
 drivers/vfio/platform/vfio_platform_irq.c     | 24 +++++++++++++++++-------
 drivers/vfio/platform/vfio_platform_private.h |  2 ++
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index 831f0b0..2f82459 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -148,11 +148,8 @@ static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev,
 static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
 {
 	struct vfio_platform_irq *irq_ctx = dev_id;
-	unsigned long flags;
 	int ret = IRQ_NONE;
 
-	spin_lock_irqsave(&irq_ctx->lock, flags);
-
 	if (!irq_ctx->masked && !irq_ctx->automasked) {
 		ret = IRQ_HANDLED;
 
@@ -161,8 +158,6 @@ static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
 		irq_ctx->automasked = true;
 	}
 
-	spin_unlock_irqrestore(&irq_ctx->lock, flags);
-
 	if (ret == IRQ_HANDLED)
 		eventfd_signal(irq_ctx->trigger, 1);
 
@@ -178,6 +173,19 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
+static irqreturn_t vfio_wrapper_handler(int irq, void *dev_id)
+{
+	struct vfio_platform_irq *irq_ctx = dev_id;
+	unsigned long flags;
+	irqreturn_t ret;
+
+	spin_lock_irqsave(&irq_ctx->lock, flags);
+	ret = irq_ctx->handler(irq, dev_id);
+	spin_unlock_irqrestore(&irq_ctx->lock, flags);
+
+	return ret;
+}
+
 static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
 			    int fd, irq_handler_t handler)
 {
@@ -208,9 +216,10 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
 	}
 
 	irq->trigger = trigger;
+	irq->handler = handler;
 
 	irq_set_status_flags(irq->hwirq, IRQ_NOAUTOEN);
-	ret = request_irq(irq->hwirq, handler, 0, irq->name, irq);
+	ret = request_irq(irq->hwirq, vfio_wrapper_handler, 0, irq->name, irq);
 	if (ret) {
 		kfree(irq->name);
 		eventfd_ctx_put(trigger);
@@ -232,7 +241,8 @@ static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev,
 	struct vfio_platform_irq *irq = &vdev->irqs[index];
 	irq_handler_t handler;
 
-	if (vdev->irqs[index].flags & VFIO_IRQ_INFO_AUTOMASKED)
+	if (vdev->irqs[index].flags & VFIO_IRQ_INFO_AUTOMASKED &&
+			!irq->deoi)
 		handler = vfio_automasked_irq_handler;
 	else
 		handler = vfio_irq_handler;
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 8a3cfa9..b80a380 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -38,6 +38,8 @@ struct vfio_platform_irq {
 	spinlock_t		lock;
 	struct virqfd		*unmask;
 	struct virqfd		*mask;
+	bool			deoi;
+	irqreturn_t		(*handler)(int irq, void *dev_id);
 };
 
 struct vfio_platform_region {
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 03/10] VFIO: platform: Direct EOI irq bypass for ARM/ARM64
  2017-05-24 20:13 [PATCH 00/10] ARM/ARM64 Direct EOI setup for VFIO wired interrupts Eric Auger
@ 2017-05-24 20:13   ` Eric Auger
  2017-05-24 20:13   ` Eric Auger
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Eric Auger @ 2017-05-24 20:13 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, linux-kernel, kvm, kvmarm,
	alex.williamson, pbonzini, marc.zyngier, christoffer.dall
  Cc: drjones, wei

This patch adds the registration/unregistration of an
irq_bypass_producer for vfio platform device interrupts.

Its callbacks handle the direct EOI mode on the VFIO side:

- stop/start: disable/enable the host irq
- add/del consumer: set or unset the VFIO direct EOI mode, i.e. select
  the appropriate physical IRQ handler (automasked or not automasked).
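
The decision logic of these callbacks can be modelled in user space as
below. This is a simplified sketch: struct producer_model and the
model_* names are hypothetical, the irqchip active state and trigger
type are plain booleans, and model_connect() only assumes that the
manager brackets add_consumer with stop and start.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one producer and its VFIO interrupt state. */
struct producer_model {
	bool enabled;		/* host IRQ enabled (start/stop) */
	bool active;		/* IRQ active at irqchip level */
	bool automasked;	/* IRQ automasked at VFIO level */
	bool level;		/* level-sensitive trigger type */
	bool deoi;		/* direct EOI mode set */
};

static void model_stop(struct producer_model *p)
{
	p->enabled = false;
}

static void model_start(struct producer_model *p)
{
	p->enabled = true;
}

/* Refuse while an interrupt is in flight; edge IRQs need no change. */
static int model_add_consumer(struct producer_model *p)
{
	if (p->active || p->automasked)
		return -EAGAIN;
	if (!p->level)
		return 0;
	p->deoi = true;
	return 0;
}

static void model_del_consumer(struct producer_model *p)
{
	if (p->level)
		p->deoi = false;
}

/* Assumed bracketing: stop, add consumer, start. */
static int model_connect(struct producer_model *p)
{
	int ret;

	model_stop(p);
	ret = model_add_consumer(p);
	model_start(p);
	return ret;
}
```

With this model, connecting a level-sensitive interrupt that is idle
sets deoi, while connecting one that is still automasked returns -EAGAIN
and leaves the automasked handler in place.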

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
---
 drivers/vfio/platform/Kconfig                    |   5 +
 drivers/vfio/platform/Makefile                   |   2 +-
 drivers/vfio/platform/vfio_platform_irq.c        |  19 ++++
 drivers/vfio/platform/vfio_platform_irq_bypass.c | 114 +++++++++++++++++++++++
 drivers/vfio/platform/vfio_platform_private.h    |  23 +++++
 5 files changed, 162 insertions(+), 1 deletion(-)
 create mode 100644 drivers/vfio/platform/vfio_platform_irq_bypass.c

diff --git a/drivers/vfio/platform/Kconfig b/drivers/vfio/platform/Kconfig
index bb30128..33ec3d9 100644
--- a/drivers/vfio/platform/Kconfig
+++ b/drivers/vfio/platform/Kconfig
@@ -2,6 +2,7 @@ config VFIO_PLATFORM
 	tristate "VFIO support for platform devices"
 	depends on VFIO && EVENTFD && (ARM || ARM64)
 	select VFIO_VIRQFD
+	select IRQ_BYPASS_MANAGER
 	help
 	  Support for platform devices with VFIO. This is required to make
 	  use of platform devices present on the system using the VFIO
@@ -19,4 +20,8 @@ config VFIO_AMBA
 
 	  If you don't know what to do here, say N.
 
+config VFIO_PLATFORM_IRQ_BYPASS_DEOI
+	depends on VFIO_PLATFORM
+	def_bool y
+
 source "drivers/vfio/platform/reset/Kconfig"
diff --git a/drivers/vfio/platform/Makefile b/drivers/vfio/platform/Makefile
index 41a6224..324f3e7 100644
--- a/drivers/vfio/platform/Makefile
+++ b/drivers/vfio/platform/Makefile
@@ -1,4 +1,4 @@
-vfio-platform-base-y := vfio_platform_common.o vfio_platform_irq.o
+vfio-platform-base-y := vfio_platform_common.o vfio_platform_irq.o vfio_platform_irq_bypass.o
 vfio-platform-y := vfio_platform.o
 
 obj-$(CONFIG_VFIO_PLATFORM) += vfio-platform.o
diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index 2f82459..5b70c8e 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -20,6 +20,7 @@
 #include <linux/types.h>
 #include <linux/vfio.h>
 #include <linux/irq.h>
+#include <linux/irqbypass.h>
 
 #include "vfio_platform_private.h"
 
@@ -186,6 +187,19 @@ static irqreturn_t vfio_wrapper_handler(int irq, void *dev_id)
 	return ret;
 }
 
+/* must be called with irq_ctx->lock held */
+int vfio_platform_set_deoi(struct vfio_platform_irq *irq_ctx, bool deoi)
+{
+	irq_ctx->deoi = deoi;
+
+	if (!deoi && (irq_ctx->flags & VFIO_IRQ_INFO_AUTOMASKED))
+		irq_ctx->handler = vfio_automasked_irq_handler;
+	else
+		irq_ctx->handler = vfio_irq_handler;
+
+	return 0;
+}
+
 static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
 			    int fd, irq_handler_t handler)
 {
@@ -196,6 +210,7 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
 	if (irq->trigger) {
 		irq_clear_status_flags(irq->hwirq, IRQ_NOAUTOEN);
 		free_irq(irq->hwirq, irq);
+		irq_bypass_unregister_producer(&irq->producer);
 		kfree(irq->name);
 		eventfd_ctx_put(irq->trigger);
 		irq->trigger = NULL;
@@ -227,6 +242,10 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
 		return ret;
 	}
 
+	if (vfio_platform_has_deoi())
+		vfio_platform_register_deoi_producer(vdev, irq,
+						     trigger, irq->hwirq);
+
 	if (!irq->masked)
 		enable_irq(irq->hwirq);
 
diff --git a/drivers/vfio/platform/vfio_platform_irq_bypass.c b/drivers/vfio/platform/vfio_platform_irq_bypass.c
new file mode 100644
index 0000000..436902c
--- /dev/null
+++ b/drivers/vfio/platform/vfio_platform_irq_bypass.c
@@ -0,0 +1,114 @@
+/*
+ * VFIO platform device irqbypass callback implementation for DEOI
+ *
+ * Copyright (C) 2017 Red Hat, Inc.  All rights reserved.
+ * Author: Eric Auger <eric.auger@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/err.h>
+#include <linux/device.h>
+#include <linux/irq.h>
+#include <linux/irqbypass.h>
+#include "vfio_platform_private.h"
+
+#ifdef CONFIG_VFIO_PLATFORM_IRQ_BYPASS_DEOI
+
+static void irq_bypass_deoi_start(struct irq_bypass_producer *prod)
+{
+	enable_irq(prod->irq);
+}
+
+static void irq_bypass_deoi_stop(struct irq_bypass_producer *prod)
+{
+	disable_irq(prod->irq);
+}
+
+/**
+ * irq_bypass_deoi_add_consumer - turns irq direct EOI on
+ *
+ * The linux irq is disabled when the function is called.
+ * The operation succeeds only if the irq is not active at irqchip level
+ * and the irq is not automasked at VFIO level, meaning the IRQ is under
+ * injection into the guest.
+ */
+static int irq_bypass_deoi_add_consumer(struct irq_bypass_producer *prod,
+					struct irq_bypass_consumer *cons)
+{
+	struct vfio_platform_irq *irq_ctx =
+		container_of(prod, struct vfio_platform_irq, producer);
+	unsigned long flags;
+	bool active;
+	int ret;
+
+	spin_lock_irqsave(&irq_ctx->lock, flags);
+
+	ret = irq_get_irqchip_state(irq_ctx->hwirq, IRQCHIP_STATE_ACTIVE,
+				    &active);
+	if (ret)
+		goto out;
+
+	if (active || irq_ctx->automasked) {
+		ret = -EAGAIN;
+		goto out;
+	}
+
+	if (!(irq_get_trigger_type(irq_ctx->hwirq) & IRQ_TYPE_LEVEL_MASK))
+		goto out;
+
+	ret = vfio_platform_set_deoi(irq_ctx, true);
+out:
+	spin_unlock_irqrestore(&irq_ctx->lock, flags);
+	return ret;
+}
+
+static void irq_bypass_deoi_del_consumer(struct irq_bypass_producer *prod,
+					 struct irq_bypass_consumer *cons)
+{
+	struct vfio_platform_irq *irq_ctx =
+		container_of(prod, struct vfio_platform_irq, producer);
+	unsigned long flags;
+
+	spin_lock_irqsave(&irq_ctx->lock, flags);
+	if (irq_get_trigger_type(irq_ctx->hwirq) & IRQ_TYPE_LEVEL_MASK)
+		vfio_platform_set_deoi(irq_ctx, false);
+	spin_unlock_irqrestore(&irq_ctx->lock, flags);
+}
+
+bool vfio_platform_has_deoi(void)
+{
+	return true;
+}
+
+void vfio_platform_register_deoi_producer(struct vfio_platform_device *vdev,
+					  struct vfio_platform_irq *irq,
+					  struct eventfd_ctx *trigger,
+					  unsigned int host_irq)
+{
+	struct irq_bypass_producer *prod = &irq->producer;
+	int ret;
+
+	prod->token =		trigger;
+	prod->irq =		host_irq;
+	prod->add_consumer =	irq_bypass_deoi_add_consumer;
+	prod->del_consumer =	irq_bypass_deoi_del_consumer;
+	prod->stop =		irq_bypass_deoi_stop;
+	prod->start =		irq_bypass_deoi_start;
+
+	ret = irq_bypass_register_producer(prod);
+	if (unlikely(ret))
+		dev_info(vdev->device,
+			 "irq bypass producer (token %p) registration fails: %d\n",
+			 prod->token, ret);
+}
+
+#endif
+
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index b80a380..bfa2675 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -17,6 +17,7 @@
 
 #include <linux/types.h>
 #include <linux/interrupt.h>
+#include <linux/irqbypass.h>
 
 #define VFIO_PLATFORM_OFFSET_SHIFT   40
 #define VFIO_PLATFORM_OFFSET_MASK (((u64)(1) << VFIO_PLATFORM_OFFSET_SHIFT) - 1)
@@ -40,6 +41,7 @@ struct vfio_platform_irq {
 	struct virqfd		*mask;
 	bool			deoi;
 	irqreturn_t		(*handler)(int irq, void *dev_id);
+	struct irq_bypass_producer producer;
 };
 
 struct vfio_platform_region {
@@ -102,9 +104,30 @@ extern int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev,
 					unsigned start, unsigned count,
 					void *data);
 
+extern int vfio_platform_set_deoi(struct vfio_platform_irq *irq_ctx, bool deoi);
+
 extern void __vfio_platform_register_reset(struct vfio_platform_reset_node *n);
 extern void vfio_platform_unregister_reset(const char *compat,
 					   vfio_platform_reset_fn_t fn);
+
+#ifdef CONFIG_VFIO_PLATFORM_IRQ_BYPASS_DEOI
+bool vfio_platform_has_deoi(void);
+void vfio_platform_register_deoi_producer(struct vfio_platform_device *vdev,
+					  struct vfio_platform_irq *irq,
+					  struct eventfd_ctx *trigger,
+					  unsigned int host_irq);
+#else
+static inline bool vfio_platform_has_deoi(void)
+{
+	return false;
+}
+static inline
+void vfio_platform_register_deoi_producer(struct vfio_platform_device *vdev,
+					  struct vfio_platform_irq *irq,
+					  struct eventfd_ctx *trigger,
+					  unsigned int host_irq) {}
+#endif
+
 #define vfio_platform_register_reset(__compat, __reset)		\
 static struct vfio_platform_reset_node __reset ## _node = {	\
 	.owner = THIS_MODULE,					\
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 03/10] VFIO: platform: Direct EOI irq bypass for ARM/ARM64
@ 2017-05-24 20:13   ` Eric Auger
  0 siblings, 0 replies; 69+ messages in thread
From: Eric Auger @ 2017-05-24 20:13 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, linux-kernel, kvm, kvmarm,
	alex.williamson, pbonzini, marc.zyngier, christoffer.dall

This patch adds the registration/unregistration of an
irq_bypass_producer for vfio platform device interrupts.

Its callbacks handle the direct EOI mode on the VFIO side:

- stop/start: disable/enable the host irq
- add/del consumer: set or unset the VFIO direct EOI mode, i.e. select
  the appropriate physical IRQ handler (automasked or not automasked).

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
---
 drivers/vfio/platform/Kconfig                    |   5 +
 drivers/vfio/platform/Makefile                   |   2 +-
 drivers/vfio/platform/vfio_platform_irq.c        |  19 ++++
 drivers/vfio/platform/vfio_platform_irq_bypass.c | 114 +++++++++++++++++++++++
 drivers/vfio/platform/vfio_platform_private.h    |  23 +++++
 5 files changed, 162 insertions(+), 1 deletion(-)
 create mode 100644 drivers/vfio/platform/vfio_platform_irq_bypass.c

diff --git a/drivers/vfio/platform/Kconfig b/drivers/vfio/platform/Kconfig
index bb30128..33ec3d9 100644
--- a/drivers/vfio/platform/Kconfig
+++ b/drivers/vfio/platform/Kconfig
@@ -2,6 +2,7 @@ config VFIO_PLATFORM
 	tristate "VFIO support for platform devices"
 	depends on VFIO && EVENTFD && (ARM || ARM64)
 	select VFIO_VIRQFD
+	select IRQ_BYPASS_MANAGER
 	help
 	  Support for platform devices with VFIO. This is required to make
 	  use of platform devices present on the system using the VFIO
@@ -19,4 +20,8 @@ config VFIO_AMBA
 
 	  If you don't know what to do here, say N.
 
+config VFIO_PLATFORM_IRQ_BYPASS_DEOI
+	depends on VFIO_PLATFORM
+	def_bool y
+
 source "drivers/vfio/platform/reset/Kconfig"
diff --git a/drivers/vfio/platform/Makefile b/drivers/vfio/platform/Makefile
index 41a6224..324f3e7 100644
--- a/drivers/vfio/platform/Makefile
+++ b/drivers/vfio/platform/Makefile
@@ -1,4 +1,4 @@
-vfio-platform-base-y := vfio_platform_common.o vfio_platform_irq.o
+vfio-platform-base-y := vfio_platform_common.o vfio_platform_irq.o vfio_platform_irq_bypass.o
 vfio-platform-y := vfio_platform.o
 
 obj-$(CONFIG_VFIO_PLATFORM) += vfio-platform.o
diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index 2f82459..5b70c8e 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -20,6 +20,7 @@
 #include <linux/types.h>
 #include <linux/vfio.h>
 #include <linux/irq.h>
+#include <linux/irqbypass.h>
 
 #include "vfio_platform_private.h"
 
@@ -186,6 +187,19 @@ static irqreturn_t vfio_wrapper_handler(int irq, void *dev_id)
 	return ret;
 }
 
+/* must be called with irq_ctx->lock held */
+int vfio_platform_set_deoi(struct vfio_platform_irq *irq_ctx, bool deoi)
+{
+	irq_ctx->deoi = deoi;
+
+	if (!deoi && (irq_ctx->flags & VFIO_IRQ_INFO_AUTOMASKED))
+		irq_ctx->handler = vfio_automasked_irq_handler;
+	else
+		irq_ctx->handler = vfio_irq_handler;
+
+	return 0;
+}
+
 static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
 			    int fd, irq_handler_t handler)
 {
@@ -196,6 +210,7 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
 	if (irq->trigger) {
 		irq_clear_status_flags(irq->hwirq, IRQ_NOAUTOEN);
 		free_irq(irq->hwirq, irq);
+		irq_bypass_unregister_producer(&irq->producer);
 		kfree(irq->name);
 		eventfd_ctx_put(irq->trigger);
 		irq->trigger = NULL;
@@ -227,6 +242,10 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
 		return ret;
 	}
 
+	if (vfio_platform_has_deoi())
+		vfio_platform_register_deoi_producer(vdev, irq,
+						     trigger, irq->hwirq);
+
 	if (!irq->masked)
 		enable_irq(irq->hwirq);
 
diff --git a/drivers/vfio/platform/vfio_platform_irq_bypass.c b/drivers/vfio/platform/vfio_platform_irq_bypass.c
new file mode 100644
index 0000000..436902c
--- /dev/null
+++ b/drivers/vfio/platform/vfio_platform_irq_bypass.c
@@ -0,0 +1,114 @@
+/*
+ * VFIO platform device irqbypass callback implementation for DEOI
+ *
+ * Copyright (C) 2017 Red Hat, Inc.  All rights reserved.
+ * Author: Eric Auger <eric.auger@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/err.h>
+#include <linux/device.h>
+#include <linux/irq.h>
+#include <linux/irqbypass.h>
+#include "vfio_platform_private.h"
+
+#ifdef CONFIG_VFIO_PLATFORM_IRQ_BYPASS_DEOI
+
+static void irq_bypass_deoi_start(struct irq_bypass_producer *prod)
+{
+	enable_irq(prod->irq);
+}
+
+static void irq_bypass_deoi_stop(struct irq_bypass_producer *prod)
+{
+	disable_irq(prod->irq);
+}
+
+/**
+ * irq_bypass_deoi_add_consumer - turns irq direct EOI on
+ *
+ * The linux irq is disabled when the function is called.
+ * The operation succeeds only if the irq is not active at irqchip level
+ * and the irq is not automasked at VFIO level, meaning the IRQ is not under
+ * injection into the guest.
+ */
+static int irq_bypass_deoi_add_consumer(struct irq_bypass_producer *prod,
+					struct irq_bypass_consumer *cons)
+{
+	struct vfio_platform_irq *irq_ctx =
+		container_of(prod, struct vfio_platform_irq, producer);
+	unsigned long flags;
+	bool active;
+	int ret;
+
+	spin_lock_irqsave(&irq_ctx->lock, flags);
+
+	ret = irq_get_irqchip_state(irq_ctx->hwirq, IRQCHIP_STATE_ACTIVE,
+				    &active);
+	if (ret)
+		goto out;
+
+	if (active || irq_ctx->automasked) {
+		ret = -EAGAIN;
+		goto out;
+	}
+
+	if (!(irq_get_trigger_type(irq_ctx->hwirq) & IRQ_TYPE_LEVEL_MASK))
+		goto out;
+
+	ret = vfio_platform_set_deoi(irq_ctx, true);
+out:
+	spin_unlock_irqrestore(&irq_ctx->lock, flags);
+	return ret;
+}
+
+static void irq_bypass_deoi_del_consumer(struct irq_bypass_producer *prod,
+					 struct irq_bypass_consumer *cons)
+{
+	struct vfio_platform_irq *irq_ctx =
+		container_of(prod, struct vfio_platform_irq, producer);
+	unsigned long flags;
+
+	spin_lock_irqsave(&irq_ctx->lock, flags);
+	if (irq_get_trigger_type(irq_ctx->hwirq) & IRQ_TYPE_LEVEL_MASK)
+		vfio_platform_set_deoi(irq_ctx, false);
+	spin_unlock_irqrestore(&irq_ctx->lock, flags);
+}
+
+bool vfio_platform_has_deoi(void)
+{
+	return true;
+}
+
+void vfio_platform_register_deoi_producer(struct vfio_platform_device *vdev,
+					  struct vfio_platform_irq *irq,
+					  struct eventfd_ctx *trigger,
+					  unsigned int host_irq)
+{
+	struct irq_bypass_producer *prod = &irq->producer;
+	int ret;
+
+	prod->token =		trigger;
+	prod->irq =		host_irq;
+	prod->add_consumer =	irq_bypass_deoi_add_consumer;
+	prod->del_consumer =	irq_bypass_deoi_del_consumer;
+	prod->stop =		irq_bypass_deoi_stop;
+	prod->start =		irq_bypass_deoi_start;
+
+	ret = irq_bypass_register_producer(prod);
+	if (unlikely(ret))
+		dev_info(vdev->device,
+			 "irq bypass producer (token %p) registration fails: %d\n",
+			 prod->token, ret);
+}
+
+#endif
+
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index b80a380..bfa2675 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -17,6 +17,7 @@
 
 #include <linux/types.h>
 #include <linux/interrupt.h>
+#include <linux/irqbypass.h>
 
 #define VFIO_PLATFORM_OFFSET_SHIFT   40
 #define VFIO_PLATFORM_OFFSET_MASK (((u64)(1) << VFIO_PLATFORM_OFFSET_SHIFT) - 1)
@@ -40,6 +41,7 @@ struct vfio_platform_irq {
 	struct virqfd		*mask;
 	bool			deoi;
 	irqreturn_t		(*handler)(int irq, void *dev_id);
+	struct irq_bypass_producer producer;
 };
 
 struct vfio_platform_region {
@@ -102,9 +104,30 @@ extern int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev,
 					unsigned start, unsigned count,
 					void *data);
 
+extern int vfio_platform_set_deoi(struct vfio_platform_irq *irq_ctx, bool deoi);
+
 extern void __vfio_platform_register_reset(struct vfio_platform_reset_node *n);
 extern void vfio_platform_unregister_reset(const char *compat,
 					   vfio_platform_reset_fn_t fn);
+
+#ifdef CONFIG_VFIO_PLATFORM_IRQ_BYPASS_DEOI
+bool vfio_platform_has_deoi(void);
+void vfio_platform_register_deoi_producer(struct vfio_platform_device *vdev,
+					  struct vfio_platform_irq *irq,
+					  struct eventfd_ctx *trigger,
+					  unsigned int host_irq);
+#else
+static inline bool vfio_platform_has_deoi(void)
+{
+	return false;
+}
+static inline
+void vfio_platform_register_deoi_producer(struct vfio_platform_device *vdev,
+					  struct vfio_platform_irq *irq,
+					  struct eventfd_ctx *trigger,
+					  unsigned int host_irq) {}
+#endif
+
 #define vfio_platform_register_reset(__compat, __reset)		\
 static struct vfio_platform_reset_node __reset ## _node = {	\
 	.owner = THIS_MODULE,					\
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 04/10] VFIO: pci: Add automasked field to vfio_pci_irq_ctx
  2017-05-24 20:13 [PATCH 00/10] ARM/ARM64 Direct EOI setup for VFIO wired interrupts Eric Auger
@ 2017-05-24 20:13   ` Eric Auger
  2017-05-24 20:13   ` Eric Auger
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Eric Auger @ 2017-05-24 20:13 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, linux-kernel, kvm, kvmarm,
	alex.williamson, pbonzini, marc.zyngier, christoffer.dall
  Cc: drjones, wei

For direct EOI mode we need to differentiate masking requested by
userspace from the auto-masking performed by the IRQ handler.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
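The split between the two flags can be sketched in user space. This is
only a simplified model: the struct and the model_* helpers below are
hypothetical names, disable_depth stands in for disable_irq()/enable_irq()
nesting, and the pending re-send done by the real unmask path is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model, not the kernel's vfio_pci_irq_ctx. */
struct irq_ctx_model {
	bool masked;		/* mask requested by userspace */
	bool automasked;	/* mask applied by the IRQ handler */
	int disable_depth;	/* stand-in for disable_irq() nesting */
};

/* Userspace mask: a no-op if the line is already masked either way. */
static void model_user_mask(struct irq_ctx_model *c)
{
	assert(c);
	if (!c->masked && !c->automasked) {
		c->disable_depth++;
		c->masked = true;
	}
}

/*
 * Handler path: the line can only fire while it is enabled. On firing,
 * the handler disables the line and records that it did so itself.
 */
static bool model_handler_fired(struct irq_ctx_model *c)
{
	assert(c);
	if (c->disable_depth > 0)
		return false;
	c->disable_depth++;
	c->automasked = true;
	return true;
}

/* Unmask: undoes whichever kind of mask is in place. */
static void model_unmask(struct irq_ctx_model *c)
{
	assert(c);
	if (c->masked || c->automasked) {
		c->disable_depth--;
		c->masked = false;
		c->automasked = false;
	}
}

/* Only an automask means the IRQ is currently being injected. */
static bool model_under_injection(const struct irq_ctx_model *c)
{
	return c->automasked;
}
```

With a single masked flag, model_under_injection() could not tell a
userspace mask from an interrupt in flight, which is the distinction the
later direct EOI setup relies on.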
 drivers/vfio/pci/vfio_pci_intrs.c   | 15 +++++++++------
 drivers/vfio/pci/vfio_pci_private.h |  1 +
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 1c46045..d4d377b 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -52,7 +52,7 @@ void vfio_pci_intx_mask(struct vfio_pci_device *vdev)
 	if (unlikely(!is_intx(vdev))) {
 		if (vdev->pci_2_3)
 			pci_intx(pdev, 0);
-	} else if (!vdev->ctx[0].masked) {
+	} else if (!vdev->ctx[0].masked && !vdev->ctx[0].automasked) {
 		/*
 		 * Can't use check_and_mask here because we always want to
 		 * mask, not just when something is pending.
@@ -90,7 +90,8 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused)
 	if (unlikely(!is_intx(vdev))) {
 		if (vdev->pci_2_3)
 			pci_intx(pdev, 1);
-	} else if (vdev->ctx[0].masked && !vdev->virq_disabled) {
+	} else if ((vdev->ctx[0].masked || vdev->ctx[0].automasked) &&
+			!vdev->virq_disabled) {
 		/*
 		 * A pending interrupt here would immediately trigger,
 		 * but we can avoid that overhead by just re-sending
@@ -103,6 +104,7 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused)
 			enable_irq(pdev->irq);
 
 		vdev->ctx[0].masked = (ret > 0);
+		vdev->ctx[0].automasked = (ret > 0);
 	}
 
 	spin_unlock_irqrestore(&vdev->irqlock, flags);
@@ -126,11 +128,12 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
 
 	if (!vdev->pci_2_3) {
 		disable_irq_nosync(vdev->pdev->irq);
-		vdev->ctx[0].masked = true;
+		vdev->ctx[0].automasked = true;
 		ret = IRQ_HANDLED;
-	} else if (!vdev->ctx[0].masked &&  /* may be shared */
-		   pci_check_and_mask_intx(vdev->pdev)) {
-		vdev->ctx[0].masked = true;
+	} else if (!vdev->ctx[0].masked && !vdev->ctx[0].automasked &&
+			pci_check_and_mask_intx(vdev->pdev)) {
+		 /* shared INTx */
+		vdev->ctx[0].automasked = true;
 		ret = IRQ_HANDLED;
 	}
 
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index f561ac1..f7f1101 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -35,6 +35,7 @@ struct vfio_pci_irq_ctx {
 	struct virqfd		*mask;
 	char			*name;
 	bool			masked;
+	bool			automasked;
 	struct irq_bypass_producer	producer;
 };
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 05/10] VFIO: pci: Introduce direct EOI INTx interrupt handler
  2017-05-24 20:13 [PATCH 00/10] ARM/ARM64 Direct EOI setup for VFIO wired interrupts Eric Auger
@ 2017-05-24 20:13   ` Eric Auger
  2017-05-24 20:13   ` Eric Auger
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Eric Auger @ 2017-05-24 20:13 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, linux-kernel, kvm, kvmarm,
	alex.williamson, pbonzini, marc.zyngier, christoffer.dall
  Cc: drjones, wei

We add two new fields to struct vfio_pci_irq_ctx: deoi and handler.
If deoi is set, the physical IRQ attached to the virtual IRQ is
deactivated directly by the guest, so the VFIO driver needs neither
to disable the physical IRQ nor to mask it at VFIO level.

The handler pointer is set accordingly, and a wrapper handler is
introduced that calls the chosen handler function.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
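The dispatch scheme can be sketched in user space as follows. This is a
simplified model under stated assumptions: struct dev_model and the
model_* functions are hypothetical names, the locked flag only stands in
for vdev->irqlock, and the signals counter stands in for the eventfd
notification.

```c
#include <assert.h>
#include <stdbool.h>

struct dev_model;
typedef int (*handler_fn)(struct dev_model *);

enum { MODEL_IRQ_HANDLED = 1 };

/* Hypothetical stand-in for the per-IRQ context. */
struct dev_model {
	bool deoi;		/* direct EOI mode selected */
	bool automasked;	/* set by the automasking handler only */
	bool locked;		/* stand-in for holding the irq lock */
	int signals;		/* stand-in for eventfd notifications */
	handler_fn handler;	/* handler chosen for the current mode */
};

/* Default path: mask the line, then notify userspace. */
static int model_handler_automask(struct dev_model *d)
{
	assert(d->locked);	/* handlers run under the wrapper's lock */
	d->automasked = true;
	d->signals++;
	return MODEL_IRQ_HANDLED;
}

/* Direct EOI path: the guest deactivates the IRQ, so only notify. */
static int model_handler_deoi(struct dev_model *d)
{
	assert(d->locked);
	d->signals++;
	return MODEL_IRQ_HANDLED;
}

/* Mode selection only swaps the function pointer. */
static void model_select_handler(struct dev_model *d, bool deoi)
{
	d->deoi = deoi;
	d->handler = deoi ? model_handler_deoi : model_handler_automask;
}

/* The wrapper is what gets registered; it locks and dispatches. */
static int model_wrapper(struct dev_model *d)
{
	int ret;

	d->locked = true;
	ret = d->handler(d);
	d->locked = false;
	return ret;
}
```

Registering the wrapper once and swapping the pointer means the mode can
change without freeing and re-requesting the interrupt.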
---
 drivers/vfio/pci/vfio_pci_intrs.c   | 32 ++++++++++++++++++++++++++------
 drivers/vfio/pci/vfio_pci_private.h |  2 ++
 2 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index d4d377b..06aa713 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -121,11 +121,8 @@ void vfio_pci_intx_unmask(struct vfio_pci_device *vdev)
 static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
 {
 	struct vfio_pci_device *vdev = dev_id;
-	unsigned long flags;
 	int ret = IRQ_NONE;
 
-	spin_lock_irqsave(&vdev->irqlock, flags);
-
 	if (!vdev->pci_2_3) {
 		disable_irq_nosync(vdev->pdev->irq);
 		vdev->ctx[0].automasked = true;
@@ -137,14 +134,33 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
 		ret = IRQ_HANDLED;
 	}
 
-	spin_unlock_irqrestore(&vdev->irqlock, flags);
-
 	if (ret == IRQ_HANDLED)
 		vfio_send_intx_eventfd(vdev, NULL);
 
 	return ret;
 }
 
+static irqreturn_t vfio_intx_handler_deoi(int irq, void *dev_id)
+{
+	struct vfio_pci_device *vdev = dev_id;
+
+	vfio_send_intx_eventfd(vdev, NULL);
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t vfio_intx_wrapper_handler(int irq, void *dev_id)
+{
+	struct vfio_pci_device *vdev = dev_id;
+	unsigned long flags;
+	irqreturn_t ret;
+
+	spin_lock_irqsave(&vdev->irqlock, flags);
+	ret = vdev->ctx[0].handler(irq, dev_id);
+	spin_unlock_irqrestore(&vdev->irqlock, flags);
+
+	return ret;
+}
+
 static int vfio_intx_enable(struct vfio_pci_device *vdev)
 {
 	if (!is_irq_none(vdev))
@@ -208,7 +224,11 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
 	if (!vdev->pci_2_3)
 		irqflags = 0;
 
-	ret = request_irq(pdev->irq, vfio_intx_handler,
+	if (vdev->ctx[0].deoi)
+		vdev->ctx[0].handler = vfio_intx_handler_deoi;
+	else
+		vdev->ctx[0].handler = vfio_intx_handler;
+	ret = request_irq(pdev->irq, vfio_intx_wrapper_handler,
 			  irqflags, vdev->ctx[0].name, vdev);
 	if (ret) {
 		vdev->ctx[0].trigger = NULL;
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index f7f1101..5cfe59a 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -36,6 +36,8 @@ struct vfio_pci_irq_ctx {
 	char			*name;
 	bool			masked;
 	bool			automasked;
+	bool			deoi;
+	irqreturn_t		(*handler)(int irq, void *dev_id);
 	struct irq_bypass_producer	producer;
 };
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 06/10] irqbypass: Add a private field in the producer
  2017-05-24 20:13 [PATCH 00/10] ARM/ARM64 Direct EOI setup for VFIO wired interrupts Eric Auger
@ 2017-05-24 20:13   ` Eric Auger
  2017-05-24 20:13   ` Eric Auger
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Eric Auger @ 2017-05-24 20:13 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, linux-kernel, kvm, kvmarm,
	alex.williamson, pbonzini, marc.zyngier, christoffer.dall
  Cc: drjones, wei

The producer callbacks may need to use private data
stored in the producer structure.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
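A user-space sketch of why a back-pointer can be needed: container_of
recovers the structure that embeds the producer, but not an owner that
only points at that structure. All names below (the *_model types and
helpers) are hypothetical, and the layout is an assumption modelled on a
device holding a pointer to a separately allocated context array.

```c
#include <assert.h>
#include <stddef.h>

#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct irq_bypass_producer. */
struct producer_model {
	void *token;
	void *private;	/* owner back-pointer, set at registration */
	int irq;
};

/* Per-IRQ context embedding the producer. */
struct irq_ctx_model {
	int index;
	struct producer_model producer;
};

/* The device only points at its contexts; it does not embed them. */
struct device_model {
	int id;
	struct irq_ctx_model *ctx;
};

/* The embedding context is reachable through pointer arithmetic. */
static struct irq_ctx_model *producer_to_ctx(struct producer_model *p)
{
	return model_container_of(p, struct irq_ctx_model, producer);
}

/* The owning device is reachable only through the private pointer. */
static struct device_model *producer_to_device(struct producer_model *p)
{
	assert(p && p->private);
	return p->private;
}

/* Registration-time setup of the fields the callbacks rely on. */
static void model_register(struct device_model *d, int idx, int irq)
{
	d->ctx[idx].index = idx;
	d->ctx[idx].producer.irq = irq;
	d->ctx[idx].producer.private = d;
}
```

A callback that receives only the producer pointer can then reach both
the per-IRQ context and the device-wide state such as a lock.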
---
 include/linux/irqbypass.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/irqbypass.h b/include/linux/irqbypass.h
index f0f5d26..7f18f1b 100644
--- a/include/linux/irqbypass.h
+++ b/include/linux/irqbypass.h
@@ -35,6 +35,7 @@ struct irq_bypass_consumer;
  * struct irq_bypass_producer - IRQ bypass producer definition
  * @node: IRQ bypass manager private list management
  * @token: opaque token to match between producer and consumer (non-NULL)
+ * @private: private data that may be used by producer callbacks (optional)
  * @irq: Linux IRQ number for the producer device
  * @add_consumer: Connect the IRQ producer to an IRQ consumer (optional)
  * @del_consumer: Disconnect the IRQ producer from an IRQ consumer (optional)
@@ -48,6 +49,7 @@ struct irq_bypass_consumer;
 struct irq_bypass_producer {
 	struct list_head node;
 	void *token;
+	void *private;
 	int irq;
 	int (*add_consumer)(struct irq_bypass_producer *,
 			    struct irq_bypass_consumer *);
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 07/10] VFIO: pci: Direct EOI irq bypass for ARM/ARM64
  2017-05-24 20:13 [PATCH 00/10] ARM/ARM64 Direct EOI setup for VFIO wired interrupts Eric Auger
@ 2017-05-24 20:13   ` Eric Auger
  2017-05-24 20:13   ` Eric Auger
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Eric Auger @ 2017-05-24 20:13 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, linux-kernel, kvm, kvmarm,
	alex.williamson, pbonzini, marc.zyngier, christoffer.dall
  Cc: drjones, wei

Current architectures use irq bypass negotiation for MSIs only, and
the existing irq bypass producer does not implement any callback.

On ARM/ARM64 we would like to set up direct EOI for shared peripheral
interrupts, which requires specific callbacks to be implemented.

We introduce a new VFIO_PCI_IRQ_BYPASS_DEOI config that is set
only for ARM/ARM64. A new vfio_pci_irq_bypass.c file contains the
implementation of the DEOI callbacks. The code has little that is
ARM-specific, hence the choice of making it as generic as possible:

- stop/start: disable/enable the host irq
- add/del consumer: set the VFIO direct EOI mode, i.e. select the
  appropriate physical IRQ handler (automasked or not automasked).

The DEOI bypass producer is registered only on ARM/ARM64, and only
for INTx interrupts.

Other architectures use vfio_pci_register_default_producer, for MSIs
only.

We also include vfio.h in vfio_pci_private.h, as the latter is not
self-contained.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
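The producer side of the negotiation can be sketched in user space. This
is a simplified model: struct line_model and the model_* functions are
hypothetical names, the active flag stands in for the irqchip
IRQCHIP_STATE_ACTIVE query, and the consumer callbacks that the bypass
manager also invokes are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one wired interrupt line. */
struct line_model {
	bool active;		/* stand-in for IRQCHIP_STATE_ACTIVE */
	bool automasked;	/* handler masked it: injection in flight */
	bool deoi;		/* direct EOI handler selected */
	int disable_depth;	/* stand-in for disable_irq() nesting */
};

/* stop/start: disable and re-enable the host irq. */
static void model_stop(struct line_model *l)
{
	l->disable_depth++;
}

static void model_start(struct line_model *l)
{
	assert(l->disable_depth > 0);
	l->disable_depth--;
}

/* Refuse to switch modes while an interrupt is still in flight. */
static int model_add_consumer(struct line_model *l)
{
	if (l->active || l->automasked)
		return -EAGAIN;
	l->deoi = true;
	return 0;
}

static void model_del_consumer(struct line_model *l)
{
	l->deoi = false;
}

/* Producer-side ordering: stop, try to switch, start again. */
static int model_connect(struct line_model *l)
{
	int ret;

	model_stop(l);
	ret = model_add_consumer(l);
	model_start(l);
	return ret;
}
```

In this model a failed switch leaves the line enabled and in its
previous mode, so the caller can simply retry later.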
---
 drivers/vfio/pci/Kconfig               |   4 +
 drivers/vfio/pci/Makefile              |   1 +
 drivers/vfio/pci/vfio_pci_intrs.c      |  31 ++++++--
 drivers/vfio/pci/vfio_pci_irq_bypass.c | 134 +++++++++++++++++++++++++++++++++
 drivers/vfio/pci/vfio_pci_private.h    |  32 ++++++++
 5 files changed, 195 insertions(+), 7 deletions(-)
 create mode 100644 drivers/vfio/pci/vfio_pci_irq_bypass.c

diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index 24ee260..5b66d48 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -30,3 +30,7 @@ config VFIO_PCI_INTX
 config VFIO_PCI_IGD
 	depends on VFIO_PCI
 	def_bool y if X86
+
+config VFIO_PCI_IRQ_BYPASS_DEOI
+	depends on VFIO_PCI && (ARM || ARM64)
+	def_bool y if (ARM || ARM64)
diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index 76d8ec0..5836ea6 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -1,5 +1,6 @@
 
 vfio-pci-y := vfio_pci.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
+vfio-pci-y += vfio_pci_irq_bypass.o
 vfio-pci-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o
 
 obj-$(CONFIG_VFIO_PCI) += vfio-pci.o
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 06aa713..344147e 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -161,6 +161,23 @@ static irqreturn_t vfio_intx_wrapper_handler(int irq, void *dev_id)
 	return ret;
 }
 
+/* must be called with vdev->irqlock held */
+int vfio_pci_set_deoi(struct vfio_pci_device *vdev,
+		      struct vfio_pci_irq_ctx *irq_ctx, bool deoi)
+{
+	if (!is_intx(vdev))
+		return -EINVAL;
+
+	irq_ctx->deoi = deoi;
+
+	if (deoi)
+		irq_ctx->handler = vfio_intx_handler_deoi;
+	else
+		irq_ctx->handler = vfio_intx_handler;
+
+	return 0;
+}
+
 static int vfio_intx_enable(struct vfio_pci_device *vdev)
 {
 	if (!is_irq_none(vdev))
@@ -200,6 +217,7 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
 
 	if (vdev->ctx[0].trigger) {
 		free_irq(pdev->irq, vdev);
+		irq_bypass_unregister_producer(&vdev->ctx[0].producer);
 		kfree(vdev->ctx[0].name);
 		eventfd_ctx_put(vdev->ctx[0].trigger);
 		vdev->ctx[0].trigger = NULL;
@@ -237,6 +255,10 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
 		return ret;
 	}
 
+	if (vfio_pci_has_deoi())
+		vfio_pci_register_deoi_producer(vdev, vdev->ctx,
+						trigger, pdev->irq);
+
 	/*
 	 * INTx disable will stick across the new irq setup,
 	 * disable_irq won't.
@@ -364,13 +386,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
 		return ret;
 	}
 
-	vdev->ctx[vector].producer.token = trigger;
-	vdev->ctx[vector].producer.irq = irq;
-	ret = irq_bypass_register_producer(&vdev->ctx[vector].producer);
-	if (unlikely(ret))
-		dev_info(&pdev->dev,
-		"irq bypass producer (token %p) registration fails: %d\n",
-		vdev->ctx[vector].producer.token, ret);
+	vfio_pci_register_default_producer(vdev, &vdev->ctx[vector],
+					   trigger, irq);
 
 	vdev->ctx[vector].trigger = trigger;
 
diff --git a/drivers/vfio/pci/vfio_pci_irq_bypass.c b/drivers/vfio/pci/vfio_pci_irq_bypass.c
new file mode 100644
index 0000000..447fa60
--- /dev/null
+++ b/drivers/vfio/pci/vfio_pci_irq_bypass.c
@@ -0,0 +1,134 @@
+/*
+ * VFIO PCI device irqbypass callback implementation for DEOI
+ *
+ * Copyright (C) 2017 Red Hat, Inc.  All rights reserved.
+ * Author: Eric Auger <eric.auger@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/err.h>
+#include <linux/irqbypass.h>
+#include "vfio_pci_private.h"
+
+#ifdef CONFIG_VFIO_PCI_IRQ_BYPASS_DEOI
+
+static inline void irq_bypass_deoi_start(struct irq_bypass_producer *prod)
+{
+	enable_irq(prod->irq);
+}
+
+static inline void irq_bypass_deoi_stop(struct irq_bypass_producer *prod)
+{
+	disable_irq(prod->irq);
+}
+
+/**
+ * irq_bypass_deoi_add_consumer - turns direct EOI on
+ *
+ * The linux irq is disabled when the function is called.
+ * The operation succeeds only if the irq is not active at irqchip level
+ * and not automasked at VFIO level, meaning the IRQ is not under injection
+ * into the guest.
+ */
+static int irq_bypass_deoi_add_consumer(struct irq_bypass_producer *prod,
+					struct irq_bypass_consumer *cons)
+{
+	struct vfio_pci_device *vdev = (struct vfio_pci_device *)prod->private;
+	struct vfio_pci_irq_ctx *irq_ctx =
+		container_of(prod, struct vfio_pci_irq_ctx, producer);
+	unsigned long flags;
+	bool active;
+	int ret;
+
+	spin_lock_irqsave(&vdev->irqlock, flags);
+
+	ret = irq_get_irqchip_state(prod->irq, IRQCHIP_STATE_ACTIVE,
+				    &active);
+	WARN_ON(ret);
+	if (ret)
+		goto out;
+
+	if (active || irq_ctx->automasked) {
+		ret = -EAGAIN;
+		goto out;
+	}
+
+	ret = vfio_pci_set_deoi(vdev, irq_ctx, true);
+out:
+	spin_unlock_irqrestore(&vdev->irqlock, flags);
+	return ret;
+}
+
+static void irq_bypass_deoi_del_consumer(struct irq_bypass_producer *prod,
+					 struct irq_bypass_consumer *cons)
+{
+	struct vfio_pci_device *vdev = (struct vfio_pci_device *)prod->private;
+	struct vfio_pci_irq_ctx *irq_ctx =
+		container_of(prod, struct vfio_pci_irq_ctx, producer);
+	unsigned long flags;
+
+	spin_lock_irqsave(&vdev->irqlock, flags);
+	vfio_pci_set_deoi(vdev, irq_ctx, false);
+	spin_unlock_irqrestore(&vdev->irqlock, flags);
+}
+
+bool vfio_pci_has_deoi(void)
+{
+	return true;
+}
+
+void vfio_pci_register_deoi_producer(struct vfio_pci_device *vdev,
+				     struct vfio_pci_irq_ctx *irq_ctx,
+				     struct eventfd_ctx *trigger,
+				     unsigned int irq)
+{
+	struct irq_bypass_producer *prod = &irq_ctx->producer;
+	struct pci_dev *pdev = vdev->pdev;
+	int ret;
+
+	prod->token =		trigger;
+	prod->irq =		irq;
+	prod->private =		vdev;
+	prod->add_consumer =	irq_bypass_deoi_add_consumer;
+	prod->del_consumer =	irq_bypass_deoi_del_consumer;
+	prod->stop =		irq_bypass_deoi_stop;
+	prod->start =		irq_bypass_deoi_start;
+
+	ret = irq_bypass_register_producer(prod);
+	if (unlikely(ret))
+		dev_info(&pdev->dev,
+		"irq bypass producer (token %p) registration fails for irq %s: %d\n",
+		prod->token, irq_ctx->name, ret);
+}
+
+#endif /* DEOI */
+
+void vfio_pci_register_default_producer(struct vfio_pci_device *vdev,
+					struct vfio_pci_irq_ctx *irq_ctx,
+					struct eventfd_ctx *trigger,
+					unsigned int irq)
+{
+	struct irq_bypass_producer *prod = &irq_ctx->producer;
+	struct pci_dev *pdev = vdev->pdev;
+	int ret;
+
+	prod->token =	trigger;
+	prod->irq =	irq;
+	prod->private = vdev;
+
+	ret = irq_bypass_register_producer(prod);
+	if (unlikely(ret))
+		dev_info(&pdev->dev,
+		"irq bypass producer (token %p) registration fails for irq %s: %d\n",
+		prod->token, irq_ctx->name, ret);
+}
+
+
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index 5cfe59a..adc1ba2 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -15,6 +15,7 @@
 #include <linux/pci.h>
 #include <linux/irqbypass.h>
 #include <linux/types.h>
+#include <linux/vfio.h>
 
 #ifndef VFIO_PCI_PRIVATE_H
 #define VFIO_PCI_PRIVATE_H
@@ -133,6 +134,10 @@ extern int vfio_pci_register_dev_region(struct vfio_pci_device *vdev,
 					unsigned int type, unsigned int subtype,
 					const struct vfio_pci_regops *ops,
 					size_t size, u32 flags, void *data);
+
+extern int vfio_pci_set_deoi(struct vfio_pci_device *vdev,
+			     struct vfio_pci_irq_ctx *irq_ctx, bool deoi);
+
 #ifdef CONFIG_VFIO_PCI_IGD
 extern int vfio_pci_igd_init(struct vfio_pci_device *vdev);
 #else
@@ -141,4 +146,31 @@ static inline int vfio_pci_igd_init(struct vfio_pci_device *vdev)
 	return -ENODEV;
 }
 #endif
+
+#ifdef CONFIG_VFIO_PCI_IRQ_BYPASS_DEOI
+/*
+ * Architecture specific irqbypass producer callbacks
+ */
+bool vfio_pci_has_deoi(void);
+void vfio_pci_register_deoi_producer(struct vfio_pci_device *vdev,
+				     struct vfio_pci_irq_ctx *irq_ctx,
+				     struct eventfd_ctx *trigger,
+				     unsigned int irq);
+#else
+static inline bool vfio_pci_has_deoi(void)
+{
+	return false;
+}
+static inline
+void vfio_pci_register_deoi_producer(struct vfio_pci_device *vdev,
+				     struct vfio_pci_irq_ctx *irq_ctx,
+				     struct eventfd_ctx *trigger,
+				     unsigned int irq) {}
+#endif
+
+void vfio_pci_register_default_producer(struct vfio_pci_device *vdev,
+					struct vfio_pci_irq_ctx *irq_ctx,
+					struct eventfd_ctx *trigger,
+					unsigned int irq);
+
 #endif /* VFIO_PCI_PRIVATE_H */
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 07/10] VFIO: pci: Direct EOI irq bypass for ARM/ARM64
@ 2017-05-24 20:13   ` Eric Auger
  0 siblings, 0 replies; 69+ messages in thread
From: Eric Auger @ 2017-05-24 20:13 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, linux-kernel, kvm, kvmarm,
	alex.williamson, pbonzini, marc.zyngier, christoffer.dall

Current architectures use irq bypass negotiation for MSIs only, and
the existing irq bypass producer does not implement any callback.

On ARM/ARM64 we would like to set up direct EOI for shared peripheral
interrupts, which requires specific callbacks to be implemented.

We introduce a new VFIO_PCI_IRQ_BYPASS_DEOI config that is set
only for ARM/ARM64. A new vfio_pci_irq_bypass.c file contains the
implementation of the DEOI callbacks. The code has little that is
ARM-specific, hence the choice of making it as generic as possible:

- stop/start: disable/enable the host irq
- add/del consumer: set the VFIO direct EOI mode, i.e. select the
  appropriate physical IRQ handler (automasked or not automasked).

The DEOI bypass producer is registered only on ARM/ARM64, and only
for INTx interrupts.

Other architectures use vfio_pci_register_default_producer, for MSIs
only.

We also include vfio.h in vfio_pci_private.h, as the latter is not
self-contained.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
---
 drivers/vfio/pci/Kconfig               |   4 +
 drivers/vfio/pci/Makefile              |   1 +
 drivers/vfio/pci/vfio_pci_intrs.c      |  31 ++++++--
 drivers/vfio/pci/vfio_pci_irq_bypass.c | 134 +++++++++++++++++++++++++++++++++
 drivers/vfio/pci/vfio_pci_private.h    |  32 ++++++++
 5 files changed, 195 insertions(+), 7 deletions(-)
 create mode 100644 drivers/vfio/pci/vfio_pci_irq_bypass.c

diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index 24ee260..5b66d48 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -30,3 +30,7 @@ config VFIO_PCI_INTX
 config VFIO_PCI_IGD
 	depends on VFIO_PCI
 	def_bool y if X86
+
+config VFIO_PCI_IRQ_BYPASS_DEOI
+	depends on VFIO_PCI && (ARM || ARM64)
+	def_bool y if (ARM || ARM64)
diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index 76d8ec0..5836ea6 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -1,5 +1,6 @@
 
 vfio-pci-y := vfio_pci.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
+vfio-pci-y += vfio_pci_irq_bypass.o
 vfio-pci-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o
 
 obj-$(CONFIG_VFIO_PCI) += vfio-pci.o
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 06aa713..344147e 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -161,6 +161,23 @@ static irqreturn_t vfio_intx_wrapper_handler(int irq, void *dev_id)
 	return ret;
 }
 
+/* must be called with vdev->irqlock held */
+int vfio_pci_set_deoi(struct vfio_pci_device *vdev,
+		      struct vfio_pci_irq_ctx *irq_ctx, bool deoi)
+{
+	if (!is_intx(vdev))
+		return -EINVAL;
+
+	irq_ctx->deoi = deoi;
+
+	if (deoi)
+		irq_ctx->handler = vfio_intx_handler_deoi;
+	else
+		irq_ctx->handler = vfio_intx_handler;
+
+	return 0;
+}
+
 static int vfio_intx_enable(struct vfio_pci_device *vdev)
 {
 	if (!is_irq_none(vdev))
@@ -200,6 +217,7 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
 
 	if (vdev->ctx[0].trigger) {
 		free_irq(pdev->irq, vdev);
+		irq_bypass_unregister_producer(&vdev->ctx[0].producer);
 		kfree(vdev->ctx[0].name);
 		eventfd_ctx_put(vdev->ctx[0].trigger);
 		vdev->ctx[0].trigger = NULL;
@@ -237,6 +255,10 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
 		return ret;
 	}
 
+	if (vfio_pci_has_deoi())
+		vfio_pci_register_deoi_producer(vdev, vdev->ctx,
+						trigger, pdev->irq);
+
 	/*
 	 * INTx disable will stick across the new irq setup,
 	 * disable_irq won't.
@@ -364,13 +386,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
 		return ret;
 	}
 
-	vdev->ctx[vector].producer.token = trigger;
-	vdev->ctx[vector].producer.irq = irq;
-	ret = irq_bypass_register_producer(&vdev->ctx[vector].producer);
-	if (unlikely(ret))
-		dev_info(&pdev->dev,
-		"irq bypass producer (token %p) registration fails: %d\n",
-		vdev->ctx[vector].producer.token, ret);
+	vfio_pci_register_default_producer(vdev, &vdev->ctx[vector],
+					   trigger, irq);
 
 	vdev->ctx[vector].trigger = trigger;
 
diff --git a/drivers/vfio/pci/vfio_pci_irq_bypass.c b/drivers/vfio/pci/vfio_pci_irq_bypass.c
new file mode 100644
index 0000000..447fa60
--- /dev/null
+++ b/drivers/vfio/pci/vfio_pci_irq_bypass.c
@@ -0,0 +1,134 @@
+/*
+ * VFIO PCI device irqbypass callback implementation for DEOI
+ *
+ * Copyright (C) 2017 Red Hat, Inc.  All rights reserved.
+ * Author: Eric Auger <eric.auger@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/err.h>
+#include <linux/irqbypass.h>
+#include "vfio_pci_private.h"
+
+#ifdef CONFIG_VFIO_PCI_IRQ_BYPASS_DEOI
+
+static inline void irq_bypass_deoi_start(struct irq_bypass_producer *prod)
+{
+	enable_irq(prod->irq);
+}
+
+static inline void irq_bypass_deoi_stop(struct irq_bypass_producer *prod)
+{
+	disable_irq(prod->irq);
+}
+
+/**
+ * irq_bypass_deoi_add_consumer - turns direct EOI on
+ *
+ * The linux irq is disabled when the function is called.
+ * The operation succeeds only if the irq is not active at irqchip level
+ * and not automasked at VFIO level, meaning the IRQ is not under injection
+ * into the guest.
+ */
+static int irq_bypass_deoi_add_consumer(struct irq_bypass_producer *prod,
+					struct irq_bypass_consumer *cons)
+{
+	struct vfio_pci_device *vdev = (struct vfio_pci_device *)prod->private;
+	struct vfio_pci_irq_ctx *irq_ctx =
+		container_of(prod, struct vfio_pci_irq_ctx, producer);
+	unsigned long flags;
+	bool active;
+	int ret;
+
+	spin_lock_irqsave(&vdev->irqlock, flags);
+
+	ret = irq_get_irqchip_state(prod->irq, IRQCHIP_STATE_ACTIVE,
+				    &active);
+	WARN_ON(ret);
+	if (ret)
+		goto out;
+
+	if (active || irq_ctx->automasked) {
+		ret = -EAGAIN;
+		goto out;
+	}
+
+	ret = vfio_pci_set_deoi(vdev, irq_ctx, true);
+out:
+	spin_unlock_irqrestore(&vdev->irqlock, flags);
+	return ret;
+}
+
+static void irq_bypass_deoi_del_consumer(struct irq_bypass_producer *prod,
+					 struct irq_bypass_consumer *cons)
+{
+	struct vfio_pci_device *vdev = (struct vfio_pci_device *)prod->private;
+	struct vfio_pci_irq_ctx *irq_ctx =
+		container_of(prod, struct vfio_pci_irq_ctx, producer);
+	unsigned long flags;
+
+	spin_lock_irqsave(&vdev->irqlock, flags);
+	vfio_pci_set_deoi(vdev, irq_ctx, false);
+	spin_unlock_irqrestore(&vdev->irqlock, flags);
+}
+
+bool vfio_pci_has_deoi(void)
+{
+	return true;
+}
+
+void vfio_pci_register_deoi_producer(struct vfio_pci_device *vdev,
+				     struct vfio_pci_irq_ctx *irq_ctx,
+				     struct eventfd_ctx *trigger,
+				     unsigned int irq)
+{
+	struct irq_bypass_producer *prod = &irq_ctx->producer;
+	struct pci_dev *pdev = vdev->pdev;
+	int ret;
+
+	prod->token =		trigger;
+	prod->irq =		irq;
+	prod->private =		vdev;
+	prod->add_consumer =	irq_bypass_deoi_add_consumer;
+	prod->del_consumer =	irq_bypass_deoi_del_consumer;
+	prod->stop =		irq_bypass_deoi_stop;
+	prod->start =		irq_bypass_deoi_start;
+
+	ret = irq_bypass_register_producer(prod);
+	if (unlikely(ret))
+		dev_info(&pdev->dev,
+		"irq bypass producer (token %p) registration fails for irq %s: %d\n",
+		prod->token, irq_ctx->name, ret);
+}
+
+#endif /* DEOI */
+
+void vfio_pci_register_default_producer(struct vfio_pci_device *vdev,
+					struct vfio_pci_irq_ctx *irq_ctx,
+					struct eventfd_ctx *trigger,
+					unsigned int irq)
+{
+	struct irq_bypass_producer *prod = &irq_ctx->producer;
+	struct pci_dev *pdev = vdev->pdev;
+	int ret;
+
+	prod->token =	trigger;
+	prod->irq =	irq;
+	prod->private = vdev;
+
+	ret = irq_bypass_register_producer(prod);
+	if (unlikely(ret))
+		dev_info(&pdev->dev,
+		"irq bypass producer (token %p) registration fails for irq %s: %d\n",
+		prod->token, irq_ctx->name, ret);
+}
+
+
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index 5cfe59a..adc1ba2 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -15,6 +15,7 @@
 #include <linux/pci.h>
 #include <linux/irqbypass.h>
 #include <linux/types.h>
+#include <linux/vfio.h>
 
 #ifndef VFIO_PCI_PRIVATE_H
 #define VFIO_PCI_PRIVATE_H
@@ -133,6 +134,10 @@ extern int vfio_pci_register_dev_region(struct vfio_pci_device *vdev,
 					unsigned int type, unsigned int subtype,
 					const struct vfio_pci_regops *ops,
 					size_t size, u32 flags, void *data);
+
+extern int vfio_pci_set_deoi(struct vfio_pci_device *vdev,
+			     struct vfio_pci_irq_ctx *irq_ctx, bool deoi);
+
 #ifdef CONFIG_VFIO_PCI_IGD
 extern int vfio_pci_igd_init(struct vfio_pci_device *vdev);
 #else
@@ -141,4 +146,31 @@ static inline int vfio_pci_igd_init(struct vfio_pci_device *vdev)
 	return -ENODEV;
 }
 #endif
+
+#ifdef CONFIG_VFIO_PCI_IRQ_BYPASS_DEOI
+/*
+ * Architecture specific irqbypass producer callbacks
+ */
+bool vfio_pci_has_deoi(void);
+void vfio_pci_register_deoi_producer(struct vfio_pci_device *vdev,
+				     struct vfio_pci_irq_ctx *irq_ctx,
+				     struct eventfd_ctx *trigger,
+				     unsigned int irq);
+#else
+static inline bool vfio_pci_has_deoi(void)
+{
+	return false;
+}
+static inline
+void vfio_pci_register_deoi_producer(struct vfio_pci_device *vdev,
+				     struct vfio_pci_irq_ctx *irq_ctx,
+				     struct eventfd_ctx *trigger,
+				     unsigned int irq) {}
+#endif
+
+void vfio_pci_register_default_producer(struct vfio_pci_device *vdev,
+					struct vfio_pci_irq_ctx *irq_ctx,
+					struct eventfd_ctx *trigger,
+					unsigned int irq);
+
 #endif /* VFIO_PCI_PRIVATE_H */
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 08/10] KVM: arm/arm64: vgic: Handle unshared mapped interrupts
  2017-05-24 20:13 [PATCH 00/10] ARM/ARM64 Direct EOI setup for VFIO wired interrupts Eric Auger
@ 2017-05-24 20:13   ` Eric Auger
  2017-05-24 20:13   ` Eric Auger
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Eric Auger @ 2017-05-24 20:13 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, linux-kernel, kvm, kvmarm,
	alex.williamson, pbonzini, marc.zyngier, christoffer.dall
  Cc: drjones, wei

Virtual interrupts directly mapped to physical interrupts require
special care: their pending and active states must be read at the
distributor level rather than from the list register.

Also, the level of a level-sensitive interrupt is not lowered by any
maintenance IRQ handler, since the EOI is not trapped.

This patch adds a host_irq field to struct vgic_irq so that the
irqchip state of the host IRQ can be queried easily. It also handles
the physical IRQ case in vgic_validate_injection and adds helpers
(irq_line_level, irq_is_active) that return the line level and
active state.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
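A minimal userspace sketch of the state lookup described above; it is
a model for illustration, not the kernel code. The model_* names, the
struct layout and the host_pending field (a stand-in for what
irq_get_irqchip_state() would report) are hypothetical, and the value
32 for the first SPI INTID is assumed to match VGIC_NR_PRIVATE_IRQS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values: 16 SGIs + 16 PPIs, SPIs end below INTID 1020. */
#define MODEL_NR_PRIVATE_IRQS	32
#define MODEL_SPI_INTID_END	1020

enum model_config { MODEL_CONFIG_EDGE, MODEL_CONFIG_LEVEL };

/* Hypothetical, reduced stand-in for struct vgic_irq. */
struct model_irq {
	uint32_t intid;
	bool hw;		/* tied to a HW IRQ */
	bool line_level;	/* level tracked in software */
	bool pending_latch;
	enum model_config config;
	bool host_pending;	/* stand-in for the irqchip pending state */
};

/* Same condition as the is_unshared_mapped() macro: a mapped SPI. */
static bool model_is_unshared_mapped(const struct model_irq *irq)
{
	assert(irq != NULL);
	return irq->hw && irq->intid >= MODEL_NR_PRIVATE_IRQS &&
	       irq->intid < MODEL_SPI_INTID_END;
}

/* Mapped SPIs take their level from the host side, others from software. */
static bool model_line_level(const struct model_irq *irq)
{
	if (model_is_unshared_mapped(irq))
		return irq->host_pending;
	return irq->line_level;
}

static bool model_is_pending(const struct model_irq *irq)
{
	if (irq->config == MODEL_CONFIG_EDGE)
		return irq->pending_latch;
	return irq->pending_latch || model_line_level(irq);
}

/* Injection is always accepted for level-sensitive mapped SPIs. */
static bool model_validate_injection(const struct model_irq *irq, bool level)
{
	if (irq->config == MODEL_CONFIG_EDGE)
		return level;
	if (model_is_unshared_mapped(irq))
		return true;
	return irq->line_level != level;
}
```

In this model a mapped SPI reports pending as soon as the host side
does, even though its software line_level was never raised, whereas a
mapped PPI keeps using the software state.
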
 include/kvm/arm_vgic.h    |  4 +++-
 virt/kvm/arm/arch_timer.c |  3 ++-
 virt/kvm/arm/vgic/vgic.c  | 44 ++++++++++++++++++++++++++++++++++++++------
 virt/kvm/arm/vgic/vgic.h  |  9 ++++++++-
 4 files changed, 51 insertions(+), 9 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index ef71858..695ebc7 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -112,6 +112,7 @@ struct vgic_irq {
 	bool hw;			/* Tied to HW IRQ */
 	struct kref refcount;		/* Used for LPIs */
 	u32 hwintid;			/* HW INTID number */
+	unsigned int host_irq;		/* linux irq corresponding to hwintid */
 	union {
 		u8 targets;			/* GICv2 target VCPUs mask */
 		u32 mpidr;			/* GICv3 target VCPU */
@@ -301,7 +302,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
 			bool level);
 int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int intid,
 			       bool level);
-int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq);
+int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
+			  u32 virt_irq, u32 phys_irq);
 int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int virt_irq);
 bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq);
 
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 5976609..45f4779 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -651,7 +651,8 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
 	 * Tell the VGIC that the virtual interrupt is tied to a
 	 * physical interrupt. We do that once per VCPU.
 	 */
-	ret = kvm_vgic_map_phys_irq(vcpu, vtimer->irq.irq, phys_irq);
+	ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq,
+				    vtimer->irq.irq, phys_irq);
 	if (ret)
 		return ret;
 
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 83b24d2..aa0618c 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -137,6 +137,28 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
 	kfree(irq);
 }
 
+bool irq_line_level(struct vgic_irq *irq)
+{
+	bool line_level = irq->line_level;
+
+	if (unlikely(is_unshared_mapped(irq)))
+		WARN_ON(irq_get_irqchip_state(irq->host_irq,
+					      IRQCHIP_STATE_PENDING,
+					      &line_level));
+	return line_level;
+}
+
+bool irq_is_active(struct vgic_irq *irq)
+{
+	bool is_active = irq->active;
+
+	if (unlikely(is_unshared_mapped(irq)))
+		WARN_ON(irq_get_irqchip_state(irq->host_irq,
+					      IRQCHIP_STATE_ACTIVE,
+					      &is_active));
+	return is_active;
+}
+
 /**
  * kvm_vgic_target_oracle - compute the target vcpu for an irq
  *
@@ -153,7 +175,7 @@ static struct kvm_vcpu *vgic_target_oracle(struct vgic_irq *irq)
 	DEBUG_SPINLOCK_BUG_ON(!spin_is_locked(&irq->irq_lock));
 
 	/* If the interrupt is active, it must stay on the current vcpu */
-	if (irq->active)
+	if (irq_is_active(irq))
 		return irq->vcpu ? : irq->target_vcpu;
 
 	/*
@@ -195,14 +217,18 @@ static int vgic_irq_cmp(void *priv, struct list_head *a, struct list_head *b)
 {
 	struct vgic_irq *irqa = container_of(a, struct vgic_irq, ap_list);
 	struct vgic_irq *irqb = container_of(b, struct vgic_irq, ap_list);
+	bool activea, activeb;
 	bool penda, pendb;
 	int ret;
 
 	spin_lock(&irqa->irq_lock);
 	spin_lock_nested(&irqb->irq_lock, SINGLE_DEPTH_NESTING);
 
-	if (irqa->active || irqb->active) {
-		ret = (int)irqb->active - (int)irqa->active;
+	activea = irq_is_active(irqa);
+	activeb = irq_is_active(irqb);
+
+	if (activea || activeb) {
+		ret = (int)activeb - (int)activea;
 		goto out;
 	}
 
@@ -234,13 +260,17 @@ static void vgic_sort_ap_list(struct kvm_vcpu *vcpu)
 
 /*
  * Only valid injection if changing level for level-triggered IRQs or for a
- * rising edge.
+ * rising edge. Injection of virtual interrupts associated with physical
+ * interrupts is always valid.
  */
 static bool vgic_validate_injection(struct vgic_irq *irq, bool level)
 {
 	switch (irq->config) {
 	case VGIC_CONFIG_LEVEL:
-		return irq->line_level != level;
+		if (unlikely(is_unshared_mapped(irq)))
+			return true;
+		else
+			return irq->line_level != level;
 	case VGIC_CONFIG_EDGE:
 		return level;
 	}
@@ -392,7 +422,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
 	return 0;
 }
 
-int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)
+int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
+			  u32 virt_irq, u32 phys_irq)
 {
 	struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, virt_irq);
 
@@ -402,6 +433,7 @@ int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)
 
 	irq->hw = true;
 	irq->hwintid = phys_irq;
+	irq->host_irq = host_irq;
 
 	spin_unlock(&irq->irq_lock);
 	vgic_put_irq(vcpu->kvm, irq);
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index da83e4c..dc4972b 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -17,6 +17,7 @@
 #define __KVM_ARM_VGIC_NEW_H__
 
 #include <linux/irqchip/arm-gic-common.h>
+#include <linux/interrupt.h>
 
 #define PRODUCT_ID_KVM		0x4b	/* ASCII code K */
 #define IMPLEMENTER_ARM		0x43b
@@ -96,14 +97,20 @@
 /* we only support 64 kB translation table page size */
 #define KVM_ITS_L1E_ADDR_MASK		GENMASK_ULL(51, 16)
 
+bool irq_line_level(struct vgic_irq *irq);
+bool irq_is_active(struct vgic_irq *irq);
+
 static inline bool irq_is_pending(struct vgic_irq *irq)
 {
 	if (irq->config == VGIC_CONFIG_EDGE)
 		return irq->pending_latch;
 	else
-		return irq->pending_latch || irq->line_level;
+		return irq->pending_latch || irq_line_level(irq);
 }
 
+#define is_unshared_mapped(i) \
+((i)->hw && (i)->intid >= VGIC_NR_PRIVATE_IRQS && (i)->intid < 1020)
+
 /*
  * This struct provides an intermediate representation of the fields contained
  * in the GICH_VMCR and ICH_VMCR registers, such that code exporting the GIC
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 09/10] KVM: arm/arm64: vgic: Implement forwarding setting
  2017-05-24 20:13 [PATCH 00/10] ARM/ARM64 Direct EOI setup for VFIO wired interrupts Eric Auger
@ 2017-05-24 20:13   ` Eric Auger
  2017-05-24 20:13   ` Eric Auger
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Eric Auger @ 2017-05-24 20:13 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, linux-kernel, kvm, kvmarm,
	alex.williamson, pbonzini, marc.zyngier, christoffer.dall
  Cc: drjones, wei

Implement kvm_vgic_[set|unset]_forwarding.

These functions handle the low-level VGIC programming and keep the
irqchip programming consistent with it.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
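A small userspace sketch of how the physical INTID is derived from the
host Linux IRQ in kvm_vgic_set_forwarding: the irq_data hierarchy is
walked up to its root and the root hwirq is used. The struct and
function names below are hypothetical stand-ins, not the kernel types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct irq_data. */
struct model_irq_data {
	unsigned long hwirq;			/* IRQ number in this domain */
	struct model_irq_data *parent_data;	/* NULL at the root domain */
};

/* Return the hwirq seen by the root irqchip (the GIC in this series). */
static unsigned long model_root_hwirq(struct model_irq_data *data)
{
	assert(data != NULL);
	while (data->parent_data)
		data = data->parent_data;
	return data->hwirq;
}
```

With a three-level chain (leaf, intermediate, root), the function
returns the root entry's hwirq whichever level it starts from.
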
---
 include/kvm/arm_vgic.h   |   5 +++
 virt/kvm/arm/vgic/vgic.c | 105 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 110 insertions(+)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 695ebc7..7ddac8a 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -343,4 +343,9 @@ int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi);
  */
 int kvm_vgic_setup_default_irq_routing(struct kvm *kvm);
 
+int kvm_vgic_set_forwarding(struct kvm *kvm, unsigned int irq,
+			    unsigned int virt_irq);
+void kvm_vgic_unset_forwarding(struct kvm *kvm, unsigned int irq,
+			       unsigned int virt_irq);
+
 #endif /* __KVM_ARM_VGIC_H */
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index aa0618c..c2add8d 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -17,6 +17,8 @@
 #include <linux/kvm.h>
 #include <linux/kvm_host.h>
 #include <linux/list_sort.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
 
 #include "vgic.h"
 
@@ -771,3 +773,106 @@ bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq)
 	return map_is_active;
 }
 
+/**
+ * kvm_vgic_set_forwarding - Set IRQ forwarding
+ *
+ * @kvm: kvm handle
+ * @host_irq: the host linux IRQ
+ * @vintid: the virtual INTID
+ *
+ * This function must be called when the IRQ is not active,
+ * i.e. not active at GIC level and not currently under injection
+ * into the guest using the unforwarded mode. The physical IRQ must
+ * be disabled and all vCPUs must have been exited and prevented
+ * from being re-entered.
+ */
+int kvm_vgic_set_forwarding(struct kvm *kvm, unsigned int host_irq,
+			    unsigned int vintid)
+{
+	struct kvm_vcpu *vcpu;
+	struct vgic_irq *irq;
+	struct irq_desc *desc;
+	struct irq_data *data;
+	unsigned int pintid;
+	int ret = 0;
+
+
+	kvm_debug("%s host linux irq=%d vintid=%d\n",
+		  __func__, host_irq, vintid);
+
+	if (!vgic_valid_spi(kvm, vintid))
+		return 0;
+
+	/* find the INTID corresponding to @host_irq */
+	desc = irq_to_desc(host_irq);
+	if (!desc) {
+		kvm_err("%s: no interrupt descriptor\n", __func__);
+		return -EINVAL;
+	}
+
+	data = irq_desc_get_irq_data(desc);
+	while (data->parent_data)
+		data = data->parent_data;
+
+	pintid = data->hwirq;
+
+	irq = vgic_get_irq(kvm, NULL, vintid);
+
+	spin_lock(&irq->irq_lock);
+
+	vcpu = irq->target_vcpu;
+
+	if (!vcpu) {
+		ret = -EAGAIN;
+		goto unlock;
+	}
+
+	irq_set_vcpu_affinity(host_irq, vcpu);
+
+	irq->hw = true;
+	irq->hwintid = pintid;
+	irq->host_irq = host_irq;
+
+unlock:
+	spin_unlock(&irq->irq_lock);
+	vgic_put_irq(kvm, irq);
+	return ret;
+}
+
+/**
+ * kvm_vgic_unset_forwarding - Unset IRQ forwarding
+ *
+ * @kvm: KVM handle
+ * @host_irq: the host Linux IRQ number
+ * @vintid: virtual INTID
+ *
+ * This function must be called when the host irq is disabled and
+ * all vCPUs have been exited and prevented from being re-entered.
+ */
+void kvm_vgic_unset_forwarding(struct kvm *kvm,
+			       unsigned int host_irq,
+			       unsigned int vintid)
+{
+	struct vgic_irq *irq;
+	bool active = false;
+
+	kvm_debug("%s host_irq=%d virt_irq=%d\n", __func__, host_irq, vintid);
+
+	irq_get_irqchip_state(host_irq, IRQCHIP_STATE_ACTIVE, &active);
+
+	irq = vgic_get_irq(kvm, NULL, vintid);
+	spin_lock(&irq->irq_lock);
+
+	if (!is_unshared_mapped(irq))
+		goto unlock;
+
+	if (active)
+		irq_set_irqchip_state(host_irq, IRQCHIP_STATE_ACTIVE, false);
+
+	irq->hw = false;
+	irq_set_vcpu_affinity(host_irq, NULL);
+
+unlock:
+	spin_unlock(&irq->irq_lock);
+}
+
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 10/10] KVM: arm/arm64: register DEOI irq bypass consumer on ARM/ARM64
  2017-05-24 20:13 [PATCH 00/10] ARM/ARM64 Direct EOI setup for VFIO wired interrupts Eric Auger
@ 2017-05-24 20:13   ` Eric Auger
  2017-05-24 20:13   ` Eric Auger
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Eric Auger @ 2017-05-24 20:13 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, linux-kernel, kvm, kvmarm,
	alex.williamson, pbonzini, marc.zyngier, christoffer.dall
  Cc: drjones, wei

This patch selects IRQ_BYPASS_MANAGER and HAVE_KVM_IRQ_BYPASS
configs for ARM/ARM64.

kvm_arch_has_irq_bypass() is now implemented and returns true.
As a consequence, the irq bypass consumer is registered for
ARM/ARM64 with the Direct EOI/IRQ forwarding callbacks:

- stop/start: halt/resume guest execution
- add/del_producer: set/unset forwarding/DEOI at vgic/irqchip level

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
---
 arch/arm/kvm/Kconfig   |  3 +++
 arch/arm64/kvm/Kconfig |  3 +++
 virt/kvm/arm/arm.c     | 42 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 48 insertions(+)

diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index 90d0176..4e2b192 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -3,6 +3,7 @@
 #
 
 source "virt/kvm/Kconfig"
+source "virt/lib/Kconfig"
 
 menuconfig VIRTUALIZATION
 	bool "Virtualization"
@@ -35,6 +36,8 @@ config KVM
 	select HAVE_KVM_IRQCHIP
 	select HAVE_KVM_IRQ_ROUTING
 	select HAVE_KVM_MSI
+	select IRQ_BYPASS_MANAGER
+	select HAVE_KVM_IRQ_BYPASS
 	depends on ARM_VIRT_EXT && ARM_LPAE && ARM_ARCH_TIMER
 	---help---
 	  Support hosting virtualized guest machines.
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 52cb7ad..7e0d6e6 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -3,6 +3,7 @@
 #
 
 source "virt/kvm/Kconfig"
+source "virt/lib/Kconfig"
 
 menuconfig VIRTUALIZATION
 	bool "Virtualization"
@@ -35,6 +36,8 @@ config KVM
 	select HAVE_KVM_MSI
 	select HAVE_KVM_IRQCHIP
 	select HAVE_KVM_IRQ_ROUTING
+	select IRQ_BYPASS_MANAGER
+	select HAVE_KVM_IRQ_BYPASS
 	---help---
 	  Support hosting virtualized guest machines.
 	  We don't support KVM with 16K page tables yet, due to the multiple
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 3417e18..0929185 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -27,6 +27,8 @@
 #include <linux/mman.h>
 #include <linux/sched.h>
 #include <linux/kvm.h>
+#include <linux/kvm_irqfd.h>
+#include <linux/irqbypass.h>
 #include <trace/events/kvm.h>
 #include <kvm/arm_pmu.h>
 
@@ -1420,6 +1422,46 @@ struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr)
 	return NULL;
 }
 
+bool kvm_arch_has_irq_bypass(void)
+{
+	return true;
+}
+
+int kvm_arch_irq_bypass_add_producer(struct irq_bypass_consumer *cons,
+				      struct irq_bypass_producer *prod)
+{
+	struct kvm_kernel_irqfd *irqfd =
+		container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+	return kvm_vgic_set_forwarding(irqfd->kvm, prod->irq,
+				       irqfd->gsi + VGIC_NR_PRIVATE_IRQS);
+}
+void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *cons,
+				      struct irq_bypass_producer *prod)
+{
+	struct kvm_kernel_irqfd *irqfd =
+		container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+	kvm_vgic_unset_forwarding(irqfd->kvm, prod->irq,
+				  irqfd->gsi + VGIC_NR_PRIVATE_IRQS);
+}
+
+void kvm_arch_irq_bypass_stop(struct irq_bypass_consumer *cons)
+{
+	struct kvm_kernel_irqfd *irqfd =
+		container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+	kvm_arm_halt_guest(irqfd->kvm);
+}
+
+void kvm_arch_irq_bypass_start(struct irq_bypass_consumer *cons)
+{
+	struct kvm_kernel_irqfd *irqfd =
+		container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+	kvm_arm_resume_guest(irqfd->kvm);
+}
+
 /**
  * Initialize Hyp-mode and memory mappings on all CPUs.
  */
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [PATCH 01/10] vfio: platform: Add automasked field to vfio_platform_irq
  2017-05-24 20:13   ` Eric Auger
@ 2017-05-25 18:05     ` Marc Zyngier
  -1 siblings, 0 replies; 69+ messages in thread
From: Marc Zyngier @ 2017-05-25 18:05 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, alex.williamson,
	pbonzini, christoffer.dall, drjones, wei

Hi Eric,

On Wed, May 24 2017 at 10:13:14 pm BST, Eric Auger <eric.auger@redhat.com> wrote:
> For direct EOI modality we will need to differentiate a userspace
> masking from the IRQ handler auto-masking.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  drivers/vfio/platform/vfio_platform_irq.c     | 10 ++++++----
>  drivers/vfio/platform/vfio_platform_private.h |  1 +
>  2 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> index 46d4750..831f0b0 100644
> --- a/drivers/vfio/platform/vfio_platform_irq.c
> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> @@ -29,7 +29,7 @@ static void vfio_platform_mask(struct vfio_platform_irq *irq_ctx)
>  
>  	spin_lock_irqsave(&irq_ctx->lock, flags);
>  
> -	if (!irq_ctx->masked) {
> +	if (!irq_ctx->masked && !irq_ctx->automasked) {

Could you please expand a bit on what this automasked variable covers?
It'd be good to document how masked and automasked differ in behaviour.

Also, it may be worth having a helper (is_masked?) to abstract both
cases.
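
A minimal sketch of what such a helper could look like. The struct is a
cut-down stand-in for struct vfio_platform_irq (only the two flags from
this patch are modelled), and the name vfio_platform_irq_is_masked() is
hypothetical. In the driver the caller would already hold irq_ctx->lock.

```c
#include <assert.h>	/* only used by checks built on top of this sketch */
#include <stdbool.h>

/* Stand-in for struct vfio_platform_irq: only the two flags are modelled. */
struct vfio_platform_irq {
	bool masked;		/* masked on request from userspace */
	bool automasked;	/* masked by the automasked IRQ handler */
};

/* Hypothetical helper: true if the IRQ is disabled for either reason. */
static inline bool
vfio_platform_irq_is_masked(const struct vfio_platform_irq *irq_ctx)
{
	return irq_ctx->masked || irq_ctx->automasked;
}
```

With something like this, the mask path and the automasked handler could
both test !vfio_platform_irq_is_masked(irq_ctx), and the unmask path
could test vfio_platform_irq_is_masked(irq_ctx).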

>  		disable_irq_nosync(irq_ctx->hwirq);
>  		irq_ctx->masked = true;
>  	}
> @@ -89,9 +89,10 @@ static void vfio_platform_unmask(struct vfio_platform_irq *irq_ctx)
>  
>  	spin_lock_irqsave(&irq_ctx->lock, flags);
>  
> -	if (irq_ctx->masked) {
> +	if (irq_ctx->masked || irq_ctx->automasked) {
>  		enable_irq(irq_ctx->hwirq);
>  		irq_ctx->masked = false;
> +		irq_ctx->automasked = false;
>  	}
>  
>  	spin_unlock_irqrestore(&irq_ctx->lock, flags);
> @@ -152,12 +153,12 @@ static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
>  
>  	spin_lock_irqsave(&irq_ctx->lock, flags);
>  
> -	if (!irq_ctx->masked) {
> +	if (!irq_ctx->masked && !irq_ctx->automasked) {
>  		ret = IRQ_HANDLED;
>  
>  		/* automask maskable interrupts */
>  		disable_irq_nosync(irq_ctx->hwirq);
> -		irq_ctx->masked = true;
> +		irq_ctx->automasked = true;
>  	}
>  
>  	spin_unlock_irqrestore(&irq_ctx->lock, flags);
> @@ -315,6 +316,7 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev)
>  		vdev->irqs[i].count = 1;
>  		vdev->irqs[i].hwirq = hwirq;
>  		vdev->irqs[i].masked = false;
> +		vdev->irqs[i].automasked = false;
>  	}
>  
>  	vdev->num_irqs = cnt;
> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
> index 85ffe5d..8a3cfa9 100644
> --- a/drivers/vfio/platform/vfio_platform_private.h
> +++ b/drivers/vfio/platform/vfio_platform_private.h
> @@ -34,6 +34,7 @@ struct vfio_platform_irq {
>  	char			*name;
>  	struct eventfd_ctx	*trigger;
>  	bool			masked;
> +	bool			automasked;
>  	spinlock_t		lock;
>  	struct virqfd		*unmask;
>  	struct virqfd		*mask;

Thanks,

	M.
-- 
Jazz is not dead, it just smell funny.

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 08/10] KVM: arm/arm64: vgic: Handle unshared mapped interrupts
  2017-05-24 20:13   ` Eric Auger
@ 2017-05-25 19:14     ` Marc Zyngier
  -1 siblings, 0 replies; 69+ messages in thread
From: Marc Zyngier @ 2017-05-25 19:14 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, alex.williamson,
	pbonzini, christoffer.dall, drjones, wei

On Wed, May 24 2017 at 10:13:21 pm BST, Eric Auger <eric.auger@redhat.com> wrote:
> Virtual interrupts directly mapped to physical interrupts require
> some special care. Their pending and active state must be observed
> at distributor level and not in the list register.
>
> Also a level sensitive interrupt's level is not toggled down by any
> maintenance IRQ handler as the EOI is not trapped.
>
> This patch adds an host_irq field in vgic_irq struct to easily
> get the irqchip state of the host irq. We also handle the
> physical IRQ case in vgic_validate_injection and add helpers to
> get the line level and active state.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  include/kvm/arm_vgic.h    |  4 +++-
>  virt/kvm/arm/arch_timer.c |  3 ++-
>  virt/kvm/arm/vgic/vgic.c  | 44 ++++++++++++++++++++++++++++++++++++++------
>  virt/kvm/arm/vgic/vgic.h  |  9 ++++++++-
>  4 files changed, 51 insertions(+), 9 deletions(-)
>
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index ef71858..695ebc7 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -112,6 +112,7 @@ struct vgic_irq {
>  	bool hw;			/* Tied to HW IRQ */
>  	struct kref refcount;		/* Used for LPIs */
>  	u32 hwintid;			/* HW INTID number */
> +	unsigned int host_irq;		/* linux irq corresponding to hwintid */
>  	union {
>  		u8 targets;			/* GICv2 target VCPUs mask */
>  		u32 mpidr;			/* GICv3 target VCPU */
> @@ -301,7 +302,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>  			bool level);
>  int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>  			       bool level);
> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq);
> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
> +			  u32 virt_irq, u32 phys_irq);
>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int virt_irq);
>  bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq);
>  
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index 5976609..45f4779 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -651,7 +651,8 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
>  	 * Tell the VGIC that the virtual interrupt is tied to a
>  	 * physical interrupt. We do that once per VCPU.
>  	 */
> -	ret = kvm_vgic_map_phys_irq(vcpu, vtimer->irq.irq, phys_irq);
> +	ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq,
> +				    vtimer->irq.irq, phys_irq);
>  	if (ret)
>  		return ret;
>  
> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> index 83b24d2..aa0618c 100644
> --- a/virt/kvm/arm/vgic/vgic.c
> +++ b/virt/kvm/arm/vgic/vgic.c
> @@ -137,6 +137,28 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
>  	kfree(irq);
>  }
>  
> +bool irq_line_level(struct vgic_irq *irq)
> +{
> +	bool line_level = irq->line_level;
> +
> +	if (unlikely(is_unshared_mapped(irq)))

The "unshared" bit doesn't mean much to me. Do you want to say "an
interrupt that belongs to a device only accessed by a single VM"?

Given that this can only be an SPI, can we use something like
"is_mapped_spi()" instead? I find it a lot more readable, but I'm open
to alternative suggestions.
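
A minimal sketch of the predicate under that name, written as a static
inline instead of a macro. The struct is a stand-in for struct vgic_irq
holding only the two fields the test reads. The function name
vgic_irq_is_mapped_spi() and the local constant definitions are
assumptions made for this illustration; the range check is the one from
the is_unshared_mapped() macro in this patch.

```c
#include <assert.h>	/* only used by checks built on top of this sketch */
#include <stdbool.h>
#include <stdint.h>

/* Defined locally for the sketch: 16 SGIs + 16 PPIs come before the SPIs. */
#define VGIC_NR_PRIVATE_IRQS	32
/* Defined locally for the sketch: same bound as "intid < 1020" in the patch. */
#define VGIC_MAX_SPI		1019

/* Stand-in for struct vgic_irq: only the fields the predicate reads. */
struct vgic_irq {
	uint32_t intid;	/* guest visible INTID */
	bool hw;	/* tied to a HW IRQ */
};

/* True for an SPI that is backed by a physical interrupt. */
static inline bool vgic_irq_is_mapped_spi(const struct vgic_irq *irq)
{
	return irq->hw &&
	       irq->intid >= VGIC_NR_PRIVATE_IRQS &&
	       irq->intid <= VGIC_MAX_SPI;
}
```

A mapped PPI (hw set, INTID below 32) is deliberately not matched, so a
private interrupt with hw set keeps the existing behaviour.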

> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
> +					      IRQCHIP_STATE_PENDING,
> +					      &line_level));
> +	return line_level;
> +}
> +
> +bool irq_is_active(struct vgic_irq *irq)
> +{
> +	bool is_active = irq->active;
> +
> +	if (unlikely(is_unshared_mapped(irq)))
> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
> +					      IRQCHIP_STATE_ACTIVE,
> +					      &is_active));
> +	return is_active;
> +}
> +
>  /**
>   * kvm_vgic_target_oracle - compute the target vcpu for an irq
>   *
> @@ -153,7 +175,7 @@ static struct kvm_vcpu *vgic_target_oracle(struct vgic_irq *irq)
>  	DEBUG_SPINLOCK_BUG_ON(!spin_is_locked(&irq->irq_lock));
>  
>  	/* If the interrupt is active, it must stay on the current vcpu */
> -	if (irq->active)
> +	if (irq_is_active(irq))
>  		return irq->vcpu ? : irq->target_vcpu;
>  
>  	/*
> @@ -195,14 +217,18 @@ static int vgic_irq_cmp(void *priv, struct list_head *a, struct list_head *b)
>  {
>  	struct vgic_irq *irqa = container_of(a, struct vgic_irq, ap_list);
>  	struct vgic_irq *irqb = container_of(b, struct vgic_irq, ap_list);
> +	bool activea, activeb;
>  	bool penda, pendb;
>  	int ret;
>  
>  	spin_lock(&irqa->irq_lock);
>  	spin_lock_nested(&irqb->irq_lock, SINGLE_DEPTH_NESTING);
>  
> -	if (irqa->active || irqb->active) {
> -		ret = (int)irqb->active - (int)irqa->active;
> +	activea = irq_is_active(irqa);
> +	activeb = irq_is_active(irqb);
> +
> +	if (activea || activeb) {
> +		ret = (int)activeb - (int)activea;
>  		goto out;
>  	}
>  
> @@ -234,13 +260,17 @@ static void vgic_sort_ap_list(struct kvm_vcpu *vcpu)
>  
>  /*
>   * Only valid injection if changing level for level-triggered IRQs or for a
> - * rising edge.
> + * rising edge. Injection of virtual interrupts associated to physical
> + * interrupts always is valid.
>   */
>  static bool vgic_validate_injection(struct vgic_irq *irq, bool level)
>  {
>  	switch (irq->config) {
>  	case VGIC_CONFIG_LEVEL:
> -		return irq->line_level != level;
> +		if (unlikely(is_unshared_mapped(irq)))
> +			return true;
> +		else
> +			return irq->line_level != level;

This would be more readable as:

		return (irq->line_level != level ||
			unlikely(is_unshared_mapped(irq)));

>  	case VGIC_CONFIG_EDGE:
>  		return level;
>  	}
> @@ -392,7 +422,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>  	return 0;
>  }
>  
> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)
> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
> +			  u32 virt_irq, u32 phys_irq)
>  {
>  	struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, virt_irq);
>  
> @@ -402,6 +433,7 @@ int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)
>  
>  	irq->hw = true;
>  	irq->hwintid = phys_irq;
> +	irq->host_irq = host_irq;

If you're now passing the Linux IRQ to the mapping function, you might
as well move the code that extracts the host hwirq here too.
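
The extraction in question is the parent_data walk that patch 09/10
open-codes in kvm_vgic_set_forwarding(). A minimal sketch of it as a
standalone helper, modelled in plain C: struct irq_data here is a
stand-in reduced to the two fields the walk uses, and root_hwirq() is a
hypothetical name. In the kernel the starting irq_data would come from
irq_to_desc() and irq_desc_get_irq_data(), as in that patch.

```c
#include <assert.h>	/* only used by checks built on top of this sketch */
#include <stddef.h>

/* Stand-in for the kernel's struct irq_data, reduced to the fields used. */
struct irq_data {
	unsigned long hwirq;		/* hardware IRQ number in this domain */
	struct irq_data *parent_data;	/* next level up, NULL at the root */
};

/* Hypothetical helper: walk up to the root domain and return its hwirq. */
static inline unsigned long root_hwirq(struct irq_data *data)
{
	while (data->parent_data)
		data = data->parent_data;

	return data->hwirq;
}
```

With the walk in one place, the mapping function could derive the
physical INTID from the Linux IRQ itself rather than take both as
arguments.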

>  
>  	spin_unlock(&irq->irq_lock);
>  	vgic_put_irq(vcpu->kvm, irq);
> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
> index da83e4c..dc4972b 100644
> --- a/virt/kvm/arm/vgic/vgic.h
> +++ b/virt/kvm/arm/vgic/vgic.h
> @@ -17,6 +17,7 @@
>  #define __KVM_ARM_VGIC_NEW_H__
>  
>  #include <linux/irqchip/arm-gic-common.h>
> +#include <linux/interrupt.h>
>  
>  #define PRODUCT_ID_KVM		0x4b	/* ASCII code K */
>  #define IMPLEMENTER_ARM		0x43b
> @@ -96,14 +97,20 @@
>  /* we only support 64 kB translation table page size */
>  #define KVM_ITS_L1E_ADDR_MASK		GENMASK_ULL(51, 16)
>  
> +bool irq_line_level(struct vgic_irq *irq);
> +bool irq_is_active(struct vgic_irq *irq);
> +
>  static inline bool irq_is_pending(struct vgic_irq *irq)
>  {
>  	if (irq->config == VGIC_CONFIG_EDGE)
>  		return irq->pending_latch;
>  	else
> -		return irq->pending_latch || irq->line_level;
> +		return irq->pending_latch || irq_line_level(irq);
>  }
>  
> +#define is_unshared_mapped(i) \
> +((i)->hw && (i)->intid >= VGIC_NR_PRIVATE_IRQS && (i)->intid < 1020)
> +
>  /*
>   * This struct provides an intermediate representation of the fields contained
>   * in the GICH_VMCR and ICH_VMCR registers, such that code exporting the GIC

Thanks,

	M.
-- 
Jazz is not dead, it just smell funny.

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 09/10] KVM: arm/arm64: vgic: Implement forwarding setting
  2017-05-24 20:13   ` Eric Auger
@ 2017-05-25 19:19     ` Marc Zyngier
  -1 siblings, 0 replies; 69+ messages in thread
From: Marc Zyngier @ 2017-05-25 19:19 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, alex.williamson,
	pbonzini, christoffer.dall, drjones, wei

On Wed, May 24 2017 at 10:13:22 pm BST, Eric Auger <eric.auger@redhat.com> wrote:
> Implements kvm_vgic_[set|unset]_forwarding.
>
> Handle low-level VGIC programming and consistent irqchip
> programming.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
> ---
>  include/kvm/arm_vgic.h   |   5 +++
>  virt/kvm/arm/vgic/vgic.c | 105 +++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 110 insertions(+)
>
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 695ebc7..7ddac8a 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -343,4 +343,9 @@ int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi);
>   */
>  int kvm_vgic_setup_default_irq_routing(struct kvm *kvm);
>  
> +int kvm_vgic_set_forwarding(struct kvm *kvm, unsigned int irq,
> +			    unsigned int virt_irq);
> +void kvm_vgic_unset_forwarding(struct kvm *kvm, unsigned int irq,
> +			       unsigned int virt_irq);

nit: the names of the parameters do not match those of the function
definition, where they are much clearer.

> +
>  #endif /* __KVM_ARM_VGIC_H */
> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> index aa0618c..c2add8d 100644
> --- a/virt/kvm/arm/vgic/vgic.c
> +++ b/virt/kvm/arm/vgic/vgic.c
> @@ -17,6 +17,8 @@
>  #include <linux/kvm.h>
>  #include <linux/kvm_host.h>
>  #include <linux/list_sort.h>
> +#include <linux/interrupt.h>
> +#include <linux/irq.h>
>  
>  #include "vgic.h"
>  
> @@ -771,3 +773,106 @@ bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq)
>  	return map_is_active;
>  }
>  
> +/**
> + * kvm_vgic_set_forwarding - Set IRQ forwarding
> + *
> + * @kvm: kvm handle
> + * @host_irq: the host linux IRQ
> + * @vintid: the virtual INTID
> + *
> + * This function must be called when the IRQ is not active:
> + * ie. not active at GIC level and not currently under injection
> + * into the guest using the unforwarded mode. The physical IRQ must
> + * be disabled and all vCPUs must have been exited and prevented
> + * from being re-entered.
> + */
> +int kvm_vgic_set_forwarding(struct kvm *kvm, unsigned int host_irq,
> +			    unsigned int vintid)
> +{
> +	struct kvm_vcpu *vcpu;
> +	struct vgic_irq *irq;
> +	struct irq_desc *desc;
> +	struct irq_data *data;
> +	unsigned int pintid;
> +	int ret = 0;
> +
> +
> +	kvm_debug("%s host linux irq=%d vintid=%d\n",
> +		  __func__, host_irq, vintid);
> +
> +	if (!vgic_valid_spi(kvm, vintid))
> +		return 0;
> +
> +	/* find the INTID corresponding to @host_irq */
> +	desc = irq_to_desc(host_irq);
> +	if (!desc) {
> +		kvm_err("%s: no interrupt descriptor\n", __func__);
> +		return -EINVAL;
> +	}
> +
> +	data = irq_desc_get_irq_data(desc);
> +	while (data->parent_data)
> +		data = data->parent_data;
> +
> +	pintid = data->hwirq;
> +
> +	irq = vgic_get_irq(kvm, NULL, vintid);
> +
> +	spin_lock(&irq->irq_lock);
> +
> +	vcpu = irq->target_vcpu;
> +
> +	if (!vcpu) {
> +		ret = -EAGAIN;
> +		goto unlock;
> +	}
> +
> +	irq_set_vcpu_affinity(host_irq, vcpu);
> +
> +	irq->hw = true;
> +	irq->hwintid = pintid;
> +	irq->host_irq = host_irq;

This feels like a duplication of kvm_vgic_map_phys_irq(), especially if
you move the pintid discovery there. Can we somehow unify them?
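
For illustration only (this is not part of the posted patch), the
unification could look roughly like the userspace sketch below. The
model_* types and functions are hypothetical stand-ins for struct
irq_data, struct vgic_irq and kvm_vgic_map_phys_irq(); locking and
reference counting are deliberately left out. The idea is that the
mapping helper receives the Linux IRQ and derives the hardware INTID
itself by walking to the root of the irq_data hierarchy, so both the
timer path and the forwarding path could share it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct irq_data. */
struct model_irq_data {
	unsigned long hwirq;                /* hardware IRQ number at this level */
	struct model_irq_data *parent_data; /* next level up, NULL at the root */
};

/* Hypothetical stand-in for struct vgic_irq, mapping fields only. */
struct model_vgic_irq {
	bool hw;               /* tied to a HW IRQ */
	unsigned int hwintid;  /* HW INTID number */
	unsigned int host_irq; /* Linux IRQ corresponding to hwintid */
};

/* Walk to the root of the hierarchy, as the patch does to find pintid. */
static unsigned int model_root_hwirq(const struct model_irq_data *data)
{
	assert(data != NULL);
	while (data->parent_data)
		data = data->parent_data;
	return (unsigned int)data->hwirq;
}

/* Single mapping helper: the caller passes the Linux IRQ, not the INTID. */
static int model_map_phys_irq(struct model_vgic_irq *irq,
			      unsigned int host_irq,
			      const struct model_irq_data *data)
{
	if (!irq || !data)
		return -1;

	irq->hw = true;
	irq->hwintid = model_root_hwirq(data);
	irq->host_irq = host_irq;
	return 0;
}
```

With something of this shape, kvm_vgic_set_forwarding() would only keep
the irq_set_vcpu_affinity() call and the validity checks on top of the
shared helper.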

> +
> +unlock:
> +	spin_unlock(&irq->irq_lock);
> +	vgic_put_irq(kvm, irq);
> +	return ret;
> +}
> +
> +/**
> + * kvm_vgic_unset_forwarding - Unset IRQ forwarding
> + *
> + * @kvm: KVM handle
> + * @host_irq: the host Linux IRQ number
> + * @vintid: virtual INTID
> + *
> + * This function must be called when the host irq is disabled and
> + * all vCPUs have been exited and prevented from being re-entered.
> + */
> +void kvm_vgic_unset_forwarding(struct kvm *kvm,
> +			       unsigned int host_irq,
> +			       unsigned int vintid)
> +{
> +	struct vgic_irq *irq;
> +	bool active;
> +
> +	kvm_debug("%s host_irq=%d virt_irq=%d\n", __func__, host_irq, vintid);
> +
> +	irq_get_irqchip_state(host_irq, IRQCHIP_STATE_ACTIVE, &active);
> +
> +	irq = vgic_get_irq(kvm, NULL, vintid);
> +	spin_lock(&irq->irq_lock);
> +
> +	if (!is_unshared_mapped(irq))
> +		goto unlock;
> +
> +	if (active)
> +		irq_set_irqchip_state(host_irq, IRQCHIP_STATE_ACTIVE, false);
> +
> +	irq->hw = false;
> +	irq_set_vcpu_affinity(host_irq, NULL);
> +
> +unlock:
> +	spin_unlock(&irq->irq_lock);

Same here.

> +}
> +


Thanks,

        M.
-- 
Jazz is not dead, it just smell funny.

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 01/10] vfio: platform: Add automasked field to vfio_platform_irq
  2017-05-25 18:05     ` Marc Zyngier
  (?)
@ 2017-05-30 12:45     ` Auger Eric
  2017-05-31 17:41         ` Alex Williamson
  -1 siblings, 1 reply; 69+ messages in thread
From: Auger Eric @ 2017-05-30 12:45 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, alex.williamson,
	pbonzini, christoffer.dall, drjones, wei

Hi Marc,

On 25/05/2017 20:05, Marc Zyngier wrote:
> Hi Eric,
> 
> On Wed, May 24 2017 at 10:13:14 pm BST, Eric Auger <eric.auger@redhat.com> wrote:
>> For direct EOI modality we will need to differentiate a userspace
>> masking from the IRQ handler auto-masking.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>>  drivers/vfio/platform/vfio_platform_irq.c     | 10 ++++++----
>>  drivers/vfio/platform/vfio_platform_private.h |  1 +
>>  2 files changed, 7 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
>> index 46d4750..831f0b0 100644
>> --- a/drivers/vfio/platform/vfio_platform_irq.c
>> +++ b/drivers/vfio/platform/vfio_platform_irq.c
>> @@ -29,7 +29,7 @@ static void vfio_platform_mask(struct vfio_platform_irq *irq_ctx)
>>  
>>  	spin_lock_irqsave(&irq_ctx->lock, flags);
>>  
>> -	if (!irq_ctx->masked) {
>> +	if (!irq_ctx->masked && !irq_ctx->automasked) {
> 
> Could you please expand a bit on what this automasked variable covers?
> It'd be good to document how masked and automasked differ in behaviour.

Yes, sure. automasked is set only by the physical IRQ handler, for
level-sensitive (AUTOMASKED) interrupts. masked is set through the
userspace API (VFIO_DEVICE_SET_IRQS with ACTION_MASK) when masking the
IRQ. VFIO ACTION_UNMASK clears both.
> 
> Also, it may be worth having a helper (is_masked?) to abstract both
> cases.

Sure

Eric
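
As a rough illustration of the masked/automasked distinction and of the
suggested helper, here is a small userspace model. It is only a sketch:
the model_* names are hypothetical, the spinlock is omitted, and
disable_depth merely stands in for the disable_irq_nosync()/enable_irq()
calls made by the real driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct vfio_platform_irq. */
struct model_platform_irq {
	bool masked;       /* set by userspace through the mask action */
	bool automasked;   /* set by the level-sensitive IRQ handler */
	int disable_depth; /* models disable_irq_nosync()/enable_irq() */
};

/* The helper suggested in review: covers both masking sources. */
static bool model_is_masked(const struct model_platform_irq *irq)
{
	return irq->masked || irq->automasked;
}

/* Userspace mask: only disables the IRQ if nothing masked it already. */
static void model_user_mask(struct model_platform_irq *irq)
{
	if (!model_is_masked(irq)) {
		irq->disable_depth++;
		irq->masked = true;
	}
}

/* Automasked handler: returns true if the interrupt was handled. */
static bool model_automask_handler(struct model_platform_irq *irq)
{
	if (model_is_masked(irq))
		return false;

	irq->disable_depth++;
	irq->automasked = true;
	return true;
}

/* Unmask: re-enables the IRQ once and clears both flags. */
static void model_unmask(struct model_platform_irq *irq)
{
	if (model_is_masked(irq)) {
		assert(irq->disable_depth > 0);
		irq->disable_depth--;
		irq->masked = false;
		irq->automasked = false;
	}
}
```

Note that, as in the patch, a userspace mask issued while the IRQ is
already automasked leaves masked unset, and a single unmask clears
whichever flag was set.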
> 
>>  		disable_irq_nosync(irq_ctx->hwirq);
>>  		irq_ctx->masked = true;
>>  	}
>> @@ -89,9 +89,10 @@ static void vfio_platform_unmask(struct vfio_platform_irq *irq_ctx)
>>  
>>  	spin_lock_irqsave(&irq_ctx->lock, flags);
>>  
>> -	if (irq_ctx->masked) {
>> +	if (irq_ctx->masked || irq_ctx->automasked) {
>>  		enable_irq(irq_ctx->hwirq);
>>  		irq_ctx->masked = false;
>> +		irq_ctx->automasked = false;
>>  	}
>>  
>>  	spin_unlock_irqrestore(&irq_ctx->lock, flags);
>> @@ -152,12 +153,12 @@ static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
>>  
>>  	spin_lock_irqsave(&irq_ctx->lock, flags);
>>  
>> -	if (!irq_ctx->masked) {
>> +	if (!irq_ctx->masked && !irq_ctx->automasked) {
>>  		ret = IRQ_HANDLED;
>>  
>>  		/* automask maskable interrupts */
>>  		disable_irq_nosync(irq_ctx->hwirq);
>> -		irq_ctx->masked = true;
>> +		irq_ctx->automasked = true;
>>  	}
>>  
>>  	spin_unlock_irqrestore(&irq_ctx->lock, flags);
>> @@ -315,6 +316,7 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev)
>>  		vdev->irqs[i].count = 1;
>>  		vdev->irqs[i].hwirq = hwirq;
>>  		vdev->irqs[i].masked = false;
>> +		vdev->irqs[i].automasked = false;
>>  	}
>>  
>>  	vdev->num_irqs = cnt;
>> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
>> index 85ffe5d..8a3cfa9 100644
>> --- a/drivers/vfio/platform/vfio_platform_private.h
>> +++ b/drivers/vfio/platform/vfio_platform_private.h
>> @@ -34,6 +34,7 @@ struct vfio_platform_irq {
>>  	char			*name;
>>  	struct eventfd_ctx	*trigger;
>>  	bool			masked;
>> +	bool			automasked;
>>  	spinlock_t		lock;
>>  	struct virqfd		*unmask;
>>  	struct virqfd		*mask;
> 
> Thanks,
> 
> 	M.
> 

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 08/10] KVM: arm/arm64: vgic: Handle unshared mapped interrupts
  2017-05-25 19:14     ` Marc Zyngier
@ 2017-05-30 12:50       ` Auger Eric
  -1 siblings, 0 replies; 69+ messages in thread
From: Auger Eric @ 2017-05-30 12:50 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, alex.williamson,
	pbonzini, christoffer.dall, drjones, wei

Hi Marc,

On 25/05/2017 21:14, Marc Zyngier wrote:
> On Wed, May 24 2017 at 10:13:21 pm BST, Eric Auger <eric.auger@redhat.com> wrote:
>> Virtual interrupts directly mapped to physical interrupts require
>> some special care. Their pending and active state must be observed
>> at distributor level and not in the list register.
>>
>> Also a level sensitive interrupt's level is not toggled down by any
>> maintenance IRQ handler as the EOI is not trapped.
>>
>> This patch adds an host_irq field in vgic_irq struct to easily
>> get the irqchip state of the host irq. We also handle the
>> physical IRQ case in vgic_validate_injection and add helpers to
>> get the line level and active state.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>>  include/kvm/arm_vgic.h    |  4 +++-
>>  virt/kvm/arm/arch_timer.c |  3 ++-
>>  virt/kvm/arm/vgic/vgic.c  | 44 ++++++++++++++++++++++++++++++++++++++------
>>  virt/kvm/arm/vgic/vgic.h  |  9 ++++++++-
>>  4 files changed, 51 insertions(+), 9 deletions(-)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index ef71858..695ebc7 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -112,6 +112,7 @@ struct vgic_irq {
>>  	bool hw;			/* Tied to HW IRQ */
>>  	struct kref refcount;		/* Used for LPIs */
>>  	u32 hwintid;			/* HW INTID number */
>> +	unsigned int host_irq;		/* linux irq corresponding to hwintid */
>>  	union {
>>  		u8 targets;			/* GICv2 target VCPUs mask */
>>  		u32 mpidr;			/* GICv3 target VCPU */
>> @@ -301,7 +302,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>>  			bool level);
>>  int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>>  			       bool level);
>> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq);
>> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
>> +			  u32 virt_irq, u32 phys_irq);
>>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int virt_irq);
>>  bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq);
>>  
>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>> index 5976609..45f4779 100644
>> --- a/virt/kvm/arm/arch_timer.c
>> +++ b/virt/kvm/arm/arch_timer.c
>> @@ -651,7 +651,8 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
>>  	 * Tell the VGIC that the virtual interrupt is tied to a
>>  	 * physical interrupt. We do that once per VCPU.
>>  	 */
>> -	ret = kvm_vgic_map_phys_irq(vcpu, vtimer->irq.irq, phys_irq);
>> +	ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq,
>> +				    vtimer->irq.irq, phys_irq);
>>  	if (ret)
>>  		return ret;
>>  
>> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
>> index 83b24d2..aa0618c 100644
>> --- a/virt/kvm/arm/vgic/vgic.c
>> +++ b/virt/kvm/arm/vgic/vgic.c
>> @@ -137,6 +137,28 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
>>  	kfree(irq);
>>  }
>>  
>> +bool irq_line_level(struct vgic_irq *irq)
>> +{
>> +	bool line_level = irq->line_level;
>> +
>> +	if (unlikely(is_unshared_mapped(irq)))
> 
> The "unshared" bit doesn't mean much to me. Do you want to say "an
> interrupt that belongs to a device only accessed by a single VM"?

Yes. This was the former naming: the timer used a shared HW IRQ and the
others were dubbed unshared (https://lkml.org/lkml/2015/11/19/362).
> 
> Given that this can only be an SPI, can we use something like
> "is_mapped_spi()" instead? I find it a lot more readable, but I'm open
> to alternative suggestions.
Yep!

Thanks

Eric
> 
>> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
>> +					      IRQCHIP_STATE_PENDING,
>> +					      &line_level));
>> +	return line_level;
>> +}
>> +
>> +bool irq_is_active(struct vgic_irq *irq)
>> +{
>> +	bool is_active = irq->active;
>> +
>> +	if (unlikely(is_unshared_mapped(irq)))
>> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
>> +					      IRQCHIP_STATE_ACTIVE,
>> +					      &is_active));
>> +	return is_active;
>> +}
>> +
>>  /**
>>   * kvm_vgic_target_oracle - compute the target vcpu for an irq
>>   *
>> @@ -153,7 +175,7 @@ static struct kvm_vcpu *vgic_target_oracle(struct vgic_irq *irq)
>>  	DEBUG_SPINLOCK_BUG_ON(!spin_is_locked(&irq->irq_lock));
>>  
>>  	/* If the interrupt is active, it must stay on the current vcpu */
>> -	if (irq->active)
>> +	if (irq_is_active(irq))
>>  		return irq->vcpu ? : irq->target_vcpu;
>>  
>>  	/*
>> @@ -195,14 +217,18 @@ static int vgic_irq_cmp(void *priv, struct list_head *a, struct list_head *b)
>>  {
>>  	struct vgic_irq *irqa = container_of(a, struct vgic_irq, ap_list);
>>  	struct vgic_irq *irqb = container_of(b, struct vgic_irq, ap_list);
>> +	bool activea, activeb;
>>  	bool penda, pendb;
>>  	int ret;
>>  
>>  	spin_lock(&irqa->irq_lock);
>>  	spin_lock_nested(&irqb->irq_lock, SINGLE_DEPTH_NESTING);
>>  
>> -	if (irqa->active || irqb->active) {
>> -		ret = (int)irqb->active - (int)irqa->active;
>> +	activea = irq_is_active(irqa);
>> +	activeb = irq_is_active(irqb);
>> +
>> +	if (activea || activeb) {
>> +		ret = (int)activeb - (int)activea;
>>  		goto out;
>>  	}
>>  
>> @@ -234,13 +260,17 @@ static void vgic_sort_ap_list(struct kvm_vcpu *vcpu)
>>  
>>  /*
>>   * Only valid injection if changing level for level-triggered IRQs or for a
>> - * rising edge.
>> + * rising edge. Injection of virtual interrupts associated to physical
>> + * interrupts always is valid.
>>   */
>>  static bool vgic_validate_injection(struct vgic_irq *irq, bool level)
>>  {
>>  	switch (irq->config) {
>>  	case VGIC_CONFIG_LEVEL:
>> -		return irq->line_level != level;
>> +		if (unlikely(is_unshared_mapped(irq)))
>> +			return true;
>> +		else
>> +			return irq->line_level != level;
> 
> This would be more readable as:
> 
> 		return (irq->line_level != level ||
>                 	unlikely(is_unshared_mapped(irq)));

OK

Thanks

Eric
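
To make the two agreed changes concrete, here is a userspace sketch of
what the renamed predicate and the simplified validation could look
like. This is an assumption about the eventual shape, not the respin
itself: the model_* names are hypothetical, and the INTID bounds mirror
the is_unshared_mapped() macro quoted below (private IRQs below 32,
SPIs below 1020).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_PRIVATE_IRQS 32   /* SGIs and PPIs use INTIDs 0..31 */
#define MODEL_MAX_SPI_INTID   1019 /* SPIs use INTIDs 32..1019 */

enum model_config { MODEL_CONFIG_EDGE, MODEL_CONFIG_LEVEL };

/* Hypothetical stand-in for struct vgic_irq, relevant fields only. */
struct model_vgic_irq {
	unsigned int intid;
	bool hw;          /* tied to a HW IRQ */
	bool line_level;  /* software view of the line level */
	enum model_config config;
};

/* Replacement for is_unshared_mapped(): a HW-mapped SPI. */
static inline bool model_is_mapped_spi(const struct model_vgic_irq *irq)
{
	assert(irq != NULL);
	return irq->hw && irq->intid >= MODEL_NR_PRIVATE_IRQS &&
	       irq->intid <= MODEL_MAX_SPI_INTID;
}

/*
 * Injection is valid on a level change for level-triggered IRQs, on a
 * rising edge for edge-triggered IRQs, and always for mapped SPIs that
 * are level-triggered.
 */
static bool model_validate_injection(const struct model_vgic_irq *irq,
				     bool level)
{
	switch (irq->config) {
	case MODEL_CONFIG_LEVEL:
		return irq->line_level != level || model_is_mapped_spi(irq);
	case MODEL_CONFIG_EDGE:
		return level;
	}
	return false;
}
```

Using a static inline function instead of a macro also gives the
predicate a typed argument, which the macro form does not.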
> 
>>  	case VGIC_CONFIG_EDGE:
>>  		return level;
>>  	}
>> @@ -392,7 +422,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>>  	return 0;
>>  }
>>  
>> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)
>> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
>> +			  u32 virt_irq, u32 phys_irq)
>>  {
>>  	struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, virt_irq);
>>  
>> @@ -402,6 +433,7 @@ int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)
>>  
>>  	irq->hw = true;
>>  	irq->hwintid = phys_irq;
>> +	irq->host_irq = host_irq;
> 
> If you're now passing the Linux IRQ to the mapping function, you might
> as well move the code that extracts the host hwirq here as well.
> 
>>  
>>  	spin_unlock(&irq->irq_lock);
>>  	vgic_put_irq(vcpu->kvm, irq);
>> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
>> index da83e4c..dc4972b 100644
>> --- a/virt/kvm/arm/vgic/vgic.h
>> +++ b/virt/kvm/arm/vgic/vgic.h
>> @@ -17,6 +17,7 @@
>>  #define __KVM_ARM_VGIC_NEW_H__
>>  
>>  #include <linux/irqchip/arm-gic-common.h>
>> +#include <linux/interrupt.h>
>>  
>>  #define PRODUCT_ID_KVM		0x4b	/* ASCII code K */
>>  #define IMPLEMENTER_ARM		0x43b
>> @@ -96,14 +97,20 @@
>>  /* we only support 64 kB translation table page size */
>>  #define KVM_ITS_L1E_ADDR_MASK		GENMASK_ULL(51, 16)
>>  
>> +bool irq_line_level(struct vgic_irq *irq);
>> +bool irq_is_active(struct vgic_irq *irq);
>> +
>>  static inline bool irq_is_pending(struct vgic_irq *irq)
>>  {
>>  	if (irq->config == VGIC_CONFIG_EDGE)
>>  		return irq->pending_latch;
>>  	else
>> -		return irq->pending_latch || irq->line_level;
>> +		return irq->pending_latch || irq_line_level(irq);
>>  }
>>  
>> +#define is_unshared_mapped(i) \
>> +((i)->hw && (i)->intid >= VGIC_NR_PRIVATE_IRQS && (i)->intid < 1020)
>> +
>>  /*
>>   * This struct provides an intermediate representation of the fields contained
>>   * in the GICH_VMCR and ICH_VMCR registers, such that code exporting the GIC
> 
> Thanks,
> 
> 	M.
> 

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 09/10] KVM: arm/arm64: vgic: Implement forwarding setting
  2017-05-25 19:19     ` Marc Zyngier
@ 2017-05-30 12:54       ` Auger Eric
  -1 siblings, 0 replies; 69+ messages in thread
From: Auger Eric @ 2017-05-30 12:54 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, alex.williamson,
	pbonzini, christoffer.dall, drjones, wei

Hi,

On 25/05/2017 21:19, Marc Zyngier wrote:
> On Wed, May 24 2017 at 10:13:22 pm BST, Eric Auger <eric.auger@redhat.com> wrote:
>> Implements kvm_vgic_[set|unset]_forwarding.
>>
>> Handle low-level VGIC programming and consistent irqchip
>> programming.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>> ---
>>  include/kvm/arm_vgic.h   |   5 +++
>>  virt/kvm/arm/vgic/vgic.c | 105 +++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 110 insertions(+)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index 695ebc7..7ddac8a 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -343,4 +343,9 @@ int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi);
>>   */
>>  int kvm_vgic_setup_default_irq_routing(struct kvm *kvm);
>>  
>> +int kvm_vgic_set_forwarding(struct kvm *kvm, unsigned int irq,
>> +			    unsigned int virt_irq);
>> +void kvm_vgic_unset_forwarding(struct kvm *kvm, unsigned int irq,
>> +			       unsigned int virt_irq);
> 
> nit: the parameter names here do not match those in the function
> definition, which are much clearer.
> 
>> +
>>  #endif /* __KVM_ARM_VGIC_H */
>> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
>> index aa0618c..c2add8d 100644
>> --- a/virt/kvm/arm/vgic/vgic.c
>> +++ b/virt/kvm/arm/vgic/vgic.c
>> @@ -17,6 +17,8 @@
>>  #include <linux/kvm.h>
>>  #include <linux/kvm_host.h>
>>  #include <linux/list_sort.h>
>> +#include <linux/interrupt.h>
>> +#include <linux/irq.h>
>>  
>>  #include "vgic.h"
>>  
>> @@ -771,3 +773,106 @@ bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq)
>>  	return map_is_active;
>>  }
>>  
>> +/**
>> + * kvm_vgic_set_forwarding - Set IRQ forwarding
>> + *
>> + * @kvm: kvm handle
>> + * @host_irq: the host linux IRQ
>> + * @vintid: the virtual INTID
>> + *
>> + * This function must be called when the IRQ is not active:
>> + * ie. not active at GIC level and not currently under injection
>> + * into the guest using the unforwarded mode. The physical IRQ must
>> + * be disabled and all vCPUs must have been exited and prevented
>> + * from being re-entered.
>> + */
>> +int kvm_vgic_set_forwarding(struct kvm *kvm, unsigned int host_irq,
>> +			    unsigned int vintid)
>> +{
>> +	struct kvm_vcpu *vcpu;
>> +	struct vgic_irq *irq;
>> +	struct irq_desc *desc;
>> +	struct irq_data *data;
>> +	unsigned int pintid;
>> +	int ret = 0;
>> +
>> +
>> +	kvm_debug("%s host linux irq=%d vintid=%d\n",
>> +		  __func__, host_irq, vintid);
>> +
>> +	if (!vgic_valid_spi(kvm, vintid))
>> +		return 0;
>> +
>> +	/* find the INTID corresponding to @host_irq */
>> +	desc = irq_to_desc(host_irq);
>> +	if (!desc) {
>> +		kvm_err("%s: no interrupt descriptor\n", __func__);
>> +		return -EINVAL;
>> +	}
>> +
>> +	data = irq_desc_get_irq_data(desc);
>> +	while (data->parent_data)
>> +		data = data->parent_data;
>> +
>> +	pintid = data->hwirq;
>> +
>> +	irq = vgic_get_irq(kvm, NULL, vintid);
>> +
>> +	spin_lock(&irq->irq_lock);
>> +
>> +	vcpu = irq->target_vcpu;
>> +
>> +	if (!vcpu) {
>> +		ret = -EAGAIN;
>> +		goto unlock;
>> +	}
>> +
>> +	irq_set_vcpu_affinity(host_irq, vcpu);
>> +
>> +	irq->hw = true;
>> +	irq->hwintid = pintid;
>> +	irq->host_irq = host_irq;
> 
> This feels like a duplication of kvm_vgic_map_phys_irq(), especially if
> you move the pintid discovery there. Can we somehow unify them?
Sure. Initially it was just a matter of not wanting to release the
irq_lock.

I was somewhat puzzled by the vcpu parameter of irq_set_vcpu_affinity.
Should we really test target_vcpu? The actual value is unused for SPIs,
so shouldn't we simply pass something != NULL?
> 
>> +
>> +unlock:
>> +	spin_unlock(&irq->irq_lock);
>> +	vgic_put_irq(kvm, irq);
>> +	return ret;
>> +}
>> +
>> +/**
>> + * kvm_vgic_unset_forwarding - Unset IRQ forwarding
>> + *
>> + * @kvm: KVM handle
>> + * @host_irq: the host Linux IRQ number
>> + * @vintid: virtual INTID
>> + *
>> + * This function must be called when the host irq is disabled and
>> + * all vCPUs have been exited and prevented from being re-entered.
>> + */
>> +void kvm_vgic_unset_forwarding(struct kvm *kvm,
>> +			       unsigned int host_irq,
>> +			       unsigned int vintid)
>> +{
>> +	struct vgic_irq *irq;
>> +	bool active;
>> +
>> +	kvm_debug("%s host_irq=%d virt_irq=%d\n", __func__, host_irq, vintid);
>> +
>> +	irq_get_irqchip_state(host_irq, IRQCHIP_STATE_ACTIVE, &active);
>> +
>> +	irq = vgic_get_irq(kvm, NULL, vintid);
>> +	spin_lock(&irq->irq_lock);
>> +
>> +	if (!is_unshared_mapped(irq))
>> +		goto unlock;
>> +
>> +	if (active)
>> +		irq_set_irqchip_state(host_irq, IRQCHIP_STATE_ACTIVE, false);
>> +
>> +	irq->hw = false;
>> +	irq_set_vcpu_affinity(host_irq, NULL);
>> +
>> +unlock:
>> +	spin_unlock(&irq->irq_lock);
> 
> Same here.

OK

Thanks

Eric
> 
>> +}
>> +
> 
> 
> Thanks,
> 
>         M.
> 

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 09/10] KVM: arm/arm64: vgic: Implement forwarding setting
  2017-05-30 12:54       ` Auger Eric
@ 2017-05-30 13:17         ` Marc Zyngier
  -1 siblings, 0 replies; 69+ messages in thread
From: Marc Zyngier @ 2017-05-30 13:17 UTC (permalink / raw)
  To: Auger Eric
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, alex.williamson,
	pbonzini, christoffer.dall, drjones, wei

On 30/05/17 13:54, Auger Eric wrote:
> Hi,
> 
> On 25/05/2017 21:19, Marc Zyngier wrote:
>> On Wed, May 24 2017 at 10:13:22 pm BST, Eric Auger <eric.auger@redhat.com> wrote:
>>> Implements kvm_vgic_[set|unset]_forwarding.
>>>
>>> Handle low-level VGIC programming and consistent irqchip
>>> programming.
>>>
>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>
>>> ---
>>> ---
>>>  include/kvm/arm_vgic.h   |   5 +++
>>>  virt/kvm/arm/vgic/vgic.c | 105 +++++++++++++++++++++++++++++++++++++++++++++++
>>>  2 files changed, 110 insertions(+)
>>>
>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>> index 695ebc7..7ddac8a 100644
>>> --- a/include/kvm/arm_vgic.h
>>> +++ b/include/kvm/arm_vgic.h
>>> @@ -343,4 +343,9 @@ int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi);
>>>   */
>>>  int kvm_vgic_setup_default_irq_routing(struct kvm *kvm);
>>>  
>>> +int kvm_vgic_set_forwarding(struct kvm *kvm, unsigned int irq,
>>> +			    unsigned int virt_irq);
>>> +void kvm_vgic_unset_forwarding(struct kvm *kvm, unsigned int irq,
>>> +			       unsigned int virt_irq);
>>
>> nit: the names of the parameters do not match those of the function
>> definition, and are much clearer there.
>>
>>> +
>>>  #endif /* __KVM_ARM_VGIC_H */
>>> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
>>> index aa0618c..c2add8d 100644
>>> --- a/virt/kvm/arm/vgic/vgic.c
>>> +++ b/virt/kvm/arm/vgic/vgic.c
>>> @@ -17,6 +17,8 @@
>>>  #include <linux/kvm.h>
>>>  #include <linux/kvm_host.h>
>>>  #include <linux/list_sort.h>
>>> +#include <linux/interrupt.h>
>>> +#include <linux/irq.h>
>>>  
>>>  #include "vgic.h"
>>>  
>>> @@ -771,3 +773,106 @@ bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq)
>>>  	return map_is_active;
>>>  }
>>>  
>>> +/**
>>> + * kvm_vgic_set_forwarding - Set IRQ forwarding
>>> + *
>>> + * @kvm: kvm handle
>>> + * @host_irq: the host linux IRQ
>>> + * @vintid: the virtual INTID
>>> + *
>>> + * This function must be called when the IRQ is not active:
>>> + * ie. not active at GIC level and not currently under injection
>>> + * into the guest using the unforwarded mode. The physical IRQ must
>>> + * be disabled and all vCPUs must have been exited and prevented
>>> + * from being re-entered.
>>> + */
>>> +int kvm_vgic_set_forwarding(struct kvm *kvm, unsigned int host_irq,
>>> +			    unsigned int vintid)
>>> +{
>>> +	struct kvm_vcpu *vcpu;
>>> +	struct vgic_irq *irq;
>>> +	struct irq_desc *desc;
>>> +	struct irq_data *data;
>>> +	unsigned int pintid;
>>> +	int ret = 0;
>>> +
>>> +
>>> +	kvm_debug("%s host linux irq=%d vintid=%d\n",
>>> +		  __func__, host_irq, vintid);
>>> +
>>> +	if (!vgic_valid_spi(kvm, vintid))
>>> +		return 0;
>>> +
>>> +	/* find the INTID corresponding to @host_irq */
>>> +	desc = irq_to_desc(host_irq);
>>> +	if (!desc) {
>>> +		kvm_err("%s: no interrupt descriptor\n", __func__);
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	data = irq_desc_get_irq_data(desc);
>>> +	while (data->parent_data)
>>> +		data = data->parent_data;
>>> +
>>> +	pintid = data->hwirq;
>>> +
>>> +	irq = vgic_get_irq(kvm, NULL, vintid);
>>> +
>>> +	spin_lock(&irq->irq_lock);
>>> +
>>> +	vcpu = irq->target_vcpu;
>>> +
>>> +	if (!vcpu) {
>>> +		ret = -EAGAIN;
>>> +		goto unlock;
>>> +	}
>>> +
>>> +	irq_set_vcpu_affinity(host_irq, vcpu);
>>> +
>>> +	irq->hw = true;
>>> +	irq->hwintid = pintid;
>>> +	irq->host_irq = host_irq;
>>
>> This feels like a duplication of kvm_vgic_map_phys_irq(), especially if
>> you move the pintid discovery there. Can we somehow unify them?
> Sure. Initially I kept them separate only because I did not want to
> release the irq_lock.
> 
> I was somewhat bothered by the vcpu parameter of irq_set_vcpu_affinity().
> Should we really test target_vcpu? The actual value is unused for SPIs, so
> shouldn't we simply pass something != NULL?

I guess that for the time being, this would be good enough. But GICv4
requires some actual tracking of the affinity, so we may have to bite
the bullet already, and decide that the interrupt is always affine to a
vcpu.

Does this have any userspace visible impact?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 09/10] KVM: arm/arm64: vgic: Implement forwarding setting
  2017-05-30 13:17         ` Marc Zyngier
  (?)
@ 2017-05-30 14:03         ` Auger Eric
  -1 siblings, 0 replies; 69+ messages in thread
From: Auger Eric @ 2017-05-30 14:03 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, alex.williamson,
	pbonzini, christoffer.dall, drjones, wei

Hi Marc,

On 30/05/2017 15:17, Marc Zyngier wrote:
> On 30/05/17 13:54, Auger Eric wrote:
>> Hi,
>>
>> On 25/05/2017 21:19, Marc Zyngier wrote:
>>> On Wed, May 24 2017 at 10:13:22 pm BST, Eric Auger <eric.auger@redhat.com> wrote:
>>>> Implements kvm_vgic_[set|unset]_forwarding.
>>>>
>>>> Handle low-level VGIC programming and consistent irqchip
>>>> programming.
>>>>
>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>
>>>> ---
>>>> ---
>>>>  include/kvm/arm_vgic.h   |   5 +++
>>>>  virt/kvm/arm/vgic/vgic.c | 105 +++++++++++++++++++++++++++++++++++++++++++++++
>>>>  2 files changed, 110 insertions(+)
>>>>
>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>>> index 695ebc7..7ddac8a 100644
>>>> --- a/include/kvm/arm_vgic.h
>>>> +++ b/include/kvm/arm_vgic.h
>>>> @@ -343,4 +343,9 @@ int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi);
>>>>   */
>>>>  int kvm_vgic_setup_default_irq_routing(struct kvm *kvm);
>>>>  
>>>> +int kvm_vgic_set_forwarding(struct kvm *kvm, unsigned int irq,
>>>> +			    unsigned int virt_irq);
>>>> +void kvm_vgic_unset_forwarding(struct kvm *kvm, unsigned int irq,
>>>> +			       unsigned int virt_irq);
>>>
>>> nit: the names of the parameters do not match those of the function
>>> definition, and are much clearer there.
>>>
>>>> +
>>>>  #endif /* __KVM_ARM_VGIC_H */
>>>> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
>>>> index aa0618c..c2add8d 100644
>>>> --- a/virt/kvm/arm/vgic/vgic.c
>>>> +++ b/virt/kvm/arm/vgic/vgic.c
>>>> @@ -17,6 +17,8 @@
>>>>  #include <linux/kvm.h>
>>>>  #include <linux/kvm_host.h>
>>>>  #include <linux/list_sort.h>
>>>> +#include <linux/interrupt.h>
>>>> +#include <linux/irq.h>
>>>>  
>>>>  #include "vgic.h"
>>>>  
>>>> @@ -771,3 +773,106 @@ bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq)
>>>>  	return map_is_active;
>>>>  }
>>>>  
>>>> +/**
>>>> + * kvm_vgic_set_forwarding - Set IRQ forwarding
>>>> + *
>>>> + * @kvm: kvm handle
>>>> + * @host_irq: the host linux IRQ
>>>> + * @vintid: the virtual INTID
>>>> + *
>>>> + * This function must be called when the IRQ is not active:
>>>> + * ie. not active at GIC level and not currently under injection
>>>> + * into the guest using the unforwarded mode. The physical IRQ must
>>>> + * be disabled and all vCPUs must have been exited and prevented
>>>> + * from being re-entered.
>>>> + */
>>>> +int kvm_vgic_set_forwarding(struct kvm *kvm, unsigned int host_irq,
>>>> +			    unsigned int vintid)
>>>> +{
>>>> +	struct kvm_vcpu *vcpu;
>>>> +	struct vgic_irq *irq;
>>>> +	struct irq_desc *desc;
>>>> +	struct irq_data *data;
>>>> +	unsigned int pintid;
>>>> +	int ret = 0;
>>>> +
>>>> +
>>>> +	kvm_debug("%s host linux irq=%d vintid=%d\n",
>>>> +		  __func__, host_irq, vintid);
>>>> +
>>>> +	if (!vgic_valid_spi(kvm, vintid))
>>>> +		return 0;
>>>> +
>>>> +	/* find the INTID corresponding to @host_irq */
>>>> +	desc = irq_to_desc(host_irq);
>>>> +	if (!desc) {
>>>> +		kvm_err("%s: no interrupt descriptor\n", __func__);
>>>> +		return -EINVAL;
>>>> +	}
>>>> +
>>>> +	data = irq_desc_get_irq_data(desc);
>>>> +	while (data->parent_data)
>>>> +		data = data->parent_data;
>>>> +
>>>> +	pintid = data->hwirq;
>>>> +
>>>> +	irq = vgic_get_irq(kvm, NULL, vintid);
>>>> +
>>>> +	spin_lock(&irq->irq_lock);
>>>> +
>>>> +	vcpu = irq->target_vcpu;
>>>> +
>>>> +	if (!vcpu) {
>>>> +		ret = -EAGAIN;
>>>> +		goto unlock;
>>>> +	}
>>>> +
>>>> +	irq_set_vcpu_affinity(host_irq, vcpu);
>>>> +
>>>> +	irq->hw = true;
>>>> +	irq->hwintid = pintid;
>>>> +	irq->host_irq = host_irq;
>>>
>>> This feels like a duplication of kvm_vgic_map_phys_irq(), especially if
>>> you move the pintid discovery there. Can we somehow unify them?
>> Sure. Initially I kept them separate only because I did not want to
>> release the irq_lock.
>>
>> I was somewhat bothered by the vcpu parameter of irq_set_vcpu_affinity().
>> Should we really test target_vcpu? The actual value is unused for SPIs, so
>> shouldn't we simply pass something != NULL?
> 
> I guess that for the time being, this would be good enough. But GICv4
> requires some actual tracking of the affinity, so we may have to bite
> the bullet already, and decide that the interrupt is always affine to a
> vcpu.
> 
> Does this have any userspace visible impact?
I don't see any.

Thanks

Eric
> 
> Thanks,
> 
> 	M.
> 
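
For reference, the pintid discovery quoted above (follow parent_data up to
the root of the irq domain hierarchy and read hwirq there) can be modelled
outside the kernel. This is only a sketch: struct model_irq_data and
model_root_hwirq() are hypothetical stand-ins for illustration, not the
kernel's struct irq_data or any existing helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the kernel's struct irq_data. */
struct model_irq_data {
	unsigned long hwirq;                 /* hardware IRQ number in this domain */
	struct model_irq_data *parent_data;  /* next domain up, NULL at the root */
};

/*
 * Follow parent_data up to the root of the hierarchy and return the
 * hwirq found there, mirroring the loop the patch uses to get the pintid.
 */
unsigned long model_root_hwirq(const struct model_irq_data *data)
{
	assert(data != NULL);

	while (data->parent_data)
		data = data->parent_data;

	return data->hwirq;
}
```

If the discovery moves into a shared mapping routine as suggested, a walk of
this shape would presumably live there once instead of in each caller.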

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 01/10] vfio: platform: Add automasked field to vfio_platform_irq
  2017-05-30 12:45     ` Auger Eric
@ 2017-05-31 17:41         ` Alex Williamson
  0 siblings, 0 replies; 69+ messages in thread
From: Alex Williamson @ 2017-05-31 17:41 UTC (permalink / raw)
  To: Auger Eric
  Cc: Marc Zyngier, eric.auger.pro, linux-kernel, kvm, kvmarm,
	pbonzini, christoffer.dall, drjones, wei

On Tue, 30 May 2017 14:45:54 +0200
Auger Eric <eric.auger@redhat.com> wrote:

> Hi Marc,
> 
> On 25/05/2017 20:05, Marc Zyngier wrote:
> > Hi Eric,
> > 
> > On Wed, May 24 2017 at 10:13:14 pm BST, Eric Auger <eric.auger@redhat.com> wrote:  
> >> For direct EOI modality we will need to differentiate a userspace
> >> masking from the IRQ handler auto-masking.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >> ---
> >>  drivers/vfio/platform/vfio_platform_irq.c     | 10 ++++++----
> >>  drivers/vfio/platform/vfio_platform_private.h |  1 +
> >>  2 files changed, 7 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> >> index 46d4750..831f0b0 100644
> >> --- a/drivers/vfio/platform/vfio_platform_irq.c
> >> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> >> @@ -29,7 +29,7 @@ static void vfio_platform_mask(struct vfio_platform_irq *irq_ctx)
> >>  
> >>  	spin_lock_irqsave(&irq_ctx->lock, flags);
> >>  
> >> -	if (!irq_ctx->masked) {
> >> +	if (!irq_ctx->masked && !irq_ctx->automasked) {  
> > 
> > Could you please expand a bit on what this automasked variable covers?
> > It'd be good to document how masked and automasked differ in behaviour.  
> 
> Yes sure. So automasked is set by the physical IRQ handler only, for
> level sensitive IRQ (AUTOMASKED interrupts). masked is set through the
> userspace API (VFIO_DEVICE_SET_IRQS and ACTION_MASK) when masking the
> IRQ. VFIO ACTION_UNMASK resets both.

This would make more sense if you at the same time renamed 'masked' to
'usermasked'.  Thanks,

Alex

> > 
> > Also, it may be worth having a helper (is_masked?) to abstract both
> > cases.  
> 
> Sure
> 
> Eric
> >   
> >>  		disable_irq_nosync(irq_ctx->hwirq);
> >>  		irq_ctx->masked = true;
> >>  	}
> >> @@ -89,9 +89,10 @@ static void vfio_platform_unmask(struct vfio_platform_irq *irq_ctx)
> >>  
> >>  	spin_lock_irqsave(&irq_ctx->lock, flags);
> >>  
> >> -	if (irq_ctx->masked) {
> >> +	if (irq_ctx->masked || irq_ctx->automasked) {
> >>  		enable_irq(irq_ctx->hwirq);
> >>  		irq_ctx->masked = false;
> >> +		irq_ctx->automasked = false;
> >>  	}
> >>  
> >>  	spin_unlock_irqrestore(&irq_ctx->lock, flags);
> >> @@ -152,12 +153,12 @@ static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
> >>  
> >>  	spin_lock_irqsave(&irq_ctx->lock, flags);
> >>  
> >> -	if (!irq_ctx->masked) {
> >> +	if (!irq_ctx->masked && !irq_ctx->automasked) {
> >>  		ret = IRQ_HANDLED;
> >>  
> >>  		/* automask maskable interrupts */
> >>  		disable_irq_nosync(irq_ctx->hwirq);
> >> -		irq_ctx->masked = true;
> >> +		irq_ctx->automasked = true;
> >>  	}
> >>  
> >>  	spin_unlock_irqrestore(&irq_ctx->lock, flags);
> >> @@ -315,6 +316,7 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev)
> >>  		vdev->irqs[i].count = 1;
> >>  		vdev->irqs[i].hwirq = hwirq;
> >>  		vdev->irqs[i].masked = false;
> >> +		vdev->irqs[i].automasked = false;
> >>  	}
> >>  
> >>  	vdev->num_irqs = cnt;
> >> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
> >> index 85ffe5d..8a3cfa9 100644
> >> --- a/drivers/vfio/platform/vfio_platform_private.h
> >> +++ b/drivers/vfio/platform/vfio_platform_private.h
> >> @@ -34,6 +34,7 @@ struct vfio_platform_irq {
> >>  	char			*name;
> >>  	struct eventfd_ctx	*trigger;
> >>  	bool			masked;
> >> +	bool			automasked;
> >>  	spinlock_t		lock;
> >>  	struct virqfd		*unmask;
> >>  	struct virqfd		*mask;  
> > 
> > Thanks,
> > 
> > 	M.
> >   
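
To make the masked/automasked distinction discussed above concrete, here is
a small user-space model of the two flags together with an is_masked-style
helper of the kind suggested in the review. It is a sketch only: struct
model_irq_ctx and the model_* functions are hypothetical names, the
enable/disable calls are reduced to a counter, and locking is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two flags the patch keeps per IRQ. */
struct model_irq_ctx {
	bool masked;       /* set through the userspace mask action */
	bool automasked;   /* set by the level-sensitive IRQ handler */
	int disable_depth; /* stands in for disable_irq_nosync()/enable_irq() */
};

/* Helper along the lines of the "is_masked" suggestion in the thread. */
bool model_is_masked(const struct model_irq_ctx *ctx)
{
	return ctx->masked || ctx->automasked;
}

/* Userspace mask request: only disables if nothing masked the IRQ yet. */
void model_user_mask(struct model_irq_ctx *ctx)
{
	if (!model_is_masked(ctx)) {
		ctx->disable_depth++;
		ctx->masked = true;
	}
}

/* Handler path: returns true when the interrupt is taken and automasked. */
bool model_automask_handler(struct model_irq_ctx *ctx)
{
	if (model_is_masked(ctx))
		return false;

	ctx->disable_depth++;
	ctx->automasked = true;
	return true;
}

/* Unmask clears both flags and re-enables once, as in the patch. */
void model_unmask(struct model_irq_ctx *ctx)
{
	if (model_is_masked(ctx)) {
		assert(ctx->disable_depth > 0);
		ctx->disable_depth--;
		ctx->masked = false;
		ctx->automasked = false;
	}
}
```

Note that, as in the quoted hunks, a userspace mask request arriving while
the IRQ is already automasked leaves masked unset; whether that is the
intended behaviour is a question for the patch author.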

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 02/10] VFIO: platform: Introduce direct EOI interrupt handler
  2017-05-24 20:13   ` Eric Auger
@ 2017-05-31 18:20     ` Alex Williamson
  -1 siblings, 0 replies; 69+ messages in thread
From: Alex Williamson @ 2017-05-31 18:20 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, pbonzini,
	marc.zyngier, christoffer.dall, drjones, wei

On Wed, 24 May 2017 22:13:15 +0200
Eric Auger <eric.auger@redhat.com> wrote:

> We add two new fields in vfio_platform_irq: deoi and handler.
> 
> If deoi is set, this means the physical IRQ attached to the virtual
> IRQ is directly deactivated by the guest and the VFIO driver does
> not need to disable the physical IRQ and mask it at VFIO level.
> 
> The handler pointer points to either the automasked or "deoi" handler.
> The latter is the same as the one used for edge sensitive IRQs.
> A wrapper handler is introduced that calls the chosen handler function.
> 
> The irq lock is taken/released in the wrapper handler. eventfd_signal
> can be called in regions not allowed to sleep.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> ---
>  drivers/vfio/platform/vfio_platform_irq.c     | 24 +++++++++++++++++-------
>  drivers/vfio/platform/vfio_platform_private.h |  2 ++
>  2 files changed, 19 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> index 831f0b0..2f82459 100644
> --- a/drivers/vfio/platform/vfio_platform_irq.c
> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> @@ -148,11 +148,8 @@ static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev,
>  static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
>  {
>  	struct vfio_platform_irq *irq_ctx = dev_id;
> -	unsigned long flags;
>  	int ret = IRQ_NONE;
>  
> -	spin_lock_irqsave(&irq_ctx->lock, flags);
> -
>  	if (!irq_ctx->masked && !irq_ctx->automasked) {
>  		ret = IRQ_HANDLED;
>  
> @@ -161,8 +158,6 @@ static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
>  		irq_ctx->automasked = true;
>  	}
>  
> -	spin_unlock_irqrestore(&irq_ctx->lock, flags);
> -
>  	if (ret == IRQ_HANDLED)
>  		eventfd_signal(irq_ctx->trigger, 1);
>  
> @@ -178,6 +173,19 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
>  	return IRQ_HANDLED;
>  }
>  
> +static irqreturn_t vfio_wrapper_handler(int irq, void *dev_id)
> +{
> +	struct vfio_platform_irq *irq_ctx = dev_id;
> +	unsigned long flags;
> +	irqreturn_t ret;
> +
> +	spin_lock_irqsave(&irq_ctx->lock, flags);
> +	ret = irq_ctx->handler(irq, dev_id);
> +	spin_unlock_irqrestore(&irq_ctx->lock, flags);
> +
> +	return ret;
> +}
> +
>  static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>  			    int fd, irq_handler_t handler)
>  {
> @@ -208,9 +216,10 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>  	}
>  
>  	irq->trigger = trigger;
> +	irq->handler = handler;
>  
>  	irq_set_status_flags(irq->hwirq, IRQ_NOAUTOEN);
> -	ret = request_irq(irq->hwirq, handler, 0, irq->name, irq);
> +	ret = request_irq(irq->hwirq, vfio_wrapper_handler, 0, irq->name, irq);

Noting here that vfio-platform already requests an exclusive interrupt
(flags = 0), which I think is something that's going to need work for
vfio-pci since we'll request a shared interrupt by default if the
device supports DisINTx masking.  If we're doing this special trick at
the GIC, we'll starve other devices if they share an interrupt.

>  	if (ret) {
>  		kfree(irq->name);
>  		eventfd_ctx_put(trigger);
> @@ -232,7 +241,8 @@ static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev,
>  	struct vfio_platform_irq *irq = &vdev->irqs[index];
>  	irq_handler_t handler;
>  
> -	if (vdev->irqs[index].flags & VFIO_IRQ_INFO_AUTOMASKED)
> +	if (vdev->irqs[index].flags & VFIO_IRQ_INFO_AUTOMASKED &&
> +			!irq->deoi)

nit, strange linewrap

>  		handler = vfio_automasked_irq_handler;
>  	else
>  		handler = vfio_irq_handler;
> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
> index 8a3cfa9..b80a380 100644
> --- a/drivers/vfio/platform/vfio_platform_private.h
> +++ b/drivers/vfio/platform/vfio_platform_private.h
> @@ -38,6 +38,8 @@ struct vfio_platform_irq {
>  	spinlock_t		lock;
>  	struct virqfd		*unmask;
>  	struct virqfd		*mask;
> +	bool			deoi;
> +	irqreturn_t		(*handler)(int irq, void *dev_id);
>  };
>  
>  struct vfio_platform_region {

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 03/10] VFIO: platform: Direct EOI irq bypass for ARM/ARM64
  2017-05-24 20:13   ` Eric Auger
@ 2017-05-31 18:20     ` Alex Williamson
  -1 siblings, 0 replies; 69+ messages in thread
From: Alex Williamson @ 2017-05-31 18:20 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, pbonzini,
	marc.zyngier, christoffer.dall, drjones, wei

On Wed, 24 May 2017 22:13:16 +0200
Eric Auger <eric.auger@redhat.com> wrote:

> This patch adds the registration/unregistration of an
> irq_bypass_producer for vfio platform device interrupts.
> 
> Its callbacks handle the direct EOI modality on VFIO side.
> 
> - stop/start: disable/enable the host irq
> - add/del consumer: set the VFIO Direct EOI mode, ie. select the
>   adapted physical IRQ handler (automasked or not automasked).
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> ---
>  drivers/vfio/platform/Kconfig                    |   5 +
>  drivers/vfio/platform/Makefile                   |   2 +-
>  drivers/vfio/platform/vfio_platform_irq.c        |  19 ++++
>  drivers/vfio/platform/vfio_platform_irq_bypass.c | 114 +++++++++++++++++++++++
>  drivers/vfio/platform/vfio_platform_private.h    |  23 +++++
>  5 files changed, 162 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/vfio/platform/vfio_platform_irq_bypass.c
> 
> diff --git a/drivers/vfio/platform/Kconfig b/drivers/vfio/platform/Kconfig
> index bb30128..33ec3d9 100644
> --- a/drivers/vfio/platform/Kconfig
> +++ b/drivers/vfio/platform/Kconfig
> @@ -2,6 +2,7 @@ config VFIO_PLATFORM
>  	tristate "VFIO support for platform devices"
>  	depends on VFIO && EVENTFD && (ARM || ARM64)
>  	select VFIO_VIRQFD
> +	select IRQ_BYPASS_MANAGER
>  	help
>  	  Support for platform devices with VFIO. This is required to make
>  	  use of platform devices present on the system using the VFIO
> @@ -19,4 +20,8 @@ config VFIO_AMBA
>  
>  	  If you don't know what to do here, say N.
>  
> +config VFIO_PLATFORM_IRQ_BYPASS_DEOI
> +	depends on VFIO_PLATFORM
> +	def_bool y
> +
>  source "drivers/vfio/platform/reset/Kconfig"
> diff --git a/drivers/vfio/platform/Makefile b/drivers/vfio/platform/Makefile
> index 41a6224..324f3e7 100644
> --- a/drivers/vfio/platform/Makefile
> +++ b/drivers/vfio/platform/Makefile
> @@ -1,4 +1,4 @@
> -vfio-platform-base-y := vfio_platform_common.o vfio_platform_irq.o
> +vfio-platform-base-y := vfio_platform_common.o vfio_platform_irq.o vfio_platform_irq_bypass.o
>  vfio-platform-y := vfio_platform.o
>  
>  obj-$(CONFIG_VFIO_PLATFORM) += vfio-platform.o
> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> index 2f82459..5b70c8e 100644
> --- a/drivers/vfio/platform/vfio_platform_irq.c
> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> @@ -20,6 +20,7 @@
>  #include <linux/types.h>
>  #include <linux/vfio.h>
>  #include <linux/irq.h>
> +#include <linux/irqbypass.h>
>  
>  #include "vfio_platform_private.h"
>  
> @@ -186,6 +187,19 @@ static irqreturn_t vfio_wrapper_handler(int irq, void *dev_id)
>  	return ret;
>  }
>  
> +/* must be called with irq_ctx->lock held */
> +int vfio_platform_set_deoi(struct vfio_platform_irq *irq_ctx, bool deoi)
> +{
> +	irq_ctx->deoi = deoi;
> +
> +	if (!deoi && (irq_ctx->flags & VFIO_IRQ_INFO_AUTOMASKED))
> +		irq_ctx->handler = vfio_automasked_irq_handler;
> +	else
> +		irq_ctx->handler = vfio_irq_handler;
> +
> +	return 0;
> +}
> +
>  static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>  			    int fd, irq_handler_t handler)
>  {
> @@ -196,6 +210,7 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>  	if (irq->trigger) {
>  		irq_clear_status_flags(irq->hwirq, IRQ_NOAUTOEN);
>  		free_irq(irq->hwirq, irq);
> +		irq_bypass_unregister_producer(&irq->producer);
>  		kfree(irq->name);
>  		eventfd_ctx_put(irq->trigger);
>  		irq->trigger = NULL;
> @@ -227,6 +242,10 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>  		return ret;
>  	}
>  
> +	if (vfio_platform_has_deoi())
> +		vfio_platform_register_deoi_producer(vdev, irq,
> +						     trigger, irq->hwirq);
> +
>  	if (!irq->masked)
>  		enable_irq(irq->hwirq);
>  
> diff --git a/drivers/vfio/platform/vfio_platform_irq_bypass.c b/drivers/vfio/platform/vfio_platform_irq_bypass.c
> new file mode 100644
> index 0000000..436902c
> --- /dev/null
> +++ b/drivers/vfio/platform/vfio_platform_irq_bypass.c
> @@ -0,0 +1,114 @@
> +/*
> + * VFIO platform device irqbypass callback implementation for DEOI
> + *
> + * Copyright (C) 2017 Red Hat, Inc.  All rights reserved.
> + * Author: Eric Auger <eric.auger@redhat.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License, version 2, as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <linux/err.h>
> +#include <linux/device.h>
> +#include <linux/irq.h>
> +#include <linux/irqbypass.h>
> +#include "vfio_platform_private.h"
> +
> +#ifdef CONFIG_VFIO_PLATFORM_IRQ_BYPASS_DEOI
> +
> +static void irq_bypass_deoi_start(struct irq_bypass_producer *prod)
> +{
> +	enable_irq(prod->irq);
> +}
> +
> +static void irq_bypass_deoi_stop(struct irq_bypass_producer *prod)
> +{
> +	disable_irq(prod->irq);
> +}
> +
> +/**
> + * irq_bypass_deoi_add_consumer - turns irq direct EOI on
> + *
> + * The linux irq is disabled when the function is called.
> + * The operation succeeds only if the irq is not active at irqchip level
> + * and the irq is not automasked at VFIO level, meaning the IRQ is under
> + * injection into the guest.
> + */
> +static int irq_bypass_deoi_add_consumer(struct irq_bypass_producer *prod,
> +					struct irq_bypass_consumer *cons)
> +{
> +	struct vfio_platform_irq *irq_ctx =
> +		container_of(prod, struct vfio_platform_irq, producer);
> +	unsigned long flags;
> +	bool active;
> +	int ret;
> +
> +	spin_lock_irqsave(&irq_ctx->lock, flags);
> +
> +	ret = irq_get_irqchip_state(irq_ctx->hwirq, IRQCHIP_STATE_ACTIVE,
> +				    &active);
> +	if (ret)
> +		goto out;
> +
> +	if (active || irq_ctx->automasked) {
> +		ret = -EAGAIN;
> +		goto out;
> +	}
> +
> +	if (!(irq_get_trigger_type(irq_ctx->hwirq) & IRQ_TYPE_LEVEL_MASK))
> +		goto out;

Not sure why this wouldn't return -EINVAL;
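
For reference, in the quoted code ret is still 0 when the trigger-type
check jumps to out, so a non-level interrupt reports success without
direct EOI being set. Below is a minimal userspace sketch of the same
decision sequence with the -EINVAL return suggested here. The struct
deoi_irq_model and deoi_add_consumer_model names are hypothetical
stand-ins, not the kernel structures, and locking is left out.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for the state that
 * irq_bypass_deoi_add_consumer() consults; not struct vfio_platform_irq.
 */
struct deoi_irq_model {
	bool active;          /* models IRQCHIP_STATE_ACTIVE at the irqchip */
	bool automasked;      /* VFIO-level automask state */
	bool level_triggered; /* models trigger type & IRQ_TYPE_LEVEL_MASK */
	bool deoi;            /* set once direct EOI has been selected */
};

static int deoi_add_consumer_model(struct deoi_irq_model *irq)
{
	/* An interrupt in flight cannot be switched over; caller may retry. */
	if (irq->active || irq->automasked)
		return -EAGAIN;

	/* Direct EOI only applies to level-sensitive interrupts. */
	if (!irq->level_triggered)
		return -EINVAL;

	irq->deoi = true;
	return 0;
}
```

The ordering of the checks follows the quoted patch; only the return
value of the non-level branch differs.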

> +
> +	ret = vfio_platform_set_deoi(irq_ctx, true);
> +out:
> +	spin_unlock_irqrestore(&irq_ctx->lock, flags);
> +	return ret;
> +}
> +
> +static void irq_bypass_deoi_del_consumer(struct irq_bypass_producer *prod,
> +					 struct irq_bypass_consumer *cons)
> +{
> +	struct vfio_platform_irq *irq_ctx =
> +		container_of(prod, struct vfio_platform_irq, producer);
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&irq_ctx->lock, flags);
> +	if (irq_get_trigger_type(irq_ctx->hwirq) & IRQ_TYPE_LEVEL_MASK)
> +		vfio_platform_set_deoi(irq_ctx, false);
> +	spin_unlock_irqrestore(&irq_ctx->lock, flags);
> +}
> +
> +bool vfio_platform_has_deoi(void)
> +{
> +	return true;
> +}
> +
> +void vfio_platform_register_deoi_producer(struct vfio_platform_device *vdev,
> +					  struct vfio_platform_irq *irq,
> +					  struct eventfd_ctx *trigger,
> +					  unsigned int host_irq)
> +{
> +	struct irq_bypass_producer *prod = &irq->producer;
> +	int ret;
> +
> +	prod->token =		trigger;
> +	prod->irq =		host_irq;
> +	prod->add_consumer =	irq_bypass_deoi_add_consumer;
> +	prod->del_consumer =	irq_bypass_deoi_del_consumer;
> +	prod->stop =		irq_bypass_deoi_stop;
> +	prod->start =		irq_bypass_deoi_start;
> +
> +	ret = irq_bypass_register_producer(prod);
> +	if (unlikely(ret))
> +		dev_info(vdev->device,
> +			 "irq bypass producer (token %p) registration fails: %d\n",
> +			 prod->token, ret);
> +}
> +
> +#endif
> +
> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
> index b80a380..bfa2675 100644
> --- a/drivers/vfio/platform/vfio_platform_private.h
> +++ b/drivers/vfio/platform/vfio_platform_private.h
> @@ -17,6 +17,7 @@
>  
>  #include <linux/types.h>
>  #include <linux/interrupt.h>
> +#include <linux/irqbypass.h>
>  
>  #define VFIO_PLATFORM_OFFSET_SHIFT   40
>  #define VFIO_PLATFORM_OFFSET_MASK (((u64)(1) << VFIO_PLATFORM_OFFSET_SHIFT) - 1)
> @@ -40,6 +41,7 @@ struct vfio_platform_irq {
>  	struct virqfd		*mask;
>  	bool			deoi;
>  	irqreturn_t		(*handler)(int irq, void *dev_id);
> +	struct irq_bypass_producer producer;
>  };
>  
>  struct vfio_platform_region {
> @@ -102,9 +104,30 @@ extern int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev,
>  					unsigned start, unsigned count,
>  					void *data);
>  
> +extern int vfio_platform_set_deoi(struct vfio_platform_irq *irq_ctx, bool deoi);
> +
>  extern void __vfio_platform_register_reset(struct vfio_platform_reset_node *n);
>  extern void vfio_platform_unregister_reset(const char *compat,
>  					   vfio_platform_reset_fn_t fn);
> +
> +#ifdef CONFIG_VFIO_PLATFORM_IRQ_BYPASS_DEOI
> +bool vfio_platform_has_deoi(void);
> +void vfio_platform_register_deoi_producer(struct vfio_platform_device *vdev,
> +					  struct vfio_platform_irq *irq,
> +					  struct eventfd_ctx *trigger,
> +					  unsigned int host_irq);
> +#else
> +static inline bool vfio_platform_has_deoi(void)
> +{
> +	return false;
> +}
> +static inline
> +void vfio_platform_register_deoi_producer(struct vfio_platform_device *vdev,
> +					  struct vfio_platform_irq *irq,
> +					  struct eventfd_ctx *trigger,
> +					  unsigned int host_irq) {}
> +#endif
> +
>  #define vfio_platform_register_reset(__compat, __reset)		\
>  static struct vfio_platform_reset_node __reset ## _node = {	\
>  	.owner = THIS_MODULE,					\

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 04/10] VFIO: pci: Add automasked field to vfio_pci_irq_ctx
  2017-05-24 20:13   ` Eric Auger
@ 2017-05-31 18:21     ` Alex Williamson
  -1 siblings, 0 replies; 69+ messages in thread
From: Alex Williamson @ 2017-05-31 18:21 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, pbonzini,
	marc.zyngier, christoffer.dall, drjones, wei

On Wed, 24 May 2017 22:13:17 +0200
Eric Auger <eric.auger@redhat.com> wrote:

> For direct EOI modality we will need to differentiate a userspace
> masking from the IRQ handler auto-masking.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  drivers/vfio/pci/vfio_pci_intrs.c   | 15 +++++++++------
>  drivers/vfio/pci/vfio_pci_private.h |  1 +
>  2 files changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
> index 1c46045..d4d377b 100644
> --- a/drivers/vfio/pci/vfio_pci_intrs.c
> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> @@ -52,7 +52,7 @@ void vfio_pci_intx_mask(struct vfio_pci_device *vdev)
>  	if (unlikely(!is_intx(vdev))) {
>  		if (vdev->pci_2_3)
>  			pci_intx(pdev, 0);
> -	} else if (!vdev->ctx[0].masked) {
> +	} else if (!vdev->ctx[0].masked && !vdev->ctx[0].automasked) {
>  		/*
>  		 * Can't use check_and_mask here because we always want to
>  		 * mask, not just when something is pending.
> @@ -90,7 +90,8 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused)
>  	if (unlikely(!is_intx(vdev))) {
>  		if (vdev->pci_2_3)
>  			pci_intx(pdev, 1);
> -	} else if (vdev->ctx[0].masked && !vdev->virq_disabled) {
> +	} else if ((vdev->ctx[0].masked || vdev->ctx[0].automasked) &&
> +			!vdev->virq_disabled) {
>  		/*
>  		 * A pending interrupt here would immediately trigger,
>  		 * but we can avoid that overhead by just re-sending
> @@ -103,6 +104,7 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused)
>  			enable_irq(pdev->irq);
>  
>  		vdev->ctx[0].masked = (ret > 0);
> +		vdev->ctx[0].automasked = (ret > 0);
>  	}

This looks suspicious: if we leave this function with the interrupt
masked, isn't that due to an automask rather than a user mask?
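
One possible reading of this question, sketched as a userspace model
rather than kernel code: once the unmask path has run, the user mask is
cleared, and only the automask flag records that the line stays masked
because an interrupt was still pending. The struct intx_ctx_model and
intx_unmask_model names are hypothetical, and the still_pending argument
stands in for the result of the pending check in the real handler. This
is an assumption about the intended semantics, not the author's fix.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two mask flags in vfio_pci_irq_ctx. */
struct intx_ctx_model {
	bool masked;     /* masked on request from userspace */
	bool automasked; /* masked by the interrupt handler itself */
};

/*
 * Models the unmask path. Returns >0 when the interrupt was still
 * pending, in which case the line stays masked and the event is re-sent.
 */
static int intx_unmask_model(struct intx_ctx_model *ctx, bool still_pending)
{
	int ret = 0;

	/* Nothing to do if the line is not masked in either way. */
	if (!ctx->masked && !ctx->automasked)
		return 0;

	if (still_pending)
		ret = 1;

	/* The user mask is always dropped by an unmask request. */
	ctx->masked = false;
	/* A line left masked here is masked by the handler, not by the user. */
	ctx->automasked = (ret > 0);

	return ret;
}
```

Under this model the two flags are never both set on exit, which keeps
the user mask and the handler automask distinguishable.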

>  
>  	spin_unlock_irqrestore(&vdev->irqlock, flags);
> @@ -126,11 +128,12 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
>  
>  	if (!vdev->pci_2_3) {
>  		disable_irq_nosync(vdev->pdev->irq);
> -		vdev->ctx[0].masked = true;
> +		vdev->ctx[0].automasked = true;
>  		ret = IRQ_HANDLED;
> -	} else if (!vdev->ctx[0].masked &&  /* may be shared */
> -		   pci_check_and_mask_intx(vdev->pdev)) {
> -		vdev->ctx[0].masked = true;
> +	} else if (!vdev->ctx[0].masked && !vdev->ctx[0].automasked &&
> +			pci_check_and_mask_intx(vdev->pdev)) {
> +		 /* shared INTx */
> +		vdev->ctx[0].automasked = true;
>  		ret = IRQ_HANDLED;
>  	}
>  
> diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
> index f561ac1..f7f1101 100644
> --- a/drivers/vfio/pci/vfio_pci_private.h
> +++ b/drivers/vfio/pci/vfio_pci_private.h
> @@ -35,6 +35,7 @@ struct vfio_pci_irq_ctx {
>  	struct virqfd		*mask;
>  	char			*name;
>  	bool			masked;
> +	bool			automasked;
>  	struct irq_bypass_producer	producer;
>  };
>  

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 05/10] VFIO: pci: Introduce direct EOI INTx interrupt handler
  2017-05-24 20:13   ` Eric Auger
  (?)
@ 2017-05-31 18:24   ` Alex Williamson
  2017-06-01 20:40       ` Auger Eric
  2017-06-14  8:07     ` Auger Eric
  -1 siblings, 2 replies; 69+ messages in thread
From: Alex Williamson @ 2017-05-31 18:24 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, pbonzini,
	marc.zyngier, christoffer.dall, drjones, wei

On Wed, 24 May 2017 22:13:18 +0200
Eric Auger <eric.auger@redhat.com> wrote:

> We add two new fields in vfio_pci_irq_ctx struct: deoi and handler.
> If deoi is set, this means the physical IRQ attached to the virtual
> IRQ is directly deactivated by the guest and the VFIO driver does
> not need to disable the physical IRQ and mask it at VFIO level.
> 
> The handler pointer is set accordingly and a wrapper handler is
> introduced that calls the chosen handler function.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> ---
>  drivers/vfio/pci/vfio_pci_intrs.c   | 32 ++++++++++++++++++++++++++------
>  drivers/vfio/pci/vfio_pci_private.h |  2 ++
>  2 files changed, 28 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
> index d4d377b..06aa713 100644
> --- a/drivers/vfio/pci/vfio_pci_intrs.c
> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> @@ -121,11 +121,8 @@ void vfio_pci_intx_unmask(struct vfio_pci_device *vdev)
>  static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
>  {
>  	struct vfio_pci_device *vdev = dev_id;
> -	unsigned long flags;
>  	int ret = IRQ_NONE;
>  
> -	spin_lock_irqsave(&vdev->irqlock, flags);
> -
>  	if (!vdev->pci_2_3) {
>  		disable_irq_nosync(vdev->pdev->irq);
>  		vdev->ctx[0].automasked = true;
> @@ -137,14 +134,33 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
>  		ret = IRQ_HANDLED;
>  	}
>  
> -	spin_unlock_irqrestore(&vdev->irqlock, flags);
> -
>  	if (ret == IRQ_HANDLED)
>  		vfio_send_intx_eventfd(vdev, NULL);
>  
>  	return ret;
>  }
>  
> +static irqreturn_t vfio_intx_handler_deoi(int irq, void *dev_id)
> +{
> +	struct vfio_pci_device *vdev = dev_id;
> +
> +	vfio_send_intx_eventfd(vdev, NULL);
> +	return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t vfio_intx_wrapper_handler(int irq, void *dev_id)
> +{
> +	struct vfio_pci_device *vdev = dev_id;
> +	unsigned long flags;
> +	irqreturn_t ret;
> +
> +	spin_lock_irqsave(&vdev->irqlock, flags);
> +	ret = vdev->ctx[0].handler(irq, dev_id);
> +	spin_unlock_irqrestore(&vdev->irqlock, flags);
> +
> +	return ret;
> +}
> +
>  static int vfio_intx_enable(struct vfio_pci_device *vdev)
>  {
>  	if (!is_irq_none(vdev))
> @@ -208,7 +224,11 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
>  	if (!vdev->pci_2_3)
>  		irqflags = 0;
>  
> -	ret = request_irq(pdev->irq, vfio_intx_handler,
> +	if (vdev->ctx[0].deoi)
> +		vdev->ctx[0].handler = vfio_intx_handler_deoi;
> +	else
> +		vdev->ctx[0].handler = vfio_intx_handler;
> +	ret = request_irq(pdev->irq, vfio_intx_wrapper_handler,
>  			  irqflags, vdev->ctx[0].name, vdev);


Here's where I think we don't account for irqflags properly.  If we get
a shared interrupt here, then direct EOI needs to be disabled, or else
we'll starve other devices sharing the interrupt.  In practice, I wonder
whether this leaves PCI direct EOI a useful feature.  We could try to
get an exclusive interrupt and fall back to shared, but any time we get
an exclusive interrupt we're more prone to conflicts with other devices.
I might have two VMs that share an interrupt, and now it's a race in
which only the first to set up an IRQ can work.  Worse, one of those VMs
might be fully booted and switched to MSI, and now it's just a matter of
time until it reboots in the right way to generate a conflict.  I might
also have two devices in the same VM that share an IRQ, and now I can't
start the VM at all because the second device can no longer get an
interrupt.  This is the same problem we have with the nointxmask flag:
it's a useful debugging feature, but since the masking is done at the
APIC/GIC rather than at the device, much like here, it's not very
practical for more than debugging and isolating specific devices as
requiring APIC/GIC level masking.  I'm not sure how to proceed on the
PCI side here. Thanks,

Alex
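
The first point above (no direct EOI on a shared line) can be sketched
as a handler-selection rule. This is a stand-alone model under stated
assumptions, not a proposed patch: `MODEL_IRQF_SHARED`, the enum, and
`intx_model_pick_handler()` are hypothetical names, and the sketch does
nothing about the exclusive-versus-shared conflicts described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in for the kernel's IRQF_SHARED bit (assumption). */
#define MODEL_IRQF_SHARED 0x80u

enum intx_handler_kind {
	INTX_HANDLER_MASKING, /* models vfio_intx_handler */
	INTX_HANDLER_DEOI,    /* models vfio_intx_handler_deoi */
};

/*
 * Only pick the direct EOI handler when the line is requested
 * exclusively; on a shared line the masking handler is kept so that
 * other devices on the same line are not starved.
 */
static inline enum intx_handler_kind
intx_model_pick_handler(bool deoi_requested, unsigned int irqflags)
{
	if (deoi_requested && !(irqflags & MODEL_IRQF_SHARED))
		return INTX_HANDLER_DEOI;
	return INTX_HANDLER_MASKING;
}
```

With this rule a request for direct EOI on a shared line silently
degrades to the masking handler, which is one of several possible
policies; failing the request instead would be another.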

>  	if (ret) {
>  		vdev->ctx[0].trigger = NULL;
> diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
> index f7f1101..5cfe59a 100644
> --- a/drivers/vfio/pci/vfio_pci_private.h
> +++ b/drivers/vfio/pci/vfio_pci_private.h
> @@ -36,6 +36,8 @@ struct vfio_pci_irq_ctx {
>  	char			*name;
>  	bool			masked;
>  	bool			automasked;
> +	bool			deoi;
> +	irqreturn_t		(*handler)(int irq, void *dev_id);
>  	struct irq_bypass_producer	producer;
>  };
>  

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 03/10] VFIO: platform: Direct EOI irq bypass for ARM/ARM64
  2017-05-31 18:20     ` Alex Williamson
@ 2017-05-31 19:31       ` Auger Eric
  -1 siblings, 0 replies; 69+ messages in thread
From: Auger Eric @ 2017-05-31 19:31 UTC (permalink / raw)
  To: Alex Williamson
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, pbonzini,
	marc.zyngier, christoffer.dall, drjones, wei

Hi Alex, Marc,

On 31/05/2017 20:20, Alex Williamson wrote:
> On Wed, 24 May 2017 22:13:16 +0200
> Eric Auger <eric.auger@redhat.com> wrote:
> 
>> This patch adds the registration/unregistration of an
>> irq_bypass_producer for vfio platform device interrupts.
>>
>> Its callbacks handle the direct EOI modality on VFIO side.
>>
>> - stop/start: disable/enable the host irq
>> - add/del consumer: set the VFIO Direct EOI mode, ie. select the
>>   adapted physical IRQ handler (automasked or not automasked).
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>> ---
>>  drivers/vfio/platform/Kconfig                    |   5 +
>>  drivers/vfio/platform/Makefile                   |   2 +-
>>  drivers/vfio/platform/vfio_platform_irq.c        |  19 ++++
>>  drivers/vfio/platform/vfio_platform_irq_bypass.c | 114 +++++++++++++++++++++++
>>  drivers/vfio/platform/vfio_platform_private.h    |  23 +++++
>>  5 files changed, 162 insertions(+), 1 deletion(-)
>>  create mode 100644 drivers/vfio/platform/vfio_platform_irq_bypass.c
>>
>> diff --git a/drivers/vfio/platform/Kconfig b/drivers/vfio/platform/Kconfig
>> index bb30128..33ec3d9 100644
>> --- a/drivers/vfio/platform/Kconfig
>> +++ b/drivers/vfio/platform/Kconfig
>> @@ -2,6 +2,7 @@ config VFIO_PLATFORM
>>  	tristate "VFIO support for platform devices"
>>  	depends on VFIO && EVENTFD && (ARM || ARM64)
>>  	select VFIO_VIRQFD
>> +	select IRQ_BYPASS_MANAGER
>>  	help
>>  	  Support for platform devices with VFIO. This is required to make
>>  	  use of platform devices present on the system using the VFIO
>> @@ -19,4 +20,8 @@ config VFIO_AMBA
>>  
>>  	  If you don't know what to do here, say N.
>>  
>> +config VFIO_PLATFORM_IRQ_BYPASS_DEOI
>> +	depends on VFIO_PLATFORM
>> +	def_bool y
>> +
>>  source "drivers/vfio/platform/reset/Kconfig"
>> diff --git a/drivers/vfio/platform/Makefile b/drivers/vfio/platform/Makefile
>> index 41a6224..324f3e7 100644
>> --- a/drivers/vfio/platform/Makefile
>> +++ b/drivers/vfio/platform/Makefile
>> @@ -1,4 +1,4 @@
>> -vfio-platform-base-y := vfio_platform_common.o vfio_platform_irq.o
>> +vfio-platform-base-y := vfio_platform_common.o vfio_platform_irq.o vfio_platform_irq_bypass.o
>>  vfio-platform-y := vfio_platform.o
>>  
>>  obj-$(CONFIG_VFIO_PLATFORM) += vfio-platform.o
>> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
>> index 2f82459..5b70c8e 100644
>> --- a/drivers/vfio/platform/vfio_platform_irq.c
>> +++ b/drivers/vfio/platform/vfio_platform_irq.c
>> @@ -20,6 +20,7 @@
>>  #include <linux/types.h>
>>  #include <linux/vfio.h>
>>  #include <linux/irq.h>
>> +#include <linux/irqbypass.h>
>>  
>>  #include "vfio_platform_private.h"
>>  
>> @@ -186,6 +187,19 @@ static irqreturn_t vfio_wrapper_handler(int irq, void *dev_id)
>>  	return ret;
>>  }
>>  
>> +/* must be called with irq_ctx->lock held */
>> +int vfio_platform_set_deoi(struct vfio_platform_irq *irq_ctx, bool deoi)
>> +{
>> +	irq_ctx->deoi = deoi;
>> +
>> +	if (!deoi && (irq_ctx->flags & VFIO_IRQ_INFO_AUTOMASKED))
>> +		irq_ctx->handler = vfio_automasked_irq_handler;
>> +	else
>> +		irq_ctx->handler = vfio_irq_handler;
>> +
>> +	return 0;
>> +}
>> +
>>  static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>>  			    int fd, irq_handler_t handler)
>>  {
>> @@ -196,6 +210,7 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>>  	if (irq->trigger) {
>>  		irq_clear_status_flags(irq->hwirq, IRQ_NOAUTOEN);
>>  		free_irq(irq->hwirq, irq);
>> +		irq_bypass_unregister_producer(&irq->producer);
>>  		kfree(irq->name);
>>  		eventfd_ctx_put(irq->trigger);
>>  		irq->trigger = NULL;
>> @@ -227,6 +242,10 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>>  		return ret;
>>  	}
>>  
>> +	if (vfio_platform_has_deoi())
>> +		vfio_platform_register_deoi_producer(vdev, irq,
>> +						     trigger, irq->hwirq);
>> +
>>  	if (!irq->masked)
>>  		enable_irq(irq->hwirq);
>>  
>> diff --git a/drivers/vfio/platform/vfio_platform_irq_bypass.c b/drivers/vfio/platform/vfio_platform_irq_bypass.c
>> new file mode 100644
>> index 0000000..436902c
>> --- /dev/null
>> +++ b/drivers/vfio/platform/vfio_platform_irq_bypass.c
>> @@ -0,0 +1,114 @@
>> +/*
>> + * VFIO platform device irqbypass callback implementation for DEOI
>> + *
>> + * Copyright (C) 2017 Red Hat, Inc.  All rights reserved.
>> + * Author: Eric Auger <eric.auger@redhat.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License, version 2, as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +
>> +#include <linux/err.h>
>> +#include <linux/device.h>
>> +#include <linux/irq.h>
>> +#include <linux/irqbypass.h>
>> +#include "vfio_platform_private.h"
>> +
>> +#ifdef CONFIG_VFIO_PLATFORM_IRQ_BYPASS_DEOI
>> +
>> +static void irq_bypass_deoi_start(struct irq_bypass_producer *prod)
>> +{
>> +	enable_irq(prod->irq);
>> +}
>> +
>> +static void irq_bypass_deoi_stop(struct irq_bypass_producer *prod)
>> +{
>> +	disable_irq(prod->irq);
>> +}
>> +
>> +/**
>> + * irq_bypass_deoi_add_consumer - turns irq direct EOI on
>> + *
>> + * The linux irq is disabled when the function is called.
>> + * The operation succeeds only if the irq is not active at irqchip level
>> + * and the irq is not automasked at VFIO level, meaning the IRQ is under
>> + * injection into the guest.
>> + */
>> +static int irq_bypass_deoi_add_consumer(struct irq_bypass_producer *prod,
>> +					struct irq_bypass_consumer *cons)
>> +{
>> +	struct vfio_platform_irq *irq_ctx =
>> +		container_of(prod, struct vfio_platform_irq, producer);
>> +	unsigned long flags;
>> +	bool active;
>> +	int ret;
>> +
>> +	spin_lock_irqsave(&irq_ctx->lock, flags);
>> +
>> +	ret = irq_get_irqchip_state(irq_ctx->hwirq, IRQCHIP_STATE_ACTIVE,
>> +				    &active);
>> +	if (ret)
>> +		goto out;
>> +
>> +	if (active || irq_ctx->automasked) {
>> +		ret = -EAGAIN;
>> +		goto out;
>> +	}
>> +
>> +	if (!(irq_get_trigger_type(irq_ctx->hwirq) & IRQ_TYPE_LEVEL_MASK))
>> +		goto out;
> 
> Not sure why this wouldn't return -EINVAL;
At the moment direct EOI is also set up for edge-sensitive IRQs. This
means the deactivation of the IRQ happens later, through the guest's
deactivation of the virtual interrupt, but isn't that arguably more
accurate with respect to passthrough? Marc, what is your opinion? The
GIC spec does not seem to restrict this mode to level-sensitive
interrupts, or am I mistaken?

Thanks

Eric
> 
>> +
>> +	ret = vfio_platform_set_deoi(irq_ctx, true);
>> +out:
>> +	spin_unlock_irqrestore(&irq_ctx->lock, flags);
>> +	return ret;
>> +}
>> +
>> +static void irq_bypass_deoi_del_consumer(struct irq_bypass_producer *prod,
>> +					 struct irq_bypass_consumer *cons)
>> +{
>> +	struct vfio_platform_irq *irq_ctx =
>> +		container_of(prod, struct vfio_platform_irq, producer);
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&irq_ctx->lock, flags);
>> +	if (irq_get_trigger_type(irq_ctx->hwirq) & IRQ_TYPE_LEVEL_MASK)
>> +		vfio_platform_set_deoi(irq_ctx, false);
>> +	spin_unlock_irqrestore(&irq_ctx->lock, flags);
>> +}
>> +
>> +bool vfio_platform_has_deoi(void)
>> +{
>> +	return true;
>> +}
>> +
>> +void vfio_platform_register_deoi_producer(struct vfio_platform_device *vdev,
>> +					  struct vfio_platform_irq *irq,
>> +					  struct eventfd_ctx *trigger,
>> +					  unsigned int host_irq)
>> +{
>> +	struct irq_bypass_producer *prod = &irq->producer;
>> +	int ret;
>> +
>> +	prod->token =		trigger;
>> +	prod->irq =		host_irq;
>> +	prod->add_consumer =	irq_bypass_deoi_add_consumer;
>> +	prod->del_consumer =	irq_bypass_deoi_del_consumer;
>> +	prod->stop =		irq_bypass_deoi_stop;
>> +	prod->start =		irq_bypass_deoi_start;
>> +
>> +	ret = irq_bypass_register_producer(prod);
>> +	if (unlikely(ret))
>> +		dev_info(vdev->device,
>> +			 "irq bypass producer (token %p) registration fails: %d\n",
>> +			 prod->token, ret);
>> +}
>> +
>> +#endif
>> +
>> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
>> index b80a380..bfa2675 100644
>> --- a/drivers/vfio/platform/vfio_platform_private.h
>> +++ b/drivers/vfio/platform/vfio_platform_private.h
>> @@ -17,6 +17,7 @@
>>  
>>  #include <linux/types.h>
>>  #include <linux/interrupt.h>
>> +#include <linux/irqbypass.h>
>>  
>>  #define VFIO_PLATFORM_OFFSET_SHIFT   40
>>  #define VFIO_PLATFORM_OFFSET_MASK (((u64)(1) << VFIO_PLATFORM_OFFSET_SHIFT) - 1)
>> @@ -40,6 +41,7 @@ struct vfio_platform_irq {
>>  	struct virqfd		*mask;
>>  	bool			deoi;
>>  	irqreturn_t		(*handler)(int irq, void *dev_id);
>> +	struct irq_bypass_producer producer;
>>  };
>>  
>>  struct vfio_platform_region {
>> @@ -102,9 +104,30 @@ extern int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev,
>>  					unsigned start, unsigned count,
>>  					void *data);
>>  
>> +extern int vfio_platform_set_deoi(struct vfio_platform_irq *irq_ctx, bool deoi);
>> +
>>  extern void __vfio_platform_register_reset(struct vfio_platform_reset_node *n);
>>  extern void vfio_platform_unregister_reset(const char *compat,
>>  					   vfio_platform_reset_fn_t fn);
>> +
>> +#ifdef CONFIG_VFIO_PLATFORM_IRQ_BYPASS_DEOI
>> +bool vfio_platform_has_deoi(void);
>> +void vfio_platform_register_deoi_producer(struct vfio_platform_device *vdev,
>> +					  struct vfio_platform_irq *irq,
>> +					  struct eventfd_ctx *trigger,
>> +					  unsigned int host_irq);
>> +#else
>> +static inline bool vfio_platform_has_deoi(void)
>> +{
>> +	return false;
>> +}
>> +static inline
>> +void vfio_platform_register_deoi_producer(struct vfio_platform_device *vdev,
>> +					  struct vfio_platform_irq *irq,
>> +					  struct eventfd_ctx *trigger,
>> +					  unsigned int host_irq) {}
>> +#endif
>> +
>>  #define vfio_platform_register_reset(__compat, __reset)		\
>>  static struct vfio_platform_reset_node __reset ## _node = {	\
>>  	.owner = THIS_MODULE,					\
> 

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 03/10] VFIO: platform: Direct EOI irq bypass for ARM/ARM64
  2017-05-31 19:31       ` Auger Eric
  (?)
@ 2017-06-01 10:49       ` Marc Zyngier
  -1 siblings, 0 replies; 69+ messages in thread
From: Marc Zyngier @ 2017-06-01 10:49 UTC (permalink / raw)
  To: Auger Eric, Alex Williamson
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, pbonzini,
	christoffer.dall, drjones, wei

Hi Eric,

On 31/05/17 20:31, Auger Eric wrote:
> Hi Alex, Marc,
> 
> On 31/05/2017 20:20, Alex Williamson wrote:
>> On Wed, 24 May 2017 22:13:16 +0200
>> Eric Auger <eric.auger@redhat.com> wrote:
>>
>>> This patch adds the registration/unregistration of an
>>> irq_bypass_producer for vfio platform device interrupts.
>>>
>>> Its callbacks handle the direct EOI modality on VFIO side.
>>>
>>> - stop/start: disable/enable the host irq
>>> - add/del consumer: set the VFIO Direct EOI mode, ie. select the
>>>   adapted physical IRQ handler (automasked or not automasked).
>>>
>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>
>>> ---
>>> ---
>>>  drivers/vfio/platform/Kconfig                    |   5 +
>>>  drivers/vfio/platform/Makefile                   |   2 +-
>>>  drivers/vfio/platform/vfio_platform_irq.c        |  19 ++++
>>>  drivers/vfio/platform/vfio_platform_irq_bypass.c | 114 +++++++++++++++++++++++
>>>  drivers/vfio/platform/vfio_platform_private.h    |  23 +++++
>>>  5 files changed, 162 insertions(+), 1 deletion(-)
>>>  create mode 100644 drivers/vfio/platform/vfio_platform_irq_bypass.c
>>>
>>> diff --git a/drivers/vfio/platform/Kconfig b/drivers/vfio/platform/Kconfig
>>> index bb30128..33ec3d9 100644
>>> --- a/drivers/vfio/platform/Kconfig
>>> +++ b/drivers/vfio/platform/Kconfig
>>> @@ -2,6 +2,7 @@ config VFIO_PLATFORM
>>>  	tristate "VFIO support for platform devices"
>>>  	depends on VFIO && EVENTFD && (ARM || ARM64)
>>>  	select VFIO_VIRQFD
>>> +	select IRQ_BYPASS_MANAGER
>>>  	help
>>>  	  Support for platform devices with VFIO. This is required to make
>>>  	  use of platform devices present on the system using the VFIO
>>> @@ -19,4 +20,8 @@ config VFIO_AMBA
>>>  
>>>  	  If you don't know what to do here, say N.
>>>  
>>> +config VFIO_PLATFORM_IRQ_BYPASS_DEOI
>>> +	depends on VFIO_PLATFORM
>>> +	def_bool y
>>> +
>>>  source "drivers/vfio/platform/reset/Kconfig"
>>> diff --git a/drivers/vfio/platform/Makefile b/drivers/vfio/platform/Makefile
>>> index 41a6224..324f3e7 100644
>>> --- a/drivers/vfio/platform/Makefile
>>> +++ b/drivers/vfio/platform/Makefile
>>> @@ -1,4 +1,4 @@
>>> -vfio-platform-base-y := vfio_platform_common.o vfio_platform_irq.o
>>> +vfio-platform-base-y := vfio_platform_common.o vfio_platform_irq.o vfio_platform_irq_bypass.o
>>>  vfio-platform-y := vfio_platform.o
>>>  
>>>  obj-$(CONFIG_VFIO_PLATFORM) += vfio-platform.o
>>> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
>>> index 2f82459..5b70c8e 100644
>>> --- a/drivers/vfio/platform/vfio_platform_irq.c
>>> +++ b/drivers/vfio/platform/vfio_platform_irq.c
>>> @@ -20,6 +20,7 @@
>>>  #include <linux/types.h>
>>>  #include <linux/vfio.h>
>>>  #include <linux/irq.h>
>>> +#include <linux/irqbypass.h>
>>>  
>>>  #include "vfio_platform_private.h"
>>>  
>>> @@ -186,6 +187,19 @@ static irqreturn_t vfio_wrapper_handler(int irq, void *dev_id)
>>>  	return ret;
>>>  }
>>>  
>>> +/* must be called with irq_ctx->lock held */
>>> +int vfio_platform_set_deoi(struct vfio_platform_irq *irq_ctx, bool deoi)
>>> +{
>>> +	irq_ctx->deoi = deoi;
>>> +
>>> +	if (!deoi && (irq_ctx->flags & VFIO_IRQ_INFO_AUTOMASKED))
>>> +		irq_ctx->handler = vfio_automasked_irq_handler;
>>> +	else
>>> +		irq_ctx->handler = vfio_irq_handler;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>>  static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>>>  			    int fd, irq_handler_t handler)
>>>  {
>>> @@ -196,6 +210,7 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>>>  	if (irq->trigger) {
>>>  		irq_clear_status_flags(irq->hwirq, IRQ_NOAUTOEN);
>>>  		free_irq(irq->hwirq, irq);
>>> +		irq_bypass_unregister_producer(&irq->producer);
>>>  		kfree(irq->name);
>>>  		eventfd_ctx_put(irq->trigger);
>>>  		irq->trigger = NULL;
>>> @@ -227,6 +242,10 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>>>  		return ret;
>>>  	}
>>>  
>>> +	if (vfio_platform_has_deoi())
>>> +		vfio_platform_register_deoi_producer(vdev, irq,
>>> +						     trigger, irq->hwirq);
>>> +
>>>  	if (!irq->masked)
>>>  		enable_irq(irq->hwirq);
>>>  
>>> diff --git a/drivers/vfio/platform/vfio_platform_irq_bypass.c b/drivers/vfio/platform/vfio_platform_irq_bypass.c
>>> new file mode 100644
>>> index 0000000..436902c
>>> --- /dev/null
>>> +++ b/drivers/vfio/platform/vfio_platform_irq_bypass.c
>>> @@ -0,0 +1,114 @@
>>> +/*
>>> + * VFIO platform device irqbypass callback implementation for DEOI
>>> + *
>>> + * Copyright (C) 2017 Red Hat, Inc.  All rights reserved.
>>> + * Author: Eric Auger <eric.auger@redhat.com>
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License, version 2, as
>>> + * published by the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>> + * GNU General Public License for more details.
>>> + */
>>> +
>>> +#include <linux/err.h>
>>> +#include <linux/device.h>
>>> +#include <linux/irq.h>
>>> +#include <linux/irqbypass.h>
>>> +#include "vfio_platform_private.h"
>>> +
>>> +#ifdef CONFIG_VFIO_PLATFORM_IRQ_BYPASS_DEOI
>>> +
>>> +static void irq_bypass_deoi_start(struct irq_bypass_producer *prod)
>>> +{
>>> +	enable_irq(prod->irq);
>>> +}
>>> +
>>> +static void irq_bypass_deoi_stop(struct irq_bypass_producer *prod)
>>> +{
>>> +	disable_irq(prod->irq);
>>> +}
>>> +
>>> +/**
>>> + * irq_bypass_deoi_add_consumer - turns irq direct EOI on
>>> + *
>>> + * The linux irq is disabled when the function is called.
>>> + * The operation succeeds only if the irq is not active at irqchip level
>>> + * and the irq is not automasked at VFIO level, meaning the IRQ is under
>>> + * injection into the guest.
>>> + */
>>> +static int irq_bypass_deoi_add_consumer(struct irq_bypass_producer *prod,
>>> +					struct irq_bypass_consumer *cons)
>>> +{
>>> +	struct vfio_platform_irq *irq_ctx =
>>> +		container_of(prod, struct vfio_platform_irq, producer);
>>> +	unsigned long flags;
>>> +	bool active;
>>> +	int ret;
>>> +
>>> +	spin_lock_irqsave(&irq_ctx->lock, flags);
>>> +
>>> +	ret = irq_get_irqchip_state(irq_ctx->hwirq, IRQCHIP_STATE_ACTIVE,
>>> +				    &active);
>>> +	if (ret)
>>> +		goto out;
>>> +
>>> +	if (active || irq_ctx->automasked) {
>>> +		ret = -EAGAIN;
>>> +		goto out;
>>> +	}
>>> +
>>> +	if (!(irq_get_trigger_type(irq_ctx->hwirq) & IRQ_TYPE_LEVEL_MASK))
>>> +		goto out;
>>
>> Not sure why this wouldn't return -EINVAL;
> At the moment direct EOI is also set up for edge sensitive IRQs. This
> means the deactivation of the IRQ will happen later through guest
> virtual interrupt deactivation but somehow isn't it more accurate wrt
> the passthrough? Marc, what is your opinion? GIC spec does not seem to
> restrict this mode to level sensitive interrupts or am I mistaken?

My preference would be to handle edge interrupts the same way as level
(setting deoi to true). The rationale is that in the case of a screaming
interrupt, forwarding it will prevent subsequent interrupts from
knocking the VM out of the CPU (the active state at the physical level
will prevent the interrupt from being observed).

This would make the VM behave just like it would on bare metal, where
the new edges would only be observed once the initial interrupt has been
deactivated.
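
As an illustration of the behaviour described above, here is a toy model
in plain C. It is not GIC or kernel code, and every name in it
(phys_irq, model_edge, model_guest_deactivate) is made up for the
sketch. It assumes that an edge arriving while the interrupt is active
is only latched, and is taken by the CPU once the guest has deactivated
the interrupt.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one forwarded physical interrupt; not GIC or kernel code. */
struct phys_irq {
	bool active;        /* acknowledged by the host, not yet deactivated */
	bool pending;       /* an edge was latched while the IRQ was active */
	unsigned int taken; /* how many times the CPU took the interrupt */
};

/* A new edge arrives from the device. */
static void model_edge(struct phys_irq *irq)
{
	if (irq->active) {
		/* The active state hides the edge from the CPU: latch it only. */
		irq->pending = true;
		return;
	}
	irq->active = true;
	irq->taken++;
}

/*
 * The guest deactivates the virtual IRQ. With direct EOI this is assumed
 * to deactivate the physical IRQ as well, so a latched edge is taken now.
 */
static void model_guest_deactivate(struct phys_irq *irq)
{
	assert(irq->active);
	irq->active = false;
	if (irq->pending) {
		irq->pending = false;
		irq->active = true;
		irq->taken++;
	}
}
```

In this model a burst of edges costs the host a single interrupt until
the guest deactivates, which is the property that keeps a screaming
interrupt from repeatedly knocking the VM out of the CPU.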

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 05/10] VFIO: pci: Introduce direct EOI INTx interrupt handler
  2017-05-31 18:24   ` Alex Williamson
@ 2017-06-01 20:40       ` Auger Eric
  2017-06-14  8:07     ` Auger Eric
  1 sibling, 0 replies; 69+ messages in thread
From: Auger Eric @ 2017-06-01 20:40 UTC (permalink / raw)
  To: Alex Williamson
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, pbonzini,
	marc.zyngier, christoffer.dall, drjones, wei

Hi Alex,

On 31/05/2017 20:24, Alex Williamson wrote:
> On Wed, 24 May 2017 22:13:18 +0200
> Eric Auger <eric.auger@redhat.com> wrote:
> 
>> We add two new fields in vfio_pci_irq_ctx struct: deoi and handler.
>> If deoi is set, this means the physical IRQ attached to the virtual
>> IRQ is directly deactivated by the guest and the VFIO driver does
>> not need to disable the physical IRQ and mask it at VFIO level.
>>
>> The handler pointer is set accordingly and a wrapper handler is
>> introduced that calls the chosen handler function.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>> ---
>>  drivers/vfio/pci/vfio_pci_intrs.c   | 32 ++++++++++++++++++++++++++------
>>  drivers/vfio/pci/vfio_pci_private.h |  2 ++
>>  2 files changed, 28 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
>> index d4d377b..06aa713 100644
>> --- a/drivers/vfio/pci/vfio_pci_intrs.c
>> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
>> @@ -121,11 +121,8 @@ void vfio_pci_intx_unmask(struct vfio_pci_device *vdev)
>>  static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
>>  {
>>  	struct vfio_pci_device *vdev = dev_id;
>> -	unsigned long flags;
>>  	int ret = IRQ_NONE;
>>  
>> -	spin_lock_irqsave(&vdev->irqlock, flags);
>> -
>>  	if (!vdev->pci_2_3) {
>>  		disable_irq_nosync(vdev->pdev->irq);
>>  		vdev->ctx[0].automasked = true;
>> @@ -137,14 +134,33 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
>>  		ret = IRQ_HANDLED;
>>  	}
>>  
>> -	spin_unlock_irqrestore(&vdev->irqlock, flags);
>> -
>>  	if (ret == IRQ_HANDLED)
>>  		vfio_send_intx_eventfd(vdev, NULL);
>>  
>>  	return ret;
>>  }
>>  
>> +static irqreturn_t vfio_intx_handler_deoi(int irq, void *dev_id)
>> +{
>> +	struct vfio_pci_device *vdev = dev_id;
>> +
>> +	vfio_send_intx_eventfd(vdev, NULL);
>> +	return IRQ_HANDLED;
>> +}
>> +
>> +static irqreturn_t vfio_intx_wrapper_handler(int irq, void *dev_id)
>> +{
>> +	struct vfio_pci_device *vdev = dev_id;
>> +	unsigned long flags;
>> +	irqreturn_t ret;
>> +
>> +	spin_lock_irqsave(&vdev->irqlock, flags);
>> +	ret = vdev->ctx[0].handler(irq, dev_id);
>> +	spin_unlock_irqrestore(&vdev->irqlock, flags);
>> +
>> +	return ret;
>> +}
>> +
>>  static int vfio_intx_enable(struct vfio_pci_device *vdev)
>>  {
>>  	if (!is_irq_none(vdev))
>> @@ -208,7 +224,11 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
>>  	if (!vdev->pci_2_3)
>>  		irqflags = 0;
>>  
>> -	ret = request_irq(pdev->irq, vfio_intx_handler,
>> +	if (vdev->ctx[0].deoi)
>> +		vdev->ctx[0].handler = vfio_intx_handler_deoi;
>> +	else
>> +		vdev->ctx[0].handler = vfio_intx_handler;
>> +	ret = request_irq(pdev->irq, vfio_intx_wrapper_handler,
>>  			  irqflags, vdev->ctx[0].name, vdev);
> 
> 
> Here's where I think we don't account for irqflags properly.  If we get
> a shared interrupt here, then enabling direct EOI needs to be disabled
> or else we'll starve other devices sharing the interrupt.  In practice,
> I wonder if this makes PCI direct EOI a useful feature.  We could try
> to get an exclusive interrupt and fallback to shared, but any time we
> get an exclusive interrupt we're more prone to conflicts with other
> devices.  I might have two VMs that share an interrupt and now it's a
> race that only the first to setup an IRQ can work.  Worse, one of those
> VMs might be fully booted and switched to MSI and now it's just a
> matter of time until they reboot in the right way to generate a
> conflict.  I might also have two devices in the same VM that share an
> IRQ and now I can't start the VM at all because the second device can
> no longer get an interrupt.  This is the same problem we have with the
> nointxmask flag, it's a useful debugging feature but since the masking
> is done at the APIC/GIC rather than the device, much like here, it's not
> very practical for more than debugging and isolating specific devices
> as requiring APIC/GIC level masking.  I'm not sure how to proceed on the
> PCI side here. Thanks,

Thanks for the review.

I share your concerns about shared interrupts. I need to spend some
additional time reading the VFIO code related to those and that part of
the PCIe spec too.

Maybe we should introduce the IRQ bypass based DEOI setup only for
platform devices. I know those are not much used at the moment but this
at least prepares for GICv4 direct injection.

Thanks

Eric
> 
> Alex
> 
>>  	if (ret) {
>>  		vdev->ctx[0].trigger = NULL;
>> diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
>> index f7f1101..5cfe59a 100644
>> --- a/drivers/vfio/pci/vfio_pci_private.h
>> +++ b/drivers/vfio/pci/vfio_pci_private.h
>> @@ -36,6 +36,8 @@ struct vfio_pci_irq_ctx {
>>  	char			*name;
>>  	bool			masked;
>>  	bool			automasked;
>> +	bool			deoi;
>> +	irqreturn_t		(*handler)(int irq, void *dev_id);
>>  	struct irq_bypass_producer	producer;
>>  };
>>  
> 

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 05/10] VFIO: pci: Introduce direct EOI INTx interrupt handler
  2017-06-01 20:40       ` Auger Eric
@ 2017-06-02  8:41         ` Marc Zyngier
  -1 siblings, 0 replies; 69+ messages in thread
From: Marc Zyngier @ 2017-06-02  8:41 UTC (permalink / raw)
  To: Auger Eric, Alex Williamson
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, pbonzini,
	christoffer.dall, drjones, wei

On 01/06/17 21:40, Auger Eric wrote:
> Hi Alex,
> 
> On 31/05/2017 20:24, Alex Williamson wrote:
>> On Wed, 24 May 2017 22:13:18 +0200
>> Eric Auger <eric.auger@redhat.com> wrote:
>>
>>> We add two new fields in vfio_pci_irq_ctx struct: deoi and handler.
>>> If deoi is set, this means the physical IRQ attached to the virtual
>>> IRQ is directly deactivated by the guest and the VFIO driver does
>>> not need to disable the physical IRQ and mask it at VFIO level.
>>>
>>> The handler pointer is set accordingly and a wrapper handler is
>>> introduced that calls the chosen handler function.
>>>
>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>
>>> ---
>>> ---
>>>  drivers/vfio/pci/vfio_pci_intrs.c   | 32 ++++++++++++++++++++++++++------
>>>  drivers/vfio/pci/vfio_pci_private.h |  2 ++
>>>  2 files changed, 28 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
>>> index d4d377b..06aa713 100644
>>> --- a/drivers/vfio/pci/vfio_pci_intrs.c
>>> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
>>> @@ -121,11 +121,8 @@ void vfio_pci_intx_unmask(struct vfio_pci_device *vdev)
>>>  static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
>>>  {
>>>  	struct vfio_pci_device *vdev = dev_id;
>>> -	unsigned long flags;
>>>  	int ret = IRQ_NONE;
>>>  
>>> -	spin_lock_irqsave(&vdev->irqlock, flags);
>>> -
>>>  	if (!vdev->pci_2_3) {
>>>  		disable_irq_nosync(vdev->pdev->irq);
>>>  		vdev->ctx[0].automasked = true;
>>> @@ -137,14 +134,33 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
>>>  		ret = IRQ_HANDLED;
>>>  	}
>>>  
>>> -	spin_unlock_irqrestore(&vdev->irqlock, flags);
>>> -
>>>  	if (ret == IRQ_HANDLED)
>>>  		vfio_send_intx_eventfd(vdev, NULL);
>>>  
>>>  	return ret;
>>>  }
>>>  
>>> +static irqreturn_t vfio_intx_handler_deoi(int irq, void *dev_id)
>>> +{
>>> +	struct vfio_pci_device *vdev = dev_id;
>>> +
>>> +	vfio_send_intx_eventfd(vdev, NULL);
>>> +	return IRQ_HANDLED;
>>> +}
>>> +
>>> +static irqreturn_t vfio_intx_wrapper_handler(int irq, void *dev_id)
>>> +{
>>> +	struct vfio_pci_device *vdev = dev_id;
>>> +	unsigned long flags;
>>> +	irqreturn_t ret;
>>> +
>>> +	spin_lock_irqsave(&vdev->irqlock, flags);
>>> +	ret = vdev->ctx[0].handler(irq, dev_id);
>>> +	spin_unlock_irqrestore(&vdev->irqlock, flags);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>>  static int vfio_intx_enable(struct vfio_pci_device *vdev)
>>>  {
>>>  	if (!is_irq_none(vdev))
>>> @@ -208,7 +224,11 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
>>>  	if (!vdev->pci_2_3)
>>>  		irqflags = 0;
>>>  
>>> -	ret = request_irq(pdev->irq, vfio_intx_handler,
>>> +	if (vdev->ctx[0].deoi)
>>> +		vdev->ctx[0].handler = vfio_intx_handler_deoi;
>>> +	else
>>> +		vdev->ctx[0].handler = vfio_intx_handler;
>>> +	ret = request_irq(pdev->irq, vfio_intx_wrapper_handler,
>>>  			  irqflags, vdev->ctx[0].name, vdev);
>>
>>
>> Here's where I think we don't account for irqflags properly.  If we get
>> a shared interrupt here, then enabling direct EOI needs to be disabled
>> or else we'll starve other devices sharing the interrupt.  In practice,
>> I wonder if this makes PCI direct EOI a useful feature.  We could try
>> to get an exclusive interrupt and fallback to shared, but any time we
>> get an exclusive interrupt we're more prone to conflicts with other
>> devices.  I might have two VMs that share an interrupt and now it's a
>> race that only the first to setup an IRQ can work.  Worse, one of those
>> VMs might be fully booted and switched to MSI and now it's just a
>> matter of time until they reboot in the right way to generate a
>> conflict.  I might also have two devices in the same VM that share an
>> IRQ and now I can't start the VM at all because the second device can
>> no longer get an interrupt.  This is the same problem we have with the
>> nointxmask flag, it's a useful debugging feature but since the masking
>> is done at the APIC/GIC rather than the device, much like here, it's not
>> very practical for more than debugging and isolating specific devices
>> as requiring APIC/GIC level masking.  I'm not sure how to proceed on the
>> PCI side here. Thanks,
> 
> Thanks for the review.
> 
> I share your concerns about shared interrupts. I need to spend some
> additional time reading the VFIO code related to those and that part of
> the PCIe spec too.
> 
> Maybe we should introduce the IRQ bypass based DEOI setup only for
> platform devices. I know those are not much used at the moment but this
> at least prepares for GICv4 direct injection.

I think that'd be good enough, unless we can ensure that we only engage
the Direct EOI feature when PCI legacy interrupts are not shared. The
system I have here (AMD Seattle) seems to have one set of INTx per PCIe
port, so the above would definitely work on it. But I understand that
there is no guarantee of this in general.

Maybe the "nointxmask" flag is not that bad a solution in that case. It
would clearly outline that the user knows the platform is safe, and that
we can use the Direct EOI feature.
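
A minimal sketch of the "only engage Direct EOI when the line is not
shared" rule, written as a self-contained C model rather than a change
to vfio_intx_set_signal(). MODEL_IRQF_SHARED stands in for the kernel's
IRQF_SHARED flag, and the enum and function names are hypothetical.

```c
#include <stdbool.h>

/* Stand-in for the kernel's IRQF_SHARED flag (assumption of this model). */
#define MODEL_IRQF_SHARED 0x80u

enum model_handler {
	MODEL_HANDLER_MASKING, /* host masks the IRQ and waits for an unmask */
	MODEL_HANDLER_DEOI,    /* guest deactivation ends the physical IRQ */
};

/*
 * Pick the INTx handler. Direct EOI is engaged only when it was requested
 * and the line was obtained exclusively; a shared line always falls back
 * to the masking handler so other devices on the line are not starved.
 */
static enum model_handler model_pick_handler(unsigned int irqflags,
					     bool deoi_requested)
{
	if (deoi_requested && !(irqflags & MODEL_IRQF_SHARED))
		return MODEL_HANDLER_DEOI;

	return MODEL_HANDLER_MASKING;
}
```

Whether the exclusive request comes from "nointxmask" or from some other
opt-in is left open here; the sketch only shows the fallback.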

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 08/10] KVM: arm/arm64: vgic: Handle unshared mapped interrupts
  2017-05-24 20:13   ` Eric Auger
@ 2017-06-02 13:33     ` Christoffer Dall
  -1 siblings, 0 replies; 69+ messages in thread
From: Christoffer Dall @ 2017-06-02 13:33 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, alex.williamson,
	pbonzini, marc.zyngier, christoffer.dall, drjones, wei

On Wed, May 24, 2017 at 10:13:21PM +0200, Eric Auger wrote:
> Virtual interrupts directly mapped to physical interrupts require
> some special care. Their pending and active state must be observed
> at distributor level and not in the list register.

This is not entirely true.  There's a dependency, but there is also
separate virtual vs. physical state, see below.

> 
> Also a level sensitive interrupt's level is not toggled down by any
> maintenance IRQ handler as the EOI is not trapped.
> 
> This patch adds an host_irq field in vgic_irq struct to easily
> get the irqchip state of the host irq. We also handle the
> physical IRQ case in vgic_validate_injection and add helpers to
> get the line level and active state.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  include/kvm/arm_vgic.h    |  4 +++-
>  virt/kvm/arm/arch_timer.c |  3 ++-
>  virt/kvm/arm/vgic/vgic.c  | 44 ++++++++++++++++++++++++++++++++++++++------
>  virt/kvm/arm/vgic/vgic.h  |  9 ++++++++-
>  4 files changed, 51 insertions(+), 9 deletions(-)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index ef71858..695ebc7 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -112,6 +112,7 @@ struct vgic_irq {
>  	bool hw;			/* Tied to HW IRQ */
>  	struct kref refcount;		/* Used for LPIs */
>  	u32 hwintid;			/* HW INTID number */
> +	unsigned int host_irq;		/* linux irq corresponding to hwintid */
>  	union {
>  		u8 targets;			/* GICv2 target VCPUs mask */
>  		u32 mpidr;			/* GICv3 target VCPU */
> @@ -301,7 +302,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>  			bool level);
>  int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>  			       bool level);
> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq);
> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
> +			  u32 virt_irq, u32 phys_irq);
>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int virt_irq);
>  bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq);
>  
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index 5976609..45f4779 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -651,7 +651,8 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
>  	 * Tell the VGIC that the virtual interrupt is tied to a
>  	 * physical interrupt. We do that once per VCPU.
>  	 */
> -	ret = kvm_vgic_map_phys_irq(vcpu, vtimer->irq.irq, phys_irq);
> +	ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq,
> +				    vtimer->irq.irq, phys_irq);
>  	if (ret)
>  		return ret;
>  
> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> index 83b24d2..aa0618c 100644
> --- a/virt/kvm/arm/vgic/vgic.c
> +++ b/virt/kvm/arm/vgic/vgic.c
> @@ -137,6 +137,28 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
>  	kfree(irq);
>  }
>  
> +bool irq_line_level(struct vgic_irq *irq)
> +{
> +	bool line_level = irq->line_level;
> +
> +	if (unlikely(is_unshared_mapped(irq)))
> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
> +					      IRQCHIP_STATE_PENDING,
> +					      &line_level));
> +	return line_level;
> +}

This really looks fishy.  When do we need this exactly?

I feel like we should treat this more like everything else and set the
line_level on the irq even for forwarded interrupts, and then you don't
need changes to validate injection.

The challenge, then, is how to re-sample the line and lower the
line_level field when necessary.  Can't we simply do this in
vgic_fold_lr_state(), and if you have a forwarded interrupt which is
level triggered and the level is high, then notify the one who injected
this and tell it to adjust its line level (lower it if it changed).

That would follow our existing path very closely.

Am I missing something?
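
Roughly, the idea above could look like the following. This is a
self-contained C model, not the actual vgic_fold_lr_state() code: the
struct, the resample callback and model_read_line() are all invented for
the sketch, and it assumes the injector can report the current line
level when asked.

```c
#include <stdbool.h>

/* Reduced stand-in for struct vgic_irq; only the fields the sketch needs. */
struct model_vgic_irq {
	bool hw;         /* forwarded: tied to a physical interrupt */
	bool level_cfg;  /* configured as level-triggered */
	bool line_level; /* cached line level, set at injection time */
};

/* Hypothetical callback: the injector reports the current line level. */
typedef bool (*model_resample_fn)(void *opaque);

/* Example injector callback: the line level is stored behind 'opaque'. */
static bool model_read_line(void *opaque)
{
	return *(const bool *)opaque;
}

/*
 * Called when list register state is folded back. Only a forwarded,
 * level-triggered interrupt whose cached level is high is re-sampled;
 * the cached level is lowered if the device has dropped the line.
 */
static void model_fold_resample(struct model_vgic_irq *irq,
				model_resample_fn resample, void *opaque)
{
	if (!irq->hw || !irq->level_cfg || !irq->line_level)
		return;

	irq->line_level = resample(opaque);
}
```

With something along these lines line_level stays meaningful for
forwarded interrupts, and the level check in vgic_validate_injection()
would not need a special case.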

> +
> +bool irq_is_active(struct vgic_irq *irq)
> +{
> +	bool is_active = irq->active;
> +
> +	if (unlikely(is_unshared_mapped(irq)))
> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
> +					      IRQCHIP_STATE_ACTIVE,
> +					      &is_active));
> +	return is_active;
> +}

Why do we need this?

The active state of a virtual IRQ is independent from the underlying
physical state, as I see it.

For example, when the interrupt is initially injected to the VGIC, it
will be ACTIVE on the physical distributor, but PENDING on the VGIC.
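
To make the example concrete, here is a small state model in plain C. It
is a sketch with invented names, not VGIC code, and it assumes the
direct EOI behaviour described in the cover letter, where the guest's
deactivation also deactivates the physical interrupt.

```c
enum model_state { MODEL_INACTIVE, MODEL_PENDING, MODEL_ACTIVE };

/* One forwarded interrupt: physical and virtual state tracked separately. */
struct model_fwd_irq {
	enum model_state phys; /* state at the physical distributor */
	enum model_state virt; /* state as seen by the guest via the VGIC */
};

/* Host takes the physical IRQ and injects the virtual one. */
static void model_host_inject(struct model_fwd_irq *irq)
{
	irq->phys = MODEL_ACTIVE;
	irq->virt = MODEL_PENDING;
}

/* Guest acknowledges the virtual IRQ. */
static void model_guest_ack(struct model_fwd_irq *irq)
{
	irq->virt = MODEL_ACTIVE;
}

/* Guest deactivates; the physical IRQ is deactivated along with it. */
static void model_guest_eoi(struct model_fwd_irq *irq)
{
	irq->virt = MODEL_INACTIVE;
	irq->phys = MODEL_INACTIVE;
}
```

Right after model_host_inject() the two states differ, so reading the
physical active state in place of irq->active reports the virtual
interrupt as active when it is only pending.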


Thanks,
-Christoffer

> +
>  /**
>   * kvm_vgic_target_oracle - compute the target vcpu for an irq
>   *
> @@ -153,7 +175,7 @@ static struct kvm_vcpu *vgic_target_oracle(struct vgic_irq *irq)
>  	DEBUG_SPINLOCK_BUG_ON(!spin_is_locked(&irq->irq_lock));
>  
>  	/* If the interrupt is active, it must stay on the current vcpu */
> -	if (irq->active)
> +	if (irq_is_active(irq))
>  		return irq->vcpu ? : irq->target_vcpu;
>  
>  	/*
> @@ -195,14 +217,18 @@ static int vgic_irq_cmp(void *priv, struct list_head *a, struct list_head *b)
>  {
>  	struct vgic_irq *irqa = container_of(a, struct vgic_irq, ap_list);
>  	struct vgic_irq *irqb = container_of(b, struct vgic_irq, ap_list);
> +	bool activea, activeb;
>  	bool penda, pendb;
>  	int ret;
>  
>  	spin_lock(&irqa->irq_lock);
>  	spin_lock_nested(&irqb->irq_lock, SINGLE_DEPTH_NESTING);
>  
> -	if (irqa->active || irqb->active) {
> -		ret = (int)irqb->active - (int)irqa->active;
> +	activea = irq_is_active(irqa);
> +	activeb = irq_is_active(irqb);
> +
> +	if (activea || activeb) {
> +		ret = (int)activeb - (int)activea;
>  		goto out;
>  	}
>  
> @@ -234,13 +260,17 @@ static void vgic_sort_ap_list(struct kvm_vcpu *vcpu)
>  
>  /*
>   * Only valid injection if changing level for level-triggered IRQs or for a
> - * rising edge.
> + * rising edge. Injection of virtual interrupts associated to physical
> + * interrupts always is valid.
>   */
>  static bool vgic_validate_injection(struct vgic_irq *irq, bool level)
>  {
>  	switch (irq->config) {
>  	case VGIC_CONFIG_LEVEL:
> -		return irq->line_level != level;
> +		if (unlikely(is_unshared_mapped(irq)))
> +			return true;
> +		else
> +			return irq->line_level != level;
>  	case VGIC_CONFIG_EDGE:
>  		return level;
>  	}
> @@ -392,7 +422,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>  	return 0;
>  }
>  
> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)
> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
> +			  u32 virt_irq, u32 phys_irq)
>  {
>  	struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, virt_irq);
>  
> @@ -402,6 +433,7 @@ int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)
>  
>  	irq->hw = true;
>  	irq->hwintid = phys_irq;
> +	irq->host_irq = host_irq;
>  
>  	spin_unlock(&irq->irq_lock);
>  	vgic_put_irq(vcpu->kvm, irq);
> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
> index da83e4c..dc4972b 100644
> --- a/virt/kvm/arm/vgic/vgic.h
> +++ b/virt/kvm/arm/vgic/vgic.h
> @@ -17,6 +17,7 @@
>  #define __KVM_ARM_VGIC_NEW_H__
>  
>  #include <linux/irqchip/arm-gic-common.h>
> +#include <linux/interrupt.h>
>  
>  #define PRODUCT_ID_KVM		0x4b	/* ASCII code K */
>  #define IMPLEMENTER_ARM		0x43b
> @@ -96,14 +97,20 @@
>  /* we only support 64 kB translation table page size */
>  #define KVM_ITS_L1E_ADDR_MASK		GENMASK_ULL(51, 16)
>  
> +bool irq_line_level(struct vgic_irq *irq);
> +bool irq_is_active(struct vgic_irq *irq);
> +
>  static inline bool irq_is_pending(struct vgic_irq *irq)
>  {
>  	if (irq->config == VGIC_CONFIG_EDGE)
>  		return irq->pending_latch;
>  	else
> -		return irq->pending_latch || irq->line_level;
> +		return irq->pending_latch || irq_line_level(irq);
>  }
>  
> +#define is_unshared_mapped(i) \
> +((i)->hw && (i)->intid >= VGIC_NR_PRIVATE_IRQS && (i)->intid < 1020)
> +
>  /*
>   * This struct provides an intermediate representation of the fields contained
>   * in the GICH_VMCR and ICH_VMCR registers, such that code exporting the GIC
> -- 
> 2.5.5
> 

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 08/10] KVM: arm/arm64: vgic: Handle unshared mapped interrupts
@ 2017-06-02 13:33     ` Christoffer Dall
  0 siblings, 0 replies; 69+ messages in thread
From: Christoffer Dall @ 2017-06-02 13:33 UTC (permalink / raw)
  To: Eric Auger
  Cc: kvm, marc.zyngier, linux-kernel, alex.williamson, pbonzini,
	kvmarm, eric.auger.pro

On Wed, May 24, 2017 at 10:13:21PM +0200, Eric Auger wrote:
> Virtual interrupts directly mapped to physical interrupts require
> some special care. Their pending and active state must be observed
> at distributor level and not in the list register.

This is not entirely true.  There's a dependency, but there is also
separate virtual vs. physical state, see below.

> 
> Also a level sensitive interrupt's level is not toggled down by any
> maintenance IRQ handler as the EOI is not trapped.
> 
> This patch adds an host_irq field in vgic_irq struct to easily
> get the irqchip state of the host irq. We also handle the
> physical IRQ case in vgic_validate_injection and add helpers to
> get the line level and active state.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  include/kvm/arm_vgic.h    |  4 +++-
>  virt/kvm/arm/arch_timer.c |  3 ++-
>  virt/kvm/arm/vgic/vgic.c  | 44 ++++++++++++++++++++++++++++++++++++++------
>  virt/kvm/arm/vgic/vgic.h  |  9 ++++++++-
>  4 files changed, 51 insertions(+), 9 deletions(-)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index ef71858..695ebc7 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -112,6 +112,7 @@ struct vgic_irq {
>  	bool hw;			/* Tied to HW IRQ */
>  	struct kref refcount;		/* Used for LPIs */
>  	u32 hwintid;			/* HW INTID number */
> +	unsigned int host_irq;		/* linux irq corresponding to hwintid */
>  	union {
>  		u8 targets;			/* GICv2 target VCPUs mask */
>  		u32 mpidr;			/* GICv3 target VCPU */
> @@ -301,7 +302,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>  			bool level);
>  int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>  			       bool level);
> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq);
> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
> +			  u32 virt_irq, u32 phys_irq);
>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int virt_irq);
>  bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq);
>  
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index 5976609..45f4779 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -651,7 +651,8 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
>  	 * Tell the VGIC that the virtual interrupt is tied to a
>  	 * physical interrupt. We do that once per VCPU.
>  	 */
> -	ret = kvm_vgic_map_phys_irq(vcpu, vtimer->irq.irq, phys_irq);
> +	ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq,
> +				    vtimer->irq.irq, phys_irq);
>  	if (ret)
>  		return ret;
>  
> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> index 83b24d2..aa0618c 100644
> --- a/virt/kvm/arm/vgic/vgic.c
> +++ b/virt/kvm/arm/vgic/vgic.c
> @@ -137,6 +137,28 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
>  	kfree(irq);
>  }
>  
> +bool irq_line_level(struct vgic_irq *irq)
> +{
> +	bool line_level = irq->line_level;
> +
> +	if (unlikely(is_unshared_mapped(irq)))
> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
> +					      IRQCHIP_STATE_PENDING,
> +					      &line_level));
> +	return line_level;
> +}

This really looks fishy.  When do we need this exactly?

I feel like we should treat this more like everything else and set the
line_level on the irq even for forwarded interrupts, and then you don't
need changes to validate injection.

The challenge, then, is how to re-sample the line and lower the
line_level field when necessary.  Can't we simply do this in
vgic_fold_lr_state(), and if you have a forwarded interrupt which is
level triggered and the level is high, then notify the one who injected
this and tell it to adjust its line level (lower it if it changed).

That would follow our existing path very closely.

Am I missing something?
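
The re-sampling idea above can be sketched in isolation. This is only a
user-space model under stated assumptions, not kernel code: struct
model_irq is a hypothetical cut-down stand-in for struct vgic_irq, and the
resample hook (and model_read_line) is an invented way for the fold path
to ask the injector for the current line state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct vgic_irq (not the kernel type). */
struct model_irq {
	bool hw;            /* forwarded: tied to a physical interrupt */
	bool level_cfg;     /* true = level-triggered, false = edge */
	bool line_level;    /* software copy of the input line */
	bool pending_latch;
	/* Hypothetical hook: asks the injector for the current line state. */
	bool (*resample)(void *opaque);
	void *opaque;
};

/* Same rule as the unpatched irq_is_pending(): no irqchip query needed. */
static bool model_irq_is_pending(const struct model_irq *irq)
{
	if (!irq->level_cfg)
		return irq->pending_latch;
	return irq->pending_latch || irq->line_level;
}

/*
 * Meant to run where the list registers are folded back after a guest
 * exit: only a forwarded, level-triggered interrupt whose cached level is
 * high gets re-sampled, and the level is lowered if the injector says so.
 */
static void model_fold_resample(struct model_irq *irq)
{
	assert(irq != NULL);
	if (!irq->hw || !irq->level_cfg || !irq->line_level)
		return;
	if (irq->resample)
		irq->line_level = irq->resample(irq->opaque);
}

/* Hypothetical injector-side helper: the line state is a caller-owned bool. */
static bool model_read_line(void *opaque)
{
	return *(const bool *)opaque;
}
```

With this shape the injection validation can keep comparing the cached
line_level against the new level, which is the point of the suggestion.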

> +
> +bool irq_is_active(struct vgic_irq *irq)
> +{
> +	bool is_active = irq->active;
> +
> +	if (unlikely(is_unshared_mapped(irq)))
> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
> +					      IRQCHIP_STATE_ACTIVE,
> +					      &is_active));
> +	return is_active;
> +}

Why do we need this?

The active state of a virtual IRQ is independent from the underlying
physical state, as I see it.

For example, when the interrupt is initially injected to the VGIC, it
will be ACTIVE on the physical distributor, but PENDING on the VGIC.
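
A tiny model of that divergence, as a sketch only: the type and function
names below are hypothetical, and the pending+active combination is left
out to keep it short.

```c
#include <assert.h>
#include <stdbool.h>

enum model_state { MODEL_INACTIVE, MODEL_PENDING, MODEL_ACTIVE };

/* Hypothetical pair of states for one forwarded interrupt. */
struct fwd_state {
	enum model_state phys;  /* state at the physical distributor */
	enum model_state virt;  /* state as seen by the guest via the VGIC */
};

/* Host acknowledges the physical interrupt and injects the virtual one. */
static void fwd_host_inject(struct fwd_state *s)
{
	s->phys = MODEL_ACTIVE;   /* host only drops priority, no deactivation */
	s->virt = MODEL_PENDING;
}

/* Guest acknowledges the virtual interrupt. */
static void fwd_guest_ack(struct fwd_state *s)
{
	assert(s->virt == MODEL_PENDING);
	s->virt = MODEL_ACTIVE;
}

/* Guest deactivation also deactivates the physical interrupt. */
static void fwd_guest_deactivate(struct fwd_state *s)
{
	assert(s->virt == MODEL_ACTIVE);
	s->virt = MODEL_INACTIVE;
	s->phys = MODEL_INACTIVE;
}

/* True when the physical active bit would misreport the virtual state. */
static bool fwd_states_differ(const struct fwd_state *s)
{
	return (s->phys == MODEL_ACTIVE) != (s->virt == MODEL_ACTIVE);
}
```

Right after fwd_host_inject() the two sides disagree, which is the window
in which reading the physical active bit gives the wrong answer for the
virtual interrupt.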


Thanks,
-Christoffer

> +
>  /**
>   * kvm_vgic_target_oracle - compute the target vcpu for an irq
>   *
> @@ -153,7 +175,7 @@ static struct kvm_vcpu *vgic_target_oracle(struct vgic_irq *irq)
>  	DEBUG_SPINLOCK_BUG_ON(!spin_is_locked(&irq->irq_lock));
>  
>  	/* If the interrupt is active, it must stay on the current vcpu */
> -	if (irq->active)
> +	if (irq_is_active(irq))
>  		return irq->vcpu ? : irq->target_vcpu;
>  
>  	/*
> @@ -195,14 +217,18 @@ static int vgic_irq_cmp(void *priv, struct list_head *a, struct list_head *b)
>  {
>  	struct vgic_irq *irqa = container_of(a, struct vgic_irq, ap_list);
>  	struct vgic_irq *irqb = container_of(b, struct vgic_irq, ap_list);
> +	bool activea, activeb;
>  	bool penda, pendb;
>  	int ret;
>  
>  	spin_lock(&irqa->irq_lock);
>  	spin_lock_nested(&irqb->irq_lock, SINGLE_DEPTH_NESTING);
>  
> -	if (irqa->active || irqb->active) {
> -		ret = (int)irqb->active - (int)irqa->active;
> +	activea = irq_is_active(irqa);
> +	activeb = irq_is_active(irqb);
> +
> +	if (activea || activeb) {
> +		ret = (int)activeb - (int)activea;
>  		goto out;
>  	}
>  
> @@ -234,13 +260,17 @@ static void vgic_sort_ap_list(struct kvm_vcpu *vcpu)
>  
>  /*
>   * Only valid injection if changing level for level-triggered IRQs or for a
> - * rising edge.
> + * rising edge. Injection of virtual interrupts associated to physical
> + * interrupts always is valid.
>   */
>  static bool vgic_validate_injection(struct vgic_irq *irq, bool level)
>  {
>  	switch (irq->config) {
>  	case VGIC_CONFIG_LEVEL:
> -		return irq->line_level != level;
> +		if (unlikely(is_unshared_mapped(irq)))
> +			return true;
> +		else
> +			return irq->line_level != level;
>  	case VGIC_CONFIG_EDGE:
>  		return level;
>  	}
> @@ -392,7 +422,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>  	return 0;
>  }
>  
> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)
> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
> +			  u32 virt_irq, u32 phys_irq)
>  {
>  	struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, virt_irq);
>  
> @@ -402,6 +433,7 @@ int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)
>  
>  	irq->hw = true;
>  	irq->hwintid = phys_irq;
> +	irq->host_irq = host_irq;
>  
>  	spin_unlock(&irq->irq_lock);
>  	vgic_put_irq(vcpu->kvm, irq);
> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
> index da83e4c..dc4972b 100644
> --- a/virt/kvm/arm/vgic/vgic.h
> +++ b/virt/kvm/arm/vgic/vgic.h
> @@ -17,6 +17,7 @@
>  #define __KVM_ARM_VGIC_NEW_H__
>  
>  #include <linux/irqchip/arm-gic-common.h>
> +#include <linux/interrupt.h>
>  
>  #define PRODUCT_ID_KVM		0x4b	/* ASCII code K */
>  #define IMPLEMENTER_ARM		0x43b
> @@ -96,14 +97,20 @@
>  /* we only support 64 kB translation table page size */
>  #define KVM_ITS_L1E_ADDR_MASK		GENMASK_ULL(51, 16)
>  
> +bool irq_line_level(struct vgic_irq *irq);
> +bool irq_is_active(struct vgic_irq *irq);
> +
>  static inline bool irq_is_pending(struct vgic_irq *irq)
>  {
>  	if (irq->config == VGIC_CONFIG_EDGE)
>  		return irq->pending_latch;
>  	else
> -		return irq->pending_latch || irq->line_level;
> +		return irq->pending_latch || irq_line_level(irq);
>  }
>  
> +#define is_unshared_mapped(i) \
> +((i)->hw && (i)->intid >= VGIC_NR_PRIVATE_IRQS && (i)->intid < 1020)
> +
>  /*
>   * This struct provides an intermediate representation of the fields contained
>   * in the GICH_VMCR and ICH_VMCR registers, such that code exporting the GIC
> -- 
> 2.5.5
> 

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 08/10] KVM: arm/arm64: vgic: Handle unshared mapped interrupts
  2017-06-02 13:33     ` Christoffer Dall
@ 2017-06-02 14:10       ` Marc Zyngier
  -1 siblings, 0 replies; 69+ messages in thread
From: Marc Zyngier @ 2017-06-02 14:10 UTC (permalink / raw)
  To: Christoffer Dall, Eric Auger
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, alex.williamson,
	pbonzini, christoffer.dall, drjones, wei

On 02/06/17 14:33, Christoffer Dall wrote:
> On Wed, May 24, 2017 at 10:13:21PM +0200, Eric Auger wrote:
>> Virtual interrupts directly mapped to physical interrupts require
>> some special care. Their pending and active state must be observed
>> at distributor level and not in the list register.
> 
> This is not entirely true.  There's a dependency, but there is also
> separate virtual vs. physical state, see below.

I think this stems from the usual confusion about the "pending and active
state" vs "pending and active states". Yes, the GIC spec is rubbish. Can
I state this again?

>>
>> Also a level sensitive interrupt's level is not toggled down by any
>> maintenance IRQ handler as the EOI is not trapped.
>>
>> This patch adds an host_irq field in vgic_irq struct to easily
>> get the irqchip state of the host irq. We also handle the
>> physical IRQ case in vgic_validate_injection and add helpers to
>> get the line level and active state.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>>  include/kvm/arm_vgic.h    |  4 +++-
>>  virt/kvm/arm/arch_timer.c |  3 ++-
>>  virt/kvm/arm/vgic/vgic.c  | 44 ++++++++++++++++++++++++++++++++++++++------
>>  virt/kvm/arm/vgic/vgic.h  |  9 ++++++++-
>>  4 files changed, 51 insertions(+), 9 deletions(-)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index ef71858..695ebc7 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -112,6 +112,7 @@ struct vgic_irq {
>>  	bool hw;			/* Tied to HW IRQ */
>>  	struct kref refcount;		/* Used for LPIs */
>>  	u32 hwintid;			/* HW INTID number */
>> +	unsigned int host_irq;		/* linux irq corresponding to hwintid */
>>  	union {
>>  		u8 targets;			/* GICv2 target VCPUs mask */
>>  		u32 mpidr;			/* GICv3 target VCPU */
>> @@ -301,7 +302,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>>  			bool level);
>>  int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>>  			       bool level);
>> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq);
>> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
>> +			  u32 virt_irq, u32 phys_irq);
>>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int virt_irq);
>>  bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq);
>>  
>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>> index 5976609..45f4779 100644
>> --- a/virt/kvm/arm/arch_timer.c
>> +++ b/virt/kvm/arm/arch_timer.c
>> @@ -651,7 +651,8 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
>>  	 * Tell the VGIC that the virtual interrupt is tied to a
>>  	 * physical interrupt. We do that once per VCPU.
>>  	 */
>> -	ret = kvm_vgic_map_phys_irq(vcpu, vtimer->irq.irq, phys_irq);
>> +	ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq,
>> +				    vtimer->irq.irq, phys_irq);
>>  	if (ret)
>>  		return ret;
>>  
>> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
>> index 83b24d2..aa0618c 100644
>> --- a/virt/kvm/arm/vgic/vgic.c
>> +++ b/virt/kvm/arm/vgic/vgic.c
>> @@ -137,6 +137,28 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
>>  	kfree(irq);
>>  }
>>  
>> +bool irq_line_level(struct vgic_irq *irq)
>> +{
>> +	bool line_level = irq->line_level;
>> +
>> +	if (unlikely(is_unshared_mapped(irq)))
>> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
>> +					      IRQCHIP_STATE_PENDING,
>> +					      &line_level));
>> +	return line_level;
>> +}
> 
> This really looks fishy.  When do we need this exactly?
> 
> I feel like we should treat this more like everything else and set the
> line_level on the irq even for forwarded interrupts, and then you don't
> need changes to validate injection.
> 
> The challenge, then, is how to re-sample the line and lower the
> line_level field when necessary.  Can't we simply do this in
> vgic_fold_lr_state(), and if you have a forwarded interrupt which is
> level triggered and the level is high, then notify the one who injected
> this and tell it to adjust its line level (lower it if it changed).
> 
> That would follow our existing path very closely.
> 
> Am I missing something?

I don't think you are. I think Eric got confused because of the above.
But the flow is a bit of a brainfsck :-(

- Physical interrupt fires, activated, injected in the vgic
- Injecting the interrupt has a very different flow from what we
currently have, and follows the same pattern as an Edge interrupt
(because the Pending state is kept at the physical distributor, so we
cannot preserve it in the emulation).
- Normal life cycle of the interrupt
- The fact that the Pending bit is kept at the distributor level ensures
that if it becomes pending again in the emulation, that's because the
guest has deactivated the physical interrupt by doing an EOI.
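
The flow above can be modelled in a few lines. This is a sketch under
assumptions, not the vgic implementation: phys_model and virt_model are
hypothetical structures, and the physical distributor is reduced to "the
host sees the interrupt only while the line is asserted and the interrupt
is not active".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the physical side of one level-sensitive SPI. */
struct phys_model {
	bool line;     /* device output; the pending state lives here */
	bool active;   /* set on host ack, cleared by the guest's deactivation */
};

/* Hypothetical model of the virtual side; note there is no line_level. */
struct virt_model {
	bool pending_latch;
	bool active;
	unsigned int injections;
};

/* The host only sees the interrupt while it is asserted and not active. */
static bool flow_host_take(struct phys_model *p, struct virt_model *v)
{
	if (!p->line || p->active)
		return false;
	p->active = true;          /* ack + priority drop, no deactivation */
	v->pending_latch = true;   /* edge-like injection */
	v->injections++;
	return true;
}

/* Guest acknowledges: the latch is consumed, as for an edge interrupt. */
static void flow_guest_ack(struct virt_model *v)
{
	assert(v->pending_latch);
	v->pending_latch = false;
	v->active = true;
}

/* Guest EOI deactivates both the virtual and the physical interrupt. */
static void flow_guest_eoi(struct phys_model *p, struct virt_model *v)
{
	assert(v->active);
	v->active = false;
	p->active = false;
}
```

In this model a second injection can only happen after flow_guest_eoi(),
which mirrors the last point of the list.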

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 08/10] KVM: arm/arm64: vgic: Handle unshared mapped interrupts
  2017-06-02 14:10       ` Marc Zyngier
@ 2017-06-02 16:29         ` Christoffer Dall
  -1 siblings, 0 replies; 69+ messages in thread
From: Christoffer Dall @ 2017-06-02 16:29 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Eric Auger, eric.auger.pro, linux-kernel, kvm, kvmarm,
	alex.williamson, pbonzini, christoffer.dall, drjones, wei

On Fri, Jun 02, 2017 at 03:10:23PM +0100, Marc Zyngier wrote:
> On 02/06/17 14:33, Christoffer Dall wrote:
> > On Wed, May 24, 2017 at 10:13:21PM +0200, Eric Auger wrote:
> >> Virtual interrupts directly mapped to physical interrupts require
> >> some special care. Their pending and active state must be observed
> >> at distributor level and not in the list register.
> > 
> > This is not entirely true.  There's a dependency, but there is also
> > separate virtual vs. physical state, see below.
> 
> I think this stems from the usual confusion about the "pending and active
> state" vs "pending and active states". Yes, the GIC spec is rubbish. Can
> I state this again?
> 
> >>
> >> Also a level sensitive interrupt's level is not toggled down by any
> >> maintenance IRQ handler as the EOI is not trapped.
> >>
> >> This patch adds an host_irq field in vgic_irq struct to easily
> >> get the irqchip state of the host irq. We also handle the
> >> physical IRQ case in vgic_validate_injection and add helpers to
> >> get the line level and active state.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >> ---
> >>  include/kvm/arm_vgic.h    |  4 +++-
> >>  virt/kvm/arm/arch_timer.c |  3 ++-
> >>  virt/kvm/arm/vgic/vgic.c  | 44 ++++++++++++++++++++++++++++++++++++++------
> >>  virt/kvm/arm/vgic/vgic.h  |  9 ++++++++-
> >>  4 files changed, 51 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >> index ef71858..695ebc7 100644
> >> --- a/include/kvm/arm_vgic.h
> >> +++ b/include/kvm/arm_vgic.h
> >> @@ -112,6 +112,7 @@ struct vgic_irq {
> >>  	bool hw;			/* Tied to HW IRQ */
> >>  	struct kref refcount;		/* Used for LPIs */
> >>  	u32 hwintid;			/* HW INTID number */
> >> +	unsigned int host_irq;		/* linux irq corresponding to hwintid */
> >>  	union {
> >>  		u8 targets;			/* GICv2 target VCPUs mask */
> >>  		u32 mpidr;			/* GICv3 target VCPU */
> >> @@ -301,7 +302,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
> >>  			bool level);
> >>  int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int intid,
> >>  			       bool level);
> >> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq);
> >> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
> >> +			  u32 virt_irq, u32 phys_irq);
> >>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int virt_irq);
> >>  bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq);
> >>  
> >> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> >> index 5976609..45f4779 100644
> >> --- a/virt/kvm/arm/arch_timer.c
> >> +++ b/virt/kvm/arm/arch_timer.c
> >> @@ -651,7 +651,8 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
> >>  	 * Tell the VGIC that the virtual interrupt is tied to a
> >>  	 * physical interrupt. We do that once per VCPU.
> >>  	 */
> >> -	ret = kvm_vgic_map_phys_irq(vcpu, vtimer->irq.irq, phys_irq);
> >> +	ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq,
> >> +				    vtimer->irq.irq, phys_irq);
> >>  	if (ret)
> >>  		return ret;
> >>  
> >> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> >> index 83b24d2..aa0618c 100644
> >> --- a/virt/kvm/arm/vgic/vgic.c
> >> +++ b/virt/kvm/arm/vgic/vgic.c
> >> @@ -137,6 +137,28 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
> >>  	kfree(irq);
> >>  }
> >>  
> >> +bool irq_line_level(struct vgic_irq *irq)
> >> +{
> >> +	bool line_level = irq->line_level;
> >> +
> >> +	if (unlikely(is_unshared_mapped(irq)))
> >> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
> >> +					      IRQCHIP_STATE_PENDING,
> >> +					      &line_level));
> >> +	return line_level;
> >> +}
> > 
> > This really looks fishy.  When do we need this exactly?
> > 
> > I feel like we should treat this more like everything else and set the
> > line_level on the irq even for forwarded interrupts, and then you don't
> > need changes to validate injection.
> > 
> > The challenge, then, is how to re-sample the line and lower the
> > line_level field when necessary.  Can't we simply do this in
> > vgic_fold_lr_state(), and if you have a forwarded interrupt which is
> > level triggered and the level is high, then notify the one who injected
> > this and tell it to adjust its line level (lower it if it changed).
> > 
> > That would follow our existing path very closely.
> > 
> > Am I missing something?
> 
> I don't think you are. I think Eric got confused because of the above.
> But the flow is a bit of a brainfsck :-(
> 
> - Physical interrupt fires, activated, injected in the vgic
> - Injecting the interrupt has a very different flow from what we
> currently have, and follows the same pattern as an Edge interrupt
> (because the Pending state is kept at the physical distributor, so we
> cannot preserve it in the emulation).
> - Normal life cycle of the interrupt
> - The fact that the Pending bit is kept at the distributor level ensures
> that if it becomes pending again in the emulation, that's because the
> guest has deactivated the physical interrupt by doing an EOI.
> 

I think there's a choice between how we choose to support this.  We can
either do the edge-like injection, or we can model the line_level to the
best of our ability (we just have to lower the line after the guest
exits after deactivation if it's not still pending at the physical
distributor).

One question with doing this edge-like, can you have this scenario:
 1. VM runs with active virtual interrupt linked to physical
    interrupt.
 2. VM deactivates virtual+physical interrupt
 3. Physical interrupt fires again on the host
 4. The host injects the virtual interrupt as pending to the VGIC (and
    IPIs the VCPU etc.)
 5. The device lowers the physical line (another VCPU programs the
    device, there's some delay, or whatever)
 6. The VCPU now sees a pending interrupt, which is no longer pending.

Not sure if the line-like approach really solves this, though, or if
getting a spurious interrupt is something we care about.

Perhaps we need to try to implement both and see what it looks like?
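
To make the scenario concrete, here is a minimal user-space model of the
steps, assuming the edge-like injection. All names are hypothetical and
the model only tracks four booleans. It says nothing about whether the
line-like approach would behave differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of one forwarded, level-sensitive interrupt. */
struct race_model {
	bool phys_line;     /* device output */
	bool phys_active;   /* physical interrupt active at the distributor */
	bool virt_pending;  /* edge-like pending latch in the emulation */
	bool virt_active;   /* virtual interrupt active in the guest */
};

/* Step 2: guest deactivates the virtual and the physical interrupt. */
static void race_guest_deactivate(struct race_model *m)
{
	assert(m->virt_active);
	m->virt_active = false;
	m->phys_active = false;
}

/* Steps 3-4: the interrupt fires on the host, which injects it as pending. */
static void race_host_inject(struct race_model *m)
{
	if (m->phys_line && !m->phys_active) {
		m->phys_active = true;
		m->virt_pending = true;
	}
}

/* Step 5: the device lowers the line after the injection. */
static void race_device_lowers_line(struct race_model *m)
{
	m->phys_line = false;
}

/* Step 6: pending in the emulation while the line is already low. */
static bool race_is_spurious(const struct race_model *m)
{
	return m->virt_pending && !m->phys_line;
}
```

Starting from step 1 (line high, both sides active) and running the
helpers in order leaves virt_pending set with the line low, i.e. the
spurious interrupt described in step 6.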

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 08/10] KVM: arm/arm64: vgic: Handle unshared mapped interrupts
@ 2017-06-02 16:29         ` Christoffer Dall
  0 siblings, 0 replies; 69+ messages in thread
From: Christoffer Dall @ 2017-06-02 16:29 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, linux-kernel, alex.williamson, pbonzini, kvmarm, eric.auger.pro

On Fri, Jun 02, 2017 at 03:10:23PM +0100, Marc Zyngier wrote:
> On 02/06/17 14:33, Christoffer Dall wrote:
> > On Wed, May 24, 2017 at 10:13:21PM +0200, Eric Auger wrote:
> >> Virtual interrupts directly mapped to physical interrupts require
> >> some special care. Their pending and active state must be observed
> >> at distributor level and not in the list register.
> > 
> > This is not entirely true.  There's a dependency, but there is also
> > separate virtual vs. physical state, see below.
> 
> I think this stems for the usual confusion about the "pending and active
> state" vs "pending and active states". Yes, the GIC spec is rubbish. Can
> I state this again?
> 
> >>
> >> Also a level sensitive interrupt's level is not toggled down by any
> >> maintenance IRQ handler as the EOI is not trapped.
> >>
> >> This patch adds an host_irq field in vgic_irq struct to easily
> >> get the irqchip state of the host irq. We also handle the
> >> physical IRQ case in vgic_validate_injection and add helpers to
> >> get the line level and active state.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >> ---
> >>  include/kvm/arm_vgic.h    |  4 +++-
> >>  virt/kvm/arm/arch_timer.c |  3 ++-
> >>  virt/kvm/arm/vgic/vgic.c  | 44 ++++++++++++++++++++++++++++++++++++++------
> >>  virt/kvm/arm/vgic/vgic.h  |  9 ++++++++-
> >>  4 files changed, 51 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >> index ef71858..695ebc7 100644
> >> --- a/include/kvm/arm_vgic.h
> >> +++ b/include/kvm/arm_vgic.h
> >> @@ -112,6 +112,7 @@ struct vgic_irq {
> >>  	bool hw;			/* Tied to HW IRQ */
> >>  	struct kref refcount;		/* Used for LPIs */
> >>  	u32 hwintid;			/* HW INTID number */
> >> +	unsigned int host_irq;		/* linux irq corresponding to hwintid */
> >>  	union {
> >>  		u8 targets;			/* GICv2 target VCPUs mask */
> >>  		u32 mpidr;			/* GICv3 target VCPU */
> >> @@ -301,7 +302,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
> >>  			bool level);
> >>  int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int intid,
> >>  			       bool level);
> >> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq);
> >> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
> >> +			  u32 virt_irq, u32 phys_irq);
> >>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int virt_irq);
> >>  bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq);
> >>  
> >> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> >> index 5976609..45f4779 100644
> >> --- a/virt/kvm/arm/arch_timer.c
> >> +++ b/virt/kvm/arm/arch_timer.c
> >> @@ -651,7 +651,8 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
> >>  	 * Tell the VGIC that the virtual interrupt is tied to a
> >>  	 * physical interrupt. We do that once per VCPU.
> >>  	 */
> >> -	ret = kvm_vgic_map_phys_irq(vcpu, vtimer->irq.irq, phys_irq);
> >> +	ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq,
> >> +				    vtimer->irq.irq, phys_irq);
> >>  	if (ret)
> >>  		return ret;
> >>  
> >> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> >> index 83b24d2..aa0618c 100644
> >> --- a/virt/kvm/arm/vgic/vgic.c
> >> +++ b/virt/kvm/arm/vgic/vgic.c
> >> @@ -137,6 +137,28 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
> >>  	kfree(irq);
> >>  }
> >>  
> >> +bool irq_line_level(struct vgic_irq *irq)
> >> +{
> >> +	bool line_level = irq->line_level;
> >> +
> >> +	if (unlikely(is_unshared_mapped(irq)))
> >> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
> >> +					      IRQCHIP_STATE_PENDING,
> >> +					      &line_level));
> >> +	return line_level;
> >> +}
> > 
> > This really looks fishy.  When do we need this exactly?
> > 
> > I feel like we should treat this more like everything else and set the
> > line_level on the irq even for forwarded interrupts, and then you don't
> > need changes to validate injection.
> > 
> > The challenge, then, is how to re-sample the line and lower the
> > line_level field when necessary.  Can't we simply do this in
> > vgic_fold_lr_state(), and if you have a forwarded interrupt which is
> > level triggered and the level is high, then notify the one who injected
> > this and tell it to adjust its line level (lower it if it changed).
> > 
> > That would follow our existing path very closely.
> > 
> > Am I missing something?
> 
> I don't think you are. I think Eric got confused because of the above.
> But the flow is a bit a brainfsck :-(
> 
> - Physical interrupt fires, activated, injected in the vgic
> - Injecting the interrupt has a very different flow from what we
> currently have, and follow the same pattern as an Edge interrupt
> (because the Pending state is kept at the physical distributor, so we
> cannot preserve it in the emulation).
> - Normal life cycle of the interrupt
> - The fact that the Pending bit is kept at the distributor level ensures
> that if it becomes pending again in the emulation, that's because the
> guest has deactivated the physical interrupt by doing an EOI.
> 

I think there's a choice between how we choose to support this.  We can
either do the edge-like injection, or we can model the line_level to the
best of our ability (we just have to lower the line after the guest
exits after deactivation if it's not still pending at the physical
distributor).

One question with doing this edge-like, can you ahve this scenario:
 1. VM runs with active virtual interrupt linked to physical
    interrupt.
 2. VM deactivates virtual+physical interrupt
 3. Physical interrupt fires again on the host
 4. The host injects the virtual interrupt as pending to the VGIC (and
    IPIs the VCPU etc.)
 5. The device lowers the physical line (another VPCU programs the
    device, there's some delay, or whatever)
 6. The VCPU now sees a pending interrupt, which is no longer pending.

Not sure if the line-like approach really solves this, though, or if
getting a spurius interrupt is something we care about.

Perhaps we need to try to implement both and see how it looks like?

Thanks,
-Christoffer

* Re: [PATCH 08/10] KVM: arm/arm64: vgic: Handle unshared mapped interrupts
  2017-06-02 16:29         ` Christoffer Dall
  (?)
@ 2017-06-08  8:23         ` Marc Zyngier
  2017-06-08  8:34           ` Christoffer Dall
  -1 siblings, 1 reply; 69+ messages in thread
From: Marc Zyngier @ 2017-06-08  8:23 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Eric Auger, eric.auger.pro, linux-kernel, kvm, kvmarm,
	alex.williamson, pbonzini, christoffer.dall, drjones, wei

On Fri, Jun 02 2017 at  6:29:44 pm BST, Christoffer Dall <cdall@linaro.org> wrote:
> On Fri, Jun 02, 2017 at 03:10:23PM +0100, Marc Zyngier wrote:
>> On 02/06/17 14:33, Christoffer Dall wrote:
>> > On Wed, May 24, 2017 at 10:13:21PM +0200, Eric Auger wrote:
>> >> Virtual interrupts directly mapped to physical interrupts require
>> >> some special care. Their pending and active state must be observed
>> >> at distributor level and not in the list register.
>> > 
>> > This is not entirely true.  There's a dependency, but there is also
>> > separate virtual vs. physical state, see below.
>> 
>> I think this stems from the usual confusion about the "pending and active
>> state" vs "pending and active states". Yes, the GIC spec is rubbish. Can
>> I state this again?
>> 
>> >>
>> >> Also a level sensitive interrupt's level is not toggled down by any
>> >> maintenance IRQ handler as the EOI is not trapped.
>> >>
>> >> This patch adds an host_irq field in vgic_irq struct to easily
>> >> get the irqchip state of the host irq. We also handle the
>> >> physical IRQ case in vgic_validate_injection and add helpers to
>> >> get the line level and active state.
>> >>
>> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> >> ---
>> >>  include/kvm/arm_vgic.h    |  4 +++-
>> >>  virt/kvm/arm/arch_timer.c |  3 ++-
>> >>  virt/kvm/arm/vgic/vgic.c  | 44 ++++++++++++++++++++++++++++++++++++++------
>> >>  virt/kvm/arm/vgic/vgic.h  |  9 ++++++++-
>> >>  4 files changed, 51 insertions(+), 9 deletions(-)
>> >>
>> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> >> index ef71858..695ebc7 100644
>> >> --- a/include/kvm/arm_vgic.h
>> >> +++ b/include/kvm/arm_vgic.h
>> >> @@ -112,6 +112,7 @@ struct vgic_irq {
>> >>  	bool hw;			/* Tied to HW IRQ */
>> >>  	struct kref refcount;		/* Used for LPIs */
>> >>  	u32 hwintid;			/* HW INTID number */
>> >> +	unsigned int host_irq;		/* linux irq corresponding to hwintid */
>> >>  	union {
>> >>  		u8 targets;			/* GICv2 target VCPUs mask */
>> >>  		u32 mpidr;			/* GICv3 target VCPU */
>> >> @@ -301,7 +302,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>> >>  			bool level);
>> >>  int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>> >>  			       bool level);
>> >> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq);
>> >> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
>> >> +			  u32 virt_irq, u32 phys_irq);
>> >>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int virt_irq);
>> >>  bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq);
>> >>  
>> >> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>> >> index 5976609..45f4779 100644
>> >> --- a/virt/kvm/arm/arch_timer.c
>> >> +++ b/virt/kvm/arm/arch_timer.c
>> >> @@ -651,7 +651,8 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
>> >>  	 * Tell the VGIC that the virtual interrupt is tied to a
>> >>  	 * physical interrupt. We do that once per VCPU.
>> >>  	 */
>> >> -	ret = kvm_vgic_map_phys_irq(vcpu, vtimer->irq.irq, phys_irq);
>> >> +	ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq,
>> >> +				    vtimer->irq.irq, phys_irq);
>> >>  	if (ret)
>> >>  		return ret;
>> >>  
>> >> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
>> >> index 83b24d2..aa0618c 100644
>> >> --- a/virt/kvm/arm/vgic/vgic.c
>> >> +++ b/virt/kvm/arm/vgic/vgic.c
>> >> @@ -137,6 +137,28 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
>> >>  	kfree(irq);
>> >>  }
>> >>  
>> >> +bool irq_line_level(struct vgic_irq *irq)
>> >> +{
>> >> +	bool line_level = irq->line_level;
>> >> +
>> >> +	if (unlikely(is_unshared_mapped(irq)))
>> >> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
>> >> +					      IRQCHIP_STATE_PENDING,
>> >> +					      &line_level));
>> >> +	return line_level;
>> >> +}
>> > 
>> > This really looks fishy.  When do we need this exactly?
>> > 
>> > I feel like we should treat this more like everything else and set the
>> > line_level on the irq even for forwarded interrupts, and then you don't
>> > need changes to validate injection.
>> > 
>> > The challenge, then, is how to re-sample the line and lower the
>> > line_level field when necessary.  Can't we simply do this in
>> > vgic_fold_lr_state(), and if you have a forwarded interrupt which is
>> > level triggered and the level is high, then notify the one who injected
>> > this and tell it to adjust its line level (lower it if it changed).
>> > 
>> > That would follow our existing path very closely.
>> > 
>> > Am I missing something?
>> 
>> I don't think you are. I think Eric got confused because of the above.
>> But the flow is a bit a brainfsck :-(
>> 
>> - Physical interrupt fires, activated, injected in the vgic
>> - Injecting the interrupt has a very different flow from what we
>> currently have, and follows the same pattern as an Edge interrupt
>> (because the Pending state is kept at the physical distributor, so we
>> cannot preserve it in the emulation).
>> - Normal life cycle of the interrupt
>> - The fact that the Pending bit is kept at the distributor level ensures
>> that if it becomes pending again in the emulation, that's because the
>> guest has deactivated the physical interrupt by doing an EOI.
>> 
>
> I think there's a choice between how we choose to support this.  We can
> either do the edge-like injection, or we can model the line_level to the
> best of our ability (we just have to lower the line after the guest
> exits after deactivation if it's not still pending at the physical
> distributor).
>
> One question with doing this edge-like, can you have this scenario:
>  1. VM runs with active virtual interrupt linked to physical
>     interrupt.
>  2. VM deactivates virtual+physical interrupt
>  3. Physical interrupt fires again on the host
>  4. The host injects the virtual interrupt as pending to the VGIC (and
>     IPIs the VCPU etc.)
>  5. The device lowers the physical line (another VCPU programs the
>     device, there's some delay, or whatever)
>  6. The VCPU now sees a pending interrupt, which is no longer pending.
>
> Not sure if the line-like approach really solves this, though, or if
> getting a spurious interrupt is something we care about.

That would be a spurious interrupt indeed, but I'm not sure that's
something the line level sampling you suggest would avoid either. There
is a fundamental disconnect between the injection and the physical line,
and it can only be modelled to some level of accuracy (/me curses the
architecture again).
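
To make the window concrete, here is a toy userspace model of the six
steps quoted above. It is only a sketch: struct model_irq, its fields and
all helper names are invented for illustration, nothing here is vgic
code, and "line-like" is assumed to mean re-reading the device line once,
just before guest entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one forwarded level-sensitive interrupt. */
struct model_irq {
	bool phys_line;		/* level currently driven by the device */
	bool phys_active;	/* active at the physical distributor */
	bool virt_pending;	/* pending state held in the emulation */
};

/* Steps 3-4: the physical IRQ fires, the host takes it and injects it. */
static void host_take_and_inject(struct model_irq *irq)
{
	assert(irq->phys_line && !irq->phys_active);
	irq->phys_active = true;
	irq->virt_pending = true;
}

/* Step 5: the device drops its line. */
static void device_lower_line(struct model_irq *irq)
{
	irq->phys_line = false;
}

/* Assumed line-like behaviour: follow the line at guest entry. */
static void resample_at_entry(struct model_irq *irq)
{
	irq->virt_pending = irq->phys_line;
}

/*
 * Step 6: returns true when the VCPU is presented with a pending
 * interrupt whose line is already low, i.e. a spurious interrupt.
 */
static bool scenario_is_spurious(bool line_like, bool drop_before_entry)
{
	struct model_irq irq = { .phys_line = true };

	host_take_and_inject(&irq);
	if (drop_before_entry)
		device_lower_line(&irq);	/* line drops before the sample */
	if (line_like)
		resample_at_entry(&irq);
	if (!drop_before_entry)
		device_lower_line(&irq);	/* line drops after the sample */

	return irq.virt_pending && !irq.phys_line;
}
```

In this model the edge-like variant is spurious in both orderings, and
the line-like variant is still spurious when the line drops after the
sample: sampling narrows the window but does not close it.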

> Perhaps we need to try to implement both and see what it looks like?

There is definitely room for experimentation, but I feel Eric should focus
on one of them (whichever it is). Happy to help prototype the other one
though.

Thanks,

	M.
-- 
Jazz is not dead, it just smell funny.

* Re: [PATCH 08/10] KVM: arm/arm64: vgic: Handle unshared mapped interrupts
  2017-06-08  8:23         ` Marc Zyngier
@ 2017-06-08  8:34           ` Christoffer Dall
  2017-06-08  8:55               ` Auger Eric
  0 siblings, 1 reply; 69+ messages in thread
From: Christoffer Dall @ 2017-06-08  8:34 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Eric Auger, eric.auger.pro, linux-kernel, kvm, kvmarm,
	alex.williamson, pbonzini, christoffer.dall, drjones, wei

On Thu, Jun 08, 2017 at 09:23:16AM +0100, Marc Zyngier wrote:
> On Fri, Jun 02 2017 at  6:29:44 pm BST, Christoffer Dall <cdall@linaro.org> wrote:
> > On Fri, Jun 02, 2017 at 03:10:23PM +0100, Marc Zyngier wrote:
> >> On 02/06/17 14:33, Christoffer Dall wrote:
> >> > On Wed, May 24, 2017 at 10:13:21PM +0200, Eric Auger wrote:
> >> >> Virtual interrupts directly mapped to physical interrupts require
> >> >> some special care. Their pending and active state must be observed
> >> >> at distributor level and not in the list register.
> >> > 
> >> > This is not entirely true.  There's a dependency, but there is also
> >> > separate virtual vs. physical state, see below.
> >> 
> >> I think this stems from the usual confusion about the "pending and active
> >> state" vs "pending and active states". Yes, the GIC spec is rubbish. Can
> >> I state this again?
> >> 
> >> >>
> >> >> Also a level sensitive interrupt's level is not toggled down by any
> >> >> maintenance IRQ handler as the EOI is not trapped.
> >> >>
> >> >> This patch adds an host_irq field in vgic_irq struct to easily
> >> >> get the irqchip state of the host irq. We also handle the
> >> >> physical IRQ case in vgic_validate_injection and add helpers to
> >> >> get the line level and active state.
> >> >>
> >> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >> >> ---
> >> >>  include/kvm/arm_vgic.h    |  4 +++-
> >> >>  virt/kvm/arm/arch_timer.c |  3 ++-
> >> >>  virt/kvm/arm/vgic/vgic.c  | 44 ++++++++++++++++++++++++++++++++++++++------
> >> >>  virt/kvm/arm/vgic/vgic.h  |  9 ++++++++-
> >> >>  4 files changed, 51 insertions(+), 9 deletions(-)
> >> >>
> >> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >> >> index ef71858..695ebc7 100644
> >> >> --- a/include/kvm/arm_vgic.h
> >> >> +++ b/include/kvm/arm_vgic.h
> >> >> @@ -112,6 +112,7 @@ struct vgic_irq {
> >> >>  	bool hw;			/* Tied to HW IRQ */
> >> >>  	struct kref refcount;		/* Used for LPIs */
> >> >>  	u32 hwintid;			/* HW INTID number */
> >> >> +	unsigned int host_irq;		/* linux irq corresponding to hwintid */
> >> >>  	union {
> >> >>  		u8 targets;			/* GICv2 target VCPUs mask */
> >> >>  		u32 mpidr;			/* GICv3 target VCPU */
> >> >> @@ -301,7 +302,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
> >> >>  			bool level);
> >> >>  int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int intid,
> >> >>  			       bool level);
> >> >> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq);
> >> >> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
> >> >> +			  u32 virt_irq, u32 phys_irq);
> >> >>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int virt_irq);
> >> >>  bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq);
> >> >>  
> >> >> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> >> >> index 5976609..45f4779 100644
> >> >> --- a/virt/kvm/arm/arch_timer.c
> >> >> +++ b/virt/kvm/arm/arch_timer.c
> >> >> @@ -651,7 +651,8 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
> >> >>  	 * Tell the VGIC that the virtual interrupt is tied to a
> >> >>  	 * physical interrupt. We do that once per VCPU.
> >> >>  	 */
> >> >> -	ret = kvm_vgic_map_phys_irq(vcpu, vtimer->irq.irq, phys_irq);
> >> >> +	ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq,
> >> >> +				    vtimer->irq.irq, phys_irq);
> >> >>  	if (ret)
> >> >>  		return ret;
> >> >>  
> >> >> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> >> >> index 83b24d2..aa0618c 100644
> >> >> --- a/virt/kvm/arm/vgic/vgic.c
> >> >> +++ b/virt/kvm/arm/vgic/vgic.c
> >> >> @@ -137,6 +137,28 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
> >> >>  	kfree(irq);
> >> >>  }
> >> >>  
> >> >> +bool irq_line_level(struct vgic_irq *irq)
> >> >> +{
> >> >> +	bool line_level = irq->line_level;
> >> >> +
> >> >> +	if (unlikely(is_unshared_mapped(irq)))
> >> >> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
> >> >> +					      IRQCHIP_STATE_PENDING,
> >> >> +					      &line_level));
> >> >> +	return line_level;
> >> >> +}
> >> > 
> >> > This really looks fishy.  When do we need this exactly?
> >> > 
> >> > I feel like we should treat this more like everything else and set the
> >> > line_level on the irq even for forwarded interrupts, and then you don't
> >> > need changes to validate injection.
> >> > 
> >> > The challenge, then, is how to re-sample the line and lower the
> >> > line_level field when necessary.  Can't we simply do this in
> >> > vgic_fold_lr_state(), and if you have a forwarded interrupt which is
> >> > level triggered and the level is high, then notify the one who injected
> >> > this and tell it to adjust its line level (lower it if it changed).
> >> > 
> >> > That would follow our existing path very closely.
> >> > 
> >> > Am I missing something?
> >> 
> >> I don't think you are. I think Eric got confused because of the above.
> >> But the flow is a bit a brainfsck :-(
> >> 
> >> - Physical interrupt fires, activated, injected in the vgic
> >> - Injecting the interrupt has a very different flow from what we
> >> currently have, and follows the same pattern as an Edge interrupt
> >> (because the Pending state is kept at the physical distributor, so we
> >> cannot preserve it in the emulation).
> >> - Normal life cycle of the interrupt
> >> - The fact that the Pending bit is kept at the distributor level ensures
> >> that if it becomes pending again in the emulation, that's because the
> >> guest has deactivated the physical interrupt by doing an EOI.
> >> 
> >
> > I think there's a choice between how we choose to support this.  We can
> > either do the edge-like injection, or we can model the line_level to the
> > best of our ability (we just have to lower the line after the guest
> > exits after deactivation if it's not still pending at the physical
> > distributor).
> >
> > One question with doing this edge-like, can you have this scenario:
> >  1. VM runs with active virtual interrupt linked to physical
> >     interrupt.
> >  2. VM deactivates virtual+physical interrupt
> >  3. Physical interrupt fires again on the host
> >  4. The host injects the virtual interrupt as pending to the VGIC (and
> >     IPIs the VCPU etc.)
> >  5. The device lowers the physical line (another VCPU programs the
> >     device, there's some delay, or whatever)
> >  6. The VCPU now sees a pending interrupt, which is no longer pending.
> >
> > Not sure if the line-like approach really solves this, though, or if
> > getting a spurious interrupt is something we care about.
> 
> That would be a spurious interrupt indeed, but I'm not sure that's
> something the line level sampling you suggest would avoid either. There
> is a fundamental disconnect between the injection and the physical line,
> and it can only be modelled to some level of accuracy (/me curses the
> architecture again).
> 
> > Perhaps we need to try to implement both and see what it looks like?
> 
> There is definitely room for experiment, but I feel Eric should focus on
> one of them (whichever it is). Happy to help prototyping the other one
> though.
> 
That's fair.  I'm just worried about the whole "emulate level triggered
interrupts as edge triggered" thing, but as you said, the architecture
doesn't allow us to model it more accurately.

Thanks,
-Christoffer

* Re: [PATCH 08/10] KVM: arm/arm64: vgic: Handle unshared mapped interrupts
  2017-06-02 13:33     ` Christoffer Dall
@ 2017-06-08  8:49       ` Auger Eric
  -1 siblings, 0 replies; 69+ messages in thread
From: Auger Eric @ 2017-06-08  8:49 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, alex.williamson,
	pbonzini, marc.zyngier, christoffer.dall, drjones, wei

Hi Christoffer,

On 02/06/2017 15:33, Christoffer Dall wrote:
> On Wed, May 24, 2017 at 10:13:21PM +0200, Eric Auger wrote:
>> Virtual interrupts directly mapped to physical interrupts require
>> some special care. Their pending and active state must be observed
>> at distributor level and not in the list register.
> 
> This is not entirely true.  There's a dependency, but there is also
> separate virtual vs. physical state, see below.
> 
>>
>> Also a level sensitive interrupt's level is not toggled down by any
>> maintenance IRQ handler as the EOI is not trapped.
>>
>> This patch adds an host_irq field in vgic_irq struct to easily
>> get the irqchip state of the host irq. We also handle the
>> physical IRQ case in vgic_validate_injection and add helpers to
>> get the line level and active state.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>>  include/kvm/arm_vgic.h    |  4 +++-
>>  virt/kvm/arm/arch_timer.c |  3 ++-
>>  virt/kvm/arm/vgic/vgic.c  | 44 ++++++++++++++++++++++++++++++++++++++------
>>  virt/kvm/arm/vgic/vgic.h  |  9 ++++++++-
>>  4 files changed, 51 insertions(+), 9 deletions(-)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index ef71858..695ebc7 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -112,6 +112,7 @@ struct vgic_irq {
>>  	bool hw;			/* Tied to HW IRQ */
>>  	struct kref refcount;		/* Used for LPIs */
>>  	u32 hwintid;			/* HW INTID number */
>> +	unsigned int host_irq;		/* linux irq corresponding to hwintid */
>>  	union {
>>  		u8 targets;			/* GICv2 target VCPUs mask */
>>  		u32 mpidr;			/* GICv3 target VCPU */
>> @@ -301,7 +302,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>>  			bool level);
>>  int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>>  			       bool level);
>> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq);
>> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
>> +			  u32 virt_irq, u32 phys_irq);
>>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int virt_irq);
>>  bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq);
>>  
>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>> index 5976609..45f4779 100644
>> --- a/virt/kvm/arm/arch_timer.c
>> +++ b/virt/kvm/arm/arch_timer.c
>> @@ -651,7 +651,8 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
>>  	 * Tell the VGIC that the virtual interrupt is tied to a
>>  	 * physical interrupt. We do that once per VCPU.
>>  	 */
>> -	ret = kvm_vgic_map_phys_irq(vcpu, vtimer->irq.irq, phys_irq);
>> +	ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq,
>> +				    vtimer->irq.irq, phys_irq);
>>  	if (ret)
>>  		return ret;
>>  
>> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
>> index 83b24d2..aa0618c 100644
>> --- a/virt/kvm/arm/vgic/vgic.c
>> +++ b/virt/kvm/arm/vgic/vgic.c
>> @@ -137,6 +137,28 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
>>  	kfree(irq);
>>  }
>>  
>> +bool irq_line_level(struct vgic_irq *irq)
>> +{
>> +	bool line_level = irq->line_level;
>> +
>> +	if (unlikely(is_unshared_mapped(irq)))
>> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
>> +					      IRQCHIP_STATE_PENDING,
>> +					      &line_level));
>> +	return line_level;
>> +}
> 
> This really looks fishy.  When do we need this exactly?
> 
> I feel like we should treat this more like everything else and set the
> line_level on the irq even for forwarded interrupts, and then you don't
> need changes to validate injection.
> 
> The challenge, then, is how to re-sample the line and lower the
> line_level field when necessary.  Can't we simply do this in
> vgic_fold_lr_state(), and if you have a forwarded interrupt which is
> level triggered and the level is high,
Didn't you mean the level is low? When the level is low, you would
trigger the resamplefd? Doesn't that bring needless overhead?

So yes, I was confused by the spec: I thought the pending and active
bit*s* were not reliable in the LRs, so I looked at the distributor
state instead.

That being said, as you mention, the line level is difficult to model
because the resamplefd is not called, so why not look directly at the
physical distributor state (pending bit?). The impact on the state
machine is limited to irq_line_level() and validate_injection(), which
always returns true for forwarded interrupts.
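
A minimal userspace sketch of that reduced impact, mirroring the helpers
in this patch. The names match the patch, but model_vgic_irq is a
trimmed-down, hypothetical copy of struct vgic_irq, and the phys_pending
field is an assumed stand-in for what irq_get_irqchip_state() would
report for IRQCHIP_STATE_PENDING; this is not the kernel code itself.

```c
#include <assert.h>
#include <stdbool.h>

#define VGIC_NR_PRIVATE_IRQS	32	/* 16 SGIs + 16 PPIs */

enum vgic_irq_config { VGIC_CONFIG_EDGE, VGIC_CONFIG_LEVEL };

/* Hypothetical, reduced stand-in for struct vgic_irq. */
struct model_vgic_irq {
	unsigned int intid;
	bool hw;			/* tied to a HW IRQ */
	enum vgic_irq_config config;
	bool line_level;		/* software-modelled level */
	bool pending_latch;
	bool phys_pending;		/* assumed physical distributor state */
};

/* Same predicate as the macro in the patch: a mapped SPI. */
static bool is_unshared_mapped(const struct model_vgic_irq *irq)
{
	return irq->hw && irq->intid >= VGIC_NR_PRIVATE_IRQS &&
	       irq->intid < 1020;
}

/* Forwarded interrupts read the physical state, others the field. */
static bool irq_line_level(const struct model_vgic_irq *irq)
{
	assert(irq);
	if (is_unshared_mapped(irq))
		return irq->phys_pending;
	return irq->line_level;
}

static bool irq_is_pending(const struct model_vgic_irq *irq)
{
	if (irq->config == VGIC_CONFIG_EDGE)
		return irq->pending_latch;
	return irq->pending_latch || irq_line_level(irq);
}

/* Injection is always accepted for forwarded level interrupts. */
static bool vgic_validate_injection(const struct model_vgic_irq *irq,
				    bool level)
{
	switch (irq->config) {
	case VGIC_CONFIG_LEVEL:
		if (is_unshared_mapped(irq))
			return true;
		return irq->line_level != level;
	case VGIC_CONFIG_EDGE:
		return level;
	}
	return false;
}
```

With a mapped SPI (say intid 40) the pending state follows phys_pending
and every injection is accepted, while a mapped PPI (say intid 27) keeps
the existing line_level based behaviour.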

> then notify the one who injected
> this and tell it to adjust its line level (lower it if it changed).
> 
> That would follow our existing path very closely.
> 
> Am I missing something?
> 
>> +
>> +bool irq_is_active(struct vgic_irq *irq)
>> +{
>> +	bool is_active = irq->active;
>> +
>> +	if (unlikely(is_unshared_mapped(irq)))
>> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
>> +					      IRQCHIP_STATE_ACTIVE,
>> +					      &is_active));
>> +	return is_active;
>> +}
> 
> Why do we need this?
> 
> The active state of a virtual IRQ is independent from the underlying
> physical state, as I see it.
> 
> For example, when the interrupt is initially injected to the VGIC, it
> will be ACTIVE on the physical distributor, but PENDING on the VGIC.
Since I can use the active bit of the LR, that code is not needed.

Thanks

Eric
> 
> 
> Thanks,
> -Christoffer
> 
>> +
>>  /**
>>   * kvm_vgic_target_oracle - compute the target vcpu for an irq
>>   *
>> @@ -153,7 +175,7 @@ static struct kvm_vcpu *vgic_target_oracle(struct vgic_irq *irq)
>>  	DEBUG_SPINLOCK_BUG_ON(!spin_is_locked(&irq->irq_lock));
>>  
>>  	/* If the interrupt is active, it must stay on the current vcpu */
>> -	if (irq->active)
>> +	if (irq_is_active(irq))
>>  		return irq->vcpu ? : irq->target_vcpu;
>>  
>>  	/*
>> @@ -195,14 +217,18 @@ static int vgic_irq_cmp(void *priv, struct list_head *a, struct list_head *b)
>>  {
>>  	struct vgic_irq *irqa = container_of(a, struct vgic_irq, ap_list);
>>  	struct vgic_irq *irqb = container_of(b, struct vgic_irq, ap_list);
>> +	bool activea, activeb;
>>  	bool penda, pendb;
>>  	int ret;
>>  
>>  	spin_lock(&irqa->irq_lock);
>>  	spin_lock_nested(&irqb->irq_lock, SINGLE_DEPTH_NESTING);
>>  
>> -	if (irqa->active || irqb->active) {
>> -		ret = (int)irqb->active - (int)irqa->active;
>> +	activea = irq_is_active(irqa);
>> +	activeb = irq_is_active(irqb);
>> +
>> +	if (activea || activeb) {
>> +		ret = (int)activeb - (int)activea;
>>  		goto out;
>>  	}
>>  
>> @@ -234,13 +260,17 @@ static void vgic_sort_ap_list(struct kvm_vcpu *vcpu)
>>  
>>  /*
>>   * Only valid injection if changing level for level-triggered IRQs or for a
>> - * rising edge.
>> + * rising edge. Injection of virtual interrupts associated to physical
>> + * interrupts always is valid.
>>   */
>>  static bool vgic_validate_injection(struct vgic_irq *irq, bool level)
>>  {
>>  	switch (irq->config) {
>>  	case VGIC_CONFIG_LEVEL:
>> -		return irq->line_level != level;
>> +		if (unlikely(is_unshared_mapped(irq)))
>> +			return true;
>> +		else
>> +			return irq->line_level != level;
>>  	case VGIC_CONFIG_EDGE:
>>  		return level;
>>  	}
>> @@ -392,7 +422,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>>  	return 0;
>>  }
>>  
>> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)
>> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
>> +			  u32 virt_irq, u32 phys_irq)
>>  {
>>  	struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, virt_irq);
>>  
>> @@ -402,6 +433,7 @@ int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)
>>  
>>  	irq->hw = true;
>>  	irq->hwintid = phys_irq;
>> +	irq->host_irq = host_irq;
>>  
>>  	spin_unlock(&irq->irq_lock);
>>  	vgic_put_irq(vcpu->kvm, irq);
>> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
>> index da83e4c..dc4972b 100644
>> --- a/virt/kvm/arm/vgic/vgic.h
>> +++ b/virt/kvm/arm/vgic/vgic.h
>> @@ -17,6 +17,7 @@
>>  #define __KVM_ARM_VGIC_NEW_H__
>>  
>>  #include <linux/irqchip/arm-gic-common.h>
>> +#include <linux/interrupt.h>
>>  
>>  #define PRODUCT_ID_KVM		0x4b	/* ASCII code K */
>>  #define IMPLEMENTER_ARM		0x43b
>> @@ -96,14 +97,20 @@
>>  /* we only support 64 kB translation table page size */
>>  #define KVM_ITS_L1E_ADDR_MASK		GENMASK_ULL(51, 16)
>>  
>> +bool irq_line_level(struct vgic_irq *irq);
>> +bool irq_is_active(struct vgic_irq *irq);
>> +
>>  static inline bool irq_is_pending(struct vgic_irq *irq)
>>  {
>>  	if (irq->config == VGIC_CONFIG_EDGE)
>>  		return irq->pending_latch;
>>  	else
>> -		return irq->pending_latch || irq->line_level;
>> +		return irq->pending_latch || irq_line_level(irq);
>>  }
>>  
>> +#define is_unshared_mapped(i) \
>> +((i)->hw && (i)->intid >= VGIC_NR_PRIVATE_IRQS && (i)->intid < 1020)
>> +
>>  /*
>>   * This struct provides an intermediate representation of the fields contained
>>   * in the GICH_VMCR and ICH_VMCR registers, such that code exporting the GIC
>> -- 
>> 2.5.5
>>

* Re: [PATCH 08/10] KVM: arm/arm64: vgic: Handle unshared mapped interrupts
@ 2017-06-08  8:49       ` Auger Eric
  0 siblings, 0 replies; 69+ messages in thread
From: Auger Eric @ 2017-06-08  8:49 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvm, marc.zyngier, linux-kernel, alex.williamson, pbonzini,
	kvmarm, eric.auger.pro

Hi Christoffer,

On 02/06/2017 15:33, Christoffer Dall wrote:
> On Wed, May 24, 2017 at 10:13:21PM +0200, Eric Auger wrote:
>> Virtual interrupts directly mapped to physical interrupts require
>> some special care. Their pending and active state must be observed
>> at distributor level and not in the list register.
> 
> This is not entirely true.  There's a dependency, but there is also
> separate virtual vs. physical state, see below.
> 
>>
>> Also a level sensitive interrupt's level is not toggled down by any
>> maintenance IRQ handler as the EOI is not trapped.
>>
>> This patch adds an host_irq field in vgic_irq struct to easily
>> get the irqchip state of the host irq. We also handle the
>> physical IRQ case in vgic_validate_injection and add helpers to
>> get the line level and active state.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>>  include/kvm/arm_vgic.h    |  4 +++-
>>  virt/kvm/arm/arch_timer.c |  3 ++-
>>  virt/kvm/arm/vgic/vgic.c  | 44 ++++++++++++++++++++++++++++++++++++++------
>>  virt/kvm/arm/vgic/vgic.h  |  9 ++++++++-
>>  4 files changed, 51 insertions(+), 9 deletions(-)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index ef71858..695ebc7 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -112,6 +112,7 @@ struct vgic_irq {
>>  	bool hw;			/* Tied to HW IRQ */
>>  	struct kref refcount;		/* Used for LPIs */
>>  	u32 hwintid;			/* HW INTID number */
>> +	unsigned int host_irq;		/* linux irq corresponding to hwintid */
>>  	union {
>>  		u8 targets;			/* GICv2 target VCPUs mask */
>>  		u32 mpidr;			/* GICv3 target VCPU */
>> @@ -301,7 +302,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>>  			bool level);
>>  int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>>  			       bool level);
>> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq);
>> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
>> +			  u32 virt_irq, u32 phys_irq);
>>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int virt_irq);
>>  bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq);
>>  
>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>> index 5976609..45f4779 100644
>> --- a/virt/kvm/arm/arch_timer.c
>> +++ b/virt/kvm/arm/arch_timer.c
>> @@ -651,7 +651,8 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
>>  	 * Tell the VGIC that the virtual interrupt is tied to a
>>  	 * physical interrupt. We do that once per VCPU.
>>  	 */
>> -	ret = kvm_vgic_map_phys_irq(vcpu, vtimer->irq.irq, phys_irq);
>> +	ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq,
>> +				    vtimer->irq.irq, phys_irq);
>>  	if (ret)
>>  		return ret;
>>  
>> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
>> index 83b24d2..aa0618c 100644
>> --- a/virt/kvm/arm/vgic/vgic.c
>> +++ b/virt/kvm/arm/vgic/vgic.c
>> @@ -137,6 +137,28 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
>>  	kfree(irq);
>>  }
>>  
>> +bool irq_line_level(struct vgic_irq *irq)
>> +{
>> +	bool line_level = irq->line_level;
>> +
>> +	if (unlikely(is_unshared_mapped(irq)))
>> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
>> +					      IRQCHIP_STATE_PENDING,
>> +					      &line_level));
>> +	return line_level;
>> +}
> 
> This really looks fishy.  When do we need this exactly?
> 
> I feel like we should treat this more like everything else and set the
> line_level on the irq even for forwarded interrupts, and then you don't
> need changes to validate injection.
> 
> The challenge, then, is how to re-sample the line and lower the
> line_level field when necessary.  Can't we simply do this in
> vgic_fold_lr_state(), and if you have a forwarded interrupt which is
> level triggered and the level is high,
Didn't you mean the level is low? When the level is low, you would
trigger the resamplefd? Doesn't that bring needless overhead?

So yes, I was confused by the spec: I thought the pending and active
bit*s* were not reliable in the LRs, so I looked at the distributor
state instead.

That being said, as you mention, the line level is difficult to model
because the resamplefd is not called, so why not look directly at the
physical distributor state (pending bit?). The impact on the state
machine is limited to irq_line_level() and validate_injection(), which
always returns true for forwarded interrupts.

> then notify the one who injected
> this and tell it to adjust its line level (lower it if it changed).
> 
> That would follow our existing path very closely.
> 
> Am I missing something?
> 
>> +
>> +bool irq_is_active(struct vgic_irq *irq)
>> +{
>> +	bool is_active = irq->active;
>> +
>> +	if (unlikely(is_unshared_mapped(irq)))
>> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
>> +					      IRQCHIP_STATE_ACTIVE,
>> +					      &is_active));
>> +	return is_active;
>> +}
> 
> Why do we need this?
> 
> The active state of a virtual IRQ is independent from the underlying
> physical state, as I see it.
> 
> For example, when the interrupt is initially injected to the VGIC, it
> will be ACTIVE on the physical distributor, but PENDING on the VGIC.
Since I can use the active bit of the LR, that code is not needed.

Thanks

Eric
> 
> 
> Thanks,
> -Christoffer
> 
>> +
>>  /**
>>   * kvm_vgic_target_oracle - compute the target vcpu for an irq
>>   *
>> @@ -153,7 +175,7 @@ static struct kvm_vcpu *vgic_target_oracle(struct vgic_irq *irq)
>>  	DEBUG_SPINLOCK_BUG_ON(!spin_is_locked(&irq->irq_lock));
>>  
>>  	/* If the interrupt is active, it must stay on the current vcpu */
>> -	if (irq->active)
>> +	if (irq_is_active(irq))
>>  		return irq->vcpu ? : irq->target_vcpu;
>>  
>>  	/*
>> @@ -195,14 +217,18 @@ static int vgic_irq_cmp(void *priv, struct list_head *a, struct list_head *b)
>>  {
>>  	struct vgic_irq *irqa = container_of(a, struct vgic_irq, ap_list);
>>  	struct vgic_irq *irqb = container_of(b, struct vgic_irq, ap_list);
>> +	bool activea, activeb;
>>  	bool penda, pendb;
>>  	int ret;
>>  
>>  	spin_lock(&irqa->irq_lock);
>>  	spin_lock_nested(&irqb->irq_lock, SINGLE_DEPTH_NESTING);
>>  
>> -	if (irqa->active || irqb->active) {
>> -		ret = (int)irqb->active - (int)irqa->active;
>> +	activea = irq_is_active(irqa);
>> +	activeb = irq_is_active(irqb);
>> +
>> +	if (activea || activeb) {
>> +		ret = (int)activeb - (int)activea;
>>  		goto out;
>>  	}
>>  
>> @@ -234,13 +260,17 @@ static void vgic_sort_ap_list(struct kvm_vcpu *vcpu)
>>  
>>  /*
>>   * Only valid injection if changing level for level-triggered IRQs or for a
>> - * rising edge.
>> + * rising edge. Injection of virtual interrupts associated to physical
>> + * interrupts always is valid.
>>   */
>>  static bool vgic_validate_injection(struct vgic_irq *irq, bool level)
>>  {
>>  	switch (irq->config) {
>>  	case VGIC_CONFIG_LEVEL:
>> -		return irq->line_level != level;
>> +		if (unlikely(is_unshared_mapped(irq)))
>> +			return true;
>> +		else
>> +			return irq->line_level != level;
>>  	case VGIC_CONFIG_EDGE:
>>  		return level;
>>  	}
>> @@ -392,7 +422,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>>  	return 0;
>>  }
>>  
>> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)
>> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
>> +			  u32 virt_irq, u32 phys_irq)
>>  {
>>  	struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, virt_irq);
>>  
>> @@ -402,6 +433,7 @@ int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)
>>  
>>  	irq->hw = true;
>>  	irq->hwintid = phys_irq;
>> +	irq->host_irq = host_irq;
>>  
>>  	spin_unlock(&irq->irq_lock);
>>  	vgic_put_irq(vcpu->kvm, irq);
>> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
>> index da83e4c..dc4972b 100644
>> --- a/virt/kvm/arm/vgic/vgic.h
>> +++ b/virt/kvm/arm/vgic/vgic.h
>> @@ -17,6 +17,7 @@
>>  #define __KVM_ARM_VGIC_NEW_H__
>>  
>>  #include <linux/irqchip/arm-gic-common.h>
>> +#include <linux/interrupt.h>
>>  
>>  #define PRODUCT_ID_KVM		0x4b	/* ASCII code K */
>>  #define IMPLEMENTER_ARM		0x43b
>> @@ -96,14 +97,20 @@
>>  /* we only support 64 kB translation table page size */
>>  #define KVM_ITS_L1E_ADDR_MASK		GENMASK_ULL(51, 16)
>>  
>> +bool irq_line_level(struct vgic_irq *irq);
>> +bool irq_is_active(struct vgic_irq *irq);
>> +
>>  static inline bool irq_is_pending(struct vgic_irq *irq)
>>  {
>>  	if (irq->config == VGIC_CONFIG_EDGE)
>>  		return irq->pending_latch;
>>  	else
>> -		return irq->pending_latch || irq->line_level;
>> +		return irq->pending_latch || irq_line_level(irq);
>>  }
>>  
>> +#define is_unshared_mapped(i) \
>> +((i)->hw && (i)->intid >= VGIC_NR_PRIVATE_IRQS && (i)->intid < 1020)
>> +
>>  /*
>>   * This struct provides an intermediate representation of the fields contained
>>   * in the GICH_VMCR and ICH_VMCR registers, such that code exporting the GIC
>> -- 
>> 2.5.5
>>

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 08/10] KVM: arm/arm64: vgic: Handle unshared mapped interrupts
  2017-06-08  8:34           ` Christoffer Dall
@ 2017-06-08  8:55               ` Auger Eric
  0 siblings, 0 replies; 69+ messages in thread
From: Auger Eric @ 2017-06-08  8:55 UTC (permalink / raw)
  To: Christoffer Dall, Marc Zyngier
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, alex.williamson,
	pbonzini, christoffer.dall, drjones, wei

Hi Christoffer, Marc,

On 08/06/2017 10:34, Christoffer Dall wrote:
> On Thu, Jun 08, 2017 at 09:23:16AM +0100, Marc Zyngier wrote:
>> On Fri, Jun 02 2017 at  6:29:44 pm BST, Christoffer Dall <cdall@linaro.org> wrote:
>>> On Fri, Jun 02, 2017 at 03:10:23PM +0100, Marc Zyngier wrote:
>>>> On 02/06/17 14:33, Christoffer Dall wrote:
>>>>> On Wed, May 24, 2017 at 10:13:21PM +0200, Eric Auger wrote:
>>>>>> Virtual interrupts directly mapped to physical interrupts require
>>>>>> some special care. Their pending and active state must be observed
>>>>>> at distributor level and not in the list register.
>>>>>
>>>>> This is not entirely true.  There's a dependency, but there is also
>>>>> separate virtual vs. physical state, see below.
>>>>
>>>> I think this stems from the usual confusion about the "pending and active
>>>> state" vs "pending and active states". Yes, the GIC spec is rubbish. Can
>>>> I state this again?
>>>>
>>>>>>
>>>>>> Also a level sensitive interrupt's level is not toggled down by any
>>>>>> maintenance IRQ handler as the EOI is not trapped.
>>>>>>
>>>>>> This patch adds an host_irq field in vgic_irq struct to easily
>>>>>> get the irqchip state of the host irq. We also handle the
>>>>>> physical IRQ case in vgic_validate_injection and add helpers to
>>>>>> get the line level and active state.
>>>>>>
>>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>>> ---
>>>>>>  include/kvm/arm_vgic.h    |  4 +++-
>>>>>>  virt/kvm/arm/arch_timer.c |  3 ++-
>>>>>>  virt/kvm/arm/vgic/vgic.c  | 44 ++++++++++++++++++++++++++++++++++++++------
>>>>>>  virt/kvm/arm/vgic/vgic.h  |  9 ++++++++-
>>>>>>  4 files changed, 51 insertions(+), 9 deletions(-)
>>>>>>
>>>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>>>>> index ef71858..695ebc7 100644
>>>>>> --- a/include/kvm/arm_vgic.h
>>>>>> +++ b/include/kvm/arm_vgic.h
>>>>>> @@ -112,6 +112,7 @@ struct vgic_irq {
>>>>>>  	bool hw;			/* Tied to HW IRQ */
>>>>>>  	struct kref refcount;		/* Used for LPIs */
>>>>>>  	u32 hwintid;			/* HW INTID number */
>>>>>> +	unsigned int host_irq;		/* linux irq corresponding to hwintid */
>>>>>>  	union {
>>>>>>  		u8 targets;			/* GICv2 target VCPUs mask */
>>>>>>  		u32 mpidr;			/* GICv3 target VCPU */
>>>>>> @@ -301,7 +302,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>>>>>>  			bool level);
>>>>>>  int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>>>>>>  			       bool level);
>>>>>> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq);
>>>>>> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
>>>>>> +			  u32 virt_irq, u32 phys_irq);
>>>>>>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int virt_irq);
>>>>>>  bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq);
>>>>>>  
>>>>>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>>>>>> index 5976609..45f4779 100644
>>>>>> --- a/virt/kvm/arm/arch_timer.c
>>>>>> +++ b/virt/kvm/arm/arch_timer.c
>>>>>> @@ -651,7 +651,8 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
>>>>>>  	 * Tell the VGIC that the virtual interrupt is tied to a
>>>>>>  	 * physical interrupt. We do that once per VCPU.
>>>>>>  	 */
>>>>>> -	ret = kvm_vgic_map_phys_irq(vcpu, vtimer->irq.irq, phys_irq);
>>>>>> +	ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq,
>>>>>> +				    vtimer->irq.irq, phys_irq);
>>>>>>  	if (ret)
>>>>>>  		return ret;
>>>>>>  
>>>>>> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
>>>>>> index 83b24d2..aa0618c 100644
>>>>>> --- a/virt/kvm/arm/vgic/vgic.c
>>>>>> +++ b/virt/kvm/arm/vgic/vgic.c
>>>>>> @@ -137,6 +137,28 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
>>>>>>  	kfree(irq);
>>>>>>  }
>>>>>>  
>>>>>> +bool irq_line_level(struct vgic_irq *irq)
>>>>>> +{
>>>>>> +	bool line_level = irq->line_level;
>>>>>> +
>>>>>> +	if (unlikely(is_unshared_mapped(irq)))
>>>>>> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
>>>>>> +					      IRQCHIP_STATE_PENDING,
>>>>>> +					      &line_level));
>>>>>> +	return line_level;
>>>>>> +}
>>>>>
>>>>> This really looks fishy.  When do we need this exactly?
>>>>>
>>>>> I feel like we should treat this more like everything else and set the
>>>>> line_level on the irq even for forwarded interrupts, and then you don't
>>>>> need changes to validate injection.
>>>>>
>>>>> The challenge, then, is how to re-sample the line and lower the
>>>>> line_level field when necessary.  Can't we simply do this in
>>>>> vgic_fold_lr_state(), and if you have a forwarded interrupt which is
>>>>> level triggered and the level is high, then notify the one who injected
>>>>> this and tell it to adjust its line level (lower it if it changed).
>>>>>
>>>>> That would follow our existing path very closely.
>>>>>
>>>>> Am I missing something?
>>>>
>>>> I don't think you are. I think Eric got confused because of the above.
>>>> But the flow is a bit a brainfsck :-(
>>>>
>>>> - Physical interrupt fires, activated, injected in the vgic
>>>> - Injecting the interrupt has a very different flow from what we
>>>> currently have, and follow the same pattern as an Edge interrupt
>>>> (because the Pending state is kept at the physical distributor, so we
>>>> cannot preserve it in the emulation).
>>>> - Normal life cycle of the interrupt
>>>> - The fact that the Pending bit is kept at the distributor level ensures
>>>> that if it becomes pending again in the emulation, that's because the
>>>> guest has deactivated the physical interrupt by doing an EOI.
>>>>
>>>
>>> I think there's a choice between how we choose to support this.  We can
>>> either do the edge-like injection, or we can model the line_level to the
>>> best of our ability (we just have to lower the line after the guest
>>> exits after deactivation if it's not still pending at the physical
>>> distributor).
>>>
>>> One question with doing this edge-like, can you have this scenario:
>>>  1. VM runs with active virtual interrupt linked to physical
>>>     interrupt.
>>>  2. VM deactivates virtual+physical interrupt
>>>  3. Physical interrupt fires again on the host
>>>  4. The host injects the virtual interrupt as pending to the VGIC (and
>>>     IPIs the VCPU etc.)
>>>  5. The device lowers the physical line (another VCPU programs the
>>>     device, there's some delay, or whatever)
>>>  6. The VCPU now sees a pending interrupt, which is no longer pending.
>>>
>>> Not sure if the line-like approach really solves this, though, or if
>>> getting a spurious interrupt is something we care about.
>>
>> That would be a spurious interrupt indeed, but I'm not sure that's
>> something the line level sampling you suggest would avoid either. There
>> is a fundamental disconnect between the injection and the physical line,
>> and it can only be modelled to some level of accuracy (/me curse the
>> architecture again).
>>
>>> Perhaps we need to try to implement both and see how it looks like?
>>
>> There is definitely room for experiment, but I feel Eric should focus on
>> one of them (whichever it is). Happy to help prototyping the other one
>> though.
>>
> That's fair.  I'm just worried about the whole "emulate level triggered
> interrupts as edge triggered" thing, but as you said, the architecture
> doesn't allow us to model it more accurately.
If the line level is modeled using the physical distributor pending
state, doesn't that fix this case?

Thanks

Eric
> 
> Thanks,
> -Christoffer
> 

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 08/10] KVM: arm/arm64: vgic: Handle unshared mapped interrupts
  2017-06-08  8:49       ` Auger Eric
  (?)
@ 2017-06-08 10:11       ` Christoffer Dall
  -1 siblings, 0 replies; 69+ messages in thread
From: Christoffer Dall @ 2017-06-08 10:11 UTC (permalink / raw)
  To: Auger Eric
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, alex.williamson,
	pbonzini, marc.zyngier, christoffer.dall, drjones, wei

On Thu, Jun 08, 2017 at 10:49:12AM +0200, Auger Eric wrote:
> Hi Christoffer,
> 
> On 02/06/2017 15:33, Christoffer Dall wrote:
> > On Wed, May 24, 2017 at 10:13:21PM +0200, Eric Auger wrote:
> >> Virtual interrupts directly mapped to physical interrupts require
> >> some special care. Their pending and active state must be observed
> >> at distributor level and not in the list register.
> > 
> > This is not entirely true.  There's a dependency, but there is also
> > separate virtual vs. physical state, see below.
> > 
> >>
> >> Also a level sensitive interrupt's level is not toggled down by any
> >> maintenance IRQ handler as the EOI is not trapped.
> >>
> >> This patch adds an host_irq field in vgic_irq struct to easily
> >> get the irqchip state of the host irq. We also handle the
> >> physical IRQ case in vgic_validate_injection and add helpers to
> >> get the line level and active state.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >> ---
> >>  include/kvm/arm_vgic.h    |  4 +++-
> >>  virt/kvm/arm/arch_timer.c |  3 ++-
> >>  virt/kvm/arm/vgic/vgic.c  | 44 ++++++++++++++++++++++++++++++++++++++------
> >>  virt/kvm/arm/vgic/vgic.h  |  9 ++++++++-
> >>  4 files changed, 51 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >> index ef71858..695ebc7 100644
> >> --- a/include/kvm/arm_vgic.h
> >> +++ b/include/kvm/arm_vgic.h
> >> @@ -112,6 +112,7 @@ struct vgic_irq {
> >>  	bool hw;			/* Tied to HW IRQ */
> >>  	struct kref refcount;		/* Used for LPIs */
> >>  	u32 hwintid;			/* HW INTID number */
> >> +	unsigned int host_irq;		/* linux irq corresponding to hwintid */
> >>  	union {
> >>  		u8 targets;			/* GICv2 target VCPUs mask */
> >>  		u32 mpidr;			/* GICv3 target VCPU */
> >> @@ -301,7 +302,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
> >>  			bool level);
> >>  int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int intid,
> >>  			       bool level);
> >> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq);
> >> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
> >> +			  u32 virt_irq, u32 phys_irq);
> >>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int virt_irq);
> >>  bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq);
> >>  
> >> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> >> index 5976609..45f4779 100644
> >> --- a/virt/kvm/arm/arch_timer.c
> >> +++ b/virt/kvm/arm/arch_timer.c
> >> @@ -651,7 +651,8 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
> >>  	 * Tell the VGIC that the virtual interrupt is tied to a
> >>  	 * physical interrupt. We do that once per VCPU.
> >>  	 */
> >> -	ret = kvm_vgic_map_phys_irq(vcpu, vtimer->irq.irq, phys_irq);
> >> +	ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq,
> >> +				    vtimer->irq.irq, phys_irq);
> >>  	if (ret)
> >>  		return ret;
> >>  
> >> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> >> index 83b24d2..aa0618c 100644
> >> --- a/virt/kvm/arm/vgic/vgic.c
> >> +++ b/virt/kvm/arm/vgic/vgic.c
> >> @@ -137,6 +137,28 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
> >>  	kfree(irq);
> >>  }
> >>  
> >> +bool irq_line_level(struct vgic_irq *irq)
> >> +{
> >> +	bool line_level = irq->line_level;
> >> +
> >> +	if (unlikely(is_unshared_mapped(irq)))
> >> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
> >> +					      IRQCHIP_STATE_PENDING,
> >> +					      &line_level));
> >> +	return line_level;
> >> +}
> > 
> > This really looks fishy.  When do we need this exactly?
> > 
> > I feel like we should treat this more like everything else and set the
> > line_level on the irq even for forwarded interrupts, and then you don't
> > need changes to validate injection.
> > 
> > The challenge, then, is how to re-sample the line and lower the
> > line_level field when necessary.  Can't we simply do this in
> > vgic_fold_lr_state(), and if you have a forwarded interrupt which is
> > level triggered and the level is high,
> Didn't you mean level is low? When the level is low you would trigger
> the resamplefd? Doesn't it bring a useless overhead?

No, I meant high.

So the idea would be that (for example) VFIO gets notified when the
level goes up, and can forward this information to KVM at that time, but
neither VFIO nor KVM get notified when the level lowers again.  The idea
would then be that we only care about this change when we're done with
the virtual interrupt, so if the line_level is already high, we should
do the resample trick to get VFIO to lower the line - sort of the
opposite from the non-forwarding case.

But I'm now realizing that this doesn't work as easily as I thought,
because then you would have to change the injection path to IPI the VM
when setting the line_level to high when it's already high, and the
interrupt is a forwarded one.
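
As a rough sketch of the resample idea described above, and not code from
this series: once the guest is done with a forwarded level-triggered
interrupt, the fold path would ask whoever injected it to re-sample the
line. struct fwd_irq, fold_forwarded_irq() and the resample callback are
all hypothetical names, and the callback stands in for the VFIO resample
notification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct vgic_irq. */
struct fwd_irq {
	bool hw;			/* forwarded (mapped to a physical IRQ) */
	bool level_triggered;
	bool line_level;		/* emulated line level */
	bool (*resample)(void *opaque);	/* returns the current physical level */
	void *opaque;
};

/* Example callback: the "device" level is simply a bool owned by the caller. */
static bool sample_bool(void *opaque)
{
	return *(bool *)opaque;
}

/*
 * Called when the LR state is folded back after a guest exit. If the guest
 * is done with the interrupt (neither pending nor active in the LR) and the
 * emulated line is still high, ask the injector for the real level.
 */
static void fold_forwarded_irq(struct fwd_irq *irq, bool lr_pending,
			       bool lr_active)
{
	assert(irq != NULL);

	if (!irq->hw || !irq->level_triggered)
		return;
	if (lr_pending || lr_active)
		return;
	if (irq->line_level && irq->resample)
		irq->line_level = irq->resample(irq->opaque);
}
```

The sketch only covers lowering the line on the fold path; it does not
address the injection-path change mentioned above.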

> 
> So yes I was confused by the spec and I thought the pending and active
> bit*s* were not reliable in the LRs and instead stared at the
> distributor state.

Been there, done that.

> 
> That being said, as you mention, the line level is difficult to model as
> the resamplefd is not called, so why not look directly at the physical
> distributor state (pending bit?). The impact on the SM is reduced to
> irq_line_level() and validate_injection() which is always true for
> forwarded interrupts.
> 

Because the virtual state is in fact decoupled from the physical state,
so it feels wrong to me to start peeking into the physical GIC for our
emulated model.


Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 08/10] KVM: arm/arm64: vgic: Handle unshared mapped interrupts
  2017-06-08  8:55               ` Auger Eric
@ 2017-06-08 10:14                 ` Christoffer Dall
  -1 siblings, 0 replies; 69+ messages in thread
From: Christoffer Dall @ 2017-06-08 10:14 UTC (permalink / raw)
  To: Auger Eric
  Cc: Marc Zyngier, eric.auger.pro, linux-kernel, kvm, kvmarm,
	alex.williamson, pbonzini, christoffer.dall, drjones, wei

On Thu, Jun 08, 2017 at 10:55:29AM +0200, Auger Eric wrote:
> Hi Christoffer, Marc,
> 
> On 08/06/2017 10:34, Christoffer Dall wrote:
> > On Thu, Jun 08, 2017 at 09:23:16AM +0100, Marc Zyngier wrote:
> >> On Fri, Jun 02 2017 at  6:29:44 pm BST, Christoffer Dall <cdall@linaro.org> wrote:
> >>> On Fri, Jun 02, 2017 at 03:10:23PM +0100, Marc Zyngier wrote:
> >>>> On 02/06/17 14:33, Christoffer Dall wrote:
> >>>>> On Wed, May 24, 2017 at 10:13:21PM +0200, Eric Auger wrote:
> >>>>>> Virtual interrupts directly mapped to physical interrupts require
> >>>>>> some special care. Their pending and active state must be observed
> >>>>>> at distributor level and not in the list register.
> >>>>>
> >>>>> This is not entirely true.  There's a dependency, but there is also
> >>>>> separate virtual vs. physical state, see below.
> >>>>
> >>>> I think this stems from the usual confusion about the "pending and active
> >>>> state" vs "pending and active states". Yes, the GIC spec is rubbish. Can
> >>>> I state this again?
> >>>>
> >>>>>>
> >>>>>> Also a level sensitive interrupt's level is not toggled down by any
> >>>>>> maintenance IRQ handler as the EOI is not trapped.
> >>>>>>
> >>>>>> This patch adds an host_irq field in vgic_irq struct to easily
> >>>>>> get the irqchip state of the host irq. We also handle the
> >>>>>> physical IRQ case in vgic_validate_injection and add helpers to
> >>>>>> get the line level and active state.
> >>>>>>
> >>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >>>>>> ---
> >>>>>>  include/kvm/arm_vgic.h    |  4 +++-
> >>>>>>  virt/kvm/arm/arch_timer.c |  3 ++-
> >>>>>>  virt/kvm/arm/vgic/vgic.c  | 44 ++++++++++++++++++++++++++++++++++++++------
> >>>>>>  virt/kvm/arm/vgic/vgic.h  |  9 ++++++++-
> >>>>>>  4 files changed, 51 insertions(+), 9 deletions(-)
> >>>>>>
> >>>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >>>>>> index ef71858..695ebc7 100644
> >>>>>> --- a/include/kvm/arm_vgic.h
> >>>>>> +++ b/include/kvm/arm_vgic.h
> >>>>>> @@ -112,6 +112,7 @@ struct vgic_irq {
> >>>>>>  	bool hw;			/* Tied to HW IRQ */
> >>>>>>  	struct kref refcount;		/* Used for LPIs */
> >>>>>>  	u32 hwintid;			/* HW INTID number */
> >>>>>> +	unsigned int host_irq;		/* linux irq corresponding to hwintid */
> >>>>>>  	union {
> >>>>>>  		u8 targets;			/* GICv2 target VCPUs mask */
> >>>>>>  		u32 mpidr;			/* GICv3 target VCPU */
> >>>>>> @@ -301,7 +302,8 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
> >>>>>>  			bool level);
> >>>>>>  int kvm_vgic_inject_mapped_irq(struct kvm *kvm, int cpuid, unsigned int intid,
> >>>>>>  			       bool level);
> >>>>>> -int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq);
> >>>>>> +int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
> >>>>>> +			  u32 virt_irq, u32 phys_irq);
> >>>>>>  int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int virt_irq);
> >>>>>>  bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq);
> >>>>>>  
> >>>>>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> >>>>>> index 5976609..45f4779 100644
> >>>>>> --- a/virt/kvm/arm/arch_timer.c
> >>>>>> +++ b/virt/kvm/arm/arch_timer.c
> >>>>>> @@ -651,7 +651,8 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
> >>>>>>  	 * Tell the VGIC that the virtual interrupt is tied to a
> >>>>>>  	 * physical interrupt. We do that once per VCPU.
> >>>>>>  	 */
> >>>>>> -	ret = kvm_vgic_map_phys_irq(vcpu, vtimer->irq.irq, phys_irq);
> >>>>>> +	ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq,
> >>>>>> +				    vtimer->irq.irq, phys_irq);
> >>>>>>  	if (ret)
> >>>>>>  		return ret;
> >>>>>>  
> >>>>>> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> >>>>>> index 83b24d2..aa0618c 100644
> >>>>>> --- a/virt/kvm/arm/vgic/vgic.c
> >>>>>> +++ b/virt/kvm/arm/vgic/vgic.c
> >>>>>> @@ -137,6 +137,28 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
> >>>>>>  	kfree(irq);
> >>>>>>  }
> >>>>>>  
> >>>>>> +bool irq_line_level(struct vgic_irq *irq)
> >>>>>> +{
> >>>>>> +	bool line_level = irq->line_level;
> >>>>>> +
> >>>>>> +	if (unlikely(is_unshared_mapped(irq)))
> >>>>>> +		WARN_ON(irq_get_irqchip_state(irq->host_irq,
> >>>>>> +					      IRQCHIP_STATE_PENDING,
> >>>>>> +					      &line_level));
> >>>>>> +	return line_level;
> >>>>>> +}
> >>>>>
> >>>>> This really looks fishy.  When do we need this exactly?
> >>>>>
> >>>>> I feel like we should treat this more like everything else and set the
> >>>>> line_level on the irq even for forwarded interrupts, and then you don't
> >>>>> need changes to validate injection.
> >>>>>
> >>>>> The challenge, then, is how to re-sample the line and lower the
> >>>>> line_level field when necessary.  Can't we simply do this in
> >>>>> vgic_fold_lr_state(), and if you have a forwarded interrupt which is
> >>>>> level triggered and the level is high, then notify the one who injected
> >>>>> this and tell it to adjust its line level (lower it if it changed).
> >>>>>
> >>>>> That would follow our existing path very closely.
> >>>>>
> >>>>> Am I missing something?
> >>>>
> >>>> I don't think you are. I think Eric got confused because of the above.
> >>>> But the flow is a bit a brainfsck :-(
> >>>>
> >>>> - Physical interrupt fires, activated, injected in the vgic
> >>>> - Injecting the interrupt has a very different flow from what we
> >>>> currently have, and follow the same pattern as an Edge interrupt
> >>>> (because the Pending state is kept at the physical distributor, so we
> >>>> cannot preserve it in the emulation).
> >>>> - Normal life cycle of the interrupt
> >>>> - The fact that the Pending bit is kept at the distributor level ensures
> >>>> that if it becomes pending again in the emulation, that's because the
> >>>> guest has deactivated the physical interrupt by doing an EOI.
> >>>>
> >>>
> >>> I think there's a choice between how we choose to support this.  We can
> >>> either do the edge-like injection, or we can model the line_level to the
> >>> best of our ability (we just have to lower the line after the guest
> >>> exits after deactivation if it's not still pending at the physical
> >>> distributor).
> >>>
> >>> One question with doing this edge-like, can you have this scenario:
> >>>  1. VM runs with active virtual interrupt linked to physical
> >>>     interrupt.
> >>>  2. VM deactivates virtual+physical interrupt
> >>>  3. Physical interrupt fires again on the host
> >>>  4. The host injects the virtual interrupt as pending to the VGIC (and
> >>>     IPIs the VCPU etc.)
> >>>  5. The device lowers the physical line (another VCPU programs the
> >>>     device, there's some delay, or whatever)
> >>>  6. The VCPU now sees a pending interrupt, which is no longer pending.
> >>>
> >>> Not sure if the line-like approach really solves this, though, or if
> >>> getting a spurious interrupt is something we care about.
> >>
> >> That would be a spurious interrupt indeed, but I'm not sure that's
> >> something the line level sampling you suggest would avoid either. There
> >> is a fundamental disconnect between the injection and the physical line,
> >> and it can only be modelled to some level of accuracy (/me curse the
> >> architecture again).
> >>
> >>> Perhaps we need to try to implement both and see how it looks like?
> >>
> >> There is definitely room for experiment, but I feel Eric should focus on
> >> one of them (whichever it is). Happy to help prototyping the other one
> >> though.
> >>
> > That's fair.  I'm just worried about the whole "emulate level triggered
> > interrupts as edge triggered" thing, but as you said, the architecture
> > doesn't allow us to model it more accurately.
> If the line level is modeled using the physical distributor pending
> state, don't you fix that case?
> 

I see what you mean.  Perhaps.  So that would mean that we move
line_level() to the physical distributor for forwarded interrupts, but
the active state is managed virtually in the GIC?

That contradicts my argument on the other mail, but if it's a more
accurate emulation, then that's definitely worth considering.
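
To make the question concrete, here is a minimal userspace sketch of that
split. It is not kernel code: `phys_dist`, `fwd_irq`, `latched_pending`,
`sampled_line_level()` and `replay_scenario()` are made-up names, and the
model assumes the distributor pending bit simply follows the device line
for a level-sensitive interrupt. In the patch under review the equivalent
lookup would presumably go through irq_get_irqchip_state() with
IRQCHIP_STATE_PENDING. The sketch replays steps 3 to 6 of the scenario
quoted above and compares the edge-like latch with a sampled line level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the physical distributor state of one SPI. */
struct phys_dist {
	bool pending;	/* assumed to follow the device line (level-sensitive) */
};

/* Hypothetical, trimmed-down view of a forwarded virtual interrupt. */
struct fwd_irq {
	bool hw;		/* tied to a physical interrupt */
	bool line_level;	/* software-maintained level (non-forwarded case) */
	bool latched_pending;	/* edge-like model: set on injection, not re-sampled */
	bool active;		/* active state, still tracked virtually */
	struct phys_dist *dist;	/* where the pending state really lives */
};

/* Edge-like injection: latch pending when the host handler runs. */
static void inject_edge_like(struct fwd_irq *irq)
{
	assert(irq != NULL);
	irq->latched_pending = true;
}

/* Line-like model: for forwarded IRQs, sample the physical distributor. */
static bool sampled_line_level(const struct fwd_irq *irq)
{
	assert(irq != NULL);
	if (irq->hw && irq->dist)
		return irq->dist->pending;
	return irq->line_level;
}

/*
 * Steps 3 to 6 of the scenario: the line rises, the host injects, the
 * device lowers the line before the VCPU looks at the interrupt.
 * Returns true if the two models disagree at step 6.
 */
bool replay_scenario(void)
{
	struct phys_dist dist = { .pending = false };
	struct fwd_irq irq = { .hw = true, .dist = &dist };

	dist.pending = true;		/* 3. physical interrupt fires */
	inject_edge_like(&irq);		/* 4. host injects it as pending */
	dist.pending = false;		/* 5. device lowers the line */

	/* 6. the latch still reports pending, the sampled level does not */
	return irq.latched_pending && !sampled_line_level(&irq);
}
```

Marc's caveat still applies to the sampled variant: the value is only a
snapshot, so the line can change again right after it has been read.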

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 05/10] VFIO: pci: Introduce direct EOI INTx interrupt handler
  2017-05-31 18:24   ` Alex Williamson
  2017-06-01 20:40       ` Auger Eric
@ 2017-06-14  8:07     ` Auger Eric
  2017-06-14  8:41         ` Marc Zyngier
  1 sibling, 1 reply; 69+ messages in thread
From: Auger Eric @ 2017-06-14  8:07 UTC (permalink / raw)
  To: Alex Williamson
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, pbonzini,
	marc.zyngier, christoffer.dall, drjones, wei

Hi Alex, Marc,

On 31/05/2017 20:24, Alex Williamson wrote:
> On Wed, 24 May 2017 22:13:18 +0200
> Eric Auger <eric.auger@redhat.com> wrote:
> 
>> We add two new fields in vfio_pci_irq_ctx struct: deoi and handler.
>> If deoi is set, this means the physical IRQ attached to the virtual
>> IRQ is directly deactivated by the guest and the VFIO driver does
>> not need to disable the physical IRQ and mask it at VFIO level.
>>
>> The handler pointer is set accordingly and a wrapper handler is
>> introduced that calls the chosen handler function.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>> ---
>>  drivers/vfio/pci/vfio_pci_intrs.c   | 32 ++++++++++++++++++++++++++------
>>  drivers/vfio/pci/vfio_pci_private.h |  2 ++
>>  2 files changed, 28 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
>> index d4d377b..06aa713 100644
>> --- a/drivers/vfio/pci/vfio_pci_intrs.c
>> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
>> @@ -121,11 +121,8 @@ void vfio_pci_intx_unmask(struct vfio_pci_device *vdev)
>>  static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
>>  {
>>  	struct vfio_pci_device *vdev = dev_id;
>> -	unsigned long flags;
>>  	int ret = IRQ_NONE;
>>  
>> -	spin_lock_irqsave(&vdev->irqlock, flags);
>> -
>>  	if (!vdev->pci_2_3) {
>>  		disable_irq_nosync(vdev->pdev->irq);
>>  		vdev->ctx[0].automasked = true;
>> @@ -137,14 +134,33 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
>>  		ret = IRQ_HANDLED;
>>  	}
>>  
>> -	spin_unlock_irqrestore(&vdev->irqlock, flags);
>> -
>>  	if (ret == IRQ_HANDLED)
>>  		vfio_send_intx_eventfd(vdev, NULL);
>>  
>>  	return ret;
>>  }
>>  
>> +static irqreturn_t vfio_intx_handler_deoi(int irq, void *dev_id)
>> +{
>> +	struct vfio_pci_device *vdev = dev_id;
>> +
>> +	vfio_send_intx_eventfd(vdev, NULL);
>> +	return IRQ_HANDLED;
>> +}
>> +
>> +static irqreturn_t vfio_intx_wrapper_handler(int irq, void *dev_id)
>> +{
>> +	struct vfio_pci_device *vdev = dev_id;
>> +	unsigned long flags;
>> +	irqreturn_t ret;
>> +
>> +	spin_lock_irqsave(&vdev->irqlock, flags);
>> +	ret = vdev->ctx[0].handler(irq, dev_id);
>> +	spin_unlock_irqrestore(&vdev->irqlock, flags);
>> +
>> +	return ret;
>> +}
>> +
>>  static int vfio_intx_enable(struct vfio_pci_device *vdev)
>>  {
>>  	if (!is_irq_none(vdev))
>> @@ -208,7 +224,11 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
>>  	if (!vdev->pci_2_3)
>>  		irqflags = 0;
>>  
>> -	ret = request_irq(pdev->irq, vfio_intx_handler,
>> +	if (vdev->ctx[0].deoi)
>> +		vdev->ctx[0].handler = vfio_intx_handler_deoi;
>> +	else
>> +		vdev->ctx[0].handler = vfio_intx_handler;
>> +	ret = request_irq(pdev->irq, vfio_intx_wrapper_handler,
>>  			  irqflags, vdev->ctx[0].name, vdev);
> 
> 
> Here's where I think we don't account for irqflags properly.  If we get
> a shared interrupt here, then direct EOI needs to be disabled,
> or else we'll starve other devices sharing the interrupt.  In practice,
> I wonder if this makes PCI direct EOI a useful feature.  We could try
> to get an exclusive interrupt and fall back to shared, but any time we
> get an exclusive interrupt we're more prone to conflicts with other
> devices.  I might have two VMs that share an interrupt and now it's a
> race where only the first to set up an IRQ can work.  Worse, one of those
> VMs might be fully booted and switched to MSI and now it's just a
> matter of time until they reboot in the right way to generate a
> conflict.  I might also have two devices in the same VM that share an
> IRQ and now I can't start the VM at all because the second device can
> no longer get an interrupt.  This is the same problem we have with the
> nointxmask flag: it's a useful debugging feature, but since the masking
> is done at the APIC/GIC rather than the device, much like here, it's not
> very practical for more than debugging and isolating specific devices
> as requiring APIC/GIC level masking.  I'm not sure how to proceed on the
> PCI side here. Thanks,

So I agree: direct EOI with shared interrupts is a total mess, because
- if the interrupt is not for VFIO, the physical interrupt will not be
deactivated
- if the interrupt is for VFIO, the physical interrupt will be
deactivated through guest virtual interrupt deactivation before
subsequent physical handlers complete their execution.
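
To illustrate the irqflags point, here is a minimal userspace sketch (not
the actual VFIO code) of a handler selection that falls back to the
masking handler whenever the line is requested as shared.
`select_intx_handler()`, `struct intx_ctx` and the two stub handlers are
hypothetical names; IRQF_SHARED, irqreturn_t and IRQ_HANDLED are redefined
locally, mirroring the kernel definitions, so that the snippet builds
outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IRQF_SHARED 0x00000080UL	/* local copy of the kernel flag value */

typedef int irqreturn_t;		/* stand-in for the kernel enum */
#define IRQ_HANDLED 1

typedef irqreturn_t (*intx_handler_t)(int irq, void *dev_id);

/* Hypothetical subset of vfio_pci_irq_ctx from the patch. */
struct intx_ctx {
	bool deoi;		/* direct EOI requested for this interrupt */
	intx_handler_t handler;	/* handler called by the wrapper */
};

/* Stub: the regular handler would mask the line before signalling. */
static irqreturn_t intx_handler_masking(int irq, void *dev_id)
{
	(void)irq;
	(void)dev_id;
	return IRQ_HANDLED;
}

/* Stub: the direct EOI handler would only signal the eventfd. */
static irqreturn_t intx_handler_deoi(int irq, void *dev_id)
{
	(void)irq;
	(void)dev_id;
	return IRQ_HANDLED;
}

/*
 * Pick the handler. Direct EOI is only kept when the line is exclusive:
 * on a shared line the host no longer deactivates the interrupt itself,
 * so the other devices sharing it could be starved.
 * Returns true if direct EOI was kept.
 */
bool select_intx_handler(struct intx_ctx *ctx, unsigned long irqflags)
{
	assert(ctx != NULL);
	if (ctx->deoi && !(irqflags & IRQF_SHARED)) {
		ctx->handler = intx_handler_deoi;
		return true;
	}
	ctx->deoi = false;
	ctx->handler = intx_handler_masking;
	return false;
}
```

This only covers the starvation case. It does nothing for the conflicts
that show up once an exclusive interrupt is requested.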

By the way, reading
"http://vfio.blogspot.fr/2014/09/vfio-interrupts-and-how-to-coax-windows.html"
was really helpful!

So I suggest I drop the feature for VFIO-PCI INTx and respin with
vfio-platform only. This series then mostly prepares for GICv4 integration.

Thanks

Eric


> 
> Alex
> 
>>  	if (ret) {
>>  		vdev->ctx[0].trigger = NULL;
>> diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
>> index f7f1101..5cfe59a 100644
>> --- a/drivers/vfio/pci/vfio_pci_private.h
>> +++ b/drivers/vfio/pci/vfio_pci_private.h
>> @@ -36,6 +36,8 @@ struct vfio_pci_irq_ctx {
>>  	char			*name;
>>  	bool			masked;
>>  	bool			automasked;
>> +	bool			deoi;
>> +	irqreturn_t		(*handler)(int irq, void *dev_id);
>>  	struct irq_bypass_producer	producer;
>>  };
>>  
> 

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 05/10] VFIO: pci: Introduce direct EOI INTx interrupt handler
  2017-06-14  8:07     ` Auger Eric
@ 2017-06-14  8:41         ` Marc Zyngier
  0 siblings, 0 replies; 69+ messages in thread
From: Marc Zyngier @ 2017-06-14  8:41 UTC (permalink / raw)
  To: Auger Eric, Alex Williamson
  Cc: eric.auger.pro, linux-kernel, kvm, kvmarm, pbonzini,
	christoffer.dall, drjones, wei

On 14/06/17 09:07, Auger Eric wrote:
> Hi Alex, Marc,
> 
> On 31/05/2017 20:24, Alex Williamson wrote:
>> On Wed, 24 May 2017 22:13:18 +0200
>> Eric Auger <eric.auger@redhat.com> wrote:
>>
>>> We add two new fields in vfio_pci_irq_ctx struct: deoi and handler.
>>> If deoi is set, this means the physical IRQ attached to the virtual
>>> IRQ is directly deactivated by the guest and the VFIO driver does
>>> not need to disable the physical IRQ and mask it at VFIO level.
>>>
>>> The handler pointer is set accordingly and a wrapper handler is
>>> introduced that calls the chosen handler function.
>>>
>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>
>>> ---
>>> ---
>>>  drivers/vfio/pci/vfio_pci_intrs.c   | 32 ++++++++++++++++++++++++++------
>>>  drivers/vfio/pci/vfio_pci_private.h |  2 ++
>>>  2 files changed, 28 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
>>> index d4d377b..06aa713 100644
>>> --- a/drivers/vfio/pci/vfio_pci_intrs.c
>>> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
>>> @@ -121,11 +121,8 @@ void vfio_pci_intx_unmask(struct vfio_pci_device *vdev)
>>>  static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
>>>  {
>>>  	struct vfio_pci_device *vdev = dev_id;
>>> -	unsigned long flags;
>>>  	int ret = IRQ_NONE;
>>>  
>>> -	spin_lock_irqsave(&vdev->irqlock, flags);
>>> -
>>>  	if (!vdev->pci_2_3) {
>>>  		disable_irq_nosync(vdev->pdev->irq);
>>>  		vdev->ctx[0].automasked = true;
>>> @@ -137,14 +134,33 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id)
>>>  		ret = IRQ_HANDLED;
>>>  	}
>>>  
>>> -	spin_unlock_irqrestore(&vdev->irqlock, flags);
>>> -
>>>  	if (ret == IRQ_HANDLED)
>>>  		vfio_send_intx_eventfd(vdev, NULL);
>>>  
>>>  	return ret;
>>>  }
>>>  
>>> +static irqreturn_t vfio_intx_handler_deoi(int irq, void *dev_id)
>>> +{
>>> +	struct vfio_pci_device *vdev = dev_id;
>>> +
>>> +	vfio_send_intx_eventfd(vdev, NULL);
>>> +	return IRQ_HANDLED;
>>> +}
>>> +
>>> +static irqreturn_t vfio_intx_wrapper_handler(int irq, void *dev_id)
>>> +{
>>> +	struct vfio_pci_device *vdev = dev_id;
>>> +	unsigned long flags;
>>> +	irqreturn_t ret;
>>> +
>>> +	spin_lock_irqsave(&vdev->irqlock, flags);
>>> +	ret = vdev->ctx[0].handler(irq, dev_id);
>>> +	spin_unlock_irqrestore(&vdev->irqlock, flags);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>>  static int vfio_intx_enable(struct vfio_pci_device *vdev)
>>>  {
>>>  	if (!is_irq_none(vdev))
>>> @@ -208,7 +224,11 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
>>>  	if (!vdev->pci_2_3)
>>>  		irqflags = 0;
>>>  
>>> -	ret = request_irq(pdev->irq, vfio_intx_handler,
>>> +	if (vdev->ctx[0].deoi)
>>> +		vdev->ctx[0].handler = vfio_intx_handler_deoi;
>>> +	else
>>> +		vdev->ctx[0].handler = vfio_intx_handler;
>>> +	ret = request_irq(pdev->irq, vfio_intx_wrapper_handler,
>>>  			  irqflags, vdev->ctx[0].name, vdev);
>>
>>
>> Here's where I think we don't account for irqflags properly.  If we get
>> a shared interrupt here, then direct EOI needs to be disabled,
>> or else we'll starve other devices sharing the interrupt.  In practice,
>> I wonder if this makes PCI direct EOI a useful feature.  We could try
>> to get an exclusive interrupt and fall back to shared, but any time we
>> get an exclusive interrupt we're more prone to conflicts with other
>> devices.  I might have two VMs that share an interrupt and now it's a
>> race where only the first to set up an IRQ can work.  Worse, one of those
>> VMs might be fully booted and switched to MSI and now it's just a
>> matter of time until they reboot in the right way to generate a
>> conflict.  I might also have two devices in the same VM that share an
>> IRQ and now I can't start the VM at all because the second device can
>> no longer get an interrupt.  This is the same problem we have with the
>> nointxmask flag: it's a useful debugging feature, but since the masking
>> is done at the APIC/GIC rather than the device, much like here, it's not
>> very practical for more than debugging and isolating specific devices
>> as requiring APIC/GIC level masking.  I'm not sure how to proceed on the
>> PCI side here. Thanks,
> 
> So I agree: direct EOI with shared interrupts is a total mess, because
> - if the interrupt is not for VFIO, the physical interrupt will not be
> deactivated
> - if the interrupt is for VFIO, the physical interrupt will be
> deactivated through guest virtual interrupt deactivation before
> subsequent physical handlers complete their execution.
> 
> By the way, reading
> "http://vfio.blogspot.fr/2014/09/vfio-interrupts-and-how-to-coax-windows.html"
> was really helpful!
> 
> So I suggest I drop the feature for VFIO-PCI INTx and respin with
> vfio-platform only. This series then mostly prepares for GICv4 integration.

Agreed. That's probably good enough for the time being.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 69+ messages in thread

end of thread, other threads:[~2017-06-14  8:43 UTC | newest]

Thread overview: 69+ messages
2017-05-24 20:13 [PATCH 00/10] ARM/ARM64 Direct EOI setup for VFIO wired interrupts Eric Auger
2017-05-24 20:13 ` [PATCH 01/10] vfio: platform: Add automasked field to vfio_platform_irq Eric Auger
2017-05-24 20:13   ` Eric Auger
2017-05-25 18:05   ` Marc Zyngier
2017-05-25 18:05     ` Marc Zyngier
2017-05-30 12:45     ` Auger Eric
2017-05-31 17:41       ` Alex Williamson
2017-05-31 17:41         ` Alex Williamson
2017-05-24 20:13 ` [PATCH 02/10] VFIO: platform: Introduce direct EOI interrupt handler Eric Auger
2017-05-24 20:13   ` Eric Auger
2017-05-31 18:20   ` Alex Williamson
2017-05-31 18:20     ` Alex Williamson
2017-05-24 20:13 ` [PATCH 03/10] VFIO: platform: Direct EOI irq bypass for ARM/ARM64 Eric Auger
2017-05-24 20:13   ` Eric Auger
2017-05-31 18:20   ` Alex Williamson
2017-05-31 18:20     ` Alex Williamson
2017-05-31 19:31     ` Auger Eric
2017-05-31 19:31       ` Auger Eric
2017-06-01 10:49       ` Marc Zyngier
2017-05-24 20:13 ` [PATCH 04/10] VFIO: pci: Add automasked field to vfio_pci_irq_ctx Eric Auger
2017-05-24 20:13   ` Eric Auger
2017-05-31 18:21   ` Alex Williamson
2017-05-31 18:21     ` Alex Williamson
2017-05-24 20:13 ` [PATCH 05/10] VFIO: pci: Introduce direct EOI INTx interrupt handler Eric Auger
2017-05-24 20:13   ` Eric Auger
2017-05-31 18:24   ` Alex Williamson
2017-06-01 20:40     ` Auger Eric
2017-06-01 20:40       ` Auger Eric
2017-06-02  8:41       ` Marc Zyngier
2017-06-02  8:41         ` Marc Zyngier
2017-06-14  8:07     ` Auger Eric
2017-06-14  8:41       ` Marc Zyngier
2017-06-14  8:41         ` Marc Zyngier
2017-05-24 20:13 ` [PATCH 06/10] irqbypass: Add a private field in the producer Eric Auger
2017-05-24 20:13   ` Eric Auger
2017-05-24 20:13 ` [PATCH 07/10] VFIO: pci: Direct EOI irq bypass for ARM/ARM64 Eric Auger
2017-05-24 20:13   ` Eric Auger
2017-05-24 20:13 ` [PATCH 08/10] KVM: arm/arm64: vgic: Handle unshared mapped interrupts Eric Auger
2017-05-24 20:13   ` Eric Auger
2017-05-25 19:14   ` Marc Zyngier
2017-05-25 19:14     ` Marc Zyngier
2017-05-30 12:50     ` Auger Eric
2017-05-30 12:50       ` Auger Eric
2017-06-02 13:33   ` Christoffer Dall
2017-06-02 13:33     ` Christoffer Dall
2017-06-02 14:10     ` Marc Zyngier
2017-06-02 14:10       ` Marc Zyngier
2017-06-02 16:29       ` Christoffer Dall
2017-06-02 16:29         ` Christoffer Dall
2017-06-08  8:23         ` Marc Zyngier
2017-06-08  8:34           ` Christoffer Dall
2017-06-08  8:55             ` Auger Eric
2017-06-08  8:55               ` Auger Eric
2017-06-08 10:14               ` Christoffer Dall
2017-06-08 10:14                 ` Christoffer Dall
2017-06-08  8:49     ` Auger Eric
2017-06-08  8:49       ` Auger Eric
2017-06-08 10:11       ` Christoffer Dall
2017-05-24 20:13 ` [PATCH 09/10] KVM: arm/arm64: vgic: Implement forwarding setting Eric Auger
2017-05-24 20:13   ` Eric Auger
2017-05-25 19:19   ` Marc Zyngier
2017-05-25 19:19     ` Marc Zyngier
2017-05-30 12:54     ` Auger Eric
2017-05-30 12:54       ` Auger Eric
2017-05-30 13:17       ` Marc Zyngier
2017-05-30 13:17         ` Marc Zyngier
2017-05-30 14:03         ` Auger Eric
2017-05-24 20:13 ` [PATCH 10/10] KVM: arm/arm64: register DEOI irq bypass consumer on ARM/ARM64 Eric Auger
2017-05-24 20:13   ` Eric Auger
