* [Qemu-devel] [PATCH v10 0/7] KVM platform device passthrough
@ 2015-02-13  3:47 Eric Auger
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 1/7] linux-headers: update VFIO header for VFIO platform drivers Eric Auger
                   ` (6 more replies)
  0 siblings, 7 replies; 24+ messages in thread
From: Eric Auger @ 2015-02-13  3:47 UTC (permalink / raw)
  To: eric.auger, eric.auger, christoffer.dall, qemu-devel,
	alex.williamson, peter.maydell, agraf, b.reynal, feng.wu
  Cc: kim.phillips, patches, a.rigo, afaerber, Bharat.Bhushan,
	a.motakis, pbonzini, kvmarm

This series enables KVM platform device passthrough.
It implements a VFIO platform device, derived from the VFIO PCI device.

The VFIO platform device uses the host VFIO platform driver which must
be bound to the assigned device prior to the QEMU system start.

- the guest can directly access the device register space
- assigned device IRQs are transparently routed to the guest by
  QEMU/KVM (2 methods are supported in this series: user-level eventfd
  handling and irqfd). The forwarding method is proposed in a separate series.
- the IOMMU is transparently programmed to prevent the device from
  accessing physical pages outside of the guest address space
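
For readers unfamiliar with the user-level eventfd method mentioned above,
here is a minimal sketch of the signalling primitive only. It is not code
from this series: the irq_eventfd_* helper names are hypothetical, and the
trigger function merely stands in for the host VFIO driver signalling the
eventfd (it is also roughly what a manual eventfd trigger for testing does).

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: create the eventfd a physical IRQ would be bound to. */
int irq_eventfd_create(void)
{
    /* Counter starts at 0; non-blocking so that a spurious read cannot hang. */
    return eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
}

/* Hypothetical helper: stand-in for the kernel side signalling one IRQ. */
int irq_eventfd_trigger(int fd)
{
    uint64_t one = 1;

    /* Writing an 8-byte value adds it to the eventfd counter. */
    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Hypothetical helper: user-side handling. Returns the number of triggers
 * accumulated since the last read, 0 if nothing is pending, -1 on error.
 */
int64_t irq_eventfd_consume(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    uint64_t count;
    int ret;

    assert(fd >= 0);

    ret = poll(&pfd, 1, 0);         /* timeout 0: just check readiness */
    if (ret < 0) {
        return -1;
    }
    if (ret == 0) {
        return 0;                   /* counter is 0, nothing to handle */
    }
    /* Reading returns the counter value and resets it to 0. */
    if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        return -1;
    }
    return (int64_t)count;
}
```

In a real setup the consume step would be followed by injecting the virtual
IRQ into the guest; with irqfd that step is done by KVM itself and user
space never reads the eventfd.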

This patch series is made of the following patch file groups:

group1: 1-6) VFIO abstract device & calxeda midway platform device
        without irqfd support
group2: 7) VFIO platform device with irqfd support

The 2 groups should be separately upstreamable in that order.

- the 2 groups depend on [1], [2], [3], [4]
- group2 depends on [5].

Dependency List:

QEMU dependencies:
[1] [PATCH v10 0/4] machvirt dynamic sysbus device instantiation
    Linaro, http://lists.gnu.org/archive/html/qemu-devel/2015-01/msg01733.html
[2] [RFC v2 0/2] explicit VGIC initialization in finalize function
    Linaro, http://lists.gnu.org/archive/html/qemu-devel/2015-01/msg01762.html

Kernel Dependencies:
[3] [PATCH v13 00/18] VFIO support for platform devices
    VOSYS, http://www.spinics.net/lists/kvm-arm/msg13414.html
[4] [PATCH v3 0/6] vfio: type1: support for ARM SMMUS with VFIO_IOMMU_TYPE1
    VOSYS, http://www.spinics.net/lists/kvm-arm/msg11738.html
[5] [PATCH v8] irqfd support for arm/arm64
    Eric Auger, https://lkml.org/lkml/2015/1/19/410

- kernel pieces can be found at:
  http://git.linaro.org/people/eric.auger/linux.git
  (branch irqfd_integ_v9)
- QEMU pieces can be found at:
  http://git.linaro.org/people/eric.auger/qemu.git
  (branch vfio_integ_v10)

The patch series was tested on Calxeda Midway (ARMv7), where one xgmac
is assigned to the KVM host while the second one is assigned to the guest.
The multiple-IRQ use case was emulated using a manual eventfd trigger.

Wiki for Calxeda Midway setup:
https://wiki.linaro.org/LEG/Engineering/Virtualization/Platform_Device_Passthrough_on_Midway

History:
v9->v10:
- rebase on "vfio: cleanup vfio_get_device error path, remove
  vfio_populate_device": vfio_populate_device is no longer called in
  vfio_get_device but in vfio_base_device_init
- update VFIO header according to vfio platform driver v13 (no AMBA)

v8->v9:
- rebase on 2.2.0 and machvirt dynamic sysbus instantiation v10
- v8 1-11 were pulled
- patch files related to forwarding are moved into a separate series since
  they depend on a kernel series still in RFC.
- introduction of basic VFIO platform device split into 3 patch files to
  ease the review (hope it will help).
- add an author in platform.c
- add deallocation in vfio_populate_device error case
- add patch file doing the VFIO header sync
- use VFIO_DEVICE_FLAGS_PLATFORM in vfio_populate_device
- rename calxeda_xgmac.c into calxeda-xgmac.c
- sysbus-fdt: add_calxeda_midway_xgmac_fdt_node g_free in case of errors
- reword of linux-headers patch files

v7->v8:
- rebase on v2.2.0-rc3 and integrate
  "Add skip_dump flag to ignore memory region during dump"
- KVM header evolution with subindex addition in kvm_arch_forwarded_irq
- split [PATCH v7 03/16] hw/vfio/pci: introduce VFIODevice into 4 patches
- vfio_compute_needs_reset does not return bool anymore
- add some comments about exposed MMIO region and IRQ in calxeda xgmac
  device
- vfio_[un]mask_irqindex renamed into vfio_[un]mask_single_irqindex
- rework IRQ startup: former machine init done notifier is replaced by a
  reset notifier. machine file passes the interrupt controller
  DeviceState handle (not the platform bus first irq parameter).
- sysbus-fdt:
  - move the add_fdt_node_functions array declaration between the device
    specific code and the generic code to avoid forward declarations of
    device specific functions
  - rename add_basic_vfio_fdt_node into add_calxeda_midway_xgmac_fdt_node
    emphasizing the fact it is xgmac specific

v6->v7:
- fake injection test modality removed
- VFIO_DEVICE_TYPE_PLATFORM only introduced with VFIO platform
- new helper functions to start VFIO IRQ on machine init done notifier
  (introduced in hw/vfio/platform: add vfio-platform support and notifier
  registration invoked in hw/arm/virt: add support for VFIO devices).
  vfio_start_irq_injection is replaced by vfio_register_irq_starter.

v5->v6:
- rebase on 2.1rc5 PCI code
- forwarded IRQ first integration
- vfio_device property renamed into host property
- split IRQ setup in different functions that match the 3 supported
  injection techniques (user handled eventfd, irqfd, forwarded IRQ):
  removes dynamic switch between injection methods
- introduce fake interrupts as a test modality:
  x makes it possible to test multiple IRQ user-side handling.
  x this is a test feature only: it allows triggering an fd as if the
    real physical IRQ hit. No virtual IRQ is injected into the guest
    but handling is simulated so that the state machine can be tested
- user handled eventfd:
  x add mutex to protect IRQ state & list manipulation,
  x correct misleading comment in vfio_intp_interrupt.
  x Fix bugs using fake interrupt modality
- irqfd no longer advertised in this patchset (handled in [3])
- VFIOPlatformDeviceClass becomes abstract and Calxeda xgmac device
  and class is re-introduced (as per v4)
- all DPRINTF removed in platform and replaced by trace-points
- corrects compilation with configure --disable-kvm
- simplifies the split for vfio_get_device and introduce a unique
  specialized function named vfio_populate_device
- group_list renamed into vfio_group_list
- hw/arm/dyn_sysbus_devtree.c currently only supports vfio-calxeda-xgmac
  instantiation. Needs to be specialized for other VFIO devices
- fix 2 bugs in dyn_sysbus_devtree (reg_attr index and compat)

v4->v5:
- rebase on v2.1.0 PCI code
- take into account Alex Williamson's comments on PCI code rework
  - trace updates in vfio_region_write/read
  - remove fd from VFIORegion
  - get/put cleanup
- bug fix: properly initialize the BAR region's vbasedev field
- misc cleanups in platform device
- device tree node generation removed from device and handled in
  hw/arm/dyn_sysbus_devtree.c
- remove "hw/vfio: add an example calxeda_xgmac": with removal of
  device tree node generation we do not have so many things to
  implement in that derived device yet. May be re-introduced later
  on if needed typically for reset/migration.
- no GSI routing table anymore

v3->v4 changes (Eric Auger, Alvise Rigo)
- rebase on last VFIO PCI code (v2.1.0-rc0)
- full git history rework to ease PCI code change review
- mv include files in hw/vfio
- DPRINTF reformatting temporarily moved out
- support of VFIO virq (removal of resamplefd handler on user-side)
- integration with sysbus dynamic instantiation framework
- removal of unrealize and cleanup routines until it is better
  understood what is really needed
- Support of VFIO for Amba devices should be handled in an inherited
  device to specialize the device tree generation (clock handle currently
  missing in framework however)
- "Always use eventfd as notifying mechanism" temporarily moved out
- static instantiation is not mainstream (although it remains possible)
  note if static instantiation is used, irqfd must be setup in machine file
  when virtual IRQ is known
- create the GSI routing table on qemu side

v2->v3 changes (Alvise Rigo, Eric Auger):
- Following Alex W's recommendations, further efforts to factorize the
  code between PCI and platform: introduction of VFIODevice and VFIORegion
  as base classes
- unique reset handler for platform and PCI
- cleanup following Kim's comments
- multiple IRQ support mechanics should be in place although not
  tested
- Better handling of MMIO multiple regions
- New features and fixes by Alvise (multiple compat string, exec
  flag, force eventfd usage, amba device tree support)
- irqfd support

v1->v2 changes (Kim Phillips, Eric Auger):
- IRQ initial support (legacy mode where eventfds are handled on
  user side)
- hacked dynamic instantiation

v1 (Kim Phillips):
- initial split between PCI and platform
- MMIO support only
- static instantiation

Best Regards

Eric

Eric Auger (7):
  linux-headers: update VFIO header for VFIO platform drivers
  hw/vfio/platform: vfio-platform skeleton
  hw/vfio/platform: add irq assignment
  hw/vfio/platform: add capability to start IRQ propagation
  hw/vfio: calxeda xgmac device
  hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation
  hw/vfio/platform: add irqfd support

 hw/arm/sysbus-fdt.c                  |  83 ++++
 hw/arm/virt.c                        |  15 +-
 hw/vfio/Makefile.objs                |   2 +
 hw/vfio/calxeda-xgmac.c              |  54 +++
 hw/vfio/platform.c                   | 747 +++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-calxeda-xgmac.h |  46 +++
 include/hw/vfio/vfio-common.h        |   1 +
 include/hw/vfio/vfio-platform.h      |  86 ++++
 linux-headers/linux/vfio.h           |  31 +-
 trace-events                         |  14 +
 10 files changed, 1062 insertions(+), 17 deletions(-)
 create mode 100644 hw/vfio/calxeda-xgmac.c
 create mode 100644 hw/vfio/platform.c
 create mode 100644 include/hw/vfio/vfio-calxeda-xgmac.h
 create mode 100644 include/hw/vfio/vfio-platform.h

-- 
1.8.3.2


* [Qemu-devel] [PATCH v10 1/7] linux-headers: update VFIO header for VFIO platform drivers
  2015-02-13  3:47 [Qemu-devel] [PATCH v10 0/7] KVM platform device passthrough Eric Auger
@ 2015-02-13  3:47 ` Eric Auger
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 2/7] hw/vfio/platform: vfio-platform skeleton Eric Auger
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 24+ messages in thread
From: Eric Auger @ 2015-02-13  3:47 UTC (permalink / raw)
  To: eric.auger, eric.auger, christoffer.dall, qemu-devel,
	alex.williamson, peter.maydell, agraf, b.reynal, feng.wu
  Cc: kim.phillips, patches, a.rigo, afaerber, Bharat.Bhushan,
	a.motakis, pbonzini, kvmarm

Update according to vfio.h header found in
http://git.linaro.org/people/eric.auger/linux.git
branch irqfd_integ_v9

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v9 -> v10:
- AMBA removed

v8 -> v9:
- rewording of the commit message
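
As an illustration of how the two new flag bits in this header update could
be consumed, here is a small self-contained sketch. The bit values are
copied from the hunks below; they are redefined locally with an EX_ prefix
so that the sketch builds without the updated linux/vfio.h. The two helper
functions are hypothetical and not part of this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the values defined in the header hunks of this patch. */
#define EX_VFIO_DEVICE_FLAGS_RESET    (1u << 0)
#define EX_VFIO_DEVICE_FLAGS_PCI      (1u << 1)
#define EX_VFIO_DEVICE_FLAGS_PLATFORM (1u << 2)

#define EX_VFIO_DMA_MAP_FLAG_READ     (1u << 0)
#define EX_VFIO_DMA_MAP_FLAG_WRITE    (1u << 1)
#define EX_VFIO_DMA_MAP_FLAG_NOEXEC   (1u << 2)

/* Hypothetical helper: does the device info report a vfio-platform device? */
bool ex_is_platform_device(uint32_t device_flags)
{
    return (device_flags & EX_VFIO_DEVICE_FLAGS_PLATFORM) != 0;
}

/*
 * Hypothetical helper: build the flags for a DMA mapping. Following the
 * comment added to struct vfio_iommu_type1_dma_map, NOEXEC is refused
 * unless the container supports the VFIO_DMA_NOEXEC_IOMMU capability.
 */
int ex_dma_map_flags(bool writable, bool noexec, bool container_has_noexec,
                     uint32_t *flags)
{
    assert(flags != NULL);

    if (noexec && !container_has_noexec) {
        return -EINVAL;
    }
    *flags = EX_VFIO_DMA_MAP_FLAG_READ;
    if (writable) {
        *flags |= EX_VFIO_DMA_MAP_FLAG_WRITE;
    }
    if (noexec) {
        *flags |= EX_VFIO_DMA_MAP_FLAG_NOEXEC;
    }
    return 0;
}
```

How the container capability is actually queried is left out on purpose;
the boolean parameter is just an assumption standing in for that check.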
---
 linux-headers/linux/vfio.h | 31 ++++++++++++++++++-------------
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 0f21aa6..2ef4485 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -19,22 +19,21 @@
 
 /* Kernel & User level defines for VFIO IOCTLs. */
 
-/* Extensions */
-
-#define VFIO_TYPE1_IOMMU		1
-#define VFIO_SPAPR_TCE_IOMMU		2
-#define VFIO_TYPE1v2_IOMMU		3
 /*
- * IOMMU enforces DMA cache coherence (ex. PCIe NoSnoop stripping).  This
- * capability is subject to change as groups are added or removed.
+ * Capabilities exposed by the VFIO IOMMU driver. Some capabilities are subject
+ * to change as groups are added or removed.
  */
-#define VFIO_DMA_CC_IOMMU		4
-
-/* Check if EEH is supported */
-#define VFIO_EEH			5
+enum vfio_iommu_cap {
+	VFIO_TYPE1_IOMMU = 1,
+	VFIO_SPAPR_TCE_IOMMU = 2,
+	VFIO_TYPE1v2_IOMMU = 3,
+	VFIO_DMA_CC_IOMMU = 4,		/* IOMMU enforces DMA cache coherence
+					   (ex. PCIe NoSnoop stripping) */
+	VFIO_EEH = 5,			/* Check if EEH is supported */
+	VFIO_TYPE1_NESTING_IOMMU = 6,	/* Two-stage IOMMU, implies v2  */
+	VFIO_DMA_NOEXEC_IOMMU = 7,
+};
 
-/* Two-stage IOMMU */
-#define VFIO_TYPE1_NESTING_IOMMU	6	/* Implies v2 */
 
 /*
  * The IOCTL interface is designed for extensibility by embedding the
@@ -160,6 +159,7 @@ struct vfio_device_info {
 	__u32	flags;
 #define VFIO_DEVICE_FLAGS_RESET	(1 << 0)	/* Device supports reset */
 #define VFIO_DEVICE_FLAGS_PCI	(1 << 1)	/* vfio-pci device */
+#define VFIO_DEVICE_FLAGS_PLATFORM (1 << 2)	/* vfio-platform device */
 	__u32	num_regions;	/* Max region index + 1 */
 	__u32	num_irqs;	/* Max IRQ index + 1 */
 };
@@ -398,12 +398,17 @@ struct vfio_iommu_type1_info {
  *
  * Map process virtual addresses to IO virtual addresses using the
  * provided struct vfio_dma_map. Caller sets argsz. READ &/ WRITE required.
+ *
+ * To use the VFIO_DMA_MAP_FLAG_NOEXEC flag, the container must support the
+ * VFIO_DMA_NOEXEC_IOMMU capability. If mappings are created using this flag,
+ * any groups subsequently added to the container must support this capability.
  */
 struct vfio_iommu_type1_dma_map {
 	__u32	argsz;
 	__u32	flags;
 #define VFIO_DMA_MAP_FLAG_READ (1 << 0)		/* readable from device */
 #define VFIO_DMA_MAP_FLAG_WRITE (1 << 1)	/* writable from device */
+#define VFIO_DMA_MAP_FLAG_NOEXEC (1 << 2)	/* not executable from device */
 	__u64	vaddr;				/* Process virtual address */
 	__u64	iova;				/* IO virtual address */
 	__u64	size;				/* Size of mapping (bytes) */
-- 
1.8.3.2


* [Qemu-devel] [PATCH v10 2/7] hw/vfio/platform: vfio-platform skeleton
  2015-02-13  3:47 [Qemu-devel] [PATCH v10 0/7] KVM platform device passthrough Eric Auger
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 1/7] linux-headers: update VFIO header for VFIO platform drivers Eric Auger
@ 2015-02-13  3:47 ` Eric Auger
  2015-02-17 10:56     ` Alex Bennée
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 3/7] hw/vfio/platform: add irq assignment Eric Auger
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 24+ messages in thread
From: Eric Auger @ 2015-02-13  3:47 UTC (permalink / raw)
  To: eric.auger, eric.auger, christoffer.dall, qemu-devel,
	alex.williamson, peter.maydell, agraf, b.reynal, feng.wu
  Cc: kim.phillips, patches, Kim Phillips, a.rigo, afaerber,
	Bharat.Bhushan, a.motakis, pbonzini, kvmarm

Minimal VFIO platform implementation supporting register space
user mapping but not IRQ assignment.

Signed-off-by: Kim Phillips <kim.phillips@linaro.org>
Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v9 -> v10:
- vfio_populate_device is no longer called in common vfio_get_device
  but in vfio_base_device_init

v8 -> v9:
- irq management is moved into a separate patch to ease the review
- VFIO_DEVICE_FLAGS_PLATFORM is checked in vfio_populate_device
- g_free of regions added in vfio_populate_device error label
- virtualID becomes 32b

v7 -> v8:
- change proto of vfio_platform_compute_needs_reset and sets
  vbasedev->needs_reset to false there
- vfio_[un]mask_irqindex renamed into vfio_[un]mask_single_irqindex
- vfio_register_irq_starter renamed into vfio_kick_irqs.
  We now use a reset notifier instead of a machine init done notifier.
  This gets rid of the VfioIrqStarterNotifierParams dangling
  pointer. Previously we used the pbus first_irq. This is no longer
  possible since the reset notifier takes a void * and first_irq is a
  field of a const struct. So now we pass the DeviceState handle of the
  interrupt controller. I tried to keep the code generic, which is why
  I did not rely on an architecture-specific accessor to retrieve
  the gsi number (gic accessor as proposed by Alex). I would like to
  avoid creating an ARM VFIO device model. I hope this model
  can work on archs other than arm (no multiple intc?).
  Wouldn't it be simpler to keep the previous first_irq parameter and
  relax the const constraint?

v6 -> v7:
- compat is not exposed anymore as a user option. Rationale is
  the vfio device became abstract and a specialization is needed
  anyway. The derived device must set the compat string.
- in v6 vfio_start_irq_injection was exposed in vfio-platform.h.
  A new function dubbed vfio_register_irq_starter replaces it. It
  registers a machine init done notifier that programs & starts
  all dynamic VFIO device IRQs. This function is supposed to be
  called by the machine file. A set of static helper routines are
  added too. It must be called before the creation of the platform
  bus device.

v5 -> v6:
- vfio_device property renamed into host property
- correct error handling of VFIO_DEVICE_GET_IRQ_INFO ioctl
  and remove PCI related comment
- remove declaration of vfio_setup_irqfd and irqfd_allowed
  property. Both belong to next patch (irqfd)
- remove declaration of vfio_intp_interrupt in vfio-platform.h
- functions that can be static are made static
- remove declarations of vfio_region_ops, vfio_memory_listener,
  group_list, vfio_address_spaces. All are moved to vfio-common.h
- remove vfio_put_device declaration and definition
- print_regions removed. code moved into vfio_populate_regions
- replace DPRINTF by trace events
- new helper routine to set the trigger eventfd
- dissociate intp init from the injection enablement:
  vfio_enable_intp renamed into vfio_init_intp and new function
  named vfio_start_eventfd_injection
- injection start moved to vfio_start_irq_injection (no longer
  in vfio_populate_interrupt)
- new start_irq_fn field in VFIOPlatformDevice corresponding to
  the function that will be used for starting injection
- user handled eventfd:
  x add mutex to protect IRQ state & list manipulation,
  x correct misleading comment in vfio_intp_interrupt.
  x Fix bugs thanks to fake interrupt modality
- VFIOPlatformDeviceClass becomes abstract
- add error_setg in vfio_platform_realize

v4 -> v5:
- vfio-platform.h included first
- cleanup error handling in *populate*, vfio_get_device,
  vfio_enable_intp
- vfio_put_device not called anymore
- add some includes to follow vfio policy

v3 -> v4:
[Eric Auger]
- merge of "vfio: Add initial IRQ support in platform device"
  to get a fully functional patch although performance is limited.
- removal of unrealize function since I currently understand
  it is only used with device hot-plug feature.

v2 -> v3:
[Eric Auger]
- further factorization between PCI and platform (VFIORegion,
  VFIODevice). same level of functionality.

<= v2:
[Kim Phillips]
- Initial Creation of the device supporting register space mapping
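
To make the group lookup done by vfio_base_device_init easier to follow,
here is a standalone sketch of the two string-handling steps: building the
sysfs iommu_group link path for a platform device, and extracting the group
number from the link target. Both helpers are hypothetical (ex_ prefix) and
use snprintf/strtol rather than the strncat/basename/sscanf calls of the
patch; the actual readlink() call is left to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build the iommu_group symlink path for a device. */
int ex_iommu_group_link_path(const char *dev_name, char *buf, size_t size)
{
    int n;

    assert(dev_name != NULL && buf != NULL);

    n = snprintf(buf, size, "/sys/bus/platform/devices/%s/iommu_group",
                 dev_name);
    if (n < 0 || (size_t)n >= size) {
        return -ENAMETOOLONG;       /* output was truncated */
    }
    return 0;
}

/*
 * Hypothetical helper: extract the group id from a readlink() result such
 * as "../../../kernel/iommu_groups/26". The last path component must be a
 * plain non-negative decimal number.
 */
int ex_parse_iommu_group(const char *link_target, int *groupid)
{
    const char *base;
    char *end;
    long val;

    assert(link_target != NULL && groupid != NULL);

    base = strrchr(link_target, '/');
    base = base ? base + 1 : link_target;

    errno = 0;
    val = strtol(base, &end, 10);
    if (end == base || *end != '\0' || errno != 0 ||
        val < 0 || val > INT_MAX) {
        return -EINVAL;
    }
    *groupid = (int)val;
    return 0;
}
```

A caller would pass the first helper's output to readlink(), NUL-terminate
the result and hand it to the second helper. The device name used to try
this out (for instance "fff51000.ethernet") is only an example.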
---
 hw/vfio/Makefile.objs           |   1 +
 hw/vfio/platform.c              | 274 ++++++++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-common.h   |   1 +
 include/hw/vfio/vfio-platform.h |  44 +++++++
 trace-events                    |  12 ++
 5 files changed, 332 insertions(+)
 create mode 100644 hw/vfio/platform.c
 create mode 100644 include/hw/vfio/vfio-platform.h

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index e31f30e..c5c76fe 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -1,4 +1,5 @@
 ifeq ($(CONFIG_LINUX), y)
 obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o
+obj-$(CONFIG_SOFTMMU) += platform.o
 endif
diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
new file mode 100644
index 0000000..caadb92
--- /dev/null
+++ b/hw/vfio/platform.c
@@ -0,0 +1,274 @@
+/*
+ * vfio based device assignment support - platform devices
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Kim Phillips <kim.phillips@linaro.org>
+ *  Eric Auger <eric.auger@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Based on vfio based PCI device assignment support:
+ *  Copyright Red Hat, Inc. 2012
+ */
+
+#include <linux/vfio.h>
+#include <sys/ioctl.h>
+
+#include "hw/vfio/vfio-platform.h"
+#include "qemu/error-report.h"
+#include "qemu/range.h"
+#include "sysemu/sysemu.h"
+#include "exec/memory.h"
+#include "hw/sysbus.h"
+#include "trace.h"
+#include "hw/platform-bus.h"
+
+/* not implemented yet */
+static void vfio_platform_compute_needs_reset(VFIODevice *vbasedev)
+{
+    vbasedev->needs_reset = false;
+}
+
+/* not implemented yet */
+static int vfio_platform_hot_reset_multi(VFIODevice *vbasedev)
+{
+    return 0;
+}
+
+/**
+ * vfio_populate_device - initialize MMIO region and IRQ
+ * @vbasedev: the VFIO device
+ *
+ * query the VFIO device for exposed MMIO regions and IRQ and
+ * populate the associated fields in the device struct
+ */
+static int vfio_populate_device(VFIODevice *vbasedev)
+{
+    struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
+    int i, ret = -1;
+    VFIOPlatformDevice *vdev =
+        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
+
+    if (!(vbasedev->flags & VFIO_DEVICE_FLAGS_PLATFORM)) {
+        error_report("vfio: Um, this isn't a platform device");
+        return ret;
+    }
+
+    vdev->regions = g_malloc0(sizeof(VFIORegion *) * vbasedev->num_regions);
+
+    for (i = 0; i < vbasedev->num_regions; i++) {
+        vdev->regions[i] = g_malloc0(sizeof(VFIORegion));
+        reg_info.index = i;
+        ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, &reg_info);
+        if (ret) {
+            error_report("vfio: Error getting region %d info: %m", i);
+            goto error;
+        }
+        vdev->regions[i]->flags = reg_info.flags;
+        vdev->regions[i]->size = reg_info.size;
+        vdev->regions[i]->fd_offset = reg_info.offset;
+        vdev->regions[i]->nr = i;
+        vdev->regions[i]->vbasedev = vbasedev;
+
+        trace_vfio_platform_populate_regions(vdev->regions[i]->nr,
+                            (unsigned long)vdev->regions[i]->flags,
+                            (unsigned long)vdev->regions[i]->size,
+                            vdev->regions[i]->vbasedev->fd,
+                            (unsigned long)vdev->regions[i]->fd_offset);
+    }
+
+    return 0;
+error:
+    for (i = 0; i < vbasedev->num_regions; i++) {
+        g_free(vdev->regions[i]);
+    }
+    g_free(vdev->regions);
+    return ret;
+}
+
+/* specialized functions for VFIO Platform devices */
+static VFIODeviceOps vfio_platform_ops = {
+    .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
+    .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
+    .vfio_populate_device = vfio_populate_device,
+};
+
+/**
+ * vfio_base_device_init - implements some of the VFIO mechanics
+ * @vbasedev: the VFIO device
+ *
+ * retrieves the group the device belongs to and get the device fd
+ * returns the VFIO device fd
+ * precondition: the device name must be initialized
+ */
+static int vfio_base_device_init(VFIODevice *vbasedev)
+{
+    VFIOGroup *group;
+    VFIODevice *vbasedev_iter;
+    char path[PATH_MAX], iommu_group_path[PATH_MAX], *group_name;
+    ssize_t len;
+    struct stat st;
+    int groupid;
+    int ret;
+
+    /* name must be set prior to the call */
+    if (!vbasedev->name) {
+        return -EINVAL;
+    }
+
+    /* Check that the host device exists */
+    snprintf(path, sizeof(path), "/sys/bus/platform/devices/%s/",
+             vbasedev->name);
+
+    if (stat(path, &st) < 0) {
+        error_report("vfio: error: no such host device: %s", path);
+        return -errno;
+    }
+
+    strncat(path, "iommu_group", sizeof(path) - strlen(path) - 1);
+    len = readlink(path, iommu_group_path, sizeof(path));
+    if (len <= 0 || len >= sizeof(path)) {
+        error_report("vfio: error no iommu_group for device");
+        return len < 0 ? -errno : -ENAMETOOLONG;
+    }
+
+    iommu_group_path[len] = 0;
+    group_name = basename(iommu_group_path);
+
+    if (sscanf(group_name, "%d", &groupid) != 1) {
+        error_report("vfio: error reading %s: %m", path);
+        return -errno;
+    }
+
+    trace_vfio_platform_base_device_init(vbasedev->name, groupid);
+
+    group = vfio_get_group(groupid, &address_space_memory);
+    if (!group) {
+        error_report("vfio: failed to get group %d", groupid);
+        return -ENOENT;
+    }
+
+    snprintf(path, sizeof(path), "%s", vbasedev->name);
+
+    QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
+        if (strcmp(vbasedev_iter->name, vbasedev->name) == 0) {
+            error_report("vfio: error: device %s is already attached", path);
+            vfio_put_group(group);
+            return -EBUSY;
+        }
+    }
+    ret = vfio_get_device(group, path, vbasedev);
+    if (ret) {
+        error_report("vfio: failed to get device %s", path);
+        vfio_put_group(group);
+    }
+    return ret;
+}
+
+/**
+ * vfio_map_region - initialize the 2 mr (mmapped on ops) for a
+ * given index
+ * @vdev: the VFIO platform device
+ * @nr: the index of the region
+ *
+ * init the top memory region and the mmapped memory region beneath
+ * VFIOPlatformDevice is used since VFIODevice is not a QOM Object
+ * and could not be passed to memory region functions
+*/
+static void vfio_map_region(VFIOPlatformDevice *vdev, int nr)
+{
+    VFIORegion *region = vdev->regions[nr];
+    unsigned size = region->size;
+    char name[64];
+
+    if (!size) {
+        return;
+    }
+
+    snprintf(name, sizeof(name), "VFIO %s region %d",
+             vdev->vbasedev.name, nr);
+
+    /* A "slow" read/write mapping underlies all regions */
+    memory_region_init_io(&region->mem, OBJECT(vdev), &vfio_region_ops,
+                          region, name, size);
+
+    strncat(name, " mmap", sizeof(name) - strlen(name) - 1);
+
+    if (vfio_mmap_region(OBJECT(vdev), region, &region->mem,
+                         &region->mmap_mem, &region->mmap, size, 0, name)) {
+        error_report("%s unsupported. Performance may be slow", name);
+    }
+}
+
+/**
+ * vfio_platform_realize  - the device realize function
+ * @dev: device state pointer
+ * @errp: error
+ *
+ * initialize the device, its memory regions and IRQ structures
+ * IRQ are started separately
+ */
+static void vfio_platform_realize(DeviceState *dev, Error **errp)
+{
+    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
+    SysBusDevice *sbdev = SYS_BUS_DEVICE(dev);
+    VFIODevice *vbasedev = &vdev->vbasedev;
+    int i, ret;
+
+    vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
+    vbasedev->ops = &vfio_platform_ops;
+
+    trace_vfio_platform_realize(vbasedev->name, vdev->compat);
+
+    ret = vfio_base_device_init(vbasedev);
+    if (ret) {
+        error_setg(errp, "vfio: vfio_base_device_init failed for %s",
+                   vbasedev->name);
+        return;
+    }
+
+    for (i = 0; i < vbasedev->num_regions; i++) {
+        vfio_map_region(vdev, i);
+        sysbus_init_mmio(sbdev, &vdev->regions[i]->mem);
+    }
+}
+
+static const VMStateDescription vfio_platform_vmstate = {
+    .name = TYPE_VFIO_PLATFORM,
+    .unmigratable = 1,
+};
+
+static Property vfio_platform_dev_properties[] = {
+    DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void vfio_platform_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    dc->realize = vfio_platform_realize;
+    dc->props = vfio_platform_dev_properties;
+    dc->vmsd = &vfio_platform_vmstate;
+    dc->desc = "VFIO-based platform device assignment";
+    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+}
+
+static const TypeInfo vfio_platform_dev_info = {
+    .name = TYPE_VFIO_PLATFORM,
+    .parent = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(VFIOPlatformDevice),
+    .class_init = vfio_platform_class_init,
+    .class_size = sizeof(VFIOPlatformDeviceClass),
+    .abstract   = true,
+};
+
+static void register_vfio_platform_dev_type(void)
+{
+    type_register_static(&vfio_platform_dev_info);
+}
+
+type_init(register_vfio_platform_dev_type)
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 5f3679b..2d1d8b3 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -43,6 +43,7 @@
 
 enum {
     VFIO_DEVICE_TYPE_PCI = 0,
+    VFIO_DEVICE_TYPE_PLATFORM = 1,
 };
 
 typedef struct VFIORegion {
diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
new file mode 100644
index 0000000..338f0c6
--- /dev/null
+++ b/include/hw/vfio/vfio-platform.h
@@ -0,0 +1,44 @@
+/*
+ * vfio based device assignment support - platform devices
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Kim Phillips <kim.phillips@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Based on vfio based PCI device assignment support:
+ *  Copyright Red Hat, Inc. 2012
+ */
+
+#ifndef HW_VFIO_VFIO_PLATFORM_H
+#define HW_VFIO_VFIO_PLATFORM_H
+
+#include "hw/sysbus.h"
+#include "hw/vfio/vfio-common.h"
+
+#define TYPE_VFIO_PLATFORM "vfio-platform"
+
+typedef struct VFIOPlatformDevice {
+    SysBusDevice sbdev;
+    VFIODevice vbasedev; /* not a QOM object */
+    VFIORegion **regions;
+    char *compat; /* compatibility string */
+} VFIOPlatformDevice;
+
+typedef struct VFIOPlatformDeviceClass {
+    /*< private >*/
+    SysBusDeviceClass parent_class;
+    /*< public >*/
+} VFIOPlatformDeviceClass;
+
+#define VFIO_PLATFORM_DEVICE(obj) \
+     OBJECT_CHECK(VFIOPlatformDevice, (obj), TYPE_VFIO_PLATFORM)
+#define VFIO_PLATFORM_DEVICE_CLASS(klass) \
+     OBJECT_CLASS_CHECK(VFIOPlatformDeviceClass, (klass), TYPE_VFIO_PLATFORM)
+#define VFIO_PLATFORM_DEVICE_GET_CLASS(obj) \
+     OBJECT_GET_CLASS(VFIOPlatformDeviceClass, (obj), TYPE_VFIO_PLATFORM)
+
+#endif /*HW_VFIO_VFIO_PLATFORM_H*/
diff --git a/trace-events b/trace-events
index f87b077..d3685c9 100644
--- a/trace-events
+++ b/trace-events
@@ -1556,6 +1556,18 @@ vfio_put_group(int fd) "close group->fd=%d"
 vfio_get_device(const char * name, unsigned int flags, unsigned int num_regions, unsigned int num_irqs) "Device %s flags: %u, regions: %u, irqs: %u"
 vfio_put_base_device(int fd) "close vdev->fd=%d"
 
+# hw/vfio/platform.c
+vfio_platform_eoi(int pin, int fd) "EOI IRQ pin %d (fd=%d)"
+vfio_platform_mmap_set_enabled(bool enabled) "fast path = %d"
+vfio_platform_intp_mmap_enable(int pin) "IRQ #%d still active, stay in slow path"
+vfio_platform_intp_interrupt(int pin, int fd) "Handle IRQ #%d (fd = %d)"
+vfio_platform_populate_interrupts(int pin, int count, int flags) "- IRQ index %d: count %d, flags=0x%x"
+vfio_platform_populate_regions(int region_index, unsigned long flag, unsigned long size, int fd, unsigned long offset) "- region %d flags = 0x%lx, size = 0x%lx, fd= %d, offset = 0x%lx"
+vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
+vfio_platform_realize(char *name, char *compat) "vfio device %s, compat = %s"
+vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING"
+vfio_platform_eoi_handle_pending(int index) "handle PENDING IRQ %d"
+
 #hw/acpi/memory_hotplug.c
 mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32
 mhp_acpi_read_addr_lo(uint32_t slot, uint32_t addr) "slot[0x%"PRIx32"] addr lo: 0x%"PRIx32
-- 
1.8.3.2


* [Qemu-devel] [PATCH v10 3/7] hw/vfio/platform: add irq assignment
  2015-02-13  3:47 [Qemu-devel] [PATCH v10 0/7] KVM platform device passthrough Eric Auger
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 1/7] linux-headers: update VFIO header for VFIO platform drivers Eric Auger
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 2/7] hw/vfio/platform: vfio-platform skeleton Eric Auger
@ 2015-02-13  3:47 ` Eric Auger
  2015-02-17 11:24     ` Alex Bennée
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 4/7] hw/vfio/platform: add capability to start IRQ propagation Eric Auger
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 24+ messages in thread
From: Eric Auger @ 2015-02-13  3:47 UTC (permalink / raw)
  To: eric.auger, eric.auger, christoffer.dall, qemu-devel,
	alex.williamson, peter.maydell, agraf, b.reynal, feng.wu
  Cc: kim.phillips, patches, a.rigo, afaerber, Bharat.Bhushan,
	a.motakis, pbonzini, kvmarm

This patch adds the code required to assign interrupts to
a guest. The interrupts are mediated through user-handled
eventfds only.

The mechanism to start IRQ handling is not there yet.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v8 -> v9:
- free irq related resources in case of error in vfio_populate_device
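
Note for readers: the INACTIVE/PENDING/ACTIVE handling implemented by
vfio_intp_interrupt and vfio_platform_eoi can be modelled outside QEMU.
The following is a minimal sketch, not part of the patch. All names
(model_dev, model_trigger, model_eoi, MAX_IRQS) are hypothetical
stand-ins, locking and the mmap timer are left out, and a trigger on an
IRQ that is not inactive is simply ignored as a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the VFIOINTp states; not QEMU types. */
enum irq_state { IRQ_INACTIVE = 0, IRQ_PENDING, IRQ_ACTIVE };

#define MAX_IRQS 8

struct model_dev {
    enum irq_state state[MAX_IRQS];
    int line[MAX_IRQS];     /* virtual IRQ line level seen by the guest */
    int pending[MAX_IRQS];  /* circular FIFO of pending IRQ indexes */
    size_t head, count;
};

/* Return 1 if any IRQ is currently active or pending. */
int model_busy(const struct model_dev *d)
{
    for (size_t i = 0; i < MAX_IRQS; i++) {
        if (d->state[i] != IRQ_INACTIVE) {
            return 1;
        }
    }
    return 0;
}

/* The eventfd of IRQ n fired: inject it now or queue it as pending. */
void model_trigger(struct model_dev *d, int n)
{
    assert(n >= 0 && n < MAX_IRQS);
    if (d->state[n] != IRQ_INACTIVE) {
        return; /* simplification: already pending or active */
    }
    if (model_busy(d)) {
        d->state[n] = IRQ_PENDING;
        d->pending[(d->head + d->count) % MAX_IRQS] = n;
        d->count++;
        return;
    }
    d->state[n] = IRQ_ACTIVE;
    d->line[n] = 1;
}

/* Guest completed the active IRQ: deassert it, then serve one pending IRQ. */
void model_eoi(struct model_dev *d)
{
    for (size_t i = 0; i < MAX_IRQS; i++) {
        if (d->state[i] == IRQ_ACTIVE) {
            d->state[i] = IRQ_INACTIVE;
            d->line[i] = 0;
            break; /* a single IRQ can be active at a time */
        }
    }
    if (d->count > 0) {
        int n = d->pending[d->head];
        d->head = (d->head + 1) % MAX_IRQS;
        d->count--;
        d->state[n] = IRQ_ACTIVE;
        d->line[n] = 1;
    }
}
```

In this model, triggering IRQs 2, 5 and 1 in that order leaves 2 active
and the other two pending; each model_eoi call then promotes the oldest
pending IRQ, which mirrors the one-at-a-time constraint described in the
vfio_platform_eoi comment.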
---
 hw/vfio/platform.c              | 319 ++++++++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-platform.h |  33 +++++
 2 files changed, 352 insertions(+)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index caadb92..b85ad6c 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -22,10 +22,259 @@
 #include "qemu/range.h"
 #include "sysemu/sysemu.h"
 #include "exec/memory.h"
+#include "qemu/queue.h"
 #include "hw/sysbus.h"
 #include "trace.h"
 #include "hw/platform-bus.h"
 
+static void vfio_intp_interrupt(VFIOINTp *intp);
+typedef void (*eventfd_user_side_handler_t)(VFIOINTp *intp);
+static int vfio_set_trigger_eventfd(VFIOINTp *intp,
+                                    eventfd_user_side_handler_t handler);
+
+/*
+ * Functions only used when eventfds are handled on the user side,
+ * i.e. without irqfd
+ */
+
+/**
+ * vfio_platform_eoi - IRQ completion routine
+ * @vbasedev: the VFIO device
+ *
+ * De-asserts the active virtual IRQ and unmasks the physical IRQ
+ * (masked by the VFIO driver). Handles pending IRQs, if any.
+ * The eoi function is called on the first access to any MMIO region
+ * after an IRQ was triggered. It is assumed this access corresponds
+ * to the IRQ status register reset. With such a mechanism, only a
+ * single IRQ can be handled at a time since there is no way to know
+ * which IRQ was completed by the guest (we would need additional
+ * details about the IRQ status register mask).
+ */
+static void vfio_platform_eoi(VFIODevice *vbasedev)
+{
+    VFIOINTp *intp;
+    VFIOPlatformDevice *vdev =
+        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
+
+    qemu_mutex_lock(&vdev->intp_mutex);
+    QLIST_FOREACH(intp, &vdev->intp_list, next) {
+        if (intp->state == VFIO_IRQ_ACTIVE) {
+            trace_vfio_platform_eoi(intp->pin,
+                                event_notifier_get_fd(&intp->interrupt));
+            intp->state = VFIO_IRQ_INACTIVE;
+
+            /* deassert the virtual IRQ and unmask physical one */
+            qemu_set_irq(intp->qemuirq, 0);
+            vfio_unmask_single_irqindex(vbasedev, intp->pin);
+
+            /* a single IRQ can be active at a time */
+            break;
+        }
+    }
+    /* in case there are pending IRQs, handle them one at a time */
+    if (!QSIMPLEQ_EMPTY(&vdev->pending_intp_queue)) {
+        intp = QSIMPLEQ_FIRST(&vdev->pending_intp_queue);
+        trace_vfio_platform_eoi_handle_pending(intp->pin);
+        qemu_mutex_unlock(&vdev->intp_mutex);
+        vfio_intp_interrupt(intp);
+        qemu_mutex_lock(&vdev->intp_mutex);
+        QSIMPLEQ_REMOVE_HEAD(&vdev->pending_intp_queue, pqnext);
+        qemu_mutex_unlock(&vdev->intp_mutex);
+    } else {
+        qemu_mutex_unlock(&vdev->intp_mutex);
+    }
+}
+
+/**
+ * vfio_mmap_set_enabled - enable/disable the fast path mode
+ * @vdev: the VFIO platform device
+ * @enabled: the target mmap state
+ *
+ * true ~ fast path = MMIO region is mmaped (no KVM TRAP)
+ * false ~ slow path = MMIO region is trapped and region callbacks
+ * are called. The slow path makes it possible to trap the guest
+ * reset of the IRQ status register.
+ */
+
+static void vfio_mmap_set_enabled(VFIOPlatformDevice *vdev, bool enabled)
+{
+    VFIORegion *region;
+    int i;
+
+    trace_vfio_platform_mmap_set_enabled(enabled);
+
+    for (i = 0; i < vdev->vbasedev.num_regions; i++) {
+        region = vdev->regions[i];
+
+        /* register space is unmapped to trap EOI */
+        memory_region_set_enabled(&region->mmap_mem, enabled);
+    }
+}
+
+/**
+ * vfio_intp_mmap_enable - timer function, restores the fast path
+ * if there is no more active IRQ
+ * @opaque: actually points to the VFIO platform device
+ *
+ * Called on mmap timer timeout, this function checks whether the
+ * IRQ is still active and, if not, restores the fast path.
+ * By construction a single eventfd is handled at a time.
+ * If the IRQ is still active, the timer is restarted.
+ */
+static void vfio_intp_mmap_enable(void *opaque)
+{
+    VFIOINTp *tmp;
+    VFIOPlatformDevice *vdev = (VFIOPlatformDevice *)opaque;
+
+    qemu_mutex_lock(&vdev->intp_mutex);
+    QLIST_FOREACH(tmp, &vdev->intp_list, next) {
+        if (tmp->state == VFIO_IRQ_ACTIVE) {
+            trace_vfio_platform_intp_mmap_enable(tmp->pin);
+            /* re-program the timer to check active status later */
+            timer_mod(vdev->mmap_timer,
+                      qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
+                          vdev->mmap_timeout);
+            qemu_mutex_unlock(&vdev->intp_mutex);
+            return;
+        }
+    }
+    vfio_mmap_set_enabled(vdev, true);
+    qemu_mutex_unlock(&vdev->intp_mutex);
+}
+
+/**
+ * vfio_intp_interrupt - The user-side eventfd handler
+ * @intp: the IRQ struct pointer (VFIOINTp *)
+ *
+ * the function can be entered
+ * - in event handler context: this IRQ is inactive
+ *   in that case, the vIRQ is injected into the guest if there
+ *   is no other active or pending IRQ.
+ * - in IOhandler context: this IRQ is pending.
+ *   there is no ACTIVE IRQ
+ */
+static void vfio_intp_interrupt(VFIOINTp *intp)
+{
+    int ret;
+    VFIOINTp *tmp;
+    VFIOPlatformDevice *vdev = intp->vdev;
+    bool delay_handling = false;
+
+    qemu_mutex_lock(&vdev->intp_mutex);
+    if (intp->state == VFIO_IRQ_INACTIVE) {
+        QLIST_FOREACH(tmp, &vdev->intp_list, next) {
+            if (tmp->state == VFIO_IRQ_ACTIVE ||
+                tmp->state == VFIO_IRQ_PENDING) {
+                delay_handling = true;
+                break;
+            }
+        }
+    }
+    if (delay_handling) {
+        /*
+         * the new IRQ gets a pending status and is pushed in
+         * the pending queue
+         */
+        intp->state = VFIO_IRQ_PENDING;
+        trace_vfio_intp_interrupt_set_pending(intp->pin);
+        QSIMPLEQ_INSERT_TAIL(&vdev->pending_intp_queue,
+                             intp, pqnext);
+        ret = event_notifier_test_and_clear(&intp->interrupt);
+        qemu_mutex_unlock(&vdev->intp_mutex);
+        return;
+    }
+
+    /* no active IRQ, the new IRQ can be forwarded to the guest */
+    trace_vfio_platform_intp_interrupt(intp->pin,
+                              event_notifier_get_fd(&intp->interrupt));
+
+    if (intp->state == VFIO_IRQ_INACTIVE) {
+        ret = event_notifier_test_and_clear(&intp->interrupt);
+        if (!ret) {
+            error_report("Error when clearing fd=%d (ret = %d)",
+                         event_notifier_get_fd(&intp->interrupt), ret);
+        }
+    } /* else this is a pending IRQ that moves to ACTIVE state */
+
+    intp->state = VFIO_IRQ_ACTIVE;
+
+    /* sets slow path */
+    vfio_mmap_set_enabled(vdev, false);
+
+    /* trigger the virtual IRQ */
+    qemu_set_irq(intp->qemuirq, 1);
+
+    /* schedule the mmap timer which will restore mmap path after EOI */
+    if (vdev->mmap_timeout) {
+        timer_mod(vdev->mmap_timer,
+                  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
+                      vdev->mmap_timeout);
+    }
+    qemu_mutex_unlock(&vdev->intp_mutex);
+}
+
+/**
+ * vfio_start_eventfd_injection - starts the virtual IRQ injection using
+ * user-side handled eventfds
+ * @intp: the IRQ struct pointer
+ */
+
+static int vfio_start_eventfd_injection(VFIOINTp *intp)
+{
+    int ret;
+    VFIODevice *vbasedev = &intp->vdev->vbasedev;
+
+    vfio_mask_single_irqindex(vbasedev, intp->pin);
+
+    ret = vfio_set_trigger_eventfd(intp, vfio_intp_interrupt);
+    if (ret) {
+        error_report("vfio: Error: Failed to pass IRQ fd to the driver: %m");
+        vfio_unmask_single_irqindex(vbasedev, intp->pin);
+        return ret;
+    }
+    vfio_unmask_single_irqindex(vbasedev, intp->pin);
+    return 0;
+}
+
+/*
+ * Functions used regardless of the injection method
+ */
+
+/**
+ * vfio_set_trigger_eventfd - set VFIO eventfd handling
+ * i.e. program the VFIO driver to associate a given IRQ index
+ * with an fd handler
+ *
+ * @intp: IRQ struct pointer
+ * @handler: handler to be called on eventfd trigger
+ */
+static int vfio_set_trigger_eventfd(VFIOINTp *intp,
+                                    eventfd_user_side_handler_t handler)
+{
+    VFIODevice *vbasedev = &intp->vdev->vbasedev;
+    struct vfio_irq_set *irq_set;
+    int argsz, ret;
+    int32_t *pfd;
+
+    argsz = sizeof(*irq_set) + sizeof(*pfd);
+    irq_set = g_malloc0(argsz);
+    irq_set->argsz = argsz;
+    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
+    irq_set->index = intp->pin;
+    irq_set->start = 0;
+    irq_set->count = 1;
+    pfd = (int32_t *)&irq_set->data;
+    *pfd = event_notifier_get_fd(&intp->interrupt);
+    qemu_set_fd_handler(*pfd, (IOHandler *)handler, NULL, intp);
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+    if (ret < 0) {
+        error_report("vfio: Failed to set trigger eventfd: %m");
+        qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
+    }
+    g_free(irq_set); /* freed last: pfd points into irq_set->data */
+    return ret;
+}
+
 /* not implemented yet */
 static void vfio_platform_compute_needs_reset(VFIODevice *vbasedev)
 {
@@ -39,6 +288,40 @@ static int vfio_platform_hot_reset_multi(VFIODevice *vbasedev)
 }
 
 /**
+ * vfio_init_intp - allocate, initialize the IRQ struct pointer
+ * and add it into the list of IRQ
+ * @vbasedev: the VFIO device
+ * @index: VFIO device IRQ index
+ */
+static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev, unsigned int index)
+{
+    int ret;
+    VFIOPlatformDevice *vdev =
+        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
+    SysBusDevice *sbdev = SYS_BUS_DEVICE(vdev);
+    VFIOINTp *intp;
+
+    /* allocate and populate a new VFIOINTp structure put in a queue list */
+    intp = g_malloc0(sizeof(*intp));
+    intp->vdev = vdev;
+    intp->pin = index;
+    intp->state = VFIO_IRQ_INACTIVE;
+    sysbus_init_irq(sbdev, &intp->qemuirq);
+
+    /* Get an eventfd for trigger */
+    ret = event_notifier_init(&intp->interrupt, 0);
+    if (ret) {
+        g_free(intp);
+        error_report("vfio: Error: trigger event_notifier_init failed");
+        return NULL;
+    }
+
+    /* store the new intp in qlist */
+    QLIST_INSERT_HEAD(&vdev->intp_list, intp, next);
+    return intp;
+}
+
+/**
  * vfio_populate_device - initialize MMIO region and IRQ
  * @vbasedev: the VFIO device
  *
@@ -47,7 +330,9 @@ static int vfio_platform_hot_reset_multi(VFIODevice *vbasedev)
  */
 static int vfio_populate_device(VFIODevice *vbasedev)
 {
+    struct vfio_irq_info irq = { .argsz = sizeof(irq) };
     struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
+    VFIOINTp *intp, *tmp;
     int i, ret = -1;
     VFIOPlatformDevice *vdev =
         container_of(vbasedev, VFIOPlatformDevice, vbasedev);
@@ -80,7 +365,37 @@ static int vfio_populate_device(VFIODevice *vbasedev)
                             (unsigned long)vdev->regions[i]->fd_offset);
     }
 
+    vdev->mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
+                                    vfio_intp_mmap_enable, vdev);
+
+    QSIMPLEQ_INIT(&vdev->pending_intp_queue);
+
+    for (i = 0; i < vbasedev->num_irqs; i++) {
+        irq.index = i;
+
+        ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_IRQ_INFO, &irq);
+        if (ret) {
+            error_printf("vfio: error getting device %s irq info",
+                         vbasedev->name);
+            goto irq_err;
+        } else {
+            trace_vfio_platform_populate_interrupts(irq.index,
+                                                    irq.count,
+                                                    irq.flags);
+            intp = vfio_init_intp(vbasedev, irq.index);
+            if (!intp) {
+                error_report("vfio: Error setting IRQ %d up", i);
+                goto irq_err;
+            }
+        }
+    }
     return 0;
+irq_err:
+    timer_del(vdev->mmap_timer);
+    QLIST_FOREACH_SAFE(intp, &vdev->intp_list, next, tmp) {
+        QLIST_REMOVE(intp, next);
+        g_free(intp);
+    }
 error:
     for (i = 0; i < vbasedev->num_regions; i++) {
         g_free(vdev->regions[i]);
@@ -93,6 +408,7 @@ error:
 static VFIODeviceOps vfio_platform_ops = {
     .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
     .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
+    .vfio_eoi = vfio_platform_eoi,
     .vfio_populate_device = vfio_populate_device,
 };
 
@@ -220,6 +536,7 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
 
     vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
     vbasedev->ops = &vfio_platform_ops;
+    vdev->start_irq_fn = vfio_start_eventfd_injection;
 
     trace_vfio_platform_realize(vbasedev->name, vdev->compat);
 
@@ -243,6 +560,8 @@ static const VMStateDescription vfio_platform_vmstate = {
 
 static Property vfio_platform_dev_properties[] = {
     DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
+    DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice,
+                       mmap_timeout, 1100),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
index 338f0c6..e55b711 100644
--- a/include/hw/vfio/vfio-platform.h
+++ b/include/hw/vfio/vfio-platform.h
@@ -18,16 +18,49 @@
 
 #include "hw/sysbus.h"
 #include "hw/vfio/vfio-common.h"
+#include "qemu/event_notifier.h"
+#include "qemu/queue.h"
+#include "hw/irq.h"
 
 #define TYPE_VFIO_PLATFORM "vfio-platform"
 
+enum {
+    VFIO_IRQ_INACTIVE = 0,
+    VFIO_IRQ_PENDING = 1,
+    VFIO_IRQ_ACTIVE = 2,
+    /* VFIO_IRQ_ACTIVE_AND_PENDING cannot happen with VFIO */
+};
+
+typedef struct VFIOINTp {
+    QLIST_ENTRY(VFIOINTp) next; /* entry for IRQ list */
+    QSIMPLEQ_ENTRY(VFIOINTp) pqnext; /* entry for pending IRQ queue */
+    EventNotifier interrupt; /* eventfd triggered on interrupt */
+    EventNotifier unmask; /* eventfd for unmask on QEMU bypass */
+    qemu_irq qemuirq;
+    struct VFIOPlatformDevice *vdev; /* back pointer to device */
+    int state; /* inactive, pending, active */
+    bool kvm_accel; /* set when QEMU bypass through KVM enabled */
+    uint8_t pin; /* index */
+    uint32_t virtualID; /* virtual IRQ */
+} VFIOINTp;
+
+typedef int (*start_irq_fn_t)(VFIOINTp *intp);
+
 typedef struct VFIOPlatformDevice {
     SysBusDevice sbdev;
     VFIODevice vbasedev; /* not a QOM object */
     VFIORegion **regions;
+    QLIST_HEAD(, VFIOINTp) intp_list; /* list of IRQ */
+    /* queue of pending IRQ */
+    QSIMPLEQ_HEAD(pending_intp_queue, VFIOINTp) pending_intp_queue;
     char *compat; /* compatibility string */
+    uint32_t mmap_timeout; /* delay to re-enable mmaps after interrupt */
+    QEMUTimer *mmap_timer; /* enable mmaps after periods w/o interrupts */
+    start_irq_fn_t start_irq_fn;
+    QemuMutex  intp_mutex;
 } VFIOPlatformDevice;
 
+
 typedef struct VFIOPlatformDeviceClass {
     /*< private >*/
     SysBusDeviceClass parent_class;
-- 
1.8.3.2

* [Qemu-devel] [PATCH v10 4/7] hw/vfio/platform: add capability to start IRQ propagation
  2015-02-13  3:47 [Qemu-devel] [PATCH v10 0/7] KVM platform device passthrough Eric Auger
                   ` (2 preceding siblings ...)
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 3/7] hw/vfio/platform: add irq assignment Eric Auger
@ 2015-02-13  3:47 ` Eric Auger
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 5/7] hw/vfio: calxeda xgmac device Eric Auger
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 24+ messages in thread
From: Eric Auger @ 2015-02-13  3:47 UTC (permalink / raw)
  To: eric.auger, eric.auger, christoffer.dall, qemu-devel,
	alex.williamson, peter.maydell, agraf, b.reynal, feng.wu
  Cc: kim.phillips, patches, a.rigo, afaerber, Bharat.Bhushan,
	a.motakis, pbonzini, kvmarm

Add a reset notifier function that starts the propagation of
interrupts to the guest.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v8 -> v9:
- handle failure in vfio_irq_starter
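
Note for readers: vfio_irq_starter resolves each intp->qemuirq to a GSI
by scanning the interrupt controller inputs until one matches, and the
loop has no explicit termination condition other than a match. The
following is a minimal sketch of a bounded variant, not part of the
patch. The names irq_handle and find_gsi are hypothetical, and the
controller inputs are modelled as a plain array instead of being
fetched with qdev_get_gpio_in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for qemu_irq: an opaque handle compared by address. */
typedef const void *irq_handle;

/*
 * Return the index (GSI) of 'wanted' among the controller inputs,
 * or -1 if no input line matches.
 */
int find_gsi(const irq_handle *inputs, size_t num_inputs, irq_handle wanted)
{
    assert(inputs != NULL || num_inputs == 0);

    for (size_t gsi = 0; gsi < num_inputs; gsi++) {
        if (inputs[gsi] == wanted) {
            return (int)gsi;
        }
    }
    return -1;
}
```

Returning -1 lets the caller report an unmatched IRQ through the same
error path that already handles a start_irq_fn failure.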
---
 hw/vfio/platform.c              | 52 +++++++++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-platform.h |  8 +++++++
 2 files changed, 60 insertions(+)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index b85ad6c..30798d8 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -553,6 +553,58 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
     }
 }
 
+/*
+ * Binding to the platform bus IRQ happens on a machine init done
+ * notifier registered by the platform bus. Only at that time is the
+ * absolute virtual IRQ (= GSI) known, which allows IRQFD to be set up.
+ * vfio_kick_irqs can typically be used as a reset notifier function.
+ */
+
+/* Start injection of IRQ for a specific VFIO device */
+static int vfio_irq_starter(SysBusDevice *sbdev, void *opaque)
+{
+    DeviceState *intc = (DeviceState *)opaque;
+    VFIOPlatformDevice *vdev;
+    VFIOINTp *intp;
+    qemu_irq irq;
+    int gsi, ret;
+
+    if (object_dynamic_cast(OBJECT(sbdev), TYPE_VFIO_PLATFORM)) {
+        vdev = VFIO_PLATFORM_DEVICE(sbdev);
+
+        QLIST_FOREACH(intp, &vdev->intp_list, next) {
+            gsi = 0;
+            while (1) {
+                irq = qdev_get_gpio_in(intc, gsi);
+                if (irq == intp->qemuirq) {
+                    break;
+                }
+                gsi++;
+            }
+            intp->virtualID = gsi;
+            ret = vdev->start_irq_fn(intp);
+            if (ret) {
+                error_report("%s unable to start propagation of IRQ index %d",
+                             vdev->vbasedev.name, intp->pin);
+                exit(1);
+            }
+        }
+    }
+    return 0;
+}
+
+/* loop on all VFIO platform devices and start their IRQ injection */
+void vfio_kick_irqs(void *data)
+{
+    DeviceState *intc = (DeviceState *)data;
+    DeviceState *dev =
+        qdev_find_recursive(sysbus_get_default(), TYPE_PLATFORM_BUS_DEVICE);
+    PlatformBusDevice *pbus = PLATFORM_BUS_DEVICE(dev);
+
+    assert(pbus->done_gathering);
+    foreach_dynamic_sysbus_device(vfio_irq_starter, intc);
+}
+
 static const VMStateDescription vfio_platform_vmstate = {
     .name = TYPE_VFIO_PLATFORM,
     .unmigratable = 1,
diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
index e55b711..bd1206e 100644
--- a/include/hw/vfio/vfio-platform.h
+++ b/include/hw/vfio/vfio-platform.h
@@ -74,4 +74,12 @@ typedef struct VFIOPlatformDeviceClass {
 #define VFIO_PLATFORM_DEVICE_GET_CLASS(obj) \
      OBJECT_GET_CLASS(VFIOPlatformDeviceClass, (obj), TYPE_VFIO_PLATFORM)
 
+/**
+ * vfio_kick_irqs - reset notifier that starts IRQ injection
+ * for all VFIO dynamic sysbus devices attached to the platform bus.
+ *
+ * @opaque: handle to the interrupt controller DeviceState*
+ */
+void vfio_kick_irqs(void *opaque);
+
 #endif /*HW_VFIO_VFIO_PLATFORM_H*/
-- 
1.8.3.2

* [Qemu-devel] [PATCH v10 5/7] hw/vfio: calxeda xgmac device
  2015-02-13  3:47 [Qemu-devel] [PATCH v10 0/7] KVM platform device passthrough Eric Auger
                   ` (3 preceding siblings ...)
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 4/7] hw/vfio/platform: add capability to start IRQ propagation Eric Auger
@ 2015-02-13  3:47 ` Eric Auger
  2015-02-17 11:29     ` Alex Bennée
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 6/7] hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation Eric Auger
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 7/7] hw/vfio/platform: add irqfd support Eric Auger
  6 siblings, 1 reply; 24+ messages in thread
From: Eric Auger @ 2015-02-13  3:47 UTC (permalink / raw)
  To: eric.auger, eric.auger, christoffer.dall, qemu-devel,
	alex.williamson, peter.maydell, agraf, b.reynal, feng.wu
  Cc: kim.phillips, patches, a.rigo, afaerber, Bharat.Bhushan,
	a.motakis, pbonzini, kvmarm

The platform device class has become abstract. This patch introduces
a calxeda xgmac device that can be instantiated on the command line
using the following option:

-device vfio-calxeda-xgmac,host="fff51000.ethernet"

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v8 -> v9:
- renamed calxeda_xgmac.c into calxeda-xgmac.c

v7 -> v8:
- add a comment in the header about the MMIO regions and IRQ which
  are exposed by the device

v5 -> v6:
- back again following Alex Graf's advice
- fix a bug related to compat override

v4 -> v5:
removed since device tree was moved to hw/arm/dyn_sysbus_devtree.c

v4: creation for device tree specialization
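
Note for readers: the derived device sets its compat string and then
chains to the realize function saved from the abstract parent class.
The following is a minimal sketch of that chaining in plain C, not part
of the patch and not using QOM. The types dev and dev_class and the
functions platform_realize, xgmac_realize and xgmac_class_init are
hypothetical stand-ins for the QEMU ones.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct dev;

/* Hypothetical class: the active realize hook plus the saved parent hook. */
struct dev_class {
    void (*realize)(struct dev *d);
    void (*parent_realize)(struct dev *d);
};

/* Hypothetical device instance with a back pointer to its class. */
struct dev {
    const struct dev_class *klass;
    const char *compat;
    int realized;
};

/* Parent realize: relies on the child having set compat beforehand. */
void platform_realize(struct dev *d)
{
    assert(d->compat != NULL);
    d->realized = 1;
}

/* Child realize: specialize the device, then chain to the parent. */
void xgmac_realize(struct dev *d)
{
    d->compat = "calxeda,hb-xgmac";
    d->klass->parent_realize(d);
}

/* Class init: save the inherited hook before overriding it. */
void xgmac_class_init(struct dev_class *k)
{
    k->parent_realize = k->realize;
    k->realize = xgmac_realize;
}
```

Saving the inherited hook in class init is what lets the child run its
own setup first and still reuse the parent's realize logic unchanged.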
---
 hw/arm/virt.c                        | 15 +++++++---
 hw/vfio/Makefile.objs                |  1 +
 hw/vfio/calxeda-xgmac.c              | 54 ++++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-calxeda-xgmac.h | 46 ++++++++++++++++++++++++++++++
 4 files changed, 112 insertions(+), 4 deletions(-)
 create mode 100644 hw/vfio/calxeda-xgmac.c
 create mode 100644 include/hw/vfio/vfio-calxeda-xgmac.h

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 9df9b60..c1e0a10 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -44,6 +44,7 @@
 #include "qemu/error-report.h"
 #include "hw/arm/sysbus-fdt.h"
 #include "hw/platform-bus.h"
+#include "hw/vfio/vfio-platform.h"
 
 #define NUM_VIRTIO_TRANSPORTS 32
 
@@ -342,7 +343,7 @@ static void fdt_add_gic_node(const VirtBoardInfo *vbi)
     qemu_fdt_setprop_cell(vbi->fdt, "/intc", "phandle", gic_phandle);
 }
 
-static void create_gic(const VirtBoardInfo *vbi, qemu_irq *pic)
+static DeviceState *create_gic(const VirtBoardInfo *vbi, qemu_irq *pic)
 {
     /* We create a standalone GIC v2 */
     DeviceState *gicdev;
@@ -390,6 +391,7 @@ static void create_gic(const VirtBoardInfo *vbi, qemu_irq *pic)
     }
 
     fdt_add_gic_node(vbi);
+    return gicdev;
 }
 
 static void create_uart(const VirtBoardInfo *vbi, qemu_irq *pic)
@@ -594,7 +596,8 @@ static void create_fw_cfg(const VirtBoardInfo *vbi)
     g_free(nodename);
 }
 
-static void create_platform_bus(VirtBoardInfo *vbi, qemu_irq *pic)
+static void create_platform_bus(VirtBoardInfo *vbi, qemu_irq *pic,
+                                DeviceState *gic)
 {
     DeviceState *dev;
     SysBusDevice *s;
@@ -633,6 +636,9 @@ static void create_platform_bus(VirtBoardInfo *vbi, qemu_irq *pic)
     memory_region_add_subregion(sysmem,
                                 platform_bus_params.platform_bus_base,
                                 sysbus_mmio_get_region(s, 0));
+
+    /* setup VFIO signaling/IRQFD for all VFIO platform sysbus devices */
+    qemu_register_reset(vfio_kick_irqs, gic);
 }
 
 static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
@@ -652,6 +658,7 @@ static void machvirt_init(MachineState *machine)
     MemoryRegion *ram = g_new(MemoryRegion, 1);
     const char *cpu_model = machine->cpu_model;
     VirtBoardInfo *vbi;
+    DeviceState *gic;
 
     if (!cpu_model) {
         cpu_model = "cortex-a15";
@@ -713,7 +720,7 @@ static void machvirt_init(MachineState *machine)
 
     create_flash(vbi);
 
-    create_gic(vbi, pic);
+    gic = create_gic(vbi, pic);
 
     create_uart(vbi, pic);
 
@@ -744,7 +751,7 @@ static void machvirt_init(MachineState *machine)
      * another notifier is registered which adds platform bus nodes.
      * Notifiers are executed in registration reverse order.
      */
-    create_platform_bus(vbi, pic);
+    create_platform_bus(vbi, pic, gic);
 }
 
 static bool virt_get_secure(Object *obj, Error **errp)
diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index c5c76fe..d540c9d 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -2,4 +2,5 @@ ifeq ($(CONFIG_LINUX), y)
 obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o
 obj-$(CONFIG_SOFTMMU) += platform.o
+obj-$(CONFIG_SOFTMMU) += calxeda-xgmac.o
 endif
diff --git a/hw/vfio/calxeda-xgmac.c b/hw/vfio/calxeda-xgmac.c
new file mode 100644
index 0000000..199e076
--- /dev/null
+++ b/hw/vfio/calxeda-xgmac.c
@@ -0,0 +1,54 @@
+/*
+ * calxeda xgmac example VFIO device
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Eric Auger <eric.auger@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "hw/vfio/vfio-calxeda-xgmac.h"
+
+static void calxeda_xgmac_realize(DeviceState *dev, Error **errp)
+{
+    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
+    VFIOCalxedaXgmacDeviceClass *k = VFIO_CALXEDA_XGMAC_DEVICE_GET_CLASS(dev);
+
+    vdev->compat = g_strdup("calxeda,hb-xgmac");
+
+    k->parent_realize(dev, errp);
+}
+
+static const VMStateDescription vfio_platform_vmstate = {
+    .name = TYPE_VFIO_CALXEDA_XGMAC,
+    .unmigratable = 1,
+};
+
+static void vfio_calxeda_xgmac_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    VFIOCalxedaXgmacDeviceClass *vcxc =
+        VFIO_CALXEDA_XGMAC_DEVICE_CLASS(klass);
+    vcxc->parent_realize = dc->realize;
+    dc->realize = calxeda_xgmac_realize;
+    dc->desc = "VFIO Calxeda XGMAC";
+}
+
+static const TypeInfo vfio_calxeda_xgmac_dev_info = {
+    .name = TYPE_VFIO_CALXEDA_XGMAC,
+    .parent = TYPE_VFIO_PLATFORM,
+    .instance_size = sizeof(VFIOCalxedaXgmacDevice),
+    .class_init = vfio_calxeda_xgmac_class_init,
+    .class_size = sizeof(VFIOCalxedaXgmacDeviceClass),
+};
+
+static void register_calxeda_xgmac_dev_type(void)
+{
+    type_register_static(&vfio_calxeda_xgmac_dev_info);
+}
+
+type_init(register_calxeda_xgmac_dev_type)
diff --git a/include/hw/vfio/vfio-calxeda-xgmac.h b/include/hw/vfio/vfio-calxeda-xgmac.h
new file mode 100644
index 0000000..f994775
--- /dev/null
+++ b/include/hw/vfio/vfio-calxeda-xgmac.h
@@ -0,0 +1,46 @@
+/*
+ * VFIO calxeda xgmac device
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Eric Auger <eric.auger@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef HW_VFIO_VFIO_CALXEDA_XGMAC_H
+#define HW_VFIO_VFIO_CALXEDA_XGMAC_H
+
+#include "hw/vfio/vfio-platform.h"
+
+#define TYPE_VFIO_CALXEDA_XGMAC "vfio-calxeda-xgmac"
+
+/**
+ * This device exposes:
+ * - a single MMIO region corresponding to its register space
+ * - 3 IRQS (main and 2 power related IRQs)
+ */
+typedef struct VFIOCalxedaXgmacDevice {
+    VFIOPlatformDevice vdev;
+} VFIOCalxedaXgmacDevice;
+
+typedef struct VFIOCalxedaXgmacDeviceClass {
+    /*< private >*/
+    VFIOPlatformDeviceClass parent_class;
+    /*< public >*/
+    DeviceRealize parent_realize;
+} VFIOCalxedaXgmacDeviceClass;
+
+#define VFIO_CALXEDA_XGMAC_DEVICE(obj) \
+     OBJECT_CHECK(VFIOCalxedaXgmacDevice, (obj), TYPE_VFIO_CALXEDA_XGMAC)
+#define VFIO_CALXEDA_XGMAC_DEVICE_CLASS(klass) \
+     OBJECT_CLASS_CHECK(VFIOCalxedaXgmacDeviceClass, (klass), \
+                        TYPE_VFIO_CALXEDA_XGMAC)
+#define VFIO_CALXEDA_XGMAC_DEVICE_GET_CLASS(obj) \
+     OBJECT_GET_CLASS(VFIOCalxedaXgmacDeviceClass, (obj), \
+                      TYPE_VFIO_CALXEDA_XGMAC)
+
+#endif
-- 
1.8.3.2

* [Qemu-devel] [PATCH v10 6/7] hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation
  2015-02-13  3:47 [Qemu-devel] [PATCH v10 0/7] KVM platform device passthrough Eric Auger
                   ` (4 preceding siblings ...)
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 5/7] hw/vfio: calxeda xgmac device Eric Auger
@ 2015-02-13  3:47 ` Eric Auger
  2015-02-17 11:36     ` Alex Bennée
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 7/7] hw/vfio/platform: add irqfd support Eric Auger
  6 siblings, 1 reply; 24+ messages in thread
From: Eric Auger @ 2015-02-13  3:47 UTC (permalink / raw)
  To: eric.auger, eric.auger, christoffer.dall, qemu-devel,
	alex.williamson, peter.maydell, agraf, b.reynal, feng.wu
  Cc: kim.phillips, patches, a.rigo, afaerber, Bharat.Bhushan,
	a.motakis, pbonzini, kvmarm

vfio-calxeda-xgmac can now be instantiated using the -device option.
The node creation function generates a very basic dt node composed
of the compat, reg and interrupts properties.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v8 -> v9:
- properly free resources in case of errors in
  add_calxeda_midway_xgmac_fdt_node

v7 -> v8:
- move the add_fdt_node_functions array declaration between the device
  specific code and the generic code to avoid forward declarations of
  device specific functions
- rename add_basic_vfio_fdt_node into
  add_calxeda_midway_xgmac_fdt_node

v6 -> v7:
- compat string re-formatting removed since compat string is not exposed
  anymore as a user option
- VFIO IRQ kick-off removed from sysbus-fdt and moved to VFIO platform
  device
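
Note for readers: the "interrupts" property built by
add_calxeda_midway_xgmac_fdt_node holds three big-endian 32-bit cells
per IRQ. The following is a minimal sketch of that encoding, not part
of the patch. The names put_be32 and build_interrupts are hypothetical.
The reading of the cells (0 = SPI, then the interrupt number, then
0x4 = level-sensitive, active high) is an assumption based on the ARM
GIC device tree binding, and the irq_start value used below is
illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in big-endian byte order, as FDT cells require. */
void put_be32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v >> 24);
    dst[1] = (uint8_t)(v >> 16);
    dst[2] = (uint8_t)(v >> 8);
    dst[3] = (uint8_t)v;
}

/*
 * Fill 'out' with one <type number flags> triplet per IRQ and return
 * the number of bytes written (12 per IRQ).
 */
size_t build_interrupts(uint8_t *out, const uint32_t *irq_numbers,
                        size_t num_irqs, uint32_t irq_start)
{
    assert(out != NULL && irq_numbers != NULL);

    for (size_t i = 0; i < num_irqs; i++) {
        uint8_t *cell = out + 12 * i;

        put_be32(cell, 0);                              /* type */
        put_be32(cell + 4, irq_numbers[i] + irq_start); /* number */
        put_be32(cell + 8, 0x4);                        /* flags */
    }
    return 12 * num_irqs;
}
```

With platform bus IRQ numbers 3 and 7 and an irq_start of 112, the
second cell of each triplet encodes 115 and 119 respectively.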
---
 hw/arm/sysbus-fdt.c | 83 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 83 insertions(+)

diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index 3038b94..d4f97f5 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -26,6 +26,8 @@
 #include "sysemu/device_tree.h"
 #include "hw/platform-bus.h"
 #include "sysemu/sysemu.h"
+#include "hw/vfio/vfio-platform.h"
+#include "hw/vfio/vfio-calxeda-xgmac.h"
 
 /*
  * internal struct that contains the information to create dynamic
@@ -53,11 +55,92 @@ typedef struct NodeCreationPair {
     int (*add_fdt_node_fn)(SysBusDevice *sbdev, void *opaque);
 } NodeCreationPair;
 
+/* Device Specific Code */
+
+/**
+ * add_calxeda_midway_xgmac_fdt_node
+ *
+ * Generates a very simple node with the following properties:
+ * compatible string, regs, interrupts
+ */
+static int add_calxeda_midway_xgmac_fdt_node(SysBusDevice *sbdev, void *opaque)
+{
+    PlatformBusFDTData *data = opaque;
+    PlatformBusDevice *pbus = data->pbus;
+    void *fdt = data->fdt;
+    const char *parent_node = data->pbus_node_name;
+    int compat_str_len;
+    char *nodename;
+    int i, ret = -1;
+    uint32_t *irq_attr;
+    uint64_t *reg_attr;
+    uint64_t mmio_base;
+    uint64_t irq_number;
+    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
+    VFIODevice *vbasedev = &vdev->vbasedev;
+    Object *obj = OBJECT(sbdev);
+
+    mmio_base = object_property_get_int(obj, "mmio[0]", NULL);
+
+    nodename = g_strdup_printf("%s/%s@%" PRIx64, parent_node,
+                               vbasedev->name,
+                               mmio_base);
+
+    qemu_fdt_add_subnode(fdt, nodename);
+
+    compat_str_len = strlen(vdev->compat) + 1;
+    qemu_fdt_setprop(fdt, nodename, "compatible",
+                          vdev->compat, compat_str_len);
+
+    reg_attr = g_new(uint64_t, vbasedev->num_regions*4);
+
+    for (i = 0; i < vbasedev->num_regions; i++) {
+        mmio_base = platform_bus_get_mmio_addr(pbus, sbdev, i);
+        reg_attr[4*i] = 1;
+        reg_attr[4*i+1] = mmio_base;
+        reg_attr[4*i+2] = 1;
+        reg_attr[4*i+3] = memory_region_size(&vdev->regions[i]->mem);
+    }
+
+    ret = qemu_fdt_setprop_sized_cells_from_array(fdt, nodename, "reg",
+                     vbasedev->num_regions*2, reg_attr);
+    if (ret) {
+        error_report("could not set reg property of node %s", nodename);
+        goto fail_reg;
+    }
+
+    irq_attr = g_new(uint32_t, vbasedev->num_irqs*3);
+
+    for (i = 0; i < vbasedev->num_irqs; i++) {
+        irq_number = platform_bus_get_irqn(pbus, sbdev, i)
+                         + data->irq_start;
+        irq_attr[3*i] = cpu_to_be32(0);
+        irq_attr[3*i+1] = cpu_to_be32(irq_number);
+        irq_attr[3*i+2] = cpu_to_be32(0x4);
+    }
+
+    ret = qemu_fdt_setprop(fdt, nodename, "interrupts",
+                     irq_attr, vbasedev->num_irqs*3*sizeof(uint32_t));
+    if (ret) {
+        error_report("could not set interrupts property of node %s",
+                     nodename);
+    }
+
+    g_free(irq_attr);
+fail_reg:
+    g_free(reg_attr);
+    g_free(nodename);
+    return ret;
+}
+
 /* list of supported dynamic sysbus devices */
 static const NodeCreationPair add_fdt_node_functions[] = {
+    {TYPE_VFIO_CALXEDA_XGMAC, add_calxeda_midway_xgmac_fdt_node},
     {"", NULL}, /* last element */
 };
 
+/* Generic Code */
+
 /**
  * add_fdt_node - add the device tree node of a dynamic sysbus device
  *
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Qemu-devel] [PATCH v10 7/7] hw/vfio/platform: add irqfd support
  2015-02-13  3:47 [Qemu-devel] [PATCH v10 0/7] KVM platform device passthrough Eric Auger
                   ` (5 preceding siblings ...)
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 6/7] hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation Eric Auger
@ 2015-02-13  3:47 ` Eric Auger
  2015-02-17 11:41     ` Alex Bennée
  6 siblings, 1 reply; 24+ messages in thread
From: Eric Auger @ 2015-02-13  3:47 UTC (permalink / raw)
  To: eric.auger, eric.auger, christoffer.dall, qemu-devel,
	alex.williamson, peter.maydell, agraf, b.reynal, feng.wu
  Cc: kim.phillips, patches, a.rigo, afaerber, Bharat.Bhushan,
	a.motakis, pbonzini, kvmarm

This patch optimizes IRQ handling by using the irqfd framework.

Instead of being handled on the user side, the eventfds are handled on
the kernel side using
- the KVM irqfd framework,
- the VFIO driver virqfd framework.

The virtual IRQ completion is trapped at the interrupt controller.
This removes the need for the fast/slow path swap.

Overall this brings significant performance improvements.

It depends on host kernel KVM irqfd support.

Signed-off-by: Alvise Rigo <a.rigo@virtualopensystems.com>
Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v5 -> v6
- rely on kvm_irqfds_enabled() and kvm_resamplefds_enabled()
- guard KVM code with #ifdef CONFIG_KVM

v3 -> v4:
[Alvise Rigo]
Use of VFIO Platform driver v6 unmask/virqfd feature and removal
of resamplefd handler. Physical IRQ unmasking is now done in
VFIO driver.

v3:
[Eric Auger]
initial support with resamplefd handled on QEMU side since the
unmask was not supported on VFIO platform driver v5.

Conflicts:
	hw/vfio/platform.c
---
 hw/vfio/platform.c              | 104 +++++++++++++++++++++++++++++++++++++++-
 include/hw/vfio/vfio-platform.h |   1 +
 trace-events                    |   2 +
 3 files changed, 106 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 30798d8..cadc824 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -26,6 +26,7 @@
 #include "hw/sysbus.h"
 #include "trace.h"
 #include "hw/platform-bus.h"
+#include "sysemu/kvm.h"
 
 static void vfio_intp_interrupt(VFIOINTp *intp);
 typedef void (*eventfd_user_side_handler_t)(VFIOINTp *intp);
@@ -237,6 +238,83 @@ static int vfio_start_eventfd_injection(VFIOINTp *intp)
 }
 
 /*
+ * Functions used for irqfd
+ */
+
+#ifdef CONFIG_KVM
+
+/**
+ * vfio_set_resample_eventfd - sets the resamplefd for an IRQ
+ * @intp: the IRQ struct pointer
+ * programs the VFIO driver to unmask this IRQ when the
+ * intp->unmask eventfd is triggered
+ */
+static int vfio_set_resample_eventfd(VFIOINTp *intp)
+{
+    VFIODevice *vbasedev = &intp->vdev->vbasedev;
+    struct vfio_irq_set *irq_set;
+    int argsz, ret;
+    int32_t *pfd;
+
+    argsz = sizeof(*irq_set) + sizeof(*pfd);
+    irq_set = g_malloc0(argsz);
+    irq_set->argsz = argsz;
+    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_UNMASK;
+    irq_set->index = intp->pin;
+    irq_set->start = 0;
+    irq_set->count = 1;
+    pfd = (int32_t *)&irq_set->data;
+    *pfd = event_notifier_get_fd(&intp->unmask);
+    qemu_set_fd_handler(*pfd, NULL, NULL, intp);
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+    if (ret < 0) {
+        error_report("vfio: Failed to set resample eventfd: %m");
+        qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
+    }
+    g_free(irq_set);
+    return ret;
+}
+
+/**
+ * vfio_start_irqfd_injection - starts irqfd injection for an IRQ
+ * programs VFIO driver with both the trigger and resamplefd
+ * programs KVM with the gsi, trigger & resample eventfds
+ */
+static int vfio_start_irqfd_injection(VFIOINTp *intp)
+{
+    struct kvm_irqfd irqfd = {
+        .fd = event_notifier_get_fd(&intp->interrupt),
+        .resamplefd = event_notifier_get_fd(&intp->unmask),
+        .gsi = intp->virtualID,
+        .flags = KVM_IRQFD_FLAG_RESAMPLE,
+    };
+
+    if (kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd)) {
+        error_report("vfio: Error: Failed to assign the irqfd: %m");
+        goto fail_irqfd;
+    }
+    if (vfio_set_trigger_eventfd(intp, NULL) < 0) {
+        goto fail_vfio;
+    }
+    if (vfio_set_resample_eventfd(intp) < 0) {
+        goto fail_vfio;
+    }
+
+    intp->kvm_accel = true;
+    trace_vfio_platform_start_irqfd_injection(intp->pin, intp->virtualID,
+                                     irqfd.fd, irqfd.resamplefd);
+    return 0;
+
+fail_vfio:
+    irqfd.flags = KVM_IRQFD_FLAG_DEASSIGN;
+    kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd);
+fail_irqfd:
+    return -1;
+}
+
+#endif
+
+/*
  * Functions used whatever the injection method
  */
 
@@ -315,6 +393,13 @@ static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev, unsigned int index)
         error_report("vfio: Error: trigger event_notifier_init failed ");
         return NULL;
     }
+    /* Get an eventfd for resample/unmask */
+    ret = event_notifier_init(&intp->unmask, 0);
+    if (ret) {
+        g_free(intp);
+        error_report("vfio: Error: resample event_notifier_init failed");
+        return NULL;
+    }
 
     /* store the new intp in qlist */
     QLIST_INSERT_HEAD(&vdev->intp_list, intp, next);
@@ -409,7 +494,6 @@ static VFIODeviceOps vfio_platform_ops = {
     .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
     .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
     .vfio_eoi = vfio_platform_eoi,
-    .vfio_populate_device = vfio_populate_device,
 };
 
 /**
@@ -481,6 +565,13 @@ static int vfio_base_device_init(VFIODevice *vbasedev)
         error_report("vfio: failed to get device %s", path);
         vfio_put_group(group);
     }
+
+    ret = vfio_populate_device(vbasedev);
+    if (ret) {
+        error_report("vfio: failed to populate device %s", path);
+        vfio_put_group(group);
+    }
+
     return ret;
 }
 
@@ -536,7 +627,17 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
 
     vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
     vbasedev->ops = &vfio_platform_ops;
+
+#ifdef CONFIG_KVM
+    if (kvm_irqfds_enabled() && kvm_resamplefds_enabled() &&
+        vdev->irqfd_allowed) {
+        vdev->start_irq_fn = vfio_start_irqfd_injection;
+    } else {
+        vdev->start_irq_fn = vfio_start_eventfd_injection;
+    }
+#else
     vdev->start_irq_fn = vfio_start_eventfd_injection;
+#endif
 
     trace_vfio_platform_realize(vbasedev->name, vdev->compat);
 
@@ -614,6 +715,7 @@ static Property vfio_platform_dev_properties[] = {
     DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
     DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice,
                        mmap_timeout, 1100),
+    DEFINE_PROP_BOOL("x-irqfd", VFIOPlatformDevice, irqfd_allowed, true),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
index bd1206e..097448b 100644
--- a/include/hw/vfio/vfio-platform.h
+++ b/include/hw/vfio/vfio-platform.h
@@ -58,6 +58,7 @@ typedef struct VFIOPlatformDevice {
     QEMUTimer *mmap_timer; /* enable mmaps after periods w/o interrupts */
     start_irq_fn_t start_irq_fn;
     QemuMutex  intp_mutex;
+    bool irqfd_allowed; /* debug option to force irqfd on/off */
 } VFIOPlatformDevice;
 
 
diff --git a/trace-events b/trace-events
index d3685c9..7a6a6aa 100644
--- a/trace-events
+++ b/trace-events
@@ -1567,6 +1567,8 @@ vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d
 vfio_platform_realize(char *name, char *compat) "vfio device %s, compat = %s"
 vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING"
 vfio_platform_eoi_handle_pending(int index) "handle PENDING IRQ %d"
+vfio_platform_start_irqfd_injection(int index, int gsi, int fd, int resamplefd) "IRQ index=%d, gsi =%d, fd = %d, resamplefd = %d"
+vfio_start_eventfd_injection(int index, int fd) "IRQ index=%d, fd = %d"
 
 #hw/acpi/memory_hotplug.c
 mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v10 2/7] hw/vfio/platform: vfio-platform skeleton
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 2/7] hw/vfio/platform: vfio-platform skeleton Eric Auger
@ 2015-02-17 10:56     ` Alex Bennée
  0 siblings, 0 replies; 24+ messages in thread
From: Alex Bennée @ 2015-02-17 10:56 UTC (permalink / raw)
  To: Eric Auger
  Cc: peter.maydell, eric.auger, patches, Kim Phillips, qemu-devel,
	agraf, alex.williamson, pbonzini, b.reynal, feng.wu, kvmarm,
	christoffer.dall


Eric Auger <eric.auger@linaro.org> writes:

> Minimal VFIO platform implementation supporting register space
> user mapping but not IRQ assignment.
>
> Signed-off-by: Kim Phillips <kim.phillips@linaro.org>
> Signed-off-by: Eric Auger <eric.auger@linaro.org>

See comments inline.

<snip>
> +/**
> + * vfio_populate_device - initialize MMIO region and IRQ
> + * @vbasedev: the VFIO device
> + *
> + * query the VFIO device for exposed MMIO regions and IRQ and
> + * populate the associated fields in the device struct
> + */
> +static int vfio_populate_device(VFIODevice *vbasedev)
> +{
> +    struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };

This could be inside the for block.

> +    int i, ret = -1;
> +    VFIOPlatformDevice *vdev =
> +        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
> +
> +    if (!(vbasedev->flags & VFIO_DEVICE_FLAGS_PLATFORM)) {
> +        error_report("vfio: Um, this isn't a platform device");
> +        goto error;
> +    }
> +
> +    vdev->regions = g_malloc0(sizeof(VFIORegion *) *
> vbasedev->num_regions);

I may have considered a g_malloc0_n but I see that's not actually used
in the rest of the code (newer glib?).

> +
> +    for (i = 0; i < vbasedev->num_regions; i++) {
> +        vdev->regions[i] = g_malloc0(sizeof(VFIORegion));

An intermediate VFIORegion *ptr here would have saved a bunch of typing
later on ;-) 

> +        reg_info.index = i;
> +        ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, &reg_info);
> +        if (ret) {
> +            error_report("vfio: Error getting region %d info: %m", i);
> +            goto error;
> +        }
> +        vdev->regions[i]->flags = reg_info.flags;
> +        vdev->regions[i]->size = reg_info.size;
> +        vdev->regions[i]->fd_offset = reg_info.offset;
> +        vdev->regions[i]->nr = i;
> +        vdev->regions[i]->vbasedev = vbasedev;
> +
> +        trace_vfio_platform_populate_regions(vdev->regions[i]->nr,
> +                            (unsigned long)vdev->regions[i]->flags,
> +                            (unsigned long)vdev->regions[i]->size,
> +                            vdev->regions[i]->vbasedev->fd,
> +                            (unsigned long)vdev->regions[i]->fd_offset);
> +    }
> +
> +    return 0;
> +error:
> +    for (i = 0; i < vbasedev->num_regions; i++) {
> +        g_free(vdev->regions[i]);
> +    }
> +    g_free(vdev->regions);
> +    return ret;
> +}
> +
> +/* specialized functions for VFIO Platform devices */
> +static VFIODeviceOps vfio_platform_ops = {
> +    .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
> +    .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
> +    .vfio_populate_device = vfio_populate_device,
> +};
> +
> +/**
> + * vfio_base_device_init - implements some of the VFIO mechanics
> + * @vbasedev: the VFIO device
> + *
> + * retrieves the group the device belongs to and get the device fd
> + * returns the VFIO device fd
> + * precondition: the device name must be initialized
> + */
> +static int vfio_base_device_init(VFIODevice *vbasedev)
> +{
> +    VFIOGroup *group;
> +    VFIODevice *vbasedev_iter;
> +    char path[PATH_MAX], iommu_group_path[PATH_MAX], *group_name;
> +    ssize_t len;
> +    struct stat st;
> +    int groupid;
> +    int ret;
> +
> +    /* name must be set prior to the call */
> +    if (!vbasedev->name) {
> +        return -EINVAL;
> +    }
> +
> +    /* Check that the host device exists */
> +    snprintf(path, sizeof(path), "/sys/bus/platform/devices/%s/",
> +             vbasedev->name);
> +
> +    if (stat(path, &st) < 0) {
> +        error_report("vfio: error: no such host device: %s", path);
> +        return -errno;
> +    }
> +
> +    strncat(path, "iommu_group", sizeof(path) - strlen(path) - 1);

Consider g_strlcat which has nicer max length semantics.

> +    len = readlink(path, iommu_group_path, sizeof(path));
> +    if (len <= 0 || len >= sizeof(path)) {

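readlink() never returns more than the buffer size it was given, so len
can at most equal sizeof(path); that case means the target may have been
truncated, which is what ENAMETOOLONG reports here.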

> +        error_report("vfio: error no iommu_group for device");
> +        return len < 0 ? -errno : -ENAMETOOLONG;
> +    }
> +
> +    iommu_group_path[len] = 0;
> +    group_name = basename(iommu_group_path);
> +
> +    if (sscanf(group_name, "%d", &groupid) != 1) {
> +        error_report("vfio: error reading %s: %m", path);
> +        return -errno;
> +    }
> +
> +    trace_vfio_platform_base_device_init(vbasedev->name, groupid);
> +
> +    group = vfio_get_group(groupid, &address_space_memory);
> +    if (!group) {
> +        error_report("vfio: failed to get group %d", groupid);
> +        return -ENOENT;
> +    }
> +
> +    snprintf(path, sizeof(path), "%s", vbasedev->name);
> +
> +    QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
> +        if (strcmp(vbasedev_iter->name, vbasedev->name) == 0) {
> +            error_report("vfio: error: device %s is already attached", path);
> +            vfio_put_group(group);
> +            return -EBUSY;
> +        }
> +    }
> +    ret = vfio_get_device(group, path, vbasedev);
> +    if (ret) {
> +        error_report("vfio: failed to get device %s", path);
> +        vfio_put_group(group);
> +    }
> +    return ret;
> +}
> +
> +/**
> + * vfio_map_region - initialize the 2 mr (mmapped on ops) for a
> + * given index
> + * @vdev: the VFIO platform device
> + * @nr: the index of the region
> + *
> + * init the top memory region and the mmapped memory region beneath
> + * VFIOPlatformDevice is used since VFIODevice is not a QOM Object
> + * and could not be passed to memory region functions
> +*/
> +static void vfio_map_region(VFIOPlatformDevice *vdev, int nr)
> +{
> +    VFIORegion *region = vdev->regions[nr];
> +    unsigned size = region->size;
> +    char name[64];
> +
> +    if (!size) {
> +        return;
> +    }
> +
> +    snprintf(name, sizeof(name), "VFIO %s region %d",
> +             vdev->vbasedev.name, nr);
> +
> +    /* A "slow" read/write mapping underlies all regions */
> +    memory_region_init_io(&region->mem, OBJECT(vdev), &vfio_region_ops,
> +                          region, name, size);
> +
> +    strncat(name, " mmap", sizeof(name) - strlen(name) - 1);

again consider g_strlcat

> +
> +    if (vfio_mmap_region(OBJECT(vdev), region, &region->mem,
> +                         &region->mmap_mem, &region->mmap, size, 0, name)) {
> +        error_report("%s unsupported. Performance may be slow", name);
> +    }
> +}
> +
> +/**
> + * vfio_platform_realize  - the device realize function
> + * @dev: device state pointer
> + * @errp: error
> + *
> + * initialize the device, its memory regions and IRQ structures
> + * IRQ are started separately
> + */
> +static void vfio_platform_realize(DeviceState *dev, Error **errp)
> +{
> +    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
> +    SysBusDevice *sbdev = SYS_BUS_DEVICE(dev);
> +    VFIODevice *vbasedev = &vdev->vbasedev;
> +    int i, ret;
> +
> +    vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
> +    vbasedev->ops = &vfio_platform_ops;
> +
> +    trace_vfio_platform_realize(vbasedev->name, vdev->compat);
> +
> +    ret = vfio_base_device_init(vbasedev);
> +    if (ret) {
> +        error_setg(errp, "vfio: vfio_base_device_init failed for %s",
> +                   vbasedev->name);
> +        return;
> +    }
> +
> +    for (i = 0; i < vbasedev->num_regions; i++) {
> +        vfio_map_region(vdev, i);
> +        sysbus_init_mmio(sbdev, &vdev->regions[i]->mem);
> +    }
> +}
> +
> +static const VMStateDescription vfio_platform_vmstate = {
> +    .name = TYPE_VFIO_PLATFORM,
> +    .unmigratable = 1,
> +};
> +
> +static Property vfio_platform_dev_properties[] = {
> +    DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
> +    DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void vfio_platform_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +
> +    dc->realize = vfio_platform_realize;
> +    dc->props = vfio_platform_dev_properties;
> +    dc->vmsd = &vfio_platform_vmstate;
> +    dc->desc = "VFIO-based platform device assignment";
> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> +}
> +
> +static const TypeInfo vfio_platform_dev_info = {
> +    .name = TYPE_VFIO_PLATFORM,
> +    .parent = TYPE_SYS_BUS_DEVICE,
> +    .instance_size = sizeof(VFIOPlatformDevice),
> +    .class_init = vfio_platform_class_init,
> +    .class_size = sizeof(VFIOPlatformDeviceClass),
> +    .abstract   = true,
> +};
> +
> +static void register_vfio_platform_dev_type(void)
> +{
> +    type_register_static(&vfio_platform_dev_info);
> +}
> +
> +type_init(register_vfio_platform_dev_type)
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index 5f3679b..2d1d8b3 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -43,6 +43,7 @@
>  
>  enum {
>      VFIO_DEVICE_TYPE_PCI = 0,
> +    VFIO_DEVICE_TYPE_PLATFORM = 1,
>  };
>  
>  typedef struct VFIORegion {
> diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
> new file mode 100644
> index 0000000..338f0c6
> --- /dev/null
> +++ b/include/hw/vfio/vfio-platform.h
> @@ -0,0 +1,44 @@
> +/*
> + * vfio based device assignment support - platform devices
> + *
> + * Copyright Linaro Limited, 2014
> + *
> + * Authors:
> + *  Kim Phillips <kim.phillips@linaro.org>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See
> + * the COPYING file in the top-level directory.
> + *
> + * Based on vfio based PCI device assignment support:
> + *  Copyright Red Hat, Inc. 2012
> + */
> +
> +#ifndef HW_VFIO_VFIO_PLATFORM_H
> +#define HW_VFIO_VFIO_PLATFORM_H
> +
> +#include "hw/sysbus.h"
> +#include "hw/vfio/vfio-common.h"
> +
> +#define TYPE_VFIO_PLATFORM "vfio-platform"
> +
> +typedef struct VFIOPlatformDevice {
> +    SysBusDevice sbdev;
> +    VFIODevice vbasedev; /* not a QOM object */
> +    VFIORegion **regions;
> +    char *compat; /* compatibility string */
> +} VFIOPlatformDevice;
> +
> +typedef struct VFIOPlatformDeviceClass {
> +    /*< private >*/
> +    SysBusDeviceClass parent_class;
> +    /*< public >*/
> +} VFIOPlatformDeviceClass;
> +
> +#define VFIO_PLATFORM_DEVICE(obj) \
> +     OBJECT_CHECK(VFIOPlatformDevice, (obj), TYPE_VFIO_PLATFORM)
> +#define VFIO_PLATFORM_DEVICE_CLASS(klass) \
> +     OBJECT_CLASS_CHECK(VFIOPlatformDeviceClass, (klass), TYPE_VFIO_PLATFORM)
> +#define VFIO_PLATFORM_DEVICE_GET_CLASS(obj) \
> +     OBJECT_GET_CLASS(VFIOPlatformDeviceClass, (obj), TYPE_VFIO_PLATFORM)
> +
> +#endif /*HW_VFIO_VFIO_PLATFORM_H*/
> diff --git a/trace-events b/trace-events
> index f87b077..d3685c9 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -1556,6 +1556,18 @@ vfio_put_group(int fd) "close group->fd=%d"
>  vfio_get_device(const char * name, unsigned int flags, unsigned int num_regions, unsigned int num_irqs) "Device %s flags: %u, regions: %u, irqs: %u"
>  vfio_put_base_device(int fd) "close vdev->fd=%d"
>  
> +# hw/vfio/platform.c
> +vfio_platform_eoi(int pin, int fd) "EOI IRQ pin %d (fd=%d)"
> +vfio_platform_mmap_set_enabled(bool enabled) "fast path = %d"
> +vfio_platform_intp_mmap_enable(int pin) "IRQ #%d still active, stay in slow path"
> +vfio_platform_intp_interrupt(int pin, int fd) "Handle IRQ #%d (fd = %d)"
> +vfio_platform_populate_interrupts(int pin, int count, int flags) "- IRQ index %d: count %d, flags=0x%x"
> +vfio_platform_populate_regions(int region_index, unsigned long flag, unsigned long size, int fd, unsigned long offset) "- region %d flags = 0x%lx, size = 0x%lx, fd= %d, offset = 0x%lx"
> +vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
> +vfio_platform_realize(char *name, char *compat) "vfio device %s, compat = %s"
> +vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING"
> +vfio_platform_eoi_handle_pending(int index) "handle PENDING IRQ %d"
> +
>  #hw/acpi/memory_hotplug.c
>  mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32
>  mhp_acpi_read_addr_lo(uint32_t slot, uint32_t addr) "slot[0x%"PRIx32"] addr lo: 0x%"PRIx32

-- 
Alex Bennée

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v10 2/7] hw/vfio/platform: vfio-platform skeleton
@ 2015-02-17 10:56     ` Alex Bennée
  0 siblings, 0 replies; 24+ messages in thread
From: Alex Bennée @ 2015-02-17 10:56 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, patches, Kim Phillips, qemu-devel, alex.williamson,
	pbonzini, feng.wu, kvmarm


Eric Auger <eric.auger@linaro.org> writes:

> Minimal VFIO platform implementation supporting register space
> user mapping but not IRQ assignment.
>
> Signed-off-by: Kim Phillips <kim.phillips@linaro.org>
> Signed-off-by: Eric Auger <eric.auger@linaro.org>

See comments inline.

<snip>
> +/**
> + * vfio_populate_device - initialize MMIO region and IRQ
> + * @vbasedev: the VFIO device
> + *
> + * query the VFIO device for exposed MMIO regions and IRQ and
> + * populate the associated fields in the device struct
> + */
> +static int vfio_populate_device(VFIODevice *vbasedev)
> +{
> +    struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };

This could be inside the for block.

> +    int i, ret = -1;
> +    VFIOPlatformDevice *vdev =
> +        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
> +
> +    if (!(vbasedev->flags & VFIO_DEVICE_FLAGS_PLATFORM)) {
> +        error_report("vfio: Um, this isn't a platform device");
> +        goto error;
> +    }
> +
> +    vdev->regions = g_malloc0(sizeof(VFIORegion *) *
> vbasedev->num_regions);

I may have considered a g_malloc0_n but I see that's not actually used
in the rest of the code (newer glib?).

> +
> +    for (i = 0; i < vbasedev->num_regions; i++) {
> +        vdev->regions[i] = g_malloc0(sizeof(VFIORegion));

An intermediate VFIORegion *ptr here would have saved a bunch of typing
later on ;-) 

> +        reg_info.index = i;
> +        ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, &reg_info);
> +        if (ret) {
> +            error_report("vfio: Error getting region %d info: %m", i);
> +            goto error;
> +        }
> +        vdev->regions[i]->flags = reg_info.flags;
> +        vdev->regions[i]->size = reg_info.size;
> +        vdev->regions[i]->fd_offset = reg_info.offset;
> +        vdev->regions[i]->nr = i;
> +        vdev->regions[i]->vbasedev = vbasedev;
> +
> +        trace_vfio_platform_populate_regions(vdev->regions[i]->nr,
> +                            (unsigned long)vdev->regions[i]->flags,
> +                            (unsigned long)vdev->regions[i]->size,
> +                            vdev->regions[i]->vbasedev->fd,
> +                            (unsigned long)vdev->regions[i]->fd_offset);
> +    }
> +
> +    return 0;
> +error:
> +    for (i = 0; i < vbasedev->num_regions; i++) {
> +        g_free(vdev->regions[i]);
> +    }
> +    g_free(vdev->regions);
> +    return ret;
> +}
> +
> +/* specialized functions ofr VFIO Platform devices */
> +static VFIODeviceOps vfio_platform_ops = {
> +    .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
> +    .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
> +    .vfio_populate_device = vfio_populate_device,
> +};
> +
> +/**
> + * vfio_base_device_init - implements some of the VFIO mechanics
> + * @vbasedev: the VFIO device
> + *
> + * retrieves the group the device belongs to and get the device fd
> + * returns the VFIO device fd
> + * precondition: the device name must be initialized
> + */
> +static int vfio_base_device_init(VFIODevice *vbasedev)
> +{
> +    VFIOGroup *group;
> +    VFIODevice *vbasedev_iter;
> +    char path[PATH_MAX], iommu_group_path[PATH_MAX], *group_name;
> +    ssize_t len;
> +    struct stat st;
> +    int groupid;
> +    int ret;
> +
> +    /* name must be set prior to the call */
> +    if (!vbasedev->name) {
> +        return -EINVAL;
> +    }
> +
> +    /* Check that the host device exists */
> +    snprintf(path, sizeof(path), "/sys/bus/platform/devices/%s/",
> +             vbasedev->name);
> +
> +    if (stat(path, &st) < 0) {
> +        error_report("vfio: error: no such host device: %s", path);
> +        return -errno;
> +    }
> +
> +    strncat(path, "iommu_group", sizeof(path) - strlen(path) - 1);

Consider g_strlcat which has nicer max length semantics.

> +    len = readlink(path, iommu_group_path, sizeof(path));
> +    if (len <= 0 || len >= sizeof(path)) {

readlink should never report more than sizeof(path) although that will
indicate a ENAMETOOLONG.

> +        error_report("vfio: error no iommu_group for device");
> +        return len < 0 ? -errno : ENAMETOOLONG;
> +    }
> +
> +    iommu_group_path[len] = 0;
> +    group_name = basename(iommu_group_path);
> +
> +    if (sscanf(group_name, "%d", &groupid) != 1) {
> +        error_report("vfio: error reading %s: %m", path);
> +        return -errno;
> +    }
> +
> +    trace_vfio_platform_base_device_init(vbasedev->name, groupid);
> +
> +    group = vfio_get_group(groupid, &address_space_memory);
> +    if (!group) {
> +        error_report("vfio: failed to get group %d", groupid);
> +        return -ENOENT;
> +    }
> +
> +    snprintf(path, sizeof(path), "%s", vbasedev->name);
> +
> +    QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
> +        if (strcmp(vbasedev_iter->name, vbasedev->name) == 0) {
> +            error_report("vfio: error: device %s is already attached", path);
> +            vfio_put_group(group);
> +            return -EBUSY;
> +        }
> +    }
> +    ret = vfio_get_device(group, path, vbasedev);
> +    if (ret) {
> +        error_report("vfio: failed to get device %s", path);
> +        vfio_put_group(group);
> +    }
> +    return ret;
> +}
> +
> +/**
> + * vfio_map_region - initialize the 2 mr (mmapped on ops) for a
> + * given index
> + * @vdev: the VFIO platform device
> + * @nr: the index of the region
> + *
> + * init the top memory region and the mmapped memroy region beneath
> + * VFIOPlatformDevice is used since VFIODevice is not a QOM Object
> + * and could not be passed to memory region functions
> +*/
> +static void vfio_map_region(VFIOPlatformDevice *vdev, int nr)
> +{
> +    VFIORegion *region = vdev->regions[nr];
> +    unsigned size = region->size;
> +    char name[64];
> +
> +    if (!size) {
> +        return;
> +    }
> +
> +    snprintf(name, sizeof(name), "VFIO %s region %d",
> +             vdev->vbasedev.name, nr);
> +
> +    /* A "slow" read/write mapping underlies all regions */
> +    memory_region_init_io(&region->mem, OBJECT(vdev), &vfio_region_ops,
> +                          region, name, size);
> +
> +    strncat(name, " mmap", sizeof(name) - strlen(name) - 1);

again consider g_strlcat

> +
> +    if (vfio_mmap_region(OBJECT(vdev), region, &region->mem,
> +                         &region->mmap_mem, &region->mmap, size, 0, name)) {
> +        error_report("%s unsupported. Performance may be slow", name);
> +    }
> +}
> +
> +/**
> + * vfio_platform_realize  - the device realize function
> + * @dev: device state pointer
> + * @errp: error
> + *
> + * initialize the device, its memory regions and IRQ structures
> + * IRQ are started separately
> + */
> +static void vfio_platform_realize(DeviceState *dev, Error **errp)
> +{
> +    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
> +    SysBusDevice *sbdev = SYS_BUS_DEVICE(dev);
> +    VFIODevice *vbasedev = &vdev->vbasedev;
> +    int i, ret;
> +
> +    vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
> +    vbasedev->ops = &vfio_platform_ops;
> +
> +    trace_vfio_platform_realize(vbasedev->name, vdev->compat);
> +
> +    ret = vfio_base_device_init(vbasedev);
> +    if (ret) {
> +        error_setg(errp, "vfio: vfio_base_device_init failed for %s",
> +                   vbasedev->name);
> +        return;
> +    }
> +
> +    for (i = 0; i < vbasedev->num_regions; i++) {
> +        vfio_map_region(vdev, i);
> +        sysbus_init_mmio(sbdev, &vdev->regions[i]->mem);
> +    }
> +}
> +
> +static const VMStateDescription vfio_platform_vmstate = {
> +    .name = TYPE_VFIO_PLATFORM,
> +    .unmigratable = 1,
> +};
> +
> +static Property vfio_platform_dev_properties[] = {
> +    DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
> +    DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void vfio_platform_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +
> +    dc->realize = vfio_platform_realize;
> +    dc->props = vfio_platform_dev_properties;
> +    dc->vmsd = &vfio_platform_vmstate;
> +    dc->desc = "VFIO-based platform device assignment";
> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> +}
> +
> +static const TypeInfo vfio_platform_dev_info = {
> +    .name = TYPE_VFIO_PLATFORM,
> +    .parent = TYPE_SYS_BUS_DEVICE,
> +    .instance_size = sizeof(VFIOPlatformDevice),
> +    .class_init = vfio_platform_class_init,
> +    .class_size = sizeof(VFIOPlatformDeviceClass),
> +    .abstract   = true,
> +};
> +
> +static void register_vfio_platform_dev_type(void)
> +{
> +    type_register_static(&vfio_platform_dev_info);
> +}
> +
> +type_init(register_vfio_platform_dev_type)
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index 5f3679b..2d1d8b3 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -43,6 +43,7 @@
>  
>  enum {
>      VFIO_DEVICE_TYPE_PCI = 0,
> +    VFIO_DEVICE_TYPE_PLATFORM = 1,
>  };
>  
>  typedef struct VFIORegion {
> diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
> new file mode 100644
> index 0000000..338f0c6
> --- /dev/null
> +++ b/include/hw/vfio/vfio-platform.h
> @@ -0,0 +1,44 @@
> +/*
> + * vfio based device assignment support - platform devices
> + *
> + * Copyright Linaro Limited, 2014
> + *
> + * Authors:
> + *  Kim Phillips <kim.phillips@linaro.org>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See
> + * the COPYING file in the top-level directory.
> + *
> + * Based on vfio based PCI device assignment support:
> + *  Copyright Red Hat, Inc. 2012
> + */
> +
> +#ifndef HW_VFIO_VFIO_PLATFORM_H
> +#define HW_VFIO_VFIO_PLATFORM_H
> +
> +#include "hw/sysbus.h"
> +#include "hw/vfio/vfio-common.h"
> +
> +#define TYPE_VFIO_PLATFORM "vfio-platform"
> +
> +typedef struct VFIOPlatformDevice {
> +    SysBusDevice sbdev;
> +    VFIODevice vbasedev; /* not a QOM object */
> +    VFIORegion **regions;
> +    char *compat; /* compatibility string */
> +} VFIOPlatformDevice;
> +
> +typedef struct VFIOPlatformDeviceClass {
> +    /*< private >*/
> +    SysBusDeviceClass parent_class;
> +    /*< public >*/
> +} VFIOPlatformDeviceClass;
> +
> +#define VFIO_PLATFORM_DEVICE(obj) \
> +     OBJECT_CHECK(VFIOPlatformDevice, (obj), TYPE_VFIO_PLATFORM)
> +#define VFIO_PLATFORM_DEVICE_CLASS(klass) \
> +     OBJECT_CLASS_CHECK(VFIOPlatformDeviceClass, (klass), TYPE_VFIO_PLATFORM)
> +#define VFIO_PLATFORM_DEVICE_GET_CLASS(obj) \
> +     OBJECT_GET_CLASS(VFIOPlatformDeviceClass, (obj), TYPE_VFIO_PLATFORM)
> +
> +#endif /*HW_VFIO_VFIO_PLATFORM_H*/
> diff --git a/trace-events b/trace-events
> index f87b077..d3685c9 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -1556,6 +1556,18 @@ vfio_put_group(int fd) "close group->fd=%d"
>  vfio_get_device(const char * name, unsigned int flags, unsigned int num_regions, unsigned int num_irqs) "Device %s flags: %u, regions: %u, irqs: %u"
>  vfio_put_base_device(int fd) "close vdev->fd=%d"
>  
> +# hw/vfio/platform.c
> +vfio_platform_eoi(int pin, int fd) "EOI IRQ pin %d (fd=%d)"
> +vfio_platform_mmap_set_enabled(bool enabled) "fast path = %d"
> +vfio_platform_intp_mmap_enable(int pin) "IRQ #%d still active, stay in slow path"
> +vfio_platform_intp_interrupt(int pin, int fd) "Handle IRQ #%d (fd = %d)"
> +vfio_platform_populate_interrupts(int pin, int count, int flags) "- IRQ index %d: count %d, flags=0x%x"
> +vfio_platform_populate_regions(int region_index, unsigned long flag, unsigned long size, int fd, unsigned long offset) "- region %d flags = 0x%lx, size = 0x%lx, fd= %d, offset = 0x%lx"
> +vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
> +vfio_platform_realize(char *name, char *compat) "vfio device %s, compat = %s"
> +vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING"
> +vfio_platform_eoi_handle_pending(int index) "handle PENDING IRQ %d"
> +
>  #hw/acpi/memory_hotplug.c
>  mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32
>  mhp_acpi_read_addr_lo(uint32_t slot, uint32_t addr) "slot[0x%"PRIx32"] addr lo: 0x%"PRIx32

-- 
Alex Bennée


* Re: [Qemu-devel] [PATCH v10 3/7] hw/vfio/platform: add irq assignment
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 3/7] hw/vfio/platform: add irq assignment Eric Auger
@ 2015-02-17 11:24     ` Alex Bennée
  0 siblings, 0 replies; 24+ messages in thread
From: Alex Bennée @ 2015-02-17 11:24 UTC (permalink / raw)
  To: Eric Auger
  Cc: peter.maydell, eric.auger, patches, qemu-devel, agraf,
	alex.williamson, pbonzini, b.reynal, feng.wu, kvmarm,
	christoffer.dall


Eric Auger <eric.auger@linaro.org> writes:

> This patch adds the code requested to assign interrupts to
> a guest. The interrupts are mediated through user handled
> eventfds only.
>
> The mechanics to start the IRQ handling is not yet there through.
>
> Signed-off-by: Eric Auger <eric.auger@linaro.org>

See comments inline.

>
> ---
>
> v8 -> v9:
> - free irq related resources in case of error in vfio_populate_device
> ---
>  hw/vfio/platform.c              | 319 ++++++++++++++++++++++++++++++++++++++++
>  include/hw/vfio/vfio-platform.h |  33 +++++
>  2 files changed, 352 insertions(+)
>
> diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
> index caadb92..b85ad6c 100644
> --- a/hw/vfio/platform.c
> +++ b/hw/vfio/platform.c
> @@ -22,10 +22,259 @@
>  #include "qemu/range.h"
>  #include "sysemu/sysemu.h"
>  #include "exec/memory.h"
> +#include "qemu/queue.h"
>  #include "hw/sysbus.h"
>  #include "trace.h"
>  #include "hw/platform-bus.h"
>  
> +static void vfio_intp_interrupt(VFIOINTp *intp);
> +typedef void (*eventfd_user_side_handler_t)(VFIOINTp *intp);
> +static int vfio_set_trigger_eventfd(VFIOINTp *intp,
> +                                    eventfd_user_side_handler_t handler);
> +
> +/*
> + * Functions only used when eventfd are handled on user-side
> + * ie. without irqfd
> + */
> +
> +/**
> + * vfio_platform_eoi - IRQ completion routine
> + * @vbasedev: the VFIO device
> + *
> + * de-asserts the active virtual IRQ and unmask the physical IRQ
> + * (masked by the  VFIO driver). Handle pending IRQs if any.
> + * eoi function is called on the first access to any MMIO region
> + * after an IRQ was triggered. It is assumed this access corresponds
> + * to the IRQ status register reset. With such a mechanism, a single
> + * IRQ can be handled at a time since there is no way to know which
> + * IRQ was completed by the guest (we would need additional details
> + * about the IRQ status register mask)
> + */
> +static void vfio_platform_eoi(VFIODevice *vbasedev)
> +{
> +    VFIOINTp *intp;
> +    VFIOPlatformDevice *vdev =
> +        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
> +
> +    qemu_mutex_lock(&vdev->intp_mutex);
> +    QLIST_FOREACH(intp, &vdev->intp_list, next) {
> +        if (intp->state == VFIO_IRQ_ACTIVE) {
> +            trace_vfio_platform_eoi(intp->pin,
> +                                event_notifier_get_fd(&intp->interrupt));
> +            intp->state = VFIO_IRQ_INACTIVE;
> +
> +            /* deassert the virtual IRQ and unmask physical one */
> +            qemu_set_irq(intp->qemuirq, 0);
> +            vfio_unmask_single_irqindex(vbasedev, intp->pin);
> +
> +            /* a single IRQ can be active at a time */
> +            break;
> +        }
> +    }
> +    /* in case there are pending IRQs, handle them one at a time */
> +    if (!QSIMPLEQ_EMPTY(&vdev->pending_intp_queue)) {
> +        intp = QSIMPLEQ_FIRST(&vdev->pending_intp_queue);
> +        trace_vfio_platform_eoi_handle_pending(intp->pin);
> +        qemu_mutex_unlock(&vdev->intp_mutex);
> +        vfio_intp_interrupt(intp);
> +        qemu_mutex_lock(&vdev->intp_mutex);
> +        QSIMPLEQ_REMOVE_HEAD(&vdev->pending_intp_queue, pqnext);
> +        qemu_mutex_unlock(&vdev->intp_mutex);

This locking is way too ugly. If the intp lock is protecting the
structures then releasing it so the child function can grab it again is
just asking for races to happen. Perhaps vfio_intp_interrupt can be
split into a _lockheld variant that is used here, with the other
version taking the lock before calling the _lockheld function.
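
A minimal sketch of that split, using plain pthreads and simplified
stand-in types rather than the real QEMU structures. Dev, Intp,
dev_eoi() and the intp_interrupt_lockheld() name are illustrative
assumptions, not code from the patch; the point is only that the EOI
path dequeues and activates the pending IRQ inside one critical section:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum { IRQ_INACTIVE, IRQ_PENDING, IRQ_ACTIVE };

/* Simplified stand-in for VFIOINTp (hypothetical) */
typedef struct Intp {
    int pin;
    int state;
    struct Intp *pqnext;        /* link in the pending queue */
} Intp;

/* Simplified stand-in for VFIOPlatformDevice (hypothetical) */
typedef struct Dev {
    pthread_mutex_t lock;       /* protects every field below and Intp.state */
    Intp *active;               /* at most one IRQ is active at a time */
    Intp *pending_head;
    Intp *pending_tail;
} Dev;

void dev_init(Dev *d)
{
    pthread_mutex_init(&d->lock, NULL);
    d->active = NULL;
    d->pending_head = d->pending_tail = NULL;
}

/* Caller must hold d->lock. */
void intp_interrupt_lockheld(Dev *d, Intp *intp)
{
    if (intp->state == IRQ_INACTIVE && (d->active || d->pending_head)) {
        /* another IRQ is in flight: queue this one */
        intp->state = IRQ_PENDING;
        intp->pqnext = NULL;
        if (d->pending_tail) {
            d->pending_tail->pqnext = intp;
        } else {
            d->pending_head = intp;
        }
        d->pending_tail = intp;
        return;
    }
    /* inactive with nothing in flight, or a dequeued pending IRQ */
    intp->state = IRQ_ACTIVE;
    d->active = intp;
}

/* Entry point for the eventfd handler: takes the lock itself. */
void intp_interrupt(Dev *d, Intp *intp)
{
    pthread_mutex_lock(&d->lock);
    intp_interrupt_lockheld(d, intp);
    pthread_mutex_unlock(&d->lock);
}

/* EOI path: a single critical section, no unlock/relock window. */
void dev_eoi(Dev *d)
{
    pthread_mutex_lock(&d->lock);
    if (d->active) {
        d->active->state = IRQ_INACTIVE;
        d->active = NULL;
    }
    if (d->pending_head) {
        Intp *next = d->pending_head;

        assert(next->state == IRQ_PENDING);
        d->pending_head = next->pqnext;
        if (!d->pending_head) {
            d->pending_tail = NULL;
        }
        next->pqnext = NULL;
        intp_interrupt_lockheld(d, next);
    }
    pthread_mutex_unlock(&d->lock);
}
```

With two interrupts a and b, intp_interrupt(&d, &a) makes a active,
intp_interrupt(&d, &b) queues b, and the first dev_eoi(&d) retires a and
promotes b without ever dropping the lock in between.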


> +    } else {
> +        qemu_mutex_unlock(&vdev->intp_mutex);
> +    }
> +}
> +
> +/**
> + * vfio_mmap_set_enabled - enable/disable the fast path mode
> + * @vdev: the VFIO platform device
> + * @enabled: the target mmap state
> + *
> + * true ~ fast path = MMIO region is mmaped (no KVM TRAP)
> + * false ~ slow path = MMIO region is trapped and region callbacks
> + * are called slow path enables to trap the IRQ status register
> + * guest reset
> +*/
> +
> +static void vfio_mmap_set_enabled(VFIOPlatformDevice *vdev, bool enabled)
> +{
> +    VFIORegion *region;

region could be declared inside the loop body, not that it matters too
much for a small function like this.
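
For illustration only, a standalone sketch of the narrower scope.
RegionSketch and regions_set_enabled() are made-up stand-ins, not the
VFIORegion type or the QEMU memory API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for VFIORegion: only tracks the mmap state */
typedef struct RegionSketch {
    bool mmap_enabled;
} RegionSketch;

void regions_set_enabled(RegionSketch **regions, size_t n, bool enabled)
{
    size_t i;

    assert(regions != NULL || n == 0);

    for (i = 0; i < n; i++) {
        /* declared where it is used, so it cannot leak out of the loop */
        RegionSketch *region = regions[i];

        region->mmap_enabled = enabled;
    }
}
```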

> +    int i;
> +
> +    trace_vfio_platform_mmap_set_enabled(enabled);
> +
> +    for (i = 0; i < vdev->vbasedev.num_regions; i++) {
> +        region = vdev->regions[i];
> +
> +        /* register space is unmapped to trap EOI */
> +        memory_region_set_enabled(&region->mmap_mem, enabled);
> +    }
> +}
> +
> +/**
> + * vfio_intp_mmap_enable - timer function, restores the fast path
> + * if there is no more active IRQ
> + * @opaque: actually points to the VFIO platform device
> + *
> + * Called on mmap timer timout, this function checks whether the
> + * IRQ is still active and in the negative restores the fast path.
> + * by construction a single eventfd is handled at a time.
> + * if the IRQ is still active, the timer is restarted.
> + */
> +static void vfio_intp_mmap_enable(void *opaque)
> +{
> +    VFIOINTp *tmp;
> +    VFIOPlatformDevice *vdev = (VFIOPlatformDevice *)opaque;
> +
> +    qemu_mutex_lock(&vdev->intp_mutex);
> +    QLIST_FOREACH(tmp, &vdev->intp_list, next) {
> +        if (tmp->state == VFIO_IRQ_ACTIVE) {
> +            trace_vfio_platform_intp_mmap_enable(tmp->pin);
> +            /* re-program the timer to check active status later */
> +            timer_mod(vdev->mmap_timer,
> +                      qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
> +                          vdev->mmap_timeout);
> +            qemu_mutex_unlock(&vdev->intp_mutex);
> +            return;
> +        }
> +    }
> +    vfio_mmap_set_enabled(vdev, true);
> +    qemu_mutex_unlock(&vdev->intp_mutex);
> +}
> +
> +/**
> + * vfio_intp_interrupt - The user-side eventfd handler
> + * @opaque: opaque pointer which in practice is the VFIOINTp*
> + *
> + * the function can be entered
> + * - in event handler context: this IRQ is inactive
> + *   in that case, the vIRQ is injected into the guest if there
> + *   is no other active or pending IRQ.
> + * - in IOhandler context: this IRQ is pending.
> + *   there is no ACTIVE IRQ
> + */
> +static void vfio_intp_interrupt(VFIOINTp *intp)
> +{
> +    int ret;
> +    VFIOINTp *tmp;
> +    VFIOPlatformDevice *vdev = intp->vdev;
> +    bool delay_handling = false;
> +
> +    qemu_mutex_lock(&vdev->intp_mutex);
> +    if (intp->state == VFIO_IRQ_INACTIVE) {
> +        QLIST_FOREACH(tmp, &vdev->intp_list, next) {
> +            if (tmp->state == VFIO_IRQ_ACTIVE ||
> +                tmp->state == VFIO_IRQ_PENDING) {
> +                delay_handling = true;
> +                break;
> +            }
> +        }
> +    }
> +    if (delay_handling) {
> +        /*
> +         * the new IRQ gets a pending status and is pushed in
> +         * the pending queue
> +         */
> +        intp->state = VFIO_IRQ_PENDING;
> +        trace_vfio_intp_interrupt_set_pending(intp->pin);
> +        QSIMPLEQ_INSERT_TAIL(&vdev->pending_intp_queue,
> +                             intp, pqnext);
> +        ret = event_notifier_test_and_clear(&intp->interrupt);
> +        qemu_mutex_unlock(&vdev->intp_mutex);
> +        return;
> +    }
> +
> +    /* no active IRQ, the new IRQ can be forwarded to the guest */
> +    trace_vfio_platform_intp_interrupt(intp->pin,
> +                              event_notifier_get_fd(&intp->interrupt));
> +
> +    if (intp->state == VFIO_IRQ_INACTIVE) {
> +        ret = event_notifier_test_and_clear(&intp->interrupt);
> +        if (!ret) {
> +            error_report("Error when clearing fd=%d (ret = %d)\n",
> +                         event_notifier_get_fd(&intp->interrupt), ret);
> +        }
> +    } /* else this is a pending IRQ that moves to ACTIVE state */
> +
> +    intp->state = VFIO_IRQ_ACTIVE;
> +
> +    /* sets slow path */
> +    vfio_mmap_set_enabled(vdev, false);
> +
> +    /* trigger the virtual IRQ */
> +    qemu_set_irq(intp->qemuirq, 1);
> +
> +    /* schedule the mmap timer which will restore mmap path after EOI*/
> +    if (vdev->mmap_timeout) {
> +        timer_mod(vdev->mmap_timer,
> +                  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
> +                      vdev->mmap_timeout);
> +    }
> +    qemu_mutex_unlock(&vdev->intp_mutex);

See above for comments about re-factoring this. It's not totally clear
what's being protected by the mutex: just the queues, or the intp
structures themselves?

> +}
> +
> +/**
> + * vfio_start_eventfd_injection - starts the virtual IRQ injection using
> + * user-side handled eventfds
> + * @intp: the IRQ struct pointer
> + */
> +
> +static int vfio_start_eventfd_injection(VFIOINTp *intp)
> +{
> +    int ret;
> +    VFIODevice *vbasedev = &intp->vdev->vbasedev;
> +
> +    vfio_mask_single_irqindex(vbasedev, intp->pin);
> +
> +    ret = vfio_set_trigger_eventfd(intp, vfio_intp_interrupt);
> +    if (ret) {
> +        error_report("vfio: Error: Failed to pass IRQ fd to the driver: %m");
> +        vfio_unmask_single_irqindex(vbasedev, intp->pin);
> +        return ret;
> +    }
> +    vfio_unmask_single_irqindex(vbasedev, intp->pin);
> +    return 0;
> +}
> +
> +/*
> + * Functions used whatever the injection method
> + */
> +
> +/**
> + * vfio_set_trigger_eventfd - set VFIO eventfd handling
> + * ie. program the VFIO driver to associates a given IRQ index
> + * with a fd handler
> + *
> + * @intp: IRQ struct pointer
> + * @handler: handler to be called on eventfd trigger
> + */
> +static int vfio_set_trigger_eventfd(VFIOINTp *intp,
> +                                    eventfd_user_side_handler_t handler)
> +{
> +    VFIODevice *vbasedev = &intp->vdev->vbasedev;
> +    struct vfio_irq_set *irq_set;
> +    int argsz, ret;
> +    int32_t *pfd;
> +
> +    argsz = sizeof(*irq_set) + sizeof(*pfd);
> +    irq_set = g_malloc0(argsz);
> +    irq_set->argsz = argsz;
> +    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
> +    irq_set->index = intp->pin;
> +    irq_set->start = 0;
> +    irq_set->count = 1;
> +    pfd = (int32_t *)&irq_set->data;
> +    *pfd = event_notifier_get_fd(&intp->interrupt);
> +    qemu_set_fd_handler(*pfd, (IOHandler *)handler, NULL, intp);
> +    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
> +    g_free(irq_set);
> +    if (ret < 0) {
> +        error_report("vfio: Failed to set trigger eventfd: %m");
> +        qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
> +    }
> +    return ret;
> +}
> +
>  /* not implemented yet */
>  static void vfio_platform_compute_needs_reset(VFIODevice *vbasedev)
>  {
> @@ -39,6 +288,40 @@ static int vfio_platform_hot_reset_multi(VFIODevice *vbasedev)
>  }
>  
>  /**
> + * vfio_init_intp - allocate, initialize the IRQ struct pointer
> + * and add it into the list of IRQ
> + * @vbasedev: the VFIO device
> + * @index: VFIO device IRQ index
> + */
> +static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev, unsigned int index)
> +{
> +    int ret;
> +    VFIOPlatformDevice *vdev =
> +        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
> +    SysBusDevice *sbdev = SYS_BUS_DEVICE(vdev);
> +    VFIOINTp *intp;
> +
> +    /* allocate and populate a new VFIOINTp structure put in a queue list */
> +    intp = g_malloc0(sizeof(*intp));
> +    intp->vdev = vdev;
> +    intp->pin = index;
> +    intp->state = VFIO_IRQ_INACTIVE;
> +    sysbus_init_irq(sbdev, &intp->qemuirq);
> +
> +    /* Get an eventfd for trigger */
> +    ret = event_notifier_init(&intp->interrupt, 0);
> +    if (ret) {
> +        g_free(intp);
> +        error_report("vfio: Error: trigger event_notifier_init failed ");
> +        return NULL;
> +    }
> +
> +    /* store the new intp in qlist */
> +    QLIST_INSERT_HEAD(&vdev->intp_list, intp, next);
> +    return intp;
> +}
> +
> +/**
>   * vfio_populate_device - initialize MMIO region and IRQ
>   * @vbasedev: the VFIO device
>   *
> @@ -47,7 +330,9 @@ static int vfio_platform_hot_reset_multi(VFIODevice *vbasedev)
>   */
>  static int vfio_populate_device(VFIODevice *vbasedev)
>  {
> +    struct vfio_irq_info irq = { .argsz = sizeof(irq) };
>      struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
> +    VFIOINTp *intp, *tmp;
>      int i, ret = -1;
>      VFIOPlatformDevice *vdev =
>          container_of(vbasedev, VFIOPlatformDevice, vbasedev);
> @@ -80,7 +365,37 @@ static int vfio_populate_device(VFIODevice *vbasedev)
>                              (unsigned long)vdev->regions[i]->fd_offset);
>      }
>  
> +    vdev->mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
> +                                    vfio_intp_mmap_enable, vdev);
> +
> +    QSIMPLEQ_INIT(&vdev->pending_intp_queue);
> +
> +    for (i = 0; i < vbasedev->num_irqs; i++) {
> +        irq.index = i;
> +
> +        ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_IRQ_INFO, &irq);
> +        if (ret) {
> +            error_printf("vfio: error getting device %s irq info",
> +                         vbasedev->name);
> +            goto irq_err;
> +        } else {
> +            trace_vfio_platform_populate_interrupts(irq.index,
> +                                                    irq.count,
> +                                                    irq.flags);
> +            intp = vfio_init_intp(vbasedev, irq.index);
> +            if (!intp) {
> +                error_report("vfio: Error installing IRQ %d up", i);
> +                goto irq_err;
> +            }
> +        }
> +    }
>      return 0;
> +irq_err:
> +    timer_del(vdev->mmap_timer);
> +    QLIST_FOREACH_SAFE(intp, &vdev->intp_list, next, tmp) {
> +        QLIST_REMOVE(intp, next);
> +        g_free(intp);
> +    }
>  error:
>      for (i = 0; i < vbasedev->num_regions; i++) {
>          g_free(vdev->regions[i]);
> @@ -93,6 +408,7 @@ error:
>  static VFIODeviceOps vfio_platform_ops = {
>      .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
>      .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
> +    .vfio_eoi = vfio_platform_eoi,
>      .vfio_populate_device = vfio_populate_device,
>  };
>  
> @@ -220,6 +536,7 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
>  
>      vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
>      vbasedev->ops = &vfio_platform_ops;
> +    vdev->start_irq_fn = vfio_start_eventfd_injection;
>  
>      trace_vfio_platform_realize(vbasedev->name, vdev->compat);
>  
> @@ -243,6 +560,8 @@ static const VMStateDescription vfio_platform_vmstate = {
>  
>  static Property vfio_platform_dev_properties[] = {
>      DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
> +    DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice,
> +                       mmap_timeout, 1100),
>      DEFINE_PROP_END_OF_LIST(),
>  };
>  
> diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
> index 338f0c6..e55b711 100644
> --- a/include/hw/vfio/vfio-platform.h
> +++ b/include/hw/vfio/vfio-platform.h
> @@ -18,16 +18,49 @@
>  
>  #include "hw/sysbus.h"
>  #include "hw/vfio/vfio-common.h"
> +#include "qemu/event_notifier.h"
> +#include "qemu/queue.h"
> +#include "hw/irq.h"
>  
>  #define TYPE_VFIO_PLATFORM "vfio-platform"
>  
> +enum {
> +    VFIO_IRQ_INACTIVE = 0,
> +    VFIO_IRQ_PENDING = 1,
> +    VFIO_IRQ_ACTIVE = 2,
> +    /* VFIO_IRQ_ACTIVE_AND_PENDING cannot happen with VFIO */
> +};
> +
> +typedef struct VFIOINTp {
> +    QLIST_ENTRY(VFIOINTp) next; /* entry for IRQ list */
> +    QSIMPLEQ_ENTRY(VFIOINTp) pqnext; /* entry for pending IRQ queue */
> +    EventNotifier interrupt; /* eventfd triggered on interrupt */
> +    EventNotifier unmask; /* eventfd for unmask on QEMU bypass */
> +    qemu_irq qemuirq;
> +    struct VFIOPlatformDevice *vdev; /* back pointer to device */
> +    int state; /* inactive, pending, active */
> +    bool kvm_accel; /* set when QEMU bypass through KVM enabled */
> +    uint8_t pin; /* index */
> +    uint32_t virtualID; /* virtual IRQ */
> +} VFIOINTp;
> +
> +typedef int (*start_irq_fn_t)(VFIOINTp *intp);
> +
>  typedef struct VFIOPlatformDevice {
>      SysBusDevice sbdev;
>      VFIODevice vbasedev; /* not a QOM object */
>      VFIORegion **regions;
> +    QLIST_HEAD(, VFIOINTp) intp_list; /* list of IRQ */
> +    /* queue of pending IRQ */
> +    QSIMPLEQ_HEAD(pending_intp_queue, VFIOINTp) pending_intp_queue;
>      char *compat; /* compatibility string */
> +    uint32_t mmap_timeout; /* delay to re-enable mmaps after interrupt */
> +    QEMUTimer *mmap_timer; /* enable mmaps after periods w/o interrupts */
> +    start_irq_fn_t start_irq_fn;
> +    QemuMutex  intp_mutex;

Is this intp_mutex just for the intp_list or also the
pending_intp_queue? Perhaps consider re-arranging the structure and
adding some spacing to show what protects what.
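
Something along these lines, as a rough sketch. It assumes the mutex
covers both the list and the pending queue, which is exactly the open
question; PlatformDevSketch, IrqSketch and the helper names are
hypothetical, and pthreads stands in for QemuMutex:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

typedef struct IrqSketch {
    struct IrqSketch *next;     /* link in intp_list */
    struct IrqSketch *pqnext;   /* link in the pending queue */
    int pin;
    int state;                  /* protected by the device's intp_mutex */
} IrqSketch;

typedef struct PlatformDevSketch {
    /* Set once while the device is set up, read-only afterwards */
    const char *compat;
    uint32_t mmap_timeout;

    /* Everything below is protected by intp_mutex */
    pthread_mutex_t intp_mutex;
    IrqSketch *intp_list;
    IrqSketch *pending_head;
    IrqSketch *pending_tail;
} PlatformDevSketch;

void platform_dev_sketch_init(PlatformDevSketch *d, const char *compat,
                              uint32_t mmap_timeout)
{
    assert(d != NULL);
    d->compat = compat;
    d->mmap_timeout = mmap_timeout;
    pthread_mutex_init(&d->intp_mutex, NULL);
    d->intp_list = NULL;
    d->pending_head = d->pending_tail = NULL;
}

/* Readers of protected fields take the lock themselves. */
int platform_dev_sketch_has_pending(PlatformDevSketch *d)
{
    int pending;

    pthread_mutex_lock(&d->intp_mutex);
    pending = d->pending_head != NULL;
    pthread_mutex_unlock(&d->intp_mutex);
    return pending;
}
```

Grouping the fields under one comment makes it obvious at a glance which
accesses need the lock and which do not.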

>  } VFIOPlatformDevice;
>  
> +
>  typedef struct VFIOPlatformDeviceClass {
>      /*< private >*/
>      SysBusDeviceClass parent_class;

-- 
Alex Bennée


* Re: [PATCH v10 3/7] hw/vfio/platform: add irq assignment
@ 2015-02-17 11:24     ` Alex Bennée
  0 siblings, 0 replies; 24+ messages in thread
From: Alex Bennée @ 2015-02-17 11:24 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, patches, qemu-devel, alex.williamson, pbonzini,
	feng.wu, kvmarm


Eric Auger <eric.auger@linaro.org> writes:

> This patch adds the code requested to assign interrupts to
> a guest. The interrupts are mediated through user handled
> eventfds only.
>
> The mechanics to start the IRQ handling is not yet there through.
>
> Signed-off-by: Eric Auger <eric.auger@linaro.org>

See comments inline.

>
> ---
>
> v8 -> v9:
> - free irq related resources in case of error in vfio_populate_device
> ---
>  hw/vfio/platform.c              | 319 ++++++++++++++++++++++++++++++++++++++++
>  include/hw/vfio/vfio-platform.h |  33 +++++
>  2 files changed, 352 insertions(+)
>
> diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
> index caadb92..b85ad6c 100644
> --- a/hw/vfio/platform.c
> +++ b/hw/vfio/platform.c
> @@ -22,10 +22,259 @@
>  #include "qemu/range.h"
>  #include "sysemu/sysemu.h"
>  #include "exec/memory.h"
> +#include "qemu/queue.h"
>  #include "hw/sysbus.h"
>  #include "trace.h"
>  #include "hw/platform-bus.h"
>  
> +static void vfio_intp_interrupt(VFIOINTp *intp);
> +typedef void (*eventfd_user_side_handler_t)(VFIOINTp *intp);
> +static int vfio_set_trigger_eventfd(VFIOINTp *intp,
> +                                    eventfd_user_side_handler_t handler);
> +
> +/*
> + * Functions only used when eventfd are handled on user-side
> + * ie. without irqfd
> + */
> +
> +/**
> + * vfio_platform_eoi - IRQ completion routine
> + * @vbasedev: the VFIO device
> + *
> + * de-asserts the active virtual IRQ and unmask the physical IRQ
> + * (masked by the  VFIO driver). Handle pending IRQs if any.
> + * eoi function is called on the first access to any MMIO region
> + * after an IRQ was triggered. It is assumed this access corresponds
> + * to the IRQ status register reset. With such a mechanism, a single
> + * IRQ can be handled at a time since there is no way to know which
> + * IRQ was completed by the guest (we would need additional details
> + * about the IRQ status register mask)
> + */
> +static void vfio_platform_eoi(VFIODevice *vbasedev)
> +{
> +    VFIOINTp *intp;
> +    VFIOPlatformDevice *vdev =
> +        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
> +
> +    qemu_mutex_lock(&vdev->intp_mutex);
> +    QLIST_FOREACH(intp, &vdev->intp_list, next) {
> +        if (intp->state == VFIO_IRQ_ACTIVE) {
> +            trace_vfio_platform_eoi(intp->pin,
> +                                event_notifier_get_fd(&intp->interrupt));
> +            intp->state = VFIO_IRQ_INACTIVE;
> +
> +            /* deassert the virtual IRQ and unmask physical one */
> +            qemu_set_irq(intp->qemuirq, 0);
> +            vfio_unmask_single_irqindex(vbasedev, intp->pin);
> +
> +            /* a single IRQ can be active at a time */
> +            break;
> +        }
> +    }
> +    /* in case there are pending IRQs, handle them one at a time */
> +    if (!QSIMPLEQ_EMPTY(&vdev->pending_intp_queue)) {
> +        intp = QSIMPLEQ_FIRST(&vdev->pending_intp_queue);
> +        trace_vfio_platform_eoi_handle_pending(intp->pin);
> +        qemu_mutex_unlock(&vdev->intp_mutex);
> +        vfio_intp_interrupt(intp);
> +        qemu_mutex_lock(&vdev->intp_mutex);
> +        QSIMPLEQ_REMOVE_HEAD(&vdev->pending_intp_queue, pqnext);
> +        qemu_mutex_unlock(&vdev->intp_mutex);

This locking is way too ugly. If the intp lock is protecting the
structures then releasing it so the child function can grab it again is
just asking for races to happen. Perhaps vfio_intp_interrupt can be
split into a _lockheld variant that is used here, with the other
version taking the lock before calling the _lockheld function.


> +    } else {
> +        qemu_mutex_unlock(&vdev->intp_mutex);
> +    }
> +}
> +
> +/**
> + * vfio_mmap_set_enabled - enable/disable the fast path mode
> + * @vdev: the VFIO platform device
> + * @enabled: the target mmap state
> + *
> + * true ~ fast path = MMIO region is mmaped (no KVM TRAP)
> + * false ~ slow path = MMIO region is trapped and region callbacks
> + * are called slow path enables to trap the IRQ status register
> + * guest reset
> +*/
> +
> +static void vfio_mmap_set_enabled(VFIOPlatformDevice *vdev, bool enabled)
> +{
> +    VFIORegion *region;

region could be declared inside the loop body, not that it matters too
much for a small function like this.

> +    int i;
> +
> +    trace_vfio_platform_mmap_set_enabled(enabled);
> +
> +    for (i = 0; i < vdev->vbasedev.num_regions; i++) {
> +        region = vdev->regions[i];
> +
> +        /* register space is unmapped to trap EOI */
> +        memory_region_set_enabled(&region->mmap_mem, enabled);
> +    }
> +}
> +
> +/**
> + * vfio_intp_mmap_enable - timer function, restores the fast path
> + * if there is no more active IRQ
> + * @opaque: actually points to the VFIO platform device
> + *
> + * Called on mmap timer timout, this function checks whether the
> + * IRQ is still active and in the negative restores the fast path.
> + * by construction a single eventfd is handled at a time.
> + * if the IRQ is still active, the timer is restarted.
> + */
> +static void vfio_intp_mmap_enable(void *opaque)
> +{
> +    VFIOINTp *tmp;
> +    VFIOPlatformDevice *vdev = (VFIOPlatformDevice *)opaque;
> +
> +    qemu_mutex_lock(&vdev->intp_mutex);
> +    QLIST_FOREACH(tmp, &vdev->intp_list, next) {
> +        if (tmp->state == VFIO_IRQ_ACTIVE) {
> +            trace_vfio_platform_intp_mmap_enable(tmp->pin);
> +            /* re-program the timer to check active status later */
> +            timer_mod(vdev->mmap_timer,
> +                      qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
> +                          vdev->mmap_timeout);
> +            qemu_mutex_unlock(&vdev->intp_mutex);
> +            return;
> +        }
> +    }
> +    vfio_mmap_set_enabled(vdev, true);
> +    qemu_mutex_unlock(&vdev->intp_mutex);
> +}
> +
> +/**
> + * vfio_intp_interrupt - The user-side eventfd handler
> + * @opaque: opaque pointer which in practice is the VFIOINTp*
> + *
> + * the function can be entered
> + * - in event handler context: this IRQ is inactive
> + *   in that case, the vIRQ is injected into the guest if there
> + *   is no other active or pending IRQ.
> + * - in IOhandler context: this IRQ is pending.
> + *   there is no ACTIVE IRQ
> + */
> +static void vfio_intp_interrupt(VFIOINTp *intp)
> +{
> +    int ret;
> +    VFIOINTp *tmp;
> +    VFIOPlatformDevice *vdev = intp->vdev;
> +    bool delay_handling = false;
> +
> +    qemu_mutex_lock(&vdev->intp_mutex);
> +    if (intp->state == VFIO_IRQ_INACTIVE) {
> +        QLIST_FOREACH(tmp, &vdev->intp_list, next) {
> +            if (tmp->state == VFIO_IRQ_ACTIVE ||
> +                tmp->state == VFIO_IRQ_PENDING) {
> +                delay_handling = true;
> +                break;
> +            }
> +        }
> +    }
> +    if (delay_handling) {
> +        /*
> +         * the new IRQ gets a pending status and is pushed in
> +         * the pending queue
> +         */
> +        intp->state = VFIO_IRQ_PENDING;
> +        trace_vfio_intp_interrupt_set_pending(intp->pin);
> +        QSIMPLEQ_INSERT_TAIL(&vdev->pending_intp_queue,
> +                             intp, pqnext);
> +        ret = event_notifier_test_and_clear(&intp->interrupt);
> +        qemu_mutex_unlock(&vdev->intp_mutex);
> +        return;
> +    }
> +
> +    /* no active IRQ, the new IRQ can be forwarded to the guest */
> +    trace_vfio_platform_intp_interrupt(intp->pin,
> +                              event_notifier_get_fd(&intp->interrupt));
> +
> +    if (intp->state == VFIO_IRQ_INACTIVE) {
> +        ret = event_notifier_test_and_clear(&intp->interrupt);
> +        if (!ret) {
> +            error_report("Error when clearing fd=%d (ret = %d)\n",
> +                         event_notifier_get_fd(&intp->interrupt), ret);
> +        }
> +    } /* else this is a pending IRQ that moves to ACTIVE state */
> +
> +    intp->state = VFIO_IRQ_ACTIVE;
> +
> +    /* sets slow path */
> +    vfio_mmap_set_enabled(vdev, false);
> +
> +    /* trigger the virtual IRQ */
> +    qemu_set_irq(intp->qemuirq, 1);
> +
> +    /* schedule the mmap timer which will restore mmap path after EOI*/
> +    if (vdev->mmap_timeout) {
> +        timer_mod(vdev->mmap_timer,
> +                  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
> +                      vdev->mmap_timeout);
> +    }
> +    qemu_mutex_unlock(&vdev->intp_mutex);

See above for comments about re-factoring this. It's not totally clear
what's being protected by the mutex: just the queues, or the intp
structures themselves?

> +}
> +
> +/**
> + * vfio_start_eventfd_injection - starts the virtual IRQ injection using
> + * user-side handled eventfds
> + * @intp: the IRQ struct pointer
> + */
> +
> +static int vfio_start_eventfd_injection(VFIOINTp *intp)
> +{
> +    int ret;
> +    VFIODevice *vbasedev = &intp->vdev->vbasedev;
> +
> +    vfio_mask_single_irqindex(vbasedev, intp->pin);
> +
> +    ret = vfio_set_trigger_eventfd(intp, vfio_intp_interrupt);
> +    if (ret) {
> +        error_report("vfio: Error: Failed to pass IRQ fd to the driver: %m");
> +        vfio_unmask_single_irqindex(vbasedev, intp->pin);
> +        return ret;
> +    }
> +    vfio_unmask_single_irqindex(vbasedev, intp->pin);
> +    return 0;
> +}
> +
> +/*
> + * Functions used whatever the injection method
> + */
> +
> +/**
> + * vfio_set_trigger_eventfd - set VFIO eventfd handling
> + * ie. program the VFIO driver to associates a given IRQ index
> + * with a fd handler
> + *
> + * @intp: IRQ struct pointer
> + * @handler: handler to be called on eventfd trigger
> + */
> +static int vfio_set_trigger_eventfd(VFIOINTp *intp,
> +                                    eventfd_user_side_handler_t handler)
> +{
> +    VFIODevice *vbasedev = &intp->vdev->vbasedev;
> +    struct vfio_irq_set *irq_set;
> +    int argsz, ret;
> +    int32_t *pfd;
> +
> +    argsz = sizeof(*irq_set) + sizeof(*pfd);
> +    irq_set = g_malloc0(argsz);
> +    irq_set->argsz = argsz;
> +    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
> +    irq_set->index = intp->pin;
> +    irq_set->start = 0;
> +    irq_set->count = 1;
> +    pfd = (int32_t *)&irq_set->data;
> +    *pfd = event_notifier_get_fd(&intp->interrupt);
> +    qemu_set_fd_handler(*pfd, (IOHandler *)handler, NULL, intp);
> +    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
> +    g_free(irq_set);
> +    if (ret < 0) {
> +        error_report("vfio: Failed to set trigger eventfd: %m");
> +        qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
> +    }
> +    return ret;
> +}
> +
>  /* not implemented yet */
>  static void vfio_platform_compute_needs_reset(VFIODevice *vbasedev)
>  {
> @@ -39,6 +288,40 @@ static int vfio_platform_hot_reset_multi(VFIODevice *vbasedev)
>  }
>  
>  /**
> + * vfio_init_intp - allocate, initialize the IRQ struct pointer
> + * and add it into the list of IRQ
> + * @vbasedev: the VFIO device
> + * @index: VFIO device IRQ index
> + */
> +static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev, unsigned int index)
> +{
> +    int ret;
> +    VFIOPlatformDevice *vdev =
> +        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
> +    SysBusDevice *sbdev = SYS_BUS_DEVICE(vdev);
> +    VFIOINTp *intp;
> +
> +    /* allocate and populate a new VFIOINTp structure put in a queue list */
> +    intp = g_malloc0(sizeof(*intp));
> +    intp->vdev = vdev;
> +    intp->pin = index;
> +    intp->state = VFIO_IRQ_INACTIVE;
> +    sysbus_init_irq(sbdev, &intp->qemuirq);
> +
> +    /* Get an eventfd for trigger */
> +    ret = event_notifier_init(&intp->interrupt, 0);
> +    if (ret) {
> +        g_free(intp);
> +        error_report("vfio: Error: trigger event_notifier_init failed ");
> +        return NULL;
> +    }
> +
> +    /* store the new intp in qlist */
> +    QLIST_INSERT_HEAD(&vdev->intp_list, intp, next);
> +    return intp;
> +}
> +
> +/**
>   * vfio_populate_device - initialize MMIO region and IRQ
>   * @vbasedev: the VFIO device
>   *
> @@ -47,7 +330,9 @@ static int vfio_platform_hot_reset_multi(VFIODevice *vbasedev)
>   */
>  static int vfio_populate_device(VFIODevice *vbasedev)
>  {
> +    struct vfio_irq_info irq = { .argsz = sizeof(irq) };
>      struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
> +    VFIOINTp *intp, *tmp;
>      int i, ret = -1;
>      VFIOPlatformDevice *vdev =
>          container_of(vbasedev, VFIOPlatformDevice, vbasedev);
> @@ -80,7 +365,37 @@ static int vfio_populate_device(VFIODevice *vbasedev)
>                              (unsigned long)vdev->regions[i]->fd_offset);
>      }
>  
> +    vdev->mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
> +                                    vfio_intp_mmap_enable, vdev);
> +
> +    QSIMPLEQ_INIT(&vdev->pending_intp_queue);
> +
> +    for (i = 0; i < vbasedev->num_irqs; i++) {
> +        irq.index = i;
> +
> +        ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_IRQ_INFO, &irq);
> +        if (ret) {
> +            error_printf("vfio: error getting device %s irq info",
> +                         vbasedev->name);
> +            goto irq_err;
> +        } else {
> +            trace_vfio_platform_populate_interrupts(irq.index,
> +                                                    irq.count,
> +                                                    irq.flags);
> +            intp = vfio_init_intp(vbasedev, irq.index);
> +            if (!intp) {
> +                error_report("vfio: Error installing IRQ %d up", i);
> +                goto irq_err;
> +            }
> +        }
> +    }
>      return 0;
> +irq_err:
> +    timer_del(vdev->mmap_timer);
> +    QLIST_FOREACH_SAFE(intp, &vdev->intp_list, next, tmp) {
> +        QLIST_REMOVE(intp, next);
> +        g_free(intp);
> +    }
>  error:
>      for (i = 0; i < vbasedev->num_regions; i++) {
>          g_free(vdev->regions[i]);
> @@ -93,6 +408,7 @@ error:
>  static VFIODeviceOps vfio_platform_ops = {
>      .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
>      .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
> +    .vfio_eoi = vfio_platform_eoi,
>      .vfio_populate_device = vfio_populate_device,
>  };
>  
> @@ -220,6 +536,7 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
>  
>      vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
>      vbasedev->ops = &vfio_platform_ops;
> +    vdev->start_irq_fn = vfio_start_eventfd_injection;
>  
>      trace_vfio_platform_realize(vbasedev->name, vdev->compat);
>  
> @@ -243,6 +560,8 @@ static const VMStateDescription vfio_platform_vmstate = {
>  
>  static Property vfio_platform_dev_properties[] = {
>      DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
> +    DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice,
> +                       mmap_timeout, 1100),
>      DEFINE_PROP_END_OF_LIST(),
>  };
>  
> diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
> index 338f0c6..e55b711 100644
> --- a/include/hw/vfio/vfio-platform.h
> +++ b/include/hw/vfio/vfio-platform.h
> @@ -18,16 +18,49 @@
>  
>  #include "hw/sysbus.h"
>  #include "hw/vfio/vfio-common.h"
> +#include "qemu/event_notifier.h"
> +#include "qemu/queue.h"
> +#include "hw/irq.h"
>  
>  #define TYPE_VFIO_PLATFORM "vfio-platform"
>  
> +enum {
> +    VFIO_IRQ_INACTIVE = 0,
> +    VFIO_IRQ_PENDING = 1,
> +    VFIO_IRQ_ACTIVE = 2,
> +    /* VFIO_IRQ_ACTIVE_AND_PENDING cannot happen with VFIO */
> +};
> +
> +typedef struct VFIOINTp {
> +    QLIST_ENTRY(VFIOINTp) next; /* entry for IRQ list */
> +    QSIMPLEQ_ENTRY(VFIOINTp) pqnext; /* entry for pending IRQ queue */
> +    EventNotifier interrupt; /* eventfd triggered on interrupt */
> +    EventNotifier unmask; /* eventfd for unmask on QEMU bypass */
> +    qemu_irq qemuirq;
> +    struct VFIOPlatformDevice *vdev; /* back pointer to device */
> +    int state; /* inactive, pending, active */
> +    bool kvm_accel; /* set when QEMU bypass through KVM enabled */
> +    uint8_t pin; /* index */
> +    uint32_t virtualID; /* virtual IRQ */
> +} VFIOINTp;
> +
> +typedef int (*start_irq_fn_t)(VFIOINTp *intp);
> +
>  typedef struct VFIOPlatformDevice {
>      SysBusDevice sbdev;
>      VFIODevice vbasedev; /* not a QOM object */
>      VFIORegion **regions;
> +    QLIST_HEAD(, VFIOINTp) intp_list; /* list of IRQ */
> +    /* queue of pending IRQ */
> +    QSIMPLEQ_HEAD(pending_intp_queue, VFIOINTp) pending_intp_queue;
>      char *compat; /* compatibility string */
> +    uint32_t mmap_timeout; /* delay to re-enable mmaps after interrupt */
> +    QEMUTimer *mmap_timer; /* enable mmaps after periods w/o interrupts */
> +    start_irq_fn_t start_irq_fn;
> +    QemuMutex  intp_mutex;

Is this intp_mutex just for the intp_list or also the
pending_intp_queue? Perhaps consider re-arranging the structure and
adding some spacing to show what protects what.
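
As a rough sketch of what I mean (assuming, and it is only an assumption, that the mutex is meant to cover the pending queue as well), the protected fields could sit together under the lock with a comment saying so. The stand-alone example below uses pthread and plain pointers in place of QemuMutex and the QLIST/QSIMPLEQ macros; all Demo* names are hypothetical and not part of the patch:

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for VFIOINTp, reduced to what the sketch needs. */
typedef struct DemoIntp {
    struct DemoIntp *pending_next; /* link in the pending queue */
    int pin;
} DemoIntp;

typedef struct DemoDevice {
    /* Set once at init, read-only afterwards: no locking needed. */
    unsigned int mmap_timeout;

    /* intp_mutex protects every field below it (assumed scope). */
    pthread_mutex_t intp_mutex;
    DemoIntp *pending_head;
    DemoIntp *pending_tail;
} DemoDevice;

static void demo_init(DemoDevice *d, unsigned int mmap_timeout)
{
    d->mmap_timeout = mmap_timeout;
    pthread_mutex_init(&d->intp_mutex, NULL);
    d->pending_head = NULL;
    d->pending_tail = NULL;
}

/* Append an IRQ to the tail of the pending queue, under the lock. */
static void demo_queue_pending(DemoDevice *d, DemoIntp *intp)
{
    pthread_mutex_lock(&d->intp_mutex);
    intp->pending_next = NULL;
    if (d->pending_tail) {
        d->pending_tail->pending_next = intp;
    } else {
        d->pending_head = intp;
    }
    d->pending_tail = intp;
    pthread_mutex_unlock(&d->intp_mutex);
}

/* Remove and return the oldest pending IRQ, or NULL if none is queued. */
static DemoIntp *demo_dequeue_pending(DemoDevice *d)
{
    DemoIntp *intp;

    pthread_mutex_lock(&d->intp_mutex);
    intp = d->pending_head;
    if (intp) {
        d->pending_head = intp->pending_next;
        if (!d->pending_head) {
            d->pending_tail = NULL;
        }
    }
    pthread_mutex_unlock(&d->intp_mutex);
    return intp;
}
```

With that layout a reader can tell at a glance which fields need the lock held and which do not.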

>  } VFIOPlatformDevice;
>  
> +
>  typedef struct VFIOPlatformDeviceClass {
>      /*< private >*/
>      SysBusDeviceClass parent_class;

-- 
Alex Bennée
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

* Re: [Qemu-devel] [PATCH v10 5/7] hw/vfio: calxeda xgmac device
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 5/7] hw/vfio: calxeda xgmac device Eric Auger
@ 2015-02-17 11:29     ` Alex Bennée
  0 siblings, 0 replies; 24+ messages in thread
From: Alex Bennée @ 2015-02-17 11:29 UTC (permalink / raw)
  To: Eric Auger
  Cc: peter.maydell, eric.auger, patches, qemu-devel, agraf,
	alex.williamson, pbonzini, b.reynal, feng.wu, kvmarm,
	christoffer.dall


Eric Auger <eric.auger@linaro.org> writes:

> The platform device class has become abstract. This patch introduces
> a calxeda xgmac device that can be instantiated on the command line
> using the following option:
>
> -device vfio-calxeda-xgmac,host="fff51000.ethernet"
>
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

>
> ---
> v8 -> v9:
> - renamed calxeda_xgmac.c into calxeda-xgmac.c
>
> v7 -> v8:
> - add a comment in the header about the MMIO regions and IRQ which
>   are exposed by the device
>
> v5 -> v6
> - back again following Alex Graf's advice
> - fix a bug related to compat override
>
> v4 -> v5:
> removed since device tree was moved to hw/arm/dyn_sysbus_devtree.c
>
> v4: creation for device tree specialization
> ---
>  hw/arm/virt.c                        | 15 +++++++---
>  hw/vfio/Makefile.objs                |  1 +
>  hw/vfio/calxeda-xgmac.c              | 54 ++++++++++++++++++++++++++++++++++++
>  include/hw/vfio/vfio-calxeda-xgmac.h | 46 ++++++++++++++++++++++++++++++
>  4 files changed, 112 insertions(+), 4 deletions(-)
>  create mode 100644 hw/vfio/calxeda-xgmac.c
>  create mode 100644 include/hw/vfio/vfio-calxeda-xgmac.h
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 9df9b60..c1e0a10 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -44,6 +44,7 @@
>  #include "qemu/error-report.h"
>  #include "hw/arm/sysbus-fdt.h"
>  #include "hw/platform-bus.h"
> +#include "hw/vfio/vfio-platform.h"
>  
>  #define NUM_VIRTIO_TRANSPORTS 32
>  
> @@ -342,7 +343,7 @@ static void fdt_add_gic_node(const VirtBoardInfo *vbi)
>      qemu_fdt_setprop_cell(vbi->fdt, "/intc", "phandle", gic_phandle);
>  }
>  
> -static void create_gic(const VirtBoardInfo *vbi, qemu_irq *pic)
> +static DeviceState *create_gic(const VirtBoardInfo *vbi, qemu_irq *pic)
>  {
>      /* We create a standalone GIC v2 */
>      DeviceState *gicdev;
> @@ -390,6 +391,7 @@ static void create_gic(const VirtBoardInfo *vbi, qemu_irq *pic)
>      }
>  
>      fdt_add_gic_node(vbi);
> +    return gicdev;
>  }
>  
>  static void create_uart(const VirtBoardInfo *vbi, qemu_irq *pic)
> @@ -594,7 +596,8 @@ static void create_fw_cfg(const VirtBoardInfo *vbi)
>      g_free(nodename);
>  }
>  
> -static void create_platform_bus(VirtBoardInfo *vbi, qemu_irq *pic)
> +static void create_platform_bus(VirtBoardInfo *vbi, qemu_irq *pic,
> +                                DeviceState *gic)
>  {
>      DeviceState *dev;
>      SysBusDevice *s;
> @@ -633,6 +636,9 @@ static void create_platform_bus(VirtBoardInfo *vbi, qemu_irq *pic)
>      memory_region_add_subregion(sysmem,
>                                  platform_bus_params.platform_bus_base,
>                                  sysbus_mmio_get_region(s, 0));
> +
> +    /* setup VFIO signaling/IRQFD for all VFIO platform sysbus devices */
> +    qemu_register_reset(vfio_kick_irqs, gic);
>  }
>  
>  static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
> @@ -652,6 +658,7 @@ static void machvirt_init(MachineState *machine)
>      MemoryRegion *ram = g_new(MemoryRegion, 1);
>      const char *cpu_model = machine->cpu_model;
>      VirtBoardInfo *vbi;
> +    DeviceState *gic;
>  
>      if (!cpu_model) {
>          cpu_model = "cortex-a15";
> @@ -713,7 +720,7 @@ static void machvirt_init(MachineState *machine)
>  
>      create_flash(vbi);
>  
> -    create_gic(vbi, pic);
> +    gic = create_gic(vbi, pic);
>  
>      create_uart(vbi, pic);
>  
> @@ -744,7 +751,7 @@ static void machvirt_init(MachineState *machine)
>       * another notifier is registered which adds platform bus nodes.
>       * Notifiers are executed in registration reverse order.
>       */
> -    create_platform_bus(vbi, pic);
> +    create_platform_bus(vbi, pic, gic);
>  }
>  
>  static bool virt_get_secure(Object *obj, Error **errp)
> diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
> index c5c76fe..d540c9d 100644
> --- a/hw/vfio/Makefile.objs
> +++ b/hw/vfio/Makefile.objs
> @@ -2,4 +2,5 @@ ifeq ($(CONFIG_LINUX), y)
>  obj-$(CONFIG_SOFTMMU) += common.o
>  obj-$(CONFIG_PCI) += pci.o
>  obj-$(CONFIG_SOFTMMU) += platform.o
> +obj-$(CONFIG_SOFTMMU) += calxeda-xgmac.o
>  endif
> diff --git a/hw/vfio/calxeda-xgmac.c b/hw/vfio/calxeda-xgmac.c
> new file mode 100644
> index 0000000..199e076
> --- /dev/null
> +++ b/hw/vfio/calxeda-xgmac.c
> @@ -0,0 +1,54 @@
> +/*
> + * calxeda xgmac example VFIO device
> + *
> + * Copyright Linaro Limited, 2014
> + *
> + * Authors:
> + *  Eric Auger <eric.auger@linaro.org>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See
> + * the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "hw/vfio/vfio-calxeda-xgmac.h"
> +
> +static void calxeda_xgmac_realize(DeviceState *dev, Error **errp)
> +{
> +    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
> +    VFIOCalxedaXgmacDeviceClass *k = VFIO_CALXEDA_XGMAC_DEVICE_GET_CLASS(dev);
> +
> +    vdev->compat = g_strdup("calxeda,hb-xgmac");
> +
> +    k->parent_realize(dev, errp);
> +}
> +
> +static const VMStateDescription vfio_platform_vmstate = {
> +    .name = TYPE_VFIO_CALXEDA_XGMAC,
> +    .unmigratable = 1,
> +};
> +
> +static void vfio_calxeda_xgmac_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +    VFIOCalxedaXgmacDeviceClass *vcxc =
> +        VFIO_CALXEDA_XGMAC_DEVICE_CLASS(klass);
> +    vcxc->parent_realize = dc->realize;
> +    dc->realize = calxeda_xgmac_realize;
> +    dc->desc = "VFIO Calxeda XGMAC";
> +}
> +
> +static const TypeInfo vfio_calxeda_xgmac_dev_info = {
> +    .name = TYPE_VFIO_CALXEDA_XGMAC,
> +    .parent = TYPE_VFIO_PLATFORM,
> +    .instance_size = sizeof(VFIOCalxedaXgmacDevice),
> +    .class_init = vfio_calxeda_xgmac_class_init,
> +    .class_size = sizeof(VFIOCalxedaXgmacDeviceClass),
> +};
> +
> +static void register_calxeda_xgmac_dev_type(void)
> +{
> +    type_register_static(&vfio_calxeda_xgmac_dev_info);
> +}
> +
> +type_init(register_calxeda_xgmac_dev_type)
> diff --git a/include/hw/vfio/vfio-calxeda-xgmac.h b/include/hw/vfio/vfio-calxeda-xgmac.h
> new file mode 100644
> index 0000000..f994775
> --- /dev/null
> +++ b/include/hw/vfio/vfio-calxeda-xgmac.h
> @@ -0,0 +1,46 @@
> +/*
> + * VFIO calxeda xgmac device
> + *
> + * Copyright Linaro Limited, 2014
> + *
> + * Authors:
> + *  Eric Auger <eric.auger@linaro.org>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See
> + * the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef HW_VFIO_VFIO_CALXEDA_XGMAC_H
> +#define HW_VFIO_VFIO_CALXEDA_XGMAC_H
> +
> +#include "hw/vfio/vfio-platform.h"
> +
> +#define TYPE_VFIO_CALXEDA_XGMAC "vfio-calxeda-xgmac"
> +
> +/**
> + * This device exposes:
> + * - a single MMIO region corresponding to its register space
> + * - 3 IRQS (main and 2 power related IRQs)
> + */
> +typedef struct VFIOCalxedaXgmacDevice {
> +    VFIOPlatformDevice vdev;
> +} VFIOCalxedaXgmacDevice;
> +
> +typedef struct VFIOCalxedaXgmacDeviceClass {
> +    /*< private >*/
> +    VFIOPlatformDeviceClass parent_class;
> +    /*< public >*/
> +    DeviceRealize parent_realize;
> +} VFIOCalxedaXgmacDeviceClass;
> +
> +#define VFIO_CALXEDA_XGMAC_DEVICE(obj) \
> +     OBJECT_CHECK(VFIOCalxedaXgmacDevice, (obj), TYPE_VFIO_CALXEDA_XGMAC)
> +#define VFIO_CALXEDA_XGMAC_DEVICE_CLASS(klass) \
> +     OBJECT_CLASS_CHECK(VFIOCalxedaXgmacDeviceClass, (klass), \
> +                        TYPE_VFIO_CALXEDA_XGMAC)
> +#define VFIO_CALXEDA_XGMAC_DEVICE_GET_CLASS(obj) \
> +     OBJECT_GET_CLASS(VFIOCalxedaXgmacDeviceClass, (obj), \
> +                      TYPE_VFIO_CALXEDA_XGMAC)
> +
> +#endif

-- 
Alex Bennée

* Re: [Qemu-devel] [PATCH v10 6/7] hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 6/7] hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation Eric Auger
@ 2015-02-17 11:36     ` Alex Bennée
  0 siblings, 0 replies; 24+ messages in thread
From: Alex Bennée @ 2015-02-17 11:36 UTC (permalink / raw)
  To: Eric Auger
  Cc: peter.maydell, eric.auger, patches, qemu-devel, agraf,
	alex.williamson, pbonzini, b.reynal, feng.wu, kvmarm,
	christoffer.dall


Eric Auger <eric.auger@linaro.org> writes:

> vfio-calxeda-xgmac can now be instantiated using the -device option.
> The node creation function generates a very basic dt node composed
> of the compat, reg and interrupts properties.
>
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>
> ---
> v8 -> v9:
> - properly free resources in case of errors in
>   add_calxeda_midway_xgmac_fdt_node
>
> v7 -> v8:
> - move the add_fdt_node_functions array declaration between the device
>   specific code and the generic code to avoid forward declarations of
>   device specific functions
> - rename add_basic_vfio_fdt_node into
>   add_calxeda_midway_xgmac_fdt_node
>
> v6 -> v7:
> - compat string re-formatting removed since compat string is not exposed
>   anymore as a user option
> - VFIO IRQ kick-off removed from sysbus-fdt and moved to VFIO platform
>   device
> ---
>  hw/arm/sysbus-fdt.c | 83 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 83 insertions(+)
>
> diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
> index 3038b94..d4f97f5 100644
> --- a/hw/arm/sysbus-fdt.c
> +++ b/hw/arm/sysbus-fdt.c
> @@ -26,6 +26,8 @@
>  #include "sysemu/device_tree.h"
>  #include "hw/platform-bus.h"
>  #include "sysemu/sysemu.h"
> +#include "hw/vfio/vfio-platform.h"
> +#include "hw/vfio/vfio-calxeda-xgmac.h"
>  
>  /*
>   * internal struct that contains the information to create dynamic
> @@ -53,11 +55,92 @@ typedef struct NodeCreationPair {
>      int (*add_fdt_node_fn)(SysBusDevice *sbdev, void *opaque);
>  } NodeCreationPair;
>  
> +/* Device Specific Code */
> +
> +/**
> + * add_calxeda_midway_xgmac_fdt_node
> + *
> + * Generates a very simple node with following properties:
> + * compatible string, regs, interrupts
> + */
> +static int add_calxeda_midway_xgmac_fdt_node(SysBusDevice *sbdev, void *opaque)
> +{
> +    PlatformBusFDTData *data = opaque;
> +    PlatformBusDevice *pbus = data->pbus;
> +    void *fdt = data->fdt;
> +    const char *parent_node = data->pbus_node_name;
> +    int compat_str_len;
> +    char *nodename;
> +    int i, ret = -1;
> +    uint32_t *irq_attr;
> +    uint64_t *reg_attr;
> +    uint64_t mmio_base;
> +    uint64_t irq_number;
> +    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
> +    VFIODevice *vbasedev = &vdev->vbasedev;
> +    Object *obj = OBJECT(sbdev);
> +
> +    mmio_base = object_property_get_int(obj, "mmio[0]", NULL);
> +
> +    nodename = g_strdup_printf("%s/%s@%" PRIx64, parent_node,
> +                               vbasedev->name,
> +                               mmio_base);
> +
> +    qemu_fdt_add_subnode(fdt, nodename);
> +
> +    compat_str_len = strlen(vdev->compat) + 1;
> +    qemu_fdt_setprop(fdt, nodename, "compatible",
> +                          vdev->compat, compat_str_len);
> +
> +    reg_attr = g_new(uint64_t, vbasedev->num_regions*4);
> +
> +    for (i = 0; i < vbasedev->num_regions; i++) {
> +        mmio_base = platform_bus_get_mmio_addr(pbus, sbdev, i);
> +        reg_attr[4*i] = 1;
> +        reg_attr[4*i+1] = mmio_base;
> +        reg_attr[4*i+2] = 1;
> +        reg_attr[4*i+3] = memory_region_size(&vdev->regions[i]->mem);
> +    }
> +
> +    ret = qemu_fdt_setprop_sized_cells_from_array(fdt, nodename, "reg",
> +                     vbasedev->num_regions*2, reg_attr);

Could we use qemu_fdt_setprop_sized_cells() like everyone else to hide
the ugliness of the _from_array?
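
For context, here is a stand-alone sketch (not QEMU code) of the layout that both helpers are understood to produce: each (number of cells, value) pair is emitted as that many big-endian 32-bit cells. The function pack_sized_cells() and its signature are hypothetical; it only mirrors the pair layout used for reg_attr above:

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in big-endian byte order, as FDT cells require. */
static void put_be32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v >> 24);
    dst[1] = (uint8_t)(v >> 16);
    dst[2] = (uint8_t)(v >> 8);
    dst[3] = (uint8_t)v;
}

/*
 * pairs[] holds npairs (ncells, value) tuples; ncells must be 1 or 2.
 * Returns the number of bytes written to out, or -1 if a size is
 * invalid, a value does not fit in one cell, or out is too small.
 */
static int pack_sized_cells(const uint64_t *pairs, size_t npairs,
                            uint8_t *out, size_t outlen)
{
    size_t off = 0;

    for (size_t i = 0; i < npairs; i++) {
        uint64_t ncells = pairs[2 * i];
        uint64_t value = pairs[2 * i + 1];

        if (ncells != 1 && ncells != 2) {
            return -1;
        }
        if (ncells == 1 && value > UINT32_MAX) {
            return -1;
        }
        if (off + ncells * 4 > outlen) {
            return -1;
        }
        if (ncells == 2) {
            put_be32(out + off, (uint32_t)(value >> 32));
            off += 4;
        }
        put_be32(out + off, (uint32_t)value);
        off += 4;
    }
    return (int)off;
}
```

One region described as { 1, base, 1, size } is therefore two pairs and eight bytes of property data.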

> +    if (ret) {
> +        error_report("could not set reg property of node %s", nodename);
> +        goto fail_reg;
> +    }
> +
> +    irq_attr = g_new(uint32_t, vbasedev->num_irqs*3);
> +
> +    for (i = 0; i < vbasedev->num_irqs; i++) {
> +        irq_number = platform_bus_get_irqn(pbus, sbdev , i)
> +                         + data->irq_start;
> +        irq_attr[3*i] = cpu_to_be32(0);
> +        irq_attr[3*i+1] = cpu_to_be32(irq_number);
> +        irq_attr[3*i+2] = cpu_to_be32(0x4);
> +    }
> +
> +   ret = qemu_fdt_setprop(fdt, nodename, "interrupts",
> +                     irq_attr, vbasedev->num_irqs*3*sizeof(uint32_t));

Ditto.
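
For reference, the three cells per interrupt built in the loop above follow the usual GIC specifier layout: interrupt type (0 for an SPI), interrupt number, and trigger flags (0x4 for level-high). A stand-alone sketch of that encoding, with hypothetical names (fill_spi_cells, DEMO_*) and no QEMU dependencies:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    DEMO_GIC_SPI = 0,          /* first cell: shared peripheral interrupt */
    DEMO_IRQ_LEVEL_HIGH = 0x4  /* third cell: active-high, level-sensitive */
};

/* Return v laid out in memory as a big-endian 32-bit cell. */
static uint32_t demo_to_be32(uint32_t v)
{
    uint8_t b[4];
    uint32_t out;

    b[0] = (uint8_t)(v >> 24);
    b[1] = (uint8_t)(v >> 16);
    b[2] = (uint8_t)(v >> 8);
    b[3] = (uint8_t)v;
    memcpy(&out, b, sizeof(out));
    return out;
}

/*
 * Fill cells[] with one (type, number, flags) triple per IRQ.
 * cells[] must have room for 3 * nirqs entries.
 * Returns the property length in bytes.
 */
static size_t fill_spi_cells(uint32_t *cells, const uint32_t *irqs,
                             size_t nirqs)
{
    for (size_t i = 0; i < nirqs; i++) {
        cells[3 * i] = demo_to_be32(DEMO_GIC_SPI);
        cells[3 * i + 1] = demo_to_be32(irqs[i]);
        cells[3 * i + 2] = demo_to_be32(DEMO_IRQ_LEVEL_HIGH);
    }
    return nirqs * 3 * sizeof(uint32_t);
}
```

The returned length is what would be passed as the size argument of a raw property setter.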

> +    if (ret) {
> +        error_report("could not set interrupts property of node %s",
> +                     nodename);
> +    }
> +
> +    g_free(irq_attr);
> +fail_reg:
> +    g_free(reg_attr);
> +    g_free(nodename);
> +    return ret;
> +}
> +
>  /* list of supported dynamic sysbus devices */
>  static const NodeCreationPair add_fdt_node_functions[] = {
> +    {TYPE_VFIO_CALXEDA_XGMAC, add_calxeda_midway_xgmac_fdt_node},
>      {"", NULL}, /* last element */
>  };
>  
> +/* Generic Code */
> +
>  /**
>   * add_fdt_node - add the device tree node of a dynamic sysbus device
>   *

-- 
Alex Bennée

* Re: [Qemu-devel] [PATCH v10 7/7] hw/vfio/platform: add irqfd support
  2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 7/7] hw/vfio/platform: add irqfd support Eric Auger
@ 2015-02-17 11:41     ` Alex Bennée
  0 siblings, 0 replies; 24+ messages in thread
From: Alex Bennée @ 2015-02-17 11:41 UTC (permalink / raw)
  To: Eric Auger
  Cc: peter.maydell, eric.auger, patches, qemu-devel, agraf,
	alex.williamson, pbonzini, b.reynal, feng.wu, kvmarm,
	christoffer.dall


Eric Auger <eric.auger@linaro.org> writes:

> This patch aims at optimizing IRQ handling using irqfd framework.
>
> Instead of handling the eventfds on user-side they are handled on
> kernel side using
> - the KVM irqfd framework,
> - the VFIO driver virqfd framework.
>
> the virtual IRQ completion is trapped at interrupt controller
> This removes the need for fast/slow path swap.
>
> Overall this brings significant performance improvements.
>
> it depends on host kernel KVM irqfd.
>
> Signed-off-by: Alvise Rigo <a.rigo@virtualopensystems.com>
> Signed-off-by: Eric Auger <eric.auger@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
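
To make the contrast in the commit message concrete, here is a rough, Linux-only sketch of what user-side eventfd handling involves: one side adds to the eventfd counter when the IRQ fires, and a userspace handler has to wake up and read the counter before it can inject anything. With irqfd that read happens in the kernel instead. The trigger_* names are hypothetical and none of this is code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a non-blocking eventfd standing in for the IRQ trigger. */
static int trigger_create(void)
{
    return eventfd(0, EFD_NONBLOCK);
}

/* Signalling side: add 1 to the eventfd counter. Returns 0 on success. */
static int trigger_signal(int fd)
{
    uint64_t one = 1;

    assert(fd >= 0);
    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * User-side handler: read the counter, which also resets it to zero.
 * Returns the number of signals since the last read, or 0 if none
 * are pending (the non-blocking read fails with EAGAIN in that case).
 */
static uint64_t trigger_consume(int fd)
{
    uint64_t count = 0;

    assert(fd >= 0);
    if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        return 0;
    }
    return count;
}
```

Two calls to trigger_signal() followed by one trigger_consume() yield 2, and a further trigger_consume() yields 0. Every such round trip through userspace is the cost the irqfd path is meant to avoid.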

>
> ---
> v5 -> v6
> - rely on kvm_irqfds_enabled() and kvm_resamplefds_enabled()
> - guard KVM code with #ifdef CONFIG_KVM
>
> v3 -> v4:
> [Alvise Rigo]
> Use of VFIO Platform driver v6 unmask/virqfd feature and removal
> of resamplefd handler. Physical IRQ unmasking is now done in
> VFIO driver.
>
> v3:
> [Eric Auger]
> initial support with resamplefd handled on QEMU side since the
> unmask was not supported on VFIO platform driver v5.
>
> Conflicts:
> 	hw/vfio/platform.c
> ---
>  hw/vfio/platform.c              | 104 +++++++++++++++++++++++++++++++++++++++-
>  include/hw/vfio/vfio-platform.h |   1 +
>  trace-events                    |   2 +
>  3 files changed, 106 insertions(+), 1 deletion(-)
>
> diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
> index 30798d8..cadc824 100644
> --- a/hw/vfio/platform.c
> +++ b/hw/vfio/platform.c
> @@ -26,6 +26,7 @@
>  #include "hw/sysbus.h"
>  #include "trace.h"
>  #include "hw/platform-bus.h"
> +#include "sysemu/kvm.h"
>  
>  static void vfio_intp_interrupt(VFIOINTp *intp);
>  typedef void (*eventfd_user_side_handler_t)(VFIOINTp *intp);
> @@ -237,6 +238,83 @@ static int vfio_start_eventfd_injection(VFIOINTp *intp)
>  }
>  
>  /*
> + * Functions used for irqfd
> + */
> +
> +#ifdef CONFIG_KVM
> +
> +/**
> + * vfio_set_resample_eventfd - sets the resamplefd for an IRQ
> + * @intp: the IRQ struct pointer
> + * programs the VFIO driver to unmask this IRQ when the
> + * intp->unmask eventfd is triggered
> + */
> +static int vfio_set_resample_eventfd(VFIOINTp *intp)
> +{
> +    VFIODevice *vbasedev = &intp->vdev->vbasedev;
> +    struct vfio_irq_set *irq_set;
> +    int argsz, ret;
> +    int32_t *pfd;
> +
> +    argsz = sizeof(*irq_set) + sizeof(*pfd);
> +    irq_set = g_malloc0(argsz);
> +    irq_set->argsz = argsz;
> +    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_UNMASK;
> +    irq_set->index = intp->pin;
> +    irq_set->start = 0;
> +    irq_set->count = 1;
> +    pfd = (int32_t *)&irq_set->data;
> +    *pfd = event_notifier_get_fd(&intp->unmask);
> +    qemu_set_fd_handler(*pfd, NULL, NULL, intp);
> +    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
> +    g_free(irq_set);
> +    if (ret < 0) {
> +        error_report("vfio: Failed to set resample eventfd: %m");
> +        qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
> +    }
> +    return ret;
> +}
> +
> +/**
> + * vfio_start_irqfd_injection - starts irqfd injection for an IRQ
> + * programs VFIO driver with both the trigger and resamplefd
> + * programs KVM with the gsi, trigger & resample eventfds
> + */
> +static int vfio_start_irqfd_injection(VFIOINTp *intp)
> +{
> +    struct kvm_irqfd irqfd = {
> +        .fd = event_notifier_get_fd(&intp->interrupt),
> +        .resamplefd = event_notifier_get_fd(&intp->unmask),
> +        .gsi = intp->virtualID,
> +        .flags = KVM_IRQFD_FLAG_RESAMPLE,
> +    };
> +
> +    if (kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd)) {
> +        error_report("vfio: Error: Failed to assign the irqfd: %m");
> +        goto fail_irqfd;
> +    }
> +    if (vfio_set_trigger_eventfd(intp, NULL) < 0) {
> +        goto fail_vfio;
> +    }
> +    if (vfio_set_resample_eventfd(intp) < 0) {
> +        goto fail_vfio;
> +    }
> +
> +    intp->kvm_accel = true;
> +    trace_vfio_platform_start_irqfd_injection(intp->pin, intp->virtualID,
> +                                     irqfd.fd, irqfd.resamplefd);
> +    return 0;
> +
> +fail_vfio:
> +    irqfd.flags = KVM_IRQFD_FLAG_DEASSIGN;
> +    kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd);
> +fail_irqfd:
> +    return -1;
> +}
> +
> +#endif
> +
> +/*
>   * Functions used whatever the injection method
>   */
>  
> @@ -315,6 +393,13 @@ static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev, unsigned int index)
>          error_report("vfio: Error: trigger event_notifier_init failed ");
>          return NULL;
>      }
> +    /* Get an eventfd for resample/unmask */
> +    ret = event_notifier_init(&intp->unmask, 0);
> +    if (ret) {
> +        g_free(intp);
> +        error_report("vfio: Error: resample event_notifier_init failed eoi");
> +        return NULL;
> +    }
>  
>      /* store the new intp in qlist */
>      QLIST_INSERT_HEAD(&vdev->intp_list, intp, next);
> @@ -409,7 +494,6 @@ static VFIODeviceOps vfio_platform_ops = {
>      .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
>      .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
>      .vfio_eoi = vfio_platform_eoi,
> -    .vfio_populate_device = vfio_populate_device,
>  };
>  
>  /**
> @@ -481,6 +565,13 @@ static int vfio_base_device_init(VFIODevice *vbasedev)
>          error_report("vfio: failed to get device %s", path);
>          vfio_put_group(group);
>      }
> +
> +    ret = vfio_populate_device(vbasedev);
> +    if (ret) {
> +        error_report("vfio: failed to populate device %s", path);
> +        vfio_put_group(group);
> +    }
> +
>      return ret;
>  }
>  
> @@ -536,7 +627,17 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
>  
>      vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
>      vbasedev->ops = &vfio_platform_ops;
> +
> +#ifdef CONFIG_KVM
> +    if (kvm_irqfds_enabled() && kvm_resamplefds_enabled() &&
> +        vdev->irqfd_allowed) {
> +        vdev->start_irq_fn = vfio_start_irqfd_injection;
> +    } else {
> +        vdev->start_irq_fn = vfio_start_eventfd_injection;
> +    }
> +#else
>      vdev->start_irq_fn = vfio_start_eventfd_injection;
> +#endif
>  
>      trace_vfio_platform_realize(vbasedev->name, vdev->compat);
>  
> @@ -614,6 +715,7 @@ static Property vfio_platform_dev_properties[] = {
>      DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
>      DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice,
>                         mmap_timeout, 1100),
> +    DEFINE_PROP_BOOL("x-irqfd", VFIOPlatformDevice, irqfd_allowed, true),
>      DEFINE_PROP_END_OF_LIST(),
>  };
>  
> diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
> index bd1206e..097448b 100644
> --- a/include/hw/vfio/vfio-platform.h
> +++ b/include/hw/vfio/vfio-platform.h
> @@ -58,6 +58,7 @@ typedef struct VFIOPlatformDevice {
>      QEMUTimer *mmap_timer; /* enable mmaps after periods w/o interrupts */
>      start_irq_fn_t start_irq_fn;
>      QemuMutex  intp_mutex;
> +    bool irqfd_allowed; /* debug option to force irqfd on/off */
>  } VFIOPlatformDevice;
>  
>  
> diff --git a/trace-events b/trace-events
> index d3685c9..7a6a6aa 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -1567,6 +1567,8 @@ vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d
>  vfio_platform_realize(char *name, char *compat) "vfio device %s, compat = %s"
>  vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING"
>  vfio_platform_eoi_handle_pending(int index) "handle PENDING IRQ %d"
> +vfio_platform_start_irqfd_injection(int index, int gsi, int fd, int resamplefd) "IRQ index=%d, gsi =%d, fd = %d, resamplefd = %d"
> +vfio_start_eventfd_injection(int index, int fd) "IRQ index=%d, fd = %d"
>  
>  #hw/acpi/memory_hotplug.c
>  mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32

-- 
Alex Bennée

^ permalink raw reply	[flat|nested] 24+ messages in thread


* Re: [Qemu-devel] [PATCH v10 2/7] hw/vfio/platform: vfio-platform skeleton
  2015-02-17 10:56     ` Alex Bennée
@ 2015-03-13  9:28       ` Eric Auger
  -1 siblings, 0 replies; 24+ messages in thread
From: Eric Auger @ 2015-03-13  9:28 UTC (permalink / raw)
  To: Alex Bennée
  Cc: peter.maydell, eric.auger, patches, Kim Phillips, qemu-devel,
	agraf, alex.williamson, pbonzini, b.reynal, feng.wu, kvmarm,
	christoffer.dall

Hi Alex,

Thank you for your review and the R-b on 5/7 & 7/7.

Please accept my apologies for the long delay in responding, mostly due to my
vacation :-(

I have taken all the comments below into account.

Best Regards

Eric
On 02/17/2015 11:56 AM, Alex Bennée wrote:
> 
> Eric Auger <eric.auger@linaro.org> writes:
> 
>> Minimal VFIO platform implementation supporting register space
>> user mapping but not IRQ assignment.
>>
>> Signed-off-by: Kim Phillips <kim.phillips@linaro.org>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> 
> See comments inline.
> 
> <snip>
>> +/**
>> + * vfio_populate_device - initialize MMIO region and IRQ
>> + * @vbasedev: the VFIO device
>> + *
>> + * query the VFIO device for exposed MMIO regions and IRQ and
>> + * populate the associated fields in the device struct
>> + */
>> +static int vfio_populate_device(VFIODevice *vbasedev)
>> +{
>> +    struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
> 
> This could be inside the for block.
> 
>> +    int i, ret = -1;
>> +    VFIOPlatformDevice *vdev =
>> +        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
>> +
>> +    if (!(vbasedev->flags & VFIO_DEVICE_FLAGS_PLATFORM)) {
>> +        error_report("vfio: Um, this isn't a platform device");
>> +        goto error;
>> +    }
>> +
>> +    vdev->regions = g_malloc0(sizeof(VFIORegion *) * vbasedev->num_regions);
> 
> I may have considered a g_malloc0_n but I see that's not actually used
> in the rest of the code (newer glib?).
> 
>> +
>> +    for (i = 0; i < vbasedev->num_regions; i++) {
>> +        vdev->regions[i] = g_malloc0(sizeof(VFIORegion));
> 
> An intermediate VFIORegion *ptr here would have saved a bunch of typing
> later on ;-) 
> 
>> +        reg_info.index = i;
>> +        ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, &reg_info);
>> +        if (ret) {
>> +            error_report("vfio: Error getting region %d info: %m", i);
>> +            goto error;
>> +        }
>> +        vdev->regions[i]->flags = reg_info.flags;
>> +        vdev->regions[i]->size = reg_info.size;
>> +        vdev->regions[i]->fd_offset = reg_info.offset;
>> +        vdev->regions[i]->nr = i;
>> +        vdev->regions[i]->vbasedev = vbasedev;
>> +
>> +        trace_vfio_platform_populate_regions(vdev->regions[i]->nr,
>> +                            (unsigned long)vdev->regions[i]->flags,
>> +                            (unsigned long)vdev->regions[i]->size,
>> +                            vdev->regions[i]->vbasedev->fd,
>> +                            (unsigned long)vdev->regions[i]->fd_offset);
>> +    }
>> +
>> +    return 0;
>> +error:
>> +    for (i = 0; i < vbasedev->num_regions; i++) {
>> +        g_free(vdev->regions[i]);
>> +    }
>> +    g_free(vdev->regions);
>> +    return ret;
>> +}
>> +
>> +/* specialized functions for VFIO Platform devices */
>> +static VFIODeviceOps vfio_platform_ops = {
>> +    .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
>> +    .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
>> +    .vfio_populate_device = vfio_populate_device,
>> +};
>> +
>> +/**
>> + * vfio_base_device_init - implements some of the VFIO mechanics
>> + * @vbasedev: the VFIO device
>> + *
>> + * retrieves the group the device belongs to and get the device fd
>> + * returns the VFIO device fd
>> + * precondition: the device name must be initialized
>> + */
>> +static int vfio_base_device_init(VFIODevice *vbasedev)
>> +{
>> +    VFIOGroup *group;
>> +    VFIODevice *vbasedev_iter;
>> +    char path[PATH_MAX], iommu_group_path[PATH_MAX], *group_name;
>> +    ssize_t len;
>> +    struct stat st;
>> +    int groupid;
>> +    int ret;
>> +
>> +    /* name must be set prior to the call */
>> +    if (!vbasedev->name) {
>> +        return -EINVAL;
>> +    }
>> +
>> +    /* Check that the host device exists */
>> +    snprintf(path, sizeof(path), "/sys/bus/platform/devices/%s/",
>> +             vbasedev->name);
>> +
>> +    if (stat(path, &st) < 0) {
>> +        error_report("vfio: error: no such host device: %s", path);
>> +        return -errno;
>> +    }
>> +
>> +    strncat(path, "iommu_group", sizeof(path) - strlen(path) - 1);
> 
> Consider g_strlcat which has nicer max length semantics.
> 
>> +    len = readlink(path, iommu_group_path, sizeof(path));
>> +    if (len <= 0 || len >= sizeof(path)) {
> 
> readlink should never report more than sizeof(path), although reaching
> that limit would indicate an ENAMETOOLONG.
> 
>> +        error_report("vfio: error no iommu_group for device");
>> +        return len < 0 ? -errno : ENAMETOOLONG;
>> +    }
>> +
>> +    iommu_group_path[len] = 0;
>> +    group_name = basename(iommu_group_path);
>> +
>> +    if (sscanf(group_name, "%d", &groupid) != 1) {
>> +        error_report("vfio: error reading %s: %m", path);
>> +        return -errno;
>> +    }
>> +
>> +    trace_vfio_platform_base_device_init(vbasedev->name, groupid);
>> +
>> +    group = vfio_get_group(groupid, &address_space_memory);
>> +    if (!group) {
>> +        error_report("vfio: failed to get group %d", groupid);
>> +        return -ENOENT;
>> +    }
>> +
>> +    snprintf(path, sizeof(path), "%s", vbasedev->name);
>> +
>> +    QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
>> +        if (strcmp(vbasedev_iter->name, vbasedev->name) == 0) {
>> +            error_report("vfio: error: device %s is already attached", path);
>> +            vfio_put_group(group);
>> +            return -EBUSY;
>> +        }
>> +    }
>> +    ret = vfio_get_device(group, path, vbasedev);
>> +    if (ret) {
>> +        error_report("vfio: failed to get device %s", path);
>> +        vfio_put_group(group);
>> +    }
>> +    return ret;
>> +}
>> +
>> +/**
>> + * vfio_map_region - initialize the 2 mr (mmapped on ops) for a
>> + * given index
>> + * @vdev: the VFIO platform device
>> + * @nr: the index of the region
>> + *
>> + * init the top memory region and the mmapped memory region beneath
>> + * VFIOPlatformDevice is used since VFIODevice is not a QOM Object
>> + * and could not be passed to memory region functions
>> +*/
>> +static void vfio_map_region(VFIOPlatformDevice *vdev, int nr)
>> +{
>> +    VFIORegion *region = vdev->regions[nr];
>> +    unsigned size = region->size;
>> +    char name[64];
>> +
>> +    if (!size) {
>> +        return;
>> +    }
>> +
>> +    snprintf(name, sizeof(name), "VFIO %s region %d",
>> +             vdev->vbasedev.name, nr);
>> +
>> +    /* A "slow" read/write mapping underlies all regions */
>> +    memory_region_init_io(&region->mem, OBJECT(vdev), &vfio_region_ops,
>> +                          region, name, size);
>> +
>> +    strncat(name, " mmap", sizeof(name) - strlen(name) - 1);
> 
> again consider g_strlcat
> 
>> +
>> +    if (vfio_mmap_region(OBJECT(vdev), region, &region->mem,
>> +                         &region->mmap_mem, &region->mmap, size, 0, name)) {
>> +        error_report("%s unsupported. Performance may be slow", name);
>> +    }
>> +}
>> +
>> +/**
>> + * vfio_platform_realize  - the device realize function
>> + * @dev: device state pointer
>> + * @errp: error
>> + *
>> + * initialize the device, its memory regions and IRQ structures
>> + * IRQ are started separately
>> + */
>> +static void vfio_platform_realize(DeviceState *dev, Error **errp)
>> +{
>> +    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
>> +    SysBusDevice *sbdev = SYS_BUS_DEVICE(dev);
>> +    VFIODevice *vbasedev = &vdev->vbasedev;
>> +    int i, ret;
>> +
>> +    vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
>> +    vbasedev->ops = &vfio_platform_ops;
>> +
>> +    trace_vfio_platform_realize(vbasedev->name, vdev->compat);
>> +
>> +    ret = vfio_base_device_init(vbasedev);
>> +    if (ret) {
>> +        error_setg(errp, "vfio: vfio_base_device_init failed for %s",
>> +                   vbasedev->name);
>> +        return;
>> +    }
>> +
>> +    for (i = 0; i < vbasedev->num_regions; i++) {
>> +        vfio_map_region(vdev, i);
>> +        sysbus_init_mmio(sbdev, &vdev->regions[i]->mem);
>> +    }
>> +}
>> +
>> +static const VMStateDescription vfio_platform_vmstate = {
>> +    .name = TYPE_VFIO_PLATFORM,
>> +    .unmigratable = 1,
>> +};
>> +
>> +static Property vfio_platform_dev_properties[] = {
>> +    DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
>> +    DEFINE_PROP_END_OF_LIST(),
>> +};
>> +
>> +static void vfio_platform_class_init(ObjectClass *klass, void *data)
>> +{
>> +    DeviceClass *dc = DEVICE_CLASS(klass);
>> +
>> +    dc->realize = vfio_platform_realize;
>> +    dc->props = vfio_platform_dev_properties;
>> +    dc->vmsd = &vfio_platform_vmstate;
>> +    dc->desc = "VFIO-based platform device assignment";
>> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
>> +}
>> +
>> +static const TypeInfo vfio_platform_dev_info = {
>> +    .name = TYPE_VFIO_PLATFORM,
>> +    .parent = TYPE_SYS_BUS_DEVICE,
>> +    .instance_size = sizeof(VFIOPlatformDevice),
>> +    .class_init = vfio_platform_class_init,
>> +    .class_size = sizeof(VFIOPlatformDeviceClass),
>> +    .abstract   = true,
>> +};
>> +
>> +static void register_vfio_platform_dev_type(void)
>> +{
>> +    type_register_static(&vfio_platform_dev_info);
>> +}
>> +
>> +type_init(register_vfio_platform_dev_type)
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>> index 5f3679b..2d1d8b3 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -43,6 +43,7 @@
>>  
>>  enum {
>>      VFIO_DEVICE_TYPE_PCI = 0,
>> +    VFIO_DEVICE_TYPE_PLATFORM = 1,
>>  };
>>  
>>  typedef struct VFIORegion {
>> diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
>> new file mode 100644
>> index 0000000..338f0c6
>> --- /dev/null
>> +++ b/include/hw/vfio/vfio-platform.h
>> @@ -0,0 +1,44 @@
>> +/*
>> + * vfio based device assignment support - platform devices
>> + *
>> + * Copyright Linaro Limited, 2014
>> + *
>> + * Authors:
>> + *  Kim Phillips <kim.phillips@linaro.org>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>> + * the COPYING file in the top-level directory.
>> + *
>> + * Based on vfio based PCI device assignment support:
>> + *  Copyright Red Hat, Inc. 2012
>> + */
>> +
>> +#ifndef HW_VFIO_VFIO_PLATFORM_H
>> +#define HW_VFIO_VFIO_PLATFORM_H
>> +
>> +#include "hw/sysbus.h"
>> +#include "hw/vfio/vfio-common.h"
>> +
>> +#define TYPE_VFIO_PLATFORM "vfio-platform"
>> +
>> +typedef struct VFIOPlatformDevice {
>> +    SysBusDevice sbdev;
>> +    VFIODevice vbasedev; /* not a QOM object */
>> +    VFIORegion **regions;
>> +    char *compat; /* compatibility string */
>> +} VFIOPlatformDevice;
>> +
>> +typedef struct VFIOPlatformDeviceClass {
>> +    /*< private >*/
>> +    SysBusDeviceClass parent_class;
>> +    /*< public >*/
>> +} VFIOPlatformDeviceClass;
>> +
>> +#define VFIO_PLATFORM_DEVICE(obj) \
>> +     OBJECT_CHECK(VFIOPlatformDevice, (obj), TYPE_VFIO_PLATFORM)
>> +#define VFIO_PLATFORM_DEVICE_CLASS(klass) \
>> +     OBJECT_CLASS_CHECK(VFIOPlatformDeviceClass, (klass), TYPE_VFIO_PLATFORM)
>> +#define VFIO_PLATFORM_DEVICE_GET_CLASS(obj) \
>> +     OBJECT_GET_CLASS(VFIOPlatformDeviceClass, (obj), TYPE_VFIO_PLATFORM)
>> +
>> +#endif /*HW_VFIO_VFIO_PLATFORM_H*/
>> diff --git a/trace-events b/trace-events
>> index f87b077..d3685c9 100644
>> --- a/trace-events
>> +++ b/trace-events
>> @@ -1556,6 +1556,18 @@ vfio_put_group(int fd) "close group->fd=%d"
>>  vfio_get_device(const char * name, unsigned int flags, unsigned int num_regions, unsigned int num_irqs) "Device %s flags: %u, regions: %u, irqs: %u"
>>  vfio_put_base_device(int fd) "close vdev->fd=%d"
>>  
>> +# hw/vfio/platform.c
>> +vfio_platform_eoi(int pin, int fd) "EOI IRQ pin %d (fd=%d)"
>> +vfio_platform_mmap_set_enabled(bool enabled) "fast path = %d"
>> +vfio_platform_intp_mmap_enable(int pin) "IRQ #%d still active, stay in slow path"
>> +vfio_platform_intp_interrupt(int pin, int fd) "Handle IRQ #%d (fd = %d)"
>> +vfio_platform_populate_interrupts(int pin, int count, int flags) "- IRQ index %d: count %d, flags=0x%x"
>> +vfio_platform_populate_regions(int region_index, unsigned long flag, unsigned long size, int fd, unsigned long offset) "- region %d flags = 0x%lx, size = 0x%lx, fd= %d, offset = 0x%lx"
>> +vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
>> +vfio_platform_realize(char *name, char *compat) "vfio device %s, compat = %s"
>> +vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING"
>> +vfio_platform_eoi_handle_pending(int index) "handle PENDING IRQ %d"
>> +
>>  #hw/acpi/memory_hotplug.c
>>  mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32
>>  mhp_acpi_read_addr_lo(uint32_t slot, uint32_t addr) "slot[0x%"PRIx32"] addr lo: 0x%"PRIx32
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v10 2/7] hw/vfio/platform: vfio-platform skeleton
@ 2015-03-13  9:28       ` Eric Auger
  0 siblings, 0 replies; 24+ messages in thread
From: Eric Auger @ 2015-03-13  9:28 UTC (permalink / raw)
  To: Alex Bennée
  Cc: eric.auger, patches, Kim Phillips, qemu-devel, alex.williamson,
	pbonzini, feng.wu, kvmarm

Hi Alex,

Thank you for your review and the R-b on 5/7 & 7/7.

Please accept my apologies for the long delay in responding, mostly due to my
vacation :-(

I have taken all the comments below into account.

Best Regards

Eric
On 02/17/2015 11:56 AM, Alex Bennée wrote:
> 
> Eric Auger <eric.auger@linaro.org> writes:
> 
>> Minimal VFIO platform implementation supporting register space
>> user mapping but not IRQ assignment.
>>
>> Signed-off-by: Kim Phillips <kim.phillips@linaro.org>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> 
> See comments inline.
> 
> <snip>
>> +/**
>> + * vfio_populate_device - initialize MMIO region and IRQ
>> + * @vbasedev: the VFIO device
>> + *
>> + * query the VFIO device for exposed MMIO regions and IRQ and
>> + * populate the associated fields in the device struct
>> + */
>> +static int vfio_populate_device(VFIODevice *vbasedev)
>> +{
>> +    struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
> 
> This could be inside the for block.
> 
>> +    int i, ret = -1;
>> +    VFIOPlatformDevice *vdev =
>> +        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
>> +
>> +    if (!(vbasedev->flags & VFIO_DEVICE_FLAGS_PLATFORM)) {
>> +        error_report("vfio: Um, this isn't a platform device");
>> +        goto error;
>> +    }
>> +
>> +    vdev->regions = g_malloc0(sizeof(VFIORegion *) * vbasedev->num_regions);
> 
> I may have considered a g_malloc0_n but I see that's not actually used
> in the rest of the code (newer glib?).
> 
>> +
>> +    for (i = 0; i < vbasedev->num_regions; i++) {
>> +        vdev->regions[i] = g_malloc0(sizeof(VFIORegion));
> 
> An intermediate VFIORegion *ptr here would have saved a bunch of typing
> later on ;-) 
> 
>> +        reg_info.index = i;
>> +        ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, &reg_info);
>> +        if (ret) {
>> +            error_report("vfio: Error getting region %d info: %m", i);
>> +            goto error;
>> +        }
>> +        vdev->regions[i]->flags = reg_info.flags;
>> +        vdev->regions[i]->size = reg_info.size;
>> +        vdev->regions[i]->fd_offset = reg_info.offset;
>> +        vdev->regions[i]->nr = i;
>> +        vdev->regions[i]->vbasedev = vbasedev;
>> +
>> +        trace_vfio_platform_populate_regions(vdev->regions[i]->nr,
>> +                            (unsigned long)vdev->regions[i]->flags,
>> +                            (unsigned long)vdev->regions[i]->size,
>> +                            vdev->regions[i]->vbasedev->fd,
>> +                            (unsigned long)vdev->regions[i]->fd_offset);
>> +    }
>> +
>> +    return 0;
>> +error:
>> +    for (i = 0; i < vbasedev->num_regions; i++) {
>> +        g_free(vdev->regions[i]);
>> +    }
>> +    g_free(vdev->regions);
>> +    return ret;
>> +}
>> +
>> +/* specialized functions for VFIO Platform devices */
>> +static VFIODeviceOps vfio_platform_ops = {
>> +    .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
>> +    .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
>> +    .vfio_populate_device = vfio_populate_device,
>> +};
>> +
>> +/**
>> + * vfio_base_device_init - implements some of the VFIO mechanics
>> + * @vbasedev: the VFIO device
>> + *
>> + * retrieves the group the device belongs to and get the device fd
>> + * returns the VFIO device fd
>> + * precondition: the device name must be initialized
>> + */
>> +static int vfio_base_device_init(VFIODevice *vbasedev)
>> +{
>> +    VFIOGroup *group;
>> +    VFIODevice *vbasedev_iter;
>> +    char path[PATH_MAX], iommu_group_path[PATH_MAX], *group_name;
>> +    ssize_t len;
>> +    struct stat st;
>> +    int groupid;
>> +    int ret;
>> +
>> +    /* name must be set prior to the call */
>> +    if (!vbasedev->name) {
>> +        return -EINVAL;
>> +    }
>> +
>> +    /* Check that the host device exists */
>> +    snprintf(path, sizeof(path), "/sys/bus/platform/devices/%s/",
>> +             vbasedev->name);
>> +
>> +    if (stat(path, &st) < 0) {
>> +        error_report("vfio: error: no such host device: %s", path);
>> +        return -errno;
>> +    }
>> +
>> +    strncat(path, "iommu_group", sizeof(path) - strlen(path) - 1);
> 
> Consider g_strlcat which has nicer max length semantics.
> 
>> +    len = readlink(path, iommu_group_path, sizeof(path));
>> +    if (len <= 0 || len >= sizeof(path)) {
> 
> readlink should never report more than sizeof(path) although that will
> indicate a ENAMETOOLONG.
> 
>> +        error_report("vfio: error no iommu_group for device");
>> +        return len < 0 ? -errno : ENAMETOOLONG;
>> +    }
>> +
>> +    iommu_group_path[len] = 0;
>> +    group_name = basename(iommu_group_path);
>> +
>> +    if (sscanf(group_name, "%d", &groupid) != 1) {
>> +        error_report("vfio: error reading %s: %m", path);
>> +        return -errno;
>> +    }
>> +
>> +    trace_vfio_platform_base_device_init(vbasedev->name, groupid);
>> +
>> +    group = vfio_get_group(groupid, &address_space_memory);
>> +    if (!group) {
>> +        error_report("vfio: failed to get group %d", groupid);
>> +        return -ENOENT;
>> +    }
>> +
>> +    snprintf(path, sizeof(path), "%s", vbasedev->name);
>> +
>> +    QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
>> +        if (strcmp(vbasedev_iter->name, vbasedev->name) == 0) {
>> +            error_report("vfio: error: device %s is already attached", path);
>> +            vfio_put_group(group);
>> +            return -EBUSY;
>> +        }
>> +    }
>> +    ret = vfio_get_device(group, path, vbasedev);
>> +    if (ret) {
>> +        error_report("vfio: failed to get device %s", path);
>> +        vfio_put_group(group);
>> +    }
>> +    return ret;
>> +}
>> +
>> +/**
>> + * vfio_map_region - initialize the 2 mr (mmapped on ops) for a
>> + * given index
>> + * @vdev: the VFIO platform device
>> + * @nr: the index of the region
>> + *
>> + * init the top memory region and the mmapped memory region beneath
>> + * VFIOPlatformDevice is used since VFIODevice is not a QOM Object
>> + * and could not be passed to memory region functions
>> +*/
>> +static void vfio_map_region(VFIOPlatformDevice *vdev, int nr)
>> +{
>> +    VFIORegion *region = vdev->regions[nr];
>> +    unsigned size = region->size;
>> +    char name[64];
>> +
>> +    if (!size) {
>> +        return;
>> +    }
>> +
>> +    snprintf(name, sizeof(name), "VFIO %s region %d",
>> +             vdev->vbasedev.name, nr);
>> +
>> +    /* A "slow" read/write mapping underlies all regions */
>> +    memory_region_init_io(&region->mem, OBJECT(vdev), &vfio_region_ops,
>> +                          region, name, size);
>> +
>> +    strncat(name, " mmap", sizeof(name) - strlen(name) - 1);
> 
> again consider g_strlcat
> 
>> +
>> +    if (vfio_mmap_region(OBJECT(vdev), region, &region->mem,
>> +                         &region->mmap_mem, &region->mmap, size, 0, name)) {
>> +        error_report("%s unsupported. Performance may be slow", name);
>> +    }
>> +}
>> +
>> +/**
>> + * vfio_platform_realize  - the device realize function
>> + * @dev: device state pointer
>> + * @errp: error
>> + *
>> + * initialize the device, its memory regions and IRQ structures
>> + * IRQ are started separately
>> + */
>> +static void vfio_platform_realize(DeviceState *dev, Error **errp)
>> +{
>> +    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
>> +    SysBusDevice *sbdev = SYS_BUS_DEVICE(dev);
>> +    VFIODevice *vbasedev = &vdev->vbasedev;
>> +    int i, ret;
>> +
>> +    vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
>> +    vbasedev->ops = &vfio_platform_ops;
>> +
>> +    trace_vfio_platform_realize(vbasedev->name, vdev->compat);
>> +
>> +    ret = vfio_base_device_init(vbasedev);
>> +    if (ret) {
>> +        error_setg(errp, "vfio: vfio_base_device_init failed for %s",
>> +                   vbasedev->name);
>> +        return;
>> +    }
>> +
>> +    for (i = 0; i < vbasedev->num_regions; i++) {
>> +        vfio_map_region(vdev, i);
>> +        sysbus_init_mmio(sbdev, &vdev->regions[i]->mem);
>> +    }
>> +}
>> +
>> +static const VMStateDescription vfio_platform_vmstate = {
>> +    .name = TYPE_VFIO_PLATFORM,
>> +    .unmigratable = 1,
>> +};
>> +
>> +static Property vfio_platform_dev_properties[] = {
>> +    DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
>> +    DEFINE_PROP_END_OF_LIST(),
>> +};
>> +
>> +static void vfio_platform_class_init(ObjectClass *klass, void *data)
>> +{
>> +    DeviceClass *dc = DEVICE_CLASS(klass);
>> +
>> +    dc->realize = vfio_platform_realize;
>> +    dc->props = vfio_platform_dev_properties;
>> +    dc->vmsd = &vfio_platform_vmstate;
>> +    dc->desc = "VFIO-based platform device assignment";
>> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
>> +}
>> +
>> +static const TypeInfo vfio_platform_dev_info = {
>> +    .name = TYPE_VFIO_PLATFORM,
>> +    .parent = TYPE_SYS_BUS_DEVICE,
>> +    .instance_size = sizeof(VFIOPlatformDevice),
>> +    .class_init = vfio_platform_class_init,
>> +    .class_size = sizeof(VFIOPlatformDeviceClass),
>> +    .abstract   = true,
>> +};
>> +
>> +static void register_vfio_platform_dev_type(void)
>> +{
>> +    type_register_static(&vfio_platform_dev_info);
>> +}
>> +
>> +type_init(register_vfio_platform_dev_type)
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>> index 5f3679b..2d1d8b3 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -43,6 +43,7 @@
>>  
>>  enum {
>>      VFIO_DEVICE_TYPE_PCI = 0,
>> +    VFIO_DEVICE_TYPE_PLATFORM = 1,
>>  };
>>  
>>  typedef struct VFIORegion {
>> diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
>> new file mode 100644
>> index 0000000..338f0c6
>> --- /dev/null
>> +++ b/include/hw/vfio/vfio-platform.h
>> @@ -0,0 +1,44 @@
>> +/*
>> + * vfio based device assignment support - platform devices
>> + *
>> + * Copyright Linaro Limited, 2014
>> + *
>> + * Authors:
>> + *  Kim Phillips <kim.phillips@linaro.org>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>> + * the COPYING file in the top-level directory.
>> + *
>> + * Based on vfio based PCI device assignment support:
>> + *  Copyright Red Hat, Inc. 2012
>> + */
>> +
>> +#ifndef HW_VFIO_VFIO_PLATFORM_H
>> +#define HW_VFIO_VFIO_PLATFORM_H
>> +
>> +#include "hw/sysbus.h"
>> +#include "hw/vfio/vfio-common.h"
>> +
>> +#define TYPE_VFIO_PLATFORM "vfio-platform"
>> +
>> +typedef struct VFIOPlatformDevice {
>> +    SysBusDevice sbdev;
>> +    VFIODevice vbasedev; /* not a QOM object */
>> +    VFIORegion **regions;
>> +    char *compat; /* compatibility string */
>> +} VFIOPlatformDevice;
>> +
>> +typedef struct VFIOPlatformDeviceClass {
>> +    /*< private >*/
>> +    SysBusDeviceClass parent_class;
>> +    /*< public >*/
>> +} VFIOPlatformDeviceClass;
>> +
>> +#define VFIO_PLATFORM_DEVICE(obj) \
>> +     OBJECT_CHECK(VFIOPlatformDevice, (obj), TYPE_VFIO_PLATFORM)
>> +#define VFIO_PLATFORM_DEVICE_CLASS(klass) \
>> +     OBJECT_CLASS_CHECK(VFIOPlatformDeviceClass, (klass), TYPE_VFIO_PLATFORM)
>> +#define VFIO_PLATFORM_DEVICE_GET_CLASS(obj) \
>> +     OBJECT_GET_CLASS(VFIOPlatformDeviceClass, (obj), TYPE_VFIO_PLATFORM)
>> +
>> +#endif /*HW_VFIO_VFIO_PLATFORM_H*/
>> diff --git a/trace-events b/trace-events
>> index f87b077..d3685c9 100644
>> --- a/trace-events
>> +++ b/trace-events
>> @@ -1556,6 +1556,18 @@ vfio_put_group(int fd) "close group->fd=%d"
>>  vfio_get_device(const char * name, unsigned int flags, unsigned int num_regions, unsigned int num_irqs) "Device %s flags: %u, regions: %u, irqs: %u"
>>  vfio_put_base_device(int fd) "close vdev->fd=%d"
>>  
>> +# hw/vfio/platform.c
>> +vfio_platform_eoi(int pin, int fd) "EOI IRQ pin %d (fd=%d)"
>> +vfio_platform_mmap_set_enabled(bool enabled) "fast path = %d"
>> +vfio_platform_intp_mmap_enable(int pin) "IRQ #%d still active, stay in slow path"
>> +vfio_platform_intp_interrupt(int pin, int fd) "Handle IRQ #%d (fd = %d)"
>> +vfio_platform_populate_interrupts(int pin, int count, int flags) "- IRQ index %d: count %d, flags=0x%x"
>> +vfio_platform_populate_regions(int region_index, unsigned long flag, unsigned long size, int fd, unsigned long offset) "- region %d flags = 0x%lx, size = 0x%lx, fd= %d, offset = 0x%lx"
>> +vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
>> +vfio_platform_realize(char *name, char *compat) "vfio device %s, compat = %s"
>> +vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING"
>> +vfio_platform_eoi_handle_pending(int index) "handle PENDING IRQ %d"
>> +
>>  #hw/acpi/memory_hotplug.c
>>  mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32
>>  mhp_acpi_read_addr_lo(uint32_t slot, uint32_t addr) "slot[0x%"PRIx32"] addr lo: 0x%"PRIx32
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v10 6/7] hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation
  2015-02-17 11:36     ` Alex Bennée
@ 2015-03-13  9:33       ` Eric Auger
  -1 siblings, 0 replies; 24+ messages in thread
From: Eric Auger @ 2015-03-13  9:33 UTC (permalink / raw)
  To: Alex Bennée
  Cc: peter.maydell, eric.auger, patches, qemu-devel, agraf,
	alex.williamson, pbonzini, b.reynal, feng.wu, kvmarm,
	christoffer.dall

Alex,
On 02/17/2015 12:36 PM, Alex Bennée wrote:
> 
> Eric Auger <eric.auger@linaro.org> writes:
> 
>> vfio-calxeda-xgmac now can be instantiated using the -device option.
>> The node creation function generates a very basic dt node composed
>> of the compat, reg and interrupts properties
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>> v8 -> v9:
>> - properly free resources in case of errors in
>>   add_calxeda_midway_xgmac_fdt_node
>>
>> v7 -> v8:
>> - move the add_fdt_node_functions array declaration between the device
>>   specific code and the generic code to avoid forward declarations of
>>   device specific functions
>> - rename add_basic_vfio_fdt_node into
>>   add_calxeda_midway_xgmac_fdt_node
>>
>> v6 -> v7:
>> - compat string re-formatting removed since compat string is not exposed
>>   anymore as a user option
>> - VFIO IRQ kick-off removed from sysbus-fdt and moved to VFIO platform
>>   device
>> ---
>>  hw/arm/sysbus-fdt.c | 83 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 83 insertions(+)
>>
>> diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
>> index 3038b94..d4f97f5 100644
>> --- a/hw/arm/sysbus-fdt.c
>> +++ b/hw/arm/sysbus-fdt.c
>> @@ -26,6 +26,8 @@
>>  #include "sysemu/device_tree.h"
>>  #include "hw/platform-bus.h"
>>  #include "sysemu/sysemu.h"
>> +#include "hw/vfio/vfio-platform.h"
>> +#include "hw/vfio/vfio-calxeda-xgmac.h"
>>  
>>  /*
>>   * internal struct that contains the information to create dynamic
>> @@ -53,11 +55,92 @@ typedef struct NodeCreationPair {
>>      int (*add_fdt_node_fn)(SysBusDevice *sbdev, void *opaque);
>>  } NodeCreationPair;
>>  
>> +/* Device Specific Code */
>> +
>> +/**
>> + * add_calxeda_midway_xgmac_fdt_node
>> + *
>> + * Generates a very simple node with following properties:
>> + * compatible string, regs, interrupts
>> + */
>> +static int add_calxeda_midway_xgmac_fdt_node(SysBusDevice *sbdev, void *opaque)
>> +{
>> +    PlatformBusFDTData *data = opaque;
>> +    PlatformBusDevice *pbus = data->pbus;
>> +    void *fdt = data->fdt;
>> +    const char *parent_node = data->pbus_node_name;
>> +    int compat_str_len;
>> +    char *nodename;
>> +    int i, ret = -1;
>> +    uint32_t *irq_attr;
>> +    uint64_t *reg_attr;
>> +    uint64_t mmio_base;
>> +    uint64_t irq_number;
>> +    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
>> +    VFIODevice *vbasedev = &vdev->vbasedev;
>> +    Object *obj = OBJECT(sbdev);
>> +
>> +    mmio_base = object_property_get_int(obj, "mmio[0]", NULL);
>> +
>> +    nodename = g_strdup_printf("%s/%s@%" PRIx64, parent_node,
>> +                               vbasedev->name,
>> +                               mmio_base);
>> +
>> +    qemu_fdt_add_subnode(fdt, nodename);
>> +
>> +    compat_str_len = strlen(vdev->compat) + 1;
>> +    qemu_fdt_setprop(fdt, nodename, "compatible",
>> +                          vdev->compat, compat_str_len);
>> +
>> +    reg_attr = g_new(uint64_t, vbasedev->num_regions*4);
>> +
>> +    for (i = 0; i < vbasedev->num_regions; i++) {
>> +        mmio_base = platform_bus_get_mmio_addr(pbus, sbdev, i);
>> +        reg_attr[4*i] = 1;
>> +        reg_attr[4*i+1] = mmio_base;
>> +        reg_attr[4*i+2] = 1;
>> +        reg_attr[4*i+3] = memory_region_size(&vdev->regions[i]->mem);
>> +    }
>> +
>> +    ret = qemu_fdt_setprop_sized_cells_from_array(fdt, nodename, "reg",
>> +                     vbasedev->num_regions*2, reg_attr);
> 
> Could we use qemu_fdt_setprop_sized_cells() like everyone else to hide
> the uglyness of the _from_array?

Since I need to handle 'num_regions' regions, I need to set up the
array programmatically. I could use qemu_fdt_setprop instead, since
the cell size is always 1, but I currently do not see how I could easily
benefit from qemu_fdt_setprop_sized_cells.
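
To illustrate what "programmatically" means here, below is a minimal
standalone sketch that packs a variable number of regions into a flat
"reg" payload, assuming one address cell and one size cell per region.
It does not use the QEMU device tree helpers; struct region, put_be32
and build_reg_cells are hypothetical names chosen for the example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical region description; stands in for the VFIO region data. */
struct region {
    uint64_t base;
    uint64_t size;
};

/* Store a 32-bit value in big-endian byte order, as device tree cells are. */
static void put_be32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v >> 24);
    dst[1] = (uint8_t)(v >> 16);
    dst[2] = (uint8_t)(v >> 8);
    dst[3] = (uint8_t)v;
}

/*
 * Build a "reg" payload for n regions, one address cell and one size
 * cell each. Returns a malloc'ed buffer of *len bytes, or NULL if a
 * value does not fit in a single 32-bit cell or the allocation fails.
 */
static uint8_t *build_reg_cells(const struct region *r, size_t n, size_t *len)
{
    size_t bytes = n * 2 * sizeof(uint32_t);
    uint8_t *buf = malloc(bytes ? bytes : 1);
    size_t i;

    if (!buf) {
        return NULL;
    }
    for (i = 0; i < n; i++) {
        if (r[i].base > UINT32_MAX || r[i].size > UINT32_MAX) {
            free(buf);
            return NULL;
        }
        put_be32(buf + 8 * i, (uint32_t)r[i].base);
        put_be32(buf + 8 * i + 4, (uint32_t)r[i].size);
    }
    *len = bytes;
    return buf;
}
```

A buffer built this way could be handed to a plain property setter
together with its byte length, which is the alternative mentioned above.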
> 
>> +    if (ret) {
>> +        error_report("could not set reg property of node %s", nodename);
>> +        goto fail_reg;
>> +    }
>> +
>> +    irq_attr = g_new(uint32_t, vbasedev->num_irqs*3);
>> +
>> +    for (i = 0; i < vbasedev->num_irqs; i++) {
>> +        irq_number = platform_bus_get_irqn(pbus, sbdev , i)
>> +                         + data->irq_start;
>> +        irq_attr[3*i] = cpu_to_be32(0);
>> +        irq_attr[3*i+1] = cpu_to_be32(irq_number);
>> +        irq_attr[3*i+2] = cpu_to_be32(0x4);
>> +    }
>> +
>> +   ret = qemu_fdt_setprop(fdt, nodename, "interrupts",
>> +                     irq_attr, vbasedev->num_irqs*3*sizeof(uint32_t));
> 
> Ditto.
Here too I need to handle num_irqs, but maybe I misunderstand your
comment.

Best Regards

Eric
> 
>> +    if (ret) {
>> +        error_report("could not set interrupts property of node %s",
>> +                     nodename);
>> +    }
>> +
>> +    g_free(irq_attr);
>> +fail_reg:
>> +    g_free(reg_attr);
>> +    g_free(nodename);
>> +    return ret;
>> +}
>> +
>>  /* list of supported dynamic sysbus devices */
>>  static const NodeCreationPair add_fdt_node_functions[] = {
>> +    {TYPE_VFIO_CALXEDA_XGMAC, add_calxeda_midway_xgmac_fdt_node},
>>      {"", NULL}, /* last element */
>>  };
>>  
>> +/* Generic Code */
>> +
>>  /**
>>   * add_fdt_node - add the device tree node of a dynamic sysbus device
>>   *
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Qemu-devel] [PATCH v10 3/7] hw/vfio/platform: add irq assignment
  2015-02-17 11:24     ` Alex Bennée
@ 2015-03-19 10:18       ` Eric Auger
  -1 siblings, 0 replies; 24+ messages in thread
From: Eric Auger @ 2015-03-19 10:18 UTC (permalink / raw)
  To: Alex Bennée
  Cc: peter.maydell, eric.auger, patches, qemu-devel, agraf,
	alex.williamson, pbonzini, b.reynal, feng.wu, kvmarm,
	christoffer.dall

Hi Alex,
On 02/17/2015 12:24 PM, Alex Bennée wrote:
> 
> Eric Auger <eric.auger@linaro.org> writes:
> 
>> This patch adds the code requested to assign interrupts to
>> a guest. The interrupts are mediated through user handled
>> eventfds only.
>>
>> The mechanics to start the IRQ handling are not yet there, though.
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> 
> See comments inline.
> 
>>
>> ---
>>
>> v8 -> v9:
>> - free irq related resources in case of error in vfio_populate_device
>> ---
>>  hw/vfio/platform.c              | 319 ++++++++++++++++++++++++++++++++++++++++
>>  include/hw/vfio/vfio-platform.h |  33 +++++
>>  2 files changed, 352 insertions(+)
>>
>> diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
>> index caadb92..b85ad6c 100644
>> --- a/hw/vfio/platform.c
>> +++ b/hw/vfio/platform.c
>> @@ -22,10 +22,259 @@
>>  #include "qemu/range.h"
>>  #include "sysemu/sysemu.h"
>>  #include "exec/memory.h"
>> +#include "qemu/queue.h"
>>  #include "hw/sysbus.h"
>>  #include "trace.h"
>>  #include "hw/platform-bus.h"
>>  
>> +static void vfio_intp_interrupt(VFIOINTp *intp);
>> +typedef void (*eventfd_user_side_handler_t)(VFIOINTp *intp);
>> +static int vfio_set_trigger_eventfd(VFIOINTp *intp,
>> +                                    eventfd_user_side_handler_t handler);
>> +
>> +/*
>> + * Functions only used when eventfd are handled on user-side
>> + * ie. without irqfd
>> + */
>> +
>> +/**
>> + * vfio_platform_eoi - IRQ completion routine
>> + * @vbasedev: the VFIO device
>> + *
>> + * de-asserts the active virtual IRQ and unmask the physical IRQ
>> + * (masked by the  VFIO driver). Handle pending IRQs if any.
>> + * eoi function is called on the first access to any MMIO region
>> + * after an IRQ was triggered. It is assumed this access corresponds
>> + * to the IRQ status register reset. With such a mechanism, a single
>> + * IRQ can be handled at a time since there is no way to know which
>> + * IRQ was completed by the guest (we would need additional details
>> + * about the IRQ status register mask)
>> + */
>> +static void vfio_platform_eoi(VFIODevice *vbasedev)
>> +{
>> +    VFIOINTp *intp;
>> +    VFIOPlatformDevice *vdev =
>> +        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
>> +
>> +    qemu_mutex_lock(&vdev->intp_mutex);
>> +    QLIST_FOREACH(intp, &vdev->intp_list, next) {
>> +        if (intp->state == VFIO_IRQ_ACTIVE) {
>> +            trace_vfio_platform_eoi(intp->pin,
>> +                                event_notifier_get_fd(&intp->interrupt));
>> +            intp->state = VFIO_IRQ_INACTIVE;
>> +
>> +            /* deassert the virtual IRQ and unmask physical one */
>> +            qemu_set_irq(intp->qemuirq, 0);
>> +            vfio_unmask_single_irqindex(vbasedev, intp->pin);
>> +
>> +            /* a single IRQ can be active at a time */
>> +            break;
>> +        }
>> +    }
>> +    /* in case there are pending IRQs, handle them one at a time */
>> +    if (!QSIMPLEQ_EMPTY(&vdev->pending_intp_queue)) {
>> +        intp = QSIMPLEQ_FIRST(&vdev->pending_intp_queue);
>> +        trace_vfio_platform_eoi_handle_pending(intp->pin);
>> +        qemu_mutex_unlock(&vdev->intp_mutex);
>> +        vfio_intp_interrupt(intp);
>> +        qemu_mutex_lock(&vdev->intp_mutex);
>> +        QSIMPLEQ_REMOVE_HEAD(&vdev->pending_intp_queue, pqnext);
>> +        qemu_mutex_unlock(&vdev->intp_mutex);
> 
> This locking is way too ugly. If the intp lock is protecting the
> structures then releasing it so the child function can grab it again is
> just asking for races to happen. Perhaps vfio_intp_interrupt can be
> split to have a _lockheld variant that can be used here and the other
> version do the locking before calling the _lockheld function.
The mutex aims to protect the state of the IRQs stored in intp_list.
I refactored the code as you advised.
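
For reference, a minimal standalone sketch of the _lockheld split
suggested above, using plain pthreads rather than the QEMU mutex API.
All names (struct dev, irq_inject, irq_inject_lockheld, irq_eoi) are
hypothetical and the IRQ model is simplified: it assumes an IRQ does
not fire again while it is pending or active, since it stays masked.

```c
#include <pthread.h>

#define NIRQ 4

enum irq_state { IRQ_INACTIVE, IRQ_PENDING, IRQ_ACTIVE };

struct dev {
    pthread_mutex_t lock;        /* protects state[] and the queue */
    enum irq_state state[NIRQ];
    int queue[NIRQ];             /* FIFO of pending IRQ indexes */
    int head, count;
};

/* Caller must hold d->lock. */
static void irq_inject_lockheld(struct dev *d, int irq)
{
    int i;

    if (d->state[irq] == IRQ_INACTIVE) {
        for (i = 0; i < NIRQ; i++) {
            if (d->state[i] != IRQ_INACTIVE) {
                /* another IRQ is in flight: queue this one */
                d->state[irq] = IRQ_PENDING;
                d->queue[(d->head + d->count++) % NIRQ] = irq;
                return;
            }
        }
    }
    /* either nothing is in flight, or a pending IRQ is being promoted */
    d->state[irq] = IRQ_ACTIVE;
}

/* Entry point for event handlers: takes the lock itself. */
static void irq_inject(struct dev *d, int irq)
{
    pthread_mutex_lock(&d->lock);
    irq_inject_lockheld(d, irq);
    pthread_mutex_unlock(&d->lock);
}

/* EOI: retire the active IRQ and promote one pending IRQ in one critical section. */
static void irq_eoi(struct dev *d)
{
    int i;

    pthread_mutex_lock(&d->lock);
    for (i = 0; i < NIRQ; i++) {
        if (d->state[i] == IRQ_ACTIVE) {
            d->state[i] = IRQ_INACTIVE;
            break;
        }
    }
    if (d->count > 0) {
        int irq = d->queue[d->head];

        d->head = (d->head + 1) % NIRQ;
        d->count--;
        irq_inject_lockheld(d, irq);
    }
    pthread_mutex_unlock(&d->lock);
}
```

The point of the split is that the EOI path never drops the lock
between dequeuing a pending IRQ and making it active.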
> 
> 
>> +    } else {
>> +        qemu_mutex_unlock(&vdev->intp_mutex);
>> +    }
>> +}
>> +
>> +/**
>> + * vfio_mmap_set_enabled - enable/disable the fast path mode
>> + * @vdev: the VFIO platform device
>> + * @enabled: the target mmap state
>> + *
>> + * true ~ fast path = MMIO region is mmaped (no KVM TRAP)
>> + * false ~ slow path = MMIO region is trapped and region callbacks
>> + * are called slow path enables to trap the IRQ status register
>> + * guest reset
>> +*/
>> +
>> +static void vfio_mmap_set_enabled(VFIOPlatformDevice *vdev, bool enabled)
>> +{
>> +    VFIORegion *region;
> 
> region could be defined inside the block, not that it matters too much
> for a small function like this.
done
> 
>> +    int i;
>> +
>> +    trace_vfio_platform_mmap_set_enabled(enabled);
>> +
>> +    for (i = 0; i < vdev->vbasedev.num_regions; i++) {
>> +        region = vdev->regions[i];
>> +
>> +        /* register space is unmapped to trap EOI */
>> +        memory_region_set_enabled(&region->mmap_mem, enabled);
>> +    }
>> +}
>> +
>> +/**
>> + * vfio_intp_mmap_enable - timer function, restores the fast path
>> + * if there is no more active IRQ
>> + * @opaque: actually points to the VFIO platform device
>> + *
>> + * Called on mmap timer timeout, this function checks whether the
>> + * IRQ is still active and in the negative restores the fast path.
>> + * by construction a single eventfd is handled at a time.
>> + * if the IRQ is still active, the timer is restarted.
>> + */
>> +static void vfio_intp_mmap_enable(void *opaque)
>> +{
>> +    VFIOINTp *tmp;
>> +    VFIOPlatformDevice *vdev = (VFIOPlatformDevice *)opaque;
>> +
>> +    qemu_mutex_lock(&vdev->intp_mutex);
>> +    QLIST_FOREACH(tmp, &vdev->intp_list, next) {
>> +        if (tmp->state == VFIO_IRQ_ACTIVE) {
>> +            trace_vfio_platform_intp_mmap_enable(tmp->pin);
>> +            /* re-program the timer to check active status later */
>> +            timer_mod(vdev->mmap_timer,
>> +                      qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
>> +                          vdev->mmap_timeout);
>> +            qemu_mutex_unlock(&vdev->intp_mutex);
>> +            return;
>> +        }
>> +    }
>> +    vfio_mmap_set_enabled(vdev, true);
>> +    qemu_mutex_unlock(&vdev->intp_mutex);
>> +}
>> +
>> +/**
>> + * vfio_intp_interrupt - The user-side eventfd handler
>> + * @opaque: opaque pointer which in practice is the VFIOINTp*
>> + *
>> + * the function can be entered
>> + * - in event handler context: this IRQ is inactive
>> + *   in that case, the vIRQ is injected into the guest if there
>> + *   is no other active or pending IRQ.
>> + * - in IOhandler context: this IRQ is pending.
>> + *   there is no ACTIVE IRQ
>> + */
>> +static void vfio_intp_interrupt(VFIOINTp *intp)
>> +{
>> +    int ret;
>> +    VFIOINTp *tmp;
>> +    VFIOPlatformDevice *vdev = intp->vdev;
>> +    bool delay_handling = false;
>> +
>> +    qemu_mutex_lock(&vdev->intp_mutex);
>> +    if (intp->state == VFIO_IRQ_INACTIVE) {
>> +        QLIST_FOREACH(tmp, &vdev->intp_list, next) {
>> +            if (tmp->state == VFIO_IRQ_ACTIVE ||
>> +                tmp->state == VFIO_IRQ_PENDING) {
>> +                delay_handling = true;
>> +                break;
>> +            }
>> +        }
>> +    }
>> +    if (delay_handling) {
>> +        /*
>> +         * the new IRQ gets a pending status and is pushed in
>> +         * the pending queue
>> +         */
>> +        intp->state = VFIO_IRQ_PENDING;
>> +        trace_vfio_intp_interrupt_set_pending(intp->pin);
>> +        QSIMPLEQ_INSERT_TAIL(&vdev->pending_intp_queue,
>> +                             intp, pqnext);
>> +        ret = event_notifier_test_and_clear(&intp->interrupt);
>> +        qemu_mutex_unlock(&vdev->intp_mutex);
>> +        return;
>> +    }
>> +
>> +    /* no active IRQ, the new IRQ can be forwarded to the guest */
>> +    trace_vfio_platform_intp_interrupt(intp->pin,
>> +                              event_notifier_get_fd(&intp->interrupt));
>> +
>> +    if (intp->state == VFIO_IRQ_INACTIVE) {
>> +        ret = event_notifier_test_and_clear(&intp->interrupt);
>> +        if (!ret) {
>> +            error_report("Error when clearing fd=%d (ret = %d)\n",
>> +                         event_notifier_get_fd(&intp->interrupt), ret);
>> +        }
>> +    } /* else this is a pending IRQ that moves to ACTIVE state */
>> +
>> +    intp->state = VFIO_IRQ_ACTIVE;
>> +
>> +    /* sets slow path */
>> +    vfio_mmap_set_enabled(vdev, false);
>> +
>> +    /* trigger the virtual IRQ */
>> +    qemu_set_irq(intp->qemuirq, 1);
>> +
>> +    /* schedule the mmap timer which will restore mmap path after EOI*/
>> +    if (vdev->mmap_timeout) {
>> +        timer_mod(vdev->mmap_timer,
>> +                  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
>> +                      vdev->mmap_timeout);
>> +    }
>> +    qemu_mutex_unlock(&vdev->intp_mutex);
> 
> See above for comments about re-factoring this. It's not totally clear
> what's being protected by the mutex, just the queues or the intp
> structures themselves?
> 
>> +}
>> +
>> +/**
>> + * vfio_start_eventfd_injection - starts the virtual IRQ injection using
>> + * user-side handled eventfds
>> + * @intp: the IRQ struct pointer
>> + */
>> +
>> +static int vfio_start_eventfd_injection(VFIOINTp *intp)
>> +{
>> +    int ret;
>> +    VFIODevice *vbasedev = &intp->vdev->vbasedev;
>> +
>> +    vfio_mask_single_irqindex(vbasedev, intp->pin);
>> +
>> +    ret = vfio_set_trigger_eventfd(intp, vfio_intp_interrupt);
>> +    if (ret) {
>> +        error_report("vfio: Error: Failed to pass IRQ fd to the driver: %m");
>> +        vfio_unmask_single_irqindex(vbasedev, intp->pin);
>> +        return ret;
>> +    }
>> +    vfio_unmask_single_irqindex(vbasedev, intp->pin);
>> +    return 0;
>> +}
>> +
>> +/*
>> + * Functions used whatever the injection method
>> + */
>> +
>> +/**
>> + * vfio_set_trigger_eventfd - set VFIO eventfd handling
>> + * ie. program the VFIO driver to associates a given IRQ index
>> + * with a fd handler
>> + *
>> + * @intp: IRQ struct pointer
>> + * @handler: handler to be called on eventfd trigger
>> + */
>> +static int vfio_set_trigger_eventfd(VFIOINTp *intp,
>> +                                    eventfd_user_side_handler_t handler)
>> +{
>> +    VFIODevice *vbasedev = &intp->vdev->vbasedev;
>> +    struct vfio_irq_set *irq_set;
>> +    int argsz, ret;
>> +    int32_t *pfd;
>> +
>> +    argsz = sizeof(*irq_set) + sizeof(*pfd);
>> +    irq_set = g_malloc0(argsz);
>> +    irq_set->argsz = argsz;
>> +    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
>> +    irq_set->index = intp->pin;
>> +    irq_set->start = 0;
>> +    irq_set->count = 1;
>> +    pfd = (int32_t *)&irq_set->data;
>> +    *pfd = event_notifier_get_fd(&intp->interrupt);
>> +    qemu_set_fd_handler(*pfd, (IOHandler *)handler, NULL, intp);
>> +    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
>> +    g_free(irq_set);
>> +    if (ret < 0) {
>> +        error_report("vfio: Failed to set trigger eventfd: %m");
>> +        qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
>> +    }
>> +    return ret;
>> +}
>> +
>>  /* not implemented yet */
>>  static void vfio_platform_compute_needs_reset(VFIODevice *vbasedev)
>>  {
>> @@ -39,6 +288,40 @@ static int vfio_platform_hot_reset_multi(VFIODevice *vbasedev)
>>  }
>>  
>>  /**
>> + * vfio_init_intp - allocate, initialize the IRQ struct pointer
>> + * and add it into the list of IRQ
>> + * @vbasedev: the VFIO device
>> + * @index: VFIO device IRQ index
>> + */
>> +static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev, unsigned int index)
>> +{
>> +    int ret;
>> +    VFIOPlatformDevice *vdev =
>> +        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
>> +    SysBusDevice *sbdev = SYS_BUS_DEVICE(vdev);
>> +    VFIOINTp *intp;
>> +
>> +    /* allocate and populate a new VFIOINTp structure put in a queue list */
>> +    intp = g_malloc0(sizeof(*intp));
>> +    intp->vdev = vdev;
>> +    intp->pin = index;
>> +    intp->state = VFIO_IRQ_INACTIVE;
>> +    sysbus_init_irq(sbdev, &intp->qemuirq);
>> +
>> +    /* Get an eventfd for trigger */
>> +    ret = event_notifier_init(&intp->interrupt, 0);
>> +    if (ret) {
>> +        g_free(intp);
>> +        error_report("vfio: Error: trigger event_notifier_init failed ");
>> +        return NULL;
>> +    }
>> +
>> +    /* store the new intp in qlist */
>> +    QLIST_INSERT_HEAD(&vdev->intp_list, intp, next);
>> +    return intp;
>> +}
>> +
>> +/**
>>   * vfio_populate_device - initialize MMIO region and IRQ
>>   * @vbasedev: the VFIO device
>>   *
>> @@ -47,7 +330,9 @@ static int vfio_platform_hot_reset_multi(VFIODevice *vbasedev)
>>   */
>>  static int vfio_populate_device(VFIODevice *vbasedev)
>>  {
>> +    struct vfio_irq_info irq = { .argsz = sizeof(irq) };
>>      struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
>> +    VFIOINTp *intp, *tmp;
>>      int i, ret = -1;
>>      VFIOPlatformDevice *vdev =
>>          container_of(vbasedev, VFIOPlatformDevice, vbasedev);
>> @@ -80,7 +365,37 @@ static int vfio_populate_device(VFIODevice *vbasedev)
>>                              (unsigned long)vdev->regions[i]->fd_offset);
>>      }
>>  
>> +    vdev->mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
>> +                                    vfio_intp_mmap_enable, vdev);
>> +
>> +    QSIMPLEQ_INIT(&vdev->pending_intp_queue);
>> +
>> +    for (i = 0; i < vbasedev->num_irqs; i++) {
>> +        irq.index = i;
>> +
>> +        ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_IRQ_INFO, &irq);
>> +        if (ret) {
>> +            error_printf("vfio: error getting device %s irq info",
>> +                         vbasedev->name);
>> +            goto irq_err;
>> +        } else {
>> +            trace_vfio_platform_populate_interrupts(irq.index,
>> +                                                    irq.count,
>> +                                                    irq.flags);
>> +            intp = vfio_init_intp(vbasedev, irq.index);
>> +            if (!intp) {
>> +                error_report("vfio: Error installing IRQ %d up", i);
>> +                goto irq_err;
>> +            }
>> +        }
>> +    }
>>      return 0;
>> +irq_err:
>> +    timer_del(vdev->mmap_timer);
>> +    QLIST_FOREACH_SAFE(intp, &vdev->intp_list, next, tmp) {
>> +        QLIST_REMOVE(intp, next);
>> +        g_free(intp);
>> +    }
>>  error:
>>      for (i = 0; i < vbasedev->num_regions; i++) {
>>          g_free(vdev->regions[i]);
>> @@ -93,6 +408,7 @@ error:
>>  static VFIODeviceOps vfio_platform_ops = {
>>      .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
>>      .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
>> +    .vfio_eoi = vfio_platform_eoi,
>>      .vfio_populate_device = vfio_populate_device,
>>  };
>>  
>> @@ -220,6 +536,7 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
>>  
>>      vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
>>      vbasedev->ops = &vfio_platform_ops;
>> +    vdev->start_irq_fn = vfio_start_eventfd_injection;
>>  
>>      trace_vfio_platform_realize(vbasedev->name, vdev->compat);
>>  
>> @@ -243,6 +560,8 @@ static const VMStateDescription vfio_platform_vmstate = {
>>  
>>  static Property vfio_platform_dev_properties[] = {
>>      DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
>> +    DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice,
>> +                       mmap_timeout, 1100),
>>      DEFINE_PROP_END_OF_LIST(),
>>  };
>>  
>> diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
>> index 338f0c6..e55b711 100644
>> --- a/include/hw/vfio/vfio-platform.h
>> +++ b/include/hw/vfio/vfio-platform.h
>> @@ -18,16 +18,49 @@
>>  
>>  #include "hw/sysbus.h"
>>  #include "hw/vfio/vfio-common.h"
>> +#include "qemu/event_notifier.h"
>> +#include "qemu/queue.h"
>> +#include "hw/irq.h"
>>  
>>  #define TYPE_VFIO_PLATFORM "vfio-platform"
>>  
>> +enum {
>> +    VFIO_IRQ_INACTIVE = 0,
>> +    VFIO_IRQ_PENDING = 1,
>> +    VFIO_IRQ_ACTIVE = 2,
>> +    /* VFIO_IRQ_ACTIVE_AND_PENDING cannot happen with VFIO */
>> +};
>> +
>> +typedef struct VFIOINTp {
>> +    QLIST_ENTRY(VFIOINTp) next; /* entry for IRQ list */
>> +    QSIMPLEQ_ENTRY(VFIOINTp) pqnext; /* entry for pending IRQ queue */
>> +    EventNotifier interrupt; /* eventfd triggered on interrupt */
>> +    EventNotifier unmask; /* eventfd for unmask on QEMU bypass */
>> +    qemu_irq qemuirq;
>> +    struct VFIOPlatformDevice *vdev; /* back pointer to device */
>> +    int state; /* inactive, pending, active */
>> +    bool kvm_accel; /* set when QEMU bypass through KVM enabled */
>> +    uint8_t pin; /* index */
>> +    uint32_t virtualID; /* virtual IRQ */
>> +} VFIOINTp;
>> +
>> +typedef int (*start_irq_fn_t)(VFIOINTp *intp);
>> +
>>  typedef struct VFIOPlatformDevice {
>>      SysBusDevice sbdev;
>>      VFIODevice vbasedev; /* not a QOM object */
>>      VFIORegion **regions;
>> +    QLIST_HEAD(, VFIOINTp) intp_list; /* list of IRQ */
>> +    /* queue of pending IRQ */
>> +    QSIMPLEQ_HEAD(pending_intp_queue, VFIOINTp) pending_intp_queue;
>>      char *compat; /* compatibility string */
>> +    uint32_t mmap_timeout; /* delay to re-enable mmaps after interrupt */
>> +    QEMUTimer *mmap_timer; /* enable mmaps after periods w/o interrupts */
>> +    start_irq_fn_t start_irq_fn;
>> +    QemuMutex  intp_mutex;
> 
> Is this intp_mutex just for the intp_list or also the
> pending_intp_queue? Perhaps consider re-arranging the structure and
> adding some spacing to show what protects what.
Added some comments to clarify the role of this mutex.
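
As a rough illustration of the "show what protects what" layout (not the actual v11 header), the fields could be grouped under comments as in the sketch below. The struct and function names are hypothetical, pthread_mutex_t stands in for QemuMutex, and the split between "set once" and "mutex-protected" fields is an assumption drawn from how the patch takes the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

struct irq_sketch;                       /* stand-in for VFIOINTp */

struct platform_dev_sketch {
    /* Assumed to be set once at realize time and only read afterwards. */
    char *compat;                        /* compatibility string */
    uint32_t mmap_timeout_ms;            /* delay before mmaps are re-enabled */

    /* Everything below is assumed to be protected by intp_mutex. */
    pthread_mutex_t intp_mutex;
    struct irq_sketch *intp_list;        /* all IRQs, including their state */
    struct irq_sketch *pending_head;     /* FIFO of IRQs waiting for an EOI */
    struct irq_sketch *pending_tail;
};

/* Initialise the lock together with the (empty) containers it guards. */
static int platform_dev_sketch_init(struct platform_dev_sketch *d,
                                    uint32_t mmap_timeout_ms)
{
    assert(d != NULL);
    d->compat = NULL;
    d->mmap_timeout_ms = mmap_timeout_ms;
    d->intp_list = NULL;
    d->pending_head = NULL;
    d->pending_tail = NULL;
    return pthread_mutex_init(&d->intp_mutex, NULL);
}
```

Keeping the mutex next to the fields it guards, and initialising them in the same function, makes it harder to add a new field without deciding which group it belongs to.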

Thanks again for the review. The v11 I just posted should take all your
comments into account.

Best Regards

Eric
> 
>>  } VFIOPlatformDevice;
>>  
>> +
>>  typedef struct VFIOPlatformDeviceClass {
>>      /*< private >*/
>>      SysBusDeviceClass parent_class;
> 



end of thread, other threads:[~2015-03-19 10:21 UTC | newest]

Thread overview: 24+ messages
2015-02-13  3:47 [Qemu-devel] [PATCH v10 0/7] KVM platform device passthrough Eric Auger
2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 1/7] linux-headers: update VFIO header for VFIO platform drivers Eric Auger
2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 2/7] hw/vfio/platform: vfio-platform skeleton Eric Auger
2015-02-17 10:56   ` Alex Bennée
2015-02-17 10:56     ` Alex Bennée
2015-03-13  9:28     ` [Qemu-devel] " Eric Auger
2015-03-13  9:28       ` Eric Auger
2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 3/7] hw/vfio/platform: add irq assignment Eric Auger
2015-02-17 11:24   ` Alex Bennée
2015-02-17 11:24     ` Alex Bennée
2015-03-19 10:18     ` [Qemu-devel] " Eric Auger
2015-03-19 10:18       ` Eric Auger
2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 4/7] hw/vfio/platform: add capability to start IRQ propagation Eric Auger
2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 5/7] hw/vfio: calxeda xgmac device Eric Auger
2015-02-17 11:29   ` Alex Bennée
2015-02-17 11:29     ` Alex Bennée
2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 6/7] hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation Eric Auger
2015-02-17 11:36   ` Alex Bennée
2015-02-17 11:36     ` Alex Bennée
2015-03-13  9:33     ` [Qemu-devel] " Eric Auger
2015-03-13  9:33       ` Eric Auger
2015-02-13  3:47 ` [Qemu-devel] [PATCH v10 7/7] hw/vfio/platform: add irqfd support Eric Auger
2015-02-17 11:41   ` Alex Bennée
2015-02-17 11:41     ` Alex Bennée
