All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v11 0/9] KVM platform device passthrough
@ 2015-03-19 10:05 ` Eric Auger
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: kim.phillips, patches, a.rigo, pbonzini, kvmarm, christoffer.dall

This series aims at enabling KVM platform device passthrough.

Kernel dependencies were pulled for 4.1-rc0 (vfio platform driver and
irqfd for ARM)

This series now only relies on the following QEMU series that allows
to instantiate the VFIO platform device from QEMU command line:

[1] [PATCH v11 0/4] machvirt dynamic sysbus device instantiation
    https://lists.gnu.org/archive/html/qemu-devel/2015-03/msg00804.html

Both series are candidate for QEMU 2.4.

- QEMU pieces can be found at:
  http://git.linaro.org/people/eric.auger/qemu.git
  (branch vfio_integ_v11)

- kernel pieces can be found at:
  http://git.linaro.org/people/eric.auger/linux.git
  (branch vfio_integ_v11_kernel)

The series was tested on Calxeda Midway (ARMv7) where one xgmac
is assigned to KVM host while the second one is assigned to the guest.

Wiki for Calxeda Midway setup:
https://wiki.linaro.org/LEG/Engineering/Virtualization/Platform_Device_Passthrough_on_Midway

History:
v10->v11:
- rebase onto v2.3.0-rc0 (mainly related to PCIe support in virt)
- add dma-coherent property for calxeda midway (fix revealed by removal
  of kernel-side "vfio: type1: support for ARM SMMUS with VFIO_IOMMU_TYPE1")
- virt modifications to start VFIO IRQ forwarding are now in a separate
  patch
- rearrange linux header exports (those are still partial exports
  waiting for definitive 4.1-rc0)
- take into account Alex Bennee comments:
  - use g_malloc0_n instead of g_malloc0
  - use block declarations when possible
  - rework readlink returned value treatment
  - use g_strlcat in place strncat
  - re-arrange mutex locking for multiple IRQ support (user-side handled
    eventfds)
- use g_snprintf instead of snprintf
- change the order of functions to avoid pre-declaration in platform.c
- add flags in VFIOINTp struct to detect whether the IRQ is automasked
- some comment rewriting

v9->v10:
- rebase on "vfio: cleanup vfio_get_device error path, remove
  vfio_populate_device": vfio_populate_device no more called in
  vfio_get_device but in vfio_base_device_init
- update VFIO header according to vfio platform driver v13 (no AMBA)

v8->v9:
- rebase on 2.2.0 and machvirt dynamic sysbus instantiation v10
- v8 1-11 were pulled
- patch files related to forwarding are moved in a seperate series since
  it depends on kernel series still in RFC.
- introduction of basic VFIO platform device split into 3 patch files to
  ease the review (hope it will help).
- add an author in platform.c
- add deallocation in vfio_populate_device error case
- add patch file doing the VFIO header sync
- use VFIO_DEVICE_FLAGS_PLATFORM in vfio_populate_device
- rename calxeda_xgmac.c into calxeda-xgmac.c
- sysbus-fdt: add_calxeda_midway_xgmac_fdt_node g_free in case of errors
- reword of linux-headers patch files

v7->v8:
- rebase on v2.2.0-rc3 and integrate
  "Add skip_dump flag to ignore memory region during dump"
- KVM header evolution with subindex addition in kvm_arch_forwarded_irq
- split [PATCH v7 03/16] hw/vfio/pci: introduce VFIODevice into 4 patches
- vfio_compute_needs_reset does not return bool anymore
- add some comments about exposed MMIO region and IRQ in calxeda xgmac
  device
- vfio_[un]mask_irqindex renamed into vfio_[un]mask_single_irqindex
- rework IRQ startup: former machine init done notifier is replaced by a
  reset notifier. machine file passes the interrupt controller
  DeviceState handle (not the platform bus first irq parameter).
- sysbus-fdt:
  - move the add_fdt_node_functions array declaration between the device
    specific code and the generic code to avoid forward declarations of
    decice specific functions
  - rename add_basic_vfio_fdt_node into add_calxeda_midway_xgmac_fdt_node
    emphasizing the fact it is xgmac specific

v6->v7:
- fake injection test modality removed
- VFIO_DEVICE_TYPE_PLATFORM only introduced with VFIO platform
- new helper functions to start VFIO IRQ on machine init done notifier
  (introduced in hw/vfio/platform: add vfio-platform support and notifier
  registration invoked in hw/arm/virt: add support for VFIO devices).
  vfio_start_irq_injection is replaced by vfio_register_irq_starter.

v5->v6:
- rebase on 2.1rc5 PCI code
- forwarded IRQ first integraton
- vfio_device property renamed into host property
- split IRQ setup in different functions that match the 3 supported
  injection techniques (user handled eventfd, irqfd, forwarded IRQ):
  removes dynamic switch between injection methods
- introduce fake interrupts as a test modality:
  x makes possible to test multiple IRQ user-side handling.
  x this is a test feature only: enable to trigger a fd as if the
    real physical IRQ hit. No virtual IRQ is injected into the guest
    but handling is simulated so that the state machine can be tested
- user handled eventfd:
  x add mutex to protect IRQ state & list manipulation,
  x correct misleading comment in vfio_intp_interrupt.
  x Fix bugs using fake interrupt modality
- irqfd no more advertised in this patchset (handled in [3])
- VFIOPlatformDeviceClass becomes abstract and Calxeda xgmac device
  and class is re-introduced (as per v4)
- all DPRINTF removed in platform and replaced by trace-points
- corrects compilation with configure --disable-kvm
- simplifies the split for vfio_get_device and introduce a unique
  specialized function named vfio_populate_device
- group_list renamed into vfio_group_list
- hw/arm/dyn_sysbus_devtree.c currently only support vfio-calxeda-xgmac
  instantiation. Needs to be specialized for other VFIO devices
- fix 2 bugs in dyn_sysbus_devtree(reg_attr index and compat)

v4->v5:
- rebase on v2.1.0 PCI code
- take into account Alex Williamson comments on PCI code rework
  - trace updates in vfio_region_write/read
  - remove fd from VFIORegion
  - get/put ckeanup
- bug fix: bar region's vbasedev field duly initialization
- misc cleanups in platform device
- device tree node generation removed from device and handled in
  hw/arm/dyn_sysbus_devtree.c
- remove "hw/vfio: add an example calxeda_xgmac": with removal of
  device tree node generation we do not have so many things to
  implement in that derived device yet. May be re-introduced later
  on if needed typically for reset/migration.
- no GSI routing table anymore

v3->v4 changes (Eric Auger, Alvise Rigo)
- rebase on last VFIO PCI code (v2.1.0-rc0)
- full git history rework to ease PCI code change review
- mv include files in hw/vfio
- DPRINTF reformatting temporarily moved out
- support of VFIO virq (removal of resamplefd handler on user-side)
- integration with sysbus dynamic instantiation framwork
- removal of unrealize and cleanup routines until it is better
  understood what is really needed
- Support of VFIO for Amba devices should be handled in an inherited
  device to specialize the device tree generation (clock handle currently
  missing in framework however)
- "Always use eventfd as notifying mechanism" temporarily moved out
- static instantiation is not mainstream (although it remains possible)
  note if static instantiation is used, irqfd must be setup in machine file
  when virtual IRQ is known
- create the GSI routing table on qemu side

v2->v3 changes (Alvise Rigo, Eric Auger):
- Following Alex W recommandations, further efforts to factorize the
  code between PCI:introduction of VFIODevice and VFIORegion
  as base classes
- unique reset handler for platform and PCI
- cleanup following Kim's comments
- multiple IRQ support mechanics should be in place although not
  tested
- Better handling of MMIO multiple regions
- New features and fixes by Alvise (multiple compat string, exec
  flag, force eventfd usage, amba device tree support)
- irqfd support

v1->v2 changes (Kim Phillips, Eric Auger):
- IRQ initial support (legacy mode where eventfds are handled on
  user side)
- hacked dynamic instantiation

v1 (Kim Phillips):
- initial split between PCI and platform
- MMIO support only
- static instantiation

Best Regards

Eric


Eric Auger (9):
  linux-headers: update VFIO header for VFIO platform/amba drivers
  hw/vfio/platform: vfio-platform skeleton
  hw/vfio/platform: add irq assignment
  hw/vfio/platform: add capability to start IRQ propagation
  hw/arm/virt: start VFIO IRQ propagation
  hw/vfio/platform: calxeda xgmac device
  hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation
  linux-headers: update arm/arm64 KVM headers for irqfd
  hw/vfio/platform: add irqfd support

 hw/arm/sysbus-fdt.c                  |  72 ++++
 hw/arm/virt.c                        |  34 +-
 hw/vfio/Makefile.objs                |   2 +
 hw/vfio/calxeda-xgmac.c              |  54 +++
 hw/vfio/platform.c                   | 772 +++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-calxeda-xgmac.h |  46 +++
 include/hw/vfio/vfio-common.h        |   1 +
 include/hw/vfio/vfio-platform.h      |  90 ++++
 linux-headers/asm-arm/kvm.h          |   3 +
 linux-headers/asm-arm64/kvm.h        |   3 +
 linux-headers/linux/vfio.h           |   8 +-
 trace-events                         |  14 +
 12 files changed, 1082 insertions(+), 17 deletions(-)
 create mode 100644 hw/vfio/calxeda-xgmac.c
 create mode 100644 hw/vfio/platform.c
 create mode 100644 include/hw/vfio/vfio-calxeda-xgmac.h
 create mode 100644 include/hw/vfio/vfio-platform.h

-- 
1.8.3.2

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v11 0/9] KVM platform device passthrough
@ 2015-03-19 10:05 ` Eric Auger
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: patches, pbonzini, kvmarm

This series aims at enabling KVM platform device passthrough.

Kernel dependencies were pulled for 4.1-rc0 (vfio platform driver and
irqfd for ARM)

This series now only relies on the following QEMU series that allows
to instantiate the VFIO platform device from QEMU command line:

[1] [PATCH v11 0/4] machvirt dynamic sysbus device instantiation
    https://lists.gnu.org/archive/html/qemu-devel/2015-03/msg00804.html

Both series are candidate for QEMU 2.4.

- QEMU pieces can be found at:
  http://git.linaro.org/people/eric.auger/qemu.git
  (branch vfio_integ_v11)

- kernel pieces can be found at:
  http://git.linaro.org/people/eric.auger/linux.git
  (branch vfio_integ_v11_kernel)

The series was tested on Calxeda Midway (ARMv7) where one xgmac
is assigned to KVM host while the second one is assigned to the guest.

Wiki for Calxeda Midway setup:
https://wiki.linaro.org/LEG/Engineering/Virtualization/Platform_Device_Passthrough_on_Midway

History:
v10->v11:
- rebase onto v2.3.0-rc0 (mainly related to PCIe support in virt)
- add dma-coherent property for calxeda midway (fix revealed by removal
  of kernel-side "vfio: type1: support for ARM SMMUS with VFIO_IOMMU_TYPE1")
- virt modifications to start VFIO IRQ forwarding are now in a separate
  patch
- rearrange linux header exports (those are still partial exports
  waiting for definitive 4.1-rc0)
- take into account Alex Bennee comments:
  - use g_malloc0_n instead of g_malloc0
  - use block declarations when possible
  - rework readlink returned value treatment
  - use g_strlcat in place strncat
  - re-arrange mutex locking for multiple IRQ support (user-side handled
    eventfds)
- use g_snprintf instead of snprintf
- change the order of functions to avoid pre-declaration in platform.c
- add flags in VFIOINTp struct to detect whether the IRQ is automasked
- some comment rewriting

v9->v10:
- rebase on "vfio: cleanup vfio_get_device error path, remove
  vfio_populate_device": vfio_populate_device no more called in
  vfio_get_device but in vfio_base_device_init
- update VFIO header according to vfio platform driver v13 (no AMBA)

v8->v9:
- rebase on 2.2.0 and machvirt dynamic sysbus instantiation v10
- v8 1-11 were pulled
- patch files related to forwarding are moved in a seperate series since
  it depends on kernel series still in RFC.
- introduction of basic VFIO platform device split into 3 patch files to
  ease the review (hope it will help).
- add an author in platform.c
- add deallocation in vfio_populate_device error case
- add patch file doing the VFIO header sync
- use VFIO_DEVICE_FLAGS_PLATFORM in vfio_populate_device
- rename calxeda_xgmac.c into calxeda-xgmac.c
- sysbus-fdt: add_calxeda_midway_xgmac_fdt_node g_free in case of errors
- reword of linux-headers patch files

v7->v8:
- rebase on v2.2.0-rc3 and integrate
  "Add skip_dump flag to ignore memory region during dump"
- KVM header evolution with subindex addition in kvm_arch_forwarded_irq
- split [PATCH v7 03/16] hw/vfio/pci: introduce VFIODevice into 4 patches
- vfio_compute_needs_reset does not return bool anymore
- add some comments about exposed MMIO region and IRQ in calxeda xgmac
  device
- vfio_[un]mask_irqindex renamed into vfio_[un]mask_single_irqindex
- rework IRQ startup: former machine init done notifier is replaced by a
  reset notifier. machine file passes the interrupt controller
  DeviceState handle (not the platform bus first irq parameter).
- sysbus-fdt:
  - move the add_fdt_node_functions array declaration between the device
    specific code and the generic code to avoid forward declarations of
    decice specific functions
  - rename add_basic_vfio_fdt_node into add_calxeda_midway_xgmac_fdt_node
    emphasizing the fact it is xgmac specific

v6->v7:
- fake injection test modality removed
- VFIO_DEVICE_TYPE_PLATFORM only introduced with VFIO platform
- new helper functions to start VFIO IRQ on machine init done notifier
  (introduced in hw/vfio/platform: add vfio-platform support and notifier
  registration invoked in hw/arm/virt: add support for VFIO devices).
  vfio_start_irq_injection is replaced by vfio_register_irq_starter.

v5->v6:
- rebase on 2.1rc5 PCI code
- forwarded IRQ first integraton
- vfio_device property renamed into host property
- split IRQ setup in different functions that match the 3 supported
  injection techniques (user handled eventfd, irqfd, forwarded IRQ):
  removes dynamic switch between injection methods
- introduce fake interrupts as a test modality:
  x makes possible to test multiple IRQ user-side handling.
  x this is a test feature only: enable to trigger a fd as if the
    real physical IRQ hit. No virtual IRQ is injected into the guest
    but handling is simulated so that the state machine can be tested
- user handled eventfd:
  x add mutex to protect IRQ state & list manipulation,
  x correct misleading comment in vfio_intp_interrupt.
  x Fix bugs using fake interrupt modality
- irqfd no more advertised in this patchset (handled in [3])
- VFIOPlatformDeviceClass becomes abstract and Calxeda xgmac device
  and class is re-introduced (as per v4)
- all DPRINTF removed in platform and replaced by trace-points
- corrects compilation with configure --disable-kvm
- simplifies the split for vfio_get_device and introduce a unique
  specialized function named vfio_populate_device
- group_list renamed into vfio_group_list
- hw/arm/dyn_sysbus_devtree.c currently only support vfio-calxeda-xgmac
  instantiation. Needs to be specialized for other VFIO devices
- fix 2 bugs in dyn_sysbus_devtree(reg_attr index and compat)

v4->v5:
- rebase on v2.1.0 PCI code
- take into account Alex Williamson comments on PCI code rework
  - trace updates in vfio_region_write/read
  - remove fd from VFIORegion
  - get/put ckeanup
- bug fix: bar region's vbasedev field duly initialization
- misc cleanups in platform device
- device tree node generation removed from device and handled in
  hw/arm/dyn_sysbus_devtree.c
- remove "hw/vfio: add an example calxeda_xgmac": with removal of
  device tree node generation we do not have so many things to
  implement in that derived device yet. May be re-introduced later
  on if needed typically for reset/migration.
- no GSI routing table anymore

v3->v4 changes (Eric Auger, Alvise Rigo)
- rebase on last VFIO PCI code (v2.1.0-rc0)
- full git history rework to ease PCI code change review
- mv include files in hw/vfio
- DPRINTF reformatting temporarily moved out
- support of VFIO virq (removal of resamplefd handler on user-side)
- integration with sysbus dynamic instantiation framwork
- removal of unrealize and cleanup routines until it is better
  understood what is really needed
- Support of VFIO for Amba devices should be handled in an inherited
  device to specialize the device tree generation (clock handle currently
  missing in framework however)
- "Always use eventfd as notifying mechanism" temporarily moved out
- static instantiation is not mainstream (although it remains possible)
  note if static instantiation is used, irqfd must be setup in machine file
  when virtual IRQ is known
- create the GSI routing table on qemu side

v2->v3 changes (Alvise Rigo, Eric Auger):
- Following Alex W recommandations, further efforts to factorize the
  code between PCI:introduction of VFIODevice and VFIORegion
  as base classes
- unique reset handler for platform and PCI
- cleanup following Kim's comments
- multiple IRQ support mechanics should be in place although not
  tested
- Better handling of MMIO multiple regions
- New features and fixes by Alvise (multiple compat string, exec
  flag, force eventfd usage, amba device tree support)
- irqfd support

v1->v2 changes (Kim Phillips, Eric Auger):
- IRQ initial support (legacy mode where eventfds are handled on
  user side)
- hacked dynamic instantiation

v1 (Kim Phillips):
- initial split between PCI and platform
- MMIO support only
- static instantiation

Best Regards

Eric


Eric Auger (9):
  linux-headers: update VFIO header for VFIO platform/amba drivers
  hw/vfio/platform: vfio-platform skeleton
  hw/vfio/platform: add irq assignment
  hw/vfio/platform: add capability to start IRQ propagation
  hw/arm/virt: start VFIO IRQ propagation
  hw/vfio/platform: calxeda xgmac device
  hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation
  linux-headers: update arm/arm64 KVM headers for irqfd
  hw/vfio/platform: add irqfd support

 hw/arm/sysbus-fdt.c                  |  72 ++++
 hw/arm/virt.c                        |  34 +-
 hw/vfio/Makefile.objs                |   2 +
 hw/vfio/calxeda-xgmac.c              |  54 +++
 hw/vfio/platform.c                   | 772 +++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-calxeda-xgmac.h |  46 +++
 include/hw/vfio/vfio-common.h        |   1 +
 include/hw/vfio/vfio-platform.h      |  90 ++++
 linux-headers/asm-arm/kvm.h          |   3 +
 linux-headers/asm-arm64/kvm.h        |   3 +
 linux-headers/linux/vfio.h           |   8 +-
 trace-events                         |  14 +
 12 files changed, 1082 insertions(+), 17 deletions(-)
 create mode 100644 hw/vfio/calxeda-xgmac.c
 create mode 100644 hw/vfio/platform.c
 create mode 100644 include/hw/vfio/vfio-calxeda-xgmac.h
 create mode 100644 include/hw/vfio/vfio-platform.h

-- 
1.8.3.2

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v11 1/9] linux-headers: update VFIO header for VFIO platform/amba drivers
  2015-03-19 10:05 ` Eric Auger
@ 2015-03-19 10:05   ` Eric Auger
  -1 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: kim.phillips, patches, a.rigo, pbonzini, kvmarm, christoffer.dall

Update according to the vfio.h header found in
ssh://git@git.linaro.org/people/eric.auger/linux.git
branch 4.0-rc3-v14

---

v10 -> v11:
- only includes header modifications related to vfio platform
  driver v14 and not those related to
  "vfio: type1: support for ARM SMMUS with VFIO_IOMMU_TYPE1"
---
 linux-headers/linux/vfio.h | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 95ba870..b57b750 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -8,8 +8,8 @@
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
  */
-#ifndef VFIO_H
-#define VFIO_H
+#ifndef _UAPIVFIO_H
+#define _UAPIVFIO_H
 
 #include <linux/types.h>
 #include <linux/ioctl.h>
@@ -160,6 +160,8 @@ struct vfio_device_info {
 	__u32	flags;
 #define VFIO_DEVICE_FLAGS_RESET	(1 << 0)	/* Device supports reset */
 #define VFIO_DEVICE_FLAGS_PCI	(1 << 1)	/* vfio-pci device */
+#define VFIO_DEVICE_FLAGS_PLATFORM (1 << 2)	/* vfio-platform device */
+#define VFIO_DEVICE_FLAGS_AMBA  (1 << 3)	/* vfio-amba device */
 	__u32	num_regions;	/* Max region index + 1 */
 	__u32	num_irqs;	/* Max IRQ index + 1 */
 };
@@ -495,4 +497,4 @@ struct vfio_eeh_pe_op {
 
 /* ***************************************************************** */
 
-#endif /* VFIO_H */
+#endif /* _UAPIVFIO_H */
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v11 1/9] linux-headers: update VFIO header for VFIO platform/amba drivers
@ 2015-03-19 10:05   ` Eric Auger
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: patches, pbonzini, kvmarm

Update according to the vfio.h header found in
ssh://git@git.linaro.org/people/eric.auger/linux.git
branch 4.0-rc3-v14

---

v10 -> v11:
- only includes header modifications related to vfio platform
  driver v14 and not those related to
  "vfio: type1: support for ARM SMMUS with VFIO_IOMMU_TYPE1"
---
 linux-headers/linux/vfio.h | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 95ba870..b57b750 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -8,8 +8,8 @@
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
  */
-#ifndef VFIO_H
-#define VFIO_H
+#ifndef _UAPIVFIO_H
+#define _UAPIVFIO_H
 
 #include <linux/types.h>
 #include <linux/ioctl.h>
@@ -160,6 +160,8 @@ struct vfio_device_info {
 	__u32	flags;
 #define VFIO_DEVICE_FLAGS_RESET	(1 << 0)	/* Device supports reset */
 #define VFIO_DEVICE_FLAGS_PCI	(1 << 1)	/* vfio-pci device */
+#define VFIO_DEVICE_FLAGS_PLATFORM (1 << 2)	/* vfio-platform device */
+#define VFIO_DEVICE_FLAGS_AMBA  (1 << 3)	/* vfio-amba device */
 	__u32	num_regions;	/* Max region index + 1 */
 	__u32	num_irqs;	/* Max IRQ index + 1 */
 };
@@ -495,4 +497,4 @@ struct vfio_eeh_pe_op {
 
 /* ***************************************************************** */
 
-#endif /* VFIO_H */
+#endif /* _UAPIVFIO_H */
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v11 2/9] hw/vfio/platform: vfio-platform skeleton
  2015-03-19 10:05 ` Eric Auger
@ 2015-03-19 10:05   ` Eric Auger
  -1 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: kim.phillips, patches, a.rigo, pbonzini, kvmarm, christoffer.dall

Minimal VFIO platform implementation supporting register space
user mapping but not IRQ assignment.

Signed-off-by: Kim Phillips <kim.phillips@linaro.org>
Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v10 -> v11:
x Take into account Alex Bennee's comments:
- use g_malloc0_n instead of g_malloc0
- use block declarations when possible
- rework readlink returned value treatment
- use g_strlcat in place of strncat
x use g_snprintf in place of snprintf
x correct error handling in vfio_populate_device,
  in case of flag not corresponding to platform device
x various cosmetic changes

v9 -> v10:
- vfio_populate_device no more called in common vfio_get_device
  but in vfio_base_device_init

v8 -> v9:
- irq management is moved into a separate patch to ease the review
- VFIO_DEVICE_FLAGS_PLATFORM is checked in vfio_populate_device
- g_free of regions added in vfio_populate_device error label
- virtualID becomes 32b

v7 -> v8:
- change proto of vfio_platform_compute_needs_reset and sets
  vbasedev->needs_reset to false there
- vfio_[un]mask_irqindex renamed into vfio_[un]mask_single_irqindex
- vfio_register_irq_starter renamed into vfio_kick_irqs
  we now use a reset notifier instead of a machine init done notifier.
  Enables to get rid of the VfioIrqStarterNotifierParams dangling
  pointer. Previously we use pbus first_irq. This is no more possible
  since the reset notifier takes a void * and first_irq is a field of
  a const struct. So now we pass the DeviceState handle of the
  interrupt controller. I tried to keep the code generic, reason why
  I did not rely on an architecture specific accessor to retrieve
  the gsi number (gic accessor as proposed by Alex). I would like to
  avoid creating an ARM VFIO device model. I hope this model
  model can work on other archs than arm (no multiple intc?);
  wouldn't it be simpler to keep the previous first_irq parameter and
  relax the const constraint.

v6 -> v7:
- compat is not exposed anymore as a user option. Rationale is
  the vfio device became abstract and a specialization is needed
  anyway. The derived device must set the compat string.
- in v6 vfio_start_irq_injection was exposed in vfio-platform.h.
  A new function dubbed vfio_register_irq_starter replaces it. It
  registers a machine init done notifier that programs & starts
  all dynamic VFIO device IRQs. This function is supposed to be
  called by the machine file. A set of static helper routines are
  added too. It must be called before the creation of the platform
  bus device.

v5 -> v6:
- vfio_device property renamed into host property
- correct error handling of VFIO_DEVICE_GET_IRQ_INFO ioctl
  and remove PCI related comment
- remove declaration of vfio_setup_irqfd and irqfd_allowed
  property.Both belong to next patch (irqfd)
- remove declaration of vfio_intp_interrupt in vfio-platform.h
- functions that can be static get this characteristic
- remove declarations of vfio_region_ops, vfio_memory_listener,
  group_list, vfio_address_spaces. All are moved to vfio-common.h
- remove vfio_put_device declaration and definition
- print_regions removed. code moved into vfio_populate_regions
- replace DPRINTF by trace events
- new helper routine to set the trigger eventfd
- dissociate intp init from the injection enablement:
  vfio_enable_intp renamed into vfio_init_intp and new function
  named vfio_start_eventfd_injection
- injection start moved to vfio_start_irq_injection (not anymore
  in vfio_populate_interrupt)
- new start_irq_fn field in VFIOPlatformDevice corresponding to
  the function that will be used for starting injection
- user handled eventfd:
  x add mutex to protect IRQ state & list manipulation,
  x correct misleading comment in vfio_intp_interrupt.
  x Fix bugs thanks to fake interrupt modality
- VFIOPlatformDeviceClass becomes abstract
- add error_setg in vfio_platform_realize

v4 -> v5:
- vfio-plaform.h included first
- cleanup error handling in *populate*, vfio_get_device,
  vfio_enable_intp
- vfio_put_device not called anymore
- add some includes to follow vfio policy

v3 -> v4:
[Eric Auger]
- merge of "vfio: Add initial IRQ support in platform device"
  to get a full functional patch although perfs are limited.
- removal of unrealize function since I currently understand
  it is only used with device hot-plug feature.

v2 -> v3:
[Eric Auger]
- further factorization between PCI and platform (VFIORegion,
  VFIODevice). same level of functionality.

<= v2:
[Kim Philipps]
- Initial Creation of the device supporting register space mapping
---
 hw/vfio/Makefile.objs           |   1 +
 hw/vfio/platform.c              | 287 ++++++++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-common.h   |   1 +
 include/hw/vfio/vfio-platform.h |  44 ++++++
 trace-events                    |  12 ++
 5 files changed, 345 insertions(+)
 create mode 100644 hw/vfio/platform.c
 create mode 100644 include/hw/vfio/vfio-platform.h

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index e31f30e..c5c76fe 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -1,4 +1,5 @@
 ifeq ($(CONFIG_LINUX), y)
 obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o
+obj-$(CONFIG_SOFTMMU) += platform.o
 endif
diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
new file mode 100644
index 0000000..9cbfeb5
--- /dev/null
+++ b/hw/vfio/platform.c
@@ -0,0 +1,287 @@
+/*
+ * vfio based device assignment support - platform devices
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Kim Phillips <kim.phillips@linaro.org>
+ *  Eric Auger <eric.auger@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Based on vfio based PCI device assignment support:
+ *  Copyright Red Hat, Inc. 2012
+ */
+
+#include <linux/vfio.h>
+#include <sys/ioctl.h>
+
+#include "hw/vfio/vfio-platform.h"
+#include "qemu/error-report.h"
+#include "qemu/range.h"
+#include "sysemu/sysemu.h"
+#include "exec/memory.h"
+#include "hw/sysbus.h"
+#include "trace.h"
+#include "hw/platform-bus.h"
+
+/* VFIO skeleton */
+
+/* not implemented yet */
+static void vfio_platform_compute_needs_reset(VFIODevice *vbasedev)
+{
+    vbasedev->needs_reset = false;
+}
+
+/* not implemented yet */
+static int vfio_platform_hot_reset_multi(VFIODevice *vbasedev)
+{
+    return 0;
+}
+
+/**
+ * vfio_populate_device - Allocate and populate MMIO region
+ * structs according to driver returned information
+ * @vbasedev: the VFIO device handle
+ *
+ */
+static int vfio_populate_device(VFIODevice *vbasedev)
+{
+    int i, ret = -1;
+    VFIOPlatformDevice *vdev =
+        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
+
+    if (!(vbasedev->flags & VFIO_DEVICE_FLAGS_PLATFORM)) {
+        error_report("vfio: Um, this isn't a platform device");
+        return ret;
+    }
+
+    vdev->regions = g_malloc0_n(1,
+                                sizeof(VFIORegion *) * vbasedev->num_regions);
+
+    for (i = 0; i < vbasedev->num_regions; i++) {
+        struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
+        VFIORegion *ptr;
+
+        vdev->regions[i] = g_malloc0_n(1, sizeof(VFIORegion));
+        ptr = vdev->regions[i];
+        reg_info.index = i;
+        ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, &reg_info);
+        if (ret) {
+            error_report("vfio: Error getting region %d info: %m", i);
+            goto reg_error;
+        }
+        ptr->flags = reg_info.flags;
+        ptr->size = reg_info.size;
+        ptr->fd_offset = reg_info.offset;
+        ptr->nr = i;
+        ptr->vbasedev = vbasedev;
+
+        trace_vfio_platform_populate_regions(ptr->nr,
+                            (unsigned long)ptr->flags,
+                            (unsigned long)ptr->size,
+                            ptr->vbasedev->fd,
+                            (unsigned long)ptr->fd_offset);
+    }
+
+    return 0;
+reg_error:
+    for (i = 0; i < vbasedev->num_regions; i++) {
+        g_free(vdev->regions[i]);
+    }
+    g_free(vdev->regions);
+    return ret;
+}
+
+/* specialized functions for VFIO Platform devices */
+static VFIODeviceOps vfio_platform_ops = {
+    .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
+    .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
+};
+
+/**
+ * vfio_base_device_init - perform preliminary VFIO setup
+ * @vbasedev: the VFIO device handle
+ *
+ * Implement the VFIO command sequence that allows to discover
+ * assigned device resources: group extraction, device
+ * fd retrieval, resource query.
+ * Precondition: the device name must be initialized
+ */
+static int vfio_base_device_init(VFIODevice *vbasedev)
+{
+    VFIOGroup *group;
+    VFIODevice *vbasedev_iter;
+    char path[PATH_MAX], iommu_group_path[PATH_MAX], *group_name;
+    ssize_t len;
+    struct stat st;
+    int groupid;
+    int ret;
+
+    /* name must be set prior to the call */
+    if (!vbasedev->name) {
+        return -EINVAL;
+    }
+
+    /* Check that the host device exists */
+    g_snprintf(path, sizeof(path), "/sys/bus/platform/devices/%s/",
+               vbasedev->name);
+
+    if (stat(path, &st) < 0) {
+        error_report("vfio: error: no such host device: %s", path);
+        return -errno;
+    }
+
+    g_strlcat(path, "iommu_group", sizeof(path));
+    len = readlink(path, iommu_group_path, sizeof(iommu_group_path));
+    if (len < 0) {
+        error_report("vfio: error no iommu_group for device");
+        return -errno;
+    }
+
+    iommu_group_path[len] = 0;
+    group_name = basename(iommu_group_path);
+
+    if (sscanf(group_name, "%d", &groupid) != 1) {
+        error_report("vfio: error reading %s: %m", path);
+        return -errno;
+    }
+
+    trace_vfio_platform_base_device_init(vbasedev->name, groupid);
+
+    group = vfio_get_group(groupid, &address_space_memory);
+    if (!group) {
+        error_report("vfio: failed to get group %d", groupid);
+        return -ENOENT;
+    }
+
+    g_snprintf(path, sizeof(path), "%s", vbasedev->name);
+
+    QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
+        if (strcmp(vbasedev_iter->name, vbasedev->name) == 0) {
+            error_report("vfio: error: device %s is already attached", path);
+            vfio_put_group(group);
+            return -EBUSY;
+        }
+    }
+    ret = vfio_get_device(group, path, vbasedev);
+    if (ret) {
+        error_report("vfio: failed to get device %s", path);
+        vfio_put_group(group);
+        return ret;
+    }
+
+    ret = vfio_populate_device(vbasedev);
+    if (ret) {
+        error_report("vfio: failed to populate device %s", path);
+        vfio_put_group(group);
+    }
+
+    return ret;
+}
+
+/**
+ * vfio_map_region - initialize the 2 memory regions for a given
+ * MMIO region index
+ * @vdev: the VFIO platform device handle
+ * @nr: the index of the region
+ *
+ * Init the top memory region and the mmapped memory region beneath
+ * VFIOPlatformDevice is used since VFIODevice is not a QOM Object
+ * and could not be passed to memory region functions
+*/
+static void vfio_map_region(VFIOPlatformDevice *vdev, int nr)
+{
+    VFIORegion *region = vdev->regions[nr];
+    unsigned size = region->size;
+    char name[64];
+
+    if (!size) {
+        return;
+    }
+
+    g_snprintf(name, sizeof(name), "VFIO %s region %d",
+               vdev->vbasedev.name, nr);
+
+    /* A "slow" read/write mapping underlies all regions */
+    memory_region_init_io(&region->mem, OBJECT(vdev), &vfio_region_ops,
+                          region, name, size);
+
+    g_strlcat(name, " mmap", sizeof(name));
+
+    if (vfio_mmap_region(OBJECT(vdev), region, &region->mem,
+                         &region->mmap_mem, &region->mmap, size, 0, name)) {
+        error_report("%s unsupported. Performance may be slow", name);
+    }
+}
+
+/**
+ * vfio_platform_realize  - the device realize function
+ * @dev: device state pointer
+ * @errp: error
+ *
+ * initialize the device, its memory regions and IRQ structures
+ * IRQ are started separately
+ */
+static void vfio_platform_realize(DeviceState *dev, Error **errp)
+{
+    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
+    SysBusDevice *sbdev = SYS_BUS_DEVICE(dev);
+    VFIODevice *vbasedev = &vdev->vbasedev;
+    int i, ret;
+
+    vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
+    vbasedev->ops = &vfio_platform_ops;
+
+    trace_vfio_platform_realize(vbasedev->name, vdev->compat);
+
+    ret = vfio_base_device_init(vbasedev);
+    if (ret) {
+        error_setg(errp, "vfio: vfio_base_device_init failed for %s",
+                   vbasedev->name);
+        return;
+    }
+
+    for (i = 0; i < vbasedev->num_regions; i++) {
+        vfio_map_region(vdev, i);
+        sysbus_init_mmio(sbdev, &vdev->regions[i]->mem);
+    }
+}
+
+static const VMStateDescription vfio_platform_vmstate = {
+    .name = TYPE_VFIO_PLATFORM,
+    .unmigratable = 1,
+};
+
+static Property vfio_platform_dev_properties[] = {
+    DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void vfio_platform_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    dc->realize = vfio_platform_realize;
+    dc->props = vfio_platform_dev_properties;
+    dc->vmsd = &vfio_platform_vmstate;
+    dc->desc = "VFIO-based platform device assignment";
+    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+}
+
+static const TypeInfo vfio_platform_dev_info = {
+    .name = TYPE_VFIO_PLATFORM,
+    .parent = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(VFIOPlatformDevice),
+    .class_init = vfio_platform_class_init,
+    .class_size = sizeof(VFIOPlatformDeviceClass),
+    .abstract   = true,
+};
+
+static void register_vfio_platform_dev_type(void)
+{
+    type_register_static(&vfio_platform_dev_info);
+}
+
+type_init(register_vfio_platform_dev_type)
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 0d1fb80..59a321d 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -42,6 +42,7 @@
 
 enum {
     VFIO_DEVICE_TYPE_PCI = 0,
+    VFIO_DEVICE_TYPE_PLATFORM = 1,
 };
 
 typedef struct VFIORegion {
diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
new file mode 100644
index 0000000..338f0c6
--- /dev/null
+++ b/include/hw/vfio/vfio-platform.h
@@ -0,0 +1,44 @@
+/*
+ * vfio based device assignment support - platform devices
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Kim Phillips <kim.phillips@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Based on vfio based PCI device assignment support:
+ *  Copyright Red Hat, Inc. 2012
+ */
+
+#ifndef HW_VFIO_VFIO_PLATFORM_H
+#define HW_VFIO_VFIO_PLATFORM_H
+
+#include "hw/sysbus.h"
+#include "hw/vfio/vfio-common.h"
+
+#define TYPE_VFIO_PLATFORM "vfio-platform"
+
+typedef struct VFIOPlatformDevice {
+    SysBusDevice sbdev;
+    VFIODevice vbasedev; /* not a QOM object */
+    VFIORegion **regions;
+    char *compat; /* compatibility string */
+} VFIOPlatformDevice;
+
+typedef struct VFIOPlatformDeviceClass {
+    /*< private >*/
+    SysBusDeviceClass parent_class;
+    /*< public >*/
+} VFIOPlatformDeviceClass;
+
+#define VFIO_PLATFORM_DEVICE(obj) \
+     OBJECT_CHECK(VFIOPlatformDevice, (obj), TYPE_VFIO_PLATFORM)
+#define VFIO_PLATFORM_DEVICE_CLASS(klass) \
+     OBJECT_CLASS_CHECK(VFIOPlatformDeviceClass, (klass), TYPE_VFIO_PLATFORM)
+#define VFIO_PLATFORM_DEVICE_GET_CLASS(obj) \
+     OBJECT_GET_CLASS(VFIOPlatformDeviceClass, (obj), TYPE_VFIO_PLATFORM)
+
+#endif /*HW_VFIO_VFIO_PLATFORM_H*/
diff --git a/trace-events b/trace-events
index 30eba92..17c7108 100644
--- a/trace-events
+++ b/trace-events
@@ -1560,6 +1560,18 @@ vfio_put_group(int fd) "close group->fd=%d"
 vfio_get_device(const char * name, unsigned int flags, unsigned int num_regions, unsigned int num_irqs) "Device %s flags: %u, regions: %u, irqs: %u"
 vfio_put_base_device(int fd) "close vdev->fd=%d"
 
+# hw/vfio/platform.c
+vfio_platform_eoi(int pin, int fd) "EOI IRQ pin %d (fd=%d)"
+vfio_platform_mmap_set_enabled(bool enabled) "fast path = %d"
+vfio_platform_intp_mmap_enable(int pin) "IRQ #%d still active, stay in slow path"
+vfio_platform_intp_interrupt(int pin, int fd) "Handle IRQ #%d (fd = %d)"
+vfio_platform_populate_interrupts(int pin, int count, int flags) "- IRQ index %d: count %d, flags=0x%x"
+vfio_platform_populate_regions(int region_index, unsigned long flag, unsigned long size, int fd, unsigned long offset) "- region %d flags = 0x%lx, size = 0x%lx, fd= %d, offset = 0x%lx"
+vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
+vfio_platform_realize(char *name, char *compat) "vfio device %s, compat = %s"
+vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING"
+vfio_platform_eoi_handle_pending(int index) "handle PENDING IRQ %d"
+
 #hw/acpi/memory_hotplug.c
 mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32
 mhp_acpi_read_addr_lo(uint32_t slot, uint32_t addr) "slot[0x%"PRIx32"] addr lo: 0x%"PRIx32
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v11 2/9] hw/vfio/platform: vfio-platform skeleton
@ 2015-03-19 10:05   ` Eric Auger
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: patches, pbonzini, kvmarm

Minimal VFIO platform implementation supporting register space
user mapping but not IRQ assignment.

Signed-off-by: Kim Phillips <kim.phillips@linaro.org>
Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v10 -> v11:
x Take into account Alex Bennee's comments:
- use g_malloc0_n instead of g_malloc0
- use block declarations when possible
- rework readlink returned value treatment
- use g_strlcat in place of strncat
x use g_snprintf in place of snprintf
x correct error handling in vfio_populate_device,
  in case of flag not corresponding to platform device
x various cosmetic changes

v9 -> v10:
- vfio_populate_device no more called in common vfio_get_device
  but in vfio_base_device_init

v8 -> v9:
- irq management is moved into a separate patch to ease the review
- VFIO_DEVICE_FLAGS_PLATFORM is checked in vfio_populate_device
- g_free of regions added in vfio_populate_device error label
- virtualID becomes 32b

v7 -> v8:
- change proto of vfio_platform_compute_needs_reset and sets
  vbasedev->needs_reset to false there
- vfio_[un]mask_irqindex renamed into vfio_[un]mask_single_irqindex
- vfio_register_irq_starter renamed into vfio_kick_irqs
  we now use a reset notifier instead of a machine init done notifier.
  Enables to get rid of the VfioIrqStarterNotifierParams dangling
  pointer. Previously we use pbus first_irq. This is no more possible
  since the reset notifier takes a void * and first_irq is a field of
  a const struct. So now we pass the DeviceState handle of the
  interrupt controller. I tried to keep the code generic, reason why
  I did not rely on an architecture specific accessor to retrieve
  the gsi number (gic accessor as proposed by Alex). I would like to
  avoid creating an ARM VFIO device model. I hope this model
  model can work on other archs than arm (no multiple intc?);
  wouldn't it be simpler to keep the previous first_irq parameter and
  relax the const constraint.

v6 -> v7:
- compat is not exposed anymore as a user option. Rationale is
  the vfio device became abstract and a specialization is needed
  anyway. The derived device must set the compat string.
- in v6 vfio_start_irq_injection was exposed in vfio-platform.h.
  A new function dubbed vfio_register_irq_starter replaces it. It
  registers a machine init done notifier that programs & starts
  all dynamic VFIO device IRQs. This function is supposed to be
  called by the machine file. A set of static helper routines are
  added too. It must be called before the creation of the platform
  bus device.

v5 -> v6:
- vfio_device property renamed into host property
- correct error handling of VFIO_DEVICE_GET_IRQ_INFO ioctl
  and remove PCI related comment
- remove declaration of vfio_setup_irqfd and irqfd_allowed
  property.Both belong to next patch (irqfd)
- remove declaration of vfio_intp_interrupt in vfio-platform.h
- functions that can be static get this characteristic
- remove declarations of vfio_region_ops, vfio_memory_listener,
  group_list, vfio_address_spaces. All are moved to vfio-common.h
- remove vfio_put_device declaration and definition
- print_regions removed. code moved into vfio_populate_regions
- replace DPRINTF by trace events
- new helper routine to set the trigger eventfd
- dissociate intp init from the injection enablement:
  vfio_enable_intp renamed into vfio_init_intp and new function
  named vfio_start_eventfd_injection
- injection start moved to vfio_start_irq_injection (not anymore
  in vfio_populate_interrupt)
- new start_irq_fn field in VFIOPlatformDevice corresponding to
  the function that will be used for starting injection
- user handled eventfd:
  x add mutex to protect IRQ state & list manipulation,
  x correct misleading comment in vfio_intp_interrupt.
  x Fix bugs thanks to fake interrupt modality
- VFIOPlatformDeviceClass becomes abstract
- add error_setg in vfio_platform_realize

v4 -> v5:
- vfio-plaform.h included first
- cleanup error handling in *populate*, vfio_get_device,
  vfio_enable_intp
- vfio_put_device not called anymore
- add some includes to follow vfio policy

v3 -> v4:
[Eric Auger]
- merge of "vfio: Add initial IRQ support in platform device"
  to get a full functional patch although perfs are limited.
- removal of unrealize function since I currently understand
  it is only used with device hot-plug feature.

v2 -> v3:
[Eric Auger]
- further factorization between PCI and platform (VFIORegion,
  VFIODevice). same level of functionality.

<= v2:
[Kim Philipps]
- Initial Creation of the device supporting register space mapping
---
 hw/vfio/Makefile.objs           |   1 +
 hw/vfio/platform.c              | 287 ++++++++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-common.h   |   1 +
 include/hw/vfio/vfio-platform.h |  44 ++++++
 trace-events                    |  12 ++
 5 files changed, 345 insertions(+)
 create mode 100644 hw/vfio/platform.c
 create mode 100644 include/hw/vfio/vfio-platform.h

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index e31f30e..c5c76fe 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -1,4 +1,5 @@
 ifeq ($(CONFIG_LINUX), y)
 obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o
+obj-$(CONFIG_SOFTMMU) += platform.o
 endif
diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
new file mode 100644
index 0000000..9cbfeb5
--- /dev/null
+++ b/hw/vfio/platform.c
@@ -0,0 +1,287 @@
+/*
+ * vfio based device assignment support - platform devices
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Kim Phillips <kim.phillips@linaro.org>
+ *  Eric Auger <eric.auger@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Based on vfio based PCI device assignment support:
+ *  Copyright Red Hat, Inc. 2012
+ */
+
+#include <linux/vfio.h>
+#include <sys/ioctl.h>
+
+#include "hw/vfio/vfio-platform.h"
+#include "qemu/error-report.h"
+#include "qemu/range.h"
+#include "sysemu/sysemu.h"
+#include "exec/memory.h"
+#include "hw/sysbus.h"
+#include "trace.h"
+#include "hw/platform-bus.h"
+
+/* VFIO skeleton */
+
+/* not implemented yet */
+static void vfio_platform_compute_needs_reset(VFIODevice *vbasedev)
+{
+    vbasedev->needs_reset = false;
+}
+
+/* not implemented yet */
+static int vfio_platform_hot_reset_multi(VFIODevice *vbasedev)
+{
+    return 0;
+}
+
+/**
+ * vfio_populate_device - Allocate and populate MMIO region
+ * structs according to driver returned information
+ * @vbasedev: the VFIO device handle
+ *
+ */
+static int vfio_populate_device(VFIODevice *vbasedev)
+{
+    int i, ret = -1;
+    VFIOPlatformDevice *vdev =
+        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
+
+    if (!(vbasedev->flags & VFIO_DEVICE_FLAGS_PLATFORM)) {
+        error_report("vfio: Um, this isn't a platform device");
+        return ret;
+    }
+
+    vdev->regions = g_malloc0_n(1,
+                                sizeof(VFIORegion *) * vbasedev->num_regions);
+
+    for (i = 0; i < vbasedev->num_regions; i++) {
+        struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
+        VFIORegion *ptr;
+
+        vdev->regions[i] = g_malloc0_n(1, sizeof(VFIORegion));
+        ptr = vdev->regions[i];
+        reg_info.index = i;
+        ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, &reg_info);
+        if (ret) {
+            error_report("vfio: Error getting region %d info: %m", i);
+            goto reg_error;
+        }
+        ptr->flags = reg_info.flags;
+        ptr->size = reg_info.size;
+        ptr->fd_offset = reg_info.offset;
+        ptr->nr = i;
+        ptr->vbasedev = vbasedev;
+
+        trace_vfio_platform_populate_regions(ptr->nr,
+                            (unsigned long)ptr->flags,
+                            (unsigned long)ptr->size,
+                            ptr->vbasedev->fd,
+                            (unsigned long)ptr->fd_offset);
+    }
+
+    return 0;
+reg_error:
+    for (i = 0; i < vbasedev->num_regions; i++) {
+        g_free(vdev->regions[i]);
+    }
+    g_free(vdev->regions);
+    return ret;
+}
+
+/* specialized functions for VFIO Platform devices */
+static VFIODeviceOps vfio_platform_ops = {
+    .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
+    .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
+};
+
+/**
+ * vfio_base_device_init - perform preliminary VFIO setup
+ * @vbasedev: the VFIO device handle
+ *
+ * Implement the VFIO command sequence that allows to discover
+ * assigned device resources: group extraction, device
+ * fd retrieval, resource query.
+ * Precondition: the device name must be initialized
+ */
+static int vfio_base_device_init(VFIODevice *vbasedev)
+{
+    VFIOGroup *group;
+    VFIODevice *vbasedev_iter;
+    char path[PATH_MAX], iommu_group_path[PATH_MAX], *group_name;
+    ssize_t len;
+    struct stat st;
+    int groupid;
+    int ret;
+
+    /* name must be set prior to the call */
+    if (!vbasedev->name) {
+        return -EINVAL;
+    }
+
+    /* Check that the host device exists */
+    g_snprintf(path, sizeof(path), "/sys/bus/platform/devices/%s/",
+               vbasedev->name);
+
+    if (stat(path, &st) < 0) {
+        error_report("vfio: error: no such host device: %s", path);
+        return -errno;
+    }
+
+    g_strlcat(path, "iommu_group", sizeof(path));
+    len = readlink(path, iommu_group_path, sizeof(iommu_group_path));
+    if (len < 0) {
+        error_report("vfio: error no iommu_group for device");
+        return -errno;
+    }
+
+    iommu_group_path[len] = 0;
+    group_name = basename(iommu_group_path);
+
+    if (sscanf(group_name, "%d", &groupid) != 1) {
+        error_report("vfio: error reading %s: %m", path);
+        return -errno;
+    }
+
+    trace_vfio_platform_base_device_init(vbasedev->name, groupid);
+
+    group = vfio_get_group(groupid, &address_space_memory);
+    if (!group) {
+        error_report("vfio: failed to get group %d", groupid);
+        return -ENOENT;
+    }
+
+    g_snprintf(path, sizeof(path), "%s", vbasedev->name);
+
+    QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
+        if (strcmp(vbasedev_iter->name, vbasedev->name) == 0) {
+            error_report("vfio: error: device %s is already attached", path);
+            vfio_put_group(group);
+            return -EBUSY;
+        }
+    }
+    ret = vfio_get_device(group, path, vbasedev);
+    if (ret) {
+        error_report("vfio: failed to get device %s", path);
+        vfio_put_group(group);
+        return ret;
+    }
+
+    ret = vfio_populate_device(vbasedev);
+    if (ret) {
+        error_report("vfio: failed to populate device %s", path);
+        vfio_put_group(group);
+    }
+
+    return ret;
+}
+
+/**
+ * vfio_map_region - initialize the 2 memory regions for a given
+ * MMIO region index
+ * @vdev: the VFIO platform device handle
+ * @nr: the index of the region
+ *
+ * Init the top memory region and the mmapped memory region beneath
+ * VFIOPlatformDevice is used since VFIODevice is not a QOM Object
+ * and could not be passed to memory region functions
+*/
+static void vfio_map_region(VFIOPlatformDevice *vdev, int nr)
+{
+    VFIORegion *region = vdev->regions[nr];
+    unsigned size = region->size;
+    char name[64];
+
+    if (!size) {
+        return;
+    }
+
+    g_snprintf(name, sizeof(name), "VFIO %s region %d",
+               vdev->vbasedev.name, nr);
+
+    /* A "slow" read/write mapping underlies all regions */
+    memory_region_init_io(&region->mem, OBJECT(vdev), &vfio_region_ops,
+                          region, name, size);
+
+    g_strlcat(name, " mmap", sizeof(name));
+
+    if (vfio_mmap_region(OBJECT(vdev), region, &region->mem,
+                         &region->mmap_mem, &region->mmap, size, 0, name)) {
+        error_report("%s unsupported. Performance may be slow", name);
+    }
+}
+
+/**
+ * vfio_platform_realize  - the device realize function
+ * @dev: device state pointer
+ * @errp: error
+ *
+ * initialize the device, its memory regions and IRQ structures
+ * IRQ are started separately
+ */
+static void vfio_platform_realize(DeviceState *dev, Error **errp)
+{
+    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
+    SysBusDevice *sbdev = SYS_BUS_DEVICE(dev);
+    VFIODevice *vbasedev = &vdev->vbasedev;
+    int i, ret;
+
+    vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
+    vbasedev->ops = &vfio_platform_ops;
+
+    trace_vfio_platform_realize(vbasedev->name, vdev->compat);
+
+    ret = vfio_base_device_init(vbasedev);
+    if (ret) {
+        error_setg(errp, "vfio: vfio_base_device_init failed for %s",
+                   vbasedev->name);
+        return;
+    }
+
+    for (i = 0; i < vbasedev->num_regions; i++) {
+        vfio_map_region(vdev, i);
+        sysbus_init_mmio(sbdev, &vdev->regions[i]->mem);
+    }
+}
+
+static const VMStateDescription vfio_platform_vmstate = {
+    .name = TYPE_VFIO_PLATFORM,
+    .unmigratable = 1,
+};
+
+static Property vfio_platform_dev_properties[] = {
+    DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void vfio_platform_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    dc->realize = vfio_platform_realize;
+    dc->props = vfio_platform_dev_properties;
+    dc->vmsd = &vfio_platform_vmstate;
+    dc->desc = "VFIO-based platform device assignment";
+    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+}
+
+static const TypeInfo vfio_platform_dev_info = {
+    .name = TYPE_VFIO_PLATFORM,
+    .parent = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(VFIOPlatformDevice),
+    .class_init = vfio_platform_class_init,
+    .class_size = sizeof(VFIOPlatformDeviceClass),
+    .abstract   = true,
+};
+
+static void register_vfio_platform_dev_type(void)
+{
+    type_register_static(&vfio_platform_dev_info);
+}
+
+type_init(register_vfio_platform_dev_type)
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 0d1fb80..59a321d 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -42,6 +42,7 @@
 
 enum {
     VFIO_DEVICE_TYPE_PCI = 0,
+    VFIO_DEVICE_TYPE_PLATFORM = 1,
 };
 
 typedef struct VFIORegion {
diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
new file mode 100644
index 0000000..338f0c6
--- /dev/null
+++ b/include/hw/vfio/vfio-platform.h
@@ -0,0 +1,44 @@
+/*
+ * vfio based device assignment support - platform devices
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Kim Phillips <kim.phillips@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Based on vfio based PCI device assignment support:
+ *  Copyright Red Hat, Inc. 2012
+ */
+
+#ifndef HW_VFIO_VFIO_PLATFORM_H
+#define HW_VFIO_VFIO_PLATFORM_H
+
+#include "hw/sysbus.h"
+#include "hw/vfio/vfio-common.h"
+
+#define TYPE_VFIO_PLATFORM "vfio-platform"
+
+typedef struct VFIOPlatformDevice {
+    SysBusDevice sbdev;
+    VFIODevice vbasedev; /* not a QOM object */
+    VFIORegion **regions;
+    char *compat; /* compatibility string */
+} VFIOPlatformDevice;
+
+typedef struct VFIOPlatformDeviceClass {
+    /*< private >*/
+    SysBusDeviceClass parent_class;
+    /*< public >*/
+} VFIOPlatformDeviceClass;
+
+#define VFIO_PLATFORM_DEVICE(obj) \
+     OBJECT_CHECK(VFIOPlatformDevice, (obj), TYPE_VFIO_PLATFORM)
+#define VFIO_PLATFORM_DEVICE_CLASS(klass) \
+     OBJECT_CLASS_CHECK(VFIOPlatformDeviceClass, (klass), TYPE_VFIO_PLATFORM)
+#define VFIO_PLATFORM_DEVICE_GET_CLASS(obj) \
+     OBJECT_GET_CLASS(VFIOPlatformDeviceClass, (obj), TYPE_VFIO_PLATFORM)
+
+#endif /*HW_VFIO_VFIO_PLATFORM_H*/
diff --git a/trace-events b/trace-events
index 30eba92..17c7108 100644
--- a/trace-events
+++ b/trace-events
@@ -1560,6 +1560,18 @@ vfio_put_group(int fd) "close group->fd=%d"
 vfio_get_device(const char * name, unsigned int flags, unsigned int num_regions, unsigned int num_irqs) "Device %s flags: %u, regions: %u, irqs: %u"
 vfio_put_base_device(int fd) "close vdev->fd=%d"
 
+# hw/vfio/platform.c
+vfio_platform_eoi(int pin, int fd) "EOI IRQ pin %d (fd=%d)"
+vfio_platform_mmap_set_enabled(bool enabled) "fast path = %d"
+vfio_platform_intp_mmap_enable(int pin) "IRQ #%d still active, stay in slow path"
+vfio_platform_intp_interrupt(int pin, int fd) "Handle IRQ #%d (fd = %d)"
+vfio_platform_populate_interrupts(int pin, int count, int flags) "- IRQ index %d: count %d, flags=0x%x"
+vfio_platform_populate_regions(int region_index, unsigned long flag, unsigned long size, int fd, unsigned long offset) "- region %d flags = 0x%lx, size = 0x%lx, fd= %d, offset = 0x%lx"
+vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
+vfio_platform_realize(char *name, char *compat) "vfio device %s, compat = %s"
+vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING"
+vfio_platform_eoi_handle_pending(int index) "handle PENDING IRQ %d"
+
 #hw/acpi/memory_hotplug.c
 mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32
 mhp_acpi_read_addr_lo(uint32_t slot, uint32_t addr) "slot[0x%"PRIx32"] addr lo: 0x%"PRIx32
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v11 3/9] hw/vfio/platform: add irq assignment
  2015-03-19 10:05 ` Eric Auger
@ 2015-03-19 10:05   ` Eric Auger
  -1 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: kim.phillips, patches, a.rigo, pbonzini, kvmarm, christoffer.dall

This patch adds the code requested to assign interrupts to
a guest. The interrupts are mediated through user handled
eventfds only.

The mechanics to start the IRQ handling is not yet there through.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v10 -> v11:
- use block declaration when possible
- change order of vfio_platform_eoi vs vfio_intp_interrupt
- introduce vfio_intp_inject_pending_lockheld following Alex Bennee
  comments
- remove unmasking/masked when setting up VFIO signaling
- remove unused kvm_accel member in VFIOINTp struct
- add flags member in VFIOINTp in order to properly discriminate
  edge/level-sensitive IRQs; unmask the physical IRQ only in case of
  level-sensitive IRQ
- some comment rewording

v8 -> v9:
- free irq related resources in case of error in vfio_populate_device
---
 hw/vfio/platform.c              | 327 +++++++++++++++++++++++++++++++++++++++-
 include/hw/vfio/vfio-platform.h |  36 +++++
 trace-events                    |   4 +-
 3 files changed, 364 insertions(+), 3 deletions(-)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 9cbfeb5..933940c 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -22,10 +22,299 @@
 #include "qemu/range.h"
 #include "sysemu/sysemu.h"
 #include "exec/memory.h"
+#include "qemu/queue.h"
 #include "hw/sysbus.h"
 #include "trace.h"
 #include "hw/platform-bus.h"
 
+/*
+ * Functions used whatever the injection method
+ */
+
+/**
+ * vfio_init_intp - allocate, initialize the IRQ struct pointer
+ * and add it into the list of IRQs
+ * @vbasedev: the VFIO device handle
+ * @info: irq info struct retrieved from VFIO driver
+ */
+static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev,
+                                struct vfio_irq_info info)
+{
+    int ret;
+    VFIOPlatformDevice *vdev =
+        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
+    SysBusDevice *sbdev = SYS_BUS_DEVICE(vdev);
+    VFIOINTp *intp;
+
+    intp = g_malloc0(sizeof(*intp));
+    intp->vdev = vdev;
+    intp->pin = info.index;
+    intp->flags = info.flags;
+    intp->state = VFIO_IRQ_INACTIVE;
+
+    sysbus_init_irq(sbdev, &intp->qemuirq);
+
+    /* Get an eventfd for trigger */
+    ret = event_notifier_init(&intp->interrupt, 0);
+    if (ret) {
+        g_free(intp);
+        error_report("vfio: Error: trigger event_notifier_init failed ");
+        return NULL;
+    }
+
+    QLIST_INSERT_HEAD(&vdev->intp_list, intp, next);
+    return intp;
+}
+
+/**
+ * vfio_set_trigger_eventfd - set VFIO eventfd handling
+ *
+ * @intp: IRQ struct handle
+ * @handler: handler to be called on eventfd signaling
+ *
+ * Setup VFIO signaling and attach an optional user-side handler
+ * to the eventfd
+ */
+static int vfio_set_trigger_eventfd(VFIOINTp *intp,
+                                    eventfd_user_side_handler_t handler)
+{
+    VFIODevice *vbasedev = &intp->vdev->vbasedev;
+    struct vfio_irq_set *irq_set;
+    int argsz, ret;
+    int32_t *pfd;
+
+    argsz = sizeof(*irq_set) + sizeof(*pfd);
+    irq_set = g_malloc0(argsz);
+    irq_set->argsz = argsz;
+    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
+    irq_set->index = intp->pin;
+    irq_set->start = 0;
+    irq_set->count = 1;
+    pfd = (int32_t *)&irq_set->data;
+    *pfd = event_notifier_get_fd(&intp->interrupt);
+    qemu_set_fd_handler(*pfd, (IOHandler *)handler, NULL, intp);
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+    g_free(irq_set);
+    if (ret < 0) {
+        error_report("vfio: Failed to set trigger eventfd: %m");
+        qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
+    }
+    return ret;
+}
+
+/*
+ * Functions only used when eventfds are handled on user-side
+ * ie. without irqfd
+ */
+
+/**
+ * vfio_mmap_set_enabled - enable/disable the fast path mode
+ * @vdev: the VFIO platform device
+ * @enabled: the target mmap state
+ *
+ * enabled = true ~ fast path = MMIO region is mmaped (no KVM TRAP);
+ * enabled = false ~ slow path = MMIO region is trapped and region callbacks
+ * are called; slow path enables to trap the device IRQ status register reset
+*/
+
+static void vfio_mmap_set_enabled(VFIOPlatformDevice *vdev, bool enabled)
+{
+    int i;
+
+    trace_vfio_platform_mmap_set_enabled(enabled);
+
+    for (i = 0; i < vdev->vbasedev.num_regions; i++) {
+        VFIORegion *region = vdev->regions[i];
+
+        memory_region_set_enabled(&region->mmap_mem, enabled);
+    }
+}
+
+/**
+ * vfio_intp_mmap_enable - timer function, restores the fast path
+ * if there is no more active IRQ
+ * @opaque: actually points to the VFIO platform device
+ *
+ * Called on mmap timer timout, this function checks whether the
+ * IRQ is still active and if not, restores the fast path.
+ * by construction a single eventfd is handled at a time.
+ * if the IRQ is still active, the timer is re-programmed.
+ */
+static void vfio_intp_mmap_enable(void *opaque)
+{
+    VFIOINTp *tmp;
+    VFIOPlatformDevice *vdev = (VFIOPlatformDevice *)opaque;
+
+    qemu_mutex_lock(&vdev->intp_mutex);
+    QLIST_FOREACH(tmp, &vdev->intp_list, next) {
+        if (tmp->state == VFIO_IRQ_ACTIVE) {
+            trace_vfio_platform_intp_mmap_enable(tmp->pin);
+            /* re-program the timer to check active status later */
+            timer_mod(vdev->mmap_timer,
+                      qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
+                          vdev->mmap_timeout);
+            qemu_mutex_unlock(&vdev->intp_mutex);
+            return;
+        }
+    }
+    vfio_mmap_set_enabled(vdev, true);
+    qemu_mutex_unlock(&vdev->intp_mutex);
+}
+
+/**
+ * vfio_intp_inject_pending_lockheld - Injects a pending IRQ
+ * @opaque: opaque pointer, in practice the VFIOINTp handle
+ *
+ * The function is called on a previous IRQ completion, from
+ * vfio_platform_eoi, while the intp_mutex is locked.
+ * Also in such situation, the slow path already is set and
+ * the mmap timer was already programmed.
+ */
+static void vfio_intp_inject_pending_lockheld(VFIOINTp *intp)
+{
+    trace_vfio_platform_intp_inject_pending_lockheld(intp->pin,
+                              event_notifier_get_fd(&intp->interrupt));
+
+    intp->state = VFIO_IRQ_ACTIVE;
+
+    /* trigger the virtual IRQ */
+    qemu_set_irq(intp->qemuirq, 1);
+}
+
+/**
+ * vfio_intp_interrupt - The user-side eventfd handler
+ * @opaque: opaque pointer which in practice is the VFIOINTp handle
+ *
+ * the function is entered in event handler context:
+ * the vIRQ is injected into the guest if there is no other active
+ * or pending IRQ.
+ */
+static void vfio_intp_interrupt(VFIOINTp *intp)
+{
+    int ret;
+    VFIOINTp *tmp;
+    VFIOPlatformDevice *vdev = intp->vdev;
+    bool delay_handling = false;
+
+    qemu_mutex_lock(&vdev->intp_mutex);
+    if (intp->state == VFIO_IRQ_INACTIVE) {
+        QLIST_FOREACH(tmp, &vdev->intp_list, next) {
+            if (tmp->state == VFIO_IRQ_ACTIVE ||
+                tmp->state == VFIO_IRQ_PENDING) {
+                delay_handling = true;
+                break;
+            }
+        }
+    }
+    if (delay_handling) {
+        /*
+         * the new IRQ gets a pending status and is pushed in
+         * the pending queue
+         */
+        intp->state = VFIO_IRQ_PENDING;
+        trace_vfio_intp_interrupt_set_pending(intp->pin);
+        QSIMPLEQ_INSERT_TAIL(&vdev->pending_intp_queue,
+                             intp, pqnext);
+        ret = event_notifier_test_and_clear(&intp->interrupt);
+        qemu_mutex_unlock(&vdev->intp_mutex);
+        return;
+    }
+
+    trace_vfio_platform_intp_interrupt(intp->pin,
+                              event_notifier_get_fd(&intp->interrupt));
+
+    ret = event_notifier_test_and_clear(&intp->interrupt);
+    if (!ret) {
+        error_report("Error when clearing fd=%d (ret = %d)\n",
+                     event_notifier_get_fd(&intp->interrupt), ret);
+    }
+
+    intp->state = VFIO_IRQ_ACTIVE;
+
+    /* sets slow path */
+    vfio_mmap_set_enabled(vdev, false);
+
+    /* trigger the virtual IRQ */
+    qemu_set_irq(intp->qemuirq, 1);
+
+    /*
+     * Schedule the mmap timer which will restore fastpath when no IRQ
+     * is active anymore
+     */
+    if (vdev->mmap_timeout) {
+        timer_mod(vdev->mmap_timer,
+                  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
+                      vdev->mmap_timeout);
+    }
+    qemu_mutex_unlock(&vdev->intp_mutex);
+}
+
+/**
+ * vfio_platform_eoi - IRQ completion routine
+ * @vbasedev: the VFIO device handle
+ *
+ * De-asserts the active virtual IRQ and unmasks the physical IRQ
+ * (effective for level sensitive IRQ auto-masked by the  VFIO driver).
+ * Then it handles next pending IRQ if any.
+ * eoi function is called on the first access to any MMIO region
+ * after an IRQ was triggered, trapped since slow path was set.
+ * It is assumed this access corresponds to the IRQ status
+ * register reset. With such a mechanism, a single IRQ can be
+ * handled at a time since there is no way to know which IRQ
+ * was completed by the guest (we would need additional details
+ * about the IRQ status register mask).
+ */
+static void vfio_platform_eoi(VFIODevice *vbasedev)
+{
+    VFIOINTp *intp;
+    VFIOPlatformDevice *vdev =
+        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
+
+    qemu_mutex_lock(&vdev->intp_mutex);
+    QLIST_FOREACH(intp, &vdev->intp_list, next) {
+        if (intp->state == VFIO_IRQ_ACTIVE) {
+            trace_vfio_platform_eoi(intp->pin,
+                                event_notifier_get_fd(&intp->interrupt));
+            intp->state = VFIO_IRQ_INACTIVE;
+
+            /* deassert the virtual IRQ */
+            qemu_set_irq(intp->qemuirq, 0);
+
+            if (intp->flags & VFIO_IRQ_INFO_AUTOMASKED) {
+                /* unmasks the physical level-sensitive IRQ */
+                vfio_unmask_single_irqindex(vbasedev, intp->pin);
+            }
+
+            /* a single IRQ can be active at a time */
+            break;
+        }
+    }
+    /* in case there are pending IRQs, handle the first one */
+    if (!QSIMPLEQ_EMPTY(&vdev->pending_intp_queue)) {
+        intp = QSIMPLEQ_FIRST(&vdev->pending_intp_queue);
+        vfio_intp_inject_pending_lockheld(intp);
+        QSIMPLEQ_REMOVE_HEAD(&vdev->pending_intp_queue, pqnext);
+    }
+    qemu_mutex_unlock(&vdev->intp_mutex);
+}
+
+/**
+ * vfio_start_eventfd_injection - starts the virtual IRQ injection using
+ * user-side handled eventfds
+ * @intp: the IRQ struct pointer
+ */
+
+static int vfio_start_eventfd_injection(VFIOINTp *intp)
+{
+    int ret;
+
+    ret = vfio_set_trigger_eventfd(intp, vfio_intp_interrupt);
+    if (ret) {
+        error_report("vfio: Error: Failed to pass IRQ fd to the driver: %m");
+    }
+    return ret;
+}
+
 /* VFIO skeleton */
 
 /* not implemented yet */
@@ -42,12 +331,13 @@ static int vfio_platform_hot_reset_multi(VFIODevice *vbasedev)
 
 /**
  * vfio_populate_device - Allocate and populate MMIO region
- * structs according to driver returned information
+ * and IRQ structs according to driver returned information
  * @vbasedev: the VFIO device handle
  *
  */
 static int vfio_populate_device(VFIODevice *vbasedev)
 {
+    VFIOINTp *intp, *tmp;
     int i, ret = -1;
     VFIOPlatformDevice *vdev =
         container_of(vbasedev, VFIOPlatformDevice, vbasedev);
@@ -85,7 +375,38 @@ static int vfio_populate_device(VFIODevice *vbasedev)
                             (unsigned long)ptr->fd_offset);
     }
 
+    vdev->mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
+                                    vfio_intp_mmap_enable, vdev);
+
+    QSIMPLEQ_INIT(&vdev->pending_intp_queue);
+
+    for (i = 0; i < vbasedev->num_irqs; i++) {
+        struct vfio_irq_info irq = { .argsz = sizeof(irq) };
+
+        irq.index = i;
+        ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_IRQ_INFO, &irq);
+        if (ret) {
+            error_printf("vfio: error getting device %s irq info",
+                         vbasedev->name);
+            goto irq_err;
+        } else {
+            trace_vfio_platform_populate_interrupts(irq.index,
+                                                    irq.count,
+                                                    irq.flags);
+            intp = vfio_init_intp(vbasedev, irq);
+            if (!intp) {
+                error_report("vfio: Error installing IRQ %d up", i);
+                goto irq_err;
+            }
+        }
+    }
     return 0;
+irq_err:
+    timer_del(vdev->mmap_timer);
+    QLIST_FOREACH_SAFE(intp, &vdev->intp_list, next, tmp) {
+        QLIST_REMOVE(intp, next);
+        g_free(intp);
+    }
 reg_error:
     for (i = 0; i < vbasedev->num_regions; i++) {
         g_free(vdev->regions[i]);
@@ -98,6 +419,7 @@ reg_error:
 static VFIODeviceOps vfio_platform_ops = {
     .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
     .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
+    .vfio_eoi = vfio_platform_eoi,
 };
 
 /**
@@ -233,6 +555,7 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
 
     vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
     vbasedev->ops = &vfio_platform_ops;
+    vdev->start_irq_fn = vfio_start_eventfd_injection;
 
     trace_vfio_platform_realize(vbasedev->name, vdev->compat);
 
@@ -256,6 +579,8 @@ static const VMStateDescription vfio_platform_vmstate = {
 
 static Property vfio_platform_dev_properties[] = {
     DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
+    DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice,
+                       mmap_timeout, 1100),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
index 338f0c6..31893a3 100644
--- a/include/hw/vfio/vfio-platform.h
+++ b/include/hw/vfio/vfio-platform.h
@@ -18,14 +18,50 @@
 
 #include "hw/sysbus.h"
 #include "hw/vfio/vfio-common.h"
+#include "qemu/event_notifier.h"
+#include "qemu/queue.h"
+#include "hw/irq.h"
 
 #define TYPE_VFIO_PLATFORM "vfio-platform"
 
+enum {
+    VFIO_IRQ_INACTIVE = 0,
+    VFIO_IRQ_PENDING = 1,
+    VFIO_IRQ_ACTIVE = 2,
+    /* VFIO_IRQ_ACTIVE_AND_PENDING cannot happen with VFIO */
+};
+
+typedef struct VFIOINTp {
+    QLIST_ENTRY(VFIOINTp) next; /* entry for IRQ list */
+    QSIMPLEQ_ENTRY(VFIOINTp) pqnext; /* entry for pending IRQ queue */
+    EventNotifier interrupt; /* eventfd triggered on interrupt */
+    EventNotifier unmask; /* eventfd for unmask on QEMU bypass */
+    qemu_irq qemuirq;
+    struct VFIOPlatformDevice *vdev; /* back pointer to device */
+    int state; /* inactive, pending, active */
+    uint8_t pin; /* index */
+    uint32_t virtualID; /* virtual IRQ */
+    uint32_t flags; /* IRQ info flags */
+} VFIOINTp;
+
+/* function type for routine starting IRQ propagation to the guest */
+typedef int (*start_irq_fn_t)(VFIOINTp *intp);
+/* function type for user side eventfd handler */
+typedef void (*eventfd_user_side_handler_t)(VFIOINTp *intp);
+
 typedef struct VFIOPlatformDevice {
     SysBusDevice sbdev;
     VFIODevice vbasedev; /* not a QOM object */
     VFIORegion **regions;
+    QLIST_HEAD(, VFIOINTp) intp_list; /* list of IRQs */
+    /* queue of pending IRQs */
+    QSIMPLEQ_HEAD(pending_intp_queue, VFIOINTp) pending_intp_queue;
     char *compat; /* compatibility string */
+    uint32_t mmap_timeout; /* delay to re-enable mmaps after interrupt */
+    QEMUTimer *mmap_timer; /* allows fast-path resume after IRQ hit */
+    /* function used to start IRQ propagation to the guest */
+    start_irq_fn_t start_irq_fn;
+    QemuMutex intp_mutex; /* protect the intp_list IRQ state */
 } VFIOPlatformDevice;
 
 typedef struct VFIOPlatformDeviceClass {
diff --git a/trace-events b/trace-events
index 17c7108..c3dbfc3 100644
--- a/trace-events
+++ b/trace-events
@@ -1564,13 +1564,13 @@ vfio_put_base_device(int fd) "close vdev->fd=%d"
 vfio_platform_eoi(int pin, int fd) "EOI IRQ pin %d (fd=%d)"
 vfio_platform_mmap_set_enabled(bool enabled) "fast path = %d"
 vfio_platform_intp_mmap_enable(int pin) "IRQ #%d still active, stay in slow path"
-vfio_platform_intp_interrupt(int pin, int fd) "Handle IRQ #%d (fd = %d)"
+vfio_platform_intp_interrupt(int pin, int fd) "Inject IRQ #%d (fd = %d)"
+vfio_platform_intp_inject_pending_lockheld(int pin, int fd) "Inject pending IRQ #%d (fd = %d)"
 vfio_platform_populate_interrupts(int pin, int count, int flags) "- IRQ index %d: count %d, flags=0x%x"
 vfio_platform_populate_regions(int region_index, unsigned long flag, unsigned long size, int fd, unsigned long offset) "- region %d flags = 0x%lx, size = 0x%lx, fd= %d, offset = 0x%lx"
 vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
 vfio_platform_realize(char *name, char *compat) "vfio device %s, compat = %s"
 vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING"
-vfio_platform_eoi_handle_pending(int index) "handle PENDING IRQ %d"
 
 #hw/acpi/memory_hotplug.c
 mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v11 3/9] hw/vfio/platform: add irq assignment
@ 2015-03-19 10:05   ` Eric Auger
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: patches, pbonzini, kvmarm

This patch adds the code requested to assign interrupts to
a guest. The interrupts are mediated through user handled
eventfds only.

The mechanics to start the IRQ handling is not yet there through.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v10 -> v11:
- use block declaration when possible
- change order of vfio_platform_eoi vs vfio_intp_interrupt
- introduce vfio_intp_inject_pending_lockheld following Alex Bennee
  comments
- remove unmasking/masked when setting up VFIO signaling
- remove unused kvm_accel member in VFIOINTp struct
- add flags member in VFIOINTp in order to properly discriminate
  edge/level-sensitive IRQs; unmask the physical IRQ only in case of
  level-sensitive IRQ
- some comment rewording

v8 -> v9:
- free irq related resources in case of error in vfio_populate_device
---
 hw/vfio/platform.c              | 327 +++++++++++++++++++++++++++++++++++++++-
 include/hw/vfio/vfio-platform.h |  36 +++++
 trace-events                    |   4 +-
 3 files changed, 364 insertions(+), 3 deletions(-)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 9cbfeb5..933940c 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -22,10 +22,299 @@
 #include "qemu/range.h"
 #include "sysemu/sysemu.h"
 #include "exec/memory.h"
+#include "qemu/queue.h"
 #include "hw/sysbus.h"
 #include "trace.h"
 #include "hw/platform-bus.h"
 
+/*
+ * Functions used whatever the injection method
+ */
+
+/**
+ * vfio_init_intp - allocate, initialize the IRQ struct pointer
+ * and add it into the list of IRQs
+ * @vbasedev: the VFIO device handle
+ * @info: irq info struct retrieved from VFIO driver
+ */
+static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev,
+                                struct vfio_irq_info info)
+{
+    int ret;
+    VFIOPlatformDevice *vdev =
+        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
+    SysBusDevice *sbdev = SYS_BUS_DEVICE(vdev);
+    VFIOINTp *intp;
+
+    intp = g_malloc0(sizeof(*intp));
+    intp->vdev = vdev;
+    intp->pin = info.index;
+    intp->flags = info.flags;
+    intp->state = VFIO_IRQ_INACTIVE;
+
+    sysbus_init_irq(sbdev, &intp->qemuirq);
+
+    /* Get an eventfd for trigger */
+    ret = event_notifier_init(&intp->interrupt, 0);
+    if (ret) {
+        g_free(intp);
+        error_report("vfio: Error: trigger event_notifier_init failed ");
+        return NULL;
+    }
+
+    QLIST_INSERT_HEAD(&vdev->intp_list, intp, next);
+    return intp;
+}
+
+/**
+ * vfio_set_trigger_eventfd - set VFIO eventfd handling
+ *
+ * @intp: IRQ struct handle
+ * @handler: handler to be called on eventfd signaling
+ *
+ * Setup VFIO signaling and attach an optional user-side handler
+ * to the eventfd
+ */
+static int vfio_set_trigger_eventfd(VFIOINTp *intp,
+                                    eventfd_user_side_handler_t handler)
+{
+    VFIODevice *vbasedev = &intp->vdev->vbasedev;
+    struct vfio_irq_set *irq_set;
+    int argsz, ret;
+    int32_t *pfd;
+
+    argsz = sizeof(*irq_set) + sizeof(*pfd);
+    irq_set = g_malloc0(argsz);
+    irq_set->argsz = argsz;
+    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
+    irq_set->index = intp->pin;
+    irq_set->start = 0;
+    irq_set->count = 1;
+    pfd = (int32_t *)&irq_set->data;
+    *pfd = event_notifier_get_fd(&intp->interrupt);
+    qemu_set_fd_handler(*pfd, (IOHandler *)handler, NULL, intp);
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+    g_free(irq_set);
+    if (ret < 0) {
+        error_report("vfio: Failed to set trigger eventfd: %m");
+        qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
+    }
+    return ret;
+}
+
+/*
+ * Functions only used when eventfds are handled on user-side
+ * ie. without irqfd
+ */
+
+/**
+ * vfio_mmap_set_enabled - enable/disable the fast path mode
+ * @vdev: the VFIO platform device
+ * @enabled: the target mmap state
+ *
+ * enabled = true ~ fast path = MMIO region is mmaped (no KVM TRAP);
+ * enabled = false ~ slow path = MMIO region is trapped and region callbacks
+ * are called; slow path enables to trap the device IRQ status register reset
+*/
+
+static void vfio_mmap_set_enabled(VFIOPlatformDevice *vdev, bool enabled)
+{
+    int i;
+
+    trace_vfio_platform_mmap_set_enabled(enabled);
+
+    for (i = 0; i < vdev->vbasedev.num_regions; i++) {
+        VFIORegion *region = vdev->regions[i];
+
+        memory_region_set_enabled(&region->mmap_mem, enabled);
+    }
+}
+
+/**
+ * vfio_intp_mmap_enable - timer function, restores the fast path
+ * if there is no more active IRQ
+ * @opaque: actually points to the VFIO platform device
+ *
+ * Called on mmap timer timout, this function checks whether the
+ * IRQ is still active and if not, restores the fast path.
+ * by construction a single eventfd is handled at a time.
+ * if the IRQ is still active, the timer is re-programmed.
+ */
+static void vfio_intp_mmap_enable(void *opaque)
+{
+    VFIOINTp *tmp;
+    VFIOPlatformDevice *vdev = (VFIOPlatformDevice *)opaque;
+
+    qemu_mutex_lock(&vdev->intp_mutex);
+    QLIST_FOREACH(tmp, &vdev->intp_list, next) {
+        if (tmp->state == VFIO_IRQ_ACTIVE) {
+            trace_vfio_platform_intp_mmap_enable(tmp->pin);
+            /* re-program the timer to check active status later */
+            timer_mod(vdev->mmap_timer,
+                      qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
+                          vdev->mmap_timeout);
+            qemu_mutex_unlock(&vdev->intp_mutex);
+            return;
+        }
+    }
+    vfio_mmap_set_enabled(vdev, true);
+    qemu_mutex_unlock(&vdev->intp_mutex);
+}
+
+/**
+ * vfio_intp_inject_pending_lockheld - Injects a pending IRQ
+ * @opaque: opaque pointer, in practice the VFIOINTp handle
+ *
+ * The function is called on a previous IRQ completion, from
+ * vfio_platform_eoi, while the intp_mutex is locked.
+ * Also in such situation, the slow path already is set and
+ * the mmap timer was already programmed.
+ */
+static void vfio_intp_inject_pending_lockheld(VFIOINTp *intp)
+{
+    trace_vfio_platform_intp_inject_pending_lockheld(intp->pin,
+                              event_notifier_get_fd(&intp->interrupt));
+
+    intp->state = VFIO_IRQ_ACTIVE;
+
+    /* trigger the virtual IRQ */
+    qemu_set_irq(intp->qemuirq, 1);
+}
+
+/**
+ * vfio_intp_interrupt - The user-side eventfd handler
+ * @opaque: opaque pointer which in practice is the VFIOINTp handle
+ *
+ * the function is entered in event handler context:
+ * the vIRQ is injected into the guest if there is no other active
+ * or pending IRQ.
+ */
+static void vfio_intp_interrupt(VFIOINTp *intp)
+{
+    int ret;
+    VFIOINTp *tmp;
+    VFIOPlatformDevice *vdev = intp->vdev;
+    bool delay_handling = false;
+
+    qemu_mutex_lock(&vdev->intp_mutex);
+    if (intp->state == VFIO_IRQ_INACTIVE) {
+        QLIST_FOREACH(tmp, &vdev->intp_list, next) {
+            if (tmp->state == VFIO_IRQ_ACTIVE ||
+                tmp->state == VFIO_IRQ_PENDING) {
+                delay_handling = true;
+                break;
+            }
+        }
+    }
+    if (delay_handling) {
+        /*
+         * the new IRQ gets a pending status and is pushed in
+         * the pending queue
+         */
+        intp->state = VFIO_IRQ_PENDING;
+        trace_vfio_intp_interrupt_set_pending(intp->pin);
+        QSIMPLEQ_INSERT_TAIL(&vdev->pending_intp_queue,
+                             intp, pqnext);
+        ret = event_notifier_test_and_clear(&intp->interrupt);
+        qemu_mutex_unlock(&vdev->intp_mutex);
+        return;
+    }
+
+    trace_vfio_platform_intp_interrupt(intp->pin,
+                              event_notifier_get_fd(&intp->interrupt));
+
+    ret = event_notifier_test_and_clear(&intp->interrupt);
+    if (!ret) {
+        error_report("Error when clearing fd=%d (ret = %d)\n",
+                     event_notifier_get_fd(&intp->interrupt), ret);
+    }
+
+    intp->state = VFIO_IRQ_ACTIVE;
+
+    /* sets slow path */
+    vfio_mmap_set_enabled(vdev, false);
+
+    /* trigger the virtual IRQ */
+    qemu_set_irq(intp->qemuirq, 1);
+
+    /*
+     * Schedule the mmap timer which will restore fastpath when no IRQ
+     * is active anymore
+     */
+    if (vdev->mmap_timeout) {
+        timer_mod(vdev->mmap_timer,
+                  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
+                      vdev->mmap_timeout);
+    }
+    qemu_mutex_unlock(&vdev->intp_mutex);
+}
+
+/**
+ * vfio_platform_eoi - IRQ completion routine
+ * @vbasedev: the VFIO device handle
+ *
+ * De-asserts the active virtual IRQ and unmasks the physical IRQ
+ * (effective for level sensitive IRQ auto-masked by the  VFIO driver).
+ * Then it handles next pending IRQ if any.
+ * eoi function is called on the first access to any MMIO region
+ * after an IRQ was triggered, trapped since slow path was set.
+ * It is assumed this access corresponds to the IRQ status
+ * register reset. With such a mechanism, a single IRQ can be
+ * handled at a time since there is no way to know which IRQ
+ * was completed by the guest (we would need additional details
+ * about the IRQ status register mask).
+ */
+static void vfio_platform_eoi(VFIODevice *vbasedev)
+{
+    VFIOINTp *intp;
+    VFIOPlatformDevice *vdev =
+        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
+
+    qemu_mutex_lock(&vdev->intp_mutex);
+    QLIST_FOREACH(intp, &vdev->intp_list, next) {
+        if (intp->state == VFIO_IRQ_ACTIVE) {
+            trace_vfio_platform_eoi(intp->pin,
+                                event_notifier_get_fd(&intp->interrupt));
+            intp->state = VFIO_IRQ_INACTIVE;
+
+            /* deassert the virtual IRQ */
+            qemu_set_irq(intp->qemuirq, 0);
+
+            if (intp->flags & VFIO_IRQ_INFO_AUTOMASKED) {
+                /* unmasks the physical level-sensitive IRQ */
+                vfio_unmask_single_irqindex(vbasedev, intp->pin);
+            }
+
+            /* a single IRQ can be active at a time */
+            break;
+        }
+    }
+    /* in case there are pending IRQs, handle the first one */
+    if (!QSIMPLEQ_EMPTY(&vdev->pending_intp_queue)) {
+        intp = QSIMPLEQ_FIRST(&vdev->pending_intp_queue);
+        vfio_intp_inject_pending_lockheld(intp);
+        QSIMPLEQ_REMOVE_HEAD(&vdev->pending_intp_queue, pqnext);
+    }
+    qemu_mutex_unlock(&vdev->intp_mutex);
+}
+
+/**
+ * vfio_start_eventfd_injection - starts the virtual IRQ injection using
+ * user-side handled eventfds
+ * @intp: the IRQ struct pointer
+ */
+
+static int vfio_start_eventfd_injection(VFIOINTp *intp)
+{
+    int ret;
+
+    ret = vfio_set_trigger_eventfd(intp, vfio_intp_interrupt);
+    if (ret) {
+        error_report("vfio: Error: Failed to pass IRQ fd to the driver: %m");
+    }
+    return ret;
+}
+
 /* VFIO skeleton */
 
 /* not implemented yet */
@@ -42,12 +331,13 @@ static int vfio_platform_hot_reset_multi(VFIODevice *vbasedev)
 
 /**
  * vfio_populate_device - Allocate and populate MMIO region
- * structs according to driver returned information
+ * and IRQ structs according to driver returned information
  * @vbasedev: the VFIO device handle
  *
  */
 static int vfio_populate_device(VFIODevice *vbasedev)
 {
+    VFIOINTp *intp, *tmp;
     int i, ret = -1;
     VFIOPlatformDevice *vdev =
         container_of(vbasedev, VFIOPlatformDevice, vbasedev);
@@ -85,7 +375,38 @@ static int vfio_populate_device(VFIODevice *vbasedev)
                             (unsigned long)ptr->fd_offset);
     }
 
+    vdev->mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
+                                    vfio_intp_mmap_enable, vdev);
+
+    QSIMPLEQ_INIT(&vdev->pending_intp_queue);
+
+    for (i = 0; i < vbasedev->num_irqs; i++) {
+        struct vfio_irq_info irq = { .argsz = sizeof(irq) };
+
+        irq.index = i;
+        ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_IRQ_INFO, &irq);
+        if (ret) {
+            error_printf("vfio: error getting device %s irq info",
+                         vbasedev->name);
+            goto irq_err;
+        } else {
+            trace_vfio_platform_populate_interrupts(irq.index,
+                                                    irq.count,
+                                                    irq.flags);
+            intp = vfio_init_intp(vbasedev, irq);
+            if (!intp) {
+                error_report("vfio: Error installing IRQ %d up", i);
+                goto irq_err;
+            }
+        }
+    }
     return 0;
+irq_err:
+    timer_del(vdev->mmap_timer);
+    QLIST_FOREACH_SAFE(intp, &vdev->intp_list, next, tmp) {
+        QLIST_REMOVE(intp, next);
+        g_free(intp);
+    }
 reg_error:
     for (i = 0; i < vbasedev->num_regions; i++) {
         g_free(vdev->regions[i]);
@@ -98,6 +419,7 @@ reg_error:
 static VFIODeviceOps vfio_platform_ops = {
     .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
     .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
+    .vfio_eoi = vfio_platform_eoi,
 };
 
 /**
@@ -233,6 +555,7 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
 
     vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
     vbasedev->ops = &vfio_platform_ops;
+    vdev->start_irq_fn = vfio_start_eventfd_injection;
 
     trace_vfio_platform_realize(vbasedev->name, vdev->compat);
 
@@ -256,6 +579,8 @@ static const VMStateDescription vfio_platform_vmstate = {
 
 static Property vfio_platform_dev_properties[] = {
     DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
+    DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice,
+                       mmap_timeout, 1100),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
index 338f0c6..31893a3 100644
--- a/include/hw/vfio/vfio-platform.h
+++ b/include/hw/vfio/vfio-platform.h
@@ -18,14 +18,50 @@
 
 #include "hw/sysbus.h"
 #include "hw/vfio/vfio-common.h"
+#include "qemu/event_notifier.h"
+#include "qemu/queue.h"
+#include "hw/irq.h"
 
 #define TYPE_VFIO_PLATFORM "vfio-platform"
 
+enum {
+    VFIO_IRQ_INACTIVE = 0,
+    VFIO_IRQ_PENDING = 1,
+    VFIO_IRQ_ACTIVE = 2,
+    /* VFIO_IRQ_ACTIVE_AND_PENDING cannot happen with VFIO */
+};
+
+typedef struct VFIOINTp {
+    QLIST_ENTRY(VFIOINTp) next; /* entry for IRQ list */
+    QSIMPLEQ_ENTRY(VFIOINTp) pqnext; /* entry for pending IRQ queue */
+    EventNotifier interrupt; /* eventfd triggered on interrupt */
+    EventNotifier unmask; /* eventfd for unmask on QEMU bypass */
+    qemu_irq qemuirq;
+    struct VFIOPlatformDevice *vdev; /* back pointer to device */
+    int state; /* inactive, pending, active */
+    uint8_t pin; /* index */
+    uint32_t virtualID; /* virtual IRQ */
+    uint32_t flags; /* IRQ info flags */
+} VFIOINTp;
+
+/* function type for routine starting IRQ propagation to the guest */
+typedef int (*start_irq_fn_t)(VFIOINTp *intp);
+/* function type for user side eventfd handler */
+typedef void (*eventfd_user_side_handler_t)(VFIOINTp *intp);
+
 typedef struct VFIOPlatformDevice {
     SysBusDevice sbdev;
     VFIODevice vbasedev; /* not a QOM object */
     VFIORegion **regions;
+    QLIST_HEAD(, VFIOINTp) intp_list; /* list of IRQs */
+    /* queue of pending IRQs */
+    QSIMPLEQ_HEAD(pending_intp_queue, VFIOINTp) pending_intp_queue;
     char *compat; /* compatibility string */
+    uint32_t mmap_timeout; /* delay to re-enable mmaps after interrupt */
+    QEMUTimer *mmap_timer; /* allows fast-path resume after IRQ hit */
+    /* function used to start IRQ propagation to the guest */
+    start_irq_fn_t start_irq_fn;
+    QemuMutex intp_mutex; /* protect the intp_list IRQ state */
 } VFIOPlatformDevice;
 
 typedef struct VFIOPlatformDeviceClass {
diff --git a/trace-events b/trace-events
index 17c7108..c3dbfc3 100644
--- a/trace-events
+++ b/trace-events
@@ -1564,13 +1564,13 @@ vfio_put_base_device(int fd) "close vdev->fd=%d"
 vfio_platform_eoi(int pin, int fd) "EOI IRQ pin %d (fd=%d)"
 vfio_platform_mmap_set_enabled(bool enabled) "fast path = %d"
 vfio_platform_intp_mmap_enable(int pin) "IRQ #%d still active, stay in slow path"
-vfio_platform_intp_interrupt(int pin, int fd) "Handle IRQ #%d (fd = %d)"
+vfio_platform_intp_interrupt(int pin, int fd) "Inject IRQ #%d (fd = %d)"
+vfio_platform_intp_inject_pending_lockheld(int pin, int fd) "Inject pending IRQ #%d (fd = %d)"
 vfio_platform_populate_interrupts(int pin, int count, int flags) "- IRQ index %d: count %d, flags=0x%x"
 vfio_platform_populate_regions(int region_index, unsigned long flag, unsigned long size, int fd, unsigned long offset) "- region %d flags = 0x%lx, size = 0x%lx, fd= %d, offset = 0x%lx"
 vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
 vfio_platform_realize(char *name, char *compat) "vfio device %s, compat = %s"
 vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING"
-vfio_platform_eoi_handle_pending(int index) "handle PENDING IRQ %d"
 
 #hw/acpi/memory_hotplug.c
 mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v11 4/9] hw/vfio/platform: add capability to start IRQ propagation
  2015-03-19 10:05 ` Eric Auger
@ 2015-03-19 10:05   ` Eric Auger
  -1 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: kim.phillips, patches, a.rigo, pbonzini, kvmarm, christoffer.dall

Add a reset notify function that enables to start the propagation of
interrupts to the guest.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v10 -> v11:
- comment reword

v8 -> v9:
- handle failure in vfio_irq_starter
---
 hw/vfio/platform.c              | 64 +++++++++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-platform.h |  8 ++++++
 2 files changed, 72 insertions(+)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 933940c..59eeca2 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -572,6 +572,70 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
     }
 }
 
+/**
+ * vfio_irq_starter: Start IRQ forwarding for a specific VFIO device
+ * @sbdev: Sysbus device handle
+ * @opaque: DeviceState handle of the interrupt controller the device
+ *          is attached to
+ *
+ * The function retrieves the absolute GSI number each VFIO IRQ
+ * corresponds to and start forwarding.
+ */
+static int vfio_irq_starter(SysBusDevice *sbdev, void *opaque)
+{
+    DeviceState *intc = (DeviceState *)opaque;
+    VFIOPlatformDevice *vdev;
+    VFIOINTp *intp;
+    qemu_irq irq;
+    int gsi, ret;
+
+    if (object_dynamic_cast(OBJECT(sbdev), TYPE_VFIO_PLATFORM)) {
+        vdev = VFIO_PLATFORM_DEVICE(sbdev);
+
+        QLIST_FOREACH(intp, &vdev->intp_list, next) {
+            gsi = 0;
+            while (1) {
+                irq = qdev_get_gpio_in(intc, gsi);
+                if (irq == intp->qemuirq) {
+                    break;
+                }
+                gsi++;
+            }
+            intp->virtualID = gsi;
+            ret = vdev->start_irq_fn(intp);
+            if (ret) {
+                error_report("%s unable to start propagation of IRQ index %d",
+                             vdev->vbasedev.name, intp->pin);
+                exit(1);
+            }
+        }
+    }
+    return 0;
+}
+
+/**
+ * vfio_kick_irqs: start IRQ forwarding for all the VFIO devices
+ * attached to the platform bus
+ * @data: Device state handle of the interrupt controller the VFIO IRQs
+ *        must be bound to
+ *
+ * Binding to the platform bus IRQs happens on a machine init done
+ * notifier registered by the platform bus. Only at that time the
+ * absolute virtual IRQ = GSI is known and allows to setup IRQFD.
+ * This function typically can be called in a reset notifier registered
+ * by the machine file.
+ */
+void vfio_kick_irqs(void *data)
+{
+    DeviceState *intc = (DeviceState *)data;
+    DeviceState *dev =
+        qdev_find_recursive(sysbus_get_default(), TYPE_PLATFORM_BUS_DEVICE);
+    PlatformBusDevice *pbus = PLATFORM_BUS_DEVICE(dev);
+
+    assert(pbus->done_gathering);
+    foreach_dynamic_sysbus_device(vfio_irq_starter, intc);
+}
+
 static const VMStateDescription vfio_platform_vmstate = {
     .name = TYPE_VFIO_PLATFORM,
     .unmigratable = 1,
diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
index 31893a3..c9ee898 100644
--- a/include/hw/vfio/vfio-platform.h
+++ b/include/hw/vfio/vfio-platform.h
@@ -77,4 +77,12 @@ typedef struct VFIOPlatformDeviceClass {
 #define VFIO_PLATFORM_DEVICE_GET_CLASS(obj) \
      OBJECT_GET_CLASS(VFIOPlatformDeviceClass, (obj), TYPE_VFIO_PLATFORM)
 
+/**
+ * vfio_kick_irqs - reset notifier that starts IRQ injection
+ * for all VFIO dynamic sysbus devices attached to the platform bus.
+ *
+ * @opaque: handle to the interrupt controller DeviceState
+ */
+void vfio_kick_irqs(void *opaque);
+
 #endif /*HW_VFIO_VFIO_PLATFORM_H*/
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v11 4/9] hw/vfio/platform: add capability to start IRQ propagation
@ 2015-03-19 10:05   ` Eric Auger
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: patches, pbonzini, kvmarm

Add a reset notify function that enables to start the propagation of
interrupts to the guest.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v10 -> v11:
- comment reword

v8 -> v9:
- handle failure in vfio_irq_starter
---
 hw/vfio/platform.c              | 64 +++++++++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-platform.h |  8 ++++++
 2 files changed, 72 insertions(+)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 933940c..59eeca2 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -572,6 +572,70 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
     }
 }
 
+/**
+ * vfio_irq_starter: Start IRQ forwarding for a specific VFIO device
+ * @sbdev: Sysbus device handle
+ * @opaque: DeviceState handle of the interrupt controller the device
+ *          is attached to
+ *
+ * The function retrieves the absolute GSI number each VFIO IRQ
+ * corresponds to and start forwarding.
+ */
+static int vfio_irq_starter(SysBusDevice *sbdev, void *opaque)
+{
+    DeviceState *intc = (DeviceState *)opaque;
+    VFIOPlatformDevice *vdev;
+    VFIOINTp *intp;
+    qemu_irq irq;
+    int gsi, ret;
+
+    if (object_dynamic_cast(OBJECT(sbdev), TYPE_VFIO_PLATFORM)) {
+        vdev = VFIO_PLATFORM_DEVICE(sbdev);
+
+        QLIST_FOREACH(intp, &vdev->intp_list, next) {
+            gsi = 0;
+            while (1) {
+                irq = qdev_get_gpio_in(intc, gsi);
+                if (irq == intp->qemuirq) {
+                    break;
+                }
+                gsi++;
+            }
+            intp->virtualID = gsi;
+            ret = vdev->start_irq_fn(intp);
+            if (ret) {
+                error_report("%s unable to start propagation of IRQ index %d",
+                             vdev->vbasedev.name, intp->pin);
+                exit(1);
+            }
+        }
+    }
+    return 0;
+}
+
+/**
+ * vfio_kick_irqs: start IRQ forwarding for all the VFIO devices
+ * attached to the platform bus
+ * @data: Device state handle of the interrupt controller the VFIO IRQs
+ *        must be bound to
+ *
+ * Binding to the platform bus IRQs happens on a machine init done
+ * notifier registered by the platform bus. Only at that time the
+ * absolute virtual IRQ = GSI is known and allows to setup IRQFD.
+ * This function typically can be called in a reset notifier registered
+ * by the machine file.
+ */
+void vfio_kick_irqs(void *data)
+{
+    DeviceState *intc = (DeviceState *)data;
+    DeviceState *dev =
+        qdev_find_recursive(sysbus_get_default(), TYPE_PLATFORM_BUS_DEVICE);
+    PlatformBusDevice *pbus = PLATFORM_BUS_DEVICE(dev);
+
+    assert(pbus->done_gathering);
+    foreach_dynamic_sysbus_device(vfio_irq_starter, intc);
+}
+
 static const VMStateDescription vfio_platform_vmstate = {
     .name = TYPE_VFIO_PLATFORM,
     .unmigratable = 1,
diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
index 31893a3..c9ee898 100644
--- a/include/hw/vfio/vfio-platform.h
+++ b/include/hw/vfio/vfio-platform.h
@@ -77,4 +77,12 @@ typedef struct VFIOPlatformDeviceClass {
 #define VFIO_PLATFORM_DEVICE_GET_CLASS(obj) \
      OBJECT_GET_CLASS(VFIOPlatformDeviceClass, (obj), TYPE_VFIO_PLATFORM)
 
+/**
+ * vfio_kick_irqs - reset notifier that starts IRQ injection
+ * for all VFIO dynamic sysbus devices attached to the platform bus.
+ *
+ * @opaque: handle to the interrupt controller DeviceState
+ */
+void vfio_kick_irqs(void *opaque);
+
 #endif /*HW_VFIO_VFIO_PLATFORM_H*/
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v11 5/9] hw/arm/virt: start VFIO IRQ propagation
  2015-03-19 10:05 ` Eric Auger
@ 2015-03-19 10:05   ` Eric Auger
  -1 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: kim.phillips, patches, a.rigo, pbonzini, kvmarm, christoffer.dall

Although the dynamic instantiation of VFIO QEMU devices already is
possible, VFIO IRQ signaling is not yet started. This patch enables
IRQ forwarding by registering a reset notifier that kick off VFIO
signaling for all VFIO devices.

Such mechanism is requested because the VFIO IRQ binding
is handled in a machine init done notifier and only at that time the
aboslute GSI number the physical IRQ is forwarded to is known.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

v10 - v11:
- becomes a separate patch
---
 hw/arm/virt.c | 34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 439739d..820b09d 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -45,6 +45,7 @@
 #include "hw/pci-host/gpex.h"
 #include "hw/arm/sysbus-fdt.h"
 #include "hw/platform-bus.h"
+#include "hw/vfio/vfio-platform.h"
 
 #define NUM_VIRTIO_TRANSPORTS 32
 
@@ -354,10 +355,10 @@ static uint32_t fdt_add_gic_node(const VirtBoardInfo *vbi)
     return gic_phandle;
 }
 
-static uint32_t create_gic(const VirtBoardInfo *vbi, qemu_irq *pic)
+static uint32_t create_gic(const VirtBoardInfo *vbi, qemu_irq *pic,
+                           DeviceState **gicdev)
 {
     /* We create a standalone GIC v2 */
-    DeviceState *gicdev;
     SysBusDevice *gicbusdev;
     const char *gictype = "arm_gic";
     int i;
@@ -366,15 +367,15 @@ static uint32_t create_gic(const VirtBoardInfo *vbi, qemu_irq *pic)
         gictype = "kvm-arm-gic";
     }
 
-    gicdev = qdev_create(NULL, gictype);
-    qdev_prop_set_uint32(gicdev, "revision", 2);
-    qdev_prop_set_uint32(gicdev, "num-cpu", smp_cpus);
+    *gicdev = qdev_create(NULL, gictype);
+    qdev_prop_set_uint32(*gicdev, "revision", 2);
+    qdev_prop_set_uint32(*gicdev, "num-cpu", smp_cpus);
     /* Note that the num-irq property counts both internal and external
      * interrupts; there are always 32 of the former (mandated by GIC spec).
      */
-    qdev_prop_set_uint32(gicdev, "num-irq", NUM_IRQS + 32);
-    qdev_init_nofail(gicdev);
-    gicbusdev = SYS_BUS_DEVICE(gicdev);
+    qdev_prop_set_uint32(*gicdev, "num-irq", NUM_IRQS + 32);
+    qdev_init_nofail(*gicdev);
+    gicbusdev = SYS_BUS_DEVICE(*gicdev);
     sysbus_mmio_map(gicbusdev, 0, vbi->memmap[VIRT_GIC_DIST].base);
     sysbus_mmio_map(gicbusdev, 1, vbi->memmap[VIRT_GIC_CPU].base);
 
@@ -389,16 +390,16 @@ static uint32_t create_gic(const VirtBoardInfo *vbi, qemu_irq *pic)
          * since a real A15 always has TrustZone but QEMU doesn't.
          */
         qdev_connect_gpio_out(cpudev, 0,
-                              qdev_get_gpio_in(gicdev, ppibase + 30));
+                              qdev_get_gpio_in(*gicdev, ppibase + 30));
         /* virtual timer */
         qdev_connect_gpio_out(cpudev, 1,
-                              qdev_get_gpio_in(gicdev, ppibase + 27));
+                              qdev_get_gpio_in(*gicdev, ppibase + 27));
 
         sysbus_connect_irq(gicbusdev, i, qdev_get_gpio_in(cpudev, ARM_CPU_IRQ));
     }
 
     for (i = 0; i < NUM_IRQS; i++) {
-        pic[i] = qdev_get_gpio_in(gicdev, i);
+        pic[i] = qdev_get_gpio_in(*gicdev, i);
     }
 
     return fdt_add_gic_node(vbi);
@@ -719,7 +720,8 @@ static void create_pcie(const VirtBoardInfo *vbi, qemu_irq *pic,
     g_free(nodename);
 }
 
-static void create_platform_bus(VirtBoardInfo *vbi, qemu_irq *pic)
+static void create_platform_bus(VirtBoardInfo *vbi, qemu_irq *pic,
+                                DeviceState *gic)
 {
     DeviceState *dev;
     SysBusDevice *s;
@@ -758,6 +760,9 @@ static void create_platform_bus(VirtBoardInfo *vbi, qemu_irq *pic)
     memory_region_add_subregion(sysmem,
                                 platform_bus_params.platform_bus_base,
                                 sysbus_mmio_get_region(s, 0));
+
+    /* setup VFIO signaling/IRQFD for all VFIO platform sysbus devices */
+    qemu_register_reset(vfio_kick_irqs, gic);
 }
 
 static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
@@ -779,6 +784,7 @@ static void machvirt_init(MachineState *machine)
     VirtBoardInfo *vbi;
     uint32_t gic_phandle;
     char **cpustr;
+    DeviceState *gicdev;
 
     if (!cpu_model) {
         cpu_model = "cortex-a15";
@@ -855,7 +861,7 @@ static void machvirt_init(MachineState *machine)
 
     create_flash(vbi);
 
-    gic_phandle = create_gic(vbi, pic);
+    gic_phandle = create_gic(vbi, pic, &gicdev);
 
     create_uart(vbi, pic);
 
@@ -888,7 +894,7 @@ static void machvirt_init(MachineState *machine)
      * another notifier is registered which adds platform bus nodes.
      * Notifiers are executed in registration reverse order.
      */
-    create_platform_bus(vbi, pic);
+    create_platform_bus(vbi, pic, gicdev);
 }
 
 static bool virt_get_secure(Object *obj, Error **errp)
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v11 5/9] hw/arm/virt: start VFIO IRQ propagation
@ 2015-03-19 10:05   ` Eric Auger
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: patches, pbonzini, kvmarm

Although the dynamic instantiation of VFIO QEMU devices already is
possible, VFIO IRQ signaling is not yet started. This patch enables
IRQ forwarding by registering a reset notifier that kick off VFIO
signaling for all VFIO devices.

Such mechanism is requested because the VFIO IRQ binding
is handled in a machine init done notifier and only at that time the
aboslute GSI number the physical IRQ is forwarded to is known.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

v10 - v11:
- becomes a separate patch
---
 hw/arm/virt.c | 34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 439739d..820b09d 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -45,6 +45,7 @@
 #include "hw/pci-host/gpex.h"
 #include "hw/arm/sysbus-fdt.h"
 #include "hw/platform-bus.h"
+#include "hw/vfio/vfio-platform.h"
 
 #define NUM_VIRTIO_TRANSPORTS 32
 
@@ -354,10 +355,10 @@ static uint32_t fdt_add_gic_node(const VirtBoardInfo *vbi)
     return gic_phandle;
 }
 
-static uint32_t create_gic(const VirtBoardInfo *vbi, qemu_irq *pic)
+static uint32_t create_gic(const VirtBoardInfo *vbi, qemu_irq *pic,
+                           DeviceState **gicdev)
 {
     /* We create a standalone GIC v2 */
-    DeviceState *gicdev;
     SysBusDevice *gicbusdev;
     const char *gictype = "arm_gic";
     int i;
@@ -366,15 +367,15 @@ static uint32_t create_gic(const VirtBoardInfo *vbi, qemu_irq *pic)
         gictype = "kvm-arm-gic";
     }
 
-    gicdev = qdev_create(NULL, gictype);
-    qdev_prop_set_uint32(gicdev, "revision", 2);
-    qdev_prop_set_uint32(gicdev, "num-cpu", smp_cpus);
+    *gicdev = qdev_create(NULL, gictype);
+    qdev_prop_set_uint32(*gicdev, "revision", 2);
+    qdev_prop_set_uint32(*gicdev, "num-cpu", smp_cpus);
     /* Note that the num-irq property counts both internal and external
      * interrupts; there are always 32 of the former (mandated by GIC spec).
      */
-    qdev_prop_set_uint32(gicdev, "num-irq", NUM_IRQS + 32);
-    qdev_init_nofail(gicdev);
-    gicbusdev = SYS_BUS_DEVICE(gicdev);
+    qdev_prop_set_uint32(*gicdev, "num-irq", NUM_IRQS + 32);
+    qdev_init_nofail(*gicdev);
+    gicbusdev = SYS_BUS_DEVICE(*gicdev);
     sysbus_mmio_map(gicbusdev, 0, vbi->memmap[VIRT_GIC_DIST].base);
     sysbus_mmio_map(gicbusdev, 1, vbi->memmap[VIRT_GIC_CPU].base);
 
@@ -389,16 +390,16 @@ static uint32_t create_gic(const VirtBoardInfo *vbi, qemu_irq *pic)
          * since a real A15 always has TrustZone but QEMU doesn't.
          */
         qdev_connect_gpio_out(cpudev, 0,
-                              qdev_get_gpio_in(gicdev, ppibase + 30));
+                              qdev_get_gpio_in(*gicdev, ppibase + 30));
         /* virtual timer */
         qdev_connect_gpio_out(cpudev, 1,
-                              qdev_get_gpio_in(gicdev, ppibase + 27));
+                              qdev_get_gpio_in(*gicdev, ppibase + 27));
 
         sysbus_connect_irq(gicbusdev, i, qdev_get_gpio_in(cpudev, ARM_CPU_IRQ));
     }
 
     for (i = 0; i < NUM_IRQS; i++) {
-        pic[i] = qdev_get_gpio_in(gicdev, i);
+        pic[i] = qdev_get_gpio_in(*gicdev, i);
     }
 
     return fdt_add_gic_node(vbi);
@@ -719,7 +720,8 @@ static void create_pcie(const VirtBoardInfo *vbi, qemu_irq *pic,
     g_free(nodename);
 }
 
-static void create_platform_bus(VirtBoardInfo *vbi, qemu_irq *pic)
+static void create_platform_bus(VirtBoardInfo *vbi, qemu_irq *pic,
+                                DeviceState *gic)
 {
     DeviceState *dev;
     SysBusDevice *s;
@@ -758,6 +760,9 @@ static void create_platform_bus(VirtBoardInfo *vbi, qemu_irq *pic)
     memory_region_add_subregion(sysmem,
                                 platform_bus_params.platform_bus_base,
                                 sysbus_mmio_get_region(s, 0));
+
+    /* setup VFIO signaling/IRQFD for all VFIO platform sysbus devices */
+    qemu_register_reset(vfio_kick_irqs, gic);
 }
 
 static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
@@ -779,6 +784,7 @@ static void machvirt_init(MachineState *machine)
     VirtBoardInfo *vbi;
     uint32_t gic_phandle;
     char **cpustr;
+    DeviceState *gicdev;
 
     if (!cpu_model) {
         cpu_model = "cortex-a15";
@@ -855,7 +861,7 @@ static void machvirt_init(MachineState *machine)
 
     create_flash(vbi);
 
-    gic_phandle = create_gic(vbi, pic);
+    gic_phandle = create_gic(vbi, pic, &gicdev);
 
     create_uart(vbi, pic);
 
@@ -888,7 +894,7 @@ static void machvirt_init(MachineState *machine)
      * another notifier is registered which adds platform bus nodes.
      * Notifiers are executed in registration reverse order.
      */
-    create_platform_bus(vbi, pic);
+    create_platform_bus(vbi, pic, gicdev);
 }
 
 static bool virt_get_secure(Object *obj, Error **errp)
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v11 6/9] hw/vfio/platform: calxeda xgmac device
  2015-03-19 10:05 ` Eric Auger
@ 2015-03-19 10:05   ` Eric Auger
  -1 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: kim.phillips, patches, a.rigo, pbonzini, kvmarm, christoffer.dall

The platform device class has become abstract. This patch introduces
a calxeda xgmac device that derives from it.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Alex Bennee<alex.bennee@linaro.org>

---
v10 -> v11:
- add Alex Reviewed-by
- move virt modifications in a separate patch

v8 -> v9:
- renamed calxeda_xgmac.c into calxeda-xgmac.c

v7 -> v8:
- add a comment in the header about the MMIO regions and IRQ which
  are exposed by the device

v5 -> v6
- back again following Alex Graf advises
- fix a bug related to compat override

v4 -> v5:
removed since device tree was moved to hw/arm/dyn_sysbus_devtree.c

v4: creation for device tree specialization

Conflicts:
	hw/arm/virt.c
---
 hw/vfio/Makefile.objs                |  1 +
 hw/vfio/calxeda-xgmac.c              | 54 ++++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-calxeda-xgmac.h | 46 ++++++++++++++++++++++++++++++
 3 files changed, 101 insertions(+)
 create mode 100644 hw/vfio/calxeda-xgmac.c
 create mode 100644 include/hw/vfio/vfio-calxeda-xgmac.h

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index c5c76fe..d540c9d 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -2,4 +2,5 @@ ifeq ($(CONFIG_LINUX), y)
 obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o
 obj-$(CONFIG_SOFTMMU) += platform.o
+obj-$(CONFIG_SOFTMMU) += calxeda-xgmac.o
 endif
diff --git a/hw/vfio/calxeda-xgmac.c b/hw/vfio/calxeda-xgmac.c
new file mode 100644
index 0000000..c4b8fef
--- /dev/null
+++ b/hw/vfio/calxeda-xgmac.c
@@ -0,0 +1,54 @@
+/*
+ * calxeda xgmac VFIO device
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Eric Auger <eric.auger@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "hw/vfio/vfio-calxeda-xgmac.h"
+
+static void calxeda_xgmac_realize(DeviceState *dev, Error **errp)
+{
+    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
+    VFIOCalxedaXgmacDeviceClass *k = VFIO_CALXEDA_XGMAC_DEVICE_GET_CLASS(dev);
+
+    vdev->compat = g_strdup("calxeda,hb-xgmac");
+
+    k->parent_realize(dev, errp);
+}
+
+static const VMStateDescription vfio_platform_vmstate = {
+    .name = TYPE_VFIO_CALXEDA_XGMAC,
+    .unmigratable = 1,
+};
+
+static void vfio_calxeda_xgmac_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    VFIOCalxedaXgmacDeviceClass *vcxc =
+        VFIO_CALXEDA_XGMAC_DEVICE_CLASS(klass);
+    vcxc->parent_realize = dc->realize;
+    dc->realize = calxeda_xgmac_realize;
+    dc->desc = "VFIO Calxeda XGMAC";
+}
+
+static const TypeInfo vfio_calxeda_xgmac_dev_info = {
+    .name = TYPE_VFIO_CALXEDA_XGMAC,
+    .parent = TYPE_VFIO_PLATFORM,
+    .instance_size = sizeof(VFIOCalxedaXgmacDevice),
+    .class_init = vfio_calxeda_xgmac_class_init,
+    .class_size = sizeof(VFIOCalxedaXgmacDeviceClass),
+};
+
+static void register_calxeda_xgmac_dev_type(void)
+{
+    type_register_static(&vfio_calxeda_xgmac_dev_info);
+}
+
+type_init(register_calxeda_xgmac_dev_type)
diff --git a/include/hw/vfio/vfio-calxeda-xgmac.h b/include/hw/vfio/vfio-calxeda-xgmac.h
new file mode 100644
index 0000000..f994775
--- /dev/null
+++ b/include/hw/vfio/vfio-calxeda-xgmac.h
@@ -0,0 +1,46 @@
+/*
+ * VFIO calxeda xgmac device
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Eric Auger <eric.auger@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef HW_VFIO_VFIO_CALXEDA_XGMAC_H
+#define HW_VFIO_VFIO_CALXEDA_XGMAC_H
+
+#include "hw/vfio/vfio-platform.h"
+
+#define TYPE_VFIO_CALXEDA_XGMAC "vfio-calxeda-xgmac"
+
+/**
+ * This device exposes:
+ * - a single MMIO region corresponding to its register space
+ * - 3 IRQS (main and 2 power related IRQs)
+ */
+typedef struct VFIOCalxedaXgmacDevice {
+    VFIOPlatformDevice vdev;
+} VFIOCalxedaXgmacDevice;
+
+typedef struct VFIOCalxedaXgmacDeviceClass {
+    /*< private >*/
+    VFIOPlatformDeviceClass parent_class;
+    /*< public >*/
+    DeviceRealize parent_realize;
+} VFIOCalxedaXgmacDeviceClass;
+
+#define VFIO_CALXEDA_XGMAC_DEVICE(obj) \
+     OBJECT_CHECK(VFIOCalxedaXgmacDevice, (obj), TYPE_VFIO_CALXEDA_XGMAC)
+#define VFIO_CALXEDA_XGMAC_DEVICE_CLASS(klass) \
+     OBJECT_CLASS_CHECK(VFIOCalxedaXgmacDeviceClass, (klass), \
+                        TYPE_VFIO_CALXEDA_XGMAC)
+#define VFIO_CALXEDA_XGMAC_DEVICE_GET_CLASS(obj) \
+     OBJECT_GET_CLASS(VFIOCalxedaXgmacDeviceClass, (obj), \
+                      TYPE_VFIO_CALXEDA_XGMAC)
+
+#endif
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v11 6/9] hw/vfio/platform: calxeda xgmac device
@ 2015-03-19 10:05   ` Eric Auger
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: patches, pbonzini, kvmarm

The platform device class has become abstract. This patch introduces
a calxeda xgmac device that derives from it.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Alex Bennee<alex.bennee@linaro.org>

---
v10 -> v11:
- add Alex Reviewed-by
- move virt modifications in a separate patch

v8 -> v9:
- renamed calxeda_xgmac.c into calxeda-xgmac.c

v7 -> v8:
- add a comment in the header about the MMIO regions and IRQ which
  are exposed by the device

v5 -> v6
- back again following Alex Graf advises
- fix a bug related to compat override

v4 -> v5:
removed since device tree was moved to hw/arm/dyn_sysbus_devtree.c

v4: creation for device tree specialization

Conflicts:
	hw/arm/virt.c
---
 hw/vfio/Makefile.objs                |  1 +
 hw/vfio/calxeda-xgmac.c              | 54 ++++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-calxeda-xgmac.h | 46 ++++++++++++++++++++++++++++++
 3 files changed, 101 insertions(+)
 create mode 100644 hw/vfio/calxeda-xgmac.c
 create mode 100644 include/hw/vfio/vfio-calxeda-xgmac.h

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index c5c76fe..d540c9d 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -2,4 +2,5 @@ ifeq ($(CONFIG_LINUX), y)
 obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o
 obj-$(CONFIG_SOFTMMU) += platform.o
+obj-$(CONFIG_SOFTMMU) += calxeda-xgmac.o
 endif
diff --git a/hw/vfio/calxeda-xgmac.c b/hw/vfio/calxeda-xgmac.c
new file mode 100644
index 0000000..c4b8fef
--- /dev/null
+++ b/hw/vfio/calxeda-xgmac.c
@@ -0,0 +1,54 @@
+/*
+ * calxeda xgmac VFIO device
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Eric Auger <eric.auger@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "hw/vfio/vfio-calxeda-xgmac.h"
+
+static void calxeda_xgmac_realize(DeviceState *dev, Error **errp)
+{
+    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
+    VFIOCalxedaXgmacDeviceClass *k = VFIO_CALXEDA_XGMAC_DEVICE_GET_CLASS(dev);
+
+    vdev->compat = g_strdup("calxeda,hb-xgmac");
+
+    k->parent_realize(dev, errp);
+}
+
+static const VMStateDescription vfio_platform_vmstate = {
+    .name = TYPE_VFIO_CALXEDA_XGMAC,
+    .unmigratable = 1,
+};
+
+static void vfio_calxeda_xgmac_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    VFIOCalxedaXgmacDeviceClass *vcxc =
+        VFIO_CALXEDA_XGMAC_DEVICE_CLASS(klass);
+    vcxc->parent_realize = dc->realize;
+    dc->realize = calxeda_xgmac_realize;
+    dc->desc = "VFIO Calxeda XGMAC";
+}
+
+static const TypeInfo vfio_calxeda_xgmac_dev_info = {
+    .name = TYPE_VFIO_CALXEDA_XGMAC,
+    .parent = TYPE_VFIO_PLATFORM,
+    .instance_size = sizeof(VFIOCalxedaXgmacDevice),
+    .class_init = vfio_calxeda_xgmac_class_init,
+    .class_size = sizeof(VFIOCalxedaXgmacDeviceClass),
+};
+
+static void register_calxeda_xgmac_dev_type(void)
+{
+    type_register_static(&vfio_calxeda_xgmac_dev_info);
+}
+
+type_init(register_calxeda_xgmac_dev_type)
diff --git a/include/hw/vfio/vfio-calxeda-xgmac.h b/include/hw/vfio/vfio-calxeda-xgmac.h
new file mode 100644
index 0000000..f994775
--- /dev/null
+++ b/include/hw/vfio/vfio-calxeda-xgmac.h
@@ -0,0 +1,46 @@
+/*
+ * VFIO calxeda xgmac device
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Eric Auger <eric.auger@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef HW_VFIO_VFIO_CALXEDA_XGMAC_H
+#define HW_VFIO_VFIO_CALXEDA_XGMAC_H
+
+#include "hw/vfio/vfio-platform.h"
+
+#define TYPE_VFIO_CALXEDA_XGMAC "vfio-calxeda-xgmac"
+
+/**
+ * This device exposes:
+ * - a single MMIO region corresponding to its register space
+ * - 3 IRQS (main and 2 power related IRQs)
+ */
+typedef struct VFIOCalxedaXgmacDevice {
+    VFIOPlatformDevice vdev;
+} VFIOCalxedaXgmacDevice;
+
+typedef struct VFIOCalxedaXgmacDeviceClass {
+    /*< private >*/
+    VFIOPlatformDeviceClass parent_class;
+    /*< public >*/
+    DeviceRealize parent_realize;
+} VFIOCalxedaXgmacDeviceClass;
+
+#define VFIO_CALXEDA_XGMAC_DEVICE(obj) \
+     OBJECT_CHECK(VFIOCalxedaXgmacDevice, (obj), TYPE_VFIO_CALXEDA_XGMAC)
+#define VFIO_CALXEDA_XGMAC_DEVICE_CLASS(klass) \
+     OBJECT_CLASS_CHECK(VFIOCalxedaXgmacDeviceClass, (klass), \
+                        TYPE_VFIO_CALXEDA_XGMAC)
+#define VFIO_CALXEDA_XGMAC_DEVICE_GET_CLASS(obj) \
+     OBJECT_GET_CLASS(VFIOCalxedaXgmacDeviceClass, (obj), \
+                      TYPE_VFIO_CALXEDA_XGMAC)
+
+#endif
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v11 7/9] hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation
  2015-03-19 10:05 ` Eric Auger
@ 2015-03-19 10:05   ` Eric Auger
  -1 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: kim.phillips, patches, a.rigo, pbonzini, kvmarm, christoffer.dall

This patch allows the instantiation of the vfio-calxeda-xgmac device
from the QEMU command line (-device vfio-calxeda-xgmac,host="<device>").

A specialized device tree node is created for the guest, containing
compat, dma-coherent, reg and interrupts properties.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v10 -> v11:
- add dma-coherent property to calxeda midway xgmac node (fix)
- use qemu_fdt_setprop to add reg property instead of
  qemu_fdt_setprop_sized_cells_from_array
- commit message rewording

v8 -> v9:
- properly free resources in case of errors in
  add_calxeda_midway_xgmac_fdt_node

v7 -> v8:
- move the add_fdt_node_functions array declaration between the device
  specific code and the generic code to avoid forward declarations of
  decice specific functions
- rename add_basic_vfio_fdt_node into
  add_calxeda_midway_xgmac_fdt_node

v6 -> v7:
- compat string re-formatting removed since compat string is not exposed
  anymore as a user option
- VFIO IRQ kick-off removed from sysbus-fdt and moved to VFIO platform
  device
---
 hw/arm/sysbus-fdt.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)

diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index 3038b94..6e465b2 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -26,6 +26,8 @@
 #include "sysemu/device_tree.h"
 #include "hw/platform-bus.h"
 #include "sysemu/sysemu.h"
+#include "hw/vfio/vfio-platform.h"
+#include "hw/vfio/vfio-calxeda-xgmac.h"
 
 /*
  * internal struct that contains the information to create dynamic
@@ -53,11 +55,81 @@ typedef struct NodeCreationPair {
     int (*add_fdt_node_fn)(SysBusDevice *sbdev, void *opaque);
 } NodeCreationPair;
 
+/* Device Specific Code */
+
+/**
+ * add_calxeda_midway_xgmac_fdt_node
+ *
+ * Generates a simple node with following properties:
+ * compatible string, regs, interrupts, dma-coherent
+ */
+static int add_calxeda_midway_xgmac_fdt_node(SysBusDevice *sbdev, void *opaque)
+{
+    PlatformBusFDTData *data = opaque;
+    PlatformBusDevice *pbus = data->pbus;
+    void *fdt = data->fdt;
+    const char *parent_node = data->pbus_node_name;
+    int compat_str_len, i, ret = -1;
+    char *nodename;
+    uint32_t *irq_attr, *reg_attr;
+    uint64_t mmio_base, irq_number;
+    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
+    VFIODevice *vbasedev = &vdev->vbasedev;
+    Object *obj = OBJECT(sbdev);
+
+    mmio_base = object_property_get_int(obj, "mmio[0]", NULL);
+    nodename = g_strdup_printf("%s/%s@%" PRIx64, parent_node,
+                               vbasedev->name, mmio_base);
+    qemu_fdt_add_subnode(fdt, nodename);
+
+    compat_str_len = strlen(vdev->compat) + 1;
+    qemu_fdt_setprop(fdt, nodename, "compatible",
+                          vdev->compat, compat_str_len);
+
+    qemu_fdt_setprop(fdt, nodename, "dma-coherent", "", 0);
+
+    reg_attr = g_new(uint32_t, vbasedev->num_regions*2);
+    for (i = 0; i < vbasedev->num_regions; i++) {
+        mmio_base = platform_bus_get_mmio_addr(pbus, sbdev, i);
+        reg_attr[2*i] = (uint32_t)mmio_base;
+        reg_attr[2*i+1] = (uint32_t)memory_region_size(&vdev->regions[i]->mem);
+    }
+    ret = qemu_fdt_setprop(fdt, nodename, "reg", reg_attr,
+                           vbasedev->num_regions*2*sizeof(uint32_t));
+    if (ret) {
+        error_report("could not set reg property of node %s", nodename);
+        goto fail_reg;
+    }
+
+    irq_attr = g_new(uint32_t, vbasedev->num_irqs*3);
+    for (i = 0; i < vbasedev->num_irqs; i++) {
+        irq_number = platform_bus_get_irqn(pbus, sbdev , i)
+                         + data->irq_start;
+        irq_attr[3*i] = cpu_to_be32(0);
+        irq_attr[3*i+1] = cpu_to_be32(irq_number);
+        irq_attr[3*i+2] = cpu_to_be32(0x4);
+    }
+   ret = qemu_fdt_setprop(fdt, nodename, "interrupts",
+                     irq_attr, vbasedev->num_irqs*3*sizeof(uint32_t));
+    if (ret) {
+        error_report("could not set interrupts property of node %s",
+                     nodename);
+    }
+    g_free(irq_attr);
+fail_reg:
+    g_free(reg_attr);
+    g_free(nodename);
+    return ret;
+}
+
 /* list of supported dynamic sysbus devices */
 static const NodeCreationPair add_fdt_node_functions[] = {
+    {TYPE_VFIO_CALXEDA_XGMAC, add_calxeda_midway_xgmac_fdt_node},
     {"", NULL}, /* last element */
 };
 
+/* Generic Code */
+
 /**
  * add_fdt_node - add the device tree node of a dynamic sysbus device
  *
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v11 7/9] hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation
@ 2015-03-19 10:05   ` Eric Auger
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: patches, pbonzini, kvmarm

This patch allows the instantiation of the vfio-calxeda-xgmac device
from the QEMU command line (-device vfio-calxeda-xgmac,host="<device>").

A specialized device tree node is created for the guest, containing
compat, dma-coherent, reg and interrupts properties.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v10 -> v11:
- add dma-coherent property to calxeda midway xgmac node (fix)
- use qemu_fdt_setprop to add reg property instead of
  qemu_fdt_setprop_sized_cells_from_array
- commit message rewording

v8 -> v9:
- properly free resources in case of errors in
  add_calxeda_midway_xgmac_fdt_node

v7 -> v8:
- move the add_fdt_node_functions array declaration between the device
  specific code and the generic code to avoid forward declarations of
  decice specific functions
- rename add_basic_vfio_fdt_node into
  add_calxeda_midway_xgmac_fdt_node

v6 -> v7:
- compat string re-formatting removed since compat string is not exposed
  anymore as a user option
- VFIO IRQ kick-off removed from sysbus-fdt and moved to VFIO platform
  device
---
 hw/arm/sysbus-fdt.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)

diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index 3038b94..6e465b2 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -26,6 +26,8 @@
 #include "sysemu/device_tree.h"
 #include "hw/platform-bus.h"
 #include "sysemu/sysemu.h"
+#include "hw/vfio/vfio-platform.h"
+#include "hw/vfio/vfio-calxeda-xgmac.h"
 
 /*
  * internal struct that contains the information to create dynamic
@@ -53,11 +55,81 @@ typedef struct NodeCreationPair {
     int (*add_fdt_node_fn)(SysBusDevice *sbdev, void *opaque);
 } NodeCreationPair;
 
+/* Device Specific Code */
+
+/**
+ * add_calxeda_midway_xgmac_fdt_node
+ *
+ * Generates a simple node with following properties:
+ * compatible string, regs, interrupts, dma-coherent
+ */
+static int add_calxeda_midway_xgmac_fdt_node(SysBusDevice *sbdev, void *opaque)
+{
+    PlatformBusFDTData *data = opaque;
+    PlatformBusDevice *pbus = data->pbus;
+    void *fdt = data->fdt;
+    const char *parent_node = data->pbus_node_name;
+    int compat_str_len, i, ret = -1;
+    char *nodename;
+    uint32_t *irq_attr, *reg_attr;
+    uint64_t mmio_base, irq_number;
+    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
+    VFIODevice *vbasedev = &vdev->vbasedev;
+    Object *obj = OBJECT(sbdev);
+
+    mmio_base = object_property_get_int(obj, "mmio[0]", NULL);
+    nodename = g_strdup_printf("%s/%s@%" PRIx64, parent_node,
+                               vbasedev->name, mmio_base);
+    qemu_fdt_add_subnode(fdt, nodename);
+
+    compat_str_len = strlen(vdev->compat) + 1;
+    qemu_fdt_setprop(fdt, nodename, "compatible",
+                          vdev->compat, compat_str_len);
+
+    qemu_fdt_setprop(fdt, nodename, "dma-coherent", "", 0);
+
+    reg_attr = g_new(uint32_t, vbasedev->num_regions*2);
+    for (i = 0; i < vbasedev->num_regions; i++) {
+        mmio_base = platform_bus_get_mmio_addr(pbus, sbdev, i);
+        reg_attr[2*i] = (uint32_t)mmio_base;
+        reg_attr[2*i+1] = (uint32_t)memory_region_size(&vdev->regions[i]->mem);
+    }
+    ret = qemu_fdt_setprop(fdt, nodename, "reg", reg_attr,
+                           vbasedev->num_regions*2*sizeof(uint32_t));
+    if (ret) {
+        error_report("could not set reg property of node %s", nodename);
+        goto fail_reg;
+    }
+
+    irq_attr = g_new(uint32_t, vbasedev->num_irqs*3);
+    for (i = 0; i < vbasedev->num_irqs; i++) {
+        irq_number = platform_bus_get_irqn(pbus, sbdev , i)
+                         + data->irq_start;
+        irq_attr[3*i] = cpu_to_be32(0);
+        irq_attr[3*i+1] = cpu_to_be32(irq_number);
+        irq_attr[3*i+2] = cpu_to_be32(0x4);
+    }
+   ret = qemu_fdt_setprop(fdt, nodename, "interrupts",
+                     irq_attr, vbasedev->num_irqs*3*sizeof(uint32_t));
+    if (ret) {
+        error_report("could not set interrupts property of node %s",
+                     nodename);
+    }
+    g_free(irq_attr);
+fail_reg:
+    g_free(reg_attr);
+    g_free(nodename);
+    return ret;
+}
+
 /* list of supported dynamic sysbus devices */
 static const NodeCreationPair add_fdt_node_functions[] = {
+    {TYPE_VFIO_CALXEDA_XGMAC, add_calxeda_midway_xgmac_fdt_node},
     {"", NULL}, /* last element */
 };
 
+/* Generic Code */
+
 /**
  * add_fdt_node - add the device tree node of a dynamic sysbus device
  *
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v11 8/9] linux-headers: update arm/arm64 KVM headers for irqfd
  2015-03-19 10:05 ` Eric Auger
@ 2015-03-19 10:05   ` Eric Auger
  -1 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: kim.phillips, patches, a.rigo, pbonzini, kvmarm, christoffer.dall

Update according to arm/arm64 kvm.h headers found in
ssh://git.linaro.org/people/eric.auger/linux.git
branch irqfd_v9_stdalone_official_release

Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
 linux-headers/asm-arm/kvm.h   | 3 +++
 linux-headers/asm-arm64/kvm.h | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/linux-headers/asm-arm/kvm.h b/linux-headers/asm-arm/kvm.h
index 0db25bc..2499867 100644
--- a/linux-headers/asm-arm/kvm.h
+++ b/linux-headers/asm-arm/kvm.h
@@ -198,6 +198,9 @@ struct kvm_arch_memory_slot {
 /* Highest supported SPI, from VGIC_NR_IRQS */
 #define KVM_ARM_IRQ_GIC_MAX		127
 
+/* One single KVM irqchip, ie. the VGIC */
+#define KVM_NR_IRQCHIPS          1
+
 /* PSCI interface */
 #define KVM_PSCI_FN_BASE		0x95c1ba5e
 #define KVM_PSCI_FN(n)			(KVM_PSCI_FN_BASE + (n))
diff --git a/linux-headers/asm-arm64/kvm.h b/linux-headers/asm-arm64/kvm.h
index 3ef77a4..c154c0b 100644
--- a/linux-headers/asm-arm64/kvm.h
+++ b/linux-headers/asm-arm64/kvm.h
@@ -191,6 +191,9 @@ struct kvm_arch_memory_slot {
 /* Highest supported SPI, from VGIC_NR_IRQS */
 #define KVM_ARM_IRQ_GIC_MAX		127
 
+/* One single KVM irqchip, ie. the VGIC */
+#define KVM_NR_IRQCHIPS          1
+
 /* PSCI interface */
 #define KVM_PSCI_FN_BASE		0x95c1ba5e
 #define KVM_PSCI_FN(n)			(KVM_PSCI_FN_BASE + (n))
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v11 8/9] linux-headers: update arm/arm64 KVM headers for irqfd
@ 2015-03-19 10:05   ` Eric Auger
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: patches, pbonzini, kvmarm

Update according to arm/arm64 kvm.h headers found in
ssh://git.linaro.org/people/eric.auger/linux.git
branch irqfd_v9_stdalone_official_release

Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
 linux-headers/asm-arm/kvm.h   | 3 +++
 linux-headers/asm-arm64/kvm.h | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/linux-headers/asm-arm/kvm.h b/linux-headers/asm-arm/kvm.h
index 0db25bc..2499867 100644
--- a/linux-headers/asm-arm/kvm.h
+++ b/linux-headers/asm-arm/kvm.h
@@ -198,6 +198,9 @@ struct kvm_arch_memory_slot {
 /* Highest supported SPI, from VGIC_NR_IRQS */
 #define KVM_ARM_IRQ_GIC_MAX		127
 
+/* One single KVM irqchip, ie. the VGIC */
+#define KVM_NR_IRQCHIPS          1
+
 /* PSCI interface */
 #define KVM_PSCI_FN_BASE		0x95c1ba5e
 #define KVM_PSCI_FN(n)			(KVM_PSCI_FN_BASE + (n))
diff --git a/linux-headers/asm-arm64/kvm.h b/linux-headers/asm-arm64/kvm.h
index 3ef77a4..c154c0b 100644
--- a/linux-headers/asm-arm64/kvm.h
+++ b/linux-headers/asm-arm64/kvm.h
@@ -191,6 +191,9 @@ struct kvm_arch_memory_slot {
 /* Highest supported SPI, from VGIC_NR_IRQS */
 #define KVM_ARM_IRQ_GIC_MAX		127
 
+/* One single KVM irqchip, ie. the VGIC */
+#define KVM_NR_IRQCHIPS          1
+
 /* PSCI interface */
 #define KVM_PSCI_FN_BASE		0x95c1ba5e
 #define KVM_PSCI_FN(n)			(KVM_PSCI_FN_BASE + (n))
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH v11 9/9] hw/vfio/platform: add irqfd support
  2015-03-19 10:05 ` Eric Auger
@ 2015-03-19 10:05   ` Eric Auger
  -1 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: kim.phillips, patches, a.rigo, pbonzini, kvmarm, christoffer.dall

This patch aims at optimizing IRQ handling using irqfd framework.

Instead of handling the eventfds on user-side they are handled on
kernel side using
- the KVM irqfd framework,
- the VFIO driver virqfd framework.

the virtual IRQ completion is trapped at interrupt controller
This removes the need for fast/slow path swap.

Overall this brings significant performance improvements.

it depends on host kernel KVM irqfd.

Signed-off-by: Alvise Rigo <a.rigo@virtualopensystems.com>
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Alex Bennee<alex.bennee@linaro.org>

---
v10 -> v11:
- Add Alex' Reviewed-by
- introduce kvm_accel in this patch and initialize it

v5 -> v6
- rely on kvm_irqfds_enabled() and kvm_resamplefds_enabled()
- guard KVM code with #ifdef CONFIG_KVM

v3 -> v4:
[Alvise Rigo]
Use of VFIO Platform driver v6 unmask/virqfd feature and removal
of resamplefd handler. Physical IRQ unmasking is now done in
VFIO driver.

v3:
[Eric Auger]
initial support with resamplefd handled on QEMU side since the
unmask was not supported on VFIO platform driver v5.

Conflicts:
	hw/vfio/platform.c
---
 hw/vfio/platform.c              | 96 +++++++++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-platform.h |  2 +
 trace-events                    |  2 +
 3 files changed, 100 insertions(+)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 59eeca2..98872ec 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -26,6 +26,7 @@
 #include "hw/sysbus.h"
 #include "trace.h"
 #include "hw/platform-bus.h"
+#include "sysemu/kvm.h"
 
 /*
  * Functions used whatever the injection method
@@ -51,6 +52,7 @@ static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev,
     intp->pin = info.index;
     intp->flags = info.flags;
     intp->state = VFIO_IRQ_INACTIVE;
+    intp->kvm_accel = false;
 
     sysbus_init_irq(sbdev, &intp->qemuirq);
 
@@ -61,6 +63,13 @@ static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev,
         error_report("vfio: Error: trigger event_notifier_init failed ");
         return NULL;
     }
+    /* Get an eventfd for resample/unmask */
+    ret = event_notifier_init(&intp->unmask, 0);
+    if (ret) {
+        g_free(intp);
+        error_report("vfio: Error: resample event_notifier_init failed eoi");
+        return NULL;
+    }
 
     QLIST_INSERT_HEAD(&vdev->intp_list, intp, next);
     return intp;
@@ -315,6 +324,82 @@ static int vfio_start_eventfd_injection(VFIOINTp *intp)
     return ret;
 }
 
+/*
+ * Functions used for irqfd
+ */
+
+#ifdef CONFIG_KVM
+
+/**
+ * vfio_set_resample_eventfd - sets the resamplefd for an IRQ
+ * @intp: the IRQ struct handle
+ * programs the VFIO driver to unmask this IRQ when the
+ * intp->unmask eventfd is triggered
+ */
+static int vfio_set_resample_eventfd(VFIOINTp *intp)
+{
+    VFIODevice *vbasedev = &intp->vdev->vbasedev;
+    struct vfio_irq_set *irq_set;
+    int argsz, ret;
+    int32_t *pfd;
+
+    argsz = sizeof(*irq_set) + sizeof(*pfd);
+    irq_set = g_malloc0(argsz);
+    irq_set->argsz = argsz;
+    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_UNMASK;
+    irq_set->index = intp->pin;
+    irq_set->start = 0;
+    irq_set->count = 1;
+    pfd = (int32_t *)&irq_set->data;
+    *pfd = event_notifier_get_fd(&intp->unmask);
+    qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+    g_free(irq_set);
+    if (ret < 0) {
+        error_report("vfio: Failed to set resample eventfd: %m");
+    }
+    return ret;
+}
+
+/**
+ * vfio_start_irqfd_injection - starts irqfd injection for an IRQ
+ * programs VFIO driver with both the trigger and resamplefd
+ * programs KVM with the gsi, trigger & resample eventfds
+ */
+static int vfio_start_irqfd_injection(VFIOINTp *intp)
+{
+    struct kvm_irqfd irqfd = {
+        .fd = event_notifier_get_fd(&intp->interrupt),
+        .resamplefd = event_notifier_get_fd(&intp->unmask),
+        .gsi = intp->virtualID,
+        .flags = KVM_IRQFD_FLAG_RESAMPLE,
+    };
+
+    if (kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd)) {
+        error_report("vfio: Error: Failed to assign the irqfd: %m");
+        goto fail_irqfd;
+    }
+    if (vfio_set_trigger_eventfd(intp, NULL) < 0) {
+        goto fail_vfio;
+    }
+    if (vfio_set_resample_eventfd(intp) < 0) {
+        goto fail_vfio;
+    }
+
+    intp->kvm_accel = true;
+    trace_vfio_platform_start_irqfd_injection(intp->pin, intp->virtualID,
+                                     irqfd.fd, irqfd.resamplefd);
+    return 0;
+
+fail_vfio:
+    irqfd.flags = KVM_IRQFD_FLAG_DEASSIGN;
+    kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd);
+fail_irqfd:
+    return -1;
+}
+
+#endif /* CONFIG_KVM */
+
 /* VFIO skeleton */
 
 /* not implemented yet */
@@ -555,7 +640,17 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
 
     vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
     vbasedev->ops = &vfio_platform_ops;
+
+#ifdef CONFIG_KVM
+    if (kvm_irqfds_enabled() && kvm_resamplefds_enabled() &&
+        vdev->irqfd_allowed) {
+        vdev->start_irq_fn = vfio_start_irqfd_injection;
+    } else {
+        vdev->start_irq_fn = vfio_start_eventfd_injection;
+    }
+#else
     vdev->start_irq_fn = vfio_start_eventfd_injection;
+#endif
 
     trace_vfio_platform_realize(vbasedev->name, vdev->compat);
 
@@ -645,6 +740,7 @@ static Property vfio_platform_dev_properties[] = {
     DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
     DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice,
                        mmap_timeout, 1100),
+    DEFINE_PROP_BOOL("x-irqfd", VFIOPlatformDevice, irqfd_allowed, true),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
index c9ee898..25d2c46 100644
--- a/include/hw/vfio/vfio-platform.h
+++ b/include/hw/vfio/vfio-platform.h
@@ -42,6 +42,7 @@ typedef struct VFIOINTp {
     uint8_t pin; /* index */
     uint32_t virtualID; /* virtual IRQ */
     uint32_t flags; /* IRQ info flags */
+    bool kvm_accel; /* set when QEMU bypass through KVM enabled */
 } VFIOINTp;
 
 /* function type for routine starting IRQ propagation to the guest */
@@ -62,6 +63,7 @@ typedef struct VFIOPlatformDevice {
     /* function used to start IRQ propagation to the guest */
     start_irq_fn_t start_irq_fn;
     QemuMutex intp_mutex; /* protect the intp_list IRQ state */
+    bool irqfd_allowed; /* debug option to force irqfd on/off */
 } VFIOPlatformDevice;
 
 typedef struct VFIOPlatformDeviceClass {
diff --git a/trace-events b/trace-events
index c3dbfc3..31371b4 100644
--- a/trace-events
+++ b/trace-events
@@ -1571,6 +1571,8 @@ vfio_platform_populate_regions(int region_index, unsigned long flag, unsigned lo
 vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
 vfio_platform_realize(char *name, char *compat) "vfio device %s, compat = %s"
 vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING"
+vfio_platform_start_irqfd_injection(int index, int gsi, int fd, int resamplefd) "IRQ index=%d, gsi =%d, fd = %d, resamplefd = %d"
+vfio_start_eventfd_injection(int index, int fd) "IRQ index=%d, fd = %d"
 
 #hw/acpi/memory_hotplug.c
 mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v11 9/9] hw/vfio/platform: add irqfd support
@ 2015-03-19 10:05   ` Eric Auger
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Auger @ 2015-03-19 10:05 UTC (permalink / raw)
  To: eric.auger, eric.auger, qemu-devel, alex.williamson,
	peter.maydell, agraf, b.reynal
  Cc: patches, pbonzini, kvmarm

This patch aims at optimizing IRQ handling using irqfd framework.

Instead of handling the eventfds on user-side they are handled on
kernel side using
- the KVM irqfd framework,
- the VFIO driver virqfd framework.

the virtual IRQ completion is trapped at interrupt controller
This removes the need for fast/slow path swap.

Overall this brings significant performance improvements.

it depends on host kernel KVM irqfd.

Signed-off-by: Alvise Rigo <a.rigo@virtualopensystems.com>
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Alex Bennee<alex.bennee@linaro.org>

---
v10 -> v11:
- Add Alex' Reviewed-by
- introduce kvm_accel in this patch and initialize it

v5 -> v6
- rely on kvm_irqfds_enabled() and kvm_resamplefds_enabled()
- guard KVM code with #ifdef CONFIG_KVM

v3 -> v4:
[Alvise Rigo]
Use of VFIO Platform driver v6 unmask/virqfd feature and removal
of resamplefd handler. Physical IRQ unmasking is now done in
VFIO driver.

v3:
[Eric Auger]
initial support with resamplefd handled on QEMU side since the
unmask was not supported on VFIO platform driver v5.

Conflicts:
	hw/vfio/platform.c
---
 hw/vfio/platform.c              | 96 +++++++++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-platform.h |  2 +
 trace-events                    |  2 +
 3 files changed, 100 insertions(+)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 59eeca2..98872ec 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -26,6 +26,7 @@
 #include "hw/sysbus.h"
 #include "trace.h"
 #include "hw/platform-bus.h"
+#include "sysemu/kvm.h"
 
 /*
  * Functions used whatever the injection method
@@ -51,6 +52,7 @@ static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev,
     intp->pin = info.index;
     intp->flags = info.flags;
     intp->state = VFIO_IRQ_INACTIVE;
+    intp->kvm_accel = false;
 
     sysbus_init_irq(sbdev, &intp->qemuirq);
 
@@ -61,6 +63,13 @@ static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev,
         error_report("vfio: Error: trigger event_notifier_init failed ");
         return NULL;
     }
+    /* Get an eventfd for resample/unmask */
+    ret = event_notifier_init(&intp->unmask, 0);
+    if (ret) {
+        g_free(intp);
+        error_report("vfio: Error: resample event_notifier_init failed eoi");
+        return NULL;
+    }
 
     QLIST_INSERT_HEAD(&vdev->intp_list, intp, next);
     return intp;
@@ -315,6 +324,82 @@ static int vfio_start_eventfd_injection(VFIOINTp *intp)
     return ret;
 }
 
+/*
+ * Functions used for irqfd
+ */
+
+#ifdef CONFIG_KVM
+
+/**
+ * vfio_set_resample_eventfd - sets the resamplefd for an IRQ
+ * @intp: the IRQ struct handle
+ * programs the VFIO driver to unmask this IRQ when the
+ * intp->unmask eventfd is triggered
+ */
+static int vfio_set_resample_eventfd(VFIOINTp *intp)
+{
+    VFIODevice *vbasedev = &intp->vdev->vbasedev;
+    struct vfio_irq_set *irq_set;
+    int argsz, ret;
+    int32_t *pfd;
+
+    argsz = sizeof(*irq_set) + sizeof(*pfd);
+    irq_set = g_malloc0(argsz);
+    irq_set->argsz = argsz;
+    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_UNMASK;
+    irq_set->index = intp->pin;
+    irq_set->start = 0;
+    irq_set->count = 1;
+    pfd = (int32_t *)&irq_set->data;
+    *pfd = event_notifier_get_fd(&intp->unmask);
+    qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+    g_free(irq_set);
+    if (ret < 0) {
+        error_report("vfio: Failed to set resample eventfd: %m");
+    }
+    return ret;
+}
+
+/**
+ * vfio_start_irqfd_injection - starts irqfd injection for an IRQ
+ * programs VFIO driver with both the trigger and resamplefd
+ * programs KVM with the gsi, trigger & resample eventfds
+ */
+static int vfio_start_irqfd_injection(VFIOINTp *intp)
+{
+    struct kvm_irqfd irqfd = {
+        .fd = event_notifier_get_fd(&intp->interrupt),
+        .resamplefd = event_notifier_get_fd(&intp->unmask),
+        .gsi = intp->virtualID,
+        .flags = KVM_IRQFD_FLAG_RESAMPLE,
+    };
+
+    if (kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd)) {
+        error_report("vfio: Error: Failed to assign the irqfd: %m");
+        goto fail_irqfd;
+    }
+    if (vfio_set_trigger_eventfd(intp, NULL) < 0) {
+        goto fail_vfio;
+    }
+    if (vfio_set_resample_eventfd(intp) < 0) {
+        goto fail_vfio;
+    }
+
+    intp->kvm_accel = true;
+    trace_vfio_platform_start_irqfd_injection(intp->pin, intp->virtualID,
+                                     irqfd.fd, irqfd.resamplefd);
+    return 0;
+
+fail_vfio:
+    irqfd.flags = KVM_IRQFD_FLAG_DEASSIGN;
+    kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd);
+fail_irqfd:
+    return -1;
+}
+
+#endif /* CONFIG_KVM */
+
 /* VFIO skeleton */
 
 /* not implemented yet */
@@ -555,7 +640,17 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
 
     vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
     vbasedev->ops = &vfio_platform_ops;
+
+#ifdef CONFIG_KVM
+    if (kvm_irqfds_enabled() && kvm_resamplefds_enabled() &&
+        vdev->irqfd_allowed) {
+        vdev->start_irq_fn = vfio_start_irqfd_injection;
+    } else {
+        vdev->start_irq_fn = vfio_start_eventfd_injection;
+    }
+#else
     vdev->start_irq_fn = vfio_start_eventfd_injection;
+#endif
 
     trace_vfio_platform_realize(vbasedev->name, vdev->compat);
 
@@ -645,6 +740,7 @@ static Property vfio_platform_dev_properties[] = {
     DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
     DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice,
                        mmap_timeout, 1100),
+    DEFINE_PROP_BOOL("x-irqfd", VFIOPlatformDevice, irqfd_allowed, true),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
index c9ee898..25d2c46 100644
--- a/include/hw/vfio/vfio-platform.h
+++ b/include/hw/vfio/vfio-platform.h
@@ -42,6 +42,7 @@ typedef struct VFIOINTp {
     uint8_t pin; /* index */
     uint32_t virtualID; /* virtual IRQ */
     uint32_t flags; /* IRQ info flags */
+    bool kvm_accel; /* set when QEMU bypass through KVM enabled */
 } VFIOINTp;
 
 /* function type for routine starting IRQ propagation to the guest */
@@ -62,6 +63,7 @@ typedef struct VFIOPlatformDevice {
     /* function used to start IRQ propagation to the guest */
     start_irq_fn_t start_irq_fn;
     QemuMutex intp_mutex; /* protect the intp_list IRQ state */
+    bool irqfd_allowed; /* debug option to force irqfd on/off */
 } VFIOPlatformDevice;
 
 typedef struct VFIOPlatformDeviceClass {
diff --git a/trace-events b/trace-events
index c3dbfc3..31371b4 100644
--- a/trace-events
+++ b/trace-events
@@ -1571,6 +1571,8 @@ vfio_platform_populate_regions(int region_index, unsigned long flag, unsigned lo
 vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
 vfio_platform_realize(char *name, char *compat) "vfio device %s, compat = %s"
 vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING"
+vfio_platform_start_irqfd_injection(int index, int gsi, int fd, int resamplefd) "IRQ index=%d, gsi =%d, fd = %d, resamplefd = %d"
+vfio_start_eventfd_injection(int index, int fd) "IRQ index=%d, fd = %d"
 
 #hw/acpi/memory_hotplug.c
 mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2015-03-19 10:06 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-19 10:05 [Qemu-devel] [PATCH v11 0/9] KVM platform device passthrough Eric Auger
2015-03-19 10:05 ` Eric Auger
2015-03-19 10:05 ` [Qemu-devel] [PATCH v11 1/9] linux-headers: update VFIO header for VFIO platform/amba drivers Eric Auger
2015-03-19 10:05   ` Eric Auger
2015-03-19 10:05 ` [Qemu-devel] [PATCH v11 2/9] hw/vfio/platform: vfio-platform skeleton Eric Auger
2015-03-19 10:05   ` Eric Auger
2015-03-19 10:05 ` [Qemu-devel] [PATCH v11 3/9] hw/vfio/platform: add irq assignment Eric Auger
2015-03-19 10:05   ` Eric Auger
2015-03-19 10:05 ` [Qemu-devel] [PATCH v11 4/9] hw/vfio/platform: add capability to start IRQ propagation Eric Auger
2015-03-19 10:05   ` Eric Auger
2015-03-19 10:05 ` [Qemu-devel] [PATCH v11 5/9] hw/arm/virt: start VFIO " Eric Auger
2015-03-19 10:05   ` Eric Auger
2015-03-19 10:05 ` [Qemu-devel] [PATCH v11 6/9] hw/vfio/platform: calxeda xgmac device Eric Auger
2015-03-19 10:05   ` Eric Auger
2015-03-19 10:05 ` [Qemu-devel] [PATCH v11 7/9] hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation Eric Auger
2015-03-19 10:05   ` Eric Auger
2015-03-19 10:05 ` [Qemu-devel] [PATCH v11 8/9] linux-headers: update arm/arm64 KVM headers for irqfd Eric Auger
2015-03-19 10:05   ` Eric Auger
2015-03-19 10:05 ` [Qemu-devel] [PATCH v11 9/9] hw/vfio/platform: add irqfd support Eric Auger
2015-03-19 10:05   ` Eric Auger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.