* [PATCH v5 00/16] uq/master: Introduce basic irqchip support
@ 2011-12-15 12:33 ` Jan Kiszka
  0 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, qemu-devel, Anthony Liguori, Michael S. Tsirkin, Blue Swirl,
	Lai Jiangshan

Changes in v5:
- properly introduce apic_report_irq_delivered (instead of
  apic_set_irq_delivered silently)
- rework apic to kvm core interface according to Blue's suggestion

CC: Lai Jiangshan <laijs@cn.fujitsu.com>

Jan Kiszka (16):
  msi: Generalize msix_supported to msi_supported
  kvm: Move kvmclock into hw/kvm folder
  apic: Stop timer on reset
  apic: Inject external NMI events via LINT1
  apic: Introduce apic_report_irq_delivered
  apic: Introduce backend/frontend infrastructure for KVM reuse
  apic: Open-code timer save/restore
  i8259: Introduce backend/frontend infrastructure for KVM reuse
  ioapic: Introduce backend/frontend infrastructure for KVM reuse
  memory: Introduce memory_region_init_reservation
  kvm: Introduce core services for in-kernel irqchip support
  kvm: x86: Establish IRQ0 override control
  kvm: x86: Add user space part for in-kernel APIC
  kvm: x86: Add user space part for in-kernel i8259
  kvm: x86: Add user space part for in-kernel IOAPIC
  kvm: Arm in-kernel irqchip support

 Makefile.objs                  |    2 +-
 Makefile.target                |    6 +-
 configure                      |    1 +
 hw/apic.c                      |  309 ++++-----------------------------------
 hw/apic.h                      |    1 +
 hw/apic_common.c               |  312 ++++++++++++++++++++++++++++++++++++++++
 hw/apic_internal.h             |  122 ++++++++++++++++
 hw/i8259.c                     |  127 ++--------------
 hw/i8259_common.c              |  173 ++++++++++++++++++++++
 hw/i8259_internal.h            |   82 +++++++++++
 hw/ioapic.c                    |  130 ++---------------
 hw/ioapic_common.c             |  138 ++++++++++++++++++
 hw/ioapic_internal.h           |  106 ++++++++++++++
 hw/kvm/apic.c                  |  138 ++++++++++++++++++
 hw/{kvmclock.c => kvm/clock.c} |    4 +-
 hw/{kvmclock.h => kvm/clock.h} |    0
 hw/kvm/i8259.c                 |  126 ++++++++++++++++
 hw/kvm/ioapic.c                |  101 +++++++++++++
 hw/msi.c                       |    8 +
 hw/msi.h                       |    2 +
 hw/msix.c                      |    9 +-
 hw/msix.h                      |    2 -
 hw/pc.c                        |   19 ++-
 hw/pc.h                        |    1 +
 hw/pc_piix.c                   |   66 ++++++++-
 kvm-all.c                      |  154 ++++++++++++++++++++
 kvm-stub.c                     |    5 +
 kvm.h                          |   14 ++
 memory.c                       |   36 +++++
 memory.h                       |   16 ++
 monitor.c                      |    6 +-
 qemu-config.c                  |    4 +
 qemu-options.hx                |    5 +-
 sysemu.h                       |    1 -
 target-i386/kvm.c              |   49 +++++++
 trace-events                   |    2 +-
 vl.c                           |    1 -
 37 files changed, 1739 insertions(+), 539 deletions(-)
 create mode 100644 hw/apic_common.c
 create mode 100644 hw/apic_internal.h
 create mode 100644 hw/i8259_common.c
 create mode 100644 hw/i8259_internal.h
 create mode 100644 hw/ioapic_common.c
 create mode 100644 hw/ioapic_internal.h
 create mode 100644 hw/kvm/apic.c
 rename hw/{kvmclock.c => kvm/clock.c} (98%)
 rename hw/{kvmclock.h => kvm/clock.h} (100%)
 create mode 100644 hw/kvm/i8259.c
 create mode 100644 hw/kvm/ioapic.c

-- 
1.7.3.4


^ permalink raw reply	[flat|nested] 99+ messages in thread

* [PATCH v5 01/16] msi: Generalize msix_supported to msi_supported
  2011-12-15 12:33 ` [Qemu-devel] " Jan Kiszka
@ 2011-12-15 12:33   ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, qemu-devel, Anthony Liguori, Michael S. Tsirkin, Blue Swirl

Rename msix_supported to msi_supported and use it to control both MSI
and MSI-X activation. That was likely the original intention for this
flag, but MSI support came after MSI-X.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/msi.c  |    8 ++++++++
 hw/msi.h  |    2 ++
 hw/msix.c |    9 ++++-----
 hw/msix.h |    2 --
 hw/pc.c   |    4 ++--
 5 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/hw/msi.c b/hw/msi.c
index f214fcf..5d6ceb6 100644
--- a/hw/msi.c
+++ b/hw/msi.c
@@ -36,6 +36,9 @@
 
 #define PCI_MSI_VECTORS_MAX     32
 
+/* Flag for interrupt controller to declare MSI/MSI-X support */
+bool msi_supported;
+
 /* If we get rid of cap allocator, we won't need this. */
 static inline uint8_t msi_cap_sizeof(uint16_t flags)
 {
@@ -116,6 +119,11 @@ int msi_init(struct PCIDevice *dev, uint8_t offset,
     uint16_t flags;
     uint8_t cap_size;
     int config_offset;
+
+    if (!msi_supported) {
+        return -ENOTSUP;
+    }
+
     MSI_DEV_PRINTF(dev,
                    "init offset: 0x%"PRIx8" vector: %"PRId8
                    " 64bit %d mask %d\n",
diff --git a/hw/msi.h b/hw/msi.h
index 5766018..3040bb0 100644
--- a/hw/msi.h
+++ b/hw/msi.h
@@ -24,6 +24,8 @@
 #include "qemu-common.h"
 #include "pci.h"
 
+extern bool msi_supported;
+
 bool msi_enabled(const PCIDevice *dev);
 int msi_init(struct PCIDevice *dev, uint8_t offset,
              unsigned int nr_vectors, bool msi64bit, bool msi_per_vector_mask);
diff --git a/hw/msix.c b/hw/msix.c
index 149eed2..107d4e5 100644
--- a/hw/msix.c
+++ b/hw/msix.c
@@ -12,6 +12,7 @@
  */
 
 #include "hw.h"
+#include "msi.h"
 #include "msix.h"
 #include "pci.h"
 #include "range.h"
@@ -32,9 +33,6 @@
 #define MSIX_MAX_ENTRIES 32
 
 
-/* Flag for interrupt controller to declare MSI-X support */
-int msix_supported;
-
 /* Add MSI-X capability to the config space for the device. */
 /* Given a bar and its size, add MSI-X table on top of it
  * and fill MSI-X capability in the config space.
@@ -235,10 +233,11 @@ int msix_init(struct PCIDevice *dev, unsigned short nentries,
               unsigned bar_nr, unsigned bar_size)
 {
     int ret;
+
     /* Nothing to do if MSI is not supported by interrupt controller */
-    if (!msix_supported)
+    if (!msi_supported) {
         return -ENOTSUP;
-
+    }
     if (nentries > MSIX_MAX_ENTRIES)
         return -EINVAL;
 
diff --git a/hw/msix.h b/hw/msix.h
index 7e04336..5aba22b 100644
--- a/hw/msix.h
+++ b/hw/msix.h
@@ -29,6 +29,4 @@ void msix_notify(PCIDevice *dev, unsigned vector);
 
 void msix_reset(PCIDevice *dev);
 
-extern int msix_supported;
-
 #endif
diff --git a/hw/pc.c b/hw/pc.c
index 7c4bfa8..240aaae 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -36,7 +36,7 @@
 #include "elf.h"
 #include "multiboot.h"
 #include "mc146818rtc.h"
-#include "msix.h"
+#include "msi.h"
 #include "sysbus.h"
 #include "sysemu.h"
 #include "blockdev.h"
@@ -896,7 +896,7 @@ static DeviceState *apic_init(void *env, uint8_t apic_id)
         apic_mapped = 1;
     }
 
-    msix_supported = 1;
+    msi_supported = true;
 
     return dev;
 }
-- 
1.7.3.4
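
Editorial sketch: the effect of the generalized flag can be illustrated
with a small standalone C fragment. It makes several assumptions:
demo_msi_init(), demo_msix_init() and demo_pick_irq_mode() are
hypothetical stand-ins, not functions from hw/msi.c or hw/msix.c, and
only the early -ENOTSUP gate shown in the diff is modelled, not the
capability setup.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the global flag this patch adds to hw/msi.c. */
static bool msi_supported;

enum DemoIrqMode { DEMO_IRQ_INTX, DEMO_IRQ_MSI, DEMO_IRQ_MSIX };

/* Hypothetical, simplified msi_init(): only the new gate is modelled. */
static int demo_msi_init(unsigned int nr_vectors)
{
    if (!msi_supported) {
        return -ENOTSUP;
    }
    assert(nr_vectors > 0);
    return 0;
}

/* Hypothetical, simplified msix_init(): gate plus the entry limit. */
static int demo_msix_init(unsigned short nentries)
{
    if (!msi_supported) {
        return -ENOTSUP;
    }
    if (nentries > 32) {            /* MSIX_MAX_ENTRIES in hw/msix.c */
        return -EINVAL;
    }
    return 0;
}

/* Hypothetical device init: prefer MSI-X, then MSI, else legacy INTx. */
static enum DemoIrqMode demo_pick_irq_mode(unsigned short nentries)
{
    if (demo_msix_init(nentries) == 0) {
        return DEMO_IRQ_MSIX;
    }
    if (demo_msi_init(1) == 0) {
        return DEMO_IRQ_MSI;
    }
    return DEMO_IRQ_INTX;
}
```

In this sketch, a device ends up on INTx as long as no interrupt
controller has set the flag, which is the behaviour the single shared
flag is meant to give for MSI and MSI-X alike.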


^ permalink raw reply related	[flat|nested] 99+ messages in thread

* [PATCH v5 02/16] kvm: Move kvmclock into hw/kvm folder
  2011-12-15 12:33 ` [Qemu-devel] " Jan Kiszka
@ 2011-12-15 12:33   ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: Blue Swirl, Anthony Liguori, qemu-devel, kvm, Michael S. Tsirkin

More KVM-specific devices will come, so let's start with moving the
kvmclock into a dedicated folder.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 Makefile.target                |    4 ++--
 configure                      |    1 +
 hw/{kvmclock.c => kvm/clock.c} |    4 ++--
 hw/{kvmclock.h => kvm/clock.h} |    0
 hw/pc_piix.c                   |    2 +-
 5 files changed, 6 insertions(+), 5 deletions(-)
 rename hw/{kvmclock.c => kvm/clock.c} (98%)
 rename hw/{kvmclock.h => kvm/clock.h} (100%)

diff --git a/Makefile.target b/Makefile.target
index a111521..1d24a30 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -236,7 +236,7 @@ obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
 obj-i386-y += pc_piix.o
-obj-i386-$(CONFIG_KVM) += kvmclock.o
+obj-i386-$(CONFIG_KVM) += kvm/clock.o
 obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 
 # shared objects
@@ -428,7 +428,7 @@ qmp-commands-old.h: $(SRC_PATH)/qmp-commands.hx
 
 clean:
 	rm -f *.o *.a *~ $(PROGS) nwfpe/*.o fpu/*.o
-	rm -f *.d */*.d tcg/*.o ide/*.o 9pfs/*.o
+	rm -f *.d */*.d tcg/*.o ide/*.o 9pfs/*.o kvm/*.o
 	rm -f hmp-commands.h qmp-commands-old.h gdbstub-xml.c
 ifdef CONFIG_TRACE_SYSTEMTAP
 	rm -f *.stp
diff --git a/configure b/configure
index ac4840d..12cd9d1 100755
--- a/configure
+++ b/configure
@@ -3338,6 +3338,7 @@ mkdir -p $target_dir/fpu
 mkdir -p $target_dir/tcg
 mkdir -p $target_dir/ide
 mkdir -p $target_dir/9pfs
+mkdir -p $target_dir/kvm
 if test "$target" = "arm-linux-user" -o "$target" = "armeb-linux-user" -o "$target" = "arm-bsd-user" -o "$target" = "armeb-bsd-user" ; then
   mkdir -p $target_dir/nwfpe
 fi
diff --git a/hw/kvmclock.c b/hw/kvm/clock.c
similarity index 98%
rename from hw/kvmclock.c
rename to hw/kvm/clock.c
index 5388bc4..5983271 100644
--- a/hw/kvmclock.c
+++ b/hw/kvm/clock.c
@@ -13,9 +13,9 @@
 
 #include "qemu-common.h"
 #include "sysemu.h"
-#include "sysbus.h"
 #include "kvm.h"
-#include "kvmclock.h"
+#include "hw/sysbus.h"
+#include "hw/kvm/clock.h"
 
 #include <linux/kvm.h>
 #include <linux/kvm_para.h>
diff --git a/hw/kvmclock.h b/hw/kvm/clock.h
similarity index 100%
rename from hw/kvmclock.h
rename to hw/kvm/clock.h
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 970f43c..530fe9c 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -34,7 +34,7 @@
 #include "boards.h"
 #include "ide.h"
 #include "kvm.h"
-#include "kvmclock.h"
+#include "kvm/clock.h"
 #include "sysemu.h"
 #include "sysbus.h"
 #include "arch_init.h"
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 99+ messages in thread

* [PATCH v5 03/16] apic: Stop timer on reset
  2011-12-15 12:33 ` [Qemu-devel] " Jan Kiszka
@ 2011-12-15 12:33   ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: Blue Swirl, Anthony Liguori, qemu-devel, kvm, Michael S. Tsirkin

All LVTs are masked on reset, so the timer can no longer raise an
interrupt. Letting it keep ticking is harmless, but it still generates
a spurious trace event.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/apic.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index 9d0f460..4b97b17 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -528,6 +528,8 @@ void apic_init_reset(DeviceState *d)
     s->initial_count_load_time = 0;
     s->next_time = 0;
     s->wait_for_sipi = 1;
+
+    qemu_del_timer(s->timer);
 }
 
 static void apic_startup(APICState *s, int vector_num)
-- 
1.7.3.4
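
Editorial sketch: the reasoning above can be shown with a toy model.
Everything here is an assumption made for illustration: DemoApic is not
QEMU's APICState, the timer_armed flag stands in for a pending
QEMUTimer, and the model assumes the timer callback emits its trace
event before the LVT mask is checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_LVT_MASKED (1u << 16)  /* mask bit of an LVT entry */

/* Hypothetical miniature of the APIC timer state. */
typedef struct {
    uint32_t lvt_timer;   /* LVT timer entry */
    bool timer_armed;     /* stands in for a pending QEMUTimer */
    int irqs_raised;      /* interrupts actually delivered */
    int trace_events;     /* trace events emitted by the callback */
} DemoApic;

/* Timer callback: always traced, delivers only if the LVT is unmasked. */
static void demo_timer_fire(DemoApic *s)
{
    assert(s);
    if (!s->timer_armed) {
        return;
    }
    s->timer_armed = false;
    s->trace_events++;
    if (!(s->lvt_timer & DEMO_LVT_MASKED)) {
        s->irqs_raised++;
    }
}

/* Reset masks the LVT; stop_timer models the added qemu_del_timer(). */
static void demo_reset(DemoApic *s, bool stop_timer)
{
    assert(s);
    s->lvt_timer = DEMO_LVT_MASKED;
    if (stop_timer) {
        s->timer_armed = false;
    }
}
```

With stop_timer set to false the model raises no interrupt after reset
but still counts one trace event; with stop_timer set to true both
counters stay at zero.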

^ permalink raw reply related	[flat|nested] 99+ messages in thread

* [PATCH v5 04/16] apic: Inject external NMI events via LINT1
  2011-12-15 12:33 ` [Qemu-devel] " Jan Kiszka
@ 2011-12-15 12:33   ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, qemu-devel, Anthony Liguori, Michael S. Tsirkin, Blue Swirl,
	Lai Jiangshan

On real hardware, NMI button events are injected via the LINT1 line of
the APICs. kdump, for example, expects this wiring and gets upset if the
per-APIC LINT1 mask is not respected, i.e. if NMIs are injected into
VCPUs that should not receive them. Change the APIC emulation code to
reflect this.

Based on a qemu-kvm patch by Lai Jiangshan.

CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/apic.c |    7 +++++++
 hw/apic.h |    1 +
 monitor.c |    6 +++++-
 3 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index 4b97b17..b9d733c 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -205,6 +205,13 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
     }
 }
 
+void apic_deliver_nmi(DeviceState *d)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+
+    apic_local_deliver(s, APIC_LVT_LINT1);
+}
+
 #define foreach_apic(apic, deliver_bitmask, code) \
 {\
     int __i, __j, __mask;\
diff --git a/hw/apic.h b/hw/apic.h
index a5c910f..a62d83b 100644
--- a/hw/apic.h
+++ b/hw/apic.h
@@ -8,6 +8,7 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode, uint8_t delivery_mode,
                       uint8_t vector_num, uint8_t trigger_mode);
 int apic_accept_pic_intr(DeviceState *s);
 void apic_deliver_pic_intr(DeviceState *s, int level);
+void apic_deliver_nmi(DeviceState *d);
 int apic_get_interrupt(DeviceState *s);
 void apic_reset_irq_delivered(void);
 int apic_get_irq_delivered(void);
diff --git a/monitor.c b/monitor.c
index 1be222e..6bd0fb1 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2354,7 +2354,11 @@ static int do_inject_nmi(Monitor *mon, const QDict *qdict, QObject **ret_data)
     CPUState *env;
 
     for (env = first_cpu; env != NULL; env = env->next_cpu) {
-        cpu_interrupt(env, CPU_INTERRUPT_NMI);
+        if (!env->apic_state) {
+            cpu_interrupt(env, CPU_INTERRUPT_NMI);
+        } else {
+            apic_deliver_nmi(env->apic_state);
+        }
     }
 
     return 0;
-- 
1.7.3.4
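
Editorial sketch: the routing change in do_inject_nmi can be modelled in
a few lines of standalone C. The types and helpers below (DemoCpu,
demo_deliver_nmi_via_lint1, demo_inject_nmi) are hypothetical and much
simpler than QEMU's apic_local_deliver(); the bit positions follow the
architectural LVT layout (mask in bit 16, delivery mode in bits 8-10,
NMI encoded as 100b).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_LVT_MASKED (1u << 16)  /* LVT mask bit */
#define DEMO_DM_MASK    (7u << 8)   /* LVT delivery mode field */
#define DEMO_DM_NMI     (4u << 8)   /* delivery mode: NMI */

/* Hypothetical per-VCPU state, far smaller than QEMU's CPUState. */
typedef struct {
    bool has_apic;       /* models env->apic_state != NULL */
    uint32_t lvt_lint1;  /* LVT LINT1 entry of this CPU's APIC */
    int nmis_seen;       /* NMIs that reached the CPU */
} DemoCpu;

/* LINT1 path: the per-APIC mask decides whether the NMI gets through. */
static void demo_deliver_nmi_via_lint1(DemoCpu *cpu)
{
    if (cpu->lvt_lint1 & DEMO_LVT_MASKED) {
        return;
    }
    if ((cpu->lvt_lint1 & DEMO_DM_MASK) == DEMO_DM_NMI) {
        cpu->nmis_seen++;
    }
}

/* Mirrors the loop in the monitor.c hunk: direct NMI only without APIC. */
static void demo_inject_nmi(DemoCpu *cpus, int count)
{
    assert(cpus && count >= 0);
    for (int i = 0; i < count; i++) {
        if (!cpus[i].has_apic) {
            cpus[i].nmis_seen++;
        } else {
            demo_deliver_nmi_via_lint1(&cpus[i]);
        }
    }
}
```

A guest that unmasks LINT1 only on its boot CPU then sees the injected
NMI on that CPU alone, which is the wiring kdump is described as
relying on.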


^ permalink raw reply related	[flat|nested] 99+ messages in thread

* [PATCH v5 05/16] apic: Introduce apic_report_irq_delivered
  2011-12-15 12:33 ` [Qemu-devel] " Jan Kiszka
@ 2011-12-15 12:33   ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: Blue Swirl, Anthony Liguori, qemu-devel, kvm, Michael S. Tsirkin

The in-kernel i8259 and IOAPIC backends for KVM will need this, so
encapsulate the shared bits.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/apic.c    |   11 ++++++++---
 hw/apic.h    |    1 +
 trace-events |    2 +-
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index b9d733c..bec493b 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -413,6 +413,13 @@ static void apic_update_irq(APICState *s)
     }
 }
 
+void apic_report_irq_delivered(int delivered)
+{
+    apic_irq_delivered += delivered;
+
+    trace_apic_report_irq_delivered(apic_irq_delivered);
+}
+
 void apic_reset_irq_delivered(void)
 {
     trace_apic_reset_irq_delivered(apic_irq_delivered);
@@ -429,9 +436,7 @@ int apic_get_irq_delivered(void)
 
 static void apic_set_irq(APICState *s, int vector_num, int trigger_mode)
 {
-    apic_irq_delivered += !get_bit(s->irr, vector_num);
-
-    trace_apic_set_irq(apic_irq_delivered);
+    apic_report_irq_delivered(!get_bit(s->irr, vector_num));
 
     set_bit(s->irr, vector_num);
     if (trigger_mode)
diff --git a/hw/apic.h b/hw/apic.h
index a62d83b..8173d8a 100644
--- a/hw/apic.h
+++ b/hw/apic.h
@@ -10,6 +10,7 @@ int apic_accept_pic_intr(DeviceState *s);
 void apic_deliver_pic_intr(DeviceState *s, int level);
 void apic_deliver_nmi(DeviceState *d);
 int apic_get_interrupt(DeviceState *s);
+void apic_report_irq_delivered(int delivered);
 void apic_reset_irq_delivered(void);
 int apic_get_irq_delivered(void);
 void cpu_set_apic_base(DeviceState *s, uint64_t val);
diff --git a/trace-events b/trace-events
index 962caca..bf8de74 100644
--- a/trace-events
+++ b/trace-events
@@ -96,9 +96,9 @@ cpu_get_apic_base(uint64_t val) "%016"PRIx64
 apic_mem_readl(uint64_t addr, uint32_t val)  "%"PRIx64" = %08x"
 apic_mem_writel(uint64_t addr, uint32_t val) "%"PRIx64" = %08x"
 # coalescing
+apic_report_irq_delivered(int apic_irq_delivered) "coalescing %d"
 apic_reset_irq_delivered(int apic_irq_delivered) "old coalescing %d"
 apic_get_irq_delivered(int apic_irq_delivered) "returning coalescing %d"
-apic_set_irq(int apic_irq_delivered) "coalescing %d"
 
 # hw/cs4231.c
 cs4231_mem_readl_dreg(uint32_t reg, uint32_t ret) "read dreg %d: 0x%02x"
-- 
1.7.3.4
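
Editorial sketch: the coalescing counter that this patch wraps can be
modelled as below. The demo_* names are hypothetical; the logic mirrors
the hunk above (report "delivered" only if the IRR bit was clear), while
demo_inject_and_check() is an assumed caller pattern of reset, inject,
then read back, and is not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static int demo_irq_delivered;   /* models the global apic_irq_delivered */
static uint32_t demo_irr[8];     /* interrupt request register, 256 bits */

static bool demo_get_bit(const uint32_t *tab, int index)
{
    return (tab[index >> 5] >> (index & 31)) & 1u;
}

static void demo_set_bit(uint32_t *tab, int index)
{
    tab[index >> 5] |= 1u << (index & 31);
}

/* Shared entry point, analogous to apic_report_irq_delivered(). */
static void demo_report_irq_delivered(int delivered)
{
    demo_irq_delivered += delivered;
}

static void demo_reset_irq_delivered(void)
{
    demo_irq_delivered = 0;
}

static int demo_get_irq_delivered(void)
{
    return demo_irq_delivered;
}

/* As in apic_set_irq(): count the IRQ only if it was not pending yet. */
static void demo_set_irq(int vector_num)
{
    assert(vector_num >= 0 && vector_num < 256);
    demo_report_irq_delivered(!demo_get_bit(demo_irr, vector_num));
    demo_set_bit(demo_irr, vector_num);
}

/* Hypothetical caller: returns true if the injection was coalesced. */
static bool demo_inject_and_check(int vector_num)
{
    demo_reset_irq_delivered();
    demo_set_irq(vector_num);
    return demo_get_irq_delivered() == 0;
}
```

Because the counter is updated through one function, another model (for
instance an in-kernel backend) could report its own delivery result
through the same entry point without touching the IRR logic.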

^ permalink raw reply related	[flat|nested] 99+ messages in thread

* [Qemu-devel] [PATCH v5 05/16] apic: Introduce apic_report_irq_delivered
@ 2011-12-15 12:33   ` Jan Kiszka
  0 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: Blue Swirl, Anthony Liguori, qemu-devel, kvm, Michael S. Tsirkin

The in-kernel i8259 and IOAPIC backends for KVM will need this, so
encapsulate the shared bits.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/apic.c    |   11 ++++++++---
 hw/apic.h    |    1 +
 trace-events |    2 +-
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index b9d733c..bec493b 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -413,6 +413,13 @@ static void apic_update_irq(APICState *s)
     }
 }
 
+void apic_report_irq_delivered(int delivered)
+{
+    apic_irq_delivered += delivered;
+
+    trace_apic_report_irq_delivered(apic_irq_delivered);
+}
+
 void apic_reset_irq_delivered(void)
 {
     trace_apic_reset_irq_delivered(apic_irq_delivered);
@@ -429,9 +436,7 @@ int apic_get_irq_delivered(void)
 
 static void apic_set_irq(APICState *s, int vector_num, int trigger_mode)
 {
-    apic_irq_delivered += !get_bit(s->irr, vector_num);
-
-    trace_apic_set_irq(apic_irq_delivered);
+    apic_report_irq_delivered(!get_bit(s->irr, vector_num));
 
     set_bit(s->irr, vector_num);
     if (trigger_mode)
diff --git a/hw/apic.h b/hw/apic.h
index a62d83b..8173d8a 100644
--- a/hw/apic.h
+++ b/hw/apic.h
@@ -10,6 +10,7 @@ int apic_accept_pic_intr(DeviceState *s);
 void apic_deliver_pic_intr(DeviceState *s, int level);
 void apic_deliver_nmi(DeviceState *d);
 int apic_get_interrupt(DeviceState *s);
+void apic_report_irq_delivered(int delivered);
 void apic_reset_irq_delivered(void);
 int apic_get_irq_delivered(void);
 void cpu_set_apic_base(DeviceState *s, uint64_t val);
diff --git a/trace-events b/trace-events
index 962caca..bf8de74 100644
--- a/trace-events
+++ b/trace-events
@@ -96,9 +96,9 @@ cpu_get_apic_base(uint64_t val) "%016"PRIx64
 apic_mem_readl(uint64_t addr, uint32_t val)  "%"PRIx64" = %08x"
 apic_mem_writel(uint64_t addr, uint32_t val) "%"PRIx64" = %08x"
 # coalescing
+apic_report_irq_delivered(int apic_irq_delivered) "coalescing %d"
 apic_reset_irq_delivered(int apic_irq_delivered) "old coalescing %d"
 apic_get_irq_delivered(int apic_irq_delivered) "returning coalescing %d"
-apic_set_irq(int apic_irq_delivered) "coalescing %d"
 
 # hw/cs4231.c
 cs4231_mem_readl_dreg(uint32_t reg, uint32_t ret) "read dreg %d: 0x%02x"
-- 
1.7.3.4


* [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-15 12:33 ` [Qemu-devel] " Jan Kiszka
@ 2011-12-15 12:33   ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, qemu-devel, Anthony Liguori, Michael S. Tsirkin, Blue Swirl

The KVM in-kernel APIC model will reuse parts of the user space model
while providing the same frontend view to the guest and to most
management interfaces. Introduce an APIC backend concept to encapsulate
the parts that differ between the user space and the KVM model. The
backend offers callback hooks for init, base/TPR setting, and external
NMI delivery, which each model implements in its own way.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
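The frontend/backend split described above can be sketched in isolation
as follows. This is a cut-down illustration under stated assumptions,
not the code from this patch: it uses a plain linked list instead of
QSIMPLEQ, keeps only the set_tpr hook, and the names Apic, Backend,
register_backend, apic_attach, frontend_set_tpr and frontend_get_tpr
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct Apic Apic;

/* Cut-down APICBackend: a name plus one hook that differs per model. */
typedef struct Backend {
    const char *name;
    void (*set_tpr)(Apic *s, uint8_t val);
    struct Backend *next;   /* plain linked list instead of QSIMPLEQ */
} Backend;

/* Cut-down APICState: shared register state plus the selected backend. */
struct Apic {
    uint8_t tpr;
    const Backend *backend;
};

static Backend *backends;

void register_backend(Backend *b)
{
    b->next = backends;
    backends = b;
}

/* Select a backend by name; returns 0 on success, -1 if none matches. */
int apic_attach(Apic *s, const char *name)
{
    for (Backend *b = backends; b; b = b->next) {
        if (name && strcmp(b->name, name) == 0) {
            s->backend = b;
            return 0;
        }
    }
    return -1;
}

/* Frontend: the setter dispatches to the backend, the getter reads shared state. */
void frontend_set_tpr(Apic *s, uint8_t val) { s->backend->set_tpr(s, val); }
uint8_t frontend_get_tpr(const Apic *s) { return s->tpr >> 4; }

/* Emulated model, following the arithmetic of apic_set_tpr() in hw/apic.c. */
static void emulated_set_tpr(Apic *s, uint8_t val)
{
    s->tpr = (uint8_t)((val & 0x0f) << 4);
}

Backend emulated_backend = { .name = "QEMU", .set_tpr = emulated_set_tpr };
```

Unlike the hw_error() path in the patch, the sketch returns -1 for an
unknown backend name so that the lookup can be exercised without
aborting.
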
 Makefile.target    |    2 +-
 hw/apic.c          |  285 +++-------------------------------------------------
 hw/apic.h          |    1 -
 hw/apic_common.c   |  265 ++++++++++++++++++++++++++++++++++++++++++++++++
 hw/apic_internal.h |  119 ++++++++++++++++++++++
 hw/pc.c            |    1 +
 6 files changed, 401 insertions(+), 272 deletions(-)
 create mode 100644 hw/apic_common.c
 create mode 100644 hw/apic_internal.h

diff --git a/Makefile.target b/Makefile.target
index 1d24a30..c46f062 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -231,7 +231,7 @@ obj-$(CONFIG_IVSHMEM) += ivshmem.o
 # Hardware support
 obj-i386-y += vga.o
 obj-i386-y += mc146818rtc.o pc.o
-obj-i386-y += cirrus_vga.o sga.o apic.o ioapic.o piix_pci.o
+obj-i386-y += cirrus_vga.o sga.o apic_common.o apic.o ioapic.o piix_pci.o
 obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
diff --git a/hw/apic.c b/hw/apic.c
index bec493b..5fa3111 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -16,53 +16,13 @@
  * You should have received a copy of the GNU Lesser General Public
  * License along with this library; if not, see <http://www.gnu.org/licenses/>
  */
-#include "hw.h"
+#include "apic_internal.h"
 #include "apic.h"
 #include "ioapic.h"
-#include "qemu-timer.h"
 #include "host-utils.h"
-#include "sysbus.h"
 #include "trace.h"
 #include "pc.h"
 
-/* APIC Local Vector Table */
-#define APIC_LVT_TIMER   0
-#define APIC_LVT_THERMAL 1
-#define APIC_LVT_PERFORM 2
-#define APIC_LVT_LINT0   3
-#define APIC_LVT_LINT1   4
-#define APIC_LVT_ERROR   5
-#define APIC_LVT_NB      6
-
-/* APIC delivery modes */
-#define APIC_DM_FIXED	0
-#define APIC_DM_LOWPRI	1
-#define APIC_DM_SMI	2
-#define APIC_DM_NMI	4
-#define APIC_DM_INIT	5
-#define APIC_DM_SIPI	6
-#define APIC_DM_EXTINT	7
-
-/* APIC destination mode */
-#define APIC_DESTMODE_FLAT	0xf
-#define APIC_DESTMODE_CLUSTER	1
-
-#define APIC_TRIGGER_EDGE  0
-#define APIC_TRIGGER_LEVEL 1
-
-#define	APIC_LVT_TIMER_PERIODIC		(1<<17)
-#define	APIC_LVT_MASKED			(1<<16)
-#define	APIC_LVT_LEVEL_TRIGGER		(1<<15)
-#define	APIC_LVT_REMOTE_IRR		(1<<14)
-#define	APIC_INPUT_POLARITY		(1<<13)
-#define	APIC_SEND_PENDING		(1<<12)
-
-#define ESR_ILLEGAL_ADDRESS (1 << 7)
-
-#define APIC_SV_DIRECTED_IO             (1<<12)
-#define APIC_SV_ENABLE                  (1<<8)
-
-#define MAX_APICS 255
 #define MAX_APIC_WORDS 8
 
 /* Intel APIC constants: from include/asm/msidef.h */
@@ -75,40 +35,7 @@
 #define MSI_ADDR_DEST_ID_SHIFT		12
 #define	MSI_ADDR_DEST_ID_MASK		0x00ffff0
 
-#define MSI_ADDR_SIZE                   0x100000
-
-typedef struct APICState APICState;
-
-struct APICState {
-    SysBusDevice busdev;
-    MemoryRegion io_memory;
-    void *cpu_env;
-    uint32_t apicbase;
-    uint8_t id;
-    uint8_t arb_id;
-    uint8_t tpr;
-    uint32_t spurious_vec;
-    uint8_t log_dest;
-    uint8_t dest_mode;
-    uint32_t isr[8];  /* in service register */
-    uint32_t tmr[8];  /* trigger mode register */
-    uint32_t irr[8]; /* interrupt request register */
-    uint32_t lvt[APIC_LVT_NB];
-    uint32_t esr; /* error register */
-    uint32_t icr[2];
-
-    uint32_t divide_conf;
-    int count_shift;
-    uint32_t initial_count;
-    int64_t initial_count_load_time, next_time;
-    uint32_t idx;
-    QEMUTimer *timer;
-    int sipi_vector;
-    int wait_for_sipi;
-};
-
 static APICState *local_apics[MAX_APICS + 1];
-static int apic_irq_delivered;
 
 static void apic_set_irq(APICState *s, int vector_num, int trigger_mode);
 static void apic_update_irq(APICState *s);
@@ -205,10 +132,8 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
     }
 }
 
-void apic_deliver_nmi(DeviceState *d)
+static void apic_external_nmi(APICState *s)
 {
-    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
-
     apic_local_deliver(s, APIC_LVT_LINT1);
 }
 
@@ -300,14 +225,8 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode, uint8_t delivery_mode,
     apic_bus_deliver(deliver_bitmask, delivery_mode, vector_num, trigger_mode);
 }
 
-void cpu_set_apic_base(DeviceState *d, uint64_t val)
+static void apic_set_base(APICState *s, uint64_t val)
 {
-    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
-
-    trace_cpu_set_apic_base(val);
-
-    if (!s)
-        return;
     s->apicbase = (val & 0xfffff000) |
         (s->apicbase & (MSR_IA32_APICBASE_BSP | MSR_IA32_APICBASE_ENABLE));
     /* if disabled, cannot be enabled again */
@@ -318,32 +237,12 @@ void cpu_set_apic_base(DeviceState *d, uint64_t val)
     }
 }
 
-uint64_t cpu_get_apic_base(DeviceState *d)
+static void apic_set_tpr(APICState *s, uint8_t val)
 {
-    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
-
-    trace_cpu_get_apic_base(s ? (uint64_t)s->apicbase: 0);
-
-    return s ? s->apicbase : 0;
-}
-
-void cpu_set_apic_tpr(DeviceState *d, uint8_t val)
-{
-    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
-
-    if (!s)
-        return;
     s->tpr = (val & 0x0f) << 4;
     apic_update_irq(s);
 }
 
-uint8_t cpu_get_apic_tpr(DeviceState *d)
-{
-    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
-
-    return s ? s->tpr >> 4 : 0;
-}
-
 /* return -1 if no bit is set */
 static int get_highest_priority_int(uint32_t *tab)
 {
@@ -413,27 +312,6 @@ static void apic_update_irq(APICState *s)
     }
 }
 
-void apic_report_irq_delivered(int delivered)
-{
-    apic_irq_delivered += delivered;
-
-    trace_apic_report_irq_delivered(apic_irq_delivered);
-}
-
-void apic_reset_irq_delivered(void)
-{
-    trace_apic_reset_irq_delivered(apic_irq_delivered);
-
-    apic_irq_delivered = 0;
-}
-
-int apic_get_irq_delivered(void)
-{
-    trace_apic_get_irq_delivered(apic_irq_delivered);
-
-    return apic_irq_delivered;
-}
-
 static void apic_set_irq(APICState *s, int vector_num, int trigger_mode)
 {
     apic_report_irq_delivered(!get_bit(s->irr, vector_num));
@@ -515,35 +393,6 @@ static void apic_get_delivery_bitmask(uint32_t *deliver_bitmask,
     }
 }
 
-void apic_init_reset(DeviceState *d)
-{
-    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
-    int i;
-
-    if (!s)
-        return;
-
-    s->tpr = 0;
-    s->spurious_vec = 0xff;
-    s->log_dest = 0;
-    s->dest_mode = 0xf;
-    memset(s->isr, 0, sizeof(s->isr));
-    memset(s->tmr, 0, sizeof(s->tmr));
-    memset(s->irr, 0, sizeof(s->irr));
-    for(i = 0; i < APIC_LVT_NB; i++)
-        s->lvt[i] = 1 << 16; /* mask LVT */
-    s->esr = 0;
-    memset(s->icr, 0, sizeof(s->icr));
-    s->divide_conf = 0;
-    s->count_shift = 0;
-    s->initial_count = 0;
-    s->initial_count_load_time = 0;
-    s->next_time = 0;
-    s->wait_for_sipi = 1;
-
-    qemu_del_timer(s->timer);
-}
-
 static void apic_startup(APICState *s, int vector_num)
 {
     s->sipi_vector = vector_num;
@@ -904,96 +753,6 @@ static void apic_mem_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
     }
 }
 
-/* This function is only used for old state version 1 and 2 */
-static int apic_load_old(QEMUFile *f, void *opaque, int version_id)
-{
-    APICState *s = opaque;
-    int i;
-
-    if (version_id > 2)
-        return -EINVAL;
-
-    /* XXX: what if the base changes? (registered memory regions) */
-    qemu_get_be32s(f, &s->apicbase);
-    qemu_get_8s(f, &s->id);
-    qemu_get_8s(f, &s->arb_id);
-    qemu_get_8s(f, &s->tpr);
-    qemu_get_be32s(f, &s->spurious_vec);
-    qemu_get_8s(f, &s->log_dest);
-    qemu_get_8s(f, &s->dest_mode);
-    for (i = 0; i < 8; i++) {
-        qemu_get_be32s(f, &s->isr[i]);
-        qemu_get_be32s(f, &s->tmr[i]);
-        qemu_get_be32s(f, &s->irr[i]);
-    }
-    for (i = 0; i < APIC_LVT_NB; i++) {
-        qemu_get_be32s(f, &s->lvt[i]);
-    }
-    qemu_get_be32s(f, &s->esr);
-    qemu_get_be32s(f, &s->icr[0]);
-    qemu_get_be32s(f, &s->icr[1]);
-    qemu_get_be32s(f, &s->divide_conf);
-    s->count_shift=qemu_get_be32(f);
-    qemu_get_be32s(f, &s->initial_count);
-    s->initial_count_load_time=qemu_get_be64(f);
-    s->next_time=qemu_get_be64(f);
-
-    if (version_id >= 2)
-        qemu_get_timer(f, s->timer);
-    return 0;
-}
-
-static const VMStateDescription vmstate_apic = {
-    .name = "apic",
-    .version_id = 3,
-    .minimum_version_id = 3,
-    .minimum_version_id_old = 1,
-    .load_state_old = apic_load_old,
-    .fields      = (VMStateField []) {
-        VMSTATE_UINT32(apicbase, APICState),
-        VMSTATE_UINT8(id, APICState),
-        VMSTATE_UINT8(arb_id, APICState),
-        VMSTATE_UINT8(tpr, APICState),
-        VMSTATE_UINT32(spurious_vec, APICState),
-        VMSTATE_UINT8(log_dest, APICState),
-        VMSTATE_UINT8(dest_mode, APICState),
-        VMSTATE_UINT32_ARRAY(isr, APICState, 8),
-        VMSTATE_UINT32_ARRAY(tmr, APICState, 8),
-        VMSTATE_UINT32_ARRAY(irr, APICState, 8),
-        VMSTATE_UINT32_ARRAY(lvt, APICState, APIC_LVT_NB),
-        VMSTATE_UINT32(esr, APICState),
-        VMSTATE_UINT32_ARRAY(icr, APICState, 2),
-        VMSTATE_UINT32(divide_conf, APICState),
-        VMSTATE_INT32(count_shift, APICState),
-        VMSTATE_UINT32(initial_count, APICState),
-        VMSTATE_INT64(initial_count_load_time, APICState),
-        VMSTATE_INT64(next_time, APICState),
-        VMSTATE_TIMER(timer, APICState),
-        VMSTATE_END_OF_LIST()
-    }
-};
-
-static void apic_reset(DeviceState *d)
-{
-    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
-    int bsp;
-
-    bsp = cpu_is_bsp(s->cpu_env);
-    s->apicbase = 0xfee00000 |
-        (bsp ? MSR_IA32_APICBASE_BSP : 0) | MSR_IA32_APICBASE_ENABLE;
-
-    apic_init_reset(d);
-
-    if (bsp) {
-        /*
-         * LINT0 delivery mode on CPU #0 is set to ExtInt at initialization
-         * time typically by BIOS, so PIC interrupt can be delivered to the
-         * processor when local APIC is enabled.
-         */
-        s->lvt[APIC_LVT_LINT0] = 0x700;
-    }
-}
-
 static const MemoryRegionOps apic_io_ops = {
     .old_mmio = {
         .read = { apic_mem_readb, apic_mem_readw, apic_mem_readl, },
@@ -1002,41 +761,27 @@ static const MemoryRegionOps apic_io_ops = {
     .endianness = DEVICE_NATIVE_ENDIAN,
 };
 
-static int apic_init1(SysBusDevice *dev)
+static void apic_backend_init(APICState *s)
 {
-    APICState *s = FROM_SYSBUS(APICState, dev);
-    static int last_apic_idx;
-
-    if (last_apic_idx >= MAX_APICS) {
-        return -1;
-    }
-    memory_region_init_io(&s->io_memory, &apic_io_ops, s, "apic",
-                          MSI_ADDR_SIZE);
-    sysbus_init_mmio(dev, &s->io_memory);
+    memory_region_init_io(&s->io_memory, &apic_io_ops, s, "apic-msi",
+                          MSI_SPACE_SIZE);
 
     s->timer = qemu_new_timer_ns(vm_clock, apic_timer, s);
-    s->idx = last_apic_idx++;
     local_apics[s->idx] = s;
-    return 0;
 }
 
-static SysBusDeviceInfo apic_info = {
-    .init = apic_init1,
-    .qdev.name = "apic",
-    .qdev.size = sizeof(APICState),
-    .qdev.vmsd = &vmstate_apic,
-    .qdev.reset = apic_reset,
-    .qdev.no_user = 1,
-    .qdev.props = (Property[]) {
-        DEFINE_PROP_UINT8("id", APICState, id, -1),
-        DEFINE_PROP_PTR("cpu_env", APICState, cpu_env),
-        DEFINE_PROP_END_OF_LIST(),
-    }
+static APICBackend apic_backend = {
+    .name = "QEMU",
+    .init = apic_backend_init,
+    .set_base = apic_set_base,
+    .set_tpr = apic_set_tpr,
+    .external_nmi = apic_external_nmi,
 };
 
 static void apic_register_devices(void)
 {
-    sysbus_register_withprop(&apic_info);
+    apic_register_device();
+    apic_register_backend(&apic_backend);
 }
 
 device_init(apic_register_devices)
diff --git a/hw/apic.h b/hw/apic.h
index 8173d8a..a62d83b 100644
--- a/hw/apic.h
+++ b/hw/apic.h
@@ -10,7 +10,6 @@ int apic_accept_pic_intr(DeviceState *s);
 void apic_deliver_pic_intr(DeviceState *s, int level);
 void apic_deliver_nmi(DeviceState *d);
 int apic_get_interrupt(DeviceState *s);
-void apic_report_irq_delivered(int delivered);
 void apic_reset_irq_delivered(void);
 int apic_get_irq_delivered(void);
 void cpu_set_apic_base(DeviceState *s, uint64_t val);
diff --git a/hw/apic_common.c b/hw/apic_common.c
new file mode 100644
index 0000000..4cdc45c
--- /dev/null
+++ b/hw/apic_common.c
@@ -0,0 +1,265 @@
+/*
+ *  APIC support - common bits of emulated and KVM kernel model
+ *
+ *  Copyright (c) 2004-2005 Fabrice Bellard
+ *  Copyright (c) 2011      Jan Kiszka, Siemens AG
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+#include "apic.h"
+#include "apic_internal.h"
+#include "trace.h"
+
+static QSIMPLEQ_HEAD(, APICBackend) backends =
+    QSIMPLEQ_HEAD_INITIALIZER(backends);
+static int apic_irq_delivered;
+
+void cpu_set_apic_base(DeviceState *d, uint64_t val)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+
+    trace_cpu_set_apic_base(val);
+
+    if (s) {
+        s->backend->set_base(s, val);
+    }
+}
+
+uint64_t cpu_get_apic_base(DeviceState *d)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+
+    trace_cpu_get_apic_base(s ? (uint64_t)s->apicbase : 0);
+
+    return s ? s->apicbase : 0;
+}
+
+void cpu_set_apic_tpr(DeviceState *d, uint8_t val)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+
+    if (s) {
+        s->backend->set_tpr(s, val);
+    }
+}
+
+uint8_t cpu_get_apic_tpr(DeviceState *d)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+
+    return s ? s->tpr >> 4 : 0;
+}
+
+void apic_report_irq_delivered(int delivered)
+{
+    apic_irq_delivered += delivered;
+
+    trace_apic_report_irq_delivered(apic_irq_delivered);
+}
+
+void apic_reset_irq_delivered(void)
+{
+    trace_apic_reset_irq_delivered(apic_irq_delivered);
+
+    apic_irq_delivered = 0;
+}
+
+int apic_get_irq_delivered(void)
+{
+    trace_apic_get_irq_delivered(apic_irq_delivered);
+
+    return apic_irq_delivered;
+}
+
+void apic_deliver_nmi(DeviceState *d)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+
+    s->backend->external_nmi(s);
+}
+
+void apic_init_reset(DeviceState *d)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+    int i;
+
+    if (!s) {
+        return;
+    }
+    s->tpr = 0;
+    s->spurious_vec = 0xff;
+    s->log_dest = 0;
+    s->dest_mode = 0xf;
+    memset(s->isr, 0, sizeof(s->isr));
+    memset(s->tmr, 0, sizeof(s->tmr));
+    memset(s->irr, 0, sizeof(s->irr));
+    for (i = 0; i < APIC_LVT_NB; i++) {
+        s->lvt[i] = APIC_LVT_MASKED;
+    }
+    s->esr = 0;
+    memset(s->icr, 0, sizeof(s->icr));
+    s->divide_conf = 0;
+    s->count_shift = 0;
+    s->initial_count = 0;
+    s->initial_count_load_time = 0;
+    s->next_time = 0;
+    s->wait_for_sipi = 1;
+
+    qemu_del_timer(s->timer);
+}
+
+static void apic_reset(DeviceState *d)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+    bool bsp;
+
+    bsp = cpu_is_bsp(s->cpu_env);
+    s->apicbase = 0xfee00000 |
+        (bsp ? MSR_IA32_APICBASE_BSP : 0) | MSR_IA32_APICBASE_ENABLE;
+
+    apic_init_reset(d);
+
+    if (bsp) {
+        /*
+         * LINT0 delivery mode on CPU #0 is set to ExtInt at initialization
+         * time typically by BIOS, so PIC interrupt can be delivered to the
+         * processor when local APIC is enabled.
+         */
+        s->lvt[APIC_LVT_LINT0] = 0x700;
+    }
+}
+
+/* This function is only used for old state version 1 and 2 */
+static int apic_load_old(QEMUFile *f, void *opaque, int version_id)
+{
+    APICState *s = opaque;
+    int i;
+
+    if (version_id > 2) {
+        return -EINVAL;
+    }
+
+    /* XXX: what if the base changes? (registered memory regions) */
+    qemu_get_be32s(f, &s->apicbase);
+    qemu_get_8s(f, &s->id);
+    qemu_get_8s(f, &s->arb_id);
+    qemu_get_8s(f, &s->tpr);
+    qemu_get_be32s(f, &s->spurious_vec);
+    qemu_get_8s(f, &s->log_dest);
+    qemu_get_8s(f, &s->dest_mode);
+    for (i = 0; i < 8; i++) {
+        qemu_get_be32s(f, &s->isr[i]);
+        qemu_get_be32s(f, &s->tmr[i]);
+        qemu_get_be32s(f, &s->irr[i]);
+    }
+    for (i = 0; i < APIC_LVT_NB; i++) {
+        qemu_get_be32s(f, &s->lvt[i]);
+    }
+    qemu_get_be32s(f, &s->esr);
+    qemu_get_be32s(f, &s->icr[0]);
+    qemu_get_be32s(f, &s->icr[1]);
+    qemu_get_be32s(f, &s->divide_conf);
+    s->count_shift = qemu_get_be32(f);
+    qemu_get_be32s(f, &s->initial_count);
+    s->initial_count_load_time = qemu_get_be64(f);
+    s->next_time = qemu_get_be64(f);
+
+    if (version_id >= 2) {
+        qemu_get_timer(f, s->timer);
+    }
+    return 0;
+}
+
+static const VMStateDescription vmstate_apic = {
+    .name = "apic",
+    .version_id = 3,
+    .minimum_version_id = 3,
+    .minimum_version_id_old = 1,
+    .load_state_old = apic_load_old,
+    .fields      = (VMStateField[]) {
+        VMSTATE_UINT32(apicbase, APICState),
+        VMSTATE_UINT8(id, APICState),
+        VMSTATE_UINT8(arb_id, APICState),
+        VMSTATE_UINT8(tpr, APICState),
+        VMSTATE_UINT32(spurious_vec, APICState),
+        VMSTATE_UINT8(log_dest, APICState),
+        VMSTATE_UINT8(dest_mode, APICState),
+        VMSTATE_UINT32_ARRAY(isr, APICState, 8),
+        VMSTATE_UINT32_ARRAY(tmr, APICState, 8),
+        VMSTATE_UINT32_ARRAY(irr, APICState, 8),
+        VMSTATE_UINT32_ARRAY(lvt, APICState, APIC_LVT_NB),
+        VMSTATE_UINT32(esr, APICState),
+        VMSTATE_UINT32_ARRAY(icr, APICState, 2),
+        VMSTATE_UINT32(divide_conf, APICState),
+        VMSTATE_INT32(count_shift, APICState),
+        VMSTATE_UINT32(initial_count, APICState),
+        VMSTATE_INT64(initial_count_load_time, APICState),
+        VMSTATE_INT64(next_time, APICState),
+        VMSTATE_TIMER(timer, APICState),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static int apic_init(SysBusDevice *dev)
+{
+    APICState *s = FROM_SYSBUS(APICState, dev);
+    static int apic_no;
+    APICBackend *b;
+
+    if (apic_no >= MAX_APICS) {
+        return -1;
+    }
+    s->idx = apic_no++;
+
+    QSIMPLEQ_FOREACH(b, &backends, entry) {
+        if (strcmp(b->name, s->backend_name) == 0) {
+            s->backend = b;
+            break;
+        }
+    }
+    if (!s->backend) {
+        hw_error("APIC backend '%s' not found!", s->backend_name);
+        exit(1);
+    }
+
+    b->init(s);
+
+    sysbus_init_mmio(&s->busdev, &s->io_memory);
+    return 0;
+}
+
+static SysBusDeviceInfo apic_info = {
+    .init = apic_init,
+    .qdev.name = "apic",
+    .qdev.size = sizeof(APICState),
+    .qdev.vmsd = &vmstate_apic,
+    .qdev.reset = apic_reset,
+    .qdev.no_user = 1,
+    .qdev.props = (Property[]) {
+        DEFINE_PROP_UINT8("id", APICState, id, -1),
+        DEFINE_PROP_PTR("cpu_env", APICState, cpu_env),
+        DEFINE_PROP_STRING("backend", APICState, backend_name),
+        DEFINE_PROP_END_OF_LIST(),
+    }
+};
+
+void apic_register_backend(APICBackend *backend)
+{
+    QSIMPLEQ_INSERT_TAIL(&backends, backend, entry);
+}
+
+void apic_register_device(void)
+{
+    sysbus_register_withprop(&apic_info);
+}
diff --git a/hw/apic_internal.h b/hw/apic_internal.h
new file mode 100644
index 0000000..6cbd901
--- /dev/null
+++ b/hw/apic_internal.h
@@ -0,0 +1,119 @@
+/*
+ *  APIC support - internal interfaces
+ *
+ *  Copyright (c) 2004-2005 Fabrice Bellard
+ *  Copyright (c) 2011      Jan Kiszka, Siemens AG
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+#ifndef QEMU_APIC_INTERNAL_H
+#define QEMU_APIC_INTERNAL_H
+
+#include "memory.h"
+#include "sysbus.h"
+#include "qemu-timer.h"
+#include "qemu-queue.h"
+
+/* APIC Local Vector Table */
+#define APIC_LVT_TIMER                  0
+#define APIC_LVT_THERMAL                1
+#define APIC_LVT_PERFORM                2
+#define APIC_LVT_LINT0                  3
+#define APIC_LVT_LINT1                  4
+#define APIC_LVT_ERROR                  5
+#define APIC_LVT_NB                     6
+
+/* APIC delivery modes */
+#define APIC_DM_FIXED                   0
+#define APIC_DM_LOWPRI                  1
+#define APIC_DM_SMI                     2
+#define APIC_DM_NMI                     4
+#define APIC_DM_INIT                    5
+#define APIC_DM_SIPI                    6
+#define APIC_DM_EXTINT                  7
+
+/* APIC destination mode */
+#define APIC_DESTMODE_FLAT              0xf
+#define APIC_DESTMODE_CLUSTER           1
+
+#define APIC_TRIGGER_EDGE               0
+#define APIC_TRIGGER_LEVEL              1
+
+#define APIC_LVT_TIMER_PERIODIC         (1<<17)
+#define APIC_LVT_MASKED                 (1<<16)
+#define APIC_LVT_LEVEL_TRIGGER          (1<<15)
+#define APIC_LVT_REMOTE_IRR             (1<<14)
+#define APIC_INPUT_POLARITY             (1<<13)
+#define APIC_SEND_PENDING               (1<<12)
+
+#define ESR_ILLEGAL_ADDRESS (1 << 7)
+
+#define APIC_SV_DIRECTED_IO             (1<<12)
+#define APIC_SV_ENABLE                  (1<<8)
+
+#define MAX_APICS 255
+
+#define MSI_SPACE_SIZE                  0x100000
+
+typedef struct APICBackend APICBackend;
+typedef struct APICState APICState;
+
+struct APICBackend {
+    const char *name;
+    void (*init)(APICState *s);
+    void (*set_base)(APICState *s, uint64_t val);
+    void (*set_tpr)(APICState *s, uint8_t val);
+    void (*external_nmi)(APICState *s);
+
+    QSIMPLEQ_ENTRY(APICBackend) entry;
+};
+
+struct APICState {
+    SysBusDevice busdev;
+    MemoryRegion io_memory;
+    void *cpu_env;
+    uint32_t apicbase;
+    uint8_t id;
+    uint8_t arb_id;
+    uint8_t tpr;
+    uint32_t spurious_vec;
+    uint8_t log_dest;
+    uint8_t dest_mode;
+    uint32_t isr[8];  /* in service register */
+    uint32_t tmr[8];  /* trigger mode register */
+    uint32_t irr[8]; /* interrupt request register */
+    uint32_t lvt[APIC_LVT_NB];
+    uint32_t esr; /* error register */
+    uint32_t icr[2];
+
+    uint32_t divide_conf;
+    int count_shift;
+    uint32_t initial_count;
+    int64_t initial_count_load_time;
+    int64_t next_time;
+    int idx;
+    QEMUTimer *timer;
+    int sipi_vector;
+    int wait_for_sipi;
+
+    char *backend_name;
+    APICBackend *backend;
+};
+
+void apic_register_device(void);
+void apic_register_backend(APICBackend *backend);
+
+void apic_report_irq_delivered(int delivered);
+
+#endif /* !QEMU_APIC_INTERNAL_H */
diff --git a/hw/pc.c b/hw/pc.c
index 240aaae..ee6e59b 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -884,6 +884,7 @@ static DeviceState *apic_init(void *env, uint8_t apic_id)
     dev = qdev_create(NULL, "apic");
     qdev_prop_set_uint8(dev, "id", apic_id);
     qdev_prop_set_ptr(dev, "cpu_env", env);
+    qdev_prop_set_string(dev, "backend", g_strdup("QEMU"));
     qdev_init_nofail(dev);
     d = sysbus_from_qdev(dev);
 
-- 
1.7.3.4



* [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
@ 2011-12-15 12:33   ` Jan Kiszka
  0 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: Blue Swirl, Anthony Liguori, qemu-devel, kvm, Michael S. Tsirkin

The KVM in-kernel APIC model will reuse parts of the user space model
while providing the same frontend view to the guest and to most
management interfaces. Introduce an APIC backend concept to encapsulate
the parts that differ between the user space and the KVM model. The
backend offers callback hooks for init, base/TPR setting, and external
NMI delivery, which each model implements in its own way.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 Makefile.target    |    2 +-
 hw/apic.c          |  285 +++-------------------------------------------------
 hw/apic.h          |    1 -
 hw/apic_common.c   |  265 ++++++++++++++++++++++++++++++++++++++++++++++++
 hw/apic_internal.h |  119 ++++++++++++++++++++++
 hw/pc.c            |    1 +
 6 files changed, 401 insertions(+), 272 deletions(-)
 create mode 100644 hw/apic_common.c
 create mode 100644 hw/apic_internal.h

diff --git a/Makefile.target b/Makefile.target
index 1d24a30..c46f062 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -231,7 +231,7 @@ obj-$(CONFIG_IVSHMEM) += ivshmem.o
 # Hardware support
 obj-i386-y += vga.o
 obj-i386-y += mc146818rtc.o pc.o
-obj-i386-y += cirrus_vga.o sga.o apic.o ioapic.o piix_pci.o
+obj-i386-y += cirrus_vga.o sga.o apic_common.o apic.o ioapic.o piix_pci.o
 obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
diff --git a/hw/apic.c b/hw/apic.c
index bec493b..5fa3111 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -16,53 +16,13 @@
  * You should have received a copy of the GNU Lesser General Public
  * License along with this library; if not, see <http://www.gnu.org/licenses/>
  */
-#include "hw.h"
+#include "apic_internal.h"
 #include "apic.h"
 #include "ioapic.h"
-#include "qemu-timer.h"
 #include "host-utils.h"
-#include "sysbus.h"
 #include "trace.h"
 #include "pc.h"
 
-/* APIC Local Vector Table */
-#define APIC_LVT_TIMER   0
-#define APIC_LVT_THERMAL 1
-#define APIC_LVT_PERFORM 2
-#define APIC_LVT_LINT0   3
-#define APIC_LVT_LINT1   4
-#define APIC_LVT_ERROR   5
-#define APIC_LVT_NB      6
-
-/* APIC delivery modes */
-#define APIC_DM_FIXED	0
-#define APIC_DM_LOWPRI	1
-#define APIC_DM_SMI	2
-#define APIC_DM_NMI	4
-#define APIC_DM_INIT	5
-#define APIC_DM_SIPI	6
-#define APIC_DM_EXTINT	7
-
-/* APIC destination mode */
-#define APIC_DESTMODE_FLAT	0xf
-#define APIC_DESTMODE_CLUSTER	1
-
-#define APIC_TRIGGER_EDGE  0
-#define APIC_TRIGGER_LEVEL 1
-
-#define	APIC_LVT_TIMER_PERIODIC		(1<<17)
-#define	APIC_LVT_MASKED			(1<<16)
-#define	APIC_LVT_LEVEL_TRIGGER		(1<<15)
-#define	APIC_LVT_REMOTE_IRR		(1<<14)
-#define	APIC_INPUT_POLARITY		(1<<13)
-#define	APIC_SEND_PENDING		(1<<12)
-
-#define ESR_ILLEGAL_ADDRESS (1 << 7)
-
-#define APIC_SV_DIRECTED_IO             (1<<12)
-#define APIC_SV_ENABLE                  (1<<8)
-
-#define MAX_APICS 255
 #define MAX_APIC_WORDS 8
 
 /* Intel APIC constants: from include/asm/msidef.h */
@@ -75,40 +35,7 @@
 #define MSI_ADDR_DEST_ID_SHIFT		12
 #define	MSI_ADDR_DEST_ID_MASK		0x00ffff0
 
-#define MSI_ADDR_SIZE                   0x100000
-
-typedef struct APICState APICState;
-
-struct APICState {
-    SysBusDevice busdev;
-    MemoryRegion io_memory;
-    void *cpu_env;
-    uint32_t apicbase;
-    uint8_t id;
-    uint8_t arb_id;
-    uint8_t tpr;
-    uint32_t spurious_vec;
-    uint8_t log_dest;
-    uint8_t dest_mode;
-    uint32_t isr[8];  /* in service register */
-    uint32_t tmr[8];  /* trigger mode register */
-    uint32_t irr[8]; /* interrupt request register */
-    uint32_t lvt[APIC_LVT_NB];
-    uint32_t esr; /* error register */
-    uint32_t icr[2];
-
-    uint32_t divide_conf;
-    int count_shift;
-    uint32_t initial_count;
-    int64_t initial_count_load_time, next_time;
-    uint32_t idx;
-    QEMUTimer *timer;
-    int sipi_vector;
-    int wait_for_sipi;
-};
-
 static APICState *local_apics[MAX_APICS + 1];
-static int apic_irq_delivered;
 
 static void apic_set_irq(APICState *s, int vector_num, int trigger_mode);
 static void apic_update_irq(APICState *s);
@@ -205,10 +132,8 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
     }
 }
 
-void apic_deliver_nmi(DeviceState *d)
+static void apic_external_nmi(APICState *s)
 {
-    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
-
     apic_local_deliver(s, APIC_LVT_LINT1);
 }
 
@@ -300,14 +225,8 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode, uint8_t delivery_mode,
     apic_bus_deliver(deliver_bitmask, delivery_mode, vector_num, trigger_mode);
 }
 
-void cpu_set_apic_base(DeviceState *d, uint64_t val)
+static void apic_set_base(APICState *s, uint64_t val)
 {
-    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
-
-    trace_cpu_set_apic_base(val);
-
-    if (!s)
-        return;
     s->apicbase = (val & 0xfffff000) |
         (s->apicbase & (MSR_IA32_APICBASE_BSP | MSR_IA32_APICBASE_ENABLE));
     /* if disabled, cannot be enabled again */
@@ -318,32 +237,12 @@ void cpu_set_apic_base(DeviceState *d, uint64_t val)
     }
 }
 
-uint64_t cpu_get_apic_base(DeviceState *d)
+static void apic_set_tpr(APICState *s, uint8_t val)
 {
-    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
-
-    trace_cpu_get_apic_base(s ? (uint64_t)s->apicbase: 0);
-
-    return s ? s->apicbase : 0;
-}
-
-void cpu_set_apic_tpr(DeviceState *d, uint8_t val)
-{
-    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
-
-    if (!s)
-        return;
     s->tpr = (val & 0x0f) << 4;
     apic_update_irq(s);
 }
 
-uint8_t cpu_get_apic_tpr(DeviceState *d)
-{
-    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
-
-    return s ? s->tpr >> 4 : 0;
-}
-
 /* return -1 if no bit is set */
 static int get_highest_priority_int(uint32_t *tab)
 {
@@ -413,27 +312,6 @@ static void apic_update_irq(APICState *s)
     }
 }
 
-void apic_report_irq_delivered(int delivered)
-{
-    apic_irq_delivered += delivered;
-
-    trace_apic_report_irq_delivered(apic_irq_delivered);
-}
-
-void apic_reset_irq_delivered(void)
-{
-    trace_apic_reset_irq_delivered(apic_irq_delivered);
-
-    apic_irq_delivered = 0;
-}
-
-int apic_get_irq_delivered(void)
-{
-    trace_apic_get_irq_delivered(apic_irq_delivered);
-
-    return apic_irq_delivered;
-}
-
 static void apic_set_irq(APICState *s, int vector_num, int trigger_mode)
 {
     apic_report_irq_delivered(!get_bit(s->irr, vector_num));
@@ -515,35 +393,6 @@ static void apic_get_delivery_bitmask(uint32_t *deliver_bitmask,
     }
 }
 
-void apic_init_reset(DeviceState *d)
-{
-    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
-    int i;
-
-    if (!s)
-        return;
-
-    s->tpr = 0;
-    s->spurious_vec = 0xff;
-    s->log_dest = 0;
-    s->dest_mode = 0xf;
-    memset(s->isr, 0, sizeof(s->isr));
-    memset(s->tmr, 0, sizeof(s->tmr));
-    memset(s->irr, 0, sizeof(s->irr));
-    for(i = 0; i < APIC_LVT_NB; i++)
-        s->lvt[i] = 1 << 16; /* mask LVT */
-    s->esr = 0;
-    memset(s->icr, 0, sizeof(s->icr));
-    s->divide_conf = 0;
-    s->count_shift = 0;
-    s->initial_count = 0;
-    s->initial_count_load_time = 0;
-    s->next_time = 0;
-    s->wait_for_sipi = 1;
-
-    qemu_del_timer(s->timer);
-}
-
 static void apic_startup(APICState *s, int vector_num)
 {
     s->sipi_vector = vector_num;
@@ -904,96 +753,6 @@ static void apic_mem_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
     }
 }
 
-/* This function is only used for old state version 1 and 2 */
-static int apic_load_old(QEMUFile *f, void *opaque, int version_id)
-{
-    APICState *s = opaque;
-    int i;
-
-    if (version_id > 2)
-        return -EINVAL;
-
-    /* XXX: what if the base changes? (registered memory regions) */
-    qemu_get_be32s(f, &s->apicbase);
-    qemu_get_8s(f, &s->id);
-    qemu_get_8s(f, &s->arb_id);
-    qemu_get_8s(f, &s->tpr);
-    qemu_get_be32s(f, &s->spurious_vec);
-    qemu_get_8s(f, &s->log_dest);
-    qemu_get_8s(f, &s->dest_mode);
-    for (i = 0; i < 8; i++) {
-        qemu_get_be32s(f, &s->isr[i]);
-        qemu_get_be32s(f, &s->tmr[i]);
-        qemu_get_be32s(f, &s->irr[i]);
-    }
-    for (i = 0; i < APIC_LVT_NB; i++) {
-        qemu_get_be32s(f, &s->lvt[i]);
-    }
-    qemu_get_be32s(f, &s->esr);
-    qemu_get_be32s(f, &s->icr[0]);
-    qemu_get_be32s(f, &s->icr[1]);
-    qemu_get_be32s(f, &s->divide_conf);
-    s->count_shift=qemu_get_be32(f);
-    qemu_get_be32s(f, &s->initial_count);
-    s->initial_count_load_time=qemu_get_be64(f);
-    s->next_time=qemu_get_be64(f);
-
-    if (version_id >= 2)
-        qemu_get_timer(f, s->timer);
-    return 0;
-}
-
-static const VMStateDescription vmstate_apic = {
-    .name = "apic",
-    .version_id = 3,
-    .minimum_version_id = 3,
-    .minimum_version_id_old = 1,
-    .load_state_old = apic_load_old,
-    .fields      = (VMStateField []) {
-        VMSTATE_UINT32(apicbase, APICState),
-        VMSTATE_UINT8(id, APICState),
-        VMSTATE_UINT8(arb_id, APICState),
-        VMSTATE_UINT8(tpr, APICState),
-        VMSTATE_UINT32(spurious_vec, APICState),
-        VMSTATE_UINT8(log_dest, APICState),
-        VMSTATE_UINT8(dest_mode, APICState),
-        VMSTATE_UINT32_ARRAY(isr, APICState, 8),
-        VMSTATE_UINT32_ARRAY(tmr, APICState, 8),
-        VMSTATE_UINT32_ARRAY(irr, APICState, 8),
-        VMSTATE_UINT32_ARRAY(lvt, APICState, APIC_LVT_NB),
-        VMSTATE_UINT32(esr, APICState),
-        VMSTATE_UINT32_ARRAY(icr, APICState, 2),
-        VMSTATE_UINT32(divide_conf, APICState),
-        VMSTATE_INT32(count_shift, APICState),
-        VMSTATE_UINT32(initial_count, APICState),
-        VMSTATE_INT64(initial_count_load_time, APICState),
-        VMSTATE_INT64(next_time, APICState),
-        VMSTATE_TIMER(timer, APICState),
-        VMSTATE_END_OF_LIST()
-    }
-};
-
-static void apic_reset(DeviceState *d)
-{
-    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
-    int bsp;
-
-    bsp = cpu_is_bsp(s->cpu_env);
-    s->apicbase = 0xfee00000 |
-        (bsp ? MSR_IA32_APICBASE_BSP : 0) | MSR_IA32_APICBASE_ENABLE;
-
-    apic_init_reset(d);
-
-    if (bsp) {
-        /*
-         * LINT0 delivery mode on CPU #0 is set to ExtInt at initialization
-         * time typically by BIOS, so PIC interrupt can be delivered to the
-         * processor when local APIC is enabled.
-         */
-        s->lvt[APIC_LVT_LINT0] = 0x700;
-    }
-}
-
 static const MemoryRegionOps apic_io_ops = {
     .old_mmio = {
         .read = { apic_mem_readb, apic_mem_readw, apic_mem_readl, },
@@ -1002,41 +761,27 @@ static const MemoryRegionOps apic_io_ops = {
     .endianness = DEVICE_NATIVE_ENDIAN,
 };
 
-static int apic_init1(SysBusDevice *dev)
+static void apic_backend_init(APICState *s)
 {
-    APICState *s = FROM_SYSBUS(APICState, dev);
-    static int last_apic_idx;
-
-    if (last_apic_idx >= MAX_APICS) {
-        return -1;
-    }
-    memory_region_init_io(&s->io_memory, &apic_io_ops, s, "apic",
-                          MSI_ADDR_SIZE);
-    sysbus_init_mmio(dev, &s->io_memory);
+    memory_region_init_io(&s->io_memory, &apic_io_ops, s, "apic-msi",
+                          MSI_SPACE_SIZE);
 
     s->timer = qemu_new_timer_ns(vm_clock, apic_timer, s);
-    s->idx = last_apic_idx++;
     local_apics[s->idx] = s;
-    return 0;
 }
 
-static SysBusDeviceInfo apic_info = {
-    .init = apic_init1,
-    .qdev.name = "apic",
-    .qdev.size = sizeof(APICState),
-    .qdev.vmsd = &vmstate_apic,
-    .qdev.reset = apic_reset,
-    .qdev.no_user = 1,
-    .qdev.props = (Property[]) {
-        DEFINE_PROP_UINT8("id", APICState, id, -1),
-        DEFINE_PROP_PTR("cpu_env", APICState, cpu_env),
-        DEFINE_PROP_END_OF_LIST(),
-    }
+static APICBackend apic_backend = {
+    .name = "QEMU",
+    .init = apic_backend_init,
+    .set_base = apic_set_base,
+    .set_tpr = apic_set_tpr,
+    .external_nmi = apic_external_nmi,
 };
 
 static void apic_register_devices(void)
 {
-    sysbus_register_withprop(&apic_info);
+    apic_register_device();
+    apic_register_backend(&apic_backend);
 }
 
 device_init(apic_register_devices)
diff --git a/hw/apic.h b/hw/apic.h
index 8173d8a..a62d83b 100644
--- a/hw/apic.h
+++ b/hw/apic.h
@@ -10,7 +10,6 @@ int apic_accept_pic_intr(DeviceState *s);
 void apic_deliver_pic_intr(DeviceState *s, int level);
 void apic_deliver_nmi(DeviceState *d);
 int apic_get_interrupt(DeviceState *s);
-void apic_report_irq_delivered(int delivered);
 void apic_reset_irq_delivered(void);
 int apic_get_irq_delivered(void);
 void cpu_set_apic_base(DeviceState *s, uint64_t val);
diff --git a/hw/apic_common.c b/hw/apic_common.c
new file mode 100644
index 0000000..4cdc45c
--- /dev/null
+++ b/hw/apic_common.c
@@ -0,0 +1,265 @@
+/*
+ *  APIC support - common bits of emulated and KVM kernel model
+ *
+ *  Copyright (c) 2004-2005 Fabrice Bellard
+ *  Copyright (c) 2011      Jan Kiszka, Siemens AG
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+#include "apic.h"
+#include "apic_internal.h"
+#include "trace.h"
+
+static QSIMPLEQ_HEAD(, APICBackend) backends =
+    QSIMPLEQ_HEAD_INITIALIZER(backends);
+static int apic_irq_delivered;
+
+void cpu_set_apic_base(DeviceState *d, uint64_t val)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+
+    trace_cpu_set_apic_base(val);
+
+    if (s) {
+        s->backend->set_base(s, val);
+    }
+}
+
+uint64_t cpu_get_apic_base(DeviceState *d)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+
+    trace_cpu_get_apic_base(s ? (uint64_t)s->apicbase : 0);
+
+    return s ? s->apicbase : 0;
+}
+
+void cpu_set_apic_tpr(DeviceState *d, uint8_t val)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+
+    if (s) {
+        s->backend->set_tpr(s, val);
+    }
+}
+
+uint8_t cpu_get_apic_tpr(DeviceState *d)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+
+    return s ? s->tpr >> 4 : 0;
+}
+
+void apic_report_irq_delivered(int delivered)
+{
+    apic_irq_delivered += delivered;
+
+    trace_apic_report_irq_delivered(apic_irq_delivered);
+}
+
+void apic_reset_irq_delivered(void)
+{
+    trace_apic_reset_irq_delivered(apic_irq_delivered);
+
+    apic_irq_delivered = 0;
+}
+
+int apic_get_irq_delivered(void)
+{
+    trace_apic_get_irq_delivered(apic_irq_delivered);
+
+    return apic_irq_delivered;
+}
+
+void apic_deliver_nmi(DeviceState *d)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+
+    s->backend->external_nmi(s);
+}
+
+void apic_init_reset(DeviceState *d)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+    int i;
+
+    if (!s) {
+        return;
+    }
+    s->tpr = 0;
+    s->spurious_vec = 0xff;
+    s->log_dest = 0;
+    s->dest_mode = 0xf;
+    memset(s->isr, 0, sizeof(s->isr));
+    memset(s->tmr, 0, sizeof(s->tmr));
+    memset(s->irr, 0, sizeof(s->irr));
+    for (i = 0; i < APIC_LVT_NB; i++) {
+        s->lvt[i] = APIC_LVT_MASKED;
+    }
+    s->esr = 0;
+    memset(s->icr, 0, sizeof(s->icr));
+    s->divide_conf = 0;
+    s->count_shift = 0;
+    s->initial_count = 0;
+    s->initial_count_load_time = 0;
+    s->next_time = 0;
+    s->wait_for_sipi = 1;
+
+    qemu_del_timer(s->timer);
+}
+
+static void apic_reset(DeviceState *d)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+    bool bsp;
+
+    bsp = cpu_is_bsp(s->cpu_env);
+    s->apicbase = 0xfee00000 |
+        (bsp ? MSR_IA32_APICBASE_BSP : 0) | MSR_IA32_APICBASE_ENABLE;
+
+    apic_init_reset(d);
+
+    if (bsp) {
+        /*
+         * LINT0 delivery mode on CPU #0 is set to ExtInt at initialization
+         * time typically by BIOS, so PIC interrupt can be delivered to the
+         * processor when local APIC is enabled.
+         */
+        s->lvt[APIC_LVT_LINT0] = 0x700;
+    }
+}
+
+/* This function is only used for old state version 1 and 2 */
+static int apic_load_old(QEMUFile *f, void *opaque, int version_id)
+{
+    APICState *s = opaque;
+    int i;
+
+    if (version_id > 2) {
+        return -EINVAL;
+    }
+
+    /* XXX: what if the base changes? (registered memory regions) */
+    qemu_get_be32s(f, &s->apicbase);
+    qemu_get_8s(f, &s->id);
+    qemu_get_8s(f, &s->arb_id);
+    qemu_get_8s(f, &s->tpr);
+    qemu_get_be32s(f, &s->spurious_vec);
+    qemu_get_8s(f, &s->log_dest);
+    qemu_get_8s(f, &s->dest_mode);
+    for (i = 0; i < 8; i++) {
+        qemu_get_be32s(f, &s->isr[i]);
+        qemu_get_be32s(f, &s->tmr[i]);
+        qemu_get_be32s(f, &s->irr[i]);
+    }
+    for (i = 0; i < APIC_LVT_NB; i++) {
+        qemu_get_be32s(f, &s->lvt[i]);
+    }
+    qemu_get_be32s(f, &s->esr);
+    qemu_get_be32s(f, &s->icr[0]);
+    qemu_get_be32s(f, &s->icr[1]);
+    qemu_get_be32s(f, &s->divide_conf);
+    s->count_shift = qemu_get_be32(f);
+    qemu_get_be32s(f, &s->initial_count);
+    s->initial_count_load_time = qemu_get_be64(f);
+    s->next_time = qemu_get_be64(f);
+
+    if (version_id >= 2) {
+        qemu_get_timer(f, s->timer);
+    }
+    return 0;
+}
+
+static const VMStateDescription vmstate_apic = {
+    .name = "apic",
+    .version_id = 3,
+    .minimum_version_id = 3,
+    .minimum_version_id_old = 1,
+    .load_state_old = apic_load_old,
+    .fields      = (VMStateField[]) {
+        VMSTATE_UINT32(apicbase, APICState),
+        VMSTATE_UINT8(id, APICState),
+        VMSTATE_UINT8(arb_id, APICState),
+        VMSTATE_UINT8(tpr, APICState),
+        VMSTATE_UINT32(spurious_vec, APICState),
+        VMSTATE_UINT8(log_dest, APICState),
+        VMSTATE_UINT8(dest_mode, APICState),
+        VMSTATE_UINT32_ARRAY(isr, APICState, 8),
+        VMSTATE_UINT32_ARRAY(tmr, APICState, 8),
+        VMSTATE_UINT32_ARRAY(irr, APICState, 8),
+        VMSTATE_UINT32_ARRAY(lvt, APICState, APIC_LVT_NB),
+        VMSTATE_UINT32(esr, APICState),
+        VMSTATE_UINT32_ARRAY(icr, APICState, 2),
+        VMSTATE_UINT32(divide_conf, APICState),
+        VMSTATE_INT32(count_shift, APICState),
+        VMSTATE_UINT32(initial_count, APICState),
+        VMSTATE_INT64(initial_count_load_time, APICState),
+        VMSTATE_INT64(next_time, APICState),
+        VMSTATE_TIMER(timer, APICState),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static int apic_init(SysBusDevice *dev)
+{
+    APICState *s = FROM_SYSBUS(APICState, dev);
+    static int apic_no;
+    APICBackend *b;
+
+    if (apic_no >= MAX_APICS) {
+        return -1;
+    }
+    s->idx = apic_no++;
+
+    QSIMPLEQ_FOREACH(b, &backends, entry) {
+        if (strcmp(b->name, s->backend_name) == 0) {
+            s->backend = b;
+            break;
+        }
+    }
+    if (!s->backend) {
+        hw_error("APIC backend '%s' not found!", s->backend_name);
+        exit(1);
+    }
+
+    b->init(s);
+
+    sysbus_init_mmio(&s->busdev, &s->io_memory);
+    return 0;
+}
+
+static SysBusDeviceInfo apic_info = {
+    .init = apic_init,
+    .qdev.name = "apic",
+    .qdev.size = sizeof(APICState),
+    .qdev.vmsd = &vmstate_apic,
+    .qdev.reset = apic_reset,
+    .qdev.no_user = 1,
+    .qdev.props = (Property[]) {
+        DEFINE_PROP_UINT8("id", APICState, id, -1),
+        DEFINE_PROP_PTR("cpu_env", APICState, cpu_env),
+        DEFINE_PROP_STRING("backend", APICState, backend_name),
+        DEFINE_PROP_END_OF_LIST(),
+    }
+};
+
+void apic_register_backend(APICBackend *backend)
+{
+    QSIMPLEQ_INSERT_TAIL(&backends, backend, entry);
+}
+
+void apic_register_device(void)
+{
+    sysbus_register_withprop(&apic_info);
+}
diff --git a/hw/apic_internal.h b/hw/apic_internal.h
new file mode 100644
index 0000000..6cbd901
--- /dev/null
+++ b/hw/apic_internal.h
@@ -0,0 +1,119 @@
+/*
+ *  APIC support - internal interfaces
+ *
+ *  Copyright (c) 2004-2005 Fabrice Bellard
+ *  Copyright (c) 2011      Jan Kiszka, Siemens AG
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+#ifndef QEMU_APIC_INTERNAL_H
+#define QEMU_APIC_INTERNAL_H
+
+#include "memory.h"
+#include "sysbus.h"
+#include "qemu-timer.h"
+#include "qemu-queue.h"
+
+/* APIC Local Vector Table */
+#define APIC_LVT_TIMER                  0
+#define APIC_LVT_THERMAL                1
+#define APIC_LVT_PERFORM                2
+#define APIC_LVT_LINT0                  3
+#define APIC_LVT_LINT1                  4
+#define APIC_LVT_ERROR                  5
+#define APIC_LVT_NB                     6
+
+/* APIC delivery modes */
+#define APIC_DM_FIXED                   0
+#define APIC_DM_LOWPRI                  1
+#define APIC_DM_SMI                     2
+#define APIC_DM_NMI                     4
+#define APIC_DM_INIT                    5
+#define APIC_DM_SIPI                    6
+#define APIC_DM_EXTINT                  7
+
+/* APIC destination mode */
+#define APIC_DESTMODE_FLAT              0xf
+#define APIC_DESTMODE_CLUSTER           1
+
+#define APIC_TRIGGER_EDGE               0
+#define APIC_TRIGGER_LEVEL              1
+
+#define APIC_LVT_TIMER_PERIODIC         (1<<17)
+#define APIC_LVT_MASKED                 (1<<16)
+#define APIC_LVT_LEVEL_TRIGGER          (1<<15)
+#define APIC_LVT_REMOTE_IRR             (1<<14)
+#define APIC_INPUT_POLARITY             (1<<13)
+#define APIC_SEND_PENDING               (1<<12)
+
+#define ESR_ILLEGAL_ADDRESS (1 << 7)
+
+#define APIC_SV_DIRECTED_IO             (1<<12)
+#define APIC_SV_ENABLE                  (1<<8)
+
+#define MAX_APICS 255
+
+#define MSI_SPACE_SIZE                  0x100000
+
+typedef struct APICBackend APICBackend;
+typedef struct APICState APICState;
+
+struct APICBackend {
+    const char *name;
+    void (*init)(APICState *s);
+    void (*set_base)(APICState *s, uint64_t val);
+    void (*set_tpr)(APICState *s, uint8_t val);
+    void (*external_nmi)(APICState *s);
+
+    QSIMPLEQ_ENTRY(APICBackend) entry;
+};
+
+struct APICState {
+    SysBusDevice busdev;
+    MemoryRegion io_memory;
+    void *cpu_env;
+    uint32_t apicbase;
+    uint8_t id;
+    uint8_t arb_id;
+    uint8_t tpr;
+    uint32_t spurious_vec;
+    uint8_t log_dest;
+    uint8_t dest_mode;
+    uint32_t isr[8];  /* in service register */
+    uint32_t tmr[8];  /* trigger mode register */
+    uint32_t irr[8]; /* interrupt request register */
+    uint32_t lvt[APIC_LVT_NB];
+    uint32_t esr; /* error register */
+    uint32_t icr[2];
+
+    uint32_t divide_conf;
+    int count_shift;
+    uint32_t initial_count;
+    int64_t initial_count_load_time;
+    int64_t next_time;
+    int idx;
+    QEMUTimer *timer;
+    int sipi_vector;
+    int wait_for_sipi;
+
+    char *backend_name;
+    APICBackend *backend;
+};
+
+void apic_register_device(void);
+void apic_register_backend(APICBackend *backend);
+
+void apic_report_irq_delivered(int delivered);
+
+#endif /* !QEMU_APIC_INTERNAL_H */
diff --git a/hw/pc.c b/hw/pc.c
index 240aaae..ee6e59b 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -884,6 +884,7 @@ static DeviceState *apic_init(void *env, uint8_t apic_id)
     dev = qdev_create(NULL, "apic");
     qdev_prop_set_uint8(dev, "id", apic_id);
     qdev_prop_set_ptr(dev, "cpu_env", env);
+    qdev_prop_set_string(dev, "backend", g_strdup("QEMU"));
     qdev_init_nofail(dev);
     d = sysbus_from_qdev(dev);
 
-- 
1.7.3.4


* [PATCH v5 07/16] apic: Open-code timer save/restore
  2011-12-15 12:33 ` [Qemu-devel] " Jan Kiszka
@ 2011-12-15 12:33   ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, qemu-devel, Anthony Liguori, Michael S. Tsirkin, Blue Swirl

To enable migration between accelerated and non-accelerated APIC models,
we need to handle timer saving and restoring specially and can no longer
rely on the automatic handling provided by VMSTATE_TIMER. Specifically,
the accelerated model will not start any QEMUTimer.

This patch therefore factors out the generic bits into apic_next_timer
and introduces a post-load callback that each model can implement
differently.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
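For readers who want to see the expiry arithmetic in isolation, here is a
minimal standalone sketch. It assumes a hypothetical TimerModel struct
that stands in for the timer-related fields of APICState, and a
hypothetical timer_model_next() that mirrors the computation in
apic_next_timer. The bit values are copied from the APIC_LVT_* defines;
everything else is illustrative and not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions copied from the APIC_LVT_* defines in apic_internal.h. */
#define LVT_TIMER_PERIODIC (1u << 17)
#define LVT_MASKED         (1u << 16)

/* Hypothetical stand-in for the timer-related fields of APICState. */
typedef struct {
    uint32_t lvt_timer;
    uint32_t initial_count;
    int      count_shift;
    int64_t  initial_count_load_time;
    int64_t  next_time;
    int64_t  timer_expiry;   /* -1 means "no timer pending" */
} TimerModel;

/*
 * Compute the next expiry and record it as plain state rather than in a
 * QEMUTimer. Returns true if a timer should be armed.
 */
static bool timer_model_next(TimerModel *t, int64_t current_time)
{
    uint64_t period = (uint64_t)t->initial_count + 1;
    int64_t d;

    /* Guard against an undefined shift width in this sketch. */
    assert(t->count_shift >= 0 && t->count_shift < 63);

    t->timer_expiry = -1;

    if (t->lvt_timer & LVT_MASKED) {
        return false;
    }

    /* Elapsed timer ticks since the initial count was loaded. */
    d = (current_time - t->initial_count_load_time) >> t->count_shift;

    if (t->lvt_timer & LVT_TIMER_PERIODIC) {
        if (!t->initial_count) {
            return false;
        }
        /* Round up to the start of the next full period. */
        d = (int64_t)(((uint64_t)d / period + 1) * period);
    } else {
        if (d >= t->initial_count) {
            return false;   /* one-shot timer has already fired */
        }
        d = (int64_t)period;
    }

    t->next_time = t->initial_count_load_time + (d << t->count_shift);
    t->timer_expiry = t->next_time;
    return true;
}
```

A post-load hook can then decide what to do from timer_expiry alone: a
value of -1 means the timer is deleted, any other value re-arms it, as
apic_post_load does in this patch.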
 hw/apic.c          |   30 ++++++++++++------------------
 hw/apic_common.c   |   51 +++++++++++++++++++++++++++++++++++++++++++++++++--
 hw/apic_internal.h |    3 +++
 3 files changed, 64 insertions(+), 20 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index 5fa3111..36d4ff3 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -521,25 +521,9 @@ static uint32_t apic_get_current_count(APICState *s)
 
 static void apic_timer_update(APICState *s, int64_t current_time)
 {
-    int64_t next_time, d;
-
-    if (!(s->lvt[APIC_LVT_TIMER] & APIC_LVT_MASKED)) {
-        d = (current_time - s->initial_count_load_time) >>
-            s->count_shift;
-        if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_TIMER_PERIODIC) {
-            if (!s->initial_count)
-                goto no_timer;
-            d = ((d / ((uint64_t)s->initial_count + 1)) + 1) * ((uint64_t)s->initial_count + 1);
-        } else {
-            if (d >= s->initial_count)
-                goto no_timer;
-            d = (uint64_t)s->initial_count + 1;
-        }
-        next_time = s->initial_count_load_time + (d << s->count_shift);
-        qemu_mod_timer(s->timer, next_time);
-        s->next_time = next_time;
+    if (apic_next_timer(s, current_time)) {
+        qemu_mod_timer(s->timer, s->next_time);
     } else {
-    no_timer:
         qemu_del_timer(s->timer);
     }
 }
@@ -770,12 +754,22 @@ static void apic_backend_init(APICState *s)
     local_apics[s->idx] = s;
 }
 
+static void apic_post_load(APICState *s)
+{
+    if (s->timer_expiry != -1) {
+        qemu_mod_timer(s->timer, s->timer_expiry);
+    } else {
+        qemu_del_timer(s->timer);
+    }
+}
+
 static APICBackend apic_backend = {
     .name = "QEMU",
     .init = apic_backend_init,
     .set_base = apic_set_base,
     .set_tpr = apic_set_tpr,
     .external_nmi = apic_external_nmi,
+    .post_load = apic_post_load,
 };
 
 static void apic_register_devices(void)
diff --git a/hw/apic_common.c b/hw/apic_common.c
index 4cdc45c..3d345b9 100644
--- a/hw/apic_common.c
+++ b/hw/apic_common.c
@@ -89,6 +89,39 @@ void apic_deliver_nmi(DeviceState *d)
     s->backend->external_nmi(s);
 }
 
+bool apic_next_timer(APICState *s, int64_t current_time)
+{
+    int64_t d;
+
+    /* We need to store the timer state separately to support APIC
+     * implementations that maintain a non-QEMU timer, e.g. inside the
+     * host kernel. This open-coded state allows us to migrate between
+     * both models. */
+    s->timer_expiry = -1;
+
+    if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_MASKED) {
+        return false;
+    }
+
+    d = (current_time - s->initial_count_load_time) >> s->count_shift;
+
+    if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_TIMER_PERIODIC) {
+        if (!s->initial_count) {
+            return false;
+        }
+        d = ((d / ((uint64_t)s->initial_count + 1)) + 1) *
+            ((uint64_t)s->initial_count + 1);
+    } else {
+        if (d >= s->initial_count) {
+            return false;
+        }
+        d = (uint64_t)s->initial_count + 1;
+    }
+    s->next_time = s->initial_count_load_time + (d << s->count_shift);
+    s->timer_expiry = s->next_time;
+    return true;
+}
+
 void apic_init_reset(DeviceState *d)
 {
     APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
@@ -116,7 +149,10 @@ void apic_init_reset(DeviceState *d)
     s->next_time = 0;
     s->wait_for_sipi = 1;
 
-    qemu_del_timer(s->timer);
+    if (s->timer) {
+        qemu_del_timer(s->timer);
+    }
+    s->timer_expiry = -1;
 }
 
 static void apic_reset(DeviceState *d)
@@ -181,12 +217,23 @@ static int apic_load_old(QEMUFile *f, void *opaque, int version_id)
     return 0;
 }
 
+static int apic_dispatch_post_load(void *opaque, int version_id)
+{
+    APICState *s = opaque;
+
+    if (s->backend->post_load) {
+        s->backend->post_load(s);
+    }
+    return 0;
+}
+
 static const VMStateDescription vmstate_apic = {
     .name = "apic",
     .version_id = 3,
     .minimum_version_id = 3,
     .minimum_version_id_old = 1,
     .load_state_old = apic_load_old,
+    .post_load = apic_dispatch_post_load,
     .fields      = (VMStateField[]) {
         VMSTATE_UINT32(apicbase, APICState),
         VMSTATE_UINT8(id, APICState),
@@ -206,7 +253,7 @@ static const VMStateDescription vmstate_apic = {
         VMSTATE_UINT32(initial_count, APICState),
         VMSTATE_INT64(initial_count_load_time, APICState),
         VMSTATE_INT64(next_time, APICState),
-        VMSTATE_TIMER(timer, APICState),
+        VMSTATE_INT64(timer_expiry, APICState), /* open-coded timer state */
         VMSTATE_END_OF_LIST()
     }
 };
diff --git a/hw/apic_internal.h b/hw/apic_internal.h
index 6cbd901..898c93c 100644
--- a/hw/apic_internal.h
+++ b/hw/apic_internal.h
@@ -75,6 +75,7 @@ struct APICBackend {
     void (*set_base)(APICState *s, uint64_t val);
     void (*set_tpr)(APICState *s, uint8_t val);
     void (*external_nmi)(APICState *s);
+    void (*post_load)(APICState *s);
 
     QSIMPLEQ_ENTRY(APICBackend) entry;
 };
@@ -104,6 +105,7 @@ struct APICState {
     int64_t next_time;
     int idx;
     QEMUTimer *timer;
+    int64_t timer_expiry;
     int sipi_vector;
     int wait_for_sipi;
 
@@ -114,6 +116,7 @@ struct APICState {
 void apic_register_device(void);
 void apic_register_backend(APICBackend *backend);
 
+bool apic_next_timer(APICState *s, int64_t current_time);
 void apic_report_irq_delivered(int delivered);
 
 #endif /* !QEMU_APIC_INTERNAL_H */
-- 
1.7.3.4



* [PATCH v5 08/16] i8259: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-15 12:33 ` [Qemu-devel] " Jan Kiszka
@ 2011-12-15 12:33   ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, qemu-devel, Anthony Liguori, Michael S. Tsirkin, Blue Swirl

As with the APIC, we will reuse parts of the user-space i8259 model for
KVM. Again, we create a PIC backend infrastructure and provide hooks for
init, reset, and vmstate load/save. This also introduces a common helper
for instantiating a single i8259 chip from the cascade-creating
i8259_init function.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
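The backend pattern described above can be sketched in plain C as
follows. All names here (PicModel, PicBackendSketch, the sketch_*
functions and the "demo" backend) are hypothetical and chosen for
illustration only; the real structures live in hw/i8259_internal.h and
may differ. A simple linked list is used in place of QEMU's queue
macros to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct PicModel PicModel;                  /* hypothetical device state */
typedef struct PicBackendSketch PicBackendSketch;  /* hypothetical backend ops */

struct PicBackendSketch {
    const char *name;
    void (*init)(PicModel *s);
    void (*reset)(PicModel *s);      /* optional, may be NULL */
    void (*pre_save)(PicModel *s);   /* optional, may be NULL */
    void (*post_load)(PicModel *s);  /* optional, may be NULL */
    PicBackendSketch *next;
};

struct PicModel {
    unsigned char irr, imr, isr;
    const PicBackendSketch *backend;
};

static PicBackendSketch *backend_list;

/* Add a backend to the registry so a device can select it by name. */
static void sketch_register_backend(PicBackendSketch *b)
{
    assert(b && b->name);
    b->next = backend_list;
    backend_list = b;
}

/* Look up a registered backend by name; returns NULL if none matches. */
static const PicBackendSketch *sketch_find_backend(const char *name)
{
    const PicBackendSketch *b;

    for (b = backend_list; b; b = b->next) {
        if (strcmp(b->name, name) == 0) {
            return b;
        }
    }
    return NULL;
}

/* Common reset path: clear shared state, then let the backend react. */
static void sketch_reset(PicModel *s)
{
    s->irr = 0;
    s->imr = 0;
    s->isr = 0;
    if (s->backend && s->backend->reset) {
        s->backend->reset(s);
    }
}

/* A trivial example backend used to exercise the hooks. */
static int demo_reset_calls;

static void demo_init(PicModel *s)
{
    s->imr = 0xff;
}

static void demo_reset(PicModel *s)
{
    (void)s;
    demo_reset_calls++;
}

static PicBackendSketch demo_backend = {
    .name = "demo",
    .init = demo_init,
    .reset = demo_reset,
};
```

The point of the split is that the shared state layout and the generic
reset and vmstate code live in one place, while each model (user space
or in-kernel) supplies only the hooks where its behavior differs.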
 Makefile.objs       |    2 +-
 hw/i8259.c          |  127 +++++---------------------------------
 hw/i8259_common.c   |  173 +++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/i8259_internal.h |   82 ++++++++++++++++++++++++
 4 files changed, 271 insertions(+), 113 deletions(-)
 create mode 100644 hw/i8259_common.c
 create mode 100644 hw/i8259_internal.h

diff --git a/Makefile.objs b/Makefile.objs
index d7a6539..72d8ee7 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -221,7 +221,7 @@ hw-obj-$(CONFIG_APPLESMC) += applesmc.o
 hw-obj-$(CONFIG_SMARTCARD) += usb-ccid.o ccid-card-passthru.o
 hw-obj-$(CONFIG_SMARTCARD_NSS) += ccid-card-emulated.o
 hw-obj-$(CONFIG_USB_REDIR) += usb-redir.o
-hw-obj-$(CONFIG_I8259) += i8259.o
+hw-obj-$(CONFIG_I8259) += i8259_common.o i8259.o
 
 # PPC devices
 hw-obj-$(CONFIG_PREP_PCI) += prep_pci.o
diff --git a/hw/i8259.c b/hw/i8259.c
index ab519de..413802c 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -26,6 +26,7 @@
 #include "isa.h"
 #include "monitor.h"
 #include "qemu-timer.h"
+#include "i8259_internal.h"
 
 /* debug PIC */
 //#define DEBUG_PIC
@@ -40,33 +41,6 @@
 //#define DEBUG_IRQ_LATENCY
 //#define DEBUG_IRQ_COUNT
 
-struct PicState {
-    ISADevice dev;
-    uint8_t last_irr; /* edge detection */
-    uint8_t irr; /* interrupt request register */
-    uint8_t imr; /* interrupt mask register */
-    uint8_t isr; /* interrupt service register */
-    uint8_t priority_add; /* highest irq priority */
-    uint8_t irq_base;
-    uint8_t read_reg_select;
-    uint8_t poll;
-    uint8_t special_mask;
-    uint8_t init_state;
-    uint8_t auto_eoi;
-    uint8_t rotate_on_auto_eoi;
-    uint8_t special_fully_nested_mode;
-    uint8_t init4; /* true if 4 byte init */
-    uint8_t single_mode; /* true if slave pic is not initialized */
-    uint8_t elcr; /* PIIX edge/trigger selection*/
-    uint8_t elcr_mask;
-    qemu_irq int_out[1];
-    uint32_t master; /* reflects /SP input pin */
-    uint32_t iobase;
-    uint32_t elcr_addr;
-    MemoryRegion base_io;
-    MemoryRegion elcr_io;
-};
-
 #if defined(DEBUG_PIC) || defined(DEBUG_IRQ_COUNT)
 static int irq_level[16];
 #endif
@@ -248,29 +222,12 @@ int pic_read_irq(PicState *s)
 
 static void pic_init_reset(PicState *s)
 {
-    s->last_irr = 0;
-    s->irr = 0;
-    s->imr = 0;
-    s->isr = 0;
-    s->priority_add = 0;
-    s->irq_base = 0;
-    s->read_reg_select = 0;
-    s->poll = 0;
-    s->special_mask = 0;
-    s->init_state = 0;
-    s->auto_eoi = 0;
-    s->rotate_on_auto_eoi = 0;
-    s->special_fully_nested_mode = 0;
-    s->init4 = 0;
-    s->single_mode = 0;
-    /* Note: ELCR is not reset */
+    pic_reset_internal(s);
     pic_update_irq(s);
 }
 
-static void pic_reset(DeviceState *dev)
+static void pic_reset(PicState *s)
 {
-    PicState *s = container_of(dev, PicState, dev.qdev);
-
     pic_init_reset(s);
     s->elcr = 0;
 }
@@ -418,32 +375,6 @@ static uint64_t elcr_ioport_read(void *opaque, target_phys_addr_t addr,
     return s->elcr;
 }
 
-static const VMStateDescription vmstate_pic = {
-    .name = "i8259",
-    .version_id = 1,
-    .minimum_version_id = 1,
-    .minimum_version_id_old = 1,
-    .fields = (VMStateField[]) {
-        VMSTATE_UINT8(last_irr, PicState),
-        VMSTATE_UINT8(irr, PicState),
-        VMSTATE_UINT8(imr, PicState),
-        VMSTATE_UINT8(isr, PicState),
-        VMSTATE_UINT8(priority_add, PicState),
-        VMSTATE_UINT8(irq_base, PicState),
-        VMSTATE_UINT8(read_reg_select, PicState),
-        VMSTATE_UINT8(poll, PicState),
-        VMSTATE_UINT8(special_mask, PicState),
-        VMSTATE_UINT8(init_state, PicState),
-        VMSTATE_UINT8(auto_eoi, PicState),
-        VMSTATE_UINT8(rotate_on_auto_eoi, PicState),
-        VMSTATE_UINT8(special_fully_nested_mode, PicState),
-        VMSTATE_UINT8(init4, PicState),
-        VMSTATE_UINT8(single_mode, PicState),
-        VMSTATE_UINT8(elcr, PicState),
-        VMSTATE_END_OF_LIST()
-    }
-};
-
 static const MemoryRegionOps pic_base_ioport_ops = {
     .read = pic_ioport_read,
     .write = pic_ioport_write,
@@ -462,24 +393,13 @@ static const MemoryRegionOps pic_elcr_ioport_ops = {
     },
 };
 
-static int pic_initfn(ISADevice *dev)
+static void pic_backend_init(PicState *s)
 {
-    PicState *s = DO_UPCAST(PicState, dev, dev);
-
     memory_region_init_io(&s->base_io, &pic_base_ioport_ops, s, "pic", 2);
     memory_region_init_io(&s->elcr_io, &pic_elcr_ioport_ops, s, "elcr", 1);
 
-    isa_register_ioport(NULL, &s->base_io, s->iobase);
-    if (s->elcr_addr != -1) {
-        isa_register_ioport(NULL, &s->elcr_io, s->elcr_addr);
-    }
-
-    qdev_init_gpio_out(&dev->qdev, s->int_out, ARRAY_SIZE(s->int_out));
-    qdev_init_gpio_in(&dev->qdev, pic_set_irq, 8);
-
-    qdev_set_legacy_instance_id(&dev->qdev, s->iobase, 1);
-
-    return 0;
+    qdev_init_gpio_out(&s->dev.qdev, s->int_out, ARRAY_SIZE(s->int_out));
+    qdev_init_gpio_in(&s->dev.qdev, pic_set_irq, 8);
 }
 
 void pic_info(Monitor *mon)
@@ -526,12 +446,7 @@ qemu_irq *i8259_init(qemu_irq parent_irq)
 
     irq_set = g_malloc(ISA_NUM_IRQS * sizeof(qemu_irq));
 
-    dev = isa_create("isa-i8259");
-    qdev_prop_set_uint32(&dev->qdev, "iobase", 0x20);
-    qdev_prop_set_uint32(&dev->qdev, "elcr_addr", 0x4d0);
-    qdev_prop_set_uint8(&dev->qdev, "elcr_mask", 0xf8);
-    qdev_prop_set_bit(&dev->qdev, "master", true);
-    qdev_init_nofail(&dev->qdev);
+    dev = i8259_init_chip(true, "QEMU");
 
     qdev_connect_gpio_out(&dev->qdev, 0, parent_irq);
     for (i = 0 ; i < 8; i++) {
@@ -540,11 +455,7 @@ qemu_irq *i8259_init(qemu_irq parent_irq)
 
     isa_pic = DO_UPCAST(PicState, dev, dev);
 
-    dev = isa_create("isa-i8259");
-    qdev_prop_set_uint32(&dev->qdev, "iobase", 0xa0);
-    qdev_prop_set_uint32(&dev->qdev, "elcr_addr", 0x4d1);
-    qdev_prop_set_uint8(&dev->qdev, "elcr_mask", 0xde);
-    qdev_init_nofail(&dev->qdev);
+    dev = i8259_init_chip(false, "QEMU");
 
     qdev_connect_gpio_out(&dev->qdev, 0, irq_set[2]);
     for (i = 0 ; i < 8; i++) {
@@ -556,24 +467,16 @@ qemu_irq *i8259_init(qemu_irq parent_irq)
     return irq_set;
 }
 
-static ISADeviceInfo i8259_info = {
-    .qdev.name     = "isa-i8259",
-    .qdev.size     = sizeof(PicState),
-    .qdev.vmsd     = &vmstate_pic,
-    .qdev.reset    = pic_reset,
-    .qdev.no_user  = 1,
-    .init          = pic_initfn,
-    .qdev.props = (Property[]) {
-        DEFINE_PROP_HEX32("iobase", PicState, iobase,  -1),
-        DEFINE_PROP_HEX32("elcr_addr", PicState, elcr_addr,  -1),
-        DEFINE_PROP_HEX8("elcr_mask", PicState, elcr_mask,  -1),
-        DEFINE_PROP_BIT("master", PicState, master,  0, false),
-        DEFINE_PROP_END_OF_LIST(),
-    },
+static PICBackend pic_backend = {
+    .name = "QEMU",
+    .init = pic_backend_init,
+    .reset = pic_reset,
 };
 
 static void pic_register(void)
 {
-    isa_qdev_register(&i8259_info);
+    pic_register_device();
+    pic_register_backend(&pic_backend);
 }
+
 device_init(pic_register)
diff --git a/hw/i8259_common.c b/hw/i8259_common.c
new file mode 100644
index 0000000..403077d
--- /dev/null
+++ b/hw/i8259_common.c
@@ -0,0 +1,173 @@
+/*
+ * QEMU 8259 - common bits of emulated and KVM kernel model
+ *
+ * Copyright (c) 2003-2004 Fabrice Bellard
+ * Copyright (c) 2011      Jan Kiszka, Siemens AG
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+#include "pc.h"
+#include "i8259_internal.h"
+
+static QSIMPLEQ_HEAD(, PICBackend) backends =
+    QSIMPLEQ_HEAD_INITIALIZER(backends);
+
+void pic_reset_internal(PicState *s)
+{
+    s->last_irr = 0;
+    s->irr = 0;
+    s->imr = 0;
+    s->isr = 0;
+    s->priority_add = 0;
+    s->irq_base = 0;
+    s->read_reg_select = 0;
+    s->poll = 0;
+    s->special_mask = 0;
+    s->init_state = 0;
+    s->auto_eoi = 0;
+    s->rotate_on_auto_eoi = 0;
+    s->special_fully_nested_mode = 0;
+    s->init4 = 0;
+    s->single_mode = 0;
+    /* Note: ELCR is not reset */
+}
+
+static void pic_dispatch_pre_save(void *opaque)
+{
+    PicState *s = opaque;
+
+    if (s->backend->pre_save) {
+        s->backend->pre_save(s);
+    }
+}
+
+static int pic_dispatch_post_load(void *opaque, int version_id)
+{
+    PicState *s = opaque;
+
+    if (s->backend->post_load) {
+        s->backend->post_load(s);
+    }
+    return 0;
+}
+
+static const VMStateDescription vmstate_pic = {
+    .name = "i8259",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .pre_save = pic_dispatch_pre_save,
+    .post_load = pic_dispatch_post_load,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT8(last_irr, PicState),
+        VMSTATE_UINT8(irr, PicState),
+        VMSTATE_UINT8(imr, PicState),
+        VMSTATE_UINT8(isr, PicState),
+        VMSTATE_UINT8(priority_add, PicState),
+        VMSTATE_UINT8(irq_base, PicState),
+        VMSTATE_UINT8(read_reg_select, PicState),
+        VMSTATE_UINT8(poll, PicState),
+        VMSTATE_UINT8(special_mask, PicState),
+        VMSTATE_UINT8(init_state, PicState),
+        VMSTATE_UINT8(auto_eoi, PicState),
+        VMSTATE_UINT8(rotate_on_auto_eoi, PicState),
+        VMSTATE_UINT8(special_fully_nested_mode, PicState),
+        VMSTATE_UINT8(init4, PicState),
+        VMSTATE_UINT8(single_mode, PicState),
+        VMSTATE_UINT8(elcr, PicState),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static int pic_initfn(ISADevice *dev)
+{
+    PicState *s = DO_UPCAST(PicState, dev, dev);
+    PICBackend *b;
+
+    QSIMPLEQ_FOREACH(b, &backends, entry) {
+        if (strcmp(b->name, s->backend_name) == 0) {
+            s->backend = b;
+            break;
+        }
+    }
+    if (!s->backend) {
+        hw_error("PIC backend '%s' not found!", s->backend_name);
+        exit(1);
+    }
+
+    b->init(s);
+
+    isa_register_ioport(NULL, &s->base_io, s->iobase);
+    if (s->elcr_addr != -1) {
+        isa_register_ioport(NULL, &s->elcr_io, s->elcr_addr);
+    }
+
+    qdev_set_legacy_instance_id(&s->dev.qdev, s->iobase, 1);
+
+    return 0;
+}
+
+static void pic_reset(DeviceState *dev)
+{
+    PicState *s = container_of(dev, PicState, dev.qdev);
+
+    s->backend->reset(s);
+}
+
+static ISADeviceInfo i8259_info = {
+    .qdev.name     = "isa-i8259",
+    .qdev.size     = sizeof(PicState),
+    .qdev.vmsd     = &vmstate_pic,
+    .qdev.reset    = pic_reset,
+    .qdev.no_user  = 1,
+    .init          = pic_initfn,
+    .qdev.props = (Property[]) {
+        DEFINE_PROP_HEX32("iobase", PicState, iobase,  -1),
+        DEFINE_PROP_HEX32("elcr_addr", PicState, elcr_addr,  -1),
+        DEFINE_PROP_HEX8("elcr_mask", PicState, elcr_mask,  -1),
+        DEFINE_PROP_BIT("master", PicState, master,  0, false),
+        DEFINE_PROP_STRING("backend", PicState, backend_name),
+        DEFINE_PROP_END_OF_LIST(),
+    },
+};
+
+void pic_register_backend(PICBackend *backend)
+{
+    QSIMPLEQ_INSERT_TAIL(&backends, backend, entry);
+}
+
+void pic_register_device(void)
+{
+    isa_qdev_register(&i8259_info);
+}
+
+ISADevice *i8259_init_chip(bool master, const char *backend)
+{
+    ISADevice *dev;
+
+    dev = isa_create("isa-i8259");
+    qdev_prop_set_uint32(&dev->qdev, "iobase", master ? 0x20 : 0xa0);
+    qdev_prop_set_uint32(&dev->qdev, "elcr_addr", master ? 0x4d0 : 0x4d1);
+    qdev_prop_set_uint8(&dev->qdev, "elcr_mask", master ? 0xf8 : 0xde);
+    qdev_prop_set_bit(&dev->qdev, "master", master);
+    qdev_prop_set_string(&dev->qdev, "backend", g_strdup(backend));
+    qdev_init_nofail(&dev->qdev);
+
+    return dev;
+}
diff --git a/hw/i8259_internal.h b/hw/i8259_internal.h
new file mode 100644
index 0000000..e11c312
--- /dev/null
+++ b/hw/i8259_internal.h
@@ -0,0 +1,82 @@
+/*
+ * QEMU 8259 - internal interfaces
+ *
+ * Copyright (c) 2011 Jan Kiszka, Siemens AG
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef QEMU_I8259_INTERNAL_H
+#define QEMU_I8259_INTERNAL_H
+
+#include "hw.h"
+#include "pc.h"
+#include "isa.h"
+#include "qemu-queue.h"
+
+typedef struct PICBackend PICBackend;
+
+struct PICBackend {
+    const char *name;
+    void (*init)(PicState *s);
+    void (*reset)(PicState *s);
+    void (*pre_save)(PicState *s);
+    void (*post_load)(PicState *s);
+
+    QSIMPLEQ_ENTRY(PICBackend) entry;
+};
+
+struct PicState {
+    ISADevice dev;
+    uint8_t last_irr; /* edge detection */
+    uint8_t irr; /* interrupt request register */
+    uint8_t imr; /* interrupt mask register */
+    uint8_t isr; /* interrupt service register */
+    uint8_t priority_add; /* highest irq priority */
+    uint8_t irq_base;
+    uint8_t read_reg_select;
+    uint8_t poll;
+    uint8_t special_mask;
+    uint8_t init_state;
+    uint8_t auto_eoi;
+    uint8_t rotate_on_auto_eoi;
+    uint8_t special_fully_nested_mode;
+    uint8_t init4; /* true if 4 byte init */
+    uint8_t single_mode; /* true if slave pic is not initialized */
+    uint8_t elcr; /* PIIX edge/trigger selection*/
+    uint8_t elcr_mask;
+    qemu_irq int_out[1];
+    uint32_t master; /* reflects /SP input pin */
+    uint32_t iobase;
+    uint32_t elcr_addr;
+    MemoryRegion base_io;
+    MemoryRegion elcr_io;
+
+    char *backend_name;
+    PICBackend *backend;
+};
+
+void pic_register_device(void);
+void pic_register_backend(PICBackend *backend);
+
+void pic_reset_internal(PicState *s);
+
+ISADevice *i8259_init_chip(bool master, const char *backend);
+
+#endif /* !QEMU_I8259_INTERNAL_H */
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 99+ messages in thread

* [PATCH v5 09/16] ioapic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-15 12:33 ` [Qemu-devel] " Jan Kiszka
@ 2011-12-15 12:33   ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, qemu-devel, Anthony Liguori, Michael S. Tsirkin, Blue Swirl

Split up the IOAPIC analogously to APIC and i8259. KVM will share the
device description, reset logic and certain init parts with the user
space model.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 Makefile.target      |    2 +-
 hw/ioapic.c          |  130 ++++-------------------------------------------
 hw/ioapic_common.c   |  137 ++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/ioapic_internal.h |  105 ++++++++++++++++++++++++++++++++++++++
 hw/pc_piix.c         |    1 +
 5 files changed, 254 insertions(+), 121 deletions(-)
 create mode 100644 hw/ioapic_common.c
 create mode 100644 hw/ioapic_internal.h

diff --git a/Makefile.target b/Makefile.target
index c46f062..b549988 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -231,7 +231,7 @@ obj-$(CONFIG_IVSHMEM) += ivshmem.o
 # Hardware support
 obj-i386-y += vga.o
 obj-i386-y += mc146818rtc.o pc.o
-obj-i386-y += cirrus_vga.o sga.o apic_common.o apic.o ioapic.o piix_pci.o
+obj-i386-y += cirrus_vga.o sga.o apic_common.o apic.o ioapic_common.o ioapic.o piix_pci.o
 obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
diff --git a/hw/ioapic.c b/hw/ioapic.c
index 27b07c6..2db72e0 100644
--- a/hw/ioapic.c
+++ b/hw/ioapic.c
@@ -24,9 +24,7 @@
 #include "pc.h"
 #include "apic.h"
 #include "ioapic.h"
-#include "qemu-timer.h"
-#include "host-utils.h"
-#include "sysbus.h"
+#include "ioapic_internal.h"
 
 //#define DEBUG_IOAPIC
 
@@ -37,62 +35,6 @@
 #define DPRINTF(fmt, ...)
 #endif
 
-#define MAX_IOAPICS                     1
-
-#define IOAPIC_VERSION                  0x11
-
-#define IOAPIC_LVT_DEST_SHIFT           56
-#define IOAPIC_LVT_MASKED_SHIFT         16
-#define IOAPIC_LVT_TRIGGER_MODE_SHIFT   15
-#define IOAPIC_LVT_REMOTE_IRR_SHIFT     14
-#define IOAPIC_LVT_POLARITY_SHIFT       13
-#define IOAPIC_LVT_DELIV_STATUS_SHIFT   12
-#define IOAPIC_LVT_DEST_MODE_SHIFT      11
-#define IOAPIC_LVT_DELIV_MODE_SHIFT     8
-
-#define IOAPIC_LVT_MASKED               (1 << IOAPIC_LVT_MASKED_SHIFT)
-#define IOAPIC_LVT_REMOTE_IRR           (1 << IOAPIC_LVT_REMOTE_IRR_SHIFT)
-
-#define IOAPIC_TRIGGER_EDGE             0
-#define IOAPIC_TRIGGER_LEVEL            1
-
-/*io{apic,sapic} delivery mode*/
-#define IOAPIC_DM_FIXED                 0x0
-#define IOAPIC_DM_LOWEST_PRIORITY       0x1
-#define IOAPIC_DM_PMI                   0x2
-#define IOAPIC_DM_NMI                   0x4
-#define IOAPIC_DM_INIT                  0x5
-#define IOAPIC_DM_SIPI                  0x6
-#define IOAPIC_DM_EXTINT                0x7
-#define IOAPIC_DM_MASK                  0x7
-
-#define IOAPIC_VECTOR_MASK              0xff
-
-#define IOAPIC_IOREGSEL                 0x00
-#define IOAPIC_IOWIN                    0x10
-
-#define IOAPIC_REG_ID                   0x00
-#define IOAPIC_REG_VER                  0x01
-#define IOAPIC_REG_ARB                  0x02
-#define IOAPIC_REG_REDTBL_BASE          0x10
-#define IOAPIC_ID                       0x00
-
-#define IOAPIC_ID_SHIFT                 24
-#define IOAPIC_ID_MASK                  0xf
-
-#define IOAPIC_VER_ENTRIES_SHIFT        16
-
-typedef struct IOAPICState IOAPICState;
-
-struct IOAPICState {
-    SysBusDevice busdev;
-    MemoryRegion io_memory;
-    uint8_t id;
-    uint8_t ioregsel;
-    uint32_t irr;
-    uint64_t ioredtbl[IOAPIC_NUM_PINS];
-};
-
 static IOAPICState *ioapics[MAX_IOAPICS];
 
 static void ioapic_service(IOAPICState *s)
@@ -278,83 +220,31 @@ ioapic_mem_write(void *opaque, target_phys_addr_t addr, uint64_t val,
     }
 }
 
-static int ioapic_post_load(void *opaque, int version_id)
-{
-    IOAPICState *s = opaque;
-
-    if (version_id == 1) {
-        /* set sane value */
-        s->irr = 0;
-    }
-    return 0;
-}
-
-static const VMStateDescription vmstate_ioapic = {
-    .name = "ioapic",
-    .version_id = 3,
-    .post_load = ioapic_post_load,
-    .minimum_version_id = 1,
-    .minimum_version_id_old = 1,
-    .fields = (VMStateField[]) {
-        VMSTATE_UINT8(id, IOAPICState),
-        VMSTATE_UINT8(ioregsel, IOAPICState),
-        VMSTATE_UNUSED_V(2, 8), /* to account for qemu-kvm's v2 format */
-        VMSTATE_UINT32_V(irr, IOAPICState, 2),
-        VMSTATE_UINT64_ARRAY(ioredtbl, IOAPICState, IOAPIC_NUM_PINS),
-        VMSTATE_END_OF_LIST()
-    }
-};
-
-static void ioapic_reset(DeviceState *d)
-{
-    IOAPICState *s = DO_UPCAST(IOAPICState, busdev.qdev, d);
-    int i;
-
-    s->id = 0;
-    s->ioregsel = 0;
-    s->irr = 0;
-    for (i = 0; i < IOAPIC_NUM_PINS; i++) {
-        s->ioredtbl[i] = 1 << IOAPIC_LVT_MASKED_SHIFT;
-    }
-}
-
 static const MemoryRegionOps ioapic_io_ops = {
     .read = ioapic_mem_read,
     .write = ioapic_mem_write,
     .endianness = DEVICE_NATIVE_ENDIAN,
 };
 
-static int ioapic_init1(SysBusDevice *dev)
+static void ioapic_backend_init(IOAPICState *s, int index)
 {
-    IOAPICState *s = FROM_SYSBUS(IOAPICState, dev);
-    static int ioapic_no;
-
-    if (ioapic_no >= MAX_IOAPICS) {
-        return -1;
-    }
-
     memory_region_init_io(&s->io_memory, &ioapic_io_ops, s, "ioapic", 0x1000);
-    sysbus_init_mmio(dev, &s->io_memory);
-
-    qdev_init_gpio_in(&dev->qdev, ioapic_set_irq, IOAPIC_NUM_PINS);
 
-    ioapics[ioapic_no++] = s;
+    qdev_init_gpio_in(&s->busdev.qdev, ioapic_set_irq, IOAPIC_NUM_PINS);
 
-    return 0;
+    ioapics[index] = s;
 }
 
-static SysBusDeviceInfo ioapic_info = {
-    .init = ioapic_init1,
-    .qdev.name = "ioapic",
-    .qdev.size = sizeof(IOAPICState),
-    .qdev.vmsd = &vmstate_ioapic,
-    .qdev.reset = ioapic_reset,
-    .qdev.no_user = 1,
+static IOAPICBackend ioapic_backend = {
+    .name = "QEMU",
+    .init = ioapic_backend_init,
+    .reset = ioapic_reset_internal,
 };
 
 static void ioapic_register_devices(void)
 {
-    sysbus_register_withprop(&ioapic_info);
+    ioapic_register_device();
+    ioapic_register_backend(&ioapic_backend);
 }
 
 device_init(ioapic_register_devices)
diff --git a/hw/ioapic_common.c b/hw/ioapic_common.c
new file mode 100644
index 0000000..094551c
--- /dev/null
+++ b/hw/ioapic_common.c
@@ -0,0 +1,137 @@
+/*
+ *  IOAPIC emulation logic - common bits of emulated and KVM kernel model
+ *
+ *  Copyright (c) 2004-2005 Fabrice Bellard
+ *  Copyright (c) 2009      Xiantao Zhang, Intel
+ *  Copyright (c) 2011      Jan Kiszka, Siemens AG
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "ioapic.h"
+#include "ioapic_internal.h"
+#include "sysbus.h"
+
+static QSIMPLEQ_HEAD(, IOAPICBackend) backends =
+    QSIMPLEQ_HEAD_INITIALIZER(backends);
+
+void ioapic_reset_internal(IOAPICState *s)
+{
+    int i;
+
+    s->id = 0;
+    s->ioregsel = 0;
+    s->irr = 0;
+    for (i = 0; i < IOAPIC_NUM_PINS; i++) {
+        s->ioredtbl[i] = 1 << IOAPIC_LVT_MASKED_SHIFT;
+    }
+}
+
+static void ioapic_dispatch_pre_save(void *opaque)
+{
+    IOAPICState *s = opaque;
+
+    if (s->backend->pre_save) {
+        s->backend->pre_save(s);
+    }
+}
+
+static int ioapic_post_load(void *opaque, int version_id)
+{
+    IOAPICState *s = opaque;
+
+    if (version_id == 1) {
+        /* set sane value */
+        s->irr = 0;
+    }
+    if (s->backend->post_load) {
+        s->backend->post_load(s);
+    }
+    return 0;
+}
+
+const VMStateDescription vmstate_ioapic = {
+    .name = "ioapic",
+    .version_id = 3,
+    .minimum_version_id = 1,
+    .minimum_version_id_old = 1,
+    .pre_save = ioapic_dispatch_pre_save,
+    .post_load = ioapic_post_load,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT8(id, IOAPICState),
+        VMSTATE_UINT8(ioregsel, IOAPICState),
+        VMSTATE_UNUSED_V(2, 8), /* to account for qemu-kvm's v2 format */
+        VMSTATE_UINT32_V(irr, IOAPICState, 2),
+        VMSTATE_UINT64_ARRAY(ioredtbl, IOAPICState, IOAPIC_NUM_PINS),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static int ioapic_init(SysBusDevice *dev)
+{
+    IOAPICState *s = FROM_SYSBUS(IOAPICState, dev);
+    static int ioapic_no;
+    IOAPICBackend *b;
+
+    if (ioapic_no >= MAX_IOAPICS) {
+        return -1;
+    }
+
+    QSIMPLEQ_FOREACH(b, &backends, entry) {
+        if (strcmp(b->name, s->backend_name) == 0) {
+            s->backend = b;
+            break;
+        }
+    }
+    if (!s->backend) {
+        hw_error("IOAPIC backend '%s' not found!", s->backend_name);
+        exit(1);
+    }
+
+    b->init(s, ioapic_no++);
+
+    sysbus_init_mmio(&s->busdev, &s->io_memory);
+
+    return 0;
+}
+
+static void ioapic_reset(DeviceState *dev)
+{
+    IOAPICState *s = DO_UPCAST(IOAPICState, busdev.qdev, dev);
+
+    s->backend->reset(s);
+}
+
+static SysBusDeviceInfo ioapic_info = {
+    .init = ioapic_init,
+    .qdev.name = "ioapic",
+    .qdev.size = sizeof(IOAPICState),
+    .qdev.vmsd = &vmstate_ioapic,
+    .qdev.reset = ioapic_reset,
+    .qdev.no_user = 1,
+    .qdev.props = (Property[]) {
+        DEFINE_PROP_STRING("backend", IOAPICState, backend_name),
+        DEFINE_PROP_END_OF_LIST(),
+    },
+};
+
+void ioapic_register_backend(IOAPICBackend *backend)
+{
+    QSIMPLEQ_INSERT_TAIL(&backends, backend, entry);
+}
+
+void ioapic_register_device(void)
+{
+    sysbus_register_withprop(&ioapic_info);
+}
diff --git a/hw/ioapic_internal.h b/hw/ioapic_internal.h
new file mode 100644
index 0000000..c5fab8b
--- /dev/null
+++ b/hw/ioapic_internal.h
@@ -0,0 +1,105 @@
+/*
+ *  IOAPIC emulation logic - internal interfaces
+ *
+ *  Copyright (c) 2004-2005 Fabrice Bellard
+ *  Copyright (c) 2009      Xiantao Zhang, Intel
+ *  Copyright (c) 2011      Jan Kiszka, Siemens AG
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef QEMU_IOAPIC_INTERNAL_H
+#define QEMU_IOAPIC_INTERNAL_H
+
+#include "hw.h"
+#include "memory.h"
+#include "sysbus.h"
+#include "qemu-queue.h"
+
+#define MAX_IOAPICS                     1
+
+#define IOAPIC_VERSION                  0x11
+
+#define IOAPIC_LVT_DEST_SHIFT           56
+#define IOAPIC_LVT_MASKED_SHIFT         16
+#define IOAPIC_LVT_TRIGGER_MODE_SHIFT   15
+#define IOAPIC_LVT_REMOTE_IRR_SHIFT     14
+#define IOAPIC_LVT_POLARITY_SHIFT       13
+#define IOAPIC_LVT_DELIV_STATUS_SHIFT   12
+#define IOAPIC_LVT_DEST_MODE_SHIFT      11
+#define IOAPIC_LVT_DELIV_MODE_SHIFT     8
+
+#define IOAPIC_LVT_MASKED               (1 << IOAPIC_LVT_MASKED_SHIFT)
+#define IOAPIC_LVT_REMOTE_IRR           (1 << IOAPIC_LVT_REMOTE_IRR_SHIFT)
+
+#define IOAPIC_TRIGGER_EDGE             0
+#define IOAPIC_TRIGGER_LEVEL            1
+
+/*io{apic,sapic} delivery mode*/
+#define IOAPIC_DM_FIXED                 0x0
+#define IOAPIC_DM_LOWEST_PRIORITY       0x1
+#define IOAPIC_DM_PMI                   0x2
+#define IOAPIC_DM_NMI                   0x4
+#define IOAPIC_DM_INIT                  0x5
+#define IOAPIC_DM_SIPI                  0x6
+#define IOAPIC_DM_EXTINT                0x7
+#define IOAPIC_DM_MASK                  0x7
+
+#define IOAPIC_VECTOR_MASK              0xff
+
+#define IOAPIC_IOREGSEL                 0x00
+#define IOAPIC_IOWIN                    0x10
+
+#define IOAPIC_REG_ID                   0x00
+#define IOAPIC_REG_VER                  0x01
+#define IOAPIC_REG_ARB                  0x02
+#define IOAPIC_REG_REDTBL_BASE          0x10
+#define IOAPIC_ID                       0x00
+
+#define IOAPIC_ID_SHIFT                 24
+#define IOAPIC_ID_MASK                  0xf
+
+#define IOAPIC_VER_ENTRIES_SHIFT        16
+
+typedef struct IOAPICBackend IOAPICBackend;
+typedef struct IOAPICState IOAPICState;
+
+struct IOAPICBackend {
+    const char *name;
+    void (*init)(IOAPICState *s, int index);
+    void (*reset)(IOAPICState *s);
+    void (*pre_save)(IOAPICState *s);
+    void (*post_load)(IOAPICState *s);
+
+    QSIMPLEQ_ENTRY(IOAPICBackend) entry;
+};
+
+struct IOAPICState {
+    SysBusDevice busdev;
+    MemoryRegion io_memory;
+    uint8_t id;
+    uint8_t ioregsel;
+    uint32_t irr;
+    uint64_t ioredtbl[IOAPIC_NUM_PINS];
+
+    char *backend_name;
+    IOAPICBackend *backend;
+};
+
+void ioapic_register_device(void);
+void ioapic_register_backend(IOAPICBackend *backend);
+
+void ioapic_reset_internal(IOAPICState *s);
+
+#endif /* !QEMU_IOAPIC_INTERNAL_H */
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 530fe9c..98f2822 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -60,6 +60,7 @@ static void ioapic_init(GSIState *gsi_state)
     unsigned int i;
 
     dev = qdev_create(NULL, "ioapic");
+    qdev_prop_set_string(dev, "backend", g_strdup("QEMU"));
     qdev_init_nofail(dev);
     d = sysbus_from_qdev(dev);
     sysbus_mmio_map(d, 0, 0xfec00000);
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 99+ messages in thread

* [PATCH v5 10/16] memory: Introduce memory_region_init_reservation
  2011-12-15 12:33 ` [Qemu-devel] " Jan Kiszka
@ 2011-12-15 12:33   ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, qemu-devel, Anthony Liguori, Michael S. Tsirkin, Blue Swirl

Introduce a memory region type that can reserve I/O space. Such regions
are useful for modeling I/O that is only handled outside of QEMU, i.e.
in the context of an accelerator like KVM.

Any access to such a region from QEMU is a bug, but could theoretically
be triggered by guest code (DMA to reserved region). So only warn about
such events once, then ignore them.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
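
A standalone sketch of the warn-once behaviour described above, assuming
a simplified region type. DemoRegion, its warnings counter and the demo_*
functions are hypothetical and exist only for illustration; the real
handlers take the opaque/addr/size arguments of MemoryRegionOps.

```c
/* Standalone illustration of warn-once access handlers; not QEMU code. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef struct DemoRegion {
    const char *name;
    bool warning_printed;   /* one flag shared by reads and writes */
    unsigned warnings;      /* hypothetical counter, for demonstration only */
} DemoRegion;

/* Print a warning on the first invalid access only. */
static void demo_warn_once(DemoRegion *mr, const char *what)
{
    if (!mr->warning_printed) {
        fprintf(stderr, "Invalid %s memory region %s\n", what, mr->name);
        mr->warning_printed = true;
        mr->warnings++;
    }
}

/* Reads are ignored apart from the warning. The value returned is -1U,
 * as in the patch: all ones in the width of unsigned int, zero-extended
 * to 64 bits. */
uint64_t demo_invalid_read(DemoRegion *mr)
{
    demo_warn_once(mr, "read from");
    return -1U;
}

/* Writes are dropped apart from the warning. */
void demo_invalid_write(DemoRegion *mr, uint64_t data)
{
    (void)data;
    demo_warn_once(mr, "write to");
}
```

Because a single flag covers both directions, a warning for a read also
suppresses the warning for a later write to the same region.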
 memory.c |   36 ++++++++++++++++++++++++++++++++++++
 memory.h |   16 ++++++++++++++++
 2 files changed, 52 insertions(+), 0 deletions(-)

diff --git a/memory.c b/memory.c
index adfdf14..71a252a 100644
--- a/memory.c
+++ b/memory.c
@@ -1031,6 +1031,42 @@ void memory_region_init_rom_device(MemoryRegion *mr,
     mr->backend_registered = true;
 }
 
+static uint64_t invalid_read(void *opaque, target_phys_addr_t addr,
+                             unsigned size)
+{
+    MemoryRegion *mr = opaque;
+
+    if (!mr->warning_printed) {
+        fprintf(stderr, "Invalid read from memory region %s\n", mr->name);
+        mr->warning_printed = true;
+    }
+    return -1U;
+}
+
+static void invalid_write(void *opaque, target_phys_addr_t addr, uint64_t data,
+                          unsigned size)
+{
+    MemoryRegion *mr = opaque;
+
+    if (!mr->warning_printed) {
+        fprintf(stderr, "Invalid write to memory region %s\n", mr->name);
+        mr->warning_printed = true;
+    }
+}
+
+static const MemoryRegionOps reservation_ops = {
+    .read = invalid_read,
+    .write = invalid_write,
+    .endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+void memory_region_init_reservation(MemoryRegion *mr,
+                                    const char *name,
+                                    uint64_t size)
+{
+    memory_region_init_io(mr, &reservation_ops, mr, name, size);
+}
+
 void memory_region_destroy(MemoryRegion *mr)
 {
     assert(QTAILQ_EMPTY(&mr->subregions));
diff --git a/memory.h b/memory.h
index 53bf261..1097eac 100644
--- a/memory.h
+++ b/memory.h
@@ -123,6 +123,7 @@ struct MemoryRegion {
     bool terminates;
     bool readable;
     bool readonly; /* For RAM regions */
+    bool warning_printed; /* For reservations */
     MemoryRegion *alias;
     target_phys_addr_t alias_offset;
     unsigned priority;
@@ -250,6 +251,21 @@ void memory_region_init_rom_device(MemoryRegion *mr,
                                    uint64_t size);
 
 /**
+ * memory_region_init_reservation: Initialize a memory region that reserves
+ *                                 I/O space.
+ *
+ * A reservation region primarily serves debugging purposes.  It claims I/O
+ * space that is not supposed to be handled by QEMU itself.  Any access via
+ * the memory API is reported once with a warning and otherwise ignored.
+ *
+ * @mr: the #MemoryRegion to be initialized
+ * @name: used for debugging; not visible to the user or ABI
+ * @size: size of the region.
+ */
+void memory_region_init_reservation(MemoryRegion *mr,
+                                    const char *name,
+                                    uint64_t size);
+/**
  * memory_region_destroy: Destroy a memory region and relaim all resources.
  *
  * @mr: the region to be destroyed.  May not currently be a subregion
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 99+ messages in thread

* [Qemu-devel] [PATCH v5 10/16] memory: Introduce memory_region_init_reservation
@ 2011-12-15 12:33   ` Jan Kiszka
  0 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: Blue Swirl, Anthony Liguori, qemu-devel, kvm, Michael S. Tsirkin

Introduce a memory region type that can reserve I/O space. Such regions
are useful for modeling I/O that is only handled outside of QEMU, i.e.
in the context of an accelerator like KVM.

Any access to such a region from QEMU is a bug, but could theoretically
be triggered by guest code (DMA to reserved region). So only warn about
such events once, then ignore them.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 memory.c |   36 ++++++++++++++++++++++++++++++++++++
 memory.h |   16 ++++++++++++++++
 2 files changed, 52 insertions(+), 0 deletions(-)

diff --git a/memory.c b/memory.c
index adfdf14..71a252a 100644
--- a/memory.c
+++ b/memory.c
@@ -1031,6 +1031,42 @@ void memory_region_init_rom_device(MemoryRegion *mr,
     mr->backend_registered = true;
 }
 
+static uint64_t invalid_read(void *opaque, target_phys_addr_t addr,
+                             unsigned size)
+{
+    MemoryRegion *mr = opaque;
+
+    if (!mr->warning_printed) {
+        fprintf(stderr, "Invalid read from memory region %s\n", mr->name);
+        mr->warning_printed = true;
+    }
+    return -1U;
+}
+
+static void invalid_write(void *opaque, target_phys_addr_t addr, uint64_t data,
+                          unsigned size)
+{
+    MemoryRegion *mr = opaque;
+
+    if (!mr->warning_printed) {
+        fprintf(stderr, "Invalid write to memory region %s\n", mr->name);
+        mr->warning_printed = true;
+    }
+}
+
+static const MemoryRegionOps reservation_ops = {
+    .read = invalid_read,
+    .write = invalid_write,
+    .endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+void memory_region_init_reservation(MemoryRegion *mr,
+                                    const char *name,
+                                    uint64_t size)
+{
+    memory_region_init_io(mr, &reservation_ops, mr, name, size);
+}
+
 void memory_region_destroy(MemoryRegion *mr)
 {
     assert(QTAILQ_EMPTY(&mr->subregions));
diff --git a/memory.h b/memory.h
index 53bf261..1097eac 100644
--- a/memory.h
+++ b/memory.h
@@ -123,6 +123,7 @@ struct MemoryRegion {
     bool terminates;
     bool readable;
     bool readonly; /* For RAM regions */
+    bool warning_printed; /* For reservations */
     MemoryRegion *alias;
     target_phys_addr_t alias_offset;
     unsigned priority;
@@ -250,6 +251,21 @@ void memory_region_init_rom_device(MemoryRegion *mr,
                                    uint64_t size);
 
 /**
+ * memory_region_init_reservation: Initialize a memory region that reserves
+ *                                 I/O space.
+ *
+ * A reservation region primarily serves debugging purposes.  It claims I/O
+ * space that is not supposed to be handled by QEMU itself.  Any access via
+ * the memory API triggers a one-time warning and is otherwise ignored.
+ *
+ * @mr: the #MemoryRegion to be initialized
+ * @name: used for debugging; not visible to the user or ABI
+ * @size: size of the region.
+ */
+void memory_region_init_reservation(MemoryRegion *mr,
+                                    const char *name,
+                                    uint64_t size);
+/**
  * memory_region_destroy: Destroy a memory region and relaim all resources.
  *
  * @mr: the region to be destroyed.  May not currently be a subregion
-- 
1.7.3.4


* [PATCH v5 11/16] kvm: Introduce core services for in-kernel irqchip support
  2011-12-15 12:33 ` [Qemu-devel] " Jan Kiszka
@ 2011-12-15 12:33   ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, qemu-devel, Anthony Liguori, Michael S. Tsirkin, Blue Swirl

Add the basic infrastructure to activate in-kernel irqchip support, inject
interrupts into these models, and maintain IRQ routes.

Routing is optional and depends on the host arch supporting
KVM_CAP_IRQ_ROUTING. When it's not available on x86, we lose the HPET as
we can't route GSI0 to IOAPIC pin 2.

In-kernel irqchip support will eventually be controlled by the machine
property 'kernel_irqchip', but this is not yet wired up.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
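The GSI bookkeeping used for route maintenance can be pictured with the
standalone sketch below. It is an illustration, not the code from this
patch: GsiBitmap and the gsi_* helpers are hypothetical names, calloc()
stands in for g_malloc0(), and the capacity is assumed to be rounded up
to a multiple of 32 bits so that the bitmap consists of whole 32-bit
words.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container for the used-GSI bitmap. */
typedef struct GsiBitmap {
    uint32_t *words;
    unsigned int max_gsi;   /* capacity in bits, a multiple of 32 */
} GsiBitmap;

/* Mark one GSI as in use. */
static void gsi_set(GsiBitmap *b, unsigned int gsi)
{
    assert(gsi < b->max_gsi);
    b->words[gsi / 32] |= 1U << (gsi % 32);
}

/* Return 1 if the GSI is marked as in use, 0 otherwise. */
static int gsi_test(const GsiBitmap *b, unsigned int gsi)
{
    assert(gsi < b->max_gsi);
    return (b->words[gsi / 32] >> (gsi % 32)) & 1U;
}

/* Allocate a zeroed bitmap for gsi_count GSIs (gsi_count > 0). */
static int gsi_bitmap_init(GsiBitmap *b, unsigned int gsi_count)
{
    unsigned int bits, i;

    if (gsi_count == 0) {
        return -1;
    }
    /* Round up to whole 32-bit words. */
    bits = (gsi_count + 31) / 32 * 32;
    b->words = calloc(bits / 32, sizeof(uint32_t));
    if (!b->words) {
        return -1;
    }
    b->max_gsi = bits;

    /* Bits beyond gsi_count do not correspond to real GSIs; mark them
     * as used so a free-slot search never hands them out. */
    for (i = gsi_count; i < bits; i++) {
        gsi_set(b, i);
    }
    return 0;
}
```

With 40 GSIs reported by the kernel, the sketch allocates two words
(64 bits) and pre-marks bits 40 to 63 as used.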
 kvm-all.c         |  149 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 kvm.h             |    8 +++
 target-i386/kvm.c |   11 ++++
 3 files changed, 168 insertions(+), 0 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 4c466d6..8958abd 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -77,6 +77,13 @@ struct KVMState
     int pit_in_kernel;
     int xsave, xcrs;
     int many_ioeventfds;
+    int irqchip_inject_ioctl;
+#ifdef KVM_CAP_IRQ_ROUTING
+    struct kvm_irq_routing *irq_routes;
+    int nr_allocated_irq_routes;
+    uint32_t *used_gsi_bitmap;
+    unsigned int max_gsi;
+#endif
 };
 
 KVMState *kvm_state;
@@ -693,6 +700,138 @@ static void kvm_handle_interrupt(CPUState *env, int mask)
     }
 }
 
+int kvm_irqchip_set_irq(KVMState *s, int irq, int level)
+{
+    struct kvm_irq_level event;
+    int ret;
+
+    assert(s->irqchip_in_kernel);
+
+    event.level = level;
+    event.irq = irq;
+    ret = kvm_vm_ioctl(s, s->irqchip_inject_ioctl, &event);
+    if (ret < 0) {
+        perror("kvm_set_irqchip_line");
+        abort();
+    }
+
+    return (s->irqchip_inject_ioctl == KVM_IRQ_LINE) ? 1 : event.status;
+}
+
+#ifdef KVM_CAP_IRQ_ROUTING
+static void set_gsi(KVMState *s, unsigned int gsi)
+{
+    assert(gsi < s->max_gsi);
+
+    s->used_gsi_bitmap[gsi / 32] |= 1U << (gsi % 32);
+}
+
+static void kvm_init_irq_routing(KVMState *s)
+{
+    int gsi_count;
+
+    gsi_count = kvm_check_extension(s, KVM_CAP_IRQ_ROUTING);
+    if (gsi_count > 0) {
+        unsigned int gsi_bits, i;
+
+        /* Round up so we can search ints using ffs */
+        gsi_bits = (gsi_count + 31) / 32 * 32;
+        s->used_gsi_bitmap = g_malloc0(gsi_bits / 8);
+        s->max_gsi = gsi_bits;
+
+        /* Mark any over-allocated bits as already in use */
+        for (i = gsi_count; i < gsi_bits; i++) {
+            set_gsi(s, i);
+        }
+    }
+
+    s->irq_routes = g_malloc0(sizeof(*s->irq_routes));
+    s->nr_allocated_irq_routes = 0;
+
+    kvm_arch_init_irq_routing(s);
+}
+
+static void kvm_add_routing_entry(KVMState *s,
+                                  struct kvm_irq_routing_entry *entry)
+{
+    struct kvm_irq_routing_entry *new;
+    int n, size;
+
+    if (s->irq_routes->nr == s->nr_allocated_irq_routes) {
+        n = s->nr_allocated_irq_routes * 2;
+        if (n < 64) {
+            n = 64;
+        }
+        size = sizeof(struct kvm_irq_routing);
+        size += n * sizeof(*new);
+        s->irq_routes = g_realloc(s->irq_routes, size);
+        s->nr_allocated_irq_routes = n;
+    }
+    n = s->irq_routes->nr++;
+    new = &s->irq_routes->entries[n];
+    memset(new, 0, sizeof(*new));
+    new->gsi = entry->gsi;
+    new->type = entry->type;
+    new->flags = entry->flags;
+    new->u = entry->u;
+
+    set_gsi(s, entry->gsi);
+}
+
+void kvm_irqchip_add_route(KVMState *s, int irq, int irqchip, int pin)
+{
+    struct kvm_irq_routing_entry e;
+
+    e.gsi = irq;
+    e.type = KVM_IRQ_ROUTING_IRQCHIP;
+    e.flags = 0;
+    e.u.irqchip.irqchip = irqchip;
+    e.u.irqchip.pin = pin;
+    kvm_add_routing_entry(s, &e);
+}
+
+int kvm_irqchip_commit_routes(KVMState *s)
+{
+    s->irq_routes->flags = 0;
+    return kvm_vm_ioctl(s, KVM_SET_GSI_ROUTING, s->irq_routes);
+}
+
+#else /* !KVM_CAP_IRQ_ROUTING */
+
+static void kvm_init_irq_routing(KVMState *s)
+{
+}
+#endif /* !KVM_CAP_IRQ_ROUTING */
+
+static int kvm_irqchip_create(KVMState *s)
+{
+    QemuOptsList *list = qemu_find_opts("machine");
+    int ret;
+
+    if (QTAILQ_EMPTY(&list->head) ||
+        !qemu_opt_get_bool(QTAILQ_FIRST(&list->head),
+                           "kernel_irqchip", false) ||
+        !kvm_check_extension(s, KVM_CAP_IRQCHIP)) {
+        return 0;
+    }
+
+    ret = kvm_vm_ioctl(s, KVM_CREATE_IRQCHIP);
+    if (ret < 0) {
+        fprintf(stderr, "Create kernel irqchip failed\n");
+        return ret;
+    }
+
+    s->irqchip_inject_ioctl = KVM_IRQ_LINE;
+    if (kvm_check_extension(s, KVM_CAP_IRQ_INJECT_STATUS)) {
+        s->irqchip_inject_ioctl = KVM_IRQ_LINE_STATUS;
+    }
+    s->irqchip_in_kernel = 1;
+
+    kvm_init_irq_routing(s);
+
+    return 0;
+}
+
 int kvm_init(void)
 {
     static const char upgrade_note[] =
@@ -788,6 +927,11 @@ int kvm_init(void)
         goto err;
     }
 
+    ret = kvm_irqchip_create(s);
+    if (ret < 0) {
+        goto err;
+    }
+
     kvm_state = s;
     cpu_register_phys_memory_client(&kvm_cpu_phys_memory_client);
 
@@ -1122,6 +1266,11 @@ int kvm_has_many_ioeventfds(void)
     return kvm_state->many_ioeventfds;
 }
 
+int kvm_has_gsi_routing(void)
+{
+    return kvm_check_extension(kvm_state, KVM_CAP_IRQ_ROUTING);
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
     if (!kvm_has_sync_mmu()) {
diff --git a/kvm.h b/kvm.h
index 243b063..0d6c453 100644
--- a/kvm.h
+++ b/kvm.h
@@ -51,6 +51,7 @@ int kvm_has_debugregs(void);
 int kvm_has_xsave(void);
 int kvm_has_xcrs(void);
 int kvm_has_many_ioeventfds(void);
+int kvm_has_gsi_routing(void);
 
 #ifdef NEED_CPU_H
 int kvm_init_vcpu(CPUState *env);
@@ -124,6 +125,13 @@ void kvm_arch_reset_vcpu(CPUState *env);
 int kvm_arch_on_sigbus_vcpu(CPUState *env, int code, void *addr);
 int kvm_arch_on_sigbus(int code, void *addr);
 
+void kvm_arch_init_irq_routing(KVMState *s);
+
+int kvm_irqchip_set_irq(KVMState *s, int irq, int level);
+
+void kvm_irqchip_add_route(KVMState *s, int gsi, int irqchip, int pin);
+int kvm_irqchip_commit_routes(KVMState *s);
+
 struct kvm_guest_debug;
 struct kvm_debug_exit_arch;
 
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index d206852..9d1191f 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1879,3 +1879,14 @@ bool kvm_arch_stop_on_emulation_error(CPUState *env)
     return !(env->cr[0] & CR0_PE_MASK) ||
            ((env->segs[R_CS].selector  & 3) != 3);
 }
+
+void kvm_arch_init_irq_routing(KVMState *s)
+{
+    if (!kvm_check_extension(s, KVM_CAP_IRQ_ROUTING)) {
+        /* If kernel can't do irq routing, interrupt source
+         * override 0->2 cannot be set up as required by HPET.
+         * So we have to disable it.
+         */
+        no_hpet = 1;
+    }
+}
-- 
1.7.3.4



* [Qemu-devel] [PATCH v5 11/16] kvm: Introduce core services for in-kernel irqchip support
@ 2011-12-15 12:33   ` Jan Kiszka
  0 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: Blue Swirl, Anthony Liguori, qemu-devel, kvm, Michael S. Tsirkin

Add the basic infrastructure to activate in-kernel irqchip support, inject
interrupts into these models, and maintain IRQ routes.

Routing is optional and depends on the host arch supporting
KVM_CAP_IRQ_ROUTING. When it's not available on x86, we lose the HPET as
we can't route GSI0 to IOAPIC pin 2.

In-kernel irqchip support will eventually be controlled by the machine
property 'kernel_irqchip', but this is not yet wired up.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 kvm-all.c         |  149 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 kvm.h             |    8 +++
 target-i386/kvm.c |   11 ++++
 3 files changed, 168 insertions(+), 0 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 4c466d6..8958abd 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -77,6 +77,13 @@ struct KVMState
     int pit_in_kernel;
     int xsave, xcrs;
     int many_ioeventfds;
+    int irqchip_inject_ioctl;
+#ifdef KVM_CAP_IRQ_ROUTING
+    struct kvm_irq_routing *irq_routes;
+    int nr_allocated_irq_routes;
+    uint32_t *used_gsi_bitmap;
+    unsigned int max_gsi;
+#endif
 };
 
 KVMState *kvm_state;
@@ -693,6 +700,138 @@ static void kvm_handle_interrupt(CPUState *env, int mask)
     }
 }
 
+int kvm_irqchip_set_irq(KVMState *s, int irq, int level)
+{
+    struct kvm_irq_level event;
+    int ret;
+
+    assert(s->irqchip_in_kernel);
+
+    event.level = level;
+    event.irq = irq;
+    ret = kvm_vm_ioctl(s, s->irqchip_inject_ioctl, &event);
+    if (ret < 0) {
+        perror("kvm_set_irqchip_line");
+        abort();
+    }
+
+    return (s->irqchip_inject_ioctl == KVM_IRQ_LINE) ? 1 : event.status;
+}
+
+#ifdef KVM_CAP_IRQ_ROUTING
+static void set_gsi(KVMState *s, unsigned int gsi)
+{
+    assert(gsi < s->max_gsi);
+
+    s->used_gsi_bitmap[gsi / 32] |= 1U << (gsi % 32);
+}
+
+static void kvm_init_irq_routing(KVMState *s)
+{
+    int gsi_count;
+
+    gsi_count = kvm_check_extension(s, KVM_CAP_IRQ_ROUTING);
+    if (gsi_count > 0) {
+        unsigned int gsi_bits, i;
+
+        /* Round up so we can search ints using ffs */
+        gsi_bits = (gsi_count + 31) / 32 * 32;
+        s->used_gsi_bitmap = g_malloc0(gsi_bits / 8);
+        s->max_gsi = gsi_bits;
+
+        /* Mark any over-allocated bits as already in use */
+        for (i = gsi_count; i < gsi_bits; i++) {
+            set_gsi(s, i);
+        }
+    }
+
+    s->irq_routes = g_malloc0(sizeof(*s->irq_routes));
+    s->nr_allocated_irq_routes = 0;
+
+    kvm_arch_init_irq_routing(s);
+}
+
+static void kvm_add_routing_entry(KVMState *s,
+                                  struct kvm_irq_routing_entry *entry)
+{
+    struct kvm_irq_routing_entry *new;
+    int n, size;
+
+    if (s->irq_routes->nr == s->nr_allocated_irq_routes) {
+        n = s->nr_allocated_irq_routes * 2;
+        if (n < 64) {
+            n = 64;
+        }
+        size = sizeof(struct kvm_irq_routing);
+        size += n * sizeof(*new);
+        s->irq_routes = g_realloc(s->irq_routes, size);
+        s->nr_allocated_irq_routes = n;
+    }
+    n = s->irq_routes->nr++;
+    new = &s->irq_routes->entries[n];
+    memset(new, 0, sizeof(*new));
+    new->gsi = entry->gsi;
+    new->type = entry->type;
+    new->flags = entry->flags;
+    new->u = entry->u;
+
+    set_gsi(s, entry->gsi);
+}
+
+void kvm_irqchip_add_route(KVMState *s, int irq, int irqchip, int pin)
+{
+    struct kvm_irq_routing_entry e;
+
+    e.gsi = irq;
+    e.type = KVM_IRQ_ROUTING_IRQCHIP;
+    e.flags = 0;
+    e.u.irqchip.irqchip = irqchip;
+    e.u.irqchip.pin = pin;
+    kvm_add_routing_entry(s, &e);
+}
+
+int kvm_irqchip_commit_routes(KVMState *s)
+{
+    s->irq_routes->flags = 0;
+    return kvm_vm_ioctl(s, KVM_SET_GSI_ROUTING, s->irq_routes);
+}
+
+#else /* !KVM_CAP_IRQ_ROUTING */
+
+static void kvm_init_irq_routing(KVMState *s)
+{
+}
+#endif /* !KVM_CAP_IRQ_ROUTING */
+
+static int kvm_irqchip_create(KVMState *s)
+{
+    QemuOptsList *list = qemu_find_opts("machine");
+    int ret;
+
+    if (QTAILQ_EMPTY(&list->head) ||
+        !qemu_opt_get_bool(QTAILQ_FIRST(&list->head),
+                           "kernel_irqchip", false) ||
+        !kvm_check_extension(s, KVM_CAP_IRQCHIP)) {
+        return 0;
+    }
+
+    ret = kvm_vm_ioctl(s, KVM_CREATE_IRQCHIP);
+    if (ret < 0) {
+        fprintf(stderr, "Create kernel irqchip failed\n");
+        return ret;
+    }
+
+    s->irqchip_inject_ioctl = KVM_IRQ_LINE;
+    if (kvm_check_extension(s, KVM_CAP_IRQ_INJECT_STATUS)) {
+        s->irqchip_inject_ioctl = KVM_IRQ_LINE_STATUS;
+    }
+    s->irqchip_in_kernel = 1;
+
+    kvm_init_irq_routing(s);
+
+    return 0;
+}
+
 int kvm_init(void)
 {
     static const char upgrade_note[] =
@@ -788,6 +927,11 @@ int kvm_init(void)
         goto err;
     }
 
+    ret = kvm_irqchip_create(s);
+    if (ret < 0) {
+        goto err;
+    }
+
     kvm_state = s;
     cpu_register_phys_memory_client(&kvm_cpu_phys_memory_client);
 
@@ -1122,6 +1266,11 @@ int kvm_has_many_ioeventfds(void)
     return kvm_state->many_ioeventfds;
 }
 
+int kvm_has_gsi_routing(void)
+{
+    return kvm_check_extension(kvm_state, KVM_CAP_IRQ_ROUTING);
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
     if (!kvm_has_sync_mmu()) {
diff --git a/kvm.h b/kvm.h
index 243b063..0d6c453 100644
--- a/kvm.h
+++ b/kvm.h
@@ -51,6 +51,7 @@ int kvm_has_debugregs(void);
 int kvm_has_xsave(void);
 int kvm_has_xcrs(void);
 int kvm_has_many_ioeventfds(void);
+int kvm_has_gsi_routing(void);
 
 #ifdef NEED_CPU_H
 int kvm_init_vcpu(CPUState *env);
@@ -124,6 +125,13 @@ void kvm_arch_reset_vcpu(CPUState *env);
 int kvm_arch_on_sigbus_vcpu(CPUState *env, int code, void *addr);
 int kvm_arch_on_sigbus(int code, void *addr);
 
+void kvm_arch_init_irq_routing(KVMState *s);
+
+int kvm_irqchip_set_irq(KVMState *s, int irq, int level);
+
+void kvm_irqchip_add_route(KVMState *s, int gsi, int irqchip, int pin);
+int kvm_irqchip_commit_routes(KVMState *s);
+
 struct kvm_guest_debug;
 struct kvm_debug_exit_arch;
 
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index d206852..9d1191f 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1879,3 +1879,14 @@ bool kvm_arch_stop_on_emulation_error(CPUState *env)
     return !(env->cr[0] & CR0_PE_MASK) ||
            ((env->segs[R_CS].selector  & 3) != 3);
 }
+
+void kvm_arch_init_irq_routing(KVMState *s)
+{
+    if (!kvm_check_extension(s, KVM_CAP_IRQ_ROUTING)) {
+        /* If kernel can't do irq routing, interrupt source
+         * override 0->2 cannot be set up as required by HPET.
+         * So we have to disable it.
+         */
+        no_hpet = 1;
+    }
+}
-- 
1.7.3.4


* [PATCH v5 12/16] kvm: x86: Establish IRQ0 override control
  2011-12-15 12:33 ` [Qemu-devel] " Jan Kiszka
@ 2011-12-15 12:33   ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, qemu-devel, Anthony Liguori, Michael S. Tsirkin, Blue Swirl

KVM is forced to disable the IRQ0 override when we run with in-kernel
irqchip but without IRQ routing support in the kernel. Set the fwcfg
value accordingly. This aligns us with qemu-kvm.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/pc.c    |    3 ++-
 kvm-all.c  |    5 +++++
 kvm-stub.c |    5 +++++
 kvm.h      |    2 ++
 sysemu.h   |    1 -
 vl.c       |    1 -
 6 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index ee6e59b..066edc4 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -39,6 +39,7 @@
 #include "msi.h"
 #include "sysbus.h"
 #include "sysemu.h"
+#include "kvm.h"
 #include "blockdev.h"
 #include "ui/qemu-spice.h"
 #include "memory.h"
@@ -609,7 +610,7 @@ static void *bochs_bios_init(void)
     fw_cfg_add_i64(fw_cfg, FW_CFG_RAM_SIZE, (uint64_t)ram_size);
     fw_cfg_add_bytes(fw_cfg, FW_CFG_ACPI_TABLES, (uint8_t *)acpi_tables,
                      acpi_tables_len);
-    fw_cfg_add_bytes(fw_cfg, FW_CFG_IRQ0_OVERRIDE, &irq0override, 1);
+    fw_cfg_add_i32(fw_cfg, FW_CFG_IRQ0_OVERRIDE, kvm_allows_irq0_override());
 
     smbios_table = smbios_get_table(&smbios_len);
     if (smbios_table)
diff --git a/kvm-all.c b/kvm-all.c
index 8958abd..7387dd3 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1271,6 +1271,11 @@ int kvm_has_gsi_routing(void)
     return kvm_check_extension(kvm_state, KVM_CAP_IRQ_ROUTING);
 }
 
+int kvm_allows_irq0_override(void)
+{
+    return !kvm_enabled() || !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
     if (!kvm_has_sync_mmu()) {
diff --git a/kvm-stub.c b/kvm-stub.c
index 06064b9..6c2b06b 100644
--- a/kvm-stub.c
+++ b/kvm-stub.c
@@ -78,6 +78,11 @@ int kvm_has_many_ioeventfds(void)
     return 0;
 }
 
+int kvm_allows_irq0_override(void)
+{
+    return 1;
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
 }
diff --git a/kvm.h b/kvm.h
index 0d6c453..a3c87af 100644
--- a/kvm.h
+++ b/kvm.h
@@ -53,6 +53,8 @@ int kvm_has_xcrs(void);
 int kvm_has_many_ioeventfds(void);
 int kvm_has_gsi_routing(void);
 
+int kvm_allows_irq0_override(void);
+
 #ifdef NEED_CPU_H
 int kvm_init_vcpu(CPUState *env);
 
diff --git a/sysemu.h b/sysemu.h
index 22cd720..3bd896e 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -102,7 +102,6 @@ extern int vga_interface_type;
 extern int graphic_width;
 extern int graphic_height;
 extern int graphic_depth;
-extern uint8_t irq0override;
 extern DisplayType display_type;
 extern const char *keyboard_layout;
 extern int win2k_install_hack;
diff --git a/vl.c b/vl.c
index de5ecef..f9a8caf 100644
--- a/vl.c
+++ b/vl.c
@@ -218,7 +218,6 @@ int no_reboot = 0;
 int no_shutdown = 0;
 int cursor_hide = 1;
 int graphic_rotate = 0;
-uint8_t irq0override = 1;
 const char *watchdog;
 QEMUOptionRom option_rom[MAX_OPTION_ROMS];
 int nb_option_roms;
-- 
1.7.3.4



* [Qemu-devel] [PATCH v5 12/16] kvm: x86: Establish IRQ0 override control
@ 2011-12-15 12:33   ` Jan Kiszka
  0 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: Blue Swirl, Anthony Liguori, qemu-devel, kvm, Michael S. Tsirkin

KVM is forced to disable the IRQ0 override when we run with in-kernel
irqchip but without IRQ routing support in the kernel. Set the fwcfg
value accordingly. This aligns us with qemu-kvm.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/pc.c    |    3 ++-
 kvm-all.c  |    5 +++++
 kvm-stub.c |    5 +++++
 kvm.h      |    2 ++
 sysemu.h   |    1 -
 vl.c       |    1 -
 6 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index ee6e59b..066edc4 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -39,6 +39,7 @@
 #include "msi.h"
 #include "sysbus.h"
 #include "sysemu.h"
+#include "kvm.h"
 #include "blockdev.h"
 #include "ui/qemu-spice.h"
 #include "memory.h"
@@ -609,7 +610,7 @@ static void *bochs_bios_init(void)
     fw_cfg_add_i64(fw_cfg, FW_CFG_RAM_SIZE, (uint64_t)ram_size);
     fw_cfg_add_bytes(fw_cfg, FW_CFG_ACPI_TABLES, (uint8_t *)acpi_tables,
                      acpi_tables_len);
-    fw_cfg_add_bytes(fw_cfg, FW_CFG_IRQ0_OVERRIDE, &irq0override, 1);
+    fw_cfg_add_i32(fw_cfg, FW_CFG_IRQ0_OVERRIDE, kvm_allows_irq0_override());
 
     smbios_table = smbios_get_table(&smbios_len);
     if (smbios_table)
diff --git a/kvm-all.c b/kvm-all.c
index 8958abd..7387dd3 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1271,6 +1271,11 @@ int kvm_has_gsi_routing(void)
     return kvm_check_extension(kvm_state, KVM_CAP_IRQ_ROUTING);
 }
 
+int kvm_allows_irq0_override(void)
+{
+    return !kvm_enabled() || !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
     if (!kvm_has_sync_mmu()) {
diff --git a/kvm-stub.c b/kvm-stub.c
index 06064b9..6c2b06b 100644
--- a/kvm-stub.c
+++ b/kvm-stub.c
@@ -78,6 +78,11 @@ int kvm_has_many_ioeventfds(void)
     return 0;
 }
 
+int kvm_allows_irq0_override(void)
+{
+    return 1;
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
 }
diff --git a/kvm.h b/kvm.h
index 0d6c453..a3c87af 100644
--- a/kvm.h
+++ b/kvm.h
@@ -53,6 +53,8 @@ int kvm_has_xcrs(void);
 int kvm_has_many_ioeventfds(void);
 int kvm_has_gsi_routing(void);
 
+int kvm_allows_irq0_override(void);
+
 #ifdef NEED_CPU_H
 int kvm_init_vcpu(CPUState *env);
 
diff --git a/sysemu.h b/sysemu.h
index 22cd720..3bd896e 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -102,7 +102,6 @@ extern int vga_interface_type;
 extern int graphic_width;
 extern int graphic_height;
 extern int graphic_depth;
-extern uint8_t irq0override;
 extern DisplayType display_type;
 extern const char *keyboard_layout;
 extern int win2k_install_hack;
diff --git a/vl.c b/vl.c
index de5ecef..f9a8caf 100644
--- a/vl.c
+++ b/vl.c
@@ -218,7 +218,6 @@ int no_reboot = 0;
 int no_shutdown = 0;
 int cursor_hide = 1;
 int graphic_rotate = 0;
-uint8_t irq0override = 1;
 const char *watchdog;
 QEMUOptionRom option_rom[MAX_OPTION_ROMS];
 int nb_option_roms;
-- 
1.7.3.4


* [PATCH v5 13/16] kvm: x86: Add user space part for in-kernel APIC
  2011-12-15 12:33 ` [Qemu-devel] " Jan Kiszka
@ 2011-12-15 12:33   ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, qemu-devel, Anthony Liguori, Michael S. Tsirkin, Blue Swirl,
	Lai Jiangshan

This introduces the alternative APIC backend which makes use of KVM's
in-kernel device model. External NMI injection via LINT1 is emulated by
checking the current state of the in-kernel APIC, only injecting an NMI
into the VCPU if LINT1 is unmasked and configured to DM_NMI.

MSI is not yet supported, so we disable it when the in-kernel model is
in use.

CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
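The register layout and the LINT1 check described above can be tried in
isolation with the sketch below. It rests on a few assumptions: LapicPage
is a hypothetical stand-in for struct kvm_lapic_state (a 0x400-byte
register page), the lapic_* helpers and constant names are made up for
this note, and memcpy() is used in place of the pointer cast found in
the patch. Register 0x36 is LVT LINT1, bit 16 is the mask bit, and
delivery mode 4 in bits 8 to 10 selects NMI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LAPIC_REG_PAGE_SIZE 0x400       /* size of the APIC register page */
#define LVT_LINT1_REG       0x36        /* LVT LINT1, byte offset 0x360 */
#define LVT_MASKED          (1U << 16)  /* LVT mask bit */
#define DM_NMI              4U          /* delivery mode: NMI */

/* Hypothetical stand-in for struct kvm_lapic_state. */
typedef struct LapicPage {
    char regs[LAPIC_REG_PAGE_SIZE];
} LapicPage;

/* Each 32-bit register occupies a 16-byte slot, hence reg_id << 4. */
static void lapic_set_reg(LapicPage *p, int reg_id, uint32_t val)
{
    assert(reg_id >= 0 && reg_id < LAPIC_REG_PAGE_SIZE / 16);
    memcpy(p->regs + (reg_id << 4), &val, sizeof(val));
}

static uint32_t lapic_get_reg(const LapicPage *p, int reg_id)
{
    uint32_t val;

    assert(reg_id >= 0 && reg_id < LAPIC_REG_PAGE_SIZE / 16);
    memcpy(&val, p->regs + (reg_id << 4), sizeof(val));
    return val;
}

/* An external NMI is forwarded only if LINT1 is unmasked and its
 * delivery mode field is NMI. */
static bool lint1_delivers_nmi(uint32_t lvt)
{
    return !(lvt & LVT_MASKED) && ((lvt >> 8) & 7) == DM_NMI;
}
```

Writing DM_NMI << 8 to the LINT1 slot and reading it back gives 0x400,
which passes the check; setting the mask bit on the same value makes the
check fail.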
 Makefile.target   |    2 +-
 hw/kvm/apic.c     |  138 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/pc.c           |   15 ++++--
 kvm.h             |    4 ++
 target-i386/kvm.c |   38 +++++++++++++++
 5 files changed, 191 insertions(+), 6 deletions(-)
 create mode 100644 hw/kvm/apic.c

diff --git a/Makefile.target b/Makefile.target
index b549988..76de485 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -236,7 +236,7 @@ obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
 obj-i386-y += pc_piix.o
-obj-i386-$(CONFIG_KVM) += kvm/clock.o
+obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o
 obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 
 # shared objects
diff --git a/hw/kvm/apic.c b/hw/kvm/apic.c
new file mode 100644
index 0000000..04005ae
--- /dev/null
+++ b/hw/kvm/apic.c
@@ -0,0 +1,138 @@
+/*
+ * KVM in-kernel APIC support
+ *
+ * Copyright (c) 2011 Siemens AG
+ *
+ * Authors:
+ *  Jan Kiszka          <jan.kiszka@siemens.com>
+ *
+ * This work is licensed under the terms of the GNU GPL version 2.
+ * See the COPYING file in the top-level directory.
+ */
+#include "hw/apic_internal.h"
+#include "kvm.h"
+
+static inline void kvm_apic_set_reg(struct kvm_lapic_state *kapic,
+                                   int reg_id, uint32_t val)
+{
+    *((uint32_t *)(kapic->regs + (reg_id << 4))) = val;
+}
+
+static inline uint32_t kvm_apic_get_reg(struct kvm_lapic_state *kapic,
+                                       int reg_id)
+{
+    return *((uint32_t *)(kapic->regs + (reg_id << 4)));
+}
+
+void kvm_put_apic_state(DeviceState *d, struct kvm_lapic_state *kapic)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+    int i;
+
+    memset(kapic, 0, sizeof(*kapic));
+    kvm_apic_set_reg(kapic, 0x2, s->id << 24);
+    kvm_apic_set_reg(kapic, 0x8, s->tpr);
+    kvm_apic_set_reg(kapic, 0xd, s->log_dest << 24);
+    kvm_apic_set_reg(kapic, 0xe, s->dest_mode << 28 | 0x0fffffff);
+    kvm_apic_set_reg(kapic, 0xf, s->spurious_vec);
+    for (i = 0; i < 8; i++) {
+        kvm_apic_set_reg(kapic, 0x10 + i, s->isr[i]);
+        kvm_apic_set_reg(kapic, 0x18 + i, s->tmr[i]);
+        kvm_apic_set_reg(kapic, 0x20 + i, s->irr[i]);
+    }
+    kvm_apic_set_reg(kapic, 0x28, s->esr);
+    kvm_apic_set_reg(kapic, 0x30, s->icr[0]);
+    kvm_apic_set_reg(kapic, 0x31, s->icr[1]);
+    for (i = 0; i < APIC_LVT_NB; i++) {
+        kvm_apic_set_reg(kapic, 0x32 + i, s->lvt[i]);
+    }
+    kvm_apic_set_reg(kapic, 0x38, s->initial_count);
+    kvm_apic_set_reg(kapic, 0x3e, s->divide_conf);
+}
+
+void kvm_get_apic_state(DeviceState *d, struct kvm_lapic_state *kapic)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+    int i, v;
+
+    s->id = kvm_apic_get_reg(kapic, 0x2) >> 24;
+    s->tpr = kvm_apic_get_reg(kapic, 0x8);
+    s->arb_id = kvm_apic_get_reg(kapic, 0x9);
+    s->log_dest = kvm_apic_get_reg(kapic, 0xd) >> 24;
+    s->dest_mode = kvm_apic_get_reg(kapic, 0xe) >> 28;
+    s->spurious_vec = kvm_apic_get_reg(kapic, 0xf);
+    for (i = 0; i < 8; i++) {
+        s->isr[i] = kvm_apic_get_reg(kapic, 0x10 + i);
+        s->tmr[i] = kvm_apic_get_reg(kapic, 0x18 + i);
+        s->irr[i] = kvm_apic_get_reg(kapic, 0x20 + i);
+    }
+    s->esr = kvm_apic_get_reg(kapic, 0x28);
+    s->icr[0] = kvm_apic_get_reg(kapic, 0x30);
+    s->icr[1] = kvm_apic_get_reg(kapic, 0x31);
+    for (i = 0; i < APIC_LVT_NB; i++) {
+        s->lvt[i] = kvm_apic_get_reg(kapic, 0x32 + i);
+    }
+    s->initial_count = kvm_apic_get_reg(kapic, 0x38);
+    s->divide_conf = kvm_apic_get_reg(kapic, 0x3e);
+
+    v = (s->divide_conf & 3) | ((s->divide_conf >> 1) & 4);
+    s->count_shift = (v + 1) & 7;
+
+    s->initial_count_load_time = qemu_get_clock_ns(vm_clock);
+    apic_next_timer(s, s->initial_count_load_time);
+}
+
+static void kvm_apic_set_base(APICState *s, uint64_t val)
+{
+    s->apicbase = val;
+}
+
+static void kvm_apic_set_tpr(APICState *s, uint8_t val)
+{
+    s->tpr = (val & 0x0f) << 4;
+}
+
+static void do_inject_external_nmi(void *data)
+{
+    APICState *s = data;
+    CPUState *env = s->cpu_env;
+    uint32_t lvt;
+    int ret;
+
+    cpu_synchronize_state(env);
+
+    lvt = s->lvt[APIC_LVT_LINT1];
+    if (!(lvt & APIC_LVT_MASKED) && ((lvt >> 8) & 7) == APIC_DM_NMI) {
+        ret = kvm_vcpu_ioctl(env, KVM_NMI);
+        if (ret < 0) {
+            fprintf(stderr, "KVM: injection failed, NMI lost (%s)\n",
+                    strerror(-ret));
+        }
+    }
+}
+
+static void kvm_apic_external_nmi(APICState *s)
+{
+    run_on_cpu(s->cpu_env, do_inject_external_nmi, s);
+}
+
+static void kvm_apic_backend_init(APICState *s)
+{
+    memory_region_init_reservation(&s->io_memory, "kvm-apic-msi",
+                                   MSI_SPACE_SIZE);
+}
+
+static APICBackend kvm_apic_backend = {
+    .name = "KVM",
+    .init = kvm_apic_backend_init,
+    .set_base = kvm_apic_set_base,
+    .set_tpr = kvm_apic_set_tpr,
+    .external_nmi = kvm_apic_external_nmi,
+};
+
+static void kvm_apic_register_backend(void)
+{
+    apic_register_backend(&kvm_apic_backend);
+}
+
+device_init(kvm_apic_register_backend)
diff --git a/hw/pc.c b/hw/pc.c
index 066edc4..8c8aa49 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -878,27 +878,32 @@ DeviceState *cpu_get_current_apic(void)
 
 static DeviceState *apic_init(void *env, uint8_t apic_id)
 {
+    const char *backend = "QEMU";
     DeviceState *dev;
-    SysBusDevice *d;
     static int apic_mapped;
 
     dev = qdev_create(NULL, "apic");
     qdev_prop_set_uint8(dev, "id", apic_id);
     qdev_prop_set_ptr(dev, "cpu_env", env);
-    qdev_prop_set_string(dev, "backend", g_strdup("QEMU"));
+    if (kvm_enabled() && kvm_irqchip_in_kernel()) {
+        backend = "KVM";
+    }
+    qdev_prop_set_string(dev, "backend", g_strdup(backend));
     qdev_init_nofail(dev);
-    d = sysbus_from_qdev(dev);
 
     /* XXX: mapping more APICs at the same memory location */
     if (apic_mapped == 0) {
         /* NOTE: the APIC is directly connected to the CPU - it is not
            on the global memory bus. */
         /* XXX: what if the base changes? */
-        sysbus_mmio_map(d, 0, MSI_ADDR_BASE);
+        sysbus_mmio_map(sysbus_from_qdev(dev), 0, MSI_ADDR_BASE);
         apic_mapped = 1;
     }
 
-    msi_supported = true;
+    /* KVM does not support MSI yet. */
+    if (!kvm_enabled() || !kvm_irqchip_in_kernel()) {
+        msi_supported = true;
+    }
 
     return dev;
 }
diff --git a/kvm.h b/kvm.h
index a3c87af..f866296 100644
--- a/kvm.h
+++ b/kvm.h
@@ -31,6 +31,7 @@ extern int kvm_allowed;
 #endif
 
 struct kvm_run;
+struct kvm_lapic_state;
 
 typedef struct KVMCapabilityInfo {
     const char *name;
@@ -134,6 +135,9 @@ int kvm_irqchip_set_irq(KVMState *s, int irq, int level);
 void kvm_irqchip_add_route(KVMState *s, int gsi, int irqchip, int pin);
 int kvm_irqchip_commit_routes(KVMState *s);
 
+void kvm_put_apic_state(DeviceState *d, struct kvm_lapic_state *kapic);
+void kvm_get_apic_state(DeviceState *d, struct kvm_lapic_state *kapic);
+
 struct kvm_guest_debug;
 struct kvm_debug_exit_arch;
 
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 9d1191f..274b3cb 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1277,6 +1277,36 @@ static int kvm_get_mp_state(CPUState *env)
     return 0;
 }
 
+static int kvm_get_apic(CPUState *env)
+{
+    DeviceState *apic = env->apic_state;
+    struct kvm_lapic_state kapic;
+    int ret;
+
+    if (apic && kvm_enabled() && kvm_irqchip_in_kernel()) {
+        ret = kvm_vcpu_ioctl(env, KVM_GET_LAPIC, &kapic);
+        if (ret < 0) {
+            return ret;
+        }
+
+        kvm_get_apic_state(apic, &kapic);
+    }
+    return 0;
+}
+
+static int kvm_put_apic(CPUState *env)
+{
+    DeviceState *apic = env->apic_state;
+    struct kvm_lapic_state kapic;
+
+    if (apic && kvm_enabled() && kvm_irqchip_in_kernel()) {
+        kvm_put_apic_state(apic, &kapic);
+
+        return kvm_vcpu_ioctl(env, KVM_SET_LAPIC, &kapic);
+    }
+    return 0;
+}
+
 static int kvm_put_vcpu_events(CPUState *env, int level)
 {
     struct kvm_vcpu_events events;
@@ -1450,6 +1480,10 @@ int kvm_arch_put_registers(CPUState *env, int level)
         if (ret < 0) {
             return ret;
         }
+        ret = kvm_put_apic(env);
+        if (ret < 0) {
+            return ret;
+        }
     }
     ret = kvm_put_vcpu_events(env, level);
     if (ret < 0) {
@@ -1497,6 +1531,10 @@ int kvm_arch_get_registers(CPUState *env)
     if (ret < 0) {
         return ret;
     }
+    ret = kvm_get_apic(env);
+    if (ret < 0) {
+        return ret;
+    }
     ret = kvm_get_vcpu_events(env);
     if (ret < 0) {
         return ret;
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 99+ messages in thread

* [Qemu-devel] [PATCH v5 13/16] kvm: x86: Add user space part for in-kernel APIC
@ 2011-12-15 12:33   ` Jan Kiszka
  0 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: Anthony Liguori, Lai Jiangshan, kvm, Michael S. Tsirkin,
	qemu-devel, Blue Swirl

This introduces the alternative APIC backend, which makes use of KVM's
in-kernel device model. External NMI injection via LINT1 is emulated by
checking the current state of the in-kernel APIC and injecting an NMI
into the VCPU only if LINT1 is unmasked and configured for DM_NMI.
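
As a rough illustration of the LINT1 check described above (not part of
the patch itself), the test can be sketched in isolation. The constant
values are assumptions mirroring QEMU's APIC model, and
lint1_delivers_nmi() is a hypothetical helper name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed constants, mirroring the values used by QEMU's APIC model. */
#define APIC_LVT_LINT1   4            /* index of LINT1 in the LVT array */
#define APIC_LVT_NB      6            /* number of LVT entries */
#define APIC_LVT_MASKED  (1u << 16)   /* LVT mask bit */
#define APIC_DM_NMI      4u           /* delivery mode value for NMI */

/* Hypothetical helper: true if an external NMI raised on LINT1 should be
 * injected, i.e. LINT1 is unmasked and its delivery mode is NMI. */
static bool lint1_delivers_nmi(const uint32_t *lvt)
{
    uint32_t lint1;

    assert(lvt != NULL);
    lint1 = lvt[APIC_LVT_LINT1];

    /* The delivery mode lives in bits 10:8 of the LVT entry. */
    return !(lint1 & APIC_LVT_MASKED) && ((lint1 >> 8) & 7) == APIC_DM_NMI;
}
```

In the patch this test runs on the VCPU thread after
cpu_synchronize_state(), so the LVT contents reflect the in-kernel APIC
before the decision is made.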

MSI is not yet supported with the in-kernel model, so we disable it when
that model is in use.

CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 Makefile.target   |    2 +-
 hw/kvm/apic.c     |  138 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/pc.c           |   15 ++++--
 kvm.h             |    4 ++
 target-i386/kvm.c |   38 +++++++++++++++
 5 files changed, 191 insertions(+), 6 deletions(-)
 create mode 100644 hw/kvm/apic.c

diff --git a/Makefile.target b/Makefile.target
index b549988..76de485 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -236,7 +236,7 @@ obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
 obj-i386-y += pc_piix.o
-obj-i386-$(CONFIG_KVM) += kvm/clock.o
+obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o
 obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 
 # shared objects
diff --git a/hw/kvm/apic.c b/hw/kvm/apic.c
new file mode 100644
index 0000000..04005ae
--- /dev/null
+++ b/hw/kvm/apic.c
@@ -0,0 +1,138 @@
+/*
+ * KVM in-kernel APIC support
+ *
+ * Copyright (c) 2011 Siemens AG
+ *
+ * Authors:
+ *  Jan Kiszka          <jan.kiszka@siemens.com>
+ *
+ * This work is licensed under the terms of the GNU GPL version 2.
+ * See the COPYING file in the top-level directory.
+ */
+#include "hw/apic_internal.h"
+#include "kvm.h"
+
+static inline void kvm_apic_set_reg(struct kvm_lapic_state *kapic,
+                                   int reg_id, uint32_t val)
+{
+    *((uint32_t *)(kapic->regs + (reg_id << 4))) = val;
+}
+
+static inline uint32_t kvm_apic_get_reg(struct kvm_lapic_state *kapic,
+                                       int reg_id)
+{
+    return *((uint32_t *)(kapic->regs + (reg_id << 4)));
+}
+
+void kvm_put_apic_state(DeviceState *d, struct kvm_lapic_state *kapic)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+    int i;
+
+    memset(kapic, 0, sizeof(*kapic));
+    kvm_apic_set_reg(kapic, 0x2, s->id << 24);
+    kvm_apic_set_reg(kapic, 0x8, s->tpr);
+    kvm_apic_set_reg(kapic, 0xd, s->log_dest << 24);
+    kvm_apic_set_reg(kapic, 0xe, s->dest_mode << 28 | 0x0fffffff);
+    kvm_apic_set_reg(kapic, 0xf, s->spurious_vec);
+    for (i = 0; i < 8; i++) {
+        kvm_apic_set_reg(kapic, 0x10 + i, s->isr[i]);
+        kvm_apic_set_reg(kapic, 0x18 + i, s->tmr[i]);
+        kvm_apic_set_reg(kapic, 0x20 + i, s->irr[i]);
+    }
+    kvm_apic_set_reg(kapic, 0x28, s->esr);
+    kvm_apic_set_reg(kapic, 0x30, s->icr[0]);
+    kvm_apic_set_reg(kapic, 0x31, s->icr[1]);
+    for (i = 0; i < APIC_LVT_NB; i++) {
+        kvm_apic_set_reg(kapic, 0x32 + i, s->lvt[i]);
+    }
+    kvm_apic_set_reg(kapic, 0x38, s->initial_count);
+    kvm_apic_set_reg(kapic, 0x3e, s->divide_conf);
+}
+
+void kvm_get_apic_state(DeviceState *d, struct kvm_lapic_state *kapic)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+    int i, v;
+
+    s->id = kvm_apic_get_reg(kapic, 0x2) >> 24;
+    s->tpr = kvm_apic_get_reg(kapic, 0x8);
+    s->arb_id = kvm_apic_get_reg(kapic, 0x9);
+    s->log_dest = kvm_apic_get_reg(kapic, 0xd) >> 24;
+    s->dest_mode = kvm_apic_get_reg(kapic, 0xe) >> 28;
+    s->spurious_vec = kvm_apic_get_reg(kapic, 0xf);
+    for (i = 0; i < 8; i++) {
+        s->isr[i] = kvm_apic_get_reg(kapic, 0x10 + i);
+        s->tmr[i] = kvm_apic_get_reg(kapic, 0x18 + i);
+        s->irr[i] = kvm_apic_get_reg(kapic, 0x20 + i);
+    }
+    s->esr = kvm_apic_get_reg(kapic, 0x28);
+    s->icr[0] = kvm_apic_get_reg(kapic, 0x30);
+    s->icr[1] = kvm_apic_get_reg(kapic, 0x31);
+    for (i = 0; i < APIC_LVT_NB; i++) {
+        s->lvt[i] = kvm_apic_get_reg(kapic, 0x32 + i);
+    }
+    s->initial_count = kvm_apic_get_reg(kapic, 0x38);
+    s->divide_conf = kvm_apic_get_reg(kapic, 0x3e);
+
+    v = (s->divide_conf & 3) | ((s->divide_conf >> 1) & 4);
+    s->count_shift = (v + 1) & 7;
+
+    s->initial_count_load_time = qemu_get_clock_ns(vm_clock);
+    apic_next_timer(s, s->initial_count_load_time);
+}
+
+static void kvm_apic_set_base(APICState *s, uint64_t val)
+{
+    s->apicbase = val;
+}
+
+static void kvm_apic_set_tpr(APICState *s, uint8_t val)
+{
+    s->tpr = (val & 0x0f) << 4;
+}
+
+static void do_inject_external_nmi(void *data)
+{
+    APICState *s = data;
+    CPUState *env = s->cpu_env;
+    uint32_t lvt;
+    int ret;
+
+    cpu_synchronize_state(env);
+
+    lvt = s->lvt[APIC_LVT_LINT1];
+    if (!(lvt & APIC_LVT_MASKED) && ((lvt >> 8) & 7) == APIC_DM_NMI) {
+        ret = kvm_vcpu_ioctl(env, KVM_NMI);
+        if (ret < 0) {
+            fprintf(stderr, "KVM: injection failed, NMI lost (%s)\n",
+                    strerror(-ret));
+        }
+    }
+}
+
+static void kvm_apic_external_nmi(APICState *s)
+{
+    run_on_cpu(s->cpu_env, do_inject_external_nmi, s);
+}
+
+static void kvm_apic_backend_init(APICState *s)
+{
+    memory_region_init_reservation(&s->io_memory, "kvm-apic-msi",
+                                   MSI_SPACE_SIZE);
+}
+
+static APICBackend kvm_apic_backend = {
+    .name = "KVM",
+    .init = kvm_apic_backend_init,
+    .set_base = kvm_apic_set_base,
+    .set_tpr = kvm_apic_set_tpr,
+    .external_nmi = kvm_apic_external_nmi,
+};
+
+static void kvm_apic_register_backend(void)
+{
+    apic_register_backend(&kvm_apic_backend);
+}
+
+device_init(kvm_apic_register_backend)
diff --git a/hw/pc.c b/hw/pc.c
index 066edc4..8c8aa49 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -878,27 +878,32 @@ DeviceState *cpu_get_current_apic(void)
 
 static DeviceState *apic_init(void *env, uint8_t apic_id)
 {
+    const char *backend = "QEMU";
     DeviceState *dev;
-    SysBusDevice *d;
     static int apic_mapped;
 
     dev = qdev_create(NULL, "apic");
     qdev_prop_set_uint8(dev, "id", apic_id);
     qdev_prop_set_ptr(dev, "cpu_env", env);
-    qdev_prop_set_string(dev, "backend", g_strdup("QEMU"));
+    if (kvm_enabled() && kvm_irqchip_in_kernel()) {
+        backend = "KVM";
+    }
+    qdev_prop_set_string(dev, "backend", g_strdup(backend));
     qdev_init_nofail(dev);
-    d = sysbus_from_qdev(dev);
 
     /* XXX: mapping more APICs at the same memory location */
     if (apic_mapped == 0) {
         /* NOTE: the APIC is directly connected to the CPU - it is not
            on the global memory bus. */
         /* XXX: what if the base changes? */
-        sysbus_mmio_map(d, 0, MSI_ADDR_BASE);
+        sysbus_mmio_map(sysbus_from_qdev(dev), 0, MSI_ADDR_BASE);
         apic_mapped = 1;
     }
 
-    msi_supported = true;
+    /* KVM does not support MSI yet. */
+    if (!kvm_enabled() || !kvm_irqchip_in_kernel()) {
+        msi_supported = true;
+    }
 
     return dev;
 }
diff --git a/kvm.h b/kvm.h
index a3c87af..f866296 100644
--- a/kvm.h
+++ b/kvm.h
@@ -31,6 +31,7 @@ extern int kvm_allowed;
 #endif
 
 struct kvm_run;
+struct kvm_lapic_state;
 
 typedef struct KVMCapabilityInfo {
     const char *name;
@@ -134,6 +135,9 @@ int kvm_irqchip_set_irq(KVMState *s, int irq, int level);
 void kvm_irqchip_add_route(KVMState *s, int gsi, int irqchip, int pin);
 int kvm_irqchip_commit_routes(KVMState *s);
 
+void kvm_put_apic_state(DeviceState *d, struct kvm_lapic_state *kapic);
+void kvm_get_apic_state(DeviceState *d, struct kvm_lapic_state *kapic);
+
 struct kvm_guest_debug;
 struct kvm_debug_exit_arch;
 
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 9d1191f..274b3cb 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1277,6 +1277,36 @@ static int kvm_get_mp_state(CPUState *env)
     return 0;
 }
 
+static int kvm_get_apic(CPUState *env)
+{
+    DeviceState *apic = env->apic_state;
+    struct kvm_lapic_state kapic;
+    int ret;
+
+    if (apic && kvm_enabled() && kvm_irqchip_in_kernel()) {
+        ret = kvm_vcpu_ioctl(env, KVM_GET_LAPIC, &kapic);
+        if (ret < 0) {
+            return ret;
+        }
+
+        kvm_get_apic_state(apic, &kapic);
+    }
+    return 0;
+}
+
+static int kvm_put_apic(CPUState *env)
+{
+    DeviceState *apic = env->apic_state;
+    struct kvm_lapic_state kapic;
+
+    if (apic && kvm_enabled() && kvm_irqchip_in_kernel()) {
+        kvm_put_apic_state(apic, &kapic);
+
+        return kvm_vcpu_ioctl(env, KVM_SET_LAPIC, &kapic);
+    }
+    return 0;
+}
+
 static int kvm_put_vcpu_events(CPUState *env, int level)
 {
     struct kvm_vcpu_events events;
@@ -1450,6 +1480,10 @@ int kvm_arch_put_registers(CPUState *env, int level)
         if (ret < 0) {
             return ret;
         }
+        ret = kvm_put_apic(env);
+        if (ret < 0) {
+            return ret;
+        }
     }
     ret = kvm_put_vcpu_events(env, level);
     if (ret < 0) {
@@ -1497,6 +1531,10 @@ int kvm_arch_get_registers(CPUState *env)
     if (ret < 0) {
         return ret;
     }
+    ret = kvm_get_apic(env);
+    if (ret < 0) {
+        return ret;
+    }
     ret = kvm_get_vcpu_events(env);
     if (ret < 0) {
         return ret;
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 99+ messages in thread

* [PATCH v5 14/16] kvm: x86: Add user space part for in-kernel i8259
  2011-12-15 12:33 ` [Qemu-devel] " Jan Kiszka
@ 2011-12-15 12:33   ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, qemu-devel, Anthony Liguori, Michael S. Tsirkin, Blue Swirl

Introduce the alternative i8259 backend that exploits KVM in-kernel
acceleration.

The PIIX3 initialization code is also extended with KVM-specific IRQ
route setup. GSI injection in KVM mode differs from the user space
model: as the kernel can dispatch ISA-range IRQs to both IOAPIC and PIC,
we do not need to inject them separately. This is reflected by a
KVM-specific GSI handler.
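
A minimal sketch of the PIC part of that route setup, for illustration
only. The patch hands each route to kvm_irqchip_add_route() directly;
the struct irq_route record and build_pic_routes() below are
hypothetical stand-ins so the resulting table can be inspected:

```c
#include <assert.h>
#include <stddef.h>

/* Chip identifiers as defined by the KVM API headers. */
#define KVM_IRQCHIP_PIC_MASTER 0
#define KVM_IRQCHIP_PIC_SLAVE  1

/* Hypothetical record type, used here only to collect the routes. */
struct irq_route {
    int gsi;
    int irqchip;
    int pin;
};

/* Fill 'routes' with the PIC routes for GSIs 0..15 and return the count. */
static size_t build_pic_routes(struct irq_route *routes, size_t capacity)
{
    size_t n = 0;
    int gsi;

    for (gsi = 0; gsi < 16; gsi++) {
        if (gsi == 2) {
            continue;   /* master pin 2 gets no route, as in the patch */
        }
        assert(n < capacity);
        routes[n].gsi = gsi;
        routes[n].irqchip = gsi < 8 ? KVM_IRQCHIP_PIC_MASTER
                                    : KVM_IRQCHIP_PIC_SLAVE;
        routes[n].pin = gsi < 8 ? gsi : gsi - 8;
        n++;
    }
    return n;
}
```

With these routes in place, a single injection of an ISA-range GSI is
enough; the separate user space delivery to PIC and IOAPIC is not needed.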

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 Makefile.target |    2 +-
 hw/kvm/i8259.c  |  126 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/pc.h         |    1 +
 hw/pc_piix.c    |   50 ++++++++++++++++++++--
 4 files changed, 174 insertions(+), 5 deletions(-)
 create mode 100644 hw/kvm/i8259.c

diff --git a/Makefile.target b/Makefile.target
index 76de485..fb10143 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -236,7 +236,7 @@ obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
 obj-i386-y += pc_piix.o
-obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o
+obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o kvm/i8259.o
 obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 
 # shared objects
diff --git a/hw/kvm/i8259.c b/hw/kvm/i8259.c
new file mode 100644
index 0000000..d4a1339
--- /dev/null
+++ b/hw/kvm/i8259.c
@@ -0,0 +1,126 @@
+/*
+ * KVM in-kernel PIC (i8259) support
+ *
+ * Copyright (c) 2011 Siemens AG
+ *
+ * Authors:
+ *  Jan Kiszka          <jan.kiszka@siemens.com>
+ *
+ * This work is licensed under the terms of the GNU GPL version 2.
+ * See the COPYING file in the top-level directory.
+ */
+#include "hw/i8259_internal.h"
+#include "hw/apic_internal.h"
+#include "kvm.h"
+
+static void kvm_pic_get(PicState *s)
+{
+    struct kvm_irqchip chip;
+    struct kvm_pic_state *kpic;
+    int ret;
+
+    chip.chip_id = s->master ? KVM_IRQCHIP_PIC_MASTER : KVM_IRQCHIP_PIC_SLAVE;
+    ret = kvm_vm_ioctl(kvm_state, KVM_GET_IRQCHIP, &chip);
+    if (ret < 0) {
+        fprintf(stderr, "KVM_GET_IRQCHIP failed: %s\n", strerror(-ret));
+        abort();
+    }
+
+    kpic = &chip.chip.pic;
+
+    s->last_irr = kpic->last_irr;
+    s->irr = kpic->irr;
+    s->imr = kpic->imr;
+    s->isr = kpic->isr;
+    s->priority_add = kpic->priority_add;
+    s->irq_base = kpic->irq_base;
+    s->read_reg_select = kpic->read_reg_select;
+    s->poll = kpic->poll;
+    s->special_mask = kpic->special_mask;
+    s->init_state = kpic->init_state;
+    s->auto_eoi = kpic->auto_eoi;
+    s->rotate_on_auto_eoi = kpic->rotate_on_auto_eoi;
+    s->special_fully_nested_mode = kpic->special_fully_nested_mode;
+    s->init4 = kpic->init4;
+    s->elcr = kpic->elcr;
+    s->elcr_mask = kpic->elcr_mask;
+}
+
+static void kvm_pic_put(PicState *s)
+{
+    struct kvm_irqchip chip;
+    struct kvm_pic_state *kpic;
+    int ret;
+
+    chip.chip_id = s->master ? KVM_IRQCHIP_PIC_MASTER : KVM_IRQCHIP_PIC_SLAVE;
+
+    kpic = &chip.chip.pic;
+
+    kpic->last_irr = s->last_irr;
+    kpic->irr = s->irr;
+    kpic->imr = s->imr;
+    kpic->isr = s->isr;
+    kpic->priority_add = s->priority_add;
+    kpic->irq_base = s->irq_base;
+    kpic->read_reg_select = s->read_reg_select;
+    kpic->poll = s->poll;
+    kpic->special_mask = s->special_mask;
+    kpic->init_state = s->init_state;
+    kpic->auto_eoi = s->auto_eoi;
+    kpic->rotate_on_auto_eoi = s->rotate_on_auto_eoi;
+    kpic->special_fully_nested_mode = s->special_fully_nested_mode;
+    kpic->init4 = s->init4;
+    kpic->elcr = s->elcr;
+    kpic->elcr_mask = s->elcr_mask;
+
+    ret = kvm_vm_ioctl(kvm_state, KVM_SET_IRQCHIP, &chip);
+    if (ret < 0) {
+        fprintf(stderr, "KVM_SET_IRQCHIP failed: %s\n", strerror(-ret));
+        abort();
+    }
+}
+
+static void kvm_pic_reset(PicState *s)
+{
+    pic_reset_internal(s);
+    s->elcr = 0;
+
+    kvm_pic_put(s);
+}
+
+static void kvm_pic_set_irq(void *opaque, int irq, int level)
+{
+    int delivered;
+
+    delivered = kvm_irqchip_set_irq(kvm_state, irq, level);
+    apic_report_irq_delivered(delivered);
+}
+
+static void kvm_pic_backend_init(PicState *s)
+{
+    memory_region_init_reservation(&s->base_io, "kvm-pic", 2);
+    memory_region_init_reservation(&s->elcr_io, "kvm-elcr", 1);
+}
+
+qemu_irq *kvm_i8259_init(void)
+{
+    i8259_init_chip(true, "KVM");
+    i8259_init_chip(false, "KVM");
+
+    return qemu_allocate_irqs(kvm_pic_set_irq, NULL, ISA_NUM_IRQS);
+}
+
+static PICBackend kvm_pic_backend = {
+    .name = "KVM",
+    .init = kvm_pic_backend_init,
+    .reset = kvm_pic_reset,
+    .pre_save = kvm_pic_get,
+    .post_load = kvm_pic_put,
+};
+
+static void kvm_pic_register(void)
+{
+    pic_register_backend(&kvm_pic_backend);
+}
+
+device_init(kvm_pic_register)
diff --git a/hw/pc.h b/hw/pc.h
index b7b7e40..fc6f446 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -64,6 +64,7 @@ bool parallel_mm_init(MemoryRegion *address_space,
 typedef struct PicState PicState;
 extern PicState *isa_pic;
 qemu_irq *i8259_init(qemu_irq parent_irq);
+qemu_irq *kvm_i8259_init(void);
 int pic_read_irq(PicState *s);
 int pic_get_output(PicState *s);
 void pic_info(Monitor *mon);
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 98f2822..8650319 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -53,6 +53,40 @@ static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 };
 static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 };
 static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
 
+static void kvm_piix3_setup_irq_routing(bool pci_enabled)
+{
+    KVMState *s = kvm_state;
+    int ret, i;
+
+    if (kvm_check_extension(s, KVM_CAP_IRQ_ROUTING)) {
+        for (i = 0; i < 8; ++i) {
+            if (i == 2) {
+                continue;
+            }
+            kvm_irqchip_add_route(s, i, KVM_IRQCHIP_PIC_MASTER, i);
+        }
+        for (i = 8; i < 16; ++i) {
+            kvm_irqchip_add_route(s, i, KVM_IRQCHIP_PIC_SLAVE, i - 8);
+        }
+        ret = kvm_irqchip_commit_routes(s);
+        if (ret < 0) {
+            hw_error("KVM IRQ routing setup failed");
+        }
+    }
+}
+
+static void kvm_piix3_gsi_handler(void *opaque, int n, int level)
+{
+    GSIState *s = opaque;
+
+    if (n < ISA_NUM_IRQS) {
+        /* Kernel will forward to both PIC and IOAPIC */
+        qemu_set_irq(s->i8259_irq[n], level);
+    } else {
+        qemu_set_irq(s->ioapic_irq[n], level);
+    }
+}
+
 static void ioapic_init(GSIState *gsi_state)
 {
     DeviceState *dev;
@@ -133,7 +167,13 @@ static void pc_init1(MemoryRegion *system_memory,
     }
 
     gsi_state = g_malloc0(sizeof(*gsi_state));
-    gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS);
+    if (kvm_enabled() && kvm_irqchip_in_kernel()) {
+        kvm_piix3_setup_irq_routing(pci_enabled);
+        gsi = qemu_allocate_irqs(kvm_piix3_gsi_handler, gsi_state,
+                                 GSI_NUM_PINS);
+    } else {
+        gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS);
+    }
 
     if (pci_enabled) {
         pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, gsi,
@@ -153,11 +193,13 @@ static void pc_init1(MemoryRegion *system_memory,
     }
     isa_bus_irqs(gsi);
 
-    if (!xen_enabled()) {
+    if (kvm_enabled() && kvm_irqchip_in_kernel()) {
+        i8259 = kvm_i8259_init();
+    } else if (xen_enabled()) {
+        i8259 = xen_interrupt_controller_init();
+    } else {
         cpu_irq = pc_allocate_cpu_irq();
         i8259 = i8259_init(cpu_irq[0]);
-    } else {
-        i8259 = xen_interrupt_controller_init();
     }
 
     for (i = 0; i < ISA_NUM_IRQS; i++) {
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 99+ messages in thread

* [PATCH v5 15/16] kvm: x86: Add user space part for in-kernel IOAPIC
  2011-12-15 12:33 ` [Qemu-devel] " Jan Kiszka
@ 2011-12-15 12:33   ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: Blue Swirl, Anthony Liguori, qemu-devel, kvm, Michael S. Tsirkin

This introduces the KVM-accelerated IOAPIC backend and extends the IRQ
routing setup with the 0->2 redirection (GSI 0 to IOAPIC pin 2) when
needed.
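
The IOAPIC routes, including the 0->2 redirection, can be sketched as a
simple GSI-to-pin map. This is an illustration under assumptions, not
the patch's code: fill_ioapic_pin_map() is a hypothetical helper, and -1
marks a GSI for which no IOAPIC route is added:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pin_map[gsi] receives the IOAPIC pin that the GSI
 * is routed to, or -1 if the setup adds no IOAPIC route for it. */
static void fill_ioapic_pin_map(int *pin_map, int num_gsis)
{
    int gsi;

    assert(pin_map != NULL && num_gsis >= 0);

    for (gsi = 0; gsi < num_gsis; gsi++) {
        if (gsi == 0) {
            pin_map[gsi] = 2;    /* the 0->2 redirection */
        } else if (gsi == 2) {
            pin_map[gsi] = -1;   /* pin 2 is already taken by GSI 0 */
        } else {
            pin_map[gsi] = gsi;  /* identity mapping for all others */
        }
    }
}
```

In the patch these routes are only added when pci_enabled is set, and
the loop covers GSIs 0..23.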

The IOAPIC gains a KVM-specific property that allows defining the GSI
base for injecting interrupts into the kernel model. This will make it
possible to disentangle PIC and IOAPIC pins for chipsets that support
more sophisticated IRQ routes than the PIIX3. So far the base is kept at
0, i.e. PIC and IOAPIC share pins 0..15.
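
The effect of the GSI base can be shown with a small sketch.
ioapic_pin_to_gsi() is a hypothetical helper, and the pin count of 24 is
an assumption matching the 24-entry routing loop:

```c
#include <assert.h>
#include <stdint.h>

#define IOAPIC_NUM_PINS 24   /* assumed pin count of the IOAPIC model */

/* Hypothetical helper: GSI injected into the kernel for an IOAPIC pin. */
static uint32_t ioapic_pin_to_gsi(uint32_t kvm_gsi_base, int pin)
{
    assert(pin >= 0 && pin < IOAPIC_NUM_PINS);
    return kvm_gsi_base + (uint32_t)pin;
}
```

With the current base of 0, pin 5 maps to GSI 5, the same number the PIC
uses. A chipset with a non-zero base, say 16, would see pin 0 injected as
GSI 16, leaving the lower range to the PIC alone.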

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 Makefile.target      |    2 +-
 hw/ioapic_common.c   |    1 +
 hw/ioapic_internal.h |    1 +
 hw/kvm/ioapic.c      |  101 ++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/pc_piix.c         |   15 +++++++-
 5 files changed, 118 insertions(+), 2 deletions(-)
 create mode 100644 hw/kvm/ioapic.c

diff --git a/Makefile.target b/Makefile.target
index fb10143..b48bb57 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -236,7 +236,7 @@ obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
 obj-i386-y += pc_piix.o
-obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o kvm/i8259.o
+obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o kvm/i8259.o kvm/ioapic.o
 obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 
 # shared objects
diff --git a/hw/ioapic_common.c b/hw/ioapic_common.c
index 094551c..efc1d44 100644
--- a/hw/ioapic_common.c
+++ b/hw/ioapic_common.c
@@ -122,6 +122,7 @@ static SysBusDeviceInfo ioapic_info = {
     .qdev.no_user = 1,
     .qdev.props = (Property[]) {
         DEFINE_PROP_STRING("backend", IOAPICState, backend_name),
+        DEFINE_PROP_UINT32("kvm_gsi_base", IOAPICState, kvm_gsi_base, 0),
         DEFINE_PROP_END_OF_LIST(),
     },
 };
diff --git a/hw/ioapic_internal.h b/hw/ioapic_internal.h
index c5fab8b..bf63115 100644
--- a/hw/ioapic_internal.h
+++ b/hw/ioapic_internal.h
@@ -95,6 +95,7 @@ struct IOAPICState {
 
     char *backend_name;
     IOAPICBackend *backend;
+    uint32_t kvm_gsi_base;
 };
 
 void ioapic_register_device(void);
diff --git a/hw/kvm/ioapic.c b/hw/kvm/ioapic.c
new file mode 100644
index 0000000..1e886d4
--- /dev/null
+++ b/hw/kvm/ioapic.c
@@ -0,0 +1,101 @@
+/*
+ * KVM in-kernel IOAPIC support
+ *
+ * Copyright (c) 2011 Siemens AG
+ *
+ * Authors:
+ *  Jan Kiszka          <jan.kiszka@siemens.com>
+ *
+ * This work is licensed under the terms of the GNU GPL version 2.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "hw/pc.h"
+#include "hw/ioapic_internal.h"
+#include "hw/apic_internal.h"
+#include "kvm.h"
+
+static void kvm_ioapic_get(IOAPICState *s)
+{
+    struct kvm_irqchip chip;
+    struct kvm_ioapic_state *kioapic;
+    int ret, i;
+
+    chip.chip_id = KVM_IRQCHIP_IOAPIC;
+    ret = kvm_vm_ioctl(kvm_state, KVM_GET_IRQCHIP, &chip);
+    if (ret < 0) {
+        fprintf(stderr, "KVM_GET_IRQCHIP failed: %s\n", strerror(-ret));
+        abort();
+    }
+
+    kioapic = &chip.chip.ioapic;
+
+    s->id = kioapic->id;
+    s->ioregsel = kioapic->ioregsel;
+    s->irr = kioapic->irr;
+    for (i = 0; i < IOAPIC_NUM_PINS; i++) {
+        s->ioredtbl[i] = kioapic->redirtbl[i].bits;
+    }
+}
+
+static void kvm_ioapic_put(IOAPICState *s)
+{
+    struct kvm_irqchip chip;
+    struct kvm_ioapic_state *kioapic;
+    int ret, i;
+
+    chip.chip_id = KVM_IRQCHIP_IOAPIC;
+    kioapic = &chip.chip.ioapic;
+
+    kioapic->id = s->id;
+    kioapic->ioregsel = s->ioregsel;
+    kioapic->base_address = s->busdev.mmio[0].addr;
+    kioapic->irr = s->irr;
+    for (i = 0; i < IOAPIC_NUM_PINS; i++) {
+        kioapic->redirtbl[i].bits = s->ioredtbl[i];
+    }
+
+    ret = kvm_vm_ioctl(kvm_state, KVM_SET_IRQCHIP, &chip);
+    if (ret < 0) {
+        fprintf(stderr, "KVM_SET_IRQCHIP failed: %s\n", strerror(-ret));
+        abort();
+    }
+}
+
+static void kvm_ioapic_reset(IOAPICState *s)
+{
+    ioapic_reset_internal(s);
+
+    kvm_ioapic_put(s);
+}
+
+static void kvm_ioapic_set_irq(void *opaque, int irq, int level)
+{
+    IOAPICState *s = opaque;
+    int delivered;
+
+    delivered = kvm_irqchip_set_irq(kvm_state, s->kvm_gsi_base + irq, level);
+    apic_report_irq_delivered(delivered);
+}
+
+static void kvm_ioapic_backend_init(IOAPICState *s, int index)
+{
+    memory_region_init_reservation(&s->io_memory, "kvm-ioapic", 0x1000);
+
+    qdev_init_gpio_in(&s->busdev.qdev, kvm_ioapic_set_irq, IOAPIC_NUM_PINS);
+}
+
+static IOAPICBackend kvm_ioapic_backend = {
+    .name = "KVM",
+    .init = kvm_ioapic_backend_init,
+    .reset = kvm_ioapic_reset,
+    .pre_save = kvm_ioapic_get,
+    .post_load = kvm_ioapic_put,
+};
+
+static void kvm_ioapic_register(void)
+{
+    ioapic_register_backend(&kvm_ioapic_backend);
+}
+
+device_init(kvm_ioapic_register)
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 8650319..93d0eba 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -68,6 +68,15 @@ static void kvm_piix3_setup_irq_routing(bool pci_enabled)
         for (i = 8; i < 16; ++i) {
             kvm_irqchip_add_route(s, i, KVM_IRQCHIP_PIC_SLAVE, i - 8);
         }
+        if (pci_enabled) {
+            for (i = 0; i < 24; ++i) {
+                if (i == 0) {
+                    kvm_irqchip_add_route(s, i, KVM_IRQCHIP_IOAPIC, 2);
+                } else if (i != 2) {
+                    kvm_irqchip_add_route(s, i, KVM_IRQCHIP_IOAPIC, i);
+                }
+            }
+        }
         ret = kvm_irqchip_commit_routes(s);
         if (ret < 0) {
             hw_error("KVM IRQ routing setup failed");
@@ -89,12 +98,16 @@ static void kvm_piix3_gsi_handler(void *opaque, int n, int level)
 
 static void ioapic_init(GSIState *gsi_state)
 {
+    const char *backend = "QEMU";
     DeviceState *dev;
     SysBusDevice *d;
     unsigned int i;
 
     dev = qdev_create(NULL, "ioapic");
-    qdev_prop_set_string(dev, "backend", g_strdup("QEMU"));
+    if (kvm_enabled() && kvm_irqchip_in_kernel()) {
+        backend = "KVM";
+    }
+    qdev_prop_set_string(dev, "backend", g_strdup(backend));
     qdev_init_nofail(dev);
     d = sysbus_from_qdev(dev);
     sysbus_mmio_map(d, 0, 0xfec00000);
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 99+ messages in thread

* [Qemu-devel] [PATCH v5 15/16] kvm: x86: Add user space part for in-kernel IOAPIC
@ 2011-12-15 12:33   ` Jan Kiszka
  0 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: Blue Swirl, Anthony Liguori, qemu-devel, kvm, Michael S. Tsirkin

This introduces the KVM-accelerated IOAPIC backend and extends the IRQ
routing setup with the 0->2 redirection when needed.

The IOAPIC gains a KVM-specific property that defines the GSI base for
injecting interrupts into the kernel model. This will make it possible
to disentangle PIC and IOAPIC pins for chipsets that support more
sophisticated IRQ routes than the PIIX3. So far the base is kept at 0,
i.e. PIC and IOAPIC share pins 0..15.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 Makefile.target      |    2 +-
 hw/ioapic_common.c   |    1 +
 hw/ioapic_internal.h |    1 +
 hw/kvm/ioapic.c      |  101 ++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/pc_piix.c         |   15 +++++++-
 5 files changed, 118 insertions(+), 2 deletions(-)
 create mode 100644 hw/kvm/ioapic.c
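
For readers following the routing change in hw/pc_piix.c, here is a
minimal standalone sketch of the GSI-to-IOAPIC-pin mapping that the new
loop sets up when PCI is enabled. It is not QEMU code: the helper name
gsi_to_ioapic_pin and the -1 "no route" convention are assumptions made
for illustration only. The pin count of 24 matches the loop bound used
in the patch.

```c
#include <stdio.h>

#define IOAPIC_NUM_PINS 24

/*
 * Hypothetical helper (not part of the patch): return the IOAPIC pin a
 * GSI is routed to, or -1 if the GSI gets no IOAPIC route.
 */
static int gsi_to_ioapic_pin(int gsi)
{
    if (gsi < 0 || gsi >= IOAPIC_NUM_PINS) {
        return -1;          /* outside the IOAPIC pin range */
    }
    if (gsi == 0) {
        return 2;           /* IRQ0 override: GSI 0 lands on pin 2 */
    }
    if (gsi == 2) {
        return -1;          /* pin 2 is already taken by GSI 0 */
    }
    return gsi;             /* identity mapping for everything else */
}

/* Print the resulting table, one line per GSI. */
static void dump_ioapic_routes(void)
{
    int gsi;

    for (gsi = 0; gsi < IOAPIC_NUM_PINS; gsi++) {
        int pin = gsi_to_ioapic_pin(gsi);

        if (pin >= 0) {
            printf("GSI %2d -> IOAPIC pin %2d\n", gsi, pin);
        } else {
            printf("GSI %2d -> no IOAPIC route\n", gsi);
        }
    }
}
```

Calling dump_ioapic_routes() from a small main() lists the table. In the
patch itself the same decisions are made inline and handed to
kvm_irqchip_add_route().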

diff --git a/Makefile.target b/Makefile.target
index fb10143..b48bb57 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -236,7 +236,7 @@ obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
 obj-i386-y += pc_piix.o
-obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o kvm/i8259.o
+obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o kvm/i8259.o kvm/ioapic.o
 obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 
 # shared objects
diff --git a/hw/ioapic_common.c b/hw/ioapic_common.c
index 094551c..efc1d44 100644
--- a/hw/ioapic_common.c
+++ b/hw/ioapic_common.c
@@ -122,6 +122,7 @@ static SysBusDeviceInfo ioapic_info = {
     .qdev.no_user = 1,
     .qdev.props = (Property[]) {
         DEFINE_PROP_STRING("backend", IOAPICState, backend_name),
+        DEFINE_PROP_UINT32("kvm_gsi_base", IOAPICState, kvm_gsi_base, 0),
         DEFINE_PROP_END_OF_LIST(),
     },
 };
diff --git a/hw/ioapic_internal.h b/hw/ioapic_internal.h
index c5fab8b..bf63115 100644
--- a/hw/ioapic_internal.h
+++ b/hw/ioapic_internal.h
@@ -95,6 +95,7 @@ struct IOAPICState {
 
     char *backend_name;
     IOAPICBackend *backend;
+    uint32_t kvm_gsi_base;
 };
 
 void ioapic_register_device(void);
diff --git a/hw/kvm/ioapic.c b/hw/kvm/ioapic.c
new file mode 100644
index 0000000..1e886d4
--- /dev/null
+++ b/hw/kvm/ioapic.c
@@ -0,0 +1,101 @@
+/*
+ * KVM in-kernel IOAPIC support
+ *
+ * Copyright (c) 2011 Siemens AG
+ *
+ * Authors:
+ *  Jan Kiszka          <jan.kiszka@siemens.com>
+ *
+ * This work is licensed under the terms of the GNU GPL version 2.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "hw/pc.h"
+#include "hw/ioapic_internal.h"
+#include "hw/apic_internal.h"
+#include "kvm.h"
+
+static void kvm_ioapic_get(IOAPICState *s)
+{
+    struct kvm_irqchip chip;
+    struct kvm_ioapic_state *kioapic;
+    int ret, i;
+
+    chip.chip_id = KVM_IRQCHIP_IOAPIC;
+    ret = kvm_vm_ioctl(kvm_state, KVM_GET_IRQCHIP, &chip);
+    if (ret < 0) {
+        fprintf(stderr, "KVM_GET_IRQCHIP failed: %s\n", strerror(-ret));
+        abort();
+    }
+
+    kioapic = &chip.chip.ioapic;
+
+    s->id = kioapic->id;
+    s->ioregsel = kioapic->ioregsel;
+    s->irr = kioapic->irr;
+    for (i = 0; i < IOAPIC_NUM_PINS; i++) {
+        s->ioredtbl[i] = kioapic->redirtbl[i].bits;
+    }
+}
+
+static void kvm_ioapic_put(IOAPICState *s)
+{
+    struct kvm_irqchip chip;
+    struct kvm_ioapic_state *kioapic;
+    int ret, i;
+
+    chip.chip_id = KVM_IRQCHIP_IOAPIC;
+    kioapic = &chip.chip.ioapic;
+
+    kioapic->id = s->id;
+    kioapic->ioregsel = s->ioregsel;
+    kioapic->base_address = s->busdev.mmio[0].addr;
+    kioapic->irr = s->irr;
+    for (i = 0; i < IOAPIC_NUM_PINS; i++) {
+        kioapic->redirtbl[i].bits = s->ioredtbl[i];
+    }
+
+    ret = kvm_vm_ioctl(kvm_state, KVM_SET_IRQCHIP, &chip);
+    if (ret < 0) {
+        fprintf(stderr, "KVM_SET_IRQCHIP failed: %s\n", strerror(-ret));
+        abort();
+    }
+}
+
+static void kvm_ioapic_reset(IOAPICState *s)
+{
+    ioapic_reset_internal(s);
+
+    kvm_ioapic_put(s);
+}
+
+static void kvm_ioapic_set_irq(void *opaque, int irq, int level)
+{
+    IOAPICState *s = opaque;
+    int delivered;
+
+    delivered = kvm_irqchip_set_irq(kvm_state, s->kvm_gsi_base + irq, level);
+    apic_report_irq_delivered(delivered);
+}
+
+static void kvm_ioapic_backend_init(IOAPICState *s, int index)
+{
+    memory_region_init_reservation(&s->io_memory, "kvm-ioapic", 0x1000);
+
+    qdev_init_gpio_in(&s->busdev.qdev, kvm_ioapic_set_irq, IOAPIC_NUM_PINS);
+}
+
+static IOAPICBackend kvm_ioapic_backend = {
+    .name = "KVM",
+    .init = kvm_ioapic_backend_init,
+    .reset = kvm_ioapic_reset,
+    .pre_save = kvm_ioapic_get,
+    .post_load = kvm_ioapic_put,
+};
+
+static void kvm_ioapic_register(void)
+{
+    ioapic_register_backend(&kvm_ioapic_backend);
+}
+
+device_init(kvm_ioapic_register)
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 8650319..93d0eba 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -68,6 +68,15 @@ static void kvm_piix3_setup_irq_routing(bool pci_enabled)
         for (i = 8; i < 16; ++i) {
             kvm_irqchip_add_route(s, i, KVM_IRQCHIP_PIC_SLAVE, i - 8);
         }
+        if (pci_enabled) {
+            for (i = 0; i < 24; ++i) {
+                if (i == 0) {
+                    kvm_irqchip_add_route(s, i, KVM_IRQCHIP_IOAPIC, 2);
+                } else if (i != 2) {
+                    kvm_irqchip_add_route(s, i, KVM_IRQCHIP_IOAPIC, i);
+                }
+            }
+        }
         ret = kvm_irqchip_commit_routes(s);
         if (ret < 0) {
             hw_error("KVM IRQ routing setup failed");
@@ -89,12 +98,16 @@ static void kvm_piix3_gsi_handler(void *opaque, int n, int level)
 
 static void ioapic_init(GSIState *gsi_state)
 {
+    const char *backend = "QEMU";
     DeviceState *dev;
     SysBusDevice *d;
     unsigned int i;
 
     dev = qdev_create(NULL, "ioapic");
-    qdev_prop_set_string(dev, "backend", g_strdup("QEMU"));
+    if (kvm_enabled() && kvm_irqchip_in_kernel()) {
+        backend = "KVM";
+    }
+    qdev_prop_set_string(dev, "backend", g_strdup(backend));
     qdev_init_nofail(dev);
     d = sysbus_from_qdev(dev);
     sysbus_mmio_map(d, 0, 0xfec00000);
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 99+ messages in thread

* [PATCH v5 16/16] kvm: Arm in-kernel irqchip support
  2011-12-15 12:33 ` [Qemu-devel] " Jan Kiszka
@ 2011-12-15 12:33   ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-15 12:33 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, qemu-devel, Anthony Liguori, Michael S. Tsirkin, Blue Swirl

Make the basic in-kernel irqchip support selectable via
-machine ...,kernel_irqchip=on. Leave it off by default until it can
fully replace user space models.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 qemu-config.c   |    4 ++++
 qemu-options.hx |    5 ++++-
 2 files changed, 8 insertions(+), 1 deletions(-)
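
To illustrate the intended semantics of the new switch (off unless
explicitly turned on, and only meaningful with KVM active), here is a
small standalone sketch. It is not QEMU code: parse_on_off and
use_kernel_irqchip are hypothetical names, and falling back to the
default for unrecognized values is a simplification of this sketch.
QEMU's own option parser may well reject such values instead.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: interpret an "on"/"off" property value.
 * A missing (NULL) or unrecognized value yields the default.
 */
static bool parse_on_off(const char *value, bool default_value)
{
    if (value == NULL) {
        return default_value;
    }
    if (strcmp(value, "on") == 0) {
        return true;
    }
    if (strcmp(value, "off") == 0) {
        return false;
    }
    return default_value;
}

/*
 * Hypothetical decision function: the in-kernel irqchip is used only
 * when KVM is active and kernel_irqchip=on was given; default is off.
 */
static bool use_kernel_irqchip(bool kvm_active, const char *opt_value)
{
    return kvm_active && parse_on_off(opt_value, false);
}
```

With this, use_kernel_irqchip(true, NULL) is false, which corresponds to
leaving the feature off by default as described above.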

diff --git a/qemu-config.c b/qemu-config.c
index 597d7e1..a761bea 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -490,6 +490,10 @@ static QemuOptsList qemu_machine_opts = {
             .name = "accel",
             .type = QEMU_OPT_STRING,
             .help = "accelerator list",
+        }, {
+            .name = "kernel_irqchip",
+            .type = QEMU_OPT_BOOL,
+            .help = "use KVM in-kernel irqchip",
         },
         { /* End of list */ }
     },
diff --git a/qemu-options.hx b/qemu-options.hx
index 681eaf1..60b7dc0 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -31,7 +31,8 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
     "-machine [type=]name[,prop[=value][,...]]\n"
     "                selects emulated machine (-machine ? for list)\n"
     "                property accel=accel1[:accel2[:...]] selects accelerator\n"
-    "                supported accelerators are kvm, xen, tcg (default: tcg)\n",
+    "                supported accelerators are kvm, xen, tcg (default: tcg)\n"
+    "                kernel_irqchip=on|off controls accelerated irqchip support\n",
     QEMU_ARCH_ALL)
 STEXI
 @item -machine [type=]@var{name}[,prop=@var{value}[,...]]
@@ -44,6 +45,8 @@ This is used to enable an accelerator. Depending on the target architecture,
 kvm, xen, or tcg can be available. By default, tcg is used. If there is more
 than one accelerator specified, the next one is used if the previous one fails
 to initialize.
+@item kernel_irqchip=on|off
+Enables in-kernel irqchip support for the chosen accelerator when available.
 @end table
 ETEXI
 
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 99+ messages in thread

* Re: [PATCH v5 00/16] uq/master: Introduce basic irqchip support
  2011-12-15 12:33 ` [Qemu-devel] " Jan Kiszka
@ 2011-12-19 21:17   ` Marcelo Tosatti
  -1 siblings, 0 replies; 99+ messages in thread
From: Marcelo Tosatti @ 2011-12-19 21:17 UTC (permalink / raw)
  To: Jan Kiszka, Anthony Liguori
  Cc: Anthony Liguori, Lai Jiangshan, kvm, Michael S. Tsirkin,
	qemu-devel, Blue Swirl, Avi Kivity


Anthony,

Can you please review & ACK?

You could even apply directly, but we'll do a kvm-autotest run through
uq/master. Still, your review is needed.

Thanks

On Thu, Dec 15, 2011 at 01:33:15PM +0100, Jan Kiszka wrote:
> Changes in v5:
> - properly introduce apic_report_irq_delivered (instead of
>   apic_set_irq_delivered silently)
> - rework apic to kvm core interface according to Blue's suggestion
> 
> CC: Lai Jiangshan <laijs@cn.fujitsu.com>
> 
> Jan Kiszka (16):
>   msi: Generalize msix_supported to msi_supported
>   kvm: Move kvmclock into hw/kvm folder
>   apic: Stop timer on reset
>   apic: Inject external NMI events via LINT1
>   apic: Introduce apic_report_irq_delivered
>   apic: Introduce backend/frontend infrastructure for KVM reuse
>   apic: Open-code timer save/restore
>   i8259: Introduce backend/frontend infrastructure for KVM reuse
>   ioapic: Introduce backend/frontend infrastructure for KVM reuse
>   memory: Introduce memory_region_init_reservation
>   kvm: Introduce core services for in-kernel irqchip support
>   kvm: x86: Establish IRQ0 override control
>   kvm: x86: Add user space part for in-kernel APIC
>   kvm: x86: Add user space part for in-kernel i8259
>   kvm: x86: Add user space part for in-kernel IOAPIC
>   kvm: Arm in-kernel irqchip support
> 
>  Makefile.objs                  |    2 +-
>  Makefile.target                |    6 +-
>  configure                      |    1 +
>  hw/apic.c                      |  309 ++++-----------------------------------
>  hw/apic.h                      |    1 +
>  hw/apic_common.c               |  312 ++++++++++++++++++++++++++++++++++++++++
>  hw/apic_internal.h             |  122 ++++++++++++++++
>  hw/i8259.c                     |  127 ++--------------
>  hw/i8259_common.c              |  173 ++++++++++++++++++++++
>  hw/i8259_internal.h            |   82 +++++++++++
>  hw/ioapic.c                    |  130 ++---------------
>  hw/ioapic_common.c             |  138 ++++++++++++++++++
>  hw/ioapic_internal.h           |  106 ++++++++++++++
>  hw/kvm/apic.c                  |  138 ++++++++++++++++++
>  hw/{kvmclock.c => kvm/clock.c} |    4 +-
>  hw/{kvmclock.h => kvm/clock.h} |    0
>  hw/kvm/i8259.c                 |  126 ++++++++++++++++
>  hw/kvm/ioapic.c                |  101 +++++++++++++
>  hw/msi.c                       |    8 +
>  hw/msi.h                       |    2 +
>  hw/msix.c                      |    9 +-
>  hw/msix.h                      |    2 -
>  hw/pc.c                        |   19 ++-
>  hw/pc.h                        |    1 +
>  hw/pc_piix.c                   |   66 ++++++++-
>  kvm-all.c                      |  154 ++++++++++++++++++++
>  kvm-stub.c                     |    5 +
>  kvm.h                          |   14 ++
>  memory.c                       |   36 +++++
>  memory.h                       |   16 ++
>  monitor.c                      |    6 +-
>  qemu-config.c                  |    4 +
>  qemu-options.hx                |    5 +-
>  sysemu.h                       |    1 -
>  target-i386/kvm.c              |   49 +++++++
>  trace-events                   |    2 +-
>  vl.c                           |    1 -
>  37 files changed, 1739 insertions(+), 539 deletions(-)
>  create mode 100644 hw/apic_common.c
>  create mode 100644 hw/apic_internal.h
>  create mode 100644 hw/i8259_common.c
>  create mode 100644 hw/i8259_internal.h
>  create mode 100644 hw/ioapic_common.c
>  create mode 100644 hw/ioapic_internal.h
>  create mode 100644 hw/kvm/apic.c
>  rename hw/{kvmclock.c => kvm/clock.c} (98%)
>  rename hw/{kvmclock.h => kvm/clock.h} (100%)
>  create mode 100644 hw/kvm/i8259.c
>  create mode 100644 hw/kvm/ioapic.c
> 
> -- 
> 1.7.3.4
> 

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-15 12:33   ` [Qemu-devel] " Jan Kiszka
@ 2011-12-19 22:14     ` Anthony Liguori
  -1 siblings, 0 replies; 99+ messages in thread
From: Anthony Liguori @ 2011-12-19 22:14 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Avi Kivity, Marcelo Tosatti, Blue Swirl, Anthony Liguori,
	qemu-devel, kvm, Michael S. Tsirkin

On 12/15/2011 06:33 AM, Jan Kiszka wrote:
> The KVM in-kernel APIC model will reuse parts of the user space model
> while providing the same frontend view to guest and most management
> interfaces. Introduce an APIC backend concept to encapsulate those
> parts that will tell user space and KVM model apart. The backend offers
> callback hooks for init, base/tpr setting, and the external NMI delivery
> that will be implemented accordingly.
>
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
>   Makefile.target    |    2 +-
>   hw/apic.c          |  285 +++-------------------------------------------------
>   hw/apic.h          |    1 -
>   hw/apic_common.c   |  265 ++++++++++++++++++++++++++++++++++++++++++++++++
>   hw/apic_internal.h |  119 ++++++++++++++++++++++
>   hw/pc.c            |    1 +
>   6 files changed, 401 insertions(+), 272 deletions(-)
>   create mode 100644 hw/apic_common.c
>   create mode 100644 hw/apic_internal.h
>
> diff --git a/Makefile.target b/Makefile.target
> index 1d24a30..c46f062 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -231,7 +231,7 @@ obj-$(CONFIG_IVSHMEM) += ivshmem.o
>   # Hardware support
>   obj-i386-y += vga.o
>   obj-i386-y += mc146818rtc.o pc.o
> -obj-i386-y += cirrus_vga.o sga.o apic.o ioapic.o piix_pci.o
> +obj-i386-y += cirrus_vga.o sga.o apic_common.o apic.o ioapic.o piix_pci.o
>   obj-i386-y += vmport.o
>   obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
>   obj-i386-y += debugcon.o multiboot.o
> diff --git a/hw/apic.c b/hw/apic.c
> index bec493b..5fa3111 100644
> --- a/hw/apic.c
> +++ b/hw/apic.c
> @@ -16,53 +16,13 @@
>    * You should have received a copy of the GNU Lesser General Public
>    * License along with this library; if not, see <http://www.gnu.org/licenses/>
>    */
> -#include "hw.h"
> +#include "apic_internal.h"
>   #include "apic.h"
>   #include "ioapic.h"
> -#include "qemu-timer.h"
>   #include "host-utils.h"
> -#include "sysbus.h"
>   #include "trace.h"
>   #include "pc.h"
>
> -/* APIC Local Vector Table */
> -#define APIC_LVT_TIMER   0
> -#define APIC_LVT_THERMAL 1
> -#define APIC_LVT_PERFORM 2
> -#define APIC_LVT_LINT0   3
> -#define APIC_LVT_LINT1   4
> -#define APIC_LVT_ERROR   5
> -#define APIC_LVT_NB      6
> -
> -/* APIC delivery modes */
> -#define APIC_DM_FIXED	0
> -#define APIC_DM_LOWPRI	1
> -#define APIC_DM_SMI	2
> -#define APIC_DM_NMI	4
> -#define APIC_DM_INIT	5
> -#define APIC_DM_SIPI	6
> -#define APIC_DM_EXTINT	7
> -
> -/* APIC destination mode */
> -#define APIC_DESTMODE_FLAT	0xf
> -#define APIC_DESTMODE_CLUSTER	1
> -
> -#define APIC_TRIGGER_EDGE  0
> -#define APIC_TRIGGER_LEVEL 1
> -
> -#define	APIC_LVT_TIMER_PERIODIC		(1<<17)
> -#define	APIC_LVT_MASKED			(1<<16)
> -#define	APIC_LVT_LEVEL_TRIGGER		(1<<15)
> -#define	APIC_LVT_REMOTE_IRR		(1<<14)
> -#define	APIC_INPUT_POLARITY		(1<<13)
> -#define	APIC_SEND_PENDING		(1<<12)
> -
> -#define ESR_ILLEGAL_ADDRESS (1 << 7)
> -
> -#define APIC_SV_DIRECTED_IO             (1<<12)
> -#define APIC_SV_ENABLE                  (1<<8)
> -
> -#define MAX_APICS 255
>   #define MAX_APIC_WORDS 8
>
>   /* Intel APIC constants: from include/asm/msidef.h */
> @@ -75,40 +35,7 @@
>   #define MSI_ADDR_DEST_ID_SHIFT		12
>   #define	MSI_ADDR_DEST_ID_MASK		0x00ffff0
>
> -#define MSI_ADDR_SIZE                   0x100000
> -
> -typedef struct APICState APICState;
> -
> -struct APICState {
> -    SysBusDevice busdev;
> -    MemoryRegion io_memory;
> -    void *cpu_env;
> -    uint32_t apicbase;
> -    uint8_t id;
> -    uint8_t arb_id;
> -    uint8_t tpr;
> -    uint32_t spurious_vec;
> -    uint8_t log_dest;
> -    uint8_t dest_mode;
> -    uint32_t isr[8];  /* in service register */
> -    uint32_t tmr[8];  /* trigger mode register */
> -    uint32_t irr[8]; /* interrupt request register */
> -    uint32_t lvt[APIC_LVT_NB];
> -    uint32_t esr; /* error register */
> -    uint32_t icr[2];
> -
> -    uint32_t divide_conf;
> -    int count_shift;
> -    uint32_t initial_count;
> -    int64_t initial_count_load_time, next_time;
> -    uint32_t idx;
> -    QEMUTimer *timer;
> -    int sipi_vector;
> -    int wait_for_sipi;
> -};
> -
>   static APICState *local_apics[MAX_APICS + 1];
> -static int apic_irq_delivered;
>
>   static void apic_set_irq(APICState *s, int vector_num, int trigger_mode);
>   static void apic_update_irq(APICState *s);
> @@ -205,10 +132,8 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
>       }
>   }
>
> -void apic_deliver_nmi(DeviceState *d)
> +static void apic_external_nmi(APICState *s)
>   {
> -    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> -
>       apic_local_deliver(s, APIC_LVT_LINT1);
>   }
>
> @@ -300,14 +225,8 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode, uint8_t delivery_mode,
>       apic_bus_deliver(deliver_bitmask, delivery_mode, vector_num, trigger_mode);
>   }
>
> -void cpu_set_apic_base(DeviceState *d, uint64_t val)
> +static void apic_set_base(APICState *s, uint64_t val)
>   {
> -    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> -
> -    trace_cpu_set_apic_base(val);
> -
> -    if (!s)
> -        return;
>       s->apicbase = (val & 0xfffff000) |
>           (s->apicbase & (MSR_IA32_APICBASE_BSP | MSR_IA32_APICBASE_ENABLE));
>       /* if disabled, cannot be enabled again */
> @@ -318,32 +237,12 @@ void cpu_set_apic_base(DeviceState *d, uint64_t val)
>       }
>   }
>
> -uint64_t cpu_get_apic_base(DeviceState *d)
> +static void apic_set_tpr(APICState *s, uint8_t val)
>   {
> -    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> -
> -    trace_cpu_get_apic_base(s ? (uint64_t)s->apicbase: 0);
> -
> -    return s ? s->apicbase : 0;
> -}
> -
> -void cpu_set_apic_tpr(DeviceState *d, uint8_t val)
> -{
> -    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> -
> -    if (!s)
> -        return;
>       s->tpr = (val & 0x0f) << 4;
>       apic_update_irq(s);
>   }
>
> -uint8_t cpu_get_apic_tpr(DeviceState *d)
> -{
> -    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> -
> -    return s ? s->tpr >> 4 : 0;
> -}
> -
>   /* return -1 if no bit is set */
>   static int get_highest_priority_int(uint32_t *tab)
>   {
> @@ -413,27 +312,6 @@ static void apic_update_irq(APICState *s)
>       }
>   }
>
> -void apic_report_irq_delivered(int delivered)
> -{
> -    apic_irq_delivered += delivered;
> -
> -    trace_apic_report_irq_delivered(apic_irq_delivered);
> -}
> -
> -void apic_reset_irq_delivered(void)
> -{
> -    trace_apic_reset_irq_delivered(apic_irq_delivered);
> -
> -    apic_irq_delivered = 0;
> -}
> -
> -int apic_get_irq_delivered(void)
> -{
> -    trace_apic_get_irq_delivered(apic_irq_delivered);
> -
> -    return apic_irq_delivered;
> -}
> -
>   static void apic_set_irq(APICState *s, int vector_num, int trigger_mode)
>   {
>       apic_report_irq_delivered(!get_bit(s->irr, vector_num));
> @@ -515,35 +393,6 @@ static void apic_get_delivery_bitmask(uint32_t *deliver_bitmask,
>       }
>   }
>
> -void apic_init_reset(DeviceState *d)
> -{
> -    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> -    int i;
> -
> -    if (!s)
> -        return;
> -
> -    s->tpr = 0;
> -    s->spurious_vec = 0xff;
> -    s->log_dest = 0;
> -    s->dest_mode = 0xf;
> -    memset(s->isr, 0, sizeof(s->isr));
> -    memset(s->tmr, 0, sizeof(s->tmr));
> -    memset(s->irr, 0, sizeof(s->irr));
> -    for(i = 0; i < APIC_LVT_NB; i++)
> -        s->lvt[i] = 1 << 16; /* mask LVT */
> -    s->esr = 0;
> -    memset(s->icr, 0, sizeof(s->icr));
> -    s->divide_conf = 0;
> -    s->count_shift = 0;
> -    s->initial_count = 0;
> -    s->initial_count_load_time = 0;
> -    s->next_time = 0;
> -    s->wait_for_sipi = 1;
> -
> -    qemu_del_timer(s->timer);
> -}
> -
>   static void apic_startup(APICState *s, int vector_num)
>   {
>       s->sipi_vector = vector_num;
> @@ -904,96 +753,6 @@ static void apic_mem_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
>       }
>   }
>
> -/* This function is only used for old state version 1 and 2 */
> -static int apic_load_old(QEMUFile *f, void *opaque, int version_id)
> -{
> -    APICState *s = opaque;
> -    int i;
> -
> -    if (version_id > 2)
> -        return -EINVAL;
> -
> -    /* XXX: what if the base changes? (registered memory regions) */
> -    qemu_get_be32s(f, &s->apicbase);
> -    qemu_get_8s(f, &s->id);
> -    qemu_get_8s(f, &s->arb_id);
> -    qemu_get_8s(f, &s->tpr);
> -    qemu_get_be32s(f, &s->spurious_vec);
> -    qemu_get_8s(f, &s->log_dest);
> -    qemu_get_8s(f, &s->dest_mode);
> -    for (i = 0; i < 8; i++) {
> -        qemu_get_be32s(f, &s->isr[i]);
> -        qemu_get_be32s(f, &s->tmr[i]);
> -        qemu_get_be32s(f, &s->irr[i]);
> -    }
> -    for (i = 0; i < APIC_LVT_NB; i++) {
> -        qemu_get_be32s(f, &s->lvt[i]);
> -    }
> -    qemu_get_be32s(f, &s->esr);
> -    qemu_get_be32s(f, &s->icr[0]);
> -    qemu_get_be32s(f, &s->icr[1]);
> -    qemu_get_be32s(f, &s->divide_conf);
> -    s->count_shift=qemu_get_be32(f);
> -    qemu_get_be32s(f, &s->initial_count);
> -    s->initial_count_load_time=qemu_get_be64(f);
> -    s->next_time=qemu_get_be64(f);
> -
> -    if (version_id >= 2)
> -        qemu_get_timer(f, s->timer);
> -    return 0;
> -}
> -
> -static const VMStateDescription vmstate_apic = {
> -    .name = "apic",
> -    .version_id = 3,
> -    .minimum_version_id = 3,
> -    .minimum_version_id_old = 1,
> -    .load_state_old = apic_load_old,
> -    .fields      = (VMStateField []) {
> -        VMSTATE_UINT32(apicbase, APICState),
> -        VMSTATE_UINT8(id, APICState),
> -        VMSTATE_UINT8(arb_id, APICState),
> -        VMSTATE_UINT8(tpr, APICState),
> -        VMSTATE_UINT32(spurious_vec, APICState),
> -        VMSTATE_UINT8(log_dest, APICState),
> -        VMSTATE_UINT8(dest_mode, APICState),
> -        VMSTATE_UINT32_ARRAY(isr, APICState, 8),
> -        VMSTATE_UINT32_ARRAY(tmr, APICState, 8),
> -        VMSTATE_UINT32_ARRAY(irr, APICState, 8),
> -        VMSTATE_UINT32_ARRAY(lvt, APICState, APIC_LVT_NB),
> -        VMSTATE_UINT32(esr, APICState),
> -        VMSTATE_UINT32_ARRAY(icr, APICState, 2),
> -        VMSTATE_UINT32(divide_conf, APICState),
> -        VMSTATE_INT32(count_shift, APICState),
> -        VMSTATE_UINT32(initial_count, APICState),
> -        VMSTATE_INT64(initial_count_load_time, APICState),
> -        VMSTATE_INT64(next_time, APICState),
> -        VMSTATE_TIMER(timer, APICState),
> -        VMSTATE_END_OF_LIST()
> -    }
> -};
> -
> -static void apic_reset(DeviceState *d)
> -{
> -    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> -    int bsp;
> -
> -    bsp = cpu_is_bsp(s->cpu_env);
> -    s->apicbase = 0xfee00000 |
> -        (bsp ? MSR_IA32_APICBASE_BSP : 0) | MSR_IA32_APICBASE_ENABLE;
> -
> -    apic_init_reset(d);
> -
> -    if (bsp) {
> -        /*
> -         * LINT0 delivery mode on CPU #0 is set to ExtInt at initialization
> -         * time typically by BIOS, so PIC interrupt can be delivered to the
> -         * processor when local APIC is enabled.
> -         */
> -        s->lvt[APIC_LVT_LINT0] = 0x700;
> -    }
> -}
> -
>   static const MemoryRegionOps apic_io_ops = {
>       .old_mmio = {
>           .read = { apic_mem_readb, apic_mem_readw, apic_mem_readl, },
> @@ -1002,41 +761,27 @@ static const MemoryRegionOps apic_io_ops = {
>       .endianness = DEVICE_NATIVE_ENDIAN,
>   };
>
> -static int apic_init1(SysBusDevice *dev)
> +static void apic_backend_init(APICState *s)
>   {
> -    APICState *s = FROM_SYSBUS(APICState, dev);
> -    static int last_apic_idx;
> -
> -    if (last_apic_idx>= MAX_APICS) {
> -        return -1;
> -    }
> -    memory_region_init_io(&s->io_memory,&apic_io_ops, s, "apic",
> -                          MSI_ADDR_SIZE);
> -    sysbus_init_mmio(dev,&s->io_memory);
> +    memory_region_init_io(&s->io_memory,&apic_io_ops, s, "apic-msi",
> +                          MSI_SPACE_SIZE);
>
>       s->timer = qemu_new_timer_ns(vm_clock, apic_timer, s);
> -    s->idx = last_apic_idx++;
>       local_apics[s->idx] = s;
> -    return 0;
>   }
>
> -static SysBusDeviceInfo apic_info = {
> -    .init = apic_init1,
> -    .qdev.name = "apic",
> -    .qdev.size = sizeof(APICState),
> -    .qdev.vmsd =&vmstate_apic,
> -    .qdev.reset = apic_reset,
> -    .qdev.no_user = 1,
> -    .qdev.props = (Property[]) {
> -        DEFINE_PROP_UINT8("id", APICState, id, -1),
> -        DEFINE_PROP_PTR("cpu_env", APICState, cpu_env),
> -        DEFINE_PROP_END_OF_LIST(),
> -    }
> +static APICBackend apic_backend = {
> +    .name = "QEMU",
> +    .init = apic_backend_init,
> +    .set_base = apic_set_base,
> +    .set_tpr = apic_set_tpr,
> +    .external_nmi = apic_external_nmi,
>   };
>
>   static void apic_register_devices(void)
>   {
> -    sysbus_register_withprop(&apic_info);
> +    apic_register_device();
> +    apic_register_backend(&apic_backend);
>   }
>
>   device_init(apic_register_devices)
> diff --git a/hw/apic.h b/hw/apic.h
> index 8173d8a..a62d83b 100644
> --- a/hw/apic.h
> +++ b/hw/apic.h
> @@ -10,7 +10,6 @@ int apic_accept_pic_intr(DeviceState *s);
>   void apic_deliver_pic_intr(DeviceState *s, int level);
>   void apic_deliver_nmi(DeviceState *d);
>   int apic_get_interrupt(DeviceState *s);
> -void apic_report_irq_delivered(int delivered);
>   void apic_reset_irq_delivered(void);
>   int apic_get_irq_delivered(void);
>   void cpu_set_apic_base(DeviceState *s, uint64_t val);
> diff --git a/hw/apic_common.c b/hw/apic_common.c
> new file mode 100644
> index 0000000..4cdc45c
> --- /dev/null
> +++ b/hw/apic_common.c
> @@ -0,0 +1,265 @@
> +/*
> + *  APIC support - common bits of emulated and KVM kernel model
> + *
> + *  Copyright (c) 2004-2005 Fabrice Bellard
> + *  Copyright (c) 2011      Jan Kiszka, Siemens AG
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see<http://www.gnu.org/licenses/>
> + */
> +#include "apic.h"
> +#include "apic_internal.h"
> +#include "trace.h"
> +
> +static QSIMPLEQ_HEAD(, APICBackend) backends =
> +    QSIMPLEQ_HEAD_INITIALIZER(backends);
> +static int apic_irq_delivered;
> +
> +void cpu_set_apic_base(DeviceState *d, uint64_t val)
> +{
> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> +
> +    trace_cpu_set_apic_base(val);
> +
> +    if (s) {
> +        s->backend->set_base(s, val);
> +    }
> +}
> +
> +uint64_t cpu_get_apic_base(DeviceState *d)
> +{
> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> +
> +    trace_cpu_get_apic_base(s ? (uint64_t)s->apicbase : 0);
> +
> +    return s ? s->apicbase : 0;
> +}
> +
> +void cpu_set_apic_tpr(DeviceState *d, uint8_t val)
> +{
> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> +
> +    if (s) {
> +        s->backend->set_tpr(s, val);
> +    }
> +}
> +
> +uint8_t cpu_get_apic_tpr(DeviceState *d)
> +{
> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> +
> +    return s ? s->tpr>>  4 : 0;
> +}
> +
> +void apic_report_irq_delivered(int delivered)
> +{
> +    apic_irq_delivered += delivered;
> +
> +    trace_apic_report_irq_delivered(apic_irq_delivered);
> +}
> +
> +void apic_reset_irq_delivered(void)
> +{
> +    trace_apic_reset_irq_delivered(apic_irq_delivered);
> +
> +    apic_irq_delivered = 0;
> +}
> +
> +int apic_get_irq_delivered(void)
> +{
> +    trace_apic_get_irq_delivered(apic_irq_delivered);
> +
> +    return apic_irq_delivered;
> +}
> +
> +void apic_deliver_nmi(DeviceState *d)
> +{
> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> +
> +    s->backend->external_nmi(s);
> +}
> +
> +void apic_init_reset(DeviceState *d)
> +{
> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> +    int i;
> +
> +    if (!s) {
> +        return;
> +    }
> +    s->tpr = 0;
> +    s->spurious_vec = 0xff;
> +    s->log_dest = 0;
> +    s->dest_mode = 0xf;
> +    memset(s->isr, 0, sizeof(s->isr));
> +    memset(s->tmr, 0, sizeof(s->tmr));
> +    memset(s->irr, 0, sizeof(s->irr));
> +    for (i = 0; i<  APIC_LVT_NB; i++) {
> +        s->lvt[i] = APIC_LVT_MASKED;
> +    }
> +    s->esr = 0;
> +    memset(s->icr, 0, sizeof(s->icr));
> +    s->divide_conf = 0;
> +    s->count_shift = 0;
> +    s->initial_count = 0;
> +    s->initial_count_load_time = 0;
> +    s->next_time = 0;
> +    s->wait_for_sipi = 1;
> +
> +    qemu_del_timer(s->timer);
> +}
> +
> +static void apic_reset(DeviceState *d)
> +{
> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> +    bool bsp;
> +
> +    bsp = cpu_is_bsp(s->cpu_env);
> +    s->apicbase = 0xfee00000 |
> +        (bsp ? MSR_IA32_APICBASE_BSP : 0) | MSR_IA32_APICBASE_ENABLE;
> +
> +    apic_init_reset(d);
> +
> +    if (bsp) {
> +        /*
> +         * LINT0 delivery mode on CPU #0 is set to ExtInt at initialization
> +         * time typically by BIOS, so PIC interrupt can be delivered to the
> +         * processor when local APIC is enabled.
> +         */
> +        s->lvt[APIC_LVT_LINT0] = 0x700;
> +    }
> +}
> +
> +/* This function is only used for old state version 1 and 2 */
> +static int apic_load_old(QEMUFile *f, void *opaque, int version_id)
> +{
> +    APICState *s = opaque;
> +    int i;
> +
> +    if (version_id>  2) {
> +        return -EINVAL;
> +    }
> +
> +    /* XXX: what if the base changes? (registered memory regions) */
> +    qemu_get_be32s(f,&s->apicbase);
> +    qemu_get_8s(f,&s->id);
> +    qemu_get_8s(f,&s->arb_id);
> +    qemu_get_8s(f,&s->tpr);
> +    qemu_get_be32s(f,&s->spurious_vec);
> +    qemu_get_8s(f,&s->log_dest);
> +    qemu_get_8s(f,&s->dest_mode);
> +    for (i = 0; i<  8; i++) {
> +        qemu_get_be32s(f,&s->isr[i]);
> +        qemu_get_be32s(f,&s->tmr[i]);
> +        qemu_get_be32s(f,&s->irr[i]);
> +    }
> +    for (i = 0; i<  APIC_LVT_NB; i++) {
> +        qemu_get_be32s(f,&s->lvt[i]);
> +    }
> +    qemu_get_be32s(f,&s->esr);
> +    qemu_get_be32s(f,&s->icr[0]);
> +    qemu_get_be32s(f,&s->icr[1]);
> +    qemu_get_be32s(f,&s->divide_conf);
> +    s->count_shift = qemu_get_be32(f);
> +    qemu_get_be32s(f,&s->initial_count);
> +    s->initial_count_load_time = qemu_get_be64(f);
> +    s->next_time = qemu_get_be64(f);
> +
> +    if (version_id>= 2) {
> +        qemu_get_timer(f, s->timer);
> +    }
> +    return 0;
> +}
> +
> +static const VMStateDescription vmstate_apic = {
> +    .name = "apic",
> +    .version_id = 3,
> +    .minimum_version_id = 3,
> +    .minimum_version_id_old = 1,
> +    .load_state_old = apic_load_old,
> +    .fields      = (VMStateField[]) {
> +        VMSTATE_UINT32(apicbase, APICState),
> +        VMSTATE_UINT8(id, APICState),
> +        VMSTATE_UINT8(arb_id, APICState),
> +        VMSTATE_UINT8(tpr, APICState),
> +        VMSTATE_UINT32(spurious_vec, APICState),
> +        VMSTATE_UINT8(log_dest, APICState),
> +        VMSTATE_UINT8(dest_mode, APICState),
> +        VMSTATE_UINT32_ARRAY(isr, APICState, 8),
> +        VMSTATE_UINT32_ARRAY(tmr, APICState, 8),
> +        VMSTATE_UINT32_ARRAY(irr, APICState, 8),
> +        VMSTATE_UINT32_ARRAY(lvt, APICState, APIC_LVT_NB),
> +        VMSTATE_UINT32(esr, APICState),
> +        VMSTATE_UINT32_ARRAY(icr, APICState, 2),
> +        VMSTATE_UINT32(divide_conf, APICState),
> +        VMSTATE_INT32(count_shift, APICState),
> +        VMSTATE_UINT32(initial_count, APICState),
> +        VMSTATE_INT64(initial_count_load_time, APICState),
> +        VMSTATE_INT64(next_time, APICState),
> +        VMSTATE_TIMER(timer, APICState),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
> +static int apic_init(SysBusDevice *dev)
> +{
> +    APICState *s = FROM_SYSBUS(APICState, dev);
> +    static int apic_no;
> +    APICBackend *b;
> +
> +    if (apic_no>= MAX_APICS) {
> +        return -1;
> +    }
> +    s->idx = apic_no++;
> +
> +    QSIMPLEQ_FOREACH(b,&backends, entry) {
> +        if (strcmp(b->name, s->backend_name) == 0) {
> +            s->backend = b;
> +            break;
> +        }
> +    }
> +    if (!s->backend) {
> +        hw_error("APIC backend '%s' not found!", s->backend_name);
> +        exit(1);
> +    }
> +
> +    b->init(s);
> +
> +    sysbus_init_mmio(&s->busdev,&s->io_memory);
> +    return 0;
> +}
> +
> +static SysBusDeviceInfo apic_info = {
> +    .init = apic_init,
> +    .qdev.name = "apic",
> +    .qdev.size = sizeof(APICState),
> +    .qdev.vmsd =&vmstate_apic,
> +    .qdev.reset = apic_reset,
> +    .qdev.no_user = 1,
> +    .qdev.props = (Property[]) {
> +        DEFINE_PROP_UINT8("id", APICState, id, -1),
> +        DEFINE_PROP_PTR("cpu_env", APICState, cpu_env),
> +        DEFINE_PROP_STRING("backend", APICState, backend_name),
> +        DEFINE_PROP_END_OF_LIST(),
> +    }
> +};
> +
> +void apic_register_backend(APICBackend *backend)
> +{
> +    QSIMPLEQ_INSERT_TAIL(&backends, backend, entry);
> +}
> +
> +void apic_register_device(void)
> +{
> +    sysbus_register_withprop(&apic_info);
> +}
> diff --git a/hw/apic_internal.h b/hw/apic_internal.h
> new file mode 100644
> index 0000000..6cbd901
> --- /dev/null
> +++ b/hw/apic_internal.h
> @@ -0,0 +1,119 @@
> +/*
> + *  APIC support - internal interfaces
> + *
> + *  Copyright (c) 2004-2005 Fabrice Bellard
> + *  Copyright (c) 2011      Jan Kiszka, Siemens AG
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see<http://www.gnu.org/licenses/>
> + */
> +#ifndef QEMU_APIC_INTERNAL_H
> +#define QEMU_APIC_INTERNAL_H
> +
> +#include "memory.h"
> +#include "sysbus.h"
> +#include "qemu-timer.h"
> +#include "qemu-queue.h"
> +
> +/* APIC Local Vector Table */
> +#define APIC_LVT_TIMER                  0
> +#define APIC_LVT_THERMAL                1
> +#define APIC_LVT_PERFORM                2
> +#define APIC_LVT_LINT0                  3
> +#define APIC_LVT_LINT1                  4
> +#define APIC_LVT_ERROR                  5
> +#define APIC_LVT_NB                     6
> +
> +/* APIC delivery modes */
> +#define APIC_DM_FIXED                   0
> +#define APIC_DM_LOWPRI                  1
> +#define APIC_DM_SMI                     2
> +#define APIC_DM_NMI                     4
> +#define APIC_DM_INIT                    5
> +#define APIC_DM_SIPI                    6
> +#define APIC_DM_EXTINT                  7
> +
> +/* APIC destination mode */
> +#define APIC_DESTMODE_FLAT              0xf
> +#define APIC_DESTMODE_CLUSTER           1
> +
> +#define APIC_TRIGGER_EDGE               0
> +#define APIC_TRIGGER_LEVEL              1
> +
> +#define APIC_LVT_TIMER_PERIODIC         (1<<17)
> +#define APIC_LVT_MASKED                 (1<<16)
> +#define APIC_LVT_LEVEL_TRIGGER          (1<<15)
> +#define APIC_LVT_REMOTE_IRR             (1<<14)
> +#define APIC_INPUT_POLARITY             (1<<13)
> +#define APIC_SEND_PENDING               (1<<12)
> +
> +#define ESR_ILLEGAL_ADDRESS (1<<  7)
> +
> +#define APIC_SV_DIRECTED_IO             (1<<12)
> +#define APIC_SV_ENABLE                  (1<<8)
> +
> +#define MAX_APICS 255
> +
> +#define MSI_SPACE_SIZE                  0x100000
> +
> +typedef struct APICBackend APICBackend;
> +typedef struct APICState APICState;
> +
> +struct APICBackend {
> +    const char *name;
> +    void (*init)(APICState *s);
> +    void (*set_base)(APICState *s, uint64_t val);
> +    void (*set_tpr)(APICState *s, uint8_t val);
> +    void (*external_nmi)(APICState *s);
> +
> +    QSIMPLEQ_ENTRY(APICBackend) entry;
> +};


Wouldn't this be more naturally modeled by making APICBackend a base class?

In qdev today, this would look like:

struct APICCommon {
    SysBusDevice qdev;
    ...
};

struct APICCommonInfo {
    DeviceInfo qdev;
    void (*init)(APICState *s);
    void (*set_base)(APICState *s, uint64_t val);
    void (*set_tpr)(APICState *s, uint8_t val);
    void (*external_nmi)(APICState *s);
};

Take a look at SCSIDevice for an example of this in practice.  This is nicer
because, as we move save/load into device methods, it becomes natural to define
the state and the save/load functions in the base class.  Provided save/load
only uses base class state, it stays compatible between the in-kernel and
in-qemu device models.
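
To make the idea concrete, here is a minimal, self-contained sketch of the
pattern.  It deliberately does not use the real qdev types: the simplified
APICCommon/APICCommonInfo definitions, the EmulatedAPIC and KernelAPIC
subclasses, the "kvm-apic" name, and the byte-buffer save/load helpers are all
hypothetical stand-ins chosen for illustration, not code from this series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct APICCommon APICCommon;

/* Per-class data: one instance per device model (hypothetical layout). */
typedef struct APICCommonInfo {
    const char *name;
    void (*set_tpr)(APICCommon *s, uint8_t val);
} APICCommonInfo;

/* Base class state: the only state save/load is allowed to touch. */
struct APICCommon {
    const APICCommonInfo *info;
    uint32_t apicbase;
    uint8_t tpr;
};

/* Subclasses embed the base as their first member, so a base pointer can
 * be converted back to the subclass inside the subclass's own methods. */
typedef struct EmulatedAPIC {
    APICCommon common;
    int irq_updates;        /* model-private, never migrated */
} EmulatedAPIC;

typedef struct KernelAPIC {
    APICCommon common;
    int kernel_syncs;       /* model-private, never migrated */
} KernelAPIC;

static void emulated_set_tpr(APICCommon *s, uint8_t val);
static void kernel_set_tpr(APICCommon *s, uint8_t val);

static const APICCommonInfo emulated_info = { "apic", emulated_set_tpr };
static const APICCommonInfo kernel_info = { "kvm-apic", kernel_set_tpr };

static void emulated_set_tpr(APICCommon *s, uint8_t val)
{
    EmulatedAPIC *e = (EmulatedAPIC *)s;

    assert(s->info == &emulated_info);
    s->tpr = (val & 0x0f) << 4;
    e->irq_updates++;       /* stands in for re-evaluating pending IRQs */
}

static void kernel_set_tpr(APICCommon *s, uint8_t val)
{
    KernelAPIC *k = (KernelAPIC *)s;

    assert(s->info == &kernel_info);
    s->tpr = (val & 0x0f) << 4;
    k->kernel_syncs++;      /* stands in for pushing the value to the kernel */
}

#define APIC_COMMON_STATE_SIZE (sizeof(uint32_t) + sizeof(uint8_t))

/* Base class save/load: identical for both models by construction. */
static void apic_common_save(const APICCommon *s, uint8_t *buf)
{
    memcpy(buf, &s->apicbase, sizeof(s->apicbase));
    buf[sizeof(s->apicbase)] = s->tpr;
}

static void apic_common_load(APICCommon *s, const uint8_t *buf)
{
    memcpy(&s->apicbase, buf, sizeof(s->apicbase));
    s->tpr = buf[sizeof(s->apicbase)];
}

/* Set the TPR on the emulated model, save it, load it into the kernel model. */
static uint8_t migrate_tpr(uint8_t val)
{
    EmulatedAPIC src = { .common = { .info = &emulated_info } };
    KernelAPIC dst = { .common = { .info = &kernel_info } };
    uint8_t buf[APIC_COMMON_STATE_SIZE];

    src.common.info->set_tpr(&src.common, val);
    apic_common_save(&src.common, buf);
    apic_common_load(&dst.common, buf);
    return dst.common.tpr;
}
```

Calling migrate_tpr(5) in this sketch moves the TPR from the emulated model to
the kernel model through the shared save/load pair.  Neither helper looks at
the subclass fields, which is what keeps the saved format independent of the
model that produced it.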

Regards,

Anthony Liguori


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
@ 2011-12-19 22:14     ` Anthony Liguori
  0 siblings, 0 replies; 99+ messages in thread
From: Anthony Liguori @ 2011-12-19 22:14 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Anthony Liguori, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Avi Kivity

On 12/15/2011 06:33 AM, Jan Kiszka wrote:
> The KVM in-kernel APIC model will reuse parts of the user space model
> while providing the same frontend view to guest and most management
> interfaces. Introduce an APIC backend concept to encapsulate those
> parts that will tell user space and KVM model apart. The backend offers
> callback hooks for init, base/tpr setting, and the external NMI delivery
> that will be implemented accordingly.
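
For readers skimming the thread, the name-based backend selection described
above can be pictured roughly as follows.  This is only a sketch under stated
assumptions, not the patch's code: it replaces QSIMPLEQ with a hand-rolled
singly linked list, omits the callback hooks, and the Backend type, the helper
names and the "KVM" backend name are hypothetical ("QEMU" is the name the
patch gives its user space backend).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Backend Backend;

struct Backend {
    const char *name;   /* matched against the device's "backend" property */
    Backend *next;      /* list linkage, stands in for QSIMPLEQ_ENTRY */
};

/* Registered backends, kept in registration order (tail insertion). */
static Backend *head;
static Backend **tail = &head;

static void register_backend(Backend *b)
{
    assert(b != NULL && b->name != NULL);
    b->next = NULL;
    *tail = b;
    tail = &b->next;
}

/* Returns the first backend with a matching name, or NULL if none exists. */
static Backend *find_backend(const char *name)
{
    Backend *b;

    for (b = head; b != NULL; b = b->next) {
        if (strcmp(b->name, name) == 0) {
            return b;
        }
    }
    return NULL;
}

static Backend qemu_backend = { .name = "QEMU" };
static Backend kvm_backend = { .name = "KVM" };   /* hypothetical name */
```

After registering both backends, find_backend("KVM") returns the second one
and an unknown name returns NULL.  The patch treats the not-found case as
fatal via hw_error() instead of returning NULL to the caller.
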
>
> Signed-off-by: Jan Kiszka<jan.kiszka@siemens.com>
> ---
>   Makefile.target    |    2 +-
>   hw/apic.c          |  285 +++-------------------------------------------------
>   hw/apic.h          |    1 -
>   hw/apic_common.c   |  265 ++++++++++++++++++++++++++++++++++++++++++++++++
>   hw/apic_internal.h |  119 ++++++++++++++++++++++
>   hw/pc.c            |    1 +
>   6 files changed, 401 insertions(+), 272 deletions(-)
>   create mode 100644 hw/apic_common.c
>   create mode 100644 hw/apic_internal.h
>
> diff --git a/Makefile.target b/Makefile.target
> index 1d24a30..c46f062 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -231,7 +231,7 @@ obj-$(CONFIG_IVSHMEM) += ivshmem.o
>   # Hardware support
>   obj-i386-y += vga.o
>   obj-i386-y += mc146818rtc.o pc.o
> -obj-i386-y += cirrus_vga.o sga.o apic.o ioapic.o piix_pci.o
> +obj-i386-y += cirrus_vga.o sga.o apic_common.o apic.o ioapic.o piix_pci.o
>   obj-i386-y += vmport.o
>   obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
>   obj-i386-y += debugcon.o multiboot.o
> diff --git a/hw/apic.c b/hw/apic.c
> index bec493b..5fa3111 100644
> --- a/hw/apic.c
> +++ b/hw/apic.c
> @@ -16,53 +16,13 @@
>    * You should have received a copy of the GNU Lesser General Public
>    * License along with this library; if not, see<http://www.gnu.org/licenses/>
>    */
> -#include "hw.h"
> +#include "apic_internal.h"
>   #include "apic.h"
>   #include "ioapic.h"
> -#include "qemu-timer.h"
>   #include "host-utils.h"
> -#include "sysbus.h"
>   #include "trace.h"
>   #include "pc.h"
>
> -/* APIC Local Vector Table */
> -#define APIC_LVT_TIMER   0
> -#define APIC_LVT_THERMAL 1
> -#define APIC_LVT_PERFORM 2
> -#define APIC_LVT_LINT0   3
> -#define APIC_LVT_LINT1   4
> -#define APIC_LVT_ERROR   5
> -#define APIC_LVT_NB      6
> -
> -/* APIC delivery modes */
> -#define APIC_DM_FIXED	0
> -#define APIC_DM_LOWPRI	1
> -#define APIC_DM_SMI	2
> -#define APIC_DM_NMI	4
> -#define APIC_DM_INIT	5
> -#define APIC_DM_SIPI	6
> -#define APIC_DM_EXTINT	7
> -
> -/* APIC destination mode */
> -#define APIC_DESTMODE_FLAT	0xf
> -#define APIC_DESTMODE_CLUSTER	1
> -
> -#define APIC_TRIGGER_EDGE  0
> -#define APIC_TRIGGER_LEVEL 1
> -
> -#define	APIC_LVT_TIMER_PERIODIC		(1<<17)
> -#define	APIC_LVT_MASKED			(1<<16)
> -#define	APIC_LVT_LEVEL_TRIGGER		(1<<15)
> -#define	APIC_LVT_REMOTE_IRR		(1<<14)
> -#define	APIC_INPUT_POLARITY		(1<<13)
> -#define	APIC_SEND_PENDING		(1<<12)
> -
> -#define ESR_ILLEGAL_ADDRESS (1<<  7)
> -
> -#define APIC_SV_DIRECTED_IO             (1<<12)
> -#define APIC_SV_ENABLE                  (1<<8)
> -
> -#define MAX_APICS 255
>   #define MAX_APIC_WORDS 8
>
>   /* Intel APIC constants: from include/asm/msidef.h */
> @@ -75,40 +35,7 @@
>   #define MSI_ADDR_DEST_ID_SHIFT		12
>   #define	MSI_ADDR_DEST_ID_MASK		0x00ffff0
>
> -#define MSI_ADDR_SIZE                   0x100000
> -
> -typedef struct APICState APICState;
> -
> -struct APICState {
> -    SysBusDevice busdev;
> -    MemoryRegion io_memory;
> -    void *cpu_env;
> -    uint32_t apicbase;
> -    uint8_t id;
> -    uint8_t arb_id;
> -    uint8_t tpr;
> -    uint32_t spurious_vec;
> -    uint8_t log_dest;
> -    uint8_t dest_mode;
> -    uint32_t isr[8];  /* in service register */
> -    uint32_t tmr[8];  /* trigger mode register */
> -    uint32_t irr[8]; /* interrupt request register */
> -    uint32_t lvt[APIC_LVT_NB];
> -    uint32_t esr; /* error register */
> -    uint32_t icr[2];
> -
> -    uint32_t divide_conf;
> -    int count_shift;
> -    uint32_t initial_count;
> -    int64_t initial_count_load_time, next_time;
> -    uint32_t idx;
> -    QEMUTimer *timer;
> -    int sipi_vector;
> -    int wait_for_sipi;
> -};
> -
>   static APICState *local_apics[MAX_APICS + 1];
> -static int apic_irq_delivered;
>
>   static void apic_set_irq(APICState *s, int vector_num, int trigger_mode);
>   static void apic_update_irq(APICState *s);
> @@ -205,10 +132,8 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
>       }
>   }
>
> -void apic_deliver_nmi(DeviceState *d)
> +static void apic_external_nmi(APICState *s)
>   {
> -    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> -
>       apic_local_deliver(s, APIC_LVT_LINT1);
>   }
>
> @@ -300,14 +225,8 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode, uint8_t delivery_mode,
>       apic_bus_deliver(deliver_bitmask, delivery_mode, vector_num, trigger_mode);
>   }
>
> -void cpu_set_apic_base(DeviceState *d, uint64_t val)
> +static void apic_set_base(APICState *s, uint64_t val)
>   {
> -    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> -
> -    trace_cpu_set_apic_base(val);
> -
> -    if (!s)
> -        return;
>       s->apicbase = (val&  0xfffff000) |
>           (s->apicbase&  (MSR_IA32_APICBASE_BSP | MSR_IA32_APICBASE_ENABLE));
>       /* if disabled, cannot be enabled again */
> @@ -318,32 +237,12 @@ void cpu_set_apic_base(DeviceState *d, uint64_t val)
>       }
>   }
>
> -uint64_t cpu_get_apic_base(DeviceState *d)
> +static void apic_set_tpr(APICState *s, uint8_t val)
>   {
> -    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> -
> -    trace_cpu_get_apic_base(s ? (uint64_t)s->apicbase: 0);
> -
> -    return s ? s->apicbase : 0;
> -}
> -
> -void cpu_set_apic_tpr(DeviceState *d, uint8_t val)
> -{
> -    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> -
> -    if (!s)
> -        return;
>       s->tpr = (val&  0x0f)<<  4;
>       apic_update_irq(s);
>   }
>
> -uint8_t cpu_get_apic_tpr(DeviceState *d)
> -{
> -    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> -
> -    return s ? s->tpr>>  4 : 0;
> -}
> -
>   /* return -1 if no bit is set */
>   static int get_highest_priority_int(uint32_t *tab)
>   {
> @@ -413,27 +312,6 @@ static void apic_update_irq(APICState *s)
>       }
>   }
>
> -void apic_report_irq_delivered(int delivered)
> -{
> -    apic_irq_delivered += delivered;
> -
> -    trace_apic_report_irq_delivered(apic_irq_delivered);
> -}
> -
> -void apic_reset_irq_delivered(void)
> -{
> -    trace_apic_reset_irq_delivered(apic_irq_delivered);
> -
> -    apic_irq_delivered = 0;
> -}
> -
> -int apic_get_irq_delivered(void)
> -{
> -    trace_apic_get_irq_delivered(apic_irq_delivered);
> -
> -    return apic_irq_delivered;
> -}
> -
>   static void apic_set_irq(APICState *s, int vector_num, int trigger_mode)
>   {
>       apic_report_irq_delivered(!get_bit(s->irr, vector_num));
> @@ -515,35 +393,6 @@ static void apic_get_delivery_bitmask(uint32_t *deliver_bitmask,
>       }
>   }
>
> -void apic_init_reset(DeviceState *d)
> -{
> -    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> -    int i;
> -
> -    if (!s)
> -        return;
> -
> -    s->tpr = 0;
> -    s->spurious_vec = 0xff;
> -    s->log_dest = 0;
> -    s->dest_mode = 0xf;
> -    memset(s->isr, 0, sizeof(s->isr));
> -    memset(s->tmr, 0, sizeof(s->tmr));
> -    memset(s->irr, 0, sizeof(s->irr));
> -    for(i = 0; i<  APIC_LVT_NB; i++)
> -        s->lvt[i] = 1<<  16; /* mask LVT */
> -    s->esr = 0;
> -    memset(s->icr, 0, sizeof(s->icr));
> -    s->divide_conf = 0;
> -    s->count_shift = 0;
> -    s->initial_count = 0;
> -    s->initial_count_load_time = 0;
> -    s->next_time = 0;
> -    s->wait_for_sipi = 1;
> -
> -    qemu_del_timer(s->timer);
> -}
> -
>   static void apic_startup(APICState *s, int vector_num)
>   {
>       s->sipi_vector = vector_num;
> @@ -904,96 +753,6 @@ static void apic_mem_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
>       }
>   }
>
> -/* This function is only used for old state version 1 and 2 */
> -static int apic_load_old(QEMUFile *f, void *opaque, int version_id)
> -{
> -    APICState *s = opaque;
> -    int i;
> -
> -    if (version_id>  2)
> -        return -EINVAL;
> -
> -    /* XXX: what if the base changes? (registered memory regions) */
> -    qemu_get_be32s(f,&s->apicbase);
> -    qemu_get_8s(f,&s->id);
> -    qemu_get_8s(f,&s->arb_id);
> -    qemu_get_8s(f,&s->tpr);
> -    qemu_get_be32s(f,&s->spurious_vec);
> -    qemu_get_8s(f,&s->log_dest);
> -    qemu_get_8s(f,&s->dest_mode);
> -    for (i = 0; i<  8; i++) {
> -        qemu_get_be32s(f,&s->isr[i]);
> -        qemu_get_be32s(f,&s->tmr[i]);
> -        qemu_get_be32s(f,&s->irr[i]);
> -    }
> -    for (i = 0; i<  APIC_LVT_NB; i++) {
> -        qemu_get_be32s(f,&s->lvt[i]);
> -    }
> -    qemu_get_be32s(f,&s->esr);
> -    qemu_get_be32s(f,&s->icr[0]);
> -    qemu_get_be32s(f,&s->icr[1]);
> -    qemu_get_be32s(f,&s->divide_conf);
> -    s->count_shift=qemu_get_be32(f);
> -    qemu_get_be32s(f,&s->initial_count);
> -    s->initial_count_load_time=qemu_get_be64(f);
> -    s->next_time=qemu_get_be64(f);
> -
> -    if (version_id>= 2)
> -        qemu_get_timer(f, s->timer);
> -    return 0;
> -}
> -
> -static const VMStateDescription vmstate_apic = {
> -    .name = "apic",
> -    .version_id = 3,
> -    .minimum_version_id = 3,
> -    .minimum_version_id_old = 1,
> -    .load_state_old = apic_load_old,
> -    .fields      = (VMStateField []) {
> -        VMSTATE_UINT32(apicbase, APICState),
> -        VMSTATE_UINT8(id, APICState),
> -        VMSTATE_UINT8(arb_id, APICState),
> -        VMSTATE_UINT8(tpr, APICState),
> -        VMSTATE_UINT32(spurious_vec, APICState),
> -        VMSTATE_UINT8(log_dest, APICState),
> -        VMSTATE_UINT8(dest_mode, APICState),
> -        VMSTATE_UINT32_ARRAY(isr, APICState, 8),
> -        VMSTATE_UINT32_ARRAY(tmr, APICState, 8),
> -        VMSTATE_UINT32_ARRAY(irr, APICState, 8),
> -        VMSTATE_UINT32_ARRAY(lvt, APICState, APIC_LVT_NB),
> -        VMSTATE_UINT32(esr, APICState),
> -        VMSTATE_UINT32_ARRAY(icr, APICState, 2),
> -        VMSTATE_UINT32(divide_conf, APICState),
> -        VMSTATE_INT32(count_shift, APICState),
> -        VMSTATE_UINT32(initial_count, APICState),
> -        VMSTATE_INT64(initial_count_load_time, APICState),
> -        VMSTATE_INT64(next_time, APICState),
> -        VMSTATE_TIMER(timer, APICState),
> -        VMSTATE_END_OF_LIST()
> -    }
> -};
> -
> -static void apic_reset(DeviceState *d)
> -{
> -    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> -    int bsp;
> -
> -    bsp = cpu_is_bsp(s->cpu_env);
> -    s->apicbase = 0xfee00000 |
> -        (bsp ? MSR_IA32_APICBASE_BSP : 0) | MSR_IA32_APICBASE_ENABLE;
> -
> -    apic_init_reset(d);
> -
> -    if (bsp) {
> -        /*
> -         * LINT0 delivery mode on CPU #0 is set to ExtInt at initialization
> -         * time typically by BIOS, so PIC interrupt can be delivered to the
> -         * processor when local APIC is enabled.
> -         */
> -        s->lvt[APIC_LVT_LINT0] = 0x700;
> -    }
> -}
> -
>   static const MemoryRegionOps apic_io_ops = {
>       .old_mmio = {
>           .read = { apic_mem_readb, apic_mem_readw, apic_mem_readl, },
> @@ -1002,41 +761,27 @@ static const MemoryRegionOps apic_io_ops = {
>       .endianness = DEVICE_NATIVE_ENDIAN,
>   };
>
> -static int apic_init1(SysBusDevice *dev)
> +static void apic_backend_init(APICState *s)
>   {
> -    APICState *s = FROM_SYSBUS(APICState, dev);
> -    static int last_apic_idx;
> -
> -    if (last_apic_idx>= MAX_APICS) {
> -        return -1;
> -    }
> -    memory_region_init_io(&s->io_memory,&apic_io_ops, s, "apic",
> -                          MSI_ADDR_SIZE);
> -    sysbus_init_mmio(dev,&s->io_memory);
> +    memory_region_init_io(&s->io_memory,&apic_io_ops, s, "apic-msi",
> +                          MSI_SPACE_SIZE);
>
>       s->timer = qemu_new_timer_ns(vm_clock, apic_timer, s);
> -    s->idx = last_apic_idx++;
>       local_apics[s->idx] = s;
> -    return 0;
>   }
>
> -static SysBusDeviceInfo apic_info = {
> -    .init = apic_init1,
> -    .qdev.name = "apic",
> -    .qdev.size = sizeof(APICState),
> -    .qdev.vmsd =&vmstate_apic,
> -    .qdev.reset = apic_reset,
> -    .qdev.no_user = 1,
> -    .qdev.props = (Property[]) {
> -        DEFINE_PROP_UINT8("id", APICState, id, -1),
> -        DEFINE_PROP_PTR("cpu_env", APICState, cpu_env),
> -        DEFINE_PROP_END_OF_LIST(),
> -    }
> +static APICBackend apic_backend = {
> +    .name = "QEMU",
> +    .init = apic_backend_init,
> +    .set_base = apic_set_base,
> +    .set_tpr = apic_set_tpr,
> +    .external_nmi = apic_external_nmi,
>   };
>
>   static void apic_register_devices(void)
>   {
> -    sysbus_register_withprop(&apic_info);
> +    apic_register_device();
> +    apic_register_backend(&apic_backend);
>   }
>
>   device_init(apic_register_devices)
> diff --git a/hw/apic.h b/hw/apic.h
> index 8173d8a..a62d83b 100644
> --- a/hw/apic.h
> +++ b/hw/apic.h
> @@ -10,7 +10,6 @@ int apic_accept_pic_intr(DeviceState *s);
>   void apic_deliver_pic_intr(DeviceState *s, int level);
>   void apic_deliver_nmi(DeviceState *d);
>   int apic_get_interrupt(DeviceState *s);
> -void apic_report_irq_delivered(int delivered);
>   void apic_reset_irq_delivered(void);
>   int apic_get_irq_delivered(void);
>   void cpu_set_apic_base(DeviceState *s, uint64_t val);
> diff --git a/hw/apic_common.c b/hw/apic_common.c
> new file mode 100644
> index 0000000..4cdc45c
> --- /dev/null
> +++ b/hw/apic_common.c
> @@ -0,0 +1,265 @@
> +/*
> + *  APIC support - common bits of emulated and KVM kernel model
> + *
> + *  Copyright (c) 2004-2005 Fabrice Bellard
> + *  Copyright (c) 2011      Jan Kiszka, Siemens AG
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see<http://www.gnu.org/licenses/>
> + */
> +#include "apic.h"
> +#include "apic_internal.h"
> +#include "trace.h"
> +
> +static QSIMPLEQ_HEAD(, APICBackend) backends =
> +    QSIMPLEQ_HEAD_INITIALIZER(backends);
> +static int apic_irq_delivered;
> +
> +void cpu_set_apic_base(DeviceState *d, uint64_t val)
> +{
> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> +
> +    trace_cpu_set_apic_base(val);
> +
> +    if (s) {
> +        s->backend->set_base(s, val);
> +    }
> +}
> +
> +uint64_t cpu_get_apic_base(DeviceState *d)
> +{
> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> +
> +    trace_cpu_get_apic_base(s ? (uint64_t)s->apicbase : 0);
> +
> +    return s ? s->apicbase : 0;
> +}
> +
> +void cpu_set_apic_tpr(DeviceState *d, uint8_t val)
> +{
> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> +
> +    if (s) {
> +        s->backend->set_tpr(s, val);
> +    }
> +}
> +
> +uint8_t cpu_get_apic_tpr(DeviceState *d)
> +{
> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> +
> +    return s ? s->tpr>>  4 : 0;
> +}
> +
> +void apic_report_irq_delivered(int delivered)
> +{
> +    apic_irq_delivered += delivered;
> +
> +    trace_apic_report_irq_delivered(apic_irq_delivered);
> +}
> +
> +void apic_reset_irq_delivered(void)
> +{
> +    trace_apic_reset_irq_delivered(apic_irq_delivered);
> +
> +    apic_irq_delivered = 0;
> +}
> +
> +int apic_get_irq_delivered(void)
> +{
> +    trace_apic_get_irq_delivered(apic_irq_delivered);
> +
> +    return apic_irq_delivered;
> +}
> +
> +void apic_deliver_nmi(DeviceState *d)
> +{
> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> +
> +    s->backend->external_nmi(s);
> +}
> +
> +void apic_init_reset(DeviceState *d)
> +{
> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> +    int i;
> +
> +    if (!s) {
> +        return;
> +    }
> +    s->tpr = 0;
> +    s->spurious_vec = 0xff;
> +    s->log_dest = 0;
> +    s->dest_mode = 0xf;
> +    memset(s->isr, 0, sizeof(s->isr));
> +    memset(s->tmr, 0, sizeof(s->tmr));
> +    memset(s->irr, 0, sizeof(s->irr));
> +    for (i = 0; i<  APIC_LVT_NB; i++) {
> +        s->lvt[i] = APIC_LVT_MASKED;
> +    }
> +    s->esr = 0;
> +    memset(s->icr, 0, sizeof(s->icr));
> +    s->divide_conf = 0;
> +    s->count_shift = 0;
> +    s->initial_count = 0;
> +    s->initial_count_load_time = 0;
> +    s->next_time = 0;
> +    s->wait_for_sipi = 1;
> +
> +    qemu_del_timer(s->timer);
> +}
> +
> +static void apic_reset(DeviceState *d)
> +{
> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> +    bool bsp;
> +
> +    bsp = cpu_is_bsp(s->cpu_env);
> +    s->apicbase = 0xfee00000 |
> +        (bsp ? MSR_IA32_APICBASE_BSP : 0) | MSR_IA32_APICBASE_ENABLE;
> +
> +    apic_init_reset(d);
> +
> +    if (bsp) {
> +        /*
> +         * LINT0 delivery mode on CPU #0 is set to ExtInt at initialization
> +         * time typically by BIOS, so PIC interrupt can be delivered to the
> +         * processor when local APIC is enabled.
> +         */
> +        s->lvt[APIC_LVT_LINT0] = 0x700;
> +    }
> +}
> +
> +/* This function is only used for old state version 1 and 2 */
> +static int apic_load_old(QEMUFile *f, void *opaque, int version_id)
> +{
> +    APICState *s = opaque;
> +    int i;
> +
> +    if (version_id > 2) {
> +        return -EINVAL;
> +    }
> +
> +    /* XXX: what if the base changes? (registered memory regions) */
> +    qemu_get_be32s(f, &s->apicbase);
> +    qemu_get_8s(f, &s->id);
> +    qemu_get_8s(f, &s->arb_id);
> +    qemu_get_8s(f, &s->tpr);
> +    qemu_get_be32s(f, &s->spurious_vec);
> +    qemu_get_8s(f, &s->log_dest);
> +    qemu_get_8s(f, &s->dest_mode);
> +    for (i = 0; i < 8; i++) {
> +        qemu_get_be32s(f, &s->isr[i]);
> +        qemu_get_be32s(f, &s->tmr[i]);
> +        qemu_get_be32s(f, &s->irr[i]);
> +    }
> +    for (i = 0; i < APIC_LVT_NB; i++) {
> +        qemu_get_be32s(f, &s->lvt[i]);
> +    }
> +    qemu_get_be32s(f, &s->esr);
> +    qemu_get_be32s(f, &s->icr[0]);
> +    qemu_get_be32s(f, &s->icr[1]);
> +    qemu_get_be32s(f, &s->divide_conf);
> +    s->count_shift = qemu_get_be32(f);
> +    qemu_get_be32s(f, &s->initial_count);
> +    s->initial_count_load_time = qemu_get_be64(f);
> +    s->next_time = qemu_get_be64(f);
> +
> +    if (version_id >= 2) {
> +        qemu_get_timer(f, s->timer);
> +    }
> +    return 0;
> +}
> +
> +static const VMStateDescription vmstate_apic = {
> +    .name = "apic",
> +    .version_id = 3,
> +    .minimum_version_id = 3,
> +    .minimum_version_id_old = 1,
> +    .load_state_old = apic_load_old,
> +    .fields      = (VMStateField[]) {
> +        VMSTATE_UINT32(apicbase, APICState),
> +        VMSTATE_UINT8(id, APICState),
> +        VMSTATE_UINT8(arb_id, APICState),
> +        VMSTATE_UINT8(tpr, APICState),
> +        VMSTATE_UINT32(spurious_vec, APICState),
> +        VMSTATE_UINT8(log_dest, APICState),
> +        VMSTATE_UINT8(dest_mode, APICState),
> +        VMSTATE_UINT32_ARRAY(isr, APICState, 8),
> +        VMSTATE_UINT32_ARRAY(tmr, APICState, 8),
> +        VMSTATE_UINT32_ARRAY(irr, APICState, 8),
> +        VMSTATE_UINT32_ARRAY(lvt, APICState, APIC_LVT_NB),
> +        VMSTATE_UINT32(esr, APICState),
> +        VMSTATE_UINT32_ARRAY(icr, APICState, 2),
> +        VMSTATE_UINT32(divide_conf, APICState),
> +        VMSTATE_INT32(count_shift, APICState),
> +        VMSTATE_UINT32(initial_count, APICState),
> +        VMSTATE_INT64(initial_count_load_time, APICState),
> +        VMSTATE_INT64(next_time, APICState),
> +        VMSTATE_TIMER(timer, APICState),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
> +static int apic_init(SysBusDevice *dev)
> +{
> +    APICState *s = FROM_SYSBUS(APICState, dev);
> +    static int apic_no;
> +    APICBackend *b;
> +
> +    if (apic_no >= MAX_APICS) {
> +        return -1;
> +    }
> +    s->idx = apic_no++;
> +
> +    QSIMPLEQ_FOREACH(b, &backends, entry) {
> +        if (strcmp(b->name, s->backend_name) == 0) {
> +            s->backend = b;
> +            break;
> +        }
> +    }
> +    if (!s->backend) {
> +        hw_error("APIC backend '%s' not found!", s->backend_name);
> +        exit(1);
> +    }
> +
> +    b->init(s);
> +
> +    sysbus_init_mmio(&s->busdev, &s->io_memory);
> +    return 0;
> +}
> +
> +static SysBusDeviceInfo apic_info = {
> +    .init = apic_init,
> +    .qdev.name = "apic",
> +    .qdev.size = sizeof(APICState),
> +    .qdev.vmsd = &vmstate_apic,
> +    .qdev.reset = apic_reset,
> +    .qdev.no_user = 1,
> +    .qdev.props = (Property[]) {
> +        DEFINE_PROP_UINT8("id", APICState, id, -1),
> +        DEFINE_PROP_PTR("cpu_env", APICState, cpu_env),
> +        DEFINE_PROP_STRING("backend", APICState, backend_name),
> +        DEFINE_PROP_END_OF_LIST(),
> +    }
> +};
> +
> +void apic_register_backend(APICBackend *backend)
> +{
> +    QSIMPLEQ_INSERT_TAIL(&backends, backend, entry);
> +}
> +
> +void apic_register_device(void)
> +{
> +    sysbus_register_withprop(&apic_info);
> +}
> diff --git a/hw/apic_internal.h b/hw/apic_internal.h
> new file mode 100644
> index 0000000..6cbd901
> --- /dev/null
> +++ b/hw/apic_internal.h
> @@ -0,0 +1,119 @@
> +/*
> + *  APIC support - internal interfaces
> + *
> + *  Copyright (c) 2004-2005 Fabrice Bellard
> + *  Copyright (c) 2011      Jan Kiszka, Siemens AG
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see <http://www.gnu.org/licenses/>
> + */
> +#ifndef QEMU_APIC_INTERNAL_H
> +#define QEMU_APIC_INTERNAL_H
> +
> +#include "memory.h"
> +#include "sysbus.h"
> +#include "qemu-timer.h"
> +#include "qemu-queue.h"
> +
> +/* APIC Local Vector Table */
> +#define APIC_LVT_TIMER                  0
> +#define APIC_LVT_THERMAL                1
> +#define APIC_LVT_PERFORM                2
> +#define APIC_LVT_LINT0                  3
> +#define APIC_LVT_LINT1                  4
> +#define APIC_LVT_ERROR                  5
> +#define APIC_LVT_NB                     6
> +
> +/* APIC delivery modes */
> +#define APIC_DM_FIXED                   0
> +#define APIC_DM_LOWPRI                  1
> +#define APIC_DM_SMI                     2
> +#define APIC_DM_NMI                     4
> +#define APIC_DM_INIT                    5
> +#define APIC_DM_SIPI                    6
> +#define APIC_DM_EXTINT                  7
> +
> +/* APIC destination mode */
> +#define APIC_DESTMODE_FLAT              0xf
> +#define APIC_DESTMODE_CLUSTER           1
> +
> +#define APIC_TRIGGER_EDGE               0
> +#define APIC_TRIGGER_LEVEL              1
> +
> +#define APIC_LVT_TIMER_PERIODIC         (1<<17)
> +#define APIC_LVT_MASKED                 (1<<16)
> +#define APIC_LVT_LEVEL_TRIGGER          (1<<15)
> +#define APIC_LVT_REMOTE_IRR             (1<<14)
> +#define APIC_INPUT_POLARITY             (1<<13)
> +#define APIC_SEND_PENDING               (1<<12)
> +
> +#define ESR_ILLEGAL_ADDRESS (1 << 7)
> +
> +#define APIC_SV_DIRECTED_IO             (1<<12)
> +#define APIC_SV_ENABLE                  (1<<8)
> +
> +#define MAX_APICS 255
> +
> +#define MSI_SPACE_SIZE                  0x100000
> +
> +typedef struct APICBackend APICBackend;
> +typedef struct APICState APICState;
> +
> +struct APICBackend {
> +    const char *name;
> +    void (*init)(APICState *s);
> +    void (*set_base)(APICState *s, uint64_t val);
> +    void (*set_tpr)(APICState *s, uint8_t val);
> +    void (*external_nmi)(APICState *s);
> +
> +    QSIMPLEQ_ENTRY(APICBackend) entry;
> +};


Wouldn't this be more naturally modeled by making APICBackend be a base class?

In qdev today, this would look like:

struct APICCommon {
    SysBusDevice qdev;
    ...
};

struct APICCommonInfo {
     DeviceInfo qdev;
     void (*init)(APICState *s);
     void (*set_base)(APICState *s, uint64_t val);
     void (*set_tpr)(APICState *s, uint8_t val);
     void (*external_nmi)(APICState *s);
};

Take a look at SCSIDevice for an example of this in practice.  This is nicer
because, as we move save/load into device methods, it becomes natural to define
the state and the save/load functions in the base class.  Provided they only use
base class state, save/load stays compatible between the in-kernel and in-qemu
device models.

Regards,

Anthony Liguori

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v5 07/16] apic: Open-code timer save/restore
  2011-12-15 12:33   ` [Qemu-devel] " Jan Kiszka
@ 2011-12-19 22:21     ` Anthony Liguori
  -1 siblings, 0 replies; 99+ messages in thread
From: Anthony Liguori @ 2011-12-19 22:21 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Anthony Liguori, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Avi Kivity

On 12/15/2011 06:33 AM, Jan Kiszka wrote:
> To enable migration between accelerated and non-accelerated APIC models,
> we will need to handle the timer saving and restoring specially and can
> no longer rely on the automatics of VMSTATE_TIMER. Specifically,
> accelerated model will not start any QEMUTimer.
>
> This patch therefore factors out the generic bits into apic_next_timer
> and introduces a post-load callback that can be implemented differently
> by both models.
>
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

So you basically want the timer to be a dummy field for the in-kernel apic?

Can you fix this up in a pre-save routine (put QEMUTimer into a state where 
there isn't an event pending)?

Regards,

Anthony Liguori

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/16] uq/master: Introduce basic irqchip support
  2011-12-19 21:17   ` [Qemu-devel] " Marcelo Tosatti
@ 2011-12-19 22:24     ` Anthony Liguori
  -1 siblings, 0 replies; 99+ messages in thread
From: Anthony Liguori @ 2011-12-19 22:24 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Jan Kiszka, Anthony Liguori, Lai Jiangshan, kvm,
	Michael S. Tsirkin, qemu-devel, Blue Swirl, Avi Kivity

On 12/19/2011 03:17 PM, Marcelo Tosatti wrote:
>
> Anthony,
>
> Can you please review & ACK?
>
> You could even apply directly but we'll do a kvm-autotest run through
> uq/master. Still, your review is needed.

Overall, it looks good except for the backend/frontend split.  This should be 
done in terms of qdev inheritance.  As we progress to QOM, this will mean that 
the various links will just be link<APICCommon> or whatever it ends up being called.

Regards,

Anthony Liguori

>
> Thanks
>
> On Thu, Dec 15, 2011 at 01:33:15PM +0100, Jan Kiszka wrote:
>> Changes in v5:
>> - properly introduce apic_report_irq_delivered (instead of
>>    apic_set_irq_delivered silently)
>> - rework apic to kvm core interface according to Blue's suggestion
>>
>> CC: Lai Jiangshan <laijs@cn.fujitsu.com>
>>
>> Jan Kiszka (16):
>>    msi: Generalize msix_supported to msi_supported
>>    kvm: Move kvmclock into hw/kvm folder
>>    apic: Stop timer on reset
>>    apic: Inject external NMI events via LINT1
>>    apic: Introduce apic_report_irq_delivered
>>    apic: Introduce backend/frontend infrastructure for KVM reuse
>>    apic: Open-code timer save/restore
>>    i8259: Introduce backend/frontend infrastructure for KVM reuse
>>    ioapic: Introduce backend/frontend infrastructure for KVM reuse
>>    memory: Introduce memory_region_init_reservation
>>    kvm: Introduce core services for in-kernel irqchip support
>>    kvm: x86: Establish IRQ0 override control
>>    kvm: x86: Add user space part for in-kernel APIC
>>    kvm: x86: Add user space part for in-kernel i8259
>>    kvm: x86: Add user space part for in-kernel IOAPIC
>>    kvm: Arm in-kernel irqchip support
>>
>>   Makefile.objs                  |    2 +-
>>   Makefile.target                |    6 +-
>>   configure                      |    1 +
>>   hw/apic.c                      |  309 ++++-----------------------------------
>>   hw/apic.h                      |    1 +
>>   hw/apic_common.c               |  312 ++++++++++++++++++++++++++++++++++++++++
>>   hw/apic_internal.h             |  122 ++++++++++++++++
>>   hw/i8259.c                     |  127 ++--------------
>>   hw/i8259_common.c              |  173 ++++++++++++++++++++++
>>   hw/i8259_internal.h            |   82 +++++++++++
>>   hw/ioapic.c                    |  130 ++---------------
>>   hw/ioapic_common.c             |  138 ++++++++++++++++++
>>   hw/ioapic_internal.h           |  106 ++++++++++++++
>>   hw/kvm/apic.c                  |  138 ++++++++++++++++++
>>   hw/{kvmclock.c =>  kvm/clock.c} |    4 +-
>>   hw/{kvmclock.h =>  kvm/clock.h} |    0
>>   hw/kvm/i8259.c                 |  126 ++++++++++++++++
>>   hw/kvm/ioapic.c                |  101 +++++++++++++
>>   hw/msi.c                       |    8 +
>>   hw/msi.h                       |    2 +
>>   hw/msix.c                      |    9 +-
>>   hw/msix.h                      |    2 -
>>   hw/pc.c                        |   19 ++-
>>   hw/pc.h                        |    1 +
>>   hw/pc_piix.c                   |   66 ++++++++-
>>   kvm-all.c                      |  154 ++++++++++++++++++++
>>   kvm-stub.c                     |    5 +
>>   kvm.h                          |   14 ++
>>   memory.c                       |   36 +++++
>>   memory.h                       |   16 ++
>>   monitor.c                      |    6 +-
>>   qemu-config.c                  |    4 +
>>   qemu-options.hx                |    5 +-
>>   sysemu.h                       |    1 -
>>   target-i386/kvm.c              |   49 +++++++
>>   trace-events                   |    2 +-
>>   vl.c                           |    1 -
>>   37 files changed, 1739 insertions(+), 539 deletions(-)
>>   create mode 100644 hw/apic_common.c
>>   create mode 100644 hw/apic_internal.h
>>   create mode 100644 hw/i8259_common.c
>>   create mode 100644 hw/i8259_internal.h
>>   create mode 100644 hw/ioapic_common.c
>>   create mode 100644 hw/ioapic_internal.h
>>   create mode 100644 hw/kvm/apic.c
>>   rename hw/{kvmclock.c =>  kvm/clock.c} (98%)
>>   rename hw/{kvmclock.h =>  kvm/clock.h} (100%)
>>   create mode 100644 hw/kvm/i8259.c
>>   create mode 100644 hw/kvm/ioapic.c
>>
>> --
>> 1.7.3.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-19 22:14     ` Anthony Liguori
@ 2011-12-19 23:32       ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-19 23:32 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Avi Kivity, Marcelo Tosatti, Blue Swirl, Anthony Liguori,
	qemu-devel, kvm, Michael S. Tsirkin

[-- Attachment #1: Type: text/plain, Size: 1654 bytes --]

[ Please strip your replies a bit. I always worry about missing a comment
when scrolling through dozens of pages. ]

On 2011-12-19 23:14, Anthony Liguori wrote:
>> +
>> +struct APICBackend {
>> +    const char *name;
>> +    void (*init)(APICState *s);
>> +    void (*set_base)(APICState *s, uint64_t val);
>> +    void (*set_tpr)(APICState *s, uint8_t val);
>> +    void (*external_nmi)(APICState *s);
>> +
>> +    QSIMPLEQ_ENTRY(APICBackend) entry;
>> +};
> 
> 
> Wouldn't this be more naturally modeled by making APICBackend be a base
> class?
> 
> In qdev today, this would look like:
> 
> struct APICCommon {
>    SysBusDevice qdev;
>    ...
> };
> 
> struct APICCommonInfo {
>     DeviceInfo qdev;
>     void (*init)(APICState *s);
>     void (*set_base)(APICState *s, uint64_t val);
>     void (*set_tpr)(APICState *s, uint8_t val);
>     void (*external_nmi)(APICState *s);
> };
> 
> Take a look at SCSIDevice for an example of this in practice.  This is
> nicer because as we move save/load into devices methods, it becomes
> natural to define the state and save/load function in the base class. 
> Provided it only uses base class state, it lets save/load be compatible
> between both in-kernel and in-qemu device model.

The difference is (unless I completely miss your point) that a common
SCSI base class is used by different derived classes. Here we have a
common frontend class but different base classes, so to speak. And we have
a mechanism to choose where to inherit from on instantiation. This is
precisely what allows us to keep the compatibility between the in-kernel
and user space models in this series.

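The selection mechanism described above can be sketched roughly as follows.
This is a minimal, self-contained illustration and not the code from the
patch: the Backend and Device types, the plain linked list (instead of
QSIMPLEQ) and the function names are all stand-ins invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the patch's APICBackend/APICState. */
typedef struct Backend {
    const char *name;
    struct Backend *next;       /* plain singly linked list, not QSIMPLEQ */
} Backend;

typedef struct Device {
    const char *backend_name;   /* set as a property before init */
    const Backend *backend;     /* resolved once, at init time */
} Device;

static Backend *backends;       /* registry head */

/* Add a backend to the registry (order does not matter for lookup). */
void register_backend(Backend *b)
{
    b->next = backends;
    backends = b;
}

/* Resolve the backend by name; returns 0 on success, -1 if unknown. */
int device_init(Device *d)
{
    assert(d->backend_name != NULL);
    for (const Backend *b = backends; b != NULL; b = b->next) {
        if (strcmp(b->name, d->backend_name) == 0) {
            d->backend = b;
            return 0;
        }
    }
    return -1;
}
```

The point of the sketch is that there is only one device type; which
implementation it uses is decided by a string property when the instance
is initialized.
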
Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v5 07/16] apic: Open-code timer save/restore
  2011-12-19 22:21     ` [Qemu-devel] " Anthony Liguori
@ 2011-12-19 23:45       ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-19 23:45 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Anthony Liguori, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Avi Kivity

[-- Attachment #1: Type: text/plain, Size: 1029 bytes --]

On 2011-12-19 23:21, Anthony Liguori wrote:
> On 12/15/2011 06:33 AM, Jan Kiszka wrote:
>> To enable migration between accelerated and non-accelerated APIC models,
>> we will need to handle the timer saving and restoring specially and can
>> no longer rely on the automatics of VMSTATE_TIMER. Specifically,
>> accelerated model will not start any QEMUTimer.
>>
>> This patch therefore factors out the generic bits into apic_next_timer
>> and introduces a post-load callback that can be implemented differently
>> by both models.
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> 
> So you basically want the timer to be a dummy field for the in-kernel apic?
> 
> Can you fix this up in a pre-save routine (put QEMUTimer into a state
> where there isn't an event pending)?

It is not a dummy field, it contains the proper state in both cases. We
just need to convert it to an open-coded state to avoid the QEMUTimer
restoration magic in the in-kernel case (where there must be no QEMUTimer).

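As a rough illustration of the idea (not the patch itself): the deadline is
migrated as a plain integer, and a model-specific post-load hook decides
what to do with it. The TimerState type, the timer_expiry field, the -1
sentinel and the host_timer_armed flag are assumptions made up for this
sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct TimerState TimerState;

struct TimerState {
    int64_t timer_expiry;               /* migrated as a plain integer; -1 = none */
    bool host_timer_armed;              /* stands in for a userspace timer object */
    void (*post_load)(TimerState *s);   /* model-specific hook, may be NULL */
};

/* Userspace model: re-arm its own timer from the migrated deadline. */
void user_post_load(TimerState *s)
{
    s->host_timer_armed = (s->timer_expiry != -1);
}

/* In-kernel model: never arm a userspace timer; the kernel owns the timer. */
void kernel_post_load(TimerState *s)
{
    s->host_timer_armed = false;
}

/* Common load path: restore the field, then let the model react. */
void load_timer_state(TimerState *s, int64_t saved_expiry)
{
    assert(s != NULL);
    s->timer_expiry = saved_expiry;
    if (s->post_load) {
        s->post_load(s);
    }
}
```

Both models end up with the same migrated value in timer_expiry; only the
side effect of loading it differs.
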
Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v5 00/16] uq/master: Introduce basic irqchip support
  2011-12-19 22:24     ` Anthony Liguori
@ 2011-12-19 23:49       ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-19 23:49 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Anthony Liguori, Lai Jiangshan, kvm, Michael S. Tsirkin,
	Marcelo Tosatti, qemu-devel, Blue Swirl, Avi Kivity

[-- Attachment #1: Type: text/plain, Size: 656 bytes --]

On 2011-12-19 23:24, Anthony Liguori wrote:
> On 12/19/2011 03:17 PM, Marcelo Tosatti wrote:
>>
>> Anthony,
>>
>> Can you please review & ACK?
>>
>> You could even apply directly but we'll do a kvm-autotest run through
>> uq/master. Still, your review is needed.
> 
> Overall, it looks good except for the backend/frontend split.  This
> should be done in terms of qdev inheritance.

I cannot follow your idea here yet. There is no inheritance as we end up
with only a single class that permutes (selects a different backend) on
creation. I'm not sure how to model two classes that will still only
mean a single qdev registration.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-19 23:32       ` Jan Kiszka
  (?)
@ 2011-12-20  0:28       ` Anthony Liguori
  2011-12-20  0:32         ` Jan Kiszka
  -1 siblings, 1 reply; 99+ messages in thread
From: Anthony Liguori @ 2011-12-20  0:28 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Anthony Liguori, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Avi Kivity

On 12/19/2011 05:32 PM, Jan Kiszka wrote:
>> struct APICCommonInfo {
>>      DeviceInfo qdev;
>>      void (*init)(APICState *s);
>>      void (*set_base)(APICState *s, uint64_t val);
>>      void (*set_tpr)(APICState *s, uint8_t val);
>>      void (*external_nmi)(APICState *s);
>> };
>>
>> Take a look at SCSIDevice for an example of this in practice.  This is
>> nicer because as we move save/load into devices methods, it becomes
>> natural to define the state and save/load function in the base class.
>> Provided it only uses base class state, it lets save/load be compatible
>> between both in-kernel and in-qemu device model.
>
> The difference is (unless I completely miss your point) that a common
> SCSI base class is used by different derived classes.

The 'frontend' is the common code and the 'backend' is the set of bits that
differ, no?

We ultimately want there to be two devices that share all of the 'frontend' code 
by providing different 'backend' implementations.

So make the 'frontend' a base class that provides a set of abstract virtual 
methods (the set you have as the 'backend' interface).  Each device instance 
then inherits from the base class and provides its own implementation of the 
virtual methods.

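In plain C, the base-class arrangement described above could look roughly
like the following. It is only a sketch under stated assumptions: the
*Sketch type names, the "apic" and "kvm-apic" device names and the
kernel_updates counter are invented for illustration and are not the qdev
API.

```c
#include <assert.h>
#include <stdint.h>

typedef struct APICCommonSketch APICCommonSketch;

/* "Virtual method table": the former backend callbacks. */
typedef struct APICOpsSketch {
    const char *name;
    void (*set_tpr)(APICCommonSketch *s, uint8_t val);
} APICOpsSketch;

/* Base class: shared state plus a pointer to the subclass methods. */
struct APICCommonSketch {
    const APICOpsSketch *ops;
    uint8_t tpr;
};

/* Userspace device: embeds the base as its first member. */
typedef struct { APICCommonSketch common; } UserAPICSketch;

/* In-kernel device: also counts how often state was pushed to the kernel. */
typedef struct { APICCommonSketch common; int kernel_updates; } KernelAPICSketch;

static void user_set_tpr(APICCommonSketch *s, uint8_t val)
{
    s->tpr = val;
}

static void kernel_set_tpr(APICCommonSketch *s, uint8_t val)
{
    KernelAPICSketch *k = (KernelAPICSketch *)s; /* valid: base is first member */
    s->tpr = val;
    k->kernel_updates++;
}

const APICOpsSketch user_apic_ops = { "apic", user_set_tpr };
const APICOpsSketch kernel_apic_ops = { "kvm-apic", kernel_set_tpr };

/* Common code dispatches through the base class without knowing the subclass. */
void apic_common_set_tpr(APICCommonSketch *s, uint8_t val)
{
    assert(s->ops && s->ops->set_tpr);
    s->ops->set_tpr(s, val);
}
```

Here each device is its own type with its own method table, while the
shared state (tpr) lives in the base struct, which is what would let a
common save/load routine touch only base class fields.
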
> Here we have a
> common frontend class but different base classes, so to say. And we have
> a mechanism to chose where to inherit from on instantiation. Precisely
> this allows to keep the compatibility between in-kernel and user space
> model in this series.

Okay, so I really think this is the problem.  The in-kernel APIC is a separate
device, not a property of the userspace APIC device.

It should be modeled as two separate devices.

Regards,

Anthony Liguori

>
> Jan
>


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 07/16] apic: Open-code timer save/restore
  2011-12-19 23:45       ` [Qemu-devel] " Jan Kiszka
  (?)
@ 2011-12-20  0:31       ` Anthony Liguori
  2011-12-20  0:34           ` [Qemu-devel] " Jan Kiszka
  -1 siblings, 1 reply; 99+ messages in thread
From: Anthony Liguori @ 2011-12-20  0:31 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: kvm, Michael S. Tsirkin, Marcelo Tosatti, qemu-devel, Blue Swirl,
	Avi Kivity

On 12/19/2011 05:45 PM, Jan Kiszka wrote:
> On 2011-12-19 23:21, Anthony Liguori wrote:
>> On 12/15/2011 06:33 AM, Jan Kiszka wrote:
>>> To enable migration between accelerated and non-accelerated APIC models,
>>> we will need to handle the timer saving and restoring specially and can
>>> no longer rely on the automatics of VMSTATE_TIMER. Specifically,
>>> accelerated model will not start any QEMUTimer.
>>>
>>> This patch therefore factors out the generic bits into apic_next_timer
>>> and introduces a post-load callback that can be implemented differently
>>> by both models.
>>>
>>> Signed-off-by: Jan Kiszka<jan.kiszka@siemens.com>
>>
>> So you basically want the timer to be a dummy field for the in-kernel apic?
>>
>> Can you fix this up in a pre-save routine (put QEMUTimer into a state
>> where there isn't an event pending)?
>
> It is not a dummy field, it contains the proper state in both cases. We
> just need to convert it to an open-coded state to avoid the QEMUTimer
> restoration magic in the in-kernel case (where there must be no QEMUTimer).

So the state gets fed into the kernel instead of userspace?

This seems a bit much to me.  Can't we just have two VMStateDescriptions that
happen to look the same, and accept breaking migration between userspace and
in-kernel?

Are we trying to solve a problem no one cares about?

If you want to avoid regressing migration compat in qemu-kvm, give both the same
vmstate name.  They can still be two separate devices, as the vmstate name is not
tied to the qdev name right now.
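
A toy illustration of that point, not QEMU code: ToyDevice, its fields and the
"apic"/"kvm-apic" strings are assumptions made for the example.  Two devices
with different creation names can label their saved state identically:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device descriptor, not the real DeviceInfo. */
typedef struct ToyDevice {
    const char *qdev_name;     /* name used to create the device */
    const char *vmstate_name;  /* label of its section in the saved stream */
} ToyDevice;

/* Two separate devices that share one vmstate name. */
static const ToyDevice toy_devices[] = {
    { "apic",     "apic" },
    { "kvm-apic", "apic" },
};

/* Nonzero if a section written by src carries the label dst expects. */
static int toy_section_matches(const ToyDevice *src, const ToyDevice *dst)
{
    assert(src != NULL && dst != NULL);
    return strcmp(src->vmstate_name, dst->vmstate_name) == 0;
}
```

A matching label is only the first requirement.  The field layout and version
of the two descriptions would have to agree as well.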

Regards,

Anthony Liguori

>
> Jan
>


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-20  0:28       ` Anthony Liguori
@ 2011-12-20  0:32         ` Jan Kiszka
  2011-12-20  0:38           ` Anthony Liguori
  0 siblings, 1 reply; 99+ messages in thread
From: Jan Kiszka @ 2011-12-20  0:32 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Anthony Liguori, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Avi Kivity

On 2011-12-20 01:28, Anthony Liguori wrote:
> On 12/19/2011 05:32 PM, Jan Kiszka wrote:
>>> struct APICCommonInfo {
>>>      DeviceInfo qdev;
>>>      void (*init)(APICState *s);
>>>      void (*set_base)(APICState *s, uint64_t val);
>>>      void (*set_tpr)(APICState *s, uint8_t val);
>>>      void (*external_nmi)(APICState *s);
>>> };
>>>
>>> Take a look at SCSIDevice for an example of this in practice.  This is
>>> nicer because as we move save/load into devices methods, it becomes
>>> natural to define the state and save/load function in the base class.
>>> Provided it only uses base class state, it lets save/load be compatible
>>> between both in-kernel and in-qemu device model.
>>
>> The difference is (unless I completely miss your point) that a common
>> SCSI base class is used by different derived classes.
> 
> The 'frontend' is the common code and the 'backend' are the bits that
> are different, no?
> 
> We ultimately want there to be two devices that share all of the
> 'frontend' code by providing different 'backend' implementations.
> 
> So make the 'frontend' a base class that provides a set of abstract
> virtual methods (the set you have as the 'backend' interface).  Each
> device instance then inherits from the base class and provides its own
> implementation of the virtual methods.
> 
>> Here we have a
>> common frontend class but different base classes, so to say. And we have
>> a mechanism to chose where to inherit from on instantiation. Precisely
>> this allows to keep the compatibility between in-kernel and user space
>> model in this series.
> 
> Okay, so I really think this is the problem.  The in-kernel APIC is a
> separate device, not a property of the userspace APIC device.
> 
> It should be modeled as two separate devices.

That was v1 of my patches. Avi didn't like it, I tried it like this, and
in the end I had to agree. So, no, I don't think we want such a model.

Jan


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/16] uq/master: Introduce basic irqchip support
  2011-12-19 23:49       ` [Qemu-devel] " Jan Kiszka
  (?)
@ 2011-12-20  0:32       ` Anthony Liguori
  2011-12-20  0:37         ` Jan Kiszka
  -1 siblings, 1 reply; 99+ messages in thread
From: Anthony Liguori @ 2011-12-20  0:32 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Lai Jiangshan, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Avi Kivity

On 12/19/2011 05:49 PM, Jan Kiszka wrote:
> On 2011-12-19 23:24, Anthony Liguori wrote:
>> On 12/19/2011 03:17 PM, Marcelo Tosatti wrote:
>>>
>>> Anthony,
>>>
>>> Can you please review&   ACK?
>>>
>>> You could even apply directly but well do a kvm-autotest run through
>>> uq/master. Still, your review is needed.
>>
>> Overall, it looks good except for the backend/frontend split.  This
>> should be done in terms of qdev inheritance.
>
> I cannot follow your idea here yet. There is no inheritance as we end up
> with only a single class that permutes (selects a different backend) on
> creation. I'm not sure how to model two classes that will still only
> mean a single qdev registration.

See other reply in thread.

We should model this as two separate qdev devices.  We can avoid regressing 
migration in qemu-kvm by just having a common vmstate name.

apic is a no-user device so there's no way that changing the name of it in 
qemu-kvm can affect users.

Regards,

Anthony Liguori

>
> Jan
>


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v5 07/16] apic: Open-code timer save/restore
  2011-12-20  0:31       ` Anthony Liguori
@ 2011-12-20  0:34           ` Jan Kiszka
  0 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-20  0:34 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: kvm, Michael S. Tsirkin, Marcelo Tosatti, qemu-devel, Blue Swirl,
	Avi Kivity

On 2011-12-20 01:31, Anthony Liguori wrote:
> On 12/19/2011 05:45 PM, Jan Kiszka wrote:
>> On 2011-12-19 23:21, Anthony Liguori wrote:
>>> On 12/15/2011 06:33 AM, Jan Kiszka wrote:
>>>> To enable migration between accelerated and non-accelerated APIC
>>>> models,
>>>> we will need to handle the timer saving and restoring specially and can
>>>> no longer rely on the automatics of VMSTATE_TIMER. Specifically,
>>>> accelerated model will not start any QEMUTimer.
>>>>
>>>> This patch therefore factors out the generic bits into apic_next_timer
>>>> and introduces a post-load callback that can be implemented differently
>>>> by both models.
>>>>
>>>> Signed-off-by: Jan Kiszka<jan.kiszka@siemens.com>
>>>
>>> So you basically want the timer to be a dummy field for the in-kernel
>>> apic?
>>>
>>> Can you fix this up in a pre-save routine (put QEMUTimer into a state
>>> where there isn't an event pending)?
>>
>> It is not a dummy field, it contains the proper state in both cases. We
>> just need to convert it to an open-coded state to avoid the QEMUTimer
>> restoration magic in the in-kernel case (where there must be no
>> QEMUTimer).
> 
> So the state gets fed into the kernel instead of userspace?

Nope. It's kept for eventual use by a user space model.

> 
> This seems a bit much to me, can't we just have two VMStateDescriptions
> that happen to look the same and break migration between userspace and
> in-kernel?

There is nothing broken, at least according to my tests. Migration works
between both backend variants.
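
A rough sketch of the open-coded approach, under stated assumptions: ToyAPIC,
the -1 "not armed" sentinel and the toy_* helpers are invented for
illustration, and timer_armed merely stands in for a running QEMUTimer.  The
expiry time travels as a plain integer, and only the model that needs a timer
re-arms it after load:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ToyAPIC ToyAPIC;

struct ToyAPIC {
    int64_t timer_expiry;           /* migrated as a plain integer; -1 = idle */
    int timer_armed;                /* stand-in for a running QEMUTimer */
    void (*post_load)(ToyAPIC *s);  /* per-model hook, may be NULL */
};

/* Userspace model: re-arm the timer from the restored expiry time. */
static void toy_user_post_load(ToyAPIC *s)
{
    s->timer_armed = (s->timer_expiry != -1);
}

/* Common load path: restore the integer, then defer to the model. */
static void toy_load(ToyAPIC *s, int64_t saved_expiry)
{
    assert(s != NULL);
    s->timer_expiry = saved_expiry;
    s->timer_armed = 0;
    if (s->post_load != NULL) {
        s->post_load(s);
    }
}
```

With post_load left NULL, as an accelerated model would do in this sketch, the
expiry value is still carried along unchanged, so a later migration to the
user space model can pick it up.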

Jan


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/16] uq/master: Introduce basic irqchip support
  2011-12-20  0:32       ` Anthony Liguori
@ 2011-12-20  0:37         ` Jan Kiszka
  2011-12-20  0:42             ` [Qemu-devel] " Anthony Liguori
  2011-12-20  1:08           ` Anthony Liguori
  0 siblings, 2 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-20  0:37 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Lai Jiangshan, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Avi Kivity

On 2011-12-20 01:32, Anthony Liguori wrote:
> On 12/19/2011 05:49 PM, Jan Kiszka wrote:
>> On 2011-12-19 23:24, Anthony Liguori wrote:
>>> On 12/19/2011 03:17 PM, Marcelo Tosatti wrote:
>>>>
>>>> Anthony,
>>>>
>>>> Can you please review&   ACK?
>>>>
>>>> You could even apply directly but well do a kvm-autotest run through
>>>> uq/master. Still, your review is needed.
>>>
>>> Overall, it looks good except for the backend/frontend split.  This
>>> should be done in terms of qdev inheritance.
>>
>> I cannot follow your idea here yet. There is no inheritance as we end up
>> with only a single class that permutes (selects a different backend) on
>> creation. I'm not sure how to model two classes that will still only
>> mean a single qdev registration.
> 
> See other reply in thread.
> 
> We should model this as two separate qdev devices.  We can avoid
> regressing migration in qemu-kvm by just having a common vmstate name.
> 
> apic is a no-user device so there's no way that changing the name of it
> in qemu-kvm can affect users.

Look down http://thread.gmane.org/gmane.comp.emulators.kvm.devel/82598
for the discussion of that model.

Jan


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-20  0:32         ` Jan Kiszka
@ 2011-12-20  0:38           ` Anthony Liguori
  2011-12-20  9:56               ` Avi Kivity
  0 siblings, 1 reply; 99+ messages in thread
From: Anthony Liguori @ 2011-12-20  0:38 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Anthony Liguori, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Avi Kivity

On 12/19/2011 06:32 PM, Jan Kiszka wrote:
> On 2011-12-20 01:28, Anthony Liguori wrote:
>> On 12/19/2011 05:32 PM, Jan Kiszka wrote:
>>>> struct APICCommonInfo {
>>>>       DeviceInfo qdev;
>>>>       void (*init)(APICState *s);
>>>>       void (*set_base)(APICState *s, uint64_t val);
>>>>       void (*set_tpr)(APICState *s, uint8_t val);
>>>>       void (*external_nmi)(APICState *s);
>>>> };
>>>>
>>>> Take a look at SCSIDevice for an example of this in practice.  This is
>>>> nicer because as we move save/load into devices methods, it becomes
>>>> natural to define the state and save/load function in the base class.
>>>> Provided it only uses base class state, it lets save/load be compatible
>>>> between both in-kernel and in-qemu device model.
>>>
>>> The difference is (unless I completely miss your point) that a common
>>> SCSI base class is used by different derived classes.
>>
>> The 'frontend' is the common code and the 'backend' are the bits that
>> are different, no?
>>
>> We ultimately want there to be two devices that share all of the
>> 'frontend' code by providing different 'backend' implementations.
>>
>> So make the 'frontend' a base class that provides a set of abstract
>> virtual methods (the set you have as the 'backend' interface).  Each
>> device instance then inherits from the base class and provides its own
>> implementation of the virtual methods.
>>
>>> Here we have a
>>> common frontend class but different base classes, so to say. And we have
>>> a mechanism to chose where to inherit from on instantiation. Precisely
>>> this allows to keep the compatibility between in-kernel and user space
>>> model in this series.
>>
>> Okay, so I really think this is the problem.  The in-kernel APIC is a
>> separate device, not a property of the userspace APIC device.
>>
>> It should be modeled as two separate devices.
>
> That was v1 of my patches. Avi didn't like it, I tried it like this, and
> in the end I had to agree. So, no, I don't think we want such a model.

Yes, we do :-)

The in-kernel APIC is a different implementation of the APIC device.  It's not 
an "accelerator" for the userspace APIC.

All that you're doing here is reinventing qdev.  You're defining your own type
system (APICBackend), creating a new registration system for it, and then defining
your own factory function for creating it (through a qdev property).

I'm struggling to understand the reason to avoid using the infrastructure we 
already have to do all of this.

Regards,

Anthony Liguori

>
> Jan
>


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v5 00/16] uq/master: Introduce basic irqchip support
  2011-12-20  0:37         ` Jan Kiszka
@ 2011-12-20  0:42             ` Anthony Liguori
  2011-12-20  1:08           ` Anthony Liguori
  1 sibling, 0 replies; 99+ messages in thread
From: Anthony Liguori @ 2011-12-20  0:42 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Lai Jiangshan, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Avi Kivity

On 12/19/2011 06:37 PM, Jan Kiszka wrote:
> On 2011-12-20 01:32, Anthony Liguori wrote:
>> On 12/19/2011 05:49 PM, Jan Kiszka wrote:
>>> On 2011-12-19 23:24, Anthony Liguori wrote:
>>>> On 12/19/2011 03:17 PM, Marcelo Tosatti wrote:
>>>>>
>>>>> Anthony,
>>>>>
>>>>> Can you please review&    ACK?
>>>>>
>>>>> You could even apply directly but well do a kvm-autotest run through
>>>>> uq/master. Still, your review is needed.
>>>>
>>>> Overall, it looks good except for the backend/frontend split.  This
>>>> should be done in terms of qdev inheritance.
>>>
>>> I cannot follow your idea here yet. There is no inheritance as we end up
>>> with only a single class that permutes (selects a different backend) on
>>> creation. I'm not sure how to model two classes that will still only
>>> mean a single qdev registration.
>>
>> See other reply in thread.
>>
>> We should model this as two separate qdev devices.  We can avoid
>> regressing migration in qemu-kvm by just having a common vmstate name.
>>
>> apic is a no-user device so there's no way that changing the name of it
>> in qemu-kvm can affect users.
>
> Look down http://thread.gmane.org/gmane.comp.emulators.kvm.devel/82598
> for the discussion of that model.

I have.  I don't understand the rationale for jumping through hoops here.

There seems to be an assertion that migrating from in-kernel APIC to userspace 
APIC is an important use case.  I don't really see how that's true.

But nonetheless, the direction migration is heading is not just to migrate the 
QOM path names to identify devices, but to provide a way to introspect the 
device model, transfer the current device model description to the other end, 
and create the device model on the destination.

This is the only way to reliably support things like hot-plug during live
migration, which is something we punt to management tools (which really can't
implement it properly).

So we'll already be migrating the apic backend property which means that you are 
not going to have migration to and from in-kernel APIC and userspace APIC 
without some sort of in-between translation layer (which could just as easily 
change the device names).

Regards,

Anthony Liguori

>
> Jan
>

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v5 07/16] apic: Open-code timer save/restore
  2011-12-20  0:34           ` [Qemu-devel] " Jan Kiszka
@ 2011-12-20  0:53             ` Anthony Liguori
  -1 siblings, 0 replies; 99+ messages in thread
From: Anthony Liguori @ 2011-12-20  0:53 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: kvm, Michael S. Tsirkin, Marcelo Tosatti, qemu-devel, Blue Swirl,
	Avi Kivity

On 12/19/2011 06:34 PM, Jan Kiszka wrote:
> On 2011-12-20 01:31, Anthony Liguori wrote:
>> On 12/19/2011 05:45 PM, Jan Kiszka wrote:
>>> On 2011-12-19 23:21, Anthony Liguori wrote:
>>>> On 12/15/2011 06:33 AM, Jan Kiszka wrote:
>>>>> To enable migration between accelerated and non-accelerated APIC
>>>>> models,
>>>>> we will need to handle the timer saving and restoring specially and can
>>>>> no longer rely on the automatics of VMSTATE_TIMER. Specifically,
>>>>> accelerated model will not start any QEMUTimer.
>>>>>
>>>>> This patch therefore factors out the generic bits into apic_next_timer
>>>>> and introduces a post-load callback that can be implemented differently
>>>>> by both models.
>>>>>
>>>>> Signed-off-by: Jan Kiszka<jan.kiszka@siemens.com>
>>>>
>>>> So you basically want the timer to be a dummy field for the in-kernel
>>>> apic?
>>>>
>>>> Can you fix this up in a pre-save routine (put QEMUTimer into a state
>>>> where there isn't an event pending)?
>>>
>>> It is not a dummy field, it contains the proper state in both cases. We
>>> just need to convert it to an open-coded state to avoid the QEMUTimer
>>> restoration magic in the in-kernel case (where there must be no
>>> QEMUTimer).
>>
>> So the state gets fed into the kernel instead of userspace?
>
> Nope. It's kept for eventual use by a user space model.

I think you misunderstood my comments.

When you are using the in-kernel APIC, there is no implementation for the
post_load hook.  As far as I can tell, the state isn't used.

I know it's used by the user space model, but from what I can tell, the value is
essentially synced with the in-kernel APIC almost immediately, as it happens
during KVM_RUN.

So it's a QEMUTimer in the userspace model, but it's just an integer when used 
in the in-kernel APIC as the timer never fires.  It is just saved/restored from 
and to the kernel.

Is this correct?

Regards,

Anthony Liguori

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/16] uq/master: Introduce basic irqchip support
  2011-12-20  0:37         ` Jan Kiszka
  2011-12-20  0:42             ` [Qemu-devel] " Anthony Liguori
@ 2011-12-20  1:08           ` Anthony Liguori
  2011-12-20  1:19               ` [Qemu-devel] " Jan Kiszka
  1 sibling, 1 reply; 99+ messages in thread
From: Anthony Liguori @ 2011-12-20  1:08 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Lai Jiangshan, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Avi Kivity

On 12/19/2011 06:37 PM, Jan Kiszka wrote:
> On 2011-12-20 01:32, Anthony Liguori wrote:
>> On 12/19/2011 05:49 PM, Jan Kiszka wrote:
>>> On 2011-12-19 23:24, Anthony Liguori wrote:
>>>> On 12/19/2011 03:17 PM, Marcelo Tosatti wrote:
>>>>>
>>>>> Anthony,
>>>>>
>>>>> Can you please review&    ACK?
>>>>>
>>>>> You could even apply directly but well do a kvm-autotest run through
>>>>> uq/master. Still, your review is needed.
>>>>
>>>> Overall, it looks good except for the backend/frontend split.  This
>>>> should be done in terms of qdev inheritance.
>>>
>>> I cannot follow your idea here yet. There is no inheritance as we end up
>>> with only a single class that permutes (selects a different backend) on
>>> creation. I'm not sure how to model two classes that will still only
>>> mean a single qdev registration.
>>
>> See other reply in thread.
>>
>> We should model this as two separate qdev devices.  We can avoid
>> regressing migration in qemu-kvm by just having a common vmstate name.
>>
>> apic is a no-user device so there's no way that changing the name of it
>> in qemu-kvm can affect users.
>
> Look down http://thread.gmane.org/gmane.comp.emulators.kvm.devel/82598
> for the discussion of that model.

Let me say that I know this is the last bit of qemu-kvm that needs merging and 
that this has been an epic effort.  I wouldn't refuse to merge a pull request 
that came in with this in its current form.

If we merged this now, I would be submitting patches in the not too distant 
future to remove all of this backend stuff in favor of proper modeling 
(including using two separate devices).

There's a lot of inconsistency in qdev already today, so adding a little more
isn't the end of the world.  We're going to need to have this debate soon anyway,
so it's up to you whether you want to just get this merged now and worry about
this another day, or resolve this before merge.

I don't see any compatibility issues here so I'm not really concerned about 
introducing a regression by breaking it into two devices.

Regards,

Anthony Liguori

>
> Jan
>


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v5 00/16] uq/master: Introduce basic irqchip support
  2011-12-20  1:08           ` Anthony Liguori
@ 2011-12-20  1:19               ` Jan Kiszka
  0 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-20  1:19 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Lai Jiangshan, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Avi Kivity

On 2011-12-20 02:08, Anthony Liguori wrote:
> On 12/19/2011 06:37 PM, Jan Kiszka wrote:
>> On 2011-12-20 01:32, Anthony Liguori wrote:
>>> On 12/19/2011 05:49 PM, Jan Kiszka wrote:
>>>> On 2011-12-19 23:24, Anthony Liguori wrote:
>>>>> On 12/19/2011 03:17 PM, Marcelo Tosatti wrote:
>>>>>>
>>>>>> Anthony,
>>>>>>
>>>>>> Can you please review&    ACK?
>>>>>>
>>>>>> You could even apply directly but well do a kvm-autotest run through
>>>>>> uq/master. Still, your review is needed.
>>>>>
>>>>> Overall, it looks good except for the backend/frontend split.  This
>>>>> should be done in terms of qdev inheritance.
>>>>
>>>> I cannot follow your idea here yet. There is no inheritance as we
>>>> end up
>>>> with only a single class that permutes (selects a different backend) on
>>>> creation. I'm not sure how to model two classes that will still only
>>>> mean a single qdev registration.
>>>
>>> See other reply in thread.
>>>
>>> We should model this as two separate qdev devices.  We can avoid
>>> regressing migration in qemu-kvm by just having a common vmstate name.
>>>
>>> apic is a no-user device so there's no way that changing the name of it
>>> in qemu-kvm can affect users.
>>
>> Look down http://thread.gmane.org/gmane.comp.emulators.kvm.devel/82598
>> for the discussion of that model.
> 
> Let me say that I know this is the last bit of qemu-kvm that needs
> merging and that this has been an epic effort.  I wouldn't refuse to
> merge a pull request that came in with this in its current form.
> 
> If we merged this now, I would be submitting patches in the not too
> distant future to remove all of this backend stuff in favor of proper
> modeling (including using two separate devices).
> 
> There's lot of inconsistency in qdev already today so adding a little
> more isn't the end of the world.  We're going to need to eventually have
> this debate soon so it's up to you whether you want to just get this
> merged now and worry about this another day or resolve this before merge.
> 
> I don't see any compatibility issues here so I'm not really concerned
> about introducing a regression by breaking it into two devices.

I don't want to see yet another attempt merged that requires foreseeable
refactoring later on. The point of this one is to do it in a way that
provides a sound foundation for all those other features that are still
waiting in qemu-kvm for refactoring.

The point is that migration support between in-kernel on/off is a
worthwhile feature we should design for. That means either skipping the
backend property on device tree migration (maybe a feature we want in
other use cases as well) or providing an alias naming scheme where you can
address APICs, IOAPICs, i8259, i8254 and all the chips that non-x86 will
bring us without knowing where they are implemented and without worrying
about migrating between those variants. If you have a good model for that
in mind, rolling back to v1 and rebasing the improvements from v5 over it
would not be a big deal. But everyone in this round should agree on this
first. I don't want to port back and forth, nor refactor all this again
once it's in.
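
To make the alias idea concrete, here is a small sketch. It is not a proposal
for the actual interface: ToyAlias, toy_canonical_name and the "kvm-*" names
are hypothetical. An implementation-specific device name is resolved to the
name the rest of the system addresses:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical alias entry: implementation-specific name -> common name. */
typedef struct ToyAlias {
    const char *impl_name;
    const char *canonical;
} ToyAlias;

static const ToyAlias toy_aliases[] = {
    { "kvm-apic",   "apic"   },
    { "kvm-ioapic", "ioapic" },
    { "kvm-i8259",  "i8259"  },
};

/* Resolve a device name; names without an alias map to themselves. */
static const char *toy_canonical_name(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(toy_aliases) / sizeof(toy_aliases[0]); i++) {
        if (strcmp(name, toy_aliases[i].impl_name) == 0) {
            return toy_aliases[i].canonical;
        }
    }
    return name;
}
```

Addressing and migration would then use only the resolved name, regardless of
which variant sits behind it.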

Jan


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v5 07/16] apic: Open-code timer save/restore
  2011-12-20  0:53             ` [Qemu-devel] " Anthony Liguori
@ 2011-12-20  1:24               ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-20  1:24 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: kvm, Michael S. Tsirkin, Marcelo Tosatti, qemu-devel, Blue Swirl,
	Avi Kivity

[-- Attachment #1: Type: text/plain, Size: 2392 bytes --]

On 2011-12-20 01:53, Anthony Liguori wrote:
> On 12/19/2011 06:34 PM, Jan Kiszka wrote:
>> On 2011-12-20 01:31, Anthony Liguori wrote:
>>> On 12/19/2011 05:45 PM, Jan Kiszka wrote:
>>>> On 2011-12-19 23:21, Anthony Liguori wrote:
>>>>> On 12/15/2011 06:33 AM, Jan Kiszka wrote:
>>>>>> To enable migration between accelerated and non-accelerated APIC
>>>>>> models,
>>>>>> we will need to handle the timer saving and restoring specially
>>>>>> and can
>>>>>> no longer rely on the automatics of VMSTATE_TIMER. Specifically,
>>>>>> accelerated model will not start any QEMUTimer.
>>>>>>
>>>>>> This patch therefore factors out the generic bits into
>>>>>> apic_next_timer
>>>>>> and introduces a post-load callback that can be implemented
>>>>>> differently
>>>>>> by both models.
>>>>>>
>>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>
>>>>> So you basically want the timer to be a dummy field for the in-kernel
>>>>> apic?
>>>>>
>>>>> Can you fix this up in a pre-save routine (put QEMUTimer into a state
>>>>> where there isn't an event pending)?
>>>>
>>>> It is not a dummy field, it contains the proper state in both cases. We
>>>> just need to convert it to an open-coded state to avoid the QEMUTimer
>>>> restoration magic in the in-kernel case (where there must be no
>>>> QEMUTimer).
>>>
>>> So the state gets fed into the kernel instead of userspace?
>>
>> Nope. It's kept for eventual use by a user space model.
> 
> I think you misunderstood my comments.
> 
> When you are using the in-kernel APIC, there is no implementation for the
> post_load hook.  As far as I can tell, the state isn't used.

Correct, it's just kept up to date.

> 
> I know it's used by the user space model but from what I can tell, the
> value is essentially in sync with the in-kernel APIC almost immediately as
> it happens during KVM_RUN.
> 
> So it's a QEMUTimer in the userspace model, but it's just an integer
> when used in the in-kernel APIC as the timer never fires.  It is just
> saved/restored from and to the kernel.
> 
> Is this correct?

Almost. timer_expiry is calculated on get_apic_state based on the APIC
registers. And it is initialized on reset. But it is never saved into
the kernel nor does it otherwise affect the in-kernel model. It is
really just a compatibility field for potential user space apic usage.
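
A minimal sketch of that idea, assuming a reduced timer state: the expiry is derived from register values alone, so nothing has to be armed when the in-kernel model is active. This is not the code from the patch; ApicTimerSketch and apic_timer_sketch_update are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of the APIC timer state (not the QEMU struct). */
typedef struct {
    uint32_t initial_count;           /* initial-count register */
    int      count_shift;             /* log2 of the divide configuration */
    int64_t  initial_count_load_time; /* when initial_count was written */
    bool     masked;                  /* LVT timer entry is masked */
    bool     periodic;                /* LVT timer entry is periodic */
    int64_t  timer_expiry;            /* derived value, -1 if nothing pending */
} ApicTimerSketch;

/*
 * Recompute timer_expiry from register state alone. Nothing is armed here:
 * a user space model could arm its timer from the result, while an in-kernel
 * model would only keep the number around for migration.
 */
static bool apic_timer_sketch_update(ApicTimerSketch *s, int64_t now)
{
    uint64_t period, ticks;

    assert(s != NULL && now >= s->initial_count_load_time);
    s->timer_expiry = -1;
    if (s->masked) {
        return false;
    }

    period = (uint64_t)s->initial_count + 1;
    ticks = (uint64_t)(now - s->initial_count_load_time) >> s->count_shift;
    if (s->periodic) {
        if (s->initial_count == 0) {
            return false;
        }
        /* round up to the next multiple of the period */
        ticks = (ticks / period + 1) * period;
    } else {
        if (ticks >= s->initial_count) {
            return false; /* one-shot timer already expired */
        }
        ticks = period;
    }
    s->timer_expiry = s->initial_count_load_time +
                      (int64_t)(ticks << s->count_shift);
    return true;
}
```

Calling the function after reading the registers back keeps the field current without a QEMUTimer ever being started, which is the property the in-kernel case needs.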

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v5 00/16] uq/master: Introduce basic irqchip support
  2011-12-20  1:19               ` [Qemu-devel] " Jan Kiszka
@ 2011-12-20  1:28                 ` Jan Kiszka
  -1 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-20  1:28 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Lai Jiangshan, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Avi Kivity

[-- Attachment #1: Type: text/plain, Size: 3611 bytes --]

On 2011-12-20 02:19, Jan Kiszka wrote:
> On 2011-12-20 02:08, Anthony Liguori wrote:
>> On 12/19/2011 06:37 PM, Jan Kiszka wrote:
>>> On 2011-12-20 01:32, Anthony Liguori wrote:
>>>> On 12/19/2011 05:49 PM, Jan Kiszka wrote:
>>>>> On 2011-12-19 23:24, Anthony Liguori wrote:
>>>>>> On 12/19/2011 03:17 PM, Marcelo Tosatti wrote:
>>>>>>>
>>>>>>> Anthony,
>>>>>>>
>>>>>>> Can you please review & ACK?
>>>>>>>
>>>>>>> You could even apply directly but we'll do a kvm-autotest run through
>>>>>>> uq/master. Still, your review is needed.
>>>>>>
>>>>>> Overall, it looks good except for the backend/frontend split.  This
>>>>>> should be done in terms of qdev inheritance.
>>>>>
>>>>> I cannot follow your idea here yet. There is no inheritance as we
>>>>> end up
>>>>> with only a single class that permutes (selects a different backend) on
>>>>> creation. I'm not sure how to model two classes that will still only
>>>>> mean a single qdev registration.
>>>>
>>>> See other reply in thread.
>>>>
>>>> We should model this as two separate qdev devices.  We can avoid
>>>> regressing migration in qemu-kvm by just having a common vmstate name.
>>>>
>>>> apic is a no-user device so there's no way that changing the name of it
>>>> in qemu-kvm can affect users.
>>>
>>> Look down http://thread.gmane.org/gmane.comp.emulators.kvm.devel/82598
>>> for the discussion of that model.
>>
>> Let me say that I know this is the last bit of qemu-kvm that needs
>> merging and that this has been an epic effort.  I wouldn't refuse to
>> merge a pull request that came in with this in its current form.
>>
>> If we merged this now, I would be submitting patches in the not too
>> distant future to remove all of this backend stuff in favor of proper
>> modeling (including using two separate devices).
>>
>> There's a lot of inconsistency in qdev already today so adding a little
>> more isn't the end of the world.  We're going to need to eventually have
>> this debate soon so it's up to you whether you want to just get this
>> merged now and worry about this another day or resolve this before merge.
>>
>> I don't see any compatibility issues here so I'm not really concerned
>> about introducing a regression by breaking it into two devices.
> 
> I don't want to see yet another attempt merged that requires foreseeable
> refactoring later on. The point of this one is to do it in a way that is
> providing a sound foundation for all those other features that still
> wait in qemu-kvm for refactoring.
> 
> The point is that migration support between in-kernel on/off is a
> worthwhile feature we should design for.

Forgot to state the why: This allows seamless migration from older,
non-accelerated setups and switching between both models in case one
faces some issues. That not only applies to the APIC but to all those
various in-kernel device models we have and will add in the future.

> That either means skipping the
> backend property on device tree migration (maybe a feature we want in
> other use cases as well) or providing an alias naming scheme where you can
> address APICs, IOAPICs, i8259, i8254 and all the chips that non-x86 will
> bring us without knowing where they are implemented and without worrying
> about migrating between those variants. If you have a good model for that in
> mind, rolling back to v1 and rebasing improvements from v5 over it would
> not be a big deal. But everyone in this round should agree on this
> first. I don't wanna port back and forth nor refactor all this again
> once it's in.
> 
> Jan

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/16] uq/master: Introduce basic irqchip support
  2011-12-20  1:19               ` [Qemu-devel] " Jan Kiszka
  (?)
  (?)
@ 2011-12-20  2:46               ` Anthony Liguori
  2011-12-20  3:10                 ` Anthony Liguori
  2011-12-20 10:03                   ` [Qemu-devel] " Avi Kivity
  -1 siblings, 2 replies; 99+ messages in thread
From: Anthony Liguori @ 2011-12-20  2:46 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Lai Jiangshan, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Avi Kivity

On 12/19/2011 07:19 PM, Jan Kiszka wrote:
> On 2011-12-20 02:08, Anthony Liguori wrote:
>> There's a lot of inconsistency in qdev already today so adding a little
>> more isn't the end of the world.  We're going to need to eventually have
>> this debate soon so it's up to you whether you want to just get this
>> merged now and worry about this another day or resolve this before merge.
>>
>> I don't see any compatibility issues here so I'm not really concerned
>> about introducing a regression by breaking it into two devices.
>
> I don't want to see yet another attempt merged that requires foreseeable
> refactoring later on. The point of this one is to do it in a way that is
> providing a sound foundation for all those other features that still
> wait in qemu-kvm for refactoring.

Excellent, that was what I was hoping for :-)

> The point is that migration support between in-kernel on/off is a
> worthwhile feature we should design for.

I'm not convinced of that but for the sake of this discussion, let's assume it is.

I would hope that you would agree that when designing the device model, we 
should aim to do what makes sense independent of migration.  If we cannot 
achieve a certain feature with migration given the logical modeling of devices, 
it probably suggests that we need to improve our migration infrastructure.

I assume that given the above, we all agree that separate devices is what makes 
the most sense ignoring migration.  If so, let's just focus on how to make 
migration work.

> That either means skipping the
> backend property on device tree migration (maybe a feature we want in
> other use cases as well) or providing an alias naming scheme where you can
> address APICs, IOAPICs, i8259, i8254 and all the chips that non-x86 will
> bring us without knowing where they are implemented and without worrying
> about migrating between those variants. If you have a good model for that in
> mind, rolling back to v1 and rebasing improvements from v5 over it would
> not be a big deal. But everyone in this round should agree on this
> first. I don't wanna port back and forth nor refactor all this again
> once it's in.

Here's how we solve this problem:

1) In the short term, advertise both devices as having the same VMstate name. 
Since we don't register until the device is instantiated, this will Just Work 
and is easy.

2) In the not so short term, we'll have Mike Roth's Visitor series land in the 
tree (Juan promised me it will be in his next pull request).

3) Once we have the Visitor infrastructure in place, we can introduce a self 
describing migration format (that will also use QOM path names).  With a self 
describing format, we can read all of the data from the wire into memory without 
consulting devices.

4) We now have the ability to arbitrarily manipulate this tree in memory.  It's
just a matter of writing a small tree transformer that converts the KVM-APIC
state to the APIC device state (by just renaming a level of the tree).  Heck, we
could even map fields if we needed to (although we should probably avoid
divergence if at all possible).

5) We can now hand this manipulated tree to an input Visitor and the devices 
will read it in as if it came from the same device.

This is the level of flexibility we need to support migration compatibility 
moving forward.  We're actually not that far from it either.  We'll definitely 
have it in place before we have a new migration protocol.
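
Step 1 above can be sketched as follows, assuming a toy registry in place of the real VMState machinery. DeviceTypeSketch, instantiate and find_section are hypothetical names; the only point carried over from the text is that both device types share one section name and that registration happens at instantiation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SECTIONS 16

/* Hypothetical device type: a qdev-style name plus a migration section name. */
typedef struct {
    const char *type_name;    /* what the machine code instantiates */
    const char *section_name; /* what appears in the migration stream */
} DeviceTypeSketch;

/* Two separate devices that deliberately share one section name. */
static const DeviceTypeSketch user_apic = { "apic",     "apic" };
static const DeviceTypeSketch kvm_apic  = { "kvm-apic", "apic" };

typedef struct {
    const DeviceTypeSketch *type;
    int instance_id;
} SectionSketch;

static SectionSketch sections[MAX_SECTIONS];
static size_t nb_sections;

/* Sections are registered at instantiation, not at type definition. */
static int instantiate(const DeviceTypeSketch *type, int instance_id)
{
    assert(type != NULL);
    if (nb_sections == MAX_SECTIONS) {
        return -1;
    }
    sections[nb_sections].type = type;
    sections[nb_sections].instance_id = instance_id;
    nb_sections++;
    return 0;
}

/* Incoming side: look up whichever instantiated device owns the section. */
static const DeviceTypeSketch *find_section(const char *name, int instance_id)
{
    for (size_t i = 0; i < nb_sections; i++) {
        if (sections[i].instance_id == instance_id &&
            strcmp(sections[i].type->section_name, name) == 0) {
            return sections[i].type;
        }
    }
    return NULL;
}
```

Because only one of the two types is ever instantiated for a given slot, a stream written by one of them is picked up by the other without either side knowing about the difference.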

Regards,

Anthony Liguori

>
> Jan
>


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/16] uq/master: Introduce basic irqchip support
  2011-12-20  2:46               ` Anthony Liguori
@ 2011-12-20  3:10                 ` Anthony Liguori
  2011-12-20  8:34                     ` [Qemu-devel] " Jan Kiszka
  2011-12-20 10:03                   ` [Qemu-devel] " Avi Kivity
  1 sibling, 1 reply; 99+ messages in thread
From: Anthony Liguori @ 2011-12-20  3:10 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Lai Jiangshan, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Avi Kivity

On 12/19/2011 08:46 PM, Anthony Liguori wrote:
> On 12/19/2011 07:19 PM, Jan Kiszka wrote:
>> On 2011-12-20 02:08, Anthony Liguori wrote:
> Here's how we solve this problem:
>
> 1) In the short term, advertise both devices as having the same VMstate name.
> Since we don't register until the device is instantiated, this will Just Work
> and is easy.
>
> 2) In the not so short term, we'll have Mike Roth's Visitor series land in the
> tree (Juan promised me it will be in his next pull request).
>
> 3) Once we have the Visitor infrastructure in place, we can introduce a self
> describing migration format (that will also use QOM path names). With a self
> describing format, we can read all of the data from the wire into memory without
> consulting devices.
>
> 4) We now have the ability to arbitrarily manipulate this tree in memory. It's
> just a matter of writing a small tree transformer that converts the KVM-APIC
> state to the APIC device state (by just renaming a level of the tree). Heck, we
> could even map fields if we needed to (although we should probably avoid
> divergence if at all possible).

The way this would work is that something would register a migration "filter" when a 
userspace APIC was instantiated.  Maybe that's the device itself or maybe it's 
some centralized logic.  At any rate, since we have a self-describing format 
(and maybe it's just JSON), we can build a QObject.

The filters would get called with the QObject before it was decoded and 
dispatched to devices.  It would look something like:

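static QDict *kvm_apic_to_userspace_apic(QDict *state, void *opaque)
{
    /* strcmp() returns 0 on a match, so compare against 0 explicitly */
    if (strcmp(qdict_get_str(state, "__type__"), "kvm-apic") == 0) {
       QDict *userspace_apic = qdict_new();
       const char *key;

       /* copy every field over, taking a reference on each value */
       qdict_foreach_key(&key, state) {
           QObject *value = qdict_get(state, key);

           qobject_incref(value);
           qdict_put_obj(userspace_apic, key, value);
       }
       /* then relabel the copy as the userspace device */
       qdict_put_str(userspace_apic, "__type__", "apic");
       return userspace_apic;
    } else {
       /* not a kvm-apic section: pass it through unchanged */
       qobject_incref(state);
       return state;
    }
}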

The same sort of filter function could also handle migration compatibility 
between virtio-blk-pci and a pair of virtio-blk/virtio-pci devices.  It would 
simply match on the __type__ of "virtio-blk-pci", and then split apart the state 
into an appropriate "virtio-pci" dictionary and a "virtio-blk" dictionary.

This is just pseudo-code mind you.  We'll need to think carefully about how we 
recurse and apply these filters.  But it will be an extremely powerful mechanism 
that will let us solve most of these compatibility problems in an elegant way.
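
To make the "apply these filters" part concrete, here is a self-contained sketch that runs a list of filters over decoded sections before dispatch. It assumes a toy StateNode type rather than QDict, and it does not recurse into nested state; StateNode, MigrationFilter and apply_filters are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for one decoded device section; this is not QEMU's QDict. */
typedef struct {
    const char *type;  /* plays the role of the "__type__" key */
    int n_fields;      /* payload placeholder, untouched by a pure rename */
} StateNode;

/* A filter rewrites one node in place if it recognizes it. */
typedef void (*MigrationFilter)(StateNode *node);

static void kvm_apic_to_userspace_apic(StateNode *node)
{
    if (strcmp(node->type, "kvm-apic") == 0) {
        node->type = "apic";
    }
}

/* Run every registered filter over every node before dispatching to devices. */
static void apply_filters(StateNode *nodes, size_t n_nodes,
                          const MigrationFilter *filters, size_t n_filters)
{
    assert(nodes != NULL && filters != NULL);
    for (size_t i = 0; i < n_nodes; i++) {
        for (size_t f = 0; f < n_filters; f++) {
            filters[f](&nodes[i]);
        }
    }
}
```

Sections that no filter recognizes pass through untouched, so filters can be added per device pair without affecting the rest of the stream.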

Regards,

Anthony Liguori

>
> 5) We can now hand this manipulated tree to an input Visitor and the devices
> will read it in as if it came from the same device.
>
> This is the level of flexibility we need to support migration compatibility
> moving forward. We're actually not that far from it either. We'll definitely
> have it in place before we have a new migration protocol.
>
> Regards,
>
> Anthony Liguori
>
>>
>> Jan
>>
>


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v5 00/16] uq/master: Introduce basic irqchip support
  2011-12-20  3:10                 ` Anthony Liguori
@ 2011-12-20  8:34                     ` Jan Kiszka
  0 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-20  8:34 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Lai Jiangshan, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Avi Kivity

[-- Attachment #1: Type: text/plain, Size: 3432 bytes --]

On 2011-12-20 04:10, Anthony Liguori wrote:
> On 12/19/2011 08:46 PM, Anthony Liguori wrote:
>> On 12/19/2011 07:19 PM, Jan Kiszka wrote:
>>> On 2011-12-20 02:08, Anthony Liguori wrote:
>> Here's how we solve this problem:
>>
>> 1) In the short term, advertise both devices as having the same
>> VMstate name.
>> Since we don't register until the device is instantiated, this will
>> Just Work
>> and is easy.
>>
>> 2) In the not so short term, we'll have Mike Roth's Visitor series
>> land in the
>> tree (Juan promised me it will be in his next pull request).
>>
>> 3) Once we have the Visitor infrastructure in place, we can introduce
>> a self
>> describing migration format (that will also use QOM path names). With
>> a self
>> describing format, we can read all of the data from the wire into
>> memory without
>> consulting devices.
>>
>> 4) We now have the ability to arbitrarily manipulate this tree in
>> memory. It's
>> just a matter of writing a small tree transformer that converts the
>> KVM-APIC
>> state to the APIC device state (by just renaming a level of the tree).
>> Heck, we
>> could even map fields if we needed to (although we should probably avoid
>> divergence if at all possible).
> 
> The way this would work is that something would register a migration "filter"
> when a userspace APIC was instantiated.  Maybe that's the device itself
> or maybe it's some centralized logic.  At any rate, since we have a
> self-describing format (and maybe it's just JSON), we can build a QObject.
> 
> The filters would get called with the QObject before it was decoded and
> dispatched to devices.  It would look something like:
> 
> static QDict *kvm_apic_to_userspace_apic(QDict *state, void *opaque)
> {
>    if (strcmp(qdict_get_str(state, "__type__"), "kvm-apic") == 0) {
>       QDict *userspace_apic = qdict_new();
>       const char *key;
> 
>       qdict_foreach_key(&key, state) {
>           QObject *value = qdict_get(state, key);
> 
>           qobject_incref(value);
>           qdict_put_obj(userspace_apic, key, value);
>       }
>       qdict_put_str(userspace_apic, "__type__", "apic");
>       return userspace_apic;
>    } else {
>       qobject_incref(state);
>       return state;
>    }
> }
> 
> The same sort of filter function could also handle migration
> compatibility between virtio-blk-pci and a pair of virtio-blk/virtio-pci
> devices.  It would simply match on the __type__ of "virtio-blk-pci", and
> then split apart the state into an appropriate "virtio-pci" dictionary
> and a "virtio-blk" dictionary.
> 
> This is just pseudo-code mind you.  We'll need to think carefully about
> how we recurse and apply these filters.  But it will be an extremely
> powerful mechanism that will let us solve most of these compatibility
> problems in an elegant way.

Another approach, which also solves an issue the above does not, goes like
this:

Use some device alias as the name for saving, and also accept this for
addressing the device in a running VM. The latter would allow for
/path/to/the/ioapic to always point you to the currently used IOAPIC
version, no matter if it is actually kvm-ioapic or [qemu-]ioapic. This
feature was requested by Avi back then. It doesn't map to existing
features directly, though.
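
A rough sketch of such an alias lookup, assuming a static table and a single flag that says whether the in-kernel irqchip is in use. DeviceAlias and resolve_alias are hypothetical; only the kvm-apic and kvm-ioapic names come from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One stable alias, two possible implementations behind it. */
typedef struct {
    const char *alias;
    const char *user_impl;
    const char *kernel_impl;
} DeviceAlias;

static const DeviceAlias aliases[] = {
    { "apic",   "apic",   "kvm-apic"   },
    { "ioapic", "ioapic", "kvm-ioapic" },
};

/* Map an alias to the implementation that is currently in use. */
static const char *resolve_alias(const char *alias, bool kernel_irqchip)
{
    assert(alias != NULL);
    for (size_t i = 0; i < sizeof(aliases) / sizeof(aliases[0]); i++) {
        if (strcmp(aliases[i].alias, alias) == 0) {
            return kernel_irqchip ? aliases[i].kernel_impl
                                  : aliases[i].user_impl;
        }
    }
    return NULL; /* unknown alias */
}
```

The same alias would then serve both as the name written into the saved state and as the path a user types, independent of which variant is behind it.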

In any case, I'm not going to touch a line of code until there is
consensus about the way to go.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/16] uq/master: Introduce basic irqchip support
@ 2011-12-20  8:34                     ` Jan Kiszka
  0 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-20  8:34 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Lai Jiangshan, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Avi Kivity

[-- Attachment #1: Type: text/plain, Size: 3432 bytes --]

On 2011-12-20 04:10, Anthony Liguori wrote:
> On 12/19/2011 08:46 PM, Anthony Liguori wrote:
>> On 12/19/2011 07:19 PM, Jan Kiszka wrote:
>>> On 2011-12-20 02:08, Anthony Liguori wrote:
>> Here's how we solve this problem:
>>
>> 1) In the short term, advertise both devices as having the same
>> VMstate name.
>> Since we don't register until the device is instantiated, this will
>> Just Work
>> and is easy.
>>
>> 2) In the not so short term, we'll have Mike Roth's Visitor series
>> land in the
>> tree (Juan promised me it will be in his next pull request).
>>
>> 3) Once we have the Visitor infrastructure in place, we can introduce
>> a self
>> describing migration format (that will also use QOM path names). With
>> a self
>> describing format, we can read all of the data from the wire into
>> memory without
>> consulting devices.
>>
>> 4) We now have the ability to arbitrarily manipulate this tree in
>> memory. It's
>> just a matter or writing a small tree transformer that converts the
>> KVM-APIC
>> state to the APIC device state (by just renaming a level of the tree).
>> Heck, we
>> could even map fields if we needed to (although we should probably avoid
>> divergence if at all possible).
> 
> The way this would is that something would register a migration "filter"
> when a userspace APIC was instantiated.  Maybe that's the device itself
> or maybe it's some centralized logic.  At any rate, since we have a
> self-describing format (and maybe it's just JSON), we can build a QObject.
> 
> The filters would get called with the QObject before it was decoded and
> dispatched to devices.  It would look something like:
> 
> static QDict *kvm_apic_to_userspace_apic(QDict *state, void *opaque)
> {
>    if (strcmp(qdict_get_str(state, "__type__"), "kvm-apic") {
>       QDict *userspace_apic = qdict_new();
>       const char *key;
> 
>       qdict_foreach_key(&key, state) {
>           QObject *value = qdict_get(state, key);
> 
>           qobject_incref(value);
>           qdict_put_obj(userspace_apic, key, value);
>       }
>       qdict_put_str(userspace_apic, "__type__", "apic");
>       return userspace_apic;
>    } else {
>       qobject_incref(state);
>       return state;
>    }
> }
> 
> The same sort of filter function could also handle migration
> compatibility between virtio-blk-pci and a pair of virtio-blk/virtio-pci
> devices.  It would simply match on the __type__ of "virtio-blk-pci", and
> then split apart the state into an appropriate "virtio-pci" dictionary
> and a "virtio-blk" dictionary.
> 
> This is just psuedo-code mind you.  We'll need to think carefully about
> how we recurse and apply these filters.  But it will be an extremely
> powerful mechanism that will let us solve most of these compatibility
> problems in an elegant way.
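
The filter idea quoted above can be sketched outside of QEMU, under stated
assumptions. DeviceRecord, MigrationFilter and apply_filters below are
hypothetical stand-ins (not QEMU APIs) for the decoded QObject, the filter
callback and the dispatch loop. The sketch only shows the renaming step and
leaves the fields untouched:

```c
#include <string.h>

/* Hypothetical stand-in for one decoded, self-describing device record. */
typedef struct {
    const char *type;      /* plays the role of the "__type__" key */
    const char *keys[4];   /* field names */
    long values[4];        /* field values */
    int nfields;
} DeviceRecord;

/* A filter sees each record before it is dispatched to a device. */
typedef void (*MigrationFilter)(DeviceRecord *rec);

/* Rename an in-kernel APIC record so a userspace APIC would accept it. */
static void kvm_apic_to_userspace_apic(DeviceRecord *rec)
{
    if (strcmp(rec->type, "kvm-apic") == 0) {
        rec->type = "apic";   /* same fields, only the type name changes */
    }
}

/* Run all registered filters over a record, in registration order. */
static void apply_filters(DeviceRecord *rec,
                          const MigrationFilter *filters, int n)
{
    for (int i = 0; i < n; i++) {
        filters[i](rec);
    }
}
```

Records of any other type pass through unchanged, which matches the else
branch of the quoted pseudo-code.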

Another approach, which also solves an issue the above does not, goes like
this:

Use some device alias as the name for saving, and also accept it for
addressing the device in a running VM. The latter would allow for
/path/to/the/ioapic to always point you to the currently used IOAPIC
version, no matter if it is actually kvm-ioapic or [qemu-]ioapic. This
feature was requested by Avi back then. It doesn't map to existing
features directly, though.
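
A minimal sketch of the alias idea, assuming a simple lookup table. The
DeviceAlias type and both helper functions are hypothetical and do not exist
in QEMU. The point is only that the alias is the stable name, both for saving
and for addressing, while the implementation behind it may differ:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alias table entry: a stable path and the type behind it. */
typedef struct {
    const char *alias;      /* e.g. "/path/to/the/ioapic" */
    const char *impl_type;  /* e.g. "kvm-ioapic" or "ioapic" */
} DeviceAlias;

/* Return the implementation registered under an alias, or NULL if none. */
static const char *alias_resolve(const DeviceAlias *table, size_t n,
                                 const char *alias)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].alias, alias) == 0) {
            return table[i].impl_type;
        }
    }
    return NULL;
}

/* The name used for saving is the alias, never the implementation type. */
static const char *alias_save_name(const DeviceAlias *entry)
{
    return entry->alias;
}
```

With this shape, a VM using kvm-ioapic and one using the userspace ioapic
write the same section name, so no renaming step is needed on load.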

In any case, I'm not going to touch a line of code until there is
consensus about the way to go.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-20  0:38           ` Anthony Liguori
@ 2011-12-20  9:56               ` Avi Kivity
  0 siblings, 0 replies; 99+ messages in thread
From: Avi Kivity @ 2011-12-20  9:56 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Jan Kiszka, Anthony Liguori, kvm, Michael S. Tsirkin,
	Marcelo Tosatti, qemu-devel, Blue Swirl

On 12/20/2011 02:38 AM, Anthony Liguori wrote:
>> That was v1 of my patches. Avi didn't like it, I tried it like this, and
>> in the end I had to agree. So, no, I don't think we want such a model.
>
>
> Yes, we do :-)
>
> The in-kernel APIC is a different implementation of the APIC device. 
> It's not an "accelerator" for the userspace APIC.

A different implementation but not a different device.  Device == spec.

>
> All that you're doing here is reinventing qdev.  You're defining your
> own type system (APICBackend), creating a new registration system for
> it, and then defining your own factory function for creating it
> (through a qdev property).
>
> I'm struggling to understand the reason to avoid using the
> infrastructure we already have to do all of this.

Not every table of function pointers has to be done through qdev (not
that I feel strongly about this - only that there is just one APIC device).

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/16] uq/master: Introduce basic irqchip support
  2011-12-20  0:42             ` [Qemu-devel] " Anthony Liguori
@ 2011-12-20 10:01               ` Avi Kivity
  -1 siblings, 0 replies; 99+ messages in thread
From: Avi Kivity @ 2011-12-20 10:01 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Jan Kiszka, Lai Jiangshan, kvm, Michael S. Tsirkin,
	Marcelo Tosatti, qemu-devel, Blue Swirl

On 12/20/2011 02:42 AM, Anthony Liguori wrote:
>> Look down http://thread.gmane.org/gmane.comp.emulators.kvm.devel/82598
>> for the discussion of that model.
>
>
> I have.  I don't understand the rationale for jumping through hoops here.
>
> There seems to be an assertion that migrating from in-kernel APIC to
> userspace APIC is an important use case.  I don't really see how
> that's true.
>

That's only because no one is using qemu.git for virtualization.  If
they were, then you'd prevent existing users from using it, except
through guest shutdown and relaunch of qemu (and perhaps reconfiguration).

We've discussed removing the ioapic from the kernel.  If we do that,
then we need to support migration from in-kernel ioapic to userspace ioapic.

> But nonetheless, the direction migration is heading is not just to
> migrate the QOM path names to identify devices, but to provide a way
> to introspect the device model, transfer the current device model
> description to the other end, and create the device model on the
> destination.
>
> This is the only way to reliably support things like hot-plug during
> live migration which is something we punt to management tools (which
> really can't implement it properly).
>
> So we'll already be migrating the apic backend property which means
> that you are not going to have migration to and from in-kernel APIC
> and userspace APIC without some sort of in-between translation layer
> (which could just as easily change the device names).

To what?

The backend property should be private and not migrated.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v5 00/16] uq/master: Introduce basic irqchip support
  2011-12-20  2:46               ` Anthony Liguori
@ 2011-12-20 10:03                   ` Avi Kivity
  2011-12-20 10:03                   ` [Qemu-devel] " Avi Kivity
  1 sibling, 0 replies; 99+ messages in thread
From: Avi Kivity @ 2011-12-20 10:03 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Lai Jiangshan, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Jan Kiszka

On 12/20/2011 04:46 AM, Anthony Liguori wrote:
>
> I would hope that you would agree that when designing the device
> model, we should aim to do what makes sense independent of migration. 
> If we cannot achieve a certain feature with migration given the
> logical modeling of devices, it probably suggests that we need to
> improve our migration infrastructure.
>
> I assume that given the above, we all agree that separate devices is
> what makes the most sense ignoring migration.

I don't agree with this.


-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/16] uq/master: Introduce basic irqchip support
  2011-12-20 10:03                   ` [Qemu-devel] " Avi Kivity
@ 2011-12-20 10:08                     ` Avi Kivity
  -1 siblings, 0 replies; 99+ messages in thread
From: Avi Kivity @ 2011-12-20 10:08 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Jan Kiszka, Lai Jiangshan, kvm, Michael S. Tsirkin,
	Marcelo Tosatti, qemu-devel, Blue Swirl

On 12/20/2011 12:03 PM, Avi Kivity wrote:
> On 12/20/2011 04:46 AM, Anthony Liguori wrote:
> >
> > I would hope that you would agree that when designing the device
> > model, we should aim to do what makes sense independent of migration. 
> > If we cannot achieve a certain feature with migration given the
> > logical modeling of devices, it probably suggests that we need to
> > improve our migration infrastructure.
> >
> > I assume that given the above, we all agree that separate devices is
> > what makes the most sense ignoring migration.
>
> I don't agree with this.

The problem with having two devices is that now you have to identify
the common code, put it somewhere, and use it as necessary.

"apic" and "kvm-apic" both is-a (are-a?) "apic".  This suggests either a
base class (containing the common code) and derived classes, or (like
Jan's implementation), just one class, that defers part of the
implementation to an interface implemented by two other classes.

Two unrelated classes which happen to implement exactly the same
interface (vmstate fields) except one (visible name) and share some code
are a strange solution to this problem.
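
The "one class that defers to an interface" variant can be sketched in plain
C as follows. This is an illustration under assumptions, not the code from
the patches: APICBackendOps, the set_tpr hook and the kernel_syncs counter
are invented for the example.

```c
#include <stdint.h>

typedef struct APICState APICState;

/* Hypothetical backend interface: the part that differs per implementation. */
typedef struct {
    const char *name;
    void (*set_tpr)(APICState *s, uint8_t val);
} APICBackendOps;

/* The single visible device; the shared state lives here. */
struct APICState {
    uint8_t tpr;
    int kernel_syncs;               /* counts simulated pushes to the kernel */
    const APICBackendOps *backend;  /* private, not part of the saved state */
};

static void qemu_set_tpr(APICState *s, uint8_t val)
{
    s->tpr = val;
}

static void kvm_set_tpr(APICState *s, uint8_t val)
{
    s->tpr = val;
    s->kernel_syncs++;              /* stands in for a sync to the kernel */
}

static const APICBackendOps qemu_apic_backend = { "qemu", qemu_set_tpr };
static const APICBackendOps kvm_apic_backend = { "kvm", kvm_set_tpr };

/* Frontend entry point: callers never see which backend is in use. */
static void apic_set_tpr(APICState *s, uint8_t val)
{
    s->backend->set_tpr(s, val);
}
```

Both instances expose the same device and the same state layout; only the
backend pointer differs.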

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-20  9:56               ` Avi Kivity
  (?)
@ 2011-12-20 13:41               ` Anthony Liguori
  2011-12-20 13:51                   ` Paolo Bonzini
  -1 siblings, 1 reply; 99+ messages in thread
From: Anthony Liguori @ 2011-12-20 13:41 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Anthony Liguori, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Jan Kiszka

On 12/20/2011 03:56 AM, Avi Kivity wrote:
> On 12/20/2011 02:38 AM, Anthony Liguori wrote:
>>> That was v1 of my patches. Avi didn't like it, I tried it like this, and
>>> in the end I had to agree. So, no, I don't think we want such a model.
>>
>>
>> Yes, we do :-)
>>
>> The in-kernel APIC is a different implementation of the APIC device.
>> It's not an "accelerator" for the userspace APIC.
>
> A different implementation but not a different device.  Device == spec.

If it was hardware, it'd be a fully compatible clone.  The way we would model 
this is via inheritance.

Regards,

Anthony Liguori

>
>>
>> All that you're doing here is reinventing qdev.  You're defining your
>> own type system (APICBackend), creating a new registration system for
>> it, and then defining your own factory function for creating it
>> (through a qdev property).
>>
>> I'm struggling to understand the reason to avoid using the
>> infrastructure we already have to do all of this.
>
> Not every table of function pointers has to be done through qdev (not
> that I feel strongly about this - only that there is just one APIC device).
>


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/16] uq/master: Introduce basic irqchip support
  2011-12-20 10:08                     ` Avi Kivity
  (?)
@ 2011-12-20 13:45                     ` Anthony Liguori
  -1 siblings, 0 replies; 99+ messages in thread
From: Anthony Liguori @ 2011-12-20 13:45 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Lai Jiangshan, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Jan Kiszka

On 12/20/2011 04:08 AM, Avi Kivity wrote:
> On 12/20/2011 12:03 PM, Avi Kivity wrote:
>> On 12/20/2011 04:46 AM, Anthony Liguori wrote:
>>>
>>> I would hope that you would agree that when designing the device
>>> model, we should aim to do what makes sense independent of migration.
>>> If we cannot achieve a certain feature with migration given the
>>> logical modeling of devices, it probably suggests that we need to
>>> improve our migration infrastructure.
>>>
>>> I assume that given the above, we all agree that separate devices is
>>> what makes the most sense ignoring migration.
>>
>> I don't agree with this.
>
> The problem with having two devices is that now you have to identify
> the common code, put it somewhere, and use it as necessary.
>
> "apic" and "kvm-apic" both is-a (are-a?) "apic".  This suggests either a
> base class (containing the common code) and derived classes, or (like
> Jan's implementation), just one class, that defers part of the
> implementation to an interface implemented by two other classes.

Yes, a base-class is what I'm suggesting since this is what qdev is capable of 
today.

The other approach is to have an APICFrontend that has-a APICBackend, where
UserspaceAPIC is-a APICBackend and KernelAPIC is-a APICBackend.

You still now have three visible devices in the device model.  This is 
essentially what Jan's patches do today.

I think a simple base-class + subclass inheritance scheme makes the most sense here.
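
As a rough sketch of that scheme in plain C (struct embedding, not actual
qdev code), assuming hypothetical names APICCommon, QEMUAPIC and KVMAPIC:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical abstract base: common state plus one virtual method. */
typedef struct APICCommon APICCommon;
struct APICCommon {
    const char *type_name;
    uint8_t tpr;
    void (*set_tpr)(APICCommon *s, uint8_t val);
};

/* Subclasses embed the base as their first member. */
typedef struct { APICCommon parent; } QEMUAPIC;
typedef struct { APICCommon parent; int kernel_syncs; } KVMAPIC;

static void qemu_apic_set_tpr(APICCommon *s, uint8_t val)
{
    s->tpr = val;
}

static void kvm_apic_set_tpr(APICCommon *s, uint8_t val)
{
    KVMAPIC *k = (KVMAPIC *)s;  /* valid because parent is the first member */
    s->tpr = val;
    k->kernel_syncs++;          /* stands in for a sync to the kernel */
}

static void qemu_apic_init(QEMUAPIC *a)
{
    a->parent = (APICCommon){ "apic", 0, qemu_apic_set_tpr };
}

static void kvm_apic_init(KVMAPIC *k)
{
    k->parent = (APICCommon){ "kvm-apic", 0, kvm_apic_set_tpr };
    k->kernel_syncs = 0;
}
```

Here each subclass carries its own visible type name, which is the property
the migration discussion in this thread is about.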

Regards,

Anthony Liguori

>
> Two unrelated classes which happen to implement exactly the same
> interface (vmstate fields) except one (visible name) and share some code
> are a strange solution to this problem.
>


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-20 13:41               ` Anthony Liguori
@ 2011-12-20 13:51                   ` Paolo Bonzini
  0 siblings, 0 replies; 99+ messages in thread
From: Paolo Bonzini @ 2011-12-20 13:51 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Avi Kivity, Anthony Liguori, kvm, Michael S. Tsirkin,
	Marcelo Tosatti, qemu-devel, Blue Swirl, Jan Kiszka

On 12/20/2011 02:41 PM, Anthony Liguori wrote:
> On 12/20/2011 03:56 AM, Avi Kivity wrote:
>> On 12/20/2011 02:38 AM, Anthony Liguori wrote:
>>>> That was v1 of my patches. Avi didn't like it, I tried it like this, and
>>>> in the end I had to agree. So, no, I don't think we want such a model.
>>>
>>>
>>> Yes, we do :-)
>>>
>>> The in-kernel APIC is a different implementation of the APIC device.
>>> It's not an "accelerator" for the userspace APIC.
>>
>> A different implementation but not a different device. Device == spec.
>
> If it was hardware, it'd be a fully compatible clone. The way we would
> model this is via inheritance.

I see your fully compatible clone, and I raise my bridge with a 
different implementation underneath.  It's the same old debate on is-a 
vs has-a.

In QOM parlance Jan implemented this:

     abstract class Object
         abstract class Device
             class APIC: { backend: link<APICBackend> }
         abstract class APICBackend
             class QEMU_APICBackend
             class KVM_APICBackend

and you're proposing this:

     abstract class Object
         abstract class Device
             abstract class APIC
                 class QEMU_APIC
                 class KVM_APIC

Both can be right, both can be wrong.

Paolo

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-20 13:51                   ` Paolo Bonzini
  (?)
@ 2011-12-20 13:54                   ` Anthony Liguori
  2011-12-20 13:57                     ` Paolo Bonzini
  -1 siblings, 1 reply; 99+ messages in thread
From: Anthony Liguori @ 2011-12-20 13:54 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, Michael S. Tsirkin, Marcelo Tosatti, qemu-devel, Blue Swirl,
	Jan Kiszka, Avi Kivity

On 12/20/2011 07:51 AM, Paolo Bonzini wrote:
> On 12/20/2011 02:41 PM, Anthony Liguori wrote:
>> On 12/20/2011 03:56 AM, Avi Kivity wrote:
>>> On 12/20/2011 02:38 AM, Anthony Liguori wrote:
>>>>> That was v1 of my patches. Avi didn't like it, I tried it like this, and
>>>>> in the end I had to agree. So, no, I don't think we want such a model.
>>>>
>>>>
>>>> Yes, we do :-)
>>>>
>>>> The in-kernel APIC is a different implementation of the APIC device.
>>>> It's not an "accelerator" for the userspace APIC.
>>>
>>> A different implementation but not a different device. Device == spec.
>>
>> If it was hardware, it'd be a fully compatible clone. The way we would
>> model this is via inheritance.
>
> I see your fully compatible clone, and I raise my bridge with a different
> implementation underneath. It's the same old debate on is-a vs has-a.
>
> In QOM parlance Jan implemented this:
>
> abstract class Object
>     abstract class Device
>         class APIC: { backend: link<APICBackend> }
>     abstract class APICBackend
>         class QEMU_APICBackend
>         class KVM_APICBackend

I don't fundamentally object to modeling it like this provided that it's modeled 
(and visible) through qdev and not done through a one-off infrastructure.

But yes, you are exactly correct in your observation (and that both can be right).

Regards,

Anthony Liguori

>
> and you're proposing this:
>
> abstract class Object
>     abstract class Device
>         abstract class APIC
>             class QEMU_APIC
>             class KVM_APIC
>
> Both can be right, both can be wrong.
>
> Paolo
>
>


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-20 13:54                   ` Anthony Liguori
@ 2011-12-20 13:57                     ` Paolo Bonzini
  2011-12-20 14:07                       ` Anthony Liguori
  0 siblings, 1 reply; 99+ messages in thread
From: Paolo Bonzini @ 2011-12-20 13:57 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: kvm, Michael S. Tsirkin, Marcelo Tosatti, qemu-devel, Blue Swirl,
	Jan Kiszka, Avi Kivity

On 12/20/2011 02:54 PM, Anthony Liguori wrote:
>> In QOM parlance Jan implemented this:
>>
>> abstract class Object
>>     abstract class Device
>>         class APIC: { backend: link<APICBackend> }
>>     abstract class APICBackend
>>         class QEMU_APICBackend
>>         class KVM_APICBackend
>
> I don't fundamentally object to modeling it like this provided that it's
> modeled (and visible) through qdev and not done through a one-off
> infrastructure.

There is no superclass of DeviceState, hence doing it through qdev would 
mean introducing a new bus type and so on.  This would be a superb 
example of a useless bus that can disappear with QOM, but I don't see 
why we should take the pain to add it in the first place. :)

We sure can revisit this when the subclassing and interface 
infrastructures of QOM are merged.

Paolo

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-20 13:57                     ` Paolo Bonzini
@ 2011-12-20 14:07                       ` Anthony Liguori
  2011-12-20 17:02                           ` Jan Kiszka
  0 siblings, 1 reply; 99+ messages in thread
From: Anthony Liguori @ 2011-12-20 14:07 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, Michael S. Tsirkin, Marcelo Tosatti, qemu-devel, Blue Swirl,
	Jan Kiszka, Avi Kivity

On 12/20/2011 07:57 AM, Paolo Bonzini wrote:
> On 12/20/2011 02:54 PM, Anthony Liguori wrote:
>>> In QOM parlance Jan implemented this:
>>>
>>> abstract class Object
>>>     abstract class Device
>>>         class APIC: { backend: link<APICBackend> }
>>>     abstract class APICBackend
>>>         class QEMU_APICBackend
>>>         class KVM_APICBackend
>>
>> I don't fundamentally object to modeling it like this provided that it's
>> modeled (and visible) through qdev and not done through a one-off
>> infrastructure.
>
> There is no superclass of DeviceState, hence doing it through qdev would mean
> introducing a new bus type and so on. This would be a superb example of a
> useless bus that can disappear with QOM, but I don't see why we should take the
> pain to add it in the first place. :)

Right, so let's model it for now as inheritance, which qdev can cope with.

>
> We sure can revisit this when the subclassing and interface infrastructures of
> QOM are merged.

I'll have patches out this week (just trying to write some more test cases). 
The latest series is below if you're interested.  I fear that it won't be until 
mid to late January before this can be merged though as I want to give folks 
like Markus a chance to review it.

https://github.com/aliguori/qemu/tree/qom-upstream.3

Regards,

Anthony Liguori

>
> Paolo
>


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-20 13:51                   ` Paolo Bonzini
@ 2011-12-20 14:07                     ` Avi Kivity
  -1 siblings, 0 replies; 99+ messages in thread
From: Avi Kivity @ 2011-12-20 14:07 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Anthony Liguori, Anthony Liguori, kvm, Michael S. Tsirkin,
	Marcelo Tosatti, qemu-devel, Blue Swirl, Jan Kiszka

On 12/20/2011 03:51 PM, Paolo Bonzini wrote:
> On 12/20/2011 02:41 PM, Anthony Liguori wrote:
>> On 12/20/2011 03:56 AM, Avi Kivity wrote:
>>> On 12/20/2011 02:38 AM, Anthony Liguori wrote:
>>>>> That was v1 of my patches. Avi didn't like it, I tried it like this, and
>>>>> in the end I had to agree. So, no, I don't think we want such a model.
>>>>
>>>>
>>>> Yes, we do :-)
>>>>
>>>> The in-kernel APIC is a different implementation of the APIC device.
>>>> It's not an "accelerator" for the userspace APIC.
>>>
>>> A different implementation but not a different device. Device == spec.
>>
>> If it was hardware, it'd be a fully compatible clone. The way we would
>> model this is via inheritance.
>
> I see your fully compatible clone, and I raise my bridge with a
> different implementation underneath.  It's the same old debate on is-a
> vs has-a.
>
> In QOM parlance Jan implemented this:

QOM is the new C++

>
>     abstract class Object
>         abstract class Device
>             class APIC: { backend: link<APICBackend> }
>         abstract class APICBackend
>             class QEMU_APICBackend
>             class KVM_APICBackend
>
> and you're proposing this:
>
>     abstract class Object
>         abstract class Device
>             abstract class APIC
>                 class QEMU_APIC
>                 class KVM_APIC
>
> Both can be right, both can be wrong.

I don't mind either.  What I don't want:

  abstract class Object
     abstract class Device
        class APIC
        class KVMAPIC

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-20 14:07                       ` Anthony Liguori
@ 2011-12-20 17:02                           ` Jan Kiszka
  0 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-20 17:02 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Paolo Bonzini, kvm, Michael S. Tsirkin, Marcelo Tosatti,
	qemu-devel, Blue Swirl, Avi Kivity

On 2011-12-20 15:07, Anthony Liguori wrote:
> On 12/20/2011 07:57 AM, Paolo Bonzini wrote:
>> On 12/20/2011 02:54 PM, Anthony Liguori wrote:
>>>> In QOM parlance Jan implemented this:
>>>>
>>>> abstract class Object
>>>>     abstract class Device
>>>>         class APIC: { backend: link<APICBackend> }
>>>>     abstract class APICBackend
>>>>         class QEMU_APICBackend
>>>>         class KVM_APICBackend
>>>
>>> I don't fundamentally object to modeling it like this provided that it's
>>> modeled (and visible) through qdev and not done through a one-off
>>> infrastructure.
>>
>> There is no superclass of DeviceState, hence doing it through qdev would
>> mean introducing a new bus type and so on. This would be a superb example
>> of a useless bus that can disappear with QOM, but I don't see why we should
>> take the pain to add it in the first place. :)
> 
> Right, so let's model it for now as inheritance, which qdev can cope with.

Do we have a clear plan now how to sort out the addressing issues in
this model? I mean when registering two devices under different names
that are supposed to be addressable under the same alias once
instantiated. Unfortunately, I didn't follow the recent qtree naming
changes in detail, so I don't know whether they already enable this.

This does not need to be implemented before merge. I'd just like to have a
common view on how to address it once it matters (for device inspection).

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-20 17:02                           ` Jan Kiszka
  (?)
@ 2011-12-20 19:14                           ` Anthony Liguori
  2011-12-20 21:23                             ` Jan Kiszka
  -1 siblings, 1 reply; 99+ messages in thread
From: Anthony Liguori @ 2011-12-20 19:14 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: kvm, Michael S. Tsirkin, Marcelo Tosatti, qemu-devel, Blue Swirl,
	Avi Kivity, Paolo Bonzini

On 12/20/2011 11:02 AM, Jan Kiszka wrote:
> On 2011-12-20 15:07, Anthony Liguori wrote:
>> On 12/20/2011 07:57 AM, Paolo Bonzini wrote:
>>> On 12/20/2011 02:54 PM, Anthony Liguori wrote:
>>>>> In QOM parlance Jan implemented this:
>>>>>
>>>>> abstract class Object
>>>>> abstract class Device
>>>>> class APIC: { backend: link<APICBackend>  }
>>>>> abstract class APICBackend
>>>>> class QEMU_APICBackend
>>>>> class KVM_APICBackend
>>>>
>>>> I don't fundamentally object to modeling it like this provided that it's
>>>> modeled (and visible) through qdev and not done through a one-off
>>>> infrastructure.
>>>
>>> There is no superclass of DeviceState, hence doing it through qdev
>>> would mean
>>> introducing a new bus type and so on. This would be a superb example of a
>>> useless bus that can disappear with QOM, but I don't see why we should
>>> take the
>>> pain to add it in the first place. :)
>>
>> Right, so let's model it for now as inheritance which qdev can cope with.
>
> Do we have a clear plan now how to sort out the addressing issues in
> this model? I mean when registering two devices under different names
> that are supposed to be addressable under the same alias once
> instantiated. Unfortunately, I didn't follow the recent qtree naming changes
> in detail, so I don't know if they already enable this.

I think everyone is in agreement.  We'll start with an APICBase type that's 
modeled in qdev as a base class.

There will be an APICBaseInfo that will replace APICBackend.

There will be two classes that implement APICBaseInfo, KvmAPIC and APIC.  They 
will be separate devices.

APICBase will register the vmsd and will use the name "apic" to register it. 
You can just set the qdev.vmsd field in the apic_qdev_register() function to 
ensure that both use the same implementation.
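
As a rough illustration of the layout described above, here is a minimal C sketch. It is not QEMU code: the struct fields, the `sketch_apic_register()` and `sketch_apic_find()` helpers, and the TPR callback are all made up for this example. It only shows two device types sharing one base descriptor and one migration name while keeping distinct type names.

```c
#include <stddef.h>
#include <string.h>

typedef struct APICBase APICBase;

/* Hypothetical per-type descriptor standing in for the APICBaseInfo
 * mentioned above. Field names are illustrative, not QEMU's. */
typedef struct APICBaseInfo {
    const char *type_name;   /* distinct per device: "apic" or "kvm-apic" */
    const char *vmsd_name;   /* filled in by sketch_apic_register() */
    void (*set_tpr)(APICBase *s, unsigned char val);
} APICBaseInfo;

/* State shared by both implementations (the "base class"). */
struct APICBase {
    const APICBaseInfo *info;
    unsigned char tpr;
    int kernel_updates;      /* counts pushes to a pretend in-kernel model */
};

/* User space model: just store the value. */
static void user_set_tpr(APICBase *s, unsigned char val)
{
    s->tpr = val;
}

/* In-kernel model: store the value and pretend to sync it to the kernel. */
static void kvm_set_tpr(APICBase *s, unsigned char val)
{
    s->tpr = val;
    s->kernel_updates++;
}

static APICBaseInfo apic_info     = { "apic",     NULL, user_set_tpr };
static APICBaseInfo kvm_apic_info = { "kvm-apic", NULL, kvm_set_tpr };

#define MAX_TYPES 4
static APICBaseInfo *registry[MAX_TYPES];
static size_t registry_len;

/* Common registration path: every subclass gets the same migration
 * name, so saved state is interchangeable between the two devices. */
static int sketch_apic_register(APICBaseInfo *info)
{
    if (registry_len == MAX_TYPES) {
        return -1;
    }
    info->vmsd_name = "apic";
    registry[registry_len++] = info;
    return 0;
}

/* Look a registered type up by its (distinct) type name. */
static const APICBaseInfo *sketch_apic_find(const char *type_name)
{
    for (size_t i = 0; i < registry_len; i++) {
        if (strcmp(registry[i]->type_name, type_name) == 0) {
            return registry[i];
        }
    }
    return NULL;
}
```

The point of the sketch is that the shared name is set in exactly one place, the common registration helper, so neither subclass can drift away from it.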

>
> This does not need to be implemented before merge. I'd just like to have a
> common view on how to address it once it matters (for device inspection).

You can do this all today without any pending patches.  As I mentioned earlier, 
I don't mind doing this after the fact if you'd just like to get the current 
series merged.

If your series lands before the QOM series I just posted, then I will need to do 
it as part of the QOM series anyway.

Regards,

Anthony Liguori

> Jan
>


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-20 19:14                           ` Anthony Liguori
@ 2011-12-20 21:23                             ` Jan Kiszka
  2011-12-20 21:38                               ` Anthony Liguori
  0 siblings, 1 reply; 99+ messages in thread
From: Jan Kiszka @ 2011-12-20 21:23 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: kvm, Michael S. Tsirkin, Marcelo Tosatti, qemu-devel, Blue Swirl,
	Avi Kivity, Paolo Bonzini

On 2011-12-20 20:14, Anthony Liguori wrote:
> On 12/20/2011 11:02 AM, Jan Kiszka wrote:
>> On 2011-12-20 15:07, Anthony Liguori wrote:
>>> On 12/20/2011 07:57 AM, Paolo Bonzini wrote:
>>>> On 12/20/2011 02:54 PM, Anthony Liguori wrote:
>>>>>> In QOM parlance Jan implemented this:
>>>>>>
>>>>>> abstract class Object
>>>>>> abstract class Device
>>>>>> class APIC: { backend: link<APICBackend>  }
>>>>>> abstract class APICBackend
>>>>>> class QEMU_APICBackend
>>>>>> class KVM_APICBackend
>>>>>
>>>>> I don't fundamentally object to modeling it like this provided that
>>>>> it's
>>>>> modeled (and visible) through qdev and not done through a one-off
>>>>> infrastructure.
>>>>
>>>> There is no superclass of DeviceState, hence doing it through qdev
>>>> would mean
>>>> introducing a new bus type and so on. This would be a superb example
>>>> of a
>>>> useless bus that can disappear with QOM, but I don't see why we should
>>>> take the
>>>> pain to add it in the first place. :)
>>>
>>> Right, so let's model it for now as inheritance which qdev can cope
>>> with.
>>
>> Do we have a clear plan now how to sort out the addressing issues in
>> this model? I mean when registering two devices under different names
>> that are supposed to be addressable under the same alias once
>> instantiated. Unfortunately, I didn't follow the recent qtree naming changes
>> in detail, so I don't know if they already enable this.
> 
> I think everyone is in agreement.  We'll start with an APICBase type
> that's modeled in qdev as a base class.
> 
> There will be an APICBaseInfo that will replace APICBackend.
> 
> There will be two classes that implement APICBaseInfo, KvmAPIC and
> APIC.  They will be separate devices.
> 
> APICBase will register the vmsd and will use the name "apic" to register
> it. You can just set the qdev.vmsd field in the apic_qdev_register()
> function to ensure that both use the same implementation.

I'm not talking about migration here, I'm talking about qtree
addressability. That is orthogonal, at least right now.

> 
>>
>> This does not need to be implemented before merge. I'd just like to have a
>> common view on how to address it once it matters (for device inspection).
> 
> You can do this all today without any pending patches.

Nope, don't see how.

There is currently no use case for it (e.g. no device_show -
device_add/del makes no sense for the devices in question), but it
should be addressable in QOM in the future.

Jan


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-20 21:23                             ` Jan Kiszka
@ 2011-12-20 21:38                               ` Anthony Liguori
  2011-12-20 21:45                                 ` Jan Kiszka
  0 siblings, 1 reply; 99+ messages in thread
From: Anthony Liguori @ 2011-12-20 21:38 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: kvm, Michael S. Tsirkin, Marcelo Tosatti, qemu-devel, Blue Swirl,
	Avi Kivity, Paolo Bonzini

On 12/20/2011 03:23 PM, Jan Kiszka wrote:
> On 2011-12-20 20:14, Anthony Liguori wrote:
>> On 12/20/2011 11:02 AM, Jan Kiszka wrote:
>>> On 2011-12-20 15:07, Anthony Liguori wrote:
>>>> On 12/20/2011 07:57 AM, Paolo Bonzini wrote:
>>>>> On 12/20/2011 02:54 PM, Anthony Liguori wrote:
>>>>>>> In QOM parlance Jan implemented this:
>>>>>>>
>>>>>>> abstract class Object
>>>>>>> abstract class Device
>>>>>>> class APIC: { backend: link<APICBackend>   }
>>>>>>> abstract class APICBackend
>>>>>>> class QEMU_APICBackend
>>>>>>> class KVM_APICBackend
>>>>>>
>>>>>> I don't fundamentally object to modeling it like this provided that
>>>>>> it's
>>>>>> modeled (and visible) through qdev and not done through a one-off
>>>>>> infrastructure.
>>>>>
>>>>> There is no superclass of DeviceState, hence doing it through qdev
>>>>> would mean
>>>>> introducing a new bus type and so on. This would be a superb example
>>>>> of a
>>>>> useless bus that can disappear with QOM, but I don't see why we should
>>>>> take the
>>>>> pain to add it in the first place. :)
>>>>
>>>> Right, so let's model it for now as inheritance which qdev can cope
>>>> with.
>>>
>>> Do we have a clear plan now how to sort out the addressing issues in
>>> this model? I mean when registering two devices under different names
>>> that are supposed to be addressable under the same alias once
>>> instantiated. Unfortunately, I didn't follow the recent qtree naming changes
>>> in detail, so I don't know if they already enable this.
>>
>> I think everyone is in agreement.  We'll start with an APICBase type
>> that's modeled in qdev as a base class.
>>
>> There will be an APICBaseInfo that will replace APICBackend.
>>
>> There will be two classes that implement APICBaseInfo, KvmAPIC and
>> APIC.  They will be separate devices.
>>
>> APICBase will register the vmsd and will use the name "apic" to register
>> it. You can just set the qdev.vmsd field in the apic_qdev_register()
>> function to ensure that both use the same implementation.
>
> I'm not talking about migration here, I'm talking about qtree
> addressability. That is orthogonal, at least right now.

qtree is not an ABI.  The output of info qtree can (and will) change over time.

>
>>
>>>
>>> This does not need to be implemented before merge. I'd just like to have a
>>> common view on how to address it once it matters (for device inspection).
>>
>> You can do this all today without any pending patches.
>
> Nope, don't see how.

What is this issue?

>
> There is currently no use case for it (e.g. no device_show -
> device_add/del makes no sense for the devices in question), but it
> should be addressable in QOM in the future.

I guess I'm a bit confused...

Regards,

Anthony Liguori

>
> Jan
>


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-20 21:38                               ` Anthony Liguori
@ 2011-12-20 21:45                                 ` Jan Kiszka
  2011-12-20 21:55                                   ` Anthony Liguori
  0 siblings, 1 reply; 99+ messages in thread
From: Jan Kiszka @ 2011-12-20 21:45 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: kvm, Michael S. Tsirkin, Marcelo Tosatti, qemu-devel, Blue Swirl,
	Avi Kivity, Paolo Bonzini

On 2011-12-20 22:38, Anthony Liguori wrote:
> On 12/20/2011 03:23 PM, Jan Kiszka wrote:
>> On 2011-12-20 20:14, Anthony Liguori wrote:
>>> On 12/20/2011 11:02 AM, Jan Kiszka wrote:
>>>> On 2011-12-20 15:07, Anthony Liguori wrote:
>>>>> On 12/20/2011 07:57 AM, Paolo Bonzini wrote:
>>>>>> On 12/20/2011 02:54 PM, Anthony Liguori wrote:
>>>>>>>> In QOM parlance Jan implemented this:
>>>>>>>>
>>>>>>>> abstract class Object
>>>>>>>> abstract class Device
>>>>>>>> class APIC: { backend: link<APICBackend>   }
>>>>>>>> abstract class APICBackend
>>>>>>>> class QEMU_APICBackend
>>>>>>>> class KVM_APICBackend
>>>>>>>
>>>>>>> I don't fundamentally object to modeling it like this provided that
>>>>>>> it's
>>>>>>> modeled (and visible) through qdev and not done through a one-off
>>>>>>> infrastructure.
>>>>>>
>>>>>> There is no superclass of DeviceState, hence doing it through qdev
>>>>>> would mean
>>>>>> introducing a new bus type and so on. This would be a superb example
>>>>>> of a
>>>>>> useless bus that can disappear with QOM, but I don't see why we
>>>>>> should
>>>>>> take the
>>>>>> pain to add it in the first place. :)
>>>>>
>>>>> Right, so let's model it for now as inheritance which qdev can cope
>>>>> with.
>>>>
>>>> Do we have a clear plan now how to sort out the addressing issues in
>>>> this model? I mean when registering two devices under different names
>>>> that are supposed to be addressable under the same alias once
>>>> instantiated. Unfortunately, I didn't follow the recent qtree naming changes
>>>> in detail, so I don't know if they already enable this.
>>>
>>> I think everyone is in agreement.  We'll start with an APICBase type
>>> that's modeled in qdev as a base class.
>>>
>>> There will be an APICBaseInfo that will replace APICBackend.
>>>
>>> There will be two classes that implement APICBaseInfo, KvmAPIC and
>>> APIC.  They will be separate devices.
>>>
>>> APICBase will register the vmsd and will use the name "apic" to register
>>> it. You can just set the qdev.vmsd field in the apic_qdev_register()
>>> function to ensure that both use the same implementation.
>>
>> I'm not talking about migration here, I'm talking about qtree
>> addressability. That is orthogonal, at least right now.
> 
> qtree is not an ABI.  The output of info qtree can (and will) change
> over time.

That's not the point. The point is that at least some branch of the
qtree should be identically named for both the KVM and the user space
incarnations of a particular device (given a certain qemu version).

The request was that /qtree/path/to/apic should not change if you enable
KVM in-kernel acceleration in the very same qemu release. There can also
be some /qtree/path/to/kvm-apic then, but as an alias (or as the primary name,
with the other becoming an alias). I think this makes sense if the user is
still able to clearly differentiate between both versions when listing
devices.

Jan


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-20 21:45                                 ` Jan Kiszka
@ 2011-12-20 21:55                                   ` Anthony Liguori
  2011-12-20 22:20                                       ` [Qemu-devel] " Jan Kiszka
  0 siblings, 1 reply; 99+ messages in thread
From: Anthony Liguori @ 2011-12-20 21:55 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: kvm, Michael S. Tsirkin, Marcelo Tosatti, qemu-devel, Blue Swirl,
	Avi Kivity, Paolo Bonzini

On 12/20/2011 03:45 PM, Jan Kiszka wrote:
> On 2011-12-20 22:38, Anthony Liguori wrote:
>>> I'm not talking about migration here, I'm talking about qtree
>>> addressability. That is orthogonal, at least right now.
>>
>> qtree is not an ABI.  The output of info qtree can (and will) change
>> over time.
>
> That's not the point. The point is that at least some branch of the
> qtree should be identically named for both the KVM and the user space
> incarnations of a particular device (given a certain qemu version).

There is no such thing as "qtree paths".  Today, devices have ids or are 
anonymous.  The apic is currently an anonymous device and there's no way to 
address it until we complete the PC composition tree.  I have patches for this, 
but that won't land until after series 4.

Starting right now, we have a standard path mechanism.  This path will either 
follow the composition tree or potentially an arbitrary path through the link graph.

The components of the path are the *property* names of the parent device.  In 
the case of the local APIC, you would have something like:

/cpus/cpu0/apic
/cpus/cpu1/apic

Which would be links on the composition tree.  The name wouldn't change even if 
the type of this object changed.  You'll probably have a flag or something in 
the cpu object that lets you determine whether the child is created as a 
kvm-apic or just a normal apic.  But that would only affect the 'type' flag.

> The request was that /qtree/path/to/apic should not change if you enable
> KVM in-kernel acceleration in the very same qemu release.

The type names of the devices are orthogonal to the path names.

> There can also
> be some /qtree/path/to/kvm-apic then, but as an alias (or as the primary name,
> with the other becoming an alias). I think this makes sense if the user is
> still able to clearly differentiate between both versions when listing
> devices.

Yes, they just need to read the 'type' property.  The distinguishing property 
would be:

/cpus/cpu0/apic.type = 'apic'

vs.

/cpus/cpu0/apic.type = 'kvm-apic'

But otherwise, it would look the same.
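
To make the path idea concrete, here is a small C sketch under stated assumptions: the `Obj` struct, `obj_add_child()` and `obj_resolve()` are hypothetical names invented for this example and do not reflect QEMU's actual object model. The sketch only demonstrates that a path built from the parent's property names resolves to the same child regardless of that child's type.

```c
#include <stddef.h>
#include <string.h>

#define MAX_CHILDREN 4

/* Hypothetical object node: children are reached through the property
 * names of the parent, not through the child's own type name. */
typedef struct Obj {
    const char *type;                      /* e.g. "apic" or "kvm-apic" */
    const char *child_name[MAX_CHILDREN];  /* property names on this object */
    struct Obj *child[MAX_CHILDREN];
    size_t nchildren;
} Obj;

/* Attach 'child' to 'parent' under the property name 'name'. */
static int obj_add_child(Obj *parent, const char *name, Obj *child)
{
    if (parent->nchildren == MAX_CHILDREN) {
        return -1;
    }
    parent->child_name[parent->nchildren] = name;
    parent->child[parent->nchildren] = child;
    parent->nchildren++;
    return 0;
}

/* Resolve an absolute path such as "/cpus/cpu0/apic".
 * Returns NULL if any component does not exist. */
static Obj *obj_resolve(Obj *root, const char *path)
{
    Obj *cur = root;

    while (cur && *path) {
        while (*path == '/') {
            path++;                        /* skip separators */
        }
        size_t len = strcspn(path, "/");   /* length of this component */
        if (len == 0) {
            break;                         /* trailing slash or empty path */
        }
        Obj *next = NULL;
        for (size_t i = 0; i < cur->nchildren; i++) {
            if (strlen(cur->child_name[i]) == len &&
                strncmp(cur->child_name[i], path, len) == 0) {
                next = cur->child[i];
                break;
            }
        }
        cur = next;
        path += len;
    }
    return cur;
}
```

Changing the `type` field of the node reached via "/cpus/cpu0/apic" does not affect the lookup, which is the property the discussion relies on.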

Again, if you implement qdev based inheritance as I described in my previous 
note, this will all Just Work.  We have everything we need in the tree to model 
this.

Regards,

Anthony Liguori

>
> Jan
>


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-20 21:55                                   ` Anthony Liguori
@ 2011-12-20 22:20                                       ` Jan Kiszka
  0 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-20 22:20 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: kvm, Michael S. Tsirkin, Marcelo Tosatti, qemu-devel, Blue Swirl,
	Avi Kivity, Paolo Bonzini

On 2011-12-20 22:55, Anthony Liguori wrote:
> On 12/20/2011 03:45 PM, Jan Kiszka wrote:
>> On 2011-12-20 22:38, Anthony Liguori wrote:
>>>> I'm not talking about migration here, I'm talking about qtree
>>>> addressability. That is orthogonal, at least right now.
>>>
>>> qtree is not an ABI.  The output of info qtree can (and will) change
>>> over time.
>>
>> That's not the point. The point is that at least some branch of the
>> qtree should be identically named for both the KVM and the user space
>> incarnations of a particular device (given a certain qemu version).
> 
> There is no such thing as "qtree paths".  Today, devices have ids or are
> anonymous.  The apic is currently an anonymous device and there's no way
> to address it until we complete the PC composition tree.  I have patches
> for this, but that won't land until after series 4.
> 
> Starting right now, we have a standard path mechanism.  This path will
> either follow the composition tree or potentially an arbitrary path
> through the link graph.
> 
> The components of the path are the *property* names of the parent
> device.  In the case of the local APIC, you would have something like:
> 
> /cpus/cpu0/apic
> /cpus/cpu1/apic
> 
> Which would be links on the composition tree.  The name wouldn't change
> even if the type of this object changed. 

Perfect! That was what I forgot about and what makes it possible to
return to the original two-device model.

> You'll probably have a flag or
> something in the cpu object that lets you determine whether the child is
> created as a kvm-apic or just a normal apic. 

I rather hope you will be able to ask the device for its type instead of
replicating that information.

Jan


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-20 22:20                                       ` [Qemu-devel] " Jan Kiszka
  (?)
@ 2011-12-20 23:41                                       ` Anthony Liguori
  2011-12-20 23:45                                         ` Jan Kiszka
  -1 siblings, 1 reply; 99+ messages in thread
From: Anthony Liguori @ 2011-12-20 23:41 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: kvm, Michael S. Tsirkin, Marcelo Tosatti, qemu-devel, Blue Swirl,
	Avi Kivity, Paolo Bonzini

On 12/20/2011 04:20 PM, Jan Kiszka wrote:
> On 2011-12-20 22:55, Anthony Liguori wrote:
>> The components of the path are the *property* names of the parent
>> device.  In the case of the local APIC, you would have something like:
>>
>> /cpus/cpu0/apic
>> /cpus/cpu1/apic
>>
>> Which would be links on the composition tree.  The name wouldn't change
>> even if the type of this object changed.
>
> Perfect! That was what I forgot about and what makes it possible to
> return to the original two-device model.
>
>> You'll probably have a flag or
>> something in the cpu object that lets you determine whether the child is
>> created as a kvm-apic or just a normal apic.
>
> I rather hope you will be able to ask the device for its type instead of
> replicating that information.

Yes, but that's not what I was getting at.

I think you are currently planning on enabling/disabling the in-kernel apic 
through a machine option?

Where I'd like to get to is that the CPUs are modeled as devices and whether the 
APIC is in-kernel or not is a property of the CPU (just like any other CPU flag).

For something like the i8254, since that's a child of the PIIX3, it would be a 
property of the PIIX3 which it would use to create the appropriate i8254 type.

You could also have the CPU and/or i8254 have a link<> which would allow a user 
to explicitly instantiate the appropriate device but I think that makes it 
harder to use than it should be.

By making it a property of the composition parent, you let the parent make the 
best choice to start with, and the user can then override it if they see fit.
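
A minimal C sketch of that idea follows, assuming hypothetical `CpuDev` and `ApicDev` types and helper names that exist only in this example. The parent picks a default, the user may override it before the child exists, and the child's type is derived from the parent's property.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types; not QEMU's CPU or APIC structures. */
typedef struct {
    const char *type;     /* "apic" or "kvm-apic" */
} ApicDev;

typedef struct {
    bool kernel_apic;     /* property on the composition parent */
    bool children_created;
    ApicDev apic;         /* composition child */
} CpuDev;

/* The parent chooses a sensible default from what the host offers. */
static void cpu_init(CpuDev *cpu, bool host_has_kernel_apic)
{
    cpu->kernel_apic = host_has_kernel_apic;
    cpu->children_created = false;
    cpu->apic.type = NULL;
}

/* User override; rejected once the child has been created. */
static int cpu_set_kernel_apic(CpuDev *cpu, bool value)
{
    if (cpu->children_created) {
        return -1;
    }
    cpu->kernel_apic = value;
    return 0;
}

/* The parent creates the child of the type its property selects. */
static void cpu_create_children(CpuDev *cpu)
{
    cpu->apic.type = cpu->kernel_apic ? "kvm-apic" : "apic";
    cpu->children_created = true;
}
```

A caller would run `cpu_init()`, optionally `cpu_set_kernel_apic()`, then `cpu_create_children()`; the user never has to name the child type directly.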

Regards,

Anthony Liguori

> Jan
>


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse
  2011-12-20 23:41                                       ` Anthony Liguori
@ 2011-12-20 23:45                                         ` Jan Kiszka
  0 siblings, 0 replies; 99+ messages in thread
From: Jan Kiszka @ 2011-12-20 23:45 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: kvm, Michael S. Tsirkin, Marcelo Tosatti, qemu-devel, Blue Swirl,
	Avi Kivity, Paolo Bonzini

On 2011-12-21 00:41, Anthony Liguori wrote:
> On 12/20/2011 04:20 PM, Jan Kiszka wrote:
>> On 2011-12-20 22:55, Anthony Liguori wrote:
>>> The components of the path are the *property* names of the parent
>>> device.  In the case of the local APIC, you would have something like:
>>>
>>> /cpus/cpu0/apic
>>> /cpus/cpu1/apic
>>>
>>> Which would be links on the composition tree.  The name wouldn't change
>>> even if the type of this object changed.
>>
>> Perfect! That was what I forgot about and what makes it possible to
>> return to the original two-device model.
>>
>>> You'll probably have a flag or
>>> something in the cpu object that lets you determine whether the child is
>>> created as a kvm-apic or just a normal apic.
>>
>> I rather hope you will be able to ask the device for its type instead of
>> replicating that information.
> 
> Yes, but that's not what I was getting at.
> 
> I think you are currently planning on enabling/disabling the in-kernel
> apic through a machine option?

Yes, because it is a VM-wide flag, nothing you can control per irqchip,
per chipset or whatever. It must be consistent for the whole VM, meaning
all CPUs, the chipset, the IOAPIC (which may or may not (PIIX3) be part
of it), etc. It also affects KVM internals that are not directly bound to
device models.
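
The VM-wide constraint can be sketched in C as follows. This is an illustration only: the `Machine` struct, the `IrqDevKind` enum, `irq_dev_type()` and the type-name strings are assumptions made for this example. Every interrupt controller derives its variant from one machine-level flag, so a mixed configuration cannot be expressed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical machine-level state holding the single VM-wide flag. */
typedef struct {
    bool kernel_irqchip;
} Machine;

/* The interrupt controllers that must all agree with that flag. */
typedef enum {
    DEV_APIC = 0,
    DEV_IOAPIC,
    DEV_I8259
} IrqDevKind;

/* Map a device kind to the type to instantiate. Because the only
 * input is the machine-wide flag, all devices end up on the same
 * side (user space or in-kernel). Type names are illustrative. */
static const char *irq_dev_type(const Machine *m, IrqDevKind kind)
{
    static const char *const user[] = { "apic", "ioapic", "i8259" };
    static const char *const kvm[]  = { "kvm-apic", "kvm-ioapic", "kvm-i8259" };

    return m->kernel_irqchip ? kvm[kind] : user[kind];
}
```

Compared with a per-parent property, this keeps the choice in one place at the cost of not being overridable per device.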

Jan


^ permalink raw reply	[flat|nested] 99+ messages in thread

end of thread, other threads:[~2011-12-20 23:45 UTC | newest]

Thread overview: 99+ messages
2011-12-15 12:33 [PATCH v5 00/16] uq/master: Introduce basic irqchip support Jan Kiszka
2011-12-15 12:33 ` [Qemu-devel] " Jan Kiszka
2011-12-15 12:33 ` [PATCH v5 01/16] msi: Generalize msix_supported to msi_supported Jan Kiszka
2011-12-15 12:33   ` [Qemu-devel] " Jan Kiszka
2011-12-15 12:33 ` [PATCH v5 02/16] kvm: Move kvmclock into hw/kvm folder Jan Kiszka
2011-12-15 12:33   ` [Qemu-devel] " Jan Kiszka
2011-12-15 12:33 ` [PATCH v5 03/16] apic: Stop timer on reset Jan Kiszka
2011-12-15 12:33   ` [Qemu-devel] " Jan Kiszka
2011-12-15 12:33 ` [PATCH v5 04/16] apic: Inject external NMI events via LINT1 Jan Kiszka
2011-12-15 12:33   ` [Qemu-devel] " Jan Kiszka
2011-12-15 12:33 ` [PATCH v5 05/16] apic: Introduce apic_report_irq_delivered Jan Kiszka
2011-12-15 12:33   ` [Qemu-devel] " Jan Kiszka
2011-12-15 12:33 ` [PATCH v5 06/16] apic: Introduce backend/frontend infrastructure for KVM reuse Jan Kiszka
2011-12-15 12:33   ` [Qemu-devel] " Jan Kiszka
2011-12-19 22:14   ` Anthony Liguori
2011-12-19 22:14     ` Anthony Liguori
2011-12-19 23:32     ` Jan Kiszka
2011-12-19 23:32       ` Jan Kiszka
2011-12-20  0:28       ` Anthony Liguori
2011-12-20  0:32         ` Jan Kiszka
2011-12-20  0:38           ` Anthony Liguori
2011-12-20  9:56             ` Avi Kivity
2011-12-20  9:56               ` Avi Kivity
2011-12-20 13:41               ` Anthony Liguori
2011-12-20 13:51                 ` Paolo Bonzini
2011-12-20 13:51                   ` Paolo Bonzini
2011-12-20 13:54                   ` Anthony Liguori
2011-12-20 13:57                     ` Paolo Bonzini
2011-12-20 14:07                       ` Anthony Liguori
2011-12-20 17:02                         ` Jan Kiszka
2011-12-20 17:02                           ` Jan Kiszka
2011-12-20 19:14                           ` Anthony Liguori
2011-12-20 21:23                             ` Jan Kiszka
2011-12-20 21:38                               ` Anthony Liguori
2011-12-20 21:45                                 ` Jan Kiszka
2011-12-20 21:55                                   ` Anthony Liguori
2011-12-20 22:20                                     ` Jan Kiszka
2011-12-20 22:20                                       ` [Qemu-devel] " Jan Kiszka
2011-12-20 23:41                                       ` Anthony Liguori
2011-12-20 23:45                                         ` Jan Kiszka
2011-12-20 14:07                   ` Avi Kivity
2011-12-20 14:07                     ` Avi Kivity
2011-12-15 12:33 ` [PATCH v5 07/16] apic: Open-code timer save/restore Jan Kiszka
2011-12-15 12:33   ` [Qemu-devel] " Jan Kiszka
2011-12-19 22:21   ` Anthony Liguori
2011-12-19 22:21     ` [Qemu-devel] " Anthony Liguori
2011-12-19 23:45     ` Jan Kiszka
2011-12-19 23:45       ` [Qemu-devel] " Jan Kiszka
2011-12-20  0:31       ` Anthony Liguori
2011-12-20  0:34         ` Jan Kiszka
2011-12-20  0:34           ` [Qemu-devel] " Jan Kiszka
2011-12-20  0:53           ` Anthony Liguori
2011-12-20  0:53             ` [Qemu-devel] " Anthony Liguori
2011-12-20  1:24             ` Jan Kiszka
2011-12-20  1:24               ` [Qemu-devel] " Jan Kiszka
2011-12-15 12:33 ` [PATCH v5 08/16] i8259: Introduce backend/frontend infrastructure for KVM reuse Jan Kiszka
2011-12-15 12:33   ` [Qemu-devel] " Jan Kiszka
2011-12-15 12:33 ` [PATCH v5 09/16] ioapic: " Jan Kiszka
2011-12-15 12:33   ` [Qemu-devel] " Jan Kiszka
2011-12-15 12:33 ` [PATCH v5 10/16] memory: Introduce memory_region_init_reservation Jan Kiszka
2011-12-15 12:33   ` [Qemu-devel] " Jan Kiszka
2011-12-15 12:33 ` [PATCH v5 11/16] kvm: Introduce core services for in-kernel irqchip support Jan Kiszka
2011-12-15 12:33   ` [Qemu-devel] " Jan Kiszka
2011-12-15 12:33 ` [PATCH v5 12/16] kvm: x86: Establish IRQ0 override control Jan Kiszka
2011-12-15 12:33   ` [Qemu-devel] " Jan Kiszka
2011-12-15 12:33 ` [PATCH v5 13/16] kvm: x86: Add user space part for in-kernel APIC Jan Kiszka
2011-12-15 12:33   ` [Qemu-devel] " Jan Kiszka
2011-12-15 12:33 ` [PATCH v5 14/16] kvm: x86: Add user space part for in-kernel i8259 Jan Kiszka
2011-12-15 12:33   ` [Qemu-devel] " Jan Kiszka
2011-12-15 12:33 ` [PATCH v5 15/16] kvm: x86: Add user space part for in-kernel IOAPIC Jan Kiszka
2011-12-15 12:33   ` [Qemu-devel] " Jan Kiszka
2011-12-15 12:33 ` [PATCH v5 16/16] kvm: Arm in-kernel irqchip support Jan Kiszka
2011-12-15 12:33   ` [Qemu-devel] " Jan Kiszka
2011-12-19 21:17 ` [PATCH v5 00/16] uq/master: Introduce basic " Marcelo Tosatti
2011-12-19 21:17   ` [Qemu-devel] " Marcelo Tosatti
2011-12-19 22:24   ` Anthony Liguori
2011-12-19 22:24     ` Anthony Liguori
2011-12-19 23:49     ` Jan Kiszka
2011-12-19 23:49       ` [Qemu-devel] " Jan Kiszka
2011-12-20  0:32       ` Anthony Liguori
2011-12-20  0:37         ` Jan Kiszka
2011-12-20  0:42           ` Anthony Liguori
2011-12-20  0:42             ` [Qemu-devel] " Anthony Liguori
2011-12-20 10:01             ` Avi Kivity
2011-12-20 10:01               ` Avi Kivity
2011-12-20  1:08           ` Anthony Liguori
2011-12-20  1:19             ` Jan Kiszka
2011-12-20  1:19               ` [Qemu-devel] " Jan Kiszka
2011-12-20  1:28               ` Jan Kiszka
2011-12-20  1:28                 ` [Qemu-devel] " Jan Kiszka
2011-12-20  2:46               ` Anthony Liguori
2011-12-20  3:10                 ` Anthony Liguori
2011-12-20  8:34                   ` Jan Kiszka
2011-12-20  8:34                     ` [Qemu-devel] " Jan Kiszka
2011-12-20 10:03                 ` Avi Kivity
2011-12-20 10:03                   ` [Qemu-devel] " Avi Kivity
2011-12-20 10:08                   ` Avi Kivity
2011-12-20 10:08                     ` Avi Kivity
2011-12-20 13:45                     ` Anthony Liguori
