* [PATCH 00/17] KVM: PPC: In-kernel MPIC support with irqfd
@ 2013-04-18 14:11 ` Alexander Graf
  0 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

Hi,

This patch set contains a fully working implementation of the in-kernel MPIC
from Scott with a few fixups and a new version of my irqfd generalization
patch set.

Major changes since my last irqfd set are:

  - depend on CONFIG_ defines rather than __KVM defines
  - fix compile issues
  - fix the kvm_irqchip{,s} typo

I have refrained from touching IA64 at all in this patch set. It's marked
as BROKEN, and I doubt it even compiles today. The only sensible thing
to do would be to remove all IA64 KVM code from the kernel tree, but
that is out of scope for this patch set and definitely should not gate it.


Alex

Alexander Graf (11):
  KVM: Add KVM_IRQCHIP_NUM_PINS in addition to KVM_IOAPIC_NUM_PINS
  KVM: Introduce CONFIG_HAVE_KVM_IRQ_ROUTING
  KVM: Drop __KVM_HAVE_IOAPIC condition on irq routing
  KVM: Remove kvm_get_intr_delivery_bitmask
  KVM: Move irq routing to generic code
  KVM: Extract generic irqchip logic into irqchip.c
  KVM: Move irq routing setup to irqchip.c
  KVM: Move irqfd resample cap handling to generic code
  KVM: PPC: Support irq routing and irqfd for in-kernel MPIC
  KVM: PPC: MPIC: Add support for KVM_IRQ_LINE
  KVM: PPC: MPIC: Restrict to e500 platforms

Scott Wood (6):
  kvm: add device control API
  kvm/ppc/mpic: import hw/openpic.c from QEMU
  kvm/ppc/mpic: remove some obviously unneeded code
  kvm/ppc/mpic: adapt to kernel style and environment
  kvm/ppc/mpic: in-kernel MPIC emulation
  kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC

 Documentation/virtual/kvm/api.txt          |   78 ++
 Documentation/virtual/kvm/devices/README   |    1 +
 Documentation/virtual/kvm/devices/mpic.txt |   37 +
 arch/powerpc/include/asm/kvm_host.h        |   24 +-
 arch/powerpc/include/asm/kvm_ppc.h         |   30 +
 arch/powerpc/include/uapi/asm/kvm.h        |    9 +
 arch/powerpc/kvm/Kconfig                   |   12 +
 arch/powerpc/kvm/Makefile                  |    3 +
 arch/powerpc/kvm/booke.c                   |   12 +-
 arch/powerpc/kvm/irq.h                     |   17 +
 arch/powerpc/kvm/mpic.c                    | 1869 ++++++++++++++++++++++++++++
 arch/powerpc/kvm/powerpc.c                 |   55 +-
 arch/x86/include/asm/kvm_host.h            |    2 +
 arch/x86/kvm/Kconfig                       |    1 +
 arch/x86/kvm/Makefile                      |    2 +-
 arch/x86/kvm/x86.c                         |    1 -
 include/linux/kvm_host.h                   |   53 +-
 include/trace/events/kvm.h                 |   12 +-
 include/uapi/linux/kvm.h                   |   33 +-
 virt/kvm/Kconfig                           |    3 +
 virt/kvm/assigned-dev.c                    |   30 -
 virt/kvm/eventfd.c                         |    6 +-
 virt/kvm/irq_comm.c                        |  194 +---
 virt/kvm/irqchip.c                         |  237 ++++
 virt/kvm/kvm_main.c                        |  170 +++-
 25 files changed, 2641 insertions(+), 250 deletions(-)
 create mode 100644 Documentation/virtual/kvm/devices/README
 create mode 100644 Documentation/virtual/kvm/devices/mpic.txt
 create mode 100644 arch/powerpc/kvm/irq.h
 create mode 100644 arch/powerpc/kvm/mpic.c
 create mode 100644 virt/kvm/irqchip.c


* [PATCH 01/17] KVM: Add KVM_IRQCHIP_NUM_PINS in addition to KVM_IOAPIC_NUM_PINS
  2013-04-18 14:11 ` Alexander Graf
@ 2013-04-18 14:11   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

Routing interrupt lines to an irqchip is not an IOAPIC-specific
concept. Every irqchip has a maximum number of pins that can be linked
to irq lines.

So let's add a new define that allows us to reuse generic code for
non-IOAPIC platforms.
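
As a rough user-space sketch of what the generic name allows: the routing
table can be sized and reset without mentioning the IOAPIC at all. The
constants and helpers below (NR_IRQCHIPS, IRQCHIP_NUM_PINS,
routing_table_reset, routing_table_set) are illustrative stand-ins rather
than kernel symbols, and their values are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real values come from the arch headers. */
#define NR_IRQCHIPS       3
#define IOAPIC_NUM_PINS   24
#define IRQCHIP_NUM_PINS  IOAPIC_NUM_PINS  /* an arch aliases the generic name */

struct routing_table {
	int chip[NR_IRQCHIPS][IRQCHIP_NUM_PINS];
};

/* Mark every (chip, pin) slot as unrouted. */
static void routing_table_reset(struct routing_table *rt)
{
	for (size_t i = 0; i < NR_IRQCHIPS; i++)
		for (size_t j = 0; j < IRQCHIP_NUM_PINS; j++)
			rt->chip[i][j] = -1;
}

/* Record that pin `pin` on chip `chip` is driven by `gsi`. */
static void routing_table_set(struct routing_table *rt,
			      unsigned chip, unsigned pin, int gsi)
{
	assert(chip < NR_IRQCHIPS && pin < IRQCHIP_NUM_PINS);
	rt->chip[chip][pin] = gsi;
}
```

An architecture with a different interrupt controller would only need to
provide its own pin count under the generic name; the loops stay the same.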

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/x86/include/asm/kvm_host.h |    2 ++
 include/linux/kvm_host.h        |    2 +-
 virt/kvm/irq_comm.c             |    2 +-
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 599f98b..f44c3fe 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -43,6 +43,8 @@
 #define KVM_PIO_PAGE_OFFSET 1
 #define KVM_COALESCED_MMIO_PAGE_OFFSET 2
 
+#define KVM_IRQCHIP_NUM_PINS  KVM_IOAPIC_NUM_PINS
+
 #define CR0_RESERVED_BITS                                               \
 	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
 			  | X86_CR0_ET | X86_CR0_NE | X86_CR0_WP | X86_CR0_AM \
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 93a5005..bf3b1dc 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -307,7 +307,7 @@ struct kvm_kernel_irq_routing_entry {
 #ifdef __KVM_HAVE_IOAPIC
 
 struct kvm_irq_routing_table {
-	int chip[KVM_NR_IRQCHIPS][KVM_IOAPIC_NUM_PINS];
+	int chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS];
 	struct kvm_kernel_irq_routing_entry *rt_entries;
 	u32 nr_rt_entries;
 	/*
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index 25ab480..7c0071d 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -480,7 +480,7 @@ int kvm_set_irq_routing(struct kvm *kvm,
 
 	new->nr_rt_entries = nr_rt_entries;
 	for (i = 0; i < 3; i++)
-		for (j = 0; j < KVM_IOAPIC_NUM_PINS; j++)
+		for (j = 0; j < KVM_IRQCHIP_NUM_PINS; j++)
 			new->chip[i][j] = -1;
 
 	for (i = 0; i < nr; ++i) {
-- 
1.6.0.2


* [PATCH 02/17] KVM: Introduce CONFIG_HAVE_KVM_IRQ_ROUTING
  2013-04-18 14:11 ` Alexander Graf
@ 2013-04-18 14:11   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

Quite a bit of code in KVM has been conditionalized on availability of
IOAPIC emulation. However, most of it is generically applicable to
platforms that don't have an IOAPIC, but a different type of irq chip.

Make code that only relies on IRQ routing, not an APIC itself, depend on
CONFIG_HAVE_KVM_IRQ_ROUTING, so that we can reuse it later.
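
The pattern this relies on can be sketched in isolation: a config symbol
selects either the real functions or no-op stubs, so callers never need
their own #ifdef. Everything named here (CONFIG_HAVE_IRQ_ROUTING_DEMO,
demo_irqfd_init, demo_irqfd_available) is hypothetical and only mimics the
shape of the kvm_irqfd_init()/kvm_irqfd_exit() declarations in the diff.

```c
/* Hypothetical: pretend Kconfig selected the option. Remove to get stubs. */
#define CONFIG_HAVE_IRQ_ROUTING_DEMO 1

#ifdef CONFIG_HAVE_IRQ_ROUTING_DEMO
static int irqfd_ready;

/* Real implementation: set up state and report success. */
static int demo_irqfd_init(void)
{
	irqfd_ready = 1;
	return 0;
}

static int demo_irqfd_available(void)
{
	return irqfd_ready;
}
#else
/* Stubs: the feature is compiled out, but callers still link. */
static inline int demo_irqfd_init(void)
{
	return 0;
}

static inline int demo_irqfd_available(void)
{
	return 0;
}
#endif
```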

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/x86/kvm/Kconfig     |    1 +
 include/linux/kvm_host.h |    6 +++---
 virt/kvm/Kconfig         |    3 +++
 virt/kvm/eventfd.c       |    6 +++---
 virt/kvm/kvm_main.c      |    2 +-
 5 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 586f000..9d50efd 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -29,6 +29,7 @@ config KVM
 	select MMU_NOTIFIER
 	select ANON_INODES
 	select HAVE_KVM_IRQCHIP
+	select HAVE_KVM_IRQ_ROUTING
 	select HAVE_KVM_EVENTFD
 	select KVM_APIC_ARCHITECTURE
 	select KVM_ASYNC_PF
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index bf3b1dc..4215d4f 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -304,7 +304,7 @@ struct kvm_kernel_irq_routing_entry {
 	struct hlist_node link;
 };
 
-#ifdef __KVM_HAVE_IOAPIC
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 
 struct kvm_irq_routing_table {
 	int chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS];
@@ -432,7 +432,7 @@ void kvm_vcpu_uninit(struct kvm_vcpu *vcpu);
 int __must_check vcpu_load(struct kvm_vcpu *vcpu);
 void vcpu_put(struct kvm_vcpu *vcpu);
 
-#ifdef __KVM_HAVE_IOAPIC
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 int kvm_irqfd_init(void);
 void kvm_irqfd_exit(void);
 #else
@@ -957,7 +957,7 @@ static inline int mmu_notifier_retry(struct kvm *kvm, unsigned long mmu_seq)
 }
 #endif
 
-#ifdef KVM_CAP_IRQ_ROUTING
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 
 #define KVM_MAX_IRQ_ROUTES 1024
 
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index d01b24b..779262f 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -6,6 +6,9 @@ config HAVE_KVM
 config HAVE_KVM_IRQCHIP
        bool
 
+config HAVE_KVM_IRQ_ROUTING
+       bool
+
 config HAVE_KVM_EVENTFD
        bool
        select EVENTFD
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index c5d43ff..64ee720 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -35,7 +35,7 @@
 
 #include "iodev.h"
 
-#ifdef __KVM_HAVE_IOAPIC
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 /*
  * --------------------------------------------------------------------
  * irqfd: Allows an fd to be used to inject an interrupt to the guest
@@ -433,7 +433,7 @@ fail:
 void
 kvm_eventfd_init(struct kvm *kvm)
 {
-#ifdef __KVM_HAVE_IOAPIC
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 	spin_lock_init(&kvm->irqfds.lock);
 	INIT_LIST_HEAD(&kvm->irqfds.items);
 	INIT_LIST_HEAD(&kvm->irqfds.resampler_list);
@@ -442,7 +442,7 @@ kvm_eventfd_init(struct kvm *kvm)
 	INIT_LIST_HEAD(&kvm->ioeventfds);
 }
 
-#ifdef __KVM_HAVE_IOAPIC
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 /*
  * shutdown any irqfd's that match fd+gsi
  */
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index aaac1a7..2c3b226 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2404,7 +2404,7 @@ static long kvm_dev_ioctl_check_extension_generic(long arg)
 	case KVM_CAP_SIGNAL_MSI:
 #endif
 		return 1;
-#ifdef KVM_CAP_IRQ_ROUTING
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 	case KVM_CAP_IRQ_ROUTING:
 		return KVM_MAX_IRQ_ROUTES;
 #endif
-- 
1.6.0.2


* [PATCH 03/17] KVM: Drop __KVM_HAVE_IOAPIC condition on irq routing
  2013-04-18 14:11 ` Alexander Graf
@ 2013-04-18 14:11   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

We have a capability enquiry system that allows user space to ask kvm
whether a feature is available.

The point behind this system is that we can have different kernel
configurations with different capabilities and user space can adjust
accordingly.

Because any feature can be absent, we can drop any #ifdefs
on CAP defines that could be used generically, like the irq routing
bits. These can be easily reused for non-IOAPIC systems as well.
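
For illustration, user space could probe the capability at run time roughly
as follows. This is a minimal sketch, assuming kvm_fd was obtained by
opening /dev/kvm; the helper name query_irq_routing is made up for the
example.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Ask KVM whether irq routing is available.
 * Returns 0 if unsupported, a positive value if supported,
 * or -errno if the query itself failed.
 */
static int query_irq_routing(int kvm_fd)
{
	int ret = ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_IRQ_ROUTING);

	return ret < 0 ? -errno : ret;
}
```

Since the define is unconditional after this patch, such a probe compiles on
every architecture and simply reports 0 where the kernel lacks the feature.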

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 include/uapi/linux/kvm.h |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 74d0ff3..c741902 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -579,9 +579,7 @@ struct kvm_ppc_smmu_info {
 #ifdef __KVM_HAVE_PIT
 #define KVM_CAP_REINJECT_CONTROL 24
 #endif
-#ifdef __KVM_HAVE_IOAPIC
 #define KVM_CAP_IRQ_ROUTING 25
-#endif
 #define KVM_CAP_IRQ_INJECT_STATUS 26
 #ifdef __KVM_HAVE_DEVICE_ASSIGNMENT
 #define KVM_CAP_DEVICE_DEASSIGNMENT 27
-- 
1.6.0.2

* [PATCH 04/17] KVM: Remove kvm_get_intr_delivery_bitmask
  2013-04-18 14:11 ` Alexander Graf
@ 2013-04-18 14:11   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

The prototype has been stale for a while; I can't spot any real function
definition behind it. Let's just remove it.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 include/linux/kvm_host.h |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 4215d4f..a7bfe9d 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -719,11 +719,6 @@ void kvm_unregister_irq_mask_notifier(struct kvm *kvm, int irq,
 void kvm_fire_mask_notifiers(struct kvm *kvm, unsigned irqchip, unsigned pin,
 			     bool mask);
 
-#ifdef __KVM_HAVE_IOAPIC
-void kvm_get_intr_delivery_bitmask(struct kvm_ioapic *ioapic,
-				   union kvm_ioapic_redirect_entry *entry,
-				   unsigned long *deliver_bitmask);
-#endif
 int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
 		bool line_status);
 int kvm_set_irq_inatomic(struct kvm *kvm, int irq_source_id, u32 irq, int level);
-- 
1.6.0.2


* [PATCH 05/17] KVM: Move irq routing to generic code
  2013-04-18 14:11 ` Alexander Graf
@ 2013-04-18 14:11   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

The IRQ routing set ioctl lives in the hacky device assignment code inside
of KVM today. This is definitely the wrong place for it. Move it to the much
more natural kvm_main.c.
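
The checks the moved handler performs can be modelled in user space as
below. This is only a sketch of the validation order visible in the diff
(bounds check, flags check, allocation, copy); struct route_entry and
copy_routes are hypothetical names. Unlike the handler, which frees its
buffer after calling kvm_set_irq_routing(), this model hands the copy back
to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_IRQ_ROUTES 1024  /* mirrors KVM_MAX_IRQ_ROUTES */

struct route_entry { uint32_t gsi, type, flags, pad; };  /* hypothetical */

/*
 * Validate a routing request and take a private copy of its entries.
 * On success returns 0 and stores a malloc'ed copy in *out (caller frees).
 */
static int copy_routes(const struct route_entry *src, uint32_t nr,
		       uint32_t flags, struct route_entry **out)
{
	struct route_entry *copy;

	assert(out != NULL);
	if (nr >= MAX_IRQ_ROUTES || flags)
		return -EINVAL;         /* too many routes, or unknown flags */
	copy = malloc((nr ? nr : 1) * sizeof(*copy));
	if (!copy)
		return -ENOMEM;
	if (nr)
		memcpy(copy, src, nr * sizeof(*copy));
	*out = copy;
	return 0;
}
```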

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 virt/kvm/assigned-dev.c |   30 ------------------------------
 virt/kvm/kvm_main.c     |   30 ++++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
index f4c7f59..8db4370 100644
--- a/virt/kvm/assigned-dev.c
+++ b/virt/kvm/assigned-dev.c
@@ -983,36 +983,6 @@ long kvm_vm_ioctl_assigned_device(struct kvm *kvm, unsigned ioctl,
 			goto out;
 		break;
 	}
-#ifdef KVM_CAP_IRQ_ROUTING
-	case KVM_SET_GSI_ROUTING: {
-		struct kvm_irq_routing routing;
-		struct kvm_irq_routing __user *urouting;
-		struct kvm_irq_routing_entry *entries;
-
-		r = -EFAULT;
-		if (copy_from_user(&routing, argp, sizeof(routing)))
-			goto out;
-		r = -EINVAL;
-		if (routing.nr >= KVM_MAX_IRQ_ROUTES)
-			goto out;
-		if (routing.flags)
-			goto out;
-		r = -ENOMEM;
-		entries = vmalloc(routing.nr * sizeof(*entries));
-		if (!entries)
-			goto out;
-		r = -EFAULT;
-		urouting = argp;
-		if (copy_from_user(entries, urouting->entries,
-				   routing.nr * sizeof(*entries)))
-			goto out_free_irq_routing;
-		r = kvm_set_irq_routing(kvm, entries, routing.nr,
-					routing.flags);
-	out_free_irq_routing:
-		vfree(entries);
-		break;
-	}
-#endif /* KVM_CAP_IRQ_ROUTING */
 #ifdef __KVM_HAVE_MSIX
 	case KVM_ASSIGN_SET_MSIX_NR: {
 		struct kvm_assigned_msix_nr entry_nr;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2c3b226..b6f3354 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2274,6 +2274,36 @@ static long kvm_vm_ioctl(struct file *filp,
 		break;
 	}
 #endif
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
+	case KVM_SET_GSI_ROUTING: {
+		struct kvm_irq_routing routing;
+		struct kvm_irq_routing __user *urouting;
+		struct kvm_irq_routing_entry *entries;
+
+		r = -EFAULT;
+		if (copy_from_user(&routing, argp, sizeof(routing)))
+			goto out;
+		r = -EINVAL;
+		if (routing.nr >= KVM_MAX_IRQ_ROUTES)
+			goto out;
+		if (routing.flags)
+			goto out;
+		r = -ENOMEM;
+		entries = vmalloc(routing.nr * sizeof(*entries));
+		if (!entries)
+			goto out;
+		r = -EFAULT;
+		urouting = argp;
+		if (copy_from_user(entries, urouting->entries,
+				   routing.nr * sizeof(*entries)))
+			goto out_free_irq_routing;
+		r = kvm_set_irq_routing(kvm, entries, routing.nr,
+					routing.flags);
+	out_free_irq_routing:
+		vfree(entries);
+		break;
+	}
+#endif /* CONFIG_HAVE_KVM_IRQ_ROUTING */
 	default:
 		r = kvm_arch_vm_ioctl(filp, ioctl, arg);
 		if (r == -ENOTTY)
-- 
1.6.0.2


* [PATCH 06/17] KVM: Extract generic irqchip logic into irqchip.c
  2013-04-18 14:11 ` Alexander Graf
@ 2013-04-18 14:11   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

The current irq_comm.c file contains pieces of code that are generic
across different irqchip implementations, as well as code that is
fully IOAPIC specific.

Split the generic bits out into irqchip.c.
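
One consequence shows up in the kvm_ack_irq trace event in the diff below:
symbolic chip names are only printed when the architecture defines
kvm_irqchips, with a numeric fallback otherwise. A stand-alone sketch of
that idea, using the made-up names demo_irqchips and format_ack:

```c
#include <stdio.h>

/* Hypothetical: an arch header may or may not provide a name table. */
#define demo_irqchips { "PIC master", "PIC slave", "IOAPIC" }

/* Format an ack message, using a chip name if one is known. */
static int format_ack(char *buf, size_t len, unsigned chip, unsigned pin)
{
#ifdef demo_irqchips
	static const char *const names[] = demo_irqchips;

	if (chip < sizeof(names) / sizeof(names[0]))
		return snprintf(buf, len, "irqchip %s pin %u",
				names[chip], pin);
#endif
	/* No table, or chip outside it: fall back to the number. */
	return snprintf(buf, len, "irqchip %u pin %u", chip, pin);
}
```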

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/x86/kvm/Makefile      |    2 +-
 include/trace/events/kvm.h |   12 +++-
 virt/kvm/irq_comm.c        |  118 ----------------------------------
 virt/kvm/irqchip.c         |  152 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 163 insertions(+), 121 deletions(-)
 create mode 100644 virt/kvm/irqchip.c
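For readers following the move, the return-value convention documented above
kvm_set_irq() can be modelled in userspace. This is only a sketch, not kernel
code: combine_delivery_results() and its array argument are hypothetical
stand-ins for the loop over irq_set[], and only the accumulation expression is
taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model of the accumulation loop in kvm_set_irq().
 * results[] stands in for the values returned by each routing
 * entry's ->set() callback (hypothetical input, not a KVM API).
 */
static int combine_delivery_results(const int *results, size_t n)
{
	int ret = -1;	/* "ignored" unless some entry reports otherwise */

	assert(results != NULL || n == 0);

	while (n--) {
		int r = results[n];

		if (r < 0)
			continue;	/* this entry ignored the interrupt */

		/* same expression as in the patch */
		ret = r + ((ret < 0) ? 0 : ret);
	}

	return ret;
}
```

With this rule a negative result only survives if every entry ignored the
interrupt, a result of 0 means at least one entry coalesced it and none
delivered it, and positive results add up across entries.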

diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 04d3040..a797b8e 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -7,7 +7,7 @@ CFLAGS_vmx.o := -I.
 
 kvm-y			+= $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o \
 				coalesced_mmio.o irq_comm.o eventfd.o \
-				assigned-dev.o)
+				assigned-dev.o irqchip.o)
 kvm-$(CONFIG_IOMMU_API)	+= $(addprefix ../../../virt/kvm/, iommu.o)
 kvm-$(CONFIG_KVM_ASYNC_PF)	+= $(addprefix ../../../virt/kvm/, async_pf.o)
 
diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h
index 19911dd..7005d11 100644
--- a/include/trace/events/kvm.h
+++ b/include/trace/events/kvm.h
@@ -37,7 +37,7 @@ TRACE_EVENT(kvm_userspace_exit,
 		  __entry->errno < 0 ? -__entry->errno : __entry->reason)
 );
 
-#if defined(__KVM_HAVE_IRQ_LINE)
+#if defined(CONFIG_HAVE_KVM_IRQCHIP)
 TRACE_EVENT(kvm_set_irq,
 	TP_PROTO(unsigned int gsi, int level, int irq_source_id),
 	TP_ARGS(gsi, level, irq_source_id),
@@ -122,6 +122,10 @@ TRACE_EVENT(kvm_msi_set_irq,
 	{KVM_IRQCHIP_PIC_SLAVE,		"PIC slave"},		\
 	{KVM_IRQCHIP_IOAPIC,		"IOAPIC"}
 
+#endif /* defined(__KVM_HAVE_IOAPIC) */
+
+#if defined(CONFIG_HAVE_KVM_IRQCHIP)
+
 TRACE_EVENT(kvm_ack_irq,
 	TP_PROTO(unsigned int irqchip, unsigned int pin),
 	TP_ARGS(irqchip, pin),
@@ -136,14 +140,18 @@ TRACE_EVENT(kvm_ack_irq,
 		__entry->pin		= pin;
 	),
 
+#ifdef kvm_irqchips
 	TP_printk("irqchip %s pin %u",
 		  __print_symbolic(__entry->irqchip, kvm_irqchips),
 		 __entry->pin)
+#else
+	TP_printk("irqchip %d pin %u", __entry->irqchip, __entry->pin)
+#endif
 );
 
+#endif /* defined(CONFIG_HAVE_KVM_IRQCHIP) */
 
 
-#endif /* defined(__KVM_HAVE_IOAPIC) */
 
 #define KVM_TRACE_MMIO_READ_UNSATISFIED 0
 #define KVM_TRACE_MMIO_READ 1
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index 7c0071d..d5008f4 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -151,59 +151,6 @@ static int kvm_set_msi_inatomic(struct kvm_kernel_irq_routing_entry *e,
 		return -EWOULDBLOCK;
 }
 
-int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi)
-{
-	struct kvm_kernel_irq_routing_entry route;
-
-	if (!irqchip_in_kernel(kvm) || msi->flags != 0)
-		return -EINVAL;
-
-	route.msi.address_lo = msi->address_lo;
-	route.msi.address_hi = msi->address_hi;
-	route.msi.data = msi->data;
-
-	return kvm_set_msi(&route, kvm, KVM_USERSPACE_IRQ_SOURCE_ID, 1, false);
-}
-
-/*
- * Return value:
- *  < 0   Interrupt was ignored (masked or not delivered for other reasons)
- *  = 0   Interrupt was coalesced (previous irq is still pending)
- *  > 0   Number of CPUs interrupt was delivered to
- */
-int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
-		bool line_status)
-{
-	struct kvm_kernel_irq_routing_entry *e, irq_set[KVM_NR_IRQCHIPS];
-	int ret = -1, i = 0;
-	struct kvm_irq_routing_table *irq_rt;
-
-	trace_kvm_set_irq(irq, level, irq_source_id);
-
-	/* Not possible to detect if the guest uses the PIC or the
-	 * IOAPIC.  So set the bit in both. The guest will ignore
-	 * writes to the unused one.
-	 */
-	rcu_read_lock();
-	irq_rt = rcu_dereference(kvm->irq_routing);
-	if (irq < irq_rt->nr_rt_entries)
-		hlist_for_each_entry(e, &irq_rt->map[irq], link)
-			irq_set[i++] = *e;
-	rcu_read_unlock();
-
-	while(i--) {
-		int r;
-		r = irq_set[i].set(&irq_set[i], kvm, irq_source_id, level,
-				line_status);
-		if (r < 0)
-			continue;
-
-		ret = r + ((ret < 0) ? 0 : ret);
-	}
-
-	return ret;
-}
-
 /*
  * Deliver an IRQ in an atomic context if we can, or return a failure,
  * user can retry in a process context.
@@ -241,63 +188,6 @@ int kvm_set_irq_inatomic(struct kvm *kvm, int irq_source_id, u32 irq, int level)
 	return ret;
 }
 
-bool kvm_irq_has_notifier(struct kvm *kvm, unsigned irqchip, unsigned pin)
-{
-	struct kvm_irq_ack_notifier *kian;
-	int gsi;
-
-	rcu_read_lock();
-	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
-	if (gsi != -1)
-		hlist_for_each_entry_rcu(kian, &kvm->irq_ack_notifier_list,
-					 link)
-			if (kian->gsi == gsi) {
-				rcu_read_unlock();
-				return true;
-			}
-
-	rcu_read_unlock();
-
-	return false;
-}
-EXPORT_SYMBOL_GPL(kvm_irq_has_notifier);
-
-void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
-{
-	struct kvm_irq_ack_notifier *kian;
-	int gsi;
-
-	trace_kvm_ack_irq(irqchip, pin);
-
-	rcu_read_lock();
-	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
-	if (gsi != -1)
-		hlist_for_each_entry_rcu(kian, &kvm->irq_ack_notifier_list,
-					 link)
-			if (kian->gsi == gsi)
-				kian->irq_acked(kian);
-	rcu_read_unlock();
-}
-
-void kvm_register_irq_ack_notifier(struct kvm *kvm,
-				   struct kvm_irq_ack_notifier *kian)
-{
-	mutex_lock(&kvm->irq_lock);
-	hlist_add_head_rcu(&kian->link, &kvm->irq_ack_notifier_list);
-	mutex_unlock(&kvm->irq_lock);
-	kvm_vcpu_request_scan_ioapic(kvm);
-}
-
-void kvm_unregister_irq_ack_notifier(struct kvm *kvm,
-				    struct kvm_irq_ack_notifier *kian)
-{
-	mutex_lock(&kvm->irq_lock);
-	hlist_del_init_rcu(&kian->link);
-	mutex_unlock(&kvm->irq_lock);
-	synchronize_rcu();
-	kvm_vcpu_request_scan_ioapic(kvm);
-}
-
 int kvm_request_irq_source_id(struct kvm *kvm)
 {
 	unsigned long *bitmap = &kvm->arch.irq_sources_bitmap;
@@ -381,13 +271,6 @@ void kvm_fire_mask_notifiers(struct kvm *kvm, unsigned irqchip, unsigned pin,
 	rcu_read_unlock();
 }
 
-void kvm_free_irq_routing(struct kvm *kvm)
-{
-	/* Called only during vm destruction. Nobody can use the pointer
-	   at this stage */
-	kfree(kvm->irq_routing);
-}
-
 static int setup_routing_entry(struct kvm_irq_routing_table *rt,
 			       struct kvm_kernel_irq_routing_entry *e,
 			       const struct kvm_irq_routing_entry *ue)
@@ -451,7 +334,6 @@ out:
 	return r;
 }
 
-
 int kvm_set_irq_routing(struct kvm *kvm,
 			const struct kvm_irq_routing_entry *ue,
 			unsigned nr,
diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
new file mode 100644
index 0000000..12f7f26
--- /dev/null
+++ b/virt/kvm/irqchip.c
@@ -0,0 +1,152 @@
+/*
+ * irqchip.c: Common API for in kernel interrupt controllers
+ * Copyright (c) 2007, Intel Corporation.
+ * Copyright 2010 Red Hat, Inc. and/or its affiliates.
+ * Copyright (c) 2013, Alexander Graf <agraf@suse.de>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ *
+ * This file is derived from virt/kvm/irq_comm.c.
+ *
+ * Authors:
+ *   Yaozu (Eddie) Dong <Eddie.dong@intel.com>
+ *   Alexander Graf <agraf@suse.de>
+ */
+
+#include <linux/kvm_host.h>
+#include <linux/slab.h>
+#include <linux/export.h>
+#include <trace/events/kvm.h>
+#include "irq.h"
+
+bool kvm_irq_has_notifier(struct kvm *kvm, unsigned irqchip, unsigned pin)
+{
+	struct kvm_irq_ack_notifier *kian;
+	int gsi;
+
+	rcu_read_lock();
+	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
+	if (gsi != -1)
+		hlist_for_each_entry_rcu(kian, &kvm->irq_ack_notifier_list,
+					 link)
+			if (kian->gsi == gsi) {
+				rcu_read_unlock();
+				return true;
+			}
+
+	rcu_read_unlock();
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(kvm_irq_has_notifier);
+
+void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
+{
+	struct kvm_irq_ack_notifier *kian;
+	int gsi;
+
+	trace_kvm_ack_irq(irqchip, pin);
+
+	rcu_read_lock();
+	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
+	if (gsi != -1)
+		hlist_for_each_entry_rcu(kian, &kvm->irq_ack_notifier_list,
+					 link)
+			if (kian->gsi == gsi)
+				kian->irq_acked(kian);
+	rcu_read_unlock();
+}
+
+void kvm_register_irq_ack_notifier(struct kvm *kvm,
+				   struct kvm_irq_ack_notifier *kian)
+{
+	mutex_lock(&kvm->irq_lock);
+	hlist_add_head_rcu(&kian->link, &kvm->irq_ack_notifier_list);
+	mutex_unlock(&kvm->irq_lock);
+#ifdef __KVM_HAVE_IOAPIC
+	kvm_vcpu_request_scan_ioapic(kvm);
+#endif
+}
+
+void kvm_unregister_irq_ack_notifier(struct kvm *kvm,
+				    struct kvm_irq_ack_notifier *kian)
+{
+	mutex_lock(&kvm->irq_lock);
+	hlist_del_init_rcu(&kian->link);
+	mutex_unlock(&kvm->irq_lock);
+	synchronize_rcu();
+#ifdef __KVM_HAVE_IOAPIC
+	kvm_vcpu_request_scan_ioapic(kvm);
+#endif
+}
+
+int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi)
+{
+	struct kvm_kernel_irq_routing_entry route;
+
+	if (!irqchip_in_kernel(kvm) || msi->flags != 0)
+		return -EINVAL;
+
+	route.msi.address_lo = msi->address_lo;
+	route.msi.address_hi = msi->address_hi;
+	route.msi.data = msi->data;
+
+	return kvm_set_msi(&route, kvm, KVM_USERSPACE_IRQ_SOURCE_ID, 1, false);
+}
+
+/*
+ * Return value:
+ *  < 0   Interrupt was ignored (masked or not delivered for other reasons)
+ *  = 0   Interrupt was coalesced (previous irq is still pending)
+ *  > 0   Number of CPUs interrupt was delivered to
+ */
+int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
+		bool line_status)
+{
+	struct kvm_kernel_irq_routing_entry *e, irq_set[KVM_NR_IRQCHIPS];
+	int ret = -1, i = 0;
+	struct kvm_irq_routing_table *irq_rt;
+
+	trace_kvm_set_irq(irq, level, irq_source_id);
+
+	/* Not possible to detect if the guest uses the PIC or the
+	 * IOAPIC.  So set the bit in both. The guest will ignore
+	 * writes to the unused one.
+	 */
+	rcu_read_lock();
+	irq_rt = rcu_dereference(kvm->irq_routing);
+	if (irq < irq_rt->nr_rt_entries)
+		hlist_for_each_entry(e, &irq_rt->map[irq], link)
+			irq_set[i++] = *e;
+	rcu_read_unlock();
+
+	while(i--) {
+		int r;
+		r = irq_set[i].set(&irq_set[i], kvm, irq_source_id, level,
+				   line_status);
+		if (r < 0)
+			continue;
+
+		ret = r + ((ret < 0) ? 0 : ret);
+	}
+
+	return ret;
+}
+
+void kvm_free_irq_routing(struct kvm *kvm)
+{
+	/* Called only during vm destruction. Nobody can use the pointer
+	   at this stage */
+	kfree(kvm->irq_routing);
+}
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 07/17] KVM: Move irq routing setup to irqchip.c
  2013-04-18 14:11 ` Alexander Graf
@ 2013-04-18 14:11   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

Setting up IRQ routes is not IOAPIC specific. Extract everything
that really is generic code into irqchip.c and leave only the IOAPIC
specific bits in irq_comm.c.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 include/linux/kvm_host.h |    3 ++
 virt/kvm/irq_comm.c      |   76 ++---------------------------------------
 virt/kvm/irqchip.c       |   85 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 91 insertions(+), 73 deletions(-)
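The split this patch makes can be pictured with a small userspace model. This
is a sketch under stated assumptions, not the kernel implementation: every
model_* name below is hypothetical, the entry structure is heavily simplified,
and the validation inside model_set_routing_entry() is invented for
illustration. Only the ordering (generic duplicate check, then the
chip-specific hook, then linking the entry) mirrors the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for KVM_IRQ_ROUTING_IRQCHIP / KVM_IRQ_ROUTING_MSI. */
enum model_route_type { MODEL_ROUTE_IRQCHIP = 1, MODEL_ROUTE_MSI = 2 };

/* Simplified routing entry; the real structures carry a union and callbacks. */
struct model_route {
	unsigned gsi;
	enum model_route_type type;
	unsigned irqchip;	/* only meaningful for MODEL_ROUTE_IRQCHIP */
};

#define MODEL_MAX_ROUTES 8

struct model_table {
	struct model_route entries[MODEL_MAX_ROUTES];
	size_t nr;
};

/*
 * Plays the role of the chip-specific kvm_set_routing_entry() hook.
 * The check here is invented for illustration: accept only known types.
 */
static int model_set_routing_entry(const struct model_route *ue)
{
	switch (ue->type) {
	case MODEL_ROUTE_IRQCHIP:
	case MODEL_ROUTE_MSI:
		return 0;
	default:
		return -EINVAL;
	}
}

/*
 * Plays the role of the generic setup_routing_entry(): reject duplicate
 * mappings first, then let the hook validate, then link the entry in.
 */
static int model_setup_routing_entry(struct model_table *rt,
				     const struct model_route *ue)
{
	size_t i;
	int r;

	assert(rt != NULL && ue != NULL);

	for (i = 0; i < rt->nr; i++) {
		const struct model_route *ei = &rt->entries[i];

		if (ei->gsi != ue->gsi)
			continue;
		/* per GSI: at most one MSI route, one route per irqchip */
		if (ei->type == MODEL_ROUTE_MSI ||
		    ue->type == MODEL_ROUTE_MSI ||
		    ue->irqchip == ei->irqchip)
			return -EINVAL;
	}

	if (rt->nr == MODEL_MAX_ROUTES)
		return -ENOMEM;	/* model-only limit, not in the patch */

	r = model_set_routing_entry(ue);
	if (r)
		return r;

	rt->entries[rt->nr++] = *ue;
	return 0;
}
```

In this model a second interrupt controller would only need to supply its own
model_set_routing_entry(); the duplicate check and the table bookkeeping stay
shared.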

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a7bfe9d..dcef724 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -961,6 +961,9 @@ int kvm_set_irq_routing(struct kvm *kvm,
 			const struct kvm_irq_routing_entry *entries,
 			unsigned nr,
 			unsigned flags);
+int kvm_set_routing_entry(struct kvm_irq_routing_table *rt,
+			  struct kvm_kernel_irq_routing_entry *e,
+			  const struct kvm_irq_routing_entry *ue);
 void kvm_free_irq_routing(struct kvm *kvm);
 
 int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi);
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index d5008f4..e2e6b44 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -271,27 +271,14 @@ void kvm_fire_mask_notifiers(struct kvm *kvm, unsigned irqchip, unsigned pin,
 	rcu_read_unlock();
 }
 
-static int setup_routing_entry(struct kvm_irq_routing_table *rt,
-			       struct kvm_kernel_irq_routing_entry *e,
-			       const struct kvm_irq_routing_entry *ue)
+int kvm_set_routing_entry(struct kvm_irq_routing_table *rt,
+			  struct kvm_kernel_irq_routing_entry *e,
+			  const struct kvm_irq_routing_entry *ue)
 {
 	int r = -EINVAL;
 	int delta;
 	unsigned max_pin;
-	struct kvm_kernel_irq_routing_entry *ei;
 
-	/*
-	 * Do not allow GSI to be mapped to the same irqchip more than once.
-	 * Allow only one to one mapping between GSI and MSI.
-	 */
-	hlist_for_each_entry(ei, &rt->map[ue->gsi], link)
-		if (ei->type == KVM_IRQ_ROUTING_MSI ||
-		    ue->type == KVM_IRQ_ROUTING_MSI ||
-		    ue->u.irqchip.irqchip == ei->irqchip.irqchip)
-			return r;
-
-	e->gsi = ue->gsi;
-	e->type = ue->type;
 	switch (ue->type) {
 	case KVM_IRQ_ROUTING_IRQCHIP:
 		delta = 0;
@@ -328,68 +315,11 @@ static int setup_routing_entry(struct kvm_irq_routing_table *rt,
 		goto out;
 	}
 
-	hlist_add_head(&e->link, &rt->map[e->gsi]);
 	r = 0;
 out:
 	return r;
 }
 
-int kvm_set_irq_routing(struct kvm *kvm,
-			const struct kvm_irq_routing_entry *ue,
-			unsigned nr,
-			unsigned flags)
-{
-	struct kvm_irq_routing_table *new, *old;
-	u32 i, j, nr_rt_entries = 0;
-	int r;
-
-	for (i = 0; i < nr; ++i) {
-		if (ue[i].gsi >= KVM_MAX_IRQ_ROUTES)
-			return -EINVAL;
-		nr_rt_entries = max(nr_rt_entries, ue[i].gsi);
-	}
-
-	nr_rt_entries += 1;
-
-	new = kzalloc(sizeof(*new) + (nr_rt_entries * sizeof(struct hlist_head))
-		      + (nr * sizeof(struct kvm_kernel_irq_routing_entry)),
-		      GFP_KERNEL);
-
-	if (!new)
-		return -ENOMEM;
-
-	new->rt_entries = (void *)&new->map[nr_rt_entries];
-
-	new->nr_rt_entries = nr_rt_entries;
-	for (i = 0; i < 3; i++)
-		for (j = 0; j < KVM_IRQCHIP_NUM_PINS; j++)
-			new->chip[i][j] = -1;
-
-	for (i = 0; i < nr; ++i) {
-		r = -EINVAL;
-		if (ue->flags)
-			goto out;
-		r = setup_routing_entry(new, &new->rt_entries[i], ue);
-		if (r)
-			goto out;
-		++ue;
-	}
-
-	mutex_lock(&kvm->irq_lock);
-	old = kvm->irq_routing;
-	kvm_irq_routing_update(kvm, new);
-	mutex_unlock(&kvm->irq_lock);
-
-	synchronize_rcu();
-
-	new = old;
-	r = 0;
-
-out:
-	kfree(new);
-	return r;
-}
-
 #define IOAPIC_ROUTING_ENTRY(irq) \
 	{ .gsi = irq, .type = KVM_IRQ_ROUTING_IRQCHIP,	\
 	  .u.irqchip.irqchip = KVM_IRQCHIP_IOAPIC, .u.irqchip.pin = (irq) }
diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
index 12f7f26..20dc9e4 100644
--- a/virt/kvm/irqchip.c
+++ b/virt/kvm/irqchip.c
@@ -150,3 +150,88 @@ void kvm_free_irq_routing(struct kvm *kvm)
 	   at this stage */
 	kfree(kvm->irq_routing);
 }
+
+static int setup_routing_entry(struct kvm_irq_routing_table *rt,
+			       struct kvm_kernel_irq_routing_entry *e,
+			       const struct kvm_irq_routing_entry *ue)
+{
+	int r = -EINVAL;
+	struct kvm_kernel_irq_routing_entry *ei;
+
+	/*
+	 * Do not allow GSI to be mapped to the same irqchip more than once.
+	 * Allow only one to one mapping between GSI and MSI.
+	 */
+	hlist_for_each_entry(ei, &rt->map[ue->gsi], link)
+		if (ei->type == KVM_IRQ_ROUTING_MSI ||
+		    ue->type == KVM_IRQ_ROUTING_MSI ||
+		    ue->u.irqchip.irqchip == ei->irqchip.irqchip)
+			return r;
+
+	e->gsi = ue->gsi;
+	e->type = ue->type;
+	r = kvm_set_routing_entry(rt, e, ue);
+	if (r)
+		goto out;
+
+	hlist_add_head(&e->link, &rt->map[e->gsi]);
+	r = 0;
+out:
+	return r;
+}
+
+int kvm_set_irq_routing(struct kvm *kvm,
+			const struct kvm_irq_routing_entry *ue,
+			unsigned nr,
+			unsigned flags)
+{
+	struct kvm_irq_routing_table *new, *old;
+	u32 i, j, nr_rt_entries = 0;
+	int r;
+
+	for (i = 0; i < nr; ++i) {
+		if (ue[i].gsi >= KVM_MAX_IRQ_ROUTES)
+			return -EINVAL;
+		nr_rt_entries = max(nr_rt_entries, ue[i].gsi);
+	}
+
+	nr_rt_entries += 1;
+
+	new = kzalloc(sizeof(*new) + (nr_rt_entries * sizeof(struct hlist_head))
+		      + (nr * sizeof(struct kvm_kernel_irq_routing_entry)),
+		      GFP_KERNEL);
+
+	if (!new)
+		return -ENOMEM;
+
+	new->rt_entries = (void *)&new->map[nr_rt_entries];
+
+	new->nr_rt_entries = nr_rt_entries;
+	for (i = 0; i < KVM_NR_IRQCHIPS; i++)
+		for (j = 0; j < KVM_IRQCHIP_NUM_PINS; j++)
+			new->chip[i][j] = -1;
+
+	for (i = 0; i < nr; ++i) {
+		r = -EINVAL;
+		if (ue->flags)
+			goto out;
+		r = setup_routing_entry(new, &new->rt_entries[i], ue);
+		if (r)
+			goto out;
+		++ue;
+	}
+
+	mutex_lock(&kvm->irq_lock);
+	old = kvm->irq_routing;
+	kvm_irq_routing_update(kvm, new);
+	mutex_unlock(&kvm->irq_lock);
+
+	synchronize_rcu();
+
+	new = old;
+	r = 0;
+
+out:
+	kfree(new);
+	return r;
+}
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 07/17] KVM: Move irq routing setup to irqchip.c
@ 2013-04-18 14:11   ` Alexander Graf
  0 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

Setting up IRQ routes is not IOAPIC specific. Extract everything
that really is generic code into irqchip.c and leave only the IOAPIC
specific bits in irq_comm.c.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 include/linux/kvm_host.h |    3 ++
 virt/kvm/irq_comm.c      |   76 ++---------------------------------------
 virt/kvm/irqchip.c       |   85 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 91 insertions(+), 73 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a7bfe9d..dcef724 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -961,6 +961,9 @@ int kvm_set_irq_routing(struct kvm *kvm,
 			const struct kvm_irq_routing_entry *entries,
 			unsigned nr,
 			unsigned flags);
+int kvm_set_routing_entry(struct kvm_irq_routing_table *rt,
+			  struct kvm_kernel_irq_routing_entry *e,
+			  const struct kvm_irq_routing_entry *ue);
 void kvm_free_irq_routing(struct kvm *kvm);
 
 int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi);
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index d5008f4..e2e6b44 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -271,27 +271,14 @@ void kvm_fire_mask_notifiers(struct kvm *kvm, unsigned irqchip, unsigned pin,
 	rcu_read_unlock();
 }
 
-static int setup_routing_entry(struct kvm_irq_routing_table *rt,
-			       struct kvm_kernel_irq_routing_entry *e,
-			       const struct kvm_irq_routing_entry *ue)
+int kvm_set_routing_entry(struct kvm_irq_routing_table *rt,
+			  struct kvm_kernel_irq_routing_entry *e,
+			  const struct kvm_irq_routing_entry *ue)
 {
 	int r = -EINVAL;
 	int delta;
 	unsigned max_pin;
-	struct kvm_kernel_irq_routing_entry *ei;
 
-	/*
-	 * Do not allow GSI to be mapped to the same irqchip more than once.
-	 * Allow only one to one mapping between GSI and MSI.
-	 */
-	hlist_for_each_entry(ei, &rt->map[ue->gsi], link)
-		if (ei->type == KVM_IRQ_ROUTING_MSI ||
-		    ue->type == KVM_IRQ_ROUTING_MSI ||
-		    ue->u.irqchip.irqchip == ei->irqchip.irqchip)
-			return r;
-
-	e->gsi = ue->gsi;
-	e->type = ue->type;
 	switch (ue->type) {
 	case KVM_IRQ_ROUTING_IRQCHIP:
 		delta = 0;
@@ -328,68 +315,11 @@ static int setup_routing_entry(struct kvm_irq_routing_table *rt,
 		goto out;
 	}
 
-	hlist_add_head(&e->link, &rt->map[e->gsi]);
 	r = 0;
 out:
 	return r;
 }
 
-int kvm_set_irq_routing(struct kvm *kvm,
-			const struct kvm_irq_routing_entry *ue,
-			unsigned nr,
-			unsigned flags)
-{
-	struct kvm_irq_routing_table *new, *old;
-	u32 i, j, nr_rt_entries = 0;
-	int r;
-
-	for (i = 0; i < nr; ++i) {
-		if (ue[i].gsi >= KVM_MAX_IRQ_ROUTES)
-			return -EINVAL;
-		nr_rt_entries = max(nr_rt_entries, ue[i].gsi);
-	}
-
-	nr_rt_entries += 1;
-
-	new = kzalloc(sizeof(*new) + (nr_rt_entries * sizeof(struct hlist_head))
-		      + (nr * sizeof(struct kvm_kernel_irq_routing_entry)),
-		      GFP_KERNEL);
-
-	if (!new)
-		return -ENOMEM;
-
-	new->rt_entries = (void *)&new->map[nr_rt_entries];
-
-	new->nr_rt_entries = nr_rt_entries;
-	for (i = 0; i < 3; i++)
-		for (j = 0; j < KVM_IRQCHIP_NUM_PINS; j++)
-			new->chip[i][j] = -1;
-
-	for (i = 0; i < nr; ++i) {
-		r = -EINVAL;
-		if (ue->flags)
-			goto out;
-		r = setup_routing_entry(new, &new->rt_entries[i], ue);
-		if (r)
-			goto out;
-		++ue;
-	}
-
-	mutex_lock(&kvm->irq_lock);
-	old = kvm->irq_routing;
-	kvm_irq_routing_update(kvm, new);
-	mutex_unlock(&kvm->irq_lock);
-
-	synchronize_rcu();
-
-	new = old;
-	r = 0;
-
-out:
-	kfree(new);
-	return r;
-}
-
 #define IOAPIC_ROUTING_ENTRY(irq) \
 	{ .gsi = irq, .type = KVM_IRQ_ROUTING_IRQCHIP,	\
 	  .u.irqchip.irqchip = KVM_IRQCHIP_IOAPIC, .u.irqchip.pin = (irq) }
diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
index 12f7f26..20dc9e4 100644
--- a/virt/kvm/irqchip.c
+++ b/virt/kvm/irqchip.c
@@ -150,3 +150,88 @@ void kvm_free_irq_routing(struct kvm *kvm)
 	   at this stage */
 	kfree(kvm->irq_routing);
 }
+
+static int setup_routing_entry(struct kvm_irq_routing_table *rt,
+			       struct kvm_kernel_irq_routing_entry *e,
+			       const struct kvm_irq_routing_entry *ue)
+{
+	int r = -EINVAL;
+	struct kvm_kernel_irq_routing_entry *ei;
+
+	/*
+	 * Do not allow GSI to be mapped to the same irqchip more than once.
+	 * Allow only one to one mapping between GSI and MSI.
+	 */
+	hlist_for_each_entry(ei, &rt->map[ue->gsi], link)
+		if (ei->type == KVM_IRQ_ROUTING_MSI ||
+		    ue->type == KVM_IRQ_ROUTING_MSI ||
+		    ue->u.irqchip.irqchip == ei->irqchip.irqchip)
+			return r;
+
+	e->gsi = ue->gsi;
+	e->type = ue->type;
+	r = kvm_set_routing_entry(rt, e, ue);
+	if (r)
+		goto out;
+
+	hlist_add_head(&e->link, &rt->map[e->gsi]);
+	r = 0;
+out:
+	return r;
+}
+
+int kvm_set_irq_routing(struct kvm *kvm,
+			const struct kvm_irq_routing_entry *ue,
+			unsigned nr,
+			unsigned flags)
+{
+	struct kvm_irq_routing_table *new, *old;
+	u32 i, j, nr_rt_entries = 0;
+	int r;
+
+	for (i = 0; i < nr; ++i) {
+		if (ue[i].gsi >= KVM_MAX_IRQ_ROUTES)
+			return -EINVAL;
+		nr_rt_entries = max(nr_rt_entries, ue[i].gsi);
+	}
+
+	nr_rt_entries += 1;
+
+	new = kzalloc(sizeof(*new) + (nr_rt_entries * sizeof(struct hlist_head))
+		      + (nr * sizeof(struct kvm_kernel_irq_routing_entry)),
+		      GFP_KERNEL);
+
+	if (!new)
+		return -ENOMEM;
+
+	new->rt_entries = (void *)&new->map[nr_rt_entries];
+
+	new->nr_rt_entries = nr_rt_entries;
+	for (i = 0; i < KVM_NR_IRQCHIPS; i++)
+		for (j = 0; j < KVM_IRQCHIP_NUM_PINS; j++)
+			new->chip[i][j] = -1;
+
+	for (i = 0; i < nr; ++i) {
+		r = -EINVAL;
+		if (ue->flags)
+			goto out;
+		r = setup_routing_entry(new, &new->rt_entries[i], ue);
+		if (r)
+			goto out;
+		++ue;
+	}
+
+	mutex_lock(&kvm->irq_lock);
+	old = kvm->irq_routing;
+	kvm_irq_routing_update(kvm, new);
+	mutex_unlock(&kvm->irq_lock);
+
+	synchronize_rcu();
+
+	new = old;
+	r = 0;
+
+out:
+	kfree(new);
+	return r;
+}
-- 
1.6.0.2



* [PATCH 08/17] KVM: Move irqfd resample cap handling to generic code
  2013-04-18 14:11 ` Alexander Graf
@ 2013-04-18 14:11   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

Now that we have most irqfd code completely platform agnostic, let's move
irqfd's resample capability return to generic code as well.
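
Userspace should see no difference from this move: the capability is still
probed with KVM_CHECK_EXTENSION, only the kernel-side home of the case label
changes.  A minimal sketch of such a probe, assuming a <linux/kvm.h> that
defines KVM_CAP_IRQFD_RESAMPLE; the helper name kvm_has_cap() is made up for
illustration and is not part of this series:

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: returns 1 if the capability is reported on kvm_fd
 * (an open /dev/kvm or VM descriptor), 0 otherwise.  An ioctl failure,
 * for example a bad descriptor, is treated as "not available".
 */
int kvm_has_cap(int kvm_fd, long cap)
{
	int r = ioctl(kvm_fd, KVM_CHECK_EXTENSION, cap);

	return r > 0;
}
```

A caller would typically check kvm_has_cap(fd, KVM_CAP_IRQFD_RESAMPLE) before
passing KVM_IRQFD_FLAG_RESAMPLE to KVM_IRQFD.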

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/x86/kvm/x86.c  |    1 -
 virt/kvm/kvm_main.c |    3 +++
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 50e2e10..888d892 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2513,7 +2513,6 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_PCI_2_3:
 	case KVM_CAP_KVMCLOCK_CTRL:
 	case KVM_CAP_READONLY_MEM:
-	case KVM_CAP_IRQFD_RESAMPLE:
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b6f3354..f9492f3 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2433,6 +2433,9 @@ static long kvm_dev_ioctl_check_extension_generic(long arg)
 #ifdef CONFIG_HAVE_KVM_MSI
 	case KVM_CAP_SIGNAL_MSI:
 #endif
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
+	case KVM_CAP_IRQFD_RESAMPLE:
+#endif
 		return 1;
 #ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 	case KVM_CAP_IRQ_ROUTING:
-- 
1.6.0.2


* [PATCH 09/17] kvm: add device control API
  2013-04-18 14:11 ` Alexander Graf
@ 2013-04-18 14:11   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

Currently, devices that are emulated inside KVM are configured in a
hardcoded manner based on an assumption that any given architecture
only has one way to do it.  If there's any need to access device state,
it is done through inflexible one-purpose-only IOCTLs (e.g.
KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
cumbersome and depletes a limited numberspace.

This API provides a mechanism to instantiate a device of a certain
type, returning an ID that can be used to set/get attributes of the
device.  Attributes may include configuration parameters (e.g.
register base address), device state, operational commands, etc.  It
is similar to the ONE_REG API, except that it acts on devices rather
than vcpus.
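
As a rough illustration of the attribute side, the sketch below writes one
attribute through the descriptor returned by KVM_CREATE_DEVICE.  It assumes
a <linux/kvm.h> that already carries the definitions added by this patch.
The helper name and the 32-bit attribute width are assumptions made for the
example; the real size is defined by the particular attribute:

```c
#include <linux/kvm.h>
#include <stdint.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: write a 32-bit value to one attribute of a device.
 * dev_fd is the descriptor returned in kvm_create_device.fd.  The group and
 * attr numbers, and the assumption that the attribute is 32 bits wide, must
 * come from the device's own documentation, not from this sketch.
 * Returns 0 on success, -1 with errno set on failure.
 */
int kvm_device_set_u32(int dev_fd, uint32_t group, uint64_t attr, uint32_t val)
{
	struct kvm_device_attr da = {
		.flags = 0,				/* no flags currently defined */
		.group = group,				/* device-defined */
		.attr  = attr,				/* group-defined */
		.addr  = (uint64_t)(uintptr_t)&val,	/* userspace address of data */
	};

	return ioctl(dev_fd, KVM_SET_DEVICE_ATTR, &da);
}
```

The read direction would look the same with KVM_GET_DEVICE_ATTR and a buffer
for the kernel to fill in.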

Both device types and individual attributes can be tested without having
to create the device or get/set the attribute, without the need for
separately managing enumerated capabilities.
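
The two probes mentioned above could be wrapped roughly as follows.  This is
only a sketch under the same assumption about <linux/kvm.h>; both helper
names are hypothetical, and any failure (including a bad descriptor) is
folded into "not supported":

```c
#include <linux/kvm.h>
#include <stdint.h>
#include <sys/ioctl.h>

/*
 * Returns 1 if the kernel knows the device type, 0 otherwise.  With
 * KVM_CREATE_DEVICE_TEST set, no device is created and cd.fd is not used.
 */
int kvm_device_type_supported(int vm_fd, uint32_t type)
{
	struct kvm_create_device cd = {
		.type  = type,
		.flags = KVM_CREATE_DEVICE_TEST,
	};

	return ioctl(vm_fd, KVM_CREATE_DEVICE, &cd) == 0;
}

/*
 * Returns 1 if the device implements group/attr, 0 otherwise.  "addr" is
 * ignored by KVM_HAS_DEVICE_ATTR, so it is left as zero here.
 */
int kvm_device_has_attr(int dev_fd, uint32_t group, uint64_t attr)
{
	struct kvm_device_attr da = {
		.group = group,
		.attr  = attr,
	};

	return ioctl(dev_fd, KVM_HAS_DEVICE_ATTR, &da) == 0;
}
```

Note that a positive answer from the type probe only says the type is
supported, not necessarily that it can be created in the current vm.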

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 Documentation/virtual/kvm/api.txt        |   70 ++++++++++++++++
 Documentation/virtual/kvm/devices/README |    1 +
 include/linux/kvm_host.h                 |   35 ++++++++
 include/uapi/linux/kvm.h                 |   27 ++++++
 virt/kvm/kvm_main.c                      |  129 ++++++++++++++++++++++++++++++
 5 files changed, 262 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/virtual/kvm/devices/README

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 976eb65..d52f3f9 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents from the data
 written, then `n_invalid' invalid entries, invalidating any previously
 valid entries found.
 
+4.79 KVM_CREATE_DEVICE
+
+Capability: KVM_CAP_DEVICE_CTRL
+Type: vm ioctl
+Parameters: struct kvm_create_device (in/out)
+Returns: 0 on success, -1 on error
+Errors:
+  ENODEV: The device type is unknown or unsupported
+  EEXIST: Device already created, and this type of device may not
+          be instantiated multiple times
+
+  Other error conditions may be defined by individual device types or
+  have their standard meanings.
+
+Creates an emulated device in the kernel.  The file descriptor returned
+in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
+
+If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
+device type is supported (not necessarily whether it can be created
+in the current vm).
+
+Individual devices should not define flags.  Attributes should be used
+for specifying any behavior that is not implied by the device type
+number.
+
+struct kvm_create_device {
+	__u32	type;	/* in: KVM_DEV_TYPE_xxx */
+	__u32	fd;	/* out: device handle */
+	__u32	flags;	/* in: KVM_CREATE_DEVICE_xxx */
+};
+
+4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
+
+Capability: KVM_CAP_DEVICE_CTRL
+Type: device ioctl
+Parameters: struct kvm_device_attr
+Returns: 0 on success, -1 on error
+Errors:
+  ENXIO:  The group or attribute is unknown/unsupported for this device
+  EPERM:  The attribute cannot (currently) be accessed this way
+          (e.g. read-only attribute, or attribute that only makes
+          sense when the device is in a different state)
+
+  Other error conditions may be defined by individual device types.
+
+Gets/sets a specified piece of device configuration and/or state.  The
+semantics are device-specific.  See individual device documentation in
+the "devices" directory.  As with ONE_REG, the size of the data
+transferred is defined by the particular attribute.
+
+struct kvm_device_attr {
+	__u32	flags;		/* no flags currently defined */
+	__u32	group;		/* device-defined */
+	__u64	attr;		/* group-defined */
+	__u64	addr;		/* userspace address of attr data */
+};
+
+4.81 KVM_HAS_DEVICE_ATTR
+
+Capability: KVM_CAP_DEVICE_CTRL
+Type: device ioctl
+Parameters: struct kvm_device_attr
+Returns: 0 on success, -1 on error
+Errors:
+  ENXIO:  The group or attribute is unknown/unsupported for this device
+
+Tests whether a device supports a particular attribute.  A successful
+return indicates the attribute is implemented.  It does not necessarily
+indicate that the attribute can be read or written in the device's
+current state.  "addr" is ignored.
 
 4.77 KVM_ARM_VCPU_INIT
 
diff --git a/Documentation/virtual/kvm/devices/README b/Documentation/virtual/kvm/devices/README
new file mode 100644
index 0000000..34a6983
--- /dev/null
+++ b/Documentation/virtual/kvm/devices/README
@@ -0,0 +1 @@
+This directory contains specific device bindings for KVM_CAP_DEVICE_CTRL.
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index dcef724..6dab6b5 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1064,6 +1064,41 @@ static inline bool kvm_check_request(int req, struct kvm_vcpu *vcpu)
 
 extern bool kvm_rebooting;
 
+struct kvm_device_ops;
+
+struct kvm_device {
+	struct kvm_device_ops *ops;
+	struct kvm *kvm;
+	atomic_t users;
+	void *private;
+};
+
+/* create, destroy, and name are mandatory */
+struct kvm_device_ops {
+	const char *name;
+	int (*create)(struct kvm_device *dev, u32 type);
+
+	/*
+	 * Destroy is responsible for freeing dev.
+	 *
+	 * Destroy may be called before or after destructors are called
+	 * on emulated I/O regions, depending on whether a reference is
+	 * held by a vcpu or other kvm component that gets destroyed
+	 * after the emulated I/O.
+	 */
+	void (*destroy)(struct kvm_device *dev);
+
+	int (*set_attr)(struct kvm_device *dev, struct kvm_device_attr *attr);
+	int (*get_attr)(struct kvm_device *dev, struct kvm_device_attr *attr);
+	int (*has_attr)(struct kvm_device *dev, struct kvm_device_attr *attr);
+	long (*ioctl)(struct kvm_device *dev, unsigned int ioctl,
+		      unsigned long arg);
+};
+
+void kvm_device_get(struct kvm_device *dev);
+void kvm_device_put(struct kvm_device *dev);
+struct kvm_device *kvm_device_from_filp(struct file *filp);
+
 #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
 
 static inline void kvm_vcpu_set_in_spin_loop(struct kvm_vcpu *vcpu, bool val)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index c741902..be15aff 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -666,6 +666,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_PPC_EPR 86
 #define KVM_CAP_ARM_PSCI 87
 #define KVM_CAP_ARM_SET_DEVICE_ADDR 88
+#define KVM_CAP_DEVICE_CTRL 89
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -907,6 +908,32 @@ struct kvm_s390_ucas_mapping {
 #define KVM_ARM_SET_DEVICE_ADDR	  _IOW(KVMIO,  0xab, struct kvm_arm_device_addr)
 
 /*
+ * Device control API, available with KVM_CAP_DEVICE_CTRL
+ */
+#define KVM_CREATE_DEVICE_TEST		1
+
+struct kvm_create_device {
+	__u32	type;	/* in: KVM_DEV_TYPE_xxx */
+	__u32	fd;	/* out: device handle */
+	__u32	flags;	/* in: KVM_CREATE_DEVICE_xxx */
+};
+
+struct kvm_device_attr {
+	__u32	flags;		/* no flags currently defined */
+	__u32	group;		/* device-defined */
+	__u64	attr;		/* group-defined */
+	__u64	addr;		/* userspace address of attr data */
+};
+
+/* ioctl for vm fd */
+#define KVM_CREATE_DEVICE	  _IOWR(KVMIO,  0xe0, struct kvm_create_device)
+
+/* ioctls for fds returned by KVM_CREATE_DEVICE */
+#define KVM_SET_DEVICE_ATTR	  _IOW(KVMIO,  0xe1, struct kvm_device_attr)
+#define KVM_GET_DEVICE_ATTR	  _IOW(KVMIO,  0xe2, struct kvm_device_attr)
+#define KVM_HAS_DEVICE_ATTR	  _IOW(KVMIO,  0xe3, struct kvm_device_attr)
+
+/*
  * ioctls for vcpu fds
  */
 #define KVM_RUN                   _IO(KVMIO,   0x80)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f9492f3..5f0d78c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2159,6 +2159,117 @@ out:
 }
 #endif
 
+static int kvm_device_ioctl_attr(struct kvm_device *dev,
+				 int (*accessor)(struct kvm_device *dev,
+						 struct kvm_device_attr *attr),
+				 unsigned long arg)
+{
+	struct kvm_device_attr attr;
+
+	if (!accessor)
+		return -EPERM;
+
+	if (copy_from_user(&attr, (void __user *)arg, sizeof(attr)))
+		return -EFAULT;
+
+	return accessor(dev, &attr);
+}
+
+static long kvm_device_ioctl(struct file *filp, unsigned int ioctl,
+			     unsigned long arg)
+{
+	struct kvm_device *dev = filp->private_data;
+
+	switch (ioctl) {
+	case KVM_SET_DEVICE_ATTR:
+		return kvm_device_ioctl_attr(dev, dev->ops->set_attr, arg);
+	case KVM_GET_DEVICE_ATTR:
+		return kvm_device_ioctl_attr(dev, dev->ops->get_attr, arg);
+	case KVM_HAS_DEVICE_ATTR:
+		return kvm_device_ioctl_attr(dev, dev->ops->has_attr, arg);
+	default:
+		if (dev->ops->ioctl)
+			return dev->ops->ioctl(dev, ioctl, arg);
+
+		return -ENOTTY;
+	}
+}
+
+void kvm_device_get(struct kvm_device *dev)
+{
+	atomic_inc(&dev->users);
+}
+
+void kvm_device_put(struct kvm_device *dev)
+{
+	if (atomic_dec_and_test(&dev->users))
+		dev->ops->destroy(dev);
+}
+
+static int kvm_device_release(struct inode *inode, struct file *filp)
+{
+	struct kvm_device *dev = filp->private_data;
+	struct kvm *kvm = dev->kvm;
+
+	kvm_device_put(dev);
+	kvm_put_kvm(kvm);
+	return 0;
+}
+
+static const struct file_operations kvm_device_fops = {
+	.unlocked_ioctl = kvm_device_ioctl,
+	.release = kvm_device_release,
+};
+
+struct kvm_device *kvm_device_from_filp(struct file *filp)
+{
+	if (filp->f_op != &kvm_device_fops)
+		return NULL;
+
+	return filp->private_data;
+}
+
+static int kvm_ioctl_create_device(struct kvm *kvm,
+				   struct kvm_create_device *cd)
+{
+	struct kvm_device_ops *ops = NULL;
+	struct kvm_device *dev;
+	bool test = cd->flags & KVM_CREATE_DEVICE_TEST;
+	int ret;
+
+	switch (cd->type) {
+	default:
+		return -ENODEV;
+	}
+
+	if (test)
+		return 0;
+
+	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+	if (!dev)
+		return -ENOMEM;
+
+	dev->ops = ops;
+	dev->kvm = kvm;
+	atomic_set(&dev->users, 1);
+
+	ret = ops->create(dev, cd->type);
+	if (ret < 0) {
+		kfree(dev);
+		return ret;
+	}
+
+	ret = anon_inode_getfd(ops->name, &kvm_device_fops, dev, O_RDWR);
+	if (ret < 0) {
+		ops->destroy(dev);
+		return ret;
+	}
+
+	kvm_get_kvm(kvm);
+	cd->fd = ret;
+	return 0;
+}
+
 static long kvm_vm_ioctl(struct file *filp,
 			   unsigned int ioctl, unsigned long arg)
 {
@@ -2304,6 +2415,24 @@ static long kvm_vm_ioctl(struct file *filp,
 		break;
 	}
 #endif /* CONFIG_HAVE_KVM_IRQ_ROUTING */
+	case KVM_CREATE_DEVICE: {
+		struct kvm_create_device cd;
+
+		r = -EFAULT;
+		if (copy_from_user(&cd, argp, sizeof(cd)))
+			goto out;
+
+		r = kvm_ioctl_create_device(kvm, &cd);
+		if (r)
+			goto out;
+
+		r = -EFAULT;
+		if (copy_to_user(argp, &cd, sizeof(cd)))
+			goto out;
+
+		r = 0;
+		break;
+	}
 	default:
 		r = kvm_arch_vm_ioctl(filp, ioctl, arg);
 		if (r == -ENOTTY)
-- 
1.6.0.2

* [PATCH 10/17] kvm/ppc/mpic: import hw/openpic.c from QEMU
  2013-04-18 14:11 ` Alexander Graf
@ 2013-04-18 14:11   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

This is QEMU's hw/openpic.c from commit
abd8d4a4d6dfea7ddea72f095f993e1de941614e ("Update version for
1.4.0-rc0"), run through Lindent with no other changes to ease merging
future changes between Linux and QEMU.  Remaining style issues
(including those introduced by Lindent) will be fixed in a later patch.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/mpic.c | 1686 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 1686 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/mpic.c

diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
new file mode 100644
index 0000000..57655b9
--- /dev/null
+++ b/arch/powerpc/kvm/mpic.c
@@ -0,0 +1,1686 @@
+/*
+ * OpenPIC emulation
+ *
+ * Copyright (c) 2004 Jocelyn Mayer
+ *               2011 Alexander Graf
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+/*
+ *
+ * Based on OpenPic implementations:
+ * - Intel GW80314 I/O companion chip developer's manual
+ * - Motorola MPC8245 & MPC8540 user manuals.
+ * - Motorola MCP750 (aka Raven) programmer manual.
+ * - Motorola Harrier programmer manuel
+ *
+ * Serial interrupts, as implemented in Raven chipset are not supported yet.
+ *
+ */
+#include "hw.h"
+#include "ppc/mac.h"
+#include "pci/pci.h"
+#include "openpic.h"
+#include "sysbus.h"
+#include "pci/msi.h"
+#include "qemu/bitops.h"
+#include "ppc.h"
+
+//#define DEBUG_OPENPIC
+
+#ifdef DEBUG_OPENPIC
+static const int debug_openpic = 1;
+#else
+static const int debug_openpic = 0;
+#endif
+
+#define DPRINTF(fmt, ...) do { \
+        if (debug_openpic) { \
+            printf(fmt , ## __VA_ARGS__); \
+        } \
+    } while (0)
+
+#define MAX_CPU     32
+#define MAX_SRC     256
+#define MAX_TMR     4
+#define MAX_IPI     4
+#define MAX_MSI     8
+#define MAX_IRQ     (MAX_SRC + MAX_IPI + MAX_TMR)
+#define VID         0x03	/* MPIC version ID */
+
+/* OpenPIC capability flags */
+#define OPENPIC_FLAG_IDR_CRIT     (1 << 0)
+#define OPENPIC_FLAG_ILR          (2 << 0)
+
+/* OpenPIC address map */
+#define OPENPIC_GLB_REG_START        0x0
+#define OPENPIC_GLB_REG_SIZE         0x10F0
+#define OPENPIC_TMR_REG_START        0x10F0
+#define OPENPIC_TMR_REG_SIZE         0x220
+#define OPENPIC_MSI_REG_START        0x1600
+#define OPENPIC_MSI_REG_SIZE         0x200
+#define OPENPIC_SUMMARY_REG_START   0x3800
+#define OPENPIC_SUMMARY_REG_SIZE    0x800
+#define OPENPIC_SRC_REG_START        0x10000
+#define OPENPIC_SRC_REG_SIZE         (MAX_SRC * 0x20)
+#define OPENPIC_CPU_REG_START        0x20000
+#define OPENPIC_CPU_REG_SIZE         0x100 + ((MAX_CPU - 1) * 0x1000)
+
+/* Raven */
+#define RAVEN_MAX_CPU      2
+#define RAVEN_MAX_EXT     48
+#define RAVEN_MAX_IRQ     64
+#define RAVEN_MAX_TMR      MAX_TMR
+#define RAVEN_MAX_IPI      MAX_IPI
+
+/* Interrupt definitions */
+#define RAVEN_FE_IRQ     (RAVEN_MAX_EXT)	/* Internal functional IRQ */
+#define RAVEN_ERR_IRQ    (RAVEN_MAX_EXT + 1)	/* Error IRQ */
+#define RAVEN_TMR_IRQ    (RAVEN_MAX_EXT + 2)	/* First timer IRQ */
+#define RAVEN_IPI_IRQ    (RAVEN_TMR_IRQ + RAVEN_MAX_TMR)	/* First IPI IRQ */
+/* First doorbell IRQ */
+#define RAVEN_DBL_IRQ    (RAVEN_IPI_IRQ + (RAVEN_MAX_CPU * RAVEN_MAX_IPI))
+
+typedef struct FslMpicInfo {
+	int max_ext;
+} FslMpicInfo;
+
+static FslMpicInfo fsl_mpic_20 = {
+	.max_ext = 12,
+};
+
+static FslMpicInfo fsl_mpic_42 = {
+	.max_ext = 12,
+};
+
+#define FRR_NIRQ_SHIFT    16
+#define FRR_NCPU_SHIFT     8
+#define FRR_VID_SHIFT      0
+
+#define VID_REVISION_1_2   2
+#define VID_REVISION_1_3   3
+
+#define VIR_GENERIC      0x00000000	/* Generic Vendor ID */
+
+#define GCR_RESET        0x80000000
+#define GCR_MODE_PASS    0x00000000
+#define GCR_MODE_MIXED   0x20000000
+#define GCR_MODE_PROXY   0x60000000
+
+#define TBCR_CI           0x80000000	/* count inhibit */
+#define TCCR_TOG          0x80000000	/* toggles when decrement to zero */
+
+#define IDR_EP_SHIFT      31
+#define IDR_EP_MASK       (1 << IDR_EP_SHIFT)
+#define IDR_CI0_SHIFT     30
+#define IDR_CI1_SHIFT     29
+#define IDR_P1_SHIFT      1
+#define IDR_P0_SHIFT      0
+
+#define ILR_INTTGT_MASK   0x000000ff
+#define ILR_INTTGT_INT    0x00
+#define ILR_INTTGT_CINT   0x01	/* critical */
+#define ILR_INTTGT_MCP    0x02	/* machine check */
+
+/* The currently supported INTTGT values happen to be the same as QEMU's
+ * openpic output codes, but don't depend on this.  The output codes
+ * could change (unlikely, but...) or support could be added for
+ * more INTTGT values.
+ */
+static const int inttgt_output[][2] = {
+	{ILR_INTTGT_INT, OPENPIC_OUTPUT_INT},
+	{ILR_INTTGT_CINT, OPENPIC_OUTPUT_CINT},
+	{ILR_INTTGT_MCP, OPENPIC_OUTPUT_MCK},
+};
+
+static int inttgt_to_output(int inttgt)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(inttgt_output); i++) {
+		if (inttgt_output[i][0] == inttgt) {
+			return inttgt_output[i][1];
+		}
+	}
+
+	fprintf(stderr, "%s: unsupported inttgt %d\n", __func__, inttgt);
+	return OPENPIC_OUTPUT_INT;
+}
+
+static int output_to_inttgt(int output)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(inttgt_output); i++) {
+		if (inttgt_output[i][1] == output) {
+			return inttgt_output[i][0];
+		}
+	}
+
+	abort();
+}
+
+#define MSIIR_OFFSET       0x140
+#define MSIIR_SRS_SHIFT    29
+#define MSIIR_SRS_MASK     (0x7 << MSIIR_SRS_SHIFT)
+#define MSIIR_IBS_SHIFT    24
+#define MSIIR_IBS_MASK     (0x1f << MSIIR_IBS_SHIFT)
+
+static int get_current_cpu(void)
+{
+	CPUState *cpu_single_cpu;
+
+	if (!cpu_single_env) {
+		return -1;
+	}
+
+	cpu_single_cpu = ENV_GET_CPU(cpu_single_env);
+	return cpu_single_cpu->cpu_index;
+}
+
+static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx);
+static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
+				       uint32_t val, int idx);
+
+typedef enum IRQType {
+	IRQ_TYPE_NORMAL = 0,
+	IRQ_TYPE_FSLINT,	/* FSL internal interrupt -- level only */
+	IRQ_TYPE_FSLSPECIAL,	/* FSL timer/IPI interrupt, edge, no polarity */
+} IRQType;
+
+typedef struct IRQQueue {
+	/* Round up to the nearest 64 IRQs so that the queue length
+	 * won't change when moving between 32 and 64 bit hosts.
+	 */
+	unsigned long queue[BITS_TO_LONGS((MAX_IRQ + 63) & ~63)];
+	int next;
+	int priority;
+} IRQQueue;
+
+typedef struct IRQSource {
+	uint32_t ivpr;		/* IRQ vector/priority register */
+	uint32_t idr;		/* IRQ destination register */
+	uint32_t destmask;	/* bitmap of CPU destinations */
+	int last_cpu;
+	int output;		/* IRQ level, e.g. OPENPIC_OUTPUT_INT */
+	int pending;		/* TRUE if IRQ is pending */
+	IRQType type;
+	bool level:1;		/* level-triggered */
+	bool nomask:1;		/* critical interrupts ignore mask on some FSL MPICs */
+} IRQSource;
+
+#define IVPR_MASK_SHIFT       31
+#define IVPR_MASK_MASK        (1 << IVPR_MASK_SHIFT)
+#define IVPR_ACTIVITY_SHIFT   30
+#define IVPR_ACTIVITY_MASK    (1 << IVPR_ACTIVITY_SHIFT)
+#define IVPR_MODE_SHIFT       29
+#define IVPR_MODE_MASK        (1 << IVPR_MODE_SHIFT)
+#define IVPR_POLARITY_SHIFT   23
+#define IVPR_POLARITY_MASK    (1 << IVPR_POLARITY_SHIFT)
+#define IVPR_SENSE_SHIFT      22
+#define IVPR_SENSE_MASK       (1 << IVPR_SENSE_SHIFT)
+
+#define IVPR_PRIORITY_MASK     (0xF << 16)
+#define IVPR_PRIORITY(_ivprr_) ((int)(((_ivprr_) & IVPR_PRIORITY_MASK) >> 16))
+#define IVPR_VECTOR(opp, _ivprr_) ((_ivprr_) & (opp)->vector_mask)
+
+/* IDR[EP/CI] are only for FSL MPIC prior to v4.0 */
+#define IDR_EP      0x80000000	/* external pin */
+#define IDR_CI      0x40000000	/* critical interrupt */
+
+typedef struct IRQDest {
+	int32_t ctpr;		/* CPU current task priority */
+	IRQQueue raised;
+	IRQQueue servicing;
+	qemu_irq *irqs;
+
+	/* Count of IRQ sources asserting on non-INT outputs */
+	uint32_t outputs_active[OPENPIC_OUTPUT_NB];
+} IRQDest;
+
+typedef struct OpenPICState {
+	SysBusDevice busdev;
+	MemoryRegion mem;
+
+	/* Behavior control */
+	FslMpicInfo *fsl;
+	uint32_t model;
+	uint32_t flags;
+	uint32_t nb_irqs;
+	uint32_t vid;
+	uint32_t vir;		/* Vendor identification register */
+	uint32_t vector_mask;
+	uint32_t tfrr_reset;
+	uint32_t ivpr_reset;
+	uint32_t idr_reset;
+	uint32_t brr1;
+	uint32_t mpic_mode_mask;
+
+	/* Sub-regions */
+	MemoryRegion sub_io_mem[6];
+
+	/* Global registers */
+	uint32_t frr;		/* Feature reporting register */
+	uint32_t gcr;		/* Global configuration register  */
+	uint32_t pir;		/* Processor initialization register */
+	uint32_t spve;		/* Spurious vector register */
+	uint32_t tfrr;		/* Timer frequency reporting register */
+	/* Source registers */
+	IRQSource src[MAX_IRQ];
+	/* Local registers per output pin */
+	IRQDest dst[MAX_CPU];
+	uint32_t nb_cpus;
+	/* Timer registers */
+	struct {
+		uint32_t tccr;	/* Global timer current count register */
+		uint32_t tbcr;	/* Global timer base count register */
+	} timers[MAX_TMR];
+	/* Shared MSI registers */
+	struct {
+		uint32_t msir;	/* Shared Message Signaled Interrupt Register */
+	} msi[MAX_MSI];
+	uint32_t max_irq;
+	uint32_t irq_ipi0;
+	uint32_t irq_tim0;
+	uint32_t irq_msi;
+} OpenPICState;
+
+static inline void IRQ_setbit(IRQQueue * q, int n_IRQ)
+{
+	set_bit(n_IRQ, q->queue);
+}
+
+static inline void IRQ_resetbit(IRQQueue * q, int n_IRQ)
+{
+	clear_bit(n_IRQ, q->queue);
+}
+
+static inline int IRQ_testbit(IRQQueue * q, int n_IRQ)
+{
+	return test_bit(n_IRQ, q->queue);
+}
+
+static void IRQ_check(OpenPICState * opp, IRQQueue * q)
+{
+	int irq = -1;
+	int next = -1;
+	int priority = -1;
+
+	for (;;) {
+		irq = find_next_bit(q->queue, opp->max_irq, irq + 1);
+		if (irq == opp->max_irq) {
+			break;
+		}
+
+		DPRINTF("IRQ_check: irq %d set ivpr_pr=%d pr=%d\n",
+			irq, IVPR_PRIORITY(opp->src[irq].ivpr), priority);
+
+		if (IVPR_PRIORITY(opp->src[irq].ivpr) > priority) {
+			next = irq;
+			priority = IVPR_PRIORITY(opp->src[irq].ivpr);
+		}
+	}
+
+	q->next = next;
+	q->priority = priority;
+}
+
+static int IRQ_get_next(OpenPICState * opp, IRQQueue * q)
+{
+	/* XXX: optimize */
+	IRQ_check(opp, q);
+
+	return q->next;
+}
+
+static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
+			   bool active, bool was_active)
+{
+	IRQDest *dst;
+	IRQSource *src;
+	int priority;
+
+	dst = &opp->dst[n_CPU];
+	src = &opp->src[n_IRQ];
+
+	DPRINTF("%s: IRQ %d active %d was %d\n",
+		__func__, n_IRQ, active, was_active);
+
+	if (src->output != OPENPIC_OUTPUT_INT) {
+		DPRINTF("%s: output %d irq %d active %d was %d count %d\n",
+			__func__, src->output, n_IRQ, active, was_active,
+			dst->outputs_active[src->output]);
+
+		/* On Freescale MPIC, critical interrupts ignore priority,
+		 * IACK, EOI, etc.  Before MPIC v4.1 they also ignore
+		 * masking.
+		 */
+		if (active) {
+			if (!was_active
+			    && dst->outputs_active[src->output]++ == 0) {
+				DPRINTF
+				    ("%s: Raise OpenPIC output %d cpu %d irq %d\n",
+				     __func__, src->output, n_CPU, n_IRQ);
+				qemu_irq_raise(dst->irqs[src->output]);
+			}
+		} else {
+			if (was_active
+			    && --dst->outputs_active[src->output] == 0) {
+				DPRINTF
+				    ("%s: Lower OpenPIC output %d cpu %d irq %d\n",
+				     __func__, src->output, n_CPU, n_IRQ);
+				qemu_irq_lower(dst->irqs[src->output]);
+			}
+		}
+
+		return;
+	}
+
+	priority = IVPR_PRIORITY(src->ivpr);
+
+	/* Even if the interrupt doesn't have enough priority,
+	 * it is still raised, in case ctpr is lowered later.
+	 */
+	if (active) {
+		IRQ_setbit(&dst->raised, n_IRQ);
+	} else {
+		IRQ_resetbit(&dst->raised, n_IRQ);
+	}
+
+	IRQ_check(opp, &dst->raised);
+
+	if (active && priority <= dst->ctpr) {
+		DPRINTF
+		    ("%s: IRQ %d priority %d too low for ctpr %d on CPU %d\n",
+		     __func__, n_IRQ, priority, dst->ctpr, n_CPU);
+		active = 0;
+	}
+
+	if (active) {
+		if (IRQ_get_next(opp, &dst->servicing) >= 0 &&
+		    priority <= dst->servicing.priority) {
+			DPRINTF
+			    ("%s: IRQ %d is hidden by servicing IRQ %d on CPU %d\n",
+			     __func__, n_IRQ, dst->servicing.next, n_CPU);
+		} else {
+			DPRINTF
+			    ("%s: Raise OpenPIC INT output cpu %d irq %d/%d\n",
+			     __func__, n_CPU, n_IRQ, dst->raised.next);
+			qemu_irq_raise(opp->dst[n_CPU].
+				       irqs[OPENPIC_OUTPUT_INT]);
+		}
+	} else {
+		IRQ_get_next(opp, &dst->servicing);
+		if (dst->raised.priority > dst->ctpr &&
+		    dst->raised.priority > dst->servicing.priority) {
+			DPRINTF
+			    ("%s: IRQ %d inactive, IRQ %d prio %d above %d/%d, CPU %d\n",
+			     __func__, n_IRQ, dst->raised.next,
+			     dst->raised.priority, dst->ctpr,
+			     dst->servicing.priority, n_CPU);
+			/* IRQ line stays asserted */
+		} else {
+			DPRINTF
+			    ("%s: IRQ %d inactive, current prio %d/%d, CPU %d\n",
+			     __func__, n_IRQ, dst->ctpr,
+			     dst->servicing.priority, n_CPU);
+			qemu_irq_lower(opp->dst[n_CPU].
+				       irqs[OPENPIC_OUTPUT_INT]);
+		}
+	}
+}
+
+/* update pic state because registers for n_IRQ have changed value */
+static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
+{
+	IRQSource *src;
+	bool active, was_active;
+	int i;
+
+	src = &opp->src[n_IRQ];
+	active = src->pending;
+
+	if ((src->ivpr & IVPR_MASK_MASK) && !src->nomask) {
+		/* Interrupt source is disabled */
+		DPRINTF("%s: IRQ %d is disabled\n", __func__, n_IRQ);
+		active = false;
+	}
+
+	was_active = !!(src->ivpr & IVPR_ACTIVITY_MASK);
+
+	/*
+	 * We don't have a similar check for already-active because
+	 * ctpr may have changed and we need to withdraw the interrupt.
+	 */
+	if (!active && !was_active) {
+		DPRINTF("%s: IRQ %d is already inactive\n", __func__, n_IRQ);
+		return;
+	}
+
+	if (active) {
+		src->ivpr |= IVPR_ACTIVITY_MASK;
+	} else {
+		src->ivpr &= ~IVPR_ACTIVITY_MASK;
+	}
+
+	if (src->destmask == 0) {
+		/* No target */
+		DPRINTF("%s: IRQ %d has no target\n", __func__, n_IRQ);
+		return;
+	}
+
+	if (src->destmask == (1 << src->last_cpu)) {
+		/* Only one CPU is allowed to receive this IRQ */
+		IRQ_local_pipe(opp, src->last_cpu, n_IRQ, active, was_active);
+	} else if (!(src->ivpr & IVPR_MODE_MASK)) {
+		/* Directed delivery mode */
+		for (i = 0; i < opp->nb_cpus; i++) {
+			if (src->destmask & (1 << i)) {
+				IRQ_local_pipe(opp, i, n_IRQ, active,
+					       was_active);
+			}
+		}
+	} else {
+		/* Distributed delivery mode */
+		for (i = src->last_cpu + 1; i != src->last_cpu; i++) {
+			if (i == opp->nb_cpus) {
+				i = 0;
+			}
+			if (src->destmask & (1 << i)) {
+				IRQ_local_pipe(opp, i, n_IRQ, active,
+					       was_active);
+				src->last_cpu = i;
+				break;
+			}
+		}
+	}
+}
+
+static void openpic_set_irq(void *opaque, int n_IRQ, int level)
+{
+	OpenPICState *opp = opaque;
+	IRQSource *src;
+
+	if (n_IRQ >= MAX_IRQ) {
+		fprintf(stderr, "%s: IRQ %d out of range\n", __func__, n_IRQ);
+		abort();
+	}
+
+	src = &opp->src[n_IRQ];
+	DPRINTF("openpic: set irq %d = %d ivpr=0x%08x\n",
+		n_IRQ, level, src->ivpr);
+	if (src->level) {
+		/* level-sensitive irq */
+		src->pending = level;
+		openpic_update_irq(opp, n_IRQ);
+	} else {
+		/* edge-sensitive irq */
+		if (level) {
+			src->pending = 1;
+			openpic_update_irq(opp, n_IRQ);
+		}
+
+		if (src->output != OPENPIC_OUTPUT_INT) {
+			/* Edge-triggered interrupts shouldn't be used
+			 * with non-INT delivery, but just in case,
+			 * try to make it do something sane rather than
+			 * cause an interrupt storm.  This is close to
+			 * what you'd probably see happen in real hardware.
+			 */
+			src->pending = 0;
+			openpic_update_irq(opp, n_IRQ);
+		}
+	}
+}
+
+static void openpic_reset(DeviceState * d)
+{
+	OpenPICState *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d));
+	int i;
+
+	opp->gcr = GCR_RESET;
+	/* Initialise controller registers */
+	opp->frr = ((opp->nb_irqs - 1) << FRR_NIRQ_SHIFT) |
+	    ((opp->nb_cpus - 1) << FRR_NCPU_SHIFT) |
+	    (opp->vid << FRR_VID_SHIFT);
+
+	opp->pir = 0;
+	opp->spve = -1 & opp->vector_mask;
+	opp->tfrr = opp->tfrr_reset;
+	/* Initialise IRQ sources */
+	for (i = 0; i < opp->max_irq; i++) {
+		opp->src[i].ivpr = opp->ivpr_reset;
+		opp->src[i].idr = opp->idr_reset;
+
+		switch (opp->src[i].type) {
+		case IRQ_TYPE_NORMAL:
+			opp->src[i].level =
+			    !!(opp->ivpr_reset & IVPR_SENSE_MASK);
+			break;
+
+		case IRQ_TYPE_FSLINT:
+			opp->src[i].ivpr |= IVPR_POLARITY_MASK;
+			break;
+
+		case IRQ_TYPE_FSLSPECIAL:
+			break;
+		}
+	}
+	/* Initialise IRQ destinations */
+	for (i = 0; i < MAX_CPU; i++) {
+		opp->dst[i].ctpr = 15;
+		memset(&opp->dst[i].raised, 0, sizeof(IRQQueue));
+		opp->dst[i].raised.next = -1;
+		memset(&opp->dst[i].servicing, 0, sizeof(IRQQueue));
+		opp->dst[i].servicing.next = -1;
+	}
+	/* Initialise timers */
+	for (i = 0; i < MAX_TMR; i++) {
+		opp->timers[i].tccr = 0;
+		opp->timers[i].tbcr = TBCR_CI;
+	}
+	/* Go out of RESET state */
+	opp->gcr = 0;
+}
+
+static inline uint32_t read_IRQreg_idr(OpenPICState * opp, int n_IRQ)
+{
+	return opp->src[n_IRQ].idr;
+}
+
+static inline uint32_t read_IRQreg_ilr(OpenPICState * opp, int n_IRQ)
+{
+	if (opp->flags & OPENPIC_FLAG_ILR) {
+		return output_to_inttgt(opp->src[n_IRQ].output);
+	}
+
+	return 0xffffffff;
+}
+
+static inline uint32_t read_IRQreg_ivpr(OpenPICState * opp, int n_IRQ)
+{
+	return opp->src[n_IRQ].ivpr;
+}
+
+static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
+{
+	IRQSource *src = &opp->src[n_IRQ];
+	uint32_t normal_mask = (1UL << opp->nb_cpus) - 1;
+	uint32_t crit_mask = 0;
+	uint32_t mask = normal_mask;
+	int crit_shift = IDR_EP_SHIFT - opp->nb_cpus;
+	int i;
+
+	if (opp->flags & OPENPIC_FLAG_IDR_CRIT) {
+		crit_mask = mask << crit_shift;
+		mask |= crit_mask | IDR_EP;
+	}
+
+	src->idr = val & mask;
+	DPRINTF("Set IDR %d to 0x%08x\n", n_IRQ, src->idr);
+
+	if (opp->flags & OPENPIC_FLAG_IDR_CRIT) {
+		if (src->idr & crit_mask) {
+			if (src->idr & normal_mask) {
+				DPRINTF
+				    ("%s: IRQ configured for multiple output types, using "
+				     "critical\n", __func__);
+			}
+
+			src->output = OPENPIC_OUTPUT_CINT;
+			src->nomask = true;
+			src->destmask = 0;
+
+			for (i = 0; i < opp->nb_cpus; i++) {
+				int n_ci = IDR_CI0_SHIFT - i;
+
+				if (src->idr & (1UL << n_ci)) {
+					src->destmask |= 1UL << i;
+				}
+			}
+		} else {
+			src->output = OPENPIC_OUTPUT_INT;
+			src->nomask = false;
+			src->destmask = src->idr & normal_mask;
+		}
+	} else {
+		src->destmask = src->idr;
+	}
+}
+
+static inline void write_IRQreg_ilr(OpenPICState * opp, int n_IRQ, uint32_t val)
+{
+	if (opp->flags & OPENPIC_FLAG_ILR) {
+		IRQSource *src = &opp->src[n_IRQ];
+
+		src->output = inttgt_to_output(val & ILR_INTTGT_MASK);
+		DPRINTF("Set ILR %d to 0x%08x, output %d\n", n_IRQ, val,
+			src->output);
+
+		/* TODO: on MPIC v4.0 only, set nomask for non-INT */
+	}
+}
+
+static inline void write_IRQreg_ivpr(OpenPICState * opp, int n_IRQ,
+				     uint32_t val)
+{
+	uint32_t mask;
+
+	/* NOTE when implementing newer FSL MPIC models: starting with v4.0,
+	 * the polarity bit is read-only on internal interrupts.
+	 */
+	mask = IVPR_MASK_MASK | IVPR_PRIORITY_MASK | IVPR_SENSE_MASK |
+	    IVPR_POLARITY_MASK | opp->vector_mask;
+
+	/* ACTIVITY bit is read-only */
+	opp->src[n_IRQ].ivpr =
+	    (opp->src[n_IRQ].ivpr & IVPR_ACTIVITY_MASK) | (val & mask);
+
+	/* For FSL internal interrupts, the sense bit is reserved and zero,
+	 * and the interrupt is always level-triggered.  Timers and IPIs
+	 * have no sense or polarity bits, and are edge-triggered.
+	 */
+	switch (opp->src[n_IRQ].type) {
+	case IRQ_TYPE_NORMAL:
+		opp->src[n_IRQ].level =
+		    !!(opp->src[n_IRQ].ivpr & IVPR_SENSE_MASK);
+		break;
+
+	case IRQ_TYPE_FSLINT:
+		opp->src[n_IRQ].ivpr &= ~IVPR_SENSE_MASK;
+		break;
+
+	case IRQ_TYPE_FSLSPECIAL:
+		opp->src[n_IRQ].ivpr &= ~(IVPR_POLARITY_MASK | IVPR_SENSE_MASK);
+		break;
+	}
+
+	openpic_update_irq(opp, n_IRQ);
+	DPRINTF("Set IVPR %d to 0x%08x -> 0x%08x\n", n_IRQ, val,
+		opp->src[n_IRQ].ivpr);
+}
+
+static void openpic_gcr_write(OpenPICState * opp, uint64_t val)
+{
+	bool mpic_proxy = false;
+
+	if (val & GCR_RESET) {
+		openpic_reset(&opp->busdev.qdev);
+		return;
+	}
+
+	opp->gcr &= ~opp->mpic_mode_mask;
+	opp->gcr |= val & opp->mpic_mode_mask;
+
+	/* Set external proxy mode */
+	if ((val & opp->mpic_mode_mask) == GCR_MODE_PROXY) {
+		mpic_proxy = true;
+	}
+
+	ppce500_set_mpic_proxy(mpic_proxy);
+}
+
+static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned len)
+{
+	OpenPICState *opp = opaque;
+	IRQDest *dst;
+	int idx;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+		__func__, addr, val);
+	if (addr & 0xF) {
+		return;
+	}
+	switch (addr) {
+	case 0x00:		/* Block Revision Register1 (BRR1) is read-only */
+		break;
+	case 0x40:
+	case 0x50:
+	case 0x60:
+	case 0x70:
+	case 0x80:
+	case 0x90:
+	case 0xA0:
+	case 0xB0:
+		openpic_cpu_write_internal(opp, addr, val, get_current_cpu());
+		break;
+	case 0x1000:		/* FRR */
+		break;
+	case 0x1020:		/* GCR */
+		openpic_gcr_write(opp, val);
+		break;
+	case 0x1080:		/* VIR */
+		break;
+	case 0x1090:		/* PIR */
+		for (idx = 0; idx < opp->nb_cpus; idx++) {
+			if ((val & (1 << idx)) && !(opp->pir & (1 << idx))) {
+				DPRINTF
+				    ("Raise OpenPIC RESET output for CPU %d\n",
+				     idx);
+				dst = &opp->dst[idx];
+				qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_RESET]);
+			} else if (!(val & (1 << idx))
+				   && (opp->pir & (1 << idx))) {
+				DPRINTF
+				    ("Lower OpenPIC RESET output for CPU %d\n",
+				     idx);
+				dst = &opp->dst[idx];
+				qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_RESET]);
+			}
+		}
+		opp->pir = val;
+		break;
+	case 0x10A0:		/* IPI_IVPR */
+	case 0x10B0:
+	case 0x10C0:
+	case 0x10D0:
+		{
+			int idx;
+			idx = (addr - 0x10A0) >> 4;
+			write_IRQreg_ivpr(opp, opp->irq_ipi0 + idx, val);
+		}
+		break;
+	case 0x10E0:		/* SPVE */
+		opp->spve = val & opp->vector_mask;
+		break;
+	default:
+		break;
+	}
+}
+
+static uint64_t openpic_gbl_read(void *opaque, hwaddr addr, unsigned len)
+{
+	OpenPICState *opp = opaque;
+	uint32_t retval;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	retval = 0xFFFFFFFF;
+	if (addr & 0xF) {
+		return retval;
+	}
+	switch (addr) {
+	case 0x1000:		/* FRR */
+		retval = opp->frr;
+		break;
+	case 0x1020:		/* GCR */
+		retval = opp->gcr;
+		break;
+	case 0x1080:		/* VIR */
+		retval = opp->vir;
+		break;
+	case 0x1090:		/* PIR */
+		retval = 0x00000000;
+		break;
+	case 0x00:		/* Block Revision Register1 (BRR1) */
+		retval = opp->brr1;
+		break;
+	case 0x40:
+	case 0x50:
+	case 0x60:
+	case 0x70:
+	case 0x80:
+	case 0x90:
+	case 0xA0:
+	case 0xB0:
+		retval =
+		    openpic_cpu_read_internal(opp, addr, get_current_cpu());
+		break;
+	case 0x10A0:		/* IPI_IVPR */
+	case 0x10B0:
+	case 0x10C0:
+	case 0x10D0:
+		{
+			int idx;
+			idx = (addr - 0x10A0) >> 4;
+			retval = read_IRQreg_ivpr(opp, opp->irq_ipi0 + idx);
+		}
+		break;
+	case 0x10E0:		/* SPVE */
+		retval = opp->spve;
+		break;
+	default:
+		break;
+	}
+	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+
+	return retval;
+}
+
+static void openpic_tmr_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned len)
+{
+	OpenPICState *opp = opaque;
+	int idx;
+
+	addr += 0x10f0;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+		__func__, addr, val);
+	if (addr & 0xF) {
+		return;
+	}
+
+	if (addr == 0x10f0) {
+		/* TFRR */
+		opp->tfrr = val;
+		return;
+	}
+
+	idx = (addr >> 6) & 0x3;
+	addr = addr & 0x30;
+
+	switch (addr & 0x30) {
+	case 0x00:		/* TCCR */
+		break;
+	case 0x10:		/* TBCR */
+		if ((opp->timers[idx].tccr & TCCR_TOG) != 0 &&
+		    (val & TBCR_CI) == 0 &&
+		    (opp->timers[idx].tbcr & TBCR_CI) != 0) {
+			opp->timers[idx].tccr &= ~TCCR_TOG;
+		}
+		opp->timers[idx].tbcr = val;
+		break;
+	case 0x20:		/* TVPR */
+		write_IRQreg_ivpr(opp, opp->irq_tim0 + idx, val);
+		break;
+	case 0x30:		/* TDR */
+		write_IRQreg_idr(opp, opp->irq_tim0 + idx, val);
+		break;
+	}
+}
+
+static uint64_t openpic_tmr_read(void *opaque, hwaddr addr, unsigned len)
+{
+	OpenPICState *opp = opaque;
+	uint32_t retval = -1;
+	int idx;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	if (addr & 0xF) {
+		goto out;
+	}
+	idx = (addr >> 6) & 0x3;
+	if (addr == 0x0) {
+		/* TFRR */
+		retval = opp->tfrr;
+		goto out;
+	}
+	switch (addr & 0x30) {
+	case 0x00:		/* TCCR */
+		retval = opp->timers[idx].tccr;
+		break;
+	case 0x10:		/* TBCR */
+		retval = opp->timers[idx].tbcr;
+		break;
+	case 0x20:		/* TVPR */
+		retval = read_IRQreg_ivpr(opp, opp->irq_tim0 + idx);
+		break;
+	case 0x30:		/* TDR */
+		retval = read_IRQreg_idr(opp, opp->irq_tim0 + idx);
+		break;
+	}
+
+out:
+	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+
+	return retval;
+}
+
+static void openpic_src_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned len)
+{
+	OpenPICState *opp = opaque;
+	int idx;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+		__func__, addr, val);
+
+	addr = addr & 0xffff;
+	idx = addr >> 5;
+
+	switch (addr & 0x1f) {
+	case 0x00:
+		write_IRQreg_ivpr(opp, idx, val);
+		break;
+	case 0x10:
+		write_IRQreg_idr(opp, idx, val);
+		break;
+	case 0x18:
+		write_IRQreg_ilr(opp, idx, val);
+		break;
+	}
+}
+
+static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
+{
+	OpenPICState *opp = opaque;
+	uint32_t retval;
+	int idx;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	retval = 0xFFFFFFFF;
+
+	addr = addr & 0xffff;
+	idx = addr >> 5;
+
+	switch (addr & 0x1f) {
+	case 0x00:
+		retval = read_IRQreg_ivpr(opp, idx);
+		break;
+	case 0x10:
+		retval = read_IRQreg_idr(opp, idx);
+		break;
+	case 0x18:
+		retval = read_IRQreg_ilr(opp, idx);
+		break;
+	}
+
+	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	return retval;
+}
+
+static void openpic_msi_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned size)
+{
+	OpenPICState *opp = opaque;
+	int idx = opp->irq_msi;
+	int srs, ibs;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
+		__func__, addr, val);
+	if (addr & 0xF) {
+		return;
+	}
+
+	switch (addr) {
+	case MSIIR_OFFSET:
+		srs = val >> MSIIR_SRS_SHIFT;
+		idx += srs;
+		ibs = (val & MSIIR_IBS_MASK) >> MSIIR_IBS_SHIFT;
+		opp->msi[srs].msir |= 1 << ibs;
+		openpic_set_irq(opp, idx, 1);
+		break;
+	default:
+		/* most registers are read-only, thus ignored */
+		break;
+	}
+}
+
+static uint64_t openpic_msi_read(void *opaque, hwaddr addr, unsigned size)
+{
+	OpenPICState *opp = opaque;
+	uint64_t r = 0;
+	int i, srs;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	if (addr & 0xF) {
+		return -1;
+	}
+
+	srs = addr >> 4;
+
+	switch (addr) {
+	case 0x00:
+	case 0x10:
+	case 0x20:
+	case 0x30:
+	case 0x40:
+	case 0x50:
+	case 0x60:
+	case 0x70:		/* MSIRs */
+		r = opp->msi[srs].msir;
+		/* Clear on read */
+		opp->msi[srs].msir = 0;
+		openpic_set_irq(opp, opp->irq_msi + srs, 0);
+		break;
+	case 0x120:		/* MSISR */
+		for (i = 0; i < MAX_MSI; i++) {
+			r |= (opp->msi[i].msir ? 1 : 0) << i;
+		}
+		break;
+	}
+
+	return r;
+}
+
+static uint64_t openpic_summary_read(void *opaque, hwaddr addr, unsigned size)
+{
+	uint64_t r = 0;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+
+	/* TODO: EISR/EIMR */
+
+	return r;
+}
+
+static void openpic_summary_write(void *opaque, hwaddr addr, uint64_t val,
+				  unsigned size)
+{
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
+		__func__, addr, val);
+
+	/* TODO: EISR/EIMR */
+}
+
+static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
+				       uint32_t val, int idx)
+{
+	OpenPICState *opp = opaque;
+	IRQSource *src;
+	IRQDest *dst;
+	int s_IRQ, n_IRQ;
+
+	DPRINTF("%s: cpu %d addr %#" HWADDR_PRIx " <= 0x%08x\n", __func__, idx,
+		addr, val);
+
+	if (idx < 0) {
+		return;
+	}
+
+	if (addr & 0xF) {
+		return;
+	}
+	dst = &opp->dst[idx];
+	addr &= 0xFF0;
+	switch (addr) {
+	case 0x40:		/* IPIDR */
+	case 0x50:
+	case 0x60:
+	case 0x70:
+		idx = (addr - 0x40) >> 4;
+		/* IDE is used as the mask of CPUs still awaiting this IPI. */
+		opp->src[opp->irq_ipi0 + idx].destmask |= val;
+		openpic_set_irq(opp, opp->irq_ipi0 + idx, 1);
+		openpic_set_irq(opp, opp->irq_ipi0 + idx, 0);
+		break;
+	case 0x80:		/* CTPR */
+		dst->ctpr = val & 0x0000000F;
+
+		DPRINTF("%s: set CPU %d ctpr to %d, raised %d servicing %d\n",
+			__func__, idx, dst->ctpr, dst->raised.priority,
+			dst->servicing.priority);
+
+		if (dst->raised.priority <= dst->ctpr) {
+			DPRINTF
+			    ("%s: Lower OpenPIC INT output cpu %d due to ctpr\n",
+			     __func__, idx);
+			qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
+		} else if (dst->raised.priority > dst->servicing.priority) {
+			DPRINTF("%s: Raise OpenPIC INT output cpu %d irq %d\n",
+				__func__, idx, dst->raised.next);
+			qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_INT]);
+		}
+
+		break;
+	case 0x90:		/* WHOAMI */
+		/* Read-only register */
+		break;
+	case 0xA0:		/* IACK */
+		/* Read-only register */
+		break;
+	case 0xB0:		/* EOI */
+		DPRINTF("EOI\n");
+		s_IRQ = IRQ_get_next(opp, &dst->servicing);
+
+		if (s_IRQ < 0) {
+			DPRINTF("%s: EOI with no interrupt in service\n",
+				__func__);
+			break;
+		}
+
+		IRQ_resetbit(&dst->servicing, s_IRQ);
+		/* Set up next servicing IRQ */
+		s_IRQ = IRQ_get_next(opp, &dst->servicing);
+		/* Check queued interrupts. */
+		n_IRQ = IRQ_get_next(opp, &dst->raised);
+		src = &opp->src[n_IRQ];
+		if (n_IRQ != -1 &&
+		    (s_IRQ == -1 ||
+		     IVPR_PRIORITY(src->ivpr) > dst->servicing.priority)) {
+			DPRINTF("Raise OpenPIC INT output cpu %d irq %d\n",
+				idx, n_IRQ);
+			qemu_irq_raise(opp->dst[idx].irqs[OPENPIC_OUTPUT_INT]);
+		}
+		break;
+	default:
+		break;
+	}
+}
+
+static void openpic_cpu_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned len)
+{
+	openpic_cpu_write_internal(opaque, addr, val, (addr & 0x1f000) >> 12);
+}
+
+static uint32_t openpic_iack(OpenPICState * opp, IRQDest * dst, int cpu)
+{
+	IRQSource *src;
+	int retval, irq;
+
+	DPRINTF("Lower OpenPIC INT output\n");
+	qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
+
+	irq = IRQ_get_next(opp, &dst->raised);
+	DPRINTF("IACK: irq=%d\n", irq);
+
+	if (irq == -1) {
+		/* No more interrupt pending */
+		return opp->spve;
+	}
+
+	src = &opp->src[irq];
+	if (!(src->ivpr & IVPR_ACTIVITY_MASK) ||
+	    !(IVPR_PRIORITY(src->ivpr) > dst->ctpr)) {
+		fprintf(stderr, "%s: bad raised IRQ %d ctpr %d ivpr 0x%08x\n",
+			__func__, irq, dst->ctpr, src->ivpr);
+		openpic_update_irq(opp, irq);
+		retval = opp->spve;
+	} else {
+		/* IRQ enter servicing state */
+		IRQ_setbit(&dst->servicing, irq);
+		retval = IVPR_VECTOR(opp, src->ivpr);
+	}
+
+	if (!src->level) {
+		/* edge-sensitive IRQ */
+		src->ivpr &= ~IVPR_ACTIVITY_MASK;
+		src->pending = 0;
+		IRQ_resetbit(&dst->raised, irq);
+	}
+
+	if ((irq >= opp->irq_ipi0) && (irq < (opp->irq_ipi0 + MAX_IPI))) {
+		src->destmask &= ~(1 << cpu);
+		if (src->destmask && !src->level) {
+			/* trigger on CPUs that didn't know about it yet */
+			openpic_set_irq(opp, irq, 1);
+			openpic_set_irq(opp, irq, 0);
+			/* if all CPUs knew about it, set active bit again */
+			src->ivpr |= IVPR_ACTIVITY_MASK;
+		}
+	}
+
+	return retval;
+}
+
+static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx)
+{
+	OpenPICState *opp = opaque;
+	IRQDest *dst;
+	uint32_t retval;
+
+	DPRINTF("%s: cpu %d addr %#" HWADDR_PRIx "\n", __func__, idx, addr);
+	retval = 0xFFFFFFFF;
+
+	if (idx < 0) {
+		return retval;
+	}
+
+	if (addr & 0xF) {
+		return retval;
+	}
+	dst = &opp->dst[idx];
+	addr &= 0xFF0;
+	switch (addr) {
+	case 0x80:		/* CTPR */
+		retval = dst->ctpr;
+		break;
+	case 0x90:		/* WHOAMI */
+		retval = idx;
+		break;
+	case 0xA0:		/* IACK */
+		retval = openpic_iack(opp, dst, idx);
+		break;
+	case 0xB0:		/* EOI */
+		retval = 0;
+		break;
+	default:
+		break;
+	}
+	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+
+	return retval;
+}
+
+static uint64_t openpic_cpu_read(void *opaque, hwaddr addr, unsigned len)
+{
+	return openpic_cpu_read_internal(opaque, addr, (addr & 0x1f000) >> 12);
+}
+
+static const MemoryRegionOps openpic_glb_ops_le = {
+	.write = openpic_gbl_write,
+	.read = openpic_gbl_read,
+	.endianness = DEVICE_LITTLE_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_glb_ops_be = {
+	.write = openpic_gbl_write,
+	.read = openpic_gbl_read,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_tmr_ops_le = {
+	.write = openpic_tmr_write,
+	.read = openpic_tmr_read,
+	.endianness = DEVICE_LITTLE_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_tmr_ops_be = {
+	.write = openpic_tmr_write,
+	.read = openpic_tmr_read,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_cpu_ops_le = {
+	.write = openpic_cpu_write,
+	.read = openpic_cpu_read,
+	.endianness = DEVICE_LITTLE_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_cpu_ops_be = {
+	.write = openpic_cpu_write,
+	.read = openpic_cpu_read,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_src_ops_le = {
+	.write = openpic_src_write,
+	.read = openpic_src_read,
+	.endianness = DEVICE_LITTLE_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_src_ops_be = {
+	.write = openpic_src_write,
+	.read = openpic_src_read,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_msi_ops_be = {
+	.read = openpic_msi_read,
+	.write = openpic_msi_write,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_summary_ops_be = {
+	.read = openpic_summary_read,
+	.write = openpic_summary_write,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static void openpic_save_IRQ_queue(QEMUFile * f, IRQQueue * q)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(q->queue); i++) {
+		/* Always put the lower half of a 64-bit long first, in case we
+		 * restore on a 32-bit host.  The least significant bits correspond
+		 * to lower IRQ numbers in the bitmap.
+		 */
+		qemu_put_be32(f, (uint32_t) q->queue[i]);
+#if LONG_MAX > 0x7FFFFFFF
+		qemu_put_be32(f, (uint32_t) (q->queue[i] >> 32));
+#endif
+	}
+
+	qemu_put_sbe32s(f, &q->next);
+	qemu_put_sbe32s(f, &q->priority);
+}
+
+static void openpic_save(QEMUFile * f, void *opaque)
+{
+	OpenPICState *opp = (OpenPICState *) opaque;
+	unsigned int i;
+
+	qemu_put_be32s(f, &opp->gcr);
+	qemu_put_be32s(f, &opp->vir);
+	qemu_put_be32s(f, &opp->pir);
+	qemu_put_be32s(f, &opp->spve);
+	qemu_put_be32s(f, &opp->tfrr);
+
+	qemu_put_be32s(f, &opp->nb_cpus);
+
+	for (i = 0; i < opp->nb_cpus; i++) {
+		qemu_put_sbe32s(f, &opp->dst[i].ctpr);
+		openpic_save_IRQ_queue(f, &opp->dst[i].raised);
+		openpic_save_IRQ_queue(f, &opp->dst[i].servicing);
+		qemu_put_buffer(f, (uint8_t *) & opp->dst[i].outputs_active,
+				sizeof(opp->dst[i].outputs_active));
+	}
+
+	for (i = 0; i < MAX_TMR; i++) {
+		qemu_put_be32s(f, &opp->timers[i].tccr);
+		qemu_put_be32s(f, &opp->timers[i].tbcr);
+	}
+
+	for (i = 0; i < opp->max_irq; i++) {
+		qemu_put_be32s(f, &opp->src[i].ivpr);
+		qemu_put_be32s(f, &opp->src[i].idr);
+		qemu_get_be32s(f, &opp->src[i].destmask);
+		qemu_put_sbe32s(f, &opp->src[i].last_cpu);
+		qemu_put_sbe32s(f, &opp->src[i].pending);
+	}
+}
+
+static void openpic_load_IRQ_queue(QEMUFile * f, IRQQueue * q)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(q->queue); i++) {
+		unsigned long val;
+
+		val = qemu_get_be32(f);
+#if LONG_MAX > 0x7FFFFFFF
+		val <<= 32;
+		val |= qemu_get_be32(f);
+#endif
+
+		q->queue[i] = val;
+	}
+
+	qemu_get_sbe32s(f, &q->next);
+	qemu_get_sbe32s(f, &q->priority);
+}
+
+static int openpic_load(QEMUFile * f, void *opaque, int version_id)
+{
+	OpenPICState *opp = (OpenPICState *) opaque;
+	unsigned int i;
+
+	if (version_id != 1) {
+		return -EINVAL;
+	}
+
+	qemu_get_be32s(f, &opp->gcr);
+	qemu_get_be32s(f, &opp->vir);
+	qemu_get_be32s(f, &opp->pir);
+	qemu_get_be32s(f, &opp->spve);
+	qemu_get_be32s(f, &opp->tfrr);
+
+	qemu_get_be32s(f, &opp->nb_cpus);
+
+	for (i = 0; i < opp->nb_cpus; i++) {
+		qemu_get_sbe32s(f, &opp->dst[i].ctpr);
+		openpic_load_IRQ_queue(f, &opp->dst[i].raised);
+		openpic_load_IRQ_queue(f, &opp->dst[i].servicing);
+		qemu_get_buffer(f, (uint8_t *) & opp->dst[i].outputs_active,
+				sizeof(opp->dst[i].outputs_active));
+	}
+
+	for (i = 0; i < MAX_TMR; i++) {
+		qemu_get_be32s(f, &opp->timers[i].tccr);
+		qemu_get_be32s(f, &opp->timers[i].tbcr);
+	}
+
+	for (i = 0; i < opp->max_irq; i++) {
+		uint32_t val;
+
+		val = qemu_get_be32(f);
+		write_IRQreg_idr(opp, i, val);
+		val = qemu_get_be32(f);
+		write_IRQreg_ivpr(opp, i, val);
+
+		qemu_get_be32s(f, &opp->src[i].ivpr);
+		qemu_get_be32s(f, &opp->src[i].idr);
+		qemu_get_be32s(f, &opp->src[i].destmask);
+		qemu_get_sbe32s(f, &opp->src[i].last_cpu);
+		qemu_get_sbe32s(f, &opp->src[i].pending);
+	}
+
+	return 0;
+}
+
+typedef struct MemReg {
+	const char *name;
+	MemoryRegionOps const *ops;
+	hwaddr start_addr;
+	ram_addr_t size;
+} MemReg;
+
+static void fsl_common_init(OpenPICState * opp)
+{
+	int i;
+	int virq = MAX_SRC;
+
+	opp->vid = VID_REVISION_1_2;
+	opp->vir = VIR_GENERIC;
+	opp->vector_mask = 0xFFFF;
+	opp->tfrr_reset = 0;
+	opp->ivpr_reset = IVPR_MASK_MASK;
+	opp->idr_reset = 1 << 0;
+	opp->max_irq = MAX_IRQ;
+
+	opp->irq_ipi0 = virq;
+	virq += MAX_IPI;
+	opp->irq_tim0 = virq;
+	virq += MAX_TMR;
+
+	assert(virq <= MAX_IRQ);
+
+	opp->irq_msi = 224;
+
+	msi_supported = true;
+	for (i = 0; i < opp->fsl->max_ext; i++) {
+		opp->src[i].level = false;
+	}
+
+	/* Internal interrupts, including message and MSI */
+	for (i = 16; i < MAX_SRC; i++) {
+		opp->src[i].type = IRQ_TYPE_FSLINT;
+		opp->src[i].level = true;
+	}
+
+	/* timers and IPIs */
+	for (i = MAX_SRC; i < virq; i++) {
+		opp->src[i].type = IRQ_TYPE_FSLSPECIAL;
+		opp->src[i].level = false;
+	}
+}
+
+static void map_list(OpenPICState * opp, const MemReg * list, int *count)
+{
+	while (list->name) {
+		assert(*count < ARRAY_SIZE(opp->sub_io_mem));
+
+		memory_region_init_io(&opp->sub_io_mem[*count], list->ops, opp,
+				      list->name, list->size);
+
+		memory_region_add_subregion(&opp->mem, list->start_addr,
+					    &opp->sub_io_mem[*count]);
+
+		(*count)++;
+		list++;
+	}
+}
+
+static int openpic_init(SysBusDevice * dev)
+{
+	OpenPICState *opp = FROM_SYSBUS(typeof(*opp), dev);
+	int i, j;
+	int list_count = 0;
+	static const MemReg list_le[] = {
+		{"glb", &openpic_glb_ops_le,
+		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
+		{"tmr", &openpic_tmr_ops_le,
+		 OPENPIC_TMR_REG_START, OPENPIC_TMR_REG_SIZE},
+		{"src", &openpic_src_ops_le,
+		 OPENPIC_SRC_REG_START, OPENPIC_SRC_REG_SIZE},
+		{"cpu", &openpic_cpu_ops_le,
+		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
+		{NULL}
+	};
+	static const MemReg list_be[] = {
+		{"glb", &openpic_glb_ops_be,
+		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
+		{"tmr", &openpic_tmr_ops_be,
+		 OPENPIC_TMR_REG_START, OPENPIC_TMR_REG_SIZE},
+		{"src", &openpic_src_ops_be,
+		 OPENPIC_SRC_REG_START, OPENPIC_SRC_REG_SIZE},
+		{"cpu", &openpic_cpu_ops_be,
+		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
+		{NULL}
+	};
+	static const MemReg list_fsl[] = {
+		{"msi", &openpic_msi_ops_be,
+		 OPENPIC_MSI_REG_START, OPENPIC_MSI_REG_SIZE},
+		{"summary", &openpic_summary_ops_be,
+		 OPENPIC_SUMMARY_REG_START, OPENPIC_SUMMARY_REG_SIZE},
+		{NULL}
+	};
+
+	memory_region_init(&opp->mem, "openpic", 0x40000);
+
+	switch (opp->model) {
+	case OPENPIC_MODEL_FSL_MPIC_20:
+	default:
+		opp->fsl = &fsl_mpic_20;
+		opp->brr1 = 0x00400200;
+		opp->flags |= OPENPIC_FLAG_IDR_CRIT;
+		opp->nb_irqs = 80;
+		opp->mpic_mode_mask = GCR_MODE_MIXED;
+
+		fsl_common_init(opp);
+		map_list(opp, list_be, &list_count);
+		map_list(opp, list_fsl, &list_count);
+
+		break;
+
+	case OPENPIC_MODEL_FSL_MPIC_42:
+		opp->fsl = &fsl_mpic_42;
+		opp->brr1 = 0x00400402;
+		opp->flags |= OPENPIC_FLAG_ILR;
+		opp->nb_irqs = 196;
+		opp->mpic_mode_mask = GCR_MODE_PROXY;
+
+		fsl_common_init(opp);
+		map_list(opp, list_be, &list_count);
+		map_list(opp, list_fsl, &list_count);
+
+		break;
+
+	case OPENPIC_MODEL_RAVEN:
+		opp->nb_irqs = RAVEN_MAX_EXT;
+		opp->vid = VID_REVISION_1_3;
+		opp->vir = VIR_GENERIC;
+		opp->vector_mask = 0xFF;
+		opp->tfrr_reset = 4160000;
+		opp->ivpr_reset = IVPR_MASK_MASK | IVPR_MODE_MASK;
+		opp->idr_reset = 0;
+		opp->max_irq = RAVEN_MAX_IRQ;
+		opp->irq_ipi0 = RAVEN_IPI_IRQ;
+		opp->irq_tim0 = RAVEN_TMR_IRQ;
+		opp->brr1 = -1;
+		opp->mpic_mode_mask = GCR_MODE_MIXED;
+
+		/* Only UP supported today */
+		if (opp->nb_cpus != 1) {
+			return -EINVAL;
+		}
+
+		map_list(opp, list_le, &list_count);
+		break;
+	}
+
+	for (i = 0; i < opp->nb_cpus; i++) {
+		opp->dst[i].irqs = g_new(qemu_irq, OPENPIC_OUTPUT_NB);
+		for (j = 0; j < OPENPIC_OUTPUT_NB; j++) {
+			sysbus_init_irq(dev, &opp->dst[i].irqs[j]);
+		}
+	}
+
+	register_savevm(&opp->busdev.qdev, "openpic", 0, 2,
+			openpic_save, openpic_load, opp);
+
+	sysbus_init_mmio(dev, &opp->mem);
+	qdev_init_gpio_in(&dev->qdev, openpic_set_irq, opp->max_irq);
+
+	return 0;
+}
+
+static Property openpic_properties[] = {
+	DEFINE_PROP_UINT32("model", OpenPICState, model,
+			   OPENPIC_MODEL_FSL_MPIC_20),
+	DEFINE_PROP_UINT32("nb_cpus", OpenPICState, nb_cpus, 1),
+	DEFINE_PROP_END_OF_LIST(),
+};
+
+static void openpic_class_init(ObjectClass * klass, void *data)
+{
+	DeviceClass *dc = DEVICE_CLASS(klass);
+	SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
+
+	k->init = openpic_init;
+	dc->props = openpic_properties;
+	dc->reset = openpic_reset;
+}
+
+static const TypeInfo openpic_info = {
+	.name = "openpic",
+	.parent = TYPE_SYS_BUS_DEVICE,
+	.instance_size = sizeof(OpenPICState),
+	.class_init = openpic_class_init,
+};
+
+static void openpic_register_types(void)
+{
+	type_register_static(&openpic_info);
+}
+
+type_init(openpic_register_types)
-- 
1.6.0.2
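
Reading aid, not part of the patch: the comment in openpic_save_IRQ_queue()
above says the lower half of each 64-bit bitmap word is written first, so
that a 32-bit host can restore the stream. Below is a minimal sketch of that
round trip. struct byte_stream, put_be32(), get_be32(), put_queue_word() and
get_queue_word() are hypothetical stand-ins for illustration only; they are
not QEMU's QEMUFile API. The sketch assumes the reader consumes the halves in
the same order the writer produced them.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a migration stream: a flat byte buffer. */
struct byte_stream {
	uint8_t buf[64];
	size_t pos;
};

/* Append one 32-bit value in big-endian byte order. */
static void put_be32(struct byte_stream *s, uint32_t v)
{
	assert(s->pos + 4 <= sizeof(s->buf));
	s->buf[s->pos++] = (uint8_t)(v >> 24);
	s->buf[s->pos++] = (uint8_t)(v >> 16);
	s->buf[s->pos++] = (uint8_t)(v >> 8);
	s->buf[s->pos++] = (uint8_t)v;
}

/* Read back one 32-bit big-endian value. */
static uint32_t get_be32(struct byte_stream *s)
{
	uint32_t v;

	assert(s->pos + 4 <= sizeof(s->buf));
	v = (uint32_t)s->buf[s->pos] << 24 |
	    (uint32_t)s->buf[s->pos + 1] << 16 |
	    (uint32_t)s->buf[s->pos + 2] << 8 |
	    (uint32_t)s->buf[s->pos + 3];
	s->pos += 4;
	return v;
}

/* Write one bitmap word: low 32 bits first, then the high 32 bits
 * when unsigned long is wider than 32 bits.
 */
static void put_queue_word(struct byte_stream *s, unsigned long w)
{
	put_be32(s, (uint32_t)w);
#if ULONG_MAX > 0xFFFFFFFFUL
	put_be32(s, (uint32_t)(w >> 32));
#endif
}

/* Read one bitmap word in the same order it was written. */
static unsigned long get_queue_word(struct byte_stream *s)
{
	unsigned long w = get_be32(s);

#if ULONG_MAX > 0xFFFFFFFFUL
	w |= (unsigned long)get_be32(s) << 32;
#endif
	return w;
}
```

Because the low half always comes first, the bits for the lowest IRQ numbers
sit at the same stream offset whatever the host word size.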

* [PATCH 10/17] kvm/ppc/mpic: import hw/openpic.c from QEMU
@ 2013-04-18 14:11   ` Alexander Graf
  0 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

This is QEMU's hw/openpic.c from commit
abd8d4a4d6dfea7ddea72f095f993e1de941614e ("Update version for
1.4.0-rc0"), run through Lindent with no other changes to ease merging
future changes between Linux and QEMU.  Remaining style issues
(including those introduced by Lindent) will be fixed in a later patch.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
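
Reading aid, not part of the commit: IRQ_check() in the diff below walks the
set bits of a queue and records the pending IRQ with the highest IVPR
priority. Here is a standalone sketch of that scan, assuming a plain
per-IRQ priority array in place of the IVPR registers. SKETCH_MAX_IRQ,
struct irq_queue, queue_set(), queue_test() and queue_check() are
hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_MAX_IRQ 64	/* hypothetical queue size for the sketch */
#define BITS_PER_WORD  (sizeof(unsigned long) * CHAR_BIT)
#define QUEUE_WORDS    ((SKETCH_MAX_IRQ + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct irq_queue {
	unsigned long bits[QUEUE_WORDS];
	int next;		/* highest-priority pending IRQ, or -1 */
	int priority;		/* its priority, or -1 */
};

/* Mark an IRQ as pending in the bitmap. */
static void queue_set(struct irq_queue *q, int irq)
{
	assert(irq >= 0 && irq < SKETCH_MAX_IRQ);
	q->bits[irq / BITS_PER_WORD] |= 1UL << (irq % BITS_PER_WORD);
}

/* Return nonzero if the IRQ is pending. */
static int queue_test(const struct irq_queue *q, int irq)
{
	assert(irq >= 0 && irq < SKETCH_MAX_IRQ);
	return (q->bits[irq / BITS_PER_WORD] >> (irq % BITS_PER_WORD)) & 1UL;
}

/* Scan every pending IRQ and remember the one with the highest
 * priority.  The comparison is strict, so on a tie the lowest
 * numbered IRQ wins.
 */
static void queue_check(struct irq_queue *q, const int *prio)
{
	int irq;
	int next = -1;
	int priority = -1;

	for (irq = 0; irq < SKETCH_MAX_IRQ; irq++) {
		if (queue_test(q, irq) && prio[irq] > priority) {
			next = irq;
			priority = prio[irq];
		}
	}

	q->next = next;
	q->priority = priority;
}
```

An empty queue leaves next and priority at -1, which is the "nothing
pending" state the callers in the diff compare against ctpr.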
 arch/powerpc/kvm/mpic.c | 1686 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 1686 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/mpic.c
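
Reading aid, not part of the commit: on FSL MPICs with critical-interrupt
bits in IDR, write_IRQreg_idr() in the diff below maps CI0 (bit 30) to CPU 0,
CI1 (bit 29) to CPU 1, and so on downward. A minimal sketch of that
decoding follows. idr_crit_destmask() is a hypothetical helper written for
this note, and the assert on nb_cpus is an assumption added to keep the
shift count in range.

```c
#include <assert.h>
#include <stdint.h>

#define IDR_CI0_SHIFT 30	/* same value as in the patch */

/* Build a CPU destination bitmap from the critical-interrupt bits of
 * an IDR value: IDR bit (30 - i) selects CPU i.
 */
static uint32_t idr_crit_destmask(uint32_t idr, unsigned int nb_cpus)
{
	uint32_t destmask = 0;
	unsigned int i;

	/* Assumption for the sketch: keep (30 - i) non-negative. */
	assert(nb_cpus >= 1 && nb_cpus <= 31);

	for (i = 0; i < nb_cpus; i++) {
		if (idr & (UINT32_C(1) << (IDR_CI0_SHIFT - i)))
			destmask |= UINT32_C(1) << i;
	}

	return destmask;
}
```

For a two-CPU setup, an IDR value of 0x40000000 selects CPU 0 only and
0x20000000 selects CPU 1 only; the low "normal" destination bits do not
contribute to the critical destination mask.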

diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
new file mode 100644
index 0000000..57655b9
--- /dev/null
+++ b/arch/powerpc/kvm/mpic.c
@@ -0,0 +1,1686 @@
+/*
+ * OpenPIC emulation
+ *
+ * Copyright (c) 2004 Jocelyn Mayer
+ *               2011 Alexander Graf
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+/*
+ *
+ * Based on OpenPic implementations:
+ * - Intel GW80314 I/O companion chip developer's manual
+ * - Motorola MPC8245 & MPC8540 user manuals.
+ * - Motorola MCP750 (aka Raven) programmer manual.
+ * - Motorola Harrier programmer manuel
+ *
+ * Serial interrupts, as implemented in Raven chipset are not supported yet.
+ *
+ */
+#include "hw.h"
+#include "ppc/mac.h"
+#include "pci/pci.h"
+#include "openpic.h"
+#include "sysbus.h"
+#include "pci/msi.h"
+#include "qemu/bitops.h"
+#include "ppc.h"
+
+//#define DEBUG_OPENPIC
+
+#ifdef DEBUG_OPENPIC
+static const int debug_openpic = 1;
+#else
+static const int debug_openpic = 0;
+#endif
+
+#define DPRINTF(fmt, ...) do { \
+        if (debug_openpic) { \
+            printf(fmt , ## __VA_ARGS__); \
+        } \
+    } while (0)
+
+#define MAX_CPU     32
+#define MAX_SRC     256
+#define MAX_TMR     4
+#define MAX_IPI     4
+#define MAX_MSI     8
+#define MAX_IRQ     (MAX_SRC + MAX_IPI + MAX_TMR)
+#define VID         0x03	/* MPIC version ID */
+
+/* OpenPIC capability flags */
+#define OPENPIC_FLAG_IDR_CRIT     (1 << 0)
+#define OPENPIC_FLAG_ILR          (2 << 0)
+
+/* OpenPIC address map */
+#define OPENPIC_GLB_REG_START        0x0
+#define OPENPIC_GLB_REG_SIZE         0x10F0
+#define OPENPIC_TMR_REG_START        0x10F0
+#define OPENPIC_TMR_REG_SIZE         0x220
+#define OPENPIC_MSI_REG_START        0x1600
+#define OPENPIC_MSI_REG_SIZE         0x200
+#define OPENPIC_SUMMARY_REG_START   0x3800
+#define OPENPIC_SUMMARY_REG_SIZE    0x800
+#define OPENPIC_SRC_REG_START        0x10000
+#define OPENPIC_SRC_REG_SIZE         (MAX_SRC * 0x20)
+#define OPENPIC_CPU_REG_START        0x20000
+#define OPENPIC_CPU_REG_SIZE         0x100 + ((MAX_CPU - 1) * 0x1000)
+
+/* Raven */
+#define RAVEN_MAX_CPU      2
+#define RAVEN_MAX_EXT     48
+#define RAVEN_MAX_IRQ     64
+#define RAVEN_MAX_TMR      MAX_TMR
+#define RAVEN_MAX_IPI      MAX_IPI
+
+/* Interrupt definitions */
+#define RAVEN_FE_IRQ     (RAVEN_MAX_EXT)	/* Internal functional IRQ */
+#define RAVEN_ERR_IRQ    (RAVEN_MAX_EXT + 1)	/* Error IRQ */
+#define RAVEN_TMR_IRQ    (RAVEN_MAX_EXT + 2)	/* First timer IRQ */
+#define RAVEN_IPI_IRQ    (RAVEN_TMR_IRQ + RAVEN_MAX_TMR)	/* First IPI IRQ */
+/* First doorbell IRQ */
+#define RAVEN_DBL_IRQ    (RAVEN_IPI_IRQ + (RAVEN_MAX_CPU * RAVEN_MAX_IPI))
+
+typedef struct FslMpicInfo {
+	int max_ext;
+} FslMpicInfo;
+
+static FslMpicInfo fsl_mpic_20 = {
+	.max_ext = 12,
+};
+
+static FslMpicInfo fsl_mpic_42 = {
+	.max_ext = 12,
+};
+
+#define FRR_NIRQ_SHIFT    16
+#define FRR_NCPU_SHIFT     8
+#define FRR_VID_SHIFT      0
+
+#define VID_REVISION_1_2   2
+#define VID_REVISION_1_3   3
+
+#define VIR_GENERIC      0x00000000	/* Generic Vendor ID */
+
+#define GCR_RESET        0x80000000
+#define GCR_MODE_PASS    0x00000000
+#define GCR_MODE_MIXED   0x20000000
+#define GCR_MODE_PROXY   0x60000000
+
+#define TBCR_CI           0x80000000	/* count inhibit */
+#define TCCR_TOG          0x80000000	/* toggles when decrement to zero */
+
+#define IDR_EP_SHIFT      31
+#define IDR_EP_MASK       (1 << IDR_EP_SHIFT)
+#define IDR_CI0_SHIFT     30
+#define IDR_CI1_SHIFT     29
+#define IDR_P1_SHIFT      1
+#define IDR_P0_SHIFT      0
+
+#define ILR_INTTGT_MASK   0x000000ff
+#define ILR_INTTGT_INT    0x00
+#define ILR_INTTGT_CINT   0x01	/* critical */
+#define ILR_INTTGT_MCP    0x02	/* machine check */
+
+/* The currently supported INTTGT values happen to be the same as QEMU's
+ * openpic output codes, but don't depend on this.  The output codes
+ * could change (unlikely, but...) or support could be added for
+ * more INTTGT values.
+ */
+static const int inttgt_output[][2] = {
+	{ILR_INTTGT_INT, OPENPIC_OUTPUT_INT},
+	{ILR_INTTGT_CINT, OPENPIC_OUTPUT_CINT},
+	{ILR_INTTGT_MCP, OPENPIC_OUTPUT_MCK},
+};
+
+static int inttgt_to_output(int inttgt)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(inttgt_output); i++) {
+		if (inttgt_output[i][0] == inttgt) {
+			return inttgt_output[i][1];
+		}
+	}
+
+	fprintf(stderr, "%s: unsupported inttgt %d\n", __func__, inttgt);
+	return OPENPIC_OUTPUT_INT;
+}
+
+static int output_to_inttgt(int output)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(inttgt_output); i++) {
+		if (inttgt_output[i][1] == output) {
+			return inttgt_output[i][0];
+		}
+	}
+
+	abort();
+}
+
+#define MSIIR_OFFSET       0x140
+#define MSIIR_SRS_SHIFT    29
+#define MSIIR_SRS_MASK     (0x7 << MSIIR_SRS_SHIFT)
+#define MSIIR_IBS_SHIFT    24
+#define MSIIR_IBS_MASK     (0x1f << MSIIR_IBS_SHIFT)
+
+static int get_current_cpu(void)
+{
+	CPUState *cpu_single_cpu;
+
+	if (!cpu_single_env) {
+		return -1;
+	}
+
+	cpu_single_cpu = ENV_GET_CPU(cpu_single_env);
+	return cpu_single_cpu->cpu_index;
+}
+
+static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx);
+static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
+				       uint32_t val, int idx);
+
+typedef enum IRQType {
+	IRQ_TYPE_NORMAL = 0,
+	IRQ_TYPE_FSLINT,	/* FSL internal interrupt -- level only */
+	IRQ_TYPE_FSLSPECIAL,	/* FSL timer/IPI interrupt, edge, no polarity */
+} IRQType;
+
+typedef struct IRQQueue {
+	/* Round up to the nearest 64 IRQs so that the queue length
+	 * won't change when moving between 32 and 64 bit hosts.
+	 */
+	unsigned long queue[BITS_TO_LONGS((MAX_IRQ + 63) & ~63)];
+	int next;
+	int priority;
+} IRQQueue;
+
+typedef struct IRQSource {
+	uint32_t ivpr;		/* IRQ vector/priority register */
+	uint32_t idr;		/* IRQ destination register */
+	uint32_t destmask;	/* bitmap of CPU destinations */
+	int last_cpu;
+	int output;		/* IRQ level, e.g. OPENPIC_OUTPUT_INT */
+	int pending;		/* TRUE if IRQ is pending */
+	IRQType type;
+	bool level:1;		/* level-triggered */
+	bool nomask:1;		/* critical interrupts ignore mask on some FSL MPICs */
+} IRQSource;
+
+#define IVPR_MASK_SHIFT       31
+#define IVPR_MASK_MASK        (1 << IVPR_MASK_SHIFT)
+#define IVPR_ACTIVITY_SHIFT   30
+#define IVPR_ACTIVITY_MASK    (1 << IVPR_ACTIVITY_SHIFT)
+#define IVPR_MODE_SHIFT       29
+#define IVPR_MODE_MASK        (1 << IVPR_MODE_SHIFT)
+#define IVPR_POLARITY_SHIFT   23
+#define IVPR_POLARITY_MASK    (1 << IVPR_POLARITY_SHIFT)
+#define IVPR_SENSE_SHIFT      22
+#define IVPR_SENSE_MASK       (1 << IVPR_SENSE_SHIFT)
+
+#define IVPR_PRIORITY_MASK     (0xF << 16)
+#define IVPR_PRIORITY(_ivprr_) ((int)(((_ivprr_) & IVPR_PRIORITY_MASK) >> 16))
+#define IVPR_VECTOR(opp, _ivprr_) ((_ivprr_) & (opp)->vector_mask)
+
+/* IDR[EP/CI] are only for FSL MPIC prior to v4.0 */
+#define IDR_EP      0x80000000	/* external pin */
+#define IDR_CI      0x40000000	/* critical interrupt */
+
+typedef struct IRQDest {
+	int32_t ctpr;		/* CPU current task priority */
+	IRQQueue raised;
+	IRQQueue servicing;
+	qemu_irq *irqs;
+
+	/* Count of IRQ sources asserting on non-INT outputs */
+	uint32_t outputs_active[OPENPIC_OUTPUT_NB];
+} IRQDest;
+
+typedef struct OpenPICState {
+	SysBusDevice busdev;
+	MemoryRegion mem;
+
+	/* Behavior control */
+	FslMpicInfo *fsl;
+	uint32_t model;
+	uint32_t flags;
+	uint32_t nb_irqs;
+	uint32_t vid;
+	uint32_t vir;		/* Vendor identification register */
+	uint32_t vector_mask;
+	uint32_t tfrr_reset;
+	uint32_t ivpr_reset;
+	uint32_t idr_reset;
+	uint32_t brr1;
+	uint32_t mpic_mode_mask;
+
+	/* Sub-regions */
+	MemoryRegion sub_io_mem[6];
+
+	/* Global registers */
+	uint32_t frr;		/* Feature reporting register */
+	uint32_t gcr;		/* Global configuration register  */
+	uint32_t pir;		/* Processor initialization register */
+	uint32_t spve;		/* Spurious vector register */
+	uint32_t tfrr;		/* Timer frequency reporting register */
+	/* Source registers */
+	IRQSource src[MAX_IRQ];
+	/* Local registers per output pin */
+	IRQDest dst[MAX_CPU];
+	uint32_t nb_cpus;
+	/* Timer registers */
+	struct {
+		uint32_t tccr;	/* Global timer current count register */
+		uint32_t tbcr;	/* Global timer base count register */
+	} timers[MAX_TMR];
+	/* Shared MSI registers */
+	struct {
+		uint32_t msir;	/* Shared Message Signaled Interrupt Register */
+	} msi[MAX_MSI];
+	uint32_t max_irq;
+	uint32_t irq_ipi0;
+	uint32_t irq_tim0;
+	uint32_t irq_msi;
+} OpenPICState;
+
+static inline void IRQ_setbit(IRQQueue * q, int n_IRQ)
+{
+	set_bit(n_IRQ, q->queue);
+}
+
+static inline void IRQ_resetbit(IRQQueue * q, int n_IRQ)
+{
+	clear_bit(n_IRQ, q->queue);
+}
+
+static inline int IRQ_testbit(IRQQueue * q, int n_IRQ)
+{
+	return test_bit(n_IRQ, q->queue);
+}
+
+static void IRQ_check(OpenPICState * opp, IRQQueue * q)
+{
+	int irq = -1;
+	int next = -1;
+	int priority = -1;
+
+	for (;;) {
+		irq = find_next_bit(q->queue, opp->max_irq, irq + 1);
+		if (irq == opp->max_irq) {
+			break;
+		}
+
+		DPRINTF("IRQ_check: irq %d set ivpr_pr=%d pr=%d\n",
+			irq, IVPR_PRIORITY(opp->src[irq].ivpr), priority);
+
+		if (IVPR_PRIORITY(opp->src[irq].ivpr) > priority) {
+			next = irq;
+			priority = IVPR_PRIORITY(opp->src[irq].ivpr);
+		}
+	}
+
+	q->next = next;
+	q->priority = priority;
+}
+
+static int IRQ_get_next(OpenPICState * opp, IRQQueue * q)
+{
+	/* XXX: optimize */
+	IRQ_check(opp, q);
+
+	return q->next;
+}
+
+static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
+			   bool active, bool was_active)
+{
+	IRQDest *dst;
+	IRQSource *src;
+	int priority;
+
+	dst = &opp->dst[n_CPU];
+	src = &opp->src[n_IRQ];
+
+	DPRINTF("%s: IRQ %d active %d was %d\n",
+		__func__, n_IRQ, active, was_active);
+
+	if (src->output != OPENPIC_OUTPUT_INT) {
+		DPRINTF("%s: output %d irq %d active %d was %d count %d\n",
+			__func__, src->output, n_IRQ, active, was_active,
+			dst->outputs_active[src->output]);
+
+		/* On Freescale MPIC, critical interrupts ignore priority,
+		 * IACK, EOI, etc.  Before MPIC v4.1 they also ignore
+		 * masking.
+		 */
+		if (active) {
+			if (!was_active
+			    && dst->outputs_active[src->output]++ == 0) {
+				DPRINTF
+				    ("%s: Raise OpenPIC output %d cpu %d irq %d\n",
+				     __func__, src->output, n_CPU, n_IRQ);
+				qemu_irq_raise(dst->irqs[src->output]);
+			}
+		} else {
+			if (was_active
+			    && --dst->outputs_active[src->output] == 0) {
+				DPRINTF
+				    ("%s: Lower OpenPIC output %d cpu %d irq %d\n",
+				     __func__, src->output, n_CPU, n_IRQ);
+				qemu_irq_lower(dst->irqs[src->output]);
+			}
+		}
+
+		return;
+	}
+
+	priority = IVPR_PRIORITY(src->ivpr);
+
+	/* Even if the interrupt doesn't have enough priority,
+	 * it is still raised, in case ctpr is lowered later.
+	 */
+	if (active) {
+		IRQ_setbit(&dst->raised, n_IRQ);
+	} else {
+		IRQ_resetbit(&dst->raised, n_IRQ);
+	}
+
+	IRQ_check(opp, &dst->raised);
+
+	if (active && priority <= dst->ctpr) {
+		DPRINTF
+		    ("%s: IRQ %d priority %d too low for ctpr %d on CPU %d\n",
+		     __func__, n_IRQ, priority, dst->ctpr, n_CPU);
+		active = 0;
+	}
+
+	if (active) {
+		if (IRQ_get_next(opp, &dst->servicing) >= 0 &&
+		    priority <= dst->servicing.priority) {
+			DPRINTF
+			    ("%s: IRQ %d is hidden by servicing IRQ %d on CPU %d\n",
+			     __func__, n_IRQ, dst->servicing.next, n_CPU);
+		} else {
+			DPRINTF
+			    ("%s: Raise OpenPIC INT output cpu %d irq %d/%d\n",
+			     __func__, n_CPU, n_IRQ, dst->raised.next);
+			qemu_irq_raise(opp->dst[n_CPU].
+				       irqs[OPENPIC_OUTPUT_INT]);
+		}
+	} else {
+		IRQ_get_next(opp, &dst->servicing);
+		if (dst->raised.priority > dst->ctpr &&
+		    dst->raised.priority > dst->servicing.priority) {
+			DPRINTF
+			    ("%s: IRQ %d inactive, IRQ %d prio %d above %d/%d, CPU %d\n",
+			     __func__, n_IRQ, dst->raised.next,
+			     dst->raised.priority, dst->ctpr,
+			     dst->servicing.priority, n_CPU);
+			/* IRQ line stays asserted */
+		} else {
+			DPRINTF
+			    ("%s: IRQ %d inactive, current prio %d/%d, CPU %d\n",
+			     __func__, n_IRQ, dst->ctpr,
+			     dst->servicing.priority, n_CPU);
+			qemu_irq_lower(opp->dst[n_CPU].
+				       irqs[OPENPIC_OUTPUT_INT]);
+		}
+	}
+}
+
+/* update pic state because registers for n_IRQ have changed value */
+static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
+{
+	IRQSource *src;
+	bool active, was_active;
+	int i;
+
+	src = &opp->src[n_IRQ];
+	active = src->pending;
+
+	if ((src->ivpr & IVPR_MASK_MASK) && !src->nomask) {
+		/* Interrupt source is disabled */
+		DPRINTF("%s: IRQ %d is disabled\n", __func__, n_IRQ);
+		active = false;
+	}
+
+	was_active = ! !(src->ivpr & IVPR_ACTIVITY_MASK);
+
+	/*
+	 * We don't have a similar check for already-active because
+	 * ctpr may have changed and we need to withdraw the interrupt.
+	 */
+	if (!active && !was_active) {
+		DPRINTF("%s: IRQ %d is already inactive\n", __func__, n_IRQ);
+		return;
+	}
+
+	if (active) {
+		src->ivpr |= IVPR_ACTIVITY_MASK;
+	} else {
+		src->ivpr &= ~IVPR_ACTIVITY_MASK;
+	}
+
+	if (src->destmask == 0) {
+		/* No target */
+		DPRINTF("%s: IRQ %d has no target\n", __func__, n_IRQ);
+		return;
+	}
+
+	if (src->destmask == (1 << src->last_cpu)) {
+		/* Only one CPU is allowed to receive this IRQ */
+		IRQ_local_pipe(opp, src->last_cpu, n_IRQ, active, was_active);
+	} else if (!(src->ivpr & IVPR_MODE_MASK)) {
+		/* Directed delivery mode */
+		for (i = 0; i < opp->nb_cpus; i++) {
+			if (src->destmask & (1 << i)) {
+				IRQ_local_pipe(opp, i, n_IRQ, active,
+					       was_active);
+			}
+		}
+	} else {
+		/* Distributed delivery mode */
+		for (i = src->last_cpu + 1; i != src->last_cpu; i++) {
+			if (i == opp->nb_cpus) {
+				i = 0;
+			}
+			if (src->destmask & (1 << i)) {
+				IRQ_local_pipe(opp, i, n_IRQ, active,
+					       was_active);
+				src->last_cpu = i;
+				break;
+			}
+		}
+	}
+}
+
+static void openpic_set_irq(void *opaque, int n_IRQ, int level)
+{
+	OpenPICState *opp = opaque;
+	IRQSource *src;
+
+	if (n_IRQ >= MAX_IRQ) {
+		fprintf(stderr, "%s: IRQ %d out of range\n", __func__, n_IRQ);
+		abort();
+	}
+
+	src = &opp->src[n_IRQ];
+	DPRINTF("openpic: set irq %d = %d ivpr=0x%08x\n",
+		n_IRQ, level, src->ivpr);
+	if (src->level) {
+		/* level-sensitive irq */
+		src->pending = level;
+		openpic_update_irq(opp, n_IRQ);
+	} else {
+		/* edge-sensitive irq */
+		if (level) {
+			src->pending = 1;
+			openpic_update_irq(opp, n_IRQ);
+		}
+
+		if (src->output != OPENPIC_OUTPUT_INT) {
+			/* Edge-triggered interrupts shouldn't be used
+			 * with non-INT delivery, but just in case,
+			 * try to make it do something sane rather than
+			 * cause an interrupt storm.  This is close to
+			 * what you'd probably see happen in real hardware.
+			 */
+			src->pending = 0;
+			openpic_update_irq(opp, n_IRQ);
+		}
+	}
+}
+
+static void openpic_reset(DeviceState * d)
+{
+	OpenPICState *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d));
+	int i;
+
+	opp->gcr = GCR_RESET;
+	/* Initialise controller registers */
+	opp->frr = ((opp->nb_irqs - 1) << FRR_NIRQ_SHIFT) |
+	    ((opp->nb_cpus - 1) << FRR_NCPU_SHIFT) |
+	    (opp->vid << FRR_VID_SHIFT);
+
+	opp->pir = 0;
+	opp->spve = -1 & opp->vector_mask;
+	opp->tfrr = opp->tfrr_reset;
+	/* Initialise IRQ sources */
+	for (i = 0; i < opp->max_irq; i++) {
+		opp->src[i].ivpr = opp->ivpr_reset;
+		opp->src[i].idr = opp->idr_reset;
+
+		switch (opp->src[i].type) {
+		case IRQ_TYPE_NORMAL:
+			opp->src[i].level =
+			    ! !(opp->ivpr_reset & IVPR_SENSE_MASK);
+			break;
+
+		case IRQ_TYPE_FSLINT:
+			opp->src[i].ivpr |= IVPR_POLARITY_MASK;
+			break;
+
+		case IRQ_TYPE_FSLSPECIAL:
+			break;
+		}
+	}
+	/* Initialise IRQ destinations */
+	for (i = 0; i < MAX_CPU; i++) {
+		opp->dst[i].ctpr = 15;
+		memset(&opp->dst[i].raised, 0, sizeof(IRQQueue));
+		opp->dst[i].raised.next = -1;
+		memset(&opp->dst[i].servicing, 0, sizeof(IRQQueue));
+		opp->dst[i].servicing.next = -1;
+	}
+	/* Initialise timers */
+	for (i = 0; i < MAX_TMR; i++) {
+		opp->timers[i].tccr = 0;
+		opp->timers[i].tbcr = TBCR_CI;
+	}
+	/* Go out of RESET state */
+	opp->gcr = 0;
+}
+
+static inline uint32_t read_IRQreg_idr(OpenPICState * opp, int n_IRQ)
+{
+	return opp->src[n_IRQ].idr;
+}
+
+static inline uint32_t read_IRQreg_ilr(OpenPICState * opp, int n_IRQ)
+{
+	if (opp->flags & OPENPIC_FLAG_ILR) {
+		return output_to_inttgt(opp->src[n_IRQ].output);
+	}
+
+	return 0xffffffff;
+}
+
+static inline uint32_t read_IRQreg_ivpr(OpenPICState * opp, int n_IRQ)
+{
+	return opp->src[n_IRQ].ivpr;
+}
+
+static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
+{
+	IRQSource *src = &opp->src[n_IRQ];
+	uint32_t normal_mask = (1UL << opp->nb_cpus) - 1;
+	uint32_t crit_mask = 0;
+	uint32_t mask = normal_mask;
+	int crit_shift = IDR_EP_SHIFT - opp->nb_cpus;
+	int i;
+
+	if (opp->flags & OPENPIC_FLAG_IDR_CRIT) {
+		crit_mask = mask << crit_shift;
+		mask |= crit_mask | IDR_EP;
+	}
+
+	src->idr = val & mask;
+	DPRINTF("Set IDR %d to 0x%08x\n", n_IRQ, src->idr);
+
+	if (opp->flags & OPENPIC_FLAG_IDR_CRIT) {
+		if (src->idr & crit_mask) {
+			if (src->idr & normal_mask) {
+				DPRINTF
+				    ("%s: IRQ configured for multiple output types, using "
+				     "critical\n", __func__);
+			}
+
+			src->output = OPENPIC_OUTPUT_CINT;
+			src->nomask = true;
+			src->destmask = 0;
+
+			for (i = 0; i < opp->nb_cpus; i++) {
+				int n_ci = IDR_CI0_SHIFT - i;
+
+				if (src->idr & (1UL << n_ci)) {
+					src->destmask |= 1UL << i;
+				}
+			}
+		} else {
+			src->output = OPENPIC_OUTPUT_INT;
+			src->nomask = false;
+			src->destmask = src->idr & normal_mask;
+		}
+	} else {
+		src->destmask = src->idr;
+	}
+}
+
+static inline void write_IRQreg_ilr(OpenPICState * opp, int n_IRQ, uint32_t val)
+{
+	if (opp->flags & OPENPIC_FLAG_ILR) {
+		IRQSource *src = &opp->src[n_IRQ];
+
+		src->output = inttgt_to_output(val & ILR_INTTGT_MASK);
+		DPRINTF("Set ILR %d to 0x%08x, output %d\n", n_IRQ, src->idr,
+			src->output);
+
+		/* TODO: on MPIC v4.0 only, set nomask for non-INT */
+	}
+}
+
+static inline void write_IRQreg_ivpr(OpenPICState * opp, int n_IRQ,
+				     uint32_t val)
+{
+	uint32_t mask;
+
+	/* NOTE when implementing newer FSL MPIC models: starting with v4.0,
+	 * the polarity bit is read-only on internal interrupts.
+	 */
+	mask = IVPR_MASK_MASK | IVPR_PRIORITY_MASK | IVPR_SENSE_MASK |
+	    IVPR_POLARITY_MASK | opp->vector_mask;
+
+	/* ACTIVITY bit is read-only */
+	opp->src[n_IRQ].ivpr =
+	    (opp->src[n_IRQ].ivpr & IVPR_ACTIVITY_MASK) | (val & mask);
+
+	/* For FSL internal interrupts, The sense bit is reserved and zero,
+	 * and the interrupt is always level-triggered.  Timers and IPIs
+	 * have no sense or polarity bits, and are edge-triggered.
+	 */
+	switch (opp->src[n_IRQ].type) {
+	case IRQ_TYPE_NORMAL:
+		opp->src[n_IRQ].level =
+		    ! !(opp->src[n_IRQ].ivpr & IVPR_SENSE_MASK);
+		break;
+
+	case IRQ_TYPE_FSLINT:
+		opp->src[n_IRQ].ivpr &= ~IVPR_SENSE_MASK;
+		break;
+
+	case IRQ_TYPE_FSLSPECIAL:
+		opp->src[n_IRQ].ivpr &= ~(IVPR_POLARITY_MASK | IVPR_SENSE_MASK);
+		break;
+	}
+
+	openpic_update_irq(opp, n_IRQ);
+	DPRINTF("Set IVPR %d to 0x%08x -> 0x%08x\n", n_IRQ, val,
+		opp->src[n_IRQ].ivpr);
+}
+
+static void openpic_gcr_write(OpenPICState * opp, uint64_t val)
+{
+	bool mpic_proxy = false;
+
+	if (val & GCR_RESET) {
+		openpic_reset(&opp->busdev.qdev);
+		return;
+	}
+
+	opp->gcr &= ~opp->mpic_mode_mask;
+	opp->gcr |= val & opp->mpic_mode_mask;
+
+	/* Set external proxy mode */
+	if ((val & opp->mpic_mode_mask) == GCR_MODE_PROXY) {
+		mpic_proxy = true;
+	}
+
+	ppce500_set_mpic_proxy(mpic_proxy);
+}
+
+static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned len)
+{
+	OpenPICState *opp = opaque;
+	IRQDest *dst;
+	int idx;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+		__func__, addr, val);
+	if (addr & 0xF) {
+		return;
+	}
+	switch (addr) {
+	case 0x00:		/* Block Revision Register1 (BRR1) is Readonly */
+		break;
+	case 0x40:
+	case 0x50:
+	case 0x60:
+	case 0x70:
+	case 0x80:
+	case 0x90:
+	case 0xA0:
+	case 0xB0:
+		openpic_cpu_write_internal(opp, addr, val, get_current_cpu());
+		break;
+	case 0x1000:		/* FRR */
+		break;
+	case 0x1020:		/* GCR */
+		openpic_gcr_write(opp, val);
+		break;
+	case 0x1080:		/* VIR */
+		break;
+	case 0x1090:		/* PIR */
+		for (idx = 0; idx < opp->nb_cpus; idx++) {
+			if ((val & (1 << idx)) && !(opp->pir & (1 << idx))) {
+				DPRINTF
+				    ("Raise OpenPIC RESET output for CPU %d\n",
+				     idx);
+				dst = &opp->dst[idx];
+				qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_RESET]);
+			} else if (!(val & (1 << idx))
+				   && (opp->pir & (1 << idx))) {
+				DPRINTF
+				    ("Lower OpenPIC RESET output for CPU %d\n",
+				     idx);
+				dst = &opp->dst[idx];
+				qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_RESET]);
+			}
+		}
+		opp->pir = val;
+		break;
+	case 0x10A0:		/* IPI_IVPR */
+	case 0x10B0:
+	case 0x10C0:
+	case 0x10D0:
+		{
+			int idx;
+			idx = (addr - 0x10A0) >> 4;
+			write_IRQreg_ivpr(opp, opp->irq_ipi0 + idx, val);
+		}
+		break;
+	case 0x10E0:		/* SPVE */
+		opp->spve = val & opp->vector_mask;
+		break;
+	default:
+		break;
+	}
+}
+
+static uint64_t openpic_gbl_read(void *opaque, hwaddr addr, unsigned len)
+{
+	OpenPICState *opp = opaque;
+	uint32_t retval;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	retval = 0xFFFFFFFF;
+	if (addr & 0xF) {
+		return retval;
+	}
+	switch (addr) {
+	case 0x1000:		/* FRR */
+		retval = opp->frr;
+		break;
+	case 0x1020:		/* GCR */
+		retval = opp->gcr;
+		break;
+	case 0x1080:		/* VIR */
+		retval = opp->vir;
+		break;
+	case 0x1090:		/* PIR */
+		retval = 0x00000000;
+		break;
+	case 0x00:		/* Block Revision Register1 (BRR1) */
+		retval = opp->brr1;
+		break;
+	case 0x40:
+	case 0x50:
+	case 0x60:
+	case 0x70:
+	case 0x80:
+	case 0x90:
+	case 0xA0:
+	case 0xB0:
+		retval =
+		    openpic_cpu_read_internal(opp, addr, get_current_cpu());
+		break;
+	case 0x10A0:		/* IPI_IVPR */
+	case 0x10B0:
+	case 0x10C0:
+	case 0x10D0:
+		{
+			int idx;
+			idx = (addr - 0x10A0) >> 4;
+			retval = read_IRQreg_ivpr(opp, opp->irq_ipi0 + idx);
+		}
+		break;
+	case 0x10E0:		/* SPVE */
+		retval = opp->spve;
+		break;
+	default:
+		break;
+	}
+	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+
+	return retval;
+}
+
+static void openpic_tmr_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned len)
+{
+	OpenPICState *opp = opaque;
+	int idx;
+
+	addr += 0x10f0;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+		__func__, addr, val);
+	if (addr & 0xF) {
+		return;
+	}
+
+	if (addr == 0x10f0) {
+		/* TFRR */
+		opp->tfrr = val;
+		return;
+	}
+
+	idx = (addr >> 6) & 0x3;
+	addr = addr & 0x30;
+
+	switch (addr & 0x30) {
+	case 0x00:		/* TCCR */
+		break;
+	case 0x10:		/* TBCR */
+		if ((opp->timers[idx].tccr & TCCR_TOG) != 0 &&
+		    (val & TBCR_CI) == 0 &&
+		    (opp->timers[idx].tbcr & TBCR_CI) != 0) {
+			opp->timers[idx].tccr &= ~TCCR_TOG;
+		}
+		opp->timers[idx].tbcr = val;
+		break;
+	case 0x20:		/* TVPR */
+		write_IRQreg_ivpr(opp, opp->irq_tim0 + idx, val);
+		break;
+	case 0x30:		/* TDR */
+		write_IRQreg_idr(opp, opp->irq_tim0 + idx, val);
+		break;
+	}
+}
+
+static uint64_t openpic_tmr_read(void *opaque, hwaddr addr, unsigned len)
+{
+	OpenPICState *opp = opaque;
+	uint32_t retval = -1;
+	int idx;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	if (addr & 0xF) {
+		goto out;
+	}
+	idx = (addr >> 6) & 0x3;
+	if (addr == 0x0) {
+		/* TFRR */
+		retval = opp->tfrr;
+		goto out;
+	}
+	switch (addr & 0x30) {
+	case 0x00:		/* TCCR */
+		retval = opp->timers[idx].tccr;
+		break;
+	case 0x10:		/* TBCR */
+		retval = opp->timers[idx].tbcr;
+		break;
+	case 0x20:		/* TIPV */
+		retval = read_IRQreg_ivpr(opp, opp->irq_tim0 + idx);
+		break;
+	case 0x30:		/* TIDE (TIDR) */
+		retval = read_IRQreg_idr(opp, opp->irq_tim0 + idx);
+		break;
+	}
+
+out:
+	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+
+	return retval;
+}
+
+static void openpic_src_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned len)
+{
+	OpenPICState *opp = opaque;
+	int idx;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+		__func__, addr, val);
+
+	addr = addr & 0xffff;
+	idx = addr >> 5;
+
+	switch (addr & 0x1f) {
+	case 0x00:
+		write_IRQreg_ivpr(opp, idx, val);
+		break;
+	case 0x10:
+		write_IRQreg_idr(opp, idx, val);
+		break;
+	case 0x18:
+		write_IRQreg_ilr(opp, idx, val);
+		break;
+	}
+}
+
+static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
+{
+	OpenPICState *opp = opaque;
+	uint32_t retval;
+	int idx;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	retval = 0xFFFFFFFF;
+
+	addr = addr & 0xffff;
+	idx = addr >> 5;
+
+	switch (addr & 0x1f) {
+	case 0x00:
+		retval = read_IRQreg_ivpr(opp, idx);
+		break;
+	case 0x10:
+		retval = read_IRQreg_idr(opp, idx);
+		break;
+	case 0x18:
+		retval = read_IRQreg_ilr(opp, idx);
+		break;
+	}
+
+	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	return retval;
+}
+
+static void openpic_msi_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned size)
+{
+	OpenPICState *opp = opaque;
+	int idx = opp->irq_msi;
+	int srs, ibs;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
+		__func__, addr, val);
+	if (addr & 0xF) {
+		return;
+	}
+
+	switch (addr) {
+	case MSIIR_OFFSET:
+		srs = val >> MSIIR_SRS_SHIFT;
+		idx += srs;
+		ibs = (val & MSIIR_IBS_MASK) >> MSIIR_IBS_SHIFT;
+		opp->msi[srs].msir |= 1 << ibs;
+		openpic_set_irq(opp, idx, 1);
+		break;
+	default:
+		/* most registers are read-only, thus ignored */
+		break;
+	}
+}
+
+static uint64_t openpic_msi_read(void *opaque, hwaddr addr, unsigned size)
+{
+	OpenPICState *opp = opaque;
+	uint64_t r = 0;
+	int i, srs;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	if (addr & 0xF) {
+		return -1;
+	}
+
+	srs = addr >> 4;
+
+	switch (addr) {
+	case 0x00:
+	case 0x10:
+	case 0x20:
+	case 0x30:
+	case 0x40:
+	case 0x50:
+	case 0x60:
+	case 0x70:		/* MSIRs */
+		r = opp->msi[srs].msir;
+		/* Clear on read */
+		opp->msi[srs].msir = 0;
+		openpic_set_irq(opp, opp->irq_msi + srs, 0);
+		break;
+	case 0x120:		/* MSISR */
+		for (i = 0; i < MAX_MSI; i++) {
+			r |= (opp->msi[i].msir ? 1 : 0) << i;
+		}
+		break;
+	}
+
+	return r;
+}
+
+static uint64_t openpic_summary_read(void *opaque, hwaddr addr, unsigned size)
+{
+	uint64_t r = 0;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+
+	/* TODO: EISR/EIMR */
+
+	return r;
+}
+
+static void openpic_summary_write(void *opaque, hwaddr addr, uint64_t val,
+				  unsigned size)
+{
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
+		__func__, addr, val);
+
+	/* TODO: EISR/EIMR */
+}
+
+static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
+				       uint32_t val, int idx)
+{
+	OpenPICState *opp = opaque;
+	IRQSource *src;
+	IRQDest *dst;
+	int s_IRQ, n_IRQ;
+
+	DPRINTF("%s: cpu %d addr %#" HWADDR_PRIx " <= 0x%08x\n", __func__, idx,
+		addr, val);
+
+	if (idx < 0) {
+		return;
+	}
+
+	if (addr & 0xF) {
+		return;
+	}
+	dst = &opp->dst[idx];
+	addr &= 0xFF0;
+	switch (addr) {
+	case 0x40:		/* IPIDR */
+	case 0x50:
+	case 0x60:
+	case 0x70:
+		idx = (addr - 0x40) >> 4;
+		/* we use IDE as mask which CPUs to deliver the IPI to still. */
+		opp->src[opp->irq_ipi0 + idx].destmask |= val;
+		openpic_set_irq(opp, opp->irq_ipi0 + idx, 1);
+		openpic_set_irq(opp, opp->irq_ipi0 + idx, 0);
+		break;
+	case 0x80:		/* CTPR */
+		dst->ctpr = val & 0x0000000F;
+
+		DPRINTF("%s: set CPU %d ctpr to %d, raised %d servicing %d\n",
+			__func__, idx, dst->ctpr, dst->raised.priority,
+			dst->servicing.priority);
+
+		if (dst->raised.priority <= dst->ctpr) {
+			DPRINTF
+			    ("%s: Lower OpenPIC INT output cpu %d due to ctpr\n",
+			     __func__, idx);
+			qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
+		} else if (dst->raised.priority > dst->servicing.priority) {
+			DPRINTF("%s: Raise OpenPIC INT output cpu %d irq %d\n",
+				__func__, idx, dst->raised.next);
+			qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_INT]);
+		}
+
+		break;
+	case 0x90:		/* WHOAMI */
+		/* Read-only register */
+		break;
+	case 0xA0:		/* IACK */
+		/* Read-only register */
+		break;
+	case 0xB0:		/* EOI */
+		DPRINTF("EOI\n");
+		s_IRQ = IRQ_get_next(opp, &dst->servicing);
+
+		if (s_IRQ < 0) {
+			DPRINTF("%s: EOI with no interrupt in service\n",
+				__func__);
+			break;
+		}
+
+		IRQ_resetbit(&dst->servicing, s_IRQ);
+		/* Set up next servicing IRQ */
+		s_IRQ = IRQ_get_next(opp, &dst->servicing);
+		/* Check queued interrupts. */
+		n_IRQ = IRQ_get_next(opp, &dst->raised);
+		src = &opp->src[n_IRQ];
+		if (n_IRQ != -1 &&
+		    (s_IRQ == -1 ||
+		     IVPR_PRIORITY(src->ivpr) > dst->servicing.priority)) {
+			DPRINTF("Raise OpenPIC INT output cpu %d irq %d\n",
+				idx, n_IRQ);
+			qemu_irq_raise(opp->dst[idx].irqs[OPENPIC_OUTPUT_INT]);
+		}
+		break;
+	default:
+		break;
+	}
+}
+
+static void openpic_cpu_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned len)
+{
+	openpic_cpu_write_internal(opaque, addr, val, (addr & 0x1f000) >> 12);
+}
+
+static uint32_t openpic_iack(OpenPICState * opp, IRQDest * dst, int cpu)
+{
+	IRQSource *src;
+	int retval, irq;
+
+	DPRINTF("Lower OpenPIC INT output\n");
+	qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
+
+	irq = IRQ_get_next(opp, &dst->raised);
+	DPRINTF("IACK: irq=%d\n", irq);
+
+	if (irq == -1) {
+		/* No more interrupt pending */
+		return opp->spve;
+	}
+
+	src = &opp->src[irq];
+	if (!(src->ivpr & IVPR_ACTIVITY_MASK) ||
+	    !(IVPR_PRIORITY(src->ivpr) > dst->ctpr)) {
+		fprintf(stderr, "%s: bad raised IRQ %d ctpr %d ivpr 0x%08x\n",
+			__func__, irq, dst->ctpr, src->ivpr);
+		openpic_update_irq(opp, irq);
+		retval = opp->spve;
+	} else {
+		/* IRQ enter servicing state */
+		IRQ_setbit(&dst->servicing, irq);
+		retval = IVPR_VECTOR(opp, src->ivpr);
+	}
+
+	if (!src->level) {
+		/* edge-sensitive IRQ */
+		src->ivpr &= ~IVPR_ACTIVITY_MASK;
+		src->pending = 0;
+		IRQ_resetbit(&dst->raised, irq);
+	}
+
+	if ((irq >= opp->irq_ipi0) && (irq < (opp->irq_ipi0 + MAX_IPI))) {
+		src->destmask &= ~(1 << cpu);
+		if (src->destmask && !src->level) {
+			/* trigger on CPUs that didn't know about it yet */
+			openpic_set_irq(opp, irq, 1);
+			openpic_set_irq(opp, irq, 0);
+			/* if all CPUs knew about it, set active bit again */
+			src->ivpr |= IVPR_ACTIVITY_MASK;
+		}
+	}
+
+	return retval;
+}
+
+static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx)
+{
+	OpenPICState *opp = opaque;
+	IRQDest *dst;
+	uint32_t retval;
+
+	DPRINTF("%s: cpu %d addr %#" HWADDR_PRIx "\n", __func__, idx, addr);
+	retval = 0xFFFFFFFF;
+
+	if (idx < 0) {
+		return retval;
+	}
+
+	if (addr & 0xF) {
+		return retval;
+	}
+	dst = &opp->dst[idx];
+	addr &= 0xFF0;
+	switch (addr) {
+	case 0x80:		/* CTPR */
+		retval = dst->ctpr;
+		break;
+	case 0x90:		/* WHOAMI */
+		retval = idx;
+		break;
+	case 0xA0:		/* IACK */
+		retval = openpic_iack(opp, dst, idx);
+		break;
+	case 0xB0:		/* EOI */
+		retval = 0;
+		break;
+	default:
+		break;
+	}
+	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+
+	return retval;
+}
+
+static uint64_t openpic_cpu_read(void *opaque, hwaddr addr, unsigned len)
+{
+	return openpic_cpu_read_internal(opaque, addr, (addr & 0x1f000) >> 12);
+}
+
+static const MemoryRegionOps openpic_glb_ops_le = {
+	.write = openpic_gbl_write,
+	.read = openpic_gbl_read,
+	.endianness = DEVICE_LITTLE_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_glb_ops_be = {
+	.write = openpic_gbl_write,
+	.read = openpic_gbl_read,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_tmr_ops_le = {
+	.write = openpic_tmr_write,
+	.read = openpic_tmr_read,
+	.endianness = DEVICE_LITTLE_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_tmr_ops_be = {
+	.write = openpic_tmr_write,
+	.read = openpic_tmr_read,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_cpu_ops_le = {
+	.write = openpic_cpu_write,
+	.read = openpic_cpu_read,
+	.endianness = DEVICE_LITTLE_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_cpu_ops_be = {
+	.write = openpic_cpu_write,
+	.read = openpic_cpu_read,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_src_ops_le = {
+	.write = openpic_src_write,
+	.read = openpic_src_read,
+	.endianness = DEVICE_LITTLE_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_src_ops_be = {
+	.write = openpic_src_write,
+	.read = openpic_src_read,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_msi_ops_be = {
+	.read = openpic_msi_read,
+	.write = openpic_msi_write,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_summary_ops_be = {
+	.read = openpic_summary_read,
+	.write = openpic_summary_write,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static void openpic_save_IRQ_queue(QEMUFile * f, IRQQueue * q)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(q->queue); i++) {
+		/* Always put the lower half of a 64-bit long first, in case we
+		 * restore on a 32-bit host.  The least significant bits correspond
+		 * to lower IRQ numbers in the bitmap.
+		 */
+		qemu_put_be32(f, (uint32_t) q->queue[i]);
+#if LONG_MAX > 0x7FFFFFFF
+		qemu_put_be32(f, (uint32_t) (q->queue[i] >> 32));
+#endif
+	}
+
+	qemu_put_sbe32s(f, &q->next);
+	qemu_put_sbe32s(f, &q->priority);
+}
+
+static void openpic_save(QEMUFile * f, void *opaque)
+{
+	OpenPICState *opp = (OpenPICState *) opaque;
+	unsigned int i;
+
+	qemu_put_be32s(f, &opp->gcr);
+	qemu_put_be32s(f, &opp->vir);
+	qemu_put_be32s(f, &opp->pir);
+	qemu_put_be32s(f, &opp->spve);
+	qemu_put_be32s(f, &opp->tfrr);
+
+	qemu_put_be32s(f, &opp->nb_cpus);
+
+	for (i = 0; i < opp->nb_cpus; i++) {
+		qemu_put_sbe32s(f, &opp->dst[i].ctpr);
+		openpic_save_IRQ_queue(f, &opp->dst[i].raised);
+		openpic_save_IRQ_queue(f, &opp->dst[i].servicing);
+		qemu_put_buffer(f, (uint8_t *) & opp->dst[i].outputs_active,
+				sizeof(opp->dst[i].outputs_active));
+	}
+
+	for (i = 0; i < MAX_TMR; i++) {
+		qemu_put_be32s(f, &opp->timers[i].tccr);
+		qemu_put_be32s(f, &opp->timers[i].tbcr);
+	}
+
+	for (i = 0; i < opp->max_irq; i++) {
+		qemu_put_be32s(f, &opp->src[i].ivpr);
+		qemu_put_be32s(f, &opp->src[i].idr);
+		qemu_get_be32s(f, &opp->src[i].destmask);
+		qemu_put_sbe32s(f, &opp->src[i].last_cpu);
+		qemu_put_sbe32s(f, &opp->src[i].pending);
+	}
+}
+
+static void openpic_load_IRQ_queue(QEMUFile * f, IRQQueue * q)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(q->queue); i++) {
+		unsigned long val;
+
+		val = qemu_get_be32(f);
+#if LONG_MAX > 0x7FFFFFFF
+		val <<= 32;
+		val |= qemu_get_be32(f);
+#endif
+
+		q->queue[i] = val;
+	}
+
+	qemu_get_sbe32s(f, &q->next);
+	qemu_get_sbe32s(f, &q->priority);
+}
+
+static int openpic_load(QEMUFile * f, void *opaque, int version_id)
+{
+	OpenPICState *opp = (OpenPICState *) opaque;
+	unsigned int i;
+
+	if (version_id != 1) {
+		return -EINVAL;
+	}
+
+	qemu_get_be32s(f, &opp->gcr);
+	qemu_get_be32s(f, &opp->vir);
+	qemu_get_be32s(f, &opp->pir);
+	qemu_get_be32s(f, &opp->spve);
+	qemu_get_be32s(f, &opp->tfrr);
+
+	qemu_get_be32s(f, &opp->nb_cpus);
+
+	for (i = 0; i < opp->nb_cpus; i++) {
+		qemu_get_sbe32s(f, &opp->dst[i].ctpr);
+		openpic_load_IRQ_queue(f, &opp->dst[i].raised);
+		openpic_load_IRQ_queue(f, &opp->dst[i].servicing);
+		qemu_get_buffer(f, (uint8_t *) & opp->dst[i].outputs_active,
+				sizeof(opp->dst[i].outputs_active));
+	}
+
+	for (i = 0; i < MAX_TMR; i++) {
+		qemu_get_be32s(f, &opp->timers[i].tccr);
+		qemu_get_be32s(f, &opp->timers[i].tbcr);
+	}
+
+	for (i = 0; i < opp->max_irq; i++) {
+		uint32_t val;
+
+		val = qemu_get_be32(f);
+		write_IRQreg_idr(opp, i, val);
+		val = qemu_get_be32(f);
+		write_IRQreg_ivpr(opp, i, val);
+
+		qemu_get_be32s(f, &opp->src[i].ivpr);
+		qemu_get_be32s(f, &opp->src[i].idr);
+		qemu_get_be32s(f, &opp->src[i].destmask);
+		qemu_get_sbe32s(f, &opp->src[i].last_cpu);
+		qemu_get_sbe32s(f, &opp->src[i].pending);
+	}
+
+	return 0;
+}
+
+typedef struct MemReg {
+	const char *name;
+	MemoryRegionOps const *ops;
+	hwaddr start_addr;
+	ram_addr_t size;
+} MemReg;
+
+static void fsl_common_init(OpenPICState * opp)
+{
+	int i;
+	int virq = MAX_SRC;
+
+	opp->vid = VID_REVISION_1_2;
+	opp->vir = VIR_GENERIC;
+	opp->vector_mask = 0xFFFF;
+	opp->tfrr_reset = 0;
+	opp->ivpr_reset = IVPR_MASK_MASK;
+	opp->idr_reset = 1 << 0;
+	opp->max_irq = MAX_IRQ;
+
+	opp->irq_ipi0 = virq;
+	virq += MAX_IPI;
+	opp->irq_tim0 = virq;
+	virq += MAX_TMR;
+
+	assert(virq <= MAX_IRQ);
+
+	opp->irq_msi = 224;
+
+	msi_supported = true;
+	for (i = 0; i < opp->fsl->max_ext; i++) {
+		opp->src[i].level = false;
+	}
+
+	/* Internal interrupts, including message and MSI */
+	for (i = 16; i < MAX_SRC; i++) {
+		opp->src[i].type = IRQ_TYPE_FSLINT;
+		opp->src[i].level = true;
+	}
+
+	/* timers and IPIs */
+	for (i = MAX_SRC; i < virq; i++) {
+		opp->src[i].type = IRQ_TYPE_FSLSPECIAL;
+		opp->src[i].level = false;
+	}
+}
+
+static void map_list(OpenPICState * opp, const MemReg * list, int *count)
+{
+	while (list->name) {
+		assert(*count < ARRAY_SIZE(opp->sub_io_mem));
+
+		memory_region_init_io(&opp->sub_io_mem[*count], list->ops, opp,
+				      list->name, list->size);
+
+		memory_region_add_subregion(&opp->mem, list->start_addr,
+					    &opp->sub_io_mem[*count]);
+
+		(*count)++;
+		list++;
+	}
+}
+
+static int openpic_init(SysBusDevice * dev)
+{
+	OpenPICState *opp = FROM_SYSBUS(typeof(*opp), dev);
+	int i, j;
+	int list_count = 0;
+	static const MemReg list_le[] = {
+		{"glb", &openpic_glb_ops_le,
+		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
+		{"tmr", &openpic_tmr_ops_le,
+		 OPENPIC_TMR_REG_START, OPENPIC_TMR_REG_SIZE},
+		{"src", &openpic_src_ops_le,
+		 OPENPIC_SRC_REG_START, OPENPIC_SRC_REG_SIZE},
+		{"cpu", &openpic_cpu_ops_le,
+		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
+		{NULL}
+	};
+	static const MemReg list_be[] = {
+		{"glb", &openpic_glb_ops_be,
+		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
+		{"tmr", &openpic_tmr_ops_be,
+		 OPENPIC_TMR_REG_START, OPENPIC_TMR_REG_SIZE},
+		{"src", &openpic_src_ops_be,
+		 OPENPIC_SRC_REG_START, OPENPIC_SRC_REG_SIZE},
+		{"cpu", &openpic_cpu_ops_be,
+		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
+		{NULL}
+	};
+	static const MemReg list_fsl[] = {
+		{"msi", &openpic_msi_ops_be,
+		 OPENPIC_MSI_REG_START, OPENPIC_MSI_REG_SIZE},
+		{"summary", &openpic_summary_ops_be,
+		 OPENPIC_SUMMARY_REG_START, OPENPIC_SUMMARY_REG_SIZE},
+		{NULL}
+	};
+
+	memory_region_init(&opp->mem, "openpic", 0x40000);
+
+	switch (opp->model) {
+	case OPENPIC_MODEL_FSL_MPIC_20:
+	default:
+		opp->fsl = &fsl_mpic_20;
+		opp->brr1 = 0x00400200;
+		opp->flags |= OPENPIC_FLAG_IDR_CRIT;
+		opp->nb_irqs = 80;
+		opp->mpic_mode_mask = GCR_MODE_MIXED;
+
+		fsl_common_init(opp);
+		map_list(opp, list_be, &list_count);
+		map_list(opp, list_fsl, &list_count);
+
+		break;
+
+	case OPENPIC_MODEL_FSL_MPIC_42:
+		opp->fsl = &fsl_mpic_42;
+		opp->brr1 = 0x00400402;
+		opp->flags |= OPENPIC_FLAG_ILR;
+		opp->nb_irqs = 196;
+		opp->mpic_mode_mask = GCR_MODE_PROXY;
+
+		fsl_common_init(opp);
+		map_list(opp, list_be, &list_count);
+		map_list(opp, list_fsl, &list_count);
+
+		break;
+
+	case OPENPIC_MODEL_RAVEN:
+		opp->nb_irqs = RAVEN_MAX_EXT;
+		opp->vid = VID_REVISION_1_3;
+		opp->vir = VIR_GENERIC;
+		opp->vector_mask = 0xFF;
+		opp->tfrr_reset = 4160000;
+		opp->ivpr_reset = IVPR_MASK_MASK | IVPR_MODE_MASK;
+		opp->idr_reset = 0;
+		opp->max_irq = RAVEN_MAX_IRQ;
+		opp->irq_ipi0 = RAVEN_IPI_IRQ;
+		opp->irq_tim0 = RAVEN_TMR_IRQ;
+		opp->brr1 = -1;
+		opp->mpic_mode_mask = GCR_MODE_MIXED;
+
+		/* Only UP supported today */
+		if (opp->nb_cpus != 1) {
+			return -EINVAL;
+		}
+
+		map_list(opp, list_le, &list_count);
+		break;
+	}
+
+	for (i = 0; i < opp->nb_cpus; i++) {
+		opp->dst[i].irqs = g_new(qemu_irq, OPENPIC_OUTPUT_NB);
+		for (j = 0; j < OPENPIC_OUTPUT_NB; j++) {
+			sysbus_init_irq(dev, &opp->dst[i].irqs[j]);
+		}
+	}
+
+	register_savevm(&opp->busdev.qdev, "openpic", 0, 2,
+			openpic_save, openpic_load, opp);
+
+	sysbus_init_mmio(dev, &opp->mem);
+	qdev_init_gpio_in(&dev->qdev, openpic_set_irq, opp->max_irq);
+
+	return 0;
+}
+
+static Property openpic_properties[] = {
+	DEFINE_PROP_UINT32("model", OpenPICState, model,
+			   OPENPIC_MODEL_FSL_MPIC_20),
+	DEFINE_PROP_UINT32("nb_cpus", OpenPICState, nb_cpus, 1),
+	DEFINE_PROP_END_OF_LIST(),
+};
+
+static void openpic_class_init(ObjectClass * klass, void *data)
+{
+	DeviceClass *dc = DEVICE_CLASS(klass);
+	SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
+
+	k->init = openpic_init;
+	dc->props = openpic_properties;
+	dc->reset = openpic_reset;
+}
+
+static const TypeInfo openpic_info = {
+	.name = "openpic",
+	.parent = TYPE_SYS_BUS_DEVICE,
+	.instance_size = sizeof(OpenPICState),
+	.class_init = openpic_class_init,
+};
+
+static void openpic_register_types(void)
+{
+	type_register_static(&openpic_info);
+}
+
+type_init(openpic_register_types)
-- 
1.6.0.2
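
The timer register handling in openpic_tmr_write() above relies on a
small amount of address arithmetic, which can be sketched in isolation.
This is a minimal illustration and not part of the patch: the names
tmr_reg, tmr_decode and tmr_decode_addr are hypothetical, and the
addresses are assumed to be the absolute ones the write path sees after
it adds 0x10f0 (so timer 0 starts at 0x1100).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not part of the patch. */
enum tmr_reg {
	TMR_REG_TCCR = 0x00,	/* current count */
	TMR_REG_TBCR = 0x10,	/* base count */
	TMR_REG_TVPR = 0x20,	/* vector/priority */
	TMR_REG_TDR  = 0x30,	/* destination */
};

struct tmr_decode {
	int idx;		/* timer number, 0..3 */
	enum tmr_reg reg;	/* which register of that timer */
};

/*
 * Decode an absolute timer register address (0x1100 and up) with the
 * same shift and mask as the write path: each timer occupies 0x40
 * bytes and each register 0x10 bytes.
 */
static struct tmr_decode tmr_decode_addr(uint64_t addr)
{
	struct tmr_decode d;

	/* The handlers return early on unaligned accesses. */
	assert((addr & 0xF) == 0);

	d.idx = (int)((addr >> 6) & 0x3);
	d.reg = (enum tmr_reg)(addr & 0x30);
	return d;
}
```

Under these assumptions, 0x1150 decodes to timer 1, TBCR, and 0x11f0
decodes to timer 3, TDR.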


^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 11/17] kvm/ppc/mpic: remove some obviously unneeded code
  2013-04-18 14:11 ` Alexander Graf
@ 2013-04-18 14:11   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

Remove some parts of the code that are obviously QEMU or Raven specific
before fixing style issues, to reduce the style issues that need to be
fixed.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/mpic.c |  344 -----------------------------------------------
 1 files changed, 0 insertions(+), 344 deletions(-)

diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index 57655b9..d6d70a4 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -22,39 +22,6 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-/*
- *
- * Based on OpenPic implementations:
- * - Intel GW80314 I/O companion chip developer's manual
- * - Motorola MPC8245 & MPC8540 user manuals.
- * - Motorola MCP750 (aka Raven) programmer manual.
- * - Motorola Harrier programmer manuel
- *
- * Serial interrupts, as implemented in Raven chipset are not supported yet.
- *
- */
-#include "hw.h"
-#include "ppc/mac.h"
-#include "pci/pci.h"
-#include "openpic.h"
-#include "sysbus.h"
-#include "pci/msi.h"
-#include "qemu/bitops.h"
-#include "ppc.h"
-
-//#define DEBUG_OPENPIC
-
-#ifdef DEBUG_OPENPIC
-static const int debug_openpic = 1;
-#else
-static const int debug_openpic = 0;
-#endif
-
-#define DPRINTF(fmt, ...) do { \
-        if (debug_openpic) { \
-            printf(fmt , ## __VA_ARGS__); \
-        } \
-    } while (0)
 
 #define MAX_CPU     32
 #define MAX_SRC     256
@@ -82,21 +49,6 @@ static const int debug_openpic = 0;
 #define OPENPIC_CPU_REG_START        0x20000
 #define OPENPIC_CPU_REG_SIZE         0x100 + ((MAX_CPU - 1) * 0x1000)
 
-/* Raven */
-#define RAVEN_MAX_CPU      2
-#define RAVEN_MAX_EXT     48
-#define RAVEN_MAX_IRQ     64
-#define RAVEN_MAX_TMR      MAX_TMR
-#define RAVEN_MAX_IPI      MAX_IPI
-
-/* Interrupt definitions */
-#define RAVEN_FE_IRQ     (RAVEN_MAX_EXT)	/* Internal functional IRQ */
-#define RAVEN_ERR_IRQ    (RAVEN_MAX_EXT + 1)	/* Error IRQ */
-#define RAVEN_TMR_IRQ    (RAVEN_MAX_EXT + 2)	/* First timer IRQ */
-#define RAVEN_IPI_IRQ    (RAVEN_TMR_IRQ + RAVEN_MAX_TMR)	/* First IPI IRQ */
-/* First doorbell IRQ */
-#define RAVEN_DBL_IRQ    (RAVEN_IPI_IRQ + (RAVEN_MAX_CPU * RAVEN_MAX_IPI))
-
 typedef struct FslMpicInfo {
 	int max_ext;
 } FslMpicInfo;
@@ -138,44 +90,6 @@ static FslMpicInfo fsl_mpic_42 = {
 #define ILR_INTTGT_CINT   0x01	/* critical */
 #define ILR_INTTGT_MCP    0x02	/* machine check */
 
-/* The currently supported INTTGT values happen to be the same as QEMU's
- * openpic output codes, but don't depend on this.  The output codes
- * could change (unlikely, but...) or support could be added for
- * more INTTGT values.
- */
-static const int inttgt_output[][2] = {
-	{ILR_INTTGT_INT, OPENPIC_OUTPUT_INT},
-	{ILR_INTTGT_CINT, OPENPIC_OUTPUT_CINT},
-	{ILR_INTTGT_MCP, OPENPIC_OUTPUT_MCK},
-};
-
-static int inttgt_to_output(int inttgt)
-{
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(inttgt_output); i++) {
-		if (inttgt_output[i][0] == inttgt) {
-			return inttgt_output[i][1];
-		}
-	}
-
-	fprintf(stderr, "%s: unsupported inttgt %d\n", __func__, inttgt);
-	return OPENPIC_OUTPUT_INT;
-}
-
-static int output_to_inttgt(int output)
-{
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(inttgt_output); i++) {
-		if (inttgt_output[i][1] == output) {
-			return inttgt_output[i][0];
-		}
-	}
-
-	abort();
-}
-
 #define MSIIR_OFFSET       0x140
 #define MSIIR_SRS_SHIFT    29
 #define MSIIR_SRS_MASK     (0x7 << MSIIR_SRS_SHIFT)
@@ -1265,228 +1179,36 @@ static uint64_t openpic_cpu_read(void *opaque, hwaddr addr, unsigned len)
 	return openpic_cpu_read_internal(opaque, addr, (addr & 0x1f000) >> 12);
 }
 
-static const MemoryRegionOps openpic_glb_ops_le = {
-	.write = openpic_gbl_write,
-	.read = openpic_gbl_read,
-	.endianness = DEVICE_LITTLE_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
-};
-
 static const MemoryRegionOps openpic_glb_ops_be = {
 	.write = openpic_gbl_write,
 	.read = openpic_gbl_read,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
-};
-
-static const MemoryRegionOps openpic_tmr_ops_le = {
-	.write = openpic_tmr_write,
-	.read = openpic_tmr_read,
-	.endianness = DEVICE_LITTLE_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_tmr_ops_be = {
 	.write = openpic_tmr_write,
 	.read = openpic_tmr_read,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
-};
-
-static const MemoryRegionOps openpic_cpu_ops_le = {
-	.write = openpic_cpu_write,
-	.read = openpic_cpu_read,
-	.endianness = DEVICE_LITTLE_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_cpu_ops_be = {
 	.write = openpic_cpu_write,
 	.read = openpic_cpu_read,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
-};
-
-static const MemoryRegionOps openpic_src_ops_le = {
-	.write = openpic_src_write,
-	.read = openpic_src_read,
-	.endianness = DEVICE_LITTLE_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_src_ops_be = {
 	.write = openpic_src_write,
 	.read = openpic_src_read,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_msi_ops_be = {
 	.read = openpic_msi_read,
 	.write = openpic_msi_write,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_summary_ops_be = {
 	.read = openpic_summary_read,
 	.write = openpic_summary_write,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
-static void openpic_save_IRQ_queue(QEMUFile * f, IRQQueue * q)
-{
-	unsigned int i;
-
-	for (i = 0; i < ARRAY_SIZE(q->queue); i++) {
-		/* Always put the lower half of a 64-bit long first, in case we
-		 * restore on a 32-bit host.  The least significant bits correspond
-		 * to lower IRQ numbers in the bitmap.
-		 */
-		qemu_put_be32(f, (uint32_t) q->queue[i]);
-#if LONG_MAX > 0x7FFFFFFF
-		qemu_put_be32(f, (uint32_t) (q->queue[i] >> 32));
-#endif
-	}
-
-	qemu_put_sbe32s(f, &q->next);
-	qemu_put_sbe32s(f, &q->priority);
-}
-
-static void openpic_save(QEMUFile * f, void *opaque)
-{
-	OpenPICState *opp = (OpenPICState *) opaque;
-	unsigned int i;
-
-	qemu_put_be32s(f, &opp->gcr);
-	qemu_put_be32s(f, &opp->vir);
-	qemu_put_be32s(f, &opp->pir);
-	qemu_put_be32s(f, &opp->spve);
-	qemu_put_be32s(f, &opp->tfrr);
-
-	qemu_put_be32s(f, &opp->nb_cpus);
-
-	for (i = 0; i < opp->nb_cpus; i++) {
-		qemu_put_sbe32s(f, &opp->dst[i].ctpr);
-		openpic_save_IRQ_queue(f, &opp->dst[i].raised);
-		openpic_save_IRQ_queue(f, &opp->dst[i].servicing);
-		qemu_put_buffer(f, (uint8_t *) & opp->dst[i].outputs_active,
-				sizeof(opp->dst[i].outputs_active));
-	}
-
-	for (i = 0; i < MAX_TMR; i++) {
-		qemu_put_be32s(f, &opp->timers[i].tccr);
-		qemu_put_be32s(f, &opp->timers[i].tbcr);
-	}
-
-	for (i = 0; i < opp->max_irq; i++) {
-		qemu_put_be32s(f, &opp->src[i].ivpr);
-		qemu_put_be32s(f, &opp->src[i].idr);
-		qemu_get_be32s(f, &opp->src[i].destmask);
-		qemu_put_sbe32s(f, &opp->src[i].last_cpu);
-		qemu_put_sbe32s(f, &opp->src[i].pending);
-	}
-}
-
-static void openpic_load_IRQ_queue(QEMUFile * f, IRQQueue * q)
-{
-	unsigned int i;
-
-	for (i = 0; i < ARRAY_SIZE(q->queue); i++) {
-		unsigned long val;
-
-		val = qemu_get_be32(f);
-#if LONG_MAX > 0x7FFFFFFF
-		val <<= 32;
-		val |= qemu_get_be32(f);
-#endif
-
-		q->queue[i] = val;
-	}
-
-	qemu_get_sbe32s(f, &q->next);
-	qemu_get_sbe32s(f, &q->priority);
-}
-
-static int openpic_load(QEMUFile * f, void *opaque, int version_id)
-{
-	OpenPICState *opp = (OpenPICState *) opaque;
-	unsigned int i;
-
-	if (version_id != 1) {
-		return -EINVAL;
-	}
-
-	qemu_get_be32s(f, &opp->gcr);
-	qemu_get_be32s(f, &opp->vir);
-	qemu_get_be32s(f, &opp->pir);
-	qemu_get_be32s(f, &opp->spve);
-	qemu_get_be32s(f, &opp->tfrr);
-
-	qemu_get_be32s(f, &opp->nb_cpus);
-
-	for (i = 0; i < opp->nb_cpus; i++) {
-		qemu_get_sbe32s(f, &opp->dst[i].ctpr);
-		openpic_load_IRQ_queue(f, &opp->dst[i].raised);
-		openpic_load_IRQ_queue(f, &opp->dst[i].servicing);
-		qemu_get_buffer(f, (uint8_t *) & opp->dst[i].outputs_active,
-				sizeof(opp->dst[i].outputs_active));
-	}
-
-	for (i = 0; i < MAX_TMR; i++) {
-		qemu_get_be32s(f, &opp->timers[i].tccr);
-		qemu_get_be32s(f, &opp->timers[i].tbcr);
-	}
-
-	for (i = 0; i < opp->max_irq; i++) {
-		uint32_t val;
-
-		val = qemu_get_be32(f);
-		write_IRQreg_idr(opp, i, val);
-		val = qemu_get_be32(f);
-		write_IRQreg_ivpr(opp, i, val);
-
-		qemu_get_be32s(f, &opp->src[i].ivpr);
-		qemu_get_be32s(f, &opp->src[i].idr);
-		qemu_get_be32s(f, &opp->src[i].destmask);
-		qemu_get_sbe32s(f, &opp->src[i].last_cpu);
-		qemu_get_sbe32s(f, &opp->src[i].pending);
-	}
-
-	return 0;
-}
-
 typedef struct MemReg {
 	const char *name;
 	MemoryRegionOps const *ops;
@@ -1614,73 +1336,7 @@ static int openpic_init(SysBusDevice * dev)
 		map_list(opp, list_fsl, &list_count);
 
 		break;
-
-	case OPENPIC_MODEL_RAVEN:
-		opp->nb_irqs = RAVEN_MAX_EXT;
-		opp->vid = VID_REVISION_1_3;
-		opp->vir = VIR_GENERIC;
-		opp->vector_mask = 0xFF;
-		opp->tfrr_reset = 4160000;
-		opp->ivpr_reset = IVPR_MASK_MASK | IVPR_MODE_MASK;
-		opp->idr_reset = 0;
-		opp->max_irq = RAVEN_MAX_IRQ;
-		opp->irq_ipi0 = RAVEN_IPI_IRQ;
-		opp->irq_tim0 = RAVEN_TMR_IRQ;
-		opp->brr1 = -1;
-		opp->mpic_mode_mask = GCR_MODE_MIXED;
-
-		/* Only UP supported today */
-		if (opp->nb_cpus != 1) {
-			return -EINVAL;
-		}
-
-		map_list(opp, list_le, &list_count);
-		break;
-	}
-
-	for (i = 0; i < opp->nb_cpus; i++) {
-		opp->dst[i].irqs = g_new(qemu_irq, OPENPIC_OUTPUT_NB);
-		for (j = 0; j < OPENPIC_OUTPUT_NB; j++) {
-			sysbus_init_irq(dev, &opp->dst[i].irqs[j]);
-		}
 	}
 
-	register_savevm(&opp->busdev.qdev, "openpic", 0, 2,
-			openpic_save, openpic_load, opp);
-
-	sysbus_init_mmio(dev, &opp->mem);
-	qdev_init_gpio_in(&dev->qdev, openpic_set_irq, opp->max_irq);
-
 	return 0;
 }
-
-static Property openpic_properties[] = {
-	DEFINE_PROP_UINT32("model", OpenPICState, model,
-			   OPENPIC_MODEL_FSL_MPIC_20),
-	DEFINE_PROP_UINT32("nb_cpus", OpenPICState, nb_cpus, 1),
-	DEFINE_PROP_END_OF_LIST(),
-};
-
-static void openpic_class_init(ObjectClass * klass, void *data)
-{
-	DeviceClass *dc = DEVICE_CLASS(klass);
-	SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
-
-	k->init = openpic_init;
-	dc->props = openpic_properties;
-	dc->reset = openpic_reset;
-}
-
-static const TypeInfo openpic_info = {
-	.name = "openpic",
-	.parent = TYPE_SYS_BUS_DEVICE,
-	.instance_size = sizeof(OpenPICState),
-	.class_init = openpic_class_init,
-};
-
-static void openpic_register_types(void)
-{
-	type_register_static(&openpic_info);
-}
-
-type_init(openpic_register_types)
-- 
1.6.0.2
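
The per-CPU register accessors kept by this patch, openpic_cpu_read()
and openpic_cpu_write(), split the offset into a CPU index and a
register offset. The sketch below shows that split on its own. It is
not part of the patch: cpu_reg_index and cpu_reg_offset are hypothetical
helper names, and the offset is assumed to be relative to
OPENPIC_CPU_REG_START, with one 0x1000-byte page per CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Value copied from the hunk context above. */
#define MAX_CPU     32

/* Hypothetical helpers for illustration only; not part of the patch. */

/* CPU index: bits 12..16 of the offset, one 0x1000 page per CPU. */
static int cpu_reg_index(uint64_t addr)
{
	int idx = (int)((addr & 0x1f000) >> 12);

	/* A 5-bit field can never exceed MAX_CPU - 1. */
	assert(idx < MAX_CPU);
	return idx;
}

/* Register within the per-CPU page, e.g. 0x80 CTPR, 0xA0 IACK. */
static unsigned int cpu_reg_offset(uint64_t addr)
{
	return (unsigned int)(addr & 0xFF0);
}
```

With these assumptions, offset 0x30A0 is the IACK register of CPU 3,
and offset 0x1f0b0 is the EOI register of CPU 31.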


^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 11/17] kvm/ppc/mpic: remove some obviously unneeded code
@ 2013-04-18 14:11   ` Alexander Graf
  0 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

Remove some parts of the code that are obviously QEMU or Raven specific
before fixing style issues, to reduce the style issues that need to be
fixed.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/mpic.c |  344 -----------------------------------------------
 1 files changed, 0 insertions(+), 344 deletions(-)

diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index 57655b9..d6d70a4 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -22,39 +22,6 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-/*
- *
- * Based on OpenPic implementations:
- * - Intel GW80314 I/O companion chip developer's manual
- * - Motorola MPC8245 & MPC8540 user manuals.
- * - Motorola MCP750 (aka Raven) programmer manual.
- * - Motorola Harrier programmer manuel
- *
- * Serial interrupts, as implemented in Raven chipset are not supported yet.
- *
- */
-#include "hw.h"
-#include "ppc/mac.h"
-#include "pci/pci.h"
-#include "openpic.h"
-#include "sysbus.h"
-#include "pci/msi.h"
-#include "qemu/bitops.h"
-#include "ppc.h"
-
-//#define DEBUG_OPENPIC
-
-#ifdef DEBUG_OPENPIC
-static const int debug_openpic = 1;
-#else
-static const int debug_openpic = 0;
-#endif
-
-#define DPRINTF(fmt, ...) do { \
-        if (debug_openpic) { \
-            printf(fmt , ## __VA_ARGS__); \
-        } \
-    } while (0)
 
 #define MAX_CPU     32
 #define MAX_SRC     256
@@ -82,21 +49,6 @@ static const int debug_openpic = 0;
 #define OPENPIC_CPU_REG_START        0x20000
 #define OPENPIC_CPU_REG_SIZE         0x100 + ((MAX_CPU - 1) * 0x1000)
 
-/* Raven */
-#define RAVEN_MAX_CPU      2
-#define RAVEN_MAX_EXT     48
-#define RAVEN_MAX_IRQ     64
-#define RAVEN_MAX_TMR      MAX_TMR
-#define RAVEN_MAX_IPI      MAX_IPI
-
-/* Interrupt definitions */
-#define RAVEN_FE_IRQ     (RAVEN_MAX_EXT)	/* Internal functional IRQ */
-#define RAVEN_ERR_IRQ    (RAVEN_MAX_EXT + 1)	/* Error IRQ */
-#define RAVEN_TMR_IRQ    (RAVEN_MAX_EXT + 2)	/* First timer IRQ */
-#define RAVEN_IPI_IRQ    (RAVEN_TMR_IRQ + RAVEN_MAX_TMR)	/* First IPI IRQ */
-/* First doorbell IRQ */
-#define RAVEN_DBL_IRQ    (RAVEN_IPI_IRQ + (RAVEN_MAX_CPU * RAVEN_MAX_IPI))
-
 typedef struct FslMpicInfo {
 	int max_ext;
 } FslMpicInfo;
@@ -138,44 +90,6 @@ static FslMpicInfo fsl_mpic_42 = {
 #define ILR_INTTGT_CINT   0x01	/* critical */
 #define ILR_INTTGT_MCP    0x02	/* machine check */
 
-/* The currently supported INTTGT values happen to be the same as QEMU's
- * openpic output codes, but don't depend on this.  The output codes
- * could change (unlikely, but...) or support could be added for
- * more INTTGT values.
- */
-static const int inttgt_output[][2] = {
-	{ILR_INTTGT_INT, OPENPIC_OUTPUT_INT},
-	{ILR_INTTGT_CINT, OPENPIC_OUTPUT_CINT},
-	{ILR_INTTGT_MCP, OPENPIC_OUTPUT_MCK},
-};
-
-static int inttgt_to_output(int inttgt)
-{
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(inttgt_output); i++) {
-		if (inttgt_output[i][0] == inttgt) {
-			return inttgt_output[i][1];
-		}
-	}
-
-	fprintf(stderr, "%s: unsupported inttgt %d\n", __func__, inttgt);
-	return OPENPIC_OUTPUT_INT;
-}
-
-static int output_to_inttgt(int output)
-{
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(inttgt_output); i++) {
-		if (inttgt_output[i][1] == output) {
-			return inttgt_output[i][0];
-		}
-	}
-
-	abort();
-}
-
 #define MSIIR_OFFSET       0x140
 #define MSIIR_SRS_SHIFT    29
 #define MSIIR_SRS_MASK     (0x7 << MSIIR_SRS_SHIFT)
@@ -1265,228 +1179,36 @@ static uint64_t openpic_cpu_read(void *opaque, hwaddr addr, unsigned len)
 	return openpic_cpu_read_internal(opaque, addr, (addr & 0x1f000) >> 12);
 }
 
-static const MemoryRegionOps openpic_glb_ops_le = {
-	.write = openpic_gbl_write,
-	.read = openpic_gbl_read,
-	.endianness = DEVICE_LITTLE_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
-};
-
 static const MemoryRegionOps openpic_glb_ops_be = {
 	.write = openpic_gbl_write,
 	.read = openpic_gbl_read,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
-};
-
-static const MemoryRegionOps openpic_tmr_ops_le = {
-	.write = openpic_tmr_write,
-	.read = openpic_tmr_read,
-	.endianness = DEVICE_LITTLE_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_tmr_ops_be = {
 	.write = openpic_tmr_write,
 	.read = openpic_tmr_read,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
-};
-
-static const MemoryRegionOps openpic_cpu_ops_le = {
-	.write = openpic_cpu_write,
-	.read = openpic_cpu_read,
-	.endianness = DEVICE_LITTLE_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_cpu_ops_be = {
 	.write = openpic_cpu_write,
 	.read = openpic_cpu_read,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
-};
-
-static const MemoryRegionOps openpic_src_ops_le = {
-	.write = openpic_src_write,
-	.read = openpic_src_read,
-	.endianness = DEVICE_LITTLE_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_src_ops_be = {
 	.write = openpic_src_write,
 	.read = openpic_src_read,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_msi_ops_be = {
 	.read = openpic_msi_read,
 	.write = openpic_msi_write,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_summary_ops_be = {
 	.read = openpic_summary_read,
 	.write = openpic_summary_write,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
-static void openpic_save_IRQ_queue(QEMUFile * f, IRQQueue * q)
-{
-	unsigned int i;
-
-	for (i = 0; i < ARRAY_SIZE(q->queue); i++) {
-		/* Always put the lower half of a 64-bit long first, in case we
-		 * restore on a 32-bit host.  The least significant bits correspond
-		 * to lower IRQ numbers in the bitmap.
-		 */
-		qemu_put_be32(f, (uint32_t) q->queue[i]);
-#if LONG_MAX > 0x7FFFFFFF
-		qemu_put_be32(f, (uint32_t) (q->queue[i] >> 32));
-#endif
-	}
-
-	qemu_put_sbe32s(f, &q->next);
-	qemu_put_sbe32s(f, &q->priority);
-}
-
-static void openpic_save(QEMUFile * f, void *opaque)
-{
-	OpenPICState *opp = (OpenPICState *) opaque;
-	unsigned int i;
-
-	qemu_put_be32s(f, &opp->gcr);
-	qemu_put_be32s(f, &opp->vir);
-	qemu_put_be32s(f, &opp->pir);
-	qemu_put_be32s(f, &opp->spve);
-	qemu_put_be32s(f, &opp->tfrr);
-
-	qemu_put_be32s(f, &opp->nb_cpus);
-
-	for (i = 0; i < opp->nb_cpus; i++) {
-		qemu_put_sbe32s(f, &opp->dst[i].ctpr);
-		openpic_save_IRQ_queue(f, &opp->dst[i].raised);
-		openpic_save_IRQ_queue(f, &opp->dst[i].servicing);
-		qemu_put_buffer(f, (uint8_t *) & opp->dst[i].outputs_active,
-				sizeof(opp->dst[i].outputs_active));
-	}
-
-	for (i = 0; i < MAX_TMR; i++) {
-		qemu_put_be32s(f, &opp->timers[i].tccr);
-		qemu_put_be32s(f, &opp->timers[i].tbcr);
-	}
-
-	for (i = 0; i < opp->max_irq; i++) {
-		qemu_put_be32s(f, &opp->src[i].ivpr);
-		qemu_put_be32s(f, &opp->src[i].idr);
-		qemu_get_be32s(f, &opp->src[i].destmask);
-		qemu_put_sbe32s(f, &opp->src[i].last_cpu);
-		qemu_put_sbe32s(f, &opp->src[i].pending);
-	}
-}
-
-static void openpic_load_IRQ_queue(QEMUFile * f, IRQQueue * q)
-{
-	unsigned int i;
-
-	for (i = 0; i < ARRAY_SIZE(q->queue); i++) {
-		unsigned long val;
-
-		val = qemu_get_be32(f);
-#if LONG_MAX > 0x7FFFFFFF
-		val <<= 32;
-		val |= qemu_get_be32(f);
-#endif
-
-		q->queue[i] = val;
-	}
-
-	qemu_get_sbe32s(f, &q->next);
-	qemu_get_sbe32s(f, &q->priority);
-}
-
-static int openpic_load(QEMUFile * f, void *opaque, int version_id)
-{
-	OpenPICState *opp = (OpenPICState *) opaque;
-	unsigned int i;
-
-	if (version_id != 1) {
-		return -EINVAL;
-	}
-
-	qemu_get_be32s(f, &opp->gcr);
-	qemu_get_be32s(f, &opp->vir);
-	qemu_get_be32s(f, &opp->pir);
-	qemu_get_be32s(f, &opp->spve);
-	qemu_get_be32s(f, &opp->tfrr);
-
-	qemu_get_be32s(f, &opp->nb_cpus);
-
-	for (i = 0; i < opp->nb_cpus; i++) {
-		qemu_get_sbe32s(f, &opp->dst[i].ctpr);
-		openpic_load_IRQ_queue(f, &opp->dst[i].raised);
-		openpic_load_IRQ_queue(f, &opp->dst[i].servicing);
-		qemu_get_buffer(f, (uint8_t *) & opp->dst[i].outputs_active,
-				sizeof(opp->dst[i].outputs_active));
-	}
-
-	for (i = 0; i < MAX_TMR; i++) {
-		qemu_get_be32s(f, &opp->timers[i].tccr);
-		qemu_get_be32s(f, &opp->timers[i].tbcr);
-	}
-
-	for (i = 0; i < opp->max_irq; i++) {
-		uint32_t val;
-
-		val = qemu_get_be32(f);
-		write_IRQreg_idr(opp, i, val);
-		val = qemu_get_be32(f);
-		write_IRQreg_ivpr(opp, i, val);
-
-		qemu_get_be32s(f, &opp->src[i].ivpr);
-		qemu_get_be32s(f, &opp->src[i].idr);
-		qemu_get_be32s(f, &opp->src[i].destmask);
-		qemu_get_sbe32s(f, &opp->src[i].last_cpu);
-		qemu_get_sbe32s(f, &opp->src[i].pending);
-	}
-
-	return 0;
-}
-
 typedef struct MemReg {
 	const char *name;
 	MemoryRegionOps const *ops;
@@ -1614,73 +1336,7 @@ static int openpic_init(SysBusDevice * dev)
 		map_list(opp, list_fsl, &list_count);
 
 		break;
-
-	case OPENPIC_MODEL_RAVEN:
-		opp->nb_irqs = RAVEN_MAX_EXT;
-		opp->vid = VID_REVISION_1_3;
-		opp->vir = VIR_GENERIC;
-		opp->vector_mask = 0xFF;
-		opp->tfrr_reset = 4160000;
-		opp->ivpr_reset = IVPR_MASK_MASK | IVPR_MODE_MASK;
-		opp->idr_reset = 0;
-		opp->max_irq = RAVEN_MAX_IRQ;
-		opp->irq_ipi0 = RAVEN_IPI_IRQ;
-		opp->irq_tim0 = RAVEN_TMR_IRQ;
-		opp->brr1 = -1;
-		opp->mpic_mode_mask = GCR_MODE_MIXED;
-
-		/* Only UP supported today */
-		if (opp->nb_cpus != 1) {
-			return -EINVAL;
-		}
-
-		map_list(opp, list_le, &list_count);
-		break;
-	}
-
-	for (i = 0; i < opp->nb_cpus; i++) {
-		opp->dst[i].irqs = g_new(qemu_irq, OPENPIC_OUTPUT_NB);
-		for (j = 0; j < OPENPIC_OUTPUT_NB; j++) {
-			sysbus_init_irq(dev, &opp->dst[i].irqs[j]);
-		}
 	}
 
-	register_savevm(&opp->busdev.qdev, "openpic", 0, 2,
-			openpic_save, openpic_load, opp);
-
-	sysbus_init_mmio(dev, &opp->mem);
-	qdev_init_gpio_in(&dev->qdev, openpic_set_irq, opp->max_irq);
-
 	return 0;
 }
-
-static Property openpic_properties[] = {
-	DEFINE_PROP_UINT32("model", OpenPICState, model,
-			   OPENPIC_MODEL_FSL_MPIC_20),
-	DEFINE_PROP_UINT32("nb_cpus", OpenPICState, nb_cpus, 1),
-	DEFINE_PROP_END_OF_LIST(),
-};
-
-static void openpic_class_init(ObjectClass * klass, void *data)
-{
-	DeviceClass *dc = DEVICE_CLASS(klass);
-	SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
-
-	k->init = openpic_init;
-	dc->props = openpic_properties;
-	dc->reset = openpic_reset;
-}
-
-static const TypeInfo openpic_info = {
-	.name = "openpic",
-	.parent = TYPE_SYS_BUS_DEVICE,
-	.instance_size = sizeof(OpenPICState),
-	.class_init = openpic_class_init,
-};
-
-static void openpic_register_types(void)
-{
-	type_register_static(&openpic_info);
-}
-
-type_init(openpic_register_types)
-- 
1.6.0.2



* [PATCH 12/17] kvm/ppc/mpic: adapt to kernel style and environment
@ 2013-04-18 14:11   ` Alexander Graf
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

Remove braces that Linux style doesn't permit, remove space after
'*' that Lindent added, keep error/debug strings contiguous, etc.

Substitute type names, debug prints, etc.
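
As a rough illustration of the kind of conversion described above, here is
a minimal userspace sketch. It is not taken from the patch: the pr_debug()
stand-in and the helper mpic_max_ext() are hypothetical, and only the
struct name, the max_ext field and the register-size expression mirror
what the diff touches.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the kernel's pr_debug(). */
#define pr_debug(...) fprintf(stderr, __VA_ARGS__)

#define MAX_CPU 32

/* Old form: no outer parentheses, so it mis-binds in larger expressions. */
#define CPU_REG_SIZE_OLD 0x100 + ((MAX_CPU - 1) * 0x1000)
/* New form: fully parenthesized. */
#define CPU_REG_SIZE_NEW (0x100 + ((MAX_CPU - 1) * 0x1000))

/* The two forms differ as soon as the macro is used as an operand. */
static_assert(CPU_REG_SIZE_OLD * 2 != CPU_REG_SIZE_NEW * 2,
	      "parenthesization changes the result");

/* Kernel style: a plain struct tag instead of a CamelCase typedef. */
struct fsl_mpic_info {
	int max_ext;
};

/*
 * Hypothetical helper showing the remaining conventions: '*' attached
 * to the name, no braces around a single-statement if, and the format
 * string kept on one line.
 */
static int mpic_max_ext(const struct fsl_mpic_info *info)
{
	if (!info)
		return -1;

	pr_debug("%s: max_ext %d\n", __func__, info->max_ext);
	return info->max_ext;
}
```

With MAX_CPU at 32, CPU_REG_SIZE_OLD * 2 expands to 0x100 + (31 * 0x1000) * 2,
while CPU_REG_SIZE_NEW * 2 doubles the whole sum, which is why the outer
parentheses matter.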

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/mpic.c |  445 ++++++++++++++++++++++-------------------------
 1 files changed, 208 insertions(+), 237 deletions(-)

diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index d6d70a4..1df67ae 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -42,22 +42,22 @@
 #define OPENPIC_TMR_REG_SIZE         0x220
 #define OPENPIC_MSI_REG_START        0x1600
 #define OPENPIC_MSI_REG_SIZE         0x200
-#define OPENPIC_SUMMARY_REG_START   0x3800
-#define OPENPIC_SUMMARY_REG_SIZE    0x800
+#define OPENPIC_SUMMARY_REG_START    0x3800
+#define OPENPIC_SUMMARY_REG_SIZE     0x800
 #define OPENPIC_SRC_REG_START        0x10000
 #define OPENPIC_SRC_REG_SIZE         (MAX_SRC * 0x20)
 #define OPENPIC_CPU_REG_START        0x20000
-#define OPENPIC_CPU_REG_SIZE         0x100 + ((MAX_CPU - 1) * 0x1000)
+#define OPENPIC_CPU_REG_SIZE         (0x100 + ((MAX_CPU - 1) * 0x1000))
 
-typedef struct FslMpicInfo {
+struct fsl_mpic_info {
 	int max_ext;
-} FslMpicInfo;
+};
 
-static FslMpicInfo fsl_mpic_20 = {
+static struct fsl_mpic_info fsl_mpic_20 = {
 	.max_ext = 12,
 };
 
-static FslMpicInfo fsl_mpic_42 = {
+static struct fsl_mpic_info fsl_mpic_42 = {
 	.max_ext = 12,
 };
 
@@ -100,44 +100,43 @@ static int get_current_cpu(void)
 {
 	CPUState *cpu_single_cpu;
 
-	if (!cpu_single_env) {
+	if (!cpu_single_env)
 		return -1;
-	}
 
 	cpu_single_cpu = ENV_GET_CPU(cpu_single_env);
 	return cpu_single_cpu->cpu_index;
 }
 
-static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx);
-static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
+static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx);
+static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
 				       uint32_t val, int idx);
 
-typedef enum IRQType {
+enum irq_type {
 	IRQ_TYPE_NORMAL = 0,
 	IRQ_TYPE_FSLINT,	/* FSL internal interrupt -- level only */
 	IRQ_TYPE_FSLSPECIAL,	/* FSL timer/IPI interrupt, edge, no polarity */
-} IRQType;
+};
 
-typedef struct IRQQueue {
+struct irq_queue {
 	/* Round up to the nearest 64 IRQs so that the queue length
 	 * won't change when moving between 32 and 64 bit hosts.
 	 */
 	unsigned long queue[BITS_TO_LONGS((MAX_IRQ + 63) & ~63)];
 	int next;
 	int priority;
-} IRQQueue;
+};
 
-typedef struct IRQSource {
+struct irq_source {
 	uint32_t ivpr;		/* IRQ vector/priority register */
 	uint32_t idr;		/* IRQ destination register */
 	uint32_t destmask;	/* bitmap of CPU destinations */
 	int last_cpu;
 	int output;		/* IRQ level, e.g. OPENPIC_OUTPUT_INT */
 	int pending;		/* TRUE if IRQ is pending */
-	IRQType type;
+	enum irq_type type;
 	bool level:1;		/* level-triggered */
-	bool nomask:1;		/* critical interrupts ignore mask on some FSL MPICs */
-} IRQSource;
+	bool nomask:1;	/* critical interrupts ignore mask on some FSL MPICs */
+};
 
 #define IVPR_MASK_SHIFT       31
 #define IVPR_MASK_MASK        (1 << IVPR_MASK_SHIFT)
@@ -158,22 +157,19 @@ typedef struct IRQSource {
 #define IDR_EP      0x80000000	/* external pin */
 #define IDR_CI      0x40000000	/* critical interrupt */
 
-typedef struct IRQDest {
+struct irq_dest {
 	int32_t ctpr;		/* CPU current task priority */
-	IRQQueue raised;
-	IRQQueue servicing;
+	struct irq_queue raised;
+	struct irq_queue servicing;
 	qemu_irq *irqs;
 
 	/* Count of IRQ sources asserting on non-INT outputs */
 	uint32_t outputs_active[OPENPIC_OUTPUT_NB];
-} IRQDest;
-
-typedef struct OpenPICState {
-	SysBusDevice busdev;
-	MemoryRegion mem;
+};
 
+struct openpic {
 	/* Behavior control */
-	FslMpicInfo *fsl;
+	struct fsl_mpic_info *fsl;
 	uint32_t model;
 	uint32_t flags;
 	uint32_t nb_irqs;
@@ -186,9 +182,6 @@ typedef struct OpenPICState {
 	uint32_t brr1;
 	uint32_t mpic_mode_mask;
 
-	/* Sub-regions */
-	MemoryRegion sub_io_mem[6];
-
 	/* Global registers */
 	uint32_t frr;		/* Feature reporting register */
 	uint32_t gcr;		/* Global configuration register  */
@@ -196,9 +189,9 @@ typedef struct OpenPICState {
 	uint32_t spve;		/* Spurious vector register */
 	uint32_t tfrr;		/* Timer frequency reporting register */
 	/* Source registers */
-	IRQSource src[MAX_IRQ];
+	struct irq_source src[MAX_IRQ];
 	/* Local registers per output pin */
-	IRQDest dst[MAX_CPU];
+	struct irq_dest dst[MAX_CPU];
 	uint32_t nb_cpus;
 	/* Timer registers */
 	struct {
@@ -213,24 +206,24 @@ typedef struct OpenPICState {
 	uint32_t irq_ipi0;
 	uint32_t irq_tim0;
 	uint32_t irq_msi;
-} OpenPICState;
+};
 
-static inline void IRQ_setbit(IRQQueue * q, int n_IRQ)
+static inline void IRQ_setbit(struct irq_queue *q, int n_IRQ)
 {
 	set_bit(n_IRQ, q->queue);
 }
 
-static inline void IRQ_resetbit(IRQQueue * q, int n_IRQ)
+static inline void IRQ_resetbit(struct irq_queue *q, int n_IRQ)
 {
 	clear_bit(n_IRQ, q->queue);
 }
 
-static inline int IRQ_testbit(IRQQueue * q, int n_IRQ)
+static inline int IRQ_testbit(struct irq_queue *q, int n_IRQ)
 {
 	return test_bit(n_IRQ, q->queue);
 }
 
-static void IRQ_check(OpenPICState * opp, IRQQueue * q)
+static void IRQ_check(struct openpic *opp, struct irq_queue *q)
 {
 	int irq = -1;
 	int next = -1;
@@ -238,11 +231,10 @@ static void IRQ_check(OpenPICState * opp, IRQQueue * q)
 
 	for (;;) {
 		irq = find_next_bit(q->queue, opp->max_irq, irq + 1);
-		if (irq == opp->max_irq) {
+		if (irq == opp->max_irq)
 			break;
-		}
 
-		DPRINTF("IRQ_check: irq %d set ivpr_pr=%d pr=%d\n",
+		pr_debug("IRQ_check: irq %d set ivpr_pr=%d pr=%d\n",
 			irq, IVPR_PRIORITY(opp->src[irq].ivpr), priority);
 
 		if (IVPR_PRIORITY(opp->src[irq].ivpr) > priority) {
@@ -255,7 +247,7 @@ static void IRQ_check(OpenPICState * opp, IRQQueue * q)
 	q->priority = priority;
 }
 
-static int IRQ_get_next(OpenPICState * opp, IRQQueue * q)
+static int IRQ_get_next(struct openpic *opp, struct irq_queue *q)
 {
 	/* XXX: optimize */
 	IRQ_check(opp, q);
@@ -263,21 +255,21 @@ static int IRQ_get_next(OpenPICState * opp, IRQQueue * q)
 	return q->next;
 }
 
-static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
+static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 			   bool active, bool was_active)
 {
-	IRQDest *dst;
-	IRQSource *src;
+	struct irq_dest *dst;
+	struct irq_source *src;
 	int priority;
 
 	dst = &opp->dst[n_CPU];
 	src = &opp->src[n_IRQ];
 
-	DPRINTF("%s: IRQ %d active %d was %d\n",
+	pr_debug("%s: IRQ %d active %d was %d\n",
 		__func__, n_IRQ, active, was_active);
 
 	if (src->output != OPENPIC_OUTPUT_INT) {
-		DPRINTF("%s: output %d irq %d active %d was %d count %d\n",
+		pr_debug("%s: output %d irq %d active %d was %d count %d\n",
 			__func__, src->output, n_IRQ, active, was_active,
 			dst->outputs_active[src->output]);
 
@@ -286,19 +278,17 @@ static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
 		 * masking.
 		 */
 		if (active) {
-			if (!was_active
-			    && dst->outputs_active[src->output]++ == 0) {
-				DPRINTF
-				    ("%s: Raise OpenPIC output %d cpu %d irq %d\n",
-				     __func__, src->output, n_CPU, n_IRQ);
+			if (!was_active &&
+			    dst->outputs_active[src->output]++ == 0) {
+				pr_debug("%s: Raise OpenPIC output %d cpu %d irq %d\n",
+					__func__, src->output, n_CPU, n_IRQ);
 				qemu_irq_raise(dst->irqs[src->output]);
 			}
 		} else {
-			if (was_active
-			    && --dst->outputs_active[src->output] == 0) {
-				DPRINTF
-				    ("%s: Lower OpenPIC output %d cpu %d irq %d\n",
-				     __func__, src->output, n_CPU, n_IRQ);
+			if (was_active &&
+			    --dst->outputs_active[src->output] == 0) {
+				pr_debug("%s: Lower OpenPIC output %d cpu %d irq %d\n",
+					__func__, src->output, n_CPU, n_IRQ);
 				qemu_irq_lower(dst->irqs[src->output]);
 			}
 		}
@@ -311,31 +301,27 @@ static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
 	/* Even if the interrupt doesn't have enough priority,
 	 * it is still raised, in case ctpr is lowered later.
 	 */
-	if (active) {
+	if (active)
 		IRQ_setbit(&dst->raised, n_IRQ);
-	} else {
+	else
 		IRQ_resetbit(&dst->raised, n_IRQ);
-	}
 
 	IRQ_check(opp, &dst->raised);
 
 	if (active && priority <= dst->ctpr) {
-		DPRINTF
-		    ("%s: IRQ %d priority %d too low for ctpr %d on CPU %d\n",
-		     __func__, n_IRQ, priority, dst->ctpr, n_CPU);
+		pr_debug("%s: IRQ %d priority %d too low for ctpr %d on CPU %d\n",
+			__func__, n_IRQ, priority, dst->ctpr, n_CPU);
 		active = 0;
 	}
 
 	if (active) {
 		if (IRQ_get_next(opp, &dst->servicing) >= 0 &&
 		    priority <= dst->servicing.priority) {
-			DPRINTF
-			    ("%s: IRQ %d is hidden by servicing IRQ %d on CPU %d\n",
-			     __func__, n_IRQ, dst->servicing.next, n_CPU);
+			pr_debug("%s: IRQ %d is hidden by servicing IRQ %d on CPU %d\n",
+				__func__, n_IRQ, dst->servicing.next, n_CPU);
 		} else {
-			DPRINTF
-			    ("%s: Raise OpenPIC INT output cpu %d irq %d/%d\n",
-			     __func__, n_CPU, n_IRQ, dst->raised.next);
+			pr_debug("%s: Raise OpenPIC INT output cpu %d irq %d/%d\n",
+				__func__, n_CPU, n_IRQ, dst->raised.next);
 			qemu_irq_raise(opp->dst[n_CPU].
 				       irqs[OPENPIC_OUTPUT_INT]);
 		}
@@ -343,17 +329,15 @@ static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
 		IRQ_get_next(opp, &dst->servicing);
 		if (dst->raised.priority > dst->ctpr &&
 		    dst->raised.priority > dst->servicing.priority) {
-			DPRINTF
-			    ("%s: IRQ %d inactive, IRQ %d prio %d above %d/%d, CPU %d\n",
-			     __func__, n_IRQ, dst->raised.next,
-			     dst->raised.priority, dst->ctpr,
-			     dst->servicing.priority, n_CPU);
+			pr_debug("%s: IRQ %d inactive, IRQ %d prio %d above %d/%d, CPU %d\n",
+				__func__, n_IRQ, dst->raised.next,
+				dst->raised.priority, dst->ctpr,
+				dst->servicing.priority, n_CPU);
 			/* IRQ line stays asserted */
 		} else {
-			DPRINTF
-			    ("%s: IRQ %d inactive, current prio %d/%d, CPU %d\n",
-			     __func__, n_IRQ, dst->ctpr,
-			     dst->servicing.priority, n_CPU);
+			pr_debug("%s: IRQ %d inactive, current prio %d/%d, CPU %d\n",
+				__func__, n_IRQ, dst->ctpr,
+				dst->servicing.priority, n_CPU);
 			qemu_irq_lower(opp->dst[n_CPU].
 				       irqs[OPENPIC_OUTPUT_INT]);
 		}
@@ -361,9 +345,9 @@ static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
 }
 
 /* update pic state because registers for n_IRQ have changed value */
-static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
+static void openpic_update_irq(struct openpic *opp, int n_IRQ)
 {
-	IRQSource *src;
+	struct irq_source *src;
 	bool active, was_active;
 	int i;
 
@@ -372,30 +356,29 @@ static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
 
 	if ((src->ivpr & IVPR_MASK_MASK) && !src->nomask) {
 		/* Interrupt source is disabled */
-		DPRINTF("%s: IRQ %d is disabled\n", __func__, n_IRQ);
+		pr_debug("%s: IRQ %d is disabled\n", __func__, n_IRQ);
 		active = false;
 	}
 
-	was_active = ! !(src->ivpr & IVPR_ACTIVITY_MASK);
+	was_active = !!(src->ivpr & IVPR_ACTIVITY_MASK);
 
 	/*
 	 * We don't have a similar check for already-active because
 	 * ctpr may have changed and we need to withdraw the interrupt.
 	 */
 	if (!active && !was_active) {
-		DPRINTF("%s: IRQ %d is already inactive\n", __func__, n_IRQ);
+		pr_debug("%s: IRQ %d is already inactive\n", __func__, n_IRQ);
 		return;
 	}
 
-	if (active) {
+	if (active)
 		src->ivpr |= IVPR_ACTIVITY_MASK;
-	} else {
+	else
 		src->ivpr &= ~IVPR_ACTIVITY_MASK;
-	}
 
 	if (src->destmask == 0) {
 		/* No target */
-		DPRINTF("%s: IRQ %d has no target\n", __func__, n_IRQ);
+		pr_debug("%s: IRQ %d has no target\n", __func__, n_IRQ);
 		return;
 	}
 
@@ -413,9 +396,9 @@ static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
 	} else {
 		/* Distributed delivery mode */
 		for (i = src->last_cpu + 1; i != src->last_cpu; i++) {
-			if (i == opp->nb_cpus) {
+			if (i == opp->nb_cpus)
 				i = 0;
-			}
+
 			if (src->destmask & (1 << i)) {
 				IRQ_local_pipe(opp, i, n_IRQ, active,
 					       was_active);
@@ -428,16 +411,16 @@ static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
 
 static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 {
-	OpenPICState *opp = opaque;
-	IRQSource *src;
+	struct openpic *opp = opaque;
+	struct irq_source *src;
 
 	if (n_IRQ >= MAX_IRQ) {
-		fprintf(stderr, "%s: IRQ %d out of range\n", __func__, n_IRQ);
+		pr_err("%s: IRQ %d out of range\n", __func__, n_IRQ);
 		abort();
 	}
 
 	src = &opp->src[n_IRQ];
-	DPRINTF("openpic: set irq %d = %d ivpr=0x%08x\n",
+	pr_debug("openpic: set irq %d = %d ivpr=0x%08x\n",
 		n_IRQ, level, src->ivpr);
 	if (src->level) {
 		/* level-sensitive irq */
@@ -463,9 +446,9 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 	}
 }
 
-static void openpic_reset(DeviceState * d)
+static void openpic_reset(DeviceState *d)
 {
-	OpenPICState *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d));
+	struct openpic *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d));
 	int i;
 
 	opp->gcr = GCR_RESET;
@@ -485,7 +468,7 @@ static void openpic_reset(DeviceState * d)
 		switch (opp->src[i].type) {
 		case IRQ_TYPE_NORMAL:
 			opp->src[i].level =
-			    ! !(opp->ivpr_reset & IVPR_SENSE_MASK);
+			    !!(opp->ivpr_reset & IVPR_SENSE_MASK);
 			break;
 
 		case IRQ_TYPE_FSLINT:
@@ -499,9 +482,9 @@ static void openpic_reset(DeviceState * d)
 	/* Initialise IRQ destinations */
 	for (i = 0; i < MAX_CPU; i++) {
 		opp->dst[i].ctpr = 15;
-		memset(&opp->dst[i].raised, 0, sizeof(IRQQueue));
+		memset(&opp->dst[i].raised, 0, sizeof(struct irq_queue));
 		opp->dst[i].raised.next = -1;
-		memset(&opp->dst[i].servicing, 0, sizeof(IRQQueue));
+		memset(&opp->dst[i].servicing, 0, sizeof(struct irq_queue));
 		opp->dst[i].servicing.next = -1;
 	}
 	/* Initialise timers */
@@ -513,28 +496,28 @@ static void openpic_reset(DeviceState * d)
 	opp->gcr = 0;
 }
 
-static inline uint32_t read_IRQreg_idr(OpenPICState * opp, int n_IRQ)
+static inline uint32_t read_IRQreg_idr(struct openpic *opp, int n_IRQ)
 {
 	return opp->src[n_IRQ].idr;
 }
 
-static inline uint32_t read_IRQreg_ilr(OpenPICState * opp, int n_IRQ)
+static inline uint32_t read_IRQreg_ilr(struct openpic *opp, int n_IRQ)
 {
-	if (opp->flags & OPENPIC_FLAG_ILR) {
+	if (opp->flags & OPENPIC_FLAG_ILR)
 		return output_to_inttgt(opp->src[n_IRQ].output);
-	}
 
 	return 0xffffffff;
 }
 
-static inline uint32_t read_IRQreg_ivpr(OpenPICState * opp, int n_IRQ)
+static inline uint32_t read_IRQreg_ivpr(struct openpic *opp, int n_IRQ)
 {
 	return opp->src[n_IRQ].ivpr;
 }
 
-static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
+static inline void write_IRQreg_idr(struct openpic *opp, int n_IRQ,
+				    uint32_t val)
 {
-	IRQSource *src = &opp->src[n_IRQ];
+	struct irq_source *src = &opp->src[n_IRQ];
 	uint32_t normal_mask = (1UL << opp->nb_cpus) - 1;
 	uint32_t crit_mask = 0;
 	uint32_t mask = normal_mask;
@@ -547,14 +530,13 @@ static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
 	}
 
 	src->idr = val & mask;
-	DPRINTF("Set IDR %d to 0x%08x\n", n_IRQ, src->idr);
+	pr_debug("Set IDR %d to 0x%08x\n", n_IRQ, src->idr);
 
 	if (opp->flags & OPENPIC_FLAG_IDR_CRIT) {
 		if (src->idr & crit_mask) {
 			if (src->idr & normal_mask) {
-				DPRINTF
-				    ("%s: IRQ configured for multiple output types, using "
-				     "critical\n", __func__);
+				pr_debug("%s: IRQ configured for multiple output types, using critical\n",
+					__func__);
 			}
 
 			src->output = OPENPIC_OUTPUT_CINT;
@@ -564,9 +546,8 @@ static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
 			for (i = 0; i < opp->nb_cpus; i++) {
 				int n_ci = IDR_CI0_SHIFT - i;
 
-				if (src->idr & (1UL << n_ci)) {
+				if (src->idr & (1UL << n_ci))
 					src->destmask |= 1UL << i;
-				}
 			}
 		} else {
 			src->output = OPENPIC_OUTPUT_INT;
@@ -578,20 +559,21 @@ static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
 	}
 }
 
-static inline void write_IRQreg_ilr(OpenPICState * opp, int n_IRQ, uint32_t val)
+static inline void write_IRQreg_ilr(struct openpic *opp, int n_IRQ,
+				    uint32_t val)
 {
 	if (opp->flags & OPENPIC_FLAG_ILR) {
-		IRQSource *src = &opp->src[n_IRQ];
+		struct irq_source *src = &opp->src[n_IRQ];
 
 		src->output = inttgt_to_output(val & ILR_INTTGT_MASK);
-		DPRINTF("Set ILR %d to 0x%08x, output %d\n", n_IRQ, src->idr,
+		pr_debug("Set ILR %d to 0x%08x, output %d\n", n_IRQ, src->idr,
 			src->output);
 
 		/* TODO: on MPIC v4.0 only, set nomask for non-INT */
 	}
 }
 
-static inline void write_IRQreg_ivpr(OpenPICState * opp, int n_IRQ,
+static inline void write_IRQreg_ivpr(struct openpic *opp, int n_IRQ,
 				     uint32_t val)
 {
 	uint32_t mask;
@@ -613,7 +595,7 @@ static inline void write_IRQreg_ivpr(OpenPICState * opp, int n_IRQ,
 	switch (opp->src[n_IRQ].type) {
 	case IRQ_TYPE_NORMAL:
 		opp->src[n_IRQ].level =
-		    ! !(opp->src[n_IRQ].ivpr & IVPR_SENSE_MASK);
+		    !!(opp->src[n_IRQ].ivpr & IVPR_SENSE_MASK);
 		break;
 
 	case IRQ_TYPE_FSLINT:
@@ -626,11 +608,11 @@ static inline void write_IRQreg_ivpr(OpenPICState * opp, int n_IRQ,
 	}
 
 	openpic_update_irq(opp, n_IRQ);
-	DPRINTF("Set IVPR %d to 0x%08x -> 0x%08x\n", n_IRQ, val,
+	pr_debug("Set IVPR %d to 0x%08x -> 0x%08x\n", n_IRQ, val,
 		opp->src[n_IRQ].ivpr);
 }
 
-static void openpic_gcr_write(OpenPICState * opp, uint64_t val)
+static void openpic_gcr_write(struct openpic *opp, uint64_t val)
 {
 	bool mpic_proxy = false;
 
@@ -643,27 +625,26 @@ static void openpic_gcr_write(OpenPICState * opp, uint64_t val)
 	opp->gcr |= val & opp->mpic_mode_mask;
 
 	/* Set external proxy mode */
-	if ((val & opp->mpic_mode_mask) == GCR_MODE_PROXY) {
+	if ((val & opp->mpic_mode_mask) == GCR_MODE_PROXY)
 		mpic_proxy = true;
-	}
 
 	ppce500_set_mpic_proxy(mpic_proxy);
 }
 
-static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned len)
 {
-	OpenPICState *opp = opaque;
-	IRQDest *dst;
+	struct openpic *opp = opaque;
+	struct irq_dest *dst;
 	int idx;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
 		__func__, addr, val);
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return;
-	}
+
 	switch (addr) {
-	case 0x00:		/* Block Revision Register1 (BRR1) is Readonly */
+	case 0x00:	/* Block Revision Register1 (BRR1) is Readonly */
 		break;
 	case 0x40:
 	case 0x50:
@@ -685,16 +666,14 @@ static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
 	case 0x1090:		/* PIR */
 		for (idx = 0; idx < opp->nb_cpus; idx++) {
 			if ((val & (1 << idx)) && !(opp->pir & (1 << idx))) {
-				DPRINTF
-				    ("Raise OpenPIC RESET output for CPU %d\n",
-				     idx);
+				pr_debug("Raise OpenPIC RESET output for CPU %d\n",
+					idx);
 				dst = &opp->dst[idx];
 				qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_RESET]);
-			} else if (!(val & (1 << idx))
-				   && (opp->pir & (1 << idx))) {
-				DPRINTF
-				    ("Lower OpenPIC RESET output for CPU %d\n",
-				     idx);
+			} else if (!(val & (1 << idx)) &&
+				   (opp->pir & (1 << idx))) {
+				pr_debug("Lower OpenPIC RESET output for CPU %d\n",
+					idx);
 				dst = &opp->dst[idx];
 				qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_RESET]);
 			}
@@ -704,13 +683,12 @@ static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
 	case 0x10A0:		/* IPI_IVPR */
 	case 0x10B0:
 	case 0x10C0:
-	case 0x10D0:
-		{
-			int idx;
-			idx = (addr - 0x10A0) >> 4;
-			write_IRQreg_ivpr(opp, opp->irq_ipi0 + idx, val);
-		}
+	case 0x10D0: {
+		int idx;
+		idx = (addr - 0x10A0) >> 4;
+		write_IRQreg_ivpr(opp, opp->irq_ipi0 + idx, val);
 		break;
+	}
 	case 0x10E0:		/* SPVE */
 		opp->spve = val & opp->vector_mask;
 		break;
@@ -719,16 +697,16 @@ static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
 	}
 }
 
-static uint64_t openpic_gbl_read(void *opaque, hwaddr addr, unsigned len)
+static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	uint32_t retval;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
 	retval = 0xFFFFFFFF;
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return retval;
-	}
+
 	switch (addr) {
 	case 0x1000:		/* FRR */
 		retval = opp->frr;
@@ -772,24 +750,23 @@ static uint64_t openpic_gbl_read(void *opaque, hwaddr addr, unsigned len)
 	default:
 		break;
 	}
-	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
 	return retval;
 }
 
-static void openpic_tmr_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_tmr_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	int idx;
 
 	addr += 0x10f0;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
 		__func__, addr, val);
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return;
-	}
 
 	if (addr == 0x10f0) {
 		/* TFRR */
@@ -806,9 +783,9 @@ static void openpic_tmr_write(void *opaque, hwaddr addr, uint64_t val,
 	case 0x10:		/* TBCR */
 		if ((opp->timers[idx].tccr & TCCR_TOG) != 0 &&
 		    (val & TBCR_CI) == 0 &&
-		    (opp->timers[idx].tbcr & TBCR_CI) != 0) {
+		    (opp->timers[idx].tbcr & TBCR_CI) != 0)
 			opp->timers[idx].tccr &= ~TCCR_TOG;
-		}
+
 		opp->timers[idx].tbcr = val;
 		break;
 	case 0x20:		/* TVPR */
@@ -820,16 +797,16 @@ static void openpic_tmr_write(void *opaque, hwaddr addr, uint64_t val,
 	}
 }
 
-static uint64_t openpic_tmr_read(void *opaque, hwaddr addr, unsigned len)
+static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	uint32_t retval = -1;
 	int idx;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
-	if (addr & 0xF) {
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	if (addr & 0xF)
 		goto out;
-	}
+
 	idx = (addr >> 6) & 0x3;
 	if (addr == 0x0) {
 		/* TFRR */
@@ -852,18 +829,18 @@ static uint64_t openpic_tmr_read(void *opaque, hwaddr addr, unsigned len)
 	}
 
 out:
-	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
 	return retval;
 }
 
-static void openpic_src_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_src_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	int idx;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
 		__func__, addr, val);
 
 	addr = addr & 0xffff;
@@ -884,11 +861,11 @@ static void openpic_src_write(void *opaque, hwaddr addr, uint64_t val,
 
 static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	uint32_t retval;
 	int idx;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
 	retval = 0xFFFFFFFF;
 
 	addr = addr & 0xffff;
@@ -906,22 +883,21 @@ static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
 		break;
 	}
 
-	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
 	return retval;
 }
 
-static void openpic_msi_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_msi_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned size)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	int idx = opp->irq_msi;
 	int srs, ibs;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
 		__func__, addr, val);
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return;
-	}
 
 	switch (addr) {
 	case MSIIR_OFFSET:
@@ -937,16 +913,15 @@ static void openpic_msi_write(void *opaque, hwaddr addr, uint64_t val,
 	}
 }
 
-static uint64_t openpic_msi_read(void *opaque, hwaddr addr, unsigned size)
+static uint64_t openpic_msi_read(void *opaque, gpa_t addr, unsigned size)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	uint64_t r = 0;
 	int i, srs;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
-	if (addr & 0xF) {
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	if (addr & 0xF)
 		return -1;
-	}
 
 	srs = addr >> 4;
 
@@ -965,53 +940,51 @@ static uint64_t openpic_msi_read(void *opaque, hwaddr addr, unsigned size)
 		openpic_set_irq(opp, opp->irq_msi + srs, 0);
 		break;
 	case 0x120:		/* MSISR */
-		for (i = 0; i < MAX_MSI; i++) {
+		for (i = 0; i < MAX_MSI; i++)
 			r |= (opp->msi[i].msir ? 1 : 0) << i;
-		}
 		break;
 	}
 
 	return r;
 }
 
-static uint64_t openpic_summary_read(void *opaque, hwaddr addr, unsigned size)
+static uint64_t openpic_summary_read(void *opaque, gpa_t addr, unsigned size)
 {
 	uint64_t r = 0;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
 
 	/* TODO: EISR/EIMR */
 
 	return r;
 }
 
-static void openpic_summary_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_summary_write(void *opaque, gpa_t addr, uint64_t val,
 				  unsigned size)
 {
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
 		__func__, addr, val);
 
 	/* TODO: EISR/EIMR */
 }
 
-static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
+static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
 				       uint32_t val, int idx)
 {
-	OpenPICState *opp = opaque;
-	IRQSource *src;
-	IRQDest *dst;
+	struct openpic *opp = opaque;
+	struct irq_source *src;
+	struct irq_dest *dst;
 	int s_IRQ, n_IRQ;
 
-	DPRINTF("%s: cpu %d addr %#" HWADDR_PRIx " <= 0x%08x\n", __func__, idx,
+	pr_debug("%s: cpu %d addr %#" HWADDR_PRIx " <= 0x%08x\n", __func__, idx,
 		addr, val);
 
-	if (idx < 0) {
+	if (idx < 0)
 		return;
-	}
 
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return;
-	}
+
 	dst = &opp->dst[idx];
 	addr &= 0xFF0;
 	switch (addr) {
@@ -1028,17 +1001,16 @@ static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
 	case 0x80:		/* CTPR */
 		dst->ctpr = val & 0x0000000F;
 
-		DPRINTF("%s: set CPU %d ctpr to %d, raised %d servicing %d\n",
+		pr_debug("%s: set CPU %d ctpr to %d, raised %d servicing %d\n",
 			__func__, idx, dst->ctpr, dst->raised.priority,
 			dst->servicing.priority);
 
 		if (dst->raised.priority <= dst->ctpr) {
-			DPRINTF
-			    ("%s: Lower OpenPIC INT output cpu %d due to ctpr\n",
-			     __func__, idx);
+			pr_debug("%s: Lower OpenPIC INT output cpu %d due to ctpr\n",
+				__func__, idx);
 			qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
 		} else if (dst->raised.priority > dst->servicing.priority) {
-			DPRINTF("%s: Raise OpenPIC INT output cpu %d irq %d\n",
+			pr_debug("%s: Raise OpenPIC INT output cpu %d irq %d\n",
 				__func__, idx, dst->raised.next);
 			qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_INT]);
 		}
@@ -1051,11 +1023,11 @@ static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
 		/* Read-only register */
 		break;
 	case 0xB0:		/* EOI */
-		DPRINTF("EOI\n");
+		pr_debug("EOI\n");
 		s_IRQ = IRQ_get_next(opp, &dst->servicing);
 
 		if (s_IRQ < 0) {
-			DPRINTF("%s: EOI with no interrupt in service\n",
+			pr_debug("%s: EOI with no interrupt in service\n",
 				__func__);
 			break;
 		}
@@ -1069,7 +1041,7 @@ static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
 		if (n_IRQ != -1 &&
 		    (s_IRQ == -1 ||
 		     IVPR_PRIORITY(src->ivpr) > dst->servicing.priority)) {
-			DPRINTF("Raise OpenPIC INT output cpu %d irq %d\n",
+			pr_debug("Raise OpenPIC INT output cpu %d irq %d\n",
 				idx, n_IRQ);
 			qemu_irq_raise(opp->dst[idx].irqs[OPENPIC_OUTPUT_INT]);
 		}
@@ -1079,32 +1051,32 @@ static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
 	}
 }
 
-static void openpic_cpu_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_cpu_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned len)
 {
 	openpic_cpu_write_internal(opaque, addr, val, (addr & 0x1f000) >> 12);
 }
 
-static uint32_t openpic_iack(OpenPICState * opp, IRQDest * dst, int cpu)
+static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
+			     int cpu)
 {
-	IRQSource *src;
+	struct irq_source *src;
 	int retval, irq;
 
-	DPRINTF("Lower OpenPIC INT output\n");
+	pr_debug("Lower OpenPIC INT output\n");
 	qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
 
 	irq = IRQ_get_next(opp, &dst->raised);
-	DPRINTF("IACK: irq=%d\n", irq);
+	pr_debug("IACK: irq=%d\n", irq);
 
-	if (irq == -1) {
+	if (irq == -1)
 		/* No more interrupt pending */
 		return opp->spve;
-	}
 
 	src = &opp->src[irq];
 	if (!(src->ivpr & IVPR_ACTIVITY_MASK) ||
 	    !(IVPR_PRIORITY(src->ivpr) > dst->ctpr)) {
-		fprintf(stderr, "%s: bad raised IRQ %d ctpr %d ivpr 0x%08x\n",
+		pr_err("%s: bad raised IRQ %d ctpr %d ivpr 0x%08x\n",
 			__func__, irq, dst->ctpr, src->ivpr);
 		openpic_update_irq(opp, irq);
 		retval = opp->spve;
@@ -1135,22 +1107,21 @@ static uint32_t openpic_iack(OpenPICState * opp, IRQDest * dst, int cpu)
 	return retval;
 }
 
-static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx)
+static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx)
 {
-	OpenPICState *opp = opaque;
-	IRQDest *dst;
+	struct openpic *opp = opaque;
+	struct irq_dest *dst;
 	uint32_t retval;
 
-	DPRINTF("%s: cpu %d addr %#" HWADDR_PRIx "\n", __func__, idx, addr);
+	pr_debug("%s: cpu %d addr %#" HWADDR_PRIx "\n", __func__, idx, addr);
 	retval = 0xFFFFFFFF;
 
-	if (idx < 0) {
+	if (idx < 0)
 		return retval;
-	}
 
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return retval;
-	}
+
 	dst = &opp->dst[idx];
 	addr &= 0xFF0;
 	switch (addr) {
@@ -1169,54 +1140,54 @@ static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx)
 	default:
 		break;
 	}
-	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
 	return retval;
 }
 
-static uint64_t openpic_cpu_read(void *opaque, hwaddr addr, unsigned len)
+static uint64_t openpic_cpu_read(void *opaque, gpa_t addr, unsigned len)
 {
 	return openpic_cpu_read_internal(opaque, addr, (addr & 0x1f000) >> 12);
 }
 
-static const MemoryRegionOps openpic_glb_ops_be = {
+static const struct kvm_io_device_ops openpic_glb_ops_be = {
 	.write = openpic_gbl_write,
 	.read = openpic_gbl_read,
 };
 
-static const MemoryRegionOps openpic_tmr_ops_be = {
+static const struct kvm_io_device_ops openpic_tmr_ops_be = {
 	.write = openpic_tmr_write,
 	.read = openpic_tmr_read,
 };
 
-static const MemoryRegionOps openpic_cpu_ops_be = {
+static const struct kvm_io_device_ops openpic_cpu_ops_be = {
 	.write = openpic_cpu_write,
 	.read = openpic_cpu_read,
 };
 
-static const MemoryRegionOps openpic_src_ops_be = {
+static const struct kvm_io_device_ops openpic_src_ops_be = {
 	.write = openpic_src_write,
 	.read = openpic_src_read,
 };
 
-static const MemoryRegionOps openpic_msi_ops_be = {
+static const struct kvm_io_device_ops openpic_msi_ops_be = {
 	.read = openpic_msi_read,
 	.write = openpic_msi_write,
 };
 
-static const MemoryRegionOps openpic_summary_ops_be = {
+static const struct kvm_io_device_ops openpic_summary_ops_be = {
 	.read = openpic_summary_read,
 	.write = openpic_summary_write,
 };
 
-typedef struct MemReg {
+struct mem_reg {
 	const char *name;
-	MemoryRegionOps const *ops;
-	hwaddr start_addr;
-	ram_addr_t size;
-} MemReg;
+	const struct kvm_io_device_ops *ops;
+	gpa_t start_addr;
+	int size;
+};
 
-static void fsl_common_init(OpenPICState * opp)
+static void fsl_common_init(struct openpic *opp)
 {
 	int i;
 	int virq = MAX_SRC;
@@ -1239,9 +1210,8 @@ static void fsl_common_init(OpenPICState * opp)
 	opp->irq_msi = 224;
 
 	msi_supported = true;
-	for (i = 0; i < opp->fsl->max_ext; i++) {
+	for (i = 0; i < opp->fsl->max_ext; i++)
 		opp->src[i].level = false;
-	}
 
 	/* Internal interrupts, including message and MSI */
 	for (i = 16; i < MAX_SRC; i++) {
@@ -1256,7 +1226,8 @@ static void fsl_common_init(OpenPICState * opp)
 	}
 }
 
-static void map_list(OpenPICState * opp, const MemReg * list, int *count)
+static void map_list(struct openpic *opp, const struct mem_reg *list,
+		     int *count)
 {
 	while (list->name) {
 		assert(*count < ARRAY_SIZE(opp->sub_io_mem));
@@ -1272,12 +1243,12 @@ static void map_list(OpenPICState * opp, const MemReg * list, int *count)
 	}
 }
 
-static int openpic_init(SysBusDevice * dev)
+static int openpic_init(SysBusDevice *dev)
 {
-	OpenPICState *opp = FROM_SYSBUS(typeof(*opp), dev);
+	struct openpic *opp = FROM_SYSBUS(typeof(*opp), dev);
 	int i, j;
 	int list_count = 0;
-	static const MemReg list_le[] = {
+	static const struct mem_reg list_le[] = {
 		{"glb", &openpic_glb_ops_le,
 		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
 		{"tmr", &openpic_tmr_ops_le,
@@ -1288,7 +1259,7 @@ static int openpic_init(SysBusDevice * dev)
 		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
 		{NULL}
 	};
-	static const MemReg list_be[] = {
+	static const struct mem_reg list_be[] = {
 		{"glb", &openpic_glb_ops_be,
 		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
 		{"tmr", &openpic_tmr_ops_be,
@@ -1299,7 +1270,7 @@ static int openpic_init(SysBusDevice * dev)
 		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
 		{NULL}
 	};
-	static const MemReg list_fsl[] = {
+	static const struct mem_reg list_fsl[] = {
 		{"msi", &openpic_msi_ops_be,
 		 OPENPIC_MSI_REG_START, OPENPIC_MSI_REG_SIZE},
 		{"summary", &openpic_summary_ops_be,
-- 
1.6.0.2



* [PATCH 12/17] kvm/ppc/mpic: adapt to kernel style and environment
@ 2013-04-18 14:11   ` Alexander Graf
  0 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

Remove braces that Linux style doesn't permit, remove space after
'*' that Lindent added, keep error/debug strings contiguous, etc.

Substitute type names, debug prints, etc.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/mpic.c |  445 ++++++++++++++++++++++-------------------------
 1 files changed, 208 insertions(+), 237 deletions(-)
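
For readers skimming the diff, here is a minimal userspace sketch of the
three conversions the description names: no braces around a single-statement
body, no space after '*' in pointer declarations, and pr_debug() in place of
DPRINTF with the format string kept on one line. The names irq_queue_demo and
demo_next_if_aligned are hypothetical and do not appear in mpic.c, and the
pr_debug macro here is only a stand-in (an assumption) for the kernel's own.

```c
#include <stdio.h>

/* Stand-in for the kernel's pr_debug(); an assumption for userspace use. */
#define pr_debug(...) fprintf(stderr, __VA_ARGS__)

/* Hypothetical type: a plain struct tag instead of a CamelCase typedef. */
struct irq_queue_demo {
	int next;
};

/* '*' binds to the name ("*q"), not to the type ("* q"). */
static int demo_next_if_aligned(const struct irq_queue_demo *q,
				unsigned long addr)
{
	/* Debug string stays contiguous so it can be grepped. */
	pr_debug("%s: addr %#lx next %d\n", __func__, addr, q->next);

	/* Single-statement body: no braces. */
	if (addr & 0xF)
		return -1;

	return q->next;
}
```

The helper mirrors the "if (addr & 0xF)" alignment checks that recur in the
register accessors below; it only illustrates style, not MPIC behaviour.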

diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index d6d70a4..1df67ae 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -42,22 +42,22 @@
 #define OPENPIC_TMR_REG_SIZE         0x220
 #define OPENPIC_MSI_REG_START        0x1600
 #define OPENPIC_MSI_REG_SIZE         0x200
-#define OPENPIC_SUMMARY_REG_START   0x3800
-#define OPENPIC_SUMMARY_REG_SIZE    0x800
+#define OPENPIC_SUMMARY_REG_START    0x3800
+#define OPENPIC_SUMMARY_REG_SIZE     0x800
 #define OPENPIC_SRC_REG_START        0x10000
 #define OPENPIC_SRC_REG_SIZE         (MAX_SRC * 0x20)
 #define OPENPIC_CPU_REG_START        0x20000
-#define OPENPIC_CPU_REG_SIZE         0x100 + ((MAX_CPU - 1) * 0x1000)
+#define OPENPIC_CPU_REG_SIZE         (0x100 + ((MAX_CPU - 1) * 0x1000))
 
-typedef struct FslMpicInfo {
+struct fsl_mpic_info {
 	int max_ext;
-} FslMpicInfo;
+};
 
-static FslMpicInfo fsl_mpic_20 = {
+static struct fsl_mpic_info fsl_mpic_20 = {
 	.max_ext = 12,
 };
 
-static FslMpicInfo fsl_mpic_42 = {
+static struct fsl_mpic_info fsl_mpic_42 = {
 	.max_ext = 12,
 };
 
@@ -100,44 +100,43 @@ static int get_current_cpu(void)
 {
 	CPUState *cpu_single_cpu;
 
-	if (!cpu_single_env) {
+	if (!cpu_single_env)
 		return -1;
-	}
 
 	cpu_single_cpu = ENV_GET_CPU(cpu_single_env);
 	return cpu_single_cpu->cpu_index;
 }
 
-static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx);
-static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
+static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx);
+static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
 				       uint32_t val, int idx);
 
-typedef enum IRQType {
+enum irq_type {
 	IRQ_TYPE_NORMAL = 0,
 	IRQ_TYPE_FSLINT,	/* FSL internal interrupt -- level only */
 	IRQ_TYPE_FSLSPECIAL,	/* FSL timer/IPI interrupt, edge, no polarity */
-} IRQType;
+};
 
-typedef struct IRQQueue {
+struct irq_queue {
 	/* Round up to the nearest 64 IRQs so that the queue length
 	 * won't change when moving between 32 and 64 bit hosts.
 	 */
 	unsigned long queue[BITS_TO_LONGS((MAX_IRQ + 63) & ~63)];
 	int next;
 	int priority;
-} IRQQueue;
+};
 
-typedef struct IRQSource {
+struct irq_source {
 	uint32_t ivpr;		/* IRQ vector/priority register */
 	uint32_t idr;		/* IRQ destination register */
 	uint32_t destmask;	/* bitmap of CPU destinations */
 	int last_cpu;
 	int output;		/* IRQ level, e.g. OPENPIC_OUTPUT_INT */
 	int pending;		/* TRUE if IRQ is pending */
-	IRQType type;
+	enum irq_type type;
 	bool level:1;		/* level-triggered */
-	bool nomask:1;		/* critical interrupts ignore mask on some FSL MPICs */
-} IRQSource;
+	bool nomask:1;	/* critical interrupts ignore mask on some FSL MPICs */
+};
 
 #define IVPR_MASK_SHIFT       31
 #define IVPR_MASK_MASK        (1 << IVPR_MASK_SHIFT)
@@ -158,22 +157,19 @@ typedef struct IRQSource {
 #define IDR_EP      0x80000000	/* external pin */
 #define IDR_CI      0x40000000	/* critical interrupt */
 
-typedef struct IRQDest {
+struct irq_dest {
 	int32_t ctpr;		/* CPU current task priority */
-	IRQQueue raised;
-	IRQQueue servicing;
+	struct irq_queue raised;
+	struct irq_queue servicing;
 	qemu_irq *irqs;
 
 	/* Count of IRQ sources asserting on non-INT outputs */
 	uint32_t outputs_active[OPENPIC_OUTPUT_NB];
-} IRQDest;
-
-typedef struct OpenPICState {
-	SysBusDevice busdev;
-	MemoryRegion mem;
+};
 
+struct openpic {
 	/* Behavior control */
-	FslMpicInfo *fsl;
+	struct fsl_mpic_info *fsl;
 	uint32_t model;
 	uint32_t flags;
 	uint32_t nb_irqs;
@@ -186,9 +182,6 @@ typedef struct OpenPICState {
 	uint32_t brr1;
 	uint32_t mpic_mode_mask;
 
-	/* Sub-regions */
-	MemoryRegion sub_io_mem[6];
-
 	/* Global registers */
 	uint32_t frr;		/* Feature reporting register */
 	uint32_t gcr;		/* Global configuration register  */
@@ -196,9 +189,9 @@ typedef struct OpenPICState {
 	uint32_t spve;		/* Spurious vector register */
 	uint32_t tfrr;		/* Timer frequency reporting register */
 	/* Source registers */
-	IRQSource src[MAX_IRQ];
+	struct irq_source src[MAX_IRQ];
 	/* Local registers per output pin */
-	IRQDest dst[MAX_CPU];
+	struct irq_dest dst[MAX_CPU];
 	uint32_t nb_cpus;
 	/* Timer registers */
 	struct {
@@ -213,24 +206,24 @@ typedef struct OpenPICState {
 	uint32_t irq_ipi0;
 	uint32_t irq_tim0;
 	uint32_t irq_msi;
-} OpenPICState;
+};
 
-static inline void IRQ_setbit(IRQQueue * q, int n_IRQ)
+static inline void IRQ_setbit(struct irq_queue *q, int n_IRQ)
 {
 	set_bit(n_IRQ, q->queue);
 }
 
-static inline void IRQ_resetbit(IRQQueue * q, int n_IRQ)
+static inline void IRQ_resetbit(struct irq_queue *q, int n_IRQ)
 {
 	clear_bit(n_IRQ, q->queue);
 }
 
-static inline int IRQ_testbit(IRQQueue * q, int n_IRQ)
+static inline int IRQ_testbit(struct irq_queue *q, int n_IRQ)
 {
 	return test_bit(n_IRQ, q->queue);
 }
 
-static void IRQ_check(OpenPICState * opp, IRQQueue * q)
+static void IRQ_check(struct openpic *opp, struct irq_queue *q)
 {
 	int irq = -1;
 	int next = -1;
@@ -238,11 +231,10 @@ static void IRQ_check(OpenPICState * opp, IRQQueue * q)
 
 	for (;;) {
 		irq = find_next_bit(q->queue, opp->max_irq, irq + 1);
-		if (irq == opp->max_irq) {
+		if (irq == opp->max_irq)
 			break;
-		}
 
-		DPRINTF("IRQ_check: irq %d set ivpr_pr=%d pr=%d\n",
+		pr_debug("IRQ_check: irq %d set ivpr_pr=%d pr=%d\n",
 			irq, IVPR_PRIORITY(opp->src[irq].ivpr), priority);
 
 		if (IVPR_PRIORITY(opp->src[irq].ivpr) > priority) {
@@ -255,7 +247,7 @@ static void IRQ_check(OpenPICState * opp, IRQQueue * q)
 	q->priority = priority;
 }
 
-static int IRQ_get_next(OpenPICState * opp, IRQQueue * q)
+static int IRQ_get_next(struct openpic *opp, struct irq_queue *q)
 {
 	/* XXX: optimize */
 	IRQ_check(opp, q);
@@ -263,21 +255,21 @@ static int IRQ_get_next(OpenPICState * opp, IRQQueue * q)
 	return q->next;
 }
 
-static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
+static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 			   bool active, bool was_active)
 {
-	IRQDest *dst;
-	IRQSource *src;
+	struct irq_dest *dst;
+	struct irq_source *src;
 	int priority;
 
 	dst = &opp->dst[n_CPU];
 	src = &opp->src[n_IRQ];
 
-	DPRINTF("%s: IRQ %d active %d was %d\n",
+	pr_debug("%s: IRQ %d active %d was %d\n",
 		__func__, n_IRQ, active, was_active);
 
 	if (src->output != OPENPIC_OUTPUT_INT) {
-		DPRINTF("%s: output %d irq %d active %d was %d count %d\n",
+		pr_debug("%s: output %d irq %d active %d was %d count %d\n",
 			__func__, src->output, n_IRQ, active, was_active,
 			dst->outputs_active[src->output]);
 
@@ -286,19 +278,17 @@ static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
 		 * masking.
 		 */
 		if (active) {
-			if (!was_active
-			    && dst->outputs_active[src->output]++ == 0) {
-				DPRINTF
-				    ("%s: Raise OpenPIC output %d cpu %d irq %d\n",
-				     __func__, src->output, n_CPU, n_IRQ);
+			if (!was_active &&
+			    dst->outputs_active[src->output]++ == 0) {
+				pr_debug("%s: Raise OpenPIC output %d cpu %d irq %d\n",
+					__func__, src->output, n_CPU, n_IRQ);
 				qemu_irq_raise(dst->irqs[src->output]);
 			}
 		} else {
-			if (was_active
-			    && --dst->outputs_active[src->output] == 0) {
-				DPRINTF
-				    ("%s: Lower OpenPIC output %d cpu %d irq %d\n",
-				     __func__, src->output, n_CPU, n_IRQ);
+			if (was_active &&
+			    --dst->outputs_active[src->output] == 0) {
+				pr_debug("%s: Lower OpenPIC output %d cpu %d irq %d\n",
+					__func__, src->output, n_CPU, n_IRQ);
 				qemu_irq_lower(dst->irqs[src->output]);
 			}
 		}
@@ -311,31 +301,27 @@ static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
 	/* Even if the interrupt doesn't have enough priority,
 	 * it is still raised, in case ctpr is lowered later.
 	 */
-	if (active) {
+	if (active)
 		IRQ_setbit(&dst->raised, n_IRQ);
-	} else {
+	else
 		IRQ_resetbit(&dst->raised, n_IRQ);
-	}
 
 	IRQ_check(opp, &dst->raised);
 
 	if (active && priority <= dst->ctpr) {
-		DPRINTF
-		    ("%s: IRQ %d priority %d too low for ctpr %d on CPU %d\n",
-		     __func__, n_IRQ, priority, dst->ctpr, n_CPU);
+		pr_debug("%s: IRQ %d priority %d too low for ctpr %d on CPU %d\n",
+			__func__, n_IRQ, priority, dst->ctpr, n_CPU);
 		active = 0;
 	}
 
 	if (active) {
 		if (IRQ_get_next(opp, &dst->servicing) >= 0 &&
 		    priority <= dst->servicing.priority) {
-			DPRINTF
-			    ("%s: IRQ %d is hidden by servicing IRQ %d on CPU %d\n",
-			     __func__, n_IRQ, dst->servicing.next, n_CPU);
+			pr_debug("%s: IRQ %d is hidden by servicing IRQ %d on CPU %d\n",
+				__func__, n_IRQ, dst->servicing.next, n_CPU);
 		} else {
-			DPRINTF
-			    ("%s: Raise OpenPIC INT output cpu %d irq %d/%d\n",
-			     __func__, n_CPU, n_IRQ, dst->raised.next);
+			pr_debug("%s: Raise OpenPIC INT output cpu %d irq %d/%d\n",
+				__func__, n_CPU, n_IRQ, dst->raised.next);
 			qemu_irq_raise(opp->dst[n_CPU].
 				       irqs[OPENPIC_OUTPUT_INT]);
 		}
@@ -343,17 +329,15 @@ static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
 		IRQ_get_next(opp, &dst->servicing);
 		if (dst->raised.priority > dst->ctpr &&
 		    dst->raised.priority > dst->servicing.priority) {
-			DPRINTF
-			    ("%s: IRQ %d inactive, IRQ %d prio %d above %d/%d, CPU %d\n",
-			     __func__, n_IRQ, dst->raised.next,
-			     dst->raised.priority, dst->ctpr,
-			     dst->servicing.priority, n_CPU);
+			pr_debug("%s: IRQ %d inactive, IRQ %d prio %d above %d/%d, CPU %d\n",
+				__func__, n_IRQ, dst->raised.next,
+				dst->raised.priority, dst->ctpr,
+				dst->servicing.priority, n_CPU);
 			/* IRQ line stays asserted */
 		} else {
-			DPRINTF
-			    ("%s: IRQ %d inactive, current prio %d/%d, CPU %d\n",
-			     __func__, n_IRQ, dst->ctpr,
-			     dst->servicing.priority, n_CPU);
+			pr_debug("%s: IRQ %d inactive, current prio %d/%d, CPU %d\n",
+				__func__, n_IRQ, dst->ctpr,
+				dst->servicing.priority, n_CPU);
 			qemu_irq_lower(opp->dst[n_CPU].
 				       irqs[OPENPIC_OUTPUT_INT]);
 		}
@@ -361,9 +345,9 @@ static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
 }
 
 /* update pic state because registers for n_IRQ have changed value */
-static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
+static void openpic_update_irq(struct openpic *opp, int n_IRQ)
 {
-	IRQSource *src;
+	struct irq_source *src;
 	bool active, was_active;
 	int i;
 
@@ -372,30 +356,29 @@ static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
 
 	if ((src->ivpr & IVPR_MASK_MASK) && !src->nomask) {
 		/* Interrupt source is disabled */
-		DPRINTF("%s: IRQ %d is disabled\n", __func__, n_IRQ);
+		pr_debug("%s: IRQ %d is disabled\n", __func__, n_IRQ);
 		active = false;
 	}
 
-	was_active = ! !(src->ivpr & IVPR_ACTIVITY_MASK);
+	was_active = !!(src->ivpr & IVPR_ACTIVITY_MASK);
 
 	/*
 	 * We don't have a similar check for already-active because
 	 * ctpr may have changed and we need to withdraw the interrupt.
 	 */
 	if (!active && !was_active) {
-		DPRINTF("%s: IRQ %d is already inactive\n", __func__, n_IRQ);
+		pr_debug("%s: IRQ %d is already inactive\n", __func__, n_IRQ);
 		return;
 	}
 
-	if (active) {
+	if (active)
 		src->ivpr |= IVPR_ACTIVITY_MASK;
-	} else {
+	else
 		src->ivpr &= ~IVPR_ACTIVITY_MASK;
-	}
 
 	if (src->destmask == 0) {
 		/* No target */
-		DPRINTF("%s: IRQ %d has no target\n", __func__, n_IRQ);
+		pr_debug("%s: IRQ %d has no target\n", __func__, n_IRQ);
 		return;
 	}
 
@@ -413,9 +396,9 @@ static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
 	} else {
 		/* Distributed delivery mode */
 		for (i = src->last_cpu + 1; i != src->last_cpu; i++) {
-			if (i == opp->nb_cpus) {
+			if (i == opp->nb_cpus)
 				i = 0;
-			}
+
 			if (src->destmask & (1 << i)) {
 				IRQ_local_pipe(opp, i, n_IRQ, active,
 					       was_active);
@@ -428,16 +411,16 @@ static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
 
 static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 {
-	OpenPICState *opp = opaque;
-	IRQSource *src;
+	struct openpic *opp = opaque;
+	struct irq_source *src;
 
 	if (n_IRQ >= MAX_IRQ) {
-		fprintf(stderr, "%s: IRQ %d out of range\n", __func__, n_IRQ);
+		pr_err("%s: IRQ %d out of range\n", __func__, n_IRQ);
 		abort();
 	}
 
 	src = &opp->src[n_IRQ];
-	DPRINTF("openpic: set irq %d = %d ivpr=0x%08x\n",
+	pr_debug("openpic: set irq %d = %d ivpr=0x%08x\n",
 		n_IRQ, level, src->ivpr);
 	if (src->level) {
 		/* level-sensitive irq */
@@ -463,9 +446,9 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 	}
 }
 
-static void openpic_reset(DeviceState * d)
+static void openpic_reset(DeviceState *d)
 {
-	OpenPICState *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d));
+	struct openpic *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d));
 	int i;
 
 	opp->gcr = GCR_RESET;
@@ -485,7 +468,7 @@ static void openpic_reset(DeviceState * d)
 		switch (opp->src[i].type) {
 		case IRQ_TYPE_NORMAL:
 			opp->src[i].level =
-			    ! !(opp->ivpr_reset & IVPR_SENSE_MASK);
+			    !!(opp->ivpr_reset & IVPR_SENSE_MASK);
 			break;
 
 		case IRQ_TYPE_FSLINT:
@@ -499,9 +482,9 @@ static void openpic_reset(DeviceState * d)
 	/* Initialise IRQ destinations */
 	for (i = 0; i < MAX_CPU; i++) {
 		opp->dst[i].ctpr = 15;
-		memset(&opp->dst[i].raised, 0, sizeof(IRQQueue));
+		memset(&opp->dst[i].raised, 0, sizeof(struct irq_queue));
 		opp->dst[i].raised.next = -1;
-		memset(&opp->dst[i].servicing, 0, sizeof(IRQQueue));
+		memset(&opp->dst[i].servicing, 0, sizeof(struct irq_queue));
 		opp->dst[i].servicing.next = -1;
 	}
 	/* Initialise timers */
@@ -513,28 +496,28 @@ static void openpic_reset(DeviceState * d)
 	opp->gcr = 0;
 }
 
-static inline uint32_t read_IRQreg_idr(OpenPICState * opp, int n_IRQ)
+static inline uint32_t read_IRQreg_idr(struct openpic *opp, int n_IRQ)
 {
 	return opp->src[n_IRQ].idr;
 }
 
-static inline uint32_t read_IRQreg_ilr(OpenPICState * opp, int n_IRQ)
+static inline uint32_t read_IRQreg_ilr(struct openpic *opp, int n_IRQ)
 {
-	if (opp->flags & OPENPIC_FLAG_ILR) {
+	if (opp->flags & OPENPIC_FLAG_ILR)
 		return output_to_inttgt(opp->src[n_IRQ].output);
-	}
 
 	return 0xffffffff;
 }
 
-static inline uint32_t read_IRQreg_ivpr(OpenPICState * opp, int n_IRQ)
+static inline uint32_t read_IRQreg_ivpr(struct openpic *opp, int n_IRQ)
 {
 	return opp->src[n_IRQ].ivpr;
 }
 
-static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
+static inline void write_IRQreg_idr(struct openpic *opp, int n_IRQ,
+				    uint32_t val)
 {
-	IRQSource *src = &opp->src[n_IRQ];
+	struct irq_source *src = &opp->src[n_IRQ];
 	uint32_t normal_mask = (1UL << opp->nb_cpus) - 1;
 	uint32_t crit_mask = 0;
 	uint32_t mask = normal_mask;
@@ -547,14 +530,13 @@ static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
 	}
 
 	src->idr = val & mask;
-	DPRINTF("Set IDR %d to 0x%08x\n", n_IRQ, src->idr);
+	pr_debug("Set IDR %d to 0x%08x\n", n_IRQ, src->idr);
 
 	if (opp->flags & OPENPIC_FLAG_IDR_CRIT) {
 		if (src->idr & crit_mask) {
 			if (src->idr & normal_mask) {
-				DPRINTF
-				    ("%s: IRQ configured for multiple output types, using "
-				     "critical\n", __func__);
+				pr_debug("%s: IRQ configured for multiple output types, using critical\n",
+					__func__);
 			}
 
 			src->output = OPENPIC_OUTPUT_CINT;
@@ -564,9 +546,8 @@ static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
 			for (i = 0; i < opp->nb_cpus; i++) {
 				int n_ci = IDR_CI0_SHIFT - i;
 
-				if (src->idr & (1UL << n_ci)) {
+				if (src->idr & (1UL << n_ci))
 					src->destmask |= 1UL << i;
-				}
 			}
 		} else {
 			src->output = OPENPIC_OUTPUT_INT;
@@ -578,20 +559,21 @@ static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
 	}
 }
 
-static inline void write_IRQreg_ilr(OpenPICState * opp, int n_IRQ, uint32_t val)
+static inline void write_IRQreg_ilr(struct openpic *opp, int n_IRQ,
+				    uint32_t val)
 {
 	if (opp->flags & OPENPIC_FLAG_ILR) {
-		IRQSource *src = &opp->src[n_IRQ];
+		struct irq_source *src = &opp->src[n_IRQ];
 
 		src->output = inttgt_to_output(val & ILR_INTTGT_MASK);
-		DPRINTF("Set ILR %d to 0x%08x, output %d\n", n_IRQ, src->idr,
+		pr_debug("Set ILR %d to 0x%08x, output %d\n", n_IRQ, src->idr,
 			src->output);
 
 		/* TODO: on MPIC v4.0 only, set nomask for non-INT */
 	}
 }
 
-static inline void write_IRQreg_ivpr(OpenPICState * opp, int n_IRQ,
+static inline void write_IRQreg_ivpr(struct openpic *opp, int n_IRQ,
 				     uint32_t val)
 {
 	uint32_t mask;
@@ -613,7 +595,7 @@ static inline void write_IRQreg_ivpr(OpenPICState * opp, int n_IRQ,
 	switch (opp->src[n_IRQ].type) {
 	case IRQ_TYPE_NORMAL:
 		opp->src[n_IRQ].level =
-		    ! !(opp->src[n_IRQ].ivpr & IVPR_SENSE_MASK);
+		    !!(opp->src[n_IRQ].ivpr & IVPR_SENSE_MASK);
 		break;
 
 	case IRQ_TYPE_FSLINT:
@@ -626,11 +608,11 @@ static inline void write_IRQreg_ivpr(OpenPICState * opp, int n_IRQ,
 	}
 
 	openpic_update_irq(opp, n_IRQ);
-	DPRINTF("Set IVPR %d to 0x%08x -> 0x%08x\n", n_IRQ, val,
+	pr_debug("Set IVPR %d to 0x%08x -> 0x%08x\n", n_IRQ, val,
 		opp->src[n_IRQ].ivpr);
 }
 
-static void openpic_gcr_write(OpenPICState * opp, uint64_t val)
+static void openpic_gcr_write(struct openpic *opp, uint64_t val)
 {
 	bool mpic_proxy = false;
 
@@ -643,27 +625,26 @@ static void openpic_gcr_write(OpenPICState * opp, uint64_t val)
 	opp->gcr |= val & opp->mpic_mode_mask;
 
 	/* Set external proxy mode */
-	if ((val & opp->mpic_mode_mask) == GCR_MODE_PROXY) {
+	if ((val & opp->mpic_mode_mask) == GCR_MODE_PROXY)
 		mpic_proxy = true;
-	}
 
 	ppce500_set_mpic_proxy(mpic_proxy);
 }
 
-static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned len)
 {
-	OpenPICState *opp = opaque;
-	IRQDest *dst;
+	struct openpic *opp = opaque;
+	struct irq_dest *dst;
 	int idx;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
 		__func__, addr, val);
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return;
-	}
+
 	switch (addr) {
-	case 0x00:		/* Block Revision Register1 (BRR1) is Readonly */
+	case 0x00:	/* Block Revision Register1 (BRR1) is Readonly */
 		break;
 	case 0x40:
 	case 0x50:
@@ -685,16 +666,14 @@ static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
 	case 0x1090:		/* PIR */
 		for (idx = 0; idx < opp->nb_cpus; idx++) {
 			if ((val & (1 << idx)) && !(opp->pir & (1 << idx))) {
-				DPRINTF
-				    ("Raise OpenPIC RESET output for CPU %d\n",
-				     idx);
+				pr_debug("Raise OpenPIC RESET output for CPU %d\n",
+					idx);
 				dst = &opp->dst[idx];
 				qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_RESET]);
-			} else if (!(val & (1 << idx))
-				   && (opp->pir & (1 << idx))) {
-				DPRINTF
-				    ("Lower OpenPIC RESET output for CPU %d\n",
-				     idx);
+			} else if (!(val & (1 << idx)) &&
+				   (opp->pir & (1 << idx))) {
+				pr_debug("Lower OpenPIC RESET output for CPU %d\n",
+					idx);
 				dst = &opp->dst[idx];
 				qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_RESET]);
 			}
@@ -704,13 +683,12 @@ static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
 	case 0x10A0:		/* IPI_IVPR */
 	case 0x10B0:
 	case 0x10C0:
-	case 0x10D0:
-		{
-			int idx;
-			idx = (addr - 0x10A0) >> 4;
-			write_IRQreg_ivpr(opp, opp->irq_ipi0 + idx, val);
-		}
+	case 0x10D0: {
+		int idx;
+		idx = (addr - 0x10A0) >> 4;
+		write_IRQreg_ivpr(opp, opp->irq_ipi0 + idx, val);
 		break;
+	}
 	case 0x10E0:		/* SPVE */
 		opp->spve = val & opp->vector_mask;
 		break;
@@ -719,16 +697,16 @@ static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
 	}
 }
 
-static uint64_t openpic_gbl_read(void *opaque, hwaddr addr, unsigned len)
+static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	uint32_t retval;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
 	retval = 0xFFFFFFFF;
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return retval;
-	}
+
 	switch (addr) {
 	case 0x1000:		/* FRR */
 		retval = opp->frr;
@@ -772,24 +750,23 @@ static uint64_t openpic_gbl_read(void *opaque, hwaddr addr, unsigned len)
 	default:
 		break;
 	}
-	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
 	return retval;
 }
 
-static void openpic_tmr_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_tmr_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	int idx;
 
 	addr += 0x10f0;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
 		__func__, addr, val);
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return;
-	}
 
 	if (addr == 0x10f0) {
 		/* TFRR */
@@ -806,9 +783,9 @@ static void openpic_tmr_write(void *opaque, hwaddr addr, uint64_t val,
 	case 0x10:		/* TBCR */
 		if ((opp->timers[idx].tccr & TCCR_TOG) != 0 &&
 		    (val & TBCR_CI) == 0 &&
-		    (opp->timers[idx].tbcr & TBCR_CI) != 0) {
+		    (opp->timers[idx].tbcr & TBCR_CI) != 0)
 			opp->timers[idx].tccr &= ~TCCR_TOG;
-		}
+
 		opp->timers[idx].tbcr = val;
 		break;
 	case 0x20:		/* TVPR */
@@ -820,16 +797,16 @@ static void openpic_tmr_write(void *opaque, hwaddr addr, uint64_t val,
 	}
 }
 
-static uint64_t openpic_tmr_read(void *opaque, hwaddr addr, unsigned len)
+static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	uint32_t retval = -1;
 	int idx;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
-	if (addr & 0xF) {
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	if (addr & 0xF)
 		goto out;
-	}
+
 	idx = (addr >> 6) & 0x3;
 	if (addr == 0x0) {
 		/* TFRR */
@@ -852,18 +829,18 @@ static uint64_t openpic_tmr_read(void *opaque, hwaddr addr, unsigned len)
 	}
 
 out:
-	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
 	return retval;
 }
 
-static void openpic_src_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_src_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	int idx;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
 		__func__, addr, val);
 
 	addr = addr & 0xffff;
@@ -884,11 +861,11 @@ static void openpic_src_write(void *opaque, hwaddr addr, uint64_t val,
 
 static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	uint32_t retval;
 	int idx;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
 	retval = 0xFFFFFFFF;
 
 	addr = addr & 0xffff;
@@ -906,22 +883,21 @@ static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
 		break;
 	}
 
-	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
 	return retval;
 }
 
-static void openpic_msi_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_msi_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned size)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	int idx = opp->irq_msi;
 	int srs, ibs;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
 		__func__, addr, val);
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return;
-	}
 
 	switch (addr) {
 	case MSIIR_OFFSET:
@@ -937,16 +913,15 @@ static void openpic_msi_write(void *opaque, hwaddr addr, uint64_t val,
 	}
 }
 
-static uint64_t openpic_msi_read(void *opaque, hwaddr addr, unsigned size)
+static uint64_t openpic_msi_read(void *opaque, gpa_t addr, unsigned size)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	uint64_t r = 0;
 	int i, srs;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
-	if (addr & 0xF) {
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	if (addr & 0xF)
 		return -1;
-	}
 
 	srs = addr >> 4;
 
@@ -965,53 +940,51 @@ static uint64_t openpic_msi_read(void *opaque, hwaddr addr, unsigned size)
 		openpic_set_irq(opp, opp->irq_msi + srs, 0);
 		break;
 	case 0x120:		/* MSISR */
-		for (i = 0; i < MAX_MSI; i++) {
+		for (i = 0; i < MAX_MSI; i++)
 			r |= (opp->msi[i].msir ? 1 : 0) << i;
-		}
 		break;
 	}
 
 	return r;
 }
 
-static uint64_t openpic_summary_read(void *opaque, hwaddr addr, unsigned size)
+static uint64_t openpic_summary_read(void *opaque, gpa_t addr, unsigned size)
 {
 	uint64_t r = 0;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
 
 	/* TODO: EISR/EIMR */
 
 	return r;
 }
 
-static void openpic_summary_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_summary_write(void *opaque, gpa_t addr, uint64_t val,
 				  unsigned size)
 {
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
 		__func__, addr, val);
 
 	/* TODO: EISR/EIMR */
 }
 
-static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
+static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
 				       uint32_t val, int idx)
 {
-	OpenPICState *opp = opaque;
-	IRQSource *src;
-	IRQDest *dst;
+	struct openpic *opp = opaque;
+	struct irq_source *src;
+	struct irq_dest *dst;
 	int s_IRQ, n_IRQ;
 
-	DPRINTF("%s: cpu %d addr %#" HWADDR_PRIx " <= 0x%08x\n", __func__, idx,
+	pr_debug("%s: cpu %d addr %#" HWADDR_PRIx " <= 0x%08x\n", __func__, idx,
 		addr, val);
 
-	if (idx < 0) {
+	if (idx < 0)
 		return;
-	}
 
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return;
-	}
+
 	dst = &opp->dst[idx];
 	addr &= 0xFF0;
 	switch (addr) {
@@ -1028,17 +1001,16 @@ static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
 	case 0x80:		/* CTPR */
 		dst->ctpr = val & 0x0000000F;
 
-		DPRINTF("%s: set CPU %d ctpr to %d, raised %d servicing %d\n",
+		pr_debug("%s: set CPU %d ctpr to %d, raised %d servicing %d\n",
 			__func__, idx, dst->ctpr, dst->raised.priority,
 			dst->servicing.priority);
 
 		if (dst->raised.priority <= dst->ctpr) {
-			DPRINTF
-			    ("%s: Lower OpenPIC INT output cpu %d due to ctpr\n",
-			     __func__, idx);
+			pr_debug("%s: Lower OpenPIC INT output cpu %d due to ctpr\n",
+				__func__, idx);
 			qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
 		} else if (dst->raised.priority > dst->servicing.priority) {
-			DPRINTF("%s: Raise OpenPIC INT output cpu %d irq %d\n",
+			pr_debug("%s: Raise OpenPIC INT output cpu %d irq %d\n",
 				__func__, idx, dst->raised.next);
 			qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_INT]);
 		}
@@ -1051,11 +1023,11 @@ static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
 		/* Read-only register */
 		break;
 	case 0xB0:		/* EOI */
-		DPRINTF("EOI\n");
+		pr_debug("EOI\n");
 		s_IRQ = IRQ_get_next(opp, &dst->servicing);
 
 		if (s_IRQ < 0) {
-			DPRINTF("%s: EOI with no interrupt in service\n",
+			pr_debug("%s: EOI with no interrupt in service\n",
 				__func__);
 			break;
 		}
@@ -1069,7 +1041,7 @@ static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
 		if (n_IRQ != -1 &&
 		    (s_IRQ == -1 ||
 		     IVPR_PRIORITY(src->ivpr) > dst->servicing.priority)) {
-			DPRINTF("Raise OpenPIC INT output cpu %d irq %d\n",
+			pr_debug("Raise OpenPIC INT output cpu %d irq %d\n",
 				idx, n_IRQ);
 			qemu_irq_raise(opp->dst[idx].irqs[OPENPIC_OUTPUT_INT]);
 		}
@@ -1079,32 +1051,32 @@ static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
 	}
 }
 
-static void openpic_cpu_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_cpu_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned len)
 {
 	openpic_cpu_write_internal(opaque, addr, val, (addr & 0x1f000) >> 12);
 }
 
-static uint32_t openpic_iack(OpenPICState * opp, IRQDest * dst, int cpu)
+static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
+			     int cpu)
 {
-	IRQSource *src;
+	struct irq_source *src;
 	int retval, irq;
 
-	DPRINTF("Lower OpenPIC INT output\n");
+	pr_debug("Lower OpenPIC INT output\n");
 	qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
 
 	irq = IRQ_get_next(opp, &dst->raised);
-	DPRINTF("IACK: irq=%d\n", irq);
+	pr_debug("IACK: irq=%d\n", irq);
 
-	if (irq == -1) {
+	if (irq == -1)
 		/* No more interrupt pending */
 		return opp->spve;
-	}
 
 	src = &opp->src[irq];
 	if (!(src->ivpr & IVPR_ACTIVITY_MASK) ||
 	    !(IVPR_PRIORITY(src->ivpr) > dst->ctpr)) {
-		fprintf(stderr, "%s: bad raised IRQ %d ctpr %d ivpr 0x%08x\n",
+		pr_err("%s: bad raised IRQ %d ctpr %d ivpr 0x%08x\n",
 			__func__, irq, dst->ctpr, src->ivpr);
 		openpic_update_irq(opp, irq);
 		retval = opp->spve;
@@ -1135,22 +1107,21 @@ static uint32_t openpic_iack(OpenPICState * opp, IRQDest * dst, int cpu)
 	return retval;
 }
 
-static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx)
+static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx)
 {
-	OpenPICState *opp = opaque;
-	IRQDest *dst;
+	struct openpic *opp = opaque;
+	struct irq_dest *dst;
 	uint32_t retval;
 
-	DPRINTF("%s: cpu %d addr %#" HWADDR_PRIx "\n", __func__, idx, addr);
+	pr_debug("%s: cpu %d addr %#" HWADDR_PRIx "\n", __func__, idx, addr);
 	retval = 0xFFFFFFFF;
 
-	if (idx < 0) {
+	if (idx < 0)
 		return retval;
-	}
 
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return retval;
-	}
+
 	dst = &opp->dst[idx];
 	addr &= 0xFF0;
 	switch (addr) {
@@ -1169,54 +1140,54 @@ static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx)
 	default:
 		break;
 	}
-	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
 	return retval;
 }
 
-static uint64_t openpic_cpu_read(void *opaque, hwaddr addr, unsigned len)
+static uint64_t openpic_cpu_read(void *opaque, gpa_t addr, unsigned len)
 {
 	return openpic_cpu_read_internal(opaque, addr, (addr & 0x1f000) >> 12);
 }
 
-static const MemoryRegionOps openpic_glb_ops_be = {
+static const struct kvm_io_device_ops openpic_glb_ops_be = {
 	.write = openpic_gbl_write,
 	.read = openpic_gbl_read,
 };
 
-static const MemoryRegionOps openpic_tmr_ops_be = {
+static const struct kvm_io_device_ops openpic_tmr_ops_be = {
 	.write = openpic_tmr_write,
 	.read = openpic_tmr_read,
 };
 
-static const MemoryRegionOps openpic_cpu_ops_be = {
+static const struct kvm_io_device_ops openpic_cpu_ops_be = {
 	.write = openpic_cpu_write,
 	.read = openpic_cpu_read,
 };
 
-static const MemoryRegionOps openpic_src_ops_be = {
+static const struct kvm_io_device_ops openpic_src_ops_be = {
 	.write = openpic_src_write,
 	.read = openpic_src_read,
 };
 
-static const MemoryRegionOps openpic_msi_ops_be = {
+static const struct kvm_io_device_ops openpic_msi_ops_be = {
 	.read = openpic_msi_read,
 	.write = openpic_msi_write,
 };
 
-static const MemoryRegionOps openpic_summary_ops_be = {
+static const struct kvm_io_device_ops openpic_summary_ops_be = {
 	.read = openpic_summary_read,
 	.write = openpic_summary_write,
 };
 
-typedef struct MemReg {
+struct mem_reg {
 	const char *name;
-	MemoryRegionOps const *ops;
-	hwaddr start_addr;
-	ram_addr_t size;
-} MemReg;
+	const struct kvm_io_device_ops *ops;
+	gpa_t start_addr;
+	int size;
+};
 
-static void fsl_common_init(OpenPICState * opp)
+static void fsl_common_init(struct openpic *opp)
 {
 	int i;
 	int virq = MAX_SRC;
@@ -1239,9 +1210,8 @@ static void fsl_common_init(OpenPICState * opp)
 	opp->irq_msi = 224;
 
 	msi_supported = true;
-	for (i = 0; i < opp->fsl->max_ext; i++) {
+	for (i = 0; i < opp->fsl->max_ext; i++)
 		opp->src[i].level = false;
-	}
 
 	/* Internal interrupts, including message and MSI */
 	for (i = 16; i < MAX_SRC; i++) {
@@ -1256,7 +1226,8 @@ static void fsl_common_init(OpenPICState * opp)
 	}
 }
 
-static void map_list(OpenPICState * opp, const MemReg * list, int *count)
+static void map_list(struct openpic *opp, const struct mem_reg *list,
+		     int *count)
 {
 	while (list->name) {
 		assert(*count < ARRAY_SIZE(opp->sub_io_mem));
@@ -1272,12 +1243,12 @@ static void map_list(OpenPICState * opp, const MemReg * list, int *count)
 	}
 }
 
-static int openpic_init(SysBusDevice * dev)
+static int openpic_init(SysBusDevice *dev)
 {
-	OpenPICState *opp = FROM_SYSBUS(typeof(*opp), dev);
+	struct openpic *opp = FROM_SYSBUS(typeof(*opp), dev);
 	int i, j;
 	int list_count = 0;
-	static const MemReg list_le[] = {
+	static const struct mem_reg list_le[] = {
 		{"glb", &openpic_glb_ops_le,
 		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
 		{"tmr", &openpic_tmr_ops_le,
@@ -1288,7 +1259,7 @@ static int openpic_init(SysBusDevice * dev)
 		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
 		{NULL}
 	};
-	static const MemReg list_be[] = {
+	static const struct mem_reg list_be[] = {
 		{"glb", &openpic_glb_ops_be,
 		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
 		{"tmr", &openpic_tmr_ops_be,
@@ -1299,7 +1270,7 @@ static int openpic_init(SysBusDevice * dev)
 		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
 		{NULL}
 	};
-	static const MemReg list_fsl[] = {
+	static const struct mem_reg list_fsl[] = {
 		{"msi", &openpic_msi_ops_be,
 		 OPENPIC_MSI_REG_START, OPENPIC_MSI_REG_SIZE},
 		{"summary", &openpic_summary_ops_be,
-- 
1.6.0.2


* [PATCH 13/17] kvm/ppc/mpic: in-kernel MPIC emulation
  2013-04-18 14:11 ` Alexander Graf
@ 2013-04-18 14:11   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

Hook the MPIC code up to the KVM interfaces, add locking, etc.

Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: add stub function for kvmppc_mpic_set_epr, non-booke, 64bit]
Signed-off-by: Alexander Graf <agraf@suse.de>
---
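Illustrative note, not part of the patch: mpic.txt in this patch defines
the "attr" value for KVM_DEV_MPIC_GRP_IRQ_ACTIVE as the byte offset of the
source's IVPR from EIVPR0, divided by 32.  A minimal sketch of that
arithmetic follows.  The helper name mpic_irq_from_ivpr_offset() and the
constant MPIC_IVPR_STRIDE are hypothetical and do not appear anywhere in
this series; rejecting offsets that are not a multiple of 32 is also an
assumption of the sketch, not documented behaviour.

```c
#include <stdint.h>

/*
 * Hypothetical constant: mpic.txt divides the IVPR byte offset by 32,
 * so consecutive sources are taken to be 32 bytes apart.
 */
#define MPIC_IVPR_STRIDE 32u

/*
 * Hypothetical helper: convert the byte offset of an IVPR, measured
 * from EIVPR0, into the IRQ number used as "attr".
 * Returns -1 for an offset that does not land on an IVPR (assumption).
 */
static int mpic_irq_from_ivpr_offset(uint32_t byte_offset)
{
	if (byte_offset % MPIC_IVPR_STRIDE)
		return -1;

	return (int)(byte_offset / MPIC_IVPR_STRIDE);
}
```

With this helper, EIVPR0 itself (offset 0) maps to IRQ 0 and the IVPR at
offset 0x20 maps to IRQ 1.
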
 Documentation/virtual/kvm/devices/mpic.txt |   37 ++
 arch/powerpc/include/asm/kvm_host.h        |    8 +-
 arch/powerpc/include/asm/kvm_ppc.h         |   17 +
 arch/powerpc/include/uapi/asm/kvm.h        |    7 +
 arch/powerpc/kvm/Kconfig                   |    9 +
 arch/powerpc/kvm/Makefile                  |    2 +
 arch/powerpc/kvm/booke.c                   |    8 +-
 arch/powerpc/kvm/mpic.c                    |  762 +++++++++++++++++++++-------
 arch/powerpc/kvm/powerpc.c                 |   12 +-
 include/linux/kvm_host.h                   |    2 +
 include/uapi/linux/kvm.h                   |    3 +
 virt/kvm/kvm_main.c                        |    6 +
 12 files changed, 673 insertions(+), 200 deletions(-)
 create mode 100644 Documentation/virtual/kvm/devices/mpic.txt
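
Illustrative note, not part of the patch: the booke.c hunk in this patch
replaces the boolean epr_enabled with epr_flags and only requests
KVM_REQ_EPR_EXIT when KVMPPC_EPR_USER is set.  The decision can be
sketched in isolation as below.  The three KVMPPC_EPR_* values are copied
from the kvm_host.h hunk; the function epr_needs_userspace_exit() and its
is_external parameter are hypothetical stand-ins for the checks made in
kvmppc_booke_irqprio_deliver().

```c
#include <stdbool.h>

/* Values as defined by the kvm_host.h hunk in this patch. */
#define KVMPPC_EPR_NONE		0 /* EPR not supported */
#define KVMPPC_EPR_USER		1 /* exit to userspace to fill EPR */
#define KVMPPC_EPR_KERNEL	2 /* in-kernel irqchip */

/*
 * Hypothetical helper: true when delivering an interrupt should
 * trigger an exit to userspace so that it can fill in EPR.
 * is_external stands in for "priority == BOOKE_IRQPRIO_EXTERNAL".
 */
static bool epr_needs_userspace_exit(unsigned char epr_flags, bool is_external)
{
	/* EPR is only updated for external interrupts with EPR in use. */
	if (!is_external || !epr_flags)
		return false;

	/* KVMPPC_EPR_USER takes precedence over KVMPPC_EPR_KERNEL. */
	return (epr_flags & KVMPPC_EPR_USER) != 0;
}
```

Testing the KVMPPC_EPR_USER bit rather than comparing for equality is what
gives userspace precedence when both flags happen to be set.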

diff --git a/Documentation/virtual/kvm/devices/mpic.txt b/Documentation/virtual/kvm/devices/mpic.txt
new file mode 100644
index 0000000..ce98e32
--- /dev/null
+++ b/Documentation/virtual/kvm/devices/mpic.txt
@@ -0,0 +1,37 @@
+MPIC interrupt controller
+=========================
+
+Device types supported:
+  KVM_DEV_TYPE_FSL_MPIC_20     Freescale MPIC v2.0
+  KVM_DEV_TYPE_FSL_MPIC_42     Freescale MPIC v4.2
+
+Only one MPIC instance, of any type, may be instantiated.  The created
+MPIC will act as the system interrupt controller, connecting to each
+vcpu's interrupt inputs.
+
+Groups:
+  KVM_DEV_MPIC_GRP_MISC
+  Attributes:
+    KVM_DEV_MPIC_BASE_ADDR (rw, 64-bit)
+      Base address of the 256 KiB MPIC register space.  Must be
+      naturally aligned.  A value of zero disables the mapping.
+      Reset value is zero.
+
+  KVM_DEV_MPIC_GRP_REGISTER (rw, 32-bit)
+    Access an MPIC register, as if the access were made from the guest.
+    "attr" is the byte offset into the MPIC register space.  Accesses
+    must be 4-byte aligned.
+
+    MSIs may be signaled by using this attribute group to write
+    to the relevant MSIIR.
+
+  KVM_DEV_MPIC_GRP_IRQ_ACTIVE (rw, 32-bit)
+    IRQ input line for each standard openpic source.  0 is inactive and 1
+    is active, regardless of interrupt sense.
+
+    For edge-triggered interrupts:  Writing 1 is considered an activating
+    edge, and writing 0 is ignored.  Reading returns 1 if a previously
+    signaled edge has not been acknowledged, and 0 otherwise.
+
+    "attr" is the IRQ number.  IRQ numbers for standard sources are the
+    byte offset of the relevant IVPR from EIVPR0, divided by 32.
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index e34f8fe..7e7aef9 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -359,6 +359,11 @@ struct kvmppc_slb {
 #define KVMPPC_BOOKE_MAX_IAC	4
 #define KVMPPC_BOOKE_MAX_DAC	2
 
+/* KVMPPC_EPR_USER takes precedence over KVMPPC_EPR_KERNEL */
+#define KVMPPC_EPR_NONE		0 /* EPR not supported */
+#define KVMPPC_EPR_USER		1 /* exit to userspace to fill EPR */
+#define KVMPPC_EPR_KERNEL	2 /* in-kernel irqchip */
+
 struct kvmppc_booke_debug_reg {
 	u32 dbcr0;
 	u32 dbcr1;
@@ -522,7 +527,7 @@ struct kvm_vcpu_arch {
 	u8 sane;
 	u8 cpu_type;
 	u8 hcall_needed;
-	u8 epr_enabled;
+	u8 epr_flags; /* KVMPPC_EPR_xxx */
 	u8 epr_needed;
 
 	u32 cpr0_cfgaddr; /* holds the last set cpr0_cfgaddr */
@@ -589,5 +594,6 @@ struct kvm_vcpu_arch {
 #define KVM_MMIO_REG_FQPR	0x0060
 
 #define __KVM_HAVE_ARCH_WQP
+#define __KVM_HAVE_CREATE_DEVICE
 
 #endif /* __POWERPC_KVM_HOST_H__ */
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index f589307..0b86604 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -164,6 +164,8 @@ extern int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu);
 
 extern int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *);
 
+int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu, struct kvm_interrupt *irq);
+
 /*
  * Cuts out inst bits with ordering according to spec.
  * That means the leftmost bit is zero. All given bits are included.
@@ -245,6 +247,9 @@ int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id, union kvmppc_one_reg *);
 
 void kvmppc_set_pid(struct kvm_vcpu *vcpu, u32 pid);
 
+struct openpic;
+void kvmppc_mpic_put(struct openpic *opp);
+
 #ifdef CONFIG_KVM_BOOK3S_64_HV
 static inline void kvmppc_set_xics_phys(int cpu, unsigned long addr)
 {
@@ -270,6 +275,18 @@ static inline void kvmppc_set_epr(struct kvm_vcpu *vcpu, u32 epr)
 #endif
 }
 
+#ifdef CONFIG_KVM_MPIC
+
+void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu);
+
+#else
+
+static inline void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu)
+{
+}
+
+#endif /* CONFIG_KVM_MPIC */
+
 int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
 			      struct kvm_config_tlb *cfg);
 int kvm_vcpu_ioctl_dirty_tlb(struct kvm_vcpu *vcpu,
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index c2ff99c..36be2fe 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -426,4 +426,11 @@ struct kvm_get_htab_header {
 /* Debugging: Special instruction for software breakpoint */
 #define KVM_REG_PPC_DEBUG_INST	(KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x8b)
 
+/* Device control API: PPC-specific devices */
+#define KVM_DEV_MPIC_GRP_MISC		1
+#define   KVM_DEV_MPIC_BASE_ADDR	0	/* 64-bit */
+
+#define KVM_DEV_MPIC_GRP_REGISTER	2	/* 32-bit */
+#define KVM_DEV_MPIC_GRP_IRQ_ACTIVE	3	/* 32-bit */
+
 #endif /* __LINUX_KVM_POWERPC_H */
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 63c67ec..938a729 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -151,6 +151,15 @@ config KVM_E500MC
 
 	  If unsure, say N.
 
+config KVM_MPIC
+	bool "KVM in-kernel MPIC emulation"
+	depends on KVM
+	help
+	  Enable support for emulating MPIC devices inside the
+          host kernel, rather than relying on userspace to emulate.
+          Currently, support is limited to certain versions of
+          Freescale's MPIC implementation.
+
 source drivers/vhost/Kconfig
 
 endif # VIRTUALIZATION
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index b772ede..4a2277a 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -103,6 +103,8 @@ kvm-book3s_32-objs := \
 	book3s_32_mmu.o
 kvm-objs-$(CONFIG_KVM_BOOK3S_32) := $(kvm-book3s_32-objs)
 
+kvm-objs-$(CONFIG_KVM_MPIC) += mpic.o
+
 kvm-objs := $(kvm-objs-m) $(kvm-objs-y)
 
 obj-$(CONFIG_KVM_440) += kvm.o
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index a49a68a..cff53d4 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -346,7 +346,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 		keep_irq = true;
 	}
 
-	if ((priority == BOOKE_IRQPRIO_EXTERNAL) && vcpu->arch.epr_enabled)
+	if ((priority == BOOKE_IRQPRIO_EXTERNAL) && vcpu->arch.epr_flags)
 		update_epr = true;
 
 	switch (priority) {
@@ -427,8 +427,10 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 			set_guest_esr(vcpu, vcpu->arch.queued_esr);
 		if (update_dear == true)
 			set_guest_dear(vcpu, vcpu->arch.queued_dear);
-		if (update_epr == true)
-			kvm_make_request(KVM_REQ_EPR_EXIT, vcpu);
+		if (update_epr == true) {
+			if (vcpu->arch.epr_flags & KVMPPC_EPR_USER)
+				kvm_make_request(KVM_REQ_EPR_EXIT, vcpu);
+		}
 
 		new_msr &= msr_mask;
 #if defined(CONFIG_64BIT)
diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index 1df67ae..60857d5 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -23,6 +23,19 @@
  * THE SOFTWARE.
  */
 
+#include <linux/slab.h>
+#include <linux/mutex.h>
+#include <linux/kvm_host.h>
+#include <linux/errno.h>
+#include <linux/fs.h>
+#include <linux/anon_inodes.h>
+#include <asm/uaccess.h>
+#include <asm/mpic.h>
+#include <asm/kvm_para.h>
+#include <asm/kvm_host.h>
+#include <asm/kvm_ppc.h>
+#include "iodev.h"
+
 #define MAX_CPU     32
 #define MAX_SRC     256
 #define MAX_TMR     4
@@ -36,6 +49,7 @@
 #define OPENPIC_FLAG_ILR          (2 << 0)
 
 /* OpenPIC address map */
+#define OPENPIC_REG_SIZE             0x40000
 #define OPENPIC_GLB_REG_START        0x0
 #define OPENPIC_GLB_REG_SIZE         0x10F0
 #define OPENPIC_TMR_REG_START        0x10F0
@@ -89,6 +103,7 @@ static struct fsl_mpic_info fsl_mpic_42 = {
 #define ILR_INTTGT_INT    0x00
 #define ILR_INTTGT_CINT   0x01	/* critical */
 #define ILR_INTTGT_MCP    0x02	/* machine check */
+#define NUM_OUTPUTS       3
 
 #define MSIIR_OFFSET       0x140
 #define MSIIR_SRS_SHIFT    29
@@ -98,18 +113,19 @@ static struct fsl_mpic_info fsl_mpic_42 = {
 
 static int get_current_cpu(void)
 {
-	CPUState *cpu_single_cpu;
-
-	if (!cpu_single_env)
-		return -1;
-
-	cpu_single_cpu = ENV_GET_CPU(cpu_single_env);
-	return cpu_single_cpu->cpu_index;
+#if defined(CONFIG_KVM) && defined(CONFIG_BOOKE)
+	struct kvm_vcpu *vcpu = current->thread.kvm_vcpu;
+	return vcpu ? vcpu->vcpu_id : -1;
+#else
+	/* XXX */
+	return -1;
+#endif
 }
 
-static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx);
-static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
-				       uint32_t val, int idx);
+static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
+				      u32 val, int idx);
+static int openpic_cpu_read_internal(void *opaque, gpa_t addr,
+				     u32 *ptr, int idx);
 
 enum irq_type {
 	IRQ_TYPE_NORMAL = 0,
@@ -131,7 +147,7 @@ struct irq_source {
 	uint32_t idr;		/* IRQ destination register */
 	uint32_t destmask;	/* bitmap of CPU destinations */
 	int last_cpu;
-	int output;		/* IRQ level, e.g. OPENPIC_OUTPUT_INT */
+	int output;		/* IRQ level, e.g. ILR_INTTGT_INT */
 	int pending;		/* TRUE if IRQ is pending */
 	enum irq_type type;
 	bool level:1;		/* level-triggered */
@@ -158,16 +174,27 @@ struct irq_source {
 #define IDR_CI      0x40000000	/* critical interrupt */
 
 struct irq_dest {
+	struct kvm_vcpu *vcpu;
+
 	int32_t ctpr;		/* CPU current task priority */
 	struct irq_queue raised;
 	struct irq_queue servicing;
-	qemu_irq *irqs;
 
 	/* Count of IRQ sources asserting on non-INT outputs */
-	uint32_t outputs_active[OPENPIC_OUTPUT_NB];
+	uint32_t outputs_active[NUM_OUTPUTS];
 };
 
 struct openpic {
+	struct kvm *kvm;
+	struct kvm_device *dev;
+	struct kvm_io_device mmio;
+	struct list_head mmio_regions;
+	atomic_t users;
+	bool mmio_mapped;
+
+	gpa_t reg_base;
+	spinlock_t lock;
+
 	/* Behavior control */
 	struct fsl_mpic_info *fsl;
 	uint32_t model;
@@ -208,6 +235,47 @@ struct openpic {
 	uint32_t irq_msi;
 };
 
+
+static void mpic_irq_raise(struct openpic *opp, struct irq_dest *dst,
+			   int output)
+{
+	struct kvm_interrupt irq = {
+		.irq = KVM_INTERRUPT_SET_LEVEL,
+	};
+
+	if (!dst->vcpu) {
+		pr_debug("%s: destination cpu %ld does not exist\n",
+			 __func__, dst - &opp->dst[0]);
+		return;
+	}
+
+	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->vcpu_id,
+		output);
+
+	if (output != ILR_INTTGT_INT)	/* TODO */
+		return;
+
+	kvm_vcpu_ioctl_interrupt(dst->vcpu, &irq);
+}
+
+static void mpic_irq_lower(struct openpic *opp, struct irq_dest *dst,
+			   int output)
+{
+	if (!dst->vcpu) {
+		pr_debug("%s: destination cpu %ld does not exist\n",
+			 __func__, dst - &opp->dst[0]);
+		return;
+	}
+
+	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->vcpu_id,
+		output);
+
+	if (output != ILR_INTTGT_INT)	/* TODO */
+		return;
+
+	kvmppc_core_dequeue_external(dst->vcpu);
+}
+
 static inline void IRQ_setbit(struct irq_queue *q, int n_IRQ)
 {
 	set_bit(n_IRQ, q->queue);
@@ -268,7 +336,7 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 	pr_debug("%s: IRQ %d active %d was %d\n",
 		__func__, n_IRQ, active, was_active);
 
-	if (src->output != OPENPIC_OUTPUT_INT) {
+	if (src->output != ILR_INTTGT_INT) {
 		pr_debug("%s: output %d irq %d active %d was %d count %d\n",
 			__func__, src->output, n_IRQ, active, was_active,
 			dst->outputs_active[src->output]);
@@ -282,14 +350,14 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 			    dst->outputs_active[src->output]++ == 0) {
 				pr_debug("%s: Raise OpenPIC output %d cpu %d irq %d\n",
 					__func__, src->output, n_CPU, n_IRQ);
-				qemu_irq_raise(dst->irqs[src->output]);
+				mpic_irq_raise(opp, dst, src->output);
 			}
 		} else {
 			if (was_active &&
 			    --dst->outputs_active[src->output] == 0) {
 				pr_debug("%s: Lower OpenPIC output %d cpu %d irq %d\n",
 					__func__, src->output, n_CPU, n_IRQ);
-				qemu_irq_lower(dst->irqs[src->output]);
+				mpic_irq_lower(opp, dst, src->output);
 			}
 		}
 
@@ -322,8 +390,7 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 		} else {
 			pr_debug("%s: Raise OpenPIC INT output cpu %d irq %d/%d\n",
 				__func__, n_CPU, n_IRQ, dst->raised.next);
-			qemu_irq_raise(opp->dst[n_CPU].
-				       irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_raise(opp, dst, ILR_INTTGT_INT);
 		}
 	} else {
 		IRQ_get_next(opp, &dst->servicing);
@@ -338,8 +405,7 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 			pr_debug("%s: IRQ %d inactive, current prio %d/%d, CPU %d\n",
 				__func__, n_IRQ, dst->ctpr,
 				dst->servicing.priority, n_CPU);
-			qemu_irq_lower(opp->dst[n_CPU].
-				       irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_lower(opp, dst, ILR_INTTGT_INT);
 		}
 	}
 }
@@ -415,8 +481,8 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 	struct irq_source *src;
 
 	if (n_IRQ >= MAX_IRQ) {
-		pr_err("%s: IRQ %d out of range\n", __func__, n_IRQ);
-		abort();
+		WARN_ONCE(1, "%s: IRQ %d out of range\n", __func__, n_IRQ);
+		return;
 	}
 
 	src = &opp->src[n_IRQ];
@@ -433,7 +499,7 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 			openpic_update_irq(opp, n_IRQ);
 		}
 
-		if (src->output != OPENPIC_OUTPUT_INT) {
+		if (src->output != ILR_INTTGT_INT) {
 			/* Edge-triggered interrupts shouldn't be used
 			 * with non-INT delivery, but just in case,
 			 * try to make it do something sane rather than
@@ -446,15 +512,13 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 	}
 }
 
-static void openpic_reset(DeviceState *d)
+static void openpic_reset(struct openpic *opp)
 {
-	struct openpic *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d));
 	int i;
 
 	opp->gcr = GCR_RESET;
 	/* Initialise controller registers */
 	opp->frr = ((opp->nb_irqs - 1) << FRR_NIRQ_SHIFT) |
-	    ((opp->nb_cpus - 1) << FRR_NCPU_SHIFT) |
 	    (opp->vid << FRR_VID_SHIFT);
 
 	opp->pir = 0;
@@ -504,7 +568,7 @@ static inline uint32_t read_IRQreg_idr(struct openpic *opp, int n_IRQ)
 static inline uint32_t read_IRQreg_ilr(struct openpic *opp, int n_IRQ)
 {
 	if (opp->flags & OPENPIC_FLAG_ILR)
-		return output_to_inttgt(opp->src[n_IRQ].output);
+		return opp->src[n_IRQ].output;
 
 	return 0xffffffff;
 }
@@ -539,7 +603,7 @@ static inline void write_IRQreg_idr(struct openpic *opp, int n_IRQ,
 					__func__);
 			}
 
-			src->output = OPENPIC_OUTPUT_CINT;
+			src->output = ILR_INTTGT_CINT;
 			src->nomask = true;
 			src->destmask = 0;
 
@@ -550,7 +614,7 @@ static inline void write_IRQreg_idr(struct openpic *opp, int n_IRQ,
 					src->destmask |= 1UL << i;
 			}
 		} else {
-			src->output = OPENPIC_OUTPUT_INT;
+			src->output = ILR_INTTGT_INT;
 			src->nomask = false;
 			src->destmask = src->idr & normal_mask;
 		}
@@ -565,7 +629,7 @@ static inline void write_IRQreg_ilr(struct openpic *opp, int n_IRQ,
 	if (opp->flags & OPENPIC_FLAG_ILR) {
 		struct irq_source *src = &opp->src[n_IRQ];
 
-		src->output = inttgt_to_output(val & ILR_INTTGT_MASK);
+		src->output = val & ILR_INTTGT_MASK;
 		pr_debug("Set ILR %d to 0x%08x, output %d\n", n_IRQ, src->idr,
 			src->output);
 
@@ -614,34 +678,23 @@ static inline void write_IRQreg_ivpr(struct openpic *opp, int n_IRQ,
 
 static void openpic_gcr_write(struct openpic *opp, uint64_t val)
 {
-	bool mpic_proxy = false;
-
 	if (val & GCR_RESET) {
-		openpic_reset(&opp->busdev.qdev);
+		openpic_reset(opp);
 		return;
 	}
 
 	opp->gcr &= ~opp->mpic_mode_mask;
 	opp->gcr |= val & opp->mpic_mode_mask;
-
-	/* Set external proxy mode */
-	if ((val & opp->mpic_mode_mask) == GCR_MODE_PROXY)
-		mpic_proxy = true;
-
-	ppce500_set_mpic_proxy(mpic_proxy);
 }
 
-static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned len)
+static int openpic_gbl_write(void *opaque, gpa_t addr, u32 val)
 {
 	struct openpic *opp = opaque;
-	struct irq_dest *dst;
-	int idx;
+	int err = 0;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= %08x\n", __func__, addr, val);
 	if (addr & 0xF)
-		return;
+		return 0;
 
 	switch (addr) {
 	case 0x00:	/* Block Revision Register1 (BRR1) is Readonly */
@@ -654,7 +707,8 @@ static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
 	case 0x90:
 	case 0xA0:
 	case 0xB0:
-		openpic_cpu_write_internal(opp, addr, val, get_current_cpu());
+		err = openpic_cpu_write_internal(opp, addr, val,
+						 get_current_cpu());
 		break;
 	case 0x1000:		/* FRR */
 		break;
@@ -664,21 +718,11 @@ static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
 	case 0x1080:		/* VIR */
 		break;
 	case 0x1090:		/* PIR */
-		for (idx = 0; idx < opp->nb_cpus; idx++) {
-			if ((val & (1 << idx)) && !(opp->pir & (1 << idx))) {
-				pr_debug("Raise OpenPIC RESET output for CPU %d\n",
-					idx);
-				dst = &opp->dst[idx];
-				qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_RESET]);
-			} else if (!(val & (1 << idx)) &&
-				   (opp->pir & (1 << idx))) {
-				pr_debug("Lower OpenPIC RESET output for CPU %d\n",
-					idx);
-				dst = &opp->dst[idx];
-				qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_RESET]);
-			}
-		}
-		opp->pir = val;
+		/*
+		 * This register is used to reset a CPU core --
+		 * let userspace handle it.
+		 */
+		err = -ENXIO;
 		break;
 	case 0x10A0:		/* IPI_IVPR */
 	case 0x10B0:
@@ -695,21 +739,25 @@ static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
 	default:
 		break;
 	}
+
+	return err;
 }
 
-static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len)
+static int openpic_gbl_read(void *opaque, gpa_t addr, u32 *ptr)
 {
 	struct openpic *opp = opaque;
-	uint32_t retval;
+	u32 retval;
+	int err = 0;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 	retval = 0xFFFFFFFF;
 	if (addr & 0xF)
-		return retval;
+		goto out;
 
 	switch (addr) {
 	case 0x1000:		/* FRR */
 		retval = opp->frr;
+		retval |= (opp->nb_cpus - 1) << FRR_NCPU_SHIFT;
 		break;
 	case 0x1020:		/* GCR */
 		retval = opp->gcr;
@@ -731,8 +779,8 @@ static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len)
 	case 0x90:
 	case 0xA0:
 	case 0xB0:
-		retval =
-		    openpic_cpu_read_internal(opp, addr, get_current_cpu());
+		err = openpic_cpu_read_internal(opp, addr,
+			&retval, get_current_cpu());
 		break;
 	case 0x10A0:		/* IPI_IVPR */
 	case 0x10B0:
@@ -750,28 +798,28 @@ static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len)
 	default:
 		break;
 	}
-	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
-	return retval;
+out:
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
+	*ptr = retval;
+	return err;
 }
 
-static void openpic_tmr_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned len)
+static int openpic_tmr_write(void *opaque, gpa_t addr, u32 val)
 {
 	struct openpic *opp = opaque;
 	int idx;
 
 	addr += 0x10f0;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= %08x\n", __func__, addr, val);
 	if (addr & 0xF)
-		return;
+		return 0;
 
 	if (addr == 0x10f0) {
 		/* TFRR */
 		opp->tfrr = val;
-		return;
+		return 0;
 	}
 
 	idx = (addr >> 6) & 0x3;
@@ -795,15 +843,17 @@ static void openpic_tmr_write(void *opaque, gpa_t addr, uint64_t val,
 		write_IRQreg_idr(opp, opp->irq_tim0 + idx, val);
 		break;
 	}
+
+	return 0;
 }
 
-static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len)
+static int openpic_tmr_read(void *opaque, gpa_t addr, u32 *ptr)
 {
 	struct openpic *opp = opaque;
 	uint32_t retval = -1;
 	int idx;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 	if (addr & 0xF)
 		goto out;
 
@@ -813,6 +863,7 @@ static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len)
 		retval = opp->tfrr;
 		goto out;
 	}
+
 	switch (addr & 0x30) {
 	case 0x00:		/* TCCR */
 		retval = opp->timers[idx].tccr;
@@ -830,18 +881,16 @@ static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len)
 
 out:
 	pr_debug("%s: => 0x%08x\n", __func__, retval);
-
-	return retval;
+	*ptr = retval;
+	return 0;
 }
 
-static void openpic_src_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned len)
+static int openpic_src_write(void *opaque, gpa_t addr, u32 val)
 {
 	struct openpic *opp = opaque;
 	int idx;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= %08x\n", __func__, addr, val);
 
 	addr = addr & 0xffff;
 	idx = addr >> 5;
@@ -857,15 +906,17 @@ static void openpic_src_write(void *opaque, gpa_t addr, uint64_t val,
 		write_IRQreg_ilr(opp, idx, val);
 		break;
 	}
+
+	return 0;
 }
 
-static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
+static int openpic_src_read(void *opaque, gpa_t addr, u32 *ptr)
 {
 	struct openpic *opp = opaque;
 	uint32_t retval;
 	int idx;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 	retval = 0xFFFFFFFF;
 
 	addr = addr & 0xffff;
@@ -884,20 +935,19 @@ static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
 	}
 
 	pr_debug("%s: => 0x%08x\n", __func__, retval);
-	return retval;
+	*ptr = retval;
+	return 0;
 }
 
-static void openpic_msi_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned size)
+static int openpic_msi_write(void *opaque, gpa_t addr, u32 val)
 {
 	struct openpic *opp = opaque;
 	int idx = opp->irq_msi;
 	int srs, ibs;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= 0x%08x\n", __func__, addr, val);
 	if (addr & 0xF)
-		return;
+		return 0;
 
 	switch (addr) {
 	case MSIIR_OFFSET:
@@ -911,17 +961,19 @@ static void openpic_msi_write(void *opaque, gpa_t addr, uint64_t val,
 		/* most registers are read-only, thus ignored */
 		break;
 	}
+
+	return 0;
 }
 
-static uint64_t openpic_msi_read(void *opaque, gpa_t addr, unsigned size)
+static int openpic_msi_read(void *opaque, gpa_t addr, u32 *ptr)
 {
 	struct openpic *opp = opaque;
-	uint64_t r = 0;
+	uint32_t r = 0;
 	int i, srs;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 	if (addr & 0xF)
-		return -1;
+		return -ENXIO;
 
 	srs = addr >> 4;
 
@@ -945,45 +997,47 @@ static uint64_t openpic_msi_read(void *opaque, gpa_t addr, unsigned size)
 		break;
 	}
 
-	return r;
+	pr_debug("%s: => 0x%08x\n", __func__, r);
+	*ptr = r;
+	return 0;
 }
 
-static uint64_t openpic_summary_read(void *opaque, gpa_t addr, unsigned size)
+static int openpic_summary_read(void *opaque, gpa_t addr, u32 *ptr)
 {
-	uint64_t r = 0;
+	uint32_t r = 0;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 
 	/* TODO: EISR/EIMR */
 
-	return r;
+	*ptr = r;
+	return 0;
 }
 
-static void openpic_summary_write(void *opaque, gpa_t addr, uint64_t val,
-				  unsigned size)
+static int openpic_summary_write(void *opaque, gpa_t addr, u32 val)
 {
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= 0x%08x\n", __func__, addr, val);
 
 	/* TODO: EISR/EIMR */
+	return 0;
 }
 
-static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
-				       uint32_t val, int idx)
+static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
+				      u32 val, int idx)
 {
 	struct openpic *opp = opaque;
 	struct irq_source *src;
 	struct irq_dest *dst;
 	int s_IRQ, n_IRQ;
 
-	pr_debug("%s: cpu %d addr %#" HWADDR_PRIx " <= 0x%08x\n", __func__, idx,
+	pr_debug("%s: cpu %d addr %#llx <= 0x%08x\n", __func__, idx,
 		addr, val);
 
 	if (idx < 0)
-		return;
+		return 0;
 
 	if (addr & 0xF)
-		return;
+		return 0;
 
 	dst = &opp->dst[idx];
 	addr &= 0xFF0;
@@ -1008,11 +1062,11 @@ static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
 		if (dst->raised.priority <= dst->ctpr) {
 			pr_debug("%s: Lower OpenPIC INT output cpu %d due to ctpr\n",
 				__func__, idx);
-			qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_lower(opp, dst, ILR_INTTGT_INT);
 		} else if (dst->raised.priority > dst->servicing.priority) {
 			pr_debug("%s: Raise OpenPIC INT output cpu %d irq %d\n",
 				__func__, idx, dst->raised.next);
-			qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_raise(opp, dst, ILR_INTTGT_INT);
 		}
 
 		break;
@@ -1043,18 +1097,22 @@ static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
 		     IVPR_PRIORITY(src->ivpr) > dst->servicing.priority)) {
 			pr_debug("Raise OpenPIC INT output cpu %d irq %d\n",
 				idx, n_IRQ);
-			qemu_irq_raise(opp->dst[idx].irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_raise(opp, dst, ILR_INTTGT_INT);
 		}
 		break;
 	default:
 		break;
 	}
+
+	return 0;
 }
 
-static void openpic_cpu_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned len)
+static int openpic_cpu_write(void *opaque, gpa_t addr, u32 val)
 {
-	openpic_cpu_write_internal(opaque, addr, val, (addr & 0x1f000) >> 12);
+	struct openpic *opp = opaque;
+
+	return openpic_cpu_write_internal(opp, addr, val,
+					 (addr & 0x1f000) >> 12);
 }
 
 static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
@@ -1064,7 +1122,7 @@ static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
 	int retval, irq;
 
 	pr_debug("Lower OpenPIC INT output\n");
-	qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
+	mpic_irq_lower(opp, dst, ILR_INTTGT_INT);
 
 	irq = IRQ_get_next(opp, &dst->raised);
 	pr_debug("IACK: irq=%d\n", irq);
@@ -1107,20 +1165,21 @@ static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
 	return retval;
 }
 
-static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx)
+static int openpic_cpu_read_internal(void *opaque, gpa_t addr,
+				     u32 *ptr, int idx)
 {
 	struct openpic *opp = opaque;
 	struct irq_dest *dst;
 	uint32_t retval;
 
-	pr_debug("%s: cpu %d addr %#" HWADDR_PRIx "\n", __func__, idx, addr);
+	pr_debug("%s: cpu %d addr %#llx\n", __func__, idx, addr);
 	retval = 0xFFFFFFFF;
 
 	if (idx < 0)
-		return retval;
+		goto out;
 
 	if (addr & 0xF)
-		return retval;
+		goto out;
 
 	dst = &opp->dst[idx];
 	addr &= 0xFF0;
@@ -1142,49 +1201,67 @@ static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx)
 	}
 	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
-	return retval;
+out:
+	*ptr = retval;
+	return 0;
 }
 
-static uint64_t openpic_cpu_read(void *opaque, gpa_t addr, unsigned len)
+static int openpic_cpu_read(void *opaque, gpa_t addr, u32 *ptr)
 {
-	return openpic_cpu_read_internal(opaque, addr, (addr & 0x1f000) >> 12);
+	struct openpic *opp = opaque;
+
+	return openpic_cpu_read_internal(opp, addr, ptr,
+					 (addr & 0x1f000) >> 12);
 }
 
-static const struct kvm_io_device_ops openpic_glb_ops_be = {
+struct mem_reg {
+	struct list_head list;
+	int (*read)(void *opaque, gpa_t addr, u32 *ptr);
+	int (*write)(void *opaque, gpa_t addr, u32 val);
+	gpa_t start_addr;
+	int size;
+};
+
+static struct mem_reg openpic_gbl_mmio = {
 	.write = openpic_gbl_write,
 	.read = openpic_gbl_read,
+	.start_addr = OPENPIC_GLB_REG_START,
+	.size = OPENPIC_GLB_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_tmr_ops_be = {
+static struct mem_reg openpic_tmr_mmio = {
 	.write = openpic_tmr_write,
 	.read = openpic_tmr_read,
+	.start_addr = OPENPIC_TMR_REG_START,
+	.size = OPENPIC_TMR_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_cpu_ops_be = {
+static struct mem_reg openpic_cpu_mmio = {
 	.write = openpic_cpu_write,
 	.read = openpic_cpu_read,
+	.start_addr = OPENPIC_CPU_REG_START,
+	.size = OPENPIC_CPU_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_src_ops_be = {
+static struct mem_reg openpic_src_mmio = {
 	.write = openpic_src_write,
 	.read = openpic_src_read,
+	.start_addr = OPENPIC_SRC_REG_START,
+	.size = OPENPIC_SRC_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_msi_ops_be = {
+static struct mem_reg openpic_msi_mmio = {
 	.read = openpic_msi_read,
 	.write = openpic_msi_write,
+	.start_addr = OPENPIC_MSI_REG_START,
+	.size = OPENPIC_MSI_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_summary_ops_be = {
+static struct mem_reg openpic_summary_mmio = {
 	.read = openpic_summary_read,
 	.write = openpic_summary_write,
-};
-
-struct mem_reg {
-	const char *name;
-	const struct kvm_io_device_ops *ops;
-	gpa_t start_addr;
-	int size;
+	.start_addr = OPENPIC_SUMMARY_REG_START,
+	.size = OPENPIC_SUMMARY_REG_SIZE,
 };
 
 static void fsl_common_init(struct openpic *opp)
@@ -1192,6 +1269,9 @@ static void fsl_common_init(struct openpic *opp)
 	int i;
 	int virq = MAX_SRC;
 
+	list_add(&openpic_msi_mmio.list, &opp->mmio_regions);
+	list_add(&openpic_summary_mmio.list, &opp->mmio_regions);
+
 	opp->vid = VID_REVISION_1_2;
 	opp->vir = VIR_GENERIC;
 	opp->vector_mask = 0xFFFF;
@@ -1205,11 +1285,10 @@ static void fsl_common_init(struct openpic *opp)
 	opp->irq_tim0 = virq;
 	virq += MAX_TMR;
 
-	assert(virq <= MAX_IRQ);
+	BUG_ON(virq > MAX_IRQ);
 
 	opp->irq_msi = 224;
 
-	msi_supported = true;
 	for (i = 0; i < opp->fsl->max_ext; i++)
 		opp->src[i].level = false;
 
@@ -1226,63 +1305,352 @@ static void fsl_common_init(struct openpic *opp)
 	}
 }
 
-static void map_list(struct openpic *opp, const struct mem_reg *list,
-		     int *count)
+static int kvm_mpic_read_internal(struct openpic *opp, gpa_t addr, u32 *ptr)
 {
-	while (list->name) {
-		assert(*count < ARRAY_SIZE(opp->sub_io_mem));
+	struct list_head *node;
 
-		memory_region_init_io(&opp->sub_io_mem[*count], list->ops, opp,
-				      list->name, list->size);
+	list_for_each(node, &opp->mmio_regions) {
+		struct mem_reg *mr = list_entry(node, struct mem_reg, list);
 
-		memory_region_add_subregion(&opp->mem, list->start_addr,
-					    &opp->sub_io_mem[*count]);
+		if (mr->start_addr > addr || addr >= mr->start_addr + mr->size)
+			continue;
 
-		(*count)++;
-		list++;
+		return mr->read(opp, addr - mr->start_addr, ptr);
 	}
+
+	return -ENXIO;
 }
 
-static int openpic_init(SysBusDevice *dev)
+static int kvm_mpic_write_internal(struct openpic *opp, gpa_t addr, u32 val)
 {
-	struct openpic *opp = FROM_SYSBUS(typeof(*opp), dev);
-	int i, j;
-	int list_count = 0;
-	static const struct mem_reg list_le[] = {
-		{"glb", &openpic_glb_ops_le,
-		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
-		{"tmr", &openpic_tmr_ops_le,
-		 OPENPIC_TMR_REG_START, OPENPIC_TMR_REG_SIZE},
-		{"src", &openpic_src_ops_le,
-		 OPENPIC_SRC_REG_START, OPENPIC_SRC_REG_SIZE},
-		{"cpu", &openpic_cpu_ops_le,
-		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
-		{NULL}
-	};
-	static const struct mem_reg list_be[] = {
-		{"glb", &openpic_glb_ops_be,
-		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
-		{"tmr", &openpic_tmr_ops_be,
-		 OPENPIC_TMR_REG_START, OPENPIC_TMR_REG_SIZE},
-		{"src", &openpic_src_ops_be,
-		 OPENPIC_SRC_REG_START, OPENPIC_SRC_REG_SIZE},
-		{"cpu", &openpic_cpu_ops_be,
-		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
-		{NULL}
-	};
-	static const struct mem_reg list_fsl[] = {
-		{"msi", &openpic_msi_ops_be,
-		 OPENPIC_MSI_REG_START, OPENPIC_MSI_REG_SIZE},
-		{"summary", &openpic_summary_ops_be,
-		 OPENPIC_SUMMARY_REG_START, OPENPIC_SUMMARY_REG_SIZE},
-		{NULL}
-	};
+	struct list_head *node;
+
+	list_for_each(node, &opp->mmio_regions) {
+		struct mem_reg *mr = list_entry(node, struct mem_reg, list);
+
+		if (mr->start_addr > addr || addr >= mr->start_addr + mr->size)
+			continue;
 
-	memory_region_init(&opp->mem, "openpic", 0x40000);
+		return mr->write(opp, addr - mr->start_addr, val);
+	}
+
+	return -ENXIO;
+}
+
+static int kvm_mpic_read(struct kvm_io_device *this, gpa_t addr,
+			 int len, void *ptr)
+{
+	struct openpic *opp = container_of(this, struct openpic, mmio);
+	int ret;
+	union {
+		u32 val;
+		u8 bytes[4];
+	} u;
+
+	if (addr & (len - 1)) {
+		pr_debug("%s: bad alignment %llx/%d\n",
+			 __func__, addr, len);
+		return -EINVAL;
+	}
+
+	spin_lock_irq(&opp->lock);
+	ret = kvm_mpic_read_internal(opp, addr - opp->reg_base, &u.val);
+	spin_unlock_irq(&opp->lock);
+
+	/*
+	 * Technically only 32-bit accesses are allowed, but be nice to
+	 * people dumping registers a byte at a time -- it works in real
+	 * hardware (reads only, not writes).
+	 */
+	if (len == 4) {
+		*(u32 *)ptr = u.val;
+		pr_debug("%s: addr %llx ret %d len 4 val %x\n",
+			 __func__, addr, ret, u.val);
+	} else if (len == 1) {
+		*(u8 *)ptr = u.bytes[addr & 3];
+		pr_debug("%s: addr %llx ret %d len 1 val %x\n",
+			 __func__, addr, ret, u.bytes[addr & 3]);
+	} else {
+		pr_debug("%s: bad length %d\n", __func__, len);
+		return -EINVAL;
+	}
+
+	return ret;
+}
+
+static int kvm_mpic_write(struct kvm_io_device *this, gpa_t addr,
+			  int len, const void *ptr)
+{
+	struct openpic *opp = container_of(this, struct openpic, mmio);
+	int ret;
+
+	if (len != 4) {
+		pr_debug("%s: bad length %d\n", __func__, len);
+		return -EOPNOTSUPP;
+	}
+	if (addr & 3) {
+		pr_debug("%s: bad alignment %llx/%d\n", __func__, addr, len);
+		return -EOPNOTSUPP;
+	}
+
+	spin_lock_irq(&opp->lock);
+	ret = kvm_mpic_write_internal(opp, addr - opp->reg_base,
+				      *(const u32 *)ptr);
+	spin_unlock_irq(&opp->lock);
+
+	pr_debug("%s: addr %llx ret %d val %x\n",
+		 __func__, addr, ret, *(const u32 *)ptr);
+
+	return ret;
+}
+
+static void kvm_mpic_dtor(struct kvm_io_device *this)
+{
+	struct openpic *opp = container_of(this, struct openpic, mmio);
+
+	opp->mmio_mapped = false;
+}
+
+static const struct kvm_io_device_ops mpic_mmio_ops = {
+	.read = kvm_mpic_read,
+	.write = kvm_mpic_write,
+	.destructor = kvm_mpic_dtor,
+};
+
+static void map_mmio(struct openpic *opp)
+{
+	BUG_ON(opp->mmio_mapped);
+	opp->mmio_mapped = true;
+
+	kvm_iodevice_init(&opp->mmio, &mpic_mmio_ops);
+
+	kvm_io_bus_register_dev(opp->kvm, KVM_MMIO_BUS,
+				opp->reg_base, OPENPIC_REG_SIZE,
+				&opp->mmio);
+}
+
+static void unmap_mmio(struct openpic *opp)
+{
+	if (opp->mmio_mapped) {
+		opp->mmio_mapped = false;
+		kvm_io_bus_unregister_dev(opp->kvm, KVM_MMIO_BUS, &opp->mmio);
+	}
+}
+
+static int set_base_addr(struct openpic *opp, struct kvm_device_attr *attr)
+{
+	u64 base;
+
+	if (copy_from_user(&base, (u64 __user *)(long)attr->addr, sizeof(u64)))
+		return -EFAULT;
+
+	if (base & 0x3ffff) {
+		pr_debug("kvm mpic %s: KVM_DEV_MPIC_BASE_ADDR %08llx not aligned\n",
+			 __func__, base);
+		return -EINVAL;
+	}
+
+	if (base == opp->reg_base)
+		return 0;
+
+	mutex_lock(&opp->kvm->slots_lock);
+
+	unmap_mmio(opp);
+	opp->reg_base = base;
+
+	pr_debug("kvm mpic %s: KVM_DEV_MPIC_BASE_ADDR %08llx\n",
+		 __func__, base);
+
+	if (base == 0)
+		goto out;
+
+	map_mmio(opp);
+
+out:
+	mutex_unlock(&opp->kvm->slots_lock);
+	return 0;
+}
+
+#define ATTR_SET		0
+#define ATTR_GET		1
+
+static int access_reg(struct openpic *opp, gpa_t addr, u32 *val, int type)
+{
+	int ret;
+
+	if (addr & 3)
+		return -ENXIO;
+
+	spin_lock_irq(&opp->lock);
+
+	if (type == ATTR_SET)
+		ret = kvm_mpic_write_internal(opp, addr, *val);
+	else
+		ret = kvm_mpic_read_internal(opp, addr, val);
+
+	spin_unlock_irq(&opp->lock);
+
+	pr_debug("%s: type %d addr %llx val %x\n", __func__, type, addr, *val);
+
+	return ret;
+}
+
+static int mpic_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
+{
+	struct openpic *opp = dev->private;
+	u32 attr32;
+
+	switch (attr->group) {
+	case KVM_DEV_MPIC_GRP_MISC:
+		switch (attr->attr) {
+		case KVM_DEV_MPIC_BASE_ADDR:
+			return set_base_addr(opp, attr);
+		}
+
+		break;
+
+	case KVM_DEV_MPIC_GRP_REGISTER:
+		if (get_user(attr32, (u32 __user *)(long)attr->addr))
+			return -EFAULT;
+
+		return access_reg(opp, attr->attr, &attr32, ATTR_SET);
+
+	case KVM_DEV_MPIC_GRP_IRQ_ACTIVE:
+		if (attr->attr > MAX_SRC)
+			return -EINVAL;
+
+		if (get_user(attr32, (u32 __user *)(long)attr->addr))
+			return -EFAULT;
+
+		if (attr32 != 0 && attr32 != 1)
+			return -EINVAL;
+
+		spin_lock_irq(&opp->lock);
+		openpic_set_irq(opp, attr->attr, attr32);
+		spin_unlock_irq(&opp->lock);
+		return 0;
+	}
+
+	return -ENXIO;
+}
+
+static int mpic_get_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
+{
+	struct openpic *opp = dev->private;
+	u64 attr64;
+	u32 attr32;
+	int ret;
+
+	switch (attr->group) {
+	case KVM_DEV_MPIC_GRP_MISC:
+		switch (attr->attr) {
+		case KVM_DEV_MPIC_BASE_ADDR:
+			mutex_lock(&opp->kvm->slots_lock);
+			attr64 = opp->reg_base;
+			mutex_unlock(&opp->kvm->slots_lock);
+
+			if (copy_to_user((u64 __user *)(long)attr->addr,
+					 &attr64, sizeof(u64)))
+				return -EFAULT;
+
+			return 0;
+		}
+
+		break;
+
+	case KVM_DEV_MPIC_GRP_REGISTER:
+		ret = access_reg(opp, attr->attr, &attr32, ATTR_GET);
+		if (ret)
+			return ret;
+
+		if (put_user(attr32, (u32 __user *)(long)attr->addr))
+			return -EFAULT;
+
+		return 0;
+
+	case KVM_DEV_MPIC_GRP_IRQ_ACTIVE:
+		if (attr->attr > MAX_SRC)
+			return -EINVAL;
+
+		spin_lock_irq(&opp->lock);
+		attr32 = opp->src[attr->attr].pending;
+		spin_unlock_irq(&opp->lock);
+
+		if (put_user(attr32, (u32 __user *)(long)attr->addr))
+			return -EFAULT;
+
+		return 0;
+	}
+
+	return -ENXIO;
+}
+
+static int mpic_has_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
+{
+	switch (attr->group) {
+	case KVM_DEV_MPIC_GRP_MISC:
+		switch (attr->attr) {
+		case KVM_DEV_MPIC_BASE_ADDR:
+			return 0;
+		}
+
+		break;
+
+	case KVM_DEV_MPIC_GRP_REGISTER:
+		return 0;
+
+	case KVM_DEV_MPIC_GRP_IRQ_ACTIVE:
+		if (attr->attr > MAX_SRC)
+			break;
+
+		return 0;
+	}
+
+	return -ENXIO;
+}
+
+static void mpic_destroy(struct kvm_device *dev)
+{
+	struct openpic *opp = dev->private;
+
+	if (opp->mmio_mapped) {
+		/*
+		 * Normally we get unmapped by kvm_io_bus_destroy(),
+		 * which happens before the VCPUs release their references.
+		 *
+		 * Thus, we should only get here if no VCPUs took a reference
+		 * to us in the first place.
+		 */
+		WARN_ON(opp->nb_cpus != 0);
+		unmap_mmio(opp);
+	}
+
+	kfree(opp);
+}
+
+static int mpic_create(struct kvm_device *dev, u32 type)
+{
+	struct openpic *opp;
+	int ret;
+
+	opp = kzalloc(sizeof(struct openpic), GFP_KERNEL);
+	if (!opp)
+		return -ENOMEM;
+
+	dev->private = opp;
+	opp->kvm = dev->kvm;
+	opp->dev = dev;
+	opp->model = type;
+	spin_lock_init(&opp->lock);
+
+	INIT_LIST_HEAD(&opp->mmio_regions);
+	list_add(&openpic_gbl_mmio.list, &opp->mmio_regions);
+	list_add(&openpic_tmr_mmio.list, &opp->mmio_regions);
+	list_add(&openpic_src_mmio.list, &opp->mmio_regions);
+	list_add(&openpic_cpu_mmio.list, &opp->mmio_regions);
 
 	switch (opp->model) {
-	case OPENPIC_MODEL_FSL_MPIC_20:
-	default:
+	case KVM_DEV_TYPE_FSL_MPIC_20:
 		opp->fsl = &fsl_mpic_20;
 		opp->brr1 = 0x00400200;
 		opp->flags |= OPENPIC_FLAG_IDR_CRIT;
@@ -1290,12 +1658,10 @@ static int openpic_init(SysBusDevice *dev)
 		opp->mpic_mode_mask = GCR_MODE_MIXED;
 
 		fsl_common_init(opp);
-		map_list(opp, list_be, &list_count);
-		map_list(opp, list_fsl, &list_count);
 
 		break;
 
-	case OPENPIC_MODEL_FSL_MPIC_42:
+	case KVM_DEV_TYPE_FSL_MPIC_42:
 		opp->fsl = &fsl_mpic_42;
 		opp->brr1 = 0x00400402;
 		opp->flags |= OPENPIC_FLAG_ILR;
@@ -1303,11 +1669,27 @@ static int openpic_init(SysBusDevice *dev)
 		opp->mpic_mode_mask = GCR_MODE_PROXY;
 
 		fsl_common_init(opp);
-		map_list(opp, list_be, &list_count);
-		map_list(opp, list_fsl, &list_count);
 
 		break;
+
+	default:
+		ret = -ENODEV;
+		goto err;
 	}
 
+	openpic_reset(opp);
 	return 0;
+
+err:
+	kfree(opp);
+	return ret;
 }
+
+struct kvm_device_ops kvm_mpic_ops = {
+	.name = "kvm-mpic",
+	.create = mpic_create,
+	.destroy = mpic_destroy,
+	.set_attr = mpic_set_attr,
+	.get_attr = mpic_get_attr,
+	.has_attr = mpic_has_attr,
+};
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index a822659..3cad714 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -317,6 +317,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_ENABLE_CAP:
 	case KVM_CAP_ONE_REG:
 	case KVM_CAP_IOEVENTFD:
+	case KVM_CAP_DEVICE_CTRL:
 		r = 1;
 		break;
 #ifndef CONFIG_KVM_BOOK3S_64_HV
@@ -768,7 +769,10 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 		break;
 	case KVM_CAP_PPC_EPR:
 		r = 0;
-		vcpu->arch.epr_enabled = cap->args[0];
+		if (cap->args[0])
+			vcpu->arch.epr_flags |= KVMPPC_EPR_USER;
+		else
+			vcpu->arch.epr_flags &= ~KVMPPC_EPR_USER;
 		break;
 #ifdef CONFIG_BOOKE
 	case KVM_CAP_PPC_BOOKE_WATCHDOG:
@@ -914,6 +918,7 @@ static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
 long kvm_arch_vm_ioctl(struct file *filp,
                        unsigned int ioctl, unsigned long arg)
 {
+	struct kvm *kvm __maybe_unused = filp->private_data;
 	void __user *argp = (void __user *)arg;
 	long r;
 
@@ -932,7 +937,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 #ifdef CONFIG_PPC_BOOK3S_64
 	case KVM_CREATE_SPAPR_TCE: {
 		struct kvm_create_spapr_tce create_tce;
-		struct kvm *kvm = filp->private_data;
 
 		r = -EFAULT;
 		if (copy_from_user(&create_tce, argp, sizeof(create_tce)))
@@ -944,7 +948,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 
 #ifdef CONFIG_KVM_BOOK3S_64_HV
 	case KVM_ALLOCATE_RMA: {
-		struct kvm *kvm = filp->private_data;
 		struct kvm_allocate_rma rma;
 
 		r = kvm_vm_ioctl_allocate_rma(kvm, &rma);
@@ -954,7 +957,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	}
 
 	case KVM_PPC_ALLOCATE_HTAB: {
-		struct kvm *kvm = filp->private_data;
 		u32 htab_order;
 
 		r = -EFAULT;
@@ -971,7 +973,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	}
 
 	case KVM_PPC_GET_HTAB_FD: {
-		struct kvm *kvm = filp->private_data;
 		struct kvm_get_htab_fd ghf;
 
 		r = -EFAULT;
@@ -984,7 +985,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 
 #ifdef CONFIG_PPC_BOOK3S_64
 	case KVM_PPC_GET_SMMU_INFO: {
-		struct kvm *kvm = filp->private_data;
 		struct kvm_ppc_smmu_info info;
 
 		memset(&info, 0, sizeof(info));
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 6dab6b5..feffbda 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1099,6 +1099,8 @@ void kvm_device_get(struct kvm_device *dev);
 void kvm_device_put(struct kvm_device *dev);
 struct kvm_device *kvm_device_from_filp(struct file *filp);
 
+extern struct kvm_device_ops kvm_mpic_ops;
+
 #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
 
 static inline void kvm_vcpu_set_in_spin_loop(struct kvm_vcpu *vcpu, bool val)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index be15aff..568d65d 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -925,6 +925,9 @@ struct kvm_device_attr {
 	__u64	addr;		/* userspace address of attr data */
 };
 
+#define KVM_DEV_TYPE_FSL_MPIC_20	1
+#define KVM_DEV_TYPE_FSL_MPIC_42	2
+
 /* ioctl for vm fd */
 #define KVM_CREATE_DEVICE	  _IOWR(KVMIO,  0xe0, struct kvm_create_device)
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 5f0d78c..f6cd14d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2238,6 +2238,12 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
 	int ret;
 
 	switch (cd->type) {
+#ifdef CONFIG_KVM_MPIC
+	case KVM_DEV_TYPE_FSL_MPIC_20:
+	case KVM_DEV_TYPE_FSL_MPIC_42:
+		ops = &kvm_mpic_ops;
+		break;
+#endif
 	default:
 		return -ENODEV;
 	}
-- 
1.6.0.2
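
For readers following the MMIO dispatch in kvm_mpic_read_internal() and
kvm_mpic_write_internal() above: the lookup walks the registered regions
and hands the access to the first one whose [start, start + size) range
contains the offset, returning -ENXIO if none does. A minimal standalone
sketch of that range test follows. The struct and function names here are
illustrative stand-ins, not the patch's own (the patch uses struct mem_reg
on a list_head), and the region bounds in any usage are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for the patch's struct mem_reg, minus list linkage. */
struct region {
	uint64_t start;	/* offset of the region from the MPIC base */
	uint64_t size;	/* length of the region in bytes */
};

/*
 * Return the index of the first region containing addr, or -1 if no
 * region matches (the patch returns -ENXIO in that case).  The range
 * test is the same half-open check used in the patch.
 */
static int find_region(const struct region *r, size_t n, uint64_t addr)
{
	size_t i;

	assert(r != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (r[i].start > addr || addr >= r[i].start + r[i].size)
			continue;

		return (int)i;
	}

	return -1;
}
```

Because the range is half-open, an offset equal to start + size belongs
to the next region (if any), so adjacent regions never both match.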


* [PATCH 13/17] kvm/ppc/mpic: in-kernel MPIC emulation
@ 2013-04-18 14:11   ` Alexander Graf
  0 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

Hook the MPIC code up to the KVM interfaces, add locking, etc.

Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: add stub function for kvmppc_mpic_set_epr, non-booke, 64bit]
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 Documentation/virtual/kvm/devices/mpic.txt |   37 ++
 arch/powerpc/include/asm/kvm_host.h        |    8 +-
 arch/powerpc/include/asm/kvm_ppc.h         |   17 +
 arch/powerpc/include/uapi/asm/kvm.h        |    7 +
 arch/powerpc/kvm/Kconfig                   |    9 +
 arch/powerpc/kvm/Makefile                  |    2 +
 arch/powerpc/kvm/booke.c                   |    8 +-
 arch/powerpc/kvm/mpic.c                    |  762 +++++++++++++++++++++-------
 arch/powerpc/kvm/powerpc.c                 |   12 +-
 include/linux/kvm_host.h                   |    2 +
 include/uapi/linux/kvm.h                   |    3 +
 virt/kvm/kvm_main.c                        |    6 +
 12 files changed, 673 insertions(+), 200 deletions(-)
 create mode 100644 Documentation/virtual/kvm/devices/mpic.txt
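
Note for userspace authors: the attribute rules documented in mpic.txt
in this patch (base address naturally aligned to the 256 KiB register
space, 4-byte aligned register offsets, IRQ number derived from the IVPR
offset) can be checked before issuing the ioctl. A rough sketch is below.
The helper names and the MPIC_REG_SPACE_SIZE macro are hypothetical and
not part of the patch or of any KVM header; only the numeric rules come
from the documentation text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 256 KiB register space, per mpic.txt (hypothetical macro name). */
#define MPIC_REG_SPACE_SIZE	0x40000ULL

/*
 * Hypothetical helper: is base acceptable for KVM_DEV_MPIC_BASE_ADDR?
 * It must be naturally aligned; zero is valid and disables the mapping.
 */
static bool mpic_base_addr_ok(uint64_t base)
{
	return (base & (MPIC_REG_SPACE_SIZE - 1)) == 0;
}

/*
 * Hypothetical helper: is offset usable as "attr" for
 * KVM_DEV_MPIC_GRP_REGISTER?  Only the 4-byte alignment rule is checked.
 */
static bool mpic_reg_offset_ok(uint64_t offset)
{
	return (offset & 3) == 0;
}

/*
 * Hypothetical helper: IRQ number for KVM_DEV_MPIC_GRP_IRQ_ACTIVE,
 * given the byte offset of the source's IVPR from EIVPR0.
 */
static unsigned int mpic_irq_from_ivpr_offset(uint32_t offset)
{
	assert((offset & 0x1f) == 0);	/* IVPRs sit on a 32-byte stride */
	return offset / 32;
}
```

The kernel performs the alignment checks itself and returns an error on
a bad value, so helpers like these only serve to fail early with a
clearer message.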

diff --git a/Documentation/virtual/kvm/devices/mpic.txt b/Documentation/virtual/kvm/devices/mpic.txt
new file mode 100644
index 0000000..ce98e32
--- /dev/null
+++ b/Documentation/virtual/kvm/devices/mpic.txt
@@ -0,0 +1,37 @@
+MPIC interrupt controller
+=========================
+
+Device types supported:
+  KVM_DEV_TYPE_FSL_MPIC_20     Freescale MPIC v2.0
+  KVM_DEV_TYPE_FSL_MPIC_42     Freescale MPIC v4.2
+
+Only one MPIC instance, of any type, may be instantiated.  The created
+MPIC will act as the system interrupt controller, connecting to each
+vcpu's interrupt inputs.
+
+Groups:
+  KVM_DEV_MPIC_GRP_MISC
+  Attributes:
+    KVM_DEV_MPIC_BASE_ADDR (rw, 64-bit)
+      Base address of the 256 KiB MPIC register space.  Must be
+      naturally aligned.  A value of zero disables the mapping.
+      Reset value is zero.
+
+  KVM_DEV_MPIC_GRP_REGISTER (rw, 32-bit)
+    Access an MPIC register, as if the access were made from the guest.
+    "attr" is the byte offset into the MPIC register space.  Accesses
+    must be 4-byte aligned.
+
+    MSIs may be signaled by using this attribute group to write
+    to the relevant MSIIR.
+
+  KVM_DEV_MPIC_GRP_IRQ_ACTIVE (rw, 32-bit)
+    IRQ input line for each standard openpic source.  0 is inactive and 1
+    is active, regardless of interrupt sense.
+
+    For edge-triggered interrupts:  Writing 1 is considered an activating
+    edge, and writing 0 is ignored.  Reading returns 1 if a previously
+    signaled edge has not been acknowledged, and 0 otherwise.
+
+    "attr" is the IRQ number.  IRQ numbers for standard sources are the
+    byte offset of the relevant IVPR from EIVPR0, divided by 32.
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index e34f8fe..7e7aef9 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -359,6 +359,11 @@ struct kvmppc_slb {
 #define KVMPPC_BOOKE_MAX_IAC	4
 #define KVMPPC_BOOKE_MAX_DAC	2
 
+/* KVMPPC_EPR_USER takes precedence over KVMPPC_EPR_KERNEL */
+#define KVMPPC_EPR_NONE		0 /* EPR not supported */
+#define KVMPPC_EPR_USER		1 /* exit to userspace to fill EPR */
+#define KVMPPC_EPR_KERNEL	2 /* in-kernel irqchip */
+
 struct kvmppc_booke_debug_reg {
 	u32 dbcr0;
 	u32 dbcr1;
@@ -522,7 +527,7 @@ struct kvm_vcpu_arch {
 	u8 sane;
 	u8 cpu_type;
 	u8 hcall_needed;
-	u8 epr_enabled;
+	u8 epr_flags; /* KVMPPC_EPR_xxx */
 	u8 epr_needed;
 
 	u32 cpr0_cfgaddr; /* holds the last set cpr0_cfgaddr */
@@ -589,5 +594,6 @@ struct kvm_vcpu_arch {
 #define KVM_MMIO_REG_FQPR	0x0060
 
 #define __KVM_HAVE_ARCH_WQP
+#define __KVM_HAVE_CREATE_DEVICE
 
 #endif /* __POWERPC_KVM_HOST_H__ */
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index f589307..0b86604 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -164,6 +164,8 @@ extern int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu);
 
 extern int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *);
 
+int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu, struct kvm_interrupt *irq);
+
 /*
  * Cuts out inst bits with ordering according to spec.
  * That means the leftmost bit is zero. All given bits are included.
@@ -245,6 +247,9 @@ int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id, union kvmppc_one_reg *);
 
 void kvmppc_set_pid(struct kvm_vcpu *vcpu, u32 pid);
 
+struct openpic;
+void kvmppc_mpic_put(struct openpic *opp);
+
 #ifdef CONFIG_KVM_BOOK3S_64_HV
 static inline void kvmppc_set_xics_phys(int cpu, unsigned long addr)
 {
@@ -270,6 +275,18 @@ static inline void kvmppc_set_epr(struct kvm_vcpu *vcpu, u32 epr)
 #endif
 }
 
+#ifdef CONFIG_KVM_MPIC
+
+void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu);
+
+#else
+
+static inline void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu)
+{
+}
+
+#endif /* CONFIG_KVM_MPIC */
+
 int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
 			      struct kvm_config_tlb *cfg);
 int kvm_vcpu_ioctl_dirty_tlb(struct kvm_vcpu *vcpu,
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index c2ff99c..36be2fe 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -426,4 +426,11 @@ struct kvm_get_htab_header {
 /* Debugging: Special instruction for software breakpoint */
 #define KVM_REG_PPC_DEBUG_INST	(KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x8b)
 
+/* Device control API: PPC-specific devices */
+#define KVM_DEV_MPIC_GRP_MISC		1
+#define   KVM_DEV_MPIC_BASE_ADDR	0	/* 64-bit */
+
+#define KVM_DEV_MPIC_GRP_REGISTER	2	/* 32-bit */
+#define KVM_DEV_MPIC_GRP_IRQ_ACTIVE	3	/* 32-bit */
+
 #endif /* __LINUX_KVM_POWERPC_H */
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 63c67ec..938a729 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -151,6 +151,15 @@ config KVM_E500MC
 
 	  If unsure, say N.
 
+config KVM_MPIC
+	bool "KVM in-kernel MPIC emulation"
+	depends on KVM
+	help
+	  Enable support for emulating MPIC devices inside the
+	  host kernel, rather than relying on userspace to emulate.
+	  Currently, support is limited to certain versions of
+	  Freescale's MPIC implementation.
+
 source drivers/vhost/Kconfig
 
 endif # VIRTUALIZATION
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index b772ede..4a2277a 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -103,6 +103,8 @@ kvm-book3s_32-objs := \
 	book3s_32_mmu.o
 kvm-objs-$(CONFIG_KVM_BOOK3S_32) := $(kvm-book3s_32-objs)
 
+kvm-objs-$(CONFIG_KVM_MPIC) += mpic.o
+
 kvm-objs := $(kvm-objs-m) $(kvm-objs-y)
 
 obj-$(CONFIG_KVM_440) += kvm.o
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index a49a68a..cff53d4 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -346,7 +346,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 		keep_irq = true;
 	}
 
-	if ((priority == BOOKE_IRQPRIO_EXTERNAL) && vcpu->arch.epr_enabled)
+	if ((priority == BOOKE_IRQPRIO_EXTERNAL) && vcpu->arch.epr_flags)
 		update_epr = true;
 
 	switch (priority) {
@@ -427,8 +427,10 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 			set_guest_esr(vcpu, vcpu->arch.queued_esr);
 		if (update_dear == true)
 			set_guest_dear(vcpu, vcpu->arch.queued_dear);
-		if (update_epr == true)
-			kvm_make_request(KVM_REQ_EPR_EXIT, vcpu);
+		if (update_epr == true) {
+			if (vcpu->arch.epr_flags & KVMPPC_EPR_USER)
+				kvm_make_request(KVM_REQ_EPR_EXIT, vcpu);
+		}
 
 		new_msr &= msr_mask;
 #if defined(CONFIG_64BIT)
diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index 1df67ae..60857d5 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -23,6 +23,19 @@
  * THE SOFTWARE.
  */
 
+#include <linux/slab.h>
+#include <linux/mutex.h>
+#include <linux/kvm_host.h>
+#include <linux/errno.h>
+#include <linux/fs.h>
+#include <linux/anon_inodes.h>
+#include <asm/uaccess.h>
+#include <asm/mpic.h>
+#include <asm/kvm_para.h>
+#include <asm/kvm_host.h>
+#include <asm/kvm_ppc.h>
+#include "iodev.h"
+
 #define MAX_CPU     32
 #define MAX_SRC     256
 #define MAX_TMR     4
@@ -36,6 +49,7 @@
 #define OPENPIC_FLAG_ILR          (2 << 0)
 
 /* OpenPIC address map */
+#define OPENPIC_REG_SIZE             0x40000
 #define OPENPIC_GLB_REG_START        0x0
 #define OPENPIC_GLB_REG_SIZE         0x10F0
 #define OPENPIC_TMR_REG_START        0x10F0
@@ -89,6 +103,7 @@ static struct fsl_mpic_info fsl_mpic_42 = {
 #define ILR_INTTGT_INT    0x00
 #define ILR_INTTGT_CINT   0x01	/* critical */
 #define ILR_INTTGT_MCP    0x02	/* machine check */
+#define NUM_OUTPUTS       3
 
 #define MSIIR_OFFSET       0x140
 #define MSIIR_SRS_SHIFT    29
@@ -98,18 +113,19 @@ static struct fsl_mpic_info fsl_mpic_42 = {
 
 static int get_current_cpu(void)
 {
-	CPUState *cpu_single_cpu;
-
-	if (!cpu_single_env)
-		return -1;
-
-	cpu_single_cpu = ENV_GET_CPU(cpu_single_env);
-	return cpu_single_cpu->cpu_index;
+#if defined(CONFIG_KVM) && defined(CONFIG_BOOKE)
+	struct kvm_vcpu *vcpu = current->thread.kvm_vcpu;
+	return vcpu ? vcpu->vcpu_id : -1;
+#else
+	/* XXX */
+	return -1;
+#endif
 }
 
-static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx);
-static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
-				       uint32_t val, int idx);
+static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
+				      u32 val, int idx);
+static int openpic_cpu_read_internal(void *opaque, gpa_t addr,
+				     u32 *ptr, int idx);
 
 enum irq_type {
 	IRQ_TYPE_NORMAL = 0,
@@ -131,7 +147,7 @@ struct irq_source {
 	uint32_t idr;		/* IRQ destination register */
 	uint32_t destmask;	/* bitmap of CPU destinations */
 	int last_cpu;
-	int output;		/* IRQ level, e.g. OPENPIC_OUTPUT_INT */
+	int output;		/* IRQ level, e.g. ILR_INTTGT_INT */
 	int pending;		/* TRUE if IRQ is pending */
 	enum irq_type type;
 	bool level:1;		/* level-triggered */
@@ -158,16 +174,27 @@ struct irq_source {
 #define IDR_CI      0x40000000	/* critical interrupt */
 
 struct irq_dest {
+	struct kvm_vcpu *vcpu;
+
 	int32_t ctpr;		/* CPU current task priority */
 	struct irq_queue raised;
 	struct irq_queue servicing;
-	qemu_irq *irqs;
 
 	/* Count of IRQ sources asserting on non-INT outputs */
-	uint32_t outputs_active[OPENPIC_OUTPUT_NB];
+	uint32_t outputs_active[NUM_OUTPUTS];
 };
 
 struct openpic {
+	struct kvm *kvm;
+	struct kvm_device *dev;
+	struct kvm_io_device mmio;
+	struct list_head mmio_regions;
+	atomic_t users;
+	bool mmio_mapped;
+
+	gpa_t reg_base;
+	spinlock_t lock;
+
 	/* Behavior control */
 	struct fsl_mpic_info *fsl;
 	uint32_t model;
@@ -208,6 +235,47 @@ struct openpic {
 	uint32_t irq_msi;
 };
 
+
+static void mpic_irq_raise(struct openpic *opp, struct irq_dest *dst,
+			   int output)
+{
+	struct kvm_interrupt irq = {
+		.irq = KVM_INTERRUPT_SET_LEVEL,
+	};
+
+	if (!dst->vcpu) {
+		pr_debug("%s: destination cpu %ld does not exist\n",
+			 __func__, dst - &opp->dst[0]);
+		return;
+	}
+
+	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->vcpu_id,
+		output);
+
+	if (output != ILR_INTTGT_INT)	/* TODO */
+		return;
+
+	kvm_vcpu_ioctl_interrupt(dst->vcpu, &irq);
+}
+
+static void mpic_irq_lower(struct openpic *opp, struct irq_dest *dst,
+			   int output)
+{
+	if (!dst->vcpu) {
+		pr_debug("%s: destination cpu %ld does not exist\n",
+			 __func__, dst - &opp->dst[0]);
+		return;
+	}
+
+	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->vcpu_id,
+		output);
+
+	if (output != ILR_INTTGT_INT)	/* TODO */
+		return;
+
+	kvmppc_core_dequeue_external(dst->vcpu);
+}
+
 static inline void IRQ_setbit(struct irq_queue *q, int n_IRQ)
 {
 	set_bit(n_IRQ, q->queue);
@@ -268,7 +336,7 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 	pr_debug("%s: IRQ %d active %d was %d\n",
 		__func__, n_IRQ, active, was_active);
 
-	if (src->output != OPENPIC_OUTPUT_INT) {
+	if (src->output != ILR_INTTGT_INT) {
 		pr_debug("%s: output %d irq %d active %d was %d count %d\n",
 			__func__, src->output, n_IRQ, active, was_active,
 			dst->outputs_active[src->output]);
@@ -282,14 +350,14 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 			    dst->outputs_active[src->output]++ == 0) {
 				pr_debug("%s: Raise OpenPIC output %d cpu %d irq %d\n",
 					__func__, src->output, n_CPU, n_IRQ);
-				qemu_irq_raise(dst->irqs[src->output]);
+				mpic_irq_raise(opp, dst, src->output);
 			}
 		} else {
 			if (was_active &&
 			    --dst->outputs_active[src->output] == 0) {
 				pr_debug("%s: Lower OpenPIC output %d cpu %d irq %d\n",
 					__func__, src->output, n_CPU, n_IRQ);
-				qemu_irq_lower(dst->irqs[src->output]);
+				mpic_irq_lower(opp, dst, src->output);
 			}
 		}
 
@@ -322,8 +390,7 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 		} else {
 			pr_debug("%s: Raise OpenPIC INT output cpu %d irq %d/%d\n",
 				__func__, n_CPU, n_IRQ, dst->raised.next);
-			qemu_irq_raise(opp->dst[n_CPU].
-				       irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_raise(opp, dst, ILR_INTTGT_INT);
 		}
 	} else {
 		IRQ_get_next(opp, &dst->servicing);
@@ -338,8 +405,7 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 			pr_debug("%s: IRQ %d inactive, current prio %d/%d, CPU %d\n",
 				__func__, n_IRQ, dst->ctpr,
 				dst->servicing.priority, n_CPU);
-			qemu_irq_lower(opp->dst[n_CPU].
-				       irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_lower(opp, dst, ILR_INTTGT_INT);
 		}
 	}
 }
@@ -415,8 +481,8 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 	struct irq_source *src;
 
 	if (n_IRQ >= MAX_IRQ) {
-		pr_err("%s: IRQ %d out of range\n", __func__, n_IRQ);
-		abort();
+		WARN_ONCE(1, "%s: IRQ %d out of range\n", __func__, n_IRQ);
+		return;
 	}
 
 	src = &opp->src[n_IRQ];
@@ -433,7 +499,7 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 			openpic_update_irq(opp, n_IRQ);
 		}
 
-		if (src->output != OPENPIC_OUTPUT_INT) {
+		if (src->output != ILR_INTTGT_INT) {
 			/* Edge-triggered interrupts shouldn't be used
 			 * with non-INT delivery, but just in case,
 			 * try to make it do something sane rather than
@@ -446,15 +512,13 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 	}
 }
 
-static void openpic_reset(DeviceState *d)
+static void openpic_reset(struct openpic *opp)
 {
-	struct openpic *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d));
 	int i;
 
 	opp->gcr = GCR_RESET;
 	/* Initialise controller registers */
 	opp->frr = ((opp->nb_irqs - 1) << FRR_NIRQ_SHIFT) |
-	    ((opp->nb_cpus - 1) << FRR_NCPU_SHIFT) |
 	    (opp->vid << FRR_VID_SHIFT);
 
 	opp->pir = 0;
@@ -504,7 +568,7 @@ static inline uint32_t read_IRQreg_idr(struct openpic *opp, int n_IRQ)
 static inline uint32_t read_IRQreg_ilr(struct openpic *opp, int n_IRQ)
 {
 	if (opp->flags & OPENPIC_FLAG_ILR)
-		return output_to_inttgt(opp->src[n_IRQ].output);
+		return opp->src[n_IRQ].output;
 
 	return 0xffffffff;
 }
@@ -539,7 +603,7 @@ static inline void write_IRQreg_idr(struct openpic *opp, int n_IRQ,
 					__func__);
 			}
 
-			src->output = OPENPIC_OUTPUT_CINT;
+			src->output = ILR_INTTGT_CINT;
 			src->nomask = true;
 			src->destmask = 0;
 
@@ -550,7 +614,7 @@ static inline void write_IRQreg_idr(struct openpic *opp, int n_IRQ,
 					src->destmask |= 1UL << i;
 			}
 		} else {
-			src->output = OPENPIC_OUTPUT_INT;
+			src->output = ILR_INTTGT_INT;
 			src->nomask = false;
 			src->destmask = src->idr & normal_mask;
 		}
@@ -565,7 +629,7 @@ static inline void write_IRQreg_ilr(struct openpic *opp, int n_IRQ,
 	if (opp->flags & OPENPIC_FLAG_ILR) {
 		struct irq_source *src = &opp->src[n_IRQ];
 
-		src->output = inttgt_to_output(val & ILR_INTTGT_MASK);
+		src->output = val & ILR_INTTGT_MASK;
 		pr_debug("Set ILR %d to 0x%08x, output %d\n", n_IRQ, src->idr,
 			src->output);
 
@@ -614,34 +678,23 @@ static inline void write_IRQreg_ivpr(struct openpic *opp, int n_IRQ,
 
 static void openpic_gcr_write(struct openpic *opp, uint64_t val)
 {
-	bool mpic_proxy = false;
-
 	if (val & GCR_RESET) {
-		openpic_reset(&opp->busdev.qdev);
+		openpic_reset(opp);
 		return;
 	}
 
 	opp->gcr &= ~opp->mpic_mode_mask;
 	opp->gcr |= val & opp->mpic_mode_mask;
-
-	/* Set external proxy mode */
-	if ((val & opp->mpic_mode_mask) == GCR_MODE_PROXY)
-		mpic_proxy = true;
-
-	ppce500_set_mpic_proxy(mpic_proxy);
 }
 
-static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned len)
+static int openpic_gbl_write(void *opaque, gpa_t addr, u32 val)
 {
 	struct openpic *opp = opaque;
-	struct irq_dest *dst;
-	int idx;
+	int err = 0;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= %08x\n", __func__, addr, val);
 	if (addr & 0xF)
-		return;
+		return 0;
 
 	switch (addr) {
 	case 0x00:	/* Block Revision Register1 (BRR1) is Readonly */
@@ -654,7 +707,8 @@ static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
 	case 0x90:
 	case 0xA0:
 	case 0xB0:
-		openpic_cpu_write_internal(opp, addr, val, get_current_cpu());
+		err = openpic_cpu_write_internal(opp, addr, val,
+						 get_current_cpu());
 		break;
 	case 0x1000:		/* FRR */
 		break;
@@ -664,21 +718,11 @@ static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
 	case 0x1080:		/* VIR */
 		break;
 	case 0x1090:		/* PIR */
-		for (idx = 0; idx < opp->nb_cpus; idx++) {
-			if ((val & (1 << idx)) && !(opp->pir & (1 << idx))) {
-				pr_debug("Raise OpenPIC RESET output for CPU %d\n",
-					idx);
-				dst = &opp->dst[idx];
-				qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_RESET]);
-			} else if (!(val & (1 << idx)) &&
-				   (opp->pir & (1 << idx))) {
-				pr_debug("Lower OpenPIC RESET output for CPU %d\n",
-					idx);
-				dst = &opp->dst[idx];
-				qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_RESET]);
-			}
-		}
-		opp->pir = val;
+		/*
+		 * This register is used to reset a CPU core --
+		 * let userspace handle it.
+		 */
+		err = -ENXIO;
 		break;
 	case 0x10A0:		/* IPI_IVPR */
 	case 0x10B0:
@@ -695,21 +739,25 @@ static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
 	default:
 		break;
 	}
+
+	return err;
 }
 
-static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len)
+static int openpic_gbl_read(void *opaque, gpa_t addr, u32 *ptr)
 {
 	struct openpic *opp = opaque;
-	uint32_t retval;
+	u32 retval;
+	int err = 0;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 	retval = 0xFFFFFFFF;
 	if (addr & 0xF)
-		return retval;
+		goto out;
 
 	switch (addr) {
 	case 0x1000:		/* FRR */
 		retval = opp->frr;
+		retval |= (opp->nb_cpus - 1) << FRR_NCPU_SHIFT;
 		break;
 	case 0x1020:		/* GCR */
 		retval = opp->gcr;
@@ -731,8 +779,8 @@ static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len)
 	case 0x90:
 	case 0xA0:
 	case 0xB0:
-		retval =
-		    openpic_cpu_read_internal(opp, addr, get_current_cpu());
+		err = openpic_cpu_read_internal(opp, addr,
+			&retval, get_current_cpu());
 		break;
 	case 0x10A0:		/* IPI_IVPR */
 	case 0x10B0:
@@ -750,28 +798,28 @@ static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len)
 	default:
 		break;
 	}
-	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
-	return retval;
+out:
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
+	*ptr = retval;
+	return err;
 }
 
-static void openpic_tmr_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned len)
+static int openpic_tmr_write(void *opaque, gpa_t addr, u32 val)
 {
 	struct openpic *opp = opaque;
 	int idx;
 
 	addr += 0x10f0;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= %08x\n", __func__, addr, val);
 	if (addr & 0xF)
-		return;
+		return 0;
 
 	if (addr == 0x10f0) {
 		/* TFRR */
 		opp->tfrr = val;
-		return;
+		return 0;
 	}
 
 	idx = (addr >> 6) & 0x3;
@@ -795,15 +843,17 @@ static void openpic_tmr_write(void *opaque, gpa_t addr, uint64_t val,
 		write_IRQreg_idr(opp, opp->irq_tim0 + idx, val);
 		break;
 	}
+
+	return 0;
 }
 
-static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len)
+static int openpic_tmr_read(void *opaque, gpa_t addr, u32 *ptr)
 {
 	struct openpic *opp = opaque;
 	uint32_t retval = -1;
 	int idx;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 	if (addr & 0xF)
 		goto out;
 
@@ -813,6 +863,7 @@ static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len)
 		retval = opp->tfrr;
 		goto out;
 	}
+
 	switch (addr & 0x30) {
 	case 0x00:		/* TCCR */
 		retval = opp->timers[idx].tccr;
@@ -830,18 +881,16 @@ static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len)
 
 out:
 	pr_debug("%s: => 0x%08x\n", __func__, retval);
-
-	return retval;
+	*ptr = retval;
+	return 0;
 }
 
-static void openpic_src_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned len)
+static int openpic_src_write(void *opaque, gpa_t addr, u32 val)
 {
 	struct openpic *opp = opaque;
 	int idx;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= %08x\n", __func__, addr, val);
 
 	addr = addr & 0xffff;
 	idx = addr >> 5;
@@ -857,15 +906,17 @@ static void openpic_src_write(void *opaque, gpa_t addr, uint64_t val,
 		write_IRQreg_ilr(opp, idx, val);
 		break;
 	}
+
+	return 0;
 }
 
-static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
+static int openpic_src_read(void *opaque, gpa_t addr, u32 *ptr)
 {
 	struct openpic *opp = opaque;
 	uint32_t retval;
 	int idx;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 	retval = 0xFFFFFFFF;
 
 	addr = addr & 0xffff;
@@ -884,20 +935,19 @@ static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
 	}
 
 	pr_debug("%s: => 0x%08x\n", __func__, retval);
-	return retval;
+	*ptr = retval;
+	return 0;
 }
 
-static void openpic_msi_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned size)
+static int openpic_msi_write(void *opaque, gpa_t addr, u32 val)
 {
 	struct openpic *opp = opaque;
 	int idx = opp->irq_msi;
 	int srs, ibs;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= 0x%08x\n", __func__, addr, val);
 	if (addr & 0xF)
-		return;
+		return 0;
 
 	switch (addr) {
 	case MSIIR_OFFSET:
@@ -911,17 +961,19 @@ static void openpic_msi_write(void *opaque, gpa_t addr, uint64_t val,
 		/* most registers are read-only, thus ignored */
 		break;
 	}
+
+	return 0;
 }
 
-static uint64_t openpic_msi_read(void *opaque, gpa_t addr, unsigned size)
+static int openpic_msi_read(void *opaque, gpa_t addr, u32 *ptr)
 {
 	struct openpic *opp = opaque;
-	uint64_t r = 0;
+	uint32_t r = 0;
 	int i, srs;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 	if (addr & 0xF)
-		return -1;
+		return -ENXIO;
 
 	srs = addr >> 4;
 
@@ -945,45 +997,47 @@ static uint64_t openpic_msi_read(void *opaque, gpa_t addr, unsigned size)
 		break;
 	}
 
-	return r;
+	pr_debug("%s: => 0x%08x\n", __func__, r);
+	*ptr = r;
+	return 0;
 }
 
-static uint64_t openpic_summary_read(void *opaque, gpa_t addr, unsigned size)
+static int openpic_summary_read(void *opaque, gpa_t addr, u32 *ptr)
 {
-	uint64_t r = 0;
+	uint32_t r = 0;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 
 	/* TODO: EISR/EIMR */
 
-	return r;
+	*ptr = r;
+	return 0;
 }
 
-static void openpic_summary_write(void *opaque, gpa_t addr, uint64_t val,
-				  unsigned size)
+static int openpic_summary_write(void *opaque, gpa_t addr, u32 val)
 {
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= 0x%08x\n", __func__, addr, val);
 
 	/* TODO: EISR/EIMR */
+	return 0;
 }
 
-static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
-				       uint32_t val, int idx)
+static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
+				      u32 val, int idx)
 {
 	struct openpic *opp = opaque;
 	struct irq_source *src;
 	struct irq_dest *dst;
 	int s_IRQ, n_IRQ;
 
-	pr_debug("%s: cpu %d addr %#" HWADDR_PRIx " <= 0x%08x\n", __func__, idx,
+	pr_debug("%s: cpu %d addr %#llx <= 0x%08x\n", __func__, idx,
 		addr, val);
 
 	if (idx < 0)
-		return;
+		return 0;
 
 	if (addr & 0xF)
-		return;
+		return 0;
 
 	dst = &opp->dst[idx];
 	addr &= 0xFF0;
@@ -1008,11 +1062,11 @@ static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
 		if (dst->raised.priority <= dst->ctpr) {
 			pr_debug("%s: Lower OpenPIC INT output cpu %d due to ctpr\n",
 				__func__, idx);
-			qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_lower(opp, dst, ILR_INTTGT_INT);
 		} else if (dst->raised.priority > dst->servicing.priority) {
 			pr_debug("%s: Raise OpenPIC INT output cpu %d irq %d\n",
 				__func__, idx, dst->raised.next);
-			qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_raise(opp, dst, ILR_INTTGT_INT);
 		}
 
 		break;
@@ -1043,18 +1097,22 @@ static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
 		     IVPR_PRIORITY(src->ivpr) > dst->servicing.priority)) {
 			pr_debug("Raise OpenPIC INT output cpu %d irq %d\n",
 				idx, n_IRQ);
-			qemu_irq_raise(opp->dst[idx].irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_raise(opp, dst, ILR_INTTGT_INT);
 		}
 		break;
 	default:
 		break;
 	}
+
+	return 0;
 }
 
-static void openpic_cpu_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned len)
+static int openpic_cpu_write(void *opaque, gpa_t addr, u32 val)
 {
-	openpic_cpu_write_internal(opaque, addr, val, (addr & 0x1f000) >> 12);
+	struct openpic *opp = opaque;
+
+	return openpic_cpu_write_internal(opp, addr, val,
+					 (addr & 0x1f000) >> 12);
 }
 
 static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
@@ -1064,7 +1122,7 @@ static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
 	int retval, irq;
 
 	pr_debug("Lower OpenPIC INT output\n");
-	qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
+	mpic_irq_lower(opp, dst, ILR_INTTGT_INT);
 
 	irq = IRQ_get_next(opp, &dst->raised);
 	pr_debug("IACK: irq=%d\n", irq);
@@ -1107,20 +1165,21 @@ static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
 	return retval;
 }
 
-static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx)
+static int openpic_cpu_read_internal(void *opaque, gpa_t addr,
+				     u32 *ptr, int idx)
 {
 	struct openpic *opp = opaque;
 	struct irq_dest *dst;
 	uint32_t retval;
 
-	pr_debug("%s: cpu %d addr %#" HWADDR_PRIx "\n", __func__, idx, addr);
+	pr_debug("%s: cpu %d addr %#llx\n", __func__, idx, addr);
 	retval = 0xFFFFFFFF;
 
 	if (idx < 0)
-		return retval;
+		goto out;
 
 	if (addr & 0xF)
-		return retval;
+		goto out;
 
 	dst = &opp->dst[idx];
 	addr &= 0xFF0;
@@ -1142,49 +1201,67 @@ static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx)
 	}
 	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
-	return retval;
+out:
+	*ptr = retval;
+	return 0;
 }
 
-static uint64_t openpic_cpu_read(void *opaque, gpa_t addr, unsigned len)
+static int openpic_cpu_read(void *opaque, gpa_t addr, u32 *ptr)
 {
-	return openpic_cpu_read_internal(opaque, addr, (addr & 0x1f000) >> 12);
+	struct openpic *opp = opaque;
+
+	return openpic_cpu_read_internal(opp, addr, ptr,
+					 (addr & 0x1f000) >> 12);
 }
 
-static const struct kvm_io_device_ops openpic_glb_ops_be = {
+struct mem_reg {
+	struct list_head list;
+	int (*read)(void *opaque, gpa_t addr, u32 *ptr);
+	int (*write)(void *opaque, gpa_t addr, u32 val);
+	gpa_t start_addr;
+	int size;
+};
+
+static struct mem_reg openpic_gbl_mmio = {
 	.write = openpic_gbl_write,
 	.read = openpic_gbl_read,
+	.start_addr = OPENPIC_GLB_REG_START,
+	.size = OPENPIC_GLB_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_tmr_ops_be = {
+static struct mem_reg openpic_tmr_mmio = {
 	.write = openpic_tmr_write,
 	.read = openpic_tmr_read,
+	.start_addr = OPENPIC_TMR_REG_START,
+	.size = OPENPIC_TMR_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_cpu_ops_be = {
+static struct mem_reg openpic_cpu_mmio = {
 	.write = openpic_cpu_write,
 	.read = openpic_cpu_read,
+	.start_addr = OPENPIC_CPU_REG_START,
+	.size = OPENPIC_CPU_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_src_ops_be = {
+static struct mem_reg openpic_src_mmio = {
 	.write = openpic_src_write,
 	.read = openpic_src_read,
+	.start_addr = OPENPIC_SRC_REG_START,
+	.size = OPENPIC_SRC_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_msi_ops_be = {
+static struct mem_reg openpic_msi_mmio = {
 	.read = openpic_msi_read,
 	.write = openpic_msi_write,
+	.start_addr = OPENPIC_MSI_REG_START,
+	.size = OPENPIC_MSI_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_summary_ops_be = {
+static struct mem_reg openpic_summary_mmio = {
 	.read = openpic_summary_read,
 	.write = openpic_summary_write,
-};
-
-struct mem_reg {
-	const char *name;
-	const struct kvm_io_device_ops *ops;
-	gpa_t start_addr;
-	int size;
+	.start_addr = OPENPIC_SUMMARY_REG_START,
+	.size = OPENPIC_SUMMARY_REG_SIZE,
 };
 
 static void fsl_common_init(struct openpic *opp)
@@ -1192,6 +1269,9 @@ static void fsl_common_init(struct openpic *opp)
 	int i;
 	int virq = MAX_SRC;
 
+	list_add(&openpic_msi_mmio.list, &opp->mmio_regions);
+	list_add(&openpic_summary_mmio.list, &opp->mmio_regions);
+
 	opp->vid = VID_REVISION_1_2;
 	opp->vir = VIR_GENERIC;
 	opp->vector_mask = 0xFFFF;
@@ -1205,11 +1285,10 @@ static void fsl_common_init(struct openpic *opp)
 	opp->irq_tim0 = virq;
 	virq += MAX_TMR;
 
-	assert(virq <= MAX_IRQ);
+	BUG_ON(virq > MAX_IRQ);
 
 	opp->irq_msi = 224;
 
-	msi_supported = true;
 	for (i = 0; i < opp->fsl->max_ext; i++)
 		opp->src[i].level = false;
 
@@ -1226,63 +1305,352 @@ static void fsl_common_init(struct openpic *opp)
 	}
 }
 
-static void map_list(struct openpic *opp, const struct mem_reg *list,
-		     int *count)
+static int kvm_mpic_read_internal(struct openpic *opp, gpa_t addr, u32 *ptr)
 {
-	while (list->name) {
-		assert(*count < ARRAY_SIZE(opp->sub_io_mem));
+	struct list_head *node;
 
-		memory_region_init_io(&opp->sub_io_mem[*count], list->ops, opp,
-				      list->name, list->size);
+	list_for_each(node, &opp->mmio_regions) {
+		struct mem_reg *mr = list_entry(node, struct mem_reg, list);
 
-		memory_region_add_subregion(&opp->mem, list->start_addr,
-					    &opp->sub_io_mem[*count]);
+		if (mr->start_addr > addr || addr >= mr->start_addr + mr->size)
+			continue;
 
-		(*count)++;
-		list++;
+		return mr->read(opp, addr - mr->start_addr, ptr);
 	}
+
+	return -ENXIO;
 }
 
-static int openpic_init(SysBusDevice *dev)
+static int kvm_mpic_write_internal(struct openpic *opp, gpa_t addr, u32 val)
 {
-	struct openpic *opp = FROM_SYSBUS(typeof(*opp), dev);
-	int i, j;
-	int list_count = 0;
-	static const struct mem_reg list_le[] = {
-		{"glb", &openpic_glb_ops_le,
-		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
-		{"tmr", &openpic_tmr_ops_le,
-		 OPENPIC_TMR_REG_START, OPENPIC_TMR_REG_SIZE},
-		{"src", &openpic_src_ops_le,
-		 OPENPIC_SRC_REG_START, OPENPIC_SRC_REG_SIZE},
-		{"cpu", &openpic_cpu_ops_le,
-		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
-		{NULL}
-	};
-	static const struct mem_reg list_be[] = {
-		{"glb", &openpic_glb_ops_be,
-		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
-		{"tmr", &openpic_tmr_ops_be,
-		 OPENPIC_TMR_REG_START, OPENPIC_TMR_REG_SIZE},
-		{"src", &openpic_src_ops_be,
-		 OPENPIC_SRC_REG_START, OPENPIC_SRC_REG_SIZE},
-		{"cpu", &openpic_cpu_ops_be,
-		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
-		{NULL}
-	};
-	static const struct mem_reg list_fsl[] = {
-		{"msi", &openpic_msi_ops_be,
-		 OPENPIC_MSI_REG_START, OPENPIC_MSI_REG_SIZE},
-		{"summary", &openpic_summary_ops_be,
-		 OPENPIC_SUMMARY_REG_START, OPENPIC_SUMMARY_REG_SIZE},
-		{NULL}
-	};
+	struct list_head *node;
+
+	list_for_each(node, &opp->mmio_regions) {
+		struct mem_reg *mr = list_entry(node, struct mem_reg, list);
+
+		if (mr->start_addr > addr || addr >= mr->start_addr + mr->size)
+			continue;
 
-	memory_region_init(&opp->mem, "openpic", 0x40000);
+		return mr->write(opp, addr - mr->start_addr, val);
+	}
+
+	return -ENXIO;
+}
+
+static int kvm_mpic_read(struct kvm_io_device *this, gpa_t addr,
+			 int len, void *ptr)
+{
+	struct openpic *opp = container_of(this, struct openpic, mmio);
+	int ret;
+	union {
+		u32 val;
+		u8 bytes[4];
+	} u;
+
+	if (addr & (len - 1)) {
+		pr_debug("%s: bad alignment %llx/%d\n",
+			 __func__, addr, len);
+		return -EINVAL;
+	}
+
+	spin_lock_irq(&opp->lock);
+	ret = kvm_mpic_read_internal(opp, addr - opp->reg_base, &u.val);
+	spin_unlock_irq(&opp->lock);
+
+	/*
+	 * Technically only 32-bit accesses are allowed, but be nice to
+	 * people dumping registers a byte at a time -- it works in real
+	 * hardware (reads only, not writes).
+	 */
+	if (len == 4) {
+		*(u32 *)ptr = u.val;
+		pr_debug("%s: addr %llx ret %d len 4 val %x\n",
+			 __func__, addr, ret, u.val);
+	} else if (len == 1) {
+		*(u8 *)ptr = u.bytes[addr & 3];
+		pr_debug("%s: addr %llx ret %d len 1 val %x\n",
+			 __func__, addr, ret, u.bytes[addr & 3]);
+	} else {
+		pr_debug("%s: bad length %d\n", __func__, len);
+		return -EINVAL;
+	}
+
+	return ret;
+}
+
+static int kvm_mpic_write(struct kvm_io_device *this, gpa_t addr,
+			  int len, const void *ptr)
+{
+	struct openpic *opp = container_of(this, struct openpic, mmio);
+	int ret;
+
+	if (len != 4) {
+		pr_debug("%s: bad length %d\n", __func__, len);
+		return -EOPNOTSUPP;
+	}
+	if (addr & 3) {
+		pr_debug("%s: bad alignment %llx/%d\n", __func__, addr, len);
+		return -EOPNOTSUPP;
+	}
+
+	spin_lock_irq(&opp->lock);
+	ret = kvm_mpic_write_internal(opp, addr - opp->reg_base,
+				      *(const u32 *)ptr);
+	spin_unlock_irq(&opp->lock);
+
+	pr_debug("%s: addr %llx ret %d val %x\n",
+		 __func__, addr, ret, *(const u32 *)ptr);
+
+	return ret;
+}
+
+static void kvm_mpic_dtor(struct kvm_io_device *this)
+{
+	struct openpic *opp = container_of(this, struct openpic, mmio);
+
+	opp->mmio_mapped = false;
+}
+
+static const struct kvm_io_device_ops mpic_mmio_ops = {
+	.read = kvm_mpic_read,
+	.write = kvm_mpic_write,
+	.destructor = kvm_mpic_dtor,
+};
+
+static void map_mmio(struct openpic *opp)
+{
+	BUG_ON(opp->mmio_mapped);
+	opp->mmio_mapped = true;
+
+	kvm_iodevice_init(&opp->mmio, &mpic_mmio_ops);
+
+	kvm_io_bus_register_dev(opp->kvm, KVM_MMIO_BUS,
+				opp->reg_base, OPENPIC_REG_SIZE,
+				&opp->mmio);
+}
+
+static void unmap_mmio(struct openpic *opp)
+{
+	BUG_ON(opp->mmio_mapped);
+	opp->mmio_mapped = false;
+
+	kvm_io_bus_unregister_dev(opp->kvm, KVM_MMIO_BUS, &opp->mmio);
+}
+
+static int set_base_addr(struct openpic *opp, struct kvm_device_attr *attr)
+{
+	u64 base;
+
+	if (copy_from_user(&base, (u64 __user *)(long)attr->addr, sizeof(u64)))
+		return -EFAULT;
+
+	if (base & 0x3ffff) {
+		pr_debug("kvm mpic %s: KVM_DEV_MPIC_BASE_ADDR %08llx not aligned\n",
+			 __func__, base);
+		return -EINVAL;
+	}
+
+	if (base == opp->reg_base)
+		return 0;
+
+	mutex_lock(&opp->kvm->slots_lock);
+
+	unmap_mmio(opp);
+	opp->reg_base = base;
+
+	pr_debug("kvm mpic %s: KVM_DEV_MPIC_BASE_ADDR %08llx\n",
+		 __func__, base);
+
+	if (base == 0)
+		goto out;
+
+	map_mmio(opp);
+
+out:
+	mutex_unlock(&opp->kvm->slots_lock);
+	return 0;
+}
+
+#define ATTR_SET		0
+#define ATTR_GET		1
+
+static int access_reg(struct openpic *opp, gpa_t addr, u32 *val, int type)
+{
+	int ret;
+
+	if (addr & 3)
+		return -ENXIO;
+
+	spin_lock_irq(&opp->lock);
+
+	if (type == ATTR_SET)
+		ret = kvm_mpic_write_internal(opp, addr, *val);
+	else
+		ret = kvm_mpic_read_internal(opp, addr, val);
+
+	spin_unlock_irq(&opp->lock);
+
+	pr_debug("%s: type %d addr %llx val %x\n", __func__, type, addr, *val);
+
+	return ret;
+}
+
+static int mpic_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
+{
+	struct openpic *opp = dev->private;
+	u32 attr32;
+
+	switch (attr->group) {
+	case KVM_DEV_MPIC_GRP_MISC:
+		switch (attr->attr) {
+		case KVM_DEV_MPIC_BASE_ADDR:
+			return set_base_addr(opp, attr);
+		}
+
+		break;
+
+	case KVM_DEV_MPIC_GRP_REGISTER:
+		if (get_user(attr32, (u32 __user *)(long)attr->addr))
+			return -EFAULT;
+
+		return access_reg(opp, attr->attr, &attr32, ATTR_SET);
+
+	case KVM_DEV_MPIC_GRP_IRQ_ACTIVE:
+		if (attr->attr > MAX_SRC)
+			return -EINVAL;
+
+		if (get_user(attr32, (u32 __user *)(long)attr->addr))
+			return -EFAULT;
+
+		if (attr32 != 0 && attr32 != 1)
+			return -EINVAL;
+
+		spin_lock_irq(&opp->lock);
+		openpic_set_irq(opp, attr->attr, attr32);
+		spin_unlock_irq(&opp->lock);
+		return 0;
+	}
+
+	return -ENXIO;
+}
+
+static int mpic_get_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
+{
+	struct openpic *opp = dev->private;
+	u64 attr64;
+	u32 attr32;
+	int ret;
+
+	switch (attr->group) {
+	case KVM_DEV_MPIC_GRP_MISC:
+		switch (attr->attr) {
+		case KVM_DEV_MPIC_BASE_ADDR:
+			mutex_lock(&opp->kvm->slots_lock);
+			attr64 = opp->reg_base;
+			mutex_unlock(&opp->kvm->slots_lock);
+
+			if (copy_to_user((u64 __user *)(long)attr->addr,
+					 &attr64, sizeof(u64)))
+				return -EFAULT;
+
+			return 0;
+		}
+
+		break;
+
+	case KVM_DEV_MPIC_GRP_REGISTER:
+		ret = access_reg(opp, attr->attr, &attr32, ATTR_GET);
+		if (ret)
+			return ret;
+
+		if (put_user(attr32, (u32 __user *)(long)attr->addr))
+			return -EFAULT;
+
+		return 0;
+
+	case KVM_DEV_MPIC_GRP_IRQ_ACTIVE:
+		if (attr->attr > MAX_SRC)
+			return -EINVAL;
+
+		spin_lock_irq(&opp->lock);
+		attr32 = opp->src[attr->attr].pending;
+		spin_unlock_irq(&opp->lock);
+
+		if (put_user(attr32, (u32 __user *)(long)attr->addr))
+			return -EFAULT;
+
+		return 0;
+	}
+
+	return -ENXIO;
+}
+
+static int mpic_has_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
+{
+	switch (attr->group) {
+	case KVM_DEV_MPIC_GRP_MISC:
+		switch (attr->attr) {
+		case KVM_DEV_MPIC_BASE_ADDR:
+			return 0;
+		}
+
+		break;
+
+	case KVM_DEV_MPIC_GRP_REGISTER:
+		return 0;
+
+	case KVM_DEV_MPIC_GRP_IRQ_ACTIVE:
+		if (attr->attr > MAX_SRC)
+			break;
+
+		return 0;
+	}
+
+	return -ENXIO;
+}
+
+static void mpic_destroy(struct kvm_device *dev)
+{
+	struct openpic *opp = dev->private;
+
+	if (opp->mmio_mapped) {
+		/*
+		 * Normally we get unmapped by kvm_io_bus_destroy(),
+		 * which happens before the VCPUs release their references.
+		 *
+		 * Thus, we should only get here if no VCPUs took a reference
+		 * to us in the first place.
+		 */
+		WARN_ON(opp->nb_cpus != 0);
+		unmap_mmio(opp);
+	}
+
+	kfree(opp);
+}
+
+static int mpic_create(struct kvm_device *dev, u32 type)
+{
+	struct openpic *opp;
+	int ret;
+
+	opp = kzalloc(sizeof(struct openpic), GFP_KERNEL);
+	if (!opp)
+		return -ENOMEM;
+
+	dev->private = opp;
+	opp->kvm = dev->kvm;
+	opp->dev = dev;
+	opp->model = type;
+	spin_lock_init(&opp->lock);
+
+	INIT_LIST_HEAD(&opp->mmio_regions);
+	list_add(&openpic_gbl_mmio.list, &opp->mmio_regions);
+	list_add(&openpic_tmr_mmio.list, &opp->mmio_regions);
+	list_add(&openpic_src_mmio.list, &opp->mmio_regions);
+	list_add(&openpic_cpu_mmio.list, &opp->mmio_regions);
 
 	switch (opp->model) {
-	case OPENPIC_MODEL_FSL_MPIC_20:
-	default:
+	case KVM_DEV_TYPE_FSL_MPIC_20:
 		opp->fsl = &fsl_mpic_20;
 		opp->brr1 = 0x00400200;
 		opp->flags |= OPENPIC_FLAG_IDR_CRIT;
@@ -1290,12 +1658,10 @@ static int openpic_init(SysBusDevice *dev)
 		opp->mpic_mode_mask = GCR_MODE_MIXED;
 
 		fsl_common_init(opp);
-		map_list(opp, list_be, &list_count);
-		map_list(opp, list_fsl, &list_count);
 
 		break;
 
-	case OPENPIC_MODEL_FSL_MPIC_42:
+	case KVM_DEV_TYPE_FSL_MPIC_42:
 		opp->fsl = &fsl_mpic_42;
 		opp->brr1 = 0x00400402;
 		opp->flags |= OPENPIC_FLAG_ILR;
@@ -1303,11 +1669,27 @@ static int openpic_init(SysBusDevice *dev)
 		opp->mpic_mode_mask = GCR_MODE_PROXY;
 
 		fsl_common_init(opp);
-		map_list(opp, list_be, &list_count);
-		map_list(opp, list_fsl, &list_count);
 
 		break;
+
+	default:
+		ret = -ENODEV;
+		goto err;
 	}
 
+	openpic_reset(opp);
 	return 0;
+
+err:
+	kfree(opp);
+	return ret;
 }
+
+struct kvm_device_ops kvm_mpic_ops = {
+	.name = "kvm-mpic",
+	.create = mpic_create,
+	.destroy = mpic_destroy,
+	.set_attr = mpic_set_attr,
+	.get_attr = mpic_get_attr,
+	.has_attr = mpic_has_attr,
+};
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index a822659..3cad714 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -317,6 +317,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_ENABLE_CAP:
 	case KVM_CAP_ONE_REG:
 	case KVM_CAP_IOEVENTFD:
+	case KVM_CAP_DEVICE_CTRL:
 		r = 1;
 		break;
 #ifndef CONFIG_KVM_BOOK3S_64_HV
@@ -768,7 +769,10 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 		break;
 	case KVM_CAP_PPC_EPR:
 		r = 0;
-		vcpu->arch.epr_enabled = cap->args[0];
+		if (cap->args[0])
+			vcpu->arch.epr_flags |= KVMPPC_EPR_USER;
+		else
+			vcpu->arch.epr_flags &= ~KVMPPC_EPR_USER;
 		break;
 #ifdef CONFIG_BOOKE
 	case KVM_CAP_PPC_BOOKE_WATCHDOG:
@@ -914,6 +918,7 @@ static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
 long kvm_arch_vm_ioctl(struct file *filp,
                        unsigned int ioctl, unsigned long arg)
 {
+	struct kvm *kvm __maybe_unused = filp->private_data;
 	void __user *argp = (void __user *)arg;
 	long r;
 
@@ -932,7 +937,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 #ifdef CONFIG_PPC_BOOK3S_64
 	case KVM_CREATE_SPAPR_TCE: {
 		struct kvm_create_spapr_tce create_tce;
-		struct kvm *kvm = filp->private_data;
 
 		r = -EFAULT;
 		if (copy_from_user(&create_tce, argp, sizeof(create_tce)))
@@ -944,7 +948,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 
 #ifdef CONFIG_KVM_BOOK3S_64_HV
 	case KVM_ALLOCATE_RMA: {
-		struct kvm *kvm = filp->private_data;
 		struct kvm_allocate_rma rma;
 
 		r = kvm_vm_ioctl_allocate_rma(kvm, &rma);
@@ -954,7 +957,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	}
 
 	case KVM_PPC_ALLOCATE_HTAB: {
-		struct kvm *kvm = filp->private_data;
 		u32 htab_order;
 
 		r = -EFAULT;
@@ -971,7 +973,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	}
 
 	case KVM_PPC_GET_HTAB_FD: {
-		struct kvm *kvm = filp->private_data;
 		struct kvm_get_htab_fd ghf;
 
 		r = -EFAULT;
@@ -984,7 +985,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 
 #ifdef CONFIG_PPC_BOOK3S_64
 	case KVM_PPC_GET_SMMU_INFO: {
-		struct kvm *kvm = filp->private_data;
 		struct kvm_ppc_smmu_info info;
 
 		memset(&info, 0, sizeof(info));
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 6dab6b5..feffbda 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1099,6 +1099,8 @@ void kvm_device_get(struct kvm_device *dev);
 void kvm_device_put(struct kvm_device *dev);
 struct kvm_device *kvm_device_from_filp(struct file *filp);
 
+extern struct kvm_device_ops kvm_mpic_ops;
+
 #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
 
 static inline void kvm_vcpu_set_in_spin_loop(struct kvm_vcpu *vcpu, bool val)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index be15aff..568d65d 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -925,6 +925,9 @@ struct kvm_device_attr {
 	__u64	addr;		/* userspace address of attr data */
 };
 
+#define KVM_DEV_TYPE_FSL_MPIC_20	1
+#define KVM_DEV_TYPE_FSL_MPIC_42	2
+
 /* ioctl for vm fd */
 #define KVM_CREATE_DEVICE	  _IOWR(KVMIO,  0xe0, struct kvm_create_device)
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 5f0d78c..f6cd14d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2238,6 +2238,12 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
 	int ret;
 
 	switch (cd->type) {
+#ifdef CONFIG_KVM_MPIC
+	case KVM_DEV_TYPE_FSL_MPIC_20:
+	case KVM_DEV_TYPE_FSL_MPIC_42:
+		ops = &kvm_mpic_ops;
+		break;
+#endif
 	default:
 		return -ENODEV;
 	}
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 14/17] kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC
  2013-04-18 14:11 ` Alexander Graf
@ 2013-04-18 14:11   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

Enabling this capability connects the vcpu to the designated in-kernel
MPIC.  Using explicit connections between vcpus and irqchips allows
for flexibility, but the main benefit at the moment is that it
simplifies the code -- KVM doesn't need vm-global state to remember
which MPIC object is associated with this vm, and it doesn't need to
care about ordering between irqchip creation and vcpu creation.

Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: add stub functions for kvmppc_mpic_{dis,}connect_vcpu]
Signed-off-by: Alexander Graf <agraf@suse.de>
---
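
A minimal userspace sketch of how the capability might be enabled,
assuming a vcpu fd from KVM_CREATE_VCPU and an MPIC device fd from
KVM_CREATE_DEVICE. The helper name mpic_connect_vcpu_fd() is
hypothetical and not part of this series; only the args[] layout is
taken from the api.txt hunk below.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: connect one vcpu to an in-kernel MPIC.
 * Returns 0 on success or a negative errno value on failure.
 */
static int mpic_connect_vcpu_fd(int vcpu_fd, int mpic_fd, uint32_t mpic_cpu)
{
	struct kvm_enable_cap cap;

	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_IRQ_MPIC;
	cap.args[0] = (uint64_t)mpic_fd;	/* MPIC device fd */
	cap.args[1] = mpic_cpu;			/* MPIC CPU number for this vcpu */

	if (ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap) < 0)
		return -errno;

	return 0;
}
```

Going by the code in this patch, a caller should expect -EBADF for a
bad device fd, -EPERM for a device that is not an MPIC of the same VM
or for an out-of-range CPU number, -EEXIST if the MPIC CPU slot is
already taken, and -EBUSY if the vcpu is already connected.
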
 Documentation/virtual/kvm/api.txt   |    8 +++
 arch/powerpc/include/asm/kvm_host.h |    9 ++++
 arch/powerpc/include/asm/kvm_ppc.h  |   15 ++++++-
 arch/powerpc/kvm/booke.c            |    4 ++
 arch/powerpc/kvm/mpic.c             |   82 ++++++++++++++++++++++++++++++++---
 arch/powerpc/kvm/powerpc.c          |   30 +++++++++++++
 include/uapi/linux/kvm.h            |    1 +
 7 files changed, 141 insertions(+), 8 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index d52f3f9..4c326ae 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2728,3 +2728,11 @@ to receive the topmost interrupt vector.
 When disabled (args[0] == 0), behavior is as if this facility is unsupported.
 
 When this capability is enabled, KVM_EXIT_EPR can occur.
+
+6.6 KVM_CAP_IRQ_MPIC
+
+Architectures: ppc
+Parameters: args[0] is the MPIC device fd
+            args[1] is the MPIC CPU number for this vcpu
+
+This capability connects the vcpu to an in-kernel MPIC device.
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 7e7aef9..36368c9 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -375,6 +375,11 @@ struct kvmppc_booke_debug_reg {
 	u64 dac[KVMPPC_BOOKE_MAX_DAC];
 };
 
+#define KVMPPC_IRQ_DEFAULT	0
+#define KVMPPC_IRQ_MPIC		1
+
+struct openpic;
+
 struct kvm_vcpu_arch {
 	ulong host_stack;
 	u32 host_pid;
@@ -554,6 +559,10 @@ struct kvm_vcpu_arch {
 	unsigned long magic_page_pa; /* phys addr to map the magic page to */
 	unsigned long magic_page_ea; /* effect. addr to map the magic page to */
 
+	int irq_type;		/* one of KVMPPC_IRQ_* */
+	int irq_cpu_id;
+	struct openpic *mpic;	/* KVMPPC_IRQ_MPIC */
+
 #ifdef CONFIG_KVM_BOOK3S_64_HV
 	struct kvm_vcpu_arch_shared shregs;
 
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 0b86604..c9d9faf 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -248,7 +248,6 @@ int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id, union kvmppc_one_reg *);
 void kvmppc_set_pid(struct kvm_vcpu *vcpu, u32 pid);
 
 struct openpic;
-void kvmppc_mpic_put(struct openpic *opp);
 
 #ifdef CONFIG_KVM_BOOK3S_64_HV
 static inline void kvmppc_set_xics_phys(int cpu, unsigned long addr)
@@ -278,6 +277,9 @@ static inline void kvmppc_set_epr(struct kvm_vcpu *vcpu, u32 epr)
 #ifdef CONFIG_KVM_MPIC
 
 void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu);
+int kvmppc_mpic_connect_vcpu(struct kvm_device *dev, struct kvm_vcpu *vcpu,
+			     u32 cpu);
+void kvmppc_mpic_disconnect_vcpu(struct openpic *opp, struct kvm_vcpu *vcpu);
 
 #else
 
@@ -285,6 +287,17 @@ static inline void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu)
 {
 }
 
+static inline int kvmppc_mpic_connect_vcpu(struct kvm_device *dev,
+		struct kvm_vcpu *vcpu, u32 cpu)
+{
+	return -EINVAL;
+}
+
+static inline void kvmppc_mpic_disconnect_vcpu(struct openpic *opp,
+		struct kvm_vcpu *vcpu)
+{
+}
+
 #endif /* CONFIG_KVM_MPIC */
 
 int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index cff53d4..0097912 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -430,6 +430,10 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 		if (update_epr == true) {
 			if (vcpu->arch.epr_flags & KVMPPC_EPR_USER)
 				kvm_make_request(KVM_REQ_EPR_EXIT, vcpu);
+			else if (vcpu->arch.epr_flags & KVMPPC_EPR_KERNEL) {
+				BUG_ON(vcpu->arch.irq_type != KVMPPC_IRQ_MPIC);
+				kvmppc_mpic_set_epr(vcpu);
+			}
 		}
 
 		new_msr &= msr_mask;
diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index 60857d5..b1ae02d 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -115,7 +115,7 @@ static int get_current_cpu(void)
 {
 #if defined(CONFIG_KVM) && defined(CONFIG_BOOKE)
 	struct kvm_vcpu *vcpu = current->thread.kvm_vcpu;
-	return vcpu ? vcpu->vcpu_id : -1;
+	return vcpu ? vcpu->arch.irq_cpu_id : -1;
 #else
 	/* XXX */
 	return -1;
@@ -249,7 +249,7 @@ static void mpic_irq_raise(struct openpic *opp, struct irq_dest *dst,
 		return;
 	}
 
-	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->vcpu_id,
+	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->arch.irq_cpu_id,
 		output);
 
 	if (output != ILR_INTTGT_INT)	/* TODO */
@@ -267,7 +267,7 @@ static void mpic_irq_lower(struct openpic *opp, struct irq_dest *dst,
 		return;
 	}
 
-	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->vcpu_id,
+	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->arch.irq_cpu_id,
 		output);
 
 	if (output != ILR_INTTGT_INT)	/* TODO */
@@ -1165,6 +1165,20 @@ static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
 	return retval;
 }
 
+void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu)
+{
+	struct openpic *opp = vcpu->arch.mpic;
+	int cpu = vcpu->arch.irq_cpu_id;
+	unsigned long flags;
+
+	spin_lock_irqsave(&opp->lock, flags);
+
+	if ((opp->gcr & opp->mpic_mode_mask) == GCR_MODE_PROXY)
+		kvmppc_set_epr(vcpu, openpic_iack(opp, &opp->dst[cpu], cpu));
+
+	spin_unlock_irqrestore(&opp->lock, flags);
+}
+
 static int openpic_cpu_read_internal(void *opaque, gpa_t addr,
 				     u32 *ptr, int idx)
 {
@@ -1431,10 +1445,10 @@ static void map_mmio(struct openpic *opp)
 
 static void unmap_mmio(struct openpic *opp)
 {
-	BUG_ON(opp->mmio_mapped);
-	opp->mmio_mapped = false;
-
-	kvm_io_bus_unregister_dev(opp->kvm, KVM_MMIO_BUS, &opp->mmio);
+	if (opp->mmio_mapped) {
+		opp->mmio_mapped = false;
+		kvm_io_bus_unregister_dev(opp->kvm, KVM_MMIO_BUS, &opp->mmio);
+	}
 }
 
 static int set_base_addr(struct openpic *opp, struct kvm_device_attr *attr)
@@ -1693,3 +1707,57 @@ struct kvm_device_ops kvm_mpic_ops = {
 	.get_attr = mpic_get_attr,
 	.has_attr = mpic_has_attr,
 };
+
+int kvmppc_mpic_connect_vcpu(struct kvm_device *dev, struct kvm_vcpu *vcpu,
+			     u32 cpu)
+{
+	struct openpic *opp = dev->private;
+	int ret = 0;
+
+	if (dev->ops != &kvm_mpic_ops)
+		return -EPERM;
+	if (opp->kvm != vcpu->kvm)
+		return -EPERM;
+	if (cpu < 0 || cpu >= MAX_CPU)
+		return -EPERM;
+
+	spin_lock_irq(&opp->lock);
+
+	if (opp->dst[cpu].vcpu) {
+		ret = -EEXIST;
+		goto out;
+	}
+	if (vcpu->arch.irq_type) {
+		ret = -EBUSY;
+		goto out;
+	}
+
+	opp->dst[cpu].vcpu = vcpu;
+	opp->nb_cpus = max(opp->nb_cpus, cpu + 1);
+
+	vcpu->arch.mpic = opp;
+	vcpu->arch.irq_cpu_id = cpu;
+	vcpu->arch.irq_type = KVMPPC_IRQ_MPIC;
+
+	/* This might need to be changed if GCR gets extended */
+	if (opp->mpic_mode_mask == GCR_MODE_PROXY)
+		vcpu->arch.epr_flags |= KVMPPC_EPR_KERNEL;
+
+	kvm_device_get(dev);
+out:
+	spin_unlock_irq(&opp->lock);
+	return ret;
+}
+
+/*
+ * This should only happen immediately before the mpic is destroyed,
+ * so we shouldn't need to worry about anything still trying to
+ * access the vcpu pointer.
+ */
+void kvmppc_mpic_disconnect_vcpu(struct openpic *opp, struct kvm_vcpu *vcpu)
+{
+	BUG_ON(!opp->dst[vcpu->arch.irq_cpu_id].vcpu);
+
+	opp->dst[vcpu->arch.irq_cpu_id].vcpu = NULL;
+	kvm_device_put(opp->dev);
+}
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 3cad714..c431fea 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -25,6 +25,7 @@
 #include <linux/hrtimer.h>
 #include <linux/fs.h>
 #include <linux/slab.h>
+#include <linux/file.h>
 #include <asm/cputable.h>
 #include <asm/uaccess.h>
 #include <asm/kvm_ppc.h>
@@ -327,6 +328,9 @@ int kvm_dev_ioctl_check_extension(long ext)
 #if defined(CONFIG_KVM_E500V2) || defined(CONFIG_KVM_E500MC)
 	case KVM_CAP_SW_TLB:
 #endif
+#ifdef CONFIG_KVM_MPIC
+	case KVM_CAP_IRQ_MPIC:
+#endif
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
@@ -460,6 +464,13 @@ void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
 	tasklet_kill(&vcpu->arch.tasklet);
 
 	kvmppc_remove_vcpu_debugfs(vcpu);
+
+	switch (vcpu->arch.irq_type) {
+	case KVMPPC_IRQ_MPIC:
+		kvmppc_mpic_disconnect_vcpu(vcpu->arch.mpic, vcpu);
+		break;
+	}
+
 	kvmppc_core_vcpu_free(vcpu);
 }
 
@@ -793,6 +804,25 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 		break;
 	}
 #endif
+#ifdef CONFIG_KVM_MPIC
+	case KVM_CAP_IRQ_MPIC: {
+		struct file *filp;
+		struct kvm_device *dev;
+
+		r = -EBADF;
+		filp = fget(cap->args[0]);
+		if (!filp)
+			break;
+
+		r = -EPERM;
+		dev = kvm_device_from_filp(filp);
+		if (dev)
+			r = kvmppc_mpic_connect_vcpu(dev, vcpu, cap->args[1]);
+
+		fput(filp);
+		break;
+	}
+#endif
 	default:
 		r = -EINVAL;
 		break;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 568d65d..8f95cac 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -667,6 +667,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_ARM_PSCI 87
 #define KVM_CAP_ARM_SET_DEVICE_ADDR 88
 #define KVM_CAP_DEVICE_CTRL 89
+#define KVM_CAP_IRQ_MPIC 90
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 15/17] KVM: PPC: Support irq routing and irqfd for in-kernel MPIC
  2013-04-18 14:11 ` Alexander Graf
@ 2013-04-18 14:11   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

Now that all the irq routing and irqfd pieces are generic, we can expose
real irqchip support to all of KVM's internal helpers.

This allows us to use irqfd with the in-kernel MPIC.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm_host.h |    7 ++
 arch/powerpc/include/uapi/asm/kvm.h |    1 +
 arch/powerpc/kvm/Kconfig            |    3 +
 arch/powerpc/kvm/Makefile           |    1 +
 arch/powerpc/kvm/irq.h              |   17 ++++++
 arch/powerpc/kvm/mpic.c             |  106 +++++++++++++++++++++++++++++++++++
 6 files changed, 135 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/irq.h

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 36368c9..d5fbb4b 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -44,6 +44,10 @@
 #define KVM_COALESCED_MMIO_PAGE_OFFSET 1
 #endif
 
+/* These values are internal and can be increased later */
+#define KVM_NR_IRQCHIPS          1
+#define KVM_IRQCHIP_NUM_PINS     256
+
 #if !defined(CONFIG_KVM_440)
 #include <linux/mmu_notifier.h>
 
@@ -256,6 +260,9 @@ struct kvm_arch {
 #ifdef CONFIG_PPC_BOOK3S_64
 	struct list_head spapr_tce_tables;
 #endif
+#ifdef CONFIG_KVM_MPIC
+	void *mpic;
+#endif
 };
 
 /*
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 36be2fe..3537bf3 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -25,6 +25,7 @@
 /* Select powerpc specific features in <linux/kvm.h> */
 #define __KVM_HAVE_SPAPR_TCE
 #define __KVM_HAVE_PPC_SMT
+#define __KVM_HAVE_IRQCHIP
 
 struct kvm_regs {
 	__u64 pc;
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 938a729..a608570 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -154,6 +154,9 @@ config KVM_E500MC
 config KVM_MPIC
 	bool "KVM in-kernel MPIC emulation"
 	depends on KVM
+	select HAVE_KVM_IRQCHIP
+	select HAVE_KVM_IRQ_ROUTING
+	select HAVE_KVM_MSI
 	help
 	  Enable support for emulating MPIC devices inside the
           host kernel, rather than relying on userspace to emulate.
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index 4a2277a..4eada0c 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -104,6 +104,7 @@ kvm-book3s_32-objs := \
 kvm-objs-$(CONFIG_KVM_BOOK3S_32) := $(kvm-book3s_32-objs)
 
 kvm-objs-$(CONFIG_KVM_MPIC) += mpic.o
+kvm-objs-$(CONFIG_HAVE_KVM_IRQ_ROUTING) += $(addprefix ../../../virt/kvm/, irqchip.o)
 
 kvm-objs := $(kvm-objs-m) $(kvm-objs-y)
 
diff --git a/arch/powerpc/kvm/irq.h b/arch/powerpc/kvm/irq.h
new file mode 100644
index 0000000..f1e27fd
--- /dev/null
+++ b/arch/powerpc/kvm/irq.h
@@ -0,0 +1,17 @@
+#ifndef __IRQ_H
+#define __IRQ_H
+
+#include <linux/kvm_host.h>
+
+static inline int irqchip_in_kernel(struct kvm *kvm)
+{
+	int ret = 0;
+
+#ifdef CONFIG_KVM_MPIC
+	ret = ret || (kvm->arch.mpic != NULL);
+#endif
+	smp_rmb();
+	return ret;
+}
+
+#endif
diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index b1ae02d..c8de2f2 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -1087,6 +1087,8 @@ static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
 		}
 
 		IRQ_resetbit(&dst->servicing, s_IRQ);
+		/* Notify listeners that the IRQ is over */
+		kvm_notify_acked_irq(opp->kvm, 0, s_IRQ);
 		/* Set up next servicing IRQ */
 		s_IRQ = IRQ_get_next(opp, &dst->servicing);
 		/* Check queued interrupts. */
@@ -1639,14 +1641,42 @@ static void mpic_destroy(struct kvm_device *dev)
 		unmap_mmio(opp);
 	}
 
+	dev->kvm->arch.mpic = NULL;
 	kfree(opp);
 }
 
+static int mpic_set_default_irq_routing(struct openpic *opp)
+{
+	int i;
+	struct kvm_irq_routing_entry *routing;
+
+	/* XXX be more dynamic if we ever want to support multiple MPIC chips */
+	routing = kzalloc((sizeof(*routing) * opp->nb_irqs), GFP_KERNEL);
+	if (!routing)
+		return -ENOMEM;
+
+	for (i = 0; i < opp->nb_irqs; i++) {
+		routing[i].gsi = i;
+		routing[i].type = KVM_IRQ_ROUTING_IRQCHIP;
+		routing[i].u.irqchip.irqchip = 0;
+		routing[i].u.irqchip.pin = i;
+	}
+
+	kvm_set_irq_routing(opp->kvm, routing, opp->nb_irqs, 0);
+
+	kfree(routing);
+	return 0;
+}
+
 static int mpic_create(struct kvm_device *dev, u32 type)
 {
 	struct openpic *opp;
 	int ret;
 
+	/* We only support one MPIC at a time for now */
+	if (dev->kvm->arch.mpic)
+		return -EINVAL;
+
 	opp = kzalloc(sizeof(struct openpic), GFP_KERNEL);
 	if (!opp)
 		return -ENOMEM;
@@ -1691,10 +1721,18 @@ static int mpic_create(struct kvm_device *dev, u32 type)
 		goto err;
 	}
 
+	dev->kvm->arch.mpic = opp;
+
+	ret = mpic_set_default_irq_routing(opp);
+	if (ret)
+		goto err;
+
 	openpic_reset(opp);
+
 	return 0;
 
 err:
+	dev->kvm->arch.mpic = NULL;
 	kfree(opp);
 	return ret;
 }
@@ -1761,3 +1799,71 @@ void kvmppc_mpic_disconnect_vcpu(struct openpic *opp, struct kvm_vcpu *vcpu)
 	opp->dst[vcpu->arch.irq_cpu_id].vcpu = NULL;
 	kvm_device_put(opp->dev);
 }
+
+/*
+ * Return value:
+ *  < 0   Interrupt was ignored (masked or not delivered for other reasons)
+ *  = 0   Interrupt was coalesced (previous irq is still pending)
+ *  > 0   Number of CPUs interrupt was delivered to
+ */
+static int mpic_set_irq(struct kvm_kernel_irq_routing_entry *e,
+			struct kvm *kvm, int irq_source_id, int level,
+			bool line_status)
+{
+	u32 irq = e->irqchip.pin;
+	struct openpic *opp = kvm->arch.mpic;
+
+	spin_lock_irq(&opp->lock);
+	openpic_set_irq(opp, irq, level);
+	spin_unlock_irq(&opp->lock);
+
+	/* All code paths we care about don't check for the return value */
+	return 0;
+}
+
+int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
+		struct kvm *kvm, int irq_source_id, int level, bool line_status)
+{
+	struct openpic *opp = kvm->arch.mpic;
+	spin_lock_irq(&opp->lock);
+
+	/*
+	 * XXX We ignore the target address for now, as we only support
+	 *     a single MSI bank.
+	 */
+	openpic_msi_write(kvm->arch.mpic, MSIIR_OFFSET, e->msi.data);
+	spin_unlock_irq(&opp->lock);
+
+	/* All code paths we care about don't check for the return value */
+	return 0;
+}
+
+int kvm_set_routing_entry(struct kvm_irq_routing_table *rt,
+			  struct kvm_kernel_irq_routing_entry *e,
+			  const struct kvm_irq_routing_entry *ue)
+{
+	int r = -EINVAL;
+
+	switch (ue->type) {
+	case KVM_IRQ_ROUTING_IRQCHIP:
+		e->set = mpic_set_irq;
+		e->irqchip.irqchip = ue->u.irqchip.irqchip;
+		e->irqchip.pin = ue->u.irqchip.pin;
+		if (e->irqchip.pin >= KVM_IRQCHIP_NUM_PINS)
+			goto out;
+		rt->chip[ue->u.irqchip.irqchip][e->irqchip.pin] = ue->gsi;
+		break;
+	case KVM_IRQ_ROUTING_MSI:
+		e->set = kvm_set_msi;
+		e->msi.address_lo = ue->u.msi.address_lo;
+		e->msi.address_hi = ue->u.msi.address_hi;
+		e->msi.data = ue->u.msi.data;
+		break;
+	default:
+		goto out;
+	}
+
+	r = 0;
+out:
+	return r;
+}
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 16/17] KVM: PPC: MPIC: Add support for KVM_IRQ_LINE
  2013-04-18 14:11 ` Alexander Graf
@ 2013-04-18 14:11   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

Now that all pieces are in place for reusing generic irq infrastructure,
we can copy x86's implementation of KVM_IRQ_LINE irq injection and simply
reuse it for PPC, as it will work there just as well.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/uapi/asm/kvm.h |    1 +
 arch/powerpc/kvm/powerpc.c          |   13 +++++++++++++
 2 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 3537bf3..dbb2ac2 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -26,6 +26,7 @@
 #define __KVM_HAVE_SPAPR_TCE
 #define __KVM_HAVE_PPC_SMT
 #define __KVM_HAVE_IRQCHIP
+#define __KVM_HAVE_IRQ_LINE
 
 struct kvm_regs {
 	__u64 pc;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index c431fea..874c106 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -33,6 +33,7 @@
 #include <asm/cputhreads.h>
 #include <asm/irqflags.h>
 #include "timing.h"
+#include "irq.h"
 #include "../mm/mmu_decl.h"
 
 #define CREATE_TRACE_POINTS
@@ -945,6 +946,18 @@ static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
 	return 0;
 }
 
+int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event,
+			  bool line_status)
+{
+	if (!irqchip_in_kernel(kvm))
+		return -ENXIO;
+
+	irq_event->status = kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID,
+					irq_event->irq, irq_event->level,
+					line_status);
+	return 0;
+}
+
 long kvm_arch_vm_ioctl(struct file *filp,
                        unsigned int ioctl, unsigned long arg)
 {
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 17/17] KVM: PPC: MPIC: Restrict to e500 platforms
  2013-04-18 14:11 ` Alexander Graf
@ 2013-04-18 14:11   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:11 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

The code as is doesn't make any sense on non-e500 platforms. Restrict it
there, so that people don't get wrong ideas on what would actually work.

This patch should get reverted as soon as it's possible to either run e500
guests on non-e500 hosts or the MPIC emulation gains support for non-e500
modes.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/Kconfig |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index a608570..e88b1da 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -153,7 +153,7 @@ config KVM_E500MC
 
 config KVM_MPIC
 	bool "KVM in-kernel MPIC emulation"
-	depends on KVM
+	depends on KVM && E500
 	select HAVE_KVM_IRQCHIP
 	select HAVE_KVM_IRQ_ROUTING
 	select HAVE_KVM_MSI
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 128+ messages in thread

* Re: [PATCH 17/17] KVM: PPC: MPIC: Restrict to e500 platforms
  2013-04-18 14:11   ` Alexander Graf
@ 2013-04-18 14:29     ` Scott Wood
  -1 siblings, 0 replies; 128+ messages in thread
From: Scott Wood @ 2013-04-18 14:29 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Marcelo Tosatti, Gleb Natapov

On 04/18/2013 09:11:57 AM, Alexander Graf wrote:
> The code as is doesn't make any sense on non-e500 platforms.

Why?  What actually breaks?

You never answered my earlier question about whether 74xx is supported.
MPC86xx is 74xx-derived and has an FSL MPIC.

Plus, as pointed out earlier, this limits allyesconfig build testing.

-Scott

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 17/17] KVM: PPC: MPIC: Restrict to e500 platforms
  2013-04-18 14:29     ` Scott Wood
@ 2013-04-18 14:52       ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-18 14:52 UTC (permalink / raw)
  To: Scott Wood
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Marcelo Tosatti, Gleb Natapov


On 18.04.2013, at 16:29, Scott Wood wrote:

> On 04/18/2013 09:11:57 AM, Alexander Graf wrote:
>> The code as is doesn't make any sense on non-e500 platforms.
> 
> Why?  What actually breaks?

It broke even compiling, because book3s doesn't provide anything for EPR for example. I fixed up the issues I found, but it's clear that this is an untested configuration.

> You never answered my earlier question about whether 74xx is supported.  MPC86xx is 74xx-derived and has an FSL MPIC.

Heck, even get_current_cpu() has no chance of working on book3s as it is today. I would prefer that anyone who wants to run MPC86xx with an in-kernel MPIC (which is a pure optimization!) just sits down, enables it and reverts this patch.

> Plus, as pointed out earlier, this limits allyesconfig build testing.

That's the point. It's code that only ever got tested on e500.


Alex

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 15/17] KVM: PPC: Support irq routing and irqfd for in-kernel MPIC
  2013-04-18 14:11   ` Alexander Graf
@ 2013-04-18 21:39     ` Scott Wood
  -1 siblings, 0 replies; 128+ messages in thread
From: Scott Wood @ 2013-04-18 21:39 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Marcelo Tosatti, Gleb Natapov

On 04/18/2013 09:11:55 AM, Alexander Graf wrote:
> Now that all the irq routing and irqfd pieces are generic, we can expose
> real irqchip support to all of KVM's internal helpers.
> 
> This allows us to use irqfd with the in-kernel MPIC.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  arch/powerpc/include/asm/kvm_host.h |    7 ++
>  arch/powerpc/include/uapi/asm/kvm.h |    1 +
>  arch/powerpc/kvm/Kconfig            |    3 +
>  arch/powerpc/kvm/Makefile           |    1 +
>  arch/powerpc/kvm/irq.h              |   17 ++++++
>  arch/powerpc/kvm/mpic.c             |  106 +++++++++++++++++++++++++++++++++++
>  6 files changed, 135 insertions(+), 0 deletions(-)
>  create mode 100644 arch/powerpc/kvm/irq.h
> 
> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
> index 36368c9..d5fbb4b 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -44,6 +44,10 @@
>  #define KVM_COALESCED_MMIO_PAGE_OFFSET 1
>  #endif
> 
> +/* These values are internal and can be increased later */
> +#define KVM_NR_IRQCHIPS          1
> +#define KVM_IRQCHIP_NUM_PINS     256
> +
>  #if !defined(CONFIG_KVM_440)
>  #include <linux/mmu_notifier.h>
> 
> @@ -256,6 +260,9 @@ struct kvm_arch {
>  #ifdef CONFIG_PPC_BOOK3S_64
>  	struct list_head spapr_tce_tables;
>  #endif
> +#ifdef CONFIG_KVM_MPIC
> +	void *mpic;
> +#endif
>  };

This can be "struct openpic *mpic" -- we already do this in the vcpu.

> diff --git a/arch/powerpc/kvm/irq.h b/arch/powerpc/kvm/irq.h
> new file mode 100644
> index 0000000..f1e27fd
> --- /dev/null
> +++ b/arch/powerpc/kvm/irq.h
> @@ -0,0 +1,17 @@
> +#ifndef __IRQ_H
> +#define __IRQ_H
> +
> +#include <linux/kvm_host.h>
> +
> +static inline int irqchip_in_kernel(struct kvm *kvm)
> +{
> +	int ret = 0;
> +
> +#ifdef CONFIG_KVM_MPIC
> +	ret = ret || (kvm->arch.mpic != NULL);
> +#endif
> +	smp_rmb();
> +	return ret;
> +}

Couldn't we just set a non-irqchip-specific bool?  Though eventually this
shouldn't be needed at all -- instead the check would be "does the
requested irqchip fd exist and refer to something that exposes an irqchip
interface"?

The rmb needs documentation.  I don't see a corresponding wmb before
writing to kvm->arch.mpic.
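
A minimal userspace sketch of the pairing being asked for here, using C11
atomics as a stand-in for the kernel barriers. Everything in it is an
assumption made for illustration: struct fake_mpic, published_mpic and the
helper names are hypothetical and not part of the patch. The release store
plays the role an smp_wmb() before the write to kvm->arch.mpic would play,
and the acquire load plays the role of the reader-side ordering.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct openpic. */
struct fake_mpic {
	int nb_irqs;
};

/* Hypothetical stand-in for kvm->arch.mpic; zero-initialised, i.e. NULL. */
static _Atomic(struct fake_mpic *) published_mpic;

/*
 * Writer side: finish initialising the object, then publish the pointer
 * with release ordering so the initialisation is visible to any reader
 * that observes the non-NULL pointer.
 */
void publish_mpic(struct fake_mpic *m, int nb_irqs)
{
	assert(m != NULL);
	m->nb_irqs = nb_irqs;
	atomic_store_explicit(&published_mpic, m, memory_order_release);
}

/* Reader side: the acquire load pairs with the release store above. */
int irqchip_present(void)
{
	return atomic_load_explicit(&published_mpic,
				    memory_order_acquire) != NULL;
}

/* Returns the published nb_irqs, or -1 if nothing has been published. */
int published_nb_irqs(void)
{
	struct fake_mpic *m = atomic_load_explicit(&published_mpic,
						   memory_order_acquire);

	return m ? m->nb_irqs : -1;
}
```

The point of the sketch is only that a read-side barrier documents an
ordering contract with a specific write; without the matching write-side
ordering the reader gains nothing from it.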

> diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
> index b1ae02d..c8de2f2 100644
> --- a/arch/powerpc/kvm/mpic.c
> +++ b/arch/powerpc/kvm/mpic.c
> @@ -1087,6 +1087,8 @@ static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
>  		}
> 
>  		IRQ_resetbit(&dst->servicing, s_IRQ);
> +		/* Notify listeners that the IRQ is over */
> +		kvm_notify_acked_irq(opp->kvm, 0, s_IRQ);

I don't think we want to call that with the lock held -- it looks like it
can call back into kvm_set_irq.  In our old internal version I waited
until the end of the EOI code, and called the notify function with the
lock dropped temporarily.
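
The pattern described above can be sketched roughly as follows. This is a
single-threaded toy, not the internal version mentioned in the review: the
"lock" is a plain flag so that a notification delivered while it is held can
be detected, and struct toy_pic, toy_notify_acked() and toy_eoi_locked() are
hypothetical names standing in for the openpic state, kvm_notify_acked_irq()
and the EOI path.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the MPIC state; all fields are hypothetical. */
struct toy_pic {
	bool locked;              /* stands in for opp->lock being held */
	int servicing;            /* IRQ currently in service, or -1 */
	int notify_count;         /* number of acks delivered to listeners */
	bool notified_under_lock; /* set if a listener ran with the lock held */
};

static void toy_lock(struct toy_pic *p)
{
	p->locked = true;
}

static void toy_unlock(struct toy_pic *p)
{
	p->locked = false;
}

/*
 * Stands in for the ack notifier. A real listener might re-enter the
 * interrupt controller and take the lock itself, so record whether we
 * were called with it held.
 */
static void toy_notify_acked(struct toy_pic *p, int irq)
{
	(void)irq;
	if (p->locked)
		p->notified_under_lock = true;
	p->notify_count++;
}

/*
 * EOI handling. The caller holds the lock. State is updated first; the
 * lock is then dropped around the notification and retaken afterwards.
 */
void toy_eoi_locked(struct toy_pic *p)
{
	int acked;

	assert(p->locked);

	acked = p->servicing;
	p->servicing = -1;

	if (acked >= 0) {
		toy_unlock(p);
		toy_notify_acked(p, acked);
		toy_lock(p);
	}
}
```

Doing the notification last matters because any state examined after the
lock is retaken may have changed while it was dropped.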

>  		/* Set up next servicing IRQ */
>  		s_IRQ = IRQ_get_next(opp, &dst->servicing);
>  		/* Check queued interrupts. */
> @@ -1639,14 +1641,42 @@ static void mpic_destroy(struct kvm_device *dev)
>  		unmap_mmio(opp);
>  	}
> 
> +	dev->kvm->arch.mpic = NULL;
>  	kfree(opp);
>  }
> 
> +static int mpic_set_default_irq_routing(struct openpic *opp)
> +{
> +	int i;
> +	struct kvm_irq_routing_entry *routing;
> +
> +	/* XXX be more dynamic if we ever want to support multiple MPIC chips */
> +	routing = kzalloc((sizeof(*routing) * opp->nb_irqs), GFP_KERNEL);
> +	if (!routing)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < opp->nb_irqs; i++) {
> +		routing[i].gsi = i;
> +		routing[i].type = KVM_IRQ_ROUTING_IRQCHIP;
> +		routing[i].u.irqchip.irqchip = 0;
> +		routing[i].u.irqchip.pin = i;
> +	}
> +
> +	kvm_set_irq_routing(opp->kvm, routing, opp->nb_irqs, 0);
> +
> +	kfree(routing);
> +	return 0;
> +}

Do we really want any default routes?  There's no platform notion of GSI
here, so how is userspace to know how the kernel set it up (or what GSIs
are free to be used for new routes) -- other than the "read the code"
answer I got when I asked about x86?  :-P
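
For reference, the default table that the quoted function builds is an
identity mapping. The sketch below restates it outside the kernel, assuming
a simplified struct toy_route in place of struct kvm_irq_routing_entry; the
helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified mirror of struct kvm_irq_routing_entry. */
struct toy_route {
	uint32_t gsi;
	uint32_t irqchip;
	uint32_t pin;
};

/*
 * Fill the identity mapping set up by the quoted
 * mpic_set_default_irq_routing(): GSI i -> irqchip 0, pin i.
 * Returns the number of entries written.
 */
uint32_t toy_fill_default_routes(struct toy_route *r, uint32_t nb_irqs)
{
	uint32_t i;

	assert(r != NULL || nb_irqs == 0);

	for (i = 0; i < nb_irqs; i++) {
		r[i].gsi = i;
		r[i].irqchip = 0;
		r[i].pin = i;
	}
	return nb_irqs;
}

/* With identity defaults, the lowest GSI not claimed by a default route. */
uint32_t toy_first_free_gsi(uint32_t nb_irqs)
{
	return nb_irqs;
}
```

Under that assumption, userspace has to know nb_irqs to find an unclaimed
GSI, which is the discoverability problem raised above.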

> +int kvm_set_routing_entry(struct kvm_irq_routing_table *rt,
> +			  struct kvm_kernel_irq_routing_entry *e,
> +			  const struct kvm_irq_routing_entry *ue)
> +{
> +	int r = -EINVAL;
> +
> +	switch (ue->type) {
> +	case KVM_IRQ_ROUTING_IRQCHIP:
> +		e->set = mpic_set_irq;
> +		e->irqchip.irqchip = ue->u.irqchip.irqchip;
> +		e->irqchip.pin = ue->u.irqchip.pin;
> +		if (e->irqchip.pin >= KVM_IRQCHIP_NUM_PINS)
> +			goto out;
> +		rt->chip[ue->u.irqchip.irqchip][e->irqchip.pin] = ue->gsi;
> +		break;
> +	case KVM_IRQ_ROUTING_MSI:
> +		e->set = kvm_set_msi;
> +		e->msi.address_lo = ue->u.msi.address_lo;
> +		e->msi.address_hi = ue->u.msi.address_hi;
> +		e->msi.data = ue->u.msi.data;
> +		break;
> +	default:
> +		goto out;
> +	}
> +
> +	r = 0;
> +out:
> +	return r;
> +}

How would one create a route for IPIs, timers, etc (we have had customers
wanting to assign those to KVM guests)?  What about error interrupts on
MPIC 4.2?  How are you defining the "pin" field for MPIC?  Shouldn't
there be an API documentation update for this?

We also need to document what irqchip means, if we want to reserve the
ability to use it in the future -- otherwise userspace could fill in any
old junk there and we'd need to retain compatibility.  It should probably
be the fd of the MPIC.
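
One possible way to keep the irqchip field reserved until its meaning is
defined is to reject values the kernel does not understand. The sketch below
is an assumption, not something the patch does: it validates the irqchip
index alongside the pin before indexing the table. NR_IRQCHIPS and
IRQCHIP_NUM_PINS mirror KVM_NR_IRQCHIPS and KVM_IRQCHIP_NUM_PINS from the
quoted patch; struct toy_route_table and toy_set_irqchip_route() are
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NR_IRQCHIPS      1    /* mirrors KVM_NR_IRQCHIPS in the patch */
#define IRQCHIP_NUM_PINS 256  /* mirrors KVM_IRQCHIP_NUM_PINS in the patch */

/* Hypothetical, simplified mirror of the chip[][] part of the routing table. */
struct toy_route_table {
	int chip[NR_IRQCHIPS][IRQCHIP_NUM_PINS];
};

/*
 * Record a GSI for (irqchip, pin). Both indices are checked before the
 * table is touched, so an unknown irqchip value is refused with -EINVAL
 * instead of being silently accepted.
 */
int toy_set_irqchip_route(struct toy_route_table *rt, uint32_t irqchip,
			  uint32_t pin, uint32_t gsi)
{
	assert(rt != NULL);

	if (irqchip >= NR_IRQCHIPS || pin >= IRQCHIP_NUM_PINS)
		return -EINVAL;

	rt->chip[irqchip][pin] = (int)gsi;
	return 0;
}
```

Rejecting unknown values now leaves room to give them a meaning later (an
fd, for example) without breaking userspace that passed junk.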

It looks like you only support having one type of irqchip built into
the kernel at a time?  That may be OK for now given the existing
restrictions for building KVM, but it should be noted as a
limitation/TODO.

BTW, thanks for taking this on -- it would have taken me a while to
digest the existing code.

-Scott

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 15/17] KVM: PPC: Support irq routing and irqfd for in-kernel MPIC
  2013-04-18 21:39     ` Scott Wood
@ 2013-04-19  0:15       ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19  0:15 UTC (permalink / raw)
  To: Scott Wood
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Marcelo Tosatti, Gleb Natapov


On 18.04.2013, at 23:39, Scott Wood wrote:

> On 04/18/2013 09:11:55 AM, Alexander Graf wrote:
>> Now that all the irq routing and irqfd pieces are generic, we can expose
>> real irqchip support to all of KVM's internal helpers.
>> This allows us to use irqfd with the in-kernel MPIC.
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> ---
>> arch/powerpc/include/asm/kvm_host.h |    7 ++
>> arch/powerpc/include/uapi/asm/kvm.h |    1 +
>> arch/powerpc/kvm/Kconfig            |    3 +
>> arch/powerpc/kvm/Makefile           |    1 +
>> arch/powerpc/kvm/irq.h              |   17 ++++++
>> arch/powerpc/kvm/mpic.c             |  106 +++++++++++++++++++++++++++++++++++
>> 6 files changed, 135 insertions(+), 0 deletions(-)
>> create mode 100644 arch/powerpc/kvm/irq.h
>> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
>> index 36368c9..d5fbb4b 100644
>> --- a/arch/powerpc/include/asm/kvm_host.h
>> +++ b/arch/powerpc/include/asm/kvm_host.h
>> @@ -44,6 +44,10 @@
>> #define KVM_COALESCED_MMIO_PAGE_OFFSET 1
>> #endif
>> +/* These values are internal and can be increased later */
>> +#define KVM_NR_IRQCHIPS          1
>> +#define KVM_IRQCHIP_NUM_PINS     256
>> +
>> #if !defined(CONFIG_KVM_440)
>> #include <linux/mmu_notifier.h>
>> @@ -256,6 +260,9 @@ struct kvm_arch {
>> #ifdef CONFIG_PPC_BOOK3S_64
>> 	struct list_head spapr_tce_tables;
>> #endif
>> +#ifdef CONFIG_KVM_MPIC
>> +	void *mpic;
>> +#endif
>> };
> 
> This can be "struct openpic *mpic" -- we already do this in the vcpu.

There was a reason I made it void *, but I don't remember what it was. I can certainly try to make it struct openpic * again :).

> 
>> diff --git a/arch/powerpc/kvm/irq.h b/arch/powerpc/kvm/irq.h
>> new file mode 100644
>> index 0000000..f1e27fd
>> --- /dev/null
>> +++ b/arch/powerpc/kvm/irq.h
>> @@ -0,0 +1,17 @@
>> +#ifndef __IRQ_H
>> +#define __IRQ_H
>> +
>> +#include <linux/kvm_host.h>
>> +
>> +static inline int irqchip_in_kernel(struct kvm *kvm)
>> +{
>> +	int ret = 0;
>> +
>> +#ifdef CONFIG_KVM_MPIC
>> +	ret = ret || (kvm->arch.mpic != NULL);
>> +#endif
>> +	smp_rmb();
>> +	return ret;
>> +}
> 
> Couldn't we just set a non-irqchip-specific bool?  Though eventually this
> shouldn't be needed at all -- instead the check would be "does the
> requested irqchip fd exist and refer to something that exposes an irqchip
> interface"?

How would we get the irqchip id?

> The rmb needs documentation.  I don't see a corresponding wmb before
> writing to kvm->arch.mpic.

I actually just copied it from the x86 code, wondering what the rmb() is supposed to do here. Do we need this at all?

> 
>> diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
>> index b1ae02d..c8de2f2 100644
>> --- a/arch/powerpc/kvm/mpic.c
>> +++ b/arch/powerpc/kvm/mpic.c
>> @@ -1087,6 +1087,8 @@ static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
>> 		}
>> 		IRQ_resetbit(&dst->servicing, s_IRQ);
>> +		/* Notify listeners that the IRQ is over */
>> +		kvm_notify_acked_irq(opp->kvm, 0, s_IRQ);
> 
> I don't think we want to call that with the lock held -- it looks like it
> can call back into kvm_set_irq.  In our old internal version I waited

Yeah, usually it would call into a non-mpic set handler though IIUC, so we're safe. However, if you want to use an MPIC input pin as resample fd, then you'd be lost here.

> until the end of the EOI code, and called the notify function with the
> lock dropped temporarily.

Sounds like a good idea.
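
A minimal sketch of that ordering, written as a user-space toy rather than the actual mpic.c code. Every demo_* name is hypothetical, and the "lock" is a plain flag that asserts where a real spinlock would deadlock:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the MPIC state; not the real struct openpic. */
struct demo_pic {
	bool locked;			/* toy lock: true while held */
	unsigned long servicing;	/* one bit per in-service IRQ */
	int last_acked;			/* last IRQ seen by the notifier */
};

static void demo_lock(struct demo_pic *pic)
{
	/* A real spinlock would deadlock on re-entry; the toy asserts instead. */
	assert(!pic->locked);
	pic->locked = true;
}

static void demo_unlock(struct demo_pic *pic)
{
	assert(pic->locked);
	pic->locked = false;
}

/*
 * Stand-in for the ack notifier.  It takes the lock itself, modelling a
 * listener that calls back into the irqchip's set_irq path.
 */
static void demo_notify_acked(struct demo_pic *pic, int irq)
{
	demo_lock(pic);
	pic->last_acked = irq;
	demo_unlock(pic);
}

/* EOI path; called with the lock held, returns with the lock held. */
static void demo_eoi_locked(struct demo_pic *pic, int irq)
{
	assert(pic->locked);
	pic->servicing &= ~(1UL << irq);
	/* ... select the next IRQ to service here ... */

	/* Drop the lock only around the notification, at the very end. */
	demo_unlock(pic);
	demo_notify_acked(pic, irq);
	demo_lock(pic);
}
```

Calling demo_notify_acked() before demo_unlock() would trip the assertion in demo_lock(), which is the toy equivalent of the re-entry problem described above.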

> 
>> 		/* Set up next servicing IRQ */
>> 		s_IRQ = IRQ_get_next(opp, &dst->servicing);
>> 		/* Check queued interrupts. */
>> @@ -1639,14 +1641,42 @@ static void mpic_destroy(struct kvm_device *dev)
>> 		unmap_mmio(opp);
>> 	}
>> +	dev->kvm->arch.mpic = NULL;
>> 	kfree(opp);
>> }
>> +static int mpic_set_default_irq_routing(struct openpic *opp)
>> +{
>> +	int i;
>> +	struct kvm_irq_routing_entry *routing;
>> +
>> +	/* XXX be more dynamic if we ever want to support multiple MPIC chips */
>> +	routing = kzalloc((sizeof(*routing) * opp->nb_irqs), GFP_KERNEL);
>> +	if (!routing)
>> +		return -ENOMEM;
>> +
>> +	for (i = 0; i < opp->nb_irqs; i++) {
>> +		routing[i].gsi = i;
>> +		routing[i].type = KVM_IRQ_ROUTING_IRQCHIP;
>> +		routing[i].u.irqchip.irqchip = 0;
>> +		routing[i].u.irqchip.pin = i;
>> +	}
>> +
>> +	kvm_set_irq_routing(opp->kvm, routing, opp->nb_irqs, 0);
>> +
>> +	kfree(routing);
>> +	return 0;
>> +}
> 
> Do we really want any default routes?  There's no platform notion of GSI
> here, so how is userspace to know how the kernel set it up (or what GSIs
> are free to be used for new routes) -- other than the "read the code"
> answer I got when I asked about x86?  :-P

The "default routes" really are just "expose all pins 1:1 as GSI". I think it makes sense to have a simple default for user space that doesn't want to mess with irq routing.

What GSIs are free for new routes doesn't matter. Routes are always completely rewritten as a whole from user space. So when user space goes in and wants to change only a single line, it has to lay out the full map itself anyway.
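
To illustrate, here is a rough sketch of what user space would do under that model. The struct is a simplified, hypothetical stand-in for struct kvm_irq_routing_entry and the demo_* helpers are made up for this example; a real caller would hand the finished table to KVM_SET_GSI_ROUTING:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct kvm_irq_routing_entry. */
struct demo_route {
	uint32_t gsi;
	uint32_t irqchip;
	uint32_t pin;
};

/* Build the 1:1 default layout: GSI n maps to pin n on irqchip 0. */
static void demo_default_routes(struct demo_route *tbl, size_t n)
{
	assert(tbl != NULL);
	for (size_t i = 0; i < n; i++) {
		tbl[i].gsi = (uint32_t)i;
		tbl[i].irqchip = 0;
		tbl[i].pin = (uint32_t)i;
	}
}

/*
 * Retarget a single GSI inside a complete table.  The caller still has to
 * submit all n entries afterwards, because the table is replaced as a whole.
 * Returns 0 on success, -1 if the GSI has no entry.
 */
static int demo_retarget_gsi(struct demo_route *tbl, size_t n,
			     uint32_t gsi, uint32_t new_pin)
{
	assert(tbl != NULL);
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].gsi == gsi) {
			tbl[i].pin = new_pin;
			return 0;
		}
	}
	return -1;
}
```

The point is only that a one-line change still means regenerating the default entries in user space first, then editing the one that differs.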

> 
>> +int kvm_set_routing_entry(struct kvm_irq_routing_table *rt,
>> +			  struct kvm_kernel_irq_routing_entry *e,
>> +			  const struct kvm_irq_routing_entry *ue)
>> +{
>> +	int r = -EINVAL;
>> +
>> +	switch (ue->type) {
>> +	case KVM_IRQ_ROUTING_IRQCHIP:
>> +		e->set = mpic_set_irq;
>> +		e->irqchip.irqchip = ue->u.irqchip.irqchip;
>> +		e->irqchip.pin = ue->u.irqchip.pin;
>> +		if (e->irqchip.pin >= KVM_IRQCHIP_NUM_PINS)
>> +			goto out;
>> +		rt->chip[ue->u.irqchip.irqchip][e->irqchip.pin] = ue->gsi;
>> +		break;
>> +	case KVM_IRQ_ROUTING_MSI:
>> +		e->set = kvm_set_msi;
>> +		e->msi.address_lo = ue->u.msi.address_lo;
>> +		e->msi.address_hi = ue->u.msi.address_hi;
>> +		e->msi.data = ue->u.msi.data;
>> +		break;
>> +	default:
>> +		goto out;
>> +	}
>> +
>> +	r = 0;
>> +out:
>> +	return r;
>> +}
> 
> How would one create a route for IPIs, timers, etc (we have had customers
> wanting to assign those to KVM guests)?  What about error interrupts on
> MPIC 4.2?  How are you defining the "pin" field for MPIC?  Shouldn't

"pin" is basically what a "source line" is on the MPIC. That's what the equivalent of an IOAPIC interrupt line would be for the MPIC world.

IPIs, timer interrupts and error interrupts are special vectors. We could probably model them as different KVM_IRQ_ROUTING_ types if we ever need to support mapping those to irqfd. For simple injection we can always do an ioctl on the MPIC device.

However, I'd be inclined to say that it's rather unlikely you'd want to have a vfio device's interrupt line hooked up to the IPI interrupt of your guest...

> there be an API documentation update for this?

Hrm. I can certainly add one.

> We also need to document what irqchip means, if we want to reserve the
> ability to use it in the future -- otherwise userspace could fill in any
> old junk there and we'd need to retain compatibility.  It should probably
> be the fd of the MPIC.

It can't, because irqchip is an index into an array. We'd have to add another layer of indirection if we want to find the fd to a certain irqchip. That's why I simply restricted it to always be "0" now.

> It looks like you only support having one type of irqchip built into
> the kernel at a time?  That may be OK for now given the existing
> restrictions for building KVM, but it should be noted as a
> limitation/TODO.

I don't think it's really hard to add support for another irqchip type. But we certainly have harder issues to tackle first before we end up in a situation where in-kernel MPIC and in-kernel XICS would make sense in the same kernel.

> BTW, thanks for taking this on -- it would have taken me a while to
> digest the existing code.

It helps to know what the x86 words mean :)


Alex

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 15/17] KVM: PPC: Support irq routing and irqfd for in-kernel MPIC
  2013-04-19  0:15       ` Alexander Graf
@ 2013-04-19  0:50         ` Scott Wood
  -1 siblings, 0 replies; 128+ messages in thread
From: Scott Wood @ 2013-04-19  0:50 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Marcelo Tosatti, Gleb Natapov

On 04/18/2013 07:15:46 PM, Alexander Graf wrote:
> 
> On 18.04.2013, at 23:39, Scott Wood wrote:
> 
> > On 04/18/2013 09:11:55 AM, Alexander Graf wrote:
> >> +static inline int irqchip_in_kernel(struct kvm *kvm)
> >> +{
> >> +	int ret = 0;
> >> +
> >> +#ifdef CONFIG_KVM_MPIC
> >> +	ret = ret || (kvm->arch.mpic != NULL);
> >> +#endif
> >> +	smp_rmb();
> >> +	return ret;
> >> +}
> >
> > Couldn't we just set a non-irqchip-specific bool?  Though eventually this
> > shouldn't be needed at all -- instead the check would be "does the
> > requested irqchip fd exist and refer to something that exposes an irqchip
> > interface"?
> 
> How would we get the irqchip id?

I was thinking it would come from whatever operation you're trying to  
do, though I see that MSI routing doesn't specify an irqchip. :-P

In that case I guess it would check for whether any MSI handler has  
been registered.

> > The rmb needs documentation.  I don't see a corresponding wmb before
> > writing to kvm->arch.mpic.
> 
> I actually just copied it from the x86 code, wondering what the rmb()  
> is supposed to do here. Do we need this at all?

Well, we don't want to start using the irqchip before it's been fully  
initialized -- but if we want to use barriers to accomplish that rather  
than a lock, there needs to be a wmb on the writing side.
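
As a rough illustration of the pairing, here is a user-space sketch using C11 atomics. The release store plays approximately the role an smp_wmb() before the pointer write would, and the acquire load that of the smp_rmb() on the reading side. All demo_* names are hypothetical and this is not the kvm code:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct openpic and struct kvm. */
struct demo_chip {
	int nb_irqs;
};

struct demo_vm {
	_Atomic(struct demo_chip *) chip;
};

/* Writer: finish initialization first, then publish the pointer. */
static void demo_publish_chip(struct demo_vm *vm, struct demo_chip *chip,
			      int nb_irqs)
{
	assert(chip != NULL);
	chip->nb_irqs = nb_irqs;
	/* Release: the initialization above is visible before the pointer. */
	atomic_store_explicit(&vm->chip, chip, memory_order_release);
}

/* Reader: a non-NULL pointer implies the chip contents are initialized. */
static int demo_irqchip_in_kernel(struct demo_vm *vm)
{
	return atomic_load_explicit(&vm->chip, memory_order_acquire) != NULL;
}

/* Reader that actually dereferences the chip; returns -1 if none exists. */
static int demo_nb_irqs(struct demo_vm *vm)
{
	struct demo_chip *chip =
		atomic_load_explicit(&vm->chip, memory_order_acquire);

	return chip ? chip->nb_irqs : -1;
}
```

Without the release on the writing side, the acquire in the reader orders nothing useful, which is the objection to a lone rmb.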

> > How would one create a route for IPIs, timers, etc (we have had customers
> > wanting to assign those to KVM guests)?  What about error interrupts on
> > MPIC 4.2?  How are you defining the "pin" field for MPIC?  Shouldn't
> 
> "pin" is basically what a "source line" is on the MPIC. That's what  
> the equivalent of an IOAPIC interrupt line would be for the MPIC  
> world.
> 
> IPIs, timer interrupts and error interrupts are special vectors. We  
> could probably model them as different KVM_IRQ_ROUTING_ types if we  
> ever need to support mapping those to irqfd.

Seems a bit heavyweight to add several new MPIC-specific routing types  
-- maybe just one new KVM_IRQ_ROUTING type that lets multiple words be  
used to describe the interrupt?

> For simple injection we can always do an ioctl on the MPIC device.

I got complaints for that originally. :-)

> However, I'd be inclined to say that it's rather unlikely you'd want  
> to have a vfio device's interrupt line hooked up to the IPI interrupt  
> of your guest...

Well, as I said, we've done it before for a customer (not with VFIO of  
course, but our old internal-tree device assignment mechanism), so not  
*that* unlikely.  The host was a two-core chip running Linux on only  
one of the cores, and a custom OS on the other core.  They wanted to  
communicate between the custom OS and a KVM guest.  Since host Linux  
was only running on one core, it didn't need the IPIs for itself (and  
beginning with e500mc, the host uses msgsnd instead, so there also the  
host will not need the IPIs for itself).

> > there be an API documentation update for this?
> 
> Hrm. I can certainly add one.

OK.  We've had enough confusion from poor documentation in the device  
tree binding, due to Freescale documentation calling openpic irq 16  
"internal interrupt 0", that we should be clear exactly what it means.

> > We also need to document what irqchip means, if we want to reserve the
> > ability to use it in the future -- otherwise userspace could fill in any
> > old junk there and we'd need to retain compatibility.  It should probably
> > be the fd of the MPIC.
> 
> It can't, because irqchip is an index into an array. We'd have to add  
> another layer of indirection if we want to find the fd to a certain  
> irqchip. That's why I simply restricted it to always be "0" now.

OK, didn't know how firmly that was baked into the code.  Maybe a  
device attribute for returning the irqchip number -- which would  
accommodate devices that expose multiple irqchips.

> > It looks like you only support having one type of irqchip built into
> > the kernel at a time?  That may be OK for now given the existing
> > restrictions for building KVM, but it should be noted as a
> > limitation/TODO.
> 
> I don't think it's really hard to add support for another irqchip  
> type. But we certainly have harder issues to tackle first before we  
> end up in a situation where in-kernel MPIC and in-kernel XICS would  
> make sense in the same kernel.

Not necessarily hard or hugely important, just looked odd putting  
global non-MPIC-specific functions in the MPIC file.  I tend to prefer  
getting the component separation (at least reasonably) correct from the
start, so the person who would end up needing to refactor to fit their  
device in doesn't need to mess with MPIC code to do so.  Such a person  
might be unfamiliar with MPIC, have no easy way to test, etc.  Similar  
to the current situation with the IRQ routing code, at least before you  
took it over. :-)

-Scott

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 15/17] KVM: PPC: Support irq routing and irqfd for in-kernel MPIC
  2013-04-19  0:50         ` Scott Wood
@ 2013-04-19  1:09           ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19  1:09 UTC (permalink / raw)
  To: Scott Wood
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Marcelo Tosatti, Gleb Natapov


On 19.04.2013, at 02:50, Scott Wood wrote:

> On 04/18/2013 07:15:46 PM, Alexander Graf wrote:
>> On 18.04.2013, at 23:39, Scott Wood wrote:
>> > On 04/18/2013 09:11:55 AM, Alexander Graf wrote:
>> >> +static inline int irqchip_in_kernel(struct kvm *kvm)
>> >> +{
>> >> +	int ret = 0;
>> >> +
>> >> +#ifdef CONFIG_KVM_MPIC
>> >> +	ret = ret || (kvm->arch.mpic != NULL);
>> >> +#endif
>> >> +	smp_rmb();
>> >> +	return ret;
>> >> +}
>> >
>> > Couldn't we just set a non-irqchip-specific bool?  Though eventually this
>> > shouldn't be needed at all -- instead the check would be "does the
>> > requested irqchip fd exist and refer to something that exposes an irqchip
>> > interface"?
>> How would we get the irqchip id?
> 
> I was thinking it would come from whatever operation you're trying to do, though I see that MSI routing doesn't specify an irqchip. :-P
> 
> In that case I guess it would check for whether any MSI handler has been registered.

I think you're over-engineering here :). These are internal interfaces that whoever wants to implement a secondary irqchip can worry about. I merely wanted to make sure we don't block the road to multiple irqchips in the kernel<->user interface from the beginning. Internal ones are a different story :).

> 
>> > The rmb needs documentation.  I don't see a corresponding wmb before
>> > writing to kvm->arch.mpic.
>> I actually just copied it from the x86 code, wondering what the rmb() is supposed to do here. Do we need this at all?
> 
> Well, we don't want to start using the irqchip before it's been fully initialized -- but if we want to use barriers to accomplish that rather than a lock, there needs to be a wmb on the writing side.
> 
>> > How would one create a route for IPIs, timers, etc (we have had customers
>> > wanting to assign those to KVM guests)?  What about error interrupts on
>> > MPIC 4.2?  How are you defining the "pin" field for MPIC?  Shouldn't
>> "pin" is basically what a "source line" is on the MPIC. That's what the equivalent of an IOAPIC interrupt line would be for the MPIC world.
>> IPIs, timer interrupts and error interrupts are special vectors. We could probably model them as different KVM_IRQ_ROUTING_ types if we ever need to support mapping those to irqfd.
> 
> Seems a bit heavyweight to add several new MPIC-specific routing types -- maybe just one new KVM_IRQ_ROUTING type that lets multiple words be used to describe the interrupt?

Well, we can add a single KVM_IRQ_ROUTING_MPIC type if that's better. But we don't have to do it now. I dislike code that we can't test, and I don't have a good test case for user space injected IPIs right now :).

> 
>> For simple injection we can always do an ioctl on the MPIC device.
> 
> I got complaints for that originally. :-)

There are 2 reasons why direct ioctls on the MPIC device could be bad:

  1) irqfd
  2) code sharing

I'm not convinced yet we care about performance for MPIC IPI, TMR or ERR interrupts. So irqfd is out of the question there.
Code sharing only makes sense in areas where things are common. In case of the MPIC, this is for SRC interrupts.

So I don't think there should be complaints here :).

> 
>> However, I'd be inclined to say that it's rather unlikely you'd want to have a vfio device's interrupt line hooked up to the IPI interrupt of your guest...
> 
> Well, as I said, we've done it before for a customer (not with VFIO of course, but our old internal-tree device assignment mechanism), so not *that* unlikely.  The host was a two-core chip running Linux on only one of the cores, and a custom OS on the other core.  They wanted to communicate between the custom OS and a KVM guest.  Since host Linux was only running on one core, it didn't need the IPIs for itself (and beginning with e500mc, the host uses msgsnd instead, so there also the host will not need the IPIs for itself).

They could just as well use a guest SRC line for that, no? What the listener on the host connects to on the host's MPIC is pretty much orthogonal to what we inject into the guest.

> 
>> > there be an API documentation update for this?
>> Hrm. I can certainly add one.
> 
> OK.  We've had enough confusion from poor documentation in the device tree binding, due to Freescale documentation calling openpic irq 16 "internal interrupt 0", that we should be clear exactly what it means.
> 
>> > We also need to document what irqchip means, if we want to reserve the
>> > ability to use it in the future -- otherwise userspace could fill in any
>> > old junk there and we'd need to retain compatibility.  It should probably
>> > be the fd of the MPIC.
>> It can't, because irqchip is an index into an array. We'd have to add another layer of indirection if we want to find the fd to a certain irqchip. That's why I simply restricted it to always be "0" now.
> 
> OK, didn't know how firmly that was baked into the code.  Maybe a device attribute for returning the irqchip number -- which would accommodate devices that expose multiple irqchips.

That would work, yes. We'd have to have 2 mappings available

  irqchip number -> fd
  fd -> irqchip number

The first is just an array inside of kvm->arch. The second is the device attribute. We could probably even live with only the first. That gives us exactly the new layer of indirection I was talking about above.

So we know that it's possible with the API as we have it. It only needs to be extended by an ioctl that declares which fd each irqchip number belongs to. By default, id 0 is mapped to the first created PIC. That way we stay backwards compatible, but allow for multiple irqchips.
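
The two mappings described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: every demo_* name, the DEMO_NR_IRQCHIPS limit and the "set" helper standing in for the proposed ioctl are hypothetical and do not exist in KVM.

```c
#include <stddef.h>

#define DEMO_NR_IRQCHIPS 4	/* hypothetical limit, not a KVM constant */
#define DEMO_NO_FD (-1)

/* Hypothetical stand-in for the array that would live in kvm->arch. */
struct demo_irqchip_map {
	int fd[DEMO_NR_IRQCHIPS];	/* irqchip number -> device fd */
};

static void demo_map_init(struct demo_irqchip_map *m)
{
	for (size_t i = 0; i < DEMO_NR_IRQCHIPS; i++)
		m->fd[i] = DEMO_NO_FD;
}

/* On device creation: the first created PIC gets id 0 by default. */
static void demo_map_device_created(struct demo_irqchip_map *m, int fd)
{
	if (m->fd[0] == DEMO_NO_FD)
		m->fd[0] = fd;
}

/* What the proposed ioctl might do: bind an irqchip number to a device fd. */
static int demo_map_set(struct demo_irqchip_map *m, unsigned int chip, int fd)
{
	if (chip >= DEMO_NR_IRQCHIPS || fd < 0)
		return -1;
	m->fd[chip] = fd;
	return 0;
}

/* irqchip number -> fd (the array lookup) */
static int demo_map_lookup(const struct demo_irqchip_map *m, unsigned int chip)
{
	return chip < DEMO_NR_IRQCHIPS ? m->fd[chip] : DEMO_NO_FD;
}

/* fd -> irqchip number (the "device attribute" direction) */
static int demo_map_reverse(const struct demo_irqchip_map *m, int fd)
{
	if (fd < 0)
		return -1;
	for (size_t i = 0; i < DEMO_NR_IRQCHIPS; i++)
		if (m->fd[i] == fd)
			return (int)i;
	return -1;
}
```

Existing userspace that always passes irqchip 0 keeps working in this sketch, because the default binding is made at device creation and only changes if the new call is used.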

> 
>> > It looks like you only support having one type of irqchip built into
>> > the kernel at a time?  That may be OK for now given the existing
>> > restrictions for building KVM, but it should be noted as a
>> > limitation/TODO.
>> I don't think it's really hard to add support for another irqchip type. But we certainly have harder issues to tackle first before we end up in a situation where in-kernel MPIC and in-kernel XICS would make sense in the same kernel.
> 
> Not necessarily hard or hugely important, just looked odd putting global non-MPIC-specific functions in the MPIC file.  I tend to prefer getting the component separation (at least reasonably) correct from the start, so the person who would end up needing to refactor to fit their device in doesn't need to mess with MPIC code to do so.  Such a person might be unfamiliar with MPIC, have no easy way to test, etc.  Similar to the current situation with the IRQ routing code, at least before you took it over. :-)

The only person interested in generalizing that code for now would be Paul. He knows the MPIC quite well, so I doubt he'll have issues with it :).

All it takes really are simple multiplexing functions in front of the existing callbacks that call either MPIC or XICS code depending on the target irqchip.
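
A rough sketch of such a multiplexing function, in plain C so it can be compiled outside the kernel. All demo_* names are hypothetical; the real callbacks and their signatures are whatever the MPIC and XICS code end up exporting.

```c
/* Hypothetical irqchip types; not taken from the patch set. */
enum demo_irqchip_type { DEMO_IRQCHIP_MPIC, DEMO_IRQCHIP_XICS };

struct demo_irqchip {
	enum demo_irqchip_type type;
	int calls_mpic;		/* how often the MPIC backend ran */
	int calls_xics;		/* how often the XICS backend ran */
};

/* Stand-ins for the chip-specific callbacks. */
static int demo_mpic_set_irq(struct demo_irqchip *chip, int pin, int level)
{
	(void)pin;
	(void)level;
	chip->calls_mpic++;
	return 0;
}

static int demo_xics_set_irq(struct demo_irqchip *chip, int pin, int level)
{
	(void)pin;
	(void)level;
	chip->calls_xics++;
	return 0;
}

/* Generic entry point: dispatch on the target irqchip's type. */
static int demo_irqchip_set_irq(struct demo_irqchip *chip, int pin, int level)
{
	switch (chip->type) {
	case DEMO_IRQCHIP_MPIC:
		return demo_mpic_set_irq(chip, pin, level);
	case DEMO_IRQCHIP_XICS:
		return demo_xics_set_irq(chip, pin, level);
	}
	return -1;	/* unknown irqchip type */
}
```

A table of function pointers per irqchip would do the same job; the switch is used here only because it keeps the sketch short.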


Alex

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 15/17] KVM: PPC: Support irq routing and irqfd for in-kernel MPIC
  2013-04-19  1:09           ` Alexander Graf
@ 2013-04-19  1:37             ` Scott Wood
  -1 siblings, 0 replies; 128+ messages in thread
From: Scott Wood @ 2013-04-19  1:37 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Marcelo Tosatti, Gleb Natapov

On 04/18/2013 08:09:23 PM, Alexander Graf wrote:
> 
> On 19.04.2013, at 02:50, Scott Wood wrote:
> 
> > On 04/18/2013 07:15:46 PM, Alexander Graf wrote:
> >> On 18.04.2013, at 23:39, Scott Wood wrote:
> >> > On 04/18/2013 09:11:55 AM, Alexander Graf wrote:
> >> >> +static inline int irqchip_in_kernel(struct kvm *kvm)
> >> >> +{
> >> >> +	int ret = 0;
> >> >> +
> >> >> +#ifdef CONFIG_KVM_MPIC
> >> >> +	ret = ret || (kvm->arch.mpic != NULL);
> >> >> +#endif
> >> >> +	smp_rmb();
> >> >> +	return ret;
> >> >> +}
> >> >
> >> > Couldn't we just set a non-irqchip-specific bool?  Though eventually this
> >> > shouldn't be needed at all -- instead the check would be "does the
> >> > requested irqchip fd exist and refer to something that exposes an irqchip
> >> > interface"?
> >> How would we get the irqchip id?
> >
> > I was thinking it would come from whatever operation you're trying to
> > do, though I see that MSI routing doesn't specify an irqchip. :-P
> >
> > In that case I guess it would check for whether any MSI handler has
> > been registered.
> 
> I think you're over engineering here :). These are internal  
> interfaces that whoever wants to implement a secondary irqchip can  
> worry about. I merely wanted to make sure we don't block our road for  
> multiple irqchips in the kernel<->user interface from the beginning.  
> Internal ones are a different story :).

Well, I did say "eventually".  My more immediate reaction was to seeing  
"MPIC" in a place it doesn't need to be, hence the question about a  
simple bool.
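
For illustration, the "simple bool" could look like the following userspace C11 sketch. It is not the patch's code: the demo_* names are hypothetical, and the release store and acquire load stand in for the kernel's smp_wmb() on the writing side and smp_rmb() on the reading side.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-VM structure. */
struct demo_vm {
	int irqchip_state;		/* represents the irqchip's initialized data */
	atomic_bool irqchip_ready;	/* irqchip-agnostic flag, no "MPIC" in it */
};

/* Writer: finish initialization first, then publish the flag. */
static void demo_irqchip_publish(struct demo_vm *vm, int state)
{
	vm->irqchip_state = state;
	/* Release ordering plays the role of the wmb before the store. */
	atomic_store_explicit(&vm->irqchip_ready, true, memory_order_release);
}

/* Reader: the counterpart of irqchip_in_kernel(). */
static bool demo_irqchip_in_kernel(struct demo_vm *vm)
{
	/* Acquire ordering plays the role of the rmb after the load. */
	return atomic_load_explicit(&vm->irqchip_ready, memory_order_acquire);
}
```

A reader that sees the flag set is then guaranteed to also see the initialized state, which is the property the barrier pairing is meant to provide.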

> > Seems a bit heavyweight to add several new MPIC-specific routing types
> > -- maybe just one new KVM_IRQ_ROUTING type that lets multiple words be
> > used to describe the interrupt?
> 
> Well, we can add a single KVM_IRQ_ROUTING_MPIC type if that's better.  
> But we don't have to do it now. I dislike code that we can't test,  
> and I don't have a good test case for user space injected IPIs right  
> now :).

Sure, it doesn't need to be now, as long as it's clear we have the  
extensibility to do it if/when the need arises.  It likely will arise  
at least for error interrupts.

I'd rather see it not have MPIC in the name, though.  It's not  
MPIC-specific, just a new version of the struct with additional words  
for "pin".
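
One possible shape for such a struct, sketched in standalone C. The layout is loosely modeled on the existing routing entry (a fixed header followed by a 32-byte union), but the demo_* names, the DEMO_ROUTING_IRQCHIP_EXT type value and the meaning given to the pin words are assumptions made purely for illustration.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_ROUTING_IRQCHIP_EXT 100	/* hypothetical routing type value */
#define DEMO_PIN_WORDS 4		/* hypothetical number of "pin" words */

/* The existing single-word form: one irqchip index, one pin. */
struct demo_routing_irqchip {
	uint32_t irqchip;
	uint32_t pin;
};

/* The suggested variant: same idea, more words to describe the pin. */
struct demo_routing_irqchip_ext {
	uint32_t irqchip;
	uint32_t nr_words;		/* how many pin[] words are valid */
	uint32_t pin[DEMO_PIN_WORDS];
};

struct demo_routing_entry {
	uint32_t gsi;
	uint32_t type;
	uint32_t flags;
	uint32_t pad;
	union {
		struct demo_routing_irqchip irqchip;
		struct demo_routing_irqchip_ext irqchip_ext;
		uint32_t pad[8];	/* keeps the entry size fixed */
	} u;
};

/*
 * Fill in a two-word pin: word 0 names an interrupt class (for example
 * source, IPI, timer or error), word 1 the index within that class.
 * This encoding is an assumption, not something the thread agreed on.
 */
static void demo_route_ext(struct demo_routing_entry *e, uint32_t gsi,
			   uint32_t chip, uint32_t irq_class, uint32_t index)
{
	memset(e, 0, sizeof(*e));
	e->gsi = gsi;
	e->type = DEMO_ROUTING_IRQCHIP_EXT;
	e->u.irqchip_ext.irqchip = chip;
	e->u.irqchip_ext.nr_words = 2;
	e->u.irqchip_ext.pin[0] = irq_class;
	e->u.irqchip_ext.pin[1] = index;
}
```

Because the extended form still fits inside the union's padding, the entry size does not change in this sketch, so nothing about it is tied to the MPIC.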

> >> For simple injection we can always do an ioctl on the MPIC device.
> >
> > I got complaints for that originally. :-)
> 
> There are 2 reasons why direct ioctls on the MPIC device could be bad
> 
>   1) irqfd
>   2) code sharing
> 
> I'm not convinced yet we care about performance for MPIC IPI, TMR or  
> ERR interrupts. So irqfd is out of the question there.

It's out of the question because you're not yet convinced? :-)

I think I'm done being surprised by what users try to do (though they  
may prove me wrong...).  A different user complained about our non-KVM  
hypervisor because we didn't give guests direct access to the MPIC  
timers and IPIs, even though we provided adequate hypervisor-emulated  
mechanisms -- their application wanted to use hundreds of thousands of  
timers per second.  They would not listen when we suggested that  
perhaps they could rearchitect things to not be as reliant on timers.   
We ended up implementing direct MPIC timer and IPI access there as well.

As for error interrupts, no, they're not performance-critical, but we  
still probably want to use irqfd if we're using it for everything  
else.  Even if the code for userspace reflection is already there, do  
you really want errors to be the one thing using a special IRQ path  
that's rarely tested?  Beyond the extent to which it's necessary  
because of the specialness of the error interrupts in the hardware  
design.

> Code sharing only makes sense in areas where things are common. In  
> case of the MPIC, this is for SRC interrupts.
> 
> So I don't think there should be complaints here :).

If you ignore irqfd (which was covered in #1), there isn't much that is  
really common even for "SRC" interrupts (and what is common is  
artificially so), and I made that point, but still got complaints.  I  
was the one to point out that irqfd was the real reason to need to use  
the routing interface (and thus KVM_IRQ_LINE); they were more hung up  
on "why are you being different?".

> >> However, I'd be inclined to say that it's rather unlikely you'd want
> >> to have a vfio device's interrupt line hooked up to the IPI interrupt
> >> of your guest...
> >
> > Well, as I said, we've done it before for a customer (not with VFIO of
> > course, but our old internal-tree device assignment mechanism), so not
> > *that* unlikely.  The host was a two-core chip running Linux on only
> > one of the cores, and a custom OS on the other core.  They wanted to
> > communicate between the custom OS and a KVM guest.  Since host Linux
> > was only running on one core, it didn't need the IPIs for itself (and
> > beginning with e500mc, the host uses msgsnd instead, so there also the
> > host will not need the IPIs for itself).
> 
> They could just as well use a guest SRC line for that, no? What the  
> listener on the host connects to on the host's MPIC is pretty much  
> orthogonal to what we inject into the guest.

What, and make more modifications to their custom OS code? :-)

-Scott

^ permalink raw reply	[flat|nested] 128+ messages in thread

* [PATCH 00/17] KVM: PPC: In-kernel MPIC support with irqfd v3
  2013-04-18 14:11 ` Alexander Graf
@ 2013-04-19 14:06 ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

Hi,

This patch set contains a fully working implementation of the in-kernel MPIC
from Scott with a few fixups and a new version of my irqfd generalization
patch set.

v1 -> v2:

  - depend on CONFIG_ defines rather than __KVM defines
  - fix compile issues
  - fix the kvm_irqchip{,s} typo

v2 -> v3:

  - make mpic pointer type safe
  - add wmb before setting global mpic variable
  - make eoi notification happen without holding the lock
  - add IRQ routing documentation
  - announce mpic availability after its creation
  - fix pr_debug again

I have refrained from touching IA64 at all in this patch set. It's marked
as BROKEN, and I doubt it even compiles today. The only sensible thing
to do would be to remove all of the IA64 kvm code from the kernel tree, but
that is out of scope for this patch set and definitely should not gate it.


Alex

Alexander Graf (11):
  KVM: Add KVM_IRQCHIP_NUM_PINS in addition to KVM_IOAPIC_NUM_PINS
  KVM: Introduce CONFIG_HAVE_KVM_IRQ_ROUTING
  KVM: Drop __KVM_HAVE_IOAPIC condition on irq routing
  KVM: Remove kvm_get_intr_delivery_bitmask
  KVM: Move irq routing to generic code
  KVM: Extract generic irqchip logic into irqchip.c
  KVM: Move irq routing setup to irqchip.c
  KVM: Move irqfd resample cap handling to generic code
  KVM: PPC: Support irq routing and irqfd for in-kernel MPIC
  KVM: PPC: MPIC: Add support for KVM_IRQ_LINE
  KVM: PPC: MPIC: Restrict to e500 platforms

Scott Wood (6):
  kvm: add device control API
  kvm/ppc/mpic: import hw/openpic.c from QEMU
  kvm/ppc/mpic: remove some obviously unneeded code
  kvm/ppc/mpic: adapt to kernel style and environment
  kvm/ppc/mpic: in-kernel MPIC emulation
  kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC

 Documentation/virtual/kvm/api.txt          |   78 ++
 Documentation/virtual/kvm/devices/README   |    1 +
 Documentation/virtual/kvm/devices/mpic.txt |   48 +
 arch/powerpc/include/asm/kvm_host.h        |   24 +-
 arch/powerpc/include/asm/kvm_ppc.h         |   30 +
 arch/powerpc/include/uapi/asm/kvm.h        |    9 +
 arch/powerpc/kvm/Kconfig                   |   12 +
 arch/powerpc/kvm/Makefile                  |    3 +
 arch/powerpc/kvm/booke.c                   |   12 +-
 arch/powerpc/kvm/irq.h                     |   17 +
 arch/powerpc/kvm/mpic.c                    | 1876 ++++++++++++++++++++++++++++
 arch/powerpc/kvm/powerpc.c                 |   55 +-
 arch/x86/include/asm/kvm_host.h            |    2 +
 arch/x86/kvm/Kconfig                       |    1 +
 arch/x86/kvm/Makefile                      |    2 +-
 arch/x86/kvm/x86.c                         |    1 -
 include/linux/kvm_host.h                   |   53 +-
 include/trace/events/kvm.h                 |   12 +-
 include/uapi/linux/kvm.h                   |   33 +-
 virt/kvm/Kconfig                           |    3 +
 virt/kvm/assigned-dev.c                    |   30 -
 virt/kvm/eventfd.c                         |    6 +-
 virt/kvm/irq_comm.c                        |  194 +---
 virt/kvm/irqchip.c                         |  237 ++++
 virt/kvm/kvm_main.c                        |  170 +++-
 25 files changed, 2659 insertions(+), 250 deletions(-)
 create mode 100644 Documentation/virtual/kvm/devices/README
 create mode 100644 Documentation/virtual/kvm/devices/mpic.txt
 create mode 100644 arch/powerpc/kvm/irq.h
 create mode 100644 arch/powerpc/kvm/mpic.c
 create mode 100644 virt/kvm/irqchip.c


^ permalink raw reply	[flat|nested] 128+ messages in thread

* [PATCH 01/17] KVM: Add KVM_IRQCHIP_NUM_PINS in addition to KVM_IOAPIC_NUM_PINS
  2013-04-19 14:06 ` Alexander Graf
@ 2013-04-19 14:06   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

The concept of routing interrupt lines to an irqchip is not IOAPIC
specific. Every irqchip has a maximum number of pins that can be
linked to irq lines.

So let's add a new define that allows us to reuse generic code for
non-IOAPIC platforms.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/x86/include/asm/kvm_host.h |    2 ++
 include/linux/kvm_host.h        |    2 +-
 virt/kvm/irq_comm.c             |    2 +-
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 599f98b..f44c3fe 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -43,6 +43,8 @@
 #define KVM_PIO_PAGE_OFFSET 1
 #define KVM_COALESCED_MMIO_PAGE_OFFSET 2
 
+#define KVM_IRQCHIP_NUM_PINS  KVM_IOAPIC_NUM_PINS
+
 #define CR0_RESERVED_BITS                                               \
 	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
 			  | X86_CR0_ET | X86_CR0_NE | X86_CR0_WP | X86_CR0_AM \
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 93a5005..bf3b1dc 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -307,7 +307,7 @@ struct kvm_kernel_irq_routing_entry {
 #ifdef __KVM_HAVE_IOAPIC
 
 struct kvm_irq_routing_table {
-	int chip[KVM_NR_IRQCHIPS][KVM_IOAPIC_NUM_PINS];
+	int chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS];
 	struct kvm_kernel_irq_routing_entry *rt_entries;
 	u32 nr_rt_entries;
 	/*
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index 25ab480..7c0071d 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -480,7 +480,7 @@ int kvm_set_irq_routing(struct kvm *kvm,
 
 	new->nr_rt_entries = nr_rt_entries;
 	for (i = 0; i < 3; i++)
-		for (j = 0; j < KVM_IOAPIC_NUM_PINS; j++)
+		for (j = 0; j < KVM_IRQCHIP_NUM_PINS; j++)
 			new->chip[i][j] = -1;
 
 	for (i = 0; i < nr; ++i) {
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 02/17] KVM: Introduce CONFIG_HAVE_KVM_IRQ_ROUTING
  2013-04-19 14:06 ` Alexander Graf
@ 2013-04-19 14:06   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

Quite a bit of code in KVM has been conditionalized on availability of
IOAPIC emulation. However, most of it is generically applicable to
platforms that don't have an IOAPIC, but a different type of irq chip.

Make code that only relies on IRQ routing, not an APIC itself, depend on
CONFIG_HAVE_KVM_IRQ_ROUTING, so that we can reuse it later.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/x86/kvm/Kconfig     |    1 +
 include/linux/kvm_host.h |    6 +++---
 virt/kvm/Kconfig         |    3 +++
 virt/kvm/eventfd.c       |    6 +++---
 virt/kvm/kvm_main.c      |    2 +-
 5 files changed, 11 insertions(+), 7 deletions(-)
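
The gating pattern this patch relies on can be sketched outside the kernel
roughly as below. This is only an illustration: every demo_* name is
hypothetical, and CONFIG_HAVE_KVM_IRQ_ROUTING is assumed to be supplied by
the build (e.g. -DCONFIG_HAVE_KVM_IRQ_ROUTING) rather than by Kconfig.

```c
#include <assert.h>

/* Set only by the real implementation, so callers can observe which
 * variant was compiled in. Hypothetical, for illustration only. */
static int demo_irqfd_ready;

#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
/* Routing-capable build: do the actual setup work. */
static int demo_irqfd_init(void)
{
	demo_irqfd_ready = 1;
	return 0;
}
#else
/* No irq routing: provide a stub so callers need no #ifdef of their own. */
static inline int demo_irqfd_init(void)
{
	return 0;
}
#endif

/* The caller is identical in both configurations. */
static int demo_module_init(void)
{
	return demo_irqfd_init();
}
```

Built without the define, demo_module_init() still succeeds and simply does
nothing, which is the behaviour an architecture without irq routing would
presumably want.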

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 586f000..9d50efd 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -29,6 +29,7 @@ config KVM
 	select MMU_NOTIFIER
 	select ANON_INODES
 	select HAVE_KVM_IRQCHIP
+	select HAVE_KVM_IRQ_ROUTING
 	select HAVE_KVM_EVENTFD
 	select KVM_APIC_ARCHITECTURE
 	select KVM_ASYNC_PF
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index bf3b1dc..4215d4f 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -304,7 +304,7 @@ struct kvm_kernel_irq_routing_entry {
 	struct hlist_node link;
 };
 
-#ifdef __KVM_HAVE_IOAPIC
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 
 struct kvm_irq_routing_table {
 	int chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS];
@@ -432,7 +432,7 @@ void kvm_vcpu_uninit(struct kvm_vcpu *vcpu);
 int __must_check vcpu_load(struct kvm_vcpu *vcpu);
 void vcpu_put(struct kvm_vcpu *vcpu);
 
-#ifdef __KVM_HAVE_IOAPIC
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 int kvm_irqfd_init(void);
 void kvm_irqfd_exit(void);
 #else
@@ -957,7 +957,7 @@ static inline int mmu_notifier_retry(struct kvm *kvm, unsigned long mmu_seq)
 }
 #endif
 
-#ifdef KVM_CAP_IRQ_ROUTING
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 
 #define KVM_MAX_IRQ_ROUTES 1024
 
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index d01b24b..779262f 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -6,6 +6,9 @@ config HAVE_KVM
 config HAVE_KVM_IRQCHIP
        bool
 
+config HAVE_KVM_IRQ_ROUTING
+       bool
+
 config HAVE_KVM_EVENTFD
        bool
        select EVENTFD
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index c5d43ff..64ee720 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -35,7 +35,7 @@
 
 #include "iodev.h"
 
-#ifdef __KVM_HAVE_IOAPIC
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 /*
  * --------------------------------------------------------------------
  * irqfd: Allows an fd to be used to inject an interrupt to the guest
@@ -433,7 +433,7 @@ fail:
 void
 kvm_eventfd_init(struct kvm *kvm)
 {
-#ifdef __KVM_HAVE_IOAPIC
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 	spin_lock_init(&kvm->irqfds.lock);
 	INIT_LIST_HEAD(&kvm->irqfds.items);
 	INIT_LIST_HEAD(&kvm->irqfds.resampler_list);
@@ -442,7 +442,7 @@ kvm_eventfd_init(struct kvm *kvm)
 	INIT_LIST_HEAD(&kvm->ioeventfds);
 }
 
-#ifdef __KVM_HAVE_IOAPIC
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 /*
  * shutdown any irqfd's that match fd+gsi
  */
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index aaac1a7..2c3b226 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2404,7 +2404,7 @@ static long kvm_dev_ioctl_check_extension_generic(long arg)
 	case KVM_CAP_SIGNAL_MSI:
 #endif
 		return 1;
-#ifdef KVM_CAP_IRQ_ROUTING
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 	case KVM_CAP_IRQ_ROUTING:
 		return KVM_MAX_IRQ_ROUTES;
 #endif
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 03/17] KVM: Drop __KVM_HAVE_IOAPIC condition on irq routing
  2013-04-19 14:06 ` Alexander Graf
@ 2013-04-19 14:06   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

We have a capability enquiry system that allows user space to ask kvm
whether a feature is available.

The point behind this system is that we can have different kernel
configurations with different capabilities and user space can adjust
accordingly.

Because features can always be nonexistent, we can drop any #ifdefs
on CAP defines that could be used generically, like the irq routing
bits. These can be easily reused for non-IOAPIC systems as well.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 include/uapi/linux/kvm.h |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 74d0ff3..c741902 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -579,9 +579,7 @@ struct kvm_ppc_smmu_info {
 #ifdef __KVM_HAVE_PIT
 #define KVM_CAP_REINJECT_CONTROL 24
 #endif
-#ifdef __KVM_HAVE_IOAPIC
 #define KVM_CAP_IRQ_ROUTING 25
-#endif
 #define KVM_CAP_IRQ_INJECT_STATUS 26
 #ifdef __KVM_HAVE_DEVICE_ASSIGNMENT
 #define KVM_CAP_DEVICE_DEASSIGNMENT 27
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 04/17] KVM: Remove kvm_get_intr_delivery_bitmask
  2013-04-19 14:06 ` Alexander Graf
@ 2013-04-19 14:06   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

The prototype has been stale for a while; I can't spot any real function
definition behind it. Let's just remove it.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 include/linux/kvm_host.h |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 4215d4f..a7bfe9d 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -719,11 +719,6 @@ void kvm_unregister_irq_mask_notifier(struct kvm *kvm, int irq,
 void kvm_fire_mask_notifiers(struct kvm *kvm, unsigned irqchip, unsigned pin,
 			     bool mask);
 
-#ifdef __KVM_HAVE_IOAPIC
-void kvm_get_intr_delivery_bitmask(struct kvm_ioapic *ioapic,
-				   union kvm_ioapic_redirect_entry *entry,
-				   unsigned long *deliver_bitmask);
-#endif
 int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
 		bool line_status);
 int kvm_set_irq_inatomic(struct kvm *kvm, int irq_source_id, u32 irq, int level);
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 05/17] KVM: Move irq routing to generic code
  2013-04-19 14:06 ` Alexander Graf
@ 2013-04-19 14:06   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

The IRQ routing set ioctl lives in the hacky device assignment code inside
of KVM today. This is definitely the wrong place for it. Move it to the much
more natural kvm_main.c.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 virt/kvm/assigned-dev.c |   30 ------------------------------
 virt/kvm/kvm_main.c     |   30 ++++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+), 30 deletions(-)
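
To illustrate what the moved ioctl expects from user space, here is a rough
sketch that builds a routing table and passes it to KVM_SET_GSI_ROUTING. It
assumes the <linux/kvm.h> uapi header; the demo_* helpers and
DEMO_MAX_IRQ_ROUTES are hypothetical names, the latter mirroring the
kernel-internal KVM_MAX_IRQ_ROUTES value of 1024.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Assumption: mirrors KVM_MAX_IRQ_ROUTES from kvm_host.h, which is not
 * exported to user space. */
#define DEMO_MAX_IRQ_ROUTES 1024

/*
 * Build a table that maps GSI n to pin n of a single irqchip.
 * Returns NULL if nr would fail the kernel-side check
 * (nr >= KVM_MAX_IRQ_ROUTES) or if the allocation fails.
 */
static struct kvm_irq_routing *demo_build_routing(unsigned int nr,
						  unsigned int irqchip)
{
	struct kvm_irq_routing *r;
	unsigned int i;

	if (nr >= DEMO_MAX_IRQ_ROUTES)
		return NULL;

	/* Header plus nr trailing entries, zero-initialised. */
	r = calloc(1, sizeof(*r) + nr * sizeof(r->entries[0]));
	if (!r)
		return NULL;

	r->nr = nr;
	r->flags = 0;	/* the handler rejects non-zero flags with -EINVAL */
	for (i = 0; i < nr; i++) {
		r->entries[i].gsi = i;
		r->entries[i].type = KVM_IRQ_ROUTING_IRQCHIP;
		r->entries[i].u.irqchip.irqchip = irqchip;
		r->entries[i].u.irqchip.pin = i;
	}
	return r;
}

/* Hand the table to the kernel; vm_fd must come from KVM_CREATE_VM. */
static int demo_set_routing(int vm_fd, struct kvm_irq_routing *r)
{
	return ioctl(vm_fd, KVM_SET_GSI_ROUTING, r);
}
```

The kernel copies the header first and the entries second, which is why the
table is allocated as one contiguous block.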

diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
index f4c7f59..8db4370 100644
--- a/virt/kvm/assigned-dev.c
+++ b/virt/kvm/assigned-dev.c
@@ -983,36 +983,6 @@ long kvm_vm_ioctl_assigned_device(struct kvm *kvm, unsigned ioctl,
 			goto out;
 		break;
 	}
-#ifdef KVM_CAP_IRQ_ROUTING
-	case KVM_SET_GSI_ROUTING: {
-		struct kvm_irq_routing routing;
-		struct kvm_irq_routing __user *urouting;
-		struct kvm_irq_routing_entry *entries;
-
-		r = -EFAULT;
-		if (copy_from_user(&routing, argp, sizeof(routing)))
-			goto out;
-		r = -EINVAL;
-		if (routing.nr >= KVM_MAX_IRQ_ROUTES)
-			goto out;
-		if (routing.flags)
-			goto out;
-		r = -ENOMEM;
-		entries = vmalloc(routing.nr * sizeof(*entries));
-		if (!entries)
-			goto out;
-		r = -EFAULT;
-		urouting = argp;
-		if (copy_from_user(entries, urouting->entries,
-				   routing.nr * sizeof(*entries)))
-			goto out_free_irq_routing;
-		r = kvm_set_irq_routing(kvm, entries, routing.nr,
-					routing.flags);
-	out_free_irq_routing:
-		vfree(entries);
-		break;
-	}
-#endif /* KVM_CAP_IRQ_ROUTING */
 #ifdef __KVM_HAVE_MSIX
 	case KVM_ASSIGN_SET_MSIX_NR: {
 		struct kvm_assigned_msix_nr entry_nr;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2c3b226..b6f3354 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2274,6 +2274,36 @@ static long kvm_vm_ioctl(struct file *filp,
 		break;
 	}
 #endif
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
+	case KVM_SET_GSI_ROUTING: {
+		struct kvm_irq_routing routing;
+		struct kvm_irq_routing __user *urouting;
+		struct kvm_irq_routing_entry *entries;
+
+		r = -EFAULT;
+		if (copy_from_user(&routing, argp, sizeof(routing)))
+			goto out;
+		r = -EINVAL;
+		if (routing.nr >= KVM_MAX_IRQ_ROUTES)
+			goto out;
+		if (routing.flags)
+			goto out;
+		r = -ENOMEM;
+		entries = vmalloc(routing.nr * sizeof(*entries));
+		if (!entries)
+			goto out;
+		r = -EFAULT;
+		urouting = argp;
+		if (copy_from_user(entries, urouting->entries,
+				   routing.nr * sizeof(*entries)))
+			goto out_free_irq_routing;
+		r = kvm_set_irq_routing(kvm, entries, routing.nr,
+					routing.flags);
+	out_free_irq_routing:
+		vfree(entries);
+		break;
+	}
+#endif /* CONFIG_HAVE_KVM_IRQ_ROUTING */
 	default:
 		r = kvm_arch_vm_ioctl(filp, ioctl, arg);
 		if (r == -ENOTTY)
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 06/17] KVM: Extract generic irqchip logic into irqchip.c
  2013-04-19 14:06 ` Alexander Graf
@ 2013-04-19 14:06   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

The current irq_comm.c file contains pieces of code that are generic
across different irqchip implementations, as well as code that is
fully IOAPIC specific.

Split the generic bits out into irqchip.c.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/x86/kvm/Makefile      |    2 +-
 include/trace/events/kvm.h |   12 +++-
 virt/kvm/irq_comm.c        |  118 ----------------------------------
 virt/kvm/irqchip.c         |  152 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 163 insertions(+), 121 deletions(-)
 create mode 100644 virt/kvm/irqchip.c
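
As a reading aid for the kvm_set_irq() return convention that moves into
irqchip.c, the accumulation loop can be modelled in user space as below.
This is a simplified sketch, not the kernel code: demo_fold_irq_results()
is a hypothetical name and the per-route results are passed in as a plain
array.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fold per-route results the way the loop in kvm_set_irq() does:
 * negative results are skipped, non-negative ones are summed, and the
 * total stays -1 if no route accepted the interrupt.
 */
static int demo_fold_irq_results(const int *results, size_t n)
{
	int ret = -1;
	size_t i;

	assert(results != NULL || n == 0);

	for (i = 0; i < n; i++) {
		int r = results[i];

		if (r < 0)
			continue;

		ret = r + ((ret < 0) ? 0 : ret);
	}
	return ret;
}
```

The kernel walks its local copy of the routes in reverse order, but since
the results are only summed the order does not affect the return value.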

diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 04d3040..a797b8e 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -7,7 +7,7 @@ CFLAGS_vmx.o := -I.
 
 kvm-y			+= $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o \
 				coalesced_mmio.o irq_comm.o eventfd.o \
-				assigned-dev.o)
+				assigned-dev.o irqchip.o)
 kvm-$(CONFIG_IOMMU_API)	+= $(addprefix ../../../virt/kvm/, iommu.o)
 kvm-$(CONFIG_KVM_ASYNC_PF)	+= $(addprefix ../../../virt/kvm/, async_pf.o)
 
diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h
index 19911dd..7005d11 100644
--- a/include/trace/events/kvm.h
+++ b/include/trace/events/kvm.h
@@ -37,7 +37,7 @@ TRACE_EVENT(kvm_userspace_exit,
 		  __entry->errno < 0 ? -__entry->errno : __entry->reason)
 );
 
-#if defined(__KVM_HAVE_IRQ_LINE)
+#if defined(CONFIG_HAVE_KVM_IRQCHIP)
 TRACE_EVENT(kvm_set_irq,
 	TP_PROTO(unsigned int gsi, int level, int irq_source_id),
 	TP_ARGS(gsi, level, irq_source_id),
@@ -122,6 +122,10 @@ TRACE_EVENT(kvm_msi_set_irq,
 	{KVM_IRQCHIP_PIC_SLAVE,		"PIC slave"},		\
 	{KVM_IRQCHIP_IOAPIC,		"IOAPIC"}
 
+#endif /* defined(__KVM_HAVE_IOAPIC) */
+
+#if defined(CONFIG_HAVE_KVM_IRQCHIP)
+
 TRACE_EVENT(kvm_ack_irq,
 	TP_PROTO(unsigned int irqchip, unsigned int pin),
 	TP_ARGS(irqchip, pin),
@@ -136,14 +140,18 @@ TRACE_EVENT(kvm_ack_irq,
 		__entry->pin		= pin;
 	),
 
+#ifdef kvm_irqchips
 	TP_printk("irqchip %s pin %u",
 		  __print_symbolic(__entry->irqchip, kvm_irqchips),
 		 __entry->pin)
+#else
+	TP_printk("irqchip %d pin %u", __entry->irqchip, __entry->pin)
+#endif
 );
 
+#endif /* defined(CONFIG_HAVE_KVM_IRQCHIP) */
 
 
-#endif /* defined(__KVM_HAVE_IOAPIC) */
 
 #define KVM_TRACE_MMIO_READ_UNSATISFIED 0
 #define KVM_TRACE_MMIO_READ 1
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index 7c0071d..d5008f4 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -151,59 +151,6 @@ static int kvm_set_msi_inatomic(struct kvm_kernel_irq_routing_entry *e,
 		return -EWOULDBLOCK;
 }
 
-int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi)
-{
-	struct kvm_kernel_irq_routing_entry route;
-
-	if (!irqchip_in_kernel(kvm) || msi->flags != 0)
-		return -EINVAL;
-
-	route.msi.address_lo = msi->address_lo;
-	route.msi.address_hi = msi->address_hi;
-	route.msi.data = msi->data;
-
-	return kvm_set_msi(&route, kvm, KVM_USERSPACE_IRQ_SOURCE_ID, 1, false);
-}
-
-/*
- * Return value:
- *  < 0   Interrupt was ignored (masked or not delivered for other reasons)
- *  = 0   Interrupt was coalesced (previous irq is still pending)
- *  > 0   Number of CPUs interrupt was delivered to
- */
-int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
-		bool line_status)
-{
-	struct kvm_kernel_irq_routing_entry *e, irq_set[KVM_NR_IRQCHIPS];
-	int ret = -1, i = 0;
-	struct kvm_irq_routing_table *irq_rt;
-
-	trace_kvm_set_irq(irq, level, irq_source_id);
-
-	/* Not possible to detect if the guest uses the PIC or the
-	 * IOAPIC.  So set the bit in both. The guest will ignore
-	 * writes to the unused one.
-	 */
-	rcu_read_lock();
-	irq_rt = rcu_dereference(kvm->irq_routing);
-	if (irq < irq_rt->nr_rt_entries)
-		hlist_for_each_entry(e, &irq_rt->map[irq], link)
-			irq_set[i++] = *e;
-	rcu_read_unlock();
-
-	while(i--) {
-		int r;
-		r = irq_set[i].set(&irq_set[i], kvm, irq_source_id, level,
-				line_status);
-		if (r < 0)
-			continue;
-
-		ret = r + ((ret < 0) ? 0 : ret);
-	}
-
-	return ret;
-}
-
 /*
  * Deliver an IRQ in an atomic context if we can, or return a failure,
  * user can retry in a process context.
@@ -241,63 +188,6 @@ int kvm_set_irq_inatomic(struct kvm *kvm, int irq_source_id, u32 irq, int level)
 	return ret;
 }
 
-bool kvm_irq_has_notifier(struct kvm *kvm, unsigned irqchip, unsigned pin)
-{
-	struct kvm_irq_ack_notifier *kian;
-	int gsi;
-
-	rcu_read_lock();
-	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
-	if (gsi != -1)
-		hlist_for_each_entry_rcu(kian, &kvm->irq_ack_notifier_list,
-					 link)
-			if (kian->gsi == gsi) {
-				rcu_read_unlock();
-				return true;
-			}
-
-	rcu_read_unlock();
-
-	return false;
-}
-EXPORT_SYMBOL_GPL(kvm_irq_has_notifier);
-
-void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
-{
-	struct kvm_irq_ack_notifier *kian;
-	int gsi;
-
-	trace_kvm_ack_irq(irqchip, pin);
-
-	rcu_read_lock();
-	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
-	if (gsi != -1)
-		hlist_for_each_entry_rcu(kian, &kvm->irq_ack_notifier_list,
-					 link)
-			if (kian->gsi == gsi)
-				kian->irq_acked(kian);
-	rcu_read_unlock();
-}
-
-void kvm_register_irq_ack_notifier(struct kvm *kvm,
-				   struct kvm_irq_ack_notifier *kian)
-{
-	mutex_lock(&kvm->irq_lock);
-	hlist_add_head_rcu(&kian->link, &kvm->irq_ack_notifier_list);
-	mutex_unlock(&kvm->irq_lock);
-	kvm_vcpu_request_scan_ioapic(kvm);
-}
-
-void kvm_unregister_irq_ack_notifier(struct kvm *kvm,
-				    struct kvm_irq_ack_notifier *kian)
-{
-	mutex_lock(&kvm->irq_lock);
-	hlist_del_init_rcu(&kian->link);
-	mutex_unlock(&kvm->irq_lock);
-	synchronize_rcu();
-	kvm_vcpu_request_scan_ioapic(kvm);
-}
-
 int kvm_request_irq_source_id(struct kvm *kvm)
 {
 	unsigned long *bitmap = &kvm->arch.irq_sources_bitmap;
@@ -381,13 +271,6 @@ void kvm_fire_mask_notifiers(struct kvm *kvm, unsigned irqchip, unsigned pin,
 	rcu_read_unlock();
 }
 
-void kvm_free_irq_routing(struct kvm *kvm)
-{
-	/* Called only during vm destruction. Nobody can use the pointer
-	   at this stage */
-	kfree(kvm->irq_routing);
-}
-
 static int setup_routing_entry(struct kvm_irq_routing_table *rt,
 			       struct kvm_kernel_irq_routing_entry *e,
 			       const struct kvm_irq_routing_entry *ue)
@@ -451,7 +334,6 @@ out:
 	return r;
 }
 
-
 int kvm_set_irq_routing(struct kvm *kvm,
 			const struct kvm_irq_routing_entry *ue,
 			unsigned nr,
diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
new file mode 100644
index 0000000..12f7f26
--- /dev/null
+++ b/virt/kvm/irqchip.c
@@ -0,0 +1,152 @@
+/*
+ * irqchip.c: Common API for in kernel interrupt controllers
+ * Copyright (c) 2007, Intel Corporation.
+ * Copyright 2010 Red Hat, Inc. and/or its affiliates.
+ * Copyright (c) 2013, Alexander Graf <agraf@suse.de>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ *
+ * This file is derived from virt/kvm/irq_comm.c.
+ *
+ * Authors:
+ *   Yaozu (Eddie) Dong <Eddie.dong@intel.com>
+ *   Alexander Graf <agraf@suse.de>
+ */
+
+#include <linux/kvm_host.h>
+#include <linux/slab.h>
+#include <linux/export.h>
+#include <trace/events/kvm.h>
+#include "irq.h"
+
+bool kvm_irq_has_notifier(struct kvm *kvm, unsigned irqchip, unsigned pin)
+{
+	struct kvm_irq_ack_notifier *kian;
+	int gsi;
+
+	rcu_read_lock();
+	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
+	if (gsi != -1)
+		hlist_for_each_entry_rcu(kian, &kvm->irq_ack_notifier_list,
+					 link)
+			if (kian->gsi == gsi) {
+				rcu_read_unlock();
+				return true;
+			}
+
+	rcu_read_unlock();
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(kvm_irq_has_notifier);
+
+void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
+{
+	struct kvm_irq_ack_notifier *kian;
+	int gsi;
+
+	trace_kvm_ack_irq(irqchip, pin);
+
+	rcu_read_lock();
+	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
+	if (gsi != -1)
+		hlist_for_each_entry_rcu(kian, &kvm->irq_ack_notifier_list,
+					 link)
+			if (kian->gsi == gsi)
+				kian->irq_acked(kian);
+	rcu_read_unlock();
+}
+
+void kvm_register_irq_ack_notifier(struct kvm *kvm,
+				   struct kvm_irq_ack_notifier *kian)
+{
+	mutex_lock(&kvm->irq_lock);
+	hlist_add_head_rcu(&kian->link, &kvm->irq_ack_notifier_list);
+	mutex_unlock(&kvm->irq_lock);
+#ifdef __KVM_HAVE_IOAPIC
+	kvm_vcpu_request_scan_ioapic(kvm);
+#endif
+}
+
+void kvm_unregister_irq_ack_notifier(struct kvm *kvm,
+				    struct kvm_irq_ack_notifier *kian)
+{
+	mutex_lock(&kvm->irq_lock);
+	hlist_del_init_rcu(&kian->link);
+	mutex_unlock(&kvm->irq_lock);
+	synchronize_rcu();
+#ifdef __KVM_HAVE_IOAPIC
+	kvm_vcpu_request_scan_ioapic(kvm);
+#endif
+}
+
+int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi)
+{
+	struct kvm_kernel_irq_routing_entry route;
+
+	if (!irqchip_in_kernel(kvm) || msi->flags != 0)
+		return -EINVAL;
+
+	route.msi.address_lo = msi->address_lo;
+	route.msi.address_hi = msi->address_hi;
+	route.msi.data = msi->data;
+
+	return kvm_set_msi(&route, kvm, KVM_USERSPACE_IRQ_SOURCE_ID, 1, false);
+}
+
+/*
+ * Return value:
+ *  < 0   Interrupt was ignored (masked or not delivered for other reasons)
+ *  = 0   Interrupt was coalesced (previous irq is still pending)
+ *  > 0   Number of CPUs interrupt was delivered to
+ */
+int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
+		bool line_status)
+{
+	struct kvm_kernel_irq_routing_entry *e, irq_set[KVM_NR_IRQCHIPS];
+	int ret = -1, i = 0;
+	struct kvm_irq_routing_table *irq_rt;
+
+	trace_kvm_set_irq(irq, level, irq_source_id);
+
+	/* Not possible to detect if the guest uses the PIC or the
+	 * IOAPIC.  So set the bit in both. The guest will ignore
+	 * writes to the unused one.
+	 */
+	rcu_read_lock();
+	irq_rt = rcu_dereference(kvm->irq_routing);
+	if (irq < irq_rt->nr_rt_entries)
+		hlist_for_each_entry(e, &irq_rt->map[irq], link)
+			irq_set[i++] = *e;
+	rcu_read_unlock();
+
+	while(i--) {
+		int r;
+		r = irq_set[i].set(&irq_set[i], kvm, irq_source_id, level,
+				   line_status);
+		if (r < 0)
+			continue;
+
+		ret = r + ((ret < 0) ? 0 : ret);
+	}
+
+	return ret;
+}
+
+void kvm_free_irq_routing(struct kvm *kvm)
+{
+	/* Called only during vm destruction. Nobody can use the pointer
+	   at this stage */
+	kfree(kvm->irq_routing);
+}
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 06/17] KVM: Extract generic irqchip logic into irqchip.c
@ 2013-04-19 14:06   ` Alexander Graf
  0 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

The current irq_comm.c file contains pieces of code that are generic
across different irqchip implementations, as well as code that is
fully IOAPIC specific.

Split the generic bits out into irqchip.c.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
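Reader's note (not part of the patch): the return-value comment above
kvm_set_irq() is easier to follow with the folding loop isolated.  The
sketch below is a userspace model only.  It assumes the result of each
route's ->set() callback has already been collected in a plain int
array; fold_irq_results() and results[] are hypothetical names, not
kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model of the return-value folding done by kvm_set_irq().
 * results[] stands in for what each route's ->set() callback returned.
 */
int fold_irq_results(const int *results, size_t n)
{
	int ret = -1;	/* "ignored" until some route reports otherwise */

	assert(results != NULL || n == 0);

	while (n--) {
		int r = results[n];

		if (r < 0)
			continue;	/* this route ignored the interrupt */

		/* The first non-negative result replaces -1, later ones add up. */
		ret = r + ((ret < 0) ? 0 : ret);
	}

	return ret;
}
```

With results {1, -1, 2} the fold yields 3, and with only negative
results it stays at -1, matching the "< 0 / = 0 / > 0" table in the
comment.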
 arch/x86/kvm/Makefile      |    2 +-
 include/trace/events/kvm.h |   12 +++-
 virt/kvm/irq_comm.c        |  118 ----------------------------------
 virt/kvm/irqchip.c         |  152 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 163 insertions(+), 121 deletions(-)
 create mode 100644 virt/kvm/irqchip.c

diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 04d3040..a797b8e 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -7,7 +7,7 @@ CFLAGS_vmx.o := -I.
 
 kvm-y			+= $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o \
 				coalesced_mmio.o irq_comm.o eventfd.o \
-				assigned-dev.o)
+				assigned-dev.o irqchip.o)
 kvm-$(CONFIG_IOMMU_API)	+= $(addprefix ../../../virt/kvm/, iommu.o)
 kvm-$(CONFIG_KVM_ASYNC_PF)	+= $(addprefix ../../../virt/kvm/, async_pf.o)
 
diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h
index 19911dd..7005d11 100644
--- a/include/trace/events/kvm.h
+++ b/include/trace/events/kvm.h
@@ -37,7 +37,7 @@ TRACE_EVENT(kvm_userspace_exit,
 		  __entry->errno < 0 ? -__entry->errno : __entry->reason)
 );
 
-#if defined(__KVM_HAVE_IRQ_LINE)
+#if defined(CONFIG_HAVE_KVM_IRQCHIP)
 TRACE_EVENT(kvm_set_irq,
 	TP_PROTO(unsigned int gsi, int level, int irq_source_id),
 	TP_ARGS(gsi, level, irq_source_id),
@@ -122,6 +122,10 @@ TRACE_EVENT(kvm_msi_set_irq,
 	{KVM_IRQCHIP_PIC_SLAVE,		"PIC slave"},		\
 	{KVM_IRQCHIP_IOAPIC,		"IOAPIC"}
 
+#endif /* defined(__KVM_HAVE_IOAPIC) */
+
+#if defined(CONFIG_HAVE_KVM_IRQCHIP)
+
 TRACE_EVENT(kvm_ack_irq,
 	TP_PROTO(unsigned int irqchip, unsigned int pin),
 	TP_ARGS(irqchip, pin),
@@ -136,14 +140,18 @@ TRACE_EVENT(kvm_ack_irq,
 		__entry->pin		= pin;
 	),
 
+#ifdef kvm_irqchips
 	TP_printk("irqchip %s pin %u",
 		  __print_symbolic(__entry->irqchip, kvm_irqchips),
 		 __entry->pin)
+#else
+	TP_printk("irqchip %d pin %u", __entry->irqchip, __entry->pin)
+#endif
 );
 
+#endif /* defined(CONFIG_HAVE_KVM_IRQCHIP) */
 
 
-#endif /* defined(__KVM_HAVE_IOAPIC) */
 
 #define KVM_TRACE_MMIO_READ_UNSATISFIED 0
 #define KVM_TRACE_MMIO_READ 1
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index 7c0071d..d5008f4 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -151,59 +151,6 @@ static int kvm_set_msi_inatomic(struct kvm_kernel_irq_routing_entry *e,
 		return -EWOULDBLOCK;
 }
 
-int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi)
-{
-	struct kvm_kernel_irq_routing_entry route;
-
-	if (!irqchip_in_kernel(kvm) || msi->flags != 0)
-		return -EINVAL;
-
-	route.msi.address_lo = msi->address_lo;
-	route.msi.address_hi = msi->address_hi;
-	route.msi.data = msi->data;
-
-	return kvm_set_msi(&route, kvm, KVM_USERSPACE_IRQ_SOURCE_ID, 1, false);
-}
-
-/*
- * Return value:
- *  < 0   Interrupt was ignored (masked or not delivered for other reasons)
- *  = 0   Interrupt was coalesced (previous irq is still pending)
- *  > 0   Number of CPUs interrupt was delivered to
- */
-int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
-		bool line_status)
-{
-	struct kvm_kernel_irq_routing_entry *e, irq_set[KVM_NR_IRQCHIPS];
-	int ret = -1, i = 0;
-	struct kvm_irq_routing_table *irq_rt;
-
-	trace_kvm_set_irq(irq, level, irq_source_id);
-
-	/* Not possible to detect if the guest uses the PIC or the
-	 * IOAPIC.  So set the bit in both. The guest will ignore
-	 * writes to the unused one.
-	 */
-	rcu_read_lock();
-	irq_rt = rcu_dereference(kvm->irq_routing);
-	if (irq < irq_rt->nr_rt_entries)
-		hlist_for_each_entry(e, &irq_rt->map[irq], link)
-			irq_set[i++] = *e;
-	rcu_read_unlock();
-
-	while(i--) {
-		int r;
-		r = irq_set[i].set(&irq_set[i], kvm, irq_source_id, level,
-				line_status);
-		if (r < 0)
-			continue;
-
-		ret = r + ((ret < 0) ? 0 : ret);
-	}
-
-	return ret;
-}
-
 /*
  * Deliver an IRQ in an atomic context if we can, or return a failure,
  * user can retry in a process context.
@@ -241,63 +188,6 @@ int kvm_set_irq_inatomic(struct kvm *kvm, int irq_source_id, u32 irq, int level)
 	return ret;
 }
 
-bool kvm_irq_has_notifier(struct kvm *kvm, unsigned irqchip, unsigned pin)
-{
-	struct kvm_irq_ack_notifier *kian;
-	int gsi;
-
-	rcu_read_lock();
-	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
-	if (gsi != -1)
-		hlist_for_each_entry_rcu(kian, &kvm->irq_ack_notifier_list,
-					 link)
-			if (kian->gsi == gsi) {
-				rcu_read_unlock();
-				return true;
-			}
-
-	rcu_read_unlock();
-
-	return false;
-}
-EXPORT_SYMBOL_GPL(kvm_irq_has_notifier);
-
-void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
-{
-	struct kvm_irq_ack_notifier *kian;
-	int gsi;
-
-	trace_kvm_ack_irq(irqchip, pin);
-
-	rcu_read_lock();
-	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
-	if (gsi != -1)
-		hlist_for_each_entry_rcu(kian, &kvm->irq_ack_notifier_list,
-					 link)
-			if (kian->gsi == gsi)
-				kian->irq_acked(kian);
-	rcu_read_unlock();
-}
-
-void kvm_register_irq_ack_notifier(struct kvm *kvm,
-				   struct kvm_irq_ack_notifier *kian)
-{
-	mutex_lock(&kvm->irq_lock);
-	hlist_add_head_rcu(&kian->link, &kvm->irq_ack_notifier_list);
-	mutex_unlock(&kvm->irq_lock);
-	kvm_vcpu_request_scan_ioapic(kvm);
-}
-
-void kvm_unregister_irq_ack_notifier(struct kvm *kvm,
-				    struct kvm_irq_ack_notifier *kian)
-{
-	mutex_lock(&kvm->irq_lock);
-	hlist_del_init_rcu(&kian->link);
-	mutex_unlock(&kvm->irq_lock);
-	synchronize_rcu();
-	kvm_vcpu_request_scan_ioapic(kvm);
-}
-
 int kvm_request_irq_source_id(struct kvm *kvm)
 {
 	unsigned long *bitmap = &kvm->arch.irq_sources_bitmap;
@@ -381,13 +271,6 @@ void kvm_fire_mask_notifiers(struct kvm *kvm, unsigned irqchip, unsigned pin,
 	rcu_read_unlock();
 }
 
-void kvm_free_irq_routing(struct kvm *kvm)
-{
-	/* Called only during vm destruction. Nobody can use the pointer
-	   at this stage */
-	kfree(kvm->irq_routing);
-}
-
 static int setup_routing_entry(struct kvm_irq_routing_table *rt,
 			       struct kvm_kernel_irq_routing_entry *e,
 			       const struct kvm_irq_routing_entry *ue)
@@ -451,7 +334,6 @@ out:
 	return r;
 }
 
-
 int kvm_set_irq_routing(struct kvm *kvm,
 			const struct kvm_irq_routing_entry *ue,
 			unsigned nr,
diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
new file mode 100644
index 0000000..12f7f26
--- /dev/null
+++ b/virt/kvm/irqchip.c
@@ -0,0 +1,152 @@
+/*
+ * irqchip.c: Common API for in kernel interrupt controllers
+ * Copyright (c) 2007, Intel Corporation.
+ * Copyright 2010 Red Hat, Inc. and/or its affiliates.
+ * Copyright (c) 2013, Alexander Graf <agraf@suse.de>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ *
+ * This file is derived from virt/kvm/irq_comm.c.
+ *
+ * Authors:
+ *   Yaozu (Eddie) Dong <Eddie.dong@intel.com>
+ *   Alexander Graf <agraf@suse.de>
+ */
+
+#include <linux/kvm_host.h>
+#include <linux/slab.h>
+#include <linux/export.h>
+#include <trace/events/kvm.h>
+#include "irq.h"
+
+bool kvm_irq_has_notifier(struct kvm *kvm, unsigned irqchip, unsigned pin)
+{
+	struct kvm_irq_ack_notifier *kian;
+	int gsi;
+
+	rcu_read_lock();
+	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
+	if (gsi != -1)
+		hlist_for_each_entry_rcu(kian, &kvm->irq_ack_notifier_list,
+					 link)
+			if (kian->gsi == gsi) {
+				rcu_read_unlock();
+				return true;
+			}
+
+	rcu_read_unlock();
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(kvm_irq_has_notifier);
+
+void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
+{
+	struct kvm_irq_ack_notifier *kian;
+	int gsi;
+
+	trace_kvm_ack_irq(irqchip, pin);
+
+	rcu_read_lock();
+	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
+	if (gsi != -1)
+		hlist_for_each_entry_rcu(kian, &kvm->irq_ack_notifier_list,
+					 link)
+			if (kian->gsi == gsi)
+				kian->irq_acked(kian);
+	rcu_read_unlock();
+}
+
+void kvm_register_irq_ack_notifier(struct kvm *kvm,
+				   struct kvm_irq_ack_notifier *kian)
+{
+	mutex_lock(&kvm->irq_lock);
+	hlist_add_head_rcu(&kian->link, &kvm->irq_ack_notifier_list);
+	mutex_unlock(&kvm->irq_lock);
+#ifdef __KVM_HAVE_IOAPIC
+	kvm_vcpu_request_scan_ioapic(kvm);
+#endif
+}
+
+void kvm_unregister_irq_ack_notifier(struct kvm *kvm,
+				    struct kvm_irq_ack_notifier *kian)
+{
+	mutex_lock(&kvm->irq_lock);
+	hlist_del_init_rcu(&kian->link);
+	mutex_unlock(&kvm->irq_lock);
+	synchronize_rcu();
+#ifdef __KVM_HAVE_IOAPIC
+	kvm_vcpu_request_scan_ioapic(kvm);
+#endif
+}
+
+int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi)
+{
+	struct kvm_kernel_irq_routing_entry route;
+
+	if (!irqchip_in_kernel(kvm) || msi->flags != 0)
+		return -EINVAL;
+
+	route.msi.address_lo = msi->address_lo;
+	route.msi.address_hi = msi->address_hi;
+	route.msi.data = msi->data;
+
+	return kvm_set_msi(&route, kvm, KVM_USERSPACE_IRQ_SOURCE_ID, 1, false);
+}
+
+/*
+ * Return value:
+ *  < 0   Interrupt was ignored (masked or not delivered for other reasons)
+ *  = 0   Interrupt was coalesced (previous irq is still pending)
+ *  > 0   Number of CPUs interrupt was delivered to
+ */
+int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
+		bool line_status)
+{
+	struct kvm_kernel_irq_routing_entry *e, irq_set[KVM_NR_IRQCHIPS];
+	int ret = -1, i = 0;
+	struct kvm_irq_routing_table *irq_rt;
+
+	trace_kvm_set_irq(irq, level, irq_source_id);
+
+	/* Not possible to detect if the guest uses the PIC or the
+	 * IOAPIC.  So set the bit in both. The guest will ignore
+	 * writes to the unused one.
+	 */
+	rcu_read_lock();
+	irq_rt = rcu_dereference(kvm->irq_routing);
+	if (irq < irq_rt->nr_rt_entries)
+		hlist_for_each_entry(e, &irq_rt->map[irq], link)
+			irq_set[i++] = *e;
+	rcu_read_unlock();
+
+	while(i--) {
+		int r;
+		r = irq_set[i].set(&irq_set[i], kvm, irq_source_id, level,
+				   line_status);
+		if (r < 0)
+			continue;
+
+		ret = r + ((ret < 0) ? 0 : ret);
+	}
+
+	return ret;
+}
+
+void kvm_free_irq_routing(struct kvm *kvm)
+{
+	/* Called only during vm destruction. Nobody can use the pointer
+	   at this stage */
+	kfree(kvm->irq_routing);
+}
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 07/17] KVM: Move irq routing setup to irqchip.c
  2013-04-19 14:06 ` Alexander Graf
@ 2013-04-19 14:06   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

Setting up IRQ routes is not IOAPIC specific. Extract everything
that really is generic code into irqchip.c and leave only the IOAPIC
specific bits in irq_comm.c.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
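Reader's note (not part of the patch): kvm_set_irq_routing() sizes the
routing table from the highest GSI and carves the header, the per-GSI
list heads and the entry array out of one zeroed allocation.  The
sketch below models only that layout step in userspace, with calloc()
standing in for kzalloc().  struct head_model, struct entry_model,
struct table_model and table_model_alloc() are hypothetical stand-ins,
and the KVM_MAX_IRQ_ROUTES bound check is left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for struct hlist_head and struct kvm_kernel_irq_routing_entry. */
struct head_model {
	void *first;
};

struct entry_model {
	unsigned int gsi;
	unsigned int type;
};

struct table_model {
	struct entry_model *rt_entries;	/* points into the same allocation */
	unsigned int nr_rt_entries;	/* highest GSI + 1 */
	struct head_model map[];	/* one list head per GSI */
};

struct table_model *table_model_alloc(const unsigned int *gsis,
				      unsigned int nr)
{
	struct table_model *t;
	unsigned int i, nr_rt_entries = 0;

	assert(gsis != NULL || nr == 0);

	/* The map must be indexable by the largest GSI that is routed. */
	for (i = 0; i < nr; i++)
		if (gsis[i] > nr_rt_entries)
			nr_rt_entries = gsis[i];
	nr_rt_entries += 1;

	/* One zeroed block: header, then list heads, then entries. */
	t = calloc(1, sizeof(*t) + nr_rt_entries * sizeof(struct head_model)
		      + nr * sizeof(struct entry_model));
	if (!t)
		return NULL;

	/* The entry array starts right behind the last list head. */
	t->rt_entries = (void *)&t->map[nr_rt_entries];
	t->nr_rt_entries = nr_rt_entries;
	return t;
}
```

Because everything lives in one block, a single free() (kfree() in the
kernel) releases the whole table, which is what lets the real function
end on one shared kfree(new) for both the error path and the retired
old table.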
 include/linux/kvm_host.h |    3 ++
 virt/kvm/irq_comm.c      |   76 ++---------------------------------------
 virt/kvm/irqchip.c       |   85 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 91 insertions(+), 73 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a7bfe9d..dcef724 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -961,6 +961,9 @@ int kvm_set_irq_routing(struct kvm *kvm,
 			const struct kvm_irq_routing_entry *entries,
 			unsigned nr,
 			unsigned flags);
+int kvm_set_routing_entry(struct kvm_irq_routing_table *rt,
+			  struct kvm_kernel_irq_routing_entry *e,
+			  const struct kvm_irq_routing_entry *ue);
 void kvm_free_irq_routing(struct kvm *kvm);
 
 int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi);
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index d5008f4..e2e6b44 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -271,27 +271,14 @@ void kvm_fire_mask_notifiers(struct kvm *kvm, unsigned irqchip, unsigned pin,
 	rcu_read_unlock();
 }
 
-static int setup_routing_entry(struct kvm_irq_routing_table *rt,
-			       struct kvm_kernel_irq_routing_entry *e,
-			       const struct kvm_irq_routing_entry *ue)
+int kvm_set_routing_entry(struct kvm_irq_routing_table *rt,
+			  struct kvm_kernel_irq_routing_entry *e,
+			  const struct kvm_irq_routing_entry *ue)
 {
 	int r = -EINVAL;
 	int delta;
 	unsigned max_pin;
-	struct kvm_kernel_irq_routing_entry *ei;
 
-	/*
-	 * Do not allow GSI to be mapped to the same irqchip more than once.
-	 * Allow only one to one mapping between GSI and MSI.
-	 */
-	hlist_for_each_entry(ei, &rt->map[ue->gsi], link)
-		if (ei->type == KVM_IRQ_ROUTING_MSI ||
-		    ue->type == KVM_IRQ_ROUTING_MSI ||
-		    ue->u.irqchip.irqchip == ei->irqchip.irqchip)
-			return r;
-
-	e->gsi = ue->gsi;
-	e->type = ue->type;
 	switch (ue->type) {
 	case KVM_IRQ_ROUTING_IRQCHIP:
 		delta = 0;
@@ -328,68 +315,11 @@ static int setup_routing_entry(struct kvm_irq_routing_table *rt,
 		goto out;
 	}
 
-	hlist_add_head(&e->link, &rt->map[e->gsi]);
 	r = 0;
 out:
 	return r;
 }
 
-int kvm_set_irq_routing(struct kvm *kvm,
-			const struct kvm_irq_routing_entry *ue,
-			unsigned nr,
-			unsigned flags)
-{
-	struct kvm_irq_routing_table *new, *old;
-	u32 i, j, nr_rt_entries = 0;
-	int r;
-
-	for (i = 0; i < nr; ++i) {
-		if (ue[i].gsi >= KVM_MAX_IRQ_ROUTES)
-			return -EINVAL;
-		nr_rt_entries = max(nr_rt_entries, ue[i].gsi);
-	}
-
-	nr_rt_entries += 1;
-
-	new = kzalloc(sizeof(*new) + (nr_rt_entries * sizeof(struct hlist_head))
-		      + (nr * sizeof(struct kvm_kernel_irq_routing_entry)),
-		      GFP_KERNEL);
-
-	if (!new)
-		return -ENOMEM;
-
-	new->rt_entries = (void *)&new->map[nr_rt_entries];
-
-	new->nr_rt_entries = nr_rt_entries;
-	for (i = 0; i < 3; i++)
-		for (j = 0; j < KVM_IRQCHIP_NUM_PINS; j++)
-			new->chip[i][j] = -1;
-
-	for (i = 0; i < nr; ++i) {
-		r = -EINVAL;
-		if (ue->flags)
-			goto out;
-		r = setup_routing_entry(new, &new->rt_entries[i], ue);
-		if (r)
-			goto out;
-		++ue;
-	}
-
-	mutex_lock(&kvm->irq_lock);
-	old = kvm->irq_routing;
-	kvm_irq_routing_update(kvm, new);
-	mutex_unlock(&kvm->irq_lock);
-
-	synchronize_rcu();
-
-	new = old;
-	r = 0;
-
-out:
-	kfree(new);
-	return r;
-}
-
 #define IOAPIC_ROUTING_ENTRY(irq) \
 	{ .gsi = irq, .type = KVM_IRQ_ROUTING_IRQCHIP,	\
 	  .u.irqchip.irqchip = KVM_IRQCHIP_IOAPIC, .u.irqchip.pin = (irq) }
diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
index 12f7f26..20dc9e4 100644
--- a/virt/kvm/irqchip.c
+++ b/virt/kvm/irqchip.c
@@ -150,3 +150,88 @@ void kvm_free_irq_routing(struct kvm *kvm)
 	   at this stage */
 	kfree(kvm->irq_routing);
 }
+
+static int setup_routing_entry(struct kvm_irq_routing_table *rt,
+			       struct kvm_kernel_irq_routing_entry *e,
+			       const struct kvm_irq_routing_entry *ue)
+{
+	int r = -EINVAL;
+	struct kvm_kernel_irq_routing_entry *ei;
+
+	/*
+	 * Do not allow GSI to be mapped to the same irqchip more than once.
+	 * Allow only one to one mapping between GSI and MSI.
+	 */
+	hlist_for_each_entry(ei, &rt->map[ue->gsi], link)
+		if (ei->type == KVM_IRQ_ROUTING_MSI ||
+		    ue->type == KVM_IRQ_ROUTING_MSI ||
+		    ue->u.irqchip.irqchip == ei->irqchip.irqchip)
+			return r;
+
+	e->gsi = ue->gsi;
+	e->type = ue->type;
+	r = kvm_set_routing_entry(rt, e, ue);
+	if (r)
+		goto out;
+
+	hlist_add_head(&e->link, &rt->map[e->gsi]);
+	r = 0;
+out:
+	return r;
+}
+
+int kvm_set_irq_routing(struct kvm *kvm,
+			const struct kvm_irq_routing_entry *ue,
+			unsigned nr,
+			unsigned flags)
+{
+	struct kvm_irq_routing_table *new, *old;
+	u32 i, j, nr_rt_entries = 0;
+	int r;
+
+	for (i = 0; i < nr; ++i) {
+		if (ue[i].gsi >= KVM_MAX_IRQ_ROUTES)
+			return -EINVAL;
+		nr_rt_entries = max(nr_rt_entries, ue[i].gsi);
+	}
+
+	nr_rt_entries += 1;
+
+	new = kzalloc(sizeof(*new) + (nr_rt_entries * sizeof(struct hlist_head))
+		      + (nr * sizeof(struct kvm_kernel_irq_routing_entry)),
+		      GFP_KERNEL);
+
+	if (!new)
+		return -ENOMEM;
+
+	new->rt_entries = (void *)&new->map[nr_rt_entries];
+
+	new->nr_rt_entries = nr_rt_entries;
+	for (i = 0; i < KVM_NR_IRQCHIPS; i++)
+		for (j = 0; j < KVM_IRQCHIP_NUM_PINS; j++)
+			new->chip[i][j] = -1;
+
+	for (i = 0; i < nr; ++i) {
+		r = -EINVAL;
+		if (ue->flags)
+			goto out;
+		r = setup_routing_entry(new, &new->rt_entries[i], ue);
+		if (r)
+			goto out;
+		++ue;
+	}
+
+	mutex_lock(&kvm->irq_lock);
+	old = kvm->irq_routing;
+	kvm_irq_routing_update(kvm, new);
+	mutex_unlock(&kvm->irq_lock);
+
+	synchronize_rcu();
+
+	new = old;
+	r = 0;
+
+out:
+	kfree(new);
+	return r;
+}
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 07/17] KVM: Move irq routing setup to irqchip.c
@ 2013-04-19 14:06   ` Alexander Graf
  0 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

Setting up IRQ routes is not IOAPIC specific. Extract everything
that really is generic code into irqchip.c and leave only the IOAPIC
specific bits in irq_comm.c.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 include/linux/kvm_host.h |    3 ++
 virt/kvm/irq_comm.c      |   76 ++---------------------------------------
 virt/kvm/irqchip.c       |   85 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 91 insertions(+), 73 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a7bfe9d..dcef724 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -961,6 +961,9 @@ int kvm_set_irq_routing(struct kvm *kvm,
 			const struct kvm_irq_routing_entry *entries,
 			unsigned nr,
 			unsigned flags);
+int kvm_set_routing_entry(struct kvm_irq_routing_table *rt,
+			  struct kvm_kernel_irq_routing_entry *e,
+			  const struct kvm_irq_routing_entry *ue);
 void kvm_free_irq_routing(struct kvm *kvm);
 
 int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi);
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index d5008f4..e2e6b44 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -271,27 +271,14 @@ void kvm_fire_mask_notifiers(struct kvm *kvm, unsigned irqchip, unsigned pin,
 	rcu_read_unlock();
 }
 
-static int setup_routing_entry(struct kvm_irq_routing_table *rt,
-			       struct kvm_kernel_irq_routing_entry *e,
-			       const struct kvm_irq_routing_entry *ue)
+int kvm_set_routing_entry(struct kvm_irq_routing_table *rt,
+			  struct kvm_kernel_irq_routing_entry *e,
+			  const struct kvm_irq_routing_entry *ue)
 {
 	int r = -EINVAL;
 	int delta;
 	unsigned max_pin;
-	struct kvm_kernel_irq_routing_entry *ei;
 
-	/*
-	 * Do not allow GSI to be mapped to the same irqchip more than once.
-	 * Allow only one to one mapping between GSI and MSI.
-	 */
-	hlist_for_each_entry(ei, &rt->map[ue->gsi], link)
-		if (ei->type == KVM_IRQ_ROUTING_MSI ||
-		    ue->type == KVM_IRQ_ROUTING_MSI ||
-		    ue->u.irqchip.irqchip == ei->irqchip.irqchip)
-			return r;
-
-	e->gsi = ue->gsi;
-	e->type = ue->type;
 	switch (ue->type) {
 	case KVM_IRQ_ROUTING_IRQCHIP:
 		delta = 0;
@@ -328,68 +315,11 @@ static int setup_routing_entry(struct kvm_irq_routing_table *rt,
 		goto out;
 	}
 
-	hlist_add_head(&e->link, &rt->map[e->gsi]);
 	r = 0;
 out:
 	return r;
 }
 
-int kvm_set_irq_routing(struct kvm *kvm,
-			const struct kvm_irq_routing_entry *ue,
-			unsigned nr,
-			unsigned flags)
-{
-	struct kvm_irq_routing_table *new, *old;
-	u32 i, j, nr_rt_entries = 0;
-	int r;
-
-	for (i = 0; i < nr; ++i) {
-		if (ue[i].gsi >= KVM_MAX_IRQ_ROUTES)
-			return -EINVAL;
-		nr_rt_entries = max(nr_rt_entries, ue[i].gsi);
-	}
-
-	nr_rt_entries += 1;
-
-	new = kzalloc(sizeof(*new) + (nr_rt_entries * sizeof(struct hlist_head))
-		      + (nr * sizeof(struct kvm_kernel_irq_routing_entry)),
-		      GFP_KERNEL);
-
-	if (!new)
-		return -ENOMEM;
-
-	new->rt_entries = (void *)&new->map[nr_rt_entries];
-
-	new->nr_rt_entries = nr_rt_entries;
-	for (i = 0; i < 3; i++)
-		for (j = 0; j < KVM_IRQCHIP_NUM_PINS; j++)
-			new->chip[i][j] = -1;
-
-	for (i = 0; i < nr; ++i) {
-		r = -EINVAL;
-		if (ue->flags)
-			goto out;
-		r = setup_routing_entry(new, &new->rt_entries[i], ue);
-		if (r)
-			goto out;
-		++ue;
-	}
-
-	mutex_lock(&kvm->irq_lock);
-	old = kvm->irq_routing;
-	kvm_irq_routing_update(kvm, new);
-	mutex_unlock(&kvm->irq_lock);
-
-	synchronize_rcu();
-
-	new = old;
-	r = 0;
-
-out:
-	kfree(new);
-	return r;
-}
-
 #define IOAPIC_ROUTING_ENTRY(irq) \
 	{ .gsi = irq, .type = KVM_IRQ_ROUTING_IRQCHIP,	\
 	  .u.irqchip.irqchip = KVM_IRQCHIP_IOAPIC, .u.irqchip.pin = (irq) }
diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
index 12f7f26..20dc9e4 100644
--- a/virt/kvm/irqchip.c
+++ b/virt/kvm/irqchip.c
@@ -150,3 +150,88 @@ void kvm_free_irq_routing(struct kvm *kvm)
 	   at this stage */
 	kfree(kvm->irq_routing);
 }
+
+static int setup_routing_entry(struct kvm_irq_routing_table *rt,
+			       struct kvm_kernel_irq_routing_entry *e,
+			       const struct kvm_irq_routing_entry *ue)
+{
+	int r = -EINVAL;
+	struct kvm_kernel_irq_routing_entry *ei;
+
+	/*
+	 * Do not allow GSI to be mapped to the same irqchip more than once.
+	 * Allow only one to one mapping between GSI and MSI.
+	 */
+	hlist_for_each_entry(ei, &rt->map[ue->gsi], link)
+		if (ei->type == KVM_IRQ_ROUTING_MSI ||
+		    ue->type == KVM_IRQ_ROUTING_MSI ||
+		    ue->u.irqchip.irqchip == ei->irqchip.irqchip)
+			return r;
+
+	e->gsi = ue->gsi;
+	e->type = ue->type;
+	r = kvm_set_routing_entry(rt, e, ue);
+	if (r)
+		goto out;
+
+	hlist_add_head(&e->link, &rt->map[e->gsi]);
+	r = 0;
+out:
+	return r;
+}
+
+int kvm_set_irq_routing(struct kvm *kvm,
+			const struct kvm_irq_routing_entry *ue,
+			unsigned nr,
+			unsigned flags)
+{
+	struct kvm_irq_routing_table *new, *old;
+	u32 i, j, nr_rt_entries = 0;
+	int r;
+
+	for (i = 0; i < nr; ++i) {
+		if (ue[i].gsi >= KVM_MAX_IRQ_ROUTES)
+			return -EINVAL;
+		nr_rt_entries = max(nr_rt_entries, ue[i].gsi);
+	}
+
+	nr_rt_entries += 1;
+
+	new = kzalloc(sizeof(*new) + (nr_rt_entries * sizeof(struct hlist_head))
+		      + (nr * sizeof(struct kvm_kernel_irq_routing_entry)),
+		      GFP_KERNEL);
+
+	if (!new)
+		return -ENOMEM;
+
+	new->rt_entries = (void *)&new->map[nr_rt_entries];
+
+	new->nr_rt_entries = nr_rt_entries;
+	for (i = 0; i < KVM_NR_IRQCHIPS; i++)
+		for (j = 0; j < KVM_IRQCHIP_NUM_PINS; j++)
+			new->chip[i][j] = -1;
+
+	for (i = 0; i < nr; ++i) {
+		r = -EINVAL;
+		if (ue->flags)
+			goto out;
+		r = setup_routing_entry(new, &new->rt_entries[i], ue);
+		if (r)
+			goto out;
+		++ue;
+	}
+
+	mutex_lock(&kvm->irq_lock);
+	old = kvm->irq_routing;
+	kvm_irq_routing_update(kvm, new);
+	mutex_unlock(&kvm->irq_lock);
+
+	synchronize_rcu();
+
+	new = old;
+	r = 0;
+
+out:
+	kfree(new);
+	return r;
+}
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 08/17] KVM: Move irqfd resample cap handling to generic code
  2013-04-19 14:06 ` Alexander Graf
@ 2013-04-19 14:06   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

Now that we have most irqfd code completely platform agnostic, let's move
irqfd's resample capability return to generic code as well.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
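Reader's note (not part of the patch): from userspace the visible
effect of this change is which kernels answer positively when asked
for KVM_CAP_IRQFD_RESAMPLE.  The sketch below shows one way such a
probe might look, assuming a Linux host with <linux/kvm.h> installed.
kvm_check_cap() is a hypothetical helper name; KVM_CHECK_EXTENSION and
KVM_CAP_IRQFD_RESAMPLE are the existing uapi definitions.

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Ask KVM whether a capability is available.
 * Returns > 0 if supported, 0 if not, -1 (errno set) if the ioctl fails.
 */
int kvm_check_cap(int kvm_fd, long cap)
{
	assert(cap >= 0);

	return ioctl(kvm_fd, KVM_CHECK_EXTENSION, cap);
}
```

A caller would typically open /dev/kvm with O_RDWR, pass that fd
together with KVM_CAP_IRQFD_RESAMPLE, and only set up resample irqfds
when the result is positive.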
 arch/x86/kvm/x86.c  |    1 -
 virt/kvm/kvm_main.c |    3 +++
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 50e2e10..888d892 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2513,7 +2513,6 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_PCI_2_3:
 	case KVM_CAP_KVMCLOCK_CTRL:
 	case KVM_CAP_READONLY_MEM:
-	case KVM_CAP_IRQFD_RESAMPLE:
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b6f3354..f9492f3 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2433,6 +2433,9 @@ static long kvm_dev_ioctl_check_extension_generic(long arg)
 #ifdef CONFIG_HAVE_KVM_MSI
 	case KVM_CAP_SIGNAL_MSI:
 #endif
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
+	case KVM_CAP_IRQFD_RESAMPLE:
+#endif
 		return 1;
 #ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 	case KVM_CAP_IRQ_ROUTING:
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 09/17] kvm: add device control API
  2013-04-19 14:06 ` Alexander Graf
@ 2013-04-19 14:06   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

Currently, devices that are emulated inside KVM are configured in a
hardcoded manner based on an assumption that any given architecture
only has one way to do it.  If there's any need to access device state,
it is done through inflexible one-purpose-only IOCTLs (e.g.
KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
cumbersome and depletes a limited numberspace.

This API provides a mechanism to instantiate a device of a certain
type, returning an ID that can be used to set/get attributes of the
device.  Attributes may include configuration parameters (e.g.
register base address), device state, operational commands, etc.  It
is similar to the ONE_REG API, except that it acts on devices rather
than vcpus.

Both device types and individual attributes can be tested without having
to create the device or get/set the attribute, without the need for
separately managing enumerated capabilities.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
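Reader's note (not part of the patch): the "test without creating"
property described above can be exercised from userspace roughly as
follows.  This is a sketch under assumptions: it needs a Linux host
whose <linux/kvm.h> already carries the device control API, and the
helper names kvm_device_type_supported() and kvm_device_has_attr() are
hypothetical.  The device type, group and attribute numbers are left
to the caller because this patch does not define any.

```c
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Returns 1 if the kernel knows the device type, 0 otherwise.
 * KVM_CREATE_DEVICE_TEST asks for the check only, so no device is
 * instantiated.
 */
int kvm_device_type_supported(int vm_fd, __u32 type)
{
	struct kvm_create_device cd = {
		.type = type,
		.flags = KVM_CREATE_DEVICE_TEST,
	};

	return ioctl(vm_fd, KVM_CREATE_DEVICE, &cd) == 0;
}

/*
 * Returns 1 if the device implements the attribute, 0 otherwise.
 * "addr" stays zero because KVM_HAS_DEVICE_ATTR ignores it.
 */
int kvm_device_has_attr(int dev_fd, __u32 group, __u64 attr)
{
	struct kvm_device_attr da = {
		.group = group,
		.attr = attr,
	};

	return ioctl(dev_fd, KVM_HAS_DEVICE_ATTR, &da) == 0;
}
```

Both helpers fold every failure into 0; a caller that needs to tell
ENODEV or ENXIO apart from other errors would have to inspect errno
after the ioctl instead.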
 Documentation/virtual/kvm/api.txt        |   70 ++++++++++++++++
 Documentation/virtual/kvm/devices/README |    1 +
 include/linux/kvm_host.h                 |   35 ++++++++
 include/uapi/linux/kvm.h                 |   27 ++++++
 virt/kvm/kvm_main.c                      |  129 ++++++++++++++++++++++++++++++
 5 files changed, 262 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/virtual/kvm/devices/README

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 976eb65..d52f3f9 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents from the data
 written, then `n_invalid' invalid entries, invalidating any previously
 valid entries found.
 
+4.79 KVM_CREATE_DEVICE
+
+Capability: KVM_CAP_DEVICE_CTRL
+Type: vm ioctl
+Parameters: struct kvm_create_device (in/out)
+Returns: 0 on success, -1 on error
+Errors:
+  ENODEV: The device type is unknown or unsupported
+  EEXIST: Device already created, and this type of device may not
+          be instantiated multiple times
+
+  Other error conditions may be defined by individual device types or
+  have their standard meanings.
+
+Creates an emulated device in the kernel.  The file descriptor returned
+in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
+
+If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
+device type is supported (not necessarily whether it can be created
+in the current vm).
+
+Individual devices should not define flags.  Attributes should be used
+for specifying any behavior that is not implied by the device type
+number.
+
+struct kvm_create_device {
+	__u32	type;	/* in: KVM_DEV_TYPE_xxx */
+	__u32	fd;	/* out: device handle */
+	__u32	flags;	/* in: KVM_CREATE_DEVICE_xxx */
+};
+
+4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
+
+Capability: KVM_CAP_DEVICE_CTRL
+Type: device ioctl
+Parameters: struct kvm_device_attr
+Returns: 0 on success, -1 on error
+Errors:
+  ENXIO:  The group or attribute is unknown/unsupported for this device
+  EPERM:  The attribute cannot (currently) be accessed this way
+          (e.g. read-only attribute, or attribute that only makes
+          sense when the device is in a different state)
+
+  Other error conditions may be defined by individual device types.
+
+Gets/sets a specified piece of device configuration and/or state.  The
+semantics are device-specific.  See individual device documentation in
+the "devices" directory.  As with ONE_REG, the size of the data
+transferred is defined by the particular attribute.
+
+struct kvm_device_attr {
+	__u32	flags;		/* no flags currently defined */
+	__u32	group;		/* device-defined */
+	__u64	attr;		/* group-defined */
+	__u64	addr;		/* userspace address of attr data */
+};
+
+4.81 KVM_HAS_DEVICE_ATTR
+
+Capability: KVM_CAP_DEVICE_CTRL
+Type: device ioctl
+Parameters: struct kvm_device_attr
+Returns: 0 on success, -1 on error
+Errors:
+  ENXIO:  The group or attribute is unknown/unsupported for this device
+
+Tests whether a device supports a particular attribute.  A successful
+return indicates the attribute is implemented.  It does not necessarily
+indicate that the attribute can be read or written in the device's
+current state.  "addr" is ignored.
 
 4.77 KVM_ARM_VCPU_INIT
 
diff --git a/Documentation/virtual/kvm/devices/README b/Documentation/virtual/kvm/devices/README
new file mode 100644
index 0000000..34a6983
--- /dev/null
+++ b/Documentation/virtual/kvm/devices/README
@@ -0,0 +1 @@
+This directory contains specific device bindings for KVM_CAP_DEVICE_CTRL.
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index dcef724..6dab6b5 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1064,6 +1064,41 @@ static inline bool kvm_check_request(int req, struct kvm_vcpu *vcpu)
 
 extern bool kvm_rebooting;
 
+struct kvm_device_ops;
+
+struct kvm_device {
+	struct kvm_device_ops *ops;
+	struct kvm *kvm;
+	atomic_t users;
+	void *private;
+};
+
+/* create, destroy, and name are mandatory */
+struct kvm_device_ops {
+	const char *name;
+	int (*create)(struct kvm_device *dev, u32 type);
+
+	/*
+	 * Destroy is responsible for freeing dev.
+	 *
+	 * Destroy may be called before or after destructors are called
+	 * on emulated I/O regions, depending on whether a reference is
+	 * held by a vcpu or other kvm component that gets destroyed
+	 * after the emulated I/O.
+	 */
+	void (*destroy)(struct kvm_device *dev);
+
+	int (*set_attr)(struct kvm_device *dev, struct kvm_device_attr *attr);
+	int (*get_attr)(struct kvm_device *dev, struct kvm_device_attr *attr);
+	int (*has_attr)(struct kvm_device *dev, struct kvm_device_attr *attr);
+	long (*ioctl)(struct kvm_device *dev, unsigned int ioctl,
+		      unsigned long arg);
+};
+
+void kvm_device_get(struct kvm_device *dev);
+void kvm_device_put(struct kvm_device *dev);
+struct kvm_device *kvm_device_from_filp(struct file *filp);
+
 #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
 
 static inline void kvm_vcpu_set_in_spin_loop(struct kvm_vcpu *vcpu, bool val)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index c741902..be15aff 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -666,6 +666,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_PPC_EPR 86
 #define KVM_CAP_ARM_PSCI 87
 #define KVM_CAP_ARM_SET_DEVICE_ADDR 88
+#define KVM_CAP_DEVICE_CTRL 89
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -907,6 +908,32 @@ struct kvm_s390_ucas_mapping {
 #define KVM_ARM_SET_DEVICE_ADDR	  _IOW(KVMIO,  0xab, struct kvm_arm_device_addr)
 
 /*
+ * Device control API, available with KVM_CAP_DEVICE_CTRL
+ */
+#define KVM_CREATE_DEVICE_TEST		1
+
+struct kvm_create_device {
+	__u32	type;	/* in: KVM_DEV_TYPE_xxx */
+	__u32	fd;	/* out: device handle */
+	__u32	flags;	/* in: KVM_CREATE_DEVICE_xxx */
+};
+
+struct kvm_device_attr {
+	__u32	flags;		/* no flags currently defined */
+	__u32	group;		/* device-defined */
+	__u64	attr;		/* group-defined */
+	__u64	addr;		/* userspace address of attr data */
+};
+
+/* ioctl for vm fd */
+#define KVM_CREATE_DEVICE	  _IOWR(KVMIO,  0xe0, struct kvm_create_device)
+
+/* ioctls for fds returned by KVM_CREATE_DEVICE */
+#define KVM_SET_DEVICE_ATTR	  _IOW(KVMIO,  0xe1, struct kvm_device_attr)
+#define KVM_GET_DEVICE_ATTR	  _IOW(KVMIO,  0xe2, struct kvm_device_attr)
+#define KVM_HAS_DEVICE_ATTR	  _IOW(KVMIO,  0xe3, struct kvm_device_attr)
+
+/*
  * ioctls for vcpu fds
  */
 #define KVM_RUN                   _IO(KVMIO,   0x80)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f9492f3..5f0d78c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2159,6 +2159,117 @@ out:
 }
 #endif
 
+static int kvm_device_ioctl_attr(struct kvm_device *dev,
+				 int (*accessor)(struct kvm_device *dev,
+						 struct kvm_device_attr *attr),
+				 unsigned long arg)
+{
+	struct kvm_device_attr attr;
+
+	if (!accessor)
+		return -EPERM;
+
+	if (copy_from_user(&attr, (void __user *)arg, sizeof(attr)))
+		return -EFAULT;
+
+	return accessor(dev, &attr);
+}
+
+static long kvm_device_ioctl(struct file *filp, unsigned int ioctl,
+			     unsigned long arg)
+{
+	struct kvm_device *dev = filp->private_data;
+
+	switch (ioctl) {
+	case KVM_SET_DEVICE_ATTR:
+		return kvm_device_ioctl_attr(dev, dev->ops->set_attr, arg);
+	case KVM_GET_DEVICE_ATTR:
+		return kvm_device_ioctl_attr(dev, dev->ops->get_attr, arg);
+	case KVM_HAS_DEVICE_ATTR:
+		return kvm_device_ioctl_attr(dev, dev->ops->has_attr, arg);
+	default:
+		if (dev->ops->ioctl)
+			return dev->ops->ioctl(dev, ioctl, arg);
+
+		return -ENOTTY;
+	}
+}
+
+void kvm_device_get(struct kvm_device *dev)
+{
+	atomic_inc(&dev->users);
+}
+
+void kvm_device_put(struct kvm_device *dev)
+{
+	if (atomic_dec_and_test(&dev->users))
+		dev->ops->destroy(dev);
+}
+
+static int kvm_device_release(struct inode *inode, struct file *filp)
+{
+	struct kvm_device *dev = filp->private_data;
+	struct kvm *kvm = dev->kvm;
+
+	kvm_device_put(dev);
+	kvm_put_kvm(kvm);
+	return 0;
+}
+
+static const struct file_operations kvm_device_fops = {
+	.unlocked_ioctl = kvm_device_ioctl,
+	.release = kvm_device_release,
+};
+
+struct kvm_device *kvm_device_from_filp(struct file *filp)
+{
+	if (filp->f_op != &kvm_device_fops)
+		return NULL;
+
+	return filp->private_data;
+}
+
+static int kvm_ioctl_create_device(struct kvm *kvm,
+				   struct kvm_create_device *cd)
+{
+	struct kvm_device_ops *ops = NULL;
+	struct kvm_device *dev;
+	bool test = cd->flags & KVM_CREATE_DEVICE_TEST;
+	int ret;
+
+	switch (cd->type) {
+	default:
+		return -ENODEV;
+	}
+
+	if (test)
+		return 0;
+
+	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+	if (!dev)
+		return -ENOMEM;
+
+	dev->ops = ops;
+	dev->kvm = kvm;
+	atomic_set(&dev->users, 1);
+
+	ret = ops->create(dev, cd->type);
+	if (ret < 0) {
+		kfree(dev);
+		return ret;
+	}
+
+	ret = anon_inode_getfd(ops->name, &kvm_device_fops, dev, O_RDWR);
+	if (ret < 0) {
+		ops->destroy(dev);
+		return ret;
+	}
+
+	kvm_get_kvm(kvm);
+	cd->fd = ret;
+	return 0;
+}
+
 static long kvm_vm_ioctl(struct file *filp,
 			   unsigned int ioctl, unsigned long arg)
 {
@@ -2304,6 +2415,24 @@ static long kvm_vm_ioctl(struct file *filp,
 		break;
 	}
 #endif /* CONFIG_HAVE_KVM_IRQ_ROUTING */
+	case KVM_CREATE_DEVICE: {
+		struct kvm_create_device cd;
+
+		r = -EFAULT;
+		if (copy_from_user(&cd, argp, sizeof(cd)))
+			goto out;
+
+		r = kvm_ioctl_create_device(kvm, &cd);
+		if (r)
+			goto out;
+
+		r = -EFAULT;
+		if (copy_to_user(argp, &cd, sizeof(cd)))
+			goto out;
+
+		r = 0;
+		break;
+	}
 	default:
 		r = kvm_arch_vm_ioctl(filp, ioctl, arg);
 		if (r == -ENOTTY)
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 128+ messages in thread
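
For readers following the dispatch path in kvm_device_ioctl() above, here
is a minimal user-space sketch of the same control flow. It is not kernel
code: every model_* and demo_* name is a hypothetical stand-in, the
copy_from_user() step is omitted, and the patch's fallback to an optional
ops->ioctl hook is reduced to a plain -ENOTTY. It only illustrates that a
missing accessor yields -EPERM, while an unknown group or attribute is
reported by the device itself as -ENXIO.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space mirror of struct kvm_device_attr. */
struct model_attr {
	uint32_t flags;
	uint32_t group;
	uint64_t attr;
	uint64_t addr;	/* address of the attribute data */
};

struct model_dev;
typedef int (*model_accessor)(struct model_dev *dev, struct model_attr *attr);

/* Hypothetical subset of struct kvm_device_ops: accessors are optional. */
struct model_ops {
	model_accessor set_attr;
	model_accessor get_attr;
	model_accessor has_attr;
};

struct model_dev {
	const struct model_ops *ops;
};

enum model_cmd { MODEL_SET_ATTR, MODEL_GET_ATTR, MODEL_HAS_ATTR, MODEL_OTHER };

/* Mirrors kvm_device_ioctl_attr(): no accessor means -EPERM. */
static int model_ioctl_attr(struct model_dev *dev, model_accessor accessor,
			    struct model_attr *attr)
{
	if (!accessor)
		return -EPERM;

	return accessor(dev, attr);
}

/* Mirrors the switch in kvm_device_ioctl(), minus the ops->ioctl hook. */
static int model_ioctl(struct model_dev *dev, enum model_cmd cmd,
		       struct model_attr *attr)
{
	assert(dev && dev->ops);

	switch (cmd) {
	case MODEL_SET_ATTR:
		return model_ioctl_attr(dev, dev->ops->set_attr, attr);
	case MODEL_GET_ATTR:
		return model_ioctl_attr(dev, dev->ops->get_attr, attr);
	case MODEL_HAS_ATTR:
		return model_ioctl_attr(dev, dev->ops->has_attr, attr);
	default:
		return -ENOTTY;
	}
}

/* Demo device: a single attribute (group 0, attr 0). */
static int demo_has_attr(struct model_dev *dev, struct model_attr *attr)
{
	(void)dev;
	return (attr->group == 0 && attr->attr == 0) ? 0 : -ENXIO;
}

static int demo_get_attr(struct model_dev *dev, struct model_attr *attr)
{
	int ret = demo_has_attr(dev, attr);

	if (ret)
		return ret;

	/* Store an arbitrary demo value at the caller-supplied address. */
	*(uint32_t *)(uintptr_t)attr->addr = 0x1234;
	return 0;
}

/* No set_attr: the demo device cannot be written through this path. */
static const struct model_ops demo_ops = {
	.get_attr = demo_get_attr,
	.has_attr = demo_has_attr,
};
```

With demo_ops, a MODEL_SET_ATTR request fails with -EPERM before the
device is consulted at all, whereas MODEL_HAS_ATTR on an unknown attribute
reaches the device and comes back as -ENXIO.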

* [PATCH 09/17] kvm: add device control API
@ 2013-04-19 14:06   ` Alexander Graf
  0 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

Currently, devices that are emulated inside KVM are configured in a
hardcoded manner based on an assumption that any given architecture
only has one way to do it.  If there's any need to access device state,
it is done through inflexible one-purpose-only IOCTLs (e.g.
KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
cumbersome and depletes a limited numberspace.

This API provides a mechanism to instantiate a device of a certain
type, returning an ID that can be used to set/get attributes of the
device.  Attributes may include configuration parameters (e.g.
register base address), device state, operational commands, etc.  It
is similar to the ONE_REG API, except that it acts on devices rather
than vcpus.

Both device types and individual attributes can be tested for support
without creating the device or getting/setting the attribute, and
without separately managing enumerated capabilities.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 Documentation/virtual/kvm/api.txt        |   70 ++++++++++++++++
 Documentation/virtual/kvm/devices/README |    1 +
 include/linux/kvm_host.h                 |   35 ++++++++
 include/uapi/linux/kvm.h                 |   27 ++++++
 virt/kvm/kvm_main.c                      |  129 ++++++++++++++++++++++++++++++
 5 files changed, 262 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/virtual/kvm/devices/README

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 976eb65..d52f3f9 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents from the data
 written, then `n_invalid' invalid entries, invalidating any previously
 valid entries found.
 
+4.79 KVM_CREATE_DEVICE
+
+Capability: KVM_CAP_DEVICE_CTRL
+Type: vm ioctl
+Parameters: struct kvm_create_device (in/out)
+Returns: 0 on success, -1 on error
+Errors:
+  ENODEV: The device type is unknown or unsupported
+  EEXIST: Device already created, and this type of device may not
+          be instantiated multiple times
+
+  Other error conditions may be defined by individual device types or
+  have their standard meanings.
+
+Creates an emulated device in the kernel.  The file descriptor returned
+in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
+
+If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
+device type is supported (not necessarily whether it can be created
+in the current vm).
+
+Individual devices should not define flags.  Attributes should be used
+for specifying any behavior that is not implied by the device type
+number.
+
+struct kvm_create_device {
+	__u32	type;	/* in: KVM_DEV_TYPE_xxx */
+	__u32	fd;	/* out: device handle */
+	__u32	flags;	/* in: KVM_CREATE_DEVICE_xxx */
+};
+
+4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
+
+Capability: KVM_CAP_DEVICE_CTRL
+Type: device ioctl
+Parameters: struct kvm_device_attr
+Returns: 0 on success, -1 on error
+Errors:
+  ENXIO:  The group or attribute is unknown/unsupported for this device
+  EPERM:  The attribute cannot (currently) be accessed this way
+          (e.g. read-only attribute, or attribute that only makes
+          sense when the device is in a different state)
+
+  Other error conditions may be defined by individual device types.
+
+Gets/sets a specified piece of device configuration and/or state.  The
+semantics are device-specific.  See individual device documentation in
+the "devices" directory.  As with ONE_REG, the size of the data
+transferred is defined by the particular attribute.
+
+struct kvm_device_attr {
+	__u32	flags;		/* no flags currently defined */
+	__u32	group;		/* device-defined */
+	__u64	attr;		/* group-defined */
+	__u64	addr;		/* userspace address of attr data */
+};
+
+4.81 KVM_HAS_DEVICE_ATTR
+
+Capability: KVM_CAP_DEVICE_CTRL
+Type: device ioctl
+Parameters: struct kvm_device_attr
+Returns: 0 on success, -1 on error
+Errors:
+  ENXIO:  The group or attribute is unknown/unsupported for this device
+
+Tests whether a device supports a particular attribute.  A successful
+return indicates the attribute is implemented.  It does not necessarily
+indicate that the attribute can be read or written in the device's
+current state.  "addr" is ignored.
 
 4.77 KVM_ARM_VCPU_INIT
 
diff --git a/Documentation/virtual/kvm/devices/README b/Documentation/virtual/kvm/devices/README
new file mode 100644
index 0000000..34a6983
--- /dev/null
+++ b/Documentation/virtual/kvm/devices/README
@@ -0,0 +1 @@
+This directory contains specific device bindings for KVM_CAP_DEVICE_CTRL.
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index dcef724..6dab6b5 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1064,6 +1064,41 @@ static inline bool kvm_check_request(int req, struct kvm_vcpu *vcpu)
 
 extern bool kvm_rebooting;
 
+struct kvm_device_ops;
+
+struct kvm_device {
+	struct kvm_device_ops *ops;
+	struct kvm *kvm;
+	atomic_t users;
+	void *private;
+};
+
+/* create, destroy, and name are mandatory */
+struct kvm_device_ops {
+	const char *name;
+	int (*create)(struct kvm_device *dev, u32 type);
+
+	/*
+	 * Destroy is responsible for freeing dev.
+	 *
+	 * Destroy may be called before or after destructors are called
+	 * on emulated I/O regions, depending on whether a reference is
+	 * held by a vcpu or other kvm component that gets destroyed
+	 * after the emulated I/O.
+	 */
+	void (*destroy)(struct kvm_device *dev);
+
+	int (*set_attr)(struct kvm_device *dev, struct kvm_device_attr *attr);
+	int (*get_attr)(struct kvm_device *dev, struct kvm_device_attr *attr);
+	int (*has_attr)(struct kvm_device *dev, struct kvm_device_attr *attr);
+	long (*ioctl)(struct kvm_device *dev, unsigned int ioctl,
+		      unsigned long arg);
+};
+
+void kvm_device_get(struct kvm_device *dev);
+void kvm_device_put(struct kvm_device *dev);
+struct kvm_device *kvm_device_from_filp(struct file *filp);
+
 #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
 
 static inline void kvm_vcpu_set_in_spin_loop(struct kvm_vcpu *vcpu, bool val)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index c741902..be15aff 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -666,6 +666,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_PPC_EPR 86
 #define KVM_CAP_ARM_PSCI 87
 #define KVM_CAP_ARM_SET_DEVICE_ADDR 88
+#define KVM_CAP_DEVICE_CTRL 89
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -907,6 +908,32 @@ struct kvm_s390_ucas_mapping {
 #define KVM_ARM_SET_DEVICE_ADDR	  _IOW(KVMIO,  0xab, struct kvm_arm_device_addr)
 
 /*
+ * Device control API, available with KVM_CAP_DEVICE_CTRL
+ */
+#define KVM_CREATE_DEVICE_TEST		1
+
+struct kvm_create_device {
+	__u32	type;	/* in: KVM_DEV_TYPE_xxx */
+	__u32	fd;	/* out: device handle */
+	__u32	flags;	/* in: KVM_CREATE_DEVICE_xxx */
+};
+
+struct kvm_device_attr {
+	__u32	flags;		/* no flags currently defined */
+	__u32	group;		/* device-defined */
+	__u64	attr;		/* group-defined */
+	__u64	addr;		/* userspace address of attr data */
+};
+
+/* ioctl for vm fd */
+#define KVM_CREATE_DEVICE	  _IOWR(KVMIO,  0xe0, struct kvm_create_device)
+
+/* ioctls for fds returned by KVM_CREATE_DEVICE */
+#define KVM_SET_DEVICE_ATTR	  _IOW(KVMIO,  0xe1, struct kvm_device_attr)
+#define KVM_GET_DEVICE_ATTR	  _IOW(KVMIO,  0xe2, struct kvm_device_attr)
+#define KVM_HAS_DEVICE_ATTR	  _IOW(KVMIO,  0xe3, struct kvm_device_attr)
+
+/*
  * ioctls for vcpu fds
  */
 #define KVM_RUN                   _IO(KVMIO,   0x80)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f9492f3..5f0d78c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2159,6 +2159,117 @@ out:
 }
 #endif
 
+static int kvm_device_ioctl_attr(struct kvm_device *dev,
+				 int (*accessor)(struct kvm_device *dev,
+						 struct kvm_device_attr *attr),
+				 unsigned long arg)
+{
+	struct kvm_device_attr attr;
+
+	if (!accessor)
+		return -EPERM;
+
+	if (copy_from_user(&attr, (void __user *)arg, sizeof(attr)))
+		return -EFAULT;
+
+	return accessor(dev, &attr);
+}
+
+static long kvm_device_ioctl(struct file *filp, unsigned int ioctl,
+			     unsigned long arg)
+{
+	struct kvm_device *dev = filp->private_data;
+
+	switch (ioctl) {
+	case KVM_SET_DEVICE_ATTR:
+		return kvm_device_ioctl_attr(dev, dev->ops->set_attr, arg);
+	case KVM_GET_DEVICE_ATTR:
+		return kvm_device_ioctl_attr(dev, dev->ops->get_attr, arg);
+	case KVM_HAS_DEVICE_ATTR:
+		return kvm_device_ioctl_attr(dev, dev->ops->has_attr, arg);
+	default:
+		if (dev->ops->ioctl)
+			return dev->ops->ioctl(dev, ioctl, arg);
+
+		return -ENOTTY;
+	}
+}
+
+void kvm_device_get(struct kvm_device *dev)
+{
+	atomic_inc(&dev->users);
+}
+
+void kvm_device_put(struct kvm_device *dev)
+{
+	if (atomic_dec_and_test(&dev->users))
+		dev->ops->destroy(dev);
+}
+
+static int kvm_device_release(struct inode *inode, struct file *filp)
+{
+	struct kvm_device *dev = filp->private_data;
+	struct kvm *kvm = dev->kvm;
+
+	kvm_device_put(dev);
+	kvm_put_kvm(kvm);
+	return 0;
+}
+
+static const struct file_operations kvm_device_fops = {
+	.unlocked_ioctl = kvm_device_ioctl,
+	.release = kvm_device_release,
+};
+
+struct kvm_device *kvm_device_from_filp(struct file *filp)
+{
+	if (filp->f_op != &kvm_device_fops)
+		return NULL;
+
+	return filp->private_data;
+}
+
+static int kvm_ioctl_create_device(struct kvm *kvm,
+				   struct kvm_create_device *cd)
+{
+	struct kvm_device_ops *ops = NULL;
+	struct kvm_device *dev;
+	bool test = cd->flags & KVM_CREATE_DEVICE_TEST;
+	int ret;
+
+	switch (cd->type) {
+	default:
+		return -ENODEV;
+	}
+
+	if (test)
+		return 0;
+
+	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+	if (!dev)
+		return -ENOMEM;
+
+	dev->ops = ops;
+	dev->kvm = kvm;
+	atomic_set(&dev->users, 1);
+
+	ret = ops->create(dev, cd->type);
+	if (ret < 0) {
+		kfree(dev);
+		return ret;
+	}
+
+	ret = anon_inode_getfd(ops->name, &kvm_device_fops, dev, O_RDWR);
+	if (ret < 0) {
+		ops->destroy(dev);
+		return ret;
+	}
+
+	kvm_get_kvm(kvm);
+	cd->fd = ret;
+	return 0;
+}
+
 static long kvm_vm_ioctl(struct file *filp,
 			   unsigned int ioctl, unsigned long arg)
 {
@@ -2304,6 +2415,24 @@ static long kvm_vm_ioctl(struct file *filp,
 		break;
 	}
 #endif /* CONFIG_HAVE_KVM_IRQ_ROUTING */
+	case KVM_CREATE_DEVICE: {
+		struct kvm_create_device cd;
+
+		r = -EFAULT;
+		if (copy_from_user(&cd, argp, sizeof(cd)))
+			goto out;
+
+		r = kvm_ioctl_create_device(kvm, &cd);
+		if (r)
+			goto out;
+
+		r = -EFAULT;
+		if (copy_to_user(argp, &cd, sizeof(cd)))
+			goto out;
+
+		r = 0;
+		break;
+	}
 	default:
 		r = kvm_arch_vm_ioctl(filp, ioctl, arg);
 		if (r == -ENOTTY)
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 128+ messages in thread
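
The reference counting in kvm_device_get()/kvm_device_put() above can be
sketched in user space as follows. This is only a model under stated
assumptions: the model_* names are hypothetical, C11 atomics stand in for
the kernel's atomic_t, the destroy_count hook exists purely so the effect
can be observed, and model_device_put() returns a bool where the patch's
function returns void. As in the patch, the device starts with one
reference (the one owned by the file descriptor) and destroy() is
responsible for freeing it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct model_device;

struct model_device_ops {
	/* As in the patch, destroy is responsible for freeing dev. */
	void (*destroy)(struct model_device *dev);
};

struct model_device {
	const struct model_device_ops *ops;
	atomic_int users;
	int *destroy_count;	/* hypothetical hook to observe destroy */
};

/* Take an extra reference, e.g. on behalf of a vcpu using the device. */
static void model_device_get(struct model_device *dev)
{
	atomic_fetch_add(&dev->users, 1);
}

/*
 * Drop a reference.  atomic_fetch_sub() returns the previous value, so
 * a previous value of 1 means this call released the last reference,
 * which is the condition atomic_dec_and_test() checks in the kernel.
 * Returns true if the device was destroyed (dev is invalid afterwards).
 */
static bool model_device_put(struct model_device *dev)
{
	if (atomic_fetch_sub(&dev->users, 1) == 1) {
		dev->ops->destroy(dev);
		return true;
	}

	return false;
}

static void model_destroy(struct model_device *dev)
{
	if (dev->destroy_count)
		(*dev->destroy_count)++;

	free(dev);
}

static const struct model_device_ops model_default_ops = {
	.destroy = model_destroy,
};

/* Create a device holding one reference, like atomic_set(&users, 1). */
static struct model_device *model_device_create(int *destroy_count)
{
	struct model_device *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;

	dev->ops = &model_default_ops;
	dev->destroy_count = destroy_count;
	atomic_init(&dev->users, 1);
	return dev;
}
```

The point of the model is the ordering: if another component still holds
a reference when the file descriptor is released, the put from the
release path does not destroy the device, and destruction is deferred to
whichever put comes last.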

* [PATCH 10/17] kvm/ppc/mpic: import hw/openpic.c from QEMU
  2013-04-19 14:06 ` Alexander Graf
@ 2013-04-19 14:06   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

This is QEMU's hw/openpic.c from commit
abd8d4a4d6dfea7ddea72f095f993e1de941614e ("Update version for
1.4.0-rc0"), run through Lindent with no other changes to ease merging
future changes between Linux and QEMU.  Remaining style issues
(including those introduced by Lindent) will be fixed in a later patch.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/mpic.c | 1686 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 1686 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/mpic.c

diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
new file mode 100644
index 0000000..57655b9
--- /dev/null
+++ b/arch/powerpc/kvm/mpic.c
@@ -0,0 +1,1686 @@
+/*
+ * OpenPIC emulation
+ *
+ * Copyright (c) 2004 Jocelyn Mayer
+ *               2011 Alexander Graf
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+/*
+ *
+ * Based on OpenPic implementations:
+ * - Intel GW80314 I/O companion chip developer's manual
+ * - Motorola MPC8245 & MPC8540 user manuals.
+ * - Motorola MCP750 (aka Raven) programmer manual.
+ * - Motorola Harrier programmer manual
+ *
+ * Serial interrupts, as implemented in Raven chipset are not supported yet.
+ *
+ */
+#include "hw.h"
+#include "ppc/mac.h"
+#include "pci/pci.h"
+#include "openpic.h"
+#include "sysbus.h"
+#include "pci/msi.h"
+#include "qemu/bitops.h"
+#include "ppc.h"
+
+//#define DEBUG_OPENPIC
+
+#ifdef DEBUG_OPENPIC
+static const int debug_openpic = 1;
+#else
+static const int debug_openpic = 0;
+#endif
+
+#define DPRINTF(fmt, ...) do { \
+        if (debug_openpic) { \
+            printf(fmt , ## __VA_ARGS__); \
+        } \
+    } while (0)
+
+#define MAX_CPU     32
+#define MAX_SRC     256
+#define MAX_TMR     4
+#define MAX_IPI     4
+#define MAX_MSI     8
+#define MAX_IRQ     (MAX_SRC + MAX_IPI + MAX_TMR)
+#define VID         0x03	/* MPIC version ID */
+
+/* OpenPIC capability flags */
+#define OPENPIC_FLAG_IDR_CRIT     (1 << 0)
+#define OPENPIC_FLAG_ILR          (2 << 0)
+
+/* OpenPIC address map */
+#define OPENPIC_GLB_REG_START        0x0
+#define OPENPIC_GLB_REG_SIZE         0x10F0
+#define OPENPIC_TMR_REG_START        0x10F0
+#define OPENPIC_TMR_REG_SIZE         0x220
+#define OPENPIC_MSI_REG_START        0x1600
+#define OPENPIC_MSI_REG_SIZE         0x200
+#define OPENPIC_SUMMARY_REG_START   0x3800
+#define OPENPIC_SUMMARY_REG_SIZE    0x800
+#define OPENPIC_SRC_REG_START        0x10000
+#define OPENPIC_SRC_REG_SIZE         (MAX_SRC * 0x20)
+#define OPENPIC_CPU_REG_START        0x20000
+#define OPENPIC_CPU_REG_SIZE         0x100 + ((MAX_CPU - 1) * 0x1000)
+
+/* Raven */
+#define RAVEN_MAX_CPU      2
+#define RAVEN_MAX_EXT     48
+#define RAVEN_MAX_IRQ     64
+#define RAVEN_MAX_TMR      MAX_TMR
+#define RAVEN_MAX_IPI      MAX_IPI
+
+/* Interrupt definitions */
+#define RAVEN_FE_IRQ     (RAVEN_MAX_EXT)	/* Internal functional IRQ */
+#define RAVEN_ERR_IRQ    (RAVEN_MAX_EXT + 1)	/* Error IRQ */
+#define RAVEN_TMR_IRQ    (RAVEN_MAX_EXT + 2)	/* First timer IRQ */
+#define RAVEN_IPI_IRQ    (RAVEN_TMR_IRQ + RAVEN_MAX_TMR)	/* First IPI IRQ */
+/* First doorbell IRQ */
+#define RAVEN_DBL_IRQ    (RAVEN_IPI_IRQ + (RAVEN_MAX_CPU * RAVEN_MAX_IPI))
+
+typedef struct FslMpicInfo {
+	int max_ext;
+} FslMpicInfo;
+
+static FslMpicInfo fsl_mpic_20 = {
+	.max_ext = 12,
+};
+
+static FslMpicInfo fsl_mpic_42 = {
+	.max_ext = 12,
+};
+
+#define FRR_NIRQ_SHIFT    16
+#define FRR_NCPU_SHIFT     8
+#define FRR_VID_SHIFT      0
+
+#define VID_REVISION_1_2   2
+#define VID_REVISION_1_3   3
+
+#define VIR_GENERIC      0x00000000	/* Generic Vendor ID */
+
+#define GCR_RESET        0x80000000
+#define GCR_MODE_PASS    0x00000000
+#define GCR_MODE_MIXED   0x20000000
+#define GCR_MODE_PROXY   0x60000000
+
+#define TBCR_CI           0x80000000	/* count inhibit */
+#define TCCR_TOG          0x80000000	/* toggles when decrement to zero */
+
+#define IDR_EP_SHIFT      31
+#define IDR_EP_MASK       (1 << IDR_EP_SHIFT)
+#define IDR_CI0_SHIFT     30
+#define IDR_CI1_SHIFT     29
+#define IDR_P1_SHIFT      1
+#define IDR_P0_SHIFT      0
+
+#define ILR_INTTGT_MASK   0x000000ff
+#define ILR_INTTGT_INT    0x00
+#define ILR_INTTGT_CINT   0x01	/* critical */
+#define ILR_INTTGT_MCP    0x02	/* machine check */
+
+/* The currently supported INTTGT values happen to be the same as QEMU's
+ * openpic output codes, but don't depend on this.  The output codes
+ * could change (unlikely, but...) or support could be added for
+ * more INTTGT values.
+ */
+static const int inttgt_output[][2] = {
+	{ILR_INTTGT_INT, OPENPIC_OUTPUT_INT},
+	{ILR_INTTGT_CINT, OPENPIC_OUTPUT_CINT},
+	{ILR_INTTGT_MCP, OPENPIC_OUTPUT_MCK},
+};
+
+static int inttgt_to_output(int inttgt)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(inttgt_output); i++) {
+		if (inttgt_output[i][0] == inttgt) {
+			return inttgt_output[i][1];
+		}
+	}
+
+	fprintf(stderr, "%s: unsupported inttgt %d\n", __func__, inttgt);
+	return OPENPIC_OUTPUT_INT;
+}
+
+static int output_to_inttgt(int output)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(inttgt_output); i++) {
+		if (inttgt_output[i][1] == output) {
+			return inttgt_output[i][0];
+		}
+	}
+
+	abort();
+}
+
+#define MSIIR_OFFSET       0x140
+#define MSIIR_SRS_SHIFT    29
+#define MSIIR_SRS_MASK     (0x7 << MSIIR_SRS_SHIFT)
+#define MSIIR_IBS_SHIFT    24
+#define MSIIR_IBS_MASK     (0x1f << MSIIR_IBS_SHIFT)
+
+static int get_current_cpu(void)
+{
+	CPUState *cpu_single_cpu;
+
+	if (!cpu_single_env) {
+		return -1;
+	}
+
+	cpu_single_cpu = ENV_GET_CPU(cpu_single_env);
+	return cpu_single_cpu->cpu_index;
+}
+
+static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx);
+static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
+				       uint32_t val, int idx);
+
+typedef enum IRQType {
+	IRQ_TYPE_NORMAL = 0,
+	IRQ_TYPE_FSLINT,	/* FSL internal interrupt -- level only */
+	IRQ_TYPE_FSLSPECIAL,	/* FSL timer/IPI interrupt, edge, no polarity */
+} IRQType;
+
+typedef struct IRQQueue {
+	/* Round up to the nearest 64 IRQs so that the queue length
+	 * won't change when moving between 32 and 64 bit hosts.
+	 */
+	unsigned long queue[BITS_TO_LONGS((MAX_IRQ + 63) & ~63)];
+	int next;
+	int priority;
+} IRQQueue;
+
+typedef struct IRQSource {
+	uint32_t ivpr;		/* IRQ vector/priority register */
+	uint32_t idr;		/* IRQ destination register */
+	uint32_t destmask;	/* bitmap of CPU destinations */
+	int last_cpu;
+	int output;		/* IRQ level, e.g. OPENPIC_OUTPUT_INT */
+	int pending;		/* TRUE if IRQ is pending */
+	IRQType type;
+	bool level:1;		/* level-triggered */
+	bool nomask:1;		/* critical interrupts ignore mask on some FSL MPICs */
+} IRQSource;
+
+#define IVPR_MASK_SHIFT       31
+#define IVPR_MASK_MASK        (1 << IVPR_MASK_SHIFT)
+#define IVPR_ACTIVITY_SHIFT   30
+#define IVPR_ACTIVITY_MASK    (1 << IVPR_ACTIVITY_SHIFT)
+#define IVPR_MODE_SHIFT       29
+#define IVPR_MODE_MASK        (1 << IVPR_MODE_SHIFT)
+#define IVPR_POLARITY_SHIFT   23
+#define IVPR_POLARITY_MASK    (1 << IVPR_POLARITY_SHIFT)
+#define IVPR_SENSE_SHIFT      22
+#define IVPR_SENSE_MASK       (1 << IVPR_SENSE_SHIFT)
+
+#define IVPR_PRIORITY_MASK     (0xF << 16)
+#define IVPR_PRIORITY(_ivprr_) ((int)(((_ivprr_) & IVPR_PRIORITY_MASK) >> 16))
+#define IVPR_VECTOR(opp, _ivprr_) ((_ivprr_) & (opp)->vector_mask)
+
+/* IDR[EP/CI] are only for FSL MPIC prior to v4.0 */
+#define IDR_EP      0x80000000	/* external pin */
+#define IDR_CI      0x40000000	/* critical interrupt */
+
+typedef struct IRQDest {
+	int32_t ctpr;		/* CPU current task priority */
+	IRQQueue raised;
+	IRQQueue servicing;
+	qemu_irq *irqs;
+
+	/* Count of IRQ sources asserting on non-INT outputs */
+	uint32_t outputs_active[OPENPIC_OUTPUT_NB];
+} IRQDest;
+
+typedef struct OpenPICState {
+	SysBusDevice busdev;
+	MemoryRegion mem;
+
+	/* Behavior control */
+	FslMpicInfo *fsl;
+	uint32_t model;
+	uint32_t flags;
+	uint32_t nb_irqs;
+	uint32_t vid;
+	uint32_t vir;		/* Vendor identification register */
+	uint32_t vector_mask;
+	uint32_t tfrr_reset;
+	uint32_t ivpr_reset;
+	uint32_t idr_reset;
+	uint32_t brr1;
+	uint32_t mpic_mode_mask;
+
+	/* Sub-regions */
+	MemoryRegion sub_io_mem[6];
+
+	/* Global registers */
+	uint32_t frr;		/* Feature reporting register */
+	uint32_t gcr;		/* Global configuration register  */
+	uint32_t pir;		/* Processor initialization register */
+	uint32_t spve;		/* Spurious vector register */
+	uint32_t tfrr;		/* Timer frequency reporting register */
+	/* Source registers */
+	IRQSource src[MAX_IRQ];
+	/* Local registers per output pin */
+	IRQDest dst[MAX_CPU];
+	uint32_t nb_cpus;
+	/* Timer registers */
+	struct {
+		uint32_t tccr;	/* Global timer current count register */
+		uint32_t tbcr;	/* Global timer base count register */
+	} timers[MAX_TMR];
+	/* Shared MSI registers */
+	struct {
+		uint32_t msir;	/* Shared Message Signaled Interrupt Register */
+	} msi[MAX_MSI];
+	uint32_t max_irq;
+	uint32_t irq_ipi0;
+	uint32_t irq_tim0;
+	uint32_t irq_msi;
+} OpenPICState;
+
+static inline void IRQ_setbit(IRQQueue * q, int n_IRQ)
+{
+	set_bit(n_IRQ, q->queue);
+}
+
+static inline void IRQ_resetbit(IRQQueue * q, int n_IRQ)
+{
+	clear_bit(n_IRQ, q->queue);
+}
+
+static inline int IRQ_testbit(IRQQueue * q, int n_IRQ)
+{
+	return test_bit(n_IRQ, q->queue);
+}
+
+static void IRQ_check(OpenPICState * opp, IRQQueue * q)
+{
+	int irq = -1;
+	int next = -1;
+	int priority = -1;
+
+	for (;;) {
+		irq = find_next_bit(q->queue, opp->max_irq, irq + 1);
+		if (irq == opp->max_irq) {
+			break;
+		}
+
+		DPRINTF("IRQ_check: irq %d set ivpr_pr=%d pr=%d\n",
+			irq, IVPR_PRIORITY(opp->src[irq].ivpr), priority);
+
+		if (IVPR_PRIORITY(opp->src[irq].ivpr) > priority) {
+			next = irq;
+			priority = IVPR_PRIORITY(opp->src[irq].ivpr);
+		}
+	}
+
+	q->next = next;
+	q->priority = priority;
+}
+
+static int IRQ_get_next(OpenPICState * opp, IRQQueue * q)
+{
+	/* XXX: optimize */
+	IRQ_check(opp, q);
+
+	return q->next;
+}
+
+static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
+			   bool active, bool was_active)
+{
+	IRQDest *dst;
+	IRQSource *src;
+	int priority;
+
+	dst = &opp->dst[n_CPU];
+	src = &opp->src[n_IRQ];
+
+	DPRINTF("%s: IRQ %d active %d was %d\n",
+		__func__, n_IRQ, active, was_active);
+
+	if (src->output != OPENPIC_OUTPUT_INT) {
+		DPRINTF("%s: output %d irq %d active %d was %d count %d\n",
+			__func__, src->output, n_IRQ, active, was_active,
+			dst->outputs_active[src->output]);
+
+		/* On Freescale MPIC, critical interrupts ignore priority,
+		 * IACK, EOI, etc.  Before MPIC v4.1 they also ignore
+		 * masking.
+		 */
+		if (active) {
+			if (!was_active
+			    && dst->outputs_active[src->output]++ == 0) {
+				DPRINTF
+				    ("%s: Raise OpenPIC output %d cpu %d irq %d\n",
+				     __func__, src->output, n_CPU, n_IRQ);
+				qemu_irq_raise(dst->irqs[src->output]);
+			}
+		} else {
+			if (was_active
+			    && --dst->outputs_active[src->output] == 0) {
+				DPRINTF
+				    ("%s: Lower OpenPIC output %d cpu %d irq %d\n",
+				     __func__, src->output, n_CPU, n_IRQ);
+				qemu_irq_lower(dst->irqs[src->output]);
+			}
+		}
+
+		return;
+	}
+
+	priority = IVPR_PRIORITY(src->ivpr);
+
+	/* Even if the interrupt doesn't have enough priority,
+	 * it is still raised, in case ctpr is lowered later.
+	 */
+	if (active) {
+		IRQ_setbit(&dst->raised, n_IRQ);
+	} else {
+		IRQ_resetbit(&dst->raised, n_IRQ);
+	}
+
+	IRQ_check(opp, &dst->raised);
+
+	if (active && priority <= dst->ctpr) {
+		DPRINTF
+		    ("%s: IRQ %d priority %d too low for ctpr %d on CPU %d\n",
+		     __func__, n_IRQ, priority, dst->ctpr, n_CPU);
+		active = 0;
+	}
+
+	if (active) {
+		if (IRQ_get_next(opp, &dst->servicing) >= 0 &&
+		    priority <= dst->servicing.priority) {
+			DPRINTF
+			    ("%s: IRQ %d is hidden by servicing IRQ %d on CPU %d\n",
+			     __func__, n_IRQ, dst->servicing.next, n_CPU);
+		} else {
+			DPRINTF
+			    ("%s: Raise OpenPIC INT output cpu %d irq %d/%d\n",
+			     __func__, n_CPU, n_IRQ, dst->raised.next);
+			qemu_irq_raise(opp->dst[n_CPU].
+				       irqs[OPENPIC_OUTPUT_INT]);
+		}
+	} else {
+		IRQ_get_next(opp, &dst->servicing);
+		if (dst->raised.priority > dst->ctpr &&
+		    dst->raised.priority > dst->servicing.priority) {
+			DPRINTF
+			    ("%s: IRQ %d inactive, IRQ %d prio %d above %d/%d, CPU %d\n",
+			     __func__, n_IRQ, dst->raised.next,
+			     dst->raised.priority, dst->ctpr,
+			     dst->servicing.priority, n_CPU);
+			/* IRQ line stays asserted */
+		} else {
+			DPRINTF
+			    ("%s: IRQ %d inactive, current prio %d/%d, CPU %d\n",
+			     __func__, n_IRQ, dst->ctpr,
+			     dst->servicing.priority, n_CPU);
+			qemu_irq_lower(opp->dst[n_CPU].
+				       irqs[OPENPIC_OUTPUT_INT]);
+		}
+	}
+}
+
+/* update pic state because registers for n_IRQ have changed value */
+static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
+{
+	IRQSource *src;
+	bool active, was_active;
+	int i;
+
+	src = &opp->src[n_IRQ];
+	active = src->pending;
+
+	if ((src->ivpr & IVPR_MASK_MASK) && !src->nomask) {
+		/* Interrupt source is disabled */
+		DPRINTF("%s: IRQ %d is disabled\n", __func__, n_IRQ);
+		active = false;
+	}
+
+	was_active = ! !(src->ivpr & IVPR_ACTIVITY_MASK);
+
+	/*
+	 * We don't have a similar check for already-active because
+	 * ctpr may have changed and we need to withdraw the interrupt.
+	 */
+	if (!active && !was_active) {
+		DPRINTF("%s: IRQ %d is already inactive\n", __func__, n_IRQ);
+		return;
+	}
+
+	if (active) {
+		src->ivpr |= IVPR_ACTIVITY_MASK;
+	} else {
+		src->ivpr &= ~IVPR_ACTIVITY_MASK;
+	}
+
+	if (src->destmask == 0) {
+		/* No target */
+		DPRINTF("%s: IRQ %d has no target\n", __func__, n_IRQ);
+		return;
+	}
+
+	if (src->destmask == (1 << src->last_cpu)) {
+		/* Only one CPU is allowed to receive this IRQ */
+		IRQ_local_pipe(opp, src->last_cpu, n_IRQ, active, was_active);
+	} else if (!(src->ivpr & IVPR_MODE_MASK)) {
+		/* Directed delivery mode */
+		for (i = 0; i < opp->nb_cpus; i++) {
+			if (src->destmask & (1 << i)) {
+				IRQ_local_pipe(opp, i, n_IRQ, active,
+					       was_active);
+			}
+		}
+	} else {
+		/* Distributed delivery mode */
+		for (i = src->last_cpu + 1; i != src->last_cpu; i++) {
+			if (i == opp->nb_cpus) {
+				i = 0;
+			}
+			if (src->destmask & (1 << i)) {
+				IRQ_local_pipe(opp, i, n_IRQ, active,
+					       was_active);
+				src->last_cpu = i;
+				break;
+			}
+		}
+	}
+}
+
+static void openpic_set_irq(void *opaque, int n_IRQ, int level)
+{
+	OpenPICState *opp = opaque;
+	IRQSource *src;
+
+	if (n_IRQ >= MAX_IRQ) {
+		fprintf(stderr, "%s: IRQ %d out of range\n", __func__, n_IRQ);
+		abort();
+	}
+
+	src = &opp->src[n_IRQ];
+	DPRINTF("openpic: set irq %d = %d ivpr=0x%08x\n",
+		n_IRQ, level, src->ivpr);
+	if (src->level) {
+		/* level-sensitive irq */
+		src->pending = level;
+		openpic_update_irq(opp, n_IRQ);
+	} else {
+		/* edge-sensitive irq */
+		if (level) {
+			src->pending = 1;
+			openpic_update_irq(opp, n_IRQ);
+		}
+
+		if (src->output != OPENPIC_OUTPUT_INT) {
+			/* Edge-triggered interrupts shouldn't be used
+			 * with non-INT delivery, but just in case,
+			 * try to make it do something sane rather than
+			 * cause an interrupt storm.  This is close to
+			 * what you'd probably see happen in real hardware.
+			 */
+			src->pending = 0;
+			openpic_update_irq(opp, n_IRQ);
+		}
+	}
+}
+
+static void openpic_reset(DeviceState * d)
+{
+	OpenPICState *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d));
+	int i;
+
+	opp->gcr = GCR_RESET;
+	/* Initialise controller registers */
+	opp->frr = ((opp->nb_irqs - 1) << FRR_NIRQ_SHIFT) |
+	    ((opp->nb_cpus - 1) << FRR_NCPU_SHIFT) |
+	    (opp->vid << FRR_VID_SHIFT);
+
+	opp->pir = 0;
+	opp->spve = -1 & opp->vector_mask;
+	opp->tfrr = opp->tfrr_reset;
+	/* Initialise IRQ sources */
+	for (i = 0; i < opp->max_irq; i++) {
+		opp->src[i].ivpr = opp->ivpr_reset;
+		opp->src[i].idr = opp->idr_reset;
+
+		switch (opp->src[i].type) {
+		case IRQ_TYPE_NORMAL:
+			opp->src[i].level =
+			    ! !(opp->ivpr_reset & IVPR_SENSE_MASK);
+			break;
+
+		case IRQ_TYPE_FSLINT:
+			opp->src[i].ivpr |= IVPR_POLARITY_MASK;
+			break;
+
+		case IRQ_TYPE_FSLSPECIAL:
+			break;
+		}
+	}
+	/* Initialise IRQ destinations */
+	for (i = 0; i < MAX_CPU; i++) {
+		opp->dst[i].ctpr = 15;
+		memset(&opp->dst[i].raised, 0, sizeof(IRQQueue));
+		opp->dst[i].raised.next = -1;
+		memset(&opp->dst[i].servicing, 0, sizeof(IRQQueue));
+		opp->dst[i].servicing.next = -1;
+	}
+	/* Initialise timers */
+	for (i = 0; i < MAX_TMR; i++) {
+		opp->timers[i].tccr = 0;
+		opp->timers[i].tbcr = TBCR_CI;
+	}
+	/* Go out of RESET state */
+	opp->gcr = 0;
+}
+
+static inline uint32_t read_IRQreg_idr(OpenPICState * opp, int n_IRQ)
+{
+	return opp->src[n_IRQ].idr;
+}
+
+static inline uint32_t read_IRQreg_ilr(OpenPICState * opp, int n_IRQ)
+{
+	if (opp->flags & OPENPIC_FLAG_ILR) {
+		return output_to_inttgt(opp->src[n_IRQ].output);
+	}
+
+	return 0xffffffff;
+}
+
+static inline uint32_t read_IRQreg_ivpr(OpenPICState * opp, int n_IRQ)
+{
+	return opp->src[n_IRQ].ivpr;
+}
+
+static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
+{
+	IRQSource *src = &opp->src[n_IRQ];
+	uint32_t normal_mask = (1UL << opp->nb_cpus) - 1;
+	uint32_t crit_mask = 0;
+	uint32_t mask = normal_mask;
+	int crit_shift = IDR_EP_SHIFT - opp->nb_cpus;
+	int i;
+
+	if (opp->flags & OPENPIC_FLAG_IDR_CRIT) {
+		crit_mask = mask << crit_shift;
+		mask |= crit_mask | IDR_EP;
+	}
+
+	src->idr = val & mask;
+	DPRINTF("Set IDR %d to 0x%08x\n", n_IRQ, src->idr);
+
+	if (opp->flags & OPENPIC_FLAG_IDR_CRIT) {
+		if (src->idr & crit_mask) {
+			if (src->idr & normal_mask) {
+				DPRINTF
+				    ("%s: IRQ configured for multiple output types, using "
+				     "critical\n", __func__);
+			}
+
+			src->output = OPENPIC_OUTPUT_CINT;
+			src->nomask = true;
+			src->destmask = 0;
+
+			for (i = 0; i < opp->nb_cpus; i++) {
+				int n_ci = IDR_CI0_SHIFT - i;
+
+				if (src->idr & (1UL << n_ci)) {
+					src->destmask |= 1UL << i;
+				}
+			}
+		} else {
+			src->output = OPENPIC_OUTPUT_INT;
+			src->nomask = false;
+			src->destmask = src->idr & normal_mask;
+		}
+	} else {
+		src->destmask = src->idr;
+	}
+}
+
+static inline void write_IRQreg_ilr(OpenPICState * opp, int n_IRQ, uint32_t val)
+{
+	if (opp->flags & OPENPIC_FLAG_ILR) {
+		IRQSource *src = &opp->src[n_IRQ];
+
+		src->output = inttgt_to_output(val & ILR_INTTGT_MASK);
+		DPRINTF("Set ILR %d to 0x%08x, output %d\n", n_IRQ, src->idr,
+			src->output);
+
+		/* TODO: on MPIC v4.0 only, set nomask for non-INT */
+	}
+}
+
+static inline void write_IRQreg_ivpr(OpenPICState * opp, int n_IRQ,
+				     uint32_t val)
+{
+	uint32_t mask;
+
+	/* NOTE when implementing newer FSL MPIC models: starting with v4.0,
+	 * the polarity bit is read-only on internal interrupts.
+	 */
+	mask = IVPR_MASK_MASK | IVPR_PRIORITY_MASK | IVPR_SENSE_MASK |
+	    IVPR_POLARITY_MASK | opp->vector_mask;
+
+	/* ACTIVITY bit is read-only */
+	opp->src[n_IRQ].ivpr =
+	    (opp->src[n_IRQ].ivpr & IVPR_ACTIVITY_MASK) | (val & mask);
+
+	/* For FSL internal interrupts, the sense bit is reserved and zero,
+	 * and the interrupt is always level-triggered.  Timers and IPIs
+	 * have no sense or polarity bits, and are edge-triggered.
+	 */
+	switch (opp->src[n_IRQ].type) {
+	case IRQ_TYPE_NORMAL:
+		opp->src[n_IRQ].level =
+		    ! !(opp->src[n_IRQ].ivpr & IVPR_SENSE_MASK);
+		break;
+
+	case IRQ_TYPE_FSLINT:
+		opp->src[n_IRQ].ivpr &= ~IVPR_SENSE_MASK;
+		break;
+
+	case IRQ_TYPE_FSLSPECIAL:
+		opp->src[n_IRQ].ivpr &= ~(IVPR_POLARITY_MASK | IVPR_SENSE_MASK);
+		break;
+	}
+
+	openpic_update_irq(opp, n_IRQ);
+	DPRINTF("Set IVPR %d to 0x%08x -> 0x%08x\n", n_IRQ, val,
+		opp->src[n_IRQ].ivpr);
+}
+
+static void openpic_gcr_write(OpenPICState * opp, uint64_t val)
+{
+	bool mpic_proxy = false;
+
+	if (val & GCR_RESET) {
+		openpic_reset(&opp->busdev.qdev);
+		return;
+	}
+
+	opp->gcr &= ~opp->mpic_mode_mask;
+	opp->gcr |= val & opp->mpic_mode_mask;
+
+	/* Set external proxy mode */
+	if ((val & opp->mpic_mode_mask) == GCR_MODE_PROXY) {
+		mpic_proxy = true;
+	}
+
+	ppce500_set_mpic_proxy(mpic_proxy);
+}
+
+static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned len)
+{
+	OpenPICState *opp = opaque;
+	IRQDest *dst;
+	int idx;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+		__func__, addr, val);
+	if (addr & 0xF) {
+		return;
+	}
+	switch (addr) {
+	case 0x00:		/* Block Revision Register1 (BRR1) is Readonly */
+		break;
+	case 0x40:
+	case 0x50:
+	case 0x60:
+	case 0x70:
+	case 0x80:
+	case 0x90:
+	case 0xA0:
+	case 0xB0:
+		openpic_cpu_write_internal(opp, addr, val, get_current_cpu());
+		break;
+	case 0x1000:		/* FRR */
+		break;
+	case 0x1020:		/* GCR */
+		openpic_gcr_write(opp, val);
+		break;
+	case 0x1080:		/* VIR */
+		break;
+	case 0x1090:		/* PIR */
+		for (idx = 0; idx < opp->nb_cpus; idx++) {
+			if ((val & (1 << idx)) && !(opp->pir & (1 << idx))) {
+				DPRINTF
+				    ("Raise OpenPIC RESET output for CPU %d\n",
+				     idx);
+				dst = &opp->dst[idx];
+				qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_RESET]);
+			} else if (!(val & (1 << idx))
+				   && (opp->pir & (1 << idx))) {
+				DPRINTF
+				    ("Lower OpenPIC RESET output for CPU %d\n",
+				     idx);
+				dst = &opp->dst[idx];
+				qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_RESET]);
+			}
+		}
+		opp->pir = val;
+		break;
+	case 0x10A0:		/* IPI_IVPR */
+	case 0x10B0:
+	case 0x10C0:
+	case 0x10D0:
+		{
+			int idx;
+			idx = (addr - 0x10A0) >> 4;
+			write_IRQreg_ivpr(opp, opp->irq_ipi0 + idx, val);
+		}
+		break;
+	case 0x10E0:		/* SPVE */
+		opp->spve = val & opp->vector_mask;
+		break;
+	default:
+		break;
+	}
+}
+
+static uint64_t openpic_gbl_read(void *opaque, hwaddr addr, unsigned len)
+{
+	OpenPICState *opp = opaque;
+	uint32_t retval;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	retval = 0xFFFFFFFF;
+	if (addr & 0xF) {
+		return retval;
+	}
+	switch (addr) {
+	case 0x1000:		/* FRR */
+		retval = opp->frr;
+		break;
+	case 0x1020:		/* GCR */
+		retval = opp->gcr;
+		break;
+	case 0x1080:		/* VIR */
+		retval = opp->vir;
+		break;
+	case 0x1090:		/* PIR */
+		retval = 0x00000000;
+		break;
+	case 0x00:		/* Block Revision Register1 (BRR1) */
+		retval = opp->brr1;
+		break;
+	case 0x40:
+	case 0x50:
+	case 0x60:
+	case 0x70:
+	case 0x80:
+	case 0x90:
+	case 0xA0:
+	case 0xB0:
+		retval =
+		    openpic_cpu_read_internal(opp, addr, get_current_cpu());
+		break;
+	case 0x10A0:		/* IPI_IVPR */
+	case 0x10B0:
+	case 0x10C0:
+	case 0x10D0:
+		{
+			int idx;
+			idx = (addr - 0x10A0) >> 4;
+			retval = read_IRQreg_ivpr(opp, opp->irq_ipi0 + idx);
+		}
+		break;
+	case 0x10E0:		/* SPVE */
+		retval = opp->spve;
+		break;
+	default:
+		break;
+	}
+	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+
+	return retval;
+}
+
+static void openpic_tmr_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned len)
+{
+	OpenPICState *opp = opaque;
+	int idx;
+
+	addr += 0x10f0;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+		__func__, addr, val);
+	if (addr & 0xF) {
+		return;
+	}
+
+	if (addr == 0x10f0) {
+		/* TFRR */
+		opp->tfrr = val;
+		return;
+	}
+
+	idx = (addr >> 6) & 0x3;
+	addr = addr & 0x30;
+
+	switch (addr & 0x30) {
+	case 0x00:		/* TCCR */
+		break;
+	case 0x10:		/* TBCR */
+		if ((opp->timers[idx].tccr & TCCR_TOG) != 0 &&
+		    (val & TBCR_CI) == 0 &&
+		    (opp->timers[idx].tbcr & TBCR_CI) != 0) {
+			opp->timers[idx].tccr &= ~TCCR_TOG;
+		}
+		opp->timers[idx].tbcr = val;
+		break;
+	case 0x20:		/* TVPR */
+		write_IRQreg_ivpr(opp, opp->irq_tim0 + idx, val);
+		break;
+	case 0x30:		/* TDR */
+		write_IRQreg_idr(opp, opp->irq_tim0 + idx, val);
+		break;
+	}
+}
+
+static uint64_t openpic_tmr_read(void *opaque, hwaddr addr, unsigned len)
+{
+	OpenPICState *opp = opaque;
+	uint32_t retval = -1;
+	int idx;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	if (addr & 0xF) {
+		goto out;
+	}
+	idx = (addr >> 6) & 0x3;
+	if (addr == 0x0) {
+		/* TFRR */
+		retval = opp->tfrr;
+		goto out;
+	}
+	switch (addr & 0x30) {
+	case 0x00:		/* TCCR */
+		retval = opp->timers[idx].tccr;
+		break;
+	case 0x10:		/* TBCR */
+		retval = opp->timers[idx].tbcr;
+		break;
+	case 0x20:		/* TIPV */
+		retval = read_IRQreg_ivpr(opp, opp->irq_tim0 + idx);
+		break;
+	case 0x30:		/* TIDE (TIDR) */
+		retval = read_IRQreg_idr(opp, opp->irq_tim0 + idx);
+		break;
+	}
+
+out:
+	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+
+	return retval;
+}
+
+static void openpic_src_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned len)
+{
+	OpenPICState *opp = opaque;
+	int idx;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+		__func__, addr, val);
+
+	addr = addr & 0xffff;
+	idx = addr >> 5;
+
+	switch (addr & 0x1f) {
+	case 0x00:
+		write_IRQreg_ivpr(opp, idx, val);
+		break;
+	case 0x10:
+		write_IRQreg_idr(opp, idx, val);
+		break;
+	case 0x18:
+		write_IRQreg_ilr(opp, idx, val);
+		break;
+	}
+}
+
+static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
+{
+	OpenPICState *opp = opaque;
+	uint32_t retval;
+	int idx;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	retval = 0xFFFFFFFF;
+
+	addr = addr & 0xffff;
+	idx = addr >> 5;
+
+	switch (addr & 0x1f) {
+	case 0x00:
+		retval = read_IRQreg_ivpr(opp, idx);
+		break;
+	case 0x10:
+		retval = read_IRQreg_idr(opp, idx);
+		break;
+	case 0x18:
+		retval = read_IRQreg_ilr(opp, idx);
+		break;
+	}
+
+	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	return retval;
+}
+
+static void openpic_msi_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned size)
+{
+	OpenPICState *opp = opaque;
+	int idx = opp->irq_msi;
+	int srs, ibs;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
+		__func__, addr, val);
+	if (addr & 0xF) {
+		return;
+	}
+
+	switch (addr) {
+	case MSIIR_OFFSET:
+		srs = val >> MSIIR_SRS_SHIFT;
+		idx += srs;
+		ibs = (val & MSIIR_IBS_MASK) >> MSIIR_IBS_SHIFT;
+		opp->msi[srs].msir |= 1 << ibs;
+		openpic_set_irq(opp, idx, 1);
+		break;
+	default:
+		/* most registers are read-only, thus ignored */
+		break;
+	}
+}
+
+static uint64_t openpic_msi_read(void *opaque, hwaddr addr, unsigned size)
+{
+	OpenPICState *opp = opaque;
+	uint64_t r = 0;
+	int i, srs;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	if (addr & 0xF) {
+		return -1;
+	}
+
+	srs = addr >> 4;
+
+	switch (addr) {
+	case 0x00:
+	case 0x10:
+	case 0x20:
+	case 0x30:
+	case 0x40:
+	case 0x50:
+	case 0x60:
+	case 0x70:		/* MSIRs */
+		r = opp->msi[srs].msir;
+		/* Clear on read */
+		opp->msi[srs].msir = 0;
+		openpic_set_irq(opp, opp->irq_msi + srs, 0);
+		break;
+	case 0x120:		/* MSISR */
+		for (i = 0; i < MAX_MSI; i++) {
+			r |= (opp->msi[i].msir ? 1 : 0) << i;
+		}
+		break;
+	}
+
+	return r;
+}
+
+static uint64_t openpic_summary_read(void *opaque, hwaddr addr, unsigned size)
+{
+	uint64_t r = 0;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+
+	/* TODO: EISR/EIMR */
+
+	return r;
+}
+
+static void openpic_summary_write(void *opaque, hwaddr addr, uint64_t val,
+				  unsigned size)
+{
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
+		__func__, addr, val);
+
+	/* TODO: EISR/EIMR */
+}
+
+static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
+				       uint32_t val, int idx)
+{
+	OpenPICState *opp = opaque;
+	IRQSource *src;
+	IRQDest *dst;
+	int s_IRQ, n_IRQ;
+
+	DPRINTF("%s: cpu %d addr %#" HWADDR_PRIx " <= 0x%08x\n", __func__, idx,
+		addr, val);
+
+	if (idx < 0) {
+		return;
+	}
+
+	if (addr & 0xF) {
+		return;
+	}
+	dst = &opp->dst[idx];
+	addr &= 0xFF0;
+	switch (addr) {
+	case 0x40:		/* IPIDR */
+	case 0x50:
+	case 0x60:
+	case 0x70:
+		idx = (addr - 0x40) >> 4;
+		/* IDE is used as the mask of CPUs still awaiting this IPI. */
+		opp->src[opp->irq_ipi0 + idx].destmask |= val;
+		openpic_set_irq(opp, opp->irq_ipi0 + idx, 1);
+		openpic_set_irq(opp, opp->irq_ipi0 + idx, 0);
+		break;
+	case 0x80:		/* CTPR */
+		dst->ctpr = val & 0x0000000F;
+
+		DPRINTF("%s: set CPU %d ctpr to %d, raised %d servicing %d\n",
+			__func__, idx, dst->ctpr, dst->raised.priority,
+			dst->servicing.priority);
+
+		if (dst->raised.priority <= dst->ctpr) {
+			DPRINTF
+			    ("%s: Lower OpenPIC INT output cpu %d due to ctpr\n",
+			     __func__, idx);
+			qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
+		} else if (dst->raised.priority > dst->servicing.priority) {
+			DPRINTF("%s: Raise OpenPIC INT output cpu %d irq %d\n",
+				__func__, idx, dst->raised.next);
+			qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_INT]);
+		}
+
+		break;
+	case 0x90:		/* WHOAMI */
+		/* Read-only register */
+		break;
+	case 0xA0:		/* IACK */
+		/* Read-only register */
+		break;
+	case 0xB0:		/* EOI */
+		DPRINTF("EOI\n");
+		s_IRQ = IRQ_get_next(opp, &dst->servicing);
+
+		if (s_IRQ < 0) {
+			DPRINTF("%s: EOI with no interrupt in service\n",
+				__func__);
+			break;
+		}
+
+		IRQ_resetbit(&dst->servicing, s_IRQ);
+		/* Set up next servicing IRQ */
+		s_IRQ = IRQ_get_next(opp, &dst->servicing);
+		/* Check queued interrupts. */
+		n_IRQ = IRQ_get_next(opp, &dst->raised);
+		src = &opp->src[n_IRQ];
+		if (n_IRQ != -1 &&
+		    (s_IRQ == -1 ||
+		     IVPR_PRIORITY(src->ivpr) > dst->servicing.priority)) {
+			DPRINTF("Raise OpenPIC INT output cpu %d irq %d\n",
+				idx, n_IRQ);
+			qemu_irq_raise(opp->dst[idx].irqs[OPENPIC_OUTPUT_INT]);
+		}
+		break;
+	default:
+		break;
+	}
+}
+
+static void openpic_cpu_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned len)
+{
+	openpic_cpu_write_internal(opaque, addr, val, (addr & 0x1f000) >> 12);
+}
+
+static uint32_t openpic_iack(OpenPICState * opp, IRQDest * dst, int cpu)
+{
+	IRQSource *src;
+	int retval, irq;
+
+	DPRINTF("Lower OpenPIC INT output\n");
+	qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
+
+	irq = IRQ_get_next(opp, &dst->raised);
+	DPRINTF("IACK: irq=%d\n", irq);
+
+	if (irq == -1) {
+		/* No more interrupt pending */
+		return opp->spve;
+	}
+
+	src = &opp->src[irq];
+	if (!(src->ivpr & IVPR_ACTIVITY_MASK) ||
+	    !(IVPR_PRIORITY(src->ivpr) > dst->ctpr)) {
+		fprintf(stderr, "%s: bad raised IRQ %d ctpr %d ivpr 0x%08x\n",
+			__func__, irq, dst->ctpr, src->ivpr);
+		openpic_update_irq(opp, irq);
+		retval = opp->spve;
+	} else {
+		/* IRQ enter servicing state */
+		IRQ_setbit(&dst->servicing, irq);
+		retval = IVPR_VECTOR(opp, src->ivpr);
+	}
+
+	if (!src->level) {
+		/* edge-sensitive IRQ */
+		src->ivpr &= ~IVPR_ACTIVITY_MASK;
+		src->pending = 0;
+		IRQ_resetbit(&dst->raised, irq);
+	}
+
+	if ((irq >= opp->irq_ipi0) && (irq < (opp->irq_ipi0 + MAX_IPI))) {
+		src->destmask &= ~(1 << cpu);
+		if (src->destmask && !src->level) {
+			/* trigger on CPUs that didn't know about it yet */
+			openpic_set_irq(opp, irq, 1);
+			openpic_set_irq(opp, irq, 0);
+			/* if all CPUs knew about it, set active bit again */
+			src->ivpr |= IVPR_ACTIVITY_MASK;
+		}
+	}
+
+	return retval;
+}
+
+static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx)
+{
+	OpenPICState *opp = opaque;
+	IRQDest *dst;
+	uint32_t retval;
+
+	DPRINTF("%s: cpu %d addr %#" HWADDR_PRIx "\n", __func__, idx, addr);
+	retval = 0xFFFFFFFF;
+
+	if (idx < 0) {
+		return retval;
+	}
+
+	if (addr & 0xF) {
+		return retval;
+	}
+	dst = &opp->dst[idx];
+	addr &= 0xFF0;
+	switch (addr) {
+	case 0x80:		/* CTPR */
+		retval = dst->ctpr;
+		break;
+	case 0x90:		/* WHOAMI */
+		retval = idx;
+		break;
+	case 0xA0:		/* IACK */
+		retval = openpic_iack(opp, dst, idx);
+		break;
+	case 0xB0:		/* EOI */
+		retval = 0;
+		break;
+	default:
+		break;
+	}
+	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+
+	return retval;
+}
+
+static uint64_t openpic_cpu_read(void *opaque, hwaddr addr, unsigned len)
+{
+	return openpic_cpu_read_internal(opaque, addr, (addr & 0x1f000) >> 12);
+}
+
+static const MemoryRegionOps openpic_glb_ops_le = {
+	.write = openpic_gbl_write,
+	.read = openpic_gbl_read,
+	.endianness = DEVICE_LITTLE_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_glb_ops_be = {
+	.write = openpic_gbl_write,
+	.read = openpic_gbl_read,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_tmr_ops_le = {
+	.write = openpic_tmr_write,
+	.read = openpic_tmr_read,
+	.endianness = DEVICE_LITTLE_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_tmr_ops_be = {
+	.write = openpic_tmr_write,
+	.read = openpic_tmr_read,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_cpu_ops_le = {
+	.write = openpic_cpu_write,
+	.read = openpic_cpu_read,
+	.endianness = DEVICE_LITTLE_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_cpu_ops_be = {
+	.write = openpic_cpu_write,
+	.read = openpic_cpu_read,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_src_ops_le = {
+	.write = openpic_src_write,
+	.read = openpic_src_read,
+	.endianness = DEVICE_LITTLE_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_src_ops_be = {
+	.write = openpic_src_write,
+	.read = openpic_src_read,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_msi_ops_be = {
+	.read = openpic_msi_read,
+	.write = openpic_msi_write,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_summary_ops_be = {
+	.read = openpic_summary_read,
+	.write = openpic_summary_write,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static void openpic_save_IRQ_queue(QEMUFile * f, IRQQueue * q)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(q->queue); i++) {
+		/* Always put the lower half of a 64-bit long first, in case we
+		 * restore on a 32-bit host.  The least significant bits correspond
+		 * to lower IRQ numbers in the bitmap.
+		 */
+		qemu_put_be32(f, (uint32_t) q->queue[i]);
+#if LONG_MAX > 0x7FFFFFFF
+		qemu_put_be32(f, (uint32_t) (q->queue[i] >> 32));
+#endif
+	}
+
+	qemu_put_sbe32s(f, &q->next);
+	qemu_put_sbe32s(f, &q->priority);
+}
+
+static void openpic_save(QEMUFile * f, void *opaque)
+{
+	OpenPICState *opp = (OpenPICState *) opaque;
+	unsigned int i;
+
+	qemu_put_be32s(f, &opp->gcr);
+	qemu_put_be32s(f, &opp->vir);
+	qemu_put_be32s(f, &opp->pir);
+	qemu_put_be32s(f, &opp->spve);
+	qemu_put_be32s(f, &opp->tfrr);
+
+	qemu_put_be32s(f, &opp->nb_cpus);
+
+	for (i = 0; i < opp->nb_cpus; i++) {
+		qemu_put_sbe32s(f, &opp->dst[i].ctpr);
+		openpic_save_IRQ_queue(f, &opp->dst[i].raised);
+		openpic_save_IRQ_queue(f, &opp->dst[i].servicing);
+		qemu_put_buffer(f, (uint8_t *) & opp->dst[i].outputs_active,
+				sizeof(opp->dst[i].outputs_active));
+	}
+
+	for (i = 0; i < MAX_TMR; i++) {
+		qemu_put_be32s(f, &opp->timers[i].tccr);
+		qemu_put_be32s(f, &opp->timers[i].tbcr);
+	}
+
+	for (i = 0; i < opp->max_irq; i++) {
+		qemu_put_be32s(f, &opp->src[i].ivpr);
+		qemu_put_be32s(f, &opp->src[i].idr);
+		qemu_get_be32s(f, &opp->src[i].destmask);
+		qemu_put_sbe32s(f, &opp->src[i].last_cpu);
+		qemu_put_sbe32s(f, &opp->src[i].pending);
+	}
+}
+
+static void openpic_load_IRQ_queue(QEMUFile * f, IRQQueue * q)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(q->queue); i++) {
+		unsigned long val;
+
+		val = qemu_get_be32(f);
+#if LONG_MAX > 0x7FFFFFFF
+		val <<= 32;
+		val |= qemu_get_be32(f);
+#endif
+
+		q->queue[i] = val;
+	}
+
+	qemu_get_sbe32s(f, &q->next);
+	qemu_get_sbe32s(f, &q->priority);
+}
+
+static int openpic_load(QEMUFile * f, void *opaque, int version_id)
+{
+	OpenPICState *opp = (OpenPICState *) opaque;
+	unsigned int i;
+
+	if (version_id != 1) {
+		return -EINVAL;
+	}
+
+	qemu_get_be32s(f, &opp->gcr);
+	qemu_get_be32s(f, &opp->vir);
+	qemu_get_be32s(f, &opp->pir);
+	qemu_get_be32s(f, &opp->spve);
+	qemu_get_be32s(f, &opp->tfrr);
+
+	qemu_get_be32s(f, &opp->nb_cpus);
+
+	for (i = 0; i < opp->nb_cpus; i++) {
+		qemu_get_sbe32s(f, &opp->dst[i].ctpr);
+		openpic_load_IRQ_queue(f, &opp->dst[i].raised);
+		openpic_load_IRQ_queue(f, &opp->dst[i].servicing);
+		qemu_get_buffer(f, (uint8_t *) & opp->dst[i].outputs_active,
+				sizeof(opp->dst[i].outputs_active));
+	}
+
+	for (i = 0; i < MAX_TMR; i++) {
+		qemu_get_be32s(f, &opp->timers[i].tccr);
+		qemu_get_be32s(f, &opp->timers[i].tbcr);
+	}
+
+	for (i = 0; i < opp->max_irq; i++) {
+		uint32_t val;
+
+		val = qemu_get_be32(f);
+		write_IRQreg_idr(opp, i, val);
+		val = qemu_get_be32(f);
+		write_IRQreg_ivpr(opp, i, val);
+
+		qemu_get_be32s(f, &opp->src[i].ivpr);
+		qemu_get_be32s(f, &opp->src[i].idr);
+		qemu_get_be32s(f, &opp->src[i].destmask);
+		qemu_get_sbe32s(f, &opp->src[i].last_cpu);
+		qemu_get_sbe32s(f, &opp->src[i].pending);
+	}
+
+	return 0;
+}
+
+typedef struct MemReg {
+	const char *name;
+	MemoryRegionOps const *ops;
+	hwaddr start_addr;
+	ram_addr_t size;
+} MemReg;
+
+static void fsl_common_init(OpenPICState * opp)
+{
+	int i;
+	int virq = MAX_SRC;
+
+	opp->vid = VID_REVISION_1_2;
+	opp->vir = VIR_GENERIC;
+	opp->vector_mask = 0xFFFF;
+	opp->tfrr_reset = 0;
+	opp->ivpr_reset = IVPR_MASK_MASK;
+	opp->idr_reset = 1 << 0;
+	opp->max_irq = MAX_IRQ;
+
+	opp->irq_ipi0 = virq;
+	virq += MAX_IPI;
+	opp->irq_tim0 = virq;
+	virq += MAX_TMR;
+
+	assert(virq <= MAX_IRQ);
+
+	opp->irq_msi = 224;
+
+	msi_supported = true;
+	for (i = 0; i < opp->fsl->max_ext; i++) {
+		opp->src[i].level = false;
+	}
+
+	/* Internal interrupts, including message and MSI */
+	for (i = 16; i < MAX_SRC; i++) {
+		opp->src[i].type = IRQ_TYPE_FSLINT;
+		opp->src[i].level = true;
+	}
+
+	/* timers and IPIs */
+	for (i = MAX_SRC; i < virq; i++) {
+		opp->src[i].type = IRQ_TYPE_FSLSPECIAL;
+		opp->src[i].level = false;
+	}
+}
+
+static void map_list(OpenPICState * opp, const MemReg * list, int *count)
+{
+	while (list->name) {
+		assert(*count < ARRAY_SIZE(opp->sub_io_mem));
+
+		memory_region_init_io(&opp->sub_io_mem[*count], list->ops, opp,
+				      list->name, list->size);
+
+		memory_region_add_subregion(&opp->mem, list->start_addr,
+					    &opp->sub_io_mem[*count]);
+
+		(*count)++;
+		list++;
+	}
+}
+
+static int openpic_init(SysBusDevice * dev)
+{
+	OpenPICState *opp = FROM_SYSBUS(typeof(*opp), dev);
+	int i, j;
+	int list_count = 0;
+	static const MemReg list_le[] = {
+		{"glb", &openpic_glb_ops_le,
+		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
+		{"tmr", &openpic_tmr_ops_le,
+		 OPENPIC_TMR_REG_START, OPENPIC_TMR_REG_SIZE},
+		{"src", &openpic_src_ops_le,
+		 OPENPIC_SRC_REG_START, OPENPIC_SRC_REG_SIZE},
+		{"cpu", &openpic_cpu_ops_le,
+		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
+		{NULL}
+	};
+	static const MemReg list_be[] = {
+		{"glb", &openpic_glb_ops_be,
+		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
+		{"tmr", &openpic_tmr_ops_be,
+		 OPENPIC_TMR_REG_START, OPENPIC_TMR_REG_SIZE},
+		{"src", &openpic_src_ops_be,
+		 OPENPIC_SRC_REG_START, OPENPIC_SRC_REG_SIZE},
+		{"cpu", &openpic_cpu_ops_be,
+		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
+		{NULL}
+	};
+	static const MemReg list_fsl[] = {
+		{"msi", &openpic_msi_ops_be,
+		 OPENPIC_MSI_REG_START, OPENPIC_MSI_REG_SIZE},
+		{"summary", &openpic_summary_ops_be,
+		 OPENPIC_SUMMARY_REG_START, OPENPIC_SUMMARY_REG_SIZE},
+		{NULL}
+	};
+
+	memory_region_init(&opp->mem, "openpic", 0x40000);
+
+	switch (opp->model) {
+	case OPENPIC_MODEL_FSL_MPIC_20:
+	default:
+		opp->fsl = &fsl_mpic_20;
+		opp->brr1 = 0x00400200;
+		opp->flags |= OPENPIC_FLAG_IDR_CRIT;
+		opp->nb_irqs = 80;
+		opp->mpic_mode_mask = GCR_MODE_MIXED;
+
+		fsl_common_init(opp);
+		map_list(opp, list_be, &list_count);
+		map_list(opp, list_fsl, &list_count);
+
+		break;
+
+	case OPENPIC_MODEL_FSL_MPIC_42:
+		opp->fsl = &fsl_mpic_42;
+		opp->brr1 = 0x00400402;
+		opp->flags |= OPENPIC_FLAG_ILR;
+		opp->nb_irqs = 196;
+		opp->mpic_mode_mask = GCR_MODE_PROXY;
+
+		fsl_common_init(opp);
+		map_list(opp, list_be, &list_count);
+		map_list(opp, list_fsl, &list_count);
+
+		break;
+
+	case OPENPIC_MODEL_RAVEN:
+		opp->nb_irqs = RAVEN_MAX_EXT;
+		opp->vid = VID_REVISION_1_3;
+		opp->vir = VIR_GENERIC;
+		opp->vector_mask = 0xFF;
+		opp->tfrr_reset = 4160000;
+		opp->ivpr_reset = IVPR_MASK_MASK | IVPR_MODE_MASK;
+		opp->idr_reset = 0;
+		opp->max_irq = RAVEN_MAX_IRQ;
+		opp->irq_ipi0 = RAVEN_IPI_IRQ;
+		opp->irq_tim0 = RAVEN_TMR_IRQ;
+		opp->brr1 = -1;
+		opp->mpic_mode_mask = GCR_MODE_MIXED;
+
+		/* Only UP supported today */
+		if (opp->nb_cpus != 1) {
+			return -EINVAL;
+		}
+
+		map_list(opp, list_le, &list_count);
+		break;
+	}
+
+	for (i = 0; i < opp->nb_cpus; i++) {
+		opp->dst[i].irqs = g_new(qemu_irq, OPENPIC_OUTPUT_NB);
+		for (j = 0; j < OPENPIC_OUTPUT_NB; j++) {
+			sysbus_init_irq(dev, &opp->dst[i].irqs[j]);
+		}
+	}
+
+	register_savevm(&opp->busdev.qdev, "openpic", 0, 2,
+			openpic_save, openpic_load, opp);
+
+	sysbus_init_mmio(dev, &opp->mem);
+	qdev_init_gpio_in(&dev->qdev, openpic_set_irq, opp->max_irq);
+
+	return 0;
+}
+
+static Property openpic_properties[] = {
+	DEFINE_PROP_UINT32("model", OpenPICState, model,
+			   OPENPIC_MODEL_FSL_MPIC_20),
+	DEFINE_PROP_UINT32("nb_cpus", OpenPICState, nb_cpus, 1),
+	DEFINE_PROP_END_OF_LIST(),
+};
+
+static void openpic_class_init(ObjectClass * klass, void *data)
+{
+	DeviceClass *dc = DEVICE_CLASS(klass);
+	SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
+
+	k->init = openpic_init;
+	dc->props = openpic_properties;
+	dc->reset = openpic_reset;
+}
+
+static const TypeInfo openpic_info = {
+	.name = "openpic",
+	.parent = TYPE_SYS_BUS_DEVICE,
+	.instance_size = sizeof(OpenPICState),
+	.class_init = openpic_class_init,
+};
+
+static void openpic_register_types(void)
+{
+	type_register_static(&openpic_info);
+}
+
+type_init(openpic_register_types)
-- 
1.6.0.2



* [PATCH 10/17] kvm/ppc/mpic: import hw/openpic.c from QEMU
@ 2013-04-19 14:06   ` Alexander Graf
  0 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

This is QEMU's hw/openpic.c from commit
abd8d4a4d6dfea7ddea72f095f993e1de941614e ("Update version for
1.4.0-rc0"), run through Lindent with no other changes to ease merging
future changes between Linux and QEMU.  Remaining style issues
(including those introduced by Lindent) will be fixed in a later patch.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/mpic.c | 1686 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 1686 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/mpic.c
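
Reading aid for the distributed delivery branch of openpic_update_irq():
the imported code starts scanning at src->last_cpu + 1, wraps at
opp->nb_cpus, hands the interrupt to the first CPU whose bit is set in
src->destmask, and records that CPU as the new last_cpu.  The standalone
sketch below shows the same round-robin idea.  pick_next_cpu() is a
hypothetical helper that does not exist in QEMU or in this patch.  It
visits each CPU exactly once, ending with last_cpu itself, which is a
simplification of mine rather than a copy of the loop in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: choose the destination
 * CPU for distributed delivery by scanning round-robin, starting just
 * after the CPU that received the previous interrupt.
 *
 * Returns the chosen CPU index, or -1 if destmask selects no CPU
 * below nb_cpus.
 */
static int pick_next_cpu(uint32_t destmask, int last_cpu, int nb_cpus)
{
	int step;

	/* destmask is a 32-bit mask, so at most 32 CPUs fit. */
	assert(nb_cpus > 0 && nb_cpus <= 32);
	assert(last_cpu >= 0 && last_cpu < nb_cpus);

	/* Visit every CPU once; last_cpu itself is tried last. */
	for (step = 1; step <= nb_cpus; step++) {
		int cpu = (last_cpu + step) % nb_cpus;

		if (destmask & (UINT32_C(1) << cpu))
			return cpu;
	}

	return -1;
}
```

A caller would store the return value back as last_cpu, so that the
next interrupt from the same source moves on to the following CPU.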

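Reading aid for openpic_save_IRQ_queue(): its comment says the lower
half of each 64-bit bitmap word is written first, so that a 32-bit host
can restore the stream.  The sketch below shows a save/load pair that
keeps that order on both sides.  Everything in it is an assumption made
for illustration: struct word_stream stands in for QEMUFile, the helper
names are hypothetical, and a fixed uint64_t replaces the patch's
unsigned long with its LONG_MAX check.  It is not a copy of
openpic_load_IRQ_queue(), which as imported shifts the first word it
reads into the upper half on 64-bit hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a QEMUFile: a flat array of 32-bit words. */
struct word_stream {
	uint32_t words[16];
	size_t pos;
};

static void stream_put32(struct word_stream *s, uint32_t v)
{
	assert(s->pos < sizeof(s->words) / sizeof(s->words[0]));
	s->words[s->pos++] = v;
}

static uint32_t stream_get32(struct word_stream *s)
{
	assert(s->pos < sizeof(s->words) / sizeof(s->words[0]));
	return s->words[s->pos++];
}

/* Save one 64-bit bitmap word, lower half first. */
static void save_word(struct word_stream *s, uint64_t w)
{
	stream_put32(s, (uint32_t)w);
	stream_put32(s, (uint32_t)(w >> 32));
}

/* Load it back in the same order: the first word is the lower half. */
static uint64_t load_word(struct word_stream *s)
{
	uint64_t lo = stream_get32(s);
	uint64_t hi = stream_get32(s);

	return lo | (hi << 32);
}
```

Keeping the lower half first means the least significant bits, which
the comment ties to the lower IRQ numbers, always arrive in the first
word regardless of the host's word size.
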
diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
new file mode 100644
index 0000000..57655b9
--- /dev/null
+++ b/arch/powerpc/kvm/mpic.c
@@ -0,0 +1,1686 @@
+/*
+ * OpenPIC emulation
+ *
+ * Copyright (c) 2004 Jocelyn Mayer
+ *               2011 Alexander Graf
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+/*
+ *
+ * Based on OpenPic implementations:
+ * - Intel GW80314 I/O companion chip developer's manual
+ * - Motorola MPC8245 & MPC8540 user manuals.
+ * - Motorola MCP750 (aka Raven) programmer manual.
+ * - Motorola Harrier programmer manual
+ *
+ * Serial interrupts, as implemented in Raven chipset are not supported yet.
+ *
+ */
+#include "hw.h"
+#include "ppc/mac.h"
+#include "pci/pci.h"
+#include "openpic.h"
+#include "sysbus.h"
+#include "pci/msi.h"
+#include "qemu/bitops.h"
+#include "ppc.h"
+
+//#define DEBUG_OPENPIC
+
+#ifdef DEBUG_OPENPIC
+static const int debug_openpic = 1;
+#else
+static const int debug_openpic = 0;
+#endif
+
+#define DPRINTF(fmt, ...) do { \
+        if (debug_openpic) { \
+            printf(fmt , ## __VA_ARGS__); \
+        } \
+    } while (0)
+
+#define MAX_CPU     32
+#define MAX_SRC     256
+#define MAX_TMR     4
+#define MAX_IPI     4
+#define MAX_MSI     8
+#define MAX_IRQ     (MAX_SRC + MAX_IPI + MAX_TMR)
+#define VID         0x03	/* MPIC version ID */
+
+/* OpenPIC capability flags */
+#define OPENPIC_FLAG_IDR_CRIT     (1 << 0)
+#define OPENPIC_FLAG_ILR          (2 << 0)
+
+/* OpenPIC address map */
+#define OPENPIC_GLB_REG_START        0x0
+#define OPENPIC_GLB_REG_SIZE         0x10F0
+#define OPENPIC_TMR_REG_START        0x10F0
+#define OPENPIC_TMR_REG_SIZE         0x220
+#define OPENPIC_MSI_REG_START        0x1600
+#define OPENPIC_MSI_REG_SIZE         0x200
+#define OPENPIC_SUMMARY_REG_START   0x3800
+#define OPENPIC_SUMMARY_REG_SIZE    0x800
+#define OPENPIC_SRC_REG_START        0x10000
+#define OPENPIC_SRC_REG_SIZE         (MAX_SRC * 0x20)
+#define OPENPIC_CPU_REG_START        0x20000
+#define OPENPIC_CPU_REG_SIZE         0x100 + ((MAX_CPU - 1) * 0x1000)
+
+/* Raven */
+#define RAVEN_MAX_CPU      2
+#define RAVEN_MAX_EXT     48
+#define RAVEN_MAX_IRQ     64
+#define RAVEN_MAX_TMR      MAX_TMR
+#define RAVEN_MAX_IPI      MAX_IPI
+
+/* Interrupt definitions */
+#define RAVEN_FE_IRQ     (RAVEN_MAX_EXT)	/* Internal functional IRQ */
+#define RAVEN_ERR_IRQ    (RAVEN_MAX_EXT + 1)	/* Error IRQ */
+#define RAVEN_TMR_IRQ    (RAVEN_MAX_EXT + 2)	/* First timer IRQ */
+#define RAVEN_IPI_IRQ    (RAVEN_TMR_IRQ + RAVEN_MAX_TMR)	/* First IPI IRQ */
+/* First doorbell IRQ */
+#define RAVEN_DBL_IRQ    (RAVEN_IPI_IRQ + (RAVEN_MAX_CPU * RAVEN_MAX_IPI))
+
+typedef struct FslMpicInfo {
+	int max_ext;
+} FslMpicInfo;
+
+static FslMpicInfo fsl_mpic_20 = {
+	.max_ext = 12,
+};
+
+static FslMpicInfo fsl_mpic_42 = {
+	.max_ext = 12,
+};
+
+#define FRR_NIRQ_SHIFT    16
+#define FRR_NCPU_SHIFT     8
+#define FRR_VID_SHIFT      0
+
+#define VID_REVISION_1_2   2
+#define VID_REVISION_1_3   3
+
+#define VIR_GENERIC      0x00000000	/* Generic Vendor ID */
+
+#define GCR_RESET        0x80000000
+#define GCR_MODE_PASS    0x00000000
+#define GCR_MODE_MIXED   0x20000000
+#define GCR_MODE_PROXY   0x60000000
+
+#define TBCR_CI           0x80000000	/* count inhibit */
+#define TCCR_TOG          0x80000000	/* toggles when decrement to zero */
+
+#define IDR_EP_SHIFT      31
+#define IDR_EP_MASK       (1 << IDR_EP_SHIFT)
+#define IDR_CI0_SHIFT     30
+#define IDR_CI1_SHIFT     29
+#define IDR_P1_SHIFT      1
+#define IDR_P0_SHIFT      0
+
+#define ILR_INTTGT_MASK   0x000000ff
+#define ILR_INTTGT_INT    0x00
+#define ILR_INTTGT_CINT   0x01	/* critical */
+#define ILR_INTTGT_MCP    0x02	/* machine check */
+
+/* The currently supported INTTGT values happen to be the same as QEMU's
+ * openpic output codes, but don't depend on this.  The output codes
+ * could change (unlikely, but...) or support could be added for
+ * more INTTGT values.
+ */
+static const int inttgt_output[][2] = {
+	{ILR_INTTGT_INT, OPENPIC_OUTPUT_INT},
+	{ILR_INTTGT_CINT, OPENPIC_OUTPUT_CINT},
+	{ILR_INTTGT_MCP, OPENPIC_OUTPUT_MCK},
+};
+
+static int inttgt_to_output(int inttgt)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(inttgt_output); i++) {
+		if (inttgt_output[i][0] == inttgt) {
+			return inttgt_output[i][1];
+		}
+	}
+
+	fprintf(stderr, "%s: unsupported inttgt %d\n", __func__, inttgt);
+	return OPENPIC_OUTPUT_INT;
+}
+
+static int output_to_inttgt(int output)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(inttgt_output); i++) {
+		if (inttgt_output[i][1] == output) {
+			return inttgt_output[i][0];
+		}
+	}
+
+	abort();
+}
+
+#define MSIIR_OFFSET       0x140
+#define MSIIR_SRS_SHIFT    29
+#define MSIIR_SRS_MASK     (0x7 << MSIIR_SRS_SHIFT)
+#define MSIIR_IBS_SHIFT    24
+#define MSIIR_IBS_MASK     (0x1f << MSIIR_IBS_SHIFT)
+
+static int get_current_cpu(void)
+{
+	CPUState *cpu_single_cpu;
+
+	if (!cpu_single_env) {
+		return -1;
+	}
+
+	cpu_single_cpu = ENV_GET_CPU(cpu_single_env);
+	return cpu_single_cpu->cpu_index;
+}
+
+static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx);
+static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
+				       uint32_t val, int idx);
+
+typedef enum IRQType {
+	IRQ_TYPE_NORMAL = 0,
+	IRQ_TYPE_FSLINT,	/* FSL internal interrupt -- level only */
+	IRQ_TYPE_FSLSPECIAL,	/* FSL timer/IPI interrupt, edge, no polarity */
+} IRQType;
+
+typedef struct IRQQueue {
+	/* Round up to the nearest 64 IRQs so that the queue length
+	 * won't change when moving between 32 and 64 bit hosts.
+	 */
+	unsigned long queue[BITS_TO_LONGS((MAX_IRQ + 63) & ~63)];
+	int next;
+	int priority;
+} IRQQueue;
+
+typedef struct IRQSource {
+	uint32_t ivpr;		/* IRQ vector/priority register */
+	uint32_t idr;		/* IRQ destination register */
+	uint32_t destmask;	/* bitmap of CPU destinations */
+	int last_cpu;
+	int output;		/* IRQ level, e.g. OPENPIC_OUTPUT_INT */
+	int pending;		/* TRUE if IRQ is pending */
+	IRQType type;
+	bool level:1;		/* level-triggered */
+	bool nomask:1;		/* critical interrupts ignore mask on some FSL MPICs */
+} IRQSource;
+
+#define IVPR_MASK_SHIFT       31
+#define IVPR_MASK_MASK        (1 << IVPR_MASK_SHIFT)
+#define IVPR_ACTIVITY_SHIFT   30
+#define IVPR_ACTIVITY_MASK    (1 << IVPR_ACTIVITY_SHIFT)
+#define IVPR_MODE_SHIFT       29
+#define IVPR_MODE_MASK        (1 << IVPR_MODE_SHIFT)
+#define IVPR_POLARITY_SHIFT   23
+#define IVPR_POLARITY_MASK    (1 << IVPR_POLARITY_SHIFT)
+#define IVPR_SENSE_SHIFT      22
+#define IVPR_SENSE_MASK       (1 << IVPR_SENSE_SHIFT)
+
+#define IVPR_PRIORITY_MASK     (0xF << 16)
+#define IVPR_PRIORITY(_ivprr_) ((int)(((_ivprr_) & IVPR_PRIORITY_MASK) >> 16))
+#define IVPR_VECTOR(opp, _ivprr_) ((_ivprr_) & (opp)->vector_mask)
+
+/* IDR[EP/CI] are only for FSL MPIC prior to v4.0 */
+#define IDR_EP      0x80000000	/* external pin */
+#define IDR_CI      0x40000000	/* critical interrupt */
+
+typedef struct IRQDest {
+	int32_t ctpr;		/* CPU current task priority */
+	IRQQueue raised;
+	IRQQueue servicing;
+	qemu_irq *irqs;
+
+	/* Count of IRQ sources asserting on non-INT outputs */
+	uint32_t outputs_active[OPENPIC_OUTPUT_NB];
+} IRQDest;
+
+typedef struct OpenPICState {
+	SysBusDevice busdev;
+	MemoryRegion mem;
+
+	/* Behavior control */
+	FslMpicInfo *fsl;
+	uint32_t model;
+	uint32_t flags;
+	uint32_t nb_irqs;
+	uint32_t vid;
+	uint32_t vir;		/* Vendor identification register */
+	uint32_t vector_mask;
+	uint32_t tfrr_reset;
+	uint32_t ivpr_reset;
+	uint32_t idr_reset;
+	uint32_t brr1;
+	uint32_t mpic_mode_mask;
+
+	/* Sub-regions */
+	MemoryRegion sub_io_mem[6];
+
+	/* Global registers */
+	uint32_t frr;		/* Feature reporting register */
+	uint32_t gcr;		/* Global configuration register  */
+	uint32_t pir;		/* Processor initialization register */
+	uint32_t spve;		/* Spurious vector register */
+	uint32_t tfrr;		/* Timer frequency reporting register */
+	/* Source registers */
+	IRQSource src[MAX_IRQ];
+	/* Local registers per output pin */
+	IRQDest dst[MAX_CPU];
+	uint32_t nb_cpus;
+	/* Timer registers */
+	struct {
+		uint32_t tccr;	/* Global timer current count register */
+		uint32_t tbcr;	/* Global timer base count register */
+	} timers[MAX_TMR];
+	/* Shared MSI registers */
+	struct {
+		uint32_t msir;	/* Shared Message Signaled Interrupt Register */
+	} msi[MAX_MSI];
+	uint32_t max_irq;
+	uint32_t irq_ipi0;
+	uint32_t irq_tim0;
+	uint32_t irq_msi;
+} OpenPICState;
+
+static inline void IRQ_setbit(IRQQueue * q, int n_IRQ)
+{
+	set_bit(n_IRQ, q->queue);
+}
+
+static inline void IRQ_resetbit(IRQQueue * q, int n_IRQ)
+{
+	clear_bit(n_IRQ, q->queue);
+}
+
+static inline int IRQ_testbit(IRQQueue * q, int n_IRQ)
+{
+	return test_bit(n_IRQ, q->queue);
+}
+
+static void IRQ_check(OpenPICState * opp, IRQQueue * q)
+{
+	int irq = -1;
+	int next = -1;
+	int priority = -1;
+
+	for (;;) {
+		irq = find_next_bit(q->queue, opp->max_irq, irq + 1);
+		if (irq == opp->max_irq) {
+			break;
+		}
+
+		DPRINTF("IRQ_check: irq %d set ivpr_pr=%d pr=%d\n",
+			irq, IVPR_PRIORITY(opp->src[irq].ivpr), priority);
+
+		if (IVPR_PRIORITY(opp->src[irq].ivpr) > priority) {
+			next = irq;
+			priority = IVPR_PRIORITY(opp->src[irq].ivpr);
+		}
+	}
+
+	q->next = next;
+	q->priority = priority;
+}
+
+static int IRQ_get_next(OpenPICState * opp, IRQQueue * q)
+{
+	/* XXX: optimize */
+	IRQ_check(opp, q);
+
+	return q->next;
+}
+
+static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
+			   bool active, bool was_active)
+{
+	IRQDest *dst;
+	IRQSource *src;
+	int priority;
+
+	dst = &opp->dst[n_CPU];
+	src = &opp->src[n_IRQ];
+
+	DPRINTF("%s: IRQ %d active %d was %d\n",
+		__func__, n_IRQ, active, was_active);
+
+	if (src->output != OPENPIC_OUTPUT_INT) {
+		DPRINTF("%s: output %d irq %d active %d was %d count %d\n",
+			__func__, src->output, n_IRQ, active, was_active,
+			dst->outputs_active[src->output]);
+
+		/* On Freescale MPIC, critical interrupts ignore priority,
+		 * IACK, EOI, etc.  Before MPIC v4.1 they also ignore
+		 * masking.
+		 */
+		if (active) {
+			if (!was_active
+			    && dst->outputs_active[src->output]++ == 0) {
+				DPRINTF
+				    ("%s: Raise OpenPIC output %d cpu %d irq %d\n",
+				     __func__, src->output, n_CPU, n_IRQ);
+				qemu_irq_raise(dst->irqs[src->output]);
+			}
+		} else {
+			if (was_active
+			    && --dst->outputs_active[src->output] == 0) {
+				DPRINTF
+				    ("%s: Lower OpenPIC output %d cpu %d irq %d\n",
+				     __func__, src->output, n_CPU, n_IRQ);
+				qemu_irq_lower(dst->irqs[src->output]);
+			}
+		}
+
+		return;
+	}
+
+	priority = IVPR_PRIORITY(src->ivpr);
+
+	/* Even if the interrupt doesn't have enough priority,
+	 * it is still raised, in case ctpr is lowered later.
+	 */
+	if (active) {
+		IRQ_setbit(&dst->raised, n_IRQ);
+	} else {
+		IRQ_resetbit(&dst->raised, n_IRQ);
+	}
+
+	IRQ_check(opp, &dst->raised);
+
+	if (active && priority <= dst->ctpr) {
+		DPRINTF
+		    ("%s: IRQ %d priority %d too low for ctpr %d on CPU %d\n",
+		     __func__, n_IRQ, priority, dst->ctpr, n_CPU);
+		active = 0;
+	}
+
+	if (active) {
+		if (IRQ_get_next(opp, &dst->servicing) >= 0 &&
+		    priority <= dst->servicing.priority) {
+			DPRINTF
+			    ("%s: IRQ %d is hidden by servicing IRQ %d on CPU %d\n",
+			     __func__, n_IRQ, dst->servicing.next, n_CPU);
+		} else {
+			DPRINTF
+			    ("%s: Raise OpenPIC INT output cpu %d irq %d/%d\n",
+			     __func__, n_CPU, n_IRQ, dst->raised.next);
+			qemu_irq_raise(opp->dst[n_CPU].
+				       irqs[OPENPIC_OUTPUT_INT]);
+		}
+	} else {
+		IRQ_get_next(opp, &dst->servicing);
+		if (dst->raised.priority > dst->ctpr &&
+		    dst->raised.priority > dst->servicing.priority) {
+			DPRINTF
+			    ("%s: IRQ %d inactive, IRQ %d prio %d above %d/%d, CPU %d\n",
+			     __func__, n_IRQ, dst->raised.next,
+			     dst->raised.priority, dst->ctpr,
+			     dst->servicing.priority, n_CPU);
+			/* IRQ line stays asserted */
+		} else {
+			DPRINTF
+			    ("%s: IRQ %d inactive, current prio %d/%d, CPU %d\n",
+			     __func__, n_IRQ, dst->ctpr,
+			     dst->servicing.priority, n_CPU);
+			qemu_irq_lower(opp->dst[n_CPU].
+				       irqs[OPENPIC_OUTPUT_INT]);
+		}
+	}
+}
+
+/* update pic state because registers for n_IRQ have changed value */
+static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
+{
+	IRQSource *src;
+	bool active, was_active;
+	int i;
+
+	src = &opp->src[n_IRQ];
+	active = src->pending;
+
+	if ((src->ivpr & IVPR_MASK_MASK) && !src->nomask) {
+		/* Interrupt source is disabled */
+		DPRINTF("%s: IRQ %d is disabled\n", __func__, n_IRQ);
+		active = false;
+	}
+
+	was_active = ! !(src->ivpr & IVPR_ACTIVITY_MASK);
+
+	/*
+	 * We don't have a similar check for already-active because
+	 * ctpr may have changed and we need to withdraw the interrupt.
+	 */
+	if (!active && !was_active) {
+		DPRINTF("%s: IRQ %d is already inactive\n", __func__, n_IRQ);
+		return;
+	}
+
+	if (active) {
+		src->ivpr |= IVPR_ACTIVITY_MASK;
+	} else {
+		src->ivpr &= ~IVPR_ACTIVITY_MASK;
+	}
+
+	if (src->destmask == 0) {
+		/* No target */
+		DPRINTF("%s: IRQ %d has no target\n", __func__, n_IRQ);
+		return;
+	}
+
+	if (src->destmask == (1 << src->last_cpu)) {
+		/* Only one CPU is allowed to receive this IRQ */
+		IRQ_local_pipe(opp, src->last_cpu, n_IRQ, active, was_active);
+	} else if (!(src->ivpr & IVPR_MODE_MASK)) {
+		/* Directed delivery mode */
+		for (i = 0; i < opp->nb_cpus; i++) {
+			if (src->destmask & (1 << i)) {
+				IRQ_local_pipe(opp, i, n_IRQ, active,
+					       was_active);
+			}
+		}
+	} else {
+		/* Distributed delivery mode */
+		for (i = src->last_cpu + 1; i != src->last_cpu; i++) {
+			if (i == opp->nb_cpus) {
+				i = 0;
+			}
+			if (src->destmask & (1 << i)) {
+				IRQ_local_pipe(opp, i, n_IRQ, active,
+					       was_active);
+				src->last_cpu = i;
+				break;
+			}
+		}
+	}
+}
+
+static void openpic_set_irq(void *opaque, int n_IRQ, int level)
+{
+	OpenPICState *opp = opaque;
+	IRQSource *src;
+
+	if (n_IRQ >= MAX_IRQ) {
+		fprintf(stderr, "%s: IRQ %d out of range\n", __func__, n_IRQ);
+		abort();
+	}
+
+	src = &opp->src[n_IRQ];
+	DPRINTF("openpic: set irq %d = %d ivpr=0x%08x\n",
+		n_IRQ, level, src->ivpr);
+	if (src->level) {
+		/* level-sensitive irq */
+		src->pending = level;
+		openpic_update_irq(opp, n_IRQ);
+	} else {
+		/* edge-sensitive irq */
+		if (level) {
+			src->pending = 1;
+			openpic_update_irq(opp, n_IRQ);
+		}
+
+		if (src->output != OPENPIC_OUTPUT_INT) {
+			/* Edge-triggered interrupts shouldn't be used
+			 * with non-INT delivery, but just in case,
+			 * try to make it do something sane rather than
+			 * cause an interrupt storm.  This is close to
+			 * what you'd probably see happen in real hardware.
+			 */
+			src->pending = 0;
+			openpic_update_irq(opp, n_IRQ);
+		}
+	}
+}
+
+static void openpic_reset(DeviceState * d)
+{
+	OpenPICState *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d));
+	int i;
+
+	opp->gcr = GCR_RESET;
+	/* Initialise controller registers */
+	opp->frr = ((opp->nb_irqs - 1) << FRR_NIRQ_SHIFT) |
+	    ((opp->nb_cpus - 1) << FRR_NCPU_SHIFT) |
+	    (opp->vid << FRR_VID_SHIFT);
+
+	opp->pir = 0;
+	opp->spve = -1 & opp->vector_mask;
+	opp->tfrr = opp->tfrr_reset;
+	/* Initialise IRQ sources */
+	for (i = 0; i < opp->max_irq; i++) {
+		opp->src[i].ivpr = opp->ivpr_reset;
+		opp->src[i].idr = opp->idr_reset;
+
+		switch (opp->src[i].type) {
+		case IRQ_TYPE_NORMAL:
+			opp->src[i].level =
+			    ! !(opp->ivpr_reset & IVPR_SENSE_MASK);
+			break;
+
+		case IRQ_TYPE_FSLINT:
+			opp->src[i].ivpr |= IVPR_POLARITY_MASK;
+			break;
+
+		case IRQ_TYPE_FSLSPECIAL:
+			break;
+		}
+	}
+	/* Initialise IRQ destinations */
+	for (i = 0; i < MAX_CPU; i++) {
+		opp->dst[i].ctpr = 15;
+		memset(&opp->dst[i].raised, 0, sizeof(IRQQueue));
+		opp->dst[i].raised.next = -1;
+		memset(&opp->dst[i].servicing, 0, sizeof(IRQQueue));
+		opp->dst[i].servicing.next = -1;
+	}
+	/* Initialise timers */
+	for (i = 0; i < MAX_TMR; i++) {
+		opp->timers[i].tccr = 0;
+		opp->timers[i].tbcr = TBCR_CI;
+	}
+	/* Go out of RESET state */
+	opp->gcr = 0;
+}
+
+static inline uint32_t read_IRQreg_idr(OpenPICState * opp, int n_IRQ)
+{
+	return opp->src[n_IRQ].idr;
+}
+
+static inline uint32_t read_IRQreg_ilr(OpenPICState * opp, int n_IRQ)
+{
+	if (opp->flags & OPENPIC_FLAG_ILR) {
+		return output_to_inttgt(opp->src[n_IRQ].output);
+	}
+
+	return 0xffffffff;
+}
+
+static inline uint32_t read_IRQreg_ivpr(OpenPICState * opp, int n_IRQ)
+{
+	return opp->src[n_IRQ].ivpr;
+}
+
+static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
+{
+	IRQSource *src = &opp->src[n_IRQ];
+	uint32_t normal_mask = (1UL << opp->nb_cpus) - 1;
+	uint32_t crit_mask = 0;
+	uint32_t mask = normal_mask;
+	int crit_shift = IDR_EP_SHIFT - opp->nb_cpus;
+	int i;
+
+	if (opp->flags & OPENPIC_FLAG_IDR_CRIT) {
+		crit_mask = mask << crit_shift;
+		mask |= crit_mask | IDR_EP;
+	}
+
+	src->idr = val & mask;
+	DPRINTF("Set IDR %d to 0x%08x\n", n_IRQ, src->idr);
+
+	if (opp->flags & OPENPIC_FLAG_IDR_CRIT) {
+		if (src->idr & crit_mask) {
+			if (src->idr & normal_mask) {
+				DPRINTF
+				    ("%s: IRQ configured for multiple output types, using "
+				     "critical\n", __func__);
+			}
+
+			src->output = OPENPIC_OUTPUT_CINT;
+			src->nomask = true;
+			src->destmask = 0;
+
+			for (i = 0; i < opp->nb_cpus; i++) {
+				int n_ci = IDR_CI0_SHIFT - i;
+
+				if (src->idr & (1UL << n_ci)) {
+					src->destmask |= 1UL << i;
+				}
+			}
+		} else {
+			src->output = OPENPIC_OUTPUT_INT;
+			src->nomask = false;
+			src->destmask = src->idr & normal_mask;
+		}
+	} else {
+		src->destmask = src->idr;
+	}
+}
+
+static inline void write_IRQreg_ilr(OpenPICState * opp, int n_IRQ, uint32_t val)
+{
+	if (opp->flags & OPENPIC_FLAG_ILR) {
+		IRQSource *src = &opp->src[n_IRQ];
+
+		src->output = inttgt_to_output(val & ILR_INTTGT_MASK);
+		DPRINTF("Set ILR %d to 0x%08x, output %d\n", n_IRQ, src->idr,
+			src->output);
+
+		/* TODO: on MPIC v4.0 only, set nomask for non-INT */
+	}
+}
+
+static inline void write_IRQreg_ivpr(OpenPICState * opp, int n_IRQ,
+				     uint32_t val)
+{
+	uint32_t mask;
+
+	/* NOTE when implementing newer FSL MPIC models: starting with v4.0,
+	 * the polarity bit is read-only on internal interrupts.
+	 */
+	mask = IVPR_MASK_MASK | IVPR_PRIORITY_MASK | IVPR_SENSE_MASK |
+	    IVPR_POLARITY_MASK | opp->vector_mask;
+
+	/* ACTIVITY bit is read-only */
+	opp->src[n_IRQ].ivpr =
+	    (opp->src[n_IRQ].ivpr & IVPR_ACTIVITY_MASK) | (val & mask);
+
+	/* For FSL internal interrupts, The sense bit is reserved and zero,
+	 * and the interrupt is always level-triggered.  Timers and IPIs
+	 * have no sense or polarity bits, and are edge-triggered.
+	 */
+	switch (opp->src[n_IRQ].type) {
+	case IRQ_TYPE_NORMAL:
+		opp->src[n_IRQ].level =
+		    ! !(opp->src[n_IRQ].ivpr & IVPR_SENSE_MASK);
+		break;
+
+	case IRQ_TYPE_FSLINT:
+		opp->src[n_IRQ].ivpr &= ~IVPR_SENSE_MASK;
+		break;
+
+	case IRQ_TYPE_FSLSPECIAL:
+		opp->src[n_IRQ].ivpr &= ~(IVPR_POLARITY_MASK | IVPR_SENSE_MASK);
+		break;
+	}
+
+	openpic_update_irq(opp, n_IRQ);
+	DPRINTF("Set IVPR %d to 0x%08x -> 0x%08x\n", n_IRQ, val,
+		opp->src[n_IRQ].ivpr);
+}
+
+static void openpic_gcr_write(OpenPICState * opp, uint64_t val)
+{
+	bool mpic_proxy = false;
+
+	if (val & GCR_RESET) {
+		openpic_reset(&opp->busdev.qdev);
+		return;
+	}
+
+	opp->gcr &= ~opp->mpic_mode_mask;
+	opp->gcr |= val & opp->mpic_mode_mask;
+
+	/* Set external proxy mode */
+	if ((val & opp->mpic_mode_mask) == GCR_MODE_PROXY) {
+		mpic_proxy = true;
+	}
+
+	ppce500_set_mpic_proxy(mpic_proxy);
+}
+
+static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned len)
+{
+	OpenPICState *opp = opaque;
+	IRQDest *dst;
+	int idx;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+		__func__, addr, val);
+	if (addr & 0xF) {
+		return;
+	}
+	switch (addr) {
+	case 0x00:		/* Block Revision Register1 (BRR1) is Readonly */
+		break;
+	case 0x40:
+	case 0x50:
+	case 0x60:
+	case 0x70:
+	case 0x80:
+	case 0x90:
+	case 0xA0:
+	case 0xB0:
+		openpic_cpu_write_internal(opp, addr, val, get_current_cpu());
+		break;
+	case 0x1000:		/* FRR */
+		break;
+	case 0x1020:		/* GCR */
+		openpic_gcr_write(opp, val);
+		break;
+	case 0x1080:		/* VIR */
+		break;
+	case 0x1090:		/* PIR */
+		for (idx = 0; idx < opp->nb_cpus; idx++) {
+			if ((val & (1 << idx)) && !(opp->pir & (1 << idx))) {
+				DPRINTF
+				    ("Raise OpenPIC RESET output for CPU %d\n",
+				     idx);
+				dst = &opp->dst[idx];
+				qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_RESET]);
+			} else if (!(val & (1 << idx))
+				   && (opp->pir & (1 << idx))) {
+				DPRINTF
+				    ("Lower OpenPIC RESET output for CPU %d\n",
+				     idx);
+				dst = &opp->dst[idx];
+				qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_RESET]);
+			}
+		}
+		opp->pir = val;
+		break;
+	case 0x10A0:		/* IPI_IVPR */
+	case 0x10B0:
+	case 0x10C0:
+	case 0x10D0:
+		{
+			int idx;
+			idx = (addr - 0x10A0) >> 4;
+			write_IRQreg_ivpr(opp, opp->irq_ipi0 + idx, val);
+		}
+		break;
+	case 0x10E0:		/* SPVE */
+		opp->spve = val & opp->vector_mask;
+		break;
+	default:
+		break;
+	}
+}
+
+static uint64_t openpic_gbl_read(void *opaque, hwaddr addr, unsigned len)
+{
+	OpenPICState *opp = opaque;
+	uint32_t retval;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	retval = 0xFFFFFFFF;
+	if (addr & 0xF) {
+		return retval;
+	}
+	switch (addr) {
+	case 0x1000:		/* FRR */
+		retval = opp->frr;
+		break;
+	case 0x1020:		/* GCR */
+		retval = opp->gcr;
+		break;
+	case 0x1080:		/* VIR */
+		retval = opp->vir;
+		break;
+	case 0x1090:		/* PIR */
+		retval = 0x00000000;
+		break;
+	case 0x00:		/* Block Revision Register1 (BRR1) */
+		retval = opp->brr1;
+		break;
+	case 0x40:
+	case 0x50:
+	case 0x60:
+	case 0x70:
+	case 0x80:
+	case 0x90:
+	case 0xA0:
+	case 0xB0:
+		retval =
+		    openpic_cpu_read_internal(opp, addr, get_current_cpu());
+		break;
+	case 0x10A0:		/* IPI_IVPR */
+	case 0x10B0:
+	case 0x10C0:
+	case 0x10D0:
+		{
+			int idx;
+			idx = (addr - 0x10A0) >> 4;
+			retval = read_IRQreg_ivpr(opp, opp->irq_ipi0 + idx);
+		}
+		break;
+	case 0x10E0:		/* SPVE */
+		retval = opp->spve;
+		break;
+	default:
+		break;
+	}
+	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+
+	return retval;
+}
+
+static void openpic_tmr_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned len)
+{
+	OpenPICState *opp = opaque;
+	int idx;
+
+	addr += 0x10f0;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+		__func__, addr, val);
+	if (addr & 0xF) {
+		return;
+	}
+
+	if (addr == 0x10f0) {
+		/* TFRR */
+		opp->tfrr = val;
+		return;
+	}
+
+	idx = (addr >> 6) & 0x3;
+	addr = addr & 0x30;
+
+	switch (addr & 0x30) {
+	case 0x00:		/* TCCR */
+		break;
+	case 0x10:		/* TBCR */
+		if ((opp->timers[idx].tccr & TCCR_TOG) != 0 &&
+		    (val & TBCR_CI) == 0 &&
+		    (opp->timers[idx].tbcr & TBCR_CI) != 0) {
+			opp->timers[idx].tccr &= ~TCCR_TOG;
+		}
+		opp->timers[idx].tbcr = val;
+		break;
+	case 0x20:		/* TVPR */
+		write_IRQreg_ivpr(opp, opp->irq_tim0 + idx, val);
+		break;
+	case 0x30:		/* TDR */
+		write_IRQreg_idr(opp, opp->irq_tim0 + idx, val);
+		break;
+	}
+}
+
+static uint64_t openpic_tmr_read(void *opaque, hwaddr addr, unsigned len)
+{
+	OpenPICState *opp = opaque;
+	uint32_t retval = -1;
+	int idx;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	if (addr & 0xF) {
+		goto out;
+	}
+	idx = (addr >> 6) & 0x3;
+	if (addr == 0x0) {
+		/* TFRR */
+		retval = opp->tfrr;
+		goto out;
+	}
+	switch (addr & 0x30) {
+	case 0x00:		/* TCCR */
+		retval = opp->timers[idx].tccr;
+		break;
+	case 0x10:		/* TBCR */
+		retval = opp->timers[idx].tbcr;
+		break;
+	case 0x20:		/* TIPV */
+		retval = read_IRQreg_ivpr(opp, opp->irq_tim0 + idx);
+		break;
+	case 0x30:		/* TIDE (TIDR) */
+		retval = read_IRQreg_idr(opp, opp->irq_tim0 + idx);
+		break;
+	}
+
+out:
+	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+
+	return retval;
+}
+
+static void openpic_src_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned len)
+{
+	OpenPICState *opp = opaque;
+	int idx;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+		__func__, addr, val);
+
+	addr = addr & 0xffff;
+	idx = addr >> 5;
+
+	switch (addr & 0x1f) {
+	case 0x00:
+		write_IRQreg_ivpr(opp, idx, val);
+		break;
+	case 0x10:
+		write_IRQreg_idr(opp, idx, val);
+		break;
+	case 0x18:
+		write_IRQreg_ilr(opp, idx, val);
+		break;
+	}
+}
+
+static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
+{
+	OpenPICState *opp = opaque;
+	uint32_t retval;
+	int idx;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	retval = 0xFFFFFFFF;
+
+	addr = addr & 0xffff;
+	idx = addr >> 5;
+
+	switch (addr & 0x1f) {
+	case 0x00:
+		retval = read_IRQreg_ivpr(opp, idx);
+		break;
+	case 0x10:
+		retval = read_IRQreg_idr(opp, idx);
+		break;
+	case 0x18:
+		retval = read_IRQreg_ilr(opp, idx);
+		break;
+	}
+
+	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	return retval;
+}
+
+static void openpic_msi_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned size)
+{
+	OpenPICState *opp = opaque;
+	int idx = opp->irq_msi;
+	int srs, ibs;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
+		__func__, addr, val);
+	if (addr & 0xF) {
+		return;
+	}
+
+	switch (addr) {
+	case MSIIR_OFFSET:
+		srs = val >> MSIIR_SRS_SHIFT;
+		idx += srs;
+		ibs = (val & MSIIR_IBS_MASK) >> MSIIR_IBS_SHIFT;
+		opp->msi[srs].msir |= 1 << ibs;
+		openpic_set_irq(opp, idx, 1);
+		break;
+	default:
+		/* most registers are read-only, thus ignored */
+		break;
+	}
+}
+
+static uint64_t openpic_msi_read(void *opaque, hwaddr addr, unsigned size)
+{
+	OpenPICState *opp = opaque;
+	uint64_t r = 0;
+	int i, srs;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	if (addr & 0xF) {
+		return -1;
+	}
+
+	srs = addr >> 4;
+
+	switch (addr) {
+	case 0x00:
+	case 0x10:
+	case 0x20:
+	case 0x30:
+	case 0x40:
+	case 0x50:
+	case 0x60:
+	case 0x70:		/* MSIRs */
+		r = opp->msi[srs].msir;
+		/* Clear on read */
+		opp->msi[srs].msir = 0;
+		openpic_set_irq(opp, opp->irq_msi + srs, 0);
+		break;
+	case 0x120:		/* MSISR */
+		for (i = 0; i < MAX_MSI; i++) {
+			r |= (opp->msi[i].msir ? 1 : 0) << i;
+		}
+		break;
+	}
+
+	return r;
+}
+
+static uint64_t openpic_summary_read(void *opaque, hwaddr addr, unsigned size)
+{
+	uint64_t r = 0;
+
+	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+
+	/* TODO: EISR/EIMR */
+
+	return r;
+}
+
+static void openpic_summary_write(void *opaque, hwaddr addr, uint64_t val,
+				  unsigned size)
+{
+	DPRINTF("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
+		__func__, addr, val);
+
+	/* TODO: EISR/EIMR */
+}
+
+static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
+				       uint32_t val, int idx)
+{
+	OpenPICState *opp = opaque;
+	IRQSource *src;
+	IRQDest *dst;
+	int s_IRQ, n_IRQ;
+
+	DPRINTF("%s: cpu %d addr %#" HWADDR_PRIx " <= 0x%08x\n", __func__, idx,
+		addr, val);
+
+	if (idx < 0) {
+		return;
+	}
+
+	if (addr & 0xF) {
+		return;
+	}
+	dst = &opp->dst[idx];
+	addr &= 0xFF0;
+	switch (addr) {
+	case 0x40:		/* IPIDR */
+	case 0x50:
+	case 0x60:
+	case 0x70:
+		idx = (addr - 0x40) >> 4;
+		/* we use IDE as mask which CPUs to deliver the IPI to still. */
+		opp->src[opp->irq_ipi0 + idx].destmask |= val;
+		openpic_set_irq(opp, opp->irq_ipi0 + idx, 1);
+		openpic_set_irq(opp, opp->irq_ipi0 + idx, 0);
+		break;
+	case 0x80:		/* CTPR */
+		dst->ctpr = val & 0x0000000F;
+
+		DPRINTF("%s: set CPU %d ctpr to %d, raised %d servicing %d\n",
+			__func__, idx, dst->ctpr, dst->raised.priority,
+			dst->servicing.priority);
+
+		if (dst->raised.priority <= dst->ctpr) {
+			DPRINTF
+			    ("%s: Lower OpenPIC INT output cpu %d due to ctpr\n",
+			     __func__, idx);
+			qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
+		} else if (dst->raised.priority > dst->servicing.priority) {
+			DPRINTF("%s: Raise OpenPIC INT output cpu %d irq %d\n",
+				__func__, idx, dst->raised.next);
+			qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_INT]);
+		}
+
+		break;
+	case 0x90:		/* WHOAMI */
+		/* Read-only register */
+		break;
+	case 0xA0:		/* IACK */
+		/* Read-only register */
+		break;
+	case 0xB0:		/* EOI */
+		DPRINTF("EOI\n");
+		s_IRQ = IRQ_get_next(opp, &dst->servicing);
+
+		if (s_IRQ < 0) {
+			DPRINTF("%s: EOI with no interrupt in service\n",
+				__func__);
+			break;
+		}
+
+		IRQ_resetbit(&dst->servicing, s_IRQ);
+		/* Set up next servicing IRQ */
+		s_IRQ = IRQ_get_next(opp, &dst->servicing);
+		/* Check queued interrupts. */
+		n_IRQ = IRQ_get_next(opp, &dst->raised);
+		src = &opp->src[n_IRQ];
+		if (n_IRQ != -1 &&
+		    (s_IRQ == -1 ||
+		     IVPR_PRIORITY(src->ivpr) > dst->servicing.priority)) {
+			DPRINTF("Raise OpenPIC INT output cpu %d irq %d\n",
+				idx, n_IRQ);
+			qemu_irq_raise(opp->dst[idx].irqs[OPENPIC_OUTPUT_INT]);
+		}
+		break;
+	default:
+		break;
+	}
+}
+
+static void openpic_cpu_write(void *opaque, hwaddr addr, uint64_t val,
+			      unsigned len)
+{
+	openpic_cpu_write_internal(opaque, addr, val, (addr & 0x1f000) >> 12);
+}
+
+static uint32_t openpic_iack(OpenPICState * opp, IRQDest * dst, int cpu)
+{
+	IRQSource *src;
+	int retval, irq;
+
+	DPRINTF("Lower OpenPIC INT output\n");
+	qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
+
+	irq = IRQ_get_next(opp, &dst->raised);
+	DPRINTF("IACK: irq=%d\n", irq);
+
+	if (irq == -1) {
+		/* No more interrupt pending */
+		return opp->spve;
+	}
+
+	src = &opp->src[irq];
+	if (!(src->ivpr & IVPR_ACTIVITY_MASK) ||
+	    !(IVPR_PRIORITY(src->ivpr) > dst->ctpr)) {
+		fprintf(stderr, "%s: bad raised IRQ %d ctpr %d ivpr 0x%08x\n",
+			__func__, irq, dst->ctpr, src->ivpr);
+		openpic_update_irq(opp, irq);
+		retval = opp->spve;
+	} else {
+		/* IRQ enter servicing state */
+		IRQ_setbit(&dst->servicing, irq);
+		retval = IVPR_VECTOR(opp, src->ivpr);
+	}
+
+	if (!src->level) {
+		/* edge-sensitive IRQ */
+		src->ivpr &= ~IVPR_ACTIVITY_MASK;
+		src->pending = 0;
+		IRQ_resetbit(&dst->raised, irq);
+	}
+
+	if ((irq >= opp->irq_ipi0) && (irq < (opp->irq_ipi0 + MAX_IPI))) {
+		src->destmask &= ~(1 << cpu);
+		if (src->destmask && !src->level) {
+			/* trigger on CPUs that didn't know about it yet */
+			openpic_set_irq(opp, irq, 1);
+			openpic_set_irq(opp, irq, 0);
+			/* if all CPUs knew about it, set active bit again */
+			src->ivpr |= IVPR_ACTIVITY_MASK;
+		}
+	}
+
+	return retval;
+}
+
+static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx)
+{
+	OpenPICState *opp = opaque;
+	IRQDest *dst;
+	uint32_t retval;
+
+	DPRINTF("%s: cpu %d addr %#" HWADDR_PRIx "\n", __func__, idx, addr);
+	retval = 0xFFFFFFFF;
+
+	if (idx < 0) {
+		return retval;
+	}
+
+	if (addr & 0xF) {
+		return retval;
+	}
+	dst = &opp->dst[idx];
+	addr &= 0xFF0;
+	switch (addr) {
+	case 0x80:		/* CTPR */
+		retval = dst->ctpr;
+		break;
+	case 0x90:		/* WHOAMI */
+		retval = idx;
+		break;
+	case 0xA0:		/* IACK */
+		retval = openpic_iack(opp, dst, idx);
+		break;
+	case 0xB0:		/* EOI */
+		retval = 0;
+		break;
+	default:
+		break;
+	}
+	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+
+	return retval;
+}
+
+static uint64_t openpic_cpu_read(void *opaque, hwaddr addr, unsigned len)
+{
+	return openpic_cpu_read_internal(opaque, addr, (addr & 0x1f000) >> 12);
+}
+
+static const MemoryRegionOps openpic_glb_ops_le = {
+	.write = openpic_gbl_write,
+	.read = openpic_gbl_read,
+	.endianness = DEVICE_LITTLE_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_glb_ops_be = {
+	.write = openpic_gbl_write,
+	.read = openpic_gbl_read,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_tmr_ops_le = {
+	.write = openpic_tmr_write,
+	.read = openpic_tmr_read,
+	.endianness = DEVICE_LITTLE_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_tmr_ops_be = {
+	.write = openpic_tmr_write,
+	.read = openpic_tmr_read,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_cpu_ops_le = {
+	.write = openpic_cpu_write,
+	.read = openpic_cpu_read,
+	.endianness = DEVICE_LITTLE_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_cpu_ops_be = {
+	.write = openpic_cpu_write,
+	.read = openpic_cpu_read,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_src_ops_le = {
+	.write = openpic_src_write,
+	.read = openpic_src_read,
+	.endianness = DEVICE_LITTLE_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_src_ops_be = {
+	.write = openpic_src_write,
+	.read = openpic_src_read,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_msi_ops_be = {
+	.read = openpic_msi_read,
+	.write = openpic_msi_write,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static const MemoryRegionOps openpic_summary_ops_be = {
+	.read = openpic_summary_read,
+	.write = openpic_summary_write,
+	.endianness = DEVICE_BIG_ENDIAN,
+	.impl = {
+		 .min_access_size = 4,
+		 .max_access_size = 4,
+		 },
+};
+
+static void openpic_save_IRQ_queue(QEMUFile * f, IRQQueue * q)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(q->queue); i++) {
+		/* Always put the lower half of a 64-bit long first, in case we
+		 * restore on a 32-bit host.  The least significant bits correspond
+		 * to lower IRQ numbers in the bitmap.
+		 */
+		qemu_put_be32(f, (uint32_t) q->queue[i]);
+#if LONG_MAX > 0x7FFFFFFF
+		qemu_put_be32(f, (uint32_t) (q->queue[i] >> 32));
+#endif
+	}
+
+	qemu_put_sbe32s(f, &q->next);
+	qemu_put_sbe32s(f, &q->priority);
+}
+
+static void openpic_save(QEMUFile * f, void *opaque)
+{
+	OpenPICState *opp = (OpenPICState *) opaque;
+	unsigned int i;
+
+	qemu_put_be32s(f, &opp->gcr);
+	qemu_put_be32s(f, &opp->vir);
+	qemu_put_be32s(f, &opp->pir);
+	qemu_put_be32s(f, &opp->spve);
+	qemu_put_be32s(f, &opp->tfrr);
+
+	qemu_put_be32s(f, &opp->nb_cpus);
+
+	for (i = 0; i < opp->nb_cpus; i++) {
+		qemu_put_sbe32s(f, &opp->dst[i].ctpr);
+		openpic_save_IRQ_queue(f, &opp->dst[i].raised);
+		openpic_save_IRQ_queue(f, &opp->dst[i].servicing);
+		qemu_put_buffer(f, (uint8_t *) & opp->dst[i].outputs_active,
+				sizeof(opp->dst[i].outputs_active));
+	}
+
+	for (i = 0; i < MAX_TMR; i++) {
+		qemu_put_be32s(f, &opp->timers[i].tccr);
+		qemu_put_be32s(f, &opp->timers[i].tbcr);
+	}
+
+	for (i = 0; i < opp->max_irq; i++) {
+		qemu_put_be32s(f, &opp->src[i].ivpr);
+		qemu_put_be32s(f, &opp->src[i].idr);
+		qemu_get_be32s(f, &opp->src[i].destmask);
+		qemu_put_sbe32s(f, &opp->src[i].last_cpu);
+		qemu_put_sbe32s(f, &opp->src[i].pending);
+	}
+}
+
+static void openpic_load_IRQ_queue(QEMUFile * f, IRQQueue * q)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(q->queue); i++) {
+		unsigned long val;
+
+		val = qemu_get_be32(f);
+#if LONG_MAX > 0x7FFFFFFF
+		val <<= 32;
+		val |= qemu_get_be32(f);
+#endif
+
+		q->queue[i] = val;
+	}
+
+	qemu_get_sbe32s(f, &q->next);
+	qemu_get_sbe32s(f, &q->priority);
+}
+
+static int openpic_load(QEMUFile * f, void *opaque, int version_id)
+{
+	OpenPICState *opp = (OpenPICState *) opaque;
+	unsigned int i;
+
+	if (version_id != 1) {
+		return -EINVAL;
+	}
+
+	qemu_get_be32s(f, &opp->gcr);
+	qemu_get_be32s(f, &opp->vir);
+	qemu_get_be32s(f, &opp->pir);
+	qemu_get_be32s(f, &opp->spve);
+	qemu_get_be32s(f, &opp->tfrr);
+
+	qemu_get_be32s(f, &opp->nb_cpus);
+
+	for (i = 0; i < opp->nb_cpus; i++) {
+		qemu_get_sbe32s(f, &opp->dst[i].ctpr);
+		openpic_load_IRQ_queue(f, &opp->dst[i].raised);
+		openpic_load_IRQ_queue(f, &opp->dst[i].servicing);
+		qemu_get_buffer(f, (uint8_t *) & opp->dst[i].outputs_active,
+				sizeof(opp->dst[i].outputs_active));
+	}
+
+	for (i = 0; i < MAX_TMR; i++) {
+		qemu_get_be32s(f, &opp->timers[i].tccr);
+		qemu_get_be32s(f, &opp->timers[i].tbcr);
+	}
+
+	for (i = 0; i < opp->max_irq; i++) {
+		uint32_t val;
+
+		val = qemu_get_be32(f);
+		write_IRQreg_idr(opp, i, val);
+		val = qemu_get_be32(f);
+		write_IRQreg_ivpr(opp, i, val);
+
+		qemu_get_be32s(f, &opp->src[i].ivpr);
+		qemu_get_be32s(f, &opp->src[i].idr);
+		qemu_get_be32s(f, &opp->src[i].destmask);
+		qemu_get_sbe32s(f, &opp->src[i].last_cpu);
+		qemu_get_sbe32s(f, &opp->src[i].pending);
+	}
+
+	return 0;
+}
+
+typedef struct MemReg {
+	const char *name;
+	MemoryRegionOps const *ops;
+	hwaddr start_addr;
+	ram_addr_t size;
+} MemReg;
+
+static void fsl_common_init(OpenPICState * opp)
+{
+	int i;
+	int virq = MAX_SRC;
+
+	opp->vid = VID_REVISION_1_2;
+	opp->vir = VIR_GENERIC;
+	opp->vector_mask = 0xFFFF;
+	opp->tfrr_reset = 0;
+	opp->ivpr_reset = IVPR_MASK_MASK;
+	opp->idr_reset = 1 << 0;
+	opp->max_irq = MAX_IRQ;
+
+	opp->irq_ipi0 = virq;
+	virq += MAX_IPI;
+	opp->irq_tim0 = virq;
+	virq += MAX_TMR;
+
+	assert(virq <= MAX_IRQ);
+
+	opp->irq_msi = 224;
+
+	msi_supported = true;
+	for (i = 0; i < opp->fsl->max_ext; i++) {
+		opp->src[i].level = false;
+	}
+
+	/* Internal interrupts, including message and MSI */
+	for (i = 16; i < MAX_SRC; i++) {
+		opp->src[i].type = IRQ_TYPE_FSLINT;
+		opp->src[i].level = true;
+	}
+
+	/* timers and IPIs */
+	for (i = MAX_SRC; i < virq; i++) {
+		opp->src[i].type = IRQ_TYPE_FSLSPECIAL;
+		opp->src[i].level = false;
+	}
+}
+
+static void map_list(OpenPICState * opp, const MemReg * list, int *count)
+{
+	while (list->name) {
+		assert(*count < ARRAY_SIZE(opp->sub_io_mem));
+
+		memory_region_init_io(&opp->sub_io_mem[*count], list->ops, opp,
+				      list->name, list->size);
+
+		memory_region_add_subregion(&opp->mem, list->start_addr,
+					    &opp->sub_io_mem[*count]);
+
+		(*count)++;
+		list++;
+	}
+}
+
+static int openpic_init(SysBusDevice * dev)
+{
+	OpenPICState *opp = FROM_SYSBUS(typeof(*opp), dev);
+	int i, j;
+	int list_count = 0;
+	static const MemReg list_le[] = {
+		{"glb", &openpic_glb_ops_le,
+		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
+		{"tmr", &openpic_tmr_ops_le,
+		 OPENPIC_TMR_REG_START, OPENPIC_TMR_REG_SIZE},
+		{"src", &openpic_src_ops_le,
+		 OPENPIC_SRC_REG_START, OPENPIC_SRC_REG_SIZE},
+		{"cpu", &openpic_cpu_ops_le,
+		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
+		{NULL}
+	};
+	static const MemReg list_be[] = {
+		{"glb", &openpic_glb_ops_be,
+		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
+		{"tmr", &openpic_tmr_ops_be,
+		 OPENPIC_TMR_REG_START, OPENPIC_TMR_REG_SIZE},
+		{"src", &openpic_src_ops_be,
+		 OPENPIC_SRC_REG_START, OPENPIC_SRC_REG_SIZE},
+		{"cpu", &openpic_cpu_ops_be,
+		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
+		{NULL}
+	};
+	static const MemReg list_fsl[] = {
+		{"msi", &openpic_msi_ops_be,
+		 OPENPIC_MSI_REG_START, OPENPIC_MSI_REG_SIZE},
+		{"summary", &openpic_summary_ops_be,
+		 OPENPIC_SUMMARY_REG_START, OPENPIC_SUMMARY_REG_SIZE},
+		{NULL}
+	};
+
+	memory_region_init(&opp->mem, "openpic", 0x40000);
+
+	switch (opp->model) {
+	case OPENPIC_MODEL_FSL_MPIC_20:
+	default:
+		opp->fsl = &fsl_mpic_20;
+		opp->brr1 = 0x00400200;
+		opp->flags |= OPENPIC_FLAG_IDR_CRIT;
+		opp->nb_irqs = 80;
+		opp->mpic_mode_mask = GCR_MODE_MIXED;
+
+		fsl_common_init(opp);
+		map_list(opp, list_be, &list_count);
+		map_list(opp, list_fsl, &list_count);
+
+		break;
+
+	case OPENPIC_MODEL_FSL_MPIC_42:
+		opp->fsl = &fsl_mpic_42;
+		opp->brr1 = 0x00400402;
+		opp->flags |= OPENPIC_FLAG_ILR;
+		opp->nb_irqs = 196;
+		opp->mpic_mode_mask = GCR_MODE_PROXY;
+
+		fsl_common_init(opp);
+		map_list(opp, list_be, &list_count);
+		map_list(opp, list_fsl, &list_count);
+
+		break;
+
+	case OPENPIC_MODEL_RAVEN:
+		opp->nb_irqs = RAVEN_MAX_EXT;
+		opp->vid = VID_REVISION_1_3;
+		opp->vir = VIR_GENERIC;
+		opp->vector_mask = 0xFF;
+		opp->tfrr_reset = 4160000;
+		opp->ivpr_reset = IVPR_MASK_MASK | IVPR_MODE_MASK;
+		opp->idr_reset = 0;
+		opp->max_irq = RAVEN_MAX_IRQ;
+		opp->irq_ipi0 = RAVEN_IPI_IRQ;
+		opp->irq_tim0 = RAVEN_TMR_IRQ;
+		opp->brr1 = -1;
+		opp->mpic_mode_mask = GCR_MODE_MIXED;
+
+		/* Only UP supported today */
+		if (opp->nb_cpus != 1) {
+			return -EINVAL;
+		}
+
+		map_list(opp, list_le, &list_count);
+		break;
+	}
+
+	for (i = 0; i < opp->nb_cpus; i++) {
+		opp->dst[i].irqs = g_new(qemu_irq, OPENPIC_OUTPUT_NB);
+		for (j = 0; j < OPENPIC_OUTPUT_NB; j++) {
+			sysbus_init_irq(dev, &opp->dst[i].irqs[j]);
+		}
+	}
+
+	register_savevm(&opp->busdev.qdev, "openpic", 0, 2,
+			openpic_save, openpic_load, opp);
+
+	sysbus_init_mmio(dev, &opp->mem);
+	qdev_init_gpio_in(&dev->qdev, openpic_set_irq, opp->max_irq);
+
+	return 0;
+}
+
+static Property openpic_properties[] = {
+	DEFINE_PROP_UINT32("model", OpenPICState, model,
+			   OPENPIC_MODEL_FSL_MPIC_20),
+	DEFINE_PROP_UINT32("nb_cpus", OpenPICState, nb_cpus, 1),
+	DEFINE_PROP_END_OF_LIST(),
+};
+
+static void openpic_class_init(ObjectClass * klass, void *data)
+{
+	DeviceClass *dc = DEVICE_CLASS(klass);
+	SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
+
+	k->init = openpic_init;
+	dc->props = openpic_properties;
+	dc->reset = openpic_reset;
+}
+
+static const TypeInfo openpic_info = {
+	.name = "openpic",
+	.parent = TYPE_SYS_BUS_DEVICE,
+	.instance_size = sizeof(OpenPICState),
+	.class_init = openpic_class_init,
+};
+
+static void openpic_register_types(void)
+{
+	type_register_static(&openpic_info);
+}
+
+type_init(openpic_register_types)
-- 
1.6.0.2



* [PATCH 11/17] kvm/ppc/mpic: remove some obviously unneeded code
  2013-04-19 14:06 ` Alexander Graf
@ 2013-04-19 14:06   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

Remove some parts of the code that are obviously QEMU or Raven specific
before fixing style issues, to reduce the style issues that need to be
fixed.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/mpic.c |  344 -----------------------------------------------
 1 files changed, 0 insertions(+), 344 deletions(-)
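
Note on the removed migration code: the comment in openpic_save_IRQ_queue()
says the lower half of each 64-bit queue word is always written first, so
that a stream can be restored on a 32-bit host. Below is a minimal sketch of
the round trip that ordering implies. The helper names are hypothetical and a
plain two-element array stands in for the QEMUFile stream; this is not code
from the patch. The removed openpic_load_IRQ_queue() shifts the first word it
reads into the upper half, which does not appear to mirror this order.

```c
#include <stdint.h>

/* Hypothetical helper: split one 64-bit bitmap word into two 32-bit
 * halves, low half first, as the comment in the removed save path
 * describes.
 */
static void split_low_first(uint64_t word, uint32_t out[2])
{
	out[0] = (uint32_t)word;		/* bits 0..31, lower IRQ numbers */
	out[1] = (uint32_t)(word >> 32);	/* bits 32..63 */
}

/* Hypothetical inverse: the first half read back is the low half. */
static uint64_t join_low_first(const uint32_t in[2])
{
	return (uint64_t)in[0] | ((uint64_t)in[1] << 32);
}
```

Joining in the same order as the split restores the original word, whichever
host width produced the stream.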

diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index 57655b9..d6d70a4 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -22,39 +22,6 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-/*
- *
- * Based on OpenPic implementations:
- * - Intel GW80314 I/O companion chip developer's manual
- * - Motorola MPC8245 & MPC8540 user manuals.
- * - Motorola MCP750 (aka Raven) programmer manual.
- * - Motorola Harrier programmer manuel
- *
- * Serial interrupts, as implemented in Raven chipset are not supported yet.
- *
- */
-#include "hw.h"
-#include "ppc/mac.h"
-#include "pci/pci.h"
-#include "openpic.h"
-#include "sysbus.h"
-#include "pci/msi.h"
-#include "qemu/bitops.h"
-#include "ppc.h"
-
-//#define DEBUG_OPENPIC
-
-#ifdef DEBUG_OPENPIC
-static const int debug_openpic = 1;
-#else
-static const int debug_openpic = 0;
-#endif
-
-#define DPRINTF(fmt, ...) do { \
-        if (debug_openpic) { \
-            printf(fmt , ## __VA_ARGS__); \
-        } \
-    } while (0)
 
 #define MAX_CPU     32
 #define MAX_SRC     256
@@ -82,21 +49,6 @@ static const int debug_openpic = 0;
 #define OPENPIC_CPU_REG_START        0x20000
 #define OPENPIC_CPU_REG_SIZE         0x100 + ((MAX_CPU - 1) * 0x1000)
 
-/* Raven */
-#define RAVEN_MAX_CPU      2
-#define RAVEN_MAX_EXT     48
-#define RAVEN_MAX_IRQ     64
-#define RAVEN_MAX_TMR      MAX_TMR
-#define RAVEN_MAX_IPI      MAX_IPI
-
-/* Interrupt definitions */
-#define RAVEN_FE_IRQ     (RAVEN_MAX_EXT)	/* Internal functional IRQ */
-#define RAVEN_ERR_IRQ    (RAVEN_MAX_EXT + 1)	/* Error IRQ */
-#define RAVEN_TMR_IRQ    (RAVEN_MAX_EXT + 2)	/* First timer IRQ */
-#define RAVEN_IPI_IRQ    (RAVEN_TMR_IRQ + RAVEN_MAX_TMR)	/* First IPI IRQ */
-/* First doorbell IRQ */
-#define RAVEN_DBL_IRQ    (RAVEN_IPI_IRQ + (RAVEN_MAX_CPU * RAVEN_MAX_IPI))
-
 typedef struct FslMpicInfo {
 	int max_ext;
 } FslMpicInfo;
@@ -138,44 +90,6 @@ static FslMpicInfo fsl_mpic_42 = {
 #define ILR_INTTGT_CINT   0x01	/* critical */
 #define ILR_INTTGT_MCP    0x02	/* machine check */
 
-/* The currently supported INTTGT values happen to be the same as QEMU's
- * openpic output codes, but don't depend on this.  The output codes
- * could change (unlikely, but...) or support could be added for
- * more INTTGT values.
- */
-static const int inttgt_output[][2] = {
-	{ILR_INTTGT_INT, OPENPIC_OUTPUT_INT},
-	{ILR_INTTGT_CINT, OPENPIC_OUTPUT_CINT},
-	{ILR_INTTGT_MCP, OPENPIC_OUTPUT_MCK},
-};
-
-static int inttgt_to_output(int inttgt)
-{
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(inttgt_output); i++) {
-		if (inttgt_output[i][0] == inttgt) {
-			return inttgt_output[i][1];
-		}
-	}
-
-	fprintf(stderr, "%s: unsupported inttgt %d\n", __func__, inttgt);
-	return OPENPIC_OUTPUT_INT;
-}
-
-static int output_to_inttgt(int output)
-{
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(inttgt_output); i++) {
-		if (inttgt_output[i][1] == output) {
-			return inttgt_output[i][0];
-		}
-	}
-
-	abort();
-}
-
 #define MSIIR_OFFSET       0x140
 #define MSIIR_SRS_SHIFT    29
 #define MSIIR_SRS_MASK     (0x7 << MSIIR_SRS_SHIFT)
@@ -1265,228 +1179,36 @@ static uint64_t openpic_cpu_read(void *opaque, hwaddr addr, unsigned len)
 	return openpic_cpu_read_internal(opaque, addr, (addr & 0x1f000) >> 12);
 }
 
-static const MemoryRegionOps openpic_glb_ops_le = {
-	.write = openpic_gbl_write,
-	.read = openpic_gbl_read,
-	.endianness = DEVICE_LITTLE_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
-};
-
 static const MemoryRegionOps openpic_glb_ops_be = {
 	.write = openpic_gbl_write,
 	.read = openpic_gbl_read,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
-};
-
-static const MemoryRegionOps openpic_tmr_ops_le = {
-	.write = openpic_tmr_write,
-	.read = openpic_tmr_read,
-	.endianness = DEVICE_LITTLE_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_tmr_ops_be = {
 	.write = openpic_tmr_write,
 	.read = openpic_tmr_read,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
-};
-
-static const MemoryRegionOps openpic_cpu_ops_le = {
-	.write = openpic_cpu_write,
-	.read = openpic_cpu_read,
-	.endianness = DEVICE_LITTLE_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_cpu_ops_be = {
 	.write = openpic_cpu_write,
 	.read = openpic_cpu_read,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
-};
-
-static const MemoryRegionOps openpic_src_ops_le = {
-	.write = openpic_src_write,
-	.read = openpic_src_read,
-	.endianness = DEVICE_LITTLE_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_src_ops_be = {
 	.write = openpic_src_write,
 	.read = openpic_src_read,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_msi_ops_be = {
 	.read = openpic_msi_read,
 	.write = openpic_msi_write,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_summary_ops_be = {
 	.read = openpic_summary_read,
 	.write = openpic_summary_write,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
-static void openpic_save_IRQ_queue(QEMUFile * f, IRQQueue * q)
-{
-	unsigned int i;
-
-	for (i = 0; i < ARRAY_SIZE(q->queue); i++) {
-		/* Always put the lower half of a 64-bit long first, in case we
-		 * restore on a 32-bit host.  The least significant bits correspond
-		 * to lower IRQ numbers in the bitmap.
-		 */
-		qemu_put_be32(f, (uint32_t) q->queue[i]);
-#if LONG_MAX > 0x7FFFFFFF
-		qemu_put_be32(f, (uint32_t) (q->queue[i] >> 32));
-#endif
-	}
-
-	qemu_put_sbe32s(f, &q->next);
-	qemu_put_sbe32s(f, &q->priority);
-}
-
-static void openpic_save(QEMUFile * f, void *opaque)
-{
-	OpenPICState *opp = (OpenPICState *) opaque;
-	unsigned int i;
-
-	qemu_put_be32s(f, &opp->gcr);
-	qemu_put_be32s(f, &opp->vir);
-	qemu_put_be32s(f, &opp->pir);
-	qemu_put_be32s(f, &opp->spve);
-	qemu_put_be32s(f, &opp->tfrr);
-
-	qemu_put_be32s(f, &opp->nb_cpus);
-
-	for (i = 0; i < opp->nb_cpus; i++) {
-		qemu_put_sbe32s(f, &opp->dst[i].ctpr);
-		openpic_save_IRQ_queue(f, &opp->dst[i].raised);
-		openpic_save_IRQ_queue(f, &opp->dst[i].servicing);
-		qemu_put_buffer(f, (uint8_t *) & opp->dst[i].outputs_active,
-				sizeof(opp->dst[i].outputs_active));
-	}
-
-	for (i = 0; i < MAX_TMR; i++) {
-		qemu_put_be32s(f, &opp->timers[i].tccr);
-		qemu_put_be32s(f, &opp->timers[i].tbcr);
-	}
-
-	for (i = 0; i < opp->max_irq; i++) {
-		qemu_put_be32s(f, &opp->src[i].ivpr);
-		qemu_put_be32s(f, &opp->src[i].idr);
-		qemu_get_be32s(f, &opp->src[i].destmask);
-		qemu_put_sbe32s(f, &opp->src[i].last_cpu);
-		qemu_put_sbe32s(f, &opp->src[i].pending);
-	}
-}
-
-static void openpic_load_IRQ_queue(QEMUFile * f, IRQQueue * q)
-{
-	unsigned int i;
-
-	for (i = 0; i < ARRAY_SIZE(q->queue); i++) {
-		unsigned long val;
-
-		val = qemu_get_be32(f);
-#if LONG_MAX > 0x7FFFFFFF
-		val <<= 32;
-		val |= qemu_get_be32(f);
-#endif
-
-		q->queue[i] = val;
-	}
-
-	qemu_get_sbe32s(f, &q->next);
-	qemu_get_sbe32s(f, &q->priority);
-}
-
-static int openpic_load(QEMUFile * f, void *opaque, int version_id)
-{
-	OpenPICState *opp = (OpenPICState *) opaque;
-	unsigned int i;
-
-	if (version_id != 1) {
-		return -EINVAL;
-	}
-
-	qemu_get_be32s(f, &opp->gcr);
-	qemu_get_be32s(f, &opp->vir);
-	qemu_get_be32s(f, &opp->pir);
-	qemu_get_be32s(f, &opp->spve);
-	qemu_get_be32s(f, &opp->tfrr);
-
-	qemu_get_be32s(f, &opp->nb_cpus);
-
-	for (i = 0; i < opp->nb_cpus; i++) {
-		qemu_get_sbe32s(f, &opp->dst[i].ctpr);
-		openpic_load_IRQ_queue(f, &opp->dst[i].raised);
-		openpic_load_IRQ_queue(f, &opp->dst[i].servicing);
-		qemu_get_buffer(f, (uint8_t *) & opp->dst[i].outputs_active,
-				sizeof(opp->dst[i].outputs_active));
-	}
-
-	for (i = 0; i < MAX_TMR; i++) {
-		qemu_get_be32s(f, &opp->timers[i].tccr);
-		qemu_get_be32s(f, &opp->timers[i].tbcr);
-	}
-
-	for (i = 0; i < opp->max_irq; i++) {
-		uint32_t val;
-
-		val = qemu_get_be32(f);
-		write_IRQreg_idr(opp, i, val);
-		val = qemu_get_be32(f);
-		write_IRQreg_ivpr(opp, i, val);
-
-		qemu_get_be32s(f, &opp->src[i].ivpr);
-		qemu_get_be32s(f, &opp->src[i].idr);
-		qemu_get_be32s(f, &opp->src[i].destmask);
-		qemu_get_sbe32s(f, &opp->src[i].last_cpu);
-		qemu_get_sbe32s(f, &opp->src[i].pending);
-	}
-
-	return 0;
-}
-
 typedef struct MemReg {
 	const char *name;
 	MemoryRegionOps const *ops;
@@ -1614,73 +1336,7 @@ static int openpic_init(SysBusDevice * dev)
 		map_list(opp, list_fsl, &list_count);
 
 		break;
-
-	case OPENPIC_MODEL_RAVEN:
-		opp->nb_irqs = RAVEN_MAX_EXT;
-		opp->vid = VID_REVISION_1_3;
-		opp->vir = VIR_GENERIC;
-		opp->vector_mask = 0xFF;
-		opp->tfrr_reset = 4160000;
-		opp->ivpr_reset = IVPR_MASK_MASK | IVPR_MODE_MASK;
-		opp->idr_reset = 0;
-		opp->max_irq = RAVEN_MAX_IRQ;
-		opp->irq_ipi0 = RAVEN_IPI_IRQ;
-		opp->irq_tim0 = RAVEN_TMR_IRQ;
-		opp->brr1 = -1;
-		opp->mpic_mode_mask = GCR_MODE_MIXED;
-
-		/* Only UP supported today */
-		if (opp->nb_cpus != 1) {
-			return -EINVAL;
-		}
-
-		map_list(opp, list_le, &list_count);
-		break;
-	}
-
-	for (i = 0; i < opp->nb_cpus; i++) {
-		opp->dst[i].irqs = g_new(qemu_irq, OPENPIC_OUTPUT_NB);
-		for (j = 0; j < OPENPIC_OUTPUT_NB; j++) {
-			sysbus_init_irq(dev, &opp->dst[i].irqs[j]);
-		}
 	}
 
-	register_savevm(&opp->busdev.qdev, "openpic", 0, 2,
-			openpic_save, openpic_load, opp);
-
-	sysbus_init_mmio(dev, &opp->mem);
-	qdev_init_gpio_in(&dev->qdev, openpic_set_irq, opp->max_irq);
-
 	return 0;
 }
-
-static Property openpic_properties[] = {
-	DEFINE_PROP_UINT32("model", OpenPICState, model,
-			   OPENPIC_MODEL_FSL_MPIC_20),
-	DEFINE_PROP_UINT32("nb_cpus", OpenPICState, nb_cpus, 1),
-	DEFINE_PROP_END_OF_LIST(),
-};
-
-static void openpic_class_init(ObjectClass * klass, void *data)
-{
-	DeviceClass *dc = DEVICE_CLASS(klass);
-	SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
-
-	k->init = openpic_init;
-	dc->props = openpic_properties;
-	dc->reset = openpic_reset;
-}
-
-static const TypeInfo openpic_info = {
-	.name = "openpic",
-	.parent = TYPE_SYS_BUS_DEVICE,
-	.instance_size = sizeof(OpenPICState),
-	.class_init = openpic_class_init,
-};
-
-static void openpic_register_types(void)
-{
-	type_register_static(&openpic_info);
-}
-
-type_init(openpic_register_types)
-- 
1.6.0.2



* [PATCH 11/17] kvm/ppc/mpic: remove some obviously unneeded code
@ 2013-04-19 14:06   ` Alexander Graf
  0 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

Remove some parts of the code that are obviously QEMU or Raven specific
before fixing style issues, to reduce the style issues that need to be
fixed.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/mpic.c |  344 -----------------------------------------------
 1 files changed, 0 insertions(+), 344 deletions(-)

diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index 57655b9..d6d70a4 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -22,39 +22,6 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-/*
- *
- * Based on OpenPic implementations:
- * - Intel GW80314 I/O companion chip developer's manual
- * - Motorola MPC8245 & MPC8540 user manuals.
- * - Motorola MCP750 (aka Raven) programmer manual.
- * - Motorola Harrier programmer manuel
- *
- * Serial interrupts, as implemented in Raven chipset are not supported yet.
- *
- */
-#include "hw.h"
-#include "ppc/mac.h"
-#include "pci/pci.h"
-#include "openpic.h"
-#include "sysbus.h"
-#include "pci/msi.h"
-#include "qemu/bitops.h"
-#include "ppc.h"
-
-//#define DEBUG_OPENPIC
-
-#ifdef DEBUG_OPENPIC
-static const int debug_openpic = 1;
-#else
-static const int debug_openpic = 0;
-#endif
-
-#define DPRINTF(fmt, ...) do { \
-        if (debug_openpic) { \
-            printf(fmt , ## __VA_ARGS__); \
-        } \
-    } while (0)
 
 #define MAX_CPU     32
 #define MAX_SRC     256
@@ -82,21 +49,6 @@ static const int debug_openpic = 0;
 #define OPENPIC_CPU_REG_START        0x20000
 #define OPENPIC_CPU_REG_SIZE         0x100 + ((MAX_CPU - 1) * 0x1000)
 
-/* Raven */
-#define RAVEN_MAX_CPU      2
-#define RAVEN_MAX_EXT     48
-#define RAVEN_MAX_IRQ     64
-#define RAVEN_MAX_TMR      MAX_TMR
-#define RAVEN_MAX_IPI      MAX_IPI
-
-/* Interrupt definitions */
-#define RAVEN_FE_IRQ     (RAVEN_MAX_EXT)	/* Internal functional IRQ */
-#define RAVEN_ERR_IRQ    (RAVEN_MAX_EXT + 1)	/* Error IRQ */
-#define RAVEN_TMR_IRQ    (RAVEN_MAX_EXT + 2)	/* First timer IRQ */
-#define RAVEN_IPI_IRQ    (RAVEN_TMR_IRQ + RAVEN_MAX_TMR)	/* First IPI IRQ */
-/* First doorbell IRQ */
-#define RAVEN_DBL_IRQ    (RAVEN_IPI_IRQ + (RAVEN_MAX_CPU * RAVEN_MAX_IPI))
-
 typedef struct FslMpicInfo {
 	int max_ext;
 } FslMpicInfo;
@@ -138,44 +90,6 @@ static FslMpicInfo fsl_mpic_42 = {
 #define ILR_INTTGT_CINT   0x01	/* critical */
 #define ILR_INTTGT_MCP    0x02	/* machine check */
 
-/* The currently supported INTTGT values happen to be the same as QEMU's
- * openpic output codes, but don't depend on this.  The output codes
- * could change (unlikely, but...) or support could be added for
- * more INTTGT values.
- */
-static const int inttgt_output[][2] = {
-	{ILR_INTTGT_INT, OPENPIC_OUTPUT_INT},
-	{ILR_INTTGT_CINT, OPENPIC_OUTPUT_CINT},
-	{ILR_INTTGT_MCP, OPENPIC_OUTPUT_MCK},
-};
-
-static int inttgt_to_output(int inttgt)
-{
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(inttgt_output); i++) {
-		if (inttgt_output[i][0] == inttgt) {
-			return inttgt_output[i][1];
-		}
-	}
-
-	fprintf(stderr, "%s: unsupported inttgt %d\n", __func__, inttgt);
-	return OPENPIC_OUTPUT_INT;
-}
-
-static int output_to_inttgt(int output)
-{
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(inttgt_output); i++) {
-		if (inttgt_output[i][1] == output) {
-			return inttgt_output[i][0];
-		}
-	}
-
-	abort();
-}
-
 #define MSIIR_OFFSET       0x140
 #define MSIIR_SRS_SHIFT    29
 #define MSIIR_SRS_MASK     (0x7 << MSIIR_SRS_SHIFT)
@@ -1265,228 +1179,36 @@ static uint64_t openpic_cpu_read(void *opaque, hwaddr addr, unsigned len)
 	return openpic_cpu_read_internal(opaque, addr, (addr & 0x1f000) >> 12);
 }
 
-static const MemoryRegionOps openpic_glb_ops_le = {
-	.write = openpic_gbl_write,
-	.read = openpic_gbl_read,
-	.endianness = DEVICE_LITTLE_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
-};
-
 static const MemoryRegionOps openpic_glb_ops_be = {
 	.write = openpic_gbl_write,
 	.read = openpic_gbl_read,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
-};
-
-static const MemoryRegionOps openpic_tmr_ops_le = {
-	.write = openpic_tmr_write,
-	.read = openpic_tmr_read,
-	.endianness = DEVICE_LITTLE_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_tmr_ops_be = {
 	.write = openpic_tmr_write,
 	.read = openpic_tmr_read,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
-};
-
-static const MemoryRegionOps openpic_cpu_ops_le = {
-	.write = openpic_cpu_write,
-	.read = openpic_cpu_read,
-	.endianness = DEVICE_LITTLE_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_cpu_ops_be = {
 	.write = openpic_cpu_write,
 	.read = openpic_cpu_read,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
-};
-
-static const MemoryRegionOps openpic_src_ops_le = {
-	.write = openpic_src_write,
-	.read = openpic_src_read,
-	.endianness = DEVICE_LITTLE_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_src_ops_be = {
 	.write = openpic_src_write,
 	.read = openpic_src_read,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_msi_ops_be = {
 	.read = openpic_msi_read,
 	.write = openpic_msi_write,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
 static const MemoryRegionOps openpic_summary_ops_be = {
 	.read = openpic_summary_read,
 	.write = openpic_summary_write,
-	.endianness = DEVICE_BIG_ENDIAN,
-	.impl = {
-		 .min_access_size = 4,
-		 .max_access_size = 4,
-		 },
 };
 
-static void openpic_save_IRQ_queue(QEMUFile * f, IRQQueue * q)
-{
-	unsigned int i;
-
-	for (i = 0; i < ARRAY_SIZE(q->queue); i++) {
-		/* Always put the lower half of a 64-bit long first, in case we
-		 * restore on a 32-bit host.  The least significant bits correspond
-		 * to lower IRQ numbers in the bitmap.
-		 */
-		qemu_put_be32(f, (uint32_t) q->queue[i]);
-#if LONG_MAX > 0x7FFFFFFF
-		qemu_put_be32(f, (uint32_t) (q->queue[i] >> 32));
-#endif
-	}
-
-	qemu_put_sbe32s(f, &q->next);
-	qemu_put_sbe32s(f, &q->priority);
-}
-
-static void openpic_save(QEMUFile * f, void *opaque)
-{
-	OpenPICState *opp = (OpenPICState *) opaque;
-	unsigned int i;
-
-	qemu_put_be32s(f, &opp->gcr);
-	qemu_put_be32s(f, &opp->vir);
-	qemu_put_be32s(f, &opp->pir);
-	qemu_put_be32s(f, &opp->spve);
-	qemu_put_be32s(f, &opp->tfrr);
-
-	qemu_put_be32s(f, &opp->nb_cpus);
-
-	for (i = 0; i < opp->nb_cpus; i++) {
-		qemu_put_sbe32s(f, &opp->dst[i].ctpr);
-		openpic_save_IRQ_queue(f, &opp->dst[i].raised);
-		openpic_save_IRQ_queue(f, &opp->dst[i].servicing);
-		qemu_put_buffer(f, (uint8_t *) & opp->dst[i].outputs_active,
-				sizeof(opp->dst[i].outputs_active));
-	}
-
-	for (i = 0; i < MAX_TMR; i++) {
-		qemu_put_be32s(f, &opp->timers[i].tccr);
-		qemu_put_be32s(f, &opp->timers[i].tbcr);
-	}
-
-	for (i = 0; i < opp->max_irq; i++) {
-		qemu_put_be32s(f, &opp->src[i].ivpr);
-		qemu_put_be32s(f, &opp->src[i].idr);
-		qemu_get_be32s(f, &opp->src[i].destmask);
-		qemu_put_sbe32s(f, &opp->src[i].last_cpu);
-		qemu_put_sbe32s(f, &opp->src[i].pending);
-	}
-}
-
-static void openpic_load_IRQ_queue(QEMUFile * f, IRQQueue * q)
-{
-	unsigned int i;
-
-	for (i = 0; i < ARRAY_SIZE(q->queue); i++) {
-		unsigned long val;
-
-		val = qemu_get_be32(f);
-#if LONG_MAX > 0x7FFFFFFF
-		val <<= 32;
-		val |= qemu_get_be32(f);
-#endif
-
-		q->queue[i] = val;
-	}
-
-	qemu_get_sbe32s(f, &q->next);
-	qemu_get_sbe32s(f, &q->priority);
-}
-
-static int openpic_load(QEMUFile * f, void *opaque, int version_id)
-{
-	OpenPICState *opp = (OpenPICState *) opaque;
-	unsigned int i;
-
-	if (version_id != 1) {
-		return -EINVAL;
-	}
-
-	qemu_get_be32s(f, &opp->gcr);
-	qemu_get_be32s(f, &opp->vir);
-	qemu_get_be32s(f, &opp->pir);
-	qemu_get_be32s(f, &opp->spve);
-	qemu_get_be32s(f, &opp->tfrr);
-
-	qemu_get_be32s(f, &opp->nb_cpus);
-
-	for (i = 0; i < opp->nb_cpus; i++) {
-		qemu_get_sbe32s(f, &opp->dst[i].ctpr);
-		openpic_load_IRQ_queue(f, &opp->dst[i].raised);
-		openpic_load_IRQ_queue(f, &opp->dst[i].servicing);
-		qemu_get_buffer(f, (uint8_t *) & opp->dst[i].outputs_active,
-				sizeof(opp->dst[i].outputs_active));
-	}
-
-	for (i = 0; i < MAX_TMR; i++) {
-		qemu_get_be32s(f, &opp->timers[i].tccr);
-		qemu_get_be32s(f, &opp->timers[i].tbcr);
-	}
-
-	for (i = 0; i < opp->max_irq; i++) {
-		uint32_t val;
-
-		val = qemu_get_be32(f);
-		write_IRQreg_idr(opp, i, val);
-		val = qemu_get_be32(f);
-		write_IRQreg_ivpr(opp, i, val);
-
-		qemu_get_be32s(f, &opp->src[i].ivpr);
-		qemu_get_be32s(f, &opp->src[i].idr);
-		qemu_get_be32s(f, &opp->src[i].destmask);
-		qemu_get_sbe32s(f, &opp->src[i].last_cpu);
-		qemu_get_sbe32s(f, &opp->src[i].pending);
-	}
-
-	return 0;
-}
-
 typedef struct MemReg {
 	const char *name;
 	MemoryRegionOps const *ops;
@@ -1614,73 +1336,7 @@ static int openpic_init(SysBusDevice * dev)
 		map_list(opp, list_fsl, &list_count);
 
 		break;
-
-	case OPENPIC_MODEL_RAVEN:
-		opp->nb_irqs = RAVEN_MAX_EXT;
-		opp->vid = VID_REVISION_1_3;
-		opp->vir = VIR_GENERIC;
-		opp->vector_mask = 0xFF;
-		opp->tfrr_reset = 4160000;
-		opp->ivpr_reset = IVPR_MASK_MASK | IVPR_MODE_MASK;
-		opp->idr_reset = 0;
-		opp->max_irq = RAVEN_MAX_IRQ;
-		opp->irq_ipi0 = RAVEN_IPI_IRQ;
-		opp->irq_tim0 = RAVEN_TMR_IRQ;
-		opp->brr1 = -1;
-		opp->mpic_mode_mask = GCR_MODE_MIXED;
-
-		/* Only UP supported today */
-		if (opp->nb_cpus != 1) {
-			return -EINVAL;
-		}
-
-		map_list(opp, list_le, &list_count);
-		break;
-	}
-
-	for (i = 0; i < opp->nb_cpus; i++) {
-		opp->dst[i].irqs = g_new(qemu_irq, OPENPIC_OUTPUT_NB);
-		for (j = 0; j < OPENPIC_OUTPUT_NB; j++) {
-			sysbus_init_irq(dev, &opp->dst[i].irqs[j]);
-		}
 	}
 
-	register_savevm(&opp->busdev.qdev, "openpic", 0, 2,
-			openpic_save, openpic_load, opp);
-
-	sysbus_init_mmio(dev, &opp->mem);
-	qdev_init_gpio_in(&dev->qdev, openpic_set_irq, opp->max_irq);
-
 	return 0;
 }
-
-static Property openpic_properties[] = {
-	DEFINE_PROP_UINT32("model", OpenPICState, model,
-			   OPENPIC_MODEL_FSL_MPIC_20),
-	DEFINE_PROP_UINT32("nb_cpus", OpenPICState, nb_cpus, 1),
-	DEFINE_PROP_END_OF_LIST(),
-};
-
-static void openpic_class_init(ObjectClass * klass, void *data)
-{
-	DeviceClass *dc = DEVICE_CLASS(klass);
-	SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
-
-	k->init = openpic_init;
-	dc->props = openpic_properties;
-	dc->reset = openpic_reset;
-}
-
-static const TypeInfo openpic_info = {
-	.name = "openpic",
-	.parent = TYPE_SYS_BUS_DEVICE,
-	.instance_size = sizeof(OpenPICState),
-	.class_init = openpic_class_init,
-};
-
-static void openpic_register_types(void)
-{
-	type_register_static(&openpic_info);
-}
-
-type_init(openpic_register_types)
-- 
1.6.0.2



* [PATCH 12/17] kvm/ppc/mpic: adapt to kernel style and environment
  2013-04-19 14:06 ` Alexander Graf
@ 2013-04-19 14:06   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

Remove braces that Linux style doesn't permit, remove space after
'*' that Lindent added, keep error/debug strings contiguous, etc.

Substitute type names, debug prints, etc.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/mpic.c |  445 ++++++++++++++++++++++-------------------------
 1 files changed, 208 insertions(+), 237 deletions(-)
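
Note on the OPENPIC_CPU_REG_SIZE change in the first hunk: adding the outer
parentheses is more than cosmetic, because an unparenthesized macro body can
bind differently once it is expanded inside a larger expression. A minimal
sketch of the effect follows. The macro names with _OLD/_NEW suffixes and the
two helper functions are hypothetical, chosen only for illustration; nothing
in the patch multiplies the size this way.

```c
#define MAX_CPU 32

/* Body as on the '-' line: no outer parentheses. */
#define CPU_REG_SIZE_OLD  0x100 + ((MAX_CPU - 1) * 0x1000)
/* Body as on the '+' line: fully parenthesized. */
#define CPU_REG_SIZE_NEW  (0x100 + ((MAX_CPU - 1) * 0x1000))

/* Hypothetical use: total size of two back-to-back register windows. */
static unsigned long two_windows_old(void)
{
	/* Expands to (2 * 0x100) + (31 * 0x1000): only 0x100 is doubled. */
	return 2 * CPU_REG_SIZE_OLD;
}

static unsigned long two_windows_new(void)
{
	/* Expands to 2 * (0x100 + 31 * 0x1000): the whole size is doubled. */
	return 2 * CPU_REG_SIZE_NEW;
}
```

With MAX_CPU at 32 the old form yields 0x1F200 and the new form 0x3E200.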

diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index d6d70a4..1df67ae 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -42,22 +42,22 @@
 #define OPENPIC_TMR_REG_SIZE         0x220
 #define OPENPIC_MSI_REG_START        0x1600
 #define OPENPIC_MSI_REG_SIZE         0x200
-#define OPENPIC_SUMMARY_REG_START   0x3800
-#define OPENPIC_SUMMARY_REG_SIZE    0x800
+#define OPENPIC_SUMMARY_REG_START    0x3800
+#define OPENPIC_SUMMARY_REG_SIZE     0x800
 #define OPENPIC_SRC_REG_START        0x10000
 #define OPENPIC_SRC_REG_SIZE         (MAX_SRC * 0x20)
 #define OPENPIC_CPU_REG_START        0x20000
-#define OPENPIC_CPU_REG_SIZE         0x100 + ((MAX_CPU - 1) * 0x1000)
+#define OPENPIC_CPU_REG_SIZE         (0x100 + ((MAX_CPU - 1) * 0x1000))
 
-typedef struct FslMpicInfo {
+struct fsl_mpic_info {
 	int max_ext;
-} FslMpicInfo;
+};
 
-static FslMpicInfo fsl_mpic_20 = {
+static struct fsl_mpic_info fsl_mpic_20 = {
 	.max_ext = 12,
 };
 
-static FslMpicInfo fsl_mpic_42 = {
+static struct fsl_mpic_info fsl_mpic_42 = {
 	.max_ext = 12,
 };
 
@@ -100,44 +100,43 @@ static int get_current_cpu(void)
 {
 	CPUState *cpu_single_cpu;
 
-	if (!cpu_single_env) {
+	if (!cpu_single_env)
 		return -1;
-	}
 
 	cpu_single_cpu = ENV_GET_CPU(cpu_single_env);
 	return cpu_single_cpu->cpu_index;
 }
 
-static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx);
-static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
+static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx);
+static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
 				       uint32_t val, int idx);
 
-typedef enum IRQType {
+enum irq_type {
 	IRQ_TYPE_NORMAL = 0,
 	IRQ_TYPE_FSLINT,	/* FSL internal interrupt -- level only */
 	IRQ_TYPE_FSLSPECIAL,	/* FSL timer/IPI interrupt, edge, no polarity */
-} IRQType;
+};
 
-typedef struct IRQQueue {
+struct irq_queue {
 	/* Round up to the nearest 64 IRQs so that the queue length
 	 * won't change when moving between 32 and 64 bit hosts.
 	 */
 	unsigned long queue[BITS_TO_LONGS((MAX_IRQ + 63) & ~63)];
 	int next;
 	int priority;
-} IRQQueue;
+};
 
-typedef struct IRQSource {
+struct irq_source {
 	uint32_t ivpr;		/* IRQ vector/priority register */
 	uint32_t idr;		/* IRQ destination register */
 	uint32_t destmask;	/* bitmap of CPU destinations */
 	int last_cpu;
 	int output;		/* IRQ level, e.g. OPENPIC_OUTPUT_INT */
 	int pending;		/* TRUE if IRQ is pending */
-	IRQType type;
+	enum irq_type type;
 	bool level:1;		/* level-triggered */
-	bool nomask:1;		/* critical interrupts ignore mask on some FSL MPICs */
-} IRQSource;
+	bool nomask:1;	/* critical interrupts ignore mask on some FSL MPICs */
+};
 
 #define IVPR_MASK_SHIFT       31
 #define IVPR_MASK_MASK        (1 << IVPR_MASK_SHIFT)
@@ -158,22 +157,19 @@ typedef struct IRQSource {
 #define IDR_EP      0x80000000	/* external pin */
 #define IDR_CI      0x40000000	/* critical interrupt */
 
-typedef struct IRQDest {
+struct irq_dest {
 	int32_t ctpr;		/* CPU current task priority */
-	IRQQueue raised;
-	IRQQueue servicing;
+	struct irq_queue raised;
+	struct irq_queue servicing;
 	qemu_irq *irqs;
 
 	/* Count of IRQ sources asserting on non-INT outputs */
 	uint32_t outputs_active[OPENPIC_OUTPUT_NB];
-} IRQDest;
-
-typedef struct OpenPICState {
-	SysBusDevice busdev;
-	MemoryRegion mem;
+};
 
+struct openpic {
 	/* Behavior control */
-	FslMpicInfo *fsl;
+	struct fsl_mpic_info *fsl;
 	uint32_t model;
 	uint32_t flags;
 	uint32_t nb_irqs;
@@ -186,9 +182,6 @@ typedef struct OpenPICState {
 	uint32_t brr1;
 	uint32_t mpic_mode_mask;
 
-	/* Sub-regions */
-	MemoryRegion sub_io_mem[6];
-
 	/* Global registers */
 	uint32_t frr;		/* Feature reporting register */
 	uint32_t gcr;		/* Global configuration register  */
@@ -196,9 +189,9 @@ typedef struct OpenPICState {
 	uint32_t spve;		/* Spurious vector register */
 	uint32_t tfrr;		/* Timer frequency reporting register */
 	/* Source registers */
-	IRQSource src[MAX_IRQ];
+	struct irq_source src[MAX_IRQ];
 	/* Local registers per output pin */
-	IRQDest dst[MAX_CPU];
+	struct irq_dest dst[MAX_CPU];
 	uint32_t nb_cpus;
 	/* Timer registers */
 	struct {
@@ -213,24 +206,24 @@ typedef struct OpenPICState {
 	uint32_t irq_ipi0;
 	uint32_t irq_tim0;
 	uint32_t irq_msi;
-} OpenPICState;
+};
 
-static inline void IRQ_setbit(IRQQueue * q, int n_IRQ)
+static inline void IRQ_setbit(struct irq_queue *q, int n_IRQ)
 {
 	set_bit(n_IRQ, q->queue);
 }
 
-static inline void IRQ_resetbit(IRQQueue * q, int n_IRQ)
+static inline void IRQ_resetbit(struct irq_queue *q, int n_IRQ)
 {
 	clear_bit(n_IRQ, q->queue);
 }
 
-static inline int IRQ_testbit(IRQQueue * q, int n_IRQ)
+static inline int IRQ_testbit(struct irq_queue *q, int n_IRQ)
 {
 	return test_bit(n_IRQ, q->queue);
 }
 
-static void IRQ_check(OpenPICState * opp, IRQQueue * q)
+static void IRQ_check(struct openpic *opp, struct irq_queue *q)
 {
 	int irq = -1;
 	int next = -1;
@@ -238,11 +231,10 @@ static void IRQ_check(OpenPICState * opp, IRQQueue * q)
 
 	for (;;) {
 		irq = find_next_bit(q->queue, opp->max_irq, irq + 1);
-		if (irq == opp->max_irq) {
+		if (irq == opp->max_irq)
 			break;
-		}
 
-		DPRINTF("IRQ_check: irq %d set ivpr_pr=%d pr=%d\n",
+		pr_debug("IRQ_check: irq %d set ivpr_pr=%d pr=%d\n",
 			irq, IVPR_PRIORITY(opp->src[irq].ivpr), priority);
 
 		if (IVPR_PRIORITY(opp->src[irq].ivpr) > priority) {
@@ -255,7 +247,7 @@ static void IRQ_check(OpenPICState * opp, IRQQueue * q)
 	q->priority = priority;
 }
 
-static int IRQ_get_next(OpenPICState * opp, IRQQueue * q)
+static int IRQ_get_next(struct openpic *opp, struct irq_queue *q)
 {
 	/* XXX: optimize */
 	IRQ_check(opp, q);
@@ -263,21 +255,21 @@ static int IRQ_get_next(OpenPICState * opp, IRQQueue * q)
 	return q->next;
 }
 
-static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
+static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 			   bool active, bool was_active)
 {
-	IRQDest *dst;
-	IRQSource *src;
+	struct irq_dest *dst;
+	struct irq_source *src;
 	int priority;
 
 	dst = &opp->dst[n_CPU];
 	src = &opp->src[n_IRQ];
 
-	DPRINTF("%s: IRQ %d active %d was %d\n",
+	pr_debug("%s: IRQ %d active %d was %d\n",
 		__func__, n_IRQ, active, was_active);
 
 	if (src->output != OPENPIC_OUTPUT_INT) {
-		DPRINTF("%s: output %d irq %d active %d was %d count %d\n",
+		pr_debug("%s: output %d irq %d active %d was %d count %d\n",
 			__func__, src->output, n_IRQ, active, was_active,
 			dst->outputs_active[src->output]);
 
@@ -286,19 +278,17 @@ static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
 		 * masking.
 		 */
 		if (active) {
-			if (!was_active
-			    && dst->outputs_active[src->output]++ == 0) {
-				DPRINTF
-				    ("%s: Raise OpenPIC output %d cpu %d irq %d\n",
-				     __func__, src->output, n_CPU, n_IRQ);
+			if (!was_active &&
+			    dst->outputs_active[src->output]++ == 0) {
+				pr_debug("%s: Raise OpenPIC output %d cpu %d irq %d\n",
+					__func__, src->output, n_CPU, n_IRQ);
 				qemu_irq_raise(dst->irqs[src->output]);
 			}
 		} else {
-			if (was_active
-			    && --dst->outputs_active[src->output] == 0) {
-				DPRINTF
-				    ("%s: Lower OpenPIC output %d cpu %d irq %d\n",
-				     __func__, src->output, n_CPU, n_IRQ);
+			if (was_active &&
+			    --dst->outputs_active[src->output] == 0) {
+				pr_debug("%s: Lower OpenPIC output %d cpu %d irq %d\n",
+					__func__, src->output, n_CPU, n_IRQ);
 				qemu_irq_lower(dst->irqs[src->output]);
 			}
 		}
@@ -311,31 +301,27 @@ static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
 	/* Even if the interrupt doesn't have enough priority,
 	 * it is still raised, in case ctpr is lowered later.
 	 */
-	if (active) {
+	if (active)
 		IRQ_setbit(&dst->raised, n_IRQ);
-	} else {
+	else
 		IRQ_resetbit(&dst->raised, n_IRQ);
-	}
 
 	IRQ_check(opp, &dst->raised);
 
 	if (active && priority <= dst->ctpr) {
-		DPRINTF
-		    ("%s: IRQ %d priority %d too low for ctpr %d on CPU %d\n",
-		     __func__, n_IRQ, priority, dst->ctpr, n_CPU);
+		pr_debug("%s: IRQ %d priority %d too low for ctpr %d on CPU %d\n",
+			__func__, n_IRQ, priority, dst->ctpr, n_CPU);
 		active = 0;
 	}
 
 	if (active) {
 		if (IRQ_get_next(opp, &dst->servicing) >= 0 &&
 		    priority <= dst->servicing.priority) {
-			DPRINTF
-			    ("%s: IRQ %d is hidden by servicing IRQ %d on CPU %d\n",
-			     __func__, n_IRQ, dst->servicing.next, n_CPU);
+			pr_debug("%s: IRQ %d is hidden by servicing IRQ %d on CPU %d\n",
+				__func__, n_IRQ, dst->servicing.next, n_CPU);
 		} else {
-			DPRINTF
-			    ("%s: Raise OpenPIC INT output cpu %d irq %d/%d\n",
-			     __func__, n_CPU, n_IRQ, dst->raised.next);
+			pr_debug("%s: Raise OpenPIC INT output cpu %d irq %d/%d\n",
+				__func__, n_CPU, n_IRQ, dst->raised.next);
 			qemu_irq_raise(opp->dst[n_CPU].
 				       irqs[OPENPIC_OUTPUT_INT]);
 		}
@@ -343,17 +329,15 @@ static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
 		IRQ_get_next(opp, &dst->servicing);
 		if (dst->raised.priority > dst->ctpr &&
 		    dst->raised.priority > dst->servicing.priority) {
-			DPRINTF
-			    ("%s: IRQ %d inactive, IRQ %d prio %d above %d/%d, CPU %d\n",
-			     __func__, n_IRQ, dst->raised.next,
-			     dst->raised.priority, dst->ctpr,
-			     dst->servicing.priority, n_CPU);
+			pr_debug("%s: IRQ %d inactive, IRQ %d prio %d above %d/%d, CPU %d\n",
+				__func__, n_IRQ, dst->raised.next,
+				dst->raised.priority, dst->ctpr,
+				dst->servicing.priority, n_CPU);
 			/* IRQ line stays asserted */
 		} else {
-			DPRINTF
-			    ("%s: IRQ %d inactive, current prio %d/%d, CPU %d\n",
-			     __func__, n_IRQ, dst->ctpr,
-			     dst->servicing.priority, n_CPU);
+			pr_debug("%s: IRQ %d inactive, current prio %d/%d, CPU %d\n",
+				__func__, n_IRQ, dst->ctpr,
+				dst->servicing.priority, n_CPU);
 			qemu_irq_lower(opp->dst[n_CPU].
 				       irqs[OPENPIC_OUTPUT_INT]);
 		}
@@ -361,9 +345,9 @@ static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
 }
 
 /* update pic state because registers for n_IRQ have changed value */
-static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
+static void openpic_update_irq(struct openpic *opp, int n_IRQ)
 {
-	IRQSource *src;
+	struct irq_source *src;
 	bool active, was_active;
 	int i;
 
@@ -372,30 +356,29 @@ static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
 
 	if ((src->ivpr & IVPR_MASK_MASK) && !src->nomask) {
 		/* Interrupt source is disabled */
-		DPRINTF("%s: IRQ %d is disabled\n", __func__, n_IRQ);
+		pr_debug("%s: IRQ %d is disabled\n", __func__, n_IRQ);
 		active = false;
 	}
 
-	was_active = ! !(src->ivpr & IVPR_ACTIVITY_MASK);
+	was_active = !!(src->ivpr & IVPR_ACTIVITY_MASK);
 
 	/*
 	 * We don't have a similar check for already-active because
 	 * ctpr may have changed and we need to withdraw the interrupt.
 	 */
 	if (!active && !was_active) {
-		DPRINTF("%s: IRQ %d is already inactive\n", __func__, n_IRQ);
+		pr_debug("%s: IRQ %d is already inactive\n", __func__, n_IRQ);
 		return;
 	}
 
-	if (active) {
+	if (active)
 		src->ivpr |= IVPR_ACTIVITY_MASK;
-	} else {
+	else
 		src->ivpr &= ~IVPR_ACTIVITY_MASK;
-	}
 
 	if (src->destmask == 0) {
 		/* No target */
-		DPRINTF("%s: IRQ %d has no target\n", __func__, n_IRQ);
+		pr_debug("%s: IRQ %d has no target\n", __func__, n_IRQ);
 		return;
 	}
 
@@ -413,9 +396,9 @@ static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
 	} else {
 		/* Distributed delivery mode */
 		for (i = src->last_cpu + 1; i != src->last_cpu; i++) {
-			if (i == opp->nb_cpus) {
+			if (i == opp->nb_cpus)
 				i = 0;
-			}
+
 			if (src->destmask & (1 << i)) {
 				IRQ_local_pipe(opp, i, n_IRQ, active,
 					       was_active);
@@ -428,16 +411,16 @@ static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
 
 static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 {
-	OpenPICState *opp = opaque;
-	IRQSource *src;
+	struct openpic *opp = opaque;
+	struct irq_source *src;
 
 	if (n_IRQ >= MAX_IRQ) {
-		fprintf(stderr, "%s: IRQ %d out of range\n", __func__, n_IRQ);
+		pr_err("%s: IRQ %d out of range\n", __func__, n_IRQ);
 		abort();
 	}
 
 	src = &opp->src[n_IRQ];
-	DPRINTF("openpic: set irq %d = %d ivpr=0x%08x\n",
+	pr_debug("openpic: set irq %d = %d ivpr=0x%08x\n",
 		n_IRQ, level, src->ivpr);
 	if (src->level) {
 		/* level-sensitive irq */
@@ -463,9 +446,9 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 	}
 }
 
-static void openpic_reset(DeviceState * d)
+static void openpic_reset(DeviceState *d)
 {
-	OpenPICState *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d));
+	struct openpic *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d));
 	int i;
 
 	opp->gcr = GCR_RESET;
@@ -485,7 +468,7 @@ static void openpic_reset(DeviceState * d)
 		switch (opp->src[i].type) {
 		case IRQ_TYPE_NORMAL:
 			opp->src[i].level =
-			    ! !(opp->ivpr_reset & IVPR_SENSE_MASK);
+			    !!(opp->ivpr_reset & IVPR_SENSE_MASK);
 			break;
 
 		case IRQ_TYPE_FSLINT:
@@ -499,9 +482,9 @@ static void openpic_reset(DeviceState * d)
 	/* Initialise IRQ destinations */
 	for (i = 0; i < MAX_CPU; i++) {
 		opp->dst[i].ctpr = 15;
-		memset(&opp->dst[i].raised, 0, sizeof(IRQQueue));
+		memset(&opp->dst[i].raised, 0, sizeof(struct irq_queue));
 		opp->dst[i].raised.next = -1;
-		memset(&opp->dst[i].servicing, 0, sizeof(IRQQueue));
+		memset(&opp->dst[i].servicing, 0, sizeof(struct irq_queue));
 		opp->dst[i].servicing.next = -1;
 	}
 	/* Initialise timers */
@@ -513,28 +496,28 @@ static void openpic_reset(DeviceState * d)
 	opp->gcr = 0;
 }
 
-static inline uint32_t read_IRQreg_idr(OpenPICState * opp, int n_IRQ)
+static inline uint32_t read_IRQreg_idr(struct openpic *opp, int n_IRQ)
 {
 	return opp->src[n_IRQ].idr;
 }
 
-static inline uint32_t read_IRQreg_ilr(OpenPICState * opp, int n_IRQ)
+static inline uint32_t read_IRQreg_ilr(struct openpic *opp, int n_IRQ)
 {
-	if (opp->flags & OPENPIC_FLAG_ILR) {
+	if (opp->flags & OPENPIC_FLAG_ILR)
 		return output_to_inttgt(opp->src[n_IRQ].output);
-	}
 
 	return 0xffffffff;
 }
 
-static inline uint32_t read_IRQreg_ivpr(OpenPICState * opp, int n_IRQ)
+static inline uint32_t read_IRQreg_ivpr(struct openpic *opp, int n_IRQ)
 {
 	return opp->src[n_IRQ].ivpr;
 }
 
-static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
+static inline void write_IRQreg_idr(struct openpic *opp, int n_IRQ,
+				    uint32_t val)
 {
-	IRQSource *src = &opp->src[n_IRQ];
+	struct irq_source *src = &opp->src[n_IRQ];
 	uint32_t normal_mask = (1UL << opp->nb_cpus) - 1;
 	uint32_t crit_mask = 0;
 	uint32_t mask = normal_mask;
@@ -547,14 +530,13 @@ static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
 	}
 
 	src->idr = val & mask;
-	DPRINTF("Set IDR %d to 0x%08x\n", n_IRQ, src->idr);
+	pr_debug("Set IDR %d to 0x%08x\n", n_IRQ, src->idr);
 
 	if (opp->flags & OPENPIC_FLAG_IDR_CRIT) {
 		if (src->idr & crit_mask) {
 			if (src->idr & normal_mask) {
-				DPRINTF
-				    ("%s: IRQ configured for multiple output types, using "
-				     "critical\n", __func__);
+				pr_debug("%s: IRQ configured for multiple output types, using critical\n",
+					__func__);
 			}
 
 			src->output = OPENPIC_OUTPUT_CINT;
@@ -564,9 +546,8 @@ static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
 			for (i = 0; i < opp->nb_cpus; i++) {
 				int n_ci = IDR_CI0_SHIFT - i;
 
-				if (src->idr & (1UL << n_ci)) {
+				if (src->idr & (1UL << n_ci))
 					src->destmask |= 1UL << i;
-				}
 			}
 		} else {
 			src->output = OPENPIC_OUTPUT_INT;
@@ -578,20 +559,21 @@ static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
 	}
 }
 
-static inline void write_IRQreg_ilr(OpenPICState * opp, int n_IRQ, uint32_t val)
+static inline void write_IRQreg_ilr(struct openpic *opp, int n_IRQ,
+				    uint32_t val)
 {
 	if (opp->flags & OPENPIC_FLAG_ILR) {
-		IRQSource *src = &opp->src[n_IRQ];
+		struct irq_source *src = &opp->src[n_IRQ];
 
 		src->output = inttgt_to_output(val & ILR_INTTGT_MASK);
-		DPRINTF("Set ILR %d to 0x%08x, output %d\n", n_IRQ, src->idr,
+		pr_debug("Set ILR %d to 0x%08x, output %d\n", n_IRQ, src->idr,
 			src->output);
 
 		/* TODO: on MPIC v4.0 only, set nomask for non-INT */
 	}
 }
 
-static inline void write_IRQreg_ivpr(OpenPICState * opp, int n_IRQ,
+static inline void write_IRQreg_ivpr(struct openpic *opp, int n_IRQ,
 				     uint32_t val)
 {
 	uint32_t mask;
@@ -613,7 +595,7 @@ static inline void write_IRQreg_ivpr(OpenPICState * opp, int n_IRQ,
 	switch (opp->src[n_IRQ].type) {
 	case IRQ_TYPE_NORMAL:
 		opp->src[n_IRQ].level =
-		    ! !(opp->src[n_IRQ].ivpr & IVPR_SENSE_MASK);
+		    !!(opp->src[n_IRQ].ivpr & IVPR_SENSE_MASK);
 		break;
 
 	case IRQ_TYPE_FSLINT:
@@ -626,11 +608,11 @@ static inline void write_IRQreg_ivpr(OpenPICState * opp, int n_IRQ,
 	}
 
 	openpic_update_irq(opp, n_IRQ);
-	DPRINTF("Set IVPR %d to 0x%08x -> 0x%08x\n", n_IRQ, val,
+	pr_debug("Set IVPR %d to 0x%08x -> 0x%08x\n", n_IRQ, val,
 		opp->src[n_IRQ].ivpr);
 }
 
-static void openpic_gcr_write(OpenPICState * opp, uint64_t val)
+static void openpic_gcr_write(struct openpic *opp, uint64_t val)
 {
 	bool mpic_proxy = false;
 
@@ -643,27 +625,26 @@ static void openpic_gcr_write(OpenPICState * opp, uint64_t val)
 	opp->gcr |= val & opp->mpic_mode_mask;
 
 	/* Set external proxy mode */
-	if ((val & opp->mpic_mode_mask) == GCR_MODE_PROXY) {
+	if ((val & opp->mpic_mode_mask) == GCR_MODE_PROXY)
 		mpic_proxy = true;
-	}
 
 	ppce500_set_mpic_proxy(mpic_proxy);
 }
 
-static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned len)
 {
-	OpenPICState *opp = opaque;
-	IRQDest *dst;
+	struct openpic *opp = opaque;
+	struct irq_dest *dst;
 	int idx;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
 		__func__, addr, val);
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return;
-	}
+
 	switch (addr) {
-	case 0x00:		/* Block Revision Register1 (BRR1) is Readonly */
+	case 0x00:	/* Block Revision Register1 (BRR1) is Readonly */
 		break;
 	case 0x40:
 	case 0x50:
@@ -685,16 +666,14 @@ static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
 	case 0x1090:		/* PIR */
 		for (idx = 0; idx < opp->nb_cpus; idx++) {
 			if ((val & (1 << idx)) && !(opp->pir & (1 << idx))) {
-				DPRINTF
-				    ("Raise OpenPIC RESET output for CPU %d\n",
-				     idx);
+				pr_debug("Raise OpenPIC RESET output for CPU %d\n",
+					idx);
 				dst = &opp->dst[idx];
 				qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_RESET]);
-			} else if (!(val & (1 << idx))
-				   && (opp->pir & (1 << idx))) {
-				DPRINTF
-				    ("Lower OpenPIC RESET output for CPU %d\n",
-				     idx);
+			} else if (!(val & (1 << idx)) &&
+				   (opp->pir & (1 << idx))) {
+				pr_debug("Lower OpenPIC RESET output for CPU %d\n",
+					idx);
 				dst = &opp->dst[idx];
 				qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_RESET]);
 			}
@@ -704,13 +683,12 @@ static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
 	case 0x10A0:		/* IPI_IVPR */
 	case 0x10B0:
 	case 0x10C0:
-	case 0x10D0:
-		{
-			int idx;
-			idx = (addr - 0x10A0) >> 4;
-			write_IRQreg_ivpr(opp, opp->irq_ipi0 + idx, val);
-		}
+	case 0x10D0: {
+		int idx;
+		idx = (addr - 0x10A0) >> 4;
+		write_IRQreg_ivpr(opp, opp->irq_ipi0 + idx, val);
 		break;
+	}
 	case 0x10E0:		/* SPVE */
 		opp->spve = val & opp->vector_mask;
 		break;
@@ -719,16 +697,16 @@ static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
 	}
 }
 
-static uint64_t openpic_gbl_read(void *opaque, hwaddr addr, unsigned len)
+static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	uint32_t retval;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
 	retval = 0xFFFFFFFF;
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return retval;
-	}
+
 	switch (addr) {
 	case 0x1000:		/* FRR */
 		retval = opp->frr;
@@ -772,24 +750,23 @@ static uint64_t openpic_gbl_read(void *opaque, hwaddr addr, unsigned len)
 	default:
 		break;
 	}
-	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
 	return retval;
 }
 
-static void openpic_tmr_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_tmr_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	int idx;
 
 	addr += 0x10f0;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
 		__func__, addr, val);
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return;
-	}
 
 	if (addr == 0x10f0) {
 		/* TFRR */
@@ -806,9 +783,9 @@ static void openpic_tmr_write(void *opaque, hwaddr addr, uint64_t val,
 	case 0x10:		/* TBCR */
 		if ((opp->timers[idx].tccr & TCCR_TOG) != 0 &&
 		    (val & TBCR_CI) == 0 &&
-		    (opp->timers[idx].tbcr & TBCR_CI) != 0) {
+		    (opp->timers[idx].tbcr & TBCR_CI) != 0)
 			opp->timers[idx].tccr &= ~TCCR_TOG;
-		}
+
 		opp->timers[idx].tbcr = val;
 		break;
 	case 0x20:		/* TVPR */
@@ -820,16 +797,16 @@ static void openpic_tmr_write(void *opaque, hwaddr addr, uint64_t val,
 	}
 }
 
-static uint64_t openpic_tmr_read(void *opaque, hwaddr addr, unsigned len)
+static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	uint32_t retval = -1;
 	int idx;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
-	if (addr & 0xF) {
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	if (addr & 0xF)
 		goto out;
-	}
+
 	idx = (addr >> 6) & 0x3;
 	if (addr == 0x0) {
 		/* TFRR */
@@ -852,18 +829,18 @@ static uint64_t openpic_tmr_read(void *opaque, hwaddr addr, unsigned len)
 	}
 
 out:
-	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
 	return retval;
 }
 
-static void openpic_src_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_src_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	int idx;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
 		__func__, addr, val);
 
 	addr = addr & 0xffff;
@@ -884,11 +861,11 @@ static void openpic_src_write(void *opaque, hwaddr addr, uint64_t val,
 
 static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	uint32_t retval;
 	int idx;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
 	retval = 0xFFFFFFFF;
 
 	addr = addr & 0xffff;
@@ -906,22 +883,21 @@ static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
 		break;
 	}
 
-	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
 	return retval;
 }
 
-static void openpic_msi_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_msi_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned size)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	int idx = opp->irq_msi;
 	int srs, ibs;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
 		__func__, addr, val);
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return;
-	}
 
 	switch (addr) {
 	case MSIIR_OFFSET:
@@ -937,16 +913,15 @@ static void openpic_msi_write(void *opaque, hwaddr addr, uint64_t val,
 	}
 }
 
-static uint64_t openpic_msi_read(void *opaque, hwaddr addr, unsigned size)
+static uint64_t openpic_msi_read(void *opaque, gpa_t addr, unsigned size)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	uint64_t r = 0;
 	int i, srs;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
-	if (addr & 0xF) {
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	if (addr & 0xF)
 		return -1;
-	}
 
 	srs = addr >> 4;
 
@@ -965,53 +940,51 @@ static uint64_t openpic_msi_read(void *opaque, hwaddr addr, unsigned size)
 		openpic_set_irq(opp, opp->irq_msi + srs, 0);
 		break;
 	case 0x120:		/* MSISR */
-		for (i = 0; i < MAX_MSI; i++) {
+		for (i = 0; i < MAX_MSI; i++)
 			r |= (opp->msi[i].msir ? 1 : 0) << i;
-		}
 		break;
 	}
 
 	return r;
 }
 
-static uint64_t openpic_summary_read(void *opaque, hwaddr addr, unsigned size)
+static uint64_t openpic_summary_read(void *opaque, gpa_t addr, unsigned size)
 {
 	uint64_t r = 0;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
 
 	/* TODO: EISR/EIMR */
 
 	return r;
 }
 
-static void openpic_summary_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_summary_write(void *opaque, gpa_t addr, uint64_t val,
 				  unsigned size)
 {
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
 		__func__, addr, val);
 
 	/* TODO: EISR/EIMR */
 }
 
-static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
+static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
 				       uint32_t val, int idx)
 {
-	OpenPICState *opp = opaque;
-	IRQSource *src;
-	IRQDest *dst;
+	struct openpic *opp = opaque;
+	struct irq_source *src;
+	struct irq_dest *dst;
 	int s_IRQ, n_IRQ;
 
-	DPRINTF("%s: cpu %d addr %#" HWADDR_PRIx " <= 0x%08x\n", __func__, idx,
+	pr_debug("%s: cpu %d addr %#" HWADDR_PRIx " <= 0x%08x\n", __func__, idx,
 		addr, val);
 
-	if (idx < 0) {
+	if (idx < 0)
 		return;
-	}
 
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return;
-	}
+
 	dst = &opp->dst[idx];
 	addr &= 0xFF0;
 	switch (addr) {
@@ -1028,17 +1001,16 @@ static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
 	case 0x80:		/* CTPR */
 		dst->ctpr = val & 0x0000000F;
 
-		DPRINTF("%s: set CPU %d ctpr to %d, raised %d servicing %d\n",
+		pr_debug("%s: set CPU %d ctpr to %d, raised %d servicing %d\n",
 			__func__, idx, dst->ctpr, dst->raised.priority,
 			dst->servicing.priority);
 
 		if (dst->raised.priority <= dst->ctpr) {
-			DPRINTF
-			    ("%s: Lower OpenPIC INT output cpu %d due to ctpr\n",
-			     __func__, idx);
+			pr_debug("%s: Lower OpenPIC INT output cpu %d due to ctpr\n",
+				__func__, idx);
 			qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
 		} else if (dst->raised.priority > dst->servicing.priority) {
-			DPRINTF("%s: Raise OpenPIC INT output cpu %d irq %d\n",
+			pr_debug("%s: Raise OpenPIC INT output cpu %d irq %d\n",
 				__func__, idx, dst->raised.next);
 			qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_INT]);
 		}
@@ -1051,11 +1023,11 @@ static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
 		/* Read-only register */
 		break;
 	case 0xB0:		/* EOI */
-		DPRINTF("EOI\n");
+		pr_debug("EOI\n");
 		s_IRQ = IRQ_get_next(opp, &dst->servicing);
 
 		if (s_IRQ < 0) {
-			DPRINTF("%s: EOI with no interrupt in service\n",
+			pr_debug("%s: EOI with no interrupt in service\n",
 				__func__);
 			break;
 		}
@@ -1069,7 +1041,7 @@ static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
 		if (n_IRQ != -1 &&
 		    (s_IRQ == -1 ||
 		     IVPR_PRIORITY(src->ivpr) > dst->servicing.priority)) {
-			DPRINTF("Raise OpenPIC INT output cpu %d irq %d\n",
+			pr_debug("Raise OpenPIC INT output cpu %d irq %d\n",
 				idx, n_IRQ);
 			qemu_irq_raise(opp->dst[idx].irqs[OPENPIC_OUTPUT_INT]);
 		}
@@ -1079,32 +1051,32 @@ static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
 	}
 }
 
-static void openpic_cpu_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_cpu_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned len)
 {
 	openpic_cpu_write_internal(opaque, addr, val, (addr & 0x1f000) >> 12);
 }
 
-static uint32_t openpic_iack(OpenPICState * opp, IRQDest * dst, int cpu)
+static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
+			     int cpu)
 {
-	IRQSource *src;
+	struct irq_source *src;
 	int retval, irq;
 
-	DPRINTF("Lower OpenPIC INT output\n");
+	pr_debug("Lower OpenPIC INT output\n");
 	qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
 
 	irq = IRQ_get_next(opp, &dst->raised);
-	DPRINTF("IACK: irq=%d\n", irq);
+	pr_debug("IACK: irq=%d\n", irq);
 
-	if (irq == -1) {
+	if (irq == -1)
 		/* No more interrupt pending */
 		return opp->spve;
-	}
 
 	src = &opp->src[irq];
 	if (!(src->ivpr & IVPR_ACTIVITY_MASK) ||
 	    !(IVPR_PRIORITY(src->ivpr) > dst->ctpr)) {
-		fprintf(stderr, "%s: bad raised IRQ %d ctpr %d ivpr 0x%08x\n",
+		pr_err("%s: bad raised IRQ %d ctpr %d ivpr 0x%08x\n",
 			__func__, irq, dst->ctpr, src->ivpr);
 		openpic_update_irq(opp, irq);
 		retval = opp->spve;
@@ -1135,22 +1107,21 @@ static uint32_t openpic_iack(OpenPICState * opp, IRQDest * dst, int cpu)
 	return retval;
 }
 
-static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx)
+static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx)
 {
-	OpenPICState *opp = opaque;
-	IRQDest *dst;
+	struct openpic *opp = opaque;
+	struct irq_dest *dst;
 	uint32_t retval;
 
-	DPRINTF("%s: cpu %d addr %#" HWADDR_PRIx "\n", __func__, idx, addr);
+	pr_debug("%s: cpu %d addr %#" HWADDR_PRIx "\n", __func__, idx, addr);
 	retval = 0xFFFFFFFF;
 
-	if (idx < 0) {
+	if (idx < 0)
 		return retval;
-	}
 
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return retval;
-	}
+
 	dst = &opp->dst[idx];
 	addr &= 0xFF0;
 	switch (addr) {
@@ -1169,54 +1140,54 @@ static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx)
 	default:
 		break;
 	}
-	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
 	return retval;
 }
 
-static uint64_t openpic_cpu_read(void *opaque, hwaddr addr, unsigned len)
+static uint64_t openpic_cpu_read(void *opaque, gpa_t addr, unsigned len)
 {
 	return openpic_cpu_read_internal(opaque, addr, (addr & 0x1f000) >> 12);
 }
 
-static const MemoryRegionOps openpic_glb_ops_be = {
+static const struct kvm_io_device_ops openpic_glb_ops_be = {
 	.write = openpic_gbl_write,
 	.read = openpic_gbl_read,
 };
 
-static const MemoryRegionOps openpic_tmr_ops_be = {
+static const struct kvm_io_device_ops openpic_tmr_ops_be = {
 	.write = openpic_tmr_write,
 	.read = openpic_tmr_read,
 };
 
-static const MemoryRegionOps openpic_cpu_ops_be = {
+static const struct kvm_io_device_ops openpic_cpu_ops_be = {
 	.write = openpic_cpu_write,
 	.read = openpic_cpu_read,
 };
 
-static const MemoryRegionOps openpic_src_ops_be = {
+static const struct kvm_io_device_ops openpic_src_ops_be = {
 	.write = openpic_src_write,
 	.read = openpic_src_read,
 };
 
-static const MemoryRegionOps openpic_msi_ops_be = {
+static const struct kvm_io_device_ops openpic_msi_ops_be = {
 	.read = openpic_msi_read,
 	.write = openpic_msi_write,
 };
 
-static const MemoryRegionOps openpic_summary_ops_be = {
+static const struct kvm_io_device_ops openpic_summary_ops_be = {
 	.read = openpic_summary_read,
 	.write = openpic_summary_write,
 };
 
-typedef struct MemReg {
+struct mem_reg {
 	const char *name;
-	MemoryRegionOps const *ops;
-	hwaddr start_addr;
-	ram_addr_t size;
-} MemReg;
+	const struct kvm_io_device_ops *ops;
+	gpa_t start_addr;
+	int size;
+};
 
-static void fsl_common_init(OpenPICState * opp)
+static void fsl_common_init(struct openpic *opp)
 {
 	int i;
 	int virq = MAX_SRC;
@@ -1239,9 +1210,8 @@ static void fsl_common_init(OpenPICState * opp)
 	opp->irq_msi = 224;
 
 	msi_supported = true;
-	for (i = 0; i < opp->fsl->max_ext; i++) {
+	for (i = 0; i < opp->fsl->max_ext; i++)
 		opp->src[i].level = false;
-	}
 
 	/* Internal interrupts, including message and MSI */
 	for (i = 16; i < MAX_SRC; i++) {
@@ -1256,7 +1226,8 @@ static void fsl_common_init(OpenPICState * opp)
 	}
 }
 
-static void map_list(OpenPICState * opp, const MemReg * list, int *count)
+static void map_list(struct openpic *opp, const struct mem_reg *list,
+		     int *count)
 {
 	while (list->name) {
 		assert(*count < ARRAY_SIZE(opp->sub_io_mem));
@@ -1272,12 +1243,12 @@ static void map_list(OpenPICState * opp, const MemReg * list, int *count)
 	}
 }
 
-static int openpic_init(SysBusDevice * dev)
+static int openpic_init(SysBusDevice *dev)
 {
-	OpenPICState *opp = FROM_SYSBUS(typeof(*opp), dev);
+	struct openpic *opp = FROM_SYSBUS(typeof(*opp), dev);
 	int i, j;
 	int list_count = 0;
-	static const MemReg list_le[] = {
+	static const struct mem_reg list_le[] = {
 		{"glb", &openpic_glb_ops_le,
 		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
 		{"tmr", &openpic_tmr_ops_le,
@@ -1288,7 +1259,7 @@ static int openpic_init(SysBusDevice * dev)
 		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
 		{NULL}
 	};
-	static const MemReg list_be[] = {
+	static const struct mem_reg list_be[] = {
 		{"glb", &openpic_glb_ops_be,
 		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
 		{"tmr", &openpic_tmr_ops_be,
@@ -1299,7 +1270,7 @@ static int openpic_init(SysBusDevice * dev)
 		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
 		{NULL}
 	};
-	static const MemReg list_fsl[] = {
+	static const struct mem_reg list_fsl[] = {
 		{"msi", &openpic_msi_ops_be,
 		 OPENPIC_MSI_REG_START, OPENPIC_MSI_REG_SIZE},
 		{"summary", &openpic_summary_ops_be,
-- 
1.6.0.2



* [PATCH 12/17] kvm/ppc/mpic: adapt to kernel style and environment
@ 2013-04-19 14:06   ` Alexander Graf
From: Alexander Graf @ 2013-04-19 14:06 UTC
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

Remove braces that Linux style doesn't permit, remove space after
'*' that Lindent added, keep error/debug strings contiguous, etc.

Substitute type names, debug prints, etc.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
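Note (not part of the commit message): the OPENPIC_CPU_REG_SIZE hunk in
the diff below is more than whitespace cleanup. Without the outer
parentheses the macro misbehaves as soon as it appears inside a larger
expression. A minimal sketch of the difference, assuming MAX_CPU is 32
(the value is not shown in this diff) and using hypothetical helper
names that do not exist in mpic.c:

```c
/* Assumed value for this sketch only; the real definition is elsewhere. */
#define MAX_CPU 32

/* Unparenthesized form, as on the '-' line of the first hunk. */
#define CPU_REG_SIZE_OLD	0x100 + ((MAX_CPU - 1) * 0x1000)
/* Parenthesized form, as on the '+' line. */
#define CPU_REG_SIZE_NEW	(0x100 + ((MAX_CPU - 1) * 0x1000))

/* Hypothetical helper: intended to be twice the CPU register window. */
static unsigned long two_windows_old(void)
{
	/* Expands to (2 * 0x100) + ((MAX_CPU - 1) * 0x1000). */
	return 2 * CPU_REG_SIZE_OLD;
}

static unsigned long two_windows_new(void)
{
	/* Expands to 2 * (0x100 + ((MAX_CPU - 1) * 0x1000)). */
	return 2 * CPU_REG_SIZE_NEW;
}
```

Under that assumption two_windows_old() yields 0x1f200 while
two_windows_new() yields 0x3e200. Uses that only pass the macro as a
bare initializer, like the mem_reg tables, would not see the difference.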
 arch/powerpc/kvm/mpic.c |  445 ++++++++++++++++++++++-------------------------
 1 files changed, 208 insertions(+), 237 deletions(-)

diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index d6d70a4..1df67ae 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -42,22 +42,22 @@
 #define OPENPIC_TMR_REG_SIZE         0x220
 #define OPENPIC_MSI_REG_START        0x1600
 #define OPENPIC_MSI_REG_SIZE         0x200
-#define OPENPIC_SUMMARY_REG_START   0x3800
-#define OPENPIC_SUMMARY_REG_SIZE    0x800
+#define OPENPIC_SUMMARY_REG_START    0x3800
+#define OPENPIC_SUMMARY_REG_SIZE     0x800
 #define OPENPIC_SRC_REG_START        0x10000
 #define OPENPIC_SRC_REG_SIZE         (MAX_SRC * 0x20)
 #define OPENPIC_CPU_REG_START        0x20000
-#define OPENPIC_CPU_REG_SIZE         0x100 + ((MAX_CPU - 1) * 0x1000)
+#define OPENPIC_CPU_REG_SIZE         (0x100 + ((MAX_CPU - 1) * 0x1000))
 
-typedef struct FslMpicInfo {
+struct fsl_mpic_info {
 	int max_ext;
-} FslMpicInfo;
+};
 
-static FslMpicInfo fsl_mpic_20 = {
+static struct fsl_mpic_info fsl_mpic_20 = {
 	.max_ext = 12,
 };
 
-static FslMpicInfo fsl_mpic_42 = {
+static struct fsl_mpic_info fsl_mpic_42 = {
 	.max_ext = 12,
 };
 
@@ -100,44 +100,43 @@ static int get_current_cpu(void)
 {
 	CPUState *cpu_single_cpu;
 
-	if (!cpu_single_env) {
+	if (!cpu_single_env)
 		return -1;
-	}
 
 	cpu_single_cpu = ENV_GET_CPU(cpu_single_env);
 	return cpu_single_cpu->cpu_index;
 }
 
-static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx);
-static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
+static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx);
+static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
 				       uint32_t val, int idx);
 
-typedef enum IRQType {
+enum irq_type {
 	IRQ_TYPE_NORMAL = 0,
 	IRQ_TYPE_FSLINT,	/* FSL internal interrupt -- level only */
 	IRQ_TYPE_FSLSPECIAL,	/* FSL timer/IPI interrupt, edge, no polarity */
-} IRQType;
+};
 
-typedef struct IRQQueue {
+struct irq_queue {
 	/* Round up to the nearest 64 IRQs so that the queue length
 	 * won't change when moving between 32 and 64 bit hosts.
 	 */
 	unsigned long queue[BITS_TO_LONGS((MAX_IRQ + 63) & ~63)];
 	int next;
 	int priority;
-} IRQQueue;
+};
 
-typedef struct IRQSource {
+struct irq_source {
 	uint32_t ivpr;		/* IRQ vector/priority register */
 	uint32_t idr;		/* IRQ destination register */
 	uint32_t destmask;	/* bitmap of CPU destinations */
 	int last_cpu;
 	int output;		/* IRQ level, e.g. OPENPIC_OUTPUT_INT */
 	int pending;		/* TRUE if IRQ is pending */
-	IRQType type;
+	enum irq_type type;
 	bool level:1;		/* level-triggered */
-	bool nomask:1;		/* critical interrupts ignore mask on some FSL MPICs */
-} IRQSource;
+	bool nomask:1;	/* critical interrupts ignore mask on some FSL MPICs */
+};
 
 #define IVPR_MASK_SHIFT       31
 #define IVPR_MASK_MASK        (1 << IVPR_MASK_SHIFT)
@@ -158,22 +157,19 @@ typedef struct IRQSource {
 #define IDR_EP      0x80000000	/* external pin */
 #define IDR_CI      0x40000000	/* critical interrupt */
 
-typedef struct IRQDest {
+struct irq_dest {
 	int32_t ctpr;		/* CPU current task priority */
-	IRQQueue raised;
-	IRQQueue servicing;
+	struct irq_queue raised;
+	struct irq_queue servicing;
 	qemu_irq *irqs;
 
 	/* Count of IRQ sources asserting on non-INT outputs */
 	uint32_t outputs_active[OPENPIC_OUTPUT_NB];
-} IRQDest;
-
-typedef struct OpenPICState {
-	SysBusDevice busdev;
-	MemoryRegion mem;
+};
 
+struct openpic {
 	/* Behavior control */
-	FslMpicInfo *fsl;
+	struct fsl_mpic_info *fsl;
 	uint32_t model;
 	uint32_t flags;
 	uint32_t nb_irqs;
@@ -186,9 +182,6 @@ typedef struct OpenPICState {
 	uint32_t brr1;
 	uint32_t mpic_mode_mask;
 
-	/* Sub-regions */
-	MemoryRegion sub_io_mem[6];
-
 	/* Global registers */
 	uint32_t frr;		/* Feature reporting register */
 	uint32_t gcr;		/* Global configuration register  */
@@ -196,9 +189,9 @@ typedef struct OpenPICState {
 	uint32_t spve;		/* Spurious vector register */
 	uint32_t tfrr;		/* Timer frequency reporting register */
 	/* Source registers */
-	IRQSource src[MAX_IRQ];
+	struct irq_source src[MAX_IRQ];
 	/* Local registers per output pin */
-	IRQDest dst[MAX_CPU];
+	struct irq_dest dst[MAX_CPU];
 	uint32_t nb_cpus;
 	/* Timer registers */
 	struct {
@@ -213,24 +206,24 @@ typedef struct OpenPICState {
 	uint32_t irq_ipi0;
 	uint32_t irq_tim0;
 	uint32_t irq_msi;
-} OpenPICState;
+};
 
-static inline void IRQ_setbit(IRQQueue * q, int n_IRQ)
+static inline void IRQ_setbit(struct irq_queue *q, int n_IRQ)
 {
 	set_bit(n_IRQ, q->queue);
 }
 
-static inline void IRQ_resetbit(IRQQueue * q, int n_IRQ)
+static inline void IRQ_resetbit(struct irq_queue *q, int n_IRQ)
 {
 	clear_bit(n_IRQ, q->queue);
 }
 
-static inline int IRQ_testbit(IRQQueue * q, int n_IRQ)
+static inline int IRQ_testbit(struct irq_queue *q, int n_IRQ)
 {
 	return test_bit(n_IRQ, q->queue);
 }
 
-static void IRQ_check(OpenPICState * opp, IRQQueue * q)
+static void IRQ_check(struct openpic *opp, struct irq_queue *q)
 {
 	int irq = -1;
 	int next = -1;
@@ -238,11 +231,10 @@ static void IRQ_check(OpenPICState * opp, IRQQueue * q)
 
 	for (;;) {
 		irq = find_next_bit(q->queue, opp->max_irq, irq + 1);
-		if (irq == opp->max_irq) {
+		if (irq == opp->max_irq)
 			break;
-		}
 
-		DPRINTF("IRQ_check: irq %d set ivpr_pr=%d pr=%d\n",
+		pr_debug("IRQ_check: irq %d set ivpr_pr=%d pr=%d\n",
 			irq, IVPR_PRIORITY(opp->src[irq].ivpr), priority);
 
 		if (IVPR_PRIORITY(opp->src[irq].ivpr) > priority) {
@@ -255,7 +247,7 @@ static void IRQ_check(OpenPICState * opp, IRQQueue * q)
 	q->priority = priority;
 }
 
-static int IRQ_get_next(OpenPICState * opp, IRQQueue * q)
+static int IRQ_get_next(struct openpic *opp, struct irq_queue *q)
 {
 	/* XXX: optimize */
 	IRQ_check(opp, q);
@@ -263,21 +255,21 @@ static int IRQ_get_next(OpenPICState * opp, IRQQueue * q)
 	return q->next;
 }
 
-static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
+static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 			   bool active, bool was_active)
 {
-	IRQDest *dst;
-	IRQSource *src;
+	struct irq_dest *dst;
+	struct irq_source *src;
 	int priority;
 
 	dst = &opp->dst[n_CPU];
 	src = &opp->src[n_IRQ];
 
-	DPRINTF("%s: IRQ %d active %d was %d\n",
+	pr_debug("%s: IRQ %d active %d was %d\n",
 		__func__, n_IRQ, active, was_active);
 
 	if (src->output != OPENPIC_OUTPUT_INT) {
-		DPRINTF("%s: output %d irq %d active %d was %d count %d\n",
+		pr_debug("%s: output %d irq %d active %d was %d count %d\n",
 			__func__, src->output, n_IRQ, active, was_active,
 			dst->outputs_active[src->output]);
 
@@ -286,19 +278,17 @@ static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
 		 * masking.
 		 */
 		if (active) {
-			if (!was_active
-			    && dst->outputs_active[src->output]++ == 0) {
-				DPRINTF
-				    ("%s: Raise OpenPIC output %d cpu %d irq %d\n",
-				     __func__, src->output, n_CPU, n_IRQ);
+			if (!was_active &&
+			    dst->outputs_active[src->output]++ == 0) {
+				pr_debug("%s: Raise OpenPIC output %d cpu %d irq %d\n",
+					__func__, src->output, n_CPU, n_IRQ);
 				qemu_irq_raise(dst->irqs[src->output]);
 			}
 		} else {
-			if (was_active
-			    && --dst->outputs_active[src->output] == 0) {
-				DPRINTF
-				    ("%s: Lower OpenPIC output %d cpu %d irq %d\n",
-				     __func__, src->output, n_CPU, n_IRQ);
+			if (was_active &&
+			    --dst->outputs_active[src->output] == 0) {
+				pr_debug("%s: Lower OpenPIC output %d cpu %d irq %d\n",
+					__func__, src->output, n_CPU, n_IRQ);
 				qemu_irq_lower(dst->irqs[src->output]);
 			}
 		}
@@ -311,31 +301,27 @@ static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
 	/* Even if the interrupt doesn't have enough priority,
 	 * it is still raised, in case ctpr is lowered later.
 	 */
-	if (active) {
+	if (active)
 		IRQ_setbit(&dst->raised, n_IRQ);
-	} else {
+	else
 		IRQ_resetbit(&dst->raised, n_IRQ);
-	}
 
 	IRQ_check(opp, &dst->raised);
 
 	if (active && priority <= dst->ctpr) {
-		DPRINTF
-		    ("%s: IRQ %d priority %d too low for ctpr %d on CPU %d\n",
-		     __func__, n_IRQ, priority, dst->ctpr, n_CPU);
+		pr_debug("%s: IRQ %d priority %d too low for ctpr %d on CPU %d\n",
+			__func__, n_IRQ, priority, dst->ctpr, n_CPU);
 		active = 0;
 	}
 
 	if (active) {
 		if (IRQ_get_next(opp, &dst->servicing) >= 0 &&
 		    priority <= dst->servicing.priority) {
-			DPRINTF
-			    ("%s: IRQ %d is hidden by servicing IRQ %d on CPU %d\n",
-			     __func__, n_IRQ, dst->servicing.next, n_CPU);
+			pr_debug("%s: IRQ %d is hidden by servicing IRQ %d on CPU %d\n",
+				__func__, n_IRQ, dst->servicing.next, n_CPU);
 		} else {
-			DPRINTF
-			    ("%s: Raise OpenPIC INT output cpu %d irq %d/%d\n",
-			     __func__, n_CPU, n_IRQ, dst->raised.next);
+			pr_debug("%s: Raise OpenPIC INT output cpu %d irq %d/%d\n",
+				__func__, n_CPU, n_IRQ, dst->raised.next);
 			qemu_irq_raise(opp->dst[n_CPU].
 				       irqs[OPENPIC_OUTPUT_INT]);
 		}
@@ -343,17 +329,15 @@ static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
 		IRQ_get_next(opp, &dst->servicing);
 		if (dst->raised.priority > dst->ctpr &&
 		    dst->raised.priority > dst->servicing.priority) {
-			DPRINTF
-			    ("%s: IRQ %d inactive, IRQ %d prio %d above %d/%d, CPU %d\n",
-			     __func__, n_IRQ, dst->raised.next,
-			     dst->raised.priority, dst->ctpr,
-			     dst->servicing.priority, n_CPU);
+			pr_debug("%s: IRQ %d inactive, IRQ %d prio %d above %d/%d, CPU %d\n",
+				__func__, n_IRQ, dst->raised.next,
+				dst->raised.priority, dst->ctpr,
+				dst->servicing.priority, n_CPU);
 			/* IRQ line stays asserted */
 		} else {
-			DPRINTF
-			    ("%s: IRQ %d inactive, current prio %d/%d, CPU %d\n",
-			     __func__, n_IRQ, dst->ctpr,
-			     dst->servicing.priority, n_CPU);
+			pr_debug("%s: IRQ %d inactive, current prio %d/%d, CPU %d\n",
+				__func__, n_IRQ, dst->ctpr,
+				dst->servicing.priority, n_CPU);
 			qemu_irq_lower(opp->dst[n_CPU].
 				       irqs[OPENPIC_OUTPUT_INT]);
 		}
@@ -361,9 +345,9 @@ static void IRQ_local_pipe(OpenPICState * opp, int n_CPU, int n_IRQ,
 }
 
 /* update pic state because registers for n_IRQ have changed value */
-static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
+static void openpic_update_irq(struct openpic *opp, int n_IRQ)
 {
-	IRQSource *src;
+	struct irq_source *src;
 	bool active, was_active;
 	int i;
 
@@ -372,30 +356,29 @@ static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
 
 	if ((src->ivpr & IVPR_MASK_MASK) && !src->nomask) {
 		/* Interrupt source is disabled */
-		DPRINTF("%s: IRQ %d is disabled\n", __func__, n_IRQ);
+		pr_debug("%s: IRQ %d is disabled\n", __func__, n_IRQ);
 		active = false;
 	}
 
-	was_active = ! !(src->ivpr & IVPR_ACTIVITY_MASK);
+	was_active = !!(src->ivpr & IVPR_ACTIVITY_MASK);
 
 	/*
 	 * We don't have a similar check for already-active because
 	 * ctpr may have changed and we need to withdraw the interrupt.
 	 */
 	if (!active && !was_active) {
-		DPRINTF("%s: IRQ %d is already inactive\n", __func__, n_IRQ);
+		pr_debug("%s: IRQ %d is already inactive\n", __func__, n_IRQ);
 		return;
 	}
 
-	if (active) {
+	if (active)
 		src->ivpr |= IVPR_ACTIVITY_MASK;
-	} else {
+	else
 		src->ivpr &= ~IVPR_ACTIVITY_MASK;
-	}
 
 	if (src->destmask == 0) {
 		/* No target */
-		DPRINTF("%s: IRQ %d has no target\n", __func__, n_IRQ);
+		pr_debug("%s: IRQ %d has no target\n", __func__, n_IRQ);
 		return;
 	}
 
@@ -413,9 +396,9 @@ static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
 	} else {
 		/* Distributed delivery mode */
 		for (i = src->last_cpu + 1; i != src->last_cpu; i++) {
-			if (i == opp->nb_cpus) {
+			if (i == opp->nb_cpus)
 				i = 0;
-			}
+
 			if (src->destmask & (1 << i)) {
 				IRQ_local_pipe(opp, i, n_IRQ, active,
 					       was_active);
@@ -428,16 +411,16 @@ static void openpic_update_irq(OpenPICState * opp, int n_IRQ)
 
 static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 {
-	OpenPICState *opp = opaque;
-	IRQSource *src;
+	struct openpic *opp = opaque;
+	struct irq_source *src;
 
 	if (n_IRQ >= MAX_IRQ) {
-		fprintf(stderr, "%s: IRQ %d out of range\n", __func__, n_IRQ);
+		pr_err("%s: IRQ %d out of range\n", __func__, n_IRQ);
 		abort();
 	}
 
 	src = &opp->src[n_IRQ];
-	DPRINTF("openpic: set irq %d = %d ivpr=0x%08x\n",
+	pr_debug("openpic: set irq %d = %d ivpr=0x%08x\n",
 		n_IRQ, level, src->ivpr);
 	if (src->level) {
 		/* level-sensitive irq */
@@ -463,9 +446,9 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 	}
 }
 
-static void openpic_reset(DeviceState * d)
+static void openpic_reset(DeviceState *d)
 {
-	OpenPICState *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d));
+	struct openpic *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d));
 	int i;
 
 	opp->gcr = GCR_RESET;
@@ -485,7 +468,7 @@ static void openpic_reset(DeviceState * d)
 		switch (opp->src[i].type) {
 		case IRQ_TYPE_NORMAL:
 			opp->src[i].level =
-			    ! !(opp->ivpr_reset & IVPR_SENSE_MASK);
+			    !!(opp->ivpr_reset & IVPR_SENSE_MASK);
 			break;
 
 		case IRQ_TYPE_FSLINT:
@@ -499,9 +482,9 @@ static void openpic_reset(DeviceState * d)
 	/* Initialise IRQ destinations */
 	for (i = 0; i < MAX_CPU; i++) {
 		opp->dst[i].ctpr = 15;
-		memset(&opp->dst[i].raised, 0, sizeof(IRQQueue));
+		memset(&opp->dst[i].raised, 0, sizeof(struct irq_queue));
 		opp->dst[i].raised.next = -1;
-		memset(&opp->dst[i].servicing, 0, sizeof(IRQQueue));
+		memset(&opp->dst[i].servicing, 0, sizeof(struct irq_queue));
 		opp->dst[i].servicing.next = -1;
 	}
 	/* Initialise timers */
@@ -513,28 +496,28 @@ static void openpic_reset(DeviceState * d)
 	opp->gcr = 0;
 }
 
-static inline uint32_t read_IRQreg_idr(OpenPICState * opp, int n_IRQ)
+static inline uint32_t read_IRQreg_idr(struct openpic *opp, int n_IRQ)
 {
 	return opp->src[n_IRQ].idr;
 }
 
-static inline uint32_t read_IRQreg_ilr(OpenPICState * opp, int n_IRQ)
+static inline uint32_t read_IRQreg_ilr(struct openpic *opp, int n_IRQ)
 {
-	if (opp->flags & OPENPIC_FLAG_ILR) {
+	if (opp->flags & OPENPIC_FLAG_ILR)
 		return output_to_inttgt(opp->src[n_IRQ].output);
-	}
 
 	return 0xffffffff;
 }
 
-static inline uint32_t read_IRQreg_ivpr(OpenPICState * opp, int n_IRQ)
+static inline uint32_t read_IRQreg_ivpr(struct openpic *opp, int n_IRQ)
 {
 	return opp->src[n_IRQ].ivpr;
 }
 
-static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
+static inline void write_IRQreg_idr(struct openpic *opp, int n_IRQ,
+				    uint32_t val)
 {
-	IRQSource *src = &opp->src[n_IRQ];
+	struct irq_source *src = &opp->src[n_IRQ];
 	uint32_t normal_mask = (1UL << opp->nb_cpus) - 1;
 	uint32_t crit_mask = 0;
 	uint32_t mask = normal_mask;
@@ -547,14 +530,13 @@ static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
 	}
 
 	src->idr = val & mask;
-	DPRINTF("Set IDR %d to 0x%08x\n", n_IRQ, src->idr);
+	pr_debug("Set IDR %d to 0x%08x\n", n_IRQ, src->idr);
 
 	if (opp->flags & OPENPIC_FLAG_IDR_CRIT) {
 		if (src->idr & crit_mask) {
 			if (src->idr & normal_mask) {
-				DPRINTF
-				    ("%s: IRQ configured for multiple output types, using "
-				     "critical\n", __func__);
+				pr_debug("%s: IRQ configured for multiple output types, using critical\n",
+					__func__);
 			}
 
 			src->output = OPENPIC_OUTPUT_CINT;
@@ -564,9 +546,8 @@ static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
 			for (i = 0; i < opp->nb_cpus; i++) {
 				int n_ci = IDR_CI0_SHIFT - i;
 
-				if (src->idr & (1UL << n_ci)) {
+				if (src->idr & (1UL << n_ci))
 					src->destmask |= 1UL << i;
-				}
 			}
 		} else {
 			src->output = OPENPIC_OUTPUT_INT;
@@ -578,20 +559,21 @@ static inline void write_IRQreg_idr(OpenPICState * opp, int n_IRQ, uint32_t val)
 	}
 }
 
-static inline void write_IRQreg_ilr(OpenPICState * opp, int n_IRQ, uint32_t val)
+static inline void write_IRQreg_ilr(struct openpic *opp, int n_IRQ,
+				    uint32_t val)
 {
 	if (opp->flags & OPENPIC_FLAG_ILR) {
-		IRQSource *src = &opp->src[n_IRQ];
+		struct irq_source *src = &opp->src[n_IRQ];
 
 		src->output = inttgt_to_output(val & ILR_INTTGT_MASK);
-		DPRINTF("Set ILR %d to 0x%08x, output %d\n", n_IRQ, src->idr,
+		pr_debug("Set ILR %d to 0x%08x, output %d\n", n_IRQ, src->idr,
 			src->output);
 
 		/* TODO: on MPIC v4.0 only, set nomask for non-INT */
 	}
 }
 
-static inline void write_IRQreg_ivpr(OpenPICState * opp, int n_IRQ,
+static inline void write_IRQreg_ivpr(struct openpic *opp, int n_IRQ,
 				     uint32_t val)
 {
 	uint32_t mask;
@@ -613,7 +595,7 @@ static inline void write_IRQreg_ivpr(OpenPICState * opp, int n_IRQ,
 	switch (opp->src[n_IRQ].type) {
 	case IRQ_TYPE_NORMAL:
 		opp->src[n_IRQ].level =
-		    ! !(opp->src[n_IRQ].ivpr & IVPR_SENSE_MASK);
+		    !!(opp->src[n_IRQ].ivpr & IVPR_SENSE_MASK);
 		break;
 
 	case IRQ_TYPE_FSLINT:
@@ -626,11 +608,11 @@ static inline void write_IRQreg_ivpr(OpenPICState * opp, int n_IRQ,
 	}
 
 	openpic_update_irq(opp, n_IRQ);
-	DPRINTF("Set IVPR %d to 0x%08x -> 0x%08x\n", n_IRQ, val,
+	pr_debug("Set IVPR %d to 0x%08x -> 0x%08x\n", n_IRQ, val,
 		opp->src[n_IRQ].ivpr);
 }
 
-static void openpic_gcr_write(OpenPICState * opp, uint64_t val)
+static void openpic_gcr_write(struct openpic *opp, uint64_t val)
 {
 	bool mpic_proxy = false;
 
@@ -643,27 +625,26 @@ static void openpic_gcr_write(OpenPICState * opp, uint64_t val)
 	opp->gcr |= val & opp->mpic_mode_mask;
 
 	/* Set external proxy mode */
-	if ((val & opp->mpic_mode_mask) == GCR_MODE_PROXY) {
+	if ((val & opp->mpic_mode_mask) == GCR_MODE_PROXY)
 		mpic_proxy = true;
-	}
 
 	ppce500_set_mpic_proxy(mpic_proxy);
 }
 
-static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned len)
 {
-	OpenPICState *opp = opaque;
-	IRQDest *dst;
+	struct openpic *opp = opaque;
+	struct irq_dest *dst;
 	int idx;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
 		__func__, addr, val);
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return;
-	}
+
 	switch (addr) {
-	case 0x00:		/* Block Revision Register1 (BRR1) is Readonly */
+	case 0x00:	/* Block Revision Register1 (BRR1) is Readonly */
 		break;
 	case 0x40:
 	case 0x50:
@@ -685,16 +666,14 @@ static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
 	case 0x1090:		/* PIR */
 		for (idx = 0; idx < opp->nb_cpus; idx++) {
 			if ((val & (1 << idx)) && !(opp->pir & (1 << idx))) {
-				DPRINTF
-				    ("Raise OpenPIC RESET output for CPU %d\n",
-				     idx);
+				pr_debug("Raise OpenPIC RESET output for CPU %d\n",
+					idx);
 				dst = &opp->dst[idx];
 				qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_RESET]);
-			} else if (!(val & (1 << idx))
-				   && (opp->pir & (1 << idx))) {
-				DPRINTF
-				    ("Lower OpenPIC RESET output for CPU %d\n",
-				     idx);
+			} else if (!(val & (1 << idx)) &&
+				   (opp->pir & (1 << idx))) {
+				pr_debug("Lower OpenPIC RESET output for CPU %d\n",
+					idx);
 				dst = &opp->dst[idx];
 				qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_RESET]);
 			}
@@ -704,13 +683,12 @@ static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
 	case 0x10A0:		/* IPI_IVPR */
 	case 0x10B0:
 	case 0x10C0:
-	case 0x10D0:
-		{
-			int idx;
-			idx = (addr - 0x10A0) >> 4;
-			write_IRQreg_ivpr(opp, opp->irq_ipi0 + idx, val);
-		}
+	case 0x10D0: {
+		int idx;
+		idx = (addr - 0x10A0) >> 4;
+		write_IRQreg_ivpr(opp, opp->irq_ipi0 + idx, val);
 		break;
+	}
 	case 0x10E0:		/* SPVE */
 		opp->spve = val & opp->vector_mask;
 		break;
@@ -719,16 +697,16 @@ static void openpic_gbl_write(void *opaque, hwaddr addr, uint64_t val,
 	}
 }
 
-static uint64_t openpic_gbl_read(void *opaque, hwaddr addr, unsigned len)
+static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	uint32_t retval;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
 	retval = 0xFFFFFFFF;
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return retval;
-	}
+
 	switch (addr) {
 	case 0x1000:		/* FRR */
 		retval = opp->frr;
@@ -772,24 +750,23 @@ static uint64_t openpic_gbl_read(void *opaque, hwaddr addr, unsigned len)
 	default:
 		break;
 	}
-	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
 	return retval;
 }
 
-static void openpic_tmr_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_tmr_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	int idx;
 
 	addr += 0x10f0;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
 		__func__, addr, val);
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return;
-	}
 
 	if (addr == 0x10f0) {
 		/* TFRR */
@@ -806,9 +783,9 @@ static void openpic_tmr_write(void *opaque, hwaddr addr, uint64_t val,
 	case 0x10:		/* TBCR */
 		if ((opp->timers[idx].tccr & TCCR_TOG) != 0 &&
 		    (val & TBCR_CI) == 0 &&
-		    (opp->timers[idx].tbcr & TBCR_CI) != 0) {
+		    (opp->timers[idx].tbcr & TBCR_CI) != 0)
 			opp->timers[idx].tccr &= ~TCCR_TOG;
-		}
+
 		opp->timers[idx].tbcr = val;
 		break;
 	case 0x20:		/* TVPR */
@@ -820,16 +797,16 @@ static void openpic_tmr_write(void *opaque, hwaddr addr, uint64_t val,
 	}
 }
 
-static uint64_t openpic_tmr_read(void *opaque, hwaddr addr, unsigned len)
+static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	uint32_t retval = -1;
 	int idx;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
-	if (addr & 0xF) {
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	if (addr & 0xF)
 		goto out;
-	}
+
 	idx = (addr >> 6) & 0x3;
 	if (addr == 0x0) {
 		/* TFRR */
@@ -852,18 +829,18 @@ static uint64_t openpic_tmr_read(void *opaque, hwaddr addr, unsigned len)
 	}
 
 out:
-	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
 	return retval;
 }
 
-static void openpic_src_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_src_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	int idx;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
 		__func__, addr, val);
 
 	addr = addr & 0xffff;
@@ -884,11 +861,11 @@ static void openpic_src_write(void *opaque, hwaddr addr, uint64_t val,
 
 static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	uint32_t retval;
 	int idx;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
 	retval = 0xFFFFFFFF;
 
 	addr = addr & 0xffff;
@@ -906,22 +883,21 @@ static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
 		break;
 	}
 
-	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
 	return retval;
 }
 
-static void openpic_msi_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_msi_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned size)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	int idx = opp->irq_msi;
 	int srs, ibs;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
 		__func__, addr, val);
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return;
-	}
 
 	switch (addr) {
 	case MSIIR_OFFSET:
@@ -937,16 +913,15 @@ static void openpic_msi_write(void *opaque, hwaddr addr, uint64_t val,
 	}
 }
 
-static uint64_t openpic_msi_read(void *opaque, hwaddr addr, unsigned size)
+static uint64_t openpic_msi_read(void *opaque, gpa_t addr, unsigned size)
 {
-	OpenPICState *opp = opaque;
+	struct openpic *opp = opaque;
 	uint64_t r = 0;
 	int i, srs;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
-	if (addr & 0xF) {
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	if (addr & 0xF)
 		return -1;
-	}
 
 	srs = addr >> 4;
 
@@ -965,53 +940,51 @@ static uint64_t openpic_msi_read(void *opaque, hwaddr addr, unsigned size)
 		openpic_set_irq(opp, opp->irq_msi + srs, 0);
 		break;
 	case 0x120:		/* MSISR */
-		for (i = 0; i < MAX_MSI; i++) {
+		for (i = 0; i < MAX_MSI; i++)
 			r |= (opp->msi[i].msir ? 1 : 0) << i;
-		}
 		break;
 	}
 
 	return r;
 }
 
-static uint64_t openpic_summary_read(void *opaque, hwaddr addr, unsigned size)
+static uint64_t openpic_summary_read(void *opaque, gpa_t addr, unsigned size)
 {
 	uint64_t r = 0;
 
-	DPRINTF("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
 
 	/* TODO: EISR/EIMR */
 
 	return r;
 }
 
-static void openpic_summary_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_summary_write(void *opaque, gpa_t addr, uint64_t val,
 				  unsigned size)
 {
-	DPRINTF("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
+	pr_debug("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
 		__func__, addr, val);
 
 	/* TODO: EISR/EIMR */
 }
 
-static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
+static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
 				       uint32_t val, int idx)
 {
-	OpenPICState *opp = opaque;
-	IRQSource *src;
-	IRQDest *dst;
+	struct openpic *opp = opaque;
+	struct irq_source *src;
+	struct irq_dest *dst;
 	int s_IRQ, n_IRQ;
 
-	DPRINTF("%s: cpu %d addr %#" HWADDR_PRIx " <= 0x%08x\n", __func__, idx,
+	pr_debug("%s: cpu %d addr %#" HWADDR_PRIx " <= 0x%08x\n", __func__, idx,
 		addr, val);
 
-	if (idx < 0) {
+	if (idx < 0)
 		return;
-	}
 
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return;
-	}
+
 	dst = &opp->dst[idx];
 	addr &= 0xFF0;
 	switch (addr) {
@@ -1028,17 +1001,16 @@ static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
 	case 0x80:		/* CTPR */
 		dst->ctpr = val & 0x0000000F;
 
-		DPRINTF("%s: set CPU %d ctpr to %d, raised %d servicing %d\n",
+		pr_debug("%s: set CPU %d ctpr to %d, raised %d servicing %d\n",
 			__func__, idx, dst->ctpr, dst->raised.priority,
 			dst->servicing.priority);
 
 		if (dst->raised.priority <= dst->ctpr) {
-			DPRINTF
-			    ("%s: Lower OpenPIC INT output cpu %d due to ctpr\n",
-			     __func__, idx);
+			pr_debug("%s: Lower OpenPIC INT output cpu %d due to ctpr\n",
+				__func__, idx);
 			qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
 		} else if (dst->raised.priority > dst->servicing.priority) {
-			DPRINTF("%s: Raise OpenPIC INT output cpu %d irq %d\n",
+			pr_debug("%s: Raise OpenPIC INT output cpu %d irq %d\n",
 				__func__, idx, dst->raised.next);
 			qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_INT]);
 		}
@@ -1051,11 +1023,11 @@ static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
 		/* Read-only register */
 		break;
 	case 0xB0:		/* EOI */
-		DPRINTF("EOI\n");
+		pr_debug("EOI\n");
 		s_IRQ = IRQ_get_next(opp, &dst->servicing);
 
 		if (s_IRQ < 0) {
-			DPRINTF("%s: EOI with no interrupt in service\n",
+			pr_debug("%s: EOI with no interrupt in service\n",
 				__func__);
 			break;
 		}
@@ -1069,7 +1041,7 @@ static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
 		if (n_IRQ != -1 &&
 		    (s_IRQ == -1 ||
 		     IVPR_PRIORITY(src->ivpr) > dst->servicing.priority)) {
-			DPRINTF("Raise OpenPIC INT output cpu %d irq %d\n",
+			pr_debug("Raise OpenPIC INT output cpu %d irq %d\n",
 				idx, n_IRQ);
 			qemu_irq_raise(opp->dst[idx].irqs[OPENPIC_OUTPUT_INT]);
 		}
@@ -1079,32 +1051,32 @@ static void openpic_cpu_write_internal(void *opaque, hwaddr addr,
 	}
 }
 
-static void openpic_cpu_write(void *opaque, hwaddr addr, uint64_t val,
+static void openpic_cpu_write(void *opaque, gpa_t addr, uint64_t val,
 			      unsigned len)
 {
 	openpic_cpu_write_internal(opaque, addr, val, (addr & 0x1f000) >> 12);
 }
 
-static uint32_t openpic_iack(OpenPICState * opp, IRQDest * dst, int cpu)
+static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
+			     int cpu)
 {
-	IRQSource *src;
+	struct irq_source *src;
 	int retval, irq;
 
-	DPRINTF("Lower OpenPIC INT output\n");
+	pr_debug("Lower OpenPIC INT output\n");
 	qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
 
 	irq = IRQ_get_next(opp, &dst->raised);
-	DPRINTF("IACK: irq=%d\n", irq);
+	pr_debug("IACK: irq=%d\n", irq);
 
-	if (irq == -1) {
+	if (irq == -1)
 		/* No more interrupt pending */
 		return opp->spve;
-	}
 
 	src = &opp->src[irq];
 	if (!(src->ivpr & IVPR_ACTIVITY_MASK) ||
 	    !(IVPR_PRIORITY(src->ivpr) > dst->ctpr)) {
-		fprintf(stderr, "%s: bad raised IRQ %d ctpr %d ivpr 0x%08x\n",
+		pr_err("%s: bad raised IRQ %d ctpr %d ivpr 0x%08x\n",
 			__func__, irq, dst->ctpr, src->ivpr);
 		openpic_update_irq(opp, irq);
 		retval = opp->spve;
@@ -1135,22 +1107,21 @@ static uint32_t openpic_iack(OpenPICState * opp, IRQDest * dst, int cpu)
 	return retval;
 }
 
-static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx)
+static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx)
 {
-	OpenPICState *opp = opaque;
-	IRQDest *dst;
+	struct openpic *opp = opaque;
+	struct irq_dest *dst;
 	uint32_t retval;
 
-	DPRINTF("%s: cpu %d addr %#" HWADDR_PRIx "\n", __func__, idx, addr);
+	pr_debug("%s: cpu %d addr %#" HWADDR_PRIx "\n", __func__, idx, addr);
 	retval = 0xFFFFFFFF;
 
-	if (idx < 0) {
+	if (idx < 0)
 		return retval;
-	}
 
-	if (addr & 0xF) {
+	if (addr & 0xF)
 		return retval;
-	}
+
 	dst = &opp->dst[idx];
 	addr &= 0xFF0;
 	switch (addr) {
@@ -1169,54 +1140,54 @@ static uint32_t openpic_cpu_read_internal(void *opaque, hwaddr addr, int idx)
 	default:
 		break;
 	}
-	DPRINTF("%s: => 0x%08x\n", __func__, retval);
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
 	return retval;
 }
 
-static uint64_t openpic_cpu_read(void *opaque, hwaddr addr, unsigned len)
+static uint64_t openpic_cpu_read(void *opaque, gpa_t addr, unsigned len)
 {
 	return openpic_cpu_read_internal(opaque, addr, (addr & 0x1f000) >> 12);
 }
 
-static const MemoryRegionOps openpic_glb_ops_be = {
+static const struct kvm_io_device_ops openpic_glb_ops_be = {
 	.write = openpic_gbl_write,
 	.read = openpic_gbl_read,
 };
 
-static const MemoryRegionOps openpic_tmr_ops_be = {
+static const struct kvm_io_device_ops openpic_tmr_ops_be = {
 	.write = openpic_tmr_write,
 	.read = openpic_tmr_read,
 };
 
-static const MemoryRegionOps openpic_cpu_ops_be = {
+static const struct kvm_io_device_ops openpic_cpu_ops_be = {
 	.write = openpic_cpu_write,
 	.read = openpic_cpu_read,
 };
 
-static const MemoryRegionOps openpic_src_ops_be = {
+static const struct kvm_io_device_ops openpic_src_ops_be = {
 	.write = openpic_src_write,
 	.read = openpic_src_read,
 };
 
-static const MemoryRegionOps openpic_msi_ops_be = {
+static const struct kvm_io_device_ops openpic_msi_ops_be = {
 	.read = openpic_msi_read,
 	.write = openpic_msi_write,
 };
 
-static const MemoryRegionOps openpic_summary_ops_be = {
+static const struct kvm_io_device_ops openpic_summary_ops_be = {
 	.read = openpic_summary_read,
 	.write = openpic_summary_write,
 };
 
-typedef struct MemReg {
+struct mem_reg {
 	const char *name;
-	MemoryRegionOps const *ops;
-	hwaddr start_addr;
-	ram_addr_t size;
-} MemReg;
+	const struct kvm_io_device_ops *ops;
+	gpa_t start_addr;
+	int size;
+};
 
-static void fsl_common_init(OpenPICState * opp)
+static void fsl_common_init(struct openpic *opp)
 {
 	int i;
 	int virq = MAX_SRC;
@@ -1239,9 +1210,8 @@ static void fsl_common_init(OpenPICState * opp)
 	opp->irq_msi = 224;
 
 	msi_supported = true;
-	for (i = 0; i < opp->fsl->max_ext; i++) {
+	for (i = 0; i < opp->fsl->max_ext; i++)
 		opp->src[i].level = false;
-	}
 
 	/* Internal interrupts, including message and MSI */
 	for (i = 16; i < MAX_SRC; i++) {
@@ -1256,7 +1226,8 @@ static void fsl_common_init(OpenPICState * opp)
 	}
 }
 
-static void map_list(OpenPICState * opp, const MemReg * list, int *count)
+static void map_list(struct openpic *opp, const struct mem_reg *list,
+		     int *count)
 {
 	while (list->name) {
 		assert(*count < ARRAY_SIZE(opp->sub_io_mem));
@@ -1272,12 +1243,12 @@ static void map_list(OpenPICState * opp, const MemReg * list, int *count)
 	}
 }
 
-static int openpic_init(SysBusDevice * dev)
+static int openpic_init(SysBusDevice *dev)
 {
-	OpenPICState *opp = FROM_SYSBUS(typeof(*opp), dev);
+	struct openpic *opp = FROM_SYSBUS(typeof(*opp), dev);
 	int i, j;
 	int list_count = 0;
-	static const MemReg list_le[] = {
+	static const struct mem_reg list_le[] = {
 		{"glb", &openpic_glb_ops_le,
 		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
 		{"tmr", &openpic_tmr_ops_le,
@@ -1288,7 +1259,7 @@ static int openpic_init(SysBusDevice * dev)
 		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
 		{NULL}
 	};
-	static const MemReg list_be[] = {
+	static const struct mem_reg list_be[] = {
 		{"glb", &openpic_glb_ops_be,
 		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
 		{"tmr", &openpic_tmr_ops_be,
@@ -1299,7 +1270,7 @@ static int openpic_init(SysBusDevice * dev)
 		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
 		{NULL}
 	};
-	static const MemReg list_fsl[] = {
+	static const struct mem_reg list_fsl[] = {
 		{"msi", &openpic_msi_ops_be,
 		 OPENPIC_MSI_REG_START, OPENPIC_MSI_REG_SIZE},
 		{"summary", &openpic_summary_ops_be,
-- 
1.6.0.2

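The per-CPU accessors above pick the target CPU with (addr & 0x1f000) >> 12,
reject accesses that are not 16-byte aligned, and then mask the address down
to a register offset with 0xFF0.  A minimal standalone sketch of that decode
follows, assuming addr is an offset into the per-CPU register region.  The
names cpu_reg and decode_cpu_reg are hypothetical and are not part of the
patch.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPU 32	/* same limit as the MPIC code */

/* Hypothetical result type, for illustration only. */
struct cpu_reg {
	int cpu;	/* index of the 4 KiB per-CPU window */
	uint32_t reg;	/* register offset inside that window */
};

/*
 * Hypothetical helper mirroring the decode done by openpic_cpu_read()
 * and openpic_cpu_read_internal().  Returns 0 on success, or -1 if the
 * access is not 16-byte aligned.
 */
static int decode_cpu_reg(uint64_t addr, struct cpu_reg *out)
{
	if (addr & 0xF)
		return -1;

	out->cpu = (int)((addr & 0x1f000) >> 12);
	/* A 5-bit field can never exceed MAX_CPU - 1. */
	assert(out->cpu < MAX_CPU);

	out->reg = (uint32_t)(addr & 0xFF0);
	return 0;
}
```

With these masks each CPU gets a 4 KiB window, which is why the 5-bit index
field matches MAX_CPU = 32.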


* [PATCH 13/17] kvm/ppc/mpic: in-kernel MPIC emulation
  2013-04-19 14:06 ` Alexander Graf
@ 2013-04-19 14:06   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

Hook the MPIC code up to the KVM interfaces, add locking, etc.

Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: add stub function for kvmppc_mpic_set_epr, non-booke, 64bit]
Signed-off-by: Alexander Graf <agraf@suse.de>

---

v2 -> v3:

  - fix pr_debug again
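
The addressing rules stated in mpic.txt (a naturally aligned base for the
256 KiB register space, 4-byte aligned register offsets, and IRQ numbers
derived from the IVPR offset divided by 32) can be sketched as the checks
below.  This is only an illustration under stated assumptions: the helper
names are hypothetical, "naturally aligned" is read as "aligned to the size
of the register space", and none of this is the kernel's own validation code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size of the MPIC register space, as stated in mpic.txt (256 KiB). */
#define MPIC_REG_SPACE_SIZE 0x40000ULL

/*
 * Hypothetical helper: would this value be acceptable for
 * KVM_DEV_MPIC_BASE_ADDR?  Zero passes, since mpic.txt says that a
 * value of zero disables the mapping.
 */
static bool mpic_base_addr_ok(uint64_t base)
{
	/* The mask trick below only works for a power-of-two size. */
	assert((MPIC_REG_SPACE_SIZE & (MPIC_REG_SPACE_SIZE - 1)) == 0);
	return (base & (MPIC_REG_SPACE_SIZE - 1)) == 0;
}

/*
 * Hypothetical helper: is "attr" usable with KVM_DEV_MPIC_GRP_REGISTER?
 * Only the documented 4-byte alignment rule is checked here.
 */
static bool mpic_reg_attr_ok(uint64_t attr)
{
	return (attr & 0x3) == 0;
}

/*
 * Hypothetical helper: IRQ number for KVM_DEV_MPIC_GRP_IRQ_ACTIVE, given
 * the byte offset of the source's IVPR from EIVPR0.  Returns -1 if the
 * offset is not a multiple of 32.
 */
static int mpic_irq_from_ivpr_offset(uint32_t off)
{
	if (off & 0x1f)
		return -1;
	return (int)(off >> 5);
}
```

For example, an IVPR at byte offset 0x40 from EIVPR0 corresponds to IRQ 2.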
---
 Documentation/virtual/kvm/devices/mpic.txt |   37 ++
 arch/powerpc/include/asm/kvm_host.h        |    8 +-
 arch/powerpc/include/asm/kvm_ppc.h         |   17 +
 arch/powerpc/include/uapi/asm/kvm.h        |    7 +
 arch/powerpc/kvm/Kconfig                   |    9 +
 arch/powerpc/kvm/Makefile                  |    2 +
 arch/powerpc/kvm/booke.c                   |    8 +-
 arch/powerpc/kvm/mpic.c                    |  762 +++++++++++++++++++++-------
 arch/powerpc/kvm/powerpc.c                 |   12 +-
 include/linux/kvm_host.h                   |    2 +
 include/uapi/linux/kvm.h                   |    3 +
 virt/kvm/kvm_main.c                        |    6 +
 12 files changed, 673 insertions(+), 200 deletions(-)
 create mode 100644 Documentation/virtual/kvm/devices/mpic.txt

diff --git a/Documentation/virtual/kvm/devices/mpic.txt b/Documentation/virtual/kvm/devices/mpic.txt
new file mode 100644
index 0000000..ce98e32
--- /dev/null
+++ b/Documentation/virtual/kvm/devices/mpic.txt
@@ -0,0 +1,37 @@
+MPIC interrupt controller
+=========================
+
+Device types supported:
+  KVM_DEV_TYPE_FSL_MPIC_20     Freescale MPIC v2.0
+  KVM_DEV_TYPE_FSL_MPIC_42     Freescale MPIC v4.2
+
+Only one MPIC instance, of any type, may be instantiated.  The created
+MPIC will act as the system interrupt controller, connecting to each
+vcpu's interrupt inputs.
+
+Groups:
+  KVM_DEV_MPIC_GRP_MISC
+  Attributes:
+    KVM_DEV_MPIC_BASE_ADDR (rw, 64-bit)
+      Base address of the 256 KiB MPIC register space.  Must be
+      naturally aligned.  A value of zero disables the mapping.
+      Reset value is zero.
+
+  KVM_DEV_MPIC_GRP_REGISTER (rw, 32-bit)
+    Access an MPIC register, as if the access were made from the guest.
+    "attr" is the byte offset into the MPIC register space.  Accesses
+    must be 4-byte aligned.
+
+    MSIs may be signaled by using this attribute group to write
+    to the relevant MSIIR.
+
+  KVM_DEV_MPIC_GRP_IRQ_ACTIVE (rw, 32-bit)
+    IRQ input line for each standard openpic source.  0 is inactive and 1
+    is active, regardless of interrupt sense.
+
+    For edge-triggered interrupts:  Writing 1 is considered an activating
+    edge, and writing 0 is ignored.  Reading returns 1 if a previously
+    signaled edge has not been acknowledged, and 0 otherwise.
+
+    "attr" is the IRQ number.  IRQ numbers for standard sources are the
+    byte offset of the relevant IVPR from EIVPR0, divided by 32.
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index e34f8fe..7e7aef9 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -359,6 +359,11 @@ struct kvmppc_slb {
 #define KVMPPC_BOOKE_MAX_IAC	4
 #define KVMPPC_BOOKE_MAX_DAC	2
 
+/* KVMPPC_EPR_USER takes precedence over KVMPPC_EPR_KERNEL */
+#define KVMPPC_EPR_NONE		0 /* EPR not supported */
+#define KVMPPC_EPR_USER		1 /* exit to userspace to fill EPR */
+#define KVMPPC_EPR_KERNEL	2 /* in-kernel irqchip */
+
 struct kvmppc_booke_debug_reg {
 	u32 dbcr0;
 	u32 dbcr1;
@@ -522,7 +527,7 @@ struct kvm_vcpu_arch {
 	u8 sane;
 	u8 cpu_type;
 	u8 hcall_needed;
-	u8 epr_enabled;
+	u8 epr_flags; /* KVMPPC_EPR_xxx */
 	u8 epr_needed;
 
 	u32 cpr0_cfgaddr; /* holds the last set cpr0_cfgaddr */
@@ -589,5 +594,6 @@ struct kvm_vcpu_arch {
 #define KVM_MMIO_REG_FQPR	0x0060
 
 #define __KVM_HAVE_ARCH_WQP
+#define __KVM_HAVE_CREATE_DEVICE
 
 #endif /* __POWERPC_KVM_HOST_H__ */
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index f589307..0b86604 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -164,6 +164,8 @@ extern int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu);
 
 extern int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *);
 
+int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu, struct kvm_interrupt *irq);
+
 /*
  * Cuts out inst bits with ordering according to spec.
  * That means the leftmost bit is zero. All given bits are included.
@@ -245,6 +247,9 @@ int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id, union kvmppc_one_reg *);
 
 void kvmppc_set_pid(struct kvm_vcpu *vcpu, u32 pid);
 
+struct openpic;
+void kvmppc_mpic_put(struct openpic *opp);
+
 #ifdef CONFIG_KVM_BOOK3S_64_HV
 static inline void kvmppc_set_xics_phys(int cpu, unsigned long addr)
 {
@@ -270,6 +275,18 @@ static inline void kvmppc_set_epr(struct kvm_vcpu *vcpu, u32 epr)
 #endif
 }
 
+#ifdef CONFIG_KVM_MPIC
+
+void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu);
+
+#else
+
+static inline void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu)
+{
+}
+
+#endif /* CONFIG_KVM_MPIC */
+
 int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
 			      struct kvm_config_tlb *cfg);
 int kvm_vcpu_ioctl_dirty_tlb(struct kvm_vcpu *vcpu,
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index c2ff99c..36be2fe 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -426,4 +426,11 @@ struct kvm_get_htab_header {
 /* Debugging: Special instruction for software breakpoint */
 #define KVM_REG_PPC_DEBUG_INST	(KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x8b)
 
+/* Device control API: PPC-specific devices */
+#define KVM_DEV_MPIC_GRP_MISC		1
+#define   KVM_DEV_MPIC_BASE_ADDR	0	/* 64-bit */
+
+#define KVM_DEV_MPIC_GRP_REGISTER	2	/* 32-bit */
+#define KVM_DEV_MPIC_GRP_IRQ_ACTIVE	3	/* 32-bit */
+
 #endif /* __LINUX_KVM_POWERPC_H */
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 63c67ec..938a729 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -151,6 +151,15 @@ config KVM_E500MC
 
 	  If unsure, say N.
 
+config KVM_MPIC
+	bool "KVM in-kernel MPIC emulation"
+	depends on KVM
+	help
+	  Enable support for emulating MPIC devices inside the
+	  host kernel, rather than relying on userspace to emulate.
+	  Currently, support is limited to certain versions of
+	  Freescale's MPIC implementation.
+
 source drivers/vhost/Kconfig
 
 endif # VIRTUALIZATION
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index b772ede..4a2277a 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -103,6 +103,8 @@ kvm-book3s_32-objs := \
 	book3s_32_mmu.o
 kvm-objs-$(CONFIG_KVM_BOOK3S_32) := $(kvm-book3s_32-objs)
 
+kvm-objs-$(CONFIG_KVM_MPIC) += mpic.o
+
 kvm-objs := $(kvm-objs-m) $(kvm-objs-y)
 
 obj-$(CONFIG_KVM_440) += kvm.o
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index a49a68a..cff53d4 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -346,7 +346,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 		keep_irq = true;
 	}
 
-	if ((priority == BOOKE_IRQPRIO_EXTERNAL) && vcpu->arch.epr_enabled)
+	if ((priority == BOOKE_IRQPRIO_EXTERNAL) && vcpu->arch.epr_flags)
 		update_epr = true;
 
 	switch (priority) {
@@ -427,8 +427,10 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 			set_guest_esr(vcpu, vcpu->arch.queued_esr);
 		if (update_dear == true)
 			set_guest_dear(vcpu, vcpu->arch.queued_dear);
-		if (update_epr == true)
-			kvm_make_request(KVM_REQ_EPR_EXIT, vcpu);
+		if (update_epr == true) {
+			if (vcpu->arch.epr_flags & KVMPPC_EPR_USER)
+				kvm_make_request(KVM_REQ_EPR_EXIT, vcpu);
+		}
 
 		new_msr &= msr_mask;
 #if defined(CONFIG_64BIT)
diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index 1df67ae..cb451b9 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -23,6 +23,19 @@
  * THE SOFTWARE.
  */
 
+#include <linux/slab.h>
+#include <linux/mutex.h>
+#include <linux/kvm_host.h>
+#include <linux/errno.h>
+#include <linux/fs.h>
+#include <linux/anon_inodes.h>
+#include <asm/uaccess.h>
+#include <asm/mpic.h>
+#include <asm/kvm_para.h>
+#include <asm/kvm_host.h>
+#include <asm/kvm_ppc.h>
+#include "iodev.h"
+
 #define MAX_CPU     32
 #define MAX_SRC     256
 #define MAX_TMR     4
@@ -36,6 +49,7 @@
 #define OPENPIC_FLAG_ILR          (2 << 0)
 
 /* OpenPIC address map */
+#define OPENPIC_REG_SIZE             0x40000
 #define OPENPIC_GLB_REG_START        0x0
 #define OPENPIC_GLB_REG_SIZE         0x10F0
 #define OPENPIC_TMR_REG_START        0x10F0
@@ -89,6 +103,7 @@ static struct fsl_mpic_info fsl_mpic_42 = {
 #define ILR_INTTGT_INT    0x00
 #define ILR_INTTGT_CINT   0x01	/* critical */
 #define ILR_INTTGT_MCP    0x02	/* machine check */
+#define NUM_OUTPUTS       3
 
 #define MSIIR_OFFSET       0x140
 #define MSIIR_SRS_SHIFT    29
@@ -98,18 +113,19 @@ static struct fsl_mpic_info fsl_mpic_42 = {
 
 static int get_current_cpu(void)
 {
-	CPUState *cpu_single_cpu;
-
-	if (!cpu_single_env)
-		return -1;
-
-	cpu_single_cpu = ENV_GET_CPU(cpu_single_env);
-	return cpu_single_cpu->cpu_index;
+#if defined(CONFIG_KVM) && defined(CONFIG_BOOKE)
+	struct kvm_vcpu *vcpu = current->thread.kvm_vcpu;
+	return vcpu ? vcpu->vcpu_id : -1;
+#else
+	/* XXX */
+	return -1;
+#endif
 }
 
-static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx);
-static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
-				       uint32_t val, int idx);
+static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
+				      u32 val, int idx);
+static int openpic_cpu_read_internal(void *opaque, gpa_t addr,
+				     u32 *ptr, int idx);
 
 enum irq_type {
 	IRQ_TYPE_NORMAL = 0,
@@ -131,7 +147,7 @@ struct irq_source {
 	uint32_t idr;		/* IRQ destination register */
 	uint32_t destmask;	/* bitmap of CPU destinations */
 	int last_cpu;
-	int output;		/* IRQ level, e.g. OPENPIC_OUTPUT_INT */
+	int output;		/* IRQ level, e.g. ILR_INTTGT_INT */
 	int pending;		/* TRUE if IRQ is pending */
 	enum irq_type type;
 	bool level:1;		/* level-triggered */
@@ -158,16 +174,27 @@ struct irq_source {
 #define IDR_CI      0x40000000	/* critical interrupt */
 
 struct irq_dest {
+	struct kvm_vcpu *vcpu;
+
 	int32_t ctpr;		/* CPU current task priority */
 	struct irq_queue raised;
 	struct irq_queue servicing;
-	qemu_irq *irqs;
 
 	/* Count of IRQ sources asserting on non-INT outputs */
-	uint32_t outputs_active[OPENPIC_OUTPUT_NB];
+	uint32_t outputs_active[NUM_OUTPUTS];
 };
 
 struct openpic {
+	struct kvm *kvm;
+	struct kvm_device *dev;
+	struct kvm_io_device mmio;
+	struct list_head mmio_regions;
+	atomic_t users;
+	bool mmio_mapped;
+
+	gpa_t reg_base;
+	spinlock_t lock;
+
 	/* Behavior control */
 	struct fsl_mpic_info *fsl;
 	uint32_t model;
@@ -208,6 +235,47 @@ struct openpic {
 	uint32_t irq_msi;
 };
 
+
+static void mpic_irq_raise(struct openpic *opp, struct irq_dest *dst,
+			   int output)
+{
+	struct kvm_interrupt irq = {
+		.irq = KVM_INTERRUPT_SET_LEVEL,
+	};
+
+	if (!dst->vcpu) {
+		pr_debug("%s: destination cpu %d does not exist\n",
+			 __func__, (int)(dst - &opp->dst[0]));
+		return;
+	}
+
+	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->vcpu_id,
+		output);
+
+	if (output != ILR_INTTGT_INT)	/* TODO */
+		return;
+
+	kvm_vcpu_ioctl_interrupt(dst->vcpu, &irq);
+}
+
+static void mpic_irq_lower(struct openpic *opp, struct irq_dest *dst,
+			   int output)
+{
+	if (!dst->vcpu) {
+		pr_debug("%s: destination cpu %d does not exist\n",
+			 __func__, (int)(dst - &opp->dst[0]));
+		return;
+	}
+
+	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->vcpu_id,
+		output);
+
+	if (output != ILR_INTTGT_INT)	/* TODO */
+		return;
+
+	kvmppc_core_dequeue_external(dst->vcpu);
+}
+
 static inline void IRQ_setbit(struct irq_queue *q, int n_IRQ)
 {
 	set_bit(n_IRQ, q->queue);
@@ -268,7 +336,7 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 	pr_debug("%s: IRQ %d active %d was %d\n",
 		__func__, n_IRQ, active, was_active);
 
-	if (src->output != OPENPIC_OUTPUT_INT) {
+	if (src->output != ILR_INTTGT_INT) {
 		pr_debug("%s: output %d irq %d active %d was %d count %d\n",
 			__func__, src->output, n_IRQ, active, was_active,
 			dst->outputs_active[src->output]);
@@ -282,14 +350,14 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 			    dst->outputs_active[src->output]++ == 0) {
 				pr_debug("%s: Raise OpenPIC output %d cpu %d irq %d\n",
 					__func__, src->output, n_CPU, n_IRQ);
-				qemu_irq_raise(dst->irqs[src->output]);
+				mpic_irq_raise(opp, dst, src->output);
 			}
 		} else {
 			if (was_active &&
 			    --dst->outputs_active[src->output] == 0) {
 				pr_debug("%s: Lower OpenPIC output %d cpu %d irq %d\n",
 					__func__, src->output, n_CPU, n_IRQ);
-				qemu_irq_lower(dst->irqs[src->output]);
+				mpic_irq_lower(opp, dst, src->output);
 			}
 		}
 
@@ -322,8 +390,7 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 		} else {
 			pr_debug("%s: Raise OpenPIC INT output cpu %d irq %d/%d\n",
 				__func__, n_CPU, n_IRQ, dst->raised.next);
-			qemu_irq_raise(opp->dst[n_CPU].
-				       irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_raise(opp, dst, ILR_INTTGT_INT);
 		}
 	} else {
 		IRQ_get_next(opp, &dst->servicing);
@@ -338,8 +405,7 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 			pr_debug("%s: IRQ %d inactive, current prio %d/%d, CPU %d\n",
 				__func__, n_IRQ, dst->ctpr,
 				dst->servicing.priority, n_CPU);
-			qemu_irq_lower(opp->dst[n_CPU].
-				       irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_lower(opp, dst, ILR_INTTGT_INT);
 		}
 	}
 }
@@ -415,8 +481,8 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 	struct irq_source *src;
 
 	if (n_IRQ >= MAX_IRQ) {
-		pr_err("%s: IRQ %d out of range\n", __func__, n_IRQ);
-		abort();
+		WARN_ONCE(1, "%s: IRQ %d out of range\n", __func__, n_IRQ);
+		return;
 	}
 
 	src = &opp->src[n_IRQ];
@@ -433,7 +499,7 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 			openpic_update_irq(opp, n_IRQ);
 		}
 
-		if (src->output != OPENPIC_OUTPUT_INT) {
+		if (src->output != ILR_INTTGT_INT) {
 			/* Edge-triggered interrupts shouldn't be used
 			 * with non-INT delivery, but just in case,
 			 * try to make it do something sane rather than
@@ -446,15 +512,13 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 	}
 }
 
-static void openpic_reset(DeviceState *d)
+static void openpic_reset(struct openpic *opp)
 {
-	struct openpic *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d));
 	int i;
 
 	opp->gcr = GCR_RESET;
 	/* Initialise controller registers */
 	opp->frr = ((opp->nb_irqs - 1) << FRR_NIRQ_SHIFT) |
-	    ((opp->nb_cpus - 1) << FRR_NCPU_SHIFT) |
 	    (opp->vid << FRR_VID_SHIFT);
 
 	opp->pir = 0;
@@ -504,7 +568,7 @@ static inline uint32_t read_IRQreg_idr(struct openpic *opp, int n_IRQ)
 static inline uint32_t read_IRQreg_ilr(struct openpic *opp, int n_IRQ)
 {
 	if (opp->flags & OPENPIC_FLAG_ILR)
-		return output_to_inttgt(opp->src[n_IRQ].output);
+		return opp->src[n_IRQ].output;
 
 	return 0xffffffff;
 }
@@ -539,7 +603,7 @@ static inline void write_IRQreg_idr(struct openpic *opp, int n_IRQ,
 					__func__);
 			}
 
-			src->output = OPENPIC_OUTPUT_CINT;
+			src->output = ILR_INTTGT_CINT;
 			src->nomask = true;
 			src->destmask = 0;
 
@@ -550,7 +614,7 @@ static inline void write_IRQreg_idr(struct openpic *opp, int n_IRQ,
 					src->destmask |= 1UL << i;
 			}
 		} else {
-			src->output = OPENPIC_OUTPUT_INT;
+			src->output = ILR_INTTGT_INT;
 			src->nomask = false;
 			src->destmask = src->idr & normal_mask;
 		}
@@ -565,7 +629,7 @@ static inline void write_IRQreg_ilr(struct openpic *opp, int n_IRQ,
 	if (opp->flags & OPENPIC_FLAG_ILR) {
 		struct irq_source *src = &opp->src[n_IRQ];
 
-		src->output = inttgt_to_output(val & ILR_INTTGT_MASK);
+		src->output = val & ILR_INTTGT_MASK;
 		pr_debug("Set ILR %d to 0x%08x, output %d\n", n_IRQ, src->idr,
 			src->output);
 
@@ -614,34 +678,23 @@ static inline void write_IRQreg_ivpr(struct openpic *opp, int n_IRQ,
 
 static void openpic_gcr_write(struct openpic *opp, uint64_t val)
 {
-	bool mpic_proxy = false;
-
 	if (val & GCR_RESET) {
-		openpic_reset(&opp->busdev.qdev);
+		openpic_reset(opp);
 		return;
 	}
 
 	opp->gcr &= ~opp->mpic_mode_mask;
 	opp->gcr |= val & opp->mpic_mode_mask;
-
-	/* Set external proxy mode */
-	if ((val & opp->mpic_mode_mask) == GCR_MODE_PROXY)
-		mpic_proxy = true;
-
-	ppce500_set_mpic_proxy(mpic_proxy);
 }
 
-static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned len)
+static int openpic_gbl_write(void *opaque, gpa_t addr, u32 val)
 {
 	struct openpic *opp = opaque;
-	struct irq_dest *dst;
-	int idx;
+	int err = 0;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= %08x\n", __func__, addr, val);
 	if (addr & 0xF)
-		return;
+		return 0;
 
 	switch (addr) {
 	case 0x00:	/* Block Revision Register1 (BRR1) is Readonly */
@@ -654,7 +707,8 @@ static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
 	case 0x90:
 	case 0xA0:
 	case 0xB0:
-		openpic_cpu_write_internal(opp, addr, val, get_current_cpu());
+		err = openpic_cpu_write_internal(opp, addr, val,
+						 get_current_cpu());
 		break;
 	case 0x1000:		/* FRR */
 		break;
@@ -664,21 +718,11 @@ static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
 	case 0x1080:		/* VIR */
 		break;
 	case 0x1090:		/* PIR */
-		for (idx = 0; idx < opp->nb_cpus; idx++) {
-			if ((val & (1 << idx)) && !(opp->pir & (1 << idx))) {
-				pr_debug("Raise OpenPIC RESET output for CPU %d\n",
-					idx);
-				dst = &opp->dst[idx];
-				qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_RESET]);
-			} else if (!(val & (1 << idx)) &&
-				   (opp->pir & (1 << idx))) {
-				pr_debug("Lower OpenPIC RESET output for CPU %d\n",
-					idx);
-				dst = &opp->dst[idx];
-				qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_RESET]);
-			}
-		}
-		opp->pir = val;
+		/*
+		 * This register is used to reset a CPU core --
+		 * let userspace handle it.
+		 */
+		err = -ENXIO;
 		break;
 	case 0x10A0:		/* IPI_IVPR */
 	case 0x10B0:
@@ -695,21 +739,25 @@ static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
 	default:
 		break;
 	}
+
+	return err;
 }
 
-static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len)
+static int openpic_gbl_read(void *opaque, gpa_t addr, u32 *ptr)
 {
 	struct openpic *opp = opaque;
-	uint32_t retval;
+	u32 retval;
+	int err = 0;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 	retval = 0xFFFFFFFF;
 	if (addr & 0xF)
-		return retval;
+		goto out;
 
 	switch (addr) {
 	case 0x1000:		/* FRR */
 		retval = opp->frr;
+		retval |= (opp->nb_cpus - 1) << FRR_NCPU_SHIFT;
 		break;
 	case 0x1020:		/* GCR */
 		retval = opp->gcr;
@@ -731,8 +779,8 @@ static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len)
 	case 0x90:
 	case 0xA0:
 	case 0xB0:
-		retval =
-		    openpic_cpu_read_internal(opp, addr, get_current_cpu());
+		err = openpic_cpu_read_internal(opp, addr,
+			&retval, get_current_cpu());
 		break;
 	case 0x10A0:		/* IPI_IVPR */
 	case 0x10B0:
@@ -750,28 +798,28 @@ static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len)
 	default:
 		break;
 	}
-	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
-	return retval;
+out:
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
+	*ptr = retval;
+	return err;
 }
 
-static void openpic_tmr_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned len)
+static int openpic_tmr_write(void *opaque, gpa_t addr, u32 val)
 {
 	struct openpic *opp = opaque;
 	int idx;
 
 	addr += 0x10f0;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= %08x\n", __func__, addr, val);
 	if (addr & 0xF)
-		return;
+		return 0;
 
 	if (addr == 0x10f0) {
 		/* TFRR */
 		opp->tfrr = val;
-		return;
+		return 0;
 	}
 
 	idx = (addr >> 6) & 0x3;
@@ -795,15 +843,17 @@ static void openpic_tmr_write(void *opaque, gpa_t addr, uint64_t val,
 		write_IRQreg_idr(opp, opp->irq_tim0 + idx, val);
 		break;
 	}
+
+	return 0;
 }
 
-static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len)
+static int openpic_tmr_read(void *opaque, gpa_t addr, u32 *ptr)
 {
 	struct openpic *opp = opaque;
 	uint32_t retval = -1;
 	int idx;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 	if (addr & 0xF)
 		goto out;
 
@@ -813,6 +863,7 @@ static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len)
 		retval = opp->tfrr;
 		goto out;
 	}
+
 	switch (addr & 0x30) {
 	case 0x00:		/* TCCR */
 		retval = opp->timers[idx].tccr;
@@ -830,18 +881,16 @@ static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len)
 
 out:
 	pr_debug("%s: => 0x%08x\n", __func__, retval);
-
-	return retval;
+	*ptr = retval;
+	return 0;
 }
 
-static void openpic_src_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned len)
+static int openpic_src_write(void *opaque, gpa_t addr, u32 val)
 {
 	struct openpic *opp = opaque;
 	int idx;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= %08x\n", __func__, addr, val);
 
 	addr = addr & 0xffff;
 	idx = addr >> 5;
@@ -857,15 +906,17 @@ static void openpic_src_write(void *opaque, gpa_t addr, uint64_t val,
 		write_IRQreg_ilr(opp, idx, val);
 		break;
 	}
+
+	return 0;
 }
 
-static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
+static int openpic_src_read(void *opaque, gpa_t addr, u32 *ptr)
 {
 	struct openpic *opp = opaque;
 	uint32_t retval;
 	int idx;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 	retval = 0xFFFFFFFF;
 
 	addr = addr & 0xffff;
@@ -884,20 +935,19 @@ static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
 	}
 
 	pr_debug("%s: => 0x%08x\n", __func__, retval);
-	return retval;
+	*ptr = retval;
+	return 0;
 }
 
-static void openpic_msi_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned size)
+static int openpic_msi_write(void *opaque, gpa_t addr, u32 val)
 {
 	struct openpic *opp = opaque;
 	int idx = opp->irq_msi;
 	int srs, ibs;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= 0x%08x\n", __func__, addr, val);
 	if (addr & 0xF)
-		return;
+		return 0;
 
 	switch (addr) {
 	case MSIIR_OFFSET:
@@ -911,17 +961,19 @@ static void openpic_msi_write(void *opaque, gpa_t addr, uint64_t val,
 		/* most registers are read-only, thus ignored */
 		break;
 	}
+
+	return 0;
 }
 
-static uint64_t openpic_msi_read(void *opaque, gpa_t addr, unsigned size)
+static int openpic_msi_read(void *opaque, gpa_t addr, u32 *ptr)
 {
 	struct openpic *opp = opaque;
-	uint64_t r = 0;
+	uint32_t r = 0;
 	int i, srs;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 	if (addr & 0xF)
-		return -1;
+		return -ENXIO;
 
 	srs = addr >> 4;
 
@@ -945,45 +997,47 @@ static uint64_t openpic_msi_read(void *opaque, gpa_t addr, unsigned size)
 		break;
 	}
 
-	return r;
+	pr_debug("%s: => 0x%08x\n", __func__, r);
+	*ptr = r;
+	return 0;
 }
 
-static uint64_t openpic_summary_read(void *opaque, gpa_t addr, unsigned size)
+static int openpic_summary_read(void *opaque, gpa_t addr, u32 *ptr)
 {
-	uint64_t r = 0;
+	uint32_t r = 0;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 
 	/* TODO: EISR/EIMR */
 
-	return r;
+	*ptr = r;
+	return 0;
 }
 
-static void openpic_summary_write(void *opaque, gpa_t addr, uint64_t val,
-				  unsigned size)
+static int openpic_summary_write(void *opaque, gpa_t addr, u32 val)
 {
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= 0x%08x\n", __func__, addr, val);
 
 	/* TODO: EISR/EIMR */
+	return 0;
 }
 
-static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
-				       uint32_t val, int idx)
+static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
+				      u32 val, int idx)
 {
 	struct openpic *opp = opaque;
 	struct irq_source *src;
 	struct irq_dest *dst;
 	int s_IRQ, n_IRQ;
 
-	pr_debug("%s: cpu %d addr %#" HWADDR_PRIx " <= 0x%08x\n", __func__, idx,
+	pr_debug("%s: cpu %d addr %#llx <= 0x%08x\n", __func__, idx,
 		addr, val);
 
 	if (idx < 0)
-		return;
+		return 0;
 
 	if (addr & 0xF)
-		return;
+		return 0;
 
 	dst = &opp->dst[idx];
 	addr &= 0xFF0;
@@ -1008,11 +1062,11 @@ static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
 		if (dst->raised.priority <= dst->ctpr) {
 			pr_debug("%s: Lower OpenPIC INT output cpu %d due to ctpr\n",
 				__func__, idx);
-			qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_lower(opp, dst, ILR_INTTGT_INT);
 		} else if (dst->raised.priority > dst->servicing.priority) {
 			pr_debug("%s: Raise OpenPIC INT output cpu %d irq %d\n",
 				__func__, idx, dst->raised.next);
-			qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_raise(opp, dst, ILR_INTTGT_INT);
 		}
 
 		break;
@@ -1043,18 +1097,22 @@ static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
 		     IVPR_PRIORITY(src->ivpr) > dst->servicing.priority)) {
 			pr_debug("Raise OpenPIC INT output cpu %d irq %d\n",
 				idx, n_IRQ);
-			qemu_irq_raise(opp->dst[idx].irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_raise(opp, dst, ILR_INTTGT_INT);
 		}
 		break;
 	default:
 		break;
 	}
+
+	return 0;
 }
 
-static void openpic_cpu_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned len)
+static int openpic_cpu_write(void *opaque, gpa_t addr, u32 val)
 {
-	openpic_cpu_write_internal(opaque, addr, val, (addr & 0x1f000) >> 12);
+	struct openpic *opp = opaque;
+
+	return openpic_cpu_write_internal(opp, addr, val,
+					 (addr & 0x1f000) >> 12);
 }
 
 static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
@@ -1064,7 +1122,7 @@ static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
 	int retval, irq;
 
 	pr_debug("Lower OpenPIC INT output\n");
-	qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
+	mpic_irq_lower(opp, dst, ILR_INTTGT_INT);
 
 	irq = IRQ_get_next(opp, &dst->raised);
 	pr_debug("IACK: irq=%d\n", irq);
@@ -1107,20 +1165,21 @@ static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
 	return retval;
 }
 
-static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx)
+static int openpic_cpu_read_internal(void *opaque, gpa_t addr,
+				     u32 *ptr, int idx)
 {
 	struct openpic *opp = opaque;
 	struct irq_dest *dst;
 	uint32_t retval;
 
-	pr_debug("%s: cpu %d addr %#" HWADDR_PRIx "\n", __func__, idx, addr);
+	pr_debug("%s: cpu %d addr %#llx\n", __func__, idx, addr);
 	retval = 0xFFFFFFFF;
 
 	if (idx < 0)
-		return retval;
+		goto out;
 
 	if (addr & 0xF)
-		return retval;
+		goto out;
 
 	dst = &opp->dst[idx];
 	addr &= 0xFF0;
@@ -1142,49 +1201,67 @@ static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx)
 	}
 	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
-	return retval;
+out:
+	*ptr = retval;
+	return 0;
 }
 
-static uint64_t openpic_cpu_read(void *opaque, gpa_t addr, unsigned len)
+static int openpic_cpu_read(void *opaque, gpa_t addr, u32 *ptr)
 {
-	return openpic_cpu_read_internal(opaque, addr, (addr & 0x1f000) >> 12);
+	struct openpic *opp = opaque;
+
+	return openpic_cpu_read_internal(opp, addr, ptr,
+					 (addr & 0x1f000) >> 12);
 }
 
-static const struct kvm_io_device_ops openpic_glb_ops_be = {
+struct mem_reg {
+	struct list_head list;
+	int (*read)(void *opaque, gpa_t addr, u32 *ptr);
+	int (*write)(void *opaque, gpa_t addr, u32 val);
+	gpa_t start_addr;
+	int size;
+};
+
+static struct mem_reg openpic_gbl_mmio = {
 	.write = openpic_gbl_write,
 	.read = openpic_gbl_read,
+	.start_addr = OPENPIC_GLB_REG_START,
+	.size = OPENPIC_GLB_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_tmr_ops_be = {
+static struct mem_reg openpic_tmr_mmio = {
 	.write = openpic_tmr_write,
 	.read = openpic_tmr_read,
+	.start_addr = OPENPIC_TMR_REG_START,
+	.size = OPENPIC_TMR_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_cpu_ops_be = {
+static struct mem_reg openpic_cpu_mmio = {
 	.write = openpic_cpu_write,
 	.read = openpic_cpu_read,
+	.start_addr = OPENPIC_CPU_REG_START,
+	.size = OPENPIC_CPU_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_src_ops_be = {
+static struct mem_reg openpic_src_mmio = {
 	.write = openpic_src_write,
 	.read = openpic_src_read,
+	.start_addr = OPENPIC_SRC_REG_START,
+	.size = OPENPIC_SRC_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_msi_ops_be = {
+static struct mem_reg openpic_msi_mmio = {
 	.read = openpic_msi_read,
 	.write = openpic_msi_write,
+	.start_addr = OPENPIC_MSI_REG_START,
+	.size = OPENPIC_MSI_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_summary_ops_be = {
+static struct mem_reg openpic_summary_mmio = {
 	.read = openpic_summary_read,
 	.write = openpic_summary_write,
-};
-
-struct mem_reg {
-	const char *name;
-	const struct kvm_io_device_ops *ops;
-	gpa_t start_addr;
-	int size;
+	.start_addr = OPENPIC_SUMMARY_REG_START,
+	.size = OPENPIC_SUMMARY_REG_SIZE,
 };
 
 static void fsl_common_init(struct openpic *opp)
@@ -1192,6 +1269,9 @@ static void fsl_common_init(struct openpic *opp)
 	int i;
 	int virq = MAX_SRC;
 
+	list_add(&openpic_msi_mmio.list, &opp->mmio_regions);
+	list_add(&openpic_summary_mmio.list, &opp->mmio_regions);
+
 	opp->vid = VID_REVISION_1_2;
 	opp->vir = VIR_GENERIC;
 	opp->vector_mask = 0xFFFF;
@@ -1205,11 +1285,10 @@ static void fsl_common_init(struct openpic *opp)
 	opp->irq_tim0 = virq;
 	virq += MAX_TMR;
 
-	assert(virq <= MAX_IRQ);
+	BUG_ON(virq > MAX_IRQ);
 
 	opp->irq_msi = 224;
 
-	msi_supported = true;
 	for (i = 0; i < opp->fsl->max_ext; i++)
 		opp->src[i].level = false;
 
@@ -1226,63 +1305,352 @@ static void fsl_common_init(struct openpic *opp)
 	}
 }
 
-static void map_list(struct openpic *opp, const struct mem_reg *list,
-		     int *count)
+static int kvm_mpic_read_internal(struct openpic *opp, gpa_t addr, u32 *ptr)
 {
-	while (list->name) {
-		assert(*count < ARRAY_SIZE(opp->sub_io_mem));
+	struct list_head *node;
 
-		memory_region_init_io(&opp->sub_io_mem[*count], list->ops, opp,
-				      list->name, list->size);
+	list_for_each(node, &opp->mmio_regions) {
+		struct mem_reg *mr = list_entry(node, struct mem_reg, list);
 
-		memory_region_add_subregion(&opp->mem, list->start_addr,
-					    &opp->sub_io_mem[*count]);
+		if (mr->start_addr > addr || addr >= mr->start_addr + mr->size)
+			continue;
 
-		(*count)++;
-		list++;
+		return mr->read(opp, addr - mr->start_addr, ptr);
 	}
+
+	return -ENXIO;
 }
 
-static int openpic_init(SysBusDevice *dev)
+static int kvm_mpic_write_internal(struct openpic *opp, gpa_t addr, u32 val)
 {
-	struct openpic *opp = FROM_SYSBUS(typeof(*opp), dev);
-	int i, j;
-	int list_count = 0;
-	static const struct mem_reg list_le[] = {
-		{"glb", &openpic_glb_ops_le,
-		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
-		{"tmr", &openpic_tmr_ops_le,
-		 OPENPIC_TMR_REG_START, OPENPIC_TMR_REG_SIZE},
-		{"src", &openpic_src_ops_le,
-		 OPENPIC_SRC_REG_START, OPENPIC_SRC_REG_SIZE},
-		{"cpu", &openpic_cpu_ops_le,
-		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
-		{NULL}
-	};
-	static const struct mem_reg list_be[] = {
-		{"glb", &openpic_glb_ops_be,
-		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
-		{"tmr", &openpic_tmr_ops_be,
-		 OPENPIC_TMR_REG_START, OPENPIC_TMR_REG_SIZE},
-		{"src", &openpic_src_ops_be,
-		 OPENPIC_SRC_REG_START, OPENPIC_SRC_REG_SIZE},
-		{"cpu", &openpic_cpu_ops_be,
-		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
-		{NULL}
-	};
-	static const struct mem_reg list_fsl[] = {
-		{"msi", &openpic_msi_ops_be,
-		 OPENPIC_MSI_REG_START, OPENPIC_MSI_REG_SIZE},
-		{"summary", &openpic_summary_ops_be,
-		 OPENPIC_SUMMARY_REG_START, OPENPIC_SUMMARY_REG_SIZE},
-		{NULL}
-	};
+	struct list_head *node;
+
+	list_for_each(node, &opp->mmio_regions) {
+		struct mem_reg *mr = list_entry(node, struct mem_reg, list);
+
+		if (mr->start_addr > addr || addr >= mr->start_addr + mr->size)
+			continue;
 
-	memory_region_init(&opp->mem, "openpic", 0x40000);
+		return mr->write(opp, addr - mr->start_addr, val);
+	}
+
+	return -ENXIO;
+}
+
+static int kvm_mpic_read(struct kvm_io_device *this, gpa_t addr,
+			 int len, void *ptr)
+{
+	struct openpic *opp = container_of(this, struct openpic, mmio);
+	int ret;
+	union {
+		u32 val;
+		u8 bytes[4];
+	} u;
+
+	if (addr & (len - 1)) {
+		pr_debug("%s: bad alignment %llx/%d\n",
+			 __func__, addr, len);
+		return -EINVAL;
+	}
+
+	spin_lock_irq(&opp->lock);
+	ret = kvm_mpic_read_internal(opp, addr - opp->reg_base, &u.val);
+	spin_unlock_irq(&opp->lock);
+
+	/*
+	 * Technically only 32-bit accesses are allowed, but be nice to
+	 * people dumping registers a byte at a time -- it works in real
+	 * hardware (reads only, not writes).
+	 */
+	if (len == 4) {
+		*(u32 *)ptr = u.val;
+		pr_debug("%s: addr %llx ret %d len 4 val %x\n",
+			 __func__, addr, ret, u.val);
+	} else if (len == 1) {
+		*(u8 *)ptr = u.bytes[addr & 3];
+		pr_debug("%s: addr %llx ret %d len 1 val %x\n",
+			 __func__, addr, ret, u.bytes[addr & 3]);
+	} else {
+		pr_debug("%s: bad length %d\n", __func__, len);
+		return -EINVAL;
+	}
+
+	return ret;
+}
+
+static int kvm_mpic_write(struct kvm_io_device *this, gpa_t addr,
+			  int len, const void *ptr)
+{
+	struct openpic *opp = container_of(this, struct openpic, mmio);
+	int ret;
+
+	if (len != 4) {
+		pr_debug("%s: bad length %d\n", __func__, len);
+		return -EOPNOTSUPP;
+	}
+	if (addr & 3) {
+		pr_debug("%s: bad alignment %llx/%d\n", __func__, addr, len);
+		return -EOPNOTSUPP;
+	}
+
+	spin_lock_irq(&opp->lock);
+	ret = kvm_mpic_write_internal(opp, addr - opp->reg_base,
+				      *(const u32 *)ptr);
+	spin_unlock_irq(&opp->lock);
+
+	pr_debug("%s: addr %llx ret %d val %x\n",
+		 __func__, addr, ret, *(const u32 *)ptr);
+
+	return ret;
+}
+
+static void kvm_mpic_dtor(struct kvm_io_device *this)
+{
+	struct openpic *opp = container_of(this, struct openpic, mmio);
+
+	opp->mmio_mapped = false;
+}
+
+static const struct kvm_io_device_ops mpic_mmio_ops = {
+	.read = kvm_mpic_read,
+	.write = kvm_mpic_write,
+	.destructor = kvm_mpic_dtor,
+};
+
+static void map_mmio(struct openpic *opp)
+{
+	BUG_ON(opp->mmio_mapped);
+	opp->mmio_mapped = true;
+
+	kvm_iodevice_init(&opp->mmio, &mpic_mmio_ops);
+
+	kvm_io_bus_register_dev(opp->kvm, KVM_MMIO_BUS,
+				opp->reg_base, OPENPIC_REG_SIZE,
+				&opp->mmio);
+}
+
+static void unmap_mmio(struct openpic *opp)
+{
+	if (opp->mmio_mapped) {
+		opp->mmio_mapped = false;
+		kvm_io_bus_unregister_dev(opp->kvm, KVM_MMIO_BUS, &opp->mmio);
+	}
+}
+
+static int set_base_addr(struct openpic *opp, struct kvm_device_attr *attr)
+{
+	u64 base;
+
+	if (copy_from_user(&base, (u64 __user *)(long)attr->addr, sizeof(u64)))
+		return -EFAULT;
+
+	if (base & 0x3ffff) {
+		pr_debug("kvm mpic %s: KVM_DEV_MPIC_BASE_ADDR %08llx not aligned\n",
+			 __func__, base);
+		return -EINVAL;
+	}
+
+	if (base == opp->reg_base)
+		return 0;
+
+	mutex_lock(&opp->kvm->slots_lock);
+
+	unmap_mmio(opp);
+	opp->reg_base = base;
+
+	pr_debug("kvm mpic %s: KVM_DEV_MPIC_BASE_ADDR %08llx\n",
+		 __func__, base);
+
+	if (base == 0)
+		goto out;
+
+	map_mmio(opp);
+
+out:
+	mutex_unlock(&opp->kvm->slots_lock);
+	return 0;
+}
+
+#define ATTR_SET		0
+#define ATTR_GET		1
+
+static int access_reg(struct openpic *opp, gpa_t addr, u32 *val, int type)
+{
+	int ret;
+
+	if (addr & 3)
+		return -ENXIO;
+
+	spin_lock_irq(&opp->lock);
+
+	if (type == ATTR_SET)
+		ret = kvm_mpic_write_internal(opp, addr, *val);
+	else
+		ret = kvm_mpic_read_internal(opp, addr, val);
+
+	spin_unlock_irq(&opp->lock);
+
+	pr_debug("%s: type %d addr %llx val %x\n", __func__, type, addr, *val);
+
+	return ret;
+}
+
+static int mpic_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
+{
+	struct openpic *opp = dev->private;
+	u32 attr32;
+
+	switch (attr->group) {
+	case KVM_DEV_MPIC_GRP_MISC:
+		switch (attr->attr) {
+		case KVM_DEV_MPIC_BASE_ADDR:
+			return set_base_addr(opp, attr);
+		}
+
+		break;
+
+	case KVM_DEV_MPIC_GRP_REGISTER:
+		if (get_user(attr32, (u32 __user *)(long)attr->addr))
+			return -EFAULT;
+
+		return access_reg(opp, attr->attr, &attr32, ATTR_SET);
+
+	case KVM_DEV_MPIC_GRP_IRQ_ACTIVE:
+		if (attr->attr > MAX_SRC)
+			return -EINVAL;
+
+		if (get_user(attr32, (u32 __user *)(long)attr->addr))
+			return -EFAULT;
+
+		if (attr32 != 0 && attr32 != 1)
+			return -EINVAL;
+
+		spin_lock_irq(&opp->lock);
+		openpic_set_irq(opp, attr->attr, attr32);
+		spin_unlock_irq(&opp->lock);
+		return 0;
+	}
+
+	return -ENXIO;
+}
+
+static int mpic_get_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
+{
+	struct openpic *opp = dev->private;
+	u64 attr64;
+	u32 attr32;
+	int ret;
+
+	switch (attr->group) {
+	case KVM_DEV_MPIC_GRP_MISC:
+		switch (attr->attr) {
+		case KVM_DEV_MPIC_BASE_ADDR:
+			mutex_lock(&opp->kvm->slots_lock);
+			attr64 = opp->reg_base;
+			mutex_unlock(&opp->kvm->slots_lock);
+
+			if (copy_to_user((u64 __user *)(long)attr->addr,
+					 &attr64, sizeof(u64)))
+				return -EFAULT;
+
+			return 0;
+		}
+
+		break;
+
+	case KVM_DEV_MPIC_GRP_REGISTER:
+		ret = access_reg(opp, attr->attr, &attr32, ATTR_GET);
+		if (ret)
+			return ret;
+
+		if (put_user(attr32, (u32 __user *)(long)attr->addr))
+			return -EFAULT;
+
+		return 0;
+
+	case KVM_DEV_MPIC_GRP_IRQ_ACTIVE:
+		if (attr->attr > MAX_SRC)
+			return -EINVAL;
+
+		spin_lock_irq(&opp->lock);
+		attr32 = opp->src[attr->attr].pending;
+		spin_unlock_irq(&opp->lock);
+
+		if (put_user(attr32, (u32 __user *)(long)attr->addr))
+			return -EFAULT;
+
+		return 0;
+	}
+
+	return -ENXIO;
+}
+
+static int mpic_has_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
+{
+	switch (attr->group) {
+	case KVM_DEV_MPIC_GRP_MISC:
+		switch (attr->attr) {
+		case KVM_DEV_MPIC_BASE_ADDR:
+			return 0;
+		}
+
+		break;
+
+	case KVM_DEV_MPIC_GRP_REGISTER:
+		return 0;
+
+	case KVM_DEV_MPIC_GRP_IRQ_ACTIVE:
+		if (attr->attr > MAX_SRC)
+			break;
+
+		return 0;
+	}
+
+	return -ENXIO;
+}
+
+static void mpic_destroy(struct kvm_device *dev)
+{
+	struct openpic *opp = dev->private;
+
+	if (opp->mmio_mapped) {
+		/*
+		 * Normally we get unmapped by kvm_io_bus_destroy(),
+		 * which happens before the VCPUs release their references.
+		 *
+		 * Thus, we should only get here if no VCPUs took a reference
+		 * to us in the first place.
+		 */
+		WARN_ON(opp->nb_cpus != 0);
+		unmap_mmio(opp);
+	}
+
+	kfree(opp);
+}
+
+static int mpic_create(struct kvm_device *dev, u32 type)
+{
+	struct openpic *opp;
+	int ret;
+
+	opp = kzalloc(sizeof(struct openpic), GFP_KERNEL);
+	if (!opp)
+		return -ENOMEM;
+
+	dev->private = opp;
+	opp->kvm = dev->kvm;
+	opp->dev = dev;
+	opp->model = type;
+	spin_lock_init(&opp->lock);
+
+	INIT_LIST_HEAD(&opp->mmio_regions);
+	list_add(&openpic_gbl_mmio.list, &opp->mmio_regions);
+	list_add(&openpic_tmr_mmio.list, &opp->mmio_regions);
+	list_add(&openpic_src_mmio.list, &opp->mmio_regions);
+	list_add(&openpic_cpu_mmio.list, &opp->mmio_regions);
 
 	switch (opp->model) {
-	case OPENPIC_MODEL_FSL_MPIC_20:
-	default:
+	case KVM_DEV_TYPE_FSL_MPIC_20:
 		opp->fsl = &fsl_mpic_20;
 		opp->brr1 = 0x00400200;
 		opp->flags |= OPENPIC_FLAG_IDR_CRIT;
@@ -1290,12 +1658,10 @@ static int openpic_init(SysBusDevice *dev)
 		opp->mpic_mode_mask = GCR_MODE_MIXED;
 
 		fsl_common_init(opp);
-		map_list(opp, list_be, &list_count);
-		map_list(opp, list_fsl, &list_count);
 
 		break;
 
-	case OPENPIC_MODEL_FSL_MPIC_42:
+	case KVM_DEV_TYPE_FSL_MPIC_42:
 		opp->fsl = &fsl_mpic_42;
 		opp->brr1 = 0x00400402;
 		opp->flags |= OPENPIC_FLAG_ILR;
@@ -1303,11 +1669,27 @@ static int openpic_init(SysBusDevice *dev)
 		opp->mpic_mode_mask = GCR_MODE_PROXY;
 
 		fsl_common_init(opp);
-		map_list(opp, list_be, &list_count);
-		map_list(opp, list_fsl, &list_count);
 
 		break;
+
+	default:
+		ret = -ENODEV;
+		goto err;
 	}
 
+	openpic_reset(opp);
 	return 0;
+
+err:
+	kfree(opp);
+	return ret;
 }
+
+struct kvm_device_ops kvm_mpic_ops = {
+	.name = "kvm-mpic",
+	.create = mpic_create,
+	.destroy = mpic_destroy,
+	.set_attr = mpic_set_attr,
+	.get_attr = mpic_get_attr,
+	.has_attr = mpic_has_attr,
+};
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index a822659..3cad714 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -317,6 +317,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_ENABLE_CAP:
 	case KVM_CAP_ONE_REG:
 	case KVM_CAP_IOEVENTFD:
+	case KVM_CAP_DEVICE_CTRL:
 		r = 1;
 		break;
 #ifndef CONFIG_KVM_BOOK3S_64_HV
@@ -768,7 +769,10 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 		break;
 	case KVM_CAP_PPC_EPR:
 		r = 0;
-		vcpu->arch.epr_enabled = cap->args[0];
+		if (cap->args[0])
+			vcpu->arch.epr_flags |= KVMPPC_EPR_USER;
+		else
+			vcpu->arch.epr_flags &= ~KVMPPC_EPR_USER;
 		break;
 #ifdef CONFIG_BOOKE
 	case KVM_CAP_PPC_BOOKE_WATCHDOG:
@@ -914,6 +918,7 @@ static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
 long kvm_arch_vm_ioctl(struct file *filp,
                        unsigned int ioctl, unsigned long arg)
 {
+	struct kvm *kvm __maybe_unused = filp->private_data;
 	void __user *argp = (void __user *)arg;
 	long r;
 
@@ -932,7 +937,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 #ifdef CONFIG_PPC_BOOK3S_64
 	case KVM_CREATE_SPAPR_TCE: {
 		struct kvm_create_spapr_tce create_tce;
-		struct kvm *kvm = filp->private_data;
 
 		r = -EFAULT;
 		if (copy_from_user(&create_tce, argp, sizeof(create_tce)))
@@ -944,7 +948,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 
 #ifdef CONFIG_KVM_BOOK3S_64_HV
 	case KVM_ALLOCATE_RMA: {
-		struct kvm *kvm = filp->private_data;
 		struct kvm_allocate_rma rma;
 
 		r = kvm_vm_ioctl_allocate_rma(kvm, &rma);
@@ -954,7 +957,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	}
 
 	case KVM_PPC_ALLOCATE_HTAB: {
-		struct kvm *kvm = filp->private_data;
 		u32 htab_order;
 
 		r = -EFAULT;
@@ -971,7 +973,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	}
 
 	case KVM_PPC_GET_HTAB_FD: {
-		struct kvm *kvm = filp->private_data;
 		struct kvm_get_htab_fd ghf;
 
 		r = -EFAULT;
@@ -984,7 +985,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 
 #ifdef CONFIG_PPC_BOOK3S_64
 	case KVM_PPC_GET_SMMU_INFO: {
-		struct kvm *kvm = filp->private_data;
 		struct kvm_ppc_smmu_info info;
 
 		memset(&info, 0, sizeof(info));
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 6dab6b5..feffbda 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1099,6 +1099,8 @@ void kvm_device_get(struct kvm_device *dev);
 void kvm_device_put(struct kvm_device *dev);
 struct kvm_device *kvm_device_from_filp(struct file *filp);
 
+extern struct kvm_device_ops kvm_mpic_ops;
+
 #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
 
 static inline void kvm_vcpu_set_in_spin_loop(struct kvm_vcpu *vcpu, bool val)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index be15aff..568d65d 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -925,6 +925,9 @@ struct kvm_device_attr {
 	__u64	addr;		/* userspace address of attr data */
 };
 
+#define KVM_DEV_TYPE_FSL_MPIC_20	1
+#define KVM_DEV_TYPE_FSL_MPIC_42	2
+
 /* ioctl for vm fd */
 #define KVM_CREATE_DEVICE	  _IOWR(KVMIO,  0xe0, struct kvm_create_device)
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 5f0d78c..f6cd14d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2238,6 +2238,12 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
 	int ret;
 
 	switch (cd->type) {
+#ifdef CONFIG_KVM_MPIC
+	case KVM_DEV_TYPE_FSL_MPIC_20:
+	case KVM_DEV_TYPE_FSL_MPIC_42:
+		ops = &kvm_mpic_ops;
+		break;
+#endif
 	default:
 		return -ENODEV;
 	}
-- 
1.6.0.2

* [PATCH 13/17] kvm/ppc/mpic: in-kernel MPIC emulation
@ 2013-04-19 14:06   ` Alexander Graf
  0 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

Hook the MPIC code up to the KVM interfaces, add locking, etc.

Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: add stub function for kvmppc_mpic_set_epr, non-booke, 64bit]
Signed-off-by: Alexander Graf <agraf@suse.de>

---

v2 -> v3:

  - fix pr_debug again
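
For anyone poking at the new attributes from userspace, the constraints
described in mpic.txt can be sketched in plain C. This is only an
illustration and not part of the patch: the helper names are hypothetical,
the 0x40000 window size mirrors OPENPIC_REG_SIZE, and the checks are
modelled on set_base_addr() and access_reg() as posted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size of the MPIC register window; mirrors OPENPIC_REG_SIZE in the patch. */
#define MPIC_REG_SIZE 0x40000ULL

/*
 * Hypothetical helper: true if "base" is naturally aligned to the
 * 256 KiB register window, as KVM_DEV_MPIC_BASE_ADDR requires.
 * Zero is aligned as well; the documentation says it disables the mapping.
 */
static bool mpic_base_addr_ok(uint64_t base)
{
	return (base & (MPIC_REG_SIZE - 1)) == 0;
}

/*
 * Hypothetical helper: true if a KVM_DEV_MPIC_GRP_REGISTER offset is
 * 4-byte aligned. Whether a register actually lives at that offset is
 * a separate question that this sketch does not model.
 */
static bool mpic_reg_offset_ok(uint64_t off)
{
	return (off & 3) == 0;
}

/*
 * Hypothetical helper: IRQ number of a standard source, given the byte
 * offset of its IVPR from EIVPR0. The documentation defines the IRQ
 * number as that offset divided by 32.
 */
static unsigned int mpic_irq_from_ivpr_offset(uint32_t ivpr_off)
{
	assert(ivpr_off % 32 == 0);
	return ivpr_off / 32;
}
```

A caller would run mpic_base_addr_ok() on the value it intends to pass
for KVM_DEV_MPIC_BASE_ADDR, and use mpic_irq_from_ivpr_offset() to derive
the "attr" value for KVM_DEV_MPIC_GRP_IRQ_ACTIVE.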
---
 Documentation/virtual/kvm/devices/mpic.txt |   37 ++
 arch/powerpc/include/asm/kvm_host.h        |    8 +-
 arch/powerpc/include/asm/kvm_ppc.h         |   17 +
 arch/powerpc/include/uapi/asm/kvm.h        |    7 +
 arch/powerpc/kvm/Kconfig                   |    9 +
 arch/powerpc/kvm/Makefile                  |    2 +
 arch/powerpc/kvm/booke.c                   |    8 +-
 arch/powerpc/kvm/mpic.c                    |  762 +++++++++++++++++++++-------
 arch/powerpc/kvm/powerpc.c                 |   12 +-
 include/linux/kvm_host.h                   |    2 +
 include/uapi/linux/kvm.h                   |    3 +
 virt/kvm/kvm_main.c                        |    6 +
 12 files changed, 673 insertions(+), 200 deletions(-)
 create mode 100644 Documentation/virtual/kvm/devices/mpic.txt

diff --git a/Documentation/virtual/kvm/devices/mpic.txt b/Documentation/virtual/kvm/devices/mpic.txt
new file mode 100644
index 0000000..ce98e32
--- /dev/null
+++ b/Documentation/virtual/kvm/devices/mpic.txt
@@ -0,0 +1,37 @@
+MPIC interrupt controller
+=========================
+
+Device types supported:
+  KVM_DEV_TYPE_FSL_MPIC_20     Freescale MPIC v2.0
+  KVM_DEV_TYPE_FSL_MPIC_42     Freescale MPIC v4.2
+
+Only one MPIC instance, of any type, may be instantiated.  The created
+MPIC will act as the system interrupt controller, connecting to each
+vcpu's interrupt inputs.
+
+Groups:
+  KVM_DEV_MPIC_GRP_MISC
+  Attributes:
+    KVM_DEV_MPIC_BASE_ADDR (rw, 64-bit)
+      Base address of the 256 KiB MPIC register space.  Must be
+      naturally aligned.  A value of zero disables the mapping.
+      Reset value is zero.
+
+  KVM_DEV_MPIC_GRP_REGISTER (rw, 32-bit)
+    Access an MPIC register, as if the access were made from the guest.
+    "attr" is the byte offset into the MPIC register space.  Accesses
+    must be 4-byte aligned.
+
+    MSIs may be signaled by using this attribute group to write
+    to the relevant MSIIR.
+
+  KVM_DEV_MPIC_GRP_IRQ_ACTIVE (rw, 32-bit)
+    IRQ input line for each standard openpic source.  0 is inactive and 1
+    is active, regardless of interrupt sense.
+
+    For edge-triggered interrupts:  Writing 1 is considered an activating
+    edge, and writing 0 is ignored.  Reading returns 1 if a previously
+    signaled edge has not been acknowledged, and 0 otherwise.
+
+    "attr" is the IRQ number.  IRQ numbers for standard sources are the
+    byte offset of the relevant IVPR from EIVPR0, divided by 32.
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index e34f8fe..7e7aef9 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -359,6 +359,11 @@ struct kvmppc_slb {
 #define KVMPPC_BOOKE_MAX_IAC	4
 #define KVMPPC_BOOKE_MAX_DAC	2
 
+/* KVMPPC_EPR_USER takes precedence over KVMPPC_EPR_KERNEL */
+#define KVMPPC_EPR_NONE		0 /* EPR not supported */
+#define KVMPPC_EPR_USER		1 /* exit to userspace to fill EPR */
+#define KVMPPC_EPR_KERNEL	2 /* in-kernel irqchip */
+
 struct kvmppc_booke_debug_reg {
 	u32 dbcr0;
 	u32 dbcr1;
@@ -522,7 +527,7 @@ struct kvm_vcpu_arch {
 	u8 sane;
 	u8 cpu_type;
 	u8 hcall_needed;
-	u8 epr_enabled;
+	u8 epr_flags; /* KVMPPC_EPR_xxx */
 	u8 epr_needed;
 
 	u32 cpr0_cfgaddr; /* holds the last set cpr0_cfgaddr */
@@ -589,5 +594,6 @@ struct kvm_vcpu_arch {
 #define KVM_MMIO_REG_FQPR	0x0060
 
 #define __KVM_HAVE_ARCH_WQP
+#define __KVM_HAVE_CREATE_DEVICE
 
 #endif /* __POWERPC_KVM_HOST_H__ */
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index f589307..0b86604 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -164,6 +164,8 @@ extern int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu);
 
 extern int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *);
 
+int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu, struct kvm_interrupt *irq);
+
 /*
  * Cuts out inst bits with ordering according to spec.
  * That means the leftmost bit is zero. All given bits are included.
@@ -245,6 +247,9 @@ int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id, union kvmppc_one_reg *);
 
 void kvmppc_set_pid(struct kvm_vcpu *vcpu, u32 pid);
 
+struct openpic;
+void kvmppc_mpic_put(struct openpic *opp);
+
 #ifdef CONFIG_KVM_BOOK3S_64_HV
 static inline void kvmppc_set_xics_phys(int cpu, unsigned long addr)
 {
@@ -270,6 +275,18 @@ static inline void kvmppc_set_epr(struct kvm_vcpu *vcpu, u32 epr)
 #endif
 }
 
+#ifdef CONFIG_KVM_MPIC
+
+void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu);
+
+#else
+
+static inline void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu)
+{
+}
+
+#endif /* CONFIG_KVM_MPIC */
+
 int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
 			      struct kvm_config_tlb *cfg);
 int kvm_vcpu_ioctl_dirty_tlb(struct kvm_vcpu *vcpu,
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index c2ff99c..36be2fe 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -426,4 +426,11 @@ struct kvm_get_htab_header {
 /* Debugging: Special instruction for software breakpoint */
 #define KVM_REG_PPC_DEBUG_INST	(KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x8b)
 
+/* Device control API: PPC-specific devices */
+#define KVM_DEV_MPIC_GRP_MISC		1
+#define   KVM_DEV_MPIC_BASE_ADDR	0	/* 64-bit */
+
+#define KVM_DEV_MPIC_GRP_REGISTER	2	/* 32-bit */
+#define KVM_DEV_MPIC_GRP_IRQ_ACTIVE	3	/* 32-bit */
+
 #endif /* __LINUX_KVM_POWERPC_H */
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 63c67ec..938a729 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -151,6 +151,15 @@ config KVM_E500MC
 
 	  If unsure, say N.
 
+config KVM_MPIC
+	bool "KVM in-kernel MPIC emulation"
+	depends on KVM
+	help
+	  Enable support for emulating MPIC devices inside the
+          host kernel, rather than relying on userspace to emulate.
+          Currently, support is limited to certain versions of
+          Freescale's MPIC implementation.
+
 source drivers/vhost/Kconfig
 
 endif # VIRTUALIZATION
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index b772ede..4a2277a 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -103,6 +103,8 @@ kvm-book3s_32-objs := \
 	book3s_32_mmu.o
 kvm-objs-$(CONFIG_KVM_BOOK3S_32) := $(kvm-book3s_32-objs)
 
+kvm-objs-$(CONFIG_KVM_MPIC) += mpic.o
+
 kvm-objs := $(kvm-objs-m) $(kvm-objs-y)
 
 obj-$(CONFIG_KVM_440) += kvm.o
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index a49a68a..cff53d4 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -346,7 +346,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 		keep_irq = true;
 	}
 
-	if ((priority == BOOKE_IRQPRIO_EXTERNAL) && vcpu->arch.epr_enabled)
+	if ((priority == BOOKE_IRQPRIO_EXTERNAL) && vcpu->arch.epr_flags)
 		update_epr = true;
 
 	switch (priority) {
@@ -427,8 +427,10 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 			set_guest_esr(vcpu, vcpu->arch.queued_esr);
 		if (update_dear == true)
 			set_guest_dear(vcpu, vcpu->arch.queued_dear);
-		if (update_epr == true)
-			kvm_make_request(KVM_REQ_EPR_EXIT, vcpu);
+		if (update_epr == true) {
+			if (vcpu->arch.epr_flags & KVMPPC_EPR_USER)
+				kvm_make_request(KVM_REQ_EPR_EXIT, vcpu);
+		}
 
 		new_msr &= msr_mask;
 #if defined(CONFIG_64BIT)
diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index 1df67ae..cb451b9 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -23,6 +23,19 @@
  * THE SOFTWARE.
  */
 
+#include <linux/slab.h>
+#include <linux/mutex.h>
+#include <linux/kvm_host.h>
+#include <linux/errno.h>
+#include <linux/fs.h>
+#include <linux/anon_inodes.h>
+#include <asm/uaccess.h>
+#include <asm/mpic.h>
+#include <asm/kvm_para.h>
+#include <asm/kvm_host.h>
+#include <asm/kvm_ppc.h>
+#include "iodev.h"
+
 #define MAX_CPU     32
 #define MAX_SRC     256
 #define MAX_TMR     4
@@ -36,6 +49,7 @@
 #define OPENPIC_FLAG_ILR          (2 << 0)
 
 /* OpenPIC address map */
+#define OPENPIC_REG_SIZE             0x40000
 #define OPENPIC_GLB_REG_START        0x0
 #define OPENPIC_GLB_REG_SIZE         0x10F0
 #define OPENPIC_TMR_REG_START        0x10F0
@@ -89,6 +103,7 @@ static struct fsl_mpic_info fsl_mpic_42 = {
 #define ILR_INTTGT_INT    0x00
 #define ILR_INTTGT_CINT   0x01	/* critical */
 #define ILR_INTTGT_MCP    0x02	/* machine check */
+#define NUM_OUTPUTS       3
 
 #define MSIIR_OFFSET       0x140
 #define MSIIR_SRS_SHIFT    29
@@ -98,18 +113,19 @@ static struct fsl_mpic_info fsl_mpic_42 = {
 
 static int get_current_cpu(void)
 {
-	CPUState *cpu_single_cpu;
-
-	if (!cpu_single_env)
-		return -1;
-
-	cpu_single_cpu = ENV_GET_CPU(cpu_single_env);
-	return cpu_single_cpu->cpu_index;
+#if defined(CONFIG_KVM) && defined(CONFIG_BOOKE)
+	struct kvm_vcpu *vcpu = current->thread.kvm_vcpu;
+	return vcpu ? vcpu->vcpu_id : -1;
+#else
+	/* XXX */
+	return -1;
+#endif
 }
 
-static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx);
-static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
-				       uint32_t val, int idx);
+static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
+				      u32 val, int idx);
+static int openpic_cpu_read_internal(void *opaque, gpa_t addr,
+				     u32 *ptr, int idx);
 
 enum irq_type {
 	IRQ_TYPE_NORMAL = 0,
@@ -131,7 +147,7 @@ struct irq_source {
 	uint32_t idr;		/* IRQ destination register */
 	uint32_t destmask;	/* bitmap of CPU destinations */
 	int last_cpu;
-	int output;		/* IRQ level, e.g. OPENPIC_OUTPUT_INT */
+	int output;		/* IRQ level, e.g. ILR_INTTGT_INT */
 	int pending;		/* TRUE if IRQ is pending */
 	enum irq_type type;
 	bool level:1;		/* level-triggered */
@@ -158,16 +174,27 @@ struct irq_source {
 #define IDR_CI      0x40000000	/* critical interrupt */
 
 struct irq_dest {
+	struct kvm_vcpu *vcpu;
+
 	int32_t ctpr;		/* CPU current task priority */
 	struct irq_queue raised;
 	struct irq_queue servicing;
-	qemu_irq *irqs;
 
 	/* Count of IRQ sources asserting on non-INT outputs */
-	uint32_t outputs_active[OPENPIC_OUTPUT_NB];
+	uint32_t outputs_active[NUM_OUTPUTS];
 };
 
 struct openpic {
+	struct kvm *kvm;
+	struct kvm_device *dev;
+	struct kvm_io_device mmio;
+	struct list_head mmio_regions;
+	atomic_t users;
+	bool mmio_mapped;
+
+	gpa_t reg_base;
+	spinlock_t lock;
+
 	/* Behavior control */
 	struct fsl_mpic_info *fsl;
 	uint32_t model;
@@ -208,6 +235,47 @@ struct openpic {
 	uint32_t irq_msi;
 };
 
+
+static void mpic_irq_raise(struct openpic *opp, struct irq_dest *dst,
+			   int output)
+{
+	struct kvm_interrupt irq = {
+		.irq = KVM_INTERRUPT_SET_LEVEL,
+	};
+
+	if (!dst->vcpu) {
+		pr_debug("%s: destination cpu %d does not exist\n",
+			 __func__, (int)(dst - &opp->dst[0]));
+		return;
+	}
+
+	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->vcpu_id,
+		output);
+
+	if (output != ILR_INTTGT_INT)	/* TODO */
+		return;
+
+	kvm_vcpu_ioctl_interrupt(dst->vcpu, &irq);
+}
+
+static void mpic_irq_lower(struct openpic *opp, struct irq_dest *dst,
+			   int output)
+{
+	if (!dst->vcpu) {
+		pr_debug("%s: destination cpu %d does not exist\n",
+			 __func__, (int)(dst - &opp->dst[0]));
+		return;
+	}
+
+	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->vcpu_id,
+		output);
+
+	if (output != ILR_INTTGT_INT)	/* TODO */
+		return;
+
+	kvmppc_core_dequeue_external(dst->vcpu);
+}
+
 static inline void IRQ_setbit(struct irq_queue *q, int n_IRQ)
 {
 	set_bit(n_IRQ, q->queue);
@@ -268,7 +336,7 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 	pr_debug("%s: IRQ %d active %d was %d\n",
 		__func__, n_IRQ, active, was_active);
 
-	if (src->output != OPENPIC_OUTPUT_INT) {
+	if (src->output != ILR_INTTGT_INT) {
 		pr_debug("%s: output %d irq %d active %d was %d count %d\n",
 			__func__, src->output, n_IRQ, active, was_active,
 			dst->outputs_active[src->output]);
@@ -282,14 +350,14 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 			    dst->outputs_active[src->output]++ == 0) {
 				pr_debug("%s: Raise OpenPIC output %d cpu %d irq %d\n",
 					__func__, src->output, n_CPU, n_IRQ);
-				qemu_irq_raise(dst->irqs[src->output]);
+				mpic_irq_raise(opp, dst, src->output);
 			}
 		} else {
 			if (was_active &&
 			    --dst->outputs_active[src->output] == 0) {
 				pr_debug("%s: Lower OpenPIC output %d cpu %d irq %d\n",
 					__func__, src->output, n_CPU, n_IRQ);
-				qemu_irq_lower(dst->irqs[src->output]);
+				mpic_irq_lower(opp, dst, src->output);
 			}
 		}
 
@@ -322,8 +390,7 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 		} else {
 			pr_debug("%s: Raise OpenPIC INT output cpu %d irq %d/%d\n",
 				__func__, n_CPU, n_IRQ, dst->raised.next);
-			qemu_irq_raise(opp->dst[n_CPU].
-				       irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_raise(opp, dst, ILR_INTTGT_INT);
 		}
 	} else {
 		IRQ_get_next(opp, &dst->servicing);
@@ -338,8 +405,7 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ,
 			pr_debug("%s: IRQ %d inactive, current prio %d/%d, CPU %d\n",
 				__func__, n_IRQ, dst->ctpr,
 				dst->servicing.priority, n_CPU);
-			qemu_irq_lower(opp->dst[n_CPU].
-				       irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_lower(opp, dst, ILR_INTTGT_INT);
 		}
 	}
 }
@@ -415,8 +481,8 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 	struct irq_source *src;
 
 	if (n_IRQ >= MAX_IRQ) {
-		pr_err("%s: IRQ %d out of range\n", __func__, n_IRQ);
-		abort();
+		WARN_ONCE(1, "%s: IRQ %d out of range\n", __func__, n_IRQ);
+		return;
 	}
 
 	src = &opp->src[n_IRQ];
@@ -433,7 +499,7 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 			openpic_update_irq(opp, n_IRQ);
 		}
 
-		if (src->output != OPENPIC_OUTPUT_INT) {
+		if (src->output != ILR_INTTGT_INT) {
 			/* Edge-triggered interrupts shouldn't be used
 			 * with non-INT delivery, but just in case,
 			 * try to make it do something sane rather than
@@ -446,15 +512,13 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level)
 	}
 }
 
-static void openpic_reset(DeviceState *d)
+static void openpic_reset(struct openpic *opp)
 {
-	struct openpic *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d));
 	int i;
 
 	opp->gcr = GCR_RESET;
 	/* Initialise controller registers */
 	opp->frr = ((opp->nb_irqs - 1) << FRR_NIRQ_SHIFT) |
-	    ((opp->nb_cpus - 1) << FRR_NCPU_SHIFT) |
 	    (opp->vid << FRR_VID_SHIFT);
 
 	opp->pir = 0;
@@ -504,7 +568,7 @@ static inline uint32_t read_IRQreg_idr(struct openpic *opp, int n_IRQ)
 static inline uint32_t read_IRQreg_ilr(struct openpic *opp, int n_IRQ)
 {
 	if (opp->flags & OPENPIC_FLAG_ILR)
-		return output_to_inttgt(opp->src[n_IRQ].output);
+		return opp->src[n_IRQ].output;
 
 	return 0xffffffff;
 }
@@ -539,7 +603,7 @@ static inline void write_IRQreg_idr(struct openpic *opp, int n_IRQ,
 					__func__);
 			}
 
-			src->output = OPENPIC_OUTPUT_CINT;
+			src->output = ILR_INTTGT_CINT;
 			src->nomask = true;
 			src->destmask = 0;
 
@@ -550,7 +614,7 @@ static inline void write_IRQreg_idr(struct openpic *opp, int n_IRQ,
 					src->destmask |= 1UL << i;
 			}
 		} else {
-			src->output = OPENPIC_OUTPUT_INT;
+			src->output = ILR_INTTGT_INT;
 			src->nomask = false;
 			src->destmask = src->idr & normal_mask;
 		}
@@ -565,7 +629,7 @@ static inline void write_IRQreg_ilr(struct openpic *opp, int n_IRQ,
 	if (opp->flags & OPENPIC_FLAG_ILR) {
 		struct irq_source *src = &opp->src[n_IRQ];
 
-		src->output = inttgt_to_output(val & ILR_INTTGT_MASK);
+		src->output = val & ILR_INTTGT_MASK;
 		pr_debug("Set ILR %d to 0x%08x, output %d\n", n_IRQ, src->idr,
 			src->output);
 
@@ -614,34 +678,23 @@ static inline void write_IRQreg_ivpr(struct openpic *opp, int n_IRQ,
 
 static void openpic_gcr_write(struct openpic *opp, uint64_t val)
 {
-	bool mpic_proxy = false;
-
 	if (val & GCR_RESET) {
-		openpic_reset(&opp->busdev.qdev);
+		openpic_reset(opp);
 		return;
 	}
 
 	opp->gcr &= ~opp->mpic_mode_mask;
 	opp->gcr |= val & opp->mpic_mode_mask;
-
-	/* Set external proxy mode */
-	if ((val & opp->mpic_mode_mask) == GCR_MODE_PROXY)
-		mpic_proxy = true;
-
-	ppce500_set_mpic_proxy(mpic_proxy);
 }
 
-static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned len)
+static int openpic_gbl_write(void *opaque, gpa_t addr, u32 val)
 {
 	struct openpic *opp = opaque;
-	struct irq_dest *dst;
-	int idx;
+	int err = 0;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= %08x\n", __func__, addr, val);
 	if (addr & 0xF)
-		return;
+		return 0;
 
 	switch (addr) {
 	case 0x00:	/* Block Revision Register1 (BRR1) is Readonly */
@@ -654,7 +707,8 @@ static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
 	case 0x90:
 	case 0xA0:
 	case 0xB0:
-		openpic_cpu_write_internal(opp, addr, val, get_current_cpu());
+		err = openpic_cpu_write_internal(opp, addr, val,
+						 get_current_cpu());
 		break;
 	case 0x1000:		/* FRR */
 		break;
@@ -664,21 +718,11 @@ static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
 	case 0x1080:		/* VIR */
 		break;
 	case 0x1090:		/* PIR */
-		for (idx = 0; idx < opp->nb_cpus; idx++) {
-			if ((val & (1 << idx)) && !(opp->pir & (1 << idx))) {
-				pr_debug("Raise OpenPIC RESET output for CPU %d\n",
-					idx);
-				dst = &opp->dst[idx];
-				qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_RESET]);
-			} else if (!(val & (1 << idx)) &&
-				   (opp->pir & (1 << idx))) {
-				pr_debug("Lower OpenPIC RESET output for CPU %d\n",
-					idx);
-				dst = &opp->dst[idx];
-				qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_RESET]);
-			}
-		}
-		opp->pir = val;
+		/*
+		 * This register is used to reset a CPU core --
+		 * let userspace handle it.
+		 */
+		err = -ENXIO;
 		break;
 	case 0x10A0:		/* IPI_IVPR */
 	case 0x10B0:
@@ -695,21 +739,25 @@ static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val,
 	default:
 		break;
 	}
+
+	return err;
 }
 
-static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len)
+static int openpic_gbl_read(void *opaque, gpa_t addr, u32 *ptr)
 {
 	struct openpic *opp = opaque;
-	uint32_t retval;
+	u32 retval;
+	int err = 0;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 	retval = 0xFFFFFFFF;
 	if (addr & 0xF)
-		return retval;
+		goto out;
 
 	switch (addr) {
 	case 0x1000:		/* FRR */
 		retval = opp->frr;
+		retval |= (opp->nb_cpus - 1) << FRR_NCPU_SHIFT;
 		break;
 	case 0x1020:		/* GCR */
 		retval = opp->gcr;
@@ -731,8 +779,8 @@ static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len)
 	case 0x90:
 	case 0xA0:
 	case 0xB0:
-		retval =
-		    openpic_cpu_read_internal(opp, addr, get_current_cpu());
+		err = openpic_cpu_read_internal(opp, addr,
+			&retval, get_current_cpu());
 		break;
 	case 0x10A0:		/* IPI_IVPR */
 	case 0x10B0:
@@ -750,28 +798,28 @@ static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len)
 	default:
 		break;
 	}
-	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
-	return retval;
+out:
+	pr_debug("%s: => 0x%08x\n", __func__, retval);
+	*ptr = retval;
+	return err;
 }
 
-static void openpic_tmr_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned len)
+static int openpic_tmr_write(void *opaque, gpa_t addr, u32 val)
 {
 	struct openpic *opp = opaque;
 	int idx;
 
 	addr += 0x10f0;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= %08x\n", __func__, addr, val);
 	if (addr & 0xF)
-		return;
+		return 0;
 
 	if (addr == 0x10f0) {
 		/* TFRR */
 		opp->tfrr = val;
-		return;
+		return 0;
 	}
 
 	idx = (addr >> 6) & 0x3;
@@ -795,15 +843,17 @@ static void openpic_tmr_write(void *opaque, gpa_t addr, uint64_t val,
 		write_IRQreg_idr(opp, opp->irq_tim0 + idx, val);
 		break;
 	}
+
+	return 0;
 }
 
-static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len)
+static int openpic_tmr_read(void *opaque, gpa_t addr, u32 *ptr)
 {
 	struct openpic *opp = opaque;
 	uint32_t retval = -1;
 	int idx;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 	if (addr & 0xF)
 		goto out;
 
@@ -813,6 +863,7 @@ static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len)
 		retval = opp->tfrr;
 		goto out;
 	}
+
 	switch (addr & 0x30) {
 	case 0x00:		/* TCCR */
 		retval = opp->timers[idx].tccr;
@@ -830,18 +881,16 @@ static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len)
 
 out:
 	pr_debug("%s: => 0x%08x\n", __func__, retval);
-
-	return retval;
+	*ptr = retval;
+	return 0;
 }
 
-static void openpic_src_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned len)
+static int openpic_src_write(void *opaque, gpa_t addr, u32 val)
 {
 	struct openpic *opp = opaque;
 	int idx;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= %08x\n", __func__, addr, val);
 
 	addr = addr & 0xffff;
 	idx = addr >> 5;
@@ -857,15 +906,17 @@ static void openpic_src_write(void *opaque, gpa_t addr, uint64_t val,
 		write_IRQreg_ilr(opp, idx, val);
 		break;
 	}
+
+	return 0;
 }
 
-static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
+static int openpic_src_read(void *opaque, gpa_t addr, u32 *ptr)
 {
 	struct openpic *opp = opaque;
 	uint32_t retval;
 	int idx;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 	retval = 0xFFFFFFFF;
 
 	addr = addr & 0xffff;
@@ -884,20 +935,19 @@ static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len)
 	}
 
 	pr_debug("%s: => 0x%08x\n", __func__, retval);
-	return retval;
+	*ptr = retval;
+	return 0;
 }
 
-static void openpic_msi_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned size)
+static int openpic_msi_write(void *opaque, gpa_t addr, u32 val)
 {
 	struct openpic *opp = opaque;
 	int idx = opp->irq_msi;
 	int srs, ibs;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= 0x%08x\n", __func__, addr, val);
 	if (addr & 0xF)
-		return;
+		return 0;
 
 	switch (addr) {
 	case MSIIR_OFFSET:
@@ -911,17 +961,19 @@ static void openpic_msi_write(void *opaque, gpa_t addr, uint64_t val,
 		/* most registers are read-only, thus ignored */
 		break;
 	}
+
+	return 0;
 }
 
-static uint64_t openpic_msi_read(void *opaque, gpa_t addr, unsigned size)
+static int openpic_msi_read(void *opaque, gpa_t addr, u32 *ptr)
 {
 	struct openpic *opp = opaque;
-	uint64_t r = 0;
+	uint32_t r = 0;
 	int i, srs;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 	if (addr & 0xF)
-		return -1;
+		return -ENXIO;
 
 	srs = addr >> 4;
 
@@ -945,45 +997,47 @@ static uint64_t openpic_msi_read(void *opaque, gpa_t addr, unsigned size)
 		break;
 	}
 
-	return r;
+	pr_debug("%s: => 0x%08x\n", __func__, r);
+	*ptr = r;
+	return 0;
 }
 
-static uint64_t openpic_summary_read(void *opaque, gpa_t addr, unsigned size)
+static int openpic_summary_read(void *opaque, gpa_t addr, u32 *ptr)
 {
-	uint64_t r = 0;
+	uint32_t r = 0;
 
-	pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr);
+	pr_debug("%s: addr %#llx\n", __func__, addr);
 
 	/* TODO: EISR/EIMR */
 
-	return r;
+	*ptr = r;
+	return 0;
 }
 
-static void openpic_summary_write(void *opaque, gpa_t addr, uint64_t val,
-				  unsigned size)
+static int openpic_summary_write(void *opaque, gpa_t addr, u32 val)
 {
-	pr_debug("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n",
-		__func__, addr, val);
+	pr_debug("%s: addr %#llx <= 0x%08x\n", __func__, addr, val);
 
 	/* TODO: EISR/EIMR */
+	return 0;
 }
 
-static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
-				       uint32_t val, int idx)
+static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
+				      u32 val, int idx)
 {
 	struct openpic *opp = opaque;
 	struct irq_source *src;
 	struct irq_dest *dst;
 	int s_IRQ, n_IRQ;
 
-	pr_debug("%s: cpu %d addr %#" HWADDR_PRIx " <= 0x%08x\n", __func__, idx,
+	pr_debug("%s: cpu %d addr %#llx <= 0x%08x\n", __func__, idx,
 		addr, val);
 
 	if (idx < 0)
-		return;
+		return 0;
 
 	if (addr & 0xF)
-		return;
+		return 0;
 
 	dst = &opp->dst[idx];
 	addr &= 0xFF0;
@@ -1008,11 +1062,11 @@ static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
 		if (dst->raised.priority <= dst->ctpr) {
 			pr_debug("%s: Lower OpenPIC INT output cpu %d due to ctpr\n",
 				__func__, idx);
-			qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_lower(opp, dst, ILR_INTTGT_INT);
 		} else if (dst->raised.priority > dst->servicing.priority) {
 			pr_debug("%s: Raise OpenPIC INT output cpu %d irq %d\n",
 				__func__, idx, dst->raised.next);
-			qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_raise(opp, dst, ILR_INTTGT_INT);
 		}
 
 		break;
@@ -1043,18 +1097,22 @@ static void openpic_cpu_write_internal(void *opaque, gpa_t addr,
 		     IVPR_PRIORITY(src->ivpr) > dst->servicing.priority)) {
 			pr_debug("Raise OpenPIC INT output cpu %d irq %d\n",
 				idx, n_IRQ);
-			qemu_irq_raise(opp->dst[idx].irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_raise(opp, dst, ILR_INTTGT_INT);
 		}
 		break;
 	default:
 		break;
 	}
+
+	return 0;
 }
 
-static void openpic_cpu_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned len)
+static int openpic_cpu_write(void *opaque, gpa_t addr, u32 val)
 {
-	openpic_cpu_write_internal(opaque, addr, val, (addr & 0x1f000) >> 12);
+	struct openpic *opp = opaque;
+
+	return openpic_cpu_write_internal(opp, addr, val,
+					 (addr & 0x1f000) >> 12);
 }
 
 static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
@@ -1064,7 +1122,7 @@ static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
 	int retval, irq;
 
 	pr_debug("Lower OpenPIC INT output\n");
-	qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
+	mpic_irq_lower(opp, dst, ILR_INTTGT_INT);
 
 	irq = IRQ_get_next(opp, &dst->raised);
 	pr_debug("IACK: irq=%d\n", irq);
@@ -1107,20 +1165,21 @@ static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
 	return retval;
 }
 
-static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx)
+static int openpic_cpu_read_internal(void *opaque, gpa_t addr,
+				     u32 *ptr, int idx)
 {
 	struct openpic *opp = opaque;
 	struct irq_dest *dst;
 	uint32_t retval;
 
-	pr_debug("%s: cpu %d addr %#" HWADDR_PRIx "\n", __func__, idx, addr);
+	pr_debug("%s: cpu %d addr %#llx\n", __func__, idx, addr);
 	retval = 0xFFFFFFFF;
 
 	if (idx < 0)
-		return retval;
+		goto out;
 
 	if (addr & 0xF)
-		return retval;
+		goto out;
 
 	dst = &opp->dst[idx];
 	addr &= 0xFF0;
@@ -1142,49 +1201,67 @@ static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx)
 	}
 	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
-	return retval;
+out:
+	*ptr = retval;
+	return 0;
 }
 
-static uint64_t openpic_cpu_read(void *opaque, gpa_t addr, unsigned len)
+static int openpic_cpu_read(void *opaque, gpa_t addr, u32 *ptr)
 {
-	return openpic_cpu_read_internal(opaque, addr, (addr & 0x1f000) >> 12);
+	struct openpic *opp = opaque;
+
+	return openpic_cpu_read_internal(opp, addr, ptr,
+					 (addr & 0x1f000) >> 12);
 }
 
-static const struct kvm_io_device_ops openpic_glb_ops_be = {
+struct mem_reg {
+	struct list_head list;
+	int (*read)(void *opaque, gpa_t addr, u32 *ptr);
+	int (*write)(void *opaque, gpa_t addr, u32 val);
+	gpa_t start_addr;
+	int size;
+};
+
+static struct mem_reg openpic_gbl_mmio = {
 	.write = openpic_gbl_write,
 	.read = openpic_gbl_read,
+	.start_addr = OPENPIC_GLB_REG_START,
+	.size = OPENPIC_GLB_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_tmr_ops_be = {
+static struct mem_reg openpic_tmr_mmio = {
 	.write = openpic_tmr_write,
 	.read = openpic_tmr_read,
+	.start_addr = OPENPIC_TMR_REG_START,
+	.size = OPENPIC_TMR_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_cpu_ops_be = {
+static struct mem_reg openpic_cpu_mmio = {
 	.write = openpic_cpu_write,
 	.read = openpic_cpu_read,
+	.start_addr = OPENPIC_CPU_REG_START,
+	.size = OPENPIC_CPU_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_src_ops_be = {
+static struct mem_reg openpic_src_mmio = {
 	.write = openpic_src_write,
 	.read = openpic_src_read,
+	.start_addr = OPENPIC_SRC_REG_START,
+	.size = OPENPIC_SRC_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_msi_ops_be = {
+static struct mem_reg openpic_msi_mmio = {
 	.read = openpic_msi_read,
 	.write = openpic_msi_write,
+	.start_addr = OPENPIC_MSI_REG_START,
+	.size = OPENPIC_MSI_REG_SIZE,
 };
 
-static const struct kvm_io_device_ops openpic_summary_ops_be = {
+static struct mem_reg openpic_summary_mmio = {
 	.read = openpic_summary_read,
 	.write = openpic_summary_write,
-};
-
-struct mem_reg {
-	const char *name;
-	const struct kvm_io_device_ops *ops;
-	gpa_t start_addr;
-	int size;
+	.start_addr = OPENPIC_SUMMARY_REG_START,
+	.size = OPENPIC_SUMMARY_REG_SIZE,
 };
 
 static void fsl_common_init(struct openpic *opp)
@@ -1192,6 +1269,9 @@ static void fsl_common_init(struct openpic *opp)
 	int i;
 	int virq = MAX_SRC;
 
+	list_add(&openpic_msi_mmio.list, &opp->mmio_regions);
+	list_add(&openpic_summary_mmio.list, &opp->mmio_regions);
+
 	opp->vid = VID_REVISION_1_2;
 	opp->vir = VIR_GENERIC;
 	opp->vector_mask = 0xFFFF;
@@ -1205,11 +1285,10 @@ static void fsl_common_init(struct openpic *opp)
 	opp->irq_tim0 = virq;
 	virq += MAX_TMR;
 
-	assert(virq <= MAX_IRQ);
+	BUG_ON(virq > MAX_IRQ);
 
 	opp->irq_msi = 224;
 
-	msi_supported = true;
 	for (i = 0; i < opp->fsl->max_ext; i++)
 		opp->src[i].level = false;
 
@@ -1226,63 +1305,352 @@ static void fsl_common_init(struct openpic *opp)
 	}
 }
 
-static void map_list(struct openpic *opp, const struct mem_reg *list,
-		     int *count)
+static int kvm_mpic_read_internal(struct openpic *opp, gpa_t addr, u32 *ptr)
 {
-	while (list->name) {
-		assert(*count < ARRAY_SIZE(opp->sub_io_mem));
+	struct list_head *node;
 
-		memory_region_init_io(&opp->sub_io_mem[*count], list->ops, opp,
-				      list->name, list->size);
+	list_for_each(node, &opp->mmio_regions) {
+		struct mem_reg *mr = list_entry(node, struct mem_reg, list);
 
-		memory_region_add_subregion(&opp->mem, list->start_addr,
-					    &opp->sub_io_mem[*count]);
+		if (mr->start_addr > addr || addr >= mr->start_addr + mr->size)
+			continue;
 
-		(*count)++;
-		list++;
+		return mr->read(opp, addr - mr->start_addr, ptr);
 	}
+
+	return -ENXIO;
 }
 
-static int openpic_init(SysBusDevice *dev)
+static int kvm_mpic_write_internal(struct openpic *opp, gpa_t addr, u32 val)
 {
-	struct openpic *opp = FROM_SYSBUS(typeof(*opp), dev);
-	int i, j;
-	int list_count = 0;
-	static const struct mem_reg list_le[] = {
-		{"glb", &openpic_glb_ops_le,
-		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
-		{"tmr", &openpic_tmr_ops_le,
-		 OPENPIC_TMR_REG_START, OPENPIC_TMR_REG_SIZE},
-		{"src", &openpic_src_ops_le,
-		 OPENPIC_SRC_REG_START, OPENPIC_SRC_REG_SIZE},
-		{"cpu", &openpic_cpu_ops_le,
-		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
-		{NULL}
-	};
-	static const struct mem_reg list_be[] = {
-		{"glb", &openpic_glb_ops_be,
-		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
-		{"tmr", &openpic_tmr_ops_be,
-		 OPENPIC_TMR_REG_START, OPENPIC_TMR_REG_SIZE},
-		{"src", &openpic_src_ops_be,
-		 OPENPIC_SRC_REG_START, OPENPIC_SRC_REG_SIZE},
-		{"cpu", &openpic_cpu_ops_be,
-		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
-		{NULL}
-	};
-	static const struct mem_reg list_fsl[] = {
-		{"msi", &openpic_msi_ops_be,
-		 OPENPIC_MSI_REG_START, OPENPIC_MSI_REG_SIZE},
-		{"summary", &openpic_summary_ops_be,
-		 OPENPIC_SUMMARY_REG_START, OPENPIC_SUMMARY_REG_SIZE},
-		{NULL}
-	};
+	struct list_head *node;
+
+	list_for_each(node, &opp->mmio_regions) {
+		struct mem_reg *mr = list_entry(node, struct mem_reg, list);
+
+		if (mr->start_addr > addr || addr >= mr->start_addr + mr->size)
+			continue;
 
-	memory_region_init(&opp->mem, "openpic", 0x40000);
+		return mr->write(opp, addr - mr->start_addr, val);
+	}
+
+	return -ENXIO;
+}
+
+static int kvm_mpic_read(struct kvm_io_device *this, gpa_t addr,
+			 int len, void *ptr)
+{
+	struct openpic *opp = container_of(this, struct openpic, mmio);
+	int ret;
+	union {
+		u32 val;
+		u8 bytes[4];
+	} u;
+
+	if (addr & (len - 1)) {
+		pr_debug("%s: bad alignment %llx/%d\n",
+			 __func__, addr, len);
+		return -EINVAL;
+	}
+
+	spin_lock_irq(&opp->lock);
+	ret = kvm_mpic_read_internal(opp, addr - opp->reg_base, &u.val);
+	spin_unlock_irq(&opp->lock);
+
+	/*
+	 * Technically only 32-bit accesses are allowed, but be nice to
+	 * people dumping registers a byte at a time -- it works in real
+	 * hardware (reads only, not writes).
+	 */
+	if (len == 4) {
+		*(u32 *)ptr = u.val;
+		pr_debug("%s: addr %llx ret %d len 4 val %x\n",
+			 __func__, addr, ret, u.val);
+	} else if (len == 1) {
+		*(u8 *)ptr = u.bytes[addr & 3];
+		pr_debug("%s: addr %llx ret %d len 1 val %x\n",
+			 __func__, addr, ret, u.bytes[addr & 3]);
+	} else {
+		pr_debug("%s: bad length %d\n", __func__, len);
+		return -EINVAL;
+	}
+
+	return ret;
+}
+
+static int kvm_mpic_write(struct kvm_io_device *this, gpa_t addr,
+			  int len, const void *ptr)
+{
+	struct openpic *opp = container_of(this, struct openpic, mmio);
+	int ret;
+
+	if (len != 4) {
+		pr_debug("%s: bad length %d\n", __func__, len);
+		return -EOPNOTSUPP;
+	}
+	if (addr & 3) {
+		pr_debug("%s: bad alignment %llx/%d\n", __func__, addr, len);
+		return -EOPNOTSUPP;
+	}
+
+	spin_lock_irq(&opp->lock);
+	ret = kvm_mpic_write_internal(opp, addr - opp->reg_base,
+				      *(const u32 *)ptr);
+	spin_unlock_irq(&opp->lock);
+
+	pr_debug("%s: addr %llx ret %d val %x\n",
+		 __func__, addr, ret, *(const u32 *)ptr);
+
+	return ret;
+}
+
+static void kvm_mpic_dtor(struct kvm_io_device *this)
+{
+	struct openpic *opp = container_of(this, struct openpic, mmio);
+
+	opp->mmio_mapped = false;
+}
+
+static const struct kvm_io_device_ops mpic_mmio_ops = {
+	.read = kvm_mpic_read,
+	.write = kvm_mpic_write,
+	.destructor = kvm_mpic_dtor,
+};
+
+static void map_mmio(struct openpic *opp)
+{
+	BUG_ON(opp->mmio_mapped);
+	opp->mmio_mapped = true;
+
+	kvm_iodevice_init(&opp->mmio, &mpic_mmio_ops);
+
+	kvm_io_bus_register_dev(opp->kvm, KVM_MMIO_BUS,
+				opp->reg_base, OPENPIC_REG_SIZE,
+				&opp->mmio);
+}
+
+static void unmap_mmio(struct openpic *opp)
+{
+	BUG_ON(opp->mmio_mapped);
+	opp->mmio_mapped = false;
+
+	kvm_io_bus_unregister_dev(opp->kvm, KVM_MMIO_BUS, &opp->mmio);
+}
+
+static int set_base_addr(struct openpic *opp, struct kvm_device_attr *attr)
+{
+	u64 base;
+
+	if (copy_from_user(&base, (u64 __user *)(long)attr->addr, sizeof(u64)))
+		return -EFAULT;
+
+	if (base & 0x3ffff) {
+		pr_debug("kvm mpic %s: KVM_DEV_MPIC_BASE_ADDR %08llx not aligned\n",
+			 __func__, base);
+		return -EINVAL;
+	}
+
+	if (base == opp->reg_base)
+		return 0;
+
+	mutex_lock(&opp->kvm->slots_lock);
+
+	unmap_mmio(opp);
+	opp->reg_base = base;
+
+	pr_debug("kvm mpic %s: KVM_DEV_MPIC_BASE_ADDR %08llx\n",
+		 __func__, base);
+
+	if (base == 0)
+		goto out;
+
+	map_mmio(opp);
+
+out:
+	mutex_unlock(&opp->kvm->slots_lock);
+	return 0;
+}
+
+#define ATTR_SET		0
+#define ATTR_GET		1
+
+static int access_reg(struct openpic *opp, gpa_t addr, u32 *val, int type)
+{
+	int ret;
+
+	if (addr & 3)
+		return -ENXIO;
+
+	spin_lock_irq(&opp->lock);
+
+	if (type == ATTR_SET)
+		ret = kvm_mpic_write_internal(opp, addr, *val);
+	else
+		ret = kvm_mpic_read_internal(opp, addr, val);
+
+	spin_unlock_irq(&opp->lock);
+
+	pr_debug("%s: type %d addr %llx val %x\n", __func__, type, addr, *val);
+
+	return ret;
+}
+
+static int mpic_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
+{
+	struct openpic *opp = dev->private;
+	u32 attr32;
+
+	switch (attr->group) {
+	case KVM_DEV_MPIC_GRP_MISC:
+		switch (attr->attr) {
+		case KVM_DEV_MPIC_BASE_ADDR:
+			return set_base_addr(opp, attr);
+		}
+
+		break;
+
+	case KVM_DEV_MPIC_GRP_REGISTER:
+		if (get_user(attr32, (u32 __user *)(long)attr->addr))
+			return -EFAULT;
+
+		return access_reg(opp, attr->attr, &attr32, ATTR_SET);
+
+	case KVM_DEV_MPIC_GRP_IRQ_ACTIVE:
+		if (attr->attr > MAX_SRC)
+			return -EINVAL;
+
+		if (get_user(attr32, (u32 __user *)(long)attr->addr))
+			return -EFAULT;
+
+		if (attr32 != 0 && attr32 != 1)
+			return -EINVAL;
+
+		spin_lock_irq(&opp->lock);
+		openpic_set_irq(opp, attr->attr, attr32);
+		spin_unlock_irq(&opp->lock);
+		return 0;
+	}
+
+	return -ENXIO;
+}
+
+static int mpic_get_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
+{
+	struct openpic *opp = dev->private;
+	u64 attr64;
+	u32 attr32;
+	int ret;
+
+	switch (attr->group) {
+	case KVM_DEV_MPIC_GRP_MISC:
+		switch (attr->attr) {
+		case KVM_DEV_MPIC_BASE_ADDR:
+			mutex_lock(&opp->kvm->slots_lock);
+			attr64 = opp->reg_base;
+			mutex_unlock(&opp->kvm->slots_lock);
+
+			if (copy_to_user((u64 __user *)(long)attr->addr,
+					 &attr64, sizeof(u64)))
+				return -EFAULT;
+
+			return 0;
+		}
+
+		break;
+
+	case KVM_DEV_MPIC_GRP_REGISTER:
+		ret = access_reg(opp, attr->attr, &attr32, ATTR_GET);
+		if (ret)
+			return ret;
+
+		if (put_user(attr32, (u32 __user *)(long)attr->addr))
+			return -EFAULT;
+
+		return 0;
+
+	case KVM_DEV_MPIC_GRP_IRQ_ACTIVE:
+		if (attr->attr > MAX_SRC)
+			return -EINVAL;
+
+		spin_lock_irq(&opp->lock);
+		attr32 = opp->src[attr->attr].pending;
+		spin_unlock_irq(&opp->lock);
+
+		if (put_user(attr32, (u32 __user *)(long)attr->addr))
+			return -EFAULT;
+
+		return 0;
+	}
+
+	return -ENXIO;
+}
+
+static int mpic_has_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
+{
+	switch (attr->group) {
+	case KVM_DEV_MPIC_GRP_MISC:
+		switch (attr->attr) {
+		case KVM_DEV_MPIC_BASE_ADDR:
+			return 0;
+		}
+
+		break;
+
+	case KVM_DEV_MPIC_GRP_REGISTER:
+		return 0;
+
+	case KVM_DEV_MPIC_GRP_IRQ_ACTIVE:
+		if (attr->attr > MAX_SRC)
+			break;
+
+		return 0;
+	}
+
+	return -ENXIO;
+}
+
+static void mpic_destroy(struct kvm_device *dev)
+{
+	struct openpic *opp = dev->private;
+
+	if (opp->mmio_mapped) {
+		/*
+		 * Normally we get unmapped by kvm_io_bus_destroy(),
+		 * which happens before the VCPUs release their references.
+		 *
+		 * Thus, we should only get here if no VCPUs took a reference
+		 * to us in the first place.
+		 */
+		WARN_ON(opp->nb_cpus != 0);
+		unmap_mmio(opp);
+	}
+
+	kfree(opp);
+}
+
+static int mpic_create(struct kvm_device *dev, u32 type)
+{
+	struct openpic *opp;
+	int ret;
+
+	opp = kzalloc(sizeof(struct openpic), GFP_KERNEL);
+	if (!opp)
+		return -ENOMEM;
+
+	dev->private = opp;
+	opp->kvm = dev->kvm;
+	opp->dev = dev;
+	opp->model = type;
+	spin_lock_init(&opp->lock);
+
+	INIT_LIST_HEAD(&opp->mmio_regions);
+	list_add(&openpic_gbl_mmio.list, &opp->mmio_regions);
+	list_add(&openpic_tmr_mmio.list, &opp->mmio_regions);
+	list_add(&openpic_src_mmio.list, &opp->mmio_regions);
+	list_add(&openpic_cpu_mmio.list, &opp->mmio_regions);
 
 	switch (opp->model) {
-	case OPENPIC_MODEL_FSL_MPIC_20:
-	default:
+	case KVM_DEV_TYPE_FSL_MPIC_20:
 		opp->fsl = &fsl_mpic_20;
 		opp->brr1 = 0x00400200;
 		opp->flags |= OPENPIC_FLAG_IDR_CRIT;
@@ -1290,12 +1658,10 @@ static int openpic_init(SysBusDevice *dev)
 		opp->mpic_mode_mask = GCR_MODE_MIXED;
 
 		fsl_common_init(opp);
-		map_list(opp, list_be, &list_count);
-		map_list(opp, list_fsl, &list_count);
 
 		break;
 
-	case OPENPIC_MODEL_FSL_MPIC_42:
+	case KVM_DEV_TYPE_FSL_MPIC_42:
 		opp->fsl = &fsl_mpic_42;
 		opp->brr1 = 0x00400402;
 		opp->flags |= OPENPIC_FLAG_ILR;
@@ -1303,11 +1669,27 @@ static int openpic_init(SysBusDevice *dev)
 		opp->mpic_mode_mask = GCR_MODE_PROXY;
 
 		fsl_common_init(opp);
-		map_list(opp, list_be, &list_count);
-		map_list(opp, list_fsl, &list_count);
 
 		break;
+
+	default:
+		ret = -ENODEV;
+		goto err;
 	}
 
+	openpic_reset(opp);
 	return 0;
+
+err:
+	kfree(opp);
+	return ret;
 }
+
+struct kvm_device_ops kvm_mpic_ops = {
+	.name = "kvm-mpic",
+	.create = mpic_create,
+	.destroy = mpic_destroy,
+	.set_attr = mpic_set_attr,
+	.get_attr = mpic_get_attr,
+	.has_attr = mpic_has_attr,
+};
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index a822659..3cad714 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -317,6 +317,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_ENABLE_CAP:
 	case KVM_CAP_ONE_REG:
 	case KVM_CAP_IOEVENTFD:
+	case KVM_CAP_DEVICE_CTRL:
 		r = 1;
 		break;
 #ifndef CONFIG_KVM_BOOK3S_64_HV
@@ -768,7 +769,10 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 		break;
 	case KVM_CAP_PPC_EPR:
 		r = 0;
-		vcpu->arch.epr_enabled = cap->args[0];
+		if (cap->args[0])
+			vcpu->arch.epr_flags |= KVMPPC_EPR_USER;
+		else
+			vcpu->arch.epr_flags &= ~KVMPPC_EPR_USER;
 		break;
 #ifdef CONFIG_BOOKE
 	case KVM_CAP_PPC_BOOKE_WATCHDOG:
@@ -914,6 +918,7 @@ static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
 long kvm_arch_vm_ioctl(struct file *filp,
                        unsigned int ioctl, unsigned long arg)
 {
+	struct kvm *kvm __maybe_unused = filp->private_data;
 	void __user *argp = (void __user *)arg;
 	long r;
 
@@ -932,7 +937,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 #ifdef CONFIG_PPC_BOOK3S_64
 	case KVM_CREATE_SPAPR_TCE: {
 		struct kvm_create_spapr_tce create_tce;
-		struct kvm *kvm = filp->private_data;
 
 		r = -EFAULT;
 		if (copy_from_user(&create_tce, argp, sizeof(create_tce)))
@@ -944,7 +948,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 
 #ifdef CONFIG_KVM_BOOK3S_64_HV
 	case KVM_ALLOCATE_RMA: {
-		struct kvm *kvm = filp->private_data;
 		struct kvm_allocate_rma rma;
 
 		r = kvm_vm_ioctl_allocate_rma(kvm, &rma);
@@ -954,7 +957,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	}
 
 	case KVM_PPC_ALLOCATE_HTAB: {
-		struct kvm *kvm = filp->private_data;
 		u32 htab_order;
 
 		r = -EFAULT;
@@ -971,7 +973,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	}
 
 	case KVM_PPC_GET_HTAB_FD: {
-		struct kvm *kvm = filp->private_data;
 		struct kvm_get_htab_fd ghf;
 
 		r = -EFAULT;
@@ -984,7 +985,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 
 #ifdef CONFIG_PPC_BOOK3S_64
 	case KVM_PPC_GET_SMMU_INFO: {
-		struct kvm *kvm = filp->private_data;
 		struct kvm_ppc_smmu_info info;
 
 		memset(&info, 0, sizeof(info));
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 6dab6b5..feffbda 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1099,6 +1099,8 @@ void kvm_device_get(struct kvm_device *dev);
 void kvm_device_put(struct kvm_device *dev);
 struct kvm_device *kvm_device_from_filp(struct file *filp);
 
+extern struct kvm_device_ops kvm_mpic_ops;
+
 #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
 
 static inline void kvm_vcpu_set_in_spin_loop(struct kvm_vcpu *vcpu, bool val)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index be15aff..568d65d 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -925,6 +925,9 @@ struct kvm_device_attr {
 	__u64	addr;		/* userspace address of attr data */
 };
 
+#define KVM_DEV_TYPE_FSL_MPIC_20	1
+#define KVM_DEV_TYPE_FSL_MPIC_42	2
+
 /* ioctl for vm fd */
 #define KVM_CREATE_DEVICE	  _IOWR(KVMIO,  0xe0, struct kvm_create_device)
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 5f0d78c..f6cd14d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2238,6 +2238,12 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
 	int ret;
 
 	switch (cd->type) {
+#ifdef CONFIG_KVM_MPIC
+	case KVM_DEV_TYPE_FSL_MPIC_20:
+	case KVM_DEV_TYPE_FSL_MPIC_42:
+		ops = &kvm_mpic_ops;
+		break;
+#endif
 	default:
 		return -ENODEV;
 	}
-- 
1.6.0.2



* [PATCH 14/17] kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

Enabling this capability connects the vcpu to the designated in-kernel
MPIC.  Using explicit connections between vcpus and irqchips allows
for flexibility, but the main benefit at the moment is that it
simplifies the code -- KVM doesn't need vm-global state to remember
which MPIC object is associated with this vm, and it doesn't need to
care about ordering between irqchip creation and vcpu creation.

Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: add stub functions for kvmppc_mpic_{dis,}connect_vcpu]
Signed-off-by: Alexander Graf <agraf@suse.de>
---
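
For readers of the archive, a minimal userspace sketch of how this
capability might be enabled, assuming the vcpu fd and the MPIC device fd
(obtained from KVM_CREATE_DEVICE) already exist. The helper names
mpic_connect_request() and mpic_connect_vcpu() are hypothetical and do not
appear in the patch; only KVM_ENABLE_CAP, struct kvm_enable_cap and the
meaning of args[0] and args[1] are taken from the API documentation added
below.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: build the KVM_ENABLE_CAP argument without issuing
 * the ioctl, so the field layout can be inspected on its own.
 */
static struct kvm_enable_cap mpic_connect_request(int mpic_fd,
						  uint32_t mpic_cpu)
{
	struct kvm_enable_cap cap;

	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_IRQ_MPIC;
	cap.args[0] = (uint64_t)mpic_fd;	/* MPIC device fd */
	cap.args[1] = mpic_cpu;			/* MPIC CPU number for this vcpu */

	return cap;
}

/*
 * Hypothetical helper: issue the request on a vcpu fd.  Returns whatever
 * ioctl() returns, i.e. -1 with errno set on failure.
 */
static int mpic_connect_vcpu(int vcpu_fd, int mpic_fd, uint32_t mpic_cpu)
{
	struct kvm_enable_cap cap = mpic_connect_request(mpic_fd, mpic_cpu);

	return ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap);
}
```

Under these assumptions a VMM would call mpic_connect_vcpu() once per vcpu
after creating the MPIC device. Since the connection is explicit, the order
of irqchip creation relative to vcpu creation should not matter.
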
 Documentation/virtual/kvm/api.txt   |    8 +++
 arch/powerpc/include/asm/kvm_host.h |    9 ++++
 arch/powerpc/include/asm/kvm_ppc.h  |   15 ++++++-
 arch/powerpc/kvm/booke.c            |    4 ++
 arch/powerpc/kvm/mpic.c             |   82 ++++++++++++++++++++++++++++++++---
 arch/powerpc/kvm/powerpc.c          |   30 +++++++++++++
 include/uapi/linux/kvm.h            |    1 +
 7 files changed, 141 insertions(+), 8 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index d52f3f9..4c326ae 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2728,3 +2728,11 @@ to receive the topmost interrupt vector.
 When disabled (args[0] == 0), behavior is as if this facility is unsupported.
 
 When this capability is enabled, KVM_EXIT_EPR can occur.
+
+6.6 KVM_CAP_IRQ_MPIC
+
+Architectures: ppc
+Parameters: args[0] is the MPIC device fd
+            args[1] is the MPIC CPU number for this vcpu
+
+This capability connects the vcpu to an in-kernel MPIC device.
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 7e7aef9..36368c9 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -375,6 +375,11 @@ struct kvmppc_booke_debug_reg {
 	u64 dac[KVMPPC_BOOKE_MAX_DAC];
 };
 
+#define KVMPPC_IRQ_DEFAULT	0
+#define KVMPPC_IRQ_MPIC		1
+
+struct openpic;
+
 struct kvm_vcpu_arch {
 	ulong host_stack;
 	u32 host_pid;
@@ -554,6 +559,10 @@ struct kvm_vcpu_arch {
 	unsigned long magic_page_pa; /* phys addr to map the magic page to */
 	unsigned long magic_page_ea; /* effect. addr to map the magic page to */
 
+	int irq_type;		/* one of KVMPPC_IRQ_* */
+	int irq_cpu_id;
+	struct openpic *mpic;	/* KVMPPC_IRQ_MPIC */
+
 #ifdef CONFIG_KVM_BOOK3S_64_HV
 	struct kvm_vcpu_arch_shared shregs;
 
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 0b86604..c9d9faf 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -248,7 +248,6 @@ int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id, union kvmppc_one_reg *);
 void kvmppc_set_pid(struct kvm_vcpu *vcpu, u32 pid);
 
 struct openpic;
-void kvmppc_mpic_put(struct openpic *opp);
 
 #ifdef CONFIG_KVM_BOOK3S_64_HV
 static inline void kvmppc_set_xics_phys(int cpu, unsigned long addr)
@@ -278,6 +277,9 @@ static inline void kvmppc_set_epr(struct kvm_vcpu *vcpu, u32 epr)
 #ifdef CONFIG_KVM_MPIC
 
 void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu);
+int kvmppc_mpic_connect_vcpu(struct kvm_device *dev, struct kvm_vcpu *vcpu,
+			     u32 cpu);
+void kvmppc_mpic_disconnect_vcpu(struct openpic *opp, struct kvm_vcpu *vcpu);
 
 #else
 
@@ -285,6 +287,17 @@ static inline void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu)
 {
 }
 
+static inline int kvmppc_mpic_connect_vcpu(struct kvm_device *dev,
+		struct kvm_vcpu *vcpu, u32 cpu)
+{
+	return -EINVAL;
+}
+
+static inline void kvmppc_mpic_disconnect_vcpu(struct openpic *opp,
+		struct kvm_vcpu *vcpu)
+{
+}
+
 #endif /* CONFIG_KVM_MPIC */
 
 int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index cff53d4..0097912 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -430,6 +430,10 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 		if (update_epr == true) {
 			if (vcpu->arch.epr_flags & KVMPPC_EPR_USER)
 				kvm_make_request(KVM_REQ_EPR_EXIT, vcpu);
+			else if (vcpu->arch.epr_flags & KVMPPC_EPR_KERNEL) {
+				BUG_ON(vcpu->arch.irq_type != KVMPPC_IRQ_MPIC);
+				kvmppc_mpic_set_epr(vcpu);
+			}
 		}
 
 		new_msr &= msr_mask;
diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index cb451b9..10bc08a 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -115,7 +115,7 @@ static int get_current_cpu(void)
 {
 #if defined(CONFIG_KVM) && defined(CONFIG_BOOKE)
 	struct kvm_vcpu *vcpu = current->thread.kvm_vcpu;
-	return vcpu ? vcpu->vcpu_id : -1;
+	return vcpu ? vcpu->arch.irq_cpu_id : -1;
 #else
 	/* XXX */
 	return -1;
@@ -249,7 +249,7 @@ static void mpic_irq_raise(struct openpic *opp, struct irq_dest *dst,
 		return;
 	}
 
-	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->vcpu_id,
+	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->arch.irq_cpu_id,
 		output);
 
 	if (output != ILR_INTTGT_INT)	/* TODO */
@@ -267,7 +267,7 @@ static void mpic_irq_lower(struct openpic *opp, struct irq_dest *dst,
 		return;
 	}
 
-	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->vcpu_id,
+	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->arch.irq_cpu_id,
 		output);
 
 	if (output != ILR_INTTGT_INT)	/* TODO */
@@ -1165,6 +1165,20 @@ static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
 	return retval;
 }
 
+void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu)
+{
+	struct openpic *opp = vcpu->arch.mpic;
+	int cpu = vcpu->arch.irq_cpu_id;
+	unsigned long flags;
+
+	spin_lock_irqsave(&opp->lock, flags);
+
+	if ((opp->gcr & opp->mpic_mode_mask) == GCR_MODE_PROXY)
+		kvmppc_set_epr(vcpu, openpic_iack(opp, &opp->dst[cpu], cpu));
+
+	spin_unlock_irqrestore(&opp->lock, flags);
+}
+
 static int openpic_cpu_read_internal(void *opaque, gpa_t addr,
 				     u32 *ptr, int idx)
 {
@@ -1431,10 +1445,10 @@ static void map_mmio(struct openpic *opp)
 
 static void unmap_mmio(struct openpic *opp)
 {
-	BUG_ON(opp->mmio_mapped);
-	opp->mmio_mapped = false;
-
-	kvm_io_bus_unregister_dev(opp->kvm, KVM_MMIO_BUS, &opp->mmio);
+	if (opp->mmio_mapped) {
+		opp->mmio_mapped = false;
+		kvm_io_bus_unregister_dev(opp->kvm, KVM_MMIO_BUS, &opp->mmio);
+	}
 }
 
 static int set_base_addr(struct openpic *opp, struct kvm_device_attr *attr)
@@ -1693,3 +1707,57 @@ struct kvm_device_ops kvm_mpic_ops = {
 	.get_attr = mpic_get_attr,
 	.has_attr = mpic_has_attr,
 };
+
+int kvmppc_mpic_connect_vcpu(struct kvm_device *dev, struct kvm_vcpu *vcpu,
+			     u32 cpu)
+{
+	struct openpic *opp = dev->private;
+	int ret = 0;
+
+	if (dev->ops != &kvm_mpic_ops)
+		return -EPERM;
+	if (opp->kvm != vcpu->kvm)
+		return -EPERM;
+	if (cpu < 0 || cpu >= MAX_CPU)
+		return -EPERM;
+
+	spin_lock_irq(&opp->lock);
+
+	if (opp->dst[cpu].vcpu) {
+		ret = -EEXIST;
+		goto out;
+	}
+	if (vcpu->arch.irq_type) {
+		ret = -EBUSY;
+		goto out;
+	}
+
+	opp->dst[cpu].vcpu = vcpu;
+	opp->nb_cpus = max(opp->nb_cpus, cpu + 1);
+
+	vcpu->arch.mpic = opp;
+	vcpu->arch.irq_cpu_id = cpu;
+	vcpu->arch.irq_type = KVMPPC_IRQ_MPIC;
+
+	/* This might need to be changed if GCR gets extended */
+	if (opp->mpic_mode_mask == GCR_MODE_PROXY)
+		vcpu->arch.epr_flags |= KVMPPC_EPR_KERNEL;
+
+	kvm_device_get(dev);
+out:
+	spin_unlock_irq(&opp->lock);
+	return ret;
+}
+
+/*
+ * This should only happen immediately before the mpic is destroyed,
+ * so we shouldn't need to worry about anything still trying to
+ * access the vcpu pointer.
+ */
+void kvmppc_mpic_disconnect_vcpu(struct openpic *opp, struct kvm_vcpu *vcpu)
+{
+	BUG_ON(!opp->dst[vcpu->arch.irq_cpu_id].vcpu);
+
+	opp->dst[vcpu->arch.irq_cpu_id].vcpu = NULL;
+	kvm_device_put(opp->dev);
+}
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 3cad714..c431fea 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -25,6 +25,7 @@
 #include <linux/hrtimer.h>
 #include <linux/fs.h>
 #include <linux/slab.h>
+#include <linux/file.h>
 #include <asm/cputable.h>
 #include <asm/uaccess.h>
 #include <asm/kvm_ppc.h>
@@ -327,6 +328,9 @@ int kvm_dev_ioctl_check_extension(long ext)
 #if defined(CONFIG_KVM_E500V2) || defined(CONFIG_KVM_E500MC)
 	case KVM_CAP_SW_TLB:
 #endif
+#ifdef CONFIG_KVM_MPIC
+	case KVM_CAP_IRQ_MPIC:
+#endif
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
@@ -460,6 +464,13 @@ void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
 	tasklet_kill(&vcpu->arch.tasklet);
 
 	kvmppc_remove_vcpu_debugfs(vcpu);
+
+	switch (vcpu->arch.irq_type) {
+	case KVMPPC_IRQ_MPIC:
+		kvmppc_mpic_disconnect_vcpu(vcpu->arch.mpic, vcpu);
+		break;
+	}
+
 	kvmppc_core_vcpu_free(vcpu);
 }
 
@@ -793,6 +804,25 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 		break;
 	}
 #endif
+#ifdef CONFIG_KVM_MPIC
+	case KVM_CAP_IRQ_MPIC: {
+		struct file *filp;
+		struct kvm_device *dev;
+
+		r = -EBADF;
+		filp = fget(cap->args[0]);
+		if (!filp)
+			break;
+
+		r = -EPERM;
+		dev = kvm_device_from_filp(filp);
+		if (dev)
+			r = kvmppc_mpic_connect_vcpu(dev, vcpu, cap->args[1]);
+
+		fput(filp);
+		break;
+	}
+#endif
 	default:
 		r = -EINVAL;
 		break;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 568d65d..8f95cac 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -667,6 +667,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_ARM_PSCI 87
 #define KVM_CAP_ARM_SET_DEVICE_ADDR 88
 #define KVM_CAP_DEVICE_CTRL 89
+#define KVM_CAP_IRQ_MPIC 90
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 14/17] kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC
@ 2013-04-19 14:06   ` Alexander Graf
  0 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

From: Scott Wood <scottwood@freescale.com>

Enabling this capability connects the vcpu to the designated in-kernel
MPIC.  Using explicit connections between vcpus and irqchips allows
for flexibility, but the main benefit at the moment is that it
simplifies the code -- KVM doesn't need vm-global state to remember
which MPIC object is associated with this vm, and it doesn't need to
care about ordering between irqchip creation and vcpu creation.

Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: add stub functions for kvmppc_mpic_{dis,}connect_vcpu]
Signed-off-by: Alexander Graf <agraf@suse.de>
---
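For readers who want to see the userspace side of this capability, here is
a minimal sketch of how a VMM might issue the request described in the
api.txt hunk below (args[0] is the MPIC device fd, args[1] is the MPIC CPU
number). The helper names mpic_cap_request() and connect_vcpu_to_mpic() are
hypothetical, and the sketch assumes the vcpu fd and the MPIC device fd
were already obtained elsewhere (KVM_CREATE_VCPU and KVM_CREATE_DEVICE).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

#ifndef KVM_CAP_IRQ_MPIC
#define KVM_CAP_IRQ_MPIC 90	/* value assigned by this patch */
#endif

/* Hypothetical helper: fill in the KVM_ENABLE_CAP request. */
static void mpic_cap_request(struct kvm_enable_cap *cap, int mpic_fd,
			     uint32_t mpic_cpu)
{
	memset(cap, 0, sizeof(*cap));
	cap->cap = KVM_CAP_IRQ_MPIC;
	cap->args[0] = (uint64_t)mpic_fd;	/* MPIC device fd */
	cap->args[1] = mpic_cpu;		/* MPIC CPU number for this vcpu */
}

/* Hypothetical helper: returns 0 on success or a negative errno value. */
static int connect_vcpu_to_mpic(int vcpu_fd, int mpic_fd, uint32_t mpic_cpu)
{
	struct kvm_enable_cap cap;

	assert(mpic_fd >= 0);
	mpic_cap_request(&cap, mpic_fd, mpic_cpu);

	/* KVM_ENABLE_CAP is a vcpu ioctl here, so it goes to the vcpu fd. */
	if (ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap) < 0)
		return -errno;
	return 0;
}
```

Because the connection is made per vcpu, a VMM would presumably call such a
helper once for every vcpu after creating the MPIC device, in whatever
order it creates the two.
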
 Documentation/virtual/kvm/api.txt   |    8 +++
 arch/powerpc/include/asm/kvm_host.h |    9 ++++
 arch/powerpc/include/asm/kvm_ppc.h  |   15 ++++++-
 arch/powerpc/kvm/booke.c            |    4 ++
 arch/powerpc/kvm/mpic.c             |   82 ++++++++++++++++++++++++++++++++---
 arch/powerpc/kvm/powerpc.c          |   30 +++++++++++++
 include/uapi/linux/kvm.h            |    1 +
 7 files changed, 141 insertions(+), 8 deletions(-)
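
The error handling in kvmppc_mpic_connect_vcpu() below can be summarised
with a small userspace model. This is only a sketch: the struct and
function names are made up for illustration, MODEL_MAX_CPU is a stand-in
for MAX_CPU (whose real value is not part of this patch), and the device
type check, the vm ownership check, locking, reference counting and the
EPR flag handling are all left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_CPU 32	/* stand-in for MAX_CPU, value assumed */

enum { IRQ_DEFAULT = 0, IRQ_MPIC = 1 };	/* mirrors KVMPPC_IRQ_* */

struct model_vcpu {
	int irq_type;
	int irq_cpu_id;
};

struct model_mpic {
	struct model_vcpu *dst[MODEL_MAX_CPU];	/* one slot per MPIC CPU */
	uint32_t nb_cpus;
};

static int model_connect(struct model_mpic *m, struct model_vcpu *v,
			 uint32_t cpu)
{
	if (cpu >= MODEL_MAX_CPU)
		return -EPERM;		/* CPU number out of range */
	if (m->dst[cpu])
		return -EEXIST;		/* slot already taken by a vcpu */
	if (v->irq_type != IRQ_DEFAULT)
		return -EBUSY;		/* vcpu already has an irqchip */

	m->dst[cpu] = v;
	if (m->nb_cpus < cpu + 1)
		m->nb_cpus = cpu + 1;
	v->irq_cpu_id = (int)cpu;
	v->irq_type = IRQ_MPIC;
	return 0;
}

static void model_disconnect(struct model_mpic *m, struct model_vcpu *v)
{
	/* The patch uses BUG_ON() for the same condition. */
	assert(m->dst[v->irq_cpu_id] != NULL);
	m->dst[v->irq_cpu_id] = NULL;
}
```

Note that the MPIC CPU number is independent of vcpu_id, which is why the
mpic.c hunks switch from vcpu->vcpu_id to vcpu->arch.irq_cpu_id.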

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index d52f3f9..4c326ae 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2728,3 +2728,11 @@ to receive the topmost interrupt vector.
 When disabled (args[0] = 0), behavior is as if this facility is unsupported.
 
 When this capability is enabled, KVM_EXIT_EPR can occur.
+
+6.6 KVM_CAP_IRQ_MPIC
+
+Architectures: ppc
+Parameters: args[0] is the MPIC device fd
+            args[1] is the MPIC CPU number for this vcpu
+
+This capability connects the vcpu to an in-kernel MPIC device.
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 7e7aef9..36368c9 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -375,6 +375,11 @@ struct kvmppc_booke_debug_reg {
 	u64 dac[KVMPPC_BOOKE_MAX_DAC];
 };
 
+#define KVMPPC_IRQ_DEFAULT	0
+#define KVMPPC_IRQ_MPIC		1
+
+struct openpic;
+
 struct kvm_vcpu_arch {
 	ulong host_stack;
 	u32 host_pid;
@@ -554,6 +559,10 @@ struct kvm_vcpu_arch {
 	unsigned long magic_page_pa; /* phys addr to map the magic page to */
 	unsigned long magic_page_ea; /* effect. addr to map the magic page to */
 
+	int irq_type;		/* one of KVM_IRQ_* */
+	int irq_cpu_id;
+	struct openpic *mpic;	/* KVM_IRQ_MPIC */
+
 #ifdef CONFIG_KVM_BOOK3S_64_HV
 	struct kvm_vcpu_arch_shared shregs;
 
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 0b86604..c9d9faf 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -248,7 +248,6 @@ int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id, union kvmppc_one_reg *);
 void kvmppc_set_pid(struct kvm_vcpu *vcpu, u32 pid);
 
 struct openpic;
-void kvmppc_mpic_put(struct openpic *opp);
 
 #ifdef CONFIG_KVM_BOOK3S_64_HV
 static inline void kvmppc_set_xics_phys(int cpu, unsigned long addr)
@@ -278,6 +277,9 @@ static inline void kvmppc_set_epr(struct kvm_vcpu *vcpu, u32 epr)
 #ifdef CONFIG_KVM_MPIC
 
 void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu);
+int kvmppc_mpic_connect_vcpu(struct kvm_device *dev, struct kvm_vcpu *vcpu,
+			     u32 cpu);
+void kvmppc_mpic_disconnect_vcpu(struct openpic *opp, struct kvm_vcpu *vcpu);
 
 #else
 
@@ -285,6 +287,17 @@ static inline void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu)
 {
 }
 
+static inline int kvmppc_mpic_connect_vcpu(struct kvm_device *dev,
+		struct kvm_vcpu *vcpu, u32 cpu)
+{
+	return -EINVAL;
+}
+
+static inline void kvmppc_mpic_disconnect_vcpu(struct openpic *opp,
+		struct kvm_vcpu *vcpu)
+{
+}
+
 #endif /* CONFIG_KVM_MPIC */
 
 int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index cff53d4..0097912 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -430,6 +430,10 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 		if (update_epr == true) {
 			if (vcpu->arch.epr_flags & KVMPPC_EPR_USER)
 				kvm_make_request(KVM_REQ_EPR_EXIT, vcpu);
+			else if (vcpu->arch.epr_flags & KVMPPC_EPR_KERNEL) {
+				BUG_ON(vcpu->arch.irq_type != KVMPPC_IRQ_MPIC);
+				kvmppc_mpic_set_epr(vcpu);
+			}
 		}
 
 		new_msr &= msr_mask;
diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index cb451b9..10bc08a 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -115,7 +115,7 @@ static int get_current_cpu(void)
 {
 #if defined(CONFIG_KVM) && defined(CONFIG_BOOKE)
 	struct kvm_vcpu *vcpu = current->thread.kvm_vcpu;
-	return vcpu ? vcpu->vcpu_id : -1;
+	return vcpu ? vcpu->arch.irq_cpu_id : -1;
 #else
 	/* XXX */
 	return -1;
@@ -249,7 +249,7 @@ static void mpic_irq_raise(struct openpic *opp, struct irq_dest *dst,
 		return;
 	}
 
-	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->vcpu_id,
+	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->arch.irq_cpu_id,
 		output);
 
 	if (output != ILR_INTTGT_INT)	/* TODO */
@@ -267,7 +267,7 @@ static void mpic_irq_lower(struct openpic *opp, struct irq_dest *dst,
 		return;
 	}
 
-	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->vcpu_id,
+	pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->arch.irq_cpu_id,
 		output);
 
 	if (output != ILR_INTTGT_INT)	/* TODO */
@@ -1165,6 +1165,20 @@ static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
 	return retval;
 }
 
+void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu)
+{
+	struct openpic *opp = vcpu->arch.mpic;
+	int cpu = vcpu->arch.irq_cpu_id;
+	unsigned long flags;
+
+	spin_lock_irqsave(&opp->lock, flags);
+
+	if ((opp->gcr & opp->mpic_mode_mask) == GCR_MODE_PROXY)
+		kvmppc_set_epr(vcpu, openpic_iack(opp, &opp->dst[cpu], cpu));
+
+	spin_unlock_irqrestore(&opp->lock, flags);
+}
+
 static int openpic_cpu_read_internal(void *opaque, gpa_t addr,
 				     u32 *ptr, int idx)
 {
@@ -1431,10 +1445,10 @@ static void map_mmio(struct openpic *opp)
 
 static void unmap_mmio(struct openpic *opp)
 {
-	BUG_ON(opp->mmio_mapped);
-	opp->mmio_mapped = false;
-
-	kvm_io_bus_unregister_dev(opp->kvm, KVM_MMIO_BUS, &opp->mmio);
+	if (opp->mmio_mapped) {
+		opp->mmio_mapped = false;
+		kvm_io_bus_unregister_dev(opp->kvm, KVM_MMIO_BUS, &opp->mmio);
+	}
 }
 
 static int set_base_addr(struct openpic *opp, struct kvm_device_attr *attr)
@@ -1693,3 +1707,57 @@ struct kvm_device_ops kvm_mpic_ops = {
 	.get_attr = mpic_get_attr,
 	.has_attr = mpic_has_attr,
 };
+
+int kvmppc_mpic_connect_vcpu(struct kvm_device *dev, struct kvm_vcpu *vcpu,
+			     u32 cpu)
+{
+	struct openpic *opp = dev->private;
+	int ret = 0;
+
+	if (dev->ops != &kvm_mpic_ops)
+		return -EPERM;
+	if (opp->kvm != vcpu->kvm)
+		return -EPERM;
+	if (cpu < 0 || cpu >= MAX_CPU)
+		return -EPERM;
+
+	spin_lock_irq(&opp->lock);
+
+	if (opp->dst[cpu].vcpu) {
+		ret = -EEXIST;
+		goto out;
+	}
+	if (vcpu->arch.irq_type) {
+		ret = -EBUSY;
+		goto out;
+	}
+
+	opp->dst[cpu].vcpu = vcpu;
+	opp->nb_cpus = max(opp->nb_cpus, cpu + 1);
+
+	vcpu->arch.mpic = opp;
+	vcpu->arch.irq_cpu_id = cpu;
+	vcpu->arch.irq_type = KVMPPC_IRQ_MPIC;
+
+	/* This might need to be changed if GCR gets extended */
+	if (opp->mpic_mode_mask == GCR_MODE_PROXY)
+		vcpu->arch.epr_flags |= KVMPPC_EPR_KERNEL;
+
+	kvm_device_get(dev);
+out:
+	spin_unlock_irq(&opp->lock);
+	return ret;
+}
+
+/*
+ * This should only happen immediately before the mpic is destroyed,
+ * so we shouldn't need to worry about anything still trying to
+ * access the vcpu pointer.
+ */
+void kvmppc_mpic_disconnect_vcpu(struct openpic *opp, struct kvm_vcpu *vcpu)
+{
+	BUG_ON(!opp->dst[vcpu->arch.irq_cpu_id].vcpu);
+
+	opp->dst[vcpu->arch.irq_cpu_id].vcpu = NULL;
+	kvm_device_put(opp->dev);
+}
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 3cad714..c431fea 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -25,6 +25,7 @@
 #include <linux/hrtimer.h>
 #include <linux/fs.h>
 #include <linux/slab.h>
+#include <linux/file.h>
 #include <asm/cputable.h>
 #include <asm/uaccess.h>
 #include <asm/kvm_ppc.h>
@@ -327,6 +328,9 @@ int kvm_dev_ioctl_check_extension(long ext)
 #if defined(CONFIG_KVM_E500V2) || defined(CONFIG_KVM_E500MC)
 	case KVM_CAP_SW_TLB:
 #endif
+#ifdef CONFIG_KVM_MPIC
+	case KVM_CAP_IRQ_MPIC:
+#endif
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
@@ -460,6 +464,13 @@ void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
 	tasklet_kill(&vcpu->arch.tasklet);
 
 	kvmppc_remove_vcpu_debugfs(vcpu);
+
+	switch (vcpu->arch.irq_type) {
+	case KVMPPC_IRQ_MPIC:
+		kvmppc_mpic_disconnect_vcpu(vcpu->arch.mpic, vcpu);
+		break;
+	}
+
 	kvmppc_core_vcpu_free(vcpu);
 }
 
@@ -793,6 +804,25 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 		break;
 	}
 #endif
+#ifdef CONFIG_KVM_MPIC
+	case KVM_CAP_IRQ_MPIC: {
+		struct file *filp;
+		struct kvm_device *dev;
+
+		r = -EBADF;
+		filp = fget(cap->args[0]);
+		if (!filp)
+			break;
+
+		r = -EPERM;
+		dev = kvm_device_from_filp(filp);
+		if (dev)
+			r = kvmppc_mpic_connect_vcpu(dev, vcpu, cap->args[1]);
+
+		fput(filp);
+		break;
+	}
+#endif
 	default:
 		r = -EINVAL;
 		break;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 568d65d..8f95cac 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -667,6 +667,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_ARM_PSCI 87
 #define KVM_CAP_ARM_SET_DEVICE_ADDR 88
 #define KVM_CAP_DEVICE_CTRL 89
+#define KVM_CAP_IRQ_MPIC 90
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 15/17] KVM: PPC: Support irq routing and irqfd for in-kernel MPIC
  2013-04-19 14:06 ` Alexander Graf
@ 2013-04-19 14:06   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

Now that all the irq routing and irqfd pieces are generic, we can expose
real irqchip support to all of KVM's internal helpers.

This allows us to use irqfd with the in-kernel MPIC.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v2 -> v3:

  - make mpic pointer type safe
  - add wmb before setting global mpic variable
  - send eoi notification with the openpic lock dropped
  - add IRQ routing documentation
  - announce mpic availability after its creation
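
The "wmb" and "announce after creation" items above amount to a
publish/subscribe pattern: mpic_create() finishes initialising the openpic,
issues smp_wmb() and only then stores the pointer in kvm->arch.mpic, while
irqchip_in_kernel() reads the pointer and issues smp_rmb(). As a rough
userspace analogue (not the kernel code), the same idea can be sketched
with C11 release/acquire atomics. The names struct chip, publish_chip(),
chip_available() and chip_nb_irqs() are invented for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct chip {
	int nb_irqs;
};

/* Plays the role of kvm->arch.mpic: NULL until the chip is ready. */
static _Atomic(struct chip *) published;

static void publish_chip(struct chip *c, int nb_irqs)
{
	assert(c != NULL);
	c->nb_irqs = nb_irqs;	/* initialise everything first */

	/* Release store: roughly smp_wmb() followed by the pointer store. */
	atomic_store_explicit(&published, c, memory_order_release);
}

static bool chip_available(void)
{
	/* Acquire load: roughly the pointer load followed by smp_rmb(). */
	return atomic_load_explicit(&published, memory_order_acquire) != NULL;
}

static int chip_nb_irqs(void)
{
	struct chip *c = atomic_load_explicit(&published,
					      memory_order_acquire);

	/* A reader that sees the pointer also sees the initialised fields. */
	return c ? c->nb_irqs : -1;
}
```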
---
 Documentation/virtual/kvm/devices/mpic.txt |   11 +++
 arch/powerpc/include/asm/kvm_host.h        |    7 ++
 arch/powerpc/include/uapi/asm/kvm.h        |    1 +
 arch/powerpc/kvm/Kconfig                   |    3 +
 arch/powerpc/kvm/Makefile                  |    1 +
 arch/powerpc/kvm/irq.h                     |   17 ++++
 arch/powerpc/kvm/mpic.c                    |  113 ++++++++++++++++++++++++++++
 7 files changed, 153 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/irq.h
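
To make the default routing concrete: mpic_set_default_irq_routing() wires
GSI n to pin n of irqchip 0, and kvm_set_routing_entry() rejects pins at or
above KVM_IRQCHIP_NUM_PINS. The following is a simplified userspace model
of that behaviour, not the kernel data structures. struct route,
default_routes() and install_route() are hypothetical names, and the
irqchip id check is an addition of this sketch (the
kvm_set_routing_entry() in this patch only checks the pin).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_PINS 256	/* mirrors KVM_IRQCHIP_NUM_PINS in this patch */

/* Simplified stand-in for struct kvm_irq_routing_entry. */
struct route {
	uint32_t gsi;
	uint32_t irqchip;
	uint32_t pin;
};

/* Build the default table: GSI i is wired to pin i of irqchip 0. */
static void default_routes(struct route *r, size_t n)
{
	assert(n <= NUM_PINS);
	for (size_t i = 0; i < n; i++) {
		r[i].gsi = (uint32_t)i;
		r[i].irqchip = 0;
		r[i].pin = (uint32_t)i;
	}
}

/* Record a route in the pin -> GSI lookup table, rejecting bad entries. */
static int install_route(uint32_t chip[NUM_PINS], const struct route *r)
{
	if (r->irqchip != 0)		/* sketch-only check, see above */
		return -EINVAL;
	if (r->pin >= NUM_PINS)
		return -EINVAL;
	chip[r->pin] = r->gsi;
	return 0;
}
```

With the identity table installed, an irqfd bound to GSI n raises SRC pin n
unless userspace later replaces the routing table.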

diff --git a/Documentation/virtual/kvm/devices/mpic.txt b/Documentation/virtual/kvm/devices/mpic.txt
index ce98e32..dadc1e0 100644
--- a/Documentation/virtual/kvm/devices/mpic.txt
+++ b/Documentation/virtual/kvm/devices/mpic.txt
@@ -35,3 +35,14 @@ Groups:
 
     "attr" is the IRQ number.  IRQ numbers for standard sources are the
     byte offset of the relevant IVPR from EIVPR0, divided by 32.
+
+IRQ Routing:
+
+  The MPIC emulation supports IRQ routing. Only a single MPIC device can
+  be instantiated. Once that device has been created, it's available as
+  irqchip id 0.
+
+  This irqchip 0 has 256 interrupt pins. These pins reflect the SRC pins
+  on the MPIC controller.
+
+  Access to non-SRC registers is not implemented through IRQ routing mechanisms.
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 36368c9..80f2004 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -44,6 +44,10 @@
 #define KVM_COALESCED_MMIO_PAGE_OFFSET 1
 #endif
 
+/* These values are internal and can be increased later */
+#define KVM_NR_IRQCHIPS          1
+#define KVM_IRQCHIP_NUM_PINS     256
+
 #if !defined(CONFIG_KVM_440)
 #include <linux/mmu_notifier.h>
 
@@ -256,6 +260,9 @@ struct kvm_arch {
 #ifdef CONFIG_PPC_BOOK3S_64
 	struct list_head spapr_tce_tables;
 #endif
+#ifdef CONFIG_KVM_MPIC
+	struct openpic *mpic;
+#endif
 };
 
 /*
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 36be2fe..3537bf3 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -25,6 +25,7 @@
 /* Select powerpc specific features in <linux/kvm.h> */
 #define __KVM_HAVE_SPAPR_TCE
 #define __KVM_HAVE_PPC_SMT
+#define __KVM_HAVE_IRQCHIP
 
 struct kvm_regs {
 	__u64 pc;
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 938a729..a608570 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -154,6 +154,9 @@ config KVM_E500MC
 config KVM_MPIC
 	bool "KVM in-kernel MPIC emulation"
 	depends on KVM
+	select HAVE_KVM_IRQCHIP
+	select HAVE_KVM_IRQ_ROUTING
+	select HAVE_KVM_MSI
 	help
 	  Enable support for emulating MPIC devices inside the
           host kernel, rather than relying on userspace to emulate.
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index 4a2277a..4eada0c 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -104,6 +104,7 @@ kvm-book3s_32-objs := \
 kvm-objs-$(CONFIG_KVM_BOOK3S_32) := $(kvm-book3s_32-objs)
 
 kvm-objs-$(CONFIG_KVM_MPIC) += mpic.o
+kvm-objs-$(CONFIG_HAVE_KVM_IRQ_ROUTING) += $(addprefix ../../../virt/kvm/, irqchip.o)
 
 kvm-objs := $(kvm-objs-m) $(kvm-objs-y)
 
diff --git a/arch/powerpc/kvm/irq.h b/arch/powerpc/kvm/irq.h
new file mode 100644
index 0000000..f1e27fd
--- /dev/null
+++ b/arch/powerpc/kvm/irq.h
@@ -0,0 +1,17 @@
+#ifndef __IRQ_H
+#define __IRQ_H
+
+#include <linux/kvm_host.h>
+
+static inline int irqchip_in_kernel(struct kvm *kvm)
+{
+	int ret = 0;
+
+#ifdef CONFIG_KVM_MPIC
+	ret = ret || (kvm->arch.mpic != NULL);
+#endif
+	smp_rmb();
+	return ret;
+}
+
+#endif
diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index 10bc08a..d137df8 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -1029,6 +1029,7 @@ static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
 	struct irq_source *src;
 	struct irq_dest *dst;
 	int s_IRQ, n_IRQ;
+	int notify_eoi = -1;
 
 	pr_debug("%s: cpu %d addr %#llx <= 0x%08x\n", __func__, idx,
 		addr, val);
@@ -1087,6 +1088,8 @@ static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
 		}
 
 		IRQ_resetbit(&dst->servicing, s_IRQ);
+		/* Notify listeners that the IRQ is over */
+		notify_eoi = s_IRQ;
 		/* Set up next servicing IRQ */
 		s_IRQ = IRQ_get_next(opp, &dst->servicing);
 		/* Check queued interrupts. */
@@ -1104,6 +1107,12 @@ static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
 		break;
 	}
 
+	if (notify_eoi != -1) {
+		spin_unlock_irq(&opp->lock);
+		kvm_notify_acked_irq(opp->kvm, 0, notify_eoi);
+		spin_lock_irq(&opp->lock);
+	}
+
 	return 0;
 }
 
@@ -1639,14 +1648,42 @@ static void mpic_destroy(struct kvm_device *dev)
 		unmap_mmio(opp);
 	}
 
+	dev->kvm->arch.mpic = NULL;
 	kfree(opp);
 }
 
+static int mpic_set_default_irq_routing(struct openpic *opp)
+{
+	int i;
+	struct kvm_irq_routing_entry *routing;
+
+	/* XXX be more dynamic if we ever want to support multiple MPIC chips */
+	routing = kzalloc((sizeof(*routing) * opp->nb_irqs), GFP_KERNEL);
+	if (!routing)
+		return -ENOMEM;
+
+	for (i = 0; i < opp->nb_irqs; i++) {
+		routing[i].gsi = i;
+		routing[i].type = KVM_IRQ_ROUTING_IRQCHIP;
+		routing[i].u.irqchip.irqchip = 0;
+		routing[i].u.irqchip.pin = i;
+	}
+
+	kvm_set_irq_routing(opp->kvm, routing, opp->nb_irqs, 0);
+
+	kfree(routing);
+	return 0;
+}
+
 static int mpic_create(struct kvm_device *dev, u32 type)
 {
 	struct openpic *opp;
 	int ret;
 
+	/* We only support one MPIC at a time for now */
+	if (dev->kvm->arch.mpic)
+		return -EINVAL;
+
 	opp = kzalloc(sizeof(struct openpic), GFP_KERNEL);
 	if (!opp)
 		return -ENOMEM;
@@ -1691,7 +1728,15 @@ static int mpic_create(struct kvm_device *dev, u32 type)
 		goto err;
 	}
 
+	ret = mpic_set_default_irq_routing(opp);
+	if (ret)
+		goto err;
+
 	openpic_reset(opp);
+
+	smp_wmb();
+	dev->kvm->arch.mpic = opp;
+
 	return 0;
 
 err:
@@ -1761,3 +1806,71 @@ void kvmppc_mpic_disconnect_vcpu(struct openpic *opp, struct kvm_vcpu *vcpu)
 	opp->dst[vcpu->arch.irq_cpu_id].vcpu = NULL;
 	kvm_device_put(opp->dev);
 }
+
+/*
+ * Return value:
+ *  < 0   Interrupt was ignored (masked or not delivered for other reasons)
+ *  = 0   Interrupt was coalesced (previous irq is still pending)
+ *  > 0   Number of CPUs interrupt was delivered to
+ */
+static int mpic_set_irq(struct kvm_kernel_irq_routing_entry *e,
+			struct kvm *kvm, int irq_source_id, int level,
+			bool line_status)
+{
+	u32 irq = e->irqchip.pin;
+	struct openpic *opp = kvm->arch.mpic;
+
+	spin_lock_irq(&opp->lock);
+	openpic_set_irq(opp, irq, level);
+	spin_unlock_irq(&opp->lock);
+
+	/* All code paths we care about don't check for the return value */
+	return 0;
+}
+
+int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
+		struct kvm *kvm, int irq_source_id, int level, bool line_status)
+{
+	struct openpic *opp = kvm->arch.mpic;
+	spin_lock_irq(&opp->lock);
+
+	/*
+	 * XXX We ignore the target address for now, as we only support
+	 *     a single MSI bank.
+	 */
+	openpic_msi_write(kvm->arch.mpic, MSIIR_OFFSET, e->msi.data);
+	spin_unlock_irq(&opp->lock);
+
+	/* All code paths we care about don't check for the return value */
+	return 0;
+}
+
+int kvm_set_routing_entry(struct kvm_irq_routing_table *rt,
+			  struct kvm_kernel_irq_routing_entry *e,
+			  const struct kvm_irq_routing_entry *ue)
+{
+	int r = -EINVAL;
+
+	switch (ue->type) {
+	case KVM_IRQ_ROUTING_IRQCHIP:
+		e->set = mpic_set_irq;
+		e->irqchip.irqchip = ue->u.irqchip.irqchip;
+		e->irqchip.pin = ue->u.irqchip.pin;
+		if (e->irqchip.pin >= KVM_IRQCHIP_NUM_PINS)
+			goto out;
+		rt->chip[ue->u.irqchip.irqchip][e->irqchip.pin] = ue->gsi;
+		break;
+	case KVM_IRQ_ROUTING_MSI:
+		e->set = kvm_set_msi;
+		e->msi.address_lo = ue->u.msi.address_lo;
+		e->msi.address_hi = ue->u.msi.address_hi;
+		e->msi.data = ue->u.msi.data;
+		break;
+	default:
+		goto out;
+	}
+
+	r = 0;
+out:
+	return r;
+}
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 16/17] KVM: PPC: MPIC: Add support for KVM_IRQ_LINE
  2013-04-19 14:06 ` Alexander Graf
@ 2013-04-19 14:06   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

Now that all pieces are in place for reusing generic irq infrastructure,
we can copy x86's implementation of KVM_IRQ_LINE irq injection and simply
reuse it for PPC, as it will work there just as well.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/uapi/asm/kvm.h |    1 +
 arch/powerpc/kvm/powerpc.c          |   13 +++++++++++++
 2 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 3537bf3..dbb2ac2 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -26,6 +26,7 @@
 #define __KVM_HAVE_SPAPR_TCE
 #define __KVM_HAVE_PPC_SMT
 #define __KVM_HAVE_IRQCHIP
+#define __KVM_HAVE_IRQ_LINE
 
 struct kvm_regs {
 	__u64 pc;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index c431fea..874c106 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -33,6 +33,7 @@
 #include <asm/cputhreads.h>
 #include <asm/irqflags.h>
 #include "timing.h"
+#include "irq.h"
 #include "../mm/mmu_decl.h"
 
 #define CREATE_TRACE_POINTS
@@ -945,6 +946,18 @@ static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
 	return 0;
 }
 
+int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event,
+			  bool line_status)
+{
+	if (!irqchip_in_kernel(kvm))
+		return -ENXIO;
+
+	irq_event->status = kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID,
+					irq_event->irq, irq_event->level,
+					line_status);
+	return 0;
+}
+
 long kvm_arch_vm_ioctl(struct file *filp,
                        unsigned int ioctl, unsigned long arg)
 {
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 128+ messages in thread

* [PATCH 17/17] KVM: PPC: MPIC: Restrict to e500 platforms
  2013-04-19 14:06 ` Alexander Graf
@ 2013-04-19 14:06   ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-19 14:06 UTC (permalink / raw)
  To: kvm-ppc
  Cc: kvm@vger.kernel.org mailing list, Scott Wood, Marcelo Tosatti,
	Gleb Natapov

The code as is doesn't make any sense on non-e500 platforms. Restrict it
there, so that people don't get wrong ideas on what would actually work.

This patch should get reverted as soon as it's possible to either run e500
guests on non-e500 hosts or the MPIC emulation gains support for non-e500
modes.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/Kconfig |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index a608570..e88b1da 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -153,7 +153,7 @@ config KVM_E500MC
 
 config KVM_MPIC
 	bool "KVM in-kernel MPIC emulation"
-	depends on KVM
+	depends on KVM && E500
 	select HAVE_KVM_IRQCHIP
 	select HAVE_KVM_IRQ_ROUTING
 	select HAVE_KVM_MSI
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 128+ messages in thread

* Re: [PATCH 15/17] KVM: PPC: Support irq routing and irqfd for in-kernel MPIC
  2013-04-19 14:06   ` Alexander Graf
@ 2013-04-19 18:02     ` Scott Wood
  -1 siblings, 0 replies; 128+ messages in thread
From: Scott Wood @ 2013-04-19 18:02 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Marcelo Tosatti, Gleb Natapov

On 04/19/2013 09:06:26 AM, Alexander Graf wrote:
> diff --git a/Documentation/virtual/kvm/devices/mpic.txt b/Documentation/virtual/kvm/devices/mpic.txt
> index ce98e32..dadc1e0 100644
> --- a/Documentation/virtual/kvm/devices/mpic.txt
> +++ b/Documentation/virtual/kvm/devices/mpic.txt
> @@ -35,3 +35,14 @@ Groups:
> 
>      "attr" is the IRQ number.  IRQ numbers for standard sources are the
>      byte offset of the relevant IVPR from EIVPR0, divided by 32.
> +
> +IRQ Routing:
> +
> +  The MPIC emulation supports IRQ routing. Only a single MPIC device can
> +  be instantiated. Once that device has been created, it's available as
> +  irqchip id 0.
> +

> +  This irqchip 0 has 256 interrupt pins. These pins reflect the SRC pins
> +  on the MPIC controller.

This irqchip 0 has 256 interrupt pins, which expose the interrupts in  
the main array of interrupt sources (a.k.a. "SRC" interrupts).  The  
numbering is the same as the MPIC device tree binding -- based on the  
register offset from the beginning of the sources array, without regard  
to any subdivisions in chip documentation such as "internal" or  
"external" interrupts.  Default routes are established for these pins,  
with the GSI being equal to the pin number.

> +  Access to on-SRC registers is not implemented through IRQ routing mechanisms.

s/on-SRC registers/non-SRC interrupts/

> diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
> index 10bc08a..d137df8 100644
> --- a/arch/powerpc/kvm/mpic.c
> +++ b/arch/powerpc/kvm/mpic.c
> @@ -1029,6 +1029,7 @@ static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
>  	struct irq_source *src;
>  	struct irq_dest *dst;
>  	int s_IRQ, n_IRQ;
> +	int notify_eoi = -1;
> 
>  	pr_debug("%s: cpu %d addr %#llx <= 0x%08x\n", __func__, idx,
>  		addr, val);
> @@ -1087,6 +1088,8 @@ static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
>  		}
> 
>  		IRQ_resetbit(&dst->servicing, s_IRQ);
> +		/* Notify listeners that the IRQ is over */
> +		notify_eoi = s_IRQ;
>  		/* Set up next servicing IRQ */
>  		s_IRQ = IRQ_get_next(opp, &dst->servicing);
>  		/* Check queued interrupts. */
> @@ -1104,6 +1107,12 @@ static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
>  		break;
>  	}
> 
> +	if (notify_eoi != -1) {
> +		spin_unlock_irq(&opp->lock);
> +		kvm_notify_acked_irq(opp->kvm, 0, notify_eoi);
> +		spin_lock_irq(&opp->lock);
> +	}

I'd rather not have the "_irq" here, which could break if we enter this
path via an "_irqsave" (I realize there currently is no such path that
reaches EOI emulation).

Will we ever set notify_eoi when addr != EOI?  I'm wondering why it was  
moved out of the switch statement, instead of being put at the end of  
the case EOI: code.

> +/*
> + * Return value:
> + *  < 0   Interrupt was ignored (masked or not delivered for other reasons)
> + *  = 0   Interrupt was coalesced (previous irq is still pending)
> + *  > 0   Number of CPUs interrupt was delivered to
> + */
> +static int mpic_set_irq(struct kvm_kernel_irq_routing_entry *e,
> +			struct kvm *kvm, int irq_source_id, int level,
> +			bool line_status)
> +{
> +	u32 irq = e->irqchip.pin;
> +	struct openpic *opp = kvm->arch.mpic;
> +
> +	spin_lock_irq(&opp->lock);
> +	openpic_set_irq(opp, irq, level);
> +	spin_unlock_irq(&opp->lock);

Use irqsave here and in kvm_set_msi.  The latter can already be called  
with interrupts disabled, and we may want to do the same for non-MSIs  
once we start assigning non-PCI devices (where there's no longer the  
excuse of "if you want it to be fast, use MSIs").
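
The difference between the two lock variants can be sketched in a small
userspace model. This is only an illustration under stated assumptions: a
single boolean stands in for the CPU's interrupt-enable state, no real
locking happens, and every model_* name is hypothetical. The kernel
primitives being modelled are spin_lock_irq()/spin_unlock_irq() and
spin_lock_irqsave()/spin_unlock_irqrestore().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one flag stands in for "interrupts enabled". */
static bool model_irqs_enabled = true;

/* Models spin_lock_irq(): unconditionally disables interrupts. */
static void model_lock_irq(void)
{
	model_irqs_enabled = false;
}

/* Models spin_unlock_irq(): unconditionally re-enables interrupts. */
static void model_unlock_irq(void)
{
	model_irqs_enabled = true;
}

/* Models spin_lock_irqsave(): remembers the previous state first. */
static void model_lock_irqsave(bool *flags)
{
	*flags = model_irqs_enabled;
	model_irqs_enabled = false;
}

/* Models spin_unlock_irqrestore(): puts the remembered state back. */
static void model_unlock_irqrestore(bool flags)
{
	model_irqs_enabled = flags;
}

/* Injection path using the _irq variants, as in the quoted patch. */
static void model_set_irq_plain(void)
{
	model_lock_irq();
	/* ... openpic_set_irq() would run here ... */
	model_unlock_irq();
}

/* Injection path using the _irqsave variants, as suggested above. */
static void model_set_irq_saved(void)
{
	bool flags;

	model_lock_irqsave(&flags);
	/* ... openpic_set_irq() would run here ... */
	model_unlock_irqrestore(flags);
}

/*
 * Calls inject() from a context that already has interrupts disabled
 * and reports whether they are still disabled when it returns.
 */
static bool model_caller_stays_disabled(void (*inject)(void))
{
	bool still_disabled;

	model_irqs_enabled = false;
	inject();
	still_disabled = !model_irqs_enabled;
	model_irqs_enabled = true;	/* reset the model */
	return still_disabled;
}
```

In this model, model_caller_stays_disabled(model_set_irq_plain) reports
false (the caller's interrupts got re-enabled behind its back), while
model_caller_stays_disabled(model_set_irq_saved) reports true.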

-Scott

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 16/17] KVM: PPC: MPIC: Add support for KVM_IRQ_LINE
  2013-04-19 14:06   ` Alexander Graf
@ 2013-04-19 18:51     ` Scott Wood
  -1 siblings, 0 replies; 128+ messages in thread
From: Scott Wood @ 2013-04-19 18:51 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Marcelo Tosatti, Gleb Natapov

On 04/19/2013 09:06:27 AM, Alexander Graf wrote:
> Now that all pieces are in place for reusing generic irq infrastructure,
> we can copy x86's implementation of KVM_IRQ_LINE irq injection and simply
> reuse it for PPC, as it will work there just as well.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  arch/powerpc/include/uapi/asm/kvm.h |    1 +
>  arch/powerpc/kvm/powerpc.c          |   13 +++++++++++++
>  2 files changed, 14 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
> index 3537bf3..dbb2ac2 100644
> --- a/arch/powerpc/include/uapi/asm/kvm.h
> +++ b/arch/powerpc/include/uapi/asm/kvm.h
> @@ -26,6 +26,7 @@
>  #define __KVM_HAVE_SPAPR_TCE
>  #define __KVM_HAVE_PPC_SMT
>  #define __KVM_HAVE_IRQCHIP
> +#define __KVM_HAVE_IRQ_LINE
> 
>  struct kvm_regs {
>  	__u64 pc;
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index c431fea..874c106 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -33,6 +33,7 @@
>  #include <asm/cputhreads.h>
>  #include <asm/irqflags.h>
>  #include "timing.h"
> +#include "irq.h"
>  #include "../mm/mmu_decl.h"
> 
>  #define CREATE_TRACE_POINTS
> @@ -945,6 +946,18 @@ static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
>  	return 0;
>  }
> 
> +int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event,
> +			  bool line_status)
> +{
> +	if (!irqchip_in_kernel(kvm))
> +		return -ENXIO;
> +
> +	irq_event->status = kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID,
> +					irq_event->irq, irq_event->level,
> +					line_status);
> +	return 0;
> +}

As Paul noted in the XICS patchset, this could reference an MPIC that  
has gone away if the user never attached any vcpus and then closed the  
MPIC fd.  It's not a reasonable use case, but it could be used  
maliciously to get the kernel to access a bad pointer.  The  
irqchip_in_kernel check helps somewhat, but it's meant for ensuring  
that the creation has happened -- it's racy if used for ensuring that  
destruction hasn't happened.

The problem is rooted in the awkwardness of performing an operation  
that logically should be on the MPIC fd, but is instead being done on  
the vm fd.

I think these three steps would fix it (the first two seem like things  
we should be doing anyway):
- During MPIC destruction, make sure MPIC deregisters all routes that  
reference it.
- In kvm_set_irq(), do not release the RCU read lock until after the  
set() function has been called.
- Do not hook up kvm_send_userspace_msi() to MPIC or other new  
irqchips, as that bypasses the RCU lock.  It could be supported as a  
device fd ioctl if desired, or it could be reworked to operate on an  
RCU-managed list of MSI handlers, though MPIC really doesn't need this  
at all.
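
The second step can be illustrated with a single-threaded userspace model.
This is a sketch under loud assumptions, not kernel code: a reader counter
plus a deferred-free slot stand in for an RCU read-side section and its
grace period, the racing destruction is simulated by calling
model_destroy() at the critical point, and all model_* names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device reachable through a routing entry. */
struct model_dev {
	bool alive;	/* false once the memory would have been freed */
	int delivered;	/* interrupts delivered while still alive */
};

static int model_readers;		/* open read-side sections */
static struct model_dev *model_deferred;	/* device waiting to be freed */

static void model_read_lock(void)
{
	model_readers++;
}

static void model_read_unlock(void)
{
	assert(model_readers > 0);
	if (--model_readers == 0 && model_deferred) {
		/* Last reader left: the deferred free may now happen. */
		model_deferred->alive = false;
		model_deferred = NULL;
	}
}

/* Destruction waits for readers, like freeing after a grace period. */
static void model_destroy(struct model_dev *dev)
{
	if (model_readers > 0)
		model_deferred = dev;
	else
		dev->alive = false;
}

/* Stand-in for the routing entry's set() callback. */
static int model_set(struct model_dev *dev)
{
	if (!dev->alive)
		return -1;	/* would be a use-after-free in the kernel */
	dev->delivered++;
	return 1;
}

/* set() is called only after the read-side section has been left. */
static int model_deliver_unlock_first(struct model_dev *dev)
{
	model_read_lock();
	/* ... the route lookup would happen here ... */
	model_read_unlock();
	model_destroy(dev);	/* simulated racing destruction */
	return model_set(dev);
}

/* set() is called before the read-side section is left. */
static int model_deliver_set_first(struct model_dev *dev)
{
	int ret;

	model_read_lock();
	model_destroy(dev);	/* same race, but the free is deferred */
	ret = model_set(dev);
	model_read_unlock();
	return ret;
}
```

In the model, model_deliver_unlock_first() ends up calling set() on a dead
device, while model_deliver_set_first() delivers the interrupt and the
device is only released once the read-side section ends. This only helps
if destruction really goes through the deferred path, which is what the
first step is for.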

-Scott

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 15/17] KVM: PPC: Support irq routing and irqfd for in-kernel MPIC
  2013-04-19  0:15       ` Alexander Graf
@ 2013-04-22 23:31         ` Scott Wood
  -1 siblings, 0 replies; 128+ messages in thread
From: Scott Wood @ 2013-04-22 23:31 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Marcelo Tosatti, Gleb Natapov

On 04/18/2013 07:15:46 PM, Alexander Graf wrote:
> 
> On 18.04.2013, at 23:39, Scott Wood wrote:
> 
> > Do we really want any default routes?  There's no platform notion of GSI
> > here, so how is userspace to know how the kernel set it up (or what GSIs
> > are free to be used for new routes) -- other than the "read the code"
> > answer I got when I asked about x86?  :-P
> 
> The "default routes" really are just "expose all pins 1:1 as GSI". I
> think it makes sense to have a simple default for user space that
> doesn't want to mess with irq routing.
> 
> What GSIs are free for new routes doesn't matter. Routes are always
> completely rewritten as a whole from user space. So when user space
> goes in and wants to change only a single line it has to lay out the
> full map itself anyway.

It looks like you already write the routes in your QEMU patches, so I'd  
like to avoid adding MPIC default routes in KVM to keep things simple.   
It's legacy baggage from day one.  With default routes, what happens if  
we later support instantiating multiple interrupt controllers?
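
For reference, the "1:1" layout and the "rewrite the whole table" rule
discussed above can be sketched as follows. This is a userspace
illustration only: struct model_route is a hypothetical, simplified
stand-in for struct kvm_irq_routing_entry, MODEL_NUM_PINS assumes the 256
pins described for irqchip 0, and no ioctl is issued.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NUM_PINS 256	/* assumed pin count of irqchip 0 */

/* Hypothetical, simplified stand-in for struct kvm_irq_routing_entry. */
struct model_route {
	uint32_t gsi;
	uint32_t irqchip;
	uint32_t pin;
};

/*
 * Fill the table with the identity map (GSI n -> irqchip 0, pin n).
 * Returns the number of entries written, at most MODEL_NUM_PINS.
 */
static size_t model_build_identity_routes(struct model_route *table,
					  size_t cap)
{
	size_t n = cap < MODEL_NUM_PINS ? cap : MODEL_NUM_PINS;
	size_t i;

	assert(table != NULL);
	for (i = 0; i < n; i++) {
		table[i].gsi = (uint32_t)i;
		table[i].irqchip = 0;
		table[i].pin = (uint32_t)i;
	}
	return n;
}

/*
 * Point one GSI at a different pin.  Since the routing table is always
 * replaced as a whole, the caller still has to submit all n entries
 * afterwards, not just the changed one.
 * Returns 0 on success, -1 if the GSI is unknown or the pin is invalid.
 */
static int model_retarget_gsi(struct model_route *table, size_t n,
			      uint32_t gsi, uint32_t new_pin)
{
	size_t i;

	assert(table != NULL);
	if (new_pin >= MODEL_NUM_PINS)
		return -1;
	for (i = 0; i < n; i++) {
		if (table[i].gsi == gsi) {
			table[i].pin = new_pin;
			return 0;
		}
	}
	return -1;
}
```

A caller that wants to change a single line would build the full table,
call model_retarget_gsi() on it, and then hand the complete table to the
kernel in one go.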

-Scott

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 15/17] KVM: PPC: Support irq routing and irqfd for in-kernel MPIC
  2013-04-19 14:06   ` Alexander Graf
@ 2013-04-23  6:38     ` Paul Mackerras
  -1 siblings, 0 replies; 128+ messages in thread
From: Paul Mackerras @ 2013-04-23  6:38 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Scott Wood,
	Marcelo Tosatti, Gleb Natapov

On Fri, Apr 19, 2013 at 04:06:26PM +0200, Alexander Graf wrote:
> Now that all the irq routing and irqfd pieces are generic, we can expose
> real irqchip support to all of KVM's internal helpers.
> 
> This allows us to use irqfd with the in-kernel MPIC.

[snip]
> diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
> index 10bc08a..d137df8 100644
> --- a/arch/powerpc/kvm/mpic.c
> +++ b/arch/powerpc/kvm/mpic.c
[snip]
> +int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
> +		struct kvm *kvm, int irq_source_id, int level, bool line_status)
[snip]
> +int kvm_set_routing_entry(struct kvm_irq_routing_table *rt,
> +			  struct kvm_kernel_irq_routing_entry *e,
> +			  const struct kvm_irq_routing_entry *ue)

How do you see this working once we have more than one interrupt
controller emulation in the kernel?  Presumably these two will have to
move out to a common file, rather than being in mpic.c, but then the
question is how do we know which interrupt controller to send the GSI
to?  Were you thinking we would have a restriction that you can only
instantiate one interrupt controller of any type?  Or were you
thinking we would have an enum for kvm_irq_routing_irqchip::irqchip?
In that case how would we handle MSIs?

Paul.

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 15/17] KVM: PPC: Support irq routing and irqfd for in-kernel MPIC
  2013-04-19 18:02     ` Scott Wood
@ 2013-04-25  9:58       ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-25  9:58 UTC (permalink / raw)
  To: Scott Wood
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Marcelo Tosatti, Gleb Natapov


On 19.04.2013, at 20:02, Scott Wood wrote:

> On 04/19/2013 09:06:26 AM, Alexander Graf wrote:
>> diff --git a/Documentation/virtual/kvm/devices/mpic.txt b/Documentation/virtual/kvm/devices/mpic.txt
>> index ce98e32..dadc1e0 100644
>> --- a/Documentation/virtual/kvm/devices/mpic.txt
>> +++ b/Documentation/virtual/kvm/devices/mpic.txt
>> @@ -35,3 +35,14 @@ Groups:
>>     "attr" is the IRQ number.  IRQ numbers for standard sources are the
>>     byte offset of the relevant IVPR from EIVPR0, divided by 32.
>> +
>> +IRQ Routing:
>> +
>> +  The MPIC emulation supports IRQ routing. Only a single MPIC device can
>> +  be instantiated. Once that device has been created, it's available as
>> +  irqchip id 0.
>> +
> 
>> +  This irqchip 0 has 256 interrupt pins. These pins reflect the SRC pins
>> +  on the MPIC controller.
> 
> This irqchip 0 has 256 interrupt pins, which expose the interrupts in the main array of interrupt sources (a.k.a. "SRC" interrupts).  The numbering is the same as the MPIC device tree binding -- based on the register offset from the beginning of the sources array, without regard to any subdivisions in chip documentation such as "internal" or "external" interrupts.  Default routes are established for these pins, with the GSI being equal to the pin number.
> 
>> +  Access to on-SRC registers is not implemented through IRQ routing mechanisms.
> 
> s/on-SRC registers/non-SRC interrupts/
> 
>> diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
>> index 10bc08a..d137df8 100644
>> --- a/arch/powerpc/kvm/mpic.c
>> +++ b/arch/powerpc/kvm/mpic.c
>> @@ -1029,6 +1029,7 @@ static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
>> 	struct irq_source *src;
>> 	struct irq_dest *dst;
>> 	int s_IRQ, n_IRQ;
>> +	int notify_eoi = -1;
>> 	pr_debug("%s: cpu %d addr %#llx <= 0x%08x\n", __func__, idx,
>> 		addr, val);
>> @@ -1087,6 +1088,8 @@ static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
>> 		}
>> 		IRQ_resetbit(&dst->servicing, s_IRQ);
>> +		/* Notify listeners that the IRQ is over */
>> +		notify_eoi = s_IRQ;
>> 		/* Set up next servicing IRQ */
>> 		s_IRQ = IRQ_get_next(opp, &dst->servicing);
>> 		/* Check queued interrupts. */
>> @@ -1104,6 +1107,12 @@ static int openpic_cpu_write_internal(void *opaque, gpa_t addr,
>> 		break;
>> 	}
>> +	if (notify_eoi != -1) {
>> +		spin_unlock_irq(&opp->lock);
>> +		kvm_notify_acked_irq(opp->kvm, 0, notify_eoi);
>> +		spin_lock_irq(&opp->lock);
>> +	}
> 
> I'd rather not have the "_irq" here, which could break if we enter this path via an "_irqsave" (I realize there currently is no such path that reaches EOI emulation).
> 
> Will we ever set notify_eoi when addr != EOI?  I'm wondering why it was moved out of the switch statement, instead of being put at the end of the case EOI: code.

I doubt it, but that's for the compiler to optimize away. I found it cleaner for some reason to put it down there. I don't think it really matters.
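
For readers following along, the pattern in the hunk (remember the IRQ while holding the lock, notify only after dropping it) can be sketched in plain C. This is a user-space model under stated assumptions, not the mpic.c code: struct model_pic, model_eoi() and model_acked() are hypothetical names, and a plain flag stands in for opp->lock.

```c
#include <assert.h>

/* Hypothetical stand-in for the MPIC state; 'locked' models opp->lock. */
struct model_pic {
	int locked;
	int servicing;		/* IRQ currently in service, or -1 */
	int last_notified;	/* last IRQ reported to the listener */
	void (*acked)(struct model_pic *pic, int irq);
};

/* Example listener: it must never run with the lock held. */
static void model_acked(struct model_pic *pic, int irq)
{
	assert(!pic->locked);
	pic->last_notified = irq;
}

/* Called with the lock held; returns with the lock held. */
static void model_eoi(struct model_pic *pic)
{
	int notify_eoi = -1;

	assert(pic->locked);
	if (pic->servicing >= 0) {
		notify_eoi = pic->servicing;	/* remember, do not notify yet */
		pic->servicing = -1;
	}

	if (notify_eoi != -1) {
		pic->locked = 0;		/* drop the lock around the callback */
		pic->acked(pic, notify_eoi);
		pic->locked = 1;		/* retake it before returning */
	}
}
```

Whether the notification sits at the end of the function or at the end of the EOI case, the point is the same: the listener runs outside the lock.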


Alex

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 15/17] KVM: PPC: Support irq routing and irqfd for in-kernel MPIC
  2013-04-23  6:38     ` Paul Mackerras
@ 2013-04-25 10:02       ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-25 10:02 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Scott Wood,
	Marcelo Tosatti, Gleb Natapov


On 23.04.2013, at 08:38, Paul Mackerras wrote:

> On Fri, Apr 19, 2013 at 04:06:26PM +0200, Alexander Graf wrote:
>> Now that all the irq routing and irqfd pieces are generic, we can expose
>> real irqchip support to all of KVM's internal helpers.
>> 
>> This allows us to use irqfd with the in-kernel MPIC.
> 
> [snip]
>> diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
>> index 10bc08a..d137df8 100644
>> --- a/arch/powerpc/kvm/mpic.c
>> +++ b/arch/powerpc/kvm/mpic.c
> [snip]
>> +int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
>> +		struct kvm *kvm, int irq_source_id, int level, bool line_status)
> [snip]
>> +int kvm_set_routing_entry(struct kvm_irq_routing_table *rt,
>> +			  struct kvm_kernel_irq_routing_entry *e,
>> +			  const struct kvm_irq_routing_entry *ue)
> 
> How do you see this working once we have more than one interrupt
> controller emulation in the kernel?  Presumably these two will have to
> move out to a common file, rather than being in mpic.c, but then the
> question is how do we know which interrupt controller to send the GSI
> to?  Were you thinking we would have a restriction that you can only
> instantiate one interrupt controller of any type?  Or were you
> thinking we would have an enum for kvm_irq_routing_irqchip::irqchip?
> In that case how would we handle MSIs?

In a first version of having 2 interrupt controllers, I'd make them mutually exclusive in Kconfig. That way each interrupt controller implements these functions itself.

Later we can sit down and generalize this support. Then we would need a mapping table that records which irqchip type each irqchip number is, and call the respective functions through it.

But the use case for that is so slim, and the user space API would stay the same anyway, that I don't think we need to worry about it today.
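
A minimal sketch of what such a mapping table might look like, in plain C. Nothing here exists in the patch set: the enum, the ops struct and every model_* name are hypothetical, and the two set_irq functions only return a tag so the dispatch is visible.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_IRQCHIPS 3	/* assumed upper bound on irqchip numbers */

enum model_irqchip_type { MODEL_CHIP_NONE, MODEL_CHIP_MPIC, MODEL_CHIP_OTHER };

struct model_irqchip_ops {
	int (*set_irq)(unsigned int pin, int level);
};

/* Each controller type supplies its own handler; the return value is a tag. */
static int model_mpic_set_irq(unsigned int pin, int level)
{
	(void)pin; (void)level;
	return 1;
}

static int model_other_set_irq(unsigned int pin, int level)
{
	(void)pin; (void)level;
	return 2;
}

static const struct model_irqchip_ops model_ops[] = {
	[MODEL_CHIP_NONE]  = { NULL },
	[MODEL_CHIP_MPIC]  = { model_mpic_set_irq },
	[MODEL_CHIP_OTHER] = { model_other_set_irq },
};

/* Which type each irqchip number is; filled in when a controller is created. */
static enum model_irqchip_type model_chip_type[MODEL_NR_IRQCHIPS];

/* Look up the type of the addressed irqchip and call its handler. */
static int model_route_irq(unsigned int irqchip, unsigned int pin, int level)
{
	const struct model_irqchip_ops *ops;

	if (irqchip >= MODEL_NR_IRQCHIPS)
		return -1;
	ops = &model_ops[model_chip_type[irqchip]];
	if (!ops->set_irq)
		return -1;
	return ops->set_irq(pin, level);
}
```

User space would keep addressing controllers by irqchip number; only the in-kernel lookup changes.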


Alex

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 01/17] KVM: Add KVM_IRQCHIP_NUM_PINS in addition to KVM_IOAPIC_NUM_PINS
  2013-04-19 14:06   ` Alexander Graf
@ 2013-04-25 10:18     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 128+ messages in thread
From: Michael S. Tsirkin @ 2013-04-25 10:18 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Scott Wood,
	Marcelo Tosatti, Gleb Natapov

On Fri, Apr 19, 2013 at 04:06:12PM +0200, Alexander Graf wrote:
> The concept of routing interrupt lines to an irqchip is nothing
> that is IOAPIC specific. Every irqchip has a maximum number of pins
> that can be linked to irq lines.
> 
> So let's add a new define that allows us to reuse generic code for
> non-IOAPIC platforms.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  arch/x86/include/asm/kvm_host.h |    2 ++
>  include/linux/kvm_host.h        |    2 +-
>  virt/kvm/irq_comm.c             |    2 +-
>  3 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 599f98b..f44c3fe 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -43,6 +43,8 @@
>  #define KVM_PIO_PAGE_OFFSET 1
>  #define KVM_COALESCED_MMIO_PAGE_OFFSET 2
>  
> +#define KVM_IRQCHIP_NUM_PINS  KVM_IOAPIC_NUM_PINS
> +
>  #define CR0_RESERVED_BITS                                               \
>  	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
>  			  | X86_CR0_ET | X86_CR0_NE | X86_CR0_WP | X86_CR0_AM \
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 93a5005..bf3b1dc 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -307,7 +307,7 @@ struct kvm_kernel_irq_routing_entry {
>  #ifdef __KVM_HAVE_IOAPIC
>  
>  struct kvm_irq_routing_table {
> -	int chip[KVM_NR_IRQCHIPS][KVM_IOAPIC_NUM_PINS];
> +	int chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS];
>  	struct kvm_kernel_irq_routing_entry *rt_entries;
>  	u32 nr_rt_entries;
>  	/*
> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> index 25ab480..7c0071d 100644
> --- a/virt/kvm/irq_comm.c
> +++ b/virt/kvm/irq_comm.c
> @@ -480,7 +480,7 @@ int kvm_set_irq_routing(struct kvm *kvm,
>  
>  	new->nr_rt_entries = nr_rt_entries;
>  	for (i = 0; i < 3; i++)
> -		for (j = 0; j < KVM_IOAPIC_NUM_PINS; j++)
> +		for (j = 0; j < KVM_IRQCHIP_NUM_PINS; j++)
>  			new->chip[i][j] = -1;
>  
>  	for (i = 0; i < nr; ++i) {
> -- 
> 1.6.0.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
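
As an illustration of why a generic pin-count define helps, here is a small user-space model of the routing table reset. It is a sketch, not kernel code: the model_* names are hypothetical, the 256-pin value is taken from the MPIC documentation text elsewhere in this thread, and the loop bounds use the defines rather than the literal 3 seen in the quoted hunk.

```c
#include <assert.h>

#define MODEL_NR_IRQCHIPS	1	/* assumed: a single MPIC */
#define MODEL_IRQCHIP_NUM_PINS	256	/* assumed: MPIC pin count */

struct model_routing_table {
	int chip[MODEL_NR_IRQCHIPS][MODEL_IRQCHIP_NUM_PINS];
};

/* Mark every (irqchip, pin) pair as having no GSI routed to it. */
static void model_routing_reset(struct model_routing_table *rt)
{
	int i, j;

	for (i = 0; i < MODEL_NR_IRQCHIPS; i++)
		for (j = 0; j < MODEL_IRQCHIP_NUM_PINS; j++)
			rt->chip[i][j] = -1;
}

/* Record that 'gsi' is wired to 'pin' of 'irqchip'; reject out-of-range input. */
static int model_routing_add(struct model_routing_table *rt,
			     unsigned int irqchip, unsigned int pin, int gsi)
{
	if (irqchip >= MODEL_NR_IRQCHIPS || pin >= MODEL_IRQCHIP_NUM_PINS)
		return -1;
	rt->chip[irqchip][pin] = gsi;
	return 0;
}
```

With per-architecture defines, the same two functions serve an IOAPIC with its own pin count or an MPIC with 256 pins.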

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 02/17] KVM: Introduce CONFIG_HAVE_KVM_IRQ_ROUTING
  2013-04-19 14:06   ` Alexander Graf
@ 2013-04-25 10:18     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 128+ messages in thread
From: Michael S. Tsirkin @ 2013-04-25 10:18 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Scott Wood,
	Marcelo Tosatti, Gleb Natapov

On Fri, Apr 19, 2013 at 04:06:13PM +0200, Alexander Graf wrote:
> Quite a bit of code in KVM has been conditionalized on availability of
> IOAPIC emulation. However, most of it is generically applicable to
> platforms that don't have an IOAPIC, but a different type of irq chip.
> 
> Make code that only relies on IRQ routing, not an APIC itself, depend on
> CONFIG_HAVE_KVM_IRQ_ROUTING, so that we can reuse it later.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  arch/x86/kvm/Kconfig     |    1 +
>  include/linux/kvm_host.h |    6 +++---
>  virt/kvm/Kconfig         |    3 +++
>  virt/kvm/eventfd.c       |    6 +++---
>  virt/kvm/kvm_main.c      |    2 +-
>  5 files changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> index 586f000..9d50efd 100644
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -29,6 +29,7 @@ config KVM
>  	select MMU_NOTIFIER
>  	select ANON_INODES
>  	select HAVE_KVM_IRQCHIP
> +	select HAVE_KVM_IRQ_ROUTING
>  	select HAVE_KVM_EVENTFD
>  	select KVM_APIC_ARCHITECTURE
>  	select KVM_ASYNC_PF
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index bf3b1dc..4215d4f 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -304,7 +304,7 @@ struct kvm_kernel_irq_routing_entry {
>  	struct hlist_node link;
>  };
>  
> -#ifdef __KVM_HAVE_IOAPIC
> +#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
>  
>  struct kvm_irq_routing_table {
>  	int chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS];
> @@ -432,7 +432,7 @@ void kvm_vcpu_uninit(struct kvm_vcpu *vcpu);
>  int __must_check vcpu_load(struct kvm_vcpu *vcpu);
>  void vcpu_put(struct kvm_vcpu *vcpu);
>  
> -#ifdef __KVM_HAVE_IOAPIC
> +#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
>  int kvm_irqfd_init(void);
>  void kvm_irqfd_exit(void);
>  #else
> @@ -957,7 +957,7 @@ static inline int mmu_notifier_retry(struct kvm *kvm, unsigned long mmu_seq)
>  }
>  #endif
>  
> -#ifdef KVM_CAP_IRQ_ROUTING
> +#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
>  
>  #define KVM_MAX_IRQ_ROUTES 1024
>  
> diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
> index d01b24b..779262f 100644
> --- a/virt/kvm/Kconfig
> +++ b/virt/kvm/Kconfig
> @@ -6,6 +6,9 @@ config HAVE_KVM
>  config HAVE_KVM_IRQCHIP
>         bool
>  
> +config HAVE_KVM_IRQ_ROUTING
> +       bool
> +
>  config HAVE_KVM_EVENTFD
>         bool
>         select EVENTFD
> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> index c5d43ff..64ee720 100644
> --- a/virt/kvm/eventfd.c
> +++ b/virt/kvm/eventfd.c
> @@ -35,7 +35,7 @@
>  
>  #include "iodev.h"
>  
> -#ifdef __KVM_HAVE_IOAPIC
> +#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
>  /*
>   * --------------------------------------------------------------------
>   * irqfd: Allows an fd to be used to inject an interrupt to the guest
> @@ -433,7 +433,7 @@ fail:
>  void
>  kvm_eventfd_init(struct kvm *kvm)
>  {
> -#ifdef __KVM_HAVE_IOAPIC
> +#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
>  	spin_lock_init(&kvm->irqfds.lock);
>  	INIT_LIST_HEAD(&kvm->irqfds.items);
>  	INIT_LIST_HEAD(&kvm->irqfds.resampler_list);
> @@ -442,7 +442,7 @@ kvm_eventfd_init(struct kvm *kvm)
>  	INIT_LIST_HEAD(&kvm->ioeventfds);
>  }
>  
> -#ifdef __KVM_HAVE_IOAPIC
> +#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
>  /*
>   * shutdown any irqfd's that match fd+gsi
>   */
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index aaac1a7..2c3b226 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2404,7 +2404,7 @@ static long kvm_dev_ioctl_check_extension_generic(long arg)
>  	case KVM_CAP_SIGNAL_MSI:
>  #endif
>  		return 1;
> -#ifdef KVM_CAP_IRQ_ROUTING
> +#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
>  	case KVM_CAP_IRQ_ROUTING:
>  		return KVM_MAX_IRQ_ROUTES;
>  #endif
> -- 
> 1.6.0.2
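
The gating pattern used throughout this patch (real code when the option is selected, harmless stubs otherwise) can be modelled outside the kernel as follows. This is only a sketch: MODEL_HAVE_IRQ_ROUTING stands in for CONFIG_HAVE_KVM_IRQ_ROUTING, and the model_* functions are hypothetical.

```c
#include <assert.h>

/*
 * Build with -DMODEL_HAVE_IRQ_ROUTING to model an architecture whose
 * Kconfig selects the option; leave it undefined to get the stubs.
 */
#ifdef MODEL_HAVE_IRQ_ROUTING

#define MODEL_MAX_IRQ_ROUTES 1024

/* Real setup work would go here. */
static int model_irqfd_init(void)
{
	return 0;
}

/* Capability check reports the route limit when routing is built in. */
static int model_check_irq_routing(void)
{
	return MODEL_MAX_IRQ_ROUTES;
}

#else

/* Stubs keep callers free of #ifdefs when routing is not built in. */
static inline int model_irqfd_init(void)
{
	return 0;
}

static inline int model_check_irq_routing(void)
{
	return 0;
}

#endif
```

Callers invoke model_irqfd_init() unconditionally; only the header decides which variant they get.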

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 03/17] KVM: Drop __KVM_HAVE_IOAPIC condition on irq routing
  2013-04-19 14:06   ` Alexander Graf
@ 2013-04-25 10:19     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 128+ messages in thread
From: Michael S. Tsirkin @ 2013-04-25 10:19 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Scott Wood,
	Marcelo Tosatti, Gleb Natapov

On Fri, Apr 19, 2013 at 04:06:14PM +0200, Alexander Graf wrote:
> We have a capability enquiry system that allows user space to ask kvm
> whether a feature is available.
> 
> The point behind this system is that we can have different kernel
> configurations with different capabilities and user space can adjust
> accordingly.
> 
> Because features can always be nonexistent, we can drop any #ifdefs
> on CAP defines that could be used generically, like the irq routing
> bits. These can be easily reused for non-IOAPIC systems as well.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  include/uapi/linux/kvm.h |    2 --
>  1 files changed, 0 insertions(+), 2 deletions(-)
> 
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 74d0ff3..c741902 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -579,9 +579,7 @@ struct kvm_ppc_smmu_info {
>  #ifdef __KVM_HAVE_PIT
>  #define KVM_CAP_REINJECT_CONTROL 24
>  #endif
> -#ifdef __KVM_HAVE_IOAPIC
>  #define KVM_CAP_IRQ_ROUTING 25
> -#endif
>  #define KVM_CAP_IRQ_INJECT_STATUS 26
>  #ifdef __KVM_HAVE_DEVICE_ASSIGNMENT
>  #define KVM_CAP_DEVICE_DEASSIGNMENT 27
> -- 
> 1.6.0.2
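
From the user space side, the capability query this commit message relies on looks roughly like the sketch below. It uses the existing KVM_CHECK_EXTENSION ioctl on /dev/kvm; the function name query_irq_routing_cap() and its error convention are assumptions made for the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/*
 * Returns the value KVM reports for KVM_CAP_IRQ_ROUTING (0 means the
 * running kernel does not offer it), or -1 if /dev/kvm cannot be
 * opened or the ioctl fails.
 */
static int query_irq_routing_cap(void)
{
	int kvm_fd, ret;

	kvm_fd = open("/dev/kvm", O_RDWR);
	if (kvm_fd < 0)
		return -1;

	ret = ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_IRQ_ROUTING);
	close(kvm_fd);

	return ret < 0 ? -1 : ret;
}
```

Because the define is always visible, the same binary can run on kernels with and without routing support and decide at run time.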

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 04/17] KVM: Remove kvm_get_intr_delivery_bitmask
  2013-04-19 14:06   ` Alexander Graf
@ 2013-04-25 10:19     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 128+ messages in thread
From: Michael S. Tsirkin @ 2013-04-25 10:19 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Scott Wood,
	Marcelo Tosatti, Gleb Natapov

On Fri, Apr 19, 2013 at 04:06:15PM +0200, Alexander Graf wrote:
> The prototype has been stale for a while, and I can't spot any real function
> definition behind it. Let's just remove it.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  include/linux/kvm_host.h |    5 -----
>  1 files changed, 0 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 4215d4f..a7bfe9d 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -719,11 +719,6 @@ void kvm_unregister_irq_mask_notifier(struct kvm *kvm, int irq,
>  void kvm_fire_mask_notifiers(struct kvm *kvm, unsigned irqchip, unsigned pin,
>  			     bool mask);
>  
> -#ifdef __KVM_HAVE_IOAPIC
> -void kvm_get_intr_delivery_bitmask(struct kvm_ioapic *ioapic,
> -				   union kvm_ioapic_redirect_entry *entry,
> -				   unsigned long *deliver_bitmask);
> -#endif
>  int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
>  		bool line_status);
>  int kvm_set_irq_inatomic(struct kvm *kvm, int irq_source_id, u32 irq, int level);
> -- 
> 1.6.0.2

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 05/17] KVM: Move irq routing to generic code
  2013-04-19 14:06   ` Alexander Graf
@ 2013-04-25 10:19     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 128+ messages in thread
From: Michael S. Tsirkin @ 2013-04-25 10:19 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Scott Wood,
	Marcelo Tosatti, Gleb Natapov

On Fri, Apr 19, 2013 at 04:06:16PM +0200, Alexander Graf wrote:
> The IRQ routing set ioctl lives in the hacky device assignment code inside
> of KVM today. This is definitely the wrong place for it. Move it to the much
> more natural kvm_main.c.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  virt/kvm/assigned-dev.c |   30 ------------------------------
>  virt/kvm/kvm_main.c     |   30 ++++++++++++++++++++++++++++++
>  2 files changed, 30 insertions(+), 30 deletions(-)
> 
> diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
> index f4c7f59..8db4370 100644
> --- a/virt/kvm/assigned-dev.c
> +++ b/virt/kvm/assigned-dev.c
> @@ -983,36 +983,6 @@ long kvm_vm_ioctl_assigned_device(struct kvm *kvm, unsigned ioctl,
>  			goto out;
>  		break;
>  	}
> -#ifdef KVM_CAP_IRQ_ROUTING
> -	case KVM_SET_GSI_ROUTING: {
> -		struct kvm_irq_routing routing;
> -		struct kvm_irq_routing __user *urouting;
> -		struct kvm_irq_routing_entry *entries;
> -
> -		r = -EFAULT;
> -		if (copy_from_user(&routing, argp, sizeof(routing)))
> -			goto out;
> -		r = -EINVAL;
> -		if (routing.nr >= KVM_MAX_IRQ_ROUTES)
> -			goto out;
> -		if (routing.flags)
> -			goto out;
> -		r = -ENOMEM;
> -		entries = vmalloc(routing.nr * sizeof(*entries));
> -		if (!entries)
> -			goto out;
> -		r = -EFAULT;
> -		urouting = argp;
> -		if (copy_from_user(entries, urouting->entries,
> -				   routing.nr * sizeof(*entries)))
> -			goto out_free_irq_routing;
> -		r = kvm_set_irq_routing(kvm, entries, routing.nr,
> -					routing.flags);
> -	out_free_irq_routing:
> -		vfree(entries);
> -		break;
> -	}
> -#endif /* KVM_CAP_IRQ_ROUTING */
>  #ifdef __KVM_HAVE_MSIX
>  	case KVM_ASSIGN_SET_MSIX_NR: {
>  		struct kvm_assigned_msix_nr entry_nr;
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 2c3b226..b6f3354 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2274,6 +2274,36 @@ static long kvm_vm_ioctl(struct file *filp,
>  		break;
>  	}
>  #endif
> +#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
> +	case KVM_SET_GSI_ROUTING: {
> +		struct kvm_irq_routing routing;
> +		struct kvm_irq_routing __user *urouting;
> +		struct kvm_irq_routing_entry *entries;
> +
> +		r = -EFAULT;
> +		if (copy_from_user(&routing, argp, sizeof(routing)))
> +			goto out;
> +		r = -EINVAL;
> +		if (routing.nr >= KVM_MAX_IRQ_ROUTES)
> +			goto out;
> +		if (routing.flags)
> +			goto out;
> +		r = -ENOMEM;
> +		entries = vmalloc(routing.nr * sizeof(*entries));
> +		if (!entries)
> +			goto out;
> +		r = -EFAULT;
> +		urouting = argp;
> +		if (copy_from_user(entries, urouting->entries,
> +				   routing.nr * sizeof(*entries)))
> +			goto out_free_irq_routing;
> +		r = kvm_set_irq_routing(kvm, entries, routing.nr,
> +					routing.flags);
> +	out_free_irq_routing:
> +		vfree(entries);
> +		break;
> +	}
> +#endif /* CONFIG_HAVE_KVM_IRQ_ROUTING */
>  	default:
>  		r = kvm_arch_vm_ioctl(filp, ioctl, arg);
>  		if (r == -ENOTTY)
> -- 
> 1.6.0.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 128+ messages in thread
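
The handler moved by patch 05/17 validates the header, then allocates, then copies, and releases the buffer at a single exit point. The following is a minimal userspace sketch of that ordering, not the kernel code: `model_set_gsi_routing`, `model_apply`, the `model_*` structs and `MODEL_MAX_ROUTES` are hypothetical stand-ins, and `malloc`/`memcpy` replace `vmalloc`/`copy_from_user`.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in; the patch checks against KVM_MAX_IRQ_ROUTES. */
#define MODEL_MAX_ROUTES 1024u

struct model_entry {
	unsigned int gsi;
	unsigned int type;
};

struct model_routing {
	unsigned int nr;
	unsigned int flags;
};

/* Stand-in for kvm_set_irq_routing(); this model always succeeds. */
static int model_apply(const struct model_entry *entries, unsigned int nr)
{
	(void)entries;
	(void)nr;
	return 0;
}

static int model_set_gsi_routing(const struct model_routing *hdr,
				 const struct model_entry *user_entries)
{
	struct model_entry *entries;
	int r;

	/* Validate the header before trusting nr as an allocation size. */
	if (hdr->nr >= MODEL_MAX_ROUTES)
		return -EINVAL;
	if (hdr->flags)
		return -EINVAL;

	/* malloc(0) may return NULL, so this model always asks for one slot. */
	entries = malloc((hdr->nr ? hdr->nr : 1) * sizeof(*entries));
	if (!entries)
		return -ENOMEM;

	/* Snapshot the caller's array before acting on it. */
	if (hdr->nr)
		memcpy(entries, user_entries, hdr->nr * sizeof(*entries));

	r = model_apply(entries, hdr->nr);

	free(entries);	/* single cleanup point, like out_free_irq_routing */
	return r;
}
```

Because both checks come before the allocation, an oversized `nr` or a non-zero `flags` is rejected without allocating anything. That is why the quoted code can use a plain `goto out` on those paths.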

* Re: [PATCH 06/17] KVM: Extract generic irqchip logic into irqchip.c
  2013-04-19 14:06   ` Alexander Graf
@ 2013-04-25 10:19     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 128+ messages in thread
From: Michael S. Tsirkin @ 2013-04-25 10:19 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Scott Wood,
	Marcelo Tosatti, Gleb Natapov

On Fri, Apr 19, 2013 at 04:06:17PM +0200, Alexander Graf wrote:
> The current irq_comm.c file contains pieces of code that are generic
> across different irqchip implementations, as well as code that is
> fully IOAPIC specific.
> 
> Split the generic bits out into irqchip.c.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  arch/x86/kvm/Makefile      |    2 +-
>  include/trace/events/kvm.h |   12 +++-
>  virt/kvm/irq_comm.c        |  118 ----------------------------------
>  virt/kvm/irqchip.c         |  152 ++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 163 insertions(+), 121 deletions(-)
>  create mode 100644 virt/kvm/irqchip.c
> 
> diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
> index 04d3040..a797b8e 100644
> --- a/arch/x86/kvm/Makefile
> +++ b/arch/x86/kvm/Makefile
> @@ -7,7 +7,7 @@ CFLAGS_vmx.o := -I.
>  
>  kvm-y			+= $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o \
>  				coalesced_mmio.o irq_comm.o eventfd.o \
> -				assigned-dev.o)
> +				assigned-dev.o irqchip.o)
>  kvm-$(CONFIG_IOMMU_API)	+= $(addprefix ../../../virt/kvm/, iommu.o)
>  kvm-$(CONFIG_KVM_ASYNC_PF)	+= $(addprefix ../../../virt/kvm/, async_pf.o)
>  
> diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h
> index 19911dd..7005d11 100644
> --- a/include/trace/events/kvm.h
> +++ b/include/trace/events/kvm.h
> @@ -37,7 +37,7 @@ TRACE_EVENT(kvm_userspace_exit,
>  		  __entry->errno < 0 ? -__entry->errno : __entry->reason)
>  );
>  
> -#if defined(__KVM_HAVE_IRQ_LINE)
> +#if defined(CONFIG_HAVE_KVM_IRQCHIP)
>  TRACE_EVENT(kvm_set_irq,
>  	TP_PROTO(unsigned int gsi, int level, int irq_source_id),
>  	TP_ARGS(gsi, level, irq_source_id),
> @@ -122,6 +122,10 @@ TRACE_EVENT(kvm_msi_set_irq,
>  	{KVM_IRQCHIP_PIC_SLAVE,		"PIC slave"},		\
>  	{KVM_IRQCHIP_IOAPIC,		"IOAPIC"}
>  
> +#endif /* defined(__KVM_HAVE_IOAPIC) */
> +
> +#if defined(CONFIG_HAVE_KVM_IRQCHIP)
> +
>  TRACE_EVENT(kvm_ack_irq,
>  	TP_PROTO(unsigned int irqchip, unsigned int pin),
>  	TP_ARGS(irqchip, pin),
> @@ -136,14 +140,18 @@ TRACE_EVENT(kvm_ack_irq,
>  		__entry->pin		= pin;
>  	),
>  
> +#ifdef kvm_irqchips
>  	TP_printk("irqchip %s pin %u",
>  		  __print_symbolic(__entry->irqchip, kvm_irqchips),
>  		 __entry->pin)
> +#else
> +	TP_printk("irqchip %d pin %u", __entry->irqchip, __entry->pin)
> +#endif
>  );
>  
> +#endif /* defined(CONFIG_HAVE_KVM_IRQCHIP) */
>  
>  
> -#endif /* defined(__KVM_HAVE_IOAPIC) */
>  
>  #define KVM_TRACE_MMIO_READ_UNSATISFIED 0
>  #define KVM_TRACE_MMIO_READ 1
> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> index 7c0071d..d5008f4 100644
> --- a/virt/kvm/irq_comm.c
> +++ b/virt/kvm/irq_comm.c
> @@ -151,59 +151,6 @@ static int kvm_set_msi_inatomic(struct kvm_kernel_irq_routing_entry *e,
>  		return -EWOULDBLOCK;
>  }
>  
> -int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi)
> -{
> -	struct kvm_kernel_irq_routing_entry route;
> -
> -	if (!irqchip_in_kernel(kvm) || msi->flags != 0)
> -		return -EINVAL;
> -
> -	route.msi.address_lo = msi->address_lo;
> -	route.msi.address_hi = msi->address_hi;
> -	route.msi.data = msi->data;
> -
> -	return kvm_set_msi(&route, kvm, KVM_USERSPACE_IRQ_SOURCE_ID, 1, false);
> -}
> -
> -/*
> - * Return value:
> - *  < 0   Interrupt was ignored (masked or not delivered for other reasons)
> - *  = 0   Interrupt was coalesced (previous irq is still pending)
> - *  > 0   Number of CPUs interrupt was delivered to
> - */
> -int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
> -		bool line_status)
> -{
> -	struct kvm_kernel_irq_routing_entry *e, irq_set[KVM_NR_IRQCHIPS];
> -	int ret = -1, i = 0;
> -	struct kvm_irq_routing_table *irq_rt;
> -
> -	trace_kvm_set_irq(irq, level, irq_source_id);
> -
> -	/* Not possible to detect if the guest uses the PIC or the
> -	 * IOAPIC.  So set the bit in both. The guest will ignore
> -	 * writes to the unused one.
> -	 */
> -	rcu_read_lock();
> -	irq_rt = rcu_dereference(kvm->irq_routing);
> -	if (irq < irq_rt->nr_rt_entries)
> -		hlist_for_each_entry(e, &irq_rt->map[irq], link)
> -			irq_set[i++] = *e;
> -	rcu_read_unlock();
> -
> -	while(i--) {
> -		int r;
> -		r = irq_set[i].set(&irq_set[i], kvm, irq_source_id, level,
> -				line_status);
> -		if (r < 0)
> -			continue;
> -
> -		ret = r + ((ret < 0) ? 0 : ret);
> -	}
> -
> -	return ret;
> -}
> -
>  /*
>   * Deliver an IRQ in an atomic context if we can, or return a failure,
>   * user can retry in a process context.
> @@ -241,63 +188,6 @@ int kvm_set_irq_inatomic(struct kvm *kvm, int irq_source_id, u32 irq, int level)
>  	return ret;
>  }
>  
> -bool kvm_irq_has_notifier(struct kvm *kvm, unsigned irqchip, unsigned pin)
> -{
> -	struct kvm_irq_ack_notifier *kian;
> -	int gsi;
> -
> -	rcu_read_lock();
> -	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
> -	if (gsi != -1)
> -		hlist_for_each_entry_rcu(kian, &kvm->irq_ack_notifier_list,
> -					 link)
> -			if (kian->gsi == gsi) {
> -				rcu_read_unlock();
> -				return true;
> -			}
> -
> -	rcu_read_unlock();
> -
> -	return false;
> -}
> -EXPORT_SYMBOL_GPL(kvm_irq_has_notifier);
> -
> -void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
> -{
> -	struct kvm_irq_ack_notifier *kian;
> -	int gsi;
> -
> -	trace_kvm_ack_irq(irqchip, pin);
> -
> -	rcu_read_lock();
> -	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
> -	if (gsi != -1)
> -		hlist_for_each_entry_rcu(kian, &kvm->irq_ack_notifier_list,
> -					 link)
> -			if (kian->gsi == gsi)
> -				kian->irq_acked(kian);
> -	rcu_read_unlock();
> -}
> -
> -void kvm_register_irq_ack_notifier(struct kvm *kvm,
> -				   struct kvm_irq_ack_notifier *kian)
> -{
> -	mutex_lock(&kvm->irq_lock);
> -	hlist_add_head_rcu(&kian->link, &kvm->irq_ack_notifier_list);
> -	mutex_unlock(&kvm->irq_lock);
> -	kvm_vcpu_request_scan_ioapic(kvm);
> -}
> -
> -void kvm_unregister_irq_ack_notifier(struct kvm *kvm,
> -				    struct kvm_irq_ack_notifier *kian)
> -{
> -	mutex_lock(&kvm->irq_lock);
> -	hlist_del_init_rcu(&kian->link);
> -	mutex_unlock(&kvm->irq_lock);
> -	synchronize_rcu();
> -	kvm_vcpu_request_scan_ioapic(kvm);
> -}
> -
>  int kvm_request_irq_source_id(struct kvm *kvm)
>  {
>  	unsigned long *bitmap = &kvm->arch.irq_sources_bitmap;
> @@ -381,13 +271,6 @@ void kvm_fire_mask_notifiers(struct kvm *kvm, unsigned irqchip, unsigned pin,
>  	rcu_read_unlock();
>  }
>  
> -void kvm_free_irq_routing(struct kvm *kvm)
> -{
> -	/* Called only during vm destruction. Nobody can use the pointer
> -	   at this stage */
> -	kfree(kvm->irq_routing);
> -}
> -
>  static int setup_routing_entry(struct kvm_irq_routing_table *rt,
>  			       struct kvm_kernel_irq_routing_entry *e,
>  			       const struct kvm_irq_routing_entry *ue)
> @@ -451,7 +334,6 @@ out:
>  	return r;
>  }
>  
> -
>  int kvm_set_irq_routing(struct kvm *kvm,
>  			const struct kvm_irq_routing_entry *ue,
>  			unsigned nr,
> diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
> new file mode 100644
> index 0000000..12f7f26
> --- /dev/null
> +++ b/virt/kvm/irqchip.c
> @@ -0,0 +1,152 @@
> +/*
> + * irqchip.c: Common API for in kernel interrupt controllers
> + * Copyright (c) 2007, Intel Corporation.
> + * Copyright 2010 Red Hat, Inc. and/or its affiliates.
> + * Copyright (c) 2013, Alexander Graf <agraf@suse.de>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
> + * Place - Suite 330, Boston, MA 02111-1307 USA.
> + *
> + * This file is derived from virt/kvm/irq_comm.c.
> + *
> + * Authors:
> + *   Yaozu (Eddie) Dong <Eddie.dong@intel.com>
> + *   Alexander Graf <agraf@suse.de>
> + */
> +
> +#include <linux/kvm_host.h>
> +#include <linux/slab.h>
> +#include <linux/export.h>
> +#include <trace/events/kvm.h>
> +#include "irq.h"
> +
> +bool kvm_irq_has_notifier(struct kvm *kvm, unsigned irqchip, unsigned pin)
> +{
> +	struct kvm_irq_ack_notifier *kian;
> +	int gsi;
> +
> +	rcu_read_lock();
> +	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
> +	if (gsi != -1)
> +		hlist_for_each_entry_rcu(kian, &kvm->irq_ack_notifier_list,
> +					 link)
> +			if (kian->gsi == gsi) {
> +				rcu_read_unlock();
> +				return true;
> +			}
> +
> +	rcu_read_unlock();
> +
> +	return false;
> +}
> +EXPORT_SYMBOL_GPL(kvm_irq_has_notifier);
> +
> +void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
> +{
> +	struct kvm_irq_ack_notifier *kian;
> +	int gsi;
> +
> +	trace_kvm_ack_irq(irqchip, pin);
> +
> +	rcu_read_lock();
> +	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
> +	if (gsi != -1)
> +		hlist_for_each_entry_rcu(kian, &kvm->irq_ack_notifier_list,
> +					 link)
> +			if (kian->gsi == gsi)
> +				kian->irq_acked(kian);
> +	rcu_read_unlock();
> +}
> +
> +void kvm_register_irq_ack_notifier(struct kvm *kvm,
> +				   struct kvm_irq_ack_notifier *kian)
> +{
> +	mutex_lock(&kvm->irq_lock);
> +	hlist_add_head_rcu(&kian->link, &kvm->irq_ack_notifier_list);
> +	mutex_unlock(&kvm->irq_lock);
> +#ifdef __KVM_HAVE_IOAPIC
> +	kvm_vcpu_request_scan_ioapic(kvm);
> +#endif
> +}
> +
> +void kvm_unregister_irq_ack_notifier(struct kvm *kvm,
> +				    struct kvm_irq_ack_notifier *kian)
> +{
> +	mutex_lock(&kvm->irq_lock);
> +	hlist_del_init_rcu(&kian->link);
> +	mutex_unlock(&kvm->irq_lock);
> +	synchronize_rcu();
> +#ifdef __KVM_HAVE_IOAPIC
> +	kvm_vcpu_request_scan_ioapic(kvm);
> +#endif
> +}
> +
> +int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi)
> +{
> +	struct kvm_kernel_irq_routing_entry route;
> +
> +	if (!irqchip_in_kernel(kvm) || msi->flags != 0)
> +		return -EINVAL;
> +
> +	route.msi.address_lo = msi->address_lo;
> +	route.msi.address_hi = msi->address_hi;
> +	route.msi.data = msi->data;
> +
> +	return kvm_set_msi(&route, kvm, KVM_USERSPACE_IRQ_SOURCE_ID, 1, false);
> +}
> +
> +/*
> + * Return value:
> + *  < 0   Interrupt was ignored (masked or not delivered for other reasons)
> + *  = 0   Interrupt was coalesced (previous irq is still pending)
> + *  > 0   Number of CPUs interrupt was delivered to
> + */
> +int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
> +		bool line_status)
> +{
> +	struct kvm_kernel_irq_routing_entry *e, irq_set[KVM_NR_IRQCHIPS];
> +	int ret = -1, i = 0;
> +	struct kvm_irq_routing_table *irq_rt;
> +
> +	trace_kvm_set_irq(irq, level, irq_source_id);
> +
> +	/* Not possible to detect if the guest uses the PIC or the
> +	 * IOAPIC.  So set the bit in both. The guest will ignore
> +	 * writes to the unused one.
> +	 */
> +	rcu_read_lock();
> +	irq_rt = rcu_dereference(kvm->irq_routing);
> +	if (irq < irq_rt->nr_rt_entries)
> +		hlist_for_each_entry(e, &irq_rt->map[irq], link)
> +			irq_set[i++] = *e;
> +	rcu_read_unlock();
> +
> +	while(i--) {
> +		int r;
> +		r = irq_set[i].set(&irq_set[i], kvm, irq_source_id, level,
> +				   line_status);
> +		if (r < 0)
> +			continue;
> +
> +		ret = r + ((ret < 0) ? 0 : ret);
> +	}
> +
> +	return ret;
> +}
> +
> +void kvm_free_irq_routing(struct kvm *kvm)
> +{
> +	/* Called only during vm destruction. Nobody can use the pointer
> +	   at this stage */
> +	kfree(kvm->irq_routing);
> +}
> -- 
> 1.6.0.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 128+ messages in thread
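
The return-value comment on kvm_set_irq() in patch 06/17 is easier to follow with the accumulation loop isolated. The following is a minimal userspace sketch, assuming an array of per-irqchip results in place of the `->set()` callbacks. `combine_irq_results` is a hypothetical name and does not exist in the kernel.

```c
#include <stddef.h>

/*
 * Hypothetical userspace model of the accumulation loop in kvm_set_irq().
 * results[] stands in for the values returned by each routing entry's
 * ->set() callback.
 */
static int combine_irq_results(const int *results, size_t n)
{
	int ret = -1;	/* default: every irqchip ignored the interrupt */

	while (n--) {
		int r = results[n];

		if (r < 0)
			continue;	/* this irqchip ignored the interrupt */

		/* The first non-negative result replaces -1; later ones add up. */
		ret = r + ((ret < 0) ? 0 : ret);
	}

	return ret;
}
```

With this rule the caller sees a negative value only if every entry ignored the interrupt, 0 if at least one entry accepted it but none delivered it (coalesced), and otherwise the sum of the delivery counts.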

* Re: [PATCH 07/17] KVM: Move irq routing setup to irqchip.c
  2013-04-19 14:06   ` Alexander Graf
@ 2013-04-25 10:20     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 128+ messages in thread
From: Michael S. Tsirkin @ 2013-04-25 10:20 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Scott Wood,
	Marcelo Tosatti, Gleb Natapov

On Fri, Apr 19, 2013 at 04:06:18PM +0200, Alexander Graf wrote:
> Setting up IRQ routes is nothing IOAPIC specific. Extract everything
> that really is generic code into irqchip.c and only leave the ioapic
> specific bits to irq_comm.c.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  include/linux/kvm_host.h |    3 ++
>  virt/kvm/irq_comm.c      |   76 ++---------------------------------------
>  virt/kvm/irqchip.c       |   85 ++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 91 insertions(+), 73 deletions(-)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index a7bfe9d..dcef724 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -961,6 +961,9 @@ int kvm_set_irq_routing(struct kvm *kvm,
>  			const struct kvm_irq_routing_entry *entries,
>  			unsigned nr,
>  			unsigned flags);
> +int kvm_set_routing_entry(struct kvm_irq_routing_table *rt,
> +			  struct kvm_kernel_irq_routing_entry *e,
> +			  const struct kvm_irq_routing_entry *ue);
>  void kvm_free_irq_routing(struct kvm *kvm);
>  
>  int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi);
> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> index d5008f4..e2e6b44 100644
> --- a/virt/kvm/irq_comm.c
> +++ b/virt/kvm/irq_comm.c
> @@ -271,27 +271,14 @@ void kvm_fire_mask_notifiers(struct kvm *kvm, unsigned irqchip, unsigned pin,
>  	rcu_read_unlock();
>  }
>  
> -static int setup_routing_entry(struct kvm_irq_routing_table *rt,
> -			       struct kvm_kernel_irq_routing_entry *e,
> -			       const struct kvm_irq_routing_entry *ue)
> +int kvm_set_routing_entry(struct kvm_irq_routing_table *rt,
> +			  struct kvm_kernel_irq_routing_entry *e,
> +			  const struct kvm_irq_routing_entry *ue)
>  {
>  	int r = -EINVAL;
>  	int delta;
>  	unsigned max_pin;
> -	struct kvm_kernel_irq_routing_entry *ei;
>  
> -	/*
> -	 * Do not allow GSI to be mapped to the same irqchip more than once.
> -	 * Allow only one to one mapping between GSI and MSI.
> -	 */
> -	hlist_for_each_entry(ei, &rt->map[ue->gsi], link)
> -		if (ei->type == KVM_IRQ_ROUTING_MSI ||
> -		    ue->type == KVM_IRQ_ROUTING_MSI ||
> -		    ue->u.irqchip.irqchip == ei->irqchip.irqchip)
> -			return r;
> -
> -	e->gsi = ue->gsi;
> -	e->type = ue->type;
>  	switch (ue->type) {
>  	case KVM_IRQ_ROUTING_IRQCHIP:
>  		delta = 0;
> @@ -328,68 +315,11 @@ static int setup_routing_entry(struct kvm_irq_routing_table *rt,
>  		goto out;
>  	}
>  
> -	hlist_add_head(&e->link, &rt->map[e->gsi]);
>  	r = 0;
>  out:
>  	return r;
>  }
>  
> -int kvm_set_irq_routing(struct kvm *kvm,
> -			const struct kvm_irq_routing_entry *ue,
> -			unsigned nr,
> -			unsigned flags)
> -{
> -	struct kvm_irq_routing_table *new, *old;
> -	u32 i, j, nr_rt_entries = 0;
> -	int r;
> -
> -	for (i = 0; i < nr; ++i) {
> -		if (ue[i].gsi >= KVM_MAX_IRQ_ROUTES)
> -			return -EINVAL;
> -		nr_rt_entries = max(nr_rt_entries, ue[i].gsi);
> -	}
> -
> -	nr_rt_entries += 1;
> -
> -	new = kzalloc(sizeof(*new) + (nr_rt_entries * sizeof(struct hlist_head))
> -		      + (nr * sizeof(struct kvm_kernel_irq_routing_entry)),
> -		      GFP_KERNEL);
> -
> -	if (!new)
> -		return -ENOMEM;
> -
> -	new->rt_entries = (void *)&new->map[nr_rt_entries];
> -
> -	new->nr_rt_entries = nr_rt_entries;
> -	for (i = 0; i < 3; i++)
> -		for (j = 0; j < KVM_IRQCHIP_NUM_PINS; j++)
> -			new->chip[i][j] = -1;
> -
> -	for (i = 0; i < nr; ++i) {
> -		r = -EINVAL;
> -		if (ue->flags)
> -			goto out;
> -		r = setup_routing_entry(new, &new->rt_entries[i], ue);
> -		if (r)
> -			goto out;
> -		++ue;
> -	}
> -
> -	mutex_lock(&kvm->irq_lock);
> -	old = kvm->irq_routing;
> -	kvm_irq_routing_update(kvm, new);
> -	mutex_unlock(&kvm->irq_lock);
> -
> -	synchronize_rcu();
> -
> -	new = old;
> -	r = 0;
> -
> -out:
> -	kfree(new);
> -	return r;
> -}
> -
>  #define IOAPIC_ROUTING_ENTRY(irq) \
>  	{ .gsi = irq, .type = KVM_IRQ_ROUTING_IRQCHIP,	\
>  	  .u.irqchip.irqchip = KVM_IRQCHIP_IOAPIC, .u.irqchip.pin = (irq) }
> diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
> index 12f7f26..20dc9e4 100644
> --- a/virt/kvm/irqchip.c
> +++ b/virt/kvm/irqchip.c
> @@ -150,3 +150,88 @@ void kvm_free_irq_routing(struct kvm *kvm)
>  	   at this stage */
>  	kfree(kvm->irq_routing);
>  }
> +
> +static int setup_routing_entry(struct kvm_irq_routing_table *rt,
> +			       struct kvm_kernel_irq_routing_entry *e,
> +			       const struct kvm_irq_routing_entry *ue)
> +{
> +	int r = -EINVAL;
> +	struct kvm_kernel_irq_routing_entry *ei;
> +
> +	/*
> +	 * Do not allow GSI to be mapped to the same irqchip more than once.
> +	 * Allow only one to one mapping between GSI and MSI.
> +	 */
> +	hlist_for_each_entry(ei, &rt->map[ue->gsi], link)
> +		if (ei->type == KVM_IRQ_ROUTING_MSI ||
> +		    ue->type == KVM_IRQ_ROUTING_MSI ||
> +		    ue->u.irqchip.irqchip == ei->irqchip.irqchip)
> +			return r;
> +
> +	e->gsi = ue->gsi;
> +	e->type = ue->type;
> +	r = kvm_set_routing_entry(rt, e, ue);
> +	if (r)
> +		goto out;
> +
> +	hlist_add_head(&e->link, &rt->map[e->gsi]);
> +	r = 0;
> +out:
> +	return r;
> +}
> +
> +int kvm_set_irq_routing(struct kvm *kvm,
> +			const struct kvm_irq_routing_entry *ue,
> +			unsigned nr,
> +			unsigned flags)
> +{
> +	struct kvm_irq_routing_table *new, *old;
> +	u32 i, j, nr_rt_entries = 0;
> +	int r;
> +
> +	for (i = 0; i < nr; ++i) {
> +		if (ue[i].gsi >= KVM_MAX_IRQ_ROUTES)
> +			return -EINVAL;
> +		nr_rt_entries = max(nr_rt_entries, ue[i].gsi);
> +	}
> +
> +	nr_rt_entries += 1;
> +
> +	new = kzalloc(sizeof(*new) + (nr_rt_entries * sizeof(struct hlist_head))
> +		      + (nr * sizeof(struct kvm_kernel_irq_routing_entry)),
> +		      GFP_KERNEL);
> +
> +	if (!new)
> +		return -ENOMEM;
> +
> +	new->rt_entries = (void *)&new->map[nr_rt_entries];
> +
> +	new->nr_rt_entries = nr_rt_entries;
> +	for (i = 0; i < KVM_NR_IRQCHIPS; i++)
> +		for (j = 0; j < KVM_IRQCHIP_NUM_PINS; j++)
> +			new->chip[i][j] = -1;
> +
> +	for (i = 0; i < nr; ++i) {
> +		r = -EINVAL;
> +		if (ue->flags)
> +			goto out;
> +		r = setup_routing_entry(new, &new->rt_entries[i], ue);
> +		if (r)
> +			goto out;
> +		++ue;
> +	}
> +
> +	mutex_lock(&kvm->irq_lock);
> +	old = kvm->irq_routing;
> +	kvm_irq_routing_update(kvm, new);
> +	mutex_unlock(&kvm->irq_lock);
> +
> +	synchronize_rcu();
> +
> +	new = old;
> +	r = 0;
> +
> +out:
> +	kfree(new);
> +	return r;
> +}
> -- 
> 1.6.0.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
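
The split this patch makes can be illustrated outside the kernel. What follows is only a minimal userspace sketch, assuming a fixed-size table instead of the kernel's hlist; every name in it (struct route, struct table, setup_entry, example_set_entry) is hypothetical and not taken from the patch. The generic function keeps the duplicate-GSI checks and the linking step, and defers everything chip-specific to a hook that plays the role of kvm_set_routing_entry().

```c
#include <errno.h>
#include <stddef.h>

#define MAX_GSI      8
#define MAX_PER_GSI  4

enum route_type { ROUTE_IRQCHIP = 1, ROUTE_MSI = 2 };

/* Hypothetical, simplified stand-in for a routing entry. */
struct route {
	unsigned gsi;
	enum route_type type;
	unsigned irqchip;	/* only meaningful for ROUTE_IRQCHIP */
};

/* Hypothetical fixed-size table; the kernel uses an hlist per GSI. */
struct table {
	struct route slots[MAX_GSI][MAX_PER_GSI];
	size_t count[MAX_GSI];
};

/* Chip-specific hook, standing in for kvm_set_routing_entry(). */
typedef int (*set_entry_fn)(const struct route *r);

/* Example hook: accepts irqchips 0..2 and any MSI route. */
int example_set_entry(const struct route *r)
{
	if (r->type == ROUTE_IRQCHIP && r->irqchip > 2)
		return -EINVAL;
	return 0;
}

/* Generic part: duplicate checks, then the hook, then linking. */
int setup_entry(struct table *t, const struct route *r, set_entry_fn hook)
{
	size_t i;
	int ret;

	if (r->gsi >= MAX_GSI || t->count[r->gsi] >= MAX_PER_GSI)
		return -EINVAL;

	/*
	 * A GSI may not map to the same irqchip twice, and an MSI route
	 * may not share its GSI with any other route.
	 */
	for (i = 0; i < t->count[r->gsi]; i++) {
		const struct route *old = &t->slots[r->gsi][i];

		if (old->type == ROUTE_MSI || r->type == ROUTE_MSI ||
		    old->irqchip == r->irqchip)
			return -EINVAL;
	}

	ret = hook(r);
	if (ret)
		return ret;

	/* Link the entry only after the hook has accepted it. */
	t->slots[r->gsi][t->count[r->gsi]++] = *r;
	return 0;
}
```

With this shape a new interrupt controller would only need to supply its own hook; the duplicate checks and the table bookkeeping stay in one place.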

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 08/17] KVM: Move irqfd resample cap handling to generic code
  2013-04-19 14:06   ` Alexander Graf
@ 2013-04-25 10:21     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 128+ messages in thread
From: Michael S. Tsirkin @ 2013-04-25 10:21 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Scott Wood,
	Marcelo Tosatti, Gleb Natapov

On Fri, Apr 19, 2013 at 04:06:19PM +0200, Alexander Graf wrote:
> Now that we have most irqfd code completely platform agnostic, let's move
> irqfd's resample capability return to generic code as well.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  arch/x86/kvm/x86.c  |    1 -
>  virt/kvm/kvm_main.c |    3 +++
>  2 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 50e2e10..888d892 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2513,7 +2513,6 @@ int kvm_dev_ioctl_check_extension(long ext)
>  	case KVM_CAP_PCI_2_3:
>  	case KVM_CAP_KVMCLOCK_CTRL:
>  	case KVM_CAP_READONLY_MEM:
> -	case KVM_CAP_IRQFD_RESAMPLE:
>  		r = 1;
>  		break;
>  	case KVM_CAP_COALESCED_MMIO:
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index b6f3354..f9492f3 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2433,6 +2433,9 @@ static long kvm_dev_ioctl_check_extension_generic(long arg)
>  #ifdef CONFIG_HAVE_KVM_MSI
>  	case KVM_CAP_SIGNAL_MSI:
>  #endif
> +#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
> +	case KVM_CAP_IRQFD_RESAMPLE:
> +#endif
>  		return 1;
>  #ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
>  	case KVM_CAP_IRQ_ROUTING:
> -- 
> 1.6.0.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
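
The effect of moving the case label can be shown with a small userspace sketch. It assumes, as a simplification, that the generic check falls back to an arch-specific one for anything it does not recognize; the names example_cap, example_check_extension and EXAMPLE_HAVE_IRQ_ROUTING are hypothetical stand-ins, not the kernel's.

```c
/* Hypothetical build-time switch, standing in for CONFIG_HAVE_KVM_IRQ_ROUTING. */
#define EXAMPLE_HAVE_IRQ_ROUTING 1

/* Hypothetical capability numbers, for illustration only. */
enum example_cap {
	EXAMPLE_CAP_USER_MEMORY,
	EXAMPLE_CAP_IRQFD_RESAMPLE,
	EXAMPLE_CAP_ARCH_ONLY,
	EXAMPLE_CAP_UNKNOWN,
};

/* Arch fallback: consulted only for capabilities the generic code skips. */
long example_check_extension_arch(enum example_cap cap)
{
	return cap == EXAMPLE_CAP_ARCH_ONLY ? 1 : 0;
}

/* Generic check: the resample capability is reported here, but only
 * when the build has irq routing support compiled in. */
long example_check_extension(enum example_cap cap)
{
	switch (cap) {
	case EXAMPLE_CAP_USER_MEMORY:
#if EXAMPLE_HAVE_IRQ_ROUTING
	case EXAMPLE_CAP_IRQFD_RESAMPLE:
#endif
		return 1;
	default:
		break;
	}
	return example_check_extension_arch(cap);
}
```

Any architecture that enables the config option then reports the capability without touching its own check_extension code, which is the point of the patch.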

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 00/17] KVM: PPC: In-kernel MPIC support with irqfd v3
  2013-04-19 14:06 ` Alexander Graf
@ 2013-04-25 10:24   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 128+ messages in thread
From: Michael S. Tsirkin @ 2013-04-25 10:24 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Scott Wood,
	Marcelo Tosatti, Gleb Natapov

On Fri, Apr 19, 2013 at 04:06:11PM +0200, Alexander Graf wrote:
> Hi,
> 
> This patch set contains a fully working implementation of the in-kernel MPIC
> from Scott with a few fixups and a new version of my irqfd generalization
> patch set.

For patches 1-8:
Acked-by: Michael S. Tsirkin <mst@redhat.com>

I don't have an opinion about the rest.

> v1 -> v2:
> 
>   - depend on CONFIG_ defines rather than __KVM defines
>   - fix compile issues
>   - fix the kvm_irqchip{,s} typo
> 
> v2 -> v3:
> 
>   - make mpic pointer type safe
>   - add wmb before setting global mpic variable
>   - make eoi notification happen unlockedly
>   - add IRQ routing documentation
>   - announce mpic availability after its creation
>   - fix pr_debug again
> 
> I have refrained from touching IA64 at all in this patch set. It's marked
> as BROKEN, I doubt it even compiles at all today. The only sensible thing
> to do would be to remove all of IA64 kvm code from the kernel tree, but
> that is out of scope for this patch set and definitely should not gate it.
> 
> 
> Alex
> 
> Alexander Graf (11):
>   KVM: Add KVM_IRQCHIP_NUM_PINS in addition to KVM_IOAPIC_NUM_PINS
>   KVM: Introduce CONFIG_HAVE_KVM_IRQ_ROUTING
>   KVM: Drop __KVM_HAVE_IOAPIC condition on irq routing
>   KVM: Remove kvm_get_intr_delivery_bitmask
>   KVM: Move irq routing to generic code
>   KVM: Extract generic irqchip logic into irqchip.c
>   KVM: Move irq routing setup to irqchip.c
>   KVM: Move irqfd resample cap handling to generic code
>   KVM: PPC: Support irq routing and irqfd for in-kernel MPIC
>   KVM: PPC: MPIC: Add support for KVM_IRQ_LINE
>   KVM: PPC: MPIC: Restrict to e500 platforms
> 
> Scott Wood (6):
>   kvm: add device control API
>   kvm/ppc/mpic: import hw/openpic.c from QEMU
>   kvm/ppc/mpic: remove some obviously unneeded code
>   kvm/ppc/mpic: adapt to kernel style and environment
>   kvm/ppc/mpic: in-kernel MPIC emulation
>   kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC
> 
>  Documentation/virtual/kvm/api.txt          |   78 ++
>  Documentation/virtual/kvm/devices/README   |    1 +
>  Documentation/virtual/kvm/devices/mpic.txt |   48 +
>  arch/powerpc/include/asm/kvm_host.h        |   24 +-
>  arch/powerpc/include/asm/kvm_ppc.h         |   30 +
>  arch/powerpc/include/uapi/asm/kvm.h        |    9 +
>  arch/powerpc/kvm/Kconfig                   |   12 +
>  arch/powerpc/kvm/Makefile                  |    3 +
>  arch/powerpc/kvm/booke.c                   |   12 +-
>  arch/powerpc/kvm/irq.h                     |   17 +
>  arch/powerpc/kvm/mpic.c                    | 1876 ++++++++++++++++++++++++++++
>  arch/powerpc/kvm/powerpc.c                 |   55 +-
>  arch/x86/include/asm/kvm_host.h            |    2 +
>  arch/x86/kvm/Kconfig                       |    1 +
>  arch/x86/kvm/Makefile                      |    2 +-
>  arch/x86/kvm/x86.c                         |    1 -
>  include/linux/kvm_host.h                   |   53 +-
>  include/trace/events/kvm.h                 |   12 +-
>  include/uapi/linux/kvm.h                   |   33 +-
>  virt/kvm/Kconfig                           |    3 +
>  virt/kvm/assigned-dev.c                    |   30 -
>  virt/kvm/eventfd.c                         |    6 +-
>  virt/kvm/irq_comm.c                        |  194 +---
>  virt/kvm/irqchip.c                         |  237 ++++
>  virt/kvm/kvm_main.c                        |  170 +++-
>  25 files changed, 2659 insertions(+), 250 deletions(-)
>  create mode 100644 Documentation/virtual/kvm/devices/README
>  create mode 100644 Documentation/virtual/kvm/devices/mpic.txt
>  create mode 100644 arch/powerpc/kvm/irq.h
>  create mode 100644 arch/powerpc/kvm/mpic.c
>  create mode 100644 virt/kvm/irqchip.c
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
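
The v2 -> v3 item "add wmb before setting global mpic variable" describes a publish pattern: fill in the object completely, then make the pointer to it visible. The sketch below is a userspace analogue using C11 release/acquire atomics rather than the kernel's barrier primitives; it is an assumption about the intent of that changelog line, not code from the series, and all names in it are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical device state, for illustration only. */
struct example_mpic {
	int nr_irqs;
	int ready;
};

/* Global pointer that readers consult; NULL until the device exists. */
static _Atomic(struct example_mpic *) example_published;

/* Writer: initialize every field, then publish the pointer. */
void example_publish(struct example_mpic *m, int nr_irqs)
{
	m->nr_irqs = nr_irqs;
	m->ready = 1;
	/* Release store: the stores above are ordered before the pointer
	 * becomes visible to a reader that uses an acquire load. */
	atomic_store_explicit(&example_published, m, memory_order_release);
}

/* Reader: returns -1 while nothing has been published yet. */
int example_nr_irqs(void)
{
	struct example_mpic *m =
		atomic_load_explicit(&example_published, memory_order_acquire);

	return m ? m->nr_irqs : -1;
}
```

Without the ordering, a reader on another CPU could in principle see the non-NULL pointer before the fields behind it are initialized.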

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 16/17] KVM: PPC: MPIC: Add support for KVM_IRQ_LINE
  2013-04-19 18:51     ` Scott Wood
@ 2013-04-25 11:30       ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-25 11:30 UTC (permalink / raw)
  To: Scott Wood
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Marcelo Tosatti, Gleb Natapov


On 19.04.2013, at 20:51, Scott Wood wrote:

> On 04/19/2013 09:06:27 AM, Alexander Graf wrote:
>> Now that all pieces are in place for reusing generic irq infrastructure,
>> we can copy x86's implementation of KVM_IRQ_LINE irq injection and simply
>> reuse it for PPC, as it will work there just as well.
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> ---
>> arch/powerpc/include/uapi/asm/kvm.h |    1 +
>> arch/powerpc/kvm/powerpc.c          |   13 +++++++++++++
>> 2 files changed, 14 insertions(+), 0 deletions(-)
>> diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
>> index 3537bf3..dbb2ac2 100644
>> --- a/arch/powerpc/include/uapi/asm/kvm.h
>> +++ b/arch/powerpc/include/uapi/asm/kvm.h
>> @@ -26,6 +26,7 @@
>> #define __KVM_HAVE_SPAPR_TCE
>> #define __KVM_HAVE_PPC_SMT
>> #define __KVM_HAVE_IRQCHIP
>> +#define __KVM_HAVE_IRQ_LINE
>> struct kvm_regs {
>> 	__u64 pc;
>> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
>> index c431fea..874c106 100644
>> --- a/arch/powerpc/kvm/powerpc.c
>> +++ b/arch/powerpc/kvm/powerpc.c
>> @@ -33,6 +33,7 @@
>> #include <asm/cputhreads.h>
>> #include <asm/irqflags.h>
>> #include "timing.h"
>> +#include "irq.h"
>> #include "../mm/mmu_decl.h"
>> #define CREATE_TRACE_POINTS
>> @@ -945,6 +946,18 @@ static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
>> 	return 0;
>> }
>> +int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event,
>> +			  bool line_status)
>> +{
>> +	if (!irqchip_in_kernel(kvm))
>> +		return -ENXIO;
>> +
>> +	irq_event->status = kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID,
>> +					irq_event->irq, irq_event->level,
>> +					line_status);
>> +	return 0;
>> +}
> 
> As Paul noted in the XICS patchset, this could reference an MPIC that has gone away if the user never attached any vcpus and then closed the MPIC fd.  It's not a reasonable use case, but it could be used maliciously to get the kernel to access a bad pointer.  The irqchip_in_kernel check helps somewhat, but it's meant for ensuring that the creation has happened -- it's racy if used for ensuring that destruction hasn't happened.
> 
> The problem is rooted in the awkwardness of performing an operation that logically should be on the MPIC fd, but is instead being done on the vm fd.
> 
> I think these three steps would fix it (the first two seem like things we should be doing anyway):
> - During MPIC destruction, make sure MPIC deregisters all routes that reference it.
> - In kvm_set_irq(), do not release the RCU read lock until after the set() function has been called.
> - Do not hook up kvm_send_userspace_msi() to MPIC or other new irqchips, as that bypasses the RCU lock.  It could be supported as a device fd ioctl if desired, or it could be reworked to operate on an RCU-managed list of MSI handlers, though MPIC really doesn't need this at all.

Can't we just add an RCU lock in the send_userspace_msi case? I don't think we should handle MSIs any differently from normal IRQs.


Alex
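
The lifetime rule under discussion (do not leave the read-side section until set() has returned, and have teardown unpublish the chip before freeing it) can be modelled in a few lines. This is a single-threaded toy, not RCU: a reader counter stands in for the read-side critical section and a deferred free stands in for the grace period. All names (struct toy_vm, toy_set_irq, toy_chip_destroy, and so on) are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

struct toy_chip {
	int (*set)(struct toy_chip *c, int irq, int level);
	int last_irq;
};

struct toy_vm {
	struct toy_chip *chip;		/* published pointer, NULL once unregistered */
	struct toy_chip *zombie;	/* unregistered, but not yet freed */
	int readers;			/* depth of the toy read-side section */
};

void toy_read_lock(struct toy_vm *vm)
{
	vm->readers++;
}

void toy_read_unlock(struct toy_vm *vm)
{
	/* The last reader out performs the deferred free. */
	if (--vm->readers == 0 && vm->zombie) {
		free(vm->zombie);
		vm->zombie = NULL;
	}
}

/* Teardown: unpublish first, free only once no reader can hold it. */
void toy_chip_destroy(struct toy_vm *vm)
{
	struct toy_chip *c = vm->chip;

	vm->chip = NULL;
	if (!c)
		return;
	if (vm->readers)
		vm->zombie = c;
	else
		free(c);
}

/* Example set() callback: records the irq and reports the level. */
int toy_example_set(struct toy_chip *c, int irq, int level)
{
	c->last_irq = irq;
	return level;
}

/* Delivery: the read-side section covers the set() call itself. */
int toy_set_irq(struct toy_vm *vm, int irq, int level)
{
	struct toy_chip *c;
	int ret;

	toy_read_lock(vm);
	c = vm->chip;
	ret = c ? c->set(c, irq, level) : -ENXIO;
	toy_read_unlock(vm);	/* only after set() has returned */
	return ret;
}
```

If the unlock happened before the set() call, the chip could be freed between the pointer load and the call. The same reasoning would apply to an MSI path that took the same read-side lock.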


^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 16/17] KVM: PPC: MPIC: Add support for KVM_IRQ_LINE
  2013-04-25 11:30       ` Alexander Graf
@ 2013-04-25 14:49         ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-25 14:49 UTC (permalink / raw)
  To: Scott Wood
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Marcelo Tosatti, Gleb Natapov


On 25.04.2013, at 13:30, Alexander Graf wrote:

> 
> On 19.04.2013, at 20:51, Scott Wood wrote:
> 
>> On 04/19/2013 09:06:27 AM, Alexander Graf wrote:
>>> Now that all pieces are in place for reusing generic irq infrastructure,
>>> we can copy x86's implementation of KVM_IRQ_LINE irq injection and simply
>>> reuse it for PPC, as it will work there just as well.
>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>> ---
>>> arch/powerpc/include/uapi/asm/kvm.h |    1 +
>>> arch/powerpc/kvm/powerpc.c          |   13 +++++++++++++
>>> 2 files changed, 14 insertions(+), 0 deletions(-)
>>> diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
>>> index 3537bf3..dbb2ac2 100644
>>> --- a/arch/powerpc/include/uapi/asm/kvm.h
>>> +++ b/arch/powerpc/include/uapi/asm/kvm.h
>>> @@ -26,6 +26,7 @@
>>> #define __KVM_HAVE_SPAPR_TCE
>>> #define __KVM_HAVE_PPC_SMT
>>> #define __KVM_HAVE_IRQCHIP
>>> +#define __KVM_HAVE_IRQ_LINE
>>> struct kvm_regs {
>>> 	__u64 pc;
>>> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
>>> index c431fea..874c106 100644
>>> --- a/arch/powerpc/kvm/powerpc.c
>>> +++ b/arch/powerpc/kvm/powerpc.c
>>> @@ -33,6 +33,7 @@
>>> #include <asm/cputhreads.h>
>>> #include <asm/irqflags.h>
>>> #include "timing.h"
>>> +#include "irq.h"
>>> #include "../mm/mmu_decl.h"
>>> #define CREATE_TRACE_POINTS
>>> @@ -945,6 +946,18 @@ static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
>>> 	return 0;
>>> }
>>> +int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event,
>>> +			  bool line_status)
>>> +{
>>> +	if (!irqchip_in_kernel(kvm))
>>> +		return -ENXIO;
>>> +
>>> +	irq_event->status = kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID,
>>> +					irq_event->irq, irq_event->level,
>>> +					line_status);
>>> +	return 0;
>>> +}
>> 
>> As Paul noted in the XICS patchset, this could reference an MPIC that has gone away if the user never attached any vcpus and then closed the MPIC fd.  It's not a reasonable use case, but it could be used maliciously to get the kernel to access a bad pointer.  The irqchip_in_kernel check helps somewhat, but it's meant for ensuring that the creation has happened -- it's racy if used for ensuring that destruction hasn't happened.
>> 
>> The problem is rooted in the awkwardness of performing an operation that logically should be on the MPIC fd, but is instead being done on the vm fd.
>> 
>> I think these three steps would fix it (the first two seem like things we should be doing anyway):
>> - During MPIC destruction, make sure MPIC deregisters all routes that reference it.
>> - In kvm_set_irq(), do not release the RCU read lock until after the set() function has been called.
>> - Do not hook up kvm_send_userspace_msi() to MPIC or other new irqchips, as that bypasses the RCU lock.  It could be supported as a device fd ioctl if desired, or it could be reworked to operate on an RCU-managed list of MSI handlers, though MPIC really doesn't need this at all.
> 
> Can't we just add an RCU lock in the send_userspace_msi case? I don't think we should handle MSIs any differently from normal IRQs.

In fact I'm having a hard time verifying that we're always accessing things with proper locks held. I'm pretty sure we're missing a few cases.

So how about we delay mpic destruction to vm destruction? We simply add one user too many when we spawn the mpic and put it on vm_destruct. That way users _can_ destroy mpics, but they will only be really free'd once the vm is also gone.
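
A rough userspace sketch of that idea, not actual KVM code: the device is created with one reference for its fd and one extra reference owned by the VM, so closing the fd alone cannot free it. All names here (ref_dev, ref_vm, ref_dev_create and so on) are made up for illustration, and the plain int counter assumes a single thread; kernel code would presumably use a kref or an atomic counter instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an in-kernel device such as the MPIC. */
struct ref_dev {
	int refcount;	/* plain int: this sketch is single-threaded */
	bool *freed;	/* set to true when the object is really released */
};

/* Hypothetical stand-in for the VM; it owns one reference to the device. */
struct ref_vm {
	struct ref_dev *irqchip;
};

/* Drop one reference and free the device when the last one goes away. */
static void ref_dev_put(struct ref_dev *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0) {
		*dev->freed = true;
		free(dev);
	}
}

/* Create the device with two references: one for the fd, one for the VM. */
static struct ref_dev *ref_dev_create(struct ref_vm *vm, bool *freed)
{
	struct ref_dev *dev = malloc(sizeof(*dev));

	if (!dev)
		return NULL;
	*freed = false;
	dev->freed = freed;
	dev->refcount = 2;
	vm->irqchip = dev;
	return dev;
}

/* Closing the device fd drops only the fd's reference. */
static void ref_dev_fd_release(struct ref_dev *dev)
{
	ref_dev_put(dev);
}

/* VM teardown drops the extra reference taken at creation time. */
static void ref_vm_destroy(struct ref_vm *vm)
{
	if (vm->irqchip) {
		ref_dev_put(vm->irqchip);
		vm->irqchip = NULL;
	}
}
```

With this shape, calling ref_dev_fd_release() while the VM exists leaves the object allocated, and only ref_vm_destroy() releases it, so code reached through the vm fd never sees a dangling pointer.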


Alex

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 15/17] KVM: PPC: Support irq routing and irqfd for in-kernel MPIC
  2013-04-25  9:58       ` Alexander Graf
@ 2013-04-25 16:53         ` Scott Wood
  -1 siblings, 0 replies; 128+ messages in thread
From: Scott Wood @ 2013-04-25 16:53 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Marcelo Tosatti, Gleb Natapov

On 04/25/2013 04:58:51 AM, Alexander Graf wrote:
> 
> On 19.04.2013, at 20:02, Scott Wood wrote:
> 
> > On 04/19/2013 09:06:26 AM, Alexander Graf wrote:
> >> +	if (notify_eoi != -1) {
> >> +		spin_unlock_irq(&opp->lock);
> >> +		kvm_notify_acked_irq(opp->kvm, 0, notify_eoi);
> >> +		spin_lock_irq(&opp->lock);
> >> +	}
> >
> > I'd rather not have the "_irq" here, which could break if we enter
> > this path via an "_irqsave" (I realize there currently is no such
> > path that reaches EOI emulation).
> >
> > Will we ever set notify_eoi when addr != EOI?  I'm wondering why it
> > was moved out of the switch statement, instead of being put at the
> > end of the case EOI: code.
> 
> I doubt it, but that's for the compiler to optimize away. I found it  
> cleaner for some reason to put it down there. I don't think it really  
> matters.

Cleanliness is my concern as well.  It doesn't seem clean to  
arbitrarily split up the EOI implementation.

-Scott

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 16/17] KVM: PPC: MPIC: Add support for KVM_IRQ_LINE
  2013-04-25 14:49         ` Alexander Graf
@ 2013-04-25 19:03           ` Scott Wood
  -1 siblings, 0 replies; 128+ messages in thread
From: Scott Wood @ 2013-04-25 19:03 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Marcelo Tosatti, Gleb Natapov

On 04/25/2013 09:49:23 AM, Alexander Graf wrote:
> 
> On 25.04.2013, at 13:30, Alexander Graf wrote:
> 
> >
> > On 19.04.2013, at 20:51, Scott Wood wrote:
> >
> >> On 04/19/2013 09:06:27 AM, Alexander Graf wrote:
> >>> Now that all pieces are in place for reusing generic irq infrastructure,
> >>> we can copy x86's implementation of KVM_IRQ_LINE irq injection and simply
> >>> reuse it for PPC, as it will work there just as well.
> >>> Signed-off-by: Alexander Graf <agraf@suse.de>
> >>> ---
> >>> arch/powerpc/include/uapi/asm/kvm.h |    1 +
> >>> arch/powerpc/kvm/powerpc.c          |   13 +++++++++++++
> >>> 2 files changed, 14 insertions(+), 0 deletions(-)
> >>> diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
> >>> index 3537bf3..dbb2ac2 100644
> >>> --- a/arch/powerpc/include/uapi/asm/kvm.h
> >>> +++ b/arch/powerpc/include/uapi/asm/kvm.h
> >>> @@ -26,6 +26,7 @@
> >>> #define __KVM_HAVE_SPAPR_TCE
> >>> #define __KVM_HAVE_PPC_SMT
> >>> #define __KVM_HAVE_IRQCHIP
> >>> +#define __KVM_HAVE_IRQ_LINE
> >>> struct kvm_regs {
> >>> 	__u64 pc;
> >>> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> >>> index c431fea..874c106 100644
> >>> --- a/arch/powerpc/kvm/powerpc.c
> >>> +++ b/arch/powerpc/kvm/powerpc.c
> >>> @@ -33,6 +33,7 @@
> >>> #include <asm/cputhreads.h>
> >>> #include <asm/irqflags.h>
> >>> #include "timing.h"
> >>> +#include "irq.h"
> >>> #include "../mm/mmu_decl.h"
> >>> #define CREATE_TRACE_POINTS
> >>> @@ -945,6 +946,18 @@ static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
> >>> 	return 0;
> >>> }
> >>> +int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event,
> >>> +			  bool line_status)
> >>> +{
> >>> +	if (!irqchip_in_kernel(kvm))
> >>> +		return -ENXIO;
> >>> +
> >>> +	irq_event->status = kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID,
> >>> +					irq_event->irq, irq_event->level,
> >>> +					line_status);
> >>> +	return 0;
> >>> +}
> >>
> >> As Paul noted in the XICS patchset, this could reference an MPIC that has gone away if the user never attached any vcpus and then closed the MPIC fd.  It's not a reasonable use case, but it could be used maliciously to get the kernel to access a bad pointer.  The irqchip_in_kernel check helps somewhat, but it's meant for ensuring that the creation has happened -- it's racy if used for ensuring that destruction hasn't happened.
> >>
> >> The problem is rooted in the awkwardness of performing an operation that logically should be on the MPIC fd, but is instead being done on the vm fd.
> >>
> >> I think these three steps would fix it (the first two seem like things we should be doing anyway):
> >> - During MPIC destruction, make sure MPIC deregisters all routes that reference it.
> >> - In kvm_set_irq(), do not release the RCU read lock until after the set() function has been called.
> >> - Do not hook up kvm_send_userspace_msi() to MPIC or other new irqchips, as that bypasses the RCU lock.  It could be supported as a device fd ioctl if desired, or it could be reworked to operate on an RCU-managed list of MSI handlers, though MPIC really doesn't need this at all.
> >
> > Can't we just add an RCU lock in the send_userspace_msi case? I don't think we should handle MSIs any differently from normal IRQs.

Well, you can't *just* add the RCU lock -- you need to add data to be  
managed via RCU (e.g. a list of MSI callbacks, or at least a boolean  
indicating whether calling the MSI code is OK).
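
To make the point concrete, here is a single-threaded userspace model of the ordering only; it is not kernel code and provides no real grace period. The reader looks up a published handler and calls through it inside what would be the read-side section, and teardown unpublishes the handler before anything is freed. C11 atomics stand in for the RCU primitives, whose names appear only in comments; every msi_* identifier below is invented for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical MSI handler record; in the kernel this would be RCU-managed. */
struct msi_handler {
	int (*send)(unsigned int data);
};

/* Published handler, or NULL once the irqchip has been torn down. */
static _Atomic(struct msi_handler *) msi_active;

/* Records the last payload delivered, so the example can be checked. */
static int msi_last_data = -1;

static int msi_demo_send(unsigned int data)
{
	msi_last_data = (int)data;
	return 0;
}

static struct msi_handler msi_demo_handler = { .send = msi_demo_send };

/* Make the handler visible to readers. */
static void msi_publish(struct msi_handler *h)
{
	atomic_store(&msi_active, h);	/* kernel: rcu_assign_pointer() */
}

/* Hide the handler from new readers before it may be freed. */
static void msi_unpublish(void)
{
	atomic_store(&msi_active, (struct msi_handler *)NULL);
	/* kernel: synchronize_rcu() would go here, before freeing */
}

/* Deliver an MSI if, and only if, a handler is currently published. */
static int msi_send_userspace(unsigned int data)
{
	int ret = -1;	/* the kernel would return an errno value instead */
	struct msi_handler *h;

	/* kernel: rcu_read_lock() */
	h = atomic_load(&msi_active);	/* kernel: rcu_dereference() */
	if (h)
		ret = h->send(data);	/* call stays inside the read-side section */
	/* kernel: rcu_read_unlock() */
	return ret;
}
```

The lock by itself protects nothing; what matters is that there is a published pointer (or flag) for the read-side section to check, and that teardown clears it and waits before freeing.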

> In fact I'm having a hard time verifying that we're always accessing  
> things with proper locks held. I'm pretty sure we're missing a few  
> cases.

Any path in particular?

> So how about we delay mpic destruction to vm destruction? We simply  
> add one user too many when we spawn the mpic and put it on  
> vm_destruct. That way users _can_ destroy mpics, but they will only  
> be really free'd once the vm is also gone.

That's what we originally had before the fd conversion.  If we want it  
again, we'll need to go back to maintaining a list of devices in KVM  
(though it could be a linked list now that we don't need to use it for  
lookups), or have some hardcoded MPIC hack.

IIRC I said back then that converting to fd would make destruction  
ordering more of a pain...

-Scott

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 16/17] KVM: PPC: MPIC: Add support for KVM_IRQ_LINE
  2013-04-25 19:03           ` Scott Wood
@ 2013-04-25 21:13             ` Alexander Graf
  -1 siblings, 0 replies; 128+ messages in thread
From: Alexander Graf @ 2013-04-25 21:13 UTC (permalink / raw)
  To: Scott Wood
  Cc: kvm-ppc, kvm@vger.kernel.org mailing list, Marcelo Tosatti, Gleb Natapov


On 25.04.2013, at 21:03, Scott Wood wrote:

> On 04/25/2013 09:49:23 AM, Alexander Graf wrote:
>> On 25.04.2013, at 13:30, Alexander Graf wrote:
>> >
>> > On 19.04.2013, at 20:51, Scott Wood wrote:
>> >
>> >> On 04/19/2013 09:06:27 AM, Alexander Graf wrote:
>> >>> Now that all pieces are in place for reusing generic irq infrastructure,
>> >>> we can copy x86's implementation of KVM_IRQ_LINE irq injection and simply
>> >>> reuse it for PPC, as it will work there just as well.
>> >>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> >>> ---
>> >>> arch/powerpc/include/uapi/asm/kvm.h |    1 +
>> >>> arch/powerpc/kvm/powerpc.c          |   13 +++++++++++++
>> >>> 2 files changed, 14 insertions(+), 0 deletions(-)
>> >>> diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
>> >>> index 3537bf3..dbb2ac2 100644
>> >>> --- a/arch/powerpc/include/uapi/asm/kvm.h
>> >>> +++ b/arch/powerpc/include/uapi/asm/kvm.h
>> >>> @@ -26,6 +26,7 @@
>> >>> #define __KVM_HAVE_SPAPR_TCE
>> >>> #define __KVM_HAVE_PPC_SMT
>> >>> #define __KVM_HAVE_IRQCHIP
>> >>> +#define __KVM_HAVE_IRQ_LINE
>> >>> struct kvm_regs {
>> >>> 	__u64 pc;
>> >>> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
>> >>> index c431fea..874c106 100644
>> >>> --- a/arch/powerpc/kvm/powerpc.c
>> >>> +++ b/arch/powerpc/kvm/powerpc.c
>> >>> @@ -33,6 +33,7 @@
>> >>> #include <asm/cputhreads.h>
>> >>> #include <asm/irqflags.h>
>> >>> #include "timing.h"
>> >>> +#include "irq.h"
>> >>> #include "../mm/mmu_decl.h"
>> >>> #define CREATE_TRACE_POINTS
>> >>> @@ -945,6 +946,18 @@ static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
>> >>> 	return 0;
>> >>> }
>> >>> +int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event,
>> >>> +			  bool line_status)
>> >>> +{
>> >>> +	if (!irqchip_in_kernel(kvm))
>> >>> +		return -ENXIO;
>> >>> +
>> >>> +	irq_event->status = kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID,
>> >>> +					irq_event->irq, irq_event->level,
>> >>> +					line_status);
>> >>> +	return 0;
>> >>> +}
>> >>
>> >> As Paul noted in the XICS patchset, this could reference an MPIC that has gone away if the user never attached any vcpus and then closed the MPIC fd.  It's not a reasonable use case, but it could be used maliciously to get the kernel to access a bad pointer.  The irqchip_in_kernel check helps somewhat, but it's meant for ensuring that the creation has happened -- it's racy if used for ensuring that destruction hasn't happened.
>> >>
>> >> The problem is rooted in the awkwardness of performing an operation that logically should be on the MPIC fd, but is instead being done on the vm fd.
>> >>
>> >> I think these three steps would fix it (the first two seem like things we should be doing anyway):
>> >> - During MPIC destruction, make sure MPIC deregisters all routes that reference it.
>> >> - In kvm_set_irq(), do not release the RCU read lock until after the set() function has been called.
>> >> - Do not hook up kvm_send_userspace_msi() to MPIC or other new irqchips, as that bypasses the RCU lock.  It could be supported as a device fd ioctl if desired, or it could be reworked to operate on an RCU-managed list of MSI handlers, though MPIC really doesn't need this at all.
>> >
>> > Can't we just add an RCU lock in the send_userspace_msi case? I don't think we should handle MSIs any differently from normal IRQs.
> 
> Well, you can't *just* add the RCU lock -- you need to add data to be managed via RCU (e.g. a list of MSI callbacks, or at least a boolean indicating whether calling the MSI code is OK).

Well, we'd just access a random pin routing :).

> 
>> In fact I'm having a hard time verifying that we're always accessing things with proper locks held. I'm pretty sure we're missing a few cases.
> 
> Any path in particular?

I'm already getting confused on whether normal MMIO accesses are always safe.

> 
>> So how about we delay mpic destruction to vm destruction? We simply add one user too many when we spawn the mpic and put it on vm_destruct. That way users _can_ destroy mpics, but they will only be really free'd once the vm is also gone.
> 
> That's what we originally had before the fd conversion.  If we want it again, we'll need to go back to maintaining a list of devices in KVM (though it could be a linked list now that we don't need to use it for lookups), or have some hardcoded MPIC hack.

Well, we could have an anonymous linked list of device pointers with a simple registration function. That way it's generic enough for any device to be kept alive until vm destruction if it wants that.
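
Something along these lines, as a userspace sketch rather than real KVM code: each device registers itself on a per-VM singly linked list, and VM teardown walks the list and calls each device's destroy callback. The list_* names and the counter used to observe destruction are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical generic device: only what the VM needs to tear it down. */
struct list_dev {
	void (*destroy)(struct list_dev *dev);
	struct list_dev *next;
};

/* Hypothetical VM holding an anonymous list of registered devices. */
struct list_vm {
	struct list_dev *devices;
};

/* Counts destroy callbacks, purely so the example can be checked. */
static int list_destroyed;

static void list_dev_free(struct list_dev *dev)
{
	list_destroyed++;
	free(dev);
}

/* Allocate a device whose destroy callback simply frees it. */
static struct list_dev *list_dev_new(void)
{
	struct list_dev *dev = calloc(1, sizeof(*dev));

	if (dev)
		dev->destroy = list_dev_free;
	return dev;
}

/* Registration: push the device onto the head of the VM's list. */
static void list_vm_register(struct list_vm *vm, struct list_dev *dev)
{
	dev->next = vm->devices;
	vm->devices = dev;
}

/* VM teardown: destroy every registered device; returns how many. */
static int list_vm_destroy_devices(struct list_vm *vm)
{
	int n = 0;

	while (vm->devices) {
		struct list_dev *dev = vm->devices;

		vm->devices = dev->next;	/* unlink before destroying */
		dev->destroy(dev);
		n++;
	}
	return n;
}
```

The list is never used for lookups, so a device only needs to register once at creation; nothing else about it has to be known to the generic code.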

> IIRC I said back then that converting to fd would make destruction ordering more of a pain...

I usually like to pick the raisins from everything I can. So while I like the fd approach for its universally understandable scheme, simplicity of use, extensibility of ioctls etc, I don't really like the headaches that come with destroying a device while a VM is running. So having a device keep itself alive until the VM is gone is the best of all worlds :).


Alex

^ permalink raw reply	[flat|nested] 128+ messages in thread

* Re: [PATCH 16/17] KVM: PPC: MPIC: Add support for KVM_IRQ_LINE
  2013-04-25 21:13             ` Alexander Graf
@ 2013-05-01 13:15               ` Marcelo Tosatti
  -1 siblings, 0 replies; 128+ messages in thread
From: Marcelo Tosatti @ 2013-05-01 13:15 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Scott Wood, kvm-ppc, kvm@vger.kernel.org mailing list, Gleb Natapov

On Thu, Apr 25, 2013 at 11:13:40PM +0200, Alexander Graf wrote:
> 
> On 25.04.2013, at 21:03, Scott Wood wrote:
> 
> > On 04/25/2013 09:49:23 AM, Alexander Graf wrote:
> >> On 25.04.2013, at 13:30, Alexander Graf wrote:
> >> >
> >> > On 19.04.2013, at 20:51, Scott Wood wrote:
> >> >
> >> >> On 04/19/2013 09:06:27 AM, Alexander Graf wrote:
> >> >>> Now that all pieces are in place for reusing generic irq infrastructure,
> >> >>> we can copy x86's implementation of KVM_IRQ_LINE irq injection and simply
> >> >>> reuse it for PPC, as it will work there just as well.
> >> >>> Signed-off-by: Alexander Graf <agraf@suse.de>
> >> >>> ---
> >> >>> arch/powerpc/include/uapi/asm/kvm.h |    1 +
> >> >>> arch/powerpc/kvm/powerpc.c          |   13 +++++++++++++
> >> >>> 2 files changed, 14 insertions(+), 0 deletions(-)
> >> >>> diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
> >> >>> index 3537bf3..dbb2ac2 100644
> >> >>> --- a/arch/powerpc/include/uapi/asm/kvm.h
> >> >>> +++ b/arch/powerpc/include/uapi/asm/kvm.h
> >> >>> @@ -26,6 +26,7 @@
> >> >>> #define __KVM_HAVE_SPAPR_TCE
> >> >>> #define __KVM_HAVE_PPC_SMT
> >> >>> #define __KVM_HAVE_IRQCHIP
> >> >>> +#define __KVM_HAVE_IRQ_LINE
> >> >>> struct kvm_regs {
> >> >>> 	__u64 pc;
> >> >>> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> >> >>> index c431fea..874c106 100644
> >> >>> --- a/arch/powerpc/kvm/powerpc.c
> >> >>> +++ b/arch/powerpc/kvm/powerpc.c
> >> >>> @@ -33,6 +33,7 @@
> >> >>> #include <asm/cputhreads.h>
> >> >>> #include <asm/irqflags.h>
> >> >>> #include "timing.h"
> >> >>> +#include "irq.h"
> >> >>> #include "../mm/mmu_decl.h"
> >> >>> #define CREATE_TRACE_POINTS
> >> >>> @@ -945,6 +946,18 @@ static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
> >> >>> 	return 0;
> >> >>> }
> >> >>> +int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event,
> >> >>> +			  bool line_status)
> >> >>> +{
> >> >>> +	if (!irqchip_in_kernel(kvm))
> >> >>> +		return -ENXIO;
> >> >>> +
> >> >>> +	irq_event->status = kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID,
> >> >>> +					irq_event->irq, irq_event->level,
> >> >>> +					line_status);
> >> >>> +	return 0;
> >> >>> +}
> >> >>
> >> >> As Paul noted in the XICS patchset, this could reference an MPIC that has gone away if the user never attached any vcpus and then closed the MPIC fd.  It's not a reasonable use case, but it could be used maliciously to get the kernel to access a bad pointer.  The irqchip_in_kernel check helps somewhat, but it's meant for ensuring that the creation has happened -- it's racy if used for ensuring that destruction hasn't happened.
> >> >>
> >> >> The problem is rooted in the awkwardness of performing an operation that logically should be on the MPIC fd, but is instead being done on the vm fd.
> >> >>
> >> >> I think these three steps would fix it (the first two seem like things we should be doing anyway):
> >> >> - During MPIC destruction, make sure MPIC deregisters all routes that reference it.
> >> >> - In kvm_set_irq(), do not release the RCU read lock until after the set() function has been called.
> >> >> - Do not hook up kvm_send_userspace_msi() to MPIC or other new irqchips, as that bypasses the RCU lock.  It could be supported as a device fd ioctl if desired, or it could be reworked to operate on an RCU-managed list of MSI handlers, though MPIC really doesn't need this at all.
> >> >
> >> > Can't we just add an RCU lock in the send_userspace_msi case? I don't think we should handle MSIs any differently from normal IRQs.
> > 
> > Well, you can't *just* add the RCU lock -- you need to add data to be managed via RCU (e.g. a list of MSI callbacks, or at least a boolean indicating whether calling the MSI code is OK).
> 
> Well, we'd just access a random pin routing :).
> 
> > 
> >> In fact I'm having a hard time verifying that we're always accessing things with proper locks held. I'm pretty sure we're missing a few cases.
> > 
> > Any path in particular?
> 
> I'm already getting confused on whether normal MMIO accesses are always safe.

Asserts via mutex_is_locked() and the spinlock/RCU variants might be helpful.
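
A rough illustration of that kind of assertion, written as a single-threaded userspace toy rather than kernel code. Every toy_* name below is made up for this sketch. Like mutex_is_locked(), the check only says that somebody holds the lock, not that the caller does; in the kernel, lockdep_assert_held() is the stricter tool when lockdep is enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy lock: just enough state to assert on. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *lock)
{
	assert(!lock->held);	/* the toy does not model contention */
	lock->held = true;
}

static void toy_lock_release(struct toy_lock *lock)
{
	assert(lock->held);
	lock->held = false;
}

static bool toy_is_locked(const struct toy_lock *lock)
{
	return lock->held;
}

/* Hypothetical irqchip state guarded by ->lock. */
struct toy_mpic {
	struct toy_lock lock;
	uint32_t pending;	/* one bit per input pin */
};

/*
 * Caller must hold mpic->lock.  Returns false when that precondition
 * (or the pin range) is violated; kernel code would WARN_ON() instead.
 */
static bool toy_mpic_raise(struct toy_mpic *mpic, unsigned int pin)
{
	if (!toy_is_locked(&mpic->lock) || pin >= 32)
		return false;
	mpic->pending |= UINT32_C(1) << pin;
	return true;
}
```

Calling toy_mpic_raise() without first calling toy_lock_acquire() fails the check, which is the sort of path such asserts are meant to flush out.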

> >> So how about we delay mpic destruction to vm destruction? We simply add one user too many when we spawn the mpic and put it on vm_destruct. That way users _can_ destroy mpics, but they will only be really free'd once the vm is also gone.
> > 
> > That's what we originally had before the fd conversion.  If we want it again, we'll need to go back to maintaining a list of devices in KVM (though it could be a linked list now that we don't need to use it for lookups), or have some hardcoded MPIC hack.
> 
> Well, we could have an anonymous linked list of device pointers with a simple registration function. That way it's generic enough for any device to be kept alive until vm destruction if it wants that.
> 
> > IIRC I said back then that converting to fd would make destruction ordering more of a pain...
> 
> I usually like to pick the raisins from everything I can. So while I like the fd approach for its universally understandable scheme, simplicity of use, extensibility of ioctls etc, I don't really like the headaches that come with destroying a device while a VM is running. So having a device keep itself alive until the VM is gone is the best of all worlds :).
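
For concreteness, the "one user too many" scheme quoted above can be modelled in plain userspace C. This is only a sketch: the toy_* names, the singly linked list and the bare int refcount are assumptions made for illustration, not the KVM device API, and the model ignores concurrency entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are KVM API. */
struct toy_dev {
	int refcount;		/* one reference per holder (fd, VM list) */
	bool *freed;		/* lets the caller observe the free */
	struct toy_dev *next;	/* link in the VM's anonymous device list */
};

struct toy_vm {
	struct toy_dev *devices;	/* kept alive until VM teardown */
};

static void toy_dev_put(struct toy_dev *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0) {
		*dev->freed = true;
		free(dev);
	}
}

/* Create a device: one reference for the fd, one extra for the VM. */
static struct toy_dev *toy_dev_create(struct toy_vm *vm, bool *freed)
{
	struct toy_dev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	*freed = false;
	dev->freed = freed;
	dev->refcount = 2;
	dev->next = vm->devices;	/* the "simple registration function" */
	vm->devices = dev;
	return dev;
}

/* Closing the device fd only drops the fd's reference. */
static void toy_dev_release_fd(struct toy_dev *dev)
{
	toy_dev_put(dev);
}

/* VM teardown drops the extra reference of every registered device. */
static void toy_vm_destroy(struct toy_vm *vm)
{
	struct toy_dev *dev = vm->devices;

	while (dev) {
		struct toy_dev *next = dev->next;

		toy_dev_put(dev);
		dev = next;
	}
	vm->devices = NULL;
}

/* Returns true if the device survives fd close and dies with the VM. */
static bool toy_lifetime_demo(void)
{
	struct toy_vm vm = { .devices = NULL };
	bool freed = false;
	struct toy_dev *dev = toy_dev_create(&vm, &freed);
	bool alive_after_close;

	if (!dev)
		return false;
	toy_dev_release_fd(dev);	/* user "destroys" the device */
	alive_after_close = !freed;
	toy_vm_destroy(&vm);		/* memory actually goes away here */
	return alive_after_close && freed;
}
```

In this model a stale route that still points at the device after the fd is closed would reference valid memory until toy_vm_destroy() runs, which is the property the thread is after.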

The other problem that arises from the moment you allow "get/set device
attribute at any time during VM lifetime" (which this interface allows)
is that synchronization with vcpus must be performed (and you don't want
to take a lock on the vcpu path). So the programmer has to avoid doing
that now. But it's no big deal.

^ permalink raw reply	[flat|nested] 128+ messages in thread


end of thread, other threads:[~2013-05-01 13:15 UTC | newest]

Thread overview: 128+ messages
2013-04-18 14:11 [PATCH 00/17] KVM: PPC: In-kernel MPIC support with irqfd Alexander Graf
2013-04-18 14:11 ` Alexander Graf
2013-04-18 14:11 ` [PATCH 01/17] KVM: Add KVM_IRQCHIP_NUM_PINS in addition to KVM_IOAPIC_NUM_PINS Alexander Graf
2013-04-18 14:11   ` Alexander Graf
2013-04-18 14:11 ` [PATCH 02/17] KVM: Introduce CONFIG_HAVE_KVM_IRQ_ROUTING Alexander Graf
2013-04-18 14:11   ` Alexander Graf
2013-04-18 14:11 ` [PATCH 03/17] KVM: Drop __KVM_HAVE_IOAPIC condition on irq routing Alexander Graf
2013-04-18 14:11   ` Alexander Graf
2013-04-18 14:11 ` [PATCH 04/17] KVM: Remove kvm_get_intr_delivery_bitmask Alexander Graf
2013-04-18 14:11   ` Alexander Graf
2013-04-18 14:11 ` [PATCH 05/17] KVM: Move irq routing to generic code Alexander Graf
2013-04-18 14:11   ` Alexander Graf
2013-04-18 14:11 ` [PATCH 06/17] KVM: Extract generic irqchip logic into irqchip.c Alexander Graf
2013-04-18 14:11   ` Alexander Graf
2013-04-18 14:11 ` [PATCH 07/17] KVM: Move irq routing setup to irqchip.c Alexander Graf
2013-04-18 14:11   ` Alexander Graf
2013-04-18 14:11 ` [PATCH 08/17] KVM: Move irqfd resample cap handling to generic code Alexander Graf
2013-04-18 14:11   ` Alexander Graf
2013-04-18 14:11 ` [PATCH 09/17] kvm: add device control API Alexander Graf
2013-04-18 14:11   ` Alexander Graf
2013-04-18 14:11 ` [PATCH 10/17] kvm/ppc/mpic: import hw/openpic.c from QEMU Alexander Graf
2013-04-18 14:11   ` Alexander Graf
2013-04-18 14:11 ` [PATCH 11/17] kvm/ppc/mpic: remove some obviously unneeded code Alexander Graf
2013-04-18 14:11   ` Alexander Graf
2013-04-18 14:11 ` [PATCH 12/17] kvm/ppc/mpic: adapt to kernel style and environment Alexander Graf
2013-04-18 14:11   ` Alexander Graf
2013-04-18 14:11 ` [PATCH 13/17] kvm/ppc/mpic: in-kernel MPIC emulation Alexander Graf
2013-04-18 14:11   ` Alexander Graf
2013-04-18 14:11 ` [PATCH 14/17] kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC Alexander Graf
2013-04-18 14:11   ` Alexander Graf
2013-04-18 14:11 ` [PATCH 15/17] KVM: PPC: Support irq routing and irqfd for in-kernel MPIC Alexander Graf
2013-04-18 14:11   ` Alexander Graf
2013-04-18 21:39   ` Scott Wood
2013-04-18 21:39     ` Scott Wood
2013-04-19  0:15     ` Alexander Graf
2013-04-19  0:15       ` Alexander Graf
2013-04-19  0:50       ` Scott Wood
2013-04-19  0:50         ` Scott Wood
2013-04-19  1:09         ` Alexander Graf
2013-04-19  1:09           ` Alexander Graf
2013-04-19  1:37           ` Scott Wood
2013-04-19  1:37             ` Scott Wood
2013-04-22 23:31       ` Scott Wood
2013-04-22 23:31         ` Scott Wood
2013-04-18 14:11 ` [PATCH 16/17] KVM: PPC: MPIC: Add support for KVM_IRQ_LINE Alexander Graf
2013-04-18 14:11   ` Alexander Graf
2013-04-18 14:11 ` [PATCH 17/17] KVM: PPC: MPIC: Restrict to e500 platforms Alexander Graf
2013-04-18 14:11   ` Alexander Graf
2013-04-18 14:29   ` Scott Wood
2013-04-18 14:29     ` Scott Wood
2013-04-18 14:52     ` Alexander Graf
2013-04-18 14:52       ` Alexander Graf
2013-04-19 14:06 [PATCH 00/17] KVM: PPC: In-kernel MPIC support with irqfd v3 Alexander Graf
2013-04-19 14:06 ` Alexander Graf
2013-04-19 14:06 ` [PATCH 01/17] KVM: Add KVM_IRQCHIP_NUM_PINS in addition to KVM_IOAPIC_NUM_PINS Alexander Graf
2013-04-19 14:06   ` Alexander Graf
2013-04-25 10:18   ` Michael S. Tsirkin
2013-04-25 10:18     ` Michael S. Tsirkin
2013-04-19 14:06 ` [PATCH 02/17] KVM: Introduce CONFIG_HAVE_KVM_IRQ_ROUTING Alexander Graf
2013-04-19 14:06   ` Alexander Graf
2013-04-25 10:18   ` Michael S. Tsirkin
2013-04-25 10:18     ` Michael S. Tsirkin
2013-04-19 14:06 ` [PATCH 03/17] KVM: Drop __KVM_HAVE_IOAPIC condition on irq routing Alexander Graf
2013-04-19 14:06   ` Alexander Graf
2013-04-25 10:19   ` Michael S. Tsirkin
2013-04-25 10:19     ` Michael S. Tsirkin
2013-04-19 14:06 ` [PATCH 04/17] KVM: Remove kvm_get_intr_delivery_bitmask Alexander Graf
2013-04-19 14:06   ` Alexander Graf
2013-04-25 10:19   ` Michael S. Tsirkin
2013-04-25 10:19     ` Michael S. Tsirkin
2013-04-19 14:06 ` [PATCH 05/17] KVM: Move irq routing to generic code Alexander Graf
2013-04-19 14:06   ` Alexander Graf
2013-04-25 10:19   ` Michael S. Tsirkin
2013-04-25 10:19     ` Michael S. Tsirkin
2013-04-19 14:06 ` [PATCH 06/17] KVM: Extract generic irqchip logic into irqchip.c Alexander Graf
2013-04-19 14:06   ` Alexander Graf
2013-04-25 10:19   ` Michael S. Tsirkin
2013-04-25 10:19     ` Michael S. Tsirkin
2013-04-19 14:06 ` [PATCH 07/17] KVM: Move irq routing setup to irqchip.c Alexander Graf
2013-04-19 14:06   ` Alexander Graf
2013-04-25 10:20   ` Michael S. Tsirkin
2013-04-25 10:20     ` Michael S. Tsirkin
2013-04-19 14:06 ` [PATCH 08/17] KVM: Move irqfd resample cap handling to generic code Alexander Graf
2013-04-19 14:06   ` Alexander Graf
2013-04-25 10:21   ` Michael S. Tsirkin
2013-04-25 10:21     ` Michael S. Tsirkin
2013-04-19 14:06 ` [PATCH 09/17] kvm: add device control API Alexander Graf
2013-04-19 14:06   ` Alexander Graf
2013-04-19 14:06 ` [PATCH 10/17] kvm/ppc/mpic: import hw/openpic.c from QEMU Alexander Graf
2013-04-19 14:06   ` Alexander Graf
2013-04-19 14:06 ` [PATCH 11/17] kvm/ppc/mpic: remove some obviously unneeded code Alexander Graf
2013-04-19 14:06   ` Alexander Graf
2013-04-19 14:06 ` [PATCH 12/17] kvm/ppc/mpic: adapt to kernel style and environment Alexander Graf
2013-04-19 14:06   ` Alexander Graf
2013-04-19 14:06 ` [PATCH 13/17] kvm/ppc/mpic: in-kernel MPIC emulation Alexander Graf
2013-04-19 14:06   ` Alexander Graf
2013-04-19 14:06 ` [PATCH 14/17] kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC Alexander Graf
2013-04-19 14:06   ` Alexander Graf
2013-04-19 14:06 ` [PATCH 15/17] KVM: PPC: Support irq routing and irqfd for in-kernel MPIC Alexander Graf
2013-04-19 14:06   ` Alexander Graf
2013-04-19 18:02   ` Scott Wood
2013-04-19 18:02     ` Scott Wood
2013-04-25  9:58     ` Alexander Graf
2013-04-25  9:58       ` Alexander Graf
2013-04-25 16:53       ` Scott Wood
2013-04-25 16:53         ` Scott Wood
2013-04-23  6:38   ` Paul Mackerras
2013-04-23  6:38     ` Paul Mackerras
2013-04-25 10:02     ` Alexander Graf
2013-04-25 10:02       ` Alexander Graf
2013-04-19 14:06 ` [PATCH 16/17] KVM: PPC: MPIC: Add support for KVM_IRQ_LINE Alexander Graf
2013-04-19 14:06   ` Alexander Graf
2013-04-19 18:51   ` Scott Wood
2013-04-19 18:51     ` Scott Wood
2013-04-25 11:30     ` Alexander Graf
2013-04-25 11:30       ` Alexander Graf
2013-04-25 14:49       ` Alexander Graf
2013-04-25 14:49         ` Alexander Graf
2013-04-25 19:03         ` Scott Wood
2013-04-25 19:03           ` Scott Wood
2013-04-25 21:13           ` Alexander Graf
2013-04-25 21:13             ` Alexander Graf
2013-05-01 13:15             ` Marcelo Tosatti
2013-05-01 13:15               ` Marcelo Tosatti
2013-04-19 14:06 ` [PATCH 17/17] KVM: PPC: MPIC: Restrict to e500 platforms Alexander Graf
2013-04-19 14:06   ` Alexander Graf
2013-04-25 10:24 ` [PATCH 00/17] KVM: PPC: In-kernel MPIC support with irqfd v3 Michael S. Tsirkin
2013-04-25 10:24   ` Michael S. Tsirkin
