* [PULL 00/38] ppc patch queue 2012-08-15
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

Hi Avi,

This is my current patch queue for ppc. It contains the following improvements:

  * add support for idle hcall on booke
  * icache clear on map
  * mmu notifier support for e500 and book3s_pr
  * revive the 440 support slightly (still not 100% happy)
  * unify booke and book3s_pr entry/exit code a bit
  * add watchdog emulation for booke
  * small bug fixes

Please pull.

Alex


The following changes since commit dbcb4e798072d114fe68813f39a9efd239ab99c0:
  Avi Kivity (1):
        KVM: VMX: Advertize RDTSC exiting to nested guests

are available in the git repository at:

  git://github.com/agraf/linux-2.6.git for-upstream

Alan Cox (1):
      ppc: e500_tlb memset clears nothing

Alexander Graf (24):
      KVM: PPC: PR: Use generic tracepoint for guest exit
      KVM: PPC: Expose SYNC cap based on mmu notifiers
      KVM: PPC: BookE: Expose remote TLB flushes in debugfs
      KVM: PPC: E500: Fix clear_tlb_refs
      KVM: PPC: BookE: Add check_requests helper function
      KVM: PPC: BookE: Add support for vcpu->mode
      KVM: PPC: E500: Implement MMU notifiers
      KVM: PPC: Add cache flush on page map
      KVM: PPC: BookE: Add some more trace points
      KVM: PPC: BookE: No duplicate request != 0 check
      KVM: PPC: Use same kvmppc_prepare_to_enter code for booke and book3s_pr
      KVM: PPC: Book3s: PR: Add (dumb) MMU Notifier support
      KVM: PPC: BookE: Drop redundant vcpu->mode set
      KVM: PPC: Book3S: PR: Only do resched check once per exit
      KVM: PPC: Exit guest context while handling exit
      KVM: PPC: Book3S: PR: Indicate we're out of guest mode
      KVM: PPC: Consistentify vcpu exit path
      KVM: PPC: Book3S: PR: Rework irq disabling
      KVM: PPC: Move kvm_guest_enter call into generic code
      KVM: PPC: Ignore EXITING_GUEST_MODE mode
      KVM: PPC: Add return value in prepare_to_enter
      KVM: PPC: Add return value to core_check_requests
      KVM: PPC: 44x: Initialize PVR
      KVM: PPC: BookE: Add MCSR SPR support

Bharat Bhushan (2):
      KVM: PPC: booke: Add watchdog emulation
      booke: Added ONE_REG interface for IAC/DAC debug registers

Liu Yu-B13201 (3):
      KVM: PPC: Add support for ePAPR idle hcall in host kernel
      KVM: PPC: ev_idle hcall support for e500 guests
      PPC: Don't use hardcoded opcode for ePAPR hcall invocation

Paul Mackerras (2):
      KVM: PPC: Book3S HV: Fix incorrect branch in H_CEDE code
      KVM: PPC: Quieten message about allocating linear regions

Scott Wood (2):
      powerpc/fsl-soc: use CONFIG_EPAPR_PARAVIRT for hcalls
      powerpc/epapr: export epapr_hypercall_start

Stuart Yoder (4):
      PPC: epapr: create define for return code value of success
      KVM: PPC: use definitions in epapr header for hcalls
      KVM: PPC: add pvinfo for hcall opcodes on e500mc/e5500
      PPC: select EPAPR_PARAVIRT for all users of epapr hcalls

 Documentation/virtual/kvm/api.txt       |    7 +-
 arch/powerpc/include/asm/Kbuild         |    1 +
 arch/powerpc/include/asm/epapr_hcalls.h |   36 ++--
 arch/powerpc/include/asm/fsl_hcalls.h   |   36 ++--
 arch/powerpc/include/asm/kvm.h          |   12 ++
 arch/powerpc/include/asm/kvm_host.h     |   30 +++-
 arch/powerpc/include/asm/kvm_para.h     |   21 ++-
 arch/powerpc/include/asm/kvm_ppc.h      |   28 +++
 arch/powerpc/include/asm/reg_booke.h    |    7 +
 arch/powerpc/kernel/epapr_hcalls.S      |   28 +++
 arch/powerpc/kernel/epapr_paravirt.c    |   11 +-
 arch/powerpc/kernel/kvm.c               |    2 +-
 arch/powerpc/kernel/ppc_ksyms.c         |    5 +
 arch/powerpc/kvm/44x.c                  |    1 +
 arch/powerpc/kvm/Kconfig                |    3 +
 arch/powerpc/kvm/book3s.c               |    9 +
 arch/powerpc/kvm/book3s_32_mmu_host.c   |    4 +
 arch/powerpc/kvm/book3s_64_mmu_host.c   |    3 +
 arch/powerpc/kvm/book3s_hv_builtin.c    |    4 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |   12 +-
 arch/powerpc/kvm/book3s_mmu_hpte.c      |    5 -
 arch/powerpc/kvm/book3s_pr.c            |  109 ++++++++----
 arch/powerpc/kvm/book3s_rmhandlers.S    |   15 +-
 arch/powerpc/kvm/booke.c                |  279 +++++++++++++++++++++++++------
 arch/powerpc/kvm/booke_emulate.c        |   22 ++-
 arch/powerpc/kvm/e500_tlb.c             |   82 ++++++++--
 arch/powerpc/kvm/powerpc.c              |  128 +++++++++++++--
 arch/powerpc/kvm/trace.h                |  146 +++++++++++++---
 arch/powerpc/mm/mem.c                   |    1 +
 arch/powerpc/platforms/Kconfig          |    1 +
 arch/powerpc/sysdev/fsl_msi.c           |    9 +-
 arch/powerpc/sysdev/fsl_soc.c           |    2 +
 drivers/tty/Kconfig                     |    1 +
 drivers/virt/Kconfig                    |    1 +
 include/linux/kvm.h                     |    4 +
 include/linux/kvm_host.h                |    1 +
 36 files changed, 857 insertions(+), 209 deletions(-)


* [PATCH 01/38] PPC: epapr: create define for return code value of success
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list, Stuart Yoder

From: Stuart Yoder <stuart.yoder@freescale.com>

Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/epapr_hcalls.h |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/epapr_hcalls.h b/arch/powerpc/include/asm/epapr_hcalls.h
index bf2c06c..c0c7adc 100644
--- a/arch/powerpc/include/asm/epapr_hcalls.h
+++ b/arch/powerpc/include/asm/epapr_hcalls.h
@@ -88,7 +88,8 @@
 #define _EV_HCALL_TOKEN(id, num) (((id) << 16) | (num))
 #define EV_HCALL_TOKEN(hcall_num) _EV_HCALL_TOKEN(EV_EPAPR_VENDOR_ID, hcall_num)
 
-/* epapr error codes */
+/* epapr return codes */
+#define EV_SUCCESS		0
 #define EV_EPERM		1	/* Operation not permitted */
 #define EV_ENOENT		2	/*  Entry Not Found */
 #define EV_EIO			3	/* I/O error occured */
-- 
1.6.0.2



* [PATCH 02/38] KVM: PPC: use definitions in epapr header for hcalls
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list, Stuart Yoder

From: Stuart Yoder <stuart.yoder@freescale.com>

Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm_para.h |   21 +++++++++++----------
 arch/powerpc/kernel/kvm.c           |    2 +-
 arch/powerpc/kvm/powerpc.c          |   10 +++++-----
 3 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_para.h b/arch/powerpc/include/asm/kvm_para.h
index c18916b..a168ce3 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -75,9 +75,10 @@ struct kvm_vcpu_arch_shared {
 };
 
 #define KVM_SC_MAGIC_R0		0x4b564d21 /* "KVM!" */
-#define HC_VENDOR_KVM		(42 << 16)
-#define HC_EV_SUCCESS		0
-#define HC_EV_UNIMPLEMENTED	12
+
+#define KVM_HCALL_TOKEN(num)     _EV_HCALL_TOKEN(EV_KVM_VENDOR_ID, num)
+
+#include <asm/epapr_hcalls.h>
 
 #define KVM_FEATURE_MAGIC_PAGE	1
 
@@ -121,7 +122,7 @@ static unsigned long kvm_hypercall(unsigned long *in,
 				   unsigned long *out,
 				   unsigned long nr)
 {
-	return HC_EV_UNIMPLEMENTED;
+	return EV_UNIMPLEMENTED;
 }
 
 #endif
@@ -132,7 +133,7 @@ static inline long kvm_hypercall0_1(unsigned int nr, unsigned long *r2)
 	unsigned long out[8];
 	unsigned long r;
 
-	r = kvm_hypercall(in, out, nr | HC_VENDOR_KVM);
+	r = kvm_hypercall(in, out, KVM_HCALL_TOKEN(nr));
 	*r2 = out[0];
 
 	return r;
@@ -143,7 +144,7 @@ static inline long kvm_hypercall0(unsigned int nr)
 	unsigned long in[8];
 	unsigned long out[8];
 
-	return kvm_hypercall(in, out, nr | HC_VENDOR_KVM);
+	return kvm_hypercall(in, out, KVM_HCALL_TOKEN(nr));
 }
 
 static inline long kvm_hypercall1(unsigned int nr, unsigned long p1)
@@ -152,7 +153,7 @@ static inline long kvm_hypercall1(unsigned int nr, unsigned long p1)
 	unsigned long out[8];
 
 	in[0] = p1;
-	return kvm_hypercall(in, out, nr | HC_VENDOR_KVM);
+	return kvm_hypercall(in, out, KVM_HCALL_TOKEN(nr));
 }
 
 static inline long kvm_hypercall2(unsigned int nr, unsigned long p1,
@@ -163,7 +164,7 @@ static inline long kvm_hypercall2(unsigned int nr, unsigned long p1,
 
 	in[0] = p1;
 	in[1] = p2;
-	return kvm_hypercall(in, out, nr | HC_VENDOR_KVM);
+	return kvm_hypercall(in, out, KVM_HCALL_TOKEN(nr));
 }
 
 static inline long kvm_hypercall3(unsigned int nr, unsigned long p1,
@@ -175,7 +176,7 @@ static inline long kvm_hypercall3(unsigned int nr, unsigned long p1,
 	in[0] = p1;
 	in[1] = p2;
 	in[2] = p3;
-	return kvm_hypercall(in, out, nr | HC_VENDOR_KVM);
+	return kvm_hypercall(in, out, KVM_HCALL_TOKEN(nr));
 }
 
 static inline long kvm_hypercall4(unsigned int nr, unsigned long p1,
@@ -189,7 +190,7 @@ static inline long kvm_hypercall4(unsigned int nr, unsigned long p1,
 	in[1] = p2;
 	in[2] = p3;
 	in[3] = p4;
-	return kvm_hypercall(in, out, nr | HC_VENDOR_KVM);
+	return kvm_hypercall(in, out, KVM_HCALL_TOKEN(nr));
 }
 
 
diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index 867db1d..a61b133 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -419,7 +419,7 @@ static void kvm_map_magic_page(void *data)
 	in[0] = KVM_MAGIC_PAGE;
 	in[1] = KVM_MAGIC_PAGE;
 
-	kvm_hypercall(in, out, HC_VENDOR_KVM | KVM_HC_PPC_MAP_MAGIC_PAGE);
+	kvm_hypercall(in, out, KVM_HCALL_TOKEN(KVM_HC_PPC_MAP_MAGIC_PAGE));
 
 	*features = out[0];
 }
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 879b14a..62165cc 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -67,18 +67,18 @@ int kvmppc_kvm_pv(struct kvm_vcpu *vcpu)
 	}
 
 	switch (nr) {
-	case HC_VENDOR_KVM | KVM_HC_PPC_MAP_MAGIC_PAGE:
+	case KVM_HCALL_TOKEN(KVM_HC_PPC_MAP_MAGIC_PAGE):
 	{
 		vcpu->arch.magic_page_pa = param1;
 		vcpu->arch.magic_page_ea = param2;
 
 		r2 = KVM_MAGIC_FEAT_SR | KVM_MAGIC_FEAT_MAS0_TO_SPRG7;
 
-		r = HC_EV_SUCCESS;
+		r = EV_SUCCESS;
 		break;
 	}
-	case HC_VENDOR_KVM | KVM_HC_FEATURES:
-		r = HC_EV_SUCCESS;
+	case KVM_HCALL_TOKEN(KVM_HC_FEATURES):
+		r = EV_SUCCESS;
 #if defined(CONFIG_PPC_BOOK3S) || defined(CONFIG_KVM_E500V2)
 		/* XXX Missing magic page on 44x */
 		r2 |= (1 << KVM_FEATURE_MAGIC_PAGE);
@@ -87,7 +87,7 @@ int kvmppc_kvm_pv(struct kvm_vcpu *vcpu)
 		/* Second return value is in r4 */
 		break;
 	default:
-		r = HC_EV_UNIMPLEMENTED;
+		r = EV_UNIMPLEMENTED;
 		break;
 	}
 
-- 
1.6.0.2


* [PATCH 03/38] KVM: PPC: add pvinfo for hcall opcodes on e500mc/e5500
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list, Stuart Yoder, Liu Yu

From: Stuart Yoder <stuart.yoder@freescale.com>

Signed-off-by: Liu Yu <yu.liu@freescale.com>
[stuart: factored this out from idle hcall support in host patch]
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/powerpc.c |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 62165cc..4ff2f27 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -747,9 +747,16 @@ int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
 
 static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
 {
+	u32 inst_nop = 0x60000000;
+#ifdef CONFIG_KVM_BOOKE_HV
+	u32 inst_sc1 = 0x44000022;
+	pvinfo->hcall[0] = inst_sc1;
+	pvinfo->hcall[1] = inst_nop;
+	pvinfo->hcall[2] = inst_nop;
+	pvinfo->hcall[3] = inst_nop;
+#else
 	u32 inst_lis = 0x3c000000;
 	u32 inst_ori = 0x60000000;
-	u32 inst_nop = 0x60000000;
 	u32 inst_sc = 0x44000002;
 	u32 inst_imm_mask = 0xffff;
 
@@ -766,6 +773,7 @@ static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
 	pvinfo->hcall[1] = inst_ori | (KVM_SC_MAGIC_R0 & inst_imm_mask);
 	pvinfo->hcall[2] = inst_sc;
 	pvinfo->hcall[3] = inst_nop;
+#endif
 
 	return 0;
 }
-- 
1.6.0.2



* [PATCH 04/38] KVM: PPC: Add support for ePAPR idle hcall in host kernel
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list, Liu Yu-B13201, Liu Yu, Stuart Yoder

From: Liu Yu-B13201 <Yu.Liu@freescale.com>

And add a new flag definition in kvm_ppc_pvinfo to indicate
whether the host supports the EV_IDLE hcall.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
[stuart.yoder@freescale.com: cleanup,fixes for conditions allowing idle]
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
[agraf: fix typo]
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 Documentation/virtual/kvm/api.txt |    7 +++++--
 arch/powerpc/include/asm/Kbuild   |    1 +
 arch/powerpc/kvm/powerpc.c        |   10 ++++++++--
 include/linux/kvm.h               |    2 ++
 4 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index bf33aaa..a37ec45 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1190,12 +1190,15 @@ struct kvm_ppc_pvinfo {
 This ioctl fetches PV specific information that need to be passed to the guest
 using the device tree or other means from vm context.
 
-For now the only implemented piece of information distributed here is an array
-of 4 instructions that make up a hypercall.
+The hcall array defines 4 instructions that make up a hypercall.
 
 If any additional field gets added to this structure later on, a bit for that
 additional piece of information will be set in the flags bitmap.
 
+The flags bitmap is defined as:
+
+   /* the host supports the ePAPR idle hcall
+   #define KVM_PPC_PVINFO_FLAGS_EV_IDLE   (1<<0)
 
 4.48 KVM_ASSIGN_PCI_DEVICE
 
diff --git a/arch/powerpc/include/asm/Kbuild b/arch/powerpc/include/asm/Kbuild
index 7e313f1..13d6b7b 100644
--- a/arch/powerpc/include/asm/Kbuild
+++ b/arch/powerpc/include/asm/Kbuild
@@ -34,5 +34,6 @@ header-y += termios.h
 header-y += types.h
 header-y += ucontext.h
 header-y += unistd.h
+header-y += epapr_hcalls.h
 
 generic-y += rwsem.h
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 4ff2f27..6645cc7 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -38,8 +38,7 @@
 
 int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
 {
-	return !(v->arch.shared->msr & MSR_WE) ||
-	       !!(v->arch.pending_exceptions) ||
+	return !!(v->arch.pending_exceptions) ||
 	       v->requests;
 }
 
@@ -86,6 +85,11 @@ int kvmppc_kvm_pv(struct kvm_vcpu *vcpu)
 
 		/* Second return value is in r4 */
 		break;
+	case EV_HCALL_TOKEN(EV_IDLE):
+		r = EV_SUCCESS;
+		kvm_vcpu_block(vcpu);
+		clear_bit(KVM_REQ_UNHALT, &vcpu->requests);
+		break;
 	default:
 		r = EV_UNIMPLEMENTED;
 		break;
@@ -775,6 +779,8 @@ static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
 	pvinfo->hcall[3] = inst_nop;
 #endif
 
+	pvinfo->flags = KVM_PPC_PVINFO_FLAGS_EV_IDLE;
+
 	return 0;
 }
 
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 2ce09aa..c03e59e 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -473,6 +473,8 @@ struct kvm_ppc_smmu_info {
 	struct kvm_ppc_one_seg_page_size sps[KVM_PPC_PAGE_SIZES_MAX_SZ];
 };
 
+#define KVM_PPC_PVINFO_FLAGS_EV_IDLE   (1<<0)
+
 #define KVMIO 0xAE
 
 /* machine type bits, to be used as argument to KVM_CREATE_VM */
-- 
1.6.0.2



* [PATCH 05/38] KVM: PPC: ev_idle hcall support for e500 guests
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list, Liu Yu-B13201, Liu Yu, Varun Sethi, Stuart Yoder

From: Liu Yu-B13201 <Yu.Liu@freescale.com>

Signed-off-by: Liu Yu <yu.liu@freescale.com>
[varun: 64-bit changes]
Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/epapr_hcalls.h |   11 ++++++-----
 arch/powerpc/kernel/epapr_hcalls.S      |   28 ++++++++++++++++++++++++++++
 arch/powerpc/kernel/epapr_paravirt.c    |   11 ++++++++++-
 3 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/epapr_hcalls.h b/arch/powerpc/include/asm/epapr_hcalls.h
index c0c7adc..833ce2c 100644
--- a/arch/powerpc/include/asm/epapr_hcalls.h
+++ b/arch/powerpc/include/asm/epapr_hcalls.h
@@ -50,10 +50,6 @@
 #ifndef _EPAPR_HCALLS_H
 #define _EPAPR_HCALLS_H
 
-#include <linux/types.h>
-#include <linux/errno.h>
-#include <asm/byteorder.h>
-
 #define EV_BYTE_CHANNEL_SEND		1
 #define EV_BYTE_CHANNEL_RECEIVE		2
 #define EV_BYTE_CHANNEL_POLL		3
@@ -109,6 +105,11 @@
 #define EV_UNIMPLEMENTED	12	/* Unimplemented hypercall */
 #define EV_BUFFER_OVERFLOW	13	/* Caller-supplied buffer too small */
 
+#ifndef __ASSEMBLY__
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <asm/byteorder.h>
+
 /*
  * Hypercall register clobber list
  *
@@ -506,5 +507,5 @@ static inline unsigned int ev_idle(void)
 
 	return r3;
 }
-
+#endif /* !__ASSEMBLY__ */
 #endif
diff --git a/arch/powerpc/kernel/epapr_hcalls.S b/arch/powerpc/kernel/epapr_hcalls.S
index 697b390..62c0dc2 100644
--- a/arch/powerpc/kernel/epapr_hcalls.S
+++ b/arch/powerpc/kernel/epapr_hcalls.S
@@ -8,13 +8,41 @@
  */
 
 #include <linux/threads.h>
+#include <asm/epapr_hcalls.h>
 #include <asm/reg.h>
 #include <asm/page.h>
 #include <asm/cputable.h>
 #include <asm/thread_info.h>
 #include <asm/ppc_asm.h>
+#include <asm/asm-compat.h>
 #include <asm/asm-offsets.h>
 
+/* epapr_ev_idle() was derived from e500_idle() */
+_GLOBAL(epapr_ev_idle)
+	CURRENT_THREAD_INFO(r3, r1)
+	PPC_LL	r4, TI_LOCAL_FLAGS(r3)	/* set napping bit */
+	ori	r4, r4,_TLF_NAPPING	/* so when we take an exception */
+	PPC_STL	r4, TI_LOCAL_FLAGS(r3)	/* it will return to our caller */
+
+	wrteei	1
+
+idle_loop:
+	LOAD_REG_IMMEDIATE(r11, EV_HCALL_TOKEN(EV_IDLE))
+
+.global epapr_ev_idle_start
+epapr_ev_idle_start:
+	li	r3, -1
+	nop
+	nop
+	nop
+
+	/*
+	 * Guard against spurious wakeups from a hypervisor --
+	 * only an interrupt will cause us to return to LR due to
+	 * _TLF_NAPPING.
+	 */
+	b	idle_loop
+
 /* Hypercall entry point. Will be patched with device tree instructions. */
 .global epapr_hypercall_start
 epapr_hypercall_start:
diff --git a/arch/powerpc/kernel/epapr_paravirt.c b/arch/powerpc/kernel/epapr_paravirt.c
index 028aeae..f3eab85 100644
--- a/arch/powerpc/kernel/epapr_paravirt.c
+++ b/arch/powerpc/kernel/epapr_paravirt.c
@@ -21,6 +21,10 @@
 #include <asm/epapr_hcalls.h>
 #include <asm/cacheflush.h>
 #include <asm/code-patching.h>
+#include <asm/machdep.h>
+
+extern void epapr_ev_idle(void);
+extern u32 epapr_ev_idle_start[];
 
 bool epapr_paravirt_enabled;
 
@@ -41,8 +45,13 @@ static int __init epapr_paravirt_init(void)
 	if (len % 4 || len > (4 * 4))
 		return -ENODEV;
 
-	for (i = 0; i < (len / 4); i++)
+	for (i = 0; i < (len / 4); i++) {
 		patch_instruction(epapr_hypercall_start + i, insts[i]);
+		patch_instruction(epapr_ev_idle_start + i, insts[i]);
+	}
+
+	if (of_get_property(hyper_node, "has-idle", NULL))
+		ppc_md.power_save = epapr_ev_idle;
 
 	epapr_paravirt_enabled = true;
 
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 06/38] PPC: select EPAPR_PARAVIRT for all users of epapr hcalls
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list, Stuart Yoder

From: Stuart Yoder <stuart.yoder@freescale.com>

Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/platforms/Kconfig |    1 +
 drivers/tty/Kconfig            |    1 +
 drivers/virt/Kconfig           |    1 +
 3 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/platforms/Kconfig b/arch/powerpc/platforms/Kconfig
index e7a896a..48a920d 100644
--- a/arch/powerpc/platforms/Kconfig
+++ b/arch/powerpc/platforms/Kconfig
@@ -90,6 +90,7 @@ config MPIC
 config PPC_EPAPR_HV_PIC
 	bool
 	default n
+	select EPAPR_PARAVIRT
 
 config MPIC_WEIRD
 	bool
diff --git a/drivers/tty/Kconfig b/drivers/tty/Kconfig
index 830cd62..aa99cd2 100644
--- a/drivers/tty/Kconfig
+++ b/drivers/tty/Kconfig
@@ -358,6 +358,7 @@ config TRACE_SINK
 config PPC_EPAPR_HV_BYTECHAN
 	tristate "ePAPR hypervisor byte channel driver"
 	depends on PPC
+	select EPAPR_PARAVIRT
 	help
 	  This driver creates /dev entries for each ePAPR hypervisor byte
 	  channel, thereby allowing applications to communicate with byte
diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
index 2dcdbc9..99ebdde 100644
--- a/drivers/virt/Kconfig
+++ b/drivers/virt/Kconfig
@@ -15,6 +15,7 @@ if VIRT_DRIVERS
 config FSL_HV_MANAGER
 	tristate "Freescale hypervisor management driver"
 	depends on FSL_SOC
+	select EPAPR_PARAVIRT
 	help
           The Freescale hypervisor management driver provides several services
 	  to drivers and applications related to the Freescale hypervisor:
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 07/38] powerpc/fsl-soc: use CONFIG_EPAPR_PARAVIRT for hcalls
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list, Scott Wood, Stuart Yoder

From: Scott Wood <scottwood@freescale.com>

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/sysdev/fsl_msi.c |    9 +++++++--
 arch/powerpc/sysdev/fsl_soc.c |    2 ++
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/sysdev/fsl_msi.c b/arch/powerpc/sysdev/fsl_msi.c
index 6e097de..7e2b2f2 100644
--- a/arch/powerpc/sysdev/fsl_msi.c
+++ b/arch/powerpc/sysdev/fsl_msi.c
@@ -236,7 +236,6 @@ static void fsl_msi_cascade(unsigned int irq, struct irq_desc *desc)
 	u32 intr_index;
 	u32 have_shift = 0;
 	struct fsl_msi_cascade_data *cascade_data;
-	unsigned int ret;
 
 	cascade_data = irq_get_handler_data(irq);
 	msi_data = cascade_data->msi_data;
@@ -268,7 +267,9 @@ static void fsl_msi_cascade(unsigned int irq, struct irq_desc *desc)
 	case FSL_PIC_IP_IPIC:
 		msir_value = fsl_msi_read(msi_data->msi_regs, msir_index * 0x4);
 		break;
-	case FSL_PIC_IP_VMPIC:
+#ifdef CONFIG_EPAPR_PARAVIRT
+	case FSL_PIC_IP_VMPIC: {
+		unsigned int ret;
 		ret = fh_vmpic_get_msir(virq_to_hw(irq), &msir_value);
 		if (ret) {
 			pr_err("fsl-msi: fh_vmpic_get_msir() failed for "
@@ -277,6 +278,8 @@ static void fsl_msi_cascade(unsigned int irq, struct irq_desc *desc)
 		}
 		break;
 	}
+#endif
+	}
 
 	while (msir_value) {
 		intr_index = ffs(msir_value) - 1;
@@ -508,10 +511,12 @@ static const struct of_device_id fsl_of_msi_ids[] = {
 		.compatible = "fsl,ipic-msi",
 		.data = (void *)&ipic_msi_feature,
 	},
+#ifdef CONFIG_EPAPR_PARAVIRT
 	{
 		.compatible = "fsl,vmpic-msi",
 		.data = (void *)&vmpic_msi_feature,
 	},
+#endif
 	{}
 };
 
diff --git a/arch/powerpc/sysdev/fsl_soc.c b/arch/powerpc/sysdev/fsl_soc.c
index c449dbd..97118dc 100644
--- a/arch/powerpc/sysdev/fsl_soc.c
+++ b/arch/powerpc/sysdev/fsl_soc.c
@@ -253,6 +253,7 @@ struct platform_diu_data_ops diu_ops;
 EXPORT_SYMBOL(diu_ops);
 #endif
 
+#ifdef CONFIG_EPAPR_PARAVIRT
 /*
  * Restart the current partition
  *
@@ -278,3 +279,4 @@ void fsl_hv_halt(void)
 	pr_info("hv exit\n");
 	fh_partition_stop(-1);
 }
+#endif
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 08/38] PPC: Don't use hardcoded opcode for ePAPR hcall invocation
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list, Liu Yu-B13201, Liu Yu, Stuart Yoder

From: Liu Yu-B13201 <Yu.Liu@freescale.com>

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/epapr_hcalls.h |   22 +++++++++---------
 arch/powerpc/include/asm/fsl_hcalls.h   |   36 +++++++++++++++---------------
 2 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/arch/powerpc/include/asm/epapr_hcalls.h b/arch/powerpc/include/asm/epapr_hcalls.h
index 833ce2c..b8d9445 100644
--- a/arch/powerpc/include/asm/epapr_hcalls.h
+++ b/arch/powerpc/include/asm/epapr_hcalls.h
@@ -195,7 +195,7 @@ static inline unsigned int ev_int_set_config(unsigned int interrupt,
 	r5  = priority;
 	r6  = destination;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3), "+r" (r4), "+r" (r5), "+r" (r6)
 		: : EV_HCALL_CLOBBERS4
 	);
@@ -224,7 +224,7 @@ static inline unsigned int ev_int_get_config(unsigned int interrupt,
 	r11 = EV_HCALL_TOKEN(EV_INT_GET_CONFIG);
 	r3 = interrupt;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3), "=r" (r4), "=r" (r5), "=r" (r6)
 		: : EV_HCALL_CLOBBERS4
 	);
@@ -254,7 +254,7 @@ static inline unsigned int ev_int_set_mask(unsigned int interrupt,
 	r3 = interrupt;
 	r4 = mask;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3), "+r" (r4)
 		: : EV_HCALL_CLOBBERS2
 	);
@@ -279,7 +279,7 @@ static inline unsigned int ev_int_get_mask(unsigned int interrupt,
 	r11 = EV_HCALL_TOKEN(EV_INT_GET_MASK);
 	r3 = interrupt;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3), "=r" (r4)
 		: : EV_HCALL_CLOBBERS2
 	);
@@ -307,7 +307,7 @@ static inline unsigned int ev_int_eoi(unsigned int interrupt)
 	r11 = EV_HCALL_TOKEN(EV_INT_EOI);
 	r3 = interrupt;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3)
 		: : EV_HCALL_CLOBBERS1
 	);
@@ -346,7 +346,7 @@ static inline unsigned int ev_byte_channel_send(unsigned int handle,
 	r7 = be32_to_cpu(p[2]);
 	r8 = be32_to_cpu(p[3]);
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3),
 		  "+r" (r4), "+r" (r5), "+r" (r6), "+r" (r7), "+r" (r8)
 		: : EV_HCALL_CLOBBERS6
@@ -385,7 +385,7 @@ static inline unsigned int ev_byte_channel_receive(unsigned int handle,
 	r3 = handle;
 	r4 = *count;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3), "+r" (r4),
 		  "=r" (r5), "=r" (r6), "=r" (r7), "=r" (r8)
 		: : EV_HCALL_CLOBBERS6
@@ -423,7 +423,7 @@ static inline unsigned int ev_byte_channel_poll(unsigned int handle,
 	r11 = EV_HCALL_TOKEN(EV_BYTE_CHANNEL_POLL);
 	r3 = handle;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3), "=r" (r4), "=r" (r5)
 		: : EV_HCALL_CLOBBERS3
 	);
@@ -456,7 +456,7 @@ static inline unsigned int ev_int_iack(unsigned int handle,
 	r11 = EV_HCALL_TOKEN(EV_INT_IACK);
 	r3 = handle;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3), "=r" (r4)
 		: : EV_HCALL_CLOBBERS2
 	);
@@ -480,7 +480,7 @@ static inline unsigned int ev_doorbell_send(unsigned int handle)
 	r11 = EV_HCALL_TOKEN(EV_DOORBELL_SEND);
 	r3 = handle;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3)
 		: : EV_HCALL_CLOBBERS1
 	);
@@ -500,7 +500,7 @@ static inline unsigned int ev_idle(void)
 
 	r11 = EV_HCALL_TOKEN(EV_IDLE);
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "=r" (r3)
 		: : EV_HCALL_CLOBBERS1
 	);
diff --git a/arch/powerpc/include/asm/fsl_hcalls.h b/arch/powerpc/include/asm/fsl_hcalls.h
index 922d9b5..3abb583 100644
--- a/arch/powerpc/include/asm/fsl_hcalls.h
+++ b/arch/powerpc/include/asm/fsl_hcalls.h
@@ -96,7 +96,7 @@ static inline unsigned int fh_send_nmi(unsigned int vcpu_mask)
 	r11 = FH_HCALL_TOKEN(FH_SEND_NMI);
 	r3 = vcpu_mask;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3)
 		: : EV_HCALL_CLOBBERS1
 	);
@@ -151,7 +151,7 @@ static inline unsigned int fh_partition_get_dtprop(int handle,
 	r9 = (uint32_t)propvalue_addr;
 	r10 = *propvalue_len;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11),
 		  "+r" (r3), "+r" (r4), "+r" (r5), "+r" (r6), "+r" (r7),
 		  "+r" (r8), "+r" (r9), "+r" (r10)
@@ -205,7 +205,7 @@ static inline unsigned int fh_partition_set_dtprop(int handle,
 	r9 = (uint32_t)propvalue_addr;
 	r10 = propvalue_len;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11),
 		  "+r" (r3), "+r" (r4), "+r" (r5), "+r" (r6), "+r" (r7),
 		  "+r" (r8), "+r" (r9), "+r" (r10)
@@ -229,7 +229,7 @@ static inline unsigned int fh_partition_restart(unsigned int partition)
 	r11 = FH_HCALL_TOKEN(FH_PARTITION_RESTART);
 	r3 = partition;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3)
 		: : EV_HCALL_CLOBBERS1
 	);
@@ -262,7 +262,7 @@ static inline unsigned int fh_partition_get_status(unsigned int partition,
 	r11 = FH_HCALL_TOKEN(FH_PARTITION_GET_STATUS);
 	r3 = partition;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3), "=r" (r4)
 		: : EV_HCALL_CLOBBERS2
 	);
@@ -295,7 +295,7 @@ static inline unsigned int fh_partition_start(unsigned int partition,
 	r4 = entry_point;
 	r5 = load;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3), "+r" (r4), "+r" (r5)
 		: : EV_HCALL_CLOBBERS3
 	);
@@ -317,7 +317,7 @@ static inline unsigned int fh_partition_stop(unsigned int partition)
 	r11 = FH_HCALL_TOKEN(FH_PARTITION_STOP);
 	r3 = partition;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3)
 		: : EV_HCALL_CLOBBERS1
 	);
@@ -376,7 +376,7 @@ static inline unsigned int fh_partition_memcpy(unsigned int source,
 #endif
 	r7 = count;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11),
 		  "+r" (r3), "+r" (r4), "+r" (r5), "+r" (r6), "+r" (r7)
 		: : EV_HCALL_CLOBBERS5
@@ -399,7 +399,7 @@ static inline unsigned int fh_dma_enable(unsigned int liodn)
 	r11 = FH_HCALL_TOKEN(FH_DMA_ENABLE);
 	r3 = liodn;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3)
 		: : EV_HCALL_CLOBBERS1
 	);
@@ -421,7 +421,7 @@ static inline unsigned int fh_dma_disable(unsigned int liodn)
 	r11 = FH_HCALL_TOKEN(FH_DMA_DISABLE);
 	r3 = liodn;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3)
 		: : EV_HCALL_CLOBBERS1
 	);
@@ -447,7 +447,7 @@ static inline unsigned int fh_vmpic_get_msir(unsigned int interrupt,
 	r11 = FH_HCALL_TOKEN(FH_VMPIC_GET_MSIR);
 	r3 = interrupt;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3), "=r" (r4)
 		: : EV_HCALL_CLOBBERS2
 	);
@@ -469,7 +469,7 @@ static inline unsigned int fh_system_reset(void)
 
 	r11 = FH_HCALL_TOKEN(FH_SYSTEM_RESET);
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "=r" (r3)
 		: : EV_HCALL_CLOBBERS1
 	);
@@ -506,7 +506,7 @@ static inline unsigned int fh_err_get_info(int queue, uint32_t *bufsize,
 	r6 = addr_lo;
 	r7 = peek;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3), "+r" (r4), "+r" (r5), "+r" (r6),
 		  "+r" (r7)
 		: : EV_HCALL_CLOBBERS5
@@ -542,7 +542,7 @@ static inline unsigned int fh_get_core_state(unsigned int handle,
 	r3 = handle;
 	r4 = vcpu;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3), "+r" (r4)
 		: : EV_HCALL_CLOBBERS2
 	);
@@ -572,7 +572,7 @@ static inline unsigned int fh_enter_nap(unsigned int handle, unsigned int vcpu)
 	r3 = handle;
 	r4 = vcpu;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3), "+r" (r4)
 		: : EV_HCALL_CLOBBERS2
 	);
@@ -597,7 +597,7 @@ static inline unsigned int fh_exit_nap(unsigned int handle, unsigned int vcpu)
 	r3 = handle;
 	r4 = vcpu;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3), "+r" (r4)
 		: : EV_HCALL_CLOBBERS2
 	);
@@ -618,7 +618,7 @@ static inline unsigned int fh_claim_device(unsigned int handle)
 	r11 = FH_HCALL_TOKEN(FH_CLAIM_DEVICE);
 	r3 = handle;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3)
 		: : EV_HCALL_CLOBBERS1
 	);
@@ -645,7 +645,7 @@ static inline unsigned int fh_partition_stop_dma(unsigned int handle)
 	r11 = FH_HCALL_TOKEN(FH_PARTITION_STOP_DMA);
 	r3 = handle;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3)
 		: : EV_HCALL_CLOBBERS1
 	);
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 150+ messages in thread

 
 	r11 = FH_HCALL_TOKEN(FH_SYSTEM_RESET);
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "=r" (r3)
 		: : EV_HCALL_CLOBBERS1
 	);
@@ -506,7 +506,7 @@ static inline unsigned int fh_err_get_info(int queue, uint32_t *bufsize,
 	r6 = addr_lo;
 	r7 = peek;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3), "+r" (r4), "+r" (r5), "+r" (r6),
 		  "+r" (r7)
 		: : EV_HCALL_CLOBBERS5
@@ -542,7 +542,7 @@ static inline unsigned int fh_get_core_state(unsigned int handle,
 	r3 = handle;
 	r4 = vcpu;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3), "+r" (r4)
 		: : EV_HCALL_CLOBBERS2
 	);
@@ -572,7 +572,7 @@ static inline unsigned int fh_enter_nap(unsigned int handle, unsigned int vcpu)
 	r3 = handle;
 	r4 = vcpu;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3), "+r" (r4)
 		: : EV_HCALL_CLOBBERS2
 	);
@@ -597,7 +597,7 @@ static inline unsigned int fh_exit_nap(unsigned int handle, unsigned int vcpu)
 	r3 = handle;
 	r4 = vcpu;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3), "+r" (r4)
 		: : EV_HCALL_CLOBBERS2
 	);
@@ -618,7 +618,7 @@ static inline unsigned int fh_claim_device(unsigned int handle)
 	r11 = FH_HCALL_TOKEN(FH_CLAIM_DEVICE);
 	r3 = handle;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3)
 		: : EV_HCALL_CLOBBERS1
 	);
@@ -645,7 +645,7 @@ static inline unsigned int fh_partition_stop_dma(unsigned int handle)
 	r11 = FH_HCALL_TOKEN(FH_PARTITION_STOP_DMA);
 	r3 = handle;
 
-	__asm__ __volatile__ ("sc 1"
+	asm volatile("bl	epapr_hypercall_start"
 		: "+r" (r11), "+r" (r3)
 		: : EV_HCALL_CLOBBERS1
 	);
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 09/38] KVM: PPC: PR: Use generic tracepoint for guest exit
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

We want to have tracing information on guest exits for booke as well
as book3s. Since most information is identical, use a common trace point.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/book3s_pr.c |    2 +-
 arch/powerpc/kvm/booke.c     |    3 ++
 arch/powerpc/kvm/trace.h     |   79 +++++++++++++++++++++++++++---------------
 3 files changed, 55 insertions(+), 29 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 05c28f5..7f0fe6f 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -549,7 +549,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	/* We get here with MSR.EE=0, so enable it to be a nice citizen */
 	__hard_irq_enable();
 
-	trace_kvm_book3s_exit(exit_nr, vcpu);
+	trace_kvm_exit(exit_nr, vcpu);
 	preempt_enable();
 	kvm_resched(vcpu);
 	switch (exit_nr) {
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index d25a097..7ce2ed0 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -39,6 +39,7 @@
 
 #include "timing.h"
 #include "booke.h"
+#include "trace.h"
 
 unsigned long kvmppc_booke_handlers;
 
@@ -677,6 +678,8 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 
 	local_irq_enable();
 
+	trace_kvm_exit(exit_nr, vcpu);
+
 	run->exit_reason = KVM_EXIT_UNKNOWN;
 	run->ready_for_interrupt_injection = 1;
 
diff --git a/arch/powerpc/kvm/trace.h b/arch/powerpc/kvm/trace.h
index 877186b..9fab6ed 100644
--- a/arch/powerpc/kvm/trace.h
+++ b/arch/powerpc/kvm/trace.h
@@ -31,6 +31,57 @@ TRACE_EVENT(kvm_ppc_instr,
 		  __entry->inst, __entry->pc, __entry->emulate)
 );
 
+TRACE_EVENT(kvm_exit,
+	TP_PROTO(unsigned int exit_nr, struct kvm_vcpu *vcpu),
+	TP_ARGS(exit_nr, vcpu),
+
+	TP_STRUCT__entry(
+		__field(	unsigned int,	exit_nr		)
+		__field(	unsigned long,	pc		)
+		__field(	unsigned long,	msr		)
+		__field(	unsigned long,	dar		)
+#ifdef CONFIG_KVM_BOOK3S_PR
+		__field(	unsigned long,	srr1		)
+#endif
+		__field(	unsigned long,	last_inst	)
+	),
+
+	TP_fast_assign(
+#ifdef CONFIG_KVM_BOOK3S_PR
+		struct kvmppc_book3s_shadow_vcpu *svcpu;
+#endif
+		__entry->exit_nr	= exit_nr;
+		__entry->pc		= kvmppc_get_pc(vcpu);
+		__entry->dar		= kvmppc_get_fault_dar(vcpu);
+		__entry->msr		= vcpu->arch.shared->msr;
+#ifdef CONFIG_KVM_BOOK3S_PR
+		svcpu = svcpu_get(vcpu);
+		__entry->srr1		= svcpu->shadow_srr1;
+		svcpu_put(svcpu);
+#endif
+		__entry->last_inst	= vcpu->arch.last_inst;
+	),
+
+	TP_printk("exit=0x%x"
+		" | pc=0x%lx"
+		" | msr=0x%lx"
+		" | dar=0x%lx"
+#ifdef CONFIG_KVM_BOOK3S_PR
+		" | srr1=0x%lx"
+#endif
+		" | last_inst=0x%lx"
+		,
+		__entry->exit_nr,
+		__entry->pc,
+		__entry->msr,
+		__entry->dar,
+#ifdef CONFIG_KVM_BOOK3S_PR
+		__entry->srr1,
+#endif
+		__entry->last_inst
+		)
+);
+
 TRACE_EVENT(kvm_stlb_inval,
 	TP_PROTO(unsigned int stlb_index),
 	TP_ARGS(stlb_index),
@@ -105,34 +156,6 @@ TRACE_EVENT(kvm_gtlb_write,
 
 #ifdef CONFIG_KVM_BOOK3S_PR
 
-TRACE_EVENT(kvm_book3s_exit,
-	TP_PROTO(unsigned int exit_nr, struct kvm_vcpu *vcpu),
-	TP_ARGS(exit_nr, vcpu),
-
-	TP_STRUCT__entry(
-		__field(	unsigned int,	exit_nr		)
-		__field(	unsigned long,	pc		)
-		__field(	unsigned long,	msr		)
-		__field(	unsigned long,	dar		)
-		__field(	unsigned long,	srr1		)
-	),
-
-	TP_fast_assign(
-		struct kvmppc_book3s_shadow_vcpu *svcpu;
-		__entry->exit_nr	= exit_nr;
-		__entry->pc		= kvmppc_get_pc(vcpu);
-		__entry->dar		= kvmppc_get_fault_dar(vcpu);
-		__entry->msr		= vcpu->arch.shared->msr;
-		svcpu = svcpu_get(vcpu);
-		__entry->srr1		= svcpu->shadow_srr1;
-		svcpu_put(svcpu);
-	),
-
-	TP_printk("exit=0x%x | pc=0x%lx | msr=0x%lx | dar=0x%lx | srr1=0x%lx",
-		  __entry->exit_nr, __entry->pc, __entry->msr, __entry->dar,
-		  __entry->srr1)
-);
-
 TRACE_EVENT(kvm_book3s_reenter,
 	TP_PROTO(int r, struct kvm_vcpu *vcpu),
 	TP_ARGS(r, vcpu),
-- 
1.6.0.2



* [PATCH 10/38] KVM: PPC: Expose SYNC cap based on mmu notifiers
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

Semantically, the "SYNC" cap means that we have mmu notifiers available.
Express this in the #ifdef'ery around the feature, so that we can be sure
no ppc target is missed once it gains an mmu notifier implementation.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/powerpc.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 6645cc7..98b3408 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -264,10 +264,16 @@ int kvm_dev_ioctl_check_extension(long ext)
 		if (cpu_has_feature(CPU_FTR_ARCH_201))
 			r = 2;
 		break;
+#endif
 	case KVM_CAP_SYNC_MMU:
+#ifdef CONFIG_KVM_BOOK3S_64_HV
 		r = cpu_has_feature(CPU_FTR_ARCH_206) ? 1 : 0;
-		break;
+#elif defined(KVM_ARCH_WANT_MMU_NOTIFIER)
+		r = 1;
+#else
+		r = 0;
 #endif
+		break;
 	case KVM_CAP_NR_VCPUS:
 		/*
 		 * Recommending a number of CPUs is somewhat arbitrary; we
-- 
1.6.0.2




* [PATCH 11/38] KVM: PPC: BookE: Expose remote TLB flushes in debugfs
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

We already count remote TLB flushes in a variable, but don't yet export
the counter to user space. Export it, so we can see what's going on.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/booke.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 7ce2ed0..1d4ce9a 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -63,6 +63,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
 	{ "halt_wakeup", VCPU_STAT(halt_wakeup) },
 	{ "doorbell", VCPU_STAT(dbell_exits) },
 	{ "guest doorbell", VCPU_STAT(gdbell_exits) },
+	{ "remote_tlb_flush", VM_STAT(remote_tlb_flush) },
 	{ NULL }
 };
 
-- 
1.6.0.2




* [PATCH 12/38] KVM: PPC: E500: Fix clear_tlb_refs
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

Our mapping code assumes that TLB0 entries are always mapped. However, after
calling clear_tlb_refs() this is no longer the case.

Map them dynamically if we find an entry unmapped in TLB0.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/e500_tlb.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
index 09ce5ac..93f3b92 100644
--- a/arch/powerpc/kvm/e500_tlb.c
+++ b/arch/powerpc/kvm/e500_tlb.c
@@ -1036,8 +1036,12 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr,
 		sesel = 0; /* unused */
 		priv = &vcpu_e500->gtlb_priv[tlbsel][esel];
 
-		kvmppc_e500_setup_stlbe(vcpu, gtlbe, BOOK3E_PAGESZ_4K,
-					&priv->ref, eaddr, &stlbe);
+		/* Only triggers after clear_tlb_refs */
+		if (unlikely(!(priv->ref.flags & E500_TLB_VALID)))
+			kvmppc_e500_tlb0_map(vcpu_e500, esel, &stlbe);
+		else
+			kvmppc_e500_setup_stlbe(vcpu, gtlbe, BOOK3E_PAGESZ_4K,
+						&priv->ref, eaddr, &stlbe);
 		break;
 
 	case 1: {
-- 
1.6.0.2




* [PATCH 13/38] KVM: PPC: Book3S HV: Fix incorrect branch in H_CEDE code
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list, Paul Mackerras

From: Paul Mackerras <paulus@samba.org>

In handling the H_CEDE hypercall, if this vcpu has already been
prodded (with the H_PROD hypercall, which Linux guests don't in fact
use), we branch to a numeric label '1f'.  Unfortunately there is
another '1:' label before the one that we want to jump to.  This fixes
the problem by using a textual label, 'kvm_cede_prodded'.  It also
changes the label for another longish branch from '2:' to
'kvm_cede_exit' to avoid a possible future problem if code modifications
add another numeric '2:' label in between.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |   12 +++++++-----
 1 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 5a84c8d..44b72fe 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -1421,13 +1421,13 @@ _GLOBAL(kvmppc_h_cede)
 	sync			/* order setting ceded vs. testing prodded */
 	lbz	r5,VCPU_PRODDED(r3)
 	cmpwi	r5,0
-	bne	1f
+	bne	kvm_cede_prodded
 	li	r0,0		/* set trap to 0 to say hcall is handled */
 	stw	r0,VCPU_TRAP(r3)
 	li	r0,H_SUCCESS
 	std	r0,VCPU_GPR(R3)(r3)
 BEGIN_FTR_SECTION
-	b	2f		/* just send it up to host on 970 */
+	b	kvm_cede_exit	/* just send it up to host on 970 */
 END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_206)
 
 	/*
@@ -1446,7 +1446,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_206)
 	or	r4,r4,r0
 	PPC_POPCNTW(R7,R4)
 	cmpw	r7,r8
-	bge	2f
+	bge	kvm_cede_exit
 	stwcx.	r4,0,r6
 	bne	31b
 	li	r0,1
@@ -1555,7 +1555,8 @@ kvm_end_cede:
 	b	hcall_real_fallback
 
 	/* cede when already previously prodded case */
-1:	li	r0,0
+kvm_cede_prodded:
+	li	r0,0
 	stb	r0,VCPU_PRODDED(r3)
 	sync			/* order testing prodded vs. clearing ceded */
 	stb	r0,VCPU_CEDED(r3)
@@ -1563,7 +1564,8 @@ kvm_end_cede:
 	blr
 
 	/* we've ceded but we want to give control to the host */
-2:	li	r3,H_TOO_HARD
+kvm_cede_exit:
+	li	r3,H_TOO_HARD
 	blr
 
 secondary_too_late:
-- 
1.6.0.2




* [PATCH 14/38] KVM: PPC: Quieten message about allocating linear regions
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list, Paul Mackerras

From: Paul Mackerras <paulus@samba.org>

This is printed once for every RMA or HPT region that gets
preallocated.  If one preallocates hundreds of such regions
(in order to run hundreds of KVM guests), that gets rather
painful, so make it a bit quieter.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/book3s_hv_builtin.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index fb4eac2..ec0a9e5 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -157,8 +157,8 @@ static void __init kvm_linear_init_one(ulong size, int count, int type)
 	linear_info = alloc_bootmem(count * sizeof(struct kvmppc_linear_info));
 	for (i = 0; i < count; ++i) {
 		linear = alloc_bootmem_align(size, size);
-		pr_info("Allocated KVM %s at %p (%ld MB)\n", typestr, linear,
-			size >> 20);
+		pr_debug("Allocated KVM %s at %p (%ld MB)\n", typestr, linear,
+			 size >> 20);
 		linear_info[i].base_virt = linear;
 		linear_info[i].base_pfn = __pa(linear) >> PAGE_SHIFT;
 		linear_info[i].npages = npages;
-- 
1.6.0.2



* [PATCH 15/38] powerpc/epapr: export epapr_hypercall_start
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list, Scott Wood

From: Scott Wood <scottwood@freescale.com>

This fixes breakage introduced by the following commit:

  commit 6d2d82627f4f1e96a33664ace494fa363e0495cb
  Author: Liu Yu-B13201 <Yu.Liu@freescale.com>
  Date:   Tue Jul 3 05:48:56 2012 +0000

    PPC: Don't use hardcoded opcode for ePAPR hcall invocation

when a driver that uses ePAPR hypercalls is built as a module.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kernel/ppc_ksyms.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/ppc_ksyms.c b/arch/powerpc/kernel/ppc_ksyms.c
index 3e40315..e597dde 100644
--- a/arch/powerpc/kernel/ppc_ksyms.c
+++ b/arch/powerpc/kernel/ppc_ksyms.c
@@ -43,6 +43,7 @@
 #include <asm/dcr.h>
 #include <asm/ftrace.h>
 #include <asm/switch_to.h>
+#include <asm/epapr_hcalls.h>
 
 #ifdef CONFIG_PPC32
 extern void transfer_to_handler(void);
@@ -192,3 +193,7 @@ EXPORT_SYMBOL(__arch_hweight64);
 #ifdef CONFIG_PPC_BOOK3S_64
 EXPORT_SYMBOL_GPL(mmu_psize_defs);
 #endif
+
+#ifdef CONFIG_EPAPR_PARAVIRT
+EXPORT_SYMBOL(epapr_hypercall_start);
+#endif
-- 
1.6.0.2



* [PATCH 16/38] KVM: PPC: BookE: Add check_requests helper function
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

We need a central place to check for pending requests. Add one that
only does the timer check we already do in a different place.

Later, this central function can be extended by more checks.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/booke.c |   24 +++++++++++++++++-------
 1 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 1d4ce9a..bcf87fe 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -419,13 +419,6 @@ static void kvmppc_core_check_exceptions(struct kvm_vcpu *vcpu)
 	unsigned long *pending = &vcpu->arch.pending_exceptions;
 	unsigned int priority;
 
-	if (vcpu->requests) {
-		if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu)) {
-			smp_mb();
-			update_timer_ints(vcpu);
-		}
-	}
-
 	priority = __ffs(*pending);
 	while (priority < BOOKE_IRQPRIO_MAX) {
 		if (kvmppc_booke_irqprio_deliver(vcpu, priority))
@@ -461,6 +454,14 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
 	return r;
 }
 
+static void kvmppc_check_requests(struct kvm_vcpu *vcpu)
+{
+	if (vcpu->requests) {
+		if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu))
+			update_timer_ints(vcpu);
+	}
+}
+
 /*
  * Common checks before entering the guest world.  Call with interrupts
  * disabled.
@@ -485,6 +486,15 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
 			break;
 		}
 
+		smp_mb();
+		if (vcpu->requests) {
+			/* Make sure we process requests preemptibly */
+			local_irq_enable();
+			kvmppc_check_requests(vcpu);
+			local_irq_disable();
+			continue;
+		}
+
 		if (kvmppc_core_prepare_to_enter(vcpu)) {
 			/* interrupts got enabled in between, so we
 			   are back at square 1 */
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 17/38] KVM: PPC: BookE: Add support for vcpu->mode
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

Generic KVM code might want to know whether we are inside guest context
or outside. It also wants to be able to push us out of guest context.

Add support to the BookE code for the generic vcpu->mode field that describes
the above states.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/booke.c |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index bcf87fe..70a86c0 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -501,6 +501,15 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
 			continue;
 		}
 
+		if (vcpu->mode == EXITING_GUEST_MODE) {
+			r = 1;
+			break;
+		}
+
+		/* Going into guest context! Yay! */
+		vcpu->mode = IN_GUEST_MODE;
+		smp_wmb();
+
 		break;
 	}
 
@@ -572,6 +581,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 	kvm_guest_exit();
 
 out:
+	vcpu->mode = OUTSIDE_GUEST_MODE;
+	smp_wmb();
 	local_irq_enable();
 	return ret;
 }
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 18/38] KVM: PPC: E500: Implement MMU notifiers
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

The e500 target has lived without mmu notifiers ever since it got
introduced, but it fails the user space check for them when hugetlbfs
is in use.

So in order to get that case working, implement mmu notifiers in a
reasonably dumb fashion and be happy. On embedded hardware we almost
never end up with mmu notifier calls, since most people don't overcommit.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm_host.h |    3 +-
 arch/powerpc/include/asm/kvm_ppc.h  |    1 +
 arch/powerpc/kvm/Kconfig            |    2 +
 arch/powerpc/kvm/booke.c            |    6 +++
 arch/powerpc/kvm/e500_tlb.c         |   60 +++++++++++++++++++++++++++++++---
 5 files changed, 65 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index a29e091..ff8d51c 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -45,7 +45,8 @@
 #define KVM_COALESCED_MMIO_PAGE_OFFSET 1
 #endif
 
-#ifdef CONFIG_KVM_BOOK3S_64_HV
+#if defined(CONFIG_KVM_BOOK3S_64_HV) || defined(CONFIG_KVM_E500V2) || \
+    defined(CONFIG_KVM_E500MC)
 #include <linux/mmu_notifier.h>
 
 #define KVM_ARCH_WANT_MMU_NOTIFIER
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 0124937..c38e824 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -104,6 +104,7 @@ extern void kvmppc_core_queue_external(struct kvm_vcpu *vcpu,
                                        struct kvm_interrupt *irq);
 extern void kvmppc_core_dequeue_external(struct kvm_vcpu *vcpu,
                                          struct kvm_interrupt *irq);
+extern void kvmppc_core_flush_tlb(struct kvm_vcpu *vcpu);
 
 extern int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
                                   unsigned int op, int *advance);
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index f4dacb9..40cad8c 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -123,6 +123,7 @@ config KVM_E500V2
 	depends on EXPERIMENTAL && E500 && !PPC_E500MC
 	select KVM
 	select KVM_MMIO
+	select MMU_NOTIFIER
 	---help---
 	  Support running unmodified E500 guest kernels in virtual machines on
 	  E500v2 host processors.
@@ -138,6 +139,7 @@ config KVM_E500MC
 	select KVM
 	select KVM_MMIO
 	select KVM_BOOKE_HV
+	select MMU_NOTIFIER
 	---help---
 	  Support running unmodified E500MC/E5500 (32-bit) guest kernels in
 	  virtual machines on E500MC/E5500 host processors.
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 70a86c0..52f6cbb 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -459,6 +459,10 @@ static void kvmppc_check_requests(struct kvm_vcpu *vcpu)
 	if (vcpu->requests) {
 		if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu))
 			update_timer_ints(vcpu);
+#if defined(CONFIG_KVM_E500V2) || defined(CONFIG_KVM_E500MC)
+		if (kvm_check_request(KVM_REQ_TLB_FLUSH, vcpu))
+			kvmppc_core_flush_tlb(vcpu);
+#endif
 	}
 }
 
@@ -579,6 +583,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 #endif
 
 	kvm_guest_exit();
+	vcpu->mode = OUTSIDE_GUEST_MODE;
+	smp_wmb();
 
 out:
 	vcpu->mode = OUTSIDE_GUEST_MODE;
diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
index 93f3b92..06273a7 100644
--- a/arch/powerpc/kvm/e500_tlb.c
+++ b/arch/powerpc/kvm/e500_tlb.c
@@ -303,18 +303,15 @@ static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref,
 	ref->pfn = pfn;
 	ref->flags = E500_TLB_VALID;
 
-	if (tlbe_is_writable(gtlbe))
+	if (tlbe_is_writable(gtlbe)) {
 		ref->flags |= E500_TLB_DIRTY;
+		kvm_set_pfn_dirty(pfn);
+	}
 }
 
 static inline void kvmppc_e500_ref_release(struct tlbe_ref *ref)
 {
 	if (ref->flags & E500_TLB_VALID) {
-		if (ref->flags & E500_TLB_DIRTY)
-			kvm_release_pfn_dirty(ref->pfn);
-		else
-			kvm_release_pfn_clean(ref->pfn);
-
 		ref->flags = 0;
 	}
 }
@@ -357,6 +354,13 @@ static void clear_tlb_refs(struct kvmppc_vcpu_e500 *vcpu_e500)
 	clear_tlb_privs(vcpu_e500);
 }
 
+void kvmppc_core_flush_tlb(struct kvm_vcpu *vcpu)
+{
+	struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);
+	clear_tlb_refs(vcpu_e500);
+	clear_tlb1_bitmap(vcpu_e500);
+}
+
 static inline void kvmppc_e500_deliver_tlb_miss(struct kvm_vcpu *vcpu,
 		unsigned int eaddr, int as)
 {
@@ -538,6 +542,9 @@ static inline void kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 
 	kvmppc_e500_setup_stlbe(&vcpu_e500->vcpu, gtlbe, tsize,
 				ref, gvaddr, stlbe);
+
+	/* Drop refcount on page, so that mmu notifiers can clear it */
+	kvm_release_pfn_clean(pfn);
 }
 
 /* XXX only map the one-one case, for now use TLB0 */
@@ -1061,6 +1068,47 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr,
 	write_stlbe(vcpu_e500, gtlbe, &stlbe, stlbsel, sesel);
 }
 
+/************* MMU Notifiers *************/
+
+int kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
+{
+	/*
+	 * Flush all shadow tlb entries everywhere. This is slow, but
+	 * we are 100% sure that we catch the to-be-unmapped page
+	 */
+	kvm_flush_remote_tlbs(kvm);
+
+	return 0;
+}
+
+int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end)
+{
+	/* kvm_unmap_hva flushes everything anyway */
+	kvm_unmap_hva(kvm, start);
+
+	return 0;
+}
+
+int kvm_age_hva(struct kvm *kvm, unsigned long hva)
+{
+	/* XXX could be more clever ;) */
+	return 0;
+}
+
+int kvm_test_age_hva(struct kvm *kvm, unsigned long hva)
+{
+	/* XXX could be more clever ;) */
+	return 0;
+}
+
+void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)
+{
+	/* The page will get remapped properly on its next fault */
+	kvm_unmap_hva(kvm, hva);
+}
+
+/*****************************************/
+
 static void free_gtlb(struct kvmppc_vcpu_e500 *vcpu_e500)
 {
 	int i;
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 19/38] KVM: PPC: Add cache flush on page map
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

When we map a page whose icache was not cleared before, clear it when
first mapping the page in KVM, using the same PG_arch_1 information bit
as the Linux mapping logic. That way we are 100% sure that any page we
map does not have stale entries in the icache.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm_host.h   |    1 +
 arch/powerpc/include/asm/kvm_ppc.h    |   12 ++++++++++++
 arch/powerpc/kvm/book3s_32_mmu_host.c |    3 +++
 arch/powerpc/kvm/book3s_64_mmu_host.c |    2 ++
 arch/powerpc/kvm/e500_tlb.c           |    3 +++
 arch/powerpc/mm/mem.c                 |    1 +
 6 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index ff8d51c..cea9d3a 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -33,6 +33,7 @@
 #include <asm/kvm_asm.h>
 #include <asm/processor.h>
 #include <asm/page.h>
+#include <asm/cacheflush.h>
 
 #define KVM_MAX_VCPUS		NR_CPUS
 #define KVM_MAX_VCORES		NR_CPUS
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index c38e824..88de314 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -220,4 +220,16 @@ void kvmppc_claim_lpid(long lpid);
 void kvmppc_free_lpid(long lpid);
 void kvmppc_init_lpid(unsigned long nr_lpids);
 
+static inline void kvmppc_mmu_flush_icache(pfn_t pfn)
+{
+	/* Clear i-cache for new pages */
+	struct page *page;
+	page = pfn_to_page(pfn);
+	if (!test_bit(PG_arch_1, &page->flags)) {
+		flush_dcache_icache_page(page);
+		set_bit(PG_arch_1, &page->flags);
+	}
+}
+
+
 #endif /* __POWERPC_KVM_PPC_H__ */
diff --git a/arch/powerpc/kvm/book3s_32_mmu_host.c b/arch/powerpc/kvm/book3s_32_mmu_host.c
index f922c29..837f13e 100644
--- a/arch/powerpc/kvm/book3s_32_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_32_mmu_host.c
@@ -211,6 +211,9 @@ next_pteg:
 		pteg1 |= PP_RWRX;
 	}
 
+	if (orig_pte->may_execute)
+		kvmppc_mmu_flush_icache(hpaddr >> PAGE_SHIFT);
+
 	local_irq_disable();
 
 	if (pteg[rr]) {
diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c b/arch/powerpc/kvm/book3s_64_mmu_host.c
index 10fc8ec..0688b6b 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
@@ -126,6 +126,8 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte)
 
 	if (!orig_pte->may_execute)
 		rflags |= HPTE_R_N;
+	else
+		kvmppc_mmu_flush_icache(hpaddr >> PAGE_SHIFT);
 
 	hash = hpt_hash(va, PTE_SIZE, MMU_SEGSIZE_256M);
 
diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
index 06273a7..a6519ca 100644
--- a/arch/powerpc/kvm/e500_tlb.c
+++ b/arch/powerpc/kvm/e500_tlb.c
@@ -543,6 +543,9 @@ static inline void kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 	kvmppc_e500_setup_stlbe(&vcpu_e500->vcpu, gtlbe, tsize,
 				ref, gvaddr, stlbe);
 
+	/* Clear i-cache for new pages */
+	kvmppc_mmu_flush_icache(pfn);
+
 	/* Drop refcount on page, so that mmu notifiers can clear it */
 	kvm_release_pfn_clean(pfn);
 }
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index baaafde..fbdad0e 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -469,6 +469,7 @@ void flush_dcache_icache_page(struct page *page)
 	__flush_dcache_icache_phys(page_to_pfn(page) << PAGE_SHIFT);
 #endif
 }
+EXPORT_SYMBOL(flush_dcache_icache_page);
 
 void clear_user_page(void *page, unsigned long vaddr, struct page *pg)
 {
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 20/38] KVM: PPC: BookE: Add some more trace points
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

Without trace points, debugging what exactly is going on inside guest
code can be very tricky. Add a few more trace points at places that
hopefully tell us more when things go wrong.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/booke.c    |    3 ++
 arch/powerpc/kvm/e500_tlb.c |    3 ++
 arch/powerpc/kvm/trace.h    |   71 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 77 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 52f6cbb..00bcc57 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -143,6 +143,7 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr)
 static void kvmppc_booke_queue_irqprio(struct kvm_vcpu *vcpu,
                                        unsigned int priority)
 {
+	trace_kvm_booke_queue_irqprio(vcpu, priority);
 	set_bit(priority, &vcpu->arch.pending_exceptions);
 }
 
@@ -457,6 +458,8 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
 static void kvmppc_check_requests(struct kvm_vcpu *vcpu)
 {
 	if (vcpu->requests) {
+		trace_kvm_check_requests(vcpu);
+
 		if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu))
 			update_timer_ints(vcpu);
 #if defined(CONFIG_KVM_E500V2) || defined(CONFIG_KVM_E500MC)
diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
index a6519ca..6340b3c 100644
--- a/arch/powerpc/kvm/e500_tlb.c
+++ b/arch/powerpc/kvm/e500_tlb.c
@@ -312,6 +312,7 @@ static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref,
 static inline void kvmppc_e500_ref_release(struct tlbe_ref *ref)
 {
 	if (ref->flags & E500_TLB_VALID) {
+		trace_kvm_booke206_ref_release(ref->pfn, ref->flags);
 		ref->flags = 0;
 	}
 }
@@ -1075,6 +1076,8 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr,
 
 int kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
 {
+	trace_kvm_unmap_hva(hva);
+
 	/*
 	 * Flush all shadow tlb entries everywhere. This is slow, but
 	 * we are 100% sure that we catch the to be unmapped page
diff --git a/arch/powerpc/kvm/trace.h b/arch/powerpc/kvm/trace.h
index 9fab6ed..cb2780a 100644
--- a/arch/powerpc/kvm/trace.h
+++ b/arch/powerpc/kvm/trace.h
@@ -82,6 +82,21 @@ TRACE_EVENT(kvm_exit,
 		)
 );
 
+TRACE_EVENT(kvm_unmap_hva,
+	TP_PROTO(unsigned long hva),
+	TP_ARGS(hva),
+
+	TP_STRUCT__entry(
+		__field(	unsigned long,	hva		)
+	),
+
+	TP_fast_assign(
+		__entry->hva		= hva;
+	),
+
+	TP_printk("unmap hva 0x%lx\n", __entry->hva)
+);
+
 TRACE_EVENT(kvm_stlb_inval,
 	TP_PROTO(unsigned int stlb_index),
 	TP_ARGS(stlb_index),
@@ -149,6 +164,24 @@ TRACE_EVENT(kvm_gtlb_write,
 		__entry->word1, __entry->word2)
 );
 
+TRACE_EVENT(kvm_check_requests,
+	TP_PROTO(struct kvm_vcpu *vcpu),
+	TP_ARGS(vcpu),
+
+	TP_STRUCT__entry(
+		__field(	__u32,	cpu_nr		)
+		__field(	__u32,	requests	)
+	),
+
+	TP_fast_assign(
+		__entry->cpu_nr		= vcpu->vcpu_id;
+		__entry->requests	= vcpu->requests;
+	),
+
+	TP_printk("vcpu=%x requests=%x",
+		__entry->cpu_nr, __entry->requests)
+);
+
 
 /*************************************************************************
  *                         Book3S trace points                           *
@@ -418,6 +451,44 @@ TRACE_EVENT(kvm_booke206_gtlb_write,
 		__entry->mas2, __entry->mas7_3)
 );
 
+TRACE_EVENT(kvm_booke206_ref_release,
+	TP_PROTO(__u64 pfn, __u32 flags),
+	TP_ARGS(pfn, flags),
+
+	TP_STRUCT__entry(
+		__field(	__u64,	pfn		)
+		__field(	__u32,	flags		)
+	),
+
+	TP_fast_assign(
+		__entry->pfn		= pfn;
+		__entry->flags		= flags;
+	),
+
+	TP_printk("pfn=%llx flags=%x",
+		__entry->pfn, __entry->flags)
+);
+
+TRACE_EVENT(kvm_booke_queue_irqprio,
+	TP_PROTO(struct kvm_vcpu *vcpu, unsigned int priority),
+	TP_ARGS(vcpu, priority),
+
+	TP_STRUCT__entry(
+		__field(	__u32,	cpu_nr		)
+		__field(	__u32,	priority		)
+		__field(	unsigned long,	pending		)
+	),
+
+	TP_fast_assign(
+		__entry->cpu_nr		= vcpu->vcpu_id;
+		__entry->priority	= priority;
+		__entry->pending	= vcpu->arch.pending_exceptions;
+	),
+
+	TP_printk("vcpu=%x prio=%x pending=%lx",
+		__entry->cpu_nr, __entry->priority, __entry->pending)
+);
+
 #endif
 
 #endif /* _TRACE_KVM_H */
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 21/38] KVM: PPC: BookE: No duplicate request != 0 check
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

We only call kvmppc_check_requests() when vcpu->requests != 0, so drop
the redundant check in the function itself.
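
The reasoning here is that kvm_check_request() is a test-and-clear on a single
request bit, so every per-request check already guards itself; the outer
vcpu->requests != 0 test is purely a fast-path optimization that belongs at the
call site. A minimal user-space model of that behavior (hypothetical names, not
the kernel API):

```c
#include <assert.h>

/* Hypothetical user-space model of the vcpu request bitmap. check_request()
 * tests and clears one bit, mirroring kvm_check_request(): because each
 * check guards itself, the callee needs no extra "requests != 0" test. */
enum { REQ_PENDING_TIMER, REQ_TLB_FLUSH };

static unsigned long requests;

static void make_request(int req)
{
	requests |= 1UL << req;
}

static int check_request(int req)
{
	if (requests & (1UL << req)) {
		requests &= ~(1UL << req);	/* clear on successful check */
		return 1;
	}
	return 0;
}
```

The caller can still skip the whole helper when requests == 0, which is exactly
the gating that kvmppc_prepare_to_enter() keeps.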

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/booke.c |   12 +++++-------
 1 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 00bcc57..683cbd6 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -457,16 +457,14 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
 
 static void kvmppc_check_requests(struct kvm_vcpu *vcpu)
 {
-	if (vcpu->requests) {
-		trace_kvm_check_requests(vcpu);
+	trace_kvm_check_requests(vcpu);
 
-		if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu))
-			update_timer_ints(vcpu);
+	if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu))
+		update_timer_ints(vcpu);
 #if defined(CONFIG_KVM_E500V2) || defined(CONFIG_KVM_E500MC)
-		if (kvm_check_request(KVM_REQ_TLB_FLUSH, vcpu))
-			kvmppc_core_flush_tlb(vcpu);
+	if (kvm_check_request(KVM_REQ_TLB_FLUSH, vcpu))
+		kvmppc_core_flush_tlb(vcpu);
 #endif
-	}
 }
 
 /*
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 22/38] KVM: PPC: Use same kvmppc_prepare_to_enter code for booke and book3s_pr
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

We need to do the same things when preparing to enter a guest for booke and
book3s_pr cores. Fold the generic code into a generic function that both call.
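
The shared helper is essentially a retry loop: reschedule if needed, bail out
on a pending signal, re-process requests with interrupts enabled, and only then
commit to guest mode. A simplified user-space sketch of that control flow
(illustrative names; the predicates stand in for need_resched(),
signal_pending() and vcpu->requests, with irq toggling and barriers elided):

```c
#include <assert.h>

/* Illustrative model of the kvmppc_prepare_to_enter() loop. Each handled
 * condition restarts the loop, since handling it may have raised another. */
struct vcpu_model {
	int need_resched;	/* one-shot: cleared when "rescheduled" */
	int signal_pending;
	int requests;		/* one-shot: cleared when processed */
	int mode;		/* 0 = outside guest, 1 = in guest */
};

static int prepare_to_enter(struct vcpu_model *v)
{
	while (1) {
		if (v->need_resched) {
			v->need_resched = 0;	/* cond_resched() */
			continue;
		}
		if (v->signal_pending)
			return 1;		/* caller exits with -EINTR */
		if (v->requests) {
			v->requests = 0;	/* core_check_requests() */
			continue;		/* state may have changed */
		}
		v->mode = 1;			/* IN_GUEST_MODE */
		return 0;
	}
}
```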

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm_ppc.h |    3 ++
 arch/powerpc/kvm/book3s_pr.c       |   22 ++++----------
 arch/powerpc/kvm/booke.c           |   58 +-----------------------------------
 arch/powerpc/kvm/powerpc.c         |   57 +++++++++++++++++++++++++++++++++++
 4 files changed, 67 insertions(+), 73 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 88de314..59b7c87 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -112,6 +112,7 @@ extern int kvmppc_core_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn,
 				     ulong val);
 extern int kvmppc_core_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn,
 				     ulong *val);
+extern void kvmppc_core_check_requests(struct kvm_vcpu *vcpu);
 
 extern int kvmppc_booke_init(void);
 extern void kvmppc_booke_exit(void);
@@ -150,6 +151,8 @@ extern int kvm_vm_ioctl_get_smmu_info(struct kvm *kvm,
 extern int kvmppc_bookehv_init(void);
 extern void kvmppc_bookehv_exit(void);
 
+extern int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu);
+
 /*
  * Cuts out inst bits with ordering according to spec.
  * That means the leftmost bit is zero. All given bits are included.
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 7f0fe6f..cae2def 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -88,6 +88,10 @@ void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu)
 	kvmppc_giveup_ext(vcpu, MSR_VSX);
 }
 
+void kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
+{
+}
+
 static void kvmppc_recalc_shadow_msr(struct kvm_vcpu *vcpu)
 {
 	ulong smsr = vcpu->arch.shared->msr;
@@ -815,19 +819,9 @@ program_interrupt:
 		 * again due to a host external interrupt.
 		 */
 		__hard_irq_disable();
-		if (signal_pending(current)) {
-			__hard_irq_enable();
-#ifdef EXIT_DEBUG
-			printk(KERN_EMERG "KVM: Going back to host\n");
-#endif
-			vcpu->stat.signal_exits++;
+		if (kvmppc_prepare_to_enter(vcpu)) {
 			run->exit_reason = KVM_EXIT_INTR;
 			r = -EINTR;
-		} else {
-			/* In case an interrupt came in that was triggered
-			 * from userspace (like DEC), we need to check what
-			 * to inject now! */
-			kvmppc_core_prepare_to_enter(vcpu);
 		}
 	}
 
@@ -1029,8 +1023,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 		goto out;
 	}
 
-	kvmppc_core_prepare_to_enter(vcpu);
-
 	/*
 	 * Interrupts could be timers for the guest which we have to inject
 	 * again, so let's postpone them until we're in the guest and if we
@@ -1038,9 +1030,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 	 * a host external interrupt.
 	 */
 	__hard_irq_disable();
-
-	/* No need to go into the guest when all we do is going out */
-	if (signal_pending(current)) {
+	if (kvmppc_prepare_to_enter(vcpu)) {
 		__hard_irq_enable();
 		kvm_run->exit_reason = KVM_EXIT_INTR;
 		ret = -EINTR;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 683cbd6..4652e0b 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -455,10 +455,8 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
 	return r;
 }
 
-static void kvmppc_check_requests(struct kvm_vcpu *vcpu)
+void kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
 {
-	trace_kvm_check_requests(vcpu);
-
 	if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu))
 		update_timer_ints(vcpu);
 #if defined(CONFIG_KVM_E500V2) || defined(CONFIG_KVM_E500MC)
@@ -467,60 +465,6 @@ static void kvmppc_check_requests(struct kvm_vcpu *vcpu)
 #endif
 }
 
-/*
- * Common checks before entering the guest world.  Call with interrupts
- * disabled.
- *
- * returns !0 if a signal is pending and check_signal is true
- */
-static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
-{
-	int r = 0;
-
-	WARN_ON_ONCE(!irqs_disabled());
-	while (true) {
-		if (need_resched()) {
-			local_irq_enable();
-			cond_resched();
-			local_irq_disable();
-			continue;
-		}
-
-		if (signal_pending(current)) {
-			r = 1;
-			break;
-		}
-
-		smp_mb();
-		if (vcpu->requests) {
-			/* Make sure we process requests preemptable */
-			local_irq_enable();
-			kvmppc_check_requests(vcpu);
-			local_irq_disable();
-			continue;
-		}
-
-		if (kvmppc_core_prepare_to_enter(vcpu)) {
-			/* interrupts got enabled in between, so we
-			   are back at square 1 */
-			continue;
-		}
-
-		if (vcpu->mode == EXITING_GUEST_MODE) {
-			r = 1;
-			break;
-		}
-
-		/* Going into guest context! Yay! */
-		vcpu->mode = IN_GUEST_MODE;
-		smp_wmb();
-
-		break;
-	}
-
-	return r;
-}
-
 int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 {
 	int ret;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 98b3408..053bfef 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -47,6 +47,63 @@ int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+#ifndef CONFIG_KVM_BOOK3S_64_HV
+/*
+ * Common checks before entering the guest world.  Call with interrupts
+ * disabled.
+ *
+ * returns !0 if a signal is pending
+ */
+int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
+{
+	int r = 0;
+
+	WARN_ON_ONCE(!irqs_disabled());
+	while (true) {
+		if (need_resched()) {
+			local_irq_enable();
+			cond_resched();
+			local_irq_disable();
+			continue;
+		}
+
+		if (signal_pending(current)) {
+			r = 1;
+			break;
+		}
+
+		smp_mb();
+		if (vcpu->requests) {
+		/* Make sure we process requests preemptibly */
+			local_irq_enable();
+			trace_kvm_check_requests(vcpu);
+			kvmppc_core_check_requests(vcpu);
+			local_irq_disable();
+			continue;
+		}
+
+		if (kvmppc_core_prepare_to_enter(vcpu)) {
+			/* interrupts got enabled in between, so we
+			   are back at square 1 */
+			continue;
+		}
+
+		if (vcpu->mode == EXITING_GUEST_MODE) {
+			r = 1;
+			break;
+		}
+
+		/* Going into guest context! Yay! */
+		vcpu->mode = IN_GUEST_MODE;
+		smp_wmb();
+
+		break;
+	}
+
+	return r;
+}
+#endif /* CONFIG_KVM_BOOK3S_64_HV */
+
 int kvmppc_kvm_pv(struct kvm_vcpu *vcpu)
 {
 	int nr = kvmppc_get_gpr(vcpu, 11);
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 23/38] KVM: PPC: Book3s: PR: Add (dumb) MMU Notifier support
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

Now that we have very simple MMU Notifier support for e500 in place,
also add the same simple support to book3s. It gets us one step closer
to actual fast support.
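
"Dumb" here means that every notifier invalidation, however narrow, drops the
entire shadow cache instead of hunting down the one affected mapping. A toy
model of that trade-off (hypothetical names, not the kernel interface):

```c
#include <assert.h>

/* Toy shadow-mapping cache: unmap_hva() ignores the address and
 * invalidates everything, matching the "flush all, slow but safe"
 * strategy of this patch. */
#define NCACHE 8

static int cache_valid[NCACHE];

static void map_entry(int slot)
{
	cache_valid[slot] = 1;
}

static void unmap_hva(unsigned long hva)
{
	(void)hva;		/* address unused: flush it all */
	for (int i = 0; i < NCACHE; i++)
		cache_valid[i] = 0;
}
```

Correctness is trivially guaranteed, because the to-be-unmapped page cannot
survive a full flush; the cost is refaulting all the unrelated entries, which
is what the later "actual fast support" would avoid.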

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm_host.h   |    3 +-
 arch/powerpc/kvm/Kconfig              |    1 +
 arch/powerpc/kvm/book3s_32_mmu_host.c |    1 +
 arch/powerpc/kvm/book3s_64_mmu_host.c |    1 +
 arch/powerpc/kvm/book3s_mmu_hpte.c    |    5 ---
 arch/powerpc/kvm/book3s_pr.c          |   47 +++++++++++++++++++++++++++++++++
 6 files changed, 51 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index cea9d3a..4a5ec8f 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -46,8 +46,7 @@
 #define KVM_COALESCED_MMIO_PAGE_OFFSET 1
 #endif
 
-#if defined(CONFIG_KVM_BOOK3S_64_HV) || defined(CONFIG_KVM_E500V2) || \
-    defined(CONFIG_KVM_E500MC)
+#if !defined(CONFIG_KVM_440)
 #include <linux/mmu_notifier.h>
 
 #define KVM_ARCH_WANT_MMU_NOTIFIER
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 40cad8c..71f0cd9 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -36,6 +36,7 @@ config KVM_BOOK3S_64_HANDLER
 config KVM_BOOK3S_PR
 	bool
 	select KVM_MMIO
+	select MMU_NOTIFIER
 
 config KVM_BOOK3S_32
 	tristate "KVM support for PowerPC book3s_32 processors"
diff --git a/arch/powerpc/kvm/book3s_32_mmu_host.c b/arch/powerpc/kvm/book3s_32_mmu_host.c
index 837f13e..9fac010 100644
--- a/arch/powerpc/kvm/book3s_32_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_32_mmu_host.c
@@ -254,6 +254,7 @@ next_pteg:
 
 	kvmppc_mmu_hpte_cache_map(vcpu, pte);
 
+	kvm_release_pfn_clean(hpaddr >> PAGE_SHIFT);
 out:
 	return r;
 }
diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c b/arch/powerpc/kvm/book3s_64_mmu_host.c
index 0688b6b..6b2c80e 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
@@ -168,6 +168,7 @@ map_again:
 
 		kvmppc_mmu_hpte_cache_map(vcpu, pte);
 	}
+	kvm_release_pfn_clean(hpaddr >> PAGE_SHIFT);
 
 out:
 	return r;
diff --git a/arch/powerpc/kvm/book3s_mmu_hpte.c b/arch/powerpc/kvm/book3s_mmu_hpte.c
index 41cb001..2c86b0d 100644
--- a/arch/powerpc/kvm/book3s_mmu_hpte.c
+++ b/arch/powerpc/kvm/book3s_mmu_hpte.c
@@ -114,11 +114,6 @@ static void invalidate_pte(struct kvm_vcpu *vcpu, struct hpte_cache *pte)
 	hlist_del_init_rcu(&pte->list_vpte);
 	hlist_del_init_rcu(&pte->list_vpte_long);
 
-	if (pte->pte.may_write)
-		kvm_release_pfn_dirty(pte->pfn);
-	else
-		kvm_release_pfn_clean(pte->pfn);
-
 	spin_unlock(&vcpu3s->mmu_lock);
 
 	vcpu3s->hpte_cache_count--;
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index cae2def..10f8217 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -90,8 +90,55 @@ void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu)
 
 void kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
 {
+	/* We misuse TLB_FLUSH to indicate that we want to clear
+	   all shadow cache entries */
+	if (kvm_check_request(KVM_REQ_TLB_FLUSH, vcpu))
+		kvmppc_mmu_pte_flush(vcpu, 0, 0);
 }
 
+/************* MMU Notifiers *************/
+
+int kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
+{
+	trace_kvm_unmap_hva(hva);
+
+	/*
+	 * Flush all shadow tlb entries everywhere. This is slow, but
+	 * we are 100% sure that we catch the page to be unmapped
+	 */
+	kvm_flush_remote_tlbs(kvm);
+
+	return 0;
+}
+
+int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end)
+{
+	/* kvm_unmap_hva flushes everything anyways */
+	kvm_unmap_hva(kvm, start);
+
+	return 0;
+}
+
+int kvm_age_hva(struct kvm *kvm, unsigned long hva)
+{
+	/* XXX could be more clever ;) */
+	return 0;
+}
+
+int kvm_test_age_hva(struct kvm *kvm, unsigned long hva)
+{
+	/* XXX could be more clever ;) */
+	return 0;
+}
+
+void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)
+{
+	/* The page will get remapped properly on its next fault */
+	kvm_unmap_hva(kvm, hva);
+}
+
+/*****************************************/
+
 static void kvmppc_recalc_shadow_msr(struct kvm_vcpu *vcpu)
 {
 	ulong smsr = vcpu->arch.shared->msr;
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 24/38] KVM: PPC: BookE: Drop redundant vcpu->mode set
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

We only need to set vcpu->mode to OUTSIDE_GUEST_MODE once.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/booke.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 4652e0b..492c343 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -528,8 +528,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 #endif
 
 	kvm_guest_exit();
-	vcpu->mode = OUTSIDE_GUEST_MODE;
-	smp_wmb();
 
 out:
 	vcpu->mode = OUTSIDE_GUEST_MODE;
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 25/38] KVM: PPC: Book3S: PR: Only do resched check once per exit
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

Now that we use the generic exit helper, we can safely drop the kvm_resched()
call that we used to trigger at the beginning of the exit handler function.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/book3s_pr.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 10f8217..2c268a1 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -602,7 +602,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 
 	trace_kvm_exit(exit_nr, vcpu);
 	preempt_enable();
-	kvm_resched(vcpu);
+
 	switch (exit_nr) {
 	case BOOK3S_INTERRUPT_INST_STORAGE:
 	{
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 26/38] KVM: PPC: Exit guest context while handling exit
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

The x86 implementation of KVM accounts for host time while processing
guest exits. Do the same for us.
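
The accounting bracket this patch introduces can be illustrated with a minimal user-space sketch; the `_stub` names and the flag are hypothetical stand-ins for the real per-task time accounting, not the kernel API:

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-task time accounting state; the real
 * kernel attributes CPU time to guest or host context, it does not
 * use a simple flag. */
static bool charging_guest_time = true;

static void kvm_guest_exit_stub(void)  { charging_guest_time = false; }
static void kvm_guest_enter_stub(void) { charging_guest_time = true; }

/* Shape of the patched exit path: everything between the two calls is
 * accounted as host time, matching what x86 does. */
static int handle_exit_stub(int exit_nr)
{
	kvm_guest_exit_stub();           /* stop charging the guest */
	int r = (exit_nr == 0) ? 1 : 0;  /* ...handle the exit... */
	kvm_guest_enter_stub();          /* resume charging on re-entry */
	return r;
}
```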

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/book3s_pr.c |    2 ++
 arch/powerpc/kvm/booke.c     |    3 +++
 2 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 2c268a1..b4ae11e 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -601,6 +601,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	__hard_irq_enable();
 
 	trace_kvm_exit(exit_nr, vcpu);
+	kvm_guest_exit();
 	preempt_enable();
 
 	switch (exit_nr) {
@@ -872,6 +873,7 @@ program_interrupt:
 		}
 	}
 
+	kvm_guest_enter();
 	trace_kvm_book3s_reenter(r, vcpu);
 
 	return r;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 492c343..887c7cc 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -650,6 +650,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	local_irq_enable();
 
 	trace_kvm_exit(exit_nr, vcpu);
+	kvm_guest_exit();
 
 	run->exit_reason = KVM_EXIT_UNKNOWN;
 	run->ready_for_interrupt_injection = 1;
@@ -952,6 +953,8 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		}
 	}
 
+	kvm_guest_enter();
+
 	return r;
 }
 
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 27/38] KVM: PPC: Book3S: PR: Indicate we're out of guest mode
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

When going out of guest mode, indicate that we are in vcpu->mode. That way
requests from other CPUs don't needlessly need to kick us to process them,
because it'll just happen next time we enter the guest.
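
The request-side logic this enables can be sketched as follows; the enum values mirror the kernel's vcpu modes, but the struct and function are hypothetical illustrations, not the actual request path:

```c
enum vcpu_mode { OUTSIDE_GUEST_MODE, IN_GUEST_MODE, EXITING_GUEST_MODE };

struct vcpu_stub {
	enum vcpu_mode mode;
	int kicks_sent;    /* counts the IPIs we would have sent */
};

/* Hypothetical request path: a remote CPU only needs to kick (IPI)
 * the vcpu when it is actually running guest code; a vcpu marked
 * OUTSIDE_GUEST_MODE will notice the request on its next guest entry
 * anyway. */
static void make_request_stub(struct vcpu_stub *v)
{
	if (v->mode == IN_GUEST_MODE)
		v->kicks_sent++;    /* real code would send an IPI here */
}
```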

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/book3s_pr.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index b4ae11e..9430a36 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -1152,6 +1152,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 #endif
 
 out:
+	vcpu->mode = OUTSIDE_GUEST_MODE;
 	preempt_enable();
 	return ret;
 }
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 28/38] KVM: PPC: Consistentify vcpu exit path
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

When getting out of __vcpu_run, let's be consistent about the state we
return in. We want to always

  * have IRQs enabled
  * have called kvm_guest_exit before

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/book3s_pr.c |    8 ++++++--
 arch/powerpc/kvm/booke.c     |   13 ++++++++-----
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 9430a36..3dec346 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -868,12 +868,15 @@ program_interrupt:
 		 */
 		__hard_irq_disable();
 		if (kvmppc_prepare_to_enter(vcpu)) {
+			/* local_irq_enable(); */
 			run->exit_reason = KVM_EXIT_INTR;
 			r = -EINTR;
+		} else {
+			/* Going back to guest */
+			kvm_guest_enter();
 		}
 	}
 
-	kvm_guest_enter();
 	trace_kvm_book3s_reenter(r, vcpu);
 
 	return r;
@@ -1123,7 +1126,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 
 	ret = __kvmppc_vcpu_run(kvm_run, vcpu);
 
-	kvm_guest_exit();
+	/* No need for kvm_guest_exit. It's done in handle_exit.
+	   We also get here with interrupts enabled. */
 
 	current->thread.regs->msr = ext_msr;
 
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 887c7cc..aae535f 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -481,6 +481,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 
 	local_irq_disable();
 	if (kvmppc_prepare_to_enter(vcpu)) {
+		local_irq_enable();
 		kvm_run->exit_reason = KVM_EXIT_INTR;
 		ret = -EINTR;
 		goto out;
@@ -512,6 +513,9 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 
 	ret = __kvmppc_vcpu_run(kvm_run, vcpu);
 
+	/* No need for kvm_guest_exit. It's done in handle_exit.
+	   We also get here with interrupts enabled. */
+
 #ifdef CONFIG_PPC_FPU
 	kvmppc_save_guest_fp(vcpu);
 
@@ -527,12 +531,9 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 	current->thread.fpexc_mode = fpexc_mode;
 #endif
 
-	kvm_guest_exit();
-
 out:
 	vcpu->mode = OUTSIDE_GUEST_MODE;
 	smp_wmb();
-	local_irq_enable();
 	return ret;
 }
 
@@ -947,14 +948,16 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	if (!(r & RESUME_HOST)) {
 		local_irq_disable();
 		if (kvmppc_prepare_to_enter(vcpu)) {
+			local_irq_enable();
 			run->exit_reason = KVM_EXIT_INTR;
 			r = (-EINTR << 2) | RESUME_HOST | (r & RESUME_FLAG_NV);
 			kvmppc_account_exit(vcpu, SIGNAL_EXITS);
+		} else {
+			/* Going back to guest */
+			kvm_guest_enter();
 		}
 	}
 
-	kvm_guest_enter();
-
 	return r;
 }
 
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 29/38] KVM: PPC: Book3S: PR: Rework irq disabling
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

Today, we disable preemption while inside guest context, because we need
to expose to the world that we are not in a preemptible context. However,
during that time we already have interrupts disabled, which would indicate
that we are in a non-preemptible context.

The reason the checks for irqs_disabled() fail for us though is that we
manually control hard IRQs and ignore all the lazy EE framework. Let's
stop doing that. Instead, let's always use lazy EE to indicate when we
want to disable IRQs, but do a special final switch that gets us into
EE disabled, but soft enabled state. That way when we get back out of
guest state, we are immediately ready to process interrupts.

This simplifies the code drastically and reduces the time that we appear
as preempt disabled.
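
The lazy-EE state transition described above can be modeled with a toy sketch; the two flags and all `_stub` names are illustrative stand-ins for the paca fields and helpers, not the kernel's code:

```c
#include <stdbool.h>

/* Toy model of the ppc64 lazy-EE state: soft_enabled is what the
 * kernel believes about interrupts, hard_ee is the real MSR.EE bit. */
static bool soft_enabled = true;
static bool hard_ee = true;

static void local_irq_disable_stub(void) { soft_enabled = false; }

static void hard_irq_disable_stub(void)
{
	soft_enabled = false;
	hard_ee = false;
}

/* Mirrors kvmppc_lazy_ee_enable(): declare IRQs soft-enabled while
 * leaving MSR.EE off, so the low-level guest entry can flip EE itself
 * and an exit comes back immediately ready to process interrupts. */
static void lazy_ee_enable_stub(void)
{
	soft_enabled = true;    /* hard_ee stays false until guest entry */
}
```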

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm_ppc.h   |   10 ++++++++++
 arch/powerpc/kvm/book3s_pr.c         |   21 +++++++--------------
 arch/powerpc/kvm/book3s_rmhandlers.S |   15 ++++++++-------
 arch/powerpc/kvm/booke.c             |    2 ++
 arch/powerpc/kvm/powerpc.c           |   14 ++++++++++++++
 5 files changed, 41 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 59b7c87..5459364 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -234,5 +234,15 @@ static inline void kvmppc_mmu_flush_icache(pfn_t pfn)
 	}
 }
 
+/* Please call after prepare_to_enter. This function puts the lazy ee state
+   back to normal mode, without actually enabling interrupts. */
+static inline void kvmppc_lazy_ee_enable(void)
+{
+#ifdef CONFIG_PPC64
+	/* Only need to enable IRQs by hard enabling them after this */
+	local_paca->irq_happened = 0;
+	local_paca->soft_enabled = 1;
+#endif
+}
 
 #endif /* __POWERPC_KVM_PPC_H__ */
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 3dec346..e737db8 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -52,8 +52,6 @@ static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr,
 #define MSR_USER32 MSR_USER
 #define MSR_USER64 MSR_USER
 #define HW_PAGE_SIZE PAGE_SIZE
-#define __hard_irq_disable local_irq_disable
-#define __hard_irq_enable local_irq_enable
 #endif
 
 void kvmppc_core_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
@@ -597,12 +595,10 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	run->exit_reason = KVM_EXIT_UNKNOWN;
 	run->ready_for_interrupt_injection = 1;
 
-	/* We get here with MSR.EE=0, so enable it to be a nice citizen */
-	__hard_irq_enable();
+	/* We get here with MSR.EE=1 */
 
 	trace_kvm_exit(exit_nr, vcpu);
 	kvm_guest_exit();
-	preempt_enable();
 
 	switch (exit_nr) {
 	case BOOK3S_INTERRUPT_INST_STORAGE:
@@ -854,7 +850,6 @@ program_interrupt:
 	}
 	}
 
-	preempt_disable();
 	if (!(r & RESUME_HOST)) {
 		/* To avoid clobbering exit_reason, only check for signals if
 		 * we aren't already exiting to userspace for some other
@@ -866,14 +861,15 @@ program_interrupt:
 		 * and if we really did time things so badly, then we just exit
 		 * again due to a host external interrupt.
 		 */
-		__hard_irq_disable();
+		local_irq_disable();
 		if (kvmppc_prepare_to_enter(vcpu)) {
-			/* local_irq_enable(); */
+			local_irq_enable();
 			run->exit_reason = KVM_EXIT_INTR;
 			r = -EINTR;
 		} else {
 			/* Going back to guest */
 			kvm_guest_enter();
+			kvmppc_lazy_ee_enable();
 		}
 	}
 
@@ -1066,8 +1062,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 #endif
 	ulong ext_msr;
 
-	preempt_disable();
-
 	/* Check if we can run the vcpu at all */
 	if (!vcpu->arch.sane) {
 		kvm_run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
@@ -1081,9 +1075,9 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 	 * really did time things so badly, then we just exit again due to
 	 * a host external interrupt.
 	 */
-	__hard_irq_disable();
+	local_irq_disable();
 	if (kvmppc_prepare_to_enter(vcpu)) {
-		__hard_irq_enable();
+		local_irq_enable();
 		kvm_run->exit_reason = KVM_EXIT_INTR;
 		ret = -EINTR;
 		goto out;
@@ -1122,7 +1116,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 	if (vcpu->arch.shared->msr & MSR_FP)
 		kvmppc_handle_ext(vcpu, BOOK3S_INTERRUPT_FP_UNAVAIL, MSR_FP);
 
-	kvm_guest_enter();
+	kvmppc_lazy_ee_enable();
 
 	ret = __kvmppc_vcpu_run(kvm_run, vcpu);
 
@@ -1157,7 +1151,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 
 out:
 	vcpu->mode = OUTSIDE_GUEST_MODE;
-	preempt_enable();
 	return ret;
 }
 
diff --git a/arch/powerpc/kvm/book3s_rmhandlers.S b/arch/powerpc/kvm/book3s_rmhandlers.S
index 9ecf6e3..b2f8258 100644
--- a/arch/powerpc/kvm/book3s_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_rmhandlers.S
@@ -170,20 +170,21 @@ kvmppc_handler_skip_ins:
  * Call kvmppc_handler_trampoline_enter in real mode
  *
  * On entry, r4 contains the guest shadow MSR
+ * MSR.EE has to be 0 when calling this function
  */
 _GLOBAL(kvmppc_entry_trampoline)
 	mfmsr	r5
 	LOAD_REG_ADDR(r7, kvmppc_handler_trampoline_enter)
 	toreal(r7)
 
-	li	r9, MSR_RI
-	ori	r9, r9, MSR_EE
-	andc	r9, r5, r9	/* Clear EE and RI in MSR value */
 	li	r6, MSR_IR | MSR_DR
-	ori	r6, r6, MSR_EE
-	andc	r6, r5, r6	/* Clear EE, DR and IR in MSR value */
-	MTMSR_EERI(r9)		/* Clear EE and RI in MSR */
-	mtsrr0	r7		/* before we set srr0/1 */
+	andc	r6, r5, r6	/* Clear DR and IR in MSR value */
+	/*
+	 * Set EE in HOST_MSR so that it's enabled when we get into our
+	 * C exit handler function
+	 */
+	ori	r5, r5, MSR_EE
+	mtsrr0	r7
 	mtsrr1	r6
 	RFI
 
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index aae535f..2bd190c 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -486,6 +486,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 		ret = -EINTR;
 		goto out;
 	}
+	kvmppc_lazy_ee_enable();
 
 	kvm_guest_enter();
 
@@ -955,6 +956,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		} else {
 			/* Going back to guest */
 			kvm_guest_enter();
+			kvmppc_lazy_ee_enable();
 		}
 	}
 
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 053bfef..545c183 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -30,6 +30,7 @@
 #include <asm/kvm_ppc.h>
 #include <asm/tlbflush.h>
 #include <asm/cputhreads.h>
+#include <asm/irqflags.h>
 #include "timing.h"
 #include "../mm/mmu_decl.h"
 
@@ -93,6 +94,19 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
 			break;
 		}
 
+#ifdef CONFIG_PPC64
+		/* lazy EE magic */
+		hard_irq_disable();
+		if (lazy_irq_pending()) {
+			/* Got an interrupt in between, try again */
+			local_irq_enable();
+			local_irq_disable();
+			continue;
+		}
+
+		trace_hardirqs_on();
+#endif
+
 		/* Going into guest context! Yay! */
 		vcpu->mode = IN_GUEST_MODE;
 		smp_wmb();
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 30/38] KVM: PPC: Move kvm_guest_enter call into generic code
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

We need to call kvm_guest_enter in booke and book3s, so move its
call to generic code.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/book3s_pr.c |    2 --
 arch/powerpc/kvm/booke.c     |    2 --
 arch/powerpc/kvm/powerpc.c   |    3 +++
 3 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index e737db8..1ff0d6c 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -867,8 +867,6 @@ program_interrupt:
 			run->exit_reason = KVM_EXIT_INTR;
 			r = -EINTR;
 		} else {
-			/* Going back to guest */
-			kvm_guest_enter();
 			kvmppc_lazy_ee_enable();
 		}
 	}
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 2bd190c..5e8dc19 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -954,8 +954,6 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 			r = (-EINTR << 2) | RESUME_HOST | (r & RESUME_FLAG_NV);
 			kvmppc_account_exit(vcpu, SIGNAL_EXITS);
 		} else {
-			/* Going back to guest */
-			kvm_guest_enter();
 			kvmppc_lazy_ee_enable();
 		}
 	}
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 545c183..fbf18be 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -101,12 +101,15 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
 			/* Got an interrupt in between, try again */
 			local_irq_enable();
 			local_irq_disable();
+			kvm_guest_exit();
 			continue;
 		}
 
 		trace_hardirqs_on();
 #endif
 
+		kvm_guest_enter();
+
 		/* Going into guest context! Yay! */
 		vcpu->mode = IN_GUEST_MODE;
 		smp_wmb();
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 31/38] KVM: PPC: Ignore EXITING_GUEST_MODE mode
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

We don't need to do anything when mode is EXITING_GUEST_MODE, because
we essentially are outside of guest mode and did everything it asked
us to do by the time we check it.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/powerpc.c |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index fbf18be..3855c0f 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -89,11 +89,6 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
 			continue;
 		}
 
-		if (vcpu->mode == EXITING_GUEST_MODE) {
-			r = 1;
-			break;
-		}
-
 #ifdef CONFIG_PPC64
 		/* lazy EE magic */
 		hard_irq_disable();
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 32/38] KVM: PPC: Add return value in prepare_to_enter
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

Our prepare_to_enter helper needs to be able to return to the host in
more circumstances than just when an interrupt is pending. Broaden the
interface a bit and move more generic code into the generic helper.
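
The new calling convention can be sketched as follows. This is a minimal
user-space model, not the kernel code: prepare_to_enter, vcpu_run and
signal_is_pending are simplified stand-ins for kvmppc_prepare_to_enter(),
the vcpu_run entry sites and signal_pending(current).

```c
#include <errno.h>

/* Stand-in for signal_pending(current); illustration only. */
static int signal_is_pending;

/*
 * Models the reworked return convention:
 *   1    -> ready to go into guest state
 *   <= 0 -> return to the host with this value (e.g. -EINTR)
 */
static int prepare_to_enter(void)
{
	if (signal_is_pending)
		return -EINTR;	/* caller propagates this to userspace */
	return 1;		/* ready to enter the guest */
}

/* Caller-side pattern used at each entry site after this patch. */
static int vcpu_run(void)
{
	int s = prepare_to_enter();

	if (s <= 0)
		return s;	/* back to the host, s is the return value */
	/* ... enter the guest here ... */
	return 0;
}
```

The point of the convention is that the helper itself decides the host
return value, so the entry sites no longer need to hardcode -EINTR.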

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/book3s_pr.c |   12 ++++++------
 arch/powerpc/kvm/booke.c     |   16 ++++++++--------
 arch/powerpc/kvm/powerpc.c   |   11 ++++++++---
 3 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 1ff0d6c..71fa0f1 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -589,6 +589,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
                        unsigned int exit_nr)
 {
 	int r = RESUME_HOST;
+	int s;
 
 	vcpu->stat.sum_exits++;
 
@@ -862,10 +863,10 @@ program_interrupt:
 		 * again due to a host external interrupt.
 		 */
 		local_irq_disable();
-		if (kvmppc_prepare_to_enter(vcpu)) {
+		s = kvmppc_prepare_to_enter(vcpu);
+		if (s <= 0) {
 			local_irq_enable();
-			run->exit_reason = KVM_EXIT_INTR;
-			r = -EINTR;
+			r = s;
 		} else {
 			kvmppc_lazy_ee_enable();
 		}
@@ -1074,10 +1075,9 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 	 * a host external interrupt.
 	 */
 	local_irq_disable();
-	if (kvmppc_prepare_to_enter(vcpu)) {
+	ret = kvmppc_prepare_to_enter(vcpu);
+	if (ret <= 0) {
 		local_irq_enable();
-		kvm_run->exit_reason = KVM_EXIT_INTR;
-		ret = -EINTR;
 		goto out;
 	}
 
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 5e8dc19..1917802 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -467,7 +467,7 @@ void kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
 
 int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 {
-	int ret;
+	int ret, s;
 #ifdef CONFIG_PPC_FPU
 	unsigned int fpscr;
 	int fpexc_mode;
@@ -480,10 +480,10 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 	}
 
 	local_irq_disable();
-	if (kvmppc_prepare_to_enter(vcpu)) {
+	s = kvmppc_prepare_to_enter(vcpu);
+	if (s <= 0) {
 		local_irq_enable();
-		kvm_run->exit_reason = KVM_EXIT_INTR;
-		ret = -EINTR;
+		ret = s;
 		goto out;
 	}
 	kvmppc_lazy_ee_enable();
@@ -642,6 +642,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
                        unsigned int exit_nr)
 {
 	int r = RESUME_HOST;
+	int s;
 
 	/* update before a new last_exit_type is rewritten */
 	kvmppc_update_timing_stats(vcpu);
@@ -948,11 +949,10 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	 */
 	if (!(r & RESUME_HOST)) {
 		local_irq_disable();
-		if (kvmppc_prepare_to_enter(vcpu)) {
+		s = kvmppc_prepare_to_enter(vcpu);
+		if (s <= 0) {
 			local_irq_enable();
-			run->exit_reason = KVM_EXIT_INTR;
-			r = (-EINTR << 2) | RESUME_HOST | (r & RESUME_FLAG_NV);
-			kvmppc_account_exit(vcpu, SIGNAL_EXITS);
+			r = (s << 2) | RESUME_HOST | (r & RESUME_FLAG_NV);
 		} else {
 			kvmppc_lazy_ee_enable();
 		}
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 3855c0f..07e86f8 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -53,11 +53,14 @@ int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
  * Common checks before entering the guest world.  Call with interrupts
  * disabled.
  *
- * returns !0 if a signal is pending and check_signal is true
+ * returns:
+ *
+ * == 1 if we're ready to go into guest state
+ * <= 0 if we need to go back to the host with return value
  */
 int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
 {
-	int r = 0;
+	int r = 1;
 
 	WARN_ON_ONCE(!irqs_disabled());
 	while (true) {
@@ -69,7 +72,9 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
 		}
 
 		if (signal_pending(current)) {
-			r = 1;
+			kvmppc_account_exit(vcpu, SIGNAL_EXITS);
+			vcpu->run->exit_reason = KVM_EXIT_INTR;
+			r = -EINTR;
 			break;
 		}
 
-- 
1.6.0.2


* [PATCH 33/38] KVM: PPC: Add return value to core_check_requests
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

A request may require that we go back into host state, so add a return
value to the request checks.
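
The resulting loop logic can be modeled like this. All names here are
simplified stand-ins for the kernel functions, not the real API; the
shape of the loop mirrors the change to kvmppc_prepare_to_enter():

```c
/*
 * The request check now returns > 0 to keep trying to enter the guest
 * and <= 0 to bail out to the host with that value.
 */
static int pending_watchdog;	/* models a request that forces a host exit */

static int core_check_requests(void)
{
	if (pending_watchdog)
		return 0;	/* e.g. report the exit to userspace */
	return 1;		/* requests handled, keep going */
}

static int prepare_to_enter(void)
{
	int r = 1;
	int requests = 1;	/* pretend one batch of requests is pending */

	while (1) {
		if (requests) {
			requests = 0;
			r = core_check_requests();
			if (r > 0)
				continue;	/* re-check entry conditions */
			break;			/* r <= 0: return to host */
		}
		/* nothing left to do: r is still 1, enter the guest */
		break;
	}
	return r;
}
```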

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm_ppc.h |    2 +-
 arch/powerpc/kvm/book3s_pr.c       |    6 +++++-
 arch/powerpc/kvm/booke.c           |    6 +++++-
 arch/powerpc/kvm/powerpc.c         |    6 ++++--
 4 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 5459364..3dfc437 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -112,7 +112,7 @@ extern int kvmppc_core_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn,
 				     ulong val);
 extern int kvmppc_core_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn,
 				     ulong *val);
-extern void kvmppc_core_check_requests(struct kvm_vcpu *vcpu);
+extern int kvmppc_core_check_requests(struct kvm_vcpu *vcpu);
 
 extern int kvmppc_booke_init(void);
 extern void kvmppc_booke_exit(void);
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 71fa0f1..b3c584f 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -86,12 +86,16 @@ void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu)
 	kvmppc_giveup_ext(vcpu, MSR_VSX);
 }
 
-void kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
+int kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
 {
+	int r = 1; /* Indicate we want to get back into the guest */
+
 	/* We misuse TLB_FLUSH to indicate that we want to clear
 	   all shadow cache entries */
 	if (kvm_check_request(KVM_REQ_TLB_FLUSH, vcpu))
 		kvmppc_mmu_pte_flush(vcpu, 0, 0);
+
+	return r;
 }
 
 /************* MMU Notifiers *************/
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 1917802..c364930 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -455,14 +455,18 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
 	return r;
 }
 
-void kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
+int kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
 {
+	int r = 1; /* Indicate we want to get back into the guest */
+
 	if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu))
 		update_timer_ints(vcpu);
 #if defined(CONFIG_KVM_E500V2) || defined(CONFIG_KVM_E500MC)
 	if (kvm_check_request(KVM_REQ_TLB_FLUSH, vcpu))
 		kvmppc_core_flush_tlb(vcpu);
 #endif
+
+	return r;
 }
 
 int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 07e86f8..da98b41 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -83,9 +83,11 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
 			/* Make sure we process requests preemptable */
 			local_irq_enable();
 			trace_kvm_check_requests(vcpu);
-			kvmppc_core_check_requests(vcpu);
+			r = kvmppc_core_check_requests(vcpu);
 			local_irq_disable();
-			continue;
+			if (r > 0)
+				continue;
+			break;
 		}
 
 		if (kvmppc_core_prepare_to_enter(vcpu)) {
-- 
1.6.0.2



* [PATCH 34/38] KVM: PPC: booke: Add watchdog emulation
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list, Bharat Bhushan, Liu Yu, Scott Wood, Bharat Bhushan

From: Bharat Bhushan <r65777@freescale.com>

This patch adds watchdog emulation to KVM. The watchdog emulation is
enabled with the KVM_ENABLE_CAP(KVM_CAP_PPC_BOOKE_WATCHDOG) ioctl. A
kernel timer is used to emulate the h/w watchdog state machine. On
watchdog timer expiry, KVM exits to QEMU if TCR.WRC is non-zero. QEMU
can then reset, shut down, etc. depending on how it is configured.
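
The emulated state machine follows the Book E watchdog: a first expiry
sets TSR[ENW], a second sets TSR[WIS] (delivering the watchdog
interrupt), and a third is the "final" expiry that triggers the
TCR.WRC-controlled action. A minimal sketch of one timer expiry against
that state machine (watchdog_expire is a simplified stand-in for
kvmppc_watchdog_func, ignoring locking and the userspace exit plumbing):

```c
#include <stdint.h>

#define TSR_ENW 0x80000000u	/* Enable Next Watchdog */
#define TSR_WIS 0x40000000u	/* WDT Interrupt Status */

/*
 * Apply one watchdog timer expiry to the TSR. Returns 1 on the final
 * expiry (when the guest would be reset/shut down per TCR.WRC), else 0.
 */
static int watchdog_expire(uint32_t *tsr)
{
	if (!(*tsr & TSR_ENW)) {
		*tsr |= TSR_ENW;	/* first expiry: armed */
		return 0;
	}
	if (!(*tsr & TSR_WIS)) {
		*tsr |= TSR_WIS;	/* second expiry: interrupt status */
		return 0;
	}
	return 1;			/* third expiry: final, exit to host */
}
```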

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
[bharat.bhushan@freescale.com: reworked patch]
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
[agraf: adjust to new request framework]
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm_host.h  |    3 +
 arch/powerpc/include/asm/kvm_ppc.h   |    2 +
 arch/powerpc/include/asm/reg_booke.h |    7 ++
 arch/powerpc/kvm/book3s.c            |    9 ++
 arch/powerpc/kvm/booke.c             |  155 ++++++++++++++++++++++++++++++++++
 arch/powerpc/kvm/booke_emulate.c     |    8 ++
 arch/powerpc/kvm/powerpc.c           |   14 +++-
 include/linux/kvm.h                  |    2 +
 include/linux/kvm_host.h             |    1 +
 9 files changed, 199 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 4a5ec8f..51b0ccd 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -471,6 +471,8 @@ struct kvm_vcpu_arch {
 	ulong fault_esr;
 	ulong queued_dear;
 	ulong queued_esr;
+	spinlock_t wdt_lock;
+	struct timer_list wdt_timer;
 	u32 tlbcfg[4];
 	u32 mmucfg;
 	u32 epr;
@@ -486,6 +488,7 @@ struct kvm_vcpu_arch {
 	u8 osi_needed;
 	u8 osi_enabled;
 	u8 papr_enabled;
+	u8 watchdog_enabled;
 	u8 sane;
 	u8 cpu_type;
 	u8 hcall_needed;
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 3dfc437..c06a64b 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -68,6 +68,8 @@ extern void kvmppc_emulate_dec(struct kvm_vcpu *vcpu);
 extern u32 kvmppc_get_dec(struct kvm_vcpu *vcpu, u64 tb);
 extern void kvmppc_decrementer_func(unsigned long data);
 extern int kvmppc_sanity_check(struct kvm_vcpu *vcpu);
+extern int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu);
+extern void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu);
 
 /* Core-specific hooks */
 
diff --git a/arch/powerpc/include/asm/reg_booke.h b/arch/powerpc/include/asm/reg_booke.h
index 2d916c4..e07e6af 100644
--- a/arch/powerpc/include/asm/reg_booke.h
+++ b/arch/powerpc/include/asm/reg_booke.h
@@ -539,6 +539,13 @@
 #define TCR_FIE		0x00800000	/* FIT Interrupt Enable */
 #define TCR_ARE		0x00400000	/* Auto Reload Enable */
 
+#ifdef CONFIG_E500
+#define TCR_GET_WP(tcr)  ((((tcr) & 0xC0000000) >> 30) | \
+			      (((tcr) & 0x1E0000) >> 15))
+#else
+#define TCR_GET_WP(tcr)  (((tcr) & 0xC0000000) >> 30)
+#endif
+
 /* Bit definitions for the TSR. */
 #define TSR_ENW		0x80000000	/* Enable Next Watchdog */
 #define TSR_WIS		0x40000000	/* WDT Interrupt Status */
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 3f2a836..e946665 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -411,6 +411,15 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu)
+{
+	return 0;
+}
+
+void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu)
+{
+}
+
 int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 {
 	int i;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index c364930..09e8bf3 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -209,6 +209,16 @@ void kvmppc_core_dequeue_external(struct kvm_vcpu *vcpu,
 	clear_bit(BOOKE_IRQPRIO_EXTERNAL_LEVEL, &vcpu->arch.pending_exceptions);
 }
 
+static void kvmppc_core_queue_watchdog(struct kvm_vcpu *vcpu)
+{
+	kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_WATCHDOG);
+}
+
+static void kvmppc_core_dequeue_watchdog(struct kvm_vcpu *vcpu)
+{
+	clear_bit(BOOKE_IRQPRIO_WATCHDOG, &vcpu->arch.pending_exceptions);
+}
+
 static void set_guest_srr(struct kvm_vcpu *vcpu, unsigned long srr0, u32 srr1)
 {
 #ifdef CONFIG_KVM_BOOKE_HV
@@ -328,6 +338,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 		msr_mask = MSR_CE | MSR_ME | MSR_DE;
 		int_class = INT_CLASS_NONCRIT;
 		break;
+	case BOOKE_IRQPRIO_WATCHDOG:
 	case BOOKE_IRQPRIO_CRITICAL:
 	case BOOKE_IRQPRIO_DBELL_CRIT:
 		allowed = vcpu->arch.shared->msr & MSR_CE;
@@ -407,12 +418,121 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 	return allowed;
 }
 
+/*
+ * Return the number of jiffies until the next timeout.  If the timeout is
+ * longer than the NEXT_TIMER_MAX_DELTA, then return NEXT_TIMER_MAX_DELTA
+ * because the larger value can break the timer APIs.
+ */
+static unsigned long watchdog_next_timeout(struct kvm_vcpu *vcpu)
+{
+	u64 tb, wdt_tb, wdt_ticks = 0;
+	u64 nr_jiffies = 0;
+	u32 period = TCR_GET_WP(vcpu->arch.tcr);
+
+	wdt_tb = 1ULL << (63 - period);
+	tb = get_tb();
+	/*
+	 * The watchdog timeout will happen when the TB bit corresponding
+	 * to the watchdog toggles from 0 to 1.
+	 */
+	if (tb & wdt_tb)
+		wdt_ticks = wdt_tb;
+
+	wdt_ticks += wdt_tb - (tb & (wdt_tb - 1));
+
+	/* Convert timebase ticks to jiffies */
+	nr_jiffies = wdt_ticks;
+
+	if (do_div(nr_jiffies, tb_ticks_per_jiffy))
+		nr_jiffies++;
+
+	return min_t(unsigned long long, nr_jiffies, NEXT_TIMER_MAX_DELTA);
+}
+
+static void arm_next_watchdog(struct kvm_vcpu *vcpu)
+{
+	unsigned long nr_jiffies;
+	unsigned long flags;
+
+	/*
+	 * If TSR_ENW and TSR_WIS are not set then no need to exit to
+	 * userspace, so clear the KVM_REQ_WATCHDOG request.
+	 */
+	if ((vcpu->arch.tsr & (TSR_ENW | TSR_WIS)) != (TSR_ENW | TSR_WIS))
+		clear_bit(KVM_REQ_WATCHDOG, &vcpu->requests);
+
+	spin_lock_irqsave(&vcpu->arch.wdt_lock, flags);
+	nr_jiffies = watchdog_next_timeout(vcpu);
+	/*
+	 * If the number of jiffies of watchdog timer >= NEXT_TIMER_MAX_DELTA
+	 * then do not run the watchdog timer as this can break timer APIs.
+	 */
+	if (nr_jiffies < NEXT_TIMER_MAX_DELTA)
+		mod_timer(&vcpu->arch.wdt_timer, jiffies + nr_jiffies);
+	else
+		del_timer(&vcpu->arch.wdt_timer);
+	spin_unlock_irqrestore(&vcpu->arch.wdt_lock, flags);
+}
+
+void kvmppc_watchdog_func(unsigned long data)
+{
+	struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data;
+	u32 tsr, new_tsr;
+	int final;
+
+	do {
+		new_tsr = tsr = vcpu->arch.tsr;
+		final = 0;
+
+		/* Time out event */
+		if (tsr & TSR_ENW) {
+			if (tsr & TSR_WIS)
+				final = 1;
+			else
+				new_tsr = tsr | TSR_WIS;
+		} else {
+			new_tsr = tsr | TSR_ENW;
+		}
+	} while (cmpxchg(&vcpu->arch.tsr, tsr, new_tsr) != tsr);
+
+	if (new_tsr & TSR_WIS) {
+		smp_wmb();
+		kvm_make_request(KVM_REQ_PENDING_TIMER, vcpu);
+		kvm_vcpu_kick(vcpu);
+	}
+
+	/*
+	 * If this is final watchdog expiry and some action is required
+	 * then exit to userspace.
+	 */
+	if (final && (vcpu->arch.tcr & TCR_WRC_MASK) &&
+	    vcpu->arch.watchdog_enabled) {
+		smp_wmb();
+		kvm_make_request(KVM_REQ_WATCHDOG, vcpu);
+		kvm_vcpu_kick(vcpu);
+	}
+
+	/*
+	 * Stop running the watchdog timer after final expiration to
+	 * prevent the host from being flooded with timers if the
+	 * guest sets a short period.
+	 * Timers will resume when TSR/TCR is updated next time.
+	 */
+	if (!final)
+		arm_next_watchdog(vcpu);
+}
+
 static void update_timer_ints(struct kvm_vcpu *vcpu)
 {
 	if ((vcpu->arch.tcr & TCR_DIE) && (vcpu->arch.tsr & TSR_DIS))
 		kvmppc_core_queue_dec(vcpu);
 	else
 		kvmppc_core_dequeue_dec(vcpu);
+
+	if ((vcpu->arch.tcr & TCR_WIE) && (vcpu->arch.tsr & TSR_WIS))
+		kvmppc_core_queue_watchdog(vcpu);
+	else
+		kvmppc_core_dequeue_watchdog(vcpu);
 }
 
 static void kvmppc_core_check_exceptions(struct kvm_vcpu *vcpu)
@@ -466,6 +586,11 @@ int kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
 		kvmppc_core_flush_tlb(vcpu);
 #endif
 
+	if (kvm_check_request(KVM_REQ_WATCHDOG, vcpu)) {
+		vcpu->run->exit_reason = KVM_EXIT_WATCHDOG;
+		r = 0;
+	}
+
 	return r;
 }
 
@@ -995,6 +1120,21 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
 	return r;
 }
 
+int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu)
+{
+	/* setup watchdog timer once */
+	spin_lock_init(&vcpu->arch.wdt_lock);
+	setup_timer(&vcpu->arch.wdt_timer, kvmppc_watchdog_func,
+		    (unsigned long)vcpu);
+
+	return 0;
+}
+
+void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu)
+{
+	del_timer_sync(&vcpu->arch.wdt_timer);
+}
+
 int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 {
 	int i;
@@ -1090,7 +1230,13 @@ static int set_sregs_base(struct kvm_vcpu *vcpu,
 	}
 
 	if (sregs->u.e.update_special & KVM_SREGS_E_UPDATE_TSR) {
+		u32 old_tsr = vcpu->arch.tsr;
+
 		vcpu->arch.tsr = sregs->u.e.tsr;
+
+		if ((old_tsr ^ vcpu->arch.tsr) & (TSR_ENW | TSR_WIS))
+			arm_next_watchdog(vcpu);
+
 		update_timer_ints(vcpu);
 	}
 
@@ -1251,6 +1397,7 @@ void kvmppc_core_commit_memory_region(struct kvm *kvm,
 void kvmppc_set_tcr(struct kvm_vcpu *vcpu, u32 new_tcr)
 {
 	vcpu->arch.tcr = new_tcr;
+	arm_next_watchdog(vcpu);
 	update_timer_ints(vcpu);
 }
 
@@ -1265,6 +1412,14 @@ void kvmppc_set_tsr_bits(struct kvm_vcpu *vcpu, u32 tsr_bits)
 void kvmppc_clr_tsr_bits(struct kvm_vcpu *vcpu, u32 tsr_bits)
 {
 	clear_bits(tsr_bits, &vcpu->arch.tsr);
+
+	/*
+	 * We may have stopped the watchdog due to
+	 * being stuck on final expiration.
+	 */
+	if (tsr_bits & (TSR_ENW | TSR_WIS))
+		arm_next_watchdog(vcpu);
+
 	update_timer_ints(vcpu);
 }
 
diff --git a/arch/powerpc/kvm/booke_emulate.c b/arch/powerpc/kvm/booke_emulate.c
index 12834bb..5a66ade 100644
--- a/arch/powerpc/kvm/booke_emulate.c
+++ b/arch/powerpc/kvm/booke_emulate.c
@@ -145,6 +145,14 @@ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
 		kvmppc_clr_tsr_bits(vcpu, spr_val);
 		break;
 	case SPRN_TCR:
+		/*
+		 * WRC is a 2-bit field that is supposed to preserve its
+		 * value once written to non-zero.
+		 */
+		if (vcpu->arch.tcr & TCR_WRC_MASK) {
+			spr_val &= ~TCR_WRC_MASK;
+			spr_val |= vcpu->arch.tcr & TCR_WRC_MASK;
+		}
 		kvmppc_set_tcr(vcpu, spr_val);
 		break;
 
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index da98b41..32d217c 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -300,6 +300,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	switch (ext) {
 #ifdef CONFIG_BOOKE
 	case KVM_CAP_PPC_BOOKE_SREGS:
+	case KVM_CAP_PPC_BOOKE_WATCHDOG:
 #else
 	case KVM_CAP_PPC_SEGSTATE:
 	case KVM_CAP_PPC_HIOR:
@@ -472,6 +473,8 @@ enum hrtimer_restart kvmppc_decrementer_wakeup(struct hrtimer *timer)
 
 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 {
+	int ret;
+
 	hrtimer_init(&vcpu->arch.dec_timer, CLOCK_REALTIME, HRTIMER_MODE_ABS);
 	tasklet_init(&vcpu->arch.tasklet, kvmppc_decrementer_func, (ulong)vcpu);
 	vcpu->arch.dec_timer.function = kvmppc_decrementer_wakeup;
@@ -480,13 +483,14 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 #ifdef CONFIG_KVM_EXIT_TIMING
 	mutex_init(&vcpu->arch.exit_timing_lock);
 #endif
-
-	return 0;
+	ret = kvmppc_subarch_vcpu_init(vcpu);
+	return ret;
 }
 
 void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
 {
 	kvmppc_mmu_destroy(vcpu);
+	kvmppc_subarch_vcpu_uninit(vcpu);
 }
 
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
@@ -731,6 +735,12 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 		r = 0;
 		vcpu->arch.papr_enabled = true;
 		break;
+#ifdef CONFIG_BOOKE
+	case KVM_CAP_PPC_BOOKE_WATCHDOG:
+		r = 0;
+		vcpu->arch.watchdog_enabled = true;
+		break;
+#endif
 #if defined(CONFIG_KVM_E500V2) || defined(CONFIG_KVM_E500MC)
 	case KVM_CAP_SW_TLB: {
 		struct kvm_config_tlb cfg;
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index c03e59e..f4f5be8 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -163,6 +163,7 @@ struct kvm_pit_config {
 #define KVM_EXIT_OSI              18
 #define KVM_EXIT_PAPR_HCALL	  19
 #define KVM_EXIT_S390_UCONTROL	  20
+#define KVM_EXIT_WATCHDOG         21
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 #define KVM_INTERNAL_ERROR_EMULATION 1
@@ -620,6 +621,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_PPC_GET_SMMU_INFO 78
 #define KVM_CAP_S390_COW 79
 #define KVM_CAP_PPC_ALLOC_HTAB 80
+#define KVM_CAP_PPC_BOOKE_WATCHDOG 81
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index d2b897e..7c0bbd1 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -102,6 +102,7 @@ static inline bool is_error_page(struct page *page)
 #define KVM_REQ_IMMEDIATE_EXIT    15
 #define KVM_REQ_PMU               16
 #define KVM_REQ_PMI               17
+#define KVM_REQ_WATCHDOG          18
 
 #define KVM_USERSPACE_IRQ_SOURCE_ID	0
 
-- 
1.6.0.2



* [PATCH 34/38] KVM: PPC: booke: Add watchdog emulation
@ 2012-08-14 23:04   ` Alexander Graf
  0 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list, Bharat Bhushan, Liu Yu, Scott Wood, Bharat Bhushan

From: Bharat Bhushan <r65777@freescale.com>

This patch adds the watchdog emulation in KVM. The watchdog
emulation is enabled by KVM_ENABLE_CAP(KVM_CAP_PPC_BOOKE_WATCHDOG) ioctl.
The kernel timer are used for watchdog emulation and emulates
h/w watchdog state machine. On watchdog timer expiry, it exit to QEMU
if TCR.WRC is non ZERO. QEMU can reset/shutdown etc depending upon how
it is configured.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
[bharat.bhushan@freescale.com: reworked patch]
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
[agraf: adjust to new request framework]
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm_host.h  |    3 +
 arch/powerpc/include/asm/kvm_ppc.h   |    2 +
 arch/powerpc/include/asm/reg_booke.h |    7 ++
 arch/powerpc/kvm/book3s.c            |    9 ++
 arch/powerpc/kvm/booke.c             |  155 ++++++++++++++++++++++++++++++++++
 arch/powerpc/kvm/booke_emulate.c     |    8 ++
 arch/powerpc/kvm/powerpc.c           |   14 +++-
 include/linux/kvm.h                  |    2 +
 include/linux/kvm_host.h             |    1 +
 9 files changed, 199 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 4a5ec8f..51b0ccd 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -471,6 +471,8 @@ struct kvm_vcpu_arch {
 	ulong fault_esr;
 	ulong queued_dear;
 	ulong queued_esr;
+	spinlock_t wdt_lock;
+	struct timer_list wdt_timer;
 	u32 tlbcfg[4];
 	u32 mmucfg;
 	u32 epr;
@@ -486,6 +488,7 @@ struct kvm_vcpu_arch {
 	u8 osi_needed;
 	u8 osi_enabled;
 	u8 papr_enabled;
+	u8 watchdog_enabled;
 	u8 sane;
 	u8 cpu_type;
 	u8 hcall_needed;
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 3dfc437..c06a64b 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -68,6 +68,8 @@ extern void kvmppc_emulate_dec(struct kvm_vcpu *vcpu);
 extern u32 kvmppc_get_dec(struct kvm_vcpu *vcpu, u64 tb);
 extern void kvmppc_decrementer_func(unsigned long data);
 extern int kvmppc_sanity_check(struct kvm_vcpu *vcpu);
+extern int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu);
+extern void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu);
 
 /* Core-specific hooks */
 
diff --git a/arch/powerpc/include/asm/reg_booke.h b/arch/powerpc/include/asm/reg_booke.h
index 2d916c4..e07e6af 100644
--- a/arch/powerpc/include/asm/reg_booke.h
+++ b/arch/powerpc/include/asm/reg_booke.h
@@ -539,6 +539,13 @@
 #define TCR_FIE		0x00800000	/* FIT Interrupt Enable */
 #define TCR_ARE		0x00400000	/* Auto Reload Enable */
 
+#ifdef CONFIG_E500
+#define TCR_GET_WP(tcr)  ((((tcr) & 0xC0000000) >> 30) | \
+			      (((tcr) & 0x1E0000) >> 15))
+#else
+#define TCR_GET_WP(tcr)  (((tcr) & 0xC0000000) >> 30)
+#endif
+
 /* Bit definitions for the TSR. */
 #define TSR_ENW		0x80000000	/* Enable Next Watchdog */
 #define TSR_WIS		0x40000000	/* WDT Interrupt Status */
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 3f2a836..e946665 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -411,6 +411,15 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu)
+{
+	return 0;
+}
+
+void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu)
+{
+}
+
 int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 {
 	int i;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index c364930..09e8bf3 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -209,6 +209,16 @@ void kvmppc_core_dequeue_external(struct kvm_vcpu *vcpu,
 	clear_bit(BOOKE_IRQPRIO_EXTERNAL_LEVEL, &vcpu->arch.pending_exceptions);
 }
 
+static void kvmppc_core_queue_watchdog(struct kvm_vcpu *vcpu)
+{
+	kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_WATCHDOG);
+}
+
+static void kvmppc_core_dequeue_watchdog(struct kvm_vcpu *vcpu)
+{
+	clear_bit(BOOKE_IRQPRIO_WATCHDOG, &vcpu->arch.pending_exceptions);
+}
+
 static void set_guest_srr(struct kvm_vcpu *vcpu, unsigned long srr0, u32 srr1)
 {
 #ifdef CONFIG_KVM_BOOKE_HV
@@ -328,6 +338,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 		msr_mask = MSR_CE | MSR_ME | MSR_DE;
 		int_class = INT_CLASS_NONCRIT;
 		break;
+	case BOOKE_IRQPRIO_WATCHDOG:
 	case BOOKE_IRQPRIO_CRITICAL:
 	case BOOKE_IRQPRIO_DBELL_CRIT:
 		allowed = vcpu->arch.shared->msr & MSR_CE;
@@ -407,12 +418,121 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 	return allowed;
 }
 
+/*
+ * Return the number of jiffies until the next timeout.  If the timeout is
+ * longer than the NEXT_TIMER_MAX_DELTA, then return NEXT_TIMER_MAX_DELTA
+ * because the larger value can break the timer APIs.
+ */
+static unsigned long watchdog_next_timeout(struct kvm_vcpu *vcpu)
+{
+	u64 tb, wdt_tb, wdt_ticks = 0;
+	u64 nr_jiffies = 0;
+	u32 period = TCR_GET_WP(vcpu->arch.tcr);
+
+	wdt_tb = 1ULL << (63 - period);
+	tb = get_tb();
+	/*
+	 * The watchdog timeout will happen when the TB bit corresponding
+	 * to the watchdog period toggles from 0 to 1.
+	 */
+	if (tb & wdt_tb)
+		wdt_ticks = wdt_tb;
+
+	wdt_ticks += wdt_tb - (tb & (wdt_tb - 1));
+
+	/* Convert timebase ticks to jiffies */
+	nr_jiffies = wdt_ticks;
+
+	if (do_div(nr_jiffies, tb_ticks_per_jiffy))
+		nr_jiffies++;
+
+	return min_t(unsigned long long, nr_jiffies, NEXT_TIMER_MAX_DELTA);
+}
+
+static void arm_next_watchdog(struct kvm_vcpu *vcpu)
+{
+	unsigned long nr_jiffies;
+	unsigned long flags;
+
+	/*
+	 * If TSR_ENW and TSR_WIS are not set then no need to exit to
+	 * userspace, so clear the KVM_REQ_WATCHDOG request.
+	 */
+	if ((vcpu->arch.tsr & (TSR_ENW | TSR_WIS)) != (TSR_ENW | TSR_WIS))
+		clear_bit(KVM_REQ_WATCHDOG, &vcpu->requests);
+
+	spin_lock_irqsave(&vcpu->arch.wdt_lock, flags);
+	nr_jiffies = watchdog_next_timeout(vcpu);
+	/*
+	 * If the number of jiffies of watchdog timer >= NEXT_TIMER_MAX_DELTA
+	 * then do not run the watchdog timer as this can break timer APIs.
+	 */
+	if (nr_jiffies < NEXT_TIMER_MAX_DELTA)
+		mod_timer(&vcpu->arch.wdt_timer, jiffies + nr_jiffies);
+	else
+		del_timer(&vcpu->arch.wdt_timer);
+	spin_unlock_irqrestore(&vcpu->arch.wdt_lock, flags);
+}
+
+void kvmppc_watchdog_func(unsigned long data)
+{
+	struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data;
+	u32 tsr, new_tsr;
+	int final;
+
+	do {
+		new_tsr = tsr = vcpu->arch.tsr;
+		final = 0;
+
+		/* Time out event */
+		if (tsr & TSR_ENW) {
+			if (tsr & TSR_WIS)
+				final = 1;
+			else
+				new_tsr = tsr | TSR_WIS;
+		} else {
+			new_tsr = tsr | TSR_ENW;
+		}
+	} while (cmpxchg(&vcpu->arch.tsr, tsr, new_tsr) != tsr);
+
+	if (new_tsr & TSR_WIS) {
+		smp_wmb();
+		kvm_make_request(KVM_REQ_PENDING_TIMER, vcpu);
+		kvm_vcpu_kick(vcpu);
+	}
+
+	/*
+	 * If this is final watchdog expiry and some action is required
+	 * then exit to userspace.
+	 */
+	if (final && (vcpu->arch.tcr & TCR_WRC_MASK) &&
+	    vcpu->arch.watchdog_enabled) {
+		smp_wmb();
+		kvm_make_request(KVM_REQ_WATCHDOG, vcpu);
+		kvm_vcpu_kick(vcpu);
+	}
+
+	/*
+	 * Stop running the watchdog timer after final expiration to
+	 * prevent the host from being flooded with timers if the
+	 * guest sets a short period.
+	 * Timers will resume when TSR/TCR is updated next time.
+	 */
+	if (!final)
+		arm_next_watchdog(vcpu);
+}
+
 static void update_timer_ints(struct kvm_vcpu *vcpu)
 {
 	if ((vcpu->arch.tcr & TCR_DIE) && (vcpu->arch.tsr & TSR_DIS))
 		kvmppc_core_queue_dec(vcpu);
 	else
 		kvmppc_core_dequeue_dec(vcpu);
+
+	if ((vcpu->arch.tcr & TCR_WIE) && (vcpu->arch.tsr & TSR_WIS))
+		kvmppc_core_queue_watchdog(vcpu);
+	else
+		kvmppc_core_dequeue_watchdog(vcpu);
 }
 
 static void kvmppc_core_check_exceptions(struct kvm_vcpu *vcpu)
@@ -466,6 +586,11 @@ int kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
 		kvmppc_core_flush_tlb(vcpu);
 #endif
 
+	if (kvm_check_request(KVM_REQ_WATCHDOG, vcpu)) {
+		vcpu->run->exit_reason = KVM_EXIT_WATCHDOG;
+		r = 0;
+	}
+
 	return r;
 }
 
@@ -995,6 +1120,21 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
 	return r;
 }
 
+int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu)
+{
+	/* setup watchdog timer once */
+	spin_lock_init(&vcpu->arch.wdt_lock);
+	setup_timer(&vcpu->arch.wdt_timer, kvmppc_watchdog_func,
+		    (unsigned long)vcpu);
+
+	return 0;
+}
+
+void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu)
+{
+	del_timer_sync(&vcpu->arch.wdt_timer);
+}
+
 int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 {
 	int i;
@@ -1090,7 +1230,13 @@ static int set_sregs_base(struct kvm_vcpu *vcpu,
 	}
 
 	if (sregs->u.e.update_special & KVM_SREGS_E_UPDATE_TSR) {
+		u32 old_tsr = vcpu->arch.tsr;
+
 		vcpu->arch.tsr = sregs->u.e.tsr;
+
+		if ((old_tsr ^ vcpu->arch.tsr) & (TSR_ENW | TSR_WIS))
+			arm_next_watchdog(vcpu);
+
 		update_timer_ints(vcpu);
 	}
 
@@ -1251,6 +1397,7 @@ void kvmppc_core_commit_memory_region(struct kvm *kvm,
 void kvmppc_set_tcr(struct kvm_vcpu *vcpu, u32 new_tcr)
 {
 	vcpu->arch.tcr = new_tcr;
+	arm_next_watchdog(vcpu);
 	update_timer_ints(vcpu);
 }
 
@@ -1265,6 +1412,14 @@ void kvmppc_set_tsr_bits(struct kvm_vcpu *vcpu, u32 tsr_bits)
 void kvmppc_clr_tsr_bits(struct kvm_vcpu *vcpu, u32 tsr_bits)
 {
 	clear_bits(tsr_bits, &vcpu->arch.tsr);
+
+	/*
+	 * We may have stopped the watchdog timer due to being stuck
+	 * on the final expiration; if so, re-arm it now.
+	 */
+	if (tsr_bits & (TSR_ENW | TSR_WIS))
+		arm_next_watchdog(vcpu);
+
 	update_timer_ints(vcpu);
 }
 
diff --git a/arch/powerpc/kvm/booke_emulate.c b/arch/powerpc/kvm/booke_emulate.c
index 12834bb..5a66ade 100644
--- a/arch/powerpc/kvm/booke_emulate.c
+++ b/arch/powerpc/kvm/booke_emulate.c
@@ -145,6 +145,14 @@ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
 		kvmppc_clr_tsr_bits(vcpu, spr_val);
 		break;
 	case SPRN_TCR:
+		/*
+		 * WRC is a 2-bit field that is supposed to preserve its
+		 * value once written to non-zero.
+		 */
+		if (vcpu->arch.tcr & TCR_WRC_MASK) {
+			spr_val &= ~TCR_WRC_MASK;
+			spr_val |= vcpu->arch.tcr & TCR_WRC_MASK;
+		}
 		kvmppc_set_tcr(vcpu, spr_val);
 		break;
 
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index da98b41..32d217c 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -300,6 +300,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	switch (ext) {
 #ifdef CONFIG_BOOKE
 	case KVM_CAP_PPC_BOOKE_SREGS:
+	case KVM_CAP_PPC_BOOKE_WATCHDOG:
 #else
 	case KVM_CAP_PPC_SEGSTATE:
 	case KVM_CAP_PPC_HIOR:
@@ -472,6 +473,8 @@ enum hrtimer_restart kvmppc_decrementer_wakeup(struct hrtimer *timer)
 
 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 {
+	int ret;
+
 	hrtimer_init(&vcpu->arch.dec_timer, CLOCK_REALTIME, HRTIMER_MODE_ABS);
 	tasklet_init(&vcpu->arch.tasklet, kvmppc_decrementer_func, (ulong)vcpu);
 	vcpu->arch.dec_timer.function = kvmppc_decrementer_wakeup;
@@ -480,13 +483,14 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 #ifdef CONFIG_KVM_EXIT_TIMING
 	mutex_init(&vcpu->arch.exit_timing_lock);
 #endif
-
-	return 0;
+	ret = kvmppc_subarch_vcpu_init(vcpu);
+	return ret;
 }
 
 void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
 {
 	kvmppc_mmu_destroy(vcpu);
+	kvmppc_subarch_vcpu_uninit(vcpu);
 }
 
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
@@ -731,6 +735,12 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 		r = 0;
 		vcpu->arch.papr_enabled = true;
 		break;
+#ifdef CONFIG_BOOKE
+	case KVM_CAP_PPC_BOOKE_WATCHDOG:
+		r = 0;
+		vcpu->arch.watchdog_enabled = true;
+		break;
+#endif
 #if defined(CONFIG_KVM_E500V2) || defined(CONFIG_KVM_E500MC)
 	case KVM_CAP_SW_TLB: {
 		struct kvm_config_tlb cfg;
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index c03e59e..f4f5be8 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -163,6 +163,7 @@ struct kvm_pit_config {
 #define KVM_EXIT_OSI              18
 #define KVM_EXIT_PAPR_HCALL	  19
 #define KVM_EXIT_S390_UCONTROL	  20
+#define KVM_EXIT_WATCHDOG         21
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 #define KVM_INTERNAL_ERROR_EMULATION 1
@@ -620,6 +621,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_PPC_GET_SMMU_INFO 78
 #define KVM_CAP_S390_COW 79
 #define KVM_CAP_PPC_ALLOC_HTAB 80
+#define KVM_CAP_PPC_BOOKE_WATCHDOG 81
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index d2b897e..7c0bbd1 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -102,6 +102,7 @@ static inline bool is_error_page(struct page *page)
 #define KVM_REQ_IMMEDIATE_EXIT    15
 #define KVM_REQ_PMU               16
 #define KVM_REQ_PMI               17
+#define KVM_REQ_WATCHDOG          18
 
 #define KVM_USERSPACE_IRQ_SOURCE_ID	0
 
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 35/38] booke: Added ONE_REG interface for IAC/DAC debug registers
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list, Bharat Bhushan, Bharat Bhushan

From: Bharat Bhushan <r65777@freescale.com>

The IAC/DAC registers are exposed as 32 bit by the sregs interface, while
they are actually 64 bits wide. So a ONE_REG interface is added to set/get them.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/include/asm/kvm.h      |   12 ++++++++
 arch/powerpc/include/asm/kvm_host.h |   24 ++++++++++++++++-
 arch/powerpc/kvm/booke.c            |   48 +++++++++++++++++++++++++++++++++-
 arch/powerpc/kvm/booke_emulate.c    |    8 +++---
 4 files changed, 84 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h
index 1bea4d8..3c14202 100644
--- a/arch/powerpc/include/asm/kvm.h
+++ b/arch/powerpc/include/asm/kvm.h
@@ -221,6 +221,12 @@ struct kvm_sregs {
 
 			__u32 dbsr;	/* KVM_SREGS_E_UPDATE_DBSR */
 			__u32 dbcr[3];
+			/*
+			 * iac/dac registers are 64 bits wide, while this
+			 * interface provides only their lower 32 bits on
+			 * 64-bit processors.  The ONE_REG interface gives
+			 * access to the full 64-bit iac/dac registers.
+			 */
 			__u32 iac[4];
 			__u32 dac[2];
 			__u32 dvc[2];
@@ -326,5 +332,11 @@ struct kvm_book3e_206_tlb_params {
 };
 
 #define KVM_REG_PPC_HIOR	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x1)
+#define KVM_REG_PPC_IAC1	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x2)
+#define KVM_REG_PPC_IAC2	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x3)
+#define KVM_REG_PPC_IAC3	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x4)
+#define KVM_REG_PPC_IAC4	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x5)
+#define KVM_REG_PPC_DAC1	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x6)
+#define KVM_REG_PPC_DAC2	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x7)
 
 #endif /* __LINUX_KVM_POWERPC_H */
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 51b0ccd..f20a5ef 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -346,6 +346,27 @@ struct kvmppc_slb {
 	bool class	: 1;
 };
 
+# ifdef CONFIG_PPC_FSL_BOOK3E
+#define KVMPPC_BOOKE_IAC_NUM	2
+#define KVMPPC_BOOKE_DAC_NUM	2
+# else
+#define KVMPPC_BOOKE_IAC_NUM	4
+#define KVMPPC_BOOKE_DAC_NUM	2
+# endif
+#define KVMPPC_BOOKE_MAX_IAC	4
+#define KVMPPC_BOOKE_MAX_DAC	2
+
+struct kvmppc_booke_debug_reg {
+	u32 dbcr0;
+	u32 dbcr1;
+	u32 dbcr2;
+#ifdef CONFIG_KVM_E500MC
+	u32 dbcr4;
+#endif
+	u64 iac[KVMPPC_BOOKE_MAX_IAC];
+	u64 dac[KVMPPC_BOOKE_MAX_DAC];
+};
+
 struct kvm_vcpu_arch {
 	ulong host_stack;
 	u32 host_pid;
@@ -440,8 +461,6 @@ struct kvm_vcpu_arch {
 
 	u32 ccr0;
 	u32 ccr1;
-	u32 dbcr0;
-	u32 dbcr1;
 	u32 dbsr;
 
 	u64 mmcr[3];
@@ -476,6 +495,7 @@ struct kvm_vcpu_arch {
 	u32 tlbcfg[4];
 	u32 mmucfg;
 	u32 epr;
+	struct kvmppc_booke_debug_reg dbg_reg;
 #endif
 	gpa_t paddr_accessed;
 	gva_t vaddr_accessed;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 09e8bf3..959aae9 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1351,12 +1351,56 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 
 int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
 {
-	return -EINVAL;
+	int r = -EINVAL;
+
+	switch (reg->id) {
+	case KVM_REG_PPC_IAC1:
+	case KVM_REG_PPC_IAC2:
+	case KVM_REG_PPC_IAC3:
+	case KVM_REG_PPC_IAC4: {
+		int iac = reg->id - KVM_REG_PPC_IAC1;
+		r = copy_to_user((u64 __user *)(long)reg->addr,
+				 &vcpu->arch.dbg_reg.iac[iac], sizeof(u64));
+		break;
+	}
+	case KVM_REG_PPC_DAC1:
+	case KVM_REG_PPC_DAC2: {
+		int dac = reg->id - KVM_REG_PPC_DAC1;
+		r = copy_to_user((u64 __user *)(long)reg->addr,
+				 &vcpu->arch.dbg_reg.dac[dac], sizeof(u64));
+		break;
+	}
+	default:
+		break;
+	}
+	return r;
 }
 
 int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
 {
-	return -EINVAL;
+	int r = -EINVAL;
+
+	switch (reg->id) {
+	case KVM_REG_PPC_IAC1:
+	case KVM_REG_PPC_IAC2:
+	case KVM_REG_PPC_IAC3:
+	case KVM_REG_PPC_IAC4: {
+		int iac = reg->id - KVM_REG_PPC_IAC1;
+		r = copy_from_user(&vcpu->arch.dbg_reg.iac[iac],
+			     (u64 __user *)(long)reg->addr, sizeof(u64));
+		break;
+	}
+	case KVM_REG_PPC_DAC1:
+	case KVM_REG_PPC_DAC2: {
+		int dac = reg->id - KVM_REG_PPC_DAC1;
+		r = copy_from_user(&vcpu->arch.dbg_reg.dac[dac],
+			     (u64 __user *)(long)reg->addr, sizeof(u64));
+		break;
+	}
+	default:
+		break;
+	}
+	return r;
 }
 
 int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
diff --git a/arch/powerpc/kvm/booke_emulate.c b/arch/powerpc/kvm/booke_emulate.c
index 5a66ade..cc99a0b 100644
--- a/arch/powerpc/kvm/booke_emulate.c
+++ b/arch/powerpc/kvm/booke_emulate.c
@@ -133,10 +133,10 @@ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
 		vcpu->arch.csrr1 = spr_val;
 		break;
 	case SPRN_DBCR0:
-		vcpu->arch.dbcr0 = spr_val;
+		vcpu->arch.dbg_reg.dbcr0 = spr_val;
 		break;
 	case SPRN_DBCR1:
-		vcpu->arch.dbcr1 = spr_val;
+		vcpu->arch.dbg_reg.dbcr1 = spr_val;
 		break;
 	case SPRN_DBSR:
 		vcpu->arch.dbsr &= ~spr_val;
@@ -266,10 +266,10 @@ int kvmppc_booke_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val)
 		*spr_val = vcpu->arch.csrr1;
 		break;
 	case SPRN_DBCR0:
-		*spr_val = vcpu->arch.dbcr0;
+		*spr_val = vcpu->arch.dbg_reg.dbcr0;
 		break;
 	case SPRN_DBCR1:
-		*spr_val = vcpu->arch.dbcr1;
+		*spr_val = vcpu->arch.dbg_reg.dbcr1;
 		break;
 	case SPRN_DBSR:
 		*spr_val = vcpu->arch.dbsr;
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 36/38] KVM: PPC: 44x: Initialize PVR
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

We need to make sure that vcpu->arch.pvr is initialized to a sane value,
so let's just take the host PVR.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/44x.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/44x.c b/arch/powerpc/kvm/44x.c
index 50e7dbc..3d7fd21 100644
--- a/arch/powerpc/kvm/44x.c
+++ b/arch/powerpc/kvm/44x.c
@@ -83,6 +83,7 @@ int kvmppc_core_vcpu_setup(struct kvm_vcpu *vcpu)
 		vcpu_44x->shadow_refs[i].gtlb_index = -1;
 
 	vcpu->arch.cpu_type = KVM_CPU_440;
+	vcpu->arch.pvr = mfspr(SPRN_PVR);
 
 	return 0;
 }
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 37/38] KVM: PPC: BookE: Add MCSR SPR support
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list

Add support for the MCSR SPR. This only implements the SPR storage
bits, not actual machine checks.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/booke_emulate.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/booke_emulate.c b/arch/powerpc/kvm/booke_emulate.c
index cc99a0b..514790f 100644
--- a/arch/powerpc/kvm/booke_emulate.c
+++ b/arch/powerpc/kvm/booke_emulate.c
@@ -237,6 +237,9 @@ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
 	case SPRN_IVOR15:
 		vcpu->arch.ivor[BOOKE_IRQPRIO_DEBUG] = spr_val;
 		break;
+	case SPRN_MCSR:
+		vcpu->arch.mcsr &= ~spr_val;
+		break;
 
 	default:
 		emulated = EMULATE_FAIL;
@@ -329,6 +332,9 @@ int kvmppc_booke_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val)
 	case SPRN_IVOR15:
 		*spr_val = vcpu->arch.ivor[BOOKE_IRQPRIO_DEBUG];
 		break;
+	case SPRN_MCSR:
+		*spr_val = vcpu->arch.mcsr;
+		break;
 
 	default:
 		emulated = EMULATE_FAIL;
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 38/38] ppc: e500_tlb memset clears nothing
  2012-08-14 23:04 ` Alexander Graf
@ 2012-08-14 23:04   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:04 UTC (permalink / raw)
  To: kvm-ppc; +Cc: KVM list, Alan Cox, Andrew Morton

From: Alan Cox <alan@linux.intel.com>

Put the parameters the right way around.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=44031

Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kvm/e500_tlb.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
index 6340b3c..1af6fab 100644
--- a/arch/powerpc/kvm/e500_tlb.c
+++ b/arch/powerpc/kvm/e500_tlb.c
@@ -320,11 +320,11 @@ static inline void kvmppc_e500_ref_release(struct tlbe_ref *ref)
 static void clear_tlb1_bitmap(struct kvmppc_vcpu_e500 *vcpu_e500)
 {
 	if (vcpu_e500->g2h_tlb1_map)
-		memset(vcpu_e500->g2h_tlb1_map,
-		       sizeof(u64) * vcpu_e500->gtlb_params[1].entries, 0);
+		memset(vcpu_e500->g2h_tlb1_map, 0,
+		       sizeof(u64) * vcpu_e500->gtlb_params[1].entries);
 	if (vcpu_e500->h2g_tlb1_rmap)
-		memset(vcpu_e500->h2g_tlb1_rmap,
-		       sizeof(unsigned int) * host_tlb_params[1].entries, 0);
+		memset(vcpu_e500->h2g_tlb1_rmap, 0,
+		       sizeof(unsigned int) * host_tlb_params[1].entries);
 }
 
 static void clear_tlb_privs(struct kvmppc_vcpu_e500 *vcpu_e500)
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 150+ messages in thread

* Re: [PATCH 35/38] booke: Added ONE_REG interface for IAC/DAC debug registers
  2012-08-14 23:04   ` Alexander Graf
@ 2012-08-14 23:44     ` Scott Wood
  -1 siblings, 0 replies; 150+ messages in thread
From: Scott Wood @ 2012-08-14 23:44 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list, Bharat Bhushan, Bharat Bhushan

On 08/14/2012 06:04 PM, Alexander Graf wrote:
> From: Bharat Bhushan <r65777@freescale.com>
> 
> IAC/DAC are defined as 32 bit while they are 64 bit wide. So ONE_REG
> interface is added to set/get them.
> 
> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  arch/powerpc/include/asm/kvm.h      |   12 ++++++++
>  arch/powerpc/include/asm/kvm_host.h |   24 ++++++++++++++++-
>  arch/powerpc/kvm/booke.c            |   48 +++++++++++++++++++++++++++++++++-
>  arch/powerpc/kvm/booke_emulate.c    |    8 +++---
>  4 files changed, 84 insertions(+), 8 deletions(-)

Shouldn't this be added to the table in
Documentation/virtual/kvm/api.txt section 4.68 (KVM_SET_ONE_REG)?

Oh, and section 4.69 refers to 4.68 as 4.64. :-P  Maybe the section
numbering should be removed altogether.

-Scott

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 35/38] booke: Added ONE_REG interface for IAC/DAC debug registers
  2012-08-14 23:44     ` Scott Wood
@ 2012-08-14 23:47       ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-14 23:47 UTC (permalink / raw)
  To: Scott Wood; +Cc: kvm-ppc, KVM list, Bharat Bhushan, Bharat Bhushan


On 15.08.2012, at 01:44, Scott Wood wrote:

> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>> From: Bharat Bhushan <r65777@freescale.com>
>> 
>> IAC/DAC are defined as 32 bit while they are 64 bit wide. So ONE_REG
>> interface is added to set/get them.
>> 
>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> ---
>> arch/powerpc/include/asm/kvm.h      |   12 ++++++++
>> arch/powerpc/include/asm/kvm_host.h |   24 ++++++++++++++++-
>> arch/powerpc/kvm/booke.c            |   48 +++++++++++++++++++++++++++++++++-
>> arch/powerpc/kvm/booke_emulate.c    |    8 +++---
>> 4 files changed, 84 insertions(+), 8 deletions(-)
> 
> Shouldn't this be added to the table in
> Documentation/virtual/kvm/api.txt section 4.68 (KVM_SET_ONE_REG)?

Very good point. Bharat, please send a patch to add the reg to the documentation.

> Oh, and section 4.69 refers to 4.68 as 4.64. :-P  Maybe the section
> numbering should be removed altogether.

... and while at it please also post a patch to fix the numbering mess :). I'd prefer if you just fix the numbers rather than remove them altogether though ;).


Alex

^ permalink raw reply	[flat|nested] 150+ messages in thread


* Re: [PATCH 35/38] booke: Added ONE_REG interface for IAC/DAC debug registers
  2012-08-14 23:47       ` Alexander Graf
@ 2012-08-15  0:06         ` Scott Wood
  -1 siblings, 0 replies; 150+ messages in thread
From: Scott Wood @ 2012-08-15  0:06 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list, Bharat Bhushan, Bharat Bhushan

On 08/14/2012 06:47 PM, Alexander Graf wrote:
> 
> On 15.08.2012, at 01:44, Scott Wood wrote:
> 
>> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>>> From: Bharat Bhushan <r65777@freescale.com>
>>>
>>> IAC/DAC are defined as 32 bit, while they are actually 64 bits wide. So a ONE_REG
>>> interface is added to set/get them.
>>>
>>> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>> ---
>>> arch/powerpc/include/asm/kvm.h      |   12 ++++++++
>>> arch/powerpc/include/asm/kvm_host.h |   24 ++++++++++++++++-
>>> arch/powerpc/kvm/booke.c            |   48 +++++++++++++++++++++++++++++++++-
>>> arch/powerpc/kvm/booke_emulate.c    |    8 +++---
>>> 4 files changed, 84 insertions(+), 8 deletions(-)
>>
>> Shouldn't this be added to the table in
>> Documentation/virtual/kvm/api.txt section 4.68 (KVM_SET_ONE_REG)?
> 
> Very good point. Bharat, please send a patch to add the reg to the documentation.
> 
>> Oh, and section 4.69 refers to 4.68 as 4.64. :-P  Maybe the section
>> numbering should be removed altogether.
> 
> ... and while at it please also post a patch to fix the numbering mess :). I'd prefer if you just fix the numbers rather than remove them altogether though ;).

The numbering seems to be a recurring problem, and just asking for
trouble in a distributed development environment.  I had to renumber the
MMU API documentation numerous times during its long development, which
was even more fun since the numbering in that area of the documentation
was already broken.

-Scott

^ permalink raw reply	[flat|nested] 150+ messages in thread


* Re: [PATCH 16/38] KVM: PPC: BookE: Add check_requests helper function
  2012-08-14 23:04   ` Alexander Graf
@ 2012-08-15  0:10     ` Scott Wood
  -1 siblings, 0 replies; 150+ messages in thread
From: Scott Wood @ 2012-08-15  0:10 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list

On 08/14/2012 06:04 PM, Alexander Graf wrote:
> We need a central place to check for pending requests. Add one that
> only does the timer check we already do in a different place.
> 
> Later, this central function can be extended by more checks.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  arch/powerpc/kvm/booke.c |   24 +++++++++++++++++-------
>  1 files changed, 17 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 1d4ce9a..bcf87fe 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -419,13 +419,6 @@ static void kvmppc_core_check_exceptions(struct kvm_vcpu *vcpu)
>  	unsigned long *pending = &vcpu->arch.pending_exceptions;
>  	unsigned int priority;
>  
> -	if (vcpu->requests) {
> -		if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu)) {
> -			smp_mb();
> -			update_timer_ints(vcpu);
> -		}
> -	}
> -
>  	priority = __ffs(*pending);
>  	while (priority < BOOKE_IRQPRIO_MAX) {
>  		if (kvmppc_booke_irqprio_deliver(vcpu, priority))
> @@ -461,6 +454,14 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>  	return r;
>  }
>  
> +static void kvmppc_check_requests(struct kvm_vcpu *vcpu)
> +{
> +	if (vcpu->requests) {
> +		if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu))
> +			update_timer_ints(vcpu);
> +	}
> +}
> +
>  /*
>   * Common checks before entering the guest world.  Call with interrupts
>   * disabled.
> @@ -485,6 +486,15 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
>  			break;
>  		}
>  
> +		smp_mb();
> +		if (vcpu->requests) {
> +			/* Make sure we process requests preemptable */
> +			local_irq_enable();
> +			kvmppc_check_requests(vcpu);
> +			local_irq_disable();
> +			continue;
> +		}

What previous memory access is the smp_mb() ordering against?

-Scott



^ permalink raw reply	[flat|nested] 150+ messages in thread


* Re: [PATCH 16/38] KVM: PPC: BookE: Add check_requests helper function
  2012-08-15  0:10     ` Scott Wood
@ 2012-08-15  0:13       ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-15  0:13 UTC (permalink / raw)
  To: Scott Wood; +Cc: kvm-ppc, KVM list


On 15.08.2012, at 02:10, Scott Wood wrote:

> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>> We need a central place to check for pending requests in. Add one that
>> only does the timer check we already do in a different place.
>> 
>> Later, this central function can be extended by more checks.
>> 
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> ---
>> arch/powerpc/kvm/booke.c |   24 +++++++++++++++++-------
>> 1 files changed, 17 insertions(+), 7 deletions(-)
>> 
>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
>> index 1d4ce9a..bcf87fe 100644
>> --- a/arch/powerpc/kvm/booke.c
>> +++ b/arch/powerpc/kvm/booke.c
>> @@ -419,13 +419,6 @@ static void kvmppc_core_check_exceptions(struct kvm_vcpu *vcpu)
>> 	unsigned long *pending = &vcpu->arch.pending_exceptions;
>> 	unsigned int priority;
>> 
>> -	if (vcpu->requests) {
>> -		if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu)) {
>> -			smp_mb();
>> -			update_timer_ints(vcpu);
>> -		}
>> -	}
>> -
>> 	priority = __ffs(*pending);
>> 	while (priority < BOOKE_IRQPRIO_MAX) {
>> 		if (kvmppc_booke_irqprio_deliver(vcpu, priority))
>> @@ -461,6 +454,14 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>> 	return r;
>> }
>> 
>> +static void kvmppc_check_requests(struct kvm_vcpu *vcpu)
>> +{
>> +	if (vcpu->requests) {
>> +		if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu))
>> +			update_timer_ints(vcpu);
>> +	}
>> +}
>> +
>> /*
>>  * Common checks before entering the guest world.  Call with interrupts
>>  * disabled.
>> @@ -485,6 +486,15 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
>> 			break;
>> 		}
>> 
>> +		smp_mb();
>> +		if (vcpu->requests) {
>> +			/* Make sure we process requests preemptable */
>> +			local_irq_enable();
>> +			kvmppc_check_requests(vcpu);
>> +			local_irq_disable();
>> +			continue;
>> +		}
> 
> What previous memory access is the smp_mb() ordering against?

Against itself really (see the continue?). I might have just gotten this wrong though :).


Alex


^ permalink raw reply	[flat|nested] 150+ messages in thread


* Re: [PATCH 17/38] KVM: PPC: BookE: Add support for vcpu->mode
  2012-08-14 23:04   ` Alexander Graf
@ 2012-08-15  0:17     ` Scott Wood
  -1 siblings, 0 replies; 150+ messages in thread
From: Scott Wood @ 2012-08-15  0:17 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list

On 08/14/2012 06:04 PM, Alexander Graf wrote:
> Generic KVM code might want to know whether we are inside guest context
> or outside. It also wants to be able to push us out of guest context.
> 
> Add support to the BookE code for the generic vcpu->mode field that describes
> the above states.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  arch/powerpc/kvm/booke.c |   11 +++++++++++
>  1 files changed, 11 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index bcf87fe..70a86c0 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -501,6 +501,15 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
>  			continue;
>  		}
>  
> +		if (vcpu->mode == EXITING_GUEST_MODE) {
> +			r = 1;
> +			break;
> +		}
> +
> +		/* Going into guest context! Yay! */
> +		vcpu->mode = IN_GUEST_MODE;
> +		smp_wmb();
> +
>  		break;
>  	}

Normally on entry to this function mode should be OUTSIDE_GUEST_MODE,
right?  How could it possibly be EXITING_GUEST_MODE then, since that
only replaces IN_GUEST_MODE?

This doesn't match what x86 does with mode on entry.  Mode is supposed
to be set to IN_GUEST_MODE before requests are checked.

I'm not sure what the point of EXITING_GUEST_MODE is at all, compared to
just waiting until after interrupts are disabled before setting
IN_GUEST_MODE (which we do on ppc, but not on x86 even though it seems
like a trivial change), plus the existing ordering between mode and
requests.

-Scott

^ permalink raw reply	[flat|nested] 150+ messages in thread


* Re: [PATCH 16/38] KVM: PPC: BookE: Add check_requests helper function
  2012-08-15  0:13       ` Alexander Graf
@ 2012-08-15  0:20         ` Scott Wood
  -1 siblings, 0 replies; 150+ messages in thread
From: Scott Wood @ 2012-08-15  0:20 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list

On 08/14/2012 07:13 PM, Alexander Graf wrote:
> 
> On 15.08.2012, at 02:10, Scott Wood wrote:
> 
>> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>>> We need a central place to check for pending requests in. Add one that
>>> only does the timer check we already do in a different place.
>>>
>>> Later, this central function can be extended by more checks.
>>>
>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>> ---
>>> arch/powerpc/kvm/booke.c |   24 +++++++++++++++++-------
>>> 1 files changed, 17 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
>>> index 1d4ce9a..bcf87fe 100644
>>> --- a/arch/powerpc/kvm/booke.c
>>> +++ b/arch/powerpc/kvm/booke.c
>>> @@ -419,13 +419,6 @@ static void kvmppc_core_check_exceptions(struct kvm_vcpu *vcpu)
>>> 	unsigned long *pending = &vcpu->arch.pending_exceptions;
>>> 	unsigned int priority;
>>>
>>> -	if (vcpu->requests) {
>>> -		if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu)) {
>>> -			smp_mb();
>>> -			update_timer_ints(vcpu);
>>> -		}
>>> -	}
>>> -
>>> 	priority = __ffs(*pending);
>>> 	while (priority < BOOKE_IRQPRIO_MAX) {
>>> 		if (kvmppc_booke_irqprio_deliver(vcpu, priority))
>>> @@ -461,6 +454,14 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>>> 	return r;
>>> }
>>>
>>> +static void kvmppc_check_requests(struct kvm_vcpu *vcpu)
>>> +{
>>> +	if (vcpu->requests) {
>>> +		if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu))
>>> +			update_timer_ints(vcpu);
>>> +	}
>>> +}
>>> +
>>> /*
>>>  * Common checks before entering the guest world.  Call with interrupts
>>>  * disabled.
>>> @@ -485,6 +486,15 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
>>> 			break;
>>> 		}
>>>
>>> +		smp_mb();
>>> +		if (vcpu->requests) {
>>> +			/* Make sure we process requests preemptable */
>>> +			local_irq_enable();
>>> +			kvmppc_check_requests(vcpu);
>>> +			local_irq_disable();
>>> +			continue;
>>> +		}
>>
>> What previous memory access is the smp_mb() ordering against?
> 
> Against itself really (see the continue?). I might have just gotten this wrong though :).

Ordering is already ensured by the hardware in that case since it's the
same address -- and the stuff inside the if statement is more than
enough to ensure the compiler doesn't cache an old value of vcpu->requests.

That said, I think the problem is not that the smp_mb() shouldn't be
there, but that there should be a vcpu->mode setting before it.

-Scott



^ permalink raw reply	[flat|nested] 150+ messages in thread


* Re: [PATCH 17/38] KVM: PPC: BookE: Add support for vcpu->mode
  2012-08-15  0:17     ` Scott Wood
@ 2012-08-15  0:26       ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-15  0:26 UTC (permalink / raw)
  To: Scott Wood; +Cc: kvm-ppc, KVM list


On 15.08.2012, at 02:17, Scott Wood wrote:

> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>> Generic KVM code might want to know whether we are inside guest context
>> or outside. It also wants to be able to push us out of guest context.
>> 
>> Add support to the BookE code for the generic vcpu->mode field that describes
>> the above states.
>> 
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> ---
>> arch/powerpc/kvm/booke.c |   11 +++++++++++
>> 1 files changed, 11 insertions(+), 0 deletions(-)
>> 
>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
>> index bcf87fe..70a86c0 100644
>> --- a/arch/powerpc/kvm/booke.c
>> +++ b/arch/powerpc/kvm/booke.c
>> @@ -501,6 +501,15 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
>> 			continue;
>> 		}
>> 
>> +		if (vcpu->mode == EXITING_GUEST_MODE) {
>> +			r = 1;
>> +			break;
>> +		}
>> +
>> +		/* Going into guest context! Yay! */
>> +		vcpu->mode = IN_GUEST_MODE;
>> +		smp_wmb();
>> +
>> 		break;
>> 	}
> 
> Normally on entry to this function mode should be OUTSIDE_GUEST_MODE,
> right?  How could it possibly be EXITING_GUEST_MODE then, since that
> only replaces IN_GUEST_MODE?
> 
> This doesn't match what x86 does with mode on entry.  Mode is supposed
> to be set to IN_GUEST_MODE before requests are checked.
> 
> I'm not sure what the point of EXITING_GUEST_MODE is at all, compared to
> just waiting until after interrupts are disabled before setting
> IN_GUEST_MODE (which we do on ppc, but not on x86 even though it seems
> like a trivial change), plus the existing ordering between mode and
> requests.

Well, the only real use case I could find for the mode was the remote vcpu kick. If we're not outside of guest mode, we get an IPI to notify us that requests are outstanding.

So I only get us into OUTSIDE_GUEST_MODE when we really exit __vcpu_run, and thus are in user space. That doesn't reflect what x86 does, right, but neither does our whole loop concept.

However, since we might do the vcpu_block in this loop, we will never really get into OUTSIDE_GUEST_MODE, right? Hrm. So what would you suggest? Do all of handle_exit in OUTSIDE_GUEST_MODE scope and then re-enter IN_GUEST_MODE in prepare_to_enter? We still wouldn't need an EXITING_GUEST_MODE though.


Alex

^ permalink raw reply	[flat|nested] 150+ messages in thread


* Re: [PATCH 17/38] KVM: PPC: BookE: Add support for vcpu->mode
  2012-08-15  0:26       ` Alexander Graf
@ 2012-08-15  1:17         ` Scott Wood
  -1 siblings, 0 replies; 150+ messages in thread
From: Scott Wood @ 2012-08-15  1:17 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list

On 08/14/2012 07:26 PM, Alexander Graf wrote:
> 
> On 15.08.2012, at 02:17, Scott Wood wrote:
> 
>> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>>> Generic KVM code might want to know whether we are inside guest context
>>> or outside. It also wants to be able to push us out of guest context.
>>>
>>> Add support to the BookE code for the generic vcpu->mode field that describes
>>> the above states.
>>>
>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>> ---
>>> arch/powerpc/kvm/booke.c |   11 +++++++++++
>>> 1 files changed, 11 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
>>> index bcf87fe..70a86c0 100644
>>> --- a/arch/powerpc/kvm/booke.c
>>> +++ b/arch/powerpc/kvm/booke.c
>>> @@ -501,6 +501,15 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
>>> 			continue;
>>> 		}
>>>
>>> +		if (vcpu->mode == EXITING_GUEST_MODE) {
>>> +			r = 1;
>>> +			break;
>>> +		}
>>> +
>>> +		/* Going into guest context! Yay! */
>>> +		vcpu->mode = IN_GUEST_MODE;
>>> +		smp_wmb();
>>> +
>>> 		break;
>>> 	}
>>
>> Normally on entry to this function mode should be OUTSIDE_GUEST_MODE,
>> right?  How could it possibly be EXITING_GUEST_MODE then, since that
>> only replaces IN_GUEST_MODE?
>>
>> This doesn't match what x86 does with mode on entry.  Mode is supposed
>> to be set to IN_GUEST_MODE before requests are checked.
>>
>> I'm not sure what the point of EXITING_GUEST_MODE is at all, compared to
>> just waiting until after interrupts are disabled before setting
>> IN_GUEST_MODE (which we do on ppc, but not on x86 even though it seems
>> like a trivial change), plus the existing ordering between mode and
>> requests.
> 
> Well, the only real use case I could find for the mode was the remote
> vcpu kick. If we're not outside of guest mode, we get an IPI to
> notify us that requests are outstanding.

I'm curious why this is done so differently for broadcast requests than
for single-cpu requests.

> So I only get us into OUTSIDE_GUEST_MODE when we really exit
> __vcpu_run, thus are in user space. That doesn't reflect what x86
> does, right, but neither does our whole loop concept.

OK.  We still need to do ordering like x86 does, because otherwise
there's a race where we could check requests before the request bit is
set, and still have make_all_cpus_request see OUTSIDE_GUEST_MODE and not
send an IPI.

> However, since we might do the vcpu_block in this loop, we will never
> really get into OUTSIDE_GUEST_MODE, right?

Not when the guest is just idling, only when it's exited to QEMU.

> Hrm. So what would you
> suggest? Do all the handle_exit in OUTSIDE_GUEST_MODE scope and then
> reenter IN_GUEST_MODE in prepare_to_enter? 

I suggest leaving this optimization until a need is felt.

> We still wouldn't need an EXITING_GUEST_MODE though.

Right.

-Scott

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 18/38] KVM: PPC: E500: Implement MMU notifiers
  2012-08-14 23:04   ` Alexander Graf
@ 2012-08-15  1:20     ` Scott Wood
  -1 siblings, 0 replies; 150+ messages in thread
From: Scott Wood @ 2012-08-15  1:20 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list

On 08/14/2012 06:04 PM, Alexander Graf wrote:
> The e500 target has lived without mmu notifiers ever since it got
> introduced, but fails for the user space check on them with hugetlbfs.
> 
> So in order to get that one working, implement mmu notifiers in a
> reasonably dumb fashion and be happy. On embedded hardware, we almost
> never end up with mmu notifier calls, since most people don't overcommit.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  arch/powerpc/include/asm/kvm_host.h |    3 +-
>  arch/powerpc/include/asm/kvm_ppc.h  |    1 +
>  arch/powerpc/kvm/Kconfig            |    2 +
>  arch/powerpc/kvm/booke.c            |    6 +++
>  arch/powerpc/kvm/e500_tlb.c         |   60 +++++++++++++++++++++++++++++++---
>  5 files changed, 65 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
> index a29e091..ff8d51c 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -45,7 +45,8 @@
>  #define KVM_COALESCED_MMIO_PAGE_OFFSET 1
>  #endif
>  
> -#ifdef CONFIG_KVM_BOOK3S_64_HV
> +#if defined(CONFIG_KVM_BOOK3S_64_HV) || defined(CONFIG_KVM_E500V2) || \
> +    defined(CONFIG_KVM_E500MC)
>  #include <linux/mmu_notifier.h>
>  
>  #define KVM_ARCH_WANT_MMU_NOTIFIER
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
> index 0124937..c38e824 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -104,6 +104,7 @@ extern void kvmppc_core_queue_external(struct kvm_vcpu *vcpu,
>                                         struct kvm_interrupt *irq);
>  extern void kvmppc_core_dequeue_external(struct kvm_vcpu *vcpu,
>                                           struct kvm_interrupt *irq);
> +extern void kvmppc_core_flush_tlb(struct kvm_vcpu *vcpu);
>  
>  extern int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
>                                    unsigned int op, int *advance);
> diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
> index f4dacb9..40cad8c 100644
> --- a/arch/powerpc/kvm/Kconfig
> +++ b/arch/powerpc/kvm/Kconfig
> @@ -123,6 +123,7 @@ config KVM_E500V2
>  	depends on EXPERIMENTAL && E500 && !PPC_E500MC
>  	select KVM
>  	select KVM_MMIO
> +	select MMU_NOTIFIER
>  	---help---
>  	  Support running unmodified E500 guest kernels in virtual machines on
>  	  E500v2 host processors.
> @@ -138,6 +139,7 @@ config KVM_E500MC
>  	select KVM
>  	select KVM_MMIO
>  	select KVM_BOOKE_HV
> +	select MMU_NOTIFIER
>  	---help---
>  	  Support running unmodified E500MC/E5500 (32-bit) guest kernels in
>  	  virtual machines on E500MC/E5500 host processors.
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 70a86c0..52f6cbb 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -459,6 +459,10 @@ static void kvmppc_check_requests(struct kvm_vcpu *vcpu)
>  	if (vcpu->requests) {
>  		if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu))
>  			update_timer_ints(vcpu);
> +#if defined(CONFIG_KVM_E500V2) || defined(CONFIG_KVM_E500MC)
> +		if (kvm_check_request(KVM_REQ_TLB_FLUSH, vcpu))
> +			kvmppc_core_flush_tlb(vcpu);
> +#endif
>  	}
>  }

Can we define a new symbol that means "e500_tlb.c is used"?  Or just say
that all new TLB implementations shall support MMU notifiers, and change
this to ifndef 4xx.  Or make this a tlb flush callback with a no-op stub
in 4xx.

Of course this isn't critical and shouldn't hold up the pull request
since I got around to the review late -- just something to think about
for further refactoring.
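
The first suggestion could be spelled as a small Kconfig helper symbol — KVM_E500_TLB below is a name invented here for illustration, not something from the patch:

```
# Hypothetical helper symbol meaning "e500_tlb.c is built"; both
# e500 frontends would select it, and booke.c would then test
# CONFIG_KVM_E500_TLB instead of enumerating the frontends.
config KVM_E500_TLB
	bool

config KVM_E500V2
	# ...existing options as in the patch...
	select KVM_E500_TLB

config KVM_E500MC
	# ...existing options as in the patch...
	select KVM_E500_TLB
```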

> @@ -579,6 +583,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>  #endif
>  
>  	kvm_guest_exit();
> +	vcpu->mode = OUTSIDE_GUEST_MODE;
> +	smp_wmb();
>  
>  out:
>  	vcpu->mode = OUTSIDE_GUEST_MODE;

This looks wrong.

> diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
> index 93f3b92..06273a7 100644
> --- a/arch/powerpc/kvm/e500_tlb.c
> +++ b/arch/powerpc/kvm/e500_tlb.c
> @@ -303,18 +303,15 @@ static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref,
>  	ref->pfn = pfn;
>  	ref->flags = E500_TLB_VALID;
>  
> -	if (tlbe_is_writable(gtlbe))
> +	if (tlbe_is_writable(gtlbe)) {
>  		ref->flags |= E500_TLB_DIRTY;
> +		kvm_set_pfn_dirty(pfn);
> +	}
>  }

Is there any reason to keep E500_TLB_DIRTY around?  You seem to be
removing the only code that checks it.

> @@ -357,6 +354,13 @@ static void clear_tlb_refs(struct kvmppc_vcpu_e500 *vcpu_e500)
>  	clear_tlb_privs(vcpu_e500);
>  }
>  
> +void kvmppc_core_flush_tlb(struct kvm_vcpu *vcpu)
> +{
> +	struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);
> +	clear_tlb_refs(vcpu_e500);
> +	clear_tlb1_bitmap(vcpu_e500);
> +}

Should we move clear_tlb1_bitmap() into clear_tlb_refs()?  That is, is
it a bug that we don't do it in ioctl_dirty_tlb()?

> +/************* MMU Notifiers *************/
> +

Is this really necessary?

> +int kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
> +{
> +	/*
> +	 * Flush all shadow tlb entries everywhere. This is slow, but
> +	 * we are 100% sure that we catch the to be unmapped page
> +	 */
> +	kvm_flush_remote_tlbs(kvm);
> +
> +	return 0;
> +}
> +
> +int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end)
> +{
> +	/* kvm_unmap_hva flushes everything anyways */
> +	kvm_unmap_hva(kvm, start);
> +
> +	return 0;
> +}

I'd feel better about this calling kvm_flush_remote_tlbs() directly
rather than hoping that someone who enhances kvm_unmap_hva() updates
this function as well.

-Scott

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 19/38] KVM: PPC: Add cache flush on page map
  2012-08-14 23:04   ` Alexander Graf
@ 2012-08-15  1:23     ` Scott Wood
  -1 siblings, 0 replies; 150+ messages in thread
From: Scott Wood @ 2012-08-15  1:23 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list

On 08/14/2012 06:04 PM, Alexander Graf wrote:
> When we map a page that wasn't icache cleared before, do so when first
> mapping it in KVM using the same information bits as the Linux mapping
> logic. That way we are 100% sure that any page we map does not have stale
> entries in the icache.

We're not really 100% sure of that -- this only handles the case where
the kernel does the dirtying, not when it's done by QEMU or the guest.

-Scott



^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 17/38] KVM: PPC: BookE: Add support for vcpu->mode
  2012-08-14 23:04   ` Alexander Graf
@ 2012-08-15  1:25     ` Scott Wood
  -1 siblings, 0 replies; 150+ messages in thread
From: Scott Wood @ 2012-08-15  1:25 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list

On 08/14/2012 06:04 PM, Alexander Graf wrote:
> +		/* Going into guest context! Yay! */
> +		vcpu->mode = IN_GUEST_MODE;
> +		smp_wmb();
> +
>  		break;
>  	}

What is the wmb protecting against here?

-Scott

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 17/38] KVM: PPC: BookE: Add support for vcpu->mode
  2012-08-15  1:17         ` Scott Wood
@ 2012-08-15  9:29           ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-15  9:29 UTC (permalink / raw)
  To: Scott Wood; +Cc: kvm-ppc, KVM list


On 15.08.2012, at 03:17, Scott Wood wrote:

> On 08/14/2012 07:26 PM, Alexander Graf wrote:
>> 
>> On 15.08.2012, at 02:17, Scott Wood wrote:
>> 
>>> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>>>> Generic KVM code might want to know whether we are inside guest context
>>>> or outside. It also wants to be able to push us out of guest context.
>>>> 
>>>> Add support to the BookE code for the generic vcpu->mode field that describes
>>>> the above states.
>>>> 
>>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>> ---
>>>> arch/powerpc/kvm/booke.c |   11 +++++++++++
>>>> 1 files changed, 11 insertions(+), 0 deletions(-)
>>>> 
>>>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
>>>> index bcf87fe..70a86c0 100644
>>>> --- a/arch/powerpc/kvm/booke.c
>>>> +++ b/arch/powerpc/kvm/booke.c
>>>> @@ -501,6 +501,15 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
>>>> 			continue;
>>>> 		}
>>>> 
>>>> +		if (vcpu->mode == EXITING_GUEST_MODE) {
>>>> +			r = 1;
>>>> +			break;
>>>> +		}
>>>> +
>>>> +		/* Going into guest context! Yay! */
>>>> +		vcpu->mode = IN_GUEST_MODE;
>>>> +		smp_wmb();
>>>> +
>>>> 		break;
>>>> 	}
>>> 
>>> Normally on entry to this function mode should be OUTSIDE_GUEST_MODE,
>>> right?  How could it possibly be EXITING_GUEST_MODE then, since that
>>> only replaces IN_GUEST_MODE?
>>> 
>>> This doesn't match what x86 does with mode on entry.  Mode is supposed
>>> to be set to IN_GUEST_MODE before requests are checked.
>>> 
>>> I'm not sure what the point of EXITING_GUEST_MODE is at all, compared to
>>> just waiting until after interrupts are disabled before setting
>>> IN_GUEST_MODE (which we do on ppc, but not on x86 even though it seems
>>> like a trivial change), plus the existing ordering between mode and
>>> requests.
>> 
>> Well, the only real use case I could find for the mode was the remote
>> vcpu kick. If we're not outside of guest mode, we get an IPI to
>> notify us that requests are outstanding.
> 
> I'm curious why this is done so differently for broadcast requests than
> for single-cpu requests.
> 
>> So I only get us into OUTSIDE_GUEST_MODE when we really exit
>> __vcpu_run, thus are in user space. That doesn't reflect what x86
>> does, right, but so doesn't our whole loop concept.
> 
> OK.  We still need to do ordering like x86 does, because otherwise
> there's a race where we could check requests before the request bit is
> set, and still have make_all_cpus_request see OUTSIDE_GUEST_MODE and not
> send an IPI.

Could you please send a patch showing what workflow you envision? The code as is should work, just be inefficient at times, right?


Alex

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 18/38] KVM: PPC: E500: Implement MMU notifiers
  2012-08-15  1:20     ` Scott Wood
@ 2012-08-15  9:38       ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-15  9:38 UTC (permalink / raw)
  To: Scott Wood; +Cc: kvm-ppc, KVM list


On 15.08.2012, at 03:20, Scott Wood wrote:

> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>> The e500 target has lived without mmu notifiers ever since it got
>> introduced, but fails for the user space check on them with hugetlbfs.
>> 
>> So in order to get that one working, implement mmu notifiers in a
>> reasonably dumb fashion and be happy. On embedded hardware, we almost
>> never end up with mmu notifier calls, since most people don't overcommit.
>> 
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> ---
>> arch/powerpc/include/asm/kvm_host.h |    3 +-
>> arch/powerpc/include/asm/kvm_ppc.h  |    1 +
>> arch/powerpc/kvm/Kconfig            |    2 +
>> arch/powerpc/kvm/booke.c            |    6 +++
>> arch/powerpc/kvm/e500_tlb.c         |   60 +++++++++++++++++++++++++++++++---
>> 5 files changed, 65 insertions(+), 7 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
>> index a29e091..ff8d51c 100644
>> --- a/arch/powerpc/include/asm/kvm_host.h
>> +++ b/arch/powerpc/include/asm/kvm_host.h
>> @@ -45,7 +45,8 @@
>> #define KVM_COALESCED_MMIO_PAGE_OFFSET 1
>> #endif
>> 
>> -#ifdef CONFIG_KVM_BOOK3S_64_HV
>> +#if defined(CONFIG_KVM_BOOK3S_64_HV) || defined(CONFIG_KVM_E500V2) || \
>> +    defined(CONFIG_KVM_E500MC)
>> #include <linux/mmu_notifier.h>
>> 
>> #define KVM_ARCH_WANT_MMU_NOTIFIER
>> diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
>> index 0124937..c38e824 100644
>> --- a/arch/powerpc/include/asm/kvm_ppc.h
>> +++ b/arch/powerpc/include/asm/kvm_ppc.h
>> @@ -104,6 +104,7 @@ extern void kvmppc_core_queue_external(struct kvm_vcpu *vcpu,
>>                                        struct kvm_interrupt *irq);
>> extern void kvmppc_core_dequeue_external(struct kvm_vcpu *vcpu,
>>                                          struct kvm_interrupt *irq);
>> +extern void kvmppc_core_flush_tlb(struct kvm_vcpu *vcpu);
>> 
>> extern int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>                                   unsigned int op, int *advance);
>> diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
>> index f4dacb9..40cad8c 100644
>> --- a/arch/powerpc/kvm/Kconfig
>> +++ b/arch/powerpc/kvm/Kconfig
>> @@ -123,6 +123,7 @@ config KVM_E500V2
>> 	depends on EXPERIMENTAL && E500 && !PPC_E500MC
>> 	select KVM
>> 	select KVM_MMIO
>> +	select MMU_NOTIFIER
>> 	---help---
>> 	  Support running unmodified E500 guest kernels in virtual machines on
>> 	  E500v2 host processors.
>> @@ -138,6 +139,7 @@ config KVM_E500MC
>> 	select KVM
>> 	select KVM_MMIO
>> 	select KVM_BOOKE_HV
>> +	select MMU_NOTIFIER
>> 	---help---
>> 	  Support running unmodified E500MC/E5500 (32-bit) guest kernels in
>> 	  virtual machines on E500MC/E5500 host processors.
>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
>> index 70a86c0..52f6cbb 100644
>> --- a/arch/powerpc/kvm/booke.c
>> +++ b/arch/powerpc/kvm/booke.c
>> @@ -459,6 +459,10 @@ static void kvmppc_check_requests(struct kvm_vcpu *vcpu)
>> 	if (vcpu->requests) {
>> 		if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu))
>> 			update_timer_ints(vcpu);
>> +#if defined(CONFIG_KVM_E500V2) || defined(CONFIG_KVM_E500MC)
>> +		if (kvm_check_request(KVM_REQ_TLB_FLUSH, vcpu))
>> +			kvmppc_core_flush_tlb(vcpu);
>> +#endif
>> 	}
>> }
> 
> Can we define a new symbol that means "e500_tlb.c is used"?  Or just say
> that all new TLB implementations shall support MMU notifiers, and change
> this to ifndef 4xx.  Or make this a tlb flush callback with a no-op stub
> in 4xx.

I'd go for "all TLB implementations shall support MMU notifiers". Period. Including 440. The current state is only interim until we either

  * fix 440 or
  * remove 440

I do have a 440 machine here that I'm trying to get the current code to run on, but I don't have forever to spend on this and something seems odd. For some reason we seem to be jumping into the data section. Maybe you can spot something from the logs below?

If it wasn't for the overall brokenness of the target, I would've quickly implemented MMU Notifiers for 440 already.

root@ppc440:~/qemu# ./ppc-softmmu/qemu-system-ppc -M bamboo -kernel /boot/uImage.autotest -nographic -enable-kvm -s
Using PowerPC 44x Platform machine description
Linux version 3.6.0-rc1-00221-g7542507 (agraf@wolfberry-1) (gcc version 4.3.4 [gcc-4_3-branch revision 152973] (SUSE Linux) ) #20 Wed Aug 15 01:46:42 CEST 2012
bootconsole [udbg0] enabled
setup_arch: bootmem
arch: exit
Zone ranges:
  DMA      [mem 0x00000000-0x07ffffff]
  Normal   empty
Movable zone start for each node
Early memory node ranges
  node   0: [mem 0x00000000-0x07ffffff]
MMU: Allocated 1088 bytes of context maps for 255 contexts
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 32512
Kernel command line: 
PID hash table entries: 512 (order: -1, 2048 bytes)
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
Memory: 125352k/131072k available (4052k kernel code, 5720k reserved, 172k data, 244k bss, 220k init)
Kernel virtual memory layout:
  * 0xfffdf000..0xfffff000  : fixmap
  * 0xfde00000..0xfe000000  : consistent mem
  * 0xfddfe000..0xfde00000  : early ioremap
  * 0xd1000000..0xfddfe000  : vmalloc & ioremap
SLUB: Genslabs=13, HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
NR_IRQS:512 nr_irqs:512 16
KVM: unknown exit, hardware reason ffffffff7c030306
NIP c03c57ac   LR c03c56ec CTR c03198b4 XER 00000000
MSR 00021002 HID0 00000000  HF 00000000 idx 1
TB 00000195 2403482311 DECR 00000000
GPR00 0000000000000000 00000000c0415f60 00000000c03f82e0 00000000000000c2
GPR04 0000000000000000 0000000000000000 00000000000000c0 00000000c054f0c0
GPR08 0000000000000000 0000000020000000 0000000000000001 00000000c03fe408
GPR12 0000000022000022 0000000000000000 0000000000000000 0000000000000000
GPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20 0000000000000000 0000000000000000 00000000c0000010 0000000000000000
GPR24 0000000000000000 0000000000000000 00000000c03e3720 00000000c0420538
GPR28 00000000c0374280 00000000c0415f68 00000000c0420000 00000000c7800300
CR 42000028  [ G  E  -  -  -  -  E  L  ]             RES ffffffff
FPR00 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR08 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR12 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPSCR 00000000
 SRR0 c000e754  SRR1 00021002    PVR 422218d3 VRSAVE 00000000
SPRG0 c0420000 SPRG1 00000000  SPRG2 00000000  SPRG3 c03f84c0
SPRG4 44000022 SPRG5 00000000  SPRG6 00000000  SPRG7 82000028
CSRR0 00000000 CSRR1 00000000 MCSRR0 00000000 MCSRR1 00000000
  TCR 00000000   TSR 00000000    ESR 00800000   DEAR fddfe303
  PIR 00000000 DECAR 00000000   IVPR c0000000   EPCR 00000000
 MCSR 00000000 SPRG8 00000000    EPR 00000000
 MCAR 00000000  PID1 00000000   PID2 00000000    SVR 00000000

[...]
 qemu-system-ppc-1895  [000] ....   423.529387: kvm_exit: exit=ITLB_MISS | pc=0xc00b1480 | msr=0x21002 | dar=0xc044ba70 | last_inst=0x7f400106
 qemu-system-ppc-1895  [000] ....   423.529394: kvm_stlb_inval: stlb_index 50
 qemu-system-ppc-1895  [000] ....   423.529395: kvm_stlb_write: victim 50 tid 1 w0 3221951248 w1 307912704 w2 575
 qemu-system-ppc-1895  [000] ....   423.529399: kvm_exit: exit=DTLB_MISS | pc=0xc00e47e0 | msr=0x21002 | dar=0xc7802284 | last_inst=0x801a0004
 qemu-system-ppc-1895  [000] ....   423.529405: kvm_stlb_inval: stlb_index 51
 qemu-system-ppc-1895  [000] ....   423.529406: kvm_stlb_write: victim 51 tid 1 w0 3347063568 w1 296411136 w2 575
 qemu-system-ppc-1895  [000] ....   423.529410: kvm_exit: exit=DTLB_MISS | pc=0xc00e48bc | msr=0x21002 | dar=0xc0420200 | last_inst=0x812b0200
 qemu-system-ppc-1895  [000] ....   423.529417: kvm_stlb_inval: stlb_index 52
 qemu-system-ppc-1895  [000] ....   423.529418: kvm_stlb_write: victim 52 tid 1 w0 3225551632 w1 301301760 w2 575
 qemu-system-ppc-1895  [000] ....   423.529421: kvm_exit: exit=DTLB_MISS | pc=0xc00e4910 | msr=0x21002 | dar=0xc7805000 | last_inst=0x7fbb012e
 qemu-system-ppc-1895  [000] ....   423.529450: kvm_stlb_inval: stlb_index 53
 qemu-system-ppc-1895  [000] ....   423.529451: kvm_stlb_write: victim 53 tid 1 w0 3347075856 w1 296632320 w2 575
 qemu-system-ppc-1895  [000] ....   423.529457: kvm_exit: exit=DTLB_MISS | pc=0xc00e5204 | msr=0x21002 | dar=0xc0561058 | last_inst=0x809d0008
 qemu-system-ppc-1895  [000] ....   423.529464: kvm_stlb_inval: stlb_index 54
 qemu-system-ppc-1895  [000] ....   423.529465: kvm_stlb_write: victim 54 tid 1 w0 3226866448 w1 295124992 w2 575
 qemu-system-ppc-1895  [000] ....   423.529468: kvm_exit: exit=PROGRAM | pc=0xc00e5138 | msr=0x21002 | dar=0xc0561058 | last_inst=0x7f400106
 qemu-system-ppc-1895  [000] ....   423.529471: kvm_ppc_instr: inst 2134900998 pc 0xc00e5138 emulate 0

 qemu-system-ppc-1895  [000] ....   423.529475: kvm_exit: exit=ITLB_MISS | pc=0xc03199d8 | msr=0x21002 | dar=0xc0561058 | last_inst=0x7f400106
 qemu-system-ppc-1895  [000] ....   423.529482: kvm_stlb_inval: stlb_index 55
 qemu-system-ppc-1895  [000] ....   423.529483: kvm_stlb_write: victim 55 tid 1 w0 3224474384 w1 301121536 w2 575
 qemu-system-ppc-1895  [000] ....   423.529488: kvm_exit: exit=PROGRAM | pc=0xc03c57ac | msr=0x21002 | dar=0xc0561058 | last_inst=0x7c030306
 qemu-system-ppc-1895  [000] .N..   423.534502: kvm_booke_queue_irqprio: vcpu=0 prio=3 pending=0
 qemu-system-ppc-1895  [000] .N..   423.534505: kvm_ppc_instr: inst 2080572166 pc 0xc03c57ac emulate 3

 qemu-system-ppc-1895  [000] .N..   423.539462: kvm_booke_queue_irqprio: vcpu=0 prio=3 pending=8
 qemu-system-ppc-1895  [000] .N..   423.539468: kvm_userspace_exit: reason KVM_EXIT_UNKNOWN (0)

(gdb) x /i 0xc03c57ac
   0xc03c57ac <__kvmppc_vcpu_run+3789496>:	mtdcrx  r3,r0

Unfortunately, TLB synchronization between KVM and QEMU isn't implemented for 440, so bt in gdb doesn't work. Sigh.

> Of course this isn't critical and shouldn't hold up the pull request
> since I got around to the review late -- just something to think about
> for further refactoring.
> 
>> @@ -579,6 +583,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>> #endif
>> 
>> 	kvm_guest_exit();
>> +	vcpu->mode = OUTSIDE_GUEST_MODE;
>> +	smp_wmb();
>> 
>> out:
>> 	vcpu->mode = OUTSIDE_GUEST_MODE;
> 
> This looks wrong.

Yeah. A later patch fixes it up again :(. Sorry.

> 
>> diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c
>> index 93f3b92..06273a7 100644
>> --- a/arch/powerpc/kvm/e500_tlb.c
>> +++ b/arch/powerpc/kvm/e500_tlb.c
>> @@ -303,18 +303,15 @@ static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref,
>> 	ref->pfn = pfn;
>> 	ref->flags = E500_TLB_VALID;
>> 
>> -	if (tlbe_is_writable(gtlbe))
>> +	if (tlbe_is_writable(gtlbe)) {
>> 		ref->flags |= E500_TLB_DIRTY;
>> +		kvm_set_pfn_dirty(pfn);
>> +	}
>> }
> 
> Is there any reason to keep E500_TLB_DIRTY around?  You seem to be
> removing the only code that checks it.

Nope, none. We can safely remove it.

> 
>> @@ -357,6 +354,13 @@ static void clear_tlb_refs(struct kvmppc_vcpu_e500 *vcpu_e500)
>> 	clear_tlb_privs(vcpu_e500);
>> }
>> 
>> +void kvmppc_core_flush_tlb(struct kvm_vcpu *vcpu)
>> +{
>> +	struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);
>> +	clear_tlb_refs(vcpu_e500);
>> +	clear_tlb1_bitmap(vcpu_e500);
>> +}
> 
> Should we move clear_tlb1_bitmap() into clear_tlb_refs()?  That is, is
> it a bug that we don't do it in ioctl_dirty_tlb()?

I think so, yes :).

> 
>> +/************* MMU Notifiers *************/
>> +
> 
> Is this really necessary?

The comment? No, comments are never necessary. But the code is slowly becoming bigger and bigger and having individual sections in it improves readability IMHO.

> 
>> +int kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
>> +{
>> +	/*
>> +	 * Flush all shadow tlb entries everywhere. This is slow, but
>> +	 * we are 100% sure that we catch the to be unmapped page
>> +	 */
>> +	kvm_flush_remote_tlbs(kvm);
>> +
>> +	return 0;
>> +}
>> +
>> +int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end)
>> +{
>> +	/* kvm_unmap_hva flushes everything anyways */
>> +	kvm_unmap_hva(kvm, start);
>> +
>> +	return 0;
>> +}
> 
> I'd feel better about this calling kvm_flush_remote_tlbs() directly
> rather than hoping that someone who enhances kvm_unmap_hva() updates
> this function as well.

The reasons I went for this approach were:

  * originally I called a generic helper that would get a mask and an HVA, but it turned out to be useless because it's easier to flush everything for now
  * kvm_unmap_hva gets a trace point. This way we get traces for all unmap operations.

I would say chances are very low that someone changes kvm_unmap_hva without also touching kvm_unmap_hva_range.


Alex

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 19/38] KVM: PPC: Add cache flush on page map
  2012-08-15  1:23     ` Scott Wood
@ 2012-08-15  9:52       ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-15  9:52 UTC (permalink / raw)
  To: Scott Wood; +Cc: kvm-ppc, KVM list


On 15.08.2012, at 03:23, Scott Wood wrote:

> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>> When we map a page that wasn't icache cleared before, do so when first
>> mapping it in KVM using the same information bits as the Linux mapping
>> logic. That way we are 100% sure that any page we map does not have stale
>> entries in the icache.
> 
> We're not really 100% sure of that -- this only handles the case where
> the kernel does the dirtying, not when it's done by QEMU or the guest.

When the guest does it, the guest is responsible for clearing the icache. Same for QEMU. It needs to clear it when doing DMA.

However, what is still broken would be a direct /dev/mem map. There QEMU should probably clear the icache before starting the guest, in case another guest was running on that same memory before. Fortunately, we don't have that mode available in upstream QEMU :).


Alex

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 38/38] ppc: e500_tlb memset clears nothing
  2012-08-14 23:04   ` Alexander Graf
@ 2012-08-15 10:07     ` Avi Kivity
  -1 siblings, 0 replies; 150+ messages in thread
From: Avi Kivity @ 2012-08-15 10:07 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list, Alan Cox, Andrew Morton

On 08/15/2012 02:04 AM, Alexander Graf wrote:
> From: Alan Cox <alan@linux.intel.com>
> 
> Put the parameters the right way around
> 
> Addresses https://bugzilla.kernel.org/show_bug.cgi?id=44031
> 


Should this go to 3.6 (and backports etc.)?


-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 38/38] ppc: e500_tlb memset clears nothing
  2012-08-15 10:07     ` Avi Kivity
@ 2012-08-15 10:09       ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-15 10:09 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-ppc, KVM list, Alan Cox, Andrew Morton


On 15.08.2012, at 12:07, Avi Kivity wrote:

> On 08/15/2012 02:04 AM, Alexander Graf wrote:
>> From: Alan Cox <alan@linux.intel.com>
>> 
>> Put the parameters the right way around
>> 
>> Addresses https://bugzilla.kernel.org/show_bug.cgi?id=44031
>> 
> 
> 
> Should this go to 3.6 (and backports etc.)?

This one is even less crucial than the icache fix. Should I assemble a separate patch queue with patches that should go into 3.6?


Alex

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 38/38] ppc: e500_tlb memset clears nothing
  2012-08-15 10:09       ` Alexander Graf
@ 2012-08-15 10:10         ` Avi Kivity
  -1 siblings, 0 replies; 150+ messages in thread
From: Avi Kivity @ 2012-08-15 10:10 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list, Alan Cox, Andrew Morton

On 08/15/2012 01:09 PM, Alexander Graf wrote:
> 
> On 15.08.2012, at 12:07, Avi Kivity wrote:
> 
>> On 08/15/2012 02:04 AM, Alexander Graf wrote:
>>> From: Alan Cox <alan@linux.intel.com>
>>> 
>>> Put the parameters the right way around
>>> 
>>> Addresses https://bugzilla.kernel.org/show_bug.cgi?id=44031
>>> 
>> 
>> 
>> Should this go to 3.6 (and backports etc.)?
> 
> This one is even less crucial than the icache fix. Should I assemble a separate patch queue with patches that should go into 3.6?

Yes please.


-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 19/38] KVM: PPC: Add cache flush on page map
  2012-08-15  9:52       ` Alexander Graf
@ 2012-08-15 17:26         ` Scott Wood
  -1 siblings, 0 replies; 150+ messages in thread
From: Scott Wood @ 2012-08-15 17:26 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list

On 08/15/2012 04:52 AM, Alexander Graf wrote:
> 
> On 15.08.2012, at 03:23, Scott Wood wrote:
> 
>> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>>> When we map a page that wasn't icache cleared before, do so when first
>>> mapping it in KVM using the same information bits as the Linux mapping
>>> logic. That way we are 100% sure that any page we map does not have stale
>>> entries in the icache.
>>
>> We're not really 100% sure of that -- this only handles the case where
>> the kernel does the dirtying, not when it's done by QEMU or the guest.
> 
> When the guest does it, the guest is responsible for clearing the
> icache. Same for QEMU. It needs to clear it when doing DMA.

Sure.  I was just worried that that commit message could be taken the
wrong way, as in "we no longer need the QEMU icache flushing patch".

> However, what is still broken would be a direct /dev/mem map. There
> QEMU should probably clear the icache before starting the guest, in
> case another guest was running on that same memory before.
> Fortunately, we don't have that mode available in upstream QEMU :).

How is QEMU loading images different if it's /dev/mem versus ordinary
anonymous memory?  You probably won't have stale icache data in the
latter case (which makes it less likely to be a problem in practice), but
in theory you could have data that still hasn't left the dcache.

-Scott

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 19/38] KVM: PPC: Add cache flush on page map
  2012-08-15 17:26         ` Scott Wood
@ 2012-08-15 17:27           ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-15 17:27 UTC (permalink / raw)
  To: Scott Wood; +Cc: kvm-ppc, KVM list


On 15.08.2012, at 19:26, Scott Wood wrote:

> On 08/15/2012 04:52 AM, Alexander Graf wrote:
>> 
>> On 15.08.2012, at 03:23, Scott Wood wrote:
>> 
>>> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>>>> When we map a page that wasn't icache cleared before, do so when first
>>>> mapping it in KVM using the same information bits as the Linux mapping
>>>> logic. That way we are 100% sure that any page we map does not have stale
>>>> entries in the icache.
>>> 
>>> We're not really 100% sure of that -- this only handles the case where
>>> the kernel does the dirtying, not when it's done by QEMU or the guest.
>> 
>> When the guest does it, the guest is responsible for clearing the
>> icache. Same for QEMU. It needs to clear it when doing DMA.
> 
> Sure.  I was just worried that that commit message could be taken the
> wrong way, as in "we no longer need the QEMU icache flushing patch".
> 
>> However, what is still broken would be a direct /dev/mem map. There
>> QEMU should probably clear the icache before starting the guest, in
>> case another guest was running on that same memory before.
>> Fortunately, we don't have that mode available in upstream QEMU :).
> 
> How is QEMU loading images different if it's /dev/mem versus ordinary
> anonymous memory?  You probably won't have stale icache data in the
> latter case (which makes it less likely to be a problem in pratice), but
> in theory you could have data that still hasn't left the dcache.

It's the same. I just talked to Ben about this today in a different context and we should be safe :).


Alex

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 19/38] KVM: PPC: Add cache flush on page map
  2012-08-15 17:27           ` Alexander Graf
@ 2012-08-15 17:47             ` Scott Wood
  -1 siblings, 0 replies; 150+ messages in thread
From: Scott Wood @ 2012-08-15 17:47 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list

On 08/15/2012 12:27 PM, Alexander Graf wrote:
> 
> On 15.08.2012, at 19:26, Scott Wood wrote:
> 
>> On 08/15/2012 04:52 AM, Alexander Graf wrote:
>>>
>>> On 15.08.2012, at 03:23, Scott Wood wrote:
>>>
>>>> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>>>>> When we map a page that wasn't icache cleared before, do so when first
>>>>> mapping it in KVM using the same information bits as the Linux mapping
>>>>> logic. That way we are 100% sure that any page we map does not have stale
>>>>> entries in the icache.
>>>>
>>>> We're not really 100% sure of that -- this only handles the case where
>>>> the kernel does the dirtying, not when it's done by QEMU or the guest.
>>>
>>> When the guest does it, the guest is responsible for clearing the
>>> icache. Same for QEMU. It needs to clear it when doing DMA.
>>
>> Sure.  I was just worried that that commit message could be taken the
>> wrong way, as in "we no longer need the QEMU icache flushing patch".
>>
>>> However, what is still broken would be a direct /dev/mem map. There
>>> QEMU should probably clear the icache before starting the guest, in
>>> case another guest was running on that same memory before.
>>> Fortunately, we don't have that mode available in upstream QEMU :).
>>
>> How is QEMU loading images different if it's /dev/mem versus ordinary
>> anonymous memory?  You probably won't have stale icache data in the
>> latter case (which makes it less likely to be a problem in practice), but
>> in theory you could have data that still hasn't left the dcache.
> 
> It's the same. I just talked to Ben about this today in a different context and we should be safe :).

Safe how?

If it's truly the same, we're definitely not safe, since I had problems
with this using /dev/mem (particularly when changing the kernel image
without a host reboot) before I put in the icache flush patch.

-Scott

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 19/38] KVM: PPC: Add cache flush on page map
  2012-08-15 17:47             ` Scott Wood
@ 2012-08-15 18:01               ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-15 18:01 UTC (permalink / raw)
  To: Scott Wood; +Cc: kvm-ppc, KVM list


On 15.08.2012, at 19:47, Scott Wood wrote:

> On 08/15/2012 12:27 PM, Alexander Graf wrote:
>> 
>> On 15.08.2012, at 19:26, Scott Wood wrote:
>> 
>>> On 08/15/2012 04:52 AM, Alexander Graf wrote:
>>>> 
>>>> On 15.08.2012, at 03:23, Scott Wood wrote:
>>>> 
>>>>> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>>>>>> When we map a page that wasn't icache cleared before, do so when first
>>>>>> mapping it in KVM using the same information bits as the Linux mapping
>>>>>> logic. That way we are 100% sure that any page we map does not have stale
>>>>>> entries in the icache.
>>>>> 
>>>>> We're not really 100% sure of that -- this only handles the case where
>>>>> the kernel does the dirtying, not when it's done by QEMU or the guest.
>>>> 
>>>> When the guest does it, the guest is responsible for clearing the
>>>> icache. Same for QEMU. It needs to clear it when doing DMA.
>>> 
>>> Sure.  I was just worried that that commit message could be taken the
>>> wrong way, as in "we no longer need the QEMU icache flushing patch".
>>> 
>>>> However, what is still broken would be a direct /dev/mem map. There
>>>> QEMU should probably clear the icache before starting the guest, in
>>>> case another guest was running on that same memory before.
>>>> Fortunately, we don't have that mode available in upstream QEMU :).
>>> 
>>> How is QEMU loading images different if it's /dev/mem versus ordinary
>>> anonymous memory?  You probably won't have stale icache data in the
>>> latter case (which makes it less likely to be a problem in practice), but
>>> in theory you could have data that still hasn't left the dcache.
>> 
>> It's the same. I just talked to Ben about this today in a different context and we should be safe :).
> 
> Safe how?
> 
> If it's truly the same, we're definitely not safe, since I had problems
> with this using /dev/mem (particularly when changing the kernel image
> without a host reboot) before I put in the icache flush patch.

QEMU needs to icache flush everything it puts into guest memory. Whatever blob comes next (SLOF for -M pseries, kernel for -M e500) assumes dirty icache for every page it maps itself. So SLOF will icache flush the kernel text segment, Linux will icache flush user space pages.


Alex

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 19/38] KVM: PPC: Add cache flush on page map
  2012-08-15 18:01               ` Alexander Graf
@ 2012-08-15 18:16                 ` Scott Wood
  -1 siblings, 0 replies; 150+ messages in thread
From: Scott Wood @ 2012-08-15 18:16 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list

On 08/15/2012 01:01 PM, Alexander Graf wrote:
> 
> On 15.08.2012, at 19:47, Scott Wood wrote:
> 
>> On 08/15/2012 12:27 PM, Alexander Graf wrote:
>>>
>>> On 15.08.2012, at 19:26, Scott Wood wrote:
>>>
>>>> On 08/15/2012 04:52 AM, Alexander Graf wrote:
>>>>>
>>>>> On 15.08.2012, at 03:23, Scott Wood wrote:
>>>>>
>>>>>> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>>>>>>> When we map a page that wasn't icache cleared before, do so when first
>>>>>>> mapping it in KVM using the same information bits as the Linux mapping
>>>>>>> logic. That way we are 100% sure that any page we map does not have stale
>>>>>>> entries in the icache.
>>>>>>
>>>>>> We're not really 100% sure of that -- this only handles the case where
>>>>>> the kernel does the dirtying, not when it's done by QEMU or the guest.
>>>>>
>>>>> When the guest does it, the guest is responsible for clearing the
>>>>> icache. Same for QEMU. It needs to clear it when doing DMA.
>>>>
>>>> Sure.  I was just worried that that commit message could be taken the
>>>> wrong way, as in "we no longer need the QEMU icache flushing patch".
>>>>
>>>>> However, what is still broken would be a direct /dev/mem map. There
>>>>> QEMU should probably clear the icache before starting the guest, in
>>>>> case another guest was running on that same memory before.
>>>>> Fortunately, we don't have that mode available in upstream QEMU :).
>>>>
>>>> How is QEMU loading images different if it's /dev/mem versus ordinary
>>>> anonymous memory?  You probably won't have stale icache data in the
>>>> latter case (which makes it less likely to be a problem in practice), but
>>>> in theory you could have data that still hasn't left the dcache.
>>>
>>> It's the same. I just talked to Ben about this today in a different context and we should be safe :).
>>
>> Safe how?
>>
>> If it's truly the same, we're definitely not safe, since I had problems
>> with this using /dev/mem (particularly when changing the kernel image
>> without a host reboot) before I put in the icache flush patch.
> 
> QEMU needs to icache flush everything it puts into guest memory.

Yes.  I thought you meant we should be safe as things are now.

-Scott

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 19/38] KVM: PPC: Add cache flush on page map
  2012-08-15 18:16                 ` Scott Wood
@ 2012-08-15 18:27                   ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-15 18:27 UTC (permalink / raw)
  To: Scott Wood; +Cc: kvm-ppc, KVM list


On 15.08.2012, at 20:16, Scott Wood wrote:

> On 08/15/2012 01:01 PM, Alexander Graf wrote:
>> 
>> On 15.08.2012, at 19:47, Scott Wood wrote:
>> 
>>> On 08/15/2012 12:27 PM, Alexander Graf wrote:
>>>> 
>>>> On 15.08.2012, at 19:26, Scott Wood wrote:
>>>> 
>>>>> On 08/15/2012 04:52 AM, Alexander Graf wrote:
>>>>>> 
>>>>>> On 15.08.2012, at 03:23, Scott Wood wrote:
>>>>>> 
>>>>>>> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>>>>>>>> When we map a page that wasn't icache cleared before, do so when first
>>>>>>>> mapping it in KVM using the same information bits as the Linux mapping
>>>>>>>> logic. That way we are 100% sure that any page we map does not have stale
>>>>>>>> entries in the icache.
>>>>>>> 
>>>>>>> We're not really 100% sure of that -- this only handles the case where
>>>>>>> the kernel does the dirtying, not when it's done by QEMU or the guest.
>>>>>> 
>>>>>> When the guest does it, the guest is responsible for clearing the
>>>>>> icache. Same for QEMU. It needs to clear it when doing DMA.
>>>>> 
>>>>> Sure.  I was just worried that that commit message could be taken the
>>>>> wrong way, as in "we no longer need the QEMU icache flushing patch".
>>>>> 
>>>>>> However, what is still broken would be a direct /dev/mem map. There
>>>>>> QEMU should probably clear the icache before starting the guest, in
>>>>>> case another guest was running on that same memory before.
>>>>>> Fortunately, we don't have that mode available in upstream QEMU :).
>>>>> 
>>>>> How is QEMU loading images different if it's /dev/mem versus ordinary
>>>>> anonymous memory?  You probably won't have stale icache data in the
>>>>> latter case (which makes it less likely to be a problem in practice), but
>>>>> in theory you could have data that still hasn't left the dcache.
>>>> 
>>>> It's the same. I just talked to Ben about this today in a different context and we should be safe :).
>>> 
>>> Safe how?
>>> 
>>> If it's truly the same, we're definitely not safe, since I had problems
>>> with this using /dev/mem (particularly when changing the kernel image
>>> without a host reboot) before I put in the icache flush patch.
>> 
>> QEMU needs to icache flush everything it puts into guest memory.
> 
> Yes.  I thought you meant we should be safe as things are now.

Hrm. What happened to your patch that flushes the icache on cpu_physical_memory_rw?


Alex

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 16/38] KVM: PPC: BookE: Add check_requests helper function
  2012-08-15  0:10     ` Scott Wood
@ 2012-08-15 18:28       ` Marcelo Tosatti
  -1 siblings, 0 replies; 150+ messages in thread
From: Marcelo Tosatti @ 2012-08-15 18:28 UTC (permalink / raw)
  To: Scott Wood; +Cc: Alexander Graf, kvm-ppc, KVM list

On Tue, Aug 14, 2012 at 07:10:43PM -0500, Scott Wood wrote:
> On 08/14/2012 06:04 PM, Alexander Graf wrote:
> > We need a central place to check for pending requests in. Add one that
> > only does the timer check we already do in a different place.
> > 
> > Later, this central function can be extended by more checks.
> > 
> > Signed-off-by: Alexander Graf <agraf@suse.de>
> > ---
> >  arch/powerpc/kvm/booke.c |   24 +++++++++++++++++-------
> >  1 files changed, 17 insertions(+), 7 deletions(-)
> > 
> > diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> > index 1d4ce9a..bcf87fe 100644
> > --- a/arch/powerpc/kvm/booke.c
> > +++ b/arch/powerpc/kvm/booke.c
> > @@ -419,13 +419,6 @@ static void kvmppc_core_check_exceptions(struct kvm_vcpu *vcpu)
> >  	unsigned long *pending = &vcpu->arch.pending_exceptions;
> >  	unsigned int priority;
> >  
> > -	if (vcpu->requests) {
> > -		if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu)) {
> > -			smp_mb();
> > -			update_timer_ints(vcpu);
> > -		}
> > -	}
> > -
> >  	priority = __ffs(*pending);
> >  	while (priority < BOOKE_IRQPRIO_MAX) {
> >  		if (kvmppc_booke_irqprio_deliver(vcpu, priority))
> > @@ -461,6 +454,14 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
> >  	return r;
> >  }
> >  
> > +static void kvmppc_check_requests(struct kvm_vcpu *vcpu)
> > +{
> > +	if (vcpu->requests) {
> > +		if (kvm_check_request(KVM_REQ_PENDING_TIMER, vcpu))
> > +			update_timer_ints(vcpu);
> > +	}
> > +}
> > +
> >  /*
> >   * Common checks before entering the guest world.  Call with interrupts
> >   * disabled.
> > @@ -485,6 +486,15 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
> >  			break;
> >  		}
> >  
> > +		smp_mb();
> > +		if (vcpu->requests) {
> > +			/* Make sure we process requests preemptable */
> > +			local_irq_enable();
> > +			kvmppc_check_requests(vcpu);
> > +			local_irq_disable();
> > +			continue;
> > +		}
> 
> What previous memory access is the smp_mb() ordering against?
> 
> -Scott

It is good practice to document each memory barrier.


^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 19/38] KVM: PPC: Add cache flush on page map
  2012-08-15 18:27                   ` Alexander Graf
@ 2012-08-15 18:29                     ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-15 18:29 UTC (permalink / raw)
  To: Scott Wood; +Cc: kvm-ppc, KVM list


On 15.08.2012, at 20:27, Alexander Graf wrote:

> 
> On 15.08.2012, at 20:16, Scott Wood wrote:
> 
>> On 08/15/2012 01:01 PM, Alexander Graf wrote:
>>> 
>>> On 15.08.2012, at 19:47, Scott Wood wrote:
>>> 
>>>> On 08/15/2012 12:27 PM, Alexander Graf wrote:
>>>>> 
>>>>> On 15.08.2012, at 19:26, Scott Wood wrote:
>>>>> 
>>>>>> On 08/15/2012 04:52 AM, Alexander Graf wrote:
>>>>>>> 
>>>>>>> On 15.08.2012, at 03:23, Scott Wood wrote:
>>>>>>> 
>>>>>>>> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>>>>>>>>> When we map a page that wasn't icache cleared before, do so when first
>>>>>>>>> mapping it in KVM using the same information bits as the Linux mapping
>>>>>>>>> logic. That way we are 100% sure that any page we map does not have stale
>>>>>>>>> entries in the icache.
>>>>>>>> 
>>>>>>>> We're not really 100% sure of that -- this only handles the case where
>>>>>>>> the kernel does the dirtying, not when it's done by QEMU or the guest.
>>>>>>> 
>>>>>>> When the guest does it, the guest is responsible for clearing the
>>>>>>> icache. Same for QEMU. It needs to clear it when doing DMA.
>>>>>> 
>>>>>> Sure.  I was just worried that that commit message could be taken the
>>>>>> wrong way, as in "we no longer need the QEMU icache flushing patch".
>>>>>> 
>>>>>>> However, what is still broken would be a direct /dev/mem map. There
>>>>>>> QEMU should probably clear the icache before starting the guest, in
>>>>>>> case another guest was running on that same memory before.
>>>>>>> Fortunately, we don't have that mode available in upstream QEMU :).
>>>>>> 
>>>>>> How is QEMU loading images different if it's /dev/mem versus ordinary
>>>>>> anonymous memory?  You probably won't have stale icache data in the
>>>>>> latter case (which makes it less likely to be a problem in practice), but
>>>>>> in theory you could have data that still hasn't left the dcache.
>>>>> 
>>>>> It's the same. I just talked to Ben about this today in a different context and we should be safe :).
>>>> 
>>>> Safe how?
>>>> 
>>>> If it's truly the same, we're definitely not safe, since I had problems
>>>> with this using /dev/mem (particularly when changing the kernel image
>>>> without a host reboot) before I put in the icache flush patch.
>>> 
>>> QEMU needs to icache flush everything it puts into guest memory.
>> 
>> Yes.  I thought you meant we should be safe as things are now.
> 
> Hrm. What happened to your patch that flushes the icache on cpu_physical_memory_rw?

Ah, if I read Ben's comment correctly we only need it for rom loads, not always for cpu_physical_memory_rw.


Alex

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 19/38] KVM: PPC: Add cache flush on page map
  2012-08-15 18:29                     ` Alexander Graf
@ 2012-08-15 18:33                       ` Scott Wood
  -1 siblings, 0 replies; 150+ messages in thread
From: Scott Wood @ 2012-08-15 18:33 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list

On 08/15/2012 01:29 PM, Alexander Graf wrote:
> 
> On 15.08.2012, at 20:27, Alexander Graf wrote:
> 
>>
>> On 15.08.2012, at 20:16, Scott Wood wrote:
>>
>>> On 08/15/2012 01:01 PM, Alexander Graf wrote:
>>>>
>>>> On 15.08.2012, at 19:47, Scott Wood wrote:
>>>>
>>>>> On 08/15/2012 12:27 PM, Alexander Graf wrote:
>>>>>>
>>>>>> On 15.08.2012, at 19:26, Scott Wood wrote:
>>>>>>
>>>>>>> On 08/15/2012 04:52 AM, Alexander Graf wrote:
>>>>>>>>
>>>>>>>> On 15.08.2012, at 03:23, Scott Wood wrote:
>>>>>>>>
>>>>>>>>> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>>>>>>>>>> When we map a page that wasn't icache cleared before, do so when first
>>>>>>>>>> mapping it in KVM using the same information bits as the Linux mapping
>>>>>>>>>> logic. That way we are 100% sure that any page we map does not have stale
>>>>>>>>>> entries in the icache.
>>>>>>>>>
>>>>>>>>> We're not really 100% sure of that -- this only handles the case where
>>>>>>>>> the kernel does the dirtying, not when it's done by QEMU or the guest.
>>>>>>>>
>>>>>>>> When the guest does it, the guest is responsible for clearing the
>>>>>>>> icache. Same for QEMU. It needs to clear it when doing DMA.
>>>>>>>
>>>>>>> Sure.  I was just worried that that commit message could be taken the
>>>>>>> wrong way, as in "we no longer need the QEMU icache flushing patch".
>>>>>>>
>>>>>>>> However, what is still broken would be a direct /dev/mem map. There
>>>>>>>> QEMU should probably clear the icache before starting the guest, in
>>>>>>>> case another guest was running on that same memory before.
>>>>>>>> Fortunately, we don't have that mode available in upstream QEMU :).
>>>>>>>
>>>>>>> How is QEMU loading images different if it's /dev/mem versus ordinary
>>>>>>> anonymous memory?  You probably won't have stale icache data in the
>>>>>>> latter case (which makes it less likely to be a problem in practice), but
>>>>>>> in theory you could have data that still hasn't left the dcache.
>>>>>>
>>>>>> It's the same. I just talked to Ben about this today in a different context and we should be safe :).
>>>>>
>>>>> Safe how?
>>>>>
>>>>> If it's truly the same, we're definitely not safe, since I had problems
>>>>> with this using /dev/mem (particularly when changing the kernel image
>>>>> without a host reboot) before I put in the icache flush patch.
>>>>
>>>> QEMU needs to icache flush everything it puts into guest memory.
>>>
>>> Yes.  I thought you meant we should be safe as things are now.
>>
>> Hrm. What happened to your patch that flushes the icache on cpu_physical_memory_rw?

IIRC Ben wanted it conditionalized to not slow things down on
icache-coherent systems, and I never got around to respinning it.

> Ah, if I read Ben's comment correctly we only need it for rom loads, not always for cpu_physical_memory_rw.

Why?

-Scott


* Re: [PATCH 19/38] KVM: PPC: Add cache flush on page map
  2012-08-15 18:33                       ` Scott Wood
@ 2012-08-15 18:51                         ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-15 18:51 UTC (permalink / raw)
  To: Scott Wood; +Cc: kvm-ppc, KVM list


On 15.08.2012, at 20:33, Scott Wood wrote:

> On 08/15/2012 01:29 PM, Alexander Graf wrote:
>> 
>> On 15.08.2012, at 20:27, Alexander Graf wrote:
>> 
>>> 
>>> On 15.08.2012, at 20:16, Scott Wood wrote:
>>> 
>>>> On 08/15/2012 01:01 PM, Alexander Graf wrote:
>>>>> 
>>>>> On 15.08.2012, at 19:47, Scott Wood wrote:
>>>>> 
>>>>>> On 08/15/2012 12:27 PM, Alexander Graf wrote:
>>>>>>> 
>>>>>>> On 15.08.2012, at 19:26, Scott Wood wrote:
>>>>>>> 
>>>>>>>> On 08/15/2012 04:52 AM, Alexander Graf wrote:
>>>>>>>>> 
>>>>>>>>> On 15.08.2012, at 03:23, Scott Wood wrote:
>>>>>>>>> 
>>>>>>>>>> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>>>>>>>>>>> When we map a page that wasn't icache cleared before, do so when first
>>>>>>>>>>> mapping it in KVM using the same information bits as the Linux mapping
>>>>>>>>>>> logic. That way we are 100% sure that any page we map does not have stale
>>>>>>>>>>> entries in the icache.
>>>>>>>>>> 
>>>>>>>>>> We're not really 100% sure of that -- this only handles the case where
>>>>>>>>>> the kernel does the dirtying, not when it's done by QEMU or the guest.
>>>>>>>>> 
>>>>>>>>> When the guest does it, the guest is responsible for clearing the
>>>>>>>>> icache. Same for QEMU. It needs to clear it when doing DMA.
>>>>>>>> 
>>>>>>>> Sure.  I was just worried that that commit message could be taken the
>>>>>>>> wrong way, as in "we no longer need the QEMU icache flushing patch".
>>>>>>>> 
>>>>>>>>> However, what is still broken would be a direct /dev/mem map. There
>>>>>>>>> QEMU should probably clear the icache before starting the guest, in
>>>>>>>>> case another guest was running on that same memory before.
>>>>>>>>> Fortunately, we don't have that mode available in upstream QEMU :).
>>>>>>>> 
>>>>>>>> How is QEMU loading images different if it's /dev/mem versus ordinary
>>>>>>>> anonymous memory?  You probably won't have stale icache data in the
>>>>>>>>>> latter case (which makes it less likely to be a problem in practice), but
>>>>>>>> in theory you could have data that still hasn't left the dcache.
>>>>>>> 
>>>>>>> It's the same. I just talked to Ben about this today in a different context and we should be safe :).
>>>>>> 
>>>>>> Safe how?
>>>>>> 
>>>>>> If it's truly the same, we're definitely not safe, since I had problems
>>>>>> with this using /dev/mem (particularly when changing the kernel image
>>>>>> without a host reboot) before I put in the icache flush patch.
>>>>> 
>>>>> QEMU needs to icache flush everything it puts into guest memory.
>>>> 
>>>> Yes.  I thought you meant we should be safe as things are now.
>>> 
>>> Hrm. What happened to your patch that flushes the icache on cpu_physical_memory_rw?
> 
> IIRC Ben wanted it conditionalized to not slow things down on
> icache-coherent systems, and I never got around to respinning it.

No, he was saying that DMA doesn't flush the icache:

  http://thread.gmane.org/gmane.comp.emulators.qemu/119022/focus=119086

> 
>> Ah, if I read Ben's comment correctly we only need it for rom loads, not always for cpu_physical_memory_rw.
> 
> Why?

Because guest Linux apparently assumes that DMA'd memory needs to be icache flushed.


Alex


* Re: [PATCH 19/38] KVM: PPC: Add cache flush on page map
  2012-08-15 18:51                         ` Alexander Graf
@ 2012-08-15 18:56                           ` Scott Wood
  -1 siblings, 0 replies; 150+ messages in thread
From: Scott Wood @ 2012-08-15 18:56 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list

On 08/15/2012 01:51 PM, Alexander Graf wrote:
> 
> On 15.08.2012, at 20:33, Scott Wood wrote:
> 
>> On 08/15/2012 01:29 PM, Alexander Graf wrote:
>>>
>>> On 15.08.2012, at 20:27, Alexander Graf wrote:
>>>
>>>>
>>>> On 15.08.2012, at 20:16, Scott Wood wrote:
>>>>
>>>>> On 08/15/2012 01:01 PM, Alexander Graf wrote:
>>>>>>
>>>>>> On 15.08.2012, at 19:47, Scott Wood wrote:
>>>>>>
>>>>>>> On 08/15/2012 12:27 PM, Alexander Graf wrote:
>>>>>>>>
>>>>>>>> On 15.08.2012, at 19:26, Scott Wood wrote:
>>>>>>>>
>>>>>>>>> On 08/15/2012 04:52 AM, Alexander Graf wrote:
>>>>>>>>>>
>>>>>>>>>> On 15.08.2012, at 03:23, Scott Wood wrote:
>>>>>>>>>>
>>>>>>>>>>> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>>>>>>>>>>>> When we map a page that wasn't icache cleared before, do so when first
>>>>>>>>>>>> mapping it in KVM using the same information bits as the Linux mapping
>>>>>>>>>>>> logic. That way we are 100% sure that any page we map does not have stale
>>>>>>>>>>>> entries in the icache.
>>>>>>>>>>>
>>>>>>>>>>> We're not really 100% sure of that -- this only handles the case where
>>>>>>>>>>> the kernel does the dirtying, not when it's done by QEMU or the guest.
>>>>>>>>>>
>>>>>>>>>> When the guest does it, the guest is responsible for clearing the
>>>>>>>>>> icache. Same for QEMU. It needs to clear it when doing DMA.
>>>>>>>>>
>>>>>>>>> Sure.  I was just worried that that commit message could be taken the
>>>>>>>>> wrong way, as in "we no longer need the QEMU icache flushing patch".
>>>>>>>>>
>>>>>>>>>> However, what is still broken would be a direct /dev/mem map. There
>>>>>>>>>> QEMU should probably clear the icache before starting the guest, in
>>>>>>>>>> case another guest was running on that same memory before.
>>>>>>>>>> Fortunately, we don't have that mode available in upstream QEMU :).
>>>>>>>>>
>>>>>>>>> How is QEMU loading images different if it's /dev/mem versus ordinary
>>>>>>>>> anonymous memory?  You probably won't have stale icache data in the
>>>>>>>>> latter case (which makes it less likely to be a problem in practice), but
>>>>>>>>> in theory you could have data that still hasn't left the dcache.
>>>>>>>>
>>>>>>>> It's the same. I just talked to Ben about this today in a different context and we should be safe :).
>>>>>>>
>>>>>>> Safe how?
>>>>>>>
>>>>>>> If it's truly the same, we're definitely not safe, since I had problems
>>>>>>> with this using /dev/mem (particularly when changing the kernel image
>>>>>>> without a host reboot) before I put in the icache flush patch.
>>>>>>
>>>>>> QEMU needs to icache flush everything it puts into guest memory.
>>>>>
>>>>> Yes.  I thought you meant we should be safe as things are now.
>>>>
>>>> Hrm. What happened to your patch that flushes the icache on cpu_physical_memory_rw?
>>
>> IIRC Ben wanted it conditionalized to not slow things down on
>> icache-coherent systems, and I never got around to respinning it.
> 
> No, he was saying that DMA doesn't flush the icache:
> 
>   http://thread.gmane.org/gmane.comp.emulators.qemu/119022/focus=119086

I recall someone asking for it to be made conditional, but I don't have
time to look it up right now -- I want to try to get some U-Boot stuff
done before the end of the merge window tomorrow.

>>> Ah, if I read Ben's comment correctly we only need it for rom loads, not always for cpu_physical_memory_rw.
>>
>> Why?
> 
> Because guest Linux apparently assumes that DMA'd memory needs to be icache flushed.

What about breakpoints and other debug modifications?

And it's possible (if not necessarily likely) that other guests are
different.

-Scott




* Re: [PATCH 19/38] KVM: PPC: Add cache flush on page map
  2012-08-15 18:56                           ` Scott Wood
@ 2012-08-15 18:58                             ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-15 18:58 UTC (permalink / raw)
  To: Scott Wood; +Cc: kvm-ppc, KVM list


On 15.08.2012, at 20:56, Scott Wood wrote:

> On 08/15/2012 01:51 PM, Alexander Graf wrote:
>> 
>> On 15.08.2012, at 20:33, Scott Wood wrote:
>> 
>>> On 08/15/2012 01:29 PM, Alexander Graf wrote:
>>>> 
>>>> On 15.08.2012, at 20:27, Alexander Graf wrote:
>>>> 
>>>>> 
>>>>> On 15.08.2012, at 20:16, Scott Wood wrote:
>>>>> 
>>>>>> On 08/15/2012 01:01 PM, Alexander Graf wrote:
>>>>>>> 
>>>>>>> On 15.08.2012, at 19:47, Scott Wood wrote:
>>>>>>> 
>>>>>>>> On 08/15/2012 12:27 PM, Alexander Graf wrote:
>>>>>>>>> 
>>>>>>>>> On 15.08.2012, at 19:26, Scott Wood wrote:
>>>>>>>>> 
>>>>>>>>>> On 08/15/2012 04:52 AM, Alexander Graf wrote:
>>>>>>>>>>> 
>>>>>>>>>>> On 15.08.2012, at 03:23, Scott Wood wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>>>>>>>>>>>>> When we map a page that wasn't icache cleared before, do so when first
>>>>>>>>>>>>> mapping it in KVM using the same information bits as the Linux mapping
>>>>>>>>>>>>> logic. That way we are 100% sure that any page we map does not have stale
>>>>>>>>>>>>> entries in the icache.
>>>>>>>>>>>> 
>>>>>>>>>>>> We're not really 100% sure of that -- this only handles the case where
>>>>>>>>>>>> the kernel does the dirtying, not when it's done by QEMU or the guest.
>>>>>>>>>>> 
>>>>>>>>>>> When the guest does it, the guest is responsible for clearing the
>>>>>>>>>>> icache. Same for QEMU. It needs to clear it when doing DMA.
>>>>>>>>>> 
>>>>>>>>>> Sure.  I was just worried that that commit message could be taken the
>>>>>>>>>> wrong way, as in "we no longer need the QEMU icache flushing patch".
>>>>>>>>>> 
>>>>>>>>>>> However, what is still broken would be a direct /dev/mem map. There
>>>>>>>>>>> QEMU should probably clear the icache before starting the guest, in
>>>>>>>>>>> case another guest was running on that same memory before.
>>>>>>>>>>> Fortunately, we don't have that mode available in upstream QEMU :).
>>>>>>>>>> 
>>>>>>>>>> How is QEMU loading images different if it's /dev/mem versus ordinary
>>>>>>>>>> anonymous memory?  You probably won't have stale icache data in the
>>>>>>>>>> latter case (which makes it less likely to be a problem in practice), but
>>>>>>>>>> in theory you could have data that still hasn't left the dcache.
>>>>>>>>> 
>>>>>>>>> It's the same. I just talked to Ben about this today in a different context and we should be safe :).
>>>>>>>> 
>>>>>>>> Safe how?
>>>>>>>> 
>>>>>>>> If it's truly the same, we're definitely not safe, since I had problems
>>>>>>>> with this using /dev/mem (particularly when changing the kernel image
>>>>>>>> without a host reboot) before I put in the icache flush patch.
>>>>>>> 
>>>>>>> QEMU needs to icache flush everything it puts into guest memory.
>>>>>> 
>>>>>> Yes.  I thought you meant we should be safe as things are now.
>>>>> 
>>>>> Hrm. What happened to your patch that flushes the icache on cpu_physical_memory_rw?
>>> 
>>> IIRC Ben wanted it conditionalized to not slow things down on
>>> icache-coherent systems, and I never got around to respinning it.
>> 
>> No, he was saying that DMA doesn't flush the icache:
>> 
>>  http://thread.gmane.org/gmane.comp.emulators.qemu/119022/focus=119086
> 
> I recall someone asking for it to be made conditional, but I don't have
> time to look it up right now -- I want to try to get some U-Boot stuff
> done before the end of the merge window tomorrow.

Sure :)

> 
>>>> Ah, if I read Ben's comment correctly we only need it for rom loads, not always for cpu_physical_memory_rw.
>>> 
>>> Why?
>> 
>> Because guest Linux apparently assumes that DMA'd memory needs to be icache flushed.
> 
> What about breakpoints and other debug modifications?

The breakpoint code is arch specific. We can just put an icache flush in there.

> And it's possible (if not necessarily likely) that other guests are
> different.

Does fsl hardware guarantee icache coherency from device DMA?


Alex


* Re: [PATCH 19/38] KVM: PPC: Add cache flush on page map
  2012-08-15 18:58                             ` Alexander Graf
@ 2012-08-15 19:05                               ` Scott Wood
  -1 siblings, 0 replies; 150+ messages in thread
From: Scott Wood @ 2012-08-15 19:05 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list

On 08/15/2012 01:58 PM, Alexander Graf wrote:
> 
> On 15.08.2012, at 20:56, Scott Wood wrote:
> 
>> On 08/15/2012 01:51 PM, Alexander Graf wrote:
>>>
>>> On 15.08.2012, at 20:33, Scott Wood wrote:
>>>
>>>> On 08/15/2012 01:29 PM, Alexander Graf wrote:
>>>>> Ah, if I read Ben's comment correctly we only need it for rom loads, not always for cpu_physical_memory_rw.
>>>>
>>>> Why?
>>>
>>> Because guest Linux apparently assumes that DMA'd memory needs to be icache flushed.
>>
>> What about breakpoints and other debug modifications?
> 
> The breakpoint code is arch specific. We can just put an icache flush in there.

That doesn't cover other modifications that a debugger might do
(including manual poking at code done by a person at the command line).
 It's not really the breakpoint that's the special case, it's things
that the guest thinks of as DMA -- and differentiating that seems like a
questionable optimization.  If the guest is going to flush anyway, is
there any significant performance penalty to flushing twice?  The second
time would just be a no-op beyond doing the MMU/cache lookup.

>> And it's possible (if not necessarily likely) that other guests are
>> different.
> 
> Does fsl hardware guarantee icache coherency from device DMA?

I don't think so, but I don't know of any fsl hardware that leaves dirty
data in the dcache after DMA.  Even with stashing on our newer chips,
the data first goes to memory and then the core is told to prefetch it.

-Scott



^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 19/38] KVM: PPC: Add cache flush on page map
  2012-08-15 19:05                               ` Scott Wood
@ 2012-08-15 19:29                                 ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-08-15 19:29 UTC (permalink / raw)
  To: Scott Wood; +Cc: kvm-ppc, KVM list, Benjamin Herrenschmidt


On 15.08.2012, at 21:05, Scott Wood wrote:

> On 08/15/2012 01:58 PM, Alexander Graf wrote:
>> 
>> On 15.08.2012, at 20:56, Scott Wood wrote:
>> 
>>> On 08/15/2012 01:51 PM, Alexander Graf wrote:
>>>> 
>>>> On 15.08.2012, at 20:33, Scott Wood wrote:
>>>> 
>>>>> On 08/15/2012 01:29 PM, Alexander Graf wrote:
>>>>>> Ah, if I read Ben's comment correctly we only need it for rom loads, not always for cpu_physical_memory_rw.
>>>>> 
>>>>> Why?
>>>> 
>>>> Because guest Linux apparently assumes that DMA'd memory needs to be icache flushed.
>>> 
>>> What about breakpoints and other debug modifications?
>> 
>> The breakpoint code is arch specific. We can just put an icache flush in there.
> 
> That doesn't cover other modifications that a debugger might do
> (including manual poking at code done by a person at the command line).

Why not? This would go through gdbstub, where we can always put in an icache flush.

> It's not really the breakpoint that's the special case, it's things
> that the guest thinks of as DMA -- and differentiating that seems like a
> questionable optimization.  If the guest is going to flush anyway, is
> there any significant performance penalty to flushing twice?  The second
> time would just be a no-op beyond doing the MMU/cache lookup.

I would hope the guest is clever enough to only icache flush when we actually execute from these pages.

> 
>>> And it's possible (if not necessarily likely) that other guests are
>>> different.
>> 
>> Does fsl hardware guarantee icache coherency from device DMA?
> 
> I don't think so, but I don't know of any fsl hardware that leaves dirty
> data in the dcache after DMA.  Even with stashing on our newer chips,
> the data first goes to memory and then the core is told to prefetch it.

For Linux, I think we always flush the dcache when flushing the icache. However, that argument is reasonably valid. We probably want to flush the dcache on DMA, so that a stale icache can fetch it from memory properly. But I don't see a reason why we would want to do so for the icache if hardware doesn't do it either.

But let's get Ben on board here :).


Alex

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 19/38] KVM: PPC: Add cache flush on page map
  2012-08-15 19:29                                 ` Alexander Graf
@ 2012-08-15 19:53                                   ` Scott Wood
  -1 siblings, 0 replies; 150+ messages in thread
From: Scott Wood @ 2012-08-15 19:53 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list, Benjamin Herrenschmidt

On 08/15/2012 02:29 PM, Alexander Graf wrote:
> 
> On 15.08.2012, at 21:05, Scott Wood wrote:
> 
>> On 08/15/2012 01:58 PM, Alexander Graf wrote:
>>>
>>> On 15.08.2012, at 20:56, Scott Wood wrote:
>>>
>>>> On 08/15/2012 01:51 PM, Alexander Graf wrote:
>>>>>
>>>>> On 15.08.2012, at 20:33, Scott Wood wrote:
>>>>>
>>>>>> On 08/15/2012 01:29 PM, Alexander Graf wrote:
>>>>>>> Ah, if I read Ben's comment correctly we only need it for rom loads, not always for cpu_physical_memory_rw.
>>>>>>
>>>>>> Why?
>>>>>
>>>>> Because guest Linux apparently assumes that DMA'd memory needs to be icache flushed.
>>>>
>>>> What about breakpoints and other debug modifications?
>>>
>>> The breakpoint code is arch specific. We can just put an icache flush in there.
>>
>> That doesn't cover other modifications that a debugger might do
>> (including manual poking at code done by a person at the command line).
> 
> Why not? This would go through gdbstub,

Not necessarily.  I could be poking at memory from the QEMU command
line.  If there isn't a command for that, there should be. :-)

> where we can always put in an icache flush.

If you want to cover every individual place where this function is
called for non-DMA, fine, though I'd feel more comfortable with
something that specifically identifies the access as for DMA.

>>>> And it's possible (if not necessarily likely) that other guests are
>>>> different.
>>>
>>> Does fsl hardware guarantee icache coherency from device DMA?
>>
>> I don't think so, but I don't know of any fsl hardware that leaves dirty
>> data in the dcache after DMA.  Even with stashing on our newer chips,
>> the data first goes to memory and then the core is told to prefetch it.
> 
> For Linux, I think we always flush the dcache when flushing the
> icache. However, that argument is reasonably valid. We probably want
> to flush the dache on DMA, so that a stale icache can fetch it from
> memory properly. But I don't see a reason why we would want to do so
> for the icache if hardware doesn't do it either.
> 
> But let's get Ben on board here :).

The only reason to invalidate the icache on DMA accesses would be to
avoid introducing a special case in the QEMU code, unless we find
hardware to emulate that does invalidate icache on DMA writes but isn't
icache-coherent in general (it's fairly plausible -- icache would act on
snoops it sees on the bus, but icache fills wouldn't issue snoops of
their own).

-Scott



^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 29/38] KVM: PPC: Book3S: PR: Rework irq disabling
  2012-08-14 23:04   ` Alexander Graf
@ 2012-08-17 21:47     ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 150+ messages in thread
From: Benjamin Herrenschmidt @ 2012-08-17 21:47 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list

>  
> +/* Please call after prepare_to_enter. This function puts the lazy ee state
> +   back to normal mode, without actually enabling interrupts. */
> +static inline void kvmppc_lazy_ee_enable(void)
> +{
> +#ifdef CONFIG_PPC64
> +	/* Only need to enable IRQs by hard enabling them after this */
> +	local_paca->irq_happened = 0;
> +	local_paca->soft_enabled = 1;
> +#endif
> +}

Smells like the above is the right spot for trace_hardirqs_on() and:

> -		__hard_irq_disable();
> +		local_irq_disable();
>  		if (kvmppc_prepare_to_enter(vcpu)) {
> -			/* local_irq_enable(); */
> +			local_irq_enable();
>  			run->exit_reason = KVM_EXIT_INTR;
>  			r = -EINTR;
>  		} else {
>  			/* Going back to guest */
>  			kvm_guest_enter();
> +			kvmppc_lazy_ee_enable();
>  		}
>  	}

You should probably do kvmppc_lazy_ee_enable() before guest enter
so the CPU doesn't appear to the rest of the world that it has
interrupt disabled while it's in the guest.
  
> @@ -1066,8 +1062,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>  #endif
>  	ulong ext_msr;
>  
> -	preempt_disable();
> -
>  	/* Check if we can run the vcpu at all */
>  	if (!vcpu->arch.sane) {
>  		kvm_run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
> @@ -1081,9 +1075,9 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>  	 * really did time things so badly, then we just exit again due to
>  	 * a host external interrupt.
>  	 */
> -	__hard_irq_disable();
> +	local_irq_disable();
>  	if (kvmppc_prepare_to_enter(vcpu)) {
> -		__hard_irq_enable();
> +		local_irq_enable();
>  		kvm_run->exit_reason = KVM_EXIT_INTR;
>  		ret = -EINTR;
>  		goto out;
> @@ -1122,7 +1116,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>  	if (vcpu->arch.shared->msr & MSR_FP)
>  		kvmppc_handle_ext(vcpu, BOOK3S_INTERRUPT_FP_UNAVAIL, MSR_FP);
>  
> -	kvm_guest_enter();
> +	kvmppc_lazy_ee_enable();

Same. BTW, why do you have two enter paths? Smells like a recipe for
disaster :-)
 
> 	ret = __kvmppc_vcpu_run(kvm_run, vcpu);
>  
> @@ -1157,7 +1151,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>  
>  out:
>  	vcpu->mode = OUTSIDE_GUEST_MODE;
> -	preempt_enable();
>  	return ret;
>  }
>  
> diff --git a/arch/powerpc/kvm/book3s_rmhandlers.S b/arch/powerpc/kvm/book3s_rmhandlers.S
> index 9ecf6e3..b2f8258 100644
> --- a/arch/powerpc/kvm/book3s_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_rmhandlers.S
> @@ -170,20 +170,21 @@ kvmppc_handler_skip_ins:
>   * Call kvmppc_handler_trampoline_enter in real mode
>   *
>   * On entry, r4 contains the guest shadow MSR
> + * MSR.EE has to be 0 when calling this function
>   */
>  _GLOBAL(kvmppc_entry_trampoline)
>  	mfmsr	r5
>  	LOAD_REG_ADDR(r7, kvmppc_handler_trampoline_enter)
>  	toreal(r7)
>  
> -	li	r9, MSR_RI
> -	ori	r9, r9, MSR_EE
> -	andc	r9, r5, r9	/* Clear EE and RI in MSR value */
>  	li	r6, MSR_IR | MSR_DR
> -	ori	r6, r6, MSR_EE
> -	andc	r6, r5, r6	/* Clear EE, DR and IR in MSR value */
> -	MTMSR_EERI(r9)		/* Clear EE and RI in MSR */
> -	mtsrr0	r7		/* before we set srr0/1 */
> +	andc	r6, r5, r6	/* Clear DR and IR in MSR value */
> +	/*
> +	 * Set EE in HOST_MSR so that it's enabled when we get into our
> +	 * C exit handler function
> +	 */
> +	ori	r5, r5, MSR_EE
> +	mtsrr0	r7
>  	mtsrr1	r6
>  	RFI
>  
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index aae535f..2bd190c 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -486,6 +486,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>  		ret = -EINTR;
>  		goto out;
>  	}
> +	kvmppc_lazy_ee_enable();
>  
>  	kvm_guest_enter();

Same.

> @@ -955,6 +956,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>  		} else {
>  			/* Going back to guest */
>  			kvm_guest_enter();
> +			kvmppc_lazy_ee_enable();
>  		}
>  	}

Same.

>  
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index 053bfef..545c183 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -30,6 +30,7 @@
>  #include <asm/kvm_ppc.h>
>  #include <asm/tlbflush.h>
>  #include <asm/cputhreads.h>
> +#include <asm/irqflags.h>
>  #include "timing.h"
>  #include "../mm/mmu_decl.h"
>  
> @@ -93,6 +94,19 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
>  			break;
>  		}
>  
> +#ifdef CONFIG_PPC64
> +		/* lazy EE magic */
> +		hard_irq_disable();
> +		if (lazy_irq_pending()) {
> +			/* Got an interrupt in between, try again */
> +			local_irq_enable();
> +			local_irq_disable();
> +			continue;
> +		}
> +
> +		trace_hardirqs_on();
> +#endif

And move the trace out as I mentioned.

Cheers,
Ben.

>  		/* Going into guest context! Yay! */
>  		vcpu->mode = IN_GUEST_MODE;
>  		smp_wmb();

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 17/38] KVM: PPC: BookE: Add support for vcpu->mode
  2012-08-15  9:29           ` Alexander Graf
@ 2012-08-21  1:41             ` Scott Wood
  -1 siblings, 0 replies; 150+ messages in thread
From: Scott Wood @ 2012-08-21  1:41 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, KVM list

On 08/15/2012 04:29 AM, Alexander Graf wrote:
> 
> On 15.08.2012, at 03:17, Scott Wood wrote:
> 
>> On 08/14/2012 07:26 PM, Alexander Graf wrote:
>>>
>>> On 15.08.2012, at 02:17, Scott Wood wrote:
>>>
>>>> On 08/14/2012 06:04 PM, Alexander Graf wrote:
>>>>> Generic KVM code might want to know whether we are inside guest context
>>>>> or outside. It also wants to be able to push us out of guest context.
>>>>>
>>>>> Add support to the BookE code for the generic vcpu->mode field that describes
>>>>> the above states.
>>>>>
>>>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>>> ---
>>>>> arch/powerpc/kvm/booke.c |   11 +++++++++++
>>>>> 1 files changed, 11 insertions(+), 0 deletions(-)
>>>>>
>>>>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
>>>>> index bcf87fe..70a86c0 100644
>>>>> --- a/arch/powerpc/kvm/booke.c
>>>>> +++ b/arch/powerpc/kvm/booke.c
>>>>> @@ -501,6 +501,15 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
>>>>> 			continue;
>>>>> 		}
>>>>>
>>>>> +		if (vcpu->mode == EXITING_GUEST_MODE) {
>>>>> +			r = 1;
>>>>> +			break;
>>>>> +		}
>>>>> +
>>>>> +		/* Going into guest context! Yay! */
>>>>> +		vcpu->mode = IN_GUEST_MODE;
>>>>> +		smp_wmb();
>>>>> +
>>>>> 		break;
>>>>> 	}
>>>>
>>>> Normally on entry to this function mode should be OUTSIDE_GUEST_MODE,
>>>> right?  How could it possibly be EXITING_GUEST_MODE then, since that
>>>> only replaces IN_GUEST_MODE?
>>>>
>>>> This doesn't match what x86 does with mode on entry.  Mode is supposed
>>>> to be set to IN_GUEST_MODE before requests are checked.
>>>>
>>>> I'm not sure what the point of EXITING_GUEST_MODE is at all, compared to
>>>> just waiting until after interrupts are disabled before setting
>>>> IN_GUEST_MODE (which we do on ppc, but not on x86 even though it seems
>>>> like a trivial change), plus the existing ordering between mode and
>>>> requests.
>>>
>>> Well, the only real use case I could find for the mode was the remote
>>> vcpu kick. If we're not outside of guest mode, we get an IPI to
>>> notify us that requests are outstanding.
>>
>> I'm curious why this is done so differently for broadcast requests than
>> for single-cpu requests.
>>
>>> So I only get us into OUTSIDE_GUEST_MODE when we really exit
>>> __vcpu_run, thus are in user space. That doesn't reflect what x86
>>> does, right, but so doesn't our whole loop concept.
>>
>> OK.  We still need to do ordering like x86 does, because otherwise
>> there's a race where we could check requests before the request bit is
>> set, and still have make_all_cpus_request see OUTSIDE_GUEST_MODE and not
>> send an IPI.
> 
> Could you please send a patch showing what workflow you envision? 

I'll try to do this tomorrow.

> The code as is should work, just be inefficient at times, right?

No, we could fail to send the IPI -- couldn't that result in the guest
using a stale TLB entry?

-Scott

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 29/38] KVM: PPC: Book3S: PR: Rework irq disabling
  2012-08-17 21:47     ` Benjamin Herrenschmidt
@ 2012-09-28  0:52       ` Alexander Graf
  -1 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-09-28  0:52 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: kvm-ppc, KVM list


On 17.08.2012, at 23:47, Benjamin Herrenschmidt wrote:

>> 
>> +/* Please call after prepare_to_enter. This function puts the lazy ee state
>> +   back to normal mode, without actually enabling interrupts. */
>> +static inline void kvmppc_lazy_ee_enable(void)
>> +{
>> +#ifdef CONFIG_PPC64
>> +	/* Only need to enable IRQs by hard enabling them after this */
>> +	local_paca->irq_happened = 0;
>> +	local_paca->soft_enabled = 1;
>> +#endif
>> +}
> 
> Smells like the above is the right spot for trace_hardirqs_on() and:

Hrm. Ok :).

> 
>> -		__hard_irq_disable();
>> +		local_irq_disable();
>> 		if (kvmppc_prepare_to_enter(vcpu)) {
>> -			/* local_irq_enable(); */
>> +			local_irq_enable();
>> 			run->exit_reason = KVM_EXIT_INTR;
>> 			r = -EINTR;
>> 		} else {
>> 			/* Going back to guest */
>> 			kvm_guest_enter();
>> +			kvmppc_lazy_ee_enable();
>> 		}
>> 	}
> 
> You should probably do kvmppc_lazy_ee_enable() before guest enter
> so the CPU doesn't appear to the rest of the world that it has
> interrupt disabled while it's in the guest.

I don't think I understand. The patch adds kvmppc_lazy_ee_enable() right before guest entry here, no?

> 
>> @@ -1066,8 +1062,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>> #endif
>> 	ulong ext_msr;
>> 
>> -	preempt_disable();
>> -
>> 	/* Check if we can run the vcpu at all */
>> 	if (!vcpu->arch.sane) {
>> 		kvm_run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
>> @@ -1081,9 +1075,9 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>> 	 * really did time things so badly, then we just exit again due to
>> 	 * a host external interrupt.
>> 	 */
>> -	__hard_irq_disable();
>> +	local_irq_disable();
>> 	if (kvmppc_prepare_to_enter(vcpu)) {
>> -		__hard_irq_enable();
>> +		local_irq_enable();
>> 		kvm_run->exit_reason = KVM_EXIT_INTR;
>> 		ret = -EINTR;
>> 		goto out;
>> @@ -1122,7 +1116,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>> 	if (vcpu->arch.shared->msr & MSR_FP)
>> 		kvmppc_handle_ext(vcpu, BOOK3S_INTERRUPT_FP_UNAVAIL, MSR_FP);
>> 
>> -	kvm_guest_enter();
>> +	kvmppc_lazy_ee_enable();
> 
> Same. BTW, why do you have two enter paths? Smells like a recipe for
> disaster :-)

Because this way we can keep r14-r31 in registers ;). But here too we call kvmppc_lazy_ee_enable() right before going into guest context, no?

I can't shake off the feeling I don't fully understand your comments :)


Alex

> 
> 	ret = __kvmppc_vcpu_run(kvm_run, vcpu);
>> 
>> @@ -1157,7 +1151,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>> 
>> out:
>> 	vcpu->mode = OUTSIDE_GUEST_MODE;
>> -	preempt_enable();
>> 	return ret;
>> }
>> 
>> diff --git a/arch/powerpc/kvm/book3s_rmhandlers.S b/arch/powerpc/kvm/book3s_rmhandlers.S
>> index 9ecf6e3..b2f8258 100644
>> --- a/arch/powerpc/kvm/book3s_rmhandlers.S
>> +++ b/arch/powerpc/kvm/book3s_rmhandlers.S
>> @@ -170,20 +170,21 @@ kvmppc_handler_skip_ins:
>>  * Call kvmppc_handler_trampoline_enter in real mode
>>  *
>>  * On entry, r4 contains the guest shadow MSR
>> + * MSR.EE has to be 0 when calling this function
>>  */
>> _GLOBAL(kvmppc_entry_trampoline)
>> 	mfmsr	r5
>> 	LOAD_REG_ADDR(r7, kvmppc_handler_trampoline_enter)
>> 	toreal(r7)
>> 
>> -	li	r9, MSR_RI
>> -	ori	r9, r9, MSR_EE
>> -	andc	r9, r5, r9	/* Clear EE and RI in MSR value */
>> 	li	r6, MSR_IR | MSR_DR
>> -	ori	r6, r6, MSR_EE
>> -	andc	r6, r5, r6	/* Clear EE, DR and IR in MSR value */
>> -	MTMSR_EERI(r9)		/* Clear EE and RI in MSR */
>> -	mtsrr0	r7		/* before we set srr0/1 */
>> +	andc	r6, r5, r6	/* Clear DR and IR in MSR value */
>> +	/*
>> +	 * Set EE in HOST_MSR so that it's enabled when we get into our
>> +	 * C exit handler function
>> +	 */
>> +	ori	r5, r5, MSR_EE
>> +	mtsrr0	r7
>> 	mtsrr1	r6
>> 	RFI
>> 
>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
>> index aae535f..2bd190c 100644
>> --- a/arch/powerpc/kvm/booke.c
>> +++ b/arch/powerpc/kvm/booke.c
>> @@ -486,6 +486,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>> 		ret = -EINTR;
>> 		goto out;
>> 	}
>> +	kvmppc_lazy_ee_enable();
>> 
>> 	kvm_guest_enter();
> 
> Same.
> 
>> @@ -955,6 +956,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>> 		} else {
>> 			/* Going back to guest */
>> 			kvm_guest_enter();
>> +			kvmppc_lazy_ee_enable();
>> 		}
>> 	}
> 
> Same.
> 
>> 
>> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
>> index 053bfef..545c183 100644
>> --- a/arch/powerpc/kvm/powerpc.c
>> +++ b/arch/powerpc/kvm/powerpc.c
>> @@ -30,6 +30,7 @@
>> #include <asm/kvm_ppc.h>
>> #include <asm/tlbflush.h>
>> #include <asm/cputhreads.h>
>> +#include <asm/irqflags.h>
>> #include "timing.h"
>> #include "../mm/mmu_decl.h"
>> 
>> @@ -93,6 +94,19 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
>> 			break;
>> 		}
>> 
>> +#ifdef CONFIG_PPC64
>> +		/* lazy EE magic */
>> +		hard_irq_disable();
>> +		if (lazy_irq_pending()) {
>> +			/* Got an interrupt in between, try again */
>> +			local_irq_enable();
>> +			local_irq_disable();
>> +			continue;
>> +		}
>> +
>> +		trace_hardirqs_on();
>> +#endif
> 
> And move the trace out as I mentioned.
> 
> Cheers,
> Ben.
> 
>> 		/* Going into guest context! Yay! */
>> 		vcpu->mode = IN_GUEST_MODE;
>> 		smp_wmb();
> 
> 

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 29/38] KVM: PPC: Book3S: PR: Rework irq disabling
@ 2012-09-28  0:52       ` Alexander Graf
  0 siblings, 0 replies; 150+ messages in thread
From: Alexander Graf @ 2012-09-28  0:52 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: kvm-ppc, KVM list


On 17.08.2012, at 23:47, Benjamin Herrenschmidt wrote:

>> 
>> +/* Please call after prepare_to_enter. This function puts the lazy ee state
>> +   back to normal mode, without actually enabling interrupts. */
>> +static inline void kvmppc_lazy_ee_enable(void)
>> +{
>> +#ifdef CONFIG_PPC64
>> +	/* Only need to enable IRQs by hard enabling them after this */
>> +	local_paca->irq_happened = 0;
>> +	local_paca->soft_enabled = 1;
>> +#endif
>> +}
> 
> Smells like the above is the right spot for trace_hardirqs_on() and:

Hrm. Ok :).

> 
>> -		__hard_irq_disable();
>> +		local_irq_disable();
>> 		if (kvmppc_prepare_to_enter(vcpu)) {
>> -			/* local_irq_enable(); */
>> +			local_irq_enable();
>> 			run->exit_reason = KVM_EXIT_INTR;
>> 			r = -EINTR;
>> 		} else {
>> 			/* Going back to guest */
>> 			kvm_guest_enter();
>> +			kvmppc_lazy_ee_enable();
>> 		}
>> 	}
> 
> You should probably do kvmppc_lazy_ee_enable() before guest enter
> so the CPU doesn't appear to the rest of the world that it has
> interrupt disabled while it's in the guest.

I don't think I understand. The patch adds kvmppc_lazy_ee_enable() right before guest entry here, no?

> 
>> @@ -1066,8 +1062,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>> #endif
>> 	ulong ext_msr;
>> 
>> -	preempt_disable();
>> -
>> 	/* Check if we can run the vcpu at all */
>> 	if (!vcpu->arch.sane) {
>> 		kvm_run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
>> @@ -1081,9 +1075,9 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>> 	 * really did time things so badly, then we just exit again due to
>> 	 * a host external interrupt.
>> 	 */
>> -	__hard_irq_disable();
>> +	local_irq_disable();
>> 	if (kvmppc_prepare_to_enter(vcpu)) {
>> -		__hard_irq_enable();
>> +		local_irq_enable();
>> 		kvm_run->exit_reason = KVM_EXIT_INTR;
>> 		ret = -EINTR;
>> 		goto out;
>> @@ -1122,7 +1116,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>> 	if (vcpu->arch.shared->msr & MSR_FP)
>> 		kvmppc_handle_ext(vcpu, BOOK3S_INTERRUPT_FP_UNAVAIL, MSR_FP);
>> 
>> -	kvm_guest_enter();
>> +	kvmppc_lazy_ee_enable();
> 
> Same. BTW, why do you have two enter path ? Smells like a recipe for
> disaster :-)

Because this way we can keep r14-r31 in registers ;). But here too we call kvmppc_lazy_ee_enable() right before going into guest context, no?

I can't shake off the feeling I don't fully understand your comments :)


Alex

> 
>> 	ret = __kvmppc_vcpu_run(kvm_run, vcpu);
>> 
>> @@ -1157,7 +1151,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>> 
>> out:
>> 	vcpu->mode = OUTSIDE_GUEST_MODE;
>> -	preempt_enable();
>> 	return ret;
>> }
>> 
>> diff --git a/arch/powerpc/kvm/book3s_rmhandlers.S b/arch/powerpc/kvm/book3s_rmhandlers.S
>> index 9ecf6e3..b2f8258 100644
>> --- a/arch/powerpc/kvm/book3s_rmhandlers.S
>> +++ b/arch/powerpc/kvm/book3s_rmhandlers.S
>> @@ -170,20 +170,21 @@ kvmppc_handler_skip_ins:
>>  * Call kvmppc_handler_trampoline_enter in real mode
>>  *
>>  * On entry, r4 contains the guest shadow MSR
>> + * MSR.EE has to be 0 when calling this function
>>  */
>> _GLOBAL(kvmppc_entry_trampoline)
>> 	mfmsr	r5
>> 	LOAD_REG_ADDR(r7, kvmppc_handler_trampoline_enter)
>> 	toreal(r7)
>> 
>> -	li	r9, MSR_RI
>> -	ori	r9, r9, MSR_EE
>> -	andc	r9, r5, r9	/* Clear EE and RI in MSR value */
>> 	li	r6, MSR_IR | MSR_DR
>> -	ori	r6, r6, MSR_EE
>> -	andc	r6, r5, r6	/* Clear EE, DR and IR in MSR value */
>> -	MTMSR_EERI(r9)		/* Clear EE and RI in MSR */
>> -	mtsrr0	r7		/* before we set srr0/1 */
>> +	andc	r6, r5, r6	/* Clear DR and IR in MSR value */
>> +	/*
>> +	 * Set EE in HOST_MSR so that it's enabled when we get into our
>> +	 * C exit handler function
>> +	 */
>> +	ori	r5, r5, MSR_EE
>> +	mtsrr0	r7
>> 	mtsrr1	r6
>> 	RFI
>> 
>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
>> index aae535f..2bd190c 100644
>> --- a/arch/powerpc/kvm/booke.c
>> +++ b/arch/powerpc/kvm/booke.c
>> @@ -486,6 +486,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>> 		ret = -EINTR;
>> 		goto out;
>> 	}
>> +	kvmppc_lazy_ee_enable();
>> 
>> 	kvm_guest_enter();
> 
> Same.
> 
>> @@ -955,6 +956,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>> 		} else {
>> 			/* Going back to guest */
>> 			kvm_guest_enter();
>> +			kvmppc_lazy_ee_enable();
>> 		}
>> 	}
> 
> Same.
> 
>> 
>> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
>> index 053bfef..545c183 100644
>> --- a/arch/powerpc/kvm/powerpc.c
>> +++ b/arch/powerpc/kvm/powerpc.c
>> @@ -30,6 +30,7 @@
>> #include <asm/kvm_ppc.h>
>> #include <asm/tlbflush.h>
>> #include <asm/cputhreads.h>
>> +#include <asm/irqflags.h>
>> #include "timing.h"
>> #include "../mm/mmu_decl.h"
>> 
>> @@ -93,6 +94,19 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
>> 			break;
>> 		}
>> 
>> +#ifdef CONFIG_PPC64
>> +		/* lazy EE magic */
>> +		hard_irq_disable();
>> +		if (lazy_irq_pending()) {
>> +			/* Got an interrupt in between, try again */
>> +			local_irq_enable();
>> +			local_irq_disable();
>> +			continue;
>> +		}
>> +
>> +		trace_hardirqs_on();
>> +#endif
> 
> And move the trace out as I mentioned.
> 
> Cheers,
> Ben.
> 
>> 		/* Going into guest context! Yay! */
>> 		vcpu->mode = IN_GUEST_MODE;
>> 		smp_wmb();
> 
> 



end of thread, other threads:[~2012-09-28  0:52 UTC | newest]

Thread overview: 150+ messages
2012-08-14 23:04 [PULL 00/38] ppc patch queue 2012-08-15 Alexander Graf
2012-08-14 23:04 ` [PATCH 01/38] PPC: epapr: create define for return code value of success Alexander Graf
2012-08-14 23:04 ` [PATCH 02/38] KVM: PPC: use definitions in epapr header for hcalls Alexander Graf
2012-08-14 23:04 ` [PATCH 03/38] KVM: PPC: add pvinfo for hcall opcodes on e500mc/e5500 Alexander Graf
2012-08-14 23:04 ` [PATCH 04/38] KVM: PPC: Add support for ePAPR idle hcall in host kernel Alexander Graf
2012-08-14 23:04 ` [PATCH 05/38] KVM: PPC: ev_idle hcall support for e500 guests Alexander Graf
2012-08-14 23:04 ` [PATCH 06/38] PPC: select EPAPR_PARAVIRT for all users of epapr hcalls Alexander Graf
2012-08-14 23:04 ` [PATCH 07/38] powerpc/fsl-soc: use CONFIG_EPAPR_PARAVIRT for hcalls Alexander Graf
2012-08-14 23:04 ` [PATCH 08/38] PPC: Don't use hardcoded opcode for ePAPR hcall invocation Alexander Graf
2012-08-14 23:04 ` [PATCH 09/38] KVM: PPC: PR: Use generic tracepoint for guest exit Alexander Graf
2012-08-14 23:04 ` [PATCH 10/38] KVM: PPC: Expose SYNC cap based on mmu notifiers Alexander Graf
2012-08-14 23:04 ` [PATCH 11/38] KVM: PPC: BookE: Expose remote TLB flushes in debugfs Alexander Graf
2012-08-14 23:04 ` [PATCH 12/38] KVM: PPC: E500: Fix clear_tlb_refs Alexander Graf
2012-08-14 23:04 ` [PATCH 13/38] KVM: PPC: Book3S HV: Fix incorrect branch in H_CEDE code Alexander Graf
2012-08-14 23:04 ` [PATCH 14/38] KVM: PPC: Quieten message about allocating linear regions Alexander Graf
2012-08-14 23:04 ` [PATCH 15/38] powerpc/epapr: export epapr_hypercall_start Alexander Graf
2012-08-14 23:04 ` [PATCH 16/38] KVM: PPC: BookE: Add check_requests helper function Alexander Graf
2012-08-15  0:10   ` Scott Wood
2012-08-15  0:13     ` Alexander Graf
2012-08-15  0:20       ` Scott Wood
2012-08-15 18:28     ` Marcelo Tosatti
2012-08-14 23:04 ` [PATCH 17/38] KVM: PPC: BookE: Add support for vcpu->mode Alexander Graf
2012-08-15  0:17   ` Scott Wood
2012-08-15  0:26     ` Alexander Graf
2012-08-15  1:17       ` Scott Wood
2012-08-15  9:29         ` Alexander Graf
2012-08-21  1:41           ` Scott Wood
2012-08-15  1:25   ` Scott Wood
2012-08-14 23:04 ` [PATCH 18/38] KVM: PPC: E500: Implement MMU notifiers Alexander Graf
2012-08-15  1:20   ` Scott Wood
2012-08-15  9:38     ` Alexander Graf
2012-08-14 23:04 ` [PATCH 19/38] KVM: PPC: Add cache flush on page map Alexander Graf
2012-08-15  1:23   ` Scott Wood
2012-08-15  9:52     ` Alexander Graf
2012-08-15 17:26       ` Scott Wood
2012-08-15 17:27         ` Alexander Graf
2012-08-15 17:47           ` Scott Wood
2012-08-15 18:01             ` Alexander Graf
2012-08-15 18:16               ` Scott Wood
2012-08-15 18:27                 ` Alexander Graf
2012-08-15 18:29                   ` Alexander Graf
2012-08-15 18:33                     ` Scott Wood
2012-08-15 18:51                       ` Alexander Graf
2012-08-15 18:56                         ` Scott Wood
2012-08-15 18:58                           ` Alexander Graf
2012-08-15 19:05                             ` Scott Wood
2012-08-15 19:29                               ` Alexander Graf
2012-08-15 19:53                                 ` Scott Wood
2012-08-14 23:04 ` [PATCH 20/38] KVM: PPC: BookE: Add some more trace points Alexander Graf
2012-08-14 23:04 ` [PATCH 21/38] KVM: PPC: BookE: No duplicate request != 0 check Alexander Graf
2012-08-14 23:04 ` [PATCH 22/38] KVM: PPC: Use same kvmppc_prepare_to_enter code for booke and book3s_pr Alexander Graf
2012-08-14 23:04 ` [PATCH 23/38] KVM: PPC: Book3s: PR: Add (dumb) MMU Notifier support Alexander Graf
2012-08-14 23:04 ` [PATCH 24/38] KVM: PPC: BookE: Drop redundant vcpu->mode set Alexander Graf
2012-08-14 23:04 ` [PATCH 25/38] KVM: PPC: Book3S: PR: Only do resched check once per exit Alexander Graf
2012-08-14 23:04 ` [PATCH 26/38] KVM: PPC: Exit guest context while handling exit Alexander Graf
2012-08-14 23:04 ` [PATCH 27/38] KVM: PPC: Book3S: PR: Indicate we're out of guest mode Alexander Graf
2012-08-14 23:04 ` [PATCH 28/38] KVM: PPC: Consistentify vcpu exit path Alexander Graf
2012-08-14 23:04 ` [PATCH 29/38] KVM: PPC: Book3S: PR: Rework irq disabling Alexander Graf
2012-08-17 21:47   ` Benjamin Herrenschmidt
2012-09-28  0:52     ` Alexander Graf
2012-08-14 23:04 ` [PATCH 30/38] KVM: PPC: Move kvm_guest_enter call into generic code Alexander Graf
2012-08-14 23:04 ` [PATCH 31/38] KVM: PPC: Ignore EXITING_GUEST_MODE mode Alexander Graf
2012-08-14 23:04 ` [PATCH 32/38] KVM: PPC: Add return value in prepare_to_enter Alexander Graf
2012-08-14 23:04 ` [PATCH 33/38] KVM: PPC: Add return value to core_check_requests Alexander Graf
2012-08-14 23:04 ` [PATCH 34/38] KVM: PPC: booke: Add watchdog emulation Alexander Graf
2012-08-14 23:04 ` [PATCH 35/38] booke: Added ONE_REG interface for IAC/DAC debug registers Alexander Graf
2012-08-14 23:44   ` Scott Wood
2012-08-14 23:47     ` Alexander Graf
2012-08-15  0:06       ` Scott Wood
2012-08-14 23:04 ` [PATCH 36/38] KVM: PPC: 44x: Initialize PVR Alexander Graf
2012-08-14 23:04 ` [PATCH 37/38] KVM: PPC: BookE: Add MCSR SPR support Alexander Graf
2012-08-14 23:04 ` [PATCH 38/38] ppc: e500_tlb memset clears nothing Alexander Graf
2012-08-15 10:07   ` Avi Kivity
2012-08-15 10:09     ` Alexander Graf
2012-08-15 10:10       ` Avi Kivity
