All of lore.kernel.org
* [PATCH v2 0/4] KVM: PPC: Read guest instruction from kvmppc_get_last_inst()
@ 2014-05-01  0:39 ` Mihai Caraman
  0 siblings, 0 replies; 54+ messages in thread
From: Mihai Caraman @ 2014-05-01  0:39 UTC (permalink / raw)
  To: kvm-ppc; +Cc: Mihai Caraman, linuxppc-dev, kvm

Read the guest's last instruction through kvmppc_get_last_inst(), allowing the
function to fail so that emulation can be retried. On the bookehv architecture,
look up the physical address and kmap it instead of using the Load External PID
(lwepx) instruction. This fixes an infinite loop caused by lwepx's data TLB
miss exception being handled in the host, and resolves the TODO for
execute-but-not-read entries and TLB eviction.

Mihai Caraman (4):
  KVM: PPC: e500mc: Revert "add load inst fixup"
  KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1
  KVM: PPC: Alow kvmppc_get_last_inst() to fail
  KVM: PPC: Bookehv: Get vcpu's last instruction for emulation

 arch/powerpc/include/asm/kvm_book3s.h    | 26 ---------
 arch/powerpc/include/asm/kvm_booke.h     |  5 --
 arch/powerpc/include/asm/kvm_ppc.h       |  2 +
 arch/powerpc/include/asm/mmu-book3e.h    |  7 ++-
 arch/powerpc/kvm/book3s.c                | 32 +++++++++++
 arch/powerpc/kvm/book3s.h                |  1 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c      | 16 ++----
 arch/powerpc/kvm/book3s_paired_singles.c | 42 ++++++++------
 arch/powerpc/kvm/book3s_pr.c             | 36 +++++++-----
 arch/powerpc/kvm/booke.c                 | 14 +++++
 arch/powerpc/kvm/booke.h                 |  3 +
 arch/powerpc/kvm/bookehv_interrupts.S    | 54 ++----------------
 arch/powerpc/kvm/e500_mmu_host.c         | 98 ++++++++++++++++++++++++++++++++
 arch/powerpc/kvm/emulate.c               | 18 ++++--
 arch/powerpc/kvm/powerpc.c               | 10 +++-
 15 files changed, 235 insertions(+), 129 deletions(-)

-- 
1.7.11.7

_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

^ permalink raw reply	[flat|nested] 54+ messages in thread


* [PATCH v2 0/4] KVM: PPC: Read guest instruction from kvmppc_get_last_inst()
@ 2014-05-01  0:45 ` Mihai Caraman
  0 siblings, 0 replies; 54+ messages in thread
From: Mihai Caraman @ 2014-05-01  0:45 UTC (permalink / raw)
  To: kvm-ppc; +Cc: Mihai Caraman, linuxppc-dev, kvm

Read the guest's last instruction through kvmppc_get_last_inst(), allowing the
function to fail so that emulation can be retried. On the bookehv architecture,
look up the physical address and kmap it instead of using the Load External PID
(lwepx) instruction. This fixes an infinite loop caused by lwepx's data TLB
miss exception being handled in the host, and resolves the TODO for
execute-but-not-read entries and TLB eviction.

Mihai Caraman (4):
  KVM: PPC: e500mc: Revert "add load inst fixup"
  KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1
  KVM: PPC: Alow kvmppc_get_last_inst() to fail
  KVM: PPC: Bookehv: Get vcpu's last instruction for emulation

 arch/powerpc/include/asm/kvm_book3s.h    | 26 ---------
 arch/powerpc/include/asm/kvm_booke.h     |  5 --
 arch/powerpc/include/asm/kvm_ppc.h       |  2 +
 arch/powerpc/include/asm/mmu-book3e.h    |  7 ++-
 arch/powerpc/kvm/book3s.c                | 32 +++++++++++
 arch/powerpc/kvm/book3s.h                |  1 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c      | 16 ++----
 arch/powerpc/kvm/book3s_paired_singles.c | 42 ++++++++------
 arch/powerpc/kvm/book3s_pr.c             | 36 +++++++-----
 arch/powerpc/kvm/booke.c                 | 14 +++++
 arch/powerpc/kvm/booke.h                 |  3 +
 arch/powerpc/kvm/bookehv_interrupts.S    | 54 ++----------------
 arch/powerpc/kvm/e500_mmu_host.c         | 98 ++++++++++++++++++++++++++++++++
 arch/powerpc/kvm/emulate.c               | 18 ++++--
 arch/powerpc/kvm/powerpc.c               | 10 +++-
 15 files changed, 235 insertions(+), 129 deletions(-)

-- 
1.7.11.7




* [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst fixup"
  2014-05-01  0:45 ` Mihai Caraman
  (?)
@ 2014-05-01  0:45   ` Mihai Caraman
  -1 siblings, 0 replies; 54+ messages in thread
From: Mihai Caraman @ 2014-05-01  0:45 UTC (permalink / raw)
  To: kvm-ppc; +Cc: kvm, linuxppc-dev, Mihai Caraman

Commit 1d628af7 "add load inst fixup" attempted to handle failures generated
by reading the guest's current instruction. The fixup code it added works only
by chance, hiding the real issue.

The load external pid (lwepx) instruction, used by KVM to read guest
instructions, is executed in a substituted guest translation context
(EPLC[EGS] = 1). As a consequence, lwepx's TLB error and data storage
interrupts need to be handled by KVM, even though these interrupts are
generated from host context (MSR[GS] = 0).

Currently, KVM hooks only interrupts generated from guest context
(MSR[GS] = 1), doing minimal checks on the fast path to avoid host performance
degradation. As a result, the host kernel handles lwepx faults by searching
for the faulting guest data address (loaded in DEAR) in its own Logical
Partition ID (LPID) 0 context. If a host translation is found, execution
returns to the lwepx instruction instead of the fixup, leaving the host in an
infinite loop.

Revert the commit "add load inst fixup". The lwepx issue will be addressed in
a subsequent patch without the need for fixup code.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
---
v2:
 - reworked patch description

 arch/powerpc/kvm/bookehv_interrupts.S | 25 +------------------------
 1 file changed, 1 insertion(+), 24 deletions(-)

diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S
index a1712b8..925da71 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -164,32 +164,9 @@
 	PPC_STL	r30, VCPU_GPR(R30)(r4)
 	PPC_STL	r31, VCPU_GPR(R31)(r4)
 	mtspr	SPRN_EPLC, r8
-
-	/* disable preemption, so we are sure we hit the fixup handler */
-	CURRENT_THREAD_INFO(r8, r1)
-	li	r7, 1
-	stw	r7, TI_PREEMPT(r8)
-
 	isync
-
-	/*
-	 * In case the read goes wrong, we catch it and write an invalid value
-	 * in LAST_INST instead.
-	 */
-1:	lwepx	r9, 0, r5
-2:
-.section .fixup, "ax"
-3:	li	r9, KVM_INST_FETCH_FAILED
-	b	2b
-.previous
-.section __ex_table,"a"
-	PPC_LONG_ALIGN
-	PPC_LONG 1b,3b
-.previous
-
+	lwepx   r9, 0, r5
 	mtspr	SPRN_EPLC, r3
-	li	r7, 0
-	stw	r7, TI_PREEMPT(r8)
 	stw	r9, VCPU_LAST_INST(r4)
 	.endif
 
-- 
1.7.11.7



* [PATCH v2 2/4] KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1
  2014-05-01  0:45 ` Mihai Caraman
  (?)
@ 2014-05-01  0:45   ` Mihai Caraman
  -1 siblings, 0 replies; 54+ messages in thread
From: Mihai Caraman @ 2014-05-01  0:45 UTC (permalink / raw)
  To: kvm-ppc; +Cc: kvm, linuxppc-dev, Mihai Caraman

Add defines MAS0_GET_TLBSEL() and MAS1_GET_TSIZE() to Book3E.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
---
v2:
 - no change

 arch/powerpc/include/asm/mmu-book3e.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/mmu-book3e.h b/arch/powerpc/include/asm/mmu-book3e.h
index 901dac6..60a949a 100644
--- a/arch/powerpc/include/asm/mmu-book3e.h
+++ b/arch/powerpc/include/asm/mmu-book3e.h
@@ -40,7 +40,11 @@
 
 /* MAS registers bit definitions */
 
-#define MAS0_TLBSEL(x)		(((x) << 28) & 0x30000000)
+#define MAS0_TLBSEL_MASK	0x30000000
+#define MAS0_TLBSEL_SHIFT	28
+#define MAS0_TLBSEL(x)		(((x) << MAS0_TLBSEL_SHIFT) & MAS0_TLBSEL_MASK)
+#define MAS0_GET_TLBSEL(mas0)	(((mas0) & MAS0_TLBSEL_MASK) >> \
+			MAS0_TLBSEL_SHIFT)
 #define MAS0_ESEL_MASK		0x0FFF0000
 #define MAS0_ESEL_SHIFT		16
 #define MAS0_ESEL(x)		(((x) << MAS0_ESEL_SHIFT) & MAS0_ESEL_MASK)
@@ -58,6 +62,7 @@
 #define MAS1_TSIZE_MASK		0x00000f80
 #define MAS1_TSIZE_SHIFT	7
 #define MAS1_TSIZE(x)		(((x) << MAS1_TSIZE_SHIFT) & MAS1_TSIZE_MASK)
+#define MAS1_GET_TSIZE(mas1)	(((mas1) & MAS1_TSIZE_MASK) >> MAS1_TSIZE_SHIFT)
 
 #define MAS2_EPN		(~0xFFFUL)
 #define MAS2_X0			0x00000040
-- 
1.7.11.7



* [PATCH v2 3/4] KVM: PPC: Alow kvmppc_get_last_inst() to fail
  2014-05-01  0:45 ` Mihai Caraman
  (?)
@ 2014-05-01  0:45   ` Mihai Caraman
  -1 siblings, 0 replies; 54+ messages in thread
From: Mihai Caraman @ 2014-05-01  0:45 UTC (permalink / raw)
  To: kvm-ppc; +Cc: kvm, linuxppc-dev, Mihai Caraman

On book3e, the guest's last instruction was read on the exit path using the
dedicated load external pid (lwepx) instruction. lwepx failures have to be
handled by KVM, and this would require additional checks in the DO_KVM hooks
(besides MSR[GS] = 1). However, extra checks on the host fast path are
commonly considered too intrusive.

This patch lays the groundwork for an alternative solution by allowing
kvmppc_get_last_inst() to fail. Booke architectures may choose to read the
guest instruction from kvmppc_ld_inst() and to instruct the emulation layer
either to fail or to emulate again.

The emulation_result enum definition is not accessible from book3s asm
headers. Move the kvmppc_get_last_inst() definitions that require
emulation_result to source files.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
---
v2:
 - integrated kvmppc_get_last_inst() in book3s code and checked build
 - addressed cosmetic feedback
 - please validate book3s functionality!

 arch/powerpc/include/asm/kvm_book3s.h    | 26 --------------------
 arch/powerpc/include/asm/kvm_booke.h     |  5 ----
 arch/powerpc/include/asm/kvm_ppc.h       |  2 ++
 arch/powerpc/kvm/book3s.c                | 32 ++++++++++++++++++++++++
 arch/powerpc/kvm/book3s.h                |  1 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c      | 16 +++---------
 arch/powerpc/kvm/book3s_paired_singles.c | 42 ++++++++++++++++++++------------
 arch/powerpc/kvm/book3s_pr.c             | 36 +++++++++++++++++----------
 arch/powerpc/kvm/booke.c                 | 14 +++++++++++
 arch/powerpc/kvm/booke.h                 |  3 +++
 arch/powerpc/kvm/e500_mmu_host.c         |  7 ++++++
 arch/powerpc/kvm/emulate.c               | 18 +++++++++-----
 arch/powerpc/kvm/powerpc.c               | 10 ++++++--
 13 files changed, 132 insertions(+), 80 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index bb1e38a..e2a89f3 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -273,32 +273,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
 	return (vcpu->arch.shared->msr & MSR_LE) != (MSR_KERNEL & MSR_LE);
 }
 
-static inline u32 kvmppc_get_last_inst_internal(struct kvm_vcpu *vcpu, ulong pc)
-{
-	/* Load the instruction manually if it failed to do so in the
-	 * exit path */
-	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
-		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
-
-	return kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) :
-		vcpu->arch.last_inst;
-}
-
-static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
-{
-	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu));
-}
-
-/*
- * Like kvmppc_get_last_inst(), but for fetching a sc instruction.
- * Because the sc instruction sets SRR0 to point to the following
- * instruction, we have to fetch from pc - 4.
- */
-static inline u32 kvmppc_get_last_sc(struct kvm_vcpu *vcpu)
-{
-	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu) - 4);
-}
-
 static inline ulong kvmppc_get_fault_dar(struct kvm_vcpu *vcpu)
 {
 	return vcpu->arch.fault_dar;
diff --git a/arch/powerpc/include/asm/kvm_booke.h b/arch/powerpc/include/asm/kvm_booke.h
index 80d46b5..6db1ca0 100644
--- a/arch/powerpc/include/asm/kvm_booke.h
+++ b/arch/powerpc/include/asm/kvm_booke.h
@@ -69,11 +69,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
 	return false;
 }
 
-static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
-{
-	return vcpu->arch.last_inst;
-}
-
 static inline void kvmppc_set_ctr(struct kvm_vcpu *vcpu, ulong val)
 {
 	vcpu->arch.ctr = val;
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 4096f16..6e7c358 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -72,6 +72,8 @@ extern int kvmppc_sanity_check(struct kvm_vcpu *vcpu);
 extern int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu);
 extern void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu);
 
+extern int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst);
+
 /* Core-specific hooks */
 
 extern void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 gvaddr, gpa_t gpaddr,
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 94e597e..3553735 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -463,6 +463,38 @@ mmio:
 }
 EXPORT_SYMBOL_GPL(kvmppc_ld);
 
+int kvmppc_get_last_inst_internal(struct kvm_vcpu *vcpu, ulong pc, u32 *inst)
+{
+	int ret = EMULATE_DONE;
+
+	/* Load the instruction manually if it failed to do so in the
+	 * exit path */
+	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
+		ret = kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst,
+			false);
+
+	*inst = kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) :
+		vcpu->arch.last_inst;
+
+	return ret;
+}
+
+int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst)
+{
+	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu), inst);
+}
+
+/*
+ * Like kvmppc_get_last_inst(), but for fetching a sc instruction.
+ * Because the sc instruction sets SRR0 to point to the following
+ * instruction, we have to fetch from pc - 4.
+ */
+int kvmppc_get_last_sc(struct kvm_vcpu *vcpu, u32 *inst)
+{
+	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu) - 4,
+		inst);
+}
+
 int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
 {
 	return 0;
diff --git a/arch/powerpc/kvm/book3s.h b/arch/powerpc/kvm/book3s.h
index 4bf956c..d85839d 100644
--- a/arch/powerpc/kvm/book3s.h
+++ b/arch/powerpc/kvm/book3s.h
@@ -30,5 +30,6 @@ extern int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu,
 					int sprn, ulong *spr_val);
 extern int kvmppc_book3s_init_pr(void);
 extern void kvmppc_book3s_exit_pr(void);
+extern int kvmppc_get_last_sc(struct kvm_vcpu *vcpu, u32 *inst);
 
 #endif
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index fb25ebc..7ffcc24 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -541,21 +541,13 @@ static int instruction_is_store(unsigned int instr)
 static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
 				  unsigned long gpa, gva_t ea, int is_store)
 {
-	int ret;
 	u32 last_inst;
-	unsigned long srr0 = kvmppc_get_pc(vcpu);
 
-	/* We try to load the last instruction.  We don't let
-	 * emulate_instruction do it as it doesn't check what
-	 * kvmppc_ld returns.
+	/*
 	 * If we fail, we just return to the guest and try executing it again.
 	 */
-	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
-		ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
-		if (ret != EMULATE_DONE || last_inst == KVM_INST_FETCH_FAILED)
-			return RESUME_GUEST;
-		vcpu->arch.last_inst = last_inst;
-	}
+	if (kvmppc_get_last_inst(vcpu, &last_inst) != EMULATE_DONE)
+		return RESUME_GUEST;
 
 	/*
 	 * WARNING: We do not know for sure whether the instruction we just
@@ -569,7 +561,7 @@ static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	 * we just return and retry the instruction.
 	 */
 
-	if (instruction_is_store(kvmppc_get_last_inst(vcpu)) != !!is_store)
+	if (instruction_is_store(last_inst) != !!is_store)
 		return RESUME_GUEST;
 
 	/*
diff --git a/arch/powerpc/kvm/book3s_paired_singles.c b/arch/powerpc/kvm/book3s_paired_singles.c
index c1abd95..9a6e565 100644
--- a/arch/powerpc/kvm/book3s_paired_singles.c
+++ b/arch/powerpc/kvm/book3s_paired_singles.c
@@ -637,26 +637,36 @@ static int kvmppc_ps_one_in(struct kvm_vcpu *vcpu, bool rc,
 
 int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu *vcpu)
 {
-	u32 inst = kvmppc_get_last_inst(vcpu);
-	enum emulation_result emulated = EMULATE_DONE;
-
-	int ax_rd = inst_get_field(inst, 6, 10);
-	int ax_ra = inst_get_field(inst, 11, 15);
-	int ax_rb = inst_get_field(inst, 16, 20);
-	int ax_rc = inst_get_field(inst, 21, 25);
-	short full_d = inst_get_field(inst, 16, 31);
-
-	u64 *fpr_d = &VCPU_FPR(vcpu, ax_rd);
-	u64 *fpr_a = &VCPU_FPR(vcpu, ax_ra);
-	u64 *fpr_b = &VCPU_FPR(vcpu, ax_rb);
-	u64 *fpr_c = &VCPU_FPR(vcpu, ax_rc);
-
-	bool rcomp = (inst & 1) ? true : false;
-	u32 cr = kvmppc_get_cr(vcpu);
+	u32 inst;
+	enum emulation_result emulated;
+	int ax_rd, ax_ra, ax_rb, ax_rc;
+	short full_d;
+	u64 *fpr_d, *fpr_a, *fpr_b, *fpr_c;
+
+	bool rcomp;
+	u32 cr;
 #ifdef DEBUG
 	int i;
 #endif
 
+	emulated = kvmppc_get_last_inst(vcpu, &inst);
+	if (emulated != EMULATE_DONE)
+		return emulated;
+
+	ax_rd = inst_get_field(inst, 6, 10);
+	ax_ra = inst_get_field(inst, 11, 15);
+	ax_rb = inst_get_field(inst, 16, 20);
+	ax_rc = inst_get_field(inst, 21, 25);
+	full_d = inst_get_field(inst, 16, 31);
+
+	fpr_d = &VCPU_FPR(vcpu, ax_rd);
+	fpr_a = &VCPU_FPR(vcpu, ax_ra);
+	fpr_b = &VCPU_FPR(vcpu, ax_rb);
+	fpr_c = &VCPU_FPR(vcpu, ax_rc);
+
+	rcomp = (inst & 1) ? true : false;
+	cr = kvmppc_get_cr(vcpu);
+
 	if (!kvmppc_inst_is_paired_single(vcpu, inst))
 		return EMULATE_FAIL;
 
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index c5c052a..b7fffd1 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -608,12 +608,9 @@ void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr)
 
 static int kvmppc_read_inst(struct kvm_vcpu *vcpu)
 {
-	ulong srr0 = kvmppc_get_pc(vcpu);
-	u32 last_inst = kvmppc_get_last_inst(vcpu);
-	int ret;
+	u32 last_inst;
 
-	ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
-	if (ret == -ENOENT) {
+	if (kvmppc_get_last_inst(vcpu, &last_inst) == -ENOENT) {
 		ulong msr = vcpu->arch.shared->msr;
 
 		msr = kvmppc_set_field(msr, 33, 33, 1);
@@ -867,15 +864,18 @@ int kvmppc_handle_exit_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	{
 		enum emulation_result er;
 		ulong flags;
+		u32 last_inst;
 
 program_interrupt:
 		flags = vcpu->arch.shadow_srr1 & 0x1f0000ull;
+		kvmppc_get_last_inst(vcpu, &last_inst);
 
 		if (vcpu->arch.shared->msr & MSR_PR) {
 #ifdef EXIT_DEBUG
-			printk(KERN_INFO "Userspace triggered 0x700 exception at 0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu));
+			pr_info("Userspace triggered 0x700 exception at\n"
+			    "0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), last_inst);
 #endif
-			if ((kvmppc_get_last_inst(vcpu) & 0xff0007ff) !=
+			if ((last_inst & 0xff0007ff) !=
 			    (INS_DCBZ & 0xfffffff7)) {
 				kvmppc_core_queue_program(vcpu, flags);
 				r = RESUME_GUEST;
@@ -894,7 +894,7 @@ program_interrupt:
 			break;
 		case EMULATE_FAIL:
 			printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
-			       __func__, kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu));
+			       __func__, kvmppc_get_pc(vcpu), last_inst);
 			kvmppc_core_queue_program(vcpu, flags);
 			r = RESUME_GUEST;
 			break;
@@ -911,8 +911,12 @@ program_interrupt:
 		break;
 	}
 	case BOOK3S_INTERRUPT_SYSCALL:
+	{
+		u32 last_sc;
+
+		kvmppc_get_last_sc(vcpu, &last_sc);
 		if (vcpu->arch.papr_enabled &&
-		    (kvmppc_get_last_sc(vcpu) == 0x44000022) &&
+		    (last_sc == 0x44000022) &&
 		    !(vcpu->arch.shared->msr & MSR_PR)) {
 			/* SC 1 papr hypercalls */
 			ulong cmd = kvmppc_get_gpr(vcpu, 3);
@@ -957,6 +961,7 @@ program_interrupt:
 			r = RESUME_GUEST;
 		}
 		break;
+	}
 	case BOOK3S_INTERRUPT_FP_UNAVAIL:
 	case BOOK3S_INTERRUPT_ALTIVEC:
 	case BOOK3S_INTERRUPT_VSX:
@@ -985,15 +990,20 @@ program_interrupt:
 		break;
 	}
 	case BOOK3S_INTERRUPT_ALIGNMENT:
+	{
+		u32 last_inst;
+
 		if (kvmppc_read_inst(vcpu) == EMULATE_DONE) {
-			vcpu->arch.shared->dsisr = kvmppc_alignment_dsisr(vcpu,
-				kvmppc_get_last_inst(vcpu));
-			vcpu->arch.shared->dar = kvmppc_alignment_dar(vcpu,
-				kvmppc_get_last_inst(vcpu));
+			kvmppc_get_last_inst(vcpu, &last_inst);
+			vcpu->arch.shared->dsisr =
+				kvmppc_alignment_dsisr(vcpu, last_inst);
+			vcpu->arch.shared->dar =
+				kvmppc_alignment_dar(vcpu, last_inst);
 			kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
 		}
 		r = RESUME_GUEST;
 		break;
+	}
 	case BOOK3S_INTERRUPT_MACHINE_CHECK:
 	case BOOK3S_INTERRUPT_TRACE:
 		kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index ab62109..df25db0 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -752,6 +752,9 @@ static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
 		 * they were actually modified by emulation. */
 		return RESUME_GUEST_NV;
 
+	case EMULATE_AGAIN:
+		return RESUME_GUEST;
+
 	case EMULATE_DO_DCR:
 		run->exit_reason = KVM_EXIT_DCR;
 		return RESUME_HOST;
@@ -1911,6 +1914,17 @@ void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu)
 	vcpu->kvm->arch.kvm_ops->vcpu_put(vcpu);
 }
 
+int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst)
+{
+	int result = EMULATE_DONE;
+
+	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
+		result = kvmppc_ld_inst(vcpu, &vcpu->arch.last_inst);
+	*inst = vcpu->arch.last_inst;
+
+	return result;
+}
+
 int __init kvmppc_booke_init(void)
 {
 #ifndef CONFIG_KVM_BOOKE_HV
diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h
index b632cd3..d07a154 100644
--- a/arch/powerpc/kvm/booke.h
+++ b/arch/powerpc/kvm/booke.h
@@ -80,6 +80,9 @@ int kvmppc_booke_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
 int kvmppc_booke_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val);
 int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val);
 
+
+extern int kvmppc_ld_inst(struct kvm_vcpu *vcpu, u32 *instr);
+
 /* low-level asm code to transfer guest state */
 void kvmppc_load_guest_spe(struct kvm_vcpu *vcpu);
 void kvmppc_save_guest_spe(struct kvm_vcpu *vcpu);
diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index dd2cc03..fcccbb3 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -34,6 +34,7 @@
 #include "e500.h"
 #include "timing.h"
 #include "e500_mmu_host.h"
+#include "booke.h"
 
 #include "trace_booke.h"
 
@@ -606,6 +607,12 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr,
 	}
 }
 
+
+int kvmppc_ld_inst(struct kvm_vcpu *vcpu, u32 *instr)
+{
+	return EMULATE_FAIL;
+}
+
 /************* MMU Notifiers *************/
 
 int kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index c2b887b..e281d32 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -224,19 +224,25 @@ static int kvmppc_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, int rt)
  * from opcode tables in the future. */
 int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 {
-	u32 inst = kvmppc_get_last_inst(vcpu);
-	int ra = get_ra(inst);
-	int rs = get_rs(inst);
-	int rt = get_rt(inst);
-	int sprn = get_sprn(inst);
-	enum emulation_result emulated = EMULATE_DONE;
+	u32 inst;
+	int ra, rs, rt, sprn;
+	enum emulation_result emulated;
 	int advance = 1;
 
 	/* this default type might be overwritten by subcategories */
 	kvmppc_set_exit_type(vcpu, EMULATED_INST_EXITS);
 
+	emulated = kvmppc_get_last_inst(vcpu, &inst);
+	if (emulated != EMULATE_DONE)
+		return emulated;
+
 	pr_debug("Emulating opcode %d / %d\n", get_op(inst), get_xop(inst));
 
+	ra = get_ra(inst);
+	rs = get_rs(inst);
+	rt = get_rt(inst);
+	sprn = get_sprn(inst);
+
 	switch (get_op(inst)) {
 	case OP_TRAP:
 #ifdef CONFIG_PPC_BOOK3S
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 3cf541a..fdc1ee3 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -220,6 +220,9 @@ int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu)
 		 * actually modified. */
 		r = RESUME_GUEST_NV;
 		break;
+	case EMULATE_AGAIN:
+		r = RESUME_GUEST;
+		break;
 	case EMULATE_DO_MMIO:
 		run->exit_reason = KVM_EXIT_MMIO;
 		/* We must reload nonvolatiles because "update" load/store
@@ -229,11 +232,14 @@ int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu)
 		r = RESUME_HOST_NV;
 		break;
 	case EMULATE_FAIL:
+	{
+		u32 last_inst;
+		kvmppc_get_last_inst(vcpu, &last_inst);
 		/* XXX Deliver Program interrupt to guest. */
-		printk(KERN_EMERG "%s: emulation failed (%08x)\n", __func__,
-		       kvmppc_get_last_inst(vcpu));
+		pr_emerg("%s: emulation failed (%08x)\n", __func__, last_inst);
 		r = RESUME_HOST;
 		break;
+	}
 	default:
 		WARN_ON(1);
 		r = RESUME_GUEST;
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v2 3/4] KVM: PPC: Allow kvmppc_get_last_inst() to fail
@ 2014-05-01  0:45   ` Mihai Caraman
  0 siblings, 0 replies; 54+ messages in thread
From: Mihai Caraman @ 2014-05-01  0:45 UTC (permalink / raw)
  To: kvm-ppc; +Cc: Mihai Caraman, linuxppc-dev, kvm

On book3e, the guest's last instruction was read on the exit path using the
dedicated load-external-PID (lwepx) instruction. lwepx failures have to be
handled by KVM, and this would require additional checks in the DO_KVM hooks
(besides MSR[GS] = 1). However, extra checks on the host fast path are
commonly considered too intrusive.

This patch lays down the path for an alternative solution by allowing the
kvmppc_get_last_inst() function to fail. Booke architectures may choose
to read the guest instruction from kvmppc_ld_inst() and to instruct the
emulation layer either to fail or to emulate again.

The emulation_result enum definition is not accessible from the book3s asm
headers. Move the kvmppc_get_last_inst() definitions that require
emulation_result to source files.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
---
v2:
 - integrated kvmppc_get_last_inst() in book3s code and checked build
 - addressed cosmetic feedback
 - please validate book3s functionality!

 arch/powerpc/include/asm/kvm_book3s.h    | 26 --------------------
 arch/powerpc/include/asm/kvm_booke.h     |  5 ----
 arch/powerpc/include/asm/kvm_ppc.h       |  2 ++
 arch/powerpc/kvm/book3s.c                | 32 ++++++++++++++++++++++++
 arch/powerpc/kvm/book3s.h                |  1 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c      | 16 +++---------
 arch/powerpc/kvm/book3s_paired_singles.c | 42 ++++++++++++++++++++------------
 arch/powerpc/kvm/book3s_pr.c             | 36 +++++++++++++++++----------
 arch/powerpc/kvm/booke.c                 | 14 +++++++++++
 arch/powerpc/kvm/booke.h                 |  3 +++
 arch/powerpc/kvm/e500_mmu_host.c         |  7 ++++++
 arch/powerpc/kvm/emulate.c               | 18 +++++++++-----
 arch/powerpc/kvm/powerpc.c               | 10 ++++++--
 13 files changed, 132 insertions(+), 80 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index bb1e38a..e2a89f3 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -273,32 +273,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
 	return (vcpu->arch.shared->msr & MSR_LE) != (MSR_KERNEL & MSR_LE);
 }
 
-static inline u32 kvmppc_get_last_inst_internal(struct kvm_vcpu *vcpu, ulong pc)
-{
-	/* Load the instruction manually if it failed to do so in the
-	 * exit path */
-	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
-		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
-
-	return kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) :
-		vcpu->arch.last_inst;
-}
-
-static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
-{
-	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu));
-}
-
-/*
- * Like kvmppc_get_last_inst(), but for fetching a sc instruction.
- * Because the sc instruction sets SRR0 to point to the following
- * instruction, we have to fetch from pc - 4.
- */
-static inline u32 kvmppc_get_last_sc(struct kvm_vcpu *vcpu)
-{
-	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu) - 4);
-}
-
 static inline ulong kvmppc_get_fault_dar(struct kvm_vcpu *vcpu)
 {
 	return vcpu->arch.fault_dar;
diff --git a/arch/powerpc/include/asm/kvm_booke.h b/arch/powerpc/include/asm/kvm_booke.h
index 80d46b5..6db1ca0 100644
--- a/arch/powerpc/include/asm/kvm_booke.h
+++ b/arch/powerpc/include/asm/kvm_booke.h
@@ -69,11 +69,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
 	return false;
 }
 
-static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
-{
-	return vcpu->arch.last_inst;
-}
-
 static inline void kvmppc_set_ctr(struct kvm_vcpu *vcpu, ulong val)
 {
 	vcpu->arch.ctr = val;
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 4096f16..6e7c358 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -72,6 +72,8 @@ extern int kvmppc_sanity_check(struct kvm_vcpu *vcpu);
 extern int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu);
 extern void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu);
 
+extern int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst);
+
 /* Core-specific hooks */
 
 extern void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 gvaddr, gpa_t gpaddr,
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 94e597e..3553735 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -463,6 +463,38 @@ mmio:
 }
 EXPORT_SYMBOL_GPL(kvmppc_ld);
 
+int kvmppc_get_last_inst_internal(struct kvm_vcpu *vcpu, ulong pc, u32 *inst)
+{
+	int ret = EMULATE_DONE;
+
+	/* Load the instruction manually if it failed to do so in the
+	 * exit path */
+	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
+		ret = kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst,
+			false);
+
+	*inst = kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) :
+		vcpu->arch.last_inst;
+
+	return ret;
+}
+
+int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst)
+{
+	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu), inst);
+}
+
+/*
+ * Like kvmppc_get_last_inst(), but for fetching a sc instruction.
+ * Because the sc instruction sets SRR0 to point to the following
+ * instruction, we have to fetch from pc - 4.
+ */
+int kvmppc_get_last_sc(struct kvm_vcpu *vcpu, u32 *inst)
+{
+	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu) - 4,
+		inst);
+}
+
 int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
 {
 	return 0;
diff --git a/arch/powerpc/kvm/book3s.h b/arch/powerpc/kvm/book3s.h
index 4bf956c..d85839d 100644
--- a/arch/powerpc/kvm/book3s.h
+++ b/arch/powerpc/kvm/book3s.h
@@ -30,5 +30,6 @@ extern int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu,
 					int sprn, ulong *spr_val);
 extern int kvmppc_book3s_init_pr(void);
 extern void kvmppc_book3s_exit_pr(void);
+extern int kvmppc_get_last_sc(struct kvm_vcpu *vcpu, u32 *inst);
 
 #endif
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index fb25ebc..7ffcc24 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -541,21 +541,13 @@ static int instruction_is_store(unsigned int instr)
 static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
 				  unsigned long gpa, gva_t ea, int is_store)
 {
-	int ret;
 	u32 last_inst;
-	unsigned long srr0 = kvmppc_get_pc(vcpu);
 
-	/* We try to load the last instruction.  We don't let
-	 * emulate_instruction do it as it doesn't check what
-	 * kvmppc_ld returns.
+	/*
 	 * If we fail, we just return to the guest and try executing it again.
 	 */
-	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
-		ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
-		if (ret != EMULATE_DONE || last_inst == KVM_INST_FETCH_FAILED)
-			return RESUME_GUEST;
-		vcpu->arch.last_inst = last_inst;
-	}
+	if (kvmppc_get_last_inst(vcpu, &last_inst) != EMULATE_DONE)
+		return RESUME_GUEST;
 
 	/*
 	 * WARNING: We do not know for sure whether the instruction we just
@@ -569,7 +561,7 @@ static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	 * we just return and retry the instruction.
 	 */
 
-	if (instruction_is_store(kvmppc_get_last_inst(vcpu)) != !!is_store)
+	if (instruction_is_store(last_inst) != !!is_store)
 		return RESUME_GUEST;
 
 	/*
diff --git a/arch/powerpc/kvm/book3s_paired_singles.c b/arch/powerpc/kvm/book3s_paired_singles.c
index c1abd95..9a6e565 100644
--- a/arch/powerpc/kvm/book3s_paired_singles.c
+++ b/arch/powerpc/kvm/book3s_paired_singles.c
@@ -637,26 +637,36 @@ static int kvmppc_ps_one_in(struct kvm_vcpu *vcpu, bool rc,
 
 int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu *vcpu)
 {
-	u32 inst = kvmppc_get_last_inst(vcpu);
-	enum emulation_result emulated = EMULATE_DONE;
-
-	int ax_rd = inst_get_field(inst, 6, 10);
-	int ax_ra = inst_get_field(inst, 11, 15);
-	int ax_rb = inst_get_field(inst, 16, 20);
-	int ax_rc = inst_get_field(inst, 21, 25);
-	short full_d = inst_get_field(inst, 16, 31);
-
-	u64 *fpr_d = &VCPU_FPR(vcpu, ax_rd);
-	u64 *fpr_a = &VCPU_FPR(vcpu, ax_ra);
-	u64 *fpr_b = &VCPU_FPR(vcpu, ax_rb);
-	u64 *fpr_c = &VCPU_FPR(vcpu, ax_rc);
-
-	bool rcomp = (inst & 1) ? true : false;
-	u32 cr = kvmppc_get_cr(vcpu);
+	u32 inst;
+	enum emulation_result emulated;
+	int ax_rd, ax_ra, ax_rb, ax_rc;
+	short full_d;
+	u64 *fpr_d, *fpr_a, *fpr_b, *fpr_c;
+
+	bool rcomp;
+	u32 cr;
 #ifdef DEBUG
 	int i;
 #endif
 
+	emulated = kvmppc_get_last_inst(vcpu, &inst);
+	if (emulated != EMULATE_DONE)
+		return emulated;
+
+	ax_rd = inst_get_field(inst, 6, 10);
+	ax_ra = inst_get_field(inst, 11, 15);
+	ax_rb = inst_get_field(inst, 16, 20);
+	ax_rc = inst_get_field(inst, 21, 25);
+	full_d = inst_get_field(inst, 16, 31);
+
+	fpr_d = &VCPU_FPR(vcpu, ax_rd);
+	fpr_a = &VCPU_FPR(vcpu, ax_ra);
+	fpr_b = &VCPU_FPR(vcpu, ax_rb);
+	fpr_c = &VCPU_FPR(vcpu, ax_rc);
+
+	rcomp = (inst & 1) ? true : false;
+	cr = kvmppc_get_cr(vcpu);
+
 	if (!kvmppc_inst_is_paired_single(vcpu, inst))
 		return EMULATE_FAIL;
 
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index c5c052a..b7fffd1 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -608,12 +608,9 @@ void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr)
 
 static int kvmppc_read_inst(struct kvm_vcpu *vcpu)
 {
-	ulong srr0 = kvmppc_get_pc(vcpu);
-	u32 last_inst = kvmppc_get_last_inst(vcpu);
-	int ret;
+	u32 last_inst;
 
-	ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
-	if (ret == -ENOENT) {
+	if (kvmppc_get_last_inst(vcpu, &last_inst) == -ENOENT) {
 		ulong msr = vcpu->arch.shared->msr;
 
 		msr = kvmppc_set_field(msr, 33, 33, 1);
@@ -867,15 +864,18 @@ int kvmppc_handle_exit_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	{
 		enum emulation_result er;
 		ulong flags;
+		u32 last_inst;
 
 program_interrupt:
 		flags = vcpu->arch.shadow_srr1 & 0x1f0000ull;
+		kvmppc_get_last_inst(vcpu, &last_inst);
 
 		if (vcpu->arch.shared->msr & MSR_PR) {
 #ifdef EXIT_DEBUG
-			printk(KERN_INFO "Userspace triggered 0x700 exception at 0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu));
+			pr_info("Userspace triggered 0x700 exception at\n"
+			    "0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), last_inst);
 #endif
-			if ((kvmppc_get_last_inst(vcpu) & 0xff0007ff) !=
+			if ((last_inst & 0xff0007ff) !=
 			    (INS_DCBZ & 0xfffffff7)) {
 				kvmppc_core_queue_program(vcpu, flags);
 				r = RESUME_GUEST;
@@ -894,7 +894,7 @@ program_interrupt:
 			break;
 		case EMULATE_FAIL:
 			printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
-			       __func__, kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu));
+			       __func__, kvmppc_get_pc(vcpu), last_inst);
 			kvmppc_core_queue_program(vcpu, flags);
 			r = RESUME_GUEST;
 			break;
@@ -911,8 +911,12 @@ program_interrupt:
 		break;
 	}
 	case BOOK3S_INTERRUPT_SYSCALL:
+	{
+		u32 last_sc;
+
+		kvmppc_get_last_sc(vcpu, &last_sc);
 		if (vcpu->arch.papr_enabled &&
-		    (kvmppc_get_last_sc(vcpu) == 0x44000022) &&
+		    (last_sc == 0x44000022) &&
 		    !(vcpu->arch.shared->msr & MSR_PR)) {
 			/* SC 1 papr hypercalls */
 			ulong cmd = kvmppc_get_gpr(vcpu, 3);
@@ -957,6 +961,7 @@ program_interrupt:
 			r = RESUME_GUEST;
 		}
 		break;
+	}
 	case BOOK3S_INTERRUPT_FP_UNAVAIL:
 	case BOOK3S_INTERRUPT_ALTIVEC:
 	case BOOK3S_INTERRUPT_VSX:
@@ -985,15 +990,20 @@ program_interrupt:
 		break;
 	}
 	case BOOK3S_INTERRUPT_ALIGNMENT:
+	{
+		u32 last_inst;
+
 		if (kvmppc_read_inst(vcpu) == EMULATE_DONE) {
-			vcpu->arch.shared->dsisr = kvmppc_alignment_dsisr(vcpu,
-				kvmppc_get_last_inst(vcpu));
-			vcpu->arch.shared->dar = kvmppc_alignment_dar(vcpu,
-				kvmppc_get_last_inst(vcpu));
+			kvmppc_get_last_inst(vcpu, &last_inst);
+			vcpu->arch.shared->dsisr =
+				kvmppc_alignment_dsisr(vcpu, last_inst);
+			vcpu->arch.shared->dar =
+				kvmppc_alignment_dar(vcpu, last_inst);
 			kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
 		}
 		r = RESUME_GUEST;
 		break;
+	}
 	case BOOK3S_INTERRUPT_MACHINE_CHECK:
 	case BOOK3S_INTERRUPT_TRACE:
 		kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index ab62109..df25db0 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -752,6 +752,9 @@ static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
 		 * they were actually modified by emulation. */
 		return RESUME_GUEST_NV;
 
+	case EMULATE_AGAIN:
+		return RESUME_GUEST;
+
 	case EMULATE_DO_DCR:
 		run->exit_reason = KVM_EXIT_DCR;
 		return RESUME_HOST;
@@ -1911,6 +1914,17 @@ void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu)
 	vcpu->kvm->arch.kvm_ops->vcpu_put(vcpu);
 }
 
+int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst)
+{
+	int result = EMULATE_DONE;
+
+	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
+		result = kvmppc_ld_inst(vcpu, &vcpu->arch.last_inst);
+	*inst = vcpu->arch.last_inst;
+
+	return result;
+}
+
 int __init kvmppc_booke_init(void)
 {
 #ifndef CONFIG_KVM_BOOKE_HV
diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h
index b632cd3..d07a154 100644
--- a/arch/powerpc/kvm/booke.h
+++ b/arch/powerpc/kvm/booke.h
@@ -80,6 +80,9 @@ int kvmppc_booke_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
 int kvmppc_booke_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val);
 int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val);
 
+
+extern int kvmppc_ld_inst(struct kvm_vcpu *vcpu, u32 *instr);
+
 /* low-level asm code to transfer guest state */
 void kvmppc_load_guest_spe(struct kvm_vcpu *vcpu);
 void kvmppc_save_guest_spe(struct kvm_vcpu *vcpu);
diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index dd2cc03..fcccbb3 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -34,6 +34,7 @@
 #include "e500.h"
 #include "timing.h"
 #include "e500_mmu_host.h"
+#include "booke.h"
 
 #include "trace_booke.h"
 
@@ -606,6 +607,12 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr,
 	}
 }
 
+
+int kvmppc_ld_inst(struct kvm_vcpu *vcpu, u32 *instr)
+{
+	return EMULATE_FAIL;
+}
+
 /************* MMU Notifiers *************/
 
 int kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index c2b887b..e281d32 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -224,19 +224,25 @@ static int kvmppc_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, int rt)
  * from opcode tables in the future. */
 int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 {
-	u32 inst = kvmppc_get_last_inst(vcpu);
-	int ra = get_ra(inst);
-	int rs = get_rs(inst);
-	int rt = get_rt(inst);
-	int sprn = get_sprn(inst);
-	enum emulation_result emulated = EMULATE_DONE;
+	u32 inst;
+	int ra, rs, rt, sprn;
+	enum emulation_result emulated;
 	int advance = 1;
 
 	/* this default type might be overwritten by subcategories */
 	kvmppc_set_exit_type(vcpu, EMULATED_INST_EXITS);
 
+	emulated = kvmppc_get_last_inst(vcpu, &inst);
+	if (emulated != EMULATE_DONE)
+		return emulated;
+
 	pr_debug("Emulating opcode %d / %d\n", get_op(inst), get_xop(inst));
 
+	ra = get_ra(inst);
+	rs = get_rs(inst);
+	rt = get_rt(inst);
+	sprn = get_sprn(inst);
+
 	switch (get_op(inst)) {
 	case OP_TRAP:
 #ifdef CONFIG_PPC_BOOK3S
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 3cf541a..fdc1ee3 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -220,6 +220,9 @@ int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu)
 		 * actually modified. */
 		r = RESUME_GUEST_NV;
 		break;
+	case EMULATE_AGAIN:
+		r = RESUME_GUEST;
+		break;
 	case EMULATE_DO_MMIO:
 		run->exit_reason = KVM_EXIT_MMIO;
 		/* We must reload nonvolatiles because "update" load/store
@@ -229,11 +232,14 @@ int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu)
 		r = RESUME_HOST_NV;
 		break;
 	case EMULATE_FAIL:
+	{
+		u32 last_inst;
+		kvmppc_get_last_inst(vcpu, &last_inst);
 		/* XXX Deliver Program interrupt to guest. */
-		printk(KERN_EMERG "%s: emulation failed (%08x)\n", __func__,
-		       kvmppc_get_last_inst(vcpu));
+		pr_emerg("%s: emulation failed (%08x)\n", __func__, last_inst);
 		r = RESUME_HOST;
 		break;
+	}
 	default:
 		WARN_ON(1);
 		r = RESUME_GUEST;
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v2 3/4] KVM: PPC: Alow kvmppc_get_last_inst() to fail
@ 2014-05-01  0:45   ` Mihai Caraman
  0 siblings, 0 replies; 54+ messages in thread
From: Mihai Caraman @ 2014-05-01  0:45 UTC (permalink / raw)
  To: kvm-ppc; +Cc: kvm, linuxppc-dev, Mihai Caraman

On book3e, guest last instruction was read on the exist path using load
external pid (lwepx) dedicated instruction. lwepx failures have to be
handled by KVM and this would require additional checks in DO_KVM hooks
(beside MSR[GS] = 1). However extra checks on host fast path are commonly
considered too intrusive.

This patch lay down the path for an altenative solution, by allowing
kvmppc_get_lat_inst() function to fail. Booke architectures may choose
to read guest instruction from kvmppc_ld_inst() and to instruct the
emulation layer either to fail or to emulate again.

emulation_result enum defintion is not accesible from book3 asm headers.
Move kvmppc_get_last_inst() definitions that require emulation_result
to source files.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
---
v2:
 - integrated kvmppc_get_last_inst() in book3s code and checked build
 - addressed cosmetic feedback
 - please validate book3s functionality!

 arch/powerpc/include/asm/kvm_book3s.h    | 26 --------------------
 arch/powerpc/include/asm/kvm_booke.h     |  5 ----
 arch/powerpc/include/asm/kvm_ppc.h       |  2 ++
 arch/powerpc/kvm/book3s.c                | 32 ++++++++++++++++++++++++
 arch/powerpc/kvm/book3s.h                |  1 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c      | 16 +++---------
 arch/powerpc/kvm/book3s_paired_singles.c | 42 ++++++++++++++++++++------------
 arch/powerpc/kvm/book3s_pr.c             | 36 +++++++++++++++++----------
 arch/powerpc/kvm/booke.c                 | 14 +++++++++++
 arch/powerpc/kvm/booke.h                 |  3 +++
 arch/powerpc/kvm/e500_mmu_host.c         |  7 ++++++
 arch/powerpc/kvm/emulate.c               | 18 +++++++++-----
 arch/powerpc/kvm/powerpc.c               | 10 ++++++--
 13 files changed, 132 insertions(+), 80 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index bb1e38a..e2a89f3 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -273,32 +273,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
 	return (vcpu->arch.shared->msr & MSR_LE) != (MSR_KERNEL & MSR_LE);
 }
 
-static inline u32 kvmppc_get_last_inst_internal(struct kvm_vcpu *vcpu, ulong pc)
-{
-	/* Load the instruction manually if it failed to do so in the
-	 * exit path */
-	if (vcpu->arch.last_inst = KVM_INST_FETCH_FAILED)
-		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
-
-	return kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) :
-		vcpu->arch.last_inst;
-}
-
-static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
-{
-	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu));
-}
-
-/*
- * Like kvmppc_get_last_inst(), but for fetching a sc instruction.
- * Because the sc instruction sets SRR0 to point to the following
- * instruction, we have to fetch from pc - 4.
- */
-static inline u32 kvmppc_get_last_sc(struct kvm_vcpu *vcpu)
-{
-	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu) - 4);
-}
-
 static inline ulong kvmppc_get_fault_dar(struct kvm_vcpu *vcpu)
 {
 	return vcpu->arch.fault_dar;
diff --git a/arch/powerpc/include/asm/kvm_booke.h b/arch/powerpc/include/asm/kvm_booke.h
index 80d46b5..6db1ca0 100644
--- a/arch/powerpc/include/asm/kvm_booke.h
+++ b/arch/powerpc/include/asm/kvm_booke.h
@@ -69,11 +69,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
 	return false;
 }
 
-static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
-{
-	return vcpu->arch.last_inst;
-}
-
 static inline void kvmppc_set_ctr(struct kvm_vcpu *vcpu, ulong val)
 {
 	vcpu->arch.ctr = val;
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 4096f16..6e7c358 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -72,6 +72,8 @@ extern int kvmppc_sanity_check(struct kvm_vcpu *vcpu);
 extern int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu);
 extern void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu);
 
+extern int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst);
+
 /* Core-specific hooks */
 
 extern void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 gvaddr, gpa_t gpaddr,
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 94e597e..3553735 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -463,6 +463,38 @@ mmio:
 }
 EXPORT_SYMBOL_GPL(kvmppc_ld);
 
+int kvmppc_get_last_inst_internal(struct kvm_vcpu *vcpu, ulong pc, u32 *inst)
+{
+	int ret = EMULATE_DONE;
+
+	/* Load the instruction manually if it failed to do so in the
+	 * exit path */
+	if (vcpu->arch.last_inst = KVM_INST_FETCH_FAILED)
+		ret = kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst,
+			false);
+
+	*inst = kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) :
+		vcpu->arch.last_inst;
+
+	return ret;
+}
+
+int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst)
+{
+	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu), inst);
+}
+
+/*
+ * Like kvmppc_get_last_inst(), but for fetching a sc instruction.
+ * Because the sc instruction sets SRR0 to point to the following
+ * instruction, we have to fetch from pc - 4.
+ */
+int kvmppc_get_last_sc(struct kvm_vcpu *vcpu, u32 *inst)
+{
+	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu) - 4,
+		inst);
+}
+
 int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
 {
 	return 0;
diff --git a/arch/powerpc/kvm/book3s.h b/arch/powerpc/kvm/book3s.h
index 4bf956c..d85839d 100644
--- a/arch/powerpc/kvm/book3s.h
+++ b/arch/powerpc/kvm/book3s.h
@@ -30,5 +30,6 @@ extern int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu,
 					int sprn, ulong *spr_val);
 extern int kvmppc_book3s_init_pr(void);
 extern void kvmppc_book3s_exit_pr(void);
+extern int kvmppc_get_last_sc(struct kvm_vcpu *vcpu, u32 *inst);
 
 #endif
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index fb25ebc..7ffcc24 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -541,21 +541,13 @@ static int instruction_is_store(unsigned int instr)
 static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
 				  unsigned long gpa, gva_t ea, int is_store)
 {
-	int ret;
 	u32 last_inst;
-	unsigned long srr0 = kvmppc_get_pc(vcpu);
 
-	/* We try to load the last instruction.  We don't let
-	 * emulate_instruction do it as it doesn't check what
-	 * kvmppc_ld returns.
+	/*
 	 * If we fail, we just return to the guest and try executing it again.
 	 */
-	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
-		ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
-		if (ret != EMULATE_DONE || last_inst == KVM_INST_FETCH_FAILED)
-			return RESUME_GUEST;
-		vcpu->arch.last_inst = last_inst;
-	}
+	if (kvmppc_get_last_inst(vcpu, &last_inst) != EMULATE_DONE)
+		return RESUME_GUEST;
 
 	/*
 	 * WARNING: We do not know for sure whether the instruction we just
@@ -569,7 +561,7 @@ static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	 * we just return and retry the instruction.
 	 */
 
-	if (instruction_is_store(kvmppc_get_last_inst(vcpu)) != !!is_store)
+	if (instruction_is_store(last_inst) != !!is_store)
 		return RESUME_GUEST;
 
 	/*
diff --git a/arch/powerpc/kvm/book3s_paired_singles.c b/arch/powerpc/kvm/book3s_paired_singles.c
index c1abd95..9a6e565 100644
--- a/arch/powerpc/kvm/book3s_paired_singles.c
+++ b/arch/powerpc/kvm/book3s_paired_singles.c
@@ -637,26 +637,36 @@ static int kvmppc_ps_one_in(struct kvm_vcpu *vcpu, bool rc,
 
 int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu *vcpu)
 {
-	u32 inst = kvmppc_get_last_inst(vcpu);
-	enum emulation_result emulated = EMULATE_DONE;
-
-	int ax_rd = inst_get_field(inst, 6, 10);
-	int ax_ra = inst_get_field(inst, 11, 15);
-	int ax_rb = inst_get_field(inst, 16, 20);
-	int ax_rc = inst_get_field(inst, 21, 25);
-	short full_d = inst_get_field(inst, 16, 31);
-
-	u64 *fpr_d = &VCPU_FPR(vcpu, ax_rd);
-	u64 *fpr_a = &VCPU_FPR(vcpu, ax_ra);
-	u64 *fpr_b = &VCPU_FPR(vcpu, ax_rb);
-	u64 *fpr_c = &VCPU_FPR(vcpu, ax_rc);
-
-	bool rcomp = (inst & 1) ? true : false;
-	u32 cr = kvmppc_get_cr(vcpu);
+	u32 inst;
+	enum emulation_result emulated;
+	int ax_rd, ax_ra, ax_rb, ax_rc;
+	short full_d;
+	u64 *fpr_d, *fpr_a, *fpr_b, *fpr_c;
+
+	bool rcomp;
+	u32 cr;
 #ifdef DEBUG
 	int i;
 #endif
 
+	emulated = kvmppc_get_last_inst(vcpu, &inst);
+	if (emulated != EMULATE_DONE)
+		return emulated;
+
+	ax_rd = inst_get_field(inst, 6, 10);
+	ax_ra = inst_get_field(inst, 11, 15);
+	ax_rb = inst_get_field(inst, 16, 20);
+	ax_rc = inst_get_field(inst, 21, 25);
+	full_d = inst_get_field(inst, 16, 31);
+
+	fpr_d = &VCPU_FPR(vcpu, ax_rd);
+	fpr_a = &VCPU_FPR(vcpu, ax_ra);
+	fpr_b = &VCPU_FPR(vcpu, ax_rb);
+	fpr_c = &VCPU_FPR(vcpu, ax_rc);
+
+	rcomp = (inst & 1) ? true : false;
+	cr = kvmppc_get_cr(vcpu);
+
 	if (!kvmppc_inst_is_paired_single(vcpu, inst))
 		return EMULATE_FAIL;
 
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index c5c052a..b7fffd1 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -608,12 +608,9 @@ void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr)
 
 static int kvmppc_read_inst(struct kvm_vcpu *vcpu)
 {
-	ulong srr0 = kvmppc_get_pc(vcpu);
-	u32 last_inst = kvmppc_get_last_inst(vcpu);
-	int ret;
+	u32 last_inst;
 
-	ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
-	if (ret == -ENOENT) {
+	if (kvmppc_get_last_inst(vcpu, &last_inst) == -ENOENT) {
 		ulong msr = vcpu->arch.shared->msr;
 
 		msr = kvmppc_set_field(msr, 33, 33, 1);
@@ -867,15 +864,18 @@ int kvmppc_handle_exit_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	{
 		enum emulation_result er;
 		ulong flags;
+		u32 last_inst;
 
 program_interrupt:
 		flags = vcpu->arch.shadow_srr1 & 0x1f0000ull;
+		kvmppc_get_last_inst(vcpu, &last_inst);
 
 		if (vcpu->arch.shared->msr & MSR_PR) {
 #ifdef EXIT_DEBUG
-			printk(KERN_INFO "Userspace triggered 0x700 exception at 0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu));
+			pr_info("Userspace triggered 0x700 exception at\n"
+			    "0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), last_inst);
 #endif
-			if ((kvmppc_get_last_inst(vcpu) & 0xff0007ff) !=
+			if ((last_inst & 0xff0007ff) !=
 			    (INS_DCBZ & 0xfffffff7)) {
 				kvmppc_core_queue_program(vcpu, flags);
 				r = RESUME_GUEST;
@@ -894,7 +894,7 @@ program_interrupt:
 			break;
 		case EMULATE_FAIL:
 			printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
-			       __func__, kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu));
+			       __func__, kvmppc_get_pc(vcpu), last_inst);
 			kvmppc_core_queue_program(vcpu, flags);
 			r = RESUME_GUEST;
 			break;
@@ -911,8 +911,12 @@ program_interrupt:
 		break;
 	}
 	case BOOK3S_INTERRUPT_SYSCALL:
+	{
+		u32 last_sc;
+
+		kvmppc_get_last_sc(vcpu, &last_sc);
 		if (vcpu->arch.papr_enabled &&
-		    (kvmppc_get_last_sc(vcpu) == 0x44000022) &&
+		    (last_sc == 0x44000022) &&
 		    !(vcpu->arch.shared->msr & MSR_PR)) {
 			/* SC 1 papr hypercalls */
 			ulong cmd = kvmppc_get_gpr(vcpu, 3);
@@ -957,6 +961,7 @@ program_interrupt:
 			r = RESUME_GUEST;
 		}
 		break;
+	}
 	case BOOK3S_INTERRUPT_FP_UNAVAIL:
 	case BOOK3S_INTERRUPT_ALTIVEC:
 	case BOOK3S_INTERRUPT_VSX:
@@ -985,15 +990,20 @@ program_interrupt:
 		break;
 	}
 	case BOOK3S_INTERRUPT_ALIGNMENT:
+	{
+		u32 last_inst;
+
 		if (kvmppc_read_inst(vcpu) == EMULATE_DONE) {
-			vcpu->arch.shared->dsisr = kvmppc_alignment_dsisr(vcpu,
-				kvmppc_get_last_inst(vcpu));
-			vcpu->arch.shared->dar = kvmppc_alignment_dar(vcpu,
-				kvmppc_get_last_inst(vcpu));
+			kvmppc_get_last_inst(vcpu, &last_inst);
+			vcpu->arch.shared->dsisr =
+				kvmppc_alignment_dsisr(vcpu, last_inst);
+			vcpu->arch.shared->dar =
+				kvmppc_alignment_dar(vcpu, last_inst);
 			kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
 		}
 		r = RESUME_GUEST;
 		break;
+	}
 	case BOOK3S_INTERRUPT_MACHINE_CHECK:
 	case BOOK3S_INTERRUPT_TRACE:
 		kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index ab62109..df25db0 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -752,6 +752,9 @@ static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
 		 * they were actually modified by emulation. */
 		return RESUME_GUEST_NV;
 
+	case EMULATE_AGAIN:
+		return RESUME_GUEST;
+
 	case EMULATE_DO_DCR:
 		run->exit_reason = KVM_EXIT_DCR;
 		return RESUME_HOST;
@@ -1911,6 +1914,17 @@ void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu)
 	vcpu->kvm->arch.kvm_ops->vcpu_put(vcpu);
 }
 
+int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst)
+{
+	int result = EMULATE_DONE;
+
+	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
+		result = kvmppc_ld_inst(vcpu, &vcpu->arch.last_inst);
+	*inst = vcpu->arch.last_inst;
+
+	return result;
+}
+
 int __init kvmppc_booke_init(void)
 {
 #ifndef CONFIG_KVM_BOOKE_HV
diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h
index b632cd3..d07a154 100644
--- a/arch/powerpc/kvm/booke.h
+++ b/arch/powerpc/kvm/booke.h
@@ -80,6 +80,9 @@ int kvmppc_booke_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
 int kvmppc_booke_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val);
 int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val);
 
+
+extern int kvmppc_ld_inst(struct kvm_vcpu *vcpu, u32 *instr);
+
 /* low-level asm code to transfer guest state */
 void kvmppc_load_guest_spe(struct kvm_vcpu *vcpu);
 void kvmppc_save_guest_spe(struct kvm_vcpu *vcpu);
diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index dd2cc03..fcccbb3 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -34,6 +34,7 @@
 #include "e500.h"
 #include "timing.h"
 #include "e500_mmu_host.h"
+#include "booke.h"
 
 #include "trace_booke.h"
 
@@ -606,6 +607,12 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr,
 	}
 }
 
+
+int kvmppc_ld_inst(struct kvm_vcpu *vcpu, u32 *instr)
+{
+	return EMULATE_FAIL;
+}
+
 /************* MMU Notifiers *************/
 
 int kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index c2b887b..e281d32 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -224,19 +224,25 @@ static int kvmppc_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, int rt)
  * from opcode tables in the future. */
 int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
 {
-	u32 inst = kvmppc_get_last_inst(vcpu);
-	int ra = get_ra(inst);
-	int rs = get_rs(inst);
-	int rt = get_rt(inst);
-	int sprn = get_sprn(inst);
-	enum emulation_result emulated = EMULATE_DONE;
+	u32 inst;
+	int ra, rs, rt, sprn;
+	enum emulation_result emulated;
 	int advance = 1;
 
 	/* this default type might be overwritten by subcategories */
 	kvmppc_set_exit_type(vcpu, EMULATED_INST_EXITS);
 
+	emulated = kvmppc_get_last_inst(vcpu, &inst);
+	if (emulated != EMULATE_DONE)
+		return emulated;
+
 	pr_debug("Emulating opcode %d / %d\n", get_op(inst), get_xop(inst));
 
+	ra = get_ra(inst);
+	rs = get_rs(inst);
+	rt = get_rt(inst);
+	sprn = get_sprn(inst);
+
 	switch (get_op(inst)) {
 	case OP_TRAP:
 #ifdef CONFIG_PPC_BOOK3S
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 3cf541a..fdc1ee3 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -220,6 +220,9 @@ int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu)
 		 * actually modified. */
 		r = RESUME_GUEST_NV;
 		break;
+	case EMULATE_AGAIN:
+		r = RESUME_GUEST;
+		break;
 	case EMULATE_DO_MMIO:
 		run->exit_reason = KVM_EXIT_MMIO;
 		/* We must reload nonvolatiles because "update" load/store
@@ -229,11 +232,14 @@ int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu)
 		r = RESUME_HOST_NV;
 		break;
 	case EMULATE_FAIL:
+	{
+		u32 last_inst;
+		kvmppc_get_last_inst(vcpu, &last_inst);
 		/* XXX Deliver Program interrupt to guest. */
-		printk(KERN_EMERG "%s: emulation failed (%08x)\n", __func__,
-		       kvmppc_get_last_inst(vcpu));
+		pr_emerg("%s: emulation failed (%08x)\n", __func__, last_inst);
 		r = RESUME_HOST;
 		break;
+	}
 	default:
 		WARN_ON(1);
 		r = RESUME_GUEST;
-- 
1.7.11.7



* [PATCH v2 4/4] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation
@ 2014-05-01  0:45   ` Mihai Caraman
  0 siblings, 0 replies; 54+ messages in thread
From: Mihai Caraman @ 2014-05-01  0:45 UTC (permalink / raw)
  To: kvm-ppc; +Cc: kvm, linuxppc-dev, Mihai Caraman

On bookehv the vcpu's last instruction is read using the load external
pid (lwepx) instruction. lwepx exceptions (DTLB_MISS, DSI and LRAT) need
to be handled by KVM. These exceptions originate from host state
(MSR[GS] = 0), which implies additional checks in the DO_KVM macro
(besides the current MSR[GS] = 1 check) by looking at the Exception
Syndrome Register (ESR[EPID]) and the External PID Load Context Register
(EPLC[EGS]). Doing this on each Data TLB miss exception is obviously too
intrusive for the host.

Get rid of lwepx and acquire the last instruction in kvmppc_get_last_inst()
by searching for the physical address and kmapping it. This fixes an
infinite loop caused by lwepx's data TLB miss being handled in the host,
and the TODO for TLB eviction and execute-but-not-read entries.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
---
v2:
 - reworked patch description
 - used pr_* functions
 - addressed cosmetic feedback

 arch/powerpc/kvm/bookehv_interrupts.S | 37 ++++----------
 arch/powerpc/kvm/e500_mmu_host.c      | 93 ++++++++++++++++++++++++++++++++++-
 2 files changed, 101 insertions(+), 29 deletions(-)

diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S
index 925da71..65eff4c 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -122,38 +122,14 @@
 1:
 
 	.if	\flags & NEED_EMU
-	/*
-	 * This assumes you have external PID support.
-	 * To support a bookehv CPU without external PID, you'll
-	 * need to look up the TLB entry and create a temporary mapping.
-	 *
-	 * FIXME: we don't currently handle if the lwepx faults.  PR-mode
-	 * booke doesn't handle it either.  Since Linux doesn't use
-	 * broadcast tlbivax anymore, the only way this should happen is
-	 * if the guest maps its memory execute-but-not-read, or if we
-	 * somehow take a TLB miss in the middle of this entry code and
-	 * evict the relevant entry.  On e500mc, all kernel lowmem is
-	 * bolted into TLB1 large page mappings, and we don't use
-	 * broadcast invalidates, so we should not take a TLB miss here.
-	 *
-	 * Later we'll need to deal with faults here.  Disallowing guest
-	 * mappings that are execute-but-not-read could be an option on
-	 * e500mc, but not on chips with an LRAT if it is used.
-	 */
-
-	mfspr	r3, SPRN_EPLC	/* will already have correct ELPID and EGS */
 	PPC_STL	r15, VCPU_GPR(R15)(r4)
 	PPC_STL	r16, VCPU_GPR(R16)(r4)
 	PPC_STL	r17, VCPU_GPR(R17)(r4)
 	PPC_STL	r18, VCPU_GPR(R18)(r4)
 	PPC_STL	r19, VCPU_GPR(R19)(r4)
-	mr	r8, r3
 	PPC_STL	r20, VCPU_GPR(R20)(r4)
-	rlwimi	r8, r6, EPC_EAS_SHIFT - MSR_IR_LG, EPC_EAS
 	PPC_STL	r21, VCPU_GPR(R21)(r4)
-	rlwimi	r8, r6, EPC_EPR_SHIFT - MSR_PR_LG, EPC_EPR
 	PPC_STL	r22, VCPU_GPR(R22)(r4)
-	rlwimi	r8, r10, EPC_EPID_SHIFT, EPC_EPID
 	PPC_STL	r23, VCPU_GPR(R23)(r4)
 	PPC_STL	r24, VCPU_GPR(R24)(r4)
 	PPC_STL	r25, VCPU_GPR(R25)(r4)
@@ -163,10 +139,15 @@
 	PPC_STL	r29, VCPU_GPR(R29)(r4)
 	PPC_STL	r30, VCPU_GPR(R30)(r4)
 	PPC_STL	r31, VCPU_GPR(R31)(r4)
-	mtspr	SPRN_EPLC, r8
-	isync
-	lwepx   r9, 0, r5
-	mtspr	SPRN_EPLC, r3
+
+	/*
+	 * We don't use external PID support. lwepx faults would need to be
+	 * handled by KVM and this implies additional code in DO_KVM (for
+	 * DTLB_MISS, DSI and LRAT) to check ESR[EPID] and EPLC[EGS], which
+	 * is too intrusive for the host. Get the last instruction in
+	 * kvmppc_get_last_inst().
+	 */
+	li	r9, KVM_INST_FETCH_FAILED
 	stw	r9, VCPU_LAST_INST(r4)
 	.endif
 
diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index fcccbb3..94b8be0 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -607,11 +607,102 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr,
 	}
 }
 
+#ifdef CONFIG_KVM_BOOKE_HV
+int kvmppc_ld_inst(struct kvm_vcpu *vcpu, u32 *instr)
+{
+	gva_t geaddr;
+	hpa_t addr;
+	hfn_t pfn;
+	hva_t eaddr;
+	u32 mas0, mas1, mas2, mas3;
+	u64 mas7_mas3;
+	struct page *page;
+	unsigned int addr_space, psize_shift;
+	bool pr;
+	unsigned long flags;
+
+	/* Search TLB for guest pc to get the real address */
+	geaddr = kvmppc_get_pc(vcpu);
+	addr_space = (vcpu->arch.shared->msr & MSR_IS) >> MSR_IR_LG;
+
+	local_irq_save(flags);
+	mtspr(SPRN_MAS6, (vcpu->arch.pid << MAS6_SPID_SHIFT) | addr_space);
+	mtspr(SPRN_MAS5, MAS5_SGS | vcpu->kvm->arch.lpid);
+	asm volatile("tlbsx 0, %[geaddr]\n" : :
+		     [geaddr] "r" (geaddr));
+	mtspr(SPRN_MAS5, 0);
+	mtspr(SPRN_MAS8, 0);
+	mas0 = mfspr(SPRN_MAS0);
+	mas1 = mfspr(SPRN_MAS1);
+	mas2 = mfspr(SPRN_MAS2);
+	mas3 = mfspr(SPRN_MAS3);
+	mas7_mas3 = (((u64) mfspr(SPRN_MAS7)) << 32) | mas3;
+	local_irq_restore(flags);
+
+	/*
+	 * If the TLB entry for guest pc was evicted, return to the guest.
+	 * Chances are good that a valid TLB entry will be found next time.
+	 */
+	if (!(mas1 & MAS1_VALID))
+		return EMULATE_AGAIN;
 
+	/*
+	 * Another thread may rewrite the TLB entry in parallel; don't
+	 * execute from the address if the execute permission is not set.
+	 */
+	pr = vcpu->arch.shared->msr & MSR_PR;
+	if ((pr && !(mas3 & MAS3_UX)) || (!pr && !(mas3 & MAS3_SX))) {
+		pr_debug("Instruction emulation from a guest page\n"
+				"without execute permission\n");
+		kvmppc_core_queue_program(vcpu, 0);
+		return EMULATE_AGAIN;
+	}
+
+	/*
+	 * We will map the real address through a cacheable page, so we will
+	 * not support cache-inhibited guest pages. Fortunately emulated
+	 * instructions should not live there.
+	 */
+	if (mas2 & MAS2_I) {
+		pr_debug("Instruction emulation from cache-inhibited\n"
+				"guest pages is not supported\n");
+		return EMULATE_FAIL;
+	}
+
+	/* Get page size */
+	psize_shift = MAS1_GET_TSIZE(mas1) + 10;
+
+	/* Map a page and get guest's instruction */
+	addr = (mas7_mas3 & (~0ULL << psize_shift)) |
+	       (geaddr & ((1ULL << psize_shift) - 1ULL));
+	pfn = addr >> PAGE_SHIFT;
+
+	/* Guard against emulation from device areas */
+	if (unlikely(!page_is_ram(pfn))) {
+		pr_debug("Instruction emulation from non-RAM host\n"
+				"pages is not supported\n");
+		return EMULATE_FAIL;
+	}
+
+	if (unlikely(!pfn_valid(pfn))) {
+		pr_debug("Invalid frame number\n");
+		return EMULATE_FAIL;
+	}
+
+	page = pfn_to_page(pfn);
+	eaddr = (unsigned long)kmap_atomic(page);
+	eaddr |= addr & ~PAGE_MASK;
+	*instr = *(u32 *)eaddr;
+	kunmap_atomic((u32 *)eaddr);
+
+	return EMULATE_DONE;
+}
+#else
 int kvmppc_ld_inst(struct kvm_vcpu *vcpu, u32 *instr)
 {
 	return EMULATE_FAIL;
-}
+};
+#endif
 
 /************* MMU Notifiers *************/
 
-- 
1.7.11.7

* Re: [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst fixup"
@ 2014-05-02  9:24     ` Alexander Graf
  0 siblings, 0 replies; 54+ messages in thread
From: Alexander Graf @ 2014-05-02  9:24 UTC (permalink / raw)
  To: Mihai Caraman; +Cc: kvm-ppc, kvm, linuxppc-dev

On 05/01/2014 02:45 AM, Mihai Caraman wrote:
> The commit 1d628af7 "add load inst fixup" made an attempt to handle
> failures generated by reading the guest current instruction. The fixup
> code that was added works by chance, hiding the real issue.
>
> Load external pid (lwepx) instruction, used by KVM to read guest
> instructions, is executed in a subsituted guest translation context
> (EPLC[EGS] = 1). In consequence lwepx's TLB error and data storage
> interrupts need to be handled by KVM, even though these interrupts
> are generated from host context (MSR[GS] = 0).
>
> Currently, KVM hooks only interrupts generated from guest context
> (MSR[GS] = 1), doing minimal checks on the fast path to avoid host
> performance degradation. As a result, the host kernel handles lwepx
> faults searching the faulting guest data address (loaded in DEAR) in
> its own Logical Partition ID (LPID) 0 context. In case a host translation
> is found the execution returns to the lwepx instruction instead of the
> fixup, the host ending up in an infinite loop.
>
> Revert the commit "add load inst fixup". lwepx issue will be addressed
> in a subsequent patch without needing fixup code.
>
> Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>

Just a random idea: Could we just switch IVOR2 during the critical lwepx 
phase? In fact, we could even do that later, when we're already in C code 
and trying to recover the last instruction. The code IVOR2 would point to 
would simply set the register we're trying to read to LAST_INST_FAIL 
and rfi.


Alex

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v2 3/4] KVM: PPC: Alow kvmppc_get_last_inst() to fail
  2014-05-01  0:45   ` Mihai Caraman
  (?)
@ 2014-05-02  9:54     ` Alexander Graf
  -1 siblings, 0 replies; 54+ messages in thread
From: Alexander Graf @ 2014-05-02  9:54 UTC (permalink / raw)
  To: Mihai Caraman; +Cc: kvm-ppc, kvm, linuxppc-dev

On 05/01/2014 02:45 AM, Mihai Caraman wrote:
> On book3e, the guest's last instruction was read on the exit path using
> the dedicated load external pid (lwepx) instruction. lwepx failures have
> to be handled by KVM, and this would require additional checks in the
> DO_KVM hooks (besides MSR[GS] = 1). However, extra checks on the host
> fast path are commonly considered too intrusive.
>
> This patch lays the groundwork for an alternative solution, by allowing
> the kvmppc_get_last_inst() function to fail. Booke architectures may
> choose to read the guest instruction from kvmppc_ld_inst() and to
> instruct the emulation layer either to fail or to emulate again.
>
> The emulation_result enum definition is not accessible from the book3s
> asm headers. Move the kvmppc_get_last_inst() definitions that require
> emulation_result to source files.
>
> Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
> ---
> v2:
>   - integrated kvmppc_get_last_inst() in book3s code and checked build
>   - addressed cosmetic feedback
>   - please validate book3s functionality!
>
>   arch/powerpc/include/asm/kvm_book3s.h    | 26 --------------------
>   arch/powerpc/include/asm/kvm_booke.h     |  5 ----
>   arch/powerpc/include/asm/kvm_ppc.h       |  2 ++
>   arch/powerpc/kvm/book3s.c                | 32 ++++++++++++++++++++++++
>   arch/powerpc/kvm/book3s.h                |  1 +
>   arch/powerpc/kvm/book3s_64_mmu_hv.c      | 16 +++---------
>   arch/powerpc/kvm/book3s_paired_singles.c | 42 ++++++++++++++++++++------------
>   arch/powerpc/kvm/book3s_pr.c             | 36 +++++++++++++++++----------
>   arch/powerpc/kvm/booke.c                 | 14 +++++++++++
>   arch/powerpc/kvm/booke.h                 |  3 +++
>   arch/powerpc/kvm/e500_mmu_host.c         |  7 ++++++
>   arch/powerpc/kvm/emulate.c               | 18 +++++++++-----
>   arch/powerpc/kvm/powerpc.c               | 10 ++++++--
>   13 files changed, 132 insertions(+), 80 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
> index bb1e38a..e2a89f3 100644
> --- a/arch/powerpc/include/asm/kvm_book3s.h
> +++ b/arch/powerpc/include/asm/kvm_book3s.h
> @@ -273,32 +273,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
>   	return (vcpu->arch.shared->msr & MSR_LE) != (MSR_KERNEL & MSR_LE);
>   }
>   
> -static inline u32 kvmppc_get_last_inst_internal(struct kvm_vcpu *vcpu, ulong pc)
> -{
> -	/* Load the instruction manually if it failed to do so in the
> -	 * exit path */
> -	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
> -		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
> -
> -	return kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) :
> -		vcpu->arch.last_inst;
> -}
> -
> -static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
> -{
> -	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu));
> -}
> -
> -/*
> - * Like kvmppc_get_last_inst(), but for fetching a sc instruction.
> - * Because the sc instruction sets SRR0 to point to the following
> - * instruction, we have to fetch from pc - 4.
> - */
> -static inline u32 kvmppc_get_last_sc(struct kvm_vcpu *vcpu)
> -{
> -	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu) - 4);
> -}
> -
>   static inline ulong kvmppc_get_fault_dar(struct kvm_vcpu *vcpu)
>   {
>   	return vcpu->arch.fault_dar;
> diff --git a/arch/powerpc/include/asm/kvm_booke.h b/arch/powerpc/include/asm/kvm_booke.h
> index 80d46b5..6db1ca0 100644
> --- a/arch/powerpc/include/asm/kvm_booke.h
> +++ b/arch/powerpc/include/asm/kvm_booke.h
> @@ -69,11 +69,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
>   	return false;
>   }
>   
> -static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
> -{
> -	return vcpu->arch.last_inst;
> -}
> -
>   static inline void kvmppc_set_ctr(struct kvm_vcpu *vcpu, ulong val)
>   {
>   	vcpu->arch.ctr = val;
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
> index 4096f16..6e7c358 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -72,6 +72,8 @@ extern int kvmppc_sanity_check(struct kvm_vcpu *vcpu);
>   extern int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu);
>   extern void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu);
>   
> +extern int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst);

Phew. Moving this into a separate function sure has some performance 
implications. Was there no way to keep it in a header?

You could just move it into its own .h file which we include after 
kvm_ppc.h. That way everything's available. That would also help me a 
lot with the little endian port where I'm also struggling with header 
file inclusion order and kvmppc_need_byteswap().

> +
>   /* Core-specific hooks */
>   
>   extern void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 gvaddr, gpa_t gpaddr,
> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
> index 94e597e..3553735 100644
> --- a/arch/powerpc/kvm/book3s.c
> +++ b/arch/powerpc/kvm/book3s.c
> @@ -463,6 +463,38 @@ mmio:
>   }
>   EXPORT_SYMBOL_GPL(kvmppc_ld);
>   
> +int kvmppc_get_last_inst_internal(struct kvm_vcpu *vcpu, ulong pc, u32 *inst)
> +{
> +	int ret = EMULATE_DONE;
> +
> +	/* Load the instruction manually if it failed to do so in the
> +	 * exit path */
> +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
> +		ret = kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst,
> +			false);
> +
> +	*inst = kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) :
> +		vcpu->arch.last_inst;
> +
> +	return ret;
> +}
> +
> +int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst)
> +{
> +	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu), inst);
> +}
> +
> +/*
> + * Like kvmppc_get_last_inst(), but for fetching a sc instruction.
> + * Because the sc instruction sets SRR0 to point to the following
> + * instruction, we have to fetch from pc - 4.
> + */
> +int kvmppc_get_last_sc(struct kvm_vcpu *vcpu, u32 *inst)
> +{
> +	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu) - 4,
> +		inst);
> +}
> +
>   int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
>   {
>   	return 0;
> diff --git a/arch/powerpc/kvm/book3s.h b/arch/powerpc/kvm/book3s.h
> index 4bf956c..d85839d 100644
> --- a/arch/powerpc/kvm/book3s.h
> +++ b/arch/powerpc/kvm/book3s.h
> @@ -30,5 +30,6 @@ extern int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu,
>   					int sprn, ulong *spr_val);
>   extern int kvmppc_book3s_init_pr(void);
>   extern void kvmppc_book3s_exit_pr(void);
> +extern int kvmppc_get_last_sc(struct kvm_vcpu *vcpu, u32 *inst);
>   
>   #endif
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> index fb25ebc..7ffcc24 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> @@ -541,21 +541,13 @@ static int instruction_is_store(unsigned int instr)
>   static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   				  unsigned long gpa, gva_t ea, int is_store)
>   {
> -	int ret;
>   	u32 last_inst;
> -	unsigned long srr0 = kvmppc_get_pc(vcpu);
>   
> -	/* We try to load the last instruction.  We don't let
> -	 * emulate_instruction do it as it doesn't check what
> -	 * kvmppc_ld returns.
> +	/*
>   	 * If we fail, we just return to the guest and try executing it again.
>   	 */
> -	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
> -		ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
> -		if (ret != EMULATE_DONE || last_inst == KVM_INST_FETCH_FAILED)
> -			return RESUME_GUEST;
> -		vcpu->arch.last_inst = last_inst;
> -	}
> +	if (kvmppc_get_last_inst(vcpu, &last_inst) != EMULATE_DONE)
> +		return RESUME_GUEST;
>   
>   	/*
>   	 * WARNING: We do not know for sure whether the instruction we just
> @@ -569,7 +561,7 @@ static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   	 * we just return and retry the instruction.
>   	 */
>   
> -	if (instruction_is_store(kvmppc_get_last_inst(vcpu)) != !!is_store)
> +	if (instruction_is_store(last_inst) != !!is_store)
>   		return RESUME_GUEST;
>   
>   	/*
> diff --git a/arch/powerpc/kvm/book3s_paired_singles.c b/arch/powerpc/kvm/book3s_paired_singles.c
> index c1abd95..9a6e565 100644
> --- a/arch/powerpc/kvm/book3s_paired_singles.c
> +++ b/arch/powerpc/kvm/book3s_paired_singles.c
> @@ -637,26 +637,36 @@ static int kvmppc_ps_one_in(struct kvm_vcpu *vcpu, bool rc,
>   
>   int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu *vcpu)
>   {
> -	u32 inst = kvmppc_get_last_inst(vcpu);
> -	enum emulation_result emulated = EMULATE_DONE;
> -
> -	int ax_rd = inst_get_field(inst, 6, 10);
> -	int ax_ra = inst_get_field(inst, 11, 15);
> -	int ax_rb = inst_get_field(inst, 16, 20);
> -	int ax_rc = inst_get_field(inst, 21, 25);
> -	short full_d = inst_get_field(inst, 16, 31);
> -
> -	u64 *fpr_d = &VCPU_FPR(vcpu, ax_rd);
> -	u64 *fpr_a = &VCPU_FPR(vcpu, ax_ra);
> -	u64 *fpr_b = &VCPU_FPR(vcpu, ax_rb);
> -	u64 *fpr_c = &VCPU_FPR(vcpu, ax_rc);
> -
> -	bool rcomp = (inst & 1) ? true : false;
> -	u32 cr = kvmppc_get_cr(vcpu);
> +	u32 inst;
> +	enum emulation_result emulated;
> +	int ax_rd, ax_ra, ax_rb, ax_rc;
> +	short full_d;
> +	u64 *fpr_d, *fpr_a, *fpr_b, *fpr_c;
> +
> +	bool rcomp;
> +	u32 cr;
>   #ifdef DEBUG
>   	int i;
>   #endif
>   
> +	emulated = kvmppc_get_last_inst(vcpu, &inst);
> +	if (emulated != EMULATE_DONE)
> +		return emulated;
> +
> +	ax_rd = inst_get_field(inst, 6, 10);
> +	ax_ra = inst_get_field(inst, 11, 15);
> +	ax_rb = inst_get_field(inst, 16, 20);
> +	ax_rc = inst_get_field(inst, 21, 25);
> +	full_d = inst_get_field(inst, 16, 31);
> +
> +	fpr_d = &VCPU_FPR(vcpu, ax_rd);
> +	fpr_a = &VCPU_FPR(vcpu, ax_ra);
> +	fpr_b = &VCPU_FPR(vcpu, ax_rb);
> +	fpr_c = &VCPU_FPR(vcpu, ax_rc);
> +
> +	rcomp = (inst & 1) ? true : false;
> +	cr = kvmppc_get_cr(vcpu);
> +
>   	if (!kvmppc_inst_is_paired_single(vcpu, inst))
>   		return EMULATE_FAIL;
>   
> diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
> index c5c052a..b7fffd1 100644
> --- a/arch/powerpc/kvm/book3s_pr.c
> +++ b/arch/powerpc/kvm/book3s_pr.c
> @@ -608,12 +608,9 @@ void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr)
>   
>   static int kvmppc_read_inst(struct kvm_vcpu *vcpu)
>   {
> -	ulong srr0 = kvmppc_get_pc(vcpu);
> -	u32 last_inst = kvmppc_get_last_inst(vcpu);
> -	int ret;
> +	u32 last_inst;
>   
> -	ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
> -	if (ret == -ENOENT) {
> +	if (kvmppc_get_last_inst(vcpu, &last_inst) == -ENOENT) {

ENOENT?

>   		ulong msr = vcpu->arch.shared->msr;
>   
>   		msr = kvmppc_set_field(msr, 33, 33, 1);
> @@ -867,15 +864,18 @@ int kvmppc_handle_exit_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   	{
>   		enum emulation_result er;
>   		ulong flags;
> +		u32 last_inst;
>   
>   program_interrupt:
>   		flags = vcpu->arch.shadow_srr1 & 0x1f0000ull;
> +		kvmppc_get_last_inst(vcpu, &last_inst);

No check for the return value?

>   
>   		if (vcpu->arch.shared->msr & MSR_PR) {
>   #ifdef EXIT_DEBUG
> -			printk(KERN_INFO "Userspace triggered 0x700 exception at 0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu));
> +			pr_info("Userspace triggered 0x700 exception at\n"
> +			    "0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), last_inst);
>   #endif
> -			if ((kvmppc_get_last_inst(vcpu) & 0xff0007ff) !=
> +			if ((last_inst & 0xff0007ff) !=
>   			    (INS_DCBZ & 0xfffffff7)) {
>   				kvmppc_core_queue_program(vcpu, flags);
>   				r = RESUME_GUEST;
> @@ -894,7 +894,7 @@ program_interrupt:
>   			break;
>   		case EMULATE_FAIL:
>   			printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
> -			       __func__, kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu));
> +			       __func__, kvmppc_get_pc(vcpu), last_inst);
>   			kvmppc_core_queue_program(vcpu, flags);
>   			r = RESUME_GUEST;
>   			break;
> @@ -911,8 +911,12 @@ program_interrupt:
>   		break;
>   	}
>   	case BOOK3S_INTERRUPT_SYSCALL:
> +	{
> +		u32 last_sc;
> +
> +		kvmppc_get_last_sc(vcpu, &last_sc);

No check for the return value?

>   		if (vcpu->arch.papr_enabled &&
> -		    (kvmppc_get_last_sc(vcpu) == 0x44000022) &&
> +		    (last_sc == 0x44000022) &&
>   		    !(vcpu->arch.shared->msr & MSR_PR)) {
>   			/* SC 1 papr hypercalls */
>   			ulong cmd = kvmppc_get_gpr(vcpu, 3);
> @@ -957,6 +961,7 @@ program_interrupt:
>   			r = RESUME_GUEST;
>   		}
>   		break;
> +	}
>   	case BOOK3S_INTERRUPT_FP_UNAVAIL:
>   	case BOOK3S_INTERRUPT_ALTIVEC:
>   	case BOOK3S_INTERRUPT_VSX:
> @@ -985,15 +990,20 @@ program_interrupt:
>   		break;
>   	}
>   	case BOOK3S_INTERRUPT_ALIGNMENT:
> +	{
> +		u32 last_inst;
> +
>   		if (kvmppc_read_inst(vcpu) == EMULATE_DONE) {
> -			vcpu->arch.shared->dsisr = kvmppc_alignment_dsisr(vcpu,
> -				kvmppc_get_last_inst(vcpu));
> -			vcpu->arch.shared->dar = kvmppc_alignment_dar(vcpu,
> -				kvmppc_get_last_inst(vcpu));
> +			kvmppc_get_last_inst(vcpu, &last_inst);

I think with an error-returning kvmppc_get_last_inst() we can completely 
get rid of kvmppc_read_inst() and only use kvmppc_get_last_inst() 
instead.

> +			vcpu->arch.shared->dsisr =
> +				kvmppc_alignment_dsisr(vcpu, last_inst);
> +			vcpu->arch.shared->dar =
> +				kvmppc_alignment_dar(vcpu, last_inst);
>   			kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
>   		}
>   		r = RESUME_GUEST;
>   		break;
> +	}
>   	case BOOK3S_INTERRUPT_MACHINE_CHECK:
>   	case BOOK3S_INTERRUPT_TRACE:
>   		kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index ab62109..df25db0 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -752,6 +752,9 @@ static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
>   		 * they were actually modified by emulation. */
>   		return RESUME_GUEST_NV;
>   
> +	case EMULATE_AGAIN:
> +		return RESUME_GUEST;
> +
>   	case EMULATE_DO_DCR:
>   		run->exit_reason = KVM_EXIT_DCR;
>   		return RESUME_HOST;
> @@ -1911,6 +1914,17 @@ void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu)
>   	vcpu->kvm->arch.kvm_ops->vcpu_put(vcpu);
>   }
>   
> +int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst)
> +{
> +	int result = EMULATE_DONE;
> +
> +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
> +		result = kvmppc_ld_inst(vcpu, &vcpu->arch.last_inst);
> +	*inst = vcpu->arch.last_inst;

This function looks almost identical to the book3s one. Can we share the 
same code path here? Just always return false for 
kvmppc_need_byteswap() on booke.


Alex

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v2 3/4] KVM: PPC: Alow kvmppc_get_last_inst() to fail
@ 2014-05-02  9:54     ` Alexander Graf
  0 siblings, 0 replies; 54+ messages in thread
From: Alexander Graf @ 2014-05-02  9:54 UTC (permalink / raw)
  To: Mihai Caraman; +Cc: linuxppc-dev, kvm, kvm-ppc

On 05/01/2014 02:45 AM, Mihai Caraman wrote:
> On book3e, guest last instruction was read on the exist path using load
> external pid (lwepx) dedicated instruction. lwepx failures have to be
> handled by KVM and this would require additional checks in DO_KVM hooks
> (beside MSR[GS] = 1). However extra checks on host fast path are commonly
> considered too intrusive.
>
> This patch lay down the path for an altenative solution, by allowing
> kvmppc_get_lat_inst() function to fail. Booke architectures may choose
> to read guest instruction from kvmppc_ld_inst() and to instruct the
> emulation layer either to fail or to emulate again.
>
> emulation_result enum defintion is not accesible from book3 asm headers.
> Move kvmppc_get_last_inst() definitions that require emulation_result
> to source files.
>
> Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
> ---
> v2:
>   - integrated kvmppc_get_last_inst() in book3s code and checked build
>   - addressed cosmetic feedback
>   - please validate book3s functionality!
>
>   arch/powerpc/include/asm/kvm_book3s.h    | 26 --------------------
>   arch/powerpc/include/asm/kvm_booke.h     |  5 ----
>   arch/powerpc/include/asm/kvm_ppc.h       |  2 ++
>   arch/powerpc/kvm/book3s.c                | 32 ++++++++++++++++++++++++
>   arch/powerpc/kvm/book3s.h                |  1 +
>   arch/powerpc/kvm/book3s_64_mmu_hv.c      | 16 +++---------
>   arch/powerpc/kvm/book3s_paired_singles.c | 42 ++++++++++++++++++++------------
>   arch/powerpc/kvm/book3s_pr.c             | 36 +++++++++++++++++----------
>   arch/powerpc/kvm/booke.c                 | 14 +++++++++++
>   arch/powerpc/kvm/booke.h                 |  3 +++
>   arch/powerpc/kvm/e500_mmu_host.c         |  7 ++++++
>   arch/powerpc/kvm/emulate.c               | 18 +++++++++-----
>   arch/powerpc/kvm/powerpc.c               | 10 ++++++--
>   13 files changed, 132 insertions(+), 80 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
> index bb1e38a..e2a89f3 100644
> --- a/arch/powerpc/include/asm/kvm_book3s.h
> +++ b/arch/powerpc/include/asm/kvm_book3s.h
> @@ -273,32 +273,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
>   	return (vcpu->arch.shared->msr & MSR_LE) != (MSR_KERNEL & MSR_LE);
>   }
>   
> -static inline u32 kvmppc_get_last_inst_internal(struct kvm_vcpu *vcpu, ulong pc)
> -{
> -	/* Load the instruction manually if it failed to do so in the
> -	 * exit path */
> -	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
> -		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
> -
> -	return kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) :
> -		vcpu->arch.last_inst;
> -}
> -
> -static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
> -{
> -	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu));
> -}
> -
> -/*
> - * Like kvmppc_get_last_inst(), but for fetching a sc instruction.
> - * Because the sc instruction sets SRR0 to point to the following
> - * instruction, we have to fetch from pc - 4.
> - */
> -static inline u32 kvmppc_get_last_sc(struct kvm_vcpu *vcpu)
> -{
> -	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu) - 4);
> -}
> -
>   static inline ulong kvmppc_get_fault_dar(struct kvm_vcpu *vcpu)
>   {
>   	return vcpu->arch.fault_dar;
> diff --git a/arch/powerpc/include/asm/kvm_booke.h b/arch/powerpc/include/asm/kvm_booke.h
> index 80d46b5..6db1ca0 100644
> --- a/arch/powerpc/include/asm/kvm_booke.h
> +++ b/arch/powerpc/include/asm/kvm_booke.h
> @@ -69,11 +69,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
>   	return false;
>   }
>   
> -static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
> -{
> -	return vcpu->arch.last_inst;
> -}
> -
>   static inline void kvmppc_set_ctr(struct kvm_vcpu *vcpu, ulong val)
>   {
>   	vcpu->arch.ctr = val;
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
> index 4096f16..6e7c358 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -72,6 +72,8 @@ extern int kvmppc_sanity_check(struct kvm_vcpu *vcpu);
>   extern int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu);
>   extern void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu);
>   
> +extern int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst);

Phew. Moving this into a separate function sure has some performance 
implications. Was there no way to keep it in a header?

You could just move it into its own .h file which we include after 
kvm_ppc.h. That way everything's available. That would also help me a 
lot with the little endian port where I'm also struggling with header 
file inclusion order and kvmppc_need_byteswap().

> +
>   /* Core-specific hooks */
>   
>   extern void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 gvaddr, gpa_t gpaddr,
> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
> index 94e597e..3553735 100644
> --- a/arch/powerpc/kvm/book3s.c
> +++ b/arch/powerpc/kvm/book3s.c
> @@ -463,6 +463,38 @@ mmio:
>   }
>   EXPORT_SYMBOL_GPL(kvmppc_ld);
>   
> +int kvmppc_get_last_inst_internal(struct kvm_vcpu *vcpu, ulong pc, u32 *inst)
> +{
> +	int ret = EMULATE_DONE;
> +
> +	/* Load the instruction manually if it failed to do so in the
> +	 * exit path */
> +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
> +		ret = kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst,
> +			false);
> +
> +	*inst = kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) :
> +		vcpu->arch.last_inst;
> +
> +	return ret;
> +}
> +
> +int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst)
> +{
> +	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu), inst);
> +}
> +
> +/*
> + * Like kvmppc_get_last_inst(), but for fetching a sc instruction.
> + * Because the sc instruction sets SRR0 to point to the following
> + * instruction, we have to fetch from pc - 4.
> + */
> +int kvmppc_get_last_sc(struct kvm_vcpu *vcpu, u32 *inst)
> +{
> +	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu) - 4,
> +		inst);
> +}
> +
>   int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
>   {
>   	return 0;
> diff --git a/arch/powerpc/kvm/book3s.h b/arch/powerpc/kvm/book3s.h
> index 4bf956c..d85839d 100644
> --- a/arch/powerpc/kvm/book3s.h
> +++ b/arch/powerpc/kvm/book3s.h
> @@ -30,5 +30,6 @@ extern int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu,
>   					int sprn, ulong *spr_val);
>   extern int kvmppc_book3s_init_pr(void);
>   extern void kvmppc_book3s_exit_pr(void);
> +extern int kvmppc_get_last_sc(struct kvm_vcpu *vcpu, u32 *inst);
>   
>   #endif
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> index fb25ebc..7ffcc24 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> @@ -541,21 +541,13 @@ static int instruction_is_store(unsigned int instr)
>   static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   				  unsigned long gpa, gva_t ea, int is_store)
>   {
> -	int ret;
>   	u32 last_inst;
> -	unsigned long srr0 = kvmppc_get_pc(vcpu);
>   
> -	/* We try to load the last instruction.  We don't let
> -	 * emulate_instruction do it as it doesn't check what
> -	 * kvmppc_ld returns.
> +	/*
>   	 * If we fail, we just return to the guest and try executing it again.
>   	 */
> -	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
> -		ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
> -		if (ret != EMULATE_DONE || last_inst == KVM_INST_FETCH_FAILED)
> -			return RESUME_GUEST;
> -		vcpu->arch.last_inst = last_inst;
> -	}
> +	if (kvmppc_get_last_inst(vcpu, &last_inst) != EMULATE_DONE)
> +		return RESUME_GUEST;
>   
>   	/*
>   	 * WARNING: We do not know for sure whether the instruction we just
> @@ -569,7 +561,7 @@ static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   	 * we just return and retry the instruction.
>   	 */
>   
> -	if (instruction_is_store(kvmppc_get_last_inst(vcpu)) != !!is_store)
> +	if (instruction_is_store(last_inst) != !!is_store)
>   		return RESUME_GUEST;
>   
>   	/*
> diff --git a/arch/powerpc/kvm/book3s_paired_singles.c b/arch/powerpc/kvm/book3s_paired_singles.c
> index c1abd95..9a6e565 100644
> --- a/arch/powerpc/kvm/book3s_paired_singles.c
> +++ b/arch/powerpc/kvm/book3s_paired_singles.c
> @@ -637,26 +637,36 @@ static int kvmppc_ps_one_in(struct kvm_vcpu *vcpu, bool rc,
>   
>   int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu *vcpu)
>   {
> -	u32 inst = kvmppc_get_last_inst(vcpu);
> -	enum emulation_result emulated = EMULATE_DONE;
> -
> -	int ax_rd = inst_get_field(inst, 6, 10);
> -	int ax_ra = inst_get_field(inst, 11, 15);
> -	int ax_rb = inst_get_field(inst, 16, 20);
> -	int ax_rc = inst_get_field(inst, 21, 25);
> -	short full_d = inst_get_field(inst, 16, 31);
> -
> -	u64 *fpr_d = &VCPU_FPR(vcpu, ax_rd);
> -	u64 *fpr_a = &VCPU_FPR(vcpu, ax_ra);
> -	u64 *fpr_b = &VCPU_FPR(vcpu, ax_rb);
> -	u64 *fpr_c = &VCPU_FPR(vcpu, ax_rc);
> -
> -	bool rcomp = (inst & 1) ? true : false;
> -	u32 cr = kvmppc_get_cr(vcpu);
> +	u32 inst;
> +	enum emulation_result emulated;
> +	int ax_rd, ax_ra, ax_rb, ax_rc;
> +	short full_d;
> +	u64 *fpr_d, *fpr_a, *fpr_b, *fpr_c;
> +
> +	bool rcomp;
> +	u32 cr;
>   #ifdef DEBUG
>   	int i;
>   #endif
>   
> +	emulated = kvmppc_get_last_inst(vcpu, &inst);
> +	if (emulated != EMULATE_DONE)
> +		return emulated;
> +
> +	ax_rd = inst_get_field(inst, 6, 10);
> +	ax_ra = inst_get_field(inst, 11, 15);
> +	ax_rb = inst_get_field(inst, 16, 20);
> +	ax_rc = inst_get_field(inst, 21, 25);
> +	full_d = inst_get_field(inst, 16, 31);
> +
> +	fpr_d = &VCPU_FPR(vcpu, ax_rd);
> +	fpr_a = &VCPU_FPR(vcpu, ax_ra);
> +	fpr_b = &VCPU_FPR(vcpu, ax_rb);
> +	fpr_c = &VCPU_FPR(vcpu, ax_rc);
> +
> +	rcomp = (inst & 1) ? true : false;
> +	cr = kvmppc_get_cr(vcpu);
> +
>   	if (!kvmppc_inst_is_paired_single(vcpu, inst))
>   		return EMULATE_FAIL;
>   
> diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
> index c5c052a..b7fffd1 100644
> --- a/arch/powerpc/kvm/book3s_pr.c
> +++ b/arch/powerpc/kvm/book3s_pr.c
> @@ -608,12 +608,9 @@ void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr)
>   
>   static int kvmppc_read_inst(struct kvm_vcpu *vcpu)
>   {
> -	ulong srr0 = kvmppc_get_pc(vcpu);
> -	u32 last_inst = kvmppc_get_last_inst(vcpu);
> -	int ret;
> +	u32 last_inst;
>   
> -	ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
> -	if (ret == -ENOENT) {
> +	if (kvmppc_get_last_inst(vcpu, &last_inst) == -ENOENT) {

ENOENT?

>   		ulong msr = vcpu->arch.shared->msr;
>   
>   		msr = kvmppc_set_field(msr, 33, 33, 1);
> @@ -867,15 +864,18 @@ int kvmppc_handle_exit_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   	{
>   		enum emulation_result er;
>   		ulong flags;
> +		u32 last_inst;
>   
>   program_interrupt:
>   		flags = vcpu->arch.shadow_srr1 & 0x1f0000ull;
> +		kvmppc_get_last_inst(vcpu, &last_inst);

No check for the return value?

>   
>   		if (vcpu->arch.shared->msr & MSR_PR) {
>   #ifdef EXIT_DEBUG
> -			printk(KERN_INFO "Userspace triggered 0x700 exception at 0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu));
> +			pr_info("Userspace triggered 0x700 exception at\n"
> +			    "0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), last_inst);
>   #endif
> -			if ((kvmppc_get_last_inst(vcpu) & 0xff0007ff) !=
> +			if ((last_inst & 0xff0007ff) !=
>   			    (INS_DCBZ & 0xfffffff7)) {
>   				kvmppc_core_queue_program(vcpu, flags);
>   				r = RESUME_GUEST;
> @@ -894,7 +894,7 @@ program_interrupt:
>   			break;
>   		case EMULATE_FAIL:
>   			printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
> -			       __func__, kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu));
> +			       __func__, kvmppc_get_pc(vcpu), last_inst);
>   			kvmppc_core_queue_program(vcpu, flags);
>   			r = RESUME_GUEST;
>   			break;
> @@ -911,8 +911,12 @@ program_interrupt:
>   		break;
>   	}
>   	case BOOK3S_INTERRUPT_SYSCALL:
> +	{
> +		u32 last_sc;
> +
> +		kvmppc_get_last_sc(vcpu, &last_sc);

No check for the return value?

>   		if (vcpu->arch.papr_enabled &&
> -		    (kvmppc_get_last_sc(vcpu) == 0x44000022) &&
> +		    (last_sc == 0x44000022) &&
>   		    !(vcpu->arch.shared->msr & MSR_PR)) {
>   			/* SC 1 papr hypercalls */
>   			ulong cmd = kvmppc_get_gpr(vcpu, 3);
> @@ -957,6 +961,7 @@ program_interrupt:
>   			r = RESUME_GUEST;
>   		}
>   		break;
> +	}
>   	case BOOK3S_INTERRUPT_FP_UNAVAIL:
>   	case BOOK3S_INTERRUPT_ALTIVEC:
>   	case BOOK3S_INTERRUPT_VSX:
> @@ -985,15 +990,20 @@ program_interrupt:
>   		break;
>   	}
>   	case BOOK3S_INTERRUPT_ALIGNMENT:
> +	{
> +		u32 last_inst;
> +
>   		if (kvmppc_read_inst(vcpu) == EMULATE_DONE) {
> -			vcpu->arch.shared->dsisr = kvmppc_alignment_dsisr(vcpu,
> -				kvmppc_get_last_inst(vcpu));
> -			vcpu->arch.shared->dar = kvmppc_alignment_dar(vcpu,
> -				kvmppc_get_last_inst(vcpu));
> +			kvmppc_get_last_inst(vcpu, &last_inst);

I think with an error-returning kvmppc_get_last_inst() we can just 
completely get rid of kvmppc_read_inst() and only use 
kvmppc_get_last_inst() instead.

> +			vcpu->arch.shared->dsisr =
> +				kvmppc_alignment_dsisr(vcpu, last_inst);
> +			vcpu->arch.shared->dar =
> +				kvmppc_alignment_dar(vcpu, last_inst);
>   			kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
>   		}
>   		r = RESUME_GUEST;
>   		break;
> +	}
>   	case BOOK3S_INTERRUPT_MACHINE_CHECK:
>   	case BOOK3S_INTERRUPT_TRACE:
>   		kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index ab62109..df25db0 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -752,6 +752,9 @@ static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
>   		 * they were actually modified by emulation. */
>   		return RESUME_GUEST_NV;
>   
> +	case EMULATE_AGAIN:
> +		return RESUME_GUEST;
> +
>   	case EMULATE_DO_DCR:
>   		run->exit_reason = KVM_EXIT_DCR;
>   		return RESUME_HOST;
> @@ -1911,6 +1914,17 @@ void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu)
>   	vcpu->kvm->arch.kvm_ops->vcpu_put(vcpu);
>   }
>   
> +int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst)
> +{
> +	int result = EMULATE_DONE;
> +
> +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
> +		result = kvmppc_ld_inst(vcpu, &vcpu->arch.last_inst);
> +	*inst = vcpu->arch.last_inst;

This function looks almost identical to the book3s one. Can we share the 
same code path here? Just always return false for 
kvmppc_needs_byteswap() on booke.


Alex

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v2 3/4] KVM: PPC: Alow kvmppc_get_last_inst() to fail
@ 2014-05-02  9:54     ` Alexander Graf
  0 siblings, 0 replies; 54+ messages in thread
From: Alexander Graf @ 2014-05-02  9:54 UTC (permalink / raw)
  To: Mihai Caraman; +Cc: kvm-ppc, kvm, linuxppc-dev

On 05/01/2014 02:45 AM, Mihai Caraman wrote:
> On book3e, the guest's last instruction was read on the exit path using the
> dedicated load external pid (lwepx) instruction. lwepx failures have to be
> handled by KVM and this would require additional checks in DO_KVM hooks
> (beside MSR[GS] = 1). However extra checks on host fast path are commonly
> considered too intrusive.
>
> This patch lays down the path for an alternative solution by allowing
> the kvmppc_get_last_inst() function to fail. Booke architectures may choose
> to read guest instruction from kvmppc_ld_inst() and to instruct the
> emulation layer either to fail or to emulate again.
>
> The emulation_result enum definition is not accessible from book3s asm headers.
> Move kvmppc_get_last_inst() definitions that require emulation_result
> to source files.
>
> Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
> ---
> v2:
>   - integrated kvmppc_get_last_inst() in book3s code and checked build
>   - addressed cosmetic feedback
>   - please validate book3s functionality!
>
>   arch/powerpc/include/asm/kvm_book3s.h    | 26 --------------------
>   arch/powerpc/include/asm/kvm_booke.h     |  5 ----
>   arch/powerpc/include/asm/kvm_ppc.h       |  2 ++
>   arch/powerpc/kvm/book3s.c                | 32 ++++++++++++++++++++++++
>   arch/powerpc/kvm/book3s.h                |  1 +
>   arch/powerpc/kvm/book3s_64_mmu_hv.c      | 16 +++---------
>   arch/powerpc/kvm/book3s_paired_singles.c | 42 ++++++++++++++++++++------------
>   arch/powerpc/kvm/book3s_pr.c             | 36 +++++++++++++++++----------
>   arch/powerpc/kvm/booke.c                 | 14 +++++++++++
>   arch/powerpc/kvm/booke.h                 |  3 +++
>   arch/powerpc/kvm/e500_mmu_host.c         |  7 ++++++
>   arch/powerpc/kvm/emulate.c               | 18 +++++++++-----
>   arch/powerpc/kvm/powerpc.c               | 10 ++++++--
>   13 files changed, 132 insertions(+), 80 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
> index bb1e38a..e2a89f3 100644
> --- a/arch/powerpc/include/asm/kvm_book3s.h
> +++ b/arch/powerpc/include/asm/kvm_book3s.h
> @@ -273,32 +273,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
>   	return (vcpu->arch.shared->msr & MSR_LE) != (MSR_KERNEL & MSR_LE);
>   }
>   
> -static inline u32 kvmppc_get_last_inst_internal(struct kvm_vcpu *vcpu, ulong pc)
> -{
> -	/* Load the instruction manually if it failed to do so in the
> -	 * exit path */
> -	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
> -		kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
> -
> -	return kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) :
> -		vcpu->arch.last_inst;
> -}
> -
> -static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
> -{
> -	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu));
> -}
> -
> -/*
> - * Like kvmppc_get_last_inst(), but for fetching a sc instruction.
> - * Because the sc instruction sets SRR0 to point to the following
> - * instruction, we have to fetch from pc - 4.
> - */
> -static inline u32 kvmppc_get_last_sc(struct kvm_vcpu *vcpu)
> -{
> -	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu) - 4);
> -}
> -
>   static inline ulong kvmppc_get_fault_dar(struct kvm_vcpu *vcpu)
>   {
>   	return vcpu->arch.fault_dar;
> diff --git a/arch/powerpc/include/asm/kvm_booke.h b/arch/powerpc/include/asm/kvm_booke.h
> index 80d46b5..6db1ca0 100644
> --- a/arch/powerpc/include/asm/kvm_booke.h
> +++ b/arch/powerpc/include/asm/kvm_booke.h
> @@ -69,11 +69,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu)
>   	return false;
>   }
>   
> -static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
> -{
> -	return vcpu->arch.last_inst;
> -}
> -
>   static inline void kvmppc_set_ctr(struct kvm_vcpu *vcpu, ulong val)
>   {
>   	vcpu->arch.ctr = val;
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
> index 4096f16..6e7c358 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -72,6 +72,8 @@ extern int kvmppc_sanity_check(struct kvm_vcpu *vcpu);
>   extern int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu);
>   extern void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu);
>   
> +extern int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst);

Phew. Moving this into a separate function sure has some performance 
implications. Was there no way to keep it in a header?

You could just move it into its own .h file which we include after 
kvm_ppc.h. That way everything's available. That would also help me a 
lot with the little endian port where I'm also struggling with header 
file inclusion order and kvmppc_need_byteswap().

> +
>   /* Core-specific hooks */
>   
>   extern void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 gvaddr, gpa_t gpaddr,
> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
> index 94e597e..3553735 100644
> --- a/arch/powerpc/kvm/book3s.c
> +++ b/arch/powerpc/kvm/book3s.c
> @@ -463,6 +463,38 @@ mmio:
>   }
>   EXPORT_SYMBOL_GPL(kvmppc_ld);
>   
> +int kvmppc_get_last_inst_internal(struct kvm_vcpu *vcpu, ulong pc, u32 *inst)
> +{
> +	int ret = EMULATE_DONE;
> +
> +	/* Load the instruction manually if it failed to do so in the
> +	 * exit path */
> +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
> +		ret = kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst,
> +			false);
> +
> +	*inst = kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) :
> +		vcpu->arch.last_inst;
> +
> +	return ret;
> +}
> +
> +int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst)
> +{
> +	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu), inst);
> +}
> +
> +/*
> + * Like kvmppc_get_last_inst(), but for fetching a sc instruction.
> + * Because the sc instruction sets SRR0 to point to the following
> + * instruction, we have to fetch from pc - 4.
> + */
> +int kvmppc_get_last_sc(struct kvm_vcpu *vcpu, u32 *inst)
> +{
> +	return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu) - 4,
> +		inst);
> +}
> +
>   int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
>   {
>   	return 0;
> diff --git a/arch/powerpc/kvm/book3s.h b/arch/powerpc/kvm/book3s.h
> index 4bf956c..d85839d 100644
> --- a/arch/powerpc/kvm/book3s.h
> +++ b/arch/powerpc/kvm/book3s.h
> @@ -30,5 +30,6 @@ extern int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu,
>   					int sprn, ulong *spr_val);
>   extern int kvmppc_book3s_init_pr(void);
>   extern void kvmppc_book3s_exit_pr(void);
> +extern int kvmppc_get_last_sc(struct kvm_vcpu *vcpu, u32 *inst);
>   
>   #endif
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> index fb25ebc..7ffcc24 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> @@ -541,21 +541,13 @@ static int instruction_is_store(unsigned int instr)
>   static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   				  unsigned long gpa, gva_t ea, int is_store)
>   {
> -	int ret;
>   	u32 last_inst;
> -	unsigned long srr0 = kvmppc_get_pc(vcpu);
>   
> -	/* We try to load the last instruction.  We don't let
> -	 * emulate_instruction do it as it doesn't check what
> -	 * kvmppc_ld returns.
> +	/*
>   	 * If we fail, we just return to the guest and try executing it again.
>   	 */
> -	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) {
> -		ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
> -		if (ret != EMULATE_DONE || last_inst == KVM_INST_FETCH_FAILED)
> -			return RESUME_GUEST;
> -		vcpu->arch.last_inst = last_inst;
> -	}
> +	if (kvmppc_get_last_inst(vcpu, &last_inst) != EMULATE_DONE)
> +		return RESUME_GUEST;
>   
>   	/*
>   	 * WARNING: We do not know for sure whether the instruction we just
> @@ -569,7 +561,7 @@ static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   	 * we just return and retry the instruction.
>   	 */
>   
> -	if (instruction_is_store(kvmppc_get_last_inst(vcpu)) != !!is_store)
> +	if (instruction_is_store(last_inst) != !!is_store)
>   		return RESUME_GUEST;
>   
>   	/*
> diff --git a/arch/powerpc/kvm/book3s_paired_singles.c b/arch/powerpc/kvm/book3s_paired_singles.c
> index c1abd95..9a6e565 100644
> --- a/arch/powerpc/kvm/book3s_paired_singles.c
> +++ b/arch/powerpc/kvm/book3s_paired_singles.c
> @@ -637,26 +637,36 @@ static int kvmppc_ps_one_in(struct kvm_vcpu *vcpu, bool rc,
>   
>   int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu *vcpu)
>   {
> -	u32 inst = kvmppc_get_last_inst(vcpu);
> -	enum emulation_result emulated = EMULATE_DONE;
> -
> -	int ax_rd = inst_get_field(inst, 6, 10);
> -	int ax_ra = inst_get_field(inst, 11, 15);
> -	int ax_rb = inst_get_field(inst, 16, 20);
> -	int ax_rc = inst_get_field(inst, 21, 25);
> -	short full_d = inst_get_field(inst, 16, 31);
> -
> -	u64 *fpr_d = &VCPU_FPR(vcpu, ax_rd);
> -	u64 *fpr_a = &VCPU_FPR(vcpu, ax_ra);
> -	u64 *fpr_b = &VCPU_FPR(vcpu, ax_rb);
> -	u64 *fpr_c = &VCPU_FPR(vcpu, ax_rc);
> -
> -	bool rcomp = (inst & 1) ? true : false;
> -	u32 cr = kvmppc_get_cr(vcpu);
> +	u32 inst;
> +	enum emulation_result emulated;
> +	int ax_rd, ax_ra, ax_rb, ax_rc;
> +	short full_d;
> +	u64 *fpr_d, *fpr_a, *fpr_b, *fpr_c;
> +
> +	bool rcomp;
> +	u32 cr;
>   #ifdef DEBUG
>   	int i;
>   #endif
>   
> +	emulated = kvmppc_get_last_inst(vcpu, &inst);
> +	if (emulated != EMULATE_DONE)
> +		return emulated;
> +
> +	ax_rd = inst_get_field(inst, 6, 10);
> +	ax_ra = inst_get_field(inst, 11, 15);
> +	ax_rb = inst_get_field(inst, 16, 20);
> +	ax_rc = inst_get_field(inst, 21, 25);
> +	full_d = inst_get_field(inst, 16, 31);
> +
> +	fpr_d = &VCPU_FPR(vcpu, ax_rd);
> +	fpr_a = &VCPU_FPR(vcpu, ax_ra);
> +	fpr_b = &VCPU_FPR(vcpu, ax_rb);
> +	fpr_c = &VCPU_FPR(vcpu, ax_rc);
> +
> +	rcomp = (inst & 1) ? true : false;
> +	cr = kvmppc_get_cr(vcpu);
> +
>   	if (!kvmppc_inst_is_paired_single(vcpu, inst))
>   		return EMULATE_FAIL;
>   
> diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
> index c5c052a..b7fffd1 100644
> --- a/arch/powerpc/kvm/book3s_pr.c
> +++ b/arch/powerpc/kvm/book3s_pr.c
> @@ -608,12 +608,9 @@ void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr)
>   
>   static int kvmppc_read_inst(struct kvm_vcpu *vcpu)
>   {
> -	ulong srr0 = kvmppc_get_pc(vcpu);
> -	u32 last_inst = kvmppc_get_last_inst(vcpu);
> -	int ret;
> +	u32 last_inst;
>   
> -	ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
> -	if (ret == -ENOENT) {
> +	if (kvmppc_get_last_inst(vcpu, &last_inst) == -ENOENT) {

ENOENT?

>   		ulong msr = vcpu->arch.shared->msr;
>   
>   		msr = kvmppc_set_field(msr, 33, 33, 1);
> @@ -867,15 +864,18 @@ int kvmppc_handle_exit_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   	{
>   		enum emulation_result er;
>   		ulong flags;
> +		u32 last_inst;
>   
>   program_interrupt:
>   		flags = vcpu->arch.shadow_srr1 & 0x1f0000ull;
> +		kvmppc_get_last_inst(vcpu, &last_inst);

No check for the return value?

>   
>   		if (vcpu->arch.shared->msr & MSR_PR) {
>   #ifdef EXIT_DEBUG
> -			printk(KERN_INFO "Userspace triggered 0x700 exception at 0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu));
> +			pr_info("Userspace triggered 0x700 exception at\n"
> +			    "0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), last_inst);
>   #endif
> -			if ((kvmppc_get_last_inst(vcpu) & 0xff0007ff) !=
> +			if ((last_inst & 0xff0007ff) !=
>   			    (INS_DCBZ & 0xfffffff7)) {
>   				kvmppc_core_queue_program(vcpu, flags);
>   				r = RESUME_GUEST;
> @@ -894,7 +894,7 @@ program_interrupt:
>   			break;
>   		case EMULATE_FAIL:
>   			printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
> -			       __func__, kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu));
> +			       __func__, kvmppc_get_pc(vcpu), last_inst);
>   			kvmppc_core_queue_program(vcpu, flags);
>   			r = RESUME_GUEST;
>   			break;
> @@ -911,8 +911,12 @@ program_interrupt:
>   		break;
>   	}
>   	case BOOK3S_INTERRUPT_SYSCALL:
> +	{
> +		u32 last_sc;
> +
> +		kvmppc_get_last_sc(vcpu, &last_sc);

No check for the return value?

>   		if (vcpu->arch.papr_enabled &&
> -		    (kvmppc_get_last_sc(vcpu) == 0x44000022) &&
> +		    (last_sc == 0x44000022) &&
>   		    !(vcpu->arch.shared->msr & MSR_PR)) {
>   			/* SC 1 papr hypercalls */
>   			ulong cmd = kvmppc_get_gpr(vcpu, 3);
> @@ -957,6 +961,7 @@ program_interrupt:
>   			r = RESUME_GUEST;
>   		}
>   		break;
> +	}
>   	case BOOK3S_INTERRUPT_FP_UNAVAIL:
>   	case BOOK3S_INTERRUPT_ALTIVEC:
>   	case BOOK3S_INTERRUPT_VSX:
> @@ -985,15 +990,20 @@ program_interrupt:
>   		break;
>   	}
>   	case BOOK3S_INTERRUPT_ALIGNMENT:
> +	{
> +		u32 last_inst;
> +
>   		if (kvmppc_read_inst(vcpu) == EMULATE_DONE) {
> -			vcpu->arch.shared->dsisr = kvmppc_alignment_dsisr(vcpu,
> -				kvmppc_get_last_inst(vcpu));
> -			vcpu->arch.shared->dar = kvmppc_alignment_dar(vcpu,
> -				kvmppc_get_last_inst(vcpu));
> +			kvmppc_get_last_inst(vcpu, &last_inst);

I think with an error-returning kvmppc_get_last_inst() we can just 
completely get rid of kvmppc_read_inst() and only use 
kvmppc_get_last_inst() instead.

> +			vcpu->arch.shared->dsisr =
> +				kvmppc_alignment_dsisr(vcpu, last_inst);
> +			vcpu->arch.shared->dar =
> +				kvmppc_alignment_dar(vcpu, last_inst);
>   			kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
>   		}
>   		r = RESUME_GUEST;
>   		break;
> +	}
>   	case BOOK3S_INTERRUPT_MACHINE_CHECK:
>   	case BOOK3S_INTERRUPT_TRACE:
>   		kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index ab62109..df25db0 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -752,6 +752,9 @@ static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
>   		 * they were actually modified by emulation. */
>   		return RESUME_GUEST_NV;
>   
> +	case EMULATE_AGAIN:
> +		return RESUME_GUEST;
> +
>   	case EMULATE_DO_DCR:
>   		run->exit_reason = KVM_EXIT_DCR;
>   		return RESUME_HOST;
> @@ -1911,6 +1914,17 @@ void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu)
>   	vcpu->kvm->arch.kvm_ops->vcpu_put(vcpu);
>   }
>   
> +int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst)
> +{
> +	int result = EMULATE_DONE;
> +
> +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
> +		result = kvmppc_ld_inst(vcpu, &vcpu->arch.last_inst);
> +	*inst = vcpu->arch.last_inst;

This function looks almost identical to the book3s one. Can we share the 
same code path here? Just always return false for 
kvmppc_needs_byteswap() on booke.


Alex


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v2 4/4] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation
  2014-05-01  0:45   ` Mihai Caraman
  (?)
@ 2014-05-02 10:01     ` Alexander Graf
  -1 siblings, 0 replies; 54+ messages in thread
From: Alexander Graf @ 2014-05-02 10:01 UTC (permalink / raw)
  To: Mihai Caraman; +Cc: kvm-ppc, kvm, linuxppc-dev

On 05/01/2014 02:45 AM, Mihai Caraman wrote:
> On bookehv, a vcpu's last instruction is read using the load external pid
> (lwepx) instruction. lwepx exceptions (DTLB_MISS, DSI and LRAT) need
> to be handled by KVM. These exceptions originate from host state
> (MSR[GS] = 0) which implies additional checks in DO_KVM macro (beside
> the current MSR[GS] = 1) by looking at the Exception Syndrome Register
> (ESR[EPID]) and the External PID Load Context Register (EPLC[EGS]).
> Doing this on each Data TLB miss exception is obviously too intrusive
> for the host.
>
> Get rid of lwepx and acquire the last instruction in kvmppc_get_last_inst()
> by searching for the physical address and kmapping it. This fixes an
> infinite loop caused by lwepx's data TLB miss handled in the host
> and the TODO for TLB eviction and execute-but-not-read entries.
>
> Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
> ---
> v2:
>   - reworked patch description
>   - used pr_* functions
>   - addressed cosmetic feedback
>
>   arch/powerpc/kvm/bookehv_interrupts.S | 37 ++++----------
>   arch/powerpc/kvm/e500_mmu_host.c      | 93 ++++++++++++++++++++++++++++++++++-
>   2 files changed, 101 insertions(+), 29 deletions(-)
>
> diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S
> index 925da71..65eff4c 100644
> --- a/arch/powerpc/kvm/bookehv_interrupts.S
> +++ b/arch/powerpc/kvm/bookehv_interrupts.S
> @@ -122,38 +122,14 @@
>   1:
>   
>   	.if	\flags & NEED_EMU
> -	/*
> -	 * This assumes you have external PID support.
> -	 * To support a bookehv CPU without external PID, you'll
> -	 * need to look up the TLB entry and create a temporary mapping.
> -	 *
> -	 * FIXME: we don't currently handle if the lwepx faults.  PR-mode
> -	 * booke doesn't handle it either.  Since Linux doesn't use
> -	 * broadcast tlbivax anymore, the only way this should happen is
> -	 * if the guest maps its memory execute-but-not-read, or if we
> -	 * somehow take a TLB miss in the middle of this entry code and
> -	 * evict the relevant entry.  On e500mc, all kernel lowmem is
> -	 * bolted into TLB1 large page mappings, and we don't use
> -	 * broadcast invalidates, so we should not take a TLB miss here.
> -	 *
> -	 * Later we'll need to deal with faults here.  Disallowing guest
> -	 * mappings that are execute-but-not-read could be an option on
> -	 * e500mc, but not on chips with an LRAT if it is used.
> -	 */
> -
> -	mfspr	r3, SPRN_EPLC	/* will already have correct ELPID and EGS */
>   	PPC_STL	r15, VCPU_GPR(R15)(r4)
>   	PPC_STL	r16, VCPU_GPR(R16)(r4)
>   	PPC_STL	r17, VCPU_GPR(R17)(r4)
>   	PPC_STL	r18, VCPU_GPR(R18)(r4)
>   	PPC_STL	r19, VCPU_GPR(R19)(r4)
> -	mr	r8, r3
>   	PPC_STL	r20, VCPU_GPR(R20)(r4)
> -	rlwimi	r8, r6, EPC_EAS_SHIFT - MSR_IR_LG, EPC_EAS
>   	PPC_STL	r21, VCPU_GPR(R21)(r4)
> -	rlwimi	r8, r6, EPC_EPR_SHIFT - MSR_PR_LG, EPC_EPR
>   	PPC_STL	r22, VCPU_GPR(R22)(r4)
> -	rlwimi	r8, r10, EPC_EPID_SHIFT, EPC_EPID
>   	PPC_STL	r23, VCPU_GPR(R23)(r4)
>   	PPC_STL	r24, VCPU_GPR(R24)(r4)
>   	PPC_STL	r25, VCPU_GPR(R25)(r4)
> @@ -163,10 +139,15 @@
>   	PPC_STL	r29, VCPU_GPR(R29)(r4)
>   	PPC_STL	r30, VCPU_GPR(R30)(r4)
>   	PPC_STL	r31, VCPU_GPR(R31)(r4)
> -	mtspr	SPRN_EPLC, r8
> -	isync
> -	lwepx   r9, 0, r5
> -	mtspr	SPRN_EPLC, r3
> +
> +	/*
> +	 * We don't use external PID support. lwepx faults would need to be
> +	 * handled by KVM and this implies additional code in DO_KVM (for
> +	 * DTLB_MISS, DSI and LRAT) to check ESR[EPID] and EPLC[EGS] which
> +	 * is too intrusive for the host. Get the last instruction in
> +	 * kvmppc_get_last_inst().
> +	 */
> +	li	r9, KVM_INST_FETCH_FAILED
>   	stw	r9, VCPU_LAST_INST(r4)
>   	.endif
>   
> diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
> index fcccbb3..94b8be0 100644
> --- a/arch/powerpc/kvm/e500_mmu_host.c
> +++ b/arch/powerpc/kvm/e500_mmu_host.c
> @@ -607,11 +607,102 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr,
>   	}
>   }
>   
> +#ifdef CONFIG_KVM_BOOKE_HV
> +int kvmppc_ld_inst(struct kvm_vcpu *vcpu, u32 *instr)
> +{
> +	gva_t geaddr;
> +	hpa_t addr;
> +	hfn_t pfn;
> +	hva_t eaddr;
> +	u32 mas0, mas1, mas2, mas3;
> +	u64 mas7_mas3;
> +	struct page *page;
> +	unsigned int addr_space, psize_shift;
> +	bool pr;
> +	unsigned long flags;
> +
> +	/* Search TLB for guest pc to get the real address */
> +	geaddr = kvmppc_get_pc(vcpu);
> +	addr_space = (vcpu->arch.shared->msr & MSR_IS) >> MSR_IR_LG;
> +
> +	local_irq_save(flags);
> +	mtspr(SPRN_MAS6, (vcpu->arch.pid << MAS6_SPID_SHIFT) | addr_space);
> +	mtspr(SPRN_MAS5, MAS5_SGS | vcpu->kvm->arch.lpid);
> +	asm volatile("tlbsx 0, %[geaddr]\n" : :
> +		     [geaddr] "r" (geaddr));
> +	mtspr(SPRN_MAS5, 0);
> +	mtspr(SPRN_MAS8, 0);
> +	mas0 = mfspr(SPRN_MAS0);
> +	mas1 = mfspr(SPRN_MAS1);
> +	mas2 = mfspr(SPRN_MAS2);
> +	mas3 = mfspr(SPRN_MAS3);
> +	mas7_mas3 = (((u64) mfspr(SPRN_MAS7)) << 32) | mas3;
> +	local_irq_restore(flags);
> +
> +	/*
> +	 * If the TLB entry for guest pc was evicted, return to the guest.
> +	 * There are high chances to find a valid TLB entry next time.
> +	 */
> +	if (!(mas1 & MAS1_VALID))
> +		return EMULATE_AGAIN;
>   
> +	/*
> +	 * Another thread may rewrite the TLB entry in parallel, don't
> +	 * execute from the address if the execute permission is not set
> +	 */
> +	pr = vcpu->arch.shared->msr & MSR_PR;
> +	if ((pr && !(mas3 & MAS3_UX)) || (!pr && !(mas3 & MAS3_SX))) {
> +		pr_debug("Instruction emulation from a guest page\n"
> +				"without execute permission\n");
> +		kvmppc_core_queue_program(vcpu, 0);
> +		return EMULATE_AGAIN;
> +	}
> +
> +	/*
> +	 * We will map the real address through a cacheable page, so we will
> +	 * not support cache-inhibited guest pages. Fortunately emulated
> +	 * instructions should not live there.
> +	 */
> +	if (mas2 & MAS2_I) {
> +		pr_debug("Instruction emulation from cache-inhibited\n"
> +				"guest pages is not supported\n");
> +		return EMULATE_FAIL;
> +	}
> +
> +	/* Get page size */
> +	psize_shift = MAS1_GET_TSIZE(mas1) + 10;
> +
> +	/* Map a page and get guest's instruction */
> +	addr = (mas7_mas3 & (~0ULL << psize_shift)) |
> +	       (geaddr & ((1ULL << psize_shift) - 1ULL));
> +	pfn = addr >> PAGE_SHIFT;
> +
> +	/* Guard us against emulation from devices area */
> +	if (unlikely(!page_is_ram(pfn))) {
> +		pr_debug("Instruction emulation from non-RAM host\n"
> +				"pages is not supported\n");
> +		return EMULATE_FAIL;
> +	}
> +
> +	if (unlikely(!pfn_valid(pfn))) {
> +		pr_debug("Invalid frame number\n");
> +		return EMULATE_FAIL;
> +	}
> +
> +	page = pfn_to_page(pfn);
> +	eaddr = (unsigned long)kmap_atomic(page);
> +	eaddr |= addr & ~PAGE_MASK;
> +	*instr = *(u32 *)eaddr;
> +	kunmap_atomic((u32 *)eaddr);

I think I'd rather write this as

   *instr = *(u32 *)(eaddr | (addr & ~PAGE_MASK));
   kunmap_atomic((void*)eaddr);

to make sure we pass the unmap function the same value we got from the 
map function.

Otherwise looks good to me.


Alex

^ permalink raw reply	[flat|nested] 54+ messages in thread


* Re: [PATCH v2 4/4] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation
@ 2014-05-02 10:01     ` Alexander Graf
  0 siblings, 0 replies; 54+ messages in thread
From: Alexander Graf @ 2014-05-02 10:01 UTC (permalink / raw)
  To: Mihai Caraman; +Cc: kvm-ppc, kvm, linuxppc-dev

On 05/01/2014 02:45 AM, Mihai Caraman wrote:
> On bookehv vcpu's last instruction is read using load external pid
> (lwepx) instruction. lwepx exceptions (DTLB_MISS, DSI and LRAT) need
> to be handled by KVM. These exceptions originate from host state
> (MSR[GS] = 0) which implies additional checks in DO_KVM macro (beside
> the current MSR[GS] = 1) by looking at the Exception Syndrome Register
> (ESR[EPID]) and the External PID Load Context Register (EPLC[EGS]).
> Doing this on each Data TLB miss exception is obviously too intrusive
> for the host.
>
> Get rid of lwepx and acquire the last instruction in kvmppc_get_last_inst()
> by searching for the physical address and kmap it. This fixes an
> infinite loop caused by lwepx's data TLB miss handled in the host
> and the TODO for TLB eviction and execute-but-not-read entries.
>
> Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
> ---
> v2:
>   - reworked patch description
>   - used pr_* functions
>   - addressed cosmetic feedback
>
>   arch/powerpc/kvm/bookehv_interrupts.S | 37 ++++----------
>   arch/powerpc/kvm/e500_mmu_host.c      | 93 ++++++++++++++++++++++++++++++++++-
>   2 files changed, 101 insertions(+), 29 deletions(-)
>
> diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S
> index 925da71..65eff4c 100644
> --- a/arch/powerpc/kvm/bookehv_interrupts.S
> +++ b/arch/powerpc/kvm/bookehv_interrupts.S
> @@ -122,38 +122,14 @@
>   1:
>   
>   	.if	\flags & NEED_EMU
> -	/*
> -	 * This assumes you have external PID support.
> -	 * To support a bookehv CPU without external PID, you'll
> -	 * need to look up the TLB entry and create a temporary mapping.
> -	 *
> -	 * FIXME: we don't currently handle if the lwepx faults.  PR-mode
> -	 * booke doesn't handle it either.  Since Linux doesn't use
> -	 * broadcast tlbivax anymore, the only way this should happen is
> -	 * if the guest maps its memory execute-but-not-read, or if we
> -	 * somehow take a TLB miss in the middle of this entry code and
> -	 * evict the relevant entry.  On e500mc, all kernel lowmem is
> -	 * bolted into TLB1 large page mappings, and we don't use
> -	 * broadcast invalidates, so we should not take a TLB miss here.
> -	 *
> -	 * Later we'll need to deal with faults here.  Disallowing guest
> -	 * mappings that are execute-but-not-read could be an option on
> -	 * e500mc, but not on chips with an LRAT if it is used.
> -	 */
> -
> -	mfspr	r3, SPRN_EPLC	/* will already have correct ELPID and EGS */
>   	PPC_STL	r15, VCPU_GPR(R15)(r4)
>   	PPC_STL	r16, VCPU_GPR(R16)(r4)
>   	PPC_STL	r17, VCPU_GPR(R17)(r4)
>   	PPC_STL	r18, VCPU_GPR(R18)(r4)
>   	PPC_STL	r19, VCPU_GPR(R19)(r4)
> -	mr	r8, r3
>   	PPC_STL	r20, VCPU_GPR(R20)(r4)
> -	rlwimi	r8, r6, EPC_EAS_SHIFT - MSR_IR_LG, EPC_EAS
>   	PPC_STL	r21, VCPU_GPR(R21)(r4)
> -	rlwimi	r8, r6, EPC_EPR_SHIFT - MSR_PR_LG, EPC_EPR
>   	PPC_STL	r22, VCPU_GPR(R22)(r4)
> -	rlwimi	r8, r10, EPC_EPID_SHIFT, EPC_EPID
>   	PPC_STL	r23, VCPU_GPR(R23)(r4)
>   	PPC_STL	r24, VCPU_GPR(R24)(r4)
>   	PPC_STL	r25, VCPU_GPR(R25)(r4)
> @@ -163,10 +139,15 @@
>   	PPC_STL	r29, VCPU_GPR(R29)(r4)
>   	PPC_STL	r30, VCPU_GPR(R30)(r4)
>   	PPC_STL	r31, VCPU_GPR(R31)(r4)
> -	mtspr	SPRN_EPLC, r8
> -	isync
> -	lwepx   r9, 0, r5
> -	mtspr	SPRN_EPLC, r3
> +
> +	/*
> +	 * We don't use external PID support. lwepx faults would need to be
> +	 * handled by KVM and this implies additional code in DO_KVM (for
> +	 * DTLB_MISS, DSI and LRAT) to check ESR[EPID] and EPLC[EGS] which
> +	 * is too intrusive for the host. Get the last instruction in
> +	 * kvmppc_get_last_inst().
> +	 */
> +	li	r9, KVM_INST_FETCH_FAILED
>   	stw	r9, VCPU_LAST_INST(r4)
>   	.endif
>   
> diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
> index fcccbb3..94b8be0 100644
> --- a/arch/powerpc/kvm/e500_mmu_host.c
> +++ b/arch/powerpc/kvm/e500_mmu_host.c
> @@ -607,11 +607,102 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr,
>   	}
>   }
>   
> +#ifdef CONFIG_KVM_BOOKE_HV
> +int kvmppc_ld_inst(struct kvm_vcpu *vcpu, u32 *instr)
> +{
> +	gva_t geaddr;
> +	hpa_t addr;
> +	hfn_t pfn;
> +	hva_t eaddr;
> +	u32 mas0, mas1, mas2, mas3;
> +	u64 mas7_mas3;
> +	struct page *page;
> +	unsigned int addr_space, psize_shift;
> +	bool pr;
> +	unsigned long flags;
> +
> +	/* Search TLB for guest pc to get the real address */
> +	geaddr = kvmppc_get_pc(vcpu);
> +	addr_space = (vcpu->arch.shared->msr & MSR_IS) >> MSR_IR_LG;
> +
> +	local_irq_save(flags);
> +	mtspr(SPRN_MAS6, (vcpu->arch.pid << MAS6_SPID_SHIFT) | addr_space);
> +	mtspr(SPRN_MAS5, MAS5_SGS | vcpu->kvm->arch.lpid);
> +	asm volatile("tlbsx 0, %[geaddr]\n" : :
> +		     [geaddr] "r" (geaddr));
> +	mtspr(SPRN_MAS5, 0);
> +	mtspr(SPRN_MAS8, 0);
> +	mas0 = mfspr(SPRN_MAS0);
> +	mas1 = mfspr(SPRN_MAS1);
> +	mas2 = mfspr(SPRN_MAS2);
> +	mas3 = mfspr(SPRN_MAS3);
> +	mas7_mas3 = (((u64) mfspr(SPRN_MAS7)) << 32) | mas3;
> +	local_irq_restore(flags);
> +
> +	/*
> +	 * If the TLB entry for guest pc was evicted, return to the guest.
> +	 * There is a good chance that a valid TLB entry will be found next time.
> +	 */
> +	if (!(mas1 & MAS1_VALID))
> +		return EMULATE_AGAIN;
>   
> +	/*
> +	 * Another thread may rewrite the TLB entry in parallel; don't
> +	 * execute from the address if the execute permission is not set.
> +	 */
> +	pr = vcpu->arch.shared->msr & MSR_PR;
> +	if ((pr && !(mas3 & MAS3_UX)) || (!pr && !(mas3 & MAS3_SX))) {
> +		pr_debug("Instruction emulation from a guest page\n"
> +				"without execute permission\n");
> +		kvmppc_core_queue_program(vcpu, 0);
> +		return EMULATE_AGAIN;
> +	}
> +
> +	/*
> +	 * We will map the real address through a cacheable page, so we will
> +	 * not support cache-inhibited guest pages. Fortunately emulated
> +	 * instructions should not live there.
> +	 */
> +	if (mas2 & MAS2_I) {
> +		pr_debug("Instruction emulation from cache-inhibited\n"
> +				"guest pages is not supported\n");
> +		return EMULATE_FAIL;
> +	}
> +
> +	/* Get page size */
> +	psize_shift = MAS1_GET_TSIZE(mas1) + 10;
> +
> +	/* Map a page and get guest's instruction */
> +	addr = (mas7_mas3 & (~0ULL << psize_shift)) |
> +	       (geaddr & ((1ULL << psize_shift) - 1ULL));
> +	pfn = addr >> PAGE_SHIFT;
> +
> +	/* Guard us against emulation from devices area */
> +	if (unlikely(!page_is_ram(pfn))) {
> +		pr_debug("Instruction emulation from non-RAM host\n"
> +				"pages is not supported\n");
> +		return EMULATE_FAIL;
> +	}
> +
> +	if (unlikely(!pfn_valid(pfn))) {
> +		pr_debug("Invalid frame number\n");
> +		return EMULATE_FAIL;
> +	}
> +
> +	page = pfn_to_page(pfn);
> +	eaddr = (unsigned long)kmap_atomic(page);
> +	eaddr |= addr & ~PAGE_MASK;
> +	*instr = *(u32 *)eaddr;
> +	kunmap_atomic((u32 *)eaddr);

I think I'd rather write this as

   *instr = *(u32 *)(eaddr | (addr & ~PAGE_MASK));
   kunmap_atomic((void*)eaddr);

to make sure we pass the unmap function the same value we got from the 
map function.

Otherwise looks good to me.


Alex


^ permalink raw reply	[flat|nested] 54+ messages in thread

* RE: [PATCH v2 4/4] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation
  2014-05-02 10:01     ` Alexander Graf
  (?)
@ 2014-05-02 10:12       ` David Laight
  -1 siblings, 0 replies; 54+ messages in thread
From: David Laight @ 2014-05-02 10:12 UTC (permalink / raw)
  To: 'Alexander Graf', Mihai Caraman; +Cc: linuxppc-dev, kvm, kvm-ppc

From: Alexander Graf
...
> > +	page = pfn_to_page(pfn);
> > +	eaddr = (unsigned long)kmap_atomic(page);
> > +	eaddr |= addr & ~PAGE_MASK;
> > +	*instr = *(u32 *)eaddr;
> > +	kunmap_atomic((u32 *)eaddr);
> 
> I think I'd rather write this as
> 
>    *instr = *(u32 *)(eaddr | (addr & ~PAGE_MASK));
>    kunmap_atomic((void*)eaddr);
> 
> to make sure we pass the unmap function the same value we got from the
> map function.
> 
> Otherwise looks good to me.

Is there any mileage in keeping a virtual address page allocated (per cpu)
for this (and similar) accesses to physical memory?
Not having to search for a free VA page might speed things up (if that matters).

You also probably want the page mapped uncached - no point polluting the data
cache.

	David


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v2 4/4] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation
  2014-05-02 10:12       ` David Laight
  (?)
@ 2014-05-02 11:10         ` Alexander Graf
  -1 siblings, 0 replies; 54+ messages in thread
From: Alexander Graf @ 2014-05-02 11:10 UTC (permalink / raw)
  To: David Laight; +Cc: Mihai Caraman, linuxppc-dev, kvm, kvm-ppc

On 05/02/2014 12:12 PM, David Laight wrote:
> From: Alexander Graf
> ...
>>> +	page = pfn_to_page(pfn);
>>> +	eaddr = (unsigned long)kmap_atomic(page);
>>> +	eaddr |= addr & ~PAGE_MASK;
>>> +	*instr = *(u32 *)eaddr;
>>> +	kunmap_atomic((u32 *)eaddr);
>> I think I'd rather write this as
>>
>>     *instr = *(u32 *)(eaddr | (addr & ~PAGE_MASK));
>>     kunmap_atomic((void*)eaddr);
>>
>> to make sure we pass the unmap function the same value we got from the
>> map function.
>>
>> Otherwise looks good to me.
> Is there any mileage in keeping a virtual address page allocated (per cpu)
> for this (and similar) accesses to physical memory?
> Not having to search for a free VA page might speed things up (if that matters).

I like the idea, though I'm not sure how that would best fit into the 
current memory mapping ecosystem.

> You also probably want the page mapped uncached - no point polluting the data
> cache.

Do e500 chips have a shared I/D cache somewhere? If they do, that 
particular instruction would already be there, so no pollution but nice 
performance.


Alex

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v2 4/4] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation
  2014-05-02 11:10         ` Alexander Graf
  (?)
@ 2014-05-02 15:32           ` Scott Wood
  -1 siblings, 0 replies; 54+ messages in thread
From: Scott Wood @ 2014-05-02 15:32 UTC (permalink / raw)
  To: Alexander Graf; +Cc: David Laight, Mihai Caraman, linuxppc-dev, kvm, kvm-ppc

On Fri, 2014-05-02 at 13:10 +0200, Alexander Graf wrote:
> On 05/02/2014 12:12 PM, David Laight wrote:
> > You also probably want the page mapped uncached - no point polluting the data
> > cache.

We can't do that without creating an architecturally illegal alias
between cacheable and non-cacheable mappings.

> Do e500 chips have a shared I/D cache somewhere?

Yes.  Only L1 is separate.

-Scott

^ permalink raw reply	[flat|nested] 54+ messages in thread

* RE: [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst fixup"
  2014-05-02  9:24     ` Alexander Graf
  (?)
@ 2014-05-02 23:14       ` mihai.caraman
  -1 siblings, 0 replies; 54+ messages in thread
From: mihai.caraman @ 2014-05-02 23:14 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, kvm, linuxppc-dev

> From: Alexander Graf <agraf@suse.de>
> Sent: Friday, May 2, 2014 12:24 PM
> To: Caraman Mihai Claudiu-B02008
> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-dev@lists.ozlabs.org
> Subject: Re: [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst fixup"
> 
> On 05/01/2014 02:45 AM, Mihai Caraman wrote:
> > The commit 1d628af7 "add load inst fixup" made an attempt to handle
> > failures generated by reading the guest current instruction. The fixup
> > code that was added works by chance hiding the real issue.
> >
> > Load external pid (lwepx) instruction, used by KVM to read guest
> > instructions, is executed in a substituted guest translation context
> > (EPLC[EGS] = 1). In consequence lwepx's TLB error and data storage
> > interrupts need to be handled by KVM, even though these interrupts
> > are generated from host context (MSR[GS] = 0).
> >
> > Currently, KVM hooks only interrupts generated from guest context
> > (MSR[GS] = 1), doing minimal checks on the fast path to avoid host
> > performance degradation. As a result, the host kernel handles lwepx
> > faults searching the faulting guest data address (loaded in DEAR) in
> > its own Logical Partition ID (LPID) 0 context. In case a host translation
> > is found the execution returns to the lwepx instruction instead of the
> > fixup, the host ending up in an infinite loop.
> >
> > Revert the commit "add load inst fixup". lwepx issue will be addressed
> > in a subsequent patch without needing fixup code.
> >
> > Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
> 
> Just a random idea: Could we just switch IVOR2 during the critical lwepx
> phase? In fact, we could even do that later when we're already in C code
> and try to recover the last instruction. The code IVOR2 would point to
> would simply set the register we're trying to read to as LAST_INST_FAIL
> and rfi.

This was the first idea that sprang to my mind inspired from how DO_KVM
is hooked on PR. I actually did a simple POC for e500mc/e5500, but this will
not work on e6500 which has shared IVORs between HW threads.

-Mike

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst fixup"
  2014-05-02 23:14       ` mihai.caraman
  (?)
@ 2014-05-03 22:14         ` Alexander Graf
  -1 siblings, 0 replies; 54+ messages in thread
From: Alexander Graf @ 2014-05-03 22:14 UTC (permalink / raw)
  To: mihai.caraman; +Cc: kvm-ppc, kvm, linuxppc-dev



On 03.05.2014 at 01:14, "mihai.caraman@freescale.com" <mihai.caraman@freescale.com> wrote:

>> From: Alexander Graf <agraf@suse.de>
>> Sent: Friday, May 2, 2014 12:24 PM
>> To: Caraman Mihai Claudiu-B02008
>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-dev@lists.ozlabs.org
>> Subject: Re: [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst fixup"
>> 
>>> On 05/01/2014 02:45 AM, Mihai Caraman wrote:
>>> The commit 1d628af7 "add load inst fixup" made an attempt to handle
>>> failures generated by reading the guest current instruction. The fixup
>>> code that was added works by chance hiding the real issue.
>>> 
>>> Load external pid (lwepx) instruction, used by KVM to read guest
>>> instructions, is executed in a substituted guest translation context
>>> (EPLC[EGS] = 1). In consequence lwepx's TLB error and data storage
>>> interrupts need to be handled by KVM, even though these interrupts
>>> are generated from host context (MSR[GS] = 0).
>>> 
>>> Currently, KVM hooks only interrupts generated from guest context
>>> (MSR[GS] = 1), doing minimal checks on the fast path to avoid host
>>> performance degradation. As a result, the host kernel handles lwepx
>>> faults searching the faulting guest data address (loaded in DEAR) in
>>> its own Logical Partition ID (LPID) 0 context. In case a host translation
>>> is found the execution returns to the lwepx instruction instead of the
>>> fixup, the host ending up in an infinite loop.
>>> 
>>> Revert the commit "add load inst fixup". lwepx issue will be addressed
>>> in a subsequent patch without needing fixup code.
>>> 
>>> Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
>> 
>> Just a random idea: Could we just switch IVOR2 during the critical lwepx
>> phase? In fact, we could even do that later when we're already in C code
>> and try to recover the last instruction. The code IVOR2 would point to
>> would simply set the register we're trying to read to as LAST_INST_FAIL
>> and rfi.
> 
> This was the first idea that sprang to my mind inspired from how DO_KVM
> is hooked on PR. I actually did a simple POC for e500mc/e5500, but this will
> not work on e6500 which has shared IVORs between HW threads.

What if we combine the ideas? On read we flip the IVOR to a separate handler that checks for a field in the PACA. Only if that field is set, we treat the fault as kvm fault, otherwise we jump into the normal handler.

I suppose we'd have to also take a lock to make sure we don't race with the other thread when it wants to also read a guest instruction, but you get the idea.

I have no idea whether this would be any faster, it's more of a brainstorming thing really. But regardless this patch set would be a move into the right direction.

Btw, do we have any guarantees that we don't get scheduled away before we run kvmppc_get_last_inst()? If we run on a different core we can't read the inst anymore. Hrm.


Alex

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst fixup"
@ 2014-05-03 22:14         ` Alexander Graf
  0 siblings, 0 replies; 54+ messages in thread
From: Alexander Graf @ 2014-05-03 22:14 UTC (permalink / raw)
  To: mihai.caraman; +Cc: linuxppc-dev, kvm, kvm-ppc



Am 03.05.2014 um 01:14 schrieb "mihai.caraman@freescale.com" <mihai.caraman@=
freescale.com>:

>> From: Alexander Graf <agraf@suse.de>
>> Sent: Friday, May 2, 2014 12:24 PM
>> To: Caraman Mihai Claudiu-B02008
>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-dev@lists.ozla=
bs.org
>> Subject: Re: [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst fixup=
"
>>=20
>>> On 05/01/2014 02:45 AM, Mihai Caraman wrote:
>>> The commit 1d628af7 "add load inst fixup" made an attempt to handle
>>> failures generated by reading the guest current instruction. The fixup
>>> code that was added works by chance hiding the real issue.
>>>=20
>>> Load external pid (lwepx) instruction, used by KVM to read guest
>>> instructions, is executed in a subsituted guest translation context
>>> (EPLC[EGS] =3D 1). In consequence lwepx's TLB error and data storage
>>> interrupts need to be handled by KVM, even though these interrupts
>>> are generated from host context (MSR[GS] =3D 0).
>>>=20
>>> Currently, KVM hooks only interrupts generated from guest context
>>> (MSR[GS] =3D 1), doing minimal checks on the fast path to avoid host
>>> performance degradation. As a result, the host kernel handles lwepx
>>> faults searching the faulting guest data address (loaded in DEAR) in
>>> its own Logical Partition ID (LPID) 0 context. In case a host translatio=
n
>>> is found the execution returns to the lwepx instruction instead of the
>>> fixup, the host ending up in an infinite loop.
>>>=20
>>> Revert the commit "add load inst fixup". lwepx issue will be addressed
>>> in a subsequent patch without needing fixup code.
>>>=20
>>> Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
>>=20
>> Just a random idea: Could we just switch IVOR2 during the critical lwepx
>> phase? In fact, we could even do that later when we're already in C code
>> and try to recover the last instruction. The code IVOR2 would point to
>> would simply set the register we're trying to read to as LAST_INST_FAIL
>> and rfi.
>=20
> This was the first idea that sprang to my mind inspired from how DO_KVM
> is hooked on PR. I actually did a simple POC for e500mc/e5500, but this wi=
ll
> not work on e6500 which has shared IVORs between HW threads.

What if we combine the ideas? On read we flip the IVOR to a separate handler=
 that checks for a field in the PACA. Only if that field is set, we treat th=
e fault as kvm fault, otherwise we jump into the normal handler.

I suppose we'd have to also take a lock to make sure we don't race with the o=
ther thread when it wants to also read a guest instruction, but you get the i=
dea.

I have no idea whether this would be any faster, it's more of a brainstormin=
g thing really. But regardless this patch set would be a move into the right=
 direction.

Btw, do we have any guarantees that we don't get scheduled away before we ru=
n kvmppc_get_last_inst()? If we run on a different core we can't read the in=
st anymore. Hrm.


Alex

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst fixup"
@ 2014-05-03 22:14         ` Alexander Graf
  0 siblings, 0 replies; 54+ messages in thread
From: Alexander Graf @ 2014-05-03 22:14 UTC (permalink / raw)
  To: mihai.caraman; +Cc: kvm-ppc, kvm, linuxppc-dev



Am 03.05.2014 um 01:14 schrieb "mihai.caraman@freescale.com" <mihai.caraman@freescale.com>:

>> From: Alexander Graf <agraf@suse.de>
>> Sent: Friday, May 2, 2014 12:24 PM
>> To: Caraman Mihai Claudiu-B02008
>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-dev@lists.ozlabs.org
>> Subject: Re: [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst fixup"
>> 
>>> On 05/01/2014 02:45 AM, Mihai Caraman wrote:
>>> The commit 1d628af7 "add load inst fixup" made an attempt to handle
>>> failures generated by reading the guest current instruction. The fixup
>>> code that was added works by chance hiding the real issue.
>>> 
>>> Load external pid (lwepx) instruction, used by KVM to read guest
>>> instructions, is executed in a substituted guest translation context
>>> (EPLC[EGS] = 1). In consequence lwepx's TLB error and data storage
>>> interrupts need to be handled by KVM, even though these interrupts
>>> are generated from host context (MSR[GS] = 0).
>>> 
>>> Currently, KVM hooks only interrupts generated from guest context
>>> (MSR[GS] = 1), doing minimal checks on the fast path to avoid host
>>> performance degradation. As a result, the host kernel handles lwepx
>>> faults searching the faulting guest data address (loaded in DEAR) in
>>> its own Logical Partition ID (LPID) 0 context. In case a host translation
>>> is found the execution returns to the lwepx instruction instead of the
>>> fixup, the host ending up in an infinite loop.
>>> 
>>> Revert the commit "add load inst fixup". lwepx issue will be addressed
>>> in a subsequent patch without needing fixup code.
>>> 
>>> Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
>> 
>> Just a random idea: Could we just switch IVOR2 during the critical lwepx
>> phase? In fact, we could even do that later when we're already in C code
>> and try to recover the last instruction. The code IVOR2 would point to
>> would simply set the register we're trying to read to as LAST_INST_FAIL
>> and rfi.
> 
> This was the first idea that sprang to my mind inspired from how DO_KVM
> is hooked on PR. I actually did a simple POC for e500mc/e5500, but this will
> not work on e6500 which has shared IVORs between HW threads.

What if we combine the ideas? On read we flip the IVOR to a separate handler that checks for a field in the PACA. Only if that field is set, we treat the fault as kvm fault, otherwise we jump into the normal handler.

I suppose we'd have to also take a lock to make sure we don't race with the other thread when it wants to also read a guest instruction, but you get the idea.

I have no idea whether this would be any faster, it's more of a brainstorming thing really. But regardless this patch set would be a move into the right direction.

Btw, do we have any guarantees that we don't get scheduled away before we run kvmppc_get_last_inst()? If we run on a different core we can't read the inst anymore. Hrm.


Alex


^ permalink raw reply	[flat|nested] 54+ messages in thread

* RE: [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst fixup"
  2014-05-03 22:14         ` Alexander Graf
  (?)
@ 2014-05-06 15:48           ` mihai.caraman
  -1 siblings, 0 replies; 54+ messages in thread
From: mihai.caraman @ 2014-05-06 15:48 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev, kvm, kvm-ppc

> -----Original Message-----
> From: Alexander Graf [mailto:agraf@suse.de]
> Sent: Sunday, May 04, 2014 1:14 AM
> To: Caraman Mihai Claudiu-B02008
> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-
> dev@lists.ozlabs.org
> Subject: Re: [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst
> fixup"
> 
> 
> 
> Am 03.05.2014 um 01:14 schrieb "mihai.caraman@freescale.com"
> <mihai.caraman@freescale.com>:
> 
> >> From: Alexander Graf <agraf@suse.de>
> >> Sent: Friday, May 2, 2014 12:24 PM

> > This was the first idea that sprang to my mind inspired from how DO_KVM
> > is hooked on PR. I actually did a simple POC for e500mc/e5500, but this
> will
> > not work on e6500 which has shared IVORs between HW threads.
> 
> What if we combine the ideas? On read we flip the IVOR to a separate
> handler that checks for a field in the PACA. Only if that field is set,
> we treat the fault as kvm fault, otherwise we jump into the normal
> handler.
> 
> I suppose we'd have to also take a lock to make sure we don't race with
> the other thread when it wants to also read a guest instruction, but you
> get the idea.

This might be a solution for TLB eviction but not for execute-but-not-read
entries, which require access from host context.

> 
> I have no idea whether this would be any faster, it's more of a
> brainstorming thing really. But regardless this patch set would be a move
> into the right direction.
> 
> Btw, do we have any guarantees that we don't get scheduled away before we
> run kvmppc_get_last_inst()? If we run on a different core we can't read
> the inst anymore. Hrm.

It was your suggestion to move the logic from kvmppc_handle_exit() irq
disabled area to kvmppc_get_last_inst():

http://git.freescale.com/git/cgit.cgi/ppc/sdk/linux.git/tree/arch/powerpc/kvm/booke.c

Still, what is wrong if we get scheduled on another core? We will emulate
again and the guest will populate the TLB on the new core.

-Mike

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst fixup"
  2014-05-06 15:48           ` mihai.caraman
  (?)
@ 2014-05-06 15:54             ` Alexander Graf
  -1 siblings, 0 replies; 54+ messages in thread
From: Alexander Graf @ 2014-05-06 15:54 UTC (permalink / raw)
  To: mihai.caraman; +Cc: linuxppc-dev, kvm, kvm-ppc

On 05/06/2014 05:48 PM, mihai.caraman@freescale.com wrote:
>> -----Original Message-----
>> From: Alexander Graf [mailto:agraf@suse.de]
>> Sent: Sunday, May 04, 2014 1:14 AM
>> To: Caraman Mihai Claudiu-B02008
>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-
>> dev@lists.ozlabs.org
>> Subject: Re: [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst
>> fixup"
>>
>>
>>
>> Am 03.05.2014 um 01:14 schrieb "mihai.caraman@freescale.com"
>> <mihai.caraman@freescale.com>:
>>
>>>> From: Alexander Graf <agraf@suse.de>
>>>> Sent: Friday, May 2, 2014 12:24 PM
>>> This was the first idea that sprang to my mind inspired from how DO_KVM
>>> is hooked on PR. I actually did a simple POC for e500mc/e5500, but this
>> will
>>> not work on e6500 which has shared IVORs between HW threads.
>> What if we combine the ideas? On read we flip the IVOR to a separate
>> handler that checks for a field in the PACA. Only if that field is set,
>> we treat the fault as kvm fault, otherwise we jump into the normal
>> handler.
>>
>> I suppose we'd have to also take a lock to make sure we don't race with
>> the other thread when it wants to also read a guest instruction, but you
>> get the idea.
> This might be a solution for TLB eviction but not for execute-but-not-read
> entries which requires access from host context.

Good point :).

>
>> I have no idea whether this would be any faster, it's more of a
>> brainstorming thing really. But regardless this patch set would be a move
>> into the right direction.
>>
>> Btw, do we have any guarantees that we don't get scheduled away before we
>> run kvmppc_get_last_inst()? If we run on a different core we can't read
>> the inst anymore. Hrm.
> It was your suggestion to move the logic from kvmppc_handle_exit() irq
> disabled area to kvmppc_get_last_inst():
>
> http://git.freescale.com/git/cgit.cgi/ppc/sdk/linux.git/tree/arch/powerpc/kvm/booke.c
>
> Still, what is wrong if we get scheduled on another core? We will emulate
> again and the guest will populate the TLB on the new core.

Yes, it means we have to get the EMULATE_AGAIN code paths correct :). It 
also means we lose some performance with preemptive kernel configurations.


Alex


^ permalink raw reply	[flat|nested] 54+ messages in thread

* RE: [PATCH v2 3/4] KVM: PPC: Alow kvmppc_get_last_inst() to fail
  2014-05-02  9:54     ` Alexander Graf
  (?)
@ 2014-05-06 19:06       ` mihai.caraman
  -1 siblings, 0 replies; 54+ messages in thread
From: mihai.caraman @ 2014-05-06 19:06 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, kvm, linuxppc-dev

> -----Original Message-----
> From: Alexander Graf [mailto:agraf@suse.de]
> Sent: Friday, May 02, 2014 12:55 PM
> To: Caraman Mihai Claudiu-B02008
> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-
> dev@lists.ozlabs.org
> Subject: Re: [PATCH v2 3/4] KVM: PPC: Alow kvmppc_get_last_inst() to fail
> 
> On 05/01/2014 02:45 AM, Mihai Caraman wrote:
...
> > diff --git a/arch/powerpc/include/asm/kvm_ppc.h
> b/arch/powerpc/include/asm/kvm_ppc.h
> > index 4096f16..6e7c358 100644
> > --- a/arch/powerpc/include/asm/kvm_ppc.h
> > +++ b/arch/powerpc/include/asm/kvm_ppc.h
> > @@ -72,6 +72,8 @@ extern int kvmppc_sanity_check(struct kvm_vcpu
> *vcpu);
> >   extern int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu);
> >   extern void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu);
> >
> > +extern int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst);
> 
> Phew. Moving this into a separate function sure has some performance
> implications. Was there no way to keep it in a header?
> 
> You could just move it into its own .h file which we include after
> kvm_ppc.h. That way everything's available. That would also help me a
> lot with the little endian port where I'm also struggling with header
> file inclusion order and kvmppc_need_byteswap().

Great, I will do this.

> > diff --git a/arch/powerpc/kvm/book3s_pr.c
> b/arch/powerpc/kvm/book3s_pr.c
> > index c5c052a..b7fffd1 100644
> > --- a/arch/powerpc/kvm/book3s_pr.c
> > +++ b/arch/powerpc/kvm/book3s_pr.c
> > @@ -608,12 +608,9 @@ void kvmppc_giveup_ext(struct kvm_vcpu *vcpu,
> ulong msr)
> >
> >   static int kvmppc_read_inst(struct kvm_vcpu *vcpu)
> >   {
> > -	ulong srr0 = kvmppc_get_pc(vcpu);
> > -	u32 last_inst = kvmppc_get_last_inst(vcpu);
> > -	int ret;
> > +	u32 last_inst;
> >
> > -	ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
> > -	if (ret == -ENOENT) {
> > +	if (kvmppc_get_last_inst(vcpu, &last_inst) == -ENOENT) {
> 
> ENOENT?

You have to tell us :) Why does kvmppc_ld() mix emulation_result
enumeration with generic errors? Do you want to change that and
use EMULATE_FAIL instead?

> 
> >   		ulong msr = vcpu->arch.shared->msr;
> >
> >   		msr = kvmppc_set_field(msr, 33, 33, 1);
> > @@ -867,15 +864,18 @@ int kvmppc_handle_exit_pr(struct kvm_run *run,
> struct kvm_vcpu *vcpu,
> >   	{
> >   		enum emulation_result er;
> >   		ulong flags;
> > +		u32 last_inst;
> >
> >   program_interrupt:
> >   		flags = vcpu->arch.shadow_srr1 & 0x1f0000ull;
> > +		kvmppc_get_last_inst(vcpu, &last_inst);
> 
> No check for the return value?

Should we queue a program exception and resume guest?

> 
> >
> >   		if (vcpu->arch.shared->msr & MSR_PR) {
> >   #ifdef EXIT_DEBUG
> > -			printk(KERN_INFO "Userspace triggered 0x700 exception
> at 0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu));
> > +			pr_info("Userspace triggered 0x700 exception at\n"
> > +			    "0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), last_inst);
> >   #endif
> > -			if ((kvmppc_get_last_inst(vcpu) & 0xff0007ff) !=
> > +			if ((last_inst & 0xff0007ff) !=
> >   			    (INS_DCBZ & 0xfffffff7)) {
> >   				kvmppc_core_queue_program(vcpu, flags);
> >   				r = RESUME_GUEST;
> > @@ -894,7 +894,7 @@ program_interrupt:
> >   			break;
> >   		case EMULATE_FAIL:
> >   			printk(KERN_CRIT "%s: emulation at %lx failed
> (%08x)\n",
> > -			       __func__, kvmppc_get_pc(vcpu),
> kvmppc_get_last_inst(vcpu));
> > +			       __func__, kvmppc_get_pc(vcpu), last_inst);
> >   			kvmppc_core_queue_program(vcpu, flags);
> >   			r = RESUME_GUEST;
> >   			break;
> > @@ -911,8 +911,12 @@ program_interrupt:
> >   		break;
> >   	}
> >   	case BOOK3S_INTERRUPT_SYSCALL:
> > +	{
> > +		u32 last_sc;
> > +
> > +		kvmppc_get_last_sc(vcpu, &last_sc);
> 
> No check for the return value?

The existing code does not handle KVM_INST_FETCH_FAILED. 
How should we continue if papr is enabled and last_sc fails?

> 
> >   		if (vcpu->arch.papr_enabled &&
> > -		    (kvmppc_get_last_sc(vcpu) == 0x44000022) &&
> > +		    (last_sc == 0x44000022) &&
> >   		    !(vcpu->arch.shared->msr & MSR_PR)) {
> >   			/* SC 1 papr hypercalls */
> >   			ulong cmd = kvmppc_get_gpr(vcpu, 3);
> > @@ -957,6 +961,7 @@ program_interrupt:
> >   			r = RESUME_GUEST;
> >   		}
> >   		break;
> > +	}
> >   	case BOOK3S_INTERRUPT_FP_UNAVAIL:
> >   	case BOOK3S_INTERRUPT_ALTIVEC:
> >   	case BOOK3S_INTERRUPT_VSX:
> > @@ -985,15 +990,20 @@ program_interrupt:
> >   		break;
> >   	}
> >   	case BOOK3S_INTERRUPT_ALIGNMENT:
> > +	{
> > +		u32 last_inst;
> > +
> >   		if (kvmppc_read_inst(vcpu) == EMULATE_DONE) {
> > -			vcpu->arch.shared->dsisr = kvmppc_alignment_dsisr(vcpu,
> > -				kvmppc_get_last_inst(vcpu));
> > -			vcpu->arch.shared->dar = kvmppc_alignment_dar(vcpu,
> > -				kvmppc_get_last_inst(vcpu));
> > +			kvmppc_get_last_inst(vcpu, &last_inst);
> 
> I think with an error returning kvmppc_get_last_inst we can just use
> completely get rid of kvmppc_read_inst() and only use
> kvmppc_get_last_inst() instead.

The only purpose of kvmppc_read_inst() function is to generate an
instruction storage interrupt in case of load failure. It is also called
from kvmppc_check_ext(). Do you suggest to get rid of this logic? Or to
duplicate it in order to avoid kvmppc_get_last_inst() redundant call?

> 
> > +			vcpu->arch.shared->dsisr =
> > +				kvmppc_alignment_dsisr(vcpu, last_inst);
> > +			vcpu->arch.shared->dar =
> > +				kvmppc_alignment_dar(vcpu, last_inst);
> >   			kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
> >   		}
> >   		r = RESUME_GUEST;
> >   		break;
> > +	}
> >   	case BOOK3S_INTERRUPT_MACHINE_CHECK:
> >   	case BOOK3S_INTERRUPT_TRACE:
> >   		kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
> > diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> > index ab62109..df25db0 100644
> > --- a/arch/powerpc/kvm/booke.c
> > +++ b/arch/powerpc/kvm/booke.c
> > @@ -752,6 +752,9 @@ static int emulation_exit(struct kvm_run *run,
> struct kvm_vcpu *vcpu)
> >   		 * they were actually modified by emulation. */
> >   		return RESUME_GUEST_NV;
> >
> > +	case EMULATE_AGAIN:
> > +		return RESUME_GUEST;
> > +
> >   	case EMULATE_DO_DCR:
> >   		run->exit_reason = KVM_EXIT_DCR;
> >   		return RESUME_HOST;
> > @@ -1911,6 +1914,17 @@ void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu)
> >   	vcpu->kvm->arch.kvm_ops->vcpu_put(vcpu);
> >   }
> >
> > +int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst)
> > +{
> > +	int result = EMULATE_DONE;
> > +
> > +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
> > +		result = kvmppc_ld_inst(vcpu, &vcpu->arch.last_inst);
> > +	*inst = vcpu->arch.last_inst;
> 
> This function looks almost identical to the book3s one. Can we share the
> same code path here? Just always return false for
> kvmppc_needs_byteswap() on booke.

Sounds good.

-Mike

^ permalink raw reply	[flat|nested] 54+ messages in thread

* RE: [PATCH v2 3/4] KVM: PPC: Alow kvmppc_get_last_inst() to fail
@ 2014-05-06 19:06       ` mihai.caraman
  0 siblings, 0 replies; 54+ messages in thread
From: mihai.caraman @ 2014-05-06 19:06 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev, kvm, kvm-ppc

> -----Original Message-----
> From: Alexander Graf [mailto:agraf@suse.de]
> Sent: Friday, May 02, 2014 12:55 PM
> To: Caraman Mihai Claudiu-B02008
> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-
> dev@lists.ozlabs.org
> Subject: Re: [PATCH v2 3/4] KVM: PPC: Alow kvmppc_get_last_inst() to fail
>=20
> On 05/01/2014 02:45 AM, Mihai Caraman wrote:
...
> > diff --git a/arch/powerpc/include/asm/kvm_ppc.h
> b/arch/powerpc/include/asm/kvm_ppc.h
> > index 4096f16..6e7c358 100644
> > --- a/arch/powerpc/include/asm/kvm_ppc.h
> > +++ b/arch/powerpc/include/asm/kvm_ppc.h
> > @@ -72,6 +72,8 @@ extern int kvmppc_sanity_check(struct kvm_vcpu
> *vcpu);
> >   extern int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu);
> >   extern void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu);
> >
> > +extern int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst);
>=20
> Phew. Moving this into a separate function sure has some performance
> implications. Was there no way to keep it in a header?
>=20
> You could just move it into its own .h file which we include after
> kvm_ppc.h. That way everything's available. That would also help me a
> lot with the little endian port where I'm also struggling with header
> file inclusion order and kvmppc_need_byteswap().

Great, I will do this.

> > diff --git a/arch/powerpc/kvm/book3s_pr.c
> b/arch/powerpc/kvm/book3s_pr.c
> > index c5c052a..b7fffd1 100644
> > --- a/arch/powerpc/kvm/book3s_pr.c
> > +++ b/arch/powerpc/kvm/book3s_pr.c
> > @@ -608,12 +608,9 @@ void kvmppc_giveup_ext(struct kvm_vcpu *vcpu,
> ulong msr)
> >
> >   static int kvmppc_read_inst(struct kvm_vcpu *vcpu)
> >   {
> > -	ulong srr0 =3D kvmppc_get_pc(vcpu);
> > -	u32 last_inst =3D kvmppc_get_last_inst(vcpu);
> > -	int ret;
> > +	u32 last_inst;
> >
> > -	ret =3D kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
> > -	if (ret =3D=3D -ENOENT) {
> > +	if (kvmppc_get_last_inst(vcpu, &last_inst) =3D=3D -ENOENT) {
>=20
> ENOENT?

You have to tell us :) Why does kvmppc_ld() mix emulation_result
enumeration with generic errors? Do you want to change that and
use EMULATE_FAIL instead?

>=20
> >   		ulong msr =3D vcpu->arch.shared->msr;
> >
> >   		msr =3D kvmppc_set_field(msr, 33, 33, 1);
> > @@ -867,15 +864,18 @@ int kvmppc_handle_exit_pr(struct kvm_run *run,
> struct kvm_vcpu *vcpu,
> >   	{
> >   		enum emulation_result er;
> >   		ulong flags;
> > +		u32 last_inst;
> >
> >   program_interrupt:
> >   		flags =3D vcpu->arch.shadow_srr1 & 0x1f0000ull;
> > +		kvmppc_get_last_inst(vcpu, &last_inst);
>=20
> No check for the return value?

Should we queue a program exception and resume guest?

>=20
> >
> >   		if (vcpu->arch.shared->msr & MSR_PR) {
> >   #ifdef EXIT_DEBUG
> > -			printk(KERN_INFO "Userspace triggered 0x700 exception
> at 0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu));
> > +			pr_info("Userspace triggered 0x700 exception at\n"
> > +			    "0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), last_inst);
> >   #endif
> > -			if ((kvmppc_get_last_inst(vcpu) & 0xff0007ff) !=3D
> > +			if ((last_inst & 0xff0007ff) !=3D
> >   			    (INS_DCBZ & 0xfffffff7)) {
> >   				kvmppc_core_queue_program(vcpu, flags);
> >   				r =3D RESUME_GUEST;
> > @@ -894,7 +894,7 @@ program_interrupt:
> >   			break;
> >   		case EMULATE_FAIL:
> >   			printk(KERN_CRIT "%s: emulation at %lx failed
> (%08x)\n",
> > -			       __func__, kvmppc_get_pc(vcpu),
> kvmppc_get_last_inst(vcpu));
> > +			       __func__, kvmppc_get_pc(vcpu), last_inst);
> >   			kvmppc_core_queue_program(vcpu, flags);
> >   			r =3D RESUME_GUEST;
> >   			break;
> > @@ -911,8 +911,12 @@ program_interrupt:
> >   		break;
> >   	}
> >   	case BOOK3S_INTERRUPT_SYSCALL:
> > +	{
> > +		u32 last_sc;
> > +
> > +		kvmppc_get_last_sc(vcpu, &last_sc);
>=20
> No check for the return value?

The existing code does not handle KVM_INST_FETCH_FAILED.=20
How should we continue if papr is enabled and last_sc fails?

>=20
> >   		if (vcpu->arch.papr_enabled &&
> > -		    (kvmppc_get_last_sc(vcpu) =3D=3D 0x44000022) &&
> > +		    (last_sc =3D=3D 0x44000022) &&
> >   		    !(vcpu->arch.shared->msr & MSR_PR)) {
> >   			/* SC 1 papr hypercalls */
> >   			ulong cmd =3D kvmppc_get_gpr(vcpu, 3);
> > @@ -957,6 +961,7 @@ program_interrupt:
> >   			r =3D RESUME_GUEST;
> >   		}
> >   		break;
> > +	}
> >   	case BOOK3S_INTERRUPT_FP_UNAVAIL:
> >   	case BOOK3S_INTERRUPT_ALTIVEC:
> >   	case BOOK3S_INTERRUPT_VSX:
> > @@ -985,15 +990,20 @@ program_interrupt:
> >   		break;
> >   	}
> >   	case BOOK3S_INTERRUPT_ALIGNMENT:
> > +	{
> > +		u32 last_inst;
> > +
> >   		if (kvmppc_read_inst(vcpu) =3D=3D EMULATE_DONE) {
> > -			vcpu->arch.shared->dsisr =3D kvmppc_alignment_dsisr(vcpu,
> > -				kvmppc_get_last_inst(vcpu));
> > -			vcpu->arch.shared->dar =3D kvmppc_alignment_dar(vcpu,
* RE: [PATCH v2 3/4] KVM: PPC: Alow kvmppc_get_last_inst() to fail
@ 2014-05-06 19:06       ` mihai.caraman
  0 siblings, 0 replies; 54+ messages in thread
From: mihai.caraman @ 2014-05-06 19:06 UTC (permalink / raw)
  To: Alexander Graf; +Cc: kvm-ppc, kvm, linuxppc-dev

> -----Original Message-----
> From: Alexander Graf [mailto:agraf@suse.de]
> Sent: Friday, May 02, 2014 12:55 PM
> To: Caraman Mihai Claudiu-B02008
> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-
> dev@lists.ozlabs.org
> Subject: Re: [PATCH v2 3/4] KVM: PPC: Alow kvmppc_get_last_inst() to fail
> 
> On 05/01/2014 02:45 AM, Mihai Caraman wrote:
...
> > diff --git a/arch/powerpc/include/asm/kvm_ppc.h
> b/arch/powerpc/include/asm/kvm_ppc.h
> > index 4096f16..6e7c358 100644
> > --- a/arch/powerpc/include/asm/kvm_ppc.h
> > +++ b/arch/powerpc/include/asm/kvm_ppc.h
> > @@ -72,6 +72,8 @@ extern int kvmppc_sanity_check(struct kvm_vcpu
> *vcpu);
> >   extern int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu);
> >   extern void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu);
> >
> > +extern int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst);
> 
> Phew. Moving this into a separate function sure has some performance
> implications. Was there no way to keep it in a header?
> 
> You could just move it into its own .h file which we include after
> kvm_ppc.h. That way everything's available. That would also help me a
> lot with the little endian port where I'm also struggling with header
> file inclusion order and kvmppc_need_byteswap().

Great, I will do this.
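To make the suggestion concrete, a header-only variant could look roughly like the sketch below. This is a standalone illustration, not the kernel code: the vcpu struct is reduced to the one field the helper touches, KVM_INST_FETCH_FAILED uses a placeholder value, and kvmppc_ld_inst() is a stub that always succeeds.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32;

#define KVM_INST_FETCH_FAILED 0xdeadbeefu /* placeholder value for this sketch */

enum emulation_result { EMULATE_DONE, EMULATE_AGAIN, EMULATE_FAIL };

/* Mocked vcpu: only the cached-instruction field matters here. */
struct kvm_vcpu {
	struct {
		u32 last_inst;
	} arch;
};

/* Stub for the real instruction loader; pretend the fetch succeeds. */
static int kvmppc_ld_inst(struct kvm_vcpu *vcpu, u32 *inst)
{
	*inst = 0x44000022; /* arbitrary example encoding */
	return EMULATE_DONE;
}

/*
 * Header-style inline: return the cached last instruction, or re-fetch
 * it when the cache holds KVM_INST_FETCH_FAILED.  Callers must be
 * prepared to see EMULATE_AGAIN and re-enter the guest.
 */
static inline int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst)
{
	int result = EMULATE_DONE;

	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
		result = kvmppc_ld_inst(vcpu, &vcpu->arch.last_inst);
	*inst = vcpu->arch.last_inst;
	return result;
}
```

Keeping the function inline in a header preserves the old call cost on the hot path while still letting the fetch fail.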

> > diff --git a/arch/powerpc/kvm/book3s_pr.c
> b/arch/powerpc/kvm/book3s_pr.c
> > index c5c052a..b7fffd1 100644
> > --- a/arch/powerpc/kvm/book3s_pr.c
> > +++ b/arch/powerpc/kvm/book3s_pr.c
> > @@ -608,12 +608,9 @@ void kvmppc_giveup_ext(struct kvm_vcpu *vcpu,
> ulong msr)
> >
> >   static int kvmppc_read_inst(struct kvm_vcpu *vcpu)
> >   {
> > -	ulong srr0 = kvmppc_get_pc(vcpu);
> > -	u32 last_inst = kvmppc_get_last_inst(vcpu);
> > -	int ret;
> > +	u32 last_inst;
> >
> > -	ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
> > -	if (ret == -ENOENT) {
> > +	if (kvmppc_get_last_inst(vcpu, &last_inst) == -ENOENT) {
> 
> ENOENT?

You have to tell us :) Why does kvmppc_ld() mix emulation_result
enumeration with generic errors? Do you want to change that and
use EMULATE_FAIL instead?

> 
> >   		ulong msr = vcpu->arch.shared->msr;
> >
> >   		msr = kvmppc_set_field(msr, 33, 33, 1);
> > @@ -867,15 +864,18 @@ int kvmppc_handle_exit_pr(struct kvm_run *run,
> struct kvm_vcpu *vcpu,
> >   	{
> >   		enum emulation_result er;
> >   		ulong flags;
> > +		u32 last_inst;
> >
> >   program_interrupt:
> >   		flags = vcpu->arch.shadow_srr1 & 0x1f0000ull;
> > +		kvmppc_get_last_inst(vcpu, &last_inst);
> 
> No check for the return value?

Should we queue a program exception and resume guest?

> 
> >
> >   		if (vcpu->arch.shared->msr & MSR_PR) {
> >   #ifdef EXIT_DEBUG
> > -			printk(KERN_INFO "Userspace triggered 0x700 exception
> at 0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu));
> > +			pr_info("Userspace triggered 0x700 exception at\n"
> > +			    "0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), last_inst);
> >   #endif
> > -			if ((kvmppc_get_last_inst(vcpu) & 0xff0007ff) !=
> > +			if ((last_inst & 0xff0007ff) !=
> >   			    (INS_DCBZ & 0xfffffff7)) {
> >   				kvmppc_core_queue_program(vcpu, flags);
> >   				r = RESUME_GUEST;
> > @@ -894,7 +894,7 @@ program_interrupt:
> >   			break;
> >   		case EMULATE_FAIL:
> >   			printk(KERN_CRIT "%s: emulation at %lx failed
> (%08x)\n",
> > -			       __func__, kvmppc_get_pc(vcpu),
> kvmppc_get_last_inst(vcpu));
> > +			       __func__, kvmppc_get_pc(vcpu), last_inst);
> >   			kvmppc_core_queue_program(vcpu, flags);
> >   			r = RESUME_GUEST;
> >   			break;
> > @@ -911,8 +911,12 @@ program_interrupt:
> >   		break;
> >   	}
> >   	case BOOK3S_INTERRUPT_SYSCALL:
> > +	{
> > +		u32 last_sc;
> > +
> > +		kvmppc_get_last_sc(vcpu, &last_sc);
> 
> No check for the return value?

The existing code does not handle KVM_INST_FETCH_FAILED. 
How should we continue if papr is enabled and last_sc fails?

> 
> >   		if (vcpu->arch.papr_enabled &&
> > -		    (kvmppc_get_last_sc(vcpu) == 0x44000022) &&
> > +		    (last_sc == 0x44000022) &&
> >   		    !(vcpu->arch.shared->msr & MSR_PR)) {
> >   			/* SC 1 papr hypercalls */
> >   			ulong cmd = kvmppc_get_gpr(vcpu, 3);
> > @@ -957,6 +961,7 @@ program_interrupt:
> >   			r = RESUME_GUEST;
> >   		}
> >   		break;
> > +	}
> >   	case BOOK3S_INTERRUPT_FP_UNAVAIL:
> >   	case BOOK3S_INTERRUPT_ALTIVEC:
> >   	case BOOK3S_INTERRUPT_VSX:
> > @@ -985,15 +990,20 @@ program_interrupt:
> >   		break;
> >   	}
> >   	case BOOK3S_INTERRUPT_ALIGNMENT:
> > +	{
> > +		u32 last_inst;
> > +
> >   		if (kvmppc_read_inst(vcpu) == EMULATE_DONE) {
> > -			vcpu->arch.shared->dsisr = kvmppc_alignment_dsisr(vcpu,
> > -				kvmppc_get_last_inst(vcpu));
> > -			vcpu->arch.shared->dar = kvmppc_alignment_dar(vcpu,
> > -				kvmppc_get_last_inst(vcpu));
> > +			kvmppc_get_last_inst(vcpu, &last_inst);
> 
> I think with an error returning kvmppc_get_last_inst we can just use
> completely get rid of kvmppc_read_inst() and only use
> kvmppc_get_last_inst() instead.

The only purpose of kvmppc_read_inst() function is to generate an
instruction storage interrupt in case of load failure. It is also called
from kvmppc_check_ext(). Do you suggest to get rid of this logic? Or to
duplicate it in order to avoid kvmppc_get_last_inst() redundant call?

> 
> > +			vcpu->arch.shared->dsisr =
> > +				kvmppc_alignment_dsisr(vcpu, last_inst);
> > +			vcpu->arch.shared->dar =
> > +				kvmppc_alignment_dar(vcpu, last_inst);
> >   			kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
> >   		}
> >   		r = RESUME_GUEST;
> >   		break;
> > +	}
> >   	case BOOK3S_INTERRUPT_MACHINE_CHECK:
> >   	case BOOK3S_INTERRUPT_TRACE:
> >   		kvmppc_book3s_queue_irqprio(vcpu, exit_nr);
> > diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> > index ab62109..df25db0 100644
> > --- a/arch/powerpc/kvm/booke.c
> > +++ b/arch/powerpc/kvm/booke.c
> > @@ -752,6 +752,9 @@ static int emulation_exit(struct kvm_run *run,
> struct kvm_vcpu *vcpu)
> >   		 * they were actually modified by emulation. */
> >   		return RESUME_GUEST_NV;
> >
> > +	case EMULATE_AGAIN:
> > +		return RESUME_GUEST;
> > +
> >   	case EMULATE_DO_DCR:
> >   		run->exit_reason = KVM_EXIT_DCR;
> >   		return RESUME_HOST;
> > @@ -1911,6 +1914,17 @@ void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu)
> >   	vcpu->kvm->arch.kvm_ops->vcpu_put(vcpu);
> >   }
> >
> > +int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst)
> > +{
> > +	int result = EMULATE_DONE;
> > +
> > +	if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
> > +		result = kvmppc_ld_inst(vcpu, &vcpu->arch.last_inst);
> > +	*inst = vcpu->arch.last_inst;
> 
> This function looks almost identical to the book3s one. Can we share the
> same code path here? Just always return false for
> kvmppc_needs_byteswap() on booke.

Sounds good.

-Mike

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v2 3/4] KVM: PPC: Alow kvmppc_get_last_inst() to fail
  2014-05-06 19:06       ` mihai.caraman
@ 2014-05-08 13:31         ` Alexander Graf
  -1 siblings, 0 replies; 54+ messages in thread
From: Alexander Graf @ 2014-05-08 13:31 UTC (permalink / raw)
  To: mihai.caraman; +Cc: kvm-ppc, kvm, linuxppc-dev

On 05/06/2014 09:06 PM, mihai.caraman@freescale.com wrote:
>> -----Original Message-----
>> From: Alexander Graf [mailto:agraf@suse.de]
>> Sent: Friday, May 02, 2014 12:55 PM
>> To: Caraman Mihai Claudiu-B02008
>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; linuxppc-
>> dev@lists.ozlabs.org
>> Subject: Re: [PATCH v2 3/4] KVM: PPC: Alow kvmppc_get_last_inst() to fail
>>
>> On 05/01/2014 02:45 AM, Mihai Caraman wrote:
> ...
>>> diff --git a/arch/powerpc/include/asm/kvm_ppc.h
>> b/arch/powerpc/include/asm/kvm_ppc.h
>>> index 4096f16..6e7c358 100644
>>> --- a/arch/powerpc/include/asm/kvm_ppc.h
>>> +++ b/arch/powerpc/include/asm/kvm_ppc.h
>>> @@ -72,6 +72,8 @@ extern int kvmppc_sanity_check(struct kvm_vcpu
>> *vcpu);
>>>    extern int kvmppc_subarch_vcpu_init(struct kvm_vcpu *vcpu);
>>>    extern void kvmppc_subarch_vcpu_uninit(struct kvm_vcpu *vcpu);
>>>
>>> +extern int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst);
>> Phew. Moving this into a separate function sure has some performance
>> implications. Was there no way to keep it in a header?
>>
>> You could just move it into its own .h file which we include after
>> kvm_ppc.h. That way everything's available. That would also help me a
>> lot with the little endian port where I'm also struggling with header
>> file inclusion order and kvmppc_need_byteswap().
> Great, I will do this.
>
>>> diff --git a/arch/powerpc/kvm/book3s_pr.c
>> b/arch/powerpc/kvm/book3s_pr.c
>>> index c5c052a..b7fffd1 100644
>>> --- a/arch/powerpc/kvm/book3s_pr.c
>>> +++ b/arch/powerpc/kvm/book3s_pr.c
>>> @@ -608,12 +608,9 @@ void kvmppc_giveup_ext(struct kvm_vcpu *vcpu,
>> ulong msr)
>>>    static int kvmppc_read_inst(struct kvm_vcpu *vcpu)
>>>    {
>>> -	ulong srr0 = kvmppc_get_pc(vcpu);
>>> -	u32 last_inst = kvmppc_get_last_inst(vcpu);
>>> -	int ret;
>>> +	u32 last_inst;
>>>
>>> -	ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
>>> -	if (ret == -ENOENT) {
>>> +	if (kvmppc_get_last_inst(vcpu, &last_inst) == -ENOENT) {
>> ENOENT?
> You have to tell us :) Why does kvmppc_ld() mix emulation_result
> enumeration with generic errors? Do you want to change that and
> use EMULATE_FAIL instead?

Haha you're right. Man, this code sucks :). No, leave it be for now.

>
>>>    		ulong msr = vcpu->arch.shared->msr;
>>>
>>>    		msr = kvmppc_set_field(msr, 33, 33, 1);
>>> @@ -867,15 +864,18 @@ int kvmppc_handle_exit_pr(struct kvm_run *run,
>> struct kvm_vcpu *vcpu,
>>>    	{
>>>    		enum emulation_result er;
>>>    		ulong flags;
>>> +		u32 last_inst;
>>>
>>>    program_interrupt:
>>>    		flags = vcpu->arch.shadow_srr1 & 0x1f0000ull;
>>> +		kvmppc_get_last_inst(vcpu, &last_inst);
>> No check for the return value?
> Should we queue a program exception and resume guest?

Depends on the return code. We need to be able to handle EMULATE_AGAIN 
everywhere now.
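As a rough illustration of that pattern (a standalone sketch, not the kernel source: types are mocked and the fetch stub fails exactly once to simulate an evicted TLB entry), each exit handler would check the result and bail back into the guest on EMULATE_AGAIN:

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32;

enum emulation_result { EMULATE_DONE, EMULATE_AGAIN, EMULATE_FAIL };
enum { RESUME_GUEST, RESUME_HOST };

/* Mocked vcpu: a cached instruction plus a counter driving the stub. */
struct kvm_vcpu {
	struct {
		u32 last_inst;
		int fetch_fails; /* how many times the stub should fail */
	} arch;
};

/* Stub fetch: fail with EMULATE_AGAIN while fetch_fails > 0, then succeed. */
static int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, u32 *inst)
{
	if (vcpu->arch.fetch_fails > 0) {
		vcpu->arch.fetch_fails--;
		return EMULATE_AGAIN;
	}
	*inst = vcpu->arch.last_inst;
	return EMULATE_DONE;
}

/*
 * On EMULATE_AGAIN, resume the guest without emulating anything: the
 * exit replays once the instruction page is mapped back in, rather than
 * emulating a stale or missing instruction.
 */
static int handle_program_interrupt(struct kvm_vcpu *vcpu)
{
	u32 last_inst;

	if (kvmppc_get_last_inst(vcpu, &last_inst) == EMULATE_AGAIN)
		return RESUME_GUEST;
	/* ... normal program-interrupt emulation using last_inst ... */
	return RESUME_GUEST;
}
```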

>
>>>    		if (vcpu->arch.shared->msr & MSR_PR) {
>>>    #ifdef EXIT_DEBUG
>>> -			printk(KERN_INFO "Userspace triggered 0x700 exception
>> at 0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu));
>>> +			pr_info("Userspace triggered 0x700 exception at\n"
>>> +			    "0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), last_inst);
>>>    #endif
>>> -			if ((kvmppc_get_last_inst(vcpu) & 0xff0007ff) !=
>>> +			if ((last_inst & 0xff0007ff) !=
>>>    			    (INS_DCBZ & 0xfffffff7)) {
>>>    				kvmppc_core_queue_program(vcpu, flags);
>>>    				r = RESUME_GUEST;
>>> @@ -894,7 +894,7 @@ program_interrupt:
>>>    			break;
>>>    		case EMULATE_FAIL:
>>>    			printk(KERN_CRIT "%s: emulation at %lx failed
>> (%08x)\n",
>>> -			       __func__, kvmppc_get_pc(vcpu),
>> kvmppc_get_last_inst(vcpu));
>>> +			       __func__, kvmppc_get_pc(vcpu), last_inst);
>>>    			kvmppc_core_queue_program(vcpu, flags);
>>>    			r = RESUME_GUEST;
>>>    			break;
>>> @@ -911,8 +911,12 @@ program_interrupt:
>>>    		break;
>>>    	}
>>>    	case BOOK3S_INTERRUPT_SYSCALL:
>>> +	{
>>> +		u32 last_sc;
>>> +
>>> +		kvmppc_get_last_sc(vcpu, &last_sc);
>> No check for the return value?
> The existing code does not handle KVM_INST_FETCH_FAILED.
> How should we continue if papr is enabled and last_sc fails?

I'd say go back into the guest and try again ;). Keep in mind that sc 
already sets srr0 to pc + 4, so we need to subtract 4 from pc and then 
go back into the guest.
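A minimal sketch of that retry path (standalone, with mocked types; the real code would adjust srr0 via the proper accessors, and the stub here always reports a failed fetch):

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32;
typedef unsigned long ulong;

enum emulation_result { EMULATE_DONE, EMULATE_AGAIN, EMULATE_FAIL };
enum { RESUME_GUEST, RESUME_HOST };

struct kvm_vcpu {
	struct {
		ulong pc;
	} arch;
};

static ulong kvmppc_get_pc(struct kvm_vcpu *vcpu)           { return vcpu->arch.pc; }
static void  kvmppc_set_pc(struct kvm_vcpu *vcpu, ulong pc) { vcpu->arch.pc = pc;   }

/* Stub: pretend the 'sc' instruction could not be read back. */
static int kvmppc_get_last_sc(struct kvm_vcpu *vcpu, u32 *sc)
{
	(void)vcpu;
	*sc = 0;
	return EMULATE_AGAIN;
}

/*
 * If the last 'sc' cannot be fetched, rewind the pc by 4 ('sc' already
 * advanced srr0 past itself) and re-enter the guest so the syscall
 * re-executes and faults its page back in.
 */
static int handle_syscall_exit(struct kvm_vcpu *vcpu)
{
	u32 last_sc;

	if (kvmppc_get_last_sc(vcpu, &last_sc) != EMULATE_DONE) {
		kvmppc_set_pc(vcpu, kvmppc_get_pc(vcpu) - 4);
		return RESUME_GUEST;
	}
	/* ... papr hypercall dispatch when last_sc == 0x44000022 ... */
	return RESUME_GUEST;
}
```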

>
>>>    		if (vcpu->arch.papr_enabled &&
>>> -		    (kvmppc_get_last_sc(vcpu) == 0x44000022) &&
>>> +		    (last_sc == 0x44000022) &&
>>>    		    !(vcpu->arch.shared->msr & MSR_PR)) {
>>>    			/* SC 1 papr hypercalls */
>>>    			ulong cmd = kvmppc_get_gpr(vcpu, 3);
>>> @@ -957,6 +961,7 @@ program_interrupt:
>>>    			r = RESUME_GUEST;
>>>    		}
>>>    		break;
>>> +	}
>>>    	case BOOK3S_INTERRUPT_FP_UNAVAIL:
>>>    	case BOOK3S_INTERRUPT_ALTIVEC:
>>>    	case BOOK3S_INTERRUPT_VSX:
>>> @@ -985,15 +990,20 @@ program_interrupt:
>>>    		break;
>>>    	}
>>>    	case BOOK3S_INTERRUPT_ALIGNMENT:
>>> +	{
>>> +		u32 last_inst;
>>> +
>>>    		if (kvmppc_read_inst(vcpu) == EMULATE_DONE) {
>>> -			vcpu->arch.shared->dsisr = kvmppc_alignment_dsisr(vcpu,
>>> -				kvmppc_get_last_inst(vcpu));
>>> -			vcpu->arch.shared->dar = kvmppc_alignment_dar(vcpu,
>>> -				kvmppc_get_last_inst(vcpu));
>>> +			kvmppc_get_last_inst(vcpu, &last_inst);
>> I think with an error returning kvmppc_get_last_inst we can just use
>> completely get rid of kvmppc_read_inst() and only use
>> kvmppc_get_last_inst() instead.
> The only purpose of kvmppc_read_inst() function is to generate an
> instruction storage interrupt in case of load failure. It is also called
> from kvmppc_check_ext(). Do you suggest to get rid of this logic? Or to
> duplicate it in order to avoid kvmppc_get_last_inst() redundant call?

I don't think we need that logic. If we get EMULATE_AGAIN, we just have 
to make sure we go back into the guest. No need to inject an ISI into 
the guest - it'll do that all by itself.


Alex

^ permalink raw reply	[flat|nested] 54+ messages in thread



end of thread, other threads:[~2014-05-08 13:31 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-01  0:45 [PATCH v2 0/4] KVM: PPC: Read guest instruction from kvmppc_get_last_inst() Mihai Caraman
2014-05-01  0:45 ` Mihai Caraman
2014-05-01  0:45 ` Mihai Caraman
2014-05-01  0:45 ` [PATCH v2 1/4] KVM: PPC: e500mc: Revert "add load inst fixup" Mihai Caraman
2014-05-01  0:45   ` Mihai Caraman
2014-05-01  0:45   ` Mihai Caraman
2014-05-02  9:24   ` Alexander Graf
2014-05-02  9:24     ` Alexander Graf
2014-05-02  9:24     ` Alexander Graf
2014-05-02 23:14     ` mihai.caraman
2014-05-02 23:14       ` mihai.caraman
2014-05-02 23:14       ` mihai.caraman
2014-05-03 22:14       ` Alexander Graf
2014-05-03 22:14         ` Alexander Graf
2014-05-03 22:14         ` Alexander Graf
2014-05-06 15:48         ` mihai.caraman
2014-05-06 15:48           ` mihai.caraman
2014-05-06 15:48           ` mihai.caraman
2014-05-06 15:54           ` Alexander Graf
2014-05-06 15:54             ` Alexander Graf
2014-05-06 15:54             ` Alexander Graf
2014-05-01  0:45 ` [PATCH v2 2/4] KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1 Mihai Caraman
2014-05-01  0:45   ` Mihai Caraman
2014-05-01  0:45   ` Mihai Caraman
2014-05-01  0:45 ` [PATCH v2 3/4] KVM: PPC: Alow kvmppc_get_last_inst() to fail Mihai Caraman
2014-05-01  0:45   ` Mihai Caraman
2014-05-01  0:45   ` Mihai Caraman
2014-05-02  9:54   ` Alexander Graf
2014-05-02  9:54     ` Alexander Graf
2014-05-02  9:54     ` Alexander Graf
2014-05-06 19:06     ` mihai.caraman
2014-05-06 19:06       ` mihai.caraman
2014-05-06 19:06       ` mihai.caraman
2014-05-08 13:31       ` Alexander Graf
2014-05-08 13:31         ` Alexander Graf
2014-05-08 13:31         ` Alexander Graf
2014-05-01  0:45 ` [PATCH v2 4/4] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation Mihai Caraman
2014-05-01  0:45   ` Mihai Caraman
2014-05-01  0:45   ` Mihai Caraman
2014-05-02 10:01   ` Alexander Graf
2014-05-02 10:01     ` Alexander Graf
2014-05-02 10:01     ` Alexander Graf
2014-05-02 10:12     ` David Laight
2014-05-02 10:12       ` David Laight
2014-05-02 10:12       ` David Laight
2014-05-02 11:10       ` Alexander Graf
2014-05-02 11:10         ` Alexander Graf
2014-05-02 11:10         ` Alexander Graf
2014-05-02 15:32         ` Scott Wood
2014-05-02 15:32           ` Scott Wood
2014-05-02 15:32           ` Scott Wood
  -- strict thread matches above, loose matches on Subject: below --
2014-05-01  0:39 [PATCH v2 0/4] KVM: PPC: Read guest instruction from kvmppc_get_last_inst() Mihai Caraman
2014-05-01  0:39 ` Mihai Caraman
2014-05-01  0:39 ` Mihai Caraman
