* [RFC PATCH 00/43] KVM: PPC: Book3S HV P9: entry/exit optimisations round 1
@ 2021-06-22 10:56 ` Nicholas Piggin
  0 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:56 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

This series applies on top of the powerpc topic/ppc-kvm branch (the KVM
Cify series in particular), plus "KVM: PPC: Book3S HV Nested: Reflect L2 PMU
in-use to L0 when L2 SPRs are live" posted to kvm-ppc.

This reduces radix guest full entry/exit latency on POWER9 and POWER10
by almost 2x (hash sees a similar reduction, but it is still significantly
slower than the P7/8 real-mode handler). Nested HV guests should also see
speedups: the L1 gets some smaller improvements, and the L0 switching sees
many of the same gains as a direct guest.

It does this in several main ways:

- Rearrange code to optimise SPR accesses. Mainly, avoid scoreboard
  stalls.

- Test SPR values to avoid mtSPRs where possible. mtSPRs are expensive
  (a minimal sketch of this pattern follows this list).

- Reduce mftb. mftb is expensive.

- Demand fault certain facilities to avoid saving and/or restoring them
  (at the cost of a facility-unavailable fault when they are first used,
  but this is amortised over a number of entries, as it is for these
  facilities when context switching processes). PM, TM, and EBB so far.

- Defer sequences that exist only in case the guest was interrupted in
  the middle of a critical section, so they run only when the vCPU is
  next scheduled on a different CPU rather than on every exit (at the
  cost of an extra IPI in that case). Namely the tlbsync sequence for
  radix with GTSE, which is very expensive.
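
As a rough illustration of the second item, here is a minimal sketch of
the "only mtSPR when the value changed" pattern. This is illustrative
only, not the exact code in the series; the helper name, SPRs and fields
are examples:

	/* Illustrative sketch only; not the series' exact code. */
	static void load_guest_sprs_if_changed(struct kvm_vcpu *vcpu,
					       struct p9_host_os_sprs *host)
	{
		/*
		 * mtSPR is expensive and most of these values rarely
		 * differ between host and guest, so skip the write when
		 * the guest value matches what is already in the SPR.
		 */
		if (host->dscr != vcpu->arch.dscr)
			mtspr(SPRN_DSCR, vcpu->arch.dscr);
		if (host->iamr != vcpu->arch.iamr)
			mtspr(SPRN_IAMR, vcpu->arch.iamr);
	}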

Some of the numbers quoted in changelogs may have changed a bit with
patches being updated, reordered, etc. They give a bit of a guide, but
I might remove them from the final submission because they're too much
to maintain.

Thanks,
Nick

Nicholas Piggin (43):
  powerpc/64s: Remove WORT SPR from POWER9/10
  KVM: PPC: Book3S HV P9: Use set_dec to set decrementer to host
  KVM: PPC: Book3S HV P9: Use host timer accounting to avoid decrementer
    read
  KVM: PPC: Book3S HV P9: Use large decrementer for HDEC
  KVM: PPC: Book3S HV P9: Reduce mftb per guest entry/exit
  powerpc/time: add API for KVM to re-arm the host timer/decrementer
  KVM: PPC: Book3S HV: POWER10 enable HAIL when running radix guests
  powerpc/64s: Keep AMOR SPR a constant ~0 at runtime
  KVM: PPC: Book3S HV: Don't always save PMU for guest capable of
    nesting
  powerpc/64s: Always set PMU control registers to frozen/disabled when
    not in use
  KVM: PPC: Book3S HV P9: Implement PMU save/restore in C
  KVM: PPC: Book3S HV P9: Factor out yield_count increment
  KVM: PPC: Book3S HV P9: Factor PMU save/load into context switch
    functions
  KVM: PPC: Book3S HV P9: Demand fault PMU SPRs when marked not inuse
  KVM: PPC: Book3S HV: CTRL SPR does not require read-modify-write
  KVM: PPC: Book3S HV P9: Move SPRG restore to restore_p9_host_os_sprs
  KVM: PPC: Book3S HV P9: Reduce mtmsrd instructions required to save
    host SPRs
  KVM: PPC: Book3S HV P9: Improve mtmsrd scheduling by delaying MSR[EE]
    disable
  KVM: PPC: Book3S HV P9: Add kvmppc_stop_thread to match
    kvmppc_start_thread
  KVM: PPC: Book3S HV: Change dec_expires to be relative to guest
    timebase
  KVM: PPC: Book3S HV P9: Move TB updates
  KVM: PPC: Book3S HV P9: Optimise timebase reads
  KVM: PPC: Book3S HV P9: Avoid SPR scoreboard stalls
  KVM: PPC: Book3S HV P9: Only execute mtSPR if the value changed
  KVM: PPC: Book3S HV P9: Juggle SPR switching around
  KVM: PPC: Book3S HV P9: Move vcpu register save/restore into functions
  KVM: PPC: Book3S HV P9: Move host OS save/restore functions to
    built-in
  KVM: PPC: Book3S HV P9: Move nested guest entry into its own function
  KVM: PPC: Book3S HV P9: Move remaining SPR and MSR access into low
    level entry
  KVM: PPC: Book3S HV P9: Implement TM fastpath for guest entry/exit
  KVM: PPC: Book3S HV P9: Switch PMU to guest as late as possible
  KVM: PPC: Book3S HV P9: Restrict DSISR canary workaround to processors
    that require it
  KVM: PPC: Book3S HV P9: More SPR speed improvements
  KVM: PPC: Book3S HV P9: Demand fault EBB facility registers
  KVM: PPC: Book3S HV P9: Demand fault TM facility registers
  KVM: PPC: Book3S HV P9: Use Linux SPR save/restore to manage some host
    SPRs
  KVM: PPC: Book3S HV P9: Comment and fix MMU context switching code
  KVM: PPC: Book3S HV P9: Test dawr_enabled() before saving host DAWR
    SPRs
  KVM: PPC: Book3S HV P9: Don't restore PSSCR if not needed
  KVM: PPC: Book3S HV P9: Avoid tlbsync sequence on radix guest exit
  KVM: PPC: Book3S HV Nested: Avoid extra mftb() in nested entry
  KVM: PPC: Book3S HV P9: Improve mfmsr performance on entry
  KVM: PPC: Book3S HV P9: Optimise hash guest SLB saving

 arch/powerpc/include/asm/asm-prototypes.h |   5 -
 arch/powerpc/include/asm/kvm_asm.h        |   1 +
 arch/powerpc/include/asm/kvm_book3s.h     |   6 +
 arch/powerpc/include/asm/kvm_book3s_64.h  |   4 +-
 arch/powerpc/include/asm/kvm_host.h       |   5 +-
 arch/powerpc/include/asm/switch_to.h      |   2 +
 arch/powerpc/include/asm/time.h           |  19 +-
 arch/powerpc/kernel/cpu_setup_power.c     |  12 +-
 arch/powerpc/kernel/dt_cpu_ftrs.c         |   8 +-
 arch/powerpc/kernel/process.c             |  30 +
 arch/powerpc/kernel/time.c                |  54 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c    |   4 +
 arch/powerpc/kvm/book3s_hv.c              | 628 +++++++++----------
 arch/powerpc/kvm/book3s_hv.h              |  36 ++
 arch/powerpc/kvm/book3s_hv_interrupts.S   |  13 +-
 arch/powerpc/kvm/book3s_hv_nested.c       |  21 +-
 arch/powerpc/kvm/book3s_hv_p9_entry.c     | 725 +++++++++++++++++++---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   |  74 +--
 arch/powerpc/mm/book3s64/radix_pgtable.c  |  15 -
 arch/powerpc/perf/core-book3s.c           |   7 +
 arch/powerpc/platforms/powernv/idle.c     |  10 +-
 21 files changed, 1143 insertions(+), 536 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_hv.h

-- 
2.23.0


* [RFC PATCH 01/43] powerpc/64s: Remove WORT SPR from POWER9/10
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:56   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:56 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

This register is not architected and is not implemented in POWER9 or
POWER10; it just reads back zeroes for compatibility.

-78 cycles (9255) POWER9 virt-mode NULL hcall

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c          | 3 ---
 arch/powerpc/platforms/powernv/idle.c | 2 --
 2 files changed, 5 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 9228042bd54f..97f3d6d54b61 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3640,7 +3640,6 @@ static void load_spr_state(struct kvm_vcpu *vcpu)
 	mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
 	mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
 	mtspr(SPRN_BESCR, vcpu->arch.bescr);
-	mtspr(SPRN_WORT, vcpu->arch.wort);
 	mtspr(SPRN_TIDR, vcpu->arch.tid);
 	mtspr(SPRN_AMR, vcpu->arch.amr);
 	mtspr(SPRN_UAMOR, vcpu->arch.uamor);
@@ -3667,7 +3666,6 @@ static void store_spr_state(struct kvm_vcpu *vcpu)
 	vcpu->arch.ebbhr = mfspr(SPRN_EBBHR);
 	vcpu->arch.ebbrr = mfspr(SPRN_EBBRR);
 	vcpu->arch.bescr = mfspr(SPRN_BESCR);
-	vcpu->arch.wort = mfspr(SPRN_WORT);
 	vcpu->arch.tid = mfspr(SPRN_TIDR);
 	vcpu->arch.amr = mfspr(SPRN_AMR);
 	vcpu->arch.uamor = mfspr(SPRN_UAMOR);
@@ -3699,7 +3697,6 @@ static void restore_p9_host_os_sprs(struct kvm_vcpu *vcpu,
 				    struct p9_host_os_sprs *host_os_sprs)
 {
 	mtspr(SPRN_PSPB, 0);
-	mtspr(SPRN_WORT, 0);
 	mtspr(SPRN_UAMOR, 0);
 
 	mtspr(SPRN_DSCR, host_os_sprs->dscr);
diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 528a7e0cf83a..180baecad914 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -667,7 +667,6 @@ static unsigned long power9_idle_stop(unsigned long psscr)
 		sprs.purr	= mfspr(SPRN_PURR);
 		sprs.spurr	= mfspr(SPRN_SPURR);
 		sprs.dscr	= mfspr(SPRN_DSCR);
-		sprs.wort	= mfspr(SPRN_WORT);
 		sprs.ciabr	= mfspr(SPRN_CIABR);
 
 		sprs.mmcra	= mfspr(SPRN_MMCRA);
@@ -785,7 +784,6 @@ static unsigned long power9_idle_stop(unsigned long psscr)
 	mtspr(SPRN_PURR,	sprs.purr);
 	mtspr(SPRN_SPURR,	sprs.spurr);
 	mtspr(SPRN_DSCR,	sprs.dscr);
-	mtspr(SPRN_WORT,	sprs.wort);
 	mtspr(SPRN_CIABR,	sprs.ciabr);
 
 	mtspr(SPRN_MMCRA,	sprs.mmcra);
-- 
2.23.0


* [RFC PATCH 02/43] KVM: PPC: Book3S HV P9: Use set_dec to set decrementer to host
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:56   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:56 UTC (permalink / raw)
  To: kvm-ppc; +Cc: Alexey Kardashevskiy, linuxppc-dev, Nicholas Piggin

The host Linux timer code arms the decrementer with the value
'decrementers_next_tb - current_tb' using set_dec(), which stores
val - 1 on Book3S-64. This is not quite the same as what KVM does
to re-arm the host decrementer when exiting the guest.
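
For reference, on Book3S-64 set_dec() behaves roughly as follows (a
simplified sketch of the behaviour described above, not the verbatim
kernel source):

	static inline void set_dec(u64 val)
	{
		/* Book3S-64: the decrementer is programmed with val - 1 */
		mtspr(SPRN_DEC, val - 1);
	}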

This shouldn't be a significant change, but it makes the logic match
and avoids carrying this small difference into the next patch.

Suggested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 97f3d6d54b61..d19b4ae01642 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3914,7 +3914,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 	vc->entry_exit_map = 0x101;
 	vc->in_guest = 0;
 
-	mtspr(SPRN_DEC, local_paca->kvm_hstate.dec_expires - mftb());
+	set_dec(local_paca->kvm_hstate.dec_expires - mftb());
 	/* We may have raced with new irq work */
 	if (test_irq_work_pending())
 		set_dec(1);
-- 
2.23.0


* [RFC PATCH 03/43] KVM: PPC: Book3S HV P9: Use host timer accounting to avoid decrementer read
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:56   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:56 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

There is no need to save away the host DEC value, as it is derived
from the host timer subsystem which maintains the next timer time,
so it can be restored from there.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/time.h |  5 +++++
 arch/powerpc/kernel/time.c      |  1 +
 arch/powerpc/kvm/book3s_hv.c    | 14 +++++++-------
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 8c2c3dd4ddba..fd09b4797fd7 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -111,6 +111,11 @@ static inline unsigned long test_irq_work_pending(void)
 
 DECLARE_PER_CPU(u64, decrementers_next_tb);
 
+static inline u64 timer_get_next_tb(void)
+{
+	return __this_cpu_read(decrementers_next_tb);
+}
+
 /* Convert timebase ticks to nanoseconds */
 unsigned long long tb_to_ns(unsigned long long tb_ticks);
 
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index da995c5fb97d..98bdd96141f2 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -108,6 +108,7 @@ struct clock_event_device decrementer_clockevent = {
 EXPORT_SYMBOL(decrementer_clockevent);
 
 DEFINE_PER_CPU(u64, decrementers_next_tb);
+EXPORT_SYMBOL_GPL(decrementers_next_tb);
 static DEFINE_PER_CPU(struct clock_event_device, decrementers);
 
 #define XSEC_PER_SEC (1024*1024)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index d19b4ae01642..a413377aafb5 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3729,18 +3729,17 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 	struct kvmppc_vcore *vc = vcpu->arch.vcore;
 	struct p9_host_os_sprs host_os_sprs;
 	s64 dec;
-	u64 tb;
+	u64 tb, next_timer;
 	int trap, save_pmu;
 
 	WARN_ON_ONCE(vcpu->arch.ceded);
 
-	dec = mfspr(SPRN_DEC);
 	tb = mftb();
-	if (dec < 0)
+	next_timer = timer_get_next_tb();
+	if (tb >= next_timer)
 		return BOOK3S_INTERRUPT_HV_DECREMENTER;
-	local_paca->kvm_hstate.dec_expires = dec + tb;
-	if (local_paca->kvm_hstate.dec_expires < time_limit)
-		time_limit = local_paca->kvm_hstate.dec_expires;
+	if (next_timer < time_limit)
+		time_limit = next_timer;
 
 	save_p9_host_os_sprs(&host_os_sprs);
 
@@ -3914,7 +3913,8 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 	vc->entry_exit_map = 0x101;
 	vc->in_guest = 0;
 
-	set_dec(local_paca->kvm_hstate.dec_expires - mftb());
+	next_timer = timer_get_next_tb();
+	set_dec(next_timer - mftb());
 	/* We may have raced with new irq work */
 	if (test_irq_work_pending())
 		set_dec(1);
-- 
2.23.0


* [RFC PATCH 04/43] KVM: PPC: Book3S HV P9: Use large decrementer for HDEC
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:56   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:56 UTC (permalink / raw)
  To: kvm-ppc; +Cc: Alexey Kardashevskiy, linuxppc-dev, Nicholas Piggin

On processors that don't suppress the HDEC exceptions when LPCR[HDICE]=0,
this could help reduce needless guest exits due to leftover exceptions on
entering the guest.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/time.h       | 2 ++
 arch/powerpc/kernel/time.c            | 1 +
 arch/powerpc/kvm/book3s_hv_p9_entry.c | 3 ++-
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index fd09b4797fd7..69b6be617772 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -18,6 +18,8 @@
 #include <asm/vdso/timebase.h>
 
 /* time.c */
+extern u64 decrementer_max;
+
 extern unsigned long tb_ticks_per_jiffy;
 extern unsigned long tb_ticks_per_usec;
 extern unsigned long tb_ticks_per_sec;
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 98bdd96141f2..026b3c0b648c 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -89,6 +89,7 @@ static struct clocksource clocksource_timebase = {
 
 #define DECREMENTER_DEFAULT_MAX 0x7FFFFFFF
 u64 decrementer_max = DECREMENTER_DEFAULT_MAX;
+EXPORT_SYMBOL_GPL(decrementer_max); /* for KVM HDEC */
 
 static int decrementer_set_next_event(unsigned long evt,
 				      struct clock_event_device *dev);
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 83f592eadcd2..63afd277c5f3 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -489,7 +489,8 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 		vc->tb_offset_applied = 0;
 	}
 
-	mtspr(SPRN_HDEC, 0x7fffffff);
+	/* HDEC must be at least as large as DEC, so decrementer_max fits */
+	mtspr(SPRN_HDEC, decrementer_max);
 
 	save_clear_guest_mmu(kvm, vcpu);
 	switch_mmu_to_host(kvm, host_pidr);
-- 
2.23.0


* [RFC PATCH 05/43] KVM: PPC: Book3S HV P9: Reduce mftb per guest entry/exit
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:56   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:56 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin, Fabiano Rosas

mftb is serialising (dispatch next-to-complete), so it is heavyweight
for a mfspr. Avoid reading it multiple times in the entry or exit paths.
A small number of cycles delay to timers is tolerable.

-118 cycles (9137) POWER9 virt-mode NULL hcall

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c          | 4 ++--
 arch/powerpc/kvm/book3s_hv_p9_entry.c | 5 +++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index a413377aafb5..5ec534620e07 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3794,7 +3794,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 	 *
 	 * XXX: Another day's problem.
 	 */
-	mtspr(SPRN_DEC, vcpu->arch.dec_expires - mftb());
+	mtspr(SPRN_DEC, vcpu->arch.dec_expires - tb);
 
 	if (kvmhv_on_pseries()) {
 		/*
@@ -3914,7 +3914,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 	vc->in_guest = 0;
 
 	next_timer = timer_get_next_tb();
-	set_dec(next_timer - mftb());
+	set_dec(next_timer - tb);
 	/* We may have raced with new irq work */
 	if (test_irq_work_pending())
 		set_dec(1);
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 63afd277c5f3..c4f3e066fcb4 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -203,7 +203,8 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	unsigned long host_dawr1;
 	unsigned long host_dawrx1;
 
-	hdec = time_limit - mftb();
+	tb = mftb();
+	hdec = time_limit - tb;
 	if (hdec < 0)
 		return BOOK3S_INTERRUPT_HV_DECREMENTER;
 
@@ -215,7 +216,7 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	vcpu->arch.ceded = 0;
 
 	if (vc->tb_offset) {
-		u64 new_tb = mftb() + vc->tb_offset;
+		u64 new_tb = tb + vc->tb_offset;
 		mtspr(SPRN_TBU40, new_tb);
 		tb = mftb();
 		if ((tb & 0xffffff) < (new_tb & 0xffffff))
-- 
2.23.0


* [RFC PATCH 06/43] powerpc/time: add API for KVM to re-arm the host timer/decrementer
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:56   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:56 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Rather than have KVM look up the host timer and fiddle with the
irq-work internal details, have the powerpc/time.c code provide a
function for KVM to re-arm the Linux timer code when exiting a
guest.

This implementation improves on the existing code by marking a
decrementer interrupt as soft-pending if a timer has expired, rather
than setting DEC to a negative value, which tended to cause host timers
to take two interrupts (first the hdec to exit the guest, then the
immediate dec).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/time.h | 16 +++-------
 arch/powerpc/kernel/time.c      | 52 +++++++++++++++++++++++++++------
 arch/powerpc/kvm/book3s_hv.c    |  7 ++---
 3 files changed, 49 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 69b6be617772..924b2157882f 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -99,18 +99,6 @@ extern void div128_by_32(u64 dividend_high, u64 dividend_low,
 extern void secondary_cpu_time_init(void);
 extern void __init time_init(void);
 
-#ifdef CONFIG_PPC64
-static inline unsigned long test_irq_work_pending(void)
-{
-	unsigned long x;
-
-	asm volatile("lbz %0,%1(13)"
-		: "=r" (x)
-		: "i" (offsetof(struct paca_struct, irq_work_pending)));
-	return x;
-}
-#endif
-
 DECLARE_PER_CPU(u64, decrementers_next_tb);
 
 static inline u64 timer_get_next_tb(void)
@@ -118,6 +106,10 @@ static inline u64 timer_get_next_tb(void)
 	return __this_cpu_read(decrementers_next_tb);
 }
 
+#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+void timer_rearm_host_dec(u64 now);
+#endif
+
 /* Convert timebase ticks to nanoseconds */
 unsigned long long tb_to_ns(unsigned long long tb_ticks);
 
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 026b3c0b648c..7c9de3498548 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -510,6 +510,16 @@ EXPORT_SYMBOL(profile_pc);
  * 64-bit uses a byte in the PACA, 32-bit uses a per-cpu variable...
  */
 #ifdef CONFIG_PPC64
+static inline unsigned long test_irq_work_pending(void)
+{
+	unsigned long x;
+
+	asm volatile("lbz %0,%1(13)"
+		: "=r" (x)
+		: "i" (offsetof(struct paca_struct, irq_work_pending)));
+	return x;
+}
+
 static inline void set_irq_work_pending_flag(void)
 {
 	asm volatile("stb %0,%1(13)" : :
@@ -553,13 +563,44 @@ void arch_irq_work_raise(void)
 	preempt_enable();
 }
 
+static void set_dec_or_work(u64 val)
+{
+	set_dec(val);
+	/* We may have raced with new irq work */
+	if (unlikely(test_irq_work_pending()))
+		set_dec(1);
+}
+
 #else  /* CONFIG_IRQ_WORK */
 
 #define test_irq_work_pending()	0
 #define clear_irq_work_pending()
 
+static void set_dec_or_work(u64 val)
+{
+	set_dec(val);
+}
 #endif /* CONFIG_IRQ_WORK */
 
+#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+void timer_rearm_host_dec(u64 now)
+{
+	u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
+
+	WARN_ON_ONCE(!arch_irqs_disabled());
+	WARN_ON_ONCE(mfmsr() & MSR_EE);
+
+	if (now >= *next_tb) {
+		local_paca->irq_happened |= PACA_IRQ_DEC;
+	} else {
+		now = *next_tb - now;
+		if (now <= decrementer_max)
+			set_dec_or_work(now);
+	}
+}
+EXPORT_SYMBOL_GPL(timer_rearm_host_dec);
+#endif
+
 /*
  * timer_interrupt - gets called when the decrementer overflows,
  * with interrupts disabled.
@@ -620,10 +661,7 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(timer_interrupt)
 	} else {
 		now = *next_tb - now;
 		if (now <= decrementer_max)
-			set_dec(now);
-		/* We may have raced with new irq work */
-		if (test_irq_work_pending())
-			set_dec(1);
+			set_dec_or_work(now);
 		__this_cpu_inc(irq_stat.timer_irqs_others);
 	}
 
@@ -865,11 +903,7 @@ static int decrementer_set_next_event(unsigned long evt,
 				      struct clock_event_device *dev)
 {
 	__this_cpu_write(decrementers_next_tb, get_tb() + evt);
-	set_dec(evt);
-
-	/* We may have raced with new irq work */
-	if (test_irq_work_pending())
-		set_dec(1);
+	set_dec_or_work(evt);
 
 	return 0;
 }
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 5ec534620e07..36e1db48fccf 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3913,11 +3913,8 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 	vc->entry_exit_map = 0x101;
 	vc->in_guest = 0;
 
-	next_timer = timer_get_next_tb();
-	set_dec(next_timer - tb);
-	/* We may have raced with new irq work */
-	if (test_irq_work_pending())
-		set_dec(1);
+	timer_rearm_host_dec(tb);
+
 	mtspr(SPRN_SPRG_VDSO_WRITE, local_paca->sprg_vdso);
 
 	kvmhv_load_host_pmu();
-- 
2.23.0


* [RFC PATCH 07/43] KVM: PPC: Book3S HV: POWER10 enable HAIL when running radix guests
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

HV interrupts may be taken with the MMU enabled when radix guests are
running. Enable LPCR[HAIL] on ISA v3.1 processors for radix guests.
Make this depend on the host LPCR[HAIL] being enabled. Currently that is
always enabled, but having this test means any issue that might require
LPCR[HAIL] to be disabled in the host will not have to be duplicated in
KVM.

-1380 cycles on P10 NULL hcall entry+exit

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 36e1db48fccf..ed713f49fbd5 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4896,6 +4896,8 @@ static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu)
  */
 int kvmppc_switch_mmu_to_hpt(struct kvm *kvm)
 {
+	unsigned long lpcr, lpcr_mask;
+
 	if (nesting_enabled(kvm))
 		kvmhv_release_all_nested(kvm);
 	kvmppc_rmap_reset(kvm);
@@ -4905,8 +4907,13 @@ int kvmppc_switch_mmu_to_hpt(struct kvm *kvm)
 	kvm->arch.radix = 0;
 	spin_unlock(&kvm->mmu_lock);
 	kvmppc_free_radix(kvm);
-	kvmppc_update_lpcr(kvm, LPCR_VPM1,
-			   LPCR_VPM1 | LPCR_UPRT | LPCR_GTSE | LPCR_HR);
+
+	lpcr = LPCR_VPM1;
+	lpcr_mask = LPCR_VPM1 | LPCR_UPRT | LPCR_GTSE | LPCR_HR;
+	if (cpu_has_feature(CPU_FTR_ARCH_31))
+		lpcr_mask |= LPCR_HAIL;
+	kvmppc_update_lpcr(kvm, lpcr, lpcr_mask);
+
 	return 0;
 }
 
@@ -4916,6 +4923,7 @@ int kvmppc_switch_mmu_to_hpt(struct kvm *kvm)
  */
 int kvmppc_switch_mmu_to_radix(struct kvm *kvm)
 {
+	unsigned long lpcr, lpcr_mask;
 	int err;
 
 	err = kvmppc_init_vm_radix(kvm);
@@ -4927,8 +4935,17 @@ int kvmppc_switch_mmu_to_radix(struct kvm *kvm)
 	kvm->arch.radix = 1;
 	spin_unlock(&kvm->mmu_lock);
 	kvmppc_free_hpt(&kvm->arch.hpt);
-	kvmppc_update_lpcr(kvm, LPCR_UPRT | LPCR_GTSE | LPCR_HR,
-			   LPCR_VPM1 | LPCR_UPRT | LPCR_GTSE | LPCR_HR);
+
+	lpcr = LPCR_UPRT | LPCR_GTSE | LPCR_HR;
+	lpcr_mask = LPCR_VPM1 | LPCR_UPRT | LPCR_GTSE | LPCR_HR;
+	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+		lpcr_mask |= LPCR_HAIL;
+		if (cpu_has_feature(CPU_FTR_HVMODE) &&
+				(kvm->arch.host_lpcr & LPCR_HAIL))
+			lpcr |= LPCR_HAIL;
+	}
+	kvmppc_update_lpcr(kvm, lpcr, lpcr_mask);
+
 	return 0;
 }
 
@@ -5092,6 +5109,10 @@ static int kvmppc_core_init_vm_hv(struct kvm *kvm)
 		kvm->arch.mmu_ready = 1;
 		lpcr &= ~LPCR_VPM1;
 		lpcr |= LPCR_UPRT | LPCR_GTSE | LPCR_HR;
+		if (cpu_has_feature(CPU_FTR_HVMODE) &&
+		    cpu_has_feature(CPU_FTR_ARCH_31) &&
+		    (kvm->arch.host_lpcr & LPCR_HAIL))
+			lpcr |= LPCR_HAIL;
 		ret = kvmppc_init_vm_radix(kvm);
 		if (ret) {
 			kvmppc_free_lpid(kvm->arch.lpid);
-- 
2.23.0


* [RFC PATCH 08/43] powerpc/64s: Keep AMOR SPR a constant ~0 at runtime
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

This register controls supervisor SPR modifications, and as such is only
relevant for KVM. KVM always sets AMOR to ~0 on guest entry, and never
restores it coming back out to the host, so it can be kept constant and
avoid the mtSPR in KVM guest entry.

-21 cycles (9116) POWER9 virt-mode NULL hcall

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/cpu_setup_power.c    |  8 ++++++++
 arch/powerpc/kernel/dt_cpu_ftrs.c        |  2 ++
 arch/powerpc/kvm/book3s_hv_p9_entry.c    |  2 --
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  |  2 --
 arch/powerpc/mm/book3s64/radix_pgtable.c | 15 ---------------
 arch/powerpc/platforms/powernv/idle.c    |  8 +++-----
 6 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/kernel/cpu_setup_power.c b/arch/powerpc/kernel/cpu_setup_power.c
index 3cca88ee96d7..a29dc8326622 100644
--- a/arch/powerpc/kernel/cpu_setup_power.c
+++ b/arch/powerpc/kernel/cpu_setup_power.c
@@ -137,6 +137,7 @@ void __setup_cpu_power7(unsigned long offset, struct cpu_spec *t)
 		return;
 
 	mtspr(SPRN_LPID, 0);
+	mtspr(SPRN_AMOR, ~0);
 	mtspr(SPRN_PCR, PCR_MASK);
 	init_LPCR_ISA206(mfspr(SPRN_LPCR), LPCR_LPES1 >> LPCR_LPES_SH);
 }
@@ -150,6 +151,7 @@ void __restore_cpu_power7(void)
 		return;
 
 	mtspr(SPRN_LPID, 0);
+	mtspr(SPRN_AMOR, ~0);
 	mtspr(SPRN_PCR, PCR_MASK);
 	init_LPCR_ISA206(mfspr(SPRN_LPCR), LPCR_LPES1 >> LPCR_LPES_SH);
 }
@@ -164,6 +166,7 @@ void __setup_cpu_power8(unsigned long offset, struct cpu_spec *t)
 		return;
 
 	mtspr(SPRN_LPID, 0);
+	mtspr(SPRN_AMOR, ~0);
 	mtspr(SPRN_PCR, PCR_MASK);
 	init_LPCR_ISA206(mfspr(SPRN_LPCR) | LPCR_PECEDH, 0); /* LPES = 0 */
 	init_HFSCR();
@@ -184,6 +187,7 @@ void __restore_cpu_power8(void)
 		return;
 
 	mtspr(SPRN_LPID, 0);
+	mtspr(SPRN_AMOR, ~0);
 	mtspr(SPRN_PCR, PCR_MASK);
 	init_LPCR_ISA206(mfspr(SPRN_LPCR) | LPCR_PECEDH, 0); /* LPES = 0 */
 	init_HFSCR();
@@ -202,6 +206,7 @@ void __setup_cpu_power9(unsigned long offset, struct cpu_spec *t)
 	mtspr(SPRN_PSSCR, 0);
 	mtspr(SPRN_LPID, 0);
 	mtspr(SPRN_PID, 0);
+	mtspr(SPRN_AMOR, ~0);
 	mtspr(SPRN_PCR, PCR_MASK);
 	init_LPCR_ISA300((mfspr(SPRN_LPCR) | LPCR_PECEDH | LPCR_PECE_HVEE |\
 			 LPCR_HVICE | LPCR_HEIC) & ~(LPCR_UPRT | LPCR_HR), 0);
@@ -223,6 +228,7 @@ void __restore_cpu_power9(void)
 	mtspr(SPRN_PSSCR, 0);
 	mtspr(SPRN_LPID, 0);
 	mtspr(SPRN_PID, 0);
+	mtspr(SPRN_AMOR, ~0);
 	mtspr(SPRN_PCR, PCR_MASK);
 	init_LPCR_ISA300((mfspr(SPRN_LPCR) | LPCR_PECEDH | LPCR_PECE_HVEE |\
 			 LPCR_HVICE | LPCR_HEIC) & ~(LPCR_UPRT | LPCR_HR), 0);
@@ -242,6 +248,7 @@ void __setup_cpu_power10(unsigned long offset, struct cpu_spec *t)
 	mtspr(SPRN_PSSCR, 0);
 	mtspr(SPRN_LPID, 0);
 	mtspr(SPRN_PID, 0);
+	mtspr(SPRN_AMOR, ~0);
 	mtspr(SPRN_PCR, PCR_MASK);
 	init_LPCR_ISA300((mfspr(SPRN_LPCR) | LPCR_PECEDH | LPCR_PECE_HVEE |\
 			 LPCR_HVICE | LPCR_HEIC) & ~(LPCR_UPRT | LPCR_HR), 0);
@@ -264,6 +271,7 @@ void __restore_cpu_power10(void)
 	mtspr(SPRN_PSSCR, 0);
 	mtspr(SPRN_LPID, 0);
 	mtspr(SPRN_PID, 0);
+	mtspr(SPRN_AMOR, ~0);
 	mtspr(SPRN_PCR, PCR_MASK);
 	init_LPCR_ISA300((mfspr(SPRN_LPCR) | LPCR_PECEDH | LPCR_PECE_HVEE |\
 			 LPCR_HVICE | LPCR_HEIC) & ~(LPCR_UPRT | LPCR_HR), 0);
diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
index 358aee7c2d79..0a6b36b4bda8 100644
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c
+++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -80,6 +80,7 @@ static void __restore_cpu_cpufeatures(void)
 	mtspr(SPRN_LPCR, system_registers.lpcr);
 	if (hv_mode) {
 		mtspr(SPRN_LPID, 0);
+		mtspr(SPRN_AMOR, ~0);
 		mtspr(SPRN_HFSCR, system_registers.hfscr);
 		mtspr(SPRN_PCR, system_registers.pcr);
 	}
@@ -216,6 +217,7 @@ static int __init feat_enable_hv(struct dt_cpu_feature *f)
 	}
 
 	mtspr(SPRN_LPID, 0);
+	mtspr(SPRN_AMOR, ~0);
 
 	lpcr = mfspr(SPRN_LPCR);
 	lpcr &=  ~LPCR_LPES0; /* HV external interrupts */
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index c4f3e066fcb4..a3281f0c9214 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -286,8 +286,6 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	mtspr(SPRN_SPRG2, vcpu->arch.shregs.sprg2);
 	mtspr(SPRN_SPRG3, vcpu->arch.shregs.sprg3);
 
-	mtspr(SPRN_AMOR, ~0UL);
-
 	local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_HV_P9;
 
 	/*
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 8dd437d7a2c6..007f87b97184 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -772,10 +772,8 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
 	/* Restore AMR and UAMOR, set AMOR to all 1s */
 	ld	r5,VCPU_AMR(r4)
 	ld	r6,VCPU_UAMOR(r4)
-	li	r7,-1
 	mtspr	SPRN_AMR,r5
 	mtspr	SPRN_UAMOR,r6
-	mtspr	SPRN_AMOR,r7
 
 	/* Restore state of CTRL run bit; assume 1 on entry */
 	lwz	r5,VCPU_CTRL(r4)
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index fe236c38ce00..b985cfead5d7 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -566,18 +566,6 @@ void __init radix__early_init_devtree(void)
 	return;
 }
 
-static void radix_init_amor(void)
-{
-	/*
-	* In HV mode, we init AMOR (Authority Mask Override Register) so that
-	* the hypervisor and guest can setup IAMR (Instruction Authority Mask
-	* Register), enable key 0 and set it to 1.
-	*
-	* AMOR = 0b1100 .... 0000 (Mask for key 0 is 11)
-	*/
-	mtspr(SPRN_AMOR, (3ul << 62));
-}
-
 void __init radix__early_init_mmu(void)
 {
 	unsigned long lpcr;
@@ -638,7 +626,6 @@ void __init radix__early_init_mmu(void)
 		lpcr = mfspr(SPRN_LPCR);
 		mtspr(SPRN_LPCR, lpcr | LPCR_UPRT | LPCR_HR);
 		radix_init_partition_table();
-		radix_init_amor();
 	} else {
 		radix_init_pseries();
 	}
@@ -662,8 +649,6 @@ void radix__early_init_mmu_secondary(void)
 
 		set_ptcr_when_no_uv(__pa(partition_tb) |
 				    (PATB_SIZE_SHIFT - 12));
-
-		radix_init_amor();
 	}
 
 	radix__switch_mmu_context(NULL, &init_mm);
diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 180baecad914..f791ca041854 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -306,8 +306,8 @@ struct p7_sprs {
 	/* per thread SPRs that get lost in shallow states */
 	u64 amr;
 	u64 iamr;
-	u64 amor;
 	u64 uamor;
+	/* amor is restored to constant ~0 */
 };
 
 static unsigned long power7_idle_insn(unsigned long type)
@@ -378,7 +378,6 @@ static unsigned long power7_idle_insn(unsigned long type)
 	if (cpu_has_feature(CPU_FTR_ARCH_207S)) {
 		sprs.amr	= mfspr(SPRN_AMR);
 		sprs.iamr	= mfspr(SPRN_IAMR);
-		sprs.amor	= mfspr(SPRN_AMOR);
 		sprs.uamor	= mfspr(SPRN_UAMOR);
 	}
 
@@ -397,7 +396,7 @@ static unsigned long power7_idle_insn(unsigned long type)
 			 */
 			mtspr(SPRN_AMR,		sprs.amr);
 			mtspr(SPRN_IAMR,	sprs.iamr);
-			mtspr(SPRN_AMOR,	sprs.amor);
+			mtspr(SPRN_AMOR,	~0);
 			mtspr(SPRN_UAMOR,	sprs.uamor);
 		}
 	}
@@ -687,7 +686,6 @@ static unsigned long power9_idle_stop(unsigned long psscr)
 
 	sprs.amr	= mfspr(SPRN_AMR);
 	sprs.iamr	= mfspr(SPRN_IAMR);
-	sprs.amor	= mfspr(SPRN_AMOR);
 	sprs.uamor	= mfspr(SPRN_UAMOR);
 
 	srr1 = isa300_idle_stop_mayloss(psscr);		/* go idle */
@@ -708,7 +706,7 @@ static unsigned long power9_idle_stop(unsigned long psscr)
 		 */
 		mtspr(SPRN_AMR,		sprs.amr);
 		mtspr(SPRN_IAMR,	sprs.iamr);
-		mtspr(SPRN_AMOR,	sprs.amor);
+		mtspr(SPRN_AMOR,	~0);
 		mtspr(SPRN_UAMOR,	sprs.uamor);
 
 		/*
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 09/43] KVM: PPC: Book3S HV: Don't always save PMU for guest capable of nesting
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Revert the workaround added by commit 63279eeb7f93a ("KVM: PPC: Book3S
HV: Always save guest pmu for guest capable of nesting").

Nested capable guests running with the earlier commit ("KVM: PPC: Book3S
HV Nested: Indicate guest PMU in-use in VPA") will now indicate the PMU
in-use status of their guests, which means the parent does not need to
unconditionally save the PMU for nested capable guests.

This will break the PMU for nested guests when an older nested
hypervisor guest runs under a kernel with this change. It is not clear
there is an easy way to avoid that, so this could wait a release or so
for the fix to filter into stable kernels.
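
With this revert, the exit-path decision reduces to the pre-existing
VPA-based check. A condensed sketch of the code the hunk below leaves
behind (not a new interface):

    save_pmu = 1;
    if (vcpu->arch.vpa.pinned_addr) {
        struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
        /* trust the guest's own indication of PMU use */
        save_pmu = lp->pmcregs_in_use;
    }
    kvmhv_save_guest_pmu(vcpu, save_pmu);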

-134 cycles (8982) POWER9 virt-mode NULL hcall

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index ed713f49fbd5..1f30f98b09d1 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3901,8 +3901,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 		vcpu->arch.vpa.dirty = 1;
 		save_pmu = lp->pmcregs_in_use;
 	}
-	/* Must save pmu if this guest is capable of running nested guests */
-	save_pmu |= nesting_enabled(vcpu->kvm);
 
 	kvmhv_save_guest_pmu(vcpu, save_pmu);
 #ifdef CONFIG_PPC_PSERIES
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

KVM PMU management code looks for particular frozen/disabled bits in
the PMU registers so it knows whether it must clear them when coming
out of a guest or not. Always setting the registers to that
frozen/disabled state when the PMU is not in use lets KVM make these
optimisations without being confused by stale register values. Longer
term, the better approach might be to move guest/host PMU switching to
the perf subsystem.
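
As a rough sketch of the check this enables (pre-ISA v3.1 case only;
the full logic appears as freeze_pmu() later in this series), KVM can
skip the expensive mtSPRs when the registers already hold the
frozen/disabled values:

    unsigned long mmcr0 = mfspr(SPRN_MMCR0);
    unsigned long mmcra = mfspr(SPRN_MMCRA);

    /* only rewrite the SPRs if they are not already frozen/disabled */
    if (!(mmcr0 & MMCR0_FC) || (mmcra & MMCRA_SAMPLE_ENABLE)) {
        mtspr(SPRN_MMCR0, MMCR0_FC);
        mtspr(SPRN_MMCRA, 0);
        isync();
    }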

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/cpu_setup_power.c | 4 ++--
 arch/powerpc/kernel/dt_cpu_ftrs.c     | 6 +++---
 arch/powerpc/kvm/book3s_hv.c          | 5 +++++
 arch/powerpc/perf/core-book3s.c       | 7 +++++++
 4 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/cpu_setup_power.c b/arch/powerpc/kernel/cpu_setup_power.c
index a29dc8326622..3dc61e203f37 100644
--- a/arch/powerpc/kernel/cpu_setup_power.c
+++ b/arch/powerpc/kernel/cpu_setup_power.c
@@ -109,7 +109,7 @@ static void init_PMU_HV_ISA207(void)
 static void init_PMU(void)
 {
 	mtspr(SPRN_MMCRA, 0);
-	mtspr(SPRN_MMCR0, 0);
+	mtspr(SPRN_MMCR0, MMCR0_FC);
 	mtspr(SPRN_MMCR1, 0);
 	mtspr(SPRN_MMCR2, 0);
 }
@@ -123,7 +123,7 @@ static void init_PMU_ISA31(void)
 {
 	mtspr(SPRN_MMCR3, 0);
 	mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
-	mtspr(SPRN_MMCR0, MMCR0_PMCCEXT);
+	mtspr(SPRN_MMCR0, MMCR0_FC | MMCR0_PMCCEXT);
 }
 
 /*
diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
index 0a6b36b4bda8..06a089fbeaa7 100644
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c
+++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -353,7 +353,7 @@ static void init_pmu_power8(void)
 	}
 
 	mtspr(SPRN_MMCRA, 0);
-	mtspr(SPRN_MMCR0, 0);
+	mtspr(SPRN_MMCR0, MMCR0_FC);
 	mtspr(SPRN_MMCR1, 0);
 	mtspr(SPRN_MMCR2, 0);
 	mtspr(SPRN_MMCRS, 0);
@@ -392,7 +392,7 @@ static void init_pmu_power9(void)
 		mtspr(SPRN_MMCRC, 0);
 
 	mtspr(SPRN_MMCRA, 0);
-	mtspr(SPRN_MMCR0, 0);
+	mtspr(SPRN_MMCR0, MMCR0_FC);
 	mtspr(SPRN_MMCR1, 0);
 	mtspr(SPRN_MMCR2, 0);
 }
@@ -428,7 +428,7 @@ static void init_pmu_power10(void)
 
 	mtspr(SPRN_MMCR3, 0);
 	mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
-	mtspr(SPRN_MMCR0, MMCR0_PMCCEXT);
+	mtspr(SPRN_MMCR0, MMCR0_FC | MMCR0_PMCCEXT);
 }
 
 static int __init feat_enable_pmu_power10(struct dt_cpu_feature *f)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 1f30f98b09d1..f7349d150828 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2593,6 +2593,11 @@ static int kvmppc_core_vcpu_create_hv(struct kvm_vcpu *vcpu)
 #endif
 #endif
 	vcpu->arch.mmcr[0] = MMCR0_FC;
+	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+		vcpu->arch.mmcr[0] |= MMCR0_PMCCEXT;
+		vcpu->arch.mmcra = MMCRA_BHRB_DISABLE;
+	}
+
 	vcpu->arch.ctrl = CTRL_RUNLATCH;
 	/* default to host PVR, since we can't spoof it */
 	kvmppc_set_pvr_hv(vcpu, mfspr(SPRN_PVR));
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 51622411a7cc..e33b29ec1a65 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1361,6 +1361,13 @@ static void power_pmu_enable(struct pmu *pmu)
 		goto out;
 
 	if (cpuhw->n_events == 0) {
+		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+			mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
+			mtspr(SPRN_MMCR0, MMCR0_FC | MMCR0_PMCCEXT);
+		} else {
+			mtspr(SPRN_MMCRA, 0);
+			mtspr(SPRN_MMCR0, MMCR0_FC);
+		}
 		ppc_set_pmu_inuse(0);
 		goto out;
 	}
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 11/43] KVM: PPC: Book3S HV P9: Implement PMU save/restore in C
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Implement the P9 path PMU save/restore code in C, and remove the
POWER9/10 code from the P7/8 path assembly.
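
The resulting ordering of the new C helpers in kvmhv_p9_guest_entry(),
condensed from the hunks below (elided code marked with comments):

    save_p9_host_os_sprs(&host_os_sprs);
    save_p9_host_pmu(&host_os_sprs);    /* was kvmhv_save_host_pmu() */
    kvmppc_subcore_enter_guest();
    /* ... */
    load_p9_guest_pmu(vcpu);            /* was kvmhv_load_guest_pmu() */
    /* ... load remaining guest state and run the guest ... */
    save_p9_guest_pmu(vcpu);            /* was kvmhv_save_guest_pmu() */
    /* ... */
    load_p9_host_pmu(&host_os_sprs);    /* was kvmhv_load_host_pmu() */
    kvmppc_subcore_exit_guest();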

-449 cycles (8533) POWER9 virt-mode NULL hcall

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/asm-prototypes.h |   5 -
 arch/powerpc/kvm/book3s_hv.c              | 205 ++++++++++++++++++++--
 arch/powerpc/kvm/book3s_hv_interrupts.S   |  13 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   |  43 +----
 4 files changed, 200 insertions(+), 66 deletions(-)

diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h
index 02ee6f5ac9fe..928db8ef9a5a 100644
--- a/arch/powerpc/include/asm/asm-prototypes.h
+++ b/arch/powerpc/include/asm/asm-prototypes.h
@@ -136,11 +136,6 @@ static inline void kvmppc_restore_tm_hv(struct kvm_vcpu *vcpu, u64 msr,
 					bool preserve_nv) { }
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
 
-void kvmhv_save_host_pmu(void);
-void kvmhv_load_host_pmu(void);
-void kvmhv_save_guest_pmu(struct kvm_vcpu *vcpu, bool pmu_in_use);
-void kvmhv_load_guest_pmu(struct kvm_vcpu *vcpu);
-
 void kvmppc_p9_enter_guest(struct kvm_vcpu *vcpu);
 
 long kvmppc_h_set_dabr(struct kvm_vcpu *vcpu, unsigned long dabr);
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index f7349d150828..b1b94b3563b7 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3635,6 +3635,188 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
 	trace_kvmppc_run_core(vc, 1);
 }
 
+/*
+ * Privileged (non-hypervisor) host registers to save.
+ */
+struct p9_host_os_sprs {
+	unsigned long dscr;
+	unsigned long tidr;
+	unsigned long iamr;
+	unsigned long amr;
+	unsigned long fscr;
+
+	unsigned int pmc1;
+	unsigned int pmc2;
+	unsigned int pmc3;
+	unsigned int pmc4;
+	unsigned int pmc5;
+	unsigned int pmc6;
+	unsigned long mmcr0;
+	unsigned long mmcr1;
+	unsigned long mmcr2;
+	unsigned long mmcr3;
+	unsigned long mmcra;
+	unsigned long siar;
+	unsigned long sier1;
+	unsigned long sier2;
+	unsigned long sier3;
+	unsigned long sdar;
+};
+
+static void freeze_pmu(unsigned long mmcr0, unsigned long mmcra)
+{
+	if (!(mmcr0 & MMCR0_FC))
+		goto do_freeze;
+	if (mmcra & MMCRA_SAMPLE_ENABLE)
+		goto do_freeze;
+	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+		if (!(mmcr0 & MMCR0_PMCCEXT))
+			goto do_freeze;
+		if (!(mmcra & MMCRA_BHRB_DISABLE))
+			goto do_freeze;
+	}
+	return;
+
+do_freeze:
+	mmcr0 = MMCR0_FC;
+	mmcra = 0;
+	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+		mmcr0 |= MMCR0_PMCCEXT;
+		mmcra = MMCRA_BHRB_DISABLE;
+	}
+
+	mtspr(SPRN_MMCR0, mmcr0);
+	mtspr(SPRN_MMCRA, mmcra);
+	isync();
+}
+
+static void save_p9_host_pmu(struct p9_host_os_sprs *host_os_sprs)
+{
+	if (ppc_get_pmu_inuse()) {
+		/*
+		 * It might be better to put PMU handling (at least for the
+		 * host) in the perf subsystem because it knows more about what
+		 * is being used.
+		 */
+
+		/* POWER9, POWER10 do not implement HPMC or SPMC */
+
+		host_os_sprs->mmcr0 = mfspr(SPRN_MMCR0);
+		host_os_sprs->mmcra = mfspr(SPRN_MMCRA);
+
+		freeze_pmu(host_os_sprs->mmcr0, host_os_sprs->mmcra);
+
+		host_os_sprs->pmc1 = mfspr(SPRN_PMC1);
+		host_os_sprs->pmc2 = mfspr(SPRN_PMC2);
+		host_os_sprs->pmc3 = mfspr(SPRN_PMC3);
+		host_os_sprs->pmc4 = mfspr(SPRN_PMC4);
+		host_os_sprs->pmc5 = mfspr(SPRN_PMC5);
+		host_os_sprs->pmc6 = mfspr(SPRN_PMC6);
+		host_os_sprs->mmcr1 = mfspr(SPRN_MMCR1);
+		host_os_sprs->mmcr2 = mfspr(SPRN_MMCR2);
+		host_os_sprs->sdar = mfspr(SPRN_SDAR);
+		host_os_sprs->siar = mfspr(SPRN_SIAR);
+		host_os_sprs->sier1 = mfspr(SPRN_SIER);
+
+		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+			host_os_sprs->mmcr3 = mfspr(SPRN_MMCR3);
+			host_os_sprs->sier2 = mfspr(SPRN_SIER2);
+			host_os_sprs->sier3 = mfspr(SPRN_SIER3);
+		}
+	}
+}
+
+static void load_p9_guest_pmu(struct kvm_vcpu *vcpu)
+{
+	mtspr(SPRN_PMC1, vcpu->arch.pmc[0]);
+	mtspr(SPRN_PMC2, vcpu->arch.pmc[1]);
+	mtspr(SPRN_PMC3, vcpu->arch.pmc[2]);
+	mtspr(SPRN_PMC4, vcpu->arch.pmc[3]);
+	mtspr(SPRN_PMC5, vcpu->arch.pmc[4]);
+	mtspr(SPRN_PMC6, vcpu->arch.pmc[5]);
+	mtspr(SPRN_MMCR1, vcpu->arch.mmcr[1]);
+	mtspr(SPRN_MMCR2, vcpu->arch.mmcr[2]);
+	mtspr(SPRN_SDAR, vcpu->arch.sdar);
+	mtspr(SPRN_SIAR, vcpu->arch.siar);
+	mtspr(SPRN_SIER, vcpu->arch.sier[0]);
+
+	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+		mtspr(SPRN_MMCR3, vcpu->arch.mmcr[4]);
+		mtspr(SPRN_SIER2, vcpu->arch.sier[1]);
+		mtspr(SPRN_SIER3, vcpu->arch.sier[2]);
+	}
+
+	/* Set MMCRA then MMCR0 last */
+	mtspr(SPRN_MMCRA, vcpu->arch.mmcra);
+	mtspr(SPRN_MMCR0, vcpu->arch.mmcr[0]);
+	/* No isync necessary because we're starting counters */
+}
+
+static void save_p9_guest_pmu(struct kvm_vcpu *vcpu)
+{
+	struct lppaca *lp;
+	int save_pmu = 1;
+
+	lp = vcpu->arch.vpa.pinned_addr;
+	if (lp)
+		save_pmu = lp->pmcregs_in_use;
+
+	if (save_pmu) {
+		vcpu->arch.mmcr[0] = mfspr(SPRN_MMCR0);
+		vcpu->arch.mmcra = mfspr(SPRN_MMCRA);
+
+		freeze_pmu(vcpu->arch.mmcr[0], vcpu->arch.mmcra);
+
+		vcpu->arch.pmc[0] = mfspr(SPRN_PMC1);
+		vcpu->arch.pmc[1] = mfspr(SPRN_PMC2);
+		vcpu->arch.pmc[2] = mfspr(SPRN_PMC3);
+		vcpu->arch.pmc[3] = mfspr(SPRN_PMC4);
+		vcpu->arch.pmc[4] = mfspr(SPRN_PMC5);
+		vcpu->arch.pmc[5] = mfspr(SPRN_PMC6);
+		vcpu->arch.mmcr[1] = mfspr(SPRN_MMCR1);
+		vcpu->arch.mmcr[2] = mfspr(SPRN_MMCR2);
+		vcpu->arch.sdar = mfspr(SPRN_SDAR);
+		vcpu->arch.siar = mfspr(SPRN_SIAR);
+		vcpu->arch.sier[0] = mfspr(SPRN_SIER);
+
+		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+			vcpu->arch.mmcr[3] = mfspr(SPRN_MMCR3);
+			vcpu->arch.sier[1] = mfspr(SPRN_SIER2);
+			vcpu->arch.sier[2] = mfspr(SPRN_SIER3);
+		}
+	} else {
+		freeze_pmu(mfspr(SPRN_MMCR0), mfspr(SPRN_MMCRA));
+	}
+}
+
+static void load_p9_host_pmu(struct p9_host_os_sprs *host_os_sprs)
+{
+	if (ppc_get_pmu_inuse()) {
+		mtspr(SPRN_PMC1, host_os_sprs->pmc1);
+		mtspr(SPRN_PMC2, host_os_sprs->pmc2);
+		mtspr(SPRN_PMC3, host_os_sprs->pmc3);
+		mtspr(SPRN_PMC4, host_os_sprs->pmc4);
+		mtspr(SPRN_PMC5, host_os_sprs->pmc5);
+		mtspr(SPRN_PMC6, host_os_sprs->pmc6);
+		mtspr(SPRN_MMCR1, host_os_sprs->mmcr1);
+		mtspr(SPRN_MMCR2, host_os_sprs->mmcr2);
+		mtspr(SPRN_SDAR, host_os_sprs->sdar);
+		mtspr(SPRN_SIAR, host_os_sprs->siar);
+		mtspr(SPRN_SIER, host_os_sprs->sier1);
+
+		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+			mtspr(SPRN_MMCR3, host_os_sprs->mmcr3);
+			mtspr(SPRN_SIER2, host_os_sprs->sier2);
+			mtspr(SPRN_SIER3, host_os_sprs->sier3);
+		}
+
+		/* Set MMCRA then MMCR0 last */
+		mtspr(SPRN_MMCRA, host_os_sprs->mmcra);
+		mtspr(SPRN_MMCR0, host_os_sprs->mmcr0);
+		isync();
+	}
+}
+
 static void load_spr_state(struct kvm_vcpu *vcpu)
 {
 	mtspr(SPRN_DSCR, vcpu->arch.dscr);
@@ -3677,17 +3859,6 @@ static void store_spr_state(struct kvm_vcpu *vcpu)
 	vcpu->arch.dscr = mfspr(SPRN_DSCR);
 }
 
-/*
- * Privileged (non-hypervisor) host registers to save.
- */
-struct p9_host_os_sprs {
-	unsigned long dscr;
-	unsigned long tidr;
-	unsigned long iamr;
-	unsigned long amr;
-	unsigned long fscr;
-};
-
 static void save_p9_host_os_sprs(struct p9_host_os_sprs *host_os_sprs)
 {
 	host_os_sprs->dscr = mfspr(SPRN_DSCR);
@@ -3735,7 +3906,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 	struct p9_host_os_sprs host_os_sprs;
 	s64 dec;
 	u64 tb, next_timer;
-	int trap, save_pmu;
+	int trap;
 
 	WARN_ON_ONCE(vcpu->arch.ceded);
 
@@ -3748,7 +3919,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	save_p9_host_os_sprs(&host_os_sprs);
 
-	kvmhv_save_host_pmu();		/* saves it to PACA kvm_hstate */
+	save_p9_host_pmu(&host_os_sprs);
 
 	kvmppc_subcore_enter_guest();
 
@@ -3776,7 +3947,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 		}
 	}
 #endif
-	kvmhv_load_guest_pmu(vcpu);
+	load_p9_guest_pmu(vcpu);
 
 	msr_check_and_set(MSR_FP | MSR_VEC | MSR_VSX);
 	load_fp_state(&vcpu->arch.fp);
@@ -3898,16 +4069,14 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
 		kvmppc_save_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
 
-	save_pmu = 1;
 	if (vcpu->arch.vpa.pinned_addr) {
 		struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
 		u32 yield_count = be32_to_cpu(lp->yield_count) + 1;
 		lp->yield_count = cpu_to_be32(yield_count);
 		vcpu->arch.vpa.dirty = 1;
-		save_pmu = lp->pmcregs_in_use;
 	}
 
-	kvmhv_save_guest_pmu(vcpu, save_pmu);
+	save_p9_guest_pmu(vcpu);
 #ifdef CONFIG_PPC_PSERIES
 	if (kvmhv_on_pseries())
 		get_lppaca()->pmcregs_in_use = ppc_get_pmu_inuse();
@@ -3920,7 +4089,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	mtspr(SPRN_SPRG_VDSO_WRITE, local_paca->sprg_vdso);
 
-	kvmhv_load_host_pmu();
+	load_p9_host_pmu(&host_os_sprs);
 
 	kvmppc_subcore_exit_guest();
 
diff --git a/arch/powerpc/kvm/book3s_hv_interrupts.S b/arch/powerpc/kvm/book3s_hv_interrupts.S
index 4444f83cb133..59d89e4b154a 100644
--- a/arch/powerpc/kvm/book3s_hv_interrupts.S
+++ b/arch/powerpc/kvm/book3s_hv_interrupts.S
@@ -104,7 +104,10 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
 	mtlr	r0
 	blr
 
-_GLOBAL(kvmhv_save_host_pmu)
+/*
+ * void kvmhv_save_host_pmu(void)
+ */
+kvmhv_save_host_pmu:
 BEGIN_FTR_SECTION
 	/* Work around P8 PMAE bug */
 	li	r3, -1
@@ -138,14 +141,6 @@ BEGIN_FTR_SECTION
 	std	r8, HSTATE_MMCR2(r13)
 	std	r9, HSTATE_SIER(r13)
 END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
-BEGIN_FTR_SECTION
-	mfspr	r5, SPRN_MMCR3
-	mfspr	r6, SPRN_SIER2
-	mfspr	r7, SPRN_SIER3
-	std	r5, HSTATE_MMCR3(r13)
-	std	r6, HSTATE_SIER2(r13)
-	std	r7, HSTATE_SIER3(r13)
-END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
 	mfspr	r3, SPRN_PMC1
 	mfspr	r5, SPRN_PMC2
 	mfspr	r6, SPRN_PMC3
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 007f87b97184..0eb06734bc26 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -2780,10 +2780,11 @@ kvmppc_msr_interrupt:
 	blr
 
 /*
+ * void kvmhv_load_guest_pmu(struct kvm_vcpu *vcpu)
+ *
  * Load up guest PMU state.  R3 points to the vcpu struct.
  */
-_GLOBAL(kvmhv_load_guest_pmu)
-EXPORT_SYMBOL_GPL(kvmhv_load_guest_pmu)
+kvmhv_load_guest_pmu:
 	mr	r4, r3
 	mflr	r0
 	li	r3, 1
@@ -2817,27 +2818,17 @@ END_FTR_SECTION_IFSET(CPU_FTR_PMAO_BUG)
 	mtspr	SPRN_MMCRA, r6
 	mtspr	SPRN_SIAR, r7
 	mtspr	SPRN_SDAR, r8
-BEGIN_FTR_SECTION
-	ld      r5, VCPU_MMCR + 24(r4)
-	ld      r6, VCPU_SIER + 8(r4)
-	ld      r7, VCPU_SIER + 16(r4)
-	mtspr   SPRN_MMCR3, r5
-	mtspr   SPRN_SIER2, r6
-	mtspr   SPRN_SIER3, r7
-END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
 BEGIN_FTR_SECTION
 	ld	r5, VCPU_MMCR + 16(r4)
 	ld	r6, VCPU_SIER(r4)
 	mtspr	SPRN_MMCR2, r5
 	mtspr	SPRN_SIER, r6
-BEGIN_FTR_SECTION_NESTED(96)
 	lwz	r7, VCPU_PMC + 24(r4)
 	lwz	r8, VCPU_PMC + 28(r4)
 	ld	r9, VCPU_MMCRS(r4)
 	mtspr	SPRN_SPMC1, r7
 	mtspr	SPRN_SPMC2, r8
 	mtspr	SPRN_MMCRS, r9
-END_FTR_SECTION_NESTED(CPU_FTR_ARCH_300, 0, 96)
 END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
 	mtspr	SPRN_MMCR0, r3
 	isync
@@ -2845,10 +2836,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
 	blr
 
 /*
+ * void kvmhv_load_host_pmu(void)
+ *
  * Reload host PMU state saved in the PACA by kvmhv_save_host_pmu.
  */
-_GLOBAL(kvmhv_load_host_pmu)
-EXPORT_SYMBOL_GPL(kvmhv_load_host_pmu)
+kvmhv_load_host_pmu:
 	mflr	r0
 	lbz	r4, PACA_PMCINUSE(r13) /* is the host using the PMU? */
 	cmpwi	r4, 0
@@ -2886,25 +2878,18 @@ BEGIN_FTR_SECTION
 	mtspr	SPRN_MMCR2, r8
 	mtspr	SPRN_SIER, r9
 END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
-BEGIN_FTR_SECTION
-	ld      r5, HSTATE_MMCR3(r13)
-	ld      r6, HSTATE_SIER2(r13)
-	ld      r7, HSTATE_SIER3(r13)
-	mtspr   SPRN_MMCR3, r5
-	mtspr   SPRN_SIER2, r6
-	mtspr   SPRN_SIER3, r7
-END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
 	mtspr	SPRN_MMCR0, r3
 	isync
 	mtlr	r0
 23:	blr
 
 /*
+ * void kvmhv_save_guest_pmu(struct kvm_vcpu *vcpu, bool pmu_in_use)
+ *
  * Save guest PMU state into the vcpu struct.
  * r3 = vcpu, r4 = full save flag (PMU in use flag set in VPA)
  */
-_GLOBAL(kvmhv_save_guest_pmu)
-EXPORT_SYMBOL_GPL(kvmhv_save_guest_pmu)
+kvmhv_save_guest_pmu:
 	mr	r9, r3
 	mr	r8, r4
 BEGIN_FTR_SECTION
@@ -2953,14 +2938,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
 BEGIN_FTR_SECTION
 	std	r10, VCPU_MMCR + 16(r9)
 END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
-BEGIN_FTR_SECTION
-	mfspr   r5, SPRN_MMCR3
-	mfspr   r6, SPRN_SIER2
-	mfspr   r7, SPRN_SIER3
-	std     r5, VCPU_MMCR + 24(r9)
-	std     r6, VCPU_SIER + 8(r9)
-	std     r7, VCPU_SIER + 16(r9)
-END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
 	std	r7, VCPU_SIAR(r9)
 	std	r8, VCPU_SDAR(r9)
 	mfspr	r3, SPRN_PMC1
@@ -2978,7 +2955,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
 BEGIN_FTR_SECTION
 	mfspr	r5, SPRN_SIER
 	std	r5, VCPU_SIER(r9)
-BEGIN_FTR_SECTION_NESTED(96)
 	mfspr	r6, SPRN_SPMC1
 	mfspr	r7, SPRN_SPMC2
 	mfspr	r8, SPRN_MMCRS
@@ -2987,7 +2963,6 @@ BEGIN_FTR_SECTION_NESTED(96)
 	std	r8, VCPU_MMCRS(r9)
 	lis	r4, 0x8000
 	mtspr	SPRN_MMCRS, r4
-END_FTR_SECTION_NESTED(CPU_FTR_ARCH_300, 0, 96)
 END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
 22:	blr
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 12/43] KVM: PPC: Book3S HV P9: Factor out yield_count increment
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Factor duplicated code into a helper function.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index b1b94b3563b7..38d8afa16839 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3896,6 +3896,16 @@ static inline bool hcall_is_xics(unsigned long req)
 		req == H_IPOLL || req == H_XIRR || req == H_XIRR_X;
 }
 
+static void vcpu_vpa_increment_dispatch(struct kvm_vcpu *vcpu)
+{
+	struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
+	if (lp) {
+		u32 yield_count = be32_to_cpu(lp->yield_count) + 1;
+		lp->yield_count = cpu_to_be32(yield_count);
+		vcpu->arch.vpa.dirty = 1;
+	}
+}
+
 /*
  * Guest entry for POWER9 and later CPUs.
  */
@@ -3926,12 +3936,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 	vc->entry_exit_map = 1;
 	vc->in_guest = 1;
 
-	if (vcpu->arch.vpa.pinned_addr) {
-		struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
-		u32 yield_count = be32_to_cpu(lp->yield_count) + 1;
-		lp->yield_count = cpu_to_be32(yield_count);
-		vcpu->arch.vpa.dirty = 1;
-	}
+	vcpu_vpa_increment_dispatch(vcpu);
 
 	if (cpu_has_feature(CPU_FTR_TM) ||
 	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
@@ -4069,12 +4074,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
 		kvmppc_save_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
 
-	if (vcpu->arch.vpa.pinned_addr) {
-		struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
-		u32 yield_count = be32_to_cpu(lp->yield_count) + 1;
-		lp->yield_count = cpu_to_be32(yield_count);
-		vcpu->arch.vpa.dirty = 1;
-	}
+	vcpu_vpa_increment_dispatch(vcpu);
 
 	save_p9_guest_pmu(vcpu);
 #ifdef CONFIG_PPC_PSERIES
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 13/43] KVM: PPC: Book3S HV P9: Factor PMU save/load into context switch functions
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Rather than guest/host save/restore functions, implement context switch
functions that take care of details like the VPA update for nested
guests.

The reason to split helpers like these into explicit save/load functions
is mainly to schedule SPR accesses nicely, but the PMU is a special case
where the load requires mtSPRs (to stop counters) and has other
difficulties, so there is less opportunity to schedule those nicely. The
SPR accesses also have side-effects if the PMU is running, and later
changes keep the host PMU running as long as possible so this code can
be better profiled, which also complicates scheduling.
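
Condensed from the hunks below, the entry/exit path now pairs the two
helpers, with the nested (pseries) pmcregs_in_use propagation handled
inside them rather than open-coded in kvmhv_p9_guest_entry():

    switch_pmu_to_guest(vcpu, &host_os_sprs);  /* save host, load guest PMU */
    /* ... switch the remaining state and run the guest ... */
    switch_pmu_to_host(vcpu, &host_os_sprs);   /* save guest, load host PMU */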

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c | 51 ++++++++++++++++--------------------
 1 file changed, 23 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 38d8afa16839..13b8389b0479 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3690,7 +3690,8 @@ static void freeze_pmu(unsigned long mmcr0, unsigned long mmcra)
 	isync();
 }
 
-static void save_p9_host_pmu(struct p9_host_os_sprs *host_os_sprs)
+static void switch_pmu_to_guest(struct kvm_vcpu *vcpu,
+				struct p9_host_os_sprs *host_os_sprs)
 {
 	if (ppc_get_pmu_inuse()) {
 		/*
@@ -3724,10 +3725,19 @@ static void save_p9_host_pmu(struct p9_host_os_sprs *host_os_sprs)
 			host_os_sprs->sier3 = mfspr(SPRN_SIER3);
 		}
 	}
-}
 
-static void load_p9_guest_pmu(struct kvm_vcpu *vcpu)
-{
+#ifdef CONFIG_PPC_PSERIES
+	if (kvmhv_on_pseries()) {
+		if (vcpu->arch.vpa.pinned_addr) {
+			struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
+			get_lppaca()->pmcregs_in_use = lp->pmcregs_in_use;
+		} else {
+			get_lppaca()->pmcregs_in_use = 1;
+		}
+	}
+#endif
+
+	/* load guest */
 	mtspr(SPRN_PMC1, vcpu->arch.pmc[0]);
 	mtspr(SPRN_PMC2, vcpu->arch.pmc[1]);
 	mtspr(SPRN_PMC3, vcpu->arch.pmc[2]);
@@ -3752,7 +3762,8 @@ static void load_p9_guest_pmu(struct kvm_vcpu *vcpu)
 	/* No isync necessary because we're starting counters */
 }
 
-static void save_p9_guest_pmu(struct kvm_vcpu *vcpu)
+static void switch_pmu_to_host(struct kvm_vcpu *vcpu,
+				struct p9_host_os_sprs *host_os_sprs)
 {
 	struct lppaca *lp;
 	int save_pmu = 1;
@@ -3787,10 +3798,12 @@ static void save_p9_guest_pmu(struct kvm_vcpu *vcpu)
 	} else {
 		freeze_pmu(mfspr(SPRN_MMCR0), mfspr(SPRN_MMCRA));
 	}
-}
 
-static void load_p9_host_pmu(struct p9_host_os_sprs *host_os_sprs)
-{
+#ifdef CONFIG_PPC_PSERIES
+	if (kvmhv_on_pseries())
+		get_lppaca()->pmcregs_in_use = ppc_get_pmu_inuse();
+#endif
+
 	if (ppc_get_pmu_inuse()) {
 		mtspr(SPRN_PMC1, host_os_sprs->pmc1);
 		mtspr(SPRN_PMC2, host_os_sprs->pmc2);
@@ -3929,8 +3942,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	save_p9_host_os_sprs(&host_os_sprs);
 
-	save_p9_host_pmu(&host_os_sprs);
-
 	kvmppc_subcore_enter_guest();
 
 	vc->entry_exit_map = 1;
@@ -3942,17 +3953,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
 		kvmppc_restore_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
 
-#ifdef CONFIG_PPC_PSERIES
-	if (kvmhv_on_pseries()) {
-		if (vcpu->arch.vpa.pinned_addr) {
-			struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
-			get_lppaca()->pmcregs_in_use = lp->pmcregs_in_use;
-		} else {
-			get_lppaca()->pmcregs_in_use = 1;
-		}
-	}
-#endif
-	load_p9_guest_pmu(vcpu);
+	switch_pmu_to_guest(vcpu, &host_os_sprs);
 
 	msr_check_and_set(MSR_FP | MSR_VEC | MSR_VSX);
 	load_fp_state(&vcpu->arch.fp);
@@ -4076,11 +4077,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	vcpu_vpa_increment_dispatch(vcpu);
 
-	save_p9_guest_pmu(vcpu);
-#ifdef CONFIG_PPC_PSERIES
-	if (kvmhv_on_pseries())
-		get_lppaca()->pmcregs_in_use = ppc_get_pmu_inuse();
-#endif
+	switch_pmu_to_host(vcpu, &host_os_sprs);
 
 	vc->entry_exit_map = 0x101;
 	vc->in_guest = 0;
@@ -4089,8 +4086,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	mtspr(SPRN_SPRG_VDSO_WRITE, local_paca->sprg_vdso);
 
-	load_p9_host_pmu(&host_os_sprs);
-
 	kvmppc_subcore_exit_guest();
 
 	return trap;
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 14/43] KVM: PPC: Book3S HV P9: Demand fault PMU SPRs when marked not inuse
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

The pmcregs_in_use field in the guest VPA cannot be trusted to reflect
what the guest is doing with PMU SPRs, so the PMU must always be managed
(stopped) when exiting the guest, and SPR values set when entering the
guest, to ensure the guest can't open a covert channel or otherwise cause
other guests or the host to misbehave.

So prevent guest access to the PMU with HFSCR[PM] if pmcregs_in_use is
clear, and avoid the PMU SPR accesses on every partition switch. Guests
that use the PMU without setting pmcregs_in_use, or that have only just
set it, will take a hypervisor facility unavailable interrupt that
brings in the PMU SPRs.
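
(Illustrative aside, not part of the patch: a minimal user-space sketch
of the demand-fault scheme, with the HFSCR bit, the facility
unavailable handler and the VPA flag modelled as plain variables. All
names below are invented for the model.)

#include <stdio.h>
#include <stdbool.h>

#define TOY_HFSCR_PM	(1UL << 0)	/* stand-in for HFSCR[PM] */

struct toy_vcpu {
	unsigned long hfscr;	/* facility bits the guest may use directly */
	bool pmcregs_in_use;	/* what the guest advertises in its VPA */
};

/* Entry: only load PMU state if the facility bit is set. */
static bool enter_guest(struct toy_vcpu *v)
{
	if (v->pmcregs_in_use)
		v->hfscr |= TOY_HFSCR_PM;
	return v->hfscr & TOY_HFSCR_PM;	/* true: PMU SPRs were loaded */
}

/* Facility unavailable "interrupt": re-enable, the next entry loads SPRs. */
static void fac_unavail(struct toy_vcpu *v)
{
	v->hfscr |= TOY_HFSCR_PM;
}

/* Exit: if the guest didn't advertise PMU use, drop the bit again so the
 * next entry can skip the PMU SPR switch. */
static void exit_guest(struct toy_vcpu *v)
{
	if (!v->pmcregs_in_use)
		v->hfscr &= ~TOY_HFSCR_PM;
}

int main(void)
{
	struct toy_vcpu v = { .hfscr = 0, .pmcregs_in_use = false };

	printf("entry 1 loads PMU: %d\n", enter_guest(&v));	/* 0: skipped */
	exit_guest(&v);
	fac_unavail(&v);		/* guest touched a PMU SPR anyway */
	printf("entry 2 loads PMU: %d\n", enter_guest(&v));	/* 1: loaded */
	exit_guest(&v);
	return 0;
}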

-774 cycles (7759) POWER9 virt-mode NULL hcall

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/kvm_host.h |   1 +
 arch/powerpc/kvm/book3s_hv.c        | 122 ++++++++++++++++++++++------
 arch/powerpc/kvm/book3s_hv_nested.c |  12 ++-
 3 files changed, 105 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 7e4c3a741951..5c003a5ff854 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -819,6 +819,7 @@ struct kvm_vcpu_arch {
 	/* For support of nested guests */
 	struct kvm_nested_guest *nested;
 	u32 nested_vcpu_id;
+	u64 nested_hfscr;
 	gpa_t nested_io_gpr;
 #endif
 
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 13b8389b0479..0733bb95f439 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1349,6 +1349,20 @@ static int kvmppc_emulate_doorbell_instr(struct kvm_vcpu *vcpu)
 	return RESUME_GUEST;
 }
 
+/*
+ * If the lppaca had pmcregs_in_use clear when we exited the guest, then
+ * HFSCR_PM is cleared for next entry. If the guest then tries to access
+ * the PMU SPRs, we get this facility unavailable interrupt. Putting HFSCR_PM
+ * back in the guest HFSCR will cause the next entry to load the PMU SPRs and
+ * allow the guest access to continue.
+ */
+static int kvmppc_pmu_unavailable(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.hfscr |= HFSCR_PM;
+
+	return RESUME_GUEST;
+}
+
 static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu,
 				 struct task_struct *tsk)
 {
@@ -1618,16 +1632,22 @@ XXX benchmark guest exits
 	 * to emulate.
 	 * Otherwise, we just generate a program interrupt to the guest.
 	 */
-	case BOOK3S_INTERRUPT_H_FAC_UNAVAIL:
+	case BOOK3S_INTERRUPT_H_FAC_UNAVAIL: {
 		r = EMULATE_FAIL;
-		if (((vcpu->arch.hfscr >> 56) == FSCR_MSGP_LG) &&
-		    cpu_has_feature(CPU_FTR_ARCH_300))
-			r = kvmppc_emulate_doorbell_instr(vcpu);
+		if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+			unsigned long cause = vcpu->arch.hfscr >> 56;
+
+			if (cause == FSCR_MSGP_LG)
+				r = kvmppc_emulate_doorbell_instr(vcpu);
+			if (cause == FSCR_PM_LG)
+				r = kvmppc_pmu_unavailable(vcpu);
+		}
 		if (r == EMULATE_FAIL) {
 			kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
 			r = RESUME_GUEST;
 		}
 		break;
+	}
 
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	case BOOK3S_INTERRUPT_HV_SOFTPATCH:
@@ -1734,6 +1754,19 @@ static int kvmppc_handle_nested_exit(struct kvm_vcpu *vcpu)
 		srcu_read_unlock(&vcpu->kvm->srcu, srcu_idx);
 		break;
 
+	case BOOK3S_INTERRUPT_H_FAC_UNAVAIL: {
+		unsigned long cause = vcpu->arch.hfscr >> 56;
+
+		r = EMULATE_FAIL;
+		if (cause == FSCR_PM_LG && (vcpu->arch.nested_hfscr & HFSCR_PM))
+			r = kvmppc_pmu_unavailable(vcpu);
+
+		if (r == EMULATE_FAIL)
+			r = RESUME_HOST;
+
+		break;
+	}
+
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	case BOOK3S_INTERRUPT_HV_SOFTPATCH:
 		/*
@@ -3693,6 +3726,17 @@ static void freeze_pmu(unsigned long mmcr0, unsigned long mmcra)
 static void switch_pmu_to_guest(struct kvm_vcpu *vcpu,
 				struct p9_host_os_sprs *host_os_sprs)
 {
+	struct lppaca *lp;
+	int load_pmu = 1;
+
+	lp = vcpu->arch.vpa.pinned_addr;
+	if (lp)
+		load_pmu = lp->pmcregs_in_use;
+
+	if (load_pmu)
+	      vcpu->arch.hfscr |= HFSCR_PM;
+
+	/* Save host */
 	if (ppc_get_pmu_inuse()) {
 		/*
 		 * It might be better to put PMU handling (at least for the
@@ -3737,29 +3781,31 @@ static void switch_pmu_to_guest(struct kvm_vcpu *vcpu,
 	}
 #endif
 
-	/* load guest */
-	mtspr(SPRN_PMC1, vcpu->arch.pmc[0]);
-	mtspr(SPRN_PMC2, vcpu->arch.pmc[1]);
-	mtspr(SPRN_PMC3, vcpu->arch.pmc[2]);
-	mtspr(SPRN_PMC4, vcpu->arch.pmc[3]);
-	mtspr(SPRN_PMC5, vcpu->arch.pmc[4]);
-	mtspr(SPRN_PMC6, vcpu->arch.pmc[5]);
-	mtspr(SPRN_MMCR1, vcpu->arch.mmcr[1]);
-	mtspr(SPRN_MMCR2, vcpu->arch.mmcr[2]);
-	mtspr(SPRN_SDAR, vcpu->arch.sdar);
-	mtspr(SPRN_SIAR, vcpu->arch.siar);
-	mtspr(SPRN_SIER, vcpu->arch.sier[0]);
+	/* Load guest */
+	if (vcpu->arch.hfscr & HFSCR_PM) {
+		mtspr(SPRN_PMC1, vcpu->arch.pmc[0]);
+		mtspr(SPRN_PMC2, vcpu->arch.pmc[1]);
+		mtspr(SPRN_PMC3, vcpu->arch.pmc[2]);
+		mtspr(SPRN_PMC4, vcpu->arch.pmc[3]);
+		mtspr(SPRN_PMC5, vcpu->arch.pmc[4]);
+		mtspr(SPRN_PMC6, vcpu->arch.pmc[5]);
+		mtspr(SPRN_MMCR1, vcpu->arch.mmcr[1]);
+		mtspr(SPRN_MMCR2, vcpu->arch.mmcr[2]);
+		mtspr(SPRN_SDAR, vcpu->arch.sdar);
+		mtspr(SPRN_SIAR, vcpu->arch.siar);
+		mtspr(SPRN_SIER, vcpu->arch.sier[0]);
 
-	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
-		mtspr(SPRN_MMCR3, vcpu->arch.mmcr[4]);
-		mtspr(SPRN_SIER2, vcpu->arch.sier[1]);
-		mtspr(SPRN_SIER3, vcpu->arch.sier[2]);
-	}
+		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+			mtspr(SPRN_MMCR3, vcpu->arch.mmcr[4]);
+			mtspr(SPRN_SIER2, vcpu->arch.sier[1]);
+			mtspr(SPRN_SIER3, vcpu->arch.sier[2]);
+		}
 
-	/* Set MMCRA then MMCR0 last */
-	mtspr(SPRN_MMCRA, vcpu->arch.mmcra);
-	mtspr(SPRN_MMCR0, vcpu->arch.mmcr[0]);
-	/* No isync necessary because we're starting counters */
+		/* Set MMCRA then MMCR0 last */
+		mtspr(SPRN_MMCRA, vcpu->arch.mmcra);
+		mtspr(SPRN_MMCR0, vcpu->arch.mmcr[0]);
+		/* No isync necessary because we're starting counters */
+	}
 }
 
 static void switch_pmu_to_host(struct kvm_vcpu *vcpu,
@@ -3795,9 +3841,31 @@ static void switch_pmu_to_host(struct kvm_vcpu *vcpu,
 			vcpu->arch.sier[1] = mfspr(SPRN_SIER2);
 			vcpu->arch.sier[2] = mfspr(SPRN_SIER3);
 		}
-	} else {
+
+	} else if (vcpu->arch.hfscr & HFSCR_PM) {
+		/*
+		 * The guest accessed PMC SPRs without specifying they should
+		 * be preserved. Stop them from counting if the guest had
+		 * started anything.
+		 */
 		freeze_pmu(mfspr(SPRN_MMCR0), mfspr(SPRN_MMCRA));
-	}
+
+		/*
+		 * Demand-fault PMU register access in the guest.
+		 *
+		 * This is used to grab the guest's VPA pmcregs_in_use value
+		 * and reflect it into the host's VPA in the case of a nested
+		 * hypervisor.
+		 *
+		 * It also avoids having to zero-out SPRs after each guest
+		 * exit to avoid side-channels.
+		 *
+		 * This is cleared here when we exit the guest, so later HFSCR
+		 * interrupt handling can add it back to run the guest with
+		 * PM enabled next time.
+		 */
+		vcpu->arch.hfscr &= ~HFSCR_PM;
+	} /* otherwise the PMU should still be frozen from guest entry */
 
 #ifdef CONFIG_PPC_PSERIES
 	if (kvmhv_on_pseries())
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index 60724f674421..6add13a22f56 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -103,7 +103,7 @@ static void save_hv_return_state(struct kvm_vcpu *vcpu, int trap,
 	struct kvmppc_vcore *vc = vcpu->arch.vcore;
 
 	hr->dpdes = vc->dpdes;
-	hr->hfscr = vcpu->arch.hfscr;
+	hr->hfscr = vcpu->arch.nested_hfscr;
 	hr->purr = vcpu->arch.purr;
 	hr->spurr = vcpu->arch.spurr;
 	hr->ic = vcpu->arch.ic;
@@ -126,6 +126,10 @@ static void save_hv_return_state(struct kvm_vcpu *vcpu, int trap,
 	case BOOK3S_INTERRUPT_H_INST_STORAGE:
 		hr->asdr = vcpu->arch.fault_gpa;
 		break;
+	case BOOK3S_INTERRUPT_H_FAC_UNAVAIL:
+		hr->hfscr &= ~HFSCR_INTR_CAUSE;
+		hr->hfscr |= vcpu->arch.hfscr & HFSCR_INTR_CAUSE;
+		break;
 	case BOOK3S_INTERRUPT_H_EMUL_ASSIST:
 		hr->heir = vcpu->arch.emul_inst;
 		break;
@@ -161,9 +165,10 @@ static void sanitise_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr)
 
 	/*
 	 * Don't let L1 enable features for L2 which we've disabled for L1,
-	 * but preserve the interrupt cause field.
+	 * but preserve the interrupt cause field and facilities that might
+	 * be disabled for demand faulting in the L1.
 	 */
-	hr->hfscr &= (HFSCR_INTR_CAUSE | vcpu->arch.hfscr);
+	hr->hfscr &= (HFSCR_INTR_CAUSE | HFSCR_PM | vcpu->arch.hfscr);
 
 	/* Don't let data address watchpoint match in hypervisor state */
 	hr->dawrx0 &= ~DAWRX_HYP;
@@ -342,6 +347,7 @@ long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
 	/* set L1 state to L2 state */
 	vcpu->arch.nested = l2;
 	vcpu->arch.nested_vcpu_id = l2_hv.vcpu_token;
+	vcpu->arch.nested_hfscr = l2_hv.hfscr;
 	vcpu->arch.regs = l2_regs;
 
 	/* Guest must always run with ME enabled, HV disabled. */
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 15/43] KVM: PPC: Book3S HV: CTRL SPR does not require read-modify-write
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Processors that support KVM HV do not require read-modify-write of
the CTRL SPR to set/clear their thread's runlatch. Just write 1 or 0
to it.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c            |  2 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 15 ++++++---------
 2 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 0733bb95f439..f0298b286c42 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3920,7 +3920,7 @@ static void load_spr_state(struct kvm_vcpu *vcpu)
 	 */
 
 	if (!(vcpu->arch.ctrl & 1))
-		mtspr(SPRN_CTRLT, mfspr(SPRN_CTRLF) & ~1);
+		mtspr(SPRN_CTRLT, 0);
 }
 
 static void store_spr_state(struct kvm_vcpu *vcpu)
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 0eb06734bc26..488a1e07958c 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -775,12 +775,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
 	mtspr	SPRN_AMR,r5
 	mtspr	SPRN_UAMOR,r6
 
-	/* Restore state of CTRL run bit; assume 1 on entry */
+	/* Restore state of CTRL run bit; the host currently has it set to 1 */
 	lwz	r5,VCPU_CTRL(r4)
 	andi.	r5,r5,1
 	bne	4f
-	mfspr	r6,SPRN_CTRLF
-	clrrdi	r6,r6,1
+	li	r6,0
 	mtspr	SPRN_CTRLT,r6
 4:
 	/* Secondary threads wait for primary to have done partition switch */
@@ -1209,12 +1208,12 @@ guest_bypass:
 	stw	r0, VCPU_CPU(r9)
 	stw	r0, VCPU_THREAD_CPU(r9)
 
-	/* Save guest CTRL register, set runlatch to 1 */
+	/* Save guest CTRL register, set runlatch to 1 if it was clear */
 	mfspr	r6,SPRN_CTRLF
 	stw	r6,VCPU_CTRL(r9)
 	andi.	r0,r6,1
 	bne	4f
-	ori	r6,r6,1
+	li	r6,1
 	mtspr	SPRN_CTRLT,r6
 4:
 	/*
@@ -2220,8 +2219,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_TM)
 	 * Also clear the runlatch bit before napping.
 	 */
 kvm_do_nap:
-	mfspr	r0, SPRN_CTRLF
-	clrrdi	r0, r0, 1
+	li	r0,0
 	mtspr	SPRN_CTRLT, r0
 
 	li	r0,1
@@ -2240,8 +2238,7 @@ kvm_nap_sequence:		/* desired LPCR value in r5 */
 
 	bl	isa206_idle_insn_mayloss
 
-	mfspr	r0, SPRN_CTRLF
-	ori	r0, r0, 1
+	li	r0,1
 	mtspr	SPRN_CTRLT, r0
 
 	mtspr	SPRN_SRR1, r3
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 16/43] KVM: PPC: Book3S HV P9: Move SPRG restore to restore_p9_host_os_sprs
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Move the SPR update into its relevant helper function. This will
help with SPR scheduling improvements in later changes.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index f0298b286c42..73a8b45249e8 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3953,6 +3953,8 @@ static void save_p9_host_os_sprs(struct p9_host_os_sprs *host_os_sprs)
 static void restore_p9_host_os_sprs(struct kvm_vcpu *vcpu,
 				    struct p9_host_os_sprs *host_os_sprs)
 {
+	mtspr(SPRN_SPRG_VDSO_WRITE, local_paca->sprg_vdso);
+
 	mtspr(SPRN_PSPB, 0);
 	mtspr(SPRN_UAMOR, 0);
 
@@ -4152,8 +4154,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	timer_rearm_host_dec(tb);
 
-	mtspr(SPRN_SPRG_VDSO_WRITE, local_paca->sprg_vdso);
-
 	kvmppc_subcore_exit_guest();
 
 	return trap;
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 17/43] KVM: PPC: Book3S HV P9: Reduce mtmsrd instructions required to save host SPRs
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

This reduces the number of mtmsrd required to enable facility bits when
saving/restoring registers, by having the KVM code set all bits up front
rather than using individual facility functions that set their particular
MSR bits.
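
(Illustrative aside, not part of the patch: a small user-space sketch
counting MSR writes. The bit values and the check_and_set() helper are
stand-ins for this model, not the kernel's definitions.)

#include <stdio.h>

/* Arbitrary bit positions, for the model only. */
#define MSR_FP	(1UL << 0)
#define MSR_VEC	(1UL << 1)
#define MSR_VSX	(1UL << 2)
#define MSR_TM	(1UL << 3)

static unsigned long msr;	/* stand-in for the real MSR */
static int mtmsrd_count;	/* number of (expensive) MSR writes */

/* Rough model of msr_check_and_set(): write MSR only if bits are missing. */
static unsigned long check_and_set(unsigned long bits)
{
	if ((msr & bits) != bits) {
		msr |= bits;
		mtmsrd_count++;
	}
	return msr;
}

int main(void)
{
	/* Old style: each facility helper enables its own bit. */
	msr = 0;
	mtmsrd_count = 0;
	check_and_set(MSR_FP);
	check_and_set(MSR_VEC | MSR_VSX);
	check_and_set(MSR_TM);
	printf("per-facility enables: %d MSR writes\n", mtmsrd_count);

	/* New style: work out all needed bits up front, enable once. */
	msr = 0;
	mtmsrd_count = 0;
	check_and_set(MSR_FP | MSR_VEC | MSR_VSX | MSR_TM);
	printf("single up-front enable: %d MSR writes\n", mtmsrd_count);
	return 0;
}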

-42 cycles (7803) POWER9 virt-mode NULL hcall

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/process.c         | 24 +++++++++++
 arch/powerpc/kvm/book3s_hv.c          | 57 ++++++++++++++++++---------
 arch/powerpc/kvm/book3s_hv_p9_entry.c |  1 +
 3 files changed, 64 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 89e34aa273e2..dfce089ac424 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -592,6 +592,30 @@ static void save_all(struct task_struct *tsk)
 	msr_check_and_clear(msr_all_available);
 }
 
+void save_user_regs_kvm(void)
+{
+	unsigned long usermsr;
+
+	if (!current->thread.regs)
+		return;
+
+	usermsr = current->thread.regs->msr;
+
+	if (usermsr & MSR_FP)
+		save_fpu(current);
+
+	if (usermsr & MSR_VEC)
+		save_altivec(current);
+
+	if (usermsr & MSR_TM) {
+                current->thread.tm_tfhar = mfspr(SPRN_TFHAR);
+                current->thread.tm_tfiar = mfspr(SPRN_TFIAR);
+                current->thread.tm_texasr = mfspr(SPRN_TEXASR);
+                current->thread.regs->msr &= ~MSR_TM;
+	}
+}
+EXPORT_SYMBOL_GPL(save_user_regs_kvm);
+
 void flush_all_to_thread(struct task_struct *tsk)
 {
 	if (tsk->thread.regs) {
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 73a8b45249e8..3ac5dbdb59f8 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3999,6 +3999,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 	struct p9_host_os_sprs host_os_sprs;
 	s64 dec;
 	u64 tb, next_timer;
+	unsigned long msr;
 	int trap;
 
 	WARN_ON_ONCE(vcpu->arch.ceded);
@@ -4010,8 +4011,23 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 	if (next_timer < time_limit)
 		time_limit = next_timer;
 
+	vcpu->arch.ceded = 0;
+
 	save_p9_host_os_sprs(&host_os_sprs);
 
+	/* MSR bits may have been cleared by context switch */
+	msr = 0;
+	if (IS_ENABLED(CONFIG_PPC_FPU))
+		msr |= MSR_FP;
+	if (cpu_has_feature(CPU_FTR_ALTIVEC))
+		msr |= MSR_VEC;
+	if (cpu_has_feature(CPU_FTR_VSX))
+		msr |= MSR_VSX;
+	if (cpu_has_feature(CPU_FTR_TM) ||
+	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+		msr |= MSR_TM;
+	msr = msr_check_and_set(msr);
+
 	kvmppc_subcore_enter_guest();
 
 	vc->entry_exit_map = 1;
@@ -4025,7 +4041,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	switch_pmu_to_guest(vcpu, &host_os_sprs);
 
-	msr_check_and_set(MSR_FP | MSR_VEC | MSR_VSX);
 	load_fp_state(&vcpu->arch.fp);
 #ifdef CONFIG_ALTIVEC
 	load_vr_state(&vcpu->arch.vr);
@@ -4134,7 +4149,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	restore_p9_host_os_sprs(vcpu, &host_os_sprs);
 
-	msr_check_and_set(MSR_FP | MSR_VEC | MSR_VSX);
 	store_fp_state(&vcpu->arch.fp);
 #ifdef CONFIG_ALTIVEC
 	store_vr_state(&vcpu->arch.vr);
@@ -4663,6 +4677,8 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 	goto done;
 }
 
+void save_user_regs_kvm(void);
+
 static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
 {
 	struct kvm_run *run = vcpu->run;
@@ -4672,19 +4688,24 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
 	unsigned long user_tar = 0;
 	unsigned int user_vrsave;
 	struct kvm *kvm;
+	unsigned long msr;
 
 	if (!vcpu->arch.sane) {
 		run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
 		return -EINVAL;
 	}
 
+	/* No need to go into the guest when all we'll do is come back out */
+	if (signal_pending(current)) {
+		run->exit_reason = KVM_EXIT_INTR;
+		return -EINTR;
+	}
+
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	/*
 	 * Don't allow entry with a suspended transaction, because
 	 * the guest entry/exit code will lose it.
-	 * If the guest has TM enabled, save away their TM-related SPRs
-	 * (they will get restored by the TM unavailable interrupt).
 	 */
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	if (cpu_has_feature(CPU_FTR_TM) && current->thread.regs &&
 	    (current->thread.regs->msr & MSR_TM)) {
 		if (MSR_TM_ACTIVE(current->thread.regs->msr)) {
@@ -4692,12 +4713,6 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
 			run->fail_entry.hardware_entry_failure_reason = 0;
 			return -EINVAL;
 		}
-		/* Enable TM so we can read the TM SPRs */
-		mtmsr(mfmsr() | MSR_TM);
-		current->thread.tm_tfhar = mfspr(SPRN_TFHAR);
-		current->thread.tm_tfiar = mfspr(SPRN_TFIAR);
-		current->thread.tm_texasr = mfspr(SPRN_TEXASR);
-		current->thread.regs->msr &= ~MSR_TM;
 	}
 #endif
 
@@ -4712,18 +4727,24 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
 
 	kvmppc_core_prepare_to_enter(vcpu);
 
-	/* No need to go into the guest when all we'll do is come back out */
-	if (signal_pending(current)) {
-		run->exit_reason = KVM_EXIT_INTR;
-		return -EINTR;
-	}
-
 	kvm = vcpu->kvm;
 	atomic_inc(&kvm->arch.vcpus_running);
 	/* Order vcpus_running vs. mmu_ready, see kvmppc_alloc_reset_hpt */
 	smp_mb();
 
-	flush_all_to_thread(current);
+	msr = 0;
+	if (IS_ENABLED(CONFIG_PPC_FPU))
+		msr |= MSR_FP;
+	if (cpu_has_feature(CPU_FTR_ALTIVEC))
+		msr |= MSR_VEC;
+	if (cpu_has_feature(CPU_FTR_VSX))
+		msr |= MSR_VSX;
+	if (cpu_has_feature(CPU_FTR_TM) ||
+	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+		msr |= MSR_TM;
+	msr = msr_check_and_set(msr);
+
+	save_user_regs_kvm();
 
 	/* Save userspace EBB and other register values */
 	if (cpu_has_feature(CPU_FTR_ARCH_207S)) {
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index a3281f0c9214..065bfd4d2c63 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -224,6 +224,7 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 		vc->tb_offset_applied = vc->tb_offset;
 	}
 
+	/* Could avoid mfmsr by passing around, but probably no big deal */
 	msr = mfmsr();
 
 	host_hfscr = mfspr(SPRN_HFSCR);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 18/43] KVM: PPC: Book3S HV P9: Improve mtmsrd scheduling by delaying MSR[EE] disable
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Moving the mtmsrd after the host SPRs are saved and before the guest
SPRs start to be loaded can prevent an SPR scoreboard stall (because
the mtmsrd is L=1 type, which does not cause context synchronisation).

This also now becomes more convenient to combine with the mtmsrd L=0
instruction that enables facilities just below, but that is not done yet.

-12 cycles (7791) POWER9 virt-mode NULL hcall

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 3ac5dbdb59f8..b8b0695a9312 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4015,6 +4015,18 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	save_p9_host_os_sprs(&host_os_sprs);
 
+	/*
+	 * This could be combined with MSR[RI] clearing, but that expands
+	 * the unrecoverable window. It would be better to cover unrecoverable
+	 * with KVM bad interrupt handling rather than use MSR[RI] at all.
+	 *
+	 * Much more difficult and less worthwhile to combine with IR/DR
+	 * disable.
+	 */
+	hard_irq_disable();
+	if (lazy_irq_pending())
+		return 0;
+
 	/* MSR bits may have been cleared by context switch */
 	msr = 0;
 	if (IS_ENABLED(CONFIG_PPC_FPU))
@@ -4512,6 +4524,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 	struct kvmppc_vcore *vc;
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_nested_guest *nested = vcpu->arch.nested;
+	unsigned long flags;
 
 	trace_kvmppc_run_vcpu_enter(vcpu);
 
@@ -4555,11 +4568,11 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 	if (kvm_is_radix(kvm))
 		kvmppc_prepare_radix_vcpu(vcpu, pcpu);
 
-	local_irq_disable();
-	hard_irq_disable();
+	/* flags save not required, but irq_pmu has no disable/enable API */
+	powerpc_local_irq_pmu_save(flags);
 	if (signal_pending(current))
 		goto sigpend;
-	if (lazy_irq_pending() || need_resched() || !kvm->arch.mmu_ready)
+	if (need_resched() || !kvm->arch.mmu_ready)
 		goto out;
 
 	if (!nested) {
@@ -4614,7 +4627,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	guest_exit_irqoff();
 
-	local_irq_enable();
+	powerpc_local_irq_pmu_restore(flags);
 
 	cpumask_clear_cpu(pcpu, &kvm->arch.cpu_in_guest);
 
@@ -4672,7 +4685,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 	run->exit_reason = KVM_EXIT_INTR;
 	vcpu->arch.ret = -EINTR;
  out:
-	local_irq_enable();
+	powerpc_local_irq_pmu_restore(flags);
 	preempt_enable();
 	goto done;
 }
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 19/43] KVM: PPC: Book3S HV P9: Add kvmppc_stop_thread to match kvmppc_start_thread
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Small cleanup makes it a bit easier to match up entry and exit
operations.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index b8b0695a9312..86c85e303a6d 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2948,6 +2948,13 @@ static void kvmppc_start_thread(struct kvm_vcpu *vcpu, struct kvmppc_vcore *vc)
 		kvmppc_ipi_thread(cpu);
 }
 
+/* Old path does this in asm */
+static void kvmppc_stop_thread(struct kvm_vcpu *vcpu)
+{
+	vcpu->cpu = -1;
+	vcpu->arch.thread_cpu = -1;
+}
+
 static void kvmppc_wait_for_nap(int n_threads)
 {
 	int cpu = smp_processor_id();
@@ -4154,8 +4161,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 		dec = (s32) dec;
 	tb = mftb();
 	vcpu->arch.dec_expires = dec + tb;
-	vcpu->cpu = -1;
-	vcpu->arch.thread_cpu = -1;
 
 	store_spr_state(vcpu);
 
@@ -4627,6 +4632,8 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	guest_exit_irqoff();
 
+	kvmppc_stop_thread(vcpu);
+
 	powerpc_local_irq_pmu_restore(flags);
 
 	cpumask_clear_cpu(pcpu, &kvm->arch.cpu_in_guest);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 20/43] KVM: PPC: Book3S HV: Change dec_expires to be relative to guest timebase
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Change dec_expires to be relative to the guest timebase, and allow
it to be moved into low level P9 guest entry functions, to improve
SPR access scheduling.
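
(Illustrative aside, not part of the patch, with made-up numbers: once
dec_expires is kept relative to the guest timebase, the host-timebase
expiry is dec_expires - tb_offset, and the DEC value to program is
dec_expires minus the current guest timebase.)

#include <stdio.h>

int main(void)
{
	/* Illustrative numbers only. */
	unsigned long long tb_offset = 1000;	/* guest TB = host TB + offset */
	unsigned long long host_tb = 50000;	/* current host timebase */
	unsigned long long guest_tb = host_tb + tb_offset;

	/* dec_expires is now kept relative to the guest timebase. */
	unsigned long long dec_expires = guest_tb + 200;

	/* Equivalent of kvmppc_dec_expires_host_tb(): back to host TB. */
	unsigned long long expires_host_tb = dec_expires - tb_offset;

	/* DEC value to program before entering the guest. */
	long long dec = (long long)(dec_expires - guest_tb);

	printf("expiry in host TB: %llu\n", expires_host_tb);	/* 50200 */
	printf("DEC to program:    %lld\n", dec);		/* 200 */
	return 0;
}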

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/kvm_book3s.h   |  6 +++
 arch/powerpc/include/asm/kvm_host.h     |  2 +-
 arch/powerpc/kvm/book3s_hv.c            | 58 +++++++++++++------------
 arch/powerpc/kvm/book3s_hv_nested.c     |  3 ++
 arch/powerpc/kvm/book3s_hv_p9_entry.c   | 10 ++++-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 14 ------
 6 files changed, 49 insertions(+), 44 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index e6b53c6e21e3..032c597db0a9 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -403,6 +403,12 @@ static inline ulong kvmppc_get_fault_dar(struct kvm_vcpu *vcpu)
 	return vcpu->arch.fault_dar;
 }
 
+/* Expiry time of vcpu DEC relative to host TB */
+static inline u64 kvmppc_dec_expires_host_tb(struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.dec_expires - vcpu->arch.vcore->tb_offset;
+}
+
 static inline bool is_kvmppc_resume_guest(int r)
 {
 	return (r == RESUME_GUEST || r == RESUME_GUEST_NV);
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 5c003a5ff854..118b388ea887 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -747,7 +747,7 @@ struct kvm_vcpu_arch {
 
 	struct hrtimer dec_timer;
 	u64 dec_jiffies;
-	u64 dec_expires;
+	u64 dec_expires;	/* Relative to guest timebase. */
 	unsigned long pending_exceptions;
 	u8 ceded;
 	u8 prodded;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 86c85e303a6d..218dacd78e25 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2149,8 +2149,7 @@ static int kvmppc_get_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
 		*val = get_reg_val(id, vcpu->arch.vcore->arch_compat);
 		break;
 	case KVM_REG_PPC_DEC_EXPIRY:
-		*val = get_reg_val(id, vcpu->arch.dec_expires +
-				   vcpu->arch.vcore->tb_offset);
+		*val = get_reg_val(id, vcpu->arch.dec_expires);
 		break;
 	case KVM_REG_PPC_ONLINE:
 		*val = get_reg_val(id, vcpu->arch.online);
@@ -2402,8 +2401,7 @@ static int kvmppc_set_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
 		r = kvmppc_set_arch_compat(vcpu, set_reg_val(id, *val));
 		break;
 	case KVM_REG_PPC_DEC_EXPIRY:
-		vcpu->arch.dec_expires = set_reg_val(id, *val) -
-			vcpu->arch.vcore->tb_offset;
+		vcpu->arch.dec_expires = set_reg_val(id, *val);
 		break;
 	case KVM_REG_PPC_ONLINE:
 		i = set_reg_val(id, *val);
@@ -2780,13 +2778,13 @@ static void kvmppc_set_timer(struct kvm_vcpu *vcpu)
 	unsigned long dec_nsec, now;
 
 	now = get_tb();
-	if (now > vcpu->arch.dec_expires) {
+	if (now > kvmppc_dec_expires_host_tb(vcpu)) {
 		/* decrementer has already gone negative */
 		kvmppc_core_queue_dec(vcpu);
 		kvmppc_core_prepare_to_enter(vcpu);
 		return;
 	}
-	dec_nsec = tb_to_ns(vcpu->arch.dec_expires - now);
+	dec_nsec = tb_to_ns(kvmppc_dec_expires_host_tb(vcpu) - now);
 	hrtimer_start(&vcpu->arch.dec_timer, dec_nsec, HRTIMER_MODE_REL);
 	vcpu->arch.timer_running = 1;
 }
@@ -3258,7 +3256,7 @@ static void post_guest_process(struct kvmppc_vcore *vc, bool is_master)
 		 */
 		spin_unlock(&vc->lock);
 		/* cancel pending dec exception if dec is positive */
-		if (now < vcpu->arch.dec_expires &&
+		if (now < kvmppc_dec_expires_host_tb(vcpu) &&
 		    kvmppc_core_pending_dec(vcpu))
 			kvmppc_core_dequeue_dec(vcpu);
 
@@ -4068,20 +4066,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	load_spr_state(vcpu);
 
-	/*
-	 * When setting DEC, we must always deal with irq_work_raise via NMI vs
-	 * setting DEC. The problem occurs right as we switch into guest mode
-	 * if a NMI hits and sets pending work and sets DEC, then that will
-	 * apply to the guest and not bring us back to the host.
-	 *
-	 * irq_work_raise could check a flag (or possibly LPCR[HDICE] for
-	 * example) and set HDEC to 1? That wouldn't solve the nested hv
-	 * case which needs to abort the hcall or zero the time limit.
-	 *
-	 * XXX: Another day's problem.
-	 */
-	mtspr(SPRN_DEC, vcpu->arch.dec_expires - tb);
-
 	if (kvmhv_on_pseries()) {
 		/*
 		 * We need to save and restore the guest visible part of the
@@ -4107,6 +4091,23 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 			hvregs.vcpu_token = vcpu->vcpu_id;
 		}
 		hvregs.hdec_expiry = time_limit;
+
+		/*
+		 * When setting DEC, we must always deal with irq_work_raise
+		 * via NMI vs setting DEC. The problem occurs right as we
+		 * switch into guest mode if a NMI hits and sets pending work
+		 * and sets DEC, then that will apply to the guest and not
+		 * bring us back to the host.
+		 *
+		 * irq_work_raise could check a flag (or possibly LPCR[HDICE]
+		 * for example) and set HDEC to 1? That wouldn't solve the
+		 * nested hv case which needs to abort the hcall or zero the
+		 * time limit.
+		 *
+		 * XXX: Another day's problem.
+		 */
+		mtspr(SPRN_DEC, kvmppc_dec_expires_host_tb(vcpu) - tb);
+
 		mtspr(SPRN_DAR, vcpu->arch.shregs.dar);
 		mtspr(SPRN_DSISR, vcpu->arch.shregs.dsisr);
 		trap = plpar_hcall_norets(H_ENTER_NESTED, __pa(&hvregs),
@@ -4118,6 +4119,12 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 		vcpu->arch.psscr = mfspr(SPRN_PSSCR_PR);
 		mtspr(SPRN_PSSCR_PR, host_psscr);
 
+		dec = mfspr(SPRN_DEC);
+		if (!(lpcr & LPCR_LD)) /* Sign extend if not using large decrementer */
+			dec = (s32) dec;
+		tb = mftb();
+		vcpu->arch.dec_expires = dec + (tb + vc->tb_offset);
+
 		/* H_CEDE has to be handled now, not later */
 		if (trap == BOOK3S_INTERRUPT_SYSCALL && !vcpu->arch.nested &&
 		    kvmppc_get_gpr(vcpu, 3) == H_CEDE) {
@@ -4125,6 +4132,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 			kvmppc_set_gpr(vcpu, 3, 0);
 			trap = 0;
 		}
+
 	} else {
 		kvmppc_xive_push_vcpu(vcpu);
 		trap = kvmhv_vcpu_entry_p9(vcpu, time_limit, lpcr);
@@ -4156,12 +4164,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 			vcpu->arch.slb_max = 0;
 	}
 
-	dec = mfspr(SPRN_DEC);
-	if (!(lpcr & LPCR_LD)) /* Sign extend if not using large decrementer */
-		dec = (s32) dec;
-	tb = mftb();
-	vcpu->arch.dec_expires = dec + tb;
-
 	store_spr_state(vcpu);
 
 	restore_p9_host_os_sprs(vcpu, &host_os_sprs);
@@ -4646,7 +4648,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 	 * by L2 and the L1 decrementer is provided in hdec_expires
 	 */
 	if (kvmppc_core_pending_dec(vcpu) &&
-			((get_tb() < vcpu->arch.dec_expires) ||
+			((get_tb() < kvmppc_dec_expires_host_tb(vcpu)) ||
 			 (trap == BOOK3S_INTERRUPT_SYSCALL &&
 			  kvmppc_get_gpr(vcpu, 3) == H_ENTER_NESTED)))
 		kvmppc_core_dequeue_dec(vcpu);
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index 6add13a22f56..024b0ce5b702 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -343,6 +343,7 @@ long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
 	/* convert TB values/offsets to host (L0) values */
 	hdec_exp = l2_hv.hdec_expiry - vc->tb_offset;
 	vc->tb_offset += l2_hv.tb_offset;
+	vcpu->arch.dec_expires += l2_hv.tb_offset;
 
 	/* set L1 state to L2 state */
 	vcpu->arch.nested = l2;
@@ -384,6 +385,8 @@ long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
 	if (l2_regs.msr & MSR_TS_MASK)
 		vcpu->arch.shregs.msr |= MSR_TS_S;
 	vc->tb_offset = saved_l1_hv.tb_offset;
+	/* XXX: is this always the same delta as saved_l1_hv.tb_offset? */
+	vcpu->arch.dec_expires -= l2_hv.tb_offset;
 	restore_hv_regs(vcpu, &saved_l1_hv);
 	vcpu->arch.purr += delta_purr;
 	vcpu->arch.spurr += delta_spurr;
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 065bfd4d2c63..469dd5cbb52d 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -188,7 +188,7 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_nested_guest *nested = vcpu->arch.nested;
 	struct kvmppc_vcore *vc = vcpu->arch.vcore;
-	s64 hdec;
+	s64 hdec, dec;
 	u64 tb, purr, spurr;
 	u64 *exsave;
 	bool ri_set;
@@ -317,6 +317,8 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	 */
 	mtspr(SPRN_HDEC, hdec);
 
+	mtspr(SPRN_DEC, vcpu->arch.dec_expires - tb);
+
 	mtspr(SPRN_DAR, vcpu->arch.shregs.dar);
 	mtspr(SPRN_DSISR, vcpu->arch.shregs.dsisr);
 	mtspr(SPRN_SRR0, vcpu->arch.shregs.srr0);
@@ -446,6 +448,12 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	vcpu->arch.shregs.sprg2 = mfspr(SPRN_SPRG2);
 	vcpu->arch.shregs.sprg3 = mfspr(SPRN_SPRG3);
 
+	dec = mfspr(SPRN_DEC);
+	if (!(lpcr & LPCR_LD)) /* Sign extend if not using large decrementer */
+		dec = (s32) dec;
+	tb = mftb();
+	vcpu->arch.dec_expires = dec + tb;
+
 	/* Preserve PSSCR[FAKE_SUSPEND] until we've called kvmppc_save_tm_hv */
 	mtspr(SPRN_PSSCR, host_psscr |
 	      (local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 488a1e07958c..1c51dd704dd4 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -808,10 +808,6 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
 	 * Set the decrementer to the guest decrementer.
 	 */
 	ld	r8,VCPU_DEC_EXPIRES(r4)
-	/* r8 is a host timebase value here, convert to guest TB */
-	ld	r5,HSTATE_KVM_VCORE(r13)
-	ld	r6,VCORE_TB_OFFSET_APPL(r5)
-	add	r8,r8,r6
 	mftb	r7
 	subf	r3,r7,r8
 	mtspr	SPRN_DEC,r3
@@ -1192,9 +1188,6 @@ guest_bypass:
 	mftb	r6
 	extsw	r5,r5
 16:	add	r5,r5,r6
-	/* r5 is a guest timebase value here, convert to host TB */
-	ld	r4,VCORE_TB_OFFSET_APPL(r3)
-	subf	r5,r4,r5
 	std	r5,VCPU_DEC_EXPIRES(r9)
 
 	/* Increment exit count, poke other threads to exit */
@@ -2195,10 +2188,6 @@ END_FTR_SECTION_IFCLR(CPU_FTR_TM)
 67:
 	/* save expiry time of guest decrementer */
 	add	r3, r3, r5
-	ld	r4, HSTATE_KVM_VCPU(r13)
-	ld	r5, HSTATE_KVM_VCORE(r13)
-	ld	r6, VCORE_TB_OFFSET_APPL(r5)
-	subf	r3, r6, r3	/* convert to host TB value */
 	std	r3, VCPU_DEC_EXPIRES(r4)
 
 #ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
@@ -2295,9 +2284,6 @@ END_FTR_SECTION_IFCLR(CPU_FTR_TM)
 
 	/* Restore guest decrementer */
 	ld	r3, VCPU_DEC_EXPIRES(r4)
-	ld	r5, HSTATE_KVM_VCORE(r13)
-	ld	r6, VCORE_TB_OFFSET_APPL(r5)
-	add	r3, r3, r6	/* convert host TB to guest TB value */
 	mftb	r7
 	subf	r3, r7, r3
 	mtspr	SPRN_DEC, r3
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 21/43] KVM: PPC: Book3S HV P9: Move TB updates
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Move the TB updates so they sit between saving and loading the guest
and host SPRs, to improve scheduling by keeping issue-NTC operations
together as much as possible.
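
For reference, the block being moved is reproduced here with added
comments (editorial note; the code itself is unchanged). TBU40 writes
only the upper 40 bits of the timebase, so a carry out of the low 24
bits between computing new_tb and the write has to be propagated:

	if (vc->tb_offset) {
		u64 new_tb = tb + vc->tb_offset;
		mtspr(SPRN_TBU40, new_tb);	/* sets the upper 40 bits only */
		tb = mftb();
		if ((tb & 0xffffff) < (new_tb & 0xffffff))
			/* low 24 bits carried since new_tb was computed */
			mtspr(SPRN_TBU40, new_tb + 0x1000000);
		vc->tb_offset_applied = vc->tb_offset;
	}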

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv_p9_entry.c | 36 +++++++++++++--------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 469dd5cbb52d..44ee805875ba 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -215,15 +215,6 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 
 	vcpu->arch.ceded = 0;
 
-	if (vc->tb_offset) {
-		u64 new_tb = tb + vc->tb_offset;
-		mtspr(SPRN_TBU40, new_tb);
-		tb = mftb();
-		if ((tb & 0xffffff) < (new_tb & 0xffffff))
-			mtspr(SPRN_TBU40, new_tb + 0x1000000);
-		vc->tb_offset_applied = vc->tb_offset;
-	}
-
 	/* Could avoid mfmsr by passing around, but probably no big deal */
 	msr = mfmsr();
 
@@ -238,6 +229,15 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 		host_dawrx1 = mfspr(SPRN_DAWRX1);
 	}
 
+	if (vc->tb_offset) {
+		u64 new_tb = tb + vc->tb_offset;
+		mtspr(SPRN_TBU40, new_tb);
+		tb = mftb();
+		if ((tb & 0xffffff) < (new_tb & 0xffffff))
+			mtspr(SPRN_TBU40, new_tb + 0x1000000);
+		vc->tb_offset_applied = vc->tb_offset;
+	}
+
 	if (vc->pcr)
 		mtspr(SPRN_PCR, vc->pcr | PCR_MASK);
 	mtspr(SPRN_DPDES, vc->dpdes);
@@ -454,6 +454,15 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	tb = mftb();
 	vcpu->arch.dec_expires = dec + tb;
 
+	if (vc->tb_offset_applied) {
+		u64 new_tb = tb - vc->tb_offset_applied;
+		mtspr(SPRN_TBU40, new_tb);
+		tb = mftb();
+		if ((tb & 0xffffff) < (new_tb & 0xffffff))
+			mtspr(SPRN_TBU40, new_tb + 0x1000000);
+		vc->tb_offset_applied = 0;
+	}
+
 	/* Preserve PSSCR[FAKE_SUSPEND] until we've called kvmppc_save_tm_hv */
 	mtspr(SPRN_PSSCR, host_psscr |
 	      (local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
@@ -488,15 +497,6 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	if (vc->pcr)
 		mtspr(SPRN_PCR, PCR_MASK);
 
-	if (vc->tb_offset_applied) {
-		u64 new_tb = mftb() - vc->tb_offset_applied;
-		mtspr(SPRN_TBU40, new_tb);
-		tb = mftb();
-		if ((tb & 0xffffff) < (new_tb & 0xffffff))
-			mtspr(SPRN_TBU40, new_tb + 0x1000000);
-		vc->tb_offset_applied = 0;
-	}
-
 	/* HDEC must be at least as large as DEC, so decrementer_max fits */
 	mtspr(SPRN_HDEC, decrementer_max);
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 22/43] KVM: PPC: Book3S HV P9: Optimise timebase reads
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Reduce the number of mftb instructions executed by passing the current
timebase around the entry and exit code rather than reading it multiple
times.

-213 cycles (7578) POWER9 virt-mode NULL hcall
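
The pattern, sketched as an editorial illustration (simplified from
kvmhv_run_single_vcpu(); the calls themselves are from the patch), is
to read the timebase once in the caller and pass a pointer that the
entry/exit code keeps up to date, so later code reuses the cached
value rather than issuing another mftb:

	u64 tb = mftb();
	...
	trap = kvmhv_p9_guest_entry(vcpu, time_limit, lpcr, &tb);
	/* tb now holds the timebase as read on the exit path */
	if (kvmppc_core_pending_dec(vcpu) &&
	    tb < kvmppc_dec_expires_host_tb(vcpu))
		kvmppc_core_dequeue_dec(vcpu);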

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/kvm_book3s_64.h |  2 +-
 arch/powerpc/kvm/book3s_hv.c             | 88 +++++++++++++-----------
 arch/powerpc/kvm/book3s_hv_p9_entry.c    | 33 +++++----
 3 files changed, 65 insertions(+), 58 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h b/arch/powerpc/include/asm/kvm_book3s_64.h
index eaf3a562bf1e..f8a0ed90b853 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -153,7 +153,7 @@ static inline bool kvmhv_vcpu_is_radix(struct kvm_vcpu *vcpu)
 	return radix;
 }
 
-int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpcr);
+int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpcr, u64 *tb);
 
 #define KVM_DEFAULT_HPT_ORDER	24	/* 16MB HPT by default */
 #endif
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 218dacd78e25..99b19f4e7ed7 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -275,22 +275,22 @@ static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu *vcpu)
  * they should never fail.)
  */
 
-static void kvmppc_core_start_stolen(struct kvmppc_vcore *vc)
+static void kvmppc_core_start_stolen(struct kvmppc_vcore *vc, u64 tb)
 {
 	unsigned long flags;
 
 	spin_lock_irqsave(&vc->stoltb_lock, flags);
-	vc->preempt_tb = mftb();
+	vc->preempt_tb = tb;
 	spin_unlock_irqrestore(&vc->stoltb_lock, flags);
 }
 
-static void kvmppc_core_end_stolen(struct kvmppc_vcore *vc)
+static void kvmppc_core_end_stolen(struct kvmppc_vcore *vc, u64 tb)
 {
 	unsigned long flags;
 
 	spin_lock_irqsave(&vc->stoltb_lock, flags);
 	if (vc->preempt_tb != TB_NIL) {
-		vc->stolen_tb += mftb() - vc->preempt_tb;
+		vc->stolen_tb += tb - vc->preempt_tb;
 		vc->preempt_tb = TB_NIL;
 	}
 	spin_unlock_irqrestore(&vc->stoltb_lock, flags);
@@ -300,6 +300,7 @@ static void kvmppc_core_vcpu_load_hv(struct kvm_vcpu *vcpu, int cpu)
 {
 	struct kvmppc_vcore *vc = vcpu->arch.vcore;
 	unsigned long flags;
+	u64 now = mftb();
 
 	/*
 	 * We can test vc->runner without taking the vcore lock,
@@ -308,12 +309,12 @@ static void kvmppc_core_vcpu_load_hv(struct kvm_vcpu *vcpu, int cpu)
 	 * ever sets it to NULL.
 	 */
 	if (vc->runner == vcpu && vc->vcore_state >= VCORE_SLEEPING)
-		kvmppc_core_end_stolen(vc);
+		kvmppc_core_end_stolen(vc, now);
 
 	spin_lock_irqsave(&vcpu->arch.tbacct_lock, flags);
 	if (vcpu->arch.state == KVMPPC_VCPU_BUSY_IN_HOST &&
 	    vcpu->arch.busy_preempt != TB_NIL) {
-		vcpu->arch.busy_stolen += mftb() - vcpu->arch.busy_preempt;
+		vcpu->arch.busy_stolen += now - vcpu->arch.busy_preempt;
 		vcpu->arch.busy_preempt = TB_NIL;
 	}
 	spin_unlock_irqrestore(&vcpu->arch.tbacct_lock, flags);
@@ -323,13 +324,14 @@ static void kvmppc_core_vcpu_put_hv(struct kvm_vcpu *vcpu)
 {
 	struct kvmppc_vcore *vc = vcpu->arch.vcore;
 	unsigned long flags;
+	u64 now = mftb();
 
 	if (vc->runner == vcpu && vc->vcore_state >= VCORE_SLEEPING)
-		kvmppc_core_start_stolen(vc);
+		kvmppc_core_start_stolen(vc, now);
 
 	spin_lock_irqsave(&vcpu->arch.tbacct_lock, flags);
 	if (vcpu->arch.state == KVMPPC_VCPU_BUSY_IN_HOST)
-		vcpu->arch.busy_preempt = mftb();
+		vcpu->arch.busy_preempt = now;
 	spin_unlock_irqrestore(&vcpu->arch.tbacct_lock, flags);
 }
 
@@ -684,7 +686,7 @@ static u64 vcore_stolen_time(struct kvmppc_vcore *vc, u64 now)
 }
 
 static void kvmppc_create_dtl_entry(struct kvm_vcpu *vcpu,
-				    struct kvmppc_vcore *vc)
+				    struct kvmppc_vcore *vc, u64 tb)
 {
 	struct dtl_entry *dt;
 	struct lppaca *vpa;
@@ -695,7 +697,7 @@ static void kvmppc_create_dtl_entry(struct kvm_vcpu *vcpu,
 
 	dt = vcpu->arch.dtl_ptr;
 	vpa = vcpu->arch.vpa.pinned_addr;
-	now = mftb();
+	now = tb;
 	core_stolen = vcore_stolen_time(vc, now);
 	stolen = core_stolen - vcpu->arch.stolen_logged;
 	vcpu->arch.stolen_logged = core_stolen;
@@ -2792,14 +2794,14 @@ static void kvmppc_set_timer(struct kvm_vcpu *vcpu)
 extern int __kvmppc_vcore_entry(void);
 
 static void kvmppc_remove_runnable(struct kvmppc_vcore *vc,
-				   struct kvm_vcpu *vcpu)
+				   struct kvm_vcpu *vcpu, u64 tb)
 {
 	u64 now;
 
 	if (vcpu->arch.state != KVMPPC_VCPU_RUNNABLE)
 		return;
 	spin_lock_irq(&vcpu->arch.tbacct_lock);
-	now = mftb();
+	now = tb;
 	vcpu->arch.busy_stolen += vcore_stolen_time(vc, now) -
 		vcpu->arch.stolen_logged;
 	vcpu->arch.busy_preempt = now;
@@ -3050,14 +3052,14 @@ static void kvmppc_vcore_preempt(struct kvmppc_vcore *vc)
 	}
 
 	/* Start accumulating stolen time */
-	kvmppc_core_start_stolen(vc);
+	kvmppc_core_start_stolen(vc, mftb());
 }
 
 static void kvmppc_vcore_end_preempt(struct kvmppc_vcore *vc)
 {
 	struct preempted_vcore_list *lp;
 
-	kvmppc_core_end_stolen(vc);
+	kvmppc_core_end_stolen(vc, mftb());
 	if (!list_empty(&vc->preempt_list)) {
 		lp = &per_cpu(preempted_vcores, vc->pcpu);
 		spin_lock(&lp->lock);
@@ -3184,7 +3186,7 @@ static void prepare_threads(struct kvmppc_vcore *vc)
 			vcpu->arch.ret = RESUME_GUEST;
 		else
 			continue;
-		kvmppc_remove_runnable(vc, vcpu);
+		kvmppc_remove_runnable(vc, vcpu, mftb());
 		wake_up(&vcpu->arch.cpu_run);
 	}
 }
@@ -3203,7 +3205,7 @@ static void collect_piggybacks(struct core_info *cip, int target_threads)
 			list_del_init(&pvc->preempt_list);
 			if (pvc->runner == NULL) {
 				pvc->vcore_state = VCORE_INACTIVE;
-				kvmppc_core_end_stolen(pvc);
+				kvmppc_core_end_stolen(pvc, mftb());
 			}
 			spin_unlock(&pvc->lock);
 			continue;
@@ -3212,7 +3214,7 @@ static void collect_piggybacks(struct core_info *cip, int target_threads)
 			spin_unlock(&pvc->lock);
 			continue;
 		}
-		kvmppc_core_end_stolen(pvc);
+		kvmppc_core_end_stolen(pvc, mftb());
 		pvc->vcore_state = VCORE_PIGGYBACK;
 		if (cip->total_threads >= target_threads)
 			break;
@@ -3279,7 +3281,7 @@ static void post_guest_process(struct kvmppc_vcore *vc, bool is_master)
 			else
 				++still_running;
 		} else {
-			kvmppc_remove_runnable(vc, vcpu);
+			kvmppc_remove_runnable(vc, vcpu, mftb());
 			wake_up(&vcpu->arch.cpu_run);
 		}
 	}
@@ -3288,7 +3290,7 @@ static void post_guest_process(struct kvmppc_vcore *vc, bool is_master)
 			kvmppc_vcore_preempt(vc);
 		} else if (vc->runner) {
 			vc->vcore_state = VCORE_PREEMPT;
-			kvmppc_core_start_stolen(vc);
+			kvmppc_core_start_stolen(vc, mftb());
 		} else {
 			vc->vcore_state = VCORE_INACTIVE;
 		}
@@ -3419,7 +3421,7 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
 	    ((vc->num_threads > threads_per_subcore) || !on_primary_thread())) {
 		for_each_runnable_thread(i, vcpu, vc) {
 			vcpu->arch.ret = -EBUSY;
-			kvmppc_remove_runnable(vc, vcpu);
+			kvmppc_remove_runnable(vc, vcpu, mftb());
 			wake_up(&vcpu->arch.cpu_run);
 		}
 		goto out;
@@ -3551,7 +3553,7 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
 		pvc->pcpu = pcpu + thr;
 		for_each_runnable_thread(i, vcpu, pvc) {
 			kvmppc_start_thread(vcpu, pvc);
-			kvmppc_create_dtl_entry(vcpu, pvc);
+			kvmppc_create_dtl_entry(vcpu, pvc, mftb());
 			trace_kvm_guest_enter(vcpu);
 			if (!vcpu->arch.ptid)
 				thr0_done = true;
@@ -3998,20 +4000,17 @@ static void vcpu_vpa_increment_dispatch(struct kvm_vcpu *vcpu)
  * Guest entry for POWER9 and later CPUs.
  */
 static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
-			 unsigned long lpcr)
+			 unsigned long lpcr, u64 *tb)
 {
 	struct kvmppc_vcore *vc = vcpu->arch.vcore;
 	struct p9_host_os_sprs host_os_sprs;
 	s64 dec;
-	u64 tb, next_timer;
+	u64 next_timer;
 	unsigned long msr;
 	int trap;
 
-	WARN_ON_ONCE(vcpu->arch.ceded);
-
-	tb = mftb();
 	next_timer = timer_get_next_tb();
-	if (tb >= next_timer)
+	if (*tb >= next_timer)
 		return BOOK3S_INTERRUPT_HV_DECREMENTER;
 	if (next_timer < time_limit)
 		time_limit = next_timer;
@@ -4106,7 +4105,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 		 *
 		 * XXX: Another day's problem.
 		 */
-		mtspr(SPRN_DEC, kvmppc_dec_expires_host_tb(vcpu) - tb);
+		mtspr(SPRN_DEC, kvmppc_dec_expires_host_tb(vcpu) - *tb);
 
 		mtspr(SPRN_DAR, vcpu->arch.shregs.dar);
 		mtspr(SPRN_DSISR, vcpu->arch.shregs.dsisr);
@@ -4122,8 +4121,8 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 		dec = mfspr(SPRN_DEC);
 		if (!(lpcr & LPCR_LD)) /* Sign extend if not using large decrementer */
 			dec = (s32) dec;
-		tb = mftb();
-		vcpu->arch.dec_expires = dec + (tb + vc->tb_offset);
+		*tb = mftb();
+		vcpu->arch.dec_expires = dec + (*tb + vc->tb_offset);
 
 		/* H_CEDE has to be handled now, not later */
 		if (trap == BOOK3S_INTERRUPT_SYSCALL && !vcpu->arch.nested &&
@@ -4135,7 +4134,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	} else {
 		kvmppc_xive_push_vcpu(vcpu);
-		trap = kvmhv_vcpu_entry_p9(vcpu, time_limit, lpcr);
+		trap = kvmhv_vcpu_entry_p9(vcpu, time_limit, lpcr, tb);
 		if (trap == BOOK3S_INTERRUPT_SYSCALL && !vcpu->arch.nested &&
 		    !(vcpu->arch.shregs.msr & MSR_PR)) {
 			unsigned long req = kvmppc_get_gpr(vcpu, 3);
@@ -4166,6 +4165,8 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	store_spr_state(vcpu);
 
+	timer_rearm_host_dec(*tb);
+
 	restore_p9_host_os_sprs(vcpu, &host_os_sprs);
 
 	store_fp_state(&vcpu->arch.fp);
@@ -4185,8 +4186,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 	vc->entry_exit_map = 0x101;
 	vc->in_guest = 0;
 
-	timer_rearm_host_dec(tb);
-
 	kvmppc_subcore_exit_guest();
 
 	return trap;
@@ -4428,7 +4427,7 @@ static int kvmppc_run_vcpu(struct kvm_vcpu *vcpu)
 		if ((vc->vcore_state == VCORE_PIGGYBACK ||
 		     vc->vcore_state == VCORE_RUNNING) &&
 			   !VCORE_IS_EXITING(vc)) {
-			kvmppc_create_dtl_entry(vcpu, vc);
+			kvmppc_create_dtl_entry(vcpu, vc, mftb());
 			kvmppc_start_thread(vcpu, vc);
 			trace_kvm_guest_enter(vcpu);
 		} else if (vc->vcore_state == VCORE_SLEEPING) {
@@ -4463,7 +4462,7 @@ static int kvmppc_run_vcpu(struct kvm_vcpu *vcpu)
 		for_each_runnable_thread(i, v, vc) {
 			kvmppc_core_prepare_to_enter(v);
 			if (signal_pending(v->arch.run_task)) {
-				kvmppc_remove_runnable(vc, v);
+				kvmppc_remove_runnable(vc, v, mftb());
 				v->stat.signal_exits++;
 				v->run->exit_reason = KVM_EXIT_INTR;
 				v->arch.ret = -EINTR;
@@ -4504,7 +4503,7 @@ static int kvmppc_run_vcpu(struct kvm_vcpu *vcpu)
 		kvmppc_vcore_end_preempt(vc);
 
 	if (vcpu->arch.state == KVMPPC_VCPU_RUNNABLE) {
-		kvmppc_remove_runnable(vc, vcpu);
+		kvmppc_remove_runnable(vc, vcpu, mftb());
 		vcpu->stat.signal_exits++;
 		run->exit_reason = KVM_EXIT_INTR;
 		vcpu->arch.ret = -EINTR;
@@ -4532,6 +4531,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_nested_guest *nested = vcpu->arch.nested;
 	unsigned long flags;
+	u64 tb;
 
 	trace_kvmppc_run_vcpu_enter(vcpu);
 
@@ -4542,7 +4542,6 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 	vc = vcpu->arch.vcore;
 	vcpu->arch.ceded = 0;
 	vcpu->arch.run_task = current;
-	vcpu->arch.stolen_logged = vcore_stolen_time(vc, mftb());
 	vcpu->arch.state = KVMPPC_VCPU_RUNNABLE;
 	vcpu->arch.busy_preempt = TB_NIL;
 	vcpu->arch.last_inst = KVM_INST_FETCH_FAILED;
@@ -4567,7 +4566,6 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 	kvmppc_update_vpas(vcpu);
 
 	init_vcore_to_run(vc);
-	vc->preempt_tb = TB_NIL;
 
 	preempt_disable();
 	pcpu = smp_processor_id();
@@ -4577,6 +4575,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	/* flags save not required, but irq_pmu has no disable/enable API */
 	powerpc_local_irq_pmu_save(flags);
+
 	if (signal_pending(current))
 		goto sigpend;
 	if (need_resched() || !kvm->arch.mmu_ready)
@@ -4599,12 +4598,17 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 		goto out;
 	}
 
+	tb = mftb();
+
+	vcpu->arch.stolen_logged = vcore_stolen_time(vc, tb);
+	vc->preempt_tb = TB_NIL;
+
 	kvmppc_clear_host_core(pcpu);
 
 	local_paca->kvm_hstate.napping = 0;
 	local_paca->kvm_hstate.kvm_split_mode = NULL;
 	kvmppc_start_thread(vcpu, vc);
-	kvmppc_create_dtl_entry(vcpu, vc);
+	kvmppc_create_dtl_entry(vcpu, vc, tb);
 	trace_kvm_guest_enter(vcpu);
 
 	vc->vcore_state = VCORE_RUNNING;
@@ -4619,7 +4623,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 	/* Tell lockdep that we're about to enable interrupts */
 	trace_hardirqs_on();
 
-	trap = kvmhv_p9_guest_entry(vcpu, time_limit, lpcr);
+	trap = kvmhv_p9_guest_entry(vcpu, time_limit, lpcr, &tb);
 	vcpu->arch.trap = trap;
 
 	trace_hardirqs_off();
@@ -4648,7 +4652,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 	 * by L2 and the L1 decrementer is provided in hdec_expires
 	 */
 	if (kvmppc_core_pending_dec(vcpu) &&
-			((get_tb() < kvmppc_dec_expires_host_tb(vcpu)) ||
+			((tb < kvmppc_dec_expires_host_tb(vcpu)) ||
 			 (trap == BOOK3S_INTERRUPT_SYSCALL &&
 			  kvmppc_get_gpr(vcpu, 3) == H_ENTER_NESTED)))
 		kvmppc_core_dequeue_dec(vcpu);
@@ -4684,7 +4688,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 	trace_kvmppc_run_core(vc, 1);
 
  done:
-	kvmppc_remove_runnable(vc, vcpu);
+	kvmppc_remove_runnable(vc, vcpu, tb);
 	trace_kvmppc_run_vcpu_exit(vcpu);
 
 	return vcpu->arch.ret;
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 44ee805875ba..237ea1ef1eab 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -183,13 +183,13 @@ static void save_clear_guest_mmu(struct kvm *kvm, struct kvm_vcpu *vcpu)
 	}
 }
 
-int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpcr)
+int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpcr, u64 *tb)
 {
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_nested_guest *nested = vcpu->arch.nested;
 	struct kvmppc_vcore *vc = vcpu->arch.vcore;
 	s64 hdec, dec;
-	u64 tb, purr, spurr;
+	u64 purr, spurr;
 	u64 *exsave;
 	bool ri_set;
 	int trap;
@@ -203,8 +203,7 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	unsigned long host_dawr1;
 	unsigned long host_dawrx1;
 
-	tb = mftb();
-	hdec = time_limit - tb;
+	hdec = time_limit - *tb;
 	if (hdec < 0)
 		return BOOK3S_INTERRUPT_HV_DECREMENTER;
 
@@ -230,11 +229,13 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	}
 
 	if (vc->tb_offset) {
-		u64 new_tb = tb + vc->tb_offset;
+		u64 new_tb = *tb + vc->tb_offset;
 		mtspr(SPRN_TBU40, new_tb);
-		tb = mftb();
-		if ((tb & 0xffffff) < (new_tb & 0xffffff))
-			mtspr(SPRN_TBU40, new_tb + 0x1000000);
+		if ((mftb() & 0xffffff) < (new_tb & 0xffffff)) {
+			new_tb += 0x1000000;
+			mtspr(SPRN_TBU40, new_tb);
+		}
+		*tb = new_tb;
 		vc->tb_offset_applied = vc->tb_offset;
 	}
 
@@ -317,7 +318,7 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	 */
 	mtspr(SPRN_HDEC, hdec);
 
-	mtspr(SPRN_DEC, vcpu->arch.dec_expires - tb);
+	mtspr(SPRN_DEC, vcpu->arch.dec_expires - *tb);
 
 	mtspr(SPRN_DAR, vcpu->arch.shregs.dar);
 	mtspr(SPRN_DSISR, vcpu->arch.shregs.dsisr);
@@ -451,15 +452,17 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	dec = mfspr(SPRN_DEC);
 	if (!(lpcr & LPCR_LD)) /* Sign extend if not using large decrementer */
 		dec = (s32) dec;
-	tb = mftb();
-	vcpu->arch.dec_expires = dec + tb;
+	*tb = mftb();
+	vcpu->arch.dec_expires = dec + *tb;
 
 	if (vc->tb_offset_applied) {
-		u64 new_tb = tb - vc->tb_offset_applied;
+		u64 new_tb = *tb - vc->tb_offset_applied;
 		mtspr(SPRN_TBU40, new_tb);
-		tb = mftb();
-		if ((tb & 0xffffff) < (new_tb & 0xffffff))
-			mtspr(SPRN_TBU40, new_tb + 0x1000000);
+		if ((mftb() & 0xffffff) < (new_tb & 0xffffff)) {
+			new_tb += 0x1000000;
+			mtspr(SPRN_TBU40, new_tb);
+		}
+		*tb = new_tb;
 		vc->tb_offset_applied = 0;
 	}
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 23/43] KVM: PPC: Book3S HV P9: Avoid SPR scoreboard stalls
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Avoid interleaving mfSPR and mtSPR.

-151 cycles (7427) POWER9 virt-mode NULL hcall
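
For illustration only (SPRN_A and SPRN_B are placeholder SPR names, not
code from this patch), the stalling pattern and the preferred pattern
look like:

	/* SPRN_A/SPRN_B are placeholders, for illustration only */

	/* Interleaved: each mtSPR can stall on the preceding mfSPR */
	old_a = mfspr(SPRN_A);
	mtspr(SPRN_A, new_a);
	old_b = mfspr(SPRN_B);
	mtspr(SPRN_B, new_b);

	/* Grouped: issue the reads together, then the writes together */
	old_a = mfspr(SPRN_A);
	old_b = mfspr(SPRN_B);
	mtspr(SPRN_A, new_a);
	mtspr(SPRN_B, new_b);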

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c          |  8 ++++----
 arch/powerpc/kvm/book3s_hv_p9_entry.c | 19 +++++++++++--------
 2 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 99b19f4e7ed7..8c6ba04e1fdf 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4165,10 +4165,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	store_spr_state(vcpu);
 
-	timer_rearm_host_dec(*tb);
-
-	restore_p9_host_os_sprs(vcpu, &host_os_sprs);
-
 	store_fp_state(&vcpu->arch.fp);
 #ifdef CONFIG_ALTIVEC
 	store_vr_state(&vcpu->arch.vr);
@@ -4183,6 +4179,10 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	switch_pmu_to_host(vcpu, &host_os_sprs);
 
+	timer_rearm_host_dec(*tb);
+
+	restore_p9_host_os_sprs(vcpu, &host_os_sprs);
+
 	vc->entry_exit_map = 0x101;
 	vc->in_guest = 0;
 
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 237ea1ef1eab..afdd7dfa1c08 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -228,6 +228,9 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 		host_dawrx1 = mfspr(SPRN_DAWRX1);
 	}
 
+	local_paca->kvm_hstate.host_purr = mfspr(SPRN_PURR);
+	local_paca->kvm_hstate.host_spurr = mfspr(SPRN_SPURR);
+
 	if (vc->tb_offset) {
 		u64 new_tb = *tb + vc->tb_offset;
 		mtspr(SPRN_TBU40, new_tb);
@@ -244,8 +247,6 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	mtspr(SPRN_DPDES, vc->dpdes);
 	mtspr(SPRN_VTB, vc->vtb);
 
-	local_paca->kvm_hstate.host_purr = mfspr(SPRN_PURR);
-	local_paca->kvm_hstate.host_spurr = mfspr(SPRN_SPURR);
 	mtspr(SPRN_PURR, vcpu->arch.purr);
 	mtspr(SPRN_SPURR, vcpu->arch.spurr);
 
@@ -433,10 +434,8 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	/* Advance host PURR/SPURR by the amount used by guest */
 	purr = mfspr(SPRN_PURR);
 	spurr = mfspr(SPRN_SPURR);
-	mtspr(SPRN_PURR, local_paca->kvm_hstate.host_purr +
-	      purr - vcpu->arch.purr);
-	mtspr(SPRN_SPURR, local_paca->kvm_hstate.host_spurr +
-	      spurr - vcpu->arch.spurr);
+	local_paca->kvm_hstate.host_purr += purr - vcpu->arch.purr;
+	local_paca->kvm_hstate.host_spurr += spurr - vcpu->arch.spurr;
 	vcpu->arch.purr = purr;
 	vcpu->arch.spurr = spurr;
 
@@ -449,6 +448,9 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	vcpu->arch.shregs.sprg2 = mfspr(SPRN_SPRG2);
 	vcpu->arch.shregs.sprg3 = mfspr(SPRN_SPRG3);
 
+	vc->dpdes = mfspr(SPRN_DPDES);
+	vc->vtb = mfspr(SPRN_VTB);
+
 	dec = mfspr(SPRN_DEC);
 	if (!(lpcr & LPCR_LD)) /* Sign extend if not using large decrementer */
 		dec = (s32) dec;
@@ -466,6 +468,9 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 		vc->tb_offset_applied = 0;
 	}
 
+	mtspr(SPRN_PURR, local_paca->kvm_hstate.host_purr);
+	mtspr(SPRN_SPURR, local_paca->kvm_hstate.host_spurr);
+
 	/* Preserve PSSCR[FAKE_SUSPEND] until we've called kvmppc_save_tm_hv */
 	mtspr(SPRN_PSSCR, host_psscr |
 	      (local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
@@ -494,8 +499,6 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	if (cpu_has_feature(CPU_FTR_ARCH_31))
 		asm volatile(PPC_CP_ABORT);
 
-	vc->dpdes = mfspr(SPRN_DPDES);
-	vc->vtb = mfspr(SPRN_VTB);
 	mtspr(SPRN_DPDES, 0);
 	if (vc->pcr)
 		mtspr(SPRN_PCR, PCR_MASK);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 24/43] KVM: PPC: Book3S HV P9: Only execute mtSPR if the value changed
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Keep better track of the current SPR values in places where
they are to be loaded with a new context, to reduce expensive
mtSPR operations.

-73 cycles (7354) POWER9 virt-mode NULL hcall
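
The shape of the change, using DSCR as one example from this patch: the
host value is sampled once with mfspr() in save_p9_host_os_sprs(), and
the guest value is only written when it actually differs:

	/* host_os_sprs->dscr was read earlier in save_p9_host_os_sprs() */
	if (host_os_sprs->dscr != vcpu->arch.dscr)
		mtspr(SPRN_DSCR, vcpu->arch.dscr);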

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c | 64 ++++++++++++++++++++++--------------
 1 file changed, 39 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 8c6ba04e1fdf..612b70216e75 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3905,19 +3905,28 @@ static void switch_pmu_to_host(struct kvm_vcpu *vcpu,
 	}
 }
 
-static void load_spr_state(struct kvm_vcpu *vcpu)
+static void load_spr_state(struct kvm_vcpu *vcpu,
+				struct p9_host_os_sprs *host_os_sprs)
 {
-	mtspr(SPRN_DSCR, vcpu->arch.dscr);
-	mtspr(SPRN_IAMR, vcpu->arch.iamr);
-	mtspr(SPRN_PSPB, vcpu->arch.pspb);
-	mtspr(SPRN_FSCR, vcpu->arch.fscr);
 	mtspr(SPRN_TAR, vcpu->arch.tar);
 	mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
 	mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
 	mtspr(SPRN_BESCR, vcpu->arch.bescr);
-	mtspr(SPRN_TIDR, vcpu->arch.tid);
-	mtspr(SPRN_AMR, vcpu->arch.amr);
-	mtspr(SPRN_UAMOR, vcpu->arch.uamor);
+
+	if (!cpu_has_feature(CPU_FTR_ARCH_31))
+		mtspr(SPRN_TIDR, vcpu->arch.tid);
+	if (host_os_sprs->iamr != vcpu->arch.iamr)
+		mtspr(SPRN_IAMR, vcpu->arch.iamr);
+	if (host_os_sprs->amr != vcpu->arch.amr)
+		mtspr(SPRN_AMR, vcpu->arch.amr);
+	if (vcpu->arch.uamor != 0)
+		mtspr(SPRN_UAMOR, vcpu->arch.uamor);
+	if (host_os_sprs->fscr != vcpu->arch.fscr)
+		mtspr(SPRN_FSCR, vcpu->arch.fscr);
+	if (host_os_sprs->dscr != vcpu->arch.dscr)
+		mtspr(SPRN_DSCR, vcpu->arch.dscr);
+	if (vcpu->arch.pspb != 0)
+		mtspr(SPRN_PSPB, vcpu->arch.pspb);
 
 	/*
 	 * DAR, DSISR, and for nested HV, SPRGs must be set with MSR[RI]
@@ -3932,28 +3941,31 @@ static void load_spr_state(struct kvm_vcpu *vcpu)
 
 static void store_spr_state(struct kvm_vcpu *vcpu)
 {
-	vcpu->arch.ctrl = mfspr(SPRN_CTRLF);
-
-	vcpu->arch.iamr = mfspr(SPRN_IAMR);
-	vcpu->arch.pspb = mfspr(SPRN_PSPB);
-	vcpu->arch.fscr = mfspr(SPRN_FSCR);
 	vcpu->arch.tar = mfspr(SPRN_TAR);
 	vcpu->arch.ebbhr = mfspr(SPRN_EBBHR);
 	vcpu->arch.ebbrr = mfspr(SPRN_EBBRR);
 	vcpu->arch.bescr = mfspr(SPRN_BESCR);
-	vcpu->arch.tid = mfspr(SPRN_TIDR);
+
+	if (!cpu_has_feature(CPU_FTR_ARCH_31))
+		vcpu->arch.tid = mfspr(SPRN_TIDR);
+	vcpu->arch.iamr = mfspr(SPRN_IAMR);
 	vcpu->arch.amr = mfspr(SPRN_AMR);
 	vcpu->arch.uamor = mfspr(SPRN_UAMOR);
+	vcpu->arch.fscr = mfspr(SPRN_FSCR);
 	vcpu->arch.dscr = mfspr(SPRN_DSCR);
+	vcpu->arch.pspb = mfspr(SPRN_PSPB);
+
+	vcpu->arch.ctrl = mfspr(SPRN_CTRLF);
 }
 
 static void save_p9_host_os_sprs(struct p9_host_os_sprs *host_os_sprs)
 {
-	host_os_sprs->dscr = mfspr(SPRN_DSCR);
-	host_os_sprs->tidr = mfspr(SPRN_TIDR);
+	if (!cpu_has_feature(CPU_FTR_ARCH_31))
+		host_os_sprs->tidr = mfspr(SPRN_TIDR);
 	host_os_sprs->iamr = mfspr(SPRN_IAMR);
 	host_os_sprs->amr = mfspr(SPRN_AMR);
 	host_os_sprs->fscr = mfspr(SPRN_FSCR);
+	host_os_sprs->dscr = mfspr(SPRN_DSCR);
 }
 
 /* vcpu guest regs must already be saved */
@@ -3962,18 +3974,20 @@ static void restore_p9_host_os_sprs(struct kvm_vcpu *vcpu,
 {
 	mtspr(SPRN_SPRG_VDSO_WRITE, local_paca->sprg_vdso);
 
-	mtspr(SPRN_PSPB, 0);
-	mtspr(SPRN_UAMOR, 0);
-
-	mtspr(SPRN_DSCR, host_os_sprs->dscr);
-	mtspr(SPRN_TIDR, host_os_sprs->tidr);
-	mtspr(SPRN_IAMR, host_os_sprs->iamr);
-
+	if (!cpu_has_feature(CPU_FTR_ARCH_31))
+		mtspr(SPRN_TIDR, host_os_sprs->tidr);
+	if (host_os_sprs->iamr != vcpu->arch.iamr)
+		mtspr(SPRN_IAMR, host_os_sprs->iamr);
+	if (vcpu->arch.uamor != 0)
+		mtspr(SPRN_UAMOR, 0);
 	if (host_os_sprs->amr != vcpu->arch.amr)
 		mtspr(SPRN_AMR, host_os_sprs->amr);
-
 	if (host_os_sprs->fscr != vcpu->arch.fscr)
 		mtspr(SPRN_FSCR, host_os_sprs->fscr);
+	if (host_os_sprs->dscr != vcpu->arch.dscr)
+		mtspr(SPRN_DSCR, host_os_sprs->dscr);
+	if (vcpu->arch.pspb != 0)
+		mtspr(SPRN_PSPB, 0);
 
 	/* Save guest CTRL register, set runlatch to 1 */
 	if (!(vcpu->arch.ctrl & 1))
@@ -4063,7 +4077,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 #endif
 	mtspr(SPRN_VRSAVE, vcpu->arch.vrsave);
 
-	load_spr_state(vcpu);
+	load_spr_state(vcpu, &host_os_sprs);
 
 	if (kvmhv_on_pseries()) {
 		/*
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 25/43] KVM: PPC: Book3S HV P9: Juggle SPR switching around
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

This juggles SPR switching on the entry and exit sides to be more
symmetric, which makes the next refactoring patch possible with no
functional change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 612b70216e75..a780a9b9effd 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4069,7 +4069,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
 		kvmppc_restore_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
 
-	switch_pmu_to_guest(vcpu, &host_os_sprs);
+	load_spr_state(vcpu, &host_os_sprs);
 
 	load_fp_state(&vcpu->arch.fp);
 #ifdef CONFIG_ALTIVEC
@@ -4077,7 +4077,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 #endif
 	mtspr(SPRN_VRSAVE, vcpu->arch.vrsave);
 
-	load_spr_state(vcpu, &host_os_sprs);
+	switch_pmu_to_guest(vcpu, &host_os_sprs);
 
 	if (kvmhv_on_pseries()) {
 		/*
@@ -4177,6 +4177,8 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 			vcpu->arch.slb_max = 0;
 	}
 
+	switch_pmu_to_host(vcpu, &host_os_sprs);
+
 	store_spr_state(vcpu);
 
 	store_fp_state(&vcpu->arch.fp);
@@ -4191,8 +4193,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	vcpu_vpa_increment_dispatch(vcpu);
 
-	switch_pmu_to_host(vcpu, &host_os_sprs);
-
 	timer_rearm_host_dec(*tb);
 
 	restore_p9_host_os_sprs(vcpu, &host_os_sprs);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 26/43] KVM: PPC: Book3S HV P9: Move vcpu register save/restore into functions
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

This should make no functional difference, but it makes the caller
easier to read.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c | 55 +++++++++++++++++++++---------------
 1 file changed, 33 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index a780a9b9effd..35749b0b663f 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3958,6 +3958,37 @@ static void store_spr_state(struct kvm_vcpu *vcpu)
 	vcpu->arch.ctrl = mfspr(SPRN_CTRLF);
 }
 
+static void load_vcpu_state(struct kvm_vcpu *vcpu,
+			   struct p9_host_os_sprs *host_os_sprs)
+{
+	if (cpu_has_feature(CPU_FTR_TM) ||
+	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+		kvmppc_restore_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
+
+	load_spr_state(vcpu, host_os_sprs);
+
+	load_fp_state(&vcpu->arch.fp);
+#ifdef CONFIG_ALTIVEC
+	load_vr_state(&vcpu->arch.vr);
+#endif
+	mtspr(SPRN_VRSAVE, vcpu->arch.vrsave);
+}
+
+static void store_vcpu_state(struct kvm_vcpu *vcpu)
+{
+	store_spr_state(vcpu);
+
+	store_fp_state(&vcpu->arch.fp);
+#ifdef CONFIG_ALTIVEC
+	store_vr_state(&vcpu->arch.vr);
+#endif
+	vcpu->arch.vrsave = mfspr(SPRN_VRSAVE);
+
+	if (cpu_has_feature(CPU_FTR_TM) ||
+	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+		kvmppc_save_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
+}
+
 static void save_p9_host_os_sprs(struct p9_host_os_sprs *host_os_sprs)
 {
 	if (!cpu_has_feature(CPU_FTR_ARCH_31))
@@ -4065,17 +4096,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	vcpu_vpa_increment_dispatch(vcpu);
 
-	if (cpu_has_feature(CPU_FTR_TM) ||
-	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
-		kvmppc_restore_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
-
-	load_spr_state(vcpu, &host_os_sprs);
-
-	load_fp_state(&vcpu->arch.fp);
-#ifdef CONFIG_ALTIVEC
-	load_vr_state(&vcpu->arch.vr);
-#endif
-	mtspr(SPRN_VRSAVE, vcpu->arch.vrsave);
+	load_vcpu_state(vcpu, &host_os_sprs);
 
 	switch_pmu_to_guest(vcpu, &host_os_sprs);
 
@@ -4179,17 +4200,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	switch_pmu_to_host(vcpu, &host_os_sprs);
 
-	store_spr_state(vcpu);
-
-	store_fp_state(&vcpu->arch.fp);
-#ifdef CONFIG_ALTIVEC
-	store_vr_state(&vcpu->arch.vr);
-#endif
-	vcpu->arch.vrsave = mfspr(SPRN_VRSAVE);
-
-	if (cpu_has_feature(CPU_FTR_TM) ||
-	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
-		kvmppc_save_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
+	store_vcpu_state(vcpu);
 
 	vcpu_vpa_increment_dispatch(vcpu);
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 27/43] KVM: PPC: Book3S HV P9: Move host OS save/restore functions to built-in
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Move the P9 guest/host register switching functions to the built-in
P9 entry code, and export them for nested to use as well.

This allows more flexibility in scheduling these supervisor privileged
SPR accesses with the HV privileged and PR SPR accesses in the low level
entry code.
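
In outline (fragments taken from this patch): struct p9_host_os_sprs and
the function prototypes move to the new book3s_hv.h, the definitions move
to the always-built-in book3s_hv_p9_entry.c, and each function gains an
EXPORT_SYMBOL_GPL() so the possibly-modular book3s_hv code can keep
calling it:

	/* book3s_hv.h */
	void switch_pmu_to_guest(struct kvm_vcpu *vcpu,
				struct p9_host_os_sprs *host_os_sprs);

	/* book3s_hv_p9_entry.c, after the (unchanged) function body */
	EXPORT_SYMBOL_GPL(switch_pmu_to_guest);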

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c          | 351 +-------------------------
 arch/powerpc/kvm/book3s_hv.h          |  39 +++
 arch/powerpc/kvm/book3s_hv_p9_entry.c | 332 ++++++++++++++++++++++++
 3 files changed, 372 insertions(+), 350 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_hv.h

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 35749b0b663f..a7660af22161 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -79,6 +79,7 @@
 #include <asm/dtl.h>
 
 #include "book3s.h"
+#include "book3s_hv.h"
 
 #define CREATE_TRACE_POINTS
 #include "trace_hv.h"
@@ -3675,356 +3676,6 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
 	trace_kvmppc_run_core(vc, 1);
 }
 
-/*
- * Privileged (non-hypervisor) host registers to save.
- */
-struct p9_host_os_sprs {
-	unsigned long dscr;
-	unsigned long tidr;
-	unsigned long iamr;
-	unsigned long amr;
-	unsigned long fscr;
-
-	unsigned int pmc1;
-	unsigned int pmc2;
-	unsigned int pmc3;
-	unsigned int pmc4;
-	unsigned int pmc5;
-	unsigned int pmc6;
-	unsigned long mmcr0;
-	unsigned long mmcr1;
-	unsigned long mmcr2;
-	unsigned long mmcr3;
-	unsigned long mmcra;
-	unsigned long siar;
-	unsigned long sier1;
-	unsigned long sier2;
-	unsigned long sier3;
-	unsigned long sdar;
-};
-
-static void freeze_pmu(unsigned long mmcr0, unsigned long mmcra)
-{
-	if (!(mmcr0 & MMCR0_FC))
-		goto do_freeze;
-	if (mmcra & MMCRA_SAMPLE_ENABLE)
-		goto do_freeze;
-	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
-		if (!(mmcr0 & MMCR0_PMCCEXT))
-			goto do_freeze;
-		if (!(mmcra & MMCRA_BHRB_DISABLE))
-			goto do_freeze;
-	}
-	return;
-
-do_freeze:
-	mmcr0 = MMCR0_FC;
-	mmcra = 0;
-	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
-		mmcr0 |= MMCR0_PMCCEXT;
-		mmcra = MMCRA_BHRB_DISABLE;
-	}
-
-	mtspr(SPRN_MMCR0, mmcr0);
-	mtspr(SPRN_MMCRA, mmcra);
-	isync();
-}
-
-static void switch_pmu_to_guest(struct kvm_vcpu *vcpu,
-				struct p9_host_os_sprs *host_os_sprs)
-{
-	struct lppaca *lp;
-	int load_pmu = 1;
-
-	lp = vcpu->arch.vpa.pinned_addr;
-	if (lp)
-		load_pmu = lp->pmcregs_in_use;
-
-	if (load_pmu)
-	      vcpu->arch.hfscr |= HFSCR_PM;
-
-	/* Save host */
-	if (ppc_get_pmu_inuse()) {
-		/*
-		 * It might be better to put PMU handling (at least for the
-		 * host) in the perf subsystem because it knows more about what
-		 * is being used.
-		 */
-
-		/* POWER9, POWER10 do not implement HPMC or SPMC */
-
-		host_os_sprs->mmcr0 = mfspr(SPRN_MMCR0);
-		host_os_sprs->mmcra = mfspr(SPRN_MMCRA);
-
-		freeze_pmu(host_os_sprs->mmcr0, host_os_sprs->mmcra);
-
-		host_os_sprs->pmc1 = mfspr(SPRN_PMC1);
-		host_os_sprs->pmc2 = mfspr(SPRN_PMC2);
-		host_os_sprs->pmc3 = mfspr(SPRN_PMC3);
-		host_os_sprs->pmc4 = mfspr(SPRN_PMC4);
-		host_os_sprs->pmc5 = mfspr(SPRN_PMC5);
-		host_os_sprs->pmc6 = mfspr(SPRN_PMC6);
-		host_os_sprs->mmcr1 = mfspr(SPRN_MMCR1);
-		host_os_sprs->mmcr2 = mfspr(SPRN_MMCR2);
-		host_os_sprs->sdar = mfspr(SPRN_SDAR);
-		host_os_sprs->siar = mfspr(SPRN_SIAR);
-		host_os_sprs->sier1 = mfspr(SPRN_SIER);
-
-		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
-			host_os_sprs->mmcr3 = mfspr(SPRN_MMCR3);
-			host_os_sprs->sier2 = mfspr(SPRN_SIER2);
-			host_os_sprs->sier3 = mfspr(SPRN_SIER3);
-		}
-	}
-
-#ifdef CONFIG_PPC_PSERIES
-	if (kvmhv_on_pseries()) {
-		if (vcpu->arch.vpa.pinned_addr) {
-			struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
-			get_lppaca()->pmcregs_in_use = lp->pmcregs_in_use;
-		} else {
-			get_lppaca()->pmcregs_in_use = 1;
-		}
-	}
-#endif
-
-	/* Load guest */
-	if (vcpu->arch.hfscr & HFSCR_PM) {
-		mtspr(SPRN_PMC1, vcpu->arch.pmc[0]);
-		mtspr(SPRN_PMC2, vcpu->arch.pmc[1]);
-		mtspr(SPRN_PMC3, vcpu->arch.pmc[2]);
-		mtspr(SPRN_PMC4, vcpu->arch.pmc[3]);
-		mtspr(SPRN_PMC5, vcpu->arch.pmc[4]);
-		mtspr(SPRN_PMC6, vcpu->arch.pmc[5]);
-		mtspr(SPRN_MMCR1, vcpu->arch.mmcr[1]);
-		mtspr(SPRN_MMCR2, vcpu->arch.mmcr[2]);
-		mtspr(SPRN_SDAR, vcpu->arch.sdar);
-		mtspr(SPRN_SIAR, vcpu->arch.siar);
-		mtspr(SPRN_SIER, vcpu->arch.sier[0]);
-
-		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
-			mtspr(SPRN_MMCR3, vcpu->arch.mmcr[4]);
-			mtspr(SPRN_SIER2, vcpu->arch.sier[1]);
-			mtspr(SPRN_SIER3, vcpu->arch.sier[2]);
-		}
-
-		/* Set MMCRA then MMCR0 last */
-		mtspr(SPRN_MMCRA, vcpu->arch.mmcra);
-		mtspr(SPRN_MMCR0, vcpu->arch.mmcr[0]);
-		/* No isync necessary because we're starting counters */
-	}
-}
-
-static void switch_pmu_to_host(struct kvm_vcpu *vcpu,
-				struct p9_host_os_sprs *host_os_sprs)
-{
-	struct lppaca *lp;
-	int save_pmu = 1;
-
-	lp = vcpu->arch.vpa.pinned_addr;
-	if (lp)
-		save_pmu = lp->pmcregs_in_use;
-
-	if (save_pmu) {
-		vcpu->arch.mmcr[0] = mfspr(SPRN_MMCR0);
-		vcpu->arch.mmcra = mfspr(SPRN_MMCRA);
-
-		freeze_pmu(vcpu->arch.mmcr[0], vcpu->arch.mmcra);
-
-		vcpu->arch.pmc[0] = mfspr(SPRN_PMC1);
-		vcpu->arch.pmc[1] = mfspr(SPRN_PMC2);
-		vcpu->arch.pmc[2] = mfspr(SPRN_PMC3);
-		vcpu->arch.pmc[3] = mfspr(SPRN_PMC4);
-		vcpu->arch.pmc[4] = mfspr(SPRN_PMC5);
-		vcpu->arch.pmc[5] = mfspr(SPRN_PMC6);
-		vcpu->arch.mmcr[1] = mfspr(SPRN_MMCR1);
-		vcpu->arch.mmcr[2] = mfspr(SPRN_MMCR2);
-		vcpu->arch.sdar = mfspr(SPRN_SDAR);
-		vcpu->arch.siar = mfspr(SPRN_SIAR);
-		vcpu->arch.sier[0] = mfspr(SPRN_SIER);
-
-		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
-			vcpu->arch.mmcr[3] = mfspr(SPRN_MMCR3);
-			vcpu->arch.sier[1] = mfspr(SPRN_SIER2);
-			vcpu->arch.sier[2] = mfspr(SPRN_SIER3);
-		}
-
-	} else if (vcpu->arch.hfscr & HFSCR_PM) {
-		/*
-		 * The guest accessed PMC SPRs without specifying they should
-		 * be preserved. Stop them from counting if the guest had
-		 * started anything.
-		 */
-		freeze_pmu(mfspr(SPRN_MMCR0), mfspr(SPRN_MMCRA));
-
-		/*
-		 * Demand-fault PMU register access in the guest.
-		 *
-		 * This is used to grab the guest's VPA pmcregs_in_use value
-		 * and reflect it into the host's VPA in the case of a nested
-		 * hypervisor.
-		 *
-		 * It also avoids having to zero-out SPRs after each guest
-		 * exit to avoid side-channels when.
-		 *
-		 * This is cleared here when we exit the guest, so later HFSCR
-		 * interrupt handling can add it back to run the guest with
-		 * PM enabled next time.
-		 */
-		vcpu->arch.hfscr &= ~HFSCR_PM;
-	} /* otherwise the PMU should still be frozen from guest entry */
-
-#ifdef CONFIG_PPC_PSERIES
-	if (kvmhv_on_pseries())
-		get_lppaca()->pmcregs_in_use = ppc_get_pmu_inuse();
-#endif
-
-	if (ppc_get_pmu_inuse()) {
-		mtspr(SPRN_PMC1, host_os_sprs->pmc1);
-		mtspr(SPRN_PMC2, host_os_sprs->pmc2);
-		mtspr(SPRN_PMC3, host_os_sprs->pmc3);
-		mtspr(SPRN_PMC4, host_os_sprs->pmc4);
-		mtspr(SPRN_PMC5, host_os_sprs->pmc5);
-		mtspr(SPRN_PMC6, host_os_sprs->pmc6);
-		mtspr(SPRN_MMCR1, host_os_sprs->mmcr1);
-		mtspr(SPRN_MMCR2, host_os_sprs->mmcr2);
-		mtspr(SPRN_SDAR, host_os_sprs->sdar);
-		mtspr(SPRN_SIAR, host_os_sprs->siar);
-		mtspr(SPRN_SIER, host_os_sprs->sier1);
-
-		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
-			mtspr(SPRN_MMCR3, host_os_sprs->mmcr3);
-			mtspr(SPRN_SIER2, host_os_sprs->sier2);
-			mtspr(SPRN_SIER3, host_os_sprs->sier3);
-		}
-
-		/* Set MMCRA then MMCR0 last */
-		mtspr(SPRN_MMCRA, host_os_sprs->mmcra);
-		mtspr(SPRN_MMCR0, host_os_sprs->mmcr0);
-		isync();
-	}
-}
-
-static void load_spr_state(struct kvm_vcpu *vcpu,
-				struct p9_host_os_sprs *host_os_sprs)
-{
-	mtspr(SPRN_TAR, vcpu->arch.tar);
-	mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
-	mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
-	mtspr(SPRN_BESCR, vcpu->arch.bescr);
-
-	if (!cpu_has_feature(CPU_FTR_ARCH_31))
-		mtspr(SPRN_TIDR, vcpu->arch.tid);
-	if (host_os_sprs->iamr != vcpu->arch.iamr)
-		mtspr(SPRN_IAMR, vcpu->arch.iamr);
-	if (host_os_sprs->amr != vcpu->arch.amr)
-		mtspr(SPRN_AMR, vcpu->arch.amr);
-	if (vcpu->arch.uamor != 0)
-		mtspr(SPRN_UAMOR, vcpu->arch.uamor);
-	if (host_os_sprs->fscr != vcpu->arch.fscr)
-		mtspr(SPRN_FSCR, vcpu->arch.fscr);
-	if (host_os_sprs->dscr != vcpu->arch.dscr)
-		mtspr(SPRN_DSCR, vcpu->arch.dscr);
-	if (vcpu->arch.pspb != 0)
-		mtspr(SPRN_PSPB, vcpu->arch.pspb);
-
-	/*
-	 * DAR, DSISR, and for nested HV, SPRGs must be set with MSR[RI]
-	 * clear (or hstate set appropriately to catch those registers
-	 * being clobbered if we take a MCE or SRESET), so those are done
-	 * later.
-	 */
-
-	if (!(vcpu->arch.ctrl & 1))
-		mtspr(SPRN_CTRLT, 0);
-}
-
-static void store_spr_state(struct kvm_vcpu *vcpu)
-{
-	vcpu->arch.tar = mfspr(SPRN_TAR);
-	vcpu->arch.ebbhr = mfspr(SPRN_EBBHR);
-	vcpu->arch.ebbrr = mfspr(SPRN_EBBRR);
-	vcpu->arch.bescr = mfspr(SPRN_BESCR);
-
-	if (!cpu_has_feature(CPU_FTR_ARCH_31))
-		vcpu->arch.tid = mfspr(SPRN_TIDR);
-	vcpu->arch.iamr = mfspr(SPRN_IAMR);
-	vcpu->arch.amr = mfspr(SPRN_AMR);
-	vcpu->arch.uamor = mfspr(SPRN_UAMOR);
-	vcpu->arch.fscr = mfspr(SPRN_FSCR);
-	vcpu->arch.dscr = mfspr(SPRN_DSCR);
-	vcpu->arch.pspb = mfspr(SPRN_PSPB);
-
-	vcpu->arch.ctrl = mfspr(SPRN_CTRLF);
-}
-
-static void load_vcpu_state(struct kvm_vcpu *vcpu,
-			   struct p9_host_os_sprs *host_os_sprs)
-{
-	if (cpu_has_feature(CPU_FTR_TM) ||
-	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
-		kvmppc_restore_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
-
-	load_spr_state(vcpu, host_os_sprs);
-
-	load_fp_state(&vcpu->arch.fp);
-#ifdef CONFIG_ALTIVEC
-	load_vr_state(&vcpu->arch.vr);
-#endif
-	mtspr(SPRN_VRSAVE, vcpu->arch.vrsave);
-}
-
-static void store_vcpu_state(struct kvm_vcpu *vcpu)
-{
-	store_spr_state(vcpu);
-
-	store_fp_state(&vcpu->arch.fp);
-#ifdef CONFIG_ALTIVEC
-	store_vr_state(&vcpu->arch.vr);
-#endif
-	vcpu->arch.vrsave = mfspr(SPRN_VRSAVE);
-
-	if (cpu_has_feature(CPU_FTR_TM) ||
-	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
-		kvmppc_save_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
-}
-
-static void save_p9_host_os_sprs(struct p9_host_os_sprs *host_os_sprs)
-{
-	if (!cpu_has_feature(CPU_FTR_ARCH_31))
-		host_os_sprs->tidr = mfspr(SPRN_TIDR);
-	host_os_sprs->iamr = mfspr(SPRN_IAMR);
-	host_os_sprs->amr = mfspr(SPRN_AMR);
-	host_os_sprs->fscr = mfspr(SPRN_FSCR);
-	host_os_sprs->dscr = mfspr(SPRN_DSCR);
-}
-
-/* vcpu guest regs must already be saved */
-static void restore_p9_host_os_sprs(struct kvm_vcpu *vcpu,
-				    struct p9_host_os_sprs *host_os_sprs)
-{
-	mtspr(SPRN_SPRG_VDSO_WRITE, local_paca->sprg_vdso);
-
-	if (!cpu_has_feature(CPU_FTR_ARCH_31))
-		mtspr(SPRN_TIDR, host_os_sprs->tidr);
-	if (host_os_sprs->iamr != vcpu->arch.iamr)
-		mtspr(SPRN_IAMR, host_os_sprs->iamr);
-	if (vcpu->arch.uamor != 0)
-		mtspr(SPRN_UAMOR, 0);
-	if (host_os_sprs->amr != vcpu->arch.amr)
-		mtspr(SPRN_AMR, host_os_sprs->amr);
-	if (host_os_sprs->fscr != vcpu->arch.fscr)
-		mtspr(SPRN_FSCR, host_os_sprs->fscr);
-	if (host_os_sprs->dscr != vcpu->arch.dscr)
-		mtspr(SPRN_DSCR, host_os_sprs->dscr);
-	if (vcpu->arch.pspb != 0)
-		mtspr(SPRN_PSPB, 0);
-
-	/* Save guest CTRL register, set runlatch to 1 */
-	if (!(vcpu->arch.ctrl & 1))
-		mtspr(SPRN_CTRLT, 1);
-}
-
 static inline bool hcall_is_xics(unsigned long req)
 {
 	return req == H_EOI || req == H_CPPR || req == H_IPI ||
diff --git a/arch/powerpc/kvm/book3s_hv.h b/arch/powerpc/kvm/book3s_hv.h
new file mode 100644
index 000000000000..72e3a8f4c2cf
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_hv.h
@@ -0,0 +1,39 @@
+
+/*
+ * Privileged (non-hypervisor) host registers to save.
+ */
+struct p9_host_os_sprs {
+	unsigned long dscr;
+	unsigned long tidr;
+	unsigned long iamr;
+	unsigned long amr;
+	unsigned long fscr;
+
+	unsigned int pmc1;
+	unsigned int pmc2;
+	unsigned int pmc3;
+	unsigned int pmc4;
+	unsigned int pmc5;
+	unsigned int pmc6;
+	unsigned long mmcr0;
+	unsigned long mmcr1;
+	unsigned long mmcr2;
+	unsigned long mmcr3;
+	unsigned long mmcra;
+	unsigned long siar;
+	unsigned long sier1;
+	unsigned long sier2;
+	unsigned long sier3;
+	unsigned long sdar;
+};
+
+void load_vcpu_state(struct kvm_vcpu *vcpu,
+			   struct p9_host_os_sprs *host_os_sprs);
+void store_vcpu_state(struct kvm_vcpu *vcpu);
+void save_p9_host_os_sprs(struct p9_host_os_sprs *host_os_sprs);
+void restore_p9_host_os_sprs(struct kvm_vcpu *vcpu,
+				    struct p9_host_os_sprs *host_os_sprs);
+void switch_pmu_to_guest(struct kvm_vcpu *vcpu,
+			    struct p9_host_os_sprs *host_os_sprs);
+void switch_pmu_to_host(struct kvm_vcpu *vcpu,
+			    struct p9_host_os_sprs *host_os_sprs);
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index afdd7dfa1c08..cc74cd314fcf 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -4,8 +4,340 @@
 #include <asm/asm-prototypes.h>
 #include <asm/dbell.h>
 #include <asm/kvm_ppc.h>
+#include <asm/pmc.h>
 #include <asm/ppc-opcode.h>
 
+#include "book3s_hv.h"
+
+static void freeze_pmu(unsigned long mmcr0, unsigned long mmcra)
+{
+	if (!(mmcr0 & MMCR0_FC))
+		goto do_freeze;
+	if (mmcra & MMCRA_SAMPLE_ENABLE)
+		goto do_freeze;
+	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+		if (!(mmcr0 & MMCR0_PMCCEXT))
+			goto do_freeze;
+		if (!(mmcra & MMCRA_BHRB_DISABLE))
+			goto do_freeze;
+	}
+	return;
+
+do_freeze:
+	mmcr0 = MMCR0_FC;
+	mmcra = 0;
+	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+		mmcr0 |= MMCR0_PMCCEXT;
+		mmcra = MMCRA_BHRB_DISABLE;
+	}
+
+	mtspr(SPRN_MMCR0, mmcr0);
+	mtspr(SPRN_MMCRA, mmcra);
+	isync();
+}
+
+void switch_pmu_to_guest(struct kvm_vcpu *vcpu,
+				struct p9_host_os_sprs *host_os_sprs)
+{
+	struct lppaca *lp;
+	int load_pmu = 1;
+
+	lp = vcpu->arch.vpa.pinned_addr;
+	if (lp)
+		load_pmu = lp->pmcregs_in_use;
+
+	if (load_pmu)
+	      vcpu->arch.hfscr |= HFSCR_PM;
+
+	/* Save host */
+	if (ppc_get_pmu_inuse()) {
+		/*
+		 * It might be better to put PMU handling (at least for the
+		 * host) in the perf subsystem because it knows more about what
+		 * is being used.
+		 */
+
+		/* POWER9, POWER10 do not implement HPMC or SPMC */
+
+		host_os_sprs->mmcr0 = mfspr(SPRN_MMCR0);
+		host_os_sprs->mmcra = mfspr(SPRN_MMCRA);
+
+		freeze_pmu(host_os_sprs->mmcr0, host_os_sprs->mmcra);
+
+		host_os_sprs->pmc1 = mfspr(SPRN_PMC1);
+		host_os_sprs->pmc2 = mfspr(SPRN_PMC2);
+		host_os_sprs->pmc3 = mfspr(SPRN_PMC3);
+		host_os_sprs->pmc4 = mfspr(SPRN_PMC4);
+		host_os_sprs->pmc5 = mfspr(SPRN_PMC5);
+		host_os_sprs->pmc6 = mfspr(SPRN_PMC6);
+		host_os_sprs->mmcr1 = mfspr(SPRN_MMCR1);
+		host_os_sprs->mmcr2 = mfspr(SPRN_MMCR2);
+		host_os_sprs->sdar = mfspr(SPRN_SDAR);
+		host_os_sprs->siar = mfspr(SPRN_SIAR);
+		host_os_sprs->sier1 = mfspr(SPRN_SIER);
+
+		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+			host_os_sprs->mmcr3 = mfspr(SPRN_MMCR3);
+			host_os_sprs->sier2 = mfspr(SPRN_SIER2);
+			host_os_sprs->sier3 = mfspr(SPRN_SIER3);
+		}
+	}
+
+#ifdef CONFIG_PPC_PSERIES
+	if (kvmhv_on_pseries()) {
+		if (vcpu->arch.vpa.pinned_addr) {
+			struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
+			get_lppaca()->pmcregs_in_use = lp->pmcregs_in_use;
+		} else {
+			get_lppaca()->pmcregs_in_use = 1;
+		}
+	}
+#endif
+
+	/* Load guest */
+	if (vcpu->arch.hfscr & HFSCR_PM) {
+		mtspr(SPRN_PMC1, vcpu->arch.pmc[0]);
+		mtspr(SPRN_PMC2, vcpu->arch.pmc[1]);
+		mtspr(SPRN_PMC3, vcpu->arch.pmc[2]);
+		mtspr(SPRN_PMC4, vcpu->arch.pmc[3]);
+		mtspr(SPRN_PMC5, vcpu->arch.pmc[4]);
+		mtspr(SPRN_PMC6, vcpu->arch.pmc[5]);
+		mtspr(SPRN_MMCR1, vcpu->arch.mmcr[1]);
+		mtspr(SPRN_MMCR2, vcpu->arch.mmcr[2]);
+		mtspr(SPRN_SDAR, vcpu->arch.sdar);
+		mtspr(SPRN_SIAR, vcpu->arch.siar);
+		mtspr(SPRN_SIER, vcpu->arch.sier[0]);
+
+		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+			mtspr(SPRN_MMCR3, vcpu->arch.mmcr[4]);
+			mtspr(SPRN_SIER2, vcpu->arch.sier[1]);
+			mtspr(SPRN_SIER3, vcpu->arch.sier[2]);
+		}
+
+		/* Set MMCRA then MMCR0 last */
+		mtspr(SPRN_MMCRA, vcpu->arch.mmcra);
+		mtspr(SPRN_MMCR0, vcpu->arch.mmcr[0]);
+		/* No isync necessary because we're starting counters */
+	}
+}
+EXPORT_SYMBOL_GPL(switch_pmu_to_guest);
+
+void switch_pmu_to_host(struct kvm_vcpu *vcpu,
+				struct p9_host_os_sprs *host_os_sprs)
+{
+	struct lppaca *lp;
+	int save_pmu = 1;
+
+	lp = vcpu->arch.vpa.pinned_addr;
+	if (lp)
+		save_pmu = lp->pmcregs_in_use;
+
+	if (save_pmu) {
+		vcpu->arch.mmcr[0] = mfspr(SPRN_MMCR0);
+		vcpu->arch.mmcra = mfspr(SPRN_MMCRA);
+
+		freeze_pmu(vcpu->arch.mmcr[0], vcpu->arch.mmcra);
+
+		vcpu->arch.pmc[0] = mfspr(SPRN_PMC1);
+		vcpu->arch.pmc[1] = mfspr(SPRN_PMC2);
+		vcpu->arch.pmc[2] = mfspr(SPRN_PMC3);
+		vcpu->arch.pmc[3] = mfspr(SPRN_PMC4);
+		vcpu->arch.pmc[4] = mfspr(SPRN_PMC5);
+		vcpu->arch.pmc[5] = mfspr(SPRN_PMC6);
+		vcpu->arch.mmcr[1] = mfspr(SPRN_MMCR1);
+		vcpu->arch.mmcr[2] = mfspr(SPRN_MMCR2);
+		vcpu->arch.sdar = mfspr(SPRN_SDAR);
+		vcpu->arch.siar = mfspr(SPRN_SIAR);
+		vcpu->arch.sier[0] = mfspr(SPRN_SIER);
+
+		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+			vcpu->arch.mmcr[3] = mfspr(SPRN_MMCR3);
+			vcpu->arch.sier[1] = mfspr(SPRN_SIER2);
+			vcpu->arch.sier[2] = mfspr(SPRN_SIER3);
+		}
+
+	} else if (vcpu->arch.hfscr & HFSCR_PM) {
+		/*
+		 * The guest accessed PMC SPRs without specifying they should
+		 * be preserved. Stop them from counting if the guest had
+		 * started anything.
+		 */
+		freeze_pmu(mfspr(SPRN_MMCR0), mfspr(SPRN_MMCRA));
+
+		/*
+		 * Demand-fault PMU register access in the guest.
+		 *
+		 * This is used to grab the guest's VPA pmcregs_in_use value
+		 * and reflect it into the host's VPA in the case of a nested
+		 * hypervisor.
+		 *
+		 * It also avoids having to zero-out SPRs after each guest
+		 * exit to avoid side-channels when.
+		 *
+		 * This is cleared here when we exit the guest, so later HFSCR
+		 * interrupt handling can add it back to run the guest with
+		 * PM enabled next time.
+		 */
+		vcpu->arch.hfscr &= ~HFSCR_PM;
+	} /* otherwise the PMU should still be frozen from guest entry */
+
+
+#ifdef CONFIG_PPC_PSERIES
+	if (kvmhv_on_pseries())
+		get_lppaca()->pmcregs_in_use = ppc_get_pmu_inuse();
+#endif
+
+	if (ppc_get_pmu_inuse()) {
+		mtspr(SPRN_PMC1, host_os_sprs->pmc1);
+		mtspr(SPRN_PMC2, host_os_sprs->pmc2);
+		mtspr(SPRN_PMC3, host_os_sprs->pmc3);
+		mtspr(SPRN_PMC4, host_os_sprs->pmc4);
+		mtspr(SPRN_PMC5, host_os_sprs->pmc5);
+		mtspr(SPRN_PMC6, host_os_sprs->pmc6);
+		mtspr(SPRN_MMCR1, host_os_sprs->mmcr1);
+		mtspr(SPRN_MMCR2, host_os_sprs->mmcr2);
+		mtspr(SPRN_SDAR, host_os_sprs->sdar);
+		mtspr(SPRN_SIAR, host_os_sprs->siar);
+		mtspr(SPRN_SIER, host_os_sprs->sier1);
+
+		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+			mtspr(SPRN_MMCR3, host_os_sprs->mmcr3);
+			mtspr(SPRN_SIER2, host_os_sprs->sier2);
+			mtspr(SPRN_SIER3, host_os_sprs->sier3);
+		}
+
+		/* Set MMCRA then MMCR0 last */
+		mtspr(SPRN_MMCRA, host_os_sprs->mmcra);
+		mtspr(SPRN_MMCR0, host_os_sprs->mmcr0);
+		isync();
+	}
+}
+EXPORT_SYMBOL_GPL(switch_pmu_to_host);
+
+static void load_spr_state(struct kvm_vcpu *vcpu,
+				struct p9_host_os_sprs *host_os_sprs)
+{
+	mtspr(SPRN_TAR, vcpu->arch.tar);
+	mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
+	mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
+	mtspr(SPRN_BESCR, vcpu->arch.bescr);
+
+	if (!cpu_has_feature(CPU_FTR_ARCH_31))
+		mtspr(SPRN_TIDR, vcpu->arch.tid);
+	if (host_os_sprs->iamr != vcpu->arch.iamr)
+		mtspr(SPRN_IAMR, vcpu->arch.iamr);
+	if (host_os_sprs->amr != vcpu->arch.amr)
+		mtspr(SPRN_AMR, vcpu->arch.amr);
+	if (vcpu->arch.uamor != 0)
+		mtspr(SPRN_UAMOR, vcpu->arch.uamor);
+	if (host_os_sprs->fscr != vcpu->arch.fscr)
+		mtspr(SPRN_FSCR, vcpu->arch.fscr);
+	if (host_os_sprs->dscr != vcpu->arch.dscr)
+		mtspr(SPRN_DSCR, vcpu->arch.dscr);
+	if (vcpu->arch.pspb != 0)
+		mtspr(SPRN_PSPB, vcpu->arch.pspb);
+
+	/*
+	 * DAR, DSISR, and for nested HV, SPRGs must be set with MSR[RI]
+	 * clear (or hstate set appropriately to catch those registers
+	 * being clobbered if we take a MCE or SRESET), so those are done
+	 * later.
+	 */
+
+	if (!(vcpu->arch.ctrl & 1))
+		mtspr(SPRN_CTRLT, 0);
+}
+
+static void store_spr_state(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.tar = mfspr(SPRN_TAR);
+	vcpu->arch.ebbhr = mfspr(SPRN_EBBHR);
+	vcpu->arch.ebbrr = mfspr(SPRN_EBBRR);
+	vcpu->arch.bescr = mfspr(SPRN_BESCR);
+
+	if (!cpu_has_feature(CPU_FTR_ARCH_31))
+		vcpu->arch.tid = mfspr(SPRN_TIDR);
+	vcpu->arch.iamr = mfspr(SPRN_IAMR);
+	vcpu->arch.amr = mfspr(SPRN_AMR);
+	vcpu->arch.uamor = mfspr(SPRN_UAMOR);
+	vcpu->arch.fscr = mfspr(SPRN_FSCR);
+	vcpu->arch.dscr = mfspr(SPRN_DSCR);
+	vcpu->arch.pspb = mfspr(SPRN_PSPB);
+
+	vcpu->arch.ctrl = mfspr(SPRN_CTRLF);
+}
+
+void load_vcpu_state(struct kvm_vcpu *vcpu,
+			   struct p9_host_os_sprs *host_os_sprs)
+{
+	if (cpu_has_feature(CPU_FTR_TM) ||
+	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+		kvmppc_restore_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
+
+	load_spr_state(vcpu, host_os_sprs);
+
+	load_fp_state(&vcpu->arch.fp);
+#ifdef CONFIG_ALTIVEC
+	load_vr_state(&vcpu->arch.vr);
+#endif
+	mtspr(SPRN_VRSAVE, vcpu->arch.vrsave);
+}
+EXPORT_SYMBOL_GPL(load_vcpu_state);
+
+void store_vcpu_state(struct kvm_vcpu *vcpu)
+{
+	store_spr_state(vcpu);
+
+	store_fp_state(&vcpu->arch.fp);
+#ifdef CONFIG_ALTIVEC
+	store_vr_state(&vcpu->arch.vr);
+#endif
+	vcpu->arch.vrsave = mfspr(SPRN_VRSAVE);
+
+	if (cpu_has_feature(CPU_FTR_TM) ||
+	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+		kvmppc_save_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
+}
+EXPORT_SYMBOL_GPL(store_vcpu_state);
+
+void save_p9_host_os_sprs(struct p9_host_os_sprs *host_os_sprs)
+{
+	if (!cpu_has_feature(CPU_FTR_ARCH_31))
+		host_os_sprs->tidr = mfspr(SPRN_TIDR);
+	host_os_sprs->iamr = mfspr(SPRN_IAMR);
+	host_os_sprs->amr = mfspr(SPRN_AMR);
+	host_os_sprs->fscr = mfspr(SPRN_FSCR);
+	host_os_sprs->dscr = mfspr(SPRN_DSCR);
+}
+EXPORT_SYMBOL_GPL(save_p9_host_os_sprs);
+
+/* vcpu guest regs must already be saved */
+void restore_p9_host_os_sprs(struct kvm_vcpu *vcpu,
+				    struct p9_host_os_sprs *host_os_sprs)
+{
+	mtspr(SPRN_SPRG_VDSO_WRITE, local_paca->sprg_vdso);
+
+	if (!cpu_has_feature(CPU_FTR_ARCH_31))
+		mtspr(SPRN_TIDR, host_os_sprs->tidr);
+	if (host_os_sprs->iamr != vcpu->arch.iamr)
+		mtspr(SPRN_IAMR, host_os_sprs->iamr);
+	if (vcpu->arch.uamor != 0)
+		mtspr(SPRN_UAMOR, 0);
+	if (host_os_sprs->amr != vcpu->arch.amr)
+		mtspr(SPRN_AMR, host_os_sprs->amr);
+	if (host_os_sprs->fscr != vcpu->arch.fscr)
+		mtspr(SPRN_FSCR, host_os_sprs->fscr);
+	if (host_os_sprs->dscr != vcpu->arch.dscr)
+		mtspr(SPRN_DSCR, host_os_sprs->dscr);
+	if (vcpu->arch.pspb != 0)
+		mtspr(SPRN_PSPB, 0);
+
+	/* Save guest CTRL register, set runlatch to 1 */
+	if (!(vcpu->arch.ctrl & 1))
+		mtspr(SPRN_CTRLT, 1);
+}
+EXPORT_SYMBOL_GPL(restore_p9_host_os_sprs);
+
 #ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
 static void __start_timing(struct kvm_vcpu *vcpu, struct kvmhv_tb_accumulator *next)
 {
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 27/43] KVM: PPC: Book3S HV P9: Move host OS save/restore functions to built-in
@ 2021-06-22 10:57   ` Nicholas Piggin
  0 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Move the P9 guest/host register switching functions to the built-in
P9 entry code, and export it for nested to use as well.

This allows more flexibility in scheduling these supervisor privileged
SPR accesses with the HV privileged and PR SPR accesses in the low level
entry code.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c          | 351 +-------------------------
 arch/powerpc/kvm/book3s_hv.h          |  39 +++
 arch/powerpc/kvm/book3s_hv_p9_entry.c | 332 ++++++++++++++++++++++++
 3 files changed, 372 insertions(+), 350 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_hv.h

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 35749b0b663f..a7660af22161 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -79,6 +79,7 @@
 #include <asm/dtl.h>
 
 #include "book3s.h"
+#include "book3s_hv.h"
 
 #define CREATE_TRACE_POINTS
 #include "trace_hv.h"
@@ -3675,356 +3676,6 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
 	trace_kvmppc_run_core(vc, 1);
 }
 
-/*
- * Privileged (non-hypervisor) host registers to save.
- */
-struct p9_host_os_sprs {
-	unsigned long dscr;
-	unsigned long tidr;
-	unsigned long iamr;
-	unsigned long amr;
-	unsigned long fscr;
-
-	unsigned int pmc1;
-	unsigned int pmc2;
-	unsigned int pmc3;
-	unsigned int pmc4;
-	unsigned int pmc5;
-	unsigned int pmc6;
-	unsigned long mmcr0;
-	unsigned long mmcr1;
-	unsigned long mmcr2;
-	unsigned long mmcr3;
-	unsigned long mmcra;
-	unsigned long siar;
-	unsigned long sier1;
-	unsigned long sier2;
-	unsigned long sier3;
-	unsigned long sdar;
-};
-
-static void freeze_pmu(unsigned long mmcr0, unsigned long mmcra)
-{
-	if (!(mmcr0 & MMCR0_FC))
-		goto do_freeze;
-	if (mmcra & MMCRA_SAMPLE_ENABLE)
-		goto do_freeze;
-	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
-		if (!(mmcr0 & MMCR0_PMCCEXT))
-			goto do_freeze;
-		if (!(mmcra & MMCRA_BHRB_DISABLE))
-			goto do_freeze;
-	}
-	return;
-
-do_freeze:
-	mmcr0 = MMCR0_FC;
-	mmcra = 0;
-	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
-		mmcr0 |= MMCR0_PMCCEXT;
-		mmcra = MMCRA_BHRB_DISABLE;
-	}
-
-	mtspr(SPRN_MMCR0, mmcr0);
-	mtspr(SPRN_MMCRA, mmcra);
-	isync();
-}
-
-static void switch_pmu_to_guest(struct kvm_vcpu *vcpu,
-				struct p9_host_os_sprs *host_os_sprs)
-{
-	struct lppaca *lp;
-	int load_pmu = 1;
-
-	lp = vcpu->arch.vpa.pinned_addr;
-	if (lp)
-		load_pmu = lp->pmcregs_in_use;
-
-	if (load_pmu)
-	      vcpu->arch.hfscr |= HFSCR_PM;
-
-	/* Save host */
-	if (ppc_get_pmu_inuse()) {
-		/*
-		 * It might be better to put PMU handling (at least for the
-		 * host) in the perf subsystem because it knows more about what
-		 * is being used.
-		 */
-
-		/* POWER9, POWER10 do not implement HPMC or SPMC */
-
-		host_os_sprs->mmcr0 = mfspr(SPRN_MMCR0);
-		host_os_sprs->mmcra = mfspr(SPRN_MMCRA);
-
-		freeze_pmu(host_os_sprs->mmcr0, host_os_sprs->mmcra);
-
-		host_os_sprs->pmc1 = mfspr(SPRN_PMC1);
-		host_os_sprs->pmc2 = mfspr(SPRN_PMC2);
-		host_os_sprs->pmc3 = mfspr(SPRN_PMC3);
-		host_os_sprs->pmc4 = mfspr(SPRN_PMC4);
-		host_os_sprs->pmc5 = mfspr(SPRN_PMC5);
-		host_os_sprs->pmc6 = mfspr(SPRN_PMC6);
-		host_os_sprs->mmcr1 = mfspr(SPRN_MMCR1);
-		host_os_sprs->mmcr2 = mfspr(SPRN_MMCR2);
-		host_os_sprs->sdar = mfspr(SPRN_SDAR);
-		host_os_sprs->siar = mfspr(SPRN_SIAR);
-		host_os_sprs->sier1 = mfspr(SPRN_SIER);
-
-		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
-			host_os_sprs->mmcr3 = mfspr(SPRN_MMCR3);
-			host_os_sprs->sier2 = mfspr(SPRN_SIER2);
-			host_os_sprs->sier3 = mfspr(SPRN_SIER3);
-		}
-	}
-
-#ifdef CONFIG_PPC_PSERIES
-	if (kvmhv_on_pseries()) {
-		if (vcpu->arch.vpa.pinned_addr) {
-			struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
-			get_lppaca()->pmcregs_in_use = lp->pmcregs_in_use;
-		} else {
-			get_lppaca()->pmcregs_in_use = 1;
-		}
-	}
-#endif
-
-	/* Load guest */
-	if (vcpu->arch.hfscr & HFSCR_PM) {
-		mtspr(SPRN_PMC1, vcpu->arch.pmc[0]);
-		mtspr(SPRN_PMC2, vcpu->arch.pmc[1]);
-		mtspr(SPRN_PMC3, vcpu->arch.pmc[2]);
-		mtspr(SPRN_PMC4, vcpu->arch.pmc[3]);
-		mtspr(SPRN_PMC5, vcpu->arch.pmc[4]);
-		mtspr(SPRN_PMC6, vcpu->arch.pmc[5]);
-		mtspr(SPRN_MMCR1, vcpu->arch.mmcr[1]);
-		mtspr(SPRN_MMCR2, vcpu->arch.mmcr[2]);
-		mtspr(SPRN_SDAR, vcpu->arch.sdar);
-		mtspr(SPRN_SIAR, vcpu->arch.siar);
-		mtspr(SPRN_SIER, vcpu->arch.sier[0]);
-
-		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
-			mtspr(SPRN_MMCR3, vcpu->arch.mmcr[4]);
-			mtspr(SPRN_SIER2, vcpu->arch.sier[1]);
-			mtspr(SPRN_SIER3, vcpu->arch.sier[2]);
-		}
-
-		/* Set MMCRA then MMCR0 last */
-		mtspr(SPRN_MMCRA, vcpu->arch.mmcra);
-		mtspr(SPRN_MMCR0, vcpu->arch.mmcr[0]);
-		/* No isync necessary because we're starting counters */
-	}
-}
-
-static void switch_pmu_to_host(struct kvm_vcpu *vcpu,
-				struct p9_host_os_sprs *host_os_sprs)
-{
-	struct lppaca *lp;
-	int save_pmu = 1;
-
-	lp = vcpu->arch.vpa.pinned_addr;
-	if (lp)
-		save_pmu = lp->pmcregs_in_use;
-
-	if (save_pmu) {
-		vcpu->arch.mmcr[0] = mfspr(SPRN_MMCR0);
-		vcpu->arch.mmcra = mfspr(SPRN_MMCRA);
-
-		freeze_pmu(vcpu->arch.mmcr[0], vcpu->arch.mmcra);
-
-		vcpu->arch.pmc[0] = mfspr(SPRN_PMC1);
-		vcpu->arch.pmc[1] = mfspr(SPRN_PMC2);
-		vcpu->arch.pmc[2] = mfspr(SPRN_PMC3);
-		vcpu->arch.pmc[3] = mfspr(SPRN_PMC4);
-		vcpu->arch.pmc[4] = mfspr(SPRN_PMC5);
-		vcpu->arch.pmc[5] = mfspr(SPRN_PMC6);
-		vcpu->arch.mmcr[1] = mfspr(SPRN_MMCR1);
-		vcpu->arch.mmcr[2] = mfspr(SPRN_MMCR2);
-		vcpu->arch.sdar = mfspr(SPRN_SDAR);
-		vcpu->arch.siar = mfspr(SPRN_SIAR);
-		vcpu->arch.sier[0] = mfspr(SPRN_SIER);
-
-		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
-			vcpu->arch.mmcr[3] = mfspr(SPRN_MMCR3);
-			vcpu->arch.sier[1] = mfspr(SPRN_SIER2);
-			vcpu->arch.sier[2] = mfspr(SPRN_SIER3);
-		}
-
-	} else if (vcpu->arch.hfscr & HFSCR_PM) {
-		/*
-		 * The guest accessed PMC SPRs without specifying they should
-		 * be preserved. Stop them from counting if the guest had
-		 * started anything.
-		 */
-		freeze_pmu(mfspr(SPRN_MMCR0), mfspr(SPRN_MMCRA));
-
-		/*
-		 * Demand-fault PMU register access in the guest.
-		 *
-		 * This is used to grab the guest's VPA pmcregs_in_use value
-		 * and reflect it into the host's VPA in the case of a nested
-		 * hypervisor.
-		 *
-		 * It also avoids having to zero-out SPRs after each guest
-		 * exit to avoid side-channels.
-		 *
-		 * This is cleared here when we exit the guest, so later HFSCR
-		 * interrupt handling can add it back to run the guest with
-		 * PM enabled next time.
-		 */
-		vcpu->arch.hfscr &= ~HFSCR_PM;
-	} /* otherwise the PMU should still be frozen from guest entry */
-
-#ifdef CONFIG_PPC_PSERIES
-	if (kvmhv_on_pseries())
-		get_lppaca()->pmcregs_in_use = ppc_get_pmu_inuse();
-#endif
-
-	if (ppc_get_pmu_inuse()) {
-		mtspr(SPRN_PMC1, host_os_sprs->pmc1);
-		mtspr(SPRN_PMC2, host_os_sprs->pmc2);
-		mtspr(SPRN_PMC3, host_os_sprs->pmc3);
-		mtspr(SPRN_PMC4, host_os_sprs->pmc4);
-		mtspr(SPRN_PMC5, host_os_sprs->pmc5);
-		mtspr(SPRN_PMC6, host_os_sprs->pmc6);
-		mtspr(SPRN_MMCR1, host_os_sprs->mmcr1);
-		mtspr(SPRN_MMCR2, host_os_sprs->mmcr2);
-		mtspr(SPRN_SDAR, host_os_sprs->sdar);
-		mtspr(SPRN_SIAR, host_os_sprs->siar);
-		mtspr(SPRN_SIER, host_os_sprs->sier1);
-
-		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
-			mtspr(SPRN_MMCR3, host_os_sprs->mmcr3);
-			mtspr(SPRN_SIER2, host_os_sprs->sier2);
-			mtspr(SPRN_SIER3, host_os_sprs->sier3);
-		}
-
-		/* Set MMCRA then MMCR0 last */
-		mtspr(SPRN_MMCRA, host_os_sprs->mmcra);
-		mtspr(SPRN_MMCR0, host_os_sprs->mmcr0);
-		isync();
-	}
-}
-
-static void load_spr_state(struct kvm_vcpu *vcpu,
-				struct p9_host_os_sprs *host_os_sprs)
-{
-	mtspr(SPRN_TAR, vcpu->arch.tar);
-	mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
-	mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
-	mtspr(SPRN_BESCR, vcpu->arch.bescr);
-
-	if (!cpu_has_feature(CPU_FTR_ARCH_31))
-		mtspr(SPRN_TIDR, vcpu->arch.tid);
-	if (host_os_sprs->iamr != vcpu->arch.iamr)
-		mtspr(SPRN_IAMR, vcpu->arch.iamr);
-	if (host_os_sprs->amr != vcpu->arch.amr)
-		mtspr(SPRN_AMR, vcpu->arch.amr);
-	if (vcpu->arch.uamor != 0)
-		mtspr(SPRN_UAMOR, vcpu->arch.uamor);
-	if (host_os_sprs->fscr != vcpu->arch.fscr)
-		mtspr(SPRN_FSCR, vcpu->arch.fscr);
-	if (host_os_sprs->dscr != vcpu->arch.dscr)
-		mtspr(SPRN_DSCR, vcpu->arch.dscr);
-	if (vcpu->arch.pspb != 0)
-		mtspr(SPRN_PSPB, vcpu->arch.pspb);
-
-	/*
-	 * DAR, DSISR, and for nested HV, SPRGs must be set with MSR[RI]
-	 * clear (or hstate set appropriately to catch those registers
-	 * being clobbered if we take a MCE or SRESET), so those are done
-	 * later.
-	 */
-
-	if (!(vcpu->arch.ctrl & 1))
-		mtspr(SPRN_CTRLT, 0);
-}
-
-static void store_spr_state(struct kvm_vcpu *vcpu)
-{
-	vcpu->arch.tar = mfspr(SPRN_TAR);
-	vcpu->arch.ebbhr = mfspr(SPRN_EBBHR);
-	vcpu->arch.ebbrr = mfspr(SPRN_EBBRR);
-	vcpu->arch.bescr = mfspr(SPRN_BESCR);
-
-	if (!cpu_has_feature(CPU_FTR_ARCH_31))
-		vcpu->arch.tid = mfspr(SPRN_TIDR);
-	vcpu->arch.iamr = mfspr(SPRN_IAMR);
-	vcpu->arch.amr = mfspr(SPRN_AMR);
-	vcpu->arch.uamor = mfspr(SPRN_UAMOR);
-	vcpu->arch.fscr = mfspr(SPRN_FSCR);
-	vcpu->arch.dscr = mfspr(SPRN_DSCR);
-	vcpu->arch.pspb = mfspr(SPRN_PSPB);
-
-	vcpu->arch.ctrl = mfspr(SPRN_CTRLF);
-}
-
-static void load_vcpu_state(struct kvm_vcpu *vcpu,
-			   struct p9_host_os_sprs *host_os_sprs)
-{
-	if (cpu_has_feature(CPU_FTR_TM) ||
-	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
-		kvmppc_restore_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
-
-	load_spr_state(vcpu, host_os_sprs);
-
-	load_fp_state(&vcpu->arch.fp);
-#ifdef CONFIG_ALTIVEC
-	load_vr_state(&vcpu->arch.vr);
-#endif
-	mtspr(SPRN_VRSAVE, vcpu->arch.vrsave);
-}
-
-static void store_vcpu_state(struct kvm_vcpu *vcpu)
-{
-	store_spr_state(vcpu);
-
-	store_fp_state(&vcpu->arch.fp);
-#ifdef CONFIG_ALTIVEC
-	store_vr_state(&vcpu->arch.vr);
-#endif
-	vcpu->arch.vrsave = mfspr(SPRN_VRSAVE);
-
-	if (cpu_has_feature(CPU_FTR_TM) ||
-	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
-		kvmppc_save_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
-}
-
-static void save_p9_host_os_sprs(struct p9_host_os_sprs *host_os_sprs)
-{
-	if (!cpu_has_feature(CPU_FTR_ARCH_31))
-		host_os_sprs->tidr = mfspr(SPRN_TIDR);
-	host_os_sprs->iamr = mfspr(SPRN_IAMR);
-	host_os_sprs->amr = mfspr(SPRN_AMR);
-	host_os_sprs->fscr = mfspr(SPRN_FSCR);
-	host_os_sprs->dscr = mfspr(SPRN_DSCR);
-}
-
-/* vcpu guest regs must already be saved */
-static void restore_p9_host_os_sprs(struct kvm_vcpu *vcpu,
-				    struct p9_host_os_sprs *host_os_sprs)
-{
-	mtspr(SPRN_SPRG_VDSO_WRITE, local_paca->sprg_vdso);
-
-	if (!cpu_has_feature(CPU_FTR_ARCH_31))
-		mtspr(SPRN_TIDR, host_os_sprs->tidr);
-	if (host_os_sprs->iamr != vcpu->arch.iamr)
-		mtspr(SPRN_IAMR, host_os_sprs->iamr);
-	if (vcpu->arch.uamor != 0)
-		mtspr(SPRN_UAMOR, 0);
-	if (host_os_sprs->amr != vcpu->arch.amr)
-		mtspr(SPRN_AMR, host_os_sprs->amr);
-	if (host_os_sprs->fscr != vcpu->arch.fscr)
-		mtspr(SPRN_FSCR, host_os_sprs->fscr);
-	if (host_os_sprs->dscr != vcpu->arch.dscr)
-		mtspr(SPRN_DSCR, host_os_sprs->dscr);
-	if (vcpu->arch.pspb != 0)
-		mtspr(SPRN_PSPB, 0);
-
-	/* Save guest CTRL register, set runlatch to 1 */
-	if (!(vcpu->arch.ctrl & 1))
-		mtspr(SPRN_CTRLT, 1);
-}
-
 static inline bool hcall_is_xics(unsigned long req)
 {
 	return req == H_EOI || req == H_CPPR || req == H_IPI ||
diff --git a/arch/powerpc/kvm/book3s_hv.h b/arch/powerpc/kvm/book3s_hv.h
new file mode 100644
index 000000000000..72e3a8f4c2cf
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_hv.h
@@ -0,0 +1,39 @@
+
+/*
+ * Privileged (non-hypervisor) host registers to save.
+ */
+struct p9_host_os_sprs {
+	unsigned long dscr;
+	unsigned long tidr;
+	unsigned long iamr;
+	unsigned long amr;
+	unsigned long fscr;
+
+	unsigned int pmc1;
+	unsigned int pmc2;
+	unsigned int pmc3;
+	unsigned int pmc4;
+	unsigned int pmc5;
+	unsigned int pmc6;
+	unsigned long mmcr0;
+	unsigned long mmcr1;
+	unsigned long mmcr2;
+	unsigned long mmcr3;
+	unsigned long mmcra;
+	unsigned long siar;
+	unsigned long sier1;
+	unsigned long sier2;
+	unsigned long sier3;
+	unsigned long sdar;
+};
+
+void load_vcpu_state(struct kvm_vcpu *vcpu,
+			   struct p9_host_os_sprs *host_os_sprs);
+void store_vcpu_state(struct kvm_vcpu *vcpu);
+void save_p9_host_os_sprs(struct p9_host_os_sprs *host_os_sprs);
+void restore_p9_host_os_sprs(struct kvm_vcpu *vcpu,
+				    struct p9_host_os_sprs *host_os_sprs);
+void switch_pmu_to_guest(struct kvm_vcpu *vcpu,
+			    struct p9_host_os_sprs *host_os_sprs);
+void switch_pmu_to_host(struct kvm_vcpu *vcpu,
+			    struct p9_host_os_sprs *host_os_sprs);
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index afdd7dfa1c08..cc74cd314fcf 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -4,8 +4,340 @@
 #include <asm/asm-prototypes.h>
 #include <asm/dbell.h>
 #include <asm/kvm_ppc.h>
+#include <asm/pmc.h>
 #include <asm/ppc-opcode.h>
 
+#include "book3s_hv.h"
+
+static void freeze_pmu(unsigned long mmcr0, unsigned long mmcra)
+{
+	if (!(mmcr0 & MMCR0_FC))
+		goto do_freeze;
+	if (mmcra & MMCRA_SAMPLE_ENABLE)
+		goto do_freeze;
+	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+		if (!(mmcr0 & MMCR0_PMCCEXT))
+			goto do_freeze;
+		if (!(mmcra & MMCRA_BHRB_DISABLE))
+			goto do_freeze;
+	}
+	return;
+
+do_freeze:
+	mmcr0 = MMCR0_FC;
+	mmcra = 0;
+	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+		mmcr0 |= MMCR0_PMCCEXT;
+		mmcra = MMCRA_BHRB_DISABLE;
+	}
+
+	mtspr(SPRN_MMCR0, mmcr0);
+	mtspr(SPRN_MMCRA, mmcra);
+	isync();
+}
+
+void switch_pmu_to_guest(struct kvm_vcpu *vcpu,
+				struct p9_host_os_sprs *host_os_sprs)
+{
+	struct lppaca *lp;
+	int load_pmu = 1;
+
+	lp = vcpu->arch.vpa.pinned_addr;
+	if (lp)
+		load_pmu = lp->pmcregs_in_use;
+
+	if (load_pmu)
+	      vcpu->arch.hfscr |= HFSCR_PM;
+
+	/* Save host */
+	if (ppc_get_pmu_inuse()) {
+		/*
+		 * It might be better to put PMU handling (at least for the
+		 * host) in the perf subsystem because it knows more about what
+		 * is being used.
+		 */
+
+		/* POWER9, POWER10 do not implement HPMC or SPMC */
+
+		host_os_sprs->mmcr0 = mfspr(SPRN_MMCR0);
+		host_os_sprs->mmcra = mfspr(SPRN_MMCRA);
+
+		freeze_pmu(host_os_sprs->mmcr0, host_os_sprs->mmcra);
+
+		host_os_sprs->pmc1 = mfspr(SPRN_PMC1);
+		host_os_sprs->pmc2 = mfspr(SPRN_PMC2);
+		host_os_sprs->pmc3 = mfspr(SPRN_PMC3);
+		host_os_sprs->pmc4 = mfspr(SPRN_PMC4);
+		host_os_sprs->pmc5 = mfspr(SPRN_PMC5);
+		host_os_sprs->pmc6 = mfspr(SPRN_PMC6);
+		host_os_sprs->mmcr1 = mfspr(SPRN_MMCR1);
+		host_os_sprs->mmcr2 = mfspr(SPRN_MMCR2);
+		host_os_sprs->sdar = mfspr(SPRN_SDAR);
+		host_os_sprs->siar = mfspr(SPRN_SIAR);
+		host_os_sprs->sier1 = mfspr(SPRN_SIER);
+
+		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+			host_os_sprs->mmcr3 = mfspr(SPRN_MMCR3);
+			host_os_sprs->sier2 = mfspr(SPRN_SIER2);
+			host_os_sprs->sier3 = mfspr(SPRN_SIER3);
+		}
+	}
+
+#ifdef CONFIG_PPC_PSERIES
+	if (kvmhv_on_pseries()) {
+		if (vcpu->arch.vpa.pinned_addr) {
+			struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
+			get_lppaca()->pmcregs_in_use = lp->pmcregs_in_use;
+		} else {
+			get_lppaca()->pmcregs_in_use = 1;
+		}
+	}
+#endif
+
+	/* Load guest */
+	if (vcpu->arch.hfscr & HFSCR_PM) {
+		mtspr(SPRN_PMC1, vcpu->arch.pmc[0]);
+		mtspr(SPRN_PMC2, vcpu->arch.pmc[1]);
+		mtspr(SPRN_PMC3, vcpu->arch.pmc[2]);
+		mtspr(SPRN_PMC4, vcpu->arch.pmc[3]);
+		mtspr(SPRN_PMC5, vcpu->arch.pmc[4]);
+		mtspr(SPRN_PMC6, vcpu->arch.pmc[5]);
+		mtspr(SPRN_MMCR1, vcpu->arch.mmcr[1]);
+		mtspr(SPRN_MMCR2, vcpu->arch.mmcr[2]);
+		mtspr(SPRN_SDAR, vcpu->arch.sdar);
+		mtspr(SPRN_SIAR, vcpu->arch.siar);
+		mtspr(SPRN_SIER, vcpu->arch.sier[0]);
+
+		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+			mtspr(SPRN_MMCR3, vcpu->arch.mmcr[4]);
+			mtspr(SPRN_SIER2, vcpu->arch.sier[1]);
+			mtspr(SPRN_SIER3, vcpu->arch.sier[2]);
+		}
+
+		/* Set MMCRA then MMCR0 last */
+		mtspr(SPRN_MMCRA, vcpu->arch.mmcra);
+		mtspr(SPRN_MMCR0, vcpu->arch.mmcr[0]);
+		/* No isync necessary because we're starting counters */
+	}
+}
+EXPORT_SYMBOL_GPL(switch_pmu_to_guest);
+
+void switch_pmu_to_host(struct kvm_vcpu *vcpu,
+				struct p9_host_os_sprs *host_os_sprs)
+{
+	struct lppaca *lp;
+	int save_pmu = 1;
+
+	lp = vcpu->arch.vpa.pinned_addr;
+	if (lp)
+		save_pmu = lp->pmcregs_in_use;
+
+	if (save_pmu) {
+		vcpu->arch.mmcr[0] = mfspr(SPRN_MMCR0);
+		vcpu->arch.mmcra = mfspr(SPRN_MMCRA);
+
+		freeze_pmu(vcpu->arch.mmcr[0], vcpu->arch.mmcra);
+
+		vcpu->arch.pmc[0] = mfspr(SPRN_PMC1);
+		vcpu->arch.pmc[1] = mfspr(SPRN_PMC2);
+		vcpu->arch.pmc[2] = mfspr(SPRN_PMC3);
+		vcpu->arch.pmc[3] = mfspr(SPRN_PMC4);
+		vcpu->arch.pmc[4] = mfspr(SPRN_PMC5);
+		vcpu->arch.pmc[5] = mfspr(SPRN_PMC6);
+		vcpu->arch.mmcr[1] = mfspr(SPRN_MMCR1);
+		vcpu->arch.mmcr[2] = mfspr(SPRN_MMCR2);
+		vcpu->arch.sdar = mfspr(SPRN_SDAR);
+		vcpu->arch.siar = mfspr(SPRN_SIAR);
+		vcpu->arch.sier[0] = mfspr(SPRN_SIER);
+
+		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+			vcpu->arch.mmcr[3] = mfspr(SPRN_MMCR3);
+			vcpu->arch.sier[1] = mfspr(SPRN_SIER2);
+			vcpu->arch.sier[2] = mfspr(SPRN_SIER3);
+		}
+
+	} else if (vcpu->arch.hfscr & HFSCR_PM) {
+		/*
+		 * The guest accessed PMC SPRs without specifying they should
+		 * be preserved. Stop them from counting if the guest had
+		 * started anything.
+		 */
+		freeze_pmu(mfspr(SPRN_MMCR0), mfspr(SPRN_MMCRA));
+
+		/*
+		 * Demand-fault PMU register access in the guest.
+		 *
+		 * This is used to grab the guest's VPA pmcregs_in_use value
+		 * and reflect it into the host's VPA in the case of a nested
+		 * hypervisor.
+		 *
+		 * It also avoids having to zero-out SPRs after each guest
+		 * exit to avoid side-channels.
+		 *
+		 * This is cleared here when we exit the guest, so later HFSCR
+		 * interrupt handling can add it back to run the guest with
+		 * PM enabled next time.
+		 */
+		vcpu->arch.hfscr &= ~HFSCR_PM;
+	} /* otherwise the PMU should still be frozen from guest entry */
+
+
+#ifdef CONFIG_PPC_PSERIES
+	if (kvmhv_on_pseries())
+		get_lppaca()->pmcregs_in_use = ppc_get_pmu_inuse();
+#endif
+
+	if (ppc_get_pmu_inuse()) {
+		mtspr(SPRN_PMC1, host_os_sprs->pmc1);
+		mtspr(SPRN_PMC2, host_os_sprs->pmc2);
+		mtspr(SPRN_PMC3, host_os_sprs->pmc3);
+		mtspr(SPRN_PMC4, host_os_sprs->pmc4);
+		mtspr(SPRN_PMC5, host_os_sprs->pmc5);
+		mtspr(SPRN_PMC6, host_os_sprs->pmc6);
+		mtspr(SPRN_MMCR1, host_os_sprs->mmcr1);
+		mtspr(SPRN_MMCR2, host_os_sprs->mmcr2);
+		mtspr(SPRN_SDAR, host_os_sprs->sdar);
+		mtspr(SPRN_SIAR, host_os_sprs->siar);
+		mtspr(SPRN_SIER, host_os_sprs->sier1);
+
+		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+			mtspr(SPRN_MMCR3, host_os_sprs->mmcr3);
+			mtspr(SPRN_SIER2, host_os_sprs->sier2);
+			mtspr(SPRN_SIER3, host_os_sprs->sier3);
+		}
+
+		/* Set MMCRA then MMCR0 last */
+		mtspr(SPRN_MMCRA, host_os_sprs->mmcra);
+		mtspr(SPRN_MMCR0, host_os_sprs->mmcr0);
+		isync();
+	}
+}
+EXPORT_SYMBOL_GPL(switch_pmu_to_host);
+
+static void load_spr_state(struct kvm_vcpu *vcpu,
+				struct p9_host_os_sprs *host_os_sprs)
+{
+	mtspr(SPRN_TAR, vcpu->arch.tar);
+	mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
+	mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
+	mtspr(SPRN_BESCR, vcpu->arch.bescr);
+
+	if (!cpu_has_feature(CPU_FTR_ARCH_31))
+		mtspr(SPRN_TIDR, vcpu->arch.tid);
+	if (host_os_sprs->iamr != vcpu->arch.iamr)
+		mtspr(SPRN_IAMR, vcpu->arch.iamr);
+	if (host_os_sprs->amr != vcpu->arch.amr)
+		mtspr(SPRN_AMR, vcpu->arch.amr);
+	if (vcpu->arch.uamor != 0)
+		mtspr(SPRN_UAMOR, vcpu->arch.uamor);
+	if (host_os_sprs->fscr != vcpu->arch.fscr)
+		mtspr(SPRN_FSCR, vcpu->arch.fscr);
+	if (host_os_sprs->dscr != vcpu->arch.dscr)
+		mtspr(SPRN_DSCR, vcpu->arch.dscr);
+	if (vcpu->arch.pspb != 0)
+		mtspr(SPRN_PSPB, vcpu->arch.pspb);
+
+	/*
+	 * DAR, DSISR, and for nested HV, SPRGs must be set with MSR[RI]
+	 * clear (or hstate set appropriately to catch those registers
+	 * being clobbered if we take a MCE or SRESET), so those are done
+	 * later.
+	 */
+
+	if (!(vcpu->arch.ctrl & 1))
+		mtspr(SPRN_CTRLT, 0);
+}
+
+static void store_spr_state(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.tar = mfspr(SPRN_TAR);
+	vcpu->arch.ebbhr = mfspr(SPRN_EBBHR);
+	vcpu->arch.ebbrr = mfspr(SPRN_EBBRR);
+	vcpu->arch.bescr = mfspr(SPRN_BESCR);
+
+	if (!cpu_has_feature(CPU_FTR_ARCH_31))
+		vcpu->arch.tid = mfspr(SPRN_TIDR);
+	vcpu->arch.iamr = mfspr(SPRN_IAMR);
+	vcpu->arch.amr = mfspr(SPRN_AMR);
+	vcpu->arch.uamor = mfspr(SPRN_UAMOR);
+	vcpu->arch.fscr = mfspr(SPRN_FSCR);
+	vcpu->arch.dscr = mfspr(SPRN_DSCR);
+	vcpu->arch.pspb = mfspr(SPRN_PSPB);
+
+	vcpu->arch.ctrl = mfspr(SPRN_CTRLF);
+}
+
+void load_vcpu_state(struct kvm_vcpu *vcpu,
+			   struct p9_host_os_sprs *host_os_sprs)
+{
+	if (cpu_has_feature(CPU_FTR_TM) ||
+	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+		kvmppc_restore_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
+
+	load_spr_state(vcpu, host_os_sprs);
+
+	load_fp_state(&vcpu->arch.fp);
+#ifdef CONFIG_ALTIVEC
+	load_vr_state(&vcpu->arch.vr);
+#endif
+	mtspr(SPRN_VRSAVE, vcpu->arch.vrsave);
+}
+EXPORT_SYMBOL_GPL(load_vcpu_state);
+
+void store_vcpu_state(struct kvm_vcpu *vcpu)
+{
+	store_spr_state(vcpu);
+
+	store_fp_state(&vcpu->arch.fp);
+#ifdef CONFIG_ALTIVEC
+	store_vr_state(&vcpu->arch.vr);
+#endif
+	vcpu->arch.vrsave = mfspr(SPRN_VRSAVE);
+
+	if (cpu_has_feature(CPU_FTR_TM) ||
+	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+		kvmppc_save_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
+}
+EXPORT_SYMBOL_GPL(store_vcpu_state);
+
+void save_p9_host_os_sprs(struct p9_host_os_sprs *host_os_sprs)
+{
+	if (!cpu_has_feature(CPU_FTR_ARCH_31))
+		host_os_sprs->tidr = mfspr(SPRN_TIDR);
+	host_os_sprs->iamr = mfspr(SPRN_IAMR);
+	host_os_sprs->amr = mfspr(SPRN_AMR);
+	host_os_sprs->fscr = mfspr(SPRN_FSCR);
+	host_os_sprs->dscr = mfspr(SPRN_DSCR);
+}
+EXPORT_SYMBOL_GPL(save_p9_host_os_sprs);
+
+/* vcpu guest regs must already be saved */
+void restore_p9_host_os_sprs(struct kvm_vcpu *vcpu,
+				    struct p9_host_os_sprs *host_os_sprs)
+{
+	mtspr(SPRN_SPRG_VDSO_WRITE, local_paca->sprg_vdso);
+
+	if (!cpu_has_feature(CPU_FTR_ARCH_31))
+		mtspr(SPRN_TIDR, host_os_sprs->tidr);
+	if (host_os_sprs->iamr != vcpu->arch.iamr)
+		mtspr(SPRN_IAMR, host_os_sprs->iamr);
+	if (vcpu->arch.uamor != 0)
+		mtspr(SPRN_UAMOR, 0);
+	if (host_os_sprs->amr != vcpu->arch.amr)
+		mtspr(SPRN_AMR, host_os_sprs->amr);
+	if (host_os_sprs->fscr != vcpu->arch.fscr)
+		mtspr(SPRN_FSCR, host_os_sprs->fscr);
+	if (host_os_sprs->dscr != vcpu->arch.dscr)
+		mtspr(SPRN_DSCR, host_os_sprs->dscr);
+	if (vcpu->arch.pspb != 0)
+		mtspr(SPRN_PSPB, 0);
+
+	/* Save guest CTRL register, set runlatch to 1 */
+	if (!(vcpu->arch.ctrl & 1))
+		mtspr(SPRN_CTRLT, 1);
+}
+EXPORT_SYMBOL_GPL(restore_p9_host_os_sprs);
+
 #ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
 static void __start_timing(struct kvm_vcpu *vcpu, struct kvmhv_tb_accumulator *next)
 {
-- 
2.23.0

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 28/43] KVM: PPC: Book3S HV P9: Move nested guest entry into its own function
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

This is just refactoring.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c | 125 +++++++++++++++++++----------------
 1 file changed, 67 insertions(+), 58 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index a7660af22161..64386fc0cd00 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3692,6 +3692,72 @@ static void vcpu_vpa_increment_dispatch(struct kvm_vcpu *vcpu)
 	}
 }
 
+/* call our hypervisor to load up HV regs and go */
+static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpcr, u64 *tb)
+{
+	struct kvmppc_vcore *vc = vcpu->arch.vcore;
+	unsigned long host_psscr;
+	struct hv_guest_state hvregs;
+	int trap;
+	s64 dec;
+
+	/*
+	 * We need to save and restore the guest visible part of the
+	 * psscr (i.e. using SPRN_PSSCR_PR) since the hypervisor
+	 * doesn't do this for us. Note only required if pseries since
+	 * this is done in kvmhv_vcpu_entry_p9() below otherwise.
+	 */
+	host_psscr = mfspr(SPRN_PSSCR_PR);
+	mtspr(SPRN_PSSCR_PR, vcpu->arch.psscr);
+	kvmhv_save_hv_regs(vcpu, &hvregs);
+	hvregs.lpcr = lpcr;
+	vcpu->arch.regs.msr = vcpu->arch.shregs.msr;
+	hvregs.version = HV_GUEST_STATE_VERSION;
+	if (vcpu->arch.nested) {
+		hvregs.lpid = vcpu->arch.nested->shadow_lpid;
+		hvregs.vcpu_token = vcpu->arch.nested_vcpu_id;
+	} else {
+		hvregs.lpid = vcpu->kvm->arch.lpid;
+		hvregs.vcpu_token = vcpu->vcpu_id;
+	}
+	hvregs.hdec_expiry = time_limit;
+
+	/*
+	 * When setting DEC, we must always deal with irq_work_raise
+	 * via NMI vs setting DEC. The problem occurs right as we
+	 * switch into guest mode if a NMI hits and sets pending work
+	 * and sets DEC, then that will apply to the guest and not
+	 * bring us back to the host.
+	 *
+	 * irq_work_raise could check a flag (or possibly LPCR[HDICE]
+	 * for example) and set HDEC to 1? That wouldn't solve the
+	 * nested hv case which needs to abort the hcall or zero the
+	 * time limit.
+	 *
+	 * XXX: Another day's problem.
+	 */
+	mtspr(SPRN_DEC, kvmppc_dec_expires_host_tb(vcpu) - *tb);
+
+	mtspr(SPRN_DAR, vcpu->arch.shregs.dar);
+	mtspr(SPRN_DSISR, vcpu->arch.shregs.dsisr);
+	trap = plpar_hcall_norets(H_ENTER_NESTED, __pa(&hvregs),
+				  __pa(&vcpu->arch.regs));
+	kvmhv_restore_hv_return_state(vcpu, &hvregs);
+	vcpu->arch.shregs.msr = vcpu->arch.regs.msr;
+	vcpu->arch.shregs.dar = mfspr(SPRN_DAR);
+	vcpu->arch.shregs.dsisr = mfspr(SPRN_DSISR);
+	vcpu->arch.psscr = mfspr(SPRN_PSSCR_PR);
+	mtspr(SPRN_PSSCR_PR, host_psscr);
+
+	dec = mfspr(SPRN_DEC);
+	if (!(lpcr & LPCR_LD)) /* Sign extend if not using large decrementer */
+		dec = (s32) dec;
+	*tb = mftb();
+	vcpu->arch.dec_expires = dec + (*tb + vc->tb_offset);
+
+	return trap;
+}
+
 /*
  * Guest entry for POWER9 and later CPUs.
  */
@@ -3700,7 +3766,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 {
 	struct kvmppc_vcore *vc = vcpu->arch.vcore;
 	struct p9_host_os_sprs host_os_sprs;
-	s64 dec;
 	u64 next_timer;
 	unsigned long msr;
 	int trap;
@@ -3752,63 +3817,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 	switch_pmu_to_guest(vcpu, &host_os_sprs);
 
 	if (kvmhv_on_pseries()) {
-		/*
-		 * We need to save and restore the guest visible part of the
-		 * psscr (i.e. using SPRN_PSSCR_PR) since the hypervisor
-		 * doesn't do this for us. Note only required if pseries since
-		 * this is done in kvmhv_vcpu_entry_p9() below otherwise.
-		 */
-		unsigned long host_psscr;
-		/* call our hypervisor to load up HV regs and go */
-		struct hv_guest_state hvregs;
-
-		host_psscr = mfspr(SPRN_PSSCR_PR);
-		mtspr(SPRN_PSSCR_PR, vcpu->arch.psscr);
-		kvmhv_save_hv_regs(vcpu, &hvregs);
-		hvregs.lpcr = lpcr;
-		vcpu->arch.regs.msr = vcpu->arch.shregs.msr;
-		hvregs.version = HV_GUEST_STATE_VERSION;
-		if (vcpu->arch.nested) {
-			hvregs.lpid = vcpu->arch.nested->shadow_lpid;
-			hvregs.vcpu_token = vcpu->arch.nested_vcpu_id;
-		} else {
-			hvregs.lpid = vcpu->kvm->arch.lpid;
-			hvregs.vcpu_token = vcpu->vcpu_id;
-		}
-		hvregs.hdec_expiry = time_limit;
-
-		/*
-		 * When setting DEC, we must always deal with irq_work_raise
-		 * via NMI vs setting DEC. The problem occurs right as we
-		 * switch into guest mode if a NMI hits and sets pending work
-		 * and sets DEC, then that will apply to the guest and not
-		 * bring us back to the host.
-		 *
-		 * irq_work_raise could check a flag (or possibly LPCR[HDICE]
-		 * for example) and set HDEC to 1? That wouldn't solve the
-		 * nested hv case which needs to abort the hcall or zero the
-		 * time limit.
-		 *
-		 * XXX: Another day's problem.
-		 */
-		mtspr(SPRN_DEC, kvmppc_dec_expires_host_tb(vcpu) - *tb);
-
-		mtspr(SPRN_DAR, vcpu->arch.shregs.dar);
-		mtspr(SPRN_DSISR, vcpu->arch.shregs.dsisr);
-		trap = plpar_hcall_norets(H_ENTER_NESTED, __pa(&hvregs),
-					  __pa(&vcpu->arch.regs));
-		kvmhv_restore_hv_return_state(vcpu, &hvregs);
-		vcpu->arch.shregs.msr = vcpu->arch.regs.msr;
-		vcpu->arch.shregs.dar = mfspr(SPRN_DAR);
-		vcpu->arch.shregs.dsisr = mfspr(SPRN_DSISR);
-		vcpu->arch.psscr = mfspr(SPRN_PSSCR_PR);
-		mtspr(SPRN_PSSCR_PR, host_psscr);
-
-		dec = mfspr(SPRN_DEC);
-		if (!(lpcr & LPCR_LD)) /* Sign extend if not using large decrementer */
-			dec = (s32) dec;
-		*tb = mftb();
-		vcpu->arch.dec_expires = dec + (*tb + vc->tb_offset);
+		trap = kvmhv_vcpu_entry_p9_nested(vcpu, time_limit, lpcr, tb);
 
 		/* H_CEDE has to be handled now, not later */
 		if (trap == BOOK3S_INTERRUPT_SYSCALL && !vcpu->arch.nested &&
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 29/43] KVM: PPC: Book3S HV P9: Move remaining SPR and MSR access into low level entry
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Move register saving and loading from kvmhv_p9_guest_entry() into the HV
and nested entry handlers.

Accesses are scheduled to reduce mtSPR / mfSPR interleaving which
reduces SPR scoreboard stalls.

XXX +212 cycles here somewhere (7566), investigate  POWER9 virt-mode NULL hcall
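
A rough sketch of the scheduling idea (illustrative only, not part of the
patch; the helper name and the particular SPRs below are arbitrary examples,
and asm/reg.h is assumed for mfspr/mtspr): reads are grouped together and
writes are grouped together, rather than alternating mfspr/mtspr per
register, which is what creates the scoreboard stalls.

#include <asm/reg.h>

/*
 * Sketch only: batch the SPR reads, then the SPR writes. Alternating a
 * save and a load per SPR serialises each pair on the SPR scoreboard.
 */
static void sketch_batched_spr_switch(unsigned long *save,
				      const unsigned long *load)
{
	/* read the outgoing context back-to-back... */
	save[0] = mfspr(SPRN_TAR);
	save[1] = mfspr(SPRN_DSCR);
	save[2] = mfspr(SPRN_FSCR);

	/* ...then write the incoming context back-to-back */
	mtspr(SPRN_TAR, load[0]);
	mtspr(SPRN_DSCR, load[1]);
	mtspr(SPRN_FSCR, load[2]);
}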

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c          | 77 ++++++++++++--------------
 arch/powerpc/kvm/book3s_hv_p9_entry.c | 80 ++++++++++++++++++++-------
 2 files changed, 96 insertions(+), 61 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 64386fc0cd00..ee4002c33f89 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3697,9 +3697,15 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
 {
 	struct kvmppc_vcore *vc = vcpu->arch.vcore;
 	unsigned long host_psscr;
+	unsigned long msr;
 	struct hv_guest_state hvregs;
-	int trap;
+	struct p9_host_os_sprs host_os_sprs;
 	s64 dec;
+	int trap;
+
+	switch_pmu_to_guest(vcpu, &host_os_sprs);
+
+	save_p9_host_os_sprs(&host_os_sprs);
 
 	/*
 	 * We need to save and restore the guest visible part of the
@@ -3708,6 +3714,26 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
 	 * this is done in kvmhv_vcpu_entry_p9() below otherwise.
 	 */
 	host_psscr = mfspr(SPRN_PSSCR_PR);
+
+	hard_irq_disable();
+	if (lazy_irq_pending())
+		return 0;
+
+	/* MSR bits may have been cleared by context switch */
+	msr = 0;
+	if (IS_ENABLED(CONFIG_PPC_FPU))
+		msr |= MSR_FP;
+	if (cpu_has_feature(CPU_FTR_ALTIVEC))
+		msr |= MSR_VEC;
+	if (cpu_has_feature(CPU_FTR_VSX))
+		msr |= MSR_VSX;
+	if (cpu_has_feature(CPU_FTR_TM) ||
+	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+		msr |= MSR_TM;
+	msr = msr_check_and_set(msr);
+
+	load_vcpu_state(vcpu, &host_os_sprs);
+
 	mtspr(SPRN_PSSCR_PR, vcpu->arch.psscr);
 	kvmhv_save_hv_regs(vcpu, &hvregs);
 	hvregs.lpcr = lpcr;
@@ -3749,12 +3775,20 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
 	vcpu->arch.psscr = mfspr(SPRN_PSSCR_PR);
 	mtspr(SPRN_PSSCR_PR, host_psscr);
 
+	store_vcpu_state(vcpu);
+
 	dec = mfspr(SPRN_DEC);
 	if (!(lpcr & LPCR_LD)) /* Sign extend if not using large decrementer */
 		dec = (s32) dec;
 	*tb = mftb();
 	vcpu->arch.dec_expires = dec + (*tb + vc->tb_offset);
 
+	timer_rearm_host_dec(*tb);
+
+	restore_p9_host_os_sprs(vcpu, &host_os_sprs);
+
+	switch_pmu_to_host(vcpu, &host_os_sprs);
+
 	return trap;
 }
 
@@ -3765,9 +3799,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 			 unsigned long lpcr, u64 *tb)
 {
 	struct kvmppc_vcore *vc = vcpu->arch.vcore;
-	struct p9_host_os_sprs host_os_sprs;
 	u64 next_timer;
-	unsigned long msr;
 	int trap;
 
 	next_timer = timer_get_next_tb();
@@ -3778,33 +3810,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	vcpu->arch.ceded = 0;
 
-	save_p9_host_os_sprs(&host_os_sprs);
-
-	/*
-	 * This could be combined with MSR[RI] clearing, but that expands
-	 * the unrecoverable window. It would be better to cover unrecoverable
-	 * with KVM bad interrupt handling rather than use MSR[RI] at all.
-	 *
-	 * Much more difficult and less worthwhile to combine with IR/DR
-	 * disable.
-	 */
-	hard_irq_disable();
-	if (lazy_irq_pending())
-		return 0;
-
-	/* MSR bits may have been cleared by context switch */
-	msr = 0;
-	if (IS_ENABLED(CONFIG_PPC_FPU))
-		msr |= MSR_FP;
-	if (cpu_has_feature(CPU_FTR_ALTIVEC))
-		msr |= MSR_VEC;
-	if (cpu_has_feature(CPU_FTR_VSX))
-		msr |= MSR_VSX;
-	if (cpu_has_feature(CPU_FTR_TM) ||
-	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
-		msr |= MSR_TM;
-	msr = msr_check_and_set(msr);
-
 	kvmppc_subcore_enter_guest();
 
 	vc->entry_exit_map = 1;
@@ -3812,10 +3817,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	vcpu_vpa_increment_dispatch(vcpu);
 
-	load_vcpu_state(vcpu, &host_os_sprs);
-
-	switch_pmu_to_guest(vcpu, &host_os_sprs);
-
 	if (kvmhv_on_pseries()) {
 		trap = kvmhv_vcpu_entry_p9_nested(vcpu, time_limit, lpcr, tb);
 
@@ -3858,16 +3859,8 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 			vcpu->arch.slb_max = 0;
 	}
 
-	switch_pmu_to_host(vcpu, &host_os_sprs);
-
-	store_vcpu_state(vcpu);
-
 	vcpu_vpa_increment_dispatch(vcpu);
 
-	timer_rearm_host_dec(*tb);
-
-	restore_p9_host_os_sprs(vcpu, &host_os_sprs);
-
 	vc->entry_exit_map = 0x101;
 	vc->in_guest = 0;
 
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index cc74cd314fcf..f5098995f5cb 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -517,6 +517,7 @@ static void save_clear_guest_mmu(struct kvm *kvm, struct kvm_vcpu *vcpu)
 
 int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpcr, u64 *tb)
 {
+	struct p9_host_os_sprs host_os_sprs;
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_nested_guest *nested = vcpu->arch.nested;
 	struct kvmppc_vcore *vc = vcpu->arch.vcore;
@@ -546,9 +547,6 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 
 	vcpu->arch.ceded = 0;
 
-	/* Could avoid mfmsr by passing around, but probably no big deal */
-	msr = mfmsr();
-
 	host_hfscr = mfspr(SPRN_HFSCR);
 	host_ciabr = mfspr(SPRN_CIABR);
 	host_dawr0 = mfspr(SPRN_DAWR0);
@@ -563,6 +561,38 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	local_paca->kvm_hstate.host_purr = mfspr(SPRN_PURR);
 	local_paca->kvm_hstate.host_spurr = mfspr(SPRN_SPURR);
 
+	switch_pmu_to_guest(vcpu, &host_os_sprs);
+
+	save_p9_host_os_sprs(&host_os_sprs);
+
+	/*
+	 * This could be combined with MSR[RI] clearing, but that expands
+	 * the unrecoverable window. It would be better to cover unrecoverable
+	 * with KVM bad interrupt handling rather than use MSR[RI] at all.
+	 *
+	 * Much more difficult and less worthwhile to combine with IR/DR
+	 * disable.
+	 */
+	hard_irq_disable();
+	if (lazy_irq_pending()) {
+		trap = 0;
+		goto out;
+	}
+
+	/* MSR bits may have been cleared by context switch */
+	msr = 0;
+	if (IS_ENABLED(CONFIG_PPC_FPU))
+		msr |= MSR_FP;
+	if (cpu_has_feature(CPU_FTR_ALTIVEC))
+		msr |= MSR_VEC;
+	if (cpu_has_feature(CPU_FTR_VSX))
+		msr |= MSR_VSX;
+	if (cpu_has_feature(CPU_FTR_TM) ||
+	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+		msr |= MSR_TM;
+	msr = msr_check_and_set(msr);
+	/* Save MSR for restore. This is after hard disable, so EE is clear. */
+
 	if (vc->tb_offset) {
 		u64 new_tb = *tb + vc->tb_offset;
 		mtspr(SPRN_TBU40, new_tb);
@@ -623,6 +653,8 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 
 	local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_HV_P9;
 
+	load_vcpu_state(vcpu, &host_os_sprs);
+
 	/*
 	 * Hash host, hash guest, or radix guest with prefetch bug, all have
 	 * to disable the MMU before switching to guest MMU state.
@@ -783,6 +815,17 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	vc->dpdes = mfspr(SPRN_DPDES);
 	vc->vtb = mfspr(SPRN_VTB);
 
+	save_clear_guest_mmu(kvm, vcpu);
+	switch_mmu_to_host(kvm, host_pidr);
+
+	/*
+	 * If we are in real mode, only switch MMU on after the MMU is
+	 * switched to host, to avoid the P9_RADIX_PREFETCH_BUG.
+	 */
+	__mtmsrd(msr, 0);
+
+	store_vcpu_state(vcpu);
+
 	dec = mfspr(SPRN_DEC);
 	if (!(lpcr & LPCR_LD)) /* Sign extend if not using large decrementer */
 		dec = (s32) dec;
@@ -815,6 +858,19 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 		mtspr(SPRN_DAWRX1, host_dawrx1);
 	}
 
+	mtspr(SPRN_DPDES, 0);
+	if (vc->pcr)
+		mtspr(SPRN_PCR, PCR_MASK);
+
+	/* HDEC must be at least as large as DEC, so decrementer_max fits */
+	mtspr(SPRN_HDEC, decrementer_max);
+
+	timer_rearm_host_dec(*tb);
+
+	restore_p9_host_os_sprs(vcpu, &host_os_sprs);
+
+	local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_NONE;
+
 	if (kvm_is_radix(kvm)) {
 		/*
 		 * Since this is radix, do a eieio; tlbsync; ptesync sequence
@@ -831,22 +887,8 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	if (cpu_has_feature(CPU_FTR_ARCH_31))
 		asm volatile(PPC_CP_ABORT);
 
-	mtspr(SPRN_DPDES, 0);
-	if (vc->pcr)
-		mtspr(SPRN_PCR, PCR_MASK);
-
-	/* HDEC must be at least as large as DEC, so decrementer_max fits */
-	mtspr(SPRN_HDEC, decrementer_max);
-
-	save_clear_guest_mmu(kvm, vcpu);
-	switch_mmu_to_host(kvm, host_pidr);
-	local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_NONE;
-
-	/*
-	 * If we are in real mode, only switch MMU on after the MMU is
-	 * switched to host, to avoid the P9_RADIX_PREFETCH_BUG.
-	 */
-	__mtmsrd(msr, 0);
+out:
+	switch_pmu_to_host(vcpu, &host_os_sprs);
 
 	end_timing(vcpu);
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 30/43] KVM: PPC: Book3S HV P9: Implement TM fastpath for guest entry/exit
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

If TM is not active, only TM register state needs to be saved.

-348 cycles (7218) POWER9 virt-mode NULL hcall

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv_p9_entry.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index f5098995f5cb..81ff8479ac32 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -271,8 +271,16 @@ void load_vcpu_state(struct kvm_vcpu *vcpu,
 			   struct p9_host_os_sprs *host_os_sprs)
 {
 	if (cpu_has_feature(CPU_FTR_TM) ||
-	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
-		kvmppc_restore_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
+	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) {
+		unsigned long msr = vcpu->arch.shregs.msr;
+		if (MSR_TM_ACTIVE(msr)) {
+			kvmppc_restore_tm_hv(vcpu, msr, true);
+		} else {
+			mtspr(SPRN_TEXASR, vcpu->arch.texasr);
+			mtspr(SPRN_TFHAR, vcpu->arch.tfhar);
+			mtspr(SPRN_TFIAR, vcpu->arch.tfiar);
+		}
+	}
 
 	load_spr_state(vcpu, host_os_sprs);
 
@@ -295,8 +303,16 @@ void store_vcpu_state(struct kvm_vcpu *vcpu)
 	vcpu->arch.vrsave = mfspr(SPRN_VRSAVE);
 
 	if (cpu_has_feature(CPU_FTR_TM) ||
-	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
-		kvmppc_save_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
+	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) {
+		unsigned long msr = vcpu->arch.shregs.msr;
+		if (MSR_TM_ACTIVE(msr)) {
+			kvmppc_save_tm_hv(vcpu, msr, true);
+		} else {
+			vcpu->arch.texasr = mfspr(SPRN_TEXASR);
+			vcpu->arch.tfhar = mfspr(SPRN_TFHAR);
+			vcpu->arch.tfiar = mfspr(SPRN_TFIAR);
+		}
+	}
 }
 EXPORT_SYMBOL_GPL(store_vcpu_state);
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 31/43] KVM: PPC: Book3S HV P9: Switch PMU to guest as late as possible
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

This moves the PMU switch to the guest as late as possible in entry, and the
switch back to the host as early as possible at exit. This gives the host as
much perf coverage of the KVM entry/exit code as possible.

This is slightly suboptimal from an SPR scheduling point of view when the
PMU is enabled, but when perf is disabled there is no real difference.
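
Roughly, the HV (non-nested) entry path after this change is ordered as in
the outline below (function names are as used in this series, everything
else is elided, so this is an outline rather than compilable code):

	save_p9_host_os_sprs(&host_os_sprs);
	/* ... host SPR save, timebase/MMU switch: still counted by host PMU ... */
	switch_pmu_to_guest(vcpu, &host_os_sprs);	/* as late as possible */
	kvmppc_p9_enter_guest(vcpu);
	switch_pmu_to_host(vcpu, &host_os_sprs);	/* as early as possible */
	/* ... guest SPR save, exit handling: again visible to host perf ... */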

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c          | 6 ++----
 arch/powerpc/kvm/book3s_hv_p9_entry.c | 6 ++----
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index ee4002c33f89..a31397fde98e 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3703,8 +3703,6 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
 	s64 dec;
 	int trap;
 
-	switch_pmu_to_guest(vcpu, &host_os_sprs);
-
 	save_p9_host_os_sprs(&host_os_sprs);
 
 	/*
@@ -3766,9 +3764,11 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
 
 	mtspr(SPRN_DAR, vcpu->arch.shregs.dar);
 	mtspr(SPRN_DSISR, vcpu->arch.shregs.dsisr);
+	switch_pmu_to_guest(vcpu, &host_os_sprs);
 	trap = plpar_hcall_norets(H_ENTER_NESTED, __pa(&hvregs),
 				  __pa(&vcpu->arch.regs));
 	kvmhv_restore_hv_return_state(vcpu, &hvregs);
+	switch_pmu_to_host(vcpu, &host_os_sprs);
 	vcpu->arch.shregs.msr = vcpu->arch.regs.msr;
 	vcpu->arch.shregs.dar = mfspr(SPRN_DAR);
 	vcpu->arch.shregs.dsisr = mfspr(SPRN_DSISR);
@@ -3787,8 +3787,6 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
 
 	restore_p9_host_os_sprs(vcpu, &host_os_sprs);
 
-	switch_pmu_to_host(vcpu, &host_os_sprs);
-
 	return trap;
 }
 
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 81ff8479ac32..9e58624566a4 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -577,8 +577,6 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	local_paca->kvm_hstate.host_purr = mfspr(SPRN_PURR);
 	local_paca->kvm_hstate.host_spurr = mfspr(SPRN_SPURR);
 
-	switch_pmu_to_guest(vcpu, &host_os_sprs);
-
 	save_p9_host_os_sprs(&host_os_sprs);
 
 	/*
@@ -708,7 +706,9 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 
 	accumulate_time(vcpu, &vcpu->arch.guest_time);
 
+	switch_pmu_to_guest(vcpu, &host_os_sprs);
 	kvmppc_p9_enter_guest(vcpu);
+	switch_pmu_to_host(vcpu, &host_os_sprs);
 
 	accumulate_time(vcpu, &vcpu->arch.rm_intr);
 
@@ -904,8 +904,6 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 		asm volatile(PPC_CP_ABORT);
 
 out:
-	switch_pmu_to_host(vcpu, &host_os_sprs);
-
 	end_timing(vcpu);
 
 	return trap;
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 32/43] KVM: PPC: Book3S HV P9: Restrict DSISR canary workaround to processors that require it
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Use CPU_FTR_P9_RADIX_PREFETCH_BUG for this, to test for DD2.1 and below
processors.

-43 cycles (7178) POWER9 virt-mode NULL hcall
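
The canary scheme itself, in outline (a sketch only; the real change is
in the hunks below): a known value is parked in HDSISR before entering
the guest, and if an HDSI comes back with that value still in place the
hardware did not update the register, so the guest is simply re-entered.

	/* entry path, affected (DD2.1 and earlier) parts only */
	if (cpu_has_feature(CPU_FTR_P9_RADIX_PREFETCH_BUG))
		mtspr(SPRN_HDSISR, HDSISR_CANARY);

	/* HDSI exit handling */
	if (cpu_has_feature(CPU_FTR_P9_RADIX_PREFETCH_BUG) &&
	    unlikely(vcpu->arch.fault_dsisr == HDSISR_CANARY)) {
		r = RESUME_GUEST;	/* stale canary, just retry */
		break;
	}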

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c          | 3 ++-
 arch/powerpc/kvm/book3s_hv_p9_entry.c | 6 ++++--
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index a31397fde98e..ae528eb37792 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1523,7 +1523,8 @@ XXX benchmark guest exits
 		unsigned long vsid;
 		long err;
 
-		if (vcpu->arch.fault_dsisr == HDSISR_CANARY) {
+		if (cpu_has_feature(CPU_FTR_P9_RADIX_PREFETCH_BUG) &&
+		    unlikely(vcpu->arch.fault_dsisr == HDSISR_CANARY)) {
 			r = RESUME_GUEST; /* Just retry if it's the canary */
 			break;
 		}
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 9e58624566a4..b41be3d8f101 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -656,9 +656,11 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	 * HDSI which should correctly update the HDSISR the second time HDSI
 	 * entry.
 	 *
-	 * Just do this on all p9 processors for now.
+	 * The "radix prefetch bug" test can be used to test for this bug, as
+	 * it also exists for DD2.1 and below.
 	 */
-	mtspr(SPRN_HDSISR, HDSISR_CANARY);
+	if (cpu_has_feature(CPU_FTR_P9_RADIX_PREFETCH_BUG))
+		mtspr(SPRN_HDSISR, HDSISR_CANARY);
 
 	mtspr(SPRN_SPRG0, vcpu->arch.shregs.sprg0);
 	mtspr(SPRN_SPRG1, vcpu->arch.shregs.sprg1);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 33/43] KVM: PPC: Book3S HV P9: More SPR speed improvements
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

This avoids more scoreboard stalls and reduces mtSPRs.

-193 cycles (6985) POWER9 virt-mode NULL hcall
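
The pattern throughout is to skip the SPR write when the register
already holds the wanted value, because mtSPR is expensive. A
hypothetical helper (not in the patch, which open-codes the
comparisons) shows the shape; a macro is used so the SPR number stays
a compile-time constant, as mtspr() requires:

	#define mtspr_if_changed(sprn, new, old)	\
	do {						\
		if ((new) != (old))			\
			mtspr((sprn), (new));		\
	} while (0)

	mtspr_if_changed(SPRN_CIABR, vcpu->arch.ciabr, host_ciabr);
	mtspr_if_changed(SPRN_DAWR0, vcpu->arch.dawr0, host_dawr0);
	mtspr_if_changed(SPRN_DAWRX0, vcpu->arch.dawrx0, host_dawrx0);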

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv_p9_entry.c | 67 ++++++++++++++++-----------
 1 file changed, 40 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index b41be3d8f101..4d1a2d1ff4c1 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -618,24 +618,29 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 		vc->tb_offset_applied = vc->tb_offset;
 	}
 
-	if (vc->pcr)
-		mtspr(SPRN_PCR, vc->pcr | PCR_MASK);
-	mtspr(SPRN_DPDES, vc->dpdes);
 	mtspr(SPRN_VTB, vc->vtb);
-
 	mtspr(SPRN_PURR, vcpu->arch.purr);
 	mtspr(SPRN_SPURR, vcpu->arch.spurr);
 
+	if (vc->pcr)
+		mtspr(SPRN_PCR, vc->pcr | PCR_MASK);
+	if (vc->dpdes)
+		mtspr(SPRN_DPDES, vc->dpdes);
+
 	if (dawr_enabled()) {
-		mtspr(SPRN_DAWR0, vcpu->arch.dawr0);
-		mtspr(SPRN_DAWRX0, vcpu->arch.dawrx0);
+		if (vcpu->arch.dawr0 != host_dawr0)
+			mtspr(SPRN_DAWR0, vcpu->arch.dawr0);
+		if (vcpu->arch.dawrx0 != host_dawrx0)
+			mtspr(SPRN_DAWRX0, vcpu->arch.dawrx0);
 		if (cpu_has_feature(CPU_FTR_DAWR1)) {
-			mtspr(SPRN_DAWR1, vcpu->arch.dawr1);
-			mtspr(SPRN_DAWRX1, vcpu->arch.dawrx1);
+			if (vcpu->arch.dawr1 != host_dawr1)
+				mtspr(SPRN_DAWR1, vcpu->arch.dawr1);
+			if (vcpu->arch.dawrx1 != host_dawrx1)
+				mtspr(SPRN_DAWRX1, vcpu->arch.dawrx1);
 		}
 	}
-	mtspr(SPRN_CIABR, vcpu->arch.ciabr);
-	mtspr(SPRN_IC, vcpu->arch.ic);
+	if (vcpu->arch.ciabr != host_ciabr)
+		mtspr(SPRN_CIABR, vcpu->arch.ciabr);
 
 	mtspr(SPRN_PSSCR, vcpu->arch.psscr | PSSCR_EC |
 	      (local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
@@ -833,17 +838,6 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	vc->dpdes = mfspr(SPRN_DPDES);
 	vc->vtb = mfspr(SPRN_VTB);
 
-	save_clear_guest_mmu(kvm, vcpu);
-	switch_mmu_to_host(kvm, host_pidr);
-
-	/*
-	 * If we are in real mode, only switch MMU on after the MMU is
-	 * switched to host, to avoid the P9_RADIX_PREFETCH_BUG.
-	 */
-	__mtmsrd(msr, 0);
-
-	store_vcpu_state(vcpu);
-
 	dec = mfspr(SPRN_DEC);
 	if (!(lpcr & LPCR_LD)) /* Sign extend if not using large decrementer */
 		dec = (s32) dec;
@@ -861,6 +855,19 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 		vc->tb_offset_applied = 0;
 	}
 
+	save_clear_guest_mmu(kvm, vcpu);
+	switch_mmu_to_host(kvm, host_pidr);
+
+	/*
+	 * Enable MSR here in order to have facilities enabled to save
+	 * guest registers. This enables MMU (if we were in realmode), so
+	 * only switch MMU on after the MMU is switched to host, to avoid
+	 * the P9_RADIX_PREFETCH_BUG or hash guest context.
+	 */
+	__mtmsrd(msr, 0);
+
+	store_vcpu_state(vcpu);
+
 	mtspr(SPRN_PURR, local_paca->kvm_hstate.host_purr);
 	mtspr(SPRN_SPURR, local_paca->kvm_hstate.host_spurr);
 
@@ -868,15 +875,21 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	mtspr(SPRN_PSSCR, host_psscr |
 	      (local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
 	mtspr(SPRN_HFSCR, host_hfscr);
-	mtspr(SPRN_CIABR, host_ciabr);
-	mtspr(SPRN_DAWR0, host_dawr0);
-	mtspr(SPRN_DAWRX0, host_dawrx0);
+	if (vcpu->arch.ciabr != host_ciabr)
+		mtspr(SPRN_CIABR, host_ciabr);
+	if (vcpu->arch.dawr0 != host_dawr0)
+		mtspr(SPRN_DAWR0, host_dawr0);
+	if (vcpu->arch.dawrx0 != host_dawrx0)
+		mtspr(SPRN_DAWRX0, host_dawrx0);
 	if (cpu_has_feature(CPU_FTR_DAWR1)) {
-		mtspr(SPRN_DAWR1, host_dawr1);
-		mtspr(SPRN_DAWRX1, host_dawrx1);
+		if (vcpu->arch.dawr1 != host_dawr1)
+			mtspr(SPRN_DAWR1, host_dawr1);
+		if (vcpu->arch.dawrx1 != host_dawrx1)
+			mtspr(SPRN_DAWRX1, host_dawrx1);
 	}
 
-	mtspr(SPRN_DPDES, 0);
+	if (vc->dpdes)
+		mtspr(SPRN_DPDES, 0);
 	if (vc->pcr)
 		mtspr(SPRN_PCR, PCR_MASK);
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 34/43] KVM: PPC: Book3S HV P9: Demand fault EBB facility registers
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Use HFSCR facility disabling to implement demand faulting for EBB, with
a hysteresis counter similar to the load_fp etc counters in context
switching that implement the equivalent demand faulting for userspace
facilities.

This speeds up guest entry/exit by avoiding the register save/restore
when a guest is not frequently using them. When a guest does use them
often, there will be some additional demand fault overhead, but these
are not commonly used facilities.
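
Roughly, the flow looks like this (a sketch using the names from this
patch, not the literal hunks): the facility-unavailable interrupt
re-grants EBB to the guest, and the exit path only saves the EBB SPRs
while the grant is held, dropping it again when the u8 counter wraps so
a guest that has stopped using EBB eventually stops paying the
save/restore cost:

	/* HV facility unavailable interrupt, cause == FSCR_EBB_LG */
	static int kvmppc_ebb_unavailable(struct kvm_vcpu *vcpu)
	{
		vcpu->arch.hfscr |= HFSCR_EBB;
		return RESUME_GUEST;
	}

	/* guest exit path */
	if (vcpu->arch.hfscr & HFSCR_EBB) {
		vcpu->arch.ebbhr = mfspr(SPRN_EBBHR);
		vcpu->arch.ebbrr = mfspr(SPRN_EBBRR);
		vcpu->arch.bescr = mfspr(SPRN_BESCR);
		vcpu->arch.load_ebb++;
		if (!vcpu->arch.load_ebb)	/* u8 wrapped */
			vcpu->arch.hfscr &= ~HFSCR_EBB;
	}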

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/kvm_host.h   |  1 +
 arch/powerpc/kvm/book3s_hv.c          | 11 +++++++++++
 arch/powerpc/kvm/book3s_hv_nested.c   |  3 ++-
 arch/powerpc/kvm/book3s_hv_p9_entry.c | 26 ++++++++++++++++++++------
 4 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 118b388ea887..bee95106c1f2 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -585,6 +585,7 @@ struct kvm_vcpu_arch {
 	ulong cfar;
 	ulong ppr;
 	u32 pspb;
+	u8 load_ebb;
 	ulong fscr;
 	ulong shadow_fscr;
 	ulong ebbhr;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index ae528eb37792..99e9da078e7d 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1366,6 +1366,13 @@ static int kvmppc_pmu_unavailable(struct kvm_vcpu *vcpu)
 	return RESUME_GUEST;
 }
 
+static int kvmppc_ebb_unavailable(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.hfscr |= HFSCR_EBB;
+
+	return RESUME_GUEST;
+}
+
 static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu,
 				 struct task_struct *tsk)
 {
@@ -1645,6 +1652,8 @@ XXX benchmark guest exits
 				r = kvmppc_emulate_doorbell_instr(vcpu);
 			if (cause == FSCR_PM_LG)
 				r = kvmppc_pmu_unavailable(vcpu);
+			if (cause == FSCR_EBB_LG)
+				r = kvmppc_ebb_unavailable(vcpu);
 		}
 		if (r == EMULATE_FAIL) {
 			kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
@@ -1764,6 +1773,8 @@ static int kvmppc_handle_nested_exit(struct kvm_vcpu *vcpu)
 		r = EMULATE_FAIL;
 		if (cause == FSCR_PM_LG && (vcpu->arch.nested_hfscr & HFSCR_PM))
 			r = kvmppc_pmu_unavailable(vcpu);
+		if (cause == FSCR_EBB_LG && (vcpu->arch.nested_hfscr & HFSCR_EBB))
+			r = kvmppc_ebb_unavailable(vcpu);
 
 		if (r == EMULATE_FAIL)
 			r = RESUME_HOST;
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index 024b0ce5b702..ee8668f056f9 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -168,7 +168,8 @@ static void sanitise_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr)
 	 * but preserve the interrupt cause field and facilities that might
 	 * be disabled for demand faulting in the L1.
 	 */
-	hr->hfscr &= (HFSCR_INTR_CAUSE | HFSCR_PM | vcpu->arch.hfscr);
+	hr->hfscr &= (HFSCR_INTR_CAUSE | HFSCR_PM | HFSCR_EBB |
+			vcpu->arch.hfscr);
 
 	/* Don't let data address watchpoint match in hypervisor state */
 	hr->dawrx0 &= ~DAWRX_HYP;
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 4d1a2d1ff4c1..cf41261daa97 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -218,9 +218,12 @@ static void load_spr_state(struct kvm_vcpu *vcpu,
 				struct p9_host_os_sprs *host_os_sprs)
 {
 	mtspr(SPRN_TAR, vcpu->arch.tar);
-	mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
-	mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
-	mtspr(SPRN_BESCR, vcpu->arch.bescr);
+
+	if (vcpu->arch.hfscr & HFSCR_EBB) {
+		mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
+		mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
+		mtspr(SPRN_BESCR, vcpu->arch.bescr);
+	}
 
 	if (!cpu_has_feature(CPU_FTR_ARCH_31))
 		mtspr(SPRN_TIDR, vcpu->arch.tid);
@@ -251,9 +254,20 @@ static void load_spr_state(struct kvm_vcpu *vcpu,
 static void store_spr_state(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.tar = mfspr(SPRN_TAR);
-	vcpu->arch.ebbhr = mfspr(SPRN_EBBHR);
-	vcpu->arch.ebbrr = mfspr(SPRN_EBBRR);
-	vcpu->arch.bescr = mfspr(SPRN_BESCR);
+
+	if (vcpu->arch.hfscr & HFSCR_EBB) {
+		vcpu->arch.ebbhr = mfspr(SPRN_EBBHR);
+		vcpu->arch.ebbrr = mfspr(SPRN_EBBRR);
+		vcpu->arch.bescr = mfspr(SPRN_BESCR);
+		/*
+		 * This is like load_fp in context switching, turn off the
+		 * facility after it wraps the u8 to try avoiding saving
+		 * and restoring the registers each partition switch.
+		 */
+		vcpu->arch.load_ebb++;
+		if (!vcpu->arch.load_ebb)
+			vcpu->arch.hfscr &= ~HFSCR_EBB;
+	}
 
 	if (!cpu_has_feature(CPU_FTR_ARCH_31))
 		vcpu->arch.tid = mfspr(SPRN_TIDR);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 35/43] KVM: PPC: Book3S HV P9: Demand fault TM facility registers
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Use HFSCR facility disabling to implement demand faulting for TM, with
a hysteresis counter similar to the load_fp etc counters in context
switching that implement the equivalent demand faulting for userspace
facilities.

This speeds up guest entry/exit by avoiding the register save/restore
when a guest is not frequently using them. When a guest does use them
often, there will be some additional demand fault overhead, but these
are not commonly used facilities.

-304 cycles (6681) POWER9 virt-mode NULL hcall with the previous patch
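
Besides the HFSCR handling (which mirrors the EBB patch), the host MSR
setup is also gated on the grant, so TM state is only made accessible
when there is actually guest TM state to switch. A sketch of that check
as used in the patch:

	if ((cpu_has_feature(CPU_FTR_TM) ||
	     cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
			(vcpu->arch.hfscr & HFSCR_TM))
		msr |= MSR_TM;
	msr = msr_check_and_set(msr);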

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/kvm_host.h   |  1 +
 arch/powerpc/kvm/book3s_hv.c          | 21 +++++++++++++++++----
 arch/powerpc/kvm/book3s_hv_nested.c   |  2 +-
 arch/powerpc/kvm/book3s_hv_p9_entry.c | 18 ++++++++++++------
 4 files changed, 31 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index bee95106c1f2..d79f0b1b1578 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -586,6 +586,7 @@ struct kvm_vcpu_arch {
 	ulong ppr;
 	u32 pspb;
 	u8 load_ebb;
+	u8 load_tm;
 	ulong fscr;
 	ulong shadow_fscr;
 	ulong ebbhr;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 99e9da078e7d..2430725f29f7 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1373,6 +1373,13 @@ static int kvmppc_ebb_unavailable(struct kvm_vcpu *vcpu)
 	return RESUME_GUEST;
 }
 
+static int kvmppc_tm_unavailable(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.hfscr |= HFSCR_TM;
+
+	return RESUME_GUEST;
+}
+
 static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu,
 				 struct task_struct *tsk)
 {
@@ -1654,6 +1661,8 @@ XXX benchmark guest exits
 				r = kvmppc_pmu_unavailable(vcpu);
 			if (cause == FSCR_EBB_LG)
 				r = kvmppc_ebb_unavailable(vcpu);
+			if (cause == FSCR_TM_LG)
+				r = kvmppc_tm_unavailable(vcpu);
 		}
 		if (r == EMULATE_FAIL) {
 			kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
@@ -1775,6 +1784,8 @@ static int kvmppc_handle_nested_exit(struct kvm_vcpu *vcpu)
 			r = kvmppc_pmu_unavailable(vcpu);
 		if (cause == FSCR_EBB_LG && (vcpu->arch.nested_hfscr & HFSCR_EBB))
 			r = kvmppc_ebb_unavailable(vcpu);
+		if (cause == FSCR_TM_LG && (vcpu->arch.nested_hfscr & HFSCR_TM))
+			r = kvmppc_tm_unavailable(vcpu);
 
 		if (r == EMULATE_FAIL)
 			r = RESUME_HOST;
@@ -3737,8 +3748,9 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
 		msr |= MSR_VEC;
 	if (cpu_has_feature(CPU_FTR_VSX))
 		msr |= MSR_VSX;
-	if (cpu_has_feature(CPU_FTR_TM) ||
-	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+	if ((cpu_has_feature(CPU_FTR_TM) ||
+	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
+			(vcpu->arch.hfscr & HFSCR_TM))
 		msr |= MSR_TM;
 	msr = msr_check_and_set(msr);
 
@@ -4453,8 +4465,9 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
 		msr |= MSR_VEC;
 	if (cpu_has_feature(CPU_FTR_VSX))
 		msr |= MSR_VSX;
-	if (cpu_has_feature(CPU_FTR_TM) ||
-	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+	if ((cpu_has_feature(CPU_FTR_TM) ||
+	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
+			(vcpu->arch.hfscr & HFSCR_TM))
 		msr |= MSR_TM;
 	msr = msr_check_and_set(msr);
 
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index ee8668f056f9..5a534f7924f2 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -168,7 +168,7 @@ static void sanitise_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr)
 	 * but preserve the interrupt cause field and facilities that might
 	 * be disabled for demand faulting in the L1.
 	 */
-	hr->hfscr &= (HFSCR_INTR_CAUSE | HFSCR_PM | HFSCR_EBB |
+	hr->hfscr &= (HFSCR_INTR_CAUSE | HFSCR_PM | HFSCR_TM | HFSCR_EBB |
 			vcpu->arch.hfscr);
 
 	/* Don't let data address watchpoint match in hypervisor state */
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index cf41261daa97..653f2765a399 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -284,8 +284,9 @@ static void store_spr_state(struct kvm_vcpu *vcpu)
 void load_vcpu_state(struct kvm_vcpu *vcpu,
 			   struct p9_host_os_sprs *host_os_sprs)
 {
-	if (cpu_has_feature(CPU_FTR_TM) ||
-	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) {
+	if ((cpu_has_feature(CPU_FTR_TM) ||
+	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
+		       (vcpu->arch.hfscr & HFSCR_TM)) {
 		unsigned long msr = vcpu->arch.shregs.msr;
 		if (MSR_TM_ACTIVE(msr)) {
 			kvmppc_restore_tm_hv(vcpu, msr, true);
@@ -316,8 +317,9 @@ void store_vcpu_state(struct kvm_vcpu *vcpu)
 #endif
 	vcpu->arch.vrsave = mfspr(SPRN_VRSAVE);
 
-	if (cpu_has_feature(CPU_FTR_TM) ||
-	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) {
+	if ((cpu_has_feature(CPU_FTR_TM) ||
+	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
+		       (vcpu->arch.hfscr & HFSCR_TM)) {
 		unsigned long msr = vcpu->arch.shregs.msr;
 		if (MSR_TM_ACTIVE(msr)) {
 			kvmppc_save_tm_hv(vcpu, msr, true);
@@ -326,6 +328,9 @@ void store_vcpu_state(struct kvm_vcpu *vcpu)
 			vcpu->arch.tfhar = mfspr(SPRN_TFHAR);
 			vcpu->arch.tfiar = mfspr(SPRN_TFIAR);
 		}
+		vcpu->arch.load_tm++; /* see load_ebb comment for details */
+		if (!vcpu->arch.load_tm)
+			vcpu->arch.hfscr &= ~HFSCR_TM;
 	}
 }
 EXPORT_SYMBOL_GPL(store_vcpu_state);
@@ -615,8 +620,9 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 		msr |= MSR_VEC;
 	if (cpu_has_feature(CPU_FTR_VSX))
 		msr |= MSR_VSX;
-	if (cpu_has_feature(CPU_FTR_TM) ||
-	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
+	if ((cpu_has_feature(CPU_FTR_TM) ||
+	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
+			(vcpu->arch.hfscr & HFSCR_TM))
 		msr |= MSR_TM;
 	msr = msr_check_and_set(msr);
 	/* Save MSR for restore. This is after hard disable, so EE is clear. */
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 36/43] KVM: PPC: Book3S HV P9: Use Linux SPR save/restore to manage some host SPRs
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Linux implements SPR save/restore including storage space for registers
in the task struct for process context switching. Make use of this
similarly to the way we make use of the context-switching fp/vec
save/restore.

This improves code reuse, allows some stack space to be saved, and helps
with avoiding VRSAVE updates if they are not required.

-61 cycles (6620) POWER9 virt-mode NULL hcall
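
The idea, roughly (names as used in the patch): flush the host SPR
values into current->thread with the existing save_sprs() machinery
before entry, then on exit compare guest values against current->thread
and only write back the SPRs that actually differ. The thread struct
must hold host values again before any point where the task could be
context switched, otherwise the switch would save guest values into it.

	/* before guest entry: host SPRs -> current->thread */
	kvmppc_save_current_sprs();	/* save_sprs(&current->thread) */

	/* on the way out: only restore what the guest changed */
	if (current->thread.fscr != vcpu->arch.fscr)
		mtspr(SPRN_FSCR, current->thread.fscr);
	if (current->thread.dscr != vcpu->arch.dscr)
		mtspr(SPRN_DSCR, current->thread.dscr);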

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/switch_to.h  |  2 +
 arch/powerpc/kernel/process.c         |  6 ++
 arch/powerpc/kvm/book3s_hv.c          | 22 +------
 arch/powerpc/kvm/book3s_hv.h          |  3 -
 arch/powerpc/kvm/book3s_hv_p9_entry.c | 88 +++++++++++++++++++--------
 5 files changed, 72 insertions(+), 49 deletions(-)

diff --git a/arch/powerpc/include/asm/switch_to.h b/arch/powerpc/include/asm/switch_to.h
index 9d1fbd8be1c7..de17c45314bc 100644
--- a/arch/powerpc/include/asm/switch_to.h
+++ b/arch/powerpc/include/asm/switch_to.h
@@ -112,6 +112,8 @@ static inline void clear_task_ebb(struct task_struct *t)
 #endif
 }
 
+void kvmppc_save_current_sprs(void);
+
 extern int set_thread_tidr(struct task_struct *t);
 
 #endif /* _ASM_POWERPC_SWITCH_TO_H */
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index dfce089ac424..29b8fd9704be 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1175,6 +1175,12 @@ static inline void save_sprs(struct thread_struct *t)
 #endif
 }
 
+void kvmppc_save_current_sprs(void)
+{
+	save_sprs(&current->thread);
+}
+EXPORT_SYMBOL_GPL(kvmppc_save_current_sprs);
+
 static inline void restore_sprs(struct thread_struct *old_thread,
 				struct thread_struct *new_thread)
 {
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 2430725f29f7..c7cf771d3351 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4410,9 +4410,6 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
 	struct kvm_run *run = vcpu->run;
 	int r;
 	int srcu_idx;
-	unsigned long ebb_regs[3] = {};	/* shut up GCC */
-	unsigned long user_tar = 0;
-	unsigned int user_vrsave;
 	struct kvm *kvm;
 	unsigned long msr;
 
@@ -4473,14 +4470,7 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
 
 	save_user_regs_kvm();
 
-	/* Save userspace EBB and other register values */
-	if (cpu_has_feature(CPU_FTR_ARCH_207S)) {
-		ebb_regs[0] = mfspr(SPRN_EBBHR);
-		ebb_regs[1] = mfspr(SPRN_EBBRR);
-		ebb_regs[2] = mfspr(SPRN_BESCR);
-		user_tar = mfspr(SPRN_TAR);
-	}
-	user_vrsave = mfspr(SPRN_VRSAVE);
+	kvmppc_save_current_sprs();
 
 	vcpu->arch.waitp = &vcpu->arch.vcore->wait;
 	vcpu->arch.pgdir = kvm->mm->pgd;
@@ -4521,17 +4511,9 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
 		}
 	} while (is_kvmppc_resume_guest(r));
 
-	/* Restore userspace EBB and other register values */
-	if (cpu_has_feature(CPU_FTR_ARCH_207S)) {
-		mtspr(SPRN_EBBHR, ebb_regs[0]);
-		mtspr(SPRN_EBBRR, ebb_regs[1]);
-		mtspr(SPRN_BESCR, ebb_regs[2]);
-		mtspr(SPRN_TAR, user_tar);
-	}
-	mtspr(SPRN_VRSAVE, user_vrsave);
-
 	vcpu->arch.state = KVMPPC_VCPU_NOTREADY;
 	atomic_dec(&kvm->arch.vcpus_running);
+
 	return r;
 }
 
diff --git a/arch/powerpc/kvm/book3s_hv.h b/arch/powerpc/kvm/book3s_hv.h
index 72e3a8f4c2cf..c7ad1127462d 100644
--- a/arch/powerpc/kvm/book3s_hv.h
+++ b/arch/powerpc/kvm/book3s_hv.h
@@ -3,11 +3,8 @@
  * Privileged (non-hypervisor) host registers to save.
  */
 struct p9_host_os_sprs {
-	unsigned long dscr;
-	unsigned long tidr;
 	unsigned long iamr;
 	unsigned long amr;
-	unsigned long fscr;
 
 	unsigned int pmc1;
 	unsigned int pmc2;
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 653f2765a399..55286a8357f7 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -217,15 +217,26 @@ EXPORT_SYMBOL_GPL(switch_pmu_to_host);
 static void load_spr_state(struct kvm_vcpu *vcpu,
 				struct p9_host_os_sprs *host_os_sprs)
 {
+	/* TAR is very fast */
 	mtspr(SPRN_TAR, vcpu->arch.tar);
 
+#ifdef CONFIG_ALTIVEC
+	if (cpu_has_feature(CPU_FTR_ALTIVEC) &&
+	    current->thread.vrsave != vcpu->arch.vrsave)
+		mtspr(SPRN_VRSAVE, vcpu->arch.vrsave);
+#endif
+
 	if (vcpu->arch.hfscr & HFSCR_EBB) {
-		mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
-		mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
-		mtspr(SPRN_BESCR, vcpu->arch.bescr);
+		if (current->thread.ebbhr != vcpu->arch.ebbhr)
+			mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
+		if (current->thread.ebbrr != vcpu->arch.ebbrr)
+			mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
+		if (current->thread.bescr != vcpu->arch.bescr)
+			mtspr(SPRN_BESCR, vcpu->arch.bescr);
 	}
 
-	if (!cpu_has_feature(CPU_FTR_ARCH_31))
+	if (!cpu_has_feature(CPU_FTR_ARCH_31) &&
+			current->thread.tidr != vcpu->arch.tid)
 		mtspr(SPRN_TIDR, vcpu->arch.tid);
 	if (host_os_sprs->iamr != vcpu->arch.iamr)
 		mtspr(SPRN_IAMR, vcpu->arch.iamr);
@@ -233,9 +244,9 @@ static void load_spr_state(struct kvm_vcpu *vcpu,
 		mtspr(SPRN_AMR, vcpu->arch.amr);
 	if (vcpu->arch.uamor != 0)
 		mtspr(SPRN_UAMOR, vcpu->arch.uamor);
-	if (host_os_sprs->fscr != vcpu->arch.fscr)
+	if (current->thread.fscr != vcpu->arch.fscr)
 		mtspr(SPRN_FSCR, vcpu->arch.fscr);
-	if (host_os_sprs->dscr != vcpu->arch.dscr)
+	if (current->thread.dscr != vcpu->arch.dscr)
 		mtspr(SPRN_DSCR, vcpu->arch.dscr);
 	if (vcpu->arch.pspb != 0)
 		mtspr(SPRN_PSPB, vcpu->arch.pspb);
@@ -255,18 +266,15 @@ static void store_spr_state(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.tar = mfspr(SPRN_TAR);
 
+#ifdef CONFIG_ALTIVEC
+	if (cpu_has_feature(CPU_FTR_ALTIVEC))
+		vcpu->arch.vrsave = mfspr(SPRN_VRSAVE);
+#endif
+
 	if (vcpu->arch.hfscr & HFSCR_EBB) {
 		vcpu->arch.ebbhr = mfspr(SPRN_EBBHR);
 		vcpu->arch.ebbrr = mfspr(SPRN_EBBRR);
 		vcpu->arch.bescr = mfspr(SPRN_BESCR);
-		/*
-		 * This is like load_fp in context switching, turn off the
-		 * facility after it wraps the u8 to try avoiding saving
-		 * and restoring the registers each partition switch.
-		 */
-		vcpu->arch.load_ebb++;
-		if (!vcpu->arch.load_ebb)
-			vcpu->arch.hfscr &= ~HFSCR_EBB;
 	}
 
 	if (!cpu_has_feature(CPU_FTR_ARCH_31))
@@ -303,7 +311,6 @@ void load_vcpu_state(struct kvm_vcpu *vcpu,
 #ifdef CONFIG_ALTIVEC
 	load_vr_state(&vcpu->arch.vr);
 #endif
-	mtspr(SPRN_VRSAVE, vcpu->arch.vrsave);
 }
 EXPORT_SYMBOL_GPL(load_vcpu_state);
 
@@ -315,7 +322,6 @@ void store_vcpu_state(struct kvm_vcpu *vcpu)
 #ifdef CONFIG_ALTIVEC
 	store_vr_state(&vcpu->arch.vr);
 #endif
-	vcpu->arch.vrsave = mfspr(SPRN_VRSAVE);
 
 	if ((cpu_has_feature(CPU_FTR_TM) ||
 	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
@@ -337,12 +343,8 @@ EXPORT_SYMBOL_GPL(store_vcpu_state);
 
 void save_p9_host_os_sprs(struct p9_host_os_sprs *host_os_sprs)
 {
-	if (!cpu_has_feature(CPU_FTR_ARCH_31))
-		host_os_sprs->tidr = mfspr(SPRN_TIDR);
 	host_os_sprs->iamr = mfspr(SPRN_IAMR);
 	host_os_sprs->amr = mfspr(SPRN_AMR);
-	host_os_sprs->fscr = mfspr(SPRN_FSCR);
-	host_os_sprs->dscr = mfspr(SPRN_DSCR);
 }
 EXPORT_SYMBOL_GPL(save_p9_host_os_sprs);
 
@@ -350,26 +352,60 @@ EXPORT_SYMBOL_GPL(save_p9_host_os_sprs);
 void restore_p9_host_os_sprs(struct kvm_vcpu *vcpu,
 				    struct p9_host_os_sprs *host_os_sprs)
 {
+	/*
+	 * current->thread.xxx registers must all be restored to host
+	 * values before a potential context switch, otherwise the context
+	 * switch itself will overwrite current->thread.xxx with the values
+	 * from the guest SPRs.
+	 */
+
 	mtspr(SPRN_SPRG_VDSO_WRITE, local_paca->sprg_vdso);
 
-	if (!cpu_has_feature(CPU_FTR_ARCH_31))
-		mtspr(SPRN_TIDR, host_os_sprs->tidr);
+	if (!cpu_has_feature(CPU_FTR_ARCH_31) &&
+			current->thread.tidr != vcpu->arch.tid)
+		mtspr(SPRN_TIDR, current->thread.tidr);
 	if (host_os_sprs->iamr != vcpu->arch.iamr)
 		mtspr(SPRN_IAMR, host_os_sprs->iamr);
 	if (vcpu->arch.uamor != 0)
 		mtspr(SPRN_UAMOR, 0);
 	if (host_os_sprs->amr != vcpu->arch.amr)
 		mtspr(SPRN_AMR, host_os_sprs->amr);
-	if (host_os_sprs->fscr != vcpu->arch.fscr)
-		mtspr(SPRN_FSCR, host_os_sprs->fscr);
-	if (host_os_sprs->dscr != vcpu->arch.dscr)
-		mtspr(SPRN_DSCR, host_os_sprs->dscr);
+	if (current->thread.fscr != vcpu->arch.fscr)
+		mtspr(SPRN_FSCR, current->thread.fscr);
+	if (current->thread.dscr != vcpu->arch.dscr)
+		mtspr(SPRN_DSCR, current->thread.dscr);
 	if (vcpu->arch.pspb != 0)
 		mtspr(SPRN_PSPB, 0);
 
 	/* Save guest CTRL register, set runlatch to 1 */
 	if (!(vcpu->arch.ctrl & 1))
 		mtspr(SPRN_CTRLT, 1);
+
+#ifdef CONFIG_ALTIVEC
+	if (cpu_has_feature(CPU_FTR_ALTIVEC) &&
+	    vcpu->arch.vrsave != current->thread.vrsave)
+		mtspr(SPRN_VRSAVE, current->thread.vrsave);
+#endif
+	if (vcpu->arch.hfscr & HFSCR_EBB) {
+		if (vcpu->arch.bescr != current->thread.bescr)
+			mtspr(SPRN_BESCR, current->thread.bescr);
+		if (vcpu->arch.ebbhr != current->thread.ebbhr)
+			mtspr(SPRN_EBBHR, current->thread.ebbhr);
+		if (vcpu->arch.ebbrr != current->thread.ebbrr)
+			mtspr(SPRN_EBBRR, current->thread.ebbrr);
+
+		/*
+		 * This is like load_fp in context switching, turn off the
+		 * facility after it wraps the u8 to try avoiding saving
+		 * and restoring the registers each partition switch.
+		 */
+		vcpu->arch.load_ebb++;
+		if (!vcpu->arch.load_ebb)
+			vcpu->arch.hfscr &= ~HFSCR_EBB;
+	}
+
+	if (vcpu->arch.tar != current->thread.tar)
+		mtspr(SPRN_TAR, current->thread.tar);
 }
 EXPORT_SYMBOL_GPL(restore_p9_host_os_sprs);
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 37/43] KVM: PPC: Book3S HV P9: Comment and fix MMU context switching code
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Tighten up partition switching code synchronisation and comments.

In particular, hwsync ; isync is required after the last access that is
performed in the context of a partition, before the partition is
switched away from.

-301 cycles (6319) POWER9 virt-mode NULL hcall
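
The resulting ordering when switching away from guest context, in
outline (a sketch of switch_mmu_to_host() as changed below): a single
hwsync drains stores performed under the guest LPID/PID so the
not-my-LPAR tlbie logic cannot miss them, and the trailing isync is
dropped because a context synchronising mtmsrd (or hrfid on the entry
side) follows the switch anyway.

	asm volatile("hwsync" ::: "memory");
	isync();
	mtspr(SPRN_PID, pid);
	mtspr(SPRN_LPID, kvm->arch.host_lpid);
	mtspr(SPRN_LPCR, kvm->arch.host_lpcr);
	/* no isync: the following mtmsrd with L=0 is context synchronising */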

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_64_mmu_radix.c |  4 +++
 arch/powerpc/kvm/book3s_hv_p9_entry.c  | 40 +++++++++++++++++++-------
 2 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index d909c069363e..5a6ab0a61b68 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -53,6 +53,8 @@ unsigned long __kvmhv_copy_tofrom_guest_radix(int lpid, int pid,
 
 	preempt_disable();
 
+	asm volatile("hwsync" ::: "memory");
+	isync();
 	/* switch the lpid first to avoid running host with unallocated pid */
 	old_lpid = mfspr(SPRN_LPID);
 	if (old_lpid != lpid)
@@ -69,6 +71,8 @@ unsigned long __kvmhv_copy_tofrom_guest_radix(int lpid, int pid,
 	else
 		ret = copy_to_user_nofault((void __user *)to, from, n);
 
+	asm volatile("hwsync" ::: "memory");
+	isync();
 	/* switch the pid first to avoid running host with unallocated pid */
 	if (quadrant == 1 && pid != old_pid)
 		mtspr(SPRN_PID, old_pid);
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 55286a8357f7..7aa72efcac6c 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -503,17 +503,19 @@ static void switch_mmu_to_guest_radix(struct kvm *kvm, struct kvm_vcpu *vcpu, u6
 	lpid = nested ? nested->shadow_lpid : kvm->arch.lpid;
 
 	/*
-	 * All the isync()s are overkill but trivially follow the ISA
-	 * requirements. Some can likely be replaced with justification
-	 * comment for why they are not needed.
+	 * Prior memory accesses to host PID Q3 must be completed before we
+	 * start switching, and stores must be drained to avoid not-my-LPAR
+	 * logic (see switch_mmu_to_host).
 	 */
+	asm volatile("hwsync" ::: "memory");
 	isync();
 	mtspr(SPRN_LPID, lpid);
-	isync();
 	mtspr(SPRN_LPCR, lpcr);
-	isync();
 	mtspr(SPRN_PID, vcpu->arch.pid);
-	isync();
+	/*
+	 * isync not required here because we are HRFID'ing to guest before
+	 * any guest context access, which is context synchronising.
+	 */
 }
 
 static void switch_mmu_to_guest_hpt(struct kvm *kvm, struct kvm_vcpu *vcpu, u64 lpcr)
@@ -523,25 +525,41 @@ static void switch_mmu_to_guest_hpt(struct kvm *kvm, struct kvm_vcpu *vcpu, u64
 
 	lpid = kvm->arch.lpid;
 
+	/*
+	 * See switch_mmu_to_guest_radix. ptesync should not be required here
+	 * even if the host is in HPT mode because speculative accesses would
+	 * not cause RC updates (we are in real mode).
+	 */
+	asm volatile("hwsync" ::: "memory");
+	isync();
 	mtspr(SPRN_LPID, lpid);
 	mtspr(SPRN_LPCR, lpcr);
 	mtspr(SPRN_PID, vcpu->arch.pid);
 
 	for (i = 0; i < vcpu->arch.slb_max; i++)
 		mtslb(vcpu->arch.slb[i].orige, vcpu->arch.slb[i].origv);
-
-	isync();
+	/*
+	 * isync not required here, see switch_mmu_to_guest_radix.
+	 */
 }
 
 static void switch_mmu_to_host(struct kvm *kvm, u32 pid)
 {
+	/*
+	 * The guest has exited, so guest MMU context is no longer being
+	 * non-speculatively accessed, but a hwsync is needed before the
+	 * mtLPIDR / mtPIDR switch, in order to ensure all stores are drained,
+	 * so the not-my-LPAR tlbie logic does not overlook them.
+	 */
+	asm volatile("hwsync" ::: "memory");
 	isync();
 	mtspr(SPRN_PID, pid);
-	isync();
 	mtspr(SPRN_LPID, kvm->arch.host_lpid);
-	isync();
 	mtspr(SPRN_LPCR, kvm->arch.host_lpcr);
-	isync();
+	/*
+	 * isync is not required after the switch, because mtmsrd with L=0
+	 * is performed after this switch, which is context synchronising.
+	 */
 
 	if (!radix_enabled())
 		slb_restore_bolted_realmode();
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 38/43] KVM: PPC: Book3S HV P9: Test dawr_enabled() before saving host DAWR SPRs
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Some of the DAWR SPR accesses are already predicated on dawr_enabled();
apply this to the remainder of the accesses.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv_p9_entry.c | 34 ++++++++++++++++-----------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 7aa72efcac6c..f305d1d6445c 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -638,13 +638,16 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 
 	host_hfscr = mfspr(SPRN_HFSCR);
 	host_ciabr = mfspr(SPRN_CIABR);
-	host_dawr0 = mfspr(SPRN_DAWR0);
-	host_dawrx0 = mfspr(SPRN_DAWRX0);
 	host_psscr = mfspr(SPRN_PSSCR);
 	host_pidr = mfspr(SPRN_PID);
-	if (cpu_has_feature(CPU_FTR_DAWR1)) {
-		host_dawr1 = mfspr(SPRN_DAWR1);
-		host_dawrx1 = mfspr(SPRN_DAWRX1);
+
+	if (dawr_enabled()) {
+		host_dawr0 = mfspr(SPRN_DAWR0);
+		host_dawrx0 = mfspr(SPRN_DAWRX0);
+		if (cpu_has_feature(CPU_FTR_DAWR1)) {
+			host_dawr1 = mfspr(SPRN_DAWR1);
+			host_dawrx1 = mfspr(SPRN_DAWRX1);
+		}
 	}
 
 	local_paca->kvm_hstate.host_purr = mfspr(SPRN_PURR);
@@ -951,15 +954,18 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	mtspr(SPRN_HFSCR, host_hfscr);
 	if (vcpu->arch.ciabr != host_ciabr)
 		mtspr(SPRN_CIABR, host_ciabr);
-	if (vcpu->arch.dawr0 != host_dawr0)
-		mtspr(SPRN_DAWR0, host_dawr0);
-	if (vcpu->arch.dawrx0 != host_dawrx0)
-		mtspr(SPRN_DAWRX0, host_dawrx0);
-	if (cpu_has_feature(CPU_FTR_DAWR1)) {
-		if (vcpu->arch.dawr1 != host_dawr1)
-			mtspr(SPRN_DAWR1, host_dawr1);
-		if (vcpu->arch.dawrx1 != host_dawrx1)
-			mtspr(SPRN_DAWRX1, host_dawrx1);
+
+	if (dawr_enabled()) {
+		if (vcpu->arch.dawr0 != host_dawr0)
+			mtspr(SPRN_DAWR0, host_dawr0);
+		if (vcpu->arch.dawrx0 != host_dawrx0)
+			mtspr(SPRN_DAWRX0, host_dawrx0);
+		if (cpu_has_feature(CPU_FTR_DAWR1)) {
+			if (vcpu->arch.dawr1 != host_dawr1)
+				mtspr(SPRN_DAWR1, host_dawr1);
+			if (vcpu->arch.dawrx1 != host_dawrx1)
+				mtspr(SPRN_DAWRX1, host_dawrx1);
+		}
 	}
 
 	if (vc->dpdes)
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 39/43] KVM: PPC: Book3S HV P9: Don't restore PSSCR if not needed
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

This also moves the PSSCR update in nested entry to avoid an SPR
scoreboard stall.

-45 cycles (6276) POWER9 virt-mode NULL hcall

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c          |  7 +++++--
 arch/powerpc/kvm/book3s_hv_p9_entry.c | 26 +++++++++++++++++++-------
 2 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index c7cf771d3351..91bbd0a8f6b6 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3756,7 +3756,9 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
 
 	load_vcpu_state(vcpu, &host_os_sprs);
 
-	mtspr(SPRN_PSSCR_PR, vcpu->arch.psscr);
+	if (vcpu->arch.psscr != host_psscr)
+		mtspr(SPRN_PSSCR_PR, vcpu->arch.psscr);
+
 	kvmhv_save_hv_regs(vcpu, &hvregs);
 	hvregs.lpcr = lpcr;
 	vcpu->arch.regs.msr = vcpu->arch.shregs.msr;
@@ -3797,7 +3799,6 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
 	vcpu->arch.shregs.dar = mfspr(SPRN_DAR);
 	vcpu->arch.shregs.dsisr = mfspr(SPRN_DSISR);
 	vcpu->arch.psscr = mfspr(SPRN_PSSCR_PR);
-	mtspr(SPRN_PSSCR_PR, host_psscr);
 
 	store_vcpu_state(vcpu);
 
@@ -3810,6 +3811,8 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
 	timer_rearm_host_dec(*tb);
 
 	restore_p9_host_os_sprs(vcpu, &host_os_sprs);
+	if (vcpu->arch.psscr != host_psscr)
+		mtspr(SPRN_PSSCR_PR, host_psscr);
 
 	return trap;
 }
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index f305d1d6445c..4bab56c10254 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -621,6 +621,7 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	unsigned long host_dawr0;
 	unsigned long host_dawrx0;
 	unsigned long host_psscr;
+	unsigned long host_hpsscr;
 	unsigned long host_pidr;
 	unsigned long host_dawr1;
 	unsigned long host_dawrx1;
@@ -638,7 +639,9 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 
 	host_hfscr = mfspr(SPRN_HFSCR);
 	host_ciabr = mfspr(SPRN_CIABR);
-	host_psscr = mfspr(SPRN_PSSCR);
+	host_psscr = mfspr(SPRN_PSSCR_PR);
+	if (cpu_has_feature(CPU_FTRS_POWER9_DD2_2))
+		host_hpsscr = mfspr(SPRN_PSSCR);
 	host_pidr = mfspr(SPRN_PID);
 
 	if (dawr_enabled()) {
@@ -719,8 +722,14 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	if (vcpu->arch.ciabr != host_ciabr)
 		mtspr(SPRN_CIABR, vcpu->arch.ciabr);
 
-	mtspr(SPRN_PSSCR, vcpu->arch.psscr | PSSCR_EC |
-	      (local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
+
+	if (cpu_has_feature(CPU_FTRS_POWER9_DD2_2)) {
+		mtspr(SPRN_PSSCR, vcpu->arch.psscr | PSSCR_EC |
+		      (local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
+	} else {
+		if (vcpu->arch.psscr != host_psscr)
+			mtspr(SPRN_PSSCR_PR, vcpu->arch.psscr);
+	}
 
 	mtspr(SPRN_HFSCR, vcpu->arch.hfscr);
 
@@ -905,7 +914,7 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 
 	vcpu->arch.ic = mfspr(SPRN_IC);
 	vcpu->arch.pid = mfspr(SPRN_PID);
-	vcpu->arch.psscr = mfspr(SPRN_PSSCR) & PSSCR_GUEST_VIS;
+	vcpu->arch.psscr = mfspr(SPRN_PSSCR_PR);
 
 	vcpu->arch.shregs.sprg0 = mfspr(SPRN_SPRG0);
 	vcpu->arch.shregs.sprg1 = mfspr(SPRN_SPRG1);
@@ -948,9 +957,12 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 	mtspr(SPRN_PURR, local_paca->kvm_hstate.host_purr);
 	mtspr(SPRN_SPURR, local_paca->kvm_hstate.host_spurr);
 
-	/* Preserve PSSCR[FAKE_SUSPEND] until we've called kvmppc_save_tm_hv */
-	mtspr(SPRN_PSSCR, host_psscr |
-	      (local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
+	if (cpu_has_feature(CPU_FTRS_POWER9_DD2_2)) {
+		/* Preserve PSSCR[FAKE_SUSPEND] until we've called kvmppc_save_tm_hv */
+		mtspr(SPRN_PSSCR, host_hpsscr |
+		      (local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
+	}
+
 	mtspr(SPRN_HFSCR, host_hfscr);
 	if (vcpu->arch.ciabr != host_ciabr)
 		mtspr(SPRN_CIABR, host_ciabr);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 40/43] KVM: PPC: Book3S HV P9: Avoid tlbsync sequence on radix guest exit
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Use the existing TLB flushing logic to IPI the previous CPU and run the
required radix GTSE barriers there before the guest vCPU is run on a new
physical CPU, to handle the case of an interrupted guest tlbie sequence.

This results in more IPIs than the TLB flush logic requires, but it's
a significant win for common case scheduling when the vCPU remains on
the same physical CPU.
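
The guest-visible requirement being preserved here is that a tlbie; eieio;
tlbsync; ptesync sequence completes on the CPU that issued it. A minimal
sketch of the completion half, as the migrate-away IPI might run it (for
illustration only; it mirrors the barriers added below rather than being
code taken from this patch):

	static inline void gtse_flush_completion_sketch(void)
	{
		/*
		 * A GTSE guest issues tlbie; eieio; tlbsync; ptesync. If the
		 * guest is interrupted before the final ptesync, these
		 * completion barriers must still run on the old physical CPU
		 * before the vCPU is dispatched elsewhere.
		 */
		asm volatile("eieio; tlbsync; ptesync" : : : "memory");
	}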

-522 cycles (5754) POWER9 virt-mode NULL hcall

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c          | 31 +++++++++++++++++++++++----
 arch/powerpc/kvm/book3s_hv_p9_entry.c |  9 --------
 2 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 91bbd0a8f6b6..9d8277a4c829 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2906,6 +2906,25 @@ static void radix_flush_cpu(struct kvm *kvm, int cpu, struct kvm_vcpu *vcpu)
 			smp_call_function_single(i, do_nothing, NULL, 1);
 }
 
+static void do_migrate_away_vcpu(void *arg)
+{
+	struct kvm_vcpu *vcpu = arg;
+	struct kvm *kvm = vcpu->kvm;
+
+	/*
+	 * If the guest has GTSE, it may execute tlbie, so do an eieio; tlbsync;
+	 * ptesync sequence on the old CPU before migrating to a new one, in
+	 * case we interrupted the guest in the middle of a tlbie; eieio;
+	 * tlbsync; ptesync sequence.
+	 *
+	 * Otherwise, ptesync is sufficient.
+	 */
+	if (kvm->arch.lpcr & LPCR_GTSE)
+		asm volatile("eieio; tlbsync; ptesync");
+	else
+		asm volatile("ptesync");
+}
+
 static void kvmppc_prepare_radix_vcpu(struct kvm_vcpu *vcpu, int pcpu)
 {
 	struct kvm_nested_guest *nested = vcpu->arch.nested;
@@ -2933,10 +2952,14 @@ static void kvmppc_prepare_radix_vcpu(struct kvm_vcpu *vcpu, int pcpu)
 	 * so we use a single bit in .need_tlb_flush for all 4 threads.
 	 */
 	if (prev_cpu != pcpu) {
-		if (prev_cpu >= 0 &&
-		    cpu_first_tlb_thread_sibling(prev_cpu) !=
-		    cpu_first_tlb_thread_sibling(pcpu))
-			radix_flush_cpu(kvm, prev_cpu, vcpu);
+		if (prev_cpu >= 0) {
+			if (cpu_first_tlb_thread_sibling(prev_cpu) !=
+			    cpu_first_tlb_thread_sibling(pcpu))
+				radix_flush_cpu(kvm, prev_cpu, vcpu);
+
+			smp_call_function_single(prev_cpu,
+					do_migrate_away_vcpu, vcpu, 1);
+		}
 		if (nested)
 			nested->prev_cpu[vcpu->arch.nested_vcpu_id] = pcpu;
 		else
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 4bab56c10254..48b0ce9e0c39 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -994,15 +994,6 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 
 	local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_NONE;
 
-	if (kvm_is_radix(kvm)) {
-		/*
-		 * Since this is radix, do a eieio; tlbsync; ptesync sequence
-		 * in case we interrupted the guest between a tlbie and a
-		 * ptesync.
-		 */
-		asm volatile("eieio; tlbsync; ptesync");
-	}
-
 	/*
 	 * cp_abort is required if the processor supports local copy-paste
 	 * to clear the copy buffer that was under control of the guest.
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 41/43] KVM: PPC: Book3S HV Nested: Avoid extra mftb() in nested entry
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

mftb() is expensive and one can be avoided on nested guest dispatch.

If the time checking code distinguishes between the L0 timer and the
nested HV timer, then both can be tested in the same place with the
same mftb() value.

This also nicely illustrates the relationship between the L0 and nested
HV timers.
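
A simplified sketch of the combined check (the variable names here are
illustrative rather than the exact ones used in the patch): both deadlines
are plain timebase values, so a single mftb() read can service both tests.

	static int check_deadlines_sketch(u64 host_dec_expiry, u64 l1_hdec_expiry)
	{
		u64 tb = mftb();		/* one timebase read */

		if (tb >= host_dec_expiry)	/* L0 host timer expired */
			return BOOK3S_INTERRUPT_HV_DECREMENTER;
		if (tb >= l1_hdec_expiry)	/* deadline requested by the L1 */
			return BOOK3S_INTERRUPT_NESTED_HV_DECREMENTER;
		return 0;			/* neither has expired yet */
	}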

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/kvm_asm.h  |  1 +
 arch/powerpc/kvm/book3s_hv.c        | 12 ++++++++++++
 arch/powerpc/kvm/book3s_hv_nested.c |  5 -----
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_asm.h b/arch/powerpc/include/asm/kvm_asm.h
index fbbf3cec92e9..d68d71987d5c 100644
--- a/arch/powerpc/include/asm/kvm_asm.h
+++ b/arch/powerpc/include/asm/kvm_asm.h
@@ -79,6 +79,7 @@
 #define BOOK3S_INTERRUPT_FP_UNAVAIL	0x800
 #define BOOK3S_INTERRUPT_DECREMENTER	0x900
 #define BOOK3S_INTERRUPT_HV_DECREMENTER	0x980
+#define BOOK3S_INTERRUPT_NESTED_HV_DECREMENTER	0x1980
 #define BOOK3S_INTERRUPT_DOORBELL	0xa00
 #define BOOK3S_INTERRUPT_SYSCALL	0xc00
 #define BOOK3S_INTERRUPT_TRACE		0xd00
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 9d8277a4c829..7cb9e87b50b7 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1410,6 +1410,10 @@ static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu,
 	run->ready_for_interrupt_injection = 1;
 	switch (vcpu->arch.trap) {
 	/* We're good on these - the host merely wanted to get our attention */
+	case BOOK3S_INTERRUPT_NESTED_HV_DECREMENTER:
+		WARN_ON_ONCE(1); /* Should never happen */
+		vcpu->arch.trap = BOOK3S_INTERRUPT_HV_DECREMENTER;
+		fallthrough;
 	case BOOK3S_INTERRUPT_HV_DECREMENTER:
 		vcpu->stat.dec_exits++;
 		r = RESUME_GUEST;
@@ -1737,6 +1741,12 @@ static int kvmppc_handle_nested_exit(struct kvm_vcpu *vcpu)
 		vcpu->stat.ext_intr_exits++;
 		r = RESUME_GUEST;
 		break;
+	/* These need to go to the nested HV */
+	case BOOK3S_INTERRUPT_NESTED_HV_DECREMENTER:
+		vcpu->arch.trap = BOOK3S_INTERRUPT_HV_DECREMENTER;
+		vcpu->stat.dec_exits++;
+		r = RESUME_HOST;
+		break;
 	/* SR/HMI/PMI are HV interrupts that host has handled. Resume guest.*/
 	case BOOK3S_INTERRUPT_HMI:
 	case BOOK3S_INTERRUPT_PERFMON:
@@ -3855,6 +3865,8 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 		return BOOK3S_INTERRUPT_HV_DECREMENTER;
 	if (next_timer < time_limit)
 		time_limit = next_timer;
+	else if (*tb >= time_limit) /* nested time limit */
+		return BOOK3S_INTERRUPT_NESTED_HV_DECREMENTER;
 
 	vcpu->arch.ceded = 0;
 
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index 5a534f7924f2..a92808a927ff 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -361,11 +361,6 @@ long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
 	vcpu->arch.ret = RESUME_GUEST;
 	vcpu->arch.trap = 0;
 	do {
-		if (mftb() >= hdec_exp) {
-			vcpu->arch.trap = BOOK3S_INTERRUPT_HV_DECREMENTER;
-			r = RESUME_HOST;
-			break;
-		}
 		r = kvmhv_run_single_vcpu(vcpu, hdec_exp, l2_hv.lpcr);
 	} while (is_kvmppc_resume_guest(r));
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 42/43] KVM: PPC: Book3S HV P9: Improve mfmsr performance on entry
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Rearrange the MSR saving on entry so it does not follow the mtmsrd to
disable interrupts, avoiding a possible RAW scoreboard stall.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/kvm_book3s_64.h |  2 +
 arch/powerpc/kvm/book3s_hv.c             | 18 ++-----
 arch/powerpc/kvm/book3s_hv_p9_entry.c    | 66 +++++++++++++++---------
 3 files changed, 47 insertions(+), 39 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h b/arch/powerpc/include/asm/kvm_book3s_64.h
index f8a0ed90b853..20ca9b1a2d41 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -153,6 +153,8 @@ static inline bool kvmhv_vcpu_is_radix(struct kvm_vcpu *vcpu)
 	return radix;
 }
 
+unsigned long kvmppc_msr_hard_disable_set_facilities(struct kvm_vcpu *vcpu, unsigned long msr);
+
 int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpcr, u64 *tb);
 
 #define KVM_DEFAULT_HPT_ORDER	24	/* 16MB HPT by default */
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 7cb9e87b50b7..c8edab9a90cb 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3759,6 +3759,8 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
 	s64 dec;
 	int trap;
 
+	msr = mfmsr();
+
 	save_p9_host_os_sprs(&host_os_sprs);
 
 	/*
@@ -3769,24 +3771,10 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
 	 */
 	host_psscr = mfspr(SPRN_PSSCR_PR);
 
-	hard_irq_disable();
+	kvmppc_msr_hard_disable_set_facilities(vcpu, msr);
 	if (lazy_irq_pending())
 		return 0;
 
-	/* MSR bits may have been cleared by context switch */
-	msr = 0;
-	if (IS_ENABLED(CONFIG_PPC_FPU))
-		msr |= MSR_FP;
-	if (cpu_has_feature(CPU_FTR_ALTIVEC))
-		msr |= MSR_VEC;
-	if (cpu_has_feature(CPU_FTR_VSX))
-		msr |= MSR_VSX;
-	if ((cpu_has_feature(CPU_FTR_TM) ||
-	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
-			(vcpu->arch.hfscr & HFSCR_TM))
-		msr |= MSR_TM;
-	msr = msr_check_and_set(msr);
-
 	load_vcpu_state(vcpu, &host_os_sprs);
 
 	if (vcpu->arch.psscr != host_psscr)
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 48b0ce9e0c39..3fffcec67ff8 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -604,6 +604,44 @@ static void save_clear_guest_mmu(struct kvm *kvm, struct kvm_vcpu *vcpu)
 	}
 }
 
+unsigned long kvmppc_msr_hard_disable_set_facilities(struct kvm_vcpu *vcpu, unsigned long msr)
+{
+	unsigned long msr_needed = 0;
+
+	msr &= ~MSR_EE;
+
+	/* MSR bits may have been cleared by context switch so must recheck */
+	if (IS_ENABLED(CONFIG_PPC_FPU))
+		msr_needed |= MSR_FP;
+	if (cpu_has_feature(CPU_FTR_ALTIVEC))
+		msr_needed |= MSR_VEC;
+	if (cpu_has_feature(CPU_FTR_VSX))
+		msr_needed |= MSR_VSX;
+	if ((cpu_has_feature(CPU_FTR_TM) ||
+	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
+			(vcpu->arch.hfscr & HFSCR_TM))
+		msr_needed |= MSR_TM;
+
+	/*
+	 * This could be combined with MSR[RI] clearing, but that expands
+	 * the unrecoverable window. It would be better to cover unrecoverable
+	 * with KVM bad interrupt handling rather than use MSR[RI] at all.
+	 *
+	 * Much more difficult and less worthwhile to combine with IR/DR
+	 * disable.
+	 */
+	if ((msr & msr_needed) != msr_needed) {
+		msr |= msr_needed;
+		__mtmsrd(msr, 0);
+	} else {
+		__hard_irq_disable();
+	}
+	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
+
+	return msr;
+}
+EXPORT_SYMBOL_GPL(kvmppc_msr_hard_disable_set_facilities);
+
 int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpcr, u64 *tb)
 {
 	struct p9_host_os_sprs host_os_sprs;
@@ -637,6 +675,9 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 
 	vcpu->arch.ceded = 0;
 
+	/* Save MSR for restore, with EE clear. */
+	msr = mfmsr() & ~MSR_EE;
+
 	host_hfscr = mfspr(SPRN_HFSCR);
 	host_ciabr = mfspr(SPRN_CIABR);
 	host_psscr = mfspr(SPRN_PSSCR_PR);
@@ -658,35 +699,12 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
 
 	save_p9_host_os_sprs(&host_os_sprs);
 
-	/*
-	 * This could be combined with MSR[RI] clearing, but that expands
-	 * the unrecoverable window. It would be better to cover unrecoverable
-	 * with KVM bad interrupt handling rather than use MSR[RI] at all.
-	 *
-	 * Much more difficult and less worthwhile to combine with IR/DR
-	 * disable.
-	 */
-	hard_irq_disable();
+	msr = kvmppc_msr_hard_disable_set_facilities(vcpu, msr);
 	if (lazy_irq_pending()) {
 		trap = 0;
 		goto out;
 	}
 
-	/* MSR bits may have been cleared by context switch */
-	msr = 0;
-	if (IS_ENABLED(CONFIG_PPC_FPU))
-		msr |= MSR_FP;
-	if (cpu_has_feature(CPU_FTR_ALTIVEC))
-		msr |= MSR_VEC;
-	if (cpu_has_feature(CPU_FTR_VSX))
-		msr |= MSR_VSX;
-	if ((cpu_has_feature(CPU_FTR_TM) ||
-	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
-			(vcpu->arch.hfscr & HFSCR_TM))
-		msr |= MSR_TM;
-	msr = msr_check_and_set(msr);
-	/* Save MSR for restore. This is after hard disable, so EE is clear. */
-
 	if (vc->tb_offset) {
 		u64 new_tb = *tb + vc->tb_offset;
 		mtspr(SPRN_TBU40, new_tb);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [RFC PATCH 43/43] KVM: PPC: Book3S HV P9: Optimise hash guest SLB saving
  2021-06-22 10:56 ` Nicholas Piggin
@ 2021-06-22 10:57   ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-06-22 10:57 UTC (permalink / raw)
  To: kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

slbmfee/slbmfev instructions are very expensive, more so than a regular
mfspr instruction, so minimising them significantly improves hash guest
exit performance. The slbmfev is only required if slbmfee found a valid
SLB entry.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv_p9_entry.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index 3fffcec67ff8..5e9e9f809297 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -459,10 +459,22 @@ static void __accumulate_time(struct kvm_vcpu *vcpu, struct kvmhv_tb_accumulator
 #define accumulate_time(vcpu, next) do {} while (0)
 #endif
 
-static inline void mfslb(unsigned int idx, u64 *slbee, u64 *slbev)
+static inline u64 mfslbv(unsigned int idx)
 {
-	asm volatile("slbmfev  %0,%1" : "=r" (*slbev) : "r" (idx));
-	asm volatile("slbmfee  %0,%1" : "=r" (*slbee) : "r" (idx));
+	u64 slbev;
+
+	asm volatile("slbmfev  %0,%1" : "=r" (slbev) : "r" (idx));
+
+	return slbev;
+}
+
+static inline u64 mfslbe(unsigned int idx)
+{
+	u64 slbee;
+
+	asm volatile("slbmfee  %0,%1" : "=r" (slbee) : "r" (idx));
+
+	return slbee;
 }
 
 static inline void mtslb(u64 slbee, u64 slbev)
@@ -592,8 +604,10 @@ static void save_clear_guest_mmu(struct kvm *kvm, struct kvm_vcpu *vcpu)
 		 */
 		for (i = 0; i < vcpu->arch.slb_nr; i++) {
 			u64 slbee, slbev;
-			mfslb(i, &slbee, &slbev);
+
+			slbee = mfslbe(i);
 			if (slbee & SLB_ESID_V) {
+				slbev = mfslbv(i);
 				vcpu->arch.slb[nr].orige = slbee | i;
 				vcpu->arch.slb[nr].origv = slbev;
 				nr++;
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 01/43] powerpc/64s: Remove WORT SPR from POWER9/10
  2021-06-22 10:56   ` Nicholas Piggin
@ 2021-06-30 17:29     ` Fabiano Rosas
  -1 siblings, 0 replies; 133+ messages in thread
From: Fabiano Rosas @ 2021-06-30 17:29 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Nicholas Piggin <npiggin@gmail.com> writes:

> This register is not architected and not implemented in POWER9 or 10,
> it just reads back zeroes for compatibility.
>
> -78 cycles (9255) cycles POWER9 virt-mode NULL hcall
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>

> ---
>  arch/powerpc/kvm/book3s_hv.c          | 3 ---
>  arch/powerpc/platforms/powernv/idle.c | 2 --
>  2 files changed, 5 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 9228042bd54f..97f3d6d54b61 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -3640,7 +3640,6 @@ static void load_spr_state(struct kvm_vcpu *vcpu)
>  	mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
>  	mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
>  	mtspr(SPRN_BESCR, vcpu->arch.bescr);
> -	mtspr(SPRN_WORT, vcpu->arch.wort);
>  	mtspr(SPRN_TIDR, vcpu->arch.tid);
>  	mtspr(SPRN_AMR, vcpu->arch.amr);
>  	mtspr(SPRN_UAMOR, vcpu->arch.uamor);
> @@ -3667,7 +3666,6 @@ static void store_spr_state(struct kvm_vcpu *vcpu)
>  	vcpu->arch.ebbhr = mfspr(SPRN_EBBHR);
>  	vcpu->arch.ebbrr = mfspr(SPRN_EBBRR);
>  	vcpu->arch.bescr = mfspr(SPRN_BESCR);
> -	vcpu->arch.wort = mfspr(SPRN_WORT);
>  	vcpu->arch.tid = mfspr(SPRN_TIDR);
>  	vcpu->arch.amr = mfspr(SPRN_AMR);
>  	vcpu->arch.uamor = mfspr(SPRN_UAMOR);
> @@ -3699,7 +3697,6 @@ static void restore_p9_host_os_sprs(struct kvm_vcpu *vcpu,
>  				    struct p9_host_os_sprs *host_os_sprs)
>  {
>  	mtspr(SPRN_PSPB, 0);
> -	mtspr(SPRN_WORT, 0);
>  	mtspr(SPRN_UAMOR, 0);
>
>  	mtspr(SPRN_DSCR, host_os_sprs->dscr);
> diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
> index 528a7e0cf83a..180baecad914 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -667,7 +667,6 @@ static unsigned long power9_idle_stop(unsigned long psscr)
>  		sprs.purr	= mfspr(SPRN_PURR);
>  		sprs.spurr	= mfspr(SPRN_SPURR);
>  		sprs.dscr	= mfspr(SPRN_DSCR);
> -		sprs.wort	= mfspr(SPRN_WORT);
>  		sprs.ciabr	= mfspr(SPRN_CIABR);
>
>  		sprs.mmcra	= mfspr(SPRN_MMCRA);
> @@ -785,7 +784,6 @@ static unsigned long power9_idle_stop(unsigned long psscr)
>  	mtspr(SPRN_PURR,	sprs.purr);
>  	mtspr(SPRN_SPURR,	sprs.spurr);
>  	mtspr(SPRN_DSCR,	sprs.dscr);
> -	mtspr(SPRN_WORT,	sprs.wort);
>  	mtspr(SPRN_CIABR,	sprs.ciabr);
>
>  	mtspr(SPRN_MMCRA,	sprs.mmcra);

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 38/43] KVM: PPC: Book3S HV P9: Test dawr_enabled() before saving host DAWR SPRs
  2021-06-22 10:57   ` Nicholas Piggin
@ 2021-06-30 17:51     ` Fabiano Rosas
  -1 siblings, 0 replies; 133+ messages in thread
From: Fabiano Rosas @ 2021-06-30 17:51 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Nicholas Piggin <npiggin@gmail.com> writes:

> Some of the DAWR SPR access is already predicated on dawr_enabled();
> apply this to the remainder of the accesses.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  arch/powerpc/kvm/book3s_hv_p9_entry.c | 34 ++++++++++++++++-----------
>  1 file changed, 20 insertions(+), 14 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
> index 7aa72efcac6c..f305d1d6445c 100644
> --- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
> +++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
> @@ -638,13 +638,16 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
>
>  	host_hfscr = mfspr(SPRN_HFSCR);
>  	host_ciabr = mfspr(SPRN_CIABR);
> -	host_dawr0 = mfspr(SPRN_DAWR0);
> -	host_dawrx0 = mfspr(SPRN_DAWRX0);
>  	host_psscr = mfspr(SPRN_PSSCR);
>  	host_pidr = mfspr(SPRN_PID);
> -	if (cpu_has_feature(CPU_FTR_DAWR1)) {
> -		host_dawr1 = mfspr(SPRN_DAWR1);
> -		host_dawrx1 = mfspr(SPRN_DAWRX1);
> +
> +	if (dawr_enabled()) {
> +		host_dawr0 = mfspr(SPRN_DAWR0);
> +		host_dawrx0 = mfspr(SPRN_DAWRX0);
> +		if (cpu_has_feature(CPU_FTR_DAWR1)) {
> +			host_dawr1 = mfspr(SPRN_DAWR1);
> +			host_dawrx1 = mfspr(SPRN_DAWRX1);

The userspace needs to enable DAWR1 via KVM_CAP_PPC_DAWR1. That cap is
not even implemented in QEMU currently, so we never allow the guest to
set vcpu->arch.dawr1. If we check for kvm->arch.dawr1_enabled instead of
the CPU feature, we could shave some more time here.
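
A minimal sketch of the suggested check, assuming the per-VM
kvm->arch.dawr1_enabled flag that userspace sets via KVM_CAP_PPC_DAWR1
(not part of the posted patch):

	if (dawr_enabled()) {
		host_dawr0 = mfspr(SPRN_DAWR0);
		host_dawrx0 = mfspr(SPRN_DAWRX0);
		/* assumed per-VM flag, set when KVM_CAP_PPC_DAWR1 is enabled */
		if (vcpu->kvm->arch.dawr1_enabled) {
			host_dawr1 = mfspr(SPRN_DAWR1);
			host_dawrx1 = mfspr(SPRN_DAWRX1);
		}
	}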

> +		}
>  	}
>
>  	local_paca->kvm_hstate.host_purr = mfspr(SPRN_PURR);
> @@ -951,15 +954,18 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
>  	mtspr(SPRN_HFSCR, host_hfscr);
>  	if (vcpu->arch.ciabr != host_ciabr)
>  		mtspr(SPRN_CIABR, host_ciabr);
> -	if (vcpu->arch.dawr0 != host_dawr0)
> -		mtspr(SPRN_DAWR0, host_dawr0);
> -	if (vcpu->arch.dawrx0 != host_dawrx0)
> -		mtspr(SPRN_DAWRX0, host_dawrx0);
> -	if (cpu_has_feature(CPU_FTR_DAWR1)) {
> -		if (vcpu->arch.dawr1 != host_dawr1)
> -			mtspr(SPRN_DAWR1, host_dawr1);
> -		if (vcpu->arch.dawrx1 != host_dawrx1)
> -			mtspr(SPRN_DAWRX1, host_dawrx1);
> +
> +	if (dawr_enabled()) {
> +		if (vcpu->arch.dawr0 != host_dawr0)
> +			mtspr(SPRN_DAWR0, host_dawr0);
> +		if (vcpu->arch.dawrx0 != host_dawrx0)
> +			mtspr(SPRN_DAWRX0, host_dawrx0);
> +		if (cpu_has_feature(CPU_FTR_DAWR1)) {
> +			if (vcpu->arch.dawr1 != host_dawr1)
> +				mtspr(SPRN_DAWR1, host_dawr1);
> +			if (vcpu->arch.dawrx1 != host_dawrx1)
> +				mtspr(SPRN_DAWRX1, host_dawrx1);
> +		}
>  	}
>
>  	if (vc->dpdes)

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 08/43] powerpc/64s: Keep AMOR SPR a constant ~0 at runtime
  2021-06-22 10:57   ` Nicholas Piggin
@ 2021-06-30 19:17     ` Fabiano Rosas
  -1 siblings, 0 replies; 133+ messages in thread
From: Fabiano Rosas @ 2021-06-30 19:17 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Nicholas Piggin <npiggin@gmail.com> writes:

> This register controls supervisor SPR modifications, and as such is only
> relevant for KVM. KVM always sets AMOR to ~0 on guest entry, and never
> restores it coming back out to the host, so it can be kept constant and
> avoid the mtSPR in KVM guest entry.
>
> -21 cycles (9116) cycles POWER9 virt-mode NULL hcall
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>

> ---
>  arch/powerpc/kernel/cpu_setup_power.c    |  8 ++++++++
>  arch/powerpc/kernel/dt_cpu_ftrs.c        |  2 ++
>  arch/powerpc/kvm/book3s_hv_p9_entry.c    |  2 --
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S  |  2 --
>  arch/powerpc/mm/book3s64/radix_pgtable.c | 15 ---------------
>  arch/powerpc/platforms/powernv/idle.c    |  8 +++-----
>  6 files changed, 13 insertions(+), 24 deletions(-)
>
> diff --git a/arch/powerpc/kernel/cpu_setup_power.c b/arch/powerpc/kernel/cpu_setup_power.c
> index 3cca88ee96d7..a29dc8326622 100644
> --- a/arch/powerpc/kernel/cpu_setup_power.c
> +++ b/arch/powerpc/kernel/cpu_setup_power.c
> @@ -137,6 +137,7 @@ void __setup_cpu_power7(unsigned long offset, struct cpu_spec *t)
>  		return;
>
>  	mtspr(SPRN_LPID, 0);
> +	mtspr(SPRN_AMOR, ~0);
>  	mtspr(SPRN_PCR, PCR_MASK);
>  	init_LPCR_ISA206(mfspr(SPRN_LPCR), LPCR_LPES1 >> LPCR_LPES_SH);
>  }
> @@ -150,6 +151,7 @@ void __restore_cpu_power7(void)
>  		return;
>
>  	mtspr(SPRN_LPID, 0);
> +	mtspr(SPRN_AMOR, ~0);
>  	mtspr(SPRN_PCR, PCR_MASK);
>  	init_LPCR_ISA206(mfspr(SPRN_LPCR), LPCR_LPES1 >> LPCR_LPES_SH);
>  }
> @@ -164,6 +166,7 @@ void __setup_cpu_power8(unsigned long offset, struct cpu_spec *t)
>  		return;
>
>  	mtspr(SPRN_LPID, 0);
> +	mtspr(SPRN_AMOR, ~0);
>  	mtspr(SPRN_PCR, PCR_MASK);
>  	init_LPCR_ISA206(mfspr(SPRN_LPCR) | LPCR_PECEDH, 0); /* LPES = 0 */
>  	init_HFSCR();
> @@ -184,6 +187,7 @@ void __restore_cpu_power8(void)
>  		return;
>
>  	mtspr(SPRN_LPID, 0);
> +	mtspr(SPRN_AMOR, ~0);
>  	mtspr(SPRN_PCR, PCR_MASK);
>  	init_LPCR_ISA206(mfspr(SPRN_LPCR) | LPCR_PECEDH, 0); /* LPES = 0 */
>  	init_HFSCR();
> @@ -202,6 +206,7 @@ void __setup_cpu_power9(unsigned long offset, struct cpu_spec *t)
>  	mtspr(SPRN_PSSCR, 0);
>  	mtspr(SPRN_LPID, 0);
>  	mtspr(SPRN_PID, 0);
> +	mtspr(SPRN_AMOR, ~0);
>  	mtspr(SPRN_PCR, PCR_MASK);
>  	init_LPCR_ISA300((mfspr(SPRN_LPCR) | LPCR_PECEDH | LPCR_PECE_HVEE |\
>  			 LPCR_HVICE | LPCR_HEIC) & ~(LPCR_UPRT | LPCR_HR), 0);
> @@ -223,6 +228,7 @@ void __restore_cpu_power9(void)
>  	mtspr(SPRN_PSSCR, 0);
>  	mtspr(SPRN_LPID, 0);
>  	mtspr(SPRN_PID, 0);
> +	mtspr(SPRN_AMOR, ~0);
>  	mtspr(SPRN_PCR, PCR_MASK);
>  	init_LPCR_ISA300((mfspr(SPRN_LPCR) | LPCR_PECEDH | LPCR_PECE_HVEE |\
>  			 LPCR_HVICE | LPCR_HEIC) & ~(LPCR_UPRT | LPCR_HR), 0);
> @@ -242,6 +248,7 @@ void __setup_cpu_power10(unsigned long offset, struct cpu_spec *t)
>  	mtspr(SPRN_PSSCR, 0);
>  	mtspr(SPRN_LPID, 0);
>  	mtspr(SPRN_PID, 0);
> +	mtspr(SPRN_AMOR, ~0);
>  	mtspr(SPRN_PCR, PCR_MASK);
>  	init_LPCR_ISA300((mfspr(SPRN_LPCR) | LPCR_PECEDH | LPCR_PECE_HVEE |\
>  			 LPCR_HVICE | LPCR_HEIC) & ~(LPCR_UPRT | LPCR_HR), 0);
> @@ -264,6 +271,7 @@ void __restore_cpu_power10(void)
>  	mtspr(SPRN_PSSCR, 0);
>  	mtspr(SPRN_LPID, 0);
>  	mtspr(SPRN_PID, 0);
> +	mtspr(SPRN_AMOR, ~0);
>  	mtspr(SPRN_PCR, PCR_MASK);
>  	init_LPCR_ISA300((mfspr(SPRN_LPCR) | LPCR_PECEDH | LPCR_PECE_HVEE |\
>  			 LPCR_HVICE | LPCR_HEIC) & ~(LPCR_UPRT | LPCR_HR), 0);
> diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
> index 358aee7c2d79..0a6b36b4bda8 100644
> --- a/arch/powerpc/kernel/dt_cpu_ftrs.c
> +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
> @@ -80,6 +80,7 @@ static void __restore_cpu_cpufeatures(void)
>  	mtspr(SPRN_LPCR, system_registers.lpcr);
>  	if (hv_mode) {
>  		mtspr(SPRN_LPID, 0);
> +		mtspr(SPRN_AMOR, ~0);
>  		mtspr(SPRN_HFSCR, system_registers.hfscr);
>  		mtspr(SPRN_PCR, system_registers.pcr);
>  	}
> @@ -216,6 +217,7 @@ static int __init feat_enable_hv(struct dt_cpu_feature *f)
>  	}
>
>  	mtspr(SPRN_LPID, 0);
> +	mtspr(SPRN_AMOR, ~0);
>
>  	lpcr = mfspr(SPRN_LPCR);
>  	lpcr &=  ~LPCR_LPES0; /* HV external interrupts */
> diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
> index c4f3e066fcb4..a3281f0c9214 100644
> --- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
> +++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
> @@ -286,8 +286,6 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
>  	mtspr(SPRN_SPRG2, vcpu->arch.shregs.sprg2);
>  	mtspr(SPRN_SPRG3, vcpu->arch.shregs.sprg3);
>
> -	mtspr(SPRN_AMOR, ~0UL);
> -
>  	local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_HV_P9;
>
>  	/*
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index 8dd437d7a2c6..007f87b97184 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -772,10 +772,8 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
>  	/* Restore AMR and UAMOR, set AMOR to all 1s */
>  	ld	r5,VCPU_AMR(r4)
>  	ld	r6,VCPU_UAMOR(r4)
> -	li	r7,-1
>  	mtspr	SPRN_AMR,r5
>  	mtspr	SPRN_UAMOR,r6
> -	mtspr	SPRN_AMOR,r7
>
>  	/* Restore state of CTRL run bit; assume 1 on entry */
>  	lwz	r5,VCPU_CTRL(r4)
> diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
> index fe236c38ce00..b985cfead5d7 100644
> --- a/arch/powerpc/mm/book3s64/radix_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
> @@ -566,18 +566,6 @@ void __init radix__early_init_devtree(void)
>  	return;
>  }
>
> -static void radix_init_amor(void)
> -{
> -	/*
> -	* In HV mode, we init AMOR (Authority Mask Override Register) so that
> -	* the hypervisor and guest can setup IAMR (Instruction Authority Mask
> -	* Register), enable key 0 and set it to 1.
> -	*
> -	* AMOR = 0b1100 .... 0000 (Mask for key 0 is 11)
> -	*/
> -	mtspr(SPRN_AMOR, (3ul << 62));
> -}
> -
>  void __init radix__early_init_mmu(void)
>  {
>  	unsigned long lpcr;
> @@ -638,7 +626,6 @@ void __init radix__early_init_mmu(void)
>  		lpcr = mfspr(SPRN_LPCR);
>  		mtspr(SPRN_LPCR, lpcr | LPCR_UPRT | LPCR_HR);
>  		radix_init_partition_table();
> -		radix_init_amor();
>  	} else {
>  		radix_init_pseries();
>  	}
> @@ -662,8 +649,6 @@ void radix__early_init_mmu_secondary(void)
>
>  		set_ptcr_when_no_uv(__pa(partition_tb) |
>  				    (PATB_SIZE_SHIFT - 12));
> -
> -		radix_init_amor();
>  	}
>
>  	radix__switch_mmu_context(NULL, &init_mm);
> diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
> index 180baecad914..f791ca041854 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -306,8 +306,8 @@ struct p7_sprs {
>  	/* per thread SPRs that get lost in shallow states */
>  	u64 amr;
>  	u64 iamr;
> -	u64 amor;
>  	u64 uamor;
> +	/* amor is restored to constant ~0 */
>  };
>
>  static unsigned long power7_idle_insn(unsigned long type)
> @@ -378,7 +378,6 @@ static unsigned long power7_idle_insn(unsigned long type)
>  	if (cpu_has_feature(CPU_FTR_ARCH_207S)) {
>  		sprs.amr	= mfspr(SPRN_AMR);
>  		sprs.iamr	= mfspr(SPRN_IAMR);
> -		sprs.amor	= mfspr(SPRN_AMOR);
>  		sprs.uamor	= mfspr(SPRN_UAMOR);
>  	}
>
> @@ -397,7 +396,7 @@ static unsigned long power7_idle_insn(unsigned long type)
>  			 */
>  			mtspr(SPRN_AMR,		sprs.amr);
>  			mtspr(SPRN_IAMR,	sprs.iamr);
> -			mtspr(SPRN_AMOR,	sprs.amor);
> +			mtspr(SPRN_AMOR,	~0);
>  			mtspr(SPRN_UAMOR,	sprs.uamor);
>  		}
>  	}
> @@ -687,7 +686,6 @@ static unsigned long power9_idle_stop(unsigned long psscr)
>
>  	sprs.amr	= mfspr(SPRN_AMR);
>  	sprs.iamr	= mfspr(SPRN_IAMR);
> -	sprs.amor	= mfspr(SPRN_AMOR);
>  	sprs.uamor	= mfspr(SPRN_UAMOR);
>
>  	srr1 = isa300_idle_stop_mayloss(psscr);		/* go idle */
> @@ -708,7 +706,7 @@ static unsigned long power9_idle_stop(unsigned long psscr)
>  		 */
>  		mtspr(SPRN_AMR,		sprs.amr);
>  		mtspr(SPRN_IAMR,	sprs.iamr);
> -		mtspr(SPRN_AMOR,	sprs.amor);
> +		mtspr(SPRN_AMOR,	~0);
>  		mtspr(SPRN_UAMOR,	sprs.uamor);
>
>  		/*

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 07/43] KVM: PPC: Book3S HV: POWER10 enable HAIL when running radix guests
  2021-06-22 10:57   ` Nicholas Piggin
@ 2021-06-30 19:41     ` Fabiano Rosas
  -1 siblings, 0 replies; 133+ messages in thread
From: Fabiano Rosas @ 2021-06-30 19:41 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Nicholas Piggin <npiggin@gmail.com> writes:

> HV interrupts may be taken with the MMU enabled when radix guests are
> running. Enable LPCR[HAIL] on ISA v3.1 processors for radix guests.
> Make this depend on the host LPCR[HAIL] being enabled. Currently that is
> always enabled, but having this test means any issue that might require
> LPCR[HAIL] to be disabled in the host will not have to be duplicated in
> KVM.
>
> -1380 cycles on P10 NULL hcall entry+exit
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>

> ---
>  arch/powerpc/kvm/book3s_hv.c | 29 +++++++++++++++++++++++++----
>  1 file changed, 25 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 36e1db48fccf..ed713f49fbd5 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -4896,6 +4896,8 @@ static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu)
>   */
>  int kvmppc_switch_mmu_to_hpt(struct kvm *kvm)
>  {
> +	unsigned long lpcr, lpcr_mask;
> +
>  	if (nesting_enabled(kvm))
>  		kvmhv_release_all_nested(kvm);
>  	kvmppc_rmap_reset(kvm);
> @@ -4905,8 +4907,13 @@ int kvmppc_switch_mmu_to_hpt(struct kvm *kvm)
>  	kvm->arch.radix = 0;
>  	spin_unlock(&kvm->mmu_lock);
>  	kvmppc_free_radix(kvm);
> -	kvmppc_update_lpcr(kvm, LPCR_VPM1,
> -			   LPCR_VPM1 | LPCR_UPRT | LPCR_GTSE | LPCR_HR);
> +
> +	lpcr = LPCR_VPM1;
> +	lpcr_mask = LPCR_VPM1 | LPCR_UPRT | LPCR_GTSE | LPCR_HR;
> +	if (cpu_has_feature(CPU_FTR_ARCH_31))
> +		lpcr_mask |= LPCR_HAIL;
> +	kvmppc_update_lpcr(kvm, lpcr, lpcr_mask);
> +
>  	return 0;
>  }
>
> @@ -4916,6 +4923,7 @@ int kvmppc_switch_mmu_to_hpt(struct kvm *kvm)
>   */
>  int kvmppc_switch_mmu_to_radix(struct kvm *kvm)
>  {
> +	unsigned long lpcr, lpcr_mask;
>  	int err;
>
>  	err = kvmppc_init_vm_radix(kvm);
> @@ -4927,8 +4935,17 @@ int kvmppc_switch_mmu_to_radix(struct kvm *kvm)
>  	kvm->arch.radix = 1;
>  	spin_unlock(&kvm->mmu_lock);
>  	kvmppc_free_hpt(&kvm->arch.hpt);
> -	kvmppc_update_lpcr(kvm, LPCR_UPRT | LPCR_GTSE | LPCR_HR,
> -			   LPCR_VPM1 | LPCR_UPRT | LPCR_GTSE | LPCR_HR);
> +
> +	lpcr = LPCR_UPRT | LPCR_GTSE | LPCR_HR;
> +	lpcr_mask = LPCR_VPM1 | LPCR_UPRT | LPCR_GTSE | LPCR_HR;
> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +		lpcr_mask |= LPCR_HAIL;
> +		if (cpu_has_feature(CPU_FTR_HVMODE) &&
> +				(kvm->arch.host_lpcr & LPCR_HAIL))
> +			lpcr |= LPCR_HAIL;
> +	}
> +	kvmppc_update_lpcr(kvm, lpcr, lpcr_mask);
> +
>  	return 0;
>  }
>
> @@ -5092,6 +5109,10 @@ static int kvmppc_core_init_vm_hv(struct kvm *kvm)
>  		kvm->arch.mmu_ready = 1;
>  		lpcr &= ~LPCR_VPM1;
>  		lpcr |= LPCR_UPRT | LPCR_GTSE | LPCR_HR;
> +		if (cpu_has_feature(CPU_FTR_HVMODE) &&
> +		    cpu_has_feature(CPU_FTR_ARCH_31) &&
> +		    (kvm->arch.host_lpcr & LPCR_HAIL))
> +			lpcr |= LPCR_HAIL;
>  		ret = kvmppc_init_vm_radix(kvm);
>  		if (ret) {
>  			kvmppc_free_lpid(kvm->arch.lpid);

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 19/43] KVM: PPC: Book3S HV P9: Add kvmppc_stop_thread to match kvmppc_start_thread
  2021-06-22 10:57   ` Nicholas Piggin
@ 2021-06-30 20:18     ` Fabiano Rosas
  -1 siblings, 0 replies; 133+ messages in thread
From: Fabiano Rosas @ 2021-06-30 20:18 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Nicholas Piggin <npiggin@gmail.com> writes:

> Small cleanup makes it a bit easier to match up entry and exit
> operations.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>

> ---
>  arch/powerpc/kvm/book3s_hv.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index b8b0695a9312..86c85e303a6d 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -2948,6 +2948,13 @@ static void kvmppc_start_thread(struct kvm_vcpu *vcpu, struct kvmppc_vcore *vc)
>  		kvmppc_ipi_thread(cpu);
>  }
>
> +/* Old path does this in asm */
> +static void kvmppc_stop_thread(struct kvm_vcpu *vcpu)
> +{
> +	vcpu->cpu = -1;
> +	vcpu->arch.thread_cpu = -1;
> +}
> +
>  static void kvmppc_wait_for_nap(int n_threads)
>  {
>  	int cpu = smp_processor_id();
> @@ -4154,8 +4161,6 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
>  		dec = (s32) dec;
>  	tb = mftb();
>  	vcpu->arch.dec_expires = dec + tb;
> -	vcpu->cpu = -1;
> -	vcpu->arch.thread_cpu = -1;
>
>  	store_spr_state(vcpu);
>
> @@ -4627,6 +4632,8 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>
>  	guest_exit_irqoff();
>
> +	kvmppc_stop_thread(vcpu);
> +
>  	powerpc_local_irq_pmu_restore(flags);
>
>  	cpumask_clear_cpu(pcpu, &kvm->arch.cpu_in_guest);

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 38/43] KVM: PPC: Book3S HV P9: Test dawr_enabled() before saving host DAWR SPRs
  2021-06-30 17:51     ` Fabiano Rosas
@ 2021-07-01  8:04       ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-07-01  8:04 UTC (permalink / raw)
  To: Fabiano Rosas, kvm-ppc; +Cc: linuxppc-dev

Excerpts from Fabiano Rosas's message of July 1, 2021 3:51 am:
> Nicholas Piggin <npiggin@gmail.com> writes:
> 
>> Some of the DAWR SPR access is already predicated on dawr_enabled();
>> apply this to the remainder of the accesses.
>>
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>  arch/powerpc/kvm/book3s_hv_p9_entry.c | 34 ++++++++++++++++-----------
>>  1 file changed, 20 insertions(+), 14 deletions(-)
>>
>> diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
>> index 7aa72efcac6c..f305d1d6445c 100644
>> --- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
>> +++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
>> @@ -638,13 +638,16 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
>>
>>  	host_hfscr = mfspr(SPRN_HFSCR);
>>  	host_ciabr = mfspr(SPRN_CIABR);
>> -	host_dawr0 = mfspr(SPRN_DAWR0);
>> -	host_dawrx0 = mfspr(SPRN_DAWRX0);
>>  	host_psscr = mfspr(SPRN_PSSCR);
>>  	host_pidr = mfspr(SPRN_PID);
>> -	if (cpu_has_feature(CPU_FTR_DAWR1)) {
>> -		host_dawr1 = mfspr(SPRN_DAWR1);
>> -		host_dawrx1 = mfspr(SPRN_DAWRX1);
>> +
>> +	if (dawr_enabled()) {
>> +		host_dawr0 = mfspr(SPRN_DAWR0);
>> +		host_dawrx0 = mfspr(SPRN_DAWRX0);
>> +		if (cpu_has_feature(CPU_FTR_DAWR1)) {
>> +			host_dawr1 = mfspr(SPRN_DAWR1);
>> +			host_dawrx1 = mfspr(SPRN_DAWRX1);
> 
> The userspace needs to enable DAWR1 via KVM_CAP_PPC_DAWR1. That cap is
> not even implemented in QEMU currently, so we never allow the guest to
> set vcpu->arch.dawr1. If we check for kvm->arch.dawr1_enabled instead of
> the CPU feature, we could shave some more time here.

Ah good point, yes let's do that.

Thanks,
Nick

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use
  2021-06-22 10:57   ` Nicholas Piggin
@ 2021-07-01 13:29     ` Madhavan Srinivasan
  -1 siblings, 0 replies; 133+ messages in thread
From: Madhavan Srinivasan @ 2021-07-01 13:17 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev


On 6/22/21 4:27 PM, Nicholas Piggin wrote:
> KVM PMU management code looks for particular frozen/disabled bits in
> the PMU registers so it knows whether it must clear them when coming
> out of a guest or not. Setting this up helps KVM make these optimisations
> without getting confused. Longer term the better approach might be to
> move guest/host PMU switching to the perf subsystem.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   arch/powerpc/kernel/cpu_setup_power.c | 4 ++--
>   arch/powerpc/kernel/dt_cpu_ftrs.c     | 6 +++---
>   arch/powerpc/kvm/book3s_hv.c          | 5 +++++
>   arch/powerpc/perf/core-book3s.c       | 7 +++++++
>   4 files changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/kernel/cpu_setup_power.c b/arch/powerpc/kernel/cpu_setup_power.c
> index a29dc8326622..3dc61e203f37 100644
> --- a/arch/powerpc/kernel/cpu_setup_power.c
> +++ b/arch/powerpc/kernel/cpu_setup_power.c
> @@ -109,7 +109,7 @@ static void init_PMU_HV_ISA207(void)
>   static void init_PMU(void)
>   {
>   	mtspr(SPRN_MMCRA, 0);
> -	mtspr(SPRN_MMCR0, 0);
> +	mtspr(SPRN_MMCR0, MMCR0_FC);

The sticky point here is that, if not frozen, PMC5/6 currently keep
counting. Not freezing them at boot is quite useful sometimes, for
example when running in a simulation where we could calculate approx
CPIs for micro-benchmarks without the perf subsystem.
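
For illustration (a rough sketch, not from the patch): with MMCR0[FC]
left clear, PMC5 counts instructions completed and PMC6 counts run
cycles, so an approximate CPI can be read without the perf subsystem:

	static unsigned long approx_cpi_x1000(void)
	{
		unsigned long insns0, cycs0, insns1, cycs1;

		insns0 = mfspr(SPRN_PMC5);	/* instructions completed */
		cycs0 = mfspr(SPRN_PMC6);	/* run cycles */
		run_microbenchmark();		/* hypothetical workload under test */
		insns1 = mfspr(SPRN_PMC5);
		cycs1 = mfspr(SPRN_PMC6);

		/* CPI scaled by 1000 to avoid floating point in the kernel */
		return ((cycs1 - cycs0) * 1000) / (insns1 - insns0);
	}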

>   	mtspr(SPRN_MMCR1, 0);
>   	mtspr(SPRN_MMCR2, 0);
>   }
> @@ -123,7 +123,7 @@ static void init_PMU_ISA31(void)
>   {
>   	mtspr(SPRN_MMCR3, 0);
>   	mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
> -	mtspr(SPRN_MMCR0, MMCR0_PMCCEXT);
> +	mtspr(SPRN_MMCR0, MMCR0_FC | MMCR0_PMCCEXT);
>   }
>
>   /*
> diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
> index 0a6b36b4bda8..06a089fbeaa7 100644
> --- a/arch/powerpc/kernel/dt_cpu_ftrs.c
> +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
> @@ -353,7 +353,7 @@ static void init_pmu_power8(void)
>   	}
>
>   	mtspr(SPRN_MMCRA, 0);
> -	mtspr(SPRN_MMCR0, 0);
> +	mtspr(SPRN_MMCR0, MMCR0_FC);
>   	mtspr(SPRN_MMCR1, 0);
>   	mtspr(SPRN_MMCR2, 0);
>   	mtspr(SPRN_MMCRS, 0);
> @@ -392,7 +392,7 @@ static void init_pmu_power9(void)
>   		mtspr(SPRN_MMCRC, 0);
>
>   	mtspr(SPRN_MMCRA, 0);
> -	mtspr(SPRN_MMCR0, 0);
> +	mtspr(SPRN_MMCR0, MMCR0_FC);
>   	mtspr(SPRN_MMCR1, 0);
>   	mtspr(SPRN_MMCR2, 0);
>   }
> @@ -428,7 +428,7 @@ static void init_pmu_power10(void)
>
>   	mtspr(SPRN_MMCR3, 0);
>   	mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
> -	mtspr(SPRN_MMCR0, MMCR0_PMCCEXT);
> +	mtspr(SPRN_MMCR0, MMCR0_FC | MMCR0_PMCCEXT);
>   }
>
>   static int __init feat_enable_pmu_power10(struct dt_cpu_feature *f)
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 1f30f98b09d1..f7349d150828 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -2593,6 +2593,11 @@ static int kvmppc_core_vcpu_create_hv(struct kvm_vcpu *vcpu)
>   #endif
>   #endif
>   	vcpu->arch.mmcr[0] = MMCR0_FC;
> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +		vcpu->arch.mmcr[0] |= MMCR0_PMCCEXT;
> +		vcpu->arch.mmcra = MMCRA_BHRB_DISABLE;
> +	}
> +
>   	vcpu->arch.ctrl = CTRL_RUNLATCH;
>   	/* default to host PVR, since we can't spoof it */
>   	kvmppc_set_pvr_hv(vcpu, mfspr(SPRN_PVR));
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 51622411a7cc..e33b29ec1a65 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -1361,6 +1361,13 @@ static void power_pmu_enable(struct pmu *pmu)
>   		goto out;
>
>   	if (cpuhw->n_events == 0) {
> +		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +			mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
> +			mtspr(SPRN_MMCR0, MMCR0_FC | MMCR0_PMCCEXT);
> +		} else {
> +			mtspr(SPRN_MMCRA, 0);
> +			mtspr(SPRN_MMCR0, MMCR0_FC);
> +		}
>   		ppc_set_pmu_inuse(0);
>   		goto out;
>   	}

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use
  2021-07-01 13:29     ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use Madhavan Srinivasan
@ 2021-07-02  0:27       ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-07-02  0:27 UTC (permalink / raw)
  To: kvm-ppc, Madhavan Srinivasan; +Cc: linuxppc-dev

Excerpts from Madhavan Srinivasan's message of July 1, 2021 11:17 pm:
> 
> On 6/22/21 4:27 PM, Nicholas Piggin wrote:
>> KVM PMU management code looks for particular frozen/disabled bits in
>> the PMU registers so it knows whether it must clear them when coming
>> out of a guest or not. Setting this up helps KVM make these optimisations
>> without getting confused. Longer term the better approach might be to
>> move guest/host PMU switching to the perf subsystem.
>>
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>   arch/powerpc/kernel/cpu_setup_power.c | 4 ++--
>>   arch/powerpc/kernel/dt_cpu_ftrs.c     | 6 +++---
>>   arch/powerpc/kvm/book3s_hv.c          | 5 +++++
>>   arch/powerpc/perf/core-book3s.c       | 7 +++++++
>>   4 files changed, 17 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/cpu_setup_power.c b/arch/powerpc/kernel/cpu_setup_power.c
>> index a29dc8326622..3dc61e203f37 100644
>> --- a/arch/powerpc/kernel/cpu_setup_power.c
>> +++ b/arch/powerpc/kernel/cpu_setup_power.c
>> @@ -109,7 +109,7 @@ static void init_PMU_HV_ISA207(void)
>>   static void init_PMU(void)
>>   {
>>   	mtspr(SPRN_MMCRA, 0);
>> -	mtspr(SPRN_MMCR0, 0);
>> +	mtspr(SPRN_MMCR0, MMCR0_FC);
> 
> The sticky point here is that, if not frozen, PMC5/6 currently keep
> counting. Not freezing them at boot is quite useful sometimes, for
> example when running in a simulation where we could calculate approx
> CPIs for micro-benchmarks without the perf subsystem.

Can't you even use the sysfs files in this sim environment? If not,
what if we added a boot option that could set some things up? That way
you could possibly even gather some more types of events too.
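
For illustration, a hypothetical boot option could be wired up with
early_param() so the boot code leaves MMCR0[FC] clear (nothing like
this exists in the posted series):

	/* hypothetical "pmu_running" option: don't freeze the PMU at boot */
	static bool pmu_running_at_boot;

	static int __init parse_pmu_running(char *p)
	{
		pmu_running_at_boot = true;
		return 0;
	}
	early_param("pmu_running", parse_pmu_running);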

Thanks,
Nick

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 27/43] KVM: PPC: Book3S HV P9: Move host OS save/restore functions to built-in
  2021-06-22 10:57   ` Nicholas Piggin
@ 2021-07-08  5:44     ` Athira Rajeev
  -1 siblings, 0 replies; 133+ messages in thread
From: Athira Rajeev @ 2021-07-08  5:32 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linuxppc-dev, kvm-ppc



> On 22-Jun-2021, at 4:27 PM, Nicholas Piggin <npiggin@gmail.com> wrote:
> 
> Move the P9 guest/host register switching functions to the built-in
> P9 entry code, and export it for nested to use as well.
> 
> This allows more flexibility in scheduling these supervisor privileged
> SPR accesses with the HV privileged and PR SPR accesses in the low level
> entry code.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/kvm/book3s_hv.c          | 351 +-------------------------
> arch/powerpc/kvm/book3s_hv.h          |  39 +++
> arch/powerpc/kvm/book3s_hv_p9_entry.c | 332 ++++++++++++++++++++++++
> 3 files changed, 372 insertions(+), 350 deletions(-)
> create mode 100644 arch/powerpc/kvm/book3s_hv.h
> 
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 35749b0b663f..a7660af22161 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -79,6 +79,7 @@
> #include <asm/dtl.h>
> 
> #include "book3s.h"
> +#include "book3s_hv.h"
> 
> #define CREATE_TRACE_POINTS
> #include "trace_hv.h"
> @@ -3675,356 +3676,6 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
> 	trace_kvmppc_run_core(vc, 1);
> }
> 
> -/*
> - * Privileged (non-hypervisor) host registers to save.
> - */
> -struct p9_host_os_sprs {
> -	unsigned long dscr;
> -	unsigned long tidr;
> -	unsigned long iamr;
> -	unsigned long amr;
> -	unsigned long fscr;
> -
> -	unsigned int pmc1;
> -	unsigned int pmc2;
> -	unsigned int pmc3;
> -	unsigned int pmc4;
> -	unsigned int pmc5;
> -	unsigned int pmc6;
> -	unsigned long mmcr0;
> -	unsigned long mmcr1;
> -	unsigned long mmcr2;
> -	unsigned long mmcr3;
> -	unsigned long mmcra;
> -	unsigned long siar;
> -	unsigned long sier1;
> -	unsigned long sier2;
> -	unsigned long sier3;
> -	unsigned long sdar;
> -};
> -
> -static void freeze_pmu(unsigned long mmcr0, unsigned long mmcra)
> -{
> -	if (!(mmcr0 & MMCR0_FC))
> -		goto do_freeze;
> -	if (mmcra & MMCRA_SAMPLE_ENABLE)
> -		goto do_freeze;
> -	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> -		if (!(mmcr0 & MMCR0_PMCCEXT))
> -			goto do_freeze;
> -		if (!(mmcra & MMCRA_BHRB_DISABLE))
> -			goto do_freeze;
> -	}
> -	return;
> -
> -do_freeze:
> -	mmcr0 = MMCR0_FC;
> -	mmcra = 0;
> -	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> -		mmcr0 |= MMCR0_PMCCEXT;
> -		mmcra = MMCRA_BHRB_DISABLE;
> -	}
> -
> -	mtspr(SPRN_MMCR0, mmcr0);
> -	mtspr(SPRN_MMCRA, mmcra);
> -	isync();
> -}
> -
> -static void switch_pmu_to_guest(struct kvm_vcpu *vcpu,
> -				struct p9_host_os_sprs *host_os_sprs)
> -{
> -	struct lppaca *lp;
> -	int load_pmu = 1;
> -
> -	lp = vcpu->arch.vpa.pinned_addr;
> -	if (lp)
> -		load_pmu = lp->pmcregs_in_use;
> -
> -	if (load_pmu)
> -	      vcpu->arch.hfscr |= HFSCR_PM;
> -
> -	/* Save host */
> -	if (ppc_get_pmu_inuse()) {
> -		/*
> -		 * It might be better to put PMU handling (at least for the
> -		 * host) in the perf subsystem because it knows more about what
> -		 * is being used.
> -		 */
> -
> -		/* POWER9, POWER10 do not implement HPMC or SPMC */
> -
> -		host_os_sprs->mmcr0 = mfspr(SPRN_MMCR0);
> -		host_os_sprs->mmcra = mfspr(SPRN_MMCRA);
> -
> -		freeze_pmu(host_os_sprs->mmcr0, host_os_sprs->mmcra);
> -
> -		host_os_sprs->pmc1 = mfspr(SPRN_PMC1);
> -		host_os_sprs->pmc2 = mfspr(SPRN_PMC2);
> -		host_os_sprs->pmc3 = mfspr(SPRN_PMC3);
> -		host_os_sprs->pmc4 = mfspr(SPRN_PMC4);
> -		host_os_sprs->pmc5 = mfspr(SPRN_PMC5);
> -		host_os_sprs->pmc6 = mfspr(SPRN_PMC6);
> -		host_os_sprs->mmcr1 = mfspr(SPRN_MMCR1);
> -		host_os_sprs->mmcr2 = mfspr(SPRN_MMCR2);
> -		host_os_sprs->sdar = mfspr(SPRN_SDAR);
> -		host_os_sprs->siar = mfspr(SPRN_SIAR);
> -		host_os_sprs->sier1 = mfspr(SPRN_SIER);
> -
> -		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> -			host_os_sprs->mmcr3 = mfspr(SPRN_MMCR3);
> -			host_os_sprs->sier2 = mfspr(SPRN_SIER2);
> -			host_os_sprs->sier3 = mfspr(SPRN_SIER3);
> -		}
> -	}
> -
> -#ifdef CONFIG_PPC_PSERIES
> -	if (kvmhv_on_pseries()) {
> -		if (vcpu->arch.vpa.pinned_addr) {
> -			struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
> -			get_lppaca()->pmcregs_in_use = lp->pmcregs_in_use;
> -		} else {
> -			get_lppaca()->pmcregs_in_use = 1;
> -		}
> -	}
> -#endif
> -
> -	/* Load guest */
> -	if (vcpu->arch.hfscr & HFSCR_PM) {
> -		mtspr(SPRN_PMC1, vcpu->arch.pmc[0]);
> -		mtspr(SPRN_PMC2, vcpu->arch.pmc[1]);
> -		mtspr(SPRN_PMC3, vcpu->arch.pmc[2]);
> -		mtspr(SPRN_PMC4, vcpu->arch.pmc[3]);
> -		mtspr(SPRN_PMC5, vcpu->arch.pmc[4]);
> -		mtspr(SPRN_PMC6, vcpu->arch.pmc[5]);
> -		mtspr(SPRN_MMCR1, vcpu->arch.mmcr[1]);
> -		mtspr(SPRN_MMCR2, vcpu->arch.mmcr[2]);
> -		mtspr(SPRN_SDAR, vcpu->arch.sdar);
> -		mtspr(SPRN_SIAR, vcpu->arch.siar);
> -		mtspr(SPRN_SIER, vcpu->arch.sier[0]);
> -
> -		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> -			mtspr(SPRN_MMCR3, vcpu->arch.mmcr[4]);


Hi Nick,

I have a doubt here:
for MMCR3, shouldn't this be vcpu->arch.mmcr[3]?
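
Presumably the intended load would be (a guess, for reference):

	mtspr(SPRN_MMCR3, vcpu->arch.mmcr[3]);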

Thanks
Athira
> -			mtspr(SPRN_SIER2, vcpu->arch.sier[1]);
> -			mtspr(SPRN_SIER3, vcpu->arch.sier[2]);
> -		}
> -
> -		/* Set MMCRA then MMCR0 last */
> -		mtspr(SPRN_MMCRA, vcpu->arch.mmcra);
> -		mtspr(SPRN_MMCR0, vcpu->arch.mmcr[0]);
> -		/* No isync necessary because we're starting counters */
> -	}
> -}
> -
> -static void switch_pmu_to_host(struct kvm_vcpu *vcpu,
> -				struct p9_host_os_sprs *host_os_sprs)
> -{
> -	struct lppaca *lp;
> -	int save_pmu = 1;
> -
> -	lp = vcpu->arch.vpa.pinned_addr;
> -	if (lp)
> -		save_pmu = lp->pmcregs_in_use;
> -
> -	if (save_pmu) {
> -		vcpu->arch.mmcr[0] = mfspr(SPRN_MMCR0);
> -		vcpu->arch.mmcra = mfspr(SPRN_MMCRA);
> -
> -		freeze_pmu(vcpu->arch.mmcr[0], vcpu->arch.mmcra);
> -
> -		vcpu->arch.pmc[0] = mfspr(SPRN_PMC1);
> -		vcpu->arch.pmc[1] = mfspr(SPRN_PMC2);
> -		vcpu->arch.pmc[2] = mfspr(SPRN_PMC3);
> -		vcpu->arch.pmc[3] = mfspr(SPRN_PMC4);
> -		vcpu->arch.pmc[4] = mfspr(SPRN_PMC5);
> -		vcpu->arch.pmc[5] = mfspr(SPRN_PMC6);
> -		vcpu->arch.mmcr[1] = mfspr(SPRN_MMCR1);
> -		vcpu->arch.mmcr[2] = mfspr(SPRN_MMCR2);
> -		vcpu->arch.sdar = mfspr(SPRN_SDAR);
> -		vcpu->arch.siar = mfspr(SPRN_SIAR);
> -		vcpu->arch.sier[0] = mfspr(SPRN_SIER);
> -
> -		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> -			vcpu->arch.mmcr[3] = mfspr(SPRN_MMCR3);
> -			vcpu->arch.sier[1] = mfspr(SPRN_SIER2);
> -			vcpu->arch.sier[2] = mfspr(SPRN_SIER3);
> -		}
> -
> -	} else if (vcpu->arch.hfscr & HFSCR_PM) {
> -		/*
> -		 * The guest accessed PMC SPRs without specifying they should
> -		 * be preserved. Stop them from counting if the guest had
> -		 * started anything.
> -		 */
> -		freeze_pmu(mfspr(SPRN_MMCR0), mfspr(SPRN_MMCRA));
> -
> -		/*
> -		 * Demand-fault PMU register access in the guest.
> -		 *
> -		 * This is used to grab the guest's VPA pmcregs_in_use value
> -		 * and reflect it into the host's VPA in the case of a nested
> -		 * hypervisor.
> -		 *
> -		 * It also avoids having to zero-out SPRs after each guest
> -		 * exit to avoid side-channels.
> -		 *
> -		 * This is cleared here when we exit the guest, so later HFSCR
> -		 * interrupt handling can add it back to run the guest with
> -		 * PM enabled next time.
> -		 */
> -		vcpu->arch.hfscr &= ~HFSCR_PM;
> -	} /* otherwise the PMU should still be frozen from guest entry */
> -
> -#ifdef CONFIG_PPC_PSERIES
> -	if (kvmhv_on_pseries())
> -		get_lppaca()->pmcregs_in_use = ppc_get_pmu_inuse();
> -#endif
> -
> -	if (ppc_get_pmu_inuse()) {
> -		mtspr(SPRN_PMC1, host_os_sprs->pmc1);
> -		mtspr(SPRN_PMC2, host_os_sprs->pmc2);
> -		mtspr(SPRN_PMC3, host_os_sprs->pmc3);
> -		mtspr(SPRN_PMC4, host_os_sprs->pmc4);
> -		mtspr(SPRN_PMC5, host_os_sprs->pmc5);
> -		mtspr(SPRN_PMC6, host_os_sprs->pmc6);
> -		mtspr(SPRN_MMCR1, host_os_sprs->mmcr1);
> -		mtspr(SPRN_MMCR2, host_os_sprs->mmcr2);
> -		mtspr(SPRN_SDAR, host_os_sprs->sdar);
> -		mtspr(SPRN_SIAR, host_os_sprs->siar);
> -		mtspr(SPRN_SIER, host_os_sprs->sier1);
> -
> -		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> -			mtspr(SPRN_MMCR3, host_os_sprs->mmcr3);
> -			mtspr(SPRN_SIER2, host_os_sprs->sier2);
> -			mtspr(SPRN_SIER3, host_os_sprs->sier3);
> -		}
> -
> -		/* Set MMCRA then MMCR0 last */
> -		mtspr(SPRN_MMCRA, host_os_sprs->mmcra);
> -		mtspr(SPRN_MMCR0, host_os_sprs->mmcr0);
> -		isync();
> -	}
> -}
> -
> -static void load_spr_state(struct kvm_vcpu *vcpu,
> -				struct p9_host_os_sprs *host_os_sprs)
> -{
> -	mtspr(SPRN_TAR, vcpu->arch.tar);
> -	mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
> -	mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
> -	mtspr(SPRN_BESCR, vcpu->arch.bescr);
> -
> -	if (!cpu_has_feature(CPU_FTR_ARCH_31))
> -		mtspr(SPRN_TIDR, vcpu->arch.tid);
> -	if (host_os_sprs->iamr != vcpu->arch.iamr)
> -		mtspr(SPRN_IAMR, vcpu->arch.iamr);
> -	if (host_os_sprs->amr != vcpu->arch.amr)
> -		mtspr(SPRN_AMR, vcpu->arch.amr);
> -	if (vcpu->arch.uamor != 0)
> -		mtspr(SPRN_UAMOR, vcpu->arch.uamor);
> -	if (host_os_sprs->fscr != vcpu->arch.fscr)
> -		mtspr(SPRN_FSCR, vcpu->arch.fscr);
> -	if (host_os_sprs->dscr != vcpu->arch.dscr)
> -		mtspr(SPRN_DSCR, vcpu->arch.dscr);
> -	if (vcpu->arch.pspb != 0)
> -		mtspr(SPRN_PSPB, vcpu->arch.pspb);
> -
> -	/*
> -	 * DAR, DSISR, and for nested HV, SPRGs must be set with MSR[RI]
> -	 * clear (or hstate set appropriately to catch those registers
> -	 * being clobbered if we take a MCE or SRESET), so those are done
> -	 * later.
> -	 */
> -
> -	if (!(vcpu->arch.ctrl & 1))
> -		mtspr(SPRN_CTRLT, 0);
> -}
> -
> -static void store_spr_state(struct kvm_vcpu *vcpu)
> -{
> -	vcpu->arch.tar = mfspr(SPRN_TAR);
> -	vcpu->arch.ebbhr = mfspr(SPRN_EBBHR);
> -	vcpu->arch.ebbrr = mfspr(SPRN_EBBRR);
> -	vcpu->arch.bescr = mfspr(SPRN_BESCR);
> -
> -	if (!cpu_has_feature(CPU_FTR_ARCH_31))
> -		vcpu->arch.tid = mfspr(SPRN_TIDR);
> -	vcpu->arch.iamr = mfspr(SPRN_IAMR);
> -	vcpu->arch.amr = mfspr(SPRN_AMR);
> -	vcpu->arch.uamor = mfspr(SPRN_UAMOR);
> -	vcpu->arch.fscr = mfspr(SPRN_FSCR);
> -	vcpu->arch.dscr = mfspr(SPRN_DSCR);
> -	vcpu->arch.pspb = mfspr(SPRN_PSPB);
> -
> -	vcpu->arch.ctrl = mfspr(SPRN_CTRLF);
> -}
> -
> -static void load_vcpu_state(struct kvm_vcpu *vcpu,
> -			   struct p9_host_os_sprs *host_os_sprs)
> -{
> -	if (cpu_has_feature(CPU_FTR_TM) ||
> -	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
> -		kvmppc_restore_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
> -
> -	load_spr_state(vcpu, host_os_sprs);
> -
> -	load_fp_state(&vcpu->arch.fp);
> -#ifdef CONFIG_ALTIVEC
> -	load_vr_state(&vcpu->arch.vr);
> -#endif
> -	mtspr(SPRN_VRSAVE, vcpu->arch.vrsave);
> -}
> -
> -static void store_vcpu_state(struct kvm_vcpu *vcpu)
> -{
> -	store_spr_state(vcpu);
> -
> -	store_fp_state(&vcpu->arch.fp);
> -#ifdef CONFIG_ALTIVEC
> -	store_vr_state(&vcpu->arch.vr);
> -#endif
> -	vcpu->arch.vrsave = mfspr(SPRN_VRSAVE);
> -
> -	if (cpu_has_feature(CPU_FTR_TM) ||
> -	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
> -		kvmppc_save_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
> -}
> -
> -static void save_p9_host_os_sprs(struct p9_host_os_sprs *host_os_sprs)
> -{
> -	if (!cpu_has_feature(CPU_FTR_ARCH_31))
> -		host_os_sprs->tidr = mfspr(SPRN_TIDR);
> -	host_os_sprs->iamr = mfspr(SPRN_IAMR);
> -	host_os_sprs->amr = mfspr(SPRN_AMR);
> -	host_os_sprs->fscr = mfspr(SPRN_FSCR);
> -	host_os_sprs->dscr = mfspr(SPRN_DSCR);
> -}
> -
> -/* vcpu guest regs must already be saved */
> -static void restore_p9_host_os_sprs(struct kvm_vcpu *vcpu,
> -				    struct p9_host_os_sprs *host_os_sprs)
> -{
> -	mtspr(SPRN_SPRG_VDSO_WRITE, local_paca->sprg_vdso);
> -
> -	if (!cpu_has_feature(CPU_FTR_ARCH_31))
> -		mtspr(SPRN_TIDR, host_os_sprs->tidr);
> -	if (host_os_sprs->iamr != vcpu->arch.iamr)
> -		mtspr(SPRN_IAMR, host_os_sprs->iamr);
> -	if (vcpu->arch.uamor != 0)
> -		mtspr(SPRN_UAMOR, 0);
> -	if (host_os_sprs->amr != vcpu->arch.amr)
> -		mtspr(SPRN_AMR, host_os_sprs->amr);
> -	if (host_os_sprs->fscr != vcpu->arch.fscr)
> -		mtspr(SPRN_FSCR, host_os_sprs->fscr);
> -	if (host_os_sprs->dscr != vcpu->arch.dscr)
> -		mtspr(SPRN_DSCR, host_os_sprs->dscr);
> -	if (vcpu->arch.pspb != 0)
> -		mtspr(SPRN_PSPB, 0);
> -
> -	/* Save guest CTRL register, set runlatch to 1 */
> -	if (!(vcpu->arch.ctrl & 1))
> -		mtspr(SPRN_CTRLT, 1);
> -}
> -
> static inline bool hcall_is_xics(unsigned long req)
> {
> 	return req == H_EOI || req == H_CPPR || req == H_IPI ||
> diff --git a/arch/powerpc/kvm/book3s_hv.h b/arch/powerpc/kvm/book3s_hv.h
> new file mode 100644
> index 000000000000..72e3a8f4c2cf
> --- /dev/null
> +++ b/arch/powerpc/kvm/book3s_hv.h
> @@ -0,0 +1,39 @@
> +
> +/*
> + * Privileged (non-hypervisor) host registers to save.
> + */
> +struct p9_host_os_sprs {
> +	unsigned long dscr;
> +	unsigned long tidr;
> +	unsigned long iamr;
> +	unsigned long amr;
> +	unsigned long fscr;
> +
> +	unsigned int pmc1;
> +	unsigned int pmc2;
> +	unsigned int pmc3;
> +	unsigned int pmc4;
> +	unsigned int pmc5;
> +	unsigned int pmc6;
> +	unsigned long mmcr0;
> +	unsigned long mmcr1;
> +	unsigned long mmcr2;
> +	unsigned long mmcr3;
> +	unsigned long mmcra;
> +	unsigned long siar;
> +	unsigned long sier1;
> +	unsigned long sier2;
> +	unsigned long sier3;
> +	unsigned long sdar;
> +};
> +
> +void load_vcpu_state(struct kvm_vcpu *vcpu,
> +			   struct p9_host_os_sprs *host_os_sprs);
> +void store_vcpu_state(struct kvm_vcpu *vcpu);
> +void save_p9_host_os_sprs(struct p9_host_os_sprs *host_os_sprs);
> +void restore_p9_host_os_sprs(struct kvm_vcpu *vcpu,
> +				    struct p9_host_os_sprs *host_os_sprs);
> +void switch_pmu_to_guest(struct kvm_vcpu *vcpu,
> +			    struct p9_host_os_sprs *host_os_sprs);
> +void switch_pmu_to_host(struct kvm_vcpu *vcpu,
> +			    struct p9_host_os_sprs *host_os_sprs);
> diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
> index afdd7dfa1c08..cc74cd314fcf 100644
> --- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
> +++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
> @@ -4,8 +4,340 @@
> #include <asm/asm-prototypes.h>
> #include <asm/dbell.h>
> #include <asm/kvm_ppc.h>
> +#include <asm/pmc.h>
> #include <asm/ppc-opcode.h>
> 
> +#include "book3s_hv.h"
> +
> +static void freeze_pmu(unsigned long mmcr0, unsigned long mmcra)
> +{
> +	if (!(mmcr0 & MMCR0_FC))
> +		goto do_freeze;
> +	if (mmcra & MMCRA_SAMPLE_ENABLE)
> +		goto do_freeze;
> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +		if (!(mmcr0 & MMCR0_PMCCEXT))
> +			goto do_freeze;
> +		if (!(mmcra & MMCRA_BHRB_DISABLE))
> +			goto do_freeze;
> +	}
> +	return;
> +
> +do_freeze:
> +	mmcr0 = MMCR0_FC;
> +	mmcra = 0;
> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +		mmcr0 |= MMCR0_PMCCEXT;
> +		mmcra = MMCRA_BHRB_DISABLE;
> +	}
> +
> +	mtspr(SPRN_MMCR0, mmcr0);
> +	mtspr(SPRN_MMCRA, mmcra);
> +	isync();
> +}
> +
> +void switch_pmu_to_guest(struct kvm_vcpu *vcpu,
> +				struct p9_host_os_sprs *host_os_sprs)
> +{
> +	struct lppaca *lp;
> +	int load_pmu = 1;
> +
> +	lp = vcpu->arch.vpa.pinned_addr;
> +	if (lp)
> +		load_pmu = lp->pmcregs_in_use;
> +
> +	if (load_pmu)
> +	      vcpu->arch.hfscr |= HFSCR_PM;
> +
> +	/* Save host */
> +	if (ppc_get_pmu_inuse()) {
> +		/*
> +		 * It might be better to put PMU handling (at least for the
> +		 * host) in the perf subsystem because it knows more about what
> +		 * is being used.
> +		 */
> +
> +		/* POWER9, POWER10 do not implement HPMC or SPMC */
> +
> +		host_os_sprs->mmcr0 = mfspr(SPRN_MMCR0);
> +		host_os_sprs->mmcra = mfspr(SPRN_MMCRA);
> +
> +		freeze_pmu(host_os_sprs->mmcr0, host_os_sprs->mmcra);
> +
> +		host_os_sprs->pmc1 = mfspr(SPRN_PMC1);
> +		host_os_sprs->pmc2 = mfspr(SPRN_PMC2);
> +		host_os_sprs->pmc3 = mfspr(SPRN_PMC3);
> +		host_os_sprs->pmc4 = mfspr(SPRN_PMC4);
> +		host_os_sprs->pmc5 = mfspr(SPRN_PMC5);
> +		host_os_sprs->pmc6 = mfspr(SPRN_PMC6);
> +		host_os_sprs->mmcr1 = mfspr(SPRN_MMCR1);
> +		host_os_sprs->mmcr2 = mfspr(SPRN_MMCR2);
> +		host_os_sprs->sdar = mfspr(SPRN_SDAR);
> +		host_os_sprs->siar = mfspr(SPRN_SIAR);
> +		host_os_sprs->sier1 = mfspr(SPRN_SIER);
> +
> +		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +			host_os_sprs->mmcr3 = mfspr(SPRN_MMCR3);
> +			host_os_sprs->sier2 = mfspr(SPRN_SIER2);
> +			host_os_sprs->sier3 = mfspr(SPRN_SIER3);
> +		}
> +	}
> +
> +#ifdef CONFIG_PPC_PSERIES
> +	if (kvmhv_on_pseries()) {
> +		if (vcpu->arch.vpa.pinned_addr) {
> +			struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
> +			get_lppaca()->pmcregs_in_use = lp->pmcregs_in_use;
> +		} else {
> +			get_lppaca()->pmcregs_in_use = 1;
> +		}
> +	}
> +#endif
> +
> +	/* Load guest */
> +	if (vcpu->arch.hfscr & HFSCR_PM) {
> +		mtspr(SPRN_PMC1, vcpu->arch.pmc[0]);
> +		mtspr(SPRN_PMC2, vcpu->arch.pmc[1]);
> +		mtspr(SPRN_PMC3, vcpu->arch.pmc[2]);
> +		mtspr(SPRN_PMC4, vcpu->arch.pmc[3]);
> +		mtspr(SPRN_PMC5, vcpu->arch.pmc[4]);
> +		mtspr(SPRN_PMC6, vcpu->arch.pmc[5]);
> +		mtspr(SPRN_MMCR1, vcpu->arch.mmcr[1]);
> +		mtspr(SPRN_MMCR2, vcpu->arch.mmcr[2]);
> +		mtspr(SPRN_SDAR, vcpu->arch.sdar);
> +		mtspr(SPRN_SIAR, vcpu->arch.siar);
> +		mtspr(SPRN_SIER, vcpu->arch.sier[0]);
> +
> +		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +			mtspr(SPRN_MMCR3, vcpu->arch.mmcr[4]);
> +			mtspr(SPRN_SIER2, vcpu->arch.sier[1]);
> +			mtspr(SPRN_SIER3, vcpu->arch.sier[2]);
> +		}
> +
> +		/* Set MMCRA then MMCR0 last */
> +		mtspr(SPRN_MMCRA, vcpu->arch.mmcra);
> +		mtspr(SPRN_MMCR0, vcpu->arch.mmcr[0]);
> +		/* No isync necessary because we're starting counters */
> +	}
> +}
> +EXPORT_SYMBOL_GPL(switch_pmu_to_guest);
> +
> +void switch_pmu_to_host(struct kvm_vcpu *vcpu,
> +				struct p9_host_os_sprs *host_os_sprs)
> +{
> +	struct lppaca *lp;
> +	int save_pmu = 1;
> +
> +	lp = vcpu->arch.vpa.pinned_addr;
> +	if (lp)
> +		save_pmu = lp->pmcregs_in_use;
> +
> +	if (save_pmu) {
> +		vcpu->arch.mmcr[0] = mfspr(SPRN_MMCR0);
> +		vcpu->arch.mmcra = mfspr(SPRN_MMCRA);
> +
> +		freeze_pmu(vcpu->arch.mmcr[0], vcpu->arch.mmcra);
> +
> +		vcpu->arch.pmc[0] = mfspr(SPRN_PMC1);
> +		vcpu->arch.pmc[1] = mfspr(SPRN_PMC2);
> +		vcpu->arch.pmc[2] = mfspr(SPRN_PMC3);
> +		vcpu->arch.pmc[3] = mfspr(SPRN_PMC4);
> +		vcpu->arch.pmc[4] = mfspr(SPRN_PMC5);
> +		vcpu->arch.pmc[5] = mfspr(SPRN_PMC6);
> +		vcpu->arch.mmcr[1] = mfspr(SPRN_MMCR1);
> +		vcpu->arch.mmcr[2] = mfspr(SPRN_MMCR2);
> +		vcpu->arch.sdar = mfspr(SPRN_SDAR);
> +		vcpu->arch.siar = mfspr(SPRN_SIAR);
> +		vcpu->arch.sier[0] = mfspr(SPRN_SIER);
> +
> +		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +			vcpu->arch.mmcr[3] = mfspr(SPRN_MMCR3);
> +			vcpu->arch.sier[1] = mfspr(SPRN_SIER2);
> +			vcpu->arch.sier[2] = mfspr(SPRN_SIER3);
> +		}
> +
> +	} else if (vcpu->arch.hfscr & HFSCR_PM) {
> +		/*
> +		 * The guest accessed PMC SPRs without specifying they should
> +		 * be preserved. Stop them from counting if the guest had
> +		 * started anything.
> +		 */
> +		freeze_pmu(mfspr(SPRN_MMCR0), mfspr(SPRN_MMCRA));
> +
> +		/*
> +		 * Demand-fault PMU register access in the guest.
> +		 *
> +		 * This is used to grab the guest's VPA pmcregs_in_use value
> +		 * and reflect it into the host's VPA in the case of a nested
> +		 * hypervisor.
> +		 *
> +		 * It also avoids having to zero-out SPRs after each guest
> +		 * exit to avoid side-channels.
> +		 *
> +		 * This is cleared here when we exit the guest, so later HFSCR
> +		 * interrupt handling can add it back to run the guest with
> +		 * PM enabled next time.
> +		 */
> +		vcpu->arch.hfscr &= ~HFSCR_PM;
> +	} /* otherwise the PMU should still be frozen from guest entry */
> +
> +
> +#ifdef CONFIG_PPC_PSERIES
> +	if (kvmhv_on_pseries())
> +		get_lppaca()->pmcregs_in_use = ppc_get_pmu_inuse();
> +#endif
> +
> +	if (ppc_get_pmu_inuse()) {
> +		mtspr(SPRN_PMC1, host_os_sprs->pmc1);
> +		mtspr(SPRN_PMC2, host_os_sprs->pmc2);
> +		mtspr(SPRN_PMC3, host_os_sprs->pmc3);
> +		mtspr(SPRN_PMC4, host_os_sprs->pmc4);
> +		mtspr(SPRN_PMC5, host_os_sprs->pmc5);
> +		mtspr(SPRN_PMC6, host_os_sprs->pmc6);
> +		mtspr(SPRN_MMCR1, host_os_sprs->mmcr1);
> +		mtspr(SPRN_MMCR2, host_os_sprs->mmcr2);
> +		mtspr(SPRN_SDAR, host_os_sprs->sdar);
> +		mtspr(SPRN_SIAR, host_os_sprs->siar);
> +		mtspr(SPRN_SIER, host_os_sprs->sier1);
> +
> +		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +			mtspr(SPRN_MMCR3, host_os_sprs->mmcr3);
> +			mtspr(SPRN_SIER2, host_os_sprs->sier2);
> +			mtspr(SPRN_SIER3, host_os_sprs->sier3);
> +		}
> +
> +		/* Set MMCRA then MMCR0 last */
> +		mtspr(SPRN_MMCRA, host_os_sprs->mmcra);
> +		mtspr(SPRN_MMCR0, host_os_sprs->mmcr0);
> +		isync();
> +	}
> +}
> +EXPORT_SYMBOL_GPL(switch_pmu_to_host);
> +
> +static void load_spr_state(struct kvm_vcpu *vcpu,
> +				struct p9_host_os_sprs *host_os_sprs)
> +{
> +	mtspr(SPRN_TAR, vcpu->arch.tar);
> +	mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
> +	mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
> +	mtspr(SPRN_BESCR, vcpu->arch.bescr);
> +
> +	if (!cpu_has_feature(CPU_FTR_ARCH_31))
> +		mtspr(SPRN_TIDR, vcpu->arch.tid);
> +	if (host_os_sprs->iamr != vcpu->arch.iamr)
> +		mtspr(SPRN_IAMR, vcpu->arch.iamr);
> +	if (host_os_sprs->amr != vcpu->arch.amr)
> +		mtspr(SPRN_AMR, vcpu->arch.amr);
> +	if (vcpu->arch.uamor != 0)
> +		mtspr(SPRN_UAMOR, vcpu->arch.uamor);
> +	if (host_os_sprs->fscr != vcpu->arch.fscr)
> +		mtspr(SPRN_FSCR, vcpu->arch.fscr);
> +	if (host_os_sprs->dscr != vcpu->arch.dscr)
> +		mtspr(SPRN_DSCR, vcpu->arch.dscr);
> +	if (vcpu->arch.pspb != 0)
> +		mtspr(SPRN_PSPB, vcpu->arch.pspb);
> +
> +	/*
> +	 * DAR, DSISR, and for nested HV, SPRGs must be set with MSR[RI]
> +	 * clear (or hstate set appropriately to catch those registers
> +	 * being clobbered if we take a MCE or SRESET), so those are done
> +	 * later.
> +	 */
> +
> +	if (!(vcpu->arch.ctrl & 1))
> +		mtspr(SPRN_CTRLT, 0);
> +}
> +
> +static void store_spr_state(struct kvm_vcpu *vcpu)
> +{
> +	vcpu->arch.tar = mfspr(SPRN_TAR);
> +	vcpu->arch.ebbhr = mfspr(SPRN_EBBHR);
> +	vcpu->arch.ebbrr = mfspr(SPRN_EBBRR);
> +	vcpu->arch.bescr = mfspr(SPRN_BESCR);
> +
> +	if (!cpu_has_feature(CPU_FTR_ARCH_31))
> +		vcpu->arch.tid = mfspr(SPRN_TIDR);
> +	vcpu->arch.iamr = mfspr(SPRN_IAMR);
> +	vcpu->arch.amr = mfspr(SPRN_AMR);
> +	vcpu->arch.uamor = mfspr(SPRN_UAMOR);
> +	vcpu->arch.fscr = mfspr(SPRN_FSCR);
> +	vcpu->arch.dscr = mfspr(SPRN_DSCR);
> +	vcpu->arch.pspb = mfspr(SPRN_PSPB);
> +
> +	vcpu->arch.ctrl = mfspr(SPRN_CTRLF);
> +}
> +
> +void load_vcpu_state(struct kvm_vcpu *vcpu,
> +			   struct p9_host_os_sprs *host_os_sprs)
> +{
> +	if (cpu_has_feature(CPU_FTR_TM) ||
> +	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
> +		kvmppc_restore_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
> +
> +	load_spr_state(vcpu, host_os_sprs);
> +
> +	load_fp_state(&vcpu->arch.fp);
> +#ifdef CONFIG_ALTIVEC
> +	load_vr_state(&vcpu->arch.vr);
> +#endif
> +	mtspr(SPRN_VRSAVE, vcpu->arch.vrsave);
> +}
> +EXPORT_SYMBOL_GPL(load_vcpu_state);
> +
> +void store_vcpu_state(struct kvm_vcpu *vcpu)
> +{
> +	store_spr_state(vcpu);
> +
> +	store_fp_state(&vcpu->arch.fp);
> +#ifdef CONFIG_ALTIVEC
> +	store_vr_state(&vcpu->arch.vr);
> +#endif
> +	vcpu->arch.vrsave = mfspr(SPRN_VRSAVE);
> +
> +	if (cpu_has_feature(CPU_FTR_TM) ||
> +	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
> +		kvmppc_save_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
> +}
> +EXPORT_SYMBOL_GPL(store_vcpu_state);
> +
> +void save_p9_host_os_sprs(struct p9_host_os_sprs *host_os_sprs)
> +{
> +	if (!cpu_has_feature(CPU_FTR_ARCH_31))
> +		host_os_sprs->tidr = mfspr(SPRN_TIDR);
> +	host_os_sprs->iamr = mfspr(SPRN_IAMR);
> +	host_os_sprs->amr = mfspr(SPRN_AMR);
> +	host_os_sprs->fscr = mfspr(SPRN_FSCR);
> +	host_os_sprs->dscr = mfspr(SPRN_DSCR);
> +}
> +EXPORT_SYMBOL_GPL(save_p9_host_os_sprs);
> +
> +/* vcpu guest regs must already be saved */
> +void restore_p9_host_os_sprs(struct kvm_vcpu *vcpu,
> +				    struct p9_host_os_sprs *host_os_sprs)
> +{
> +	mtspr(SPRN_SPRG_VDSO_WRITE, local_paca->sprg_vdso);
> +
> +	if (!cpu_has_feature(CPU_FTR_ARCH_31))
> +		mtspr(SPRN_TIDR, host_os_sprs->tidr);
> +	if (host_os_sprs->iamr != vcpu->arch.iamr)
> +		mtspr(SPRN_IAMR, host_os_sprs->iamr);
> +	if (vcpu->arch.uamor != 0)
> +		mtspr(SPRN_UAMOR, 0);
> +	if (host_os_sprs->amr != vcpu->arch.amr)
> +		mtspr(SPRN_AMR, host_os_sprs->amr);
> +	if (host_os_sprs->fscr != vcpu->arch.fscr)
> +		mtspr(SPRN_FSCR, host_os_sprs->fscr);
> +	if (host_os_sprs->dscr != vcpu->arch.dscr)
> +		mtspr(SPRN_DSCR, host_os_sprs->dscr);
> +	if (vcpu->arch.pspb != 0)
> +		mtspr(SPRN_PSPB, 0);
> +
> +	/* Save guest CTRL register, set runlatch to 1 */
> +	if (!(vcpu->arch.ctrl & 1))
> +		mtspr(SPRN_CTRLT, 1);
> +}
> +EXPORT_SYMBOL_GPL(restore_p9_host_os_sprs);
> +
> #ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
> static void __start_timing(struct kvm_vcpu *vcpu, struct kvmhv_tb_accumulator *next)
> {
> -- 
> 2.23.0
> 


^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use
  2021-07-02  0:27       ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in u Nicholas Piggin
@ 2021-07-08 12:45         ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-07-08 12:45 UTC (permalink / raw)
  To: kvm-ppc, Madhavan Srinivasan; +Cc: linuxppc-dev

Excerpts from Nicholas Piggin's message of July 2, 2021 10:27 am:
> Excerpts from Madhavan Srinivasan's message of July 1, 2021 11:17 pm:
>> 
>> On 6/22/21 4:27 PM, Nicholas Piggin wrote:
>>> KVM PMU management code looks for particular frozen/disabled bits in
>>> the PMU registers so it knows whether it must clear them when coming
>>> out of a guest or not. Setting this up helps KVM make these optimisations
>>> without getting confused. Longer term the better approach might be to
>>> move guest/host PMU switching to the perf subsystem.
>>>
>>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>>> ---
>>>   arch/powerpc/kernel/cpu_setup_power.c | 4 ++--
>>>   arch/powerpc/kernel/dt_cpu_ftrs.c     | 6 +++---
>>>   arch/powerpc/kvm/book3s_hv.c          | 5 +++++
>>>   arch/powerpc/perf/core-book3s.c       | 7 +++++++
>>>   4 files changed, 17 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kernel/cpu_setup_power.c b/arch/powerpc/kernel/cpu_setup_power.c
>>> index a29dc8326622..3dc61e203f37 100644
>>> --- a/arch/powerpc/kernel/cpu_setup_power.c
>>> +++ b/arch/powerpc/kernel/cpu_setup_power.c
>>> @@ -109,7 +109,7 @@ static void init_PMU_HV_ISA207(void)
>>>   static void init_PMU(void)
>>>   {
>>>   	mtspr(SPRN_MMCRA, 0);
>>> -	mtspr(SPRN_MMCR0, 0);
>>> +	mtspr(SPRN_MMCR0, MMCR0_FC);
>> 
>> The sticky point here is, currently if not frozen, pmc5/6 will
>> keep counting. And not freezing them at boot is quite useful
>> sometimes, like say when running in a simulation where we could calculate
>> approx CPIs for micro benchmarks without the perf subsystem.
> 
> You can't even use the sysfs files in this sim environment? In that case
> what if we added a boot option that could set some things up? In that 
> case possibly you could even gather some more types of events too.

What if we added this to allow sim environments to run PMC5/6 and 
additionally specify MMCR1 without userspace involvement?

Thanks,
Nick

---
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index af8a4981c6f6..454771243529 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2425,8 +2425,24 @@ int register_power_pmu(struct power_pmu *pmu)
 }
 
 #ifdef CONFIG_PPC64
+static bool pmu_override = false;
+static unsigned long pmu_override_val;
+static void do_pmu_override(void *data)
+{
+	ppc_set_pmu_inuse(1);
+	if (pmu_override_val)
+		mtspr(SPRN_MMCR1, pmu_override_val);
+	mtspr(SPRN_MMCR0, mfspr(SPRN_MMCR0) & ~MMCR0_FC);
+}
+
 static int __init init_ppc64_pmu(void)
 {
+	if (cpu_has_feature(CPU_FTR_HVMODE) && pmu_override) {
+		printk(KERN_WARNING "perf: disabling perf due to pmu= command line option.\n");
+		on_each_cpu(do_pmu_override, NULL, 1);
+		return 0;
+	}
+
 	/* run through all the pmu drivers one at a time */
 	if (!init_power5_pmu())
 		return 0;
@@ -2448,4 +2464,23 @@ static int __init init_ppc64_pmu(void)
 		return init_generic_compat_pmu();
 }
 early_initcall(init_ppc64_pmu);
+
+static int __init pmu_setup(char *str)
+{
+	unsigned long val;
+
+	if (!early_cpu_has_feature(CPU_FTR_HVMODE))
+		return 0;
+
+	pmu_override = true;
+
+	if (kstrtoul(str, 0, &val))
+		val = 0;
+
+	pmu_override_val = val;
+
+	return 1;
+}
+__setup("pmu=", pmu_setup);
+
 #endif
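
For illustration only (not part of the sketch above, and the MMCR1 value
below is an arbitrary placeholder rather than a meaningful event selection),
the override would then be requested on the kernel command line, e.g.:

	pmu=0			unfreeze MMCR0[FC] only, so PMC5/6 free-run
	pmu=0x00000000f0000000	also program MMCR1 with the given value

With PMC5 (instructions completed) and PMC6 (run cycles) free-running, the
sim environment could read the counters directly and estimate CPI as
PMC6/PMC5 without involving the perf subsystem.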

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 34/43] KVM: PPC: Book3S HV P9: Demand fault EBB facility registers
  2021-06-22 10:57   ` Nicholas Piggin
@ 2021-07-08 17:46     ` Fabiano Rosas
  -1 siblings, 0 replies; 133+ messages in thread
From: Fabiano Rosas @ 2021-07-08 17:46 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Nicholas Piggin <npiggin@gmail.com> writes:

> Use HFSCR facility disabling to implement demand faulting for EBB, with
> a hysteresis counter similar to the load_fp etc counters in context
> switching that implement the equivalent demand faulting for userspace
> facilities.
>
> This speeds up guest entry/exit by avoiding the register save/restore
> when a guest is not frequently using them. When a guest does use them
> often, there will be some additional demand fault overhead, but these
> are not commonly used facilities.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>

> ---
>  arch/powerpc/include/asm/kvm_host.h   |  1 +
>  arch/powerpc/kvm/book3s_hv.c          | 11 +++++++++++
>  arch/powerpc/kvm/book3s_hv_nested.c   |  3 ++-
>  arch/powerpc/kvm/book3s_hv_p9_entry.c | 26 ++++++++++++++++++++------
>  4 files changed, 34 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
> index 118b388ea887..bee95106c1f2 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -585,6 +585,7 @@ struct kvm_vcpu_arch {
>  	ulong cfar;
>  	ulong ppr;
>  	u32 pspb;
> +	u8 load_ebb;
>  	ulong fscr;
>  	ulong shadow_fscr;
>  	ulong ebbhr;
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index ae528eb37792..99e9da078e7d 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -1366,6 +1366,13 @@ static int kvmppc_pmu_unavailable(struct kvm_vcpu *vcpu)
>  	return RESUME_GUEST;
>  }
>
> +static int kvmppc_ebb_unavailable(struct kvm_vcpu *vcpu)
> +{
> +	vcpu->arch.hfscr |= HFSCR_EBB;
> +
> +	return RESUME_GUEST;
> +}
> +
>  static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu,
>  				 struct task_struct *tsk)
>  {
> @@ -1645,6 +1652,8 @@ XXX benchmark guest exits
>  				r = kvmppc_emulate_doorbell_instr(vcpu);
>  			if (cause == FSCR_PM_LG)
>  				r = kvmppc_pmu_unavailable(vcpu);
> +			if (cause == FSCR_EBB_LG)
> +				r = kvmppc_ebb_unavailable(vcpu);
>  		}
>  		if (r == EMULATE_FAIL) {
>  			kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
> @@ -1764,6 +1773,8 @@ static int kvmppc_handle_nested_exit(struct kvm_vcpu *vcpu)
>  		r = EMULATE_FAIL;
>  		if (cause == FSCR_PM_LG && (vcpu->arch.nested_hfscr & HFSCR_PM))
>  			r = kvmppc_pmu_unavailable(vcpu);
> +		if (cause == FSCR_EBB_LG && (vcpu->arch.nested_hfscr & HFSCR_EBB))
> +			r = kvmppc_ebb_unavailable(vcpu);
>
>  		if (r == EMULATE_FAIL)
>  			r = RESUME_HOST;
> diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
> index 024b0ce5b702..ee8668f056f9 100644
> --- a/arch/powerpc/kvm/book3s_hv_nested.c
> +++ b/arch/powerpc/kvm/book3s_hv_nested.c
> @@ -168,7 +168,8 @@ static void sanitise_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr)
>  	 * but preserve the interrupt cause field and facilities that might
>  	 * be disabled for demand faulting in the L1.
>  	 */
> -	hr->hfscr &= (HFSCR_INTR_CAUSE | HFSCR_PM | vcpu->arch.hfscr);
> +	hr->hfscr &= (HFSCR_INTR_CAUSE | HFSCR_PM | HFSCR_EBB |
> +			vcpu->arch.hfscr);
>
>  	/* Don't let data address watchpoint match in hypervisor state */
>  	hr->dawrx0 &= ~DAWRX_HYP;
> diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
> index 4d1a2d1ff4c1..cf41261daa97 100644
> --- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
> +++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
> @@ -218,9 +218,12 @@ static void load_spr_state(struct kvm_vcpu *vcpu,
>  				struct p9_host_os_sprs *host_os_sprs)
>  {
>  	mtspr(SPRN_TAR, vcpu->arch.tar);
> -	mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
> -	mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
> -	mtspr(SPRN_BESCR, vcpu->arch.bescr);
> +
> +	if (vcpu->arch.hfscr & HFSCR_EBB) {
> +		mtspr(SPRN_EBBHR, vcpu->arch.ebbhr);
> +		mtspr(SPRN_EBBRR, vcpu->arch.ebbrr);
> +		mtspr(SPRN_BESCR, vcpu->arch.bescr);
> +	}
>
>  	if (!cpu_has_feature(CPU_FTR_ARCH_31))
>  		mtspr(SPRN_TIDR, vcpu->arch.tid);
> @@ -251,9 +254,20 @@ static void load_spr_state(struct kvm_vcpu *vcpu,
>  static void store_spr_state(struct kvm_vcpu *vcpu)
>  {
>  	vcpu->arch.tar = mfspr(SPRN_TAR);
> -	vcpu->arch.ebbhr = mfspr(SPRN_EBBHR);
> -	vcpu->arch.ebbrr = mfspr(SPRN_EBBRR);
> -	vcpu->arch.bescr = mfspr(SPRN_BESCR);
> +
> +	if (vcpu->arch.hfscr & HFSCR_EBB) {
> +		vcpu->arch.ebbhr = mfspr(SPRN_EBBHR);
> +		vcpu->arch.ebbrr = mfspr(SPRN_EBBRR);
> +		vcpu->arch.bescr = mfspr(SPRN_BESCR);
> +		/*
> +		 * This is like load_fp in context switching, turn off the
> +		 * facility after it wraps the u8 to try avoiding saving
> +		 * and restoring the registers each partition switch.
> +		 */
> +		vcpu->arch.load_ebb++;
> +		if (!vcpu->arch.load_ebb)
> +			vcpu->arch.hfscr &= ~HFSCR_EBB;
> +	}
>
>  	if (!cpu_has_feature(CPU_FTR_ARCH_31))
>  		vcpu->arch.tid = mfspr(SPRN_TIDR);
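
As a stand-alone illustration (not from the series) of the hysteresis
counter described in the comment above, assuming load_ebb starts from zero:
the facility stays enabled until the u8 wraps, i.e. it is switched off again
after 256 guest exits with EBB left on, so an idle guest stops paying the
EBB save/restore cost. A throwaway user-space model:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint8_t load_ebb = 0;		/* stand-in for vcpu->arch.load_ebb */
	int hfscr_ebb = 1;		/* facility just demand-faulted on */
	unsigned long exits = 0;

	while (hfscr_ebb) {
		exits++;		/* one guest exit with EBB enabled */
		load_ebb++;
		if (!load_ebb)		/* u8 wrapped around to zero */
			hfscr_ebb = 0;	/* would clear HFSCR_EBB here */
	}
	printf("EBB switched off again after %lu exits\n", exits);
	return 0;
}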

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 35/43] KVM: PPC: Book3S HV P9: Demand fault TM facility registers
  2021-06-22 10:57   ` Nicholas Piggin
@ 2021-07-08 17:46     ` Fabiano Rosas
  -1 siblings, 0 replies; 133+ messages in thread
From: Fabiano Rosas @ 2021-07-08 17:46 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Nicholas Piggin <npiggin@gmail.com> writes:

> Use HFSCR facility disabling to implement demand faulting for TM, with
> a hysteresis counter similar to the load_fp etc counters in context
> switching that implement the equivalent demand faulting for userspace
> facilities.
>
> This speeds up guest entry/exit by avoiding the register save/restore
> when a guest is not frequently using them. When a guest does use them
> often, there will be some additional demand fault overhead, but these
> are not commonly used facilities.
>
> -304 cycles (6681) POWER9 virt-mode NULL hcall with the previous patch
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>

> ---
>  arch/powerpc/include/asm/kvm_host.h   |  1 +
>  arch/powerpc/kvm/book3s_hv.c          | 21 +++++++++++++++++----
>  arch/powerpc/kvm/book3s_hv_nested.c   |  2 +-
>  arch/powerpc/kvm/book3s_hv_p9_entry.c | 18 ++++++++++++------
>  4 files changed, 31 insertions(+), 11 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
> index bee95106c1f2..d79f0b1b1578 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -586,6 +586,7 @@ struct kvm_vcpu_arch {
>  	ulong ppr;
>  	u32 pspb;
>  	u8 load_ebb;
> +	u8 load_tm;
>  	ulong fscr;
>  	ulong shadow_fscr;
>  	ulong ebbhr;
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 99e9da078e7d..2430725f29f7 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -1373,6 +1373,13 @@ static int kvmppc_ebb_unavailable(struct kvm_vcpu *vcpu)
>  	return RESUME_GUEST;
>  }
>
> +static int kvmppc_tm_unavailable(struct kvm_vcpu *vcpu)
> +{
> +	vcpu->arch.hfscr |= HFSCR_TM;
> +
> +	return RESUME_GUEST;
> +}
> +
>  static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu,
>  				 struct task_struct *tsk)
>  {
> @@ -1654,6 +1661,8 @@ XXX benchmark guest exits
>  				r = kvmppc_pmu_unavailable(vcpu);
>  			if (cause == FSCR_EBB_LG)
>  				r = kvmppc_ebb_unavailable(vcpu);
> +			if (cause == FSCR_TM_LG)
> +				r = kvmppc_tm_unavailable(vcpu);
>  		}
>  		if (r == EMULATE_FAIL) {
>  			kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
> @@ -1775,6 +1784,8 @@ static int kvmppc_handle_nested_exit(struct kvm_vcpu *vcpu)
>  			r = kvmppc_pmu_unavailable(vcpu);
>  		if (cause == FSCR_EBB_LG && (vcpu->arch.nested_hfscr & HFSCR_EBB))
>  			r = kvmppc_ebb_unavailable(vcpu);
> +		if (cause == FSCR_TM_LG && (vcpu->arch.nested_hfscr & HFSCR_TM))
> +			r = kvmppc_tm_unavailable(vcpu);
>
>  		if (r == EMULATE_FAIL)
>  			r = RESUME_HOST;
> @@ -3737,8 +3748,9 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
>  		msr |= MSR_VEC;
>  	if (cpu_has_feature(CPU_FTR_VSX))
>  		msr |= MSR_VSX;
> -	if (cpu_has_feature(CPU_FTR_TM) ||
> -	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
> +	if ((cpu_has_feature(CPU_FTR_TM) ||
> +	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
> +			(vcpu->arch.hfscr & HFSCR_TM))
>  		msr |= MSR_TM;
>  	msr = msr_check_and_set(msr);
>
> @@ -4453,8 +4465,9 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
>  		msr |= MSR_VEC;
>  	if (cpu_has_feature(CPU_FTR_VSX))
>  		msr |= MSR_VSX;
> -	if (cpu_has_feature(CPU_FTR_TM) ||
> -	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
> +	if ((cpu_has_feature(CPU_FTR_TM) ||
> +	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
> +			(vcpu->arch.hfscr & HFSCR_TM))
>  		msr |= MSR_TM;
>  	msr = msr_check_and_set(msr);
>
> diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
> index ee8668f056f9..5a534f7924f2 100644
> --- a/arch/powerpc/kvm/book3s_hv_nested.c
> +++ b/arch/powerpc/kvm/book3s_hv_nested.c
> @@ -168,7 +168,7 @@ static void sanitise_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr)
>  	 * but preserve the interrupt cause field and facilities that might
>  	 * be disabled for demand faulting in the L1.
>  	 */
> -	hr->hfscr &= (HFSCR_INTR_CAUSE | HFSCR_PM | HFSCR_EBB |
> +	hr->hfscr &= (HFSCR_INTR_CAUSE | HFSCR_PM | HFSCR_TM | HFSCR_EBB |
>  			vcpu->arch.hfscr);
>
>  	/* Don't let data address watchpoint match in hypervisor state */
> diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
> index cf41261daa97..653f2765a399 100644
> --- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
> +++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
> @@ -284,8 +284,9 @@ static void store_spr_state(struct kvm_vcpu *vcpu)
>  void load_vcpu_state(struct kvm_vcpu *vcpu,
>  			   struct p9_host_os_sprs *host_os_sprs)
>  {
> -	if (cpu_has_feature(CPU_FTR_TM) ||
> -	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) {
> +	if ((cpu_has_feature(CPU_FTR_TM) ||
> +	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
> +		       (vcpu->arch.hfscr & HFSCR_TM)) {
>  		unsigned long msr = vcpu->arch.shregs.msr;
>  		if (MSR_TM_ACTIVE(msr)) {
>  			kvmppc_restore_tm_hv(vcpu, msr, true);
> @@ -316,8 +317,9 @@ void store_vcpu_state(struct kvm_vcpu *vcpu)
>  #endif
>  	vcpu->arch.vrsave = mfspr(SPRN_VRSAVE);
>
> -	if (cpu_has_feature(CPU_FTR_TM) ||
> -	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) {
> +	if ((cpu_has_feature(CPU_FTR_TM) ||
> +	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
> +		       (vcpu->arch.hfscr & HFSCR_TM)) {
>  		unsigned long msr = vcpu->arch.shregs.msr;
>  		if (MSR_TM_ACTIVE(msr)) {
>  			kvmppc_save_tm_hv(vcpu, msr, true);
> @@ -326,6 +328,9 @@ void store_vcpu_state(struct kvm_vcpu *vcpu)
>  			vcpu->arch.tfhar = mfspr(SPRN_TFHAR);
>  			vcpu->arch.tfiar = mfspr(SPRN_TFIAR);
>  		}
> +		vcpu->arch.load_tm++; /* see load_ebb comment for details */
> +		if (!vcpu->arch.load_tm)
> +			vcpu->arch.hfscr &= ~HFSCR_TM;
>  	}
>  }
>  EXPORT_SYMBOL_GPL(store_vcpu_state);
> @@ -615,8 +620,9 @@ int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpc
>  		msr |= MSR_VEC;
>  	if (cpu_has_feature(CPU_FTR_VSX))
>  		msr |= MSR_VSX;
> -	if (cpu_has_feature(CPU_FTR_TM) ||
> -	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
> +	if ((cpu_has_feature(CPU_FTR_TM) ||
> +	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
> +			(vcpu->arch.hfscr & HFSCR_TM))
>  		msr |= MSR_TM;
>  	msr = msr_check_and_set(msr);
>  	/* Save MSR for restore. This is after hard disable, so EE is clear. */
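A small reading aid for the diff above, not part of the patch itself (the identifiers are the ones the patch uses, so this snippet only makes sense in that kernel context): the compound condition repeated at each site that touches TM state amounts to

static bool vcpu_tm_facility_enabled(struct kvm_vcpu *vcpu)
{
	/* TM hardware (or the P9 TM HV-assist mode) is present... */
	return (cpu_has_feature(CPU_FTR_TM) ||
		cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) &&
	       /* ...and the guest still has the TM facility granted */
	       (vcpu->arch.hfscr & HFSCR_TM);
}

When this is false, MSR_TM is left out of the msr_check_and_set() mask and the TM SPRs and checkpointed state are neither loaded nor stored; the guest's next TM use then takes a facility-unavailable interrupt with cause FSCR_TM_LG and kvmppc_tm_unavailable() simply grants HFSCR_TM again, as in the book3s_hv.c hunks above.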

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 12/43] KVM: PPC: Book3S HV P9: Factor out yield_count increment
  2021-06-22 10:57   ` Nicholas Piggin
@ 2021-07-08 17:56     ` Fabiano Rosas
  -1 siblings, 0 replies; 133+ messages in thread
From: Fabiano Rosas @ 2021-07-08 17:56 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin

Nicholas Piggin <npiggin@gmail.com> writes:

> Factor duplicated code into a helper function.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>

> ---
>  arch/powerpc/kvm/book3s_hv.c | 24 ++++++++++++------------
>  1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index b1b94b3563b7..38d8afa16839 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -3896,6 +3896,16 @@ static inline bool hcall_is_xics(unsigned long req)
>  		req == H_IPOLL || req == H_XIRR || req == H_XIRR_X;
>  }
>
> +static void vcpu_vpa_increment_dispatch(struct kvm_vcpu *vcpu)
> +{
> +	struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
> +	if (lp) {
> +		u32 yield_count = be32_to_cpu(lp->yield_count) + 1;
> +		lp->yield_count = cpu_to_be32(yield_count);
> +		vcpu->arch.vpa.dirty = 1;
> +	}
> +}
> +
>  /*
>   * Guest entry for POWER9 and later CPUs.
>   */
> @@ -3926,12 +3936,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
>  	vc->entry_exit_map = 1;
>  	vc->in_guest = 1;
>
> -	if (vcpu->arch.vpa.pinned_addr) {
> -		struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
> -		u32 yield_count = be32_to_cpu(lp->yield_count) + 1;
> -		lp->yield_count = cpu_to_be32(yield_count);
> -		vcpu->arch.vpa.dirty = 1;
> -	}
> +	vcpu_vpa_increment_dispatch(vcpu);
>
>  	if (cpu_has_feature(CPU_FTR_TM) ||
>  	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
> @@ -4069,12 +4074,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
>  	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
>  		kvmppc_save_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
>
> -	if (vcpu->arch.vpa.pinned_addr) {
> -		struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
> -		u32 yield_count = be32_to_cpu(lp->yield_count) + 1;
> -		lp->yield_count = cpu_to_be32(yield_count);
> -		vcpu->arch.vpa.dirty = 1;
> -	}
> +	vcpu_vpa_increment_dispatch(vcpu);
>
>  	save_p9_guest_pmu(vcpu);
>  #ifdef CONFIG_PPC_PSERIES
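One aside that may help readers new to this structure: yield_count lives in the lppaca (the VPA area shared with the hypervisor) and is stored big-endian, hence the be32_to_cpu()/cpu_to_be32() round trip in the new helper rather than a plain increment. A standalone illustration of just that byte handling, using htonl()/ntohl() as the userspace equivalents of cpu_to_be32()/be32_to_cpu():

#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Models only the byte handling of vcpu_vpa_increment_dispatch(). */
static uint32_t increment_be32(uint32_t be_count)
{
	uint32_t native = ntohl(be_count);	/* be32_to_cpu() */

	return htonl(native + 1);		/* cpu_to_be32() */
}

int main(void)
{
	assert(increment_be32(htonl(41)) == htonl(42));
	return 0;
}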

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 11/43] KVM: PPC: Book3S HV P9: Implement PMU save/restore in C
  2021-06-22 10:57   ` Nicholas Piggin
@ 2021-07-10  2:59     ` Athira Rajeev
  -1 siblings, 0 replies; 133+ messages in thread
From: Athira Rajeev @ 2021-07-10  2:47 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linuxppc-dev, kvm-ppc



> On 22-Jun-2021, at 4:27 PM, Nicholas Piggin <npiggin@gmail.com> wrote:
> 
> Implement the P9 path PMU save/restore code in C, and remove the
> POWER9/10 code from the P7/8 path assembly.
> 
> -449 cycles (8533) POWER9 virt-mode NULL hcall
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/include/asm/asm-prototypes.h |   5 -
> arch/powerpc/kvm/book3s_hv.c              | 205 ++++++++++++++++++++--
> arch/powerpc/kvm/book3s_hv_interrupts.S   |  13 +-
> arch/powerpc/kvm/book3s_hv_rmhandlers.S   |  43 +----
> 4 files changed, 200 insertions(+), 66 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h
> index 02ee6f5ac9fe..928db8ef9a5a 100644
> --- a/arch/powerpc/include/asm/asm-prototypes.h
> +++ b/arch/powerpc/include/asm/asm-prototypes.h
> @@ -136,11 +136,6 @@ static inline void kvmppc_restore_tm_hv(struct kvm_vcpu *vcpu, u64 msr,
> 					bool preserve_nv) { }
> #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
> 
> -void kvmhv_save_host_pmu(void);
> -void kvmhv_load_host_pmu(void);
> -void kvmhv_save_guest_pmu(struct kvm_vcpu *vcpu, bool pmu_in_use);
> -void kvmhv_load_guest_pmu(struct kvm_vcpu *vcpu);
> -
> void kvmppc_p9_enter_guest(struct kvm_vcpu *vcpu);
> 
> long kvmppc_h_set_dabr(struct kvm_vcpu *vcpu, unsigned long dabr);
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index f7349d150828..b1b94b3563b7 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -3635,6 +3635,188 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
> 	trace_kvmppc_run_core(vc, 1);
> }
> 
> +/*
> + * Privileged (non-hypervisor) host registers to save.
> + */
> +struct p9_host_os_sprs {
> +	unsigned long dscr;
> +	unsigned long tidr;
> +	unsigned long iamr;
> +	unsigned long amr;
> +	unsigned long fscr;
> +
> +	unsigned int pmc1;
> +	unsigned int pmc2;
> +	unsigned int pmc3;
> +	unsigned int pmc4;
> +	unsigned int pmc5;
> +	unsigned int pmc6;
> +	unsigned long mmcr0;
> +	unsigned long mmcr1;
> +	unsigned long mmcr2;
> +	unsigned long mmcr3;
> +	unsigned long mmcra;
> +	unsigned long siar;
> +	unsigned long sier1;
> +	unsigned long sier2;
> +	unsigned long sier3;
> +	unsigned long sdar;
> +};
> +
> +static void freeze_pmu(unsigned long mmcr0, unsigned long mmcra)
> +{
> +	if (!(mmcr0 & MMCR0_FC))
> +		goto do_freeze;
> +	if (mmcra & MMCRA_SAMPLE_ENABLE)
> +		goto do_freeze;
> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +		if (!(mmcr0 & MMCR0_PMCCEXT))
> +			goto do_freeze;
> +		if (!(mmcra & MMCRA_BHRB_DISABLE))
> +			goto do_freeze;
> +	}
> +	return;


Hi Nick

When freezing the PMU, do we need to also set pmcregs_in_use to zero?

Also, why do we need the checks above (MMCRA_SAMPLE_ENABLE, MMCR0_PMCCEXT) before freezing?
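A possible reading of those checks, offered purely as a restatement of the quoted code and not as the patch author's answer: the early return fires only when the registers are already in exactly the state the do_freeze path below would write, so the mtspr/isync sequence can be skipped. In the kernel context of the patch that is equivalent to a hypothetical helper like:

/* Hypothetical helper restating freeze_pmu()'s early-return test. */
static bool pmu_already_frozen(unsigned long mmcr0, unsigned long mmcra)
{
	if (!(mmcr0 & MMCR0_FC))		/* counters not frozen yet */
		return false;
	if (mmcra & MMCRA_SAMPLE_ENABLE)	/* sampling still enabled */
		return false;
	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
		if (!(mmcr0 & MMCR0_PMCCEXT))	/* PMCC extension not yet set */
			return false;
		if (!(mmcra & MMCRA_BHRB_DISABLE))	/* BHRB not yet disabled */
			return false;
	}
	return true;	/* already in the do_freeze state: skip the writes */
}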

> +
> +do_freeze:
> +	mmcr0 = MMCR0_FC;
> +	mmcra = 0;
> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +		mmcr0 |= MMCR0_PMCCEXT;
> +		mmcra = MMCRA_BHRB_DISABLE;
> +	}
> +
> +	mtspr(SPRN_MMCR0, mmcr0);
> +	mtspr(SPRN_MMCRA, mmcra);
> +	isync();
> +}
> +
> +static void save_p9_host_pmu(struct p9_host_os_sprs *host_os_sprs)
> +{
> +	if (ppc_get_pmu_inuse()) {
> +		/*
> +		 * It might be better to put PMU handling (at least for the
> +		 * host) in the perf subsystem because it knows more about what
> +		 * is being used.
> +		 */
> +
> +		/* POWER9, POWER10 do not implement HPMC or SPMC */
> +
> +		host_os_sprs->mmcr0 = mfspr(SPRN_MMCR0);
> +		host_os_sprs->mmcra = mfspr(SPRN_MMCRA);
> +
> +		freeze_pmu(host_os_sprs->mmcr0, host_os_sprs->mmcra);
> +
> +		host_os_sprs->pmc1 = mfspr(SPRN_PMC1);
> +		host_os_sprs->pmc2 = mfspr(SPRN_PMC2);
> +		host_os_sprs->pmc3 = mfspr(SPRN_PMC3);
> +		host_os_sprs->pmc4 = mfspr(SPRN_PMC4);
> +		host_os_sprs->pmc5 = mfspr(SPRN_PMC5);
> +		host_os_sprs->pmc6 = mfspr(SPRN_PMC6);
> +		host_os_sprs->mmcr1 = mfspr(SPRN_MMCR1);
> +		host_os_sprs->mmcr2 = mfspr(SPRN_MMCR2);
> +		host_os_sprs->sdar = mfspr(SPRN_SDAR);
> +		host_os_sprs->siar = mfspr(SPRN_SIAR);
> +		host_os_sprs->sier1 = mfspr(SPRN_SIER);
> +
> +		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +			host_os_sprs->mmcr3 = mfspr(SPRN_MMCR3);
> +			host_os_sprs->sier2 = mfspr(SPRN_SIER2);
> +			host_os_sprs->sier3 = mfspr(SPRN_SIER3);
> +		}
> +	}
> +}
> +
> +static void load_p9_guest_pmu(struct kvm_vcpu *vcpu)
> +{
> +	mtspr(SPRN_PMC1, vcpu->arch.pmc[0]);
> +	mtspr(SPRN_PMC2, vcpu->arch.pmc[1]);
> +	mtspr(SPRN_PMC3, vcpu->arch.pmc[2]);
> +	mtspr(SPRN_PMC4, vcpu->arch.pmc[3]);
> +	mtspr(SPRN_PMC5, vcpu->arch.pmc[4]);
> +	mtspr(SPRN_PMC6, vcpu->arch.pmc[5]);
> +	mtspr(SPRN_MMCR1, vcpu->arch.mmcr[1]);
> +	mtspr(SPRN_MMCR2, vcpu->arch.mmcr[2]);
> +	mtspr(SPRN_SDAR, vcpu->arch.sdar);
> +	mtspr(SPRN_SIAR, vcpu->arch.siar);
> +	mtspr(SPRN_SIER, vcpu->arch.sier[0]);
> +
> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +		mtspr(SPRN_MMCR3, vcpu->arch.mmcr[4]);
> +		mtspr(SPRN_SIER2, vcpu->arch.sier[1]);
> +		mtspr(SPRN_SIER3, vcpu->arch.sier[2]);
> +	}
> +
> +	/* Set MMCRA then MMCR0 last */
> +	mtspr(SPRN_MMCRA, vcpu->arch.mmcra);
> +	mtspr(SPRN_MMCR0, vcpu->arch.mmcr[0]);
> +	/* No isync necessary because we're starting counters */
> +}
> +
> +static void save_p9_guest_pmu(struct kvm_vcpu *vcpu)
> +{
> +	struct lppaca *lp;
> +	int save_pmu = 1;
> +
> +	lp = vcpu->arch.vpa.pinned_addr;
> +	if (lp)
> +		save_pmu = lp->pmcregs_in_use;
> +
> +	if (save_pmu) {
> +		vcpu->arch.mmcr[0] = mfspr(SPRN_MMCR0);
> +		vcpu->arch.mmcra = mfspr(SPRN_MMCRA);
> +
> +		freeze_pmu(vcpu->arch.mmcr[0], vcpu->arch.mmcra);
> +
> +		vcpu->arch.pmc[0] = mfspr(SPRN_PMC1);
> +		vcpu->arch.pmc[1] = mfspr(SPRN_PMC2);
> +		vcpu->arch.pmc[2] = mfspr(SPRN_PMC3);
> +		vcpu->arch.pmc[3] = mfspr(SPRN_PMC4);
> +		vcpu->arch.pmc[4] = mfspr(SPRN_PMC5);
> +		vcpu->arch.pmc[5] = mfspr(SPRN_PMC6);
> +		vcpu->arch.mmcr[1] = mfspr(SPRN_MMCR1);
> +		vcpu->arch.mmcr[2] = mfspr(SPRN_MMCR2);
> +		vcpu->arch.sdar = mfspr(SPRN_SDAR);
> +		vcpu->arch.siar = mfspr(SPRN_SIAR);
> +		vcpu->arch.sier[0] = mfspr(SPRN_SIER);
> +
> +		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +			vcpu->arch.mmcr[3] = mfspr(SPRN_MMCR3);
> +			vcpu->arch.sier[1] = mfspr(SPRN_SIER2);
> +			vcpu->arch.sier[2] = mfspr(SPRN_SIER3);
> +		}
> +	} else {
> +		freeze_pmu(mfspr(SPRN_MMCR0), mfspr(SPRN_MMCRA));
> +	}
> +}
> +
> +static void load_p9_host_pmu(struct p9_host_os_sprs *host_os_sprs)
> +{
> +	if (ppc_get_pmu_inuse()) {
> +		mtspr(SPRN_PMC1, host_os_sprs->pmc1);
> +		mtspr(SPRN_PMC2, host_os_sprs->pmc2);
> +		mtspr(SPRN_PMC3, host_os_sprs->pmc3);
> +		mtspr(SPRN_PMC4, host_os_sprs->pmc4);
> +		mtspr(SPRN_PMC5, host_os_sprs->pmc5);
> +		mtspr(SPRN_PMC6, host_os_sprs->pmc6);
> +		mtspr(SPRN_MMCR1, host_os_sprs->mmcr1);
> +		mtspr(SPRN_MMCR2, host_os_sprs->mmcr2);
> +		mtspr(SPRN_SDAR, host_os_sprs->sdar);
> +		mtspr(SPRN_SIAR, host_os_sprs->siar);
> +		mtspr(SPRN_SIER, host_os_sprs->sier1);
> +
> +		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +			mtspr(SPRN_MMCR3, host_os_sprs->mmcr3);
> +			mtspr(SPRN_SIER2, host_os_sprs->sier2);
> +			mtspr(SPRN_SIER3, host_os_sprs->sier3);
> +		}
> +
> +		/* Set MMCRA then MMCR0 last */
> +		mtspr(SPRN_MMCRA, host_os_sprs->mmcra);
> +		mtspr(SPRN_MMCR0, host_os_sprs->mmcr0);
> +		isync();
> +	}
> +}
> +
> static void load_spr_state(struct kvm_vcpu *vcpu)
> {
> 	mtspr(SPRN_DSCR, vcpu->arch.dscr);
> @@ -3677,17 +3859,6 @@ static void store_spr_state(struct kvm_vcpu *vcpu)
> 	vcpu->arch.dscr = mfspr(SPRN_DSCR);
> }
> 
> -/*
> - * Privileged (non-hypervisor) host registers to save.
> - */
> -struct p9_host_os_sprs {
> -	unsigned long dscr;
> -	unsigned long tidr;
> -	unsigned long iamr;
> -	unsigned long amr;
> -	unsigned long fscr;
> -};
> -
> static void save_p9_host_os_sprs(struct p9_host_os_sprs *host_os_sprs)
> {
> 	host_os_sprs->dscr = mfspr(SPRN_DSCR);
> @@ -3735,7 +3906,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
> 	struct p9_host_os_sprs host_os_sprs;
> 	s64 dec;
> 	u64 tb, next_timer;
> -	int trap, save_pmu;
> +	int trap;
> 
> 	WARN_ON_ONCE(vcpu->arch.ceded);
> 
> @@ -3748,7 +3919,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
> 
> 	save_p9_host_os_sprs(&host_os_sprs);
> 
> -	kvmhv_save_host_pmu();		/* saves it to PACA kvm_hstate */
> +	save_p9_host_pmu(&host_os_sprs);
> 
> 	kvmppc_subcore_enter_guest();
> 
> @@ -3776,7 +3947,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
> 		}
> 	}
> #endif
> -	kvmhv_load_guest_pmu(vcpu);
> +	load_p9_guest_pmu(vcpu);
> 
> 	msr_check_and_set(MSR_FP | MSR_VEC | MSR_VSX);
> 	load_fp_state(&vcpu->arch.fp);
> @@ -3898,16 +4069,14 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
> 	    cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
> 		kvmppc_save_tm_hv(vcpu, vcpu->arch.shregs.msr, true);
> 
> -	save_pmu = 1;
> 	if (vcpu->arch.vpa.pinned_addr) {
> 		struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
> 		u32 yield_count = be32_to_cpu(lp->yield_count) + 1;
> 		lp->yield_count = cpu_to_be32(yield_count);
> 		vcpu->arch.vpa.dirty = 1;
> -		save_pmu = lp->pmcregs_in_use;
> 	}
> 
> -	kvmhv_save_guest_pmu(vcpu, save_pmu);
> +	save_p9_guest_pmu(vcpu);
> #ifdef CONFIG_PPC_PSERIES
> 	if (kvmhv_on_pseries())
> 		get_lppaca()->pmcregs_in_use = ppc_get_pmu_inuse();
> @@ -3920,7 +4089,7 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
> 
> 	mtspr(SPRN_SPRG_VDSO_WRITE, local_paca->sprg_vdso);
> 
> -	kvmhv_load_host_pmu();
> +	load_p9_host_pmu(&host_os_sprs);
> 
> 	kvmppc_subcore_exit_guest();
> 
> diff --git a/arch/powerpc/kvm/book3s_hv_interrupts.S b/arch/powerpc/kvm/book3s_hv_interrupts.S
> index 4444f83cb133..59d89e4b154a 100644
> --- a/arch/powerpc/kvm/book3s_hv_interrupts.S
> +++ b/arch/powerpc/kvm/book3s_hv_interrupts.S
> @@ -104,7 +104,10 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
> 	mtlr	r0
> 	blr
> 
> -_GLOBAL(kvmhv_save_host_pmu)
> +/*
> + * void kvmhv_save_host_pmu(void)
> + */
> +kvmhv_save_host_pmu:
> BEGIN_FTR_SECTION
> 	/* Work around P8 PMAE bug */
> 	li	r3, -1
> @@ -138,14 +141,6 @@ BEGIN_FTR_SECTION
> 	std	r8, HSTATE_MMCR2(r13)
> 	std	r9, HSTATE_SIER(r13)
> END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
> -BEGIN_FTR_SECTION
> -	mfspr	r5, SPRN_MMCR3
> -	mfspr	r6, SPRN_SIER2
> -	mfspr	r7, SPRN_SIER3
> -	std	r5, HSTATE_MMCR3(r13)
> -	std	r6, HSTATE_SIER2(r13)
> -	std	r7, HSTATE_SIER3(r13)
> -END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
> 	mfspr	r3, SPRN_PMC1
> 	mfspr	r5, SPRN_PMC2
> 	mfspr	r6, SPRN_PMC3
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index 007f87b97184..0eb06734bc26 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -2780,10 +2780,11 @@ kvmppc_msr_interrupt:
> 	blr
> 
> /*
> + * void kvmhv_load_guest_pmu(struct kvm_vcpu *vcpu)
> + *
>  * Load up guest PMU state.  R3 points to the vcpu struct.
>  */
> -_GLOBAL(kvmhv_load_guest_pmu)
> -EXPORT_SYMBOL_GPL(kvmhv_load_guest_pmu)
> +kvmhv_load_guest_pmu:
> 	mr	r4, r3
> 	mflr	r0
> 	li	r3, 1
> @@ -2817,27 +2818,17 @@ END_FTR_SECTION_IFSET(CPU_FTR_PMAO_BUG)
> 	mtspr	SPRN_MMCRA, r6
> 	mtspr	SPRN_SIAR, r7
> 	mtspr	SPRN_SDAR, r8
> -BEGIN_FTR_SECTION
> -	ld      r5, VCPU_MMCR + 24(r4)
> -	ld      r6, VCPU_SIER + 8(r4)
> -	ld      r7, VCPU_SIER + 16(r4)
> -	mtspr   SPRN_MMCR3, r5
> -	mtspr   SPRN_SIER2, r6
> -	mtspr   SPRN_SIER3, r7
> -END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
> BEGIN_FTR_SECTION
> 	ld	r5, VCPU_MMCR + 16(r4)
> 	ld	r6, VCPU_SIER(r4)
> 	mtspr	SPRN_MMCR2, r5
> 	mtspr	SPRN_SIER, r6
> -BEGIN_FTR_SECTION_NESTED(96)
> 	lwz	r7, VCPU_PMC + 24(r4)
> 	lwz	r8, VCPU_PMC + 28(r4)
> 	ld	r9, VCPU_MMCRS(r4)
> 	mtspr	SPRN_SPMC1, r7
> 	mtspr	SPRN_SPMC2, r8
> 	mtspr	SPRN_MMCRS, r9
> -END_FTR_SECTION_NESTED(CPU_FTR_ARCH_300, 0, 96)
> END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
> 	mtspr	SPRN_MMCR0, r3
> 	isync
> @@ -2845,10 +2836,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
> 	blr
> 
> /*
> + * void kvmhv_load_host_pmu(void)
> + *
>  * Reload host PMU state saved in the PACA by kvmhv_save_host_pmu.
>  */
> -_GLOBAL(kvmhv_load_host_pmu)
> -EXPORT_SYMBOL_GPL(kvmhv_load_host_pmu)
> +kvmhv_load_host_pmu:
> 	mflr	r0
> 	lbz	r4, PACA_PMCINUSE(r13) /* is the host using the PMU? */
> 	cmpwi	r4, 0
> @@ -2886,25 +2878,18 @@ BEGIN_FTR_SECTION
> 	mtspr	SPRN_MMCR2, r8
> 	mtspr	SPRN_SIER, r9
> END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
> -BEGIN_FTR_SECTION
> -	ld      r5, HSTATE_MMCR3(r13)
> -	ld      r6, HSTATE_SIER2(r13)
> -	ld      r7, HSTATE_SIER3(r13)
> -	mtspr   SPRN_MMCR3, r5
> -	mtspr   SPRN_SIER2, r6
> -	mtspr   SPRN_SIER3, r7
> -END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
> 	mtspr	SPRN_MMCR0, r3
> 	isync
> 	mtlr	r0
> 23:	blr
> 
> /*
> + * void kvmhv_save_guest_pmu(struct kvm_vcpu *vcpu, bool pmu_in_use)
> + *
>  * Save guest PMU state into the vcpu struct.
>  * r3 = vcpu, r4 = full save flag (PMU in use flag set in VPA)
>  */
> -_GLOBAL(kvmhv_save_guest_pmu)
> -EXPORT_SYMBOL_GPL(kvmhv_save_guest_pmu)
> +kvmhv_save_guest_pmu:
> 	mr	r9, r3
> 	mr	r8, r4
> BEGIN_FTR_SECTION
> @@ -2953,14 +2938,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
> BEGIN_FTR_SECTION
> 	std	r10, VCPU_MMCR + 16(r9)
> END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
> -BEGIN_FTR_SECTION
> -	mfspr   r5, SPRN_MMCR3
> -	mfspr   r6, SPRN_SIER2
> -	mfspr   r7, SPRN_SIER3
> -	std     r5, VCPU_MMCR + 24(r9)
> -	std     r6, VCPU_SIER + 8(r9)
> -	std     r7, VCPU_SIER + 16(r9)
> -END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
> 	std	r7, VCPU_SIAR(r9)
> 	std	r8, VCPU_SDAR(r9)
> 	mfspr	r3, SPRN_PMC1
> @@ -2978,7 +2955,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
> BEGIN_FTR_SECTION
> 	mfspr	r5, SPRN_SIER
> 	std	r5, VCPU_SIER(r9)
> -BEGIN_FTR_SECTION_NESTED(96)
> 	mfspr	r6, SPRN_SPMC1
> 	mfspr	r7, SPRN_SPMC2
> 	mfspr	r8, SPRN_MMCRS
> @@ -2987,7 +2963,6 @@ BEGIN_FTR_SECTION_NESTED(96)
> 	std	r8, VCPU_MMCRS(r9)
> 	lis	r4, 0x8000
> 	mtspr	SPRN_MMCRS, r4
> -END_FTR_SECTION_NESTED(CPU_FTR_ARCH_300, 0, 96)
> END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
> 22:	blr
> 
> -- 
> 2.23.0
> 
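For readers who do not want to walk the whole diff quoted above, a sketch of the order in which the new C helpers are called around guest entry. Illustrative only; it would only compile in the kernel context of the patch, and everything other than the PMU switching is elided:

/*
 * Order-of-operations sketch only; the real sequence lives in
 * kvmhv_p9_guest_entry() in the diff above.
 */
static void pmu_switch_order_sketch(struct kvm_vcpu *vcpu,
				    struct p9_host_os_sprs *host_os_sprs)
{
	save_p9_host_pmu(host_os_sprs);	/* freeze if needed, then stash host PMU */
	load_p9_guest_pmu(vcpu);	/* guest PMCs/MMCRs; MMCRA then MMCR0 last */

	/* ...guest entry and exit happen here in the real function... */

	save_p9_guest_pmu(vcpu);	/* freeze-only if the VPA says PMU unused */
	load_p9_host_pmu(host_os_sprs);	/* restores only if ppc_get_pmu_inuse() */
}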


^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use
  2021-06-22 10:57   ` Nicholas Piggin
@ 2021-07-10  2:50     ` Athira Rajeev
  -1 siblings, 0 replies; 133+ messages in thread
From: Athira Rajeev @ 2021-07-10  2:50 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linuxppc-dev, kvm-ppc



> On 22-Jun-2021, at 4:27 PM, Nicholas Piggin <npiggin@gmail.com> wrote:
> 
> KVM PMU management code looks for particular frozen/disabled bits in
> the PMU registers so it knows whether it must clear them when coming
> out of a guest or not. Setting this up helps KVM make these optimisations
> without getting confused. Longer term the better approach might be to
> move guest/host PMU switching to the perf subsystem.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/kernel/cpu_setup_power.c | 4 ++--
> arch/powerpc/kernel/dt_cpu_ftrs.c     | 6 +++---
> arch/powerpc/kvm/book3s_hv.c          | 5 +++++
> arch/powerpc/perf/core-book3s.c       | 7 +++++++
> 4 files changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/cpu_setup_power.c b/arch/powerpc/kernel/cpu_setup_power.c
> index a29dc8326622..3dc61e203f37 100644
> --- a/arch/powerpc/kernel/cpu_setup_power.c
> +++ b/arch/powerpc/kernel/cpu_setup_power.c
> @@ -109,7 +109,7 @@ static void init_PMU_HV_ISA207(void)
> static void init_PMU(void)
> {
> 	mtspr(SPRN_MMCRA, 0);
> -	mtspr(SPRN_MMCR0, 0);
> +	mtspr(SPRN_MMCR0, MMCR0_FC);
> 	mtspr(SPRN_MMCR1, 0);
> 	mtspr(SPRN_MMCR2, 0);
> }
> @@ -123,7 +123,7 @@ static void init_PMU_ISA31(void)
> {
> 	mtspr(SPRN_MMCR3, 0);
> 	mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
> -	mtspr(SPRN_MMCR0, MMCR0_PMCCEXT);
> +	mtspr(SPRN_MMCR0, MMCR0_FC | MMCR0_PMCCEXT);
> }
> 
> /*
> diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
> index 0a6b36b4bda8..06a089fbeaa7 100644
> --- a/arch/powerpc/kernel/dt_cpu_ftrs.c
> +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
> @@ -353,7 +353,7 @@ static void init_pmu_power8(void)
> 	}
> 
> 	mtspr(SPRN_MMCRA, 0);
> -	mtspr(SPRN_MMCR0, 0);
> +	mtspr(SPRN_MMCR0, MMCR0_FC);
> 	mtspr(SPRN_MMCR1, 0);
> 	mtspr(SPRN_MMCR2, 0);
> 	mtspr(SPRN_MMCRS, 0);
> @@ -392,7 +392,7 @@ static void init_pmu_power9(void)
> 		mtspr(SPRN_MMCRC, 0);
> 
> 	mtspr(SPRN_MMCRA, 0);
> -	mtspr(SPRN_MMCR0, 0);
> +	mtspr(SPRN_MMCR0, MMCR0_FC);
> 	mtspr(SPRN_MMCR1, 0);
> 	mtspr(SPRN_MMCR2, 0);
> }
> @@ -428,7 +428,7 @@ static void init_pmu_power10(void)
> 
> 	mtspr(SPRN_MMCR3, 0);
> 	mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
> -	mtspr(SPRN_MMCR0, MMCR0_PMCCEXT);
> +	mtspr(SPRN_MMCR0, MMCR0_FC | MMCR0_PMCCEXT);
> }
> 
> static int __init feat_enable_pmu_power10(struct dt_cpu_feature *f)
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 1f30f98b09d1..f7349d150828 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -2593,6 +2593,11 @@ static int kvmppc_core_vcpu_create_hv(struct kvm_vcpu *vcpu)
> #endif
> #endif
> 	vcpu->arch.mmcr[0] = MMCR0_FC;
> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +		vcpu->arch.mmcr[0] |= MMCR0_PMCCEXT;
> +		vcpu->arch.mmcra = MMCRA_BHRB_DISABLE;
> +	}
> +
> 	vcpu->arch.ctrl = CTRL_RUNLATCH;
> 	/* default to host PVR, since we can't spoof it */
> 	kvmppc_set_pvr_hv(vcpu, mfspr(SPRN_PVR));
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 51622411a7cc..e33b29ec1a65 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -1361,6 +1361,13 @@ static void power_pmu_enable(struct pmu *pmu)
> 		goto out;
> 
> 	if (cpuhw->n_events == 0) {
> +		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +			mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
> +			mtspr(SPRN_MMCR0, MMCR0_FC | MMCR0_PMCCEXT);
> +		} else {
> +			mtspr(SPRN_MMCRA, 0);
> +			mtspr(SPRN_MMCR0, MMCR0_FC);
> +		}


Hi Nick,

We already set these bits in the “power_pmu_disable” function, and disable is called before any event gets deleted/stopped. Can you please help me understand why this is also needed in the power_pmu_enable path?

Thanks
Athira

> 		ppc_set_pmu_inuse(0);
> 		goto out;
> 	}
> -- 
> 2.23.0
> 


^ permalink raw reply	[flat|nested] 133+ messages in thread


* Re: [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use
  2021-07-10  2:50     ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in u Athira Rajeev
@ 2021-07-12  2:41       ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-07-12  2:41 UTC (permalink / raw)
  To: Athira Rajeev; +Cc: linuxppc-dev, kvm-ppc

Excerpts from Athira Rajeev's message of July 10, 2021 12:50 pm:
> 
> 
>> On 22-Jun-2021, at 4:27 PM, Nicholas Piggin <npiggin@gmail.com> wrote:
>> 
>> KVM PMU management code looks for particular frozen/disabled bits in
>> the PMU registers so it knows whether it must clear them when coming
>> out of a guest or not. Setting this up helps KVM make these optimisations
>> without getting confused. Longer term the better approach might be to
>> move guest/host PMU switching to the perf subsystem.
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>> arch/powerpc/kernel/cpu_setup_power.c | 4 ++--
>> arch/powerpc/kernel/dt_cpu_ftrs.c     | 6 +++---
>> arch/powerpc/kvm/book3s_hv.c          | 5 +++++
>> arch/powerpc/perf/core-book3s.c       | 7 +++++++
>> 4 files changed, 17 insertions(+), 5 deletions(-)
>> 
>> diff --git a/arch/powerpc/kernel/cpu_setup_power.c b/arch/powerpc/kernel/cpu_setup_power.c
>> index a29dc8326622..3dc61e203f37 100644
>> --- a/arch/powerpc/kernel/cpu_setup_power.c
>> +++ b/arch/powerpc/kernel/cpu_setup_power.c
>> @@ -109,7 +109,7 @@ static void init_PMU_HV_ISA207(void)
>> static void init_PMU(void)
>> {
>> 	mtspr(SPRN_MMCRA, 0);
>> -	mtspr(SPRN_MMCR0, 0);
>> +	mtspr(SPRN_MMCR0, MMCR0_FC);
>> 	mtspr(SPRN_MMCR1, 0);
>> 	mtspr(SPRN_MMCR2, 0);
>> }
>> @@ -123,7 +123,7 @@ static void init_PMU_ISA31(void)
>> {
>> 	mtspr(SPRN_MMCR3, 0);
>> 	mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
>> -	mtspr(SPRN_MMCR0, MMCR0_PMCCEXT);
>> +	mtspr(SPRN_MMCR0, MMCR0_FC | MMCR0_PMCCEXT);
>> }
>> 
>> /*
>> diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
>> index 0a6b36b4bda8..06a089fbeaa7 100644
>> --- a/arch/powerpc/kernel/dt_cpu_ftrs.c
>> +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
>> @@ -353,7 +353,7 @@ static void init_pmu_power8(void)
>> 	}
>> 
>> 	mtspr(SPRN_MMCRA, 0);
>> -	mtspr(SPRN_MMCR0, 0);
>> +	mtspr(SPRN_MMCR0, MMCR0_FC);
>> 	mtspr(SPRN_MMCR1, 0);
>> 	mtspr(SPRN_MMCR2, 0);
>> 	mtspr(SPRN_MMCRS, 0);
>> @@ -392,7 +392,7 @@ static void init_pmu_power9(void)
>> 		mtspr(SPRN_MMCRC, 0);
>> 
>> 	mtspr(SPRN_MMCRA, 0);
>> -	mtspr(SPRN_MMCR0, 0);
>> +	mtspr(SPRN_MMCR0, MMCR0_FC);
>> 	mtspr(SPRN_MMCR1, 0);
>> 	mtspr(SPRN_MMCR2, 0);
>> }
>> @@ -428,7 +428,7 @@ static void init_pmu_power10(void)
>> 
>> 	mtspr(SPRN_MMCR3, 0);
>> 	mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
>> -	mtspr(SPRN_MMCR0, MMCR0_PMCCEXT);
>> +	mtspr(SPRN_MMCR0, MMCR0_FC | MMCR0_PMCCEXT);
>> }
>> 
>> static int __init feat_enable_pmu_power10(struct dt_cpu_feature *f)
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 1f30f98b09d1..f7349d150828 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -2593,6 +2593,11 @@ static int kvmppc_core_vcpu_create_hv(struct kvm_vcpu *vcpu)
>> #endif
>> #endif
>> 	vcpu->arch.mmcr[0] = MMCR0_FC;
>> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
>> +		vcpu->arch.mmcr[0] |= MMCR0_PMCCEXT;
>> +		vcpu->arch.mmcra = MMCRA_BHRB_DISABLE;
>> +	}
>> +
>> 	vcpu->arch.ctrl = CTRL_RUNLATCH;
>> 	/* default to host PVR, since we can't spoof it */
>> 	kvmppc_set_pvr_hv(vcpu, mfspr(SPRN_PVR));
>> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
>> index 51622411a7cc..e33b29ec1a65 100644
>> --- a/arch/powerpc/perf/core-book3s.c
>> +++ b/arch/powerpc/perf/core-book3s.c
>> @@ -1361,6 +1361,13 @@ static void power_pmu_enable(struct pmu *pmu)
>> 		goto out;
>> 
>> 	if (cpuhw->n_events == 0) {
>> +		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
>> +			mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
>> +			mtspr(SPRN_MMCR0, MMCR0_FC | MMCR0_PMCCEXT);
>> +		} else {
>> +			mtspr(SPRN_MMCRA, 0);
>> +			mtspr(SPRN_MMCR0, MMCR0_FC);
>> +		}
> 
> 
> Hi Nick,
> 
> We are setting these bits in “power_pmu_disable” function. And disable will be called before any event gets deleted/stopped. Can you please help to understand why this is needed in power_pmu_enable path also ?

I'll have to go back and check what I needed it for.

Basically what I'm trying to achieve is that when the guest and host 
are not running perf, then the registers should match. This way the host 
can test them with mfspr but does not need to fix them with mtspr.
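
(Editor's sketch of that idea, not code from the series: the register
values are the ones freeze_pmu() in patch 11 establishes, but the helper
itself and its call site are hypothetical.)

#include <asm/reg.h>
#include <asm/cputable.h>

/*
 * If host and guest agree on the "PMU not in use" register values, the
 * common case on exit is a cheap mfspr test and no mtspr at all.
 */
static void pmu_fixup_if_needed(void)
{
	unsigned long mmcr0 = MMCR0_FC;
	unsigned long mmcra = 0;

	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
		mmcr0 |= MMCR0_PMCCEXT;
		mmcra = MMCRA_BHRB_DISABLE;
	}

	/* Only write the SPRs back if they are not already in that state. */
	if (mfspr(SPRN_MMCR0) != mmcr0)
		mtspr(SPRN_MMCR0, mmcr0);
	if (mfspr(SPRN_MMCRA) != mmcra)
		mtspr(SPRN_MMCRA, mmcra);
}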

It's not too important for performance, since after TM facility demand
faulting and the nested guest PMU fix you can usually just skip that
entirely, but I think it's cleaner. I'll double check my perf/ changes;
it's possible that part should be split out or is unnecessary.

Thanks,
Nick

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 11/43] KVM: PPC: Book3S HV P9: Implement PMU save/restore in C
  2021-07-10  2:59     ` Athira Rajeev
@ 2021-07-12  2:49       ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-07-12  2:49 UTC (permalink / raw)
  To: Athira Rajeev; +Cc: linuxppc-dev, kvm-ppc

Excerpts from Athira Rajeev's message of July 10, 2021 12:47 pm:
> 
> 
>> On 22-Jun-2021, at 4:27 PM, Nicholas Piggin <npiggin@gmail.com> wrote:
>> 
>> Implement the P9 path PMU save/restore code in C, and remove the
>> POWER9/10 code from the P7/8 path assembly.
>> 
>> -449 cycles (8533) POWER9 virt-mode NULL hcall
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>> arch/powerpc/include/asm/asm-prototypes.h |   5 -
>> arch/powerpc/kvm/book3s_hv.c              | 205 ++++++++++++++++++++--
>> arch/powerpc/kvm/book3s_hv_interrupts.S   |  13 +-
>> arch/powerpc/kvm/book3s_hv_rmhandlers.S   |  43 +----
>> 4 files changed, 200 insertions(+), 66 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h
>> index 02ee6f5ac9fe..928db8ef9a5a 100644
>> --- a/arch/powerpc/include/asm/asm-prototypes.h
>> +++ b/arch/powerpc/include/asm/asm-prototypes.h
>> @@ -136,11 +136,6 @@ static inline void kvmppc_restore_tm_hv(struct kvm_vcpu *vcpu, u64 msr,
>> 					bool preserve_nv) { }
>> #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
>> 
>> -void kvmhv_save_host_pmu(void);
>> -void kvmhv_load_host_pmu(void);
>> -void kvmhv_save_guest_pmu(struct kvm_vcpu *vcpu, bool pmu_in_use);
>> -void kvmhv_load_guest_pmu(struct kvm_vcpu *vcpu);
>> -
>> void kvmppc_p9_enter_guest(struct kvm_vcpu *vcpu);
>> 
>> long kvmppc_h_set_dabr(struct kvm_vcpu *vcpu, unsigned long dabr);
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index f7349d150828..b1b94b3563b7 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -3635,6 +3635,188 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
>> 	trace_kvmppc_run_core(vc, 1);
>> }
>> 
>> +/*
>> + * Privileged (non-hypervisor) host registers to save.
>> + */
>> +struct p9_host_os_sprs {
>> +	unsigned long dscr;
>> +	unsigned long tidr;
>> +	unsigned long iamr;
>> +	unsigned long amr;
>> +	unsigned long fscr;
>> +
>> +	unsigned int pmc1;
>> +	unsigned int pmc2;
>> +	unsigned int pmc3;
>> +	unsigned int pmc4;
>> +	unsigned int pmc5;
>> +	unsigned int pmc6;
>> +	unsigned long mmcr0;
>> +	unsigned long mmcr1;
>> +	unsigned long mmcr2;
>> +	unsigned long mmcr3;
>> +	unsigned long mmcra;
>> +	unsigned long siar;
>> +	unsigned long sier1;
>> +	unsigned long sier2;
>> +	unsigned long sier3;
>> +	unsigned long sdar;
>> +};
>> +
>> +static void freeze_pmu(unsigned long mmcr0, unsigned long mmcra)
>> +{
>> +	if (!(mmcr0 & MMCR0_FC))
>> +		goto do_freeze;
>> +	if (mmcra & MMCRA_SAMPLE_ENABLE)
>> +		goto do_freeze;
>> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
>> +		if (!(mmcr0 & MMCR0_PMCCEXT))
>> +			goto do_freeze;
>> +		if (!(mmcra & MMCRA_BHRB_DISABLE))
>> +			goto do_freeze;
>> +	}
>> +	return;
> 
> 
> Hi Nick
> 
> When freezing the PMU, do we need to also set pmcregs_in_use to zero ?

Not immediately, we still need to save out the values of the PMU 
registers. If we clear pmcregs_in_use, then our hypervisor can discard 
the contents of those registers at any time.
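
(The exit-path ordering from the quoted patch, condensed here for
clarity; the call site is kvmhv_p9_guest_entry() as shown in the hunks
earlier in this thread.)

	/* Save the guest PMU state out of the SPRs first ... */
	save_p9_guest_pmu(vcpu);

	/* ... and only then tell the hypervisor below that the PMC
	 * registers are no longer in use, so it is free to discard them. */
	if (kvmhv_on_pseries())
		get_lppaca()->pmcregs_in_use = ppc_get_pmu_inuse();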

> Also, why we need these above conditions like MMCRA_SAMPLE_ENABLE,  MMCR0_PMCCEXT checks also before freezing ?

Basically just because that's the condition we want to set the PMU to 
before entering the guest if the guest does not have its own PMU 
registers in use.

I'm not entirely sure this is correct / optimal for perf though, so we
should double check that. I think some of this stuff should be wrapped 
up and put into perf/ subdirectory as I've said before. KVM shouldn't 
need to know about exactly how PMU is to be set up and managed by
perf, we should just be able to ask perf to save/restore/switch state.
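
(Purely hypothetical sketch of what "ask perf to save/restore/switch
state" could look like as an interface; these functions do not exist,
the names are made up for illustration.)

struct kvm_vcpu;

/* perf-owned entry points KVM would call around guest entry/exit. */
void perf_pmu_switch_to_guest(struct kvm_vcpu *vcpu);
void perf_pmu_switch_to_host(struct kvm_vcpu *vcpu);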

Thanks,
Nick

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 27/43] KVM: PPC: Book3S HV P9: Move host OS save/restore functions to built-in
  2021-07-08  5:44     ` Athira Rajeev
@ 2021-07-12  2:50       ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-07-12  2:50 UTC (permalink / raw)
  To: Athira Rajeev; +Cc: linuxppc-dev, kvm-ppc

Excerpts from Athira Rajeev's message of July 8, 2021 3:32 pm:
> 
> 
>> On 22-Jun-2021, at 4:27 PM, Nicholas Piggin <npiggin@gmail.com> wrote:
>> 
>> Move the P9 guest/host register switching functions to the built-in
>> P9 entry code, and export it for nested to use as well.
>> 
>> This allows more flexibility in scheduling these supervisor privileged
>> SPR accesses with the HV privileged and PR SPR accesses in the low level
>> entry code.
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>> arch/powerpc/kvm/book3s_hv.c          | 351 +-------------------------
>> arch/powerpc/kvm/book3s_hv.h          |  39 +++
>> arch/powerpc/kvm/book3s_hv_p9_entry.c | 332 ++++++++++++++++++++++++
>> 3 files changed, 372 insertions(+), 350 deletions(-)
>> create mode 100644 arch/powerpc/kvm/book3s_hv.h
>> 
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 35749b0b663f..a7660af22161 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -79,6 +79,7 @@
>> #include <asm/dtl.h>
>> 
>> #include "book3s.h"
>> +#include "book3s_hv.h"
>> 
>> #define CREATE_TRACE_POINTS
>> #include "trace_hv.h"
>> @@ -3675,356 +3676,6 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
>> 	trace_kvmppc_run_core(vc, 1);
>> }
>> 
>> -/*
>> - * Privileged (non-hypervisor) host registers to save.
>> - */
>> -struct p9_host_os_sprs {
>> -	unsigned long dscr;
>> -	unsigned long tidr;
>> -	unsigned long iamr;
>> -	unsigned long amr;
>> -	unsigned long fscr;
>> -
>> -	unsigned int pmc1;
>> -	unsigned int pmc2;
>> -	unsigned int pmc3;
>> -	unsigned int pmc4;
>> -	unsigned int pmc5;
>> -	unsigned int pmc6;
>> -	unsigned long mmcr0;
>> -	unsigned long mmcr1;
>> -	unsigned long mmcr2;
>> -	unsigned long mmcr3;
>> -	unsigned long mmcra;
>> -	unsigned long siar;
>> -	unsigned long sier1;
>> -	unsigned long sier2;
>> -	unsigned long sier3;
>> -	unsigned long sdar;
>> -};
>> -
>> -static void freeze_pmu(unsigned long mmcr0, unsigned long mmcra)
>> -{
>> -	if (!(mmcr0 & MMCR0_FC))
>> -		goto do_freeze;
>> -	if (mmcra & MMCRA_SAMPLE_ENABLE)
>> -		goto do_freeze;
>> -	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
>> -		if (!(mmcr0 & MMCR0_PMCCEXT))
>> -			goto do_freeze;
>> -		if (!(mmcra & MMCRA_BHRB_DISABLE))
>> -			goto do_freeze;
>> -	}
>> -	return;
>> -
>> -do_freeze:
>> -	mmcr0 = MMCR0_FC;
>> -	mmcra = 0;
>> -	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
>> -		mmcr0 |= MMCR0_PMCCEXT;
>> -		mmcra = MMCRA_BHRB_DISABLE;
>> -	}
>> -
>> -	mtspr(SPRN_MMCR0, mmcr0);
>> -	mtspr(SPRN_MMCRA, mmcra);
>> -	isync();
>> -}
>> -
>> -static void switch_pmu_to_guest(struct kvm_vcpu *vcpu,
>> -				struct p9_host_os_sprs *host_os_sprs)
>> -{
>> -	struct lppaca *lp;
>> -	int load_pmu = 1;
>> -
>> -	lp = vcpu->arch.vpa.pinned_addr;
>> -	if (lp)
>> -		load_pmu = lp->pmcregs_in_use;
>> -
>> -	if (load_pmu)
>> -	      vcpu->arch.hfscr |= HFSCR_PM;
>> -
>> -	/* Save host */
>> -	if (ppc_get_pmu_inuse()) {
>> -		/*
>> -		 * It might be better to put PMU handling (at least for the
>> -		 * host) in the perf subsystem because it knows more about what
>> -		 * is being used.
>> -		 */
>> -
>> -		/* POWER9, POWER10 do not implement HPMC or SPMC */
>> -
>> -		host_os_sprs->mmcr0 = mfspr(SPRN_MMCR0);
>> -		host_os_sprs->mmcra = mfspr(SPRN_MMCRA);
>> -
>> -		freeze_pmu(host_os_sprs->mmcr0, host_os_sprs->mmcra);
>> -
>> -		host_os_sprs->pmc1 = mfspr(SPRN_PMC1);
>> -		host_os_sprs->pmc2 = mfspr(SPRN_PMC2);
>> -		host_os_sprs->pmc3 = mfspr(SPRN_PMC3);
>> -		host_os_sprs->pmc4 = mfspr(SPRN_PMC4);
>> -		host_os_sprs->pmc5 = mfspr(SPRN_PMC5);
>> -		host_os_sprs->pmc6 = mfspr(SPRN_PMC6);
>> -		host_os_sprs->mmcr1 = mfspr(SPRN_MMCR1);
>> -		host_os_sprs->mmcr2 = mfspr(SPRN_MMCR2);
>> -		host_os_sprs->sdar = mfspr(SPRN_SDAR);
>> -		host_os_sprs->siar = mfspr(SPRN_SIAR);
>> -		host_os_sprs->sier1 = mfspr(SPRN_SIER);
>> -
>> -		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
>> -			host_os_sprs->mmcr3 = mfspr(SPRN_MMCR3);
>> -			host_os_sprs->sier2 = mfspr(SPRN_SIER2);
>> -			host_os_sprs->sier3 = mfspr(SPRN_SIER3);
>> -		}
>> -	}
>> -
>> -#ifdef CONFIG_PPC_PSERIES
>> -	if (kvmhv_on_pseries()) {
>> -		if (vcpu->arch.vpa.pinned_addr) {
>> -			struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
>> -			get_lppaca()->pmcregs_in_use = lp->pmcregs_in_use;
>> -		} else {
>> -			get_lppaca()->pmcregs_in_use = 1;
>> -		}
>> -	}
>> -#endif
>> -
>> -	/* Load guest */
>> -	if (vcpu->arch.hfscr & HFSCR_PM) {
>> -		mtspr(SPRN_PMC1, vcpu->arch.pmc[0]);
>> -		mtspr(SPRN_PMC2, vcpu->arch.pmc[1]);
>> -		mtspr(SPRN_PMC3, vcpu->arch.pmc[2]);
>> -		mtspr(SPRN_PMC4, vcpu->arch.pmc[3]);
>> -		mtspr(SPRN_PMC5, vcpu->arch.pmc[4]);
>> -		mtspr(SPRN_PMC6, vcpu->arch.pmc[5]);
>> -		mtspr(SPRN_MMCR1, vcpu->arch.mmcr[1]);
>> -		mtspr(SPRN_MMCR2, vcpu->arch.mmcr[2]);
>> -		mtspr(SPRN_SDAR, vcpu->arch.sdar);
>> -		mtspr(SPRN_SIAR, vcpu->arch.siar);
>> -		mtspr(SPRN_SIER, vcpu->arch.sier[0]);
>> -
>> -		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
>> -			mtspr(SPRN_MMCR3, vcpu->arch.mmcr[4]);
> 
> 
> Hi Nick,
> 
> Have a doubt here..
> For MMCR3, it is  vcpu->arch.mmcr[3) ?

Hey, yea it is you're right. Good catch.
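
(For the record, a sketch of the corrected line; only the array index
changes with respect to the quoted hunk.)

	mtspr(SPRN_MMCR3, vcpu->arch.mmcr[3]);	/* was mmcr[4] */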

Thanks,
Nick


^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use
  2021-07-12  2:41       ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in u Nicholas Piggin
  (?)
@ 2021-07-12  3:17       ` Athira Rajeev
  -1 siblings, 0 replies; 133+ messages in thread
From: Athira Rajeev @ 2021-07-12  3:17 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linuxppc-dev, kvm-ppc

[-- Attachment #1: Type: text/html, Size: 8947 bytes --]

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use
  2021-07-02  0:27       ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in u Nicholas Piggin
@ 2021-07-12  3:54         ` Madhavan Srinivasan
  -1 siblings, 0 replies; 133+ messages in thread
From: Madhavan Srinivasan @ 2021-07-12  3:42 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev


On 7/2/21 5:57 AM, Nicholas Piggin wrote:
> Excerpts from Madhavan Srinivasan's message of July 1, 2021 11:17 pm:
>> On 6/22/21 4:27 PM, Nicholas Piggin wrote:
>>> KVM PMU management code looks for particular frozen/disabled bits in
>>> the PMU registers so it knows whether it must clear them when coming
>>> out of a guest or not. Setting this up helps KVM make these optimisations
>>> without getting confused. Longer term the better approach might be to
>>> move guest/host PMU switching to the perf subsystem.
>>>
>>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>>> ---
>>>    arch/powerpc/kernel/cpu_setup_power.c | 4 ++--
>>>    arch/powerpc/kernel/dt_cpu_ftrs.c     | 6 +++---
>>>    arch/powerpc/kvm/book3s_hv.c          | 5 +++++
>>>    arch/powerpc/perf/core-book3s.c       | 7 +++++++
>>>    4 files changed, 17 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kernel/cpu_setup_power.c b/arch/powerpc/kernel/cpu_setup_power.c
>>> index a29dc8326622..3dc61e203f37 100644
>>> --- a/arch/powerpc/kernel/cpu_setup_power.c
>>> +++ b/arch/powerpc/kernel/cpu_setup_power.c
>>> @@ -109,7 +109,7 @@ static void init_PMU_HV_ISA207(void)
>>>    static void init_PMU(void)
>>>    {
>>>    	mtspr(SPRN_MMCRA, 0);
>>> -	mtspr(SPRN_MMCR0, 0);
>>> +	mtspr(SPRN_MMCR0, MMCR0_FC);
>> Sticky point here is, currently if not frozen, pmc5/6 will
>> keep counting. And not freezing them at boot is quite useful
>> sometimes, like say when running in a simulation where we could calculate
>> approx CPIs for micro benchmarks without the perf subsystem.

Sorry for a delayed response

> You even can't use the sysfs files in this sim environment? In that case

Yes, it is possible. It's just that we would need an additional patch to
maintain for the sim.
> what if we added a boot option that could set some things up? In that
> case possibly you could even gather some more types of events too.

Having a boot option would be overkill :). I guess we could have
this under a config option?
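
(A rough sketch of the config-option idea, based on the init_PMU() hunk
quoted above; CONFIG_PPC_PMU_UNFROZEN_BOOT is an invented symbol, not a
real option.)

static void init_PMU(void)
{
	mtspr(SPRN_MMCRA, 0);
#ifdef CONFIG_PPC_PMU_UNFROZEN_BOOT
	mtspr(SPRN_MMCR0, 0);		/* PMC5/6 keep counting at boot */
#else
	mtspr(SPRN_MMCR0, MMCR0_FC);	/* frozen, as in this patch */
#endif
	mtspr(SPRN_MMCR1, 0);
	mtspr(SPRN_MMCR2, 0);
}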

>
> Thanks,
> Nick

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 11/43] KVM: PPC: Book3S HV P9: Implement PMU save/restore in C
  2021-07-12  2:49       ` Nicholas Piggin
@ 2021-07-12 14:19         ` Athira Rajeev
  -1 siblings, 0 replies; 133+ messages in thread
From: Athira Rajeev @ 2021-07-12 14:07 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linuxppc-dev, kvm-ppc



> On 12-Jul-2021, at 8:19 AM, Nicholas Piggin <npiggin@gmail.com> wrote:
> 
> Excerpts from Athira Rajeev's message of July 10, 2021 12:47 pm:
>> 
>> 
>>> On 22-Jun-2021, at 4:27 PM, Nicholas Piggin <npiggin@gmail.com> wrote:
>>> 
>>> Implement the P9 path PMU save/restore code in C, and remove the
>>> POWER9/10 code from the P7/8 path assembly.
>>> 
>>> -449 cycles (8533) POWER9 virt-mode NULL hcall
>>> 
>>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>>> ---
>>> arch/powerpc/include/asm/asm-prototypes.h |   5 -
>>> arch/powerpc/kvm/book3s_hv.c              | 205 ++++++++++++++++++++--
>>> arch/powerpc/kvm/book3s_hv_interrupts.S   |  13 +-
>>> arch/powerpc/kvm/book3s_hv_rmhandlers.S   |  43 +----
>>> 4 files changed, 200 insertions(+), 66 deletions(-)
>>> 
>>> diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h
>>> index 02ee6f5ac9fe..928db8ef9a5a 100644
>>> --- a/arch/powerpc/include/asm/asm-prototypes.h
>>> +++ b/arch/powerpc/include/asm/asm-prototypes.h
>>> @@ -136,11 +136,6 @@ static inline void kvmppc_restore_tm_hv(struct kvm_vcpu *vcpu, u64 msr,
>>> 					bool preserve_nv) { }
>>> #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
>>> 
>>> -void kvmhv_save_host_pmu(void);
>>> -void kvmhv_load_host_pmu(void);
>>> -void kvmhv_save_guest_pmu(struct kvm_vcpu *vcpu, bool pmu_in_use);
>>> -void kvmhv_load_guest_pmu(struct kvm_vcpu *vcpu);
>>> -
>>> void kvmppc_p9_enter_guest(struct kvm_vcpu *vcpu);
>>> 
>>> long kvmppc_h_set_dabr(struct kvm_vcpu *vcpu, unsigned long dabr);
>>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>>> index f7349d150828..b1b94b3563b7 100644
>>> --- a/arch/powerpc/kvm/book3s_hv.c
>>> +++ b/arch/powerpc/kvm/book3s_hv.c
>>> @@ -3635,6 +3635,188 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
>>> 	trace_kvmppc_run_core(vc, 1);
>>> }
>>> 
>>> +/*
>>> + * Privileged (non-hypervisor) host registers to save.
>>> + */
>>> +struct p9_host_os_sprs {
>>> +	unsigned long dscr;
>>> +	unsigned long tidr;
>>> +	unsigned long iamr;
>>> +	unsigned long amr;
>>> +	unsigned long fscr;
>>> +
>>> +	unsigned int pmc1;
>>> +	unsigned int pmc2;
>>> +	unsigned int pmc3;
>>> +	unsigned int pmc4;
>>> +	unsigned int pmc5;
>>> +	unsigned int pmc6;
>>> +	unsigned long mmcr0;
>>> +	unsigned long mmcr1;
>>> +	unsigned long mmcr2;
>>> +	unsigned long mmcr3;
>>> +	unsigned long mmcra;
>>> +	unsigned long siar;
>>> +	unsigned long sier1;
>>> +	unsigned long sier2;
>>> +	unsigned long sier3;
>>> +	unsigned long sdar;
>>> +};
>>> +
>>> +static void freeze_pmu(unsigned long mmcr0, unsigned long mmcra)
>>> +{
>>> +	if (!(mmcr0 & MMCR0_FC))
>>> +		goto do_freeze;
>>> +	if (mmcra & MMCRA_SAMPLE_ENABLE)
>>> +		goto do_freeze;
>>> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
>>> +		if (!(mmcr0 & MMCR0_PMCCEXT))
>>> +			goto do_freeze;
>>> +		if (!(mmcra & MMCRA_BHRB_DISABLE))
>>> +			goto do_freeze;
>>> +	}
>>> +	return;
>> 
>> 
>> Hi Nick
>> 
>> When freezing the PMU, do we need to also set pmcregs_in_use to zero ?
> 
> Not immediately, we still need to save out the values of the PMU 
> registers. If we clear pmcregs_in_use, then our hypervisor can discard 
> the contents of those registers at any time.
> 
>> Also, why we need these above conditions like MMCRA_SAMPLE_ENABLE,  MMCR0_PMCCEXT checks also before freezing ?
> 
> Basically just because that's the condition we want to set the PMU to 
> before entering the guest if the guest does not have its own PMU 
> registers in use.
> 
> I'm not entirely sure this is correct / optimal for perf though, so we
> should double check that. I think some of this stuff should be wrapped 
> up and put into perf/ subdirectory as I've said before. KVM shouldn't 
> need to know about exactly how PMU is to be set up and managed by
> perf, we should just be able to ask perf to save/restore/switch state.

Hi Nick,

I agree with your point. It makes sense that some of this perf code should be moved under the perf/ directory.

Athira
> 
> Thanks,
> Nick


^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use
  2021-07-12  2:41       ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in u Nicholas Piggin
@ 2021-07-14 12:39         ` Nicholas Piggin
  -1 siblings, 0 replies; 133+ messages in thread
From: Nicholas Piggin @ 2021-07-14 12:39 UTC (permalink / raw)
  To: Athira Rajeev; +Cc: linuxppc-dev, kvm-ppc

Excerpts from Nicholas Piggin's message of July 12, 2021 12:41 pm:
> Excerpts from Athira Rajeev's message of July 10, 2021 12:50 pm:
>> 
>> 
>>> On 22-Jun-2021, at 4:27 PM, Nicholas Piggin <npiggin@gmail.com> wrote:
>>> 
>>> KVM PMU management code looks for particular frozen/disabled bits in
>>> the PMU registers so it knows whether it must clear them when coming
>>> out of a guest or not. Setting this up helps KVM make these optimisations
>>> without getting confused. Longer term the better approach might be to
>>> move guest/host PMU switching to the perf subsystem.
>>> 
>>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>>> ---
>>> arch/powerpc/kernel/cpu_setup_power.c | 4 ++--
>>> arch/powerpc/kernel/dt_cpu_ftrs.c     | 6 +++---
>>> arch/powerpc/kvm/book3s_hv.c          | 5 +++++
>>> arch/powerpc/perf/core-book3s.c       | 7 +++++++
>>> 4 files changed, 17 insertions(+), 5 deletions(-)
>>> 
>>> diff --git a/arch/powerpc/kernel/cpu_setup_power.c b/arch/powerpc/kernel/cpu_setup_power.c
>>> index a29dc8326622..3dc61e203f37 100644
>>> --- a/arch/powerpc/kernel/cpu_setup_power.c
>>> +++ b/arch/powerpc/kernel/cpu_setup_power.c
>>> @@ -109,7 +109,7 @@ static void init_PMU_HV_ISA207(void)
>>> static void init_PMU(void)
>>> {
>>> 	mtspr(SPRN_MMCRA, 0);
>>> -	mtspr(SPRN_MMCR0, 0);
>>> +	mtspr(SPRN_MMCR0, MMCR0_FC);
>>> 	mtspr(SPRN_MMCR1, 0);
>>> 	mtspr(SPRN_MMCR2, 0);
>>> }
>>> @@ -123,7 +123,7 @@ static void init_PMU_ISA31(void)
>>> {
>>> 	mtspr(SPRN_MMCR3, 0);
>>> 	mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
>>> -	mtspr(SPRN_MMCR0, MMCR0_PMCCEXT);
>>> +	mtspr(SPRN_MMCR0, MMCR0_FC | MMCR0_PMCCEXT);
>>> }
>>> 
>>> /*
>>> diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
>>> index 0a6b36b4bda8..06a089fbeaa7 100644
>>> --- a/arch/powerpc/kernel/dt_cpu_ftrs.c
>>> +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
>>> @@ -353,7 +353,7 @@ static void init_pmu_power8(void)
>>> 	}
>>> 
>>> 	mtspr(SPRN_MMCRA, 0);
>>> -	mtspr(SPRN_MMCR0, 0);
>>> +	mtspr(SPRN_MMCR0, MMCR0_FC);
>>> 	mtspr(SPRN_MMCR1, 0);
>>> 	mtspr(SPRN_MMCR2, 0);
>>> 	mtspr(SPRN_MMCRS, 0);
>>> @@ -392,7 +392,7 @@ static void init_pmu_power9(void)
>>> 		mtspr(SPRN_MMCRC, 0);
>>> 
>>> 	mtspr(SPRN_MMCRA, 0);
>>> -	mtspr(SPRN_MMCR0, 0);
>>> +	mtspr(SPRN_MMCR0, MMCR0_FC);
>>> 	mtspr(SPRN_MMCR1, 0);
>>> 	mtspr(SPRN_MMCR2, 0);
>>> }
>>> @@ -428,7 +428,7 @@ static void init_pmu_power10(void)
>>> 
>>> 	mtspr(SPRN_MMCR3, 0);
>>> 	mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
>>> -	mtspr(SPRN_MMCR0, MMCR0_PMCCEXT);
>>> +	mtspr(SPRN_MMCR0, MMCR0_FC | MMCR0_PMCCEXT);
>>> }
>>> 
>>> static int __init feat_enable_pmu_power10(struct dt_cpu_feature *f)
>>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>>> index 1f30f98b09d1..f7349d150828 100644
>>> --- a/arch/powerpc/kvm/book3s_hv.c
>>> +++ b/arch/powerpc/kvm/book3s_hv.c
>>> @@ -2593,6 +2593,11 @@ static int kvmppc_core_vcpu_create_hv(struct kvm_vcpu *vcpu)
>>> #endif
>>> #endif
>>> 	vcpu->arch.mmcr[0] = MMCR0_FC;
>>> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
>>> +		vcpu->arch.mmcr[0] |= MMCR0_PMCCEXT;
>>> +		vcpu->arch.mmcra = MMCRA_BHRB_DISABLE;
>>> +	}
>>> +
>>> 	vcpu->arch.ctrl = CTRL_RUNLATCH;
>>> 	/* default to host PVR, since we can't spoof it */
>>> 	kvmppc_set_pvr_hv(vcpu, mfspr(SPRN_PVR));
>>> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
>>> index 51622411a7cc..e33b29ec1a65 100644
>>> --- a/arch/powerpc/perf/core-book3s.c
>>> +++ b/arch/powerpc/perf/core-book3s.c
>>> @@ -1361,6 +1361,13 @@ static void power_pmu_enable(struct pmu *pmu)
>>> 		goto out;
>>> 
>>> 	if (cpuhw->n_events == 0) {
>>> +		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
>>> +			mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
>>> +			mtspr(SPRN_MMCR0, MMCR0_FC | MMCR0_PMCCEXT);
>>> +		} else {
>>> +			mtspr(SPRN_MMCRA, 0);
>>> +			mtspr(SPRN_MMCR0, MMCR0_FC);
>>> +		}
>> 
>> 
>> Hi Nick,
>> 
>> We are setting these bits in “power_pmu_disable” function. And disable will be called before any event gets deleted/stopped. Can you please help to understand why this is needed in power_pmu_enable path also ?
> 
> I'll have to go back and check what I needed it for.

Okay, MMCRA is getting MMCRA_SDAR_MODE_DCACHE set on POWER9, by the looks.

That's not necessarily a problem, but KVM sets MMCRA to 0 to disable
SDAR updates. So KVM and perf don't agree on what the "correct" value
for disabled is. Which could be a problem with POWER10 not setting BHRB
disable before my series.
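
(Editor's illustration of the mismatch described above; the values are
the ones named in this discussion, the helper is hypothetical.)

#include <asm/reg.h>

/*
 * perf's "PMU idle" MMCRA value on POWER9 ends up as
 * MMCRA_SDAR_MODE_DCACHE, while KVM treats MMCRA == 0 as the disabled
 * state, so a KVM-side test like this always sees a "dirty" MMCRA and
 * never gets to skip the mtspr, even though perf considers the PMU off.
 */
static void kvm_style_mmcra_fixup(void)
{
	if (mfspr(SPRN_MMCRA) != 0)
		mtspr(SPRN_MMCRA, 0);
}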

I'll get rid of this hunk for now; I expect things won't be exactly clean
or consistent until the KVM host PMU code is moved into perf/, though.

Thanks,
Nick

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use
  2021-07-14 12:39         ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in u Nicholas Piggin
@ 2021-07-16  3:55           ` Madhavan Srinivasan
  -1 siblings, 0 replies; 133+ messages in thread
From: Madhavan Srinivasan @ 2021-07-16  3:43 UTC (permalink / raw)
  To: Nicholas Piggin, Athira Rajeev; +Cc: linuxppc-dev, kvm-ppc


On 7/14/21 6:09 PM, Nicholas Piggin wrote:
> Excerpts from Nicholas Piggin's message of July 12, 2021 12:41 pm:
>> Excerpts from Athira Rajeev's message of July 10, 2021 12:50 pm:
>>>
>>>> On 22-Jun-2021, at 4:27 PM, Nicholas Piggin <npiggin@gmail.com> wrote:
>>>>
>>>> KVM PMU management code looks for particular frozen/disabled bits in
>>>> the PMU registers so it knows whether it must clear them when coming
>>>> out of a guest or not. Setting this up helps KVM make these optimisations
>>>> without getting confused. Longer term the better approach might be to
>>>> move guest/host PMU switching to the perf subsystem.
>>>>
>>>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>>>> ---
>>>> arch/powerpc/kernel/cpu_setup_power.c | 4 ++--
>>>> arch/powerpc/kernel/dt_cpu_ftrs.c     | 6 +++---
>>>> arch/powerpc/kvm/book3s_hv.c          | 5 +++++
>>>> arch/powerpc/perf/core-book3s.c       | 7 +++++++
>>>> 4 files changed, 17 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/arch/powerpc/kernel/cpu_setup_power.c b/arch/powerpc/kernel/cpu_setup_power.c
>>>> index a29dc8326622..3dc61e203f37 100644
>>>> --- a/arch/powerpc/kernel/cpu_setup_power.c
>>>> +++ b/arch/powerpc/kernel/cpu_setup_power.c
>>>> @@ -109,7 +109,7 @@ static void init_PMU_HV_ISA207(void)
>>>> static void init_PMU(void)
>>>> {
>>>> 	mtspr(SPRN_MMCRA, 0);
>>>> -	mtspr(SPRN_MMCR0, 0);
>>>> +	mtspr(SPRN_MMCR0, MMCR0_FC);
>>>> 	mtspr(SPRN_MMCR1, 0);
>>>> 	mtspr(SPRN_MMCR2, 0);
>>>> }
>>>> @@ -123,7 +123,7 @@ static void init_PMU_ISA31(void)
>>>> {
>>>> 	mtspr(SPRN_MMCR3, 0);
>>>> 	mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
>>>> -	mtspr(SPRN_MMCR0, MMCR0_PMCCEXT);
>>>> +	mtspr(SPRN_MMCR0, MMCR0_FC | MMCR0_PMCCEXT);
>>>> }
>>>>
>>>> /*
>>>> diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
>>>> index 0a6b36b4bda8..06a089fbeaa7 100644
>>>> --- a/arch/powerpc/kernel/dt_cpu_ftrs.c
>>>> +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
>>>> @@ -353,7 +353,7 @@ static void init_pmu_power8(void)
>>>> 	}
>>>>
>>>> 	mtspr(SPRN_MMCRA, 0);
>>>> -	mtspr(SPRN_MMCR0, 0);
>>>> +	mtspr(SPRN_MMCR0, MMCR0_FC);
>>>> 	mtspr(SPRN_MMCR1, 0);
>>>> 	mtspr(SPRN_MMCR2, 0);
>>>> 	mtspr(SPRN_MMCRS, 0);
>>>> @@ -392,7 +392,7 @@ static void init_pmu_power9(void)
>>>> 		mtspr(SPRN_MMCRC, 0);
>>>>
>>>> 	mtspr(SPRN_MMCRA, 0);
>>>> -	mtspr(SPRN_MMCR0, 0);
>>>> +	mtspr(SPRN_MMCR0, MMCR0_FC);
>>>> 	mtspr(SPRN_MMCR1, 0);
>>>> 	mtspr(SPRN_MMCR2, 0);
>>>> }
>>>> @@ -428,7 +428,7 @@ static void init_pmu_power10(void)
>>>>
>>>> 	mtspr(SPRN_MMCR3, 0);
>>>> 	mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
>>>> -	mtspr(SPRN_MMCR0, MMCR0_PMCCEXT);
>>>> +	mtspr(SPRN_MMCR0, MMCR0_FC | MMCR0_PMCCEXT);
>>>> }
>>>>
>>>> static int __init feat_enable_pmu_power10(struct dt_cpu_feature *f)
>>>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>>>> index 1f30f98b09d1..f7349d150828 100644
>>>> --- a/arch/powerpc/kvm/book3s_hv.c
>>>> +++ b/arch/powerpc/kvm/book3s_hv.c
>>>> @@ -2593,6 +2593,11 @@ static int kvmppc_core_vcpu_create_hv(struct kvm_vcpu *vcpu)
>>>> #endif
>>>> #endif
>>>> 	vcpu->arch.mmcr[0] = MMCR0_FC;
>>>> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
>>>> +		vcpu->arch.mmcr[0] |= MMCR0_PMCCEXT;
>>>> +		vcpu->arch.mmcra = MMCRA_BHRB_DISABLE;
>>>> +	}
>>>> +
>>>> 	vcpu->arch.ctrl = CTRL_RUNLATCH;
>>>> 	/* default to host PVR, since we can't spoof it */
>>>> 	kvmppc_set_pvr_hv(vcpu, mfspr(SPRN_PVR));
>>>> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
>>>> index 51622411a7cc..e33b29ec1a65 100644
>>>> --- a/arch/powerpc/perf/core-book3s.c
>>>> +++ b/arch/powerpc/perf/core-book3s.c
>>>> @@ -1361,6 +1361,13 @@ static void power_pmu_enable(struct pmu *pmu)
>>>> 		goto out;
>>>>
>>>> 	if (cpuhw->n_events == 0) {
>>>> +		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
>>>> +			mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
>>>> +			mtspr(SPRN_MMCR0, MMCR0_FC | MMCR0_PMCCEXT);
>>>> +		} else {
>>>> +			mtspr(SPRN_MMCRA, 0);
>>>> +			mtspr(SPRN_MMCR0, MMCR0_FC);
>>>> +		}
>>>
>>> Hi Nick,
>>>
>>> We are setting these bits in the “power_pmu_disable” function, and disable will be called before any event gets deleted/stopped. Can you please help me understand why this is also needed in the power_pmu_enable path?
>> I'll have to go back and check what I needed it for.
> Okay, MMCRA is getting MMCRA_SDAR_MODE_DCACHE set on POWER9, by the looks.
>
> That's not necessarily a problem, but KVM sets MMCRA to 0 to disable
> SDAR updates, so KVM and perf don't agree on what the "correct" value
> for "disabled" is. That could be a problem on POWER10, since BHRB
> disable was not being set before my series.

Nick,

BHRB disable is only known to a P10 guest kernel; a P9 guest kernel does
not know about it. For the P10 host and guest, I guess we set
BHRB_DISABLE in the cpu_setup code and also in power_pmu_disable().
So are you seeing a P10 guest with MMCRA as zero during the context switch?

Maddy
>
> I'll get rid of this hunk for now; I expect things won't be exactly clean
> or consistent until the KVM host PMU code is moved into perf/, though.
>
> Thanks,
> Nick

^ permalink raw reply	[flat|nested] 133+ messages in thread

end of thread, other threads:[~2021-07-16  3:55 UTC | newest]

Thread overview: 133+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-22 10:56 [RFC PATCH 00/43] KVM: PPC: Book3S HV P9: entry/exit optimisations round 1 Nicholas Piggin
2021-06-22 10:56 ` Nicholas Piggin
2021-06-22 10:56 ` [RFC PATCH 01/43] powerpc/64s: Remove WORT SPR from POWER9/10 Nicholas Piggin
2021-06-22 10:56   ` Nicholas Piggin
2021-06-30 17:29   ` Fabiano Rosas
2021-06-30 17:29     ` Fabiano Rosas
2021-06-22 10:56 ` [RFC PATCH 02/43] KMV: PPC: Book3S HV P9: Use set_dec to set decrementer to host Nicholas Piggin
2021-06-22 10:56   ` Nicholas Piggin
2021-06-22 10:56 ` [RFC PATCH 03/43] KVM: PPC: Book3S HV P9: Use host timer accounting to avoid decrementer read Nicholas Piggin
2021-06-22 10:56   ` Nicholas Piggin
2021-06-22 10:56 ` [RFC PATCH 04/43] KVM: PPC: Book3S HV P9: Use large decrementer for HDEC Nicholas Piggin
2021-06-22 10:56   ` Nicholas Piggin
2021-06-22 10:56 ` [RFC PATCH 05/43] KVM: PPC: Book3S HV P9: Reduce mftb per guest entry/exit Nicholas Piggin
2021-06-22 10:56   ` Nicholas Piggin
2021-06-22 10:56 ` [RFC PATCH 06/43] powerpc/time: add API for KVM to re-arm the host timer/decrementer Nicholas Piggin
2021-06-22 10:56   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 07/43] KVM: PPC: Book3S HV: POWER10 enable HAIL when running radix guests Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-30 19:41   ` Fabiano Rosas
2021-06-30 19:41     ` Fabiano Rosas
2021-06-22 10:57 ` [RFC PATCH 08/43] powerpc/64s: Keep AMOR SPR a constant ~0 at runtime Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-30 19:17   ` Fabiano Rosas
2021-06-30 19:17     ` Fabiano Rosas
2021-06-22 10:57 ` [RFC PATCH 09/43] KVM: PPC: Book3S HV: Don't always save PMU for guest capable of nesting Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-07-01 13:17   ` Madhavan Srinivasan
2021-07-01 13:29     ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in u Madhavan Srinivasan
2021-07-02  0:27     ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use Nicholas Piggin
2021-07-02  0:27       ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in u Nicholas Piggin
2021-07-08 12:45       ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use Nicholas Piggin
2021-07-08 12:45         ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in u Nicholas Piggin
2021-07-12  3:42       ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use Madhavan Srinivasan
2021-07-12  3:54         ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in u Madhavan Srinivasan
2021-07-10  2:50   ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use Athira Rajeev
2021-07-10  2:50     ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in u Athira Rajeev
2021-07-12  2:41     ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use Nicholas Piggin
2021-07-12  2:41       ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in u Nicholas Piggin
2021-07-12  3:17       ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use Athira Rajeev
2021-07-14 12:39       ` Nicholas Piggin
2021-07-14 12:39         ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in u Nicholas Piggin
2021-07-16  3:43         ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use Madhavan Srinivasan
2021-07-16  3:55           ` [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in u Madhavan Srinivasan
2021-06-22 10:57 ` [RFC PATCH 11/43] KVM: PPC: Book3S HV P9: Implement PMU save/restore in C Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-07-10  2:47   ` Athira Rajeev
2021-07-10  2:59     ` Athira Rajeev
2021-07-12  2:49     ` Nicholas Piggin
2021-07-12  2:49       ` Nicholas Piggin
2021-07-12 14:07       ` Athira Rajeev
2021-07-12 14:19         ` Athira Rajeev
2021-06-22 10:57 ` [RFC PATCH 12/43] KVM: PPC: Book3S HV P9: Factor out yield_count increment Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-07-08 17:56   ` Fabiano Rosas
2021-07-08 17:56     ` Fabiano Rosas
2021-06-22 10:57 ` [RFC PATCH 13/43] KVM: PPC: Book3S HV P9: Factor PMU save/load into context switch functions Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 14/43] KVM: PPC: Book3S HV P9: Demand fault PMU SPRs when marked not inuse Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 15/43] KVM: PPC: Book3S HV: CTRL SPR does not require read-modify-write Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 16/43] KVM: PPC: Book3S HV P9: Move SPRG restore to restore_p9_host_os_sprs Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 17/43] KVM: PPC: Book3S HV P9: Reduce mtmsrd instructions required to save host SPRs Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 18/43] KVM: PPC: Book3S HV P9: Improve mtmsrd scheduling by delaying MSR[EE] disable Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 19/43] KVM: PPC: Book3S HV P9: Add kvmppc_stop_thread to match kvmppc_start_thread Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-30 20:18   ` Fabiano Rosas
2021-06-30 20:18     ` Fabiano Rosas
2021-06-22 10:57 ` [RFC PATCH 20/43] KVM: PPC: Book3S HV: Change dec_expires to be relative to guest timebase Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 21/43] KVM: PPC: Book3S HV P9: Move TB updates Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 22/43] KVM: PPC: Book3S HV P9: Optimise timebase reads Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 23/43] KVM: PPC: Book3S HV P9: Avoid SPR scoreboard stalls Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 24/43] KVM: PPC: Book3S HV P9: Only execute mtSPR if the value changed Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 25/43] KVM: PPC: Book3S HV P9: Juggle SPR switching around Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 26/43] KVM: PPC: Book3S HV P9: Move vcpu register save/restore into functions Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 27/43] KVM: PPC: Book3S HV P9: Move host OS save/restore functions to built-in Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-07-08  5:32   ` Athira Rajeev
2021-07-08  5:44     ` Athira Rajeev
2021-07-12  2:50     ` Nicholas Piggin
2021-07-12  2:50       ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 28/43] KVM: PPC: Book3S HV P9: Move nested guest entry into its own function Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 29/43] KVM: PPC: Book3S HV P9: Move remaining SPR and MSR access into low level entry Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 30/43] KVM: PPC: Book3S HV P9: Implement TM fastpath for guest entry/exit Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 31/43] KVM: PPC: Book3S HV P9: Switch PMU to guest as late as possible Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 32/43] KVM: PPC: Book3S HV P9: Restrict DSISR canary workaround to processors that require it Nicholas Piggin
2021-06-22 10:57   ` [RFC PATCH 32/43] KVM: PPC: Book3S HV P9: Restrict DSISR canary workaround to processors that requir Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 33/43] KVM: PPC: Book3S HV P9: More SPR speed improvements Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 34/43] KVM: PPC: Book3S HV P9: Demand fault EBB facility registers Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-07-08 17:46   ` Fabiano Rosas
2021-07-08 17:46     ` Fabiano Rosas
2021-06-22 10:57 ` [RFC PATCH 35/43] KVM: PPC: Book3S HV P9: Demand fault TM " Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-07-08 17:46   ` Fabiano Rosas
2021-07-08 17:46     ` Fabiano Rosas
2021-06-22 10:57 ` [RFC PATCH 36/43] KVM: PPC: Book3S HV P9: Use Linux SPR save/restore to manage some host SPRs Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 37/43] KVM: PPC: Book3S HV P9: Comment and fix MMU context switching code Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 38/43] KVM: PPC: Book3S HV P9: Test dawr_enabled() before saving host DAWR SPRs Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-30 17:51   ` Fabiano Rosas
2021-06-30 17:51     ` Fabiano Rosas
2021-07-01  8:04     ` Nicholas Piggin
2021-07-01  8:04       ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 39/43] KVM: PPC: Book3S HV P9: Don't restore PSSCR if not needed Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 40/43] KVM: PPC: Book3S HV P9: Avoid tlbsync sequence on radix guest exit Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 41/43] KVM: PPC: Book3S HV Nested: Avoid extra mftb() in nested entry Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 42/43] KVM: PPC: Book3S HV P9: Improve mfmsr performance on entry Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
2021-06-22 10:57 ` [RFC PATCH 43/43] KVM: PPC: Book3S HV P9: Optimise hash guest SLB saving Nicholas Piggin
2021-06-22 10:57   ` Nicholas Piggin
