* [PATCH v4 00/53] KVM: PPC: Book3S HV P9: entry/exit optimisations
@ 2021-11-23  9:51 Nicholas Piggin
From: Nicholas Piggin @ 2021-11-23  9:51 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

This roughly halves radix guest full entry/exit latency on POWER9 and
POWER10.

Nested HV guests should see smaller improvements in their L1 entry/exit,
but most of the L0 speedups also apply to nested entry, so the two
combine. With both L0 and L1 patched, an nginx localhost throughput test
in an SMP nested guest improves by about 10% (in a direct guest it
doesn't change much because it uses XIVE for IPIs).

It does this in several main ways:

- Rearrange code to optimise SPR accesses. Mainly, avoid scoreboard
  stalls.

- Test SPR values to avoid mtSPRs where possible. mtSPRs are expensive.

- Reduce mftb. mftb is expensive.

- Demand fault certain facilities to avoid saving and/or restoring them
  (at the cost of a fault when they are used, but this is amortised over
  a number of entries, as is done for facilities when context switching
  processes). PM, TM, and EBB so far.

- Defer some sequences that are performed just in case a guest was
  interrupted in the middle of a critical section, so that they run only
  when the guest is scheduled on a different CPU rather than every time
  (at the cost of an extra IPI in that case). Namely the tlbsync
  sequence for radix with GTSE, which is very expensive.

- Reduce locking, barriers, atomics related to the vcpus-per-vcore > 1
  handling that the P9 path does not require.
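
Two of the techniques listed above, skipping an mtSPR when the value is
unchanged and demand faulting a facility, can be sketched in plain C as
below. This is only an illustration and is not code from the series:
fake_spr, fake_mtspr, switch_spr, fake_facility, guest_entry,
guest_uses_facility and FAKE_KEEP_ENTRIES are all hypothetical names,
the "SPR" is an ordinary struct, and the policy of keeping a facility
loaded for a fixed number of entries after a fault is an assumption
made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one hardware SPR; counts the costly writes. */
struct fake_spr {
	uint64_t value;
	unsigned long writes;
};

/* Models an mtSPR: always performs the (expensive) register write. */
static void fake_mtspr(struct fake_spr *spr, uint64_t val)
{
	spr->value = val;
	spr->writes++;
}

/*
 * Switch an SPR from the value known to be loaded to the wanted value,
 * skipping the write when the two already match.
 */
static void switch_spr(struct fake_spr *spr, uint64_t loaded, uint64_t wanted)
{
	if (loaded != wanted)
		fake_mtspr(spr, wanted);
}

/* Assumed policy: entries to keep a facility loaded after a fault. */
#define FAKE_KEEP_ENTRIES 8

/* Hypothetical per-vCPU state for one demand-faulted facility. */
struct fake_facility {
	bool enabled;            /* guest can use it without faulting */
	unsigned int keep;       /* remaining entries with eager restore */
	unsigned long restores;  /* times the facility state was loaded */
	unsigned long faults;    /* facility unavailable faults taken */
};

/* Guest entry: restore eagerly only while the facility is recently used. */
static void guest_entry(struct fake_facility *fac)
{
	if (fac->keep > 0) {
		fac->keep--;
		fac->restores++;
		fac->enabled = true;
	} else {
		fac->enabled = false;
	}
}

/* Guest touches the facility: fault it in if it was left disabled. */
static void guest_uses_facility(struct fake_facility *fac)
{
	if (!fac->enabled) {
		fac->faults++;
		fac->restores++;
		fac->enabled = true;
		fac->keep = FAKE_KEEP_ENTRIES;
	}
}
```

In this model a guest that never touches the facility pays no restore
cost on any entry, while a guest that does use it takes one fault and
then has the state loaded eagerly for the next few entries.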

Changes since v3:
- Fix a possible bug in "Avoid tlbsync sequence on radix guest exit"
  where the TLB flushing optimisation (1 thread TLBIEL flushes TLB for
  entire core) might break because 'ptesync' was no longer guaranteed
  to be executed on all threads (via regular exit path). Now the TLB
  flush keeps track of all threads and whether they need to do a TLBIEL
  or a PTESYNC. Fixing this requires a new patch "Split P8 from P9 path
  guest vCPU TLB flushing".

Changes since v2:
- Rebased, several patches from the series were merged in the previous
  merge window.
- Fixed some compile errors noticed by kernel test robot.
- Added RB from Athira for the PMU stuff (thanks!)
- Split TIDR ftr check (patch 2) out into its own patch.
- Added a missed license tag on new file.

Changes since v1:
- Verified DPDES changes still work with msgsndp SMT emulation.
- Fixed HMI handling bug.
- Split softpatch handling fixes into smaller pieces.
- Rebased with Fabiano's latest HV sanitising patches.
- Fix TM demand faulting bug causing nested guest TM tests to TM Bad
  Thing the host in rare cases.
- Re-name new "pmu=" command line option to "pmu_override=" and update
  documentation wording.
- Add default=y config option rather than unconditionally removing the
  L0 nested PMU workaround.
- Remove unnecessary MSR[RI] updates in entry/exit. Down to about 4700
  cycles now.
- Another bugfix from Alexey's testing.

Changes since RFC:
- Rebased with Fabiano's HV sanitising patches at the front.
- Several demand faulting bug fixes mostly relating to nested guests.
- Removed facility demand-faulting from L0 nested entry/exit handler.
  Demand faulting is still done in the L1, but not the L0. The reason
  is to reduce complexity (although it's only a small amount of
  complexity), reduce demand faulting overhead that may require several

Thanks,
Nick

Nicholas Piggin (53):
  powerpc/64s: Remove WORT SPR from POWER9/10 (take 2)
  powerpc/64s: guard optional TIDR SPR with CPU ftr test
  KMV: PPC: Book3S HV P9: Use set_dec to set decrementer to host
  KVM: PPC: Book3S HV P9: Use host timer accounting to avoid decrementer
    read
  KVM: PPC: Book3S HV P9: Use large decrementer for HDEC
  KVM: PPC: Book3S HV P9: Reduce mftb per guest entry/exit
  powerpc/time: add API for KVM to re-arm the host timer/decrementer
  KVM: PPC: Book3S HV: POWER10 enable HAIL when running radix guests
  powerpc/64s: Keep AMOR SPR a constant ~0 at runtime
  KVM: PPC: Book3S HV: Don't always save PMU for guest capable of
    nesting
  powerpc/64s: Always set PMU control registers to frozen/disabled when
    not in use
  powerpc/64s: Implement PMU override command line option
  KVM: PPC: Book3S HV P9: Implement PMU save/restore in C
  KVM: PPC: Book3S HV P9: Factor PMU save/load into context switch
    functions
  KVM: PPC: Book3S HV P9: Demand fault PMU SPRs when marked not inuse
  KVM: PPC: Book3S HV P9: Factor out yield_count increment
  KVM: PPC: Book3S HV: CTRL SPR does not require read-modify-write
  KVM: PPC: Book3S HV P9: Move SPRG restore to restore_p9_host_os_sprs
  KVM: PPC: Book3S HV P9: Reduce mtmsrd instructions required to save
    host SPRs
  KVM: PPC: Book3S HV P9: Improve mtmsrd scheduling by delaying MSR[EE]
    disable
  KVM: PPC: Book3S HV P9: Add kvmppc_stop_thread to match
    kvmppc_start_thread
  KVM: PPC: Book3S HV: Change dec_expires to be relative to guest
    timebase
  KVM: PPC: Book3S HV P9: Move TB updates
  KVM: PPC: Book3S HV P9: Optimise timebase reads
  KVM: PPC: Book3S HV P9: Avoid SPR scoreboard stalls
  KVM: PPC: Book3S HV P9: Only execute mtSPR if the value changed
  KVM: PPC: Book3S HV P9: Juggle SPR switching around
  KVM: PPC: Book3S HV P9: Move vcpu register save/restore into functions
  KVM: PPC: Book3S HV P9: Move host OS save/restore functions to
    built-in
  KVM: PPC: Book3S HV P9: Move nested guest entry into its own function
  KVM: PPC: Book3S HV P9: Move remaining SPR and MSR access into low
    level entry
  KVM: PPC: Book3S HV P9: Implement TM fastpath for guest entry/exit
  KVM: PPC: Book3S HV P9: Switch PMU to guest as late as possible
  KVM: PPC: Book3S HV P9: Restrict DSISR canary workaround to processors
    that require it
  KVM: PPC: Book3S HV P9: More SPR speed improvements
  KVM: PPC: Book3S HV P9: Demand fault EBB facility registers
  KVM: PPC: Book3S HV P9: Demand fault TM facility registers
  KVM: PPC: Book3S HV P9: Use Linux SPR save/restore to manage some host
    SPRs
  KVM: PPC: Book3S HV P9: Comment and fix MMU context switching code
  KVM: PPC: Book3S HV P9: Test dawr_enabled() before saving host DAWR
    SPRs
  KVM: PPC: Book3S HV P9: Don't restore PSSCR if not needed
  KVM: PPC: Book3S HV: Split P8 from P9 path guest vCPU TLB flushing
  KVM: PPC: Book3S HV P9: Avoid tlbsync sequence on radix guest exit
  KVM: PPC: Book3S HV Nested: Avoid extra mftb() in nested entry
  KVM: PPC: Book3S HV P9: Improve mfmsr performance on entry
  KVM: PPC: Book3S HV P9: Optimise hash guest SLB saving
  KVM: PPC: Book3S HV P9: Avoid changing MSR[RI] in entry and exit
  KVM: PPC: Book3S HV P9: Add unlikely annotation for !mmu_ready
  KVM: PPC: Book3S HV P9: Avoid cpu_in_guest atomics on entry and exit
  KVM: PPC: Book3S HV P9: Remove most of the vcore logic
  KVM: PPC: Book3S HV P9: Tidy kvmppc_create_dtl_entry
  KVM: PPC: Book3S HV P9: Stop using vc->dpdes
  KVM: PPC: Book3S HV P9: Remove subcore HMI handling

 .../admin-guide/kernel-parameters.txt         |   8 +
 arch/powerpc/include/asm/asm-prototypes.h     |   5 -
 arch/powerpc/include/asm/kvm_asm.h            |   1 +
 arch/powerpc/include/asm/kvm_book3s.h         |   6 +
 arch/powerpc/include/asm/kvm_book3s_64.h      |   5 +-
 arch/powerpc/include/asm/kvm_host.h           |   7 +-
 arch/powerpc/include/asm/kvm_ppc.h            |   4 +-
 arch/powerpc/include/asm/switch_to.h          |   3 +
 arch/powerpc/include/asm/time.h               |  19 +-
 arch/powerpc/kernel/cpu_setup_power.c         |  12 +-
 arch/powerpc/kernel/dt_cpu_ftrs.c             |   8 +-
 arch/powerpc/kernel/process.c                 |  34 +
 arch/powerpc/kernel/time.c                    |  54 +-
 arch/powerpc/kvm/Kconfig                      |  15 +
 arch/powerpc/kvm/book3s_64_entry.S            |  11 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c        |   4 +
 arch/powerpc/kvm/book3s_hv.c                  | 851 +++++++++--------
 arch/powerpc/kvm/book3s_hv.h                  |  42 +
 arch/powerpc/kvm/book3s_hv_builtin.c          |  55 +-
 arch/powerpc/kvm/book3s_hv_hmi.c              |   7 +-
 arch/powerpc/kvm/book3s_hv_interrupts.S       |  13 +-
 arch/powerpc/kvm/book3s_hv_nested.c           |   8 +-
 arch/powerpc/kvm/book3s_hv_p9_entry.c         | 898 +++++++++++++++---
 arch/powerpc/kvm/book3s_hv_ras.c              |  54 ++
 arch/powerpc/kvm/book3s_hv_rm_mmu.c           |   6 -
 arch/powerpc/kvm/book3s_hv_rmhandlers.S       |  73 +-
 arch/powerpc/mm/book3s64/radix_pgtable.c      |  15 -
 arch/powerpc/perf/core-book3s.c               |  35 +
 arch/powerpc/platforms/powernv/idle.c         |   9 +-
 arch/powerpc/xmon/xmon.c                      |  10 +-
 30 files changed, 1555 insertions(+), 717 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_hv.h

-- 
2.23.0



Thread overview: 55+ messages
2021-11-23  9:51 [PATCH v4 00/53] KVM: PPC: Book3S HV P9: entry/exit optimisations Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 01/53] powerpc/64s: Remove WORT SPR from POWER9/10 (take 2) Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 02/53] powerpc/64s: guard optional TIDR SPR with CPU ftr test Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 03/53] KMV: PPC: Book3S HV P9: Use set_dec to set decrementer to host Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 04/53] KVM: PPC: Book3S HV P9: Use host timer accounting to avoid decrementer read Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 05/53] KVM: PPC: Book3S HV P9: Use large decrementer for HDEC Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 06/53] KVM: PPC: Book3S HV P9: Reduce mftb per guest entry/exit Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 07/53] powerpc/time: add API for KVM to re-arm the host timer/decrementer Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 08/53] KVM: PPC: Book3S HV: POWER10 enable HAIL when running radix guests Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 09/53] powerpc/64s: Keep AMOR SPR a constant ~0 at runtime Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 10/53] KVM: PPC: Book3S HV: Don't always save PMU for guest capable of nesting Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 11/53] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 12/53] powerpc/64s: Implement PMU override command line option Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 13/53] KVM: PPC: Book3S HV P9: Implement PMU save/restore in C Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 14/53] KVM: PPC: Book3S HV P9: Factor PMU save/load into context switch functions Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 15/53] KVM: PPC: Book3S HV P9: Demand fault PMU SPRs when marked not inuse Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 16/53] KVM: PPC: Book3S HV P9: Factor out yield_count increment Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 17/53] KVM: PPC: Book3S HV: CTRL SPR does not require read-modify-write Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 18/53] KVM: PPC: Book3S HV P9: Move SPRG restore to restore_p9_host_os_sprs Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 19/53] KVM: PPC: Book3S HV P9: Reduce mtmsrd instructions required to save host SPRs Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 20/53] KVM: PPC: Book3S HV P9: Improve mtmsrd scheduling by delaying MSR[EE] disable Nicholas Piggin
2021-11-23  9:51 ` [PATCH v4 21/53] KVM: PPC: Book3S HV P9: Add kvmppc_stop_thread to match kvmppc_start_thread Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 22/53] KVM: PPC: Book3S HV: Change dec_expires to be relative to guest timebase Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 23/53] KVM: PPC: Book3S HV P9: Move TB updates Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 24/53] KVM: PPC: Book3S HV P9: Optimise timebase reads Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 25/53] KVM: PPC: Book3S HV P9: Avoid SPR scoreboard stalls Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 26/53] KVM: PPC: Book3S HV P9: Only execute mtSPR if the value changed Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 27/53] KVM: PPC: Book3S HV P9: Juggle SPR switching around Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 28/53] KVM: PPC: Book3S HV P9: Move vcpu register save/restore into functions Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 29/53] KVM: PPC: Book3S HV P9: Move host OS save/restore functions to built-in Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 30/53] KVM: PPC: Book3S HV P9: Move nested guest entry into its own function Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 31/53] KVM: PPC: Book3S HV P9: Move remaining SPR and MSR access into low level entry Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 32/53] KVM: PPC: Book3S HV P9: Implement TM fastpath for guest entry/exit Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 33/53] KVM: PPC: Book3S HV P9: Switch PMU to guest as late as possible Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 34/53] KVM: PPC: Book3S HV P9: Restrict DSISR canary workaround to processors that require it Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 35/53] KVM: PPC: Book3S HV P9: More SPR speed improvements Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 36/53] KVM: PPC: Book3S HV P9: Demand fault EBB facility registers Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 37/53] KVM: PPC: Book3S HV P9: Demand fault TM " Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 38/53] KVM: PPC: Book3S HV P9: Use Linux SPR save/restore to manage some host SPRs Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 39/53] KVM: PPC: Book3S HV P9: Comment and fix MMU context switching code Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 40/53] KVM: PPC: Book3S HV P9: Test dawr_enabled() before saving host DAWR SPRs Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 41/53] KVM: PPC: Book3S HV P9: Don't restore PSSCR if not needed Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 42/53] KVM: PPC: Book3S HV: Split P8 from P9 path guest vCPU TLB flushing Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 43/53] KVM: PPC: Book3S HV P9: Avoid tlbsync sequence on radix guest exit Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 44/53] KVM: PPC: Book3S HV Nested: Avoid extra mftb() in nested entry Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 45/53] KVM: PPC: Book3S HV P9: Improve mfmsr performance on entry Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 46/53] KVM: PPC: Book3S HV P9: Optimise hash guest SLB saving Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 47/53] KVM: PPC: Book3S HV P9: Avoid changing MSR[RI] in entry and exit Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 48/53] KVM: PPC: Book3S HV P9: Add unlikely annotation for !mmu_ready Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 49/53] KVM: PPC: Book3S HV P9: Avoid cpu_in_guest atomics on entry and exit Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 50/53] KVM: PPC: Book3S HV P9: Remove most of the vcore logic Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 51/53] KVM: PPC: Book3S HV P9: Tidy kvmppc_create_dtl_entry Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 52/53] KVM: PPC: Book3S HV P9: Stop using vc->dpdes Nicholas Piggin
2021-11-23  9:52 ` [PATCH v4 53/53] KVM: PPC: Book3S HV P9: Remove subcore HMI handling Nicholas Piggin
2021-11-25  9:38 ` [PATCH v4 00/53] KVM: PPC: Book3S HV P9: entry/exit optimisations Michael Ellerman
