* [PATCH AUTOSEL 5.17 01/49] KVM: PPC: Book3S HV P9: Fix "lost kick" race
From: Sasha Levin @ 2022-04-12  0:43 UTC
  To: linux-kernel, stable
  Cc: Nicholas Piggin, Michael Ellerman, Sasha Levin, farosas, aik,
	pbonzini, linuxppc-dev

From: Nicholas Piggin <npiggin@gmail.com>

[ Upstream commit c7fa848ff01dad9ed3146a6b1a7d3622131bcedd ]

When new work is created that requires attention from the hypervisor
(e.g., to inject an interrupt into the guest), fast_vcpu_kick is used to
pull the target vcpu out of the guest if it may have been running.

Therefore the work creation side looks like this:

  vcpu->arch.doorbell_request = 1;
  kvmppc_fast_vcpu_kick_hv(vcpu) {
    smp_mb();
    cpu = vcpu->cpu;
    if (cpu != -1)
        send_ipi(cpu);
  }

And the guest entry side *should* look like this:

  vcpu->cpu = smp_processor_id();
  smp_mb();
  if (vcpu->arch.doorbell_request) {
    // do something (abort entry or inject doorbell etc)
  }

But currently the store and load are flipped, so it is possible for the
entry to see no doorbell pending, and the doorbell creation misses the
store to set cpu, resulting in lost work (or at least work delayed until
the next guest exit).

Fix this by reordering the entry operations and adding an smp_mb()
between them. The P8 path appears to have a similar race, which is
commented but not addressed yet.
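
The ordering requirement described above is the classic store-buffering
pattern: each side stores its own flag, issues a full barrier, then loads
the other side's flag, so at least one side must observe the other's
store. Below is a minimal user-space sketch in C11 atomics. It assumes
atomic_thread_fence(memory_order_seq_cst) as a stand-in for the kernel's
smp_mb(). The names create_work and enter_guest and the two globals are
hypothetical and are not part of this patch:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for vcpu->arch.doorbell_request and vcpu->cpu. */
static atomic_int doorbell_request;
static atomic_int vcpu_cpu = -1;

/*
 * Work-creation side: publish the work, full barrier, then read the cpu.
 * Returns the cpu that would be sent an IPI, or -1 if the vcpu is not
 * currently marked as running anywhere.
 */
static int create_work(void)
{
	atomic_store_explicit(&doorbell_request, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst); /* assumed smp_mb() analogue */
	return atomic_load_explicit(&vcpu_cpu, memory_order_relaxed);
}

/*
 * Guest-entry side: publish the cpu, full barrier, then test for work.
 * Returns true if pending work was observed and entry should be aborted
 * or the doorbell injected.
 */
static bool enter_guest(int pcpu)
{
	assert(pcpu >= 0);
	atomic_store_explicit(&vcpu_cpu, pcpu, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst); /* assumed smp_mb() analogue */
	return atomic_load_explicit(&doorbell_request, memory_order_relaxed) != 0;
}
```

With both fences in place, the outcome where create_work() returns -1
and a concurrent enter_guest() returns false cannot occur. Performing the
entry-side load before the store, as the pre-patch code did, allows
exactly that outcome, which is the lost kick.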

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220303053315.1056880-2-npiggin@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/powerpc/kvm/book3s_hv.c | 41 +++++++++++++++++++++++++++++-------
 1 file changed, 33 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 791db769080d..316f61a4cb59 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -225,6 +225,13 @@ static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu *vcpu)
 	int cpu;
 	struct rcuwait *waitp;
 
+	/*
+	 * rcuwait_wake_up contains smp_mb() which orders prior stores that
+	 * create pending work vs below loads of cpu fields. The other side
+	 * is the barrier in vcpu run that orders setting the cpu fields vs
+	 * testing for pending work.
+	 */
+
 	waitp = kvm_arch_vcpu_get_wait(vcpu);
 	if (rcuwait_wake_up(waitp))
 		++vcpu->stat.generic.halt_wakeup;
@@ -1089,7 +1096,7 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
 			break;
 		}
 		tvcpu->arch.prodded = 1;
-		smp_mb();
+		smp_mb(); /* This orders prodded store vs ceded load */
 		if (tvcpu->arch.ceded)
 			kvmppc_fast_vcpu_kick_hv(tvcpu);
 		break;
@@ -3771,6 +3778,14 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
 		pvc = core_info.vc[sub];
 		pvc->pcpu = pcpu + thr;
 		for_each_runnable_thread(i, vcpu, pvc) {
+			/*
+			 * XXX: is kvmppc_start_thread called too late here?
+			 * It updates vcpu->cpu and vcpu->arch.thread_cpu
+			 * which are used by kvmppc_fast_vcpu_kick_hv(), but
+			 * kick is called after new exceptions become available
+			 * and exceptions are checked earlier than here, by
+			 * kvmppc_core_prepare_to_enter.
+			 */
 			kvmppc_start_thread(vcpu, pvc);
 			kvmppc_create_dtl_entry(vcpu, pvc);
 			trace_kvm_guest_enter(vcpu);
@@ -4492,6 +4507,21 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 	if (need_resched() || !kvm->arch.mmu_ready)
 		goto out;
 
+	vcpu->cpu = pcpu;
+	vcpu->arch.thread_cpu = pcpu;
+	vc->pcpu = pcpu;
+	local_paca->kvm_hstate.kvm_vcpu = vcpu;
+	local_paca->kvm_hstate.ptid = 0;
+	local_paca->kvm_hstate.fake_suspend = 0;
+
+	/*
+	 * Orders set cpu/thread_cpu vs testing for pending interrupts and
+	 * doorbells below. The other side is when these fields are set vs
+	 * kvmppc_fast_vcpu_kick_hv reading the cpu/thread_cpu fields to
+	 * kick a vCPU to notice the pending interrupt.
+	 */
+	smp_mb();
+
 	if (!nested) {
 		kvmppc_core_prepare_to_enter(vcpu);
 		if (test_bit(BOOK3S_IRQPRIO_EXTERNAL,
@@ -4511,13 +4541,6 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	tb = mftb();
 
-	vcpu->cpu = pcpu;
-	vcpu->arch.thread_cpu = pcpu;
-	vc->pcpu = pcpu;
-	local_paca->kvm_hstate.kvm_vcpu = vcpu;
-	local_paca->kvm_hstate.ptid = 0;
-	local_paca->kvm_hstate.fake_suspend = 0;
-
 	__kvmppc_create_dtl_entry(vcpu, pcpu, tb + vc->tb_offset, 0);
 
 	trace_kvm_guest_enter(vcpu);
@@ -4619,6 +4642,8 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 	run->exit_reason = KVM_EXIT_INTR;
 	vcpu->arch.ret = -EINTR;
  out:
+	vcpu->cpu = -1;
+	vcpu->arch.thread_cpu = -1;
 	powerpc_local_irq_pmu_restore(flags);
 	preempt_enable();
 	goto done;
-- 
2.35.1

