All of lore.kernel.org
 help / color / mirror / Atom feed
* Trouble with perf_guest_get_msrs() and pebs_no_isolation
@ 2021-02-09 22:56 Jim Mattson
  0 siblings, 0 replies; only message in thread
From: Jim Mattson @ 2021-02-09 22:56 UTC (permalink / raw)
  To: Peter Shier, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Borislav Petkov, H. Peter Anvin, Ingo Molnar, Jiri Olsa,
	Joerg Roedel, kvm, linux-kernel, Mark Rutland, Namhyung Kim,
	Paolo Bonzini, Peter Zijlstra, Sean Christopherson,
	Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, x86
  Cc: Jim Mattson

On a host that suffers from pebs_no_isolation, perf_guest_get_msrs()
adds an entry to cpuc->guest_switch_msrs for
MSR_IA32_PEBS_ENABLE. Kvm's atomic_switch_perf_msrs() is the only
caller of perf_guest_get_msrs(). If atomic_switch_perf_msrs() finds an
entry for MSR_IA32_PEBS_ENABLE in cpuc->guest_switch_msrs, it puts the
"host" value of this entry on the VM-exit MSR-load list for the
current VMCS. At the next VM-exit, that "host" value will be written
to MSR_IA32_PEBS_ENABLE.

The problem is that by the next VM-exit, that "host" value may be
stale. Though maskable interrupts are blocked from well before
atomic_switch_perf_msrs() to the next VM-entry, PMIs are delivered as
NMIs. Moreover, due to PMI throttling, the PMI handler may clear bits
in MSR_IA32_PEBS_ENABLE. See the comment to that effect in
handle_pmi_common(). In fact, by the time that perf_guest_get_msrs()
returns to its caller, the "host" value that it has recorded for
MSR_IA32_PEBS_ENABLE could already be stale.

What happens if a VM-exit sets a bit in MSR_IA32_PEBS_ENABLE that the
perf subsystem thinks should be clear? In the short term, nothing
happens at all. But note that this situation may not get better at the
next VM-exit, because kvm won't add MSR_IA32_PEBS_ENABLE to the
VM-exit MSR-load list if perf_guest_get_mrs() reports that the "host"
value of the MSR is 0. So, if the new MSR_IA32_PEBS_ENABLE "host"
value is 0, the stale bits can actually persist for a long time.

If, at some point in the future, the perf subsystem programs a counter
overflow interrupt on the same PMC for a PEBS-capable event, we are in
for some nasty surprises. (Note that the perf subsystem never
*intentionally* programs a PMC for both PEBS and counter overflow
interrupts at the same time.)

If a PEBS assist is triggered while in VMX non-root operation, the CPU
will try to access the host's DS_AREA using the guest's page tables,
and a page fault is likely (specifically on a read of the PEBS
interrupt threshold at offset 0x38 in the DS_AREA).

If the PEBS interrupt threshold is met while in VMX root operation,
two separate PMIs are generated: one for the PEBS interrupt threshold
and one for the counter overflow. This results in a message from
unknown_nmi_error(): "Uhhuh. NMI received for unknown reason <xx> on
CPU <n>."

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2021-02-10  1:26 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-09 22:56 Trouble with perf_guest_get_msrs() and pebs_no_isolation Jim Mattson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.