linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Like Xu <like.xu.linux@gmail.com>
To: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Mingwei Zhang <mizhang@google.com>,
	Zhenyu Wang <zhenyuw@linux.intel.com>,
	Zhang Xiong <xiong.y.zhang@intel.com>,
	Lv Zhiyuan <zhiyuan.lv@intel.com>,
	Dapeng Mi <dapeng1.mi@intel.com>,
	Jim Mattson <jmattson@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH] KVM: x86/pmu: Disable support for adaptive PEBS
Date: Thu, 7 Mar 2024 19:07:46 +0800	[thread overview]
Message-ID: <e34d49b8-4aa2-455a-a623-53a630d484ef@gmail.com> (raw)
In-Reply-To: <20240307005833.827147-1-seanjc@google.com>

On 7/3/2024 8:58 am, Sean Christopherson wrote:
> Drop support for virtualizing adaptive PEBS, as KVM's implementation is
> architecturally broken without an obvious/easy path forward, and because
> exposing adaptive PEBS can leak host LBRs to the guest, i.e. can leak
> host kernel addresses to the guest.
> 
> Bug #1 is that KVM doesn't doesn't account for the upper 32 bits of
> IA32_FIXED_CTR_CTRL when (re)programming fixed counters, e.g
> fixed_ctrl_field() drops the upper bits, reprogram_fixed_counters()
> stores local variables as u8s and truncates the upper bits too, etc.
> 
> Bug #2 is that, because KVM _always_ sets precise_ip to a non-zero value
> for PEBS events, perf will _always_ generate an adaptive record, even if
> the guest requested a basic record.  Note, KVM will also enable adaptive
> PEBS in individual *counter*, even if adaptive PEBS isn't exposed to the
> guest, but this is benign as MSR_PEBS_DATA_CFG is guaranteed to be zero,
> i.e. the guest will only ever see Basic records.
> 
> Bug #3 is in perf.  intel_pmu_disable_fixed() doesn't clear the upper
> bits either, i.e. leaves ICL_FIXED_0_ADAPTIVE set, and
> intel_pmu_enable_fixed() effectively doesn't clear ICL_FIXED_0_ADAPTIVE
> either.  I.e. perf _always_ enables ADAPTIVE counters, regardless of what
> KVM requests.

The three issues above all point to a fix in one direction: to pass the value
of vcpu's EVT_SELx.Adaptive_Record[34] or FCx_Adaptive_Record to the
perf/core in a way and let the PEBS assist take effect as expected by vPEBS.

One place to address this is in the intel_guest_get_msrs() again:
- update vPMC[x].pPMC.fcctl_or_evtsel.ADAPTIVE = vPMC[x].use_adaptive
, since guest PEBS is disabled if host PEBS is enabled.

> 
> Bug #4 is that adaptive PEBS *might* effectively bypass event filters set
> by the host, as "Updated Memory Access Info Group" records information
> that might be disallowed by userspace via KVM_SET_PMU_EVENT_FILTER.

This could be seen as a missing feature, that is, whether PMU_EVENT_FILTER
can control PEBS events even if they share the same event encoding.

Furthermore, if LBR_FMT is cleared only by VMM, could the guest use
adaptive pebs to obtain valid guest_lbr records. It's open in the virt context
since real hardware doesn't have this issue.

> 
> Bug #5 is that KVM doesn't ensure LBR MSRs hold guest values (or at least
> zeros) when entering a vCPU with adaptive PEBS, which allows the guest
> to read host LBRs, i.e. host RIPs/addresses, by enabling "LBR Entries"
> records.
> 
> Disable adaptive PEBS support as an immediate fix due to the severity of
> the LBR leak in particular, and because fixing all of the bugs will be
> non-trivial, e.g. not suitable for backporting to stable kernels.
> 
> Note!  This will break live migration, but trying to make KVM play nice
> with live migration would be quite complicated, wouldn't be guaranteed to
> work (i.e. KVM might still kill/confuse the guest), and it's not clear
> that there are any publicly available VMMs that support adaptive PEBS,
> let alone live migrate VMs that support adaptive PEBS, e.g. QEMU doesn't
> support PEBS in any capacity.
> 
> Link: https://lore.kernel.org/all/20240306230153.786365-1-seanjc@google.com
> Link: https://lore.kernel.org/all/ZeepGjHCeSfadANM@google.com
> Fixes: c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS")
> Cc: stable@vger.kernel.org
> Cc: Like Xu <like.xu.linux@gmail.com>
> Cc: Mingwei Zhang <mizhang@google.com>
> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> Cc: Zhang Xiong <xiong.y.zhang@intel.com>
> Cc: Lv Zhiyuan <zhiyuan.lv@intel.com>
> Cc: Dapeng Mi <dapeng1.mi@intel.com>
> Cc: Jim Mattson <jmattson@google.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Acked-by: Like Xu <likexu@tencent.com>

> ---
>   arch/x86/kvm/vmx/vmx.c | 24 ++++++++++++++++++++++--
>   1 file changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 7a74388f9ecf..641a7d5bf584 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7864,8 +7864,28 @@ static u64 vmx_get_perf_capabilities(void)
>   
>   	if (vmx_pebs_supported()) {
>   		perf_cap |= host_perf_cap & PERF_CAP_PEBS_MASK;
> -		if ((perf_cap & PERF_CAP_PEBS_FORMAT) < 4)
> -			perf_cap &= ~PERF_CAP_PEBS_BASELINE;
> +
> +		/*
> +		 * Disallow adaptive PEBS as it is functionally broken, can be
> +		 * used by the guest to read *host* LBRs, and can be used to
> +		 * bypass userspace event filters.  To correctly and safely
> +		 * support adaptive PEBS, KVM needs to:
> +		 *
> +		 * 1. Account for the ADAPTIVE flag when (re)programming fixed
> +		 *    counters.
> +		 *
> +		 * 2. Gain support from perf (or take direct control of counter
> +		 *    programming) to support events without adaptive PEBS
> +		 *    enabled for the hardware counter.
> +		 *
> +		 * 3. Ensure LBR MSRs cannot hold host data on VM-Entry with
> +		 *    adaptive PEBS enabled and MSR_PEBS_DATA_CFG.LBRS=1.
> +		 *
> +		 * 4. Document which PMU events are effectively exposed to the
> +		 *    guest via adaptive PEBS, and make adaptive PEBS mutually
> +		 *    exclusive with KVM_SET_PMU_EVENT_FILTER if necessary.
> +		 */
> +		perf_cap &= ~PERF_CAP_PEBS_BASELINE;
>   	}
>   
>   	return perf_cap;
> 
> base-commit: 0c64952fec3ea01cb5b09f00134200f3e7ab40d5

  parent reply	other threads:[~2024-03-07 11:07 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-07  0:58 [PATCH] KVM: x86/pmu: Disable support for adaptive PEBS Sean Christopherson
2024-03-07  1:32 ` Mi, Dapeng
2024-03-07 11:07 ` Like Xu [this message]
2024-04-09  2:01 ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e34d49b8-4aa2-455a-a623-53a630d484ef@gmail.com \
    --to=like.xu.linux@gmail.com \
    --cc=dapeng1.mi@intel.com \
    --cc=jmattson@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mizhang@google.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=xiong.y.zhang@intel.com \
    --cc=zhenyuw@linux.intel.com \
    --cc=zhiyuan.lv@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).