From: Like Xu <like.xu.linux@gmail.com>
To: Jim Mattson <jmattson@google.com>,
	Sean Christopherson <seanjc@google.com>
Cc: Sandipan Das <sandipan.das@amd.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Ravi Bangoria <ravi.bangoria@amd.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Santosh Shukla <santosh.shukla@amd.com>,
	"Tom Lendacky (AMD)" <thomas.lendacky@amd.com>,
	Ananth Narayan <ananth.narayan@amd.com>
Subject: Re: [PATCH 5/5] KVM: x86/pmu: Hide guest counter updates from the VMRUN instruction
Date: Mon, 29 May 2023 22:51:51 +0800
Message-ID: <ec42501c-2e66-5248-5b97-4827344418f3@gmail.com>
In-Reply-To: <CALMp9eQrDX6=gJzybegjzDJ665NCuWmESt-sZrKHcncnuENdpA@mail.gmail.com>

On 25/5/2023 5:32 am, Jim Mattson wrote:
> On Wed, May 24, 2023 at 2:29 PM Sean Christopherson <seanjc@google.com> wrote:
>>
>> On Wed, May 24, 2023, Jim Mattson wrote:
>>> On Wed, May 24, 2023 at 1:41 PM Sean Christopherson <seanjc@google.com> wrote:
>>>>
>>>> On Wed, Apr 26, 2023, Sandipan Das wrote:
>>>>> Hi Sean, Like,
>>>>>
>>>>> On 4/19/2023 7:11 PM, Like Xu wrote:
>>>>>>> Heh, it's very much explicable, it's just not desirable, and you and I would argue
>>>>>>> that it's also incorrect.
>>>>>>
>>>>>> This is completely inaccurate from the end guest pmu user's perspective.
>>>>>>
>>>>>> I have a toy that looks like virtio-pmu, through which guest users can get hypervisor performance data.
>>>>>> But the side effect of letting the guest see the VMRUN instruction by default is unacceptable, isn't it?
>>>>>>
>>>>>>>
>>>>>>> AMD folks, are there plans to document this as an erratum? I agree with Like that
>>>>>>> counting VMRUN as a taken branch in guest context is a CPU bug, even if the behavior
>>>>>>> is known/expected.
>>>>>>
>>>>>
>>>>> This behaviour is architectural and an erratum will not be issued. However, for clarity, a future
>>>>> release of the APM will include additional details like the following:
>>>>>
>>>>>    1) From the perspective of performance monitoring counters, VMRUNs are considered as far control
>>>>>       transfers and VMEXITs as exceptions.
>>>>>
>>>>>    2) When the performance monitoring counters are set up to count events only in certain modes
>>>>>       through the "OsUserMode" and "HostGuestOnly" bits, instructions and events that change the
>>>>>       mode are counted in the target mode. For example, a SYSCALL from CPL 3 to CPL 0 with a
>>>>>       counter set to count retired instructions with USR=1 and OS=0 will not cause an increment of
>>>>>       the counter. However, the SYSRET back from CPL 0 to CPL 3 will cause an increment of the
>>>>>       counter and the total count will end up correct. Similarly, when counting PMCx0C6 (retired
>>>>>       far control transfers, including exceptions and interrupts) with Guest=1 and Host=0, a VMRUN
>>>>>       instruction will cause an increment of the counter. However, the subsequent VMEXIT that occurs,
>>>>>       since the target is in the host, will not cause an increment of the counter and so the total
>>>>>       count will end up correct.
>>>>
>>>> The count from the guest's perspective does not "end up correct".  Unlike SYSCALL,
>>>> where _userspace_ deliberately and synchronously executes a branch instruction,
>>>> VMEXIT and VMRUN are supposed to be transparent to the guest and can be completely
>>>> asynchronous with respect to guest code execution, e.g. if the host is spamming
>>>> IRQs, the guest will see a potentially large number of bogus (from its perspective)
>>>> branches retired.
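
To make the disagreement concrete, here is a toy sketch of the "counted in
the target mode" rule from point 2) above. It is only a model under stated
assumptions: the Counter class, the mode names and the run() helper are
hypothetical and correspond to nothing in KVM or the APM. It assumes each
transition is a single countable event and that the guest itself executes
no far control transfers.

```python
from dataclasses import dataclass

# Toy model only: these mode names are illustrative, not KVM or APM identifiers.
USER, KERNEL = "user", "kernel"
HOST, GUEST = "host", "guest"


@dataclass
class Counter:
    """A counter that counts a transition only if its *target* mode matches."""
    counted_mode: str
    value: int = 0

    def on_transition(self, target_mode: str) -> None:
        # Rule 2) as quoted above: a mode-changing instruction or event
        # is counted in the mode it transfers control to.
        if target_mode == self.counted_mode:
            self.value += 1


def run(counter: Counter, targets: list) -> int:
    """Feed a sequence of transition targets to the counter, return its value."""
    for target in targets:
        counter.on_transition(target)
    return counter.value


# SYSCALL (target: kernel) then SYSRET (target: user), counter with USR=1, OS=0.
usr_only = run(Counter(USER), [KERNEL, USER])

# Three host-induced VMEXIT/VMRUN round trips (e.g. host IRQs) while the guest
# executes no far control transfer of its own; counter with Guest=1, Host=0.
guest_only = run(Counter(GUEST), [HOST, GUEST] * 3)

print(usr_only)    # 1
print(guest_only)  # 3
```

In this model the USR-only counter ends at 1 for one SYSCALL/SYSRET pair that
userspace itself issued, while the guest-only counter ends at 3 for round
trips the guest never asked for, which is the asymmetry described above.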
>>>
>>> The reverse problem occurs when a PMC is configured to count "CPUID
>>> instructions retired." Since KVM intercepts CPUID and emulates it, the
>>> PMC will always read 0, even if the guest executes a tight loop of
>>> CPUID instructions.

Unlikely. KVM counts emulated instructions via kvm_pmu_incr_counter().
Did I miss some conditions?

>>>
>>> The PMU is not virtualizable on AMD CPUs without significant
>>> hypervisor corrections. I have to wonder if it's really worth the
>>> effort.

I used to think so, until I saw the AMD64_EVENTSEL_GUESTONLY bit.
Hardware architects are expected to put more effort into this area.

>>
>> Per our offlist chat, my understanding is that there are caveats with vPMUs that
>> it's simply not feasible for a hypervisor to handle.  I.e. virtualizing any x86
>> PMU with 100% accuracy isn't happening anytime soon.

Indeed. Are there any more detailed complaints?

>>
>> The way forward is likely to evaluate each caveat on a case-by-case basis to
>> determine whether or not the cost of the fixup in KVM is worth the benefit to
>> the guest.  E.g. emulating "CPUID instructions retired" seems like it would be
>> fairly straightforward.  AFAICT, fixing up the VMRUN stuff is quite difficult though.
> 
> Yeah. The problem with fixing up "CPUID instructions retired" is
> tracking what the event encoding is for every F/M/S out there. It's
> not worth it.

I don't think 100% accurate emulation is feasible on Intel. Guest PMU users
are motivated by wanting to know how effectively they are running on the
current pCPU, and any vPMU emulation behavior that helps this understanding
would be valuable.
