From: "Liang, Kan" <kan.liang@linux.intel.com>
To: David Dunn <daviddunn@google.com>, Dave Hansen <dave.hansen@intel.com>
Cc: Jim Mattson <jmattson@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Like Xu <like.xu.linux@gmail.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Like Xu <likexu@tencent.com>,
	Stephane Eranian <eranian@google.com>
Subject: Re: [PATCH kvm/queue v2 2/3] perf: x86/core: Add interface to query perfmon_event_map[] directly
Date: Thu, 10 Feb 2022 10:34:42 -0500
Message-ID: <7b5012d8-6ae1-7cde-a381-e82685dfed4f@linux.intel.com>
In-Reply-To: <CABOYuvbPL0DeEgV4gsC+v786xfBAo3T6+7XQr7cVVzbaoFoEAg@mail.gmail.com>



On 2/9/2022 2:24 PM, David Dunn wrote:
> Dave,
> 
> In my opinion, the right policy depends on what the host owner and
> guest owner are trying to achieve.
> 
> If the PMU is being used to locate places where performance could be
> improved in the system, there are two sub scenarios:
>     - The host and guest are owned by same entity that is optimizing
> overall system.  In this case, the guest doesn't need PMU access and
> better information is provided by profiling the entire system from the
> host.
>     - The host and guest are owned by different entities.  In this
> case, profiling from the host can identify perf issues in the guest.
> But what action can be taken?  The host entity must communicate issues
> back to the guest owner through some sort of out-of-band information
> channel.  On the other hand, preempting the host PMU to give the guest
> a fully functional PMU serves this use case well.
> 
> TDX and SGX (outside of debug mode) strongly assume different
> entities.  And Intel is doing this to reduce insight of the host into
> guest operations.  So in my opinion, preemption makes sense.
> 
> There are also scenarios where the host owner is trying to identify
> systemwide impacts of guest actions.  For example, detecting memory
> bandwidth consumption or split locks.  In this case, host control
> without preemption is necessary.
> 
> To address these various scenarios, it seems like the host needs to be
> able to have policy control on whether it is willing to have the PMU
> preempted by the guest.
> 
> But I don't see what scenario is well served by the current situation
> in KVM.  Currently the guest will either be told it has no PMU (which
> is fine) or that it has full control of a PMU.  If the guest is told
> it has full control of the PMU, it actually doesn't.  But instead of
> losing counters on well defined events (from the guest perspective),
> they simply stop counting depending on what the host is doing with the
> PMU.

In the current perf subsystem, a PMU is shared among different users
via the multiplexing mechanism when the resource is limited. No one
has full control of a PMU for its whole lifetime. A user only has the
PMU during its allotted period. I think the user can tell how long an
event actually ran from total_time_enabled and total_time_running.
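
To illustrate how a user can compensate for multiplexing, here is a
minimal sketch of the usual scaling estimate, assuming the raw count
and the two times have already been read from the event. The function
name scale_count() and its parameters are hypothetical, not a kernel
or perf tool interface.

```c
#include <stdint.h>

/*
 * Hypothetical helper: estimate what the count would have been if the
 * event had been on a counter for the whole time it was enabled.
 * raw          - count observed while the event was actually running
 * time_enabled - total time the event was enabled
 * time_running - total time the event was actually on a counter
 */
static uint64_t scale_count(uint64_t raw, uint64_t time_enabled,
                            uint64_t time_running)
{
    if (time_running == 0)
        return 0;       /* never scheduled: no basis for an estimate */
    if (time_running >= time_enabled)
        return raw;     /* not multiplexed: the raw count is exact */
    /* Extrapolate linearly over the time the event was not running. */
    return (uint64_t)((double)raw * (double)time_enabled /
                      (double)time_running);
}
```

The estimate is only as good as the assumption that the event rate
was similar while the event was not scheduled.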

A guest should rely on the host to tell it whether the PMU resource
is available. But unfortunately, I don't think we have such a
notification mechanism in KVM. The guest has the wrong impression that
it has full control of the PMU.

In my opinion, we should add the notification mechanism in KVM. When the
PMU resource is limited, the guest can know whether it's multiplexing or
can choose to reschedule the event.

But it seems the notification mechanism may not work for the TDX case?
> 
> On the other hand, if we flip it around the semantics are more clear.
> A guest will be told it has no PMU (which is fine) or that it has full
> control of the PMU.  If the guest is told that it has full control of
> the PMU, it does.  And the host (which is the thing that granted the
> full PMU to the guest) knows that events inside the guest are not
> being measured.  This results in all entities seeing something that
> can be reasoned about from their perspective.
>

I assume that this is for the TDX case (where the notification mechanism
doesn't work). The host still controls all the PMU resources. The TDX
guest is treated as a super-user who can 'own' a PMU. The admin in the
host can configure/change the PMUs owned by the TDX guest. Personally, I
think it makes sense. But please keep in mind that the counters are not
identical. There are some special events that can only run on a specific
counter. If that special counter is assigned to TDX, other entities can
never run those events. We should let other entities know if that
happens. Or we should never let non-host entities own the special counter.
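
As a rough illustration of the special counter concern, the check
could be expressed with bitmasks, assuming each event carries a mask
of the counters it is allowed to run on and the host tracks a mask of
counters handed to the guest. The name host_can_schedule() and both
masks are hypothetical; this is only a sketch of the idea, not an
existing kernel interface.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: bit i in a mask stands for hardware counter i.
 * event_counter_mask - counters this event is able to run on
 * guest_owned_mask   - counters currently owned by the guest
 * Returns true if at least one usable counter is left for the host.
 */
static bool host_can_schedule(uint64_t event_counter_mask,
                              uint64_t guest_owned_mask)
{
    return (event_counter_mask & ~guest_owned_mask) != 0;
}
```

An event restricted to counter 0 gives host_can_schedule(0x1, 0x1) ==
false once the guest owns counter 0, which is the case where other
entities would need to be told, or the assignment refused.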


Thanks,
Kan

> Thanks,
> 
> Dave Dunn
> 
> On Wed, Feb 9, 2022 at 10:57 AM Dave Hansen <dave.hansen@intel.com> wrote:
> 
>>> I was referring to gaps in the collection of data that the host perf
>>> subsystem doesn't know about if ATTRIBUTES.PERFMON is set for a TDX
>>> guest. This can potentially be a problem if someone is trying to
>>> measure events per unit of time.
>>
>> Ahh, that makes sense.
>>
>> Does SGX cause problem for these people?  It can create some of the same
>> collection gaps:
>>
>>          performance monitoring activities are suppressed when entering
>>          an opt-out (of performance monitoring) enclave.
