kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Xu, Like" <like.xu@intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	eranian@google.com, andi@firstfloor.org,
	kan.liang@linux.intel.com, wei.w.wang@intel.com,
	Wanpeng Li <wanpengli@tencent.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, x86@kernel.org,
	linux-kernel@vger.kernel.org, Andi Kleen <ak@linux.intel.com>,
	Like Xu <like.xu@linux.intel.com>
Subject: Re: [PATCH v4 08/16] KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to manage guest DS buffer
Date: Fri, 9 Apr 2021 16:30:14 +0800	[thread overview]
Message-ID: <9ec0e0ba-bef6-710e-1e9c-36beaedae16e@intel.com> (raw)
In-Reply-To: <YHAJXh2AtSMcC5xf@hirez.programming.kicks-ass.net>

On 2021/4/9 15:59, Peter Zijlstra wrote:
> On Fri, Apr 09, 2021 at 03:07:38PM +0800, Xu, Like wrote:
>> Hi Peter,
>>
>> On 2021/4/8 15:52, Peter Zijlstra wrote:
>>>> This is because in the early part of this function, we have operations:
>>>>
>>>>       if (x86_pmu.flags & PMU_FL_PEBS_ALL)
>>>>           arr[0].guest &= ~cpuc->pebs_enabled;
>>>>       else
>>>>           arr[0].guest &= ~(cpuc->pebs_enabled & PEBS_COUNTER_MASK);
>>>>
>>>> and if guest has PEBS_ENABLED, we need these bits back for PEBS counters:
>>>>
>>>>       arr[0].guest |= arr[1].guest;
>>> I don't think that's right, who's to say they were set in the first
>>> place? The guest's GLOBAL_CTRL could have had the bits cleared at VMEXIT
>>> time. You can't unconditionally add PEBS_ENABLED into GLOBAL_CTRL,
>>> that's wrong.
>> I can't keep up with you on this comment and would you explain more ?
> Well, it could be I'm terminally confused on how virt works (I usually
> am, it just doesn't make any sense ever).

I may help you a little on this.

>
> On top of that this code doesn't have any comments to help.

More comments will be added.

>
> So perf_guest_switch_msr has two msr values: guest and host.
>
> In my naive understanding guest is the msr value the guest sees and host
> is the value the host has. If it is not that, then the naming is just
> misleading at best.
>
> But thinking more about it, if these are fully emulated MSRs (which I
> think they are), then there might actually be 3 different values, not 2.

You are right about 3 different values.

>
> We have the value the guest sees when it uses {RD,WR}MSR.
> We have the value the hardware has when it runs a guest.
> We have the value the hardware has when it doesn't run a guest.
>
> And somehow this code does something, but I can't for the life of me
> figure out what and how.

Just focus on the last two values and the enabling bits (on the GLOBAL_CTRL
and PEBS_ENABLE) of "the value the hardware has when it runs a guest"
are exclusive with "the value the hardware has when it doesn't run a guest."

>> To address your previous comments, does the code below look good to you?
>>
>> static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
>> {
>>      struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>>      struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
>>      struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds);
>>      struct kvm_pmu *pmu = (struct kvm_pmu *)data;
>>      u64 pebs_mask = (x86_pmu.flags & PMU_FL_PEBS_ALL) ?
>>              cpuc->pebs_enabled : (cpuc->pebs_enabled & PEBS_COUNTER_MASK);
>>      int i = 0;
>>
>>      arr[i].msr = MSR_CORE_PERF_GLOBAL_CTRL;
>>      arr[i].host = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask;
>>      arr[i].guest = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_host_mask;
>>      arr[i].guest &= ~pebs_mask;
>>
>>      if (!x86_pmu.pebs)
>>          goto out;
>>
>>      /*
>>       * If PMU counter has PEBS enabled it is not enough to
>>       * disable counter on a guest entry since PEBS memory
>>       * write can overshoot guest entry and corrupt guest
>>       * memory. Disabling PEBS solves the problem.
>>       *
>>       * Don't do this if the CPU already enforces it.
>>       */
>>      if (x86_pmu.pebs_no_isolation) {
>>          i++;
>>          arr[i].msr = MSR_IA32_PEBS_ENABLE;
>>          arr[i].host = cpuc->pebs_enabled;
>>          arr[i].guest = 0;
>>          goto out;
>>      }
>>
>>      if (!pmu || !x86_pmu.pebs_vmx)
>>          goto out;
>>
>>      i++;
>>      arr[i].msr = MSR_IA32_DS_AREA;
>>      arr[i].host = (unsigned long)ds;
>>      arr[i].guest = pmu->ds_area;
>>
>>      if (x86_pmu.intel_cap.pebs_baseline) {
>>          i++;
>>          arr[i].msr = MSR_PEBS_DATA_CFG;
>>          arr[i].host = cpuc->pebs_data_cfg;
>>          arr[i].guest = pmu->pebs_data_cfg;
>>      }
>>
>>      i++;
>>      arr[i].msr = MSR_IA32_PEBS_ENABLE;
>>      arr[i].host = cpuc->pebs_enabled & ~cpuc->intel_ctrl_guest_mask;
>>      arr[i].guest = pebs_mask & ~cpuc->intel_ctrl_host_mask;
>>
>>      if (arr[i].host) {
>>          /* Disable guest PEBS if host PEBS is enabled. */
>>          arr[i].guest = 0;
>>      } else {
>>          /* Disable guest PEBS for cross-mapped PEBS counters. */
>>          arr[i].guest &= ~pmu->host_cross_mapped_mask;
>>          arr[0].guest |= arr[i].guest;
>>      }
>>
>> out:
>>      *nr = ++i;
>>      return arr;
>> }
> The ++ is in a weird location, if you place it after filling out an
> entry it makes more sense I think. Something like:
>
> 	arr[i].msr = MSR_CORE_PERF_GLOBAL_CTRL;
> 	arr[i].host = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask;
> 	arr[i].guest = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_host_mask;
> 	arr[i].guest &= ~pebs_mask;
> 	i++;
>
> or, perhaps even like:
>
> 	arr[i++] = (struct perf_guest_switch_msr){
> 		.msr = MSR_CORE_PERF_GLOBAL_CTRL,
> 		.host = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask,
> 		.guest = x86_pmu.intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
> 	};

The later one looks good to me and I'll apply it.

> But it doesn't address the fundamental confusion I seem to be having,
> what actual msr value is what.

VMX hardware has the capability to switch MSR values atomically:
- for vm-entry instruction, it loads the value of arr[i].guest to arr[i].msr;
- for vm-exit instruction, it loads the value of arr[i].host to arr[i].msr;

The intel_guest_get_msrs() will populate arr[i].guest and arr[i].host values
before each vm-entry and its caller does the optimization to skip the switch
if arr[i].guest == arr[i].host.

Just let me know if you have more questions,
otherwise I assume we have reached an agreement on this part of code.

  reply	other threads:[~2021-04-09  8:30 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-29  5:41 [PATCH v4 00/16] KVM: x86/pmu: Add basic support to enable Guest PEBS via DS Like Xu
2021-03-29  5:41 ` [PATCH v4 01/16] perf/x86/intel: Add x86_pmu.pebs_vmx for Ice Lake Servers Like Xu
2021-04-06  3:24   ` Liuxiangdong (Aven, Cloud Infrastructure Service Product Dept.)
2021-04-06  5:14     ` Xu, Like
2021-04-08  1:40       ` Liuxiangdong (Aven, Cloud Infrastructure Service Product Dept.)
2021-04-09  8:33       ` Liuxiangdong (Aven, Cloud Infrastructure Service Product Dept.)
2021-04-09  8:46         ` Like Xu
2021-04-12 11:26           ` Liuxiangdong (Aven, Cloud Infrastructure Service Product Dept.)
2021-04-12 15:25             ` Andi Kleen
2021-04-14 14:10               ` Liuxiangdong
2021-04-14 14:49           ` Liuxiangdong
2021-04-15  1:38             ` Xu, Like
2021-04-15  2:49               ` Liuxiangdong
2021-04-15  3:23                 ` Like Xu
2021-04-06 12:47     ` Andi Kleen
2021-04-07  3:05       ` Liuxiangdong (Aven, Cloud Infrastructure Service Product Dept.)
2021-04-07 14:32         ` Andi Kleen
2021-03-29  5:41 ` [PATCH v4 02/16] perf/x86/intel: Handle guest PEBS overflow PMI for KVM guest Like Xu
2021-04-06 16:22   ` Peter Zijlstra
2021-04-07  0:47     ` Xu, Like
2021-03-29  5:41 ` [PATCH v4 03/16] perf/x86/core: Pass "struct kvm_pmu *" to determine the guest values Like Xu
2021-03-29  5:41 ` [PATCH v4 04/16] KVM: x86/pmu: Set MSR_IA32_MISC_ENABLE_EMON bit when vPMU is enabled Like Xu
2021-03-29  5:41 ` [PATCH v4 05/16] KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter Like Xu
2021-03-29  5:41 ` [PATCH v4 06/16] KVM: x86/pmu: Reprogram guest PEBS event to emulate guest PEBS counter Like Xu
2021-04-07  8:40   ` Peter Zijlstra
2021-03-29  5:41 ` [PATCH v4 07/16] KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS Like Xu
2021-04-07  8:56   ` Peter Zijlstra
2021-04-07 15:25   ` Peter Zijlstra
2021-03-29  5:41 ` [PATCH v4 08/16] KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to manage guest DS buffer Like Xu
2021-04-07 15:39   ` Peter Zijlstra
2021-04-08  5:39     ` Xu, Like
2021-04-08  7:52       ` Peter Zijlstra
2021-04-08  8:44         ` Xu, Like
2021-04-09  7:07         ` Xu, Like
2021-04-09  7:59           ` Peter Zijlstra
2021-04-09  8:30             ` Xu, Like [this message]
2021-03-29  5:41 ` [PATCH v4 09/16] KVM: x86/pmu: Add PEBS_DATA_CFG MSR emulation to support adaptive PEBS Like Xu
2021-04-07 15:40   ` Peter Zijlstra
2021-03-29  5:41 ` [PATCH v4 10/16] KVM: x86: Set PEBS_UNAVAIL in IA32_MISC_ENABLE when PEBS is enabled Like Xu
2021-03-29  5:41 ` [PATCH v4 11/16] KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter Like Xu
2021-03-29  5:41 ` [PATCH v4 12/16] KVM: x86/pmu: Move pmc_speculative_in_use() to arch/x86/kvm/pmu.h Like Xu
2021-03-29  5:41 ` [PATCH v4 13/16] KVM: x86/pmu: Disable guest PEBS before vm-entry in two cases Like Xu
2021-03-29  5:41 ` [PATCH v4 14/16] KVM: x86/pmu: Add kvm_pmu_cap to optimize perf_get_x86_pmu_capability Like Xu
2021-03-29  5:41 ` [PATCH v4 15/16] KVM: x86/cpuid: Refactor host/guest CPU model consistency check Like Xu
2021-03-29  5:41 ` [PATCH v4 16/16] KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64 Like Xu
2021-04-06  3:19 ` [PATCH v4 00/16] KVM: x86/pmu: Add basic support to enable Guest PEBS via DS Xu, Like

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9ec0e0ba-bef6-710e-1e9c-36beaedae16e@intel.com \
    --to=like.xu@intel.com \
    --cc=ak@linux.intel.com \
    --cc=andi@firstfloor.org \
    --cc=eranian@google.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kan.liang@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=like.xu@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=seanjc@google.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=wei.w.wang@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).