linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Kang, Luwei" <luwei.kang@intel.com>
To: Peter Zijlstra <peterz@infradead.org>,
	"Liang, Kan" <kan.liang@linux.intel.com>
Cc: "x86@kernel.org" <x86@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"acme@kernel.org" <acme@kernel.org>,
	"mark.rutland@arm.com" <mark.rutland@arm.com>,
	"alexander.shishkin@linux.intel.com" 
	<alexander.shishkin@linux.intel.com>,
	"jolsa@redhat.com" <jolsa@redhat.com>,
	"namhyung@kernel.org" <namhyung@kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"bp@alien8.de" <bp@alien8.de>, "hpa@zytor.com" <hpa@zytor.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"Christopherson, Sean J" <sean.j.christopherson@intel.com>,
	"vkuznets@redhat.com" <vkuznets@redhat.com>,
	"wanpengli@tencent.com" <wanpengli@tencent.com>,
	"jmattson@google.com" <jmattson@google.com>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"pawan.kumar.gupta@linux.intel.com" 
	<pawan.kumar.gupta@linux.intel.com>,
	"ak@linux.intel.com" <ak@linux.intel.com>,
	"thomas.lendacky@amd.com" <thomas.lendacky@amd.com>,
	"Yu, Fenghua" <fenghua.yu@intel.com>,
	"like.xu@linux.intel.com" <like.xu@linux.intel.com>,
	"Wang, Wei W" <wei.w.wang@intel.com>
Subject: RE: [PATCH v1 01/11] perf/x86/core: Support KVM to assign a dedicated counter for guest PEBS
Date: Fri, 12 Jun 2020 05:28:49 +0000	[thread overview]
Message-ID: <DM5PR1101MB22667E832B3E9C1EF5389F2280810@DM5PR1101MB2266.namprd11.prod.outlook.com> (raw)
In-Reply-To: <20200309150526.GI12561@hirez.programming.kicks-ass.net>

> > > Suppose your KVM thing claims counter 0/2 (ICL/SKL) for some random
> > > PEBS event, and then the host wants to use PREC_DIST.. Then one of
> > > them will be screwed for no reason what so ever.
> > >
> >
> > The multiplexing should be triggered.
> >
> > For host, if both user A and user B requires PREC_DIST, the
> > multiplexing should be triggered for them.
> > Now, the user B is KVM. I don't think there is difference. The
> > multiplexing should still be triggered. Why it is screwed?
> 
> Becuase if KVM isn't PREC_DIST we should be able to reschedule it to a
> different counter.
> 
> > > How is that not destroying scheduling freedom? Any other situation
> > > we'd have moved the !PREC_DIST PEBS event to another counter.
> > >
> >
> > All counters are equivalent for them. It doesn't matter if we move it
> > to another counter. There is no impact for the user.
> 
> But we cannot move it to another counter, because you're pinning it.

Hi Peter,

To avoid the pinning counters, I have tried to do some evaluation about
patching the PEBS record for guest in KVM. In this approach, about ~30% 
time increased on guest PEBS PMI handler latency (
e.g.perf record -e branch-loads:p -c 1000 ~/Tools/br_instr a).

Some implementation details as below:
1. Patching the guest PEBS records "Applicable Counters" filed when the guest
     required counter is not the same with the host. Because the guest PEBS
     driver will drop these PEBS records if the "Applicable Counters" not the
     same with the required counter index.
2. Traping the guest driver's behavior(VM-exit) of disabling PEBS. 
     It happens before reading PEBS records (e.g. PEBS PMI handler, before
     application exit and so on)
3. To patch the Guest PEBS records in KVM, we need to get the HPA of the
     guest PEBS buffer.
     <1> Trapping the guest write of IA32_DS_AREA register and get the GVA
             of guest DS_AREA.
     <2> Translate the DS AREA GVA to GPA(kvm_mmu_gva_to_gpa_read)
             and get the GVA of guest PEBS buffer from DS AREA
             (kvm_vcpu_read_guest_atomic).
     <3> Although we have got the GVA of PEBS buffer, we need to do the
             address translation(GVA->GPA->HPA) for each page. Because we can't
             assume the GPAs of Guest PEBS buffer are always continuous.
	
But we met another issue about the PEBS counter reset field in DS AREA.
pebs_event_reset in DS area has to be set for auto reload, which is per
counter. Guest and Host may use different counters. Let's say guest wants to
use counter 0, but host assign counter 1 to guest. Guest sets the reset value to
pebs_event_reset[0]. However, since counter 1 is the one which is eventually
scheduled, HW will use  pebs_event_reset[1] as reset value.

We can't copy the value of the guest pebs_event_reset[0] to
pebs_event_reset[1] directly(Patching DS AREA) because the guest driver may
confused, and we can't assume the guest counter 0 and 1 are not used for this
PEBS task at the same time. And what's more, KVM can't aware the guest
read/write to the DS AREA because it just a general memory for guest.

What is your opinion or do you have a better proposal?

Thanks,
Luwei Kang

> 
> > In the new proposal, KVM user is treated the same as other host events
> > with event constraint. The scheduler is free to choose whether or not
> > to assign a counter for it.
> 
> That's what it does, I understand that. I'm saying that that is creating artificial
> contention.
> 
> 
> Why is this needed anyway? Can't we force the guest to flush and then move it
> over to a new counter?

  parent reply	other threads:[~2020-06-12  5:29 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-05 17:56 [PATCH v1 00/11] PEBS virtualization enabling via DS Luwei Kang
2020-03-05 16:51 ` Paolo Bonzini
2020-03-05 17:56 ` [PATCH v1 01/11] perf/x86/core: Support KVM to assign a dedicated counter for guest PEBS Luwei Kang
2020-03-06 13:53   ` Peter Zijlstra
2020-03-06 14:42     ` Liang, Kan
2020-03-09 10:04       ` Peter Zijlstra
2020-03-09 13:12         ` Liang, Kan
2020-03-09 15:05           ` Peter Zijlstra
2020-03-09 19:28             ` Liang, Kan
2020-03-12 10:28               ` Kang, Luwei
2020-03-26 14:03               ` Liang, Kan
2020-04-07 12:34                 ` Kang, Luwei
2020-06-12  5:28             ` Kang, Luwei [this message]
2020-06-19  9:30               ` Kang, Luwei
2020-08-20  3:32               ` Like Xu
2020-03-09 15:44         ` Andi Kleen
2020-03-05 17:56 ` [PATCH v1 02/11] perf/x86/ds: Handle guest PEBS events overflow and inject fake PMI Luwei Kang
2020-03-05 17:56 ` [PATCH v1 03/11] perf/x86: Expose a function to disable auto-reload Luwei Kang
2020-03-05 17:56 ` [PATCH v1 04/11] KVM: x86/pmu: Decouple event enablement from event creation Luwei Kang
2020-03-05 17:56 ` [PATCH v1 05/11] KVM: x86/pmu: Add support to reprogram PEBS event for guest counters Luwei Kang
2020-03-06 16:28   ` kbuild test robot
2020-03-09  0:58     ` Xu, Like
2020-03-05 17:57 ` [PATCH v1 06/11] KVM: x86/pmu: Implement is_pebs_via_ds_supported pmu ops Luwei Kang
2020-03-05 17:57 ` [PATCH v1 07/11] KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64 Luwei Kang
2020-03-05 17:57 ` [PATCH v1 08/11] KVM: x86/pmu: PEBS MSRs emulation Luwei Kang
2020-03-05 17:57 ` [PATCH v1 09/11] KVM: x86/pmu: Expose PEBS feature to guest Luwei Kang
2020-03-05 17:57 ` [PATCH v1 10/11] KVM: x86/pmu: Introduce the mask value for fixed counter Luwei Kang
2020-03-05 17:57 ` [PATCH v1 11/11] KVM: x86/pmu: Adaptive PEBS virtualization enabling Luwei Kang
2020-03-05 22:48 ` [PATCH v1 00/11] PEBS virtualization enabling via DS Andi Kleen
2020-03-06  5:37   ` Kang, Luwei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DM5PR1101MB22667E832B3E9C1EF5389F2280810@DM5PR1101MB2266.namprd11.prod.outlook.com \
    --to=luwei.kang@intel.com \
    --cc=acme@kernel.org \
    --cc=ak@linux.intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=bp@alien8.de \
    --cc=fenghua.yu@intel.com \
    --cc=hpa@zytor.com \
    --cc=jmattson@google.com \
    --cc=jolsa@redhat.com \
    --cc=joro@8bytes.org \
    --cc=kan.liang@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=like.xu@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=pawan.kumar.gupta@linux.intel.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=sean.j.christopherson@intel.com \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=wei.w.wang@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).