From: "Jain, Jinank" <jinankj@amazon.de>
To: "maz@kernel.org" <maz@kernel.org>
Cc: "james.morse@arm.com" <james.morse@arm.com>,
"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
"suzuki.poulose@arm.com" <suzuki.poulose@arm.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"alexandru.elisei@arm.com" <alexandru.elisei@arm.com>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>,
"will@kernel.org" <will@kernel.org>,
"catalin.marinas@arm.com" <catalin.marinas@arm.com>, "Graf (AWS),
Alexander" <graf@amazon.de>
Subject: Re: [PATCH] KVM: arm64: Properly restore PMU state during live-migration
Date: Tue, 8 Jun 2021 08:24:29 +0000
Message-ID: <b53dfcf9bbc4db7f96154b1cd5188d72b9766358.camel@amazon.de>
In-Reply-To: <87eedczs49.wl-maz@kernel.org>
On Tue, 2021-06-08 at 09:18 +0100, Marc Zyngier wrote:
> On Mon, 07 Jun 2021 19:34:08 +0100,
> "Jain, Jinank" <jinankj@amazon.de> wrote:
> > Hi Marc.
> >
> > On Mon, 2021-06-07 at 17:35 +0100, Marc Zyngier wrote:
> > > On Mon, 07 Jun 2021 17:05:01 +0100,
> > > "Jain, Jinank" <jinankj@amazon.de> wrote:
> > > > On Thu, 2021-06-03 at 17:03 +0100, Marc Zyngier wrote:
> > > > > Hi Jinank,
> > > > >
> > > > > On Thu, 03 Jun 2021 12:05:54 +0100,
> > > > > Jinank Jain <jinankj@amazon.de> wrote:
> > > > > > > Currently if a guest is live-migrated while it is actively using perf
> > > > > > > counters, then after live-migrate it will notice that all counters would
> > > > > > > suddenly start reporting 0s. This is due to the fact we are not
> > > > > > > re-creating the relevant perf events inside the kernel.
> > > > > >
> > > > > > > Usually on live-migration guest state is restored using the
> > > > > > > KVM_SET_ONE_REG ioctl interface, which simply restores the values of
> > > > > > > the PMU registers but does not re-program the perf events so that the
> > > > > > > guest can seamlessly use these counters even after live-migration like
> > > > > > > it was doing before live-migration.
> > > > > >
> > > > > > > Instead there are two completely different code paths between guest
> > > > > > > accessing PMU registers and VMM restoring counters on live-migration.
> > > > > >
> > > > > > In case of KVM_SET_ONE_REG:
> > > > > >
> > > > > > kvm_arm_set_reg()
> > > > > > ...... kvm_arm_sys_reg_set_reg()
> > > > > > ........... reg_from_user()
> > > > > >
> > > > > > but in case when guest tries to access these counters:
> > > > > >
> > > > > > handle_exit()
> > > > > > ..... kvm_handle_sys_reg()
> > > > > > ..........perform_access()
> > > > > > ...............access_pmu_evcntr()
> > > > > > ...................kvm_pmu_set_counter_value()
> > > > > > .......................kvm_pmu_create_perf_event()
> > > > > >
> > > > > > > The drawback of using the KVM_SET_ONE_REG interface is that the host pmu
> > > > > > > events which were registered for the source instance and not present for
> > > > > > > the destination instance.
> > > > >
> > > > > I can't parse this sentence. Do you mean "are not present"?
> > > > >
> > > > > > > Thus passively restoring PMCR_EL0 using KVM_SET_ONE_REG interface would
> > > > > > > not create the necessary host pmu events which are crucial for seamless
> > > > > > > guest experience across live migration.
> > > > > >
> > > > > > > In order to fix the situation, on first vcpu load we should restore
> > > > > > > PMCR_EL0 in exactly the same way as if the guest were accessing these
> > > > > > > counters. And then we will also recreate the relevant host pmu events.
> > > > > >
> > > > > > Signed-off-by: Jinank Jain <jinankj@amazon.de>
> > > > > > Cc: Alexander Graf (AWS) <graf@amazon.de>
> > > > > > Cc: Marc Zyngier <maz@kernel.org>
> > > > > > Cc: James Morse <james.morse@arm.com>
> > > > > > Cc: Alexandru Elisei <alexandru.elisei@arm.com>
> > > > > > Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> > > > > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > > > > Cc: Will Deacon <will@kernel.org>
> > > > > > ---
> > > > > > arch/arm64/include/asm/kvm_host.h | 1 +
> > > > > > arch/arm64/kvm/arm.c | 1 +
> > > > > > arch/arm64/kvm/pmu-emul.c | 10 ++++++++--
> > > > > > arch/arm64/kvm/pmu.c | 15 +++++++++++++++
> > > > > > include/kvm/arm_pmu.h | 3 +++
> > > > > > 5 files changed, 28 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > > > > > > index 7cd7d5c8c4bc..2376ad3c2fc2 100644
> > > > > > > --- a/arch/arm64/include/asm/kvm_host.h
> > > > > > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > > > > > @@ -745,6 +745,7 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
> > > > > > >  void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
> > > > > > >  void kvm_clr_pmu_events(u32 clr);
> > > > > > >
> > > > > > > +void kvm_vcpu_pmu_restore(struct kvm_vcpu *vcpu);
> > > > > > >  void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu);
> > > > > > >  void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu);
> > > > > > >  #else
> > > > > > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > > > > > > index e720148232a0..c66f6d16ec06 100644
> > > > > > > --- a/arch/arm64/kvm/arm.c
> > > > > > > +++ b/arch/arm64/kvm/arm.c
> > > > > > > @@ -408,6 +408,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> > > > > > >  	if (has_vhe())
> > > > > > >  		kvm_vcpu_load_sysregs_vhe(vcpu);
> > > > > > >  	kvm_arch_vcpu_load_fp(vcpu);
> > > > > > > +	kvm_vcpu_pmu_restore(vcpu);
> > > > >
> > > > > > If this only needs to be run once per vcpu, why not trigger it from
> > > > > > kvm_arm_pmu_v3_enable(), which is also called once per vcpu?
> > > > > >
> > > > > > This can be done on the back of a request, saving most of the overhead
> > > > > > and not requiring any extra field. Essentially, something like the
> > > > > > (untested) patch below.
> > > > >
> > > > > > >  	kvm_vcpu_pmu_restore_guest(vcpu);
> > > > > > >  	if (kvm_arm_is_pvtime_enabled(&vcpu->arch))
> > > > > > >  		kvm_make_request(KVM_REQ_RECORD_STEAL, vcpu);
> > > > > > > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> > > > > > > index fd167d4f4215..12a40f4b5f0d 100644
> > > > > > > --- a/arch/arm64/kvm/pmu-emul.c
> > > > > > > +++ b/arch/arm64/kvm/pmu-emul.c
> > > > > > > @@ -574,10 +574,16 @@ void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val)
> > > > > > >  		kvm_pmu_disable_counter_mask(vcpu, mask);
> > > > > > >  	}
> > > > > > >
> > > > > > > -	if (val & ARMV8_PMU_PMCR_C)
> > > > > > > +	/*
> > > > > > > +	 * Cycle counter needs to reset in case of first vcpu load.
> > > > > > > +	 */
> > > > > > > +	if (val & ARMV8_PMU_PMCR_C || !kvm_arm_pmu_v3_restored(vcpu))
> > > > >
> > > > > > Why? There is no architectural guarantee that a counter resets to 0
> > > > > > without writing PMCR_EL0.C. And if you want the guest to continue
> > > > > > counting where it left off, resetting the counter is at best
> > > > > > counter-productive.
> > > >
> > > > > Without this we would not be resetting PMU which is required for
> > > > > creating host perf events. With the patch that you suggested we are
> > > > > restoring PMCR_EL0 properly but still missing recreation of host perf
> > > > > events.
> > >
> > > How? The request that gets set on the first vcpu run will call
> > > kvm_pmu_handle_pmcr() -> kvm_pmu_enable_counter_mask() ->
> > > kvm_pmu_create_perf_event(). What are we missing?
> > >
> >
> > I found out what I was missing. I was working with an older kernel
> > which was missing this upstream patch:
> >
> > https://lore.kernel.org/lkml/20200124142535.29386-3-eric.auger@redhat.com/
>
> :-(
>
> Please test whatever you send with an upstream kernel. Actually,
> please *develop* on an upstream kernel. This will avoid this kind of
> discussion where we talk past each other, and make it plain that your
> production kernel is lacking all sorts of fixes.
>
> Now, can you please state whether or not this patch fixes it for you
> *on an upstream kernel*? I have no interest in results from a
> production kernel.
>
> M.
>
Really sorry for the noise. I can confirm that your suggested patch
fixes the problem on an upstream kernel: if I live-migrate a guest that
is actively using perf events, the guest can continue using them after
live migration without interruption.
> --
> Without deviation from the norm, progress is not possible.
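
For readers following the thread, the request-based approach discussed
above can be sketched as a small userspace model. This is not the
(untested) patch Marc refers to, and it is not kernel code: every toy_*
name below is hypothetical and exists only for illustration. The sketch
assumes that a passive register restore (standing in for KVM_SET_ONE_REG)
only stores values, that a flag set once at PMU enable time stands in for
a vcpu request, and that replaying the saved PMCR_EL0 through the write
handler on the first run is what creates the backing host events, in the
spirit of kvm_pmu_handle_pmcr() -> kvm_pmu_enable_counter_mask() ->
kvm_pmu_create_perf_event(). The cycle counter is only cleared when the C
bit is actually written, so a restored count survives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_COUNTERS 4
#define TOY_PMCR_E (1u << 0) /* PMCR_EL0.E: global counter enable */
#define TOY_PMCR_C (1u << 2) /* PMCR_EL0.C: cycle counter reset */

/* Hypothetical, heavily simplified stand-in for per-vcpu PMU state. */
struct toy_vcpu {
	uint64_t pmcr;                     /* saved PMCR_EL0 value */
	uint64_t cntenset;                 /* saved counter-enable bitmask */
	uint64_t counter[TOY_NR_COUNTERS]; /* counter[0] plays the cycle counter */
	bool host_event[TOY_NR_COUNTERS];  /* does a backing host event exist? */
	bool reload_pending;               /* stands in for a pending vcpu request */
};

/* Passive restore path: values are stored, nothing is re-programmed. */
static void toy_set_regs(struct toy_vcpu *v, uint64_t pmcr,
			 uint64_t cntenset, uint64_t cycles)
{
	v->pmcr = pmcr;
	v->cntenset = cntenset;
	v->counter[0] = cycles;
}

/* Trap-style PMCR write handler: this is where the side effects live. */
static void toy_handle_pmcr(struct toy_vcpu *v, uint64_t val)
{
	if (val & TOY_PMCR_E) {
		/* Create a backing event for every enabled counter. */
		for (int i = 0; i < TOY_NR_COUNTERS; i++)
			if (v->cntenset & (1ull << i))
				v->host_event[i] = true;
	}
	/* Reset the cycle counter only when C is explicitly written. */
	if (val & TOY_PMCR_C)
		v->counter[0] = 0;
	v->pmcr = val & ~(uint64_t)TOY_PMCR_C;
}

/* Called once per vcpu: queue a reload instead of adding a new field check. */
static void toy_pmu_enable(struct toy_vcpu *v)
{
	v->reload_pending = true;
}

/* Before entering the guest, service the pending reload exactly once. */
static void toy_vcpu_run(struct toy_vcpu *v)
{
	if (v->reload_pending) {
		v->reload_pending = false;
		toy_handle_pmcr(v, v->pmcr);
	}
}

static int toy_host_events(const struct toy_vcpu *v)
{
	int n = 0;

	for (int i = 0; i < TOY_NR_COUNTERS; i++)
		n += v->host_event[i];
	return n;
}

/* Walk through a restore on the destination; returns the event count. */
static int toy_demo(void)
{
	struct toy_vcpu v = {0};

	toy_set_regs(&v, TOY_PMCR_E, 0x3, 1234); /* passive restore */
	assert(toy_host_events(&v) == 0);        /* no host events yet */

	toy_pmu_enable(&v);                      /* queue the reload */
	toy_vcpu_run(&v);                        /* first run replays PMCR */
	assert(v.counter[0] == 1234);            /* count is not reset */
	return toy_host_events(&v);
}
```

Calling toy_demo() from a main() exercises the whole sequence. The point
of the model is only that the restore reuses the same handler as a guest
write, rather than special-casing the first vcpu load.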