From: "Jain, Jinank" <jinankj@amazon.de>
To: "maz@kernel.org" <maz@kernel.org>
Cc: "james.morse@arm.com" <james.morse@arm.com>,
	"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
	"suzuki.poulose@arm.com" <suzuki.poulose@arm.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"alexandru.elisei@arm.com" <alexandru.elisei@arm.com>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"will@kernel.org" <will@kernel.org>,
	"catalin.marinas@arm.com" <catalin.marinas@arm.com>, "Graf (AWS),
	Alexander" <graf@amazon.de>
Subject: Re: [PATCH] KVM: arm64: Properly restore PMU state during live-migration
Date: Tue, 8 Jun 2021 08:24:29 +0000	[thread overview]
Message-ID: <b53dfcf9bbc4db7f96154b1cd5188d72b9766358.camel@amazon.de> (raw)
In-Reply-To: <87eedczs49.wl-maz@kernel.org>

On Tue, 2021-06-08 at 09:18 +0100, Marc Zyngier wrote:
> On Mon, 07 Jun 2021 19:34:08 +0100,
> "Jain, Jinank" <jinankj@amazon.de> wrote:
> > Hi Marc.
> > 
> > On Mon, 2021-06-07 at 17:35 +0100, Marc Zyngier wrote:
> > > On Mon, 07 Jun 2021 17:05:01 +0100,
> > > "Jain, Jinank" <jinankj@amazon.de> wrote:
> > > > On Thu, 2021-06-03 at 17:03 +0100, Marc Zyngier wrote:
> > > > > Hi Jinank,
> > > > > 
> > > > > On Thu, 03 Jun 2021 12:05:54 +0100,
> > > > > Jinank Jain <jinankj@amazon.de> wrote:
> > > > > > Currently if a guest is live-migrated while it is actively using
> > > > > > perf counters, then after live-migration it will notice that all
> > > > > > counters suddenly start reporting 0s. This is because we are not
> > > > > > re-creating the relevant perf events inside the kernel.
> > > > > > 
> > > > > > Usually on live-migration guest state is restored using the
> > > > > > KVM_SET_ONE_REG ioctl interface, which simply restores the values
> > > > > > of the PMU registers but does not re-program the perf events, so
> > > > > > the guest cannot seamlessly keep using these counters after
> > > > > > live-migration the way it did before.
> > > > > > 
> > > > > > Instead, there are two completely different code paths between the
> > > > > > guest accessing PMU registers and the VMM restoring counters on
> > > > > > live-migration.
> > > > > > 
> > > > > > In case of KVM_SET_ONE_REG:
> > > > > > 
> > > > > > kvm_arm_set_reg()
> > > > > > ...... kvm_arm_sys_reg_set_reg()
> > > > > > ........... reg_from_user()
> > > > > > 
> > > > > > but when the guest tries to access these counters:
> > > > > > 
> > > > > > handle_exit()
> > > > > > ..... kvm_handle_sys_reg()
> > > > > > ..........perform_access()
> > > > > > ...............access_pmu_evcntr()
> > > > > > ...................kvm_pmu_set_counter_value()
> > > > > > .......................kvm_pmu_create_perf_event()
> > > > > > 
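[For context, this is roughly what the "passive" VMM-side restore path looks
like: a minimal userspace sketch, not taken from this patch or from any
particular VMM, and the helper and macro names are made up for illustration.]

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>	/* struct kvm_one_reg, KVM_SET_ONE_REG, ARM64_SYS_REG() */

/* PMCR_EL0 is encoded as op0=3, op1=3, CRn=9, CRm=12, op2=0. */
#define PMCR_EL0_REG_ID		ARM64_SYS_REG(3, 3, 9, 12, 0)

/* Illustrative helper: write the saved PMCR_EL0 back into a vcpu. */
static int restore_pmcr_el0(int vcpu_fd, uint64_t saved_pmcr)
{
	struct kvm_one_reg reg = {
		.id   = PMCR_EL0_REG_ID,
		.addr = (uintptr_t)&saved_pmcr,
	};

	/*
	 * This only stores the register value; nothing on this path
	 * re-creates the backing host perf events.
	 */
	return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
}
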
> > > > > > The drawback of using the KVM_SET_ONE_REG interface is that the
> > > > > > host pmu events which were registered for the source instance and
> > > > > > not present for the destination instance.
> > > > > 
> > > > > I can't parse this sentence. Do you mean "are not present"?
> > > > > 
> > > > > > Thus passively restoring PMCR_EL0 using the KVM_SET_ONE_REG
> > > > > > interface would not create the necessary host pmu events, which
> > > > > > are crucial for a seamless guest experience across live migration.
> > > > > > 
> > > > > > In order to fix the situation, on first vcpu load we should
> > > > > > restore PMCR_EL0 in the same exact way as the guest was trying to
> > > > > > access these counters. And then we will also recreate the relevant
> > > > > > host pmu events.
> > > > > > 
> > > > > > Signed-off-by: Jinank Jain <jinankj@amazon.de>
> > > > > > Cc: Alexander Graf (AWS) <graf@amazon.de>
> > > > > > Cc: Marc Zyngier <maz@kernel.org>
> > > > > > Cc: James Morse <james.morse@arm.com>
> > > > > > Cc: Alexandru Elisei <alexandru.elisei@arm.com>
> > > > > > Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> > > > > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > > > > Cc: Will Deacon <will@kernel.org>
> > > > > > ---
> > > > > >  arch/arm64/include/asm/kvm_host.h |  1 +
> > > > > >  arch/arm64/kvm/arm.c              |  1 +
> > > > > >  arch/arm64/kvm/pmu-emul.c         | 10 ++++++++--
> > > > > >  arch/arm64/kvm/pmu.c              | 15 +++++++++++++++
> > > > > >  include/kvm/arm_pmu.h             |  3 +++
> > > > > >  5 files changed, 28 insertions(+), 2 deletions(-)
> > > > > > 
> > > > > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > > > > > index 7cd7d5c8c4bc..2376ad3c2fc2 100644
> > > > > > --- a/arch/arm64/include/asm/kvm_host.h
> > > > > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > > > > @@ -745,6 +745,7 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
> > > > > >  void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
> > > > > >  void kvm_clr_pmu_events(u32 clr);
> > > > > > 
> > > > > > +void kvm_vcpu_pmu_restore(struct kvm_vcpu *vcpu);
> > > > > >  void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu);
> > > > > >  void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu);
> > > > > >  #else
> > > > > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > > > > > index e720148232a0..c66f6d16ec06 100644
> > > > > > --- a/arch/arm64/kvm/arm.c
> > > > > > +++ b/arch/arm64/kvm/arm.c
> > > > > > @@ -408,6 +408,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> > > > > >       if (has_vhe())
> > > > > >               kvm_vcpu_load_sysregs_vhe(vcpu);
> > > > > >       kvm_arch_vcpu_load_fp(vcpu);
> > > > > > +     kvm_vcpu_pmu_restore(vcpu);
> > > > > 
> > > > > If this only needs to be run once per vcpu, why not trigger it
> > > > > from kvm_arm_pmu_v3_enable(), which is also called once per vcpu?
> > > > > 
> > > > > This can be done on the back of a request, saving most of the
> > > > > overhead and not requiring any extra field. Essentially, something
> > > > > like the (untested) patch below.
> > > > > 
> > > > > >       kvm_vcpu_pmu_restore_guest(vcpu);
> > > > > >       if (kvm_arm_is_pvtime_enabled(&vcpu->arch))
> > > > > >               kvm_make_request(KVM_REQ_RECORD_STEAL, vcpu);
> > > > > > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> > > > > > index fd167d4f4215..12a40f4b5f0d 100644
> > > > > > --- a/arch/arm64/kvm/pmu-emul.c
> > > > > > +++ b/arch/arm64/kvm/pmu-emul.c
> > > > > > @@ -574,10 +574,16 @@ void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val)
> > > > > >               kvm_pmu_disable_counter_mask(vcpu, mask);
> > > > > >       }
> > > > > > 
> > > > > > -     if (val & ARMV8_PMU_PMCR_C)
> > > > > > +     /*
> > > > > > +      * Cycle counter needs to reset in case of first vcpu load.
> > > > > > +      */
> > > > > > +     if (val & ARMV8_PMU_PMCR_C || !kvm_arm_pmu_v3_restored(vcpu))
> > > > > 
> > > > > Why? There is no architectural guarantee that a counter resets to
> > > > > 0 without writing PMCR_EL0.C. And if you want the guest to continue
> > > > > counting where it left off, resetting the counter is at best
> > > > > counter-productive.
> > > > 
> > > > Without this we would not be resetting the PMU, which is required for
> > > > creating host perf events. With the patch that you suggested we are
> > > > restoring PMCR_EL0 properly but still missing recreation of host perf
> > > > events.
> > > 
> > > How? The request that gets set on the first vcpu run will call
> > > kvm_pmu_handle_pmcr() -> kvm_pmu_enable_counter_mask() ->
> > > kvm_pmu_create_perf_event(). What are we missing?
> > > 
> > 
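[For reference, a minimal sketch of the request-driven reload Marc describes
above; his actual (untested) patch is not quoted in full here, so the request
name and exact hook points below are assumptions rather than the real diff.]

/* Assumed request number/name; the real patch may differ. */
#define KVM_REQ_RELOAD_PMU	KVM_ARCH_REQ(5)

/* arch/arm64/kvm/pmu-emul.c: queue a reload once the PMU is finalised. */
int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu)
{
	/* ... existing PMU sanity checks elided ... */

	/* Replay the saved PMCR_EL0 before the vcpu first runs. */
	kvm_make_request(KVM_REQ_RELOAD_PMU, vcpu);
	return 0;
}

/* arch/arm64/kvm/arm.c: service the request on the way into the guest. */
static void check_vcpu_requests(struct kvm_vcpu *vcpu)
{
	if (kvm_request_pending(vcpu)) {
		/* ... other requests elided ... */

		if (kvm_check_request(KVM_REQ_RELOAD_PMU, vcpu))
			kvm_pmu_handle_pmcr(vcpu,
					    __vcpu_sys_reg(vcpu, PMCR_EL0));
	}
}

Replaying PMCR_EL0 through kvm_pmu_handle_pmcr() is what walks the
kvm_pmu_enable_counter_mask() -> kvm_pmu_create_perf_event() chain quoted
above, so the host perf events get re-created after the registers have been
restored via KVM_SET_ONE_REG.
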
> > I found out what I was missing. I was working with an older kernel
> > which was missing this upstream patch:
> > 
> > https://lore.kernel.org/lkml/20200124142535.29386-3-eric.auger@redhat.com/
> 
> :-(
> 
> Please test whatever you send with an upstream kernel. Actually,
> please *develop* on an upstream kernel. This will avoid this kind of
> discussion where we talk past each other, and make it plain that your
> production kernel is lacking all sorts of fixes.
> 
> Now, can you please state whether or not this patch fixes it for you
> *on an upstream kernel*? I have no interest in results from a
> production kernel.
> 
>         M.
> 

Really sorry for the noise. I can confirm that your suggested patch fixes
the problem on an upstream kernel: if I live-migrate a guest that is
actively using perf events, the guest can continue using them after live
migration without interruption.

> --
> Without deviation from the norm, progress is not possible.



Thread overview:

2021-06-03 11:05 [PATCH] KVM: arm64: Properly restore PMU state during live-migration Jinank Jain
2021-06-03 15:20 ` kernel test robot
2021-06-03 16:03 ` Marc Zyngier
2021-06-07 16:05   ` Jain, Jinank
2021-06-07 16:35     ` Marc Zyngier
2021-06-07 18:34       ` Jain, Jinank
2021-06-08  8:18         ` Marc Zyngier
2021-06-08  8:24           ` Jain, Jinank [this message]
2021-06-07 18:58 ` [PATCH v2] " Jinank Jain
