Re: [PATCH] KVM: arm64: Properly restore PMU state during live-migration

From: Marc Zyngier <maz@kernel.org>
To: "Jain, Jinank" <jinankj@amazon.de>
Cc: "james.morse@arm.com" <james.morse@arm.com>,
	"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
	"suzuki.poulose@arm.com" <suzuki.poulose@arm.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"will@kernel.org" <will@kernel.org>,
	"alexandru.elisei@arm.com" <alexandru.elisei@arm.com>,
	"Graf (AWS),\ Alexander" <graf@amazon.de>
Subject: Re: [PATCH] KVM: arm64: Properly restore PMU state during live-migration
Date: Mon, 07 Jun 2021 17:35:15 +0100
Message-ID: <87lf7lzl8c.wl-maz@kernel.org>
In-Reply-To: <0a694ea93303bfa04530cd940f692244e1ccd1e7.camel@amazon.de>

On Mon, 07 Jun 2021 17:05:01 +0100,
"Jain, Jinank" <jinankj@amazon.de> wrote:
> 
> On Thu, 2021-06-03 at 17:03 +0100, Marc Zyngier wrote:
> >
> > Hi Jinank,
> > 
> > On Thu, 03 Jun 2021 12:05:54 +0100,
> > Jinank Jain <jinankj@amazon.de> wrote:
> > > > Currently, if a guest is live-migrated while it is actively using
> > > > perf counters, then after the migration it will notice that all
> > > > counters suddenly start reporting 0s. This is because we are not
> > > > re-creating the relevant perf events inside the kernel.
> > > 
> > > > Usually on live-migration, guest state is restored using the
> > > > KVM_SET_ONE_REG ioctl interface, which simply restores the values
> > > > of the PMU registers but does not re-program the perf events that
> > > > the guest needs in order to keep using these counters seamlessly
> > > > after live-migration, as it was doing before.
> > > 
> > > > Instead there are two completely different code paths between guest
> > > accessing PMU registers and VMM restoring counters on
> > > live-migration.
> > > 
> > > In case of KVM_SET_ONE_REG:
> > > 
> > > kvm_arm_set_reg()
> > > ...... kvm_arm_sys_reg_set_reg()
> > > ........... reg_from_user()
> > > 
> > > but in case when guest tries to access these counters:
> > > 
> > > handle_exit()
> > > ..... kvm_handle_sys_reg()
> > > ..........perform_access()
> > > ...............access_pmu_evcntr()
> > > ...................kvm_pmu_set_counter_value()
> > > .......................kvm_pmu_create_perf_event()
> > > 
> > > The drawback of using the KVM_SET_ONE_REG interface is that the
> > > host pmu
> > > events which were registered for the source instance and not
> > > present for
> > > the destination instance.
> > 
> > I can't parse this sentence. Do you mean "are not present"?
> > 
> > > Thus passively restoring PMCR_EL0 using
> > > KVM_SET_ONE_REG interface would not create the necessary host pmu
> > > events
> > > which are crucial for seamless guest experience across live
> > > migration.
> > > 
> > > > In order to fix the situation, on first vcpu load we should restore
> > > PMCR_EL0 in the same exact way like the guest was trying to access
> > > these counters. And then we will also recreate the relevant host
> > > pmu
> > > events.
> > > 
> > > Signed-off-by: Jinank Jain <jinankj@amazon.de>
> > > Cc: Alexander Graf (AWS) <graf@amazon.de>
> > > Cc: Marc Zyngier <maz@kernel.org>
> > > Cc: James Morse <james.morse@arm.com>
> > > Cc: Alexandru Elisei <alexandru.elisei@arm.com>
> > > Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > Cc: Will Deacon <will@kernel.org>
> > > ---
> > >  arch/arm64/include/asm/kvm_host.h |  1 +
> > >  arch/arm64/kvm/arm.c              |  1 +
> > >  arch/arm64/kvm/pmu-emul.c         | 10 ++++++++--
> > >  arch/arm64/kvm/pmu.c              | 15 +++++++++++++++
> > >  include/kvm/arm_pmu.h             |  3 +++
> > >  5 files changed, 28 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/arm64/include/asm/kvm_host.h
> > > b/arch/arm64/include/asm/kvm_host.h
> > > index 7cd7d5c8c4bc..2376ad3c2fc2 100644
> > > --- a/arch/arm64/include/asm/kvm_host.h
> > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > @@ -745,6 +745,7 @@ static inline int
> > > kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
> > >  void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
> > >  void kvm_clr_pmu_events(u32 clr);
> > > 
> > > +void kvm_vcpu_pmu_restore(struct kvm_vcpu *vcpu);
> > >  void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu);
> > >  void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu);
> > >  #else
> > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > > index e720148232a0..c66f6d16ec06 100644
> > > --- a/arch/arm64/kvm/arm.c
> > > +++ b/arch/arm64/kvm/arm.c
> > > @@ -408,6 +408,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu,
> > > int cpu)
> > >       if (has_vhe())
> > >               kvm_vcpu_load_sysregs_vhe(vcpu);
> > >       kvm_arch_vcpu_load_fp(vcpu);
> > > +     kvm_vcpu_pmu_restore(vcpu);
> > 
> > If this only needs to be run once per vcpu, why not trigger it from
> > kvm_arm_pmu_v3_enable(), which is also called once per vcpu?
> > 
> > > This can be done on the back of a request, saving most of the overhead
> > and not requiring any extra field. Essentially, something like the
> > (untested) patch below.
> > 
> > >       kvm_vcpu_pmu_restore_guest(vcpu);
> > >       if (kvm_arm_is_pvtime_enabled(&vcpu->arch))
> > >               kvm_make_request(KVM_REQ_RECORD_STEAL, vcpu);
> > > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> > > index fd167d4f4215..12a40f4b5f0d 100644
> > > --- a/arch/arm64/kvm/pmu-emul.c
> > > +++ b/arch/arm64/kvm/pmu-emul.c
> > > @@ -574,10 +574,16 @@ void kvm_pmu_handle_pmcr(struct kvm_vcpu
> > > *vcpu, u64 val)
> > >               kvm_pmu_disable_counter_mask(vcpu, mask);
> > >       }
> > > 
> > > -     if (val & ARMV8_PMU_PMCR_C)
> > > +     /*
> > > +      * Cycle counter needs to reset in case of first vcpu load.
> > > +      */
> > > +     if (val & ARMV8_PMU_PMCR_C || !kvm_arm_pmu_v3_restored(vcpu))
> > 
> > Why? There is no architectural guarantee that a counter resets to 0
> > without writing PMCR_EL0.C. And if you want the guest to continue
> > counting where it left off, resetting the counter is at best
> > counter-productive.
> 
> Without this we would not be resetting PMU which is required for
> creating host perf events. With the patch that you suggested we are
> restoring PMCR_EL0 properly but still missing recreation of host perf
> events.

How? The request that gets set on the first vcpu run will call
kvm_pmu_handle_pmcr() -> kvm_pmu_enable_counter_mask() ->
kvm_pmu_create_perf_event(). What are we missing?
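
A toy user-space model of that chain may make the question concrete. This
is only a sketch: every toy_* name, NR_COUNTERS and CYCLE_IDX are
hypothetical stand-ins rather than kernel code, and a bool flag stands in
for a host perf event. It assumes the request is raised once per vcpu and
serviced on the first run, and that only an explicit PMCR_EL0.C write
zeroes the cycle counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: all toy_* names and these sizes are hypothetical. */
#define NR_COUNTERS 4
#define CYCLE_IDX   (NR_COUNTERS - 1)  /* slot used here for the cycle counter */
#define PMCR_E      (1u << 0)          /* PMCR_EL0.E, global enable */
#define PMCR_C      (1u << 2)          /* PMCR_EL0.C, cycle counter reset */

struct toy_vcpu {
	uint32_t pmcr;                  /* models PMCR_EL0 */
	uint32_t cnten;                 /* models PMCNTENSET_EL0 */
	uint64_t counter[NR_COUNTERS];  /* models the saved counter values */
	bool has_event[NR_COUNTERS];    /* stands in for a host perf event */
	bool reload_pending;            /* stands in for a vcpu request */
};

/* Models the KVM_SET_ONE_REG path: values are stored, no events are created. */
void toy_restore_regs(struct toy_vcpu *v, uint32_t pmcr, uint32_t cnten,
		      const uint64_t *saved)
{
	assert(v && saved);
	v->pmcr = pmcr;
	v->cnten = cnten;
	for (int i = 0; i < NR_COUNTERS; i++)
		v->counter[i] = saved[i];
}

/* Models kvm_pmu_enable_counter_mask() -> kvm_pmu_create_perf_event(). */
void toy_enable_counter_mask(struct toy_vcpu *v, uint32_t mask)
{
	for (int i = 0; i < NR_COUNTERS; i++) {
		if ((mask & (1u << i)) && !v->has_event[i])
			v->has_event[i] = true;
	}
}

/* Models kvm_pmu_handle_pmcr(): counter values are left alone unless C is set. */
void toy_handle_pmcr(struct toy_vcpu *v, uint32_t val)
{
	if (val & PMCR_E) {
		toy_enable_counter_mask(v, v->cnten);
	} else {
		for (int i = 0; i < NR_COUNTERS; i++)
			v->has_event[i] = false;
	}
	if (val & PMCR_C)
		v->counter[CYCLE_IDX] = 0;
}

/* Models the once-per-vcpu enable hook raising the request. */
void toy_pmu_enable(struct toy_vcpu *v)
{
	v->reload_pending = true;
}

/* Models the request being serviced on the first vcpu run. */
void toy_vcpu_run(struct toy_vcpu *v)
{
	if (v->reload_pending) {
		v->reload_pending = false;
		toy_handle_pmcr(v, v->pmcr);
	}
}
```

In this model, calling toy_restore_regs(), then toy_pmu_enable(), then
toy_vcpu_run() leaves every counter enabled in cnten with an event while
the restored counter values stay untouched.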

> And without host perf events, guest would still see zeros after live
> migration. In my opinion we have two ways to fix it. We can fix it
> inside the kernel or let userspace/VMM set those bits before
> restarting the guest on the destination machine. What do you think?

I think either you're missing my point above, or I'm completely
missing yours. And I still don't understand why you want to zero the
counters that you have just restored. How does that help?

	M.

-- 
Without deviation from the norm, progress is not possible.
