kvm.vger.kernel.org archive mirror
From: "Stamatis, Ilias" <ilstam@amazon.com>
To: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"mlevitsk@redhat.com" <mlevitsk@redhat.com>,
	"ilstam@mailbox.org" <ilstam@mailbox.org>
Cc: "jmattson@google.com" <jmattson@google.com>,
	"Woodhouse, David" <dwmw@amazon.co.uk>,
	"vkuznets@redhat.com" <vkuznets@redhat.com>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"mtosatti@redhat.com" <mtosatti@redhat.com>,
	"zamsden@gmail.com" <zamsden@gmail.com>,
	"seanjc@google.com" <seanjc@google.com>,
	"wanpengli@tencent.com" <wanpengli@tencent.com>
Subject: Re: [PATCH 3/8] KVM: X86: Pass an additional 'L1' argument to kvm_scale_tsc()
Date: Mon, 10 May 2021 15:44:31 +0000
Message-ID: <041e087ab930f33cff5563204c79438368c9d694.camel@amazon.com>
In-Reply-To: <b87ca34b3251f06c807e5d46bbf821756e57ff5b.camel@redhat.com>

On Mon, 2021-05-10 at 16:52 +0300, Maxim Levitsky wrote:
> On Thu, 2021-05-06 at 10:32 +0000, ilstam@mailbox.org wrote:
> > From: Ilias Stamatis <ilstam@amazon.com>
> > 
> > Sometimes kvm_scale_tsc() needs to use the current scaling ratio and
> > other times (like when reading the TSC from user space) it needs to use
> > L1's scaling ratio. Have the caller specify this by passing an
> > additional boolean argument.
> > 
> > Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
> > ---
> >  arch/x86/include/asm/kvm_host.h |  2 +-
> >  arch/x86/kvm/x86.c              | 21 +++++++++++++--------
> >  2 files changed, 14 insertions(+), 9 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 132e820525fb..cdddbf0b1177 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1779,7 +1779,7 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
> >  void kvm_define_user_return_msr(unsigned index, u32 msr);
> >  int kvm_set_user_return_msr(unsigned index, u64 val, u64 mask);
> > 
> > -u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc);
> > +u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc, bool l1);
> >  u64 kvm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc);
> > 
> >  unsigned long kvm_get_linear_rip(struct kvm_vcpu *vcpu);
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 7bc5155ac6fd..26a4c0f46f15 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -2241,10 +2241,14 @@ static inline u64 __scale_tsc(u64 ratio, u64 tsc)
> >       return mul_u64_u64_shr(tsc, ratio, kvm_tsc_scaling_ratio_frac_bits);
> >  }
> > 
> > -u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc)
> > +/*
> > + * If l1 is true the TSC is scaled using L1's scaling ratio, otherwise
> > + * the current scaling ratio is used.
> > + */
> > +u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc, bool l1)
> >  {
> >       u64 _tsc = tsc;
> > -     u64 ratio = vcpu->arch.tsc_scaling_ratio;
> > +     u64 ratio = l1 ? vcpu->arch.l1_tsc_scaling_ratio : vcpu->arch.tsc_scaling_ratio;
> > 
> >       if (ratio != kvm_default_tsc_scaling_ratio)
> >               _tsc = __scale_tsc(ratio, tsc);
> 
> I do wonder whether it would be better to have kvm_scale_tsc_l1 and
> kvm_scale_tsc instead, for better readability?
> 

That makes sense. Will do.
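
A minimal userspace sketch of what the split could look like, not the
actual follow-up patch. The struct, the `_sketch` suffixed names and the
48-bit fraction width (the value used on VMX) are assumptions made for
illustration, and the 128-bit multiply stands in for mul_u64_u64_shr()
using the GCC/Clang `unsigned __int128` extension.

```c
#include <stdint.h>

/* Illustrative stand-ins for the kernel definitions (assumptions). */
#define FRAC_BITS_SKETCH     48
#define DEFAULT_RATIO_SKETCH (1ULL << FRAC_BITS_SKETCH)

struct vcpu_sketch {
	uint64_t l1_tsc_scaling_ratio; /* ratio L0 applies for L1 */
	uint64_t tsc_scaling_ratio;    /* ratio currently in effect (L1 or L2) */
};

/* Fixed-point multiply: (tsc * ratio) >> FRAC_BITS with a 128-bit product. */
static uint64_t scale_tsc_sketch(uint64_t ratio, uint64_t tsc)
{
	if (ratio == DEFAULT_RATIO_SKETCH)
		return tsc;
	return (uint64_t)(((unsigned __int128)tsc * ratio) >> FRAC_BITS_SKETCH);
}

/* Always scales with L1's ratio, e.g. for user space TSC reads. */
static uint64_t kvm_scale_tsc_l1_sketch(const struct vcpu_sketch *vcpu,
					uint64_t tsc)
{
	return scale_tsc_sketch(vcpu->l1_tsc_scaling_ratio, tsc);
}

/* Scales with whatever ratio is currently in effect. */
static uint64_t kvm_scale_tsc_sketch(const struct vcpu_sketch *vcpu,
				     uint64_t tsc)
{
	return scale_tsc_sketch(vcpu->tsc_scaling_ratio, tsc);
}
```

With two names the call sites say which ratio they want, instead of
passing a bare true/false.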

> 
> > @@ -2257,14 +2261,14 @@ static u64 kvm_compute_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
> >  {
> >       u64 tsc;
> > 
> > -     tsc = kvm_scale_tsc(vcpu, rdtsc());
> > +     tsc = kvm_scale_tsc(vcpu, rdtsc(), true);
> 
> Here we have a somewhat dangerous assumption that this function
> will always be used with L1 tsc values.
> 
> The kvm_compute_tsc_offset should at least be renamed to
> kvm_compute_tsc_offset_l1 or something like that.
> 
> Currently the assumption holds though:
> 
> We call the kvm_compute_tsc_offset in:
> 
> -> kvm_synchronize_tsc which is nowadays thankfully only called
> on TSC writes from qemu, which are assumed to be L1 values.
> 
> (this is pending a rework of the whole thing which I started
> some time ago but haven't had a chance to finish it yet)
> 
> -> Guest write of MSR_IA32_TSC. The value written is in L1 units,
> since TSC offset/scaling only covers RDTSC and RDMSR of the IA32_TSC,
> while WRMSR should be intercepted by L1 and emulated.
> If it is not emulated, then L2 would just write L1 value.
> 
> -> in kvm_arch_vcpu_load, when TSC is unstable, we always try to resume
> the guest from the same TSC value as it had seen last time,
> and then catch up.

Yes. I wasn't sure about kvm_compute_tsc_offset but my understanding was
that all of its callers wanted the L1 value scaled.

Renaming it to kvm_compute_tsc_offset_l1 sounds like a great idea.
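
For reference, a rough userspace sketch of the L1-only offset
computation discussed above. It is not the kernel code: the struct and
the `_sketch` names are hypothetical, the host TSC is passed in as a
parameter instead of being read with rdtsc(), and the 48-bit fraction
width is assumed.

```c
#include <stdint.h>

/* Illustrative stand-ins for the kernel definitions (assumptions). */
#define FRAC_BITS_SKETCH 48

struct vcpu_sketch {
	uint64_t l1_tsc_scaling_ratio;
	uint64_t l1_tsc_offset;
};

/* Scale a host TSC value with L1's ratio (128-bit intermediate product). */
static uint64_t scale_tsc_l1_sketch(const struct vcpu_sketch *vcpu,
				    uint64_t host_tsc)
{
	return (uint64_t)(((unsigned __int128)host_tsc *
			   vcpu->l1_tsc_scaling_ratio) >> FRAC_BITS_SKETCH);
}

/*
 * Offset that makes L1 observe target_tsc at host time host_tsc.
 * Unsigned wraparound is intentional: the offset is applied modulo 2^64.
 */
static uint64_t compute_tsc_offset_l1_sketch(const struct vcpu_sketch *vcpu,
					     uint64_t host_tsc,
					     uint64_t target_tsc)
{
	return target_tsc - scale_tsc_l1_sketch(vcpu, host_tsc);
}

/* Counterpart of kvm_read_l1_tsc(): L1's view of a given host TSC value. */
static uint64_t read_l1_tsc_sketch(const struct vcpu_sketch *vcpu,
				   uint64_t host_tsc)
{
	return vcpu->l1_tsc_offset + scale_tsc_l1_sketch(vcpu, host_tsc);
}
```

The point of the _l1 suffix is that both helpers have to use the same
ratio; mixing the L1 ratio in one with the current ratio in the other
would make the round trip miss the target value while L2 is running.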

> Also host TSC values are used, and after reading this function,
> I recommend renaming vcpu->arch.last_guest_tsc
> to vcpu->arch.last_guest_tsc_l1 to document this.

OK

> > 
> >       return target_tsc - tsc;
> >  }
> > 
> >  u64 kvm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc)
> >  {
> > -     return vcpu->arch.l1_tsc_offset + kvm_scale_tsc(vcpu, host_tsc);
> > +     return vcpu->arch.l1_tsc_offset + kvm_scale_tsc(vcpu, host_tsc, true);
> 
> OK
> >  }
> >  EXPORT_SYMBOL_GPL(kvm_read_l1_tsc);
> > 
> > @@ -2395,9 +2399,9 @@ static inline void adjust_tsc_offset_guest(struct kvm_vcpu *vcpu,
> > 
> >  static inline void adjust_tsc_offset_host(struct kvm_vcpu *vcpu, s64 adjustment)
> >  {
> > -     if (vcpu->arch.tsc_scaling_ratio != kvm_default_tsc_scaling_ratio)
> > +     if (vcpu->arch.l1_tsc_scaling_ratio != kvm_default_tsc_scaling_ratio)
> >               WARN_ON(adjustment < 0);
> 
> This should belong to patch 2 IMHO.
> 

Right, I will move it.

> > -     adjustment = kvm_scale_tsc(vcpu, (u64) adjustment);
> > +     adjustment = kvm_scale_tsc(vcpu, (u64) adjustment, true);
> 
> OK
> >       adjust_tsc_offset_guest(vcpu, adjustment);
> >  }
> > 
> > @@ -2780,7 +2784,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
> >       /* With all the info we got, fill in the values */
> > 
> >       if (kvm_has_tsc_control)
> > -             tgt_tsc_khz = kvm_scale_tsc(v, tgt_tsc_khz);
> > +             tgt_tsc_khz = kvm_scale_tsc(v, tgt_tsc_khz, true);
> 
> OK (kvmclock is for L1 only, L1 hypervisor is free to expose its own kvmclock to L2)
> > 
> >       if (unlikely(vcpu->hw_tsc_khz != tgt_tsc_khz)) {
> >               kvm_get_time_scale(NSEC_PER_SEC, tgt_tsc_khz * 1000LL,
> > @@ -3474,7 +3478,8 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> >               u64 tsc_offset = msr_info->host_initiated ? vcpu->arch.l1_tsc_offset :
> >                                                           vcpu->arch.tsc_offset;
> > 
> > -             msr_info->data = kvm_scale_tsc(vcpu, rdtsc()) + tsc_offset;
> > +             msr_info->data = kvm_scale_tsc(vcpu, rdtsc(),
> > +                                            msr_info->host_initiated) + tsc_offset;
> 
> Since we now do two things that depend on msr_info->host_initiated, I
> think I would prefer to convert this back to regular 'if'.
> I don't have a strong opinion on this though.
> 

Agreed.
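
Something along these lines, as a userspace sketch only. The struct, the
`_sketch` names and the 48-bit fraction width are assumptions, and the
host TSC is a parameter here rather than an rdtsc() call.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for the kernel definitions (assumptions). */
#define FRAC_BITS_SKETCH 48

struct vcpu_sketch {
	uint64_t l1_tsc_scaling_ratio;
	uint64_t l1_tsc_offset;
	uint64_t tsc_scaling_ratio; /* ratio currently in effect */
	uint64_t tsc_offset;        /* offset currently in effect */
};

static uint64_t scale_tsc_sketch(uint64_t ratio, uint64_t tsc)
{
	return (uint64_t)(((unsigned __int128)tsc * ratio) >> FRAC_BITS_SKETCH);
}

/*
 * MSR_IA32_TSC read: host-initiated reads see L1's view, guest reads
 * see the currently active view. A plain if/else keeps the offset and
 * the ratio selected together.
 */
static uint64_t read_ia32_tsc_sketch(const struct vcpu_sketch *vcpu,
				     uint64_t host_tsc, bool host_initiated)
{
	uint64_t offset, ratio;

	if (host_initiated) {
		offset = vcpu->l1_tsc_offset;
		ratio = vcpu->l1_tsc_scaling_ratio;
	} else {
		offset = vcpu->tsc_offset;
		ratio = vcpu->tsc_scaling_ratio;
	}

	return scale_tsc_sketch(ratio, host_tsc) + offset;
}
```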

Thanks!

Ilias

> 
> >               break;
> >       }
> >       case MSR_MTRRcap:
> 
> 
> Best regards,
>         Maxim Levitsky
> 
> 
