All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marcelo Tosatti <mtosatti@redhat.com>
To: "Auld, Will" <will.auld@intel.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Avi Kivity <avi@redhat.com>,
	"Zhang, Xiantao" <xiantao.zhang@intel.com>
Subject: Re: [PATCH] Enabling IA32_TSC_ADJUST for guest VM
Date: Mon, 8 Oct 2012 14:30:55 -0300	[thread overview]
Message-ID: <20121008173055.GC12735@amt.cnet> (raw)
In-Reply-To: <96EC5A4F3149B74492D2D9B9B1602C2728B61A0D@ORSMSX108.amr.corp.intel.com>


On Wed, Sep 19, 2012 at 05:44:46PM +0000, Auld, Will wrote:
> >From 9982bb73460b05c1328068aae047b14b2294e2da Mon Sep 17 00:00:00 2001
> From: Will Auld <will.auld@intel.com>
> Date: Wed, 12 Sep 2012 18:10:56 -0700
> Subject: [PATCH] Enabling IA32_TSC_ADJUST for guest VM
> 
> CPUID.7.0.EBX[1]=1 indicates IA32_TSC_ADJUST MSR 0x3b is supported
> 
> Basic design is to emulate the MSR by allowing reads and writes to a guest vcpu specific location to store the value of the emulated MSR while adding the value to the vmcs tsc_offset. In this way the IA32_TSC_ADJUST value will be included in all reads to the TSC MSR whether through rdmsr or rdtsc. This is of course as long as the "use TSC counter offsetting" VM-execution control is enabled as well as the IA32_TSC_ADJUST control.
> 
> However, because hardware will only return the TSC + IA32_TSC_ADJUST + vmsc tsc_offset for a guest process when it does and rdtsc (with the correct settings) the value of our virtualized IA32_TSC_ADJUST must be stored in one of these three locations. The argument against storing it in the actual MSR is performance. This is likely to be seldom used while the save/restore is required on every transition. IA32_TSC_ADJUST was created as a way to solve some issues with writing TSC itself so that is not an option either. The remaining option, defined above as our solution has the problem of returning incorrect vmcs tsc_offset values (unless we intercept and fix, not done here) as mentioned above. However, more problematic is that storing the data in vmcs tsc_offset will have a different semantic effect on the system than does using the actual MSR. This is illustrated in the following example: The hypervisor set the IA32_TSC_ADJUST, then the guest sets it and a guest process perfor!
>  ms a rdtsc. In this case the guest process will get TSC + IA32_TSC_ADJUST_hyperviser + vmsc tsc_offset including IA32_TSC_ADJUST_guest. While the total system semantics changed the semantics as seen by the guest do not and hence this will not cause a problem.
> ---
>  arch/x86/include/asm/cpufeature.h |    1 +
>  arch/x86/include/asm/kvm_host.h   |    2 ++
>  arch/x86/include/asm/msr-index.h  |    1 +
>  arch/x86/kvm/cpuid.c              |    4 ++--
>  arch/x86/kvm/vmx.c                |   12 ++++++++++++
>  arch/x86/kvm/x86.c                |    1 +
>  6 files changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> index 6b7ee5f..e574d81 100644
> --- a/arch/x86/include/asm/cpufeature.h
> +++ b/arch/x86/include/asm/cpufeature.h
> @@ -199,6 +199,7 @@
>  
>  /* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx), word 9 */
>  #define X86_FEATURE_FSGSBASE	(9*32+ 0) /* {RD/WR}{FS/GS}BASE instructions*/
> +#define X86_FEATURE_TSC_ADJUST  (9*32+ 1) /* TSC adjustment MSR 0x3b */
>  #define X86_FEATURE_BMI1	(9*32+ 3) /* 1st group bit manipulation extensions */
>  #define X86_FEATURE_HLE		(9*32+ 4) /* Hardware Lock Elision */
>  #define X86_FEATURE_AVX2	(9*32+ 5) /* AVX2 instructions */
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 09155d6..8a001a4 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -442,6 +442,8 @@ struct kvm_vcpu_arch {
>  	u32 virtual_tsc_mult;
>  	u32 virtual_tsc_khz;
>  
> +	s64 tsc_adjust;
> +
>  	atomic_t nmi_queued;  /* unprocessed asynchronous NMIs */
>  	unsigned nmi_pending; /* NMI queued after currently running handler */
>  	bool nmi_injected;    /* Trying to inject an NMI this entry */
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 957ec87..8e82e29 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -231,6 +231,7 @@
>  #define MSR_IA32_EBL_CR_POWERON		0x0000002a
>  #define MSR_EBC_FREQUENCY_ID		0x0000002c
>  #define MSR_IA32_FEATURE_CONTROL        0x0000003a
> +#define MSR_TSC_ADJUST				0x0000003b
>  
>  #define FEATURE_CONTROL_LOCKED				(1<<0)
>  #define FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX	(1<<1)
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 0595f13..8f5943e 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -248,8 +248,8 @@ static int do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  
>  	/* cpuid 7.0.ebx */
>  	const u32 kvm_supported_word9_x86_features =
> -		F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
> -		F(BMI2) | F(ERMS) | f_invpcid | F(RTM);
> +		F(FSGSBASE) | F(TSC_ADJUST) | F(BMI1) | F(HLE) |
> +		F(AVX2) | F(SMEP) | F(BMI2) | F(ERMS) | f_invpcid | F(RTM);
>  
>  	/* all calls to cpuid_count() should be made on the same cpu */
>  	get_cpu();
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index c00f03d..35d11b3 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2173,6 +2173,9 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
>  	case MSR_IA32_SYSENTER_ESP:
>  		data = vmcs_readl(GUEST_SYSENTER_ESP);
>  		break;
> +	case MSR_TSC_ADJUST:
> +		data = (u64)vcpu->arch.tsc_adjust;
> +		break;
>  	case MSR_TSC_AUX:
>  		if (!to_vmx(vcpu)->rdtscp_enabled)
>  			return 1;

From Intel's manual:

• If an execution of WRMSR to the IA32_TIME_STAMP_COUNTER MSR adds (or
subtracts) value X from the TSC,
the logical processor also adds (or subtracts) value X from the
IA32_TSC_ADJUST MSR.

This is not handled in the patch. 

To support migration, it will be necessary to differentiate between
guest initiated and userspace-model initiated msr write. That is, 
only guest initiated TSC writes should affect the value of 
IA32_TSC_ADJUST MSR.

Avi, any better idea?

Will, please write a test case, see

http://git.kernel.org/?p=virt/kvm/kvm-unit-tests.git;a=summary

> @@ -2241,6 +2244,13 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
>  		}
>  		ret = kvm_set_msr_common(vcpu, msr_index, data);
>  		break;
> +	case MSR_TSC_ADJUST:
> +#define DUMMY 1
> +		vmx_adjust_tsc_offset(vcpu,
> +				(s64)(data-vcpu->arch.tsc_adjust),
> +				(bool)DUMMY);
> +		vcpu->arch.tsc_adjust = (s64)data;
> +		break;
>  	case MSR_TSC_AUX:
>  		if (!vmx->rdtscp_enabled)
>  			return 1;
> @@ -3931,6 +3941,8 @@ static int vmx_vcpu_reset(struct kvm_vcpu *vcpu)
>  
>  	vcpu->arch.regs_avail = ~((1 << VCPU_REGS_RIP) | (1 << VCPU_REGS_RSP));
>  
> +	vcpu->arch.tsc_adjust = 0x0;
> +
>  	vmx->rmode.vm86_active = 0;
>  
>  	vmx->soft_vnmi_blocked = 0;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 42bce48..6c50f6c 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -824,6 +824,7 @@ static u32 msrs_to_save[] = {
>  static unsigned num_msrs_to_save;
>  
>  static u32 emulated_msrs[] = {
> +	MSR_TSC_ADJUST,
>  	MSR_IA32_TSCDEADLINE,
>  	MSR_IA32_MISC_ENABLE,
>  	MSR_IA32_MCG_STATUS,
> -- 
> 1.7.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2012-10-08 17:33 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-19 17:44 [PATCH] Enabling IA32_TSC_ADJUST for guest VM Auld, Will
2012-09-20 11:13 ` Avi Kivity
2012-09-20 11:15 ` Avi Kivity
2012-09-26 21:34 ` Marcelo Tosatti
2012-09-26 22:58   ` Auld, Will
2012-09-27  0:29     ` Marcelo Tosatti
2012-09-27  0:30       ` Marcelo Tosatti
2012-09-27  0:50       ` Auld, Will
2012-09-27 11:31         ` Marcelo Tosatti
2012-09-27 11:48           ` Marcelo Tosatti
2012-09-28  2:07             ` Auld, Will
2012-09-28 13:24               ` Marcelo Tosatti
2012-10-08 17:30 ` Marcelo Tosatti [this message]
2012-10-09 12:12   ` Avi Kivity
2012-10-09 14:24     ` Marcelo Tosatti
2012-10-09 14:26       ` Avi Kivity
2012-10-09 14:27         ` Marcelo Tosatti
2012-10-09 14:30           ` Avi Kivity
2012-10-09 15:52             ` Marcelo Tosatti
2012-10-09 16:10     ` Auld, Will
2012-10-10 12:52       ` Marcelo Tosatti
2012-10-11  0:47         ` Auld, Will
2012-10-11  8:56           ` Marcelo Tosatti

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121008173055.GC12735@amt.cnet \
    --to=mtosatti@redhat.com \
    --cc=avi@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=will.auld@intel.com \
    --cc=xiantao.zhang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.