All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] x86/kvm: avoid expensive rdmsrs for FS/GS base MSRs
@ 2018-03-12 14:02 Vitaly Kuznetsov
  2018-03-12 14:02 ` [PATCH 1/3] x86/kvm/vmx: read MSR_FS_BASE from current->thread Vitaly Kuznetsov
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Vitaly Kuznetsov @ 2018-03-12 14:02 UTC (permalink / raw)
  To: kvm, x86
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Andy Lutomirski

Changes since RFC [Andy Lutomirski]:
- Export new save_current_fsgs() API and call it before reading
  current->thread.fs/gsbase
- New cpu_kernelmode_gs_base() API.
- Some comments added.

Some time ago Paolo suggested to take a look at probably unneeded expensive
rdmsrs for FS/GS base MSR in vmx_save_host_state(). This is called on every
vcpu run when we need to handle vmexit in userspace.

vmx_save_host_state() is always called in a very well defined context
(ioctl from userspace) so we may try to get the required values for FS/GS
bases from in-kernel variables and avoid expensive rdmsrs.

My debug shows we're shaving off 240 cpu cycles (E5-2603 v3).

Vitaly Kuznetsov (3):
  x86/kvm/vmx: read MSR_FS_BASE from current->thread
  x86/kvm/vmx: read MSR_KERNEL_GS_BASE from current->thread
  x86/kvm/vmx: avoid expensive rdmsr for MSR_GS_BASE

 arch/x86/include/asm/processor.h |  8 ++++++++
 arch/x86/kernel/cpu/common.c     |  3 ++-
 arch/x86/kernel/process_64.c     | 20 ++++++++++++++++++++
 arch/x86/kvm/vmx.c               |  9 ++++++---
 4 files changed, 36 insertions(+), 4 deletions(-)

-- 
2.14.3

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/3] x86/kvm/vmx: read MSR_FS_BASE from current->thread
  2018-03-12 14:02 [PATCH 0/3] x86/kvm: avoid expensive rdmsrs for FS/GS base MSRs Vitaly Kuznetsov
@ 2018-03-12 14:02 ` Vitaly Kuznetsov
  2018-03-12 15:34   ` Andy Lutomirski
  2018-03-12 16:03   ` Paolo Bonzini
  2018-03-12 14:02 ` [PATCH 2/3] x86/kvm/vmx: read MSR_KERNEL_GS_BASE " Vitaly Kuznetsov
  2018-03-12 14:03 ` [PATCH 3/3] x86/kvm/vmx: avoid expensive rdmsr for MSR_GS_BASE Vitaly Kuznetsov
  2 siblings, 2 replies; 14+ messages in thread
From: Vitaly Kuznetsov @ 2018-03-12 14:02 UTC (permalink / raw)
  To: kvm, x86
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Andy Lutomirski

vmx_save_host_state() is only called from kvm_arch_vcpu_ioctl_run() so
the context is pretty well defined. Read MSR_FS_BASE from
current->thread.fsbase after calling save_fsgs() which takes care of
X86_BUG_NULL_SEG case now and will do RD[FG,GS]BASE when FSGSBASE
extensions are exposed to userspace (currently they are not).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/processor.h |  3 +++
 arch/x86/kernel/process_64.c     | 20 ++++++++++++++++++++
 arch/x86/kvm/vmx.c               |  4 +++-
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index b0ccd4847a58..006352b85ba3 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -410,6 +410,9 @@ DECLARE_INIT_PER_CPU(irq_stack_union);
 DECLARE_PER_CPU(char *, irq_stack_ptr);
 DECLARE_PER_CPU(unsigned int, irq_count);
 extern asmlinkage void ignore_sysret(void);
+
+/* Save actual FS/GS selectors and bases to current->thread */
+void save_current_fsgs(void);
 #else	/* X86_64 */
 #ifdef CONFIG_CC_STACKPROTECTOR
 /*
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 9eb448c7859d..eb907fefe02e 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -205,6 +205,26 @@ static __always_inline void save_fsgs(struct task_struct *task)
 	save_base_legacy(task, task->thread.gsindex, GS);
 }
 
+/*
+ * Currently, the only way for processes to change their FS/GS base is to call
+ * ARCH_SET_FS/GS prctls and these reflect changes they make in task->thread.
+ * There are, however, additional considerations:
+ *
+ * There is X86_BUG_NULL_SEG: on some CPUs writing '0' to FS/GS selectors zeroes
+ * the base and on some it doesn't, we need to check for that
+ * (see save_base_legacy()).
+ *
+ * When FSGSBASE extensions are enabled userspace processes will be able to
+ * change their FS/GS bases without kernel intervention. save_fsgs() will
+ * have to be updated to actually read FS and GS bases with RD[FG,GS]BASE
+ * instructions.
+ */
+void save_current_fsgs(void)
+{
+	save_fsgs(current);
+}
+EXPORT_SYMBOL_GPL(save_current_fsgs);
+
 static __always_inline void loadseg(enum which_selector which,
 				    unsigned short sel)
 {
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 051dab74e4e9..e46b7b24ebae 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2157,7 +2157,9 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 #endif
 
 #ifdef CONFIG_X86_64
-	vmcs_writel(HOST_FS_BASE, read_msr(MSR_FS_BASE));
+	/* Synchronize FS and GS bases to current->thread first */
+	save_current_fsgs();
+	vmcs_writel(HOST_FS_BASE, current->thread.fsbase);
 	vmcs_writel(HOST_GS_BASE, read_msr(MSR_GS_BASE));
 #else
 	vmcs_writel(HOST_FS_BASE, segment_base(vmx->host_state.fs_sel));
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/3] x86/kvm/vmx: read MSR_KERNEL_GS_BASE from current->thread
  2018-03-12 14:02 [PATCH 0/3] x86/kvm: avoid expensive rdmsrs for FS/GS base MSRs Vitaly Kuznetsov
  2018-03-12 14:02 ` [PATCH 1/3] x86/kvm/vmx: read MSR_FS_BASE from current->thread Vitaly Kuznetsov
@ 2018-03-12 14:02 ` Vitaly Kuznetsov
  2018-03-12 14:03 ` [PATCH 3/3] x86/kvm/vmx: avoid expensive rdmsr for MSR_GS_BASE Vitaly Kuznetsov
  2 siblings, 0 replies; 14+ messages in thread
From: Vitaly Kuznetsov @ 2018-03-12 14:02 UTC (permalink / raw)
  To: kvm, x86
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Andy Lutomirski

vmx_save_host_state() is only called from kvm_arch_vcpu_ioctl_run() so
the context is pretty well defined and as we're past 'swapgs'
MSR_KERNEL_GS_BASE is equal to current->thread.gsbase (after previous
save_current_fsgs() call).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e46b7b24ebae..d4d9bb2fd91e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2167,7 +2167,7 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 #endif
 
 #ifdef CONFIG_X86_64
-	rdmsrl(MSR_KERNEL_GS_BASE, vmx->msr_host_kernel_gs_base);
+	vmx->msr_host_kernel_gs_base = current->thread.gsbase;
 	if (is_long_mode(&vmx->vcpu))
 		wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
 #endif
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/3] x86/kvm/vmx: avoid expensive rdmsr for MSR_GS_BASE
  2018-03-12 14:02 [PATCH 0/3] x86/kvm: avoid expensive rdmsrs for FS/GS base MSRs Vitaly Kuznetsov
  2018-03-12 14:02 ` [PATCH 1/3] x86/kvm/vmx: read MSR_FS_BASE from current->thread Vitaly Kuznetsov
  2018-03-12 14:02 ` [PATCH 2/3] x86/kvm/vmx: read MSR_KERNEL_GS_BASE " Vitaly Kuznetsov
@ 2018-03-12 14:03 ` Vitaly Kuznetsov
  2018-03-12 15:42   ` Andy Lutomirski
  2 siblings, 1 reply; 14+ messages in thread
From: Vitaly Kuznetsov @ 2018-03-12 14:03 UTC (permalink / raw)
  To: kvm, x86
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Andy Lutomirski

vmx_save_host_state() is only called from kvm_arch_vcpu_ioctl_run() so
the context is pretty well defined and as we're past 'swapgs' MSR_GS_BASE
should contain kernel's GS base which we point to irq_stack_union.

Add new kernelmode_gs_base() API, irq_stack_union needs to be exported
as KVM can be build as module.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/processor.h | 5 +++++
 arch/x86/kernel/cpu/common.c     | 3 ++-
 arch/x86/kvm/vmx.c               | 3 ++-
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 006352b85ba3..ab3d3e426a41 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -407,6 +407,11 @@ union irq_stack_union {
 DECLARE_PER_CPU_FIRST(union irq_stack_union, irq_stack_union) __visible;
 DECLARE_INIT_PER_CPU(irq_stack_union);
 
+static inline unsigned long cpu_kernelmode_gs_base(int cpu)
+{
+	return (unsigned long)per_cpu(irq_stack_union.gs_base, cpu);
+}
+
 DECLARE_PER_CPU(char *, irq_stack_ptr);
 DECLARE_PER_CPU(unsigned int, irq_count);
 extern asmlinkage void ignore_sysret(void);
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 348cf4821240..4702fbd98f92 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -487,7 +487,7 @@ void load_percpu_segment(int cpu)
 	loadsegment(fs, __KERNEL_PERCPU);
 #else
 	__loadsegment_simple(gs, 0);
-	wrmsrl(MSR_GS_BASE, (unsigned long)per_cpu(irq_stack_union.gs_base, cpu));
+	wrmsrl(MSR_GS_BASE, cpu_kernelmode_gs_base(cpu));
 #endif
 	load_stack_canary_segment();
 }
@@ -1398,6 +1398,7 @@ __setup("clearcpuid=", setup_clearcpuid);
 #ifdef CONFIG_X86_64
 DEFINE_PER_CPU_FIRST(union irq_stack_union,
 		     irq_stack_union) __aligned(PAGE_SIZE) __visible;
+EXPORT_PER_CPU_SYMBOL_GPL(irq_stack_union);
 
 /*
  * The following percpu variables are hot.  Align current_task to
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index d4d9bb2fd91e..36cb14a121ca 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2123,6 +2123,7 @@ static unsigned long segment_base(u16 selector)
 static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	int cpu = raw_smp_processor_id();
 	int i;
 
 	if (vmx->host_state.loaded)
@@ -2160,7 +2161,7 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 	/* Synchronize FS and GS bases to current->thread first */
 	save_current_fsgs();
 	vmcs_writel(HOST_FS_BASE, current->thread.fsbase);
-	vmcs_writel(HOST_GS_BASE, read_msr(MSR_GS_BASE));
+	vmcs_writel(HOST_GS_BASE, cpu_kernelmode_gs_base(cpu));
 #else
 	vmcs_writel(HOST_FS_BASE, segment_base(vmx->host_state.fs_sel));
 	vmcs_writel(HOST_GS_BASE, segment_base(vmx->host_state.gs_sel));
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] x86/kvm/vmx: read MSR_FS_BASE from current->thread
  2018-03-12 14:02 ` [PATCH 1/3] x86/kvm/vmx: read MSR_FS_BASE from current->thread Vitaly Kuznetsov
@ 2018-03-12 15:34   ` Andy Lutomirski
  2018-03-12 15:55     ` Vitaly Kuznetsov
  2018-03-12 16:03   ` Paolo Bonzini
  1 sibling, 1 reply; 14+ messages in thread
From: Andy Lutomirski @ 2018-03-12 15:34 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm list, X86 ML, LKML, Paolo Bonzini,
	Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Andy Lutomirski

On Mon, Mar 12, 2018 at 2:02 PM, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> vmx_save_host_state() is only called from kvm_arch_vcpu_ioctl_run() so
> the context is pretty well defined. Read MSR_FS_BASE from
> current->thread.fsbase after calling save_fsgs() which takes care of
> X86_BUG_NULL_SEG case now and will do RD[FG,GS]BASE when FSGSBASE
> extensions are exposed to userspace (currently they are not).
>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/include/asm/processor.h |  3 +++
>  arch/x86/kernel/process_64.c     | 20 ++++++++++++++++++++
>  arch/x86/kvm/vmx.c               |  4 +++-
>  3 files changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index b0ccd4847a58..006352b85ba3 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -410,6 +410,9 @@ DECLARE_INIT_PER_CPU(irq_stack_union);
>  DECLARE_PER_CPU(char *, irq_stack_ptr);
>  DECLARE_PER_CPU(unsigned int, irq_count);
>  extern asmlinkage void ignore_sysret(void);
> +
> +/* Save actual FS/GS selectors and bases to current->thread */
> +void save_current_fsgs(void);
>  #else  /* X86_64 */
>  #ifdef CONFIG_CC_STACKPROTECTOR
>  /*
> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
> index 9eb448c7859d..eb907fefe02e 100644
> --- a/arch/x86/kernel/process_64.c
> +++ b/arch/x86/kernel/process_64.c
> @@ -205,6 +205,26 @@ static __always_inline void save_fsgs(struct task_struct *task)
>         save_base_legacy(task, task->thread.gsindex, GS);
>  }
>
> +/*
> + * Currently, the only way for processes to change their FS/GS base is to call
> + * ARCH_SET_FS/GS prctls and these reflect changes they make in task->thread.
> + * There are, however, additional considerations:
> + *
> + * There is X86_BUG_NULL_SEG: on some CPUs writing '0' to FS/GS selectors zeroes
> + * the base and on some it doesn't, we need to check for that
> + * (see save_base_legacy()).
> + *
> + * When FSGSBASE extensions are enabled userspace processes will be able to
> + * change their FS/GS bases without kernel intervention. save_fsgs() will
> + * have to be updated to actually read FS and GS bases with RD[FG,GS]BASE
> + * instructions.
> + */

This is all a very complicated way to say "while a process is running,
current->thread.fsbase and current->thread.gsbase may not match the
corresponding CPU registers.  KVM wants an efficient way to save and
restore FSBASE and GSBASE."

And how about changing this to:

#if IS_ENABLED(CONFIG_KVM)
void save_fsgs_for_kvm(void)
{
    save_fsgs(current);
}
EXPORT_SYMBOL_GPL(save_fsgs_for_kvm);

--Andy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] x86/kvm/vmx: avoid expensive rdmsr for MSR_GS_BASE
  2018-03-12 14:03 ` [PATCH 3/3] x86/kvm/vmx: avoid expensive rdmsr for MSR_GS_BASE Vitaly Kuznetsov
@ 2018-03-12 15:42   ` Andy Lutomirski
  0 siblings, 0 replies; 14+ messages in thread
From: Andy Lutomirski @ 2018-03-12 15:42 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm list, X86 ML, LKML, Paolo Bonzini,
	Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Andy Lutomirski

On Mon, Mar 12, 2018 at 2:03 PM, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> vmx_save_host_state() is only called from kvm_arch_vcpu_ioctl_run() so
> the context is pretty well defined and as we're past 'swapgs' MSR_GS_BASE
> should contain kernel's GS base which we point to irq_stack_union.
>
> Add new kernelmode_gs_base() API, irq_stack_union needs to be exported
> as KVM can be build as module.
>

Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] x86/kvm/vmx: read MSR_FS_BASE from current->thread
  2018-03-12 15:34   ` Andy Lutomirski
@ 2018-03-12 15:55     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 14+ messages in thread
From: Vitaly Kuznetsov @ 2018-03-12 15:55 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: kvm list, X86 ML, LKML, Paolo Bonzini,
	Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin

Andy Lutomirski <luto@kernel.org> writes:

> On Mon, Mar 12, 2018 at 2:02 PM, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> vmx_save_host_state() is only called from kvm_arch_vcpu_ioctl_run() so
>> the context is pretty well defined. Read MSR_FS_BASE from
>> current->thread.fsbase after calling save_fsgs() which takes care of
>> X86_BUG_NULL_SEG case now and will do RD[FG,GS]BASE when FSGSBASE
>> extensions are exposed to userspace (currently they are not).
>>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  arch/x86/include/asm/processor.h |  3 +++
>>  arch/x86/kernel/process_64.c     | 20 ++++++++++++++++++++
>>  arch/x86/kvm/vmx.c               |  4 +++-
>>  3 files changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
>> index b0ccd4847a58..006352b85ba3 100644
>> --- a/arch/x86/include/asm/processor.h
>> +++ b/arch/x86/include/asm/processor.h
>> @@ -410,6 +410,9 @@ DECLARE_INIT_PER_CPU(irq_stack_union);
>>  DECLARE_PER_CPU(char *, irq_stack_ptr);
>>  DECLARE_PER_CPU(unsigned int, irq_count);
>>  extern asmlinkage void ignore_sysret(void);
>> +
>> +/* Save actual FS/GS selectors and bases to current->thread */
>> +void save_current_fsgs(void);
>>  #else  /* X86_64 */
>>  #ifdef CONFIG_CC_STACKPROTECTOR
>>  /*
>> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
>> index 9eb448c7859d..eb907fefe02e 100644
>> --- a/arch/x86/kernel/process_64.c
>> +++ b/arch/x86/kernel/process_64.c
>> @@ -205,6 +205,26 @@ static __always_inline void save_fsgs(struct task_struct *task)
>>         save_base_legacy(task, task->thread.gsindex, GS);
>>  }
>>
>> +/*
>> + * Currently, the only way for processes to change their FS/GS base is to call
>> + * ARCH_SET_FS/GS prctls and these reflect changes they make in task->thread.
>> + * There are, however, additional considerations:
>> + *
>> + * There is X86_BUG_NULL_SEG: on some CPUs writing '0' to FS/GS selectors zeroes
>> + * the base and on some it doesn't, we need to check for that
>> + * (see save_base_legacy()).
>> + *
>> + * When FSGSBASE extensions are enabled userspace processes will be able to
>> + * change their FS/GS bases without kernel intervention. save_fsgs() will
>> + * have to be updated to actually read FS and GS bases with RD[FG,GS]BASE
>> + * instructions.
>> + */
>
> This is all a very complicated way to say "while a process is running,
> current->thread.fsbase and current->thread.gsbase may not match the
> corresponding CPU registers.  KVM wants an efficient way to save and
> restore FSBASE and GSBASE."
>

Well, I though it is not really obvious why "current->thread.fsbase and
current->thread.gsbase may not match the corresponding CPU registers"
:-) but save_base_legacy() is really nearby so I'll trim my comment in
v2.

> And how about changing this to:
>
> #if IS_ENABLED(CONFIG_KVM)
> void save_fsgs_for_kvm(void)
> {
>     save_fsgs(current);
> }
> EXPORT_SYMBOL_GPL(save_fsgs_for_kvm);
>

Sure. Actually, there is nothing KVM-specific in this function but KVM
will probably be the only user for the time being.

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] x86/kvm/vmx: read MSR_FS_BASE from current->thread
  2018-03-12 14:02 ` [PATCH 1/3] x86/kvm/vmx: read MSR_FS_BASE from current->thread Vitaly Kuznetsov
  2018-03-12 15:34   ` Andy Lutomirski
@ 2018-03-12 16:03   ` Paolo Bonzini
  2018-03-12 16:13     ` Andy Lutomirski
  2018-03-13 13:20     ` Vitaly Kuznetsov
  1 sibling, 2 replies; 14+ messages in thread
From: Paolo Bonzini @ 2018-03-12 16:03 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, x86
  Cc: linux-kernel, Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Andy Lutomirski

On 12/03/2018 15:02, Vitaly Kuznetsov wrote:
>  
> +/*
> + * Currently, the only way for processes to change their FS/GS base is to call
> + * ARCH_SET_FS/GS prctls and these reflect changes they make in task->thread.
> + * There are, however, additional considerations:
> + *
> + * There is X86_BUG_NULL_SEG: on some CPUs writing '0' to FS/GS selectors zeroes
> + * the base and on some it doesn't, we need to check for that
> + * (see save_base_legacy()).
> + *
> + * When FSGSBASE extensions are enabled userspace processes will be able to
> + * change their FS/GS bases without kernel intervention. save_fsgs() will
> + * have to be updated to actually read FS and GS bases with RD[FG,GS]BASE
> + * instructions.
> + */
> +void save_current_fsgs(void)
> +{
> +	save_fsgs(current);
> +}
> +EXPORT_SYMBOL_GPL(save_current_fsgs);

We don't really need save_fsgs in KVM though.  We already do the
savesegment ourselves, and we know Intel CPUs don't have
X86_BUG_NULL_SEG.  So this:

        savesegment(fs, vmx->host_state.fs_sel);
        if (!(vmx->host_state.fs_sel & 7)) {
                vmcs_write16(HOST_FS_SELECTOR, vmx->host_state.fs_sel);
                vmx->host_state.fs_reload_needed = 0;
        } else {
                vmcs_write16(HOST_FS_SELECTOR, 0);
                vmx->host_state.fs_reload_needed = 1;
        }
        savesegment(gs, vmx->host_state.gs_sel);
	...

could probably become simply:

        savesegment(fs, vmx->host_state.fs_sel);
	/*
	 * When FSGSBASE extensions are enabled, this will have to use
	 * RD{FS,GS}BASE instead of accessing current, and the
	 * corresponding WR{FS,GS}BASE should be done unconditionally,
	 * even if fs_reload_needed (resp. gs_ldt_reload_needed) is 1.
	 */
        if (vmx->host_state.fs_sel <= 3) {
                vmcs_write16(HOST_FS_SELECTOR, vmx->host_state.fs_sel);
                vmcs_write16(HOST_FS_BASE, current->thread.fsbase);
                vmx->host_state.fs_reload_needed = 0;
        } else {
                vmcs_write16(HOST_FS_SELECTOR, 0);
                vmcs_write16(HOST_FS_BASE, 0);
                vmx->host_state.fs_reload_needed = 1;
        }
        savesegment(gs, vmx->host_state.gs_sel);
	...

?

Paolo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] x86/kvm/vmx: read MSR_FS_BASE from current->thread
  2018-03-12 16:03   ` Paolo Bonzini
@ 2018-03-12 16:13     ` Andy Lutomirski
  2018-03-12 16:18       ` Paolo Bonzini
  2018-03-13 13:20     ` Vitaly Kuznetsov
  1 sibling, 1 reply; 14+ messages in thread
From: Andy Lutomirski @ 2018-03-12 16:13 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, kvm list, X86 ML, LKML,
	Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Andy Lutomirski

On Mon, Mar 12, 2018 at 4:03 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 12/03/2018 15:02, Vitaly Kuznetsov wrote:
>>
>> +/*
>> + * Currently, the only way for processes to change their FS/GS base is to call
>> + * ARCH_SET_FS/GS prctls and these reflect changes they make in task->thread.
>> + * There are, however, additional considerations:
>> + *
>> + * There is X86_BUG_NULL_SEG: on some CPUs writing '0' to FS/GS selectors zeroes
>> + * the base and on some it doesn't, we need to check for that
>> + * (see save_base_legacy()).
>> + *
>> + * When FSGSBASE extensions are enabled userspace processes will be able to
>> + * change their FS/GS bases without kernel intervention. save_fsgs() will
>> + * have to be updated to actually read FS and GS bases with RD[FG,GS]BASE
>> + * instructions.
>> + */
>> +void save_current_fsgs(void)
>> +{
>> +     save_fsgs(current);
>> +}
>> +EXPORT_SYMBOL_GPL(save_current_fsgs);
>
> We don't really need save_fsgs in KVM though.  We already do the
> savesegment ourselves, and we know Intel CPUs don't have
> X86_BUG_NULL_SEG.  So this:
>
>         savesegment(fs, vmx->host_state.fs_sel);
>         if (!(vmx->host_state.fs_sel & 7)) {
>                 vmcs_write16(HOST_FS_SELECTOR, vmx->host_state.fs_sel);
>                 vmx->host_state.fs_reload_needed = 0;
>         } else {
>                 vmcs_write16(HOST_FS_SELECTOR, 0);
>                 vmx->host_state.fs_reload_needed = 1;
>         }
>         savesegment(gs, vmx->host_state.gs_sel);
>         ...
>
> could probably become simply:
>
>         savesegment(fs, vmx->host_state.fs_sel);
>         /*
>          * When FSGSBASE extensions are enabled, this will have to use
>          * RD{FS,GS}BASE instead of accessing current, and the
>          * corresponding WR{FS,GS}BASE should be done unconditionally,
>          * even if fs_reload_needed (resp. gs_ldt_reload_needed) is 1.
>          */
>         if (vmx->host_state.fs_sel <= 3) {
>                 vmcs_write16(HOST_FS_SELECTOR, vmx->host_state.fs_sel);
>                 vmcs_write16(HOST_FS_BASE, current->thread.fsbase);
>                 vmx->host_state.fs_reload_needed = 0;
>         } else {
>                 vmcs_write16(HOST_FS_SELECTOR, 0);
>                 vmcs_write16(HOST_FS_BASE, 0);
>                 vmx->host_state.fs_reload_needed = 1;
>         }
>         savesegment(gs, vmx->host_state.gs_sel);
>         ...
>
> ?
>

Hmm, probably, although this still gets the case where the user writes
0 to %fs wrong.  Of course, save_fsgs() also gets that wrong.

I'm okay with this variant as long as you add a comment to
save_..._legacy pointing at the KVM code.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] x86/kvm/vmx: read MSR_FS_BASE from current->thread
  2018-03-12 16:13     ` Andy Lutomirski
@ 2018-03-12 16:18       ` Paolo Bonzini
  2018-03-12 17:00         ` Andy Lutomirski
  0 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2018-03-12 16:18 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Vitaly Kuznetsov, kvm list, X86 ML, LKML,
	Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin

On 12/03/2018 17:13, Andy Lutomirski wrote:
>>
>>         savesegment(fs, vmx->host_state.fs_sel);
>>         /*
>>          * When FSGSBASE extensions are enabled, this will have to use
>>          * RD{FS,GS}BASE instead of accessing current, and the
>>          * corresponding WR{FS,GS}BASE should be done unconditionally,
>>          * even if fs_reload_needed (resp. gs_ldt_reload_needed) is 1.
>>          */
>>         if (vmx->host_state.fs_sel <= 3) {
>>                 vmcs_write16(HOST_FS_SELECTOR, vmx->host_state.fs_sel);
>>                 vmcs_write16(HOST_FS_BASE, current->thread.fsbase);
>>                 vmx->host_state.fs_reload_needed = 0;
>>         } else {
>>                 vmcs_write16(HOST_FS_SELECTOR, 0);
>>                 vmcs_write16(HOST_FS_BASE, 0);
>>                 vmx->host_state.fs_reload_needed = 1;
>>         }
>>         savesegment(gs, vmx->host_state.gs_sel);
>>         ...
>>
>> ?
>>
> Hmm, probably, although this still gets the case where the user writes
> 0 to %fs wrong.  Of course, save_fsgs() also gets that wrong.
> 
> I'm okay with this variant as long as you add a comment to
> save_..._legacy pointing at the KVM code.

Why in save_..._legacy?  If it is about FSGSBASE, shouldn't it be in
save_fsgs?  (Or if not I'm missing what the comment should be about).

Paolo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] x86/kvm/vmx: read MSR_FS_BASE from current->thread
  2018-03-12 16:18       ` Paolo Bonzini
@ 2018-03-12 17:00         ` Andy Lutomirski
  2018-03-12 17:11           ` Paolo Bonzini
  0 siblings, 1 reply; 14+ messages in thread
From: Andy Lutomirski @ 2018-03-12 17:00 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Andy Lutomirski, Vitaly Kuznetsov, kvm list, X86 ML, LKML,
	Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin

On Mon, Mar 12, 2018 at 4:18 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 12/03/2018 17:13, Andy Lutomirski wrote:
>>>
>>>         savesegment(fs, vmx->host_state.fs_sel);
>>>         /*
>>>          * When FSGSBASE extensions are enabled, this will have to use
>>>          * RD{FS,GS}BASE instead of accessing current, and the
>>>          * corresponding WR{FS,GS}BASE should be done unconditionally,
>>>          * even if fs_reload_needed (resp. gs_ldt_reload_needed) is 1.
>>>          */
>>>         if (vmx->host_state.fs_sel <= 3) {
>>>                 vmcs_write16(HOST_FS_SELECTOR, vmx->host_state.fs_sel);
>>>                 vmcs_write16(HOST_FS_BASE, current->thread.fsbase);
>>>                 vmx->host_state.fs_reload_needed = 0;
>>>         } else {
>>>                 vmcs_write16(HOST_FS_SELECTOR, 0);
>>>                 vmcs_write16(HOST_FS_BASE, 0);
>>>                 vmx->host_state.fs_reload_needed = 1;
>>>         }
>>>         savesegment(gs, vmx->host_state.gs_sel);
>>>         ...
>>>
>>> ?
>>>
>> Hmm, probably, although this still gets the case where the user writes
>> 0 to %fs wrong.  Of course, save_fsgs() also gets that wrong.
>>
>> I'm okay with this variant as long as you add a comment to
>> save_..._legacy pointing at the KVM code.
>
> Why in save_..._legacy?  If it is about FSGSBASE, shouldn't it be in
> save_fsgs?  (Or if not I'm missing what the comment should be about).

It could be in save_fsgs(), I guess.  The main point is to make it
clear to readers of the code in save_fsgs(), the legacy helpers, etc
that there's another piece of code in KVM that makes the same set of
somewhat problematic assumptions and that will need updating for
FSGSBASE.

I'm moderately confident that someone from Intel is currently working
on FSGSBASE, but all I've seen lately is a bunch of kbuild bot errors
:(

--Amdy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] x86/kvm/vmx: read MSR_FS_BASE from current->thread
  2018-03-12 17:00         ` Andy Lutomirski
@ 2018-03-12 17:11           ` Paolo Bonzini
  0 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2018-03-12 17:11 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Vitaly Kuznetsov, kvm list, X86 ML, LKML,
	Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin

On 12/03/2018 18:00, Andy Lutomirski wrote:
>> Why in save_..._legacy?  If it is about FSGSBASE, shouldn't it be in
>> save_fsgs?  (Or if not I'm missing what the comment should be about).
> It could be in save_fsgs(), I guess.  The main point is to make it
> clear to readers of the code in save_fsgs(), the legacy helpers, etc
> that there's another piece of code in KVM that makes the same set of
> somewhat problematic assumptions and that will need updating for
> FSGSBASE.

Okay, that's a good idea.

Paolo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] x86/kvm/vmx: read MSR_FS_BASE from current->thread
  2018-03-12 16:03   ` Paolo Bonzini
  2018-03-12 16:13     ` Andy Lutomirski
@ 2018-03-13 13:20     ` Vitaly Kuznetsov
  2018-03-13 13:24       ` Paolo Bonzini
  1 sibling, 1 reply; 14+ messages in thread
From: Vitaly Kuznetsov @ 2018-03-13 13:20 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, x86, linux-kernel, Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Andy Lutomirski

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 12/03/2018 15:02, Vitaly Kuznetsov wrote:
>>  
>> +/*
>> + * Currently, the only way for processes to change their FS/GS base is to call
>> + * ARCH_SET_FS/GS prctls and these reflect changes they make in task->thread.
>> + * There are, however, additional considerations:
>> + *
>> + * There is X86_BUG_NULL_SEG: on some CPUs writing '0' to FS/GS selectors zeroes
>> + * the base and on some it doesn't, we need to check for that
>> + * (see save_base_legacy()).
>> + *
>> + * When FSGSBASE extensions are enabled userspace processes will be able to
>> + * change their FS/GS bases without kernel intervention. save_fsgs() will
>> + * have to be updated to actually read FS and GS bases with RD[FG,GS]BASE
>> + * instructions.
>> + */
>> +void save_current_fsgs(void)
>> +{
>> +	save_fsgs(current);
>> +}
>> +EXPORT_SYMBOL_GPL(save_current_fsgs);
>
> We don't really need save_fsgs in KVM though.  We already do the
> savesegment ourselves, and we know Intel CPUs don't have
> X86_BUG_NULL_SEG.  So this:
>
>         savesegment(fs, vmx->host_state.fs_sel);
>         if (!(vmx->host_state.fs_sel & 7)) {
>                 vmcs_write16(HOST_FS_SELECTOR, vmx->host_state.fs_sel);
>                 vmx->host_state.fs_reload_needed = 0;
>         } else {
>                 vmcs_write16(HOST_FS_SELECTOR, 0);
>                 vmx->host_state.fs_reload_needed = 1;
>         }
>         savesegment(gs, vmx->host_state.gs_sel);
> 	...
>
> could probably become simply:
>
>         savesegment(fs, vmx->host_state.fs_sel);
> 	/*
> 	 * When FSGSBASE extensions are enabled, this will have to use
> 	 * RD{FS,GS}BASE instead of accessing current, and the
> 	 * corresponding WR{FS,GS}BASE should be done unconditionally,
> 	 * even if fs_reload_needed (resp. gs_ldt_reload_needed) is 1.
> 	 */
>         if (vmx->host_state.fs_sel <= 3) {
>                 vmcs_write16(HOST_FS_SELECTOR, vmx->host_state.fs_sel);
>                 vmcs_write16(HOST_FS_BASE, current->thread.fsbase);

vmcs_writel() I guess ... and, just to make sure I follow your
suggestion, this is for x86_64 only, right? x86_32 does

vmcs_writel(HOST_FS_BASE, segment_base(vmx->host_state.fs_sel));

and I think it needs to stay.

(personally, I'm rather for exporting save_fsgs(), dropping
savesegment() and getting all we need from current to avoid propagating
assumptions but I'm flexible)

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] x86/kvm/vmx: read MSR_FS_BASE from current->thread
  2018-03-13 13:20     ` Vitaly Kuznetsov
@ 2018-03-13 13:24       ` Paolo Bonzini
  0 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2018-03-13 13:24 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, x86, linux-kernel, Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Andy Lutomirski

On 13/03/2018 14:20, Vitaly Kuznetsov wrote:
> vmcs_writel() I guess ... and, just to make sure I follow your
> suggestion, this is for x86_64 only, right? x86_32 does
> 
> vmcs_writel(HOST_FS_BASE, segment_base(vmx->host_state.fs_sel));
> 
> and I think it needs to stay.

Yes.

> (personally, I'm rather for exporting save_fsgs(), dropping
> savesegment() and getting all we need from current to avoid propagating
> assumptions but I'm flexible)

Yes, that's fine too, as long as it's not mix and match. :)

Paolo

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2018-03-13 13:25 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-12 14:02 [PATCH 0/3] x86/kvm: avoid expensive rdmsrs for FS/GS base MSRs Vitaly Kuznetsov
2018-03-12 14:02 ` [PATCH 1/3] x86/kvm/vmx: read MSR_FS_BASE from current->thread Vitaly Kuznetsov
2018-03-12 15:34   ` Andy Lutomirski
2018-03-12 15:55     ` Vitaly Kuznetsov
2018-03-12 16:03   ` Paolo Bonzini
2018-03-12 16:13     ` Andy Lutomirski
2018-03-12 16:18       ` Paolo Bonzini
2018-03-12 17:00         ` Andy Lutomirski
2018-03-12 17:11           ` Paolo Bonzini
2018-03-13 13:20     ` Vitaly Kuznetsov
2018-03-13 13:24       ` Paolo Bonzini
2018-03-12 14:02 ` [PATCH 2/3] x86/kvm/vmx: read MSR_KERNEL_GS_BASE " Vitaly Kuznetsov
2018-03-12 14:03 ` [PATCH 3/3] x86/kvm/vmx: avoid expensive rdmsr for MSR_GS_BASE Vitaly Kuznetsov
2018-03-12 15:42   ` Andy Lutomirski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.