linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] KVM: x86: dynamically allocate large struct in em_fxrstor
@ 2017-05-24  6:24 Nick Desaulniers
  2017-05-24 14:19 ` Radim Krčmář
  0 siblings, 1 reply; 19+ messages in thread
From: Nick Desaulniers @ 2017-05-24  6:24 UTC (permalink / raw)
  Cc: Nick Desaulniers, Paolo Bonzini, Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, kvm,
	linux-kernel

Fixes the warning:

arch/x86/kvm/emulate.c:4018:12: warning: stack frame size of 1080 bytes in
function
      'em_fxrstor' [-Wframe-larger-than=]
static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
           ^

Found with CONFIG_FRAME_WARN set to 1024.

Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com>
---
 arch/x86/kvm/emulate.c | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 0816ab2e8adc..1d7c9ceeff56 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -4017,30 +4017,38 @@ static int fxrstor_fixup(struct x86_emulate_ctxt *ctxt,
 
 static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
 {
-	struct fxregs_state fx_state;
+	struct fxregs_state *fx_state;
 	int rc;
 
 	rc = check_fxsr(ctxt);
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
 
-	rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, 512);
+	fx_state = kmalloc(sizeof(*fx_state), GFP_KERNEL);
+	if (!fx_state)
+		return -ENOMEM;
+
+	rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, fx_state, 512);
 	if (rc != X86EMUL_CONTINUE)
-		return rc;
+		goto out;
 
-	if (fx_state.mxcsr >> 16)
-		return emulate_gp(ctxt, 0);
+	if (fx_state->mxcsr >> 16) {
+		rc = emulate_gp(ctxt, 0);
+		goto out;
+	}
 
 	ctxt->ops->get_fpu(ctxt);
 
 	if (ctxt->mode < X86EMUL_MODE_PROT64)
-		rc = fxrstor_fixup(ctxt, &fx_state);
+		rc = fxrstor_fixup(ctxt, fx_state);
 
 	if (rc == X86EMUL_CONTINUE)
-		rc = asm_safe("fxrstor %[fx]", : [fx] "m"(fx_state));
+		rc = asm_safe("fxrstor %[fx]", : [fx] "m"(*fx_state));
 
 	ctxt->ops->put_fpu(ctxt);
 
+out:
+	kfree(fx_state);
 	return rc;
 }
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] KVM: x86: dynamically allocate large struct in em_fxrstor
  2017-05-24  6:24 [PATCH] KVM: x86: dynamically allocate large struct in em_fxrstor Nick Desaulniers
@ 2017-05-24 14:19 ` Radim Krčmář
  2017-05-25  1:36   ` Nick Desaulniers
  0 siblings, 1 reply; 19+ messages in thread
From: Radim Krčmář @ 2017-05-24 14:19 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	kvm, linux-kernel

2017-05-23 23:24-0700, Nick Desaulniers:
> Fixes the warning:
> 
> arch/x86/kvm/emulate.c:4018:12: warning: stack frame size of 1080 bytes in
> function
>       'em_fxrstor' [-Wframe-larger-than=]
> static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
>            ^
> 
> Found with CONFIG_FRAME_WARN set to 1024.
> 
> Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com>
> ---
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> @@ -4017,30 +4017,38 @@ static int fxrstor_fixup(struct x86_emulate_ctxt *ctxt,
>  
>  static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
>  {
> -	struct fxregs_state fx_state;
> +	struct fxregs_state *fx_state;
>  	int rc;
>  
>  	rc = check_fxsr(ctxt);
>  	if (rc != X86EMUL_CONTINUE)
>  		return rc;
>  
> -	rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, 512);
> +	fx_state = kmalloc(sizeof(*fx_state), GFP_KERNEL);
> +	fx_state = kmalloc(sizeof(*fx_state), GFP_KERNEL);

fx_state must be 16 byte aligned and x86 ARCH_KMALLOC_MINALIGN is 8, so
this needs manual correction.

Also, please kmalloc also fxregs_state in fxrstor_fixup and em_fxsave so
we again have only one storage type.

> +	if (!fx_state)
> +		return -ENOMEM;

The caller does not understand -ENOMEM.  The appropriate return value is
X86EMUL_UNHANDLEABLE.

>  	if (ctxt->mode < X86EMUL_MODE_PROT64)
> -		rc = fxrstor_fixup(ctxt, &fx_state);
> +		rc = fxrstor_fixup(ctxt, fx_state);

Ah, fxrstor_fixup most likely got inlined and both of them put ~512 byte
fxregs_state on the stack ... noinline attribute should solve the
warning too.

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] KVM: x86: dynamically allocate large struct in em_fxrstor
  2017-05-24 14:19 ` Radim Krčmář
@ 2017-05-25  1:36   ` Nick Desaulniers
  2017-05-25 14:07     ` Paolo Bonzini
  0 siblings, 1 reply; 19+ messages in thread
From: Nick Desaulniers @ 2017-05-25  1:36 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	kvm, linux-kernel

On Wed, May 24, 2017 at 04:19:57PM +0200, Radim Krčmář wrote:
> 2017-05-23 23:24-0700, Nick Desaulniers:
> > +	fx_state = kmalloc(sizeof(*fx_state), GFP_KERNEL);
> > +	fx_state = kmalloc(sizeof(*fx_state), GFP_KERNEL);
> 
> fx_state must be 16 byte aligned and x86 ARCH_KMALLOC_MINALIGN is 8, so
> this needs manual correction.

Radim, thanks for the code review. I appreciate it! I'm fairly new to
kernel patching, but I'd like to resolve this with your guidance.

Looking through the available functions in slab.h, it seems the only
function that allows specifying alignment is kmem_cache_create.

Is that the simplest solution?

If so, then it seems like a `struct kmem_cache *` should be added as a
member to `struct x86_emulate_ctxt`? I'm not sure yet where the context
gets initialized and destroyed.  Do you have any insight?

Also, is there a #define'd value to use for the 16B alignment that I
should be using?

> Also, please kmalloc also fxregs_state in fxrstor_fixup and em_fxsave so
> we again have only one storage type.

Then it should be easy to call kmem_cache_alloc() at these sites.

> > +	if (!fx_state)
> > +		return -ENOMEM;
> 
> The caller does not understand -ENOMEM.  The appropriate return value is
> X86EMUL_UNHANDLEABLE.

Ack.

> >  	if (ctxt->mode < X86EMUL_MODE_PROT64)
> > -		rc = fxrstor_fixup(ctxt, &fx_state);
> > +		rc = fxrstor_fixup(ctxt, fx_state);
> 
> Ah, fxrstor_fixup most likely got inlined and both of them put ~512 byte
> fxregs_state on the stack ... noinline attribute should solve the
> warning too.

While that would change fewer lines, doesn't the problem still exist in
the case of `ctxt->mode < X86EMUL_MODE_PROT64`, just split across two
stack frames?  As in shouldn't we still dynamically allocate fx_state?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] KVM: x86: dynamically allocate large struct in em_fxrstor
  2017-05-25  1:36   ` Nick Desaulniers
@ 2017-05-25 14:07     ` Paolo Bonzini
  2017-05-26  4:13       ` Nick Desaulniers
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2017-05-25 14:07 UTC (permalink / raw)
  To: Nick Desaulniers, Radim Krčmář
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, kvm, linux-kernel



On 25/05/2017 03:36, Nick Desaulniers wrote:
>>>  	if (ctxt->mode < X86EMUL_MODE_PROT64)
>>> -		rc = fxrstor_fixup(ctxt, &fx_state);
>>> +		rc = fxrstor_fixup(ctxt, fx_state);
>> Ah, fxrstor_fixup most likely got inlined and both of them put ~512 byte
>> fxregs_state on the stack ... noinline attribute should solve the
>> warning too.
> While that would change fewer lines, doesn't the problem still exist in
> the case of `ctxt->mode < X86EMUL_MODE_PROT64`, just split across two
> stack frames?  As in shouldn't we still dynamically allocate fx_state?

I think we should do the fixup backwards.

That is:

- first do get_fpu

- if the fixup is necessary, i.e. ctxt->mode < X86EMUL_MODE_PROT64, do
fxsave into &fxstate.

- then do segmented_read_std with the correct size, which is
  - offsetof(struct fxregs_state, xmm_space[16]), i.e. 416
    if ctxt->mode == X86EMUL_MODE_PROT64
  - offsetof(struct fxregs_state, xmm_space[8]), i.e. 288
    if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=1
  - offsetof(struct fxregs_state, xmm_space[0]), i.e. 160
    if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=0

- then check fx_state.mxcsr

- then do fxrstor

- finally do put_fpu

This will remove one of the two fxregs_state structs.

Paolo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] KVM: x86: dynamically allocate large struct in em_fxrstor
  2017-05-25 14:07     ` Paolo Bonzini
@ 2017-05-26  4:13       ` Nick Desaulniers
  2017-05-26  7:18         ` Paolo Bonzini
  0 siblings, 1 reply; 19+ messages in thread
From: Nick Desaulniers @ 2017-05-26  4:13 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, kvm,
	linux-kernel

On Thu, May 25, 2017 at 04:07:08PM +0200, Paolo Bonzini wrote:
> I think we should do the fixup backwards.
> 
> That is:
> 
> - first do get_fpu
> 
> - if the fixup is necessary, i.e. ctxt->mode < X86EMUL_MODE_PROT64, do
> fxsave into &fxstate.
> 
> - then do segmented_read_std with the correct size, which is
>   - offsetof(struct fxregs_state, xmm_space[16]), i.e. 416
>     if ctxt->mode == X86EMUL_MODE_PROT64
>   - offsetof(struct fxregs_state, xmm_space[8]), i.e. 288
>     if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=1
>   - offsetof(struct fxregs_state, xmm_space[0]), i.e. 160
>     if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=0

but we still want to do a segmented_read_std with size 512 otherwise,
correct?

> - then check fx_state.mxcsr
> 
> - then do fxrstor

This sounds like we conditionally do the fxsave, but then always do the
fxrstor.  Is that ok? I guess the original code kind of does that as
well.

> - finally do put_fpu

Sounds straight forward.  I can see how fxsave and CR4.OSFXSR are
accessed in fxstor_fixup.  Is it ok to skip those memcpy's that would
otherwise occur when calling fxrstor_fixup() (which after these changes,
we would not be)?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] KVM: x86: dynamically allocate large struct in em_fxrstor
  2017-05-26  4:13       ` Nick Desaulniers
@ 2017-05-26  7:18         ` Paolo Bonzini
  2017-05-29 19:55           ` [PATCH v2] KVM: x86: avoid large stack allocations " Nick Desaulniers
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2017-05-26  7:18 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, kvm,
	linux-kernel



On 26/05/2017 06:13, Nick Desaulniers wrote:
> On Thu, May 25, 2017 at 04:07:08PM +0200, Paolo Bonzini wrote:
>> I think we should do the fixup backwards.
>>
>> That is:
>>
>> - first do get_fpu
>>
>> - if the fixup is necessary, i.e. ctxt->mode < X86EMUL_MODE_PROT64, do
>> fxsave into &fxstate.
>>
>> - then do segmented_read_std with the correct size, which is
>>   - offsetof(struct fxregs_state, xmm_space[16]), i.e. 416
>>     if ctxt->mode == X86EMUL_MODE_PROT64
>>   - offsetof(struct fxregs_state, xmm_space[8]), i.e. 288
>>     if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=1
>>   - offsetof(struct fxregs_state, xmm_space[0]), i.e. 160
>>     if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=0
> 
> but we still want to do a segmented_read_std with size 512 otherwise,
> correct?

No, 512 is never necessary.  ctxt->mode is never > X86EMUL_MODE_PROT64
(see the definition of enum x86emul_mode in
arch/x86/include/asm/kvm_emulate.h.

>> - then check fx_state.mxcsr
>>
>> - then do fxrstor
> 
> This sounds like we conditionally do the fxsave, but then always do the
> fxrstor.  Is that ok? I guess the original code kind of does that as
> well.

Correct.  They idea is that fxrstor on Linux always accesses 416 bytes,
but we may have to restore only the first part for accurate emulation.
The fxsave retrieves the current state so that we can leave it
unmodified when we do fxrstor.

>> - finally do put_fpu
> 
> Sounds straight forward.  I can see how fxsave and CR4.OSFXSR are
> accessed in fxstor_fixup.  Is it ok to skip those memcpy's that would
> otherwise occur when calling fxrstor_fixup() (which after these changes,
> we would not be)?

Yes, the memcpys are replaced with a shorter segmented_read_std.

Thanks,

Paolo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2] KVM: x86: avoid large stack allocations in em_fxrstor
  2017-05-26  7:18         ` Paolo Bonzini
@ 2017-05-29 19:55           ` Nick Desaulniers
  2017-05-29 20:14             ` kbuild test robot
  2017-05-29 20:39             ` [PATCH v3] " Nick Desaulniers
  0 siblings, 2 replies; 19+ messages in thread
From: Nick Desaulniers @ 2017-05-29 19:55 UTC (permalink / raw)
  Cc: Nick Desaulniers, Paolo Bonzini, Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, kvm,
	linux-kernel

em_fxstor previously called fxstor_fixup.  Both created instances of
struct fxregs_state on the stack, which triggered the warning:

arch/x86/kvm/emulate.c:4018:12: warning: stack frame size of 1080 bytes
in function
      'em_fxrstor' [-Wframe-larger-than=]
static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
           ^
with CONFIG_FRAME_WARN set to 1024.

This patch does the fixup in em_fxstor now, avoiding one additional
struct fxregs_state, and now fxstor_fixup can be removed as it has no
other call sites.

Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com>
---
New in V2:
* reworked patch to do what was recommended by maintainers in:
  https://lkml.org/lkml/2017/5/25/391

 arch/x86/kvm/emulate.c | 55 +++++++++++++++++---------------------------------
 1 file changed, 19 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 0816ab2e8adc..0087d3458604 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3985,57 +3985,40 @@ static int em_fxsave(struct x86_emulate_ctxt *ctxt)
 	return segmented_write_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
 }
 
-static int fxrstor_fixup(struct x86_emulate_ctxt *ctxt,
-		struct fxregs_state *new)
-{
-	int rc = X86EMUL_CONTINUE;
-	struct fxregs_state old;
-
-	rc = asm_safe("fxsave %[fx]", , [fx] "+m"(old));
-	if (rc != X86EMUL_CONTINUE)
-		return rc;
-
-	/*
-	 * 64 bit host will restore XMM 8-15, which is not correct on non-64
-	 * bit guests.  Load the current values in order to preserve 64 bit
-	 * XMMs after fxrstor.
-	 */
-#ifdef CONFIG_X86_64
-	/* XXX: accessing XMM 8-15 very awkwardly */
-	memcpy(&new->xmm_space[8 * 16/4], &old.xmm_space[8 * 16/4], 8 * 16);
-#endif
-
-	/*
-	 * Hardware doesn't save and restore XMM 0-7 without CR4.OSFXSR, but
-	 * does save and restore MXCSR.
-	 */
-	if (!(ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR))
-		memcpy(new->xmm_space, old.xmm_space, 8 * 16);
-
-	return rc;
-}
-
 static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
 {
 	struct fxregs_state fx_state;
 	int rc;
+	unsigned int size;
 
 	rc = check_fxsr(ctxt);
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
 
-	rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, 512);
+	ctxt->ops->get_fpu(ctxt);
+
+	if (ctxt->mode < X86EMUL_MODE_PROT64) {
+		rc = asm_safe("fxsave %[fx]", , [fx] "+m"(fx_state));
+		if (rc != X86EMUL_CONTINUE)
+			return rc;
+		/*
+		 * Hardware doesn't save and restore XMM 0-7 without
+		 * CR4.OSFXSR, but does save and restore MXCSR.
+		 */
+		if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR)
+			size = offsetof(struct fxregs_state, xmm_space[8]);
+		else
+			size = offsetof(struct fxregs_state, xmm_space[0]);
+	} else if (ctxt->mode == X86EMUL_MODE_PROT64)
+		size = offsetof(struct fxregs_state, xmm_space[16]);
+
+	rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
 
 	if (fx_state.mxcsr >> 16)
 		return emulate_gp(ctxt, 0);
 
-	ctxt->ops->get_fpu(ctxt);
-
-	if (ctxt->mode < X86EMUL_MODE_PROT64)
-		rc = fxrstor_fixup(ctxt, &fx_state);
-
 	if (rc == X86EMUL_CONTINUE)
 		rc = asm_safe("fxrstor %[fx]", : [fx] "m"(fx_state));
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] KVM: x86: avoid large stack allocations in em_fxrstor
  2017-05-29 19:55           ` [PATCH v2] KVM: x86: avoid large stack allocations " Nick Desaulniers
@ 2017-05-29 20:14             ` kbuild test robot
  2017-05-29 20:29               ` Nick Desaulniers
  2017-05-29 20:39             ` [PATCH v3] " Nick Desaulniers
  1 sibling, 1 reply; 19+ messages in thread
From: kbuild test robot @ 2017-05-29 20:14 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: kbuild-all, Nick Desaulniers, Paolo Bonzini,
	Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, kvm,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2256 bytes --]

Hi Nick,

[auto build test WARNING on kvm/linux-next]
[also build test WARNING on v4.12-rc3 next-20170529]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Nick-Desaulniers/KVM-x86-avoid-large-stack-allocations-in-em_fxrstor/20170530-040058
base:   https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next
config: x86_64-allmodconfig (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings

All warnings (new ones prefixed by >>):

   arch/x86/kvm/emulate.c: In function 'em_fxrstor':
>> arch/x86/kvm/emulate.c:4015:5: warning: 'size' may be used uninitialized in this function [-Wmaybe-uninitialized]
     rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
     ~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

vim +/size +4015 arch/x86/kvm/emulate.c

  3999	
  4000		if (ctxt->mode < X86EMUL_MODE_PROT64) {
  4001			rc = asm_safe("fxsave %[fx]", , [fx] "+m"(fx_state));
  4002			if (rc != X86EMUL_CONTINUE)
  4003				return rc;
  4004			/*
  4005			 * Hardware doesn't save and restore XMM 0-7 without
  4006			 * CR4.OSFXSR, but does save and restore MXCSR.
  4007			 */
  4008			if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR)
  4009				size = offsetof(struct fxregs_state, xmm_space[8]);
  4010			else
  4011				size = offsetof(struct fxregs_state, xmm_space[0]);
  4012		} else if (ctxt->mode == X86EMUL_MODE_PROT64)
  4013			size = offsetof(struct fxregs_state, xmm_space[16]);
  4014	
> 4015		rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
  4016		if (rc != X86EMUL_CONTINUE)
  4017			return rc;
  4018	
  4019		if (fx_state.mxcsr >> 16)
  4020			return emulate_gp(ctxt, 0);
  4021	
  4022		if (rc == X86EMUL_CONTINUE)
  4023			rc = asm_safe("fxrstor %[fx]", : [fx] "m"(fx_state));

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 60031 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] KVM: x86: avoid large stack allocations in em_fxrstor
  2017-05-29 20:14             ` kbuild test robot
@ 2017-05-29 20:29               ` Nick Desaulniers
  0 siblings, 0 replies; 19+ messages in thread
From: Nick Desaulniers @ 2017-05-29 20:29 UTC (permalink / raw)
  To: kbuild test robot
  Cc: kbuild-all, Paolo Bonzini, Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, kvm,
	linux-kernel

On Tue, May 30, 2017 at 04:14:50AM +0800, kbuild test robot wrote:
>    arch/x86/kvm/emulate.c: In function 'em_fxrstor':
> >> arch/x86/kvm/emulate.c:4015:5: warning: 'size' may be used uninitialized in this function [-Wmaybe-uninitialized]
>      rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
>      ~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I definitely thought about this before submitting.  size is initialized
in the case ctxt->mode < X86EMUL_MODE_PROT64, and
ctxt->mode == X86EMUL_MODE_PROT64, but uninitialized in the case
ctxt->mode > X86EMUL_MODE_PROT64.

Paulo mentions this is impossible in:

https://lkml.org/lkml/2017/5/26/62

>>> ctxt->mode is never > X86EMUL_MODE_PROT64
>>> (see the definition of enum x86emul_mode in
>>> arch/x86/include/asm/kvm_emulate.h.)

>From arch/x86/include/asm/kvm_emulate.h:

272 enum x86emul_mode {
273   X86EMUL_MODE_REAL,  /* Real mode.             */
274   X86EMUL_MODE_VM86,  /* Virtual 8086 mode.     */
275   X86EMUL_MODE_PROT16,  /* 16-bit protected mode. */
276   X86EMUL_MODE_PROT32,  /* 32-bit protected mode. */
277   X86EMUL_MODE_PROT64,  /* 64-bit (long) mode.    */
278 };

I would still rather err on the side of safety, and initialize the
variable. So I will submit a v3 initializing size to 0, and check that
size was set to a reasonable value at some point before invoke
segmented_read_std.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v3] KVM: x86: avoid large stack allocations in em_fxrstor
  2017-05-29 19:55           ` [PATCH v2] KVM: x86: avoid large stack allocations " Nick Desaulniers
  2017-05-29 20:14             ` kbuild test robot
@ 2017-05-29 20:39             ` Nick Desaulniers
  2017-05-29 22:40               ` Nick Desaulniers
  2017-05-29 22:48               ` [PATCH v4] " Nick Desaulniers
  1 sibling, 2 replies; 19+ messages in thread
From: Nick Desaulniers @ 2017-05-29 20:39 UTC (permalink / raw)
  Cc: Nick Desaulniers, Paolo Bonzini, Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, kvm,
	linux-kernel

em_fxstor previously called fxstor_fixup.  Both created instances of
struct fxregs_state on the stack, which triggered the warning:

arch/x86/kvm/emulate.c:4018:12: warning: stack frame size of 1080 bytes
in function
      'em_fxrstor' [-Wframe-larger-than=]
static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
           ^
with CONFIG_FRAME_WARN set to 1024.

This patch does the fixup in em_fxstor now, avoiding one additional
struct fxregs_state, and now fxstor_fixup can be removed as it has no
other call sites.

Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com>
---
New in V3:
* initialized size to 0 to avoid maybe-uninitialized warning.  Check
  that it gets set to something other than 0 before passing size to
  segmented_read_std().

New in V2:
* reworked patch to do what was recommended by maintainers in:
  https://lkml.org/lkml/2017/5/25/391

 arch/x86/kvm/emulate.c | 58 +++++++++++++++++++-------------------------------
 1 file changed, 22 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 0816ab2e8adc..8c74a3764405 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3985,57 +3985,43 @@ static int em_fxsave(struct x86_emulate_ctxt *ctxt)
 	return segmented_write_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
 }
 
-static int fxrstor_fixup(struct x86_emulate_ctxt *ctxt,
-		struct fxregs_state *new)
-{
-	int rc = X86EMUL_CONTINUE;
-	struct fxregs_state old;
-
-	rc = asm_safe("fxsave %[fx]", , [fx] "+m"(old));
-	if (rc != X86EMUL_CONTINUE)
-		return rc;
-
-	/*
-	 * 64 bit host will restore XMM 8-15, which is not correct on non-64
-	 * bit guests.  Load the current values in order to preserve 64 bit
-	 * XMMs after fxrstor.
-	 */
-#ifdef CONFIG_X86_64
-	/* XXX: accessing XMM 8-15 very awkwardly */
-	memcpy(&new->xmm_space[8 * 16/4], &old.xmm_space[8 * 16/4], 8 * 16);
-#endif
-
-	/*
-	 * Hardware doesn't save and restore XMM 0-7 without CR4.OSFXSR, but
-	 * does save and restore MXCSR.
-	 */
-	if (!(ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR))
-		memcpy(new->xmm_space, old.xmm_space, 8 * 16);
-
-	return rc;
-}
-
 static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
 {
 	struct fxregs_state fx_state;
 	int rc;
+	unsigned int size = 0;
 
 	rc = check_fxsr(ctxt);
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
 
-	rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, 512);
+	ctxt->ops->get_fpu(ctxt);
+
+	if (ctxt->mode < X86EMUL_MODE_PROT64) {
+		rc = asm_safe("fxsave %[fx]", , [fx] "+m"(fx_state));
+		if (rc != X86EMUL_CONTINUE)
+			return rc;
+		/*
+		 * Hardware doesn't save and restore XMM 0-7 without
+		 * CR4.OSFXSR, but does save and restore MXCSR.
+		 */
+		if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR)
+			size = offsetof(struct fxregs_state, xmm_space[8]);
+		else
+			size = offsetof(struct fxregs_state, xmm_space[0]);
+	} else if (ctxt->mode == X86EMUL_MODE_PROT64)
+		size = offsetof(struct fxregs_state, xmm_space[16]);
+
+	if (size == 0)
+		return X86EMUL_UNHANDLEABLE;
+
+	rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
 
 	if (fx_state.mxcsr >> 16)
 		return emulate_gp(ctxt, 0);
 
-	ctxt->ops->get_fpu(ctxt);
-
-	if (ctxt->mode < X86EMUL_MODE_PROT64)
-		rc = fxrstor_fixup(ctxt, &fx_state);
-
 	if (rc == X86EMUL_CONTINUE)
 		rc = asm_safe("fxrstor %[fx]", : [fx] "m"(fx_state));
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v3] KVM: x86: avoid large stack allocations in em_fxrstor
  2017-05-29 20:39             ` [PATCH v3] " Nick Desaulniers
@ 2017-05-29 22:40               ` Nick Desaulniers
  2017-05-29 22:48               ` [PATCH v4] " Nick Desaulniers
  1 sibling, 0 replies; 19+ messages in thread
From: Nick Desaulniers @ 2017-05-29 22:40 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, kvm,
	linux-kernel

On Mon, May 29, 2017 at 01:39:08PM -0700, Nick Desaulniers wrote:
> +	if (ctxt->mode < X86EMUL_MODE_PROT64) {
> +		rc = asm_safe("fxsave %[fx]", , [fx] "+m"(fx_state));
> +		if (rc != X86EMUL_CONTINUE)
> +			return rc;
> +		/*
> +		 * Hardware doesn't save and restore XMM 0-7 without
> +		 * CR4.OSFXSR, but does save and restore MXCSR.
> +		 */
> +		if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR)
> +			size = offsetof(struct fxregs_state, xmm_space[8]);
> +		else
> +			size = offsetof(struct fxregs_state, xmm_space[0]);
> +	} else if (ctxt->mode == X86EMUL_MODE_PROT64)
> +		size = offsetof(struct fxregs_state, xmm_space[16]);
> +
> +	if (size == 0)
> +		return X86EMUL_UNHANDLEABLE;
> +
> +	rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
>  	if (rc != X86EMUL_CONTINUE)
>  		return rc;

Thinking more about this, I think it may be more elegant to move the
segmented_read_std into the then/else branches above, remove
initialization of size, and remove the size == 0 check.  Thoughts?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v4] KVM: x86: avoid large stack allocations in em_fxrstor
  2017-05-29 20:39             ` [PATCH v3] " Nick Desaulniers
  2017-05-29 22:40               ` Nick Desaulniers
@ 2017-05-29 22:48               ` Nick Desaulniers
  2017-05-30 10:15                 ` Paolo Bonzini
  2017-05-31  3:08                 ` [PATCH v5] " Nick Desaulniers
  1 sibling, 2 replies; 19+ messages in thread
From: Nick Desaulniers @ 2017-05-29 22:48 UTC (permalink / raw)
  Cc: Nick Desaulniers, Paolo Bonzini, Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, kvm,
	linux-kernel

em_fxstor previously called fxstor_fixup.  Both created instances of
struct fxregs_state on the stack, which triggered the warning:

arch/x86/kvm/emulate.c:4018:12: warning: stack frame size of 1080 bytes
in function
      'em_fxrstor' [-Wframe-larger-than=]
static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
           ^
with CONFIG_FRAME_WARN set to 1024.

This patch does the fixup in em_fxstor now, avoiding one additional
struct fxregs_state, and now fxstor_fixup can be removed as it has no
other call sites.

Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com>
---
New in V4:
* undo changes for V3, prefer to only call segmented_read_std() when
  size has been initialized.

New in V3:
* initialized size to 0 to avoid maybe-uninitialized warning.  Check
  that it gets set to something other than 0 before passing size to
  segmented_read_std().

New in V2:
* reworked patch to do what was recommended by maintainers in:
  https://lkml.org/lkml/2017/5/25/391

 arch/x86/kvm/emulate.c | 64 ++++++++++++++++++++------------------------------
 1 file changed, 26 insertions(+), 38 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 0816ab2e8adc..0367f7f81792 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3985,57 +3985,45 @@ static int em_fxsave(struct x86_emulate_ctxt *ctxt)
 	return segmented_write_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
 }
 
-static int fxrstor_fixup(struct x86_emulate_ctxt *ctxt,
-		struct fxregs_state *new)
-{
-	int rc = X86EMUL_CONTINUE;
-	struct fxregs_state old;
-
-	rc = asm_safe("fxsave %[fx]", , [fx] "+m"(old));
-	if (rc != X86EMUL_CONTINUE)
-		return rc;
-
-	/*
-	 * 64 bit host will restore XMM 8-15, which is not correct on non-64
-	 * bit guests.  Load the current values in order to preserve 64 bit
-	 * XMMs after fxrstor.
-	 */
-#ifdef CONFIG_X86_64
-	/* XXX: accessing XMM 8-15 very awkwardly */
-	memcpy(&new->xmm_space[8 * 16/4], &old.xmm_space[8 * 16/4], 8 * 16);
-#endif
-
-	/*
-	 * Hardware doesn't save and restore XMM 0-7 without CR4.OSFXSR, but
-	 * does save and restore MXCSR.
-	 */
-	if (!(ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR))
-		memcpy(new->xmm_space, old.xmm_space, 8 * 16);
-
-	return rc;
-}
-
 static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
 {
 	struct fxregs_state fx_state;
 	int rc;
+	unsigned int size;
 
 	rc = check_fxsr(ctxt);
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
 
-	rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, 512);
-	if (rc != X86EMUL_CONTINUE)
-		return rc;
+	ctxt->ops->get_fpu(ctxt);
+
+	if (ctxt->mode < X86EMUL_MODE_PROT64) {
+		rc = asm_safe("fxsave %[fx]", , [fx] "+m"(fx_state));
+		if (rc != X86EMUL_CONTINUE)
+			return rc;
+		/*
+		 * Hardware doesn't save and restore XMM 0-7 without
+		 * CR4.OSFXSR, but does save and restore MXCSR.
+		 */
+		if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR)
+			size = offsetof(struct fxregs_state, xmm_space[8]);
+		else
+			size = offsetof(struct fxregs_state, xmm_space[0]);
+		rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state,
+			size);
+		if (rc != X86EMUL_CONTINUE)
+			return rc;
+	} else if (ctxt->mode == X86EMUL_MODE_PROT64) {
+		size = offsetof(struct fxregs_state, xmm_space[16]);
+		rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state,
+			size);
+		if (rc != X86EMUL_CONTINUE)
+			return rc;
+	}
 
 	if (fx_state.mxcsr >> 16)
 		return emulate_gp(ctxt, 0);
 
-	ctxt->ops->get_fpu(ctxt);
-
-	if (ctxt->mode < X86EMUL_MODE_PROT64)
-		rc = fxrstor_fixup(ctxt, &fx_state);
-
 	if (rc == X86EMUL_CONTINUE)
 		rc = asm_safe("fxrstor %[fx]", : [fx] "m"(fx_state));
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v4] KVM: x86: avoid large stack allocations in em_fxrstor
  2017-05-29 22:48               ` [PATCH v4] " Nick Desaulniers
@ 2017-05-30 10:15                 ` Paolo Bonzini
  2017-05-30 14:05                   ` Radim Krčmář
  2017-05-31  3:08                 ` [PATCH v5] " Nick Desaulniers
  1 sibling, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2017-05-30 10:15 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, kvm,
	linux-kernel



On 30/05/2017 00:48, Nick Desaulniers wrote:
> em_fxstor previously called fxstor_fixup.  Both created instances of
> struct fxregs_state on the stack, which triggered the warning:
> 
> arch/x86/kvm/emulate.c:4018:12: warning: stack frame size of 1080 bytes
> in function
>       'em_fxrstor' [-Wframe-larger-than=]
> static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
>            ^
> with CONFIG_FRAME_WARN set to 1024.
> 
> This patch does the fixup in em_fxstor now, avoiding one additional
> struct fxregs_state, and now fxstor_fixup can be removed as it has no
> other call sites.
> 
> Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com>
> ---
> New in V4:
> * undo changes for V3, prefer to only call segmented_read_std() when
>   size has been initialized.
> 
> New in V3:
> * initialized size to 0 to avoid maybe-uninitialized warning.  Check
>   that it gets set to something other than 0 before passing size to
>   segmented_read_std().
> 
> New in V2:
> * reworked patch to do what was recommended by maintainers in:
>   https://lkml.org/lkml/2017/5/25/391
> 
>  arch/x86/kvm/emulate.c | 64 ++++++++++++++++++++------------------------------
>  1 file changed, 26 insertions(+), 38 deletions(-)
> 
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 0816ab2e8adc..0367f7f81792 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -3985,57 +3985,45 @@ static int em_fxsave(struct x86_emulate_ctxt *ctxt)
>  	return segmented_write_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
>  }
>  
> -static int fxrstor_fixup(struct x86_emulate_ctxt *ctxt,
> -		struct fxregs_state *new)
> -{
> -	int rc = X86EMUL_CONTINUE;
> -	struct fxregs_state old;
> -
> -	rc = asm_safe("fxsave %[fx]", , [fx] "+m"(old));
> -	if (rc != X86EMUL_CONTINUE)
> -		return rc;
> -
> -	/*
> -	 * 64 bit host will restore XMM 8-15, which is not correct on non-64
> -	 * bit guests.  Load the current values in order to preserve 64 bit
> -	 * XMMs after fxrstor.
> -	 */
> -#ifdef CONFIG_X86_64
> -	/* XXX: accessing XMM 8-15 very awkwardly */
> -	memcpy(&new->xmm_space[8 * 16/4], &old.xmm_space[8 * 16/4], 8 * 16);
> -#endif
> -
> -	/*
> -	 * Hardware doesn't save and restore XMM 0-7 without CR4.OSFXSR, but
> -	 * does save and restore MXCSR.
> -	 */
> -	if (!(ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR))
> -		memcpy(new->xmm_space, old.xmm_space, 8 * 16);
> -
> -	return rc;
> -}
> -
>  static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
>  {
>  	struct fxregs_state fx_state;
>  	int rc;
> +	unsigned int size;
>  
>  	rc = check_fxsr(ctxt);
>  	if (rc != X86EMUL_CONTINUE)
>  		return rc;
>  
> -	rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, 512);
> -	if (rc != X86EMUL_CONTINUE)
> -		return rc;
> +	ctxt->ops->get_fpu(ctxt);
> +
> +	if (ctxt->mode < X86EMUL_MODE_PROT64) {
> +		rc = asm_safe("fxsave %[fx]", , [fx] "+m"(fx_state));
> +		if (rc != X86EMUL_CONTINUE)
> +			return rc;
> +		/*
> +		 * Hardware doesn't save and restore XMM 0-7 without
> +		 * CR4.OSFXSR, but does save and restore MXCSR.
> +		 */
> +		if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR)
> +			size = offsetof(struct fxregs_state, xmm_space[8]);
> +		else
> +			size = offsetof(struct fxregs_state, xmm_space[0]);
> +		rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state,
> +			size);
> +		if (rc != X86EMUL_CONTINUE)
> +			return rc;
> +	} else if (ctxt->mode == X86EMUL_MODE_PROT64) {
> +		size = offsetof(struct fxregs_state, xmm_space[16]);
> +		rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state,
> +			size);
> +		if (rc != X86EMUL_CONTINUE)
> +			return rc;
> +	}

You can just remove the "if (ctxt->mode == X86EMUL_MODE_PROT64)".  This
way, the call of segmented_read_std and the "if (rc !=
X86EMUL_CONTINUE)" can move outside the conditional.

Thanks,

Paolo

>  	if (fx_state.mxcsr >> 16)
>  		return emulate_gp(ctxt, 0);
>  
> -	ctxt->ops->get_fpu(ctxt);
> -
> -	if (ctxt->mode < X86EMUL_MODE_PROT64)
> -		rc = fxrstor_fixup(ctxt, &fx_state);
> -
>  	if (rc == X86EMUL_CONTINUE)
>  		rc = asm_safe("fxrstor %[fx]", : [fx] "m"(fx_state));
>  
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4] KVM: x86: avoid large stack allocations in em_fxrstor
  2017-05-30 10:15                 ` Paolo Bonzini
@ 2017-05-30 14:05                   ` Radim Krčmář
  0 siblings, 0 replies; 19+ messages in thread
From: Radim Krčmář @ 2017-05-30 14:05 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86, kvm, linux-kernel

2017-05-30 12:15+0200, Paolo Bonzini:
> On 30/05/2017 00:48, Nick Desaulniers wrote:
>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>> @@ -3985,57 +3985,45 @@ static int em_fxsave(struct x86_emulate_ctxt *ctxt)
>>  static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
>>  {
>>  	struct fxregs_state fx_state;
>>  	int rc;
>> +	unsigned int size;
>>  
>>  	rc = check_fxsr(ctxt);
>>  	if (rc != X86EMUL_CONTINUE)
>>  		return rc;
>>  
>> -	rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, 512);
>> -	if (rc != X86EMUL_CONTINUE)
>> -		return rc;
>> +	ctxt->ops->get_fpu(ctxt);
>> +
>> +	if (ctxt->mode < X86EMUL_MODE_PROT64) {
>> +		rc = asm_safe("fxsave %[fx]", , [fx] "+m"(fx_state));
>> +		if (rc != X86EMUL_CONTINUE)
>> +			return rc;

Please call ctxt->ops->put_fpu() before returning.
(Don't be afraid to use 'goto', the will likely be more of them.)

>> +		/*
>> +		 * Hardware doesn't save and restore XMM 0-7 without
>> +		 * CR4.OSFXSR, but does save and restore MXCSR.
>> +		 */
>> +		if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR)
>> +			size = offsetof(struct fxregs_state, xmm_space[8]);

This should be the size of first 8 XMM registers, but xmm_space is of
type u32, so the correct size is
  xmm_space[8 * 16/sizeof(*fx_state.xmm_space)].

Adding a separate function to compute the size and call it from
em_fxrstor and em_fxsave would make sense at this point.

>> +		else
>> +			size = offsetof(struct fxregs_state, xmm_space[0]);
>> +		rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state,
>> +			size);
>> +		if (rc != X86EMUL_CONTINUE)
>> +			return rc;
>> +	} else if (ctxt->mode == X86EMUL_MODE_PROT64) {
>> +		size = offsetof(struct fxregs_state, xmm_space[16]);
>> +		rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state,
>> +			size);
>> +		if (rc != X86EMUL_CONTINUE)
>> +			return rc;
>> +	}
> 
> You can just remove the "if (ctxt->mode == X86EMUL_MODE_PROT64)".  This
> way, the call of segmented_read_std and the "if (rc !=
> X86EMUL_CONTINUE)" can move outside the conditional.

Good idea.  check_fxsr() doesn't allow X86EMUL_MODE_PROT64 and the
condition was an artefact from older iterations.

I'll refresh the kvm-unit-test ([kvm-unit-tests PATCH] x86: realmode:
add FXSR tests).

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v5] KVM: x86: avoid large stack allocations in em_fxrstor
  2017-05-29 22:48               ` [PATCH v4] " Nick Desaulniers
  2017-05-30 10:15                 ` Paolo Bonzini
@ 2017-05-31  3:08                 ` Nick Desaulniers
  2017-05-31 11:01                   ` Paolo Bonzini
  1 sibling, 1 reply; 19+ messages in thread
From: Nick Desaulniers @ 2017-05-31  3:08 UTC (permalink / raw)
  Cc: Nick Desaulniers, Paolo Bonzini, Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, kvm,
	linux-kernel

em_fxstor previously called fxstor_fixup.  Both created instances of
struct fxregs_state on the stack, which triggered the warning:

arch/x86/kvm/emulate.c:4018:12: warning: stack frame size of 1080 bytes
in function
      'em_fxrstor' [-Wframe-larger-than=]
static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
           ^
with CONFIG_FRAME_WARN set to 1024.

This patch does the fixup in em_fxstor now, avoiding one additional
struct fxregs_state, and now fxstor_fixup can be removed as it has no
other call sites.

Further, the calculation for offsets into xmm_space can be shared
between em_fxstor and em_fxsave.

Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com>
---
New in V5:
* Updated commit message.
* moved offset calculation from em_fxstor and em_fxsave into new helper,
  xmm_offset.
* Always call put_fpu after a get_fpu.
* remove temporary size from em_fxsave.
* move segmented_read_std to after conditional in em_fxstor as per feedback.

New in V4:
* undo changes for V3, prefer to only call segmented_read_std() when
  size has been initialized.

New in V3:
* initialized size to 0 to avoid maybe-uninitialized warning.  Check
  that it gets set to something other than 0 before passing size to
  segmented_read_std().

New in V2:
* reworked patch to do what was recommended by maintainers in:
  https://lkml.org/lkml/2017/5/25/391

 arch/x86/kvm/emulate.c | 75 ++++++++++++++++++++------------------------------
 1 file changed, 30 insertions(+), 45 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 0816ab2e8adc..55470ad07c2a 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3941,6 +3941,16 @@ static int check_fxsr(struct x86_emulate_ctxt *ctxt)
 }
 
 /*
+ * Hardware doesn't save and restore XMM 0-7 without CR4.OSFXSR, but does save
+ * and restore MXCSR.
+ */
+static size_t xmm_offset(struct x86_emulate_ctxt *ctxt)
+{
+	bool b = ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR;
+	return offsetof(struct fxregs_state, xmm_space[b ? 8 * 16 / 4 : 0]);
+}
+
+/*
  * FXSAVE and FXRSTOR have 4 different formats depending on execution mode,
  *  1) 16 bit mode
  *  2) 32 bit mode
@@ -3961,7 +3971,6 @@ static int check_fxsr(struct x86_emulate_ctxt *ctxt)
 static int em_fxsave(struct x86_emulate_ctxt *ctxt)
 {
 	struct fxregs_state fx_state;
-	size_t size;
 	int rc;
 
 	rc = check_fxsr(ctxt);
@@ -3977,68 +3986,44 @@ static int em_fxsave(struct x86_emulate_ctxt *ctxt)
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
 
-	if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR)
-		size = offsetof(struct fxregs_state, xmm_space[8 * 16/4]);
-	else
-		size = offsetof(struct fxregs_state, xmm_space[0]);
-
-	return segmented_write_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
-}
-
-static int fxrstor_fixup(struct x86_emulate_ctxt *ctxt,
-		struct fxregs_state *new)
-{
-	int rc = X86EMUL_CONTINUE;
-	struct fxregs_state old;
-
-	rc = asm_safe("fxsave %[fx]", , [fx] "+m"(old));
-	if (rc != X86EMUL_CONTINUE)
-		return rc;
-
-	/*
-	 * 64 bit host will restore XMM 8-15, which is not correct on non-64
-	 * bit guests.  Load the current values in order to preserve 64 bit
-	 * XMMs after fxrstor.
-	 */
-#ifdef CONFIG_X86_64
-	/* XXX: accessing XMM 8-15 very awkwardly */
-	memcpy(&new->xmm_space[8 * 16/4], &old.xmm_space[8 * 16/4], 8 * 16);
-#endif
-
-	/*
-	 * Hardware doesn't save and restore XMM 0-7 without CR4.OSFXSR, but
-	 * does save and restore MXCSR.
-	 */
-	if (!(ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR))
-		memcpy(new->xmm_space, old.xmm_space, 8 * 16);
-
-	return rc;
+	return segmented_write_std(ctxt, ctxt->memop.addr.mem, &fx_state,
+		xmm_offset(ctxt));
 }
 
 static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
 {
 	struct fxregs_state fx_state;
 	int rc;
+	size_t size;
 
 	rc = check_fxsr(ctxt);
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
 
-	rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, 512);
-	if (rc != X86EMUL_CONTINUE)
-		return rc;
+	ctxt->ops->get_fpu(ctxt);
 
-	if (fx_state.mxcsr >> 16)
-		return emulate_gp(ctxt, 0);
+	if (ctxt->mode < X86EMUL_MODE_PROT64) {
+		rc = asm_safe("fxsave %[fx]", , [fx] "+m"(fx_state));
+		if (rc != X86EMUL_CONTINUE)
+			goto out;
+		size = xmm_offset(ctxt);
+	} else {
+		size = offsetof(struct fxregs_state, xmm_space[16]);
+	}
 
-	ctxt->ops->get_fpu(ctxt);
+	rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
+	if (rc != X86EMUL_CONTINUE)
+		goto out;
 
-	if (ctxt->mode < X86EMUL_MODE_PROT64)
-		rc = fxrstor_fixup(ctxt, &fx_state);
+	if (fx_state.mxcsr >> 16) {
+		rc = emulate_gp(ctxt, 0);
+		goto out;
+	}
 
 	if (rc == X86EMUL_CONTINUE)
 		rc = asm_safe("fxrstor %[fx]", : [fx] "m"(fx_state));
 
+out:
 	ctxt->ops->put_fpu(ctxt);
 
 	return rc;
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v5] KVM: x86: avoid large stack allocations in em_fxrstor
  2017-05-31  3:08                 ` [PATCH v5] " Nick Desaulniers
@ 2017-05-31 11:01                   ` Paolo Bonzini
  2017-06-01  1:05                     ` Nick Desaulniers
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2017-05-31 11:01 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, kvm,
	linux-kernel


> +		size = offsetof(struct fxregs_state, xmm_space[16]);

This still has the same issue (it should be multiplied by 4).  Here's my
take on it; I checked the compiled code and it's pretty good too (the
compiler knows to do the fxsave if and only if ctxt->mode <
X86EMUL_MODE_PROT64, because that's when the size is smaller):

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 55470ad07c2a..b76f19d2684d 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3944,10 +3944,19 @@ static int check_fxsr(struct x86_emulate_ctxt *ctxt)
  * Hardware doesn't save and restore XMM 0-7 without CR4.OSFXSR, but does save
  * and restore MXCSR.
  */
-static size_t xmm_offset(struct x86_emulate_ctxt *ctxt)
+static size_t __fxstate_size(int nregs)
 {
-	bool b = ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR;
-	return offsetof(struct fxregs_state, xmm_space[b ? 8 * 16 / 4 : 0]);
+	return offsetof(struct fxregs_state, xmm_space[0]) + nregs * 16;
+}
+
+static inline size_t fxstate_size(struct x86_emulate_ctxt *ctxt)
+{
+	bool cr4_osfxsr;
+	if (ctxt->mode == X86EMUL_MODE_PROT64)
+		return __fxstate_size(16);
+
+	cr4_osfxsr = ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR;
+	return __fxstate_size(cr4_osfxsr ? 8 : 0);
 }
 
 /*
@@ -3987,7 +3996,7 @@ static int em_fxsave(struct x86_emulate_ctxt *ctxt)
 		return rc;
 
 	return segmented_write_std(ctxt, ctxt->memop.addr.mem, &fx_state,
-		xmm_offset(ctxt));
+		                   fxstate_size(ctxt));
 }
 
 static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
@@ -4002,13 +4011,11 @@ static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
 
 	ctxt->ops->get_fpu(ctxt);
 
-	if (ctxt->mode < X86EMUL_MODE_PROT64) {
+	size = fxstate_size(ctxt);
+	if (size < __fxstate_size(16)) {
 		rc = asm_safe("fxsave %[fx]", , [fx] "+m"(fx_state));
 		if (rc != X86EMUL_CONTINUE)
 			goto out;
-		size = xmm_offset(ctxt);
-	} else {
-		size = offsetof(struct fxregs_state, xmm_space[16]);
 	}
 
 	rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);

Thanks Nick for the patches and Radim for the reviews!

Paolo

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v5] KVM: x86: avoid large stack allocations in em_fxrstor
  2017-05-31 11:01                   ` Paolo Bonzini
@ 2017-06-01  1:05                     ` Nick Desaulniers
  2017-06-01  7:36                       ` Paolo Bonzini
  0 siblings, 1 reply; 19+ messages in thread
From: Nick Desaulniers @ 2017-06-01  1:05 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, kvm,
	linux-kernel

On Wed, May 31, 2017 at 07:01:29AM -0400, Paolo Bonzini wrote:
> > +		size = offsetof(struct fxregs_state, xmm_space[16]);
> This still has the same issue (it should be multiplied by 4).

I'm still misunderstanding the math here.

Why multiplied by four, in this case? 8 * 16 / 4 is used in other cases.

Also, previously Radim wrote:

>> +      size = offsetof(struct fxregs_state, xmm_space[8]);
> This should be the size of first 8 XMM registers, but xmm_space is of
> type u32, so the correct size is
>   xmm_space[8 * 16/sizeof(*fx_state.xmm_space)].

So I think my calculation is off in xmm_offset still?  Can we make use
of well-named variables, in place of these constants? Otherwise the math
is hard to follow.

> Thanks Nick for the patches and Radim for the reviews!
> Paolo

Thanks for the code review!

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5] KVM: x86: avoid large stack allocations in em_fxrstor
  2017-06-01  1:05                     ` Nick Desaulniers
@ 2017-06-01  7:36                       ` Paolo Bonzini
  2017-06-02  2:10                         ` Nick Desaulniers
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2017-06-01  7:36 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, kvm,
	linux-kernel



On 01/06/2017 03:05, Nick Desaulniers wrote:
> On Wed, May 31, 2017 at 07:01:29AM -0400, Paolo Bonzini wrote:
>>> +		size = offsetof(struct fxregs_state, xmm_space[16]);
>> This still has the same issue (it should be multiplied by 4).
> 
> I'm still misunderstanding the math here.
> 
> Why multiplied by four, in this case? 8 * 16 / 4 is used in other cases.

*16/4 is the same as *4. :)

Paolo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5] KVM: x86: avoid large stack allocations in em_fxrstor
  2017-06-01  7:36                       ` Paolo Bonzini
@ 2017-06-02  2:10                         ` Nick Desaulniers
  0 siblings, 0 replies; 19+ messages in thread
From: Nick Desaulniers @ 2017-06-02  2:10 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, kvm,
	linux-kernel

On Thu, Jun 01, 2017 at 09:36:18AM +0200, Paolo Bonzini wrote:
> On 01/06/2017 03:05, Nick Desaulniers wrote:
> > On Wed, May 31, 2017 at 07:01:29AM -0400, Paolo Bonzini wrote:
> >>> +		size = offsetof(struct fxregs_state, xmm_space[16]);
> >> This still has the same issue (it should be multiplied by 4).
> > 
> > I'm still misunderstanding the math here.
> > 
> > Why multiplied by four, in this case? 8 * 16 / 4 is used in other cases.
> 
> *16/4 is the same as *4. :)

I meant the use of an expression full of literals rather than either a
single literal or an expression formed from well named variables seemed
kind of like a code smell, but w/e.  Patch inbound.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2017-06-02  2:10 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-24  6:24 [PATCH] KVM: x86: dynamically allocate large struct in em_fxrstor Nick Desaulniers
2017-05-24 14:19 ` Radim Krčmář
2017-05-25  1:36   ` Nick Desaulniers
2017-05-25 14:07     ` Paolo Bonzini
2017-05-26  4:13       ` Nick Desaulniers
2017-05-26  7:18         ` Paolo Bonzini
2017-05-29 19:55           ` [PATCH v2] KVM: x86: avoid large stack allocations " Nick Desaulniers
2017-05-29 20:14             ` kbuild test robot
2017-05-29 20:29               ` Nick Desaulniers
2017-05-29 20:39             ` [PATCH v3] " Nick Desaulniers
2017-05-29 22:40               ` Nick Desaulniers
2017-05-29 22:48               ` [PATCH v4] " Nick Desaulniers
2017-05-30 10:15                 ` Paolo Bonzini
2017-05-30 14:05                   ` Radim Krčmář
2017-05-31  3:08                 ` [PATCH v5] " Nick Desaulniers
2017-05-31 11:01                   ` Paolo Bonzini
2017-06-01  1:05                     ` Nick Desaulniers
2017-06-01  7:36                       ` Paolo Bonzini
2017-06-02  2:10                         ` Nick Desaulniers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).