* [PATCH] KVM: nVMX: properly pad struct kvm_vmx_nested_state_hdr
@ 2020-07-13  8:28 Vitaly Kuznetsov
  2020-07-13 15:17 ` Sean Christopherson
  0 siblings, 1 reply; 7+ messages in thread
From: Vitaly Kuznetsov @ 2020-07-13  8:28 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, linux-kernel

Holes in structs which are userspace ABI are undesirable.

Fixes: 83d31e5271ac ("KVM: nVMX: fixes for preemption timer migration")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 Documentation/virt/kvm/api.rst  | 2 +-
 arch/x86/include/uapi/asm/kvm.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 320788f81a05..7beccda11978 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -4345,7 +4345,7 @@ Errors:
 	struct {
 		__u16 flags;
 	} smm;
-
+	__u16 pad;
 	__u32 flags;
 	__u64 preemption_timer_deadline;
   };
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 0780f97c1850..aae3df1cbd01 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -414,7 +414,7 @@ struct kvm_vmx_nested_state_hdr {
 	struct {
 		__u16 flags;
 	} smm;
-
+	__u16 pad;
 	__u32 flags;
 	__u64 preemption_timer_deadline;
 };
-- 
2.25.4



* Re: [PATCH] KVM: nVMX: properly pad struct kvm_vmx_nested_state_hdr
  2020-07-13  8:28 [PATCH] KVM: nVMX: properly pad struct kvm_vmx_nested_state_hdr Vitaly Kuznetsov
@ 2020-07-13 15:17 ` Sean Christopherson
  2020-07-13 15:54   ` Vitaly Kuznetsov
  0 siblings, 1 reply; 7+ messages in thread
From: Sean Christopherson @ 2020-07-13 15:17 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, linux-kernel

On Mon, Jul 13, 2020 at 10:28:24AM +0200, Vitaly Kuznetsov wrote:
> Holes in structs which are userspace ABI are undesirable.
> 
> Fixes: 83d31e5271ac ("KVM: nVMX: fixes for preemption timer migration")
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  Documentation/virt/kvm/api.rst  | 2 +-
>  arch/x86/include/uapi/asm/kvm.h | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 320788f81a05..7beccda11978 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -4345,7 +4345,7 @@ Errors:
>  	struct {
>  		__u16 flags;
>  	} smm;
> -
> +	__u16 pad;

I don't think this is sufficient.  Before 83d31e5271ac, the struct was:

	struct kvm_vmx_nested_state_hdr {
		__u64 vmxon_pa;
		__u64 vmcs12_pa;

		struct {
			__u16 flags;
		} smm;
	};

which most/all compilers will pad out to 24 bytes on a 64-bit system.  And
although smm.flags is padded to 8 bytes, it's initialized as a 2 byte value.

714             struct kvm_vmx_nested_state_hdr boo;
715             u64 val;
716
717             BUILD_BUG_ON(sizeof(boo) != 3*8);
718             boo.smm.flags = 0;
   0xffffffff810148a9 <+41>:    xor    %eax,%eax
   0xffffffff810148ab <+43>:    mov    %ax,0x18(%rsp)

719
720             val = *((volatile u64 *)(&boo.smm.flags));
   0xffffffff810148b0 <+48>:    mov    0x18(%rsp),%rax


Which means that userspace built for the old kernel will potentially send in
garbage for the new 'flags' field due to it being uninitialized stack data,
even with the layout after this patch.

	struct kvm_vmx_nested_state_hdr {
		__u64 vmxon_pa;
		__u64 vmcs12_pa;

		struct {
			__u16 flags;
		} smm;
		__u16 pad;
		__u32 flags;
		__u64 preemption_timer_deadline;
	};

So to be backwards compatible I believe we need to add a __u32 pad as well,
and to not cause internal padding issues, either make the new 'flags' a
__u64 or pad that as well (or add and check a reserved __u32).  Making flags
a __u64 seems like the least wasteful approach, e.g.

	struct kvm_vmx_nested_state_hdr {
		__u64 vmxon_pa;
		__u64 vmcs12_pa;

		struct {
			__u16 flags;
		} smm;
		__u16 pad16;
		__u32 pad32;
		__u64 flags;
		__u64 preemption_timer_deadline;
	};


>  	__u32 flags;
>  	__u64 preemption_timer_deadline;
>    };
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index 0780f97c1850..aae3df1cbd01 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -414,7 +414,7 @@ struct kvm_vmx_nested_state_hdr {
>  	struct {
>  		__u16 flags;
>  	} smm;
> -
> +	__u16 pad;
>  	__u32 flags;
>  	__u64 preemption_timer_deadline;
>  };
> -- 
> 2.25.4
> 


* Re: [PATCH] KVM: nVMX: properly pad struct kvm_vmx_nested_state_hdr
  2020-07-13 15:17 ` Sean Christopherson
@ 2020-07-13 15:54   ` Vitaly Kuznetsov
  2020-07-27 11:43     ` Paolo Bonzini
  0 siblings, 1 reply; 7+ messages in thread
From: Vitaly Kuznetsov @ 2020-07-13 15:54 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, linux-kernel

Sean Christopherson <sean.j.christopherson@intel.com> writes:

> On Mon, Jul 13, 2020 at 10:28:24AM +0200, Vitaly Kuznetsov wrote:
>> Holes in structs which are userspace ABI are undesirable.
>> 
>> Fixes: 83d31e5271ac ("KVM: nVMX: fixes for preemption timer migration")
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  Documentation/virt/kvm/api.rst  | 2 +-
>>  arch/x86/include/uapi/asm/kvm.h | 2 +-
>>  2 files changed, 2 insertions(+), 2 deletions(-)
>> 
>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>> index 320788f81a05..7beccda11978 100644
>> --- a/Documentation/virt/kvm/api.rst
>> +++ b/Documentation/virt/kvm/api.rst
>> @@ -4345,7 +4345,7 @@ Errors:
>>  	struct {
>>  		__u16 flags;
>>  	} smm;
>> -
>> +	__u16 pad;
>
> I don't think this is sufficient.  Before 83d31e5271ac, the struct was:
>

Before 850448f35aaf. Thanks, I was too lazy to check that.

> 	struct kvm_vmx_nested_state_hdr {
> 		__u64 vmxon_pa;
> 		__u64 vmcs12_pa;
>
> 		struct {
> 			__u16 flags;
> 		} smm;
> 	};
>
> which most/all compilers will pad out to 24 bytes on a 64-bit system.  And
> although smm.flags is padded to 8 bytes, it's initialized as a 2 byte value.
>
> 714             struct kvm_vmx_nested_state_hdr boo;
> 715             u64 val;
> 716
> 717             BUILD_BUG_ON(sizeof(boo) != 3*8);
> 718             boo.smm.flags = 0;
>    0xffffffff810148a9 <+41>:    xor    %eax,%eax
>    0xffffffff810148ab <+43>:    mov    %ax,0x18(%rsp)
>
> 719
> 720             val = *((volatile u64 *)(&boo.smm.flags));
>    0xffffffff810148b0 <+48>:    mov    0x18(%rsp),%rax
>
>
> Which means that userspace built for the old kernel will potentially send in
> garbage for the new 'flags' field due to it being uninitialized stack data,
> even with the layout after this patch.
>
> 	struct kvm_vmx_nested_state_hdr {
> 		__u64 vmxon_pa;
> 		__u64 vmcs12_pa;
>
> 		struct {
> 			__u16 flags;
> 		} smm;
> 		__u16 pad;
> 		__u32 flags;
> 		__u64 preemption_timer_deadline;
> 	};
>
> So to be backwards compatible I believe we need to add a __u32 pad as well,
> and to not cause internal padding issues, either make the new 'flags' a
> __u64 or pad that as well (or add and check a reserved __u32).  Making flags
> a __u64 seems like the least wasteful approach, e.g.
>
> 	struct kvm_vmx_nested_state_hdr {
> 		__u64 vmxon_pa;
> 		__u64 vmcs12_pa;
>
> 		struct {
> 			__u16 flags;
> 		} smm;
> 		__u16 pad16;
> 		__u32 pad32;
> 		__u64 flags;
> 		__u64 preemption_timer_deadline;
> 	};

I see and I agree but the fix like that needs to get into 5.8 or an ABI
breakage is guaranteed. I'll send v2 immediately, hope Paolo will take a
look.

>
>
>>  	__u32 flags;
>>  	__u64 preemption_timer_deadline;
>>    };
>> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
>> index 0780f97c1850..aae3df1cbd01 100644
>> --- a/arch/x86/include/uapi/asm/kvm.h
>> +++ b/arch/x86/include/uapi/asm/kvm.h
>> @@ -414,7 +414,7 @@ struct kvm_vmx_nested_state_hdr {
>>  	struct {
>>  		__u16 flags;
>>  	} smm;
>> -
>> +	__u16 pad;
>>  	__u32 flags;
>>  	__u64 preemption_timer_deadline;
>>  };
>> -- 
>> 2.25.4
>> 
>

-- 
Vitaly



* Re: [PATCH] KVM: nVMX: properly pad struct kvm_vmx_nested_state_hdr
  2020-07-13 15:54   ` Vitaly Kuznetsov
@ 2020-07-27 11:43     ` Paolo Bonzini
  2020-07-27 15:46       ` Sean Christopherson
  0 siblings, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2020-07-27 11:43 UTC (permalink / raw)
  To: Vitaly Kuznetsov, Sean Christopherson
  Cc: kvm, Wanpeng Li, Jim Mattson, linux-kernel

On 13/07/20 17:54, Vitaly Kuznetsov wrote:
> Which means that userspace built for the old kernel will potentially send in
> garbage for the new 'flags' field due to it being uninitialized stack data,
> even with the layout after this patch.

It might as well send it now if the code didn't attempt to zero the
struct before filling it in (this is another good reason to use a
"flags" field to say what's been filled in).  I don't think special
casing padding is particularly useful; C11 for example requires
designated initializers to fill padding with zero bits[1] and even
before it's always been considered good behavior to use memset.

Paolo

[1]  It says: "If an object that has static or thread storage duration
is not initialized explicitly, then [...] any padding is initialized to
zero bits" and even for non-static objects, "If there are fewer
initializers in a brace-enclosed list than there are elements or members
of an aggregate [...] the remainder of the aggregate shall be
initialized implicitly the same as objects that have static storage
duration".



* Re: [PATCH] KVM: nVMX: properly pad struct kvm_vmx_nested_state_hdr
  2020-07-27 11:43     ` Paolo Bonzini
@ 2020-07-27 15:46       ` Sean Christopherson
  2020-07-27 16:16         ` Paolo Bonzini
  0 siblings, 1 reply; 7+ messages in thread
From: Sean Christopherson @ 2020-07-27 15:46 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, kvm, Wanpeng Li, Jim Mattson, linux-kernel

On Mon, Jul 27, 2020 at 01:43:34PM +0200, Paolo Bonzini wrote:
> On 13/07/20 17:54, Vitaly Kuznetsov wrote:
> > Which means that userspace built for the old kernel will potentially send in
> > garbage for the new 'flags' field due to it being uninitialized stack data,
> > even with the layout after this patch.
> 
> It might as well send it now if the code didn't attempt to zero the
> struct before filling it in (this is another good reason to use a
> "flags" field to say what's been filled in).

The issue is that flags itself could hold garbage.

https://lkml.kernel.org/r/20200713151750.GA29901@linux.intel.com

>  I don't think special
> casing padding is particularly useful; C11 for example requires
> designated initializers to fill padding with zero bits[1] and even
> before it's always been considered good behavior to use memset.
> 
> Paolo
> 
> [1]  It says: "If an object that has static or thread storage duration
> is not initialized explicitly, then [...] any padding is initialized to
> zero bits" 

static and per-thread storage is unlikely to be relevant here.

> and even for non-static objects, "If there are fewer
> initializers in a brace-enclosed list than there are elements or members
> of an aggregate [...] the remainder of the aggregate shall be
> initialized implicitly the same as objects that have static storage
> duration".

That's specifically talking about members, not unused/padded space, e.g.
smm.flags (in the old struct) must be zeroed with this, but it doesn't
say anything about initializing padding.

  struct kvm_vmx_nested_state_hdr hdr = {
      .vmxon_pa = root,
      .vmcs12_pa = vmcs12,
  };

QEMU won't see issues because it zero allocates the entire nested state.

All the above being said, after looking at the whole picture I think padding
the header is a moot point.  The header is padded out to 120 bytes[*] when
including in the full nested state, and KVM only ever consumes the header in
the context of the full nested state.  I.e. if there's garbage at offset 6,
odds are there's going to be garbage at offset 18, so internally padding the
header does nothing.

KVM should be checking that the unused bytes of (sizeof(pad) - sizeof(vmx/svm))
is zero if we want to expand into the padding in the future.  Right now we're
relying on userspace to zero allocate the struct without enforcing it.

[*] Amusing side note, the comment in the header is wrong.  It states "pad
    the header to 128 bytes", but only pads it to 120 bytes, because union.

/* for KVM_CAP_NESTED_STATE */
struct kvm_nested_state {
	__u16 flags;
	__u16 format;
	__u32 size;

	union {
		struct kvm_vmx_nested_state_hdr vmx;
		struct kvm_svm_nested_state_hdr svm;

		/* Pad the header to 128 bytes.  */
		__u8 pad[120];
	} hdr;

	/*
	 * Define data region as 0 bytes to preserve backwards-compatability
	 * to old definition of kvm_nested_state in order to avoid changing
	 * KVM_{GET,PUT}_NESTED_STATE ioctl values.
	 */
	union {
		struct kvm_vmx_nested_state_data vmx[0];
		struct kvm_svm_nested_state_data svm[0];
	} data;
};


Odds are no real VMM will have issue given the dynamic size of struct
kvm_nested_state, but 


* Re: [PATCH] KVM: nVMX: properly pad struct kvm_vmx_nested_state_hdr
  2020-07-27 15:46       ` Sean Christopherson
@ 2020-07-27 16:16         ` Paolo Bonzini
  2020-07-28 15:59           ` Sean Christopherson
  0 siblings, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2020-07-27 16:16 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, kvm, Wanpeng Li, Jim Mattson, linux-kernel

On 27/07/20 17:46, Sean Christopherson wrote:
>> It might as well send it now if the code didn't attempt to zero the
>> struct before filling it in (this is another good reason to use a
>> "flags" field to say what's been filled in).
> The issue is that flags itself could hold garbage.
> 
> https://lkml.kernel.org/r/20200713151750.GA29901@linux.intel.com

Quoting from there:

> Which means that userspace built for the old kernel will potentially send in
> garbage for the new 'flags' field due to it being uninitialized stack data,
> even with the layout after this patch.

Userspace should always zero everything.  I don't think that the padding
between fields is any different from the other bytes padding the header
to 128 bytes.

>   struct kvm_vmx_nested_state_hdr hdr = {
>       .vmxon_pa = root,
>       .vmcs12_pa = vmcs12,
>   };
> 
> QEMU won't see issues because it zero allocates the entire nested state.
> 
> All the above being said, after looking at the whole picture I think padding
> the header is a moot point.  The header is padded out to 120 bytes[*] when
> including in the full nested state, and KVM only ever consumes the header in
> the context of the full nested state.  I.e. if there's garbage at offset 6,
> odds are there's going to be garbage at offset 18, so internally padding the
> header does nothing.

Yes, that was what I was hinting at with "it might as well send it now"
(i.e., after the patch).

(All of this is moot for userspace that just uses KVM_GET_NESTED_STATE
and passes it back to KVM_SET_NESTED_STATE).

> KVM should be checking that the unused bytes of (sizeof(pad) - sizeof(vmx/svm))
> is zero if we want to expand into the padding in the future.  Right now we're
> relying on userspace to zero allocate the struct without enforcing it.

The alternative, which is almost as good, is to only use these extra
fields which could be garbage if the flags are not set, and check the
flags (see the patches I have sent earlier today).

The chance of the flags passing the check will decrease over time as
more flags are added; but the chance of having buggy userspace that
sends down garbage also will.

> [*] Amusing side note, the comment in the header is wrong.  It states "pad
>     the header to 128 bytes", but only pads it to 120 bytes, because union.
> 
> /* for KVM_CAP_NESTED_STATE */
> struct kvm_nested_state {
> 	__u16 flags;
> 	__u16 format;
> 	__u32 size;
> 
> 	union {
> 		struct kvm_vmx_nested_state_hdr vmx;
> 		struct kvm_svm_nested_state_hdr svm;
> 
> 		/* Pad the header to 128 bytes.  */
> 		__u8 pad[120];
> 	} hdr;

There are 8 bytes before the union, and it's not a coincidence. :)
"Header" refers to the stuff before the data region.

> Odds are no real VMM will have issue given the dynamic size of struct
> kvm_nested_state, but 

... *suspense* ...

Paolo



* Re: [PATCH] KVM: nVMX: properly pad struct kvm_vmx_nested_state_hdr
  2020-07-27 16:16         ` Paolo Bonzini
@ 2020-07-28 15:59           ` Sean Christopherson
  0 siblings, 0 replies; 7+ messages in thread
From: Sean Christopherson @ 2020-07-28 15:59 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, kvm, Wanpeng Li, Jim Mattson, linux-kernel

On Mon, Jul 27, 2020 at 06:16:56PM +0200, Paolo Bonzini wrote:
> On 27/07/20 17:46, Sean Christopherson wrote:
> > All the above being said, after looking at the whole picture I think padding
> > the header is a moot point.  The header is padded out to 120 bytes[*] when
> > including in the full nested state, and KVM only ever consumes the header in
> > the context of the full nested state.  I.e. if there's garbage at offset 6,
> > odds are there's going to be garbage at offset 18, so internally padding the
> > header does nothing.
> 
> Yes, that was what I was hinting at with "it might as well send it now"
> (i.e., after the patch).
> 
> (All of this is moot for userspace that just uses KVM_GET_NESTED_STATE
> and passes it back to KVM_SET_NESTED_STATE).
> 
> > KVM should be checking that the unused bytes of (sizeof(pad) - sizeof(vmx/svm))
> > is zero if we want to expand into the padding in the future.  Right now we're
> > relying on userspace to zero allocate the struct without enforcing it.
> 
> The alternative, which is almost as good, is to only use these extra
> fields which could be garbage if the flags are not set, and check the
> flags (see the patches I have sent earlier today).
> 
> The chance of the flags passing the check will decrease over time as
> more flags are added; but the chance of having buggy userspace that
> sends down garbage also will.

Ah, I see what you're saying.  Ya, that makes sense.

> > [*] Amusing side note, the comment in the header is wrong.  It states "pad
> >     the header to 128 bytes", but only pads it to 120 bytes, because union.
> > 
> > /* for KVM_CAP_NESTED_STATE */
> > struct kvm_nested_state {
> > 	__u16 flags;
> > 	__u16 format;
> > 	__u32 size;
> > 
> > 	union {
> > 		struct kvm_vmx_nested_state_hdr vmx;
> > 		struct kvm_svm_nested_state_hdr svm;
> > 
> > 		/* Pad the header to 128 bytes.  */
> > 		__u8 pad[120];
> > 	} hdr;
> 
> There are 8 bytes before the union, and it's not a coincidence. :)
> "Header" refers to the stuff before the data region.

Ugh, then 'hdr' probably should be named vendor_header or something. 

