* Nested VMX security review
@ 2016-08-16  0:23 Lars Bull
  2016-08-16  2:51 ` Wanpeng Li
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Lars Bull @ 2016-08-16  0:23 UTC (permalink / raw)
  To: kvm

Hi, all. I work on the virtualization security team at Google. We'd
like to start discussion on having nested VMX be considered ready for
production, in order to have it be enabled by default, its CVEs
embargoed, and its patches sent to stable. It's my understanding that
the lack of a security audit is one of the biggest reasons why nested
VMX is still experimental. We've done the following work on this
front:

- Andy Honig conducted a security review of the nested VMX code base.
He found time-of-check time-of-use issues, which have been addressed
with a VMCS caching change
(https://git.kernel.org/cgit/virt/kvm/kvm.git/commit/?id=4f2777bc97974b0df9276ee9a85155a9e27a5282).
He also found an issue that could allow the guest to access the host
x2APIC, for which a fix is pending
(https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1204751.html).
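
For those unfamiliar with the approach, the gist of the caching change is
roughly the following -- a simplified sketch in the style of
arch/x86/kvm/vmx.c, not the literal commit: the guest's vmcs12 page is
snapshotted into host-private memory once, and every later consistency
check and vmcs02 build reads only the snapshot, so another vCPU in L1
can't flip fields between the check and the use.

static int nested_cache_vmcs12(struct kvm_vcpu *vcpu, gpa_t vmcs12_gpa)
{
        struct vcpu_vmx *vmx = to_vmx(vcpu);

        /* One-shot copy from guest memory into the host-private cache. */
        if (kvm_vcpu_read_guest(vcpu, vmcs12_gpa,
                                vmx->nested.cached_vmcs12, VMCS12_SIZE))
                return -EFAULT;

        vmx->nested.current_vmptr = vmcs12_gpa;
        return 0;
}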

- I worked on fuzz testing the code. We have a suite of KVM fuzz tests
that we normally test with, covering surfaces such as IO, MMIO, MSRs,
and the instruction emulator. I ran this in a nested environment using
KVM on KVM and didn't encounter any serious problems on the host. I
also modified the L1 kernel to tweak bits in the VMCS control fields
for the L2 guest when handling exits, while the L2 guest was running
our standard fuzzing suite. This was able to find host memory
corruption with shadow VMCS enabled, which has now been fixed
(https://git.kernel.org/cgit/virt/kvm/kvm.git/commit/?id=2f1fe81123f59271bddda673b60116bde9660385).
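
To give a flavour of the guest-side suite (this is an illustration only,
not our actual fuzzer), a minimal probe for the PIO and MMIO surfaces
could look like the sketch below. MSR fuzzing is left out because it
needs #GP fixup around wrmsr, and the MMIO address is an arbitrary
example that would have to map a real emulated device BAR.

#include <linux/module.h>
#include <linux/io.h>
#include <linux/random.h>

static int __init pio_mmio_fuzz_init(void)
{
        void __iomem *mmio = ioremap(0xfebf0000, 0x1000); /* example BAR */
        int i;

        for (i = 0; i < 100000; i++) {
                u16 port = prandom_u32();
                u32 val  = prandom_u32();

                outl(val, port);        /* exercises the PIO exit path */
                if (mmio)               /* exercises MMIO / the emulator */
                        writel(val, mmio + ((val >> 8) & 0xffc));
        }
        if (mmio)
                iounmap(mmio);
        return -ENODEV; /* one-shot: don't stay loaded */
}
module_init(pio_mmio_fuzz_init);
MODULE_LICENSE("GPL");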

Our testing was focused primarily on security of the host from both
guest levels rather than the security of L1 and did not check for
correctness. We are fairly confident after this work that nested VMX
doesn't present a significant increase in risk for the host. We're
curious what the next steps should be in getting this considered
production-ready.

-Lars Bull


* Re: Nested VMX security review
  2016-08-16  0:23 Nested VMX security review Lars Bull
@ 2016-08-16  2:51 ` Wanpeng Li
  2016-08-16  6:20 ` Jan Kiszka
  2016-08-16  7:03 ` Christian Borntraeger
  2 siblings, 0 replies; 6+ messages in thread
From: Wanpeng Li @ 2016-08-16  2:51 UTC (permalink / raw)
  To: Lars Bull; +Cc: kvm

2016-08-16 8:23 GMT+08:00 Lars Bull <larsbull@google.com>:
> Hi, all. I work on the virtualization security team at Google. We'd
> like to start discussion on having nested VMX be considered ready for
> production, in order to have it be enabled by default, its CVEs

Interesting. What real-world scenarios at Google benefit from this
feature?

Regards,
Wanpeng Li


* Re: Nested VMX security review
  2016-08-16  0:23 Nested VMX security review Lars Bull
  2016-08-16  2:51 ` Wanpeng Li
@ 2016-08-16  6:20 ` Jan Kiszka
  2016-08-16 22:23   ` Jim Mattson
  2016-08-16  7:03 ` Christian Borntraeger
  2 siblings, 1 reply; 6+ messages in thread
From: Jan Kiszka @ 2016-08-16  6:20 UTC (permalink / raw)
  To: Lars Bull, kvm, Paolo Bonzini, Radim Krčmář

On 2016-08-16 02:23, Lars Bull wrote:
> Hi, all. I work on the virtualization security team at Google. We'd
> like to start discussion on having nested VMX be considered ready for
> production, in order to have it be enabled by default, its CVEs
> embargoed, and its patches sent to stable. It's my understanding that
> the lack of a security audit is one of the biggest reasons why nested
> VMX is still experimental. We've done the following work on this
> front:
> 
> - Andy Honig conducted a security review of the nested VMX code base.
> He found time-of-check time-of-use issues, which have been addressed
> with a VMCS caching change
> (https://git.kernel.org/cgit/virt/kvm/kvm.git/commit/?id=4f2777bc97974b0df9276ee9a85155a9e27a5282).
> He also found an issue that could allow the guest to access the host
> x2APIC, for which a fix is pending
> (https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1204751.html).
> 
> - I worked on fuzz testing the code. We have a suite of KVM fuzz tests
> that we normally test with, covering surfaces such as IO, MMIO, MSRs,
> and the instruction emulator. I ran this in a nested environment using
> KVM on KVM and didn't encounter any serious problems on the host. I
> also modified the L1 kernel to tweak bits in the VMCS control fields
> for the L2 guest when handling exits, while the L2 guest was running
> our standard fuzzing suite. This was able to find host memory
> corruption with shadow VMCS enabled, which has now been fixed
> (https://git.kernel.org/cgit/virt/kvm/kvm.git/commit/?id=2f1fe81123f59271bddda673b60116bde9660385).
> 
> Our testing was focused primarily on security of the host from both
> guest levels rather than the security of L1 and did not check for
> correctness. We are fairly confident after this work that nested VMX
> doesn't present a significant increase in risk for the host. We're
> curious what the next steps should be in getting this considered
> production-ready.

Great job! Thanks a lot for moving this topic to the next level.

I suppose the other remaining topic is saving/restoring nVMX-related
state so that migration and a cleaner reset become possible, as well as
inspection from userspace (gdb & Co.). I guess this will also ease
further fuzz testing, but I don't know whether that has to delay
flipping the default of kvm_intel.nested.
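
To make it concrete, I picture a userspace-visible state blob plus a
get/set ioctl pair, something along these lines. The struct layout and
ioctl names below are purely illustrative -- nothing like this exists in
the KVM API today:

#include <sys/ioctl.h>
#include <linux/ioctl.h>
#include <linux/types.h>

struct kvm_nested_vmx_state {
        __u64 vmxon_ptr;       /* guest-physical VMXON region, or -1 */
        __u64 current_vmptr;   /* currently loaded vmcs12, or -1 */
        __u32 flags;           /* e.g. "vCPU was in guest (L2) mode" */
        __u32 vmcs12_size;
        __u8  vmcs12[4096];    /* cached vmcs12 contents */
};

/* Hypothetical ioctl numbers, for illustration only. */
#define KVM_GET_NESTED_STATE _IOWR('k', 0xbe, struct kvm_nested_vmx_state)
#define KVM_SET_NESTED_STATE _IOW('k', 0xbf, struct kvm_nested_vmx_state)

/* Userspace (QEMU & Co.): save on the source, restore on the target. */
static int save_nested_state(int vcpu_fd, struct kvm_nested_vmx_state *s)
{
        return ioctl(vcpu_fd, KVM_GET_NESTED_STATE, s);
}

static int restore_nested_state(int vcpu_fd, struct kvm_nested_vmx_state *s)
{
        return ioctl(vcpu_fd, KVM_SET_NESTED_STATE, s);
}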

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux


* Re: Nested VMX security review
  2016-08-16  0:23 Nested VMX security review Lars Bull
  2016-08-16  2:51 ` Wanpeng Li
  2016-08-16  6:20 ` Jan Kiszka
@ 2016-08-16  7:03 ` Christian Borntraeger
  2016-08-16 23:28   ` Lars Bull
  2 siblings, 1 reply; 6+ messages in thread
From: Christian Borntraeger @ 2016-08-16  7:03 UTC (permalink / raw)
  To: Lars Bull, kvm

On 08/16/2016 02:23 AM, Lars Bull wrote:
[...]

> We've done the following work on this front:
[...]

Thanks for doing this. Are the tests closed or public? Would it make
sense to also use them for other archs like s390/ppc (let's say with
less effort than a rewrite), or is this really x86-specific code?

> Our testing was focused primarily on security of the host from both
> guest levels rather than the security of L1 and did not check for
> correctness. We are fairly confident after this work that nested VMX
> doesn't present a significant increase in risk for the host. We're
> curious what the next steps should be in getting this considered
> production-ready.

Are there any plans to do L2->L1 testing? If we cannot be sure that L2
does not violate the integrity of L1, we certainly cannot enable that by
default. (To make it more obvious: would you buy a hypervisor that
provides a "give-me-root" interface for its guests?)





* Re: Nested VMX security review
  2016-08-16  6:20 ` Jan Kiszka
@ 2016-08-16 22:23   ` Jim Mattson
  0 siblings, 0 replies; 6+ messages in thread
From: Jim Mattson @ 2016-08-16 22:23 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Lars Bull, kvm, Paolo Bonzini, Radim Krčmář

I am working on nVMX state save/restore, but I don't have the
resources to do the same for nSVM state.

As mentioned in an earlier thread, migration would be a lot easier if
we eliminated the vmcs01/vmcs02 distinction and L0 just used a single
VMCS for both. I do have a workaround, but it's a little ugly. I'd
love to discuss this with interested parties at KVM Forum next week.
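
For context, the distinction I mean is roughly this (a simplified sketch;
names and helpers are not the exact upstream ones): L0 keeps one hardware
VMCS for running L1 itself and a second one, rebuilt from L1's vmcs12,
for running L2, and switches between them on nested entry and exit. All
of that switching state is what migration has to capture today.

struct nested_vmx_sketch {
        struct loaded_vmcs vmcs01;     /* hardware VMCS used while L1 runs */
        struct loaded_vmcs vmcs02;     /* hardware VMCS used while L2 runs */
        struct vmcs12 *cached_vmcs12;  /* software copy of L1's VMCS for L2 */
};

static void sketch_enter_l2(struct vcpu_vmx *vmx)
{
        /* vmcs02 is (re)built from the cached vmcs12, then made current. */
        prepare_vmcs02(vmx, vmx->nested.cached_vmcs12);
        switch_to_vmcs(vmx, &vmx->nested.vmcs02);
}

static void sketch_exit_to_l1(struct vcpu_vmx *vmx)
{
        /* Exit info is folded back into vmcs12, then vmcs01 is current again. */
        sync_vmcs12(vmx, vmx->nested.cached_vmcs12);
        switch_to_vmcs(vmx, &vmx->nested.vmcs01);
}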



On Mon, Aug 15, 2016 at 11:20 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 2016-08-16 02:23, Lars Bull wrote:
>> Hi, all. I work on the virtualization security team at Google. We'd
>> like to start discussion on having nested VMX be considered ready for
>> production, in order to have it be enabled by default, its CVEs
>> embargoed, and its patches sent to stable. It's my understanding that
>> the lack of a security audit is one of the biggest reasons why nested
>> VMX is still experimental. We've done the following work on this
>> front:
>>
>> - Andy Honig conducted a security review of the nested VMX code base.
>> He found time-of-check time-of-use issues, which have been addressed
>> with a VMCS caching change
>> (https://git.kernel.org/cgit/virt/kvm/kvm.git/commit/?id=4f2777bc97974b0df9276ee9a85155a9e27a5282).
>> He also found an issue that could allow the guest to access the host
>> x2APIC, for which a fix is pending
>> (https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1204751.html).
>>
>> - I worked on fuzz testing the code. We have a suite of KVM fuzz tests
>> that we normally test with, covering surfaces such as IO, MMIO, MSRs,
>> and the instruction emulator. I ran this in a nested environment using
>> KVM on KVM and didn't encounter any serious problems on the host. I
>> also modified the L1 kernel to tweak bits in the VMCS control fields
>> for the L2 guest when handling exits, while the L2 guest was running
>> our standard fuzzing suite. This was able to find host memory
>> corruption with shadow VMCS enabled, which has now been fixed
>> (https://git.kernel.org/cgit/virt/kvm/kvm.git/commit/?id=2f1fe81123f59271bddda673b60116bde9660385).
>>
>> Our testing was focused primarily on security of the host from both
>> guest levels rather than the security of L1 and did not check for
>> correctness. We are fairly confident after this work that nested VMX
>> doesn't present a significant increase in risk for the host. We're
>> curious what the next steps should be in getting this considered
>> production-ready.
>
> Great job! Thanks a lot for moving this topic to the next level.
>
> I suppose the other remaining topic is saving/restoring nVMX-related
> state so that migration and a cleaner reset become possible, as well as
> inspection from userspace (gdb & Co.). I guess this will also ease
> further fuzz testing, but I don't know whether that has to delay
> flipping the default of kvm_intel.nested.
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT RDA ITP SES-DE
> Corporate Competence Center Embedded Linux


* Re: Nested VMX security review
  2016-08-16  7:03 ` Christian Borntraeger
@ 2016-08-16 23:28   ` Lars Bull
  0 siblings, 0 replies; 6+ messages in thread
From: Lars Bull @ 2016-08-16 23:28 UTC (permalink / raw)
  To: Christian Borntraeger; +Cc: kvm

On Tue, Aug 16, 2016 at 12:03 AM, Christian Borntraeger
<borntraeger@de.ibm.com> wrote:
> On 08/16/2016 02:23 AM, Lars Bull wrote:
> [...]
>
>> We've done the following work on this front:
> [...]
>
> Thanks for doing this. Are the tests closed or public? Would it make
> sense to also use them for other archs like s390/ppc (let's say with
> less effort than a rewrite), or is this really x86-specific code?

Unfortunately, both levels' fuzzers are pretty x86-specific. The L1
fuzzer specifically focuses on the VMCS and lives in vmx.c. It's a
fairly simple fuzzer that has a list of VMCS control field addresses
and sizes and just flips bits with a low probability on L2 exits. It
could be made public. The L2 fuzzer is written in x86 assembly and
unfortunately can't be made public at this time due to its
dependencies, though we've discussed the possibility of pursuing this
in the future.
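
In rough terms the mechanism is along these lines -- a stripped-down
sketch, not the code we actually ran, using the standard control-field
constants and vmcs accessors from vmx.c:

static const struct { unsigned long field; int bits; } vmcs_ctrl_fields[] = {
        { PIN_BASED_VM_EXEC_CONTROL, 32 },
        { CPU_BASED_VM_EXEC_CONTROL, 32 },
        { SECONDARY_VM_EXEC_CONTROL, 32 },
        { VM_EXIT_CONTROLS,          32 },
        { VM_ENTRY_CONTROLS,         32 },
        /* ... more control fields and their widths ... */
};

/* Called from L1's exit handling path while an L2 guest is running. */
static void maybe_flip_vmcs_bit(void)
{
        unsigned int i, bit;
        u32 val;

        /* Mutate only ~1 in 4096 exits so L2 survives long enough. */
        if (prandom_u32() & 0xfff)
                return;

        i   = prandom_u32() % ARRAY_SIZE(vmcs_ctrl_fields);
        bit = prandom_u32() % vmcs_ctrl_fields[i].bits;

        val = vmcs_read32(vmcs_ctrl_fields[i].field);
        vmcs_write32(vmcs_ctrl_fields[i].field, val ^ (1u << bit));
}

The low flip probability is what lets the L2 workload keep running
between mutations instead of dying on the first corrupted field.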

>> Our testing was focused primarily on security of the host from both
>> guest levels rather than the security of L1 and did not check for
>> correctness. We are fairly confident after this work that nested VMX
>> doesn't present a significant increase in risk for the host. We're
>> curious what the next steps should be in getting this considered
>> production-ready.
>
> Are there any plans to do L2->L1 testing? If we cannot be sure that L2
> does not violate the integrity of L1, we certainly cannot enable that by
> default. (To make it more obvious: would you buy a hypervisor that
> provides a "give-me-root" interface for its guests?)

While we didn't focus on the security of L1 in this review, it was
covered by Andy's code review. Our fuzzing coverage of this was
minimal because our VMCS fuzzing causes the L1 guest to be unstable,
though we may work on L1-targeted nested fuzzing and crash detection
in the future. Having nested VMX enabled by default is useful, but not
as important to us as the feature being considered production-ready.
