All of lore.kernel.org
* [MODERATED] [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0
@ 2018-07-25 14:30 Paolo Bonzini
  2018-07-25 14:30 ` [MODERATED] [PATCH v2 1/4] L1TF KVM ARCH_CAPABILITIES #1 Paolo Bonzini
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Paolo Bonzini @ 2018-07-25 14:30 UTC (permalink / raw)
  To: speck

Support for ARCH_CAPABILITIES bit 3, which can already be used to disable
the mitigations on a nested hypervisor.  Patch 1 is already in Linus's
tree.

Paolo Bonzini (4):
  KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR
  x86: SMT doesn't matter for VMX L1TF if EPT disabled or mitigation
    disabled
  x86: use ARCH_CAPABILITIES to skip L1D flush on vmentry
  KVM: VMX: tell nested hypervisor to skip L1D flush on vmentry

 Documentation/admin-guide/l1tf.rst | 21 +++++++++++++++++++++
 arch/x86/include/asm/kvm_host.h    |  1 +
 arch/x86/include/asm/msr-index.h   |  1 +
 arch/x86/include/asm/vmx.h         |  1 +
 arch/x86/kernel/cpu/bugs.c         |  8 +++++++-
 arch/x86/kvm/vmx.c                 | 13 +++++++++++--
 arch/x86/kvm/x86.c                 | 18 +++++++++++++++++-
 7 files changed, 59 insertions(+), 4 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [MODERATED] [PATCH v2 1/4] L1TF KVM ARCH_CAPABILITIES #1
  2018-07-25 14:30 [MODERATED] [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0 Paolo Bonzini
@ 2018-07-25 14:30 ` Paolo Bonzini
  2018-07-25 19:43   ` [MODERATED] " Andrew Cooper
  2018-07-25 14:30 ` [MODERATED] [PATCH v2 2/4] L1TF KVM ARCH_CAPABILITIES #2 Paolo Bonzini
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2018-07-25 14:30 UTC (permalink / raw)
  To: speck

This lets userspace read MSR_IA32_ARCH_CAPABILITIES and check that all
the features it requests are available on the host.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1529928277-22739-1-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/x86.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 902d535dff8f..79c8ca2c2ad9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1098,6 +1098,7 @@ static u32 msr_based_features[] = {
 
 	MSR_F10H_DECFG,
 	MSR_IA32_UCODE_REV,
+	MSR_IA32_ARCH_CAPABILITIES,
 };
 
 static unsigned int num_msr_based_features;
@@ -1106,7 +1107,8 @@ static int kvm_get_msr_feature(struct kvm_msr_entry *msr)
 {
 	switch (msr->index) {
 	case MSR_IA32_UCODE_REV:
-		rdmsrl(msr->index, msr->data);
+	case MSR_IA32_ARCH_CAPABILITIES:
+		rdmsrl_safe(msr->index, &msr->data);
 		break;
 	default:
 		if (kvm_x86_ops->get_msr_feature(msr))
-- 
2.17.1


* [MODERATED] [PATCH v2 2/4] L1TF KVM ARCH_CAPABILITIES #2
  2018-07-25 14:30 [MODERATED] [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0 Paolo Bonzini
  2018-07-25 14:30 ` [MODERATED] [PATCH v2 1/4] L1TF KVM ARCH_CAPABILITIES #1 Paolo Bonzini
@ 2018-07-25 14:30 ` Paolo Bonzini
  2018-07-30 21:27   ` Thomas Gleixner
  2018-07-25 14:30 ` [MODERATED] [PATCH v2 3/4] L1TF KVM ARCH_CAPABILITIES #3 Paolo Bonzini
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2018-07-25 14:30 UTC (permalink / raw)
  To: speck

If EPT is disabled, L1TF cannot be exploited even across threads on the
same core, and SMT is irrelevant.

If mitigation is completely disabled, L1TF can be exploited even within a
single thread, so SMT is again irrelevant.

Reflect this in the sysfs file.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kernel/cpu/bugs.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index d63cb1501784..912eb015949f 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -753,6 +753,11 @@ static ssize_t l1tf_show_state(char *buf)
 	if (l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_AUTO)
 		return sprintf(buf, "%s\n", L1TF_DEFAULT_MSG);
 
+	if (l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_EPT_DISABLED ||
+	    l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_NEVER)
+		return sprintf(buf, "%s; VMX: %s\n", L1TF_DEFAULT_MSG,
+			       l1tf_vmx_states[l1tf_vmx_mitigation]);
+
 	return sprintf(buf, "%s; VMX: SMT %s, L1D %s\n", L1TF_DEFAULT_MSG,
 		       cpu_smt_control == CPU_SMT_ENABLED ? "vulnerable" : "disabled",
 		       l1tf_vmx_states[l1tf_vmx_mitigation]);
-- 
2.17.1


* [MODERATED] [PATCH v2 3/4] L1TF KVM ARCH_CAPABILITIES #3
  2018-07-25 14:30 [MODERATED] [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0 Paolo Bonzini
  2018-07-25 14:30 ` [MODERATED] [PATCH v2 1/4] L1TF KVM ARCH_CAPABILITIES #1 Paolo Bonzini
  2018-07-25 14:30 ` [MODERATED] [PATCH v2 2/4] L1TF KVM ARCH_CAPABILITIES #2 Paolo Bonzini
@ 2018-07-25 14:30 ` Paolo Bonzini
  2018-07-25 14:31 ` [MODERATED] [PATCH v2 4/4] L1TF KVM ARCH_CAPABILITIES #4 Paolo Bonzini
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2018-07-25 14:30 UTC (permalink / raw)
  To: speck

Bit 3 of ARCH_CAPABILITIES tells a hypervisor that L1D flush on vmentry is
not needed.  Add a new value to enum vmx_l1d_flush_state, which is used
either if there is no L1TF bug at all, or if bit 3 is set in ARCH_CAPABILITIES.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/msr-index.h |  1 +
 arch/x86/include/asm/vmx.h       |  1 +
 arch/x86/kernel/cpu/bugs.c       |  3 ++-
 arch/x86/kvm/vmx.c               | 10 ++++++++++
 4 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 0e7517089b80..4731f0cf97c5 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -70,6 +70,7 @@
 #define MSR_IA32_ARCH_CAPABILITIES	0x0000010a
 #define ARCH_CAP_RDCL_NO		(1 << 0)   /* Not susceptible to Meltdown */
 #define ARCH_CAP_IBRS_ALL		(1 << 1)   /* Enhanced IBRS support */
+#define ARCH_CAP_SKIP_VMENTRY_L1DFLUSH	(1 << 3)   /* Skip L1D flush on vmentry */
 #define ARCH_CAP_SSB_NO			(1 << 4)   /*
 						    * Not susceptible to Speculative Store Bypass
 						    * attack, so no Speculative Store Bypass
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 94a8547d915b..ae262f55309f 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -579,6 +579,7 @@ enum vmx_l1d_flush_state {
 	VMENTER_L1D_FLUSH_COND,
 	VMENTER_L1D_FLUSH_ALWAYS,
 	VMENTER_L1D_FLUSH_EPT_DISABLED,
+	VMENTER_L1D_FLUSH_NOT_REQUIRED,
 };
 
 extern enum vmx_l1d_flush_state l1tf_vmx_mitigation;
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 912eb015949f..cbc2ca4d5800 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -745,7 +745,8 @@ static const char *l1tf_vmx_states[] = {
 	[VMENTER_L1D_FLUSH_NEVER]	= "vulnerable",
 	[VMENTER_L1D_FLUSH_COND]	= "conditional cache flushes",
 	[VMENTER_L1D_FLUSH_ALWAYS]	= "cache flushes",
-	[VMENTER_L1D_FLUSH_EPT_DISABLED]= "EPT disabled"
+	[VMENTER_L1D_FLUSH_EPT_DISABLED]= "EPT disabled",
+	[VMENTER_L1D_FLUSH_NOT_REQUIRED]= "flush not necessary"
 };
 
 static ssize_t l1tf_show_state(char *buf)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c5c0118b126d..d70c7de84a9a 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -217,6 +217,16 @@ static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
 		return 0;
 	}
 
+       if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES)) {
+	       u64 msr;
+
+	       rdmsrl(MSR_IA32_ARCH_CAPABILITIES, msr);
+	       if (msr & ARCH_CAP_SKIP_VMENTRY_L1DFLUSH) {
+		       l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_NOT_REQUIRED;
+		       return 0;
+	       }
+       }
+
 	/* If set to auto use the default l1tf mitigation method */
 	if (l1tf == VMENTER_L1D_FLUSH_AUTO) {
 		switch (l1tf_mitigation) {
-- 
2.17.1


* [MODERATED] [PATCH v2 4/4] L1TF KVM ARCH_CAPABILITIES #4
  2018-07-25 14:30 [MODERATED] [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0 Paolo Bonzini
                   ` (2 preceding siblings ...)
  2018-07-25 14:30 ` [MODERATED] [PATCH v2 3/4] L1TF KVM ARCH_CAPABILITIES #3 Paolo Bonzini
@ 2018-07-25 14:31 ` Paolo Bonzini
  2018-07-30 21:36   ` Thomas Gleixner
  2018-07-25 15:52 ` [MODERATED] Re: [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0 Greg KH
  2018-08-02  2:51 ` [MODERATED] " Konrad Rzeszutek Wilk
  5 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2018-07-25 14:31 UTC (permalink / raw)
  To: speck

When nested virtualization is in use, VMENTER operations from the
nested hypervisor into the nested guest will always be processed by
the bare metal hypervisor, and KVM's "conditional cache flushes"
mode in particular does a flush on nested vmentry.  Therefore,
include the "skip L1D flush on vmentry" bit in KVM's suggested
ARCH_CAPABILITIES setting.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 Documentation/admin-guide/l1tf.rst | 21 +++++++++++++++++++++
 arch/x86/include/asm/kvm_host.h    |  1 +
 arch/x86/kvm/vmx.c                 |  3 +--
 arch/x86/kvm/x86.c                 | 16 +++++++++++++++-
 4 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/Documentation/admin-guide/l1tf.rst b/Documentation/admin-guide/l1tf.rst
index 5adf7d7c2b4e..46765abba70c 100644
--- a/Documentation/admin-guide/l1tf.rst
+++ b/Documentation/admin-guide/l1tf.rst
@@ -528,6 +528,27 @@ available:
     EPT can be disabled in the hypervisor via the 'kvm-intel.ept'
     parameter.
 
+3.4. Nested virtual machines
+""""""""""""""""""""""""""""
+
+When nested virtualization is in use, three operating systems are involved:
+the bare metal hypervisor, the nested hypervisor, and the nested virtual
+machine.  VMENTER operations from the nested hypervisor into the nested
+guest will always be processed by the bare metal hypervisor.
+
+Therefore, when running as a bare metal hypervisor, KVM will:
+
+ - flush the L1D cache on every switch from nested hypervisor to
+   nested virtual machine, so that the nested hypervisor's secrets
+   are not exposed to the nested virtual machine;
+
+ - flush the L1D cache on every switch from nested virtual machine to
+   nested hypervisor; this is a complex operation, and flushing the
+   L1D cache prevents the bare metal hypervisor's secrets from being
+   exposed to the nested virtual machine;
+
+ - instruct the nested hypervisor to not perform any L1D cache flush.
+
 
 .. _default_mitigations:
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 57d418061c55..e375011108d1 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1417,6 +1417,7 @@ int kvm_cpu_get_interrupt(struct kvm_vcpu *v);
 void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event);
 void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu);
 
+u64 kvm_get_arch_capabilities(void);
 void kvm_define_shared_msr(unsigned index, u32 msr);
 int kvm_set_shared_msr(unsigned index, u64 val, u64 mask);
 
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index d70c7de84a9a..87394fc21263 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6419,8 +6419,7 @@ static void vmx_vcpu_setup(struct vcpu_vmx *vmx)
 		++vmx->nmsrs;
 	}
 
-	if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
-		rdmsrl(MSR_IA32_ARCH_CAPABILITIES, vmx->arch_capabilities);
+	vmx->arch_capabilities = kvm_get_arch_capabilities();
 
 	vm_exit_controls_init(vmx, vmcs_config.vmexit_ctrl);
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 79c8ca2c2ad9..1a1d4cfc6322 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1103,11 +1103,25 @@ static u32 msr_based_features[] = {
 
 static unsigned int num_msr_based_features;
 
+u64 kvm_get_arch_capabilities(void)
+{
+	u64 data;
+
+	rdmsrl_safe(MSR_IA32_ARCH_CAPABILITIES, &data);
+	if (l1tf_vmx_mitigation != VMENTER_L1D_FLUSH_NEVER)
+		data |= ARCH_CAP_SKIP_VMENTRY_L1DFLUSH;
+
+	return data;
+}
+EXPORT_SYMBOL_GPL(kvm_get_arch_capabilities);
+
 static int kvm_get_msr_feature(struct kvm_msr_entry *msr)
 {
 	switch (msr->index) {
-	case MSR_IA32_UCODE_REV:
 	case MSR_IA32_ARCH_CAPABILITIES:
+		msr->data = kvm_get_arch_capabilities();
+		break;
+	case MSR_IA32_UCODE_REV:
 		rdmsrl_safe(msr->index, &msr->data);
 		break;
 	default:
-- 
2.17.1


* [MODERATED] Re: [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0
  2018-07-25 14:30 [MODERATED] [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0 Paolo Bonzini
                   ` (3 preceding siblings ...)
  2018-07-25 14:31 ` [MODERATED] [PATCH v2 4/4] L1TF KVM ARCH_CAPABILITIES #4 Paolo Bonzini
@ 2018-07-25 15:52 ` Greg KH
  2018-07-26  8:12   ` Paolo Bonzini
  2018-08-02  2:51 ` [MODERATED] " Konrad Rzeszutek Wilk
  5 siblings, 1 reply; 21+ messages in thread
From: Greg KH @ 2018-07-25 15:52 UTC (permalink / raw)
  To: speck

On Wed, Jul 25, 2018 at 04:30:56PM +0200, speck for Paolo Bonzini wrote:
> Support for ARCH_CAPABILITIES bit 3, which can already be used to disable
> the mitigations on a nested hypervisor.  Patch 1 is already in Linus's
> tree.

Should I be taking patch 1 into a stable tree now, or wait until the
rest of these land?

thanks,

greg k-h


* [MODERATED] Re: [PATCH v2 1/4] L1TF KVM ARCH_CAPABILITIES #1
  2018-07-25 14:30 ` [MODERATED] [PATCH v2 1/4] L1TF KVM ARCH_CAPABILITIES #1 Paolo Bonzini
@ 2018-07-25 19:43   ` Andrew Cooper
  2018-07-26  8:15     ` Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Cooper @ 2018-07-25 19:43 UTC (permalink / raw)
  To: speck


On 25/07/2018 15:30, speck for Paolo Bonzini wrote:
> @@ -1106,7 +1107,8 @@ static int kvm_get_msr_feature(struct kvm_msr_entry *msr)
>  {
>  	switch (msr->index) {
>  	case MSR_IA32_UCODE_REV:
> -		rdmsrl(msr->index, msr->data);

You'll be wanting a break in here.

~Andrew

> +	case MSR_IA32_ARCH_CAPABILITIES:
> +		rdmsrl_safe(msr->index, &msr->data);
>  		break;
>  	default:
>  		if (kvm_x86_ops->get_msr_feature(msr))




* [MODERATED] Re: [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0
  2018-07-25 15:52 ` [MODERATED] Re: [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0 Greg KH
@ 2018-07-26  8:12   ` Paolo Bonzini
  2018-07-26 10:04     ` Greg KH
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2018-07-26  8:12 UTC (permalink / raw)
  To: speck


On 25/07/2018 17:52, speck for Greg KH wrote:
> On Wed, Jul 25, 2018 at 04:30:56PM +0200, speck for Paolo Bonzini wrote:
>> Support for ARCH_CAPABILITIES bit 3, which can already be used to disable
>> the mitigations on a nested hypervisor.  Patch 1 is already in Linus's
>> tree.
> Should I be taking patch 1 into a stable tree now, or wait until the
> rest of these land?

Yes, you can do that if you prefer.  It's commit
cd28325249a1ca0d771557ce823e0308ad629f98.

Paolo



* [MODERATED] Re: [PATCH v2 1/4] L1TF KVM ARCH_CAPABILITIES #1
  2018-07-25 19:43   ` [MODERATED] " Andrew Cooper
@ 2018-07-26  8:15     ` Paolo Bonzini
  0 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2018-07-26  8:15 UTC (permalink / raw)
  To: speck


On 25/07/2018 21:43, speck for Andrew Cooper wrote:
>>  	case MSR_IA32_UCODE_REV:
>> -		rdmsrl(msr->index, msr->data);
>
> You'll be wanting a break in here.
> 
>> +	case MSR_IA32_ARCH_CAPABILITIES:
>> +		rdmsrl_safe(msr->index, &msr->data);
>>  		break;

No, this is a fallthrough.  This patch is unrelated to L1TF, and I'm
including it just because the branch is based on 4.18-rc1.  See patch
#4, which actually introduces the L1TF handling; indeed, it no longer
relies on the fallthrough.  The code there becomes:

 	case MSR_IA32_ARCH_CAPABILITIES:
		msr->data = kvm_get_arch_capabilities();
		break;
	case MSR_IA32_UCODE_REV:
 		rdmsrl_safe(msr->index, &msr->data);
 		break;

but not yet.  Thanks,

Paolo



* [MODERATED] Re: [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0
  2018-07-26  8:12   ` Paolo Bonzini
@ 2018-07-26 10:04     ` Greg KH
  2018-07-26 10:41       ` Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Greg KH @ 2018-07-26 10:04 UTC (permalink / raw)
  To: speck

On Thu, Jul 26, 2018 at 10:12:58AM +0200, speck for Paolo Bonzini wrote:
> On 25/07/2018 17:52, speck for Greg KH wrote:
> > On Wed, Jul 25, 2018 at 04:30:56PM +0200, speck for Paolo Bonzini wrote:
> >> Support for ARCH_CAPABILITIES bit 3, which can already be used to disable
> >> the mitigations on a nested hypervisor.  Patch 1 is already in Linus's
> >> tree.
> > Should I be taking patch 1 into a stable tree now, or wait until the
> > rest of these land?
> 
> Yes, you can do that if you prefer.  It's commit
> cd28325249a1ca0d771557ce823e0308ad629f98.

It seems only applicable to 4.17.y, older kernels (like 4.14 and 4.9)
don't seem to need it, right?  Or will they be needed for this "mess"?

thanks,

greg k-h


* [MODERATED] Re: [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0
  2018-07-26 10:04     ` Greg KH
@ 2018-07-26 10:41       ` Paolo Bonzini
  2018-07-30 21:40         ` Thomas Gleixner
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2018-07-26 10:41 UTC (permalink / raw)
  To: speck


On 26/07/2018 12:04, speck for Greg KH wrote:
> On Thu, Jul 26, 2018 at 10:12:58AM +0200, speck for Paolo Bonzini wrote:
>> On 25/07/2018 17:52, speck for Greg KH wrote:
>>> On Wed, Jul 25, 2018 at 04:30:56PM +0200, speck for Paolo Bonzini wrote:
>>>> Support for ARCH_CAPABILITIES bit 3, which can already be used to disable
>>>> the mitigations on a nested hypervisor.  Patch 1 is already in Linus's
>>>> tree.
>>> Should I be taking patch 1 into a stable tree now, or wait until the
>>> rest of these land?
>> Yes, you can do that if you prefer.  It's commit
>> cd28325249a1ca0d771557ce823e0308ad629f98.
> It seems only applicable to 4.17.y, older kernels (like 4.14 and 4.9)
> don't seem to need it, right?  Or will they be needed for this "mess"?

It will be useful for this mess.  Though it's not strictly necessary,
let's not slow things down more than they would be anyway.

4.14 and 4.9 need those missing patches to correctly report
vulnerability to Spectrev1 on some AMD families, but 1) AMD did not
contribute the userspace bits 2) anyway it's just sysfs, not actual
change in vulnerable/not vulnerable status 3) anyway^2 this only applies
to mitigation with lfence, not the masking trick, so it's more or less moot.

I'll hunt for the whole dependency chain for 4.14 and 4.9 and send
everything to stable.  It's not a big one and should backport just fine.

Thanks,

Paolo



* Re: [PATCH v2 2/4] L1TF KVM ARCH_CAPABILITIES #2
  2018-07-25 14:30 ` [MODERATED] [PATCH v2 2/4] L1TF KVM ARCH_CAPABILITIES #2 Paolo Bonzini
@ 2018-07-30 21:27   ` Thomas Gleixner
  2018-07-31  8:22     ` [MODERATED] " Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Thomas Gleixner @ 2018-07-30 21:27 UTC (permalink / raw)
  To: speck

On Wed, 25 Jul 2018, speck for Paolo Bonzini wrote:

> From: Paolo Bonzini <pbonzini@redhat.com>
> Subject: [PATCH v2 2/4] x86: SMT doesn't matter for VMX L1TF if EPT disabled or
>  mitigation disabled
> 
> If EPT is disabled, L1TF cannot be exploited even across threads on the
> same core, and SMT is irrelevant.

Ack.

> If mitigation is completely disabled, L1TF can be exploited even within a
> single thread, so SMT is again irrelevant.

I'm not sure about this one. You might decide that the risk of not flushing
is acceptable if SMT is off, but then if SMT is enabled the whole thing
might be less acceptable. So keeping that information in the sysfs file
makes some sense. No strong opinion though.

Thanks,

	tglx


* Re: [PATCH v2 4/4] L1TF KVM ARCH_CAPABILITIES #4
  2018-07-25 14:31 ` [MODERATED] [PATCH v2 4/4] L1TF KVM ARCH_CAPABILITIES #4 Paolo Bonzini
@ 2018-07-30 21:36   ` Thomas Gleixner
  2018-07-31  7:39     ` [MODERATED] " Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Thomas Gleixner @ 2018-07-30 21:36 UTC (permalink / raw)
  To: speck

On Wed, 25 Jul 2018, speck for Paolo Bonzini wrote:
>  
> +3.4. Nested virtual machines
> +""""""""""""""""""""""""""""
> +
> +When nested virtualization is in use, three operating systems are involved:
> +the bare metal hypervisor, the nested hypervisor, and the nested virtual
> +machine.  VMENTER operations from the nested hypervisor into the nested
> +guest will always be processed by the bare metal hypervisor.
> +
> +Therefore, when running as a bare metal hypervisor, KVM will:
> +
> + - flush the L1D cache on every switch from nested hypervisor to
> +   nested virtual machine, so that the nested hypervisor's secrets
> +   are not exposed to the nested virtual machine;
> +
> + - flush the L1D cache on every switch from nested virtual machine to
> +   nested hypervisor; this is a complex operation, and flushing the
> +   L1D cache prevents the bare metal hypervisor's secrets from being
> +   exposed to the nested virtual machine;
> +
> + - instruct the nested hypervisor to not perform any L1D cache flush.

I still think that we need some explanation about SMT in guests, i.e. that
the SMT information in guests is inaccurate and does not tell anything
about the host side SMT control state. But that's independent of this
nested optimization as it applies to all guest levels.

> +u64 kvm_get_arch_capabilities(void)
> +{
> +	u64 data;
> +
> +	rdmsrl_safe(MSR_IA32_ARCH_CAPABILITIES, &data);
> +	if (l1tf_vmx_mitigation != VMENTER_L1D_FLUSH_NEVER)
> +		data |= ARCH_CAP_SKIP_VMENTRY_L1DFLUSH;

That really wants a comment explaining the magic here.

Thanks,

	tglx


* Re: [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0
  2018-07-26 10:41       ` Paolo Bonzini
@ 2018-07-30 21:40         ` Thomas Gleixner
  0 siblings, 0 replies; 21+ messages in thread
From: Thomas Gleixner @ 2018-07-30 21:40 UTC (permalink / raw)
  To: speck

On Thu, 26 Jul 2018, speck for Paolo Bonzini wrote:
> On 26/07/2018 12:04, speck for Greg KH wrote:
> > On Thu, Jul 26, 2018 at 10:12:58AM +0200, speck for Paolo Bonzini wrote:
> >> On 25/07/2018 17:52, speck for Greg KH wrote:
> >>> On Wed, Jul 25, 2018 at 04:30:56PM +0200, speck for Paolo Bonzini wrote:
> >>>> Support for ARCH_CAPABILITIES bit 3, which can already be used to disable
> >>>> the mitigations on a nested hypervisor.  Patch 1 is already in Linus's
> >>>> tree.
> >>> Should I be taking patch 1 into a stable tree now, or wait until the
> >>> rest of these land?
> >> Yes, you can do that if you prefer.  It's commit
> >> cd28325249a1ca0d771557ce823e0308ad629f98.
> > It seems only applicable to 4.17.y, older kernels (like 4.14 and 4.9)
> > don't seem to need it, right?  Or will they be needed for this "mess"?
> 
> It will be useful for this mess.  Though not strictly necessary, let's
> not slow down things more than they would ayway.
> 
> 4.14 and 4.9 need those missing patches to correctly report
> vulnerability to Spectrev1 on some AMD families, but 1) AMD did not
> contribute the userspace bits 2) anyway it's just sysfs, not actual
> change in vulnerable/not vulnerable status 3) anyway^2 this only applies
> to mitigation with lfence, not the masking trick, so it's more or less moot.
> 
> I'll hunt for the whole dependency chain for 4.14 and 4.9 and send
> everything to stable.  It's not a big one and should backport just fine.

That'd be nice to have so I can backport the whole pile once it has
settled to 4.14 without twisting my brain too much :)

Thanks,

	tglx


* [MODERATED] Re: [PATCH v2 4/4] L1TF KVM ARCH_CAPABILITIES #4
  2018-07-30 21:36   ` Thomas Gleixner
@ 2018-07-31  7:39     ` Paolo Bonzini
  2018-07-31  7:59       ` Thomas Gleixner
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2018-07-31  7:39 UTC (permalink / raw)
  To: speck

On 30/07/2018 23:36, speck for Thomas Gleixner wrote:
> On Wed, 25 Jul 2018, speck for Paolo Bonzini wrote:
>>  
>> +3.4. Nested virtual machines
>> +""""""""""""""""""""""""""""
>> +
>> +When nested virtualization is in use, three operating systems are involved:
>> +the bare metal hypervisor, the nested hypervisor, and the nested virtual
>> +machine.  VMENTER operations from the nested hypervisor into the nested
>> +guest will always be processed by the bare metal hypervisor.
>> +
>> +Therefore, when running as a bare metal hypervisor, KVM will:
>> +
>> + - flush the L1D cache on every switch from nested hypervisor to
>> +   nested virtual machine, so that the nested hypervisor's secrets
>> +   are not exposed to the nested virtual machine;
>> +
>> + - flush the L1D cache on every switch from nested virtual machine to
>> +   nested hypervisor; this is a complex operation, and flushing the
>> +   L1D cache prevents the bare metal hypervisor's secrets from being
>> +   exposed to the nested virtual machine;
>> +
>> + - instruct the nested hypervisor to not perform any L1D cache flush.
> 
> I still think that we need some explanation about SMT in guests, i.e. that
> the SMT information in guests is inaccurate and does not tell anything
> about the host side SMT control state. But that's independent of this
> nested optimization as it applies to all guest levels.
> 
>> +u64 kvm_get_arch_capabilities(void)
>> +{
>> +	u64 data;
>> +
>> +	rdmsrl_safe(MSR_IA32_ARCH_CAPABILITIES, &data);
>> +	if (l1tf_vmx_mitigation != VMENTER_L1D_FLUSH_NEVER)
>> +		data |= ARCH_CAP_SKIP_VMENTRY_L1DFLUSH;
> 
> That really wants a comment explaining the magic here.

Something like:

	/*
	 * If we're doing cache flushes (either "always" or "cond")
	 * we will do one whenever the guest does a vmlaunch/vmresume.
	 * If an outer hypervisor is doing the cache flush for us
	 * (VMENTER_L1D_FLUSH_NESTED_VM), we can safely pass that
	 * capability to the guest too, and if EPT is disabled we're not
	 * vulnerable.  Overall, only VMENTER_L1D_FLUSH_NEVER will
	 * require a nested hypervisor to do a flush of its own.
	 */

?

> 
> Thanks,
> 
> 	tglx
> 


* Re: [PATCH v2 4/4] L1TF KVM ARCH_CAPABILITIES #4
  2018-07-31  7:39     ` [MODERATED] " Paolo Bonzini
@ 2018-07-31  7:59       ` Thomas Gleixner
  0 siblings, 0 replies; 21+ messages in thread
From: Thomas Gleixner @ 2018-07-31  7:59 UTC (permalink / raw)
  To: speck

On Tue, 31 Jul 2018, speck for Paolo Bonzini wrote:
> On 30/07/2018 23:36, speck for Thomas Gleixner wrote:
> >> +u64 kvm_get_arch_capabilities(void)
> >> +{
> >> +	u64 data;
> >> +
> >> +	rdmsrl_safe(MSR_IA32_ARCH_CAPABILITIES, &data);
> >> +	if (l1tf_vmx_mitigation != VMENTER_L1D_FLUSH_NEVER)
> >> +		data |= ARCH_CAP_SKIP_VMENTRY_L1DFLUSH;
> > 
> > That really wants a comment explaining the magic here.
> 
> Something like:
> 
> 	/*
> 	 * If we're doing cache flushes (either "always" or "cond")
> 	 * we will do one whenever the guest does a vmlaunch/vmresume.
> 	 * If an outer hypervisor is doing the cache flush for us
> 	 * (VMENTER_L1D_FLUSH_NESTED_VM), we can safely pass that
> 	 * capability to the guest too, and if EPT is disabled we're not
> 	 * vulnerable.  Overall, only VMENTER_L1D_FLUSH_NEVER will
> 	 * require a nested hypervisor to do a flush of its own.
> 	 */
> 
> ?

Works for me.


* [MODERATED] Re: [PATCH v2 2/4] L1TF KVM ARCH_CAPABILITIES #2
  2018-07-30 21:27   ` Thomas Gleixner
@ 2018-07-31  8:22     ` Paolo Bonzini
  2018-07-31  9:15       ` Thomas Gleixner
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2018-07-31  8:22 UTC (permalink / raw)
  To: speck


On 30/07/2018 23:27, speck for Thomas Gleixner wrote:
> On Wed, 25 Jul 2018, speck for Paolo Bonzini wrote:
> 
>> From: Paolo Bonzini <pbonzini@redhat.com>
>> Subject: [PATCH v2 2/4] x86: SMT doesn't matter for VMX L1TF if EPT disabled or
>>  mitigation disabled
>>
>> If EPT is disabled, L1TF cannot be exploited even across threads on the
>> same core, and SMT is irrelevant.
> 
> Ack.
> 
>> If mitigation is completely disabled, L1TF can be exploited even within a
>> single thread, so SMT is again irrelevant.
> 
> I'm not sure about this one. You might decide that the risk of not flushing
> is acceptable if SMT is off, but then if SMT is enabled the whole thing
> might be less acceptable. So keeping that information in the sysfs file
> makes some sense. No strong opinion though.

Fair enough, I'll change it to still do "VMX: SMT disabled, L1D
vulnerable".  What do you think actually about switching the order to
"VMX: xxx, SMT yyy"?

Paolo



* Re: [PATCH v2 2/4] L1TF KVM ARCH_CAPABILITIES #2
  2018-07-31  8:22     ` [MODERATED] " Paolo Bonzini
@ 2018-07-31  9:15       ` Thomas Gleixner
  2018-07-31  9:35         ` [MODERATED] " Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Thomas Gleixner @ 2018-07-31  9:15 UTC (permalink / raw)
  To: speck

On Tue, 31 Jul 2018, speck for Paolo Bonzini wrote:

> On 30/07/2018 23:27, speck for Thomas Gleixner wrote:
> > On Wed, 25 Jul 2018, speck for Paolo Bonzini wrote:
> > 
> >> From: Paolo Bonzini <pbonzini@redhat.com>
> >> Subject: [PATCH v2 2/4] x86: SMT doesn't matter for VMX L1TF if EPT disabled or
> >>  mitigation disabled
> >>
> >> If EPT is disabled, L1TF cannot be exploited even across threads on the
> >> same core, and SMT is irrelevant.
> > 
> > Ack.
> > 
> >> If mitigation is completely disabled, L1TF can be exploited even within a
> >> single thread, so SMT is again irrelevant.
> > 
> > I'm not sure about this one. You might decide that the risk of not flushing
> > is acceptable if SMT is off, but then if SMT is enabled the whole thing
> > might be less acceptable. So keeping that information in the sysfs file
> > makes some sense. No strong opinion though.
> 
> Fair enough, I'll change it to still do "VMX: SMT disabled, L1D
> vulnerable".  What do you think actually about switching the order to
> "VMX: xxx, SMT yyy"?

No objections.


* [MODERATED] Re: [PATCH v2 2/4] L1TF KVM ARCH_CAPABILITIES #2
  2018-07-31  9:15       ` Thomas Gleixner
@ 2018-07-31  9:35         ` Paolo Bonzini
  0 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2018-07-31  9:35 UTC (permalink / raw)
  To: speck


On 31/07/2018 11:15, speck for Thomas Gleixner wrote:
> On Tue, 31 Jul 2018, speck for Paolo Bonzini wrote:
> 
>> On 30/07/2018 23:27, speck for Thomas Gleixner wrote:
>>> On Wed, 25 Jul 2018, speck for Paolo Bonzini wrote:
>>>
>>>> From: Paolo Bonzini <pbonzini@redhat.com>
>>>> Subject: [PATCH v2 2/4] x86: SMT doesn't matter for VMX L1TF if EPT disabled or
>>>>  mitigation disabled
>>>>
>>>> If EPT is disabled, L1TF cannot be exploited even across threads on the
>>>> same core, and SMT is irrelevant.
>>>
>>> Ack.
>>>
>>>> If mitigation is completely disabled, L1TF can be exploited even within a
>>>> single thread, so SMT is again irrelevant.
>>>
>>> I'm not sure about this one. You might decide that the risk of not flushing
>>> is acceptable if SMT is off, but then if SMT is enabled the whole thing
>>> might be less acceptable. So keeping that information in the sysfs file
>>> makes some sense. No strong opinion though.
>>
>> Fair enough, I'll change it to still do "VMX: SMT disabled, L1D
>> vulnerable".  What do you actually think about switching the order to
>> "VMX: xxx, SMT yyy"?
> 
> No objections.

Good, v3 should come with my usual Mediterranean punctuality.

Paolo
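
To make the agreed ordering concrete, here is a minimal sketch of how the
resulting sysfs string would be composed.  The state names below are
illustrative assumptions for the sketch, not the exact strings used in
bugs.c:

```python
# Sketch of the "VMX: xxx, SMT yyy" ordering discussed above.
# l1d_state is the L1D flush mitigation state ("vulnerable",
# "conditional cache flushes", ...); the names are assumptions.
def l1tf_vmx_status(l1d_state, smt_enabled):
    """Compose the VMX part of the l1tf sysfs line, VMX state first."""
    smt = "vulnerable" if smt_enabled else "disabled"
    return "VMX: %s, SMT %s" % (l1d_state, smt)

print(l1tf_vmx_status("vulnerable", False))
# "VMX: vulnerable, SMT disabled" -- SMT info stays visible even when
# the mitigation is off, as Thomas asked for.
```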



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [MODERATED] Re: [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0
  2018-07-25 14:30 [MODERATED] [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0 Paolo Bonzini
                   ` (4 preceding siblings ...)
  2018-07-25 15:52 ` [MODERATED] Re: [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0 Greg KH
@ 2018-08-02  2:51 ` Konrad Rzeszutek Wilk
  2018-08-02 12:07   ` Paolo Bonzini
  5 siblings, 1 reply; 21+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-08-02  2:51 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 1247 bytes --]

On Wed, Jul 25, 2018 at 04:30:56PM +0200, speck for Paolo Bonzini wrote:
> Support for ARCH_CAPABILITIES bit 3, which can already be used to disable
> the mitigations on a nested hypervisor.  Patch 1 is already in Linus's
> tree.

This is still not working for me. Attached is the 'dmesg' output
using the debug patch (also included).

I am pretty sure that the fault is with QEMU not doing it correctly
or doing something extra. Attaching the qemu.patch as well.

Did I miss a v3 from Robert Hoo?


> 
> Paolo Bonzini (4):
>   KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR
>   x86: SMT doesn't matter for VMX L1TF if EPT disabled or mitigation
>     disabled
>   x86: use ARCH_CAPABILITIES to skip L1D flush on vmentry
>   KVM: VMX: tell nested hypervisor to skip L1D flush on vmentry
> 
>  Documentation/admin-guide/l1tf.rst | 21 +++++++++++++++++++++
>  arch/x86/include/asm/kvm_host.h    |  1 +
>  arch/x86/include/asm/msr-index.h   |  1 +
>  arch/x86/include/asm/vmx.h         |  1 +
>  arch/x86/kernel/cpu/bugs.c         |  8 +++++++-
>  arch/x86/kvm/vmx.c                 | 13 +++++++++++--
>  arch/x86/kvm/x86.c                 | 18 +++++++++++++++++-
>  7 files changed, 59 insertions(+), 4 deletions(-)
> 
> -- 
> 2.17.1

[-- Attachment #2: debug.patch --]
[-- Type: text/plain, Size: 2639 bytes --]

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 0768492f4687..3edc743c5156 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -987,7 +987,7 @@ static const __initconst struct x86_cpu_id cpu_no_l1tf[] = {
 	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_ATOM_MOOREFIELD	},
 	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_ATOM_GOLDMONT	},
 	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_ATOM_DENVERTON	},
-	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_ATOM_GEMINI_LAKE	},
+	/*{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_ATOM_GEMINI_LAKE	},*/
 	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_XEON_PHI_KNL		},
 	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_XEON_PHI_KNM		},
 	{}
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 172504d76b6c..50e3da7831d4 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -222,6 +222,7 @@ static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
 	       u64 msr;
 
 	       rdmsrl(MSR_IA32_ARCH_CAPABILITIES, msr);
+printk("%s:%llx\n", __func__, msr);
 	       if (msr & ARCH_CAP_SKIP_VMENTRY_L1DFLUSH) {
 		       l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_NOT_REQUIRED;
 		       return 0;
@@ -3916,6 +3917,9 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		msr_info->data = to_vmx(vcpu)->spec_ctrl;
 		break;
 	case MSR_IA32_ARCH_CAPABILITIES:
+printk("%s:%d %d %llx\n", __func__,
+		msr_info->host_initiated, guest_cpuid_has(vcpu, X86_FEATURE_ARCH_CAPABILITIES),
+		to_vmx(vcpu)->arch_capabilities);
 		if (!msr_info->host_initiated &&
 		    !guest_cpuid_has(vcpu, X86_FEATURE_ARCH_CAPABILITIES))
 			return 1;
@@ -4084,6 +4088,8 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 					      MSR_TYPE_W);
 		break;
 	case MSR_IA32_ARCH_CAPABILITIES:
+printk("%s:%d %llx\n", __func__, msr_info->host_initiated, data);
+if (data==0) dump_stack();
 		if (!msr_info->host_initiated)
 			return 1;
 		vmx->arch_capabilities = data;
@@ -6431,7 +6437,7 @@ static void vmx_vcpu_setup(struct vcpu_vmx *vmx)
 	}
 
 	vmx->arch_capabilities = kvm_get_arch_capabilities();
-
+printk("%s:%llx\n", __func__, kvm_get_arch_capabilities());
 	vm_exit_controls_init(vmx, vmcs_config.vmexit_ctrl);
 
 	/* 22.2.1, 20.8.1 */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1a1d4cfc6322..efb7290b412c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1120,6 +1120,8 @@ static int kvm_get_msr_feature(struct kvm_msr_entry *msr)
 	switch (msr->index) {
 	case MSR_IA32_ARCH_CAPABILITIES:
 		msr->data = kvm_get_arch_capabilities();
+printk("%s:%llx\n", __func__, msr->data);
+dump_stack();
 		break;
 	case MSR_IA32_UCODE_REV:
 		rdmsrl_safe(msr->index, &msr->data);

[-- Attachment #3: log --]
[-- Type: text/plain, Size: 12390 bytes --]

[ 1209.245984] kvm_get_msr_feature:a
[ 1209.245989] CPU: 0 PID: 25315 Comm: insmod Tainted: G            E     4.18.0-rc1speck+ #4
[ 1209.245989] Hardware name: Intel Corporation NUC7PJYH/NUC7JYB, BIOS JYGLKCPX.86A.0041.2018.0626.1500 06/26/2018
[ 1209.245990] Call Trace:
[ 1209.246001]  dump_stack+0x5c/0x80
[ 1209.246032]  kvm_get_msr_feature+0x9b/0xb0 [kvm]
[ 1209.246052]  kvm_arch_hardware_setup+0x153/0x200 [kvm]
[ 1209.246072]  kvm_init+0x75/0x260 [kvm]
[ 1209.246082]  ? hardware_setup+0x5d1/0x5d1 [kvm_intel]
[ 1209.246086]  vmx_init+0x24/0x4c3 [kvm_intel]
[ 1209.246091]  ? hardware_setup+0x5d1/0x5d1 [kvm_intel]
[ 1209.246095]  do_one_initcall+0x46/0x1c3
[ 1209.246099]  ? _cond_resched+0x15/0x30
[ 1209.246102]  ? kmem_cache_alloc_trace+0x166/0x1d0
[ 1209.246104]  ? do_init_module+0x22/0x210
[ 1209.246106]  do_init_module+0x5a/0x210
[ 1209.246108]  load_module+0x2070/0x2400
[ 1209.246112]  ? vfs_read+0x110/0x140
[ 1209.246114]  ? __do_sys_finit_module+0xa8/0x110
[ 1209.246115]  __do_sys_finit_module+0xa8/0x110
[ 1209.246118]  do_syscall_64+0x5b/0x160
[ 1209.246120]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1209.246123] RIP: 0033:0x7f016419aa39
[ 1209.246124] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 57 44 2c 00 f7 d8 64 89 01 48 
[ 1209.246154] RSP: 002b:00007fffc2b8ce18 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 1209.246156] RAX: ffffffffffffffda RBX: 000055bee7c4d850 RCX: 00007f016419aa39
[ 1209.246156] RDX: 0000000000000000 RSI: 000055bee6a763e6 RDI: 0000000000000003
[ 1209.246157] RBP: 000055bee6a763e6 R08: 0000000000000000 R09: 00007f0164462000
[ 1209.246158] R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000
[ 1209.246159] R13: 000055bee7c4d820 R14: 0000000000000000 R15: 0000000000000000
[ 1209.247163] vmx_setup_l1d_flush:2
[ 1209.405938] vmx_vcpu_setup:a
[ 1209.407062] vmx_vcpu_setup:a
[ 1209.427438] vmx_set_msr:1 0
[ 1209.427443] CPU: 1 PID: 25327 Comm: qemu-system-x86 Tainted: G            E     4.18.0-rc1speck+ #4
[ 1209.427444] Hardware name: Intel Corporation NUC7PJYH/NUC7JYB, BIOS JYGLKCPX.86A.0041.2018.0626.1500 06/26/2018
[ 1209.427445] Call Trace:
[ 1209.427458]  dump_stack+0x5c/0x80
[ 1209.427468]  vmx_set_msr.cold.133+0x23/0xa8 [kvm_intel]
[ 1209.427502]  ? do_get_msr+0x70/0x70 [kvm]
[ 1209.427519]  do_set_msr+0x31/0x50 [kvm]
[ 1209.427538]  msr_io+0xab/0x140 [kvm]
[ 1209.427558]  kvm_arch_vcpu_ioctl+0x206/0xed0 [kvm]
[ 1209.427564]  ? vmx_set_cr0+0x2bc/0x7b0 [kvm_intel]
[ 1209.427567]  ? setup_msrs+0x4e9/0x5e0 [kvm_intel]
[ 1209.427570]  ? vmx_set_segment+0x155/0x1e0 [kvm_intel]
[ 1209.427588]  ? __set_sregs+0x3d7/0x590 [kvm]
[ 1209.427605]  ? kvm_arch_vcpu_load+0x51/0x250 [kvm]
[ 1209.427609]  ? vmx_vcpu_put+0x11/0x70 [kvm_intel]
[ 1209.427613]  ? native_set_debugreg+0x28/0x30
[ 1209.427630]  ? kvm_arch_vcpu_put+0xd4/0xe0 [kvm]
[ 1209.427647]  kvm_vcpu_ioctl+0xd6/0x5e0 [kvm]
[ 1209.427651]  do_vfs_ioctl+0xa4/0x620
[ 1209.427654]  ksys_ioctl+0x60/0x90
[ 1209.427656]  __x64_sys_ioctl+0x16/0x20
[ 1209.427659]  do_syscall_64+0x5b/0x160
[ 1209.427663]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1209.427665] RIP: 0033:0x7f3f80a0be17
[ 1209.427666] Code: 00 00 90 48 8b 05 a9 80 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 79 80 2c 00 f7 d8 64 89 01 48 
[ 1209.427695] RSP: 002b:00007f3f693f6128 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1209.427697] RAX: ffffffffffffffda RBX: 000000004008ae89 RCX: 00007f3f80a0be17
[ 1209.427698] RDX: 00007f3f54002010 RSI: 000000004008ae89 RDI: 000000000000000f
[ 1209.427699] RBP: 00007f3f54002010 R08: 00007f3f54003010 R09: 0000000000000000
[ 1209.427700] R10: 00007f3f54002548 R11: 0000000000000246 R12: 000055b1f3f815b0
[ 1209.427701] R13: 00007f3f54002538 R14: 00007f3f54003010 R15: 00007ffed8f88380
[ 1209.427860] vmx_set_msr:1 0
[ 1209.427863] CPU: 1 PID: 25328 Comm: qemu-system-x86 Tainted: G            E     4.18.0-rc1speck+ #4
[ 1209.427864] Hardware name: Intel Corporation NUC7PJYH/NUC7JYB, BIOS JYGLKCPX.86A.0041.2018.0626.1500 06/26/2018
[ 1209.427864] Call Trace:
[ 1209.427871]  dump_stack+0x5c/0x80
[ 1209.427877]  vmx_set_msr.cold.133+0x23/0xa8 [kvm_intel]
[ 1209.427898]  ? do_get_msr+0x70/0x70 [kvm]
[ 1209.427914]  do_set_msr+0x31/0x50 [kvm]
[ 1209.427933]  msr_io+0xab/0x140 [kvm]
[ 1209.427953]  kvm_arch_vcpu_ioctl+0x206/0xed0 [kvm]
[ 1209.427957]  ? vmx_set_cr0+0x2bc/0x7b0 [kvm_intel]
[ 1209.427961]  ? setup_msrs+0x4e9/0x5e0 [kvm_intel]
[ 1209.427964]  ? vmx_set_segment+0x155/0x1e0 [kvm_intel]
[ 1209.427981]  ? __set_sregs+0x3d7/0x590 [kvm]
[ 1209.427999]  ? kvm_arch_vcpu_load+0x51/0x250 [kvm]
[ 1209.428004]  ? vmx_vcpu_put+0x11/0x70 [kvm_intel]
[ 1209.428006]  ? native_set_debugreg+0x28/0x30
[ 1209.428023]  ? kvm_arch_vcpu_put+0xd4/0xe0 [kvm]
[ 1209.428039]  kvm_vcpu_ioctl+0xd6/0x5e0 [kvm]
[ 1209.428043]  do_vfs_ioctl+0xa4/0x620
[ 1209.428046]  ksys_ioctl+0x60/0x90
[ 1209.428049]  __x64_sys_ioctl+0x16/0x20
[ 1209.428051]  do_syscall_64+0x5b/0x160
[ 1209.428054]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1209.428056] RIP: 0033:0x7f3f80a0be17
[ 1209.428056] Code: 00 00 90 48 8b 05 a9 80 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 79 80 2c 00 f7 d8 64 89 01 48 
[ 1209.428086] RSP: 002b:00007f3f68bf5128 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1209.428088] RAX: ffffffffffffffda RBX: 000000004008ae89 RCX: 00007f3f80a0be17
[ 1209.428089] RDX: 00007f3f58002010 RSI: 000000004008ae89 RDI: 0000000000000010
[ 1209.428090] RBP: 00007f3f58002010 R08: 00007f3f58003010 R09: 0000000000000000
[ 1209.428091] R10: 00007f3f58002548 R11: 0000000000000246 R12: 000055b1f3fceb10
[ 1209.428091] R13: 00007f3f58002538 R14: 00007f3f58003010 R15: 00007ffed8f88380
[ 1209.435819] vmx_get_msr:1 1 0
[ 1209.435951] vmx_get_msr:1 1 0
[ 1209.437095] vmx_set_msr:1 0
[ 1209.437100] CPU: 2 PID: 25327 Comm: qemu-system-x86 Tainted: G            E     4.18.0-rc1speck+ #4
[ 1209.437101] Hardware name: Intel Corporation NUC7PJYH/NUC7JYB, BIOS JYGLKCPX.86A.0041.2018.0626.1500 06/26/2018
[ 1209.437102] Call Trace:
[ 1209.437114]  dump_stack+0x5c/0x80
[ 1209.437124]  vmx_set_msr.cold.133+0x23/0xa8 [kvm_intel]
[ 1209.437157]  ? do_get_msr+0x70/0x70 [kvm]
[ 1209.437174]  do_set_msr+0x31/0x50 [kvm]
[ 1209.437194]  msr_io+0xab/0x140 [kvm]
[ 1209.437215]  kvm_arch_vcpu_ioctl+0x206/0xed0 [kvm]
[ 1209.437221]  ? vmx_set_cr0+0x2bc/0x7b0 [kvm_intel]
[ 1209.437224]  ? setup_msrs+0x4e9/0x5e0 [kvm_intel]
[ 1209.437228]  ? vmx_set_segment+0x155/0x1e0 [kvm_intel]
[ 1209.437247]  ? __set_sregs+0x3d7/0x590 [kvm]
[ 1209.437266]  ? kvm_arch_vcpu_load+0x51/0x250 [kvm]
[ 1209.437270]  ? vmx_vcpu_put+0x11/0x70 [kvm_intel]
[ 1209.437274]  ? native_set_debugreg+0x28/0x30
[ 1209.437293]  ? kvm_arch_vcpu_put+0xd4/0xe0 [kvm]
[ 1209.437312]  kvm_vcpu_ioctl+0xd6/0x5e0 [kvm]
[ 1209.437317]  ? do_signal+0x1a3/0x610
[ 1209.437320]  ? __fpu__restore_sig+0x90/0x440
[ 1209.437323]  do_vfs_ioctl+0xa4/0x620
[ 1209.437326]  ksys_ioctl+0x60/0x90
[ 1209.437327]  __x64_sys_ioctl+0x16/0x20
[ 1209.437331]  do_syscall_64+0x5b/0x160
[ 1209.437334]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1209.437338] RIP: 0033:0x7f3f80a0be17
[ 1209.437338] Code: 00 00 90 48 8b 05 a9 80 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 79 80 2c 00 f7 d8 64 89 01 48 
[ 1209.437368] RSP: 002b:00007f3f693f6128 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1209.437370] RAX: ffffffffffffffda RBX: 000000004008ae89 RCX: 00007f3f80a0be17
[ 1209.437371] RDX: 00007f3f54002010 RSI: 000000004008ae89 RDI: 000000000000000f
[ 1209.437372] RBP: 00007f3f54002010 R08: 00007f3f54003010 R09: 0000000000000000
[ 1209.437373] R10: 00007f3f54002548 R11: 0000000000000246 R12: 000055b1f3f815b0
[ 1209.437373] R13: 00007f3f54002538 R14: 00007f3f54003010 R15: 00007ffed8f88380
[ 1209.437507] vmx_set_msr:1 0
[ 1209.437510] CPU: 2 PID: 25328 Comm: qemu-system-x86 Tainted: G            E     4.18.0-rc1speck+ #4
[ 1209.437510] Hardware name: Intel Corporation NUC7PJYH/NUC7JYB, BIOS JYGLKCPX.86A.0041.2018.0626.1500 06/26/2018
[ 1209.437511] Call Trace:
[ 1209.437516]  dump_stack+0x5c/0x80
[ 1209.437522]  vmx_set_msr.cold.133+0x23/0xa8 [kvm_intel]
[ 1209.437542]  ? do_get_msr+0x70/0x70 [kvm]
[ 1209.437559]  do_set_msr+0x31/0x50 [kvm]
[ 1209.437579]  msr_io+0xab/0x140 [kvm]
[ 1209.437599]  kvm_arch_vcpu_ioctl+0x206/0xed0 [kvm]
[ 1209.437604]  ? vmx_set_cr0+0x2bc/0x7b0 [kvm_intel]
[ 1209.437607]  ? setup_msrs+0x4e9/0x5e0 [kvm_intel]
[ 1209.437611]  ? vmx_set_segment+0x155/0x1e0 [kvm_intel]
[ 1209.437629]  ? __set_sregs+0x3d7/0x590 [kvm]
[ 1209.437648]  ? kvm_arch_vcpu_load+0x51/0x250 [kvm]
[ 1209.437652]  ? vmx_vcpu_put+0x11/0x70 [kvm_intel]
[ 1209.437654]  ? native_set_debugreg+0x28/0x30
[ 1209.437673]  ? kvm_arch_vcpu_put+0xd4/0xe0 [kvm]
[ 1209.437691]  kvm_vcpu_ioctl+0xd6/0x5e0 [kvm]
[ 1209.437695]  ? do_signal+0x1a3/0x610
[ 1209.437697]  ? __fpu__restore_sig+0x90/0x440
[ 1209.437698]  do_vfs_ioctl+0xa4/0x620
[ 1209.437700]  ksys_ioctl+0x60/0x90
[ 1209.437702]  __x64_sys_ioctl+0x16/0x20
[ 1209.437704]  do_syscall_64+0x5b/0x160
[ 1209.437706]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1209.437708] RIP: 0033:0x7f3f80a0be17
[ 1209.437708] Code: 00 00 90 48 8b 05 a9 80 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 79 80 2c 00 f7 d8 64 89 01 48 
[ 1209.437737] RSP: 002b:00007f3f68bf5128 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1209.437738] RAX: ffffffffffffffda RBX: 000000004008ae89 RCX: 00007f3f80a0be17
[ 1209.437739] RDX: 00007f3f58002010 RSI: 000000004008ae89 RDI: 0000000000000010
[ 1209.437740] RBP: 00007f3f58002010 R08: 00007f3f58003010 R09: 0000000000000000
[ 1209.437741] R10: 00007f3f58002548 R11: 0000000000000246 R12: 000055b1f3fceb10
[ 1209.437742] R13: 00007f3f58002538 R14: 00007f3f58003010 R15: 00007ffed8f88380
[ 1209.532472] vmx_get_msr:1 1 0
[ 1209.533832] vmx_set_msr:1 0
[ 1209.533836] CPU: 2 PID: 25327 Comm: qemu-system-x86 Tainted: G            E     4.18.0-rc1speck+ #4
[ 1209.533837] Hardware name: Intel Corporation NUC7PJYH/NUC7JYB, BIOS JYGLKCPX.86A.0041.2018.0626.1500 06/26/2018
[ 1209.533838] Call Trace:
[ 1209.533850]  dump_stack+0x5c/0x80
[ 1209.533859]  vmx_set_msr.cold.133+0x23/0xa8 [kvm_intel]
[ 1209.533888]  ? do_get_msr+0x70/0x70 [kvm]
[ 1209.533905]  do_set_msr+0x31/0x50 [kvm]
[ 1209.533924]  msr_io+0xab/0x140 [kvm]
[ 1209.533943]  kvm_arch_vcpu_ioctl+0x206/0xed0 [kvm]
[ 1209.533948]  ? vmx_set_cr0+0x2bc/0x7b0 [kvm_intel]
[ 1209.533952]  ? setup_msrs+0x4e9/0x5e0 [kvm_intel]
[ 1209.533955]  ? vmx_set_segment+0x155/0x1e0 [kvm_intel]
[ 1209.533972]  ? __set_sregs+0x3d7/0x590 [kvm]
[ 1209.533989]  ? kvm_arch_vcpu_load+0x51/0x250 [kvm]
[ 1209.533993]  ? vmx_vcpu_put+0x11/0x70 [kvm_intel]
[ 1209.533999]  ? native_set_debugreg+0x28/0x30
[ 1209.534016]  ? kvm_arch_vcpu_put+0xd4/0xe0 [kvm]
[ 1209.534032]  kvm_vcpu_ioctl+0xd6/0x5e0 [kvm]
[ 1209.534036]  ? set_next_entity+0x96/0x1b0
[ 1209.534040]  do_vfs_ioctl+0xa4/0x620
[ 1209.534042]  ksys_ioctl+0x60/0x90
[ 1209.534044]  __x64_sys_ioctl+0x16/0x20
[ 1209.534048]  do_syscall_64+0x5b/0x160
[ 1209.534051]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1209.534054] RIP: 0033:0x7f3f80a0be17
[ 1209.534055] Code: 00 00 90 48 8b 05 a9 80 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 79 80 2c 00 f7 d8 64 89 01 48 
[ 1209.534084] RSP: 002b:00007f3f693f60c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1209.534086] RAX: ffffffffffffffda RBX: 000000004008ae89 RCX: 00007f3f80a0be17
[ 1209.534087] RDX: 00007f3f54002010 RSI: 000000004008ae89 RDI: 000000000000000f
[ 1209.534088] RBP: 00007f3f54002010 R08: 00007f3f54003010 R09: 0000000000000000
[ 1209.534089] R10: 0000000000000001 R11: 0000000000000246 R12: 000055b1f3f815b0
[ 1209.534090] R13: 0000000000000000 R14: 0000000000000000 R15: 000055b1f3f815b0
[ 1226.636670] vmx_get_msr:0 1 0
[ 1226.636682] vmx_get_msr:0 1 0
[ 1226.884057] vmx_get_msr:0 1 0
[ 1226.937410] vmx_get_msr:0 1 0
[ 1232.986946] vmx_get_msr:0 1 0

[-- Attachment #4: qemu.patch --]
[-- Type: text/plain, Size: 4058 bytes --]

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index e0e2f2eea1..070cb7cd9e 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1000,7 +1000,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
             NULL, NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
             NULL, NULL, "spec-ctrl", NULL,
-            NULL, NULL, NULL, "ssbd",
+            NULL, "arch-cap", NULL, "ssbd",
         },
         .cpuid_eax = 7,
         .cpuid_needs_ecx = true, .cpuid_ecx = 0,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 2c5a0d90a6..2211b01797 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -355,7 +355,7 @@ typedef enum X86Seg {
 #define MSR_IA32_SPEC_CTRL              0x48
 #define MSR_VIRT_SSBD                   0xc001011f
 #define MSR_IA32_TSCDEADLINE            0x6e0
-
+#define MSR_IA32_ARCH_CAPABILITIES	0x0000010a
 #define FEATURE_CONTROL_LOCKED                    (1<<0)
 #define FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX (1<<2)
 #define FEATURE_CONTROL_LMCE                      (1<<20)
@@ -1212,6 +1212,7 @@ typedef struct CPUX86State {
 
     uint64_t spec_ctrl;
     uint64_t virt_ssbd;
+    uint64_t arch_cap;
 
     /* End of state preserved by INIT (dummy marker).  */
     struct {} end_init_save;
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index ebb2d23aa4..b96d9f6b3e 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -93,6 +93,7 @@ static bool has_msr_hv_reenlightenment;
 static bool has_msr_xss;
 static bool has_msr_spec_ctrl;
 static bool has_msr_virt_ssbd;
+static bool has_msr_arch_cap;
 static bool has_msr_smi_count;
 
 static uint32_t has_architectural_pmu_version;
@@ -1288,6 +1289,9 @@ static int kvm_get_supported_msrs(KVMState *s)
                 case MSR_VIRT_SSBD:
                     has_msr_virt_ssbd = true;
                     break;
+                case MSR_IA32_ARCH_CAPABILITIES:
+                    has_msr_arch_cap = true;
+                    break;
                 }
             }
         }
@@ -1802,6 +1806,9 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
     if (has_msr_virt_ssbd) {
         kvm_msr_entry_add(cpu, MSR_VIRT_SSBD, env->virt_ssbd);
     }
+    if (has_msr_arch_cap) {
+        kvm_msr_entry_add(cpu, MSR_IA32_ARCH_CAPABILITIES, env->arch_cap);
+    }
 
 #ifdef TARGET_X86_64
     if (lm_capable_kernel) {
@@ -2185,6 +2192,9 @@ static int kvm_get_msrs(X86CPU *cpu)
     if (has_msr_virt_ssbd) {
         kvm_msr_entry_add(cpu, MSR_VIRT_SSBD, 0);
     }
+    if (has_msr_arch_cap) {
+        kvm_msr_entry_add(cpu, MSR_IA32_ARCH_CAPABILITIES, 0);
+    }
     if (!env->tsc_valid) {
         kvm_msr_entry_add(cpu, MSR_IA32_TSC, 0);
         env->tsc_valid = !runstate_is_running();
@@ -2567,6 +2577,9 @@ static int kvm_get_msrs(X86CPU *cpu)
         case MSR_VIRT_SSBD:
             env->virt_ssbd = msrs[i].data;
             break;
+        case MSR_IA32_ARCH_CAPABILITIES:
+            env->arch_cap = msrs[i].data;
+            break;
         case MSR_IA32_RTIT_CTL:
             env->msr_rtit_ctrl = msrs[i].data;
             break;
diff --git a/target/i386/machine.c b/target/i386/machine.c
index 8b64dff487..963cfc051c 100644
--- a/target/i386/machine.c
+++ b/target/i386/machine.c
@@ -955,6 +955,25 @@ static const VMStateDescription vmstate_svm_npt = {
     }
 };
 
+static bool arch_cap_needed(void *opaque)
+{
+    X86CPU *cpu = opaque;
+    CPUX86State *env = &cpu->env;
+
+    return env->arch_cap != 0;
+}
+
+static const VMStateDescription vmstate_msr_arch_cap = {
+    .name = "cpu/arch_cap",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = arch_cap_needed,
+    .fields = (VMStateField[]){
+        VMSTATE_UINT64(env.arch_cap, X86CPU),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 VMStateDescription vmstate_x86_cpu = {
     .name = "cpu",
     .version_id = 12,
@@ -1080,6 +1099,7 @@ VMStateDescription vmstate_x86_cpu = {
         &vmstate_msr_intel_pt,
         &vmstate_msr_virt_ssbd,
         &vmstate_svm_npt,
+        &vmstate_msr_arch_cap,
         NULL
     }
 };

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [MODERATED] Re: [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0
  2018-08-02  2:51 ` [MODERATED] " Konrad Rzeszutek Wilk
@ 2018-08-02 12:07   ` Paolo Bonzini
  0 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2018-08-02 12:07 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 1692 bytes --]

On 02/08/2018 04:51, speck for Konrad Rzeszutek Wilk wrote:
> On Wed, Jul 25, 2018 at 04:30:56PM +0200, speck for Paolo Bonzini wrote:
>> Support for ARCH_CAPABILITIES bit 3, which can already be used to disable
>> the mitigations on a nested hypervisor.  Patch 1 is already in Linus's
>> tree.
> 
> This is still not working for me. Attached is the 'dmesg' output
> using the debug patch (also included).

Here is my own test program...  It prints 0 without the patches and
8 with.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <string.h>
#include <stdint.h>
#include <linux/kvm.h>
#include <assert.h>


struct vm {
	int sys_fd;
	int fd;
};

struct vcpu {
	int fd;
};


void vm_init(struct vm *vm)
{
	vm->sys_fd = open("/dev/kvm", O_RDWR);
	if (vm->sys_fd < 0) {
		perror("open /dev/kvm");
		exit(1);
	}

	vm->fd = ioctl(vm->sys_fd, KVM_CREATE_VM, 0);
	if (vm->fd < 0) {
		perror("KVM_CREATE_VM");
		exit(1);
	}
}


void vcpu_init(struct vm *vm, struct vcpu *vcpu)
{
	vcpu->fd = ioctl(vm->fd, KVM_CREATE_VCPU, 0);
	if (vcpu->fd < 0) {
		perror("KVM_CREATE_VCPU");
		exit(1);
	}
}


int main(void)
{
	struct vm vm;
	struct vcpu vcpu;

	vm_init(&vm);
	vcpu_init(&vm, &vcpu);

	struct {
		struct kvm_msrs m;
		struct kvm_msr_entry e[1];
	} s = {
		.m.nmsrs = 1,
		.e[0].index = 0x10a	/* MSR_IA32_ARCH_CAPABILITIES */
	};
	ioctl(vm.sys_fd, KVM_GET_MSRS, &s);
	printf("%llx\n", (unsigned long long)s.e[0].data);
	ioctl(vcpu.fd, KVM_GET_MSRS, &s);
	printf("%llx\n", (unsigned long long)s.e[0].data);
	return 0;
}

// Paolo
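
As a footnote on the expected values: 0x10a is MSR_IA32_ARCH_CAPABILITIES,
and the 8 printed with the patches applied is bit 3
(ARCH_CAP_SKIP_VMENTRY_L1DFLUSH), the bit that tells a nested hypervisor it
can skip the L1D flush on vmentry.  A tiny sketch of the check:

```python
# Bit 3 of MSR_IA32_ARCH_CAPABILITIES (MSR 0x10a); the constant name
# matches the kernel define, the helper is illustrative only.
ARCH_CAP_SKIP_VMENTRY_L1DFLUSH = 1 << 3

def can_skip_l1d_flush(arch_cap):
    """True if the guest is told it may skip the L1D flush on vmentry."""
    return bool(arch_cap & ARCH_CAP_SKIP_VMENTRY_L1DFLUSH)

print(can_skip_l1d_flush(0x0))  # without the patches: False
print(can_skip_l1d_flush(0x8))  # with the patches: True
```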


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2018-08-02 12:07 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-25 14:30 [MODERATED] [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0 Paolo Bonzini
2018-07-25 14:30 ` [MODERATED] [PATCH v2 1/4] L1TF KVM ARCH_CAPABILITIES #1 Paolo Bonzini
2018-07-25 19:43   ` [MODERATED] " Andrew Cooper
2018-07-26  8:15     ` Paolo Bonzini
2018-07-25 14:30 ` [MODERATED] [PATCH v2 2/4] L1TF KVM ARCH_CAPABILITIES #2 Paolo Bonzini
2018-07-30 21:27   ` Thomas Gleixner
2018-07-31  8:22     ` [MODERATED] " Paolo Bonzini
2018-07-31  9:15       ` Thomas Gleixner
2018-07-31  9:35         ` [MODERATED] " Paolo Bonzini
2018-07-25 14:30 ` [MODERATED] [PATCH v2 3/4] L1TF KVM ARCH_CAPABILITIES #3 Paolo Bonzini
2018-07-25 14:31 ` [MODERATED] [PATCH v2 4/4] L1TF KVM ARCH_CAPABILITIES #4 Paolo Bonzini
2018-07-30 21:36   ` Thomas Gleixner
2018-07-31  7:39     ` [MODERATED] " Paolo Bonzini
2018-07-31  7:59       ` Thomas Gleixner
2018-07-25 15:52 ` [MODERATED] Re: [PATCH v2 0/4] L1TF KVM ARCH_CAPABILITIES #0 Greg KH
2018-07-26  8:12   ` Paolo Bonzini
2018-07-26 10:04     ` Greg KH
2018-07-26 10:41       ` Paolo Bonzini
2018-07-30 21:40         ` Thomas Gleixner
2018-08-02  2:51 ` [MODERATED] " Konrad Rzeszutek Wilk
2018-08-02 12:07   ` Paolo Bonzini
