linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [V1 PATCH 0/2] Linux PVH: set EFER bits..
@ 2014-08-27 22:33 Mukesh Rathor
  2014-08-27 22:33 ` [V1 PATCH 1/2] PVH: set EFER.NX and EFER.SCE for boot vcpu Mukesh Rathor
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Mukesh Rathor @ 2014-08-27 22:33 UTC (permalink / raw)
  To: boris.ostrovsky, david.vrabel; +Cc: xen-devel, linux-kernel

Resending with comments fixed up. Please note, these are no longer
AMD only, but address existing broken boot and broken NX on intel.

thanks
mukesh


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [V1 PATCH 1/2] PVH: set EFER.NX and EFER.SCE for boot vcpu
  2014-08-27 22:33 [V1 PATCH 0/2] Linux PVH: set EFER bits Mukesh Rathor
@ 2014-08-27 22:33 ` Mukesh Rathor
  2014-08-28 14:18   ` [Xen-devel] " David Vrabel
  2014-08-27 22:33 ` [V1 PATCH 2/2] PVH: set EFER.NX and EFER.SCE for secondary vcpus Mukesh Rathor
  2014-08-28 14:08 ` [Xen-devel] [V1 PATCH 0/2] Linux PVH: set EFER bits David Vrabel
  2 siblings, 1 reply; 8+ messages in thread
From: Mukesh Rathor @ 2014-08-27 22:33 UTC (permalink / raw)
  To: boris.ostrovsky, david.vrabel; +Cc: xen-devel, linux-kernel

This patch addresses three things for a pvh boot vcpu:

  - NX bug on intel: It was recenlty discovered that NX is not being
    honored in PVH on intel since EFER.NX is not being set. The pte.NX
    bits are ignored if EFER.NX is not set on intel.

  - PVH boot hang on newer xen:  Following c/s on xen

        c/s 7645640:  x86/PVH: don't set EFER_SCE for pvh guest

    removes setting of EFER.SCE for PVH guests. As such, existing intel pvh
    guest will no longer boot on xen after that c/s.

  - Both above changes will be applicable to AMD also when xen support of
    AMD pvh is added.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 arch/x86/xen/enlighten.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index c0cb11f..4af512d 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1499,6 +1499,17 @@ void __ref xen_pvh_secondary_vcpu_init(int cpu)
 	xen_pvh_set_cr_flags(cpu);
 }
 
+/* This is done in secondary_startup_64 for hvm guests. */
+static void __init xen_configure_efer(void)
+{
+	u64 efer;
+
+	rdmsrl(MSR_EFER, efer);
+	efer |= EFER_SCE;
+	efer |= (cpuid_edx(0x80000001) & (1 << 20)) ? EFER_NX : 0;
+	wrmsrl(MSR_EFER, efer);
+}
+
 static void __init xen_pvh_early_guest_init(void)
 {
 	if (!xen_feature(XENFEAT_auto_translated_physmap))
@@ -1508,6 +1519,7 @@ static void __init xen_pvh_early_guest_init(void)
 		return;
 
 	xen_have_vector_callback = 1;
+	xen_configure_efer();
 	xen_pvh_set_cr_flags(0);
 
 #ifdef CONFIG_X86_32
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [V1 PATCH 2/2] PVH: set EFER.NX and EFER.SCE for secondary vcpus
  2014-08-27 22:33 [V1 PATCH 0/2] Linux PVH: set EFER bits Mukesh Rathor
  2014-08-27 22:33 ` [V1 PATCH 1/2] PVH: set EFER.NX and EFER.SCE for boot vcpu Mukesh Rathor
@ 2014-08-27 22:33 ` Mukesh Rathor
  2014-08-28 14:24   ` [Xen-devel] " David Vrabel
  2014-08-28 14:08 ` [Xen-devel] [V1 PATCH 0/2] Linux PVH: set EFER bits David Vrabel
  2 siblings, 1 reply; 8+ messages in thread
From: Mukesh Rathor @ 2014-08-27 22:33 UTC (permalink / raw)
  To: boris.ostrovsky, david.vrabel; +Cc: xen-devel, linux-kernel

This patch addresses three things for a pvh secondary vcpu:

  - NX bug on intel: It was recenlty discovered that NX is not being
    honored in PVH on intel since EFER.NX is not being set. The pte.NX
    bits are ignored if EFER.NX is not set on intel.

  - PVH boot hang on newer xen:  Following c/s on xen

            c/s 7645640:  x86/PVH: don't set EFER_SCE for pvh guest

    removes setting of EFER.SCE for PVH guests. As such, existing intel pvh
    guest will no longer boot on xen after that c/s.

  - Both above changes will be applicable to AMD also when xen support of
    AMD pvh is added.

Please note: We create a new glue assembly entry point because the
secondary vcpus come up on kernel page tables that have pte.NX
bits set. While on Intel these are ignored if EFER.NX is not set, on
AMD a RSVD bit fault is generated.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 arch/x86/xen/smp.c      | 28 ++++++++++++++++++++--------
 arch/x86/xen/smp.h      |  1 +
 arch/x86/xen/xen-head.S | 21 +++++++++++++++++++++
 3 files changed, 42 insertions(+), 8 deletions(-)

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 7005974..66058b9 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -37,6 +37,7 @@
 #include <xen/hvc-console.h>
 #include "xen-ops.h"
 #include "mmu.h"
+#include "smp.h"
 
 cpumask_var_t xen_cpu_initialized_map;
 
@@ -99,8 +100,12 @@ static void cpu_bringup(void)
 	wmb();			/* make sure everything is out */
 }
 
-/* Note: cpu parameter is only relevant for PVH */
-static void cpu_bringup_and_idle(int cpu)
+/*
+ * Note: cpu parameter is only relevant for PVH. The reason for passing it
+ * is we can't do smp_processor_id until the percpu segments are loaded, for
+ * which we need the cpu number! So we pass it in rdi as first parameter.
+ */
+asmlinkage __visible void cpu_bringup_and_idle(int cpu)
 {
 #ifdef CONFIG_X86_64
 	if (xen_feature(XENFEAT_auto_translated_physmap) &&
@@ -374,11 +379,10 @@ cpu_initialize_context(unsigned int cpu, struct task_struct *idle)
 	ctxt->user_regs.fs = __KERNEL_PERCPU;
 	ctxt->user_regs.gs = __KERNEL_STACK_CANARY;
 #endif
-	ctxt->user_regs.eip = (unsigned long)cpu_bringup_and_idle;
-
 	memset(&ctxt->fpu_ctxt, 0, sizeof(ctxt->fpu_ctxt));
 
 	if (!xen_feature(XENFEAT_auto_translated_physmap)) {
+		ctxt->user_regs.eip = (unsigned long)cpu_bringup_and_idle;
 		ctxt->flags = VGCF_IN_KERNEL;
 		ctxt->user_regs.eflags = 0x1000; /* IOPL_RING1 */
 		ctxt->user_regs.ds = __USER_DS;
@@ -416,12 +420,20 @@ cpu_initialize_context(unsigned int cpu, struct task_struct *idle)
 #ifdef CONFIG_X86_32
 	}
 #else
-	} else
-		/* N.B. The user_regs.eip (cpu_bringup_and_idle) is called with
-		 * %rdi having the cpu number - which means are passing in
-		 * as the first parameter the cpu. Subtle!
+	} else {
+		/*
+		 * The vcpu comes on kernel page tables which have the NX pte
+		 * bit set on AMD. This means before DS/SS is touched, NX in
+		 * EFER must be set. Hence the following assembly glue code.
+		 */
+		ctxt->user_regs.eip = (unsigned long)pvh_cpu_bringup;
+
+		/* N.B. The bringup function cpu_bringup_and_idle is called with
+		 * %rdi having the cpu number - which means we are passing it in
+		 * as the first parameter. Subtle!
 		 */
 		ctxt->user_regs.rdi = cpu;
+	}
 #endif
 	ctxt->user_regs.esp = idle->thread.sp0 - sizeof(struct pt_regs);
 	ctxt->ctrlreg[3] = xen_pfn_to_cr3(virt_to_mfn(swapper_pg_dir));
diff --git a/arch/x86/xen/smp.h b/arch/x86/xen/smp.h
index c7c2d89..b20ba68 100644
--- a/arch/x86/xen/smp.h
+++ b/arch/x86/xen/smp.h
@@ -7,5 +7,6 @@ extern void xen_send_IPI_mask_allbutself(const struct cpumask *mask,
 extern void xen_send_IPI_allbutself(int vector);
 extern void xen_send_IPI_all(int vector);
 extern void xen_send_IPI_self(int vector);
+extern void pvh_cpu_bringup(int cpu);
 
 #endif
diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index 485b695..db8dca5 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -47,6 +47,27 @@ ENTRY(startup_xen)
 
 	__FINIT
 
+#ifdef CONFIG_XEN_PVH
+#ifdef CONFIG_X86_64
+/* Note that rdi contains the cpu number and must be preserved */
+ENTRY(pvh_cpu_bringup)
+	/* Gather features to see if NX implemented. (no EFER.NX on intel) */
+	movl    $0x80000001, %eax
+	cpuid
+	movl    %edx,%esi
+
+	movl    $MSR_EFER, %ecx
+	rdmsr
+	btsl    $_EFER_SCE, %eax
+
+	btl     $20,%esi
+	jnc     1f	/* No NX, skip it */
+	btsl    $_EFER_NX, %eax
+1:	wrmsr
+	jmp cpu_bringup_and_idle
+#endif /* CONFIG_X86_64 */
+#endif /* CONFIG_XEN_PVH */
+
 .pushsection .text
 	.balign PAGE_SIZE
 ENTRY(hypercall_page)
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [Xen-devel] [V1 PATCH 0/2] Linux PVH: set EFER bits..
  2014-08-27 22:33 [V1 PATCH 0/2] Linux PVH: set EFER bits Mukesh Rathor
  2014-08-27 22:33 ` [V1 PATCH 1/2] PVH: set EFER.NX and EFER.SCE for boot vcpu Mukesh Rathor
  2014-08-27 22:33 ` [V1 PATCH 2/2] PVH: set EFER.NX and EFER.SCE for secondary vcpus Mukesh Rathor
@ 2014-08-28 14:08 ` David Vrabel
  2014-08-28 14:15   ` Jan Beulich
  2 siblings, 1 reply; 8+ messages in thread
From: David Vrabel @ 2014-08-28 14:08 UTC (permalink / raw)
  To: Mukesh Rathor, boris.ostrovsky, david.vrabel; +Cc: xen-devel, linux-kernel

On 27/08/14 23:33, Mukesh Rathor wrote:
> Resending with comments fixed up. Please note, these are no longer
> AMD only, but address existing broken boot and broken NX on intel.

It's no secret that I'm fed up with the cavalier attitude to the PVH ABI.

The Xen change that broke Linux PVH guests must be reverted.  This
change may also have broken other guests (e.g., Mirage or one of the
other unikernel OSes).

The documented posted by Roger is a good start, but I'm still not going
to apply any PVH patches to Linux until the ABI is agreed and stable and
the document is committed to the Xen repo.

David

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xen-devel] [V1 PATCH 0/2] Linux PVH: set EFER bits..
  2014-08-28 14:08 ` [Xen-devel] [V1 PATCH 0/2] Linux PVH: set EFER bits David Vrabel
@ 2014-08-28 14:15   ` Jan Beulich
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Beulich @ 2014-08-28 14:15 UTC (permalink / raw)
  To: david.vrabel; +Cc: xen-devel, boris.ostrovsky, Mukesh Rathor, linux-kernel

>>> On 28.08.14 at 16:08, <david.vrabel@citrix.com> wrote:
> The Xen change that broke Linux PVH guests must be reverted.  This
> change may also have broken other guests (e.g., Mirage or one of the
> other unikernel OSes).

If PVH wasn't tech preview, I'd agree. But since it is, I don't see a
reason to revert that change.

Jan


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xen-devel] [V1 PATCH 1/2] PVH: set EFER.NX and EFER.SCE for boot vcpu
  2014-08-27 22:33 ` [V1 PATCH 1/2] PVH: set EFER.NX and EFER.SCE for boot vcpu Mukesh Rathor
@ 2014-08-28 14:18   ` David Vrabel
  2014-08-28 23:59     ` Mukesh Rathor
  0 siblings, 1 reply; 8+ messages in thread
From: David Vrabel @ 2014-08-28 14:18 UTC (permalink / raw)
  To: Mukesh Rathor, boris.ostrovsky, david.vrabel; +Cc: xen-devel, linux-kernel

On 27/08/14 23:33, Mukesh Rathor wrote:
> This patch addresses three things for a pvh boot vcpu:
> 
>   - NX bug on intel: It was recenlty discovered that NX is not being
>     honored in PVH on intel since EFER.NX is not being set. The pte.NX
>     bits are ignored if EFER.NX is not set on intel.

I am unconvinced by this explanation.  The Intel SDM clearly states that
the XD bit in the page table entries is reserved if EFER.NXE is clear,
and thus using a entry with XD set and EFER.NXE clear should generate a
page fault (same as AMD).

You either need to find out why Intel really worked (perhaps Xen is
setting EFER.NXE on Intel?) or you need to included an errata (or
similar) reference.

>   - PVH boot hang on newer xen:  Following c/s on xen
> 
>         c/s 7645640:  x86/PVH: don't set EFER_SCE for pvh guest

As I already explained in another email, the Xen change caused a
regression and must be reverted.

David

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xen-devel] [V1 PATCH 2/2] PVH: set EFER.NX and EFER.SCE for secondary vcpus
  2014-08-27 22:33 ` [V1 PATCH 2/2] PVH: set EFER.NX and EFER.SCE for secondary vcpus Mukesh Rathor
@ 2014-08-28 14:24   ` David Vrabel
  0 siblings, 0 replies; 8+ messages in thread
From: David Vrabel @ 2014-08-28 14:24 UTC (permalink / raw)
  To: Mukesh Rathor, boris.ostrovsky, david.vrabel; +Cc: xen-devel, linux-kernel

On 27/08/14 23:33, Mukesh Rathor wrote:
> This patch addresses three things for a pvh secondary vcpu:

I don't understand why you have separated this into two patches.  Please
fold into one.

> Please note: We create a new glue assembly entry point because the
> secondary vcpus come up on kernel page tables that have pte.NX
> bits set. While on Intel these are ignored if EFER.NX is not set, on
> AMD a RSVD bit fault is generated.

Please try and unify the early CPU init code for boot and secondary CPUs.

Native manages to do this (secondary_startup_64 is called for boot and
secondary CPUs).

> +	/* Gather features to see if NX implemented. (no EFER.NX on intel) */

EFER.NXE does exist on Intel.

David

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xen-devel] [V1 PATCH 1/2] PVH: set EFER.NX and EFER.SCE for boot vcpu
  2014-08-28 14:18   ` [Xen-devel] " David Vrabel
@ 2014-08-28 23:59     ` Mukesh Rathor
  0 siblings, 0 replies; 8+ messages in thread
From: Mukesh Rathor @ 2014-08-28 23:59 UTC (permalink / raw)
  To: David Vrabel; +Cc: boris.ostrovsky, xen-devel, linux-kernel

On Thu, 28 Aug 2014 15:18:26 +0100
David Vrabel <david.vrabel@citrix.com> wrote:

> On 27/08/14 23:33, Mukesh Rathor wrote:
> > This patch addresses three things for a pvh boot vcpu:
> > 
> >   - NX bug on intel: It was recenlty discovered that NX is not being
> >     honored in PVH on intel since EFER.NX is not being set. The
> > pte.NX bits are ignored if EFER.NX is not set on intel.
> 
> I am unconvinced by this explanation.  The Intel SDM clearly states
> that the XD bit in the page table entries is reserved if EFER.NXE is
> clear, and thus using a entry with XD set and EFER.NXE clear should
> generate a page fault (same as AMD).
> 
> You either need to find out why Intel really worked (perhaps Xen is
> setting EFER.NXE on Intel?) or you need to included an errata (or
> similar) reference.

Nop, verified that again. The vcpu is coming up on efer 0x501,  ie,
LME/LMA/SCE (older xen prior to SCE removal change). The pte entry for
rsp is: 800000003e32b063 that has NX set. No exception is generated upon
push rbp instruction (like on amd).

Could be that Intel docs are incomplete on vmx, I didn't hear back from them 
on the last one I had found. Anyways, we are not addressing an intel errata
here, but fixing our issue of setting the EFER.NX bit.

-Mukesh

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-08-28 23:59 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-27 22:33 [V1 PATCH 0/2] Linux PVH: set EFER bits Mukesh Rathor
2014-08-27 22:33 ` [V1 PATCH 1/2] PVH: set EFER.NX and EFER.SCE for boot vcpu Mukesh Rathor
2014-08-28 14:18   ` [Xen-devel] " David Vrabel
2014-08-28 23:59     ` Mukesh Rathor
2014-08-27 22:33 ` [V1 PATCH 2/2] PVH: set EFER.NX and EFER.SCE for secondary vcpus Mukesh Rathor
2014-08-28 14:24   ` [Xen-devel] " David Vrabel
2014-08-28 14:08 ` [Xen-devel] [V1 PATCH 0/2] Linux PVH: set EFER bits David Vrabel
2014-08-28 14:15   ` Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).