All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
@ 2015-11-13 23:18 Boris Ostrovsky
  2015-11-13 23:26 ` Andy Lutomirski
  2015-11-13 23:26 ` Andy Lutomirski
  0 siblings, 2 replies; 51+ messages in thread
From: Boris Ostrovsky @ 2015-11-13 23:18 UTC (permalink / raw)
  To: konrad.wilk, david.vrabel; +Cc: xen-devel, linux-kernel, luto, boris.ostrovsky

After 32-bit syscall rewrite, and specifically after commit 5f310f739b4c
("x86/entry/32: Re-implement SYSENTER using the new C path"), the stack
frame that is passed to xen_sysexit is no longer a "standard" one (i.e.
it's not pt_regs).

We need to adjust it so that subsequent xen_iret can use it.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---

Alternatively, we could return 0 from do_fast_syscall_32() if paravirt_enabled()
is true since Xen PV guests will end up using xen_iret one way or the other. And
then we won't need xen_sysexit at all.

 arch/x86/xen/xen-asm_32.S |   23 ++++++++++++++++-------
 1 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/arch/x86/xen/xen-asm_32.S b/arch/x86/xen/xen-asm_32.S
index fd92a64..c70ec37 100644
--- a/arch/x86/xen/xen-asm_32.S
+++ b/arch/x86/xen/xen-asm_32.S
@@ -36,15 +36,24 @@ check_events:
 
 /*
  * We can't use sysexit directly, because we're not running in ring0.
- * But we can easily fake it up using iret.  Assuming xen_sysexit is
- * jumped to with a standard stack frame, we can just strip it back to
- * a standard iret frame and use iret.
+ * But we can easily fake it up using iret.
+ * We came here from the opportunistic SYSEXIT path in entry_SYSENTER_32
+ * which left the stack looking like this:
+ * 	$__USER_DS
+ * 	%ecx
+ * 	eflags
+ *	$__USER_CS
+ *	%eip
+ *	%eax
+ *	%gs
+ *	%fs
+ *	%es
+ *	%ds	<-- %esp
+ *
+ * so we need to adjust it to look like a standard iret frame
  */
 ENTRY(xen_sysexit)
-	movl PT_EAX(%esp), %eax			/* Shouldn't be necessary? */
-	orl $X86_EFLAGS_IF, PT_EFLAGS(%esp)
-	lea PT_EIP(%esp), %esp
-
+	add $5*4, %esp
 	jmp xen_iret
 ENDPROC(xen_sysexit)
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-13 23:18 [PATCH] xen/x86: Adjust stack pointer in xen_sysexit Boris Ostrovsky
@ 2015-11-13 23:26 ` Andy Lutomirski
  2015-11-14  1:23   ` Boris Ostrovsky
  2015-11-14  1:23   ` Boris Ostrovsky
  2015-11-13 23:26 ` Andy Lutomirski
  1 sibling, 2 replies; 51+ messages in thread
From: Andy Lutomirski @ 2015-11-13 23:26 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Konrad Rzeszutek Wilk, David Vrabel, xen-devel, linux-kernel,
	Andrew Lutomirski

On Fri, Nov 13, 2015 at 3:18 PM, Boris Ostrovsky
<boris.ostrovsky@oracle.com> wrote:
> After 32-bit syscall rewrite, and specifically after commit 5f310f739b4c
> ("x86/entry/32: Re-implement SYSENTER using the new C path"), the stack
> frame that is passed to xen_sysexit is no longer a "standard" one (i.e.
> it's not pt_regs).
>
> We need to adjust it so that subsequent xen_iret can use it.

I'm wondering if this should be more straightforward:

        movq    %rsp, %rdi
        call    do_fast_syscall_32
        testl   %eax, %eax
        jz      .Lsyscall_32_done

        /* Opportunistic SYSRET */
sysret32_from_system_call:
        XEN_DO_SYSRET32

where XEN_DO_SYSRET32 is a simple pv op that, on Xen, jumps to a
variant of Xen's iret path that knows that the fast path is okay.

--Andy

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-13 23:18 [PATCH] xen/x86: Adjust stack pointer in xen_sysexit Boris Ostrovsky
  2015-11-13 23:26 ` Andy Lutomirski
@ 2015-11-13 23:26 ` Andy Lutomirski
  1 sibling, 0 replies; 51+ messages in thread
From: Andy Lutomirski @ 2015-11-13 23:26 UTC (permalink / raw)
  To: Boris Ostrovsky; +Cc: linux-kernel, xen-devel, David Vrabel, Andrew Lutomirski

On Fri, Nov 13, 2015 at 3:18 PM, Boris Ostrovsky
<boris.ostrovsky@oracle.com> wrote:
> After 32-bit syscall rewrite, and specifically after commit 5f310f739b4c
> ("x86/entry/32: Re-implement SYSENTER using the new C path"), the stack
> frame that is passed to xen_sysexit is no longer a "standard" one (i.e.
> it's not pt_regs).
>
> We need to adjust it so that subsequent xen_iret can use it.

I'm wondering if this should be more straightforward:

        movq    %rsp, %rdi
        call    do_fast_syscall_32
        testl   %eax, %eax
        jz      .Lsyscall_32_done

        /* Opportunistic SYSRET */
sysret32_from_system_call:
        XEN_DO_SYSRET32

where XEN_DO_SYSRET32 is a simple pv op that, on Xen, jumps to a
variant of Xen's iret path that knows that the fast path is okay.

--Andy

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-13 23:26 ` Andy Lutomirski
  2015-11-14  1:23   ` Boris Ostrovsky
@ 2015-11-14  1:23   ` Boris Ostrovsky
  2015-11-15 18:02     ` Andy Lutomirski
  2015-11-15 18:02     ` Andy Lutomirski
  1 sibling, 2 replies; 51+ messages in thread
From: Boris Ostrovsky @ 2015-11-14  1:23 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Konrad Rzeszutek Wilk, David Vrabel, xen-devel, linux-kernel,
	Andrew Lutomirski



On 11/13/2015 06:26 PM, Andy Lutomirski wrote:
> On Fri, Nov 13, 2015 at 3:18 PM, Boris Ostrovsky
> <boris.ostrovsky@oracle.com> wrote:
>> After 32-bit syscall rewrite, and specifically after commit 5f310f739b4c
>> ("x86/entry/32: Re-implement SYSENTER using the new C path"), the stack
>> frame that is passed to xen_sysexit is no longer a "standard" one (i.e.
>> it's not pt_regs).
>>
>> We need to adjust it so that subsequent xen_iret can use it.
> I'm wondering if this should be more straightforward:
>
>          movq    %rsp, %rdi
>          call    do_fast_syscall_32
>          testl   %eax, %eax
>          jz      .Lsyscall_32_done
>
>          /* Opportunistic SYSRET */
> sysret32_from_system_call:
>          XEN_DO_SYSRET32
>
> where XEN_DO_SYSRET32 is a simple pv op that, on Xen, jumps to a
> variant of Xen's iret path that knows that the fast path is okay.


This patch is for 32-bit kernel. I actually haven't looked at compat 
code (probably because our tests don't try that), I need to do that too.

As for XEN_DO_SYSRET32 --- we'd presumably need to have a nop for 
baremetal otherwise current paravirt op will use native_usergs_sysret32 
(for compat code). Which means a new pv_op, I think.

-boris

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-13 23:26 ` Andy Lutomirski
@ 2015-11-14  1:23   ` Boris Ostrovsky
  2015-11-14  1:23   ` Boris Ostrovsky
  1 sibling, 0 replies; 51+ messages in thread
From: Boris Ostrovsky @ 2015-11-14  1:23 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: linux-kernel, xen-devel, David Vrabel, Andrew Lutomirski



On 11/13/2015 06:26 PM, Andy Lutomirski wrote:
> On Fri, Nov 13, 2015 at 3:18 PM, Boris Ostrovsky
> <boris.ostrovsky@oracle.com> wrote:
>> After 32-bit syscall rewrite, and specifically after commit 5f310f739b4c
>> ("x86/entry/32: Re-implement SYSENTER using the new C path"), the stack
>> frame that is passed to xen_sysexit is no longer a "standard" one (i.e.
>> it's not pt_regs).
>>
>> We need to adjust it so that subsequent xen_iret can use it.
> I'm wondering if this should be more straightforward:
>
>          movq    %rsp, %rdi
>          call    do_fast_syscall_32
>          testl   %eax, %eax
>          jz      .Lsyscall_32_done
>
>          /* Opportunistic SYSRET */
> sysret32_from_system_call:
>          XEN_DO_SYSRET32
>
> where XEN_DO_SYSRET32 is a simple pv op that, on Xen, jumps to a
> variant of Xen's iret path that knows that the fast path is okay.


This patch is for 32-bit kernel. I actually haven't looked at compat 
code (probably because our tests don't try that), I need to do that too.

As for XEN_DO_SYSRET32 --- we'd presumably need to have a nop for 
baremetal otherwise current paravirt op will use native_usergs_sysret32 
(for compat code). Which means a new pv_op, I think.

-boris

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-14  1:23   ` Boris Ostrovsky
  2015-11-15 18:02     ` Andy Lutomirski
@ 2015-11-15 18:02     ` Andy Lutomirski
  2015-11-16 16:25       ` Boris Ostrovsky
  2015-11-16 16:25       ` Boris Ostrovsky
  1 sibling, 2 replies; 51+ messages in thread
From: Andy Lutomirski @ 2015-11-15 18:02 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: linux-kernel, xen-devel, David Vrabel, Konrad Rzeszutek Wilk

On Nov 13, 2015 5:23 PM, "Boris Ostrovsky" <boris.ostrovsky@oracle.com> wrote:
>
>
>
> On 11/13/2015 06:26 PM, Andy Lutomirski wrote:
>>
>> On Fri, Nov 13, 2015 at 3:18 PM, Boris Ostrovsky
>> <boris.ostrovsky@oracle.com> wrote:
>>>
>>> After 32-bit syscall rewrite, and specifically after commit 5f310f739b4c
>>> ("x86/entry/32: Re-implement SYSENTER using the new C path"), the stack
>>> frame that is passed to xen_sysexit is no longer a "standard" one (i.e.
>>> it's not pt_regs).
>>>
>>> We need to adjust it so that subsequent xen_iret can use it.
>>
>> I'm wondering if this should be more straightforward:
>>
>>          movq    %rsp, %rdi
>>          call    do_fast_syscall_32
>>          testl   %eax, %eax
>>          jz      .Lsyscall_32_done
>>
>>          /* Opportunistic SYSRET */
>> sysret32_from_system_call:
>>          XEN_DO_SYSRET32
>>
>> where XEN_DO_SYSRET32 is a simple pv op that, on Xen, jumps to a
>> variant of Xen's iret path that knows that the fast path is okay.
>
>
>
> This patch is for 32-bit kernel. I actually haven't looked at compat code (probably because our tests don't try that), I need to do that too.

In 4.4, it's almost identical (which was part of the point of this
whole series).  We use sysret32 instead of sysexit, but the underlying
structure is the same: munge the stack frame and register state
appropriately to use the fast return instruction in question and then
execute it.  In both cases, the only real difference from the IRET
path is that we're willing to lose the values of some subset of cx,
dx, and (on 64-bit kernels) r11.

>
> As for XEN_DO_SYSRET32 --- we'd presumably need to have a nop for baremetal otherwise current paravirt op will use native_usergs_sysret32 (for compat code). Which means a new pv_op, I think.

Agreed, unless...

Does Xen have a cpufeature?  Using ALTERNATIVE instead of a pvop could
be easier to follow and be less code at the same time.  Frankly,
following the control flow from asm through the pre-paravirt-patching
and post-paravirt-patching variants and into the final targets is
getting a little bit old, and ALTERNATIVE is crystal clear in
comparison (and has all the interesting info inline with the rest of
the asm).  Of course, it doesn't work early in boot, but that's fine
for anything involving user/kernel switches.


--Andy

>
> -boris

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-14  1:23   ` Boris Ostrovsky
@ 2015-11-15 18:02     ` Andy Lutomirski
  2015-11-15 18:02     ` Andy Lutomirski
  1 sibling, 0 replies; 51+ messages in thread
From: Andy Lutomirski @ 2015-11-15 18:02 UTC (permalink / raw)
  To: Boris Ostrovsky; +Cc: David Vrabel, linux-kernel, xen-devel

On Nov 13, 2015 5:23 PM, "Boris Ostrovsky" <boris.ostrovsky@oracle.com> wrote:
>
>
>
> On 11/13/2015 06:26 PM, Andy Lutomirski wrote:
>>
>> On Fri, Nov 13, 2015 at 3:18 PM, Boris Ostrovsky
>> <boris.ostrovsky@oracle.com> wrote:
>>>
>>> After 32-bit syscall rewrite, and specifically after commit 5f310f739b4c
>>> ("x86/entry/32: Re-implement SYSENTER using the new C path"), the stack
>>> frame that is passed to xen_sysexit is no longer a "standard" one (i.e.
>>> it's not pt_regs).
>>>
>>> We need to adjust it so that subsequent xen_iret can use it.
>>
>> I'm wondering if this should be more straightforward:
>>
>>          movq    %rsp, %rdi
>>          call    do_fast_syscall_32
>>          testl   %eax, %eax
>>          jz      .Lsyscall_32_done
>>
>>          /* Opportunistic SYSRET */
>> sysret32_from_system_call:
>>          XEN_DO_SYSRET32
>>
>> where XEN_DO_SYSRET32 is a simple pv op that, on Xen, jumps to a
>> variant of Xen's iret path that knows that the fast path is okay.
>
>
>
> This patch is for 32-bit kernel. I actually haven't looked at compat code (probably because our tests don't try that), I need to do that too.

In 4.4, it's almost identical (which was part of the point of this
whole series).  We use sysret32 instead of sysexit, but the underlying
structure is the same: munge the stack frame and register state
appropriately to use the fast return instruction in question and then
execute it.  In both cases, the only real difference from the IRET
path is that we're willing to lose the values of some subset of cx,
dx, and (on 64-bit kernels) r11.

>
> As for XEN_DO_SYSRET32 --- we'd presumably need to have a nop for baremetal otherwise current paravirt op will use native_usergs_sysret32 (for compat code). Which means a new pv_op, I think.

Agreed, unless...

Does Xen have a cpufeature?  Using ALTERNATIVE instead of a pvop could
be easier to follow and be less code at the same time.  Frankly,
following the control flow from asm through the pre-paravirt-patching
and post-paravirt-patching variants and into the final targets is
getting a little bit old, and ALTERNATIVE is crystal clear in
comparison (and has all the interesting info inline with the rest of
the asm).  Of course, it doesn't work early in boot, but that's fine
for anything involving user/kernel switches.


--Andy

>
> -boris

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-15 18:02     ` Andy Lutomirski
  2015-11-16 16:25       ` Boris Ostrovsky
@ 2015-11-16 16:25       ` Boris Ostrovsky
  2015-11-16 19:03         ` Andy Lutomirski
  2015-11-16 19:03         ` Andy Lutomirski
  1 sibling, 2 replies; 51+ messages in thread
From: Boris Ostrovsky @ 2015-11-16 16:25 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-kernel, xen-devel, David Vrabel, Konrad Rzeszutek Wilk

On 11/15/2015 01:02 PM, Andy Lutomirski wrote:
> On Nov 13, 2015 5:23 PM, "Boris Ostrovsky" <boris.ostrovsky@oracle.com> wrote:
>>
>>
>> On 11/13/2015 06:26 PM, Andy Lutomirski wrote:
>>> On Fri, Nov 13, 2015 at 3:18 PM, Boris Ostrovsky
>>> <boris.ostrovsky@oracle.com> wrote:
>>>> After 32-bit syscall rewrite, and specifically after commit 5f310f739b4c
>>>> ("x86/entry/32: Re-implement SYSENTER using the new C path"), the stack
>>>> frame that is passed to xen_sysexit is no longer a "standard" one (i.e.
>>>> it's not pt_regs).
>>>>
>>>> We need to adjust it so that subsequent xen_iret can use it.
>>> I'm wondering if this should be more straightforward:
>>>
>>>           movq    %rsp, %rdi
>>>           call    do_fast_syscall_32
>>>           testl   %eax, %eax
>>>           jz      .Lsyscall_32_done
>>>
>>>           /* Opportunistic SYSRET */
>>> sysret32_from_system_call:
>>>           XEN_DO_SYSRET32
>>>
>>> where XEN_DO_SYSRET32 is a simple pv op that, on Xen, jumps to a
>>> variant of Xen's iret path that knows that the fast path is okay.
>>
>>
>> This patch is for 32-bit kernel. I actually haven't looked at compat code (probably because our tests don't try that), I need to do that too.
> In 4.4, it's almost identical (which was part of the point of this
> whole series).  We use sysret32 instead of sysexit, but the underlying
> structure is the same: munge the stack frame and register state
> appropriately to use the fast return instruction in question and then
> execute it.  In both cases, the only real difference from the IRET
> path is that we're willing to lose the values of some subset of cx,
> dx, and (on 64-bit kernels) r11.


So it turned out that for compat mode we don't need to do anything since 
xen_sysret32 doesn't assume any stack format (or, rather, it assumes 
that it can't be used) and builds the IRET frame itself.


>
>> As for XEN_DO_SYSRET32 --- we'd presumably need to have a nop for baremetal otherwise current paravirt op will use native_usergs_sysret32 (for compat code). Which means a new pv_op, I think.
> Agreed, unless...
>
> Does Xen have a cpufeature?  Using ALTERNATIVE instead of a pvop could
> be easier to follow and be less code at the same time.  Frankly,
> following the control flow from asm through the pre-paravirt-patching
> and post-paravirt-patching variants and into the final targets is
> getting a little bit old, and ALTERNATIVE is crystal clear in
> comparison (and has all the interesting info inline with the rest of
> the asm).  Of course, it doesn't work early in boot, but that's fine
> for anything involving user/kernel switches.


We don't currently have a Xen-specific CPU feature. We could, in 
principle, add it but we can't replace all of current paravirt patching 
with a single feature since PVH guests use a subset of existing pv ops 
(and in the future it may become even more fine-grained).

And I don't think we should go ALTERNATIVE route for one set of features 
and keep pv ops for the rest --- it should be either one or the other.


-boris


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-15 18:02     ` Andy Lutomirski
@ 2015-11-16 16:25       ` Boris Ostrovsky
  2015-11-16 16:25       ` Boris Ostrovsky
  1 sibling, 0 replies; 51+ messages in thread
From: Boris Ostrovsky @ 2015-11-16 16:25 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: David Vrabel, linux-kernel, xen-devel

On 11/15/2015 01:02 PM, Andy Lutomirski wrote:
> On Nov 13, 2015 5:23 PM, "Boris Ostrovsky" <boris.ostrovsky@oracle.com> wrote:
>>
>>
>> On 11/13/2015 06:26 PM, Andy Lutomirski wrote:
>>> On Fri, Nov 13, 2015 at 3:18 PM, Boris Ostrovsky
>>> <boris.ostrovsky@oracle.com> wrote:
>>>> After 32-bit syscall rewrite, and specifically after commit 5f310f739b4c
>>>> ("x86/entry/32: Re-implement SYSENTER using the new C path"), the stack
>>>> frame that is passed to xen_sysexit is no longer a "standard" one (i.e.
>>>> it's not pt_regs).
>>>>
>>>> We need to adjust it so that subsequent xen_iret can use it.
>>> I'm wondering if this should be more straightforward:
>>>
>>>           movq    %rsp, %rdi
>>>           call    do_fast_syscall_32
>>>           testl   %eax, %eax
>>>           jz      .Lsyscall_32_done
>>>
>>>           /* Opportunistic SYSRET */
>>> sysret32_from_system_call:
>>>           XEN_DO_SYSRET32
>>>
>>> where XEN_DO_SYSRET32 is a simple pv op that, on Xen, jumps to a
>>> variant of Xen's iret path that knows that the fast path is okay.
>>
>>
>> This patch is for 32-bit kernel. I actually haven't looked at compat code (probably because our tests don't try that), I need to do that too.
> In 4.4, it's almost identical (which was part of the point of this
> whole series).  We use sysret32 instead of sysexit, but the underlying
> structure is the same: munge the stack frame and register state
> appropriately to use the fast return instruction in question and then
> execute it.  In both cases, the only real difference from the IRET
> path is that we're willing to lose the values of some subset of cx,
> dx, and (on 64-bit kernels) r11.


So it turned out that for compat mode we don't need to do anything since 
xen_sysret32 doesn't assume any stack format (or, rather, it assumes 
that it can't be used) and builds the IRET frame itself.


>
>> As for XEN_DO_SYSRET32 --- we'd presumably need to have a nop for baremetal otherwise current paravirt op will use native_usergs_sysret32 (for compat code). Which means a new pv_op, I think.
> Agreed, unless...
>
> Does Xen have a cpufeature?  Using ALTERNATIVE instead of a pvop could
> be easier to follow and be less code at the same time.  Frankly,
> following the control flow from asm through the pre-paravirt-patching
> and post-paravirt-patching variants and into the final targets is
> getting a little bit old, and ALTERNATIVE is crystal clear in
> comparison (and has all the interesting info inline with the rest of
> the asm).  Of course, it doesn't work early in boot, but that's fine
> for anything involving user/kernel switches.


We don't currently have a Xen-specific CPU feature. We could, in 
principle, add it but we can't replace all of current paravirt patching 
with a single feature since PVH guests use a subset of existing pv ops 
(and in the future it may become even more fine-grained).

And I don't think we should go ALTERNATIVE route for one set of features 
and keep pv ops for the rest --- it should be either one or the other.


-boris

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 16:25       ` Boris Ostrovsky
@ 2015-11-16 19:03         ` Andy Lutomirski
  2015-11-16 19:59           ` Borislav Petkov
                             ` (3 more replies)
  2015-11-16 19:03         ` Andy Lutomirski
  1 sibling, 4 replies; 51+ messages in thread
From: Andy Lutomirski @ 2015-11-16 19:03 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: linux-kernel, xen-devel, David Vrabel, Konrad Rzeszutek Wilk,
	Borislav Petkov

On Mon, Nov 16, 2015 at 8:25 AM, Boris Ostrovsky
<boris.ostrovsky@oracle.com> wrote:
> On 11/15/2015 01:02 PM, Andy Lutomirski wrote:
>>
>> On Nov 13, 2015 5:23 PM, "Boris Ostrovsky" <boris.ostrovsky@oracle.com>
>> wrote:
>>>
>>>
>>>
>>> On 11/13/2015 06:26 PM, Andy Lutomirski wrote:
>>>>
>>>> On Fri, Nov 13, 2015 at 3:18 PM, Boris Ostrovsky
>>>> <boris.ostrovsky@oracle.com> wrote:
>>>>>
>>>>> After 32-bit syscall rewrite, and specifically after commit
>>>>> 5f310f739b4c
>>>>> ("x86/entry/32: Re-implement SYSENTER using the new C path"), the stack
>>>>> frame that is passed to xen_sysexit is no longer a "standard" one (i.e.
>>>>> it's not pt_regs).
>>>>>
>>>>> We need to adjust it so that subsequent xen_iret can use it.
>>>>
>>>> I'm wondering if this should be more straightforward:
>>>>
>>>>           movq    %rsp, %rdi
>>>>           call    do_fast_syscall_32
>>>>           testl   %eax, %eax
>>>>           jz      .Lsyscall_32_done
>>>>
>>>>           /* Opportunistic SYSRET */
>>>> sysret32_from_system_call:
>>>>           XEN_DO_SYSRET32
>>>>
>>>> where XEN_DO_SYSRET32 is a simple pv op that, on Xen, jumps to a
>>>> variant of Xen's iret path that knows that the fast path is okay.
>>>
>>>
>>>
>>> This patch is for 32-bit kernel. I actually haven't looked at compat code
>>> (probably because our tests don't try that), I need to do that too.
>>
>> In 4.4, it's almost identical (which was part of the point of this
>> whole series).  We use sysret32 instead of sysexit, but the underlying
>> structure is the same: munge the stack frame and register state
>> appropriately to use the fast return instruction in question and then
>> execute it.  In both cases, the only real difference from the IRET
>> path is that we're willing to lose the values of some subset of cx,
>> dx, and (on 64-bit kernels) r11.
>
>
>
> So it turned out that for compat mode we don't need to do anything since
> xen_sysret32 doesn't assume any stack format (or, rather, it assumes that it
> can't be used) and builds the IRET frame itself.
>

It's still a waste of effort, though.  Also, I'd eventually like the
number of places in Xen code in which rsp/esp is invalid to be exactly
zero, and this approach makes this harder or even impossible.

>
>>
>>> As for XEN_DO_SYSRET32 --- we'd presumably need to have a nop for
>>> baremetal otherwise current paravirt op will use native_usergs_sysret32 (for
>>> compat code). Which means a new pv_op, I think.
>>
>> Agreed, unless...
>>
>> Does Xen have a cpufeature?  Using ALTERNATIVE instead of a pvop could
>> be easier to follow and be less code at the same time.  Frankly,
>> following the control flow from asm through the pre-paravirt-patching
>> and post-paravirt-patching variants and into the final targets is
>> getting a little bit old, and ALTERNATIVE is crystal clear in
>> comparison (and has all the interesting info inline with the rest of
>> the asm).  Of course, it doesn't work early in boot, but that's fine
>> for anything involving user/kernel switches.
>
>
>
> We don't currently have a Xen-specific CPU feature. We could, in principle,
> add it but we can't replace all of current paravirt patching with a single
> feature since PVH guests use a subset of existing pv ops (and in the future
> it may become even more fine-grained).
>
> And I don't think we should go ALTERNATIVE route for one set of features and
> keep pv ops for the rest --- it should be either one or the other.

Does PVH hook into the entry asm code at all?  I thought it was just
boot code and drivers.

In any case, someone needs to do some serious review and cleanup on
the whole paravirt op mess.  We have a bunch of paravirt ops that
serve little purpose.

The paravirt infrastructure is a bit weird, too: it seems to
effectively have four states for each patch site.  There's:

1. The initial state, which is unoptimized and works on native.
Presumably any of these that happen early also need to work, if
slowly, on Xen.

2. The Xen state without text patching.  I'm not actually sure why
this exists at all.  Are there pvops that need to switch too early for
us to patch the text?

3. The native patched state.  This is supposedly optimal, but it
results in a few more NOPs than are really needed.

4. The Xen patched state.

Alternatives have only two states, and the code is much easier to
understand.  Also, alternatives avoid things like:

...
SWAPGS
...

The reader surely doesn't remember that this isn't guaranteed to be a
swapgs instruction on native.  Using:

ALTERNATIVE "swapgs" "" X86_FEATURE_XENPV

would be safer (it would get rid of the SWAPGS_UNSAFE_STACK mess) and
much clearer.  We could hide *that* behind a macro and no one would be
confused.  (Well, they'd be confused by the fact that Xen PV handles
gsbase very differently from native, but that has nothing to do with
the macro.)

I think we could convert piecemeal, and I wonder if this new patch for
32-bit native on 4.4 (this is needed for 4.4, right?) would be a good
starting point.  Borislav, what do you think?  Would you be okay with
adding a Xen PV pseudofeature?

--Andy

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 16:25       ` Boris Ostrovsky
  2015-11-16 19:03         ` Andy Lutomirski
@ 2015-11-16 19:03         ` Andy Lutomirski
  1 sibling, 0 replies; 51+ messages in thread
From: Andy Lutomirski @ 2015-11-16 19:03 UTC (permalink / raw)
  To: Boris Ostrovsky; +Cc: David Vrabel, Borislav Petkov, linux-kernel, xen-devel

On Mon, Nov 16, 2015 at 8:25 AM, Boris Ostrovsky
<boris.ostrovsky@oracle.com> wrote:
> On 11/15/2015 01:02 PM, Andy Lutomirski wrote:
>>
>> On Nov 13, 2015 5:23 PM, "Boris Ostrovsky" <boris.ostrovsky@oracle.com>
>> wrote:
>>>
>>>
>>>
>>> On 11/13/2015 06:26 PM, Andy Lutomirski wrote:
>>>>
>>>> On Fri, Nov 13, 2015 at 3:18 PM, Boris Ostrovsky
>>>> <boris.ostrovsky@oracle.com> wrote:
>>>>>
>>>>> After 32-bit syscall rewrite, and specifically after commit
>>>>> 5f310f739b4c
>>>>> ("x86/entry/32: Re-implement SYSENTER using the new C path"), the stack
>>>>> frame that is passed to xen_sysexit is no longer a "standard" one (i.e.
>>>>> it's not pt_regs).
>>>>>
>>>>> We need to adjust it so that subsequent xen_iret can use it.
>>>>
>>>> I'm wondering if this should be more straightforward:
>>>>
>>>>           movq    %rsp, %rdi
>>>>           call    do_fast_syscall_32
>>>>           testl   %eax, %eax
>>>>           jz      .Lsyscall_32_done
>>>>
>>>>           /* Opportunistic SYSRET */
>>>> sysret32_from_system_call:
>>>>           XEN_DO_SYSRET32
>>>>
>>>> where XEN_DO_SYSRET32 is a simple pv op that, on Xen, jumps to a
>>>> variant of Xen's iret path that knows that the fast path is okay.
>>>
>>>
>>>
>>> This patch is for 32-bit kernel. I actually haven't looked at compat code
>>> (probably because our tests don't try that), I need to do that too.
>>
>> In 4.4, it's almost identical (which was part of the point of this
>> whole series).  We use sysret32 instead of sysexit, but the underlying
>> structure is the same: munge the stack frame and register state
>> appropriately to use the fast return instruction in question and then
>> execute it.  In both cases, the only real difference from the IRET
>> path is that we're willing to lose the values of some subset of cx,
>> dx, and (on 64-bit kernels) r11.
>
>
>
> So it turned out that for compat mode we don't need to do anything since
> xen_sysret32 doesn't assume any stack format (or, rather, it assumes that it
> can't be used) and builds the IRET frame itself.
>

It's still a waste of effort, though.  Also, I'd eventually like the
number of places in Xen code in which rsp/esp is invalid to be exactly
zero, and this approach makes this harder or even impossible.

>
>>
>>> As for XEN_DO_SYSRET32 --- we'd presumably need to have a nop for
>>> baremetal otherwise current paravirt op will use native_usergs_sysret32 (for
>>> compat code). Which means a new pv_op, I think.
>>
>> Agreed, unless...
>>
>> Does Xen have a cpufeature?  Using ALTERNATIVE instead of a pvop could
>> be easier to follow and be less code at the same time.  Frankly,
>> following the control flow from asm through the pre-paravirt-patching
>> and post-paravirt-patching variants and into the final targets is
>> getting a little bit old, and ALTERNATIVE is crystal clear in
>> comparison (and has all the interesting info inline with the rest of
>> the asm).  Of course, it doesn't work early in boot, but that's fine
>> for anything involving user/kernel switches.
>
>
>
> We don't currently have a Xen-specific CPU feature. We could, in principle,
> add it but we can't replace all of current paravirt patching with a single
> feature since PVH guests use a subset of existing pv ops (and in the future
> it may become even more fine-grained).
>
> And I don't think we should go ALTERNATIVE route for one set of features and
> keep pv ops for the rest --- it should be either one or the other.

Does PVH hook into the entry asm code at all?  I thought it was just
boot code and drivers.

In any case, someone needs to do some serious review and cleanup on
the whole paravirt op mess.  We have a bunch of paravirt ops that
serve little purpose.

The paravirt infrastructure is a bit weird, too: it seems to
effectively have four states for each patch site.  There's:

1. The initial state, which is unoptimized and works on native.
Presumably any of these that happen early also need to work, if
slowly, on Xen.

2. The Xen state without text patching.  I'm not actually sure why
this exists at all.  Are there pvops that need to switch too early for
us to patch the text?

3. The native patched state.  This is supposedly optimal, but it
results in a few more NOPs than are really needed.

4. The Xen patched state.

Alternatives have only two states, and the code is much easier to
understand.  Also, alternatives avoid things like:

...
SWAPGS
...

The reader surely doesn't remember that this isn't guaranteed to be a
swapgs instruction on native.  Using:

ALTERNATIVE "swapgs" "" X86_FEATURE_XENPV

would be safer (it would get rid of the SWAPGS_UNSAFE_STACK mess) and
much clearer.  We could hide *that* behind a macro and no one would be
confused.  (Well, they'd be confused by the fact that Xen PV handles
gsbase very differently from native, but that has nothing to do with
the macro.)

I think we could convert piecemeal, and I wonder if this new patch for
32-bit native on 4.4 (this is needed for 4.4, right?) would be a good
starting point.  Borislav, what do you think?  Would you be okay with
adding a Xen PV pseudofeature?

--Andy

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 19:03         ` Andy Lutomirski
@ 2015-11-16 19:59           ` Borislav Petkov
  2015-11-16 20:11             ` Andy Lutomirski
  2015-11-16 20:11             ` Andy Lutomirski
  2015-11-16 19:59           ` Borislav Petkov
                             ` (2 subsequent siblings)
  3 siblings, 2 replies; 51+ messages in thread
From: Borislav Petkov @ 2015-11-16 19:59 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Boris Ostrovsky, linux-kernel, xen-devel, David Vrabel,
	Konrad Rzeszutek Wilk

On Mon, Nov 16, 2015 at 11:03:22AM -0800, Andy Lutomirski wrote:

> ...
> The reader surely doesn't remember that this isn't guaranteed to be a
> swapgs instruction on native.  Using:
> 
> ALTERNATIVE "swapgs" "" X86_FEATURE_XENPV
> 
> would be safer (it would get rid of the SWAPGS_UNSAFE_STACK mess) and
> much clearer.  We could hide *that* behind a macro and no one would be
> confused.  (Well, they'd be confused by the fact that Xen PV handles
> gsbase very differently from native, but that has nothing to do with
> the macro.)
> 
> I think we could convert piecemeal, and I wonder if this new patch for
> 32-bit native on 4.4 (this is needed for 4.4, right?) would be a good
> starting point.  Borislav, what do you think?  Would you be okay with
> adding a Xen PV pseudofeature?

AFAICT, I'd prefer this becomes rather a jump label which gets enabled
on xen. Especially if a single pseudofeature might not be enough,
apprently...

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 19:03         ` Andy Lutomirski
  2015-11-16 19:59           ` Borislav Petkov
@ 2015-11-16 19:59           ` Borislav Petkov
  2015-11-16 20:31           ` Boris Ostrovsky
  2015-11-16 20:31           ` Boris Ostrovsky
  3 siblings, 0 replies; 51+ messages in thread
From: Borislav Petkov @ 2015-11-16 19:59 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Boris Ostrovsky, David Vrabel, linux-kernel, xen-devel

On Mon, Nov 16, 2015 at 11:03:22AM -0800, Andy Lutomirski wrote:

> ...
> The reader surely doesn't remember that this isn't guaranteed to be a
> swapgs instruction on native.  Using:
> 
> ALTERNATIVE "swapgs" "" X86_FEATURE_XENPV
> 
> would be safer (it would get rid of the SWAPGS_UNSAFE_STACK mess) and
> much clearer.  We could hide *that* behind a macro and no one would be
> confused.  (Well, they'd be confused by the fact that Xen PV handles
> gsbase very differently from native, but that has nothing to do with
> the macro.)
> 
> I think we could convert piecemeal, and I wonder if this new patch for
> 32-bit native on 4.4 (this is needed for 4.4, right?) would be a good
> starting point.  Borislav, what do you think?  Would you be okay with
> adding a Xen PV pseudofeature?

AFAICT, I'd prefer this becomes rather a jump label which gets enabled
on xen. Especially if a single pseudofeature might not be enough,
apprently...

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 19:59           ` Borislav Petkov
  2015-11-16 20:11             ` Andy Lutomirski
@ 2015-11-16 20:11             ` Andy Lutomirski
  2015-11-16 20:22               ` Borislav Petkov
  2015-11-16 20:22               ` Borislav Petkov
  1 sibling, 2 replies; 51+ messages in thread
From: Andy Lutomirski @ 2015-11-16 20:11 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Boris Ostrovsky, linux-kernel, xen-devel, David Vrabel,
	Konrad Rzeszutek Wilk

On Mon, Nov 16, 2015 at 11:59 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Mon, Nov 16, 2015 at 11:03:22AM -0800, Andy Lutomirski wrote:
>
>> ...
>> The reader surely doesn't remember that this isn't guaranteed to be a
>> swapgs instruction on native.  Using:
>>
>> ALTERNATIVE "swapgs" "" X86_FEATURE_XENPV
>>
>> would be safer (it would get rid of the SWAPGS_UNSAFE_STACK mess) and
>> much clearer.  We could hide *that* behind a macro and no one would be
>> confused.  (Well, they'd be confused by the fact that Xen PV handles
>> gsbase very differently from native, but that has nothing to do with
>> the macro.)
>>
>> I think we could convert piecemeal, and I wonder if this new patch for
>> 32-bit native on 4.4 (this is needed for 4.4, right?) would be a good
>> starting point.  Borislav, what do you think?  Would you be okay with
>> adding a Xen PV pseudofeature?
>
> AFAICT, I'd prefer this becomes rather a jump label which gets enabled
> on xen. Especially if a single pseudofeature might not be enough,
> apprently...

Except it's not a jump.  (Also, the alternatives infrastructure is IMO
much nicer than the jump label infrastructure.)

Taking SWAPGS as an example, the semantics we need are:

 - On native, do swapgs.  This *can't* be a call due to RSP issues.
 - On Xen PV, swapgs will work, but it's emulated.  We'd rather just nop it out.

In principle, we could static jump over it on Xen, but that also
involves forcing the jump label to be built on old GCC versions, which
PeterZ objected to the last time I asked.

If it would make you feel better, it could be X86_BUG_XENPV :-p

Are there really multiple feature bits for this stuff?  I'd like to
imagine that the entry code is all either Xen PV or native/PVH/PVHVM
-- i.e. I assumed that PVH works like native for all entries.

--Andy tip #101: Trim your mails when you reply.



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 19:59           ` Borislav Petkov
@ 2015-11-16 20:11             ` Andy Lutomirski
  2015-11-16 20:11             ` Andy Lutomirski
  1 sibling, 0 replies; 51+ messages in thread
From: Andy Lutomirski @ 2015-11-16 20:11 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Boris Ostrovsky, David Vrabel, linux-kernel, xen-devel

On Mon, Nov 16, 2015 at 11:59 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Mon, Nov 16, 2015 at 11:03:22AM -0800, Andy Lutomirski wrote:
>
>> ...
>> The reader surely doesn't remember that this isn't guaranteed to be a
>> swapgs instruction on native.  Using:
>>
>> ALTERNATIVE "swapgs" "" X86_FEATURE_XENPV
>>
>> would be safer (it would get rid of the SWAPGS_UNSAFE_STACK mess) and
>> much clearer.  We could hide *that* behind a macro and no one would be
>> confused.  (Well, they'd be confused by the fact that Xen PV handles
>> gsbase very differently from native, but that has nothing to do with
>> the macro.)
>>
>> I think we could convert piecemeal, and I wonder if this new patch for
>> 32-bit native on 4.4 (this is needed for 4.4, right?) would be a good
>> starting point.  Borislav, what do you think?  Would you be okay with
>> adding a Xen PV pseudofeature?
>
> AFAICT, I'd prefer this becomes rather a jump label which gets enabled
> on xen. Especially if a single pseudofeature might not be enough,
> apprently...

Except it's not a jump.  (Also, the alternatives infrastructure is IMO
much nicer than the jump label infrastructure.)

Taking SWAPGS as an example, the semantics we need are:

 - On native, do swapgs.  This *can't* be a call due to RSP issues.
 - On Xen PV, swapgs will work, but it's emulated.  We'd rather just nop it out.

In principle, we could static jump over it on Xen, but that also
involves forcing the jump label to be built on old GCC versions, which
PeterZ objected to the last time I asked.

If it would make you feel better, it could be X86_BUG_XENPV :-p

Are there really multiple feature bits for this stuff?  I'd like to
imagine that the entry code is all either Xen PV or native/PVH/PVHVM
-- i.e. I assumed that PVH works like native for all entries.

--Andy tip #101: Trim your mails when you reply.



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 20:11             ` Andy Lutomirski
  2015-11-16 20:22               ` Borislav Petkov
@ 2015-11-16 20:22               ` Borislav Petkov
  2015-11-16 20:48                 ` Boris Ostrovsky
                                   ` (3 more replies)
  1 sibling, 4 replies; 51+ messages in thread
From: Borislav Petkov @ 2015-11-16 20:22 UTC (permalink / raw)
  To: Andy Lutomirski, H. Peter Anvin
  Cc: Boris Ostrovsky, linux-kernel, xen-devel, David Vrabel,
	Konrad Rzeszutek Wilk

On Mon, Nov 16, 2015 at 12:11:11PM -0800, Andy Lutomirski wrote:
> On Mon, Nov 16, 2015 at 11:59 AM, Borislav Petkov <bp@alien8.de> wrote:
> > On Mon, Nov 16, 2015 at 11:03:22AM -0800, Andy Lutomirski wrote:
> >
> >> ...
> >> The reader surely doesn't remember that this isn't guaranteed to be a
> >> swapgs instruction on native.  Using:
> >>
> >> ALTERNATIVE "swapgs" "" X86_FEATURE_XENPV
> >>
> >> would be safer (it would get rid of the SWAPGS_UNSAFE_STACK mess) and
> >> much clearer.  We could hide *that* behind a macro and no one would be
> >> confused.  (Well, they'd be confused by the fact that Xen PV handles
> >> gsbase very differently from native, but that has nothing to do with
> >> the macro.)
> >>
> >> I think we could convert piecemeal, and I wonder if this new patch for
> >> 32-bit native on 4.4 (this is needed for 4.4, right?) would be a good
> >> starting point.  Borislav, what do you think?  Would you be okay with
> >> adding a Xen PV pseudofeature?
> >
> > AFAICT, I'd prefer this becomes rather a jump label which gets enabled
> > on xen. Especially if a single pseudofeature might not be enough,
> > apprently...
> 
> Except it's not a jump.  (Also, the alternatives infrastructure is IMO
> much nicer than the jump label infrastructure.)
> 
> Taking SWAPGS as an example, the semantics we need are:
> 
>  - On native, do swapgs.  This *can't* be a call due to RSP issues.
>  - On Xen PV, swapgs will work, but it's emulated.  We'd rather just nop it out.

Huh, so what's wrong with a jump:

	jmp 1f
	swapgs
	1:

> In principle, we could static jump over it on Xen, but that also
> involves forcing the jump label to be built on old GCC versions, which
> PeterZ objected to the last time I asked.

:-\

> If it would make you feel better, it could be X86_BUG_XENPV :-p

That doesn't matter - I just don't want to open the flood gates on
pseudo feature bits.

hpa, what do you think?

> Are there really multiple feature bits for this stuff?  I'd like to
> imagine that the entry code is all either Xen PV or native/PVH/PVHVM
> -- i.e. I assumed that PVH works like native for all entries.

I just reacted to Boris' statement:

"We don't currently have a Xen-specific CPU feature. We could, in
principle, add it but we can't replace all of current paravirt patching
with a single feature since PVH guests use a subset of existing pv ops
(and in the future it may become even more fine-grained)."

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 20:11             ` Andy Lutomirski
@ 2015-11-16 20:22               ` Borislav Petkov
  2015-11-16 20:22               ` Borislav Petkov
  1 sibling, 0 replies; 51+ messages in thread
From: Borislav Petkov @ 2015-11-16 20:22 UTC (permalink / raw)
  To: Andy Lutomirski, H. Peter Anvin
  Cc: Boris Ostrovsky, David Vrabel, linux-kernel, xen-devel

On Mon, Nov 16, 2015 at 12:11:11PM -0800, Andy Lutomirski wrote:
> On Mon, Nov 16, 2015 at 11:59 AM, Borislav Petkov <bp@alien8.de> wrote:
> > On Mon, Nov 16, 2015 at 11:03:22AM -0800, Andy Lutomirski wrote:
> >
> >> ...
> >> The reader surely doesn't remember that this isn't guaranteed to be a
> >> swapgs instruction on native.  Using:
> >>
> >> ALTERNATIVE "swapgs" "" X86_FEATURE_XENPV
> >>
> >> would be safer (it would get rid of the SWAPGS_UNSAFE_STACK mess) and
> >> much clearer.  We could hide *that* behind a macro and no one would be
> >> confused.  (Well, they'd be confused by the fact that Xen PV handles
> >> gsbase very differently from native, but that has nothing to do with
> >> the macro.)
> >>
> >> I think we could convert piecemeal, and I wonder if this new patch for
> >> 32-bit native on 4.4 (this is needed for 4.4, right?) would be a good
> >> starting point.  Borislav, what do you think?  Would you be okay with
> >> adding a Xen PV pseudofeature?
> >
> > AFAICT, I'd prefer this becomes rather a jump label which gets enabled
> > on xen. Especially if a single pseudofeature might not be enough,
> > apprently...
> 
> Except it's not a jump.  (Also, the alternatives infrastructure is IMO
> much nicer than the jump label infrastructure.)
> 
> Taking SWAPGS as an example, the semantics we need are:
> 
>  - On native, do swapgs.  This *can't* be a call due to RSP issues.
>  - On Xen PV, swapgs will work, but it's emulated.  We'd rather just nop it out.

Huh, so what's wrong with a jump:

	jmp 1f
	swapgs
	1:

> In principle, we could static jump over it on Xen, but that also
> involves forcing the jump label to be built on old GCC versions, which
> PeterZ objected to the last time I asked.

:-\

> If it would make you feel better, it could be X86_BUG_XENPV :-p

That doesn't matter - I just don't want to open the flood gates on
pseudo feature bits.

hpa, what do you think?

> Are there really multiple feature bits for this stuff?  I'd like to
> imagine that the entry code is all either Xen PV or native/PVH/PVHVM
> -- i.e. I assumed that PVH works like native for all entries.

I just reacted to Boris' statement:

"We don't currently have a Xen-specific CPU feature. We could, in
principle, add it but we can't replace all of current paravirt patching
with a single feature since PVH guests use a subset of existing pv ops
(and in the future it may become even more fine-grained)."

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 19:03         ` Andy Lutomirski
                             ` (2 preceding siblings ...)
  2015-11-16 20:31           ` Boris Ostrovsky
@ 2015-11-16 20:31           ` Boris Ostrovsky
  3 siblings, 0 replies; 51+ messages in thread
From: Boris Ostrovsky @ 2015-11-16 20:31 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-kernel, xen-devel, David Vrabel, Konrad Rzeszutek Wilk,
	Borislav Petkov

On 11/16/2015 02:03 PM, Andy Lutomirski wrote:
> It's still a waste of effort, though. Also, I'd eventually like the 
> number of places in Xen code in which rsp/esp is invalid to be exactly 
> zero, and this approach makes this harder or even impossible.

That's what PVH is going to do.



> Does PVH hook into the entry asm code at all?  I thought it was just
> boot code and drivers.

Not the current version --- it starts with xen_start_kernel(). But we 
are currently changing it and my plan is to have a small stub executed 
initially (to set bootparams and such) and then jump to startup_{32|64}().

>
> In any case, someone needs to do some serious review and cleanup on
> the whole paravirt op mess.  We have a bunch of paravirt ops that
> serve little purpose.
>
> The paravirt infrastructure is a bit weird, too: it seems to
> effectively have four states for each patch site.  There's:
>
> 1. The initial state, which is unoptimized and works on native.
> Presumably any of these that happen early also need to work, if
> slowly, on Xen.

Not on PV (and as of today, on PVH) --- we start directly from 
xen_start_kernel(). I.e.  from step 2.

>
> 2. The Xen state without text patching.  I'm not actually sure why
> this exists at all.  Are there pvops that need to switch too early for
> us to patch the text?

I don't think so.

-boris

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 19:03         ` Andy Lutomirski
  2015-11-16 19:59           ` Borislav Petkov
  2015-11-16 19:59           ` Borislav Petkov
@ 2015-11-16 20:31           ` Boris Ostrovsky
  2015-11-16 20:31           ` Boris Ostrovsky
  3 siblings, 0 replies; 51+ messages in thread
From: Boris Ostrovsky @ 2015-11-16 20:31 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: David Vrabel, Borislav Petkov, linux-kernel, xen-devel

On 11/16/2015 02:03 PM, Andy Lutomirski wrote:
> It's still a waste of effort, though. Also, I'd eventually like the 
> number of places in Xen code in which rsp/esp is invalid to be exactly 
> zero, and this approach makes this harder or even impossible.

That's what PVH is going to do.



> Does PVH hook into the entry asm code at all?  I thought it was just
> boot code and drivers.

Not the current version --- it starts with xen_start_kernel(). But we 
are currently changing it and my plan is to have a small stub executed 
initially (to set bootparams and such) and then jump to startup_{32|64}().

>
> In any case, someone needs to do some serious review and cleanup on
> the whole paravirt op mess.  We have a bunch of paravirt ops that
> serve little purpose.
>
> The paravirt infrastructure is a bit weird, too: it seems to
> effectively have four states for each patch site.  There's:
>
> 1. The initial state, which is unoptimized and works on native.
> Presumably any of these that happen early also need to work, if
> slowly, on Xen.

Not on PV (and as of today, on PVH) --- we start directly from 
xen_start_kernel(). I.e.  from step 2.

>
> 2. The Xen state without text patching.  I'm not actually sure why
> this exists at all.  Are there pvops that need to switch too early for
> us to patch the text?

I don't think so.

-boris

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 20:22               ` Borislav Petkov
@ 2015-11-16 20:48                 ` Boris Ostrovsky
  2015-11-16 20:50                   ` Andy Lutomirski
  2015-11-16 20:50                   ` Andy Lutomirski
  2015-11-16 20:48                 ` Boris Ostrovsky
                                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 51+ messages in thread
From: Boris Ostrovsky @ 2015-11-16 20:48 UTC (permalink / raw)
  To: Borislav Petkov, Andy Lutomirski, H. Peter Anvin
  Cc: linux-kernel, xen-devel, David Vrabel, Konrad Rzeszutek Wilk

On 11/16/2015 03:22 PM, Borislav Petkov wrote:
> On Mon, Nov 16, 2015 at 12:11:11PM -0800, Andy Lutomirski wrote:
>
>> Are there really multiple feature bits for this stuff?  I'd like to
>> imagine that the entry code is all either Xen PV or native/PVH/PVHVM
>> -- i.e. I assumed that PVH works like native for all entries.


Almost. For PVH we will have a small stub to set up bootparams and such 
but then we jump to startup_{32|64} code.


> I just reacted to Boris' statement:
>
> "We don't currently have a Xen-specific CPU feature. We could, in
> principle, add it but we can't replace all of current paravirt patching
> with a single feature since PVH guests use a subset of existing pv ops
> (and in the future it may become even more fine-grained)."

Actually, nevermind this --- I was thinking about APIC ops and they are 
not pv ops.

Note though that there are other users of pv ops --- lguest and looks 
like KVM (for one op) use them too.

-boris

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 20:22               ` Borislav Petkov
  2015-11-16 20:48                 ` Boris Ostrovsky
@ 2015-11-16 20:48                 ` Boris Ostrovsky
  2015-11-16 21:55                 ` H. Peter Anvin
  2015-11-16 21:55                 ` H. Peter Anvin
  3 siblings, 0 replies; 51+ messages in thread
From: Boris Ostrovsky @ 2015-11-16 20:48 UTC (permalink / raw)
  To: Borislav Petkov, Andy Lutomirski, H. Peter Anvin
  Cc: David Vrabel, linux-kernel, xen-devel

On 11/16/2015 03:22 PM, Borislav Petkov wrote:
> On Mon, Nov 16, 2015 at 12:11:11PM -0800, Andy Lutomirski wrote:
>
>> Are there really multiple feature bits for this stuff?  I'd like to
>> imagine that the entry code is all either Xen PV or native/PVH/PVHVM
>> -- i.e. I assumed that PVH works like native for all entries.


Almost. For PVH we will have a small stub to set up bootparams and such 
but then we jump to startup_{32|64} code.


> I just reacted to Boris' statement:
>
> "We don't currently have a Xen-specific CPU feature. We could, in
> principle, add it but we can't replace all of current paravirt patching
> with a single feature since PVH guests use a subset of existing pv ops
> (and in the future it may become even more fine-grained)."

Actually, nevermind this --- I was thinking about APIC ops and they are 
not pv ops.

Note though that there are other users of pv ops --- lguest and looks 
like KVM (for one op) use them too.

-boris

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 20:48                 ` Boris Ostrovsky
@ 2015-11-16 20:50                   ` Andy Lutomirski
  2015-11-16 21:00                     ` Borislav Petkov
                                       ` (2 more replies)
  2015-11-16 20:50                   ` Andy Lutomirski
  1 sibling, 3 replies; 51+ messages in thread
From: Andy Lutomirski @ 2015-11-16 20:50 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Borislav Petkov, H. Peter Anvin, linux-kernel, xen-devel,
	David Vrabel, Konrad Rzeszutek Wilk

On Mon, Nov 16, 2015 at 12:48 PM, Boris Ostrovsky
<boris.ostrovsky@oracle.com> wrote:
> On 11/16/2015 03:22 PM, Borislav Petkov wrote:
>>
>> On Mon, Nov 16, 2015 at 12:11:11PM -0800, Andy Lutomirski wrote:
>>
>>> Are there really multiple feature bits for this stuff?  I'd like to
>>> imagine that the entry code is all either Xen PV or native/PVH/PVHVM
>>> -- i.e. I assumed that PVH works like native for all entries.
>
>
>
> Almost. For PVH we will have a small stub to set up bootparams and such but
> then we jump to startup_{32|64} code.
>
>
>> I just reacted to Boris' statement:
>>
>> "We don't currently have a Xen-specific CPU feature. We could, in
>> principle, add it but we can't replace all of current paravirt patching
>> with a single feature since PVH guests use a subset of existing pv ops
>> (and in the future it may become even more fine-grained)."
>
>
> Actually, nevermind this --- I was thinking about APIC ops and they are not
> pv ops.
>
> Note though that there are other users of pv ops --- lguest and looks like
> KVM (for one op) use them too.
>

Honestly, I think we should just delete lguest some time soon.  And
KVM uses this stuff so lightly that we shouldn't need all of the pvop
stuff.  (In fact, I'm slowly working on removing KVM_GUEST's
dependency on PARAVIRT.)

--Andy


-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 20:48                 ` Boris Ostrovsky
  2015-11-16 20:50                   ` Andy Lutomirski
@ 2015-11-16 20:50                   ` Andy Lutomirski
  1 sibling, 0 replies; 51+ messages in thread
From: Andy Lutomirski @ 2015-11-16 20:50 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: linux-kernel, xen-devel, Borislav Petkov, David Vrabel, H. Peter Anvin

On Mon, Nov 16, 2015 at 12:48 PM, Boris Ostrovsky
<boris.ostrovsky@oracle.com> wrote:
> On 11/16/2015 03:22 PM, Borislav Petkov wrote:
>>
>> On Mon, Nov 16, 2015 at 12:11:11PM -0800, Andy Lutomirski wrote:
>>
>>> Are there really multiple feature bits for this stuff?  I'd like to
>>> imagine that the entry code is all either Xen PV or native/PVH/PVHVM
>>> -- i.e. I assumed that PVH works like native for all entries.
>
>
>
> Almost. For PVH we will have a small stub to set up bootparams and such but
> then we jump to startup_{32|64} code.
>
>
>> I just reacted to Boris' statement:
>>
>> "We don't currently have a Xen-specific CPU feature. We could, in
>> principle, add it but we can't replace all of current paravirt patching
>> with a single feature since PVH guests use a subset of existing pv ops
>> (and in the future it may become even more fine-grained)."
>
>
> Actually, nevermind this --- I was thinking about APIC ops and they are not
> pv ops.
>
> Note though that there are other users of pv ops --- lguest and looks like
> KVM (for one op) use them too.
>

Honestly, I think we should just delete lguest some time soon.  And
KVM uses this stuff so lightly that we shouldn't need all of the pvop
stuff.  (In fact, I'm slowly working on removing KVM_GUEST's
dependency on PARAVIRT.)

--Andy


-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 20:50                   ` Andy Lutomirski
  2015-11-16 21:00                     ` Borislav Petkov
@ 2015-11-16 21:00                     ` Borislav Petkov
  2015-11-16 21:03                     ` Konrad Rzeszutek Wilk
  2 siblings, 0 replies; 51+ messages in thread
From: Borislav Petkov @ 2015-11-16 21:00 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Boris Ostrovsky, H. Peter Anvin, linux-kernel, xen-devel,
	David Vrabel, Konrad Rzeszutek Wilk

On Mon, Nov 16, 2015 at 12:50:19PM -0800, Andy Lutomirski wrote:
> (In fact, I'm slowly working on removing KVM_GUEST's dependency on
> PARAVIRT.)

Good! The hope that we'll be able to remove paravirt one day is != 0
again.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 20:50                   ` Andy Lutomirski
@ 2015-11-16 21:00                     ` Borislav Petkov
  2015-11-16 21:00                     ` Borislav Petkov
  2015-11-16 21:03                     ` Konrad Rzeszutek Wilk
  2 siblings, 0 replies; 51+ messages in thread
From: Borislav Petkov @ 2015-11-16 21:00 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-kernel, xen-devel, David Vrabel, H. Peter Anvin, Boris Ostrovsky

On Mon, Nov 16, 2015 at 12:50:19PM -0800, Andy Lutomirski wrote:
> (In fact, I'm slowly working on removing KVM_GUEST's dependency on
> PARAVIRT.)

Good! The hope that we'll be able to remove paravirt one day is != 0
again.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 20:50                   ` Andy Lutomirski
  2015-11-16 21:00                     ` Borislav Petkov
  2015-11-16 21:00                     ` Borislav Petkov
@ 2015-11-16 21:03                     ` Konrad Rzeszutek Wilk
  2015-11-16 21:04                       ` Andy Lutomirski
  2015-11-16 21:04                       ` Andy Lutomirski
  2 siblings, 2 replies; 51+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-11-16 21:03 UTC (permalink / raw)
  To: Andy Lutomirski, Joao Martins
  Cc: linux-kernel, xen-devel, Borislav Petkov, David Vrabel,
	H. Peter Anvin, Boris Ostrovsky

On Mon, Nov 16, 2015 at 12:50:19PM -0800, Andy Lutomirski wrote:
> On Mon, Nov 16, 2015 at 12:48 PM, Boris Ostrovsky
> <boris.ostrovsky@oracle.com> wrote:
> > On 11/16/2015 03:22 PM, Borislav Petkov wrote:
> >>
> >> On Mon, Nov 16, 2015 at 12:11:11PM -0800, Andy Lutomirski wrote:
> >>
> >>> Are there really multiple feature bits for this stuff?  I'd like to
> >>> imagine that the entry code is all either Xen PV or native/PVH/PVHVM
> >>> -- i.e. I assumed that PVH works like native for all entries.
> >
> >
> >
> > Almost. For PVH we will have a small stub to set up bootparams and such but
> > then we jump to startup_{32|64} code.
> >
> >
> >> I just reacted to Boris' statement:
> >>
> >> "We don't currently have a Xen-specific CPU feature. We could, in
> >> principle, add it but we can't replace all of current paravirt patching
> >> with a single feature since PVH guests use a subset of existing pv ops
> >> (and in the future it may become even more fine-grained)."
> >
> >
> > Actually, nevermind this --- I was thinking about APIC ops and they are not
> > pv ops.
> >
> > Note though that there are other users of pv ops --- lguest and looks like
> > KVM (for one op) use them too.
> >
> 
> Honestly, I think we should just delete lguest some time soon.  And
> KVM uses this stuff so lightly that we shouldn't need all of the pvop
> stuff.  (In fact, I'm slowly working on removing KVM_GUEST's
> dependency on PARAVIRT.)

Even for the pvclock?

(sorry for stealing this thread on this subject).

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 21:03                     ` Konrad Rzeszutek Wilk
@ 2015-11-16 21:04                       ` Andy Lutomirski
  2015-11-17 10:53                         ` Joao Martins
  2015-11-17 10:53                         ` Joao Martins
  2015-11-16 21:04                       ` Andy Lutomirski
  1 sibling, 2 replies; 51+ messages in thread
From: Andy Lutomirski @ 2015-11-16 21:04 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Joao Martins, Boris Ostrovsky, Borislav Petkov, H. Peter Anvin,
	linux-kernel, xen-devel, David Vrabel

On Mon, Nov 16, 2015 at 1:03 PM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> On Mon, Nov 16, 2015 at 12:50:19PM -0800, Andy Lutomirski wrote:
>> On Mon, Nov 16, 2015 at 12:48 PM, Boris Ostrovsky
>> <boris.ostrovsky@oracle.com> wrote:
>> > On 11/16/2015 03:22 PM, Borislav Petkov wrote:
>> >>
>> >> On Mon, Nov 16, 2015 at 12:11:11PM -0800, Andy Lutomirski wrote:
>> >>
>> >>> Are there really multiple feature bits for this stuff?  I'd like to
>> >>> imagine that the entry code is all either Xen PV or native/PVH/PVHVM
>> >>> -- i.e. I assumed that PVH works like native for all entries.
>> >
>> >
>> >
>> > Almost. For PVH we will have a small stub to set up bootparams and such but
>> > then we jump to startup_{32|64} code.
>> >
>> >
>> >> I just reacted to Boris' statement:
>> >>
>> >> "We don't currently have a Xen-specific CPU feature. We could, in
>> >> principle, add it but we can't replace all of current paravirt patching
>> >> with a single feature since PVH guests use a subset of existing pv ops
>> >> (and in the future it may become even more fine-grained)."
>> >
>> >
>> > Actually, nevermind this --- I was thinking about APIC ops and they are not
>> > pv ops.
>> >
>> > Note though that there are other users of pv ops --- lguest and looks like
>> > KVM (for one op) use them too.
>> >
>>
>> Honestly, I think we should just delete lguest some time soon.  And
>> KVM uses this stuff so lightly that we shouldn't need all of the pvop
>> stuff.  (In fact, I'm slowly working on removing KVM_GUEST's
>> dependency on PARAVIRT.)
>
> Even for the pvclock?
>
> (sorry for stealing this thread on this subject).

I don't think that pvclock uses any of the paravirt infrastructure.
It's just another clock source AFAIK.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 21:03                     ` Konrad Rzeszutek Wilk
  2015-11-16 21:04                       ` Andy Lutomirski
@ 2015-11-16 21:04                       ` Andy Lutomirski
  1 sibling, 0 replies; 51+ messages in thread
From: Andy Lutomirski @ 2015-11-16 21:04 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: linux-kernel, xen-devel, Borislav Petkov, David Vrabel,
	H. Peter Anvin, Joao Martins, Boris Ostrovsky

On Mon, Nov 16, 2015 at 1:03 PM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> On Mon, Nov 16, 2015 at 12:50:19PM -0800, Andy Lutomirski wrote:
>> On Mon, Nov 16, 2015 at 12:48 PM, Boris Ostrovsky
>> <boris.ostrovsky@oracle.com> wrote:
>> > On 11/16/2015 03:22 PM, Borislav Petkov wrote:
>> >>
>> >> On Mon, Nov 16, 2015 at 12:11:11PM -0800, Andy Lutomirski wrote:
>> >>
>> >>> Are there really multiple feature bits for this stuff?  I'd like to
>> >>> imagine that the entry code is all either Xen PV or native/PVH/PVHVM
>> >>> -- i.e. I assumed that PVH works like native for all entries.
>> >
>> >
>> >
>> > Almost. For PVH we will have a small stub to set up bootparams and such but
>> > then we jump to startup_{32|64} code.
>> >
>> >
>> >> I just reacted to Boris' statement:
>> >>
>> >> "We don't currently have a Xen-specific CPU feature. We could, in
>> >> principle, add it but we can't replace all of current paravirt patching
>> >> with a single feature since PVH guests use a subset of existing pv ops
>> >> (and in the future it may become even more fine-grained)."
>> >
>> >
>> > Actually, nevermind this --- I was thinking about APIC ops and they are not
>> > pv ops.
>> >
>> > Note though that there are other users of pv ops --- lguest and looks like
>> > KVM (for one op) use them too.
>> >
>>
>> Honestly, I think we should just delete lguest some time soon.  And
>> KVM uses this stuff so lightly that we shouldn't need all of the pvop
>> stuff.  (In fact, I'm slowly working on removing KVM_GUEST's
>> dependency on PARAVIRT.)
>
> Even for the pvclock?
>
> (sorry for stealing this thread on this subject).

I don't think that pvclock uses any of the paravirt infrastructure.
It's just another clock source AFAIK.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 20:22               ` Borislav Petkov
                                   ` (2 preceding siblings ...)
  2015-11-16 21:55                 ` H. Peter Anvin
@ 2015-11-16 21:55                 ` H. Peter Anvin
  2015-11-17 14:40                   ` Boris Ostrovsky
  2015-11-17 14:40                   ` Boris Ostrovsky
  3 siblings, 2 replies; 51+ messages in thread
From: H. Peter Anvin @ 2015-11-16 21:55 UTC (permalink / raw)
  To: Borislav Petkov, Andy Lutomirski
  Cc: Boris Ostrovsky, linux-kernel, xen-devel, David Vrabel,
	Konrad Rzeszutek Wilk

On 11/16/15 12:22, Borislav Petkov wrote:
> 
> Huh, so what's wrong with a jump:
> 
> 	jmp 1f
> 	swapgs
> 	1:
> 

What is the point of that jump?

>> If it would make you feel better, it could be X86_BUG_XENPV :-p
> 
> That doesn't matter - I just don't want to open the flood gates on
> pseudo feature bits.
> 
> hpa, what do you think?

Pseudo feature bits are fine, we already have plenty of them.  They make
sense as they let us reuse a lot of infrastructure.

	-hpa



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 20:22               ` Borislav Petkov
  2015-11-16 20:48                 ` Boris Ostrovsky
  2015-11-16 20:48                 ` Boris Ostrovsky
@ 2015-11-16 21:55                 ` H. Peter Anvin
  2015-11-16 21:55                 ` H. Peter Anvin
  3 siblings, 0 replies; 51+ messages in thread
From: H. Peter Anvin @ 2015-11-16 21:55 UTC (permalink / raw)
  To: Borislav Petkov, Andy Lutomirski
  Cc: Boris Ostrovsky, David Vrabel, linux-kernel, xen-devel

On 11/16/15 12:22, Borislav Petkov wrote:
> 
> Huh, so what's wrong with a jump:
> 
> 	jmp 1f
> 	swapgs
> 	1:
> 

What is the point of that jump?

>> If it would make you feel better, it could be X86_BUG_XENPV :-p
> 
> That doesn't matter - I just don't want to open the flood gates on
> pseudo feature bits.
> 
> hpa, what do you think?

Pseudo feature bits are fine, we already have plenty of them.  They make
sense as they let us reuse a lot of infrastructure.

	-hpa

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 21:04                       ` Andy Lutomirski
  2015-11-17 10:53                         ` Joao Martins
@ 2015-11-17 10:53                         ` Joao Martins
  1 sibling, 0 replies; 51+ messages in thread
From: Joao Martins @ 2015-11-17 10:53 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Konrad Rzeszutek Wilk, Boris Ostrovsky, Borislav Petkov,
	H. Peter Anvin, linux-kernel, xen-devel, David Vrabel



On 11/16/2015 09:04 PM, Andy Lutomirski wrote:
> On Mon, Nov 16, 2015 at 1:03 PM, Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com> wrote:
>> On Mon, Nov 16, 2015 at 12:50:19PM -0800, Andy Lutomirski wrote:
>>> On Mon, Nov 16, 2015 at 12:48 PM, Boris Ostrovsky
>>> <boris.ostrovsky@oracle.com> wrote:
>>>> On 11/16/2015 03:22 PM, Borislav Petkov wrote:
>>>>>
>>>>> On Mon, Nov 16, 2015 at 12:11:11PM -0800, Andy Lutomirski wrote:
>>>>>
>>>>>> Are there really multiple feature bits for this stuff?  I'd like to
>>>>>> imagine that the entry code is all either Xen PV or native/PVH/PVHVM
>>>>>> -- i.e. I assumed that PVH works like native for all entries.
>>>>
>>>>
>>>>
>>>> Almost. For PVH we will have a small stub to set up bootparams and such but
>>>> then we jump to startup_{32|64} code.
>>>>
>>>>
>>>>> I just reacted to Boris' statement:
>>>>>
>>>>> "We don't currently have a Xen-specific CPU feature. We could, in
>>>>> principle, add it but we can't replace all of current paravirt patching
>>>>> with a single feature since PVH guests use a subset of existing pv ops
>>>>> (and in the future it may become even more fine-grained)."
>>>>
>>>>
>>>> Actually, nevermind this --- I was thinking about APIC ops and they are not
>>>> pv ops.
>>>>
>>>> Note though that there are other users of pv ops --- lguest and looks like
>>>> KVM (for one op) use them too.
>>>>
>>>
>>> Honestly, I think we should just delete lguest some time soon.  And
>>> KVM uses this stuff so lightly that we shouldn't need all of the pvop
>>> stuff.  (In fact, I'm slowly working on removing KVM_GUEST's
>>> dependency on PARAVIRT.)
>>
>> Even for the pvclock?
>>
>> (sorry for stealing this thread on this subject).
> 
> I don't think that pvclock uses any of the paravirt infrastructure.
> It's just another clock source AFAIK.
> 
Yeah, but pv_time_ops is still used on both Xen and KVM. Even though it looks
that on KVM some of it's used only when pvclock isn't marked as stable (i.e. no
PVCLOCK_TSC_STABLE_BIT).

Joao

> --Andy
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 21:04                       ` Andy Lutomirski
@ 2015-11-17 10:53                         ` Joao Martins
  2015-11-17 10:53                         ` Joao Martins
  1 sibling, 0 replies; 51+ messages in thread
From: Joao Martins @ 2015-11-17 10:53 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-kernel, xen-devel, Borislav Petkov, David Vrabel,
	H. Peter Anvin, Boris Ostrovsky



On 11/16/2015 09:04 PM, Andy Lutomirski wrote:
> On Mon, Nov 16, 2015 at 1:03 PM, Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com> wrote:
>> On Mon, Nov 16, 2015 at 12:50:19PM -0800, Andy Lutomirski wrote:
>>> On Mon, Nov 16, 2015 at 12:48 PM, Boris Ostrovsky
>>> <boris.ostrovsky@oracle.com> wrote:
>>>> On 11/16/2015 03:22 PM, Borislav Petkov wrote:
>>>>>
>>>>> On Mon, Nov 16, 2015 at 12:11:11PM -0800, Andy Lutomirski wrote:
>>>>>
>>>>>> Are there really multiple feature bits for this stuff?  I'd like to
>>>>>> imagine that the entry code is all either Xen PV or native/PVH/PVHVM
>>>>>> -- i.e. I assumed that PVH works like native for all entries.
>>>>
>>>>
>>>>
>>>> Almost. For PVH we will have a small stub to set up bootparams and such but
>>>> then we jump to startup_{32|64} code.
>>>>
>>>>
>>>>> I just reacted to Boris' statement:
>>>>>
>>>>> "We don't currently have a Xen-specific CPU feature. We could, in
>>>>> principle, add it but we can't replace all of current paravirt patching
>>>>> with a single feature since PVH guests use a subset of existing pv ops
>>>>> (and in the future it may become even more fine-grained)."
>>>>
>>>>
>>>> Actually, nevermind this --- I was thinking about APIC ops and they are not
>>>> pv ops.
>>>>
>>>> Note though that there are other users of pv ops --- lguest and looks like
>>>> KVM (for one op) use them too.
>>>>
>>>
>>> Honestly, I think we should just delete lguest some time soon.  And
>>> KVM uses this stuff so lightly that we shouldn't need all of the pvop
>>> stuff.  (In fact, I'm slowly working on removing KVM_GUEST's
>>> dependency on PARAVIRT.)
>>
>> Even for the pvclock?
>>
>> (sorry for stealing this thread on this subject).
> 
> I don't think that pvclock uses any of the paravirt infrastructure.
> It's just another clock source AFAIK.
> 
Yeah, but pv_time_ops is still used on both Xen and KVM. Even though it looks
that on KVM some of it's used only when pvclock isn't marked as stable (i.e. no
PVCLOCK_TSC_STABLE_BIT).

Joao

> --Andy
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 21:55                 ` H. Peter Anvin
  2015-11-17 14:40                   ` Boris Ostrovsky
@ 2015-11-17 14:40                   ` Boris Ostrovsky
  2015-11-17 18:49                     ` Andy Lutomirski
  2015-11-17 18:49                     ` Andy Lutomirski
  1 sibling, 2 replies; 51+ messages in thread
From: Boris Ostrovsky @ 2015-11-17 14:40 UTC (permalink / raw)
  To: H. Peter Anvin, Borislav Petkov, Andy Lutomirski
  Cc: linux-kernel, xen-devel, David Vrabel, Konrad Rzeszutek Wilk

On 11/16/2015 04:55 PM, H. Peter Anvin wrote:
> On 11/16/15 12:22, Borislav Petkov wrote:
>> Huh, so what's wrong with a jump:
>>
>> 	jmp 1f
>> 	swapgs
>> 	1:
>>
> What is the point of that jump?
>
>>> If it would make you feel better, it could be X86_BUG_XENPV :-p
>> That doesn't matter - I just don't want to open the flood gates on
>> pseudo feature bits.
>>
>> hpa, what do you think?
> Pseudo feature bits are fine, we already have plenty of them.  They make
> sense as they let us reuse a lot of infrastructure.


So how about something like this? And then I think we can remove 
usergs_sysret32 and irq_enable_sysexit pv ops completely as noone will 
use them (lguest doesn't set them)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 3eb572e..c43df7b 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -308,7 +308,8 @@ sysenter_past_esp:

         movl    %esp, %eax
         call    do_fast_syscall_32
-       testl   %eax, %eax
+       /* PV guests always use IRET path */
+       ALTERNATIVE "testl %eax, %eax", "jmp .Lsyscall_32_done", 
X86_FEATURE_PV
         jz      .Lsyscall_32_done

  /* Opportunistic SYSEXIT */
diff --git a/arch/x86/entry/entry_64_compat.S 
b/arch/x86/entry/entry_64_compat.S
index c320183..2d1bc82 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -121,7 +121,7 @@ sysenter_flags_fixed:

         movq    %rsp, %rdi
         call    do_fast_syscall_32
-       testl   %eax, %eax
+       ALTERNATIVE "testl %eax, %eax", "jmp .Lsyscall_32_done", 
X86_FEATURE_PV
         jz      .Lsyscall_32_done
         jmp     sysret32_from_system_call

@@ -200,7 +200,8 @@ ENTRY(entry_SYSCALL_compat)

         movq    %rsp, %rdi
         call    do_fast_syscall_32
-       testl   %eax, %eax
+       /* PV guests always use IRET path */
+       ALTERNATIVE "testl %eax, %eax", "jmp .Lsyscall_32_done", 
X86_FEATURE_PV
         jz      .Lsyscall_32_done

         /* Opportunistic SYSRET */
diff --git a/arch/x86/include/asm/cpufeature.h 
b/arch/x86/include/asm/cpufeature.h
index e4f8010..723327b 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -216,6 +216,7 @@
  #define X86_FEATURE_PAUSEFILTER ( 8*32+13) /* AMD filtered pause 
intercept */
  #define X86_FEATURE_PFTHRESHOLD ( 8*32+14) /* AMD pause filter 
threshold */
  #define X86_FEATURE_VMMCALL     ( 8*32+15) /* Prefer vmmcall to vmcall */
+#define X86_FEATURE_PV          ( 8*32+16) /* Paravirtual guest */


  /* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx), word 9 */



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-16 21:55                 ` H. Peter Anvin
@ 2015-11-17 14:40                   ` Boris Ostrovsky
  2015-11-17 14:40                   ` Boris Ostrovsky
  1 sibling, 0 replies; 51+ messages in thread
From: Boris Ostrovsky @ 2015-11-17 14:40 UTC (permalink / raw)
  To: H. Peter Anvin, Borislav Petkov, Andy Lutomirski
  Cc: David Vrabel, linux-kernel, xen-devel

On 11/16/2015 04:55 PM, H. Peter Anvin wrote:
> On 11/16/15 12:22, Borislav Petkov wrote:
>> Huh, so what's wrong with a jump:
>>
>> 	jmp 1f
>> 	swapgs
>> 	1:
>>
> What is the point of that jump?
>
>>> If it would make you feel better, it could be X86_BUG_XENPV :-p
>> That doesn't matter - I just don't want to open the flood gates on
>> pseudo feature bits.
>>
>> hpa, what do you think?
> Pseudo feature bits are fine, we already have plenty of them.  They make
> sense as they let us reuse a lot of infrastructure.


So how about something like this? And then I think we can remove 
usergs_sysret32 and irq_enable_sysexit pv ops completely as noone will 
use them (lguest doesn't set them)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 3eb572e..c43df7b 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -308,7 +308,8 @@ sysenter_past_esp:

         movl    %esp, %eax
         call    do_fast_syscall_32
-       testl   %eax, %eax
+       /* PV guests always use IRET path */
+       ALTERNATIVE "testl %eax, %eax", "jmp .Lsyscall_32_done", 
X86_FEATURE_PV
         jz      .Lsyscall_32_done

  /* Opportunistic SYSEXIT */
diff --git a/arch/x86/entry/entry_64_compat.S 
b/arch/x86/entry/entry_64_compat.S
index c320183..2d1bc82 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -121,7 +121,7 @@ sysenter_flags_fixed:

         movq    %rsp, %rdi
         call    do_fast_syscall_32
-       testl   %eax, %eax
+       ALTERNATIVE "testl %eax, %eax", "jmp .Lsyscall_32_done", 
X86_FEATURE_PV
         jz      .Lsyscall_32_done
         jmp     sysret32_from_system_call

@@ -200,7 +200,8 @@ ENTRY(entry_SYSCALL_compat)

         movq    %rsp, %rdi
         call    do_fast_syscall_32
-       testl   %eax, %eax
+       /* PV guests always use IRET path */
+       ALTERNATIVE "testl %eax, %eax", "jmp .Lsyscall_32_done", 
X86_FEATURE_PV
         jz      .Lsyscall_32_done

         /* Opportunistic SYSRET */
diff --git a/arch/x86/include/asm/cpufeature.h 
b/arch/x86/include/asm/cpufeature.h
index e4f8010..723327b 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -216,6 +216,7 @@
  #define X86_FEATURE_PAUSEFILTER ( 8*32+13) /* AMD filtered pause 
intercept */
  #define X86_FEATURE_PFTHRESHOLD ( 8*32+14) /* AMD pause filter 
threshold */
  #define X86_FEATURE_VMMCALL     ( 8*32+15) /* Prefer vmmcall to vmcall */
+#define X86_FEATURE_PV          ( 8*32+16) /* Paravirtual guest */


  /* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx), word 9 */

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-17 14:40                   ` Boris Ostrovsky
@ 2015-11-17 18:49                     ` Andy Lutomirski
  2015-11-17 19:12                       ` Andrew Cooper
  2015-11-17 19:12                       ` [Xen-devel] " Andrew Cooper
  2015-11-17 18:49                     ` Andy Lutomirski
  1 sibling, 2 replies; 51+ messages in thread
From: Andy Lutomirski @ 2015-11-17 18:49 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: xen-devel, Konrad Rzeszutek Wilk, David Vrabel, Borislav Petkov,
	linux-kernel, H. Peter Anvin

On Nov 17, 2015 6:40 AM, "Boris Ostrovsky" <boris.ostrovsky@oracle.com> wrote:
>
> On 11/16/2015 04:55 PM, H. Peter Anvin wrote:
>>
>> On 11/16/15 12:22, Borislav Petkov wrote:
>>>
>>> Huh, so what's wrong with a jump:
>>>
>>>         jmp 1f
>>>         swapgs
>>>         1:
>>>
>> What is the point of that jump?
>>
>>>> If it would make you feel better, it could be X86_BUG_XENPV :-p
>>>
>>> That doesn't matter - I just don't want to open the flood gates on
>>> pseudo feature bits.
>>>
>>> hpa, what do you think?
>>
>> Pseudo feature bits are fine, we already have plenty of them.  They make
>> sense as they let us reuse a lot of infrastructure.
>
>
>
> So how about something like this? And then I think we can remove usergs_sysret32 and irq_enable_sysexit pv ops completely as noone will use them (lguest doesn't set them)
>

Looks good to me.  Does Xen have any sysexit/sysret32 equivalent to
return to 32-bit user mode?  If so, it could be worth trying to wire
it up by patching the jz instead of the test instruction.

Also, I'd prefer X86_FEATURE_XENPV.  IMO "PV" means too many things to
too many people.

--Andy

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-17 14:40                   ` Boris Ostrovsky
  2015-11-17 18:49                     ` Andy Lutomirski
@ 2015-11-17 18:49                     ` Andy Lutomirski
  1 sibling, 0 replies; 51+ messages in thread
From: Andy Lutomirski @ 2015-11-17 18:49 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: linux-kernel, xen-devel, Borislav Petkov, David Vrabel, H. Peter Anvin

On Nov 17, 2015 6:40 AM, "Boris Ostrovsky" <boris.ostrovsky@oracle.com> wrote:
>
> On 11/16/2015 04:55 PM, H. Peter Anvin wrote:
>>
>> On 11/16/15 12:22, Borislav Petkov wrote:
>>>
>>> Huh, so what's wrong with a jump:
>>>
>>>         jmp 1f
>>>         swapgs
>>>         1:
>>>
>> What is the point of that jump?
>>
>>>> If it would make you feel better, it could be X86_BUG_XENPV :-p
>>>
>>> That doesn't matter - I just don't want to open the flood gates on
>>> pseudo feature bits.
>>>
>>> hpa, what do you think?
>>
>> Pseudo feature bits are fine, we already have plenty of them.  They make
>> sense as they let us reuse a lot of infrastructure.
>
>
>
> So how about something like this? And then I think we can remove usergs_sysret32 and irq_enable_sysexit pv ops completely as noone will use them (lguest doesn't set them)
>

Looks good to me.  Does Xen have any sysexit/sysret32 equivalent to
return to 32-bit user mode?  If so, it could be worth trying to wire
it up by patching the jz instead of the test instruction.

Also, I'd prefer X86_FEATURE_XENPV.  IMO "PV" means too many things to
too many people.

--Andy

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [Xen-devel] [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-17 18:49                     ` Andy Lutomirski
  2015-11-17 19:12                       ` Andrew Cooper
@ 2015-11-17 19:12                       ` Andrew Cooper
  2015-11-17 19:16                         ` Andy Lutomirski
  2015-11-17 19:16                         ` Andy Lutomirski
  1 sibling, 2 replies; 51+ messages in thread
From: Andrew Cooper @ 2015-11-17 19:12 UTC (permalink / raw)
  To: Andy Lutomirski, Boris Ostrovsky
  Cc: linux-kernel, xen-devel, Borislav Petkov, David Vrabel, H. Peter Anvin

On 17/11/15 18:49, Andy Lutomirski wrote:
> On Nov 17, 2015 6:40 AM, "Boris Ostrovsky" <boris.ostrovsky@oracle.com> wrote:
>> On 11/16/2015 04:55 PM, H. Peter Anvin wrote:
>>> On 11/16/15 12:22, Borislav Petkov wrote:
>>>> Huh, so what's wrong with a jump:
>>>>
>>>>         jmp 1f
>>>>         swapgs
>>>>         1:
>>>>
>>> What is the point of that jump?
>>>
>>>>> If it would make you feel better, it could be X86_BUG_XENPV :-p
>>>> That doesn't matter - I just don't want to open the flood gates on
>>>> pseudo feature bits.
>>>>
>>>> hpa, what do you think?
>>> Pseudo feature bits are fine, we already have plenty of them.  They make
>>> sense as they let us reuse a lot of infrastructure.
>>
>>
>> So how about something like this? And then I think we can remove usergs_sysret32 and irq_enable_sysexit pv ops completely as noone will use them (lguest doesn't set them)
>>
> Looks good to me.  Does Xen have any sysexit/sysret32 equivalent to
> return to 32-bit user mode?  If so, it could be worth trying to wire
> it up by patching the jz instead of the test instruction.

>From the guests point of view, there is only hypercall_iret.

>
> Also, I'd prefer X86_FEATURE_XENPV.  IMO "PV" means too many things to
> too many people.

I agree - PV on its own is too generic.

An alternative might be X86_FEATURE_XEN_PV_GUEST which is very clear an
unambiguous, although rather longer.

~Andrew

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-17 18:49                     ` Andy Lutomirski
@ 2015-11-17 19:12                       ` Andrew Cooper
  2015-11-17 19:12                       ` [Xen-devel] " Andrew Cooper
  1 sibling, 0 replies; 51+ messages in thread
From: Andrew Cooper @ 2015-11-17 19:12 UTC (permalink / raw)
  To: Andy Lutomirski, Boris Ostrovsky
  Cc: David Vrabel, Borislav Petkov, linux-kernel, H. Peter Anvin, xen-devel

On 17/11/15 18:49, Andy Lutomirski wrote:
> On Nov 17, 2015 6:40 AM, "Boris Ostrovsky" <boris.ostrovsky@oracle.com> wrote:
>> On 11/16/2015 04:55 PM, H. Peter Anvin wrote:
>>> On 11/16/15 12:22, Borislav Petkov wrote:
>>>> Huh, so what's wrong with a jump:
>>>>
>>>>         jmp 1f
>>>>         swapgs
>>>>         1:
>>>>
>>> What is the point of that jump?
>>>
>>>>> If it would make you feel better, it could be X86_BUG_XENPV :-p
>>>> That doesn't matter - I just don't want to open the flood gates on
>>>> pseudo feature bits.
>>>>
>>>> hpa, what do you think?
>>> Pseudo feature bits are fine, we already have plenty of them.  They make
>>> sense as they let us reuse a lot of infrastructure.
>>
>>
>> So how about something like this? And then I think we can remove usergs_sysret32 and irq_enable_sysexit pv ops completely as noone will use them (lguest doesn't set them)
>>
> Looks good to me.  Does Xen have any sysexit/sysret32 equivalent to
> return to 32-bit user mode?  If so, it could be worth trying to wire
> it up by patching the jz instead of the test instruction.

>From the guests point of view, there is only hypercall_iret.

>
> Also, I'd prefer X86_FEATURE_XENPV.  IMO "PV" means too many things to
> too many people.

I agree - PV on its own is too generic.

An alternative might be X86_FEATURE_XEN_PV_GUEST which is very clear an
unambiguous, although rather longer.

~Andrew

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [Xen-devel] [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-17 19:12                       ` [Xen-devel] " Andrew Cooper
@ 2015-11-17 19:16                         ` Andy Lutomirski
  2015-11-17 19:21                           ` Borislav Petkov
                                             ` (5 more replies)
  2015-11-17 19:16                         ` Andy Lutomirski
  1 sibling, 6 replies; 51+ messages in thread
From: Andy Lutomirski @ 2015-11-17 19:16 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Boris Ostrovsky, linux-kernel, xen-devel, Borislav Petkov,
	David Vrabel, H. Peter Anvin

On Tue, Nov 17, 2015 at 11:12 AM, Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
> On 17/11/15 18:49, Andy Lutomirski wrote:
>> On Nov 17, 2015 6:40 AM, "Boris Ostrovsky" <boris.ostrovsky@oracle.com> wrote:
>>> On 11/16/2015 04:55 PM, H. Peter Anvin wrote:
>>>> On 11/16/15 12:22, Borislav Petkov wrote:
>>>>> Huh, so what's wrong with a jump:
>>>>>
>>>>>         jmp 1f
>>>>>         swapgs
>>>>>         1:
>>>>>
>>>> What is the point of that jump?
>>>>
>>>>>> If it would make you feel better, it could be X86_BUG_XENPV :-p
>>>>> That doesn't matter - I just don't want to open the flood gates on
>>>>> pseudo feature bits.
>>>>>
>>>>> hpa, what do you think?
>>>> Pseudo feature bits are fine, we already have plenty of them.  They make
>>>> sense as they let us reuse a lot of infrastructure.
>>>
>>>
>>> So how about something like this? And then I think we can remove usergs_sysret32 and irq_enable_sysexit pv ops completely as noone will use them (lguest doesn't set them)
>>>
>> Looks good to me.  Does Xen have any sysexit/sysret32 equivalent to
>> return to 32-bit user mode?  If so, it could be worth trying to wire
>> it up by patching the jz instead of the test instruction.
>
> From the guests point of view, there is only hypercall_iret.

Doesn't hypercall_iret have flags that ask for different behavior,
though?  (VG_syscall or whatever for the 64-bit case?)

>
>>
>> Also, I'd prefer X86_FEATURE_XENPV.  IMO "PV" means too many things to
>> too many people.
>
> I agree - PV on its own is too generic.
>
> An alternative might be X86_FEATURE_XEN_PV_GUEST which is very clear an
> unambiguous, although rather longer.

Works for me, too, although seeing "xen_pv_host" in the Linux cpu
features would be very strange indeed :)

--Andy

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-17 19:12                       ` [Xen-devel] " Andrew Cooper
  2015-11-17 19:16                         ` Andy Lutomirski
@ 2015-11-17 19:16                         ` Andy Lutomirski
  1 sibling, 0 replies; 51+ messages in thread
From: Andy Lutomirski @ 2015-11-17 19:16 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: linux-kernel, xen-devel, Borislav Petkov, David Vrabel,
	H. Peter Anvin, Boris Ostrovsky

On Tue, Nov 17, 2015 at 11:12 AM, Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
> On 17/11/15 18:49, Andy Lutomirski wrote:
>> On Nov 17, 2015 6:40 AM, "Boris Ostrovsky" <boris.ostrovsky@oracle.com> wrote:
>>> On 11/16/2015 04:55 PM, H. Peter Anvin wrote:
>>>> On 11/16/15 12:22, Borislav Petkov wrote:
>>>>> Huh, so what's wrong with a jump:
>>>>>
>>>>>         jmp 1f
>>>>>         swapgs
>>>>>         1:
>>>>>
>>>> What is the point of that jump?
>>>>
>>>>>> If it would make you feel better, it could be X86_BUG_XENPV :-p
>>>>> That doesn't matter - I just don't want to open the flood gates on
>>>>> pseudo feature bits.
>>>>>
>>>>> hpa, what do you think?
>>>> Pseudo feature bits are fine, we already have plenty of them.  They make
>>>> sense as they let us reuse a lot of infrastructure.
>>>
>>>
>>> So how about something like this? And then I think we can remove usergs_sysret32 and irq_enable_sysexit pv ops completely as noone will use them (lguest doesn't set them)
>>>
>> Looks good to me.  Does Xen have any sysexit/sysret32 equivalent to
>> return to 32-bit user mode?  If so, it could be worth trying to wire
>> it up by patching the jz instead of the test instruction.
>
> From the guests point of view, there is only hypercall_iret.

Doesn't hypercall_iret have flags that ask for different behavior,
though?  (VG_syscall or whatever for the 64-bit case?)

>
>>
>> Also, I'd prefer X86_FEATURE_XENPV.  IMO "PV" means too many things to
>> too many people.
>
> I agree - PV on its own is too generic.
>
> An alternative might be X86_FEATURE_XEN_PV_GUEST which is very clear an
> unambiguous, although rather longer.

Works for me, too, although seeing "xen_pv_host" in the Linux cpu
features would be very strange indeed :)

--Andy

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [Xen-devel] [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-17 19:16                         ` Andy Lutomirski
  2015-11-17 19:21                           ` Borislav Petkov
@ 2015-11-17 19:21                           ` Borislav Petkov
  2015-11-17 19:29                           ` Andrew Cooper
                                             ` (3 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Borislav Petkov @ 2015-11-17 19:21 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andrew Cooper, Boris Ostrovsky, linux-kernel, xen-devel,
	David Vrabel, H. Peter Anvin

On Tue, Nov 17, 2015 at 11:16:11AM -0800, Andy Lutomirski wrote:
> Works for me, too, although seeing "xen_pv_host" in the Linux cpu
> features would be very strange indeed :)

That feature would be, of course, *not* in /proc/cpuinfo.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-17 19:16                         ` Andy Lutomirski
@ 2015-11-17 19:21                           ` Borislav Petkov
  2015-11-17 19:21                           ` [Xen-devel] " Borislav Petkov
                                             ` (4 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Borislav Petkov @ 2015-11-17 19:21 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andrew Cooper, linux-kernel, xen-devel, David Vrabel,
	H. Peter Anvin, Boris Ostrovsky

On Tue, Nov 17, 2015 at 11:16:11AM -0800, Andy Lutomirski wrote:
> Works for me, too, although seeing "xen_pv_host" in the Linux cpu
> features would be very strange indeed :)

That feature would be, of course, *not* in /proc/cpuinfo.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [Xen-devel] [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-17 19:16                         ` Andy Lutomirski
  2015-11-17 19:21                           ` Borislav Petkov
  2015-11-17 19:21                           ` [Xen-devel] " Borislav Petkov
@ 2015-11-17 19:29                           ` Andrew Cooper
  2015-11-17 19:36                             ` Andy Lutomirski
  2015-11-17 19:36                             ` Andy Lutomirski
  2015-11-17 19:29                           ` Andrew Cooper
                                             ` (2 subsequent siblings)
  5 siblings, 2 replies; 51+ messages in thread
From: Andrew Cooper @ 2015-11-17 19:29 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Boris Ostrovsky, linux-kernel, xen-devel, Borislav Petkov,
	David Vrabel, H. Peter Anvin

On 17/11/15 19:16, Andy Lutomirski wrote:
> On Tue, Nov 17, 2015 at 11:12 AM, Andrew Cooper
> <andrew.cooper3@citrix.com> wrote:
>> On 17/11/15 18:49, Andy Lutomirski wrote:
>>> On Nov 17, 2015 6:40 AM, "Boris Ostrovsky" <boris.ostrovsky@oracle.com> wrote:
>>>> On 11/16/2015 04:55 PM, H. Peter Anvin wrote:
>>>>> On 11/16/15 12:22, Borislav Petkov wrote:
>>>>>> Huh, so what's wrong with a jump:
>>>>>>
>>>>>>         jmp 1f
>>>>>>         swapgs
>>>>>>         1:
>>>>>>
>>>>> What is the point of that jump?
>>>>>
>>>>>>> If it would make you feel better, it could be X86_BUG_XENPV :-p
>>>>>> That doesn't matter - I just don't want to open the flood gates on
>>>>>> pseudo feature bits.
>>>>>>
>>>>>> hpa, what do you think?
>>>>> Pseudo feature bits are fine, we already have plenty of them.  They make
>>>>> sense as they let us reuse a lot of infrastructure.
>>>>
>>>> So how about something like this? And then I think we can remove usergs_sysret32 and irq_enable_sysexit pv ops completely as noone will use them (lguest doesn't set them)
>>>>
>>> Looks good to me.  Does Xen have any sysexit/sysret32 equivalent to
>>> return to 32-bit user mode?  If so, it could be worth trying to wire
>>> it up by patching the jz instead of the test instruction.
>> From the guests point of view, there is only hypercall_iret.
> Doesn't hypercall_iret have flags that ask for different behavior,
> though?  (VG_syscall or whatever for the 64-bit case?)

The one and only flag is VGCF_in_syscall

Xen has its own logic for choosing between sysretq/sysretl if
VGCF_in_syscall, but will end up on an iret path in all other
circumstances. 

There is definitely some room for optimisation here, but in in some
copious free time before that, I want to see about brining most of our
asm code up into C.  The vast majority of it doesn't need to be written
in asm.

>
>>> Also, I'd prefer X86_FEATURE_XENPV.  IMO "PV" means too many things to
>>> too many people.
>> I agree - PV on its own is too generic.
>>
>> An alternative might be X86_FEATURE_XEN_PV_GUEST which is very clear an
>> unambiguous, although rather longer.
> Works for me, too, although seeing "xen_pv_host" in the Linux cpu
> features would be very strange indeed :)

This makes me wonders whether the `insmod xen` project has managed to
gain any traction ;)

~Andrew

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-17 19:16                         ` Andy Lutomirski
                                             ` (2 preceding siblings ...)
  2015-11-17 19:29                           ` Andrew Cooper
@ 2015-11-17 19:29                           ` Andrew Cooper
  2015-11-17 19:37                           ` [Xen-devel] " Boris Ostrovsky
  2015-11-17 19:37                           ` Boris Ostrovsky
  5 siblings, 0 replies; 51+ messages in thread
From: Andrew Cooper @ 2015-11-17 19:29 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-kernel, xen-devel, Borislav Petkov, David Vrabel,
	H. Peter Anvin, Boris Ostrovsky

On 17/11/15 19:16, Andy Lutomirski wrote:
> On Tue, Nov 17, 2015 at 11:12 AM, Andrew Cooper
> <andrew.cooper3@citrix.com> wrote:
>> On 17/11/15 18:49, Andy Lutomirski wrote:
>>> On Nov 17, 2015 6:40 AM, "Boris Ostrovsky" <boris.ostrovsky@oracle.com> wrote:
>>>> On 11/16/2015 04:55 PM, H. Peter Anvin wrote:
>>>>> On 11/16/15 12:22, Borislav Petkov wrote:
>>>>>> Huh, so what's wrong with a jump:
>>>>>>
>>>>>>         jmp 1f
>>>>>>         swapgs
>>>>>>         1:
>>>>>>
>>>>> What is the point of that jump?
>>>>>
>>>>>>> If it would make you feel better, it could be X86_BUG_XENPV :-p
>>>>>> That doesn't matter - I just don't want to open the flood gates on
>>>>>> pseudo feature bits.
>>>>>>
>>>>>> hpa, what do you think?
>>>>> Pseudo feature bits are fine, we already have plenty of them.  They make
>>>>> sense as they let us reuse a lot of infrastructure.
>>>>
>>>> So how about something like this? And then I think we can remove usergs_sysret32 and irq_enable_sysexit pv ops completely as noone will use them (lguest doesn't set them)
>>>>
>>> Looks good to me.  Does Xen have any sysexit/sysret32 equivalent to
>>> return to 32-bit user mode?  If so, it could be worth trying to wire
>>> it up by patching the jz instead of the test instruction.
>> From the guests point of view, there is only hypercall_iret.
> Doesn't hypercall_iret have flags that ask for different behavior,
> though?  (VG_syscall or whatever for the 64-bit case?)

The one and only flag is VGCF_in_syscall

Xen has its own logic for choosing between sysretq/sysretl if
VGCF_in_syscall, but will end up on an iret path in all other
circumstances. 

There is definitely some room for optimisation here, but in in some
copious free time before that, I want to see about brining most of our
asm code up into C.  The vast majority of it doesn't need to be written
in asm.

>
>>> Also, I'd prefer X86_FEATURE_XENPV.  IMO "PV" means too many things to
>>> too many people.
>> I agree - PV on its own is too generic.
>>
>> An alternative might be X86_FEATURE_XEN_PV_GUEST which is very clear an
>> unambiguous, although rather longer.
> Works for me, too, although seeing "xen_pv_host" in the Linux cpu
> features would be very strange indeed :)

This makes me wonders whether the `insmod xen` project has managed to
gain any traction ;)

~Andrew

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [Xen-devel] [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-17 19:29                           ` Andrew Cooper
@ 2015-11-17 19:36                             ` Andy Lutomirski
  2015-11-17 19:36                             ` Andy Lutomirski
  1 sibling, 0 replies; 51+ messages in thread
From: Andy Lutomirski @ 2015-11-17 19:36 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Boris Ostrovsky, linux-kernel, xen-devel, Borislav Petkov,
	David Vrabel, H. Peter Anvin

On Tue, Nov 17, 2015 at 11:29 AM, Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
> On 17/11/15 19:16, Andy Lutomirski wrote:
>> On Tue, Nov 17, 2015 at 11:12 AM, Andrew Cooper
>> <andrew.cooper3@citrix.com> wrote:
>>> On 17/11/15 18:49, Andy Lutomirski wrote:
>>>> On Nov 17, 2015 6:40 AM, "Boris Ostrovsky" <boris.ostrovsky@oracle.com> wrote:
>>>>> On 11/16/2015 04:55 PM, H. Peter Anvin wrote:
>>>>>> On 11/16/15 12:22, Borislav Petkov wrote:
>>>>>>> Huh, so what's wrong with a jump:
>>>>>>>
>>>>>>>         jmp 1f
>>>>>>>         swapgs
>>>>>>>         1:
>>>>>>>
>>>>>> What is the point of that jump?
>>>>>>
>>>>>>>> If it would make you feel better, it could be X86_BUG_XENPV :-p
>>>>>>> That doesn't matter - I just don't want to open the flood gates on
>>>>>>> pseudo feature bits.
>>>>>>>
>>>>>>> hpa, what do you think?
>>>>>> Pseudo feature bits are fine, we already have plenty of them.  They make
>>>>>> sense as they let us reuse a lot of infrastructure.
>>>>>
>>>>> So how about something like this? And then I think we can remove usergs_sysret32 and irq_enable_sysexit pv ops completely as noone will use them (lguest doesn't set them)
>>>>>
>>>> Looks good to me.  Does Xen have any sysexit/sysret32 equivalent to
>>>> return to 32-bit user mode?  If so, it could be worth trying to wire
>>>> it up by patching the jz instead of the test instruction.
>>> From the guests point of view, there is only hypercall_iret.
>> Doesn't hypercall_iret have flags that ask for different behavior,
>> though?  (VG_syscall or whatever for the 64-bit case?)
>
> The one and only flag is VGCF_in_syscall
>
> Xen has its own logic for choosing between sysretq/sysretl if
> VGCF_in_syscall, but will end up on an iret path in all other
> circumstances.

In that case, a nicer version of this patch could preserve the new
sysret-or-iret decision (on 64-bit kernels in the compat path) and use
it to set VGCF_in_syscall.  This might work considerably better now
than it ever would have, since Linux now tries to use sysret32 on
*all* 64-bit CPUs, and the way it's structured for compat is just a
flag (the testl %eax,%eax thing) that indicates that sysret32 is okay.

Anyway, that can be a followup patch.

--Andy

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-17 19:29                           ` Andrew Cooper
  2015-11-17 19:36                             ` Andy Lutomirski
@ 2015-11-17 19:36                             ` Andy Lutomirski
  1 sibling, 0 replies; 51+ messages in thread
From: Andy Lutomirski @ 2015-11-17 19:36 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: linux-kernel, xen-devel, Borislav Petkov, David Vrabel,
	H. Peter Anvin, Boris Ostrovsky

On Tue, Nov 17, 2015 at 11:29 AM, Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
> On 17/11/15 19:16, Andy Lutomirski wrote:
>> On Tue, Nov 17, 2015 at 11:12 AM, Andrew Cooper
>> <andrew.cooper3@citrix.com> wrote:
>>> On 17/11/15 18:49, Andy Lutomirski wrote:
>>>> On Nov 17, 2015 6:40 AM, "Boris Ostrovsky" <boris.ostrovsky@oracle.com> wrote:
>>>>> On 11/16/2015 04:55 PM, H. Peter Anvin wrote:
>>>>>> On 11/16/15 12:22, Borislav Petkov wrote:
>>>>>>> Huh, so what's wrong with a jump:
>>>>>>>
>>>>>>>         jmp 1f
>>>>>>>         swapgs
>>>>>>>         1:
>>>>>>>
>>>>>> What is the point of that jump?
>>>>>>
>>>>>>>> If it would make you feel better, it could be X86_BUG_XENPV :-p
>>>>>>> That doesn't matter - I just don't want to open the flood gates on
>>>>>>> pseudo feature bits.
>>>>>>>
>>>>>>> hpa, what do you think?
>>>>>> Pseudo feature bits are fine, we already have plenty of them.  They make
>>>>>> sense as they let us reuse a lot of infrastructure.
>>>>>
>>>>> So how about something like this? And then I think we can remove usergs_sysret32 and irq_enable_sysexit pv ops completely as noone will use them (lguest doesn't set them)
>>>>>
>>>> Looks good to me.  Does Xen have any sysexit/sysret32 equivalent to
>>>> return to 32-bit user mode?  If so, it could be worth trying to wire
>>>> it up by patching the jz instead of the test instruction.
>>> From the guests point of view, there is only hypercall_iret.
>> Doesn't hypercall_iret have flags that ask for different behavior,
>> though?  (VG_syscall or whatever for the 64-bit case?)
>
> The one and only flag is VGCF_in_syscall
>
> Xen has its own logic for choosing between sysretq/sysretl if
> VGCF_in_syscall, but will end up on an iret path in all other
> circumstances.

In that case, a nicer version of this patch could preserve the new
sysret-or-iret decision (on 64-bit kernels in the compat path) and use
it to set VGCF_in_syscall.  This might work considerably better now
than it ever would have, since Linux now tries to use sysret32 on
*all* 64-bit CPUs, and the way it's structured for compat is just a
flag (the testl %eax,%eax thing) that indicates that sysret32 is okay.

Anyway, that can be a followup patch.

--Andy

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [Xen-devel] [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-17 19:16                         ` Andy Lutomirski
                                             ` (3 preceding siblings ...)
  2015-11-17 19:29                           ` Andrew Cooper
@ 2015-11-17 19:37                           ` Boris Ostrovsky
  2015-11-17 19:38                             ` Boris Ostrovsky
  2015-11-17 19:38                             ` [Xen-devel] " Boris Ostrovsky
  2015-11-17 19:37                           ` Boris Ostrovsky
  5 siblings, 2 replies; 51+ messages in thread
From: Boris Ostrovsky @ 2015-11-17 19:37 UTC (permalink / raw)
  To: Andy Lutomirski, Andrew Cooper
  Cc: linux-kernel, xen-devel, Borislav Petkov, David Vrabel, H. Peter Anvin

On 11/17/2015 02:16 PM, Andy Lutomirski wrote:
>>> Looks good to me.  Does Xen have any sysexit/sysret32 equivalent to
>>> return to 32-bit user mode?  If so, it could be worth trying to wire
>>> it up by patching the jz instead of the test instruction.

We can actually make patching a little bit more efficient by replacing 
the test instruction with 'xor %eax,%eax'. That way we won't need any 
'nop's.


-boris

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-17 19:16                         ` Andy Lutomirski
                                             ` (4 preceding siblings ...)
  2015-11-17 19:37                           ` [Xen-devel] " Boris Ostrovsky
@ 2015-11-17 19:37                           ` Boris Ostrovsky
  5 siblings, 0 replies; 51+ messages in thread
From: Boris Ostrovsky @ 2015-11-17 19:37 UTC (permalink / raw)
  To: Andy Lutomirski, Andrew Cooper
  Cc: David Vrabel, Borislav Petkov, linux-kernel, H. Peter Anvin, xen-devel

On 11/17/2015 02:16 PM, Andy Lutomirski wrote:
>>> Looks good to me.  Does Xen have any sysexit/sysret32 equivalent to
>>> return to 32-bit user mode?  If so, it could be worth trying to wire
>>> it up by patching the jz instead of the test instruction.

We can actually make patching a little bit more efficient by replacing 
the test instruction with 'xor %eax,%eax'. That way we won't need any 
'nop's.


-boris

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [Xen-devel] [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-17 19:37                           ` [Xen-devel] " Boris Ostrovsky
  2015-11-17 19:38                             ` Boris Ostrovsky
@ 2015-11-17 19:38                             ` Boris Ostrovsky
  1 sibling, 0 replies; 51+ messages in thread
From: Boris Ostrovsky @ 2015-11-17 19:38 UTC (permalink / raw)
  To: Andy Lutomirski, Andrew Cooper
  Cc: linux-kernel, xen-devel, Borislav Petkov, David Vrabel, H. Peter Anvin

On 11/17/2015 02:37 PM, Boris Ostrovsky wrote:
> On 11/17/2015 02:16 PM, Andy Lutomirski wrote:
>>>> Looks good to me.  Does Xen have any sysexit/sysret32 equivalent to
>>>> return to 32-bit user mode?  If so, it could be worth trying to wire
>>>> it up by patching the jz instead of the test instruction.
>
> We can actually make patching a little bit more efficient by replacing 
> the test instruction with 'xor %eax,%eax'. That way we won't need any 
> 'nop's.

Nevermind that, we are looking at flags, not register.

-boris

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
  2015-11-17 19:37                           ` [Xen-devel] " Boris Ostrovsky
@ 2015-11-17 19:38                             ` Boris Ostrovsky
  2015-11-17 19:38                             ` [Xen-devel] " Boris Ostrovsky
  1 sibling, 0 replies; 51+ messages in thread
From: Boris Ostrovsky @ 2015-11-17 19:38 UTC (permalink / raw)
  To: Andy Lutomirski, Andrew Cooper
  Cc: David Vrabel, Borislav Petkov, linux-kernel, H. Peter Anvin, xen-devel

On 11/17/2015 02:37 PM, Boris Ostrovsky wrote:
> On 11/17/2015 02:16 PM, Andy Lutomirski wrote:
>>>> Looks good to me.  Does Xen have any sysexit/sysret32 equivalent to
>>>> return to 32-bit user mode?  If so, it could be worth trying to wire
>>>> it up by patching the jz instead of the test instruction.
>
> We can actually make patching a little bit more efficient by replacing 
> the test instruction with 'xor %eax,%eax'. That way we won't need any 
> 'nop's.

Nevermind that, we are looking at flags, not register.

-boris

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH] xen/x86: Adjust stack pointer in xen_sysexit
@ 2015-11-13 23:18 Boris Ostrovsky
  0 siblings, 0 replies; 51+ messages in thread
From: Boris Ostrovsky @ 2015-11-13 23:18 UTC (permalink / raw)
  To: konrad.wilk, david.vrabel; +Cc: boris.ostrovsky, linux-kernel, luto, xen-devel

After 32-bit syscall rewrite, and specifically after commit 5f310f739b4c
("x86/entry/32: Re-implement SYSENTER using the new C path"), the stack
frame that is passed to xen_sysexit is no longer a "standard" one (i.e.
it's not pt_regs).

We need to adjust it so that subsequent xen_iret can use it.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---

Alternatively, we could return 0 from do_fast_syscall_32() if paravirt_enabled()
is true since Xen PV guests will end up using xen_iret one way or the other. And
then we won't need xen_sysexit at all.

 arch/x86/xen/xen-asm_32.S |   23 ++++++++++++++++-------
 1 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/arch/x86/xen/xen-asm_32.S b/arch/x86/xen/xen-asm_32.S
index fd92a64..c70ec37 100644
--- a/arch/x86/xen/xen-asm_32.S
+++ b/arch/x86/xen/xen-asm_32.S
@@ -36,15 +36,24 @@ check_events:
 
 /*
  * We can't use sysexit directly, because we're not running in ring0.
- * But we can easily fake it up using iret.  Assuming xen_sysexit is
- * jumped to with a standard stack frame, we can just strip it back to
- * a standard iret frame and use iret.
+ * But we can easily fake it up using iret.
+ * We came here from the opportunistic SYSEXIT path in entry_SYSENTER_32
+ * which left the stack looking like this:
+ * 	$__USER_DS
+ * 	%ecx
+ * 	eflags
+ *	$__USER_CS
+ *	%eip
+ *	%eax
+ *	%gs
+ *	%fs
+ *	%es
+ *	%ds	<-- %esp
+ *
+ * so we need to adjust it to look like a standard iret frame
  */
 ENTRY(xen_sysexit)
-	movl PT_EAX(%esp), %eax			/* Shouldn't be necessary? */
-	orl $X86_EFLAGS_IF, PT_EFLAGS(%esp)
-	lea PT_EIP(%esp), %esp
-
+	add $5*4, %esp
 	jmp xen_iret
 ENDPROC(xen_sysexit)
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2015-11-17 19:38 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-13 23:18 [PATCH] xen/x86: Adjust stack pointer in xen_sysexit Boris Ostrovsky
2015-11-13 23:26 ` Andy Lutomirski
2015-11-14  1:23   ` Boris Ostrovsky
2015-11-14  1:23   ` Boris Ostrovsky
2015-11-15 18:02     ` Andy Lutomirski
2015-11-15 18:02     ` Andy Lutomirski
2015-11-16 16:25       ` Boris Ostrovsky
2015-11-16 16:25       ` Boris Ostrovsky
2015-11-16 19:03         ` Andy Lutomirski
2015-11-16 19:59           ` Borislav Petkov
2015-11-16 20:11             ` Andy Lutomirski
2015-11-16 20:11             ` Andy Lutomirski
2015-11-16 20:22               ` Borislav Petkov
2015-11-16 20:22               ` Borislav Petkov
2015-11-16 20:48                 ` Boris Ostrovsky
2015-11-16 20:50                   ` Andy Lutomirski
2015-11-16 21:00                     ` Borislav Petkov
2015-11-16 21:00                     ` Borislav Petkov
2015-11-16 21:03                     ` Konrad Rzeszutek Wilk
2015-11-16 21:04                       ` Andy Lutomirski
2015-11-17 10:53                         ` Joao Martins
2015-11-17 10:53                         ` Joao Martins
2015-11-16 21:04                       ` Andy Lutomirski
2015-11-16 20:50                   ` Andy Lutomirski
2015-11-16 20:48                 ` Boris Ostrovsky
2015-11-16 21:55                 ` H. Peter Anvin
2015-11-16 21:55                 ` H. Peter Anvin
2015-11-17 14:40                   ` Boris Ostrovsky
2015-11-17 14:40                   ` Boris Ostrovsky
2015-11-17 18:49                     ` Andy Lutomirski
2015-11-17 19:12                       ` Andrew Cooper
2015-11-17 19:12                       ` [Xen-devel] " Andrew Cooper
2015-11-17 19:16                         ` Andy Lutomirski
2015-11-17 19:21                           ` Borislav Petkov
2015-11-17 19:21                           ` [Xen-devel] " Borislav Petkov
2015-11-17 19:29                           ` Andrew Cooper
2015-11-17 19:36                             ` Andy Lutomirski
2015-11-17 19:36                             ` Andy Lutomirski
2015-11-17 19:29                           ` Andrew Cooper
2015-11-17 19:37                           ` [Xen-devel] " Boris Ostrovsky
2015-11-17 19:38                             ` Boris Ostrovsky
2015-11-17 19:38                             ` [Xen-devel] " Boris Ostrovsky
2015-11-17 19:37                           ` Boris Ostrovsky
2015-11-17 19:16                         ` Andy Lutomirski
2015-11-17 18:49                     ` Andy Lutomirski
2015-11-16 19:59           ` Borislav Petkov
2015-11-16 20:31           ` Boris Ostrovsky
2015-11-16 20:31           ` Boris Ostrovsky
2015-11-16 19:03         ` Andy Lutomirski
2015-11-13 23:26 ` Andy Lutomirski
2015-11-13 23:18 Boris Ostrovsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.