linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments
@ 2019-11-11 14:30 Jan Beulich
  2019-11-11 14:32 ` [PATCH 1/2] x86/Xen/32: make xen_iret_crit_fixup independent of frame layout Jan Beulich
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Jan Beulich @ 2019-11-11 14:30 UTC (permalink / raw)
  To: Juergen Gross, Boris Ostrovsky
  Cc: lkml, the arch/x86 maintainers, Andy Lutomirski, xen-devel

The first patch here fixes another regression from 3c88c692c287
("x86/stackframe/32: Provide consistent pt_regs"), besides the
one already addressed by
https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01988.html.
The second patch is a minimal bit of cleanup on top.

1: make xen_iret_crit_fixup independent of frame layout
2: simplify xen_iret_crit_fixup's ring check

Jan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/2] x86/Xen/32: make xen_iret_crit_fixup independent of frame layout
  2019-11-11 14:30 [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments Jan Beulich
@ 2019-11-11 14:32 ` Jan Beulich
  2019-11-19 13:17   ` Jürgen Groß
  2019-11-19 21:01   ` [tip: x86/urgent] x86/xen/32: Make xen_iret_crit_fixup() " tip-bot2 for Jan Beulich
  2019-11-11 14:32 ` [PATCH 2/2] x86/Xen/32: simplify xen_iret_crit_fixup's ring check Jan Beulich
  2019-11-19 12:58 ` Ping: [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments Jan Beulich
  2 siblings, 2 replies; 13+ messages in thread
From: Jan Beulich @ 2019-11-11 14:32 UTC (permalink / raw)
  To: Juergen Gross, Boris Ostrovsky, Andy Lutomirski
  Cc: lkml, the arch/x86 maintainers, xen-devel

Now that SS:ESP always get saved by SAVE_ALL, this also needs to be
accounted for in xen_iret_crit_fixup. Otherwise the old_ax value gets
interpreted as EFLAGS, and hence VM86 mode appears to be active all
the time, leading to random "vm86_32: no user_vm86: BAD" log messages
alongside processes randomly crashing.

Since following the previous model (sitting after SAVE_ALL) would
further complicate the code _and_ retain the dependency of
xen_iret_crit_fixup on frame manipulations done by entry_32.S, switch
things around and do the adjustment ahead of SAVE_ALL.

Fixes: 3c88c692c287 ("x86/stackframe/32: Provide consistent pt_regs")
Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1341,11 +1341,6 @@ END(spurious_interrupt_bug)
 
 #ifdef CONFIG_XEN_PV
 ENTRY(xen_hypervisor_callback)
-	pushl	$-1				/* orig_ax = -1 => not a system call */
-	SAVE_ALL
-	ENCODE_FRAME_POINTER
-	TRACE_IRQS_OFF
-
 	/*
 	 * Check to see if we got the event in the critical
 	 * region in xen_iret_direct, after we've reenabled
@@ -1353,16 +1348,17 @@ ENTRY(xen_hypervisor_callback)
 	 * iret instruction's behaviour where it delivers a
 	 * pending interrupt when enabling interrupts:
 	 */
-	movl	PT_EIP(%esp), %eax
-	cmpl	$xen_iret_start_crit, %eax
+	cmpl	$xen_iret_start_crit, (%esp)
 	jb	1f
-	cmpl	$xen_iret_end_crit, %eax
+	cmpl	$xen_iret_end_crit, (%esp)
 	jae	1f
-
-	jmp	xen_iret_crit_fixup
-
-ENTRY(xen_do_upcall)
-1:	mov	%esp, %eax
+	call	xen_iret_crit_fixup
+1:
+	pushl	$-1				/* orig_ax = -1 => not a system call */
+	SAVE_ALL
+	ENCODE_FRAME_POINTER
+	TRACE_IRQS_OFF
+	mov	%esp, %eax
 	call	xen_evtchn_do_upcall
 #ifndef CONFIG_PREEMPTION
 	call	xen_maybe_preempt_hcall
--- a/arch/x86/xen/xen-asm_32.S
+++ b/arch/x86/xen/xen-asm_32.S
@@ -126,10 +126,9 @@ hyper_iret:
 	.globl xen_iret_start_crit, xen_iret_end_crit
 
 /*
- * This is called by xen_hypervisor_callback in entry.S when it sees
+ * This is called by xen_hypervisor_callback in entry_32.S when it sees
  * that the EIP at the time of interrupt was between
- * xen_iret_start_crit and xen_iret_end_crit.  We're passed the EIP in
- * %eax so we can do a more refined determination of what to do.
+ * xen_iret_start_crit and xen_iret_end_crit.
  *
  * The stack format at this point is:
  *	----------------
@@ -138,34 +137,23 @@ hyper_iret:
  *	 eflags		}  outer exception info
  *	 cs		}
  *	 eip		}
- *	---------------- <- edi (copy dest)
- *	 eax		:  outer eax if it hasn't been restored
  *	----------------
- *	 eflags		}  nested exception info
- *	 cs		}   (no ss/esp because we're nested
- *	 eip		}    from the same ring)
- *	 orig_eax	}<- esi (copy src)
- *	 - - - - - - - -
- *	 fs		}
- *	 es		}
- *	 ds		}  SAVE_ALL state
- *	 eax		}
- *	  :		:
- *	 ebx		}<- esp
+ *	 eax		:  outer eax if it hasn't been restored
  *	----------------
+ *	 eflags		}
+ *	 cs		}  nested exception info
+ *	 eip		}
+ *	 return address	: (into xen_hypervisor_callback)
  *
- * In order to deliver the nested exception properly, we need to shift
- * everything from the return addr up to the error code so it sits
- * just under the outer exception info.  This means that when we
- * handle the exception, we do it in the context of the outer
- * exception rather than starting a new one.
+ * In order to deliver the nested exception properly, we need to discard the
+ * nested exception frame such that when we handle the exception, we do it
+ * in the context of the outer exception rather than starting a new one.
  *
- * The only caveat is that if the outer eax hasn't been restored yet
- * (ie, it's still on stack), we need to insert its value into the
- * SAVE_ALL state before going on, since it's usermode state which we
- * eventually need to restore.
+ * The only caveat is that if the outer eax hasn't been restored yet (i.e.
+ * it's still on stack), we need to restore its value here.
  */
 ENTRY(xen_iret_crit_fixup)
+	pushl %ecx
 	/*
 	 * Paranoia: Make sure we're really coming from kernel space.
 	 * One could imagine a case where userspace jumps into the
@@ -176,32 +164,26 @@ ENTRY(xen_iret_crit_fixup)
 	 * jump instruction itself, not the destination, but some
 	 * virtual environments get this wrong.
 	 */
-	movl PT_CS(%esp), %ecx
+	movl 3*4(%esp), %ecx		/* nested CS */
 	andl $SEGMENT_RPL_MASK, %ecx
 	cmpl $USER_RPL, %ecx
+	popl %ecx
 	je 2f
 
-	lea PT_ORIG_EAX(%esp), %esi
-	lea PT_EFLAGS(%esp), %edi
-
 	/*
 	 * If eip is before iret_restore_end then stack
 	 * hasn't been restored yet.
 	 */
-	cmp $iret_restore_end, %eax
+	cmpl $iret_restore_end, 1*4(%esp)
 	jae 1f
 
-	movl 0+4(%edi), %eax		/* copy EAX (just above top of frame) */
-	movl %eax, PT_EAX(%esp)
-
-	lea ESP_OFFSET(%edi), %edi	/* move dest up over saved regs */
-
-	/* set up the copy */
-1:	std
-	mov $PT_EIP / 4, %ecx		/* saved regs up to orig_eax */
-	rep movsl
-	cld
-
-	lea 4(%edi), %esp		/* point esp to new frame */
-2:	jmp xen_do_upcall
-
+	movl 4*4(%esp), %eax		/* load outer EAX */
+	ret $4*4			/* discard nested EIP, CS, and EFLAGS as
+					 * well as the just restored EAX */
+
+1:
+	ret $3*4			/* discard nested EIP, CS, and EFLAGS */
+
+2:
+	ret
+END(xen_iret_crit_fixup)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 2/2] x86/Xen/32: simplify xen_iret_crit_fixup's ring check
  2019-11-11 14:30 [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments Jan Beulich
  2019-11-11 14:32 ` [PATCH 1/2] x86/Xen/32: make xen_iret_crit_fixup independent of frame layout Jan Beulich
@ 2019-11-11 14:32 ` Jan Beulich
  2019-11-19 13:18   ` Jürgen Groß
  2019-11-19 21:01   ` [tip: x86/urgent] x86/xen/32: Simplify ring check in xen_iret_crit_fixup() tip-bot2 for Jan Beulich
  2019-11-19 12:58 ` Ping: [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments Jan Beulich
  2 siblings, 2 replies; 13+ messages in thread
From: Jan Beulich @ 2019-11-11 14:32 UTC (permalink / raw)
  To: Juergen Gross, Boris Ostrovsky; +Cc: lkml, the arch/x86 maintainers, xen-devel

This can be had with two instead of six insns, by just checking the high
CS.RPL bit.

Also adjust the comment - there would be no #GP in the mentioned cases,
as there's no segment limit violation or alike. Instead there'd be #PF,
but that one reports the target EIP of said branch, not the address of
the branch insn itself.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
An alternative would be to keep using SEGMENT_RPL_MASK, but follow it
with "jpe".

--- a/arch/x86/xen/xen-asm_32.S
+++ b/arch/x86/xen/xen-asm_32.S
@@ -153,22 +153,15 @@ hyper_iret:
  * it's still on stack), we need to restore its value here.
  */
 ENTRY(xen_iret_crit_fixup)
-	pushl %ecx
 	/*
 	 * Paranoia: Make sure we're really coming from kernel space.
 	 * One could imagine a case where userspace jumps into the
 	 * critical range address, but just before the CPU delivers a
-	 * GP, it decides to deliver an interrupt instead.  Unlikely?
-	 * Definitely.  Easy to avoid?  Yes.  The Intel documents
-	 * explicitly say that the reported EIP for a bad jump is the
-	 * jump instruction itself, not the destination, but some
-	 * virtual environments get this wrong.
+	 * PF, it decides to deliver an interrupt instead.  Unlikely?
+	 * Definitely.  Easy to avoid?  Yes.
 	 */
-	movl 3*4(%esp), %ecx		/* nested CS */
-	andl $SEGMENT_RPL_MASK, %ecx
-	cmpl $USER_RPL, %ecx
-	popl %ecx
-	je 2f
+	testb $2, 2*4(%esp)		/* nested CS */
+	jnz 2f
 
 	/*
 	 * If eip is before iret_restore_end then stack


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Ping: [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments
  2019-11-11 14:30 [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments Jan Beulich
  2019-11-11 14:32 ` [PATCH 1/2] x86/Xen/32: make xen_iret_crit_fixup independent of frame layout Jan Beulich
  2019-11-11 14:32 ` [PATCH 2/2] x86/Xen/32: simplify xen_iret_crit_fixup's ring check Jan Beulich
@ 2019-11-19 12:58 ` Jan Beulich
  2019-11-19 17:50   ` Boris Ostrovsky
  2 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2019-11-19 12:58 UTC (permalink / raw)
  To: Juergen Gross, Boris Ostrovsky
  Cc: lkml, the arch/x86 maintainers, Andy Lutomirski, xen-devel

On 11.11.2019 15:30, Jan Beulich wrote:
> The first patch here fixes another regression from 3c88c692c287
> ("x86/stackframe/32: Provide consistent pt_regs"), besides the
> one already addressed by
> https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01988.html.
> The second patch is a minimal bit of cleanup on top.
> 
> 1: make xen_iret_crit_fixup independent of frame layout
> 2: simplify xen_iret_crit_fixup's ring check

Seeing that the other regression fix has been taken into -tip,
what is the situation here? Should 5.4 really ship with this
still unfixed?

Jan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] x86/Xen/32: make xen_iret_crit_fixup independent of frame layout
  2019-11-11 14:32 ` [PATCH 1/2] x86/Xen/32: make xen_iret_crit_fixup independent of frame layout Jan Beulich
@ 2019-11-19 13:17   ` Jürgen Groß
  2019-11-19 21:01   ` [tip: x86/urgent] x86/xen/32: Make xen_iret_crit_fixup() " tip-bot2 for Jan Beulich
  1 sibling, 0 replies; 13+ messages in thread
From: Jürgen Groß @ 2019-11-19 13:17 UTC (permalink / raw)
  To: Jan Beulich, Boris Ostrovsky, Andy Lutomirski
  Cc: lkml, the arch/x86 maintainers, xen-devel

On 11.11.19 15:32, Jan Beulich wrote:
> Now that SS:ESP always get saved by SAVE_ALL, this also needs to be
> accounted for in xen_iret_crit_fixup. Otherwise the old_ax value gets
> interpreted as EFLAGS, and hence VM86 mode appears to be active all
> the time, leading to random "vm86_32: no user_vm86: BAD" log messages
> alongside processes randomly crashing.
> 
> Since following the previous model (sitting after SAVE_ALL) would
> further complicate the code _and_ retain the dependency of
> xen_iret_crit_fixup on frame manipulations done by entry_32.S, switch
> things around and do the adjustment ahead of SAVE_ALL.
> 
> Fixes: 3c88c692c287 ("x86/stackframe/32: Provide consistent pt_regs")
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] x86/Xen/32: simplify xen_iret_crit_fixup's ring check
  2019-11-11 14:32 ` [PATCH 2/2] x86/Xen/32: simplify xen_iret_crit_fixup's ring check Jan Beulich
@ 2019-11-19 13:18   ` Jürgen Groß
  2019-11-19 21:01   ` [tip: x86/urgent] x86/xen/32: Simplify ring check in xen_iret_crit_fixup() tip-bot2 for Jan Beulich
  1 sibling, 0 replies; 13+ messages in thread
From: Jürgen Groß @ 2019-11-19 13:18 UTC (permalink / raw)
  To: Jan Beulich, Boris Ostrovsky; +Cc: lkml, the arch/x86 maintainers, xen-devel

On 11.11.19 15:32, Jan Beulich wrote:
> This can be had with two instead of six insns, by just checking the high
> CS.RPL bit.
> 
> Also adjust the comment - there would be no #GP in the mentioned cases,
> as there's no segment limit violation or alike. Instead there'd be #PF,
> but that one reports the target EIP of said branch, not the address of
> the branch insn itself.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Ping: [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments
  2019-11-19 12:58 ` Ping: [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments Jan Beulich
@ 2019-11-19 17:50   ` Boris Ostrovsky
  2019-11-20  2:17     ` Boris Ostrovsky
  2019-11-20  7:13     ` Jan Beulich
  0 siblings, 2 replies; 13+ messages in thread
From: Boris Ostrovsky @ 2019-11-19 17:50 UTC (permalink / raw)
  To: Jan Beulich, Juergen Gross
  Cc: lkml, the arch/x86 maintainers, Andy Lutomirski, xen-devel

On 11/19/19 7:58 AM, Jan Beulich wrote:
> On 11.11.2019 15:30, Jan Beulich wrote:
>> The first patch here fixes another regression from 3c88c692c287
>> ("x86/stackframe/32: Provide consistent pt_regs"), besides the
>> one already addressed by
>> https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01988.html.
>> The second patch is a minimal bit of cleanup on top.
>>
>> 1: make xen_iret_crit_fixup independent of frame layout
>> 2: simplify xen_iret_crit_fixup's ring check
> Seeing that the other regression fix has been taken into -tip,
> what is the situation here? Should 5.4 really ship with this
> still unfixed?


I am still unable to boot a 32-bit guest with those patches, crashing in
int3_exception_notify with regs->sp zero.

When I revert to 3c88c692c287 the guest actually boots so my (?) problem
was introduced somewhere in-between.

-boris

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tip: x86/urgent] x86/xen/32: Simplify ring check in xen_iret_crit_fixup()
  2019-11-11 14:32 ` [PATCH 2/2] x86/Xen/32: simplify xen_iret_crit_fixup's ring check Jan Beulich
  2019-11-19 13:18   ` Jürgen Groß
@ 2019-11-19 21:01   ` tip-bot2 for Jan Beulich
  1 sibling, 0 replies; 13+ messages in thread
From: tip-bot2 for Jan Beulich @ 2019-11-19 21:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Jan Beulich, Thomas Gleixner, Juergen Gross, Ingo Molnar,
	Borislav Petkov, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     922eea2ce5c799228d9ff1be9890e6873ce8fff6
Gitweb:        https://git.kernel.org/tip/922eea2ce5c799228d9ff1be9890e6873ce8fff6
Author:        Jan Beulich <jbeulich@suse.com>
AuthorDate:    Mon, 11 Nov 2019 15:32:59 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 Nov 2019 21:58:28 +01:00

x86/xen/32: Simplify ring check in xen_iret_crit_fixup()

This can be had with two instead of six insns, by just checking the high
CS.RPL bit.

Also adjust the comment - there would be no #GP in the mentioned cases, as
there's no segment limit violation or alike. Instead there'd be #PF, but
that one reports the target EIP of said branch, not the address of the
branch insn itself.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lkml.kernel.org/r/a5986837-01eb-7bf8-bf42-4d3084d6a1f5@suse.com

---
 arch/x86/xen/xen-asm_32.S | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/arch/x86/xen/xen-asm_32.S b/arch/x86/xen/xen-asm_32.S
index 392e033..cd17777 100644
--- a/arch/x86/xen/xen-asm_32.S
+++ b/arch/x86/xen/xen-asm_32.S
@@ -153,22 +153,15 @@ hyper_iret:
  * it's still on stack), we need to restore its value here.
  */
 ENTRY(xen_iret_crit_fixup)
-	pushl %ecx
 	/*
 	 * Paranoia: Make sure we're really coming from kernel space.
 	 * One could imagine a case where userspace jumps into the
 	 * critical range address, but just before the CPU delivers a
-	 * GP, it decides to deliver an interrupt instead.  Unlikely?
-	 * Definitely.  Easy to avoid?  Yes.  The Intel documents
-	 * explicitly say that the reported EIP for a bad jump is the
-	 * jump instruction itself, not the destination, but some
-	 * virtual environments get this wrong.
+	 * PF, it decides to deliver an interrupt instead.  Unlikely?
+	 * Definitely.  Easy to avoid?  Yes.
 	 */
-	movl 3*4(%esp), %ecx		/* nested CS */
-	andl $SEGMENT_RPL_MASK, %ecx
-	cmpl $USER_RPL, %ecx
-	popl %ecx
-	je 2f
+	testb $2, 2*4(%esp)		/* nested CS */
+	jnz 2f
 
 	/*
 	 * If eip is before iret_restore_end then stack

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [tip: x86/urgent] x86/xen/32: Make xen_iret_crit_fixup() independent of frame layout
  2019-11-11 14:32 ` [PATCH 1/2] x86/Xen/32: make xen_iret_crit_fixup independent of frame layout Jan Beulich
  2019-11-19 13:17   ` Jürgen Groß
@ 2019-11-19 21:01   ` tip-bot2 for Jan Beulich
  1 sibling, 0 replies; 13+ messages in thread
From: tip-bot2 for Jan Beulich @ 2019-11-19 21:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Jan Beulich, Thomas Gleixner, Juergen Gross, Stable Team,
	Ingo Molnar, Borislav Petkov, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     29b810f5a5ec127d3143770098e05981baa3eb77
Gitweb:        https://git.kernel.org/tip/29b810f5a5ec127d3143770098e05981baa3eb77
Author:        Jan Beulich <jbeulich@suse.com>
AuthorDate:    Mon, 11 Nov 2019 15:32:12 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 Nov 2019 21:58:28 +01:00

x86/xen/32: Make xen_iret_crit_fixup() independent of frame layout

Now that SS:ESP always get saved by SAVE_ALL, this also needs to be
accounted for in xen_iret_crit_fixup(). Otherwise the old_ax value gets
interpreted as EFLAGS, and hence VM86 mode appears to be active all the
time, leading to random "vm86_32: no user_vm86: BAD" log messages alongside
processes randomly crashing.

Since following the previous model (sitting after SAVE_ALL) would further
complicate the code _and_ retain the dependency of xen_iret_crit_fixup() on
frame manipulations done by entry_32.S, switch things around and do the
adjustment ahead of SAVE_ALL.

Fixes: 3c88c692c287 ("x86/stackframe/32: Provide consistent pt_regs")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Stable Team <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/32d8713d-25a7-84ab-b74b-aa3e88abce6b@suse.com


---
 arch/x86/entry/entry_32.S | 22 +++++--------
 arch/x86/xen/xen-asm_32.S | 66 +++++++++++++-------------------------
 2 files changed, 33 insertions(+), 55 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 3f847d8..019dbac 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1341,11 +1341,6 @@ END(spurious_interrupt_bug)
 
 #ifdef CONFIG_XEN_PV
 ENTRY(xen_hypervisor_callback)
-	pushl	$-1				/* orig_ax = -1 => not a system call */
-	SAVE_ALL
-	ENCODE_FRAME_POINTER
-	TRACE_IRQS_OFF
-
 	/*
 	 * Check to see if we got the event in the critical
 	 * region in xen_iret_direct, after we've reenabled
@@ -1353,16 +1348,17 @@ ENTRY(xen_hypervisor_callback)
 	 * iret instruction's behaviour where it delivers a
 	 * pending interrupt when enabling interrupts:
 	 */
-	movl	PT_EIP(%esp), %eax
-	cmpl	$xen_iret_start_crit, %eax
+	cmpl	$xen_iret_start_crit, (%esp)
 	jb	1f
-	cmpl	$xen_iret_end_crit, %eax
+	cmpl	$xen_iret_end_crit, (%esp)
 	jae	1f
-
-	jmp	xen_iret_crit_fixup
-
-ENTRY(xen_do_upcall)
-1:	mov	%esp, %eax
+	call	xen_iret_crit_fixup
+1:
+	pushl	$-1				/* orig_ax = -1 => not a system call */
+	SAVE_ALL
+	ENCODE_FRAME_POINTER
+	TRACE_IRQS_OFF
+	mov	%esp, %eax
 	call	xen_evtchn_do_upcall
 #ifndef CONFIG_PREEMPTION
 	call	xen_maybe_preempt_hcall
diff --git a/arch/x86/xen/xen-asm_32.S b/arch/x86/xen/xen-asm_32.S
index c15db06..392e033 100644
--- a/arch/x86/xen/xen-asm_32.S
+++ b/arch/x86/xen/xen-asm_32.S
@@ -126,10 +126,9 @@ hyper_iret:
 	.globl xen_iret_start_crit, xen_iret_end_crit
 
 /*
- * This is called by xen_hypervisor_callback in entry.S when it sees
+ * This is called by xen_hypervisor_callback in entry_32.S when it sees
  * that the EIP at the time of interrupt was between
- * xen_iret_start_crit and xen_iret_end_crit.  We're passed the EIP in
- * %eax so we can do a more refined determination of what to do.
+ * xen_iret_start_crit and xen_iret_end_crit.
  *
  * The stack format at this point is:
  *	----------------
@@ -138,34 +137,23 @@ hyper_iret:
  *	 eflags		}  outer exception info
  *	 cs		}
  *	 eip		}
- *	---------------- <- edi (copy dest)
- *	 eax		:  outer eax if it hasn't been restored
  *	----------------
- *	 eflags		}  nested exception info
- *	 cs		}   (no ss/esp because we're nested
- *	 eip		}    from the same ring)
- *	 orig_eax	}<- esi (copy src)
- *	 - - - - - - - -
- *	 fs		}
- *	 es		}
- *	 ds		}  SAVE_ALL state
- *	 eax		}
- *	  :		:
- *	 ebx		}<- esp
+ *	 eax		:  outer eax if it hasn't been restored
  *	----------------
+ *	 eflags		}
+ *	 cs		}  nested exception info
+ *	 eip		}
+ *	 return address	: (into xen_hypervisor_callback)
  *
- * In order to deliver the nested exception properly, we need to shift
- * everything from the return addr up to the error code so it sits
- * just under the outer exception info.  This means that when we
- * handle the exception, we do it in the context of the outer
- * exception rather than starting a new one.
+ * In order to deliver the nested exception properly, we need to discard the
+ * nested exception frame such that when we handle the exception, we do it
+ * in the context of the outer exception rather than starting a new one.
  *
- * The only caveat is that if the outer eax hasn't been restored yet
- * (ie, it's still on stack), we need to insert its value into the
- * SAVE_ALL state before going on, since it's usermode state which we
- * eventually need to restore.
+ * The only caveat is that if the outer eax hasn't been restored yet (i.e.
+ * it's still on stack), we need to restore its value here.
  */
 ENTRY(xen_iret_crit_fixup)
+	pushl %ecx
 	/*
 	 * Paranoia: Make sure we're really coming from kernel space.
 	 * One could imagine a case where userspace jumps into the
@@ -176,32 +164,26 @@ ENTRY(xen_iret_crit_fixup)
 	 * jump instruction itself, not the destination, but some
 	 * virtual environments get this wrong.
 	 */
-	movl PT_CS(%esp), %ecx
+	movl 3*4(%esp), %ecx		/* nested CS */
 	andl $SEGMENT_RPL_MASK, %ecx
 	cmpl $USER_RPL, %ecx
+	popl %ecx
 	je 2f
 
-	lea PT_ORIG_EAX(%esp), %esi
-	lea PT_EFLAGS(%esp), %edi
-
 	/*
 	 * If eip is before iret_restore_end then stack
 	 * hasn't been restored yet.
 	 */
-	cmp $iret_restore_end, %eax
+	cmpl $iret_restore_end, 1*4(%esp)
 	jae 1f
 
-	movl 0+4(%edi), %eax		/* copy EAX (just above top of frame) */
-	movl %eax, PT_EAX(%esp)
+	movl 4*4(%esp), %eax		/* load outer EAX */
+	ret $4*4			/* discard nested EIP, CS, and EFLAGS as
+					 * well as the just restored EAX */
 
-	lea ESP_OFFSET(%edi), %edi	/* move dest up over saved regs */
-
-	/* set up the copy */
-1:	std
-	mov $PT_EIP / 4, %ecx		/* saved regs up to orig_eax */
-	rep movsl
-	cld
-
-	lea 4(%edi), %esp		/* point esp to new frame */
-2:	jmp xen_do_upcall
+1:
+	ret $3*4			/* discard nested EIP, CS, and EFLAGS */
 
+2:
+	ret
+END(xen_iret_crit_fixup)

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: Ping: [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments
  2019-11-19 17:50   ` Boris Ostrovsky
@ 2019-11-20  2:17     ` Boris Ostrovsky
  2019-11-20  2:39       ` Boris Ostrovsky
  2019-11-20  7:13     ` Jan Beulich
  1 sibling, 1 reply; 13+ messages in thread
From: Boris Ostrovsky @ 2019-11-20  2:17 UTC (permalink / raw)
  To: Jan Beulich, Juergen Gross
  Cc: lkml, the arch/x86 maintainers, Andy Lutomirski, xen-devel

On 11/19/19 12:50 PM, Boris Ostrovsky wrote:
> On 11/19/19 7:58 AM, Jan Beulich wrote:
>> On 11.11.2019 15:30, Jan Beulich wrote:
>>> The first patch here fixes another regression from 3c88c692c287
>>> ("x86/stackframe/32: Provide consistent pt_regs"), besides the
>>> one already addressed by
>>> https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01988.html.
>>> The second patch is a minimal bit of cleanup on top.
>>>
>>> 1: make xen_iret_crit_fixup independent of frame layout
>>> 2: simplify xen_iret_crit_fixup's ring check
>> Seeing that the other regression fix has been taken into -tip,
>> what is the situation here? Should 5.4 really ship with this
>> still unfixed?
>
> I am still unable to boot a 32-bit guest with those patches, crashing in
> int3_exception_notify with regs->sp zero.
>
> When I revert to 3c88c692c287 the guest actually boots so my (?) problem
> was introduced somewhere in-between.

Nevermind this. I didn't read your patches correctly.

-boris

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Ping: [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments
  2019-11-20  2:17     ` Boris Ostrovsky
@ 2019-11-20  2:39       ` Boris Ostrovsky
  2019-11-20  7:18         ` [Xen-devel] " Jan Beulich
  0 siblings, 1 reply; 13+ messages in thread
From: Boris Ostrovsky @ 2019-11-20  2:39 UTC (permalink / raw)
  To: Jan Beulich, Juergen Gross
  Cc: lkml, the arch/x86 maintainers, Andy Lutomirski, xen-devel

On 11/19/19 9:17 PM, Boris Ostrovsky wrote:
> On 11/19/19 12:50 PM, Boris Ostrovsky wrote:
>> On 11/19/19 7:58 AM, Jan Beulich wrote:
>>> On 11.11.2019 15:30, Jan Beulich wrote:
>>>> The first patch here fixes another regression from 3c88c692c287
>>>> ("x86/stackframe/32: Provide consistent pt_regs"), besides the
>>>> one already addressed by
>>>> https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01988.html.
>>>> The second patch is a minimal bit of cleanup on top.
>>>>
>>>> 1: make xen_iret_crit_fixup independent of frame layout
>>>> 2: simplify xen_iret_crit_fixup's ring check
>>> Seeing that the other regression fix has been taken into -tip,
>>> what is the situation here? Should 5.4 really ship with this
>>> still unfixed?
>> I am still unable to boot a 32-bit guest with those patches, crashing in
>> int3_exception_notify with regs->sp zero.
>>
>> When I revert to 3c88c692c287 the guest actually boots so my (?) problem
>> was introduced somewhere in-between.
> Nevermind this. I didn't read your patches correctly.

BTW, I'd rather this not go into 5.4 this late. 3c88c692c287 has been
there since 5.2 and noone complained.

-boris



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Xen-devel] Ping: [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments
  2019-11-19 17:50   ` Boris Ostrovsky
  2019-11-20  2:17     ` Boris Ostrovsky
@ 2019-11-20  7:13     ` Jan Beulich
  1 sibling, 0 replies; 13+ messages in thread
From: Jan Beulich @ 2019-11-20  7:13 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Juergen Gross, Andy Lutomirski, the arch/x86 maintainers,
	xen-devel, lkml

On 19.11.2019 18:50, Boris Ostrovsky wrote:
> On 11/19/19 7:58 AM, Jan Beulich wrote:
>> On 11.11.2019 15:30, Jan Beulich wrote:
>>> The first patch here fixes another regression from 3c88c692c287
>>> ("x86/stackframe/32: Provide consistent pt_regs"), besides the
>>> one already addressed by
>>> https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01988.html.
>>> The second patch is a minimal bit of cleanup on top.
>>>
>>> 1: make xen_iret_crit_fixup independent of frame layout
>>> 2: simplify xen_iret_crit_fixup's ring check
>> Seeing that the other regression fix has been taken into -tip,
>> what is the situation here? Should 5.4 really ship with this
>> still unfixed?
> 
> 
> I am still unable to boot a 32-bit guest with those patches, crashing in
> int3_exception_notify with regs->sp zero.
> 
> When I revert to 3c88c692c287 the guest actually boots so my (?) problem
> was introduced somewhere in-between.

In order to even get as far as the patches here taking effect
you first need "x86/stackframe/32: repair 32-bit Xen PV" (which
is what "the other regression fix" in my ping refers to).

Jan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Xen-devel] Ping: [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments
  2019-11-20  2:39       ` Boris Ostrovsky
@ 2019-11-20  7:18         ` Jan Beulich
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Beulich @ 2019-11-20  7:18 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Juergen Gross, Andy Lutomirski, the arch/x86 maintainers,
	xen-devel, lkml

On 20.11.2019 03:39, Boris Ostrovsky wrote:
> On 11/19/19 9:17 PM, Boris Ostrovsky wrote:
>> On 11/19/19 12:50 PM, Boris Ostrovsky wrote:
>>> On 11/19/19 7:58 AM, Jan Beulich wrote:
>>>> On 11.11.2019 15:30, Jan Beulich wrote:
>>>>> The first patch here fixes another regression from 3c88c692c287
>>>>> ("x86/stackframe/32: Provide consistent pt_regs"), besides the
>>>>> one already addressed by
>>>>> https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01988.html.
>>>>> The second patch is a minimal bit of cleanup on top.
>>>>>
>>>>> 1: make xen_iret_crit_fixup independent of frame layout
>>>>> 2: simplify xen_iret_crit_fixup's ring check
>>>> Seeing that the other regression fix has been taken into -tip,
>>>> what is the situation here? Should 5.4 really ship with this
>>>> still unfixed?
>>> I am still unable to boot a 32-bit guest with those patches, crashing in
>>> int3_exception_notify with regs->sp zero.
>>>
>>> When I revert to 3c88c692c287 the guest actually boots so my (?) problem
>>> was introduced somewhere in-between.
>> Nevermind this. I didn't read your patches correctly.
> 
> BTW, I'd rather this not go into 5.4 this late. 3c88c692c287 has been
> there since 5.2 and noone complained.

Afaict the issues were introduced in 5.3, and my first patch (including
a note [complaint if you will] of the second issue) was sent around
5.4-rc2. This has been blocking osstest's linux-linus forever since, so
even without my mail everyone could have been aware by paying attention
to the flight reports (the bisection ones, unfortunately, are pretty
useless here, as in cases like this one they seem to tend to point at
huge merge commits).

Jan

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-11-20  7:18 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-11 14:30 [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments Jan Beulich
2019-11-11 14:32 ` [PATCH 1/2] x86/Xen/32: make xen_iret_crit_fixup independent of frame layout Jan Beulich
2019-11-19 13:17   ` Jürgen Groß
2019-11-19 21:01   ` [tip: x86/urgent] x86/xen/32: Make xen_iret_crit_fixup() " tip-bot2 for Jan Beulich
2019-11-11 14:32 ` [PATCH 2/2] x86/Xen/32: simplify xen_iret_crit_fixup's ring check Jan Beulich
2019-11-19 13:18   ` Jürgen Groß
2019-11-19 21:01   ` [tip: x86/urgent] x86/xen/32: Simplify ring check in xen_iret_crit_fixup() tip-bot2 for Jan Beulich
2019-11-19 12:58 ` Ping: [PATCH 0/2] x86/Xen/32: xen_iret_crit_fixup adjustments Jan Beulich
2019-11-19 17:50   ` Boris Ostrovsky
2019-11-20  2:17     ` Boris Ostrovsky
2019-11-20  2:39       ` Boris Ostrovsky
2019-11-20  7:18         ` [Xen-devel] " Jan Beulich
2019-11-20  7:13     ` Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).