All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/2] x86_64: Sigcontext improvements
@ 2015-03-12 20:57 Andy Lutomirski
  2015-03-12 20:57 ` [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs Andy Lutomirski
                   ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Andy Lutomirski @ 2015-03-12 20:57 UTC (permalink / raw)
  To: Ingo Molnar, Andi Kleen, H. Peter Anvin
  Cc: Al Viro, x86, linux-kernel, Linus Torvalds, Oleg Nesterov,
	Borislav Petkov, Andy Lutomirski

Patch 1 is IMO a bug fix.

Patch 2 is a cleanup and avoids some confusion.  It's also sort of an
API break (no ABI change) because it removes a struct field.  But that
struct field has never done anything at all as far as I can tell.

AFAICT this series got bogged down in discussion about patch 2 last
time.  I'm only including patch 2 for completeness here -- patch 1
is IMO far more useful, and I'm fine with patch 1 being merged and
tabling patch 2 again.

Changes from v1:
 - Add the __USER_DS fix, caught by my sigreturn test.
 - Re-tested locally.

Changes from 2:
 - Fixed comment style in patch 1.
 - Added a better comment in patch 2.

Andy Lutomirski (2):
  x86_64,signal: Fix SS handling for signals delivered to 64-bit
    programs
  x86_64,signal: Remove 'fs' and 'gs' from sigcontext

 arch/x86/include/asm/sigcontext.h      |  6 +++---
 arch/x86/include/uapi/asm/sigcontext.h | 21 ++++++++++++++++++---
 arch/x86/kernel/signal.c               | 26 +++++++++++++++-----------
 3 files changed, 36 insertions(+), 17 deletions(-)

-- 
2.3.0


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-12 20:57 [PATCH v3 0/2] x86_64: Sigcontext improvements Andy Lutomirski
@ 2015-03-12 20:57 ` Andy Lutomirski
  2015-03-13 16:13   ` Borislav Petkov
                     ` (4 more replies)
  2015-03-12 20:57 ` [PATCH v3 2/2] x86_64,signal: Remove 'fs' and 'gs' from sigcontext Andy Lutomirski
  2015-03-13 15:31 ` [PATCH v3 0/2] x86_64: Sigcontext improvements Oleg Nesterov
  2 siblings, 5 replies; 39+ messages in thread
From: Andy Lutomirski @ 2015-03-12 20:57 UTC (permalink / raw)
  To: Ingo Molnar, Andi Kleen, H. Peter Anvin
  Cc: Al Viro, x86, linux-kernel, Linus Torvalds, Oleg Nesterov,
	Borislav Petkov, Andy Lutomirski

From: Andy Lutomirski <luto@amacapital.net>

The comment in the signal code says that apps can save/restore other
segments on their own.  It's true that apps can *save* SS on their
own, but there's no way for apps to restore it: SYSCALL effectively
resets SS to __USER_DS, so any value that user code tries to load
into SS gets lost on entry to sigreturn.

This recycles two padding bytes in the segment selector area for SS.

While we're at it, we need a second change to make this useful.  If
the signal we're delivering is caused by a bad SS value, saving that
value isn't enough.  We need to remove that bad value from the regs
before we try to deliver the signal.  Oddly, x32 already got this
right.

I suspect that 64-bit programs that try to run 16-bit code and use
signals will have a lot of trouble without this.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
---
 arch/x86/include/asm/sigcontext.h      |  2 +-
 arch/x86/include/uapi/asm/sigcontext.h |  2 +-
 arch/x86/kernel/signal.c               | 22 +++++++++++++---------
 3 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/sigcontext.h b/arch/x86/include/asm/sigcontext.h
index 9dfce4e0417d..f910cdcb71fd 100644
--- a/arch/x86/include/asm/sigcontext.h
+++ b/arch/x86/include/asm/sigcontext.h
@@ -59,7 +59,7 @@ struct sigcontext {
 	unsigned short cs;
 	unsigned short gs;
 	unsigned short fs;
-	unsigned short __pad0;
+	unsigned short ss;
 	unsigned long err;
 	unsigned long trapno;
 	unsigned long oldmask;
diff --git a/arch/x86/include/uapi/asm/sigcontext.h b/arch/x86/include/uapi/asm/sigcontext.h
index d8b9f9081e86..076b11fd6fa1 100644
--- a/arch/x86/include/uapi/asm/sigcontext.h
+++ b/arch/x86/include/uapi/asm/sigcontext.h
@@ -179,7 +179,7 @@ struct sigcontext {
 	__u16 cs;
 	__u16 gs;
 	__u16 fs;
-	__u16 __pad0;
+	__u16 ss;
 	__u64 err;
 	__u64 trapno;
 	__u64 oldmask;
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index ed37a768d0fc..9511eb7f17b0 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -94,15 +94,8 @@ int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc,
 		COPY(r15);
 #endif /* CONFIG_X86_64 */
 
-#ifdef CONFIG_X86_32
 		COPY_SEG_CPL3(cs);
 		COPY_SEG_CPL3(ss);
-#else /* !CONFIG_X86_32 */
-		/* Kernel saves and restores only the CS segment register on signals,
-		 * which is the bare minimum needed to allow mixed 32/64-bit code.
-		 * App's signal handler can save/restore other segments if needed. */
-		COPY_SEG_CPL3(cs);
-#endif /* CONFIG_X86_32 */
 
 		get_user_ex(tmpflags, &sc->flags);
 		regs->flags = (regs->flags & ~FIX_EFLAGS) | (tmpflags & FIX_EFLAGS);
@@ -164,6 +157,7 @@ int setup_sigcontext(struct sigcontext __user *sc, void __user *fpstate,
 		put_user_ex(regs->cs, &sc->cs);
 		put_user_ex(0, &sc->gs);
 		put_user_ex(0, &sc->fs);
+		put_user_ex(regs->ss, &sc->ss);
 #endif /* CONFIG_X86_32 */
 
 		put_user_ex(fpstate, &sc->fpstate);
@@ -457,9 +451,19 @@ static int __setup_rt_frame(int sig, struct ksignal *ksig,
 
 	regs->sp = (unsigned long)frame;
 
-	/* Set up the CS register to run signal handlers in 64-bit mode,
-	   even if the handler happens to be interrupting 32-bit code. */
+	/*
+	 * Set up the CS and SS registers to run signal handlers in
+	 * 64-bit mode, even if the handler happens to be interrupting
+	 * 32-bit or 16-bit code.
+	 *
+	 * SS is subtle.  In 64-bit mode, we don't need any particular
+	 * SS descriptor, but we do need SS to be valid.  It's possible
+	 * that the old SS is entirely bogus -- this can happen if the
+	 * signal we're trying to deliver is #GP or #SS caused by a bad
+	 * SS value.
+	 */
 	regs->cs = __USER_CS;
+	regs->ss = __USER_DS;
 
 	return 0;
 }
-- 
2.3.0


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 2/2] x86_64,signal: Remove 'fs' and 'gs' from sigcontext
  2015-03-12 20:57 [PATCH v3 0/2] x86_64: Sigcontext improvements Andy Lutomirski
  2015-03-12 20:57 ` [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs Andy Lutomirski
@ 2015-03-12 20:57 ` Andy Lutomirski
  2015-03-16 12:08   ` [tip:x86/asm] x86/signal/64: " tip-bot for Andy Lutomirski
  2015-03-17  8:44   ` tip-bot for Andy Lutomirski
  2015-03-13 15:31 ` [PATCH v3 0/2] x86_64: Sigcontext improvements Oleg Nesterov
  2 siblings, 2 replies; 39+ messages in thread
From: Andy Lutomirski @ 2015-03-12 20:57 UTC (permalink / raw)
  To: Ingo Molnar, Andi Kleen, H. Peter Anvin
  Cc: Al Viro, x86, linux-kernel, Linus Torvalds, Oleg Nesterov,
	Borislav Petkov, Andy Lutomirski

From: Andy Lutomirski <luto@amacapital.net>

As far as I can tell, these fields have been set to zero on save and
ignored on restore since Linux was imported into git.  Rename them
'__pad1' and '__pad2' to avoid confusion.  This may also allow
us to recycle them some day.

This also adds a comment clarifying the history of those fields.

I'm intentionally avoiding calling either of them __pad0: the field
formerly known as __pad0 is now ss.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
---
 arch/x86/include/asm/sigcontext.h      |  4 ++--
 arch/x86/include/uapi/asm/sigcontext.h | 19 +++++++++++++++++--
 arch/x86/kernel/signal.c               |  4 ++--
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/sigcontext.h b/arch/x86/include/asm/sigcontext.h
index f910cdcb71fd..6fe6b182c998 100644
--- a/arch/x86/include/asm/sigcontext.h
+++ b/arch/x86/include/asm/sigcontext.h
@@ -57,8 +57,8 @@ struct sigcontext {
 	unsigned long ip;
 	unsigned long flags;
 	unsigned short cs;
-	unsigned short gs;
-	unsigned short fs;
+	unsigned short __pad2;	/* Was called gs, but was always zero. */
+	unsigned short __pad1;	/* Was called fs, but was always zero. */
 	unsigned short ss;
 	unsigned long err;
 	unsigned long trapno;
diff --git a/arch/x86/include/uapi/asm/sigcontext.h b/arch/x86/include/uapi/asm/sigcontext.h
index 076b11fd6fa1..16dc4e8a2cd3 100644
--- a/arch/x86/include/uapi/asm/sigcontext.h
+++ b/arch/x86/include/uapi/asm/sigcontext.h
@@ -177,8 +177,23 @@ struct sigcontext {
 	__u64 rip;
 	__u64 eflags;		/* RFLAGS */
 	__u16 cs;
-	__u16 gs;
-	__u16 fs;
+
+	/*
+	 * Prior to 2.5.64 ("[PATCH] x86-64 updates for 2.5.64-bk3"),
+	 * Linux saved and restored fs and gs in these slots.  This
+	 * was counterproductive, as fsbase and gsbase were never
+	 * saved, so arch_prctl was presumably unreliable.
+	 *
+	 * If these slots are ever needed for any other purpose, there
+	 * is some risk that very old 64-bit binaries could get
+	 * confused.  I doubt that many such binaries still work,
+	 * though, since the same patch in 2.5.64 also removed the
+	 * 64-bit set_thread_area syscall, so it appears that there is
+	 * no TLS API that works in both pre- and post-2.5.64 kernels.
+	 */
+	__u16 __pad2;		/* Was gs. */
+	__u16 __pad1;		/* Was fs. */
+
 	__u16 ss;
 	__u64 err;
 	__u64 trapno;
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 9511eb7f17b0..691ab4ba5f12 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -155,8 +155,8 @@ int setup_sigcontext(struct sigcontext __user *sc, void __user *fpstate,
 #else /* !CONFIG_X86_32 */
 		put_user_ex(regs->flags, &sc->flags);
 		put_user_ex(regs->cs, &sc->cs);
-		put_user_ex(0, &sc->gs);
-		put_user_ex(0, &sc->fs);
+		put_user_ex(0, &sc->__pad2);
+		put_user_ex(0, &sc->__pad1);
 		put_user_ex(regs->ss, &sc->ss);
 #endif /* CONFIG_X86_32 */
 
-- 
2.3.0


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 0/2] x86_64: Sigcontext improvements
  2015-03-12 20:57 [PATCH v3 0/2] x86_64: Sigcontext improvements Andy Lutomirski
  2015-03-12 20:57 ` [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs Andy Lutomirski
  2015-03-12 20:57 ` [PATCH v3 2/2] x86_64,signal: Remove 'fs' and 'gs' from sigcontext Andy Lutomirski
@ 2015-03-13 15:31 ` Oleg Nesterov
  2 siblings, 0 replies; 39+ messages in thread
From: Oleg Nesterov @ 2015-03-13 15:31 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Ingo Molnar, Andi Kleen, H. Peter Anvin, Al Viro, x86,
	linux-kernel, Linus Torvalds, Borislav Petkov

On 03/12, Andy Lutomirski wrote:
>
> Patch 1 is IMO a bug fix.

And personally I agree.

I am still not sure about renames in include/uapi, perhaps someone can
ack comment this change...

Perhaps we could could even do

	-	__u16 __pad0;
	+	// comment to explain that ->ss is used starting from 4.x,
	+	// __pad0 is still here to not break userspace.
	+	union { __u16 __pad0; __u16 ss; }

But this is purely cosmetic, and I simply do not know if we should worry
about the potential compilation failure.

I believe both patches are good.

Reviewed-by: Oleg Nesterov <oleg@redhat.com>


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-12 20:57 ` [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs Andy Lutomirski
@ 2015-03-13 16:13   ` Borislav Petkov
  2015-03-16  8:11   ` Ingo Molnar
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 39+ messages in thread
From: Borislav Petkov @ 2015-03-13 16:13 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Ingo Molnar, Andi Kleen, H. Peter Anvin, Al Viro, x86,
	linux-kernel, Linus Torvalds, Oleg Nesterov, Andy Lutomirski

On Thu, Mar 12, 2015 at 01:57:51PM -0700, Andy Lutomirski wrote:
> From: Andy Lutomirski <luto@amacapital.net>
> 
> The comment in the signal code says that apps can save/restore other
> segments on their own.  It's true that apps can *save* SS on their
> own, but there's no way for apps to restore it: SYSCALL effectively
> resets SS to __USER_DS, so any value that user code tries to load
> into SS gets lost on entry to sigreturn.
> 
> This recycles two padding bytes in the segment selector area for SS.
> 
> While we're at it, we need a second change to make this useful.  If
> the signal we're delivering is caused by a bad SS value, saving that
> value isn't enough.  We need to remove that bad value from the regs
> before we try to deliver the signal.  Oddly, x32 already got this
> right.
> 
> I suspect that 64-bit programs that try to run 16-bit code and use
> signals will have a lot of trouble without this.
> 
> Signed-off-by: Andy Lutomirski <luto@amacapital.net>

Acked-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-12 20:57 ` [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs Andy Lutomirski
  2015-03-13 16:13   ` Borislav Petkov
@ 2015-03-16  8:11   ` Ingo Molnar
  2015-03-16 12:08   ` [tip:x86/asm] x86/signal/64: " tip-bot for Andy Lutomirski
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 39+ messages in thread
From: Ingo Molnar @ 2015-03-16  8:11 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andi Kleen, H. Peter Anvin, Al Viro, x86, linux-kernel,
	Linus Torvalds, Oleg Nesterov, Borislav Petkov, Andy Lutomirski


* Andy Lutomirski <luto@kernel.org> wrote:

> From: Andy Lutomirski <luto@amacapital.net>
> 
> The comment in the signal code says that apps can save/restore other 
> segments on their own.  It's true that apps can *save* SS on their 
> own, but there's no way for apps to restore it: SYSCALL effectively 
> resets SS to __USER_DS, so any value that user code tries to load 
> into SS gets lost on entry to sigreturn.
> 
> This recycles two padding bytes in the segment selector area for SS.
> 
> While we're at it, we need a second change to make this useful.  If 
> the signal we're delivering is caused by a bad SS value, saving that 
> value isn't enough.  We need to remove that bad value from the regs 
> before we try to deliver the signal.  Oddly, x32 already got this 
> right.

Note, it's the original i386 code that got this right, not the 
(relatively new) x32 code. I fixed this in the commit that I applied.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [tip:x86/asm] x86/signal/64: Fix SS handling for signals delivered to 64-bit programs
  2015-03-12 20:57 ` [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs Andy Lutomirski
  2015-03-13 16:13   ` Borislav Petkov
  2015-03-16  8:11   ` Ingo Molnar
@ 2015-03-16 12:08   ` tip-bot for Andy Lutomirski
  2015-03-17  8:44   ` tip-bot for Andy Lutomirski
  2015-03-18 17:19   ` [PATCH v3 1/2] x86_64,signal: " Andrey Wagin
  4 siblings, 0 replies; 39+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-03-16 12:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: oleg, hpa, viro, luto, torvalds, bp, linux-kernel, tglx, mingo

Commit-ID:  7a3d41d14d9bf9e1ceff241f22b2e03ab6336377
Gitweb:     http://git.kernel.org/tip/7a3d41d14d9bf9e1ceff241f22b2e03ab6336377
Author:     Andy Lutomirski <luto@amacapital.net>
AuthorDate: Thu, 12 Mar 2015 13:57:51 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 16 Mar 2015 09:14:51 +0100

x86/signal/64: Fix SS handling for signals delivered to 64-bit programs

The comment in the signal code says that apps can save/restore
other segments on their own.  It's true that apps can *save* SS
on their own, but there's no way for apps to restore it: SYSCALL
effectively resets SS to __USER_DS, so any value that user code
tries to load into SS gets lost on entry to sigreturn.

This recycles two padding bytes in the segment selector area for SS.

While we're at it, we need a second change to make this useful.

If the signal we're delivering is caused by a bad SS value,
saving that value isn't enough.  We need to remove that bad
value from the regs before we try to deliver the signal.  Oddly,
the i386 code already got this right.

I suspect that 64-bit programs that try to run 16-bit code and
use signals will have a lot of trouble without this.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/405594361340a2ec32f8e2b115c142df0e180d8e.1426193719.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/sigcontext.h      |  2 +-
 arch/x86/include/uapi/asm/sigcontext.h |  2 +-
 arch/x86/kernel/signal.c               | 22 +++++++++++++---------
 3 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/sigcontext.h b/arch/x86/include/asm/sigcontext.h
index 9dfce4e..f910cdc 100644
--- a/arch/x86/include/asm/sigcontext.h
+++ b/arch/x86/include/asm/sigcontext.h
@@ -59,7 +59,7 @@ struct sigcontext {
 	unsigned short cs;
 	unsigned short gs;
 	unsigned short fs;
-	unsigned short __pad0;
+	unsigned short ss;
 	unsigned long err;
 	unsigned long trapno;
 	unsigned long oldmask;
diff --git a/arch/x86/include/uapi/asm/sigcontext.h b/arch/x86/include/uapi/asm/sigcontext.h
index d8b9f90..076b11f 100644
--- a/arch/x86/include/uapi/asm/sigcontext.h
+++ b/arch/x86/include/uapi/asm/sigcontext.h
@@ -179,7 +179,7 @@ struct sigcontext {
 	__u16 cs;
 	__u16 gs;
 	__u16 fs;
-	__u16 __pad0;
+	__u16 ss;
 	__u64 err;
 	__u64 trapno;
 	__u64 oldmask;
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index e504246..e2f6061 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -94,15 +94,8 @@ int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc,
 		COPY(r15);
 #endif /* CONFIG_X86_64 */
 
-#ifdef CONFIG_X86_32
 		COPY_SEG_CPL3(cs);
 		COPY_SEG_CPL3(ss);
-#else /* !CONFIG_X86_32 */
-		/* Kernel saves and restores only the CS segment register on signals,
-		 * which is the bare minimum needed to allow mixed 32/64-bit code.
-		 * App's signal handler can save/restore other segments if needed. */
-		COPY_SEG_CPL3(cs);
-#endif /* CONFIG_X86_32 */
 
 		get_user_ex(tmpflags, &sc->flags);
 		regs->flags = (regs->flags & ~FIX_EFLAGS) | (tmpflags & FIX_EFLAGS);
@@ -164,6 +157,7 @@ int setup_sigcontext(struct sigcontext __user *sc, void __user *fpstate,
 		put_user_ex(regs->cs, &sc->cs);
 		put_user_ex(0, &sc->gs);
 		put_user_ex(0, &sc->fs);
+		put_user_ex(regs->ss, &sc->ss);
 #endif /* CONFIG_X86_32 */
 
 		put_user_ex(fpstate, &sc->fpstate);
@@ -457,9 +451,19 @@ static int __setup_rt_frame(int sig, struct ksignal *ksig,
 
 	regs->sp = (unsigned long)frame;
 
-	/* Set up the CS register to run signal handlers in 64-bit mode,
-	   even if the handler happens to be interrupting 32-bit code. */
+	/*
+	 * Set up the CS and SS registers to run signal handlers in
+	 * 64-bit mode, even if the handler happens to be interrupting
+	 * 32-bit or 16-bit code.
+	 *
+	 * SS is subtle.  In 64-bit mode, we don't need any particular
+	 * SS descriptor, but we do need SS to be valid.  It's possible
+	 * that the old SS is entirely bogus -- this can happen if the
+	 * signal we're trying to deliver is #GP or #SS caused by a bad
+	 * SS value.
+	 */
 	regs->cs = __USER_CS;
+	regs->ss = __USER_DS;
 
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [tip:x86/asm] x86/signal/64: Remove 'fs' and 'gs' from sigcontext
  2015-03-12 20:57 ` [PATCH v3 2/2] x86_64,signal: Remove 'fs' and 'gs' from sigcontext Andy Lutomirski
@ 2015-03-16 12:08   ` tip-bot for Andy Lutomirski
  2015-03-17  8:44   ` tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 39+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-03-16 12:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, torvalds, linux-kernel, hpa, bp, oleg, tglx, viro, luto

Commit-ID:  e82f114d61effc06385f61bce9047a8f13d28337
Gitweb:     http://git.kernel.org/tip/e82f114d61effc06385f61bce9047a8f13d28337
Author:     Andy Lutomirski <luto@amacapital.net>
AuthorDate: Thu, 12 Mar 2015 13:57:52 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 16 Mar 2015 09:14:51 +0100

x86/signal/64: Remove 'fs' and 'gs' from sigcontext

As far as I can tell, these fields have been set to zero on save
and ignored on restore since Linux was imported into git.
Rename them '__pad1' and '__pad2' to avoid confusion.  This may
also allow us to recycle them some day.

This also adds a comment clarifying the history of those fields.

I'm intentionally avoiding calling either of them '__pad0': the
field formerly known as '__pad0' is now 'ss'.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/844f8490e938780c03355be4c9b69eb4c494bf4e.1426193719.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/sigcontext.h      |  4 ++--
 arch/x86/include/uapi/asm/sigcontext.h | 19 +++++++++++++++++--
 arch/x86/kernel/signal.c               |  4 ++--
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/sigcontext.h b/arch/x86/include/asm/sigcontext.h
index f910cdc..6fe6b18 100644
--- a/arch/x86/include/asm/sigcontext.h
+++ b/arch/x86/include/asm/sigcontext.h
@@ -57,8 +57,8 @@ struct sigcontext {
 	unsigned long ip;
 	unsigned long flags;
 	unsigned short cs;
-	unsigned short gs;
-	unsigned short fs;
+	unsigned short __pad2;	/* Was called gs, but was always zero. */
+	unsigned short __pad1;	/* Was called fs, but was always zero. */
 	unsigned short ss;
 	unsigned long err;
 	unsigned long trapno;
diff --git a/arch/x86/include/uapi/asm/sigcontext.h b/arch/x86/include/uapi/asm/sigcontext.h
index 076b11f..16dc4e8 100644
--- a/arch/x86/include/uapi/asm/sigcontext.h
+++ b/arch/x86/include/uapi/asm/sigcontext.h
@@ -177,8 +177,23 @@ struct sigcontext {
 	__u64 rip;
 	__u64 eflags;		/* RFLAGS */
 	__u16 cs;
-	__u16 gs;
-	__u16 fs;
+
+	/*
+	 * Prior to 2.5.64 ("[PATCH] x86-64 updates for 2.5.64-bk3"),
+	 * Linux saved and restored fs and gs in these slots.  This
+	 * was counterproductive, as fsbase and gsbase were never
+	 * saved, so arch_prctl was presumably unreliable.
+	 *
+	 * If these slots are ever needed for any other purpose, there
+	 * is some risk that very old 64-bit binaries could get
+	 * confused.  I doubt that many such binaries still work,
+	 * though, since the same patch in 2.5.64 also removed the
+	 * 64-bit set_thread_area syscall, so it appears that there is
+	 * no TLS API that works in both pre- and post-2.5.64 kernels.
+	 */
+	__u16 __pad2;		/* Was gs. */
+	__u16 __pad1;		/* Was fs. */
+
 	__u16 ss;
 	__u64 err;
 	__u64 trapno;
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index e2f6061..edcb862 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -155,8 +155,8 @@ int setup_sigcontext(struct sigcontext __user *sc, void __user *fpstate,
 #else /* !CONFIG_X86_32 */
 		put_user_ex(regs->flags, &sc->flags);
 		put_user_ex(regs->cs, &sc->cs);
-		put_user_ex(0, &sc->gs);
-		put_user_ex(0, &sc->fs);
+		put_user_ex(0, &sc->__pad2);
+		put_user_ex(0, &sc->__pad1);
 		put_user_ex(regs->ss, &sc->ss);
 #endif /* CONFIG_X86_32 */
 

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [tip:x86/asm] x86/signal/64: Fix SS handling for signals delivered to 64-bit programs
  2015-03-12 20:57 ` [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs Andy Lutomirski
                     ` (2 preceding siblings ...)
  2015-03-16 12:08   ` [tip:x86/asm] x86/signal/64: " tip-bot for Andy Lutomirski
@ 2015-03-17  8:44   ` tip-bot for Andy Lutomirski
  2015-03-18 17:19   ` [PATCH v3 1/2] x86_64,signal: " Andrey Wagin
  4 siblings, 0 replies; 39+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-03-17  8:44 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: bp, hpa, tglx, torvalds, luto, linux-kernel, oleg, mingo, viro

Commit-ID:  c6f2062935c8fcb31235799eaee8bcd5b649936b
Gitweb:     http://git.kernel.org/tip/c6f2062935c8fcb31235799eaee8bcd5b649936b
Author:     Andy Lutomirski <luto@amacapital.net>
AuthorDate: Thu, 12 Mar 2015 13:57:51 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 17 Mar 2015 09:25:25 +0100

x86/signal/64: Fix SS handling for signals delivered to 64-bit programs

The comment in the signal code says that apps can save/restore
other segments on their own.  It's true that apps can *save* SS
on their own, but there's no way for apps to restore it: SYSCALL
effectively resets SS to __USER_DS, so any value that user code
tries to load into SS gets lost on entry to sigreturn.

This recycles two padding bytes in the segment selector area for SS.

While we're at it, we need a second change to make this useful.

If the signal we're delivering is caused by a bad SS value,
saving that value isn't enough.  We need to remove that bad
value from the regs before we try to deliver the signal.  Oddly,
the i386 code already got this right.

I suspect that 64-bit programs that try to run 16-bit code and
use signals will have a lot of trouble without this.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/405594361340a2ec32f8e2b115c142df0e180d8e.1426193719.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/sigcontext.h      |  2 +-
 arch/x86/include/uapi/asm/sigcontext.h |  2 +-
 arch/x86/kernel/signal.c               | 22 +++++++++++++---------
 3 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/sigcontext.h b/arch/x86/include/asm/sigcontext.h
index 9dfce4e..f910cdc 100644
--- a/arch/x86/include/asm/sigcontext.h
+++ b/arch/x86/include/asm/sigcontext.h
@@ -59,7 +59,7 @@ struct sigcontext {
 	unsigned short cs;
 	unsigned short gs;
 	unsigned short fs;
-	unsigned short __pad0;
+	unsigned short ss;
 	unsigned long err;
 	unsigned long trapno;
 	unsigned long oldmask;
diff --git a/arch/x86/include/uapi/asm/sigcontext.h b/arch/x86/include/uapi/asm/sigcontext.h
index d8b9f90..076b11f 100644
--- a/arch/x86/include/uapi/asm/sigcontext.h
+++ b/arch/x86/include/uapi/asm/sigcontext.h
@@ -179,7 +179,7 @@ struct sigcontext {
 	__u16 cs;
 	__u16 gs;
 	__u16 fs;
-	__u16 __pad0;
+	__u16 ss;
 	__u64 err;
 	__u64 trapno;
 	__u64 oldmask;
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index e504246..e2f6061 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -94,15 +94,8 @@ int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc,
 		COPY(r15);
 #endif /* CONFIG_X86_64 */
 
-#ifdef CONFIG_X86_32
 		COPY_SEG_CPL3(cs);
 		COPY_SEG_CPL3(ss);
-#else /* !CONFIG_X86_32 */
-		/* Kernel saves and restores only the CS segment register on signals,
-		 * which is the bare minimum needed to allow mixed 32/64-bit code.
-		 * App's signal handler can save/restore other segments if needed. */
-		COPY_SEG_CPL3(cs);
-#endif /* CONFIG_X86_32 */
 
 		get_user_ex(tmpflags, &sc->flags);
 		regs->flags = (regs->flags & ~FIX_EFLAGS) | (tmpflags & FIX_EFLAGS);
@@ -164,6 +157,7 @@ int setup_sigcontext(struct sigcontext __user *sc, void __user *fpstate,
 		put_user_ex(regs->cs, &sc->cs);
 		put_user_ex(0, &sc->gs);
 		put_user_ex(0, &sc->fs);
+		put_user_ex(regs->ss, &sc->ss);
 #endif /* CONFIG_X86_32 */
 
 		put_user_ex(fpstate, &sc->fpstate);
@@ -457,9 +451,19 @@ static int __setup_rt_frame(int sig, struct ksignal *ksig,
 
 	regs->sp = (unsigned long)frame;
 
-	/* Set up the CS register to run signal handlers in 64-bit mode,
-	   even if the handler happens to be interrupting 32-bit code. */
+	/*
+	 * Set up the CS and SS registers to run signal handlers in
+	 * 64-bit mode, even if the handler happens to be interrupting
+	 * 32-bit or 16-bit code.
+	 *
+	 * SS is subtle.  In 64-bit mode, we don't need any particular
+	 * SS descriptor, but we do need SS to be valid.  It's possible
+	 * that the old SS is entirely bogus -- this can happen if the
+	 * signal we're trying to deliver is #GP or #SS caused by a bad
+	 * SS value.
+	 */
 	regs->cs = __USER_CS;
+	regs->ss = __USER_DS;
 
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [tip:x86/asm] x86/signal/64: Remove 'fs' and 'gs' from sigcontext
  2015-03-12 20:57 ` [PATCH v3 2/2] x86_64,signal: Remove 'fs' and 'gs' from sigcontext Andy Lutomirski
  2015-03-16 12:08   ` [tip:x86/asm] x86/signal/64: " tip-bot for Andy Lutomirski
@ 2015-03-17  8:44   ` tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 39+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-03-17  8:44 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: bp, luto, hpa, oleg, linux-kernel, torvalds, mingo, viro, tglx

Commit-ID:  9a036b93a344235b7899401d04e97c34f3a2554c
Gitweb:     http://git.kernel.org/tip/9a036b93a344235b7899401d04e97c34f3a2554c
Author:     Andy Lutomirski <luto@amacapital.net>
AuthorDate: Thu, 12 Mar 2015 13:57:52 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 17 Mar 2015 09:25:26 +0100

x86/signal/64: Remove 'fs' and 'gs' from sigcontext

As far as I can tell, these fields have been set to zero on save
and ignored on restore since Linux was imported into git.
Rename them '__pad1' and '__pad2' to avoid confusion.  This may
also allow us to recycle them some day.

This also adds a comment clarifying the history of those fields.

I'm intentionally avoiding calling either of them '__pad0': the
field formerly known as '__pad0' is now 'ss'.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/844f8490e938780c03355be4c9b69eb4c494bf4e.1426193719.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/sigcontext.h      |  4 ++--
 arch/x86/include/uapi/asm/sigcontext.h | 19 +++++++++++++++++--
 arch/x86/kernel/signal.c               |  4 ++--
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/sigcontext.h b/arch/x86/include/asm/sigcontext.h
index f910cdc..6fe6b18 100644
--- a/arch/x86/include/asm/sigcontext.h
+++ b/arch/x86/include/asm/sigcontext.h
@@ -57,8 +57,8 @@ struct sigcontext {
 	unsigned long ip;
 	unsigned long flags;
 	unsigned short cs;
-	unsigned short gs;
-	unsigned short fs;
+	unsigned short __pad2;	/* Was called gs, but was always zero. */
+	unsigned short __pad1;	/* Was called fs, but was always zero. */
 	unsigned short ss;
 	unsigned long err;
 	unsigned long trapno;
diff --git a/arch/x86/include/uapi/asm/sigcontext.h b/arch/x86/include/uapi/asm/sigcontext.h
index 076b11f..16dc4e8 100644
--- a/arch/x86/include/uapi/asm/sigcontext.h
+++ b/arch/x86/include/uapi/asm/sigcontext.h
@@ -177,8 +177,23 @@ struct sigcontext {
 	__u64 rip;
 	__u64 eflags;		/* RFLAGS */
 	__u16 cs;
-	__u16 gs;
-	__u16 fs;
+
+	/*
+	 * Prior to 2.5.64 ("[PATCH] x86-64 updates for 2.5.64-bk3"),
+	 * Linux saved and restored fs and gs in these slots.  This
+	 * was counterproductive, as fsbase and gsbase were never
+	 * saved, so arch_prctl was presumably unreliable.
+	 *
+	 * If these slots are ever needed for any other purpose, there
+	 * is some risk that very old 64-bit binaries could get
+	 * confused.  I doubt that many such binaries still work,
+	 * though, since the same patch in 2.5.64 also removed the
+	 * 64-bit set_thread_area syscall, so it appears that there is
+	 * no TLS API that works in both pre- and post-2.5.64 kernels.
+	 */
+	__u16 __pad2;		/* Was gs. */
+	__u16 __pad1;		/* Was fs. */
+
 	__u16 ss;
 	__u64 err;
 	__u64 trapno;
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index e2f6061..edcb862 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -155,8 +155,8 @@ int setup_sigcontext(struct sigcontext __user *sc, void __user *fpstate,
 #else /* !CONFIG_X86_32 */
 		put_user_ex(regs->flags, &sc->flags);
 		put_user_ex(regs->cs, &sc->cs);
-		put_user_ex(0, &sc->gs);
-		put_user_ex(0, &sc->fs);
+		put_user_ex(0, &sc->__pad2);
+		put_user_ex(0, &sc->__pad1);
 		put_user_ex(regs->ss, &sc->ss);
 #endif /* CONFIG_X86_32 */
 

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-12 20:57 ` [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs Andy Lutomirski
                     ` (3 preceding siblings ...)
  2015-03-17  8:44   ` tip-bot for Andy Lutomirski
@ 2015-03-18 17:19   ` Andrey Wagin
  2015-03-18 17:48     ` Oleg Nesterov
  4 siblings, 1 reply; 39+ messages in thread
From: Andrey Wagin @ 2015-03-18 17:19 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Ingo Molnar, Andi Kleen, H. Peter Anvin, Al Viro, X86 ML, LKML,
	Linus Torvalds, Oleg Nesterov, Borislav Petkov, Andy Lutomirski,
	Cyrill Gorcunov, Pavel Emelyanov

2015-03-12 23:57 GMT+03:00 Andy Lutomirski <luto@kernel.org>:
> From: Andy Lutomirski <luto@amacapital.net>
>
> The comment in the signal code says that apps can save/restore other
> segments on their own.  It's true that apps can *save* SS on their
> own, but there's no way for apps to restore it: SYSCALL effectively
> resets SS to __USER_DS, so any value that user code tries to load
> into SS gets lost on entry to sigreturn.
>
> This recycles two padding bytes in the segment selector area for SS.
>
> While we're at it, we need a second change to make this useful.  If
> the signal we're delivering is caused by a bad SS value, saving that
> value isn't enough.  We need to remove that bad value from the regs
> before we try to deliver the signal.  Oddly, x32 already got this
> right.
>
> I suspect that 64-bit programs that try to run 16-bit code and use
> signals will have a lot of trouble without this.

This commit breaks CRIU. I don't have any details yet. I'm going to
investigate this issue and provide more details tomorrow.

[root@avagin-fc19-cr criu]# setsid sleep 1000 &
[1] 1225
[root@avagin-fc19-cr criu]# ps -C sleep
  PID TTY          TIME CMD
 1226 ?        00:00:00 sleep
[root@avagin-fc19-cr criu]# ./criu dump -t 1226 -D dump --shell-job
[root@avagin-fc19-cr criu]# ./criu restore -D dump --shell-job
Error (parasite-syscall.c:923): Task is in unexpected state: b7f (SIGSEGV)


>
> Signed-off-by: Andy Lutomirski <luto@amacapital.net>
> ---
>  arch/x86/include/asm/sigcontext.h      |  2 +-
>  arch/x86/include/uapi/asm/sigcontext.h |  2 +-
>  arch/x86/kernel/signal.c               | 22 +++++++++++++---------
>  3 files changed, 15 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/include/asm/sigcontext.h b/arch/x86/include/asm/sigcontext.h
> index 9dfce4e0417d..f910cdcb71fd 100644
> --- a/arch/x86/include/asm/sigcontext.h
> +++ b/arch/x86/include/asm/sigcontext.h
> @@ -59,7 +59,7 @@ struct sigcontext {
>         unsigned short cs;
>         unsigned short gs;
>         unsigned short fs;
> -       unsigned short __pad0;
> +       unsigned short ss;
>         unsigned long err;
>         unsigned long trapno;
>         unsigned long oldmask;
> diff --git a/arch/x86/include/uapi/asm/sigcontext.h b/arch/x86/include/uapi/asm/sigcontext.h
> index d8b9f9081e86..076b11fd6fa1 100644
> --- a/arch/x86/include/uapi/asm/sigcontext.h
> +++ b/arch/x86/include/uapi/asm/sigcontext.h
> @@ -179,7 +179,7 @@ struct sigcontext {
>         __u16 cs;
>         __u16 gs;
>         __u16 fs;
> -       __u16 __pad0;
> +       __u16 ss;
>         __u64 err;
>         __u64 trapno;
>         __u64 oldmask;
> diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
> index ed37a768d0fc..9511eb7f17b0 100644
> --- a/arch/x86/kernel/signal.c
> +++ b/arch/x86/kernel/signal.c
> @@ -94,15 +94,8 @@ int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc,
>                 COPY(r15);
>  #endif /* CONFIG_X86_64 */
>
> -#ifdef CONFIG_X86_32
>                 COPY_SEG_CPL3(cs);
>                 COPY_SEG_CPL3(ss);
> -#else /* !CONFIG_X86_32 */
> -               /* Kernel saves and restores only the CS segment register on signals,
> -                * which is the bare minimum needed to allow mixed 32/64-bit code.
> -                * App's signal handler can save/restore other segments if needed. */
> -               COPY_SEG_CPL3(cs);
> -#endif /* CONFIG_X86_32 */
>
>                 get_user_ex(tmpflags, &sc->flags);
>                 regs->flags = (regs->flags & ~FIX_EFLAGS) | (tmpflags & FIX_EFLAGS);
> @@ -164,6 +157,7 @@ int setup_sigcontext(struct sigcontext __user *sc, void __user *fpstate,
>                 put_user_ex(regs->cs, &sc->cs);
>                 put_user_ex(0, &sc->gs);
>                 put_user_ex(0, &sc->fs);
> +               put_user_ex(regs->ss, &sc->ss);
>  #endif /* CONFIG_X86_32 */
>
>                 put_user_ex(fpstate, &sc->fpstate);
> @@ -457,9 +451,19 @@ static int __setup_rt_frame(int sig, struct ksignal *ksig,
>
>         regs->sp = (unsigned long)frame;
>
> -       /* Set up the CS register to run signal handlers in 64-bit mode,
> -          even if the handler happens to be interrupting 32-bit code. */
> +       /*
> +        * Set up the CS and SS registers to run signal handlers in
> +        * 64-bit mode, even if the handler happens to be interrupting
> +        * 32-bit or 16-bit code.
> +        *
> +        * SS is subtle.  In 64-bit mode, we don't need any particular
> +        * SS descriptor, but we do need SS to be valid.  It's possible
> +        * that the old SS is entirely bogus -- this can happen if the
> +        * signal we're trying to deliver is #GP or #SS caused by a bad
> +        * SS value.
> +        */
>         regs->cs = __USER_CS;
> +       regs->ss = __USER_DS;
>
>         return 0;
>  }
> --
> 2.3.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-18 17:19   ` [PATCH v3 1/2] x86_64,signal: " Andrey Wagin
@ 2015-03-18 17:48     ` Oleg Nesterov
  2015-03-18 18:06       ` Andy Lutomirski
  2015-03-18 18:13       ` Cyrill Gorcunov
  0 siblings, 2 replies; 39+ messages in thread
From: Oleg Nesterov @ 2015-03-18 17:48 UTC (permalink / raw)
  To: Andrey Wagin
  Cc: Andy Lutomirski, Ingo Molnar, Andi Kleen, H. Peter Anvin,
	Al Viro, X86 ML, LKML, Linus Torvalds, Borislav Petkov,
	Andy Lutomirski, Cyrill Gorcunov, Pavel Emelyanov

On 03/18, Andrey Wagin wrote:
>
> This commit breaks CRIU. I don't have any details yet. I'm going to
> investigate this issue and provide more details tomorrow.
>
> [root@avagin-fc19-cr criu]# setsid sleep 1000 &
> [1] 1225
> [root@avagin-fc19-cr criu]# ps -C sleep
>   PID TTY          TIME CMD
>  1226 ?        00:00:00 sleep
> [root@avagin-fc19-cr criu]# ./criu dump -t 1226 -D dump --shell-job
> [root@avagin-fc19-cr criu]# ./criu restore -D dump --shell-job
> Error (parasite-syscall.c:923): Task is in unexpected state: b7f (SIGSEGV)

This is funny. Because currenty I am looking into criu sources for quite
different reason (and I HATE this reason ;)

Shot in a dark afer a quick grep: restore_gpregs() should initialize ->ss?

perhaps something like below... obviously uncompiled/untested.

And my grep can't find the definition of UserX86RegsEntry in crtools...
Perhaps the change below needs CPREG1(ss, anothername).

Seriously, where is UserX86RegsEntry?

Oleg.


--- a/arch/x86/crtools.c
+++ b/arch/x86/crtools.c
@@ -475,6 +475,7 @@ int restore_gpregs(struct rt_sigframe *f, UserX86RegsEntry *r)
 	CPREG2(rip, ip);
 	CPREG2(eflags, flags);
 	CPREG1(cs);
+	CPREG1(ss);
 	CPREG1(gs);
 	CPREG1(fs);
 
diff --git a/arch/x86/include/asm/restorer.h b/arch/x86/include/asm/restorer.h
index 70199fb..c04fb94 100644
--- a/arch/x86/include/asm/restorer.h
+++ b/arch/x86/include/asm/restorer.h
@@ -53,7 +53,7 @@ struct rt_sigcontext {
 	unsigned short			cs;
 	unsigned short			gs;
 	unsigned short			fs;
-	unsigned short			__pad0;
+	unsigned short			ss;
 	unsigned long			err;
 	unsigned long			trapno;
 	unsigned long			oldmask;


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-18 17:48     ` Oleg Nesterov
@ 2015-03-18 18:06       ` Andy Lutomirski
  2015-03-18 18:17         ` Cyrill Gorcunov
  2015-03-18 18:13       ` Cyrill Gorcunov
  1 sibling, 1 reply; 39+ messages in thread
From: Andy Lutomirski @ 2015-03-18 18:06 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrey Wagin, Andy Lutomirski, Ingo Molnar, Andi Kleen,
	H. Peter Anvin, Al Viro, X86 ML, LKML, Linus Torvalds,
	Borislav Petkov, Cyrill Gorcunov, Pavel Emelyanov

On Wed, Mar 18, 2015 at 10:48 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/18, Andrey Wagin wrote:
>>
>> This commit breaks CRIU. I don't have any details yet. I'm going to
>> investigate this issue and provide more details tomorrow.
>>
>> [root@avagin-fc19-cr criu]# setsid sleep 1000 &
>> [1] 1225
>> [root@avagin-fc19-cr criu]# ps -C sleep
>>   PID TTY          TIME CMD
>>  1226 ?        00:00:00 sleep
>> [root@avagin-fc19-cr criu]# ./criu dump -t 1226 -D dump --shell-job
>> [root@avagin-fc19-cr criu]# ./criu restore -D dump --shell-job
>> Error (parasite-syscall.c:923): Task is in unexpected state: b7f (SIGSEGV)
>
> This is funny. Because currenty I am looking into criu sources for quite
> different reason (and I HATE this reason ;)
>
> Shot in a dark afer a quick grep: restore_gpregs() should initialize ->ss?
>
> perhaps something like below... obviously uncompiled/untested.
>
> And my grep can't find the definition of UserX86RegsEntry in crtools...
> Perhaps the change below needs CPREG1(ss, anothername).
>
> Seriously, where is UserX86RegsEntry?
>
> Oleg.
>
>
> --- a/arch/x86/crtools.c
> +++ b/arch/x86/crtools.c
> @@ -475,6 +475,7 @@ int restore_gpregs(struct rt_sigframe *f, UserX86RegsEntry *r)
>         CPREG2(rip, ip);
>         CPREG2(eflags, flags);
>         CPREG1(cs);
> +       CPREG1(ss);
>         CPREG1(gs);
>         CPREG1(fs);

Huh?  Is CRIU actually trying to build an entire sigcontext from
scratch here?  I don't see how this can reliably work across kernel
versions or CPU versions.

Also, what's up with CPREG1(gs) and CPREG1(fs)?  I assume that's
redundant, because that hasn't worked for many years, but CRIU works,
so there must be correct code somewhere to restore those regs.

--Andy

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-18 17:48     ` Oleg Nesterov
  2015-03-18 18:06       ` Andy Lutomirski
@ 2015-03-18 18:13       ` Cyrill Gorcunov
  2015-03-18 18:31         ` Oleg Nesterov
  1 sibling, 1 reply; 39+ messages in thread
From: Cyrill Gorcunov @ 2015-03-18 18:13 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrey Wagin, Andy Lutomirski, Ingo Molnar, Andi Kleen,
	H. Peter Anvin, Al Viro, X86 ML, LKML, Linus Torvalds,
	Borislav Petkov, Andy Lutomirski, Pavel Emelyanov

On Wed, Mar 18, 2015 at 06:48:43PM +0100, Oleg Nesterov wrote:
> On 03/18, Andrey Wagin wrote:
> >
> > This commit breaks CRIU. I don't have any details yet. I'm going to
> > investigate this issue and provide more details tomorrow.
> >
> > [root@avagin-fc19-cr criu]# setsid sleep 1000 &
> > [1] 1225
> > [root@avagin-fc19-cr criu]# ps -C sleep
> >   PID TTY          TIME CMD
> >  1226 ?        00:00:00 sleep
> > [root@avagin-fc19-cr criu]# ./criu dump -t 1226 -D dump --shell-job
> > [root@avagin-fc19-cr criu]# ./criu restore -D dump --shell-job
> > Error (parasite-syscall.c:923): Task is in unexpected state: b7f (SIGSEGV)
> 
> This is funny. Because currenty I am looking into criu sources for quite
> different reason (and I HATE this reason ;)
> 
> Shot in a dark afer a quick grep: restore_gpregs() should initialize ->ss?

It hasn't been needed earlier, if this would help it means abi is broken, no? :)
Otherwise I don't understand what's happening.

> perhaps something like below... obviously uncompiled/untested.
> 
> And my grep can't find the definition of UserX86RegsEntry in crtools...
> Perhaps the change below needs CPREG1(ss, anothername).
> 
> Seriously, where is UserX86RegsEntry?

It's in protobif/core-x86.proto, welcome to protobuf hell.

> 
> Oleg.
> 
> 
> --- a/arch/x86/crtools.c
> +++ b/arch/x86/crtools.c
> @@ -475,6 +475,7 @@ int restore_gpregs(struct rt_sigframe *f, UserX86RegsEntry *r)
>  	CPREG2(rip, ip);
>  	CPREG2(eflags, flags);
>  	CPREG1(cs);
> +	CPREG1(ss);
>  	CPREG1(gs);
>  	CPREG1(fs);
>  
> diff --git a/arch/x86/include/asm/restorer.h b/arch/x86/include/asm/restorer.h
> index 70199fb..c04fb94 100644
> --- a/arch/x86/include/asm/restorer.h
> +++ b/arch/x86/include/asm/restorer.h
> @@ -53,7 +53,7 @@ struct rt_sigcontext {
>  	unsigned short			cs;
>  	unsigned short			gs;
>  	unsigned short			fs;
> -	unsigned short			__pad0;
> +	unsigned short			ss;
>  	unsigned long			err;
>  	unsigned long			trapno;
>  	unsigned long			oldmask;
> 

	Cyrill

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-18 18:06       ` Andy Lutomirski
@ 2015-03-18 18:17         ` Cyrill Gorcunov
  2015-03-18 18:20           ` Andy Lutomirski
  2015-03-18 18:25           ` Oleg Nesterov
  0 siblings, 2 replies; 39+ messages in thread
From: Cyrill Gorcunov @ 2015-03-18 18:17 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Oleg Nesterov, Andrey Wagin, Andy Lutomirski, Ingo Molnar,
	Andi Kleen, H. Peter Anvin, Al Viro, X86 ML, LKML,
	Linus Torvalds, Borislav Petkov, Pavel Emelyanov

On Wed, Mar 18, 2015 at 11:06:00AM -0700, Andy Lutomirski wrote:
> > --- a/arch/x86/crtools.c
> > +++ b/arch/x86/crtools.c
> > @@ -475,6 +475,7 @@ int restore_gpregs(struct rt_sigframe *f, UserX86RegsEntry *r)
> >         CPREG2(rip, ip);
> >         CPREG2(eflags, flags);
> >         CPREG1(cs);
> > +       CPREG1(ss);
> >         CPREG1(gs);
> >         CPREG1(fs);
> 
> Huh?  Is CRIU actually trying to build an entire sigcontext from
> scratch here?  I don't see how this can reliably work across kernel
> versions or CPU versions.

Yes we are. And why it can't work reliably? As to CPU -- we're
testing that cpu features saved in image should match ones
provided by the kernel runtime, ie on the machine where we're
restoring.

> Also, what's up with CPREG1(gs) and CPREG1(fs)?  I assume that's
> redundant, because that hasn't worked for many years, but CRIU works,
> so there must be correct code somewhere to restore those regs.

Basically the initial registers values are fetched with ptrace
when program is been dumped, then on restore we copy them back
into sigcontext. That said we don't mangle fs/gs anyhow simply
restore the values back from dump.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-18 18:17         ` Cyrill Gorcunov
@ 2015-03-18 18:20           ` Andy Lutomirski
  2015-03-18 18:45             ` Cyrill Gorcunov
  2015-03-18 18:25           ` Oleg Nesterov
  1 sibling, 1 reply; 39+ messages in thread
From: Andy Lutomirski @ 2015-03-18 18:20 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Oleg Nesterov, Andrey Wagin, Andy Lutomirski, Ingo Molnar,
	Andi Kleen, H. Peter Anvin, Al Viro, X86 ML, LKML,
	Linus Torvalds, Borislav Petkov, Pavel Emelyanov

On Wed, Mar 18, 2015 at 11:17 AM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> On Wed, Mar 18, 2015 at 11:06:00AM -0700, Andy Lutomirski wrote:
>> > --- a/arch/x86/crtools.c
>> > +++ b/arch/x86/crtools.c
>> > @@ -475,6 +475,7 @@ int restore_gpregs(struct rt_sigframe *f, UserX86RegsEntry *r)
>> >         CPREG2(rip, ip);
>> >         CPREG2(eflags, flags);
>> >         CPREG1(cs);
>> > +       CPREG1(ss);
>> >         CPREG1(gs);
>> >         CPREG1(fs);
>>
>> Huh?  Is CRIU actually trying to build an entire sigcontext from
>> scratch here?  I don't see how this can reliably work across kernel
>> versions or CPU versions.
>
> Yes we are. And why it can't work reliably? As to CPU -- we're
> testing that cpu features saved in image should match ones
> provided by the kernel runtime, ie on the machine where we're
> restoring.
>
>> Also, what's up with CPREG1(gs) and CPREG1(fs)?  I assume that's
>> redundant, because that hasn't worked for many years, but CRIU works,
>> so there must be correct code somewhere to restore those regs.
>
> Basically the initial registers values are fetched with ptrace
> when program is been dumped, then on restore we copy them back
> into sigcontext. That said we don't mangle fs/gs anyhow simply
> restore the values back from dump.

Wouldn't it be a little safer to have the kernel make you a sane
sigcontext by raising a signal and catching it rather than writing a
new one from scratch?  Grr, maybe the kernel should version its
sigcontext structures.

Getting fs from ptrace and restoring it to sigcontext is useless --
ptrace handles fs correctly, but that sigcontext field is just
padding.

Is CRIU at least reliably putting zero in the ss field?  If so, we
could add a special case to translate zero to __USER_DS on restore.
(And then I'll update my test case, and we'll have to document it,
etc.)

--Andy

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-18 18:17         ` Cyrill Gorcunov
  2015-03-18 18:20           ` Andy Lutomirski
@ 2015-03-18 18:25           ` Oleg Nesterov
  2015-03-18 18:32             ` Andy Lutomirski
  2015-03-18 19:13             ` Cyrill Gorcunov
  1 sibling, 2 replies; 39+ messages in thread
From: Oleg Nesterov @ 2015-03-18 18:25 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Andy Lutomirski, Andrey Wagin, Andy Lutomirski, Ingo Molnar,
	Andi Kleen, H. Peter Anvin, Al Viro, X86 ML, LKML,
	Linus Torvalds, Borislav Petkov, Pavel Emelyanov

On 03/18, Cyrill Gorcunov wrote:
>
> On Wed, Mar 18, 2015 at 11:06:00AM -0700, Andy Lutomirski wrote:
> > > --- a/arch/x86/crtools.c
> > > +++ b/arch/x86/crtools.c
> > > @@ -475,6 +475,7 @@ int restore_gpregs(struct rt_sigframe *f, UserX86RegsEntry *r)
> > >         CPREG2(rip, ip);
> > >         CPREG2(eflags, flags);
> > >         CPREG1(cs);
> > > +       CPREG1(ss);
> > >         CPREG1(gs);
> > >         CPREG1(fs);
> >
> > Huh?  Is CRIU actually trying to build an entire sigcontext from
> > scratch here?  I don't see how this can reliably work across kernel
> > versions or CPU versions.
>
> Yes we are. And why it can't work reliably? As to CPU -- we're
> testing that cpu features saved in image should match ones
> provided by the kernel runtime, ie on the machine where we're
> restoring.

But, say, __USER_CS can be changed in kernel, and nobody should notice this.

But in this case "restore on another machine" or "restore after kernel
upgrade" can fail.

So probably restore_gpregs() should only change the general-purpose regs,
as its name suggests.

Oleg.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-18 18:13       ` Cyrill Gorcunov
@ 2015-03-18 18:31         ` Oleg Nesterov
  2015-03-18 18:50           ` Cyrill Gorcunov
  0 siblings, 1 reply; 39+ messages in thread
From: Oleg Nesterov @ 2015-03-18 18:31 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Andrey Wagin, Andy Lutomirski, Ingo Molnar, Andi Kleen,
	H. Peter Anvin, Al Viro, X86 ML, LKML, Linus Torvalds,
	Borislav Petkov, Andy Lutomirski, Pavel Emelyanov

On 03/18, Cyrill Gorcunov wrote:
>
> On Wed, Mar 18, 2015 at 06:48:43PM +0100, Oleg Nesterov wrote:
> >
> > Shot in a dark afer a quick grep: restore_gpregs() should initialize ->ss?
>
> It hasn't been needed earlier, if this would help it means abi is broken, no? :)
> Otherwise I don't understand what's happening.

until this commit the kernel simply forgot to restore ->ss in sigreturn().

after this commit, if your rt_sigframe has garbage in ->ss then SIGSEGV
is clear.

> > perhaps something like below... obviously uncompiled/untested.
> >
> > And my grep can't find the definition of UserX86RegsEntry in crtools...
> > Perhaps the change below needs CPREG1(ss, anothername).
> >
> > Seriously, where is UserX86RegsEntry?
>
> It's in protobif/core-x86.proto, welcome to protobuf hell.

Still can't find UserX86RegsEntry... OK, perhaps it comes from
user_x86_regs_entry in protobuf/core-x86.proto. Then I guess the patch I sent
can be compiled at least ;)

>
> >
> > Oleg.
> >
> >
> > --- a/arch/x86/crtools.c
> > +++ b/arch/x86/crtools.c
> > @@ -475,6 +475,7 @@ int restore_gpregs(struct rt_sigframe *f, UserX86RegsEntry *r)
> >  	CPREG2(rip, ip);
> >  	CPREG2(eflags, flags);
> >  	CPREG1(cs);
> > +	CPREG1(ss);
> >  	CPREG1(gs);
> >  	CPREG1(fs);
> >
> > diff --git a/arch/x86/include/asm/restorer.h b/arch/x86/include/asm/restorer.h
> > index 70199fb..c04fb94 100644
> > --- a/arch/x86/include/asm/restorer.h
> > +++ b/arch/x86/include/asm/restorer.h
> > @@ -53,7 +53,7 @@ struct rt_sigcontext {
> >  	unsigned short			cs;
> >  	unsigned short			gs;
> >  	unsigned short			fs;
> > -	unsigned short			__pad0;
> > +	unsigned short			ss;
> >  	unsigned long			err;
> >  	unsigned long			trapno;
> >  	unsigned long			oldmask;
> >
>
> 	Cyrill


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-18 18:25           ` Oleg Nesterov
@ 2015-03-18 18:32             ` Andy Lutomirski
  2015-03-18 19:13             ` Cyrill Gorcunov
  1 sibling, 0 replies; 39+ messages in thread
From: Andy Lutomirski @ 2015-03-18 18:32 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Cyrill Gorcunov, Andrey Wagin, Andy Lutomirski, Ingo Molnar,
	Andi Kleen, H. Peter Anvin, Al Viro, X86 ML, LKML,
	Linus Torvalds, Borislav Petkov, Pavel Emelyanov

On Wed, Mar 18, 2015 at 11:25 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/18, Cyrill Gorcunov wrote:
>>
>> On Wed, Mar 18, 2015 at 11:06:00AM -0700, Andy Lutomirski wrote:
>> > > --- a/arch/x86/crtools.c
>> > > +++ b/arch/x86/crtools.c
>> > > @@ -475,6 +475,7 @@ int restore_gpregs(struct rt_sigframe *f, UserX86RegsEntry *r)
>> > >         CPREG2(rip, ip);
>> > >         CPREG2(eflags, flags);
>> > >         CPREG1(cs);
>> > > +       CPREG1(ss);
>> > >         CPREG1(gs);
>> > >         CPREG1(fs);
>> >
>> > Huh?  Is CRIU actually trying to build an entire sigcontext from
>> > scratch here?  I don't see how this can reliably work across kernel
>> > versions or CPU versions.
>>
>> Yes we are. And why it can't work reliably? As to CPU -- we're
>> testing that cpu features saved in image should match ones
>> provided by the kernel runtime, ie on the machine where we're
>> restoring.
>
> But, say, __USER_CS can be changed in kernel, and nobody should notice this.

This actually happens on a regular basis.  Has anyone tried
checkpointing on Xen and restoring on native?  Xen does interesting
things to cs and possibly to ss as well on x86.

You get somewhat of a free pass on ds and es because the kernel's
context switch code is very tolerant of failures there.

--Andy

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-18 18:20           ` Andy Lutomirski
@ 2015-03-18 18:45             ` Cyrill Gorcunov
  0 siblings, 0 replies; 39+ messages in thread
From: Cyrill Gorcunov @ 2015-03-18 18:45 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Oleg Nesterov, Andrey Wagin, Andy Lutomirski, Ingo Molnar,
	Andi Kleen, H. Peter Anvin, Al Viro, X86 ML, LKML,
	Linus Torvalds, Borislav Petkov, Pavel Emelyanov

On Wed, Mar 18, 2015 at 11:20:08AM -0700, Andy Lutomirski wrote:
> >
> > Basically the initial registers values are fetched with ptrace
> > when program is been dumped, then on restore we copy them back
> > into sigcontext. That said we don't mangle fs/gs anyhow simply
> > restore the values back from dump.
> 
> Wouldn't it be a little safer to have the kernel make you a sane
> sigcontext by raising a signal and catching it rather than writing a
> new one from scratch?  Grr, maybe the kernel should version its
> sigcontext structures.

I think it would, thanks for the point! I'll tune up our code.

> Getting fs from ptrace and restoring it to sigcontext is useless --
> ptrace handles fs correctly, but that sigcontext field is just
> padding.
> 
> Is CRIU at least reliably putting zero in the ss field?  If so, we
> could add a special case to translate zero to __USER_DS on restore.
> (And then I'll update my test case, and we'll have to document it,
> etc.)

Well, it should but need to check (we are tryin to not zeroify
things until really needed in a sake of speed).

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-18 18:31         ` Oleg Nesterov
@ 2015-03-18 18:50           ` Cyrill Gorcunov
  2015-03-18 19:52             ` Andrey Wagin
  0 siblings, 1 reply; 39+ messages in thread
From: Cyrill Gorcunov @ 2015-03-18 18:50 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrey Wagin, Andy Lutomirski, Ingo Molnar, Andi Kleen,
	H. Peter Anvin, Al Viro, X86 ML, LKML, Linus Torvalds,
	Borislav Petkov, Andy Lutomirski, Pavel Emelyanov

On Wed, Mar 18, 2015 at 07:31:33PM +0100, Oleg Nesterov wrote:
> On 03/18, Cyrill Gorcunov wrote:
> >
> > On Wed, Mar 18, 2015 at 06:48:43PM +0100, Oleg Nesterov wrote:
> > >
> > > Shot in a dark afer a quick grep: restore_gpregs() should initialize ->ss?
> >
> > It hasn't been needed earlier, if this would help it means abi is broken, no? :)
> > Otherwise I don't understand what's happening.
> 
> until this commit the kernel simply forgot to restore ->ss in sigreturn().
> 
> after this commit, if your rt_sigframe has garbage in ->ss then SIGSEGV
> is clear.

OK, so setting up a proper ss here should fix problem.

> > >
> > > Seriously, where is UserX86RegsEntry?
> >
> > It's in protobif/core-x86.proto, welcome to protobuf hell.
> 
> Still can't find UserX86RegsEntry... OK, perhaps it comes from
> user_x86_regs_entry in protobuf/core-x86.proto. Then I guess the patch I sent
> can be compiled at least ;)

Yeah, protobuf-c mangles "_" in message name so user_x86_regs_entry -> UserX86RegsEntry.
Andrew should be able to test it tomorrow (hopefully it's no urgent right? ;)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-18 18:25           ` Oleg Nesterov
  2015-03-18 18:32             ` Andy Lutomirski
@ 2015-03-18 19:13             ` Cyrill Gorcunov
  1 sibling, 0 replies; 39+ messages in thread
From: Cyrill Gorcunov @ 2015-03-18 19:13 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andy Lutomirski, Andrey Wagin, Andy Lutomirski, Ingo Molnar,
	Andi Kleen, H. Peter Anvin, Al Viro, X86 ML, LKML,
	Linus Torvalds, Borislav Petkov, Pavel Emelyanov

On Wed, Mar 18, 2015 at 07:25:22PM +0100, Oleg Nesterov wrote:
> On 03/18, Cyrill Gorcunov wrote:
> >
> > On Wed, Mar 18, 2015 at 11:06:00AM -0700, Andy Lutomirski wrote:
> > > > --- a/arch/x86/crtools.c
> > > > +++ b/arch/x86/crtools.c
> > > > @@ -475,6 +475,7 @@ int restore_gpregs(struct rt_sigframe *f, UserX86RegsEntry *r)
> > > >         CPREG2(rip, ip);
> > > >         CPREG2(eflags, flags);
> > > >         CPREG1(cs);
> > > > +       CPREG1(ss);
> > > >         CPREG1(gs);
> > > >         CPREG1(fs);
> > >
> > > Huh?  Is CRIU actually trying to build an entire sigcontext from
> > > scratch here?  I don't see how this can reliably work across kernel
> > > versions or CPU versions.
> >
> > Yes we are. And why it can't work reliably? As to CPU -- we're
> > testing that cpu features saved in image should match ones
> > provided by the kernel runtime, ie on the machine where we're
> > restoring.
> 
> But, say, __USER_CS can be changed in kernel, and nobody should notice this.

True (and this applies to quotes below). Hopefully it won't be changed frequently
though ;) As to seg registers sure the safe way as Andy pointed is to fetch them
runtime on the machine we're restoring. Thanks, I will update our code!

> 
> But in this case "restore on another machine" or "restore after kernel
> upgrade" can fail.
> 
> So probably restore_gpregs() should only change the general-purpose regs,
> as its name suggests.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-18 18:50           ` Cyrill Gorcunov
@ 2015-03-18 19:52             ` Andrey Wagin
  2015-03-18 20:02               ` Oleg Nesterov
  0 siblings, 1 reply; 39+ messages in thread
From: Andrey Wagin @ 2015-03-18 19:52 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Oleg Nesterov, Andy Lutomirski, Ingo Molnar, Andi Kleen,
	H. Peter Anvin, Al Viro, X86 ML, LKML, Linus Torvalds,
	Borislav Petkov, Andy Lutomirski, Pavel Emelyanov

2015-03-18 21:50 GMT+03:00 Cyrill Gorcunov <gorcunov@gmail.com>:
> On Wed, Mar 18, 2015 at 07:31:33PM +0100, Oleg Nesterov wrote:
>> On 03/18, Cyrill Gorcunov wrote:
>> >
>> > On Wed, Mar 18, 2015 at 06:48:43PM +0100, Oleg Nesterov wrote:
>> > >
>> > > Shot in a dark afer a quick grep: restore_gpregs() should initialize ->ss?
>> >
>> > It hasn't been needed earlier, if this would help it means abi is broken, no? :)
>> > Otherwise I don't understand what's happening.
>>
>> until this commit the kernel simply forgot to restore ->ss in sigreturn().
>>
>> after this commit, if your rt_sigframe has garbage in ->ss then SIGSEGV
>> is clear.
>
> OK, so setting up a proper ss here should fix problem.
>
>> > >
>> > > Seriously, where is UserX86RegsEntry?
>> >
>> > It's in protobif/core-x86.proto, welcome to protobuf hell.
>>
>> Still can't find UserX86RegsEntry... OK, perhaps it comes from
>> user_x86_regs_entry in protobuf/core-x86.proto. Then I guess the patch I sent
>> can be compiled at least ;)
>
> Yeah, protobuf-c mangles "_" in message name so user_x86_regs_entry -> UserX86RegsEntry.
> Andrew should be able to test it tomorrow (hopefully it's no urgent right? ;)

This patch fixes the problem. Oleg, could you send this path in the
criu maillist?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-18 19:52             ` Andrey Wagin
@ 2015-03-18 20:02               ` Oleg Nesterov
  2015-03-18 21:26                 ` Andy Lutomirski
  0 siblings, 1 reply; 39+ messages in thread
From: Oleg Nesterov @ 2015-03-18 20:02 UTC (permalink / raw)
  To: Andrey Wagin
  Cc: Cyrill Gorcunov, Andy Lutomirski, Ingo Molnar, Andi Kleen,
	H. Peter Anvin, Al Viro, X86 ML, LKML, Linus Torvalds,
	Borislav Petkov, Andy Lutomirski, Pavel Emelyanov

On 03/18, Andrey Wagin wrote:
>
> This patch fixes the problem. Oleg, could you send this path in the
> criu maillist?

Sure, will do.

Oleg.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-18 20:02               ` Oleg Nesterov
@ 2015-03-18 21:26                 ` Andy Lutomirski
  2015-03-18 21:34                   ` Pavel Emelyanov
  0 siblings, 1 reply; 39+ messages in thread
From: Andy Lutomirski @ 2015-03-18 21:26 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrey Wagin, Cyrill Gorcunov, Andy Lutomirski, Ingo Molnar,
	Andi Kleen, H. Peter Anvin, Al Viro, X86 ML, LKML,
	Linus Torvalds, Borislav Petkov, Pavel Emelyanov

On Wed, Mar 18, 2015 at 1:02 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/18, Andrey Wagin wrote:
>>
>> This patch fixes the problem. Oleg, could you send this path in the
>> criu maillist?
>
> Sure, will do.

We still haven't answered one question: what's the kernel's position
on ABI stability wrt CRIU?  We clearly shouldn't make changes that
break the principle of CRIU, but CRIU encodes so many tricky
assumptions about the inner workings of the kernel that it's really
tough to avoid breaking old CRIU versions.

So... do we introduce somewhat nasty code into the kernel to keep old
CRIU versions working, or do we require that users who want to restore
onto new kernels use new CRIU?

(It seems clear to me that CRIU should apply the patch regardless of
what the kernel does.  It will enable CRIU to work on the same class
of programs that are fixed by the kernel change that started this
thread.)

--Andy

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-18 21:26                 ` Andy Lutomirski
@ 2015-03-18 21:34                   ` Pavel Emelyanov
  2015-03-18 22:03                     ` Andy Lutomirski
  0 siblings, 1 reply; 39+ messages in thread
From: Pavel Emelyanov @ 2015-03-18 21:34 UTC (permalink / raw)
  To: Andy Lutomirski, Oleg Nesterov
  Cc: Andrey Wagin, Cyrill Gorcunov, Andy Lutomirski, Ingo Molnar,
	Andi Kleen, H. Peter Anvin, Al Viro, X86 ML, LKML,
	Linus Torvalds, Borislav Petkov, Pavel Emelyanov

On 03/19/2015 12:26 AM, Andy Lutomirski wrote:
> On Wed, Mar 18, 2015 at 1:02 PM, Oleg Nesterov <oleg@redhat.com> wrote:
>> On 03/18, Andrey Wagin wrote:
>>>
>>> This patch fixes the problem. Oleg, could you send this path in the
>>> criu maillist?
>>
>> Sure, will do.
> 
> We still haven't answered one question: what's the kernel's position
> on ABI stability wrt CRIU?  We clearly shouldn't make changes that
> break the principle of CRIU, but CRIU encodes so many tricky
> assumptions about the inner workings of the kernel that it's really
> tough to avoid breaking old CRIU versions.

Well, we try hard to use only documented kernel API-s. Isn't the sigframe
considered to be some sort of "stable API"? I mean -- it's visible by the
userspace, nobody prevents glibc or gdb from messing with this stuff just
by reading it from memory.

If it's "parse-able" e.g. like VDSO is, but we don't do it in CRIU -- then
it's definitely a CRIU BUG to be fixed.

> So... do we introduce somewhat nasty code into the kernel to keep old
> CRIU versions working, or do we require that users who want to restore
> onto new kernels use new CRIU?

It's OK (I think) to require newer versions of CRIU, it's easy to update
one unlike the kernel ;)

But if "old" version of CRIU just crash the restored processes on "new"
kernels and there's no way to detect this properly -- that's the problem.

> (It seems clear to me that CRIU should apply the patch regardless of
> what the kernel does.  It will enable CRIU to work on the same class
> of programs that are fixed by the kernel change that started this
> thread.)
> 
> --Andy
> .
> 

Thanks,
Pavel


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-18 21:34                   ` Pavel Emelyanov
@ 2015-03-18 22:03                     ` Andy Lutomirski
  2015-03-19  7:35                       ` Cyrill Gorcunov
  0 siblings, 1 reply; 39+ messages in thread
From: Andy Lutomirski @ 2015-03-18 22:03 UTC (permalink / raw)
  To: Pavel Emelyanov
  Cc: Oleg Nesterov, Andrey Wagin, Cyrill Gorcunov, Andy Lutomirski,
	Ingo Molnar, Andi Kleen, H. Peter Anvin, Al Viro, X86 ML, LKML,
	Linus Torvalds, Borislav Petkov, Pavel Emelyanov

On Wed, Mar 18, 2015 at 2:34 PM, Pavel Emelyanov <xemul@parallels.com> wrote:
> On 03/19/2015 12:26 AM, Andy Lutomirski wrote:
>> On Wed, Mar 18, 2015 at 1:02 PM, Oleg Nesterov <oleg@redhat.com> wrote:
>>> On 03/18, Andrey Wagin wrote:
>>>>
>>>> This patch fixes the problem. Oleg, could you send this path in the
>>>> criu maillist?
>>>
>>> Sure, will do.
>>
>> We still haven't answered one question: what's the kernel's position
>> on ABI stability wrt CRIU?  We clearly shouldn't make changes that
>> break the principle of CRIU, but CRIU encodes so many tricky
>> assumptions about the inner workings of the kernel that it's really
>> tough to avoid breaking old CRIU versions.
>
> Well, we try hard to use only documented kernel API-s. Isn't the sigframe
> considered to be some sort of "stable API"? I mean -- it's visible by the
> userspace, nobody prevents glibc or gdb from messing with this stuff just
> by reading it from memory.
>
> If it's "parse-able" e.g. like VDSO is, but we don't do it in CRIU -- then
> it's definitely a CRIU BUG to be fixed.

It's certainly parseable by things like gdb.  But it's also supposed
to be extensible.  hpa, any thoughts here?

>
>> So... do we introduce somewhat nasty code into the kernel to keep old
>> CRIU versions working, or do we require that users who want to restore
>> onto new kernels use new CRIU?
>
> It's OK (I think) to require newer versions of CRIU, it's easy to update
> one unlike the kernel ;)
>
> But if "old" version of CRIU just crash the restored processes on "new"
> kernels and there's no way to detect this properly -- that's the problem.

Yeah, that's unfortunate.

I don't have a great idea for how to work around this, unfortunately.
Ideally we'd increment some kind of version counter or use an
extension mechanism rather than shoving ss into a field that used to
be padding.

--Andy

>
>> (It seems clear to me that CRIU should apply the patch regardless of
>> what the kernel does.  It will enable CRIU to work on the same class
>> of programs that are fixed by the kernel change that started this
>> thread.)
>>
>> --Andy
>> .
>>
>
> Thanks,
> Pavel
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-18 22:03                     ` Andy Lutomirski
@ 2015-03-19  7:35                       ` Cyrill Gorcunov
  2015-03-19 16:08                         ` Andy Lutomirski
  2015-03-20 11:43                         ` Denys Vlasenko
  0 siblings, 2 replies; 39+ messages in thread
From: Cyrill Gorcunov @ 2015-03-19  7:35 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Pavel Emelyanov, Oleg Nesterov, Andrey Wagin, Andy Lutomirski,
	Ingo Molnar, Andi Kleen, H. Peter Anvin, Al Viro, X86 ML, LKML,
	Linus Torvalds, Borislav Petkov, Pavel Emelyanov

On Wed, Mar 18, 2015 at 03:03:27PM -0700, Andy Lutomirski wrote:
> On Wed, Mar 18, 2015 at 2:34 PM, Pavel Emelyanov <xemul@parallels.com> wrote:
> > On 03/19/2015 12:26 AM, Andy Lutomirski wrote:
> >> On Wed, Mar 18, 2015 at 1:02 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> >>> On 03/18, Andrey Wagin wrote:
> >>>>
> >>>> This patch fixes the problem. Oleg, could you send this path in the
> >>>> criu maillist?
> >>>
> >>> Sure, will do.
> >>
> >> We still haven't answered one question: what's the kernel's position
> >> on ABI stability wrt CRIU?  We clearly shouldn't make changes that
> >> break the principle of CRIU, but CRIU encodes so many tricky
> >> assumptions about the inner workings of the kernel that it's really
> >> tough to avoid breaking old CRIU versions.
> >
> > Well, we try hard to use only documented kernel API-s. Isn't the sigframe
> > considered to be some sort of "stable API"? I mean -- it's visible by the
> > userspace, nobody prevents glibc or gdb from messing with this stuff just
> > by reading it from memory.
> >
> > If it's "parse-able" e.g. like VDSO is, but we don't do it in CRIU -- then
> > it's definitely a CRIU BUG to be fixed.
> 
> It's certainly parseable by things like gdb.  But it's also supposed
> to be extensible.  hpa, any thoughts here?
> 
> >
> >> So... do we introduce somewhat nasty code into the kernel to keep old
> >> CRIU versions working, or do we require that users who want to restore
> >> onto new kernels use new CRIU?
> >
> > It's OK (I think) to require newer versions of CRIU, it's easy to update
> > one unlike the kernel ;)
> >
> > But if "old" version of CRIU just crash the restored processes on "new"
> > kernels and there's no way to detect this properly -- that's the problem.
> 
> Yeah, that's unfortunate.
> 
> I don't have a great idea for how to work around this, unfortunately.
> Ideally we'd increment some kind of version counter or use an
> extension mechanism rather than shoving ss into a field that used to
> be padding.

fwiw currently we're passing zero in this __pad0 (replying to your
previous email, so we can workaround in the kernel assuming zero
as a special case, not that good but better than nothing).

> 
> --Andy
> 
> >
> >> (It seems clear to me that CRIU should apply the patch regardless of
> >> what the kernel does.  It will enable CRIU to work on the same class
> >> of programs that are fixed by the kernel change that started this
> >> thread.)
> >>
> >> --Andy
> >> .
> >>
> >
> > Thanks,
> > Pavel
> >
> 
> 
> 
> -- 
> Andy Lutomirski
> AMA Capital Management, LLC
> 

	Cyrill

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-19  7:35                       ` Cyrill Gorcunov
@ 2015-03-19 16:08                         ` Andy Lutomirski
  2015-03-19 16:19                           ` Cyrill Gorcunov
  2015-03-20 11:43                         ` Denys Vlasenko
  1 sibling, 1 reply; 39+ messages in thread
From: Andy Lutomirski @ 2015-03-19 16:08 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Pavel Emelyanov, Oleg Nesterov, Andrey Wagin, Andy Lutomirski,
	Ingo Molnar, Andi Kleen, H. Peter Anvin, Al Viro, X86 ML, LKML,
	Linus Torvalds, Borislav Petkov, Pavel Emelyanov

On Thu, Mar 19, 2015 at 12:35 AM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> On Wed, Mar 18, 2015 at 03:03:27PM -0700, Andy Lutomirski wrote:
>> On Wed, Mar 18, 2015 at 2:34 PM, Pavel Emelyanov <xemul@parallels.com> wrote:
>> > On 03/19/2015 12:26 AM, Andy Lutomirski wrote:
>> >> On Wed, Mar 18, 2015 at 1:02 PM, Oleg Nesterov <oleg@redhat.com> wrote:
>> >>> On 03/18, Andrey Wagin wrote:
>> >>>>
>> >>>> This patch fixes the problem. Oleg, could you send this path in the
>> >>>> criu maillist?
>> >>>
>> >>> Sure, will do.
>> >>
>> >> We still haven't answered one question: what's the kernel's position
>> >> on ABI stability wrt CRIU?  We clearly shouldn't make changes that
>> >> break the principle of CRIU, but CRIU encodes so many tricky
>> >> assumptions about the inner workings of the kernel that it's really
>> >> tough to avoid breaking old CRIU versions.
>> >
>> > Well, we try hard to use only documented kernel API-s. Isn't the sigframe
>> > considered to be some sort of "stable API"? I mean -- it's visible by the
>> > userspace, nobody prevents glibc or gdb from messing with this stuff just
>> > by reading it from memory.
>> >
>> > If it's "parse-able" e.g. like VDSO is, but we don't do it in CRIU -- then
>> > it's definitely a CRIU BUG to be fixed.
>>
>> It's certainly parseable by things like gdb.  But it's also supposed
>> to be extensible.  hpa, any thoughts here?
>>
>> >
>> >> So... do we introduce somewhat nasty code into the kernel to keep old
>> >> CRIU versions working, or do we require that users who want to restore
>> >> onto new kernels use new CRIU?
>> >
>> > It's OK (I think) to require newer versions of CRIU, it's easy to update
>> > one unlike the kernel ;)
>> >
>> > But if "old" version of CRIU just crash the restored processes on "new"
>> > kernels and there's no way to detect this properly -- that's the problem.
>>
>> Yeah, that's unfortunate.
>>
>> I don't have a great idea for how to work around this, unfortunately.
>> Ideally we'd increment some kind of version counter or use an
>> extension mechanism rather than shoving ss into a field that used to
>> be padding.
>
> fwiw currently we're passing zero in this __pad0 (replying to your
> previous email, so we can workaround in the kernel assuming zero
> as a special case, not that good but better than nothing).

We could store ss_plus_one instead of ss.  Yuck.

The only real down side I can see to special casing zero is that it
really is possible to end up with zero in there.  For example, the
SIGSEGV you get do to the failed sigreturn probably has sigcontext->ss
== 0 :)

--Andy

>
>>
>> --Andy
>>
>> >
>> >> (It seems clear to me that CRIU should apply the patch regardless of
>> >> what the kernel does.  It will enable CRIU to work on the same class
>> >> of programs that are fixed by the kernel change that started this
>> >> thread.)
>> >>
>> >> --Andy
>> >> .
>> >>
>> >
>> > Thanks,
>> > Pavel
>> >
>>
>>
>>
>> --
>> Andy Lutomirski
>> AMA Capital Management, LLC
>>
>
>         Cyrill



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-19 16:08                         ` Andy Lutomirski
@ 2015-03-19 16:19                           ` Cyrill Gorcunov
  0 siblings, 0 replies; 39+ messages in thread
From: Cyrill Gorcunov @ 2015-03-19 16:19 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Pavel Emelyanov, Oleg Nesterov, Andrey Wagin, Andy Lutomirski,
	Ingo Molnar, Andi Kleen, H. Peter Anvin, Al Viro, X86 ML, LKML,
	Linus Torvalds, Borislav Petkov, Pavel Emelyanov

On Thu, Mar 19, 2015 at 09:08:08AM -0700, Andy Lutomirski wrote:
> >> >
> >> >> So... do we introduce somewhat nasty code into the kernel to keep old
> >> >> CRIU versions working, or do we require that users who want to restore
> >> >> onto new kernels use new CRIU?
> >> >
> >> > It's OK (I think) to require newer versions of CRIU, it's easy to update
> >> > one unlike the kernel ;)
> >> >
> >> > But if "old" version of CRIU just crash the restored processes on "new"
> >> > kernels and there's no way to detect this properly -- that's the problem.
> >>
> >> Yeah, that's unfortunate.
> >>
> >> I don't have a great idea for how to work around this, unfortunately.
> >> Ideally we'd increment some kind of version counter or use an
> >> extension mechanism rather than shoving ss into a field that used to
> >> be padding.
> >
> > fwiw currently we're passing zero in this __pad0 (replying to your
> > previous email, so we can workaround in the kernel assuming zero
> > as a special case, not that good but better than nothing).
> 
> We could store ss_plus_one instead of ss.  Yuck.

Heh, sounds like ugly-dirty-hack ;)

> The only real down side I can see to special casing zero is that it
> really is possible to end up with zero in there.  For example, the
> SIGSEGV you get do to the failed sigreturn probably has sigcontext->ss
> == 0 :)

Sure we;ve failed for exactly this reason simply because we've it
zeroified manually :) So there is no simple workaround in the kernel
for programs dumped before the kernel fix and restored on new kernel
where this patch exist but criu is old enough and has no Oleg's fix.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-19  7:35                       ` Cyrill Gorcunov
  2015-03-19 16:08                         ` Andy Lutomirski
@ 2015-03-20 11:43                         ` Denys Vlasenko
  2015-03-20 11:56                           ` Cyrill Gorcunov
  1 sibling, 1 reply; 39+ messages in thread
From: Denys Vlasenko @ 2015-03-20 11:43 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Andy Lutomirski, Pavel Emelyanov, Oleg Nesterov, Andrey Wagin,
	Andy Lutomirski, Ingo Molnar, Andi Kleen, H. Peter Anvin,
	Al Viro, X86 ML, LKML, Linus Torvalds, Borislav Petkov,
	Pavel Emelyanov

On Thu, Mar 19, 2015 at 8:35 AM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> On Wed, Mar 18, 2015 at 03:03:27PM -0700, Andy Lutomirski wrote:
>> I don't have a great idea for how to work around this, unfortunately.
>> Ideally we'd increment some kind of version counter or use an
>> extension mechanism rather than shoving ss into a field that used to
>> be padding.
>
> fwiw currently we're passing zero in this __pad0 (replying to your
> previous email, so we can workaround in the kernel assuming zero
> as a special case, not that good but better than nothing).

Special-casing zero sounds not that bad to me.
It can be removed after a few years - just don't forget
to document it in a good comment: why we have special
case? What software required it?
In which version of that software the need to have this hack
was eliminated?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-20 11:43                         ` Denys Vlasenko
@ 2015-03-20 11:56                           ` Cyrill Gorcunov
  2015-03-20 12:04                             ` Cyrill Gorcunov
  0 siblings, 1 reply; 39+ messages in thread
From: Cyrill Gorcunov @ 2015-03-20 11:56 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Andy Lutomirski, Pavel Emelyanov, Oleg Nesterov, Andrey Wagin,
	Andy Lutomirski, Ingo Molnar, Andi Kleen, H. Peter Anvin,
	Al Viro, X86 ML, LKML, Linus Torvalds, Borislav Petkov,
	Pavel Emelyanov

On Fri, Mar 20, 2015 at 12:43:14PM +0100, Denys Vlasenko wrote:
> On Thu, Mar 19, 2015 at 8:35 AM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> > On Wed, Mar 18, 2015 at 03:03:27PM -0700, Andy Lutomirski wrote:
> >> I don't have a great idea for how to work around this, unfortunately.
> >> Ideally we'd increment some kind of version counter or use an
> >> extension mechanism rather than shoving ss into a field that used to
> >> be padding.
> >
> > fwiw currently we're passing zero in this __pad0 (replying to your
> > previous email, so we can workaround in the kernel assuming zero
> > as a special case, not that good but better than nothing).
> 
> Special-casing zero sounds not that bad to me.
> It can be removed after a few years - just don't forget
> to document it in a good comment: why we have special
> case? What software required it?
> In which version of that software the need to have this hack
> was eliminated?

To be fair, such special case would be ideal for us, so that
if noone object against such hack, i would cook a patch.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-20 11:56                           ` Cyrill Gorcunov
@ 2015-03-20 12:04                             ` Cyrill Gorcunov
  2015-03-20 14:07                               ` Oleg Nesterov
  0 siblings, 1 reply; 39+ messages in thread
From: Cyrill Gorcunov @ 2015-03-20 12:04 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Andy Lutomirski, Pavel Emelyanov, Oleg Nesterov, Andrey Wagin,
	Andy Lutomirski, Ingo Molnar, Andi Kleen, H. Peter Anvin,
	Al Viro, X86 ML, LKML, Linus Torvalds, Borislav Petkov,
	Pavel Emelyanov

On Fri, Mar 20, 2015 at 02:56:22PM +0300, Cyrill Gorcunov wrote:
> > >
> > > fwiw currently we're passing zero in this __pad0 (replying to your
> > > previous email, so we can workaround in the kernel assuming zero
> > > as a special case, not that good but better than nothing).
> > 
> > Special-casing zero sounds not that bad to me.
> > It can be removed after a few years - just don't forget
> > to document it in a good comment: why we have special
> > case? What software required it?
> > In which version of that software the need to have this hack
> > was eliminated?
> 
> To be fair, such special case would be ideal for us, so that
> if noone object against such hack, i would cook a patch.

Denys, note though that Andy pointed a downside for such approach as well:

 | The only real down side I can see to special casing zero is that it
 | really is possible to end up with zero in there.  For example, the
 | SIGSEGV you get do to the failed sigreturn probably has sigcontext->ss
 | == 0 :)

which I don't know how to resolve.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-20 12:04                             ` Cyrill Gorcunov
@ 2015-03-20 14:07                               ` Oleg Nesterov
  2015-03-20 14:47                                 ` Cyrill Gorcunov
  0 siblings, 1 reply; 39+ messages in thread
From: Oleg Nesterov @ 2015-03-20 14:07 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Denys Vlasenko, Andy Lutomirski, Pavel Emelyanov, Andrey Wagin,
	Andy Lutomirski, Ingo Molnar, Andi Kleen, H. Peter Anvin,
	Al Viro, X86 ML, LKML, Linus Torvalds, Borislav Petkov,
	Pavel Emelyanov

I do not have a strong opinion, I leave this to you and Andy...

But,

On 03/20, Cyrill Gorcunov wrote:
> On Fri, Mar 20, 2015 at 02:56:22PM +0300, Cyrill Gorcunov wrote:
> > > >
> > > > fwiw currently we're passing zero in this __pad0 (replying to your
> > > > previous email, so we can workaround in the kernel assuming zero
> > > > as a special case, not that good but better than nothing).
> > >
> > > Special-casing zero sounds not that bad to me.
> > > It can be removed after a few years

If we add this special case now, I am not sure we can remove it later.

> > > just don't forget
> > > to document it in a good comment

and perhaps with WARN_ON_ONCE().

>  | The only real down side I can see to special casing zero is that it
>  | really is possible to end up with zero in there.  For example, the
>  | SIGSEGV you get do to the failed sigreturn probably has sigcontext->ss
>  | == 0 :)
>
> which I don't know how to resolve.

Another downside (to me) is that this special case can help, but only
"by accident".

For example, criu has ->ss = 0, but it was not initialized explicitly.

But as I said, I won't argue.

Oleg.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-20 14:07                               ` Oleg Nesterov
@ 2015-03-20 14:47                                 ` Cyrill Gorcunov
  2015-04-10 21:59                                   ` Andy Lutomirski
  0 siblings, 1 reply; 39+ messages in thread
From: Cyrill Gorcunov @ 2015-03-20 14:47 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Denys Vlasenko, Andy Lutomirski, Pavel Emelyanov, Andrey Wagin,
	Andy Lutomirski, Ingo Molnar, Andi Kleen, H. Peter Anvin,
	Al Viro, X86 ML, LKML, Linus Torvalds, Borislav Petkov,
	Pavel Emelyanov

On Fri, Mar 20, 2015 at 03:07:38PM +0100, Oleg Nesterov wrote:
> 
> Another downside (to me) is that this special case can help, but only
> "by accident".
> 
> For example, criu has ->ss = 0, but it was not initialized explicitly.
> 
> But as I said, I won't argue.

OK, after talking to Pavel -- it's fine to require new criu version
for new kernels, thus lets leave kernel pure without this special
hack, moreover now we have a fix from Oleg merged into criu so we
should be safe. Thanks a lot for all comments!

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-03-20 14:47                                 ` Cyrill Gorcunov
@ 2015-04-10 21:59                                   ` Andy Lutomirski
  2015-04-10 22:11                                     ` Cyrill Gorcunov
  0 siblings, 1 reply; 39+ messages in thread
From: Andy Lutomirski @ 2015-04-10 21:59 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Oleg Nesterov, Denys Vlasenko, Pavel Emelyanov, Andrey Wagin,
	Andy Lutomirski, Ingo Molnar, Andi Kleen, H. Peter Anvin,
	Al Viro, X86 ML, LKML, Linus Torvalds, Borislav Petkov,
	Pavel Emelyanov

On Fri, Mar 20, 2015 at 7:47 AM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> On Fri, Mar 20, 2015 at 03:07:38PM +0100, Oleg Nesterov wrote:
>>
>> Another downside (to me) is that this special case can help, but only
>> "by accident".
>>
>> For example, criu has ->ss = 0, but it was not initialized explicitly.
>>
>> But as I said, I won't argue.
>
> OK, after talking to Pavel -- it's fine to require new criu version
> for new kernels, thus lets leave kernel pure without this special
> hack, moreover now we have a fix from Oleg merged into criu so we
> should be safe. Thanks a lot for all comments!

As long as you're doing that, any chance you could remove CPREG1(gs)
and CPREG1(fs) as well?  They haven't had any effect since
2.5.something_very_old, and the kernel might want to reuse those
sigcontext slots some day.

--Andy

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-04-10 21:59                                   ` Andy Lutomirski
@ 2015-04-10 22:11                                     ` Cyrill Gorcunov
  2015-04-10 22:16                                       ` Andy Lutomirski
  0 siblings, 1 reply; 39+ messages in thread
From: Cyrill Gorcunov @ 2015-04-10 22:11 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Oleg Nesterov, Denys Vlasenko, Pavel Emelyanov, Andrey Wagin,
	Andy Lutomirski, Ingo Molnar, Andi Kleen, H. Peter Anvin,
	Al Viro, X86 ML, LKML, Linus Torvalds, Borislav Petkov,
	Pavel Emelyanov

On Fri, Apr 10, 2015 at 02:59:19PM -0700, Andy Lutomirski wrote:
> 
> As long as you're doing that, any chance you could remove CPREG1(gs)
> and CPREG1(fs) as well?  They haven't had any effect since
> 2.5.something_very_old, and the kernel might want to reuse those
> sigcontext slots some day.

Yeah, absolutely. I'm implementing 32bits support for criu now
so I managed to postprone this task but will implement on the
week. Thanks for reminder!

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-04-10 22:11                                     ` Cyrill Gorcunov
@ 2015-04-10 22:16                                       ` Andy Lutomirski
  2015-04-10 22:20                                         ` Cyrill Gorcunov
  0 siblings, 1 reply; 39+ messages in thread
From: Andy Lutomirski @ 2015-04-10 22:16 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Oleg Nesterov, Denys Vlasenko, Pavel Emelyanov, Andrey Wagin,
	Andy Lutomirski, Ingo Molnar, Andi Kleen, H. Peter Anvin,
	Al Viro, X86 ML, LKML, Linus Torvalds, Borislav Petkov,
	Pavel Emelyanov

On Fri, Apr 10, 2015 at 3:11 PM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> On Fri, Apr 10, 2015 at 02:59:19PM -0700, Andy Lutomirski wrote:
>>
>> As long as you're doing that, any chance you could remove CPREG1(gs)
>> and CPREG1(fs) as well?  They haven't had any effect since
>> 2.5.something_very_old, and the kernel might want to reuse those
>> sigcontext slots some day.
>
> Yeah, absolutely. I'm implementing 32bits support for criu now
> so I managed to postprone this task but will implement on the
> week. Thanks for reminder!

Yikes, have fun with that.  I think there's a major vdso issue you'll
hit.  Make sure you test on Intel hardware.  I'll try to help if you
find stuff in my areas of the kernel.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs
  2015-04-10 22:16                                       ` Andy Lutomirski
@ 2015-04-10 22:20                                         ` Cyrill Gorcunov
  0 siblings, 0 replies; 39+ messages in thread
From: Cyrill Gorcunov @ 2015-04-10 22:20 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Oleg Nesterov, Denys Vlasenko, Pavel Emelyanov, Andrey Wagin,
	Andy Lutomirski, Ingo Molnar, Andi Kleen, H. Peter Anvin,
	Al Viro, X86 ML, LKML, Linus Torvalds, Borislav Petkov,
	Pavel Emelyanov

On Fri, Apr 10, 2015 at 03:16:18PM -0700, Andy Lutomirski wrote:
> On Fri, Apr 10, 2015 at 3:11 PM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> > On Fri, Apr 10, 2015 at 02:59:19PM -0700, Andy Lutomirski wrote:
> >>
> >> As long as you're doing that, any chance you could remove CPREG1(gs)
> >> and CPREG1(fs) as well?  They haven't had any effect since
> >> 2.5.something_very_old, and the kernel might want to reuse those
> >> sigcontext slots some day.
> >
> > Yeah, absolutely. I'm implementing 32bits support for criu now
> > so I managed to postprone this task but will implement on the
> > week. Thanks for reminder!
> 
> Yikes, have fun with that.  I think there's a major vdso issue you'll
> hit.  Make sure you test on Intel hardware.  I'll try to help if you
> find stuff in my areas of the kernel.

Heh. Didn't reach this point yet, but be sure I'll contact you ;)

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2015-04-10 22:20 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-12 20:57 [PATCH v3 0/2] x86_64: Sigcontext improvements Andy Lutomirski
2015-03-12 20:57 ` [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs Andy Lutomirski
2015-03-13 16:13   ` Borislav Petkov
2015-03-16  8:11   ` Ingo Molnar
2015-03-16 12:08   ` [tip:x86/asm] x86/signal/64: " tip-bot for Andy Lutomirski
2015-03-17  8:44   ` tip-bot for Andy Lutomirski
2015-03-18 17:19   ` [PATCH v3 1/2] x86_64,signal: " Andrey Wagin
2015-03-18 17:48     ` Oleg Nesterov
2015-03-18 18:06       ` Andy Lutomirski
2015-03-18 18:17         ` Cyrill Gorcunov
2015-03-18 18:20           ` Andy Lutomirski
2015-03-18 18:45             ` Cyrill Gorcunov
2015-03-18 18:25           ` Oleg Nesterov
2015-03-18 18:32             ` Andy Lutomirski
2015-03-18 19:13             ` Cyrill Gorcunov
2015-03-18 18:13       ` Cyrill Gorcunov
2015-03-18 18:31         ` Oleg Nesterov
2015-03-18 18:50           ` Cyrill Gorcunov
2015-03-18 19:52             ` Andrey Wagin
2015-03-18 20:02               ` Oleg Nesterov
2015-03-18 21:26                 ` Andy Lutomirski
2015-03-18 21:34                   ` Pavel Emelyanov
2015-03-18 22:03                     ` Andy Lutomirski
2015-03-19  7:35                       ` Cyrill Gorcunov
2015-03-19 16:08                         ` Andy Lutomirski
2015-03-19 16:19                           ` Cyrill Gorcunov
2015-03-20 11:43                         ` Denys Vlasenko
2015-03-20 11:56                           ` Cyrill Gorcunov
2015-03-20 12:04                             ` Cyrill Gorcunov
2015-03-20 14:07                               ` Oleg Nesterov
2015-03-20 14:47                                 ` Cyrill Gorcunov
2015-04-10 21:59                                   ` Andy Lutomirski
2015-04-10 22:11                                     ` Cyrill Gorcunov
2015-04-10 22:16                                       ` Andy Lutomirski
2015-04-10 22:20                                         ` Cyrill Gorcunov
2015-03-12 20:57 ` [PATCH v3 2/2] x86_64,signal: Remove 'fs' and 'gs' from sigcontext Andy Lutomirski
2015-03-16 12:08   ` [tip:x86/asm] x86/signal/64: " tip-bot for Andy Lutomirski
2015-03-17  8:44   ` tip-bot for Andy Lutomirski
2015-03-13 15:31 ` [PATCH v3 0/2] x86_64: Sigcontext improvements Oleg Nesterov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.