All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
@ 2013-06-21 13:39 ` James Hogan
  0 siblings, 0 replies; 26+ messages in thread
From: James Hogan @ 2013-06-21 13:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: James Hogan, Ralf Baechle, Al Viro, Andrew Morton, Oleg Nesterov,
	Kees Cook, David Daney, Paul E. McKenney, David Howells,
	Dave Jones, linux-mips

MIPS has 128 signals, the highest of which has the number 128 (they
start from 1). The following command causes get_signal_to_deliver() to
pass this signal number straight through to do_group_exit() as the exit
code:

  strace sleep 10 & sleep 1 && kill -128 `pidof sleep`

However do_group_exit() checks for the core dump bit (0x80) in the exit
code which matches in this particular case and the kernel panics:

  BUG_ON(exit_code & 0x80); /* core dumps don't get here */

Fundamentally the exit / wait status code cannot represent SIG128. In
fact it cannot represent SIG127 either as 0x7f represents a stopped
child.

Therefore add sig_to_exitcode() and exitcode_to_sig() functions which
map signal numbers > 126 to exit code 126 and puts the remainder (i.e.
sig - 126) in higher bits. This allows WIFSIGNALED() to return true for
both SIG127 and SIG128, and allows WTERMSIG to be later updated to read
the correct signal number for SIG127 and SIG128.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: David Daney <david.daney@cavium.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: linux-mips@linux-mips.org
---
v3:

A slightly different approach this time, closer to the original patch I
sent. This is because reducing _NSIG to 127 (like v2) still leaves
incorrect exit status codes for SIG127. The only ABI this changes is the
wait/waitpid status code, and it's in such a way that old binaries, as
long as they use the macros defined in the wait manpage, should see a
process terminated by signal 126 for SIG127 and SIG128 rather than
!WIFSIGNALED(). Software rebuilt with updated libc wait status macros
would see the correct terminating signal number.

 kernel/signal.c | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 113411b..212598c 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -941,6 +941,32 @@ static inline int wants_signal(int sig, struct task_struct *p)
 	return task_curr(p) || !signal_pending(p);
 }
 
+/*
+ * MIPS has 128 signals which doesn't play nicely with the wait status code.
+ * There are 7 bits to store the terminating signal number, with 0 and 127
+ * having special meanings (normal termination and stopped).
+ *
+ * Since signals 127 and 128 cannot be directly represented, cap the signal
+ * number to 126 (so WIFSIGNALED works) and store remainder in some higher bits.
+ */
+
+static inline int exitcode_to_sig(int exit_code)
+{
+#if _NSIG > 126
+	exit_code = (exit_code & 0x7f) + ((exit_code >> 16) & 0xff);
+#endif
+	return exit_code;
+}
+
+static inline int sig_to_exitcode(int sig)
+{
+#if _NSIG > 126
+	if (sig > 126)
+		sig = 126 | (sig - 126) << 16;
+#endif
+	return sig;
+}
+
 static void complete_signal(int sig, struct task_struct *p, int group)
 {
 	struct signal_struct *signal = p->signal;
@@ -997,7 +1023,7 @@ static void complete_signal(int sig, struct task_struct *p, int group)
 			 * thread has the fatal signal pending.
 			 */
 			signal->flags = SIGNAL_GROUP_EXIT;
-			signal->group_exit_code = sig;
+			signal->group_exit_code = sig_to_exitcode(sig);
 			signal->group_stop_count = 0;
 			t = p;
 			do {
@@ -1770,7 +1796,7 @@ static void do_notify_parent_cldstop(struct task_struct *tsk,
  		info.si_status = SIGCONT;
  		break;
  	case CLD_STOPPED:
- 		info.si_status = tsk->signal->group_exit_code & 0x7f;
+		info.si_status = exitcode_to_sig(tsk->signal->group_exit_code);
  		break;
  	case CLD_TRAPPED:
  		info.si_status = tsk->exit_code & 0x7f;
@@ -2367,7 +2393,7 @@ relock:
 		/*
 		 * Death signals, no core dump.
 		 */
-		do_group_exit(info->si_signo);
+		do_group_exit(sig_to_exitcode(info->si_signo));
 		/* NOTREACHED */
 	}
 	spin_unlock_irq(&sighand->siglock);
-- 
1.8.1.2



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
@ 2013-06-21 13:39 ` James Hogan
  0 siblings, 0 replies; 26+ messages in thread
From: James Hogan @ 2013-06-21 13:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: James Hogan, Ralf Baechle, Al Viro, Andrew Morton, Oleg Nesterov,
	Kees Cook, David Daney, Paul E. McKenney, David Howells,
	Dave Jones, linux-mips

MIPS has 128 signals, the highest of which has the number 128 (they
start from 1). The following command causes get_signal_to_deliver() to
pass this signal number straight through to do_group_exit() as the exit
code:

  strace sleep 10 & sleep 1 && kill -128 `pidof sleep`

However do_group_exit() checks for the core dump bit (0x80) in the exit
code which matches in this particular case and the kernel panics:

  BUG_ON(exit_code & 0x80); /* core dumps don't get here */

Fundamentally the exit / wait status code cannot represent SIG128. In
fact it cannot represent SIG127 either as 0x7f represents a stopped
child.

Therefore add sig_to_exitcode() and exitcode_to_sig() functions which
map signal numbers > 126 to exit code 126 and puts the remainder (i.e.
sig - 126) in higher bits. This allows WIFSIGNALED() to return true for
both SIG127 and SIG128, and allows WTERMSIG to be later updated to read
the correct signal number for SIG127 and SIG128.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: David Daney <david.daney@cavium.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: linux-mips@linux-mips.org
---
v3:

A slightly different approach this time, closer to the original patch I
sent. This is because reducing _NSIG to 127 (like v2) still leaves
incorrect exit status codes for SIG127. The only ABI this changes is the
wait/waitpid status code, and it's in such a way that old binaries, as
long as they use the macros defined in the wait manpage, should see a
process terminated by signal 126 for SIG127 and SIG128 rather than
!WIFSIGNALED(). Software rebuilt with updated libc wait status macros
would see the correct terminating signal number.

 kernel/signal.c | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 113411b..212598c 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -941,6 +941,32 @@ static inline int wants_signal(int sig, struct task_struct *p)
 	return task_curr(p) || !signal_pending(p);
 }
 
+/*
+ * MIPS has 128 signals which doesn't play nicely with the wait status code.
+ * There are 7 bits to store the terminating signal number, with 0 and 127
+ * having special meanings (normal termination and stopped).
+ *
+ * Since signals 127 and 128 cannot be directly represented, cap the signal
+ * number to 126 (so WIFSIGNALED works) and store remainder in some higher bits.
+ */
+
+static inline int exitcode_to_sig(int exit_code)
+{
+#if _NSIG > 126
+	exit_code = (exit_code & 0x7f) + ((exit_code >> 16) & 0xff);
+#endif
+	return exit_code;
+}
+
+static inline int sig_to_exitcode(int sig)
+{
+#if _NSIG > 126
+	if (sig > 126)
+		sig = 126 | (sig - 126) << 16;
+#endif
+	return sig;
+}
+
 static void complete_signal(int sig, struct task_struct *p, int group)
 {
 	struct signal_struct *signal = p->signal;
@@ -997,7 +1023,7 @@ static void complete_signal(int sig, struct task_struct *p, int group)
 			 * thread has the fatal signal pending.
 			 */
 			signal->flags = SIGNAL_GROUP_EXIT;
-			signal->group_exit_code = sig;
+			signal->group_exit_code = sig_to_exitcode(sig);
 			signal->group_stop_count = 0;
 			t = p;
 			do {
@@ -1770,7 +1796,7 @@ static void do_notify_parent_cldstop(struct task_struct *tsk,
  		info.si_status = SIGCONT;
  		break;
  	case CLD_STOPPED:
- 		info.si_status = tsk->signal->group_exit_code & 0x7f;
+		info.si_status = exitcode_to_sig(tsk->signal->group_exit_code);
  		break;
  	case CLD_TRAPPED:
  		info.si_status = tsk->exit_code & 0x7f;
@@ -2367,7 +2393,7 @@ relock:
 		/*
 		 * Death signals, no core dump.
 		 */
-		do_group_exit(info->si_signo);
+		do_group_exit(sig_to_exitcode(info->si_signo));
 		/* NOTREACHED */
 	}
 	spin_unlock_irq(&sighand->siglock);
-- 
1.8.1.2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
  2013-06-21 13:39 ` James Hogan
  (?)
@ 2013-06-21 15:59 ` David Daney
  2013-06-21 16:12   ` Ralf Baechle
                     ` (2 more replies)
  -1 siblings, 3 replies; 26+ messages in thread
From: David Daney @ 2013-06-21 15:59 UTC (permalink / raw)
  To: James Hogan
  Cc: linux-kernel, Ralf Baechle, Al Viro, Andrew Morton,
	Oleg Nesterov, Kees Cook, David Daney, Paul E. McKenney,
	David Howells, Dave Jones, linux-mips

On 06/21/2013 06:39 AM, James Hogan wrote:
> MIPS has 128 signals, the highest of which has the number 128 (they
> start from 1). The following command causes get_signal_to_deliver() to
> pass this signal number straight through to do_group_exit() as the exit
> code:
>
>    strace sleep 10 & sleep 1 && kill -128 `pidof sleep`
>
> However do_group_exit() checks for the core dump bit (0x80) in the exit
> code which matches in this particular case and the kernel panics:
>
>    BUG_ON(exit_code & 0x80); /* core dumps don't get here */
>
> Fundamentally the exit / wait status code cannot represent SIG128. In
> fact it cannot represent SIG127 either as 0x7f represents a stopped
> child.
>
> Therefore add sig_to_exitcode() and exitcode_to_sig() functions which
> map signal numbers > 126 to exit code 126 and puts the remainder (i.e.
> sig - 126) in higher bits. This allows WIFSIGNALED() to return true for
> both SIG127 and SIG128, and allows WTERMSIG to be later updated to read
> the correct signal number for SIG127 and SIG128.

I really hate this approach.

Can we just change the ABI to reduce the number of signals so that all 
the standard C library wait related macros don't have to be changed?

Think about it, any user space program using signal numbers 127 and 128 
doesn't work correctly as things exist today, so removing those two will 
be no great loss.

David Daney


>
> Signed-off-by: James Hogan <james.hogan@imgtec.com>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: David Daney <david.daney@cavium.com>
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Cc: David Howells <dhowells@redhat.com>
> Cc: Dave Jones <davej@redhat.com>
> Cc: linux-mips@linux-mips.org
> ---
> v3:
>
> A slightly different approach this time, closer to the original patch I
> sent. This is because reducing _NSIG to 127 (like v2) still leaves
> incorrect exit status codes for SIG127. The only ABI this changes is the
> wait/waitpid status code, and it's in such a way that old binaries, as
> long as they use the macros defined in the wait manpage, should see a
> process terminated by signal 126 for SIG127 and SIG128 rather than
> !WIFSIGNALED(). Software rebuilt with updated libc wait status macros
> would see the correct terminating signal number.
>
>   kernel/signal.c | 32 +++++++++++++++++++++++++++++---
>   1 file changed, 29 insertions(+), 3 deletions(-)
>


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
  2013-06-21 15:59 ` David Daney
@ 2013-06-21 16:12   ` Ralf Baechle
  2013-06-21 20:22   ` Oleg Nesterov
  2013-06-24  9:26     ` James Hogan
  2 siblings, 0 replies; 26+ messages in thread
From: Ralf Baechle @ 2013-06-21 16:12 UTC (permalink / raw)
  To: David Daney
  Cc: James Hogan, linux-kernel, Al Viro, Andrew Morton, Oleg Nesterov,
	Kees Cook, David Daney, Paul E. McKenney, David Howells,
	Dave Jones, linux-mips

On Fri, Jun 21, 2013 at 08:59:32AM -0700, David Daney wrote:

> On 06/21/2013 06:39 AM, James Hogan wrote:
> >MIPS has 128 signals, the highest of which has the number 128 (they
> >start from 1). The following command causes get_signal_to_deliver() to
> >pass this signal number straight through to do_group_exit() as the exit
> >code:
> >
> >   strace sleep 10 & sleep 1 && kill -128 `pidof sleep`
> >
> >However do_group_exit() checks for the core dump bit (0x80) in the exit
> >code which matches in this particular case and the kernel panics:
> >
> >   BUG_ON(exit_code & 0x80); /* core dumps don't get here */
> >
> >Fundamentally the exit / wait status code cannot represent SIG128. In
> >fact it cannot represent SIG127 either as 0x7f represents a stopped
> >child.
> >
> >Therefore add sig_to_exitcode() and exitcode_to_sig() functions which
> >map signal numbers > 126 to exit code 126 and puts the remainder (i.e.
> >sig - 126) in higher bits. This allows WIFSIGNALED() to return true for
> >both SIG127 and SIG128, and allows WTERMSIG to be later updated to read
> >the correct signal number for SIG127 and SIG128.
> 
> I really hate this approach.
> 
> Can we just change the ABI to reduce the number of signals so that
> all the standard C library wait related macros don't have to be
> changed?

Changing the ABI is a very strong medicine that wants to be used very
carefully.

> Think about it, any user space program using signal numbers 127 and
> 128 doesn't work correctly as things exist today, so removing those
> two will be no great loss.

Glibc has it's own sigset_t of 1024 signals.  I wonder if it will even
use more than 64 signals.  Similar for other libcs.

  Ralf

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
  2013-06-21 15:59 ` David Daney
  2013-06-21 16:12   ` Ralf Baechle
@ 2013-06-21 20:22   ` Oleg Nesterov
  2013-06-21 20:45       ` David Daney
  2013-06-24  9:26     ` James Hogan
  2 siblings, 1 reply; 26+ messages in thread
From: Oleg Nesterov @ 2013-06-21 20:22 UTC (permalink / raw)
  To: David Daney
  Cc: James Hogan, linux-kernel, Ralf Baechle, Al Viro, Andrew Morton,
	Kees Cook, David Daney, Paul E. McKenney, David Howells,
	Dave Jones, linux-mips

On 06/21, David Daney wrote:
>
> On 06/21/2013 06:39 AM, James Hogan wrote:
>> Therefore add sig_to_exitcode() and exitcode_to_sig() functions which
>> map signal numbers > 126 to exit code 126 and puts the remainder (i.e.
>> sig - 126) in higher bits. This allows WIFSIGNALED() to return true for
>> both SIG127 and SIG128, and allows WTERMSIG to be later updated to read
>> the correct signal number for SIG127 and SIG128.
>
> I really hate this approach.
>
> Can we just change the ABI to reduce the number of signals so that all
> the standard C library wait related macros don't have to be changed?
>
> Think about it, any user space program using signal numbers 127 and 128
> doesn't work correctly as things exist today, so removing those two will
> be no great loss.

Oh, I agree.

Besides, this changes ABI anyway. And if we change it we can do this in
a more clean way, afaics. MIPS should simply use 2 bytes in exit_code for
signal number. Yes, this means we need replace 0x80/0x7f in exit.c by
ifdef'ed numbers. And yes, this means that WIFSIGNALED/etc should be
updated too, but this is also true with this patch.

Oleg.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
@ 2013-06-21 20:45       ` David Daney
  0 siblings, 0 replies; 26+ messages in thread
From: David Daney @ 2013-06-21 20:45 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: David Daney, James Hogan, linux-kernel, Ralf Baechle, Al Viro,
	Andrew Morton, Kees Cook, David Daney, Paul E. McKenney,
	David Howells, Dave Jones, linux-mips

On 06/21/2013 01:22 PM, Oleg Nesterov wrote:
> On 06/21, David Daney wrote:
>>
>> On 06/21/2013 06:39 AM, James Hogan wrote:
>>> Therefore add sig_to_exitcode() and exitcode_to_sig() functions which
>>> map signal numbers > 126 to exit code 126 and puts the remainder (i.e.
>>> sig - 126) in higher bits. This allows WIFSIGNALED() to return true for
>>> both SIG127 and SIG128, and allows WTERMSIG to be later updated to read
>>> the correct signal number for SIG127 and SIG128.
>>
>> I really hate this approach.
>>
>> Can we just change the ABI to reduce the number of signals so that all
>> the standard C library wait related macros don't have to be changed?
>>
>> Think about it, any user space program using signal numbers 127 and 128
>> doesn't work correctly as things exist today, so removing those two will
>> be no great loss.
>
> Oh, I agree.
>
> Besides, this changes ABI anyway. And if we change it we can do this in
> a more clean way, afaics. MIPS should simply use 2 bytes in exit_code for
> signal number.

Wouldn't that break *all* existing programs that use signals?  Perhaps I 
misunderstand what you are suggesting.

I am proposing that we just reduce the number of usable signals such 
that existing libc status checking macros/functions don't change in any way.

  Yes, this means we need replace 0x80/0x7f in exit.c by
> ifdef'ed numbers. And yes, this means that WIFSIGNALED/etc should be
> updated too, but this is also true with this patch.
>
> Oleg.
>
>


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
@ 2013-06-21 20:45       ` David Daney
  0 siblings, 0 replies; 26+ messages in thread
From: David Daney @ 2013-06-21 20:45 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: David Daney, James Hogan, linux-kernel, Ralf Baechle, Al Viro,
	Andrew Morton, Kees Cook, David Daney, Paul E. McKenney,
	David Howells, Dave Jones, linux-mips

On 06/21/2013 01:22 PM, Oleg Nesterov wrote:
> On 06/21, David Daney wrote:
>>
>> On 06/21/2013 06:39 AM, James Hogan wrote:
>>> Therefore add sig_to_exitcode() and exitcode_to_sig() functions which
>>> map signal numbers > 126 to exit code 126 and puts the remainder (i.e.
>>> sig - 126) in higher bits. This allows WIFSIGNALED() to return true for
>>> both SIG127 and SIG128, and allows WTERMSIG to be later updated to read
>>> the correct signal number for SIG127 and SIG128.
>>
>> I really hate this approach.
>>
>> Can we just change the ABI to reduce the number of signals so that all
>> the standard C library wait related macros don't have to be changed?
>>
>> Think about it, any user space program using signal numbers 127 and 128
>> doesn't work correctly as things exist today, so removing those two will
>> be no great loss.
>
> Oh, I agree.
>
> Besides, this changes ABI anyway. And if we change it we can do this in
> a more clean way, afaics. MIPS should simply use 2 bytes in exit_code for
> signal number.

Wouldn't that break *all* existing programs that use signals?  Perhaps I 
misunderstand what you are suggesting.

I am proposing that we just reduce the number of usable signals such 
that existing libc status checking macros/functions don't change in any way.

  Yes, this means we need replace 0x80/0x7f in exit.c by
> ifdef'ed numbers. And yes, this means that WIFSIGNALED/etc should be
> updated too, but this is also true with this patch.
>
> Oleg.
>
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
  2013-06-21 20:45       ` David Daney
  (?)
@ 2013-06-22 19:09       ` Oleg Nesterov
  2013-06-24  9:10           ` James Hogan
  -1 siblings, 1 reply; 26+ messages in thread
From: Oleg Nesterov @ 2013-06-22 19:09 UTC (permalink / raw)
  To: David Daney
  Cc: David Daney, James Hogan, linux-kernel, Ralf Baechle, Al Viro,
	Andrew Morton, Kees Cook, David Daney, Paul E. McKenney,
	David Howells, Dave Jones, linux-mips

On 06/21, David Daney wrote:
>
> On 06/21/2013 01:22 PM, Oleg Nesterov wrote:
>> On 06/21, David Daney wrote:
>>>
>>> On 06/21/2013 06:39 AM, James Hogan wrote:
>>>> Therefore add sig_to_exitcode() and exitcode_to_sig() functions which
>>>> map signal numbers > 126 to exit code 126 and puts the remainder (i.e.
>>>> sig - 126) in higher bits. This allows WIFSIGNALED() to return true for
>>>> both SIG127 and SIG128, and allows WTERMSIG to be later updated to read
>>>> the correct signal number for SIG127 and SIG128.
>>>
>>> I really hate this approach.
>>>
>>> Can we just change the ABI to reduce the number of signals so that all
>>> the standard C library wait related macros don't have to be changed?
>>>
>>> Think about it, any user space program using signal numbers 127 and 128
>>> doesn't work correctly as things exist today, so removing those two will
>>> be no great loss.
>>
>> Oh, I agree.
>>
>> Besides, this changes ABI anyway. And if we change it we can do this in
>> a more clean way, afaics. MIPS should simply use 2 bytes in exit_code for
>> signal number.
>
> Wouldn't that break *all* existing programs that use signals?  Perhaps I
> misunderstand what you are suggesting.

Of course this will break the userspace more than the original patch,
that is why I said "And yes, this means that WIFSIGNALED/etc should
be updated".

> I am proposing that we just reduce the number of usable signals such
> that existing libc status checking macros/functions don't change in any
> way.

And I fully agree! Absolutely, sorry for confusion.


What I tried to say, _if_ we change the ABI instead, lets make this
change sane.

To me this hack is not sane. And btw, the patch doesn't look complete.
Say, wait_task_zombie() should do exitcode_to_sig() for ->si_status.

Oleg.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
@ 2013-06-24  9:10           ` James Hogan
  0 siblings, 0 replies; 26+ messages in thread
From: James Hogan @ 2013-06-24  9:10 UTC (permalink / raw)
  To: Oleg Nesterov, David Daney
  Cc: David Daney, linux-kernel, Ralf Baechle, Al Viro, Andrew Morton,
	Kees Cook, David Daney, Paul E. McKenney, David Howells,
	Dave Jones, linux-mips

On 22/06/13 20:09, Oleg Nesterov wrote:
> On 06/21, David Daney wrote:
>> I am proposing that we just reduce the number of usable signals such
>> that existing libc status checking macros/functions don't change in any
>> way.
> 
> And I fully agree! Absolutely, sorry for confusion.
> 
> 
> What I tried to say, _if_ we change the ABI instead, lets make this
> change sane.

I agree that this approach isn't very nice (I was really just trying to
explore the options) and reducing the number of signals is nicer. But is
anybody here confident enough that the number of signals changing under
the feet of existing binaries/libc won't actually break anything real?
I.e. anything trying to use SIGRTMAX() to get a lower priority signal.

> 
> To me this hack is not sane. And btw, the patch doesn't look complete.
> Say, wait_task_zombie() should do exitcode_to_sig() for ->si_status.

Ah yes, I didn't seen that.

Cheers
James


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
@ 2013-06-24  9:10           ` James Hogan
  0 siblings, 0 replies; 26+ messages in thread
From: James Hogan @ 2013-06-24  9:10 UTC (permalink / raw)
  To: Oleg Nesterov, David Daney
  Cc: David Daney, linux-kernel, Ralf Baechle, Al Viro, Andrew Morton,
	Kees Cook, David Daney, Paul E. McKenney, David Howells,
	Dave Jones, linux-mips

On 22/06/13 20:09, Oleg Nesterov wrote:
> On 06/21, David Daney wrote:
>> I am proposing that we just reduce the number of usable signals such
>> that existing libc status checking macros/functions don't change in any
>> way.
> 
> And I fully agree! Absolutely, sorry for confusion.
> 
> 
> What I tried to say, _if_ we change the ABI instead, lets make this
> change sane.

I agree that this approach isn't very nice (I was really just trying to
explore the options) and reducing the number of signals is nicer. But is
anybody here confident enough that the number of signals changing under
the feet of existing binaries/libc won't actually break anything real?
I.e. anything trying to use SIGRTMAX() to get a lower priority signal.

> 
> To me this hack is not sane. And btw, the patch doesn't look complete.
> Say, wait_task_zombie() should do exitcode_to_sig() for ->si_status.

Ah yes, I didn't seen that.

Cheers
James

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
@ 2013-06-24  9:26     ` James Hogan
  0 siblings, 0 replies; 26+ messages in thread
From: James Hogan @ 2013-06-24  9:26 UTC (permalink / raw)
  To: David Daney
  Cc: linux-kernel, Ralf Baechle, Al Viro, Andrew Morton,
	Oleg Nesterov, Kees Cook, David Daney, Paul E. McKenney,
	David Howells, Dave Jones, linux-mips

On 21/06/13 16:59, David Daney wrote:
> On 06/21/2013 06:39 AM, James Hogan wrote:
>> MIPS has 128 signals, the highest of which has the number 128 (they
>> start from 1). The following command causes get_signal_to_deliver() to
>> pass this signal number straight through to do_group_exit() as the exit
>> code:
>>
>>    strace sleep 10 & sleep 1 && kill -128 `pidof sleep`
>>
>> However do_group_exit() checks for the core dump bit (0x80) in the exit
>> code which matches in this particular case and the kernel panics:
>>
>>    BUG_ON(exit_code & 0x80); /* core dumps don't get here */
>>
>> Fundamentally the exit / wait status code cannot represent SIG128. In
>> fact it cannot represent SIG127 either as 0x7f represents a stopped
>> child.
>>
>> Therefore add sig_to_exitcode() and exitcode_to_sig() functions which
>> map signal numbers > 126 to exit code 126 and puts the remainder (i.e.
>> sig - 126) in higher bits. This allows WIFSIGNALED() to return true for
>> both SIG127 and SIG128, and allows WTERMSIG to be later updated to read
>> the correct signal number for SIG127 and SIG128.
> 
> I really hate this approach.
> 
> Can we just change the ABI to reduce the number of signals so that all
> the standard C library wait related macros don't have to be changed?
> 
> Think about it, any user space program using signal numbers 127 and 128
> doesn't work correctly as things exist today, so removing those two will
> be no great loss.

To clarify, I believe signals 127 and 128 do work during normal
operation. It's only if they cause program termination that odd things
happen in the wait status code (and of course the BUG_ON).

So reducing the number of signals will have potential to break things,
especially reducing below 127 (because glibc only reports SIGRTMAX()=127
already).

Cheers
James


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
@ 2013-06-24  9:26     ` James Hogan
  0 siblings, 0 replies; 26+ messages in thread
From: James Hogan @ 2013-06-24  9:26 UTC (permalink / raw)
  To: David Daney
  Cc: linux-kernel, Ralf Baechle, Al Viro, Andrew Morton,
	Oleg Nesterov, Kees Cook, David Daney, Paul E. McKenney,
	David Howells, Dave Jones, linux-mips

On 21/06/13 16:59, David Daney wrote:
> On 06/21/2013 06:39 AM, James Hogan wrote:
>> MIPS has 128 signals, the highest of which has the number 128 (they
>> start from 1). The following command causes get_signal_to_deliver() to
>> pass this signal number straight through to do_group_exit() as the exit
>> code:
>>
>>    strace sleep 10 & sleep 1 && kill -128 `pidof sleep`
>>
>> However do_group_exit() checks for the core dump bit (0x80) in the exit
>> code which matches in this particular case and the kernel panics:
>>
>>    BUG_ON(exit_code & 0x80); /* core dumps don't get here */
>>
>> Fundamentally the exit / wait status code cannot represent SIG128. In
>> fact it cannot represent SIG127 either as 0x7f represents a stopped
>> child.
>>
>> Therefore add sig_to_exitcode() and exitcode_to_sig() functions which
>> map signal numbers > 126 to exit code 126 and puts the remainder (i.e.
>> sig - 126) in higher bits. This allows WIFSIGNALED() to return true for
>> both SIG127 and SIG128, and allows WTERMSIG to be later updated to read
>> the correct signal number for SIG127 and SIG128.
> 
> I really hate this approach.
> 
> Can we just change the ABI to reduce the number of signals so that all
> the standard C library wait related macros don't have to be changed?
> 
> Think about it, any user space program using signal numbers 127 and 128
> doesn't work correctly as things exist today, so removing those two will
> be no great loss.

To clarify, I believe signals 127 and 128 do work during normal
operation. It's only if they cause program termination that odd things
happen in the wait status code (and of course the BUG_ON).

So reducing the number of signals will have potential to break things,
especially reducing below 127 (because glibc only reports SIGRTMAX()=127
already).

Cheers
James

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
@ 2013-06-25 21:40             ` Andrew Morton
  0 siblings, 0 replies; 26+ messages in thread
From: Andrew Morton @ 2013-06-25 21:40 UTC (permalink / raw)
  To: James Hogan
  Cc: Oleg Nesterov, David Daney, David Daney, linux-kernel,
	Ralf Baechle, Al Viro, Kees Cook, David Daney, Paul E. McKenney,
	David Howells, Dave Jones, linux-mips

On Mon, 24 Jun 2013 10:10:08 +0100 James Hogan <james.hogan@imgtec.com> wrote:

> On 22/06/13 20:09, Oleg Nesterov wrote:
> > On 06/21, David Daney wrote:
> >> I am proposing that we just reduce the number of usable signals such
> >> that existing libc status checking macros/functions don't change in any
> >> way.
> > 
> > And I fully agree! Absolutely, sorry for confusion.
> > 
> > 
> > What I tried to say, _if_ we change the ABI instead, lets make this
> > change sane.
> 
> I agree that this approach isn't very nice (I was really just trying to
> explore the options) and reducing the number of signals is nicer. But is
> anybody here confident enough that the number of signals changing under
> the feet of existing binaries/libc won't actually break anything real?
> I.e. anything trying to use SIGRTMAX() to get a lower priority signal.

Meanwhile, unprivileged users can make a MIPS kernel go BUG.

How much of a problem is this?  Obviously less of a problem with MIPS
than it would be with some other CPU types, but I'd imagine it's still
awkward in some environments.

If this _is_ considered a problem, can we think of some nasty little
hack which at least makes the effects less damaging, which we can also
put into -stable kernels?


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
@ 2013-06-25 21:40             ` Andrew Morton
  0 siblings, 0 replies; 26+ messages in thread
From: Andrew Morton @ 2013-06-25 21:40 UTC (permalink / raw)
  To: James Hogan
  Cc: Oleg Nesterov, David Daney, David Daney, linux-kernel,
	Ralf Baechle, Al Viro, Kees Cook, David Daney, Paul E. McKenney,
	David Howells, Dave Jones, linux-mips

On Mon, 24 Jun 2013 10:10:08 +0100 James Hogan <james.hogan@imgtec.com> wrote:

> On 22/06/13 20:09, Oleg Nesterov wrote:
> > On 06/21, David Daney wrote:
> >> I am proposing that we just reduce the number of usable signals such
> >> that existing libc status checking macros/functions don't change in any
> >> way.
> > 
> > And I fully agree! Absolutely, sorry for confusion.
> > 
> > 
> > What I tried to say, _if_ we change the ABI instead, lets make this
> > change sane.
> 
> I agree that this approach isn't very nice (I was really just trying to
> explore the options) and reducing the number of signals is nicer. But is
> anybody here confident enough that the number of signals changing under
> the feet of existing binaries/libc won't actually break anything real?
> I.e. anything trying to use SIGRTMAX() to get a lower priority signal.

Meanwhile, unprivileged users can make a MIPS kernel go BUG.

How much of a problem is this?  Obviously less of a problem with MIPS
than it would be with some other CPU types, but I'd imagine it's still
awkward in some environments.

If this _is_ considered a problem, can we think of some nasty little
hack which at least makes the effects less damaging, which we can also
put into -stable kernels?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
  2013-06-25 21:40             ` Andrew Morton
  (?)
@ 2013-06-25 22:13             ` James Hogan
  2013-06-26 11:07                 ` James Hogan
  -1 siblings, 1 reply; 26+ messages in thread
From: James Hogan @ 2013-06-25 22:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oleg Nesterov, David Daney, David Daney, LKML, Ralf Baechle,
	Al Viro, Kees Cook, David Daney, Paul E. McKenney, David Howells,
	Dave Jones, linux-mips

On 25 June 2013 22:40, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Mon, 24 Jun 2013 10:10:08 +0100 James Hogan <james.hogan@imgtec.com> wrote:
>
>> On 22/06/13 20:09, Oleg Nesterov wrote:
>> > On 06/21, David Daney wrote:
>> >> I am proposing that we just reduce the number of usable signals such
>> >> that existing libc status checking macros/functions don't change in any
>> >> way.
>> >
>> > And I fully agree! Absolutely, sorry for confusion.
>> >
>> >
>> > What I tried to say, _if_ we change the ABI instead, lets make this
>> > change sane.
>>
>> I agree that this approach isn't very nice (I was really just trying to
>> explore the options) and reducing the number of signals is nicer. But is
>> anybody here confident enough that the number of signals changing under
>> the feet of existing binaries/libc won't actually break anything real?
>> I.e. anything trying to use SIGRTMAX() to get a lower priority signal.
>
> Meanwhile, unprivileged users can make a MIPS kernel go BUG.
>
> How much of a problem is this?  Obviously less of a problem with MIPS
> than it would be with some other CPU types, but I'd imagine it's still
> awkward in some environments.
>
> If this _is_ considered a problem, can we think of some nasty little
> hack which at least makes the effects less damaging, which we can also
> put into -stable kernels?

The first rfc patch I sent sort of satisfies that by passing 127 if
sig==128, or slightly better would be passing 126 if sig>=127 (so that
SIFSIGNALED returns true). Effectively #ifdef'ing it on _NSIG>127 as
this patch does may be preferable too.

That's probably the minimum change necessary to evade the BUG_ON
without removing it. The wait status code will still be wrong, but it
wasn't exactly right before so it's no worse.

IMO changing the ABI by reducing _NSIG to 127 or 126 isn't appropriate
for stable.

Cheers
James

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
  2013-06-25 22:13             ` James Hogan
  2013-06-26 11:07                 ` James Hogan
@ 2013-06-26 11:07                 ` James Hogan
  0 siblings, 0 replies; 26+ messages in thread
From: James Hogan @ 2013-06-26 11:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oleg Nesterov, David Daney, David Daney, LKML, Ralf Baechle,
	Al Viro, Kees Cook, David Daney, Paul E. McKenney, David Howells,
	Dave Jones, linux-mips, stable

On 25/06/13 23:13, James Hogan wrote:
> On 25 June 2013 22:40, Andrew Morton <akpm@linux-foundation.org> wrote:
>> Meanwhile, unprivileged users can make a MIPS kernel go BUG.
>>
>> How much of a problem is this?  Obviously less of a problem with MIPS
>> than it would be with some other CPU types, but I'd imagine it's still
>> awkward in some environments.
>>
>> If this _is_ considered a problem, can we think of some nasty little
>> hack which at least makes the effects less damaging, which we can also
>> put into -stable kernels?
> 
> The first rfc patch I sent sort of satisfies that by passing 127 if
> sig==128, or slightly better would be passing 126 if sig>=127 (so that
> SIFSIGNALED returns true). Effectively #ifdef'ing it on _NSIG>127 as
> this patch does may be preferable too.
> 
> That's probably the minimum change necessary to evade the BUG_ON
> without removing it. The wait status code will still be wrong, but it
> wasn't exactly right before so it's no worse.
> 
> IMO changing the ABI by reducing _NSIG to 127 or 126 isn't appropriate
> for stable.

How does this look for a nasty/stable fix?

>From 94d734526d61f5c74fd2df1c3ecb677495fc7a23 Mon Sep 17 00:00:00 2001
From: James Hogan <james.hogan@imgtec.com>
Date: Wed, 26 Jun 2013 11:48:11 +0100
Subject: [PATCH 1/1] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)

MIPS has 128 signals, the highest of which has the number 128 (they
start from 1). The following command causes get_signal_to_deliver() to
pass this signal number straight through to do_group_exit() as the exit
code:

  strace sleep 10 & sleep 1 && kill -128 `pidof sleep`

However do_group_exit() checks for the core dump bit (0x80) in the exit
code which matches in this particular case and the kernel panics:

  BUG_ON(exit_code & 0x80); /* core dumps don't get here */

As a quick fix, mask out higher bits in the signal number. This
effectively matches the exit code from other code paths but avoids the
BUG_ON.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: David Daney <david.daney@cavium.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org
---
 kernel/signal.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 113411b..9ea8f4f 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2366,8 +2366,14 @@ relock:
 
 		/*
 		 * Death signals, no core dump.
+		 *
+		 * Some architectures (MIPS) have 128 signals which doesn't play
+		 * nicely with the exit code since there are only 7 bits to
+		 * store the terminating signal number. Mask out higher bits to
+		 * avoid overflowing into the core dump bit and triggering
+		 * BUG_ON in do_group_exit.
 		 */
-		do_group_exit(info->si_signo);
+		do_group_exit(info->si_signo & 0x7f);
 		/* NOTREACHED */
 	}
 	spin_unlock_irq(&sighand->siglock);
-- 
1.8.1.2


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
@ 2013-06-26 11:07                 ` James Hogan
  0 siblings, 0 replies; 26+ messages in thread
From: James Hogan @ 2013-06-26 11:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oleg Nesterov, David Daney, David Daney, LKML, Ralf Baechle,
	Al Viro, Kees Cook, David Daney, Paul E. McKenney, David Howells,
	Dave Jones, linux-mips, stable

On 25/06/13 23:13, James Hogan wrote:
> On 25 June 2013 22:40, Andrew Morton <akpm@linux-foundation.org> wrote:
>> Meanwhile, unprivileged users can make a MIPS kernel go BUG.
>>
>> How much of a problem is this?  Obviously less of a problem with MIPS
>> than it would be with some other CPU types, but I'd imagine it's still
>> awkward in some environments.
>>
>> If this _is_ considered a problem, can we think of some nasty little
>> hack which at least makes the effects less damaging, which we can also
>> put into -stable kernels?
> 
> The first rfc patch I sent sort of satisfies that by passing 127 if
> sig==128, or slightly better would be passing 126 if sig>=127 (so that
> SIFSIGNALED returns true). Effectively #ifdef'ing it on _NSIG>127 as
> this patch does may be preferable too.
> 
> That's probably the minimum change necessary to evade the BUG_ON
> without removing it. The wait status code will still be wrong, but it
> wasn't exactly right before so it's no worse.
> 
> IMO changing the ABI by reducing _NSIG to 127 or 126 isn't appropriate
> for stable.

How does this look for a nasty/stable fix?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
@ 2013-06-26 11:07                 ` James Hogan
  0 siblings, 0 replies; 26+ messages in thread
From: James Hogan @ 2013-06-26 11:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oleg Nesterov, David Daney, David Daney, LKML, Ralf Baechle,
	Al Viro, Kees Cook, David Daney, Paul E. McKenney, David Howells,
	Dave Jones, linux-mips, stable

On 25/06/13 23:13, James Hogan wrote:
> On 25 June 2013 22:40, Andrew Morton <akpm@linux-foundation.org> wrote:
>> Meanwhile, unprivileged users can make a MIPS kernel go BUG.
>>
>> How much of a problem is this?  Obviously less of a problem with MIPS
>> than it would be with some other CPU types, but I'd imagine it's still
>> awkward in some environments.
>>
>> If this _is_ considered a problem, can we think of some nasty little
>> hack which at least makes the effects less damaging, which we can also
>> put into -stable kernels?
> 
> The first rfc patch I sent sort of satisfies that by passing 127 if
> sig==128, or slightly better would be passing 126 if sig>=127 (so that
> SIFSIGNALED returns true). Effectively #ifdef'ing it on _NSIG>127 as
> this patch does may be preferable too.
> 
> That's probably the minimum change necessary to evade the BUG_ON
> without removing it. The wait status code will still be wrong, but it
> wasn't exactly right before so it's no worse.
> 
> IMO changing the ABI by reducing _NSIG to 127 or 126 isn't appropriate
> for stable.

How does this look for a nasty/stable fix?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
  2013-06-26 11:07                 ` James Hogan
  (?)
  (?)
@ 2013-06-26 16:01                 ` Ralf Baechle
  -1 siblings, 0 replies; 26+ messages in thread
From: Ralf Baechle @ 2013-06-26 16:01 UTC (permalink / raw)
  To: James Hogan
  Cc: Andrew Morton, Oleg Nesterov, David Daney, David Daney, LKML,
	Al Viro, Kees Cook, David Daney, Paul E. McKenney, David Howells,
	Dave Jones, linux-mips, stable

On Wed, Jun 26, 2013 at 12:07:44PM +0100, James Hogan wrote:

> > IMO changing the ABI by reducing _NSIG to 127 or 126 isn't appropriate
> > for stable.
> 
> How does this look for a nasty/stable fix?

> diff --git a/kernel/signal.c b/kernel/signal.c
> index 113411b..9ea8f4f 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2366,8 +2366,14 @@ relock:
>  
>  		/*
>  		 * Death signals, no core dump.
> +		 *
> +		 * Some architectures (MIPS) have 128 signals which doesn't play
> +		 * nicely with the exit code since there are only 7 bits to
> +		 * store the terminating signal number. Mask out higher bits to
> +		 * avoid overflowing into the core dump bit and triggering
> +		 * BUG_ON in do_group_exit.
>  		 */
> -		do_group_exit(info->si_signo);
> +		do_group_exit(info->si_signo & 0x7f);
>  		/* NOTREACHED */
>  	}
>  	spin_unlock_irq(&sighand->siglock);

Looks like something which I think we could live with.

Clearly it also scores in the "nasty" category, so fits the bill ;)

  Ralf

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
  2013-06-26 11:07                 ` James Hogan
                                   ` (2 preceding siblings ...)
  (?)
@ 2013-06-26 16:14                 ` Oleg Nesterov
  2013-06-26 16:59                   ` Ralf Baechle
  -1 siblings, 1 reply; 26+ messages in thread
From: Oleg Nesterov @ 2013-06-26 16:14 UTC (permalink / raw)
  To: James Hogan
  Cc: Andrew Morton, David Daney, David Daney, LKML, Ralf Baechle,
	Al Viro, Kees Cook, David Daney, Paul E. McKenney, David Howells,
	Dave Jones, linux-mips, stable

On 06/26, James Hogan wrote:
>
> On 25/06/13 23:13, James Hogan wrote:
>   BUG_ON(exit_code & 0x80); /* core dumps don't get here */
>
> As a quick fix, mask out higher bits in the signal number. This
> effectively matches the exit code from other code paths but avoids the
> BUG_ON.
> 
> Signed-off-by: James Hogan <james.hogan@imgtec.com>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: David Daney <david.daney@cavium.com>
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Cc: David Howells <dhowells@redhat.com>
> Cc: Dave Jones <davej@redhat.com>
> Cc: linux-mips@linux-mips.org
> Cc: stable@vger.kernel.org
> ---
>  kernel/signal.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 113411b..9ea8f4f 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2366,8 +2366,14 @@ relock:
>  
>  		/*
>  		 * Death signals, no core dump.
> +		 *
> +		 * Some architectures (MIPS) have 128 signals which doesn't play
> +		 * nicely with the exit code since there are only 7 bits to
> +		 * store the terminating signal number. Mask out higher bits to
> +		 * avoid overflowing into the core dump bit and triggering
> +		 * BUG_ON in do_group_exit.
>  		 */
> -		do_group_exit(info->si_signo);
> +		do_group_exit(info->si_signo & 0x7f);

Or simply remove the BUG_ON(), this can equally confuse wait(status).
128 & 0x7f == 0.

Still I think it would be better to change _NSIG on mips.

Oleg.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
  2013-06-26 16:14                 ` Oleg Nesterov
@ 2013-06-26 16:59                   ` Ralf Baechle
  2013-06-26 17:15                     ` Oleg Nesterov
  2013-06-28 20:09                     ` Denys Vlasenko
  0 siblings, 2 replies; 26+ messages in thread
From: Ralf Baechle @ 2013-06-26 16:59 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: James Hogan, Andrew Morton, David Daney, David Daney, LKML,
	Al Viro, Kees Cook, David Daney, Paul E. McKenney, David Howells,
	Dave Jones, linux-mips, stable

On Wed, Jun 26, 2013 at 06:14:52PM +0200, Oleg Nesterov wrote:

> Or simply remove the BUG_ON(), this can equally confuse wait(status).
> 128 & 0x7f == 0.
> 
> Still I think it would be better to change _NSIG on mips.

If it was that easy.  That's going to outright break binary compatibility,
see kernel/signal.c:

SYSCALL_DEFINE4(rt_sigprocmask, int, how, sigset_t __user *, nset,
                sigset_t __user *, oset, size_t, sigsetsize)
{
        sigset_t old_set, new_set;
        int error;

        /* XXX: Don't preclude handling different sized sigset_t's.  */
        if (sigsetsize != sizeof(sigset_t))
                return -EINVAL;

There are several more more syscalls performing tests like the above.

So at least the kernel sigset_t will have to remain constant, maybe something
like below, totally untested patch which I'm sure is going to open a few
20 foot containers full of worms such as NSIG being defined by glibc to 128
and fixing the kernel won't magically change installed libc headers or
binaries incorporating NSIG.

  Ralf

 arch/mips/include/uapi/asm/signal.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/mips/include/uapi/asm/signal.h b/arch/mips/include/uapi/asm/signal.h
index addb9f5..8bba323 100644
--- a/arch/mips/include/uapi/asm/signal.h
+++ b/arch/mips/include/uapi/asm/signal.h
@@ -11,12 +11,13 @@
 
 #include <linux/types.h>
 
-#define _NSIG		128
+#define _NSIG		64
 #define _NSIG_BPW	(sizeof(unsigned long) * 8)
 #define _NSIG_WORDS	(_NSIG / _NSIG_BPW)
 
 typedef struct {
 	unsigned long sig[_NSIG_WORDS];
+	unsigned long __fill[_NSIG_WORDS];
 } sigset_t;
 
 typedef unsigned long old_sigset_t;		/* at least 32 bits */

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
  2013-06-26 16:59                   ` Ralf Baechle
@ 2013-06-26 17:15                     ` Oleg Nesterov
  2013-06-28 12:07                         ` James Hogan
  2013-06-28 20:09                     ` Denys Vlasenko
  1 sibling, 1 reply; 26+ messages in thread
From: Oleg Nesterov @ 2013-06-26 17:15 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: James Hogan, Andrew Morton, David Daney, David Daney, LKML,
	Al Viro, Kees Cook, David Daney, Paul E. McKenney, David Howells,
	Dave Jones, linux-mips, stable

On 06/26, Ralf Baechle wrote:
>
> On Wed, Jun 26, 2013 at 06:14:52PM +0200, Oleg Nesterov wrote:
>
> > Or simply remove the BUG_ON(), this can equally confuse wait(status).
> > 128 & 0x7f == 0.
> >
> > Still I think it would be better to change _NSIG on mips.
>
> If it was that easy.  That's going to outright break binary compatibility,
> see kernel/signal.c:
>
> SYSCALL_DEFINE4(rt_sigprocmask, int, how, sigset_t __user *, nset,
>                 sigset_t __user *, oset, size_t, sigsetsize)
> {
>         sigset_t old_set, new_set;
>         int error;
>
>         /* XXX: Don't preclude handling different sized sigset_t's.  */
>         if (sigsetsize != sizeof(sigset_t))
>                 return -EINVAL;

I meant the minimal hack like

	--- x/arch/mips/include/uapi/asm/signal.h
	+++ x/arch/mips/include/uapi/asm/signal.h
	@@ -11,9 +11,9 @@
	 
	 #include <linux/types.h>
	 
	-#define _NSIG		128
	+#define _NSIG		127
	 #define _NSIG_BPW	(sizeof(unsigned long) * 8)
	-#define _NSIG_WORDS	(_NSIG / _NSIG_BPW)
	+#define _NSIG_WORDS	DIV_ROUND_UP(_NSIG / _NSIG_BPW)
	 
	 typedef struct {
		unsigned long sig[_NSIG_WORDS];

just to avoid BUG_ON().

I agree that _NSIG == 126 or 64 needs more discussion. Although personally
I think this is the only choice in the long term, or we should change ABI
and break user-space completely.

And, just in case, the hack above doesn't kill SIG_128 completely.
Say, the task can block/unblock it.

Oleg.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
@ 2013-06-28 12:07                         ` James Hogan
  0 siblings, 0 replies; 26+ messages in thread
From: James Hogan @ 2013-06-28 12:07 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ralf Baechle, Andrew Morton, David Daney, David Daney, LKML,
	Al Viro, Kees Cook, David Daney, Paul E. McKenney, David Howells,
	Dave Jones, linux-mips, stable

On 26/06/13 18:15, Oleg Nesterov wrote:
> On 06/26, Ralf Baechle wrote:
>>
>> On Wed, Jun 26, 2013 at 06:14:52PM +0200, Oleg Nesterov wrote:
>>
>>> Or simply remove the BUG_ON(), this can equally confuse wait(status).
>>> 128 & 0x7f == 0.
>>>
>>> Still I think it would be better to change _NSIG on mips.
>>
>> If it was that easy.  That's going to outright break binary compatibility,
>> see kernel/signal.c:
>>
>> SYSCALL_DEFINE4(rt_sigprocmask, int, how, sigset_t __user *, nset,
>>                 sigset_t __user *, oset, size_t, sigsetsize)
>> {
>>         sigset_t old_set, new_set;
>>         int error;
>>
>>         /* XXX: Don't preclude handling different sized sigset_t's.  */
>>         if (sigsetsize != sizeof(sigset_t))
>>                 return -EINVAL;
> 
> I meant the minimal hack like
> 
> 	--- x/arch/mips/include/uapi/asm/signal.h
> 	+++ x/arch/mips/include/uapi/asm/signal.h
> 	@@ -11,9 +11,9 @@
> 	 
> 	 #include <linux/types.h>
> 	 
> 	-#define _NSIG		128
> 	+#define _NSIG		127
> 	 #define _NSIG_BPW	(sizeof(unsigned long) * 8)
> 	-#define _NSIG_WORDS	(_NSIG / _NSIG_BPW)
> 	+#define _NSIG_WORDS	DIV_ROUND_UP(_NSIG / _NSIG_BPW)
> 	 
> 	 typedef struct {
> 		unsigned long sig[_NSIG_WORDS];
> 
> just to avoid BUG_ON().
> 
> I agree that _NSIG == 126 or 64 needs more discussion. Although personally
> I think this is the only choice in the long term, or we should change ABI
> and break user-space completely.
> 
> And, just in case, the hack above doesn't kill SIG_128 completely.
> Say, the task can block/unblock it.

Well it prevents a handler being added or the signal being sent, so it
pretty much does kill it (patch v2 did this). Since glibc already has
SIGRTMAX=127, nothing running glibc should be affected unless it's being
really stupid (uClibc and bionic is a different matter).

I've booted debian with only 64 signals and an appropriate printk in
do_sigaction, and although various things try and set all the handlers,
I didn't found anything that breaks because of them being missing.

I've also audited all the binaries installed on my desktop (Fedora 17)
which contain libc_current_sigrtmax to see if I can find anything
problematic. Various things iterate all the signals (which doesn't do
any harm if a bunch are missing, especially when EINVAL is often
expected for SIGRTMIN..SIGRTMIN+2 because of libc using them). High
level language bindings open up some potential for breakage since
SIGRTMAX etc can be exposed to higher levels. The only real problem I've
found though is openjdk:

http://hg.openjdk.java.net/jdk7/jsn/jdk/file/tip/src/solaris/native/java/net/linux_close.c
57 /*
58  * Signal to unblock thread
59  */
60 static int sigWakeup = (__SIGRTMAX - 2);

It sets up a signal handler when the library is loaded (without any
error checking), and tries to send the signal to all threads blocked on
a file descriptor to wake them up (again without error checking), called
from NET_Dup2 and NET_SocketClose. Clearly this could easily break
backwards compatibility if SIGRTMAX is reduced any lower than 126.

Obviously this isn't exhaustive (I haven't tried android etc or actually
running many applications, and open source software probably isn't the
best example of badly written code), but it looks like it may be safe to
reduce _NSIG to 127 for a stable fix if nobody objects, and possibly to
126 in the future to avoid the wait exit status problem.

Cheers
James


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
@ 2013-06-28 12:07                         ` James Hogan
  0 siblings, 0 replies; 26+ messages in thread
From: James Hogan @ 2013-06-28 12:07 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ralf Baechle, Andrew Morton, David Daney, David Daney, LKML,
	Al Viro, Kees Cook, David Daney, Paul E. McKenney, David Howells,
	Dave Jones, linux-mips, stable

On 26/06/13 18:15, Oleg Nesterov wrote:
> On 06/26, Ralf Baechle wrote:
>>
>> On Wed, Jun 26, 2013 at 06:14:52PM +0200, Oleg Nesterov wrote:
>>
>>> Or simply remove the BUG_ON(), this can equally confuse wait(status).
>>> 128 & 0x7f == 0.
>>>
>>> Still I think it would be better to change _NSIG on mips.
>>
>> If it was that easy.  That's going to outright break binary compatibility,
>> see kernel/signal.c:
>>
>> SYSCALL_DEFINE4(rt_sigprocmask, int, how, sigset_t __user *, nset,
>>                 sigset_t __user *, oset, size_t, sigsetsize)
>> {
>>         sigset_t old_set, new_set;
>>         int error;
>>
>>         /* XXX: Don't preclude handling different sized sigset_t's.  */
>>         if (sigsetsize != sizeof(sigset_t))
>>                 return -EINVAL;
> 
> I meant the minimal hack like
> 
> 	--- x/arch/mips/include/uapi/asm/signal.h
> 	+++ x/arch/mips/include/uapi/asm/signal.h
> 	@@ -11,9 +11,9 @@
> 	 
> 	 #include <linux/types.h>
> 	 
> 	-#define _NSIG		128
> 	+#define _NSIG		127
> 	 #define _NSIG_BPW	(sizeof(unsigned long) * 8)
> 	-#define _NSIG_WORDS	(_NSIG / _NSIG_BPW)
> 	+#define _NSIG_WORDS	DIV_ROUND_UP(_NSIG / _NSIG_BPW)
> 	 
> 	 typedef struct {
> 		unsigned long sig[_NSIG_WORDS];
> 
> just to avoid BUG_ON().
> 
> I agree that _NSIG == 126 or 64 needs more discussion. Although personally
> I think this is the only choice in the long term, or we should change ABI
> and break user-space completely.
> 
> And, just in case, the hack above doesn't kill SIG_128 completely.
> Say, the task can block/unblock it.

Well it prevents a handler being added or the signal being sent, so it
pretty much does kill it (patch v2 did this). Since glibc already has
SIGRTMAX=127, nothing running glibc should be affected unless it's being
really stupid (uClibc and bionic is a different matter).

I've booted debian with only 64 signals and an appropriate printk in
do_sigaction, and although various things try and set all the handlers,
I didn't found anything that breaks because of them being missing.

I've also audited all the binaries installed on my desktop (Fedora 17)
which contain libc_current_sigrtmax to see if I can find anything
problematic. Various things iterate all the signals (which doesn't do
any harm if a bunch are missing, especially when EINVAL is often
expected for SIGRTMIN..SIGRTMIN+2 because of libc using them). High
level language bindings open up some potential for breakage since
SIGRTMAX etc can be exposed to higher levels. The only real problem I've
found though is openjdk:

http://hg.openjdk.java.net/jdk7/jsn/jdk/file/tip/src/solaris/native/java/net/linux_close.c
57 /*
58  * Signal to unblock thread
59  */
60 static int sigWakeup = (__SIGRTMAX - 2);

It sets up a signal handler when the library is loaded (without any
error checking), and tries to send the signal to all threads blocked on
a file descriptor to wake them up (again without error checking), called
from NET_Dup2 and NET_SocketClose. Clearly this could easily break
backwards compatibility if SIGRTMAX is reduced any lower than 126.

Obviously this isn't exhaustive (I haven't tried android etc or actually
running many applications, and open source software probably isn't the
best example of badly written code), but it looks like it may be safe to
reduce _NSIG to 127 for a stable fix if nobody objects, and possibly to
126 in the future to avoid the wait exit status problem.

Cheers
James

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
  2013-06-28 12:07                         ` James Hogan
  (?)
@ 2013-06-28 17:55                         ` Oleg Nesterov
  -1 siblings, 0 replies; 26+ messages in thread
From: Oleg Nesterov @ 2013-06-28 17:55 UTC (permalink / raw)
  To: James Hogan
  Cc: Ralf Baechle, Andrew Morton, David Daney, David Daney, LKML,
	Al Viro, Kees Cook, David Daney, Paul E. McKenney, David Howells,
	Dave Jones, linux-mips, stable

On 06/28, James Hogan wrote:
>
> On 26/06/13 18:15, Oleg Nesterov wrote:
> >
> > I meant the minimal hack like
> >
> > 	--- x/arch/mips/include/uapi/asm/signal.h
> > 	+++ x/arch/mips/include/uapi/asm/signal.h
> > 	@@ -11,9 +11,9 @@
> >
> > 	 #include <linux/types.h>
> >
> > 	-#define _NSIG		128
> > 	+#define _NSIG		127
> > 	 #define _NSIG_BPW	(sizeof(unsigned long) * 8)
> > 	-#define _NSIG_WORDS	(_NSIG / _NSIG_BPW)
> > 	+#define _NSIG_WORDS	DIV_ROUND_UP(_NSIG / _NSIG_BPW)
> >
> > 	 typedef struct {
> > 		unsigned long sig[_NSIG_WORDS];
> >
> > just to avoid BUG_ON().
> >
> > I agree that _NSIG == 126 or 64 needs more discussion. Although personally
> > I think this is the only choice in the long term, or we should change ABI
> > and break user-space completely.
> >
> > And, just in case, the hack above doesn't kill SIG_128 completely.
> > Say, the task can block/unblock it.
>
> Well it prevents a handler being added or the signal being sent, so it
> pretty much does kill it (patch v2 did this).

Yes, iirc you already sent something like the hack above.

> but it looks like it may be safe to
> reduce _NSIG to 127 for a stable fix

This was my point.

Sure, this change can break something anyway, we can't know if nobody
ever uses 128 anyway. But this is better than the ability to crash the
kernel. No need to use strace, just block(128) + kill(128) + unblock().

So perhaps you can resend your patch? Just I think it makes sense to
update the changelog to explain that this is not the "final" solution
but the minimal fix.

Oleg.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS)
  2013-06-26 16:59                   ` Ralf Baechle
  2013-06-26 17:15                     ` Oleg Nesterov
@ 2013-06-28 20:09                     ` Denys Vlasenko
  1 sibling, 0 replies; 26+ messages in thread
From: Denys Vlasenko @ 2013-06-28 20:09 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: Oleg Nesterov, James Hogan, Andrew Morton, David Daney,
	David Daney, LKML, Al Viro, Kees Cook, David Daney,
	Paul E. McKenney, David Howells, Dave Jones, linux-mips, stable

On Wednesday 26 June 2013 18:59, Ralf Baechle wrote:
> On Wed, Jun 26, 2013 at 06:14:52PM +0200, Oleg Nesterov wrote:
> 
> > Or simply remove the BUG_ON(), this can equally confuse wait(status).
> > 128 & 0x7f == 0.
> > 
> > Still I think it would be better to change _NSIG on mips.
> 
> If it was that easy.  That's going to outright break binary compatibility,
> see kernel/signal.c:
> 
> SYSCALL_DEFINE4(rt_sigprocmask, int, how, sigset_t __user *, nset,
>                 sigset_t __user *, oset, size_t, sigsetsize)
> {
>         sigset_t old_set, new_set;
>         int error;
> 
>         /* XXX: Don't preclude handling different sized sigset_t's.  */
>         if (sigsetsize != sizeof(sigset_t))
>                 return -EINVAL;
> 
> There are several more more syscalls performing tests like the above.

Reducing _NSIG to 127 (or 126) doesn't change sigset_t, so should be fine wrt above.

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2013-06-28 20:09 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-06-21 13:39 [PATCH v3] kernel/signal.c: fix BUG_ON with SIG128 (MIPS) James Hogan
2013-06-21 13:39 ` James Hogan
2013-06-21 15:59 ` David Daney
2013-06-21 16:12   ` Ralf Baechle
2013-06-21 20:22   ` Oleg Nesterov
2013-06-21 20:45     ` David Daney
2013-06-21 20:45       ` David Daney
2013-06-22 19:09       ` Oleg Nesterov
2013-06-24  9:10         ` James Hogan
2013-06-24  9:10           ` James Hogan
2013-06-25 21:40           ` Andrew Morton
2013-06-25 21:40             ` Andrew Morton
2013-06-25 22:13             ` James Hogan
2013-06-26 11:07               ` James Hogan
2013-06-26 11:07                 ` James Hogan
2013-06-26 11:07                 ` James Hogan
2013-06-26 16:01                 ` Ralf Baechle
2013-06-26 16:14                 ` Oleg Nesterov
2013-06-26 16:59                   ` Ralf Baechle
2013-06-26 17:15                     ` Oleg Nesterov
2013-06-28 12:07                       ` James Hogan
2013-06-28 12:07                         ` James Hogan
2013-06-28 17:55                         ` Oleg Nesterov
2013-06-28 20:09                     ` Denys Vlasenko
2013-06-24  9:26   ` James Hogan
2013-06-24  9:26     ` James Hogan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.