linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Bug: retry of clone() on Alpha can result in zeroed process thread pointer
@ 2014-07-23  8:52 Michael Cree
  2014-07-24 18:19 ` Richard Henderson
  2014-07-30 19:30 ` Richard Henderson
  0 siblings, 2 replies; 4+ messages in thread
From: Michael Cree @ 2014-07-23  8:52 UTC (permalink / raw)
  To: linux-alpha; +Cc: linux-kernel, Richard Henderson

I am seeing a bug in clone() on the Alpha architecture.  Reported to
Debian as https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=755397

The test suite of glibc sometimes fails in the nptl/tst-eintr3 test
with a segmentation fault.  I have tracked it down to the thread
pointer returned by the rduniq PALcall is occasionally zero when
it should point to the TLS.  I have only ever seen this occur when
running a SMP kernel.

Running strace on nptl/tst-eintr3 reveals that the clone() syscall
is retried by the kernel if an ERESTARTNOINTR error occurs.  At
$syscall_error in arch/alpha/kernel/entry.S the kernel handles the
error and in doing that it writes to 72(sp) which is where the value
of the a3 CPU register on entry to the kernel is stored.  Then the
kernel retries the clone() function.  But the alpha specific code
for copy_thread() in arch/alpha/kernel/process.c does not use the
passed a3 cpu register (the argument tls), instead it goes to the
saved stack to get the value of the a3 register, which on the
second call to clone() has been modified to no longer be the value
of the a3 cpu register on entry to the kernel.  And a latent bomb
is laid for userspace in the form of an incorrect process unique
value (which is the thread pointer) in the PCB.

Am I correct in my analysis and, if so, can we get a fix for this
please.

Cheers
Michael.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Bug: retry of clone() on Alpha can result in zeroed process thread pointer
  2014-07-23  8:52 Bug: retry of clone() on Alpha can result in zeroed process thread pointer Michael Cree
@ 2014-07-24 18:19 ` Richard Henderson
  2014-07-24 19:30   ` Michael Cree
  2014-07-30 19:30 ` Richard Henderson
  1 sibling, 1 reply; 4+ messages in thread
From: Richard Henderson @ 2014-07-24 18:19 UTC (permalink / raw)
  To: Michael Cree, linux-alpha, linux-kernel

On 07/22/2014 10:52 PM, Michael Cree wrote:
> Running strace on nptl/tst-eintr3 reveals that the clone() syscall
> is retried by the kernel if an ERESTARTNOINTR error occurs.  At
> $syscall_error in arch/alpha/kernel/entry.S the kernel handles the
> error and in doing that it writes to 72(sp) which is where the value
> of the a3 CPU register on entry to the kernel is stored.  Then the
> kernel retries the clone() function.  But the alpha specific code
> for copy_thread() in arch/alpha/kernel/process.c does not use the
> passed a3 cpu register (the argument tls), instead it goes to the
> saved stack to get the value of the a3 register, which on the
> second call to clone() has been modified to no longer be the value
> of the a3 cpu register on entry to the kernel.  And a latent bomb
> is laid for userspace in the form of an incorrect process unique
> value (which is the thread pointer) in the PCB.
> 
> Am I correct in my analysis and, if so, can we get a fix for this
> please.

Well...  let me start with the assumption that we can't possibly restart unless
the syscall fails with -ERESTART*.

Before we clobber 72($sp), $syscall_error saves the old value in $19.  This is
the r19 parameter to do_work_pending, and is passed all the way down to
syscall_restart where we do restore the original value of a3 for ERESTARTNOINTR.

So if there's a path that leads to restart, but doesn't save a3 before
clobbering, I don't see it.  Do you have an strace dump that shows this?


r~

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Bug: retry of clone() on Alpha can result in zeroed process thread pointer
  2014-07-24 18:19 ` Richard Henderson
@ 2014-07-24 19:30   ` Michael Cree
  0 siblings, 0 replies; 4+ messages in thread
From: Michael Cree @ 2014-07-24 19:30 UTC (permalink / raw)
  To: Richard Henderson; +Cc: linux-alpha, linux-kernel

On Thu, Jul 24, 2014 at 08:19:52AM -1000, Richard Henderson wrote:
> On 07/22/2014 10:52 PM, Michael Cree wrote:
> > Running strace on nptl/tst-eintr3 reveals that the clone() syscall
> > is retried by the kernel if an ERESTARTNOINTR error occurs.  At
> > $syscall_error in arch/alpha/kernel/entry.S the kernel handles the
> > error and in doing that it writes to 72(sp) which is where the value
> > of the a3 CPU register on entry to the kernel is stored.  Then the
> > kernel retries the clone() function.  But the alpha specific code
> > for copy_thread() in arch/alpha/kernel/process.c does not use the
> > passed a3 cpu register (the argument tls), instead it goes to the
> > saved stack to get the value of the a3 register, which on the
> > second call to clone() has been modified to no longer be the value
> > of the a3 cpu register on entry to the kernel.  And a latent bomb
> > is laid for userspace in the form of an incorrect process unique
> > value (which is the thread pointer) in the PCB.
> > 
> > Am I correct in my analysis and, if so, can we get a fix for this
> > please.
> 
> Well...  let me start with the assumption that we can't possibly restart unless
> the syscall fails with -ERESTART*.
> 
> Before we clobber 72($sp), $syscall_error saves the old value in $19.  This is
> the r19 parameter to do_work_pending, and is passed all the way down to
> syscall_restart where we do restore the original value of a3 for ERESTARTNOINTR.
> 
> So if there's a path that leads to restart, but doesn't save a3 before
> clobbering, I don't see it.  Do you have an strace dump that shows this?

Yes.  This is an example of a run of nptl/tst-eintr3 that fails after
cutting off quite a bit of stuff at the start to get to the relevant
section:

clone(child_stack=0x2000121eae0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x2000121f2c0, tls=0x2000121f8e0, child_tidptr=0x2000121f2c0) = ? ERESTARTNOINTR (To be restarted)
--- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_TKILL, si_pid=20086, si_uid=1000} ---
write(1, ".", 1.)                        = 1
sigreturn() (mask [])                   = -1 ERRNO_312 (Unknown error 312)
clone(child_stack=0x2000121eae0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x2000121f2c0, tls=0, child_tidptr=0x2000121f2c0) = 20089
+++ killed by SIGSEGV +++


Note that the retry of clone() has zero for the tls argument.

Examining the resultant core dump reveals that tst-eintr3 segfaulted
when trying to access a thread local variable and that register v0,
used in calculating the TLS location and set up by the rduniq PALcall,
is zero.

Cheers
Michael.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Bug: retry of clone() on Alpha can result in zeroed process thread pointer
  2014-07-23  8:52 Bug: retry of clone() on Alpha can result in zeroed process thread pointer Michael Cree
  2014-07-24 18:19 ` Richard Henderson
@ 2014-07-30 19:30 ` Richard Henderson
  1 sibling, 0 replies; 4+ messages in thread
From: Richard Henderson @ 2014-07-30 19:30 UTC (permalink / raw)
  To: Michael Cree, linux-alpha, linux-kernel

Ok, I think I've actually found it this time.  It's here:

> 281		childregs->r20 = 1;	/* OSF/1 has some strange fork() semantics.  */
> 282		regs->r20 = 0;

We need to delay this r20 silliness until after restarts or something.  Or just
kill it -- it's not like glibc uses this return value.


r~

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2014-07-30 19:30 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-23  8:52 Bug: retry of clone() on Alpha can result in zeroed process thread pointer Michael Cree
2014-07-24 18:19 ` Richard Henderson
2014-07-24 19:30   ` Michael Cree
2014-07-30 19:30 ` Richard Henderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).