linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* ptrace(), fork(), sleep(), exit(), SIGCHLD
@ 2001-08-13  8:29 Bruce Janson
  2001-08-14  7:28 ` christophe barbé
  0 siblings, 1 reply; 20+ messages in thread
From: Bruce Janson @ 2001-08-13  8:29 UTC (permalink / raw)
  To: linux-kernel

Hi,
    The following program behaves incorrectly when traced:

  $ uname -a
  Linux dependo 2.4.2-2 #1 Sun Apr 8 19:37:14 EDT 2001 i686 unknown
  $ cc -v
  Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
  gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-81)
  $ strace -V
  strace -- version 4.2
  $ cat t.c
  main()
  {
  	switch (fork())
  	{
  	case -1:
  		write(2, "fork\n", 5);
  		break;
  
  	case 0:
  		usleep(1000000);
  		break;
  
  	default:
  		if (usleep(5000000) == -1)
  			write(2, "wrong\n", 6);
  		break;
  	}
  
  	exit(0);
  }
  $ cc t.c
  $ time ./a.out
  
  real    0m5.011s
  user    0m0.000s
  sys     0m0.000s
  $ time strace -o /dev/null ./a.out
  wrong
  
  real    0m1.025s
  user    0m0.010s
  sys     0m0.010s
  $ 

The problem appears to be that, when traced, the child process' exit()
interrupts the parent's usleep() with a SIGCHLD, the latter returning EINTR.
It also fails in the same way under Linux 2.2.16 and 2.2.19.

What am I missing?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: ptrace(), fork(), sleep(), exit(), SIGCHLD
  2001-08-13  8:29 ptrace(), fork(), sleep(), exit(), SIGCHLD Bruce Janson
@ 2001-08-14  7:28 ` christophe barbé
  2001-08-14 15:06   ` Bruce Janson
  0 siblings, 1 reply; 20+ messages in thread
From: christophe barbé @ 2001-08-14  7:28 UTC (permalink / raw)
  To: linux-kernel

Have you receive off-line answers?
I guess that it's certainly more a strace issue and that it's perhaps
'trivial' for most people here. But I would be interesting in knowing the
why behind this.

Christophe

Le lun, 13 aoû 2001 10:29:32, Bruce Janson a écrit :
> Hi,
>     The following program behaves incorrectly when traced:
> 
>   $ uname -a
>   Linux dependo 2.4.2-2 #1 Sun Apr 8 19:37:14 EDT 2001 i686 unknown
>   $ cc -v
>   Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
>   gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-81)
>   $ strace -V
>   strace -- version 4.2
>   $ cat t.c
>   main()
>   {
>   	switch (fork())
>   	{
>   	case -1:
>   		write(2, "fork\n", 5);
>   		break;
>   
>   	case 0:
>   		usleep(1000000);
>   		break;
>   
>   	default:
>   		if (usleep(5000000) == -1)
>   			write(2, "wrong\n", 6);
>   		break;
>   	}
>   
>   	exit(0);
>   }
>   $ cc t.c
>   $ time ./a.out
>   
>   real    0m5.011s
>   user    0m0.000s
>   sys     0m0.000s
>   $ time strace -o /dev/null ./a.out
>   wrong
>   
>   real    0m1.025s
>   user    0m0.010s
>   sys     0m0.010s
>   $ 
> 
> The problem appears to be that, when traced, the child process' exit()
> interrupts the parent's usleep() with a SIGCHLD, the latter returning
> EINTR.
> It also fails in the same way under Linux 2.2.16 and 2.2.19.
> 
> What am I missing?
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel"
> in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
-- 
Christophe Barbé
Software Engineer - christophe.barbe@lineo.fr
Lineo France - Lineo High Availability Group
42-46, rue Médéric - 92110 Clichy - France
phone (33).1.41.40.02.12 - fax (33).1.41.40.02.01
http://www.lineo.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: ptrace(), fork(), sleep(), exit(), SIGCHLD
  2001-08-14  7:28 ` christophe barbé
@ 2001-08-14 15:06   ` Bruce Janson
  2001-08-15 15:46     ` george anzinger
  0 siblings, 1 reply; 20+ messages in thread
From: Bruce Janson @ 2001-08-14 15:06 UTC (permalink / raw)
  To: linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 815 bytes --]

In article <20010814092849.E13892@pc8.lineo.fr>,
christophe =?iso-8859-1?Q?barb=E9?=  <christophe.barbe@lineo.fr> wrote:
..
>Le lun, 13 aoû 2001 10:29:32, Bruce Janson a écrit :
..
>>     The following program behaves incorrectly when traced:
..
>Have you receive off-line answers?
..

No, though I did receive an offline reply from someone who appeared
to have misunderstood the post.  In case it wasn't clear, the problem
is that the above program behaves differently when traced to how it
behaves when not traced.  (I do realise that in general, under newer
Unices, when not ignored, a SIGCHLD signal may accompany the death of
a child.)

>I guess that it's certainly more a strace issue and that it's perhaps
..

It's not clear to me whether it is a kernel, glibc or strace bug, but
it does appear to be a bug.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: ptrace(), fork(), sleep(), exit(), SIGCHLD
  2001-08-14 15:06   ` Bruce Janson
@ 2001-08-15 15:46     ` george anzinger
  2001-08-15 17:53       ` george anzinger
  2001-08-15 18:02       ` george anzinger
  0 siblings, 2 replies; 20+ messages in thread
From: george anzinger @ 2001-08-15 15:46 UTC (permalink / raw)
  To: Bruce Janson; +Cc: linux-kernel

Bruce Janson wrote:
> 
> In article <20010814092849.E13892@pc8.lineo.fr>,
> christophe =?iso-8859-1?Q?barb=E9?=  <christophe.barbe@lineo.fr> wrote:
> ..
> >Le lun, 13 aoû 2001 10:29:32, Bruce Janson a écrit :
> ..
> >>     The following program behaves incorrectly when traced:
> ..
> >Have you receive off-line answers?
> ..
> 
> No, though I did receive an offline reply from someone who appeared
> to have misunderstood the post.  In case it wasn't clear, the problem
> is that the above program behaves differently when traced to how it
> behaves when not traced.  (I do realise that in general, under newer
> Unices, when not ignored, a SIGCHLD signal may accompany the death of
> a child.)
> 
> >I guess that it's certainly more a strace issue and that it's perhaps
> ..
> 
> It's not clear to me whether it is a kernel, glibc or strace bug, but
> it does appear to be a bug.

I don't have the code for usleep() handy and the man page is not much
help, but here goes:

I think strace is using ptrace() which causes signals to be redirected
to wake up the parent (strace in this case).  In particular, blocked
signals are no longer blocked.  What this means is that a.) SIG CHILD is
posted, b.) the signal, not being blocked, the child is wakened, c.)
ptrace returns to the parent, d) the parent does what ever and tells the
kernel (ptrace) to continue the child with the original mask, e.) the
signal code returns 0 with out delivering the signal to the child. 
Looks good, right?  Wrong!  The wake up at (b) pulls the child out of
the timer queue so when signal returns, the sleep (I assume nano sleep
is actually used here) call returns with the remaining sleep time as a
value. 

This is an issue for debugging also (same ptrace...).  The fix is to fix
nano_sleep to match the standard which says it should only return on a
signal if the signal is delivered to the program (i.e. not on internal
"do nothing" signals).  Signal in the kernel returns 1 if it calls the
task and 0 otherwise, thus nano sleep might be changed as follows:


	expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec);

	current->state = TASK_INTERRUPTIBLE;
-	expire = schedule_timeout(expire);
+	while (expire = schedule_timeout(expire) && !signal());

	if (expire) {
		if (rmtp) {
			jiffies_to_timespec(expire, &t);
			if (copy_to_user(rmtp, &t, sizeof(struct timespec)))
				return -EFAULT;
		}
		return -EINTR;
	}
	return 0;

This code is in ../kernel/timer.c

Note that this assumes that nano_sleep() underlies usleep().  If
setitimer (via sleep() or otherwise) is used, the problem and fix is in
the library.  In that case, the code needs to notice that it was
awakened but the alarm handler was not called.  Still, with out the full
spec on usleep() it is not clear what it should do.

In any case, this is a bug in nano_sleep(), where the spec is clear on
this point.

George

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: ptrace(), fork(), sleep(), exit(), SIGCHLD
  2001-08-15 15:46     ` george anzinger
@ 2001-08-15 17:53       ` george anzinger
  2001-08-15 18:02       ` george anzinger
  1 sibling, 0 replies; 20+ messages in thread
From: george anzinger @ 2001-08-15 17:53 UTC (permalink / raw)
  To: Bruce Janson, linux-kernel

george anzinger wrote:
> 
> Bruce Janson wrote:
> >
> > In article <20010814092849.E13892@pc8.lineo.fr>,
> > christophe =?iso-8859-1?Q?barb=E9?=  <christophe.barbe@lineo.fr> wrote:
> > ..
> > >Le lun, 13 aoû 2001 10:29:32, Bruce Janson a écrit :
> > ..
> > >>     The following program behaves incorrectly when traced:
> > ..
> > >Have you receive off-line answers?
> > ..
> >
> > No, though I did receive an offline reply from someone who appeared
> > to have misunderstood the post.  In case it wasn't clear, the problem
> > is that the above program behaves differently when traced to how it
> > behaves when not traced.  (I do realise that in general, under newer
> > Unices, when not ignored, a SIGCHLD signal may accompany the death of
> > a child.)
> >
> > >I guess that it's certainly more a strace issue and that it's perhaps
> > ..
> >
> > It's not clear to me whether it is a kernel, glibc or strace bug, but
> > it does appear to be a bug.
> 
> I don't have the code for usleep() handy and the man page is not much
> help, but here goes:
> 
> I think strace is using ptrace() which causes signals to be redirected
> to wake up the parent (strace in this case).  In particular, blocked
> signals are no longer blocked.  What this means is that a.) SIG CHILD is
> posted, b.) the signal, not being blocked, the child is wakened, c.)
> ptrace returns to the parent, d) the parent does what ever and tells the
> kernel (ptrace) to continue the child with the original mask, e.) the
> signal code returns 0 with out delivering the signal to the child.
> Looks good, right?  Wrong!  The wake up at (b) pulls the child out of
> the timer queue so when signal returns, the sleep (I assume nano sleep
> is actually used here) call returns with the remaining sleep time as a
> value.
> 
> This is an issue for debugging also (same ptrace...).  The fix is to fix
> nano_sleep to match the standard which says it should only return on a
> signal if the signal is delivered to the program (i.e. not on internal
> "do nothing" signals).  Signal in the kernel returns 1 if it calls the
> task and 0 otherwise, thus nano sleep might be changed as follows:
> 
>         expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec);
> 
>         current->state = TASK_INTERRUPTIBLE;
> -       expire = schedule_timeout(expire);
> +       while (expire = schedule_timeout(expire) && !signal());
Oops! should be:
 +       while (expire = schedule_timeout(expire) && !do_signal());
> 
>         if (expire) {
>                 if (rmtp) {
>                         jiffies_to_timespec(expire, &t);
>                         if (copy_to_user(rmtp, &t, sizeof(struct timespec)))
>                                 return -EFAULT;
>                 }
>                 return -EINTR;
>         }
>         return 0;
> 
> This code is in ../kernel/timer.c
> 
> Note that this assumes that nano_sleep() underlies usleep().  If
> setitimer (via sleep() or otherwise) is used, the problem and fix is in
> the library.  In that case, the code needs to notice that it was
> awakened but the alarm handler was not called.  Still, with out the full
> spec on usleep() it is not clear what it should do.
> 
> In any case, this is a bug in nano_sleep(), where the spec is clear on
> this point.
> 
> George
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: ptrace(), fork(), sleep(), exit(), SIGCHLD
  2001-08-15 15:46     ` george anzinger
  2001-08-15 17:53       ` george anzinger
@ 2001-08-15 18:02       ` george anzinger
  2001-08-16  0:59         ` How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) george anzinger
  1 sibling, 1 reply; 20+ messages in thread
From: george anzinger @ 2001-08-15 18:02 UTC (permalink / raw)
  To: Bruce Janson, linux-kernel

george anzinger wrote:
> 
> Bruce Janson wrote:
> >
> > In article <20010814092849.E13892@pc8.lineo.fr>,
> > christophe =?iso-8859-1?Q?barb=E9?=  <christophe.barbe@lineo.fr> wrote:
> > ..
> > >Le lun, 13 aoû 2001 10:29:32, Bruce Janson a écrit :
> > ..
> > >>     The following program behaves incorrectly when traced:
> > ..
> > >Have you receive off-line answers?
> > ..
> >
> > No, though I did receive an offline reply from someone who appeared
> > to have misunderstood the post.  In case it wasn't clear, the problem
> > is that the above program behaves differently when traced to how it
> > behaves when not traced.  (I do realise that in general, under newer
> > Unices, when not ignored, a SIGCHLD signal may accompany the death of
> > a child.)
> >
> > >I guess that it's certainly more a strace issue and that it's perhaps
> > ..
> >
> > It's not clear to me whether it is a kernel, glibc or strace bug, but
> > it does appear to be a bug.
> 
> I don't have the code for usleep() handy and the man page is not much
> help, but here goes:
> 
> I think strace is using ptrace() which causes signals to be redirected
> to wake up the parent (strace in this case).  In particular, blocked
> signals are no longer blocked.  What this means is that a.) SIG CHILD is
> posted, b.) the signal, not being blocked, the child is wakened, c.)
> ptrace returns to the parent, d) the parent does what ever and tells the
> kernel (ptrace) to continue the child with the original mask, e.) the
> signal code returns 0 with out delivering the signal to the child.
> Looks good, right?  Wrong!  The wake up at (b) pulls the child out of
> the timer queue so when signal returns, the sleep (I assume nano sleep
> is actually used here) call returns with the remaining sleep time as a
> value.
> 
> This is an issue for debugging also (same ptrace...).  The fix is to fix
> nano_sleep to match the standard which says it should only return on a
> signal if the signal is delivered to the program (i.e. not on internal
> "do nothing" signals).  Signal in the kernel returns 1 if it calls the
> task and 0 otherwise, thus nano sleep might be changed as follows:
> 
>         expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec);
> 
>         current->state = TASK_INTERRUPTIBLE;
> -       expire = schedule_timeout(expire);
> +       while (expire = schedule_timeout(expire) && !signal());
Still not quite right.  regs needs to be dummied up (see sys_sigpause)
and then:
+       while (expire = schedule_timeout(expire) && !do_signal(regs,
NULL));
> 
>         if (expire) {
>                 if (rmtp) {
>                         jiffies_to_timespec(expire, &t);
>                         if (copy_to_user(rmtp, &t, sizeof(struct timespec)))
>                                 return -EFAULT;
>                 }
>                 return -EINTR;
>         }
>         return 0;
> 
> This code is in ../kernel/timer.c
> 
> Note that this assumes that nano_sleep() underlies usleep().  If
> setitimer (via sleep() or otherwise) is used, the problem and fix is in
> the library.  In that case, the code needs to notice that it was
> awakened but the alarm handler was not called.  Still, with out the full
> spec on usleep() it is not clear what it should do.
> 
> In any case, this is a bug in nano_sleep(), where the spec is clear on
> this point.
> 
> George
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(),  SIGCHLD)
  2001-08-15 18:02       ` george anzinger
@ 2001-08-16  0:59         ` george anzinger
  2001-08-16 10:17           ` christophe barbé
  0 siblings, 1 reply; 20+ messages in thread
From: george anzinger @ 2001-08-16  0:59 UTC (permalink / raw)
  To: Bruce Janson, linux-kernel

The problem is that nano_sleep needs to continue sleeping if it wakes up
on a signal which is not delivered to the task.  This happens when
"strace" or "ptrace" cause otherwise blocked signals to be delivered. 
Do_signal() returns 0 if it does not deliver a signal, a 1 if it does so
I propose the following changes to nano_sleep:

asmlinkage long sys_nanosleep(struct timespec *rqtp, struct timespec
*rmtp)
{
	struct timespec t;
	unsigned long expire;
+	struct pt_regs * regs = (struct pt_regs *) &rqtp;

	if(copy_from_user(&t, rqtp, sizeof(struct timespec)))
		return -EFAULT;

	if (t.tv_nsec >= 1000000000L || t.tv_nsec < 0 || t.tv_sec < 0)
		return -EINVAL;


	if (t.tv_sec == 0 && t.tv_nsec <= 2000000L &&
	    current->policy != SCHED_OTHER)
	{
		/*
		 * Short delay requests up to 2 ms will be handled with
		 * high precision by a busy wait for all real-time processes.
		 *
		 * Its important on SMP not to do this holding locks.
		 */
		udelay((t.tv_nsec + 999) / 1000);
		return 0;
	}

	expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec);

-	current->state = TASK_INTERRUPTIBLE;
-	expire = schedule_timeout(expire);
+	regs->eax = -EINTR;
+	do {
+               current->state = TASK_INTERRUPTIBLE;
+       } while((expire = schedule_timeout(expire)) &&
!do_signal(regs,NULL));

	if (expire) {
		if (rmtp) {
			jiffies_to_timespec(expire, &t);
			if (copy_to_user(rmtp, &t, sizeof(struct timespec)))
				return -EFAULT;
		}
		return -EINTR;
	}
	return 0;
}

BUT, it turns out that do_signal() is in the "arch" code, and further
that different arch's have different calling sequences (and, of course,
pt_regs is different also).  This is the ONLY place in the kernel where
platform independent code needs to call do_signal() :(  There does not
seem to be a clean answer for this issue.  I suppose we could put
something like this in timer.c:

#ifndef _do_signal
#define _do_signal(a,b) 1
#endif

and then leave it to the platform code to define a uniform
_do_signal(a,b) interface.  This way each platform will work as it does
now and better once they define the macro.  If this is the path to take,
what should the parameters "a" & "b" be?  We need to cover all platforms
needs.

Another way to solve this issue is to move nano_sleep into the "arch"
signal.c file, but then each would have to change and things would be
broken (i.e. nano_sleep would not work) until the platform made the
move.  I suppose the current nano_sleep could stay in the kernel and
each platform could implement one in their area with a different name. 
When all were done, the current code could be deleted.

How should this be approached?

George


george anzinger wrote:
> 
> george anzinger wrote:
> >
> > Bruce Janson wrote:
> > >
> > > In article <20010814092849.E13892@pc8.lineo.fr>,
> > > christophe =?iso-8859-1?Q?barb=E9?=  <christophe.barbe@lineo.fr> wrote:
> > > ..
> > > >Le lun, 13 aoû 2001 10:29:32, Bruce Janson a écrit :
> > > ..
> > > >>     The following program behaves incorrectly when traced:
> > > ..
> > > >Have you receive off-line answers?
> > > ..
> > >
> > > No, though I did receive an offline reply from someone who appeared
> > > to have misunderstood the post.  In case it wasn't clear, the problem
> > > is that the above program behaves differently when traced to how it
> > > behaves when not traced.  (I do realise that in general, under newer
> > > Unices, when not ignored, a SIGCHLD signal may accompany the death of
> > > a child.)
> > >
> > > >I guess that it's certainly more a strace issue and that it's perhaps
> > > ..
> > >
> > > It's not clear to me whether it is a kernel, glibc or strace bug, but
> > > it does appear to be a bug.
> >
> > I don't have the code for usleep() handy and the man page is not much
> > help, but here goes:
> >
> > I think strace is using ptrace() which causes signals to be redirected
> > to wake up the parent (strace in this case).  In particular, blocked
> > signals are no longer blocked.  What this means is that a.) SIG CHILD is
> > posted, b.) the signal, not being blocked, the child is wakened, c.)
> > ptrace returns to the parent, d) the parent does what ever and tells the
> > kernel (ptrace) to continue the child with the original mask, e.) the
> > signal code returns 0 with out delivering the signal to the child.
> > Looks good, right?  Wrong!  The wake up at (b) pulls the child out of
> > the timer queue so when signal returns, the sleep (I assume nano sleep
> > is actually used here) call returns with the remaining sleep time as a
> > value.
> >
> > This is an issue for debugging also (same ptrace...).  The fix is to fix
> > nano_sleep to match the standard which says it should only return on a
> > signal if the signal is delivered to the program (i.e. not on internal
> > "do nothing" signals).  Signal in the kernel returns 1 if it calls the
> > task and 0 otherwise, thus nano sleep might be changed as follows:
> >
> >         expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec);
> >
> >         current->state = TASK_INTERRUPTIBLE;
> > -       expire = schedule_timeout(expire);
> > +       while (expire = schedule_timeout(expire) && !signal());
> Still not quite right.  regs needs to be dummied up (see sys_sigpause)
> and then:
> +       while (expire = schedule_timeout(expire) && !do_signal(regs,
> NULL));
> >
> >         if (expire) {
> >                 if (rmtp) {
> >                         jiffies_to_timespec(expire, &t);
> >                         if (copy_to_user(rmtp, &t, sizeof(struct timespec)))
> >                                 return -EFAULT;
> >                 }
> >                 return -EINTR;
> >         }
> >         return 0;
> >
> > This code is in ../kernel/timer.c
> >
> > Note that this assumes that nano_sleep() underlies usleep().  If
> > setitimer (via sleep() or otherwise) is used, the problem and fix is in
> > the library.  In that case, the code needs to notice that it was
> > awakened but the alarm handler was not called.  Still, with out the full
> > spec on usleep() it is not clear what it should do.
> >
> > In any case, this is a bug in nano_sleep(), where the spec is clear on
> > this point.
> >
> > George
> > -
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD)
  2001-08-16  0:59         ` How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) george anzinger
@ 2001-08-16 10:17           ` christophe barbé
  2001-08-16 10:29             ` Russell King
  0 siblings, 1 reply; 20+ messages in thread
From: christophe barbé @ 2001-08-16 10:17 UTC (permalink / raw)
  To: linux-kernel

Thank george for your infos. First I was thinking that it was a bug in a
user-mode tool. Now you convinced me that it's a nanosleep kernel bug. I've
not looked in the user_mode usleed code but I've tested the bruce code with
nanosleep and got the same result. What still astonish me is that this
looks like a not so unprobable bug when running strace and should have been
detected before.

IT should be better to let sys_nanosleep in non-arch directory and use a
macro as you have proposed but this macro should include more than
do_signal :

#ifndef _sys_nanosleep
#define _sys_nanosleep \
	current->state = TASK_INTERRUPTIBLE; \
	expire = schedule_timeout(expire);
#endif

and for x86 :

#define _sys_nanosleep \
     { \
	struct pt_regs * regs = (struct pt_regs *) &rqtp; \
	regs->eax = -EINTR; \
	do { \
               current->state = TASK_INTERRUPTIBLE; \
       } while((expire = schedule_timeout(expire)) &&
!do_signal(regs,NULL)); \
     }

Is it bad to use directly the variable in a macro like this one. We could
still use something like _sys_nanosleep(current,rqtp) but it could be not
generic enough.

Christophe

Le jeu, 16 aoû 2001 02:59:51, george anzinger a écrit :
> The problem is that nano_sleep needs to continue sleeping if it wakes up
> on a signal which is not delivered to the task.  This happens when
> "strace" or "ptrace" cause otherwise blocked signals to be delivered. 
> Do_signal() returns 0 if it does not deliver a signal, a 1 if it does so
> I propose the following changes to nano_sleep:
> 
> asmlinkage long sys_nanosleep(struct timespec *rqtp, struct timespec
> *rmtp)
> {
> 	struct timespec t;
> 	unsigned long expire;
> +	struct pt_regs * regs = (struct pt_regs *) &rqtp;
> 
> 	if(copy_from_user(&t, rqtp, sizeof(struct timespec)))
> 		return -EFAULT;
> 
> 	if (t.tv_nsec >= 1000000000L || t.tv_nsec < 0 || t.tv_sec < 0)
> 		return -EINVAL;
> 
> 
> 	if (t.tv_sec == 0 && t.tv_nsec <= 2000000L &&
> 	    current->policy != SCHED_OTHER)
> 	{
> 		/*
> 		 * Short delay requests up to 2 ms will be handled with
> 		 * high precision by a busy wait for all real-time
> processes.
> 		 *
> 		 * Its important on SMP not to do this holding locks.
> 		 */
> I 		udelay((t.tv_nsec + 999) / 1000);
> 		return 0;
> 	}
> 
> 	expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec);
> 
> -	current->state = TASK_INTERRUPTIBLE;
> -	expire = schedule_timeout(expire);
> +	regs->eax = -EINTR;
> +	do {
> +               current->state = TASK_INTERRUPTIBLE;
> +       } while((expire = schedule_timeout(expire)) &&
> !do_signal(regs,NULL));
> 
> 	if (expire) {
> 		if (rmtp) {
> 			jiffies_to_timespec(expire, &t);
> 			if (copy_to_user(rmtp, &t, sizeof(struct
> timespec)))
> 				return -EFAULT;
> 		}
> 		return -EINTR;
> 	}
> 	return 0;
> }
> 
> BUT, it turns out that do_signal() is in the "arch" code, and further
> that different arch's have different calling sequences (and, of course,
> pt_regs is different also).  This is the ONLY place in the kernel where
> platform independent code needs to call do_signal() :(  There does not
> seem to be a clean answer for this issue.  I suppose we could put
> something like this in timer.c:
> 
> #ifndef _do_signal
> #define _do_signal(a,b) 1
> #endif
> 
> and then leave it to the platform code to define a uniform
> _do_signal(a,b) interface.  This way each platform will work as it does
> now and better once they define the macro.  If this is the path to take,
> what should the parameters "a" & "b" be?  We need to cover all platforms
> needs.
> 
> Another way to solve this issue is to move nano_sleep into the "arch"
> signal.c file, but then each would have to change and things would be
> broken (i.e. nano_sleep would not work) until the platform made the
> move.  I suppose the current nano_sleep could stay in the kernel and
> each platform could implement one in their area with a different name. 
> When all were done, the current code could be deleted.
> 
> How should this be approached?
> 
> George
> 
> 
> george anzinger wrote:
> > 
> > george anzinger wrote:
> > >
> > > Bruce Janson wrote:
> > > >
> > > > In article <20010814092849.E13892@pc8.lineo.fr>,
> > > > christophe =?iso-8859-1?Q?barb=E9?=  <christophe.barbe@lineo.fr>
> wrote:
> > > > ..
> > > > >Le lun, 13 aoû 2001 10:29:32, Bruce Janson a écrit :
> > > > ..
> > > > >>     The following program behaves incorrectly when traced:
> > > > ..
> > > > >Have you receive off-line answers?
> > > > ..
> > > >
> > > > No, though I did receive an offline reply from someone who appeared
> > > > to have misunderstood the post.  In case it wasn't clear, the
> problem
> > > > is that the above program behaves differently when traced to how it
> > > > behaves when not traced.  (I do realise that in general, under
> newer
> > > > Unices, when not ignored, a SIGCHLD signal may accompany the death
> of
> > > > a child.)
> > > >
> > > > >I guess that it's certainly more a strace issue and that it's
> perhaps
> > > > ..
> > > >
> > > > It's not clear to me whether it is a kernel, glibc or strace bug,
> but
> > > > it does appear to be a bug.
> > >
> > > I don't have the code for usleep() handy and the man page is not much
> > > help, but here goes:
> > >
> > > I think strace is using ptrace() which causes signals to be
> redirected
> > > to wake up the parent (strace in this case).  In particular, blocked
> > > signals are no longer blocked.  What this means is that a.) SIG CHILD
> is
> > > posted, b.) the signal, not being blocked, the child is wakened, c.)
> > > ptrace returns to the parent, d) the parent does what ever and tells
> the
> > > kernel (ptrace) to continue the child with the original mask, e.) the
> > > signal code returns 0 with out delivering the signal to the child.
> > > Looks good, right?  Wrong!  The wake up at (b) pulls the child out of
> > > the timer queue so when signal returns, the sleep (I assume nano
> sleep
> > > is actually used here) call returns with the remaining sleep time as
> a
> > > value.
> > >
> > > This is an issue for debugging also (same ptrace...).  The fix is to
> fix
> > > nano_sleep to match the standard which says it should only return on
> a
> > > signal if the signal is delivered to the program (i.e. not on
> internal
> > > "do nothing" signals).  Signal in the kernel returns 1 if it calls
> the
> > > task and 0 otherwise, thus nano sleep might be changed as follows:
> > >
> > >         expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec);
> > >
> > >         current->state = TASK_INTERRUPTIBLE;
> > > -       expire = schedule_timeout(expire);
> > > +       while (expire = schedule_timeout(expire) && !signal());
> > Still not quite right.  regs needs to be dummied up (see sys_sigpause)
> > and then:
> > +       while (expire = schedule_timeout(expire) && !do_signal(regs,
> > NULL));
> > >
> > >         if (expire) {
> > >                 if (rmtp) {
> > >                         jiffies_to_timespec(expire, &t);
> > >                         if (copy_to_user(rmtp, &t, sizeof(struct
> timespec)))
> > >                                 return -EFAULT;
> > >                 }
> > >                 return -EINTR;
> > >         }
> > >         return 0;
> > >
> > > This code is in ../kernel/timer.c
> > >
> > > Note that this assumes that nano_sleep() underlies usleep().  If
> > > setitimer (via sleep() or otherwise) is used, the problem and fix is
> in
> > > the library.  In that case, the code needs to notice that it was
> > > awakened but the alarm handler was not called.  Still, with out the
> full
> > > spec on usleep() it is not clear what it should do.
> > >
> > > In any case, this is a bug in nano_sleep(), where the spec is clear
> on
> > > this point.
> > >
> > > George
> > > -
> > > To unsubscribe from this list: send the line "unsubscribe
> linux-kernel" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > Please read the FAQ at  http://www.tux.org/lkml/
> > -
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel"
> in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel"
> in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
-- 
Christophe Barbé
Software Engineer - christophe.barbe@lineo.fr
Lineo France - Lineo High Availability Group
42-46, rue Médéric - 92110 Clichy - France
phone (33).1.41.40.02.12 - fax (33).1.41.40.02.01
http://www.lineo.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD)
  2001-08-16 10:17           ` christophe barbé
@ 2001-08-16 10:29             ` Russell King
  2001-08-16 14:16               ` george anzinger
  2001-08-16 16:00               ` christophe barbé
  0 siblings, 2 replies; 20+ messages in thread
From: Russell King @ 2001-08-16 10:29 UTC (permalink / raw)
  To: christophe barbé; +Cc: linux-kernel

On Thu, Aug 16, 2001 at 12:17:46PM +0200, christophe barbé wrote:
> > asmlinkage long sys_nanosleep(struct timespec *rqtp, struct timespec
> > *rmtp)
> > {
> > 	struct timespec t;
> > 	unsigned long expire;
> > +	struct pt_regs * regs = (struct pt_regs *) &rqtp;

Note also that this is bogus as an architecture invariant.

On ARM, we have to pass a pt_regs pointer into any function that requires
it.

--
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(),  exit(), SIGCHLD)
  2001-08-16 10:29             ` Russell King
@ 2001-08-16 14:16               ` george anzinger
  2001-08-16 16:00               ` christophe barbé
  1 sibling, 0 replies; 20+ messages in thread
From: george anzinger @ 2001-08-16 14:16 UTC (permalink / raw)
  To: Russell King; +Cc: christophe barbé, linux-kernel

Russell King wrote:
> 
> On Thu, Aug 16, 2001 at 12:17:46PM +0200, christophe barbé wrote:
> > > asmlinkage long sys_nanosleep(struct timespec *rqtp, struct timespec
> > > *rmtp)
> > > {
> > >     struct timespec t;
> > >     unsigned long expire;
> > > +   struct pt_regs * regs = (struct pt_regs *) &rqtp;
> 
> Note also that this is bogus as an architecture invariant.
> 
> On ARM, we have to pass a pt_regs pointer into any function that requires
> it.
> 
Does that mean that any function that requires pt_regs _must_ not be in
common code?  Isn't there a better solution than implementing these
functions over and over for each platform?

George

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD)
  2001-08-16 10:29             ` Russell King
  2001-08-16 14:16               ` george anzinger
@ 2001-08-16 16:00               ` christophe barbé
  2001-08-16 16:12                 ` Russell King
  1 sibling, 1 reply; 20+ messages in thread
From: christophe barbé @ 2001-08-16 16:00 UTC (permalink / raw)
  To: linux-kernel; +Cc: Russell King


Le jeu, 16 aoû 2001 12:29:05, Russell King a écrit :
> On Thu, Aug 16, 2001 at 12:17:46PM +0200, christophe barbé wrote:
> > > asmlinkage long sys_nanosleep(struct timespec *rqtp, struct timespec
> > > *rmtp)
> > > {
> > > 	struct timespec t;
> > > 	unsigned long expire;
> > > +	struct pt_regs * regs = (struct pt_regs *) &rqtp;
> 
> Note also that this is bogus as an architecture invariant.
> 
> On ARM, we have to pass a pt_regs pointer into any function that requires
> it.

I'm not sure to understand your point.

The first sentence tell me that the "struct pt_regs ..." line is x86
specific and this was the reason behind my proposition to not add a _signal
macro but a  _sys_nanosleep macro to include this too.

The second sentence seem's to indicate that this is a classic problem for
the ARM port. So if this is correct what is the best way to solve it ?

Christophe

> --
> Russell King (rmk@arm.linux.org.uk)                The developer of ARM
> Linux
>              http://www.arm.linux.org.uk/personal/aboutme.html
> 

-- 
Christophe Barbé
Software Engineer - christophe.barbe@lineo.fr
Lineo France - Lineo High Availability Group
42-46, rue Médéric - 92110 Clichy - France
phone (33).1.41.40.02.12 - fax (33).1.41.40.02.01
http://www.lineo.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD)
  2001-08-16 16:00               ` christophe barbé
@ 2001-08-16 16:12                 ` Russell King
  2001-08-16 18:17                   ` george anzinger
  0 siblings, 1 reply; 20+ messages in thread
From: Russell King @ 2001-08-16 16:12 UTC (permalink / raw)
  To: christophe barbé; +Cc: linux-kernel

On Thu, Aug 16, 2001 at 06:00:10PM +0200, christophe barbé wrote:
> Le jeu, 16 aoû 2001 12:29:05, Russell King a écrit :
> > Note also that this is bogus as an architecture invariant.
> > 
> > On ARM, we have to pass a pt_regs pointer into any function that requires
> > it.
> 
> I'm not sure to understand your point.

Its quite simple:

int sys_foo(struct pt_regs regs)
{
}

does not reveal the user space registers on ARM.  It instead reveals crap.
Why?  The ARM procedure call standard specifies that the first 4 words
of "regs" in this case are in 4 processor registers.  The other words
are on the stack immediately above the frame created by foo.  This is
not how the stack is layed out on ARM on entry to a sys_* function
due to the requirement for these to be restartable.

Instead, we must pass a pointer thusly:

int sys_foo(struct pt_regs *regs)
{
}

and the pointer is specifically setup and passed in by a very small
assembler wrapper.

> The first sentence tell me that the "struct pt_regs ..." line is x86
> specific and this was the reason behind my proposition to not add a _signal
> macro but a  _sys_nanosleep macro to include this too.

Correct.  But the act of getting "struct pt_regs" on entry to the function
is also architecture specific.

> The second sentence seem's to indicate that this is a classic problem for
> the ARM port. So if this is correct what is the best way to solve it ?

It used to be with such functions as sys_execve.  Then, sys_execve
became an architecture specific wrapper around do_execve (not by my
hand), so I guess that its not an ARM specific problem.

--
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(),  exit(), SIGCHLD)
  2001-08-16 16:12                 ` Russell King
@ 2001-08-16 18:17                   ` george anzinger
  2001-08-17 18:25                     ` george anzinger
  0 siblings, 1 reply; 20+ messages in thread
From: george anzinger @ 2001-08-16 18:17 UTC (permalink / raw)
  To: Russell King; +Cc: christophe barbé, linux-kernel

Russell King wrote:
> 
> On Thu, Aug 16, 2001 at 06:00:10PM +0200, christophe barbé wrote:
> > Le jeu, 16 aoû 2001 12:29:05, Russell King a écrit :
> > > Note also that this is bogus as an architecture invariant.
> > >
> > > On ARM, we have to pass a pt_regs pointer into any function that requires
> > > it.
> >
> > I'm not sure to understand your point.
> 
> Its quite simple:
> 
> int sys_foo(struct pt_regs regs)
> {
> }
> 
> does not reveal the user space registers on ARM.  It instead reveals crap.
> Why?  The ARM procedure call standard specifies that the first 4 words
> of "regs" in this case are in 4 processor registers.  The other words
> are on the stack immediately above the frame created by foo.  This is
> not how the stack is layed out on ARM on entry to a sys_* function
> due to the requirement for these to be restartable.
> 
> Instead, we must pass a pointer thusly:
> 
> int sys_foo(struct pt_regs *regs)
> {
> }
> 
> and the pointer is specifically setup and passed in by a very small
> assembler wrapper.
> 
> > The first sentence tell me that the "struct pt_regs ..." line is x86
> > specific and this was the reason behind my proposition to not add a _signal
> > macro but a  _sys_nanosleep macro to include this too.
> 
> Correct.  But the act of getting "struct pt_regs" on entry to the function
> is also architecture specific.
> 
> > The second sentence seem's to indicate that this is a classic problem for
> > the ARM port. So if this is correct what is the best way to solve it ?
> 
> It used to be with such functions as sys_execve.  Then, sys_execve
> became an architecture specific wrapper around do_execve (not by my
> hand), so I guess that its not an ARM specific problem.
> 
> --
So, it seems we need an arch. specific wrapper for nano_sleep.  Now, how
to do it so it is a smooth transition?

George

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(),  exit(), SIGCHLD)
  2001-08-16 18:17                   ` george anzinger
@ 2001-08-17 18:25                     ` george anzinger
  2001-08-17 18:57                       ` Victor Yodaiken
  0 siblings, 1 reply; 20+ messages in thread
From: george anzinger @ 2001-08-17 18:25 UTC (permalink / raw)
  To: Russell King, christophe barbé, linux-kernel

george anzinger wrote:
> 
> Russell King wrote:
> >
> > On Thu, Aug 16, 2001 at 06:00:10PM +0200, christophe barbé wrote:
> > > Le jeu, 16 aoû 2001 12:29:05, Russell King a écrit :
> > > > Note also that this is bogus as an architecture invariant.
> > > >
> > > > On ARM, we have to pass a pt_regs pointer into any function that requires
> > > > it.
> > >
> > > I'm not sure to understand your point.
> >
> > Its quite simple:
> >
> > int sys_foo(struct pt_regs regs)
> > {
> > }
> >
> > does not reveal the user space registers on ARM.  It instead reveals crap.
> > Why?  The ARM procedure call standard specifies that the first 4 words
> > of "regs" in this case are in 4 processor registers.  The other words
> > are on the stack immediately above the frame created by foo.  This is
> > not how the stack is layed out on ARM on entry to a sys_* function
> > due to the requirement for these to be restartable.
> >
> > Instead, we must pass a pointer thusly:
> >
> > int sys_foo(struct pt_regs *regs)
> > {
> > }
> >
> > and the pointer is specifically setup and passed in by a very small
> > assembler wrapper.
> >
> > > The first sentence tell me that the "struct pt_regs ..." line is x86
> > > specific and this was the reason behind my proposition to not add a _signal
> > > macro but a  _sys_nanosleep macro to include this too.
> >
> > Correct.  But the act of getting "struct pt_regs" on entry to the function
> > is also architecture specific.
> >
> > > The second sentence seem's to indicate that this is a classic problem for
> > > the ARM port. So if this is correct what is the best way to solve it ?
> >
> > It used to be with such functions as sys_execve.  Then, sys_execve
> > became an architecture specific wrapper around do_execve (not by my
> > hand), so I guess that its not an ARM specific problem.
> >
> > --
> So, it seems we need an arch. specific wrapper for nano_sleep.  Now, how
> to do it so it is a smooth transition?
> 
How about something like:

In ../asm/signal.h  (for i386)

#define PT_REGS_ENTRY(type,name,p1_type,p1, p2_type,p2) \
type name(p1_type p1,p2_typ p2)\
{	struct pt_regs *regs = (struct pt_regs *)&p1;

#define _do_signal() do_signal(regs, NULL)

in ../asm_arm/signal.h

#define PT_REGS_ENTRY(type,name,p1_type,p1, p2_type,p2) \
type name(p1_type p1,p2_typ p2, struct pt_regs *regs) \
{

#define _do_signal() do_signal(NULL,regs, NULL)

and other archs as needed.

in ../linux/signal.h  (the default)

:
#include <asm/signal.h>

:
:
#ifndef PT_REGS_ENTRY
#define PT_REGS_ENTRY(type,name,p1_type,p1, p2_type,p2) \
type name(p1_type p1,p2_typ p2)\
{
#define _do_signal() 1
#endif

and finally in nano_sleep :

PT_REGS_ENTRY(asmlinkage long,sys_nanosleep,struct timespec
*,rqtp,struct timespec *,rmtp)

	struct timespec t;
	unsigned long expire;
:
: etc...

:
	do {
                current->state = TASK_INTERRUPTIBLE;
        } while((expire = schedule_timeout(expire)) && !_do_signal());


The notion is that this could be used for other system calls, in
particular I want to implement clock_nanosleep() as part of the extended
POSIX standard.

Comments, flames,...

George

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD)
  2001-08-17 18:25                     ` george anzinger
@ 2001-08-17 18:57                       ` Victor Yodaiken
  2001-08-17 19:56                         ` george anzinger
  0 siblings, 1 reply; 20+ messages in thread
From: Victor Yodaiken @ 2001-08-17 18:57 UTC (permalink / raw)
  To: george anzinger; +Cc: Russell King, christophe barbé, linux-kernel

On Fri, Aug 17, 2001 at 11:25:42AM -0700, george anzinger wrote:
> > 
> How about something like:
> 
> In ../asm/signal.h  (for i386)
> 
> #define PT_REGS_ENTRY(type,name,p1_type,p1, p2_type,p2) \
> type name(p1_type p1,p2_typ p2)\
> {	struct pt_regs *regs = (struct pt_regs *)&p1;

In RTLinux we define MACHDEPREGS as an arch dependent type. PPC defines
this as a pointer and x87 as the structure etc. The small number of functions
that actually need to manipulate this can be made machine dependent too.
Came in handy during the port to BSD too.



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(),  exit(), SIGCHLD)
  2001-08-17 18:57                       ` Victor Yodaiken
@ 2001-08-17 19:56                         ` george anzinger
  2001-08-22 18:40                           ` Russell King
  0 siblings, 1 reply; 20+ messages in thread
From: george anzinger @ 2001-08-17 19:56 UTC (permalink / raw)
  To: Victor Yodaiken; +Cc: Russell King, christophe barbé, linux-kernel

Victor Yodaiken wrote:
> 
> On Fri, Aug 17, 2001 at 11:25:42AM -0700, george anzinger wrote:
> > >
> > How about something like:
> >
> > In ../asm/signal.h  (for i386)
> >
> > #define PT_REGS_ENTRY(type,name,p1_type,p1, p2_type,p2) \
> > type name(p1_type p1,p2_typ p2)\
> > {     struct pt_regs *regs = (struct pt_regs *)&p1;
> 
> In RTLinux we define MACHDEPREGS as an arch dependent type. PPC defines
> this as a pointer and x87 as the structure etc. The small number of functions
> that actually need to manipulate this can be made machine dependent too.
> Came in handy during the port to BSD too.

Uh..?  I though that was what I was allowing.  It seems to me to be a
lot of extra work to put the same code in 15 different archs. 
Especially if one does not really know each of them, nor can any one
group (or individual) be expected to be able to test (or even have the
hardware to test) each of them.

This said, what I am trying to define is a way to write common code that
will handle each arch as the arch coder defines the macro for his arch. 
In the given case, for example, the type of the arch dependent structure
is _only_ used in the macros which are defined in the asm/*.h file.  In
this particular case, the common macros, (which the arch macros over
ride) do not call the arch dependent function, but assume that it
returns true.  For nano_sleep, this means that it will return early on
ptrace signals, a standard violation, but what the  current code does,
so each arch can fix their system by defining the macro.  I also would
like a way to avoid writing (and supporting) 15 different nano_sleeps,
clock_nano_sleeps, etc.

If I miss understand what you are saying, please help me to understand
it.  But do keep in mind, that I _really_ don't want to support 15
incarnations of nano_sleep and clock_nano_sleep.  I would, however, like
to find a simple and elegant way to write and support the one common
nano_sleep and clock_nano_sleep, or what ever code.

George

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD)
  2001-08-17 19:56                         ` george anzinger
@ 2001-08-22 18:40                           ` Russell King
  2001-08-23 20:04                             ` george anzinger
  0 siblings, 1 reply; 20+ messages in thread
From: Russell King @ 2001-08-22 18:40 UTC (permalink / raw)
  To: george anzinger; +Cc: Victor Yodaiken, christophe barbé, linux-kernel

On Fri, Aug 17, 2001 at 12:56:31PM -0700, george anzinger wrote:
> Uh..?  I though that was what I was allowing.  It seems to me to be a
> lot of extra work to put the same code in 15 different archs. 
> Especially if one does not really know each of them, nor can any one
> group (or individual) be expected to be able to test (or even have the
> hardware to test) each of them.

Umm, my best advice is to look at sys_fork() and do_fork(), sys_execve()
and do_execve().

--
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(),  exit(), SIGCHLD)
  2001-08-22 18:40                           ` Russell King
@ 2001-08-23 20:04                             ` george anzinger
  2001-08-23 20:11                               ` Russell King
  0 siblings, 1 reply; 20+ messages in thread
From: george anzinger @ 2001-08-23 20:04 UTC (permalink / raw)
  To: Russell King; +Cc: Victor Yodaiken, christophe barbé, linux-kernel

Russell King wrote:
> 
> On Fri, Aug 17, 2001 at 12:56:31PM -0700, george anzinger wrote:
> > Uh..?  I though that was what I was allowing.  It seems to me to be a
> > lot of extra work to put the same code in 15 different archs.
> > Especially if one does not really know each of them, nor can any one
> > group (or individual) be expected to be able to test (or even have the
> > hardware to test) each of them.
> 
> Umm, my best advice is to look at sys_fork() and do_fork(), sys_execve()
> and do_execve().
> 
Sorry, but none of those system calls requires the registers which is
where the problem is.

George

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD)
  2001-08-23 20:04                             ` george anzinger
@ 2001-08-23 20:11                               ` Russell King
  2001-08-23 21:13                                 ` george anzinger
  0 siblings, 1 reply; 20+ messages in thread
From: Russell King @ 2001-08-23 20:11 UTC (permalink / raw)
  To: george anzinger; +Cc: Victor Yodaiken, christophe barbé, linux-kernel

On Thu, Aug 23, 2001 at 01:04:09PM -0700, george anzinger wrote:
> Sorry, but none of those system calls requires the registers which is
> where the problem is.

/* Fork a new task - this creates a new program thread.
 * This is called indirectly via a small wrapper
 */
asmlinkage int sys_fork(struct pt_regs *regs)
                        ^^^^^^^^^^^^^^^^^^^^
{
        return do_fork(SIGCHLD, regs->ARM_sp, regs, 0);
}

/* sys_execve() executes a new program.
 * This is called indirectly via a small wrapper
 */
asmlinkage int
sys_execve(char *filenamei, char **argv, char **envp, struct pt_regs *regs)
                                                     ^^^^^^^^^^^^^^^^^^^^^
{
        int error;
        char * filename;

        filename = getname(filenamei);
        error = PTR_ERR(filename);
        if (IS_ERR(filename))
                goto out;
        error = do_execve(filename, argv, envp, regs);
        putname(filename);
out:
        return error;
}

Certainly looks to me like they do.  See the highlighted arguments.

--
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(),  exit(), SIGCHLD)
  2001-08-23 20:11                               ` Russell King
@ 2001-08-23 21:13                                 ` george anzinger
  0 siblings, 0 replies; 20+ messages in thread
From: george anzinger @ 2001-08-23 21:13 UTC (permalink / raw)
  To: Russell King; +Cc: Victor Yodaiken, christophe barbé, linux-kernel

Russell King wrote:
> 
> On Thu, Aug 23, 2001 at 01:04:09PM -0700, george anzinger wrote:
> > Sorry, but none of those system calls requires the registers which is
> > where the problem is.
> 
> /* Fork a new task - this creates a new program thread.
>  * This is called indirectly via a small wrapper
>  */
> asmlinkage int sys_fork(struct pt_regs *regs)
>                         ^^^^^^^^^^^^^^^^^^^^
> {
>         return do_fork(SIGCHLD, regs->ARM_sp, regs, 0);
> }
> 
> /* sys_execve() executes a new program.
>  * This is called indirectly via a small wrapper
>  */
> asmlinkage int
> sys_execve(char *filenamei, char **argv, char **envp, struct pt_regs *regs)
>                                                      ^^^^^^^^^^^^^^^^^^^^^
> {
>         int error;
>         char * filename;
> 
>         filename = getname(filenamei);
>         error = PTR_ERR(filename);
>         if (IS_ERR(filename))
>                 goto out;
>         error = do_execve(filename, argv, envp, regs);
>         putname(filename);
> out:
>         return error;
> }
> 
> Certainly looks to me like they do.  See the highlighted arguments.
> 
I stand corrected!  Guess I should look before I leap, uh email.  OK,
here is the issue I am dealing with.  I would like to define a new
system call \x1f(clock_nanosleep()) to conform to the extended POSIX
standard.  It, as well as the current nanosleep() needs the regs. to
call do_signal(), however, it is strictly a pass thru, do_signal() uses
the regs, not the main body of the sleep functions.  The various
platforms have different ways of passing the regs to the system call and
also different ways of calling do_signal().  The failure to be able to
call do_signal() results in the call not properly ignoring ptrace
signals, see the beginning of this thread.  What I would like to do is
to abstract the interface to the system calls (just the nanosleep ones
will do for now) so that the common code will work (except for the
ptrace problem) while leaving room for the platform code to pick up the
ball as they support people feel so inclined.  This means that some of
them would have to write the required wrappers and all would have to
define the macros that abstract the interface.  Here is what I have so
far:

In include/linux/signal.h
#ifndef PT_REGS_ENTRY
#define PT_REGS_ENTRY(type,name,p1_type,p1,p2_type,p2) \
type name(pi_type p1,p2_type,p2) {

#define _do_signal() 1
#endif

In include/asm_i386/signal.h

asmlinkage int FASTCALL(do_signal(struct pt_regs *regs, sigset_t
*oldset));
#define PT_REGS_ENTRY(type,name,p1_type,p1,p2_type,p2) \
type name(p1_type p1,p2_type p2) \
{	struct pt_regs *regs=(struct pt_regs *)&p1;
#define _do_signal() do_signal(regs,NULL)

and in kernel/timer.c

PT_REGS_ENTRY(asmlinkage long,sys_nanosleep,struct timespec*,rqtp,struct
timespec* ,rmtp)
struct timespec t;.....

do{
	current->state = TASK_INTERRUPTABLE;
}while((expire = schedule_timeout(expire)) && !_do_signal());

I think this lets us get on with fixing the i386 branch without breaking
any other branch and still leaves it easy for other branches to catch up
as they find the time to do so.


The question I haven't had time to research is if there is enough
flexibility for all the other platforms.  Certainly ARM can insert the
additional parameter in the system call using this.  

If you want my $.02 worth, the pt_regs should be addressable without
reference to the call stack.  To the best of my knowledge, they are
always at the base of the kernel stack (assuming we ignore the case of
kernel interrupts where we don't allow signal delivery in any case) and
so should be find able by doing a bit of math on the stack pointer.  How
does the hardware come up with a stack pointer when interrupting/
calling from user space?  Food for thought, and of course, it is
different for each platform.

George

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2001-08-23 21:16 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-08-13  8:29 ptrace(), fork(), sleep(), exit(), SIGCHLD Bruce Janson
2001-08-14  7:28 ` christophe barbé
2001-08-14 15:06   ` Bruce Janson
2001-08-15 15:46     ` george anzinger
2001-08-15 17:53       ` george anzinger
2001-08-15 18:02       ` george anzinger
2001-08-16  0:59         ` How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) george anzinger
2001-08-16 10:17           ` christophe barbé
2001-08-16 10:29             ` Russell King
2001-08-16 14:16               ` george anzinger
2001-08-16 16:00               ` christophe barbé
2001-08-16 16:12                 ` Russell King
2001-08-16 18:17                   ` george anzinger
2001-08-17 18:25                     ` george anzinger
2001-08-17 18:57                       ` Victor Yodaiken
2001-08-17 19:56                         ` george anzinger
2001-08-22 18:40                           ` Russell King
2001-08-23 20:04                             ` george anzinger
2001-08-23 20:11                               ` Russell King
2001-08-23 21:13                                 ` george anzinger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).