* [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
@ 2009-05-25 18:55 Oleg Nesterov
  2009-05-25 19:39 ` Oleg Nesterov
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Oleg Nesterov @ 2009-05-25 18:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Ingo Molnar, Pavel Emelyanov, Roland McGrath,
	Sukadev Bhattiprolu, linux-kernel

If the non-traced sub-thread calls do_notify_parent_cldstop(), we send the
notification to group_leader->real_parent and we report group_leader's pid.

But if group_leader is traced, we use the wrong ->parent->nsproxy->pid_ns:
the tracer and the real parent can live in different namespaces. Change the
code to use "parent" instead of tsk->parent.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>

--- PTRACE/kernel/signal.c~CLDSTOP_NS	2009-05-25 20:24:50.000000000 +0200
+++ PTRACE/kernel/signal.c	2009-05-25 20:33:37.000000000 +0200
@@ -1496,7 +1496,7 @@ static void do_notify_parent_cldstop(str
 	 * see comment in do_notify_parent() abot the following 3 lines
 	 */
 	rcu_read_lock();
-	info.si_pid = task_pid_nr_ns(tsk, tsk->parent->nsproxy->pid_ns);
+	info.si_pid = task_pid_nr_ns(tsk, parent->nsproxy->pid_ns);
 	info.si_uid = __task_cred(tsk)->uid;
 	rcu_read_unlock();
 



* Re: [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
  2009-05-25 18:55 [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage Oleg Nesterov
@ 2009-05-25 19:39 ` Oleg Nesterov
  2009-05-27  1:06   ` Roland McGrath
  2009-05-26 21:05 ` Roland McGrath
  2009-06-02  4:48 ` Sukadev Bhattiprolu
  2 siblings, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2009-05-25 19:39 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Ingo Molnar, Pavel Emelyanov, Roland McGrath,
	Sukadev Bhattiprolu, linux-kernel

On 05/25, Oleg Nesterov wrote:
>
> If the non-traced sub-thread calls do_notify_parent_cldstop(), we send the
> notification to group_leader->real_parent and we report group_leader's pid.
>
> But, if group_leader is traced we use the wrong ->parent->nsproxy->pid_ns,
> the tracer and parent can live in different namespaces. Change the code
> to use "parent" instead of tsk->parent.
>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
>
> --- PTRACE/kernel/signal.c~CLDSTOP_NS	2009-05-25 20:24:50.000000000 +0200
> +++ PTRACE/kernel/signal.c	2009-05-25 20:33:37.000000000 +0200
> @@ -1496,7 +1496,7 @@ static void do_notify_parent_cldstop(str
>  	 * see comment in do_notify_parent() abot the following 3 lines
>  	 */
>  	rcu_read_lock();
> -	info.si_pid = task_pid_nr_ns(tsk, tsk->parent->nsproxy->pid_ns);
> +	info.si_pid = task_pid_nr_ns(tsk, parent->nsproxy->pid_ns);


While this change is correct in any case (I hope), I wonder whether
we need another one:

	--- a/kernel/signal.c
	+++ b/kernel/signal.c
	@@ -1483,12 +1483,12 @@ static void do_notify_parent_cldstop(str
		struct task_struct *parent;
		struct sighand_struct *sighand;
	 
	+	if (!task_ptrace(tsk))
	+		tsk = tsk->group_leader;
	+
	+	parent = tsk->real_parent;
		if (task_ptrace(tsk))
			parent = tsk->parent;
	-	else {
	-		tsk = tsk->group_leader;
	-		parent = tsk->real_parent;
	-	}
	 
		info.si_signo = SIGCHLD;
		info.si_errno = 0;

If the sub-thread is not traced, but ->group_leader is, perhaps it makes
more sense to notify the leader's tracer, not parent?

Not that I think this is really important. Just curious about what was
the intent.

Oleg.



* Re: [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
  2009-05-25 18:55 [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage Oleg Nesterov
  2009-05-25 19:39 ` Oleg Nesterov
@ 2009-05-26 21:05 ` Roland McGrath
  2009-05-26 21:33   ` Oleg Nesterov
  2009-05-27 21:32   ` [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage Oleg Nesterov
  2009-06-02  4:48 ` Sukadev Bhattiprolu
  2 siblings, 2 replies; 22+ messages in thread
From: Roland McGrath @ 2009-05-26 21:05 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Christoph Hellwig, Ingo Molnar, Pavel Emelyanov,
	Sukadev Bhattiprolu, linux-kernel

Acked-by: Roland McGrath <roland@redhat.com>

Yes, all cases setting .si_pid should set it using the pid_ns of the
recipient.  Numerous other cases look wrong too, though maybe Sukadev has
outstanding patches for those already?  (It's been a while since we went
around on this, and I am fuzzy on some of the details.)  ptrace_notify,
ptrace_signal, sys_kill, do_tkill all look wrong to me.  (Only sys_kill
needs to solve the multiple-recipients problem.)


Thanks,
Roland


* Re: [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
  2009-05-26 21:05 ` Roland McGrath
@ 2009-05-26 21:33   ` Oleg Nesterov
  2009-05-27  0:55     ` Roland McGrath
  2009-05-27 21:32   ` [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage Oleg Nesterov
  1 sibling, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2009-05-26 21:33 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Andrew Morton, Christoph Hellwig, Ingo Molnar, Pavel Emelyanov,
	Sukadev Bhattiprolu, linux-kernel

On 05/26, Roland McGrath wrote:
>
> Acked-by: Roland McGrath <roland@redhat.com>
>
> Yes, all cases setting .si_pid should set it using the pid_ns of the
> recipient.  Numerous other cases look wrong too, though maybe Sukadev has
> outstanding patches for those already?  (It's been a while since we went
> around on this, and I am fuzzy on some of the details.)  ptrace_notify,
> ptrace_signal,

Yes. Perhaps it would be nice to add a helper,

	pid_t task_pid_xxx(struct task_struct *child, struct task_struct *parent)
	{
		pid_t ret;

		rcu_read_lock();
		ret = task_pid_nr_ns(child, parent->nsproxy->pid_ns);
		rcu_read_unlock();

		return ret;
	}

ptrace_notify/ptrace_signal can race with untrace + clear ->nsproxy,
probably we don't care.

> sys_kill, do_tkill all look wrong to me.

They should be fine, note the

	if (from_ancestor_ns)
		q->info.si_pid = 0;

in __send_signal(). If we send the signal "down" to the sub-namespace,
si_pid == 0 is correct. And, unlike do_notify_parent/ptrace_notify/etc
kill/tkill can't send the signal "up".

Oleg.
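
A minimal user-space sketch of the translation rule discussed above, assuming
a toy two-level namespace model. All toy_* names and the pid numbers are
hypothetical and not kernel API; only the lookup is modelled on what
pid_nr_ns() does: a task has one number per namespace level it is visible in,
and it has no number (0) in a namespace that cannot see it.

```c
#include <stdio.h>

#define TOY_MAX_LEVEL 2

/* Toy pid namespace: only the nesting depth matters here (0 = initial). */
struct toy_pid_ns {
	int level;
};

/* Toy struct pid: one number per namespace level the task is visible in. */
struct toy_pid {
	int level;                                  /* level of the task's own namespace */
	int nr[TOY_MAX_LEVEL];                      /* nr[i] is the pid seen at level i */
	const struct toy_pid_ns *ns[TOY_MAX_LEVEL]; /* namespace at level i */
};

static const struct toy_pid_ns toy_init_ns = { 0 };
static const struct toy_pid_ns toy_sub_ns = { 1 };

/* One task living in the initial namespace, one living in the sub-namespace. */
static const struct toy_pid toy_host_task = { 0, { 50 }, { &toy_init_ns } };
static const struct toy_pid toy_sub_task = { 1, { 100, 10 }, { &toy_init_ns, &toy_sub_ns } };

/* Pid of the task as seen from ns, or 0 if the task is not visible there. */
int toy_pid_nr_ns(const struct toy_pid *pid, const struct toy_pid_ns *ns)
{
	if (ns->level <= pid->level && pid->ns[ns->level] == ns)
		return pid->nr[ns->level];
	return 0;
}

/* Print the cases from the discussion. */
void toy_show(void)
{
	/* "up": a task in the sub-namespace is reported to the initial namespace */
	printf("sub task seen from init ns: %d\n",
	       toy_pid_nr_ns(&toy_sub_task, &toy_init_ns));
	/* the same task as it sees itself */
	printf("sub task seen from sub ns:  %d\n",
	       toy_pid_nr_ns(&toy_sub_task, &toy_sub_ns));
	/* "down": a sender in the ancestor namespace is invisible to the recipient */
	printf("host task seen from sub ns: %d\n",
	       toy_pid_nr_ns(&toy_host_task, &toy_sub_ns));
}
```

Calling toy_show() prints 100, 10 and 0 for the three cases. The last line
corresponds to the from_ancestor_ns case, where si_pid == 0 is the correct
value to report.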



* Re: [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
  2009-05-26 21:33   ` Oleg Nesterov
@ 2009-05-27  0:55     ` Roland McGrath
  2009-06-02  4:54       ` Sukadev Bhattiprolu
  0 siblings, 1 reply; 22+ messages in thread
From: Roland McGrath @ 2009-05-27  0:55 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Christoph Hellwig, Ingo Molnar, Pavel Emelyanov,
	Sukadev Bhattiprolu, linux-kernel

> Yes. Perhaps it would be nice to add a helper,

I agree.

> > sys_kill, do_tkill all look wrong to me.
> 
> They should be fine, note the
> 
> 	if (from_ancestor_ns)
> 		q->info.si_pid = 0;
> 
> in __send_signal(). If we send the signal "down" to the sub-namespace,
> si_pid == 0 is correct. And, unlike do_notify_parent/ptrace_notify/etc
> kill/tkill can't send the signal "up".

Ah, right.  I knew there was something around this I was forgetting.


Thanks,
Roland


* Re: [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
  2009-05-25 19:39 ` Oleg Nesterov
@ 2009-05-27  1:06   ` Roland McGrath
  2009-05-27 23:24     ` Oleg Nesterov
  0 siblings, 1 reply; 22+ messages in thread
From: Roland McGrath @ 2009-05-27  1:06 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Christoph Hellwig, Ingo Molnar, Pavel Emelyanov,
	Sukadev Bhattiprolu, linux-kernel

> While this change is correct in any case (I hope), I wonder whether
> we need another one:
[...]
> If the sub-thread is not traced, but ->group_leader is, perhaps it makes
> more sense to notify the leader's tracer, not parent?

I don't think so.

> Not that I think this is really important. Just curious about what was
> the intent.

Here is how I would describe the intent (admittedly this logic is
retrospective, not necessarily articulated as such when the code was
written).  If the triggering task is ptrace'd, this report is "for
ptrace purposes"--even if it's the CLD_STOPPED case.  Otherwise, what's
being reported is "the whole POSIX process is now stopped as per POSIX
definitions".  The latter properly goes to the parent of the process,
which is the group_leader->real_parent.


Thanks,
Roland
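
The choice described above can be sketched in user space as follows. This is
only an illustration under stated assumptions: struct toy_task, struct
toy_report and toy_cldstop_report() are hypothetical names, and the struct
carries just the fields the decision depends on, not the real task_struct.

```c
#include <stdbool.h>

/* Toy task: just the fields the recipient choice depends on. */
struct toy_task {
	bool ptraced;
	struct toy_task *group_leader;
	struct toy_task *parent;       /* the tracer while ptraced */
	struct toy_task *real_parent;
};

struct toy_report {
	struct toy_task *subject;      /* task whose pid is reported in si_pid */
	struct toy_task *recipient;    /* task that receives SIGCHLD */
};

/* Decide who is reported and who is notified when tsk triggers the report. */
struct toy_report toy_cldstop_report(struct toy_task *tsk)
{
	struct toy_report r;

	if (tsk->ptraced) {
		/* "for ptrace purposes": this thread, reported to its tracer */
		r.subject = tsk;
		r.recipient = tsk->parent;
	} else {
		/* whole-process stop: the leader, reported to the real parent */
		r.subject = tsk->group_leader;
		r.recipient = tsk->group_leader->real_parent;
	}
	return r;
}
```

Whichever branch is taken, si_pid has to be the subject's pid as seen from
the recipient's namespace. That is the point of the patch: for a non-traced
sub-thread of a traced leader, the recipient is the leader's real_parent, so
the tracer's namespace (leader->parent) is the wrong one to translate into.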


* Re: [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
  2009-05-26 21:05 ` Roland McGrath
  2009-05-26 21:33   ` Oleg Nesterov
@ 2009-05-27 21:32   ` Oleg Nesterov
  2009-05-27 22:23     ` Roland McGrath
  1 sibling, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2009-05-27 21:32 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Andrew Morton, Christoph Hellwig, Ingo Molnar, Pavel Emelyanov,
	Sukadev Bhattiprolu, linux-kernel

On 05/26, Roland McGrath wrote:
>
> Acked-by: Roland McGrath <roland@redhat.com>

I didn't remove this ack to remind Andrew about this patch ;)

> ptrace_notify, ptrace_signal,

Actually, ptrace_signal() is fine wrt pid_ns. task_pid_vnr(current->parent)
returns 0 if the tracer is not seen from the tracee's namespace, this is
correct. So, afaics only ptrace_notify() needs the (low priority) fix.

Oleg.



* Re: [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
  2009-05-27 21:32   ` [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage Oleg Nesterov
@ 2009-05-27 22:23     ` Roland McGrath
  2009-05-27 23:12       ` Oleg Nesterov
  0 siblings, 1 reply; 22+ messages in thread
From: Roland McGrath @ 2009-05-27 22:23 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Christoph Hellwig, Ingo Molnar, Pavel Emelyanov,
	Sukadev Bhattiprolu, linux-kernel

> Actually, ptrace_signal() is fine wrt pid_ns. task_pid_vnr(current->parent)
> returns 0 if the tracer is not seen from the tracee's namespace, this is
> correct. So, afaics only ptrace_notify() needs the (low priority) fix.

Oh, both are correct.  Sorry.  These are signals received by tracee, which
is current, so task_pid_vnr() dtrt.


Thanks,
Roland


* Re: [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
  2009-05-27 22:23     ` Roland McGrath
@ 2009-05-27 23:12       ` Oleg Nesterov
  2009-05-27 23:26         ` Roland McGrath
  0 siblings, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2009-05-27 23:12 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Andrew Morton, Christoph Hellwig, Ingo Molnar, Pavel Emelyanov,
	Sukadev Bhattiprolu, linux-kernel

On 05/27, Roland McGrath wrote:
>
> > Actually, ptrace_signal() is fine wrt pid_ns. task_pid_vnr(current->parent)
> > returns 0 if the tracer is not seen from the tracee's namespace, this is
> > correct. So, afaics only ptrace_notify() needs the (low priority) fix.
>
> Oh, both are correct.  Sorry.  These are signals received by tracee, which
> is current, so task_pid_vnr() dtrt.

No, task_pid_vnr(current) in ptrace_notify() is not right. If the tracer
does PTRACE_GETSIGINFO it gets the wrong .si_pid.

Oleg.



* Re: [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
  2009-05-27  1:06   ` Roland McGrath
@ 2009-05-27 23:24     ` Oleg Nesterov
  0 siblings, 0 replies; 22+ messages in thread
From: Oleg Nesterov @ 2009-05-27 23:24 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Andrew Morton, Christoph Hellwig, Ingo Molnar, Pavel Emelyanov,
	Sukadev Bhattiprolu, linux-kernel

On 05/26, Roland McGrath wrote:
>
> > While this change is correct in any case (I hope), I wonder whether
> > we need another one:
> [...]
> > If the sub-thread is not traced, but ->group_leader is, perhaps it makes
> > more sense to notify the leader's tracer, not parent?
>
> I don't think so.

Agreed.

> > Not that I think this is really important. Just curious about what was
> > the intent.
>
> Here is how I would describe the intent (admittedly this logic is
> retrospective, not necessarily articulated as such when the code was
> written).  If the triggering task is ptrace'd, this report is "for
> ptrace purposes"--even if it's the CLD_STOPPED case.  Otherwise, what's
> being reported is "the whole POSIX process is now stopped as per POSIX
> definitions".  The latter properly goes to the parent of the process,
> which is the group_leader->real_parent.

Yes. And I forgot that in this case the traced group_leader has already
reported CLD_STOPPED to tracer.

Oleg.



* Re: [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
  2009-05-27 23:12       ` Oleg Nesterov
@ 2009-05-27 23:26         ` Roland McGrath
  2009-05-27 23:43           ` Oleg Nesterov
  0 siblings, 1 reply; 22+ messages in thread
From: Roland McGrath @ 2009-05-27 23:26 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Christoph Hellwig, Ingo Molnar, Pavel Emelyanov,
	Sukadev Bhattiprolu, linux-kernel

> No, task_pid_vnr(current) in ptrace_notify() is not right. If the tracer
> does PTRACE_GETSIGINFO it gets the wrong .si_pid.

I don't follow.  PTRACE_GETSIGINFO gets the tracee's siginfo_t data--modulo
32/64 conversions it's the data structure the tracee process sees on its
stack when running a handler.  It's not like a signal sent into the
tracer's queue (like SIGCHLD in do_notify_*), where the kernel doing
translation to the tracer's context makes sense.  It's more like some
memory you read from the tracee.


Thanks,
Roland


* Re: [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
  2009-05-27 23:26         ` Roland McGrath
@ 2009-05-27 23:43           ` Oleg Nesterov
  2009-05-27 23:51             ` Roland McGrath
  0 siblings, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2009-05-27 23:43 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Andrew Morton, Christoph Hellwig, Ingo Molnar, Pavel Emelyanov,
	Sukadev Bhattiprolu, linux-kernel

On 05/27, Roland McGrath wrote:
>
> > No, task_pid_vnr(current) in ptrace_notify() is not right. If the tracer
> > does PTRACE_GETSIGINFO it gets the wrong .si_pid.
>
> I don't follow.  PTRACE_GETSIGINFO gets the tracee's siginfo_t data--modulo
> 32/64 conversions it's the data structure the tracee process sees on its
> stack when running a handler.  It's not like a signal sent into the
> tracer's queue (like SIGCHLD in do_notify_*), where the kernel doing
> translation to the tracer's context makes sense.  It's more like some
> memory you read from the tracee.

Yes, but the (minor and low priority) problem is that .si_pid recorded
in ->last_siginfo does not match the tracee's pid from the tracer pov
(if they run in different namespaces).

Suppose that we trace the task from the sub-namespace. We see its
pid == 100, but when this tracee calls ptrace_notify() it does
info.si_pid = task_pid_vnr(current), and task_pid_vnr() returns (say) 10.

Oleg.	



* Re: [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
  2009-05-27 23:43           ` Oleg Nesterov
@ 2009-05-27 23:51             ` Roland McGrath
  2009-05-28  0:05               ` Oleg Nesterov
  0 siblings, 1 reply; 22+ messages in thread
From: Roland McGrath @ 2009-05-27 23:51 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Christoph Hellwig, Ingo Molnar, Pavel Emelyanov,
	Sukadev Bhattiprolu, linux-kernel

> Yes, but the (minor and low priority) problem is that .si_pid recorded
> in ->last_siginfo does not match the tracee's pid from the tracer pov
> (if they run in different namespaces).

That is not a problem, it's how it's supposed to be.  That's what I was
just saying.  This is information about the values seen by the debugged
process.  Any translation to things meaningful in different contexts is
outside the scope of ptrace.


Thanks,
Roland


* Re: [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
  2009-05-27 23:51             ` Roland McGrath
@ 2009-05-28  0:05               ` Oleg Nesterov
  0 siblings, 0 replies; 22+ messages in thread
From: Oleg Nesterov @ 2009-05-28  0:05 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Andrew Morton, Christoph Hellwig, Ingo Molnar, Pavel Emelyanov,
	Sukadev Bhattiprolu, linux-kernel

On 05/27, Roland McGrath wrote:
>
> > Yes, but the (minor and low priority) problem is that .si_pid recorded
> > in ->last_siginfo does not match the tracee's pid from the tracer pov
> > (if they run in different namespaces).
>
> That is not a problem, it's how it's supposed to be.  That's what I was
> just saying.  This is information about the values seen by the debugged
> process.

Ah, OK then.

I misunderstood you.

Oleg.



* Re: [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
  2009-05-25 18:55 [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage Oleg Nesterov
  2009-05-25 19:39 ` Oleg Nesterov
  2009-05-26 21:05 ` Roland McGrath
@ 2009-06-02  4:48 ` Sukadev Bhattiprolu
  2 siblings, 0 replies; 22+ messages in thread
From: Sukadev Bhattiprolu @ 2009-06-02  4:48 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Christoph Hellwig, Ingo Molnar, Pavel Emelyanov,
	Roland McGrath, linux-kernel

Oleg Nesterov [oleg@redhat.com] wrote:
| If the non-traced sub-thread calls do_notify_parent_cldstop(), we send the
| notification to group_leader->real_parent and we report group_leader's pid.
| 
| But, if group_leader is traced we use the wrong ->parent->nsproxy->pid_ns,
| the tracer and parent can live in different namespaces. Change the code
| to use "parent" instead of tsk->parent.
| 
| Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>

| 
| --- PTRACE/kernel/signal.c~CLDSTOP_NS	2009-05-25 20:24:50.000000000 +0200
| +++ PTRACE/kernel/signal.c	2009-05-25 20:33:37.000000000 +0200
| @@ -1496,7 +1496,7 @@ static void do_notify_parent_cldstop(str
|  	 * see comment in do_notify_parent() abot the following 3 lines
|  	 */
|  	rcu_read_lock();
| -	info.si_pid = task_pid_nr_ns(tsk, tsk->parent->nsproxy->pid_ns);
| +	info.si_pid = task_pid_nr_ns(tsk, parent->nsproxy->pid_ns);
|  	info.si_uid = __task_cred(tsk)->uid;
|  	rcu_read_unlock();
| 


* Re: [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
  2009-05-27  0:55     ` Roland McGrath
@ 2009-06-02  4:54       ` Sukadev Bhattiprolu
  2009-06-05 15:43         ` naresh kamboju
  0 siblings, 1 reply; 22+ messages in thread
From: Sukadev Bhattiprolu @ 2009-06-02  4:54 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Oleg Nesterov, Andrew Morton, Christoph Hellwig, Ingo Molnar,
	Pavel Emelyanov, linux-kernel

Roland McGrath [roland@redhat.com] wrote:
| > Yes. Perhaps it would be nice to add a helper,
| 
| I agree.
| 
| > > sys_kill, do_tkill all look wrong to me.
| > 
| > They should be fine, note the
| > 
| > 	if (from_ancestor_ns)
| > 		q->info.si_pid = 0;
| > 
| > in __send_signal(). If we send the signal "down" to the sub-namespace,
| > si_pid == 0 is correct. And, unlike do_notify_parent/ptrace_notify/etc
| > kill/tkill can't send the signal "up".
| 
| Ah, right.  I knew there was something around this I was forgetting.

Setting si_pid to task_tgid_vnr(current) in places like do_tkill() is
slightly misleading because it can get modified later in send_signal().  We
can't set si_pid correctly in do_tkill() since we must first establish the
pid-namespace relationship, and that can mess up the control flow.

Maybe a comment will help.


* Re: [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
  2009-06-02  4:54       ` Sukadev Bhattiprolu
@ 2009-06-05 15:43         ` naresh kamboju
  2009-06-06  0:19           ` Roland McGrath
  2009-06-06  6:47           ` open_posix_testsuite: STOP + CONT + wait hang? Oleg Nesterov
  0 siblings, 2 replies; 22+ messages in thread
From: naresh kamboju @ 2009-06-05 15:43 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Roland McGrath, Oleg Nesterov, Andrew Morton, Christoph Hellwig,
	Ingo Molnar, Pavel Emelyanov, linux-kernel, naresh.kernel

Hi,

I want to report signal issues with 2.6.29.
As per my understanding, if there is a delay (sleep/nanosleep/usleep)
in the child process, the child does not report its exit status to the
parent after a combination of SIGSTOP and SIGCONT, and the parent waits
forever. So the test cases are reported as HUNG.

Below are the Open POSIX test cases that are reported as HUNG
with 2.6.29 kernels.
1.ltp/testcases/open_posix_testsuite/conformance/interfaces/clock_nanosleep/1-5.c
2.ltp/testcases/open_posix_testsuite/conformance/interfaces/nanosleep/3-2.c


ARCH: ARM
KERNEL: 2.6.29.1
Glibc: 2.9
Gcc: 4.3.3

/*****************************************************************/
 open_posix_testsuite/conformance/interfaces/clock_nanosleep/1-5.c

/*****************************************************************/
/*
 * Copyright (c) 2002-3, Intel Corporation. All rights reserved.
 * Created by:  julie.n.fleischer REMOVE-THIS AT intel DOT com
 * This file is licensed under the GPL license.  For the full content
 * of this license, see the COPYING file at the top level of this
 * source tree.

 * Test that clock_nanosleep() does not stop if a signal is received
 * that has no signal handler.  clock_nanosleep() should still respond
 * to the signal, but should resume after a SIGCONT signal is received.
 *
 * SIGSTOP will be used to stop the sleep.
 */
#include <stdio.h>
#include <time.h>
#include <signal.h>
#include <unistd.h>
#include <sys/wait.h>
#include <stdlib.h>

#define PTS_PASS        0
#define PTS_FAIL        1
#define PTS_UNRESOLVED  2
#define PTS_UNSUPPORTED 4
#define PTS_UNTESTED    5


#define SLEEPSEC 5

#define CHILDPASS 1
#define CHILDFAIL 0

int main(int argc, char *argv[])
{
	int pid, slepts;
	struct timespec tsbefore, tsafter;

	if (clock_gettime(CLOCK_REALTIME, &tsbefore) != 0) {
		perror("clock_gettime() did not return success\n");
		return PTS_UNRESOLVED;
	}


	if ((pid = fork()) == 0) {
		/* child here */
		struct timespec tssleep;

		tssleep.tv_sec=SLEEPSEC;
		tssleep.tv_nsec=0;
		if (clock_nanosleep(CLOCK_REALTIME, 0, &tssleep, NULL) == 0) {
			printf("clock_nanosleep() returned success\n");
			return CHILDPASS;
		} else {
			printf("clock_nanosleep() did not return success\n");
			return CHILDFAIL;
		}
		return CHILDFAIL;
	} else {
		/* parent here */
		int i;

		sleep(1);

		if (kill(pid, SIGSTOP) != 0) {
			printf("Could not raise SIGSTOP\n");
			return PTS_UNRESOLVED;
		}

		if (kill(pid, SIGCONT) != 0) {
			printf("Could not raise SIGCONT\n");
			return PTS_UNRESOLVED;
		}

		if (wait(&i) == -1) {
			perror("Error waiting for child to exit\n");
			return PTS_UNRESOLVED;
		}

		if (!WIFEXITED(i) || !WEXITSTATUS(i)) {
			printf("Test FAILED\n");
			return PTS_FAIL;
		}

		if (clock_gettime(CLOCK_REALTIME, &tsafter) == -1) {
			perror("Error in clock_gettime()\n");
			return PTS_UNRESOLVED;
		}

		slepts=tsafter.tv_sec-tsbefore.tv_sec;

#ifdef DEBUG
		printf("Start %d sec; End %d sec\n", (int) tsbefore.tv_sec,
				(int) tsafter.tv_sec);
#endif
		if (slepts >= SLEEPSEC) {
			printf("Test PASSED\n");
			return PTS_PASS;
		} else {
			printf("clock_nanosleep() did not sleep long enough\n");
			return PTS_FAIL;
		}

	} //end fork

	return PTS_UNRESOLVED;
}



/*****************************************************************/
 open_posix_testsuite/conformance/interfaces/nanosleep/3-2.c

/*****************************************************************/
/*
 * Copyright (c) 2002, Intel Corporation. All rights reserved.
 * Created by:  julie.n.fleischer REMOVE-THIS AT intel DOT com
 * This file is licensed under the GPL license.  For the full content
 * of this license, see the COPYING file at the top level of this
 * source tree.
 *
 * Regression test motivated by an LKML discussion.  Test that nanosleep()
 * can be interrupted and then continue.
 */
#include <stdio.h>
#include <time.h>
#include <signal.h>
#include <unistd.h>
#include <sys/wait.h>
#include <stdlib.h>

#define PTS_PASS        0
#define PTS_FAIL        1
#define PTS_UNRESOLVED  2
#define PTS_UNSUPPORTED 4
#define PTS_UNTESTED    5

#define SLEEPSEC 5

#define CHILDPASS 0 //if interrupted, child will return 0
#define CHILDFAIL 1

int main(int argc, char *argv[])
{
	int pid, slepts;
	struct timespec tsbefore, tsafter;

	if (clock_gettime(CLOCK_REALTIME, &tsbefore) != 0) {
		perror("clock_gettime() did not return success\n");
		return PTS_UNRESOLVED;
	}


	if ((pid = fork()) == 0) {
		/* child here */
		struct timespec tssleep;

		tssleep.tv_sec=SLEEPSEC;
		tssleep.tv_nsec=0;
		if (nanosleep(&tssleep, NULL) == 0) {
			printf("nanosleep() returned success\n");
			return CHILDPASS;
		} else {
			printf("nanosleep() did not return success\n");
			return CHILDFAIL;
		}
		return CHILDFAIL;
	} else {
		/* parent here */
		int i;

		sleep(1);

		if (kill(pid, SIGSTOP) != 0) {
			printf("Could not raise SIGSTOP\n");
			return PTS_UNRESOLVED;
		}

		if (kill(pid, SIGCONT) != 0) {
			printf("Could not raise SIGCONT\n");
			return PTS_UNRESOLVED;
		}

		if (wait(&i) == -1) {
			perror("Error waiting for child to exit\n");
			return PTS_UNRESOLVED;
		}

		if (!WIFEXITED(i)) {
			printf("nanosleep() did not return 0\n");
			return PTS_FAIL;
		}

		if (clock_gettime(CLOCK_REALTIME, &tsafter) == -1) {
			perror("Error in clock_gettime()\n");
			return PTS_UNRESOLVED;
		}

		slepts=tsafter.tv_sec-tsbefore.tv_sec;

		printf("Start %d sec; End %d sec\n", (int) tsbefore.tv_sec,
				(int) tsafter.tv_sec);
		if (slepts >= SLEEPSEC) {
			printf("Test PASSED\n");
			return PTS_PASS;
		} else {
			printf("nanosleep() did not sleep long enough\n");
			return PTS_FAIL;
		}

	} //end fork

	return PTS_UNRESOLVED;
}


/*****************************************************************/

thanks for your time.

Best regards,
Naresh Kamboju

On Tue, Jun 2, 2009 at 10:24 AM, Sukadev
Bhattiprolu<sukadev@linux.vnet.ibm.com> wrote:
> Roland McGrath [roland@redhat.com] wrote:
> | > Yes. Perhaps it would be nice to add a helper,
> |
> | I agree.
> |
> | > > sys_kill, do_tkill all look wrong to me.
> | >
> | > They should be fine, note the
> | >
> | >     if (from_ancestor_ns)
> | >             q->info.si_pid = 0;
> | >
> | > in __send_signal(). If we send the signal "down" to the sub-namespace,
> | > si_pid == 0 is correct. And, unlike do_notify_parent/ptrace_notify/etc
> | > kill/tkill can't send the signal "up".
> |
> | Ah, right.  I knew there was something around this I was forgetting.
>
> Setting si_pid to task_tgid_vnr(current); in places like do_tkill() is
> slightly misleading bc, it can get modified later in send_signal().  We
> can't set si_pid correctly in do_tkill() since we must first establish
> pid-namespace relationship and that can mess up control flow.
>
> Maybe a comment will help.


* Re: [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
  2009-06-05 15:43         ` naresh kamboju
@ 2009-06-06  0:19           ` Roland McGrath
  2009-06-06  6:47           ` open_posix_testsuite: STOP + CONT + wait hang? Oleg Nesterov
  1 sibling, 0 replies; 22+ messages in thread
From: Roland McGrath @ 2009-06-06  0:19 UTC (permalink / raw)
  To: naresh kamboju
  Cc: Sukadev Bhattiprolu, Oleg Nesterov, Andrew Morton,
	Christoph Hellwig, Ingo Molnar, Pavel Emelyanov, linux-kernel

This has absolutely nothing to do with the patch/thread you replied to.
Post about your signals issues separately.


* open_posix_testsuite: STOP + CONT + wait hang?
  2009-06-05 15:43         ` naresh kamboju
  2009-06-06  0:19           ` Roland McGrath
@ 2009-06-06  6:47           ` Oleg Nesterov
  2009-06-17  8:35             ` naresh kamboju
  1 sibling, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2009-06-06  6:47 UTC (permalink / raw)
  To: naresh kamboju
  Cc: Sukadev Bhattiprolu, Roland McGrath, Andrew Morton,
	Christoph Hellwig, Ingo Molnar, Pavel Emelyanov, linux-kernel

(change the subject)

On 06/05, naresh kamboju wrote:
>
> I want to inform 2.6.29 signal issues,
> As per my understanding I have noticed that if there is a delay
> (sleep/nanosleep/usleep) in the child process. Child could not
> reporting exit status to parent at this situation parent is waiting
> for ever by combinations of SIGSTOP and SIGCONT. So test cases are
> reporting as HUNG.

Thanks for the report, but please provide more info.

> ARCH: ARM

is it ARM specific? I can't reproduce the problem on x86.

> KERNEL: 2.6.29.1

did you try other kernel versions?

> #define SLEEPSEC 5
...
> 	if ((pid = fork()) == 0) {
> 		/* child here */
> 		struct timespec tssleep;
>
> 		tssleep.tv_sec=SLEEPSEC;
> 		tssleep.tv_nsec=0;
> 		if (clock_nanosleep(CLOCK_REALTIME, 0, &tssleep, NULL) == 0) {
> 			printf("clock_nanosleep() returned success\n");
> 			return CHILDPASS;
> 		} else {
> 			printf("clock_nanosleep() did not return success\n");
> 			return CHILDFAIL;
> 		}
> 		return CHILDFAIL;
> 	} else {
> 		/* parent here */
> 		int i;
>
> 		sleep(1);
>
> 		if (kill(pid, SIGSTOP) != 0) {
> 			printf("Could not raise SIGSTOP\n");
> 			return PTS_UNRESOLVED;
> 		}
>
> 		if (kill(pid, SIGCONT) != 0) {
> 			printf("Could not raise SIGCONT\n");
> 			return PTS_UNRESOLVED;
> 		}
>
> 		if (wait(&i) == -1) {

And I guess it hangs here, right?

The child should sleep SLEEPSEC seconds then exit, so the whole
test-case should take SLEEPSEC seconds too.


Do you mean it really hangs and never completes?

Can you confirm it hangs in wait() ?

Does the child print "returned success" ?

If you can reproduce the problem, please send the content of
/proc/CHILD_PID/status and /proc/PARENT_PID/status.

Oleg.
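
One way to narrow down where such a test gets stuck is to observe each state
change explicitly instead of blocking in a plain wait(). The following is a
minimal sketch, assuming Linux with /proc mounted; proc_state() and
stop_cont_wait() are hypothetical helper names, not part of the test suite.
It is a diagnostic aid, not a replacement for the original test cases.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Read the one-letter task state from /proc/<pid>/status; '?' on failure. */
static char proc_state(pid_t pid)
{
	char path[64], line[256], state = '?';
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%ld/status", (long)pid);
	f = fopen(path, "r");
	if (!f)
		return state;
	while (fgets(line, sizeof(line), f)) {
		if (sscanf(line, "State: %c", &state) == 1)
			break;
	}
	fclose(f);
	return state;
}

/*
 * Fork a child that sleeps for sleep_sec seconds, then stop it, continue it
 * and reap it, checking every transition. Returns 0 if all steps were seen,
 * otherwise the number of the step that failed. On failure the child is left
 * alone so that its /proc entry can still be inspected.
 */
int stop_cont_wait(int sleep_sec)
{
	struct timespec ts = { .tv_sec = sleep_sec, .tv_nsec = 0 };
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return 1;
	if (pid == 0)
		_exit(nanosleep(&ts, NULL) == 0 ? 0 : 1);

	/* step 2: the stop must be reported to the parent */
	if (kill(pid, SIGSTOP) != 0 ||
	    waitpid(pid, &status, WUNTRACED) != pid || !WIFSTOPPED(status))
		return 2;
	printf("child %ld after SIGSTOP: state %c\n", (long)pid, proc_state(pid));

	/* step 3: the continue must be reported as well */
	if (kill(pid, SIGCONT) != 0 ||
	    waitpid(pid, &status, WCONTINUED) != pid || !WIFCONTINUED(status))
		return 3;
	printf("child %ld after SIGCONT: state %c\n", (long)pid, proc_state(pid));

	/* step 4: the sleep must finish and the child must exit with 0 */
	if (waitpid(pid, &status, 0) != pid ||
	    !WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return 4;
	return 0;
}
```

A call such as stop_cont_wait(5) mirrors SLEEPSEC from the test case. If the
final waitpid() never returns, the state printed after SIGCONT and the
contents of /proc/CHILD_PID/status show whether the child is still stopped
or is sleeping in nanosleep().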



* Re: open_posix_testsuite: STOP + CONT + wait hang?
  2009-06-06  6:47           ` open_posix_testsuite: STOP + CONT + wait hang? Oleg Nesterov
@ 2009-06-17  8:35             ` naresh kamboju
  2009-06-17 13:29               ` Oleg Nesterov
  0 siblings, 1 reply; 22+ messages in thread
From: naresh kamboju @ 2009-06-17  8:35 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Sukadev Bhattiprolu, Roland McGrath, Andrew Morton,
	Christoph Hellwig, Ingo Molnar, Pavel Emelyanov, linux-kernel


Hi Oleg,

please find inline comments

On Sat, Jun 6, 2009 at 12:17 PM, Oleg Nesterov<oleg@redhat.com> wrote:
> (change the subject)
>
> On 06/05, naresh kamboju wrote:
>>
>> I want to inform 2.6.29 signal issues,
>> As per my understanding I have noticed that if there is a delay
>> (sleep/nanosleep/usleep) in the child process. Child could not
>> reporting exit status to parent at this situation parent is waiting
>> for ever by combinations of SIGSTOP and SIGCONT. So test cases are
>> reporting as HUNG.
>
> Thanks for report, but please provide more info.
>
>> ARCH: ARM
>
> is it ARM specific? I can't reproduce the problem on x86.
yes it is specific to ARM. test cases are PASSED on X86-2.6.29 Kernels.
>
>> KERNEL: 2.6.29.1
>
> did you try other kernel versions?
I have tried 2.6.23 kernels; the test PASSED on both ARM and x86.
>
>> #define SLEEPSEC 5
> ...
>>       if ((pid = fork()) == 0) {
>>               /* child here */
>>               struct timespec tssleep;
>>
>>               tssleep.tv_sec=SLEEPSEC;
>>               tssleep.tv_nsec=0;
>>               if (clock_nanosleep(CLOCK_REALTIME, 0, &tssleep, NULL) == 0) {
>>                       printf("clock_nanosleep() returned success\n");
>>                       return CHILDPASS;
>>               } else {
>>                       printf("clock_nanosleep() did not return success\n");
>>                       return CHILDFAIL;
>>               }
>>               return CHILDFAIL;
>>       } else {
>>               /* parent here */
>>               int i;
>>
>>               sleep(1);
>>
>>               if (kill(pid, SIGSTOP) != 0) {
>>                       printf("Could not raise SIGSTOP\n");
>>                       return PTS_UNRESOLVED;
>>               }
>>
>>               if (kill(pid, SIGCONT) != 0) {
>>                       printf("Could not raise SIGCONT\n");
>>                       return PTS_UNRESOLVED;
>>               }
>>
>>               if (wait(&i) == -1) {
>
> And I guess it hangs here, right?
Yes, it hangs here:
wait4(-1, Process 464 suspended

>
> The child should sleep SLEEPSEC seconds then exit, so the whole
> test-case should take SLEEPSEC seconds too.
Yes, that is the expected behaviour.
>
>
> Do you mean it really hangs and never completes?
Yes, it hangs and never completes.
>
> Can you confirm it hangs in wait() ?
Yes, the wait4() syscall is waiting for the child's status.

>
> Does the child print "returned success" ?
It is not notifying the parent about its exit status.

>
> If you can reproduce the problem, please send the content of
> /proc/CHILD_PID/status and /proc/PARENT_PID/status.

I have attached the strace output and proc_log.
Please review them.

Best regards
Naresh Kamboju
>
> Oleg.

[-- Attachment #2: proc_log.txt --]
[-- Type: text/plain, Size: 1618 bytes --]

-bash-3.2# uname -a
Linux 2.6.29.2 
-bash-3.2# cat proc/464/status
Name:   3-2.test
State:  T (tracing stop)
Tgid:   464
Pid:    464
PPid:   462
TracerPid:      462
Uid:    0       0       0       0
Gid:    0       0       0       0
FDSize: 32
Groups: 0
VmPeak:     2332 kB
VmSize:     2316 kB
VmLck:         0 kB
VmHWM:       404 kB
VmRSS:       404 kB
VmData:       32 kB
VmStk:        84 kB
VmExe:         4 kB
VmLib:      2032 kB
VmPTE:         8 kB
Threads:        1
SigQ:   0/1024
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000180000000
CapInh: 0000000000000000
CapPrm: fffffffffffffeff
CapEff: fffffffffffffeff
CapBnd: fffffffffffffeff
voluntary_ctxt_switches:        126
nonvoluntary_ctxt_switches:     31
-bash-3.2#
-bash-3.2# cat proc/465/status
Name:   3-2.test
State:  R (running)
Tgid:   465
Pid:    465
PPid:   464
TracerPid:      462
Uid:    0       0       0       0
Gid:    0       0       0       0
FDSize: 32
Groups: 0
VmPeak:     2316 kB
VmSize:     2316 kB
VmLck:         0 kB
VmHWM:       164 kB
VmRSS:       164 kB
VmData:       32 kB
VmStk:        84 kB
VmExe:         4 kB
VmLib:      2032 kB
VmPTE:         8 kB
Threads:        1
SigQ:   0/1024
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000180000000
CapInh: 0000000000000000
CapPrm: fffffffffffffeff
CapEff: fffffffffffffeff
CapBnd: fffffffffffffeff
voluntary_ctxt_switches:        7
nonvoluntary_ctxt_switches:     36623
-bash-3.2#

[-- Attachment #3: nano_sleep_3-2.log --]
[-- Type: application/octet-stream, Size: 4502 bytes --]

execve("./3-2.test", ["./3-2.test"], [/* 10 vars */]) = 0
brk(0)                                  = 0x11000
uname({sys="Linux", node="43.88.101.211", ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4001d000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/devel/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=13962, ...}) = 0
mmap2(NULL, 13962, PROT_READ, MAP_PRIVATE, 3, 0) = 0x4001e000
close(3)                                = 0
open("/devel/lib/libpthread.so.0", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0008B\0\0004\0\0\0\200"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=805190, ...}) = 0
mmap2(NULL, 127472, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x40026000
mprotect(0x4003a000, 32768, PROT_NONE)  = 0
mmap2(0x40042000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x14) = 0x40042000
mmap2(0x40044000, 4592, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40044000
close(3)                                = 0
open("/devel/lib/librt.so.1", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\240\26\0\0004\0\0\0\4"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=163621, ...}) = 0
mmap2(NULL, 57876, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x40046000
mprotect(0x4004c000, 28672, PROT_NONE)  = 0
mmap2(0x40053000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x5) = 0x40053000
close(3)                                = 0
open("/devel/lib/libm.so.6", O_RDONLY)  = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\2601\0\0004\0\0\0\324"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1349258, ...}) = 0
mmap2(NULL, 688288, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x40055000
mprotect(0x400f5000, 28672, PROT_NONE)  = 0
mmap2(0x400fc000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x9f) = 0x400fc000
close(3)                                = 0
open("/devel/lib/libc.so.6", O_RDONLY)  = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0 R\1\0004\0\0\0\204"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=7855285, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40022000
mmap2(NULL, 1249800, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x400fe000
mprotect(0x40223000, 28672, PROT_NONE)  = 0
mmap2(0x4022a000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x124) = 0x4022a000
mmap2(0x4022d000, 8712, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4022d000
close(3)                                = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40023000
set_tls(0x40022c80, 0x40022c80, 0x67c, 0x40023358, 0x40025048) = 0
mprotect(0x4022a000, 8192, PROT_READ)   = 0
mprotect(0x400fc000, 4096, PROT_READ)   = 0
mprotect(0x40053000, 4096, PROT_READ)   = 0
mprotect(0x40042000, 4096, PROT_READ)   = 0
mprotect(0x40024000, 4096, PROT_READ)   = 0
munmap(0x4001e000, 13962)               = 0
set_tid_address(0x40022828)             = 464
set_robust_list(0x40022830, 0xc)        = 0
rt_sigaction(SIGRTMIN, {0x4002a128, [], SA_SIGINFO|0x4000000}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {0x40029ce8, [], SA_RESTART|SA_SIGINFO|0x4000000}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
clock_gettime(CLOCK_REALTIME, {1245240048, 291313608}) = 0
clone(Process 465 attached (waiting for parent)
Process 465 resumed (parent 464 ready)
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x40022828) = 465
[pid   465] nanosleep({5, 0},  <unfinished ...>
[pid   464] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
[pid   464] rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
[pid   464] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid   464] nanosleep({1, 0}, {1, 0})   = 0
[pid   464] kill(465, SIGSTOP <unfinished ...>
[pid   465] <... nanosleep resumed> 0)  = ? ERESTART_RESTARTBLOCK (To be restarted)
[pid   465] --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
[pid   465] --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
[pid   464] <... kill resumed> )        = 0
[pid   464] kill(465, SIGCONT)          = 0
[pid   464] wait4(-1, Process 464 suspended
 <unfinished ...>
[pid   465] --- SIGCONT (Continued) @ 0 (0) ---


* Re: open_posix_testsuite: STOP + CONT + wait hang?
  2009-06-17  8:35             ` naresh kamboju
@ 2009-06-17 13:29               ` Oleg Nesterov
  2009-06-17 14:34                 ` naresh kamboju
  0 siblings, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2009-06-17 13:29 UTC (permalink / raw)
  To: naresh kamboju
  Cc: Sukadev Bhattiprolu, Roland McGrath, Andrew Morton,
	Christoph Hellwig, Ingo Molnar, Pavel Emelyanov, linux-kernel

On 06/17, naresh kamboju wrote:
>
> please find inline comments

Thanks Naresh.

> >> ARCH: ARM
> >
> > is it ARM specific? I can't reproduce the problem on x86.
> yes it is specific to ARM. test cases are PASSED on X86-2.6.29 Kernels.

Not good, because I know nothing about arm and don't have the arm machine ;)
Will try to look...

> > If you can reproduce the problem, please send the content of
> > /proc/CHILD_PID/status and /proc/PARENT_PID/status.
>
> i have attached  strace and proc_log
> please review the same.

Thanks.

Could you reproduce without strace, and send the output of /proc/xxx/status ?
I don't really expect we will see something interesting, but just in case.

Also, the output of sysrq-t may help. (the part which relates to these 2
processes).
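
In case it helps: when no console keyboard is at hand, the same dump can usually be requested by writing the character 't' to /proc/sysrq-trigger as root, on kernels built with magic SysRq support; the task list then lands in the kernel log. A minimal sketch follows. The helper name write_sysrq() is hypothetical, and the path is passed in as a parameter so the helper can be exercised against an ordinary file.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write one SysRq command character to a trigger
 * file. Returns 0 on success or a negative errno value on failure.
 * For a real task dump, trigger_path would be "/proc/sysrq-trigger"
 * and cmd would be 't' (needs root).
 */
static int write_sysrq(const char *trigger_path, char cmd)
{
	int fd, err;
	ssize_t n;

	assert(trigger_path != NULL);

	fd = open(trigger_path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, &cmd, 1);
	if (n == 1)
		err = 0;
	else if (n < 0)
		err = -errno;
	else
		err = -EIO;	/* nothing was written */

	close(fd);
	return err;
}
```

After a successful write_sysrq("/proc/sysrq-trigger", 't'), the entries for the two test processes can be picked out of the dmesg output.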

Oleg.



* Re: open_posix_testsuite: STOP + CONT + wait hang?
  2009-06-17 13:29               ` Oleg Nesterov
@ 2009-06-17 14:34                 ` naresh kamboju
  0 siblings, 0 replies; 22+ messages in thread
From: naresh kamboju @ 2009-06-17 14:34 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Sukadev Bhattiprolu, Roland McGrath, Andrew Morton,
	Christoph Hellwig, Ingo Molnar, Pavel Emelyanov, linux-kernel

Thank you very much.
I reported the same issue to the ARM kernel mailing list just before this.


Best regards,
Naresh Kamboju

On Wed, Jun 17, 2009 at 6:59 PM, Oleg Nesterov<oleg@redhat.com> wrote:
> On 06/17, naresh kamboju wrote:
>>
>> please find inline comments
>
> Thanks Naresh.
>
>> >> ARCH: ARM
>> >
>> > is it ARM specific? I can't reproduce the problem on x86.
>> yes it is specific to ARM. test cases are PASSED on X86-2.6.29 Kernels.
>
> Not good, because I know nothing about arm and don't have the arm machine ;)
> Will try to look...
>
>> > If you can reproduce the problem, please send the content of
>> > /proc/CHILD_PID/status and /proc/PARENT_PID/status.
>>
>> i have attached  strace and proc_log
>> please review the same.
>
> Thanks.
>
> Could you reproduce without strace, and send the output of /proc/xxx/status ?
> I don't really expect we will see something interesting, but just in case.
>
> Also, the output of sysrq-t may help. (the part which relates to these 2
> processes).
>
> Oleg.
>
>


Thread overview: 22+ messages
2009-05-25 18:55 [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage Oleg Nesterov
2009-05-25 19:39 ` Oleg Nesterov
2009-05-27  1:06   ` Roland McGrath
2009-05-27 23:24     ` Oleg Nesterov
2009-05-26 21:05 ` Roland McGrath
2009-05-26 21:33   ` Oleg Nesterov
2009-05-27  0:55     ` Roland McGrath
2009-06-02  4:54       ` Sukadev Bhattiprolu
2009-06-05 15:43         ` naresh kamboju
2009-06-06  0:19           ` Roland McGrath
2009-06-06  6:47           ` open_posix_testsuite: STOP + CONT + wait hang? Oleg Nesterov
2009-06-17  8:35             ` naresh kamboju
2009-06-17 13:29               ` Oleg Nesterov
2009-06-17 14:34                 ` naresh kamboju
2009-05-27 21:32   ` [PATCH 1/1] ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage Oleg Nesterov
2009-05-27 22:23     ` Roland McGrath
2009-05-27 23:12       ` Oleg Nesterov
2009-05-27 23:26         ` Roland McGrath
2009-05-27 23:43           ` Oleg Nesterov
2009-05-27 23:51             ` Roland McGrath
2009-05-28  0:05               ` Oleg Nesterov
2009-06-02  4:48 ` Sukadev Bhattiprolu
