All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Galbraith <umgwanakikbuti@gmail.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	strace-devel@lists.sourceforge.net,
	Oleg Nesterov <oleg@redhat.com>, Aleksa Sarai <asarai@suse.com>
Subject: Re: strace lockup when tracing exec in go
Date: Thu, 22 Sep 2016 11:40:09 +0200	[thread overview]
Message-ID: <1474537209.5022.8.camel@gmail.com> (raw)
In-Reply-To: <20160922083602.GB11875@dhcp22.suse.cz>

On Thu, 2016-09-22 at 10:36 +0200, Michal Hocko wrote:
> On Thu 22-09-16 10:01:26, Michal Hocko wrote:
> > On Thu 22-09-16 06:15:02, Mike Galbraith wrote:
> > [...]
> > > master.today...
> > 
> > Thanks for trying to reproduce this. My tiny laptop (2 cores, 2 threads
> > per core) cannot reproduce even in 10 minutes or so. I've tried to use
> > the same machine I was testing with 3.12 kernel (2 sockets, 8 cores per
> > soc. and 2 threas per core) and it hit almost instantly. I have tried 
> > mutex_lock_killable -> interruptible and it didn't help as I've
> > expected. So the current kernel doesn't do any magic to prevent from the
> > issue as well.
> > 
> > So I've stared into do_notify_parent some more and the following was
> > just very confusing
> > 
> > 	> > if (!tsk->ptrace && sig == SIGCHLD &&
> > 	> >     (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN ||
> > 	> >      (psig->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDWAIT))) {
> > 	> > 	> > /*
> > 	> > 	> >  * We are exiting and our parent doesn't care.  POSIX.1
> > 	> > 	> >  * defines special semantics for setting SIGCHLD to SIG_IGN
> > 	> > 	> >  * or setting the SA_NOCLDWAIT flag: we should be reaped
> > 	> > 	> >  * automatically and not left for our parent's wait4 call.
> > 	> > 	> >  * Rather than having the parent do it as a magic kind of
> > 	> > 	> >  * signal handler, we just set this to tell do_exit that we
> > 	> > 	> >  * can be cleaned up without becoming a zombie.  Note that
> > 	> > 	> >  * we still call __wake_up_parent in this case, because a
> > 	> > 	> >  * blocked sys_wait4 might now return -ECHILD.
> > 	> > 	> >  *
> > 	> > 	> >  * Whether we send SIGCHLD or not for SA_NOCLDWAIT
> > 	> > 	> >  * is implementation-defined: we do (if you don't want
> > 	> > 	> >  * it, just use SIG_IGN instead).
> > 	> > 	> >  */
> > 	> > 	> > autoreap = true;
> > 	> > 	> > if (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN)
> > 	> > 	> > 	> > sig = 0;
> > 	> > }
> > 
> > it tries to prevent from what I am seeing in a way. If the SIGCHLD is
> > ignored then it just does autoreap and everything is fine. But this
> > doesn't seem to be the case here. In fact we are not sending the signal
> > because sig_task_ignored is true resp. sig_handler_ignored which can
> > fail even for handler == SIG_DFL && sig_kernel_ignore() and SIGCHLD
> > seems to be in SIG_KERNEL_IGNORE_MASK. So I've tried
> 
> Dohh, I've missed !tsk->ptrace check there. So we are not even going
> that via that path. So the sig_handler_ignored cannot possible help.
> I was just too lucky and didn't hit the lockup with the patch.
> 
> So what else might be wrong here? sig_ignored seems to be quite
> confusing
> 
> 	> /*
> 	>  * Tracers may want to know about even ignored signals.
> 	>  */
> 	> return !t->ptrace;
> 
> t is the tracer here but it shouldn't have t->ptrace because the child
> is not stopped. So do we need something like the following? Not tested
> yet
> 
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 1840c7f4e3c2..bd236ce4a29c 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -91,6 +91,10 @@ static int sig_ignored(struct task_struct *t, int sig, bool force)
>  > 	> if (!sig_task_ignored(t, sig, force))
>  > 	> 	> return 0;
>  
> +> 	> /* Do not ignore signals sent from child to the parent */
> +> 	> if (current->ptrace && current->parent == t)
> +> 	> 	> return 0;
> +
>  > 	> /*
>  > 	>  * Tracers may want to know about even ignored signals.
>  > 	>  */

This patch doesn't help, nor does the previous patch... but with both
applied, all is well.  All you have to do now is figure out why :)

	-Mike

  reply	other threads:[~2016-09-22  9:40 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-21 15:29 strace lockup when tracing exec in go Michal Hocko
2016-09-22  4:15 ` Mike Galbraith
2016-09-22  8:01   ` Michal Hocko
2016-09-22  8:20     ` Aleksa Sarai
2016-09-22  8:36     ` Michal Hocko
2016-09-22  9:40       ` Mike Galbraith [this message]
2016-09-22  9:53         ` Michal Hocko
2016-09-22 10:09           ` Mike Galbraith
2016-09-22 11:09             ` Michal Hocko
2016-09-22 13:53               ` Michal Hocko
2016-09-23  1:17                 ` Aleksa Sarai
2016-09-23 10:21                 ` Oleg Nesterov
2016-09-23 11:18                   ` Michal Hocko
2016-09-23 13:21                     ` Oleg Nesterov
2016-09-23 13:40                       ` Michal Hocko
2016-09-23 14:07                         ` Oleg Nesterov
2016-09-23 14:19                           ` Michal Hocko
2016-09-23  9:50 ` Oleg Nesterov
2016-09-23 11:23   ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1474537209.5022.8.camel@gmail.com \
    --to=umgwanakikbuti@gmail.com \
    --cc=asarai@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@kernel.org \
    --cc=oleg@redhat.com \
    --cc=strace-devel@lists.sourceforge.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.