linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: qianli zhao <zhaoqianligood@gmail.com>
To: Oleg Nesterov <oleg@redhat.com>
Cc: christian@brauner.io, axboe@kernel.dk,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Collingbourne <pcc@google.com>,
	linux-kernel@vger.kernel.org, Qianli Zhao <zhaoqianli@xiaomi.com>
Subject: Re: [PATCH V3] exit: trigger panic when global init has exited
Date: Tue, 23 Mar 2021 19:23:11 +0800	[thread overview]
Message-ID: <CAPx_LQH0Qx7R9vNb3FHgO7V8uJ9AAh0j_TZV-VnyBF7Ys_7FTQ@mail.gmail.com> (raw)
In-Reply-To: <20210323090035.GA25159@redhat.com>

Hi,Oleg

> You certainly don't understand me :/

> Please read my email you quoted below. I didn't mean the current logic.
> I meant the logic after your patch which moves atomic_dec_and_test() and
> panic() before exit_signals().

Sorry, I think I see what you mean now.

You mean that after apply my patch,SIGNAL_GROUP_EXIT no longer needs
to be tested or avoid zap_pid_ns_processes()->BUG().
Yes,your consideration is correct.
But,my patch has another purpose,protect some key variables(such
as:task->mm,task->nsproxy,etc) to recover init coredump from
fulldump,if sub-threads finish do_exit(),these variables of sub-task
will be lost,and we cannot parse the coredump of the init process
through the tool normally such as "gcore".


Oleg Nesterov <oleg@redhat.com> 于2021年3月23日周二 下午5:00写道:
>
> On 03/23, qianli zhao wrote:
> >
> > Hi,Oleg
> >
> > > No, there is at least one alive init thread. If they all have exited, we have
> > > the thread which calls panic() above.
> >
> > By current logic, setting PF_EXITING(exit_signals()) is before the
> > panic(),
>
> You certainly don't understand me :/
>
> Please read my email you quoted below. I didn't mean the current logic.
> I meant the logic after your patch which moves atomic_dec_and_test() and
> panic() before exit_signals().
>
> Oleg.
>
> > find_alive_thread() determines the PF_EXITING of all child
> > threads, the panic thread's PF_EXITING has been set before panic(),so
> > find_alive_thread() thinks this thread also dead, resulting in
> > find_alive_thread returning NULL.It is possible to trigger a
> > zap_pid_ns_processes()->BUG() in this case.
> > ========
> > exit_signals(tsk);  /* sets PF_EXITING */
> > ...
> > group_dead = atomic_dec_and_test(&tsk->signal->live);
> > if (group_dead) {
> > if (unlikely(is_global_init(tsk)))
> > panic("Attempted to kill init!
> > exitcode=0x%08x\n",-------------------->//PF_EXITING has been set
> > tsk->signal->group_exit_code ?: (int)code);
> >
> > =======
> >
> > > Why do you think so? It can affect _any_ code which runs under
> > > "if (group_dead)". Again, I don't see anything wrong, but I didn't even
> > > try to audit these code paths.
> >
> > Yes,all places where checked the "signal->live" may be affected,but
> > even before my changes, each program that checks "signal->live" may
> > get different state(group_dead or not), depending on the timing of the
> > caller,this situation will not change after my change.
> > After my patch,"signal->live--" and other variable are set in a
> > different order(such as signal->live and PF_EXITING),this can cause
> > abnormalities in the logic associated with these two variables,that is
> > my thinking.
> > Of course, check all the "signal->live--" path is definitely
> > necessary,it's just the case above that we need more attention.
> >
> > Thanks
> >
> > Oleg Nesterov <oleg@redhat.com> 于2021年3月23日周二 上午12:37写道:
> > >
> > > Hi,
> > >
> > > It seems that we don't understand each other.
> > >
> > > If we move atomic_dec_and_test(signal->live) and do
> > >
> > >         if (group_dead && is_global_init)
> > >                 panic(...);
> > >
> > >
> > > before setting PF_EXITING like your patch does, then zap_pid_ns_processes()
> > > simply won't be called.
> > >
> > > Because:
> > >
> > > On 03/21, qianli zhao wrote:
> > > >
> > > > Hi,Oleg
> > > >
> > > > > How? Perhaps I missed something again, but I don't think this is possible.
> > > >
> > > > > zap_pid_ns_processes() simply won't be called, find_child_reaper() will
> > > > > see the !PF_EXITING thread which calls panic().
> > > >
> > > > > So I think this should be documented somehow, at least in the changelog.
> > > >
> > > > This problem occurs when both two init threads enter the do_exit,
> > > > One of the init thread is syscall sys_exit_group,and set SIGNAL_GROUP_EXIT
> > > > The other init thread perform ret_to_user()->get_signal() and found
> > > > SIGNAL_GROUP_EXIT is set,then do_group_exit()->do_exit(),since there
> > > > are no alive init threads it finally goes to
> > > > zap_pid_ns_processes()
> > >
> > > No, there is at least one alive init thread. If they all have exited, we have
> > > the thread which calls panic() above.
> > >
> > > > and BUG().
> > >
> > > so we don't need the SIGNAL_GROUP_EXIT check to avoid this BUG().
> > >
> > > What have I missed?
> > >
> > > Oleg.
> > >
> >
>

  reply	other threads:[~2021-03-23 11:24 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-17 12:51 [PATCH V3] exit: trigger panic when global init has exited Qianli Zhao
2021-03-17 14:38 ` Oleg Nesterov
2021-03-18  2:47   ` qianli zhao
2021-03-18 18:04     ` Oleg Nesterov
2021-03-18 19:08       ` Eric W. Biederman
2021-03-19  6:33         ` qianli zhao
2021-03-19 16:34           ` Oleg Nesterov
2021-03-19 16:26         ` Oleg Nesterov
2021-03-21 16:00         ` qianli zhao
2021-03-22 17:07           ` Oleg Nesterov
2021-03-22 17:09             ` Oleg Nesterov
2021-03-19  5:08       ` qianli zhao
2021-03-19 16:32         ` Oleg Nesterov
2021-03-21 13:04           ` qianli zhao
2021-03-22 16:37             ` Oleg Nesterov
2021-03-23  3:14               ` qianli zhao
2021-03-23  9:00                 ` Oleg Nesterov
2021-03-23 11:23                   ` qianli zhao [this message]
2021-03-24 18:12                     ` Oleg Nesterov
2021-03-25  3:00                       ` qianli zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPx_LQH0Qx7R9vNb3FHgO7V8uJ9AAh0j_TZV-VnyBF7Ys_7FTQ@mail.gmail.com \
    --to=zhaoqianligood@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=christian@brauner.io \
    --cc=ebiederm@xmission.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=pcc@google.com \
    --cc=tglx@linutronix.de \
    --cc=zhaoqianli@xiaomi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).