Date: Wed, 5 Oct 2016 11:17:46 +0200
From: Michal Hocko
To: Oleg Nesterov
Cc: Andrey Ryabinin, Alexander Viro, Tejun Heo, "Rafael J. Wysocki",
    Pavel Machek, linux-pm@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH] coredump: fix unfreezable coredumping task
Message-ID: <20161005091745.GA7138@dhcp22.suse.cz>
References: <1475225434-3753-1-git-send-email-aryabinin@virtuozzo.com>
    <20160930124741.GA10356@redhat.com>
    <20161004071804.GA32234@dhcp22.suse.cz>
    <20161004161304.GA32428@redhat.com>
In-Reply-To: <20161004161304.GA32428@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue 04-10-16 18:13:05, Oleg Nesterov wrote:
> On 10/04, Michal Hocko wrote:
> >
> > On Fri 30-09-16 14:47:41, Oleg Nesterov wrote:
> > > On 09/30, Andrey Ryabinin wrote:
> > > >
> > > > @@ -423,7 +424,9 @@ static int coredump_wait(int exit_code, struct core_state *core_state)
> > > >  	if (core_waiters > 0) {
> > > >  		struct core_thread *ptr;
> > > >
> > > > +		freezer_do_not_count();
> > > >  		wait_for_completion(&core_state->startup);
> > > > +		freezer_count();
> > >
> > > Agreed... we could probably even do
> > >
> > > --- x/fs/coredump.c
> > > +++ x/fs/coredump.c
> > > @@ -423,7 +423,13 @@ static int coredump_wait(int exit_code,
> > >  	if (core_waiters > 0) {
> > >  		struct core_thread *ptr;
> > >
> > > -		wait_for_completion(&core_state->startup);
> > > +		if (wait_for_completion_interruptible(&core_state->startup)) {
> > > +			/* see the comment in dump_interrupted() */
> > > +			down_write(&mm->mmap_sem);
> > > +			coredump_finish(mm, false);
> > > +			up_write(&mm->mmap_sem);
> > > +			return -EINTR;
> > > +		}
> > >  		/*
> > >  		 * Wait for all the threads to become inactive, so that
> > >  		 * all the thread context (extended register state, like
> >
> > This looks like a very good idea to me. We really want to make the whole
> > coredump_wait killable.
>
> Well, it is already killable.

Except wait_for_completion is not killable, and the exiting tasks might
be blocked in a !killable state, which prevents this one from
continuing. But...

> And with the change above it can sleep
> in down_write(mmap_sem) and we really need this lock to abort, so it
> won't necessarily react to SIGKILL faster.

you are right that somebody might be holding mmap_sem and we cannot get
rid of it here.

> > I guess this should help us to remove the
> > hackish sig->flags & SIGNAL_GROUP_COREDUMP check from
> > __task_will_free_mem.
>
> Why? This doesn't depend on "killable". __task_will_free_mem() checks
> this flag to detect the CLONE_VM processes which won't exit soon because
> they participate in the coredumping.

I just (wrongly) assumed that if we make this path completely killable
we can guarantee forward progress and get rid of the
SIGNAL_GROUP_COREDUMP check. But you are right, this won't be
sufficient.

-- 
Michal Hocko
SUSE Labs
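
For readers following the thread, here is a rough userspace toy model of why bracketing the wait with freezer_do_not_count()/freezer_count() lets the freezer make progress. It is not kernel code. Every name in it (toy_task, toy_freezer_do_not_count, toy_freezer_count, toy_try_to_freeze_tasks) is hypothetical, and it assumes only that the freezer ignores tasks that have marked themselves as "do not count" and keeps waiting for any other task that cannot reach a freeze point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types and helpers are hypothetical, not kernel APIs. */
struct toy_task {
	bool frozen;                  /* task has reached a freeze point */
	bool freezer_skip;            /* task asked the freezer not to count it */
	bool blocked_uninterruptibly; /* task sleeps where it cannot freeze */
};

/* Models the intent of freezer_do_not_count(): mark the task as skippable. */
static void toy_freezer_do_not_count(struct toy_task *t)
{
	t->freezer_skip = true;
}

/*
 * Models the intent of freezer_count(): the task is countable again.
 * (The real helper is also a freeze point; that part is not modelled.)
 */
static void toy_freezer_count(struct toy_task *t)
{
	t->freezer_skip = false;
}

/*
 * One freezer pass. Returns how many tasks the freezer still has to wait
 * for; a non-zero result stands for "freezing of tasks failed".
 */
static size_t toy_try_to_freeze_tasks(struct toy_task *tasks, size_t n)
{
	size_t todo = 0;

	assert(tasks != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct toy_task *t = &tasks[i];

		if (t->frozen || t->freezer_skip)
			continue;         /* nothing to wait for */
		if (!t->blocked_uninterruptibly) {
			t->frozen = true; /* task can freeze right away */
			continue;
		}
		todo++;                   /* stuck task blocks the freezer */
	}
	return todo;
}
```

In this model a dumper stuck in an uninterruptible wait leaves one task outstanding, while the same dumper wrapped in the do_not_count/count pair leaves none, which is the behaviour the original patch is after.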