From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C50C3C5CFE7 for ; Wed, 11 Jul 2018 16:02:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8020020C03 for ; Wed, 11 Jul 2018 16:02:55 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8020020C03 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=xmission.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387471AbeGKQHw (ORCPT ); Wed, 11 Jul 2018 12:07:52 -0400 Received: from out02.mta.xmission.com ([166.70.13.232]:57802 "EHLO out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726586AbeGKQHw (ORCPT ); Wed, 11 Jul 2018 12:07:52 -0400 Received: from in02.mta.xmission.com ([166.70.13.52]) by out02.mta.xmission.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.87) (envelope-from ) id 1fdHZm-00024q-9p; Wed, 11 Jul 2018 10:02:49 -0600 Received: from [97.119.167.31] (helo=x220.xmission.com) by in02.mta.xmission.com with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.87) (envelope-from ) id 1fdHZY-0004DO-Tx; Wed, 11 Jul 2018 10:02:46 -0600 From: ebiederm@xmission.com (Eric W. Biederman) To: Oleg Nesterov Cc: Linus Torvalds , Andrew Morton , linux-kernel@vger.kernel.org, Wen Yang , majiang References: <877em2jxyr.fsf_-_@xmission.com> <20180711024459.10654-11-ebiederm@xmission.com> <20180711141456.GA6636@redhat.com> Date: Wed, 11 Jul 2018 11:02:29 -0500 In-Reply-To: <20180711141456.GA6636@redhat.com> (Oleg Nesterov's message of "Wed, 11 Jul 2018 16:14:56 +0200") Message-ID: <87h8l5g3qi.fsf@xmission.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-XM-SPF: eid=1fdHZY-0004DO-Tx;;;mid=<87h8l5g3qi.fsf@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=97.119.167.31;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1+VWnJJgPmrAD86VHWscAFru7CRdj/Vhds= X-SA-Exim-Connect-IP: 97.119.167.31 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: Re: [RFC][PATCH 11/11] signal: Ignore all but multi-process signals that come in during fork. X-SA-Exim-Version: 4.2.1 (built Thu, 05 May 2016 13:38:54 -0600) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Oleg Nesterov writes: > On 07/10, Eric W. Biederman wrote: >> >> @@ -1602,6 +1603,20 @@ static __latent_entropy struct task_struct *copy_process( >> { >> int retval; >> struct task_struct *p; >> + unsigned seq; >> + >> + /* >> + * Signals that are delivered to multiple processes need to be >> + * delivered to just the parent before the fork or both the >> + * parent and the child after the fork. Cache the multiple >> + * process signal sequence number so we can detect any of >> + * these signals that happen during the fork. In the unlikely >> + * event a signal comes in while fork is starting and restart >> + * fork to handle the signal. >> + */ >> + seq = read_seqcount_begin(¤t->signal->multi_process_seq); >> + if (signal_pending(current)) >> + return ERR_PTR(-ERESTARTNOINTR); >> >> /* >> * Don't allow sharing the root directory with processes in a different >> @@ -1930,8 +1945,8 @@ static __latent_entropy struct task_struct *copy_process( >> * A fatal signal pending means that current will exit, so the new >> * thread can't slip out of an OOM kill (or normal SIGKILL). >> */ >> - recalc_sigpending(); >> - if (signal_pending(current)) { >> + if (read_seqcount_retry(¤t->signal->multi_process_seq, seq) || >> + fatal_signal_pending(current)) { >> retval = -ERESTARTNOINTR; >> goto bad_fork_cancel_cgroup; > > So once again, I think this is not right, see the discussion on > bugzilla. I am trying to dig through and understand your concerns. I am having difficulty understanding your concerns. Do the previous patches look good to you? Can we say that is a sufficient method to get the information about signals that are sent to multiple processes into __send_signal? > If signal_pending() == T we simply can't know if copy_process() can succeed or not. > I have already mentioned the races with stop/freeze, but I think there > are more. If I understand you correctly. Your concern is that since we added the: recalc_sigpending(); if (signal_pending(current)) return -ERESTARTNOINTR; Other (non-signal) code such as the freezer has come to depend upon that test. Changing the test in the proposed way will allow the new child to escape the freezer, as it is not guaranteed the new child will be frozen. It seems reasonable to look at other things that set TIF_SIGPENDING and see if any of them are broken by the fork changes. A quick look at exit_to_usermode_loop shows that TIF_SIGPENDING only triggers signal handling. In get_signal there is only task_work_run, try_to_freeze, and burried there is ptrace_stop. Plus there is restart_syscall() that sets TIF_SIGPENDING. Now that we aren't guaranteed that TIF_SIGPENDING is set before we restart, the code should be using "retval = restart_syscall();" I will fix that. I will dig in and see what attention those cases need in fork signal_pending behavior. I am hoping that it will be as simple as adding: /* Have the child return to userspace slowly * TIF_SIGPENDING was set during fork */ if (test_tsk_thread_flag(current, TIF_SIGPENDING)) set_tsk_thread_flag(p, TIF_SIGPENDING); > And in fact I think that the fact that signal_wake_up() helps to avoid the races > with fork() is useful. Say, we could add signal_wake_up() into syscall_regfunc() > and kill syscall_tracepoint_update(). Not that I think this particular change makes > any sense, but it can work. > > That is why I tried to sugest another approach. copy_process() should always fail > if signal_pending() == T, just the "real" signal should not disturb the forking > thread unless the signal is fatal or multi-process. So after seeing the report of periodic timers causing a 40ms fork to stretch into a 1000ms fork because of restarts, I am not a fan of cases where fork has to restart. 40ms is a lot of work to abandon. A practical (and fixable) problem with your patch was that you modified task->blocked which was then copied to the child. So all children now would start with all signals being blocked. > This also makes another difference in multi-threaded case, a signal with a handler > sent to a forking process will be re-targeted to another thread which can handle it; > with your patch this signal will be "blocked" until fork() finishes or until another > thread gets TIF_SIGPENDING. Not that I think this is that important, > but still. I would not object to wants_signal deciding that a task in the middle of copy_process does not want signals. Eric