From: "Eric W. Biederman"
To: Tycho Andersen
Cc: Oleg Nesterov, "Serge E. Hallyn", Miklos Szeredi,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH] sched: __fatal_signal_pending() should also check PF_EXITING
Date: Fri, 29 Jul 2022 11:15:28 -0500
Message-ID: <87v8rfevz3.fsf@email.froward.int.ebiederm.org>
In-Reply-To: (Tycho Andersen's message of "Fri, 29 Jul 2022 07:50:34 -0600")
References: <20220721015459.GA4297@mail.hallyn.com>
    <20220727175538.GC18822@redhat.com>
    <20220727191949.GD18822@redhat.com>
    <20220728091220.GA11207@redhat.com>
    <87pmhofr1q.fsf@email.froward.int.ebiederm.org>

Tycho Andersen writes:

> On Fri, Jul 29, 2022 at 12:04:17AM -0500, Eric W. Biederman wrote:
>> Tycho Andersen writes:
>>
>> > On Thu, Jul 28, 2022 at 11:12:20AM +0200, Oleg Nesterov wrote:
>> >
>> >> Finally. if fuse_flush() wants __fatal_signal_pending() == T when the
>> >> caller exits, perhaps it can do it itself? Something like
>> >>
>> >>     if (current->flags & PF_EXITING) {
>> >>         spin_lock_irq(siglock);
>> >>         set_thread_flag(TIF_SIGPENDING);
>> >>         sigaddset(&current->pending.signal, SIGKILL);
>> >>         spin_unlock_irq(siglock);
>> >>     }
>> >>
>> >> Sure, this is ugly as hell. But perhaps this can serve as a workaround?
>> >
>> > or even just
>> >
>> >     if (current->flags & PF_EXITING)
>> >         return;
>> >
>> > since we don't have anyone to send the result of the flush to anyway.
>> > If we don't end up converging on a fix here, I'll just send that
>> > patch. Thanks for the suggestion.
>>
>> If that was limited to the case you care about that would be reasonable.
>>
>> That will have an effect on any time a process that opens files on a
>> fuse filesystem exits and depends upon the exit path to close its file
>> descriptors to the fuse filesystem.
>>
>>
>> I do see a plausible solution along those lines.
>>
>> In fuse_flush instead of using fuse_simple_request call an equivalent
>> function that when PF_EXITING is true skips calling request_wait_answer.
>> Or perhaps when PF_EXITING is set uses schedule_work to call the
>> request_wait_answer.
>
> I don't see why this is any different than what I proposed. It changes
> the semantics to flush happening out-of-order with task exit, instead
> of strictly before, which you point out might be a problem. What am I
> missing?

What you proposed skips the flush operation entirely, which means that
a fuse server that tracks opens and closes of a file descriptor will see
more opens than closes and will have a reference counting problem
(probably resulting in things not being freed).

Simply skipping the wait for the result from the fuse server means the
fuse server sees what it has always seen. The kernel simply won't block
until that result has been returned, which means the other file
descriptors can be closed.

For the specific case you are looking at, with the server being killed
and the server's file descriptors not yet being closed, the difference
does not matter.

In the ordinary case of a process exit closing file descriptors to a
fuse filesystem where the server continues to live and function, not
waiting for the response from the server simply winds up being an
optimization in exit.

The key part is that the fuse server continues to see the same traffic.
In particular, the open requests and the flush requests continue to
balance, so reference counting in the fuse server is not broken.

Eric
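
To make the distinction above concrete, here is a minimal userspace
sketch. It is a toy model under stated assumptions, not kernel or FUSE
code: every name in it (toy_server, toy_exit_policy, toy_exit, and so
on) is hypothetical and exists only for illustration. It counts the
open and flush requests a server would receive when a task with several
open files exits, under three policies: skipping the flush, sending the
flush and waiting for the answer, and sending the flush without waiting.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, not kernel code: what a FUSE server observes. */
struct toy_server {
	int opens;       /* open requests sent to the server */
	int flushes;     /* flush requests sent to the server */
	bool responsive; /* false models a server that never replies */
};

enum toy_exit_policy {
	TOY_SKIP_FLUSH,     /* return early when the task is exiting */
	TOY_FLUSH_AND_WAIT, /* send the flush and wait for the answer */
	TOY_FLUSH_NO_WAIT,  /* send the flush, do not wait for the answer */
};

struct toy_outcome {
	int unbalanced_opens; /* opens with no matching flush request */
	bool exit_blocked;    /* exiting task stuck waiting on a reply */
};

/* Open nfiles files, then close them all on exit under the given policy. */
static struct toy_outcome toy_exit(enum toy_exit_policy policy, int nfiles,
				   bool responsive)
{
	struct toy_server srv = { .responsive = responsive };
	struct toy_outcome out = { 0, false };
	int i;

	assert(nfiles >= 0);

	for (i = 0; i < nfiles; i++)
		srv.opens++;

	for (i = 0; i < nfiles; i++) {
		if (policy == TOY_SKIP_FLUSH)
			continue; /* the server never hears about the close */

		srv.flushes++; /* the request is still queued for the server */

		if (policy == TOY_FLUSH_AND_WAIT && !srv.responsive) {
			out.exit_blocked = true; /* no reply will ever arrive */
			break; /* remaining descriptors are never closed */
		}
	}

	out.unbalanced_opens = srv.opens - srv.flushes;
	return out;
}
```

In this model, toy_exit(TOY_SKIP_FLUSH, 3, true) leaves three opens
without a matching flush, which is the reference counting problem
described above. toy_exit(TOY_FLUSH_NO_WAIT, 3, ...) keeps opens and
flushes balanced and never blocks, whether or not the server replies,
while toy_exit(TOY_FLUSH_AND_WAIT, 3, false) blocks on the first
descriptor and never reaches the others.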