From: ebiederm@xmission.com (Eric W. Biederman)
To: Linus Torvalds
Cc: Al Viro, Michael Schmitz, linux-arch, Jens Axboe, Oleg Nesterov,
    Linux Kernel Mailing List, Richard Henderson, Ivan Kokshaysky,
    Matt Turner, alpha, Geert Uytterhoeven, linux-m68k, Arnd Bergmann,
    Ley Foon Tan, Tejun Heo, Kees Cook
Subject: [PATCH 0/9] Refactoring exit
Date: Thu, 24 Jun 2021 13:57:35 -0500
Message-ID: <875yy3850g.fsf_-_@disp2133>
In-Reply-To: <87zgvgabw1.fsf@disp2133> (Eric W. Biederman's message of "Wed, 23 Jun 2021 09:33:50 -0500")
References: <87sg1lwhvm.fsf@disp2133> <6e47eff8-d0a4-8390-1222-e975bfbf3a65@gmail.com> <924ec53c-2fd9-2e1c-bbb1-3fda49809be4@gmail.com> <87eed4v2dc.fsf@disp2133> <5929e116-fa61-b211-342a-c706dcb834ca@gmail.com> <87fsxjorgs.fsf@disp2133> <87a6njf0ia.fsf@disp2133> <87tulpbp19.fsf@disp2133> <87zgvgabw1.fsf@disp2133>
List-ID: linux-kernel@vger.kernel.org

I dug into exit because PTRACE_EVENT_EXIT is not guaranteed to be
called with a stack where ptrace can read and write all of the
userspace registers, which can lead to unfiltered reads and writes of
kernel stack contents.

While looking into it I realized that there are a lot of little races
between all of the ways an exit can be initiated. I don't know of a
way those races are harmful, but they make the code difficult to
reason about.
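
For readers unfamiliar with the stop in question, here is a minimal
userspace sketch (not part of this series) of a tracer that requests
PTRACE_O_TRACEEXIT, waits for the PTRACE_EVENT_EXIT stop, and reads the
tracee's registers there. The function name observe_exit_event and the
1024-byte register buffer are assumptions made for illustration; the
ptrace requests themselves are the ones documented in ptrace(2).

```c
#include <assert.h>
#include <elf.h>         /* NT_PRSTATUS */
#include <signal.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>     /* struct iovec */
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a tracee that calls _exit(exit_code), stop
 * it at PTRACE_EVENT_EXIT, and read its general purpose registers.
 * On success returns 0, stores the event message (the tracee's exit
 * status in wait(2) encoding) in *event_msg, and the number of register
 * bytes the kernel returned in *regs_len. Returns -1 on failure.
 */
int observe_exit_event(int exit_code, unsigned long *event_msg,
                       size_t *regs_len)
{
    uint64_t regs[128];  /* assumed large enough for NT_PRSTATUS */
    struct iovec iov = { .iov_base = regs, .iov_len = sizeof(regs) };
    int status;
    pid_t pid;

    assert(event_msg && regs_len);

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Tracee: ask to be traced, then stop so the parent can set options. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);
        _exit(exit_code);
    }

    /* Wait for the initial SIGSTOP; if the child is already gone, give up. */
    if (waitpid(pid, &status, 0) != pid || !WIFSTOPPED(status))
        return -1;

    /* Request a stop in the exit path, then let the tracee run. */
    if (ptrace(PTRACE_SETOPTIONS, pid, NULL,
               (void *)(uintptr_t)PTRACE_O_TRACEEXIT) == -1)
        goto fail;
    if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1)
        goto fail;

    if (waitpid(pid, &status, 0) != pid)
        goto fail;
    if (!WIFSTOPPED(status))
        return -1;       /* tracee exited without reporting the event */
    if ((status >> 8) != (SIGTRAP | (PTRACE_EVENT_EXIT << 8)))
        goto fail;

    /* At this stop the tracer may inspect the exit status and registers. */
    if (ptrace(PTRACE_GETEVENTMSG, pid, NULL, event_msg) == -1)
        goto fail;
    if (ptrace(PTRACE_GETREGSET, pid, (void *)(uintptr_t)NT_PRSTATUS,
               &iov) == -1)
        goto fail;
    *regs_len = iov.iov_len;

    /* Let the tracee finish exiting and reap it. */
    if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1)
        goto fail;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return 0;

fail:
    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);
    return -1;
}
```

Calling observe_exit_event(42, &msg, &len) from a small main() and
printing WEXITSTATUS((int)msg) and len shows what a tracer can see at
the exit stop. The register read in this sketch is the kind of access
that is only safe when the kernel reaches the stop with every userspace
register saved.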

The solution this set of changes adopts is to implement good
primitives for asynchronous exit and exit_group requests, and to
modify exit(2) and exit_group(2) to use those primitives. The result
should be more consistent determination of the reason for an exit, as
well as PTRACE_EVENT_EXIT always being called from a context
(get_signal) where ptrace is guaranteed to be able to read and write
all of the registers.

I believe the set of changes could be justified for the cleanups alone
even if PTRACE_EVENT_EXIT did not need to be moved, which makes me
feel good about this approach.

If a way can be found for coredumps to be started from complete_signal
(needed for timely handling of fatal signals) instead of needing to
start in do_coredump for proper synchronization, force_siginfo_to_task
and get_signal can be significantly simplified. As it is, a lot of
checks are duplicated to ensure that everything works properly in the
presence of do_coredump.

So far the code has been lightly tested, and the descriptions of some
of the patches are a bit light, but I think this shows the direction I
am aiming to travel for sorting out exit(2) and exit_group(2).

Eric W. Biederman (9):
  signal/sh: Use force_sig(SIGKILL) instead of do_group_exit(SIGKILL)
  signal/seccomp: Refactor seccomp signal and coredump generation
  signal/seccomp: Dump core when there is only one live thread
  signal: Factor start_group_exit out of complete_signal
  signal/group_exit: Use start_group_exit in place of do_group_exit
  signal: Fold do_group_exit into get_signal fixing io_uring threads
  signal: Make individual tasks exiting a first class concept.
  signal/task_exit: Use start_task_exit in place of do_exit
  signal: Move PTRACE_EVENT_EXIT into get_signal

 arch/sh/kernel/cpu/fpu.c     |  10 +--
 fs/exec.c                    |  10 ++-
 include/linux/sched/jobctl.h |   2 +
 include/linux/sched/signal.h |   5 ++
 include/linux/sched/task.h   |   1 -
 kernel/exit.c                |  41 ++---------
 kernel/seccomp.c             |  45 +++---------
 kernel/signal.c              | 166 ++++++++++++++++++++++++++++++-------------
 8 files changed, 154 insertions(+), 126 deletions(-)