From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 20 Apr 2022 14:01:02 +0100
From: Catalin Marinas
To: Kees Cook
Cc: Andrew Morton, Christoph Hellwig, Lennart Poettering,
	Zbigniew Jędrzejewski-Szmek, Will Deacon, Alexander Viro,
	Eric Biederman, Szabolcs Nagy, Mark Brown, Jeremy Linton,
	Topi Miettinen, linux-mm@kvack.org,
	linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
	linux-abi-devel@lists.sourceforge.net,
	linux-hardening@vger.kernel.org, Jann Horn, Salvatore Mesoraca,
	Igor Zhbanov
Subject: Re: [PATCH RFC 0/4] mm, arm64: In-kernel support for
	memory-deny-write-execute (MDWE)
References: <20220413134946.2732468-1-catalin.marinas@arm.com>
	<202204141028.0482B08@keescook>
In-Reply-To: <202204141028.0482B08@keescook>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 14, 2022 at 11:52:17AM -0700, Kees Cook wrote:
> On Wed, Apr 13, 2022 at 02:49:42PM +0100, Catalin Marinas wrote:
> > The background to this is that systemd has a configuration option called
> > MemoryDenyWriteExecute [1], implemented as a SECCOMP BPF filter. Its aim
> > is to prevent a user task from inadvertently creating an executable
> > mapping that is (or was) writeable. Since such BPF filter is stateless,
> > it cannot detect mappings that were previously writeable but
> > subsequently changed to read-only. Therefore the filter simply rejects
> > any mprotect(PROT_EXEC). The side-effect is that on arm64 with BTI
> > support (Branch Target Identification), the dynamic loader cannot change
> > an ELF section from PROT_EXEC to PROT_EXEC|PROT_BTI using mprotect().
> > For libraries, it can resort to unmapping and re-mapping but for the
> > main executable it does not have a file descriptor. The original bug
> > report in the Red Hat bugzilla - [2] - and subsequent glibc workaround
> > for libraries - [3].
>
> Right, so, the systemd filter is a big hammer solution for the kernel
> not having a very easy way to provide W^X mapping protections to
> userspace. There's stuff in SELinux, and there have been several
> attempts[1] at other LSMs to do it too, but nothing stuck.
>
> Given the filter, and the implementation of how to enable BTI, I see two
> solutions:
>
> - provide a way to do W^X so systemd can implement the feature differently
> - provide a way to turn on BTI separate from mprotect to bypass the filter
>
> I would agree, the latter seems like the greater hack,

We discussed such hacks in the past but they are just working around the
fundamental issue - systemd wants W^X but with BPF it can only achieve
it by preventing mprotect(PROT_EXEC) irrespective of whether the mapping
was already executable.
If we find a better solution for W^X, we wouldn't have
to hack anything for mprotect(PROT_EXEC|PROT_BTI).

> so I welcome
> this RFC, though I think it might need to explore a bit of the feature
> space exposed by other solutions[1] (i.e. see SARA and NAX), otherwise
> it risks being too narrowly implemented. For example, playing well with
> JITs should be part of the design, and will likely need some kind of
> ELF flags and/or "sealing" mode, and to handle the vma alias case as
> Jann Horn pointed out[2].

I agree we should look at what we want to cover, though trying to avoid
re-inventing SELinux. With this patchset I went for the minimum that
systemd MDWE does with BPF.

I think JITs get around it using something like memfd with two separate
mappings to the same page. We could try to prevent such aliases but
allow it if an ELF note is detected (or get the JIT to issue a prctl()).

Anyway, with a prctl() we can allow finer-grained control starting with
anonymous and file mappings and later extending to vma aliases,
writeable files etc. On top we can add a seal mask so that a process
cannot disable a control that was set.
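
To make the memfd aliasing case mentioned above concrete, here is a
minimal sketch. It is not taken from any particular JIT: struct
jit_alias, jit_alias_create() and jit_alias_destroy() are hypothetical
names, and it assumes Linux with a libc that provides memfd_create()
(glibc 2.27 or later). One view is mapped read-write and the other
read-execute, so no single mapping is ever writeable and executable at
the same time, which is presumably why a check that looks at one vma at
a time would not notice it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* needed for memfd_create() */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Two views of the same memfd-backed pages (hypothetical helper type). */
struct jit_alias {
	void *rw;	/* PROT_READ | PROT_WRITE view, used to emit code */
	void *rx;	/* PROT_READ | PROT_EXEC view, used to run it */
	size_t len;
};

/* Returns 0 on success, -1 on failure with errno set by the failing call. */
static int jit_alias_create(struct jit_alias *a, size_t len)
{
	int fd = memfd_create("jit-alias", MFD_CLOEXEC);

	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len) < 0)
		goto err_close;

	/* MAP_SHARED so that both views refer to the same pages. */
	a->rw = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (a->rw == MAP_FAILED)
		goto err_close;

	a->rx = mmap(NULL, len, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
	if (a->rx == MAP_FAILED) {
		munmap(a->rw, len);
		goto err_close;
	}

	close(fd);	/* the mappings keep the memfd alive */
	a->len = len;
	return 0;

err_close:
	close(fd);
	return -1;
}

/* Drops both views. */
static void jit_alias_destroy(struct jit_alias *a)
{
	munmap(a->rw, a->len);
	munmap(a->rx, a->len);
}
```

Bytes stored through a.rw show up through a.rx at a different address. A
real JIT would also need instruction cache maintenance on arm64 (for
example __builtin___clear_cache()) before jumping into the rx view.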
Something like (I'm not good at names):

	prctl(PR_MDWX_SET, flags, seal_mask);
	prctl(PR_MDWX_GET);

with flags like:

	PR_MDWX_MMAP - basics, should cover mmap() and mprotect()
	PR_MDWX_ALIAS - vma aliases, allowed with an ELF note
	PR_MDWX_WRITEABLE_FILE (needs some more thinking)

-- 
Catalin
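
One possible reading of the seal semantics proposed above, as a sketch.
This is a userspace model only: it is neither kernel code nor a call to
a real prctl(). The bit values, struct mdwx_state, mdwx_set() and
mdwx_get() are all hypothetical (the proposal names the flags but
assigns no numbers), and the error codes are just one plausible choice.
The rule modelled is that once a flag is covered by the seal mask, later
calls cannot change that flag's value.

```c
#include <errno.h>

/* Hypothetical bit assignments for the flags named in the proposal. */
#define PR_MDWX_MMAP		(1UL << 0)
#define PR_MDWX_ALIAS		(1UL << 1)
#define PR_MDWX_WRITEABLE_FILE	(1UL << 2)
#define PR_MDWX_KNOWN	(PR_MDWX_MMAP | PR_MDWX_ALIAS | PR_MDWX_WRITEABLE_FILE)

/* Model of the per-process state a kernel might keep (an assumption). */
struct mdwx_state {
	unsigned long flags;	/* controls currently enabled */
	unsigned long sealed;	/* controls whose value is now frozen */
};

/*
 * Model of prctl(PR_MDWX_SET, flags, seal_mask).
 * Returns 0 on success or a negative errno value.
 */
static int mdwx_set(struct mdwx_state *s, unsigned long flags,
		    unsigned long seal_mask)
{
	/* Reject bits this model does not know about. */
	if ((flags | seal_mask) & ~PR_MDWX_KNOWN)
		return -EINVAL;

	/* A sealed bit must keep its current value, set or clear. */
	if ((flags ^ s->flags) & s->sealed)
		return -EPERM;

	s->flags = flags;
	s->sealed |= seal_mask;	/* seals accumulate and are never dropped */
	return 0;
}

/* Model of prctl(PR_MDWX_GET): report the enabled controls. */
static unsigned long mdwx_get(const struct mdwx_state *s)
{
	return s->flags;
}
```

With this model, a service manager could enable and seal PR_MDWX_MMAP
before exec while leaving PR_MDWX_ALIAS unsealed, so a JIT could still
toggle the alias control later without being able to switch the basic
mmap()/mprotect() check off.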