Date: Wed, 4 Nov 2020 12:18:56 +0000
From: Dave Martin
To: Catalin Marinas
Cc: Mark Rutland, systemd-devel@lists.freedesktop.org, Kees Cook,
    Will Deacon, linux-kernel@vger.kernel.org, Jeremy Linton, Mark Brown,
    toiwoton@gmail.com, libc-alpha@sourceware.org,
    linux-arm-kernel@lists.infradead.org
Subject: Re: BTI interaction between seccomp filters in systemd and glibc
    mprotect calls, causing service failures
Message-ID: <20201104121855.GQ6882@arm.com>
In-Reply-To: <20201029110220.GC10776@gaia>

On Thu, Oct 29, 2020 at 11:02:22AM +0000, Catalin Marinas via Libc-alpha wrote:
> On Tue, Oct 27, 2020 at 02:15:22PM +0000, Dave P Martin wrote:
> > I also wonder whether we actually care whether the pages are marked
> > executable or not here; probably the flags can just be independent. This
> > rather depends on how the architecture treats the BTI (a.k.a.
> > GP) pagetable bit for non-executable pages. I have a feeling we already
> > allow PROT_BTI && !PROT_EXEC through anyway.
> >
> >
> > What about a generic-ish set/clear interface that still works by just
> > adding a couple of PROT_ flags:
> >
> >     switch (flags & (PROT_SET | PROT_CLEAR)) {
> >     case PROT_SET: prot |= flags; break;
> >     case PROT_CLEAR: prot &= ~flags; break;
> >     case 0: prot = flags; break;
> >
> >     default:
> >         return -EINVAL;
> >     }
> >
> > This can't atomically set some flags while clearing some others, but for
> > simple stuff it seems sufficient and shouldn't be too invasive on the
> > kernel side.
> >
> > We will still have to take the mm lock when doing a SET or CLEAR, but
> > not for the non-set/clear case.
> >
> >
> > Anyway, libc could now do:
> >
> >     mprotect(addr, len, PROT_SET | PROT_BTI);
> >
> > with much the same effect as your PROT_BTI_IF_X.
> >
> >
> > JITting or breakpoint setting code that wants to change the permissions
> > temporarily, without needing to know whether PROT_BTI is set, say:
> >
> >     mprotect(addr, len, PROT_SET | PROT_WRITE);
> >     *addr = BKPT_INSN;
> >     mprotect(addr, len, PROT_CLEAR | PROT_WRITE);
>
> The problem with this approach is that you can't catch
> PROT_EXEC|PROT_WRITE mappings via seccomp.
> So you'd have to limit it to
> some harmless PROT_ flags only. I don't like this limitation, nor the
> PROT_BTI_IF_X approach.

Ack; this is just one flavour of interface, and every approach seems to
have some shortcomings.

> The only generic solutions I see are to either use a stateful filter in
> systemd or pass the old state to the kernel in a cmpxchg style so that
> seccomp can check it (I think you suggest this at some point).

The "cmpxchg" option has the disadvantage that the caller needs to know
the original permissions. It seems that glibc is prepared to work
around this, but it won't always be feasible in ancillary /
instrumentation code or libraries.

IMHO it would be preferable to apply a policy to mmap/mprotect in the
kernel proper rather than BPF being the only way to do it -- in any
case, the required checks seem to be out of the scope of what can be
done efficiently (or perhaps at all) in a syscall filter.

> The latter requires a new syscall which is not something we can address
> as a quick, back-portable fix here. If systemd cannot be changed to use
> a stateful filter for w^x detection, my suggestion is to go for the
> kernel setting PROT_BTI on the main executable with glibc changed to
> tolerate EPERM on mprotect(). I don't mind adding an AT_FLAGS bit if
> needed but I don't think it buys us much.

I agree, this seems the best short-term approach.

> Once the current problem is fixed, we can look at a better solution
> longer term as a new syscall.

Agreed, I think if we try to rush the addition of new syscalls, the
chance of coming up with a bad design is high...

Cheers
---Dave
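
For illustration only, the set/clear semantics quoted above, and the
reason a stateless seccomp filter cannot enforce W^X on top of them, can
be modelled in plain userspace C. This is a rough sketch under several
assumptions: PROT_SET and PROT_CLEAR are hypothetical flags that exist in
no kernel, their values here are arbitrary, and the helper names
apply_prot(), arg_is_wx() and result_is_wx() are made up for this model.
Unlike the quoted pseudocode, the model masks the two control bits out of
the stored protection, which is a guess at what a real implementation
would want.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/mman.h>   /* PROT_READ, PROT_WRITE, PROT_EXEC */

/* Hypothetical control bits: NOT part of any kernel ABI.
 * The values are arbitrary and only meaningful inside this model. */
#define PROT_SET   0x10000000
#define PROT_CLEAR 0x20000000
#define PROT_CTRL  (PROT_SET | PROT_CLEAR)

/* Model of the proposed semantics: take the current protection of a
 * mapping and the mprotect() argument, return the new protection, or
 * -EINVAL if both control bits are given at once. */
int apply_prot(int old_prot, int flags)
{
    int bits = flags & ~PROT_CTRL;   /* assumption: control bits are not stored */

    switch (flags & PROT_CTRL) {
    case PROT_SET:   return old_prot | bits;
    case PROT_CLEAR: return old_prot & ~bits;
    case 0:          return bits;    /* ordinary mprotect(): replace outright */
    default:         return -EINVAL;
    }
}

/* All a stateless syscall filter can inspect: the argument itself. */
bool arg_is_wx(int flags)
{
    return (flags & PROT_WRITE) && (flags & PROT_EXEC);
}

/* What a W^X policy actually needs to know: the resulting protection,
 * which depends on the old state the filter never sees. */
bool result_is_wx(int old_prot, int flags)
{
    int new_prot = apply_prot(old_prot, flags);

    return new_prot >= 0 &&
           (new_prot & PROT_WRITE) && (new_prot & PROT_EXEC);
}
```

With an existing PROT_READ | PROT_EXEC mapping, the argument
PROT_SET | PROT_WRITE does not look like W+X to arg_is_wx(), yet
result_is_wx() reports that the mapping ends up writable and executable.
That gap is the limitation discussed above, and it is why the check needs
either the old state (stateful filter, cmpxchg-style call) or a policy in
the kernel proper.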