Date: Tue, 13 Oct 2020 10:16:40 +0100
From: Catalin Marinas
To: Khalid Aziz
Subject: Re: [PATCH 1/2] mm/mprotect: Call arch_validate_prot under mmap_lock and with length
Message-ID: <20201013091638.GA10778@gaia>
References: <20201007073932.865218-1-jannh@google.com>
 <20201010110949.GA32545@gaia>
 <20201012172218.GE6493@gaia>
 <20c85633-b559-c299-3e57-ae136b201526@oracle.com>
In-Reply-To: <20c85633-b559-c299-3e57-ae136b201526@oracle.com>
User-Agent: Mutt/1.10.1 (2018-07-13)
Cc: Jann Horn, Michael Ellerman, linuxppc-dev@lists.ozlabs.org,
 linux-kernel@vger.kernel.org, Christoph Hellwig, linux-mm@kvack.org,
 Paul Mackerras, Benjamin Herrenschmidt, sparclinux@vger.kernel.org,
 Anthony Yznaga, Andrew Morton, Will Deacon,
 "David S. Miller", linux-arm-kernel@lists.infradead.org

On Mon, Oct 12, 2020 at 01:14:50PM -0600, Khalid Aziz wrote:
> On 10/12/20 11:22 AM, Catalin Marinas wrote:
> > On Mon, Oct 12, 2020 at 11:03:33AM -0600, Khalid Aziz wrote:
> >> On 10/10/20 5:09 AM, Catalin Marinas wrote:
> >>> On Wed, Oct 07, 2020 at 02:14:09PM -0600, Khalid Aziz wrote:
> >>>> On 10/7/20 1:39 AM, Jann Horn wrote:
> >>>>> arch_validate_prot() is a hook that can validate whether a given set of
> >>>>> protection flags is valid in an mprotect() operation. It is given the set
> >>>>> of protection flags and the address being modified.
> >>>>>
> >>>>> However, the address being modified can currently not actually be used in
> >>>>> a meaningful way because:
> >>>>>
> >>>>> 1. Only the address is given, but not the length, and the operation can
> >>>>>    span multiple VMAs. Therefore, the callee can't actually tell which
> >>>>>    virtual address range, or which VMAs, are being targeted.
> >>>>> 2. The mmap_lock is not held, meaning that if the callee were to check
> >>>>>    the VMA at @addr, that VMA would be unrelated to the one the
> >>>>>    operation is performed on.
> >>>>>
> >>>>> Currently, custom arch_validate_prot() handlers are defined by
> >>>>> arm64, powerpc and sparc.
> >>>>> arm64 and powerpc don't care about the address range, they just check the
> >>>>> flags against CPU support masks.
> >>>>> sparc's arch_validate_prot() attempts to look at the VMA, but doesn't take
> >>>>> the mmap_lock.
> >>>>>
> >>>>> Change the function signature to also take a length, and move the
> >>>>> arch_validate_prot() call in mm/mprotect.c down into the locked region.
> >>> [...]
> >>>> As Chris pointed out, the call to arch_validate_prot() from do_mmap2()
> >>>> is made without holding mmap_lock.
> >>>> Lock is not acquired until
> >>>> vm_mmap_pgoff(). This variance is uncomfortable, but I am more
> >>>> uncomfortable forcing all implementations of validate_prot to require
> >>>> mmap_lock be held when non-sparc implementations do not have such a
> >>>> need yet. Since do_mmap2() is in powerpc-specific code, for now this
> >>>> patch solves a current problem.
> >>>
> >>> I still think sparc should avoid walking the vmas in
> >>> arch_validate_prot(). The core code already has the vmas, though not
> >>> when calling arch_validate_prot(). That's one of the reasons I added
> >>> arch_validate_flags() with the MTE patches. For sparc, this could be
> >>> (untested, just copied the arch_validate_prot() code):
> >>
> >> I am a little uncomfortable with the idea of validating protection bits
> >> inside the VMA walk loop in do_mprotect_pkey(). When ADI is being
> >> enabled across multiple VMAs and arch_validate_flags() fails on a VMA
> >> later, do_mprotect_pkey() will bail out with an error, leaving ADI
> >> enabled on earlier VMAs. This will apply to protection bits other than
> >> ADI as well, of course. This becomes a partial failure of the
> >> mprotect() call. I think it should be all or nothing with mprotect():
> >> when one calls mprotect() from userspace, either the entire address
> >> range passed in gets its protection bits updated or none of it does.
> >> That requires validating protection bits upfront, or undoing what
> >> earlier iterations of the VMA walk loop might have done.
> >
> > I thought the same initially, but mprotect() already does this with the
> > VM_MAY* flag checking. If you ask it for an mprotect() that crosses
> > multiple vmas and one of them fails, it doesn't roll back the changes
> > to the prior ones. I considered that a similar approach is fine for MTE
> > (it's most likely a user error).
>
> You are right about the current behavior with VM_MAY* flags, but that is
> not the right behavior.
> Adding more cases to this just perpetuates
> incorrect behavior. It is not easy to roll back changes after VMAs have
> potentially been split/merged, which is probably why the current code
> simply throws in the towel and returns with a partially modified address
> space. It is a lot easier to do all the checks upfront and then proceed
> or not proceed with modifying VMAs. One approach might be to call
> arch_validate_flags() in a loop before modifying VMAs, walking all VMAs
> with a read lock held. The current code also bails out with ENOMEM if it
> finds a hole in the address range and leaves any modifications already
> made in place. This is another case where a hole could have been
> detected earlier.

This would indeed be ideal, though with the risk of breaking the current
ABI (FWIW, FreeBSD seems to do a first pass to check for violations:
https://github.com/freebsd/freebsd/blob/master/sys/vm/vm_map.c#L2630).
However, I'm not sure it's worth the hassle. Do we expect the user to
call mprotect() across multiple mixed-type mappings while relying on no
change being made if an error is returned? We should probably at least
document the current behaviour in the mprotect man page.

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel