From mboxrd@z Thu Jan 1 00:00:00 1970
From: Miklos Szeredi
Subject: Re: overlayfs access checks on underlying layers
Date: Wed, 28 Nov 2018 11:00:09 +0100
Message-ID:
References: <20181127210542.GA2599@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Return-path:
In-Reply-To: <20181127210542.GA2599@redhat.com>
Sender: linux-kernel-owner@vger.kernel.org
To: Vivek Goyal
Cc: Stephen Smalley, Ondrej Mosnacek, "J. Bruce Fields", Mark Salyzyn,
 Paul Moore, linux-kernel@vger.kernel.org, overlayfs,
 linux-fsdevel@vger.kernel.org, selinux@vger.kernel.org, Daniel J Walsh
List-Id: linux-unionfs@vger.kernel.org

On Tue, Nov 27, 2018 at 10:05 PM Vivek Goyal wrote:
>
> On Tue, Nov 27, 2018 at 08:58:06PM +0100, Miklos Szeredi wrote:
> > [resending with fixed email address for Paul Moore]
> >
> > Moving discussion from github[1] to here.
> >
> > To summarize: commit 007ea44892e6 ("ovl: relax permission checking on
> > underlying layers") was added in 4.20-rc1 to make overlayfs access
> > checks on underlying "real" filesystems more consistent. The
> > discussion leading up to this commit can be found at [2]. The commit
> > broke some selinux-testsuite cases, possibly indicating a security
> > hole opened by this commit.
> >
> > The model this patch tries to follow is that if "cp --preserve=all"
> > was allowed to the mounter from underlying layer to the overlay layer,
> > then operation is allowed. That means even if mounter's creds don't
> > provide permission to, for example, execute underlying file X, if
> > mounter's creds provide sufficient permission to perform "cp
> > --preserve=all X Y" and original creds allow execute on Y, then the
> > operation is allowed. This provides consistency in the face of
> > copy-ups. Consistency is only provided in sane setups, where mounter
> > has sufficient privileges to access both the lower and upper layers.
>
> [cc daniel walsh]
>
> I think current selinux testsuite tests are written keeping these
> rules in mind.
>
> 1. Check overlay inode creds in the context of task and underlying
>    inode creds (lower/upper), in the context of mounter.
>
> 2. For a lower inode, if said file is being copied up, then only
>    check MAY_READ on lower. This is equivalent to mounter creating
>    a copy of file and providing caller access to it (context mount).
>
> For the case of special devices, we do not copy up these. So should
> we continue to do check on lower inode in the context of mounter
> (instead of not doing any check on lower at all).

Hmm, I'm trying to understand the logic...

If we follow the "cp --preserve=all" thing, then mounter needs to have
CREATE permission for the special file, not READ or WRITE. Does that
make sense? Would that help with the context= mount use case?

>
> For being able to execute a file, should we at least check MAY_READ
> on lower.

Yep, that looks like a bug present from day one: MAY_EXEC doesn't
always imply MAY_READ, but to be able to execute a file, the kernel
must read it first, and if mounter doesn't have privilege to read the
file, then user should not be allowed to execute it.

> I am not sure why did we have to drop current checks on special file
> and execute. I will read through the thread you pointed out.

TL;DR: NFS access model is that creds are checked by server (and
cached in client), and server could be denying write access to a
device file to mounter (root) independently of DAC. In that case
write access by user to device file would be inconsistent (denied
before copy-up, allowed after copy-up). Same goes for execute.

And same goes for MAC: if it's denying READ/WRITE on device or denying
EXECUTE on readable file to mounter, and mounter can just copy that
device/file to a temporary location not controlled by that MAC, then
it can work around that restriction.

IOW, this is just a generalization of the rule that we ignore WRITE
access on lower layer, because a write will never reach the lower
layer.

Thanks,
Miklos
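
For illustration only, the lower-layer rule discussed in this mail can be
modelled roughly as below. This is a Python sketch of the reasoning, not
kernel code and not what commit 007ea44892e6 implements. The function
names (lower_mask_for_mounter, overlay_access_allowed) and the is_special
flag are hypothetical; only the MAY_* bit values match include/linux/fs.h.
It assumes the "cp --preserve=all" model: for a regular file on the lower
layer, any access by the task only requires the mounter to be able to
READ the lower file; for a special file the copy neither reads nor writes
the device, so nothing is checked on lower here (the CREATE check on the
upper layer is left out of the model).

```python
# Permission bits, same values as the kernel's MAY_* flags.
MAY_EXEC = 0x1
MAY_WRITE = 0x2
MAY_READ = 0x4


def lower_mask_for_mounter(mask: int, is_special: bool) -> int:
    """Mask to check on the lower inode with the mounter's creds.

    Hypothetical helper modelling the "cp --preserve=all" rule.
    """
    if is_special:
        # Copying a device node does not read or write the device itself.
        # The mounter would need CREATE on upper instead (not modelled).
        return 0
    if mask & (MAY_READ | MAY_WRITE | MAY_EXEC):
        # WRITE never reaches lower but triggers a copy-up, which reads.
        # EXEC requires the kernel to read the file first.
        return MAY_READ
    return 0


def overlay_access_allowed(mask: int, task_perms: int,
                           mounter_lower_perms: int,
                           is_special: bool = False) -> bool:
    """Two-step check: task on overlay inode, mounter on lower inode."""
    # Step 1: the task's own creds must allow the full mask on the
    # overlay inode.
    if mask & ~task_perms:
        return False
    # Step 2: the mounter's creds must allow the reduced mask on lower.
    needed = lower_mask_for_mounter(mask, is_special)
    return (needed & ~mounter_lower_perms) == 0


if __name__ == "__main__":
    # Task may execute, but mounter cannot read the lower file: denied.
    print(overlay_access_allowed(MAY_EXEC, MAY_EXEC, 0))
    # Task may write, mounter can only read lower: allowed (copy-up).
    print(overlay_access_allowed(MAY_WRITE, MAY_READ | MAY_WRITE, MAY_READ))
```

Under this model the result is the same before and after copy-up whenever
the mounter is able to perform the copy, which is the consistency property
argued for above.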