Message-ID: <1575384015.3435.16.camel@HansenPartnership.com>
Subject: Re: [PATCH 1/2] fs: introduce uid/gid shifting bind mount
From: James Bottomley
To: Amir Goldstein
Cc: linux-fsdevel, David Howells, Al Viro, Miklos Szeredi, Seth Forshee, "Eric W. Biederman"
Date: Tue, 03 Dec 2019 06:40:15 -0800
References: <1575335637.24227.26.camel@HansenPartnership.com> <1575335700.24227.27.camel@HansenPartnership.com> <1575349974.31937.11.camel@HansenPartnership.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

On Tue, 2019-12-03 at 08:55 +0200, Amir Goldstein wrote:
> On Tue, Dec 3, 2019 at 7:12 AM James Bottomley wrote:
> >
> > On Tue, 2019-12-03 at 06:51 +0200, Amir Goldstein wrote:
> > > [cc: ebiederman]
[...]
> > > 4. This is currently not overlayfs (stacked fs) nor nfsd
> > > friendly. Those modules do not call the path based vfs APIs, but
> > > they do have the mnt stored internally.
> >
> > OK, so I've got to confess that I've only tested it with my
> > container use case, which doesn't involve overlay or nfs. However,
> > as long as we thread path down to the API that nfsd and overlayfs
> > use, it should easily be made compatible with them ... do we have
> > any documentation of what API this is?
>
> No proper doc AFAIK, but please take a look at:
> https://lore.kernel.org/linux-fsdevel/20191025112917.22518-2-mszeredi@redhat.com/
> It is part of a series to make overlayfs an FS_USERNS_MOUNT.
>
> The simplest case goes typically something like this:
> rmdir -> do_rmdir -(change_userns_creds)-> vfs_rmdir ->
>   ovl_rmdir -(ovl_override_creds)-> vfs_rmdir -> ext4_rmdir

Yes, I figured it would mostly be the vfs_ functions.

> So if you shift mounted the overlayfs mount, you won't end up
> using shifted creds in ext4 operations.
> And if you shift mounted ext4 *before* creating the overlay, then
> still, overlay doesn't go through do_rmdir, so your method won't
> work either.

So I think the upper use case (shift above overlay) is fairly easily
solvable: it involves making ovl_override_creds shift aware, so that
when it does the override it keeps the shift. This might involve
stashing the overlay creds where the shift ones are in the task
structure so cred_is_shifted() still works.

The lower use case is more problematic because that would involve
changing most of the vfs_ API.

I think we can take a phased approach:

   1. Get agreement for the approach using the unstacked case (current
      patch effectively)
   2. Make the upper case work because it's the low hanging fruit; I can
      start looking at this (although I'll have to figure out how to get
      overlayfs working first).
   3. Investigate the lower case if there's an actual use.

> Similar situation with nfsd, although I have no idea if there are
> plans to make nfsd userns aware.

It's a similar upper and lower issue, although upper just involves
playing nicely with the name remapping.
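To make the upper case concrete, here is a toy userspace model in Python of the rmdir chain quoted above. It is only a sketch, not kernel code and not taken from the patch: the function names mirror the ones mentioned in the thread, but their bodies, the Cred fields, the SHIFT offset and the shift_aware flag are all assumptions made up for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cred:
    fsuid: int
    shifted: bool = False  # hypothetical marker: "these creds carry a shift"


SHIFT = 100000  # assumed container uid offset, for illustration only


def change_userns_creds(cred: Cred) -> Cred:
    """Toy stand-in for the shift applied at the path-based entry (do_rmdir)."""
    return replace(cred, fsuid=cred.fsuid - SHIFT, shifted=True)


def cred_is_shifted(cred: Cred) -> bool:
    return cred.shifted


def ovl_override_creds(caller: Cred, creator: Cred, shift_aware: bool = False) -> Cred:
    """Toy stand-in for overlayfs switching to its creator's creds."""
    if shift_aware and cred_is_shifted(caller):
        # Assumed behaviour of a shift-aware override: keep the marker.
        return replace(creator, shifted=True)
    return creator  # plain override: the caller's shift is dropped


def rmdir_via_overlay(task: Cred, creator: Cred, shift_aware: bool = False):
    """rmdir -> do_rmdir -> vfs_rmdir -> ovl_rmdir -> vfs_rmdir -> ext4_rmdir.

    Returns the creds seen by overlayfs and by the lower filesystem.
    """
    seen_by_overlay = change_userns_creds(task)
    seen_by_lower = ovl_override_creds(seen_by_overlay, creator, shift_aware)
    return seen_by_overlay, seen_by_lower
```

With the plain override the creds reaching the lower filesystem no longer report as shifted; with shift_aware=True the marker survives the override, which is the property the cred_is_shifted() remark above is after.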
> > > I suppose you do want to be able to mount overlays and export nfs
> > > out of those shifted mounts, as they are merely the foundation
> > > for unprivileged container storage stack. right?
> >
> > If the plan of doing this as a bind mount holds, then certainly
> > because any underlying filesystem has to work with it.
> >
>
> I am talking above, not under.

Hopefully I addressed that above. I think above is easier and should be
the first target, but making this work completely will eventually need
the under case as well.

> You shift mount an ext4 fs and hand it over to container fake root
> (or mark it and let fake root shift mount).
> The container fake root should be able to (after overlayfs unpriv
> changes) create an overlay from inside container.
> IOW, try to mount an overlay over your shifted fs and see how it
> behaves.
>
> > > For overlayfs, you should at least look at ovl_override_creds()
> > > for incorporating shift mount logic - or more likely at the
> > > creation of ofs->creator_cred.
> >
> > Well, we had this discussion when I proposed shiftfs as a
> > superblock based stackable filesystem, I think: the way the shift
> > needs to use creds is fundamentally different from the way
> > overlayfs uses them. The ovl_override_creds is overriding with the
> > creator's creds but the shifting bind mount needs to backshift
> > through the user namespace currently in effect. Since uid shifts
> > can stack, we can make them work together, but they are
> > fundamentally different things.
> >
>
> Right.
> Please take a look at the override_cred code in ovl_create_or_link().
> This code has some fsuid dance that you need to check for shift
> friendliness.

Certainly, I've added it to my todo list.

> The entire security model of overlayfs needs to be reexamined in the
> face of shift mount, but as I wrote, I don't think it's going to be
> too hard to make ovl_override_creds() shift mount aware.
> Overlayfs mimics vfs behavior in many cases.

Agreed.
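As a rough illustration of the "backshift through the user namespace currently in effect" point quoted earlier, and of why shifts can stack, the sketch below maps ids through uid_map-style extents (inside id, outside id, count), the three-field format described in user_namespaces(7). It is a toy in Python, not the patch's implementation; Extent, backshift and backshift_stacked are hypothetical names, and reading the backshift as exactly this mapping is an assumption.

```python
from typing import NamedTuple, Optional, Sequence


class Extent(NamedTuple):
    inside: int   # first id as seen inside the namespace
    outside: int  # first id as seen from the parent namespace
    count: int    # number of consecutive ids covered


def backshift(extents: Sequence[Extent], outer_id: int) -> Optional[int]:
    """Map an id seen from outside to the id seen inside, or None if unmapped."""
    for e in extents:
        if e.outside <= outer_id < e.outside + e.count:
            return e.inside + (outer_id - e.outside)
    return None


def backshift_stacked(maps: Sequence[Sequence[Extent]], kuid: int) -> Optional[int]:
    """Backshift through nested namespaces, outermost map first."""
    current: Optional[int] = kuid
    for extents in maps:
        if current is None:
            return None  # unmapped at an outer level, nothing left to shift
        current = backshift(extents, current)
    return current
```

Unlike an override, which substitutes a fixed set of creds regardless of the caller, the result here depends on the caller's id and on the maps in effect, and applying two maps in sequence is just function composition.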
> Unless you shift mount both overlayfs and underlying (say) ext4, then
> you still have only one mnt_cred to cache in any given call stack.

Heh, well, the double shift case will be the stress test of getting 2.
and 3. working right.

James