From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.2 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EB722C433E0 for ; Thu, 28 Jan 2021 17:08:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9E53A64E0E for ; Thu, 28 Jan 2021 17:08:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232620AbhA1RH4 (ORCPT ); Thu, 28 Jan 2021 12:07:56 -0500 Received: from mail.hallyn.com ([178.63.66.53]:40000 "EHLO mail.hallyn.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232835AbhA1RGR (ORCPT ); Thu, 28 Jan 2021 12:06:17 -0500 X-Greylist: delayed 389 seconds by postgrey-1.27 at vger.kernel.org; Thu, 28 Jan 2021 12:06:16 EST Received: by mail.hallyn.com (Postfix, from userid 1001) id 5334E11D4; Thu, 28 Jan 2021 10:58:52 -0600 (CST) Date: Thu, 28 Jan 2021 10:58:52 -0600 From: "Serge E. Hallyn" To: "Eric W. Biederman" Cc: Miklos Szeredi , linux-fsdevel@vger.kernel.org, linux-unionfs@vger.kernel.org, linux-security-module@vger.kernel.org, linux-kernel@vger.kernel.org, "Serge E . Hallyn" , Christian Brauner Subject: Re: [PATCH 2/2] security.capability: fix conversions on getxattr Message-ID: <20210128165852.GA20974@mail.hallyn.com> References: <20210119162204.2081137-1-mszeredi@redhat.com> <20210119162204.2081137-3-mszeredi@redhat.com> <8735yw8k7a.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <8735yw8k7a.fsf@x220.int.ebiederm.org> User-Agent: Mutt/1.9.4 (2018-02-28) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jan 19, 2021 at 07:34:49PM -0600, Eric W. Biederman wrote: > Miklos Szeredi writes: > > > If a capability is stored on disk in v2 format cap_inode_getsecurity() will > > currently return in v2 format unconditionally. > > > > This is wrong: v2 cap should be equivalent to a v3 cap with zero rootid, > > and so the same conversions performed on it. > > > > If the rootid cannot be mapped v3 is returned unconverted. Fix this so > > that both v2 and v3 return -EOVERFLOW if the rootid (or the owner of the fs > > user namespace in case of v2) cannot be mapped in the current user > > namespace. > > This looks like a good cleanup. Sorry, I'm not following. Why is this a good cleanup? Why should the xattr be shown as faked v3 in this case? A separate question below. > I do wonder how well this works with stacking. In particular > ovl_xattr_set appears to call vfs_getxattr without overriding the creds. > What the purpose of that is I haven't quite figured out. It looks like > it is just a probe to see if an xattr is present so maybe it is ok. > > Acked-by: "Eric W. Biederman" > > > > > Signed-off-by: Miklos Szeredi > > --- > > security/commoncap.c | 67 ++++++++++++++++++++++++++++---------------- > > 1 file changed, 43 insertions(+), 24 deletions(-) > > > > diff --git a/security/commoncap.c b/security/commoncap.c > > index bacc1111d871..c9d99f8f4c82 100644 > > --- a/security/commoncap.c > > +++ b/security/commoncap.c > > @@ -371,10 +371,11 @@ int cap_inode_getsecurity(struct inode *inode, const char *name, void **buffer, > > { > > int size, ret; > > kuid_t kroot; > > + __le32 nsmagic, magic; > > uid_t root, mappedroot; > > char *tmpbuf = NULL; > > struct vfs_cap_data *cap; > > - struct vfs_ns_cap_data *nscap; > > + struct vfs_ns_cap_data *nscap = NULL; > > struct dentry *dentry; > > struct user_namespace *fs_ns; > > > > @@ -396,46 +397,61 @@ int cap_inode_getsecurity(struct inode *inode, const char *name, void **buffer, > > fs_ns = inode->i_sb->s_user_ns; > > cap = (struct vfs_cap_data *) tmpbuf; > > if (is_v2header((size_t) ret, cap)) { > > - /* If this is sizeof(vfs_cap_data) then we're ok with the > > - * on-disk value, so return that. */ > > - if (alloc) > > - *buffer = tmpbuf; > > - else > > - kfree(tmpbuf); > > - return ret; > > - } else if (!is_v3header((size_t) ret, cap)) { > > - kfree(tmpbuf); > > - return -EINVAL; > > + root = 0; > > + } else if (is_v3header((size_t) ret, cap)) { > > + nscap = (struct vfs_ns_cap_data *) tmpbuf; > > + root = le32_to_cpu(nscap->rootid); > > + } else { > > + size = -EINVAL; > > + goto out_free; > > } > > > > - nscap = (struct vfs_ns_cap_data *) tmpbuf; > > - root = le32_to_cpu(nscap->rootid); > > kroot = make_kuid(fs_ns, root); > > > > /* If the root kuid maps to a valid uid in current ns, then return > > * this as a nscap. */ > > mappedroot = from_kuid(current_user_ns(), kroot); > > if (mappedroot != (uid_t)-1 && mappedroot != (uid_t)0) { > > + size = sizeof(struct vfs_ns_cap_data); > > if (alloc) { > > - *buffer = tmpbuf; > > + if (!nscap) { > > + /* v2 -> v3 conversion */ > > + nscap = kzalloc(size, GFP_ATOMIC); > > + if (!nscap) { > > + size = -ENOMEM; > > + goto out_free; > > + } > > + nsmagic = VFS_CAP_REVISION_3; > > + magic = le32_to_cpu(cap->magic_etc); > > + if (magic & VFS_CAP_FLAGS_EFFECTIVE) > > + nsmagic |= VFS_CAP_FLAGS_EFFECTIVE; > > + memcpy(&nscap->data, &cap->data, sizeof(__le32) * 2 * VFS_CAP_U32); > > + nscap->magic_etc = cpu_to_le32(nsmagic); > > + } else { > > + /* use allocated v3 buffer */ > > + tmpbuf = NULL; > > + } > > nscap->rootid = cpu_to_le32(mappedroot); > > - } else > > - kfree(tmpbuf); > > - return size; > > + *buffer = nscap; > > + } > > + goto out_free; > > } > > > > if (!rootid_owns_currentns(kroot)) { > > - kfree(tmpbuf); > > - return -EOPNOTSUPP; > > + size = -EOVERFLOW; Why this change? Christian (cc:d) noticed that this is a user visible change. Without this change, if you are in a userns which has different rootid, the EOVERFLOW tells vfs_getxattr to vall back to __vfs_getxattr() and so you can see the v3 capability with its rootid. With this change, you instead just get EOVERFLOW. > > + goto out_free; > > } > > > > /* This comes from a parent namespace. Return as a v2 capability */ > > size = sizeof(struct vfs_cap_data); > > if (alloc) { > > - *buffer = kmalloc(size, GFP_ATOMIC); > > - if (*buffer) { > > - struct vfs_cap_data *cap = *buffer; > > - __le32 nsmagic, magic; > > + if (nscap) { > > + /* v3 -> v2 conversion */ > > + cap = kzalloc(size, GFP_ATOMIC); > > + if (!cap) { > > + size = -ENOMEM; > > + goto out_free; > > + } > > magic = VFS_CAP_REVISION_2; > > nsmagic = le32_to_cpu(nscap->magic_etc); > > if (nsmagic & VFS_CAP_FLAGS_EFFECTIVE) > > @@ -443,9 +459,12 @@ int cap_inode_getsecurity(struct inode *inode, const char *name, void **buffer, > > memcpy(&cap->data, &nscap->data, sizeof(__le32) * 2 * VFS_CAP_U32); > > cap->magic_etc = cpu_to_le32(magic); > > } else { > > - size = -ENOMEM; > > + /* use unconverted v2 */ > > + tmpbuf = NULL; > > } > > + *buffer = cap; > > } > > +out_free: > > kfree(tmpbuf); > > return size; > > }