From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754301AbaESN1O (ORCPT ); Mon, 19 May 2014 09:27:14 -0400
Received: from mail-ob0-f169.google.com ([209.85.214.169]:54844 "EHLO
	mail-ob0-f169.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752085AbaESN1M (ORCPT ); Mon, 19 May 2014 09:27:12 -0400
Date: Mon, 19 May 2014 08:27:03 -0500
From: Seth Forshee
To: LXC development mailing-list
Cc: "Eric W. Biederman" , Greg Kroah-Hartman , Serge Hallyn ,
	"Michael H. Warfield" , linux-kernel@vger.kernel.org, Jens Axboe ,
	Arnd Bergmann , Serge Hallyn , James Bottomley
Subject: Re: [lxc-devel] [RFC PATCH 00/11] Add support for devtmpfs in user namespaces
Message-ID: <20140519132703.GA49509@ubuntu-hedt>
Mail-Followup-To: LXC development mailing-list , "Eric W. Biederman" ,
	Greg Kroah-Hartman , Serge Hallyn , "Michael H. Warfield" ,
	linux-kernel@vger.kernel.org, Jens Axboe , Arnd Bergmann ,
	Serge Hallyn , James Bottomley
References: <20140515040032.GA6702@kroah.com>
	<1400161337.7699.33.camel@canyon.ip6.wittsend.com>
	<20140515140856.GA17453@kroah.com>
	<20140515174254.GM21073@ubuntumail>
	<20140515221551.GB13306@kroah.com>
	<20140516014959.GD22591@ubuntumail>
	<20140516043532.GA14149@kroah.com>
	<87mwehgh5i.fsf@x220.int.ebiederm.org>
	<20140517160145.GA44802@ubuntu-hedt>
	<20140518024458.GB25613@mail.hallyn.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20140518024458.GB25613@mail.hallyn.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, May 18, 2014 at 04:44:58AM +0200, Serge E. Hallyn wrote:
> Quoting Seth Forshee (seth.forshee@canonical.com):
> > On Fri, May 16, 2014 at 09:31:37PM -0700, Eric W. Biederman wrote:
> > > Greg Kroah-Hartman writes:
> > >
> > > > On Fri, May 16, 2014 at 01:49:59AM +0000, Serge Hallyn wrote:
> > > >> > I think having to pick and choose what device nodes you want in a
> > > >> > container is a good thing. Becides, you would have to do the same thing
> > > >> > in the kernel anyway, what's wrong with userspace making the decision
> > > >> > here, especially as it knows exactly what it wants to do much more so
> > > >> > than the kernel ever can.
> > > >>
> > > >> For 'real' devices that sounds sensible. The thing about loop devices
> > > >> is that we simply want to allow a container to say "give me a loop
> > > >> device to use" and have it receive a unique loop device (or 3), without
> > > >> having to pre-assign them. I think that would be cleaner to do using
> > > >> a pseudofs and loop-control device, rather than having to have a
> > > >> daemon in userspace on the host farming those out in response to
> > > >> some, I don't know, dbus request?
> > > >
> > > > I agree that loop devices would be nice to have in a container, and that
> > > > the existing loop interface doesn't really lend itself to that. So
> > > > create a new type of thing that acts like a loop device in a container.
> > > > But don't try to mess with the whole driver core just for a single type
> > > > of device.
> > >
> > > Yes. Something like devpts (without the newinstance option). Built to
> > > allow unprivileged users to create loopback devices.
> >
> > That's where I started, and I've got code, so I guess I'll clean it up
> > and send patches. If the stance is that only system-wide CAP_SYS_ADMIN
> > gets to do privileged block device ioctls, including reading partitions
>
> Sorry, where did that come from? What Eric was referring to below is
> the fs superblock readers not being trusted. Maybe I glossed over another
> email where it was mentioned?

You must have. Take a look at [1].
To repeat the point: the ioctl to reread partitions (along with several
other block device ioctls) has a capable(CAP_SYS_ADMIN) check. We can't
change this to an ns_capable check without at minimum the block layer
knowing about the namespace associated with the block device. Ergo we
can't reread partitions if this is done entirely within the loop driver
via a pseudo fs.

[1] http://article.gmane.org/gmane.linux.kernel.containers.lxc.devel/8191
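
To make the capable() vs. ns_capable() distinction concrete, here is a
rough userspace model of the two checks. This is only a sketch, not
kernel code: every model_* name is made up for illustration, the
namespace walk is simplified from what the kernel's capability check
does, and the "ns aware" variant assumes a block device that records an
owning user namespace, which is exactly the piece the block layer does
not have today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only. All model_* names are hypothetical. */
#define MODEL_CAP_SYS_ADMIN 21 /* numeric value of CAP_SYS_ADMIN */

struct model_user_ns {
	const struct model_user_ns *parent; /* NULL for the initial namespace */
	unsigned int owner_uid;             /* uid that created this namespace */
};

struct model_cred {
	const struct model_user_ns *user_ns; /* namespace the task lives in */
	unsigned int euid;
	unsigned long long cap_effective;    /* one bit per capability */
};

/* Stands in for the kernel's init_user_ns. */
static const struct model_user_ns model_init_user_ns = { NULL, 0 };

/*
 * Simplified ns_capable(): walk from the target namespace up towards
 * the initial one. The task passes if it holds the capability in a
 * namespace on that path, or if it owns a child namespace on the path.
 */
static bool model_ns_capable(const struct model_cred *cred,
			     const struct model_user_ns *target, int cap)
{
	const struct model_user_ns *ns;

	assert(cred != NULL && target != NULL);
	for (ns = target; ns != NULL; ns = ns->parent) {
		if (ns == cred->user_ns)
			return (cred->cap_effective >> cap) & 1ULL;
		if (ns->parent == cred->user_ns && ns->owner_uid == cred->euid)
			return true;
	}
	return false;
}

/* capable() is the same check, always against the initial namespace. */
static bool model_capable(const struct model_cred *cred, int cap)
{
	return model_ns_capable(cred, &model_init_user_ns, cap);
}

/* Models the existing check on the reread-partitions ioctl. */
static bool model_reread_allowed_today(const struct model_cred *cred)
{
	return model_capable(cred, MODEL_CAP_SYS_ADMIN);
}

/*
 * Hypothetical alternative: only possible if the device carries an
 * owning namespace (dev_ns) that the check can be made against.
 */
static bool model_reread_allowed_ns_aware(const struct model_cred *cred,
					  const struct model_user_ns *dev_ns)
{
	return model_ns_capable(cred, dev_ns, MODEL_CAP_SYS_ADMIN);
}
```

In this model, root inside a container (full capabilities, but in a
child user namespace) fails model_reread_allowed_today() and would only
pass model_reread_allowed_ns_aware() when handed the namespace that
owns the device. Without the block layer tracking that namespace there
is nothing to pass as dev_ns.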