From: ebiederm@xmission.com (Eric W. Biederman)
To: Serge Hallyn <serge@hallyn.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>,
	Aristeu Rozanski <aris@ruivo.org>,
	Neil Horman <nhorman@tuxdriver.com>,
	"Serge E. Hallyn" <serue@us.ibm.com>,
	containers@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, Michal Hocko <mhocko@suse.cz>,
	Thomas Graf <tgraf@suug.ch>, Paul Mackerras <paulus@samba.org>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	Arnaldo Carvalho de Melo <acme@ghostprotocols.net>,
	Johannes Weiner <hannes@cmpxchg.org>, Tejun Heo <tj@kernel.org>,
	cgroups@vger.kernel.org, Paul Turner <pjt@google.com>,
	Ingo Molnar <mingo@redhat.com>
Subject: Re: Controlling devices and device namespaces
Date: Sun, 16 Sep 2012 07:23:50 -0700
Message-ID: <87k3vuqc5l.fsf@xmission.com>
In-Reply-To: <5055D4D1.3070407@hallyn.com> (Serge Hallyn's message of "Sun, 16 Sep 2012 08:32:01 -0500")

Serge Hallyn <serge@hallyn.com> writes:

> On 09/16/2012 07:17 AM, Eric W. Biederman wrote:
>> ebiederm@xmission.com (Eric W. Biederman) writes:
>>
>>> Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
>>>
>>>>> One piece of the puzzle is that we should be able to allow unprivileged
>>>>> device node creation and access for any device on any filesystem
>>>>> for which unprivileged access is safe.
>>>>
>>>> Which devices are "safe" is policy for all interesting and useful cases,
>>>> as are file permissions, security tags, chroot considerations and the
>>>> like.
>>>>
>>>> It's a complete non starter.
>>
>> Come to think of it mknod is completely unnecessary.
>>
>> Without mknod.  Without being able to mount filesystems containing
>> device nodes.
>
> Hm?  That sounds like it will really upset init/udev/upgrades in the
> container.

udev does not create device nodes.  For an older udev, the worst
I can see happening is mknod failing with EEXIST because
the device node already exists.

We should be able to make it look to init like a ramdisk mounted the
filesystems.

Why should upgrades care?  Package installation shouldn't be calling
mknod.

With a recent distro, at least, I can't imagine this being an
issue.  I expect we could have a kernel build option that removed the
mknod system call and a modern distro wouldn't notice.

> Are you saying all filesystems containing device nodes will need to be
> mounted in advance by the process setting up the container?

As a general rule.

I think in practice there is wiggle room for special cases
like mounting a fresh devpts.  devpts, at least in its mode that always
creates a new instance on mount, seems safe, as it cannot give you
access to any existing devices.
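
A minimal sketch of the fresh-devpts case, in Python, assuming the
devpts "newinstance" and "ptmxmode" mount options and a hypothetical
container path.  The helper names are invented for illustration and are
not from any existing tool.  On kernels where every devpts mount is
already a private instance, "newinstance" is accepted and has no effect.

```python
import ctypes
import ctypes.util
import os


def devpts_mount_args(target, ptmx_mode=0o666):
    """Build the mount(2) arguments for a private devpts instance.

    Hypothetical helper: returns (source, target, fstype, flags, data).
    """
    data = "newinstance,ptmxmode=%04o" % ptmx_mode
    return ("devpts", target, "devpts", 0, data)


def mount_private_devpts(target):
    """Mount a fresh devpts at target.  Needs mount privileges."""
    source, tgt, fstype, flags, data = devpts_mount_args(target)
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.mount.argtypes = [ctypes.c_char_p, ctypes.c_char_p,
                           ctypes.c_char_p, ctypes.c_ulong,
                           ctypes.c_char_p]
    os.makedirs(tgt, exist_ok=True)
    if libc.mount(source.encode(), tgt.encode(), fstype.encode(),
                  flags, data.encode()) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), tgt)
```

The process setting up the container would call something like
mount_private_devpts("/srv/c1/dev/pts"), where the path is only an
example.  The new instance starts with no ptys, so nothing that already
exists on the host becomes reachable through it.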

You can also do a lot of what would normally be done with mknod
with bind mounts to the original device's location.
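
As a rough illustration of the bind-mount approach, here is a sketch in
Python that exposes a chosen set of host device nodes inside a
container's /dev without calling mknod.  The function names, the
container path and the device list are assumptions made up for this
example.  MS_BIND is the flag value from <sys/mount.h>.

```python
import ctypes
import ctypes.util
import os

MS_BIND = 4096  # bind-mount flag from <sys/mount.h>


def bind_plan(host_dev, container_dev, names):
    """Return (source, target) pairs, one per device node to expose."""
    return [(os.path.join(host_dev, n), os.path.join(container_dev, n))
            for n in names]


def bind_mount(source, target):
    """Bind-mount one device node onto target.  Needs mount privileges."""
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.mount.argtypes = [ctypes.c_char_p, ctypes.c_char_p,
                           ctypes.c_char_p, ctypes.c_ulong,
                           ctypes.c_void_p]
    # The target of a file bind mount must already exist, so create an
    # empty regular file as the mount point.  No mknod is involved.
    os.close(os.open(target, os.O_CREAT | os.O_WRONLY, 0o644))
    if libc.mount(source.encode(), target.encode(), None,
                  MS_BIND, None) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), target)


def expose_devices(host_dev, container_dev, names):
    """Apply the whole plan.  Run by whatever sets up the container."""
    for source, target in bind_plan(host_dev, container_dev, names):
        bind_mount(source, target)
```

For example, expose_devices("/dev", "/srv/c1/dev", ["null", "zero",
"urandom"]) would be run before the container's init starts.  Which
devices to list is a policy decision left to the caller.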

>> The mount namespace is sufficient to prevent all of the
>> cases that the device control group prevents (open and mknod on device
>> nodes).
>>
>> So I honestly think the device control group is superfluous, and it is
>> probably wise to deprecate it and move to a model where it does not
>> exist.
>>
>> Eric
>>
>
> That's what I said a few emails ago :)  The device cgroup was meant as
> a short-term workaround for lack of user (and device) namespaces.

I am saying something stronger.  The device cgroup doesn't seem to have
a practical function now.  That for the general case we don't need any
kernel support.  That all of this should be a matter of some user space
glue code, and just the tiniest bit of sorting out how hotplug events are
sent.

The only thing I can think we would need a device namespace for is
for migration.

For migration with direct access to real hardware devices we must treat
it as hardware hotunplug.  There is nothing else we can do.

If there is any other case where we need to preserve device numbers,
etc., we have the example of devpts.

So at this point I really don't think we need a device namespace or a
device control group.  (Just emulate devtmpfs, sysfs and uevents).

Eric

