containers.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Amir Goldstein <amir-3AfRa/s5aFdBDgjK7y7TUQ@public.gmane.org>
To: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: Greg Kroah-Hartman
	<gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org>,
	Linux Containers
	<containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>,
	Kay Sievers <kay.sievers-tD+1rO4QERM@public.gmane.org>,
	Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>,
	devel-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org,
	lxc-devel
	<lxc-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org>,
	mhw-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org,
	Stephane Graber
	<stgraber-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>
Subject: Re: Device Namespaces
Date: Thu, 3 Oct 2013 11:58:39 +0300	[thread overview]
Message-ID: <CAA2m6vc3OFmS9VwiTavRzPqhn+qoe6vDCO2sitXpEQ8a1JVyfg@mail.gmail.com> (raw)
In-Reply-To: <87a9iri3ot.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>

On Thu, Oct 3, 2013 at 3:44 AM, Eric W. Biederman <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>wrote:

> Amir Goldstein <amir-3AfRa/s5aFdBDgjK7y7TUQ@public.gmane.org> writes:
>
> > What we really like to see is a setns() style API that can be used to
> > add a device in the context of a namespace in either a "shared" or
> > "private" mode.
>
> I think you mean an "ip link set dev FOO netns XXX" style API.
>

correct.


>
> Right now one of the best suggestions on the table is:
>
> mkdir -p /dev/container/X
> ln /dev/zero /dev/container/X/zero
> ln /dev/null /dev/container/X/null
> ...
>
> With /dev/container/X mounted on /dev for container X.
>
> Which seems to cover putting a device in a namespace, while allowing
> things to still be reasonably managed.
>
> There are a few other variations on that scheme but nothing that says we
> must have kernel support or to create any kind of kernel context beyond
> which directory the device nodes live in.
>
> > This kind of API is a required building block for us to write device
> > drivers that are namespace aware in a way that userspace will have
> > enough flexibility for dynamic configuration.
> >
> > We are trying to come up with a proposal for that sort of API.  When
> > we have something decent, we shall post it.
>
> I really think what you need to write are special drivers that
> facilitate your use case.
>
> For the networking stack we wound up adding veth pairs, and macvlan
> devices, to handle the common sharing modes.
>
> Outside of your sharing situation I am not seeing any need or any
> advantage of creating devices that are modified to be sharable and I am
> seeing a lot of disadvantages to implementing things that way.  The
> biggest is that you seem to working independent of the subsystem
> maintainers of those devices which is generally a poor idea.
>
> Unprivileged creation of device nodes we can handle if it can be shown
> that it is safe to create device nodes.
>
> As I understand your problem you are trying to multiplex a device by
> building a device with a built in stop light.  Where one opener can
> write and the other openers are stopped/dropped.  That sounds very
> similar to macvlan, or ethernet bridging.   From the patches you have
> floated I suspect it would be very simple to build and just need a
> little bit of glue.
>

Excellent! let's focus the discussion on a new device driver we want to
write
which is namespace aware. let's call this device driver valarm-dev.
Similarly to Android's alarm-dev, valarm-dev can be used to request RTC
wakeup calls
from user space and get/set RTC values, but with valarm-dev, every container
may use different values for current time.

As you can see in our patch set, we already have a version of alarm-dev
that maintains
its state inside a context, instead of in global variable, so it is capable
of providing
different context per namespace.

And now for the 1M$ question: per *which* namespace do we attribute the
current realtime clock time?
To UTS namespace (because T historically stands for Time)? To device
namespace?
Even if device namespace would exist, we do not want to tie the policy
decision of "separate time"
to a very wide definition of "separate devices".

So what we want to create, is an API for device driver writers, that will
enable to write a namespace
aware device and allow userspace to configure when the namespace aware
device context is unshared.

We would like to share with you our very initial thoughts about how this
will be implemented:
- Extend register_pernet_subsys/device(ops) API
to register_perns_subsys/device(nstype, ops) API
- Extend pernet_operations to perns_operations that include optional
migrate() and/or unshare() ops
- Let valarm-dev register_peruser_subsys/device(&alarm_userns_ops)
- Implement a new syscall (or netlink command if it makes more sense)
setdevns(int dev_fd, int ns_fd, int nstype, int flags)
- Unlike the netlink set netns case, this API is not used solely to *move*
a device to a different namespace,
  but also to *unshare* a device context between namespaces, for those
devices that resigtered unshare() ops.

This is our missing piece of the puzzle.
After that, whether we make changes to existing drivers (e.g. evdev) or
write new virtualized drivers (e.g. vevdev)
is a technicality. We care not which way to go, whichever way seems more
maintainable.

What do you think of this master plan?

P.S. Please try to refrain from addressing the validity of the use case of
alarm-dev in particular,
as we do not wish to get engage "Android sucks" wars.
We simply want to present the case for improving the namespace
infrastructure to cater the needs
of device driver writers that wish to tailor their drivers for containers
based products.

Cheers,
Amir.



>
> Eric
>

  parent reply	other threads:[~2013-10-03  8:58 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-09-29 19:28 Device Namespaces Amir Goldstein
     [not found] ` <CAA2m6veny-7_ONMA973Wu36U4kz4gAuw0dpodkb8+GZDv6VNBQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-09-29 20:06   ` Greg Kroah-Hartman
     [not found]     ` <20130929200620.GA31304-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2013-09-30 15:36       ` Michael H. Warfield
2013-10-03  0:44   ` Eric W. Biederman
     [not found]     ` <87a9iri3ot.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-10-03  0:59       ` Eric W. Biederman
2013-10-03  8:58       ` Amir Goldstein [this message]
     [not found]         ` <CAA2m6vc3OFmS9VwiTavRzPqhn+qoe6vDCO2sitXpEQ8a1JVyfg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-10-03  9:17           ` Eric W. Biederman
  -- strict thread matches above, loose matches on Subject: below --
2021-06-08  9:38 device namespaces Enrico Weigelt, metux IT consult
2021-06-08 12:30 ` Christian Brauner
2021-06-08 12:41   ` Greg Kroah-Hartman
2021-06-08 14:10     ` Hannes Reinecke
2021-06-08 14:29       ` Christian Brauner
2021-06-08 15:54         ` Hannes Reinecke
2021-06-08 17:16           ` Eric W. Biederman
2021-06-09  6:38             ` Christian Brauner
2021-06-09  7:02               ` Hannes Reinecke
2021-06-09  7:21                 ` Christian Brauner
2021-06-09  7:54                   ` Hannes Reinecke
2021-06-09  8:09                     ` Christian Brauner
2021-06-11 18:14                       ` Eric W. Biederman
2021-06-14  7:49                         ` Enrico Weigelt, metux IT consult
2021-06-14  8:22                           ` Greg KH
2021-06-14 17:36                           ` Eric W. Biederman
2021-06-15 11:24                             ` Enrico Weigelt, metux IT consult
2021-06-15 11:33                               ` Greg KH
2013-08-22 17:43 RFC: Device Namespaces Oren Laadan
2013-08-22 18:21 ` Serge Hallyn
2013-08-26 10:11   ` Oren Laadan
2013-09-06 17:50     ` Eric W. Biederman
2013-09-08 12:28       ` Amir Goldstein
2013-09-09  0:51         ` Eric W. Biederman
2013-09-10  7:09           ` Amir Goldstein
2013-09-25 11:05             ` Janne Karhunen
     [not found]               ` <CAE=NcrbyFFoMn2nfBA_=ZtwD=eGLvqK=L-U9MuGrtJFLZfZppw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-09-25 21:34                 ` Eric W. Biederman
     [not found]                   ` <87bo3gshz5.fsf_-_-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-09-26  5:33                     ` Greg Kroah-Hartman
     [not found]                       ` <20130926053320.GB3725-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2013-09-26  8:25                         ` Janne Karhunen
     [not found]                           ` <CAE=NcrbPXGWU8FUgwchXyL5HjXf+4AKbgUWGe1ZO=Xcq=iV-Lg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-09-26 13:56                             ` Greg Kroah-Hartman
     [not found]                               ` <20130926135604.GA16624-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2013-09-26 17:01                                 ` Janne Karhunen
     [not found]                                   ` <CAE=NcrY3xC1AF_GV2b1KsF7AwYZTuGBuKLS5yBUWoWcmKU4YBg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-09-26 17:07                                     ` Greg Kroah-Hartman
     [not found]                                       ` <20130926170757.GA9345-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2013-09-26 17:56                                         ` Janne Karhunen
2013-09-30 15:37                                         ` James Bottomley
     [not found]                                           ` <1380555439.2161.5.camel-sFMDBYUN5F8GjUHQrlYNx2Wm91YjaHnnhRte9Li2A+AAvxtiuMwx3w@public.gmane.org>
2013-09-30 16:11                                             ` Greg Kroah-Hartman
     [not found]                                               ` <20130930161117.GA26459-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2013-09-30 16:33                                                 ` James Bottomley
2013-10-01  6:19                         ` Janne Karhunen
     [not found]                           ` <CAE=NcrYV2RiMV7PcwEjFGFRBrz9XdZGs86Wau2a+6xpYN2aEHA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-10-01 17:27                             ` Andy Lutomirski
     [not found]                               ` <CALCETrWWoHzuJcnfEUY+cFpOgT5gnG8U1cVbCW0_8V7Z_v6DJw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-10-01 17:53                                 ` Serge E. Hallyn
     [not found]                                   ` <20131001175345.GA4145-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2013-10-01 19:51                                     ` Eric W. Biederman
     [not found]                                       ` <87had0wz07.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-10-01 20:46                                         ` Serge Hallyn
2013-10-02 22:55                                           ` Eric W. Biederman
2013-10-01 20:57                                         ` Greg Kroah-Hartman
     [not found]                                           ` <20131001205718.GA17036-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2013-10-02 22:45                                             ` Eric W. Biederman
2013-10-01 22:19                                         ` Michael H. Warfield
2013-10-01 18:36                                 ` Janne Karhunen
2013-10-01 17:33                             ` Greg Kroah-Hartman
     [not found]                               ` <20131001173342.GA19267-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2013-10-01 18:23                                 ` Janne Karhunen
2013-10-28 23:31                     ` Andrey Wagin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAA2m6vc3OFmS9VwiTavRzPqhn+qoe6vDCO2sitXpEQ8a1JVyfg@mail.gmail.com \
    --to=amir-3afra/s5afdbdgjk7y7tuq@public.gmane.org \
    --cc=containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
    --cc=devel-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org \
    --cc=ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org \
    --cc=gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org \
    --cc=kay.sievers-tD+1rO4QERM@public.gmane.org \
    --cc=luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org \
    --cc=lxc-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org \
    --cc=mhw-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org \
    --cc=stgraber-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).