* Re: [lxc-devel] device namespaces
       [not found] <CALRD3qKmpzJCRszkG_S9Z3XgoTGWVMFd7FqeJh+W-9pZqPVhCg@mail.gmail.com>
@ 2014-09-24  5:04 ` Eric W. Biederman
       [not found]   ` <CALRD3qKPJHmmY2DSNNfNKzmLihDLm9fgBQprCXNMHVOArV4iuw@mail.gmail.com>
  2014-09-24 16:38   ` Serge Hallyn
  0 siblings, 2 replies; 10+ messages in thread
From: Eric W. Biederman @ 2014-09-24  5:04 UTC (permalink / raw)
  To: LXC development mailing-list
  Cc: linux-kernel, Miklos Szeredi, fuse-devel, Tejun Heo,
	Seth Forshee, serge.hallyn

riya khanna <riyakhanna1983@gmail.com> writes:

> (Please pardon multiple emails, artifact of merging all separate conversations)
>
> Thanks for your feedback! 
>
> Letting the kernel know about what devices a container could access (based on
> device cgroups) and having devtmpfs in the kernel create device nodes for a
> container that map to corresponding CUSE nodes is what I thought of. For
> example, "echo 29:0 > /proc/<pid>/devices" would prepare a virtual framebuffer
> (based on real fb0 SCREENINFO properties) for this process provided permissions
> allow this operation. To view the framebuffer, the CUSE based virtual device
> would talk to the actual hardware. Since namespaces would have different view of
> the underlying devices, "sysfs" has to be made aware of this as well.
>
> Please let me know your inputs. Thanks again!

The solution hugely depends on what you are trying to do with it.

The situation today is that device nodes are slowly fading out.  In
another 20 years linux may not have any device nodes at all.

Therefore the question becomes what are you trying to support.

If it is just filtering of existing device nodes, we can do a pretty
good approximation with bind mounts.
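
A minimal sketch of that filtering idea, assuming the container's root filesystem lives at a host path and that only a short allow-list of device nodes should be visible inside it. The function name, the rootfs path, and the allow-list are hypothetical; the sketch only builds the commands and does not run them.

```python
import os
import shlex


def bind_mount_commands(rootfs, allowed_nodes):
    """Return shell commands exposing only allowed_nodes under rootfs.

    rootfs and allowed_nodes are illustrative inputs, not taken from
    any real container configuration.
    """
    commands = []
    for node in allowed_nodes:
        # /dev/fb0 on the host becomes <rootfs>/dev/fb0 in the container.
        target = os.path.join(rootfs, node.lstrip("/"))
        # A bind mount of a single file needs an existing file to mount over.
        commands.append(f"touch {shlex.quote(target)}")
        commands.append(
            f"mount --bind {shlex.quote(node)} {shlex.quote(target)}"
        )
    return commands
```

Calling `bind_mount_commands("/srv/c1", ["/dev/null", "/dev/fb0"])` yields four commands; any device node not in the list simply never appears in the container's /dev.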

If you want to emulate a device you can use normal fuse (not cuse),
as a normal fuse file will support arbitrary ioctls.

There are a few cases where it is desirable to emulate what devpts
does for allowing arbitrary users to create virtual devices in the
kernel.  Loop devices in particular.

Ultimately given the existence of device hotplug I don't see any call
for being able to create device nodes with well known device numbers
(fundamentally what a device namespace would be about).
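
For readers unfamiliar with the "well known device numbers" in question: a device node is identified by a major:minor pair, such as the 29:0 quoted above for fb0. A small sketch of how such a pair maps onto the combined device number the kernel reports, using only Python's standard `os` helpers (the function names here are hypothetical):

```python
import os


def parse_devnum(text):
    """Split a 'major:minor' string such as '29:0' into two integers."""
    major, minor = text.strip().split(":")
    return int(major), int(minor)


def to_dev_t(text):
    """Combine 'major:minor' into a single device number (dev_t)."""
    major, minor = parse_devnum(text)
    return os.makedev(major, minor)


def describe(dev):
    """Render a combined device number back as 'major:minor'."""
    return f"{os.major(dev)}:{os.minor(dev)}"
```

A device namespace would let two containers each see their own 29:0; the point being made here is that hotplug-aware software should not depend on the number being fixed in the first place.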

The conversation last year was about people wanting to multiplex devices
that don't have multiplexer support in the kernel.  If that is your
desire, I think it is entirely reasonable to add multiplexing support
to the kernel device type by device type, or potentially just use fuse
or cuse to implement your multiplexer in userspace, though that has the
potential to be unusably slow.

Eric

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [lxc-devel] device namespaces
       [not found]   ` <CALRD3qKPJHmmY2DSNNfNKzmLihDLm9fgBQprCXNMHVOArV4iuw@mail.gmail.com>
@ 2014-09-24 16:37     ` Serge Hallyn
  2014-09-24 17:43       ` Using devices in Containers (was: [lxc-devel] device namespaces) Eric W. Biederman
  2014-09-24 19:07       ` [lxc-devel] device namespaces Riya Khanna
  0 siblings, 2 replies; 10+ messages in thread
From: Serge Hallyn @ 2014-09-24 16:37 UTC (permalink / raw)
  To: riya khanna
  Cc: Eric W. Biederman, LXC development mailing-list, Miklos Szeredi,
	fuse-devel, Tejun Heo, Seth Forshee, linux-kernel

Isolation is provided by the devices cgroup.  You want something more
than isolation.

Quoting riya khanna (riyakhanna1983@gmail.com):
> My use case for having device namespaces is device isolation. Isn't that what
> namespaces are there for (as I understand)? Not everything should be
> accessible (or even visible) from a container all the time (we have seen
> people come up with different use cases for this). However, bind-mounting
> takes away this flexibility. I agree that assigning fixed device numbers is
> clearly not a long-term solution. Emulation for safe and flexible
> multiplexing, like you suggested either using CUSE/FUSE or something like
> devpts, is what I'm exploring.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [lxc-devel] device namespaces
  2014-09-24  5:04 ` [lxc-devel] device namespaces Eric W. Biederman
       [not found]   ` <CALRD3qKPJHmmY2DSNNfNKzmLihDLm9fgBQprCXNMHVOArV4iuw@mail.gmail.com>
@ 2014-09-24 16:38   ` Serge Hallyn
  1 sibling, 0 replies; 10+ messages in thread
From: Serge Hallyn @ 2014-09-24 16:38 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: LXC development mailing-list, linux-kernel, Miklos Szeredi,
	fuse-devel, Tejun Heo, Seth Forshee

Quoting Eric W. Biederman (ebiederm@xmission.com):
> The conversation last year was about people wanting to multiplex devices
> that don't have multiplexer support in the kernel.  If that is your
> desire I think it is entirely reasonable to device type by device type
> add support for multiplexing that device type to the kernel, or
> potentially just use fuse or cuse to implement your multiplexer in
> userspace but that has the potential to be unusably slow.

It would be helpful to have a list of devices that may want that
multiplexing.  Is it really just loop and graphics drivers?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Using devices in Containers (was: [lxc-devel] device namespaces)
  2014-09-24 16:37     ` Serge Hallyn
@ 2014-09-24 17:43       ` Eric W. Biederman
  2014-09-24 19:30         ` Riya Khanna
  2014-09-24 19:07       ` [lxc-devel] device namespaces Riya Khanna
  1 sibling, 1 reply; 10+ messages in thread
From: Eric W. Biederman @ 2014-09-24 17:43 UTC (permalink / raw)
  To: riya khanna
  Cc: LXC development mailing-list, Miklos Szeredi, fuse-devel,
	Tejun Heo, Seth Forshee, linux-kernel, Serge Hallyn

Serge Hallyn <serge.hallyn@ubuntu.com> writes:

> Isolation is provided by the devices cgroup.  You want something more
> than isolation.
>
> Quoting riya khanna (riyakhanna1983@gmail.com):
>> My use case for having device namespaces is device isolation. Isn't that what
>> namespaces are there for (as I understand)?

Namespaces fundamentally provide for using the same ``global'' name
in different contexts.  This allows them to be used for isolation
and process migration (because you can take the same name from
machine to machine).

Unless someone cares about device numbers at a namespace level
the work is done.

The mount namespace exists to deal with file names.
The devices cgroup will limit which devices you can access (although
I can't ever imagine a case where the mount namespace would be
insufficient).
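
As an illustration of the devices cgroup half of that statement: the cgroup v1 devices controller takes rules of the form "type major:minor access", written to a cgroup's devices.allow or devices.deny file. The helper below is a hypothetical sketch that only formats and validates such a rule; it does not write to any cgroup.

```python
def device_rule(dev_type, major, minor, access):
    """Format a cgroup v1 devices controller rule, e.g. 'c 29:0 rw'.

    dev_type: 'c' (char), 'b' (block) or 'a' (all)
    major, minor: integers, or '*' as a wildcard
    access: any non-empty combination of r (read), w (write), m (mknod)
    """
    if dev_type not in ("a", "b", "c"):
        raise ValueError(f"unknown device type: {dev_type!r}")
    if not access or set(access) - set("rwm"):
        raise ValueError(f"invalid access string: {access!r}")
    return f"{dev_type} {major}:{minor} {access}"
```

For example, `device_rule("c", 29, 0, "rw")` produces the line that would allow read and write access to the first framebuffer without allowing mknod.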

>> Not everything should be
>> accessible (or even visible) from a container all the time (we have seen
>> people come up with different use cases for this). However, bind-mounting
>> takes away this flexibility.

I don't see how.  If they are mounts that propagate into the container
and are controlled from outside you can do whatever you want.  (I am
imagining device by device bind mounts here).  It should be trivial
to have a directory tree that propagates into a container and works.
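
A rough sketch of what such a propagating tree could look like, assuming a host-side directory that is marked shared and bind-mounted into the container before it starts. All paths and function names are hypothetical, and the sketch only generates the commands; whether the container side should additionally be marked as a slave mount is left open here.

```python
import os
import shlex


def shared_tree_setup(host_dir, container_dir):
    """Commands creating a shared mount visible at container_dir."""
    q = shlex.quote
    return [
        f"mkdir -p {q(host_dir)} {q(container_dir)}",
        # Turn the directory into a mount point of its own...
        f"mount --bind {q(host_dir)} {q(host_dir)}",
        # ...and mark it shared so later mounts under it propagate.
        f"mount --make-shared {q(host_dir)}",
        f"mount --bind {q(host_dir)} {q(container_dir)}",
    ]


def grant(host_dir, node):
    """Commands making one device node appear in the shared tree."""
    q = shlex.quote
    target = os.path.join(host_dir, os.path.basename(node))
    return [f"touch {q(target)}", f"mount --bind {q(node)} {q(target)}"]


def revoke(host_dir, node):
    """Command removing a previously granted device node."""
    target = os.path.join(host_dir, os.path.basename(node))
    return [f"umount {shlex.quote(target)}"]
```

With this layout, granting or revoking a device is a single mount or umount performed outside the container, which is the flexibility being argued for.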

>> I agree that assigning fixed device numbers is
>> clearly not a long-term solution. Emulation for safe and flexible
>> multiplexing, like you suggested either using CUSE/FUSE or something like
>> devpts, is what I'm exploring.

Is the problem you actually care about multiplexing devices?

I think there is quite a bit of room to talk about how to safely
and effectively use devices in containers.   So let's make that the
discussion.  No one actually wants device number namespaces and talking
about them only muddies the waters.

Eric

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [lxc-devel] device namespaces
  2014-09-24 16:37     ` Serge Hallyn
  2014-09-24 17:43       ` Using devices in Containers (was: [lxc-devel] device namespaces) Eric W. Biederman
@ 2014-09-24 19:07       ` Riya Khanna
  1 sibling, 0 replies; 10+ messages in thread
From: Riya Khanna @ 2014-09-24 19:07 UTC (permalink / raw)
  To: Serge Hallyn
  Cc: Eric W. Biederman, LXC development mailing-list, Miklos Szeredi,
	fuse-devel, Tejun Heo, Seth Forshee, linux-kernel

I guess policy-based multiplexing (or exclusive ownership) is the usage. What kind of devices (loop, fb, etc.) this is needed for depends on the usage. If there are multiple FBs, then each container could potentially own one. One may want to provide exclusive ownership of input devices to one container at a time to avoid information leakage. Like we saw at LPC last year, this applies to sensors (gps, accelerometer, etc.) on mobile devices as well.
 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Using devices in Containers (was: [lxc-devel] device namespaces)
  2014-09-24 17:43       ` Using devices in Containers (was: [lxc-devel] device namespaces) Eric W. Biederman
@ 2014-09-24 19:30         ` Riya Khanna
  2014-09-24 22:38           ` Using devices in Containers Eric W. Biederman
  0 siblings, 1 reply; 10+ messages in thread
From: Riya Khanna @ 2014-09-24 19:30 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: LXC development mailing-list, Miklos Szeredi, fuse-devel,
	Tejun Heo, Seth Forshee, linux-kernel, Serge Hallyn


On Sep 24, 2014, at 12:43 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:

> Serge Hallyn <serge.hallyn@ubuntu.com> writes:
> 
>> Isolation is provided by the devices cgroup.  You want something more
>> than isolation.
>> 
>> Quoting riya khanna (riyakhanna1983@gmail.com):
>>> My use case for having device namespaces is device isolation. Isn't that what
>>> namespaces are there for (as I understand)?
> 
> Namespaces fundamentally provide for using the same ``global'' name
> in different contexts.  This allows them to be used for isolation
> and process migration (because you can take the same name from
> machine to machine).
> 
> Unless someone cares about device numbers at a namespace level
> the work is done.
> 
> The mount namespace exists to deal with file names.
> The devices cgroup will limit which devices you can access (although
> I can't ever imagine a case where the mount namespace would be
> insufficient).
> 
>>> Not everything should be
>>> accessible (or even visible) from a container all the time (we have seen
>>> people come up with different use cases for this). However, bind-mounting
>>> takes away this flexibility.
> 
> I don't see how.  If they are mounts that propagate into the container
> and are controlled from outside you can do whatever you want.  (I am
> imagining device by device bind mounts here).  It should be trivial
> to have a directory tree that propagates into a container and works.
> 

Device-by-device bind mounts can grant/revoke access to real individual devices as and when needed. However, revoking the access to real devices could break the applications if there’s no transparent mechanism to back up the propagated (but now revoked) device bind mounts that could fool the apps into believing that they are working with real devices. Frame buffer is one such example, where safe multiplexing could be applied. 

>>> I agree that assigning fixed device numbers is
>>> clearly not a long-term solution. Emulation for safe and flexible
>>> multiplexing, like you suggested either using CUSE/FUSE or something like
>>> devpts, is what I'm exploring.
> 
> Is the problem you actually care about multiplexing devices?
> 

The problem I care about is access to real devices, such as input, fb, loop, etc. as and when needed, thereby having native I/O performance - either through secure multiplexing or exclusive ownership, whatever makes sense according to the device type. 

> I think there is quite a bit of room to talk about how to safely
> and effectively use devices in containers.   So let's make that the
> discussion.  No one actually wants device number namespaces and talking
> about them only muddies the waters.
> 

I cannot agree more. Let’s restrict the discussion to it.

Thanks,
Riya

> Eric


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Using devices in Containers
  2014-09-24 19:30         ` Riya Khanna
@ 2014-09-24 22:38           ` Eric W. Biederman
       [not found]             ` <CALRD3qLYAc+K8e1xYb27ipi4KyGRmTxokPCHN0L_zta=Cy9sCQ@mail.gmail.com>
  0 siblings, 1 reply; 10+ messages in thread
From: Eric W. Biederman @ 2014-09-24 22:38 UTC (permalink / raw)
  To: Riya Khanna
  Cc: LXC development mailing-list, Miklos Szeredi, fuse-devel,
	Tejun Heo, Seth Forshee, linux-kernel, Serge Hallyn

Riya Khanna <riyakhanna1983@gmail.com> writes:

> I guess policy-based multiplexing (or exclusive ownership) is the
> usage. What kind of devices (loop, fb, etc.) this is needed for
> depends on the usage. If there are multiple FBs, then each container
> could potentially own one. One may want to provide exclusive ownership
> of input devices to one container at a time to avoid information
> leakage. Like we saw at LPC last year, this applies to sensors (gps,
> accelerometer, etc.) on mobile devices as well. 

Allowing multiplexing of those devices seems reasonable.

Where the discussion ran into problems last time was that people did not
want to use any of the existing linux solutions for multiplexing those
kinds of things and wanted to invent something new.

Inventing something new is fine if the extra code maintenance can be
justified, or if the invention is just a better solution for all users and
new code can just start using that in general.

The old solution to your problem of multiplexing devices is
allocating a virtual terminal and sending signals to coordinate
cooperatively sharing those resources.
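
For reference, the cooperative scheme mentioned here is the VT_SETMODE interface from <linux/vt.h>: a process asks the kernel to signal it before its virtual terminal is switched away and after it is switched back. The sketch below only packs the `struct vt_mode` argument and lists the relevant ioctl numbers; the choice of SIGUSR1 and SIGUSR2 is an assumption, and nothing is sent to a real tty.

```python
import signal
import struct

# ioctl request numbers and mode values as defined in <linux/vt.h>.
VT_SETMODE = 0x5602
VT_RELDISP = 0x5605
VT_PROCESS = 0x01  # switching is controlled by this process
VT_ACKACQ = 0x02   # VT_RELDISP argument acknowledging an acquire


def pack_vt_mode(relsig=signal.SIGUSR1, acqsig=signal.SIGUSR2):
    """Pack struct vt_mode { char mode, waitv; short relsig, acqsig, frsig; }."""
    return struct.pack("bbhhh", VT_PROCESS, 0, int(relsig), int(acqsig), 0)


# Intended use on an open VT file descriptor (not executed here):
#   fcntl.ioctl(fd, VT_SETMODE, pack_vt_mode())
#   on relsig: restore device state, then fcntl.ioctl(fd, VT_RELDISP, 1)
#   on acqsig: fcntl.ioctl(fd, VT_RELDISP, VT_ACKACQ), then redraw
```

The scheme is cooperative because the switch only completes once the current owner has answered the release signal with VT_RELDISP.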

If you want some sort of preemptive multitasking, that requires
a bit more effort, and work in the device abstractions.
You may be able to share concepts and library code but I don't believe
there is something you can just paint on top of devices and make it
happen.  Certainly in the bad old days of X terminal switching the
cooperation was necessary so that when a video card was yanked from an
application writing directly to that video card the application would
restore the video card to a known state so the next application
would have a chance of making sense of it.   Furthermore most devices
are not safe to let unprivileged users access their control registers
directly.

All of which boils down to the simple fact that for each type of device you
would like to share it is necessary to update the subsystem to support
arbitrary numbers of virtual devices that you can talk to.

The macvlan driver in the networking stack is a rough example of what I
expect you would like.  Something that takes one real physical device
and turns it into N virtual devices each of which runs at effectively
full speed.  Along with some kind of new master interface for
controlling when the multiplexing takes place.
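
To make the macvlan comparison concrete, here is a small sketch that generates the iproute2 commands for carving N macvlan interfaces out of one physical NIC. The parent interface name, the "mv" prefix, and the bridge mode default are illustrative assumptions; the commands are only built, not executed.

```python
import shlex


def macvlan_commands(parent, count, mode="bridge", prefix="mv"):
    """Return 'ip link add' commands creating count macvlan devices."""
    q = shlex.quote
    return [
        f"ip link add link {q(parent)} name {q(prefix + str(i))} "
        f"type macvlan mode {q(mode)}"
        for i in range(count)
    ]
```

Each resulting interface could then be moved into a different container's network namespace, which is the one-physical-to-N-virtual pattern being suggested for other device types.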

I think we do most of this in software today and arguably for a lot of
devices the overhead is small enough that a software solution is fine.
So perhaps all you need is a fuse interface to the existing software
multiplexers so that weird legacy code can be made to run.

Now I suspect part of doing this right will be getting proper video
drivers on Android.  I assume that Android is the platform you care
about.

Eric



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Using devices in Containers
       [not found]             ` <CALRD3qLYAc+K8e1xYb27ipi4KyGRmTxokPCHN0L_zta=Cy9sCQ@mail.gmail.com>
@ 2014-09-25 15:40               ` riya khanna
  2014-09-25 18:09                 ` Eric W. Biederman
  2014-09-25 18:21               ` Eric W. Biederman
  1 sibling, 1 reply; 10+ messages in thread
From: riya khanna @ 2014-09-25 15:40 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: LXC development mailing-list, Miklos Szeredi, fuse-devel,
	Tejun Heo, Seth Forshee, linux-kernel, Serge Hallyn

Is there a plan or work-in-progress to add namespace tags to other
classes in sysfs similar to net? Does it make sense to add namespace
tags to kobjects?

-Riya

On Wed, Sep 24, 2014 at 7:25 PM, riya khanna <riyakhanna1983@gmail.com> wrote:
> On Wed, Sep 24, 2014 at 5:38 PM, Eric W. Biederman <ebiederm@xmission.com>
> wrote:
>> I think we do most of this in software today and arguably for a lot of
>> devices the overhead is small enough that a software solution is fine.
>> So perhaps all you need is a fuse interface to the existing software
>> multiplexers so that weird legacy code can be made to run.
>>
>
> What kind of existing multiplexers could be used? Is there one for fb? We
> have evdev abstractions for input in place already.
>
>> Now I suspect part of doing this right will be getting proper video
>> drivers on Android.  I assume that Android is the platform you care
>> about.
>>
>> Eric
>>
>>
>


* Re: Using devices in Containers
  2014-09-25 15:40               ` riya khanna
@ 2014-09-25 18:09                 ` Eric W. Biederman
  0 siblings, 0 replies; 10+ messages in thread
From: Eric W. Biederman @ 2014-09-25 18:09 UTC (permalink / raw)
  To: riya khanna
  Cc: LXC development mailing-list, Miklos Szeredi, fuse-devel,
	Tejun Heo, Seth Forshee, linux-kernel, Serge Hallyn

riya khanna <riyakhanna1983@gmail.com> writes:

> Is there a plan or work-in-progress to add namespace tags to other
> classes in sysfs similar to net? Does it make sense to add namespace
> tags to kobjects?

Currently there is a general nack from gregkh on such work.

Given that sysfs is almost never a fast path I suspect it makes most
sense to filter sysfs in some way (aka bind mounts or fuse) and present
the results to the container.
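
As a rough illustration of the bind mount variant, here is a minimal
sketch that only plans which sysfs entries would be bind mounted into a
container; it does not mount anything itself.  The function name
plan_bind_mounts, the container root path and the allow-list entries are
all hypothetical and chosen for illustration, they do not come from LXC
or from the kernel.

```python
import posixpath


def plan_bind_mounts(allowed, sysfs_root="/sys",
                     container_root="/var/lib/lxc/c1/rootfs"):
    """Return sorted (source, target) pairs for the allowed sysfs entries.

    `allowed` holds paths relative to sysfs_root.  The default
    container_root is a hypothetical example path.
    """
    cleaned = set()
    for rel in allowed:
        clean = posixpath.normpath(rel.strip("/"))
        # Reject empty entries and anything that escapes sysfs_root.
        if clean == "." or clean == ".." or clean.startswith("../"):
            raise ValueError("entry escapes sysfs root: %r" % rel)
        cleaned.add(clean)

    plan = []
    for clean in sorted(cleaned):
        source = posixpath.join(sysfs_root, clean)
        target = posixpath.join(container_root, "sys", clean)
        plan.append((source, target))
    return plan


if __name__ == "__main__":
    # Print the bind mounts a container setup tool might then perform.
    for src, dst in plan_bind_mounts(["class/graphics/fb0", "class/net/eth0"]):
        print("mount --bind %s %s" % (src, dst))
```

Anything not on the allow list simply never shows up under the
container's /sys, which is the filtering effect described above.  A fuse
based filter would make the same decision per lookup instead of once at
setup time.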

Once this is something that we are using a lot, have demonstrated the
usefulness of, and it appears a kernel level solution would be better,
it would be worth reopening the discussion.

Eric


* Re: Using devices in Containers
       [not found]             ` <CALRD3qLYAc+K8e1xYb27ipi4KyGRmTxokPCHN0L_zta=Cy9sCQ@mail.gmail.com>
  2014-09-25 15:40               ` riya khanna
@ 2014-09-25 18:21               ` Eric W. Biederman
  1 sibling, 0 replies; 10+ messages in thread
From: Eric W. Biederman @ 2014-09-25 18:21 UTC (permalink / raw)
  To: riya khanna
  Cc: LXC development mailing-list, Miklos Szeredi, fuse-devel,
	Tejun Heo, Seth Forshee, linux-kernel, Serge Hallyn

riya khanna <riyakhanna1983@gmail.com> writes:

> What kind of existing multiplexers could be used? Is there one for fb? We have
> evdev abstractions for input in place already.

We have X and Wayland/Weston and PulseAudio and doubtless more that I
am not aware of.

For video a lot of work is going into compositing and handling
multiple contexts in the hardware so there may already be support in the
kernel.

Fundamentally these are all pieces of hardware whose information we
allow multiple userspace applications to access or modify.
Therefore there is existing multiplexing somewhere.

I won't claim all of the existing multiplexing methods are good and
should be used as is, but they definitely should be used as a starting
point.


From another perspective there is how kvm tackles this today.  If you
really want to emulate the hardware and make it appear that your
instance of userspace has direct hardware access building upon the
infrastructure that is used for kvm may be worth exploring.

Eric



end of thread, other threads:[~2014-09-25 18:22 UTC | newest]

Thread overview: 10+ messages
     [not found] <CALRD3qKmpzJCRszkG_S9Z3XgoTGWVMFd7FqeJh+W-9pZqPVhCg@mail.gmail.com>
2014-09-24  5:04 ` [lxc-devel] device namespaces Eric W. Biederman
     [not found]   ` <CALRD3qKPJHmmY2DSNNfNKzmLihDLm9fgBQprCXNMHVOArV4iuw@mail.gmail.com>
2014-09-24 16:37     ` Serge Hallyn
2014-09-24 17:43       ` Using devices in Containers (was: [lxc-devel] device namespaces) Eric W. Biederman
2014-09-24 19:30         ` Riya Khanna
2014-09-24 22:38           ` Using devices in Containers Eric W. Biederman
     [not found]             ` <CALRD3qLYAc+K8e1xYb27ipi4KyGRmTxokPCHN0L_zta=Cy9sCQ@mail.gmail.com>
2014-09-25 15:40               ` riya khanna
2014-09-25 18:09                 ` Eric W. Biederman
2014-09-25 18:21               ` Eric W. Biederman
2014-09-24 19:07       ` [lxc-devel] device namespaces Riya Khanna
2014-09-24 16:38   ` Serge Hallyn
