* [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
@ 2016-07-14 18:20 Andrey Vagin
  0 siblings, 0 replies; 85+ messages in thread
From: Andrey Vagin @ 2016-07-14 18:20 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: James Bottomley, Andrey Vagin, Serge Hallyn,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Alexander Viro, criu-GEFAQzZX7r8dnm+yROfE0A, Eric W. Biederman,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Michael Kerrisk (man-pages)

Each namespace has an owning user namespace, but currently there is no way
to discover these relationships.

PID and user namespaces are hierarchical. There is no way to discover
parent-child relationships either.

Why might we want to know the relationships between namespaces?

One use would be visualization, in order to understand the running system.
Another would be to answer the question: what capability does process X have to
perform operations on a resource governed by namespace Y?

One more use case (usually called abnormal) is checkpoint/restart.
In CRIU we are going to dump and restore nested namespaces.

There was a discussion [1] about which interface to choose to determine
relationships between namespaces.

Eric suggested adding two ioctls [2]:
> Grumble, Grumble.  I think this may actually a case for creating ioctls
> for these two cases.  Now that random nsfs file descriptors are bind
> mountable the original reason for using proc files is not as pressing.
>
> One ioctl for the user namespace that owns a file descriptor.
> One ioctl for the parent namespace of a namespace file descriptor.

Here is an implementation of these ioctls.

[1] https://lkml.org/lkml/2016/7/6/158
[2] https://lkml.org/lkml/2016/7/9/101

Cc: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: James Bottomley <James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: "W. Trevor King" <wking-vJI2gpByivqcqzYg7KEe8g@public.gmane.org>
Cc: Alexander Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
Cc: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>

--
2.5.5

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-08-01 18:20     ` Alban Crequy
@ 2016-08-01 23:32         ` Andrew Vagin
  -1 siblings, 0 replies; 85+ messages in thread
From: Andrew Vagin @ 2016-08-01 23:32 UTC (permalink / raw)
  To: Alban Crequy
  Cc: Serge Hallyn, Andrey Vagin, criu-GEFAQzZX7r8dnm+yROfE0A,
	iago-lYLaGTFnO9sWenYVfaLwtA, Linux API, Linux Containers,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, James Bottomley,
	Alban Crequy, Alexander Viro,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Michael Kerrisk (man-pages),
	Eric W. Biederman

On Mon, Aug 01, 2016 at 08:20:27PM +0200, Alban Crequy wrote:
> Hi,
> 
> On 14 July 2016 at 20:20, Andrey Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org> wrote:
> > Each namespace has an owning user namespace and now there is not way
> > to discover these relationships.
> >
> > Pid and user namepaces are hierarchical. There is no way to discover
> > parent-child relationships too.
> >
> > Why we may want to know relationships between namespaces?
> >
> > One use would be visualization, in order to understand the running system.
> 
> This looks interesting to me because I am interested in representing
> in a graphical way the relationship between different mounts in
> different mount namespaces (showing the ID, the parent-children
> relationships, mount peer groups, the master-slave relationships etc),
> specially for containers. The first idea was to take both
> /proc/1/mountinfo and /proc/$OTHER_PID/mountinfo and I can correlate
> the "shared:" and "master:" fields in the mountinfo files.
> 
> But I cannot read the /proc/$pid/mountinfo of mount namespaces when
> there are no processes in those mount namespaces. For example, if
> those mount namespaces stay alive only because they contain
> "shared&slave" mounts between master mounts and slave mounts that I
> can see in /proc/$pid/mountinfo. Fictional example:
> 
> # mntns 1, mountinfo 1 (visible via /proc/1/mountinfo)
> 61 0 253:1 / / rw shared:1
> 
> # mntns 2, mountinfo 2 (not visible via any /proc/$pid/mountinfo)
> 731 569 0:75 / / rw master:1 shared:42
> 
> # mntns 3, mountinfo 3 (not visible via any /proc/${container_pid}/mountinfo)
> 762 597 0:82 / / rw master:42 shared:76
> 
> As far as I understand, I cannot get a reference to the mntns2 fd
> because mnt namespaces are not hierarchical, and I cannot get its
> /proc/???/mountinfo because no processes live inside.

Hi Alban,

A mount namespace is alive only if someone lives in it or if it is
bind-mounted somewhere.

In your case, the kernel destroys mntns2 and adjusts groups for mounts:

[root@fc24 zzz]# nsenter --mount=mnt2 -- cat /proc/self/mountinfo | grep zzz
184 183 0:43 / /tmp/zzz/a rw,relatime shared:72 master:70 - tmpfs a rw
[root@fc24 zzz]# nsenter --mount=mnt3 -- cat /proc/self/mountinfo | grep zzz
162 161 0:43 / /tmp/zzz/a rw,relatime master:72 - tmpfs a rw

[root@fc24 zzz]# umount mnt2
[root@fc24 zzz]# nsenter --mount=mnt3 -- cat /proc/self/mountinfo | grep zzz
162 161 0:43 / /tmp/zzz/a rw,relatime master:70 - tmpfs a rw
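
The adjustment in the transcript above can be mimicked with a toy model.
This is only a sketch that reproduces the numbers shown (master:72
becoming master:70), not the kernel's propagation code. The function
destroy_namespace and the dict layout are made up for illustration, and
each namespace is assumed to hold a single mount of interest.

```python
def destroy_namespace(mounts, ns):
    """Return the mounts left after the mount in namespace `ns` goes away.

    `mounts` maps a namespace label to (shared_group, master_group), the
    two optional mountinfo fields; either may be None. This layout is a
    hypothetical simplification, not a kernel data structure.
    """
    shared, master = mounts[ns]
    remaining = {k: v for k, v in mounts.items() if k != ns}
    # The peer group survives if another mount still belongs to it.
    if shared is None or any(s == shared for s, _ in remaining.values()):
        return remaining
    # Otherwise mounts that received propagation from the vanished group
    # are re-attached to the master that the group itself had.
    return {k: (s, master if m == shared else m)
            for k, (s, m) in remaining.items()}


before = {"mnt2": (72, 70), "mnt3": (None, 72)}
after = destroy_namespace(before, "mnt2")
print(after)
```

With the values from the transcript, mnt3 ends up with master group 70
once mnt2 is gone.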

Thanks,
Andrew

> 
> Is there a way around it? Should this use case be handled together?
> 
> Thanks!
> Alban
> 
> > Another would be to answer the question: what capability does process X have to
> > perform operations on a resource governed by namespace Y?
> >
> > One more use-case (which usually called abnormal) is checkpoint/restart.
> > In CRIU we age going to dump and restore nested namespaces.
> >
> > There [1] was a discussion about which interface to choose to determing
> > relationships between namespaces.
> >
> > Eric suggested to add two ioctl-s [2]:
> >> Grumble, Grumble.  I think this may actually a case for creating ioctls
> >> for these two cases.  Now that random nsfs file descriptors are bind
> >> mountable the original reason for using proc files is not as pressing.
> >>
> >> One ioctl for the user namespace that owns a file descriptor.
> >> One ioctl for the parent namespace of a namespace file descriptor.
> >
> > Here is an implementaions of these ioctl-s.
> >
> > [1] https://lkml.org/lkml/2016/7/6/158
> > [2] https://lkml.org/lkml/2016/7/9/101
> >
> > Cc: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
> > Cc: James Bottomley <James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
> > Cc: "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> > Cc: "W. Trevor King" <wking-vJI2gpByivqcqzYg7KEe8g@public.gmane.org>
> > Cc: Alexander Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
> > Cc: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
> >
> > --
> > 2.5.5
> >
> > _______________________________________________
> > Containers mailing list
> > Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> > https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-29 18:05                                                       ` Eric W. Biederman
@ 2016-08-01 23:01                                                           ` Andrew Vagin
  -1 siblings, 0 replies; 85+ messages in thread
From: Andrew Vagin @ 2016-08-01 23:01 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: James Bottomley, Andrey Vagin, Linux API, Linux Containers, LKML,
	Alexander Viro, criu-GEFAQzZX7r8dnm+yROfE0A,
	Michael Kerrisk (man-pages),
	linux-fsdevel

On Fri, Jul 29, 2016 at 01:05:48PM -0500, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
> 
> > Hi Eric,
> >
> > On 07/28/2016 02:56 PM, Eric W. Biederman wrote:
> >> "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
> >>
> >>> On 07/26/2016 10:39 PM, Andrew Vagin wrote:
> >>>> On Tue, Jul 26, 2016 at 09:17:31PM +0200, Michael Kerrisk (man-pages) wrote:
> >>
> >>>> If we want to compare two file descriptors of the current process,
> >>>> it is one of cases for which kcmp can be used. We can call kcmp to
> >>>> compare two namespaces which are opened in other processes.
> >>>
> >>> Is there really a use case there? I assume we're talking about the
> >>> scenario where a process in one namespace opens a /proc/PID/ns/*
> >>> file descriptor and passes that FD to another process via a UNIX
> >>> domain socket. Is that correct?
> >>>
> >>> So, supposing that we want to build a map of the relationships
> >>> between namespaces using the proposed kcmp() API, and there are
> >>> say N namespaces? Does this mean we make (N * (N-1) / 2) calls
> >>> to kcmp()?
> >>
> >> Potentially.  The numbers are small enough O(N^2) isn't fatal.
> >
> > Define "small", please.
> >
> > O(N^2) makes me nervous about what other use cases lurk out
> > there that may get bitten by this.
> 
> Worst case for N (One namespace per thread) is about 60k.
> A typical heavy use case may be 1000 namespaces of any type.
> So we are talking about O(N^2) that rarely happens and should be done in
> a couple of seconds.
> 
> >> Where kcmp shines is that it allows migration to happen.  Inode numbers
> >> to change (which they very much will today), and still have things work.
> >
> >
> >> We can keep it O(Nlog(N)) by taking advantage of not just the equality
> >> but the ordering relationship.  Although Ugh.
> >
> > Yes, that sounds pretty ugly...
> 
> Actually having thought about this a little more if kcmp returns an
> ordering by inode and migration preserves the relative order of
> the inodes (which should just be a creation order) it should be quite
> solvable.
> 
> Switch from an order by inode number to an order by object creation
> time, and guarantee that all creations have an order (which with
> task_list_lock we practically already have) and it should be even easier
> to create.  (A 64bit nanosecond resolution timestamp is good for 584
> years of uptime).  A 64bit number that increments each time an object is
> created should have an even better lifespan.
> 
> I don't know if we can find a way to give that guarantee for other kcmp
> comparisons but it is worth a thought.
> 
> >>One disadvantage of
> >> kcmp currently is that the way the ordering relationship is defined
> >> the order is not preserved over migration :(
> >
> > So, does kcmp() fully solve the problem(s) at hand? It sounds like
> > not, if I understand your last point correctly.
> 
> There are 3 possibilities I see for migration in migration, ordered
> in order of implementation difficulty.
> 1) Have a clear signal that migration happened and a nested migration
>    needs to restart.
> 2) Use kcmp so that only the relative order needs to be preserved.
> 3) Preserve the device number and inode numbers.
> 
> At a practical level I think (2) may actually in net be the simplest.
> It requires a little more care to implement and you have to opt in,
> but it should not require any rolling back of activity (merely careful
> ordering of object creation).
> 
> I definitely like kcmp knowing how to compare things by inode
> (aka st_dev, st_ino) because then even if you have to restart
> the comparisons after a migration the exact details you are comparing
> are hidden and so it is easier to support and harder to get wrong.
> 
> I can imagine how to preserve inode numbers by creating a new instance
> of nsfs and using the old inode numbers upon restore.  I don't
> currently see how we could possibly preserve st_dev over migration short of
> a device number namespace.

I think we can avoid comparing st_dev if we compare inode numbers
for parent user namespaces.

Namespaces look like a tree where user namespaces are directories and
other namespaces are files.

A namespace can be described by a path in this imaginary file system,
which looks like /userns1/userns2/XXXns.

In this case we need to guarantee unique names inside each directory and
that they will not change over migration.
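
A minimal sketch of that imaginary file system, to make the idea
concrete. Everything here is hypothetical: the owner and name maps stand
in for whatever the kernel would expose, and the root user namespace is
assumed to appear as the first path component.

```python
def namespace_path(ns, owner, name):
    """Build the /userns1/userns2/XXXns style path for `ns`.

    owner: maps each namespace to its owning user namespace
           (None for the root user namespace).
    name:  maps each namespace to a name that is unique among the
           namespaces sharing the same owner.
    """
    parts = []
    seen = set()
    while ns is not None:
        if ns in seen:
            raise ValueError("cycle in ownership map")
        seen.add(ns)
        parts.append(name[ns])
        ns = owner[ns]
    return "/" + "/".join(reversed(parts))


def names_are_unique(owner, name):
    """Check the guarantee above: no two siblings share a name."""
    taken = set()
    for ns, parent in owner.items():
        key = (parent, name[ns])
        if key in taken:
            return False
        taken.add(key)
    return True


owner = {"u1": None, "u2": "u1", "net": "u2"}
name = {"u1": "userns1", "u2": "userns2", "net": "netns"}
print(namespace_path("net", owner, name))
```

If the names survive migration, two paths can be compared as plain
strings and st_dev never enters the picture.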

> 
> So if we are going to continue with making device numbers be a legacy
> attribute applications should not care about we need a way to compare
> things by not looking at st_dev.  Which brings us back to kcmp.
> 
> Hmm.  Hotplugging a disk and plugging it back likely will change the
> device number and give the same kind of challenge with st_dev (although
> you can't keep a file descriptor open across that kind of event).  So
> certainly a hotplug event on a device should be enough to say don't care
> about the device number.
> 
> Eric
> 

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-14 18:20 ` Andrey Vagin
@ 2016-08-01 18:20     ` Alban Crequy
  -1 siblings, 0 replies; 85+ messages in thread
From: Alban Crequy @ 2016-08-01 18:20 UTC (permalink / raw)
  To: Andrey Vagin
  Cc: Serge Hallyn, criu-GEFAQzZX7r8dnm+yROfE0A,
	iago-lYLaGTFnO9sWenYVfaLwtA, Linux API, Linux Containers,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, James Bottomley,
	Alban Crequy, Alexander Viro,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Michael Kerrisk (man-pages),
	Eric W. Biederman

Hi,

On 14 July 2016 at 20:20, Andrey Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org> wrote:
> Each namespace has an owning user namespace and now there is not way
> to discover these relationships.
>
> Pid and user namepaces are hierarchical. There is no way to discover
> parent-child relationships too.
>
> Why we may want to know relationships between namespaces?
>
> One use would be visualization, in order to understand the running system.

This looks interesting to me because I am interested in representing
graphically the relationships between different mounts in
different mount namespaces (showing the ID, the parent-child
relationships, mount peer groups, the master-slave relationships, etc.),
especially for containers. My first idea was to take both
/proc/1/mountinfo and /proc/$OTHER_PID/mountinfo and correlate
the "shared:" and "master:" fields in the mountinfo files.
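
That correlation can be sketched roughly as follows. This is an
illustration only: the function names and the namespace labels are made
up, and the parser only looks at the "shared:" and "master:" optional
fields, which sit between the mount options and the "-" separator.

```python
def parse_mountinfo_line(line):
    """Return (mount_id, parent_id, mount_point, shared, master).

    Optional fields start at the 7th field and run up to the "-"
    separator; a line without a separator is read to its end.
    """
    fields = line.split()
    try:
        end = fields.index("-", 6)
    except ValueError:
        end = len(fields)
    shared = master = None
    for tag in fields[6:end]:
        key, _, value = tag.partition(":")
        if key == "shared":
            shared = int(value)
        elif key == "master":
            master = int(value)
    return int(fields[0]), int(fields[1]), fields[4], shared, master


def propagation_edges(mounts_by_ns):
    """Map each peer group ID to the namespaces that receive from it."""
    edges = {}
    for ns, lines in mounts_by_ns.items():
        for line in lines:
            master = parse_mountinfo_line(line)[4]
            if master is not None:
                edges.setdefault(master, []).append(ns)
    return edges


# Hypothetical input: one mountinfo line per (made-up) namespace label.
mounts = {
    "mntns1": ["61 0 253:1 / / rw shared:1"],
    "mntns2": ["731 569 0:75 / / rw master:1 shared:42"],
    "mntns3": ["762 597 0:82 / / rw master:42 shared:76"],
}
print(propagation_edges(mounts))
```

The result links peer group 1 to mntns2 and peer group 42 to mntns3,
which is the chain I would like to draw.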

But I cannot read the /proc/$pid/mountinfo of mount namespaces when
there are no processes in those mount namespaces. For example,
those mount namespaces might stay alive only because they contain
"shared&slave" mounts that sit between master mounts and slave mounts that I
can see in /proc/$pid/mountinfo. Fictional example:

# mntns 1, mountinfo 1 (visible via /proc/1/mountinfo)
61 0 253:1 / / rw shared:1

# mntns 2, mountinfo 2 (not visible via any /proc/$pid/mountinfo)
731 569 0:75 / / rw master:1 shared:42

# mntns 3, mountinfo 3 (not visible via any /proc/${container_pid}/mountinfo)
762 597 0:82 / / rw master:42 shared:76

As far as I understand, I cannot get a reference to the mntns2 fd
because mnt namespaces are not hierarchical, and I cannot get its
/proc/???/mountinfo because no processes live inside.

Is there a way around it? Should this use case be handled together?

Thanks!
Alban

> Another would be to answer the question: what capability does process X have to
> perform operations on a resource governed by namespace Y?
>
> One more use-case (which usually called abnormal) is checkpoint/restart.
> In CRIU we age going to dump and restore nested namespaces.
>
> There [1] was a discussion about which interface to choose to determing
> relationships between namespaces.
>
> Eric suggested to add two ioctl-s [2]:
>> Grumble, Grumble.  I think this may actually a case for creating ioctls
>> for these two cases.  Now that random nsfs file descriptors are bind
>> mountable the original reason for using proc files is not as pressing.
>>
>> One ioctl for the user namespace that owns a file descriptor.
>> One ioctl for the parent namespace of a namespace file descriptor.
>
> Here is an implementaions of these ioctl-s.
>
> [1] https://lkml.org/lkml/2016/7/6/158
> [2] https://lkml.org/lkml/2016/7/9/101
>
> Cc: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
> Cc: James Bottomley <James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: "W. Trevor King" <wking-vJI2gpByivqcqzYg7KEe8g@public.gmane.org>
> Cc: Alexander Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
> Cc: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
>
> --
> 2.5.5
>
> _______________________________________________
> Containers mailing list
> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-29 18:05                                                       ` Eric W. Biederman
@ 2016-07-31 21:31                                                           ` Michael Kerrisk (man-pages)
  -1 siblings, 0 replies; 85+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-07-31 21:31 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: James Bottomley, Andrey Vagin, Andrew Vagin, Linux API,
	Linux Containers, LKML, Alexander Viro,
	criu-GEFAQzZX7r8dnm+yROfE0A, mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w,
	linux-fsdevel

Hi Eric,

On 07/29/2016 08:05 PM, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>
>> Hi Eric,
>>
>> On 07/28/2016 02:56 PM, Eric W. Biederman wrote:
>>> "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>>>
>>>> On 07/26/2016 10:39 PM, Andrew Vagin wrote:
>>>>> On Tue, Jul 26, 2016 at 09:17:31PM +0200, Michael Kerrisk (man-pages) wrote:
>>>
>>>>> If we want to compare two file descriptors of the current process,
>>>>> it is one of cases for which kcmp can be used. We can call kcmp to
>>>>> compare two namespaces which are opened in other processes.
>>>>
>>>> Is there really a use case there? I assume we're talking about the
>>>> scenario where a process in one namespace opens a /proc/PID/ns/*
>>>> file descriptor and passes that FD to another process via a UNIX
>>>> domain socket. Is that correct?
>>>>
>>>> So, supposing that we want to build a map of the relationships
>>>> between namespaces using the proposed kcmp() API, and there are
>>>> say N namespaces? Does this mean we make (N * (N-1) / 2) calls
>>>> to kcmp()?
>>>
>>> Potentially.  The numbers are small enough O(N^2) isn't fatal.
>>
>> Define "small", please.
>>
>> O(N^2) makes me nervous about what other use cases lurk out
>> there that may get bitten by this.
>
> Worst case for N (One namespace per thread) is about 60k.

I'm getting an education here: where does the 60k number come from?

> A typical heavy use case may be 1000 namespaces of any type.
> So we are talking about O(N^2) that rarely happens and should be done in
> a couple of seconds.

I don't know whether that's acceptable for the migration use case,
but it seems quite bad for the visualization use case.

>>> Where kcmp shines is that it allows migration to happen and inode numbers
>>> to change (which they very much will today), and still have things work.
>>
>>
>>> We can keep it O(Nlog(N)) by taking advantage of not just the equality
>>> but the ordering relationship.  Although Ugh.
>>
>> Yes, that sounds pretty ugly...
>
> Actually, having thought about this a little more: if kcmp returns an
> ordering by inode and migration preserves the relative order of
> the inodes (which should just be creation order), it should be quite
> solvable.
>
> Switch from an order by inode number to an order by object creation
> time, and guarantee that all creations have an order (which with
> tasklist_lock we practically already have), and it should be even easier
> to create.  (A 64-bit nanosecond-resolution timestamp is good for about 584
> years of uptime.)  A 64-bit number that increments each time an object is
> created should have an even better lifespan.
>
> I don't know if we can find a way to give that guarantee for other kcmp
> comparisons but it is worth a thought.

Okay. So, this is a pathway to O(Nlog(N)) at least then?

>>> One disadvantage of
>>> kcmp currently is that the way the ordering relationship is defined
>>> the order is not preserved over migration :(
>>
>> So, does kcmp() fully solve the problem(s) at hand? It sounds like
>> not, if I understand your last point correctly.
>
> There are 3 possibilities I see for migration in migration, listed
> in order of implementation difficulty.
> 1) Have a clear signal that migration happened and a nested migration
>    needs to restart.
> 2) Use kcmp so that only the relative order needs to be preserved.
> 3) Preserve the device number and inode numbers.
>
> At a practical level I think (2) may actually, on net, be the simplest.
> It requires a little more care to implement and you have to opt in,
> but it should not require any rolling back of activity (merely careful
> ordering of object creation).
>
> I definitely like kcmp knowing how to compare things by inode
> (aka st_dev, st_ino) because then, even if you have to restart
> the comparisons after a migration, the exact details you are comparing
> are hidden, and so it is easier to support and harder to get wrong.
>
> I can imagine how to preserve inode numbers by creating a new
> nsfs instance and using the old inode numbers upon restore.  I don't
> currently see how we could possibly preserve st_dev over migration short of
> a device number namespace.
>
> So if we are going to continue making device numbers a legacy
> attribute that applications should not care about, we need a way to compare
> things without looking at st_dev.  Which brings us back to kcmp.
>
> Hmm.  Hotplugging a disk and plugging it back in will likely change the
> device number and give the same kind of challenge with st_dev (although
> you can't keep a file descriptor open across that kind of event).  So
> certainly a hotplug event on a device should be enough to say don't care
> about the device number.

Okay.

Thanks,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
       [not found]                                                     ` <40e35f1a-10e6-b7a5-936e-a09f008be0d0-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2016-07-29 18:05                                                       ` Eric W. Biederman
  0 siblings, 0 replies; 85+ messages in thread
From: Eric W. Biederman @ 2016-07-29 18:05 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: James Bottomley, Andrew Vagin, Linux API, Linux Containers, LKML,
	criu, Alexander Viro, Andrey Vagin,
	linux-fsdevel

"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:

> Hi Eric,
>
> On 07/28/2016 02:56 PM, Eric W. Biederman wrote:
>> "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:
>>
>>> On 07/26/2016 10:39 PM, Andrew Vagin wrote:
>>>> On Tue, Jul 26, 2016 at 09:17:31PM +0200, Michael Kerrisk (man-pages) wrote:
>>
>>>> If we want to compare two file descriptors of the current process,
>>>> it is one of cases for which kcmp can be used. We can call kcmp to
>>>> compare two namespaces which are opened in other processes.
>>>
>>> Is there really a use case there? I assume we're talking about the
>>> scenario where a process in one namespace opens a /proc/PID/ns/*
>>> file descriptor and passes that FD to another process via a UNIX
>>> domain socket. Is that correct?
>>>
>>> So, supposing that we want to build a map of the relationships
>>> between namespaces using the proposed kcmp() API, and there are
>>> say N namespaces? Does this mean we make (N * (N-1) / 2) calls
>>> to kcmp()?
>>
>> Potentially.  The numbers are small enough O(N^2) isn't fatal.
>
> Define "small", please.
>
> O(N^2) makes me nervous about what other use cases lurk out
> there that may get bitten by this.

Worst case for N (One namespace per thread) is about 60k.
A typical heavy use case may be 1000 namespaces of any type.
So we are talking about O(N^2) that rarely happens and should be done in
a couple of seconds.
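
To put rough numbers on the quadratic case: comparing every pair among N namespaces takes N * (N-1) / 2 calls. The sketch below only does that arithmetic for the two values of N mentioned here; the function name is made up for illustration, and it says nothing about how long a single kcmp() call takes.

```python
def pairwise_comparisons(n: int) -> int:
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# The "typical heavy" and "worst case" namespace counts from the discussion.
for n in (1000, 60000):
    print(n, pairwise_comparisons(n))
# 1000 499500
# 60000 1799970000
```

So the typical heavy case is about half a million comparisons, while the worst case is close to two billion.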

>> Where kcmp shines is that it allows migration to happen and inode numbers
>> to change (which they very much will today), and still have things work.
>
>
>> We can keep it O(Nlog(N)) by taking advantage of not just the equality
>> but the ordering relationship.  Although Ugh.
>
> Yes, that sounds pretty ugly...

Actually, having thought about this a little more: if kcmp returns an
ordering by inode and migration preserves the relative order of
the inodes (which should just be creation order), it should be quite
solvable.
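
As a sketch of why an ordering helps: with a three-way comparison, handles can be sorted and equal neighbours grouped, which needs on the order of N log N comparisons instead of all pairs. `fake_compare` below is a hypothetical stand-in for an ordering comparison between two namespace handles, not the real kcmp() interface, and the handle tuples are invented test data.

```python
from functools import cmp_to_key


def group_equal(handles, compare):
    """Sort handles with a three-way compare, then group equal neighbours.

    Returns (groups, number_of_compare_calls).
    """
    calls = 0

    def counted(a, b):
        nonlocal calls
        calls += 1
        return compare(a, b)

    ordered = sorted(handles, key=cmp_to_key(counted))
    groups = []
    for handle in ordered:
        # After sorting, handles for the same object sit next to each other.
        if groups and counted(groups[-1][0], handle) == 0:
            groups[-1].append(handle)
        else:
            groups.append([handle])
    return groups, calls


def fake_compare(a, b):
    # Hypothetical stand-in: orders handles by a hidden object id.
    return (a[1] > b[1]) - (a[1] < b[1])


# 1000 invented handles that refer to 250 distinct objects.
handles = [(f"fd{i}", i % 250) for i in range(1000)]
groups, calls = group_equal(handles, fake_compare)
print(len(groups), "groups,", calls, "comparisons; all pairs would be", 1000 * 999 // 2)
```

This only works if the comparison gives a consistent total order, which is the guarantee being asked for here.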

Switch from an order by inode number to an order by object creation
time, and guarantee that all creations have an order (which with
tasklist_lock we practically already have), and it should be even easier
to create.  (A 64-bit nanosecond-resolution timestamp is good for about 584
years of uptime.)  A 64-bit number that increments each time an object is
created should have an even better lifespan.
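
A quick check of the lifespan arithmetic, assuming an unsigned 64-bit value and 365.25-day years: at one tick per nanosecond it comes out at roughly 584 years. The helper name and the one-million-creations-per-second rate are assumptions picked only for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_until_wrap(bits: int, ticks_per_second: float) -> float:
    """Years before a counter of the given width wraps at a steady tick rate."""
    return (2 ** bits) / ticks_per_second / SECONDS_PER_YEAR


# Nanosecond timestamp: one tick per nanosecond.
print(round(years_until_wrap(64, 1e9), 1))
# Creation counter: assumes one million object creations per second.
print(round(years_until_wrap(64, 1e6)))
```

A counter that only advances on object creation ticks far less often than a nanosecond clock, which is why it lasts longer.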

I don't know if we can find a way to give that guarantee for other kcmp
comparisons but it is worth a thought.

>> One disadvantage of
>> kcmp currently is that the way the ordering relationship is defined
>> the order is not preserved over migration :(
>
> So, does kcmp() fully solve the problem(s) at hand? It sounds like
> not, if I understand your last point correctly.

There are 3 possibilities I see for migration in migration, listed
in order of implementation difficulty.
1) Have a clear signal that migration happened and a nested migration
   needs to restart.
2) Use kcmp so that only the relative order needs to be preserved.
3) Preserve the device number and inode numbers.

At a practical level I think (2) may actually, on net, be the simplest.
It requires a little more care to implement and you have to opt in,
but it should not require any rolling back of activity (merely careful
ordering of object creation).
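
One way to picture option (2): if every object gets a number from a monotonically increasing counter when it is created, a restore that recreates objects oldest-first ends up with different absolute numbers but the same relative order. The class and function names below are hypothetical, and this is a user-space toy model, not kernel code.

```python
import itertools


class CreationCounter:
    """Toy model of a per-system counter bumped on every object creation."""

    def __init__(self):
        self._next = itertools.count(1)

    def new_id(self) -> int:
        return next(self._next)


def restore_in_creation_order(old_ids, counter):
    """Recreate objects oldest-first; return a mapping old id -> new id."""
    return {old: counter.new_id() for old in sorted(old_ids)}


# Source system: three namespaces created in this order.
source = CreationCounter()
old_ids = {name: source.new_id() for name in ("user", "pid", "net")}

# Destination system: 100 unrelated objects already exist there.
dest = CreationCounter()
for _ in range(100):
    dest.new_id()

mapping = restore_in_creation_order(old_ids.values(), dest)
new_ids = {name: mapping[old] for name, old in old_ids.items()}
print(old_ids)
print(new_ids)
```

The absolute values change across the restore, but sorting by either set of ids gives the same sequence of objects, which is all an ordering-based comparison needs.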

I definitely like kcmp knowing how to compare things by inode
(aka st_dev, st_ino) because then, even if you have to restart
the comparisons after a migration, the exact details you are comparing
are hidden, and so it is easier to support and harder to get wrong.
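
For reference, this is roughly what comparing by inode looks like from user space today, which is the detail a kcmp-based comparison would hide. A minimal sketch using `os.stat`; `same_object` is an illustrative helper name, and temporary files stand in for the /proc/PID/ns/* paths so that it runs anywhere.

```python
import os
import tempfile


def same_object(path_a: str, path_b: str) -> bool:
    """True if both paths resolve to the same (st_dev, st_ino) pair."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "first")
    second = os.path.join(tmp, "second")
    for path in (first, second):
        open(path, "w").close()

    same = same_object(first, first)        # same file reached twice
    different = same_object(first, second)  # two distinct files

print(same, different)
```

On Linux the same helper could be pointed at two /proc/PID/ns/* paths, permissions allowing. Both halves of the pair matter, which is why a changing st_dev is a problem for this approach.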

I can imagine how to preserve inode numbers by creating a new
nsfs instance and using the old inode numbers upon restore.  I don't
currently see how we could possibly preserve st_dev over migration short of
a device number namespace.

So if we are going to continue making device numbers a legacy
attribute that applications should not care about, we need a way to compare
things without looking at st_dev.  Which brings us back to kcmp.

Hmm.  Hotplugging a disk and plugging it back in will likely change the
device number and give the same kind of challenge with st_dev (although
you can't keep a file descriptor open across that kind of event).  So
certainly a hotplug event on a device should be enough to say don't care
about the device number.

Eric

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
       [not found]                                                   ` <87popxkjjp.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
@ 2016-07-28 19:00                                                     ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 85+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-07-28 19:00 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: James Bottomley, Andrey Vagin, Andrew Vagin, Linux API,
	Linux Containers, LKML, Alexander Viro,
	criu, mtk.manpages,
	linux-fsdevel

Hi Eric,

On 07/28/2016 02:56 PM, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:
>
>> On 07/26/2016 10:39 PM, Andrew Vagin wrote:
>>> On Tue, Jul 26, 2016 at 09:17:31PM +0200, Michael Kerrisk (man-pages) wrote:
>
>>> If we want to compare two file descriptors of the current process,
>>> it is one of cases for which kcmp can be used. We can call kcmp to
>>> compare two namespaces which are opened in other processes.
>>
>> Is there really a use case there? I assume we're talking about the
>> scenario where a process in one namespace opens a /proc/PID/ns/*
>> file descriptor and passes that FD to another process via a UNIX
>> domain socket. Is that correct?
>>
>> So, supposing that we want to build a map of the relationships
>> between namespaces using the proposed kcmp() API, and there are
>> say N namespaces? Does this mean we make (N * (N-1) / 2) calls
>> to kcmp()?
>
> Potentially.  The numbers are small enough O(N^2) isn't fatal.

Define "small", please.

O(N^2) makes me nervous about what other use cases lurk out
there that may get bitten by this.

> Where kcmp shines is that it allows migration to happen and inode numbers
> to change (which they very much will today), and still have things work.


> We can keep it O(Nlog(N)) by taking advantage of not just the equality
> but the ordering relationship.  Although Ugh.

Yes, that sounds pretty ugly...

> One disadvantage of
> kcmp currently is that the way the ordering relationship is defined
> the order is not preserved over migration :(

So, does kcmp() fully solve the problem(s) at hand? It sounds like
not, if I understand your last point correctly.


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-28 10:45                                               ` Michael Kerrisk (man-pages)
@ 2016-07-28 12:56                                                   ` Eric W. Biederman
  -1 siblings, 0 replies; 85+ messages in thread
From: Eric W. Biederman @ 2016-07-28 12:56 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Serge Hallyn, Andrew Vagin, Linux API, Linux Containers, LKML,
	criu, Alexander Viro, linux-fsdevel,
	James Bottomley, Andrey Vagin

"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:

> On 07/26/2016 10:39 PM, Andrew Vagin wrote:
>> On Tue, Jul 26, 2016 at 09:17:31PM +0200, Michael Kerrisk (man-pages) wrote:

>> If we want to compare two file descriptors of the current process,
>> it is one of cases for which kcmp can be used. We can call kcmp to
>> compare two namespaces which are opened in other processes.
>
> Is there really a use case there? I assume we're talking about the
> scenario where a process in one namespace opens a /proc/PID/ns/*
> file descriptor and passes that FD to another process via a UNIX
> domain socket. Is that correct?
>
> So, supposing that we want to build a map of the relationships
> between namespaces using the proposed kcmp() API, and there are
> say N namespaces? Does this mean we make (N * (N-1) / 2) calls
> to kcmp()?

Potentially.  The numbers are small enough O(N^2) isn't fatal.

Where kcmp shines is that it allows migration to happen and inode numbers
to change (which they very much will today), and still have things work.

We can keep it O(Nlog(N)) by taking advantage of not just the equality
but the ordering relationship.  Although Ugh.  One disadvantage of
kcmp currently is that the way the ordering relationship is defined
the order is not preserved over migration :(

Eric
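
The ordering argument above can be sketched in C. This is only an
illustration under stated assumptions: ns_cmp() is a hypothetical stand-in
that compares made-up integer identities, where a real tool would issue an
ordered kcmp()-style comparison on two namespace file descriptors. Sorting
takes on the order of N log N comparisons with typical qsort()
implementations, and one linear pass over the sorted array then counts the
distinct namespaces.

```c
#include <stdlib.h>

/* Number of comparator calls made so far; kept only for illustration. */
static unsigned long ncmp;

/*
 * Hypothetical stand-in for an ordered kcmp()-style comparison.
 * The ints are made-up namespace identities, not file descriptors;
 * a real comparator would make the system call here instead.
 */
static int ns_cmp(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	ncmp++;
	return (x > y) - (x < y);
}

/*
 * Sort the handles, then count runs of equal neighbours.
 * Returns the number of distinct namespaces among the n handles.
 */
static size_t count_distinct(int *ns, size_t n)
{
	size_t i, groups = n ? 1 : 0;

	qsort(ns, n, sizeof(*ns), ns_cmp);
	for (i = 1; i < n; i++)
		if (ns_cmp(&ns[i - 1], &ns[i]) != 0)
			groups++;
	return groups;
}
```

Because the grouping depends only on the comparator being a consistent
total order during one run, it does not matter that the order may differ
after migration, as long as a map is not carried across a migration.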

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
       [not found]                                             ` <20160726203955.GA9415-1ViLX0X+lBJGNQ1M2rI3KwRV3xvJKrda@public.gmane.org>
@ 2016-07-28 10:45                                               ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 85+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-07-28 10:45 UTC (permalink / raw)
  To: Andrew Vagin
  Cc: Serge Hallyn, Andrey Vagin, Linux API, Linux Containers, LKML,
	Alexander Viro, criu-GEFAQzZX7r8dnm+yROfE0A,
	mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, linux-fsdevel,
	James Bottomley, Eric W. Biederman

On 07/26/2016 10:39 PM, Andrew Vagin wrote:
> On Tue, Jul 26, 2016 at 09:17:31PM +0200, Michael Kerrisk (man-pages) wrote:
>> Hello Andrew,
>>
>> On 26 July 2016 at 20:25, Andrew Vagin <avagin-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org> wrote:
>>> On Tue, Jul 26, 2016 at 10:03:25AM +0200, Michael Kerrisk (man-pages) wrote:
>>>> On 07/26/2016 04:54 AM, Andrew Vagin wrote:
>>>>> On Mon, Jul 25, 2016 at 09:59:43AM -0500, Eric W. Biederman wrote:
>>>>>> "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>>>>>
>>>>> [snip]
>>>>>
>>>>>> [snip]
>>>>>>>>> So, from my point of view, the important piece that was missing from
>>>>>>>>> your commit message was the note to use readlink("/proc/self/fd/%d")
>>>>>>>>> on the returned FDs. I think that detail needs to be part of the
>>>>>>>>> commit message (and also the man page text). I think it would even be
>>>>>>>>> helpful to include the above program as part of the commit message:
>>>>>>>>> it helps people more quickly grasp the API.
>>>>>>>>
>>>>>>>> Please, please make the standard way to compare these things fstat.
>>>>>>>> That is much less magic than a symlink, and a little more future proof.
>>>>>>>> Possibly even kcmp.
>>>>>
>>>>> I like the idea to use kcmp to compare namespaces. I am going to add this
>>>>> functionality to kcmp and describe all these in the man page.
>>>>
>>>> Hi Andrey,
>>>>
>>>> Can you briefly sketch out the proposed API and how it would be used?
>>>> I'd find it useful to see that even before the implementation.
>>>
>>> Sure. If a process wants to compare two namespaces, it needs to get file
>>> descriptors for them (open /proc/PID/ns/XXX, use new ioctl-s, find a
>>> process which has them),
>>> and then it calls kcmp(pid1, pid2, KCMP_NSFD, ns_fd1, ns_fd2)
>>>
>>> For example, if we want to compare pid namespaces for processes 1 and 2:
>>>
>>
>> What's the purpose of the following line, and the use of 'pid' in the
>> kcmp() call?:
>
> It's the existing interface of kcmp.  It's used to check whether the
> two processes identified  by pid1  and  pid2 share a kernel resource
> such as virtual memory, file descriptors, and so on.


Yes, understood, but it seems a slightly weird use of the interface,
since in general pid1 will be the same as pid2 in this use case,
whereas in the other use cases, pid1 and pid2 are generally not
equal.

> If we want to compare two file descriptors of the current process,
> it is one of the cases for which kcmp can be used. We can call kcmp to
> compare two namespaces which are opened in other processes.

Is there really a use case there? I assume we're talking about the
scenario where a process in one namespace opens a /proc/PID/ns/*
file descriptor and passes that FD to another process via a UNIX
domain socket. Is that correct?

So, supposing that we want to build a map of the relationships
between namespaces using the proposed kcmp() API, and there are
say N namespaces? Does this mean we make (N * (N-1) / 2) calls
to kcmp()?

Cheers,

Michael

>>> pid = getpid();
>>> ns_fd1 = open("/proc/1/ns/pid")
>>> ns_fd2 = open("/proc/2/ns/pid")
>>>
>>> if (!kcmp(pid, pid, KCMP_NSFD, ns_fd1, ns_fd2))
>>>         printf("Both processes live in the same pid namespace\n");
>>
>> Thanks,
>>
>> Michael
>


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-26 19:17                                       ` Michael Kerrisk (man-pages)
@ 2016-07-26 20:39                                             ` Andrew Vagin
  0 siblings, 0 replies; 85+ messages in thread
From: Andrew Vagin @ 2016-07-26 20:39 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Serge Hallyn, Andrey Vagin, Linux API, Linux Containers, LKML,
	criu-GEFAQzZX7r8dnm+yROfE0A, Eric W. Biederman, linux-fsdevel,
	James Bottomley, Alexander Viro

On Tue, Jul 26, 2016 at 09:17:31PM +0200, Michael Kerrisk (man-pages) wrote:
> Hello Andrew,
> 
> On 26 July 2016 at 20:25, Andrew Vagin <avagin-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org> wrote:
> > On Tue, Jul 26, 2016 at 10:03:25AM +0200, Michael Kerrisk (man-pages) wrote:
> >> On 07/26/2016 04:54 AM, Andrew Vagin wrote:
> >> > On Mon, Jul 25, 2016 at 09:59:43AM -0500, Eric W. Biederman wrote:
> >> > > "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
> >> >
> >> > [snip]
> >> >
> >> > > [snip]
> >> > > > > > So, from my point of view, the important piece that was missing from
> >> > > > > > your commit message was the note to use readlink("/proc/self/fd/%d")
> >> > > > > > on the returned FDs. I think that detail needs to be part of the
> >> > > > > > commit message (and also the man page text). I think it would even be
> >> > > > > > helpful to include the above program as part of the commit message:
> >> > > > > > it helps people more quickly grasp the API.
> >> > > > >
> >> > > > > Please, please make the standard way to compare these things fstat.
> >> > > > > That is much less magic than a symlink, and a little more future proof.
> >> > > > > Possibly even kcmp.
> >> >
> >> > I like the idea to use kcmp to compare namespaces. I am going to add this
> >> > functionality to kcmp and describe all these in the man page.
> >>
> >> Hi Andrey,
> >>
> >> Can you briefly sketch out the proposed API and how it would be used?
> >> I'd find it useful to see that even before the implementation.
> >
> > Sure. If a process wants to compare two namespaces, it needs to get file
> > descriptors for them (open /proc/PID/ns/XXX, use new ioctl-s, find a
> > process which has them),
> > and then it calls kcmp(pid1, pid2, KCMP_NSFD, ns_fd1, ns_fd2)
> >
> > For example, if we want to compare pid namespaces for processes 1 and 2:
> >
> 
> What's the purpose of the following line, and the use of 'pid' in the
> kcmp() call?:

It's the existing interface of kcmp. It's used to check whether the
two processes identified by pid1 and pid2 share a kernel resource
such as virtual memory, file descriptors, and so on.

If we want to compare two file descriptors of the current process,
that is one of the cases for which kcmp can be used. We can also call
kcmp to compare two namespaces which are opened in other processes.

Thanks,
Andrew

> 
> > pid = getpid();
> > ns_fd1 = open("/proc/1/ns/pid")
> > ns_fd2 = open("/proc/2/ns/pid")
> >
> > if (!kcmp(pid, pid, KCMP_NSFD, ns_fd1, ns_fd2))
> >         printf("Both processes live in the same pid namespace\n");
> 
> Thanks,
> 
> Michael
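
The existing kcmp() interface described above can be tried with the
KCMP_FILE type, which is documented in kcmp(2). The sketch below does not
use KCMP_NSFD, which is only a proposal in this thread. It assumes a kernel
built with kcmp support and installed kernel headers; same_open_file() is a
hypothetical helper name, and both descriptors are looked up in the calling
process, so pid1 and pid2 are both getpid().

```c
#define _GNU_SOURCE
#include <linux/kcmp.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Compare two descriptors of the calling process with KCMP_FILE.
 * glibc provides no kcmp() wrapper, hence the raw syscall(2).
 * Returns 0 if both refer to the same open file description, a
 * positive value if they differ, and -1 with errno set on error
 * (for example ENOSYS when the kernel lacks kcmp support).
 */
static int same_open_file(int fd1, int fd2)
{
	pid_t pid = getpid();

	return (int)syscall(SYS_kcmp, pid, pid, KCMP_FILE,
			    (unsigned long)fd1, (unsigned long)fd2);
}
```

A descriptor and its dup() compare equal here, while the two ends of a
pipe do not. To compare descriptors held by two different processes, pid1
and pid2 would name those processes instead, subject to the ptrace access
checks described in kcmp(2).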

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-26  2:54                                 ` Andrew Vagin
@ 2016-07-26 19:38                                     ` Eric W. Biederman
  -1 siblings, 0 replies; 85+ messages in thread
From: Eric W. Biederman @ 2016-07-26 19:38 UTC (permalink / raw)
  To: Andrew Vagin
  Cc: Serge Hallyn, Andrey Vagin, Linux API, Linux Containers, LKML,
	criu-GEFAQzZX7r8dnm+yROfE0A, Michael Kerrisk (man-pages),
	linux-fsdevel, James Bottomley, Alexander Viro

Andrew Vagin <avagin-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org> writes:

> On Mon, Jul 25, 2016 at 09:59:43AM -0500, Eric W. Biederman wrote:
>> "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>
> [snip]
>
>> [snip]
>> >>> So, from my point of view, the important piece that was missing from
>> >>> your commit message was the note to use readlink("/proc/self/fd/%d")
>> >>> on the returned FDs. I think that detail needs to be part of the
>> >>> commit message (and also the man page text). I think it would even be
>> >>> helpful to include the above program as part of the commit message:
>> >>> it helps people more quickly grasp the API.
>> >>
>> >> Please, please make the standard way to compare these things fstat.
>> >> That is much less magic than a symlink, and a little more future proof.
>> >> Possibly even kcmp.
>
> I like the idea to use kcmp to compare namespaces. I am going to add this
> functionality to kcmp and describe all these in the man page.
>
>> >
>> > As in fstat() to get the st_ino field, right?
>> 
>> Both the st_ino and st_dev fields.
>> 
>> The most likely change to support checkpoint/restart in the future is to
>> preserve st_ino across migrations and instantiate a different instance
>> of nsfs to hold the inode numbers from the previous machine.
>
> It sounds tricky. BTW: Actually this is not the only place where we have
> this sort of problem. For example, mount IDs are currently not preserved
> when a container is migrated. The same problem applies to tmpfs, where
> inode numbers are not preserved for files.

Agreed.

Interesting. Interesting. Interesting.

I am not completely convinced that improving kcmp solves it for
everything but improving kcmp sounds good enough to be very interesting
and enough to solve a practical case (migration in migration).  Plus
improving kcmp is cheap and easy.

I would propose:

KCMP_OBJECT
    Check whether a file descriptor idx1 in the process pid1 refers to
    the same underlying object as file descriptor idx2 in the process
    pid2.

The default case would be checking to see if two file descriptors refer
to the same inode.  But for weird cases (like proc pid directories, or
sysfs files) the comparison could look deeper.

Eric

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
       [not found]                                       ` <20160726182524.GA328-1ViLX0X+lBJGNQ1M2rI3KwRV3xvJKrda@public.gmane.org>
  2016-07-26 18:32                                         ` W. Trevor King
@ 2016-07-26 19:17                                         ` Michael Kerrisk (man-pages)
  1 sibling, 0 replies; 85+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-07-26 19:17 UTC (permalink / raw)
  To: Andrew Vagin
  Cc: Serge Hallyn, Andrey Vagin, Linux API, Linux Containers, LKML,
	criu-GEFAQzZX7r8dnm+yROfE0A, Eric W. Biederman, linux-fsdevel,
	James Bottomley, Alexander Viro

Hello Andrew,

On 26 July 2016 at 20:25, Andrew Vagin <avagin-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org> wrote:
> On Tue, Jul 26, 2016 at 10:03:25AM +0200, Michael Kerrisk (man-pages) wrote:
>> On 07/26/2016 04:54 AM, Andrew Vagin wrote:
>> > On Mon, Jul 25, 2016 at 09:59:43AM -0500, Eric W. Biederman wrote:
>> > > "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>> >
>> > [snip]
>> >
>> > > [snip]
>> > > > > > So, from my point of view, the important piece that was missing from
>> > > > > > your commit message was the note to use readlink("/proc/self/fd/%d")
>> > > > > > on the returned FDs. I think that detail needs to be part of the
>> > > > > > commit message (and also the man page text). I think it would even be
>> > > > > > helpful to include the above program as part of the commit message:
>> > > > > > it helps people more quickly grasp the API.
>> > > > >
>> > > > > Please, please make the standard way to compare these things fstat.
>> > > > > That is much less magic than a symlink, and a little more future proof.
>> > > > > Possibly even kcmp.
>> >
>> > I like the idea to use kcmp to compare namespaces. I am going to add this
>> > functionality to kcmp and describe all these in the man page.
>>
>> Hi Andrey,
>>
>> Can you briefly sketch out the proposed API and how it would be used?
>> I'd find it useful to see that even before the implementation.
>
> Sure. If a process wants to compare two namespaces, it needs to get file
> descriptors for them (open /proc/PID/ns/XXX, use new ioctl-s, find a
> process which has them),
> and then it calls kcmp(pid1, pid2, KCMP_NSFD, ns_fd1, ns_fd2)
>
> For example, if we want to compare pid namespaces for processes 1 and 2:
>

What's the purpose of the following line, and the use of 'pid' in the
kcmp() call?:

> pid = getpid();
> ns_fd1 = open("/proc/1/ns/pid")
> ns_fd2 = open("/proc/2/ns/pid")
>
> if (!kcmp(pid, pid, KCMP_NSFD, ns_fd1, ns_fd2))
>         printf("Both processes live in the same pid namespace\n");


Thanks,

Michael

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-26 18:32                                         ` W. Trevor King
@ 2016-07-26 19:11                                             ` Andrew Vagin
  -1 siblings, 0 replies; 85+ messages in thread
From: Andrew Vagin @ 2016-07-26 19:11 UTC (permalink / raw)
  To: W. Trevor King
  Cc: Serge Hallyn, Andrey Vagin, Linux API, Linux Containers, LKML,
	Alexander Viro, criu-GEFAQzZX7r8dnm+yROfE0A,
	Michael Kerrisk (man-pages),
	linux-fsdevel, James Bottomley, Eric W. Biederman

On Tue, Jul 26, 2016 at 11:32:25AM -0700, W. Trevor King wrote:
> On Tue, Jul 26, 2016 at 11:25:24AM -0700, Andrew Vagin wrote:
> > Sure. If a process wants to compare two namespaces, it needs to get file
> > descriptors for them (open /proc/PID/ns/XXX, use new ioctl-s, find a
> > process which has them),
> > and then it calls kcmp(pid1, pid2, KCMP_NSFD, ns_fd1, ns_fd2)
> 
> If you use the new ioctl-s to get ns_fd2, do you walk your local /proc
> to find pid2?

If you use the new ioctl-s to get ns_fd2, you will have it in the
current process, so pid2 will be getpid().

pidX identifies the process in which fdX is looked up.

man 2 kcmp:
 The kcmp() system call can be used to check whether the  two processes
 identified  by  pid1  and  pid2 share a kernel resource such as virtual
 memory, file descriptors, and so on.
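
The pid/fd pairing described above can be tried with a comparison type
that kcmp() already has. This is a minimal sketch, assuming KCMP_FILE as
a stand-in for KCMP_NSFD (which is only a proposal in this thread);
sys_kcmp() and same_open_file() are hypothetical helper names. Both
descriptors live in the calling process, so both pid arguments are
getpid():

```c
#define _GNU_SOURCE
#include <linux/kcmp.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Thin wrapper: kcmp() is invoked through syscall(2) in this sketch. */
static int sys_kcmp(pid_t pid1, pid_t pid2, int type,
		    unsigned long idx1, unsigned long idx2)
{
	return (int)syscall(SYS_kcmp, pid1, pid2, type, idx1, idx2);
}

/*
 * Compare two descriptors that both belong to the calling process.
 * Returns 0 if they refer to the same open file description, a positive
 * value if they differ, and -1 with errno set on error.
 */
static int same_open_file(int fd1, int fd2)
{
	pid_t self = getpid();

	/* pid1 says where to look up fd1, pid2 where to look up fd2. */
	return sys_kcmp(self, self, KCMP_FILE, fd1, fd2);
}
```

A descriptor and its dup() share one open file description, so
same_open_file(fd, dup(fd)) should return 0, while the two ends of a
pipe compare as different. The call returns -1 if the kernel was built
without kcmp support or the call is filtered.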

> 
> Cheers,
> Trevor
> 
> -- 
> This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
> For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
       [not found]                                       ` <20160726182524.GA328-1ViLX0X+lBJGNQ1M2rI3KwRV3xvJKrda@public.gmane.org>
@ 2016-07-26 18:32                                         ` W. Trevor King
  2016-07-26 19:17                                         ` Michael Kerrisk (man-pages)
  1 sibling, 0 replies; 85+ messages in thread
From: W. Trevor King @ 2016-07-26 18:32 UTC (permalink / raw)
  To: Andrew Vagin
  Cc: Serge Hallyn, Andrey Vagin, Linux API, Linux Containers, LKML,
	Alexander Viro, criu-GEFAQzZX7r8dnm+yROfE0A,
	Michael Kerrisk (man-pages),
	linux-fsdevel, James Bottomley, Eric W. Biederman


[-- Attachment #1.1: Type: text/plain, Size: 569 bytes --]

On Tue, Jul 26, 2016 at 11:25:24AM -0700, Andrew Vagin wrote:
> Sure. If a process wants to compare two namespaces, it needs to get file
> descriptors for them (open /proc/PID/ns/XXX, use new ioctl-s, find a
> process which has them),
> and then it calls kcmp(pid1, pid2, KCMP_NSFD, ns_fd1, ns_fd2)

If you use the new ioctl-s to get ns_fd2, do you walk your local /proc
to find pid2?

Cheers,
Trevor

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-26  8:03                                   ` Michael Kerrisk (man-pages)
@ 2016-07-26 18:25                                       ` Andrew Vagin
  -1 siblings, 0 replies; 85+ messages in thread
From: Andrew Vagin @ 2016-07-26 18:25 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Serge Hallyn, Andrey Vagin, Linux API, Linux Containers, LKML,
	criu-GEFAQzZX7r8dnm+yROfE0A, Eric W. Biederman, linux-fsdevel,
	James Bottomley, Alexander Viro

On Tue, Jul 26, 2016 at 10:03:25AM +0200, Michael Kerrisk (man-pages) wrote:
> On 07/26/2016 04:54 AM, Andrew Vagin wrote:
> > On Mon, Jul 25, 2016 at 09:59:43AM -0500, Eric W. Biederman wrote:
> > > "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
> > 
> > [snip]
> > 
> > > [snip]
> > > > > > So, from my point of view, the important piece that was missing from
> > > > > > your commit message was the note to use readlink("/proc/self/fd/%d")
> > > > > > on the returned FDs. I think that detail needs to be part of the
> > > > > > > commit message (and also the man page text). I think it would even be
> > > > > > helpful to include the above program as part of the commit message:
> > > > > > it helps people more quickly grasp the API.
> > > > > 
> > > > > Please, please make the standard way to compare these things fstat.
> > > > > That is much less magic than a symlink, and a little more future proof.
> > > > > Possibly even kcmp.
> > 
> > I like the idea to use kcmp to compare namespaces. I am going to add this
> > functionality to kcmp and describe all these in the man page.
> 
> Hi Andrey,
> 
> Can you briefly sketch out the proposed API and how it would be used?
> I'd find it useful to see that even before the implementation.

Sure. If a process wants to compare two namespaces, it needs to get file
descriptors for them (open /proc/PID/ns/XXX, use new ioctl-s, find a
process which has them),
and then it calls kcmp(pid1, pid2, KCMP_NSFD, ns_fd1, ns_fd2)

For example, if we want to compare the pid namespaces of processes 1 and 2:

pid = getpid();
ns_fd1 = open("/proc/1/ns/pid", O_RDONLY);
ns_fd2 = open("/proc/2/ns/pid", O_RDONLY);

/* KCMP_NSFD is the proposed comparison type; kcmp() returns 0 on a match. */
if (!kcmp(pid, pid, KCMP_NSFD, ns_fd1, ns_fd2))
	printf("Both processes live in the same pid namespace\n");

Thanks,
Andrew
> 
> Cheers,
> 
> Michael
> 
> 
> -- 
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
       [not found]                                 ` <20160726025455.GC26206-1ViLX0X+lBJGNQ1M2rI3KwRV3xvJKrda@public.gmane.org>
@ 2016-07-26  8:03                                   ` Michael Kerrisk (man-pages)
  2016-07-26 19:38                                     ` Eric W. Biederman
  1 sibling, 0 replies; 85+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-07-26  8:03 UTC (permalink / raw)
  To: Andrew Vagin, Eric W. Biederman
  Cc: Serge Hallyn, Andrey Vagin, Linux API, Linux Containers, LKML,
	criu-GEFAQzZX7r8dnm+yROfE0A, mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w,
	linux-fsdevel, James Bottomley, Alexander Viro

On 07/26/2016 04:54 AM, Andrew Vagin wrote:
> On Mon, Jul 25, 2016 at 09:59:43AM -0500, Eric W. Biederman wrote:
>> "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>
> [snip]
>
>> [snip]
>>>>> So, from my point of view, the important piece that was missing from
>>>>> your commit message was the note to use readlink("/proc/self/fd/%d")
>>>>> on the returned FDs. I think that detail needs to be part of the
>>>>> commit message (and also the man page text). I think it would even be
>>>>> helpful to include the above program as part of the commit message:
>>>>> it helps people more quickly grasp the API.
>>>>
>>>> Please, please make the standard way to compare these things fstat.
>>>> That is much less magic than a symlink, and a little more future proof.
>>>> Possibly even kcmp.
>
> I like the idea to use kcmp to compare namespaces. I am going to add this
> functionality to kcmp and describe all these in the man page.

Hi Andrey,

Can you briefly sketch out the proposed API and how it would be used?
I'd find it useful to see that even before the implementation.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-25 14:59                             ` Eric W. Biederman
@ 2016-07-26  2:54                                 ` Andrew Vagin
  -1 siblings, 0 replies; 85+ messages in thread
From: Andrew Vagin @ 2016-07-26  2:54 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Serge Hallyn, Andrey Vagin, Linux API, Linux Containers, LKML,
	criu-GEFAQzZX7r8dnm+yROfE0A, Michael Kerrisk (man-pages),
	linux-fsdevel, James Bottomley, Alexander Viro

On Mon, Jul 25, 2016 at 09:59:43AM -0500, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

[snip]

> [snip]
> >>> So, from my point of view, the important piece that was missing from
> >>> your commit message was the note to use readlink("/proc/self/fd/%d")
> >>> on the returned FDs. I think that detail needs to be part of the
> >>> commit message (and also the man page text). I think it would even be
> >>> helpful to include the above program as part of the commit message:
> >>> it helps people more quickly grasp the API.
> >>
> >> Please, please make the standard way to compare these things fstat.
> >> That is much less magic than a symlink, and a little more future proof.
> >> Possibly even kcmp.

I like the idea of using kcmp to compare namespaces. I am going to add this
functionality to kcmp and describe all of this in the man page.

> >
> > As in fstat() to get the st_ino field, right?
> 
> Both the st_ino and st_dev fields.
> 
> The most likely change to support checkpoint/restart in the future is to
> preserve st_ino across migrations and instantiate a different instance
> of nsfs to hold the inode numbers from the previous machine.

It sounds tricky. BTW, this is not the only place where we have
this sort of problem. For example, mount IDs are currently not preserved
when a container is migrated. The same problem applies to tmpfs, where
inode numbers are not preserved for files.

> 
> We would need to handle the preservation carefully or else there is
> a chance that two namespace file descriptors (collected from different
> sources) with different st_dev and st_ino fields may actually refer to
> the same object.
> 
> Which is a long way of saying we have the st_dev field please use it,
> it may matter at some point.
> 
> Eric
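
The fstat()-based comparison discussed above can be sketched as follows.
This is only an illustration under stated assumptions: same_namespace()
and same_namespace_of() are hypothetical helper names, and the check
compares both st_dev and st_ino, as requested above.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 1 if both descriptors refer to the same object, 0 if they
 * differ, and -1 if fstat() fails on either of them.
 */
static int same_namespace(int ns_fd1, int ns_fd2)
{
	struct stat st1, st2;

	if (fstat(ns_fd1, &st1) < 0 || fstat(ns_fd2, &st2) < 0)
		return -1;

	/* Compare the device as well as the inode number. */
	return st1.st_dev == st2.st_dev && st1.st_ino == st2.st_ino;
}

/*
 * Compare one namespace type (for example "pid") of two processes.
 * Returns 1, 0 or -1 with the same meaning as same_namespace().
 */
static int same_namespace_of(pid_t pid1, pid_t pid2, const char *name)
{
	char path1[64], path2[64];
	int fd1, fd2, n, ret;

	n = snprintf(path1, sizeof(path1), "/proc/%d/ns/%s", (int)pid1, name);
	if (n < 0 || (size_t)n >= sizeof(path1))
		return -1;
	n = snprintf(path2, sizeof(path2), "/proc/%d/ns/%s", (int)pid2, name);
	if (n < 0 || (size_t)n >= sizeof(path2))
		return -1;

	fd1 = open(path1, O_RDONLY);
	if (fd1 < 0)
		return -1;
	fd2 = open(path2, O_RDONLY);
	if (fd2 < 0) {
		close(fd1);
		return -1;
	}

	ret = same_namespace(fd1, fd2);
	close(fd1);
	close(fd2);
	return ret;
}
```

For example, same_namespace_of(1, 2, "pid") answers the question from
the kcmp sketch earlier in the thread, provided the caller is allowed to
open both /proc/PID/ns/pid files.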

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-24  5:10     ` Eric W. Biederman
@ 2016-07-26  2:07         ` Andrew Vagin
  -1 siblings, 0 replies; 85+ messages in thread
From: Andrew Vagin @ 2016-07-26  2:07 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Serge Hallyn, Andrey Vagin, criu-GEFAQzZX7r8dnm+yROfE0A,
	Linux API, Linux Containers, LKML, James Bottomley,
	Alexander Viro, linux-fsdevel, Michael Kerrisk (man-pages)

On Sun, Jul 24, 2016 at 12:10:21AM -0500, Eric W. Biederman wrote:
> Andrey Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org> writes:
> 
> > Hello,
> >
> > I forgot to add --cc-cover for git send-email, so everyone who is in
> > Cc got only the cover letter. All messages were sent to the mailing lists.
> >
> > Sorry for the inconvenience.
> 
> Mostly the code looked sensible.  But I had a couple of issues.
> Resend this in September (when the merge window is closed and I am back
> from vacation) and I will give this a thorough review and get this
> merged.  Or possibly next week if Linus releases another -rc

Eric, thank you for the detailed comments. I will rework this series and
send it after the merge window.

> 
> > On Thu, Jul 14, 2016 at 11:20 AM, Andrey Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org> wrote:
> >> Each namespace has an owning user namespace, and currently there is no
> >> way to discover these relationships.
> >>
> >> Pid and user namespaces are hierarchical. There is no way to discover
> >> their parent-child relationships either.
> >>
> >> Why might we want to know the relationships between namespaces?
> >>
> >> One use would be visualization, in order to understand the running system.
> >> Another would be to answer the question: what capability does process X have to
> >> perform operations on a resource governed by namespace Y?
> >>
> >> One more use-case (which is usually called abnormal) is checkpoint/restart.
> >> In CRIU we are going to dump and restore nested namespaces.
> >>
> >> There was a discussion [1] about which interface to choose to determine
> >> relationships between namespaces.
> >>
> >> Eric suggested adding two ioctl-s [2]:
> >>> Grumble, Grumble.  I think this may actually be a case for creating ioctls
> >>> for these two cases.  Now that random nsfs file descriptors are bind
> >>> mountable the original reason for using proc files is not as pressing.
> >>>
> >>> One ioctl for the user namespace that owns a file descriptor.
> >>> One ioctl for the parent namespace of a namespace file descriptor.
> >>
> >> Here is an implementation of these ioctl-s.
> >>
> >> [1] https://lkml.org/lkml/2016/7/6/158
> >> [2] https://lkml.org/lkml/2016/7/9/101
> >>
> >> Cc: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
> >> Cc: James Bottomley <James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
> >> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> >> Cc: "W. Trevor King" <wking-vJI2gpByivqcqzYg7KEe8g@public.gmane.org>
> >> Cc: Alexander Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
> >> Cc: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
> 
> 
> Eric

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
       [not found]                           ` <20160725145445.GA19879-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
@ 2016-07-25 15:17                             ` Eric W. Biederman
  0 siblings, 0 replies; 85+ messages in thread
From: Eric W. Biederman @ 2016-07-25 15:17 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Serge Hallyn, Andrew Vagin, Linux API, Linux Containers, LKML,
	criu-GEFAQzZX7r8dnm+yROfE0A, Michael Kerrisk (man-pages),
	Andrey Vagin, linux-fsdevel, James Bottomley, Alexander Viro

"Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> writes:

> Quoting Michael Kerrisk (man-pages) (mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org):
>> Hi Eric,
>> 
>> On 07/25/2016 03:18 PM, Eric W. Biederman wrote:
>> >"Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>> >
>> >>Hi Andrey,
>> >>
>> >>On 07/22/2016 08:25 PM, Andrey Vagin wrote:
>> >>Perhaps add "and the caller does not have CAP_SYS_ADMIN in the initial
>> >>user namespace"?
>> >
>> >Having looked at that bit of code I don't think capabilities really
>> >have a role to play.
>> 
>> Yes, I caught up with that now. I await to see how this plays out
>> in the next patch version.
>
> Thanks - that had caught my eye but I hadn't had time to look into the
> justification for this.  Hiding this kind of thing indeed seems wrong to
> me, unless there is a really good justification for it, i.e. a way
> to use that info in an exploit.

To avoid breaking checkpoint/restart we need to limit information to the
namespaces the caller is a member of for the user and pid namespaces.

This roughly duplicates the parentage checks in ns_capable.

Conceptually this is the same as limiting .. in a chroot environment.

Eric

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
@ 2016-07-25 15:17                             ` Eric W. Biederman
  0 siblings, 0 replies; 85+ messages in thread
From: Eric W. Biederman @ 2016-07-25 15:17 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Michael Kerrisk (man-pages),
	Serge Hallyn, Andrew Vagin, Linux API, Linux Containers, LKML,
	Alexander Viro, criu@openvz.org, linux-fsdevel, James Bottomley,
	Andrey Vagin

"Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> writes:

> Quoting Michael Kerrisk (man-pages) (mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org):
>> Hi Eric,
>> 
>> On 07/25/2016 03:18 PM, Eric W. Biederman wrote:
>> >"Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>> >
>> >>Hi Andrey,
>> >>
>> >>On 07/22/2016 08:25 PM, Andrey Vagin wrote:
>> >>Perhaps add "and the caller does not have CAP_SYS_ADMIN" in the initial
>> >>user namespace"?
>> >
>> >Having looked at that bit of code I don't think capabilities really
>> >have a role to play.
>> 
>> Yes, I caught up with that now. I await to see how this plays out
>> in the next patch version.
>
> Thanks - that had caught my eye but I hadn't had time to look into the
> justification for this.  Hiding this kind of thing indeed seems wrong to
> me, unless there is a really good justification for it, i.e. a way
> to use that info in an exploit.

To avoid breaking checkpoint/restart we need to limit information to the
namespaces the caller is a member of for the user and pid namespaces.

This roughly duplicates the parentage checks in ns_capable.

Conceptually this is the same as limiting .. in a chroot environment.

Eric

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-25 14:46                         ` Michael Kerrisk (man-pages)
@ 2016-07-25 14:59                             ` Eric W. Biederman
  -1 siblings, 0 replies; 85+ messages in thread
From: Eric W. Biederman @ 2016-07-25 14:59 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Serge Hallyn, Andrey Vagin, Linux API, Linux Containers, LKML,
	criu-GEFAQzZX7r8dnm+yROfE0A, Alexander Viro, linux-fsdevel,
	James Bottomley, Andrew Vagin

"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:

> Hi Eric,
>
> On 07/25/2016 03:18 PM, Eric W. Biederman wrote:
>> "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:
>>
>>> Hi Andrey,
>>>
>>> On 07/22/2016 08:25 PM, Andrey Vagin wrote:
>>>> On Thu, Jul 21, 2016 at 11:48 PM, Michael Kerrisk (man-pages)
>>>> <mtk.manpages@gmail.com> wrote:
>>>>> Hi Andrey,
>>>>>
>>>>>
>>>>> On 07/21/2016 11:06 PM, Andrew Vagin wrote:
>>>>>>
[snip]
>>>>>> where ioctl_type is one of the following:
>>>>>>
>>>>>> NS_GET_USERNS
>>>>>>       Returns a file descriptor that refers to an owning user
>>>>>>       namespace.
>>>>>>
>>>>>> NS_GET_PARENT
>>>>>>       Returns a file descriptor that refers to a parent namespace.
>>>>>>       This ioctl(2) can be used for pid and user namespaces. For user
>>>>>>       namespaces, NS_GET_PARENT and NS_GET_USERNS have the same
>>>>>>       meaning.
>>>
>>> For each of the above, I think it is worth mentioning that the
>>> close-on-exec flag is set for the returned file descriptor.
>>
>> Hmm.  That is an odd default.
>
> Why do you say that? It's pretty common as the default for various
> APIs that create new FDs these days. (There's of course a strong argument
> that the original UNIX default was a design blunder...)

Interesting.  I haven't kept up on that, but it seems reasonable.

[snip]
>>> So, from my point of view, the important piece that was missing from
>>> your commit message was the note to use readlink("/proc/self/fd/%d")
>>> on the returned FDs. I think that detail needs to be part of the
>>> commit message (and also the man page text). I think it would even be
>>> helpful to include the above program as part of the commit message:
>>> it helps people more quickly grasp the API.
>>
>> Please, please make the standard way to compare these things fstat.
>> That is much less magic than a symlink, and a little more future proof.
>> Possibly even kcmp.
>
> As in fstat() to get the st_ino field, right?

Both the st_ino and st_dev fields.

The most likely change to support checkpoint/restart in the future is to
preserve st_ino across migrations and instantiate a different instance
of nsfs to hold the inode numbers from the previous machine.

We would need to handle the preservation carefully or else there is
a chance that two namespace file descriptors (collected from different
sources) with different st_dev and st_ino fields may actually refer to
the same object.

Which is a long way of saying: we have the st_dev field, so please use
it; it may matter at some point.

Eric
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
       [not found]                         ` <44ca0e41-dc92-45b1-2a6c-c41a048a072d-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2016-07-25 14:54                           ` Serge E. Hallyn
  2016-07-25 14:59                             ` Eric W. Biederman
  1 sibling, 0 replies; 85+ messages in thread
From: Serge E. Hallyn @ 2016-07-25 14:54 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Serge Hallyn, Andrew Vagin, Linux API, Linux Containers, LKML,
	criu-GEFAQzZX7r8dnm+yROfE0A, Eric W. Biederman, Andrey Vagin,
	linux-fsdevel, James Bottomley, Alexander Viro

Quoting Michael Kerrisk (man-pages) (mtk.manpages@gmail.com):
> Hi Eric,
> 
> On 07/25/2016 03:18 PM, Eric W. Biederman wrote:
> >"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:
> >
> >>Hi Andrey,
> >>
> >>On 07/22/2016 08:25 PM, Andrey Vagin wrote:
> >>>On Thu, Jul 21, 2016 at 11:48 PM, Michael Kerrisk (man-pages)
> >>><mtk.manpages@gmail.com> wrote:
> >>>>Hi Andrey,
> >>>>
> >>>>
> >>>>On 07/21/2016 11:06 PM, Andrew Vagin wrote:
> >>>>>
> >>>>>On Thu, Jul 21, 2016 at 04:41:12PM +0200, Michael Kerrisk (man-pages)
> >>>>>wrote:
> >>>>>>
> >>>>>>Hi Andrey,
> >>>>>>
> >>>>>>On 07/14/2016 08:20 PM, Andrey Vagin wrote:
> >>>>>
> >>>>>
> >>>>><snip>
> >>>>>
> >>>>>>
> >>>>>>Could you add here an explanation of the API in detail: what do these
> >>>>>>FDs refer to, and how do you use them to solve the use case? And could
> >>>>>>you add that info to the commit messages, please?
> >>>>>
> >>>>>
> >>>>>Hi Michael,
> >>>>>
> >>>>>A patch for man-pages is attached. It adds the following text to
> >>>>>namespaces(7).
> >>>>>
> >>>>>Since Linux 4.X, the following ioctl(2) calls are supported for
> >>>>>namespace file descriptors.  The correct syntax is:
> >>>>>
> >>>>>      fd = ioctl(ns_fd, ioctl_type);
> >>>>>
> >>>>>where ioctl_type is one of the following:
> >>>>>
> >>>>>NS_GET_USERNS
> >>>>>      Returns a file descriptor that refers to an owning user
> >>>>>      namespace.
> >>>>>
> >>>>>NS_GET_PARENT
> >>>>>      Returns a file descriptor that refers to a parent namespace.
> >>>>>      This ioctl(2) can be used for pid and user namespaces. For user
> >>>>>      namespaces, NS_GET_PARENT and NS_GET_USERNS have the same
> >>>>>      meaning.
> >>
> >>For each of the above, I think it is worth mentioning that the
> >>close-on-exec flag is set for the returned file descriptor.
> >
> >Hmm.  That is an odd default.
> 
> Why do you say that? It's pretty common as the default for various
> APIs that create new FDs these days. (There's of course a strong argument
> that the original UNIX default was a design blunder...)
> 
> >>>>>
> >>>>>In addition to generic ioctl(2) errors, the following specific ones can
> >>>>>occur:
> >>>>>
> >>>>>EINVAL NS_GET_PARENT was called for a nonhierarchical namespace.
> >>>>>
> >>>>>EPERM  The  requested  namespace  is  outside  of the current namespace
> >>>>>      scope.
> >>
> >>Perhaps add "and the caller does not have CAP_SYS_ADMIN in the initial
> >>user namespace"?
> >
> >Having looked at that bit of code I don't think capabilities really
> >have a role to play.
> 
> Yes, I caught up with that now. I'll wait to see how this plays out
> in the next patch version.

Thanks - that had caught my eye but I hadn't had time to look into the
justification for this.  Hiding this kind of thing indeed seems wrong to
me, unless there is a really good justification for it, i.e. a way
to use that info in an exploit.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-25 13:18                     ` Eric W. Biederman
@ 2016-07-25 14:46                         ` Michael Kerrisk (man-pages)
  -1 siblings, 0 replies; 85+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-07-25 14:46 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Serge Hallyn, Andrew Vagin, Linux API, Linux Containers, LKML,
	Alexander Viro, criu-GEFAQzZX7r8dnm+yROfE0A,
	mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, linux-fsdevel,
	James Bottomley, Andrey Vagin

Hi Eric,

On 07/25/2016 03:18 PM, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:
>
>> Hi Andrey,
>>
>> On 07/22/2016 08:25 PM, Andrey Vagin wrote:
>>> On Thu, Jul 21, 2016 at 11:48 PM, Michael Kerrisk (man-pages)
>>> <mtk.manpages@gmail.com> wrote:
>>>> Hi Andrey,
>>>>
>>>>
>>>> On 07/21/2016 11:06 PM, Andrew Vagin wrote:
>>>>>
>>>>> On Thu, Jul 21, 2016 at 04:41:12PM +0200, Michael Kerrisk (man-pages)
>>>>> wrote:
>>>>>>
>>>>>> Hi Andrey,
>>>>>>
>>>>>> On 07/14/2016 08:20 PM, Andrey Vagin wrote:
>>>>>
>>>>>
>>>>> <snip>
>>>>>
>>>>>>
>>>>>> Could you add here an explanation of the API in detail: what do these
>>>>>> FDs refer to, and how do you use them to solve the use case? And could
>>>>>> you add that info to the commit messages, please?
>>>>>
>>>>>
>>>>> Hi Michael,
>>>>>
>>>>> A patch for man-pages is attached. It adds the following text to
>>>>> namespaces(7).
>>>>>
>>>>> Since Linux 4.X, the following ioctl(2) calls are supported for
>>>>> namespace file descriptors.  The correct syntax is:
>>>>>
>>>>>       fd = ioctl(ns_fd, ioctl_type);
>>>>>
>>>>> where ioctl_type is one of the following:
>>>>>
>>>>> NS_GET_USERNS
>>>>>       Returns a file descriptor that refers to an owning user
>>>>>       namespace.
>>>>>
>>>>> NS_GET_PARENT
>>>>>       Returns a file descriptor that refers to a parent namespace.
>>>>>       This ioctl(2) can be used for pid and user namespaces. For user
>>>>>       namespaces, NS_GET_PARENT and NS_GET_USERNS have the same
>>>>>       meaning.
>>
>> For each of the above, I think it is worth mentioning that the
>> close-on-exec flag is set for the returned file descriptor.
>
> Hmm.  That is an odd default.

Why do you say that? It's pretty common as the default for various
APIs that create new FDs these days. (There's of course a strong argument
that the original UNIX default was a design blunder...)

>>>>>
>>>>> In addition to generic ioctl(2) errors, the following specific ones can
>>>>> occur:
>>>>>
>>>>> EINVAL NS_GET_PARENT was called for a nonhierarchical namespace.
>>>>>
>>>>> EPERM  The  requested  namespace  is  outside  of the current namespace
>>>>>       scope.
>>
>> Perhaps add "and the caller does not have CAP_SYS_ADMIN in the initial
>> user namespace"?
>
> Having looked at that bit of code I don't think capabilities really
> have a role to play.

Yes, I caught up with that now. I'll wait to see how this plays out
in the next patch version.

>>>>> ENOENT ns_fd refers to the init namespace.
>>>>
>>>>
>>>> Thanks for this. But still part of the question remains unanswered.
>>>> How do we (in user-space) use the file descriptors to answer any of
>>>> the questions that this patch series was designed to solve? (This
>>>> info should be in the commit message and the man-pages patch.)
>>>
>>> I'm sorry, but I am not sure that I understand what you are asking.
>>>
>>> Here are the original questions:
>>> Someone else then asked me a question that led me to wonder about
>>> generally introspecting on the parental relationships between user
>>> namespaces and the association of other namespaces types with user
>>> namespaces. One use would be visualization, in order to understand the
>>> running system. Another would be to answer the question I already
>>> mentioned: what capability does process X have to perform operations
>>> on a resource governed by namespace Y?
>>>
>>> Here is an example which shows how we can get the owning namespace
>>> inode number by using these ioctl-s.
>>>
>>> $ ls -l /proc/13929/ns/pid
>>> lrwxrwxrwx 1 root root 0 Jul 22 21:03 /proc/13929/ns/pid -> 'pid:[4026532228]'
>>>
>>> $ ./nsowner /proc/13929/ns/pid
>>> user:[4026532227]
>>>
>>> The owning user namespace for pid:[4026532228] is user:[4026532227].
>>>
>>> The nsowner tool is compiled from this code:
>>>
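>>> #include <fcntl.h>
>>> #include <stdio.h>
>>> #include <sys/ioctl.h>
>>> #include <unistd.h>
>>> #include <linux/nsfs.h>	/* NS_GET_USERNS; header name assumed */
>>>
>>> int main(int argc, char *argv[])
>>> {
>>>         char buf[128], path[] = "/proc/self/fd/0123456789";
>>>         int ns, uns, ret;
>>>
>>>         if (argc < 2)
>>>                 return 1;
>>>
>>>         /* Open the namespace file, e.g. /proc/<pid>/ns/pid. */
>>>         ns = open(argv[1], O_RDONLY);
>>>         if (ns < 0)
>>>                 return 1;
>>>
>>>         /* Get a file descriptor for the owning user namespace. */
>>>         uns = ioctl(ns, NS_GET_USERNS);
>>>         if (uns < 0)
>>>                 return 1;
>>>
>>>         /* The /proc/self/fd symlink reads as "user:[<inode>]". */
>>>         snprintf(path, sizeof(path), "/proc/self/fd/%d", uns);
>>>         ret = readlink(path, buf, sizeof(buf) - 1);
>>>         if (ret < 0)
>>>                 return 1;
>>>         buf[ret] = 0;
>>>
>>>         printf("%s\n", buf);
>>>
>>>         return 0;
>>> }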
>>
>> So, from my point of view, the important piece that was missing from
>> your commit message was the note to use readlink("/proc/self/fd/%d")
>> on the returned FDs. I think that detail needs to be part of the
>> commit message (and also the man page text). I think it would even be
>> helpful to include the above program as part of the commit message:
>> it helps people more quickly grasp the API.
>
> Please, please make the standard way to compare these things fstat.
> That is much less magic than a symlink, and a little more future proof.
> Possibly even kcmp.

As in fstat() to get the st_ino field, right?

Cheers,

Michael

> At some point we will care about migrating a sub-container, and we
> may have to make some minor changes.
>
> Eric
>


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-25 11:47                 ` Michael Kerrisk (man-pages)
@ 2016-07-25 13:18                     ` Eric W. Biederman
  -1 siblings, 0 replies; 85+ messages in thread
From: Eric W. Biederman @ 2016-07-25 13:18 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Serge Hallyn, Andrey Vagin, Linux API, Linux Containers, LKML,
	criu-GEFAQzZX7r8dnm+yROfE0A, Alexander Viro, linux-fsdevel,
	James Bottomley, Andrew Vagin

"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:

> Hi Andrey,
>
> On 07/22/2016 08:25 PM, Andrey Vagin wrote:
>> On Thu, Jul 21, 2016 at 11:48 PM, Michael Kerrisk (man-pages)
>> <mtk.manpages@gmail.com> wrote:
>>> Hi Andrey,
>>>
>>>
>>> On 07/21/2016 11:06 PM, Andrew Vagin wrote:
>>>>
>>>> On Thu, Jul 21, 2016 at 04:41:12PM +0200, Michael Kerrisk (man-pages)
>>>> wrote:
>>>>>
>>>>> Hi Andrey,
>>>>>
>>>>> On 07/14/2016 08:20 PM, Andrey Vagin wrote:
>>>>
>>>>
>>>> <snip>
>>>>
>>>>>
>>>>> Could you add here an explanation of the API in detail: what do these
>>>>> FDs refer to, and how do you use them to solve the use case? And could
>>>>> you add that info to the commit messages, please?
>>>>
>>>>
>>>> Hi Michael,
>>>>
>>>> A patch for man-pages is attached. It adds the following text to
>>>> namespaces(7).
>>>>
>>>> Since Linux 4.X, the following ioctl(2) calls are supported for
>>>> namespace file descriptors.  The correct syntax is:
>>>>
>>>>       fd = ioctl(ns_fd, ioctl_type);
>>>>
>>>> where ioctl_type is one of the following:
>>>>
>>>> NS_GET_USERNS
>>>>       Returns a file descriptor that refers to an owning user
>>>>       namespace.
>>>>
>>>> NS_GET_PARENT
>>>>       Returns a file descriptor that refers to a parent namespace.
>>>>       This ioctl(2) can be used for pid and user namespaces. For user
>>>>       namespaces, NS_GET_PARENT and NS_GET_USERNS have the same
>>>>       meaning.
>
> For each of the above, I think it is worth mentioning that the
> close-on-exec flag is set for the returned file descriptor.

Hmm.  That is an odd default.

>>>>
>>>> In addition to generic ioctl(2) errors, the following specific ones can
>>>> occur:
>>>>
>>>> EINVAL NS_GET_PARENT was called for a nonhierarchical namespace.
>>>>
>>>> EPERM  The requested namespace is outside of the current namespace
>>>>       scope.
>
> Perhaps add "and the caller does not have CAP_SYS_ADMIN in the initial
> user namespace"?

Having looked at that bit of code I don't think capabilities really
have a role to play.

>>>> ENOENT ns_fd refers to the init namespace.
>>>
>>>
>>> Thanks for this. But still part of the question remains unanswered.
>>> How do we (in user-space) use the file descriptors to answer any of
>>> the questions that this patch series was designed to solve? (This
>>> info should be in the commit message and the man-pages patch.)
>>
>> I'm sorry, but I am not sure that I understand what you ask.
>>
>> Here are the original questions:
>> Someone else then asked me a question that led me to wonder about
>> generally introspecting on the parental relationships between user
>> namespaces and the association of other namespaces types with user
>> namespaces. One use would be visualization, in order to understand the
>> running system. Another would be to answer the question I already
>> mentioned: what capability does process X have to perform operations
>> on a resource governed by namespace Y?
>>
>> Here is an example which shows how we can get the owning namespace
>> inode number by using these ioctls.
>>
>> $ ls -l /proc/13929/ns/pid
>> lrwxrwxrwx 1 root root 0 Jul 22 21:03 /proc/13929/ns/pid -> 'pid:[4026532228]'
>>
>> $ ./nsowner /proc/13929/ns/pid
>> user:[4026532227]
>>
>> The owning user namespace for pid:[4026532228] is user:[4026532227].
>>
>> The nsowner tool is compiled from this code:
>>
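>> #include <fcntl.h>
>> #include <stdio.h>
>> #include <unistd.h>
>> #include <sys/ioctl.h>
>> #include <linux/nsfs.h> /* NS_GET_USERNS; assumed header location */
>>
>> int main(int argc, char *argv[])
>> {
>>         char buf[128], path[] = "/proc/self/fd/0123456789";
>>         int ns, uns, ret;
>>
>>         /* Expect the path of a namespace file as the only argument. */
>>         if (argc < 2)
>>                 return 1;
>>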
>>         ns = open(argv[1], O_RDONLY);
>>         if (ns < 0)
>>                 return 1;
>>
>>         uns = ioctl(ns, NS_GET_USERNS);
>>         if (uns < 0)
>>                 return 1;
>>
>>         snprintf(path, sizeof(path), "/proc/self/fd/%d", uns);
>>         ret = readlink(path, buf, sizeof(buf) - 1);
>>         if (ret < 0)
>>                 return 1;
>>         buf[ret] = 0;
>>
>>         printf("%s\n", buf);
>>
>>         return 0;
>> }
>
> So, from my point of view, the important piece that was missing from
> your commit message was the note to use readlink("/proc/self/fd/%d")
> on the returned FDs. I think that detail needs to be part of the
> commit message (and also the man page text). I think it would even be
> helpful to include the above program as part of the commit message:
> it helps people more quickly grasp the API.

Please, please make the standard way to compare these things fstat.
That is much less magic than a symlink, and a little more future proof.
Possibly even kcmp.
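
To make the fstat suggestion concrete, here is a minimal sketch of
comparing two namespace file descriptors by their st_dev and st_ino
fields. The helper names same_namespace() and same_namespace_path() are
made up for illustration and are not part of the patch series.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Returns 1 if both descriptors refer to the same object, 0 if they do
 * not, and -1 if fstat() fails. For nsfs descriptors, matching device
 * and inode numbers are taken to mean "same namespace".
 */
int same_namespace(int fd_a, int fd_b)
{
        struct stat a, b;

        if (fstat(fd_a, &a) < 0 || fstat(fd_b, &b) < 0)
                return -1;

        return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/* Hypothetical wrapper taking paths such as /proc/PID/ns/user. */
int same_namespace_path(const char *path_a, const char *path_b)
{
        int fd_a, fd_b, ret;

        fd_a = open(path_a, O_RDONLY);
        if (fd_a < 0)
                return -1;

        fd_b = open(path_b, O_RDONLY);
        if (fd_b < 0) {
                close(fd_a);
                return -1;
        }

        ret = same_namespace(fd_a, fd_b);
        close(fd_a);
        close(fd_b);
        return ret;
}
```

The same comparison works on a descriptor returned by NS_GET_USERNS or
NS_GET_PARENT, so no parsing of a "type:[inode]" string is needed.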

At some point we will care about migrating a sub-container and we
may have to have some minor changes.

Eric
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
       [not found]               ` <CANaxB-w8H8Wo8FmtmBBZTpJX-ZDGRQx0rbm9E5c9WbduQ_Ukmw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-07-25 11:47                 ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 85+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-07-25 11:47 UTC (permalink / raw)
  To: Andrey Vagin
  Cc: Serge Hallyn, Eric W. Biederman, Andrew Vagin,
	criu-GEFAQzZX7r8dnm+yROfE0A, Linux API, Linux Containers, LKML,
	James Bottomley, mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w,
	linux-fsdevel, Alexander Viro

Hi Andrey,

On 07/22/2016 08:25 PM, Andrey Vagin wrote:
> On Thu, Jul 21, 2016 at 11:48 PM, Michael Kerrisk (man-pages)
> <mtk.manpages@gmail.com> wrote:
>> Hi Andrey,
>>
>>
>> On 07/21/2016 11:06 PM, Andrew Vagin wrote:
>>>
>>> On Thu, Jul 21, 2016 at 04:41:12PM +0200, Michael Kerrisk (man-pages)
>>> wrote:
>>>>
>>>> Hi Andrey,
>>>>
>>>> On 07/14/2016 08:20 PM, Andrey Vagin wrote:
>>>
>>>
>>> <snip>
>>>
>>>>
>>>> Could you add here an explanation of the API in detail: what do these
>>>> FDs refer to, and how do you use them to solve the use case? And could
>>>> you add that info to the commit messages, please?
>>>
>>>
>>> Hi Michael,
>>>
>>> A patch for man-pages is attached. It adds the following text to
>>> namespaces(7).
>>>
>>> Since Linux 4.X, the following ioctl(2) calls are supported for
>>> namespace file descriptors.  The correct syntax is:
>>>
>>>       fd = ioctl(ns_fd, ioctl_type);
>>>
>>> where ioctl_type is one of the following:
>>>
>>> NS_GET_USERNS
>>>       Returns a file descriptor that refers to an owning user
>>>       namespace.
>>>
>>> NS_GET_PARENT
>>>       Returns a file descriptor that refers to a parent namespace.
>>>       This ioctl(2) can be used for pid and user namespaces. For user
>>>       namespaces, NS_GET_PARENT and NS_GET_USERNS have the same
>>>       meaning.

For each of the above, I think it is worth mentioning that the
close-on-exec flag is set for the returned file descriptor.

>>>
>>> In addition to generic ioctl(2) errors, the following specific ones can
>>> occur:
>>>
>>> EINVAL NS_GET_PARENT was called for a nonhierarchical namespace.
>>>
>>> EPERM  The requested namespace is outside of the current namespace
>>>       scope.

Perhaps add "and the caller does not have CAP_SYS_ADMIN in the initial
user namespace"?

>>>
>>> ENOENT ns_fd refers to the init namespace.
>>
>>
>> Thanks for this. But still part of the question remains unanswered.
>> How do we (in user-space) use the file descriptors to answer any of
>> the questions that this patch series was designed to solve? (This
>> info should be in the commit message and the man-pages patch.)
>
> I'm sorry, but I am not sure that I understand what you ask.
>
> Here are the original questions:
> Someone else then asked me a question that led me to wonder about
> generally introspecting on the parental relationships between user
> namespaces and the association of other namespaces types with user
> namespaces. One use would be visualization, in order to understand the
> running system. Another would be to answer the question I already
> mentioned: what capability does process X have to perform operations
> on a resource governed by namespace Y?
>
> Here is an example which shows how we can get the owning namespace
> inode number by using these ioctls.
>
> $ ls -l /proc/13929/ns/pid
> lrwxrwxrwx 1 root root 0 Jul 22 21:03 /proc/13929/ns/pid -> 'pid:[4026532228]'
>
> $ ./nsowner /proc/13929/ns/pid
> user:[4026532227]
>
> The owning user namespace for pid:[4026532228] is user:[4026532227].
>
> The nsowner tool is compiled from this code:
>
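> #include <fcntl.h>
> #include <stdio.h>
> #include <unistd.h>
> #include <sys/ioctl.h>
> #include <linux/nsfs.h> /* NS_GET_USERNS; assumed header location */
>
> int main(int argc, char *argv[])
> {
>         char buf[128], path[] = "/proc/self/fd/0123456789";
>         int ns, uns, ret;
>
>         /* Expect the path of a namespace file as the only argument. */
>         if (argc < 2)
>                 return 1;
>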
>         ns = open(argv[1], O_RDONLY);
>         if (ns < 0)
>                 return 1;
>
>         uns = ioctl(ns, NS_GET_USERNS);
>         if (uns < 0)
>                 return 1;
>
>         snprintf(path, sizeof(path), "/proc/self/fd/%d", uns);
>         ret = readlink(path, buf, sizeof(buf) - 1);
>         if (ret < 0)
>                 return 1;
>         buf[ret] = 0;
>
>         printf("%s\n", buf);
>
>         return 0;
> }

So, from my point of view, the important piece that was missing from
your commit message was the note to use readlink("/proc/self/fd/%d")
on the returned FDs. I think that detail needs to be part of the
commit message (and also the man page text). I think it would even be
helpful to include the above program as part of the commit message:
it helps people more quickly grasp the API.

> Does this example answer the original question?

Yes.

> If it doesn't, could
> you elaborate on what you expect to see here?
>
> I also wrote one more example, which shows all relationships between
> namespaces. It enumerates all processes in a system, collects all
> namespaces and determines the parent and owning namespaces for each of
> them, then constructs a namespace tree and prints it.
>
> Here is the code: https://gist.github.com/avagin/db805f95e15ffb0af7e559dbb8de4418

That's great! Thanks!
  
> Here is an example of output for my test system:
> [root@fc24 nsfs]# ./nstree
> user:[4026531837]
>  \__  mnt:[4026532203]
>  \__  ipc:[4026531839]
>  \__  user:[4026532224]
>      \__  user:[4026532226]
>          \__  user:[4026532227]
>              \__  pid:[4026532228]
>      \__  pid:[4026532225]
>          \__  pid:[4026532228]
>  \__  user:[4026532221]
>      \__  pid:[4026532222]
>      \__  user:[4026532223]
>  \__  mnt:[4026532211]
>  \__  uts:[4026531838]
>  \__  cgroup:[4026531835]
>  \__  pid:[4026531836]
>      \__  pid:[4026532225]
>          \__  pid:[4026532228]
>      \__  pid:[4026532222]
>  \__  mnt:[4026531857]
>  \__  mnt:[4026531840]
>  \__  net:[4026531957]
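
The parent links in a tree like the one above come from calling
NS_GET_PARENT repeatedly until it fails. A rough sketch follows,
assuming the ioctl behaves as described in the draft man-page text and
is defined as _IO(0xb7, 0x2); that value is an assumption about the
final header, and the fallback defines are only there so the fragment
builds without it. The function name count_visible_ancestors() is made
up for illustration.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Fallback definitions; the values are an assumption, see above. */
#ifndef NSIO
#define NSIO 0xb7
#endif
#ifndef NS_GET_PARENT
#define NS_GET_PARENT _IO(NSIO, 0x2)
#endif

/*
 * Counts how many ancestors of the namespace at 'path' can be reached
 * with NS_GET_PARENT. Returns -1 if 'path' cannot be opened. The walk
 * stops at the first error, whatever its cause (no parent, parent out
 * of scope, non-hierarchical namespace, or not an nsfs file at all).
 */
int count_visible_ancestors(const char *path)
{
        int fd, parent, depth = 0;

        fd = open(path, O_RDONLY);
        if (fd < 0)
                return -1;

        for (;;) {
                parent = ioctl(fd, NS_GET_PARENT);
                if (parent < 0)
                        break;

                /* Move one level up and drop the child descriptor. */
                close(fd);
                fd = parent;
                depth++;
        }

        close(fd);
        return depth;
}
```

A tree builder would additionally fstat() each descriptor along the way
and use the inode number as the node key, so that chains starting from
different processes merge at their common ancestors.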

Cheers,

Michael

>>>>> [1] https://lkml.org/lkml/2016/7/6/158
>>>>> [2] https://lkml.org/lkml/2016/7/9/101

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
       [not found]   ` <CANaxB-xw_xBUq=0uT14ANv-jfg2NsGaPy=jyDO9=yF03_7toSw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-07-24  5:10     ` Eric W. Biederman
  0 siblings, 0 replies; 85+ messages in thread
From: Eric W. Biederman @ 2016-07-24  5:10 UTC (permalink / raw)
  To: Andrey Vagin
  Cc: Serge Hallyn, criu-GEFAQzZX7r8dnm+yROfE0A, Linux API,
	Linux Containers, LKML, James Bottomley, Alexander Viro,
	linux-fsdevel, Michael Kerrisk (man-pages)

Andrey Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org> writes:

> Hello,
>
> I forgot to add --cc-cover for git send-email, so everyone who is in
> Cc got only the cover letter. All messages were sent to the mailing lists.
>
> Sorry for the inconvenience.

Mostly the code looked sensible.  But I had a couple of issues.
Resend this in September (when the merge window is closed and I am back
from vacation) and I will give this a thorough review and get this
merged.  Or possibly next week if Linus releases another -rc.

> On Thu, Jul 14, 2016 at 11:20 AM, Andrey Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org> wrote:
>> Each namespace has an owning user namespace, and currently there is no
>> way to discover these relationships.
>>
>> Pid and user namespaces are hierarchical. There is no way to discover
>> parent-child relationships either.
>>
>> Why might we want to know the relationships between namespaces?
>>
>> One use would be visualization, in order to understand the running system.
>> Another would be to answer the question: what capability does process X have to
>> perform operations on a resource governed by namespace Y?
>>
>> One more use case (which is usually called abnormal) is checkpoint/restart.
>> In CRIU we are going to dump and restore nested namespaces.
>>
>> There was a discussion [1] about which interface to choose to determine
>> relationships between namespaces.
>>
>> Eric suggested adding two ioctls [2]:
>>> Grumble, Grumble.  I think this may actually be a case for creating ioctls
>>> for these two cases.  Now that random nsfs file descriptors are bind
>>> mountable the original reason for using proc files is not as pressing.
>>>
>>> One ioctl for the user namespace that owns a file descriptor.
>>> One ioctl for the parent namespace of a namespace file descriptor.
>>
>> Here is an implementation of these ioctls.
>>
>> [1] https://lkml.org/lkml/2016/7/6/158
>> [2] https://lkml.org/lkml/2016/7/9/101
>>
>> Cc: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
>> Cc: James Bottomley <James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
>> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> Cc: "W. Trevor King" <wking-vJI2gpByivqcqzYg7KEe8g@public.gmane.org>
>> Cc: Alexander Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
>> Cc: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>


Eric

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
       [not found]   ` <CANaxB-xw_xBUq=0uT14ANv-jfg2NsGaPy=jyDO9=yF03_7toSw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-07-24  5:10     ` Eric W. Biederman
  0 siblings, 0 replies; 85+ messages in thread
From: Eric W. Biederman @ 2016-07-24  5:10 UTC (permalink / raw)
  To: Andrey Vagin
  Cc: LKML, James Bottomley, Serge Hallyn, Linux API, Linux Containers,
	Alexander Viro, criu, linux-fsdevel, Michael Kerrisk (man-pages)

Andrey Vagin <avagin@openvz.org> writes:

> Hello,
>
> I forgot to add --cc-cover for git send-email, so everyone who is in
> Cc got only the cover letter. All messages were sent to the mailing lists.
>
> Sorry for the inconvenience.

Mostly the code looked sensible.  But I had a couple of issues.
Resend this in September (when the merge window is closed and I am back
from vacation) and I will give this a thorough review and get this
merged.  Or possibly next week if Linus releases another -rc.

> On Thu, Jul 14, 2016 at 11:20 AM, Andrey Vagin <avagin@openvz.org> wrote:
>> Each namespace has an owning user namespace, and currently there is no
>> way to discover these relationships.
>>
>> Pid and user namespaces are hierarchical. There is no way to discover
>> parent-child relationships either.
>>
>> Why might we want to know the relationships between namespaces?
>>
>> One use would be visualization, in order to understand the running system.
>> Another would be to answer the question: what capability does process X have to
>> perform operations on a resource governed by namespace Y?
>>
>> One more use case (which is usually called abnormal) is checkpoint/restart.
>> In CRIU we are going to dump and restore nested namespaces.
>>
>> There [1] was a discussion about which interface to choose to determine
>> relationships between namespaces.
>>
>> Eric suggested to add two ioctl-s [2]:
>>> Grumble, Grumble.  I think this may actually be a case for creating ioctls
>>> for these two cases.  Now that random nsfs file descriptors are bind
>>> mountable the original reason for using proc files is not as pressing.
>>>
>>> One ioctl for the user namespace that owns a file descriptor.
>>> One ioctl for the parent namespace of a namespace file descriptor.
>>
>> Here is an implementation of these ioctl-s.
>>
>> [1] https://lkml.org/lkml/2016/7/6/158
>> [2] https://lkml.org/lkml/2016/7/9/101
>>
>> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
>> Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
>> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
>> Cc: "W. Trevor King" <wking@tremily.us>
>> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
>> Cc: Serge Hallyn <serge.hallyn@canonical.com>


Eric

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-23 22:34                 ` W. Trevor King
@ 2016-07-24  4:51                     ` Eric W. Biederman
  -1 siblings, 0 replies; 85+ messages in thread
From: Eric W. Biederman @ 2016-07-24  4:51 UTC (permalink / raw)
  To: W. Trevor King
  Cc: Serge Hallyn, Andrey Vagin, criu-GEFAQzZX7r8dnm+yROfE0A,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, James Bottomley,
	Alexander Viro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	Michael Kerrisk (man-pages)

"W. Trevor King" <wking-vJI2gpByivqcqzYg7KEe8g@public.gmane.org> writes:

> On Sat, Jul 23, 2016 at 04:56:44PM -0500, Eric W. Biederman wrote:
>> "W. Trevor King" <wking-vJI2gpByivqcqzYg7KEe8g@public.gmane.org> writes:
>> > On Sat, Jul 23, 2016 at 02:38:56PM -0700, James Bottomley wrote:
>> >> On Sat, 2016-07-23 at 14:14 -0700, W. Trevor King wrote:
>> >> > namespaces(7) and clone(2) both have:
>> >> > 
>> >> >   When a network namespace is freed (i.e., when the last
>> >> >   process in the namespace terminates), its physical network
>> >> >   devices are moved back to the initial network namespace (not
>> >> >   to the parent of the process).
>> >> > 
>> >> > So the initial network namespace (the head of
>> >> > net_namespace_list?)  is special [1].  To understand how
>> >> > physical network devices will be handled, it seems like we want
>> >> > to treat network devices as a depth-1 tree, with all
>> >> > non-initial net namespaces as children of the initial net
>> >> > namespace.  Can we extend this series' NS_GET_PARENT to return:
>> >> > 
>> >> > * EPERM for an unprivileged caller (like this series currently
>> >> >   does for PID namespaces),
>> >> > * ENOENT when called on net_namespace_list, and
>> >> > * net_namespace_list when called on any other net namespace.
>> >> 
>> >> What's the practical application of this?  Independent net
>> >> namespaces are managed by the ip netns command.  It pins them by
>> >> a bind mount in a flat fashion; if we make them hierarchical the
>> >> tool would probably need updating to reflect this, so we're going
>> >> to need a reason to give the network people.  Just having the
>> >> interfaces not go back to root when you do an ip netns delete
>> >> doesn't seem very compelling.
>> >
>> > I'm not suggesting we add support for deeper nesting, I'm suggesting
>> > we use NS_GET_PARENT to allow sufficiently privileged users to
>> > determine if a given net namespace is the initial net namespace.  You
>> > could do this already with something like:
>> >
>> > 1. Create a new net namespace.
>> > 2. Add a physical network device to that namespace.
>> > 3. Delete that namespace.
>> > 4. See if the physical network device shows up in your
>> >    initial-net-namespace candidate.
>> > 5. Delete the physical network device (hopefully it ended up
>> >    somewhere you can find it ;).
>> >
>> > But using an NS_GET_PARENT call seems much safer and easier.
>> 
>> Have you had the problem in practice where you can't tell which
>> network namespace is the initial network namespace?  This all seems
>> like a theoretical problem rather than a real one.
>
> I haven't had any practical problems here, I'm just trying to wrap my
> head around namespace-relationship discovery.  The special physical
> network device handling seems a lot like init re-parenting (with no
> PR_SET_CHILD_SUBREAPER analog in a 1-deep namespace tree), so calling
> the initial network namespace a parent (and all the other namespaces
> its direct children) seems natural enough.  If that doesn't sound
> convincing, I'm happy to punt this idea until someone runs into a
> practical problem ;).

Then let's punt this until someone runs into a practical problem.

For scaling and for sanity it is desirable to keep the connections
between namespaces to a minimum.  Further, the initial instances of a
namespace always tend to be a little bit special.

Eric

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-23 21:56             ` Eric W. Biederman
@ 2016-07-23 22:34                 ` W. Trevor King
  -1 siblings, 0 replies; 85+ messages in thread
From: W. Trevor King @ 2016-07-23 22:34 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Serge Hallyn, Andrey Vagin, criu-GEFAQzZX7r8dnm+yROfE0A,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, James Bottomley,
	Alexander Viro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	Michael Kerrisk (man-pages)


[-- Attachment #1.1: Type: text/plain, Size: 3232 bytes --]

On Sat, Jul 23, 2016 at 04:56:44PM -0500, Eric W. Biederman wrote:
> "W. Trevor King" <wking-vJI2gpByivqcqzYg7KEe8g@public.gmane.org> writes:
> > On Sat, Jul 23, 2016 at 02:38:56PM -0700, James Bottomley wrote:
> >> On Sat, 2016-07-23 at 14:14 -0700, W. Trevor King wrote:
> >> > namespaces(7) and clone(2) both have:
> >> > 
> >> >   When a network namespace is freed (i.e., when the last
> >> >   process in the namespace terminates), its physical network
> >> >   devices are moved back to the initial network namespace (not
> >> >   to the parent of the process).
> >> > 
> >> > So the initial network namespace (the head of
> >> > net_namespace_list?)  is special [1].  To understand how
> >> > physical network devices will be handled, it seems like we want
> >> > to treat network devices as a depth-1 tree, with all
> >> > non-initial net namespaces as children of the initial net
> >> > namespace.  Can we extend this series' NS_GET_PARENT to return:
> >> > 
> >> > * EPERM for an unprivileged caller (like this series currently
> >> >   does for PID namespaces),
> >> > * ENOENT when called on net_namespace_list, and
> >> > * net_namespace_list when called on any other net namespace.
> >> 
> >> What's the practical application of this?  Independent net
> >> namespaces are managed by the ip netns command.  It pins them by
> >> a bind mount in a flat fashion; if we make them hierarchical the
> >> tool would probably need updating to reflect this, so we're going
> >> to need a reason to give the network people.  Just having the
> >> interfaces not go back to root when you do an ip netns delete
> >> doesn't seem very compelling.
> >
> > I'm not suggesting we add support for deeper nesting, I'm suggesting
> > we use NS_GET_PARENT to allow sufficiently privileged users to
> > determine if a given net namespace is the initial net namespace.  You
> > could do this already with something like:
> >
> > 1. Create a new net namespace.
> > 2. Add a physical network device to that namespace.
> > 3. Delete that namespace.
> > 4. See if the physical network device shows up in your
> >    initial-net-namespace candidate.
> > 5. Delete the physical network device (hopefully it ended up
> >    somewhere you can find it ;).
> >
> > But using an NS_GET_PARENT call seems much safer and easier.
> 
> Have you had the problem in practice where you can't tell which
> network namespace is the initial network namespace?  This all seems
> like a theoretical problem rather than a real one.

I haven't had any practical problems here, I'm just trying to wrap my
head around namespace-relationship discovery.  The special physical
network device handling seems a lot like init re-parenting (with no
PR_SET_CHILD_SUBREAPER analog in a 1-deep namespace tree), so calling
the initial network namespace a parent (and all the other namespaces
its direct children) seems natural enough.  If that doesn't sound
convincing, I'm happy to punt this idea until someone runs into a
practical problem ;).

Cheers,
Trevor

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
       [not found]         ` <1469309936.2332.35.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
@ 2016-07-23 21:58           ` W. Trevor King
  0 siblings, 0 replies; 85+ messages in thread
From: W. Trevor King @ 2016-07-23 21:58 UTC (permalink / raw)
  To: James Bottomley
  Cc: Serge Hallyn, Andrey Vagin, linux-api-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, criu-GEFAQzZX7r8dnm+yROfE0A,
	Alexander Viro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	Michael Kerrisk (man-pages),
	Eric W. Biederman


[-- Attachment #1.1: Type: text/plain, Size: 2240 bytes --]

On Sat, Jul 23, 2016 at 02:38:56PM -0700, James Bottomley wrote:
> On Sat, 2016-07-23 at 14:14 -0700, W. Trevor King wrote:
> > namespaces(7) and clone(2) both have:
> > 
> >   When a network namespace is freed (i.e., when the last process
> >   in the namespace terminates), its physical network devices are
> >   moved back to the initial network namespace (not to the parent
> >   of the process).
> > 
> > So the initial network namespace (the head of net_namespace_list?)
> > is special [1].  To understand how physical network devices will
> > be handled, it seems like we want to treat network devices as a
> > depth-1 tree, with all non-initial net namespaces as children of
> > the initial net namespace.  Can we extend this series'
> > NS_GET_PARENT to return:
> > 
> > * EPERM for an unprivileged caller (like this series currently does
> >   for PID namespaces),
> > * ENOENT when called on net_namespace_list, and
> > * net_namespace_list when called on any other net namespace.
> 
> What's the practical application of this?  Independent net
> namespaces are managed by the ip netns command.  It pins them by a
> bind mount in a flat fashion; if we make them hierarchical the tool
> would probably need updating to reflect this, so we're going to need
> a reason to give the network people.  Just having the interfaces not
> go back to root when you do an ip netns delete doesn't seem very
> compelling.

I'm not suggesting we add support for deeper nesting, I'm suggesting
we use NS_GET_PARENT to allow sufficiently privileged users to
determine if a given net namespace is the initial net namespace.  You
could do this already with something like:

1. Create a new net namespace.
2. Add a physical network device to that namespace.
3. Delete that namespace.
4. See if the physical network device shows up in your
   initial-net-namespace candidate.
5. Delete the physical network device (hopefully it ended up somewhere
   you can find it ;).

But using an NS_GET_PARENT call seems much safer and easier.
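
For illustration only, here is a rough userspace sketch of what such an
NS_GET_PARENT query might look like.  It assumes the ioctl numbers that
were later merged in <linux/nsfs.h> (_IO(0xb7, 0x1) and _IO(0xb7, 0x2))
and the common _IO() encoding used on x86 and arm; the helper names
ns_id() and query_parent() are hypothetical.  It does not assume the
net-namespace behaviour proposed above; it only reports whatever errno
the kernel returns.

```python
import errno
import fcntl
import os

# Assumed values: _IO(NSIO, nr) with the common encoding (type << 8) | nr.
# A few architectures encode _IO() differently, so treat these as a sketch.
NSIO = 0xB7
NS_GET_USERNS = (NSIO << 8) | 0x1
NS_GET_PARENT = (NSIO << 8) | 0x2


def ns_id(fd):
    """Identify a namespace by the (device, inode) pair of its nsfs file."""
    st = os.fstat(fd)
    return (st.st_dev, st.st_ino)


def query_parent(ns_path):
    """Return ("ok", (dev, ino)) for the parent, or ("error", errno name)."""
    try:
        fd = os.open(ns_path, os.O_RDONLY)
    except OSError as e:
        return ("error", errno.errorcode.get(e.errno, str(e.errno)))
    try:
        # On success the ioctl returns a new file descriptor for the parent.
        parent_fd = fcntl.ioctl(fd, NS_GET_PARENT)
    except OSError as e:
        return ("error", errno.errorcode.get(e.errno, str(e.errno)))
    finally:
        os.close(fd)
    try:
        return ("ok", ns_id(parent_fd))
    finally:
        os.close(parent_fd)
```

Calling query_parent("/proc/self/ns/pid") would then yield either the
parent's (device, inode) pair or the name of the errno, which is all a
caller needs to tell "no visible parent" apart from "not supported".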

Cheers,
Trevor

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
       [not found]           ` <20160723215802.GO24913-q4NCUed9G3sTnwFZoN752g@public.gmane.org>
@ 2016-07-23 21:56             ` Eric W. Biederman
  0 siblings, 0 replies; 85+ messages in thread
From: Eric W. Biederman @ 2016-07-23 21:56 UTC (permalink / raw)
  To: W. Trevor King
  Cc: Serge Hallyn, Andrey Vagin, criu-GEFAQzZX7r8dnm+yROfE0A,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, James Bottomley,
	Alexander Viro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	Michael Kerrisk (man-pages)

"W. Trevor King" <wking-vJI2gpByivqcqzYg7KEe8g@public.gmane.org> writes:

> On Sat, Jul 23, 2016 at 02:38:56PM -0700, James Bottomley wrote:
>> On Sat, 2016-07-23 at 14:14 -0700, W. Trevor King wrote:
>> > namespaces(7) and clone(2) both have:
>> > 
>> >   When a network namespace is freed (i.e., when the last process
>> >   in the namespace terminates), its physical network devices are
>> >   moved back to the initial network namespace (not to the parent
>> >   of the process).
>> > 
>> > So the initial network namespace (the head of net_namespace_list?)
>> > is special [1].  To understand how physical network devices will
>> > be handled, it seems like we want to treat network devices as a
>> > depth-1 tree, with all non-initial net namespaces as children of
>> > the initial net namespace.  Can we extend this series'
>> > NS_GET_PARENT to return:
>> > 
>> > * EPERM for an unprivileged caller (like this series currently does
>> >   for PID namespaces),
>> > * ENOENT when called on net_namespace_list, and
>> > * net_namespace_list when called on any other net namespace.
>> 
>> What's the practical application of this?  independent net
>> namespaces are managed by the ip netns command.  It pins them by a
>> bind mount in a flat fashion; if we make them hierarchical the tool
>> would probably need updating to reflect this, so we're going to need
>> a reason to give the network people.  Just having the interfaces not
>> go back to root when you do an ip netns delete doesn't seem very
>> compelling.
>
> I'm not suggesting we add support for deeper nesting, I'm suggesting
> we use NS_GET_PARENT to allow sufficiently privileged users to
> determine if a given net namespace is the initial net namespace.  You
> could do this already with something like:
>
> 1. Create a new net namespace.
> 2. Add a physical network device to that namespace.
> 3. Delete that namespace.
> 4. See if the physical network device shows up in your
>    initial-net-namespace candidate.
> 5. Delete the physical network device (hopefully it ended up somewhere
>    you can find it ;).
>
> But using an NS_GET_PARENT call seems much safer and easier.

Have you had the problem in practice where you can't tell which network
namespace is the initial network namespace?  This all seems like a
theoretical problem rather than a real one.

Eric

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-23 21:14     ` W. Trevor King
@ 2016-07-23 21:38         ` James Bottomley
  -1 siblings, 0 replies; 85+ messages in thread
From: James Bottomley @ 2016-07-23 21:38 UTC (permalink / raw)
  To: W. Trevor King, Andrey Vagin
  Cc: Serge Hallyn, linux-api-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, criu-GEFAQzZX7r8dnm+yROfE0A,
	Alexander Viro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	Michael Kerrisk (man-pages),
	Eric W. Biederman


[-- Attachment #1.1: Type: text/plain, Size: 1852 bytes --]

On Sat, 2016-07-23 at 14:14 -0700, W. Trevor King wrote:
> On Thu, Jul 14, 2016 at 11:20:14AM -0700, Andrey Vagin wrote:
> > Pid and user namespaces are hierarchical. There is no way to
> > discover parent-child relationships either.
> 
> It bothers me that network namespaces are not hierarchical too ;).

Well, there's a reason for that: mapping namespaces need to be
hierarchical because the mapping may be remapped; the initial point for
creating a new namespace is the mapped endpoint of the old one.
Label-based namespaces don't really have any need to be.

> namespaces(7) and clone(2) both have:
> 
>   When a network namespace is freed (i.e., when the last process in
>   the namespace terminates), its physical network devices are moved
>   back to the initial network namespace (not to the parent of the
>   process).
> 
> So the initial network namespace (the head of net_namespace_list?) is
> special [1].  To understand how physical network devices will be
> handled, it seems like we want to treat network devices as a depth-1
> tree, with all non-initial net namespaces as children of the initial
> net namespace.  Can we extend this series' NS_GET_PARENT to return:
> 
> * EPERM for an unprivileged caller (like this series currently does
>   for PID namespaces),
> * ENOENT when called on net_namespace_list, and
> * net_namespace_list when called on any other net namespace.

What's the practical application of this?  Independent net namespaces
are managed by the ip netns command.  It pins them by a bind mount in a
flat fashion; if we make them hierarchical the tool would probably need
updating to reflect this, so we're going to need a reason to give the
network people.  Just having the interfaces not go back to root when
you do an ip netns delete doesn't seem very compelling.

James


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 205 bytes --]

_______________________________________________
Containers mailing list
Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-14 18:20 ` Andrey Vagin
@ 2016-07-23 21:14     ` W. Trevor King
  -1 siblings, 0 replies; 85+ messages in thread
From: W. Trevor King @ 2016-07-23 21:14 UTC (permalink / raw)
  To: Andrey Vagin
  Cc: James Bottomley, Serge Hallyn, linux-api-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Alexander Viro,
	criu-GEFAQzZX7r8dnm+yROfE0A, Eric W. Biederman,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Michael Kerrisk (man-pages)


[-- Attachment #1.1: Type: text/plain, Size: 2211 bytes --]

On Thu, Jul 14, 2016 at 11:20:14AM -0700, Andrey Vagin wrote:
> Pid and user namespaces are hierarchical. There is no way to discover
> parent-child relationships either.

It bothers me that network namespaces are not hierarchical too ;).
namespaces(7) and clone(2) both have:

  When a network namespace is freed (i.e., when the last process in
  the namespace terminates), its physical network devices are moved
  back to the initial network namespace (not to the parent of the
  process).

So the initial network namespace (the head of net_namespace_list?) is
special [1].  To understand how physical network devices will be
handled, it seems like we want to treat network devices as a depth-1
tree, with all non-initial net namespaces as children of the initial
net namespace.  Can we extend this series' NS_GET_PARENT to return:

* EPERM for an unprivileged caller (like this series currently does
  for PID namespaces),
* ENOENT when called on net_namespace_list, and
* net_namespace_list when called on any other net namespace.

If that sounds reasonable, I'm happy to stumble my way through a patch
;).

And one benefit of the net_namespace_list approach is that it will be
really easy to walk children if we ever add a parent → children lookup
service to mirror this series' child → parent service.

Cheers,
Trevor

[1]: The commit message for 2b035b39 (net: Batch network namespace
  destruction, 2009-11-29) opens with:

    It is fairly common to kill several network namespaces at once.
    Either because they are nested one inside the other or…

  which I'm having trouble understanding if network namespaces aren't
  hierarchical (and they don't seem to be, except for the initial
  network namespace being special).  Maybe nested network namespaces
  were on the table at one point but never materialized?

  net->list looks like a reference to that namespace's entry in
  net_namespace_list, and I didn't see anything else that looked like
  a reference to a parent or list of children.

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-22  6:48         ` Michael Kerrisk (man-pages)
@ 2016-07-22 18:25               ` Andrey Vagin
  0 siblings, 0 replies; 85+ messages in thread
From: Andrey Vagin @ 2016-07-22 18:25 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Serge Hallyn, Andrew Vagin, criu-GEFAQzZX7r8dnm+yROfE0A,
	Linux API, Linux Containers, LKML, James Bottomley,
	Alexander Viro, linux-fsdevel, Eric W. Biederman

On Thu, Jul 21, 2016 at 11:48 PM, Michael Kerrisk (man-pages)
<mtk.manpages@gmail.com> wrote:
> Hi Andrey,
>
>
> On 07/21/2016 11:06 PM, Andrew Vagin wrote:
>>
>> On Thu, Jul 21, 2016 at 04:41:12PM +0200, Michael Kerrisk (man-pages)
>> wrote:
>>>
>>> Hi Andrey,
>>>
>>> On 07/14/2016 08:20 PM, Andrey Vagin wrote:
>>
>>
>> <snip>
>>
>>>
>>> Could you add here an explanation of the API in detail: what do these
>>> FDs refer to, and how do you use them to solve the use case? And could
>>> you add that info to the commit messages please.
>>
>>
>> Hi Michael,
>>
>> A patch for man-pages is attached. It adds the following text to
>> namespaces(7).
>>
>> Since Linux 4.X, the following ioctl(2) calls are supported for
>> namespace file descriptors.  The correct syntax is:
>>
>>       fd = ioctl(ns_fd, ioctl_type);
>>
>> where ioctl_type is one of the following:
>>
>> NS_GET_USERNS
>>       Returns a file descriptor that refers to an owning user
>>       namespace.
>>
>> NS_GET_PARENT
>>       Returns a file descriptor that refers to a parent namespace.
>>       This ioctl(2) can be used for pid and user namespaces. For user
>>       namespaces, NS_GET_PARENT and NS_GET_USERNS have the same
>>       meaning.
>>
>> In addition to generic ioctl(2) errors, the following specific ones can
>> occur:
>>
>> EINVAL NS_GET_PARENT was called for a nonhierarchical namespace.
>>
>> EPERM  The requested namespace is outside of the current namespace
>>       scope.
>>
>> ENOENT ns_fd refers to the init namespace.
>
>
> Thanks for this. But still part of the question remains unanswered.
> How do we (in user-space) use the file descriptors to answer any of
> the questions that this patch series was designed to solve? (This
> info should be in the commit message and the man-pages patch.)

I'm sorry, but I am not sure that I understand what you are asking.

Here are the original questions:
Someone else then asked me a question that led me to wonder about
generally introspecting on the parental relationships between user
namespaces and the association of other namespaces types with user
namespaces. One use would be visualization, in order to understand the
running system. Another would be to answer the question I already
mentioned: what capability does process X have to perform operations
on a resource governed by namespace Y?

Here is an example which shows how we can get the owning namespace
inode number by using these ioctl-s.

$ ls -l /proc/13929/ns/pid
lrwxrwxrwx 1 root root 0 Jul 22 21:03 /proc/13929/ns/pid -> 'pid:[4026532228]'

$ ./nsowner /proc/13929/ns/pid
user:[4026532227]

The owning user namespace for pid:[4026532228] is user:[4026532227].

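The nsowner tool is compiled from this code:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/nsfs.h>	/* NS_GET_USERNS; assumes the uapi header from this series */

int main(int argc, char *argv[])
{
        /* The digits in the initializer only reserve room for any fd number. */
        char buf[128], path[] = "/proc/self/fd/0123456789";
        int ns, uns, ret;

        if (argc < 2)
                return 1;

        /* Open the namespace file, e.g. /proc/<pid>/ns/pid. */
        ns = open(argv[1], O_RDONLY);
        if (ns < 0)
                return 1;

        /* Get a file descriptor for the owning user namespace. */
        uns = ioctl(ns, NS_GET_USERNS);
        if (uns < 0)
                return 1;

        /* The /proc/self/fd symlink reads as "user:[<inode>]". */
        snprintf(path, sizeof(path), "/proc/self/fd/%d", uns);
        ret = readlink(path, buf, sizeof(buf) - 1);
        if (ret < 0)
                return 1;
        buf[ret] = 0;

        printf("%s\n", buf);

        return 0;
}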

Does this example answer the original question? If it doesn't, could
you elaborate on what you expect to see here?

I also wrote one more example, which shows all relationships between
namespaces. It enumerates all processes in the system, collects all
namespaces, and determines the parent and owning namespaces for each of
them; then it constructs a namespace tree and shows it.

Here is the code: https://gist.github.com/avagin/db805f95e15ffb0af7e559dbb8de4418

Here is an example of output for my test system:
[root@fc24 nsfs]# ./nstree
user:[4026531837]
 \__  mnt:[4026532203]
 \__  ipc:[4026531839]
 \__  user:[4026532224]
     \__  user:[4026532226]
         \__  user:[4026532227]
             \__  pid:[4026532228]
     \__  pid:[4026532225]
         \__  pid:[4026532228]
 \__  user:[4026532221]
     \__  pid:[4026532222]
     \__  user:[4026532223]
 \__  mnt:[4026532211]
 \__  uts:[4026531838]
 \__  cgroup:[4026531835]
 \__  pid:[4026531836]
     \__  pid:[4026532225]
         \__  pid:[4026532228]
     \__  pid:[4026532222]
 \__  mnt:[4026531857]
 \__  mnt:[4026531840]
 \__  net:[4026531957]
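
The parent chains in a tree like the one above can be collected by calling NS_GET_PARENT repeatedly. The following is only a minimal sketch of that idea, not the code from the gist: ns_ancestry() is a hypothetical helper name, and it assumes a <linux/nsfs.h> that defines NS_GET_PARENT. It records the inode number of each namespace from ns_fd upwards and stops at the first ioctl(2) failure, whatever the errno.

```c
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <linux/nsfs.h>	/* NS_GET_PARENT; assumes headers that carry this series */

/*
 * Hypothetical helper: store the inode numbers of ns_fd and its
 * ancestors in inodes[], starting with ns_fd itself.  Returns the
 * number of entries stored (at most max), or -1 if fstat(2) fails.
 * The caller keeps ownership of ns_fd.
 */
static int ns_ancestry(int ns_fd, ino_t *inodes, int max)
{
        int fd = dup(ns_fd);    /* work on a copy so ns_fd stays open */
        int n = 0;

        while (fd >= 0 && n < max) {
                struct stat st;
                int parent;

                if (fstat(fd, &st) < 0) {
                        close(fd);
                        return -1;
                }
                inodes[n++] = st.st_ino;

                /* A negative return ends the walk (no parent, or not visible). */
                parent = ioctl(fd, NS_GET_PARENT);
                close(fd);
                fd = parent;
        }
        if (fd >= 0)
                close(fd);      /* chain was longer than max entries */
        return n;
}
```

Called on a descriptor for /proc/13929/ns/pid, this would be expected to yield the inode numbers seen in the pid branch of the tree, innermost first, as far as the caller is permitted to look.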

Thanks,
Andrew

>
> Thanks,
>
> Michael
>
>
>>>> [1] https://lkml.org/lkml/2016/7/6/158
>>>> [2] https://lkml.org/lkml/2016/7/9/101
>>>>
>>>> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
>>>> Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
>>>> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
>>>> Cc: "W. Trevor King" <wking@tremily.us>
>>>> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
>>>> Cc: Serge Hallyn <serge.hallyn@canonical.com>
>>>>
>>>> --
>>>> 2.5.5
>>>>
>>>>
>>>
>>>
>>> --
>>> Michael Kerrisk
>>> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
>>> Linux/UNIX System Programming Training: http://man7.org/training/
>
>
>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/
> _______________________________________________
> Containers mailing list
> Containers@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
       [not found]       ` <20160721210650.GA10989-1ViLX0X+lBJGNQ1M2rI3KwRV3xvJKrda@public.gmane.org>
@ 2016-07-22  6:48         ` Michael Kerrisk (man-pages)
       [not found]           ` <1515f5f2-5a49-fcab-61f4-8b627d3ba3e2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 85+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-07-22  6:48 UTC (permalink / raw)
  To: Andrew Vagin
  Cc: James Bottomley, Andrey Vagin, Serge Hallyn,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Alexander Viro,
	criu-GEFAQzZX7r8dnm+yROfE0A, mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman

Hi Andrey,

On 07/21/2016 11:06 PM, Andrew Vagin wrote:
> On Thu, Jul 21, 2016 at 04:41:12PM +0200, Michael Kerrisk (man-pages) wrote:
>> Hi Andrey,
>>
>> On 07/14/2016 08:20 PM, Andrey Vagin wrote:
>
> <snip>
>
>>
>> Could you add here an explanation of the API in detail: what do these FDs
>> refer to, and how do you use them to solve the use case? And could you add
>> that info to the commit messages, please?
>
> Hi Michael,
>
> A patch for man-pages is attached. It adds the following text to
> namespaces(7).
>
> Since Linux 4.X, the following ioctl(2) calls are supported for
> namespace file descriptors.  The correct syntax is:
>
>       fd = ioctl(ns_fd, ioctl_type);
>
> where ioctl_type is one of the following:
>
> NS_GET_USERNS
>       Returns a file descriptor that refers to an owning user
>       namespace.
>
> NS_GET_PARENT
>       Returns a file descriptor that refers to a parent namespace.
>       This ioctl(2) can be used for pid and user namespaces. For user
>       namespaces, NS_GET_PARENT and NS_GET_USERNS have the same
>       meaning.
>
> In addition to generic ioctl(2) errors, the following specific ones can
> occur:
>
> EINVAL NS_GET_PARENT was called for a nonhierarchical namespace.
>
> EPERM  The requested namespace is outside of the current namespace
>       scope.
>
> ENOENT ns_fd refers to the init namespace.

Thanks for this. But still part of the question remains unanswered.
How do we (in user-space) use the file descriptors to answer any of
the questions that this patch series was designed to solve? (This
info should be in the commit message and the man-pages patch.)

Thanks,

Michael


>>> [1] https://lkml.org/lkml/2016/7/6/158
>>> [2] https://lkml.org/lkml/2016/7/9/101
>>>
>>> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
>>> Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
>>> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
>>> Cc: "W. Trevor King" <wking@tremily.us>
>>> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
>>> Cc: Serge Hallyn <serge.hallyn@canonical.com>
>>>
>>> --
>>> 2.5.5
>>>
>>>
>>
>>
>> --
>> Michael Kerrisk
>> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
>> Linux/UNIX System Programming Training: http://man7.org/training/


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
  2016-07-21 14:41   ` Michael Kerrisk (man-pages)
  (?)
@ 2016-07-21 21:06       ` Andrew Vagin
  -1 siblings, 0 replies; 85+ messages in thread
From: Andrew Vagin @ 2016-07-21 21:06 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: James Bottomley, Andrey Vagin, Serge Hallyn,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, criu-GEFAQzZX7r8dnm+yROfE0A,
	Eric W. Biederman, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	Alexander Viro

[-- Attachment #1: Type: text/plain, Size: 2156 bytes --]

On Thu, Jul 21, 2016 at 04:41:12PM +0200, Michael Kerrisk (man-pages) wrote:
> Hi Andrey,
> 
> On 07/14/2016 08:20 PM, Andrey Vagin wrote:

<snip>

> 
> Could you add here an explanation of the API in detail: what do these FDs
> refer to, and how do you use them to solve the use case? And could you add
> that info to the commit messages, please?

Hi Michael,

A patch for man-pages is attached. It adds the following text to
namespaces(7).

Since Linux 4.X, the following ioctl(2) calls are supported for
namespace file descriptors.  The correct syntax is:

      fd = ioctl(ns_fd, ioctl_type);

where ioctl_type is one of the following:

NS_GET_USERNS
      Returns a file descriptor that refers to an owning user
      namespace.

NS_GET_PARENT
      Returns a file descriptor that refers to a parent namespace.
      This ioctl(2) can be used for pid and user namespaces. For user
      namespaces, NS_GET_PARENT and NS_GET_USERNS have the same
      meaning.

In addition to generic ioctl(2) errors, the following specific ones can
occur:

EINVAL NS_GET_PARENT was called for a nonhierarchical namespace.

EPERM  The requested namespace is outside of the current namespace
      scope.

ENOENT ns_fd refers to the init namespace.
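
As a rough illustration of how user space might consume these
descriptors, the sketch below walks from a namespace file towards the
root by repeatedly issuing one of the two ioctls and printing the inode
number of each namespace it reaches (the same numbers that appear as
pid:[...] or user:[...] under /proc/PID/ns). It is only a sketch under
assumptions: the request codes are assumed to match the values found in
<linux/nsfs.h> on later kernels and may differ from this RFC, and the
helper names ns_walk() and ns_rel_strerror() are made up for the
example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Assumption: request codes as defined in <linux/nsfs.h> on later
 * kernels; the header shipped with the patched kernel is authoritative. */
#ifndef NS_GET_USERNS
#define NSIO          0xb7
#define NS_GET_USERNS _IO(NSIO, 0x1)
#define NS_GET_PARENT _IO(NSIO, 0x2)
#endif

/* Hypothetical helper: describe the errors listed in the proposed
 * man-page text, falling back to strerror() for anything else. */
static const char *ns_rel_strerror(int err)
{
    switch (err) {
    case EINVAL: return "namespace type is not hierarchical";
    case EPERM:  return "related namespace is outside the caller's scope";
    case ENOENT: return "ns_fd refers to the init namespace";
    default:     return strerror(err);
    }
}

/* Hypothetical helper: open the namespace file at `path`, then follow
 * `request` (NS_GET_PARENT or NS_GET_USERNS) until the ioctl fails.
 * Prints one inode number per namespace visited, indented by depth.
 * Returns the number of namespaces visited, or -1 if open() fails. */
static int ns_walk(const char *path, unsigned long request)
{
    int fd = open(path, O_RDONLY);
    int depth = 0;

    if (fd < 0)
        return -1;

    for (;;) {
        struct stat st;
        int next;

        /* The inode number identifies the namespace. */
        if (fstat(fd, &st) == 0)
            printf("%*s[%lu]\n", depth * 2, "", (unsigned long)st.st_ino);
        depth++;

        /* The ioctl returns a new fd for the related namespace. */
        next = ioctl(fd, request);
        if (next < 0) {
            fprintf(stderr, "stop: %s\n", ns_rel_strerror(errno));
            break;
        }
        close(fd);
        fd = next;
    }
    close(fd);
    return depth;
}
```

With these helpers, ns_walk("/proc/self/ns/pid", NS_GET_PARENT) would
list the chain of pid namespaces above the caller, and
ns_walk("/proc/self/ns/net", NS_GET_USERNS) would start at the net
namespace and then follow the owning user namespaces. Comparing the
printed inode numbers across processes is one way a tool could rebuild
a tree of namespaces for visualization. The walk ends with whichever of
the errors above the kernel reports.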

Thanks,
Andrew

> 
> Thanks,
> 
> Michael
> 
> 
> > [1] https://lkml.org/lkml/2016/7/6/158
> > [2] https://lkml.org/lkml/2016/7/9/101
> > 
> > Cc: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
> > Cc: James Bottomley <James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
> > Cc: "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> > Cc: "W. Trevor King" <wking-vJI2gpByivqcqzYg7KEe8g@public.gmane.org>
> > Cc: Alexander Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
> > Cc: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
> > 
> > --
> > 2.5.5
> > 
> > 
> 
> 
> -- 
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/

[-- Attachment #2: 0001-namespace.7-descirbe-NS_GET_USERNS-and-NS_GET-PARENT.patch --]
[-- Type: text/plain, Size: 1796 bytes --]

From 4b9194026f901c2247150bb3038c41658700f6dd Mon Sep 17 00:00:00 2001
From: Andrey Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
Date: Thu, 21 Jul 2016 13:58:06 -0700
Subject: [PATCH] namespaces.7: describe NS_GET_USERNS and NS_GET_PARENT ioctl-s

Signed-off-by: Andrey Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
---
 man7/namespaces.7 | 43 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/man7/namespaces.7 b/man7/namespaces.7
index 98ed3e5..207e4a5 100644
--- a/man7/namespaces.7
+++ b/man7/namespaces.7
@@ -149,6 +149,49 @@ even if all processes in the namespace terminate.
 The file descriptor can be passed to
 .BR setns (2).
 
+Since Linux 4.X, the following
+.BR ioctl (2)
+calls are supported for namespace file descriptors.
+The correct syntax is:
+.PP
+.RS
+.nf
+.IB fd " = ioctl(" ns_fd ", " ioctl_type ");"
+.fi
+.RE
+.PP
+where
+.I ioctl_type
+is one of the following:
+.TP
+.B NS_GET_USERNS
+Returns a file descriptor that refers to an owning user namespace.
+.TP
+.B NS_GET_PARENT
+Returns a file descriptor that refers to a parent namespace. This
+.BR ioctl (2)
+can be used for pid and user namespaces. For user namespaces,
+.B NS_GET_PARENT
+and
+.B NS_GET_USERNS
+have the same meaning.
+.PP
+In addition to generic
+.BR ioctl (2)
+errors, the following specific ones can occur:
+.PP
+.TP
+.B EINVAL
+.B NS_GET_PARENT
+was called for a nonhierarchical namespace.
+.TP
+.B EPERM
+The requested namespace is outside of the current namespace scope.
+.TP
+.B ENOENT
+.IB ns_fd
+refers to the init namespace.
+.PP
 In Linux 3.7 and earlier, these files were visible as hard links.
 Since Linux 3.8, they appear as symbolic links.
 If two processes are in the same namespace, then the inode numbers of their
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
       [not found] ` <1468520419-28220-1-git-send-email-avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
  2016-07-14 22:02   ` Andrey Vagin
@ 2016-07-21 14:41   ` Michael Kerrisk (man-pages)
  2016-07-23 21:14     ` W. Trevor King
  2016-08-01 18:20     ` Alban Crequy
  3 siblings, 0 replies; 85+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-07-21 14:41 UTC (permalink / raw)
  To: Andrey Vagin, linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: James Bottomley, Serge Hallyn, linux-api-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Alexander Viro, criu-GEFAQzZX7r8dnm+yROfE0A,
	mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman

Hi Andrey,

On 07/14/2016 08:20 PM, Andrey Vagin wrote:
> Each namespace has an owning user namespace, and currently there is no way
> to discover these relationships.
>
> Pid and user namespaces are hierarchical. There is no way to discover
> their parent-child relationships either.
>
> Why might we want to know the relationships between namespaces?
>
> One use would be visualization, in order to understand the running system.
> Another would be to answer the question: what capability does process X have to
> perform operations on a resource governed by namespace Y?
>
> One more use case (which is usually called abnormal) is checkpoint/restart.
> In CRIU we are going to dump and restore nested namespaces.
>
> There was a discussion [1] about which interface to choose to determine
> relationships between namespaces.
>
> Eric suggested adding two ioctls [2]:
>> Grumble, Grumble.  I think this may actually be a case for creating ioctls
>> for these two cases.  Now that random nsfs file descriptors are bind
>> mountable the original reason for using proc files is not as pressing.
>>
>> One ioctl for the user namespace that owns a file descriptor.
>> One ioctl for the parent namespace of a namespace file descriptor.
>
> Here is an implementation of these ioctls.

Could you add here an explanation of the API in detail: what do these FDs
refer to, and how do you use them to solve the use case? And could you add
that info to the commit messages, please?

Thanks,

Michael


> [1] https://lkml.org/lkml/2016/7/6/158
> [2] https://lkml.org/lkml/2016/7/9/101
>
> Cc: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
> Cc: James Bottomley <James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: "W. Trevor King" <wking-vJI2gpByivqcqzYg7KEe8g@public.gmane.org>
> Cc: Alexander Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
> Cc: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
>
> --
> 2.5.5
>
>


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
       [not found] ` <1468520419-28220-1-git-send-email-avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
@ 2016-07-14 22:02   ` Andrey Vagin
  2016-07-21 14:41   ` Michael Kerrisk (man-pages)
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 85+ messages in thread
From: Andrey Vagin @ 2016-07-14 22:02 UTC (permalink / raw)
  To: LKML
  Cc: James Bottomley, Andrey Vagin, Serge Hallyn, Linux API,
	Linux Containers, Alexander Viro, criu-GEFAQzZX7r8dnm+yROfE0A,
	Eric W. Biederman, linux-fsdevel, Michael Kerrisk (man-pages)

Hello,

I forgot to pass --cc-cover to git send-email, so everyone in Cc
received only the cover letter. All of the patches were sent to the
mailing lists.

Sorry for the inconvenience.

On Thu, Jul 14, 2016 at 11:20 AM, Andrey Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org> wrote:
> Each namespace has an owning user namespace, and currently there is no way
> to discover these relationships.
>
> Pid and user namespaces are hierarchical. There is no way to discover
> their parent-child relationships either.
>
> Why might we want to know the relationships between namespaces?
>
> One use would be visualization, in order to understand the running system.
> Another would be to answer the question: what capability does process X have to
> perform operations on a resource governed by namespace Y?
>
> One more use case (which is usually called abnormal) is checkpoint/restart.
> In CRIU we are going to dump and restore nested namespaces.
>
> There was a discussion [1] about which interface to choose to determine
> relationships between namespaces.
>
> Eric suggested adding two ioctls [2]:
>> Grumble, Grumble.  I think this may actually be a case for creating ioctls
>> for these two cases.  Now that random nsfs file descriptors are bind
>> mountable the original reason for using proc files is not as pressing.
>>
>> One ioctl for the user namespace that owns a file descriptor.
>> One ioctl for the parent namespace of a namespace file descriptor.
>
> Here is an implementation of these ioctls.
>
> [1] https://lkml.org/lkml/2016/7/6/158
> [2] https://lkml.org/lkml/2016/7/9/101
>
> Cc: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
> Cc: James Bottomley <James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: "W. Trevor King" <wking-vJI2gpByivqcqzYg7KEe8g@public.gmane.org>
> Cc: Alexander Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
> Cc: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
>
> --
> 2.5.5
>

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
       [not found] ` <1468520419-28220-1-git-send-email-avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
@ 2016-07-14 22:02   ` Andrey Vagin
  2016-07-21 14:41   ` Michael Kerrisk (man-pages)
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 85+ messages in thread
From: Andrey Vagin @ 2016-07-14 22:02 UTC (permalink / raw)
  To: LKML
  Cc: Linux API, Linux Containers, criu, linux-fsdevel, Andrey Vagin,
	Eric W. Biederman, James Bottomley, Michael Kerrisk (man-pages),
	W. Trevor King, Alexander Viro, Serge Hallyn

Hello,

I forgot to add --cc-cover for git send-email, so everyone who is in
Cc got only a cover letter. All messages were sent in mail lists.

Sorry for inconvenience.

On Thu, Jul 14, 2016 at 11:20 AM, Andrey Vagin <avagin@openvz.org> wrote:
> Each namespace has an owning user namespace and now there is not way
> to discover these relationships.
>
> Pid and user namepaces are hierarchical. There is no way to discover
> parent-child relationships too.
>
> Why we may want to know relationships between namespaces?
>
> One use would be visualization, in order to understand the running system.
> Another would be to answer the question: what capability does process X have to
> perform operations on a resource governed by namespace Y?
>
> One more use-case (which usually called abnormal) is checkpoint/restart.
> In CRIU we age going to dump and restore nested namespaces.
>
> There [1] was a discussion about which interface to choose to determing
> relationships between namespaces.
>
> Eric suggested to add two ioctl-s [2]:
>> Grumble, Grumble.  I think this may actually a case for creating ioctls
>> for these two cases.  Now that random nsfs file descriptors are bind
>> mountable the original reason for using proc files is not as pressing.
>>
>> One ioctl for the user namespace that owns a file descriptor.
>> One ioctl for the parent namespace of a namespace file descriptor.
>
> Here is an implementaions of these ioctl-s.
>
> [1] https://lkml.org/lkml/2016/7/6/158
> [2] https://lkml.org/lkml/2016/7/9/101
>
> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
> Cc: "W. Trevor King" <wking@tremily.us>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Serge Hallyn <serge.hallyn@canonical.com>
>
> --
> 2.5.5
>

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
@ 2016-07-14 22:02   ` Andrey Vagin
  0 siblings, 0 replies; 85+ messages in thread
From: Andrey Vagin @ 2016-07-14 22:02 UTC (permalink / raw)
  To: LKML
  Cc: Linux API, Linux Containers, criu-GEFAQzZX7r8dnm+yROfE0A,
	linux-fsdevel, Andrey Vagin, Eric W. Biederman, James Bottomley,
	Michael Kerrisk (man-pages),
	W. Trevor King, Alexander Viro, Serge Hallyn

Hello,

I forgot to add --cc-cover for git send-email, so everyone who is in
Cc got only a cover letter. All messages were sent in mail lists.

Sorry for inconvenience.

On Thu, Jul 14, 2016 at 11:20 AM, Andrey Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org> wrote:
> Each namespace has an owning user namespace, and there is currently no way
> to discover these relationships.
>
> Pid and user namespaces are hierarchical. There is no way to discover
> parent-child relationships either.
>
> Why might we want to know the relationships between namespaces?
>
> One use would be visualization, in order to understand the running system.
> Another would be to answer the question: what capability does process X have to
> perform operations on a resource governed by namespace Y?
>
> One more use case (which is usually called abnormal) is checkpoint/restart.
> In CRIU we are going to dump and restore nested namespaces.
>
> There [1] was a discussion about which interface to choose to determine
> relationships between namespaces.
>
> Eric suggested adding two ioctls [2]:
>> Grumble, Grumble.  I think this may actually be a case for creating ioctls
>> for these two cases.  Now that random nsfs file descriptors are bind
>> mountable the original reason for using proc files is not as pressing.
>>
>> One ioctl for the user namespace that owns a file descriptor.
>> One ioctl for the parent namespace of a namespace file descriptor.
>
> Here is an implementation of these ioctls.
>
> [1] https://lkml.org/lkml/2016/7/6/158
> [2] https://lkml.org/lkml/2016/7/9/101
>
> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
> Cc: "W. Trevor King" <wking@tremily.us>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Serge Hallyn <serge.hallyn@canonical.com>
>
> --
> 2.5.5
>

^ permalink raw reply	[flat|nested] 85+ messages in thread

* [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
@ 2016-07-14 18:20 ` Andrey Vagin
  0 siblings, 0 replies; 85+ messages in thread
From: Andrey Vagin @ 2016-07-14 18:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-api, containers, criu, linux-fsdevel, Andrey Vagin,
	Eric W. Biederman, James Bottomley, Michael Kerrisk (man-pages),
	W. Trevor King, Alexander Viro, Serge Hallyn

Each namespace has an owning user namespace, and there is currently no way
to discover these relationships.

Pid and user namespaces are hierarchical. There is no way to discover
parent-child relationships either.

Why might we want to know the relationships between namespaces?

One use would be visualization, in order to understand the running system.
Another would be to answer the question: what capability does process X have to
perform operations on a resource governed by namespace Y?

One more use case (which is usually called abnormal) is checkpoint/restart.
In CRIU we are going to dump and restore nested namespaces.

There [1] was a discussion about which interface to choose to determine
relationships between namespaces.

Eric suggested adding two ioctls [2]:
> Grumble, Grumble.  I think this may actually be a case for creating ioctls
> for these two cases.  Now that random nsfs file descriptors are bind
> mountable the original reason for using proc files is not as pressing.
>
> One ioctl for the user namespace that owns a file descriptor.
> One ioctl for the parent namespace of a namespace file descriptor.

Here is an implementation of these ioctls.
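
As a rough illustration of how userspace might call the two ioctls
described above, here is a minimal C sketch. It assumes the request names
and numbers that were later merged into mainline <linux/nsfs.h> as
NS_GET_USERNS and NS_GET_PARENT; the patches in this RFC series may use
different names or values. The helpers ns_relative() and print_relative()
are hypothetical names made up for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * ASSUMPTION: names and values as later merged in <linux/nsfs.h>;
 * the RFC patches in this thread may differ.
 */
#ifndef NSIO
#define NSIO 0xb7
#endif
#ifndef NS_GET_USERNS
#define NS_GET_USERNS _IO(NSIO, 0x1)	/* owning user namespace */
#endif
#ifndef NS_GET_PARENT
#define NS_GET_PARENT _IO(NSIO, 0x2)	/* parent pid or user namespace */
#endif

/* Hypothetical helper: new fd for the related namespace, or -1 with errno set. */
int ns_relative(int ns_fd, unsigned long request)
{
	return ioctl(ns_fd, request);
}

/* Hypothetical helper: print the inode of the namespace related to `path`. */
int print_relative(const char *path, unsigned long request, const char *label)
{
	struct stat st;
	int fd, rel, saved;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	rel = ns_relative(fd, request);
	saved = errno;		/* close() may overwrite errno */
	close(fd);
	if (rel < 0) {
		errno = saved;
		return -1;
	}

	if (fstat(rel, &st) < 0) {
		saved = errno;
		close(rel);
		errno = saved;
		return -1;
	}
	printf("%s of %s: ino=%lu\n", label, path, (unsigned long)st.st_ino);
	close(rel);
	return 0;
}
```

For example, print_relative("/proc/self/ns/pid", NS_GET_USERNS, "owner")
would report the user namespace that owns the caller's pid namespace on a
kernel that supports the ioctl. On a kernel without it the call is expected
to fail, typically with ENOTTY, so callers should check the return value.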

[1] https://lkml.org/lkml/2016/7/6/158
[2] https://lkml.org/lkml/2016/7/9/101

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: "W. Trevor King" <wking@tremily.us>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Serge Hallyn <serge.hallyn@canonical.com>

--
2.5.5

^ permalink raw reply	[flat|nested] 85+ messages in thread

end of thread, other threads:[~2016-08-02  9:49 UTC | newest]

Thread overview: 85+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-14 18:20 [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces Andrey Vagin
  -- strict thread matches above, loose matches on Subject: below --
2016-07-14 18:20 Andrey Vagin
2016-07-14 18:20 ` Andrey Vagin
2016-07-14 22:02 ` Andrey Vagin
2016-07-14 22:02   ` Andrey Vagin
2016-07-24  5:10   ` Eric W. Biederman
2016-07-24  5:10     ` Eric W. Biederman
     [not found]     ` <87poq3liyq.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
2016-07-26  2:07       ` Andrew Vagin
2016-07-26  2:07         ` Andrew Vagin
     [not found]   ` <CANaxB-xw_xBUq=0uT14ANv-jfg2NsGaPy=jyDO9=yF03_7toSw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-07-24  5:10     ` Eric W. Biederman
2016-07-21 14:41 ` Michael Kerrisk (man-pages)
2016-07-21 14:41   ` Michael Kerrisk (man-pages)
     [not found]   ` <c9bdaf3d-ec93-d754-81ac-9f524a0d0954-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2016-07-21 21:06     ` Andrew Vagin
2016-07-21 21:06       ` Andrew Vagin
2016-07-21 21:06       ` Andrew Vagin
     [not found]       ` <20160721210650.GA10989-1ViLX0X+lBJGNQ1M2rI3KwRV3xvJKrda@public.gmane.org>
2016-07-22  6:48         ` Michael Kerrisk (man-pages)
     [not found]           ` <1515f5f2-5a49-fcab-61f4-8b627d3ba3e2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2016-07-22 18:25             ` Andrey Vagin
2016-07-22 18:25               ` Andrey Vagin
2016-07-25 11:47               ` Michael Kerrisk (man-pages)
2016-07-25 11:47                 ` Michael Kerrisk (man-pages)
     [not found]                 ` <e2811bf1-4b86-e115-bcdb-301d6f2546eb-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2016-07-25 13:18                   ` Eric W. Biederman
2016-07-25 13:18                     ` Eric W. Biederman
     [not found]                     ` <87lh0pg8jx.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
2016-07-25 14:46                       ` Michael Kerrisk (man-pages)
2016-07-25 14:46                         ` Michael Kerrisk (man-pages)
     [not found]                         ` <44ca0e41-dc92-45b1-2a6c-c41a048a072d-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2016-07-25 14:54                           ` Serge E. Hallyn
2016-07-25 14:59                           ` Eric W. Biederman
2016-07-25 14:59                             ` Eric W. Biederman
     [not found]                             ` <87r3ahepb4.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
2016-07-26  2:54                               ` Andrew Vagin
2016-07-26  2:54                                 ` Andrew Vagin
2016-07-26  8:03                                 ` Michael Kerrisk (man-pages)
2016-07-26  8:03                                   ` Michael Kerrisk (man-pages)
     [not found]                                   ` <3390535b-0660-757f-aeba-c03d936b3485-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2016-07-26 18:25                                     ` Andrew Vagin
2016-07-26 18:25                                       ` Andrew Vagin
2016-07-26 18:32                                       ` W. Trevor King
2016-07-26 18:32                                         ` W. Trevor King
     [not found]                                         ` <20160726183224.GN24913-q4NCUed9G3sTnwFZoN752g@public.gmane.org>
2016-07-26 19:11                                           ` Andrew Vagin
2016-07-26 19:11                                             ` Andrew Vagin
     [not found]                                       ` <20160726182524.GA328-1ViLX0X+lBJGNQ1M2rI3KwRV3xvJKrda@public.gmane.org>
2016-07-26 18:32                                         ` W. Trevor King
2016-07-26 19:17                                         ` Michael Kerrisk (man-pages)
2016-07-26 19:17                                       ` Michael Kerrisk (man-pages)
     [not found]                                         ` <CAKgNAkjmOu+vfiMDyeYQkkf7wQBH9PVmJ4nH2CTg43GrN-k7eA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-07-26 20:39                                           ` Andrew Vagin
2016-07-26 20:39                                             ` Andrew Vagin
2016-07-28 10:45                                             ` Michael Kerrisk (man-pages)
2016-07-28 10:45                                               ` Michael Kerrisk (man-pages)
     [not found]                                               ` <ca0787a3-b270-e962-46d1-7e63c9335a55-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2016-07-28 12:56                                                 ` Eric W. Biederman
2016-07-28 12:56                                                   ` Eric W. Biederman
2016-07-28 19:00                                                   ` Michael Kerrisk (man-pages)
2016-07-29 18:05                                                     ` Eric W. Biederman
2016-07-29 18:05                                                       ` Eric W. Biederman
     [not found]                                                       ` <87h9b8e2v7.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
2016-07-31 21:31                                                         ` Michael Kerrisk (man-pages)
2016-07-31 21:31                                                           ` Michael Kerrisk (man-pages)
2016-08-01 23:01                                                         ` Andrew Vagin
2016-08-01 23:01                                                           ` Andrew Vagin
     [not found]                                                     ` <40e35f1a-10e6-b7a5-936e-a09f008be0d0-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2016-07-29 18:05                                                       ` Eric W. Biederman
     [not found]                                                   ` <87popxkjjp.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
2016-07-28 19:00                                                     ` Michael Kerrisk (man-pages)
     [not found]                                             ` <20160726203955.GA9415-1ViLX0X+lBJGNQ1M2rI3KwRV3xvJKrda@public.gmane.org>
2016-07-28 10:45                                               ` Michael Kerrisk (man-pages)
     [not found]                                 ` <20160726025455.GC26206-1ViLX0X+lBJGNQ1M2rI3KwRV3xvJKrda@public.gmane.org>
2016-07-26  8:03                                   ` Michael Kerrisk (man-pages)
2016-07-26 19:38                                   ` Eric W. Biederman
2016-07-26 19:38                                     ` Eric W. Biederman
2016-07-25 14:54                         ` Serge E. Hallyn
2016-07-25 14:54                           ` Serge E. Hallyn
2016-07-25 15:17                           ` Eric W. Biederman
2016-07-25 15:17                             ` Eric W. Biederman
     [not found]                           ` <20160725145445.GA19879-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2016-07-25 15:17                             ` Eric W. Biederman
     [not found]               ` <CANaxB-w8H8Wo8FmtmBBZTpJX-ZDGRQx0rbm9E5c9WbduQ_Ukmw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-07-25 11:47                 ` Michael Kerrisk (man-pages)
     [not found] ` <1468520419-28220-1-git-send-email-avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2016-07-14 22:02   ` Andrey Vagin
2016-07-21 14:41   ` Michael Kerrisk (man-pages)
2016-07-23 21:14   ` W. Trevor King
2016-07-23 21:14     ` W. Trevor King
     [not found]     ` <20160723211414.GA25371-q4NCUed9G3sTnwFZoN752g@public.gmane.org>
2016-07-23 21:38       ` James Bottomley
2016-07-23 21:38         ` James Bottomley
     [not found]         ` <1469309936.2332.35.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
2016-07-23 21:58           ` W. Trevor King
2016-07-23 21:58         ` W. Trevor King
2016-07-23 21:58           ` W. Trevor King
2016-07-23 21:56           ` Eric W. Biederman
2016-07-23 21:56             ` Eric W. Biederman
     [not found]             ` <87mvl8nhlv.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
2016-07-23 22:34               ` W. Trevor King
2016-07-23 22:34                 ` W. Trevor King
     [not found]                 ` <20160723223448.GP24913-q4NCUed9G3sTnwFZoN752g@public.gmane.org>
2016-07-24  4:51                   ` Eric W. Biederman
2016-07-24  4:51                     ` Eric W. Biederman
     [not found]           ` <20160723215802.GO24913-q4NCUed9G3sTnwFZoN752g@public.gmane.org>
2016-07-23 21:56             ` Eric W. Biederman
2016-08-01 18:20   ` Alban Crequy
2016-08-01 18:20     ` Alban Crequy
     [not found]     ` <CAMXgnP6j+rTeb5XJgoPV20y8puGyVm=9O9gdg9Sah4DuF5qm9w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-08-01 23:32       ` Andrew Vagin
2016-08-01 23:32         ` Andrew Vagin
