* [RFC]Pid conversion between pid namespace
@ 2014-07-03 12:18 ` chenhanxiao
  0 siblings, 0 replies; 30+ messages in thread
From: chenhanxiao-BthXqXjhjHXQFUHtdCDX3A @ 2014-07-03 12:18 UTC (permalink / raw)
  To: Eric W. Biederman
	(ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org),
	Serge Hallyn
	(serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org),
	Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org),
	Richard Weinberger (richard-/L3Ra7n9ekc@public.gmane.org),
	Pavel Emelyanov (xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org),
	Vasily Kulikov (segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org),
	Gotou, Yasunori,
	'Daniel P. Berrange
	(berrange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org)'
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

Hi,

We have had some discussions on how to carry out
pid conversion between pid namespaces, via either
a syscall[1] or procfs[2].

Pavel suggested a syscall that translates
(ID, NS1, NS2) into (ID).

Serge suggested a syscall:
pid_t getnspid(pid_t query_pid, pid_t observer_pid).


Eric and Richard suggested a procfs solution is
more appropriate.

Oleg suggested that we should expand /proc/pid/status
to report this kind of information.

And Richard suggested adding a directory like
/proc/<pidX>/ns/proc/ which would contain everything
from /proc/<pidX inside the namespace>/.

Since procfs provides a more user-friendly interface,
how about exposing all sets of tgid, pid, pgid and sid
by expanding /proc/PID/status in procfs?
We could also expose the ns hierarchy under /proc,
which could serve as another reference.

Ex:
    init_pid_ns    ns1         ns2
t1  2
t2   `- 3          1 
t3       `- 4      `- 5        1

From /proc/t3/status we could get:
NSpid: 4 5 1
So we know that pid 1 in the container is pid 4 in the init ns.
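
To make the NSpid idea concrete, here is a minimal userspace sketch (Python, not part of any patch) of how a tool might read such a line. It assumes the field is emitted exactly as proposed in this message, with one pid per namespace level and the outermost namespace first. The function name parse_nspid and the sample text are hypothetical.

```python
def parse_nspid(status_text):
    """Return the NSpid values from /proc/<pid>/status-style text.

    Assumes the layout proposed in this thread: "NSpid:" followed by one
    pid per namespace level, outermost (init) namespace first.
    Returns None when the field is absent.
    """
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "NSpid":
            return [int(field) for field in value.split()]
    return None


# Hypothetical status excerpt for task t3 from the diagram above.
sample = "Name:\tt3\nNSpid:\t4\t5\t1\n"
pids = parse_nspid(sample)
print("pid in init ns:", pids[0], "pid in innermost ns:", pids[-1])
```

With the t3 example, the first element is the pid seen from the init ns and the last element is the pid seen inside the innermost container.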

And we could get the ns hierarchy under /proc/ns_hierarchy like:
init_ns->ns1->ns2		(as the result of readlink)
         ->ns3
So we know that t3 is in ns2, along with its hierarchy.

How do these ideas look?
Any comments would be appreciated.

Thanks,
- Chen


[1] syscall
http://lwn.net/Articles/602987/

[2] procfs
http://www.spinics.net/lists/kernel/msg1751688.html

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC]Pid conversion between pid namespace
  2014-07-03 12:18 ` chenhanxiao
@ 2014-07-04  5:34     ` Yasunori Goto
  -1 siblings, 0 replies; 30+ messages in thread
From: Yasunori Goto @ 2014-07-04  5:34 UTC (permalink / raw)
  To: "Chen, Hanxiao/陈 晗霄"
  Cc: Richard Weinberger (richard-/L3Ra7n9ekc@public.gmane.org),
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Serge Hallyn
	(serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org),
	Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org),
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Eric W. Biederman
	(ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org),
	Vasily Kulikov (segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org)

Chen-san,

I would like to recommend that you summarize pros/cons for all ideas so far.

For example,

---------
A) make a new system call for translation

   A-1) systemcall(ID, NS1, NS2) into (ID).
     pros: 
        - foo
        - baa

     cons: 
        - hoge
        - hogehogehoge

   A-2) pid_t getnspid(pid_t query_pid, pid_t observer_pid) 
      (ditto)


B) make/change proc file/directories
   B-1) expand /proc/pid/status
       (ditto)
   
   B-2) /proc/<pidX>/ns/proc/ which would contain everything
         from /proc/<pidX inside the namespace>/.
       (ditto)


------

Please make clear the good and bad points of each idea in the format above, e.g.:
  - Is it hard to keep compatibility?
  - Is it hard for an administrator/programmer to understand?
  - Is it difficult to show for "nested containers"?
  - Is a userland tool necessary?
  - Are there any other problems?

I hope this leads to a good discussion.

Thanks,

> Hi,
> 
> We had some discussions on how to carry out
> pid conversion between pid namespace via:
> syscall[1] and procfs[2].
> 
> Pavel suggested that a syscall like
> (ID, NS1, NS2) into (ID).
> 
> Serge suggested that a syscall 
> pid_t getnspid(pid_t query_pid, pid_t observer_pid).
> 
> 
> Eric and Richard suggested a procfs solution is
> more appropriate.
> 
> Oleg suggested that we should expand /proc/pid/status
> to report this kind of information.
> 
> And Richard suggested adding a directory like
> /proc/<pidX>/ns/proc/ which would contain everything
> from /proc/<pidX inside the namespace>/.
> 
> As procfs provided a more user friendly interface,
> how about expose all sets of tgid, pid, pgid, sid 
> by expanding /proc/PID/status in procfs?
> And we could also expose ns hierarchy under /proc,
> which could be another reference.
> 
> Ex:
>     init_pid_ns    ns1         ns2
> t1  2
> t2   `- 3          1 
> t3       `- 4      `- 5        1
> 
> We could get in /proc/t3/status:
> NSpid: 4 5 1
> We knew that pid 1 in container is pid 4 in init ns.
> 
> And we could get ns hierarchy under /proc/ns_hierarchy like:
> init_ns->ns1->ns2		(as the result of readlink)
>          ->ns3
> We knew that t3 in ns2, and its hierarchy.
> 
> How these ideas looks like?
> Any comments would be appreciated.
> 
> Thanks,
> - Chen
> 
> 
> a) syscall
> http://lwn.net/Articles/602987/
> 
> b) procfs
> http://www.spinics.net/lists/kernel/msg1751688.html
> 

-- 
Yasunori Goto <y-goto-+CUm20s59erQFUHtdCDX3A@public.gmane.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [RFC]Pid conversion between pid namespace
  2014-07-03 12:18 ` chenhanxiao
@ 2014-07-09 10:34     ` chenhanxiao
  -1 siblings, 0 replies; 30+ messages in thread
From: chenhanxiao-BthXqXjhjHXQFUHtdCDX3A @ 2014-07-09 10:34 UTC (permalink / raw)
  To: Eric W. Biederman
	(ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org),
	Serge Hallyn
	(serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org),
	Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org),
	Richard Weinberger (richard-/L3Ra7n9ekc@public.gmane.org),
	Pavel Emelyanov (xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org),
	Vasily Kulikov (segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org),
	Gotou, Yasunori,
	'Daniel P. Berrange
	(berrange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org)'
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

Hi,

Let me summarize our discussion of ID conversion in terms of pros/cons:

A) make a new system call for translation
    A-1) systemcall(ID, NS1, NS2) into (ID).
    pros:
        - has a reference ns (NS2)
          We could get any lower-level ID directly.

    cons:
        - lacks hierarchy information.
          CRIU needs hierarchy info for checkpoint/restore in nested containers.
        - not easy to debug.
          A lot of tools/libs would also need to be modified.

    A-2) syscall pid_t getnspid(pid_t query_pid, pid_t observer_pid)
    pros:
        - does not depend on the ns procfs, easy to use.
          We would not need the ns procfs to be mounted.

    cons:
        - may find multiple results in nested ns.
          We want the new API to give us one exact answer.
          If getnspid returned more than one result, it would cause trouble for admins:
          they would have to make another decision.
          Alternatively, we could make translation at the deepest level a prerequisite.

        - based on the current pidns, no reference ns.

B) make/change proc files/directories
    B-1) expand /proc/pid/status
    pros:
        - easy to use and to debug
        - the interface already exists in the kernel

    cons:
        - based on the current ns
          For a middle level, we would have to make another decision.
        - does not have hierarchy info.

    B-2) /proc/<pidX>/ns/proc/ which would contain everything
    pros:
        - has enough info from /proc in the container

    cons:
        - requirements unclear.
          We need more discussion to decide which items should not be exposed.
        - does not have hierarchy info.


How about doing this in two steps:

C)  1. expose all sets of pid, pgid, sid and tgid
       via an expanded /proc/PID/status
       We could get translated IDs from the container like:
           NStgid:	16465 	5 	1
           NSpid:	16465 	5 	1
           NSpgid:	16465 	5 	1
           NSsid:	16423 	1 	0
       (a set of IDs with 3 levels of ns)

    2. add hierarchy info under /proc
       We lack a method of getting hierarchy info, which would be useful:
       with it we could know the relationships between namespaces.
       How about adding a new proc file directly under /proc
       to show the hierarchy, in the style of readlink output:
           pid:[4026531836]-> [4026532390] -> [4026532484]
           pid:[4026531836]-> [4026532491]
       (a 3-level pid ns and a 2-level pid ns)
Any comments would be appreciated.

Thanks,
- Chen

> -----Original Message-----
> Subject: [RFC]Pid conversion between pid namespace
> 
> Hi,
> 
> We had some discussions on how to carry out
> pid conversion between pid namespace via:
> syscall[1] and procfs[2].
> 
> Pavel suggested that a syscall like
> (ID, NS1, NS2) into (ID).
> 
> Serge suggested that a syscall
> pid_t getnspid(pid_t query_pid, pid_t observer_pid).
> 
> 
> Eric and Richard suggested a procfs solution is
> more appropriate.
> 
> Oleg suggested that we should expand /proc/pid/status
> to report this kind of information.
> 
> And Richard suggested adding a directory like
> /proc/<pidX>/ns/proc/ which would contain everything
> from /proc/<pidX inside the namespace>/.
> 
> As procfs provided a more user friendly interface,
> how about expose all sets of tgid, pid, pgid, sid
> by expanding /proc/PID/status in procfs?
> And we could also expose ns hierarchy under /proc,
> which could be another reference.
> 
> Ex:
>     init_pid_ns    ns1         ns2
> t1  2
> t2   `- 3          1
> t3       `- 4      `- 5        1
> 
> We could get in /proc/t3/status:
> NSpid: 4 5 1
> We knew that pid 1 in container is pid 4 in init ns.
> 
> And we could get ns hierarchy under /proc/ns_hierarchy like:
> init_ns->ns1->ns2		(as the result of readlink)
>          ->ns3
> We knew that t3 in ns2, and its hierarchy.
> 
> How these ideas looks like?
> Any comments would be appreciated.
> 
> Thanks,
> - Chen
> 
> 
> a) syscall
> http://lwn.net/Articles/602987/
> 
> b) procfs
> http://www.spinics.net/lists/kernel/msg1751688.html
> 
> _______________________________________________
> Containers mailing list
> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC]Pid conversion between pid namespace
  2014-07-09 10:34     ` chenhanxiao
@ 2014-07-15  4:16         ` Serge Hallyn
  -1 siblings, 0 replies; 30+ messages in thread
From: Serge Hallyn @ 2014-07-15  4:16 UTC (permalink / raw)
  To: chenhanxiao-BthXqXjhjHXQFUHtdCDX3A
  Cc: Richard Weinberger (richard-/L3Ra7n9ekc@public.gmane.org),
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org),
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Eric W. Biederman
	(ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org),
	Vasily Kulikov (segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org)

Quoting chenhanxiao-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org (chenhanxiao-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org):
> Hi,
> 
> Let me summarize our discussions of ID conversion by pros/cons: 
> 
> A) make new system call for translation	
>     A-1) systemcall(ID, NS1, NS2) into (ID).
>     pros:
>         - has a reference ns(NS2)
>           We could get any lower level ID directly.
> 		 
>     cons:
>         - lack of hierarchy information. 
>           CRIU need hierarchy info for checkpoint/restore in nested containers.
>         - not easy for debug. 
>           And a lot of tools/libs need be modified.
> 
>     A-2) syscall pid_t getnspid(pid_t query_pid, pid_t observer_pid)
>     pros:
>         - ns procfs free, easy to use.
>         We could get rid of mounted ns procfs.
> 
>     cons:
>         - may find multiple results in nested ns.
>           We wished the new API could tell us the exact answer.
>           But if getnspid return more than one results will bring trouble to admins,

(See below for more, but) the question being posed to getnspid has precisely
one answer.

>           they had to make another decision.
>           Or we marked the deepest level for translation as prerequisite.
> 
>         -based on current pidns, no reference ns.

Hm, no.  The intent here was that

	observer_pid would be in current ns
	query_pid would be in observer_pid's ns.

So this would be ideal for "I got a pid in a logfile created by rsyslog in
a nested container, what is the logged pid in my pidns?"

Taking a set of tasks (like a container with nesting) and building a tree
of all pids shouldn't be too difficult either.  Start with the init pid,
call getnspid($pid, $init_pid) for every $pid in the container;  to figure
out whether any $pid is itself a nested init_pid, we can compare the
/proc/$$/ns/pid, as well as look at getnspid($pid, $pid).
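
The intended getnspid semantics can be modeled in userspace as a rough sketch (Python, purely illustrative, not kernel code). The task table reuses the t1/t2/t3 example from the original RFC. The data layout, the helper pid_in, and the current_ns parameter are assumptions made for this model.

```python
# Toy model: each task lists the namespaces it is visible in (outermost
# first) and its pid in each of them. Values follow the RFC's t1/t2/t3 diagram.
TASKS = [
    {"ns": ["init"], "pids": [2]},                      # t1
    {"ns": ["init", "ns1"], "pids": [3, 1]},            # t2
    {"ns": ["init", "ns1", "ns2"], "pids": [4, 5, 1]},  # t3
]


def pid_in(task, ns):
    """Return the task's pid as seen from namespace ns, or None if not visible."""
    if ns in task["ns"]:
        return task["pids"][task["ns"].index(ns)]
    return None


def getnspid(query_pid, observer_pid, current_ns="init", tasks=TASKS):
    """Translate query_pid, valid in the observer's own namespace, into current_ns.

    observer_pid is interpreted in current_ns. Returns None if either
    lookup fails.
    """
    observer = next((t for t in tasks if pid_in(t, current_ns) == observer_pid), None)
    if observer is None:
        return None
    observer_ns = observer["ns"][-1]  # the namespace the observer itself lives in
    target = next((t for t in tasks if pid_in(t, observer_ns) == query_pid), None)
    return None if target is None else pid_in(target, current_ns)


# "pid 5 as logged by t2 (pid 3 here)" resolves to one task in the init ns.
print(getnspid(5, 3))
```

Because a pid is unique within one namespace, each (query_pid, observer_pid) pair selects at most one task in this model, which is the sense in which the question has precisely one answer.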

> B) make/change proc file/directories
> 	B-1) expand /proc/pid/status
> 	pros:
>         - easy to use and to debug
>         - already had existed interface in kernel
>         
> 	cons:
>         - based on current ns
>           for middle level, we had to make another decision.
>         - do not have hierarchy info.
> 
> 	B-2) /proc/<pidX>/ns/proc/ which would contain everything
> 	pros:
>         - have enough info from /proc in container
> 
> 	cons:
>         - Requirements unclear.
>           We need more discussion to decide which items should not be exposed.
>         - do not have hierarchy info.
> 
> 
> How about do these things in two steps: 
> 
> C)  1. expose all sets of pid, pgid, sid and tgid
> via expanded /proc/PID/status
>       We could get translated IDs from container like:
>     NStgid:	16465 	5 	1 
>     NSpid:	16465 	5 	1 
>     NSpgid:	16465 	5 	1 
>     NSsid:	16423 	1 	0
>     (a set of IDs with 3 level of ns)
> 
>     2. add hierarchy info under /proc
>       We lacked of method of getting hierarchy info, which is useful.
>       Then we could know the relationship of ns.
>       How about adding a new proc file just under /proc
>       to show the hierarchy like readlink did:
> 	  pid:[4026531836]-> [4026532390] -> [4026532484]
>       pid:[4026531836]-> [4026532491]
>       (A 3 level pid and 2 level pid_
> 
> Any comments would be appreciated.
> 
> Thanks,
> - Chen
> 
> > -----Original Message-----
> > Subject: [RFC]Pid conversion between pid namespace
> > 
> > Hi,
> > 
> > We had some discussions on how to carry out
> > pid conversion between pid namespace via:
> > syscall[1] and procfs[2].
> > 
> > Pavel suggested that a syscall like
> > (ID, NS1, NS2) into (ID).
> > 
> > Serge suggested that a syscall
> > pid_t getnspid(pid_t query_pid, pid_t observer_pid).
> > 
> > 
> > Eric and Richard suggested a procfs solution is
> > more appropriate.
> > 
> > Oleg suggested that we should expand /proc/pid/status
> > to report this kind of information.
> > 
> > And Richard suggested adding a directory like
> > /proc/<pidX>/ns/proc/ which would contain everything
> > from /proc/<pidX inside the namespace>/.
> > 
> > As procfs provided a more user friendly interface,
> > how about expose all sets of tgid, pid, pgid, sid
> > by expanding /proc/PID/status in procfs?
> > And we could also expose ns hierarchy under /proc,
> > which could be another reference.
> > 
> > Ex:
> >     init_pid_ns    ns1         ns2
> > t1  2
> > t2   `- 3          1
> > t3       `- 4      `- 5        1
> > 
> > We could get in /proc/t3/status:
> > NSpid: 4 5 1
> > We knew that pid 1 in container is pid 4 in init ns.
> > 
> > And we could get ns hierarchy under /proc/ns_hierarchy like:
> > init_ns->ns1->ns2		(as the result of readlink)
> >          ->ns3
> > We knew that t3 in ns2, and its hierarchy.
> > 
> > How these ideas looks like?
> > Any comments would be appreciated.
> > 
> > Thanks,
> > - Chen
> > 
> > 
> > a) syscall
> > http://lwn.net/Articles/602987/
> > 
> > b) procfs
> > http://www.spinics.net/lists/kernel/msg1751688.html
> > 
> > _______________________________________________
> > Containers mailing list
> > Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> > https://lists.linuxfoundation.org/mailman/listinfo/containers
> _______________________________________________
> Containers mailing list
> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC]Pid conversion between pid namespace
@ 2014-07-15  4:16         ` Serge Hallyn
  0 siblings, 0 replies; 30+ messages in thread
From: Serge Hallyn @ 2014-07-15  4:16 UTC (permalink / raw)
  To: chenhanxiao
  Cc: Eric W. Biederman (ebiederm@xmission.com),
	Oleg Nesterov (oleg@redhat.com),
	Richard Weinberger (richard@nod.at),
	Pavel Emelyanov (xemul@parallels.com),
	Vasily Kulikov (segoon@openwall.com),
	Gotou, Yasunori,
	'Daniel P. Berrange (berrange@redhat.com)',
	containers, linux-kernel

Quoting chenhanxiao@cn.fujitsu.com (chenhanxiao@cn.fujitsu.com):
> Hi,
> 
> Let me summarize our discussions of ID conversion by pros/cons: 
> 
> A) make new system call for translation	
>     A-1) systemcall(ID, NS1, NS2) into (ID).
>     pros:
>         - has a reference ns(NS2)
>           We could get any lower level ID directly.
> 		 
>     cons:
>         - lack of hierarchy information. 
>           CRIU need hierarchy info for checkpoint/restore in nested containers.
>         - not easy for debug. 
>           And a lot of tools/libs need be modified.
> 
>     A-2) syscall pid_t getnspid(pid_t query_pid, pid_t observer_pid)
>     pros:
>         - ns procfs free, easy to use.
>         We could get rid of mounted ns procfs.
> 
>     cons:
>         - may find multiple results in nested ns.
>           We wished the new API could tell us the exact answer.
>           But if getnspid return more than one results will bring trouble to admins,

(See below for more, but) the question being posed to getnspid has precisely
one answer.

>           they had to make another decision.
>           Or we marked the deepest level for translation as prerequisite.
> 
>         -based on current pidns, no reference ns.

Hm, no.  The intent here was that

	observer_pid would be in current ns
	query_pid would be in observer_pid's ns.

So this would be ideal for "I got a pid in a logfile created by rsyslog in
a nested contaner, what is the logged pid in my pidns."

Taking a set of tasks (like a container with nesting) and bulding a tree
of all pids shouldn't be too difficult either.  Start with the init pid,
call getnspid($pid, $init_pid) for every $pid in the container;  to figure
out whether any $pid is itself a nested init_pid, we can compare the
/proc/$$/ns/pid, as well as look at getnspid($pid, $pid).

> B) make/change proc file/directories
> 	B-1) expand /proc/pid/status
> 	pros:
>         - easy to use and to debug
>         - already had existed interface in kernel
>         
> 	cons:
>         - based on current ns
>           for middle level, we had to make another decision.
>         - do not have hierarchy info.
> 
> 	B-2) /proc/<pidX>/ns/proc/ which would contain everything
> 	pros:
>         - have enough info from /proc in container
> 
> 	cons:
>         - Requirements unclear.
>           We need more discussion to decide which items should not be exposed.
>         - do not have hierarchy info.
> 
> 
> How about do these things in two steps: 
> 
> C)  1. expose all sets of pid, pgid, sid and tgid
> via expanded /proc/PID/status
>       We could get translated IDs from container like:
>     NStgid:	16465 	5 	1 
>     NSpid:	16465 	5 	1 
>     NSpgid:	16465 	5 	1 
>     NSsid:	16423 	1 	0
>     (a set of IDs with 3 level of ns)
> 
>     2. add hierarchy info under /proc
>       We lacked of method of getting hierarchy info, which is useful.
>       Then we could know the relationship of ns.
>       How about adding a new proc file just under /proc
>       to show the hierarchy like readlink did:
> 	  pid:[4026531836]-> [4026532390] -> [4026532484]
>       pid:[4026531836]-> [4026532491]
>       (A 3 level pid and 2 level pid_
> 
> Any comments would be appreciated.
> 
> Thanks,
> - Chen
> 
> > -----Original Message-----
> > Subject: [RFC]Pid conversion between pid namespace
> > 
> > Hi,
> > 
> > We had some discussions on how to carry out
> > pid conversion between pid namespace via:
> > syscall[1] and procfs[2].
> > 
> > Pavel suggested that a syscall like
> > (ID, NS1, NS2) into (ID).
> > 
> > Serge suggested that a syscall
> > pid_t getnspid(pid_t query_pid, pid_t observer_pid).
> > 
> > 
> > Eric and Richard suggested a procfs solution is
> > more appropriate.
> > 
> > Oleg suggested that we should expand /proc/pid/status
> > to report this kind of information.
> > 
> > And Richard suggested adding a directory like
> > /proc/<pidX>/ns/proc/ which would contain everything
> > from /proc/<pidX inside the namespace>/.
> > 
> > As procfs provided a more user friendly interface,
> > how about expose all sets of tgid, pid, pgid, sid
> > by expanding /proc/PID/status in procfs?
> > And we could also expose ns hierarchy under /proc,
> > which could be another reference.
> > 
> > Ex:
> >     init_pid_ns    ns1         ns2
> > t1  2
> > t2   `- 3          1
> > t3       `- 4      `- 5        1
> > 
> > We could get in /proc/t3/status:
> > NSpid: 4 5 1
> > We knew that pid 1 in container is pid 4 in init ns.
> > 
> > And we could get ns hierarchy under /proc/ns_hierarchy like:
> > init_ns->ns1->ns2		(as the result of readlink)
> >          ->ns3
> > We knew that t3 in ns2, and its hierarchy.
> > 
> > How these ideas looks like?
> > Any comments would be appreciated.
> > 
> > Thanks,
> > - Chen
> > 
> > 
> > a) syscall
> > http://lwn.net/Articles/602987/
> > 
> > b) procfs
> > http://www.spinics.net/lists/kernel/msg1751688.html
> > 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [RFC]Pid conversion between pid namespace
  2014-07-15  4:16         ` Serge Hallyn
@ 2014-07-21 10:47           ` chenhanxiao
  -1 siblings, 0 replies; 30+ messages in thread
From: chenhanxiao-BthXqXjhjHXQFUHtdCDX3A @ 2014-07-21 10:47 UTC (permalink / raw)
  To: Serge Hallyn
  Cc: Richard Weinberger (richard-/L3Ra7n9ekc@public.gmane.org),
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org),
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Eric W. Biederman
	(ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org),
	Vasily Kulikov (segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org)

Hi,

> -----Original Message-----
> From: Serge Hallyn [mailto:serge.hallyn@ubuntu.com]
> Sent: Tuesday, July 15, 2014 12:16 PM
> To: Chen, Hanxiao/陈 晗霄
> Subject: Re: [RFC]Pid conversion between pid namespace
> >     A-2) syscall pid_t getnspid(pid_t query_pid, pid_t observer_pid)
> >     pros:
> >         - ns procfs free, easy to use.
> >         We could get rid of mounted ns procfs.
> >
> >     cons:
> >         - may find multiple results in nested ns.
> >           We wished the new API could tell us the exact answer.
> >           But if getnspid return more than one results will bring trouble to admins,
> 
> (See below for more, but) the question being posed to getnspid has precisely
> one answer.
> 
> >           they had to make another decision.
> >           Or we marked the deepest level for translation as prerequisite.
> >
> >         -based on current pidns, no reference ns.
> 
> Hm, no.  The intent here was that
> 
> 	observer_pid would be in current ns
> 	query_pid would be in observer_pid's ns.
> 
> So this would be ideal for "I got a pid in a logfile created by rsyslog in
> a nested container, what is the logged pid in my pidns."
> 
> Taking a set of tasks (like a container with nesting) and building a tree
> of all pids shouldn't be too difficult either.  Start with the init pid,
> call getnspid($pid, $init_pid) for every $pid in the container;  to figure
> out whether any $pid is itself a nested init_pid, we can compare the
> /proc/$$/ns/pid, as well as look at getnspid($pid, $pid).

I'm a little confused by this section:

Ex:
    init_pid_ns    ns1         ns2
t1  2
t2   `- 3          1 
t3       `- 4      `- 5        1
t4           `-6       `-8      `-9
t5             `-10       `-9      `-10

For getnspid($pid, $init_pid),
does init_pid mean the container's init pid, such as 3 for t2?

In nested containers, does this syscall work like:
getnspid(9, 4) -> (6, 8, 9)
with 9 being a pid in ns2 and 4 being t3's pid in init_pid_ns (the current ns)?

And for:
getnspid($pid, $pid)
if the pid in the host and the pid in the container happen to be the same,
e.g. getnspid(10,10) for t5, it may not work.

Thanks,
- Chen
> 
> > B) make/change proc file/directories
> > 	B-1) expand /proc/pid/status
> > 	pros:
> >         - easy to use and to debug
> >         - already had existed interface in kernel
> >
> > 	cons:
> >         - based on current ns
> >           for middle level, we had to make another decision.
> >         - do not have hierarchy info.
> >
> > 	B-2) /proc/<pidX>/ns/proc/ which would contain everything
> > 	pros:
> >         - have enough info from /proc in container
> >
> > 	cons:
> >         - Requirements unclear.
> >           We need more discussion to decide which items should not be exposed.
> >         - do not have hierarchy info.
> >
> >
> > How about do these things in two steps:
> >
> > C)  1. expose all sets of pid, pgid, sid and tgid
> > via expanded /proc/PID/status
> >       We could get translated IDs from container like:
> >     NStgid:	16465 	5 	1
> >     NSpid:	16465 	5 	1
> >     NSpgid:	16465 	5 	1
> >     NSsid:	16423 	1 	0
> >     (a set of IDs with 3 level of ns)
> >
> >     2. add hierarchy info under /proc
> >       We lacked of method of getting hierarchy info, which is useful.
> >       Then we could know the relationship of ns.
> >       How about adding a new proc file just under /proc
> >       to show the hierarchy like readlink did:
> > 	  pid:[4026531836]-> [4026532390] -> [4026532484]
> >       pid:[4026531836]-> [4026532491]
> >       (A 3 level pid and 2 level pid_
> >
> > Any comments would be appreciated.
> >
> > Thanks,
> > - Chen
> >
> > > -----Original Message-----
> > > Subject: [RFC]Pid conversion between pid namespace
> > >
> > > Hi,
> > >
> > > We had some discussions on how to carry out
> > > pid conversion between pid namespace via:
> > > syscall[1] and procfs[2].
> > >
> > > Pavel suggested that a syscall like
> > > (ID, NS1, NS2) into (ID).
> > >
> > > Serge suggested that a syscall
> > > pid_t getnspid(pid_t query_pid, pid_t observer_pid).
> > >
> > >
> > > Eric and Richard suggested a procfs solution is
> > > more appropriate.
> > >
> > > Oleg suggested that we should expand /proc/pid/status
> > > to report this kind of information.
> > >
> > > And Richard suggested adding a directory like
> > > /proc/<pidX>/ns/proc/ which would contain everything
> > > from /proc/<pidX inside the namespace>/.
> > >
> > > As procfs provided a more user friendly interface,
> > > how about expose all sets of tgid, pid, pgid, sid
> > > by expanding /proc/PID/status in procfs?
> > > And we could also expose ns hierarchy under /proc,
> > > which could be another reference.
> > >
> > > Ex:
> > >     init_pid_ns    ns1         ns2
> > > t1  2
> > > t2   `- 3          1
> > > t3       `- 4      `- 5        1
> > >
> > > We could get in /proc/t3/status:
> > > NSpid: 4 5 1
> > > We knew that pid 1 in container is pid 4 in init ns.
> > >
> > > And we could get ns hierarchy under /proc/ns_hierarchy like:
> > > init_ns->ns1->ns2		(as the result of readlink)
> > >          ->ns3
> > > We knew that t3 in ns2, and its hierarchy.
> > >
> > > How these ideas looks like?
> > > Any comments would be appreciated.
> > >
> > > Thanks,
> > > - Chen
> > >
> > >
> > > a) syscall
> > > http://lwn.net/Articles/602987/
> > >
> > > b) procfs
> > > http://www.spinics.net/lists/kernel/msg1751688.html
> > >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [RFC]Pid conversion between pid namespace
  2014-07-09 10:34     ` chenhanxiao
@ 2014-07-25 10:01         ` chenhanxiao
  -1 siblings, 0 replies; 30+ messages in thread
From: chenhanxiao-BthXqXjhjHXQFUHtdCDX3A @ 2014-07-25 10:01 UTC (permalink / raw)
  To: Eric W. Biederman
	(ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org),
	Serge Hallyn
	(serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org),
	Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org),
	Richard Weinberger (richard-/L3Ra7n9ekc@public.gmane.org),
	Pavel Emelyanov (xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org),
	Vasily Kulikov (segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org),
	Gotou, Yasunori,
	'Daniel P. Berrange
	(berrange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org)'
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="gb2312", Size: 7462 bytes --]

Hi,

We discussed two ways of doing pid conversion:
a syscall and procfs.

Both of them can do the pid translation job.
But for the ns hierarchy, a syscall like

pid_t* getnspid(pid_t query_pid, pid_t observer_pid)
or
pid_t getnspid(pid_t query_pid, int query_fd, int ref_fd)

does not work: we would know which ns a pid lives in, but
not how the namespaces relate to each other.
For getting the entire set of pids, either approach works.

So I think using procfs is the better way.

Ex:
    init_pid_ns     ns1         ns2
t1  2
t2   `- 3           1 
t3       `- 4       `- 5        1
t4           `-6        `-8      `-9
t5             `-10        `-9      `-10

1. How procfs would work:
a) add an nspid hierarchy under /proc/, like:
[root@localhost proc]# tree /proc/nspid
/proc/nspid
├── ns0
│    └── ns1
│       ├── ns2
│       │   └── pid -> /proc/9/ns
│       └── pid -> /proc/4/ns
└── pid -> /proc/1/ns

We create a directory for each ns and add a link to the first process of that ns.

b) expose all sets of pid, pgid, sid and tgid
via an expanded /proc/PID/status
      We could get the translated IDs of a task in a container like:
    NStgid:	6 	8 	9
    NSpid:	6 	8 	9
    NSpgid:	6 	8 	9
    NSsid:	6 	1 	0
    (a set of IDs across 3 levels of ns)
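
To illustrate how a tool might consume these lines, here is a minimal
Python sketch. It only parses sample text in the format proposed above;
the function name parse_ns_ids and the SAMPLE_STATUS variable are
assumptions made for this example, not part of any existing interface.

```python
# Sample text in the format proposed above (the values shown for t4).
SAMPLE_STATUS = """\
NStgid:\t6 \t8 \t9
NSpid:\t6 \t8 \t9
NSpgid:\t6 \t8 \t9
NSsid:\t6 \t1 \t0
"""


def parse_ns_ids(status_text):
    """Return {field: [id per ns level, outermost first]} for the NS* lines."""
    result = {}
    for line in status_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name.startswith("NS"):
            # Fields are separated by tabs and/or spaces; split() handles both.
            result[name] = [int(tok) for tok in rest.split()]
    return result


ids = parse_ns_ids(SAMPLE_STATUS)
print(ids["NSpid"])      # pid of the task at every ns level, outermost first
print(ids["NSpid"][-1])  # pid as seen inside the task's own (innermost) ns
```

The leftmost value is the ID in the outermost visible ns and the
rightmost one is the ID in the task's own ns, which is how the example
table above is laid out.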

2. Advantages of the procfs solution
a) easy to use:
With the syscalls we would call:
getnspid(6, 10) -> (10, 9, 10)
or
getnspid(10, ns1_fd, ns0_fd) -> 9
getnspid(10, ns2_fd, ns0_fd) -> 10

With procfs we could get the same information by:
cat /proc/10/status | grep NSpid:
NSpid:	10 	9 	10
...

b) hierarchy info:
We could not get the ns hierarchy info with just one syscall.
If we had to, it would complicate the interface.

We could check whether two processes are related
via procfs:
readlink /proc/PID1/ns/pid -> aaa
readlink /proc/PID2/ns/pid -> bbb

Then we could check /proc/nspid/nsX/nsY/nsZ
and find out their relationship.
Ex:
We know t4 lives in ns2,
readlink /proc/t4/ns/pid -> AAA
then we look in /proc/nspid/ and find the same inum AAA under
/proc/nspid/ns0/ns1/ns2
Then, matching that path against t4's NSpid line (6 8 9), we know
that t4 has pid 9 in ns2 and pid 8 in ns1.
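
The lookup described above could be sketched roughly as follows. This
is only a toy model in Python: the /proc/nspid tree does not exist, so
it is represented here as a nested dict, the inode numbers are sample
values reused from earlier in this thread, and the names NSPID_TREE,
find_ns_path and is_ancestor are made up for the illustration.

```python
# Hypothetical in-memory model of the proposed /proc/nspid tree. Each node
# carries the inum that readlink on its "pid" link would be expected to show.
NSPID_TREE = {
    "name": "ns0", "inum": 4026531836,
    "children": [
        {"name": "ns1", "inum": 4026532390,
         "children": [
             {"name": "ns2", "inum": 4026532484, "children": []},
         ]},
    ],
}


def find_ns_path(node, inum, prefix=()):
    """Depth-first search; return the chain of ns names down to inum, or None."""
    path = prefix + (node["name"],)
    if node["inum"] == inum:
        return path
    for child in node["children"]:
        found = find_ns_path(child, inum, path)
        if found is not None:
            return found
    return None


def is_ancestor(tree, outer_inum, inner_inum):
    """True if the ns with outer_inum contains (or is) the ns with inner_inum."""
    outer = find_ns_path(tree, outer_inum)
    inner = find_ns_path(tree, inner_inum)
    if outer is None or inner is None:
        return False
    return inner[:len(outer)] == outer


# inum obtained from "readlink /proc/t4/ns/pid" in this made-up setup
print("/".join(find_ns_path(NSPID_TREE, 4026532484)))
print(is_ancestor(NSPID_TREE, 4026532390, 4026532484))
```

The length of the returned path gives the nesting depth, which tells us
which column of the NSpid line belongs to which ns.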

Any comments would be warmly welcomed!

Thanks,
- Chen

> -----Original Message-----
> From: containers-bounces@lists.linux-foundation.org
> [mailto:containers-bounces@lists.linux-foundation.org] On Behalf Of
> chenhanxiao@cn.fujitsu.com
> Sent: Wednesday, July 09, 2014 6:34 PM
> To: Eric W. Biederman (ebiederm@xmission.com); Serge Hallyn
> (serge.hallyn@ubuntu.com); Oleg Nesterov (oleg@redhat.com); Richard Weinberger
> (richard@nod.at); Pavel Emelyanov (xemul@parallels.com); Vasily Kulikov
> (segoon@openwall.com); Gotou, Yasunori/五島 康文; 'Daniel P. Berrange
> (berrange@redhat.com)'
> Cc: containers@lists.linux-foundation.org; linux-kernel@vger.kernel.org
> Subject: RE: [RFC]Pid conversion between pid namespace
> 
> Hi,
> 
> Let me summarize our discussions of ID conversion by pros/cons:
> 
> A) make new system call for translation
>     A-1) systemcall(ID, NS1, NS2) into (ID).
>     pros:
>         - has a reference ns(NS2)
>           We could get any lower level ID directly.
> 
>     cons:
>         - lack of hierarchy information.
>           CRIU need hierarchy info for checkpoint/restore in nested containers.
>         - not easy for debug.
>           And a lot of tools/libs need be modified.
> 
>     A-2) syscall pid_t getnspid(pid_t query_pid, pid_t observer_pid)
>     pros:
>         - ns procfs free, easy to use.
>         We could get rid of mounted ns procfs.
> 
>     cons:
>         - may find multiple results in nested ns.
>           We wished the new API could tell us the exact answer.
>           But if getnspid return more than one results will bring trouble to admins,
>           they had to make another decision.
>           Or we marked the deepest level for translation as prerequisite.
> 
>         -based on current pidns, no reference ns.
> 
> B) make/change proc file/directories
> 	B-1) expand /proc/pid/status
> 	pros:
>         - easy to use and to debug
>         - already had existed interface in kernel
> 
> 	cons:
>         - based on current ns
>           for middle level, we had to make another decision.
>         - do not have hierarchy info.
> 
> 	B-2) /proc/<pidX>/ns/proc/ which would contain everything
> 	pros:
>         - have enough info from /proc in container
> 
> 	cons:
>         - Requirements unclear.
>           We need more discussion to decide which items should not be exposed.
>         - do not have hierarchy info.
> 
> 
> How about do these things in two steps:
> 
> C)  1. expose all sets of pid, pgid, sid and tgid
> via expanded /proc/PID/status
>       We could get translated IDs from container like:
>     NStgid:	16465 	5 	1
>     NSpid:	16465 	5 	1
>     NSpgid:	16465 	5 	1
>     NSsid:	16423 	1 	0
>     (a set of IDs with 3 level of ns)
> 
>     2. add hierarchy info under /proc
>       We lacked of method of getting hierarchy info, which is useful.
>       Then we could know the relationship of ns.
>       How about adding a new proc file just under /proc
>       to show the hierarchy like readlink did:
> 	  pid:[4026531836]-> [4026532390] -> [4026532484]
>       pid:[4026531836]-> [4026532491]
>       (A 3 level pid and 2 level pid_
> 
> Any comments would be appreciated.
> 
> Thanks,
> - Chen
> 
> > -----Original Message-----
> > Subject: [RFC]Pid conversion between pid namespace
> >
> > Hi,
> >
> > We had some discussions on how to carry out
> > pid conversion between pid namespace via:
> > syscall[1] and procfs[2].
> >
> > Pavel suggested that a syscall like
> > (ID, NS1, NS2) into (ID).
> >
> > Serge suggested that a syscall
> > pid_t getnspid(pid_t query_pid, pid_t observer_pid).
> >
> >
> > Eric and Richard suggested a procfs solution is
> > more appropriate.
> >
> > Oleg suggested that we should expand /proc/pid/status
> > to report this kind of information.
> >
> > And Richard suggested adding a directory like
> > /proc/<pidX>/ns/proc/ which would contain everything
> > from /proc/<pidX inside the namespace>/.
> >
> > As procfs provided a more user friendly interface,
> > how about expose all sets of tgid, pid, pgid, sid
> > by expanding /proc/PID/status in procfs?
> > And we could also expose ns hierarchy under /proc,
> > which could be another reference.
> >
> > Ex:
> >     init_pid_ns    ns1         ns2
> > t1  2
> > t2   `- 3          1
> > t3       `- 4      `- 5        1
> >
> > We could get in /proc/t3/status:
> > NSpid: 4 5 1
> > We knew that pid 1 in container is pid 4 in init ns.
> >
> > And we could get ns hierarchy under /proc/ns_hierarchy like:
> > init_ns->ns1->ns2		(as the result of readlink)
> >          ->ns3
> > We knew that t3 in ns2, and its hierarchy.
> >
> > How these ideas looks like?
> > Any comments would be appreciated.
> >
> > Thanks,
> > - Chen
> >
> >
> > a) syscall
> > http://lwn.net/Articles/602987/
> >
> > b) procfs
> > http://www.spinics.net/lists/kernel/msg1751688.html
> >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC]Pid conversion between pid namespace
  2014-07-21 10:47           ` chenhanxiao
@ 2014-07-25 17:34               ` Serge Hallyn
  -1 siblings, 0 replies; 30+ messages in thread
From: Serge Hallyn @ 2014-07-25 17:34 UTC (permalink / raw)
  To: chenhanxiao-BthXqXjhjHXQFUHtdCDX3A
  Cc: Richard Weinberger (richard-/L3Ra7n9ekc@public.gmane.org),
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org),
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Eric W. Biederman
	(ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org),
	Vasily Kulikov (segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org)

Quoting chenhanxiao@cn.fujitsu.com (chenhanxiao@cn.fujitsu.com):
> Hi,
> 
> > -----Original Message-----
> > From: Serge Hallyn [mailto:serge.hallyn@ubuntu.com]
> > Sent: Tuesday, July 15, 2014 12:16 PM
> > To: Chen, Hanxiao/陈 晗霄
> > Subject: Re: [RFC]Pid conversion between pid namespace
> > >     A-2) syscall pid_t getnspid(pid_t query_pid, pid_t observer_pid)
> > >     pros:
> > >         - ns procfs free, easy to use.
> > >         We could get rid of mounted ns procfs.
> > >
> > >     cons:
> > >         - may find multiple results in nested ns.
> > >           We wished the new API could tell us the exact answer.
> > >           But if getnspid return more than one results will bring trouble to admins,
> > 
> > (See below for more, but) the question being posed to getnspid has precisely
> > one answer.
> > 
> > >           they had to make another decision.
> > >           Or we marked the deepest level for translation as prerequisite.
> > >
> > >         -based on current pidns, no reference ns.
> > 
> > Hm, no.  The intent here was that
> > 
> > 	observer_pid would be in current ns
> > 	query_pid would be in observer_pid's ns.
> > 
> > So this would be ideal for "I got a pid in a logfile created by rsyslog in
> > a nested container, what is the logged pid in my pidns."
> > 
> > Taking a set of tasks (like a container with nesting) and building a tree
> > of all pids shouldn't be too difficult either.  Start with the init pid,
> > call getnspid($pid, $init_pid) for every $pid in the container;  to figure
> > out whether any $pid is itself a nested init_pid, we can compare the
> > /proc/$$/ns/pid, as well as look at getnspid($pid, $pid).
> I'm a little confused in this section:
> 
> Ex:
>     init_pid_ns    ns1         ns2
> t1  2
> t2   `- 3          1 
> t3       `- 4      `- 5        1
> t4           `-6       `-8      `-9
> t5             `-10       `-9      `-10
> 
> For getnspid($pid, $init_pid),
> Does init_pid means container's init_pid such as 3 for t2?

Right, if you're in init_pid_ns and making the query, then
you'd pass 3.

> In nested containers, does this syscall work as:
> getnspid(9, 4) -> (6, 8, 9) 

No, assuming the querying task is in init_pid_ns,
getnspid(9, 4) would return 6.

4 is the observer pid given in the querier's own pidns, so
it refers to t3.  9 is the pid being queried, in the observer's
pidns, so it refers to t4.  The result is t4's pid in our own
pidns.
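
A rough user-space model of those semantics, purely for illustration.
getnspid() is only a proposal in this thread, so nothing here is a real
kernel interface; TASKS, find_task and getnspid_model are made-up names,
and namespaces are identified simply by nesting level, which assumes the
single init_pid_ns -> ns1 -> ns2 chain from the example:

```python
# User-space model of the proposed getnspid(); not a real syscall or kernel API.
# Each task lists its pid at every namespace level it is visible in,
# outermost first: index 0 = init_pid_ns, 1 = ns1, 2 = ns2.
TASKS = {
    "t1": [2],
    "t2": [3, 1],
    "t3": [4, 5, 1],
    "t4": [6, 8, 9],
    "t5": [10, 9, 10],
}


def find_task(tasks, level, pid):
    """Return the name of the task that has `pid` at namespace `level`, or None."""
    for name, pids in tasks.items():
        if level < len(pids) and pids[level] == pid:
            return name
    return None


def getnspid_model(tasks, query_pid, observer_pid, caller_level=0):
    """Translate query_pid (valid in the observer's pidns) into the caller's pidns."""
    observer = find_task(tasks, caller_level, observer_pid)
    if observer is None:
        return None
    # The observer's own pidns is the deepest level it has a pid in.
    observer_level = len(tasks[observer]) - 1
    target = find_task(tasks, observer_level, query_pid)
    if target is None:
        return None
    return tasks[target][caller_level]


# Observer pid 4 is t3, which lives in ns2; pid 9 in ns2 is t4.
print(getnspid_model(TASKS, 9, 4))  # 6
```

The model returns a single pid because it looks the query up in exactly
one namespace, the observer's own.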

Does that help clarify at all?  I'm not sure whether the problem is that
I didn't explain well enough from the start, or whether this just shows
that the API is one only its mother could love :)

-serge

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC]Pid conversion between pid namespace
  2014-07-25 17:34               ` Serge Hallyn
@ 2014-07-28  8:14                 ` Hu Tao
  -1 siblings, 0 replies; 30+ messages in thread
From: Hu Tao @ 2014-07-28  8:14 UTC (permalink / raw)
  To: Serge Hallyn
  Cc: Richard Weinberger (richard-/L3Ra7n9ekc@public.gmane.org),
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org),
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Eric W. Biederman
	(ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org),
	Vasily Kulikov (segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org)

Hi,

On Fri, Jul 25, 2014 at 05:34:43PM +0000, Serge Hallyn wrote:
> Quoting chenhanxiao@cn.fujitsu.com (chenhanxiao@cn.fujitsu.com):
> > Hi,
> > 
> > > -----Original Message-----
> > > From: Serge Hallyn [mailto:serge.hallyn@ubuntu.com]
> > > Sent: Tuesday, July 15, 2014 12:16 PM
> > > To: Chen, Hanxiao/陈 晗霄
> > > Subject: Re: [RFC]Pid conversion between pid namespace
> > > >     A-2) syscall pid_t getnspid(pid_t query_pid, pid_t observer_pid)
> > > >     pros:
> > > >         - ns procfs free, easy to use.
> > > >         We could get rid of mounted ns procfs.
> > > >
> > > >     cons:
> > > >         - may find multiple results in nested ns.
> > > >           We wished the new API could tell us the exact answer.
> > > >           But if getnspid return more than one results will bring trouble to admins,
> > > 
> > > (See below for more, but) the question being posed to getnspid has precisely
> > > one answer.
> > > 
> > > >           they had to make another decision.
> > > >           Or we marked the deepest level for translation as prerequisite.
> > > >
> > > >         -based on current pidns, no reference ns.
> > > 
> > > Hm, no.  The intent here was that
> > > 
> > > 	observer_pid would be in current ns
> > > 	query_pid would be in observer_pid's ns.
> > > 
> > > So this would be ideal for "I got a pid in a logfile created by rsyslog in
> > > a nested container, what is the logged pid in my pidns."
> > > 
> > > Taking a set of tasks (like a container with nesting) and building a tree
> > > of all pids shouldn't be too difficult either.  Start with the init pid,
> > > call getnspid($pid, $init_pid) for every $pid in the container;  to figure
> > > out whether any $pid is itself a nested init_pid, we can compare the
> > > /proc/$$/ns/pid, as well as look at getnspid($pid, $pid).
> > I'm a little confused in this section:
> > 
> > Ex:
> >     init_pid_ns    ns1         ns2
> > t1  2
> > t2   `- 3          1 
> > t3       `- 4      `- 5        1
> > t4           `-6       `-8      `-9
> > t5             `-10       `-9      `-10
> > 
> > For getnspid($pid, $init_pid),
> > Does init_pid means container's init_pid such as 3 for t2?
> 
> Right, if you're in init_pid_ns and making the query, then
> you'd pass 3.

Sorry for jumping in, but I don't quite understand the purpose of
$init_pid here.  Does it identify the ns that the process to be
queried is in?  Also see my questions below:

1. Given the example above, what's the return of getnspid(9, 3)? 
   Is it 6(task t4) or 10(task t5)? 

2. Suppose there is a process in ns1 that is a child of process 1 and
   has pid 10, but is not in ns2, like below:

    init_pid_ns          ns1         ns2
t1  2
t2   `- 3                1 
t3       `- 4            +- 5        1
t4           `-6         |   `-8      `-9
t5             `-10      |      `-9      `-10
t6               `-11    `-10   

   then what is the return of getnspid(10, 3)?

Regards,
Hu

>
> 
> > In nested containers, does this syscall work as:
> > getnspid(9, 4) -> (6, 8, 9) 
> 
> No, assuming the querying task is in init_pid_ns,
> getnspid(9, 4) would return 6.
> 
> 4 is the observer pid given in the querier's own pidns, so
> it refers to t3.  9 is the pid being queried, in the observer's
> pidns, so it refers to t4.  The result is t4's pid in our own
> pidns.
> 
> Does that help clarify at all?  I'm not sure whether the problem is that
> I didn't explain well enough from the start, or whether this just shows
> that the API is one only its mother could love :)
> 
> -serge

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC]Pid conversion between pid namespace
  2014-07-28  8:14                 ` Hu Tao
@ 2014-07-28 13:24                     ` Serge Hallyn
  -1 siblings, 0 replies; 30+ messages in thread
From: Serge Hallyn @ 2014-07-28 13:24 UTC (permalink / raw)
  To: Hu Tao
  Cc: Richard Weinberger (richard-/L3Ra7n9ekc@public.gmane.org),
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org),
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Eric W. Biederman
	(ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org),
	Vasily Kulikov (segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org)

Quoting Hu Tao (hutao@cn.fujitsu.com):
> Hi,
> 
> On Fri, Jul 25, 2014 at 05:34:43PM +0000, Serge Hallyn wrote:
> > Quoting chenhanxiao@cn.fujitsu.com (chenhanxiao@cn.fujitsu.com):
> > > Hi,
> > > 
> > > > -----Original Message-----
> > > > From: Serge Hallyn [mailto:serge.hallyn@ubuntu.com]
> > > > Sent: Tuesday, July 15, 2014 12:16 PM
> > > > To: Chen, Hanxiao/陈 晗霄
> > > > Subject: Re: [RFC]Pid conversion between pid namespace
> > > > >     A-2) syscall pid_t getnspid(pid_t query_pid, pid_t observer_pid)
> > > > >     pros:
> > > > >         - ns procfs free, easy to use.
> > > > >         We could get rid of mounted ns procfs.
> > > > >
> > > > >     cons:
> > > > >         - may find multiple results in nested ns.
> > > > >           We wished the new API could tell us the exact answer.
> > > > >           But if getnspid return more than one results will bring trouble to admins,
> > > > 
> > > > (See below for more, but) the question being posed to getnspid has precisely
> > > > one answer.
> > > > 
> > > > >           they had to make another decision.
> > > > >           Or we marked the deepest level for translation as prerequisite.
> > > > >
> > > > >         -based on current pidns, no reference ns.
> > > > 
> > > > Hm, no.  The intent here was that
> > > > 
> > > > 	observer_pid would be in current ns
> > > > 	query_pid would be in observer_pid's ns.
> > > > 
> > > > So this would be ideal for "I got a pid in a logfile created by rsyslog in
> > > > a nested container, what is the logged pid in my pidns."
> > > > 
> > > > Taking a set of tasks (like a container with nesting) and building a tree
> > > > of all pids shouldn't be too difficult either.  Start with the init pid,
> > > > call getnspid($pid, $init_pid) for every $pid in the container;  to figure
> > > > out whether any $pid is itself a nested init_pid, we can compare the
> > > > /proc/$$/ns/pid, as well as look at getnspid($pid, $pid).
> > > I'm a little confused in this section:
> > > 
> > > Ex:
> > >     init_pid_ns    ns1         ns2
> > > t1  2
> > > t2   `- 3          1 
> > > t3       `- 4      `- 5        1
> > > t4           `-6       `-8      `-9
> > > t5             `-10       `-9      `-10
> > > 
> > > For getnspid($pid, $init_pid),
> > > Does init_pid means container's init_pid such as 3 for t2?
> > 
> > Right, if you're in init_pid_ns and making the query, then
> > you'd pass 3.
> 
> Sorry for jumping in, but I'm not quite understanding the purpose of
> $init_pid here, does it identify the ns which the process to be
> queried is in? Also see my questions below:

I was passing in initpid for a particular reason before, but the second
argument is NOT meant to be an "initpid".  It is meant to be the pid
(in the caller's ns) of the observer task, i.e. the task in whose pid
namespace we are querying.

> 1. Given the example above, what's the return of getnspid(9, 3)? 
>    Is it 6(task t4) or 10(task t5)? 

Assuming the caller is in init_pid_ns, then the return value is 10
(task t5).

> 
> 2. if there is a process in ns1 which is a child of process 1 has pid
>    10, but not in ns2, like below:
> 
>     init_pid_ns          ns1         ns2
> t1  2
> t2   `- 3                1 
> t3       `- 4            +- 5        1
> t4           `-6         |   `-8      `-9
> t5             `-10      |      `-9      `-10
> t6               `-11    `-10   
> 
>    then what is the return of getnspid(10, 3)?

Assuming the caller is in init_pid_ns, the answer is 11 (task t6).  The
question was "In the pid_ns belonging to t2 (pid 3), what task does the
pid 10 refer to".
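
To spell out why each of these queries has a single answer, here is a
small sketch over the second table.  It is only a model: namespaces are
identified by nesting level (an assumption that fits the single chain
drawn above), and TASKS, build_index and translate are hypothetical
names, not part of any proposed interface:

```python
# Model only: level 0 = init_pid_ns, 1 = ns1, 2 = ns2.
# Each task lists its pid per level, outermost first.
TASKS = {
    "t1": [2],
    "t2": [3, 1],
    "t3": [4, 5, 1],
    "t4": [6, 8, 9],
    "t5": [10, 9, 10],
    "t6": [11, 10],
}


def build_index(tasks):
    """Map (level, pid) -> task name, failing if a pid is reused within a level."""
    index = {}
    for name, pids in tasks.items():
        for level, pid in enumerate(pids):
            key = (level, pid)
            if key in index:
                raise ValueError(f"pid {pid} appears twice at level {level}")
            index[key] = name
    return index


INDEX = build_index(TASKS)


def translate(query_pid, observer_pid):
    """Answer the query for a caller in init_pid_ns (level 0)."""
    observer = INDEX[(0, observer_pid)]
    observer_level = len(TASKS[observer]) - 1  # the observer's own pidns
    target = INDEX[(observer_level, query_pid)]
    return target, TASKS[target][0]


print(translate(9, 3))   # ('t5', 10)
print(translate(10, 3))  # ('t6', 11)
```

build_index succeeds on this table because no pid number is used twice
inside one namespace, so a (namespace, pid) pair names at most one task.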

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC]Pid conversion between pid namespace
  2014-07-25 10:01         ` chenhanxiao
@ 2014-08-04 22:20             ` Serge Hallyn
  -1 siblings, 0 replies; 30+ messages in thread
From: Serge Hallyn @ 2014-08-04 22:20 UTC (permalink / raw)
  To: chenhanxiao-BthXqXjhjHXQFUHtdCDX3A
  Cc: Richard Weinberger (richard-/L3Ra7n9ekc@public.gmane.org),
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org),
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Eric W. Biederman
	(ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org),
	Vasily Kulikov (segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org)

Quoting chenhanxiao@cn.fujitsu.com (chenhanxiao@cn.fujitsu.com):
> Hi,
> 
> We discussed two ways of pid conversion:
> syscall and procfs.
> 
> Both of them could do a pid translation job.
> But for ns hierarchy, syscall like:
> 
> pid_t* getnspid(pid_t query_pid, pid_t observer_pid)
> or
> pid_t getnspid(pid_t query_pid, int query_fd, int ref_fd)
> 
> could not work, we knew a pid lived in one ns, but we

Note I still disagree here.

> did not know their relationships.
> For getting the entire set of pids, both of them can do.
> 
> So using procfs is a better way.
> 
> Ex:
>     init_pid_ns     ns1         ns2
> t1  2
> t2   `- 3           1 
> t3       `- 4       `- 5        1
> t4           `-6        `-8      `-9
> t5             `-10        `-9      `-10
> 
> 1. How procfs work:
> a) adding a nspid hierarchy  under /proc/ like:
> [root@localhost proc]# tree /proc/nspid
> /proc/nspid
> ├── ns0
> │    └── ns1

Are these actually called 'ns1' etc?  Adding a namespace of pid
namespace names is a bad thing.

> │       ├── ns2
> │       │   └── pid -> /proc/9/ns
> │       └── pid -> /proc/4/ns
> └── pid -> /proc/1/ns 
> 
> We created dirs and add a link to the 1st process of this ns.

How much more kernel space does this take up?

Is there an easy way to go from a pid in your own namespace
to its proper node under /proc/nspid?  Say I am interested
in pid 9987, which happens to be pid 5 inside a container in
ns2, and I want to know what it (pid 9987) means when it
talks about 'pid 10'.  Is there a link under /proc/9987/
leading to /proc/nspid/ns2/5 ?

> b) expose all sets of pid, pgid, sid and tgid
> via expanded /proc/PID/status
>       We could get translated IDs from container like:
>     NStgid:	6 	8	9 
>     NSpid:	6 	8 	9
>     NSpgid:	6 	8 	9 
>     NSsid:	6 	1 	0
>     (a set of IDs with 3 level of ns)

This sure does seem the simplest route.  But it actually still
does not provide us an easy answer to "what does pid 9987 mean
when it talks about pid 10?".
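
For reference, reading such a field from userspace could look roughly
like the sketch below.  It assumes the tab-separated layout shown in the
proposal (ids listed outermost namespace first); parse_ns_ids and the
SAMPLE text, including its Name: line, are illustrative only.  Note that
it only reads one task's own ids, so it does not by itself answer the
question above:

```python
def parse_ns_ids(status_text, field="NSpid"):
    """Return the ids listed after `field:` as a list of ints, outermost ns first."""
    for line in status_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key.strip() == field:
            return [int(tok) for tok in rest.split()]
    return []


# Hypothetical excerpt of /proc/PID/status for t4, following the proposal.
SAMPLE = (
    "Name:\tsleep\n"
    "NStgid:\t6\t8\t9\n"
    "NSpid:\t6\t8\t9\n"
    "NSpgid:\t6\t8\t9\n"
    "NSsid:\t6\t1\t0\n"
)

ids = parse_ns_ids(SAMPLE)
print(ids)      # [6, 8, 9]
print(ids[-1])  # 9, the pid as seen in the task's own (innermost) namespace
```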

> 2. Advantage of procfs solution
> a) easy to use:
> getnspid(6, 10) -> (10, 9, 10)
> or
> getnspid(10, ns1_fd, ns0_fd) -> 9
> getnspid(10, ns2_fd, ns0_fd) -> 10
> 
> And we could also get it by:
> cat /proc/10/status | grep NSpid:
> NSpid:	10 	9 	10
> ...

It looks nice, but I'm not convinced it gives us the info we
need.

It's certainly possible that I've just not thought it through
enough.

Question: are you proposing this (/proc/pid/status expansion) as an
alternative to /proc/nspid, or are they meant to be complementary?

> b) hierarchy info:
> We could not get the ns hierarchy info by just one syscall.
> If we had to, it will complicate the interface.

Agreed.  But I'm not sure that's particularly important.

> We could check whether two process had some relations
> via procfs:
> readlink /proc/PID1/ns/pid -> aaa
> readlink /proc/PID2/ns/pid -> bbb
> 
> Then we could check /proc/nspid/nsX/nsY/nsZ 
> and find out their relationship.
> Ex:
> We know t4 live in ns2, 
> readlink /proc/t4/ns/pid -> AAA
> then we refer to /proc/nspid/ and find a same inum AAA under
> /proc/nspid/ns0/ns1/ns2
> Then we knew that t4 have pid 9 in ns2, have pid 8 in ns1.
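
The readlink comparison itself can be sketched as below.  The link
target format pid:[inum] is the one shown for readlink elsewhere in this
thread; ns_inum and same_pid_ns are hypothetical helper names, and
same_pid_ns needs a Linux /proc plus permission to read the other task's
ns links.  It only tells whether two tasks share their innermost pid
namespace; the /proc/nspid tree is just a proposal and is not used here:

```python
import os
import re


def ns_inum(link_target):
    """Extract the inode number from a link target such as 'pid:[4026531836]'."""
    match = re.fullmatch(r"pid:\[(\d+)\]", link_target)
    return int(match.group(1)) if match else None


def same_pid_ns(pid_a, pid_b):
    """True if both tasks' /proc/<pid>/ns/pid links point at the same namespace."""
    a = ns_inum(os.readlink(f"/proc/{pid_a}/ns/pid"))
    b = ns_inum(os.readlink(f"/proc/{pid_b}/ns/pid"))
    return a is not None and a == b


print(ns_inum("pid:[4026531836]"))  # 4026531836
```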
> 
> Any comments would be warmly welcomed!
> 
> Thanks,
> - Chen
> 
> > -----Original Message-----
> > From: containers-bounces@lists.linux-foundation.org
> > [mailto:containers-bounces@lists.linux-foundation.org] On Behalf Of
> > chenhanxiao@cn.fujitsu.com
> > Sent: Wednesday, July 09, 2014 6:34 PM
> > To: Eric W. Biederman (ebiederm@xmission.com); Serge Hallyn
> > (serge.hallyn@ubuntu.com); Oleg Nesterov (oleg@redhat.com); Richard Weinberger
> > (richard@nod.at); Pavel Emelyanov (xemul@parallels.com); Vasily Kulikov
> > (segoon@openwall.com); Gotou, Yasunori/五島 康文; 'Daniel P. Berrange
> > (berrange@redhat.com)'
> > Cc: containers@lists.linux-foundation.org; linux-kernel@vger.kernel.org
> > Subject: RE: [RFC]Pid conversion between pid namespace
> > 
> > Hi,
> > 
> > Let me summarize our discussions of ID conversion by pros/cons:
> > 
> > A) make new system call for translation
> >     A-1) systemcall(ID, NS1, NS2) into (ID).
> >     pros:
> >         - has a reference ns(NS2)
> >           We could get any lower level ID directly.
> > 
> >     cons:
> >         - lack of hierarchy information.
> >           CRIU need hierarchy info for checkpoint/restore in nested containers.
> >         - not easy for debug.
> >           And a lot of tools/libs need be modified.
> > 
> >     A-2) syscall pid_t getnspid(pid_t query_pid, pid_t observer_pid)
> >     pros:
> >         - ns procfs free, easy to use.
> >         We could get rid of mounted ns procfs.
> > 
> >     cons:
> >         - may find multiple results in nested ns.
> >           We wished the new API could tell us the exact answer.
> >           But if getnspid return more than one results will bring trouble to admins,
> >           they had to make another decision.
> >           Or we marked the deepest level for translation as prerequisite.
> > 
> >         -based on current pidns, no reference ns.
> > 
> > B) make/change proc file/directories
> > 	B-1) expand /proc/pid/status
> > 	pros:
> >         - easy to use and to debug
> >         - already had existed interface in kernel
> > 
> > 	cons:
> >         - based on current ns
> >           for middle level, we had to make another decision.
> >         - do not have hierarchy info.
> > 
> > 	B-2) /proc/<pidX>/ns/proc/ which would contain everything
> > 	pros:
> >         - have enough info from /proc in container
> > 
> > 	cons:
> >         - Requirements unclear.
> >           We need more discussion to decide which items should not be exposed.
> >         - do not have hierarchy info.
> > 
> > 
> > How about do these things in two steps:
> > 
> > C)  1. expose all sets of pid, pgid, sid and tgid
> > via expanded /proc/PID/status
> >       We could get translated IDs from container like:
> >     NStgid:	16465 	5 	1
> >     NSpid:	16465 	5 	1
> >     NSpgid:	16465 	5 	1
> >     NSsid:	16423 	1 	0
> >     (a set of IDs with 3 level of ns)
> > 
> >     2. add hierarchy info under /proc
> >       We lacked of method of getting hierarchy info, which is useful.
> >       Then we could know the relationship of ns.
> >       How about adding a new proc file just under /proc
> >       to show the hierarchy like readlink did:
> >       pid:[4026531836] -> [4026532390] -> [4026532484]
> >       pid:[4026531836] -> [4026532491]
> >       (A 3 level pid and 2 level pid)
> > 
> > Any comments would be appreciated.
> > 
> > Thanks,
> > - Chen
> > 
> > > -----Original Message-----
> > > Subject: [RFC]Pid conversion between pid namespace
> > >
> > > Hi,
> > >
> > > We had some discussions on how to carry out
> > > pid conversion between pid namespace via:
> > > syscall[1] and procfs[2].
> > >
> > > Pavel suggested that a syscall like
> > > (ID, NS1, NS2) into (ID).
> > >
> > > Serge suggested that a syscall
> > > pid_t getnspid(pid_t query_pid, pid_t observer_pid).
> > >
> > >
> > > Eric and Richard suggested a procfs solution is
> > > more appropriate.
> > >
> > > Oleg suggested that we should expand /proc/pid/status
> > > to report this kind of information.
> > >
> > > And Richard suggested adding a directory like
> > > /proc/<pidX>/ns/proc/ which would contain everything
> > > from /proc/<pidX inside the namespace>/.
> > >
> > > As procfs provided a more user friendly interface,
> > > how about expose all sets of tgid, pid, pgid, sid
> > > by expanding /proc/PID/status in procfs?
> > > And we could also expose ns hierarchy under /proc,
> > > which could be another reference.
> > >
> > > Ex:
> > >     init_pid_ns    ns1         ns2
> > > t1  2
> > > t2   `- 3          1
> > > t3       `- 4      `- 5        1
> > >
> > > We could get in /proc/t3/status:
> > > NSpid: 4 5 1
> > > We knew that pid 1 in container is pid 4 in init ns.
> > >
> > > And we could get ns hierarchy under /proc/ns_hierarchy like:
> > > init_ns->ns1->ns2		(as the result of readlink)
> > >          ->ns3
> > > We knew that t3 in ns2, and its hierarchy.
> > >
> > > How these ideas looks like?
> > > Any comments would be appreciated.
> > >
> > > Thanks,
> > > - Chen
> > >
> > >
> > > a) syscall
> > > http://lwn.net/Articles/602987/
> > >
> > > b) procfs
> > > http://www.spinics.net/lists/kernel/msg1751688.html
> > >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC]Pid conversion between pid namespace
@ 2014-08-04 22:20             ` Serge Hallyn
  0 siblings, 0 replies; 30+ messages in thread
From: Serge Hallyn @ 2014-08-04 22:20 UTC (permalink / raw)
  To: chenhanxiao
  Cc: Eric W. Biederman (ebiederm@xmission.com),
	Oleg Nesterov (oleg@redhat.com),
	Richard Weinberger (richard@nod.at),
	Pavel Emelyanov (xemul@parallels.com),
	Vasily Kulikov (segoon@openwall.com),
	Gotou, Yasunori,
	'Daniel P. Berrange (berrange@redhat.com)',
	containers, linux-kernel

Quoting chenhanxiao@cn.fujitsu.com (chenhanxiao@cn.fujitsu.com):
> Hi,
> 
> We discussed two ways of pid conversion:
> syscall and procfs.
> 
> Both of them could do a pid translation job.
> But for ns hierarchy, syscall like:
> 
> pid_t* getnspid(pid_t query_pid, pid_t observer_pid)
> or
> pid_t getnspid(pid_t query_pid, int query_fd, int ref_fd)
> 
> could not work, we knew a pid lived in one ns, but we

Note I still disagree here.

> did not know their relationships.
> For getting the entire set of pids, both of them can do.
> 
> So using procfs is a better way.
> 
> Ex:
>     init_pid_ns     ns1         ns2
> t1  2
> t2   `- 3           1 
> t3       `- 4       `- 5        1
> t4           `-6        `-8      `-9
> t5             `-10        `-9      `-10
> 
> 1. How procfs work:
> a) adding a nspid hierarchy  under /proc/ like:
> [root@localhost proc]# tree /proc/nspid
> /proc/nspid
> ├── ns0
> │    └── ns1

Are these actually called 'ns1' etc?  Adding a namespace of pid
namespace names is a bad thing.

> │       ├── ns2
> │       │   └── pid -> /proc/9/ns
> │       └── pid -> /proc/4/ns
> └── pid -> /proc/1/ns 
> 
> We created dirs and add a link to the 1st process of this ns.

How much more kernel space does this take up?

Is there an easy way to go from a pid in your own namespace
to its proper node under /proc/nspid?  I.e. if I am interested
in pid 9987, which happens to be pid 5 inside a container in
ns2, and then I want to know what it means when it (pid 9987)
is talking about 'pid 10'.  Is there a link under /proc/9987/
leading to /proc/nspid/ns2/5 ?

> b) expose all sets of pid, pgid, sid and tgid
> via expanded /proc/PID/status
>       We could get translated IDs from container like:
>     NStgid:	6 	8	9 
>     NSpid:	6 	8 	9
>     NSpgid:	6 	8 	9 
>     NSsid:	6 	1 	0
>     (a set of IDs with 3 level of ns)

This sure does seem the simplest route.  But it actually still
does not provide us an easy answer to "what does pid 9987 mean
when it talks about pid 10?".
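
For reference, reading the proposed fields back is straightforward. A minimal Python sketch, assuming the NStgid/NSpid/NSpgid/NSsid line format quoted above (whitespace-separated ids, outermost namespace first). parse_ns_ids is a hypothetical helper name, and the sample text is assembled from the quoted values, not taken from a real kernel:

```python
NS_FIELDS = ("NStgid", "NSpid", "NSpgid", "NSsid")

def parse_ns_ids(status_text):
    """Return {field: [id in outermost ns, ..., id in innermost ns]}."""
    result = {}
    for line in status_text.splitlines():
        name, sep, rest = line.strip().partition(":")
        if sep and name in NS_FIELDS:
            # Tolerate the mixed tabs and spaces seen in the quoted example.
            result[name] = [int(tok) for tok in rest.split()]
    return result

# Hypothetical status excerpt built from the values quoted above.
sample = (
    "Name:\tsleep\n"
    "NStgid:\t6 \t8\t9 \n"
    "NSpid:\t6 \t8 \t9\n"
    "NSpgid:\t6 \t8 \t9 \n"
    "NSsid:\t6 \t1 \t0\n"
)
ids = parse_ns_ids(sample)
print(ids["NSpid"])   # [6, 8, 9]
```

This yields the ids of one task at each nesting level. It does not say which namespace each level corresponds to, which is the gap raised above.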

> 2. Advantage of procfs solution
> a) easy to use:
> getnspid(6, 10) -> (10, 9, 10)
> or
> getnspid(10, ns1_fd, ns0_fd) -> 9
> getnspid(10, ns2_fd, ns0_fd) -> 10
> 
> And we could also get it by:
> cat /proc/10/status | grep NSpid:
> NSpid:	10 	9 	10
> ...

It looks nice, but I'm not convinced it gives us the info we
need.

It's certainly possible that I've just not thought it through
enough.

Question: are you proposing this (/proc/pid/status expansion) as an
alternative to /proc/nspid, or are they meant to be complementary?

> b) hierarchy info:
> We could not get the ns hierarchy info by just one syscall.
> If we had to, it will complicate the interface.

Agreed.  But I'm not sure that's particularly important.

> We could check whether two process had some relations
> via procfs:
> readlink /proc/PID1/ns/pid -> aaa
> readlink /proc/PID2/ns/pid -> bbb
> 
> Then we could check /proc/nspid/nsX/nsY/nsZ 
> and find out their relationship.
> Ex:
> We know t4 live in ns2, 
> readlink /proc/t4/ns/pid -> AAA
> then we refer to /proc/nspid/ and find a same inum AAA under
> /proc/nspid/ns0/ns1/ns2
> Then we knew that t4 have pid 9 in ns2, have pid 8 in ns1.
> 
> Any comments would be warmly welcomed!
> 
> Thanks,
> - Chen
> 
> > -----Original Message-----
> > From: containers-bounces@lists.linux-foundation.org
> > [mailto:containers-bounces@lists.linux-foundation.org] On Behalf Of
> > chenhanxiao@cn.fujitsu.com
> > Sent: Wednesday, July 09, 2014 6:34 PM
> > To: Eric W. Biederman (ebiederm@xmission.com); Serge Hallyn
> > (serge.hallyn@ubuntu.com); Oleg Nesterov (oleg@redhat.com); Richard Weinberger
> > (richard@nod.at); Pavel Emelyanov (xemul@parallels.com); Vasily Kulikov
> > (segoon@openwall.com); Gotou, Yasunori/五島 康文; 'Daniel P. Berrange
> > (berrange@redhat.com)'
> > Cc: containers@lists.linux-foundation.org; linux-kernel@vger.kernel.org
> > Subject: RE: [RFC]Pid conversion between pid namespace
> > 
> > Hi,
> > 
> > Let me summarize our discussions of ID conversion by pros/cons:
> > 
> > A) make new system call for translation
> >     A-1) systemcall(ID, NS1, NS2) into (ID).
> >     pros:
> >         - has a reference ns(NS2)
> >           We could get any lower level ID directly.
> > 
> >     cons:
> >         - lack of hierarchy information.
> >           CRIU need hierarchy info for checkpoint/restore in nested containers.
> >         - not easy for debug.
> >           And a lot of tools/libs need be modified.
> > 
> >     A-2) syscall pid_t getnspid(pid_t query_pid, pid_t observer_pid)
> >     pros:
> >         - ns procfs free, easy to use.
> >         We could get rid of mounted ns procfs.
> > 
> >     cons:
> >         - may find multiple results in nested ns.
> >           We wished the new API could tell us the exact answer.
> >           But if getnspid return more than one results will bring trouble to admins,
> >           they had to make another decision.
> >           Or we marked the deepest level for translation as prerequisite.
> > 
> >         -based on current pidns, no reference ns.
> > 
> > B) make/change proc file/directories
> > 	B-1) expand /proc/pid/status
> > 	pros:
> >         - easy to use and to debug
> >         - already had existed interface in kernel
> > 
> > 	cons:
> >         - based on current ns
> >           for middle level, we had to make another decision.
> >         - do not have hierarchy info.
> > 
> > 	B-2) /proc/<pidX>/ns/proc/ which would contain everything
> > 	pros:
> >         - have enough info from /proc in container
> > 
> > 	cons:
> >         - Requirements unclear.
> >           We need more discussion to decide which items should not be exposed.
> >         - do not have hierarchy info.
> > 
> > 
> > How about do these things in two steps:
> > 
> > C)  1. expose all sets of pid, pgid, sid and tgid
> > via expanded /proc/PID/status
> >       We could get translated IDs from container like:
> >     NStgid:	16465 	5 	1
> >     NSpid:	16465 	5 	1
> >     NSpgid:	16465 	5 	1
> >     NSsid:	16423 	1 	0
> >     (a set of IDs with 3 level of ns)
> > 
> >     2. add hierarchy info under /proc
> >       We lacked of method of getting hierarchy info, which is useful.
> >       Then we could know the relationship of ns.
> >       How about adding a new proc file just under /proc
> >       to show the hierarchy like readlink did:
> > 	  pid:[4026531836]-> [4026532390] -> [4026532484]
> >       pid:[4026531836]-> [4026532491]
> >       (A 3 level pid and 2 level pid)
> > 
> > Any comments would be appreciated.
> > 
> > Thanks,
> > - Chen
> > 
> > > -----Original Message-----
> > > Subject: [RFC]Pid conversion between pid namespace
> > >
> > > Hi,
> > >
> > > We had some discussions on how to carry out
> > > pid conversion between pid namespace via:
> > > syscall[1] and procfs[2].
> > >
> > > Pavel suggested that a syscall like
> > > (ID, NS1, NS2) into (ID).
> > >
> > > Serge suggested that a syscall
> > > pid_t getnspid(pid_t query_pid, pid_t observer_pid).
> > >
> > >
> > > Eric and Richard suggested a procfs solution is
> > > more appropriate.
> > >
> > > Oleg suggested that we should expand /proc/pid/status
> > > to report this kind of information.
> > >
> > > And Richard suggested adding a directory like
> > > /proc/<pidX>/ns/proc/ which would contain everything
> > > from /proc/<pidX inside the namespace>/.
> > >
> > > As procfs provided a more user friendly interface,
> > > how about expose all sets of tgid, pid, pgid, sid
> > > by expanding /proc/PID/status in procfs?
> > > And we could also expose ns hierarchy under /proc,
> > > which could be another reference.
> > >
> > > Ex:
> > >     init_pid_ns    ns1         ns2
> > > t1  2
> > > t2   `- 3          1
> > > t3       `- 4      `- 5        1
> > >
> > > We could get in /proc/t3/status:
> > > NSpid: 4 5 1
> > > We knew that pid 1 in container is pid 4 in init ns.
> > >
> > > And we could get ns hierarchy under /proc/ns_hierarchy like:
> > > init_ns->ns1->ns2		(as the result of readlink)
> > >          ->ns3
> > > We knew that t3 in ns2, and its hierarchy.
> > >
> > > How these ideas looks like?
> > > Any comments would be appreciated.
> > >
> > > Thanks,
> > > - Chen
> > >
> > >
> > > a) syscall
> > > http://lwn.net/Articles/602987/
> > >
> > > b) procfs
> > > http://www.spinics.net/lists/kernel/msg1751688.html
> > >
> > > _______________________________________________
> > > Containers mailing list
> > > Containers@lists.linux-foundation.org
> > > https://lists.linuxfoundation.org/mailman/listinfo/containers


^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [RFC]Pid conversion between pid namespace
  2014-08-04 22:20             ` Serge Hallyn
@ 2014-08-07 10:03               ` chenhanxiao
  -1 siblings, 0 replies; 30+ messages in thread
From: chenhanxiao-BthXqXjhjHXQFUHtdCDX3A @ 2014-08-07 10:03 UTC (permalink / raw)
  To: Serge Hallyn
  Cc: Richard Weinberger (richard-/L3Ra7n9ekc@public.gmane.org),
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org),
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Eric W. Biederman
	(ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org),
	Vasily Kulikov (segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org)

Hi,

> -----Original Message-----
> From: Serge Hallyn [mailto:serge.hallyn@ubuntu.com]
> Sent: Tuesday, August 05, 2014 6:21 AM
> 
> Quoting chenhanxiao@cn.fujitsu.com (chenhanxiao@cn.fujitsu.com):
> > Hi,
> >
> > We discussed two ways of pid conversion:
> > syscall and procfs.
> >
> > Both of them could do a pid translation job.
> > But for ns hierarchy, syscall like:
> >
> > pid_t* getnspid(pid_t query_pid, pid_t observer_pid)
> > or
> > pid_t getnspid(pid_t query_pid, int query_fd, int ref_fd)
> >
> > could not work, we knew a pid lived in one ns, but we
> 
> Note I still disagree here. 
> 
> > did not know their relationships.
> > For getting the entire set of pids, both of them can do.
> >
> > So using procfs is a better way.
> >
> > Ex:
> >     init_pid_ns     ns1         ns2
> > t1  2
> > t2   `- 3           1
> > t3       `- 4       `- 5        1
> > t4           `-6        `-8      `-9
> > t5             `-10        `-9      `-10
> >
> > 1. How procfs work:
> > a) adding a nspid hierarchy  under /proc/ like:
> > [root@localhost proc]# tree /proc/nspid
> > /proc/nspid
> > ├── ns0
> > │    └── ns1
> 
> Are these actually called 'ns1' etc?  Adding a namespace of pid
> namespace names is a bad thing.

That's just an example.
We are inclined to name them ns$(inum),
as we did in proc_ns_readlink.

> 
> > │       ├── ns2
> > │       │   └── pid -> /proc/9/ns
> > │       └── pid -> /proc/4/ns
> > └── pid -> /proc/1/ns
> >
> > We created dirs and add a link to the 1st process of this ns.
> 
> How much more kernel space does this take up?
> 

Only the first process of each newly created ns is added here,
so there would not be many entries.

> Is there an easy way to go from a pid in your own namespace
> to its proper node under /proc/nspid?  I.e. if I am interested
> in pid 9987, which happens to be pid 5 inside a container in
> ns2, and then I want to know what it means when it (pid 9987)
> is talking about 'pid 10'.  Is there a link under /proc/9987/
> leading to /proc/nspid/ns2/5 ?

If you want to query pid 9987, you could:
a) readlink /proc/9987/ns/pid
b) look up the matching /proc/nspid/ns$(inum)/ns$(inum)..
c) find the link to the first process of the new ns under ns$(inum).

Or, as you suggested above,
we could change /proc/PID/ns/pid:
a) when a new ns is created, we put it under /proc/nspid
b) create a link from /proc/PID/ns/pid to /proc/nspid/ns$(inum)/pid

Then we could get a clearer view:
1. pidns view
/proc/nspid
├── ns_4026531836	(ns0)
│  ├─ ns1
│  │   ├─── ns2
│  │   └── pid -> pid:[4026531836]
│  └── pid -> pid:[4026531816]
└── pid -> pid:[4026531806]

Then /proc/9987/ns/pid would be a link to ns2:
2. PID1 lives in ns0, PID2 lives in ns2
/proc/PID1/ns/pid->/proc/nspid/ns_4026531806

/proc/PID2/ns/pid->/proc/nspid/ns_4026531836
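
As an illustration of step a), a minimal Python sketch that extracts the inum from a namespace link target of the form pid:[inum], as shown in the tree above. parse_ns_link and pid_ns_inum are hypothetical helper names; pid_ns_inum only works on a Linux system that exposes /proc/<pid>/ns/pid:

```python
import os
import re

_NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")

def parse_ns_link(target):
    """Split a link target such as 'pid:[4026531836]' into (type, inum)."""
    m = _NS_LINK.match(target)
    if m is None:
        raise ValueError("not a namespace link target: %r" % (target,))
    return m.group(1), int(m.group(2))

def pid_ns_inum(pid):
    """Read /proc/<pid>/ns/pid and return the namespace inum (Linux only)."""
    return parse_ns_link(os.readlink("/proc/%s/ns/pid" % pid))[1]

print(parse_ns_link("pid:[4026531836]"))   # ('pid', 4026531836)
```

Two tasks are in the same pid namespace exactly when pid_ns_inum returns the same number for both. The inum alone says nothing about parent/child relations between namespaces.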

> 
> > b) expose all sets of pid, pgid, sid and tgid
> > via expanded /proc/PID/status
> >       We could get translated IDs from container like:
> >     NStgid:	6 	8	9
> >     NSpid:	6 	8 	9
> >     NSpgid:	6 	8 	9
> >     NSsid:	6 	1 	0
> >     (a set of IDs with 3 level of ns)
> 
> This sure does seem the simplest route.  But it actually still
> does not provide us an easy answer to "what does pid 9987 mean
> when it talks about pid 10?".

Do you mean:
init_pid_ns   ns1     ns2
9987            10      5
Neither the getnspid syscall nor the /proc/PID/status expansion
could answer this without hierarchy information.
For users in init_pid_ns, getnspid needs
an observer pid that lives in ns1 and only in ns1,
or we have to call getnspid from inside ns1.
See below for more.
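
To make "hierarchy information" concrete, a minimal Python sketch, assuming the one-chain-per-line text format floated earlier in this thread (pid:[A]-> [B] -> [C]). No such proc file exists at this point; parse_hierarchy and is_ancestor are hypothetical helper names:

```python
import re

def parse_hierarchy(lines):
    """Build {child_inum: parent_inum} from chains like 'pid:[A]-> [B] -> [C]'."""
    parent = {}
    for line in lines:
        chain = [int(n) for n in re.findall(r"\[(\d+)\]", line)]
        # Each inum in a chain is the parent of the one that follows it.
        for up, down in zip(chain, chain[1:]):
            parent[down] = up
    return parent

def is_ancestor(parent, ancestor, ns):
    """True if `ancestor` is strictly above `ns` in the hierarchy."""
    while ns in parent:
        ns = parent[ns]
        if ns == ancestor:
            return True
    return False

parent = parse_hierarchy([
    "pid:[4026531836]-> [4026532390] -> [4026532484]",
    "pid:[4026531836]-> [4026532491]",
])
print(is_ancestor(parent, 4026532390, 4026532484))   # True
print(is_ancestor(parent, 4026532491, 4026532484))   # False
```

With such a parent map, a tool can tell which column of an NSpid line belongs to which namespace, instead of guessing from the nesting depth alone.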

> 
> > 2. Advantage of procfs solution
> > a) easy to use:
> > getnspid(6, 10) -> (10, 9, 10)
> > or
> > getnspid(10, ns1_fd, ns0_fd) -> 9
> > getnspid(10, ns2_fd, ns0_fd) -> 10
> >
> > And we could also get it by:
> > cat /proc/10/status | grep NSpid:
> > NSpid:	10 	9 	10
> > ...
> 
> It looks nice, but I'm not convinced it gives us the info we
> need.
> 
> It's certainly possible that I've just not thought it through
> enough.
> 
> Question: are you proposing this (/proc/pid/status expansion) as an
> alternative to /proc/nspid, or are they meant to be complementary?
> 

We want /proc/nspid as a complement to pid translation.
Ex:
    init_pid_ns     ns1         ns2
t1  2
t2   `- 3           1 
t3       `- 4       `- 5        1
t4           `-6        `-8      `-9
t5             `-10        `-9      `-10
Suppose we are in init_pid_ns:
getnspid(9,4)->6 (t4)
getnspid(9,3)->10 (t5)
We know t2 is in ns1 and t3 is in ns2, but we don't know how ns1 and ns2 are related.
If we want to query pid 9 in ns1, we can use getnspid(9,3)->10 (t5),
but the prerequisite is that we know ns2 is a child of ns1.
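
The two calls above can be checked against the table with a small userspace model. This is only a sketch of the proposed getnspid(query_pid, observer_pid) semantics, assuming the observer's innermost namespace is the one in which query_pid is interpreted. The TASKS table and the caller_ns parameter are illustrative; nothing here is a real syscall:

```python
# Per-task pids, outermost namespace first, copied from the table above.
TASKS = {
    "t1": [("init_pid_ns", 2)],
    "t2": [("init_pid_ns", 3), ("ns1", 1)],
    "t3": [("init_pid_ns", 4), ("ns1", 5), ("ns2", 1)],
    "t4": [("init_pid_ns", 6), ("ns1", 8), ("ns2", 9)],
    "t5": [("init_pid_ns", 10), ("ns1", 9), ("ns2", 10)],
}

def getnspid(query_pid, observer_pid, caller_ns="init_pid_ns"):
    """Model only: interpret query_pid in the observer's own namespace and
    return the pid the caller sees. Returns None if nothing matches."""
    def pid_in(task, ns):
        return dict(TASKS[task]).get(ns)

    observers = [t for t in TASKS if pid_in(t, caller_ns) == observer_pid]
    if not observers:
        return None
    observer_ns = TASKS[observers[0]][-1][0]   # innermost ns of the observer
    for task in TASKS:
        if pid_in(task, observer_ns) == query_pid:
            return pid_in(task, caller_ns)
    return None

print(getnspid(9, 4))   # 6  (t4)
print(getnspid(9, 3))   # 10 (t5)
```

The same query pid 9 gives two different answers depending on the observer, and nothing in either result says that the observers' namespaces are parent and child.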

Thanks,
-Chen

> > b) hierarchy info:
> > We could not get the ns hierarchy info by just one syscall.
> > If we had to, it will complicate the interface.
> 
> Agreed.  But I'm not sure that's particularly important.
> 
> > We could check whether two process had some relations
> > via procfs:
> > readlink /proc/PID1/ns/pid -> aaa
> > readlink /proc/PID2/ns/pid -> bbb
> >
> > Then we could check /proc/nspid/nsX/nsY/nsZ
> > and find out their relationship.
> > Ex:
> > We know t4 live in ns2,
> > readlink /proc/t4/ns/pid -> AAA
> > then we refer to /proc/nspid/ and find a same inum AAA under
> > /proc/nspid/ns0/ns1/ns2
> > Then we knew that t4 have pid 9 in ns2, have pid 8 in ns1.
> >
> > Any comments would be warmly welcomed!
> >
> > Thanks,
> > - Chen

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [RFC]Pid conversion between pid namespace
@ 2014-08-07 10:03               ` chenhanxiao
  0 siblings, 0 replies; 30+ messages in thread
From: chenhanxiao @ 2014-08-07 10:03 UTC (permalink / raw)
  To: Serge Hallyn
  Cc: Eric W. Biederman (ebiederm@xmission.com),
	Oleg Nesterov (oleg@redhat.com),
	Richard Weinberger (richard@nod.at),
	Pavel Emelyanov (xemul@parallels.com),
	Vasily Kulikov (segoon@openwall.com),
	Gotou, Yasunori,
	'Daniel P. Berrange (berrange@redhat.com)',
	containers, linux-kernel

Hi,

> -----Original Message-----
> From: Serge Hallyn [mailto:serge.hallyn@ubuntu.com]
> Sent: Tuesday, August 05, 2014 6:21 AM
> 
> Quoting chenhanxiao@cn.fujitsu.com (chenhanxiao@cn.fujitsu.com):
> > Hi,
> >
> > We discussed two ways of pid conversion:
> > syscall and procfs.
> >
> > Both of them could do a pid translation job.
> > But for ns hierarchy, syscall like:
> >
> > pid_t* getnspid(pid_t query_pid, pid_t observer_pid)
> > or
> > pid_t getnspid(pid_t query_pid, int query_fd, int ref_fd)
> >
> > could not work, we knew a pid lived in one ns, but we
> 
> Note I still disagree here. 
> 
> > did not know their relationships.
> > For getting the entire set of pids, both of them can do.
> >
> > So using procfs is a better way.
> >
> > Ex:
> >     init_pid_ns     ns1         ns2
> > t1  2
> > t2   `- 3           1
> > t3       `- 4       `- 5        1
> > t4           `-6        `-8      `-9
> > t5             `-10        `-9      `-10
> >
> > 1. How procfs work:
> > a) adding a nspid hierarchy  under /proc/ like:
> > [root@localhost proc]# tree /proc/nspid
> > /proc/nspid
> > ├── ns0
> > │    └── ns1
> 
> Are these actually called 'ns1' etc?  Adding a namespace of pid
> namespace names is a bad thing.

That's just an example.
We are inclined to name them ns$(inum),
as we did in proc_ns_readlink.

> 
> > │       ├── ns2
> > │       │   └── pid -> /proc/9/ns
> > │       └── pid -> /proc/4/ns
> > └── pid -> /proc/1/ns
> >
> > We created dirs and add a link to the 1st process of this ns.
> 
> How much more kernel space does this take up?
> 

Only the first process of each newly created ns is added here,
so there would not be many entries.

> Is there an easy way to go from a pid in your own namespace
> to its proper node under /proc/nspid?  I.e. if I am interested
> in pid 9987, which happens to be pid 5 inside a container in
> ns2, and then I want to know what it means when it (pid 9987)
> is talking about 'pid 10'.  Is there a link under /proc/9987/
> leading to /proc/nspid/ns2/5 ?

If you want to query pid 9987, you could:
a) readlink /proc/9987/ns/pid
b) look up the matching /proc/nspid/ns$(inum)/ns$(inum)..
c) find the link to the first process of the new ns under ns$(inum).

Or, as you suggested above,
we could change /proc/PID/ns/pid:
a) when a new ns is created, we put it under /proc/nspid
b) create a link from /proc/PID/ns/pid to /proc/nspid/ns$(inum)/pid

Then we could get a clearer view:
1. pidns view
/proc/nspid
├── ns_4026531836	(ns0)
│  ├─ ns1
│  │   ├─── ns2
│  │   └── pid -> pid:[4026531836]
│  └── pid -> pid:[4026531816]
└── pid -> pid:[4026531806]

Then /proc/9987/ns/pid would be a link to ns2:
2. PID1 lives in ns0, PID2 lives in ns2
/proc/PID1/ns/pid->/proc/nspid/ns_4026531806

/proc/PID2/ns/pid->/proc/nspid/ns_4026531836

> 
> > b) expose all sets of pid, pgid, sid and tgid
> > via expanded /proc/PID/status
> >       We could get translated IDs from container like:
> >     NStgid:	6 	8	9
> >     NSpid:	6 	8 	9
> >     NSpgid:	6 	8 	9
> >     NSsid:	6 	1 	0
> >     (a set of IDs with 3 level of ns)
> 
> This sure does seem the simplest route.  But it actually still
> does not provide us an easy answer to "what does pid 9987 mean
> when it talks about pid 10?".

Do you mean:
init_pid_ns   ns1     ns2
9987            10      5
Neither the getnspid syscall nor the /proc/PID/status expansion
could answer this without hierarchy information.
For users in init_pid_ns, getnspid needs
an observer pid that lives in ns1 and only in ns1,
or we have to call getnspid from inside ns1.
See below for more.

> 
> > 2. Advantage of procfs solution
> > a) easy to use:
> > getnspid(6, 10) -> (10, 9, 10)
> > or
> > getnspid(10, ns1_fd, ns0_fd) -> 9
> > getnspid(10, ns2_fd, ns0_fd) -> 10
> >
> > And we could also get it by:
> > cat /proc/10/status | grep NSpid:
> > NSpid:	10 	9 	10
> > ...
> 
> It looks nice, but I'm not convinced it gives us the info we
> need.
> 
> It's certainly possible that I've just not thought it through
> enough.
> 
> Question: are you proposing this (/proc/pid/status expansion) as an
> alternative to /proc/nspid, or are they meant to be complementary?
> 

We want /proc/nspid as a complement to pid translation.
Ex:
    init_pid_ns     ns1         ns2
t1  2
t2   `- 3           1 
t3       `- 4       `- 5        1
t4           `-6        `-8      `-9
t5             `-10        `-9      `-10
Suppose we are in init_pid_ns:
getnspid(9,4)->6 (t4)
getnspid(9,3)->10 (t5)
We know t2 is in ns1 and t3 is in ns2, but we don't know how ns1 and ns2 are related.
If we want to query pid 9 in ns1, we can use getnspid(9,3)->10 (t5),
but the prerequisite is that we know ns2 is a child of ns1.

Thanks,
-Chen

> > b) hierarchy info:
> > We could not get the ns hierarchy info by just one syscall.
> > If we had to, it will complicate the interface.
> 
> Agreed.  But I'm not sure that's particularly important.
> 
> > We could check whether two process had some relations
> > via procfs:
> > readlink /proc/PID1/ns/pid -> aaa
> > readlink /proc/PID2/ns/pid -> bbb
> >
> > Then we could check /proc/nspid/nsX/nsY/nsZ
> > and find out their relationship.
> > Ex:
> > We know t4 live in ns2,
> > readlink /proc/t4/ns/pid -> AAA
> > then we refer to /proc/nspid/ and find a same inum AAA under
> > /proc/nspid/ns0/ns1/ns2
> > Then we knew that t4 have pid 9 in ns2, have pid 8 in ns1.
> >
> > Any comments would be warmly welcomed!
> >
> > Thanks,
> > - Chen
> >
> > > -----Original Message-----
> > > From: containers-bounces@lists.linux-foundation.org
> > > [mailto:containers-bounces@lists.linux-foundation.org] On Behalf Of
> > > chenhanxiao@cn.fujitsu.com
> > > Sent: Wednesday, July 09, 2014 6:34 PM
> > > To: Eric W. Biederman (ebiederm@xmission.com); Serge Hallyn
> > > (serge.hallyn@ubuntu.com); Oleg Nesterov (oleg@redhat.com); Richard
> Weinberger
> > > (richard@nod.at); Pavel Emelyanov (xemul@parallels.com); Vasily Kulikov
> > > (segoon@openwall.com); Gotou, Yasunori/五島 康文; 'Daniel P. Berrange
> > > (berrange@redhat.com)'
> > > Cc: containers@lists.linux-foundation.org; linux-kernel@vger.kernel.org
> > > Subject: RE: [RFC]Pid conversion between pid namespace
> > >
> > > Hi,
> > >
> > > Let me summarize our discussions of ID conversion by pros/cons:
> > >
> > > A) make new system call for translation
> > >     A-1) systemcall(ID, NS1, NS2) into (ID).
> > >     pros:
> > >         - has a reference ns(NS2)
> > >           We could get any lower level ID directly.
> > >
> > >     cons:
> > >         - lack of hierarchy information.
> > >           CRIU need hierarchy info for checkpoint/restore in nested containers.
> > >         - not easy for debug.
> > >           And a lot of tools/libs need be modified.
> > >
> > >     A-2) syscall pid_t getnspid(pid_t query_pid, pid_t observer_pid)
> > >     pros:
> > >         - ns procfs free, easy to use.
> > >         We could get rid of mounted ns procfs.
> > >
> > >     cons:
> > >         - may find multiple results in nested ns.
> > >           We wished the new API could tell us the exact answer.
> > >           But if getnspid return more than one results will bring trouble to
> admins,
> > >           they had to make another decision.
> > >           Or we marked the deepest level for translation as prerequisite.
> > >
> > >         -based on current pidns, no reference ns.
> > >
> > > B) make/change proc file/directories
> > > 	B-1) expand /proc/pid/status
> > > 	pros:
> > >         - easy to use and to debug
> > >         - already had existed interface in kernel
> > >
> > > 	cons:
> > >         - based on current ns
> > >           for middle level, we had to make another decision.
> > >         - do not have hierarchy info.
> > >
> > > 	B-2) /proc/<pidX>/ns/proc/ which would contain everything
> > > 	pros:
> > >         - have enough info from /proc in container
> > >
> > > 	cons:
> > >         - Requirements unclear.
> > >           We need more discussion to decide which items should not be exposed.
> > >         - do not have hierarchy info.
> > >
> > >
> > > How about do these things in two steps:
> > >
> > > C)  1. expose all sets of pid, pgid, sid and tgid
> > > via expanded /proc/PID/status
> > >       We could get translated IDs from container like:
> > >     NStgid:	16465 	5 	1
> > >     NSpid:	16465 	5 	1
> > >     NSpgid:	16465 	5 	1
> > >     NSsid:	16423 	1 	0
> > >     (a set of IDs with 3 level of ns)
> > >
> > >     2. add hierarchy info under /proc
> > >       We lacked of method of getting hierarchy info, which is useful.
> > >       Then we could know the relationship of ns.
> > >       How about adding a new proc file just under /proc
> > >       to show the hierarchy like readlink did:
> > > 	  pid:[4026531836]-> [4026532390] -> [4026532484]
> > >       pid:[4026531836]-> [4026532491]
> > >       (A 3 level pid and 2 level pid_
> > >
> > > Any comments would be appreciated.
> > >
> > > Thanks,
> > > - Chen
> > >
> > > > -----Original Message-----
> > > > Subject: [RFC]Pid conversion between pid namespace
> > > >
> > > > Hi,
> > > >
> > > > We had some discussions on how to carry out
> > > > pid conversion between pid namespace via:
> > > > syscall[1] and procfs[2].
> > > >
> > > > Pavel suggested that a syscall like
> > > > (ID, NS1, NS2) into (ID).
> > > >
> > > > Serge suggested that a syscall
> > > > pid_t getnspid(pid_t query_pid, pid_t observer_pid).
> > > >
> > > >
> > > > Eric and Richard suggested a procfs solution is
> > > > more appropriate.
> > > >
> > > > Oleg suggested that we should expand /proc/pid/status
> > > > to report this kind of information.
> > > >
> > > > And Richard suggested adding a directory like
> > > > /proc/<pidX>/ns/proc/ which would contain everything
> > > > from /proc/<pidX inside the namespace>/.
> > > >
> > > > As procfs provided a more user friendly interface,
> > > > how about expose all sets of tgid, pid, pgid, sid
> > > > by expanding /proc/PID/status in procfs?
> > > > And we could also expose ns hierarchy under /proc,
> > > > which could be another reference.
> > > >
> > > > Ex:
> > > >     init_pid_ns    ns1         ns2
> > > > t1  2
> > > > t2   `- 3          1
> > > > t3       `- 4      `- 5        1
> > > >
> > > > We could get in /proc/t3/status:
> > > > NSpid: 4 5 1
> > > > We knew that pid 1 in container is pid 4 in init ns.
> > > >
> > > > And we could get ns hierarchy under /proc/ns_hierarchy like:
> > > > init_ns->ns1->ns2		(as the result of readlink)
> > > >          ->ns3
> > > > We knew that t3 in ns2, and its hierarchy.
> > > >
> > > > How these ideas looks like?
> > > > Any comments would be appreciated.
> > > >
> > > > Thanks,
> > > > - Chen
> > > >
> > > >
> > > > a) syscall
> > > > http://lwn.net/Articles/602987/
> > > >
> > > > b) procfs
> > > > http://www.spinics.net/lists/kernel/msg1751688.html
> > > >
> > > > _______________________________________________
> > > > Containers mailing list
> > > > Containers@lists.linux-foundation.org
> > > > https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC]Pid conversion between pid namespace
  2014-08-07 10:03               ` chenhanxiao
@ 2014-08-07 16:11                   ` Serge Hallyn
  -1 siblings, 0 replies; 30+ messages in thread
From: Serge Hallyn @ 2014-08-07 16:11 UTC (permalink / raw)
  To: chenhanxiao-BthXqXjhjHXQFUHtdCDX3A
  Cc: Richard Weinberger (richard-/L3Ra7n9ekc@public.gmane.org),
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org),
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Eric W. Biederman
	(ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org),
	Vasily Kulikov (segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org)

Quoting chenhanxiao@cn.fujitsu.com (chenhanxiao@cn.fujitsu.com):
> Hi,
> 
> > -----Original Message-----
> > From: Serge Hallyn [mailto:serge.hallyn@ubuntu.com]
> > Sent: Tuesday, August 05, 2014 6:21 AM
> > 
> > Quoting chenhanxiao@cn.fujitsu.com (chenhanxiao@cn.fujitsu.com):
> > > Hi,
> > >
> > > We discussed two ways of pid conversion:
> > > syscall and procfs.
> > >
> > > Both of them could do a pid translation job.
> > > But for ns hierarchy, syscall like:
> > >
> > > pid_t* getnspid(pid_t query_pid, pid_t observer_pid)
> > > or
> > > pid_t getnspid(pid_t query_pid, int query_fd, int ref_fd)
> > >
> > > could not work, we knew a pid lived in one ns, but we
> > 
> > Note I still disagree here. 
> > 
> > > did not know their relationships.
> > > For getting the entire set of pids, both of them can do.
> > >
> > > So using procfs is a better way.
> > >
> > > Ex:
> > >     init_pid_ns     ns1         ns2
> > > t1  2
> > > t2   `- 3           1
> > > t3       `- 4       `- 5        1
> > > t4           `-6        `-8      `-9
> > > t5             `-10        `-9      `-10
> > >
> > > 1. How procfs work:
> > > a) adding a nspid hierarchy  under /proc/ like:
> > > [root@localhost proc]# tree /proc/nspid
> > > /proc/nspid
> > > ├── ns0
> > > │    └── ns1
> > 
> > Are these actually called 'ns1' etc?  Adding a namespace of pid
> > namespace names is a bad thing.
> 
> That's just an example.
> We are inclined to name them ns$(inum),
> as is done in proc_ns_readlink.
> 
> > 
> > > │       ├── ns2
> > > │       │   └── pid -> /proc/9/ns
> > > │       └── pid -> /proc/4/ns
> > > └── pid -> /proc/1/ns
> > >
> > > We created dirs and add a link to the 1st process of this ns.
> > 
> > How much more kernel space does this take up?
> > 
> 
> Only the first process of each newly created ns will be added here,
> so there would not be many entries.

Oh, I see.

> > Is there an easy way to go from a pid in your own namespace
> > to its proper node under /proc/nspid?  I.e. if I am interested
> > in pid 9987, which happens to be pid 5 inside a container in
> > ns2, and then I want to know what it means when it (pid 9987)
> > is talking about 'pid 10'.  Is there a link under /proc/9987/
> > leading to /proc/nspid/ns2/5 ?
> 
> If you want to query pid 9987, you could:
> a) readlink /proc/9987/ns/pid
> b) refer to /proc/nspid/ns$(inum)/ns$(inum)..
> c) Also the link to the 1st new ns process could be found under ns$(inum).

This is good.  Let's go with it.
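
Steps a) and b) could be scripted roughly as below. This is only a
sketch: it assumes the link target has the usual "pid:[inum]" form, and
that the proposed /proc/nspid tree uses one ns$(inum) directory per
level. The /proc/nspid layout is the proposal from this thread, not an
existing interface, and the helper names are made up for illustration.

```python
import os
import re

# Format of a namespace symlink target, e.g. "pid:[4026531836]".
_NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (type, inode number)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError("unexpected ns link target: %r" % (target,))
    return match.group(1), int(match.group(2))


def pidns_inum(pid):
    """Step a): readlink /proc/<pid>/ns/pid and return the inode number."""
    return parse_ns_link(os.readlink("/proc/%d/ns/pid" % pid))[1]


def nspid_path(inum_chain):
    """Step b): hypothetical /proc/nspid path for a chain of pidns inode
    numbers, ordered from the initial namespace down to the target one."""
    return "/proc/nspid/" + "/".join("ns%d" % inum for inum in inum_chain)
```

Note that nspid_path() needs the whole chain of parent namespaces; a
single readlink only yields the innermost inum, which is why the
hierarchy has to be exposed somewhere.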

> Or as what you said above,

Nah.  Let's not change /proc/PID/ns/pid.

> we could make some changes to /proc/PID/ns/pid:
> a) when a new ns is created, we put it under /proc/nspid
> b) create a link from /proc/PID/ns/pid to /proc/nspid/ns$(inum)/pid
> 
> Then we could get a clearer view:
> 1. pidns view
> /proc/nspid
> ├── ns_4026531836	(ns0)
> │   ├── ns1
> │   │   ├── ns2
> │   │   └── pid -> pid:[4026531836]
> │   └── pid -> pid:[4026531816]
> └── pid -> pid:[4026531806]
> 
> Then there will be a link under /proc/9987/ns/pid to ns2:
> 2. PID1 lives in ns0, PID2 lives in ns2
> /proc/PID1/ns/pid->/proc/nspid/ns_4026531806
> 
> /proc/PID2/ns/pid->/proc/nspid/ns_4026531836
> 
> > 
> > > b) expose all sets of pid, pgid, sid and tgid
> > > via expanded /proc/PID/status
> > >       We could get translated IDs from container like:
> > >     NStgid:	6 	8	9
> > >     NSpid:	6 	8 	9
> > >     NSpgid:	6 	8 	9
> > >     NSsid:	6 	1 	0
> > >     (a set of IDs with 3 level of ns)
> > 
> > This sure does seem the simplest route.  But it actually still
> > does not provide us an easy answer to "what does pid 9987 mean
> > when it talks about pid 10?".
> 
> Do you mean:
> init_pid_ns   ns1     ns2
> 9987            10      5
> Neither getnspid syscall nor proc/PID/status expansion
> could answer this without hierarchy information.
> For users in init_pid_ns, getnspid needs
> an observer pid that lives in ns1 and only in ns1,

Yes, good point.  That's a definite disadvantage of getnspid
compared to your proc approach.

> or we should call getnspid in ns1.
> See below for more.
> 
> > 
> > > 2. Advantage of procfs solution
> > > a) easy to use:
> > > getnspid(6, 10) -> (10, 9, 10)
> > > or
> > > getnspid(10, ns1_fd, ns0_fd) -> 9
> > > getnspid(10, ns2_fd, ns0_fd) -> 10
> > >
> > > And we could also get it by:
> > > cat /proc/10/status | grep NSpid:
> > > NSpid:	10 	9 	10
> > > ...
> > 
> > It looks nice, but I'm not convinced it gives us the info we
> > need.
> > 
> > It's certainly possible that I've just not thought it through
> > enough.
> > 
> > Question: are you proposing this (/proc/pid/status expansion) as an
> > alternative to /proc/nspid, or are they meant to be complementary?
> > 
> 
> We want /proc/nspid as a complement for pid translation.

Ok.
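
For the status side, reading the proposed NS* lines back is simple. A
minimal sketch, assuming the tab-separated layout shown in the examples
above, with the leftmost ID belonging to the outermost level shown; the
function name and the sample text are illustrative only.

```python
def parse_ns_ids(status_text):
    """Return {"NSpid": [ids...], ...} for the NS* lines of a status dump.

    IDs keep the order in which they appear on the line, i.e. from the
    outermost level shown down to the process's own namespace.
    """
    wanted = ("NStgid", "NSpid", "NSpgid", "NSsid")
    result = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in wanted:
            result[key] = [int(tok) for tok in value.split()]
    return result


# Sample modelled on the t4 values quoted in this thread.
sample = (
    "Name:\tbash\n"
    "NStgid:\t6 \t8\t9\n"
    "NSpid:\t6 \t8 \t9\n"
    "NSpgid:\t6 \t8 \t9\n"
    "NSsid:\t6 \t1 \t0\n"
)
ids = parse_ns_ids(sample)
print(ids["NSpid"])  # [6, 8, 9]
```

This gives the IDs per level but says nothing about which namespace
each level is, which is the part /proc/nspid is meant to supply.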

> Ex:
>     init_pid_ns     ns1         ns2
> t1  2
> t2   `- 3           1 
> t3       `- 4       `- 5        1
> t4           `-6        `-8      `-9
> t5             `-10        `-9      `-10
> Suppose we are in init_pid_ns:
> getnspid(9,4)->6 (t4)
> getnspid(9,3)->10 (t5)
> We know t2 is in ns1 and t3 is in ns2, but we don't know how the two namespaces are related.
> If we want to query pid 9 in ns1, we could use getnspid(9,3)->10 (t5),
> but the prerequisite is that we know ns2 is a child of ns1.

I like your proc approach.  Do you have an implementation?
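
For reference, the lookup in the example above can be modelled in
userspace once each level of a task's NSpid list is paired with the
namespace it belongs to. A rough sketch with the t1..t5 table
hard-coded; TASKS and resolve() are hypothetical names, and the pairing
of pids with namespace names is assumed to come from the hierarchy
information, not from any existing file.

```python
# Per-task (namespace, pid) chains for t1..t5 from the example table,
# ordered from init_pid_ns down to the task's own namespace.
TASKS = {
    "t1": [("init_pid_ns", 2)],
    "t2": [("init_pid_ns", 3), ("ns1", 1)],
    "t3": [("init_pid_ns", 4), ("ns1", 5), ("ns2", 1)],
    "t4": [("init_pid_ns", 6), ("ns1", 8), ("ns2", 9)],
    "t5": [("init_pid_ns", 10), ("ns1", 9), ("ns2", 10)],
}


def resolve(query_pid, observer_pid, tasks=TASKS):
    """Return the init-ns pid of the task that the observer (given by its
    init-ns pid) calls `query_pid` in its own, innermost namespace.
    Returns None if no such task exists."""
    observer = next(
        (chain for chain in tasks.values() if chain[0][1] == observer_pid),
        None,
    )
    if observer is None:
        raise LookupError("no task with init-ns pid %d" % observer_pid)
    observer_ns = observer[-1][0]  # innermost namespace of the observer
    for chain in tasks.values():
        if (observer_ns, query_pid) in chain:
            return chain[0][1]
    return None


print(resolve(9, 4))  # 6, i.e. t4
print(resolve(9, 3))  # 10, i.e. t5
```

The two calls reproduce getnspid(9,4)->6 and getnspid(9,3)->10 from the
example; the namespace names in the chains are exactly the information
that NSpid alone does not carry.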

-serge

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [RFC]Pid conversion between pid namespace
  2014-08-07 16:11                   ` Serge Hallyn
@ 2014-08-08  9:30                     ` chenhanxiao
  -1 siblings, 0 replies; 30+ messages in thread
From: chenhanxiao-BthXqXjhjHXQFUHtdCDX3A @ 2014-08-08  9:30 UTC (permalink / raw)
  To: Serge Hallyn
  Cc: Richard Weinberger (richard-/L3Ra7n9ekc@public.gmane.org),
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org),
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Eric W. Biederman
	(ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org),
	Vasily Kulikov (segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org)



> -----Original Message-----
> From: Serge Hallyn [mailto:serge.hallyn@ubuntu.com]
> Sent: Friday, August 08, 2014 12:12 AM
> To: Chen, Hanxiao/陈 晗霄

> > > How much more kernel space does this take up?
> > >
> >
> > Only the first process of each newly created ns will be added here,
> > so there would not be many entries.
> 
> Oh, I see.
> 
> > > Is there an easy way to go from a pid in your own namespace
> > > to its proper node under /proc/nspid?  I.e. if I am interested
> > > in pid 9987, which happens to be pid 5 inside a container in
> > > ns2, and then I want to know what it means when it (pid 9987)
> > > is talking about 'pid 10'.  Is there a link under /proc/9987/
> > > leading to /proc/nspid/ns2/5 ?
> >
> > If you want to query pid 9987, you could:
> > a) readlink /proc/9987/ns/pid
> > b) refer to /proc/nspid/ns$(inum)/ns$(inum)..
> > c) Also the link to the 1st new ns process could be found under ns$(inum).
> 
> This is good.  Let's go with it.

OK

> 
> > Or as what you said above,
> 
> Nah.  Let's not change /proc/PID/ns/pid.
> 
> > > This sure does seem the simplest route.  But it actually still
> > > does not provide us an easy answer to "what does pid 9987 mean
> > > when it talks about pid 10?".
> >
> > Do you mean:
> > init_pid_ns   ns1     ns2
> > 9987            10      5
> > Neither getnspid syscall nor proc/PID/status expansion
> > could answer this without hierarchy information.
> > For users in init_pid_ns, getnspid needs
> > an observer pid that lives in ns1 and only in ns1,
> 
> Yes, good point.  That's a definite disadvantage of getnspid
> compared to your proc approach.
> 
> > or we should call getnspid in ns1.
> > See below for more.
> >
> > >
> > > > 2. Advantage of procfs solution
> > > > a) easy to use:
> > > > getnspid(6, 10) -> (10, 9, 10)
> > > > or
> > > > getnspid(10, ns1_fd, ns0_fd) -> 9
> > > > getnspid(10, ns2_fd, ns0_fd) -> 10
> > > >
> > > > And we could also get it by:
> > > > cat /proc/10/status | grep NSpid:
> > > > NSpid:	10 	9 	10
> > > > ...
> > >
> > > It looks nice, but I'm not convinced it gives us the info we
> > > need.
> > >
> > > It's certainly possible that I've just not thought it through
> > > enough.
> > >
> > > Question: are you proposing this (/proc/pid/status expansion) as an
> > > alternative to /proc/nspid, or are they meant to be complementary?
> > >
> >
> > We want /proc/nspid as a complement for pid translation.
> 
> Ok.
> 
> > Ex:
> >     init_pid_ns     ns1         ns2
> > t1  2
> > t2   `- 3           1
> > t3       `- 4       `- 5        1
> > t4           `-6        `-8      `-9
> > t5             `-10        `-9      `-10
> > Suppose we are in init_pid_ns:
> > getnspid(9,4)->6 (t4)
> > getnspid(9,3)->10 (t5)
> > We know t2 is in ns1 and t3 is in ns2, but we don't know how the two namespaces are related.
> > If we want to query pid 9 in ns1, we could use getnspid(9,3)->10 (t5),
> > but the prerequisite is that we know ns2 is a child of ns1.
> 
> I like your proc approach.  Do you have an implementation?

Thanks for your comments.
I'm preparing the pidns hierarchy patch.
It seems that it's not easy to carry it out.

Thanks,
- Chen  

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC]Pid conversion between pid namespace
  2014-08-08  9:30                     ` chenhanxiao
@ 2014-08-28 13:49                         ` Serge E. Hallyn
  -1 siblings, 0 replies; 30+ messages in thread
From: Serge E. Hallyn @ 2014-08-28 13:49 UTC (permalink / raw)
  To: chenhanxiao-BthXqXjhjHXQFUHtdCDX3A
  Cc: Richard Weinberger (richard-/L3Ra7n9ekc@public.gmane.org),
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Serge Hallyn,
	Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org),
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Eric W. Biederman
	(ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org),
	Vasily Kulikov (segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org)

Quoting chenhanxiao@cn.fujitsu.com (chenhanxiao@cn.fujitsu.com):
> 
> 
> > -----Original Message-----
> > From: Serge Hallyn [mailto:serge.hallyn@ubuntu.com]
> > Sent: Friday, August 08, 2014 12:12 AM
> > To: Chen, Hanxiao/陈 晗霄
> 
> > > > How much more kernel space does this take up?
> > > >
> > >
> > > Only the first process of each newly created ns will be added here,
> > > so there would not be many entries.
> > 
> > Oh, I see.
> > 
> > > > Is there an easy way to go from a pid in your own namespace
> > > > to its proper node under /proc/nspid?  I.e. if I am interested
> > > > in pid 9987, which happens to be pid 5 inside a container in
> > > > ns2, and then I want to know what it means when it (pid 9987)
> > > > is talking about 'pid 10'.  Is there a link under /proc/9987/
> > > > leading to /proc/nspid/ns2/5 ?
> > >
> > > If you want to query pid 9987, you could:
> > > a) readlink /proc/9987/ns/pid
> > > b) refer to /proc/nspid/ns$(inum)/ns$(inum)..
> > > c) Also the link to the 1st new ns process could be found under ns$(inum).
> > 
> > This is good.  Let's go with it.
> 
> OK
> 
> > 
> > > Or as what you said above,
> > 
> > Nah.  Let's not change /proc/PID/ns/pid.
> > 
> > > > This sure does seem the simplest route.  But it actually still
> > > > does not provide us an easy answer to "what does pid 9987 mean
> > > > when it talks about pid 10?".
> > >
> > > Do you mean:
> > > init_pid_ns   ns1     ns2
> > > 9987            10      5
> > > Neither getnspid syscall nor proc/PID/status expansion
> > > could answer this without hierarchy information.
> > > For users in init_pid_ns, getnspid needs
> > > an observer pid that lives in ns1 and only in ns1,
> > 
> > Yes, good point.  That's a definite disadvantage of getnspid
> > compared to your proc approach.
> > 
> > > or we should call getnspid in ns1.
> > > See below for more.
> > >
> > > >
> > > > > 2. Advantage of procfs solution
> > > > > a) easy to use:
> > > > > getnspid(6, 10) -> (10, 9, 10)
> > > > > or
> > > > > getnspid(10, ns1_fd, ns0_fd) -> 9
> > > > > getnspid(10, ns2_fd, ns0_fd) -> 10
> > > > >
> > > > > And we could also get it by:
> > > > > cat /proc/10/status | grep NSpid:
> > > > > NSpid:	10 	9 	10
> > > > > ...
> > > >
> > > > It looks nice, but I'm not convinced it gives us the info we
> > > > need.
> > > >
> > > > It's certainly possible that I've just not thought it through
> > > > enough.
> > > >
> > > > Question: are you proposing this (/proc/pid/status expansion) as an
> > > > alternative to /proc/nspid, or are they meant to be complementary?
> > > >
> > >
> > > We want /proc/nspid as a complement for pid translation.
> > 
> > Ok.
> > 
> > > Ex:
> > >     init_pid_ns     ns1         ns2
> > > t1  2
> > > t2   `- 3           1
> > > t3       `- 4       `- 5        1
> > > t4           `-6        `-8      `-9
> > > t5             `-10        `-9      `-10
> > > Suppose we are in init_pid_ns:
> > > getnspid(9,4)->6 (t4)
> > > getnspid(9,3)->10 (t5)
> > > We know t2 is in ns1 and t3 is in ns2, but we don't know how the two namespaces are related.
> > > If we want to query pid 9 in ns1, we could use getnspid(9,3)->10 (t5),
> > > but the prerequisite is that we know ns2 is a child of ns1.
> > 
> > I like your proc approach.  Do you have an implementation?
> 
> Thanks for your comments.
> I'm preparing the pidns hierarchy patch.
> It seems that it's not easy to carry it out.

:)  Not entirely surprised.

Please do send patches earlier rather than later to avoid going
down a path that someone's going to nack anyway.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [RFC]Pid conversion between pid namespace
  2014-08-28 13:49                         ` Serge E. Hallyn
@ 2014-08-29  9:59                             ` chenhanxiao
  -1 siblings, 0 replies; 30+ messages in thread
From: chenhanxiao-BthXqXjhjHXQFUHtdCDX3A @ 2014-08-29  9:59 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Richard Weinberger (richard-/L3Ra7n9ekc@public.gmane.org),
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Serge Hallyn,
	Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org),
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Eric W. Biederman
	(ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org),
	Vasily Kulikov (segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org)



> -----Original Message-----
> From: Serge E. Hallyn [mailto:serge@hallyn.com]
> Sent: Thursday, August 28, 2014 9:50 PM
> To: Chen, Hanxiao/陈 晗霄
> Cc: Serge Hallyn; Richard Weinberger (richard@nod.at);
[snip]

> > > I like your proc approach.  Do you have an implementation?
> >
> > Thanks for your comments.
> > I'm preparing the pidns hierarchy patch.
> > It seems that it's not easy to carry it out.
> 
> :)  Not entirely surprised.
> 
> Please do send patches earlier rather than later to avoid going
> down a path that someone's going to nack anyway.

I've almost finished the ns hierarchy patch
and am running some tests now.
I will send it next week.

Thanks,
- Chen


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2014-08-29  9:59 UTC | newest]

Thread overview: 30+ messages
2014-07-03 12:18 [RFC]Pid conversion between pid namespace chenhanxiao-BthXqXjhjHXQFUHtdCDX3A
2014-07-03 12:18 ` chenhanxiao
     [not found] ` <5871495633F38949900D2BF2DC04883E55C374-ZEd+hNNJ6a5ZYpXjqAkB5jz3u5zwRJJDAzI0kPv9QBlmR6Xm/wNWPw@public.gmane.org>
2014-07-04  5:34   ` Yasunori Goto
2014-07-04  5:34     ` Yasunori Goto
2014-07-09 10:34   ` chenhanxiao-BthXqXjhjHXQFUHtdCDX3A
2014-07-09 10:34     ` chenhanxiao
     [not found]     ` <5871495633F38949900D2BF2DC04883E560412-ZEd+hNNJ6a5ZYpXjqAkB5jz3u5zwRJJDAzI0kPv9QBlmR6Xm/wNWPw@public.gmane.org>
2014-07-15  4:16       ` Serge Hallyn
2014-07-15  4:16         ` Serge Hallyn
2014-07-21 10:47         ` chenhanxiao-BthXqXjhjHXQFUHtdCDX3A
2014-07-21 10:47           ` chenhanxiao
     [not found]           ` <5871495633F38949900D2BF2DC04883E569892-ZEd+hNNJ6a5ZYpXjqAkB5jz3u5zwRJJDAzI0kPv9QBlmR6Xm/wNWPw@public.gmane.org>
2014-07-25 17:34             ` Serge Hallyn
2014-07-25 17:34               ` Serge Hallyn
2014-07-28  8:14               ` Hu Tao
2014-07-28  8:14                 ` Hu Tao
     [not found]                 ` <20140728081444.GE31917-HsVU22ltrpZHUXRZIiWhY7I7ww31JBiOxNlhxgUlya3QT0dZR+AlfA@public.gmane.org>
2014-07-28 13:24                   ` Serge Hallyn
2014-07-28 13:24                     ` Serge Hallyn
2014-07-25 10:01       ` chenhanxiao-BthXqXjhjHXQFUHtdCDX3A
2014-07-25 10:01         ` chenhanxiao
     [not found]         ` <5871495633F38949900D2BF2DC04883E56C7A2-ZEd+hNNJ6a5ZYpXjqAkB5jz3u5zwRJJDAzI0kPv9QBlmR6Xm/wNWPw@public.gmane.org>
2014-08-04 22:20           ` Serge Hallyn
2014-08-04 22:20             ` Serge Hallyn
2014-08-07 10:03             ` chenhanxiao-BthXqXjhjHXQFUHtdCDX3A
2014-08-07 10:03               ` chenhanxiao
     [not found]               ` <5871495633F38949900D2BF2DC04883E57702E-ZEd+hNNJ6a5ZYpXjqAkB5jz3u5zwRJJDAzI0kPv9QBlmR6Xm/wNWPw@public.gmane.org>
2014-08-07 16:11                 ` Serge Hallyn
2014-08-07 16:11                   ` Serge Hallyn
2014-08-08  9:30                   ` chenhanxiao-BthXqXjhjHXQFUHtdCDX3A
2014-08-08  9:30                     ` chenhanxiao
     [not found]                     ` <5871495633F38949900D2BF2DC04883E577E1A-ZEd+hNNJ6a5ZYpXjqAkB5jz3u5zwRJJDAzI0kPv9QBlmR6Xm/wNWPw@public.gmane.org>
2014-08-28 13:49                       ` Serge E. Hallyn
2014-08-28 13:49                         ` Serge E. Hallyn
     [not found]                         ` <20140828134957.GA6047-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2014-08-29  9:59                           ` chenhanxiao-BthXqXjhjHXQFUHtdCDX3A
2014-08-29  9:59                             ` chenhanxiao
