From: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
To: Jann Horn <jannh@google.com>
Cc: kernel list <linux-kernel@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>,
	Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	Nagarajan.Muthukrishnan@oracle.com,
	Prakash Sangappa <prakash.sangappa@oracle.com>,
	Andy Lutomirski <luto@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Serge Hallyn <serge.hallyn@ubuntu.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Eugene Syromiatnikov <esyr@redhat.com>,
	xemul@parallels.com
Subject: Re: [RESEND RFC] translate_pid API
Date: Tue, 13 Mar 2018 14:20:16 -0700
Message-ID: <69f13674-7f84-5dc7-0bd7-e5e65e9cb3b0@oracle.com>
In-Reply-To: <CAG48ez1ww-pVQ9R+d+P+smRR3eaEkJ-=9GW-O-QOvH-YZoFG8A@mail.gmail.com>



On 03/13/2018 01:47 PM, Jann Horn wrote:
> On Mon, Mar 12, 2018 at 10:18 AM,  <nagarathnam.muthusamy@oracle.com> wrote:
>> Resending the RFC with participants of previous discussions
>> in the list.
>>
>> The following patch, a variation of a solution discussed
>> in https://lwn.net/Articles/736330/, gives users of PID
>> namespaces the ability to translate PIDs between
>> namespaces using a namespace identifier. PID translation
>> has been discussed in the community a few times,
>> but there has always been resistance to adding a new solution
>> for this problem.
>> I will outline Oracle Database's planned use of PID namespaces
>> and explain why none of the existing solutions can
>> solve its problem.
>>
>> Consider a system in which several PID namespaces with multiple
>> nested levels exist in parallel, with monitor processes managing
>> all the namespaces. PID translation is required so that the monitors
>> and other processes down the namespace hierarchy can control
>> processes and access information about them. Controlling
>> primarily involves a process in a parent namespace sending signals
>> to, or using ptrace on, any process in its child namespaces.
>> Accessing information means reading the /proc/<pid>/* files
>> of processes in a child namespace. None of the processes have
>> root/CAP_SYS_ADMIN privileges.
> How are you dealing with PID reuse?

We have a monitor process that tracks the liveness of
important processes. When a process dies, the monitor records
it and can therefore detect when that PID is reused.
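
For illustration only: the following is a minimal userspace sketch of one
common way to notice PID reuse, and is not necessarily how our monitor
does it. It assumes /proc is mounted and records a process's start time
(field 22 of /proc/<pid>/stat, as documented in proc(5)) next to its PID.
The names read_start_time(), track_process() and is_same_process() are
hypothetical helpers made up for this example.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record kept by a monitor for each tracked process. */
struct tracked_proc {
	pid_t pid;
	unsigned long long start_time;	/* clock ticks since boot */
};

/*
 * Read field 22 (starttime) of /proc/<pid>/stat.
 * Returns 0 on success, -1 if the process is gone or the file is malformed.
 */
int read_start_time(pid_t pid, unsigned long long *start)
{
	char path[64], buf[1024];
	FILE *f;
	size_t n;
	char *p;
	int field;

	snprintf(path, sizeof(path), "/proc/%ld/stat", (long)pid);
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';

	/* comm (field 2) may contain spaces or ')', so search from the end. */
	p = strrchr(buf, ')');
	if (!p)
		return -1;
	p++;	/* now at the space preceding field 3 (state) */

	/* Advance to the space preceding field 22. */
	for (field = 3; field < 22; field++) {
		p = strchr(p + 1, ' ');
		if (!p)
			return -1;
	}
	return sscanf(p, "%llu", start) == 1 ? 0 : -1;
}

/* Remember the identity (pid + start time) of a live process. */
int track_process(pid_t pid, struct tracked_proc *t)
{
	t->pid = pid;
	return read_start_time(pid, &t->start_time);
}

/*
 * Returns 1 if the PID still refers to the process that was tracked,
 * 0 if the PID is gone or has been reused by a different process.
 */
int is_same_process(const struct tracked_proc *t)
{
	unsigned long long now;

	if (read_start_time(t->pid, &now) != 0)
		return 0;
	return now == t->start_time;
}
```

A monitor built this way would call track_process() when it first learns
about a process and is_same_process() before acting on the stored PID.
There is still a window between the check and the action, so this only
narrows the race, it does not close it.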

>
> [...]
>> diff --git a/fs/nsfs.c b/fs/nsfs.c
>> index 36b0772..c635465 100644
>> --- a/fs/nsfs.c
>> +++ b/fs/nsfs.c
>> @@ -222,8 +222,13 @@ int ns_get_name(char *buf, size_t size, struct task_struct *task,
>>          const char *name;
>>          ns = ns_ops->get(task);
>>          if (ns) {
>> -               name = ns_ops->real_ns_name ? : ns_ops->name;
>> -               res = snprintf(buf, size, "%s:[%u]", name, ns->inum);
>> +               if (!strcmp(ns_ops->name, "pidns_id")) {
> Wouldn't it be cleaner to check for "ns_ops==&pidns_id_operations"?

Yup. Will fix it.
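
To spell out why the pointer comparison is the safer check, here is a
small userspace model, not kernel code. struct ns_ops_model and the two
tables below are hypothetical stand-ins for the kernel's operations
structures; only the "pidns_id" name string comes from the patch.

```c
#include <string.h>

/* Hypothetical, simplified stand-in for an operations table. */
struct ns_ops_model {
	const char *name;
};

const struct ns_ops_model pidns_ops_model = { .name = "pid" };
const struct ns_ops_model pidns_id_ops_model = { .name = "pidns_id" };

/* Check used in the RFC: compares the name string on every call. */
int is_pidns_id_by_name(const struct ns_ops_model *ops)
{
	return strcmp(ops->name, "pidns_id") == 0;
}

/* Suggested check: compares the table's address, no string handling. */
int is_pidns_id_by_identity(const struct ns_ops_model *ops)
{
	return ops == &pidns_id_ops_model;
}
```

The identity check cannot be fooled by another table that happens to
carry the same name, and it avoids a string comparison on each lookup.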

>
>> +                       res = snprintf(buf, size, "[%llu]",
>> +                                      (unsigned long long)ns->ns_id);
>> +               } else {
>> +                       name = ns_ops->real_ns_name ? : ns_ops->name;
>> +                       res = snprintf(buf, size, "%s:[%u]", name, ns->inum);
>> +               }
>>                  ns_ops->put(ns);
>>          }
>>          return res;
> [...]
>> diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
>> index 49538b1..11d1d57 100644
>> --- a/include/linux/pid_namespace.h
>> +++ b/include/linux/pid_namespace.h
>> @@ -11,6 +11,7 @@
>>   #include <linux/kref.h>
>>   #include <linux/ns_common.h>
>>   #include <linux/idr.h>
>> +#include <linux/list_bl.h>
>>
>>
>>   struct fs_pin;
>> @@ -44,6 +45,8 @@ struct pid_namespace {
>>          kgid_t pid_gid;
>>          int hide_pid;
>>          int reboot;     /* group exit code if this pidns was rebooted */
>> +       struct hlist_bl_node node;
>> +       atomic_t lookups_pending;
>>          struct ns_common ns;
>>   } __randomize_layout;
>>
> [...]
>> diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
>> index 0b53eef..ff83aa8 100644
>> --- a/kernel/pid_namespace.c
>> +++ b/kernel/pid_namespace.c
> [...]
>> @@ -159,6 +201,30 @@ static void delayed_free_pidns(struct rcu_head *p)
>>
>>   static void destroy_pid_namespace(struct pid_namespace *ns)
>>   {
>> +       struct pid_namespace *ph;
>> +       struct hlist_bl_head *head;
>> +       struct hlist_bl_node *dup_node;
>> +
>> +       /*
>> +        * Remove the namespace structure from hash table so
>> +        * now new lookups can start on it.
> s/now new/no new/

Will fix it.

>
> [...]
>> @@ -474,9 +551,116 @@ static struct user_namespace *pidns_owner(struct ns_common *ns)
>>          .get_parent     = pidns_get_parent,
>>   };
>>
>> +/*
>> + * translate_pid - convert pid in source pid-ns into target pid-ns.
>> + * @pid: pid for translation
>> + * @source: pid-ns id
>> + * @target: pid-ns id
>> + *
>> + * Return pid in @target pid-ns, zero if task have no pid there,
>> + * or -ESRCH of task with @pid is not found in @source pid-ns.
> s/of/if/

Will fix it.

>
>> + */
>> +SYSCALL_DEFINE3(translate_pid, pid_t, pid, u64, source,
>> +               u64, target)
>> +{
>> +       struct pid_namespace *source_ns = NULL, *target_ns = NULL;
>> +       struct pid *struct_pid;
>> +       struct pid_namespace *ph;
>> +       struct hlist_bl_head *shead = NULL;
>> +       struct hlist_bl_head *thead = NULL;
>> +       struct hlist_bl_node *dup_node;
>> +       pid_t result;
>> +
>> +       if (!source) {
>> +               source_ns = &init_pid_ns;
>> +       } else {
>> +               shead = pid_ns_hash_head(pid_ns_hash, source);
>> +               hlist_bl_lock(shead);
>> +               hlist_bl_for_each_entry(ph, dup_node, shead, node) {
>> +                       if (source == ph->ns.ns_id) {
>> +                               source_ns = ph;
>> +                               break;
>> +                       }
>> +               }
>> +               if (!source_ns) {
>> +                       hlist_bl_unlock(shead);
>> +                       return -EINVAL;
>> +               }
>> +       }
>> +       if (!ptrace_may_access(source_ns->child_reaper,
>> +                              PTRACE_MODE_READ_FSCREDS)) {
> AFAICS this proposal breaks the visibility restrictions that
> namespaces normally create. If there are two namespaces-based
> containers that use the same UID range, I don't think they should be
> able to learn information about each other, such as which PIDs are in
> use in the other container; but as far as I can tell, your proposal
> makes it possible to do that (unless an LSM or so is interfering). I
> would prefer it if this API required visibility of the targeted PID
> namespaces in the caller's PID namespace.

I am trying to mirror the access restrictions that apply to a
process's /proc/<pid>/ns/pid file. If the translator has access
to the /proc/<pid>/ns/pid files of both the source and destination
namespaces, shouldn't it be allowed to translate PIDs between
them?
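
As a rough userspace illustration of the access model I have in mind
(this is not part of the patch), a caller that can stat() both
/proc/<pid>/ns/pid links can already tell whether two processes share a
PID namespace by comparing st_dev and st_ino, as described in
namespaces(7). The helper name same_pid_ns() is hypothetical, and the
sketch assumes /proc is mounted.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 1 if both processes are in the same PID namespace,
 * 0 if they are in different ones, and -1 if either ns/pid link
 * cannot be inspected (no such process, or access denied).
 */
int same_pid_ns(pid_t a, pid_t b)
{
	char path[64];
	struct stat sa, sb;

	snprintf(path, sizeof(path), "/proc/%ld/ns/pid", (long)a);
	if (stat(path, &sa) != 0)
		return -1;

	snprintf(path, sizeof(path), "/proc/%ld/ns/pid", (long)b);
	if (stat(path, &sb) != 0)
		return -1;

	/* Same device and inode means the same namespace object. */
	return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

The -1 case is the interesting one here: the kernel already refuses to
resolve the link when the caller fails the ptrace access check on the
target process.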

>
> When doing ptrace access checks, please use the real creds in syscalls
> like this one, not the fs creds. The fs creds are for filesystem
> syscalls (in particular sys_open()), not for specialized syscalls like
> ptrace() or this one.

Will fix this.

Thanks,
Nagarathnam.

