From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1757550AbdJQAxU (ORCPT ); Mon, 16 Oct 2017 20:53:20 -0400
Received: from mail-it0-f53.google.com ([209.85.214.53]:53897 "EHLO
 mail-it0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1754752AbdJQAxS (ORCPT ); Mon, 16 Oct 2017 20:53:18 -0400
X-Google-Smtp-Source: ABhQp+SDFG4RNXVtBpYfv1YRvTdC0bUmKc1rkQJxWCGawkrLq57BlcDzQfoOR09VMaqqoGEpDpfKlrD1GBG0M1wqHAs=
MIME-Version: 1.0
In-Reply-To:
References: <150788678482.924140.11785205105514746135.stgit@buzz>
 <20171013160514.GA27812@redhat.com>
 <3bdb5341-9ae6-265a-ce5b-45c2cfc76fad@yandex-team.ru>
 <20171016143628.b2ef80a9ef16d4345889b4d9@linux-foundation.org>
From: Andy Lutomirski
Date: Mon, 16 Oct 2017 17:52:56 -0700
Message-ID:
Subject: Re: [PATCH v4] pidns: introduce syscall translate_pid
To: prakash.sangappa@oracle.com
Cc: Nagarathnam Muthusamy, Andrew Morton, Konstantin Khlebnikov,
 Oleg Nesterov, Linux API, "linux-kernel@vger.kernel.org",
 Serge Hallyn, "Eric W. Biederman", Eugene Syromiatnikov
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Oct 16, 2017 at 3:54 PM, prakash.sangappa wrote:
>
>
> On 10/16/2017 03:07 PM, Nagarathnam Muthusamy wrote:
>>
>>
>>
>> On 10/16/2017 02:36 PM, Andrew Morton wrote:
>>>
>>> On Sat, 14 Oct 2017 11:17:47 +0300 Konstantin Khlebnikov
>>> wrote:
>>>
>>>>>>> pid_t translate_pid(pid_t pid, int source, int target);
>>>>>>>
>>>>>>> This syscall converts pid from source pid-ns into pid in target
>>>>>>> pid-ns.
>>>>>>> If pid is unreachable from target pid-ns it returns zero.
>>>>>>>
>>>>>>> Pid-namespaces are referred file descriptors opened to proc files
>>>>>>> /proc/[pid]/ns/pid or /proc/[pid]/ns/pid_for_children. Negative
>>>>>>> argument
>>>>>>> refers to current pid namespace, same as file /proc/self/ns/pid.
>>>>>>>
>>>>>>> Kernel expose virtual pids in /proc/[pid]/status:NSpid, but backward
>>>>>>> translation requires scanning all tasks. Also pids could be
>>>>>>> translated
>>>>>>> by sending them through unix socket between namespaces, this method
>>>>>>> is
>>>>>>> slow and insecure because other side is exposed inside pid namespace.
>>>>
>>>> Andrew asked why we might need this.
>>>>
>>>> Such conversion is required for interaction between processes across
>>>> pid-namespaces.
>>>> For example to identify process in container by pid file looking from
>>>> outside.
>>>>
>>>> Two years ago I've solved this in project of mine with monstrous code
>>>> which
>>>> forks couple times just to convert pid, lucky for me performance wasn't
>>>> important.
>>>
>>> That's a single user who needed this a single time, and found a
>>> userspace-based solution anyway. This is not exactly compelling!
>>>
>>> Is there a stronger case to be made? How does this change benefit our
>>> users? Sell it to us!
>>
>> Oracle database is planning to use pid namespace for sandboxing database
>> instances and they need an API similar to translate_pid to effectively
>> translate process IDs from other pid namespaces. Prakash (cced in mail) can
>> provide more details on this usecase.
>
>
> As Nagarathnam indicated, Oracle Database will be using pid namespaces and
> needs a direct method of converting pids of processes in the pid namespace
> hierarchy. In this use case multiple
> nested PID namespaces will be used. The currently available mechanism are
> not very efficient for this use case. For ex. as Konstantin described, using
> /proc//status would require the application to scan all the pid's
> status files to determine the pid of given process in a child namespace.
>
> Use of SCM_CREDENTIALS's socket message is another way, which would require
> every process starting inside a pid namespace to send this message and the
> receiving process in the target namespace would have to save the converted
> pid and reference it. This mechanism becomes cumbersome especially if the
> application has to deal with multiple nested pid namespaces. Also, the
> Database needs to be able to convert a thread's global pid(gettid()).
> Passing the thread's pid(gettid()) in SCM_CREDENTIALS message requires
> CAP_SYS_ADMIN, which is an issue.
>
> So having a direct method, like the API that Konstantin is proposing, will
> work best for the Database
> since pid of a process in any of the nested pid namespaces can be converted
> as and when required. I think with the proposed API, the application should
> be able to convert pid of a process or tid(gettid()) of a thread as well.
>

Can you explain what Oracle's database is planning to do with this information?
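
For reference, the /proc-based approach discussed in the quoted text can be
sketched roughly as follows. This is only an illustration, not code from the
patch or from any of the projects mentioned: the helper names
(parse_nspid_line, read_nspid, find_by_inner_pid) are made up here, and the
sketch assumes a kernel that exposes the NSpid field in /proc/[pid]/status
(Linux 4.1 or later). Forward translation is a single file read; backward
translation has to walk every entry in /proc, which is the cost the thread
is complaining about.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Parse one line of /proc/[pid]/status. If it is the "NSpid:" line, store
 * the listed pids (outermost visible namespace first, innermost last) in
 * out[] and return how many were stored. Return 0 for any other line.
 */
int parse_nspid_line(const char *line, pid_t *out, int max)
{
    static const char key[] = "NSpid:";
    const char *p;
    int n = 0;

    if (strncmp(line, key, sizeof(key) - 1) != 0)
        return 0;
    p = line + sizeof(key) - 1;
    while (n < max) {
        char *end;
        long v = strtol(p, &end, 10);

        if (end == p)           /* no further number on this line */
            break;
        out[n++] = (pid_t)v;
        p = end;
    }
    return n;
}

/*
 * Forward translation: read the NSpid list of a task named by its pid in
 * the caller's namespace. Returns the number of entries, or -1 on failure.
 */
int read_nspid(pid_t pid, pid_t *out, int max)
{
    char path[64], line[256];
    FILE *f;
    int n = -1;

    snprintf(path, sizeof(path), "/proc/%ld/status", (long)pid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        int got = parse_nspid_line(line, out, max);

        if (got > 0) {
            n = got;
            break;
        }
    }
    fclose(f);
    return n;
}

/*
 * Backward translation: find a task whose innermost pid is inner_pid by
 * scanning every numeric entry in /proc. Returns the pid as seen from the
 * caller's namespace, 0 if nothing matched, -1 if /proc cannot be read.
 * Simplifications: only thread-group leaders are visited, and the first
 * match wins without checking which child namespace it belongs to.
 */
pid_t find_by_inner_pid(pid_t inner_pid)
{
    DIR *d = opendir("/proc");
    struct dirent *e;
    pid_t found = 0;

    if (!d)
        return -1;
    while ((e = readdir(d)) != NULL) {
        char *end;
        long pid = strtol(e->d_name, &end, 10);
        pid_t ns[32];
        int n;

        if (end == e->d_name || *end != '\0' || pid <= 0)
            continue;           /* not a process directory */
        n = read_nspid((pid_t)pid, ns, 32);
        if (n > 1 && ns[n - 1] == inner_pid) {
            found = (pid_t)pid;
            break;
        }
    }
    closedir(d);
    return found;
}
```

A caller outside the container would use read_nspid(pid, ns, 32) and take the
last entry as the pid inside the innermost namespace. The reverse lookup is
linear in the number of tasks and, as written, ambiguous when several child
namespaces reuse the same pid, which is part of the argument for a direct
interface.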