All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kirill Tkhai <ktkhai@virtuozzo.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@redhat.com>, <serge@hallyn.com>,
	<agruenba@redhat.com>, <linux-api@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <paul@paul-moore.com>,
	<viro@zeniv.linux.org.uk>, <avagin@openvz.org>,
	<linux-fsdevel@vger.kernel.org>, <mtk.manpages@gmail.com>,
	<akpm@linux-foundation.org>, <luto@amacapital.net>,
	<gorcunov@openvz.org>, <mingo@kernel.org>,
	<keescook@chromium.org>
Subject: Re: [PATCH 2/2] pid_ns: Introduce ioctl to set vector of ns_last_pid's on ns hierarhy
Date: Wed, 3 May 2017 13:20:58 +0300	[thread overview]
Message-ID: <db243b0d-6fca-88d5-4a3f-2ef179fe1291@virtuozzo.com> (raw)
In-Reply-To: <8737cngdxi.fsf@xmission.com>

On 03.05.2017 00:13, Eric W. Biederman wrote:
> Kirill Tkhai <ktkhai@virtuozzo.com> writes:
> 
>> On 02.05.2017 19:33, Oleg Nesterov wrote:
>>> sorry for delay, vacation...
>>>
>>> On 04/28, Kirill Tkhai wrote:
>>>>
>>>> On 27.04.2017 19:22, Oleg Nesterov wrote:
>>>>>
>>>>> Ah, OK, I didn't notice the ns->child_reaper check in pidns_for_children_get().
>>>>>
>>>>> But note that it doesn't need tasklist_lock too.
>>>>
>>>> Hm, are there possible strange situations with memory ordering, when we see
>>>> ns->child_reaper of already died ns, which was placed in the same memory?
>>>> Do we have to use some memory barriers here?
>>>
>>> Could you spell please? I don't understand your concerns...
>>>
>>> I don't see how, say,
>>>
>>> 	static struct ns_common *pidns_for_children_get(struct task_struct *task)
>>> 	{
>>> 		struct ns_common *ns = NULL;
>>> 		struct pid_namespace *pid_ns;
>>>
>>> 		task_lock(task);
>>> 		if (task->nsproxy) {
>>> 			pid_ns = task->nsproxy->pid_ns_for_children;
>>> 			if (pid_ns->child_reaper) {
>                             ^^^^^^^^^^^^^^^^^^^^
>                             Oleg my apologies I missed this line earlier.
>                             This does look like a valid way to skip read_lock(&tasklist_lock);
>>> 				ns = &pid_ns->ns;
>>> 				get_pid_ns(ns);
>                                 ^^^^^^^^^^^^^ This needs to be:
>                                 get_pid_ns(pid_ns);
>                                 
>>> 			}
>>> 		}
>>> 		task_unlock(task);
>>>
>>> 		return ns;
>>> 	}
>>>
>>> can be wrong. It also looks more clean to me.
>>>
>>> ->child_reaper is not stable without tasklist, it can be dead/etc, but
>>> we do not care?
>>
>> I mean the following. We had a pid_ns1 with a child_reaper set. Then
>> it became dead, and a new pid_ns2 were allocated in the same memory.
> 
> task->nsproxy->pid_ns_for_children is always changed with
> task_lock(task) held.  See switch_task_namespaces (used by unshare and
> setns).  This also gives us the guarantee that the pid_ns reference
> won't be freed/reused in any for until task_lock(task) is dropped.

Now I've checked kmem_cache_zalloc() and it looks like it zeroes cache memory
content synchronous on allocation (it seems there is no pre-zeroed memory
for GFP_ZERO cases).

So, the zeroing happens before switch_task_namespaces() (and task_unlock())
and we're really safe after task_lock() in pidns_for_children_get().

Ok, I'll send new version of the patchset.

WARNING: multiple messages have this Message-ID (diff)
From: Kirill Tkhai <ktkhai-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org>
To: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org,
	agruenba-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	paul-r2n+y4ga6xFZroRs9YW3xA@public.gmane.org,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org,
	avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org,
	luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org,
	gorcunov-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org,
	mingo-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
	keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org
Subject: Re: [PATCH 2/2] pid_ns: Introduce ioctl to set vector of ns_last_pid's on ns hierarhy
Date: Wed, 3 May 2017 13:20:58 +0300	[thread overview]
Message-ID: <db243b0d-6fca-88d5-4a3f-2ef179fe1291@virtuozzo.com> (raw)
In-Reply-To: <8737cngdxi.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>

On 03.05.2017 00:13, Eric W. Biederman wrote:
> Kirill Tkhai <ktkhai-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org> writes:
> 
>> On 02.05.2017 19:33, Oleg Nesterov wrote:
>>> sorry for delay, vacation...
>>>
>>> On 04/28, Kirill Tkhai wrote:
>>>>
>>>> On 27.04.2017 19:22, Oleg Nesterov wrote:
>>>>>
>>>>> Ah, OK, I didn't notice the ns->child_reaper check in pidns_for_children_get().
>>>>>
>>>>> But note that it doesn't need tasklist_lock too.
>>>>
>>>> Hm, are there possible strange situations with memory ordering, when we see
>>>> ns->child_reaper of already died ns, which was placed in the same memory?
>>>> Do we have to use some memory barriers here?
>>>
>>> Could you spell please? I don't understand your concerns...
>>>
>>> I don't see how, say,
>>>
>>> 	static struct ns_common *pidns_for_children_get(struct task_struct *task)
>>> 	{
>>> 		struct ns_common *ns = NULL;
>>> 		struct pid_namespace *pid_ns;
>>>
>>> 		task_lock(task);
>>> 		if (task->nsproxy) {
>>> 			pid_ns = task->nsproxy->pid_ns_for_children;
>>> 			if (pid_ns->child_reaper) {
>                             ^^^^^^^^^^^^^^^^^^^^
>                             Oleg my apologies I missed this line earlier.
>                             This does look like a valid way to skip read_lock(&tasklist_lock);
>>> 				ns = &pid_ns->ns;
>>> 				get_pid_ns(ns);
>                                 ^^^^^^^^^^^^^ This needs to be:
>                                 get_pid_ns(pid_ns);
>                                 
>>> 			}
>>> 		}
>>> 		task_unlock(task);
>>>
>>> 		return ns;
>>> 	}
>>>
>>> can be wrong. It also looks more clean to me.
>>>
>>> ->child_reaper is not stable without tasklist, it can be dead/etc, but
>>> we do not care?
>>
>> I mean the following. We had a pid_ns1 with a child_reaper set. Then
>> it became dead, and a new pid_ns2 were allocated in the same memory.
> 
> task->nsproxy->pid_ns_for_children is always changed with
> task_lock(task) held.  See switch_task_namespaces (used by unshare and
> setns).  This also gives us the guarantee that the pid_ns reference
> won't be freed/reused in any for until task_lock(task) is dropped.

Now I've checked kmem_cache_zalloc() and it looks like it zeroes cache memory
content synchronous on allocation (it seems there is no pre-zeroed memory
for GFP_ZERO cases).

So, the zeroing happens before switch_task_namespaces() (and task_unlock())
and we're really safe after task_lock() in pidns_for_children_get().

Ok, I'll send new version of the patchset.

  reply	other threads:[~2017-05-03 10:21 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-17 17:34 [PATCH 0/2] nsfs: Introduce ioctl to set vector of ns_last_pid's on pid ns hierarhy Kirill Tkhai
2017-04-17 17:34 ` Kirill Tkhai
2017-04-17 17:36 ` [PATCH 1/2] nsfs: Add namespace-specific ioctl (NS_SPECIFIC_IOC) Kirill Tkhai
2017-04-17 17:36   ` Kirill Tkhai
2017-04-17 17:36 ` [PATCH 2/2] pid_ns: Introduce ioctl to set vector of ns_last_pid's on ns hierarhy Kirill Tkhai
2017-04-17 17:36   ` Kirill Tkhai
2017-04-19 20:27   ` Serge E. Hallyn
2017-04-19 20:27     ` Serge E. Hallyn
2017-04-24 19:03   ` Cyrill Gorcunov
2017-04-24 19:03     ` Cyrill Gorcunov
2017-04-26 15:53   ` Oleg Nesterov
2017-04-26 15:53     ` Oleg Nesterov
2017-04-26 16:11     ` Kirill Tkhai
2017-04-26 16:11       ` Kirill Tkhai
2017-04-26 16:33       ` Kirill Tkhai
2017-04-26 16:33         ` Kirill Tkhai
2017-04-26 16:32         ` Eric W. Biederman
2017-04-26 16:32           ` Eric W. Biederman
2017-04-26 16:43           ` Kirill Tkhai
2017-04-26 16:43             ` Kirill Tkhai
2017-04-26 17:01             ` Eric W. Biederman
2017-04-26 17:01               ` Eric W. Biederman
2017-04-27 16:12       ` Oleg Nesterov
2017-04-27 16:12         ` Oleg Nesterov
2017-04-27 16:17         ` Kirill Tkhai
2017-04-27 16:17           ` Kirill Tkhai
2017-04-27 16:22           ` Oleg Nesterov
2017-04-27 16:22             ` Oleg Nesterov
2017-04-28  9:17             ` Kirill Tkhai
2017-04-28  9:17               ` Kirill Tkhai
2017-05-02 16:33               ` Oleg Nesterov
2017-05-02 17:22                 ` Eric W. Biederman
2017-05-02 17:22                   ` Eric W. Biederman
2017-05-02 17:33                 ` Kirill Tkhai
2017-05-02 17:33                   ` Kirill Tkhai
2017-05-02 21:13                   ` Eric W. Biederman
2017-05-02 21:13                     ` Eric W. Biederman
2017-05-03 10:20                     ` Kirill Tkhai [this message]
2017-05-03 10:20                       ` Kirill Tkhai
2017-04-27 16:39           ` Eric W. Biederman
2017-04-27 16:39             ` Eric W. Biederman
2017-04-28  9:22             ` Kirill Tkhai
2017-04-28  9:22               ` Kirill Tkhai
2017-04-27 16:16       ` Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=db243b0d-6fca-88d5-4a3f-2ef179fe1291@virtuozzo.com \
    --to=ktkhai@virtuozzo.com \
    --cc=agruenba@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=avagin@openvz.org \
    --cc=ebiederm@xmission.com \
    --cc=gorcunov@openvz.org \
    --cc=keescook@chromium.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=mingo@kernel.org \
    --cc=mtk.manpages@gmail.com \
    --cc=oleg@redhat.com \
    --cc=paul@paul-moore.com \
    --cc=serge@hallyn.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.