All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: akpm@linux-foundation.org, peterz@infradead.org,
	viro@zeniv.linux.org.uk, mingo@kernel.org,
	paulmck@linux.vnet.ibm.com, keescook@chromium.org,
	riel@redhat.com, mhocko@suse.com, tglx@linutronix.de,
	kirill.shutemov@linux.intel.com, marcos.souza.org@gmail.com,
	hoeun.ryu@gmail.com, pasha.tatashin@oracle.com,
	gs051095@gmail.com, ebiederm@xmission.com, dhowells@redhat.com,
	rppt@linux.vnet.ibm.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/4] exit: Move read_unlock() up in mm_update_next_owner()
Date: Thu, 26 Apr 2018 17:01:17 +0200	[thread overview]
Message-ID: <20180426150116.GA14818@redhat.com> (raw)
In-Reply-To: <152474043375.29458.13978538538182642678.stgit@localhost.localdomain>

On 04/26, Kirill Tkhai wrote:
>
> @@ -464,18 +464,15 @@ void mm_update_next_owner(struct mm_struct *mm)
>  	return;
>  
>  assign_new_owner:
> -	BUG_ON(c == p);
>  	get_task_struct(c);
> +	read_unlock(&tasklist_lock);
> +	BUG_ON(c == p);
> +
>  	/*
>  	 * The task_lock protects c->mm from changing.
>  	 * We always want mm->owner->mm == mm
>  	 */
>  	task_lock(c);
> -	/*
> -	 * Delay read_unlock() till we have the task_lock()
> -	 * to ensure that c does not slip away underneath us
> -	 */
> -	read_unlock(&tasklist_lock);

I think this is correct, but...

Firstly, I agree with Michal, it would be nice to kill mm_update_next_owner()
altogether.

If this is not possible I agree, it needs cleanups and we can change it to
avoid tasklist (although your 4/4 looks overcomplicated to me at first glance).

But in this case I think that whatever we do we should start with something like
the patch below. I wrote it 3 years ago but it still applies.

Oleg.

Subject: [PATCH 1/3] memcg: introduce assign_new_owner()

The code under "assign_new_owner" looks very ugly and suboptimal.

We do not really need get_task_struct/put_task_struct(), we can
simply recheck/change mm->owner under tasklist_lock. And we do not
want to restart from the very beginning if ->mm was changed by the
time we take task_lock(), we can simply continue (if we do not drop
tasklist_lock).

Just move this code into the new simple helper, assign_new_owner().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/exit.c |   56 ++++++++++++++++++++++++++------------------------------
 1 files changed, 26 insertions(+), 30 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 22fcc05..4d446ab 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -293,6 +293,23 @@ kill_orphaned_pgrp(struct task_struct *tsk, struct task_struct *parent)
 }
 
 #ifdef CONFIG_MEMCG
+static bool assign_new_owner(struct mm_struct *mm, struct task_struct *c)
+{
+	bool ret = false;
+
+	if (c->mm != mm)
+		return ret;
+
+	task_lock(c); /* protects c->mm from changing */
+	if (c->mm == mm) {
+		mm->owner = c;
+		ret = true;
+	}
+	task_unlock(c);
+
+	return ret;
+}
+
 /*
  * A task is exiting.   If it owned this mm, find a new owner for the mm.
  */
@@ -300,7 +317,6 @@ void mm_update_next_owner(struct mm_struct *mm)
 {
 	struct task_struct *c, *g, *p = current;
 
-retry:
 	/*
 	 * If the exiting or execing task is not the owner, it's
 	 * someone else's problem.
@@ -322,16 +338,16 @@ retry:
 	 * Search in the children
 	 */
 	list_for_each_entry(c, &p->children, sibling) {
-		if (c->mm == mm)
-			goto assign_new_owner;
+		if (assign_new_owner(mm, c))
+			goto done;
 	}
 
 	/*
 	 * Search in the siblings
 	 */
 	list_for_each_entry(c, &p->real_parent->children, sibling) {
-		if (c->mm == mm)
-			goto assign_new_owner;
+		if (assign_new_owner(mm, c))
+			goto done;
 	}
 
 	/*
@@ -341,42 +357,22 @@ retry:
 		if (g->flags & PF_KTHREAD)
 			continue;
 		for_each_thread(g, c) {
-			if (c->mm == mm)
-				goto assign_new_owner;
+			if (assign_new_owner(mm, c))
+				goto done;
 			if (c->mm)
 				break;
 		}
 	}
-	read_unlock(&tasklist_lock);
+
 	/*
 	 * We found no owner yet mm_users > 1: this implies that we are
 	 * most likely racing with swapoff (try_to_unuse()) or /proc or
 	 * ptrace or page migration (get_task_mm()).  Mark owner as NULL.
 	 */
 	mm->owner = NULL;
-	return;
-
-assign_new_owner:
-	BUG_ON(c == p);
-	get_task_struct(c);
-	/*
-	 * The task_lock protects c->mm from changing.
-	 * We always want mm->owner->mm == mm
-	 */
-	task_lock(c);
-	/*
-	 * Delay read_unlock() till we have the task_lock()
-	 * to ensure that c does not slip away underneath us
-	 */
+done:
 	read_unlock(&tasklist_lock);
-	if (c->mm != mm) {
-		task_unlock(c);
-		put_task_struct(c);
-		goto retry;
-	}
-	mm->owner = c;
-	task_unlock(c);
-	put_task_struct(c);
+	return;
 }
 #endif /* CONFIG_MEMCG */
 
-- 
1.5.5.1

  reply	other threads:[~2018-04-26 15:01 UTC|newest]

Thread overview: 96+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-26 11:00 [PATCH 0/4] exit: Make unlikely case in mm_update_next_owner() more scalable Kirill Tkhai
2018-04-26 11:00 ` [PATCH 1/4] exit: Move read_unlock() up in mm_update_next_owner() Kirill Tkhai
2018-04-26 15:01   ` Oleg Nesterov [this message]
2018-04-26 11:00 ` [PATCH 2/4] exit: Use rcu instead of get_task_struct() " Kirill Tkhai
2018-04-26 11:00 ` [PATCH 3/4] exit: Rename assign_new_owner label " Kirill Tkhai
2018-04-26 11:01 ` [PATCH 4/4] exit: Lockless iteration over task list " Kirill Tkhai
2018-04-26 12:35   ` Andrea Parri
2018-04-26 13:52     ` Kirill Tkhai
2018-04-26 15:20       ` Peter Zijlstra
2018-04-26 15:56         ` Kirill Tkhai
2018-04-26 15:20       ` Peter Zijlstra
2018-04-26 16:04         ` Kirill Tkhai
2018-04-26 15:29       ` Andrea Parri
2018-04-26 16:11         ` Kirill Tkhai
2018-04-26 13:07 ` [PATCH 0/4] exit: Make unlikely case in mm_update_next_owner() more scalable Michal Hocko
2018-04-26 13:52   ` Oleg Nesterov
2018-04-26 14:07   ` Kirill Tkhai
2018-04-26 15:10     ` Oleg Nesterov
2018-04-26 16:19   ` Eric W. Biederman
2018-04-26 19:28     ` Michal Hocko
2018-04-27  7:08       ` Michal Hocko
2018-04-27 18:05         ` Eric W. Biederman
2018-05-01 17:22           ` Eric W. Biederman
2018-05-01 17:35             ` [RFC][PATCH] memcg: Replace mm->owner with mm->memcg Eric W. Biederman
2018-05-02  8:47               ` Michal Hocko
2018-05-02 13:20                 ` Johannes Weiner
2018-05-02 14:05                   ` Eric W. Biederman
2018-05-02 19:21                   ` [PATCH] " Eric W. Biederman
2018-05-02 21:04                     ` Andrew Morton
2018-05-02 21:35                       ` Eric W. Biederman
2018-05-03 13:33                     ` Oleg Nesterov
2018-05-03 14:39                       ` Eric W. Biederman
2018-05-04 14:20                         ` Oleg Nesterov
2018-05-04 14:36                           ` Eric W. Biederman
2018-05-04 14:54                             ` Oleg Nesterov
2018-05-04 15:49                               ` Eric W. Biederman
2018-05-04 16:22                                 ` Oleg Nesterov
2018-05-04 16:40                                   ` Eric W. Biederman
2018-05-04 17:26                                     ` [PATCH 0/2] mm->owner to mm->memcg fixes Eric W. Biederman
2018-05-04 17:26                                       ` [PATCH 1/2] memcg: Update the mm->memcg maintenance to work when !CONFIG_MMU Eric W. Biederman
2018-05-04 17:27                                       ` [PATCH 2/2] memcg: Close the race between migration and installing bprm->mm as mm Eric W. Biederman
2018-05-09 14:51                                         ` Oleg Nesterov
2018-05-10  3:00                                           ` Eric W. Biederman
2018-05-10 12:14                                       ` [PATCH 0/2] mm->owner to mm->memcg fixes Michal Hocko
2018-05-10 12:18                                         ` Michal Hocko
2018-05-22 12:57                                         ` Michal Hocko
2018-05-23 19:46                                           ` Eric W. Biederman
2018-05-24 11:10                                             ` Michal Hocko
2018-05-24 21:16                                               ` Andrew Morton
2018-05-24 23:37                                                 ` Andrea Parri
2018-05-30 12:17                                                 ` Michal Hocko
2018-05-31 18:41                                                   ` Eric W. Biederman
2018-06-01  1:57                                                     ` [PATCH] memcg: Replace mm->owner with mm->memcg Eric W. Biederman
2018-06-01 14:52                                                       ` [RFC][PATCH 0/2] memcg: Require every task that uses an mm to migrate together Eric W. Biederman
2018-06-01 14:53                                                         ` [RFC][PATCH 1/2] memcg: Ensure every task that uses an mm is in the same memory cgroup Eric W. Biederman
2018-06-01 16:50                                                           ` Tejun Heo
2018-06-01 18:11                                                             ` Eric W. Biederman
2018-06-01 19:16                                                               ` Tejun Heo
2018-06-04 13:01                                                                 ` Michal Hocko
2018-06-04 18:47                                                                   ` Tejun Heo
2018-06-04 19:11                                                                     ` Eric W. Biederman
2018-06-06 11:13                                                           ` Michal Hocko
2018-06-07 11:42                                                             ` Eric W. Biederman
2018-06-07 12:19                                                               ` Michal Hocko
2018-06-01 14:53                                                         ` [RFC][PATCH 2/2] memcgl: Remove dead code now that all tasks of an mm share a memcg Eric W. Biederman
2018-06-01 14:07                                                     ` [PATCH 0/2] mm->owner to mm->memcg fixes Michal Hocko
2018-05-24 21:17                                               ` Andrew Morton
2018-05-30 11:52                                             ` Michal Hocko
2018-05-31 17:43                                               ` Eric W. Biederman
2018-05-07 14:33                                     ` [PATCH] memcg: Replace mm->owner with mm->memcg Oleg Nesterov
2018-05-08  3:15                                       ` Eric W. Biederman
2018-05-09 14:40                                         ` Oleg Nesterov
2018-05-10  3:09                                           ` Eric W. Biederman
2018-05-10  4:03                                             ` [RFC][PATCH] cgroup: Don't mess with tasks in exec Eric W. Biederman
2018-05-10 12:15                                               ` Oleg Nesterov
2018-05-10 12:35                                                 ` Tejun Heo
2018-05-10 12:38                                             ` [PATCH] memcg: Replace mm->owner with mm->memcg Oleg Nesterov
2018-05-04 11:07                     ` Michal Hocko
2018-05-05 16:54                     ` kbuild test robot
2018-05-07 23:18                       ` Andrew Morton
2018-05-08  2:17                         ` Eric W. Biederman
2018-05-09 21:00                         ` Michal Hocko
2018-05-02 23:59               ` [RFC][PATCH] " Balbir Singh
2018-05-03 15:11                 ` Eric W. Biederman
2018-05-04  4:59                   ` Balbir Singh
2018-05-03 10:52           ` [PATCH 0/4] exit: Make unlikely case in mm_update_next_owner() more scalable Kirill Tkhai
2018-06-01  1:07   ` Eric W. Biederman
2018-06-01 13:57     ` Michal Hocko
2018-06-01 14:32       ` Eric W. Biederman
2018-06-01 15:02         ` Michal Hocko
2018-06-01 15:25           ` Eric W. Biederman
2018-06-04  6:54             ` Michal Hocko
2018-06-04 14:31               ` Eric W. Biederman
2018-06-05  8:15                 ` Michal Hocko
2018-06-05  8:48             ` Kirill Tkhai
2018-06-05 15:36               ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180426150116.GA14818@redhat.com \
    --to=oleg@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=dhowells@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=gs051095@gmail.com \
    --cc=hoeun.ryu@gmail.com \
    --cc=keescook@chromium.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=ktkhai@virtuozzo.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=marcos.souza.org@gmail.com \
    --cc=mhocko@suse.com \
    --cc=mingo@kernel.org \
    --cc=pasha.tatashin@oracle.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=rppt@linux.vnet.ibm.com \
    --cc=tglx@linutronix.de \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.