From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932630AbcE0ID0 (ORCPT <rfc822;w@1wt.eu>);
	Fri, 27 May 2016 04:03:26 -0400
Received: from mail-wm0-f65.google.com ([74.125.82.65]:33747 "EHLO
	mail-wm0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S932338AbcE0IDW (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Fri, 27 May 2016 04:03:22 -0400
Date: Fri, 27 May 2016 10:03:20 +0200
From: Michal Hocko <mhocko@kernel.org>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: linux-mm@kvack.org, rientjes@google.com, oleg@redhat.com,
        vdavydov@parallels.com, akpm@linux-foundation.org,
        linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/6] mm, oom: do not loop over all tasks if there are no
 external tasks sharing mm
Message-ID: <20160527080319.GD27686@dhcp22.suse.cz>
References: <1464266415-15558-2-git-send-email-mhocko@kernel.org>
 <201605262330.EEB52182.OtMFOJHFLOSFVQ@I-love.SAKURA.ne.jp>
 <20160526145930.GF23675@dhcp22.suse.cz>
 <201605270025.IAC48454.QSHOOMFOLtFJFV@I-love.SAKURA.ne.jp>
 <20160526153532.GG23675@dhcp22.suse.cz>
 <201605270114.IEI48969.MFFtFOJLQOOHSV@I-love.SAKURA.ne.jp>
 <20160527064510.GA27686@dhcp22.suse.cz>
 <20160527071507.GC27686@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160527071507.GC27686@dhcp22.suse.cz>
User-Agent: Mutt/1.6.0 (2016-04-01)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri 27-05-16 09:15:07, Michal Hocko wrote:
> On Fri 27-05-16 08:45:10, Michal Hocko wrote:
> [...]
> > It is still an operation that is not needed in 99% of situations. So
> > if we do not need it for correctness, I do not think it is worth
> > bothering with.
> 
> Since you pointed out the exit_mm vs. __exit_signal race yesterday, I
> have been thinking about how to make the check reliable. Even
> atomic_read(&mm->mm_users) > get_nr_threads() is not reliable, and we can
> miss other tasks just because the current thread group is mostly past
> exit_mm. So far I couldn't find a way to work around this, though.

Just for the record, I was playing with the following yesterday, but I
couldn't convince myself that it is safe and reasonable in the first
place (I do not like it, to be honest).
---
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 1685890d424e..db027eca8be5 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -123,6 +123,35 @@ struct task_struct *find_lock_task_mm(struct task_struct *p)
 	return t;
 }
 
+bool task_has_external_users(struct task_struct *p)
+{
+	struct mm_struct *mm = NULL;
+	struct task_struct *t;
+	int active_threads = 0;
+	bool ret = true;	/* be pessimistic */
+
+	rcu_read_lock();
+	for_each_thread(p, t) {
+		task_lock(t);
+		if (likely(t->mm)) {
+			active_threads++;
+			if (!mm) {
+				mm = t->mm;
+				atomic_inc(&mm->mm_count);
+			}
+		}
+		task_unlock(t);
+	}
+	rcu_read_unlock();
+
+	if (mm) {
+		if (atomic_read(&mm->mm_users) <= active_threads)
+			ret = false;
+		mmdrop(mm);
+	}
+	return ret;
+}
+
 /*
  * order == -1 means the oom kill is required by sysrq, otherwise only
  * for display purposes.
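
To make the counting argument concrete, here is a minimal userspace
sketch, not kernel code. All model_* names and MODEL_MAX_THREADS are
hypothetical stand-ins, and locking, RCU and the mm_count pinning from
the patch are deliberately left out. It compares the naive
"mm_users > thread-group size" check with the per-thread count used by
task_has_external_users() when part of the group is already past
exit_mm.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_THREADS 16	/* arbitrary cap for this sketch */

/* Hypothetical userspace stand-ins; these are not the kernel structures. */
struct model_mm {
	int mm_users;		/* one reference per task still using the mm */
};

struct model_thread {
	struct model_mm *mm;	/* NULL once the thread is past exit_mm() */
};

struct model_result {
	bool naive;		/* mm_users > thread-group size */
	bool per_thread;	/* mm_users > threads that still have ->mm */
};

/* Counting rule from the patch, minus locking, RCU and mm_count pinning. */
static bool model_has_external_users(const struct model_thread *group,
				     int nr_threads)
{
	const struct model_mm *mm = NULL;
	int active_threads = 0;

	for (int i = 0; i < nr_threads; i++) {
		if (group[i].mm) {
			active_threads++;
			if (!mm)
				mm = group[i].mm;
		}
	}

	/* No thread has an mm left: stay pessimistic, as the patch does. */
	if (!mm)
		return true;
	return mm->mm_users > active_threads;
}

/*
 * Build a thread group of nr_threads sharing one mm with nr_external other
 * tasks, then let nr_exited threads pass exit_mm() without being removed
 * from the thread group yet.
 */
static struct model_result model_partial_exit(int nr_threads, int nr_external,
					      int nr_exited)
{
	struct model_result res = { .naive = true, .per_thread = true };
	struct model_thread group[MODEL_MAX_THREADS];
	struct model_mm mm;

	if (nr_threads < 1 || nr_threads > MODEL_MAX_THREADS ||
	    nr_external < 0 || nr_exited < 0 || nr_exited > nr_threads)
		return res;	/* out of range: report pessimistically */

	mm.mm_users = nr_threads + nr_external;
	for (int i = 0; i < nr_threads; i++)
		group[i].mm = &mm;

	/* exit_mm() clears ->mm and drops the thread's mm_users reference. */
	for (int i = 0; i < nr_exited; i++) {
		group[i].mm = NULL;
		mm.mm_users--;
	}

	/* The thread-group size still includes the exited threads. */
	res.naive = mm.mm_users > nr_threads;
	res.per_thread = model_has_external_users(group, nr_threads);
	return res;
}
```

With model_partial_exit(4, 1, 2), mm_users drops from 5 to 3, so the
naive check (3 > 4) misses the external task while the per-thread count
(3 > 2) still sees it. This only illustrates the arithmetic; it says
nothing about whether the real helper is safe, since in the patch the
thread walk and the later mm_users read are not one atomic snapshot.
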
-- 
Michal Hocko
SUSE Labs