From: Peter Zijlstra <peterz@infradead.org>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: paulmck <paulmck@linux.ibm.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Oleg Nesterov <oleg@redhat.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	"Russell King, ARM Linux" <linux@armlinux.org.uk>,
	Chris Metcalf <cmetcalf@ezchip.com>, Chris Lameter <cl@linux.com>,
	Kirill Tkhai <tkhai@yandex.ru>, Mike Galbraith <efault@gmx.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [RFC PATCH 1/2] Fix: sched/membarrier: p->mm->membarrier_state racy load
Date: Wed, 4 Sep 2019 13:28:19 +0200
Message-ID: <20190904112819.GD2349@hirez.programming.kicks-ass.net>
In-Reply-To: <1029906102.725.1567543307658.JavaMail.zimbra@efficios.com>

On Tue, Sep 03, 2019 at 04:41:47PM -0400, Mathieu Desnoyers wrote:
> As discussed on IRC, one alternative for the multi-threaded case would
> be to grab the task list lock and iterate over all existing tasks to
> set the bit, so we don't have to touch an extra cache line from the
> scheduler.
> 
> To keep the common case of a single-threaded library constructor
> fast, we simply set the bit in the current task struct, and rely on
> clone() propagating the flag to child threads (which it already
> does).

Something like the completely untested thing below.

And yes, that do_each_thread/while_each_thread thing is unfortunate and
yuck too, but supposedly that's a slow path not many people are expected
to hit anyway, right?

---
 include/linux/sched.h     |  4 ++++
 kernel/sched/membarrier.c | 20 +++++++++++++++++---
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 33b310a826d7..dbafafb8ef40 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1136,6 +1136,10 @@ struct task_struct {
 	unsigned long			numa_pages_migrated;
 #endif /* CONFIG_NUMA_BALANCING */
 
+#ifdef CONFIG_MEMBARRIER
+	atomic_t			membarrier_state;
+#endif
+
 #ifdef CONFIG_RSEQ
 	struct rseq __user *rseq;
 	u32 rseq_sig;
diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index aa8d75804108..961f6affbf38 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -72,8 +72,8 @@ static int membarrier_global_expedited(void)
 
 		rcu_read_lock();
 		p = task_rcu_dereference(&cpu_rq(cpu)->curr);
-		if (p && p->mm && (atomic_read(&p->mm->membarrier_state) &
-				   MEMBARRIER_STATE_GLOBAL_EXPEDITED)) {
+		if (p && (atomic_read(&p->membarrier_state) &
+			  MEMBARRIER_STATE_GLOBAL_EXPEDITED)) {
 			if (!fallback)
 				__cpumask_set_cpu(cpu, tmpmask);
 			else
@@ -185,7 +185,9 @@ static int membarrier_register_global_expedited(void)
 	if (atomic_read(&mm->membarrier_state) &
 	    MEMBARRIER_STATE_GLOBAL_EXPEDITED_READY)
 		return 0;
+
 	atomic_or(MEMBARRIER_STATE_GLOBAL_EXPEDITED, &mm->membarrier_state);
+	atomic_or(MEMBARRIER_STATE_GLOBAL_EXPEDITED, &p->membarrier_state);
 	if (atomic_read(&mm->mm_users) == 1 && get_nr_threads(p) == 1) {
 		/*
 		 * For single mm user, single threaded process, we can
@@ -196,6 +198,17 @@ static int membarrier_register_global_expedited(void)
 		 */
 		smp_mb();
 	} else {
+		struct task_struct *g, *t;
+
+		read_lock(&tasklist_lock);
+		do_each_thread(g, t) {
+			if (t->mm == mm) {
+				atomic_or(MEMBARRIER_STATE_GLOBAL_EXPEDITED,
+					  &t->membarrier_state);
+			}
+		} while_each_thread(g, t);
+		read_unlock(&tasklist_lock);
+
 		/*
 		 * For multi-mm user threads, we need to ensure all
 		 * future scheduler executions will observe the new
@@ -229,9 +242,10 @@ static int membarrier_register_private_expedited(int flags)
 	if (atomic_read(&mm->membarrier_state) & state)
 		return 0;
 	atomic_or(MEMBARRIER_STATE_PRIVATE_EXPEDITED, &mm->membarrier_state);
-	if (flags & MEMBARRIER_FLAG_SYNC_CORE)
+	if (flags & MEMBARRIER_FLAG_SYNC_CORE) {
 		atomic_or(MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE,
 			  &mm->membarrier_state);
+	}
 	if (!(atomic_read(&mm->mm_users) == 1 && get_nr_threads(p) == 1)) {
 		/*
 		 * Ensure all future scheduler executions will observe the
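
For readers who want to see the idea in isolation, here is a minimal user-space model of what the patch above does. It is only a sketch under stated assumptions: struct task, struct mm, the tasks[] array and the three helper functions are hypothetical stand-ins and not kernel APIs. A plain array walk replaces do_each_thread()/while_each_thread() under tasklist_lock. The single-threaded fast path and the memory barriers are left out, so it shows the flag bookkeeping only, not the ordering guarantees.

```c
/*
 * User-space model only. struct task, struct mm and every helper below are
 * hypothetical stand-ins for illustration; none of them are kernel APIs.
 */
#include <stdatomic.h>
#include <stddef.h>

#define STATE_GLOBAL_EXPEDITED 0x1 /* stand-in for MEMBARRIER_STATE_GLOBAL_EXPEDITED */
#define MAX_TASKS 16

struct mm {
	atomic_int membarrier_state;
};

struct task {
	struct mm *mm;               /* NULL models a kernel thread */
	atomic_int membarrier_state; /* per-task copy of the flag */
};

static struct task tasks[MAX_TASKS]; /* models the global task list */
static size_t nr_tasks;

/* Create a task; a non-NULL parent models clone() copying the flag. */
static struct task *task_create(struct mm *mm, struct task *parent)
{
	struct task *t;

	if (nr_tasks == MAX_TASKS)
		return NULL;
	t = &tasks[nr_tasks++];
	t->mm = mm;
	atomic_init(&t->membarrier_state,
		    parent ? atomic_load(&parent->membarrier_state) : 0);
	return t;
}

/*
 * Registration: set the flag in the mm, in the caller, and in every task
 * sharing that mm. The patch does this walk under read_lock(&tasklist_lock);
 * this model is single-threaded and takes no lock.
 */
static void register_global_expedited(struct task *cur)
{
	struct mm *mm = cur->mm;
	size_t i;

	atomic_fetch_or(&mm->membarrier_state, STATE_GLOBAL_EXPEDITED);
	atomic_fetch_or(&cur->membarrier_state, STATE_GLOBAL_EXPEDITED);
	for (i = 0; i < nr_tasks; i++) {
		if (tasks[i].mm == mm)
			atomic_fetch_or(&tasks[i].membarrier_state,
					STATE_GLOBAL_EXPEDITED);
	}
}

/* Scheduler-side check: reads only the task and never dereferences t->mm. */
static int task_wants_global_expedited(struct task *t)
{
	return t && (atomic_load(&t->membarrier_state) & STATE_GLOBAL_EXPEDITED);
}
```

To try it, create two tasks that share one mm and a third with a different mm, then call register_global_expedited() on the first. The check should report the flag for both tasks on the shared mm and not for the third, and a task created afterwards with the first as parent should inherit it. The point of the per-task copy is that the check no longer needs the p->mm load that the subject line calls racy.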
