linux-kernel.vger.kernel.org archive mirror
* [PATCH] Fix: membarrier: racy access to p->mm in membarrier_global_expedited()
@ 2019-01-28 22:07 Mathieu Desnoyers
  2019-01-28 22:39 ` Paul E. McKenney
  0 siblings, 1 reply; 5+ messages in thread
From: Mathieu Desnoyers @ 2019-01-28 22:07 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: linux-kernel, linux-api, Mathieu Desnoyers, Jann Horn,
	Thomas Gleixner, Andrea Parri, Andy Lutomirski, Avi Kivity,
	Benjamin Herrenschmidt, Boqun Feng, Dave Watson, David Sehr,
	H . Peter Anvin, Linus Torvalds, Maged Michael, Michael Ellerman,
	Paul E . McKenney, Paul Mackerras, Russell King, Will Deacon,
	stable

Jann Horn identified a racy access to p->mm in the global expedited
command of the membarrier system call.

The suggested fix is to hold the task_lock() around the accesses to
p->mm and to the mm_struct membarrier_state field to guarantee the
existence of the mm_struct.

Link: https://lore.kernel.org/lkml/CAG48ez2G8ctF8dHS42TF37pThfr3y0RNOOYTmxvACm4u8Yu3cw@mail.gmail.com
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Tested-by: Jann Horn <jannh@google.com>
CC: Jann Horn <jannh@google.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Peter Zijlstra (Intel) <peterz@infradead.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Andrea Parri <parri.andrea@gmail.com>
CC: Andy Lutomirski <luto@kernel.org>
CC: Avi Kivity <avi@scylladb.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Dave Watson <davejwatson@fb.com>
CC: David Sehr <sehr@google.com>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Maged Michael <maged.michael@gmail.com>
CC: Michael Ellerman <mpe@ellerman.id.au>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Paul Mackerras <paulus@samba.org>
CC: Russell King <linux@armlinux.org.uk>
CC: Will Deacon <will.deacon@arm.com>
CC: stable@vger.kernel.org # v4.16+
CC: linux-api@vger.kernel.org
---
 kernel/sched/membarrier.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index 76e0eaf4654e..305fdcc4c5f7 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -81,12 +81,27 @@ static int membarrier_global_expedited(void)
 
 		rcu_read_lock();
 		p = task_rcu_dereference(&cpu_rq(cpu)->curr);
-		if (p && p->mm && (atomic_read(&p->mm->membarrier_state) &
-				   MEMBARRIER_STATE_GLOBAL_EXPEDITED)) {
-			if (!fallback)
-				__cpumask_set_cpu(cpu, tmpmask);
-			else
-				smp_call_function_single(cpu, ipi_mb, NULL, 1);
+		/*
+		 * Skip this CPU if the runqueue's current task is NULL or if
+		 * it is a kernel thread.
+		 */
+		if (p && READ_ONCE(p->mm)) {
+			bool mm_match;
+
+			/*
+			 * Read p->mm and access membarrier_state while holding
+			 * the task lock to ensure existence of mm.
+			 */
+			task_lock(p);
+			mm_match = p->mm && (atomic_read(&p->mm->membarrier_state) &
+					     MEMBARRIER_STATE_GLOBAL_EXPEDITED);
+			task_unlock(p);
+			if (mm_match) {
+				if (!fallback)
+					__cpumask_set_cpu(cpu, tmpmask);
+				else
+					smp_call_function_single(cpu, ipi_mb, NULL, 1);
+			}
 		}
 		rcu_read_unlock();
 	}
-- 
2.17.1



* Re: [PATCH] Fix: membarrier: racy access to p->mm in membarrier_global_expedited()
  2019-01-28 22:07 [PATCH] Fix: membarrier: racy access to p->mm in membarrier_global_expedited() Mathieu Desnoyers
@ 2019-01-28 22:39 ` Paul E. McKenney
  2019-01-28 22:45   ` Jann Horn
  2019-01-28 22:46   ` Mathieu Desnoyers
  0 siblings, 2 replies; 5+ messages in thread
From: Paul E. McKenney @ 2019-01-28 22:39 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, linux-api, Jann Horn,
	Thomas Gleixner, Andrea Parri, Andy Lutomirski, Avi Kivity,
	Benjamin Herrenschmidt, Boqun Feng, Dave Watson, David Sehr,
	H . Peter Anvin, Linus Torvalds, Maged Michael, Michael Ellerman,
	Paul Mackerras, Russell King, Will Deacon, stable

On Mon, Jan 28, 2019 at 05:07:07PM -0500, Mathieu Desnoyers wrote:
> Jann Horn identified a racy access to p->mm in the global expedited
> command of the membarrier system call.
> 
> The suggested fix is to hold the task_lock() around the accesses to
> p->mm and to the mm_struct membarrier_state field to guarantee the
> existence of the mm_struct.
> 
> Link: https://lore.kernel.org/lkml/CAG48ez2G8ctF8dHS42TF37pThfr3y0RNOOYTmxvACm4u8Yu3cw@mail.gmail.com
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Tested-by: Jann Horn <jannh@google.com>
> CC: Jann Horn <jannh@google.com>
> CC: Thomas Gleixner <tglx@linutronix.de>
> CC: Peter Zijlstra (Intel) <peterz@infradead.org>
> CC: Ingo Molnar <mingo@kernel.org>
> CC: Andrea Parri <parri.andrea@gmail.com>
> CC: Andy Lutomirski <luto@kernel.org>
> CC: Avi Kivity <avi@scylladb.com>
> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> CC: Boqun Feng <boqun.feng@gmail.com>
> CC: Dave Watson <davejwatson@fb.com>
> CC: David Sehr <sehr@google.com>
> CC: H. Peter Anvin <hpa@zytor.com>
> CC: Linus Torvalds <torvalds@linux-foundation.org>
> CC: Maged Michael <maged.michael@gmail.com>
> CC: Michael Ellerman <mpe@ellerman.id.au>
> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> CC: Paul Mackerras <paulus@samba.org>
> CC: Russell King <linux@armlinux.org.uk>
> CC: Will Deacon <will.deacon@arm.com>
> CC: stable@vger.kernel.org # v4.16+
> CC: linux-api@vger.kernel.org
> ---
>  kernel/sched/membarrier.c | 27 +++++++++++++++++++++------
>  1 file changed, 21 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
> index 76e0eaf4654e..305fdcc4c5f7 100644
> --- a/kernel/sched/membarrier.c
> +++ b/kernel/sched/membarrier.c
> @@ -81,12 +81,27 @@ static int membarrier_global_expedited(void)
> 
>  		rcu_read_lock();
>  		p = task_rcu_dereference(&cpu_rq(cpu)->curr);
> -		if (p && p->mm && (atomic_read(&p->mm->membarrier_state) &
> -				   MEMBARRIER_STATE_GLOBAL_EXPEDITED)) {
> -			if (!fallback)
> -				__cpumask_set_cpu(cpu, tmpmask);
> -			else
> -				smp_call_function_single(cpu, ipi_mb, NULL, 1);
> +		/*
> +		 * Skip this CPU if the runqueue's current task is NULL or if
> +		 * it is a kernel thread.
> +		 */
> +		if (p && READ_ONCE(p->mm)) {
> +			bool mm_match;
> +
> +			/*
> +			 * Read p->mm and access membarrier_state while holding
> +			 * the task lock to ensure existence of mm.
> +			 */
> +			task_lock(p);
> +			mm_match = p->mm && (atomic_read(&p->mm->membarrier_state) &

Are we guaranteed that this p->mm will be the same as the one loaded via
READ_ONCE() above?  Either way, wouldn't it be better to READ_ONCE() it a
single time and use the same value everywhere?

							Thanx, Paul

> +					     MEMBARRIER_STATE_GLOBAL_EXPEDITED);
> +			task_unlock(p);
> +			if (mm_match) {
> +				if (!fallback)
> +					__cpumask_set_cpu(cpu, tmpmask);
> +				else
> +					smp_call_function_single(cpu, ipi_mb, NULL, 1);
> +			}
>  		}
>  		rcu_read_unlock();
>  	}
> -- 
> 2.17.1
> 



* Re: [PATCH] Fix: membarrier: racy access to p->mm in membarrier_global_expedited()
  2019-01-28 22:39 ` Paul E. McKenney
@ 2019-01-28 22:45   ` Jann Horn
  2019-01-28 23:22     ` Paul E. McKenney
  2019-01-28 22:46   ` Mathieu Desnoyers
  1 sibling, 1 reply; 5+ messages in thread
From: Jann Horn @ 2019-01-28 22:45 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Mathieu Desnoyers, Ingo Molnar, Peter Zijlstra, kernel list,
	Linux API, Thomas Gleixner, Andrea Parri, Andy Lutomirski,
	Avi Kivity, Benjamin Herrenschmidt, Boqun Feng, Dave Watson,
	David Sehr, H . Peter Anvin, Linus Torvalds, Maged Michael,
	Michael Ellerman, Paul Mackerras, Russell King, Will Deacon,
	stable

On Mon, Jan 28, 2019 at 11:39 PM Paul E. McKenney <paulmck@linux.ibm.com> wrote:
> On Mon, Jan 28, 2019 at 05:07:07PM -0500, Mathieu Desnoyers wrote:
> > Jann Horn identified a racy access to p->mm in the global expedited
> > command of the membarrier system call.
> >
> > The suggested fix is to hold the task_lock() around the accesses to
> > p->mm and to the mm_struct membarrier_state field to guarantee the
> > existence of the mm_struct.
> >
> > Link: https://lore.kernel.org/lkml/CAG48ez2G8ctF8dHS42TF37pThfr3y0RNOOYTmxvACm4u8Yu3cw@mail.gmail.com
> > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
[...]
> > --- a/kernel/sched/membarrier.c
> > +++ b/kernel/sched/membarrier.c
> > @@ -81,12 +81,27 @@ static int membarrier_global_expedited(void)
> >
> >               rcu_read_lock();
> >               p = task_rcu_dereference(&cpu_rq(cpu)->curr);
> > -             if (p && p->mm && (atomic_read(&p->mm->membarrier_state) &
> > -                                MEMBARRIER_STATE_GLOBAL_EXPEDITED)) {
> > -                     if (!fallback)
> > -                             __cpumask_set_cpu(cpu, tmpmask);
> > -                     else
> > -                             smp_call_function_single(cpu, ipi_mb, NULL, 1);
> > +             /*
> > +              * Skip this CPU if the runqueue's current task is NULL or if
> > +              * it is a kernel thread.
> > +              */
> > +             if (p && READ_ONCE(p->mm)) {
> > +                     bool mm_match;
> > +
> > +                     /*
> > +                      * Read p->mm and access membarrier_state while holding
> > +                      * the task lock to ensure existence of mm.
> > +                      */
> > +                     task_lock(p);
> > +                     mm_match = p->mm && (atomic_read(&p->mm->membarrier_state) &
>
> Are we guaranteed that this p->mm will be the same as the one loaded via
> READ_ONCE() above?

No; the way I read it, that's just an optimization and has no effect
on correctness.

> Either way, wouldn't it be better to READ_ONCE() it a
> single time and use the same value everywhere?

No; the first READ_ONCE() returns a pointer that you can't access
because it wasn't read under a lock. You can only use it for a NULL
check.
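
[Illustration added outside the original thread.] The pattern described
above (an unlocked read used only as a NULL check, followed by a re-read
under the lock before any dereference) can be sketched in userspace C11.
This is a minimal sketch, not the kernel code: struct task, struct mm,
task_lock_sketch(), task_init_sketch() and STATE_GLOBAL_EXPEDITED are
hypothetical stand-ins, and a C11 atomic_flag spinlock is assumed in place
of the kernel's task_lock().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define STATE_GLOBAL_EXPEDITED 0x1	/* hypothetical stand-in flag */

struct mm {
	atomic_int membarrier_state;
};

struct task {
	atomic_flag lock;	/* stand-in for the task lock */
	struct mm *_Atomic mm;	/* NULL models a kernel thread */
};

static void task_init_sketch(struct task *p, struct mm *mm)
{
	atomic_flag_clear(&p->lock);
	atomic_init(&p->mm, mm);
}

static void task_lock_sketch(struct task *p)
{
	while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
		;	/* spin until the lock is free */
}

static void task_unlock_sketch(struct task *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

static bool task_wants_global_expedited(struct task *p)
{
	struct mm *mm;
	bool match;

	/* Unlocked read: compared with NULL only, never dereferenced. */
	if (!p || !atomic_load_explicit(&p->mm, memory_order_relaxed))
		return false;

	task_lock_sketch(p);
	/* Re-read under the lock; only this pointer is dereferenced. */
	mm = atomic_load_explicit(&p->mm, memory_order_relaxed);
	match = mm && (atomic_load(&mm->membarrier_state) &
		       STATE_GLOBAL_EXPEDITED);
	task_unlock_sketch(p);
	return match;
}
```

The early return keeps the lock off the path for tasks without an mm,
which is the optimization mentioned above. In this sketch the lock only
protects the mm if every writer of p->mm takes the same lock, which is an
assumption here.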


* Re: [PATCH] Fix: membarrier: racy access to p->mm in membarrier_global_expedited()
  2019-01-28 22:39 ` Paul E. McKenney
  2019-01-28 22:45   ` Jann Horn
@ 2019-01-28 22:46   ` Mathieu Desnoyers
  1 sibling, 0 replies; 5+ messages in thread
From: Mathieu Desnoyers @ 2019-01-28 22:46 UTC (permalink / raw)
  To: paulmck
  Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, linux-api, Jann Horn,
	Thomas Gleixner, Andrea Parri, Andy Lutomirski, Avi Kivity,
	Benjamin Herrenschmidt, Boqun Feng, Dave Watson, David Sehr,
	H. Peter Anvin, Linus Torvalds, maged michael, Michael Ellerman,
	Paul Mackerras, Russell King, ARM Linux, Will Deacon, stable

----- On Jan 28, 2019, at 5:39 PM, paulmck paulmck@linux.ibm.com wrote:

> On Mon, Jan 28, 2019 at 05:07:07PM -0500, Mathieu Desnoyers wrote:
>> Jann Horn identified a racy access to p->mm in the global expedited
>> command of the membarrier system call.
>> 
>> The suggested fix is to hold the task_lock() around the accesses to
>> p->mm and to the mm_struct membarrier_state field to guarantee the
>> existence of the mm_struct.
>> 
>> Link:
>> https://lore.kernel.org/lkml/CAG48ez2G8ctF8dHS42TF37pThfr3y0RNOOYTmxvACm4u8Yu3cw@mail.gmail.com
>> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> Tested-by: Jann Horn <jannh@google.com>
>> CC: Jann Horn <jannh@google.com>
>> CC: Thomas Gleixner <tglx@linutronix.de>
>> CC: Peter Zijlstra (Intel) <peterz@infradead.org>
>> CC: Ingo Molnar <mingo@kernel.org>
>> CC: Andrea Parri <parri.andrea@gmail.com>
>> CC: Andy Lutomirski <luto@kernel.org>
>> CC: Avi Kivity <avi@scylladb.com>
>> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> CC: Boqun Feng <boqun.feng@gmail.com>
>> CC: Dave Watson <davejwatson@fb.com>
>> CC: David Sehr <sehr@google.com>
>> CC: H. Peter Anvin <hpa@zytor.com>
>> CC: Linus Torvalds <torvalds@linux-foundation.org>
>> CC: Maged Michael <maged.michael@gmail.com>
>> CC: Michael Ellerman <mpe@ellerman.id.au>
>> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>> CC: Paul Mackerras <paulus@samba.org>
>> CC: Russell King <linux@armlinux.org.uk>
>> CC: Will Deacon <will.deacon@arm.com>
>> CC: stable@vger.kernel.org # v4.16+
>> CC: linux-api@vger.kernel.org
>> ---
>>  kernel/sched/membarrier.c | 27 +++++++++++++++++++++------
>>  1 file changed, 21 insertions(+), 6 deletions(-)
>> 
>> diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
>> index 76e0eaf4654e..305fdcc4c5f7 100644
>> --- a/kernel/sched/membarrier.c
>> +++ b/kernel/sched/membarrier.c
>> @@ -81,12 +81,27 @@ static int membarrier_global_expedited(void)
>> 
>>  		rcu_read_lock();
>>  		p = task_rcu_dereference(&cpu_rq(cpu)->curr);
>> -		if (p && p->mm && (atomic_read(&p->mm->membarrier_state) &
>> -				   MEMBARRIER_STATE_GLOBAL_EXPEDITED)) {
>> -			if (!fallback)
>> -				__cpumask_set_cpu(cpu, tmpmask);
>> -			else
>> -				smp_call_function_single(cpu, ipi_mb, NULL, 1);
>> +		/*
>> +		 * Skip this CPU if the runqueue's current task is NULL or if
>> +		 * it is a kernel thread.
>> +		 */
>> +		if (p && READ_ONCE(p->mm)) {
>> +			bool mm_match;
>> +
>> +			/*
>> +			 * Read p->mm and access membarrier_state while holding
>> +			 * the task lock to ensure existence of mm.
>> +			 */
>> +			task_lock(p);
>> +			mm_match = p->mm && (atomic_read(&p->mm->membarrier_state) &
> 
> Are we guaranteed that this p->mm will be the same as the one loaded via
> READ_ONCE() above?  Either way, wouldn't it be better to READ_ONCE() it a
> single time and use the same value everywhere?

The first "READ_ONCE()" above is _outside_ of the task_lock() critical section.
Those two accesses _can_ load two different pointers, and this is why we
need to re-read the p->mm pointer within the task_lock() critical section to
ensure existence of the mm_struct that we use.

If we move the READ_ONCE() into the task_lock(), we need to uselessly
take a lock before we can skip kernel threads.

If we leave the READ_ONCE() outside the task_lock(), then p->mm can be updated
between the READ_ONCE() and the reference to the mm_struct content within the
task_lock(), which is racy and does not guarantee its existence.

Or am I missing your point?

Thanks,

Mathieu


> 
>							Thanx, Paul
> 
>> +					     MEMBARRIER_STATE_GLOBAL_EXPEDITED);
>> +			task_unlock(p);
>> +			if (mm_match) {
>> +				if (!fallback)
>> +					__cpumask_set_cpu(cpu, tmpmask);
>> +				else
>> +					smp_call_function_single(cpu, ipi_mb, NULL, 1);
>> +			}
>>  		}
>>  		rcu_read_unlock();
>>  	}
>> --
>> 2.17.1

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
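
[Illustration added outside the original thread.] The claim in the reply
above that the two reads of p->mm can load different pointers can be shown
with a deterministic, single-threaded simulation. This is only a sketch
under stated assumptions: the race_hook callback is a hypothetical device
that runs at the point where another CPU could change p->mm, all names are
stand-ins rather than kernel APIs, and the old mm stays allocated here,
whereas in the kernel it might already have been freed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define STATE_GLOBAL_EXPEDITED 0x1	/* hypothetical stand-in flag */

struct mm {
	int membarrier_state;
};

struct task {
	struct mm *_Atomic mm;	/* NULL models a kernel thread */
};

/* Hypothetical hook: runs in the window between the two reads. */
typedef void (*race_hook)(struct task *p);

static struct mm replacement_mm = { .membarrier_state = 0 };

static void no_race_hook(struct task *p)
{
	(void)p;	/* nothing changes in the window */
}

static void swap_mm_hook(struct task *p)
{
	atomic_store(&p->mm, &replacement_mm);	/* models a new mm being installed */
}

static void drop_mm_hook(struct task *p)
{
	atomic_store(&p->mm, NULL);	/* models the task dropping its mm */
}

/* Racy variant: dereferences the pointer obtained by the first read. */
static bool check_stale(struct task *p, race_hook hook)
{
	struct mm *mm = atomic_load(&p->mm);

	if (!mm)
		return false;
	hook(p);
	/* In the kernel, mm could be freed by now. */
	return (mm->membarrier_state & STATE_GLOBAL_EXPEDITED) != 0;
}

/* Patch-like variant: first read is a NULL check, second read is used. */
static bool check_reread(struct task *p, race_hook hook)
{
	struct mm *mm;

	if (!atomic_load(&p->mm))
		return false;
	hook(p);
	/* The kernel patch takes task_lock(p) around the next two lines. */
	mm = atomic_load(&p->mm);
	return mm && (mm->membarrier_state & STATE_GLOBAL_EXPEDITED) != 0;
}
```

With swap_mm_hook or drop_mm_hook, check_stale() still answers from the
old mm, while check_reread() answers from whatever p->mm holds after the
window. Without interference, both variants agree.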


* Re: [PATCH] Fix: membarrier: racy access to p->mm in membarrier_global_expedited()
  2019-01-28 22:45   ` Jann Horn
@ 2019-01-28 23:22     ` Paul E. McKenney
  0 siblings, 0 replies; 5+ messages in thread
From: Paul E. McKenney @ 2019-01-28 23:22 UTC (permalink / raw)
  To: Jann Horn
  Cc: Mathieu Desnoyers, Ingo Molnar, Peter Zijlstra, kernel list,
	Linux API, Thomas Gleixner, Andrea Parri, Andy Lutomirski,
	Avi Kivity, Benjamin Herrenschmidt, Boqun Feng, Dave Watson,
	David Sehr, H . Peter Anvin, Linus Torvalds, Maged Michael,
	Michael Ellerman, Paul Mackerras, Russell King, Will Deacon,
	stable

On Mon, Jan 28, 2019 at 11:45:32PM +0100, Jann Horn wrote:
> On Mon, Jan 28, 2019 at 11:39 PM Paul E. McKenney <paulmck@linux.ibm.com> wrote:
> > On Mon, Jan 28, 2019 at 05:07:07PM -0500, Mathieu Desnoyers wrote:
> > > Jann Horn identified a racy access to p->mm in the global expedited
> > > command of the membarrier system call.
> > >
> > > The suggested fix is to hold the task_lock() around the accesses to
> > > p->mm and to the mm_struct membarrier_state field to guarantee the
> > > existence of the mm_struct.
> > >
> > > Link: https://lore.kernel.org/lkml/CAG48ez2G8ctF8dHS42TF37pThfr3y0RNOOYTmxvACm4u8Yu3cw@mail.gmail.com
> > > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> [...]
> > > --- a/kernel/sched/membarrier.c
> > > +++ b/kernel/sched/membarrier.c
> > > @@ -81,12 +81,27 @@ static int membarrier_global_expedited(void)
> > >
> > >               rcu_read_lock();
> > >               p = task_rcu_dereference(&cpu_rq(cpu)->curr);
> > > -             if (p && p->mm && (atomic_read(&p->mm->membarrier_state) &
> > > -                                MEMBARRIER_STATE_GLOBAL_EXPEDITED)) {
> > > -                     if (!fallback)
> > > -                             __cpumask_set_cpu(cpu, tmpmask);
> > > -                     else
> > > -                             smp_call_function_single(cpu, ipi_mb, NULL, 1);
> > > +             /*
> > > +              * Skip this CPU if the runqueue's current task is NULL or if
> > > +              * it is a kernel thread.
> > > +              */
> > > +             if (p && READ_ONCE(p->mm)) {
> > > +                     bool mm_match;
> > > +
> > > +                     /*
> > > +                      * Read p->mm and access membarrier_state while holding
> > > +                      * the task lock to ensure existence of mm.
> > > +                      */
> > > +                     task_lock(p);
> > > +                     mm_match = p->mm && (atomic_read(&p->mm->membarrier_state) &
> >
> > Are we guaranteed that this p->mm will be the same as the one loaded via
> > READ_ONCE() above?
> 
> No; the way I read it, that's just an optimization and has no effect
> on correctness.
> 
> > Either way, wouldn't it be better to READ_ONCE() it a
> > single time and use the same value everywhere?
> 
> No; the first READ_ONCE() returns a pointer that you can't access
> because it wasn't read under a lock. You can only use it for a NULL
> check.

Ah, of course!  Thank you both!

							Thanx, Paul


