* get_online_cpus() from a  preemptible() context (bug?)
@ 2017-11-03 14:45 James Morse
  2017-11-06 10:32 ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: James Morse @ 2017-11-03 14:45 UTC (permalink / raw)
  To: Peter Zijlstra, Thomas Gleixner; +Cc: linux-kernel

Hi Thomas, Peter,

I'm trying to work out what stops a thread being pre-empted and migrated between
calling get_online_cpus() and put_online_cpus().

According to __percpu_down_read(), it's the pre-empt count:
>  * Due to having preemption disabled the decrement happens on
>  * the same CPU as the increment, avoiding the
>  * increment-on-one-CPU-and-decrement-on-another problem.


So this:
> void cpus_read_lock(void)
> {
>        percpu_down_read(&cpu_hotplug_lock);
> +
> +       /* Can we be migrated before we release this per-cpu lock? */
> +       WARN_ON(preemptible());
>  }

should never fire?

It does, some of the offenders:
> kmem_cache_create
> apply_workqueue_attrs
> stop_machine
> static_key_enable
> lru_add_drain_all
> __cpuhp_setup_state
> kmem_cache_shrink
> vmstat_shepherd
> __cpuhp_state_add_instance


Trying to leave preempt disabled between the down/up leads to
scheduling-while-atomic instead.

Can you point out what I've missed here?


Thanks,

James

* Re: get_online_cpus() from a  preemptible() context (bug?)
  2017-11-03 14:45 get_online_cpus() from a preemptible() context (bug?) James Morse
@ 2017-11-06 10:32 ` Peter Zijlstra
  2017-11-06 10:40   ` Peter Zijlstra
  2017-11-06 18:51   ` James Morse
  0 siblings, 2 replies; 6+ messages in thread
From: Peter Zijlstra @ 2017-11-06 10:32 UTC (permalink / raw)
  To: James Morse; +Cc: Thomas Gleixner, linux-kernel

On Fri, Nov 03, 2017 at 02:45:45PM +0000, James Morse wrote:
> Hi Thomas, Peter,
> 
> I'm trying to work out what stops a thread being pre-empted and migrated between
> calling get_online_cpus() and put_online_cpus().
> 
> According to __percpu_down_read(), it's the pre-empt count:
> >  * Due to having preemption disabled the decrement happens on
> >  * the same CPU as the increment, avoiding the
> >  * increment-on-one-CPU-and-decrement-on-another problem.
> 
> 
> So this:
> > void cpus_read_lock(void)
> > {
> >        percpu_down_read(&cpu_hotplug_lock);
> > +
> > +       /* Can we be migrated before we release this per-cpu lock? */
> > +       WARN_ON(preemptible());
> >  }
> 
> should never fire?

It should.. You're reading a comment on __percpu_down_read() and using
percpu_down_read(), _not_ the same function ;-)

If you look at percpu_down_read(), you'll note it'll disable preemption
before calling __percpu_down_read().

And yes, that whole percpu-rwsem code is fairly magical :-)

* Re: get_online_cpus() from a  preemptible() context (bug?)
  2017-11-06 10:32 ` Peter Zijlstra
@ 2017-11-06 10:40   ` Peter Zijlstra
  2017-11-06 18:51   ` James Morse
  1 sibling, 0 replies; 6+ messages in thread
From: Peter Zijlstra @ 2017-11-06 10:40 UTC (permalink / raw)
  To: James Morse; +Cc: Thomas Gleixner, linux-kernel

On Mon, Nov 06, 2017 at 11:32:12AM +0100, Peter Zijlstra wrote:
> On Fri, Nov 03, 2017 at 02:45:45PM +0000, James Morse wrote:
> > Hi Thomas, Peter,
> > 
> > I'm trying to work out what stops a thread being pre-empted and migrated between
> > calling get_online_cpus() and put_online_cpus().

Nothing; why would you think it would? All those functions guarantee is
that any CPU observed as being online stays online (and its converse,
that a CPU observed as being offline stays offline, although fewer people
care about that one).

That is; it serializes against CPU hotplug, nothing else.

* Re: get_online_cpus() from a  preemptible() context (bug?)
  2017-11-06 10:32 ` Peter Zijlstra
  2017-11-06 10:40   ` Peter Zijlstra
@ 2017-11-06 18:51   ` James Morse
  2017-11-06 21:07     ` Peter Zijlstra
  1 sibling, 1 reply; 6+ messages in thread
From: James Morse @ 2017-11-06 18:51 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Thomas Gleixner, linux-kernel

Hi Peter,

(combining your replies)

On 06/11/17 10:32, Peter Zijlstra wrote:
> On Fri, Nov 03, 2017 at 02:45:45PM +0000, James Morse wrote:
>> I'm trying to work out what stops a thread being pre-empted and migrated between
>> calling get_online_cpus() and put_online_cpus().

> Nothing; why would you think it would?

To stop the this_cpu_*() operations in down/up being applied on different CPUs,
affecting a different percpu:read_count.


> All those functions guarantee is
> that any CPU observed as being online stays online (and its converse,
> that a CPU observed as being offline stays offline, although fewer people
> care about that one).


>> According to __percpu_down_read(), it's the pre-empt count:
>>>  * Due to having preemption disabled the decrement happens on
>>>  * the same CPU as the increment, avoiding the
>>>  * increment-on-one-CPU-and-decrement-on-another problem.
>>
>>
>> So this:
>>> void cpus_read_lock(void)
>>> {
>>>        percpu_down_read(&cpu_hotplug_lock);
>>> +
>>> +       /* Can we be migrated before we release this per-cpu lock? */
>>> +       WARN_ON(preemptible());
>>>  }
>>
>> should never fire?

> It should.. You're reading a comment on __percpu_down_read() and using
> percpu_down_read(), _not_ the same function ;-)

Yes, sorry, I thought you did a better job of describing the case I'm trying to
work out.


> If you look at percpu_down_read(), you'll note it'll disable preemption
> before calling __percpu_down_read().

Yes, this is how __percpu_down_read() protects the combination of its fast/slow
paths.

But percpu_down_read() then calls preempt_enable(), and I can't see what stops us
migrating before percpu_up_read() calls preempt_disable() and __this_cpu_dec(),
which would then affect a different variable.


> And yes, that whole percpu-rwsem code is fairly magical :-)

I think I'll file this under magical. That rcu_sync_is_idle() must know
something I don't!


Thanks!

James

* Re: get_online_cpus() from a  preemptible() context (bug?)
  2017-11-06 18:51   ` James Morse
@ 2017-11-06 21:07     ` Peter Zijlstra
  2017-11-08 16:07       ` James Morse
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2017-11-06 21:07 UTC (permalink / raw)
  To: James Morse; +Cc: Thomas Gleixner, linux-kernel

On Mon, Nov 06, 2017 at 06:51:35PM +0000, James Morse wrote:
> > If you look at percpu_down_read(), you'll note it'll disable preemption
> > before calling __percpu_down_read().
> 
> Yes, this is how __percpu_down_read() protects the combination of its fast/slow
> paths.
> 
> But next percpu_down_read() calls preempt_enable(), I can't see what stops us
> migrating before percpu_up_read() preempt_disable()s to call __this_cpu_dec(),
> which now affects a different variable.
> 

Ah, so the two operations that comment talks about are:

    percpu_down_read_preempt_disable()
      preempt_disable();
1)    __this_cpu_inc(*sem->read_count);
      if (unlikely(!rcu_sync_is_idle(&sem->rss)))
	__percpu_down_read()
	  smp_mb()
	  if (likely(!smp_load_acquire(&sem->readers_block))) // false
	  __percpu_up_read()
	    smp_mb()
2)	   __this_cpu_dec(*sem->read_count);
	    rcuwait_wake_up(&sem->writer);
	  preempt_enable_no_resched();

If you want more detail on this, I'll actually have to go think :-)

* Re: get_online_cpus() from a  preemptible() context (bug?)
  2017-11-06 21:07     ` Peter Zijlstra
@ 2017-11-08 16:07       ` James Morse
  0 siblings, 0 replies; 6+ messages in thread
From: James Morse @ 2017-11-08 16:07 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Thomas Gleixner, linux-kernel

Hi Peter,

On 06/11/17 21:07, Peter Zijlstra wrote:
> On Mon, Nov 06, 2017 at 06:51:35PM +0000, James Morse wrote:
>>> If you look at percpu_down_read(), you'll note it'll disable preemption
>>> before calling __percpu_down_read().
>>
>> Yes, this is how __percpu_down_read() protects the combination of its fast/slow
>> paths.
>>
>> But next percpu_down_read() calls preempt_enable(), I can't see what stops us
>> migrating before percpu_up_read() preempt_disable()s to call __this_cpu_dec(),
>> which now affects a different variable.
>>
> 
> Ah, so the two operations that comment talks about are:
> 
>     percpu_down_read_preempt_disable()
>       preempt_disable();
> 1)    __this_cpu_inc(*sem->read_count);
>       if (unlikely(!rcu_sync_is_idle(&sem->rss)))
> 	__percpu_down_read()
> 	  smp_mb()
> 	  if (likely(!smp_load_acquire(&sem->readers_block))) // false
> 	  __percpu_up_read()
> 	    smp_mb()
> 2)	   __this_cpu_dec(*sem->read_count);
> 	    rcuwait_wake_up(&sem->writer);
> 	  preempt_enable_no_resched();
> 
> If you want more detail on this, I'll actually have to go think :-)

I think this was the answer to a much smarter question than mine!

I've tried (and failed) to break it instead. To answer my own question:

I thought this was potentially broken because the __this_cpu_{inc,dec}() out in
{get,put}_online_cpus() will operate on different per-cpu read_count variables
if we migrate (not the pair above).

This isn't a problem, as the only thing that reads the read_count is
readers_active_check(), which per_cpu_sum()s them all together before comparing
against zero. As they are all unsigned ints, the sum relies on unsigned
wraparound to do the right thing. This even works if a CPU holding a vital part
of the read_count is offline, as per_cpu_sum() uses for_each_possible_cpu().
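
The wraparound argument can be checked with a small userspace model. This is
only a sketch, not the kernel code: the array stands in for the per-cpu
read_count, and NR_POSSIBLE_CPUS and all model_*() names are hypothetical
helpers invented for the illustration. It assumes a reader that increments on
CPU 0, migrates, and decrements on CPU 2.

```c
#include <assert.h>
#include <limits.h>

#define NR_POSSIBLE_CPUS 4	/* hypothetical CPU count for this model */

/* Stand-in for the per-cpu read_count: one unsigned slot per possible CPU. */
static unsigned int read_count[NR_POSSIBLE_CPUS];

/* Reader acquire: increment the slot of the CPU we are running on. */
static void model_down_read(int cpu)
{
	read_count[cpu]++;
}

/* Reader release: decrement the slot of the CPU we are running on now. */
static void model_up_read(int cpu)
{
	read_count[cpu]--;	/* wraps to UINT_MAX if this slot was 0 */
}

/* Rough equivalent of per_cpu_sum(): add up every possible CPU's slot. */
static unsigned int model_sum(void)
{
	unsigned int sum = 0;

	for (int cpu = 0; cpu < NR_POSSIBLE_CPUS; cpu++)
		sum += read_count[cpu];
	return sum;
}

/* Acquire on CPU 0, migrate, release on CPU 2; return the final sum. */
static unsigned int model_migrated_reader(void)
{
	for (int cpu = 0; cpu < NR_POSSIBLE_CPUS; cpu++)
		read_count[cpu] = 0;

	model_down_read(0);
	assert(model_sum() == 1);	/* a writer would see one active reader */

	model_up_read(2);
	assert(read_count[0] == 1);
	assert(read_count[2] == UINT_MAX);

	return model_sum();		/* 1 + UINT_MAX wraps to 0 */
}
```

No individual slot is meaningful on its own after a migration; only the sum
over all possible CPUs is, which is why the writer side has to sum every slot
rather than look at just the online ones.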


Thanks!

James
