From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>,
	tj@kernel.org, mingo@redhat.com, linux-kernel@vger.kernel.org,
	der.herr@hofr.at, dave@stgolabs.net, riel@redhat.com,
	viro@ZenIV.linux.org.uk, torvalds@linux-foundation.org
Subject: Re: [RFC][PATCH 12/13] stop_machine: Remove lglock
Date: Tue, 23 Jun 2015 11:26:26 -0700	[thread overview]
Message-ID: <20150623182626.GO3892@linux.vnet.ibm.com> (raw)
In-Reply-To: <20150623180411.GF3644@twins.programming.kicks-ass.net>

On Tue, Jun 23, 2015 at 08:04:11PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 23, 2015 at 10:30:38AM -0700, Paul E. McKenney wrote:
> > Good, you don't need this because you can check for dynticks later.
> > You will need to check for offline CPUs.
> 
> get_online_cpus()
> for_each_online_cpus() {
>  ...
> }
> 
> is what the new code does.

Ah, I missed that this was not deleted.

> > > -	/*
> > > -	 * Each pass through the following loop attempts to force a
> > > -	 * context switch on each CPU.
> > > -	 */
> > > -	while (try_stop_cpus(cma ? cm : cpu_online_mask,
> > > -			     synchronize_sched_expedited_cpu_stop,
> > > -			     NULL) == -EAGAIN) {
> > > -		put_online_cpus();
> > > -		atomic_long_inc(&rsp->expedited_tryfail);
> > > -
> > > -		/* Check to see if someone else did our work for us. */
> > > -		s = atomic_long_read(&rsp->expedited_done);
> > > -		if (ULONG_CMP_GE((ulong)s, (ulong)firstsnap)) {
> > > -			/* ensure test happens before caller kfree */
> > > -			smp_mb__before_atomic(); /* ^^^ */
> > > -			atomic_long_inc(&rsp->expedited_workdone1);
> > > -			free_cpumask_var(cm);
> > > -			return;
> > 
> > Here you lose batching.  Yeah, I know that synchronize_sched_expedited()
> > is -supposed- to be used sparingly, but it is not cool for the kernel
> > to melt down just because some creative user found a way to heat up a
> > code path.  Need a mutex_trylock() with a counter and checking for
> > others having already done the needed work.
> 
> I really think you're making that expedited nonsense far too accessible.

This has nothing to do with accessibility and everything to do with
robustness.  And with me not becoming the triage center for too many
non-RCU bugs.
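A userspace sketch of the trylock-plus-counter batching idea from the quoted text (purely illustrative; the names and the pthread/stdatomic primitives stand in for the actual kernel machinery):

```c
/*
 * Sketch: callers snapshot a "done" counter before contending for the
 * lock.  Anyone who fails the trylock rechecks the counter; if another
 * caller has since completed a grace period covering our snapshot, our
 * work has been done for us and we return without doing it again.
 */
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static pthread_mutex_t exp_mutex = PTHREAD_MUTEX_INITIALIZER;
static atomic_long exp_done;	/* count of completed expedited GPs */

/* Hypothetical stand-in for the real grace-period machinery. */
static void do_expedited_grace_period(void) { }

void synchronize_expedited(void)
{
	long snap = atomic_load(&exp_done) + 1;	/* GP that covers us */

	while (pthread_mutex_trylock(&exp_mutex) != 0) {
		/* Did someone else do our work while we waited? */
		if (atomic_load(&exp_done) - snap >= 0)
			return;
		sched_yield();
	}
	/* Recheck under the lock: this is the batching point. */
	if (atomic_load(&exp_done) - snap >= 0) {
		pthread_mutex_unlock(&exp_mutex);
		return;
	}
	do_expedited_grace_period();
	atomic_fetch_add(&exp_done, 1);
	pthread_mutex_unlock(&exp_mutex);
}
```

The point is that N concurrent callers result in at most a couple of actual grace periods rather than N of them, which is what bounds the damage from a "creative user" hammering the expedited path.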

> But it was exactly that trylock I was trying to get rid of.

OK.  Why, exactly?

> > And we still need to be able to drop back to synchronize_sched()
> > (AKA wait_rcu_gp(call_rcu_sched) in this case) in case we have both a
> > creative user and a long-running RCU-sched read-side critical section.
> 
> No, a long-running RCU-sched read-side is a bug and we should fix that,
> it's called a preemption latency, we don't like those.

Yes, we should fix them.  No, they absolutely must not result in a
meltdown of some unrelated portion of the kernel (like RCU), particularly
if this situation occurs on some system running a production workload
that doesn't happen to care about preemption latency.

> > > +	for_each_online_cpu(cpu) {
> > > +		struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu);
> > > 
> > > -		/* Recheck to see if someone else did our work for us. */
> > > -		s = atomic_long_read(&rsp->expedited_done);
> > > -		if (ULONG_CMP_GE((ulong)s, (ulong)firstsnap)) {
> > > -			/* ensure test happens before caller kfree */
> > > -			smp_mb__before_atomic(); /* ^^^ */
> > > -			atomic_long_inc(&rsp->expedited_workdone2);
> > > -			free_cpumask_var(cm);
> > > -			return;
> > > -		}
> > > +		/* Offline CPUs, idle CPUs, and any CPU we run on are quiescent. */
> > > +		if (!(atomic_add_return(0, &rdtp->dynticks) & 0x1))
> > > +			continue;
> > 
> > Let's see...  This does work for idle CPUs and for nohz_full CPUs running
> > in userspace.
> > 
> > It does not work for the current CPU, so the check needs an additional
> > check against raw_smp_processor_id(), which is easy enough to add.
> 
> Right, realized after I sent it out, but it _should_ work for the
> current CPU too. Just pointless doing it.

OK, and easily fixed up in any case.
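The quiescence test being discussed, with the extra current-CPU check added, can be sketched as follows (names are illustrative, not the actual kernel code; the dynticks counter is even when the CPU is idle or running nohz_full userspace, odd when it might be in an RCU-sched read-side critical section):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether a CPU needs to be forced through a context switch.
 * dynticks_snap is a snapshot of that CPU's dynticks counter;
 * this_cpu is the CPU running the grace-period machinery.
 */
static bool cpu_needs_quiescing(long dynticks_snap, int cpu, int this_cpu)
{
	if (cpu == this_cpu)
		return false;	/* we cannot be in a read-side section */
	if (!(dynticks_snap & 0x1))
		return false;	/* idle or nohz_full user: already quiescent */
	return true;		/* must force a context switch on this CPU */
}
```

Only CPUs for which this returns true would get the stop_one_cpu() (or IPI) treatment, which is what lets idle and nohz_full CPUs sleep through the expedited grace period.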

> > There always has been a race window involving CPU hotplug.
> 
> There is no hotplug race, the entire thing has get_online_cpus() held
> across it.

Which I would like to get rid of, but not urgent.

> > > +		stop_one_cpu(cpu, synchronize_sched_expedited_cpu_stop, NULL);
> > 
> > My thought was to use smp_call_function_single(), and to have the function
> > called recheck dyntick-idle state, avoiding doing a set_tsk_need_resched()
> > if so.
> 
> set_tsk_need_resched() is buggy and should not be used.

OK, what API is used for this purpose?

> > This would result in a single pass through schedule() instead
> > of stop_one_cpu()'s double context switch.  It would likely also require
> > some rework of rcu_note_context_switch(), which stop_one_cpu() avoids
> > the need for.
> 
> _IF_ you're going to touch rcu_note_context_switch(), you might as well
> use a completion, set it for the number of CPUs that need a resched,
> spray resched-IPI and have rcu_note_context_switch() do a complete().
> 
> But I would really like to avoid adding code to
> rcu_note_context_switch(), because we run that on _every_ single context
> switch.

I believe that I can rework the current code to get the effect without
increased overhead, given that I have no intention of adding the
complete().  Adding the complete -would- add overhead to that fastpath.
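For reference, the completion-based scheme Peter describes above (set a completion for the number of CPUs needing a resched, have each CPU's context-switch path do a complete()) looks roughly like this userspace analogue (illustrative only; real kernel completions live in <linux/completion.h>):

```c
#include <assert.h>
#include <pthread.h>

/* Minimal countdown completion, modeled on the kernel's struct completion. */
struct completion {
	pthread_mutex_t	lock;
	pthread_cond_t	cond;
	int		remaining;
};

static void init_completion(struct completion *c, int n)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->remaining = n;	/* number of CPUs that must report in */
}

/* Called from each CPU's context-switch path in the sketched scheme. */
static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	if (--c->remaining == 0)
		pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* The grace-period initiator blocks here until all CPUs have switched. */
static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (c->remaining > 0)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}
```

The cost concern is the complete() call itself: it adds a lock/unlock pair to rcu_note_context_switch(), which runs on every context switch, hence the preference for an approach that keeps that fastpath untouched.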

							Thanx, Paul


