linux-kernel.vger.kernel.org archive mirror
* [RFC] rcuperf: Make rcuperf test more robust for !expedited mode
@ 2019-07-03  4:39 Joel Fernandes (Google)
  2019-07-03 17:23 ` Paul E. McKenney
  0 siblings, 1 reply; 3+ messages in thread
From: Joel Fernandes (Google) @ 2019-07-03  4:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	Davidlohr Bueso, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
	Paul E. McKenney, rcu, Steven Rostedt

It is possible for rcuperf to run concurrently with init starting up.
During this time, the system is running all grace periods as expedited.
However, rcuperf can also be run in a normal mode. The rcuperf test
depends on a holdoff before starting the test to ensure grace periods
start later. This works fine with the default holdoff time; however, it is
not robust in situations where init takes longer than the holdoff time
to finish running. Or, as in my case:

I modified the rcuperf test locally to also run a thread that did
preempt disable/enable in a loop. This had the effect of slowing down
init. The end result was that the "batches:" counter was 0. This was
because only expedited GPs seemed to happen, not normal ones, which led
to the rcu_state.gp_seq counter remaining constant across grace periods
that unexpectedly happened to be expedited.

Debugging this showed that even though the test was meant to measure
normal GP performance, init had not yet run far enough for the
rcu_unexpedite_gp() call to have happened. In other words, the
test would run concurrently with init booting in expedited GP mode.

To fix this properly, let us just check whether rcu_unexpedite_gp()
has been called yet before starting the writer test. With this, the holdoff
parameter could also be dropped or reduced to speed up the test.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
Please consider this patch as an RFC only! This is the first time I am
running the RCU performance tests, thanks!

Question:
I actually did not know that an expedited GP does not increment
rcu_state.gp_seq. Do expedited GPs not go through the same RCU-tree
machinery as non-expedited ones? If yes, why doesn't rcu_state.gp_seq
increment when we are expedited? If no, why not?

 kernel/rcu/rcu.h     | 2 ++
 kernel/rcu/rcuperf.c | 5 +++++
 kernel/rcu/update.c  | 9 +++++++++
 3 files changed, 16 insertions(+)

diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 8fd4f82c9b3d..5d30dbc7000b 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -429,12 +429,14 @@ static inline void srcu_init(void) { }
 static inline bool rcu_gp_is_normal(void) { return true; }
 static inline bool rcu_gp_is_expedited(void) { return false; }
 static inline void rcu_expedite_gp(void) { }
+static inline bool rcu_expedite_gp_called(void) { return false; }
 static inline void rcu_unexpedite_gp(void) { }
 static inline void rcu_request_urgent_qs_task(struct task_struct *t) { }
 #else /* #ifdef CONFIG_TINY_RCU */
 bool rcu_gp_is_normal(void);     /* Internal RCU use. */
 bool rcu_gp_is_expedited(void);  /* Internal RCU use. */
 void rcu_expedite_gp(void);
+bool rcu_expedite_gp_called(void);
 void rcu_unexpedite_gp(void);
 void rcupdate_announce_bootup_oddness(void);
 void rcu_request_urgent_qs_task(struct task_struct *t);
diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
index 4513807cd4c4..9902857d3cc6 100644
--- a/kernel/rcu/rcuperf.c
+++ b/kernel/rcu/rcuperf.c
@@ -375,6 +375,11 @@ rcu_perf_writer(void *arg)
 	if (holdoff)
 		schedule_timeout_uninterruptible(holdoff * HZ);
 
+	// Wait for rcu_unexpedite_gp() to be called from init to avoid
+	// doing expedited GPs if we are not supposed to
+	while (!gp_exp && rcu_expedite_gp_called())
+		schedule_timeout_uninterruptible(1);
+
 	t = ktime_get_mono_fast_ns();
 	if (atomic_inc_return(&n_rcu_perf_writer_started) >= nrealwriters) {
 		t_rcu_perf_writer_started = t;
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 249517058b13..840f62805d62 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -154,6 +154,15 @@ void rcu_expedite_gp(void)
 }
 EXPORT_SYMBOL_GPL(rcu_expedite_gp);
 
+/**
+ * rcu_expedite_gp_called - Was there a prior call to rcu_expedite_gp()?
+ */
+bool rcu_expedite_gp_called(void)
+{
+	return (atomic_read(&rcu_expedited_nesting) != 0);
+}
+EXPORT_SYMBOL_GPL(rcu_expedite_gp_called);
+
 /**
  * rcu_unexpedite_gp - Cancel prior rcu_expedite_gp() invocation
  *
-- 
2.22.0.410.gd8fdbe21b5-goog


* Re: [RFC] rcuperf: Make rcuperf test more robust for !expedited mode
  2019-07-03  4:39 [RFC] rcuperf: Make rcuperf test more robust for !expedited mode Joel Fernandes (Google)
@ 2019-07-03 17:23 ` Paul E. McKenney
  2019-07-03 20:37   ` Joel Fernandes
  0 siblings, 1 reply; 3+ messages in thread
From: Paul E. McKenney @ 2019-07-03 17:23 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, Davidlohr Bueso, Josh Triplett, Lai Jiangshan,
	Mathieu Desnoyers, rcu, Steven Rostedt

On Wed, Jul 03, 2019 at 12:39:45AM -0400, Joel Fernandes (Google) wrote:
> It is possible for rcuperf to run concurrently with init starting up.
> During this time, the system is running all grace periods as expedited.
> However, rcuperf can also be run in a normal mode. The rcuperf test
> depends on a holdoff before starting the test to ensure grace periods
> start later. This works fine with the default holdoff time; however, it is
> not robust in situations where init takes longer than the holdoff time
> to finish running. Or, as in my case:
> 
> I modified the rcuperf test locally to also run a thread that did
> preempt disable/enable in a loop. This had the effect of slowing down
> init. The end result was that the "batches:" counter was 0. This was
> because only expedited GPs seemed to happen, not normal ones, which led
> to the rcu_state.gp_seq counter remaining constant across grace periods
> that unexpectedly happened to be expedited.
> 
> Debugging this showed that even though the test was meant to measure
> normal GP performance, init had not yet run far enough for the
> rcu_unexpedite_gp() call to have happened. In other words, the
> test would run concurrently with init booting in expedited GP mode.
> 
> To fix this properly, let us just check whether rcu_unexpedite_gp()
> has been called yet before starting the writer test. With this, the holdoff
> parameter could also be dropped or reduced to speed up the test.
> 
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
> Please consider this patch as an RFC only! This is the first time I am
> running the RCU performance tests, thanks!

Another approach is to create (say) a late_initcall() function that
sets a global variable.  Then have the wait loop wait for that global
variable to be set.  Or use an explicit wait/wakeup scheme, if you wish.

This has the virtue of keeping this (admittedly small) bit of complexity
out of the core kernel.
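
A minimal userspace sketch of the flag-and-wait idea follows. It is a model
under assumptions, not the respun patch: boot_done, late_init_hook(),
sleep_one_tick(), and writer_wait_for_boot() are hypothetical names, and init
progress is simulated by a tick counter. In the kernel, the hook would be
registered with late_initcall() and the loop body would be
schedule_timeout_uninterruptible(1).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag: set once the late-init hook has run. */
static bool boot_done;

/* Simulated amount of init work (in ticks) still outstanding. */
static int init_ticks_left;

/* Stand-in for the function that would be registered via late_initcall(). */
static void late_init_hook(void)
{
	boot_done = true;
}

/*
 * Stand-in for sleeping one tick: while the writer sleeps, the simulated
 * init makes one tick of progress and runs the hook when it finishes.
 */
static void sleep_one_tick(void)
{
	if (init_ticks_left > 0 && --init_ticks_left == 0)
		late_init_hook();
}

/*
 * Model of the writer's wait loop: poll the flag, sleeping one tick per
 * iteration. Returns the number of ticks the writer had to wait.
 */
static int writer_wait_for_boot(int init_ticks)
{
	int waited = 0;

	assert(init_ticks >= 0);
	boot_done = false;
	init_ticks_left = init_ticks;
	if (init_ticks_left == 0)
		late_init_hook();	/* init finished before the writer started */

	while (!boot_done) {
		sleep_one_tick();
		waited++;
	}
	return waited;
}
```

In this model, writer_wait_for_boot(5) waits five simulated ticks and
writer_wait_for_boot(0) does not wait at all, so the delay tracks how long
init actually takes rather than a fixed holdoff.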

> Question:
> I actually did not know that an expedited GP does not increment
> rcu_state.gp_seq. Do expedited GPs not go through the same RCU-tree
> machinery as non-expedited ones? If yes, why doesn't rcu_state.gp_seq
> increment when we are expedited? If no, why not?

They are indeed (mostly) independent mechanisms.

This is in contrast to SRCU, where an expedited grace period does what
you expect, causing all grace periods to do less waiting until the
most recent expedited grace period has completed.

Why the difference?

o	Current SRCU uses have relatively few updates, so the decreases
	in batching effectiveness for normal grace periods are less
	troublesome than they would be for RCU.  Shortening RCU grace
	periods would significantly increase per-update overhead, for
	example, and less so for SRCU.

o	RCU uses a much more distributed design, which means that
	expediting an already-started RCU grace period would be more
	challenging than it is for SRCU.  The race conditions between
	an "expedite now!" event and the various changes in state for
	a normal RCU grace period would be challenging.

o	In addition, RCU's more distributed design results in
	higher latencies.  Expedited RCU grace periods simply bypass
	this and get much better latencies.

So, yes, normal and expedited RCU grace periods could be converged, but
it does not seem like a good idea given current requirements.
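
The independence described above can be illustrated with a toy model. This is
a sketch under assumptions, not kernel code: struct toy_rcu_state, exp_seq,
and the toy_* functions are hypothetical, and the counters simply count
completed grace periods. It shows why a "batches:" figure derived from gp_seq
stays at 0 while every grace period is expedited.

```c
#include <assert.h>

struct toy_rcu_state {
	unsigned long gp_seq;	/* advanced only by normal grace periods */
	unsigned long exp_seq;	/* hypothetical: advanced only by expedited ones */
	int expedited_nesting;	/* nonzero while grace periods are forced expedited */
};

/* One update-side wait: it takes exactly one of the two independent paths. */
static void toy_synchronize_rcu(struct toy_rcu_state *s)
{
	if (s->expedited_nesting != 0)
		s->exp_seq++;
	else
		s->gp_seq++;
}

/* Run n_updates waits and report how far gp_seq moved in the meantime. */
static unsigned long toy_count_batches(int expedited_nesting, int n_updates)
{
	struct toy_rcu_state s = { 0, 0, expedited_nesting };
	unsigned long start = s.gp_seq;
	int i;

	assert(n_updates >= 0);
	for (i = 0; i < n_updates; i++)
		toy_synchronize_rcu(&s);
	return s.gp_seq - start;
}
```

With a nonzero nesting count, toy_count_batches(1, 10) returns 0 because all
ten waits go down the expedited path; with toy_count_batches(0, 10) all ten
advance gp_seq and the result is 10.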

							Thanx, Paul




* Re: [RFC] rcuperf: Make rcuperf test more robust for !expedited mode
  2019-07-03 17:23 ` Paul E. McKenney
@ 2019-07-03 20:37   ` Joel Fernandes
  0 siblings, 0 replies; 3+ messages in thread
From: Joel Fernandes @ 2019-07-03 20:37 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, Davidlohr Bueso, Josh Triplett, Lai Jiangshan,
	Mathieu Desnoyers, rcu, Steven Rostedt

On Wed, Jul 03, 2019 at 10:23:44AM -0700, Paul E. McKenney wrote:
> On Wed, Jul 03, 2019 at 12:39:45AM -0400, Joel Fernandes (Google) wrote:
> > It is possible for rcuperf to run concurrently with init starting up.
> > During this time, the system is running all grace periods as expedited.
> > However, rcuperf can also be run in a normal mode. The rcuperf test
> > depends on a holdoff before starting the test to ensure grace periods
> > start later. This works fine with the default holdoff time; however, it is
> > not robust in situations where init takes longer than the holdoff time
> > to finish running. Or, as in my case:
> > 
> > I modified the rcuperf test locally to also run a thread that did
> > preempt disable/enable in a loop. This had the effect of slowing down
> > init. The end result was that the "batches:" counter was 0. This was
> > because only expedited GPs seemed to happen, not normal ones, which led
> > to the rcu_state.gp_seq counter remaining constant across grace periods
> > that unexpectedly happened to be expedited.
> > 
> > Debugging this showed that even though the test was meant to measure
> > normal GP performance, init had not yet run far enough for the
> > rcu_unexpedite_gp() call to have happened. In other words, the
> > test would run concurrently with init booting in expedited GP mode.
> > 
> > To fix this properly, let us just check whether rcu_unexpedite_gp()
> > has been called yet before starting the writer test. With this, the holdoff
> > parameter could also be dropped or reduced to speed up the test.
> > 
> > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > ---
> > Please consider this patch as an RFC only! This is the first time I am
> > running the RCU performance tests, thanks!
> 
> Another approach is to create (say) a late_initcall() function that
> sets a global variable.  Then have the wait loop wait for that global
> variable to be set.  Or use an explicit wait/wakeup scheme, if you wish.
> 
> This has the virtue of keeping this (admittedly small) bit of complexity
> out of the core kernel.

Agreed, I thought of the late_initcall approach as well. I will respin the
patch to do that.

> > Question:
> > I actually did not know that an expedited GP does not increment
> > rcu_state.gp_seq. Do expedited GPs not go through the same RCU-tree
> > machinery as non-expedited ones? If yes, why doesn't rcu_state.gp_seq
> > increment when we are expedited? If no, why not?
> 
> They are indeed (mostly) independent mechanisms.
> 
> This is in contrast to SRCU, where an expedited grace period does what
> you expect, causing all grace periods to do less waiting until the
> most recent expedited grace period has completed.
> 
> Why the difference?
> 
> o	Current SRCU uses have relatively few updates, so the decreases
> 	in batching effectiveness for normal grace periods are less
> 	troublesome than they would be for RCU.  Shortening RCU grace
> 	periods would significantly increase per-update overhead, for
> 	example, and less so for SRCU.
> 
> o	RCU uses a much more distributed design, which means that
> 	expediting an already-started RCU grace period would be more
> 	challenging than it is for SRCU.  The race conditions between
> 	an "expedite now!" event and the various changes in state for
> 	a normal RCU grace period would be challenging.
> 
> o	In addition, RCU's more distributed design results in
> 	higher latencies.  Expedited RCU grace periods simply bypass
> 	this and get much better latencies.
> 
> So, yes, normal and expedited RCU grace periods could be converged, but
> it does not seem like a good idea given current requirements.

Thanks a lot for the explanation of these subtleties. I really appreciate
it, and it will serve as a great future reference for everyone (and for my notes!).

Thanks again!

- Joel
