linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 1/3] rcu/tree: Add a warning if CPU being onlined did not report QS already
@ 2020-07-31  0:40 Joel Fernandes (Google)
  2020-07-31  0:40 ` [PATCH v2 2/3] rcu/tree: Clarify comments about FQS loop reporting quiescent states Joel Fernandes (Google)
  2020-07-31  0:40 ` [PATCH v2 3/3] rcu/tree: Make FQS complaining about offline CPU more aggressive Joel Fernandes (Google)
  0 siblings, 2 replies; 4+ messages in thread
From: Joel Fernandes (Google) @ 2020-07-31  0:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	Paul E . McKenney, Neeraj Upadhyay, Josh Triplett, Lai Jiangshan,
	Mathieu Desnoyers, rcu, Steven Rostedt

Add a warning if CPU being onlined did not report QS already. This is to
simplify the code in the CPU onlining path and also to make clear about
where QS is reported. The act of QS reporting in CPU onlining path is
is likely unnecessary as shown by code reading and testing with
rcutorture's TREE03 and hotplug parameters.

Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>

---
 kernel/rcu/tree.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 65e1b5e92319..6b6fc28bb670 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3996,7 +3996,19 @@ void rcu_cpu_starting(unsigned int cpu)
 	rcu_gpnum_ovf(rnp, rdp); /* Offline-induced counter wrap? */
 	rdp->rcu_onl_gp_seq = READ_ONCE(rcu_state.gp_seq);
 	rdp->rcu_onl_gp_flags = READ_ONCE(rcu_state.gp_flags);
-	if (rnp->qsmask & mask) { /* RCU waiting on incoming CPU? */
+
+	/*
+	 * Delete QS reporting from here, by June 2021, if warning does not
+	 * fire, in order to make the rules for reporting QS for an offline CPUs
+	 * more clear. The CPU onlining path does not need to report QS for
+	 * an offline CPU. Either the QS should have reported during CPU
+	 * offlining, or during rcu_gp_init() if it detected a race with either
+	 * CPU offlining or task unblocking on previously offlined CPUs. Note
+	 * that to avoid hotplug notifier deadlocks,the FQS loop also does not
+	 * report QS for an offline CPU any longer (unless it splats due to an
+	 * offline CPU blocking the GP for too long).
+	 */
+	if (WARN_ON_ONCE(rnp->qsmask & mask)) { /* RCU waiting on incoming CPU? */
 		rcu_disable_urgency_upon_qs(rdp);
 		/* Report QS -after- changing ->qsmaskinitnext! */
 		rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags);
-- 
2.28.0.163.g6104cc2f0b6-goog


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH v2 2/3] rcu/tree: Clarify comments about FQS loop reporting quiescent states
  2020-07-31  0:40 [PATCH v2 1/3] rcu/tree: Add a warning if CPU being onlined did not report QS already Joel Fernandes (Google)
@ 2020-07-31  0:40 ` Joel Fernandes (Google)
  2020-07-31  0:40 ` [PATCH v2 3/3] rcu/tree: Make FQS complaining about offline CPU more aggressive Joel Fernandes (Google)
  1 sibling, 0 replies; 4+ messages in thread
From: Joel Fernandes (Google) @ 2020-07-31  0:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	Josh Triplett, Lai Jiangshan, Mathieu Desnoyers, Neeraj Upadhyay,
	Paul E. McKenney, rcu, Steven Rostedt

At least since v4.19, the FQS loop no longer reports quiescent states
unless it is a dire situation where an offlined CPU failed to report
a quiescent state. Let us clarify the comment in rcu_gp_init() inorder
to keep the comment current.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/tree.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 6b6fc28bb670..a621932cc385 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1701,8 +1701,8 @@ static bool rcu_gp_init(void)
 
 	/*
 	 * Apply per-leaf buffered online and offline operations to the
-	 * rcu_node tree.  Note that this new grace period need not wait
-	 * for subsequent online CPUs, and that quiescent-state forcing
+	 * rcu_node tree.  Note that this new grace period need not wait for
+	 * subsequent online CPUs, and that RCU hooks in CPU offlining path
 	 * will handle subsequent offline CPUs.
 	 */
 	rcu_state.gp_state = RCU_GP_ONOFF;
-- 
2.28.0.163.g6104cc2f0b6-goog


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH v2 3/3] rcu/tree: Make FQS complaining about offline CPU more aggressive
  2020-07-31  0:40 [PATCH v2 1/3] rcu/tree: Add a warning if CPU being onlined did not report QS already Joel Fernandes (Google)
  2020-07-31  0:40 ` [PATCH v2 2/3] rcu/tree: Clarify comments about FQS loop reporting quiescent states Joel Fernandes (Google)
@ 2020-07-31  0:40 ` Joel Fernandes (Google)
  2020-07-31  1:36   ` Paul E. McKenney
  1 sibling, 1 reply; 4+ messages in thread
From: Joel Fernandes (Google) @ 2020-07-31  0:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	Josh Triplett, Lai Jiangshan, Mathieu Desnoyers, Neeraj Upadhyay,
	Paul E. McKenney, rcu, Steven Rostedt

Make FQS loop consider it an immediate failure if the case of an offline CPU
reporting QS is detected, instead of a full second.

This is because rcu_report_dead() already reports quiescent states and
updates ->qsmaskinitnext under node lock.

Light testing with TREE03 and hotplug shows no warnings.

Convert the warning as well to WARN_ON_ONCE() to reduce log spam.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/tree.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index a621932cc385..39bdd744ba97 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1208,13 +1208,15 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
 		return 1;
 	}
 
-	/* If waiting too long on an offline CPU, complain. */
-	if (!(rdp->grpmask & rcu_rnp_online_cpus(rnp)) &&
-	    time_after(jiffies, rcu_state.gp_start + HZ)) {
+	/*
+	 * Complain if an offline CPU by RCU's books has not reported QS. Node
+	 * lock is held ensuring offlining does not race here.
+	 */
+	if (!(rdp->grpmask & rcu_rnp_online_cpus(rnp))) {
 		bool onl;
 		struct rcu_node *rnp1;
 
-		WARN_ON(1);  /* Offline CPUs are supposed to report QS! */
+		WARN_ON_ONCE(1);  /* Offline CPUs are supposed to report QS! */
 		pr_info("%s: grp: %d-%d level: %d ->gp_seq %ld ->completedqs %ld\n",
 			__func__, rnp->grplo, rnp->grphi, rnp->level,
 			(long)rnp->gp_seq, (long)rnp->completedqs);
-- 
2.28.0.163.g6104cc2f0b6-goog


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2 3/3] rcu/tree: Make FQS complaining about offline CPU more aggressive
  2020-07-31  0:40 ` [PATCH v2 3/3] rcu/tree: Make FQS complaining about offline CPU more aggressive Joel Fernandes (Google)
@ 2020-07-31  1:36   ` Paul E. McKenney
  0 siblings, 0 replies; 4+ messages in thread
From: Paul E. McKenney @ 2020-07-31  1:36 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
	Neeraj Upadhyay, rcu, Steven Rostedt

On Thu, Jul 30, 2020 at 08:40:12PM -0400, Joel Fernandes (Google) wrote:
> Make FQS loop consider it an immediate failure if the case of an offline CPU
> reporting QS is detected, instead of a full second.
> 
> This is because rcu_report_dead() already reports quiescent states and
> updates ->qsmaskinitnext under node lock.
> 
> Light testing with TREE03 and hotplug shows no warnings.
> 
> Convert the warning as well to WARN_ON_ONCE() to reduce log spam.

I will give you a chance to upgrade the above on your V3.
And the comment below.

I do very much like the change to WARN_ON_ONCE(), by the way!

							Thanx, Paul

> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
>  kernel/rcu/tree.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index a621932cc385..39bdd744ba97 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1208,13 +1208,15 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
>  		return 1;
>  	}
>  
> -	/* If waiting too long on an offline CPU, complain. */
> -	if (!(rdp->grpmask & rcu_rnp_online_cpus(rnp)) &&
> -	    time_after(jiffies, rcu_state.gp_start + HZ)) {
> +	/*
> +	 * Complain if an offline CPU by RCU's books has not reported QS. Node
> +	 * lock is held ensuring offlining does not race here.
> +	 */
> +	if (!(rdp->grpmask & rcu_rnp_online_cpus(rnp))) {
>  		bool onl;
>  		struct rcu_node *rnp1;
>  
> -		WARN_ON(1);  /* Offline CPUs are supposed to report QS! */
> +		WARN_ON_ONCE(1);  /* Offline CPUs are supposed to report QS! */
>  		pr_info("%s: grp: %d-%d level: %d ->gp_seq %ld ->completedqs %ld\n",
>  			__func__, rnp->grplo, rnp->grphi, rnp->level,
>  			(long)rnp->gp_seq, (long)rnp->completedqs);
> -- 
> 2.28.0.163.g6104cc2f0b6-goog
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-07-31  1:36 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-31  0:40 [PATCH v2 1/3] rcu/tree: Add a warning if CPU being onlined did not report QS already Joel Fernandes (Google)
2020-07-31  0:40 ` [PATCH v2 2/3] rcu/tree: Clarify comments about FQS loop reporting quiescent states Joel Fernandes (Google)
2020-07-31  0:40 ` [PATCH v2 3/3] rcu/tree: Make FQS complaining about offline CPU more aggressive Joel Fernandes (Google)
2020-07-31  1:36   ` Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).