linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] rcu: two small fixes for RCU kthreads
@ 2015-09-04 12:11 Petr Mladek
  2015-09-04 12:11 ` [PATCH 1/2] rcu: Show the real fqs_state Petr Mladek
  2015-09-04 12:11 ` [PATCH 2/2] rcu: Fix up timeouts for forcing the quiescent state Petr Mladek
  0 siblings, 2 replies; 11+ messages in thread
From: Petr Mladek @ 2015-09-04 12:11 UTC (permalink / raw)
  To: Paul E. McKenney, Josh Triplett
  Cc: Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Jiri Kosina,
	linux-kernel, Petr Mladek

I am trying to convert kthreads into a more sane API. I played
also with RCU kthreads and found two small problems. They are
independent on the conversion, so I send the patches already now.

Petr Mladek (2):
  rcu: Show the real fqs_state
  rcu: Fix up timeouts for forcing the quiescent state

 kernel/rcu/tree.c | 90 +++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 58 insertions(+), 32 deletions(-)

-- 
1.8.5.6


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/2] rcu: Show the real fqs_state
  2015-09-04 12:11 [PATCH 0/2] rcu: two small fixes for RCU kthreads Petr Mladek
@ 2015-09-04 12:11 ` Petr Mladek
  2015-09-04 23:24   ` Paul E. McKenney
  2015-09-04 12:11 ` [PATCH 2/2] rcu: Fix up timeouts for forcing the quiescent state Petr Mladek
  1 sibling, 1 reply; 11+ messages in thread
From: Petr Mladek @ 2015-09-04 12:11 UTC (permalink / raw)
  To: Paul E. McKenney, Josh Triplett
  Cc: Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Jiri Kosina,
	linux-kernel, Petr Mladek

The value of "fqs_state" in struct rcu_state is always RCU_GP_IDLE.

The real state is stored in a local variable in rcu_gp_kthread().
It is modified by rcu_gp_fqs() via parameter and return value.
But the actual value is never stored to rsp->fqs_state.

The result is that print_one_rcu_state() does not show the real
state.

This code has been added 3 years ago by the commit 4cdfc175c25c89ee
("rcu: Move quiescent-state forcing into kthread"). I guess that it
was an overlook or optimization.

Anyway, the value seems to be manipulated only by the thread, except
for shoving the status. I do not see any risk in updating it directly
in the struct.

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 kernel/rcu/tree.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 9f75f25cc5d9..54af8d5f9f7b 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1927,16 +1927,15 @@ static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp)
 /*
  * Do one round of quiescent-state forcing.
  */
-static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
+static void rcu_gp_fqs(struct rcu_state *rsp)
 {
-	int fqs_state = fqs_state_in;
 	bool isidle = false;
 	unsigned long maxj;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
 	WRITE_ONCE(rsp->gp_activity, jiffies);
 	rsp->n_force_qs++;
-	if (fqs_state == RCU_SAVE_DYNTICK) {
+	if (rsp->fqs_state == RCU_SAVE_DYNTICK) {
 		/* Collect dyntick-idle snapshots. */
 		if (is_sysidle_rcu_state(rsp)) {
 			isidle = true;
@@ -1945,7 +1944,7 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
 		force_qs_rnp(rsp, dyntick_save_progress_counter,
 			     &isidle, &maxj);
 		rcu_sysidle_report_gp(rsp, isidle, maxj);
-		fqs_state = RCU_FORCE_QS;
+		rsp->fqs_state = RCU_FORCE_QS;
 	} else {
 		/* Handle dyntick-idle and offline CPUs. */
 		isidle = true;
@@ -1959,7 +1958,6 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
 			   READ_ONCE(rsp->gp_flags) & ~RCU_GP_FLAG_FQS);
 		raw_spin_unlock_irq(&rnp->lock);
 	}
-	return fqs_state;
 }
 
 /*
@@ -2041,7 +2039,6 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
  */
 static int __noreturn rcu_gp_kthread(void *arg)
 {
-	int fqs_state;
 	int gf;
 	unsigned long j;
 	int ret;
@@ -2073,7 +2070,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
 		}
 
 		/* Handle quiescent-state forcing. */
-		fqs_state = RCU_SAVE_DYNTICK;
+		rsp->fqs_state = RCU_SAVE_DYNTICK;
 		j = jiffies_till_first_fqs;
 		if (j > HZ) {
 			j = HZ;
@@ -2101,7 +2098,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
 				trace_rcu_grace_period(rsp->name,
 						       READ_ONCE(rsp->gpnum),
 						       TPS("fqsstart"));
-				fqs_state = rcu_gp_fqs(rsp, fqs_state);
+				rcu_gp_fqs(rsp);
 				trace_rcu_grace_period(rsp->name,
 						       READ_ONCE(rsp->gpnum),
 						       TPS("fqsend"));
-- 
1.8.5.6


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/2] rcu: Fix up timeouts for forcing the quiescent state
  2015-09-04 12:11 [PATCH 0/2] rcu: two small fixes for RCU kthreads Petr Mladek
  2015-09-04 12:11 ` [PATCH 1/2] rcu: Show the real fqs_state Petr Mladek
@ 2015-09-04 12:11 ` Petr Mladek
  2015-09-04 23:49   ` Paul E. McKenney
  1 sibling, 1 reply; 11+ messages in thread
From: Petr Mladek @ 2015-09-04 12:11 UTC (permalink / raw)
  To: Paul E. McKenney, Josh Triplett
  Cc: Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Jiri Kosina,
	linux-kernel, Petr Mladek

The deadline to force the quiescent state (jiffies_force_qs) is currently
updated only when the previous timeout passed. But the timeout used for
wait_event() is always the entire original timeout. This is strange.

First, we might miss the deadline if we wait after a spurious wake up
or after sleeping in cond_resched() because we wait too long.

Second, we might do another forcing too early if the previous forcing
was done earlier because of RCU_GP_FLAG_FQS and we later get a spurious
wake up. IMHO, we should reset the deadline in this case.

This patch updates the deadline "jiffies_force_qs" right after forcing
the quiescent state by rcu_gp_fqs().

Also it updates the remaining timeout according to the current jiffies and
the requested deadline.

It moves the cond_resched_rcu_qs() to a single place. It changes the order
of the check for the pending signal. But there never should be a pending
signal. If there was we would have bigger problems because wait_event()
would never sleep again until someone flushed the signal.

I have found these problems when trying to understand the code. I do not
have any reproducer. I think that it is hardly visible because
the spurious wakeup is rather theoretical.

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 kernel/rcu/tree.c | 77 ++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 53 insertions(+), 24 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 54af8d5f9f7b..aaeeabcba545 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2035,13 +2035,45 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
 }
 
 /*
+ * Normalize, update, and return the first timeout.
+ */
+static unsigned long normalize_jiffies_till_first_fqs(void)
+{
+	unsigned long j = jiffies_till_first_fqs;
+
+	if (unlikely(j > HZ)) {
+		j = HZ;
+		jiffies_till_first_fqs = HZ;
+	}
+
+	return j;
+}
+
+/*
+ * Normalize, update, and return the first timeout.
+ */
+static unsigned long normalize_jiffies_till_next_fqs(void)
+{
+	unsigned long j = jiffies_till_next_fqs;
+
+	if (unlikely(j > HZ)) {
+		j = HZ;
+		jiffies_till_next_fqs = HZ;
+	} else if (unlikely(j < 1)) {
+		j = 1;
+		jiffies_till_next_fqs = 1;
+	}
+
+	return j;
+}
+
+/*
  * Body of kthread that handles grace periods.
  */
 static int __noreturn rcu_gp_kthread(void *arg)
 {
 	int gf;
-	unsigned long j;
-	int ret;
+	unsigned long timeout, j;
 	struct rcu_state *rsp = arg;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
@@ -2071,22 +2103,18 @@ static int __noreturn rcu_gp_kthread(void *arg)
 
 		/* Handle quiescent-state forcing. */
 		rsp->fqs_state = RCU_SAVE_DYNTICK;
-		j = jiffies_till_first_fqs;
-		if (j > HZ) {
-			j = HZ;
-			jiffies_till_first_fqs = HZ;
-		}
-		ret = 0;
+		timeout = normalize_jiffies_till_first_fqs();
+		rsp->jiffies_force_qs = jiffies + timeout;
 		for (;;) {
-			if (!ret)
-				rsp->jiffies_force_qs = jiffies + j;
 			trace_rcu_grace_period(rsp->name,
 					       READ_ONCE(rsp->gpnum),
 					       TPS("fqswait"));
 			rsp->gp_state = RCU_GP_WAIT_FQS;
-			ret = wait_event_interruptible_timeout(rsp->gp_wq,
-					rcu_gp_fqs_check_wake(rsp, &gf), j);
+			wait_event_interruptible_timeout(rsp->gp_wq,
+					rcu_gp_fqs_check_wake(rsp, &gf),
+					timeout);
 			rsp->gp_state = RCU_GP_DOING_FQS;
+try_again:
 			/* Locking provides needed memory barriers. */
 			/* If grace period done, leave loop. */
 			if (!READ_ONCE(rnp->qsmask) &&
@@ -2099,28 +2127,29 @@ static int __noreturn rcu_gp_kthread(void *arg)
 						       READ_ONCE(rsp->gpnum),
 						       TPS("fqsstart"));
 				rcu_gp_fqs(rsp);
+				timeout = normalize_jiffies_till_next_fqs();
+				rsp->jiffies_force_qs = jiffies + timeout;
 				trace_rcu_grace_period(rsp->name,
 						       READ_ONCE(rsp->gpnum),
 						       TPS("fqsend"));
-				cond_resched_rcu_qs();
-				WRITE_ONCE(rsp->gp_activity, jiffies);
 			} else {
 				/* Deal with stray signal. */
-				cond_resched_rcu_qs();
-				WRITE_ONCE(rsp->gp_activity, jiffies);
 				WARN_ON(signal_pending(current));
 				trace_rcu_grace_period(rsp->name,
 						       READ_ONCE(rsp->gpnum),
 						       TPS("fqswaitsig"));
 			}
-			j = jiffies_till_next_fqs;
-			if (j > HZ) {
-				j = HZ;
-				jiffies_till_next_fqs = HZ;
-			} else if (j < 1) {
-				j = 1;
-				jiffies_till_next_fqs = 1;
-			}
+			cond_resched_rcu_qs();
+			WRITE_ONCE(rsp->gp_activity, jiffies);
+			/*
+			 * Count the remaining timeout when it was a spurious
+			 * wakeup. Well, it is useful also when we have slept
+			 * in the cond_resched().
+			 */
+			j = jiffies;
+			if (ULONG_CMP_GE(j, rsp->jiffies_force_qs))
+				goto try_again;
+			timeout = rsp->jiffies_force_qs - j;
 		}
 
 		/* Handle grace-period end. */
-- 
1.8.5.6


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] rcu: Show the real fqs_state
  2015-09-04 12:11 ` [PATCH 1/2] rcu: Show the real fqs_state Petr Mladek
@ 2015-09-04 23:24   ` Paul E. McKenney
  2015-09-07 14:58     ` Petr Mladek
  0 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2015-09-04 23:24 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Jiri Kosina, linux-kernel

On Fri, Sep 04, 2015 at 02:11:29PM +0200, Petr Mladek wrote:
> The value of "fqs_state" in struct rcu_state is always RCU_GP_IDLE.
> 
> The real state is stored in a local variable in rcu_gp_kthread().
> It is modified by rcu_gp_fqs() via parameter and return value.
> But the actual value is never stored to rsp->fqs_state.
> 
> The result is that print_one_rcu_state() does not show the real
> state.
> 
> This code has been added 3 years ago by the commit 4cdfc175c25c89ee
> ("rcu: Move quiescent-state forcing into kthread"). I guess that it
> was an overlook or optimization.
> 
> Anyway, the value seems to be manipulated only by the thread, except
> for shoving the status. I do not see any risk in updating it directly
> in the struct.
> 
> Signed-off-by: Petr Mladek <pmladek@suse.com>

Good catch, but how about the following fix instead?

							Thanx, Paul

------------------------------------------------------------------------

    rcu: Finish folding ->fqs_state into ->gp_state
    
    Commit commit 4cdfc175c25c89ee ("rcu: Move quiescent-state forcing
    into kthread") started the process of folding the old ->fqs_state
    into ->gp_state, but did not complete it.  This situation does not
    cause any malfunction, but can result in extremely confusing trace
    output.  This commit completes this task of eliminating ->fqs_state
    in favor of ->gp_state.
    
    Reported-by: Petr Mladek <pmladek@suse.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 69ab7ce2cf7b..04234936d897 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -97,7 +97,7 @@ struct rcu_state sname##_state = { \
 	.level = { &sname##_state.node[0] }, \
 	.rda = &sname##_data, \
 	.call = cr, \
-	.fqs_state = RCU_GP_IDLE, \
+	.gp_state = RCU_GP_IDLE, \
 	.gpnum = 0UL - 300UL, \
 	.completed = 0UL - 300UL, \
 	.orphan_lock = __RAW_SPIN_LOCK_UNLOCKED(&sname##_state.orphan_lock), \
@@ -1949,16 +1949,15 @@ static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp)
 /*
  * Do one round of quiescent-state forcing.
  */
-static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
+static void rcu_gp_fqs(struct rcu_state *rsp)
 {
-	int fqs_state = fqs_state_in;
 	bool isidle = false;
 	unsigned long maxj;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
 	WRITE_ONCE(rsp->gp_activity, jiffies);
 	rsp->n_force_qs++;
-	if (fqs_state == RCU_SAVE_DYNTICK) {
+	if (rsp->gp_state == RCU_SAVE_DYNTICK) {
 		/* Collect dyntick-idle snapshots. */
 		if (is_sysidle_rcu_state(rsp)) {
 			isidle = true;
@@ -1967,7 +1966,7 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
 		force_qs_rnp(rsp, dyntick_save_progress_counter,
 			     &isidle, &maxj);
 		rcu_sysidle_report_gp(rsp, isidle, maxj);
-		fqs_state = RCU_FORCE_QS;
+		rsp->gp_state = RCU_FORCE_QS;
 	} else {
 		/* Handle dyntick-idle and offline CPUs. */
 		isidle = true;
@@ -1981,7 +1980,6 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
 			   READ_ONCE(rsp->gp_flags) & ~RCU_GP_FLAG_FQS);
 		raw_spin_unlock_irq(&rnp->lock);
 	}
-	return fqs_state;
 }
 
 /*
@@ -2045,7 +2043,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
 	/* Declare grace period done. */
 	WRITE_ONCE(rsp->completed, rsp->gpnum);
 	trace_rcu_grace_period(rsp->name, rsp->completed, TPS("end"));
-	rsp->fqs_state = RCU_GP_IDLE;
+	rsp->gp_state = RCU_GP_IDLE;
 	rdp = this_cpu_ptr(rsp->rda);
 	/* Advance CBs to reduce false positives below. */
 	needgp = rcu_advance_cbs(rsp, rnp, rdp) || needgp;
@@ -2063,7 +2061,6 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
  */
 static int __noreturn rcu_gp_kthread(void *arg)
 {
-	int fqs_state;
 	int gf;
 	unsigned long j;
 	int ret;
@@ -2095,7 +2092,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
 		}
 
 		/* Handle quiescent-state forcing. */
-		fqs_state = RCU_SAVE_DYNTICK;
+		rsp->gp_state = RCU_SAVE_DYNTICK;
 		j = jiffies_till_first_fqs;
 		if (j > HZ) {
 			j = HZ;
@@ -2123,7 +2120,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
 				trace_rcu_grace_period(rsp->name,
 						       READ_ONCE(rsp->gpnum),
 						       TPS("fqsstart"));
-				fqs_state = rcu_gp_fqs(rsp, fqs_state);
+				rcu_gp_fqs(rsp);
 				trace_rcu_grace_period(rsp->name,
 						       READ_ONCE(rsp->gpnum),
 						       TPS("fqsend"));
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index d5f58e717c8b..9faad70a8246 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -417,12 +417,11 @@ struct rcu_data {
 	struct rcu_state *rsp;
 };
 
-/* Values for fqs_state field in struct rcu_state. */
+/* Values for gp_state field in struct rcu_state. */
 #define RCU_GP_IDLE		0	/* No grace period in progress. */
 #define RCU_GP_INIT		1	/* Grace period being initialized. */
 #define RCU_SAVE_DYNTICK	2	/* Need to scan dyntick state. */
 #define RCU_FORCE_QS		3	/* Need to force quiescent state. */
-#define RCU_SIGNAL_INIT		RCU_SAVE_DYNTICK
 
 /* Values for nocb_defer_wakeup field in struct rcu_data. */
 #define RCU_NOGP_WAKE_NOT	0
@@ -474,9 +473,8 @@ struct rcu_state {
 
 	/* The following fields are guarded by the root rcu_node's lock. */
 
-	u8	fqs_state ____cacheline_internodealigned_in_smp;
-						/* Force QS state. */
-	u8	boost;				/* Subject to priority boost. */
+	u8	boost ____cacheline_internodealigned_in_smp;
+						/* Subject to priority boost. */
 	unsigned long gpnum;			/* Current gp number. */
 	unsigned long completed;		/* # of last completed gp. */
 	struct task_struct *gp_kthread;		/* Task for grace periods. */
diff --git a/kernel/rcu/tree_trace.c b/kernel/rcu/tree_trace.c
index 999c3672f990..ef7093cc9b5c 100644
--- a/kernel/rcu/tree_trace.c
+++ b/kernel/rcu/tree_trace.c
@@ -268,7 +268,7 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp)
 	gpnum = rsp->gpnum;
 	seq_printf(m, "c=%ld g=%ld s=%d jfq=%ld j=%x ",
 		   ulong2long(rsp->completed), ulong2long(gpnum),
-		   rsp->fqs_state,
+		   rsp->gp_state,
 		   (long)(rsp->jiffies_force_qs - jiffies),
 		   (int)(jiffies & 0xffff));
 	seq_printf(m, "nfqs=%lu/nfqsng=%lu(%lu) fqlh=%lu oqlen=%ld/%ld\n",


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] rcu: Fix up timeouts for forcing the quiescent state
  2015-09-04 12:11 ` [PATCH 2/2] rcu: Fix up timeouts for forcing the quiescent state Petr Mladek
@ 2015-09-04 23:49   ` Paul E. McKenney
  2015-09-07 15:21     ` Petr Mladek
  0 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2015-09-04 23:49 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Jiri Kosina, linux-kernel

On Fri, Sep 04, 2015 at 02:11:30PM +0200, Petr Mladek wrote:
> The deadline to force the quiescent state (jiffies_force_qs) is currently
> updated only when the previous timeout passed. But the timeout used for
> wait_event() is always the entire original timeout. This is strange.

They tell me that kthreads aren't supposed to every catch signals,
hence the WARN_ON() in the early-exit case stray-signal case.

In the case where we were awakened with an explicit force-quiescent-state
request, we do the scan, and then wait the full time for the next scan.
So the point of the delay is to space out the scans, not to fit a
pre-determined schedule.

The reason we get awakened with an explicit force-quiescent-state
request is that a given CPU just got inundated with RCU callbacks
or that rcutorture wants to hammer this code path.

So I am not seeing this as anything in need of fixing.

Am I missing something subtle here?

							Thanx, Paul

> First, we might miss the deadline if we wait after a spurious wake up
> or after sleeping in cond_resched() because we wait too long.
> 
> Second, we might do another forcing too early if the previous forcing
> was done earlier because of RCU_GP_FLAG_FQS and we later get a spurious
> wake up. IMHO, we should reset the deadline in this case.
> 
> This patch updates the deadline "jiffies_force_qs" right after forcing
> the quiescent state by rcu_gp_fqs().
> 
> Also it updates the remaining timeout according to the current jiffies and
> the requested deadline.
> 
> It moves the cond_resched_rcu_qs() to a single place. It changes the order
> of the check for the pending signal. But there never should be a pending
> signal. If there was we would have bigger problems because wait_event()
> would never sleep again until someone flushed the signal.
> 
> I have found these problems when trying to understand the code. I do not
> have any reproducer. I think that it is hardly visible because
> the spurious wakeup is rather theoretical.
> 
> Signed-off-by: Petr Mladek <pmladek@suse.com>
> ---
>  kernel/rcu/tree.c | 77 ++++++++++++++++++++++++++++++++++++++-----------------
>  1 file changed, 53 insertions(+), 24 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 54af8d5f9f7b..aaeeabcba545 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2035,13 +2035,45 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
>  }
> 
>  /*
> + * Normalize, update, and return the first timeout.
> + */
> +static unsigned long normalize_jiffies_till_first_fqs(void)
> +{
> +	unsigned long j = jiffies_till_first_fqs;
> +
> +	if (unlikely(j > HZ)) {
> +		j = HZ;
> +		jiffies_till_first_fqs = HZ;
> +	}
> +
> +	return j;
> +}
> +
> +/*
> + * Normalize, update, and return the first timeout.
> + */
> +static unsigned long normalize_jiffies_till_next_fqs(void)
> +{
> +	unsigned long j = jiffies_till_next_fqs;
> +
> +	if (unlikely(j > HZ)) {
> +		j = HZ;
> +		jiffies_till_next_fqs = HZ;
> +	} else if (unlikely(j < 1)) {
> +		j = 1;
> +		jiffies_till_next_fqs = 1;
> +	}
> +
> +	return j;
> +}
> +
> +/*
>   * Body of kthread that handles grace periods.
>   */
>  static int __noreturn rcu_gp_kthread(void *arg)
>  {
>  	int gf;
> -	unsigned long j;
> -	int ret;
> +	unsigned long timeout, j;
>  	struct rcu_state *rsp = arg;
>  	struct rcu_node *rnp = rcu_get_root(rsp);
> 
> @@ -2071,22 +2103,18 @@ static int __noreturn rcu_gp_kthread(void *arg)
> 
>  		/* Handle quiescent-state forcing. */
>  		rsp->fqs_state = RCU_SAVE_DYNTICK;
> -		j = jiffies_till_first_fqs;
> -		if (j > HZ) {
> -			j = HZ;
> -			jiffies_till_first_fqs = HZ;
> -		}
> -		ret = 0;
> +		timeout = normalize_jiffies_till_first_fqs();
> +		rsp->jiffies_force_qs = jiffies + timeout;
>  		for (;;) {
> -			if (!ret)
> -				rsp->jiffies_force_qs = jiffies + j;
>  			trace_rcu_grace_period(rsp->name,
>  					       READ_ONCE(rsp->gpnum),
>  					       TPS("fqswait"));
>  			rsp->gp_state = RCU_GP_WAIT_FQS;
> -			ret = wait_event_interruptible_timeout(rsp->gp_wq,
> -					rcu_gp_fqs_check_wake(rsp, &gf), j);
> +			wait_event_interruptible_timeout(rsp->gp_wq,
> +					rcu_gp_fqs_check_wake(rsp, &gf),
> +					timeout);
>  			rsp->gp_state = RCU_GP_DOING_FQS;
> +try_again:
>  			/* Locking provides needed memory barriers. */
>  			/* If grace period done, leave loop. */
>  			if (!READ_ONCE(rnp->qsmask) &&
> @@ -2099,28 +2127,29 @@ static int __noreturn rcu_gp_kthread(void *arg)
>  						       READ_ONCE(rsp->gpnum),
>  						       TPS("fqsstart"));
>  				rcu_gp_fqs(rsp);
> +				timeout = normalize_jiffies_till_next_fqs();
> +				rsp->jiffies_force_qs = jiffies + timeout;
>  				trace_rcu_grace_period(rsp->name,
>  						       READ_ONCE(rsp->gpnum),
>  						       TPS("fqsend"));
> -				cond_resched_rcu_qs();
> -				WRITE_ONCE(rsp->gp_activity, jiffies);
>  			} else {
>  				/* Deal with stray signal. */
> -				cond_resched_rcu_qs();
> -				WRITE_ONCE(rsp->gp_activity, jiffies);
>  				WARN_ON(signal_pending(current));
>  				trace_rcu_grace_period(rsp->name,
>  						       READ_ONCE(rsp->gpnum),
>  						       TPS("fqswaitsig"));
>  			}
> -			j = jiffies_till_next_fqs;
> -			if (j > HZ) {
> -				j = HZ;
> -				jiffies_till_next_fqs = HZ;
> -			} else if (j < 1) {
> -				j = 1;
> -				jiffies_till_next_fqs = 1;
> -			}
> +			cond_resched_rcu_qs();
> +			WRITE_ONCE(rsp->gp_activity, jiffies);
> +			/*
> +			 * Count the remaining timeout when it was a spurious
> +			 * wakeup. Well, it is useful also when we have slept
> +			 * in the cond_resched().
> +			 */
> +			j = jiffies;
> +			if (ULONG_CMP_GE(j, rsp->jiffies_force_qs))
> +				goto try_again;
> +			timeout = rsp->jiffies_force_qs - j;
>  		}
> 
>  		/* Handle grace-period end. */
> -- 
> 1.8.5.6
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] rcu: Show the real fqs_state
  2015-09-04 23:24   ` Paul E. McKenney
@ 2015-09-07 14:58     ` Petr Mladek
  2015-09-08 19:59       ` Paul E. McKenney
  0 siblings, 1 reply; 11+ messages in thread
From: Petr Mladek @ 2015-09-07 14:58 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Jiri Kosina, linux-kernel

On Fri 2015-09-04 16:24:22, Paul E. McKenney wrote:
> On Fri, Sep 04, 2015 at 02:11:29PM +0200, Petr Mladek wrote:
> > The value of "fqs_state" in struct rcu_state is always RCU_GP_IDLE.
> > 
> > The real state is stored in a local variable in rcu_gp_kthread().
> > It is modified by rcu_gp_fqs() via parameter and return value.
> > But the actual value is never stored to rsp->fqs_state.
> > 
> > The result is that print_one_rcu_state() does not show the real
> > state.
> > 
> > This code has been added 3 years ago by the commit 4cdfc175c25c89ee
> > ("rcu: Move quiescent-state forcing into kthread"). I guess that it
> > was an overlook or optimization.
> > 
> > Anyway, the value seems to be manipulated only by the thread, except
> > for shoving the status. I do not see any risk in updating it directly
> > in the struct.
> > 
> > Signed-off-by: Petr Mladek <pmladek@suse.com>
> 
> Good catch, but how about the following fix instead?
> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
>     rcu: Finish folding ->fqs_state into ->gp_state
>     
>     Commit commit 4cdfc175c25c89ee ("rcu: Move quiescent-state forcing
>     into kthread") started the process of folding the old ->fqs_state
>     into ->gp_state, but did not complete it.  This situation does not
>     cause any malfunction, but can result in extremely confusing trace
>     output.  This commit completes this task of eliminating ->fqs_state
>     in favor of ->gp_state.

It makes sense but it breaks dynticks handling in rcu_gp_fqs(), see
below.

>     
>     Reported-by: Petr Mladek <pmladek@suse.com>
>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 69ab7ce2cf7b..04234936d897 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1949,16 +1949,15 @@ static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp)
>  /*
>   * Do one round of quiescent-state forcing.
>   */
> -static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
> +static void rcu_gp_fqs(struct rcu_state *rsp)
>  {
> -	int fqs_state = fqs_state_in;
>  	bool isidle = false;
>  	unsigned long maxj;
>  	struct rcu_node *rnp = rcu_get_root(rsp);
>  
>  	WRITE_ONCE(rsp->gp_activity, jiffies);
>  	rsp->n_force_qs++;
> -	if (fqs_state == RCU_SAVE_DYNTICK) {
> +	if (rsp->gp_state == RCU_SAVE_DYNTICK) {

This will never happen because rcu_gp_kthread() modifies rsp->gp_state
many times. The last value before calling rcu_gp_fqs() is
RCU_GP_DOING_FQS.

I think about passing this information via a separate bool.

[...]

> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index d5f58e717c8b..9faad70a8246 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -417,12 +417,11 @@ struct rcu_data {
>  	struct rcu_state *rsp;
>  };
>  
> -/* Values for fqs_state field in struct rcu_state. */
> +/* Values for gp_state field in struct rcu_state. */
>  #define RCU_GP_IDLE		0	/* No grace period in progress. */

This value seems to be used instead of the new RCU_GP_WAIT_INIT.

>  #define RCU_GP_INIT		1	/* Grace period being
>  #initialized. */

This value is unused.

>  #define RCU_SAVE_DYNTICK	2	/* Need to scan dyntick
>  #state. */

This one is not longer preserved when merged with the other state.

>  #define RCU_FORCE_QS		3	/* Need to force quiescent
>  #state. */

The meaning of this one is strange. If I get it correctly,
it is set after the state was forced. But the comment suggests
that it is before.

By other words, these states seems to get obsoleted by

/* Values for rcu_state structure's gp_flags field. */
#define RCU_GP_WAIT_INIT 0	/* Initial state. */
#define RCU_GP_WAIT_GPS  1	/* Wait for grace-period start. */
#define RCU_GP_DONE_GPS  2	/* Wait done for grace-period start. */
#define RCU_GP_WAIT_FQS  3	/* Wait for force-quiescent-state time. */
#define RCU_GP_DOING_FQS 4	/* Wait done for force-quiescent-state time. */
#define RCU_GP_CLEANUP   5	/* Grace-period cleanup started. */
#define RCU_GP_CLEANED   6	/* Grace-period cleanup complete. */


Please, find below your commit updated with my ideas:

	+ used bool save_dyntick instead of RCU_SAVE_DYNTICK
	  and RCU_FORCE_QS states
	+ rename RCU_GP_WAIT_INIT -> RCU_GP_IDLE
	+ remove all the obsolete states

I am sorry if I handled "Signed-off-by" flags a wrong way. It is
basically your patch with few small updates from me. I am not sure
what is the right process in this case. Feel free to use Reviewed-by
instead of Signed-off-by with my name.

Well, I guess that this is not the final state ;-)


>From 61a1bf6659f4f4c0c4021f185bc156f8c83f9ea5 Mon Sep 17 00:00:00 2001
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Date: Fri, 4 Sep 2015 16:24:22 -0700
Subject: [PATCH] rcu: Finish folding ->fqs_state into ->gp_state

Commit commit 4cdfc175c25c89ee ("rcu: Move quiescent-state forcing
into kthread") started the process of folding the old ->fqs_state
into ->gp_state, but did not complete it.  This situation does not
cause any malfunction, but can result in extremely confusing trace
output.  This commit completes this task of eliminating ->fqs_state
in favor of ->gp_state.

The old fqs_state had one side effect.  It was used to decide whether
to collect dyntick-idle snapshots.  For this purpose, we add a boolean
into the state struct.

Reported-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 kernel/rcu/tree.c       | 17 +++++++----------
 kernel/rcu/tree.h       | 16 +++++-----------
 kernel/rcu/tree_trace.c |  2 +-
 3 files changed, 13 insertions(+), 22 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 9f75f25cc5d9..f47067fdc783 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -98,7 +98,7 @@ struct rcu_state sname##_state = { \
 	.level = { &sname##_state.node[0] }, \
 	.rda = &sname##_data, \
 	.call = cr, \
-	.fqs_state = RCU_GP_IDLE, \
+	.gp_state = RCU_GP_IDLE, \
 	.gpnum = 0UL - 300UL, \
 	.completed = 0UL - 300UL, \
 	.orphan_lock = __RAW_SPIN_LOCK_UNLOCKED(&sname##_state.orphan_lock), \
@@ -1927,16 +1927,15 @@ static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp)
 /*
  * Do one round of quiescent-state forcing.
  */
-static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
+static void rcu_gp_fqs(struct rcu_state *rsp)
 {
-	int fqs_state = fqs_state_in;
 	bool isidle = false;
 	unsigned long maxj;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
 	WRITE_ONCE(rsp->gp_activity, jiffies);
 	rsp->n_force_qs++;
-	if (fqs_state == RCU_SAVE_DYNTICK) {
+	if (rsp->save_dyntick) {
 		/* Collect dyntick-idle snapshots. */
 		if (is_sysidle_rcu_state(rsp)) {
 			isidle = true;
@@ -1945,7 +1944,7 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
 		force_qs_rnp(rsp, dyntick_save_progress_counter,
 			     &isidle, &maxj);
 		rcu_sysidle_report_gp(rsp, isidle, maxj);
-		fqs_state = RCU_FORCE_QS;
+		rsp->save_dyntick = false;
 	} else {
 		/* Handle dyntick-idle and offline CPUs. */
 		isidle = true;
@@ -1959,7 +1958,6 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
 			   READ_ONCE(rsp->gp_flags) & ~RCU_GP_FLAG_FQS);
 		raw_spin_unlock_irq(&rnp->lock);
 	}
-	return fqs_state;
 }
 
 /*
@@ -2023,7 +2021,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
 	/* Declare grace period done. */
 	WRITE_ONCE(rsp->completed, rsp->gpnum);
 	trace_rcu_grace_period(rsp->name, rsp->completed, TPS("end"));
-	rsp->fqs_state = RCU_GP_IDLE;
+	rsp->gp_state = RCU_GP_IDLE;
 	rdp = this_cpu_ptr(rsp->rda);
 	/* Advance CBs to reduce false positives below. */
 	needgp = rcu_advance_cbs(rsp, rnp, rdp) || needgp;
@@ -2041,7 +2039,6 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
  */
 static int __noreturn rcu_gp_kthread(void *arg)
 {
-	int fqs_state;
 	int gf;
 	unsigned long j;
 	int ret;
@@ -2073,7 +2070,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
 		}
 
 		/* Handle quiescent-state forcing. */
-		fqs_state = RCU_SAVE_DYNTICK;
+		rsp->save_dyntick = true;
 		j = jiffies_till_first_fqs;
 		if (j > HZ) {
 			j = HZ;
@@ -2101,7 +2098,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
 				trace_rcu_grace_period(rsp->name,
 						       READ_ONCE(rsp->gpnum),
 						       TPS("fqsstart"));
-				fqs_state = rcu_gp_fqs(rsp, fqs_state);
+				rcu_gp_fqs(rsp);
 				trace_rcu_grace_period(rsp->name,
 						       READ_ONCE(rsp->gpnum),
 						       TPS("fqsend"));
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 2e991f8361e4..12303ff25077 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -412,13 +412,6 @@ struct rcu_data {
 	struct rcu_state *rsp;
 };
 
-/* Values for fqs_state field in struct rcu_state. */
-#define RCU_GP_IDLE		0	/* No grace period in progress. */
-#define RCU_GP_INIT		1	/* Grace period being initialized. */
-#define RCU_SAVE_DYNTICK	2	/* Need to scan dyntick state. */
-#define RCU_FORCE_QS		3	/* Need to force quiescent state. */
-#define RCU_SIGNAL_INIT		RCU_SAVE_DYNTICK
-
 /* Values for nocb_defer_wakeup field in struct rcu_data. */
 #define RCU_NOGP_WAKE_NOT	0
 #define RCU_NOGP_WAKE		1
@@ -469,15 +462,16 @@ struct rcu_state {
 
 	/* The following fields are guarded by the root rcu_node's lock. */
 
-	u8	fqs_state ____cacheline_internodealigned_in_smp;
-						/* Force QS state. */
-	u8	boost;				/* Subject to priority boost. */
+	u8	boost ____cacheline_internodealigned_in_smp;
+						/* Subject to priority boost. */
 	unsigned long gpnum;			/* Current gp number. */
 	unsigned long completed;		/* # of last completed gp. */
 	struct task_struct *gp_kthread;		/* Task for grace periods. */
 	wait_queue_head_t gp_wq;		/* Where GP task waits. */
 	short gp_flags;				/* Commands for GP task. */
 	short gp_state;				/* GP kthread sleep state. */
+	bool save_dyntick;			/* Collect dyntick-idle */
+						/* snapshots when forcing QS. */
 
 	/* End of fields guarded by root rcu_node's lock. */
 
@@ -539,7 +533,7 @@ struct rcu_state {
 #define RCU_GP_FLAG_FQS  0x2	/* Need grace-period quiescent-state forcing. */
 
 /* Values for rcu_state structure's gp_flags field. */
-#define RCU_GP_WAIT_INIT 0	/* Initial state. */
+#define RCU_GP_IDLE	 0	/* Initial state and no GP in progress. */
 #define RCU_GP_WAIT_GPS  1	/* Wait for grace-period start. */
 #define RCU_GP_DONE_GPS  2	/* Wait done for grace-period start. */
 #define RCU_GP_WAIT_FQS  3	/* Wait for force-quiescent-state time. */
diff --git a/kernel/rcu/tree_trace.c b/kernel/rcu/tree_trace.c
index 6fc4c5ff3bb5..1d61f5ba4641 100644
--- a/kernel/rcu/tree_trace.c
+++ b/kernel/rcu/tree_trace.c
@@ -268,7 +268,7 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp)
 	gpnum = rsp->gpnum;
 	seq_printf(m, "c=%ld g=%ld s=%d jfq=%ld j=%x ",
 		   ulong2long(rsp->completed), ulong2long(gpnum),
-		   rsp->fqs_state,
+		   rsp->gp_state,
 		   (long)(rsp->jiffies_force_qs - jiffies),
 		   (int)(jiffies & 0xffff));
 	seq_printf(m, "nfqs=%lu/nfqsng=%lu(%lu) fqlh=%lu oqlen=%ld/%ld\n",
-- 
1.8.5.6



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] rcu: Fix up timeouts for forcing the quiescent state
  2015-09-04 23:49   ` Paul E. McKenney
@ 2015-09-07 15:21     ` Petr Mladek
  2015-09-08 20:02       ` Paul E. McKenney
  0 siblings, 1 reply; 11+ messages in thread
From: Petr Mladek @ 2015-09-07 15:21 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Jiri Kosina, linux-kernel

On Fri 2015-09-04 16:49:46, Paul E. McKenney wrote:
> On Fri, Sep 04, 2015 at 02:11:30PM +0200, Petr Mladek wrote:
> > The deadline to force the quiescent state (jiffies_force_qs) is currently
> > updated only when the previous timeout passed. But the timeout used for
> > wait_event() is always the entire original timeout. This is strange.
> 
> They tell me that kthreads aren't supposed to every catch signals,
> hence the WARN_ON() in the early-exit case stray-signal case.

Yup, I have investigated this recently. All signals are really blocked
for kthreads by default. There are few threads that use signals but
they explicitly enable it by allow_signal().


> In the case where we were awakened with an explicit force-quiescent-state
> request, we do the scan, and then wait the full time for the next scan.
> So the point of the delay is to space out the scans, not to fit a
> pre-determined schedule.
> 
> The reason we get awakened with an explicit force-quiescent-state
> request is that a given CPU just got inundated with RCU callbacks
> or that rcutorture wants to hammer this code path.
> 
> So I am not seeing this as anything in need of fixing.
> 
> Am I missing something subtle here?

There is the commit 88d6df612cc3c99f5 ("rcu: Prevent spurious-wakeup
DoS attack on rcu_gp_kthread()"). It suggests that the spurious
wakeups are possible.

I would consider this patch as a fix/clean up of this Dos attack fix.
Huh, I forgot to mention it in the commit message.

To be honest, I personally do not know how to trigger the spurious
wakeup in the current state of the code. I am trying to convert
the kthread into the kthread worker API and there I got the spurious
wakeups but this is another story.

Thanks a lot for reviewing.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] rcu: Show the real fqs_state
  2015-09-07 14:58     ` Petr Mladek
@ 2015-09-08 19:59       ` Paul E. McKenney
  2015-09-09 12:39         ` Petr Mladek
  0 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2015-09-08 19:59 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Jiri Kosina, linux-kernel

On Mon, Sep 07, 2015 at 04:58:27PM +0200, Petr Mladek wrote:
> On Fri 2015-09-04 16:24:22, Paul E. McKenney wrote:
> > On Fri, Sep 04, 2015 at 02:11:29PM +0200, Petr Mladek wrote:
> > > The value of "fqs_state" in struct rcu_state is always RCU_GP_IDLE.
> > > 
> > > The real state is stored in a local variable in rcu_gp_kthread().
> > > It is modified by rcu_gp_fqs() via parameter and return value.
> > > But the actual value is never stored to rsp->fqs_state.
> > > 
> > > The result is that print_one_rcu_state() does not show the real
> > > state.
> > > 
> > > This code has been added 3 years ago by the commit 4cdfc175c25c89ee
> > > ("rcu: Move quiescent-state forcing into kthread"). I guess that it
> > > was an overlook or optimization.
> > > 
> > > Anyway, the value seems to be manipulated only by the thread, except
> > > for shoving the status. I do not see any risk in updating it directly
> > > in the struct.
> > > 
> > > Signed-off-by: Petr Mladek <pmladek@suse.com>
> > 
> > Good catch, but how about the following fix instead?
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> >     rcu: Finish folding ->fqs_state into ->gp_state
> >     
> >     Commit commit 4cdfc175c25c89ee ("rcu: Move quiescent-state forcing
> >     into kthread") started the process of folding the old ->fqs_state
> >     into ->gp_state, but did not complete it.  This situation does not
> >     cause any malfunction, but can result in extremely confusing trace
> >     output.  This commit completes this task of eliminating ->fqs_state
> >     in favor of ->gp_state.
> 
> It makes sense but it breaks dynticks handling in rcu_gp_fqs(), see
> below.

Indeed, more confusion on my part!

> >     Reported-by: Petr Mladek <pmladek@suse.com>
> >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 69ab7ce2cf7b..04234936d897 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -1949,16 +1949,15 @@ static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp)
> >  /*
> >   * Do one round of quiescent-state forcing.
> >   */
> > -static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
> > +static void rcu_gp_fqs(struct rcu_state *rsp)
> >  {
> > -	int fqs_state = fqs_state_in;
> >  	bool isidle = false;
> >  	unsigned long maxj;
> >  	struct rcu_node *rnp = rcu_get_root(rsp);
> >  
> >  	WRITE_ONCE(rsp->gp_activity, jiffies);
> >  	rsp->n_force_qs++;
> > -	if (fqs_state == RCU_SAVE_DYNTICK) {
> > +	if (rsp->gp_state == RCU_SAVE_DYNTICK) {
> 
> This will never happen because rcu_gp_kthread() modifies rsp->gp_state
> many times. The last value before calling rcu_gp_fqs() is
> RCU_GP_DOING_FQS.
> 
> I think about passing this information via a separate bool.
> 
> [...]
> 
> > diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> > index d5f58e717c8b..9faad70a8246 100644
> > --- a/kernel/rcu/tree.h
> > +++ b/kernel/rcu/tree.h
> > @@ -417,12 +417,11 @@ struct rcu_data {
> >  	struct rcu_state *rsp;
> >  };
> >  
> > -/* Values for fqs_state field in struct rcu_state. */
> > +/* Values for gp_state field in struct rcu_state. */
> >  #define RCU_GP_IDLE		0	/* No grace period in progress. */
> 
> This value seems to be used instead of the new RCU_GP_WAIT_INIT.
> 
> >  #define RCU_GP_INIT		1	/* Grace period being
> >  #initialized. */
> 
> This value is unused.
> 
> >  #define RCU_SAVE_DYNTICK	2	/* Need to scan dyntick
> >  #state. */
> 
> This one is not longer preserved when merged with the other state.
> 
> >  #define RCU_FORCE_QS		3	/* Need to force quiescent
> >  #state. */
> 
> The meaning of this one is strange. If I get it correctly,
> it is set after the state was forced. But the comment suggests
> that it is before.
> 
> By other words, these states seems to get obsoleted by
> 
> /* Values for rcu_state structure's gp_flags field. */
> #define RCU_GP_WAIT_INIT 0	/* Initial state. */
> #define RCU_GP_WAIT_GPS  1	/* Wait for grace-period start. */
> #define RCU_GP_DONE_GPS  2	/* Wait done for grace-period start. */
> #define RCU_GP_WAIT_FQS  3	/* Wait for force-quiescent-state time. */
> #define RCU_GP_DOING_FQS 4	/* Wait done for force-quiescent-state time. */
> #define RCU_GP_CLEANUP   5	/* Grace-period cleanup started. */
> #define RCU_GP_CLEANED   6	/* Grace-period cleanup complete. */
> 
> 
> Please, find below your commit updated with my ideas:
> 
> 	+ used bool save_dyntick instead of RCU_SAVE_DYNTICK
> 	  and RCU_FORCE_QS states
> 	+ rename RCU_GP_WAIT_INIT -> RCU_GP_IDLE
> 	+ remove all the obsolete states
> 
> I am sorry if I handled "Signed-off-by" flags a wrong way. It is
> basically your patch with few small updates from me. I am not sure
> what is the right process in this case. Feel free to use Reviewed-by
> instead of Signed-off-by with my name.
> 
> Well, I guess that this is not the final state ;-)

Good points, but perhaps an easier solution would be to have a
"firsttime" argument to rcu_gp_fqs() that said whether or not this
was the first call to rcu_gp_fqs() during the current grace period.
If this is the first call, then take the "if" branch that passes
dyntick_save_progress_counter() to force_qs_rnp(), otherwise take the
other branch.

An alternative approach would use the bottom bit of ->gp_state to
record whether or not the current grace period had done its first
call to rcu_gp_fqs().

But I am not generating the patch today, just flew across the Pacific
yesterday.  ;-)

						Thanx, Paul

> >From 61a1bf6659f4f4c0c4021f185bc156f8c83f9ea5 Mon Sep 17 00:00:00 2001
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Date: Fri, 4 Sep 2015 16:24:22 -0700
> Subject: [PATCH] rcu: Finish folding ->fqs_state into ->gp_state
> 
> Commit commit 4cdfc175c25c89ee ("rcu: Move quiescent-state forcing
> into kthread") started the process of folding the old ->fqs_state
> into ->gp_state, but did not complete it.  This situation does not
> cause any malfunction, but can result in extremely confusing trace
> output.  This commit completes this task of eliminating ->fqs_state
> in favor of ->gp_state.
> 
> The old fqs_state had one side effect.  It was used to decide whether
> to collect dyntick-idle snapshots.  For this purpose, we add a boolean
> into the state struct.
> 
> Reported-by: Petr Mladek <pmladek@suse.com>
> Signed-off-by: Petr Mladek <pmladek@suse.com>
> ---
>  kernel/rcu/tree.c       | 17 +++++++----------
>  kernel/rcu/tree.h       | 16 +++++-----------
>  kernel/rcu/tree_trace.c |  2 +-
>  3 files changed, 13 insertions(+), 22 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 9f75f25cc5d9..f47067fdc783 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -98,7 +98,7 @@ struct rcu_state sname##_state = { \
>  	.level = { &sname##_state.node[0] }, \
>  	.rda = &sname##_data, \
>  	.call = cr, \
> -	.fqs_state = RCU_GP_IDLE, \
> +	.gp_state = RCU_GP_IDLE, \
>  	.gpnum = 0UL - 300UL, \
>  	.completed = 0UL - 300UL, \
>  	.orphan_lock = __RAW_SPIN_LOCK_UNLOCKED(&sname##_state.orphan_lock), \
> @@ -1927,16 +1927,15 @@ static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp)
>  /*
>   * Do one round of quiescent-state forcing.
>   */
> -static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
> +static void rcu_gp_fqs(struct rcu_state *rsp)
>  {
> -	int fqs_state = fqs_state_in;
>  	bool isidle = false;
>  	unsigned long maxj;
>  	struct rcu_node *rnp = rcu_get_root(rsp);
> 
>  	WRITE_ONCE(rsp->gp_activity, jiffies);
>  	rsp->n_force_qs++;
> -	if (fqs_state == RCU_SAVE_DYNTICK) {
> +	if (rsp->save_dyntick) {
>  		/* Collect dyntick-idle snapshots. */
>  		if (is_sysidle_rcu_state(rsp)) {
>  			isidle = true;
> @@ -1945,7 +1944,7 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
>  		force_qs_rnp(rsp, dyntick_save_progress_counter,
>  			     &isidle, &maxj);
>  		rcu_sysidle_report_gp(rsp, isidle, maxj);
> -		fqs_state = RCU_FORCE_QS;
> +		rsp->save_dyntick = false;
>  	} else {
>  		/* Handle dyntick-idle and offline CPUs. */
>  		isidle = true;
> @@ -1959,7 +1958,6 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
>  			   READ_ONCE(rsp->gp_flags) & ~RCU_GP_FLAG_FQS);
>  		raw_spin_unlock_irq(&rnp->lock);
>  	}
> -	return fqs_state;
>  }
> 
>  /*
> @@ -2023,7 +2021,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
>  	/* Declare grace period done. */
>  	WRITE_ONCE(rsp->completed, rsp->gpnum);
>  	trace_rcu_grace_period(rsp->name, rsp->completed, TPS("end"));
> -	rsp->fqs_state = RCU_GP_IDLE;
> +	rsp->gp_state = RCU_GP_IDLE;
>  	rdp = this_cpu_ptr(rsp->rda);
>  	/* Advance CBs to reduce false positives below. */
>  	needgp = rcu_advance_cbs(rsp, rnp, rdp) || needgp;
> @@ -2041,7 +2039,6 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
>   */
>  static int __noreturn rcu_gp_kthread(void *arg)
>  {
> -	int fqs_state;
>  	int gf;
>  	unsigned long j;
>  	int ret;
> @@ -2073,7 +2070,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
>  		}
> 
>  		/* Handle quiescent-state forcing. */
> -		fqs_state = RCU_SAVE_DYNTICK;
> +		rsp->save_dyntick = true;
>  		j = jiffies_till_first_fqs;
>  		if (j > HZ) {
>  			j = HZ;
> @@ -2101,7 +2098,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
>  				trace_rcu_grace_period(rsp->name,
>  						       READ_ONCE(rsp->gpnum),
>  						       TPS("fqsstart"));
> -				fqs_state = rcu_gp_fqs(rsp, fqs_state);
> +				rcu_gp_fqs(rsp);
>  				trace_rcu_grace_period(rsp->name,
>  						       READ_ONCE(rsp->gpnum),
>  						       TPS("fqsend"));
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index 2e991f8361e4..12303ff25077 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -412,13 +412,6 @@ struct rcu_data {
>  	struct rcu_state *rsp;
>  };
> 
> -/* Values for fqs_state field in struct rcu_state. */
> -#define RCU_GP_IDLE		0	/* No grace period in progress. */
> -#define RCU_GP_INIT		1	/* Grace period being initialized. */
> -#define RCU_SAVE_DYNTICK	2	/* Need to scan dyntick state. */
> -#define RCU_FORCE_QS		3	/* Need to force quiescent state. */
> -#define RCU_SIGNAL_INIT		RCU_SAVE_DYNTICK
> -
>  /* Values for nocb_defer_wakeup field in struct rcu_data. */
>  #define RCU_NOGP_WAKE_NOT	0
>  #define RCU_NOGP_WAKE		1
> @@ -469,15 +462,16 @@ struct rcu_state {
> 
>  	/* The following fields are guarded by the root rcu_node's lock. */
> 
> -	u8	fqs_state ____cacheline_internodealigned_in_smp;
> -						/* Force QS state. */
> -	u8	boost;				/* Subject to priority boost. */
> +	u8	boost ____cacheline_internodealigned_in_smp;
> +						/* Subject to priority boost. */
>  	unsigned long gpnum;			/* Current gp number. */
>  	unsigned long completed;		/* # of last completed gp. */
>  	struct task_struct *gp_kthread;		/* Task for grace periods. */
>  	wait_queue_head_t gp_wq;		/* Where GP task waits. */
>  	short gp_flags;				/* Commands for GP task. */
>  	short gp_state;				/* GP kthread sleep state. */
> +	bool save_dyntick;			/* Collect dyntick-idle */
> +						/* snapshots when forcing QS. */
> 
>  	/* End of fields guarded by root rcu_node's lock. */
> 
> @@ -539,7 +533,7 @@ struct rcu_state {
>  #define RCU_GP_FLAG_FQS  0x2	/* Need grace-period quiescent-state forcing. */
> 
>  /* Values for rcu_state structure's gp_flags field. */
> -#define RCU_GP_WAIT_INIT 0	/* Initial state. */
> +#define RCU_GP_IDLE	 0	/* Initial state and no GP in progress. */
>  #define RCU_GP_WAIT_GPS  1	/* Wait for grace-period start. */
>  #define RCU_GP_DONE_GPS  2	/* Wait done for grace-period start. */
>  #define RCU_GP_WAIT_FQS  3	/* Wait for force-quiescent-state time. */
> diff --git a/kernel/rcu/tree_trace.c b/kernel/rcu/tree_trace.c
> index 6fc4c5ff3bb5..1d61f5ba4641 100644
> --- a/kernel/rcu/tree_trace.c
> +++ b/kernel/rcu/tree_trace.c
> @@ -268,7 +268,7 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp)
>  	gpnum = rsp->gpnum;
>  	seq_printf(m, "c=%ld g=%ld s=%d jfq=%ld j=%x ",
>  		   ulong2long(rsp->completed), ulong2long(gpnum),
> -		   rsp->fqs_state,
> +		   rsp->gp_state,
>  		   (long)(rsp->jiffies_force_qs - jiffies),
>  		   (int)(jiffies & 0xffff));
>  	seq_printf(m, "nfqs=%lu/nfqsng=%lu(%lu) fqlh=%lu oqlen=%ld/%ld\n",
> -- 
> 1.8.5.6
> 
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] rcu: Fix up timeouts for forcing the quiescent state
  2015-09-07 15:21     ` Petr Mladek
@ 2015-09-08 20:02       ` Paul E. McKenney
  0 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2015-09-08 20:02 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Jiri Kosina, linux-kernel

On Mon, Sep 07, 2015 at 05:21:02PM +0200, Petr Mladek wrote:
> On Fri 2015-09-04 16:49:46, Paul E. McKenney wrote:
> > On Fri, Sep 04, 2015 at 02:11:30PM +0200, Petr Mladek wrote:
> > > The deadline to force the quiescent state (jiffies_force_qs) is currently
> > > updated only when the previous timeout passed. But the timeout used for
> > > wait_event() is always the entire original timeout. This is strange.
> > 
> > They tell me that kthreads aren't supposed to every catch signals,
> > hence the WARN_ON() in the early-exit case stray-signal case.
> 
> Yup, I have investigated this recently. All signals are really blocked
> for kthreads by default. There are few threads that use signals but
> they explicitly enable it by allow_signal().

Good!  ;-)

> > In the case where we were awakened with an explicit force-quiescent-state
> > request, we do the scan, and then wait the full time for the next scan.
> > So the point of the delay is to space out the scans, not to fit a
> > pre-determined schedule.
> > 
> > The reason we get awakened with an explicit force-quiescent-state
> > request is that a given CPU just got inundated with RCU callbacks
> > or that rcutorture wants to hammer this code path.
> > 
> > So I am not seeing this as anything in need of fixing.
> > 
> > Am I missing something subtle here?
> 
> There is the commit 88d6df612cc3c99f5 ("rcu: Prevent spurious-wakeup
> DoS attack on rcu_gp_kthread()"). It suggests that the spurious
> wakeups are possible.
> 
> I would consider this patch as a fix/clean up of this Dos attack fix.
> Huh, I forgot to mention it in the commit message.
> 
> To be honest, I personally do not know how to trigger the spurious
> wakeup in the current state of the code. I am trying to convert
> the kthread into the kthread worker API and there I got the spurious
> wakeups but this is another story.

You can do it via rcutorture, but that is not an in-production concern.

You can also do it by having all CPUs invoke call_rcu() in a tight loop.

> Thanks a lot for reviewing.

And thank you for your interest in the Linux-kernel RCU implementation!

							Thanx, Paul


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] rcu: Show the real fqs_state
  2015-09-08 19:59       ` Paul E. McKenney
@ 2015-09-09 12:39         ` Petr Mladek
  2015-09-09 19:12           ` Paul E. McKenney
  0 siblings, 1 reply; 11+ messages in thread
From: Petr Mladek @ 2015-09-09 12:39 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Jiri Kosina, linux-kernel

On Tue 2015-09-08 12:59:15, Paul E. McKenney wrote:
> On Mon, Sep 07, 2015 at 04:58:27PM +0200, Petr Mladek wrote:
> > On Fri 2015-09-04 16:24:22, Paul E. McKenney wrote:
> > > On Fri, Sep 04, 2015 at 02:11:29PM +0200, Petr Mladek wrote:
[...]

> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index 69ab7ce2cf7b..04234936d897 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -1949,16 +1949,15 @@ static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp)
> > >  /*
> > >   * Do one round of quiescent-state forcing.
> > >   */
> > > -static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
> > > +static void rcu_gp_fqs(struct rcu_state *rsp)
> > >  {
> > > -	int fqs_state = fqs_state_in;
> > >  	bool isidle = false;
> > >  	unsigned long maxj;
> > >  	struct rcu_node *rnp = rcu_get_root(rsp);
> > >  
> > >  	WRITE_ONCE(rsp->gp_activity, jiffies);
> > >  	rsp->n_force_qs++;
> > > -	if (fqs_state == RCU_SAVE_DYNTICK) {
> > > +	if (rsp->gp_state == RCU_SAVE_DYNTICK) {
> > 
> > This will never happen because rcu_gp_kthread() modifies rsp->gp_state
> > many times. The last value before calling rcu_gp_fqs() is
> > RCU_GP_DOING_FQS.
> > 
> > I think about passing this information via a separate bool.
> > 
> > [...]
> > 
> > > diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> > > index d5f58e717c8b..9faad70a8246 100644
> > > --- a/kernel/rcu/tree.h
> > > +++ b/kernel/rcu/tree.h
> > > @@ -417,12 +417,11 @@ struct rcu_data {
> > >  	struct rcu_state *rsp;
> > >  };
> > >  
> > > -/* Values for fqs_state field in struct rcu_state. */
> > > +/* Values for gp_state field in struct rcu_state. */
> > >  #define RCU_GP_IDLE		0	/* No grace period in progress. */
> > 
> > This value seems to be used instead of the new RCU_GP_WAIT_INIT.
> > 
> > >  #define RCU_GP_INIT		1	/* Grace period being
> > >  #initialized. */
> > 
> > This value is unused.
> > 
> > >  #define RCU_SAVE_DYNTICK	2	/* Need to scan dyntick
> > >  #state. */
> > 
> > This one is not longer preserved when merged with the other state.
> > 
> > >  #define RCU_FORCE_QS		3	/* Need to force quiescent
> > >  #state. */
> > 
> > The meaning of this one is strange. If I get it correctly,
> > it is set after the state was forced. But the comment suggests
> > that it is before.
> > 
> > By other words, these states seems to get obsoleted by
> > 
> > /* Values for rcu_state structure's gp_flags field. */
> > #define RCU_GP_WAIT_INIT 0	/* Initial state. */
> > #define RCU_GP_WAIT_GPS  1	/* Wait for grace-period start. */
> > #define RCU_GP_DONE_GPS  2	/* Wait done for grace-period start. */
> > #define RCU_GP_WAIT_FQS  3	/* Wait for force-quiescent-state time. */
> > #define RCU_GP_DOING_FQS 4	/* Wait done for force-quiescent-state time. */
> > #define RCU_GP_CLEANUP   5	/* Grace-period cleanup started. */
> > #define RCU_GP_CLEANED   6	/* Grace-period cleanup complete. */
> > 
> > 
> > Please, find below your commit updated with my ideas:
> > 
> > 	+ used bool save_dyntick instead of RCU_SAVE_DYNTICK
> > 	  and RCU_FORCE_QS states
> > 	+ rename RCU_GP_WAIT_INIT -> RCU_GP_IDLE
> > 	+ remove all the obsolete states
> > 
> > I am sorry if I handled "Signed-off-by" flags a wrong way. It is
> > basically your patch with few small updates from me. I am not sure
> > what is the right process in this case. Feel free to use Reviewed-by
> > instead of Signed-off-by with my name.
> > 
> > Well, I guess that this is not the final state ;-)
> 
> Good points, but perhaps an easier solution would be to have a
> "firsttime" argument to rcu_gp_fqs() that said whether or not this
> was the first call to rcu_gp_fqs() during the current grace period.
> If this is the first call, then take the "if" branch that passes
> dyntick_save_progress_counter() to force_qs_rnp(), otherwise take the
> other branch.

This seems to be the most elegant solution at the moment.

> But I am not generating the patch today, just flew across the Pacific
> yesterday.  ;-)

Please, find below the updated patch where I used the first_time
parameter.

Again, I am not sure about the commit person and Signed-off-by
tags. Many parts of the patch are yours. Feel free to update
them.


>From 7d7f2ee97a451f5cb055901a3bf22fec23a53bff Mon Sep 17 00:00:00 2001
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Date: Fri, 4 Sep 2015 16:24:22 -0700
Subject: [PATCH] rcu: Finish folding ->fqs_state into ->gp_state

Commit commit 4cdfc175c25c89ee ("rcu: Move quiescent-state forcing
into kthread") started the process of folding the old ->fqs_state
into ->gp_state, but did not complete it.  This situation does not
cause any malfunction, but can result in extremely confusing trace
output.  This commit completes this task of eliminating ->fqs_state
in favor of ->gp_state.

The old fqs_state had one side effect.  It was used to decide whether
to collect dyntick-idle snapshots.  For this purpose, we add a boolean
variable into the kthread.

Reported-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 kernel/rcu/tree.c       | 18 ++++++++----------
 kernel/rcu/tree.h       | 14 +++-----------
 kernel/rcu/tree_trace.c |  2 +-
 3 files changed, 12 insertions(+), 22 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 9f75f25cc5d9..5413d87a67c6 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -98,7 +98,7 @@ struct rcu_state sname##_state = { \
 	.level = { &sname##_state.node[0] }, \
 	.rda = &sname##_data, \
 	.call = cr, \
-	.fqs_state = RCU_GP_IDLE, \
+	.gp_state = RCU_GP_IDLE, \
 	.gpnum = 0UL - 300UL, \
 	.completed = 0UL - 300UL, \
 	.orphan_lock = __RAW_SPIN_LOCK_UNLOCKED(&sname##_state.orphan_lock), \
@@ -1927,16 +1927,15 @@ static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp)
 /*
  * Do one round of quiescent-state forcing.
  */
-static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
+static void rcu_gp_fqs(struct rcu_state *rsp, bool first_time)
 {
-	int fqs_state = fqs_state_in;
 	bool isidle = false;
 	unsigned long maxj;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
 	WRITE_ONCE(rsp->gp_activity, jiffies);
 	rsp->n_force_qs++;
-	if (fqs_state == RCU_SAVE_DYNTICK) {
+	if (first_time) {
 		/* Collect dyntick-idle snapshots. */
 		if (is_sysidle_rcu_state(rsp)) {
 			isidle = true;
@@ -1945,7 +1944,6 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
 		force_qs_rnp(rsp, dyntick_save_progress_counter,
 			     &isidle, &maxj);
 		rcu_sysidle_report_gp(rsp, isidle, maxj);
-		fqs_state = RCU_FORCE_QS;
 	} else {
 		/* Handle dyntick-idle and offline CPUs. */
 		isidle = true;
@@ -1959,7 +1957,6 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
 			   READ_ONCE(rsp->gp_flags) & ~RCU_GP_FLAG_FQS);
 		raw_spin_unlock_irq(&rnp->lock);
 	}
-	return fqs_state;
 }
 
 /*
@@ -2023,7 +2020,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
 	/* Declare grace period done. */
 	WRITE_ONCE(rsp->completed, rsp->gpnum);
 	trace_rcu_grace_period(rsp->name, rsp->completed, TPS("end"));
-	rsp->fqs_state = RCU_GP_IDLE;
+	rsp->gp_state = RCU_GP_IDLE;
 	rdp = this_cpu_ptr(rsp->rda);
 	/* Advance CBs to reduce false positives below. */
 	needgp = rcu_advance_cbs(rsp, rnp, rdp) || needgp;
@@ -2041,7 +2038,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
  */
 static int __noreturn rcu_gp_kthread(void *arg)
 {
-	int fqs_state;
+	bool first_gp_fqs;
 	int gf;
 	unsigned long j;
 	int ret;
@@ -2073,7 +2070,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
 		}
 
 		/* Handle quiescent-state forcing. */
-		fqs_state = RCU_SAVE_DYNTICK;
+		first_gp_fqs = true;
 		j = jiffies_till_first_fqs;
 		if (j > HZ) {
 			j = HZ;
@@ -2101,7 +2098,8 @@ static int __noreturn rcu_gp_kthread(void *arg)
 				trace_rcu_grace_period(rsp->name,
 						       READ_ONCE(rsp->gpnum),
 						       TPS("fqsstart"));
-				fqs_state = rcu_gp_fqs(rsp, fqs_state);
+				rcu_gp_fqs(rsp, first_gp_fqs);
+				first_gp_fqs = false;
 				trace_rcu_grace_period(rsp->name,
 						       READ_ONCE(rsp->gpnum),
 						       TPS("fqsend"));
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 2e991f8361e4..de370b611837 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -412,13 +412,6 @@ struct rcu_data {
 	struct rcu_state *rsp;
 };
 
-/* Values for fqs_state field in struct rcu_state. */
-#define RCU_GP_IDLE		0	/* No grace period in progress. */
-#define RCU_GP_INIT		1	/* Grace period being initialized. */
-#define RCU_SAVE_DYNTICK	2	/* Need to scan dyntick state. */
-#define RCU_FORCE_QS		3	/* Need to force quiescent state. */
-#define RCU_SIGNAL_INIT		RCU_SAVE_DYNTICK
-
 /* Values for nocb_defer_wakeup field in struct rcu_data. */
 #define RCU_NOGP_WAKE_NOT	0
 #define RCU_NOGP_WAKE		1
@@ -469,9 +462,8 @@ struct rcu_state {
 
 	/* The following fields are guarded by the root rcu_node's lock. */
 
-	u8	fqs_state ____cacheline_internodealigned_in_smp;
-						/* Force QS state. */
-	u8	boost;				/* Subject to priority boost. */
+	u8	boost ____cacheline_internodealigned_in_smp;
+						/* Subject to priority boost. */
 	unsigned long gpnum;			/* Current gp number. */
 	unsigned long completed;		/* # of last completed gp. */
 	struct task_struct *gp_kthread;		/* Task for grace periods. */
@@ -539,7 +531,7 @@ struct rcu_state {
 #define RCU_GP_FLAG_FQS  0x2	/* Need grace-period quiescent-state forcing. */
 
 /* Values for rcu_state structure's gp_flags field. */
-#define RCU_GP_WAIT_INIT 0	/* Initial state. */
+#define RCU_GP_IDLE	 0	/* Initial state and no GP in progress. */
 #define RCU_GP_WAIT_GPS  1	/* Wait for grace-period start. */
 #define RCU_GP_DONE_GPS  2	/* Wait done for grace-period start. */
 #define RCU_GP_WAIT_FQS  3	/* Wait for force-quiescent-state time. */
diff --git a/kernel/rcu/tree_trace.c b/kernel/rcu/tree_trace.c
index 6fc4c5ff3bb5..1d61f5ba4641 100644
--- a/kernel/rcu/tree_trace.c
+++ b/kernel/rcu/tree_trace.c
@@ -268,7 +268,7 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp)
 	gpnum = rsp->gpnum;
 	seq_printf(m, "c=%ld g=%ld s=%d jfq=%ld j=%x ",
 		   ulong2long(rsp->completed), ulong2long(gpnum),
-		   rsp->fqs_state,
+		   rsp->gp_state,
 		   (long)(rsp->jiffies_force_qs - jiffies),
 		   (int)(jiffies & 0xffff));
 	seq_printf(m, "nfqs=%lu/nfqsng=%lu(%lu) fqlh=%lu oqlen=%ld/%ld\n",
-- 
1.8.5.6


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] rcu: Show the real fqs_state
  2015-09-09 12:39         ` Petr Mladek
@ 2015-09-09 19:12           ` Paul E. McKenney
  0 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2015-09-09 19:12 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Jiri Kosina, linux-kernel

On Wed, Sep 09, 2015 at 02:39:50PM +0200, Petr Mladek wrote:
> On Tue 2015-09-08 12:59:15, Paul E. McKenney wrote:
> > On Mon, Sep 07, 2015 at 04:58:27PM +0200, Petr Mladek wrote:
> > > On Fri 2015-09-04 16:24:22, Paul E. McKenney wrote:
> > > > On Fri, Sep 04, 2015 at 02:11:29PM +0200, Petr Mladek wrote:
> [...]
> 
> > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > index 69ab7ce2cf7b..04234936d897 100644
> > > > --- a/kernel/rcu/tree.c
> > > > +++ b/kernel/rcu/tree.c
> > > > @@ -1949,16 +1949,15 @@ static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp)
> > > >  /*
> > > >   * Do one round of quiescent-state forcing.
> > > >   */
> > > > -static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
> > > > +static void rcu_gp_fqs(struct rcu_state *rsp)
> > > >  {
> > > > -	int fqs_state = fqs_state_in;
> > > >  	bool isidle = false;
> > > >  	unsigned long maxj;
> > > >  	struct rcu_node *rnp = rcu_get_root(rsp);
> > > >  
> > > >  	WRITE_ONCE(rsp->gp_activity, jiffies);
> > > >  	rsp->n_force_qs++;
> > > > -	if (fqs_state == RCU_SAVE_DYNTICK) {
> > > > +	if (rsp->gp_state == RCU_SAVE_DYNTICK) {
> > > 
> > > This will never happen because rcu_gp_kthread() modifies rsp->gp_state
> > > many times. The last value before calling rcu_gp_fqs() is
> > > RCU_GP_DOING_FQS.
> > > 
> > > I think about passing this information via a separate bool.
> > > 
> > > [...]
> > > 
> > > > diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> > > > index d5f58e717c8b..9faad70a8246 100644
> > > > --- a/kernel/rcu/tree.h
> > > > +++ b/kernel/rcu/tree.h
> > > > @@ -417,12 +417,11 @@ struct rcu_data {
> > > >  	struct rcu_state *rsp;
> > > >  };
> > > >  
> > > > -/* Values for fqs_state field in struct rcu_state. */
> > > > +/* Values for gp_state field in struct rcu_state. */
> > > >  #define RCU_GP_IDLE		0	/* No grace period in progress. */
> > > 
> > > This value seems to be used instead of the new RCU_GP_WAIT_INIT.
> > > 
> > > >  #define RCU_GP_INIT		1	/* Grace period being
> > > >  #initialized. */
> > > 
> > > This value is unused.
> > > 
> > > >  #define RCU_SAVE_DYNTICK	2	/* Need to scan dyntick
> > > >  #state. */
> > > 
> > > This one is not longer preserved when merged with the other state.
> > > 
> > > >  #define RCU_FORCE_QS		3	/* Need to force quiescent
> > > >  #state. */
> > > 
> > > The meaning of this one is strange. If I get it correctly,
> > > it is set after the state was forced. But the comment suggests
> > > that it is before.
> > > 
> > > By other words, these states seems to get obsoleted by
> > > 
> > > /* Values for rcu_state structure's gp_flags field. */
> > > #define RCU_GP_WAIT_INIT 0	/* Initial state. */
> > > #define RCU_GP_WAIT_GPS  1	/* Wait for grace-period start. */
> > > #define RCU_GP_DONE_GPS  2	/* Wait done for grace-period start. */
> > > #define RCU_GP_WAIT_FQS  3	/* Wait for force-quiescent-state time. */
> > > #define RCU_GP_DOING_FQS 4	/* Wait done for force-quiescent-state time. */
> > > #define RCU_GP_CLEANUP   5	/* Grace-period cleanup started. */
> > > #define RCU_GP_CLEANED   6	/* Grace-period cleanup complete. */
> > > 
> > > 
> > > Please, find below your commit updated with my ideas:
> > > 
> > > 	+ used bool save_dyntick instead of RCU_SAVE_DYNTICK
> > > 	  and RCU_FORCE_QS states
> > > 	+ rename RCU_GP_WAIT_INIT -> RCU_GP_IDLE
> > > 	+ remove all the obsolete states
> > > 
> > > I am sorry if I handled "Signed-off-by" flags a wrong way. It is
> > > basically your patch with few small updates from me. I am not sure
> > > what is the right process in this case. Feel free to use Reviewed-by
> > > instead of Signed-off-by with my name.
> > > 
> > > Well, I guess that this is not the final state ;-)
> > 
> > Good points, but perhaps an easier solution would be to have a
> > "firsttime" argument to rcu_gp_fqs() that said whether or not this
> > was the first call to rcu_gp_fqs() during the current grace period.
> > If this is the first call, then take the "if" branch that passes
> > dyntick_save_progress_counter() to force_qs_rnp(), otherwise take the
> > other branch.
> 
> This seems to be the most elegant solution at the moment.
> 
> > But I am not generating the patch today, just flew across the Pacific
> > yesterday.  ;-)
> 
> Please, find below the updated patch where I used the first_time
> parameter.
> 
> Again, I am not sure about the commit person and Signed-off-by
> tags. Many parts of the patch are yours. Feel free to update
> them.

Thank you and here you go!  Starting testing, and will let you
know how it goes.

							Thanx, Paul

------------------------------------------------------------------------

commit bbe84e224959475fd5be9e9c18aede3a6abe4ab9
Author: Petr Mladek <pmladek@suse.com>
Date:   Wed Sep 9 12:09:49 2015 -0700

    rcu: Finish folding ->fqs_state into ->gp_state
    
    Commit commit 4cdfc175c25c89ee ("rcu: Move quiescent-state forcing
    into kthread") started the process of folding the old ->fqs_state into
    ->gp_state, but did not complete it.  This situation does not cause
    any malfunction, but can result in extremely confusing trace output.
    This commit completes this task of eliminating ->fqs_state in favor
    of ->gp_state.
    
    The old ->fqs_state was also used to decide when to collect dyntick-idle
    snapshots.  For this purpose, we add a boolean variable into the kthread,
    which is set on the first call to rcu_gp_fqs() for a given grace period
    and clear otherwise.
    
    Signed-off-by: Petr Mladek <pmladek@suse.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 655414e9a1f4..595148ab53aa 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -97,7 +97,7 @@ struct rcu_state sname##_state = { \
 	.level = { &sname##_state.node[0] }, \
 	.rda = &sname##_data, \
 	.call = cr, \
-	.fqs_state = RCU_GP_IDLE, \
+	.gp_state = RCU_GP_IDLE, \
 	.gpnum = 0UL - 300UL, \
 	.completed = 0UL - 300UL, \
 	.orphan_lock = __RAW_SPIN_LOCK_UNLOCKED(&sname##_state.orphan_lock), \
@@ -1949,16 +1949,15 @@ static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp)
 /*
  * Do one round of quiescent-state forcing.
  */
-static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
+static void rcu_gp_fqs(struct rcu_state *rsp, bool first_time)
 {
-	int fqs_state = fqs_state_in;
 	bool isidle = false;
 	unsigned long maxj;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
 	WRITE_ONCE(rsp->gp_activity, jiffies);
 	rsp->n_force_qs++;
-	if (fqs_state == RCU_SAVE_DYNTICK) {
+	if (first_time) {
 		/* Collect dyntick-idle snapshots. */
 		if (is_sysidle_rcu_state(rsp)) {
 			isidle = true;
@@ -1967,7 +1966,6 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
 		force_qs_rnp(rsp, dyntick_save_progress_counter,
 			     &isidle, &maxj);
 		rcu_sysidle_report_gp(rsp, isidle, maxj);
-		fqs_state = RCU_FORCE_QS;
 	} else {
 		/* Handle dyntick-idle and offline CPUs. */
 		isidle = true;
@@ -1981,7 +1979,6 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
 			   READ_ONCE(rsp->gp_flags) & ~RCU_GP_FLAG_FQS);
 		raw_spin_unlock_irq(&rnp->lock);
 	}
-	return fqs_state;
 }
 
 /*
@@ -2045,7 +2042,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
 	/* Declare grace period done. */
 	WRITE_ONCE(rsp->completed, rsp->gpnum);
 	trace_rcu_grace_period(rsp->name, rsp->completed, TPS("end"));
-	rsp->fqs_state = RCU_GP_IDLE;
+	rsp->gp_state = RCU_GP_IDLE;
 	rdp = this_cpu_ptr(rsp->rda);
 	/* Advance CBs to reduce false positives below. */
 	needgp = rcu_advance_cbs(rsp, rnp, rdp) || needgp;
@@ -2063,7 +2060,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
  */
 static int __noreturn rcu_gp_kthread(void *arg)
 {
-	int fqs_state;
+	bool first_gp_fqs;
 	int gf;
 	unsigned long j;
 	int ret;
@@ -2095,7 +2092,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
 		}
 
 		/* Handle quiescent-state forcing. */
-		fqs_state = RCU_SAVE_DYNTICK;
+		first_gp_fqs = true;
 		j = jiffies_till_first_fqs;
 		if (j > HZ) {
 			j = HZ;
@@ -2123,7 +2120,8 @@ static int __noreturn rcu_gp_kthread(void *arg)
 				trace_rcu_grace_period(rsp->name,
 						       READ_ONCE(rsp->gpnum),
 						       TPS("fqsstart"));
-				fqs_state = rcu_gp_fqs(rsp, fqs_state);
+				rcu_gp_fqs(rsp, first_gp_fqs);
+				first_gp_fqs = false;
 				trace_rcu_grace_period(rsp->name,
 						       READ_ONCE(rsp->gpnum),
 						       TPS("fqsend"));
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index d5f58e717c8b..9d92d161683e 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -417,13 +417,6 @@ struct rcu_data {
 	struct rcu_state *rsp;
 };
 
-/* Values for fqs_state field in struct rcu_state. */
-#define RCU_GP_IDLE		0	/* No grace period in progress. */
-#define RCU_GP_INIT		1	/* Grace period being initialized. */
-#define RCU_SAVE_DYNTICK	2	/* Need to scan dyntick state. */
-#define RCU_FORCE_QS		3	/* Need to force quiescent state. */
-#define RCU_SIGNAL_INIT		RCU_SAVE_DYNTICK
-
 /* Values for nocb_defer_wakeup field in struct rcu_data. */
 #define RCU_NOGP_WAKE_NOT	0
 #define RCU_NOGP_WAKE		1
@@ -474,9 +467,8 @@ struct rcu_state {
 
 	/* The following fields are guarded by the root rcu_node's lock. */
 
-	u8	fqs_state ____cacheline_internodealigned_in_smp;
-						/* Force QS state. */
-	u8	boost;				/* Subject to priority boost. */
+	u8	boost ____cacheline_internodealigned_in_smp;
+						/* Subject to priority boost. */
 	unsigned long gpnum;			/* Current gp number. */
 	unsigned long completed;		/* # of last completed gp. */
 	struct task_struct *gp_kthread;		/* Task for grace periods. */
@@ -545,7 +537,7 @@ struct rcu_state {
 #define RCU_GP_FLAG_FQS  0x2	/* Need grace-period quiescent-state forcing. */
 
 /* Values for rcu_state structure's gp_flags field. */
-#define RCU_GP_WAIT_INIT 0	/* Initial state. */
+#define RCU_GP_IDLE	 0	/* Initial state and no GP in progress. */
 #define RCU_GP_WAIT_GPS  1	/* Wait for grace-period start. */
 #define RCU_GP_DONE_GPS  2	/* Wait done for grace-period start. */
 #define RCU_GP_WAIT_FQS  3	/* Wait for force-quiescent-state time. */
diff --git a/kernel/rcu/tree_trace.c b/kernel/rcu/tree_trace.c
index 999c3672f990..ef7093cc9b5c 100644
--- a/kernel/rcu/tree_trace.c
+++ b/kernel/rcu/tree_trace.c
@@ -268,7 +268,7 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp)
 	gpnum = rsp->gpnum;
 	seq_printf(m, "c=%ld g=%ld s=%d jfq=%ld j=%x ",
 		   ulong2long(rsp->completed), ulong2long(gpnum),
-		   rsp->fqs_state,
+		   rsp->gp_state,
 		   (long)(rsp->jiffies_force_qs - jiffies),
 		   (int)(jiffies & 0xffff));
 	seq_printf(m, "nfqs=%lu/nfqsng=%lu(%lu) fqlh=%lu oqlen=%ld/%ld\n",


^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2015-09-09 19:13 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-04 12:11 [PATCH 0/2] rcu: two small fixes for RCU kthreads Petr Mladek
2015-09-04 12:11 ` [PATCH 1/2] rcu: Show the real fqs_state Petr Mladek
2015-09-04 23:24   ` Paul E. McKenney
2015-09-07 14:58     ` Petr Mladek
2015-09-08 19:59       ` Paul E. McKenney
2015-09-09 12:39         ` Petr Mladek
2015-09-09 19:12           ` Paul E. McKenney
2015-09-04 12:11 ` [PATCH 2/2] rcu: Fix up timeouts for forcing the quiescent state Petr Mladek
2015-09-04 23:49   ` Paul E. McKenney
2015-09-07 15:21     ` Petr Mladek
2015-09-08 20:02       ` Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).