* [RFC v5 0/5] Add support length of each segment in the segcblist (breaks TREE04)
@ 2020-09-21  1:21 Joel Fernandes (Google)
  2020-09-21  1:21 ` [RFC v5 1/5] rcu/tree: Make rcu_do_batch count how many callbacks were executed Joel Fernandes (Google)
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: Joel Fernandes (Google) @ 2020-09-21  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	Ingo Molnar, Josh Triplett, Lai Jiangshan, Marco Elver,
	Mathieu Desnoyers, neeraj.iitr10, Paul E. McKenney, rcu,
	Steven Rostedt, Uladzislau Rezki (Sony)


NOTE: I marked this as RFC since TREE04 fails even though TREE03 passes. I
don't see any RCU errors in the counters; however, when the shutdown thread
tries to shut down the system, it hangs when trying to shut down the
rcu_barrier thread. My suspicion is that this is because I broke
rcu_barrier(), however I can't figure out how/where. I added a patch in this
series that could fix it, but it still does not alleviate TREE04's issues. I
also have a feeling this issue is related to nocb mode.

This is required for several identified use cases, one of them being tracing
how the segmented callback list changes. Tracing this has identified issues
in RCU code in the past.

From Paul:
Another use case is of course more accurately determining whether a given CPU's
large pile of callbacks can be best served by making grace periods go faster,
invoking callbacks more vigorously, or both.  It should also be possible to
simplify some of the callback handling a bit, given that some of the unnatural
acts are due to there having been no per-batch counts.

Revision history:
v5: Various changes, bug fixes. Discovery of rcu_barrier issue.

v4: Restructured rcu_do_batch() and segcblist merging to avoid issues.
    Fixed minor nit from Davidlohr.
v1->v3: minor nits.
(https://lore.kernel.org/lkml/20200719034210.2382053-1-joel@joelfernandes.org/)

Joel Fernandes (Google) (5):
rcu/tree: Make rcu_do_batch count how many callbacks were executed
rcu/segcblist: Add counters to segcblist datastructure
rcu: Fix rcu_barrier() breakage from earlier patch
rcu/trace: Add tracing for how segcb list changes
rcu/segcblist: Remove useless rcupdate.h include

include/linux/rcu_segcblist.h |   3 +
include/trace/events/rcu.h    |  25 +++++
kernel/rcu/rcu_segcblist.c    | 174 +++++++++++++++++++++++-----------
kernel/rcu/rcu_segcblist.h    |  33 +++++--
kernel/rcu/srcutree.c         |   1 -
kernel/rcu/tree.c             |  22 +++--
6 files changed, 192 insertions(+), 66 deletions(-)

--
2.28.0.681.g6f77f65b4e-goog



* [RFC v5 1/5] rcu/tree: Make rcu_do_batch count how many callbacks were executed
  2020-09-21  1:21 [RFC v5 0/5] Add support length of each segment in the segcblist (breaks TREE04) Joel Fernandes (Google)
@ 2020-09-21  1:21 ` Joel Fernandes (Google)
  2020-09-21  1:21 ` [RFC v5 2/5] rcu/segcblist: Add counters to segcblist datastructure Joel Fernandes (Google)
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Joel Fernandes (Google) @ 2020-09-21  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	Ingo Molnar, Josh Triplett, Lai Jiangshan, Marco Elver,
	Mathieu Desnoyers, neeraj.iitr10, Paul E. McKenney, rcu,
	Steven Rostedt, Uladzislau Rezki (Sony)

Currently, rcu_do_batch() depends on the unsegmented callback list's ->len
field to know how many CBs were executed. This field counts down from 0 as
CBs are dequeued. It is possible that not all CBs can be run because a limit
is reached, in which case the remaining unexecuted callbacks are requeued on
the CPU's segcblist.

The number of callbacks that were run is then the negation of rcl->len,
which has been counting down on every dequeue. This negative count is then
added to the per-CPU segmented callback list's ->len to correct its count.

Such a design works against future efforts to track the length of each
segment of the segmented callback list, because
rcu_segcblist_extract_done_cbs() will be populating the unsegmented callback
list's length field (rcl->len) during extraction.

Also, the design of counting down from 0 is confusing and error-prone IMHO.

This commit therefore explicitly counts how many callbacks were executed in
rcu_do_batch() itself, and uses that count to update the per-CPU segcb
list's ->len field, without relying on the negativity of rcl->len.
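
As a rough before/after sketch (heavily simplified from rcu_do_batch(),
with locking, tracing and the requeue path omitted):

	/* Before: rcl.len starts at 0 and counts down on each dequeue. */
	for (rhp = rcu_cblist_dequeue(&rcl); rhp; rhp = rcu_cblist_dequeue(&rcl))
		rhp->func(rhp);
	count = -rcl.len;	/* Negated length == number of CBs invoked. */
	rcu_segcblist_insert_count(&rdp->cblist, &rcl);

	/* After: count invocations explicitly, adjust ->len directly. */
	count = 0;
	for (rhp = rcu_cblist_dequeue(&rcl); rhp; rhp = rcu_cblist_dequeue(&rcl)) {
		count++;
		rhp->func(rhp);
	}
	rcu_segcblist_add_len(&rdp->cblist, -count);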

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/rcu_segcblist.c | 2 +-
 kernel/rcu/rcu_segcblist.h | 1 +
 kernel/rcu/tree.c          | 9 ++++-----
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index 2d2a6b6b9dfb..bb246d8c6ef1 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -95,7 +95,7 @@ static void rcu_segcblist_set_len(struct rcu_segcblist *rsclp, long v)
  * This increase is fully ordered with respect to the callers accesses
  * both before and after.
  */
-static void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v)
+void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v)
 {
 #ifdef CONFIG_RCU_NOCB_CPU
 	smp_mb__before_atomic(); /* Up to the caller! */
diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
index 5c293afc07b8..b90725f81d77 100644
--- a/kernel/rcu/rcu_segcblist.h
+++ b/kernel/rcu/rcu_segcblist.h
@@ -76,6 +76,7 @@ static inline bool rcu_segcblist_restempty(struct rcu_segcblist *rsclp, int seg)
 }
 
 void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp);
+void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v);
 void rcu_segcblist_init(struct rcu_segcblist *rsclp);
 void rcu_segcblist_disable(struct rcu_segcblist *rsclp);
 void rcu_segcblist_offload(struct rcu_segcblist *rsclp);
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 7623128d0020..50af465729f4 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2427,7 +2427,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
 			       rcu_segcblist_is_offloaded(&rdp->cblist);
 	struct rcu_head *rhp;
 	struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl);
-	long bl, count;
+	long bl, count = 0;
 	long pending, tlimit = 0;
 
 	/* If no callbacks are ready, just return. */
@@ -2472,6 +2472,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
 	for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) {
 		rcu_callback_t f;
 
+		count++;
 		debug_rcu_head_unqueue(rhp);
 
 		rcu_lock_acquire(&rcu_callback_map);
@@ -2485,9 +2486,8 @@ static void rcu_do_batch(struct rcu_data *rdp)
 
 		/*
 		 * Stop only if limit reached and CPU has something to do.
-		 * Note: The rcl structure counts down from zero.
 		 */
-		if (-rcl.len >= bl && !offloaded &&
+		if (count >= bl && !offloaded &&
 		    (need_resched() ||
 		     (!is_idle_task(current) && !rcu_is_callbacks_kthread())))
 			break;
@@ -2510,7 +2510,6 @@ static void rcu_do_batch(struct rcu_data *rdp)
 
 	local_irq_save(flags);
 	rcu_nocb_lock(rdp);
-	count = -rcl.len;
 	rdp->n_cbs_invoked += count;
 	trace_rcu_batch_end(rcu_state.name, count, !!rcl.head, need_resched(),
 			    is_idle_task(current), rcu_is_callbacks_kthread());
@@ -2518,7 +2517,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
 	/* Update counts and requeue any remaining callbacks. */
 	rcu_segcblist_insert_done_cbs(&rdp->cblist, &rcl);
 	smp_mb(); /* List handling before counting for rcu_barrier(). */
-	rcu_segcblist_insert_count(&rdp->cblist, &rcl);
+	rcu_segcblist_add_len(&rdp->cblist, -count);
 
 	/* Reinstate batch limit if we have worked down the excess. */
 	count = rcu_segcblist_n_cbs(&rdp->cblist);
-- 
2.28.0.681.g6f77f65b4e-goog



* [RFC v5 2/5] rcu/segcblist: Add counters to segcblist datastructure
  2020-09-21  1:21 [RFC v5 0/5] Add support length of each segment in the segcblist (breaks TREE04) Joel Fernandes (Google)
  2020-09-21  1:21 ` [RFC v5 1/5] rcu/tree: Make rcu_do_batch count how many callbacks were executed Joel Fernandes (Google)
@ 2020-09-21  1:21 ` Joel Fernandes (Google)
  2020-09-21  1:21 ` [RFC v5 3/5] rcu: Fix rcu_barrier() breakage from earlier patch Joel Fernandes (Google)
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Joel Fernandes (Google) @ 2020-09-21  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	Ingo Molnar, Josh Triplett, Lai Jiangshan, Marco Elver,
	Mathieu Desnoyers, neeraj.iitr10, Paul E. McKenney, rcu,
	Steven Rostedt, Uladzislau Rezki (Sony)

Add counting of the segment lengths of the segmented callback list.

This will be useful for a number of things, such as knowing how big the
ready-to-execute segment has gotten. The immediate benefit is the ability
to trace how the callbacks in the segmented callback list change.

This patch also removes hacks related to using the donecbs's ->len field as
a temporary variable to save the segmented callback list's length; this is
no longer needed.
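
For illustration only, the invariant these counters introduce is that, when
sampled with no extraction or insertion in flight, the per-segment counts
sum to the overall ->len. A hypothetical debug check (not part of this
series) could look like:

	static void rcu_segcblist_check_counts(struct rcu_segcblist *rsclp)
	{
		long sum = 0;
		int i;

		for (i = 0; i < RCU_CBLIST_NSEGS; i++)
			sum += rcu_segcblist_get_seglen(rsclp, i);
		WARN_ON_ONCE(sum != rcu_segcblist_n_cbs(rsclp));
	}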

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 include/linux/rcu_segcblist.h |   2 +
 kernel/rcu/rcu_segcblist.c    | 137 +++++++++++++++++++++-------------
 kernel/rcu/rcu_segcblist.h    |   4 -
 kernel/rcu/srcutree.c         |   1 -
 kernel/rcu/tree.c             |   2 -
 5 files changed, 88 insertions(+), 58 deletions(-)

diff --git a/include/linux/rcu_segcblist.h b/include/linux/rcu_segcblist.h
index b36afe7b22c9..d462ae5e340a 100644
--- a/include/linux/rcu_segcblist.h
+++ b/include/linux/rcu_segcblist.h
@@ -69,8 +69,10 @@ struct rcu_segcblist {
 	unsigned long gp_seq[RCU_CBLIST_NSEGS];
 #ifdef CONFIG_RCU_NOCB_CPU
 	atomic_long_t len;
+	atomic_long_t seglen[RCU_CBLIST_NSEGS];
 #else
 	long len;
+	long seglen[RCU_CBLIST_NSEGS];
 #endif
 	u8 enabled;
 	u8 offloaded;
diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index bb246d8c6ef1..314799426c2d 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -88,6 +88,62 @@ static void rcu_segcblist_set_len(struct rcu_segcblist *rsclp, long v)
 #endif
 }
 
+/* Get the length of a segment of the rcu_segcblist structure. */
+static long rcu_segcblist_get_seglen(struct rcu_segcblist *rsclp, int seg)
+{
+#ifdef CONFIG_RCU_NOCB_CPU
+	return atomic_long_read(&rsclp->seglen[seg]);
+#else
+	return READ_ONCE(rsclp->seglen[seg]);
+#endif
+}
+
+/* Set the length of a segment of the rcu_segcblist structure. */
+static void rcu_segcblist_set_seglen(struct rcu_segcblist *rsclp, int seg, long v)
+{
+#ifdef CONFIG_RCU_NOCB_CPU
+	atomic_long_set(&rsclp->seglen[seg], v);
+#else
+	WRITE_ONCE(rsclp->seglen[seg], v);
+#endif
+}
+
+/* Add to the length of a segment of the rcu_segcblist structure. */
+static void rcu_segcblist_add_seglen(struct rcu_segcblist *rsclp, int seg, long v)
+{
+#ifdef CONFIG_RCU_NOCB_CPU
+	smp_mb__before_atomic(); /* Up to the caller! */
+	atomic_long_add(v, &rsclp->seglen[seg]);
+	smp_mb__after_atomic(); /* Up to the caller! */
+#else
+	smp_mb(); /* Up to the caller! */
+	WRITE_ONCE(rsclp->seglen[seg], rsclp->seglen[seg] + v);
+	smp_mb(); /* Up to the caller! */
+#endif
+}
+
+/* Move the length count of the 'from' segment to the 'to' segment. */
+static void rcu_segcblist_move_seglen(struct rcu_segcblist *rsclp, int from, int to)
+{
+	long len;
+
+	if (from == to)
+		return;
+
+	len = rcu_segcblist_get_seglen(rsclp, from);
+	if (!len)
+		return;
+
+	rcu_segcblist_add_seglen(rsclp, to, len);
+	rcu_segcblist_set_seglen(rsclp, from, 0);
+}
+
+/* Increment segment's length. */
+static void rcu_segcblist_inc_seglen(struct rcu_segcblist *rsclp, int seg)
+{
+	rcu_segcblist_add_seglen(rsclp, seg, 1);
+}
+
 /*
  * Increase the numeric length of an rcu_segcblist structure by the
  * specified amount, which can be negative.  This can cause the ->len
@@ -119,26 +175,6 @@ void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp)
 	rcu_segcblist_add_len(rsclp, 1);
 }
 
-/*
- * Exchange the numeric length of the specified rcu_segcblist structure
- * with the specified value.  This can cause the ->len field to disagree
- * with the actual number of callbacks on the structure.  This exchange is
- * fully ordered with respect to the callers accesses both before and after.
- */
-static long rcu_segcblist_xchg_len(struct rcu_segcblist *rsclp, long v)
-{
-#ifdef CONFIG_RCU_NOCB_CPU
-	return atomic_long_xchg(&rsclp->len, v);
-#else
-	long ret = rsclp->len;
-
-	smp_mb(); /* Up to the caller! */
-	WRITE_ONCE(rsclp->len, v);
-	smp_mb(); /* Up to the caller! */
-	return ret;
-#endif
-}
-
 /*
  * Initialize an rcu_segcblist structure.
  */
@@ -149,8 +185,10 @@ void rcu_segcblist_init(struct rcu_segcblist *rsclp)
 	BUILD_BUG_ON(RCU_NEXT_TAIL + 1 != ARRAY_SIZE(rsclp->gp_seq));
 	BUILD_BUG_ON(ARRAY_SIZE(rsclp->tails) != ARRAY_SIZE(rsclp->gp_seq));
 	rsclp->head = NULL;
-	for (i = 0; i < RCU_CBLIST_NSEGS; i++)
+	for (i = 0; i < RCU_CBLIST_NSEGS; i++) {
 		rsclp->tails[i] = &rsclp->head;
+		rcu_segcblist_set_seglen(rsclp, i, 0);
+	}
 	rcu_segcblist_set_len(rsclp, 0);
 	rsclp->enabled = 1;
 }
@@ -245,6 +283,7 @@ void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp,
 			   struct rcu_head *rhp)
 {
 	rcu_segcblist_inc_len(rsclp);
+	rcu_segcblist_inc_seglen(rsclp, RCU_NEXT_TAIL);
 	smp_mb(); /* Ensure counts are updated before callback is enqueued. */
 	rhp->next = NULL;
 	WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rhp);
@@ -274,27 +313,13 @@ bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
 	for (i = RCU_NEXT_TAIL; i > RCU_DONE_TAIL; i--)
 		if (rsclp->tails[i] != rsclp->tails[i - 1])
 			break;
+	rcu_segcblist_inc_seglen(rsclp, i);
 	WRITE_ONCE(*rsclp->tails[i], rhp);
 	for (; i <= RCU_NEXT_TAIL; i++)
 		WRITE_ONCE(rsclp->tails[i], &rhp->next);
 	return true;
 }
 
-/*
- * Extract only the counts from the specified rcu_segcblist structure,
- * and place them in the specified rcu_cblist structure.  This function
- * supports both callback orphaning and invocation, hence the separation
- * of counts and callbacks.  (Callbacks ready for invocation must be
- * orphaned and adopted separately from pending callbacks, but counts
- * apply to all callbacks.  Locking must be used to make sure that
- * both orphaned-callbacks lists are consistent.)
- */
-void rcu_segcblist_extract_count(struct rcu_segcblist *rsclp,
-					       struct rcu_cblist *rclp)
-{
-	rclp->len = rcu_segcblist_xchg_len(rsclp, 0);
-}
-
 /*
  * Extract only those callbacks ready to be invoked from the specified
  * rcu_segcblist structure and place them in the specified rcu_cblist
@@ -307,6 +332,7 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
 
 	if (!rcu_segcblist_ready_cbs(rsclp))
 		return; /* Nothing to do. */
+	rclp->len = rcu_segcblist_get_seglen(rsclp, RCU_DONE_TAIL);
 	*rclp->tail = rsclp->head;
 	WRITE_ONCE(rsclp->head, *rsclp->tails[RCU_DONE_TAIL]);
 	WRITE_ONCE(*rsclp->tails[RCU_DONE_TAIL], NULL);
@@ -314,6 +340,8 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
 	for (i = RCU_CBLIST_NSEGS - 1; i >= RCU_DONE_TAIL; i--)
 		if (rsclp->tails[i] == rsclp->tails[RCU_DONE_TAIL])
 			WRITE_ONCE(rsclp->tails[i], &rsclp->head);
+	rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
+	rcu_segcblist_add_len(rsclp, -(rclp->len));
 }
 
 /*
@@ -330,22 +358,17 @@ void rcu_segcblist_extract_pend_cbs(struct rcu_segcblist *rsclp,
 
 	if (!rcu_segcblist_pend_cbs(rsclp))
 		return; /* Nothing to do. */
+	rclp->len = rcu_segcblist_get_seglen(rsclp, RCU_WAIT_TAIL) +
+		    rcu_segcblist_get_seglen(rsclp, RCU_NEXT_READY_TAIL) +
+		    rcu_segcblist_get_seglen(rsclp, RCU_NEXT_TAIL);
 	*rclp->tail = *rsclp->tails[RCU_DONE_TAIL];
 	rclp->tail = rsclp->tails[RCU_NEXT_TAIL];
 	WRITE_ONCE(*rsclp->tails[RCU_DONE_TAIL], NULL);
-	for (i = RCU_DONE_TAIL + 1; i < RCU_CBLIST_NSEGS; i++)
+	for (i = RCU_DONE_TAIL + 1; i < RCU_CBLIST_NSEGS; i++) {
 		WRITE_ONCE(rsclp->tails[i], rsclp->tails[RCU_DONE_TAIL]);
-}
-
-/*
- * Insert counts from the specified rcu_cblist structure in the
- * specified rcu_segcblist structure.
- */
-void rcu_segcblist_insert_count(struct rcu_segcblist *rsclp,
-				struct rcu_cblist *rclp)
-{
-	rcu_segcblist_add_len(rsclp, rclp->len);
-	rclp->len = 0;
+		rcu_segcblist_set_seglen(rsclp, i, 0);
+	}
+	rcu_segcblist_add_len(rsclp, -(rclp->len));
 }
 
 /*
@@ -359,6 +382,7 @@ void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
 
 	if (!rclp->head)
 		return; /* No callbacks to move. */
+	rcu_segcblist_add_seglen(rsclp, RCU_DONE_TAIL, rclp->len);
 	*rclp->tail = rsclp->head;
 	WRITE_ONCE(rsclp->head, rclp->head);
 	for (i = RCU_DONE_TAIL; i < RCU_CBLIST_NSEGS; i++)
@@ -368,6 +392,7 @@ void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
 			break;
 	rclp->head = NULL;
 	rclp->tail = &rclp->head;
+	rcu_segcblist_add_len(rsclp, rclp->len);
 }
 
 /*
@@ -379,8 +404,11 @@ void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
 {
 	if (!rclp->head)
 		return; /* Nothing to do. */
+
+	rcu_segcblist_add_seglen(rsclp, RCU_NEXT_TAIL, rclp->len);
 	WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rclp->head);
 	WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], rclp->tail);
+	rcu_segcblist_add_len(rsclp, rclp->len);
 }
 
 /*
@@ -403,6 +431,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
 		if (ULONG_CMP_LT(seq, rsclp->gp_seq[i]))
 			break;
 		WRITE_ONCE(rsclp->tails[RCU_DONE_TAIL], rsclp->tails[i]);
+		rcu_segcblist_move_seglen(rsclp, i, RCU_DONE_TAIL);
 	}
 
 	/* If no callbacks moved, nothing more need be done. */
@@ -423,6 +452,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
 		if (rsclp->tails[j] == rsclp->tails[RCU_NEXT_TAIL])
 			break;  /* No more callbacks. */
 		WRITE_ONCE(rsclp->tails[j], rsclp->tails[i]);
+		rcu_segcblist_move_seglen(rsclp, i, j);
 		rsclp->gp_seq[j] = rsclp->gp_seq[i];
 	}
 }
@@ -444,7 +474,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
  */
 bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq)
 {
-	int i;
+	int i, j;
 
 	WARN_ON_ONCE(!rcu_segcblist_is_enabled(rsclp));
 	if (rcu_segcblist_restempty(rsclp, RCU_DONE_TAIL))
@@ -487,6 +517,10 @@ bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq)
 	if (rcu_segcblist_restempty(rsclp, i) || ++i >= RCU_NEXT_TAIL)
 		return false;
 
+	/* Accounting: everything below i is about to get merged into i. */
+	for (j = i + 1; j <= RCU_NEXT_TAIL; j++)
+		rcu_segcblist_move_seglen(rsclp, j, i);
+
 	/*
 	 * Merge all later callbacks, including newly arrived callbacks,
 	 * into the segment located by the for-loop above.  Assign "seq"
@@ -516,11 +550,12 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
 
 	rcu_cblist_init(&donecbs);
 	rcu_cblist_init(&pendcbs);
-	rcu_segcblist_extract_count(src_rsclp, &donecbs);
+
 	rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
 	rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
-	rcu_segcblist_insert_count(dst_rsclp, &donecbs);
+
 	rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
 	rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
+
 	rcu_segcblist_init(src_rsclp);
 }
diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
index b90725f81d77..7f4e02bbb806 100644
--- a/kernel/rcu/rcu_segcblist.h
+++ b/kernel/rcu/rcu_segcblist.h
@@ -89,14 +89,10 @@ void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp,
 			   struct rcu_head *rhp);
 bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
 			   struct rcu_head *rhp);
-void rcu_segcblist_extract_count(struct rcu_segcblist *rsclp,
-				 struct rcu_cblist *rclp);
 void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
 				    struct rcu_cblist *rclp);
 void rcu_segcblist_extract_pend_cbs(struct rcu_segcblist *rsclp,
 				    struct rcu_cblist *rclp);
-void rcu_segcblist_insert_count(struct rcu_segcblist *rsclp,
-				struct rcu_cblist *rclp);
 void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
 				   struct rcu_cblist *rclp);
 void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 0f23d20d485a..62b83fb997ee 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1196,7 +1196,6 @@ static void srcu_invoke_callbacks(struct work_struct *work)
 	 * schedule another round of callback invocation.
 	 */
 	spin_lock_irq_rcu_node(sdp);
-	rcu_segcblist_insert_count(&sdp->srcu_cblist, &ready_cbs);
 	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
 				       rcu_seq_snap(&ssp->srcu_gp_seq));
 	sdp->srcu_cblist_invoking = false;
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 50af465729f4..ab4d4e9ff549 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2516,8 +2516,6 @@ static void rcu_do_batch(struct rcu_data *rdp)
 
 	/* Update counts and requeue any remaining callbacks. */
 	rcu_segcblist_insert_done_cbs(&rdp->cblist, &rcl);
-	smp_mb(); /* List handling before counting for rcu_barrier(). */
-	rcu_segcblist_add_len(&rdp->cblist, -count);
 
 	/* Reinstate batch limit if we have worked down the excess. */
 	count = rcu_segcblist_n_cbs(&rdp->cblist);
-- 
2.28.0.681.g6f77f65b4e-goog



* [RFC v5 3/5] rcu: Fix rcu_barrier() breakage from earlier patch
  2020-09-21  1:21 [RFC v5 0/5] Add support length of each segment in the segcblist (breaks TREE04) Joel Fernandes (Google)
  2020-09-21  1:21 ` [RFC v5 1/5] rcu/tree: Make rcu_do_batch count how many callbacks were executed Joel Fernandes (Google)
  2020-09-21  1:21 ` [RFC v5 2/5] rcu/segcblist: Add counters to segcblist datastructure Joel Fernandes (Google)
@ 2020-09-21  1:21 ` Joel Fernandes (Google)
  2020-09-21  1:21 ` [RFC v5 4/5] rcu/trace: Add tracing for how segcb list changes Joel Fernandes (Google)
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Joel Fernandes (Google) @ 2020-09-21  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	Ingo Molnar, Josh Triplett, Lai Jiangshan, Marco Elver,
	Mathieu Desnoyers, neeraj.iitr10, Paul E. McKenney, rcu,
	Steven Rostedt, Uladzislau Rezki (Sony)

This patch is split out because I'm not sure about it, and also because it
does not fix the issue I am seeing with TREE04. :-( It does, however, fix
the theoretical rcu_barrier issue I was considering.

The previous patch breaks rcu_barrier() (in theory at least), because
rcu_barrier() may skip queuing a callback and return too early. Fix it by
storing state indicating that callbacks are being invoked, so that the
callback list does not appear to be empty during invocation. This is a
terrible hack; however, it still does not fix TREE04.
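
Roughly, the theoretical race being addressed (pseudocode, with the
rcu_barrier() entrain logic paraphrased):

	CPU 0: rcu_do_batch()              CPU 1: rcu_barrier()
	---------------------              --------------------
	extract done CBs, ->len -= n
	                                   if (!rcu_segcblist_n_cbs(&rdp->cblist))
	                                           continue; /* skip this CPU */
	invoke the n extracted CBs         rcu_barrier() returns before the
	                                   n extracted callbacks have run

With ->invoking set across the invocation window, rcu_segcblist_n_cbs()
reports at least one callback, so rcu_barrier() still entrains a callback
behind the ones being invoked.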

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 include/linux/rcu_segcblist.h |  1 +
 kernel/rcu/rcu_segcblist.h    | 23 +++++++++++++++++++++--
 kernel/rcu/tree.c             |  4 ++++
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/include/linux/rcu_segcblist.h b/include/linux/rcu_segcblist.h
index d462ae5e340a..319a565f6ecb 100644
--- a/include/linux/rcu_segcblist.h
+++ b/include/linux/rcu_segcblist.h
@@ -76,6 +76,7 @@ struct rcu_segcblist {
 #endif
 	u8 enabled;
 	u8 offloaded;
+	u8 invoking;
 };
 
 #define RCU_SEGCBLIST_INITIALIZER(n) \
diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
index 7f4e02bbb806..78949e125364 100644
--- a/kernel/rcu/rcu_segcblist.h
+++ b/kernel/rcu/rcu_segcblist.h
@@ -40,14 +40,33 @@ static inline bool rcu_segcblist_empty(struct rcu_segcblist *rsclp)
 	return !READ_ONCE(rsclp->head);
 }
 
+static inline void rcu_segcblist_set_invoking(struct rcu_segcblist *rsclp)
+{
+	WRITE_ONCE(rsclp->invoking, 1);
+}
+
+static inline void rcu_segcblist_reset_invoking(struct rcu_segcblist *rsclp)
+{
+	WRITE_ONCE(rsclp->invoking, 0);
+}
+
 /* Return number of callbacks in segmented callback list. */
 static inline long rcu_segcblist_n_cbs(struct rcu_segcblist *rsclp)
 {
+	long ret;
 #ifdef CONFIG_RCU_NOCB_CPU
-	return atomic_long_read(&rsclp->len);
+	ret = atomic_long_read(&rsclp->len);
 #else
-	return READ_ONCE(rsclp->len);
+	ret = READ_ONCE(rsclp->len);
 #endif
+
+	/*
+	 * An invoking list should not appear empty. This is required
+	 * by rcu_barrier().
+	 */
+	if (ret)
+		return ret;
+	return READ_ONCE(rsclp->invoking) ? 1 : 0;
 }
 
 /*
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index ab4d4e9ff549..23fb6d7b6d4a 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2461,6 +2461,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
 	}
 	trace_rcu_batch_start(rcu_state.name,
 			      rcu_segcblist_n_cbs(&rdp->cblist), bl);
+	rcu_segcblist_set_invoking(&rdp->cblist);
 	rcu_segcblist_extract_done_cbs(&rdp->cblist, &rcl);
 	if (offloaded)
 		rdp->qlen_last_fqs_check = rcu_segcblist_n_cbs(&rdp->cblist);
@@ -2517,6 +2518,9 @@ static void rcu_do_batch(struct rcu_data *rdp)
 	/* Update counts and requeue any remaining callbacks. */
 	rcu_segcblist_insert_done_cbs(&rdp->cblist, &rcl);
 
+	smp_mb(); /* List handling before clearing ->invoking, for rcu_barrier(). */
+	rcu_segcblist_reset_invoking(&rdp->cblist);
+
 	/* Reinstate batch limit if we have worked down the excess. */
 	count = rcu_segcblist_n_cbs(&rdp->cblist);
 	if (rdp->blimit >= DEFAULT_MAX_RCU_BLIMIT && count <= qlowmark)
-- 
2.28.0.681.g6f77f65b4e-goog



* [RFC v5 4/5] rcu/trace: Add tracing for how segcb list changes
  2020-09-21  1:21 [RFC v5 0/5] Add support length of each segment in the segcblist (breaks TREE04) Joel Fernandes (Google)
                   ` (2 preceding siblings ...)
  2020-09-21  1:21 ` [RFC v5 3/5] rcu: Fix rcu_barrier() breakage from earlier patch Joel Fernandes (Google)
@ 2020-09-21  1:21 ` Joel Fernandes (Google)
  2020-09-21  1:21 ` [RFC v5 5/5] rcu/segcblist: Remove useless rcupdate.h include Joel Fernandes (Google)
  2020-09-21  2:06 ` [RFC v5 0/5] Add support length of each segment in the segcblist (breaks TREE04) Joel Fernandes
  5 siblings, 0 replies; 8+ messages in thread
From: Joel Fernandes (Google) @ 2020-09-21  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	Ingo Molnar, Josh Triplett, Lai Jiangshan, Marco Elver,
	Mathieu Desnoyers, neeraj.iitr10, Paul E. McKenney, rcu,
	Steven Rostedt, Uladzislau Rezki (Sony)

Track how the segcb list changes before/after acceleration, during queuing
and during dequeuing.

This has proved useful for discovering an optimization that avoids unwanted
GP requests when there are no accelerated callbacks. The overhead is
minimal, as each segment's length is now stored with the respective segment.
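
For example, with the TP_printk() format added below, a trace line would
look roughly like this (values made up for illustration):

	rcu_segcb: SegCbDequeued cb_count: (DONE=0, WAIT=2, NEXT_READY=5, NEXT=9) gp_seq: (DONE=4312, WAIT=4316, NEXT_READY=4320, NEXT=4320)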

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 include/trace/events/rcu.h | 25 +++++++++++++++++++++++++
 kernel/rcu/rcu_segcblist.c | 34 ++++++++++++++++++++++++++++++++++
 kernel/rcu/rcu_segcblist.h |  5 +++++
 kernel/rcu/tree.c          |  9 +++++++++
 4 files changed, 73 insertions(+)

diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index 155b5cb43cfd..7b84df3c95df 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -505,6 +505,31 @@ TRACE_EVENT_RCU(rcu_callback,
 		  __entry->qlen)
 );
 
+TRACE_EVENT_RCU(rcu_segcb,
+
+		TP_PROTO(const char *ctx, int *cb_count, unsigned long *gp_seq),
+
+		TP_ARGS(ctx, cb_count, gp_seq),
+
+		TP_STRUCT__entry(
+			__field(const char *, ctx)
+			__array(int, cb_count, 4)
+			__array(unsigned long, gp_seq, 4)
+		),
+
+		TP_fast_assign(
+			__entry->ctx = ctx;
+			memcpy(__entry->cb_count, cb_count, 4 * sizeof(int));
+			memcpy(__entry->gp_seq, gp_seq, 4 * sizeof(unsigned long));
+		),
+
+		TP_printk("%s cb_count: (DONE=%d, WAIT=%d, NEXT_READY=%d, NEXT=%d) "
+			  "gp_seq: (DONE=%lu, WAIT=%lu, NEXT_READY=%lu, NEXT=%lu)", __entry->ctx,
+			  __entry->cb_count[0], __entry->cb_count[1], __entry->cb_count[2], __entry->cb_count[3],
+			  __entry->gp_seq[0], __entry->gp_seq[1], __entry->gp_seq[2], __entry->gp_seq[3])
+
+);
+
 /*
  * Tracepoint for the registration of a single RCU callback of the special
  * kvfree() form.  The first argument is the RCU type, the second argument
diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index 314799426c2d..72b284f965aa 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -13,6 +13,7 @@
 #include <linux/rcupdate.h>
 
 #include "rcu_segcblist.h"
+#include "rcu.h"
 
 /* Initialize simple callback list. */
 void rcu_cblist_init(struct rcu_cblist *rclp)
@@ -344,6 +345,39 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
 	rcu_segcblist_add_len(rsclp, -(rclp->len));
 }
 
+/*
+ * Return how many CBs are in each segment, along with their gp_seq values.
+ *
+ * This function is O(N) where N is the number of callbacks. Only used from
+ * tracing code which is usually disabled in production.
+ */
+#ifdef CONFIG_RCU_TRACE
+static void rcu_segcblist_countseq(struct rcu_segcblist *rsclp,
+			 int cbcount[RCU_CBLIST_NSEGS],
+			 unsigned long gpseq[RCU_CBLIST_NSEGS])
+{
+	int i;
+
+	for (i = 0; i < RCU_CBLIST_NSEGS; i++)
+		cbcount[i] = 0;
+
+	for (i = 0; i < RCU_CBLIST_NSEGS; i++) {
+		cbcount[i] = rcu_segcblist_get_seglen(rsclp, i);
+		gpseq[i] = rsclp->gp_seq[i];
+	}
+}
+
+void trace_rcu_segcb_list(struct rcu_segcblist *rsclp, char *context)
+{
+	int cbs[RCU_CBLIST_NSEGS];
+	unsigned long gps[RCU_CBLIST_NSEGS];
+
+	rcu_segcblist_countseq(rsclp, cbs, gps);
+
+	trace_rcu_segcb(context, cbs, gps);
+}
+#endif
+
 /*
  * Extract only those callbacks still pending (not yet ready to be
  * invoked) from the specified rcu_segcblist structure and place them in
diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
index 78949e125364..ca2a403591e4 100644
--- a/kernel/rcu/rcu_segcblist.h
+++ b/kernel/rcu/rcu_segcblist.h
@@ -120,3 +120,8 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq);
 bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq);
 void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
 			 struct rcu_segcblist *src_rsclp);
+#ifdef CONFIG_RCU_TRACE
+void trace_rcu_segcb_list(struct rcu_segcblist *rsclp, char *context);
+#else
+#define trace_rcu_segcb_list(...)
+#endif
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 23fb6d7b6d4a..3afb0e4daca6 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1492,6 +1492,8 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp)
 	if (!rcu_segcblist_pend_cbs(&rdp->cblist))
 		return false;
 
+	trace_rcu_segcb_list(&rdp->cblist, "SegCbPreAcc");
+
 	/*
 	 * Callbacks are often registered with incomplete grace-period
 	 * information.  Something about the fact that getting exact
@@ -1512,6 +1514,8 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp)
 	else
 		trace_rcu_grace_period(rcu_state.name, gp_seq_req, TPS("AccReadyCB"));
 
+	trace_rcu_segcb_list(&rdp->cblist, "SegCbPostAcc");
+
 	return ret;
 }
 
@@ -2470,6 +2474,9 @@ static void rcu_do_batch(struct rcu_data *rdp)
 	/* Invoke callbacks. */
 	tick_dep_set_task(current, TICK_DEP_BIT_RCU);
 	rhp = rcu_cblist_dequeue(&rcl);
+
+	trace_rcu_segcb_list(&rdp->cblist, "SegCbDequeued");
+
 	for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) {
 		rcu_callback_t f;
 
@@ -2984,6 +2991,8 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
 		trace_rcu_callback(rcu_state.name, head,
 				   rcu_segcblist_n_cbs(&rdp->cblist));
 
+	trace_rcu_segcb_list(&rdp->cblist, "SegCBQueued");
+
 	/* Go handle any RCU core processing required. */
 	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
 	    unlikely(rcu_segcblist_is_offloaded(&rdp->cblist))) {
-- 
2.28.0.681.g6f77f65b4e-goog



* [RFC v5 5/5] rcu/segcblist: Remove useless rcupdate.h include
  2020-09-21  1:21 [RFC v5 0/5] Add support length of each segment in the segcblist (breaks TREE04) Joel Fernandes (Google)
                   ` (3 preceding siblings ...)
  2020-09-21  1:21 ` [RFC v5 4/5] rcu/trace: Add tracing for how segcb list changes Joel Fernandes (Google)
@ 2020-09-21  1:21 ` Joel Fernandes (Google)
  2020-09-21  2:06 ` [RFC v5 0/5] Add support length of each segment in the segcblist (breaks TREE04) Joel Fernandes
  5 siblings, 0 replies; 8+ messages in thread
From: Joel Fernandes (Google) @ 2020-09-21  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	Ingo Molnar, Josh Triplett, Lai Jiangshan, Marco Elver,
	Mathieu Desnoyers, neeraj.iitr10, Paul E. McKenney, rcu,
	Steven Rostedt, Uladzislau Rezki (Sony)

The rcu_segcblist code no longer needs rcupdate.h, so remove the include.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/rcu_segcblist.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index 72b284f965aa..13f8f181521d 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -10,7 +10,6 @@
 #include <linux/types.h>
 #include <linux/kernel.h>
 #include <linux/interrupt.h>
-#include <linux/rcupdate.h>
 
 #include "rcu_segcblist.h"
 #include "rcu.h"
-- 
2.28.0.681.g6f77f65b4e-goog



* Re: [RFC v5 0/5] Add support length of each segment in the segcblist (breaks TREE04)
  2020-09-21  1:21 [RFC v5 0/5] Add support length of each segment in the segcblist (breaks TREE04) Joel Fernandes (Google)
                   ` (4 preceding siblings ...)
  2020-09-21  1:21 ` [RFC v5 5/5] rcu/segcblist: Remove useless rcupdate.h include Joel Fernandes (Google)
@ 2020-09-21  2:06 ` Joel Fernandes
  2020-09-21 15:34   ` Paul E. McKenney
  5 siblings, 1 reply; 8+ messages in thread
From: Joel Fernandes @ 2020-09-21  2:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Josh Triplett, Lai Jiangshan, Marco Elver,
	Mathieu Desnoyers, neeraj.iitr10, Paul E. McKenney, rcu,
	Steven Rostedt, Uladzislau Rezki (Sony)

On Sun, Sep 20, 2020 at 09:21:47PM -0400, Joel Fernandes (Google) wrote:
> 
> NOTE: I marked this as RFC since TREE04 fails even though TREE03 passes. I
> don't see any RCU errors in the counters; however, when the shutdown thread
> tries to shut down the system, it hangs when trying to shut down the
> rcu_barrier thread.

It looks like if I restore the logic of setting the segcblist's ->len field
from the callers of extract_done_cbs(), then the TREE04 issues go away. I am
inclined to do that, considering that this series, for what it is trying to
do, does not really need to optimize that logic (it can be done later).

With the below diff on top of this series, TREE04 passes. I will await any
other review comments before posting another series; meanwhile, I will pass
it through the grinders.

---8<---
diff --git a/include/linux/rcu_segcblist.h b/include/linux/rcu_segcblist.h
index 319a565f6ecb..d462ae5e340a 100644
--- a/include/linux/rcu_segcblist.h
+++ b/include/linux/rcu_segcblist.h
@@ -76,7 +76,6 @@ struct rcu_segcblist {
 #endif
 	u8 enabled;
 	u8 offloaded;
-	u8 invoking;
 };
 
 #define RCU_SEGCBLIST_INITIALIZER(n) \
diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index ec9a609a461b..3c82c016feb1 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -388,7 +388,6 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
 		if (rsclp->tails[i] == rsclp->tails[RCU_DONE_TAIL])
 			WRITE_ONCE(rsclp->tails[i], &rsclp->head);
 	rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
-	rcu_segcblist_add_len(rsclp, -(rclp->len));
 }
 
 /*
@@ -452,7 +451,17 @@ void rcu_segcblist_extract_pend_cbs(struct rcu_segcblist *rsclp,
 		WRITE_ONCE(rsclp->tails[i], rsclp->tails[RCU_DONE_TAIL]);
 		rcu_segcblist_set_seglen(rsclp, i, 0);
 	}
-	rcu_segcblist_add_len(rsclp, -(rclp->len));
+}
+
+/*
+ * Insert counts from the specified rcu_cblist structure in the
+ * specified rcu_segcblist structure.
+ */
+void rcu_segcblist_insert_count(struct rcu_segcblist *rsclp,
+				struct rcu_cblist *rclp)
+{
+	rcu_segcblist_add_len(rsclp, rclp->len);
+	rclp->len = 0;
 }
 
 /*
@@ -476,7 +485,6 @@ void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
 			break;
 	rclp->head = NULL;
 	rclp->tail = &rclp->head;
-	rcu_segcblist_add_len(rsclp, rclp->len);
 }
 
 /*
@@ -492,7 +500,6 @@ void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
 	rcu_segcblist_add_seglen(rsclp, RCU_NEXT_TAIL, rclp->len);
 	WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rclp->head);
 	WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], rclp->tail);
-	rcu_segcblist_add_len(rsclp, rclp->len);
 }
 
 /*
@@ -637,9 +644,12 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
 	rcu_cblist_init(&donecbs);
 	rcu_cblist_init(&pendcbs);
 
+	rcu_segcblist_set_len(src_rsclp, 0);
 	rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
 	rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
 
 	rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
 	rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
+	rcu_segcblist_insert_count(dst_rsclp, &donecbs);
+	rcu_segcblist_insert_count(dst_rsclp, &pendcbs);
 
diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
index c2dc08a9b020..15c10d30f88c 100644
--- a/kernel/rcu/rcu_segcblist.h
+++ b/kernel/rcu/rcu_segcblist.h
@@ -40,33 +40,14 @@ static inline bool rcu_segcblist_empty(struct rcu_segcblist *rsclp)
 	return !READ_ONCE(rsclp->head);
 }
 
-static inline void rcu_segcblist_set_invoking(struct rcu_segcblist *rsclp)
-{
-	WRITE_ONCE(rsclp->invoking, 1);
-}
-
-static inline void rcu_segcblist_reset_invoking(struct rcu_segcblist *rsclp)
-{
-	WRITE_ONCE(rsclp->invoking, 0);
-}
-
 /* Return number of callbacks in segmented callback list. */
 static inline long rcu_segcblist_n_cbs(struct rcu_segcblist *rsclp)
 {
-	long ret;
 #ifdef CONFIG_RCU_NOCB_CPU
-	ret = atomic_long_read(&rsclp->len);
+	return atomic_long_read(&rsclp->len);
 #else
-	ret = READ_ONCE(rsclp->len);
+	return READ_ONCE(rsclp->len);
 #endif
-
-	/*
-	 * An invoking list should not appear empty. This is required
-	 * by rcu_barrier().
-	 */
-	if (ret)
-		return ret;
-	return (READ_ONCE(rsclp->invoking) ? 1 : 0);
 }
 
 /*
@@ -112,6 +93,8 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
 				    struct rcu_cblist *rclp);
 void rcu_segcblist_extract_pend_cbs(struct rcu_segcblist *rsclp,
 				    struct rcu_cblist *rclp);
+void rcu_segcblist_insert_count(struct rcu_segcblist *rsclp,
+				struct rcu_cblist *rclp);
 void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
 				   struct rcu_cblist *rclp);
 void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 62b83fb997ee..0f23d20d485a 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1196,6 +1196,7 @@ static void srcu_invoke_callbacks(struct work_struct *work)
 	 * schedule another round of callback invocation.
 	 */
 	spin_lock_irq_rcu_node(sdp);
+	rcu_segcblist_insert_count(&sdp->srcu_cblist, &ready_cbs);
 	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
 				       rcu_seq_snap(&ssp->srcu_gp_seq));
 	sdp->srcu_cblist_invoking = false;
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index b4dc18e4661c..64dff7b17e74 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2465,7 +2465,6 @@ static void rcu_do_batch(struct rcu_data *rdp)
 	}
 	trace_rcu_batch_start(rcu_state.name,
 			      rcu_segcblist_n_cbs(&rdp->cblist), bl);
-	rcu_segcblist_set_invoking(&rdp->cblist);
 	rcu_segcblist_extract_done_cbs(&rdp->cblist, &rcl);
 	if (offloaded)
 		rdp->qlen_last_fqs_check = rcu_segcblist_n_cbs(&rdp->cblist);
@@ -2524,9 +2523,8 @@ static void rcu_do_batch(struct rcu_data *rdp)
 
 	/* Update counts and requeue any remaining callbacks. */
 	rcu_segcblist_insert_done_cbs(&rdp->cblist, &rcl);
-
-	smp_mb();
-	rcu_segcblist_reset_invoking(&rdp->cblist);
+	smp_mb(); /* List handling before counting for rcu_barrier(). */
+	rcu_segcblist_add_len(&rdp->cblist, -count);
 
 	/* Reinstate batch limit if we have worked down the excess. */
 	count = rcu_segcblist_n_cbs(&rdp->cblist);


* Re: [RFC v5 0/5] Add support length of each segment in the segcblist (breaks TREE04)
  2020-09-21  2:06 ` [RFC v5 0/5] Add support length of each segment in the segcblist (breaks TREE04) Joel Fernandes
@ 2020-09-21 15:34   ` Paul E. McKenney
  0 siblings, 0 replies; 8+ messages in thread
From: Paul E. McKenney @ 2020-09-21 15:34 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: linux-kernel, Ingo Molnar, Josh Triplett, Lai Jiangshan,
	Marco Elver, Mathieu Desnoyers, neeraj.iitr10, rcu,
	Steven Rostedt, Uladzislau Rezki (Sony)

On Sun, Sep 20, 2020 at 10:06:25PM -0400, Joel Fernandes wrote:
> On Sun, Sep 20, 2020 at 09:21:47PM -0400, Joel Fernandes (Google) wrote:
> > 
> > NOTE: I marked this as RFC since TREE04 fails even though TREE03 passes. I
> > don't see any RCU errors in the counters; however, when the shutdown thread
> > tries to shut down the system, it hangs when trying to shut down the
> > rcu_barrier thread.
> 
> It looks like if I restore the logic of setting the segcblist's ->len field
> from the callers of extract_done_cbs(), then the TREE04 issues go away. I am
> inclined to do that, considering that this series, for what it is trying to
> do, does not really need to optimize that logic (it can be done later).
> 
> With the below diff on top of this series, TREE04 passes. I will await any
> other review comments before posting another series; meanwhile, I will pass
> it through the grinders.

As discussed on IRC, I will await v6.

							Thanx, Paul

