rcu.vger.kernel.org archive mirror
* [PATCH v4 -rcu 0/4] Maintain the length of each segment in the segcblist
@ 2020-08-25  2:48 Joel Fernandes (Google)
  2020-08-25  2:48 ` [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge Joel Fernandes (Google)
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Joel Fernandes (Google) @ 2020-08-25  2:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	boqun.feng, dave, Ingo Molnar, Josh Triplett, Lai Jiangshan,
	Madhuparna Bhowmik, Mathieu Desnoyers, neeraj.iitr10,
	Paul E. McKenney, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

This is required for several identified use cases, one of them being tracing how
the segmented callback list changes. Such tracing has identified issues in RCU
code in the past. The trace patch is the last one in this series.

Passes 30 minutes of TREE01, TREE03, TREE07.  Testing of other TREE configs is
in progress.

Patches are based on -rcu commit:
4f43de2a419a ("Merge branch 'lkmm-dev.2020.08.24a' into HEAD")

Revision history:
v4: Restructured rcu_do_batch() and segcblist merging to avoid issues.
    Fixed minor nit from Davidlohr.
v1->v3: minor nits.
(https://lore.kernel.org/lkml/20200719034210.2382053-1-joel@joelfernandes.org/)

Joel Fernandes (Google) (4):
rcu/segcblist: Do not depend on rcl->len to store the segcb len during
merge
rcu/tree: Make rcu_do_batch count how many callbacks were executed
rcu/segcblist: Add counters to segcblist datastructure
rcu/trace: Add tracing for how segcb list changes

include/linux/rcu_segcblist.h |   2 +
include/trace/events/rcu.h    |  25 +++++++
kernel/rcu/rcu_segcblist.c    | 119 ++++++++++++++++++++++++++++++++--
kernel/rcu/rcu_segcblist.h    |   8 +++
kernel/rcu/tree.c             |  32 +++++++--
5 files changed, 175 insertions(+), 11 deletions(-)

--
2.28.0.297.g1956fa8f8d-goog


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge
  2020-08-25  2:48 [PATCH v4 -rcu 0/4] Maintain the length of each segment in the segcblist Joel Fernandes (Google)
@ 2020-08-25  2:48 ` Joel Fernandes (Google)
  2020-08-25 20:08   ` Paul E. McKenney
  2020-08-25  2:48 ` [PATCH v4 -rcu 2/4] rcu/tree: Make rcu_do_batch count how many callbacks were executed Joel Fernandes (Google)
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Joel Fernandes (Google) @ 2020-08-25  2:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	boqun.feng, dave, Ingo Molnar, Josh Triplett, Lai Jiangshan,
	Madhuparna Bhowmik, Mathieu Desnoyers, neeraj.iitr10,
	Paul E. McKenney, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

The donecbs rcu_cblist's ->len field is used to store the source segmented
callback list's total length. This ->len field is then added to the
destination segcblist's ->len.

However, this presents a problem for the per-segment length counting added in
a later patch. That patch sets rcl->len as segments of callbacks are moved
between the source and destination lists, which is incompatible with also
using the donecbs ->len field as a temporary total.

This commit therefore avoids depending on the ->len field in this way. IMHO,
this is also less error-prone and more accurate: the donecbs ->len field
should be the length of the done segment, not a temporary variable.
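The length bookkeeping described above can be modeled in a few lines. The
following is an illustrative user-space sketch (not kernel code; the struct
fields and helper names here are simplified stand-ins for the real
rcu_segcblist API) showing how the merge reads the source list's total
length out explicitly instead of smuggling it through the done-callbacks
list's ->len field:

```c
#include <assert.h>

/* Simplified model: only the length fields of the real structures. */
struct segcblist {
	long len;       /* total callbacks, as seen by rcu_barrier() */
	long done_len;  /* model of the done segment's contents */
	long pend_len;  /* model of the pending segments' contents */
};

/* Model of rcu_segcblist_xchg_len(): return old ->len, store v. */
static long segcblist_xchg_len(struct segcblist *sclp, long v)
{
	long old = sclp->len;

	sclp->len = v;
	return old;
}

/* Merge src into dst; src ends up empty. */
static void segcblist_merge(struct segcblist *dst, struct segcblist *src)
{
	/* Read the source's total length explicitly... */
	long src_len = segcblist_xchg_len(src, 0);

	/* ...move the callbacks (extract/insert of the real lists is
	 * elided; only the counts are modeled here)... */
	dst->done_len += src->done_len;
	dst->pend_len += src->pend_len;
	src->done_len = src->pend_len = 0;

	/* ...then fix up the destination's total in one step. */
	dst->len += src_len;
}
```

With this shape, the done-callbacks list's length is free to mean "number of
done callbacks", which is what the per-segment counting patch later in the
series needs.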

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/rcu_segcblist.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index 2d2a6b6b9dfb..b70d4154433c 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -513,14 +513,18 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
 {
 	struct rcu_cblist donecbs;
 	struct rcu_cblist pendcbs;
+	long src_len;
 
 	rcu_cblist_init(&donecbs);
 	rcu_cblist_init(&pendcbs);
-	rcu_segcblist_extract_count(src_rsclp, &donecbs);
+
+	src_len = rcu_segcblist_xchg_len(src_rsclp, 0);
 	rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
 	rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
-	rcu_segcblist_insert_count(dst_rsclp, &donecbs);
+
+	rcu_segcblist_add_len(dst_rsclp, src_len);
 	rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
 	rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
+
 	rcu_segcblist_init(src_rsclp);
 }
-- 
2.28.0.297.g1956fa8f8d-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v4 -rcu 2/4] rcu/tree: Make rcu_do_batch count how many callbacks were executed
  2020-08-25  2:48 [PATCH v4 -rcu 0/4] Maintain the length of each segment in the segcblist Joel Fernandes (Google)
  2020-08-25  2:48 ` [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge Joel Fernandes (Google)
@ 2020-08-25  2:48 ` Joel Fernandes (Google)
  2020-08-25 20:13   ` Paul E. McKenney
  2020-08-25  2:48 ` [PATCH v4 -rcu 3/4] rcu/segcblist: Add counters to segcblist datastructure Joel Fernandes (Google)
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Joel Fernandes (Google) @ 2020-08-25  2:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	boqun.feng, dave, Ingo Molnar, Josh Triplett, Lai Jiangshan,
	Madhuparna Bhowmik, Mathieu Desnoyers, neeraj.iitr10,
	Paul E. McKenney, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

Currently, rcu_do_batch() depends on the unsegmented callback list's ->len
field to know how many CBs were executed. This field counts down from 0 as CBs
are dequeued. It is possible that not all CBs can be run because a limit is
reached, in which case the remaining unexecuted callbacks are requeued onto
the CPU's segcblist.

The number of callbacks that were run (and thus not requeued) is then the
negated value of rcl->len, which has been counting down on every dequeue. This
negative count is added to the per-CPU segmented callback list's ->len to
correct its count.

Such a design works against future efforts to track the length of each segment
of the segmented callback list, because rcu_segcblist_extract_done_cbs() would
have to store the length of the extracted callbacks in rcl->len to make
rcu_segcblist_merge() work.

Also, the design of counting down from 0 is confusing and error-prone IMHO.

This commit therefore explicitly counts how many callbacks were executed in
rcu_do_batch() itself, and uses that count to update the per-CPU segcblist's
->len field, without relying on the negativity of rcl->len.
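The difference between the two counting schemes can be sketched as follows.
This is an illustrative user-space model (not the kernel function; the names
and the batch limit here are hypothetical) of the "count up explicitly"
approach this patch adopts:

```c
#include <assert.h>

#define BATCH_LIMIT 3  /* stand-in for the "bl" batch limit */

/*
 * Run up to BATCH_LIMIT of the 'ready' callbacks and return how many
 * actually ran.  'total_len' models the per-CPU segcblist's ->len.
 */
static long do_batch(long *ready, long *total_len)
{
	long count = 0;  /* counts up from 0: no negated list length */

	while (*ready > 0) {
		(*ready)--;   /* model: dequeue and invoke one callback */
		count++;
		if (count >= BATCH_LIMIT)
			break;  /* remaining callbacks are "requeued" */
	}

	/* Correct the total with the explicit count. */
	*total_len -= count;
	return count;
}
```

The old scheme would instead have decremented a scratch list's length from 0
on each dequeue and recovered count as its negation, which is the confusing
part this patch removes.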

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/rcu_segcblist.c | 2 +-
 kernel/rcu/rcu_segcblist.h | 1 +
 kernel/rcu/tree.c          | 9 ++++-----
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index b70d4154433c..076337ae2e50 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -95,7 +95,7 @@ static void rcu_segcblist_set_len(struct rcu_segcblist *rsclp, long v)
  * This increase is fully ordered with respect to the callers accesses
  * both before and after.
  */
-static void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v)
+void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v)
 {
 #ifdef CONFIG_RCU_NOCB_CPU
 	smp_mb__before_atomic(); /* Up to the caller! */
diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
index 5c293afc07b8..b90725f81d77 100644
--- a/kernel/rcu/rcu_segcblist.h
+++ b/kernel/rcu/rcu_segcblist.h
@@ -76,6 +76,7 @@ static inline bool rcu_segcblist_restempty(struct rcu_segcblist *rsclp, int seg)
 }
 
 void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp);
+void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v);
 void rcu_segcblist_init(struct rcu_segcblist *rsclp);
 void rcu_segcblist_disable(struct rcu_segcblist *rsclp);
 void rcu_segcblist_offload(struct rcu_segcblist *rsclp);
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 548404489c04..51348144a4ea 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2419,7 +2419,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
 			       rcu_segcblist_is_offloaded(&rdp->cblist);
 	struct rcu_head *rhp;
 	struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl);
-	long bl, count;
+	long bl, count = 0;
 	long pending, tlimit = 0;
 
 	/* If no callbacks are ready, just return. */
@@ -2464,6 +2464,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
 	for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) {
 		rcu_callback_t f;
 
+		count++;
 		debug_rcu_head_unqueue(rhp);
 
 		rcu_lock_acquire(&rcu_callback_map);
@@ -2477,9 +2478,8 @@ static void rcu_do_batch(struct rcu_data *rdp)
 
 		/*
 		 * Stop only if limit reached and CPU has something to do.
-		 * Note: The rcl structure counts down from zero.
 		 */
-		if (-rcl.len >= bl && !offloaded &&
+		if (count >= bl && !offloaded &&
 		    (need_resched() ||
 		     (!is_idle_task(current) && !rcu_is_callbacks_kthread())))
 			break;
@@ -2502,7 +2502,6 @@ static void rcu_do_batch(struct rcu_data *rdp)
 
 	local_irq_save(flags);
 	rcu_nocb_lock(rdp);
-	count = -rcl.len;
 	rdp->n_cbs_invoked += count;
 	trace_rcu_batch_end(rcu_state.name, count, !!rcl.head, need_resched(),
 			    is_idle_task(current), rcu_is_callbacks_kthread());
@@ -2510,7 +2509,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
 	/* Update counts and requeue any remaining callbacks. */
 	rcu_segcblist_insert_done_cbs(&rdp->cblist, &rcl);
 	smp_mb(); /* List handling before counting for rcu_barrier(). */
-	rcu_segcblist_insert_count(&rdp->cblist, &rcl);
+	rcu_segcblist_add_len(&rdp->cblist, -count);
 
 	/* Reinstate batch limit if we have worked down the excess. */
 	count = rcu_segcblist_n_cbs(&rdp->cblist);
-- 
2.28.0.297.g1956fa8f8d-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v4 -rcu 3/4] rcu/segcblist: Add counters to segcblist datastructure
  2020-08-25  2:48 [PATCH v4 -rcu 0/4] Maintain the length of each segment in the segcblist Joel Fernandes (Google)
  2020-08-25  2:48 ` [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge Joel Fernandes (Google)
  2020-08-25  2:48 ` [PATCH v4 -rcu 2/4] rcu/tree: Make rcu_do_batch count how many callbacks were executed Joel Fernandes (Google)
@ 2020-08-25  2:48 ` Joel Fernandes (Google)
  2020-08-25 21:53   ` Paul E. McKenney
  2020-08-25  2:48 ` [PATCH v4 -rcu 4/4] rcu/trace: Add tracing for how segcb list changes Joel Fernandes (Google)
  2020-08-25 19:58 ` [PATCH v4 -rcu 0/4] Maintain the length of each segment in the segcblist Paul E. McKenney
  4 siblings, 1 reply; 19+ messages in thread
From: Joel Fernandes (Google) @ 2020-08-25  2:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	boqun.feng, dave, Ingo Molnar, Josh Triplett, Lai Jiangshan,
	Madhuparna Bhowmik, Mathieu Desnoyers, neeraj.iitr10,
	Paul E. McKenney, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

Add counting of the segment lengths of the segmented callback list.

This will be useful for a number of things, such as knowing how big the
ready-to-execute segment has gotten. The immediate benefit is the ability to
trace how the callbacks in the segmented callback list change.

Tested by profusely reading traces when the segcblist counts are updated.
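The core of the accounting added below is a per-segment counter array plus a
"move" operation used when rcu_segcblist_advance() and
rcu_segcblist_accelerate() shuffle segment boundaries. A minimal user-space
model (illustrative only; segment names match the kernel's four segments but
the struct and helpers are simplified) looks like this:

```c
#include <assert.h>

/* The four segments of an rcu_segcblist. */
enum { SEG_DONE, SEG_WAIT, SEG_NEXT_READY, SEG_NEXT, NSEGS };

struct segcb {
	long seglen[NSEGS];  /* per-segment callback counts */
};

/* Add v callbacks to a segment's count. */
static void seg_add(struct segcb *s, int seg, long v)
{
	s->seglen[seg] += v;
}

/* Model of rcu_segcblist_move_seglen(): drain from's count into to. */
static void seg_move(struct segcb *s, int from, int to)
{
	if (from == to || !s->seglen[from])
		return;

	seg_add(s, to, s->seglen[from]);
	s->seglen[from] = 0;
}
```

Advancing past a grace period is then just a series of such moves toward
SEG_DONE, so the counts stay consistent without walking the list.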

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 include/linux/rcu_segcblist.h |  2 +
 kernel/rcu/rcu_segcblist.c    | 82 +++++++++++++++++++++++++++++++++--
 2 files changed, 81 insertions(+), 3 deletions(-)

diff --git a/include/linux/rcu_segcblist.h b/include/linux/rcu_segcblist.h
index b36afe7b22c9..d462ae5e340a 100644
--- a/include/linux/rcu_segcblist.h
+++ b/include/linux/rcu_segcblist.h
@@ -69,8 +69,10 @@ struct rcu_segcblist {
 	unsigned long gp_seq[RCU_CBLIST_NSEGS];
 #ifdef CONFIG_RCU_NOCB_CPU
 	atomic_long_t len;
+	atomic_long_t seglen[RCU_CBLIST_NSEGS];
 #else
 	long len;
+	long seglen[RCU_CBLIST_NSEGS];
 #endif
 	u8 enabled;
 	u8 offloaded;
diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index 076337ae2e50..73a103464ea4 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -88,6 +88,62 @@ static void rcu_segcblist_set_len(struct rcu_segcblist *rsclp, long v)
 #endif
 }
 
+/* Get the length of a segment of the rcu_segcblist structure. */
+static long rcu_segcblist_get_seglen(struct rcu_segcblist *rsclp, int seg)
+{
+#ifdef CONFIG_RCU_NOCB_CPU
+	return atomic_long_read(&rsclp->seglen[seg]);
+#else
+	return READ_ONCE(rsclp->seglen[seg]);
+#endif
+}
+
+/* Set the length of a segment of the rcu_segcblist structure. */
+static void rcu_segcblist_set_seglen(struct rcu_segcblist *rsclp, int seg, long v)
+{
+#ifdef CONFIG_RCU_NOCB_CPU
+	atomic_long_set(&rsclp->seglen[seg], v);
+#else
+	WRITE_ONCE(rsclp->seglen[seg], v);
+#endif
+}
+
+/* Add an amount to the length of a segment of the rcu_segcblist structure. */
+static void rcu_segcblist_add_seglen(struct rcu_segcblist *rsclp, int seg, long v)
+{
+#ifdef CONFIG_RCU_NOCB_CPU
+	smp_mb__before_atomic(); /* Up to the caller! */
+	atomic_long_add(v, &rsclp->seglen[seg]);
+	smp_mb__after_atomic(); /* Up to the caller! */
+#else
+	smp_mb(); /* Up to the caller! */
+	WRITE_ONCE(rsclp->seglen[seg], rsclp->seglen[seg] + v);
+	smp_mb(); /* Up to the caller! */
+#endif
+}
+
+/* Move from's segment length to to's segment. */
+static void rcu_segcblist_move_seglen(struct rcu_segcblist *rsclp, int from, int to)
+{
+	long len;
+
+	if (from == to)
+		return;
+
+	len = rcu_segcblist_get_seglen(rsclp, from);
+	if (!len)
+		return;
+
+	rcu_segcblist_add_seglen(rsclp, to, len);
+	rcu_segcblist_set_seglen(rsclp, from, 0);
+}
+
+/* Increment segment's length. */
+static void rcu_segcblist_inc_seglen(struct rcu_segcblist *rsclp, int seg)
+{
+	rcu_segcblist_add_seglen(rsclp, seg, 1);
+}
+
 /*
  * Increase the numeric length of an rcu_segcblist structure by the
  * specified amount, which can be negative.  This can cause the ->len
@@ -149,8 +205,10 @@ void rcu_segcblist_init(struct rcu_segcblist *rsclp)
 	BUILD_BUG_ON(RCU_NEXT_TAIL + 1 != ARRAY_SIZE(rsclp->gp_seq));
 	BUILD_BUG_ON(ARRAY_SIZE(rsclp->tails) != ARRAY_SIZE(rsclp->gp_seq));
 	rsclp->head = NULL;
-	for (i = 0; i < RCU_CBLIST_NSEGS; i++)
+	for (i = 0; i < RCU_CBLIST_NSEGS; i++) {
 		rsclp->tails[i] = &rsclp->head;
+		rcu_segcblist_set_seglen(rsclp, i, 0);
+	}
 	rcu_segcblist_set_len(rsclp, 0);
 	rsclp->enabled = 1;
 }
@@ -245,6 +303,7 @@ void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp,
 			   struct rcu_head *rhp)
 {
 	rcu_segcblist_inc_len(rsclp);
+	rcu_segcblist_inc_seglen(rsclp, RCU_NEXT_TAIL);
 	smp_mb(); /* Ensure counts are updated before callback is enqueued. */
 	rhp->next = NULL;
 	WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rhp);
@@ -274,6 +333,7 @@ bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
 	for (i = RCU_NEXT_TAIL; i > RCU_DONE_TAIL; i--)
 		if (rsclp->tails[i] != rsclp->tails[i - 1])
 			break;
+	rcu_segcblist_inc_seglen(rsclp, i);
 	WRITE_ONCE(*rsclp->tails[i], rhp);
 	for (; i <= RCU_NEXT_TAIL; i++)
 		WRITE_ONCE(rsclp->tails[i], &rhp->next);
@@ -307,6 +367,7 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
 
 	if (!rcu_segcblist_ready_cbs(rsclp))
 		return; /* Nothing to do. */
+	rclp->len = rcu_segcblist_get_seglen(rsclp, RCU_DONE_TAIL);
 	*rclp->tail = rsclp->head;
 	WRITE_ONCE(rsclp->head, *rsclp->tails[RCU_DONE_TAIL]);
 	WRITE_ONCE(*rsclp->tails[RCU_DONE_TAIL], NULL);
@@ -314,6 +375,7 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
 	for (i = RCU_CBLIST_NSEGS - 1; i >= RCU_DONE_TAIL; i--)
 		if (rsclp->tails[i] == rsclp->tails[RCU_DONE_TAIL])
 			WRITE_ONCE(rsclp->tails[i], &rsclp->head);
+	rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
 }
 
 /*
@@ -330,11 +392,16 @@ void rcu_segcblist_extract_pend_cbs(struct rcu_segcblist *rsclp,
 
 	if (!rcu_segcblist_pend_cbs(rsclp))
 		return; /* Nothing to do. */
+	rclp->len = rcu_segcblist_get_seglen(rsclp, RCU_WAIT_TAIL) +
+		    rcu_segcblist_get_seglen(rsclp, RCU_NEXT_READY_TAIL) +
+		    rcu_segcblist_get_seglen(rsclp, RCU_NEXT_TAIL);
 	*rclp->tail = *rsclp->tails[RCU_DONE_TAIL];
 	rclp->tail = rsclp->tails[RCU_NEXT_TAIL];
 	WRITE_ONCE(*rsclp->tails[RCU_DONE_TAIL], NULL);
-	for (i = RCU_DONE_TAIL + 1; i < RCU_CBLIST_NSEGS; i++)
+	for (i = RCU_DONE_TAIL + 1; i < RCU_CBLIST_NSEGS; i++) {
 		WRITE_ONCE(rsclp->tails[i], rsclp->tails[RCU_DONE_TAIL]);
+		rcu_segcblist_set_seglen(rsclp, i, 0);
+	}
 }
 
 /*
@@ -359,6 +426,7 @@ void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
 
 	if (!rclp->head)
 		return; /* No callbacks to move. */
+	rcu_segcblist_add_seglen(rsclp, RCU_DONE_TAIL, rclp->len);
 	*rclp->tail = rsclp->head;
 	WRITE_ONCE(rsclp->head, rclp->head);
 	for (i = RCU_DONE_TAIL; i < RCU_CBLIST_NSEGS; i++)
@@ -379,6 +447,8 @@ void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
 {
 	if (!rclp->head)
 		return; /* Nothing to do. */
+
+	rcu_segcblist_add_seglen(rsclp, RCU_NEXT_TAIL, rclp->len);
 	WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rclp->head);
 	WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], rclp->tail);
 }
@@ -403,6 +473,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
 		if (ULONG_CMP_LT(seq, rsclp->gp_seq[i]))
 			break;
 		WRITE_ONCE(rsclp->tails[RCU_DONE_TAIL], rsclp->tails[i]);
+		rcu_segcblist_move_seglen(rsclp, i, RCU_DONE_TAIL);
 	}
 
 	/* If no callbacks moved, nothing more need be done. */
@@ -423,6 +494,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
 		if (rsclp->tails[j] == rsclp->tails[RCU_NEXT_TAIL])
 			break;  /* No more callbacks. */
 		WRITE_ONCE(rsclp->tails[j], rsclp->tails[i]);
+		rcu_segcblist_move_seglen(rsclp, i, j);
 		rsclp->gp_seq[j] = rsclp->gp_seq[i];
 	}
 }
@@ -444,7 +516,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
  */
 bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq)
 {
-	int i;
+	int i, j;
 
 	WARN_ON_ONCE(!rcu_segcblist_is_enabled(rsclp));
 	if (rcu_segcblist_restempty(rsclp, RCU_DONE_TAIL))
@@ -487,6 +559,10 @@ bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq)
 	if (rcu_segcblist_restempty(rsclp, i) || ++i >= RCU_NEXT_TAIL)
 		return false;
 
+	/* Accounting: everything below i is about to get merged into i. */
+	for (j = i + 1; j <= RCU_NEXT_TAIL; j++)
+		rcu_segcblist_move_seglen(rsclp, j, i);
+
 	/*
 	 * Merge all later callbacks, including newly arrived callbacks,
 	 * into the segment located by the for-loop above.  Assign "seq"
-- 
2.28.0.297.g1956fa8f8d-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v4 -rcu 4/4] rcu/trace: Add tracing for how segcb list changes
  2020-08-25  2:48 [PATCH v4 -rcu 0/4] Maintain the length of each segment in the segcblist Joel Fernandes (Google)
                   ` (2 preceding siblings ...)
  2020-08-25  2:48 ` [PATCH v4 -rcu 3/4] rcu/segcblist: Add counters to segcblist datastructure Joel Fernandes (Google)
@ 2020-08-25  2:48 ` Joel Fernandes (Google)
  2020-08-25 21:55   ` Paul E. McKenney
  2020-08-25 19:58 ` [PATCH v4 -rcu 0/4] Maintain the length of each segment in the segcblist Paul E. McKenney
  4 siblings, 1 reply; 19+ messages in thread
From: Joel Fernandes (Google) @ 2020-08-25  2:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	boqun.feng, dave, Ingo Molnar, Josh Triplett, Lai Jiangshan,
	Madhuparna Bhowmik, Mathieu Desnoyers, neeraj.iitr10,
	Paul E. McKenney, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

Track how the segcb list changes before/after acceleration, during
queuing and during dequeuing.

This has proved useful for discovering an optimization that avoids unwanted GP
requests when no callbacks are accelerated. The overhead is minimal, as each
segment's length is now stored in the respective segment.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 include/trace/events/rcu.h | 25 +++++++++++++++++++++++++
 kernel/rcu/rcu_segcblist.c | 27 +++++++++++++++++++++++++++
 kernel/rcu/rcu_segcblist.h |  7 +++++++
 kernel/rcu/tree.c          | 23 +++++++++++++++++++++++
 4 files changed, 82 insertions(+)

diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index 155b5cb43cfd..7b84df3c95df 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -505,6 +505,31 @@ TRACE_EVENT_RCU(rcu_callback,
 		  __entry->qlen)
 );
 
+TRACE_EVENT_RCU(rcu_segcb,
+
+		TP_PROTO(const char *ctx, int *cb_count, unsigned long *gp_seq),
+
+		TP_ARGS(ctx, cb_count, gp_seq),
+
+		TP_STRUCT__entry(
+			__field(const char *, ctx)
+			__array(int, cb_count, 4)
+			__array(unsigned long, gp_seq, 4)
+		),
+
+		TP_fast_assign(
+			__entry->ctx = ctx;
+			memcpy(__entry->cb_count, cb_count, 4 * sizeof(int));
+			memcpy(__entry->gp_seq, gp_seq, 4 * sizeof(unsigned long));
+		),
+
+		TP_printk("%s cb_count: (DONE=%d, WAIT=%d, NEXT_READY=%d, NEXT=%d) "
+			  "gp_seq: (DONE=%lu, WAIT=%lu, NEXT_READY=%lu, NEXT=%lu)", __entry->ctx,
+			  __entry->cb_count[0], __entry->cb_count[1], __entry->cb_count[2], __entry->cb_count[3],
+			  __entry->gp_seq[0], __entry->gp_seq[1], __entry->gp_seq[2], __entry->gp_seq[3])
+
+);
+
 /*
  * Tracepoint for the registration of a single RCU callback of the special
  * kvfree() form.  The first argument is the RCU type, the second argument
diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index 73a103464ea4..6419dbbaecde 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -378,6 +378,33 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
 	rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
 }
 
+/*
+ * Return how many CBs are in each segment, along with their gp_seq values.
+ *
+ * This function is O(N) where N is the number of callbacks. Only used from
+ * tracing code which is usually disabled in production.
+ */
+#ifdef CONFIG_RCU_TRACE
+void rcu_segcblist_countseq(struct rcu_segcblist *rsclp,
+			 int cbcount[RCU_CBLIST_NSEGS],
+			 unsigned long gpseq[RCU_CBLIST_NSEGS])
+{
+	struct rcu_head **cur_tail;
+	int i;
+
+	for (i = 0; i < RCU_CBLIST_NSEGS; i++)
+		cbcount[i] = 0;
+
+	cur_tail = &(rsclp->head);
+
+	for (i = 0; i < RCU_CBLIST_NSEGS; i++) {
+		cbcount[i] = rcu_segcblist_get_seglen(rsclp, i);
+		gpseq[i] = rsclp->gp_seq[i];
+		cur_tail = rsclp->tails[i];
+	}
+}
+#endif
+
 /*
  * Extract only those callbacks still pending (not yet ready to be
  * invoked) from the specified rcu_segcblist structure and place them in
diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
index b90725f81d77..5f7cdfed0ba4 100644
--- a/kernel/rcu/rcu_segcblist.h
+++ b/kernel/rcu/rcu_segcblist.h
@@ -105,3 +105,10 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq);
 bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq);
 void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
 			 struct rcu_segcblist *src_rsclp);
+#ifdef CONFIG_RCU_TRACE
+void rcu_segcblist_countseq(struct rcu_segcblist *rsclp,
+			 int cbcount[RCU_CBLIST_NSEGS],
+			 unsigned long gpseq[RCU_CBLIST_NSEGS]);
+#else
+#define rcu_segcblist_countseq(...)
+#endif
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 51348144a4ea..16ad99a9ebba 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1476,6 +1476,8 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp)
 {
 	unsigned long gp_seq_req;
 	bool ret = false;
+	int cbs[RCU_CBLIST_NSEGS];
+	unsigned long gps[RCU_CBLIST_NSEGS];
 
 	rcu_lockdep_assert_cblist_protected(rdp);
 	raw_lockdep_assert_held_rcu_node(rnp);
@@ -1484,6 +1486,10 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp)
 	if (!rcu_segcblist_pend_cbs(&rdp->cblist))
 		return false;
 
+	/* Count CBs for tracing. */
+	rcu_segcblist_countseq(&rdp->cblist, cbs, gps);
+	trace_rcu_segcb("SegCbPreAcc", cbs, gps);
+
 	/*
 	 * Callbacks are often registered with incomplete grace-period
 	 * information.  Something about the fact that getting exact
@@ -1504,6 +1510,10 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp)
 	else
 		trace_rcu_grace_period(rcu_state.name, gp_seq_req, TPS("AccReadyCB"));
 
+	/* Count CBs for tracing. */
+	rcu_segcblist_countseq(&rdp->cblist, cbs, gps);
+	trace_rcu_segcb("SegCbPostAcc", cbs, gps);
+
 	return ret;
 }
 
@@ -2421,6 +2431,8 @@ static void rcu_do_batch(struct rcu_data *rdp)
 	struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl);
 	long bl, count = 0;
 	long pending, tlimit = 0;
+	int cbs[RCU_CBLIST_NSEGS];
+	unsigned long gps[RCU_CBLIST_NSEGS];
 
 	/* If no callbacks are ready, just return. */
 	if (!rcu_segcblist_ready_cbs(&rdp->cblist)) {
@@ -2461,6 +2473,11 @@ static void rcu_do_batch(struct rcu_data *rdp)
 	/* Invoke callbacks. */
 	tick_dep_set_task(current, TICK_DEP_BIT_RCU);
 	rhp = rcu_cblist_dequeue(&rcl);
+
+	/* Count CBs for tracing. */
+	rcu_segcblist_countseq(&rdp->cblist, cbs, gps);
+	trace_rcu_segcb("SegCbDequeued", cbs, gps);
+
 	for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) {
 		rcu_callback_t f;
 
@@ -2929,6 +2946,8 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
 	unsigned long flags;
 	struct rcu_data *rdp;
 	bool was_alldone;
+	int cbs[RCU_CBLIST_NSEGS];
+	unsigned long gps[RCU_CBLIST_NSEGS];
 
 	/* Misaligned rcu_head! */
 	WARN_ON_ONCE((unsigned long)head & (sizeof(void *) - 1));
@@ -2974,6 +2993,10 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
 		trace_rcu_callback(rcu_state.name, head,
 				   rcu_segcblist_n_cbs(&rdp->cblist));
 
+	/* Count CBs for tracing. */
+	rcu_segcblist_countseq(&rdp->cblist, cbs, gps);
+	trace_rcu_segcb("SegCBQueued", cbs, gps);
+
 	/* Go handle any RCU core processing required. */
 	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
 	    unlikely(rcu_segcblist_is_offloaded(&rdp->cblist))) {
-- 
2.28.0.297.g1956fa8f8d-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 -rcu 0/4] Maintain the length of each segment in the segcblist
  2020-08-25  2:48 [PATCH v4 -rcu 0/4] Maintain the length of each segment in the segcblist Joel Fernandes (Google)
                   ` (3 preceding siblings ...)
  2020-08-25  2:48 ` [PATCH v4 -rcu 4/4] rcu/trace: Add tracing for how segcb list changes Joel Fernandes (Google)
@ 2020-08-25 19:58 ` Paul E. McKenney
  4 siblings, 0 replies; 19+ messages in thread
From: Paul E. McKenney @ 2020-08-25 19:58 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, boqun.feng, dave, Ingo Molnar, Josh Triplett,
	Lai Jiangshan, Madhuparna Bhowmik, Mathieu Desnoyers,
	neeraj.iitr10, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

On Mon, Aug 24, 2020 at 10:48:38PM -0400, Joel Fernandes (Google) wrote:
> This is required for several identified use cases, one of them being tracing
> how the segmented callback list changes. Such tracing has identified issues
> in RCU code in the past. The trace patch is the last one in this series.

Thank you for doing this!  Another use case is of course more accurately
determining whether a given CPU's large pile of callbacks can be best
served by making grace periods go faster, invoking callbacks more
vigorously, or both.  It should also be possible to simplify some of
the callback handling a bit, given that some of the unnatural acts are
due to there having been no per-batch counts.

							Thanx, Paul

> Passes 30 minutes of TREE01, TREE03, TREE07.  Testing of other TREE configs is
> in progress.
> 
> Patches are based on -rcu commit:
> 4f43de2a419a ("Merge branch 'lkmm-dev.2020.08.24a' into HEAD")
> 
> Revision history:
> v4: Restructured rcu_do_batch() and segcblist merging to avoid issues.
>     Fixed minor nit from Davidlohr.
> v1->v3: minor nits.
> (https://lore.kernel.org/lkml/20200719034210.2382053-1-joel@joelfernandes.org/)
> 
> Joel Fernandes (Google) (4):
> rcu/segcblist: Do not depend on rcl->len to store the segcb len during
> merge
> rcu/tree: Make rcu_do_batch count how many callbacks were executed
> rcu/segcblist: Add counters to segcblist datastructure
> rcu/trace: Add tracing for how segcb list changes
> 
> include/linux/rcu_segcblist.h |   2 +
> include/trace/events/rcu.h    |  25 +++++++
> kernel/rcu/rcu_segcblist.c    | 119 ++++++++++++++++++++++++++++++++--
> kernel/rcu/rcu_segcblist.h    |   8 +++
> kernel/rcu/tree.c             |  32 +++++++--
> 5 files changed, 175 insertions(+), 11 deletions(-)
> 
> --
> 2.28.0.297.g1956fa8f8d-goog
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge
  2020-08-25  2:48 ` [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge Joel Fernandes (Google)
@ 2020-08-25 20:08   ` Paul E. McKenney
  2020-08-25 22:47     ` Joel Fernandes
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2020-08-25 20:08 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, boqun.feng, dave, Ingo Molnar, Josh Triplett,
	Lai Jiangshan, Madhuparna Bhowmik, Mathieu Desnoyers,
	neeraj.iitr10, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

On Mon, Aug 24, 2020 at 10:48:39PM -0400, Joel Fernandes (Google) wrote:
> The donecbs rcu_cblist's ->len field is used to store the source segmented
> callback list's total length. This ->len field is then added to the
> destination segcblist's ->len.
> 
> However, this presents a problem for the per-segment length counting added in
> a later patch. That patch sets rcl->len as segments of callbacks are moved
> between the source and destination lists, which is incompatible with also
> using the donecbs ->len field as a temporary total.

OK, I will bite.  What is "rcl"?  A placeholder for donecbs and pendcbs?
If so, please just name them both.  If not, please explain.

> This commit therefore avoids depending on the ->len field in this way. IMHO,
> this is also less error-prone and more accurate: the donecbs ->len field
> should be the length of the done segment, not a temporary variable.

Please also mention why ->len is handled specially at all, namely
interactions between rcu_barrier() and callback invocation.  This is
the answer to "why not just make all this work like normal lists?"
This might go well in the first paragraph.

> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
>  kernel/rcu/rcu_segcblist.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> index 2d2a6b6b9dfb..b70d4154433c 100644
> --- a/kernel/rcu/rcu_segcblist.c
> +++ b/kernel/rcu/rcu_segcblist.c
> @@ -513,14 +513,18 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
>  {
>  	struct rcu_cblist donecbs;
>  	struct rcu_cblist pendcbs;
> +	long src_len;
>  
>  	rcu_cblist_init(&donecbs);
>  	rcu_cblist_init(&pendcbs);
> -	rcu_segcblist_extract_count(src_rsclp, &donecbs);
> +
> +	src_len = rcu_segcblist_xchg_len(src_rsclp, 0);

Given that both rcu_segcblist_xchg_len() and rcu_segcblist_extract_count()
have only one callsite each, why not get rid of one of them?

Or better yet, please see below, which should allow getting rid of both
of them.

>  	rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
>  	rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> -	rcu_segcblist_insert_count(dst_rsclp, &donecbs);
> +
> +	rcu_segcblist_add_len(dst_rsclp, src_len);
>  	rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
>  	rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);

Rather than adding the blank lines, why not have the rcu_cblist structures
carry the lengths?  You are already adjusting one of the two call sites
that care (rcu_do_batch()), and the other is srcu_invoke_callbacks().
That should shorten this function a bit more.  And make callback handling
much more approachable, I suspect.

There would still be the callback-invocation need to be careful with
->cblist.len due to rcu_barrier() and srcu_barrier().  But both of
those should be excluded by this code.  (But don't take my word for it,
ask KCSAN.)
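Concretely, the suggestion above might look something like the following userspace model (a sketch only, not kernel code: `struct cblist`, `cblist_splice()`, and friends are simplified stand-ins for the rcu_cblist API). If the extracted list carries its own length, the merge needs no separate count transfer at all:

```c
/* Userspace model: a cblist that carries its own length, so splicing
 * one list onto another moves the count along with the callbacks. */
#include <stddef.h>

struct cb {
	struct cb *next;
};

struct cblist {			/* models struct rcu_cblist */
	struct cb *head;
	struct cb **tail;
	long len;		/* length travels with the list */
};

static void cblist_init(struct cblist *clp)
{
	clp->head = NULL;
	clp->tail = &clp->head;
	clp->len = 0;
}

static void cblist_enqueue(struct cblist *clp, struct cb *cbp)
{
	cbp->next = NULL;
	*clp->tail = cbp;
	clp->tail = &cbp->next;
	clp->len++;
}

/* Splice @src onto the end of @dst; the lengths simply add. */
static void cblist_splice(struct cblist *dst, struct cblist *src)
{
	if (!src->head)
		return;
	*dst->tail = src->head;
	dst->tail = src->tail;
	dst->len += src->len;
	cblist_init(src);
}
```

With this shape, a merge is just two splices (done and pending), and no xchg/extract-count helper is needed.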

							Thanx, Paul

> +
>  	rcu_segcblist_init(src_rsclp);
>  }
> -- 
> 2.28.0.297.g1956fa8f8d-goog
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 -rcu 2/4] rcu/tree: Make rcu_do_batch count how many callbacks were executed
  2020-08-25  2:48 ` [PATCH v4 -rcu 2/4] rcu/tree: Make rcu_do_batch count how many callbacks were executed Joel Fernandes (Google)
@ 2020-08-25 20:13   ` Paul E. McKenney
  0 siblings, 0 replies; 19+ messages in thread
From: Paul E. McKenney @ 2020-08-25 20:13 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, boqun.feng, dave, Ingo Molnar, Josh Triplett,
	Lai Jiangshan, Madhuparna Bhowmik, Mathieu Desnoyers,
	neeraj.iitr10, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

On Mon, Aug 24, 2020 at 10:48:40PM -0400, Joel Fernandes (Google) wrote:
> Currently, rcu_do_batch() depends on the unsegmented callback list's len field
> to know how many CBs are executed. This field counts down from 0 as CBs are
> dequeued.  It is possible that not all CBs can be run because a limit is
> reached, in which case the remaining unexecuted callbacks are requeued in the
> CPU's segcblist.

Again, please mention why the ->cblist.len fields are preserved, namely
due to interactions with rcu_barrier().

> The number of callbacks that were not requeued is then the negative count (how
> many CBs were run) stored in rcl->len, which has been counting down on every
> dequeue. This negative count is then added to the per-CPU segmented callback
> list's len field to correct its count.
> 
> Such a design works against future efforts to track the length of each segment
> of the segmented callback list. The reason is that
> rcu_segcblist_extract_done_cbs() will have to store the length of the callback
> list in rcl->len to make rcu_segcblist_merge() work.
> 
> Also, the design of counting down from 0 is confusing and error-prone IMHO.
> 
> This commit therefore explicitly counts how many callbacks were executed in
> rcu_do_batch() itself, and uses that to update the per-CPU segcb list's len
> field, without relying on the negativity of rcl->len.

This last, as noted in response to 1/4, should allow rcu_segcblist_merge()
to avoid carrying the callback count separately.

> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
>  kernel/rcu/rcu_segcblist.c | 2 +-
>  kernel/rcu/rcu_segcblist.h | 1 +
>  kernel/rcu/tree.c          | 9 ++++-----
>  3 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> index b70d4154433c..076337ae2e50 100644
> --- a/kernel/rcu/rcu_segcblist.c
> +++ b/kernel/rcu/rcu_segcblist.c
> @@ -95,7 +95,7 @@ static void rcu_segcblist_set_len(struct rcu_segcblist *rsclp, long v)
>   * This increase is fully ordered with respect to the callers accesses
>   * both before and after.
>   */
> -static void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v)
> +void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v)
>  {
>  #ifdef CONFIG_RCU_NOCB_CPU
>  	smp_mb__before_atomic(); /* Up to the caller! */
> diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
> index 5c293afc07b8..b90725f81d77 100644
> --- a/kernel/rcu/rcu_segcblist.h
> +++ b/kernel/rcu/rcu_segcblist.h
> @@ -76,6 +76,7 @@ static inline bool rcu_segcblist_restempty(struct rcu_segcblist *rsclp, int seg)
>  }
>  
>  void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp);
> +void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v);
>  void rcu_segcblist_init(struct rcu_segcblist *rsclp);
>  void rcu_segcblist_disable(struct rcu_segcblist *rsclp);
>  void rcu_segcblist_offload(struct rcu_segcblist *rsclp);
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 548404489c04..51348144a4ea 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2419,7 +2419,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
>  			       rcu_segcblist_is_offloaded(&rdp->cblist);
>  	struct rcu_head *rhp;
>  	struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl);
> -	long bl, count;
> +	long bl, count = 0;
>  	long pending, tlimit = 0;
>  
>  	/* If no callbacks are ready, just return. */
> @@ -2464,6 +2464,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
>  	for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) {
>  		rcu_callback_t f;
>  
> +		count++;
>  		debug_rcu_head_unqueue(rhp);
>  
>  		rcu_lock_acquire(&rcu_callback_map);
> @@ -2477,9 +2478,8 @@ static void rcu_do_batch(struct rcu_data *rdp)
>  
>  		/*
>  		 * Stop only if limit reached and CPU has something to do.
> -		 * Note: The rcl structure counts down from zero.
>  		 */
> -		if (-rcl.len >= bl && !offloaded &&
> +		if (count >= bl && !offloaded &&
>  		    (need_resched() ||
>  		     (!is_idle_task(current) && !rcu_is_callbacks_kthread())))
>  			break;

Please also replace "-rcl.len" in this if statement with "count":

			/* only call local_clock() every 32 callbacks */
			if (likely((-rcl.len & 31) || local_clock() < tlimit))
				continue;
			/* Exceeded the time limit, so leave. */
			break;

Yeah, it does work either way, but having "count" consistently throughout
is easier on the people reading the code.  And I haven't heard many
complaints about the code being too easy to read...
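To make the contrast concrete, here is a userspace sketch of the two counting styles (batch-limit check only; the scheduling and offload conditions from the real rcu_do_batch() are omitted, and BATCH_LIMIT stands in for "bl"):

```c
/* Userspace model of the two counting styles in rcu_do_batch():
 * the old style decrements rcl.len toward negative values as CBs are
 * dequeued and recovers the count by negation; the new style keeps an
 * explicit, always-positive count. */
#define BATCH_LIMIT 4

/* Old style: len starts at 0 and goes negative on each dequeue. */
static long run_counting_down(int ready)
{
	long len = 0;

	for (int i = 0; i < ready; i++) {
		len--;				/* dequeue one callback */
		if (-len >= BATCH_LIMIT)	/* "-rcl.len >= bl" */
			break;
	}
	return -len;	/* executed count recovered by negation */
}

/* New style: count increments from 0; no negation needed anywhere. */
static long run_counting_up(int ready)
{
	long count = 0;

	for (int i = 0; i < ready; i++) {
		count++;
		if (count >= BATCH_LIMIT)
			break;
	}
	return count;
}
```

Both return the same value; the second reads the same way everywhere "count" is used, which is the point of the review comment.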

							Thanx, Paul

> @@ -2502,7 +2502,6 @@ static void rcu_do_batch(struct rcu_data *rdp)
>  
>  	local_irq_save(flags);
>  	rcu_nocb_lock(rdp);
> -	count = -rcl.len;
>  	rdp->n_cbs_invoked += count;
>  	trace_rcu_batch_end(rcu_state.name, count, !!rcl.head, need_resched(),
>  			    is_idle_task(current), rcu_is_callbacks_kthread());
> @@ -2510,7 +2509,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
>  	/* Update counts and requeue any remaining callbacks. */
>  	rcu_segcblist_insert_done_cbs(&rdp->cblist, &rcl);
>  	smp_mb(); /* List handling before counting for rcu_barrier(). */
> -	rcu_segcblist_insert_count(&rdp->cblist, &rcl);
> +	rcu_segcblist_add_len(&rdp->cblist, -count);
>  
>  	/* Reinstate batch limit if we have worked down the excess. */
>  	count = rcu_segcblist_n_cbs(&rdp->cblist);
> -- 
> 2.28.0.297.g1956fa8f8d-goog
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 -rcu 3/4] rcu/segcblist: Add counters to segcblist datastructure
  2020-08-25  2:48 ` [PATCH v4 -rcu 3/4] rcu/segcblist: Add counters to segcblist datastructure Joel Fernandes (Google)
@ 2020-08-25 21:53   ` Paul E. McKenney
  2020-08-25 22:51     ` Joel Fernandes
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2020-08-25 21:53 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, boqun.feng, dave, Ingo Molnar, Josh Triplett,
	Lai Jiangshan, Madhuparna Bhowmik, Mathieu Desnoyers,
	neeraj.iitr10, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

On Mon, Aug 24, 2020 at 10:48:41PM -0400, Joel Fernandes (Google) wrote:
> Add counting of segment lengths to the segmented callback list.
> 
> This will be useful for a number of things such as knowing how big the
> ready-to-execute segment has gotten. The immediate benefit is the ability
> to trace how the callbacks in the segmented callback list change.
> 
> Tested by profusely reading traces when segcblist counts updated.

Might I recommend a more repeatable and sustainable test methodology?
(Sorry, couldn't resist, given that you left yourself wide open for
that one...)

> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
>  include/linux/rcu_segcblist.h |  2 +
>  kernel/rcu/rcu_segcblist.c    | 82 +++++++++++++++++++++++++++++++++--
>  2 files changed, 81 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/rcu_segcblist.h b/include/linux/rcu_segcblist.h
> index b36afe7b22c9..d462ae5e340a 100644
> --- a/include/linux/rcu_segcblist.h
> +++ b/include/linux/rcu_segcblist.h
> @@ -69,8 +69,10 @@ struct rcu_segcblist {
>  	unsigned long gp_seq[RCU_CBLIST_NSEGS];
>  #ifdef CONFIG_RCU_NOCB_CPU
>  	atomic_long_t len;
> +	atomic_long_t seglen[RCU_CBLIST_NSEGS];

What are the guarantees for access to seglen?

I get that the possibility of RCU callback offloading means that you
need atomic_long_t, but presumably you are only guaranteeing coherent
lockless access to the individual elements of the array, rather than
any coherent relationship among them or with ->len.  Anything surprising
should go in the header comment blocks of the access functions.

And in case someone thinks that atomic_t would suffice, I have in the past
had lists with in excess of 30M callbacks without RCU CPU stall warnings,
so 2B is not as large a number as you might think.
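The per-element-coherence-only guarantee can be modeled in userspace with C11 relaxed atomics standing in for atomic_long_t/READ_ONCE (a sketch, not the kernel accessors; the kernel versions additionally carry the smp_mb() ordering shown in the patch):

```c
/* Userspace model: each seglen element is individually coherent
 * (relaxed atomics), but nothing is implied about consistency between
 * elements or with ->len, matching the guarantee under discussion. */
#include <stdatomic.h>

#define NSEGS 4

struct segcblist {
	atomic_long seglen[NSEGS];
	atomic_long len;
};

static long get_seglen(struct segcblist *p, int seg)
{
	return atomic_load_explicit(&p->seglen[seg], memory_order_relaxed);
}

static void add_seglen(struct segcblist *p, int seg, long v)
{
	atomic_fetch_add_explicit(&p->seglen[seg], v, memory_order_relaxed);
}

/* Move the 'from' segment's count onto the 'to' segment. */
static void move_seglen(struct segcblist *p, int from, int to)
{
	long len;

	if (from == to)
		return;
	len = atomic_exchange_explicit(&p->seglen[from], 0,
				       memory_order_relaxed);
	add_seglen(p, to, len);
}
```

Using long-width atomics here mirrors the "30M callbacks" concern: the counters should not be limited to 32 bits.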

>  #else
>  	long len;
> +	long seglen[RCU_CBLIST_NSEGS];
>  #endif
>  	u8 enabled;
>  	u8 offloaded;
> diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> index 076337ae2e50..73a103464ea4 100644
> --- a/kernel/rcu/rcu_segcblist.c
> +++ b/kernel/rcu/rcu_segcblist.c
> @@ -88,6 +88,62 @@ static void rcu_segcblist_set_len(struct rcu_segcblist *rsclp, long v)
>  #endif
>  }
>  
> +/* Get the length of a segment of the rcu_segcblist structure. */
> +static long rcu_segcblist_get_seglen(struct rcu_segcblist *rsclp, int seg)
> +{
> +#ifdef CONFIG_RCU_NOCB_CPU
> +	return atomic_long_read(&rsclp->seglen[seg]);
> +#else
> +	return READ_ONCE(rsclp->seglen[seg]);
> +#endif
> +}
> +
> +/* Set the length of a segment of the rcu_segcblist structure. */
> +static void rcu_segcblist_set_seglen(struct rcu_segcblist *rsclp, int seg, long v)
> +{
> +#ifdef CONFIG_RCU_NOCB_CPU
> +	atomic_long_set(&rsclp->seglen[seg], v);
> +#else
> +	WRITE_ONCE(rsclp->seglen[seg], v);
> +#endif
> +}
> +
> +/* Add to the length of a segment of the segmented callback list. */
> +static void rcu_segcblist_add_seglen(struct rcu_segcblist *rsclp, int seg, long v)
> +{
> +#ifdef CONFIG_RCU_NOCB_CPU
> +	smp_mb__before_atomic(); /* Up to the caller! */
> +	atomic_long_add(v, &rsclp->seglen[seg]);
> +	smp_mb__after_atomic(); /* Up to the caller! */
> +#else
> +	smp_mb(); /* Up to the caller! */
> +	WRITE_ONCE(rsclp->seglen[seg], rsclp->seglen[seg] + v);
> +	smp_mb(); /* Up to the caller! */
> +#endif
> +}
> +
> +/* Move the length of the 'from' segment to the 'to' segment. */
> +static void rcu_segcblist_move_seglen(struct rcu_segcblist *rsclp, int from, int to)
> +{
> +	long len;
> +
> +	if (from == to)
> +		return;
> +
> +	len = rcu_segcblist_get_seglen(rsclp, from);
> +	if (!len)
> +		return;

You don't need the blank lines on this small a function.

> +	rcu_segcblist_add_seglen(rsclp, to, len);
> +	rcu_segcblist_set_seglen(rsclp, from, 0);
> +}
> +
> +/* Increment segment's length. */
> +static void rcu_segcblist_inc_seglen(struct rcu_segcblist *rsclp, int seg)
> +{
> +	rcu_segcblist_add_seglen(rsclp, seg, 1);
> +}
> +
>  /*
>   * Increase the numeric length of an rcu_segcblist structure by the
>   * specified amount, which can be negative.  This can cause the ->len
> @@ -149,8 +205,10 @@ void rcu_segcblist_init(struct rcu_segcblist *rsclp)
>  	BUILD_BUG_ON(RCU_NEXT_TAIL + 1 != ARRAY_SIZE(rsclp->gp_seq));
>  	BUILD_BUG_ON(ARRAY_SIZE(rsclp->tails) != ARRAY_SIZE(rsclp->gp_seq));
>  	rsclp->head = NULL;
> -	for (i = 0; i < RCU_CBLIST_NSEGS; i++)
> +	for (i = 0; i < RCU_CBLIST_NSEGS; i++) {
>  		rsclp->tails[i] = &rsclp->head;
> +		rcu_segcblist_set_seglen(rsclp, i, 0);
> +	}
>  	rcu_segcblist_set_len(rsclp, 0);
>  	rsclp->enabled = 1;
>  }
> @@ -245,6 +303,7 @@ void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp,
>  			   struct rcu_head *rhp)
>  {
>  	rcu_segcblist_inc_len(rsclp);
> +	rcu_segcblist_inc_seglen(rsclp, RCU_NEXT_TAIL);
>  	smp_mb(); /* Ensure counts are updated before callback is enqueued. */
>  	rhp->next = NULL;
>  	WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rhp);
> @@ -274,6 +333,7 @@ bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
>  	for (i = RCU_NEXT_TAIL; i > RCU_DONE_TAIL; i--)
>  		if (rsclp->tails[i] != rsclp->tails[i - 1])
>  			break;
> +	rcu_segcblist_inc_seglen(rsclp, i);
>  	WRITE_ONCE(*rsclp->tails[i], rhp);
>  	for (; i <= RCU_NEXT_TAIL; i++)
>  		WRITE_ONCE(rsclp->tails[i], &rhp->next);
> @@ -307,6 +367,7 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
>  
>  	if (!rcu_segcblist_ready_cbs(rsclp))
>  		return; /* Nothing to do. */
> +	rclp->len = rcu_segcblist_get_seglen(rsclp, RCU_DONE_TAIL);
>  	*rclp->tail = rsclp->head;
>  	WRITE_ONCE(rsclp->head, *rsclp->tails[RCU_DONE_TAIL]);
>  	WRITE_ONCE(*rsclp->tails[RCU_DONE_TAIL], NULL);
> @@ -314,6 +375,7 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
>  	for (i = RCU_CBLIST_NSEGS - 1; i >= RCU_DONE_TAIL; i--)
>  		if (rsclp->tails[i] == rsclp->tails[RCU_DONE_TAIL])
>  			WRITE_ONCE(rsclp->tails[i], &rsclp->head);
> +	rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
>  }
>  
>  /*
> @@ -330,11 +392,16 @@ void rcu_segcblist_extract_pend_cbs(struct rcu_segcblist *rsclp,
>  
>  	if (!rcu_segcblist_pend_cbs(rsclp))
>  		return; /* Nothing to do. */
> +	rclp->len = rcu_segcblist_get_seglen(rsclp, RCU_WAIT_TAIL) +
> +		    rcu_segcblist_get_seglen(rsclp, RCU_NEXT_READY_TAIL) +
> +		    rcu_segcblist_get_seglen(rsclp, RCU_NEXT_TAIL);

A loop for future-proofing, please.  The compiler will unroll it anyway.
Plus this is nowhere near a fastpath.
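The loop being asked for might look like this userspace sketch (segment indices follow the kernel's RCU_*_TAIL ordering; the seglen array is a plain-long stand-in for the accessors in the patch):

```c
/* Sum the pending segments' lengths with a loop instead of writing
 * out each term by hand; everything after RCU_DONE_TAIL is pending. */
enum { RCU_DONE_TAIL, RCU_WAIT_TAIL, RCU_NEXT_READY_TAIL,
       RCU_NEXT_TAIL, RCU_CBLIST_NSEGS };

static long sum_pend_seglen(const long seglen[RCU_CBLIST_NSEGS])
{
	long sum = 0;
	int i;

	for (i = RCU_WAIT_TAIL; i < RCU_CBLIST_NSEGS; i++)
		sum += seglen[i];
	return sum;
}
```

A new segment would then be picked up automatically rather than requiring another hand-written term.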

>  	*rclp->tail = *rsclp->tails[RCU_DONE_TAIL];
>  	rclp->tail = rsclp->tails[RCU_NEXT_TAIL];
>  	WRITE_ONCE(*rsclp->tails[RCU_DONE_TAIL], NULL);
> -	for (i = RCU_DONE_TAIL + 1; i < RCU_CBLIST_NSEGS; i++)
> +	for (i = RCU_DONE_TAIL + 1; i < RCU_CBLIST_NSEGS; i++) {
>  		WRITE_ONCE(rsclp->tails[i], rsclp->tails[RCU_DONE_TAIL]);
> +		rcu_segcblist_set_seglen(rsclp, i, 0);
> +	}
>  }
>  
>  /*
> @@ -359,6 +426,7 @@ void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
>  
>  	if (!rclp->head)
>  		return; /* No callbacks to move. */
> +	rcu_segcblist_add_seglen(rsclp, RCU_DONE_TAIL, rclp->len);
>  	*rclp->tail = rsclp->head;
>  	WRITE_ONCE(rsclp->head, rclp->head);
>  	for (i = RCU_DONE_TAIL; i < RCU_CBLIST_NSEGS; i++)
> @@ -379,6 +447,8 @@ void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
>  {
>  	if (!rclp->head)
>  		return; /* Nothing to do. */
> +

This only has five lines of code, so the blank line is superfluous.

> +	rcu_segcblist_add_seglen(rsclp, RCU_NEXT_TAIL, rclp->len);
>  	WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rclp->head);
>  	WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], rclp->tail);
>  }
> @@ -403,6 +473,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
>  		if (ULONG_CMP_LT(seq, rsclp->gp_seq[i]))
>  			break;
>  		WRITE_ONCE(rsclp->tails[RCU_DONE_TAIL], rsclp->tails[i]);
> +		rcu_segcblist_move_seglen(rsclp, i, RCU_DONE_TAIL);
>  	}
>  
>  	/* If no callbacks moved, nothing more need be done. */
> @@ -423,6 +494,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
>  		if (rsclp->tails[j] == rsclp->tails[RCU_NEXT_TAIL])
>  			break;  /* No more callbacks. */
>  		WRITE_ONCE(rsclp->tails[j], rsclp->tails[i]);
> +		rcu_segcblist_move_seglen(rsclp, i, j);
>  		rsclp->gp_seq[j] = rsclp->gp_seq[i];
>  	}
>  }
> @@ -444,7 +516,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
>   */
>  bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq)
>  {
> -	int i;
> +	int i, j;
>  
>  	WARN_ON_ONCE(!rcu_segcblist_is_enabled(rsclp));
>  	if (rcu_segcblist_restempty(rsclp, RCU_DONE_TAIL))
> @@ -487,6 +559,10 @@ bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq)
>  	if (rcu_segcblist_restempty(rsclp, i) || ++i >= RCU_NEXT_TAIL)
>  		return false;
>  
> +	/* Accounting: everything below i is about to get merged into i. */

How about "newer than i"?  Many people would think of "below i" as being
numerically less than i, which is not what we need them to think.

> +	for (j = i + 1; j <= RCU_NEXT_TAIL; j++)
> +		rcu_segcblist_move_seglen(rsclp, j, i);

I thought about merging the above loop with the next one, but the added
check and the snapshotting of "i" makes it not worthwhile, so please
leave this loop separate, just as you now have it.

							Thanx, Paul

>  	/*
>  	 * Merge all later callbacks, including newly arrived callbacks,
>  	 * into the segment located by the for-loop above.  Assign "seq"
> -- 
> 2.28.0.297.g1956fa8f8d-goog
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 -rcu 4/4] rcu/trace: Add tracing for how segcb list changes
  2020-08-25  2:48 ` [PATCH v4 -rcu 4/4] rcu/trace: Add tracing for how segcb list changes Joel Fernandes (Google)
@ 2020-08-25 21:55   ` Paul E. McKenney
  2020-08-25 22:53     ` Joel Fernandes
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2020-08-25 21:55 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, boqun.feng, dave, Ingo Molnar, Josh Triplett,
	Lai Jiangshan, Madhuparna Bhowmik, Mathieu Desnoyers,
	neeraj.iitr10, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

On Mon, Aug 24, 2020 at 10:48:42PM -0400, Joel Fernandes (Google) wrote:
> Track how the segcb list changes before/after acceleration, during
> queuing and during dequeuing.
> 
> This has proved useful to discover an optimization to avoid unwanted GP
> requests when there are no callbacks accelerated. The overhead is minimal as
> each segment's length is now stored in the respective segment.
> 
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>

Speaking of tracing, diagnostics, and debugging, a few instances of
ASSERT_EXCLUSIVE_WRITER() in the previous patch might save a lot
of debugging.  Help KCSAN help us.

> ---
>  include/trace/events/rcu.h | 25 +++++++++++++++++++++++++
>  kernel/rcu/rcu_segcblist.c | 27 +++++++++++++++++++++++++++
>  kernel/rcu/rcu_segcblist.h |  7 +++++++
>  kernel/rcu/tree.c          | 23 +++++++++++++++++++++++
>  4 files changed, 82 insertions(+)
> 
> diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
> index 155b5cb43cfd..7b84df3c95df 100644
> --- a/include/trace/events/rcu.h
> +++ b/include/trace/events/rcu.h
> @@ -505,6 +505,31 @@ TRACE_EVENT_RCU(rcu_callback,
>  		  __entry->qlen)
>  );
>  
> +TRACE_EVENT_RCU(rcu_segcb,
> +
> +		TP_PROTO(const char *ctx, int *cb_count, unsigned long *gp_seq),
> +
> +		TP_ARGS(ctx, cb_count, gp_seq),
> +
> +		TP_STRUCT__entry(
> +			__field(const char *, ctx)
> +			__array(int, cb_count, 4)
> +			__array(unsigned long, gp_seq, 4)
> +		),
> +
> +		TP_fast_assign(
> +			__entry->ctx = ctx;
> +			memcpy(__entry->cb_count, cb_count, 4 * sizeof(int));
> +			memcpy(__entry->gp_seq, gp_seq, 4 * sizeof(unsigned long));
> +		),
> +
> +		TP_printk("%s cb_count: (DONE=%d, WAIT=%d, NEXT_READY=%d, NEXT=%d) "
> +			  "gp_seq: (DONE=%lu, WAIT=%lu, NEXT_READY=%lu, NEXT=%lu)", __entry->ctx,
> +			  __entry->cb_count[0], __entry->cb_count[1], __entry->cb_count[2], __entry->cb_count[3],
> +			  __entry->gp_seq[0], __entry->gp_seq[1], __entry->gp_seq[2], __entry->gp_seq[3])
> +
> +);
> +
>  /*
>   * Tracepoint for the registration of a single RCU callback of the special
>   * kvfree() form.  The first argument is the RCU type, the second argument
> diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> index 73a103464ea4..6419dbbaecde 100644
> --- a/kernel/rcu/rcu_segcblist.c
> +++ b/kernel/rcu/rcu_segcblist.c
> @@ -378,6 +378,33 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
>  	rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
>  }
>  
> +/*
> + * Return how many CBs are in each segment, along with their gp_seq values.
> + *
> + * This function is O(N) where N is the number of callbacks. Only used from
> + * tracing code which is usually disabled in production.
> + */
> +#ifdef CONFIG_RCU_TRACE
> +void rcu_segcblist_countseq(struct rcu_segcblist *rsclp,
> +			 int cbcount[RCU_CBLIST_NSEGS],
> +			 unsigned long gpseq[RCU_CBLIST_NSEGS])
> +{

Why not declare the arrays here and invoke the new trace event from
here also?  That would simplify the usage and save a few lines of code.
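A userspace model of that refactor might look as follows (a sketch, not kernel code: `record_segcb()` stands in for the real trace_rcu_segcb() tracepoint and just keeps the last snapshot so it can be inspected):

```c
/* Model of the suggested refactor: the counting helper owns its
 * temporary arrays and emits the trace itself, so each call site
 * shrinks to a single line. */
#include <string.h>

#define NSEGS 4

struct segcblist {
	long seglen[NSEGS];
	unsigned long gp_seq[NSEGS];
};

static long last_cbs[NSEGS];
static unsigned long last_gps[NSEGS];
static const char *last_ctx;

/* Stand-in for the tracepoint: record the snapshot. */
static void record_segcb(const char *ctx, const long *cbs,
			 const unsigned long *gps)
{
	last_ctx = ctx;
	memcpy(last_cbs, cbs, sizeof(last_cbs));
	memcpy(last_gps, gps, sizeof(last_gps));
}

/* The arrays live here instead of at every call site. */
static void segcblist_trace_counts(struct segcblist *rsclp, const char *ctx)
{
	long cbs[NSEGS];
	unsigned long gps[NSEGS];
	int i;

	for (i = 0; i < NSEGS; i++) {
		cbs[i] = rsclp->seglen[i];
		gps[i] = rsclp->gp_seq[i];
	}
	record_segcb(ctx, cbs, gps);
}
```

Call sites then become `segcblist_trace_counts(&rdp->cblist, "SegCbPreAcc");` with no local array declarations.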

						Thanx, Paul

> +	struct rcu_head **cur_tail;
> +	int i;
> +
> +	for (i = 0; i < RCU_CBLIST_NSEGS; i++)
> +		cbcount[i] = 0;
> +
> +	cur_tail = &(rsclp->head);
> +
> +	for (i = 0; i < RCU_CBLIST_NSEGS; i++) {
> +		cbcount[i] = rcu_segcblist_get_seglen(rsclp, i);
> +		gpseq[i] = rsclp->gp_seq[i];
> +		cur_tail = rsclp->tails[i];
> +	}
> +}
> +#endif
> +
>  /*
>   * Extract only those callbacks still pending (not yet ready to be
>   * invoked) from the specified rcu_segcblist structure and place them in
> diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
> index b90725f81d77..5f7cdfed0ba4 100644
> --- a/kernel/rcu/rcu_segcblist.h
> +++ b/kernel/rcu/rcu_segcblist.h
> @@ -105,3 +105,10 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq);
>  bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq);
>  void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
>  			 struct rcu_segcblist *src_rsclp);
> +#ifdef CONFIG_RCU_TRACE
> +void rcu_segcblist_countseq(struct rcu_segcblist *rsclp,
> +			 int cbcount[RCU_CBLIST_NSEGS],
> +			 unsigned long gpseq[RCU_CBLIST_NSEGS]);
> +#else
> +#define rcu_segcblist_countseq(...)
> +#endif
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 51348144a4ea..16ad99a9ebba 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1476,6 +1476,8 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp)
>  {
>  	unsigned long gp_seq_req;
>  	bool ret = false;
> +	int cbs[RCU_CBLIST_NSEGS];
> +	unsigned long gps[RCU_CBLIST_NSEGS];
>  
>  	rcu_lockdep_assert_cblist_protected(rdp);
>  	raw_lockdep_assert_held_rcu_node(rnp);
> @@ -1484,6 +1486,10 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp)
>  	if (!rcu_segcblist_pend_cbs(&rdp->cblist))
>  		return false;
>  
> +	/* Count CBs for tracing. */
> +	rcu_segcblist_countseq(&rdp->cblist, cbs, gps);
> +	trace_rcu_segcb("SegCbPreAcc", cbs, gps);
> +
>  	/*
>  	 * Callbacks are often registered with incomplete grace-period
>  	 * information.  Something about the fact that getting exact
> @@ -1504,6 +1510,10 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp)
>  	else
>  		trace_rcu_grace_period(rcu_state.name, gp_seq_req, TPS("AccReadyCB"));
>  
> +	/* Count CBs for tracing. */
> +	rcu_segcblist_countseq(&rdp->cblist, cbs, gps);
> +	trace_rcu_segcb("SegCbPostAcc", cbs, gps);
> +
>  	return ret;
>  }
>  
> @@ -2421,6 +2431,8 @@ static void rcu_do_batch(struct rcu_data *rdp)
>  	struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl);
>  	long bl, count = 0;
>  	long pending, tlimit = 0;
> +	int cbs[RCU_CBLIST_NSEGS];
> +	unsigned long gps[RCU_CBLIST_NSEGS];
>  
>  	/* If no callbacks are ready, just return. */
>  	if (!rcu_segcblist_ready_cbs(&rdp->cblist)) {
> @@ -2461,6 +2473,11 @@ static void rcu_do_batch(struct rcu_data *rdp)
>  	/* Invoke callbacks. */
>  	tick_dep_set_task(current, TICK_DEP_BIT_RCU);
>  	rhp = rcu_cblist_dequeue(&rcl);
> +
> +	/* Count CBs for tracing. */
> +	rcu_segcblist_countseq(&rdp->cblist, cbs, gps);
> +	trace_rcu_segcb("SegCbDequeued", cbs, gps);
> +
>  	for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) {
>  		rcu_callback_t f;
>  
> @@ -2929,6 +2946,8 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
>  	unsigned long flags;
>  	struct rcu_data *rdp;
>  	bool was_alldone;
> +	int cbs[RCU_CBLIST_NSEGS];
> +	unsigned long gps[RCU_CBLIST_NSEGS];
>  
>  	/* Misaligned rcu_head! */
>  	WARN_ON_ONCE((unsigned long)head & (sizeof(void *) - 1));
> @@ -2974,6 +2993,10 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
>  		trace_rcu_callback(rcu_state.name, head,
>  				   rcu_segcblist_n_cbs(&rdp->cblist));
>  
> +	/* Count CBs for tracing. */
> +	rcu_segcblist_countseq(&rdp->cblist, cbs, gps);
> +	trace_rcu_segcb("SegCBQueued", cbs, gps);
> +
>  	/* Go handle any RCU core processing required. */
>  	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
>  	    unlikely(rcu_segcblist_is_offloaded(&rdp->cblist))) {
> -- 
> 2.28.0.297.g1956fa8f8d-goog
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge
  2020-08-25 20:08   ` Paul E. McKenney
@ 2020-08-25 22:47     ` Joel Fernandes
  2020-08-26 14:20       ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Joel Fernandes @ 2020-08-25 22:47 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, boqun.feng, dave, Ingo Molnar, Josh Triplett,
	Lai Jiangshan, Madhuparna Bhowmik, Mathieu Desnoyers,
	neeraj.iitr10, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

Hi Paul,

On Tue, Aug 25, 2020 at 01:08:09PM -0700, Paul E. McKenney wrote:
> On Mon, Aug 24, 2020 at 10:48:39PM -0400, Joel Fernandes (Google) wrote:
> > The donecbs's ->len field is used to store the total length of the segmented
> > callback list. This ->len value is then added to the destination segcb
> > list's count.
> > 
> > However, this presents a problem for per-segment length counting which is added
> > in a future patch. This future patch sets the rcl->len field as we move
> > segments of callbacks between source and destination lists, thus becoming
> > incompatible with the donecb's ->len field.
> 
> OK, I will bite.  What is "rcl"?  A placeholder for donecbs and pendcbs?
> If so, please just name them both.  If not, please explain.

Ok will fix.

> > This commit therefore avoids depending on the ->len field in this way. IMHO,
> > this is also less error-prone and more accurate - the donecb's ->len field
> > should be the length of the done segment and not just be used as a temporary
> > variable.
> 
> Please also mention why ->len is handled specially at all, namely
> interactions between rcu_barrier() and callback invocation.  This is
> the answer to "why not just make all this work like normal lists?"
> This might go well in the first paragraph.

Are you referring to the cblist structure's ->len?  I know the segcblist's
->len field is what rcu_barrier() samples, but I am not changing that behavior
at all in this patch. This patch is only about the donecb's len (which is a
cblist structure on the stack).

> > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > ---
> >  kernel/rcu/rcu_segcblist.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> > index 2d2a6b6b9dfb..b70d4154433c 100644
> > --- a/kernel/rcu/rcu_segcblist.c
> > +++ b/kernel/rcu/rcu_segcblist.c
> > @@ -513,14 +513,18 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
> >  {
> >  	struct rcu_cblist donecbs;
> >  	struct rcu_cblist pendcbs;
> > +	long src_len;
> >  
> >  	rcu_cblist_init(&donecbs);
> >  	rcu_cblist_init(&pendcbs);
> > -	rcu_segcblist_extract_count(src_rsclp, &donecbs);
> > +
> > +	src_len = rcu_segcblist_xchg_len(src_rsclp, 0);
> 
> Given that both rcu_segcblist_xchg_len() and rcu_segcblist_extract_count()
> have only one callsite each, why not get rid of one of them?

Good point, I will do that.

> Or better yet, please see below, which should allow getting rid of both
> of them.
> 
> >  	rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> >  	rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> > -	rcu_segcblist_insert_count(dst_rsclp, &donecbs);
> > +
> > +	rcu_segcblist_add_len(dst_rsclp, src_len);
> >  	rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> >  	rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
> 
> Rather than adding the blank lines, why not have the rcu_cblist structures
> carry the lengths?  You are already adjusting one of the two call sites
> that care (rcu_do_batch()), and the other is srcu_invoke_callbacks().
> That should shorten this function a bit more.  And make callback handling
> much more approachable, I suspect.

Sorry, I did not understand. The rcu_cblist structure already has a length
field. I do modify rcu_segcblist_extract_done_cbs() and
rcu_segcblist_extract_pend_cbs() to carry the length already, in a later
patch.

Just to emphasize, this patch is just a small refactor to avoid an issue in
later patches. It aims to keep current functionality unchanged.

thanks,

 - Joel

> 
> There would still be the callback-invocation need to be careful with
> ->cblist.len due to rcu_barrier() and srcu_barrier().  But both of
> those should be excluded by this code.  (But don't take my word for it,
> ask KCSAN.)
> 
> 							Thanx, Paul
> 
> > +
> >  	rcu_segcblist_init(src_rsclp);
> >  }
> > -- 
> > 2.28.0.297.g1956fa8f8d-goog
> > 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 -rcu 3/4] rcu/segcblist: Add counters to segcblist datastructure
  2020-08-25 21:53   ` Paul E. McKenney
@ 2020-08-25 22:51     ` Joel Fernandes
  2020-08-26 14:24       ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Joel Fernandes @ 2020-08-25 22:51 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, boqun.feng, dave, Ingo Molnar, Josh Triplett,
	Lai Jiangshan, Madhuparna Bhowmik, Mathieu Desnoyers,
	neeraj.iitr10, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

On Tue, Aug 25, 2020 at 02:53:38PM -0700, Paul E. McKenney wrote:
> On Mon, Aug 24, 2020 at 10:48:41PM -0400, Joel Fernandes (Google) wrote:
> > Add counting of segment lengths to the segmented callback list.
> > 
> > This will be useful for a number of things such as knowing how big the
> > ready-to-execute segment has gotten. The immediate benefit is the ability
> > to trace how the callbacks in the segmented callback list change.
> > 
> > Tested by profusely reading traces when segcblist counts updated.
> 
> Might I recommend a more repeatable and sustainable test methodology?
> (Sorry, couldn't resist, given that you left yourself wide open for
> that one...)

Ah, this message was from an older series :-(. I did test it with rcutorture
for many hours. But I forgot to update this commit message. :-(

I will respond to other comments soon, I am still stuck on the comment about
rcl.len and rcu_barrier interaction that you brought up in 1/4 and 2/4.

Thanks!

 - Joel


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 -rcu 4/4] rcu/trace: Add tracing for how segcb list changes
  2020-08-25 21:55   ` Paul E. McKenney
@ 2020-08-25 22:53     ` Joel Fernandes
  0 siblings, 0 replies; 19+ messages in thread
From: Joel Fernandes @ 2020-08-25 22:53 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, boqun.feng, dave, Ingo Molnar, Josh Triplett,
	Lai Jiangshan, Madhuparna Bhowmik, Mathieu Desnoyers,
	neeraj.iitr10, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

On Tue, Aug 25, 2020 at 02:55:53PM -0700, Paul E. McKenney wrote:
> On Mon, Aug 24, 2020 at 10:48:42PM -0400, Joel Fernandes (Google) wrote:
> > Track how the segcb list changes before/after acceleration, during
> > queuing and during dequeuing.
> > 
> > This has proved useful to discover an optimization to avoid unwanted GP
> > requests when there are no callbacks accelerated. The overhead is minimal as
> > each segment's length is now stored in the respective segment.
> > 
> > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> 
> Speaking of tracing, diagnostics, and debugging, a few instances of
> ASSERT_EXCLUSIVE_WRITER() in the previous patch might save a lot
> of debugging.  Help KCSAN help us.

Ok, I will look into it, thanks.

> >  include/trace/events/rcu.h | 25 +++++++++++++++++++++++++
> >  kernel/rcu/rcu_segcblist.c | 27 +++++++++++++++++++++++++++
> >  kernel/rcu/rcu_segcblist.h |  7 +++++++
> >  kernel/rcu/tree.c          | 23 +++++++++++++++++++++++
> >  4 files changed, 82 insertions(+)
> > 
> > diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
> > index 155b5cb43cfd..7b84df3c95df 100644
> > --- a/include/trace/events/rcu.h
> > +++ b/include/trace/events/rcu.h
> > @@ -505,6 +505,31 @@ TRACE_EVENT_RCU(rcu_callback,
> >  		  __entry->qlen)
> >  );
> >  
> > +TRACE_EVENT_RCU(rcu_segcb,
> > +
> > +		TP_PROTO(const char *ctx, int *cb_count, unsigned long *gp_seq),
> > +
> > +		TP_ARGS(ctx, cb_count, gp_seq),
> > +
> > +		TP_STRUCT__entry(
> > +			__field(const char *, ctx)
> > +			__array(int, cb_count, 4)
> > +			__array(unsigned long, gp_seq, 4)
> > +		),
> > +
> > +		TP_fast_assign(
> > +			__entry->ctx = ctx;
> > +			memcpy(__entry->cb_count, cb_count, 4 * sizeof(int));
> > +			memcpy(__entry->gp_seq, gp_seq, 4 * sizeof(unsigned long));
> > +		),
> > +
> > +		TP_printk("%s cb_count: (DONE=%d, WAIT=%d, NEXT_READY=%d, NEXT=%d) "
> > +			  "gp_seq: (DONE=%lu, WAIT=%lu, NEXT_READY=%lu, NEXT=%lu)", __entry->ctx,
> > +			  __entry->cb_count[0], __entry->cb_count[1], __entry->cb_count[2], __entry->cb_count[3],
> > +			  __entry->gp_seq[0], __entry->gp_seq[1], __entry->gp_seq[2], __entry->gp_seq[3])
> > +
> > +);
> > +
> >  /*
> >   * Tracepoint for the registration of a single RCU callback of the special
> >   * kvfree() form.  The first argument is the RCU type, the second argument
> > diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> > index 73a103464ea4..6419dbbaecde 100644
> > --- a/kernel/rcu/rcu_segcblist.c
> > +++ b/kernel/rcu/rcu_segcblist.c
> > @@ -378,6 +378,33 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
> >  	rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
> >  }
> >  
> > +/*
> > + * Return how many CBs are in each segment, along with their gp_seq values.
> > + *
> > + * This function is O(N) where N is the number of callbacks. Only used from
> > + * tracing code which is usually disabled in production.
> > + */
> > +#ifdef CONFIG_RCU_TRACE
> > +void rcu_segcblist_countseq(struct rcu_segcblist *rsclp,
> > +			 int cbcount[RCU_CBLIST_NSEGS],
> > +			 unsigned long gpseq[RCU_CBLIST_NSEGS])
> > +{
> 
> Why not declare the arrays here and invoke the new trace event from
> here also?  That would simplify the usage and save a few lines of code.

True! I will do it that way.

thanks,

 - Joel


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge
  2020-08-25 22:47     ` Joel Fernandes
@ 2020-08-26 14:20       ` Paul E. McKenney
  2020-08-27 22:55         ` Joel Fernandes
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2020-08-26 14:20 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: linux-kernel, boqun.feng, dave, Ingo Molnar, Josh Triplett,
	Lai Jiangshan, Madhuparna Bhowmik, Mathieu Desnoyers,
	neeraj.iitr10, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

On Tue, Aug 25, 2020 at 06:47:23PM -0400, Joel Fernandes wrote:
> Hi Paul,
> 
> On Tue, Aug 25, 2020 at 01:08:09PM -0700, Paul E. McKenney wrote:
> > On Mon, Aug 24, 2020 at 10:48:39PM -0400, Joel Fernandes (Google) wrote:
> > > The donecbs's ->len field is used to store the total count of the segmented
> > > callback list's length. This ->len field is then added to the destination segcb
> > > list.
> > > 
> > > However, this presents a problem for per-segment length counting which is added
> > > in a future patch. This future patch sets the rcl->len field as we move
> > > segments of callbacks between source and destination lists, thus becoming
> > > incompatible with the donecb's ->len field.
> > 
> > OK, I will bite.  What is "rcl"?  A placeholder for donecbs and pendcbs?
> > If so, please just name them both.  If not, please explain.
> 
> Ok will fix.
> 
> > > This commit therefore avoids depending on the ->len field in this way. IMHO,
> > > this is also less error-prone and is more accurate - the donecb's ->len field
> > > should be the length of the done segment and not just used as a temporarily
> > > variable.
> > 
> > Please also mention why ->len is handled specially at all, namely
> > interactions between rcu_barrier() and callback invocation.  This is
> > the answer to "why not just make all this work like normal lists?"
> > This might go well in the first paragraph.
> 
> Are you referring to the cblist structures ->len?  I know the segcblist's
> ->len field is what rcu_barrier() samples but I am not changing that behavior
> at all in this patch. This patch is only about the donecb's len (which is a
> cblist structure on the stack).

Yes, we agree.  I am just suggesting that you call this out in the
commit log.  It is probably not obvious to those who have not been
through the code yet.  ;-)

> > > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > > ---
> > >  kernel/rcu/rcu_segcblist.c | 8 ++++++--
> > >  1 file changed, 6 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> > > index 2d2a6b6b9dfb..b70d4154433c 100644
> > > --- a/kernel/rcu/rcu_segcblist.c
> > > +++ b/kernel/rcu/rcu_segcblist.c
> > > @@ -513,14 +513,18 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
> > >  {
> > >  	struct rcu_cblist donecbs;
> > >  	struct rcu_cblist pendcbs;
> > > +	long src_len;
> > >  
> > >  	rcu_cblist_init(&donecbs);
> > >  	rcu_cblist_init(&pendcbs);
> > > -	rcu_segcblist_extract_count(src_rsclp, &donecbs);
> > > +
> > > +	src_len = rcu_segcblist_xchg_len(src_rsclp, 0);
> > 
> > Given that both rcu_segcblist_xchg_len() and rcu_segcblist_extract_count()
> > have only one callsite each, why not get rid of one of them?
> 
> Good point, I will do that.
> 
> > Or better yet, please see below, which should allow getting rid of both
> > of them.
> > 
> > >  	rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> > >  	rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> > > -	rcu_segcblist_insert_count(dst_rsclp, &donecbs);
> > > +
> > > +	rcu_segcblist_add_len(dst_rsclp, src_len);
> > >  	rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> > >  	rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
> > 
> > Rather than adding the blank lines, why not have the rcu_cblist structures
> > carry the lengths?  You are already adjusting one of the two call sites
> > that care (rcu_do_batch()), and the other is srcu_invoke_callbacks().
> > That should shorten this function a bit more.  And make callback handling
> > much more approachable, I suspect.
> 
> Sorry, I did not understand. The rcu_cblist structure already has a length
> field. I do modify rcu_segcblist_extract_done_cbs() and
> rcu_segcblist_extract_pend_cbs() to carry the length already, in a later
> patch.
> 
> Just to emphasize, this patch is just a small refactor to avoid an issue in
> later patches. It aims to keep current functionality unchanged.

True enough.  I am just suggesting that an equally small refactor in
a slightly different direction should get to a better place.  The key
point enabling this slightly different direction is that this code is
an exception to the "preserve ->cblist.len" rule because it is invoked
only from the CPU hotplug code.

So you could use the rcu_cblist .len field to update the ->cblist.len
field, thus combining the _cbs and _count updates.  One thing that helps
is that setting the rcu_cblist .len field doesn't hurt the other use
cases that require careful handling of ->cblist.len.

							Thanx, Paul

> thanks,
> 
>  - Joel
> 
> > 
> > There would still be the callback-invocation need to be careful with
> > ->cblist.len due to rcu_barrier() and srcu_barrier().  But both of
> > those should be excluded by this code.  (But don't take my word for it,
> > ask KCSAN.)
> > 
> > 							Thanx, Paul
> > 
> > > +
> > >  	rcu_segcblist_init(src_rsclp);
> > >  }
> > > -- 
> > > 2.28.0.297.g1956fa8f8d-goog
> > > 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 -rcu 3/4] rcu/segcblist: Add counters to segcblist datastructure
  2020-08-25 22:51     ` Joel Fernandes
@ 2020-08-26 14:24       ` Paul E. McKenney
  0 siblings, 0 replies; 19+ messages in thread
From: Paul E. McKenney @ 2020-08-26 14:24 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: linux-kernel, boqun.feng, dave, Ingo Molnar, Josh Triplett,
	Lai Jiangshan, Madhuparna Bhowmik, Mathieu Desnoyers,
	neeraj.iitr10, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

On Tue, Aug 25, 2020 at 06:51:03PM -0400, Joel Fernandes wrote:
> On Tue, Aug 25, 2020 at 02:53:38PM -0700, Paul E. McKenney wrote:
> > On Mon, Aug 24, 2020 at 10:48:41PM -0400, Joel Fernandes (Google) wrote:
> > > Add counting of segment lengths of segmented callback list.
> > > 
> > > This will be useful for a number of things such as knowing how big the
> > > ready-to-execute segment has gotten. The immediate benefit is the ability
> > > to trace how the callbacks in the segmented callback list change.
> > > 
> > > Tested by profusely reading traces when segcblist counts updated.
> > 
> > Might I recommend a more repeatable and sustainable test methodology?
> > (Sorry, couldn't resist, given that you left yourself wide open for
> > that one...)
> 
> Ah, this message was from an older series :-(. I did test it with rcutorture
> for many hours. But I forgot to update this commit message. :-(

That is better.  Let me see if I can find a more targeted test from the
old days...

> I will respond to other comments soon, I am still stuck on the comment about
> rcl.len and rcu_barrier interaction that you brought up in 1/4 and 2/4.

Hoping my more recent replies help.  If not, you know where to find me.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge
  2020-08-26 14:20       ` Paul E. McKenney
@ 2020-08-27 22:55         ` Joel Fernandes
  2020-08-28 14:18           ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Joel Fernandes @ 2020-08-27 22:55 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, boqun.feng, dave, Ingo Molnar, Josh Triplett,
	Lai Jiangshan, Madhuparna Bhowmik, Mathieu Desnoyers,
	neeraj.iitr10, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

On Wed, Aug 26, 2020 at 07:20:28AM -0700, Paul E. McKenney wrote:
[...]
> > > Or better yet, please see below, which should allow getting rid of both
> > > of them.
> > > 
> > > >  	rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> > > >  	rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> > > > -	rcu_segcblist_insert_count(dst_rsclp, &donecbs);
> > > > +
> > > > +	rcu_segcblist_add_len(dst_rsclp, src_len);
> > > >  	rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> > > >  	rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
> > > 
> > > Rather than adding the blank lines, why not have the rcu_cblist structures
> > > carry the lengths?  You are already adjusting one of the two call sites
> > > that care (rcu_do_batch()), and the other is srcu_invoke_callbacks().
> > > That should shorten this function a bit more.  And make callback handling
> > > much more approachable, I suspect.
> > 
> > Sorry, I did not understand. The rcu_cblist structure already has a length
> > field. I do modify rcu_segcblist_extract_done_cbs() and
> > rcu_segcblist_extract_pend_cbs() to carry the length already, in a later
> > patch.
> > 
> > Just to emphasize, this patch is just a small refactor to avoid an issue in
> > later patches. It aims to keep current functionality unchanged.
> 
> True enough.  I am just suggesting that an equally small refactor in
> a slightly different direction should get to a better place.  The key
> point enabling this slightly different direction is that this code is
> an exception to the "preserve ->cblist.len" rule because it is invoked
> only from the CPU hotplug code.
> 
> So you could use the rcu_cblist .len field to update the ->cblist.len
> field, thus combining the _cbs and _count updates.  One thing that helps
> is that setting th e rcu_cblist .len field doesn't hurt the other use
> cases that require careful handling of ->cblist.len.

Thank you for the ideas. I am trying something like this on top of this
series based on them. One thing I am a bit concerned about is whether getting
rid of the rcu_segcblist_xchg_len() function (which has memory barriers in
it) causes issues in the hotplug path. I am now directly updating the length
without additional memory barriers. I will test it more and try to reason
more about it as well.

---8<-----------------------

From: Joel Fernandes <joelaf@google.com>
Date: Thu, 27 Aug 2020 18:30:25 -0400
Subject: [PATCH] fixup! rcu/segcblist: Do not depend on donecbs ->len to store
 the segcb len during merge

Signed-off-by: Joel Fernandes <joelaf@google.com>
---
 kernel/rcu/rcu_segcblist.c | 38 ++++----------------------------------
 1 file changed, 4 insertions(+), 34 deletions(-)

diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index 79c2cbe388c5..c33abbc97a07 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -175,26 +175,6 @@ void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp)
 	rcu_segcblist_add_len(rsclp, 1);
 }
 
-/*
- * Exchange the numeric length of the specified rcu_segcblist structure
- * with the specified value.  This can cause the ->len field to disagree
- * with the actual number of callbacks on the structure.  This exchange is
- * fully ordered with respect to the callers accesses both before and after.
- */
-static long rcu_segcblist_xchg_len(struct rcu_segcblist *rsclp, long v)
-{
-#ifdef CONFIG_RCU_NOCB_CPU
-	return atomic_long_xchg(&rsclp->len, v);
-#else
-	long ret = rsclp->len;
-
-	smp_mb(); /* Up to the caller! */
-	WRITE_ONCE(rsclp->len, v);
-	smp_mb(); /* Up to the caller! */
-	return ret;
-#endif
-}
-
 /*
  * Initialize an rcu_segcblist structure.
  */
@@ -361,6 +341,7 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
 		if (rsclp->tails[i] == rsclp->tails[RCU_DONE_TAIL])
 			WRITE_ONCE(rsclp->tails[i], &rsclp->head);
 	rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
+	rcu_segcblist_add_len(rsclp, -(rclp->len));
 }
 
 /*
@@ -414,17 +395,7 @@ void rcu_segcblist_extract_pend_cbs(struct rcu_segcblist *rsclp,
 		WRITE_ONCE(rsclp->tails[i], rsclp->tails[RCU_DONE_TAIL]);
 		rcu_segcblist_set_seglen(rsclp, i, 0);
 	}
-}
-
-/*
- * Insert counts from the specified rcu_cblist structure in the
- * specified rcu_segcblist structure.
- */
-void rcu_segcblist_insert_count(struct rcu_segcblist *rsclp,
-				struct rcu_cblist *rclp)
-{
-	rcu_segcblist_add_len(rsclp, rclp->len);
-	rclp->len = 0;
+	rcu_segcblist_add_len(rsclp, -(rclp->len));
 }
 
 /*
@@ -448,6 +419,7 @@ void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
 			break;
 	rclp->head = NULL;
 	rclp->tail = &rclp->head;
+	rcu_segcblist_add_len(rsclp, rclp->len);
 }
 
 /*
@@ -463,6 +435,7 @@ void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
 	rcu_segcblist_add_seglen(rsclp, RCU_NEXT_TAIL, rclp->len);
 	WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rclp->head);
 	WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], rclp->tail);
+	rcu_segcblist_add_len(rsclp, rclp->len);
 }
 
 /*
@@ -601,16 +574,13 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
 {
 	struct rcu_cblist donecbs;
 	struct rcu_cblist pendcbs;
-	long src_len;
 
 	rcu_cblist_init(&donecbs);
 	rcu_cblist_init(&pendcbs);
 
-	src_len = rcu_segcblist_xchg_len(src_rsclp, 0);
 	rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
 	rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
 
-	rcu_segcblist_add_len(dst_rsclp, src_len);
 	rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
 	rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
 
-- 
2.28.0.402.g5ffc5be6b7-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge
  2020-08-27 22:55         ` Joel Fernandes
@ 2020-08-28 14:18           ` Paul E. McKenney
  2020-09-01 15:06             ` Joel Fernandes
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2020-08-28 14:18 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: linux-kernel, boqun.feng, dave, Ingo Molnar, Josh Triplett,
	Lai Jiangshan, Madhuparna Bhowmik, Mathieu Desnoyers,
	neeraj.iitr10, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

On Thu, Aug 27, 2020 at 06:55:18PM -0400, Joel Fernandes wrote:
> On Wed, Aug 26, 2020 at 07:20:28AM -0700, Paul E. McKenney wrote:
> [...]
> > > > Or better yet, please see below, which should allow getting rid of both
> > > > of them.
> > > > 
> > > > >  	rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> > > > >  	rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> > > > > -	rcu_segcblist_insert_count(dst_rsclp, &donecbs);
> > > > > +
> > > > > +	rcu_segcblist_add_len(dst_rsclp, src_len);
> > > > >  	rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> > > > >  	rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
> > > > 
> > > > Rather than adding the blank lines, why not have the rcu_cblist structures
> > > > carry the lengths?  You are already adjusting one of the two call sites
> > > > that care (rcu_do_batch()), and the other is srcu_invoke_callbacks().
> > > > That should shorten this function a bit more.  And make callback handling
> > > > much more approachable, I suspect.
> > > 
> > > Sorry, I did not understand. The rcu_cblist structure already has a length
> > > field. I do modify rcu_segcblist_extract_done_cbs() and
> > > rcu_segcblist_extract_pend_cbs() to carry the length already, in a later
> > > patch.
> > > 
> > > Just to emphasize, this patch is just a small refactor to avoid an issue in
> > > later patches. It aims to keep current functionality unchanged.
> > 
> > True enough.  I am just suggesting that an equally small refactor in
> > a slightly different direction should get to a better place.  The key
> > point enabling this slightly different direction is that this code is
> > an exception to the "preserve ->cblist.len" rule because it is invoked
> > only from the CPU hotplug code.
> > 
> > So you could use the rcu_cblist .len field to update the ->cblist.len
> > field, thus combining the _cbs and _count updates.  One thing that helps
> > is that setting the rcu_cblist .len field doesn't hurt the other use
> > cases that require careful handling of ->cblist.len.
> 
> Thank you for the ideas. I am trying something like this on top of this
> series based on the ideas. One thing I concerned a bit is if getting rid of
> the rcu_segcblist_xchg_len() function (which has memory barriers in them)
> causes issues in the hotplug path. I am now directly updating the length
> without additional memory barriers. I will test it more and try to reason
> more about it as well.

In this particular case, the CPU-hotplug locks prevent rcu_barrier()
from running concurrently, so it should be OK.  Is there an easy way
to make lockdep help us check this?  Does lockdep_assert_cpus_held()
suffice, or is it too easily satisfied?

> ---8<-----------------------
> 
> From: Joel Fernandes <joelaf@google.com>
> Date: Thu, 27 Aug 2020 18:30:25 -0400
> Subject: [PATCH] fixup! rcu/segcblist: Do not depend on donecbs ->len to store
>  the segcb len during merge
> 
> Signed-off-by: Joel Fernandes <joelaf@google.com>
> ---
>  kernel/rcu/rcu_segcblist.c | 38 ++++----------------------------------
>  1 file changed, 4 insertions(+), 34 deletions(-)
> 
> diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> index 79c2cbe388c5..c33abbc97a07 100644
> --- a/kernel/rcu/rcu_segcblist.c
> +++ b/kernel/rcu/rcu_segcblist.c
> @@ -175,26 +175,6 @@ void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp)
>  	rcu_segcblist_add_len(rsclp, 1);
>  }
>  
> -/*
> - * Exchange the numeric length of the specified rcu_segcblist structure
> - * with the specified value.  This can cause the ->len field to disagree
> - * with the actual number of callbacks on the structure.  This exchange is
> - * fully ordered with respect to the callers accesses both before and after.
> - */
> -static long rcu_segcblist_xchg_len(struct rcu_segcblist *rsclp, long v)
> -{
> -#ifdef CONFIG_RCU_NOCB_CPU
> -	return atomic_long_xchg(&rsclp->len, v);
> -#else
> -	long ret = rsclp->len;
> -
> -	smp_mb(); /* Up to the caller! */
> -	WRITE_ONCE(rsclp->len, v);
> -	smp_mb(); /* Up to the caller! */
> -	return ret;
> -#endif
> -}
> -

This looks nice!

>  /*
>   * Initialize an rcu_segcblist structure.
>   */
> @@ -361,6 +341,7 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
>  		if (rsclp->tails[i] == rsclp->tails[RCU_DONE_TAIL])
>  			WRITE_ONCE(rsclp->tails[i], &rsclp->head);
>  	rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
> +	rcu_segcblist_add_len(rsclp, -(rclp->len));
>  }
>  
>  /*
> @@ -414,17 +395,7 @@ void rcu_segcblist_extract_pend_cbs(struct rcu_segcblist *rsclp,
>  		WRITE_ONCE(rsclp->tails[i], rsclp->tails[RCU_DONE_TAIL]);
>  		rcu_segcblist_set_seglen(rsclp, i, 0);
>  	}
> -}
> -
> -/*
> - * Insert counts from the specified rcu_cblist structure in the
> - * specified rcu_segcblist structure.
> - */
> -void rcu_segcblist_insert_count(struct rcu_segcblist *rsclp,
> -				struct rcu_cblist *rclp)
> -{
> -	rcu_segcblist_add_len(rsclp, rclp->len);
> -	rclp->len = 0;
> +	rcu_segcblist_add_len(rsclp, -(rclp->len));

As does this.  ;-)

>  }
>  
>  /*
> @@ -448,6 +419,7 @@ void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
>  			break;
>  	rclp->head = NULL;
>  	rclp->tail = &rclp->head;
> +	rcu_segcblist_add_len(rsclp, rclp->len);

Does there need to be a compensating action in rcu_do_batch(), or is
this the point of the "rcu_segcblist_add_len(rsclp, -(rclp->len));"
added to rcu_segcblist_extract_done_cbs() above?

If so, a comment would be good.

>  }
>  
>  /*
> @@ -463,6 +435,7 @@ void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
>  	rcu_segcblist_add_seglen(rsclp, RCU_NEXT_TAIL, rclp->len);
>  	WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rclp->head);
>  	WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], rclp->tail);
> +	rcu_segcblist_add_len(rsclp, rclp->len);
>  }
>  
>  /*
> @@ -601,16 +574,13 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
>  {
>  	struct rcu_cblist donecbs;
>  	struct rcu_cblist pendcbs;
> -	long src_len;
>  
>  	rcu_cblist_init(&donecbs);
>  	rcu_cblist_init(&pendcbs);
>  
> -	src_len = rcu_segcblist_xchg_len(src_rsclp, 0);
>  	rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
>  	rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
>  
> -	rcu_segcblist_add_len(dst_rsclp, src_len);
>  	rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
>  	rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);

Can we now pair the corresponding _extract_ and _insert_ calls, thus
requiring only one rcu_cblist structure?  This would drop two more lines
of code.  Or would that break something?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge
  2020-08-28 14:18           ` Paul E. McKenney
@ 2020-09-01 15:06             ` Joel Fernandes
  2020-09-01 16:26               ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Joel Fernandes @ 2020-09-01 15:06 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, boqun.feng, dave, Ingo Molnar, Josh Triplett,
	Lai Jiangshan, Madhuparna Bhowmik, Mathieu Desnoyers,
	neeraj.iitr10, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

Hi,
Resuming regular activities here, post-LPC :)

On Fri, Aug 28, 2020 at 07:18:55AM -0700, Paul E. McKenney wrote:
> On Thu, Aug 27, 2020 at 06:55:18PM -0400, Joel Fernandes wrote:
> > On Wed, Aug 26, 2020 at 07:20:28AM -0700, Paul E. McKenney wrote:
> > [...]
> > > > > Or better yet, please see below, which should allow getting rid of both
> > > > > of them.
> > > > > 
> > > > > >  	rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> > > > > >  	rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> > > > > > -	rcu_segcblist_insert_count(dst_rsclp, &donecbs);
> > > > > > +
> > > > > > +	rcu_segcblist_add_len(dst_rsclp, src_len);
> > > > > >  	rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> > > > > >  	rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
> > > > > 
> > > > > Rather than adding the blank lines, why not have the rcu_cblist structures
> > > > > carry the lengths?  You are already adjusting one of the two call sites
> > > > > that care (rcu_do_batch()), and the other is srcu_invoke_callbacks().
> > > > > That should shorten this function a bit more.  And make callback handling
> > > > > much more approachable, I suspect.
> > > > 
> > > > Sorry, I did not understand. The rcu_cblist structure already has a length
> > > > field. I do modify rcu_segcblist_extract_done_cbs() and
> > > > rcu_segcblist_extract_pend_cbs() to carry the length already, in a later
> > > > patch.
> > > > 
> > > > Just to emphasize, this patch is just a small refactor to avoid an issue in
> > > > later patches. It aims to keep current functionality unchanged.
> > > 
> > > True enough.  I am just suggesting that an equally small refactor in
> > > a slightly different direction should get to a better place.  The key
> > > point enabling this slightly different direction is that this code is
> > > an exception to the "preserve ->cblist.len" rule because it is invoked
> > > only from the CPU hotplug code.
> > > 
> > > So you could use the rcu_cblist .len field to update the ->cblist.len
> > > field, thus combining the _cbs and _count updates.  One thing that helps
> > > is that setting the rcu_cblist .len field doesn't hurt the other use
> > > cases that require careful handling of ->cblist.len.
> > 
> > Thank you for the ideas. I am trying something like this on top of this
> > series based on the ideas. One thing I concerned a bit is if getting rid of
> > the rcu_segcblist_xchg_len() function (which has memory barriers in them)
> > causes issues in the hotplug path. I am now directly updating the length
> > without additional memory barriers. I will test it more and try to reason
> > more about it as well.
> 
> In this particular case, the CPU-hotplug locks prevent rcu_barrier()
> from running concurrently, so it should be OK.  Is there an easy way
> to make lockdep help us check this?  Does lockdep_assert_cpus_held()
> suffice, or is it too easily satisfied?

Just to clarify, the race you mention is that rcu_barrier() should not run
concurrently with a CPU-hotplug operation.

The reason is:
rcu_barrier() checks whether the segmented list's length is 0 before sending
IPIs to entrain a callback onto the destination CPU. But if it reads a stale
length value due to a lack of memory barriers, it may fail to send an IPI,
causing the barrier to fail. Please correct me if I'm wrong though.

Due to CPU-hotplug locking, such a race is impossible because both
rcu_barrier() and the CPU migrating the callbacks hold the hotplug lock.

I think the lockdep_assert_cpus_held() may suffice. I can add it to the
rcu_segcblist_merge() function.

BTW, I think we can simplify rcu_barrier() a bit by combining the tracepoints
and getting rid of one outer if clause (diff at end of email).

> >  }
> >  
> >  /*
> > @@ -448,6 +419,7 @@ void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
> >  			break;
> >  	rclp->head = NULL;
> >  	rclp->tail = &rclp->head;
> > +	rcu_segcblist_add_len(rsclp, rclp->len);
> 
> Does there need to be a compensating action in rcu_do_batch(), or is
> this the point of the "rcu_segcblist_add_len(rsclp, -(rclp->len));"
> added to rcu_segcblist_extract_done_cbs() above?
> 
> If so, a comment would be good.

I think external compensation by rcu_do_batch() is not needed; it is best to
handle all cblist.len modifications internally in the segcblist API where
possible.

> >  /*
> > @@ -463,6 +435,7 @@ void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
> >  	rcu_segcblist_add_seglen(rsclp, RCU_NEXT_TAIL, rclp->len);
> >  	WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rclp->head);
> >  	WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], rclp->tail);
> > +	rcu_segcblist_add_len(rsclp, rclp->len);
> >  }
> >  
> >  /*
> > @@ -601,16 +574,13 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
> >  {
> >  	struct rcu_cblist donecbs;
> >  	struct rcu_cblist pendcbs;
> > -	long src_len;
> >  
> >  	rcu_cblist_init(&donecbs);
> >  	rcu_cblist_init(&pendcbs);
> >  
> > -	src_len = rcu_segcblist_xchg_len(src_rsclp, 0);
> >  	rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> >  	rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> >  
> > -	rcu_segcblist_add_len(dst_rsclp, src_len);
> >  	rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> >  	rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
> 
> Can we now pair the corresponding _extract_ and _insert_ calls, thus
> requiring only one rcu_cblist structure?  This would drop two more lines
> of code.  Or would that break something?

That could work as well, yes. But it may not be worth adding another function
to combine the extract/insert operations, since the extract and insert
operations are needed separately by rcutree and srcutree as well (not just
hotplug).

thanks,

 - Joel

---8<-----------------------

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 16ad99a9ebba..274a1845ad38 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3856,17 +3856,16 @@ void rcu_barrier(void)
 		if (cpu_is_offline(cpu) &&
 		    !rcu_segcblist_is_offloaded(&rdp->cblist))
 			continue;
-		if (rcu_segcblist_n_cbs(&rdp->cblist) && cpu_online(cpu)) {
-			rcu_barrier_trace(TPS("OnlineQ"), cpu,
-					  rcu_state.barrier_sequence);
-			smp_call_function_single(cpu, rcu_barrier_func, (void *)cpu, 1);
-		} else if (rcu_segcblist_n_cbs(&rdp->cblist) &&
-			   cpu_is_offline(cpu)) {
-			rcu_barrier_trace(TPS("OfflineNoCBQ"), cpu,
-					  rcu_state.barrier_sequence);
-			local_irq_disable();
-			rcu_barrier_func((void *)cpu);
-			local_irq_enable();
+		if (rcu_segcblist_n_cbs(&rdp->cblist)) {
+			rcu_barrier_trace(cpu_online(cpu) ? TPS("OnlineQ") : TPS("OfflineNoCBQ"),
+					  cpu, rcu_state.barrier_sequence);
+			if (cpu_online(cpu)) {
+				smp_call_function_single(cpu, rcu_barrier_func, (void *)cpu, 1);
+			} else {
+				local_irq_disable();
+				rcu_barrier_func((void *)cpu);
+				local_irq_enable();
+			}
 		} else if (cpu_is_offline(cpu)) {
 			rcu_barrier_trace(TPS("OfflineNoCBNoQ"), cpu,
 					  rcu_state.barrier_sequence);

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge
  2020-09-01 15:06             ` Joel Fernandes
@ 2020-09-01 16:26               ` Paul E. McKenney
  0 siblings, 0 replies; 19+ messages in thread
From: Paul E. McKenney @ 2020-09-01 16:26 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: linux-kernel, boqun.feng, dave, Ingo Molnar, Josh Triplett,
	Lai Jiangshan, Madhuparna Bhowmik, Mathieu Desnoyers,
	neeraj.iitr10, rcu, Steven Rostedt, Uladzislau Rezki (Sony),
	vineethrp

On Tue, Sep 01, 2020 at 11:06:09AM -0400, Joel Fernandes wrote:
> Hi,
> Resuming regular activities here, post-LPC :)
> 
> On Fri, Aug 28, 2020 at 07:18:55AM -0700, Paul E. McKenney wrote:
> > On Thu, Aug 27, 2020 at 06:55:18PM -0400, Joel Fernandes wrote:
> > > On Wed, Aug 26, 2020 at 07:20:28AM -0700, Paul E. McKenney wrote:
> > > [...]
> > > > > > Or better yet, please see below, which should allow getting rid of both
> > > > > > of them.
> > > > > > 
> > > > > > >  	rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> > > > > > >  	rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> > > > > > > -	rcu_segcblist_insert_count(dst_rsclp, &donecbs);
> > > > > > > +
> > > > > > > +	rcu_segcblist_add_len(dst_rsclp, src_len);
> > > > > > >  	rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> > > > > > >  	rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
> > > > > > 
> > > > > > Rather than adding the blank lines, why not have the rcu_cblist structures
> > > > > > carry the lengths?  You are already adjusting one of the two call sites
> > > > > > that care (rcu_do_batch()), and the other is srcu_invoke_callbacks().
> > > > > > That should shorten this function a bit more.  And make callback handling
> > > > > > much more approachable, I suspect.
> > > > > 
> > > > > Sorry, I did not understand. The rcu_cblist structure already has a length
> > > > > field. I do modify rcu_segcblist_extract_done_cbs() and
> > > > > rcu_segcblist_extract_pend_cbs() to carry the length already, in a later
> > > > > patch.
> > > > > 
> > > > > Just to emphasize, this patch is just a small refactor to avoid an issue in
> > > > > later patches. It aims to keep current functionality unchanged.
> > > > 
> > > > True enough.  I am just suggesting that an equally small refactor in
> > > > a slightly different direction should get to a better place.  The key
> > > > point enabling this slightly different direction is that this code is
> > > > an exception to the "preserve ->cblist.len" rule because it is invoked
> > > > only from the CPU hotplug code.
> > > > 
> > > > So you could use the rcu_cblist .len field to update the ->cblist.len
> > > > field, thus combining the _cbs and _count updates.  One thing that helps
> > > > is that setting the rcu_cblist .len field doesn't hurt the other use
> > > > cases that require careful handling of ->cblist.len.
> > > 
> > > Thank you for the ideas. I am trying something like this on top of this
> > > series based on them. One thing I am a bit concerned about is whether
> > > getting rid of the rcu_segcblist_xchg_len() function (which has memory
> > > barriers in it) causes issues in the hotplug path. I am now directly
> > > updating the length without additional memory barriers. I will test it
> > > more and try to reason more about it as well.
> > 
> > In this particular case, the CPU-hotplug locks prevent rcu_barrier()
> > from running concurrently, so it should be OK.  Is there an easy way
> > to make lockdep help us check this?  Does lockdep_assert_cpus_held()
> > suffice, or is it too easily satisfied?
> 
> Just to clarify, the race you mention is that rcu_barrier() should not run
> concurrently with a CPU hotplug operation.
> 
> The reason is:
> rcu_barrier() checks whether the segmented list's length is 0 before sending
> IPIs to entrain a callback onto the destination CPU. But if it misreads the
> length's value due to a lack of memory barriers, it may miss sending an IPI,
> causing the barrier to fail. Please correct me if I'm wrong though.

That is the concern for code that is not part of a CPU-hotplug operation.

> Due to CPU hotplug locking, such a race is impossible because both
> rcu_barrier() and the CPU migrating the callbacks hold it.

Exactly.

> I think the lockdep_assert_cpus_held() may suffice. I can add it to the
> rcu_segcblist_merge() function.

The main thing to check is that the following results in a lockdep splat:

	cpus_read_lock();
	lockdep_assert_cpus_held();

> BTW, I think we can simplify rcu_barrier() a bit, combining the tracepoints
> and getting rid of one outer if clause (diff at end of email).

If you were already changing this function for some reason, OK.  But you
are trading a pair of lines for a long line (though still less than the
new 100-character limit), so as a standalone patch I am left uninspired.

> > >  }
> > >  
> > >  /*
> > > @@ -448,6 +419,7 @@ void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
> > >  			break;
> > >  	rclp->head = NULL;
> > >  	rclp->tail = &rclp->head;
> > > +	rcu_segcblist_add_len(rsclp, rclp->len);
> > 
> > Does there need to be a compensating action in rcu_do_batch(), or is
> > this the point of the "rcu_segcblist_add_len(rsclp, -(rclp->len));"
> > added to rcu_segcblist_extract_done_cbs() above?
> > 
> > If so, a comment would be good.
> 
> I think external compensation by rcu_do_batch() is not needed; it's best to
> handle all cblist.len modifications internally in the segcblist API
> where possible.

At the very least, the segcblist function providing a negative-length
list needs to very carefully explain itself!!!  ;-)

> > >  /*
> > > @@ -463,6 +435,7 @@ void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
> > >  	rcu_segcblist_add_seglen(rsclp, RCU_NEXT_TAIL, rclp->len);
> > >  	WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rclp->head);
> > >  	WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], rclp->tail);
> > > +	rcu_segcblist_add_len(rsclp, rclp->len);
> > >  }
> > >  
> > >  /*
> > > @@ -601,16 +574,13 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
> > >  {
> > >  	struct rcu_cblist donecbs;
> > >  	struct rcu_cblist pendcbs;
> > > -	long src_len;
> > >  
> > >  	rcu_cblist_init(&donecbs);
> > >  	rcu_cblist_init(&pendcbs);
> > >  
> > > -	src_len = rcu_segcblist_xchg_len(src_rsclp, 0);
> > >  	rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> > >  	rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> > >  
> > > -	rcu_segcblist_add_len(dst_rsclp, src_len);
> > >  	rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> > >  	rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
> > 
> > Can we now pair the corresponding _extract_ and _insert_ calls, thus
> > requiring only one rcu_cblist structure?  This would drop two more lines
> > of code.  Or would that break something?
> 
> That could work as well, yes. But it may not be worth adding another function
> to combine the extract/insert operations, since the extract and insert
> operations are needed by rcutree and srcutree as well (other than hotplug).

I was thinking in terms of just reordering the calls to the existing
functions, though I clearly was having trouble counting.  Like this:

 	struct rcu_cblist cbs;
 
 	rcu_cblist_init(&cbs);
 	rcu_segcblist_extract_done_cbs(src_rsclp, &cbs);
 	rcu_segcblist_insert_done_cbs(dst_rsclp, &cbs);
 	rcu_cblist_init(&cbs);
 	rcu_segcblist_extract_pend_cbs(src_rsclp, &cbs);
 	rcu_segcblist_insert_pend_cbs(dst_rsclp, &cbs);

							Thanx, Paul

> thanks,
> 
>  - Joel
> 
> ---8<-----------------------
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 16ad99a9ebba..274a1845ad38 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3856,17 +3856,16 @@ void rcu_barrier(void)
>  		if (cpu_is_offline(cpu) &&
>  		    !rcu_segcblist_is_offloaded(&rdp->cblist))
>  			continue;
> -		if (rcu_segcblist_n_cbs(&rdp->cblist) && cpu_online(cpu)) {
> -			rcu_barrier_trace(TPS("OnlineQ"), cpu,
> -					  rcu_state.barrier_sequence);
> -			smp_call_function_single(cpu, rcu_barrier_func, (void *)cpu, 1);
> -		} else if (rcu_segcblist_n_cbs(&rdp->cblist) &&
> -			   cpu_is_offline(cpu)) {
> -			rcu_barrier_trace(TPS("OfflineNoCBQ"), cpu,
> -					  rcu_state.barrier_sequence);
> -			local_irq_disable();
> -			rcu_barrier_func((void *)cpu);
> -			local_irq_enable();
> +		if (rcu_segcblist_n_cbs(&rdp->cblist)) {
> +			rcu_barrier_trace(cpu_online(cpu) ? TPS("OnlineQ") : TPS("OfflineNoCBQ"),
> +					  cpu, rcu_state.barrier_sequence);
> +			if (cpu_online(cpu)) {
> +				smp_call_function_single(cpu, rcu_barrier_func, (void *)cpu, 1);
> +			} else {
> +				local_irq_disable();
> +				rcu_barrier_func((void *)cpu);
> +				local_irq_enable();
> +			}
>  		} else if (cpu_is_offline(cpu)) {
>  			rcu_barrier_trace(TPS("OfflineNoCBNoQ"), cpu,
>  					  rcu_state.barrier_sequence);


end of thread, other threads:[~2020-09-01 16:27 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-25  2:48 [PATCH v4 -rcu 0/4] Maintain the length of each segment in the segcblist Joel Fernandes (Google)
2020-08-25  2:48 ` [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge Joel Fernandes (Google)
2020-08-25 20:08   ` Paul E. McKenney
2020-08-25 22:47     ` Joel Fernandes
2020-08-26 14:20       ` Paul E. McKenney
2020-08-27 22:55         ` Joel Fernandes
2020-08-28 14:18           ` Paul E. McKenney
2020-09-01 15:06             ` Joel Fernandes
2020-09-01 16:26               ` Paul E. McKenney
2020-08-25  2:48 ` [PATCH v4 -rcu 2/4] rcu/tree: Make rcu_do_batch count how many callbacks were executed Joel Fernandes (Google)
2020-08-25 20:13   ` Paul E. McKenney
2020-08-25  2:48 ` [PATCH v4 -rcu 3/4] rcu/segcblist: Add counters to segcblist datastructure Joel Fernandes (Google)
2020-08-25 21:53   ` Paul E. McKenney
2020-08-25 22:51     ` Joel Fernandes
2020-08-26 14:24       ` Paul E. McKenney
2020-08-25  2:48 ` [PATCH v4 -rcu 4/4] rcu/trace: Add tracing for how segcb list changes Joel Fernandes (Google)
2020-08-25 21:55   ` Paul E. McKenney
2020-08-25 22:53     ` Joel Fernandes
2020-08-25 19:58 ` [PATCH v4 -rcu 0/4] Maintain the length of each segment in the segcblist Paul E. McKenney
