* [PATCH rcu/dev -fixes 0/4]
@ 2020-04-20 15:38 Joel Fernandes (Google)
  2020-04-20 15:38 ` [PATCH rcu/dev -fixes 1/4] rcu/tree: Keep kfree_rcu() awake during lock contention Joel Fernandes (Google)
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Joel Fernandes (Google) @ 2020-04-20 15:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	bigeasy, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
	Paul E. McKenney, rcu, Steven Rostedt, urezki

Hi,
Some of these patches would prevent breakage on PREEMPT_RT. I have marked them
as "rcu/dev fixes". Please consider applying for v5.8. The exceptions are 3/4
and 4/4, which are small clean-ups. Thanks!

Joel Fernandes (Google) (3):
rcu/tree: Keep kfree_rcu() awake during lock contention
rcu/tree: Skip entry into the page allocator for PREEMPT_RT
rcu/tree: Use consistent style for comments

Sebastian Andrzej Siewior (1):
rcu/tree: Avoid using xchg() in kfree_call_rcu_add_ptr_to_bulk()

kernel/rcu/tree.c | 61 ++++++++++++++++++++++++++++-------------------
1 file changed, 37 insertions(+), 24 deletions(-)

--
2.26.1.301.g55bc3eb7cb9-goog



* [PATCH rcu/dev -fixes 1/4] rcu/tree: Keep kfree_rcu() awake during lock contention
  2020-04-20 15:38 [PATCH rcu/dev -fixes 0/4] Joel Fernandes (Google)
@ 2020-04-20 15:38 ` Joel Fernandes (Google)
  2020-04-21 13:12   ` Uladzislau Rezki
  2020-04-20 15:38 ` [PATCH rcu/dev -fixes 2/4] rcu/tree: Skip entry into the page allocator for PREEMPT_RT Joel Fernandes (Google)
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Joel Fernandes (Google) @ 2020-04-20 15:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	urezki, bigeasy, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
	Paul E. McKenney, rcu, Steven Rostedt

On PREEMPT_RT kernels, contending on the krcp spinlock can cause
sleeping because, on these kernels, the spinlock is converted to an rt-mutex.
To prevent breakage of possible usage of kfree_rcu() now or in the
future, make use of raw spinlocks which are not subject to such
conversions.

Vetting all code paths, there is no reason to believe that the raw
spinlock will be held for a long time, so PREEMPT_RT should not suffer
from lengthy acquisitions of the lock.

Cc: urezki@gmail.com
Cc: bigeasy@linutronix.de
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/tree.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index f288477ee1c26..cf68d3d9f5b81 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2905,7 +2905,7 @@ struct kfree_rcu_cpu {
 	struct kfree_rcu_bulk_data *bhead;
 	struct kfree_rcu_bulk_data *bcached;
 	struct kfree_rcu_cpu_work krw_arr[KFREE_N_BATCHES];
-	spinlock_t lock;
+	raw_spinlock_t lock;
 	struct delayed_work monitor_work;
 	bool monitor_todo;
 	bool initialized;
@@ -2939,12 +2939,12 @@ static void kfree_rcu_work(struct work_struct *work)
 	krwp = container_of(to_rcu_work(work),
 			    struct kfree_rcu_cpu_work, rcu_work);
 	krcp = krwp->krcp;
-	spin_lock_irqsave(&krcp->lock, flags);
+	raw_spin_lock_irqsave(&krcp->lock, flags);
 	head = krwp->head_free;
 	krwp->head_free = NULL;
 	bhead = krwp->bhead_free;
 	krwp->bhead_free = NULL;
-	spin_unlock_irqrestore(&krcp->lock, flags);
+	raw_spin_unlock_irqrestore(&krcp->lock, flags);
 
 	/* "bhead" is now private, so traverse locklessly. */
 	for (; bhead; bhead = bnext) {
@@ -3047,14 +3047,14 @@ static inline void kfree_rcu_drain_unlock(struct kfree_rcu_cpu *krcp,
 	krcp->monitor_todo = false;
 	if (queue_kfree_rcu_work(krcp)) {
 		// Success! Our job is done here.
-		spin_unlock_irqrestore(&krcp->lock, flags);
+		raw_spin_unlock_irqrestore(&krcp->lock, flags);
 		return;
 	}
 
 	// Previous RCU batch still in progress, try again later.
 	krcp->monitor_todo = true;
 	schedule_delayed_work(&krcp->monitor_work, KFREE_DRAIN_JIFFIES);
-	spin_unlock_irqrestore(&krcp->lock, flags);
+	raw_spin_unlock_irqrestore(&krcp->lock, flags);
 }
 
 /*
@@ -3067,11 +3067,11 @@ static void kfree_rcu_monitor(struct work_struct *work)
 	struct kfree_rcu_cpu *krcp = container_of(work, struct kfree_rcu_cpu,
 						 monitor_work.work);
 
-	spin_lock_irqsave(&krcp->lock, flags);
+	raw_spin_lock_irqsave(&krcp->lock, flags);
 	if (krcp->monitor_todo)
 		kfree_rcu_drain_unlock(krcp, flags);
 	else
-		spin_unlock_irqrestore(&krcp->lock, flags);
+		raw_spin_unlock_irqrestore(&krcp->lock, flags);
 }
 
 static inline bool
@@ -3142,7 +3142,7 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 	local_irq_save(flags);	// For safely calling this_cpu_ptr().
 	krcp = this_cpu_ptr(&krc);
 	if (krcp->initialized)
-		spin_lock(&krcp->lock);
+		raw_spin_lock(&krcp->lock);
 
 	// Queue the object but don't yet schedule the batch.
 	if (debug_rcu_head_queue(head)) {
@@ -3173,7 +3173,7 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 
 unlock_return:
 	if (krcp->initialized)
-		spin_unlock(&krcp->lock);
+		raw_spin_unlock(&krcp->lock);
 	local_irq_restore(flags);
 }
 EXPORT_SYMBOL_GPL(kfree_call_rcu);
@@ -3205,11 +3205,11 @@ kfree_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 		struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
 
 		count = krcp->count;
-		spin_lock_irqsave(&krcp->lock, flags);
+		raw_spin_lock_irqsave(&krcp->lock, flags);
 		if (krcp->monitor_todo)
 			kfree_rcu_drain_unlock(krcp, flags);
 		else
-			spin_unlock_irqrestore(&krcp->lock, flags);
+			raw_spin_unlock_irqrestore(&krcp->lock, flags);
 
 		sc->nr_to_scan -= count;
 		freed += count;
@@ -3236,15 +3236,15 @@ void __init kfree_rcu_scheduler_running(void)
 	for_each_online_cpu(cpu) {
 		struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
 
-		spin_lock_irqsave(&krcp->lock, flags);
+		raw_spin_lock_irqsave(&krcp->lock, flags);
 		if (!krcp->head || krcp->monitor_todo) {
-			spin_unlock_irqrestore(&krcp->lock, flags);
+			raw_spin_unlock_irqrestore(&krcp->lock, flags);
 			continue;
 		}
 		krcp->monitor_todo = true;
 		schedule_delayed_work_on(cpu, &krcp->monitor_work,
 					 KFREE_DRAIN_JIFFIES);
-		spin_unlock_irqrestore(&krcp->lock, flags);
+		raw_spin_unlock_irqrestore(&krcp->lock, flags);
 	}
 }
 
@@ -4140,7 +4140,7 @@ static void __init kfree_rcu_batch_init(void)
 	for_each_possible_cpu(cpu) {
 		struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
 
-		spin_lock_init(&krcp->lock);
+		raw_spin_lock_init(&krcp->lock);
 		for (i = 0; i < KFREE_N_BATCHES; i++) {
 			INIT_RCU_WORK(&krcp->krw_arr[i].rcu_work, kfree_rcu_work);
 			krcp->krw_arr[i].krcp = krcp;
-- 
2.26.1.301.g55bc3eb7cb9-goog



* [PATCH rcu/dev -fixes 2/4] rcu/tree: Skip entry into the page allocator for PREEMPT_RT
  2020-04-20 15:38 [PATCH rcu/dev -fixes 0/4] Joel Fernandes (Google)
  2020-04-20 15:38 ` [PATCH rcu/dev -fixes 1/4] rcu/tree: Keep kfree_rcu() awake during lock contention Joel Fernandes (Google)
@ 2020-04-20 15:38 ` Joel Fernandes (Google)
  2020-04-22 10:35   ` Uladzislau Rezki
  2020-04-20 15:38 ` [PATCH rcu/dev -fixes 3/4] rcu/tree: Avoid using xchg() in kfree_call_rcu_add_ptr_to_bulk() Joel Fernandes (Google)
  2020-04-20 15:38 ` [PATCH rcu/dev -fixes 4/4] rcu/tree: Use consistent style for comments Joel Fernandes (Google)
  3 siblings, 1 reply; 13+ messages in thread
From: Joel Fernandes (Google) @ 2020-04-20 15:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	Sebastian Andrzej Siewior, Uladzislau Rezki, Josh Triplett,
	Lai Jiangshan, Mathieu Desnoyers, Paul E. McKenney, rcu,
	Steven Rostedt

To keep kfree_rcu() path working on raw non-preemptible sections,
prevent the optional entry into the allocator as it uses sleeping locks.
In fact, even if the caller of kfree_rcu() is preemptible, this path
still is not, as krcp->lock is a raw spinlock as done in previous
patches. With additional page pre-allocation in the works, hitting this
return is going to be much less likely soon so just prevent it for now
so that PREEMPT_RT does not break. Note that page allocation here is an
optimization and skipping it still makes kfree_rcu() work.

Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Co-developed-by: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/tree.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index cf68d3d9f5b81..cd61649e1b001 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3092,6 +3092,18 @@ kfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp,
 		if (!bnode) {
 			WARN_ON_ONCE(sizeof(struct kfree_rcu_bulk_data) > PAGE_SIZE);
 
+			/*
+			 * To keep this path working on raw non-preemptible
+			 * sections, prevent the optional entry into the
+			 * allocator as it uses sleeping locks. In fact, even
+			 * if the caller of kfree_rcu() is preemptible, this
+			 * path still is not, as krcp->lock is a raw spinlock.
+			 * With additional page pre-allocation in the works,
+			 * hitting this return is going to be much less likely.
+			 */
+			if (IS_ENABLED(CONFIG_PREEMPT_RT))
+				return false;
+
 			bnode = (struct kfree_rcu_bulk_data *)
 				__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
 		}
-- 
2.26.1.301.g55bc3eb7cb9-goog



* [PATCH rcu/dev -fixes 3/4] rcu/tree: Avoid using xchg() in kfree_call_rcu_add_ptr_to_bulk()
  2020-04-20 15:38 [PATCH rcu/dev -fixes 0/4] Joel Fernandes (Google)
  2020-04-20 15:38 ` [PATCH rcu/dev -fixes 1/4] rcu/tree: Keep kfree_rcu() awake during lock contention Joel Fernandes (Google)
  2020-04-20 15:38 ` [PATCH rcu/dev -fixes 2/4] rcu/tree: Skip entry into the page allocator for PREEMPT_RT Joel Fernandes (Google)
@ 2020-04-20 15:38 ` Joel Fernandes (Google)
  2020-04-20 17:18   ` Uladzislau Rezki
  2020-04-20 15:38 ` [PATCH rcu/dev -fixes 4/4] rcu/tree: Use consistent style for comments Joel Fernandes (Google)
  3 siblings, 1 reply; 13+ messages in thread
From: Joel Fernandes (Google) @ 2020-04-20 15:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sebastian Andrzej Siewior, Joel Fernandes, Josh Triplett,
	Lai Jiangshan, Mathieu Desnoyers, Paul E. McKenney, rcu,
	Steven Rostedt, urezki

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

There is no need to use xchg(); the access is serialized by krcp->lock.
The xchg() function adds some atomic barriers which remain hidden in
x86's disassembly but are visible on ARM for instance.

Replace xchg() with a load + store.

Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 kernel/rcu/tree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index cd61649e1b001..f6eb3aee0935e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3088,7 +3088,8 @@ kfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp,
 	/* Check if a new block is required. */
 	if (!krcp->bhead ||
 			krcp->bhead->nr_records == KFREE_BULK_MAX_ENTR) {
-		bnode = xchg(&krcp->bcached, NULL);
+		bnode = krcp->bcached;
+		krcp->bcached = NULL;
 		if (!bnode) {
 			WARN_ON_ONCE(sizeof(struct kfree_rcu_bulk_data) > PAGE_SIZE);
 
-- 
2.26.1.301.g55bc3eb7cb9-goog



* [PATCH rcu/dev -fixes 4/4] rcu/tree: Use consistent style for comments
  2020-04-20 15:38 [PATCH rcu/dev -fixes 0/4] Joel Fernandes (Google)
                   ` (2 preceding siblings ...)
  2020-04-20 15:38 ` [PATCH rcu/dev -fixes 3/4] rcu/tree: Avoid using xchg() in kfree_call_rcu_add_ptr_to_bulk() Joel Fernandes (Google)
@ 2020-04-20 15:38 ` Joel Fernandes (Google)
  2020-04-21 13:08   ` Uladzislau Rezki
  3 siblings, 1 reply; 13+ messages in thread
From: Joel Fernandes (Google) @ 2020-04-20 15:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	bigeasy, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
	Paul E. McKenney, rcu, Steven Rostedt, urezki

Simple clean-up of comments in the kfree_rcu() code to keep it consistent
with the predominant commenting style.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/tree.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index f6eb3aee0935e..0512e0f9e2f31 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3043,15 +3043,15 @@ static inline bool queue_kfree_rcu_work(struct kfree_rcu_cpu *krcp)
 static inline void kfree_rcu_drain_unlock(struct kfree_rcu_cpu *krcp,
 					  unsigned long flags)
 {
-	// Attempt to start a new batch.
+	/* Attempt to start a new batch. */
 	krcp->monitor_todo = false;
 	if (queue_kfree_rcu_work(krcp)) {
-		// Success! Our job is done here.
+		/* Success! Our job is done here. */
 		raw_spin_unlock_irqrestore(&krcp->lock, flags);
 		return;
 	}
 
-	// Previous RCU batch still in progress, try again later.
+	/* Previous RCU batch still in progress, try again later. */
 	krcp->monitor_todo = true;
 	schedule_delayed_work(&krcp->monitor_work, KFREE_DRAIN_JIFFIES);
 	raw_spin_unlock_irqrestore(&krcp->lock, flags);
@@ -3152,14 +3152,14 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 	unsigned long flags;
 	struct kfree_rcu_cpu *krcp;
 
-	local_irq_save(flags);	// For safely calling this_cpu_ptr().
+	local_irq_save(flags);	/* For safely calling this_cpu_ptr(). */
 	krcp = this_cpu_ptr(&krc);
 	if (krcp->initialized)
 		raw_spin_lock(&krcp->lock);
 
-	// Queue the object but don't yet schedule the batch.
+	/* Queue the object but don't yet schedule the batch. */
 	if (debug_rcu_head_queue(head)) {
-		// Probable double kfree_rcu(), just leak.
+		/* Probable double kfree_rcu(), just leak. */
 		WARN_ONCE(1, "%s(): Double-freed call. rcu_head %p\n",
 			  __func__, head);
 		goto unlock_return;
@@ -3177,7 +3177,7 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 
 	WRITE_ONCE(krcp->count, krcp->count + 1);
 
-	// Set timer to drain after KFREE_DRAIN_JIFFIES.
+	/* Set timer to drain after KFREE_DRAIN_JIFFIES. */
 	if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
 	    !krcp->monitor_todo) {
 		krcp->monitor_todo = true;
@@ -3723,7 +3723,7 @@ int rcutree_offline_cpu(unsigned int cpu)
 
 	rcutree_affinity_setting(cpu, cpu);
 
-	// nohz_full CPUs need the tick for stop-machine to work quickly
+	/* nohz_full CPUs need the tick for stop-machine to work quickly */
 	tick_dep_set(TICK_DEP_BIT_RCU);
 	return 0;
 }
-- 
2.26.1.301.g55bc3eb7cb9-goog



* Re: [PATCH rcu/dev -fixes 3/4] rcu/tree: Avoid using xchg() in kfree_call_rcu_add_ptr_to_bulk()
  2020-04-20 15:38 ` [PATCH rcu/dev -fixes 3/4] rcu/tree: Avoid using xchg() in kfree_call_rcu_add_ptr_to_bulk() Joel Fernandes (Google)
@ 2020-04-20 17:18   ` Uladzislau Rezki
  2020-04-20 18:19     ` Joel Fernandes
  0 siblings, 1 reply; 13+ messages in thread
From: Uladzislau Rezki @ 2020-04-20 17:18 UTC (permalink / raw)
  To: Joel Fernandes (Google), Sebastian Andrzej Siewior
  Cc: linux-kernel, Sebastian Andrzej Siewior, Josh Triplett,
	Lai Jiangshan, Mathieu Desnoyers, Paul E. McKenney, rcu,
	Steven Rostedt, urezki

On Mon, Apr 20, 2020 at 11:38:36AM -0400, Joel Fernandes (Google) wrote:
> From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> 
> There is no need to use xchg(), the access is serialized by krcp->lock.
> The xchg() function adds some atomic barriers which remain hidden in
> x86's disassembly but are visible on ARM for instance.
> 
> Replace xchg() with a load + store.
> 
> Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  kernel/rcu/tree.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index cd61649e1b001..f6eb3aee0935e 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3088,7 +3088,8 @@ kfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp,
>  	/* Check if a new block is required. */
>  	if (!krcp->bhead ||
>  			krcp->bhead->nr_records == KFREE_BULK_MAX_ENTR) {
> -		bnode = xchg(&krcp->bcached, NULL);
> +		bnode = krcp->bcached;
> +		krcp->bcached = NULL;
>  		if (!bnode) {
>  			WARN_ON_ONCE(sizeof(struct kfree_rcu_bulk_data) > PAGE_SIZE);
>  
>
But we populate the cache without holding the krcp->lock. That is why xchg()
is used here. See below:

<snip>
  if (cmpxchg(&krcp->bcached, NULL, bhead))
     free_page((unsigned long) bhead);
<snip>

We do not hold the krcp->lock here, and we do not need it.
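
To make the concern concrete, here is a minimal illustrative sketch of the
pattern (simplified, hypothetical names such as "struct page_cache" and the
two helpers below; this is not the actual tree.c code): the producer donates
a page with cmpxchg() while holding no lock, so the consumer must take it
with an atomic xchg(); a plain load + store could lose a page donated
between the load and the store.

<snip>
#include <linux/atomic.h>	/* cmpxchg(), xchg() */
#include <linux/gfp.h>		/* free_page() */

struct page_cache {
	void *cached;		/* single cached page, or NULL */
};

/* Producer: runs with no lock held, donates one page to the cache. */
static void donate_page(struct page_cache *pc, void *page)
{
	/* Install the page only if the slot is currently empty. */
	if (cmpxchg(&pc->cached, NULL, page))
		free_page((unsigned long)page);	/* Slot already full. */
}

/*
 * Consumer: runs under the raw spinlock and takes the cached page. xchg()
 * is needed because the producer above does not take that lock: with a
 * plain "p = pc->cached; pc->cached = NULL;", a page installed between the
 * load and the store would be overwritten by the NULL store and leaked.
 */
static void *take_page(struct page_cache *pc)
{
	return xchg(&pc->cached, NULL);
}
<snip>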

--
Vlad Rezki


* Re: [PATCH rcu/dev -fixes 3/4] rcu/tree: Avoid using xchg() in kfree_call_rcu_add_ptr_to_bulk()
  2020-04-20 17:18   ` Uladzislau Rezki
@ 2020-04-20 18:19     ` Joel Fernandes
  0 siblings, 0 replies; 13+ messages in thread
From: Joel Fernandes @ 2020-04-20 18:19 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Sebastian Andrzej Siewior, linux-kernel, Josh Triplett,
	Lai Jiangshan, Mathieu Desnoyers, Paul E. McKenney, rcu,
	Steven Rostedt

On Mon, Apr 20, 2020 at 07:18:29PM +0200, Uladzislau Rezki wrote:
> On Mon, Apr 20, 2020 at 11:38:36AM -0400, Joel Fernandes (Google) wrote:
> > From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > 
> > There is no need to use xchg(), the access is serialized by krcp->lock.
> > The xchg() function adds some atomic barriers which remain hidden in
> > x86's disassembly but are visible on ARM for instance.
> > 
> > Replace xchg() with a load + store.
> > 
> > Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > ---
> >  kernel/rcu/tree.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index cd61649e1b001..f6eb3aee0935e 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -3088,7 +3088,8 @@ kfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp,
> >  	/* Check if a new block is required. */
> >  	if (!krcp->bhead ||
> >  			krcp->bhead->nr_records == KFREE_BULK_MAX_ENTR) {
> > -		bnode = xchg(&krcp->bcached, NULL);
> > +		bnode = krcp->bcached;
> > +		krcp->bcached = NULL;
> >  		if (!bnode) {
> >  			WARN_ON_ONCE(sizeof(struct kfree_rcu_bulk_data) > PAGE_SIZE);
> >  
> >
> But we populate the cache without holding the krcp->lock. That is why it
> is xchg here. See below:
> 
> <snip>
>   if (cmpxchg(&krcp->bcached, NULL, bhead))
>      free_page((unsigned long) bhead);
> <snip>
> 
> we do not hold any krcp->lock here, we do not need it.

You are right. This patch is not helpful in this situation even though it
does not break things. Let us drop this patch. Please review the other 3 and
provide your Reviewed-by tag if they look Ok to you, thanks.

For the workqueue one that Paul asked us to look into - we are continuing to
discuss there whether we need to move it outside the lock or not. If we decide
to move it outside the lock, we can add that as one more patch to this series
and I'll send a v2.

thanks,

 - Joel



* Re: [PATCH rcu/dev -fixes 4/4] rcu/tree: Use consistent style for comments
  2020-04-20 15:38 ` [PATCH rcu/dev -fixes 4/4] rcu/tree: Use consistent style for comments Joel Fernandes (Google)
@ 2020-04-21 13:08   ` Uladzislau Rezki
  0 siblings, 0 replies; 13+ messages in thread
From: Uladzislau Rezki @ 2020-04-21 13:08 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, bigeasy, Josh Triplett, Lai Jiangshan,
	Mathieu Desnoyers, Paul E. McKenney, rcu, Steven Rostedt, urezki

On Mon, Apr 20, 2020 at 11:38:37AM -0400, Joel Fernandes (Google) wrote:
> Simple clean up of comments in kfree_rcu() code to keep it consistent
> with majority of commenting styles.
> 
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
>  kernel/rcu/tree.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index f6eb3aee0935e..0512e0f9e2f31 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3043,15 +3043,15 @@ static inline bool queue_kfree_rcu_work(struct kfree_rcu_cpu *krcp)
>  static inline void kfree_rcu_drain_unlock(struct kfree_rcu_cpu *krcp,
>  					  unsigned long flags)
>  {
> -	// Attempt to start a new batch.
> +	/* Attempt to start a new batch. */
>  	krcp->monitor_todo = false;
>  	if (queue_kfree_rcu_work(krcp)) {
> -		// Success! Our job is done here.
> +		/* Success! Our job is done here. */
>  		raw_spin_unlock_irqrestore(&krcp->lock, flags);
>  		return;
>  	}
>  
> -	// Previous RCU batch still in progress, try again later.
> +	/* Previous RCU batch still in progress, try again later. */
>  	krcp->monitor_todo = true;
>  	schedule_delayed_work(&krcp->monitor_work, KFREE_DRAIN_JIFFIES);
>  	raw_spin_unlock_irqrestore(&krcp->lock, flags);
> @@ -3152,14 +3152,14 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
>  	unsigned long flags;
>  	struct kfree_rcu_cpu *krcp;
>  
> -	local_irq_save(flags);	// For safely calling this_cpu_ptr().
> +	local_irq_save(flags);	/* For safely calling this_cpu_ptr(). */
>  	krcp = this_cpu_ptr(&krc);
>  	if (krcp->initialized)
>  		raw_spin_lock(&krcp->lock);
>  
> -	// Queue the object but don't yet schedule the batch.
> +	/* Queue the object but don't yet schedule the batch. */
>  	if (debug_rcu_head_queue(head)) {
> -		// Probable double kfree_rcu(), just leak.
> +		/* Probable double kfree_rcu(), just leak. */
>  		WARN_ONCE(1, "%s(): Double-freed call. rcu_head %p\n",
>  			  __func__, head);
>  		goto unlock_return;
> @@ -3177,7 +3177,7 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
>  
>  	WRITE_ONCE(krcp->count, krcp->count + 1);
>  
> -	// Set timer to drain after KFREE_DRAIN_JIFFIES.
> +	/* Set timer to drain after KFREE_DRAIN_JIFFIES. */
>  	if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
>  	    !krcp->monitor_todo) {
>  		krcp->monitor_todo = true;
> @@ -3723,7 +3723,7 @@ int rcutree_offline_cpu(unsigned int cpu)
>  
>  	rcutree_affinity_setting(cpu, cpu);
>  
> -	// nohz_full CPUs need the tick for stop-machine to work quickly
> +	/* nohz_full CPUs need the tick for stop-machine to work quickly */
>  	tick_dep_set(TICK_DEP_BIT_RCU);
>  	return 0;
>  }
> -- 
> 2.26.1.301.g55bc3eb7cb9-goog
> 
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>

--
Vlad Rezki


* Re: [PATCH rcu/dev -fixes 1/4] rcu/tree: Keep kfree_rcu() awake during lock contention
  2020-04-20 15:38 ` [PATCH rcu/dev -fixes 1/4] rcu/tree: Keep kfree_rcu() awake during lock contention Joel Fernandes (Google)
@ 2020-04-21 13:12   ` Uladzislau Rezki
  0 siblings, 0 replies; 13+ messages in thread
From: Uladzislau Rezki @ 2020-04-21 13:12 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, urezki, bigeasy, Josh Triplett, Lai Jiangshan,
	Mathieu Desnoyers, Paul E. McKenney, rcu, Steven Rostedt

On Mon, Apr 20, 2020 at 11:38:34AM -0400, Joel Fernandes (Google) wrote:
> On PREEMPT_RT kernels, contending on the krcp spinlock can cause
> sleeping as on these kernels, the spinlock is converted to an rt-mutex.
> To prevent breakage of possible usage of kfree_rcu() now or in the
> future, make use of raw spinlocks which are not subject to such
> conversions.
> 
> Vetting all code paths, there is no reason to believe that the raw
> spinlock will be held for long time so PREEMPT_RT should not suffer from
> lengthy acquirals of the lock.
> 
> Cc: urezki@gmail.com
> Cc: bigeasy@linutronix.de
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
>  kernel/rcu/tree.c | 30 +++++++++++++++---------------
>  1 file changed, 15 insertions(+), 15 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index f288477ee1c26..cf68d3d9f5b81 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2905,7 +2905,7 @@ struct kfree_rcu_cpu {
>  	struct kfree_rcu_bulk_data *bhead;
>  	struct kfree_rcu_bulk_data *bcached;
>  	struct kfree_rcu_cpu_work krw_arr[KFREE_N_BATCHES];
> -	spinlock_t lock;
> +	raw_spinlock_t lock;
>  	struct delayed_work monitor_work;
>  	bool monitor_todo;
>  	bool initialized;
> @@ -2939,12 +2939,12 @@ static void kfree_rcu_work(struct work_struct *work)
>  	krwp = container_of(to_rcu_work(work),
>  			    struct kfree_rcu_cpu_work, rcu_work);
>  	krcp = krwp->krcp;
> -	spin_lock_irqsave(&krcp->lock, flags);
> +	raw_spin_lock_irqsave(&krcp->lock, flags);
>  	head = krwp->head_free;
>  	krwp->head_free = NULL;
>  	bhead = krwp->bhead_free;
>  	krwp->bhead_free = NULL;
> -	spin_unlock_irqrestore(&krcp->lock, flags);
> +	raw_spin_unlock_irqrestore(&krcp->lock, flags);
>  
>  	/* "bhead" is now private, so traverse locklessly. */
>  	for (; bhead; bhead = bnext) {
> @@ -3047,14 +3047,14 @@ static inline void kfree_rcu_drain_unlock(struct kfree_rcu_cpu *krcp,
>  	krcp->monitor_todo = false;
>  	if (queue_kfree_rcu_work(krcp)) {
>  		// Success! Our job is done here.
> -		spin_unlock_irqrestore(&krcp->lock, flags);
> +		raw_spin_unlock_irqrestore(&krcp->lock, flags);
>  		return;
>  	}
>  
>  	// Previous RCU batch still in progress, try again later.
>  	krcp->monitor_todo = true;
>  	schedule_delayed_work(&krcp->monitor_work, KFREE_DRAIN_JIFFIES);
> -	spin_unlock_irqrestore(&krcp->lock, flags);
> +	raw_spin_unlock_irqrestore(&krcp->lock, flags);
>  }
>  
>  /*
> @@ -3067,11 +3067,11 @@ static void kfree_rcu_monitor(struct work_struct *work)
>  	struct kfree_rcu_cpu *krcp = container_of(work, struct kfree_rcu_cpu,
>  						 monitor_work.work);
>  
> -	spin_lock_irqsave(&krcp->lock, flags);
> +	raw_spin_lock_irqsave(&krcp->lock, flags);
>  	if (krcp->monitor_todo)
>  		kfree_rcu_drain_unlock(krcp, flags);
>  	else
> -		spin_unlock_irqrestore(&krcp->lock, flags);
> +		raw_spin_unlock_irqrestore(&krcp->lock, flags);
>  }
>  
>  static inline bool
> @@ -3142,7 +3142,7 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
>  	local_irq_save(flags);	// For safely calling this_cpu_ptr().
>  	krcp = this_cpu_ptr(&krc);
>  	if (krcp->initialized)
> -		spin_lock(&krcp->lock);
> +		raw_spin_lock(&krcp->lock);
>  
>  	// Queue the object but don't yet schedule the batch.
>  	if (debug_rcu_head_queue(head)) {
> @@ -3173,7 +3173,7 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
>  
>  unlock_return:
>  	if (krcp->initialized)
> -		spin_unlock(&krcp->lock);
> +		raw_spin_unlock(&krcp->lock);
>  	local_irq_restore(flags);
>  }
>  EXPORT_SYMBOL_GPL(kfree_call_rcu);
> @@ -3205,11 +3205,11 @@ kfree_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
>  		struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
>  
>  		count = krcp->count;
> -		spin_lock_irqsave(&krcp->lock, flags);
> +		raw_spin_lock_irqsave(&krcp->lock, flags);
>  		if (krcp->monitor_todo)
>  			kfree_rcu_drain_unlock(krcp, flags);
>  		else
> -			spin_unlock_irqrestore(&krcp->lock, flags);
> +			raw_spin_unlock_irqrestore(&krcp->lock, flags);
>  
>  		sc->nr_to_scan -= count;
>  		freed += count;
> @@ -3236,15 +3236,15 @@ void __init kfree_rcu_scheduler_running(void)
>  	for_each_online_cpu(cpu) {
>  		struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
>  
> -		spin_lock_irqsave(&krcp->lock, flags);
> +		raw_spin_lock_irqsave(&krcp->lock, flags);
>  		if (!krcp->head || krcp->monitor_todo) {
> -			spin_unlock_irqrestore(&krcp->lock, flags);
> +			raw_spin_unlock_irqrestore(&krcp->lock, flags);
>  			continue;
>  		}
>  		krcp->monitor_todo = true;
>  		schedule_delayed_work_on(cpu, &krcp->monitor_work,
>  					 KFREE_DRAIN_JIFFIES);
> -		spin_unlock_irqrestore(&krcp->lock, flags);
> +		raw_spin_unlock_irqrestore(&krcp->lock, flags);
>  	}
>  }
>  
> @@ -4140,7 +4140,7 @@ static void __init kfree_rcu_batch_init(void)
>  	for_each_possible_cpu(cpu) {
>  		struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
>  
> -		spin_lock_init(&krcp->lock);
> +		raw_spin_lock_init(&krcp->lock);
>  		for (i = 0; i < KFREE_N_BATCHES; i++) {
>  			INIT_RCU_WORK(&krcp->krw_arr[i].rcu_work, kfree_rcu_work);
>  			krcp->krw_arr[i].krcp = krcp;
> -- 
> 2.26.1.301.g55bc3eb7cb9-goog
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>

If we decide to move the schedule_delayed_work() outside of the critical
section, I think it would be better to submit a separate patch with a good
explanation of why we do it.
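
For reference, a sketch of what that reordering in kfree_rcu_drain_unlock()
might look like (illustration only, not part of this series; whether setting
monitor_todo and dropping the lock before scheduling is actually safe is
exactly what would need that good explanation):

<snip>
	/* Previous RCU batch still in progress, try again later. */
	krcp->monitor_todo = true;
	raw_spin_unlock_irqrestore(&krcp->lock, flags);
	/* Now scheduled after the raw lock has been dropped. */
	schedule_delayed_work(&krcp->monitor_work, KFREE_DRAIN_JIFFIES);
<snip>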

--
Vlad Rezki
> 


* Re: [PATCH rcu/dev -fixes 2/4] rcu/tree: Skip entry into the page allocator for PREEMPT_RT
  2020-04-20 15:38 ` [PATCH rcu/dev -fixes 2/4] rcu/tree: Skip entry into the page allocator for PREEMPT_RT Joel Fernandes (Google)
@ 2020-04-22 10:35   ` Uladzislau Rezki
  2020-04-22 11:45     ` Uladzislau Rezki
  2020-04-22 13:18     ` joel
  0 siblings, 2 replies; 13+ messages in thread
From: Uladzislau Rezki @ 2020-04-22 10:35 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, Sebastian Andrzej Siewior, Uladzislau Rezki,
	Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
	Paul E. McKenney, rcu, Steven Rostedt

On Mon, Apr 20, 2020 at 11:38:35AM -0400, Joel Fernandes (Google) wrote:
> To keep kfree_rcu() path working on raw non-preemptible sections,
> prevent the optional entry into the allocator as it uses sleeping locks.
> In fact, even if the caller of kfree_rcu() is preemptible, this path
> still is not, as krcp->lock is a raw spinlock as done in previous
> patches. With additional page pre-allocation in the works, hitting this
> return is going to be much less likely soon so just prevent it for now
> so that PREEMPT_RT does not break. Note that page allocation here is an
> optimization and skipping it still makes kfree_rcu() work.
> 
> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Co-developed-by: Uladzislau Rezki <urezki@gmail.com>
> Signed-off-by: Uladzislau Rezki <urezki@gmail.com>
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
>  kernel/rcu/tree.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index cf68d3d9f5b81..cd61649e1b001 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3092,6 +3092,18 @@ kfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp,
>  		if (!bnode) {
>  			WARN_ON_ONCE(sizeof(struct kfree_rcu_bulk_data) > PAGE_SIZE);
>  
> +			/*
> +			 * To keep this path working on raw non-preemptible
> +			 * sections, prevent the optional entry into the
> +			 * allocator as it uses sleeping locks. In fact, even
> +			 * if the caller of kfree_rcu() is preemptible, this
> +			 * path still is not, as krcp->lock is a raw spinlock.
> +			 * With additional page pre-allocation in the works,
> +			 * hitting this return is going to be much less likely.
> +			 */
> +			if (IS_ENABLED(CONFIG_PREEMPT_RT))
> +				return false;
> +
>  			bnode = (struct kfree_rcu_bulk_data *)
>  				__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
>  		}
This will not work correctly by just reverting everything to the
"rcu_head path" for CONFIG_PREEMPT_RT kernels. We need to preallocate at
least once. I can add caching on top of this series to address it.
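
One possible shape for such caching (a rough sketch with a hypothetical work
item and field name, not code from this series): refill the per-CPU bcached
slot from a preemptible context, so that the raw-spinlocked kfree_call_rcu()
path can count on finding a page there without entering the page allocator.

<snip>
/*
 * Hypothetical sketch: "page_cache_work" is an assumed work_struct member
 * of struct kfree_rcu_cpu, and fill_page_cache_work() is an assumed work
 * handler; neither exists in this series. The point is only that the
 * allocation happens in a context that is allowed to sleep.
 */
static void fill_page_cache_work(struct work_struct *work)
{
	struct kfree_rcu_cpu *krcp =
		container_of(work, struct kfree_rcu_cpu, page_cache_work);
	struct kfree_rcu_bulk_data *bnode;

	bnode = (struct kfree_rcu_bulk_data *)
		__get_free_page(GFP_KERNEL | __GFP_NOWARN);
	if (!bnode)
		return;

	/* Donate the page; free it if the cache slot is already occupied. */
	if (cmpxchg(&krcp->bcached, NULL, bnode))
		free_page((unsigned long)bnode);
}
<snip>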

--
Vlad Rezki


* Re: [PATCH rcu/dev -fixes 2/4] rcu/tree: Skip entry into the page allocator for PREEMPT_RT
  2020-04-22 10:35   ` Uladzislau Rezki
@ 2020-04-22 11:45     ` Uladzislau Rezki
  2020-04-22 13:18     ` joel
  1 sibling, 0 replies; 13+ messages in thread
From: Uladzislau Rezki @ 2020-04-22 11:45 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: Joel Fernandes (Google),
	linux-kernel, Sebastian Andrzej Siewior, Josh Triplett,
	Lai Jiangshan, Mathieu Desnoyers, Paul E. McKenney, rcu,
	Steven Rostedt

On Wed, Apr 22, 2020 at 12:35:36PM +0200, Uladzislau Rezki wrote:
> On Mon, Apr 20, 2020 at 11:38:35AM -0400, Joel Fernandes (Google) wrote:
> > To keep kfree_rcu() path working on raw non-preemptible sections,
> > prevent the optional entry into the allocator as it uses sleeping locks.
> > In fact, even if the caller of kfree_rcu() is preemptible, this path
> > still is not, as krcp->lock is a raw spinlock as done in previous
> > patches. With additional page pre-allocation in the works, hitting this
> > return is going to be much less likely soon so just prevent it for now
> > so that PREEMPT_RT does not break. Note that page allocation here is an
> > optimization and skipping it still makes kfree_rcu() work.
> > 
> > Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > Co-developed-by: Uladzislau Rezki <urezki@gmail.com>
> > Signed-off-by: Uladzislau Rezki <urezki@gmail.com>
> > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > ---
> >  kernel/rcu/tree.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index cf68d3d9f5b81..cd61649e1b001 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -3092,6 +3092,18 @@ kfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp,
> >  		if (!bnode) {
> >  			WARN_ON_ONCE(sizeof(struct kfree_rcu_bulk_data) > PAGE_SIZE);
> >  
> > +			/*
> > +			 * To keep this path working on raw non-preemptible
> > +			 * sections, prevent the optional entry into the
> > +			 * allocator as it uses sleeping locks. In fact, even
> > +			 * if the caller of kfree_rcu() is preemptible, this
> > +			 * path still is not, as krcp->lock is a raw spinlock.
> > +			 * With additional page pre-allocation in the works,
> > +			 * hitting this return is going to be much less likely.
> > +			 */
> > +			if (IS_ENABLED(CONFIG_PREEMPT_RT))
> > +				return false;
> > +
> >  			bnode = (struct kfree_rcu_bulk_data *)
> >  				__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
> >  		}
> This will not be correctly working by just reverting everything to the
> "rcu_head path" for CONFIG_PREEMPT_RT kernel. We need to preallocate at
> least once. I can add caching on top of this series to address it.
> 
Forgot to add: Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>

Caching should be added as a separate patch.

--
Vlad Rezki


* Re: [PATCH rcu/dev -fixes 2/4] rcu/tree: Skip entry into the page allocator for PREEMPT_RT
  2020-04-22 10:35   ` Uladzislau Rezki
  2020-04-22 11:45     ` Uladzislau Rezki
@ 2020-04-22 13:18     ` joel
  2020-04-22 13:28       ` Paul E. McKenney
  1 sibling, 1 reply; 13+ messages in thread
From: joel @ 2020-04-22 13:18 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: linux-kernel, Sebastian Andrzej Siewior, Josh Triplett,
	Lai Jiangshan, Mathieu Desnoyers, Paul E. McKenney, rcu,
	Steven Rostedt



On April 22, 2020 6:35:36 AM EDT, Uladzislau Rezki <urezki@gmail.com> wrote:
>On Mon, Apr 20, 2020 at 11:38:35AM -0400, Joel Fernandes (Google)
>wrote:
>> To keep kfree_rcu() path working on raw non-preemptible sections,
>> prevent the optional entry into the allocator as it uses sleeping
>locks.
>> In fact, even if the caller of kfree_rcu() is preemptible, this path
>> still is not, as krcp->lock is a raw spinlock as done in previous
>> patches. With additional page pre-allocation in the works, hitting
>this
>> return is going to be much less likely soon so just prevent it for
>now
>> so that PREEMPT_RT does not break. Note that page allocation here is
>an
>> optimization and skipping it still makes kfree_rcu() work.
>> 
>> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>> Co-developed-by: Uladzislau Rezki <urezki@gmail.com>
>> Signed-off-by: Uladzislau Rezki <urezki@gmail.com>
>> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
>> ---
>>  kernel/rcu/tree.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>> 
>> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
>> index cf68d3d9f5b81..cd61649e1b001 100644
>> --- a/kernel/rcu/tree.c
>> +++ b/kernel/rcu/tree.c
>> @@ -3092,6 +3092,18 @@ kfree_call_rcu_add_ptr_to_bulk(struct
>kfree_rcu_cpu *krcp,
>>  		if (!bnode) {
>>  			WARN_ON_ONCE(sizeof(struct kfree_rcu_bulk_data) > PAGE_SIZE);
>>  
>> +			/*
>> +			 * To keep this path working on raw non-preemptible
>> +			 * sections, prevent the optional entry into the
>> +			 * allocator as it uses sleeping locks. In fact, even
>> +			 * if the caller of kfree_rcu() is preemptible, this
>> +			 * path still is not, as krcp->lock is a raw spinlock.
>> +			 * With additional page pre-allocation in the works,
>> +			 * hitting this return is going to be much less likely.
>> +			 */
>> +			if (IS_ENABLED(CONFIG_PREEMPT_RT))
>> +				return false;
>> +
>>  			bnode = (struct kfree_rcu_bulk_data *)
>>  				__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
>>  		}
>This will not be correctly working by just reverting everything to the
>"rcu_head path" for CONFIG_PREEMPT_RT kernel. We need to preallocate at
>least once. I can add caching on top of this series to address it.
>

I discussed with Vlad offline; this patch is fine for mainline as we don't have
headless support yet, so this patch is good. Future patches adding caching will
also add the headless support on top of that additional caching, so skipping
the allocation here is OK.

Thanks.

- Joel




>--
>Vlad Rezki

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


* Re: [PATCH rcu/dev -fixes 2/4] rcu/tree: Skip entry into the page allocator for PREEMPT_RT
  2020-04-22 13:18     ` joel
@ 2020-04-22 13:28       ` Paul E. McKenney
  0 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2020-04-22 13:28 UTC (permalink / raw)
  To: joel
  Cc: Uladzislau Rezki, linux-kernel, Sebastian Andrzej Siewior,
	Josh Triplett, Lai Jiangshan, Mathieu Desnoyers, rcu,
	Steven Rostedt

On Wed, Apr 22, 2020 at 09:18:41AM -0400, joel@joelfernandes.org wrote:
> 
> 
> On April 22, 2020 6:35:36 AM EDT, Uladzislau Rezki <urezki@gmail.com> wrote:
> >On Mon, Apr 20, 2020 at 11:38:35AM -0400, Joel Fernandes (Google)
> >wrote:
> >> To keep kfree_rcu() path working on raw non-preemptible sections,
> >> prevent the optional entry into the allocator as it uses sleeping
> >locks.
> >> In fact, even if the caller of kfree_rcu() is preemptible, this path
> >> still is not, as krcp->lock is a raw spinlock as done in previous
> >> patches. With additional page pre-allocation in the works, hitting
> >this
> >> return is going to be much less likely soon so just prevent it for
> >now
> >> so that PREEMPT_RT does not break. Note that page allocation here is
> >an
> >> optimization and skipping it still makes kfree_rcu() work.
> >> 
> >> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> >> Co-developed-by: Uladzislau Rezki <urezki@gmail.com>
> >> Signed-off-by: Uladzislau Rezki <urezki@gmail.com>
> >> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> >> ---
> >>  kernel/rcu/tree.c | 12 ++++++++++++
> >>  1 file changed, 12 insertions(+)
> >> 
> >> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> >> index cf68d3d9f5b81..cd61649e1b001 100644
> >> --- a/kernel/rcu/tree.c
> >> +++ b/kernel/rcu/tree.c
> >> @@ -3092,6 +3092,18 @@ kfree_call_rcu_add_ptr_to_bulk(struct
> >kfree_rcu_cpu *krcp,
> >>  		if (!bnode) {
> >>  			WARN_ON_ONCE(sizeof(struct kfree_rcu_bulk_data) > PAGE_SIZE);
> >>  
> >> +			/*
> >> +			 * To keep this path working on raw non-preemptible
> >> +			 * sections, prevent the optional entry into the
> >> +			 * allocator as it uses sleeping locks. In fact, even
> >> +			 * if the caller of kfree_rcu() is preemptible, this
> >> +			 * path still is not, as krcp->lock is a raw spinlock.
> >> +			 * With additional page pre-allocation in the works,
> >> +			 * hitting this return is going to be much less likely.
> >> +			 */
> >> +			if (IS_ENABLED(CONFIG_PREEMPT_RT))
> >> +				return false;
> >> +
> >>  			bnode = (struct kfree_rcu_bulk_data *)
> >>  				__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
> >>  		}
> >This will not be correctly working by just reverting everything to the
> >"rcu_head path" for CONFIG_PREEMPT_RT kernel. We need to preallocate at
> >least once. I can add caching on top of this series to address it.
> 
> I discussed with Vlad offline, this patch is fine for mainline as we
> don't have headless support yet. So this patch is good. Future patches
> adding caching will also add the headless support after additional
> caching, so skipping the allocation here is ok.

Sounds good!

But would one of the realtime guys be willing to give a Tested-by?

							Thanx, Paul

> Thanks.
> 
> - Joel
> 
> 
> 
> 
> >--
> >Vlad Rezki
> 
> -- 
> Sent from my Android device with K-9 Mail. Please excuse my brevity.


