* [PATCH v6 0/3] Introduce put_task_struct_atomic_sleep()
@ 2023-04-14 12:55 Wander Lairson Costa
  2023-04-14 12:55 ` [PATCH v6 1/3] sched/core: warn on call put_task_struct in invalid context Wander Lairson Costa
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Wander Lairson Costa @ 2023-04-14 12:55 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Will Deacon, Waiman Long, Boqun Feng,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
	Valentin Schneider, Eric W. Biederman, Wander Lairson Costa,
	Andrew Morton, Guo Ren, Kefeng Wang, Oleg Nesterov,
	Liam R. Howlett, Vlastimil Babka, Christian Brauner,
	Andrei Vagin, Shakeel Butt, open list,
	open list:PERFORMANCE EVENTS SUBSYSTEM

The put_task_struct() function decrements the task's usage refcount and
invokes __put_task_struct() once the counter reaches zero.

__put_task_struct() in turn indirectly acquires a spinlock, which
becomes a sleeping lock under the PREEMPT_RT configuration. As a
result, invoking put_task_struct() from atomic context is not safe on
real-time (RT) kernels.

To address this issue, this patch series introduces a new function,
put_task_struct_atomic_safe(). When the kernel is compiled with the
PREEMPT_RT configuration, this function defers the call to
__put_task_struct() to process context.

Additionally, the patch series converts the known problematic call
sites to the new function, as illustrated by the sketch below.
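To make the failure mode concrete, here is a minimal, hypothetical
sketch of the pattern this series targets and of a converted call
site. It is not taken from the series; example_drop_ref(),
example_drop_ref_fixed() and some_raw_lock are invented names used
only for illustration.

#include <linux/spinlock.h>
#include <linux/sched/task.h>	/* put_task_struct() and the new helper */

/* Hypothetical illustration only: example_drop_ref*() and some_raw_lock
 * are invented names, not part of this series or of the kernel tree. */
static DEFINE_RAW_SPINLOCK(some_raw_lock);

static void example_drop_ref(struct task_struct *p)
{
	unsigned long flags;

	raw_spin_lock_irqsave(&some_raw_lock, flags);	/* atomic context */

	/*
	 * If this drops the last reference, put_task_struct() ends up in
	 * __put_task_struct(), which indirectly takes spinlock_t locks.
	 * Under PREEMPT_RT those are sleeping locks, hence the splat.
	 */
	put_task_struct(p);

	raw_spin_unlock_irqrestore(&some_raw_lock, flags);
}

static void example_drop_ref_fixed(struct task_struct *p)
{
	unsigned long flags;

	raw_spin_lock_irqsave(&some_raw_lock, flags);

	/*
	 * The atomic-safe variant added by this series defers the final
	 * __put_task_struct() to process context via call_rcu() on
	 * PREEMPT_RT and behaves like plain put_task_struct() otherwise.
	 */
	put_task_struct_atomic_safe(p);

	raw_spin_unlock_irqrestore(&some_raw_lock, flags);
}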

Changelog
=========

v1:
* Initial implementation fixing the splat.

v2:
* Isolate the logic in its own function.
* Fix two more cases caught in review.

v3:
* Change __put_task_struct() to handle the issue internally.

v4:
* Explain why call_rcu() is safe to call from interrupt context.

v5:
* Explain why __put_task_struct() doesn't conflict with
  put_task_struct_rcu_user().

v6:
* As per Sebastian's review, go back to the v2 approach of using a
  distinct function.
* Add a check in put_task_struct() to warn when it is called from a
  non-sleepable context.
* Address more call sites.

Wander Lairson Costa (3):
  sched/core: warn on call put_task_struct in invalid context
  sched/task: Add the put_task_struct_atomic_safe() function
  treewide: replace put_task_struct() witht the atomic safe version

 include/linux/sched/task.h | 45 ++++++++++++++++++++++++++++++++++++++
 kernel/events/core.c       |  6 ++---
 kernel/fork.c              |  8 +++++++
 kernel/locking/rtmutex.c   | 10 ++++-----
 kernel/sched/core.c        |  6 ++---
 kernel/sched/deadline.c    | 16 +++++++-------
 kernel/sched/rt.c          |  4 ++--
 7 files changed, 74 insertions(+), 21 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v6 1/3] sched/core: warn on call put_task_struct in invalid context
  2023-04-14 12:55 [PATCH v6 0/3] Introduce put_task_struct_atomic_sleep() Wander Lairson Costa
@ 2023-04-14 12:55 ` Wander Lairson Costa
  2023-04-14 12:55 ` [PATCH v6 2/3] sched/task: Add the put_task_struct_atomic_safe() function Wander Lairson Costa
  2023-04-14 12:55 ` [PATCH v6 3/3] treewide: replace put_task_struct() witht the atomic safe version Wander Lairson Costa
  2 siblings, 0 replies; 14+ messages in thread
From: Wander Lairson Costa @ 2023-04-14 12:55 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Will Deacon, Waiman Long, Boqun Feng,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
	Valentin Schneider, Eric W. Biederman, Wander Lairson Costa,
	Russell King (Oracle),
	Oleg Nesterov, Kefeng Wang, Andrew Morton, Liam R. Howlett,
	Vlastimil Babka, Andrei Vagin, Shakeel Butt, open list,
	open list:PERFORMANCE EVENTS SUBSYSTEM
  Cc: Sebastian Andrzej Siewior

Under PREEMPT_RT, spinlocks become sleepable locks. put_task_struct()
indirectly acquires a spinlock. Therefore, it can't be called in
atomic/interrupt context in RT kernels.

To prevent such conditions, add a check for atomic/interrupt context
before calling put_task_struct().

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 include/linux/sched/task.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 357e0068497c..b597b97b1f8f 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -113,14 +113,28 @@ static inline struct task_struct *get_task_struct(struct task_struct *t)
 
 extern void __put_task_struct(struct task_struct *t);
 
+#define PUT_TASK_RESCHED_OFFSETS \
+	(rcu_preempt_depth() << MIGHT_RESCHED_RCU_SHIFT)
+
+#define __put_task_might_resched() \
+	__might_resched(__FILE__, __LINE__, PUT_TASK_RESCHED_OFFSETS)
+
+#define put_task_might_resched()			\
+	do {						\
+		if (IS_ENABLED(CONFIG_PREEMPT_RT))	\
+			__put_task_might_resched();	\
+	} while (0)
+
 static inline void put_task_struct(struct task_struct *t)
 {
+	put_task_might_resched();
 	if (refcount_dec_and_test(&t->usage))
 		__put_task_struct(t);
 }
 
 static inline void put_task_struct_many(struct task_struct *t, int nr)
 {
+	put_task_might_resched();
 	if (refcount_sub_and_test(nr, &t->usage))
 		__put_task_struct(t);
 }
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 2/3] sched/task: Add the put_task_struct_atomic_safe() function
  2023-04-14 12:55 [PATCH v6 0/3] Introduce put_task_struct_atomic_sleep() Wander Lairson Costa
  2023-04-14 12:55 ` [PATCH v6 1/3] sched/core: warn on call put_task_struct in invalid context Wander Lairson Costa
@ 2023-04-14 12:55 ` Wander Lairson Costa
  2023-04-17 18:51   ` Waiman Long
  2023-04-24 18:09   ` Paul E. McKenney
  2023-04-14 12:55 ` [PATCH v6 3/3] treewide: replace put_task_struct() witht the atomic safe version Wander Lairson Costa
  2 siblings, 2 replies; 14+ messages in thread
From: Wander Lairson Costa @ 2023-04-14 12:55 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Will Deacon, Waiman Long, Boqun Feng,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
	Valentin Schneider, Eric W. Biederman, Wander Lairson Costa,
	Michael Ellerman, Andrew Morton, Oleg Nesterov, Kefeng Wang,
	Liam R. Howlett, Kees Cook, Christian Brauner, Andrei Vagin,
	Shakeel Butt, open list, open list:PERFORMANCE EVENTS SUBSYSTEM
  Cc: Hu Chunyu, Paul McKenney, Thomas Gleixner

Due to the possibility of indirectly acquiring sleeping locks, it is
unsafe to call put_task_struct() in atomic contexts when the kernel is
compiled with PREEMPT_RT.

To mitigate this issue, this commit introduces
put_task_struct_atomic_safe(), which schedules __put_task_struct()
through call_rcu() when PREEMPT_RT is enabled. While a workqueue would
be a more natural approach, we cannot allocate dynamic memory from
atomic context in PREEMPT_RT, making the code more complex.

This implementation ensures safe execution in atomic contexts and
avoids any potential issues that may arise from using the non-atomic
version.

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Reported-by: Hu Chunyu <chuhu@redhat.com>
Cc: Paul McKenney <paulmck@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/sched/task.h | 31 +++++++++++++++++++++++++++++++
 kernel/fork.c              |  8 ++++++++
 2 files changed, 39 insertions(+)

diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index b597b97b1f8f..5c13b83d7008 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -141,6 +141,37 @@ static inline void put_task_struct_many(struct task_struct *t, int nr)
 
 void put_task_struct_rcu_user(struct task_struct *task);
 
+extern void __delayed_put_task_struct(struct rcu_head *rhp);
+
+static inline void put_task_struct_atomic_safe(struct task_struct *task)
+{
+	if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
+		/*
+		 * Decrement the refcount explicitly to avoid unnecessarily
+		 * calling call_rcu.
+		 */
+		if (refcount_dec_and_test(&task->usage))
+			/*
+			 * under PREEMPT_RT, we can't call put_task_struct
+			 * in atomic context because it will indirectly
+			 * acquire sleeping locks.
+			 * call_rcu() will schedule delayed_put_task_struct_rcu()
+			 * to be called in process context.
+			 *
+			 * __put_task_struct() is called called when
+			 * refcount_dec_and_test(&t->usage) succeeds.
+			 *
+			 * This means that it can't "conflict" with
+			 * put_task_struct_rcu_user() which abuses ->rcu the same
+			 * way; rcu_users has a reference so task->usage can't be
+			 * zero after rcu_users 1 -> 0 transition.
+			 */
+			call_rcu(&task->rcu, __delayed_put_task_struct);
+	} else {
+		put_task_struct(task);
+	}
+}
+
 /* Free all architecture-specific resources held by a thread. */
 void release_thread(struct task_struct *dead_task);
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 0c92f224c68c..9884794fe4b8 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -854,6 +854,14 @@ void __put_task_struct(struct task_struct *tsk)
 }
 EXPORT_SYMBOL_GPL(__put_task_struct);
 
+void __delayed_put_task_struct(struct rcu_head *rhp)
+{
+	struct task_struct *task = container_of(rhp, struct task_struct, rcu);
+
+	__put_task_struct(task);
+}
+EXPORT_SYMBOL_GPL(__delayed_put_task_struct);
+
 void __init __weak arch_task_cache_init(void) { }
 
 /*
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 3/3] treewide: replace put_task_struct() witht the atomic safe version
  2023-04-14 12:55 [PATCH v6 0/3] Introduce put_task_struct_atomic_sleep() Wander Lairson Costa
  2023-04-14 12:55 ` [PATCH v6 1/3] sched/core: warn on call put_task_struct in invalid context Wander Lairson Costa
  2023-04-14 12:55 ` [PATCH v6 2/3] sched/task: Add the put_task_struct_atomic_safe() function Wander Lairson Costa
@ 2023-04-14 12:55 ` Wander Lairson Costa
  2023-04-17 18:53   ` Waiman Long
  2 siblings, 1 reply; 14+ messages in thread
From: Wander Lairson Costa @ 2023-04-14 12:55 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Will Deacon, Waiman Long, Boqun Feng,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
	Valentin Schneider, Eric W. Biederman, Wander Lairson Costa,
	Michael Ellerman, Brian Cain, Kefeng Wang, Oleg Nesterov,
	Andrew Morton, Liam R. Howlett, Christian Brauner,
	Vlastimil Babka, Andrei Vagin, Shakeel Butt, open list,
	open list:PERFORMANCE EVENTS SUBSYSTEM

In places where put_task_struct() is called in a non-sleepable context,
we replace those calls by put_task_struct_atomic_safe().

These call sites were found by running internal regression tests and
looking for warnings generated by put_task_might_resched().

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Cc: Valentin Schneider <vschneid@redhat.com>
---
 kernel/events/core.c     |  6 +++---
 kernel/locking/rtmutex.c | 10 +++++-----
 kernel/sched/core.c      |  6 +++---
 kernel/sched/deadline.c  | 16 ++++++++--------
 kernel/sched/rt.c        |  4 ++--
 5 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 435815d3be3f..8f823da02324 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1181,7 +1181,7 @@ static void put_ctx(struct perf_event_context *ctx)
 		if (ctx->parent_ctx)
 			put_ctx(ctx->parent_ctx);
 		if (ctx->task && ctx->task != TASK_TOMBSTONE)
-			put_task_struct(ctx->task);
+			put_task_struct_atomic_safe(ctx->task);
 		call_rcu(&ctx->rcu_head, free_ctx);
 	}
 }
@@ -13019,7 +13019,7 @@ static void perf_event_exit_task_context(struct task_struct *child)
 	RCU_INIT_POINTER(child->perf_event_ctxp, NULL);
 	put_ctx(child_ctx); /* cannot be last */
 	WRITE_ONCE(child_ctx->task, TASK_TOMBSTONE);
-	put_task_struct(current); /* cannot be last */
+	put_task_struct_atomic_safe(current); /* cannot be last */
 
 	clone_ctx = unclone_ctx(child_ctx);
 	raw_spin_unlock_irq(&child_ctx->lock);
@@ -13124,7 +13124,7 @@ void perf_event_free_task(struct task_struct *task)
 	 */
 	RCU_INIT_POINTER(task->perf_event_ctxp, NULL);
 	WRITE_ONCE(ctx->task, TASK_TOMBSTONE);
-	put_task_struct(task); /* cannot be last */
+	put_task_struct_atomic_safe(task); /* cannot be last */
 	raw_spin_unlock_irq(&ctx->lock);
 
 
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 728f434de2bb..3ecb8d6ae039 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -509,7 +509,7 @@ static __always_inline void rt_mutex_wake_up_q(struct rt_wake_q_head *wqh)
 {
 	if (IS_ENABLED(CONFIG_PREEMPT_RT) && wqh->rtlock_task) {
 		wake_up_state(wqh->rtlock_task, TASK_RTLOCK_WAIT);
-		put_task_struct(wqh->rtlock_task);
+		put_task_struct_atomic_safe(wqh->rtlock_task);
 		wqh->rtlock_task = NULL;
 	}
 
@@ -649,7 +649,7 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
 			       "task: %s (%d)\n", max_lock_depth,
 			       top_task->comm, task_pid_nr(top_task));
 		}
-		put_task_struct(task);
+		put_task_struct_atomic_safe(task);
 
 		return -EDEADLK;
 	}
@@ -817,7 +817,7 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
 		 * No requeue[7] here. Just release @task [8]
 		 */
 		raw_spin_unlock(&task->pi_lock);
-		put_task_struct(task);
+		put_task_struct_atomic_safe(task);
 
 		/*
 		 * [9] check_exit_conditions_3 protected by lock->wait_lock.
@@ -886,7 +886,7 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
 
 	/* [8] Release the task */
 	raw_spin_unlock(&task->pi_lock);
-	put_task_struct(task);
+	put_task_struct_atomic_safe(task);
 
 	/*
 	 * [9] check_exit_conditions_3 protected by lock->wait_lock.
@@ -990,7 +990,7 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
  out_unlock_pi:
 	raw_spin_unlock_irq(&task->pi_lock);
  out_put_task:
-	put_task_struct(task);
+	put_task_struct_atomic_safe(task);
 
 	return ret;
 }
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0d18c3969f90..a4783f0c9f01 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1007,7 +1007,7 @@ void wake_up_q(struct wake_q_head *head)
 		 * the queueing in wake_q_add() so as not to miss wakeups.
 		 */
 		wake_up_process(task);
-		put_task_struct(task);
+		put_task_struct_atomic_safe(task);
 	}
 }
 
@@ -2528,7 +2528,7 @@ int push_cpu_stop(void *arg)
 	raw_spin_rq_unlock(rq);
 	raw_spin_unlock_irq(&p->pi_lock);
 
-	put_task_struct(p);
+	put_task_struct_atomic_safe(p);
 	return 0;
 }
 
@@ -9316,7 +9316,7 @@ static int __balance_push_cpu_stop(void *arg)
 	rq_unlock(rq, &rf);
 	raw_spin_unlock_irq(&p->pi_lock);
 
-	put_task_struct(p);
+	put_task_struct_atomic_safe(p);
 
 	return 0;
 }
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 71b24371a6f7..0f8b8a490dc0 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -327,7 +327,7 @@ static void dl_change_utilization(struct task_struct *p, u64 new_bw)
 		 * so we are still safe.
 		 */
 		if (hrtimer_try_to_cancel(&p->dl.inactive_timer) == 1)
-			put_task_struct(p);
+			put_task_struct_atomic_safe(p);
 	}
 	__sub_rq_bw(p->dl.dl_bw, &rq->dl);
 	__add_rq_bw(new_bw, &rq->dl);
@@ -467,7 +467,7 @@ static void task_contending(struct sched_dl_entity *dl_se, int flags)
 		 * so we are still safe.
 		 */
 		if (hrtimer_try_to_cancel(&dl_se->inactive_timer) == 1)
-			put_task_struct(dl_task_of(dl_se));
+			put_task_struct_atomic_safe(dl_task_of(dl_se));
 	} else {
 		/*
 		 * Since "dl_non_contending" is not set, the
@@ -1207,7 +1207,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
 	 * This can free the task_struct, including this hrtimer, do not touch
 	 * anything related to that after this.
 	 */
-	put_task_struct(p);
+	put_task_struct_atomic_safe(p);
 
 	return HRTIMER_NORESTART;
 }
@@ -1442,7 +1442,7 @@ static enum hrtimer_restart inactive_task_timer(struct hrtimer *timer)
 	dl_se->dl_non_contending = 0;
 unlock:
 	task_rq_unlock(rq, p, &rf);
-	put_task_struct(p);
+	put_task_struct_atomic_safe(p);
 
 	return HRTIMER_NORESTART;
 }
@@ -1899,7 +1899,7 @@ static void migrate_task_rq_dl(struct task_struct *p, int new_cpu __maybe_unused
 		 * so we are still safe.
 		 */
 		if (hrtimer_try_to_cancel(&p->dl.inactive_timer) == 1)
-			put_task_struct(p);
+			put_task_struct_atomic_safe(p);
 	}
 	sub_rq_bw(&p->dl, &rq->dl);
 	rq_unlock(rq, &rf);
@@ -2351,7 +2351,7 @@ static int push_dl_task(struct rq *rq)
 			/* No more tasks */
 			goto out;
 
-		put_task_struct(next_task);
+		put_task_struct_atomic_safe(next_task);
 		next_task = task;
 		goto retry;
 	}
@@ -2366,7 +2366,7 @@ static int push_dl_task(struct rq *rq)
 	double_unlock_balance(rq, later_rq);
 
 out:
-	put_task_struct(next_task);
+	put_task_struct_atomic_safe(next_task);
 
 	return ret;
 }
@@ -2633,7 +2633,7 @@ static void switched_from_dl(struct rq *rq, struct task_struct *p)
 static void switched_to_dl(struct rq *rq, struct task_struct *p)
 {
 	if (hrtimer_try_to_cancel(&p->dl.inactive_timer) == 1)
-		put_task_struct(p);
+		put_task_struct_atomic_safe(p);
 
 	/* If p is not queued we will update its parameters at next wakeup. */
 	if (!task_on_rq_queued(p)) {
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 0a11f44adee5..e58a84535f61 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2150,7 +2150,7 @@ static int push_rt_task(struct rq *rq, bool pull)
 		/*
 		 * Something has shifted, try again.
 		 */
-		put_task_struct(next_task);
+		put_task_struct_atomic_safe(next_task);
 		next_task = task;
 		goto retry;
 	}
@@ -2163,7 +2163,7 @@ static int push_rt_task(struct rq *rq, bool pull)
 
 	double_unlock_balance(rq, lowest_rq);
 out:
-	put_task_struct(next_task);
+	put_task_struct_atomic_safe(next_task);
 
 	return ret;
 }
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 2/3] sched/task: Add the put_task_struct_atomic_safe() function
  2023-04-14 12:55 ` [PATCH v6 2/3] sched/task: Add the put_task_struct_atomic_safe() function Wander Lairson Costa
@ 2023-04-17 18:51   ` Waiman Long
  2023-04-18 14:18     ` Wander Lairson Costa
  2023-04-24 18:09   ` Paul E. McKenney
  1 sibling, 1 reply; 14+ messages in thread
From: Waiman Long @ 2023-04-17 18:51 UTC (permalink / raw)
  To: Wander Lairson Costa, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Ian Rogers, Adrian Hunter, Will Deacon,
	Boqun Feng, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider,
	Eric W. Biederman, Michael Ellerman, Andrew Morton,
	Oleg Nesterov, Kefeng Wang, Liam R. Howlett, Kees Cook,
	Christian Brauner, Andrei Vagin, Shakeel Butt, open list,
	open list:PERFORMANCE EVENTS SUBSYSTEM
  Cc: Hu Chunyu, Paul McKenney, Thomas Gleixner


On 4/14/23 08:55, Wander Lairson Costa wrote:
> Due to the possibility of indirectly acquiring sleeping locks, it is
> unsafe to call put_task_struct() in atomic contexts when the kernel is
> compiled with PREEMPT_RT.
>
> To mitigate this issue, this commit introduces
> put_task_struct_atomic_safe(), which schedules __put_task_struct()
> through call_rcu() when PREEMPT_RT is enabled. While a workqueue would
> be a more natural approach, we cannot allocate dynamic memory from
> atomic context in PREEMPT_RT, making the code more complex.
>
> This implementation ensures safe execution in atomic contexts and
> avoids any potential issues that may arise from using the non-atomic
> version.
>
> Signed-off-by: Wander Lairson Costa <wander@redhat.com>
> Reported-by: Hu Chunyu <chuhu@redhat.com>
> Cc: Paul McKenney <paulmck@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> ---
>   include/linux/sched/task.h | 31 +++++++++++++++++++++++++++++++
>   kernel/fork.c              |  8 ++++++++
>   2 files changed, 39 insertions(+)
>
> diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
> index b597b97b1f8f..5c13b83d7008 100644
> --- a/include/linux/sched/task.h
> +++ b/include/linux/sched/task.h
> @@ -141,6 +141,37 @@ static inline void put_task_struct_many(struct task_struct *t, int nr)
>   
>   void put_task_struct_rcu_user(struct task_struct *task);
>   
> +extern void __delayed_put_task_struct(struct rcu_head *rhp);
> +
> +static inline void put_task_struct_atomic_safe(struct task_struct *task)
> +{
> +	if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> +		/*
> +		 * Decrement the refcount explicitly to avoid unnecessarily
> +		 * calling call_rcu.
> +		 */
> +		if (refcount_dec_and_test(&task->usage))
> +			/*
> +			 * under PREEMPT_RT, we can't call put_task_struct
> +			 * in atomic context because it will indirectly
> +			 * acquire sleeping locks.
> +			 * call_rcu() will schedule delayed_put_task_struct_rcu()
delayed_put_task_struct_rcu()?
> +			 * to be called in process context.
> +			 *
> +			 * __put_task_struct() is called called when
"called called"?
> +			 * refcount_dec_and_test(&t->usage) succeeds.
> +			 *
> +			 * This means that it can't "conflict" with
> +			 * put_task_struct_rcu_user() which abuses ->rcu the same
> +			 * way; rcu_users has a reference so task->usage can't be
> +			 * zero after rcu_users 1 -> 0 transition.

Note that put_task_struct_rcu_user() isn't the only user of task->rcu. 
delayed_free_task() in kernel/fork.c also uses it, though it is only 
called in the error case. Still you may need to take a look to make sure 
that there is no conflict.
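
For reference, that error-path user looks roughly like the sketch
below. This is reconstructed from memory of kernel/fork.c rather than
copied from the tree, so details (the CONFIG_MEMCG condition in
particular) may differ between kernel versions:

/* Rough sketch of the other ->rcu user in kernel/fork.c; reconstructed
 * from memory, not copied verbatim from the tree. */
static void __delayed_free_task(struct rcu_head *rhp)
{
	struct task_struct *tsk = container_of(rhp, struct task_struct, rcu);

	free_task(tsk);
}

static __always_inline void delayed_free_task(struct task_struct *tsk)
{
	if (IS_ENABLED(CONFIG_MEMCG))
		call_rcu(&tsk->rcu, __delayed_free_task);
	else
		free_task(tsk);
}

/* Its only caller is the copy_process() error path, i.e. a task that
 * was never made visible and never had its ->usage dropped elsewhere. */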

Cheers,
Longman


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 3/3] treewide: replace put_task_struct() witht the atomic safe version
  2023-04-14 12:55 ` [PATCH v6 3/3] treewide: replace put_task_struct() witht the atomic safe version Wander Lairson Costa
@ 2023-04-17 18:53   ` Waiman Long
  0 siblings, 0 replies; 14+ messages in thread
From: Waiman Long @ 2023-04-17 18:53 UTC (permalink / raw)
  To: Wander Lairson Costa, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Ian Rogers, Adrian Hunter, Will Deacon,
	Boqun Feng, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider,
	Eric W. Biederman, Michael Ellerman, Brian Cain, Kefeng Wang,
	Oleg Nesterov, Andrew Morton, Liam R. Howlett, Christian Brauner,
	Vlastimil Babka, Andrei Vagin, Shakeel Butt, open list,
	open list:PERFORMANCE EVENTS SUBSYSTEM


On 4/14/23 08:55, Wander Lairson Costa wrote:
> In places where put_task_struct() is called in a non-sleepable context,
> we replace those calls by put_task_struct_atomic_safe().
>
> These call sites were found by running internal regression tests and
> looking for warnings generated by put_task_might_resched().
>
> Signed-off-by: Wander Lairson Costa <wander@redhat.com>
> Cc: Valentin Schneider <vschneid@redhat.com>
> ---
>   kernel/events/core.c     |  6 +++---
>   kernel/locking/rtmutex.c | 10 +++++-----
>   kernel/sched/core.c      |  6 +++---
>   kernel/sched/deadline.c  | 16 ++++++++--------
>   kernel/sched/rt.c        |  4 ++--
>   5 files changed, 21 insertions(+), 21 deletions(-)

Typo "witht" in your patch title.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 2/3] sched/task: Add the put_task_struct_atomic_safe() function
  2023-04-17 18:51   ` Waiman Long
@ 2023-04-18 14:18     ` Wander Lairson Costa
  2023-04-18 14:26       ` Waiman Long
  0 siblings, 1 reply; 14+ messages in thread
From: Wander Lairson Costa @ 2023-04-18 14:18 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Will Deacon, Boqun Feng, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider,
	Eric W. Biederman, Michael Ellerman, Andrew Morton,
	Oleg Nesterov, Kefeng Wang, Liam R. Howlett, Kees Cook,
	Christian Brauner, Andrei Vagin, Shakeel Butt, open list,
	open list:PERFORMANCE EVENTS SUBSYSTEM, Hu Chunyu, Paul McKenney,
	Thomas Gleixner

On Mon, Apr 17, 2023 at 02:51:45PM -0400, Waiman Long wrote:
> 
> On 4/14/23 08:55, Wander Lairson Costa wrote:
> > Due to the possibility of indirectly acquiring sleeping locks, it is
> > unsafe to call put_task_struct() in atomic contexts when the kernel is
> > compiled with PREEMPT_RT.
> > 
> > To mitigate this issue, this commit introduces
> > put_task_struct_atomic_safe(), which schedules __put_task_struct()
> > through call_rcu() when PREEMPT_RT is enabled. While a workqueue would
> > be a more natural approach, we cannot allocate dynamic memory from
> > atomic context in PREEMPT_RT, making the code more complex.
> > 
> > This implementation ensures safe execution in atomic contexts and
> > avoids any potential issues that may arise from using the non-atomic
> > version.
> > 
> > Signed-off-by: Wander Lairson Costa <wander@redhat.com>
> > Reported-by: Hu Chunyu <chuhu@redhat.com>
> > Cc: Paul McKenney <paulmck@kernel.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > ---
> >   include/linux/sched/task.h | 31 +++++++++++++++++++++++++++++++
> >   kernel/fork.c              |  8 ++++++++
> >   2 files changed, 39 insertions(+)
> > 
> > diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
> > index b597b97b1f8f..5c13b83d7008 100644
> > --- a/include/linux/sched/task.h
> > +++ b/include/linux/sched/task.h
> > @@ -141,6 +141,37 @@ static inline void put_task_struct_many(struct task_struct *t, int nr)
> >   void put_task_struct_rcu_user(struct task_struct *task);
> > +extern void __delayed_put_task_struct(struct rcu_head *rhp);
> > +
> > +static inline void put_task_struct_atomic_safe(struct task_struct *task)
> > +{
> > +	if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> > +		/*
> > +		 * Decrement the refcount explicitly to avoid unnecessarily
> > +		 * calling call_rcu.
> > +		 */
> > +		if (refcount_dec_and_test(&task->usage))
> > +			/*
> > +			 * under PREEMPT_RT, we can't call put_task_struct
> > +			 * in atomic context because it will indirectly
> > +			 * acquire sleeping locks.
> > +			 * call_rcu() will schedule delayed_put_task_struct_rcu()
> delayed_put_task_struct_rcu()?
> > +			 * to be called in process context.
> > +			 *
> > +			 * __put_task_struct() is called called when
> "called called"?
> > +			 * refcount_dec_and_test(&t->usage) succeeds.
> > +			 *
> > +			 * This means that it can't "conflict" with
> > +			 * put_task_struct_rcu_user() which abuses ->rcu the same
> > +			 * way; rcu_users has a reference so task->usage can't be
> > +			 * zero after rcu_users 1 -> 0 transition.
> 
> Note that put_task_struct_rcu_user() isn't the only user of task->rcu.
> delayed_free_task() in kernel/fork.c also uses it, though it is only called
> in the error case. Still you may need to take a look to make sure that there
> is no conflict.
> 

delayed_free_task() is called when a process fails to start. Therefore, AFAICT,
there is no way it can conflict with put_task_struct().

> Cheers,
> Longman
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 2/3] sched/task: Add the put_task_struct_atomic_safe() function
  2023-04-18 14:18     ` Wander Lairson Costa
@ 2023-04-18 14:26       ` Waiman Long
  0 siblings, 0 replies; 14+ messages in thread
From: Waiman Long @ 2023-04-18 14:26 UTC (permalink / raw)
  To: Wander Lairson Costa
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Will Deacon, Boqun Feng, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider,
	Eric W. Biederman, Michael Ellerman, Andrew Morton,
	Oleg Nesterov, Kefeng Wang, Liam R. Howlett, Kees Cook,
	Christian Brauner, Andrei Vagin, Shakeel Butt, open list,
	open list:PERFORMANCE EVENTS SUBSYSTEM, Hu Chunyu, Paul McKenney,
	Thomas Gleixner

On 4/18/23 10:18, Wander Lairson Costa wrote:
> On Mon, Apr 17, 2023 at 02:51:45PM -0400, Waiman Long wrote:
>> On 4/14/23 08:55, Wander Lairson Costa wrote:
>>> Due to the possibility of indirectly acquiring sleeping locks, it is
>>> unsafe to call put_task_struct() in atomic contexts when the kernel is
>>> compiled with PREEMPT_RT.
>>>
>>> To mitigate this issue, this commit introduces
>>> put_task_struct_atomic_safe(), which schedules __put_task_struct()
>>> through call_rcu() when PREEMPT_RT is enabled. While a workqueue would
>>> be a more natural approach, we cannot allocate dynamic memory from
>>> atomic context in PREEMPT_RT, making the code more complex.
>>>
>>> This implementation ensures safe execution in atomic contexts and
>>> avoids any potential issues that may arise from using the non-atomic
>>> version.
>>>
>>> Signed-off-by: Wander Lairson Costa <wander@redhat.com>
>>> Reported-by: Hu Chunyu <chuhu@redhat.com>
>>> Cc: Paul McKenney <paulmck@kernel.org>
>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>> ---
>>>    include/linux/sched/task.h | 31 +++++++++++++++++++++++++++++++
>>>    kernel/fork.c              |  8 ++++++++
>>>    2 files changed, 39 insertions(+)
>>>
>>> diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
>>> index b597b97b1f8f..5c13b83d7008 100644
>>> --- a/include/linux/sched/task.h
>>> +++ b/include/linux/sched/task.h
>>> @@ -141,6 +141,37 @@ static inline void put_task_struct_many(struct task_struct *t, int nr)
>>>    void put_task_struct_rcu_user(struct task_struct *task);
>>> +extern void __delayed_put_task_struct(struct rcu_head *rhp);
>>> +
>>> +static inline void put_task_struct_atomic_safe(struct task_struct *task)
>>> +{
>>> +	if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
>>> +		/*
>>> +		 * Decrement the refcount explicitly to avoid unnecessarily
>>> +		 * calling call_rcu.
>>> +		 */
>>> +		if (refcount_dec_and_test(&task->usage))
>>> +			/*
>>> +			 * under PREEMPT_RT, we can't call put_task_struct
>>> +			 * in atomic context because it will indirectly
>>> +			 * acquire sleeping locks.
>>> +			 * call_rcu() will schedule delayed_put_task_struct_rcu()
>> delayed_put_task_struct_rcu()?
>>> +			 * to be called in process context.
>>> +			 *
>>> +			 * __put_task_struct() is called called when
>> "called called"?
>>> +			 * refcount_dec_and_test(&t->usage) succeeds.
>>> +			 *
>>> +			 * This means that it can't "conflict" with
>>> +			 * put_task_struct_rcu_user() which abuses ->rcu the same
>>> +			 * way; rcu_users has a reference so task->usage can't be
>>> +			 * zero after rcu_users 1 -> 0 transition.
>> Note that put_task_struct_rcu_user() isn't the only user of task->rcu.
>> delayed_free_task() in kernel/fork.c also uses it, though it is only called
>> in the error case. Still you may need to take a look to make sure that there
>> is no conflict.
>>
> delayed_free_task() is called when a process fails to start. Therefore, AFAICT,
> there is no way it can conflict with put_task_struct().

I think so too, but for completeness, you should document somewhere that 
it is a possible conflicting user.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 2/3] sched/task: Add the put_task_struct_atomic_safe() function
  2023-04-14 12:55 ` [PATCH v6 2/3] sched/task: Add the put_task_struct_atomic_safe() function Wander Lairson Costa
  2023-04-17 18:51   ` Waiman Long
@ 2023-04-24 18:09   ` Paul E. McKenney
  2023-04-24 18:43     ` Wander Lairson Costa
  1 sibling, 1 reply; 14+ messages in thread
From: Paul E. McKenney @ 2023-04-24 18:09 UTC (permalink / raw)
  To: Wander Lairson Costa
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Will Deacon, Waiman Long, Boqun Feng,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
	Valentin Schneider, Eric W. Biederman, Michael Ellerman,
	Andrew Morton, Oleg Nesterov, Kefeng Wang, Liam R. Howlett,
	Kees Cook, Christian Brauner, Andrei Vagin, Shakeel Butt,
	open list, open list:PERFORMANCE EVENTS SUBSYSTEM, Hu Chunyu,
	Thomas Gleixner

On Fri, Apr 14, 2023 at 09:55:28AM -0300, Wander Lairson Costa wrote:
> Due to the possibility of indirectly acquiring sleeping locks, it is
> unsafe to call put_task_struct() in atomic contexts when the kernel is
> compiled with PREEMPT_RT.
> 
> To mitigate this issue, this commit introduces
> put_task_struct_atomic_safe(), which schedules __put_task_struct()
> through call_rcu() when PREEMPT_RT is enabled. While a workqueue would
> be a more natural approach, we cannot allocate dynamic memory from
> atomic context in PREEMPT_RT, making the code more complex.
> 
> This implementation ensures safe execution in atomic contexts and
> avoids any potential issues that may arise from using the non-atomic
> version.
> 
> Signed-off-by: Wander Lairson Costa <wander@redhat.com>
> Reported-by: Hu Chunyu <chuhu@redhat.com>
> Cc: Paul McKenney <paulmck@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> ---
>  include/linux/sched/task.h | 31 +++++++++++++++++++++++++++++++
>  kernel/fork.c              |  8 ++++++++
>  2 files changed, 39 insertions(+)
> 
> diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
> index b597b97b1f8f..5c13b83d7008 100644
> --- a/include/linux/sched/task.h
> +++ b/include/linux/sched/task.h
> @@ -141,6 +141,37 @@ static inline void put_task_struct_many(struct task_struct *t, int nr)
>  
>  void put_task_struct_rcu_user(struct task_struct *task);
>  
> +extern void __delayed_put_task_struct(struct rcu_head *rhp);
> +
> +static inline void put_task_struct_atomic_safe(struct task_struct *task)
> +{
> +	if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> +		/*
> +		 * Decrement the refcount explicitly to avoid unnecessarily
> +		 * calling call_rcu.
> +		 */
> +		if (refcount_dec_and_test(&task->usage))
> +			/*
> +			 * under PREEMPT_RT, we can't call put_task_struct
> +			 * in atomic context because it will indirectly
> +			 * acquire sleeping locks.
> +			 * call_rcu() will schedule delayed_put_task_struct_rcu()
> +			 * to be called in process context.
> +			 *
> +			 * __put_task_struct() is called called when
> +			 * refcount_dec_and_test(&t->usage) succeeds.
> +			 *
> +			 * This means that it can't "conflict" with
> +			 * put_task_struct_rcu_user() which abuses ->rcu the same
> +			 * way; rcu_users has a reference so task->usage can't be
> +			 * zero after rcu_users 1 -> 0 transition.
> +			 */
> +			call_rcu(&task->rcu, __delayed_put_task_struct);

This will invoke __delayed_put_task_struct() with softirqs disabled.
Or do softirq-disabled contexts count as non-atomic in PREEMPT_RT?

							Thanx, Paul

> +	} else {
> +		put_task_struct(task);
> +	}
> +}
> +
>  /* Free all architecture-specific resources held by a thread. */
>  void release_thread(struct task_struct *dead_task);
>  
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 0c92f224c68c..9884794fe4b8 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -854,6 +854,14 @@ void __put_task_struct(struct task_struct *tsk)
>  }
>  EXPORT_SYMBOL_GPL(__put_task_struct);
>  
> +void __delayed_put_task_struct(struct rcu_head *rhp)
> +{
> +	struct task_struct *task = container_of(rhp, struct task_struct, rcu);
> +
> +	__put_task_struct(task);
> +}
> +EXPORT_SYMBOL_GPL(__delayed_put_task_struct);
> +
>  void __init __weak arch_task_cache_init(void) { }
>  
>  /*
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 2/3] sched/task: Add the put_task_struct_atomic_safe() function
  2023-04-24 18:09   ` Paul E. McKenney
@ 2023-04-24 18:43     ` Wander Lairson Costa
  2023-04-24 18:52       ` Paul E. McKenney
  0 siblings, 1 reply; 14+ messages in thread
From: Wander Lairson Costa @ 2023-04-24 18:43 UTC (permalink / raw)
  To: paulmck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Will Deacon, Waiman Long, Boqun Feng,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
	Valentin Schneider, Eric W. Biederman, Michael Ellerman,
	Andrew Morton, Oleg Nesterov, Kefeng Wang, Liam R. Howlett,
	Kees Cook, Christian Brauner, Andrei Vagin, Shakeel Butt,
	open list, open list:PERFORMANCE EVENTS SUBSYSTEM, Hu Chunyu,
	Thomas Gleixner

On Mon, Apr 24, 2023 at 3:09 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Fri, Apr 14, 2023 at 09:55:28AM -0300, Wander Lairson Costa wrote:
> > Due to the possibility of indirectly acquiring sleeping locks, it is
> > unsafe to call put_task_struct() in atomic contexts when the kernel is
> > compiled with PREEMPT_RT.
> >
> > To mitigate this issue, this commit introduces
> > put_task_struct_atomic_safe(), which schedules __put_task_struct()
> > through call_rcu() when PREEMPT_RT is enabled. While a workqueue would
> > be a more natural approach, we cannot allocate dynamic memory from
> > atomic context in PREEMPT_RT, making the code more complex.
> >
> > This implementation ensures safe execution in atomic contexts and
> > avoids any potential issues that may arise from using the non-atomic
> > version.
> >
> > Signed-off-by: Wander Lairson Costa <wander@redhat.com>
> > Reported-by: Hu Chunyu <chuhu@redhat.com>
> > Cc: Paul McKenney <paulmck@kernel.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > ---
> >  include/linux/sched/task.h | 31 +++++++++++++++++++++++++++++++
> >  kernel/fork.c              |  8 ++++++++
> >  2 files changed, 39 insertions(+)
> >
> > diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
> > index b597b97b1f8f..5c13b83d7008 100644
> > --- a/include/linux/sched/task.h
> > +++ b/include/linux/sched/task.h
> > @@ -141,6 +141,37 @@ static inline void put_task_struct_many(struct task_struct *t, int nr)
> >
> >  void put_task_struct_rcu_user(struct task_struct *task);
> >
> > +extern void __delayed_put_task_struct(struct rcu_head *rhp);
> > +
> > +static inline void put_task_struct_atomic_safe(struct task_struct *task)
> > +{
> > +     if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> > +             /*
> > +              * Decrement the refcount explicitly to avoid unnecessarily
> > +              * calling call_rcu.
> > +              */
> > +             if (refcount_dec_and_test(&task->usage))
> > +                     /*
> > +                      * under PREEMPT_RT, we can't call put_task_struct
> > +                      * in atomic context because it will indirectly
> > +                      * acquire sleeping locks.
> > +                      * call_rcu() will schedule delayed_put_task_struct_rcu()
> > +                      * to be called in process context.
> > +                      *
> > +                      * __put_task_struct() is called called when
> > +                      * refcount_dec_and_test(&t->usage) succeeds.
> > +                      *
> > +                      * This means that it can't "conflict" with
> > +                      * put_task_struct_rcu_user() which abuses ->rcu the same
> > +                      * way; rcu_users has a reference so task->usage can't be
> > +                      * zero after rcu_users 1 -> 0 transition.
> > +                      */
> > +                     call_rcu(&task->rcu, __delayed_put_task_struct);
>
> This will invoke __delayed_put_task_struct() with softirqs disabled.
> Or do softirq-disabled contexts count as non-atomic in PREEMPT_RT?
>

softirqs execute in thread context in PREEMPT_RT. We are good here.

>                                                         Thanx, Paul
>
> > +     } else {
> > +             put_task_struct(task);
> > +     }
> > +}
> > +
> >  /* Free all architecture-specific resources held by a thread. */
> >  void release_thread(struct task_struct *dead_task);
> >
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index 0c92f224c68c..9884794fe4b8 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -854,6 +854,14 @@ void __put_task_struct(struct task_struct *tsk)
> >  }
> >  EXPORT_SYMBOL_GPL(__put_task_struct);
> >
> > +void __delayed_put_task_struct(struct rcu_head *rhp)
> > +{
> > +     struct task_struct *task = container_of(rhp, struct task_struct, rcu);
> > +
> > +     __put_task_struct(task);
> > +}
> > +EXPORT_SYMBOL_GPL(__delayed_put_task_struct);
> > +
> >  void __init __weak arch_task_cache_init(void) { }
> >
> >  /*
> > --
> > 2.39.2
> >
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 2/3] sched/task: Add the put_task_struct_atomic_safe() function
  2023-04-24 18:43     ` Wander Lairson Costa
@ 2023-04-24 18:52       ` Paul E. McKenney
  2023-04-24 20:34         ` Wander Lairson Costa
  2023-04-24 20:34         ` Steven Rostedt
  0 siblings, 2 replies; 14+ messages in thread
From: Paul E. McKenney @ 2023-04-24 18:52 UTC (permalink / raw)
  To: Wander Lairson Costa
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Will Deacon, Waiman Long, Boqun Feng,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
	Valentin Schneider, Eric W. Biederman, Michael Ellerman,
	Andrew Morton, Oleg Nesterov, Kefeng Wang, Liam R. Howlett,
	Kees Cook, Christian Brauner, Andrei Vagin, Shakeel Butt,
	open list, open list:PERFORMANCE EVENTS SUBSYSTEM, Hu Chunyu,
	Thomas Gleixner

On Mon, Apr 24, 2023 at 03:43:09PM -0300, Wander Lairson Costa wrote:
> On Mon, Apr 24, 2023 at 3:09 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Fri, Apr 14, 2023 at 09:55:28AM -0300, Wander Lairson Costa wrote:
> > > Due to the possibility of indirectly acquiring sleeping locks, it is
> > > unsafe to call put_task_struct() in atomic contexts when the kernel is
> > > compiled with PREEMPT_RT.
> > >
> > > To mitigate this issue, this commit introduces
> > > put_task_struct_atomic_safe(), which schedules __put_task_struct()
> > > through call_rcu() when PREEMPT_RT is enabled. While a workqueue would
> > > be a more natural approach, we cannot allocate dynamic memory from
> > > atomic context in PREEMPT_RT, making the code more complex.
> > >
> > > This implementation ensures safe execution in atomic contexts and
> > > avoids any potential issues that may arise from using the non-atomic
> > > version.
> > >
> > > Signed-off-by: Wander Lairson Costa <wander@redhat.com>
> > > Reported-by: Hu Chunyu <chuhu@redhat.com>
> > > Cc: Paul McKenney <paulmck@kernel.org>
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > ---
> > >  include/linux/sched/task.h | 31 +++++++++++++++++++++++++++++++
> > >  kernel/fork.c              |  8 ++++++++
> > >  2 files changed, 39 insertions(+)
> > >
> > > diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
> > > index b597b97b1f8f..5c13b83d7008 100644
> > > --- a/include/linux/sched/task.h
> > > +++ b/include/linux/sched/task.h
> > > @@ -141,6 +141,37 @@ static inline void put_task_struct_many(struct task_struct *t, int nr)
> > >
> > >  void put_task_struct_rcu_user(struct task_struct *task);
> > >
> > > +extern void __delayed_put_task_struct(struct rcu_head *rhp);
> > > +
> > > +static inline void put_task_struct_atomic_safe(struct task_struct *task)
> > > +{
> > > +     if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> > > +             /*
> > > +              * Decrement the refcount explicitly to avoid unnecessarily
> > > +              * calling call_rcu.
> > > +              */
> > > +             if (refcount_dec_and_test(&task->usage))
> > > +                     /*
> > > +                      * under PREEMPT_RT, we can't call put_task_struct
> > > +                      * in atomic context because it will indirectly
> > > +                      * acquire sleeping locks.
> > > +                      * call_rcu() will schedule delayed_put_task_struct_rcu()
> > > +                      * to be called in process context.
> > > +                      *
> > > +                      * __put_task_struct() is called called when
> > > +                      * refcount_dec_and_test(&t->usage) succeeds.
> > > +                      *
> > > +                      * This means that it can't "conflict" with
> > > +                      * put_task_struct_rcu_user() which abuses ->rcu the same
> > > +                      * way; rcu_users has a reference so task->usage can't be
> > > +                      * zero after rcu_users 1 -> 0 transition.
> > > +                      */
> > > +                     call_rcu(&task->rcu, __delayed_put_task_struct);
> >
> > This will invoke __delayed_put_task_struct() with softirqs disabled.
> > Or do softirq-disabled contexts count as non-atomic in PREEMPT_RT?
> 
> softirqs execute in thread context in PREEMPT_RT. We are good here.

So the sleeping lock is a spinlock rather than (say) a mutex?

                                                        Thanx, Paul

> > > +     } else {
> > > +             put_task_struct(task);
> > > +     }
> > > +}
> > > +
> > >  /* Free all architecture-specific resources held by a thread. */
> > >  void release_thread(struct task_struct *dead_task);
> > >
> > > diff --git a/kernel/fork.c b/kernel/fork.c
> > > index 0c92f224c68c..9884794fe4b8 100644
> > > --- a/kernel/fork.c
> > > +++ b/kernel/fork.c
> > > @@ -854,6 +854,14 @@ void __put_task_struct(struct task_struct *tsk)
> > >  }
> > >  EXPORT_SYMBOL_GPL(__put_task_struct);
> > >
> > > +void __delayed_put_task_struct(struct rcu_head *rhp)
> > > +{
> > > +     struct task_struct *task = container_of(rhp, struct task_struct, rcu);
> > > +
> > > +     __put_task_struct(task);
> > > +}
> > > +EXPORT_SYMBOL_GPL(__delayed_put_task_struct);
> > > +
> > >  void __init __weak arch_task_cache_init(void) { }
> > >
> > >  /*
> > > --
> > > 2.39.2
> > >
> >
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 2/3] sched/task: Add the put_task_struct_atomic_safe() function
  2023-04-24 18:52       ` Paul E. McKenney
@ 2023-04-24 20:34         ` Wander Lairson Costa
  2023-04-24 21:54           ` Paul E. McKenney
  2023-04-24 20:34         ` Steven Rostedt
  1 sibling, 1 reply; 14+ messages in thread
From: Wander Lairson Costa @ 2023-04-24 20:34 UTC (permalink / raw)
  To: paulmck
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Will Deacon, Waiman Long, Boqun Feng,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
	Valentin Schneider, Eric W. Biederman, Michael Ellerman,
	Andrew Morton, Oleg Nesterov, Kefeng Wang, Liam R. Howlett,
	Kees Cook, Christian Brauner, Andrei Vagin, Shakeel Butt,
	open list, open list:PERFORMANCE EVENTS SUBSYSTEM, Hu Chunyu,
	Thomas Gleixner

On Mon, Apr 24, 2023 at 3:52 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Mon, Apr 24, 2023 at 03:43:09PM -0300, Wander Lairson Costa wrote:
> > On Mon, Apr 24, 2023 at 3:09 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Fri, Apr 14, 2023 at 09:55:28AM -0300, Wander Lairson Costa wrote:
> > > > Due to the possibility of indirectly acquiring sleeping locks, it is
> > > > unsafe to call put_task_struct() in atomic contexts when the kernel is
> > > > compiled with PREEMPT_RT.
> > > >
> > > > To mitigate this issue, this commit introduces
> > > > put_task_struct_atomic_safe(), which schedules __put_task_struct()
> > > > through call_rcu() when PREEMPT_RT is enabled. While a workqueue would
> > > > be a more natural approach, we cannot allocate dynamic memory from
> > > > atomic context in PREEMPT_RT, making the code more complex.
> > > >
> > > > This implementation ensures safe execution in atomic contexts and
> > > > avoids any potential issues that may arise from using the non-atomic
> > > > version.
> > > >
> > > > Signed-off-by: Wander Lairson Costa <wander@redhat.com>
> > > > Reported-by: Hu Chunyu <chuhu@redhat.com>
> > > > Cc: Paul McKenney <paulmck@kernel.org>
> > > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > > ---
> > > >  include/linux/sched/task.h | 31 +++++++++++++++++++++++++++++++
> > > >  kernel/fork.c              |  8 ++++++++
> > > >  2 files changed, 39 insertions(+)
> > > >
> > > > diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
> > > > index b597b97b1f8f..5c13b83d7008 100644
> > > > --- a/include/linux/sched/task.h
> > > > +++ b/include/linux/sched/task.h
> > > > @@ -141,6 +141,37 @@ static inline void put_task_struct_many(struct task_struct *t, int nr)
> > > >
> > > >  void put_task_struct_rcu_user(struct task_struct *task);
> > > >
> > > > +extern void __delayed_put_task_struct(struct rcu_head *rhp);
> > > > +
> > > > +static inline void put_task_struct_atomic_safe(struct task_struct *task)
> > > > +{
> > > > +     if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> > > > +             /*
> > > > +              * Decrement the refcount explicitly to avoid unnecessarily
> > > > +              * calling call_rcu.
> > > > +              */
> > > > +             if (refcount_dec_and_test(&task->usage))
> > > > +                     /*
> > > > +                      * under PREEMPT_RT, we can't call put_task_struct
> > > > +                      * in atomic context because it will indirectly
> > > > +                      * acquire sleeping locks.
> > > > +                      * call_rcu() will schedule delayed_put_task_struct_rcu()
> > > > +                      * to be called in process context.
> > > > +                      *
> > > > +                      * __put_task_struct() is called called when
> > > > +                      * refcount_dec_and_test(&t->usage) succeeds.
> > > > +                      *
> > > > +                      * This means that it can't "conflict" with
> > > > +                      * put_task_struct_rcu_user() which abuses ->rcu the same
> > > > +                      * way; rcu_users has a reference so task->usage can't be
> > > > +                      * zero after rcu_users 1 -> 0 transition.
> > > > +                      */
> > > > +                     call_rcu(&task->rcu, __delayed_put_task_struct);
> > >
> > > This will invoke __delayed_put_task_struct() with softirqs disabled.
> > > Or do softirq-disabled contexts count as non-atomic in PREEMPT_RT?
> >
> > softirqs execute in thread context in PREEMPT_RT. We are good here.
>
> So the sleeping lock is a spinlock rather than (say) a mutex?
>

Yes, under PREEMPT_RT, spinlocks are implemented in terms of rtmutex.
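
For context, on CONFIG_PREEMPT_RT the spinlock type is roughly the
following (sketched from memory of include/linux/spinlock_types.h;
exact fields and ifdefs may differ between kernel versions):

/* Sketch from memory, not copied from the tree: spinlock_t wraps an
 * rtmutex on PREEMPT_RT, so spin_lock() may sleep there, while
 * raw_spinlock_t keeps the traditional non-sleeping behaviour. */
typedef struct spinlock {
	struct rt_mutex_base	lock;
#ifdef CONFIG_DEBUG_LOCK_ALLOC
	struct lockdep_map	dep_map;
#endif
} spinlock_t;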

>                                                         Thanx, Paul
>
> > > > +     } else {
> > > > +             put_task_struct(task);
> > > > +     }
> > > > +}
> > > > +
> > > >  /* Free all architecture-specific resources held by a thread. */
> > > >  void release_thread(struct task_struct *dead_task);
> > > >
> > > > diff --git a/kernel/fork.c b/kernel/fork.c
> > > > index 0c92f224c68c..9884794fe4b8 100644
> > > > --- a/kernel/fork.c
> > > > +++ b/kernel/fork.c
> > > > @@ -854,6 +854,14 @@ void __put_task_struct(struct task_struct *tsk)
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(__put_task_struct);
> > > >
> > > > +void __delayed_put_task_struct(struct rcu_head *rhp)
> > > > +{
> > > > +     struct task_struct *task = container_of(rhp, struct task_struct, rcu);
> > > > +
> > > > +     __put_task_struct(task);
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(__delayed_put_task_struct);
> > > > +
> > > >  void __init __weak arch_task_cache_init(void) { }
> > > >
> > > >  /*
> > > > --
> > > > 2.39.2
> > > >
> > >
> >
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 2/3] sched/task: Add the put_task_struct_atomic_safe() function
  2023-04-24 18:52       ` Paul E. McKenney
  2023-04-24 20:34         ` Wander Lairson Costa
@ 2023-04-24 20:34         ` Steven Rostedt
  1 sibling, 0 replies; 14+ messages in thread
From: Steven Rostedt @ 2023-04-24 20:34 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Wander Lairson Costa, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Ian Rogers, Adrian Hunter, Will Deacon,
	Waiman Long, Boqun Feng, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider,
	Eric W. Biederman, Michael Ellerman, Andrew Morton,
	Oleg Nesterov, Kefeng Wang, Liam R. Howlett, Kees Cook,
	Christian Brauner, Andrei Vagin, Shakeel Butt, open list,
	open list:PERFORMANCE EVENTS SUBSYSTEM, Hu Chunyu,
	Thomas Gleixner

On Mon, 24 Apr 2023 11:52:52 -0700
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> > softirqs execute in thread context in PREEMPT_RT. We are good here.  
> 
> So the sleeping lock is a spinlock rather than (say) a mutex?

local_bh_disable() on RT basically turns into:

	local_lock(&softirq_ctrl.lock);
	rcu_read_lock();

Which grabs a per CPU mutex that is taken by softirqs, and also calls
rcu_read_lock(). This allows bottom halves to still run as threads but
maintain the same synchronization as they do on mainline.

-- Steve

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 2/3] sched/task: Add the put_task_struct_atomic_safe() function
  2023-04-24 20:34         ` Wander Lairson Costa
@ 2023-04-24 21:54           ` Paul E. McKenney
  0 siblings, 0 replies; 14+ messages in thread
From: Paul E. McKenney @ 2023-04-24 21:54 UTC (permalink / raw)
  To: Wander Lairson Costa
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Will Deacon, Waiman Long, Boqun Feng,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
	Valentin Schneider, Eric W. Biederman, Michael Ellerman,
	Andrew Morton, Oleg Nesterov, Kefeng Wang, Liam R. Howlett,
	Kees Cook, Christian Brauner, Andrei Vagin, Shakeel Butt,
	open list, open list:PERFORMANCE EVENTS SUBSYSTEM, Hu Chunyu,
	Thomas Gleixner

On Mon, Apr 24, 2023 at 05:34:29PM -0300, Wander Lairson Costa wrote:
> On Mon, Apr 24, 2023 at 3:52 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > On Mon, Apr 24, 2023 at 03:43:09PM -0300, Wander Lairson Costa wrote:
> > > On Mon, Apr 24, 2023 at 3:09 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > On Fri, Apr 14, 2023 at 09:55:28AM -0300, Wander Lairson Costa wrote:
> > > > > Due to the possibility of indirectly acquiring sleeping locks, it is
> > > > > unsafe to call put_task_struct() in atomic contexts when the kernel is
> > > > > compiled with PREEMPT_RT.
> > > > >
> > > > > To mitigate this issue, this commit introduces
> > > > > put_task_struct_atomic_safe(), which schedules __put_task_struct()
> > > > > through call_rcu() when PREEMPT_RT is enabled. While a workqueue would
> > > > > be a more natural approach, we cannot allocate dynamic memory from
> > > > > atomic context in PREEMPT_RT, making the code more complex.
> > > > >
> > > > > This implementation ensures safe execution in atomic contexts and
> > > > > avoids any potential issues that may arise from using the non-atomic
> > > > > version.
> > > > >
> > > > > Signed-off-by: Wander Lairson Costa <wander@redhat.com>
> > > > > Reported-by: Hu Chunyu <chuhu@redhat.com>
> > > > > Cc: Paul McKenney <paulmck@kernel.org>
> > > > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > > > ---
> > > > >  include/linux/sched/task.h | 31 +++++++++++++++++++++++++++++++
> > > > >  kernel/fork.c              |  8 ++++++++
> > > > >  2 files changed, 39 insertions(+)
> > > > >
> > > > > diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
> > > > > index b597b97b1f8f..5c13b83d7008 100644
> > > > > --- a/include/linux/sched/task.h
> > > > > +++ b/include/linux/sched/task.h
> > > > > @@ -141,6 +141,37 @@ static inline void put_task_struct_many(struct task_struct *t, int nr)
> > > > >
> > > > >  void put_task_struct_rcu_user(struct task_struct *task);
> > > > >
> > > > > +extern void __delayed_put_task_struct(struct rcu_head *rhp);
> > > > > +
> > > > > +static inline void put_task_struct_atomic_safe(struct task_struct *task)
> > > > > +{
> > > > > +     if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> > > > > +             /*
> > > > > +              * Decrement the refcount explicitly to avoid unnecessarily
> > > > > +              * calling call_rcu.
> > > > > +              */
> > > > > +             if (refcount_dec_and_test(&task->usage))
> > > > > +                     /*
> > > > > +                      * under PREEMPT_RT, we can't call put_task_struct
> > > > > +                      * in atomic context because it will indirectly
> > > > > +                      * acquire sleeping locks.
> > > > > +                      * call_rcu() will schedule delayed_put_task_struct_rcu()
> > > > > +                      * to be called in process context.
> > > > > +                      *
> > > > > +                      * __put_task_struct() is called called when
> > > > > +                      * refcount_dec_and_test(&t->usage) succeeds.
> > > > > +                      *
> > > > > +                      * This means that it can't "conflict" with
> > > > > +                      * put_task_struct_rcu_user() which abuses ->rcu the same
> > > > > +                      * way; rcu_users has a reference so task->usage can't be
> > > > > +                      * zero after rcu_users 1 -> 0 transition.
> > > > > +                      */
> > > > > +                     call_rcu(&task->rcu, __delayed_put_task_struct);
> > > >
> > > > This will invoke __delayed_put_task_struct() with softirqs disabled.
> > > > Or do softirq-disabled contexts count as non-atomic in PREEMPT_RT?
> > >
> > > softirqs execute in thread context in PREEMPT_RT. We are good here.
> >
> > So the sleeping lock is a spinlock rather than (say) a mutex?
> 
> Yes, under PREEMPT_RT, spinlocks are implemented in terms of rtmutex.

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

                                                        Thanx, Paul

> > > > > +     } else {
> > > > > +             put_task_struct(task);
> > > > > +     }
> > > > > +}
> > > > > +
> > > > >  /* Free all architecture-specific resources held by a thread. */
> > > > >  void release_thread(struct task_struct *dead_task);
> > > > >
> > > > > diff --git a/kernel/fork.c b/kernel/fork.c
> > > > > index 0c92f224c68c..9884794fe4b8 100644
> > > > > --- a/kernel/fork.c
> > > > > +++ b/kernel/fork.c
> > > > > @@ -854,6 +854,14 @@ void __put_task_struct(struct task_struct *tsk)
> > > > >  }
> > > > >  EXPORT_SYMBOL_GPL(__put_task_struct);
> > > > >
> > > > > +void __delayed_put_task_struct(struct rcu_head *rhp)
> > > > > +{
> > > > > +     struct task_struct *task = container_of(rhp, struct task_struct, rcu);
> > > > > +
> > > > > +     __put_task_struct(task);
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(__delayed_put_task_struct);
> > > > > +
> > > > >  void __init __weak arch_task_cache_init(void) { }
> > > > >
> > > > >  /*
> > > > > --
> > > > > 2.39.2
> > > > >
> > > >
> > >
> >
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2023-04-24 21:54 UTC | newest]

Thread overview: 14+ messages
2023-04-14 12:55 [PATCH v6 0/3] Introduce put_task_struct_atomic_sleep() Wander Lairson Costa
2023-04-14 12:55 ` [PATCH v6 1/3] sched/core: warn on call put_task_struct in invalid context Wander Lairson Costa
2023-04-14 12:55 ` [PATCH v6 2/3] sched/task: Add the put_task_struct_atomic_safe() function Wander Lairson Costa
2023-04-17 18:51   ` Waiman Long
2023-04-18 14:18     ` Wander Lairson Costa
2023-04-18 14:26       ` Waiman Long
2023-04-24 18:09   ` Paul E. McKenney
2023-04-24 18:43     ` Wander Lairson Costa
2023-04-24 18:52       ` Paul E. McKenney
2023-04-24 20:34         ` Wander Lairson Costa
2023-04-24 21:54           ` Paul E. McKenney
2023-04-24 20:34         ` Steven Rostedt
2023-04-14 12:55 ` [PATCH v6 3/3] treewide: replace put_task_struct() witht the atomic safe version Wander Lairson Costa
2023-04-17 18:53   ` Waiman Long
