* [PATCH] kernel/hung_task.c: Break RCU locks based on jiffies.
@ 2018-12-14 15:17 Tetsuo Handa
  2018-12-14 15:58 ` Paul E. McKenney
  2018-12-14 20:31 ` Andrew Morton
  0 siblings, 2 replies; 4+ messages in thread
From: Tetsuo Handa @ 2018-12-14 15:17 UTC (permalink / raw)
  To: Petr Mladek, Sergey Senozhatsky, Andrew Morton, Paul E. McKenney
  Cc: Dmitry Vyukov, Rafael J. Wysocki, Vitaly Kuznetsov, linux-kernel,
	Tetsuo Handa

check_hung_uninterruptible_tasks() currently calls rcu_lock_break() once
every 1024 threads. But check_hung_task() is very slow when it calls
printk() and very fast otherwise. If check_hung_task() calls printk() for
many threads within one batch of 1024, the RCU grace period might be
extended enough to trigger RCU stall warnings. Calling rcu_lock_break()
at a fixed interval in jiffies is therefore safer.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 kernel/hung_task.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index cb8e3e8..444b8b5 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -34,7 +34,7 @@
  * is disabled during the critical section. It also controls the size of
  * the RCU grace period. So it needs to be upper-bound.
  */
-#define HUNG_TASK_BATCHING 1024
+#define HUNG_TASK_LOCK_BREAK (HZ / 10)
 
 /*
  * Zero means infinite timeout - no checking done:
@@ -173,7 +173,7 @@ static bool rcu_lock_break(struct task_struct *g, struct task_struct *t)
 static void check_hung_uninterruptible_tasks(unsigned long timeout)
 {
 	int max_count = sysctl_hung_task_check_count;
-	int batch_count = HUNG_TASK_BATCHING;
+	unsigned long last_break = jiffies;
 	struct task_struct *g, *t;
 
 	/*
@@ -188,10 +188,10 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
 	for_each_process_thread(g, t) {
 		if (!max_count--)
 			goto unlock;
-		if (!--batch_count) {
-			batch_count = HUNG_TASK_BATCHING;
+		if (time_after(jiffies, last_break + HUNG_TASK_LOCK_BREAK)) {
 			if (!rcu_lock_break(g, t))
 				goto unlock;
+			last_break = jiffies;
 		}
 		/* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */
 		if (t->state == TASK_UNINTERRUPTIBLE)
-- 
1.8.3.1
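
For readers who want to see the effect of the change in isolation, here is
a minimal userspace sketch, not kernel code. It assumes a simulated jiffies
counter, a per-thread cost array standing in for the time spent in
check_hung_task(), and a lock break that always succeeds and takes no time.
The names sim_time_after(), SIM_HZ, longest_hold_count_based() and
longest_hold_time_based() are hypothetical; only the comparison idiom is
modelled on the kernel's time_after().

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the wrap-safe comparison: true if a is after b. */
#define sim_time_after(a, b) ((long)((b) - (a)) < 0)

#define SIM_HZ         1000UL         /* assumed tick rate for the simulation */
#define SIM_LOCK_BREAK (SIM_HZ / 10)  /* mirrors HUNG_TASK_LOCK_BREAK */
#define SIM_BATCHING   1024           /* mirrors the old HUNG_TASK_BATCHING */

/* Record the stretch [held_since, now] if it is the longest seen so far. */
static void note_hold(unsigned long now, unsigned long held_since,
                      unsigned long *longest)
{
    if (now - held_since > *longest)
        *longest = now - held_since;
}

/*
 * Old scheme: break once every SIM_BATCHING threads.
 * cost[i] is the simulated time, in jiffies, spent checking thread i.
 * Returns the longest stretch without a lock break, in jiffies.
 */
unsigned long longest_hold_count_based(const unsigned long *cost, int nr_threads)
{
    unsigned long now = 0, held_since = 0, longest = 0;
    int batch_count = SIM_BATCHING;

    assert(cost != NULL);
    for (int i = 0; i < nr_threads; i++) {
        if (!--batch_count) {
            batch_count = SIM_BATCHING;
            note_hold(now, held_since, &longest);
            held_since = now;          /* simulated lock break */
        }
        now += cost[i];                /* simulated check_hung_task() */
    }
    note_hold(now, held_since, &longest);
    return longest;
}

/* New scheme: break once more than SIM_LOCK_BREAK jiffies have elapsed. */
unsigned long longest_hold_time_based(const unsigned long *cost, int nr_threads)
{
    unsigned long now = 0, last_break = 0, longest = 0;

    assert(cost != NULL);
    for (int i = 0; i < nr_threads; i++) {
        if (sim_time_after(now, last_break + SIM_LOCK_BREAK)) {
            note_hold(now, last_break, &longest);
            last_break = now;          /* simulated lock break */
        }
        now += cost[i];                /* simulated check_hung_task() */
    }
    note_hold(now, last_break, &longest);
    return longest;
}
```

In this model the count-based hold time grows with the per-thread cost,
while the time-based hold time stays within SIM_LOCK_BREAK plus the cost of
one check, regardless of how many threads were slow.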



* Re: [PATCH] kernel/hung_task.c: Break RCU locks based on jiffies.
  2018-12-14 15:17 [PATCH] kernel/hung_task.c: Break RCU locks based on jiffies Tetsuo Handa
@ 2018-12-14 15:58 ` Paul E. McKenney
  2018-12-14 20:31 ` Andrew Morton
  1 sibling, 0 replies; 4+ messages in thread
From: Paul E. McKenney @ 2018-12-14 15:58 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Petr Mladek, Sergey Senozhatsky, Andrew Morton, Dmitry Vyukov,
	Rafael J. Wysocki, Vitaly Kuznetsov, linux-kernel

On Sat, Dec 15, 2018 at 12:17:38AM +0900, Tetsuo Handa wrote:
> check_hung_uninterruptible_tasks() is currently calling rcu_lock_break()
> for every 1024 threads. But check_hung_task() is very slow if printk()
> was called, and is very fast otherwise. If many threads within some 1024
> threads called printk(), the RCU grace period might be extended enough
> to trigger RCU stall warnings. Therefore, calling rcu_lock_break() for
> every some fixed jiffies will be safer.
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>

Acked-by: Paul E. McKenney <paulmck@linux.ibm.com>

> ---
>  kernel/hung_task.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index cb8e3e8..444b8b5 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -34,7 +34,7 @@
>   * is disabled during the critical section. It also controls the size of
>   * the RCU grace period. So it needs to be upper-bound.
>   */
> -#define HUNG_TASK_BATCHING 1024
> +#define HUNG_TASK_LOCK_BREAK (HZ / 10)
> 
>  /*
>   * Zero means infinite timeout - no checking done:
> @@ -173,7 +173,7 @@ static bool rcu_lock_break(struct task_struct *g, struct task_struct *t)
>  static void check_hung_uninterruptible_tasks(unsigned long timeout)
>  {
>  	int max_count = sysctl_hung_task_check_count;
> -	int batch_count = HUNG_TASK_BATCHING;
> +	unsigned long last_break = jiffies;
>  	struct task_struct *g, *t;
> 
>  	/*
> @@ -188,10 +188,10 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>  	for_each_process_thread(g, t) {
>  		if (!max_count--)
>  			goto unlock;
> -		if (!--batch_count) {
> -			batch_count = HUNG_TASK_BATCHING;
> +		if (time_after(jiffies, last_break + HUNG_TASK_LOCK_BREAK)) {
>  			if (!rcu_lock_break(g, t))
>  				goto unlock;
> +			last_break = jiffies;
>  		}
>  		/* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */
>  		if (t->state == TASK_UNINTERRUPTIBLE)
> -- 
> 1.8.3.1
> 



* Re: [PATCH] kernel/hung_task.c: Break RCU locks based on jiffies.
  2018-12-14 15:17 [PATCH] kernel/hung_task.c: Break RCU locks based on jiffies Tetsuo Handa
  2018-12-14 15:58 ` Paul E. McKenney
@ 2018-12-14 20:31 ` Andrew Morton
  2018-12-14 21:01   ` Paul E. McKenney
  1 sibling, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2018-12-14 20:31 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Petr Mladek, Sergey Senozhatsky, Paul E. McKenney, Dmitry Vyukov,
	Rafael J. Wysocki, Vitaly Kuznetsov, linux-kernel

On Sat, 15 Dec 2018 00:17:38 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:

> check_hung_uninterruptible_tasks() is currently calling rcu_lock_break()
> for every 1024 threads. But check_hung_task() is very slow if printk()
> was called, and is very fast otherwise. If many threads within some 1024
> threads called printk(), the RCU grace period might be extended enough
> to trigger RCU stall warnings. Therefore, calling rcu_lock_break() for
> every some fixed jiffies will be safer.
> 
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -34,7 +34,7 @@
>   * is disabled during the critical section. It also controls the size of
>   * the RCU grace period. So it needs to be upper-bound.
>   */
> -#define HUNG_TASK_BATCHING 1024
> +#define HUNG_TASK_LOCK_BREAK (HZ / 10)

This won't work correctly if rcu_cpu_stall_timeout is set to something
stupidly small.  Perhaps it would be better to make this code aware of
the current rcu_cpu_stall_timeout setting?



* Re: [PATCH] kernel/hung_task.c: Break RCU locks based on jiffies.
  2018-12-14 20:31 ` Andrew Morton
@ 2018-12-14 21:01   ` Paul E. McKenney
  0 siblings, 0 replies; 4+ messages in thread
From: Paul E. McKenney @ 2018-12-14 21:01 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tetsuo Handa, Petr Mladek, Sergey Senozhatsky, Dmitry Vyukov,
	Rafael J. Wysocki, Vitaly Kuznetsov, linux-kernel

On Fri, Dec 14, 2018 at 12:31:11PM -0800, Andrew Morton wrote:
> On Sat, 15 Dec 2018 00:17:38 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:
> 
> > check_hung_uninterruptible_tasks() is currently calling rcu_lock_break()
> > for every 1024 threads. But check_hung_task() is very slow if printk()
> > was called, and is very fast otherwise. If many threads within some 1024
> > threads called printk(), the RCU grace period might be extended enough
> > to trigger RCU stall warnings. Therefore, calling rcu_lock_break() for
> > every some fixed jiffies will be safer.
> > 
> > --- a/kernel/hung_task.c
> > +++ b/kernel/hung_task.c
> > @@ -34,7 +34,7 @@
> >   * is disabled during the critical section. It also controls the size of
> >   * the RCU grace period. So it needs to be upper-bound.
> >   */
> > -#define HUNG_TASK_BATCHING 1024
> > +#define HUNG_TASK_LOCK_BREAK (HZ / 10)
> 
> This won't work correctly if rcu_cpu_stall_timeout is set to something
> stupidly small.  Perhaps it would be better to make this code aware of
> the current rcu_cpu_stall_timeout setting?

Good point.

However, the reason that I wasn't worried is that any setting of
rcu_cpu_stall_timeout less than 3 seconds is cheerfully bumped up to
3 seconds, so we have a safety factor of 30 as things stand.

I could export the minimum, though, if that would be helpful.

							Thanx, Paul
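
The arithmetic behind the factor of 30 can be written out as a small
userspace sketch. It assumes only what the message above states: stall
timeouts below 3 seconds are raised to 3 seconds, and the lock-break
interval is HZ / 10 jiffies. The helper names and the floor constant are
hypothetical and are not taken from the RCU code.

```c
#include <assert.h>

/* Floor described in the reply above: smaller settings are raised to 3 s. */
#define STALL_TIMEOUT_FLOOR_SECS 3UL

/* Stall timeout in seconds after the floor has been applied. */
unsigned long effective_stall_timeout_secs(unsigned long requested_secs)
{
    return requested_secs < STALL_TIMEOUT_FLOOR_SECS
        ? STALL_TIMEOUT_FLOOR_SECS
        : requested_secs;
}

/*
 * Ratio of the effective stall timeout to the lock-break interval,
 * both expressed in jiffies for a given tick rate hz.
 */
unsigned long lock_break_safety_factor(unsigned long hz,
                                       unsigned long requested_secs)
{
    unsigned long lock_break = hz / 10;   /* HUNG_TASK_LOCK_BREAK */

    assert(lock_break > 0);               /* sketch assumes hz >= 10 */
    return effective_stall_timeout_secs(requested_secs) * hz / lock_break;
}
```

With the floor applied, the smallest possible ratio is 3 * HZ / (HZ / 10),
which is 30 for tick rates that are a multiple of 10, and it only grows
for larger timeout settings.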

