* [PATCH 0/1] hung_task: Change hung_task.c to use for_each_process_thread()
@ 2015-03-17 14:13 Aaron Tomlin
  2015-03-17 14:13 ` [PATCH 1/1] " Aaron Tomlin
  0 siblings, 1 reply; 9+ messages in thread
From: Aaron Tomlin @ 2015-03-17 14:13 UTC (permalink / raw)
  To: akpm; +Cc: oleg, rientjes, dwysocha, atomlin, linux-kernel

Hi Andrew,


Further work is required to improve khungtaskd. I'll do this later,
but for now let's start with this trivial cleanup.


Aaron Tomlin (1):
  hung_task: Change hung_task.c to use for_each_process_thread()

 kernel/hung_task.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

-- 
1.9.3



* [PATCH 1/1] hung_task: Change hung_task.c to use for_each_process_thread()
  2015-03-17 14:13 [PATCH 0/1] hung_task: Change hung_task.c to use for_each_process_thread() Aaron Tomlin
@ 2015-03-17 14:13 ` Aaron Tomlin
  2015-03-17 17:09   ` Oleg Nesterov
  0 siblings, 1 reply; 9+ messages in thread
From: Aaron Tomlin @ 2015-03-17 14:13 UTC (permalink / raw)
  To: akpm; +Cc: oleg, rientjes, dwysocha, atomlin, linux-kernel

In check_hung_uninterruptible_tasks(), avoid the use of the deprecated
while_each_thread().

The "max_count" logic prevents a livelock - see commit 0c740d0a
("introduce for_each_thread() to replace the buggy while_each_thread()").
With that said, let's use for_each_process_thread().

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 kernel/hung_task.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 06db124..e0f90c2 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -169,7 +169,7 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
 		return;
 
 	rcu_read_lock();
-	do_each_thread(g, t) {
+	for_each_process_thread(g, t) {
 		if (!max_count--)
 			goto unlock;
 		if (!--batch_count) {
@@ -180,7 +180,7 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
 		/* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */
 		if (t->state == TASK_UNINTERRUPTIBLE)
 			check_hung_task(t, timeout);
-	} while_each_thread(g, t);
+	}
  unlock:
 	rcu_read_unlock();
 }
-- 
1.9.3



* Re: [PATCH 1/1] hung_task: Change hung_task.c to use for_each_process_thread()
  2015-03-17 14:13 ` [PATCH 1/1] " Aaron Tomlin
@ 2015-03-17 17:09   ` Oleg Nesterov
  2015-03-17 17:18     ` Aaron Tomlin
  2015-03-17 19:24     ` [PATCH 0/2] hung_task: improve rcu_lock_break() logic Oleg Nesterov
  0 siblings, 2 replies; 9+ messages in thread
From: Oleg Nesterov @ 2015-03-17 17:09 UTC (permalink / raw)
  To: Aaron Tomlin; +Cc: akpm, rientjes, dwysocha, linux-kernel

On 03/17, Aaron Tomlin wrote:
>
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -169,7 +169,7 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>  		return;
>  
>  	rcu_read_lock();
> -	do_each_thread(g, t) {
> +	for_each_process_thread(g, t) {
>  		if (!max_count--)
>  			goto unlock;
>  		if (!--batch_count) {
> @@ -180,7 +180,7 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>  		/* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */
>  		if (t->state == TASK_UNINTERRUPTIBLE)
>  			check_hung_task(t, timeout);
> -	} while_each_thread(g, t);
> +	}


Acked-by: Oleg Nesterov <oleg@redhat.com>



Perhaps it also makes sense to improve this rcu_lock_break a bit...
For example, if 't' is dead but 'g' is alive, we can continue the
"for_each_process" part of this double loop. And if 't' is still
alive, then we can find the new leader and continue...

But I agree, let's start with this fix, then we will see.

Oleg.



* Re: [PATCH 1/1] hung_task: Change hung_task.c to use for_each_process_thread()
  2015-03-17 17:09   ` Oleg Nesterov
@ 2015-03-17 17:18     ` Aaron Tomlin
  2015-03-17 19:24     ` [PATCH 0/2] hung_task: improve rcu_lock_break() logic Oleg Nesterov
  1 sibling, 0 replies; 9+ messages in thread
From: Aaron Tomlin @ 2015-03-17 17:18 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: akpm, rientjes, dwysocha, linux-kernel

On Tue 2015-03-17 18:09 +0100, Oleg Nesterov wrote:
> On 03/17, Aaron Tomlin wrote:
> >
> > --- a/kernel/hung_task.c
> > +++ b/kernel/hung_task.c
> > @@ -169,7 +169,7 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
> >  		return;
> >  
> >  	rcu_read_lock();
> > -	do_each_thread(g, t) {
> > +	for_each_process_thread(g, t) {
> >  		if (!max_count--)
> >  			goto unlock;
> >  		if (!--batch_count) {
> > @@ -180,7 +180,7 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
> >  		/* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */
> >  		if (t->state == TASK_UNINTERRUPTIBLE)
> >  			check_hung_task(t, timeout);
> > -	} while_each_thread(g, t);
> > +	}
> 
> 
> Acked-by: Oleg Nesterov <oleg@redhat.com>
> 
> 
> 
> Perhaps it also makes sense to improve this rcu_lock_break a bit...
> For example, if 't' is dead but 'g' is alive we can continue the
> "for_each_process" part of this double loop. And if 't' is still
> alive then we can find the new leader and continue...

OK - I'll incorporate that in a separate patch.

Thanks,

-- 
Aaron Tomlin


* [PATCH 0/2] hung_task: improve rcu_lock_break() logic
  2015-03-17 17:09   ` Oleg Nesterov
  2015-03-17 17:18     ` Aaron Tomlin
@ 2015-03-17 19:24     ` Oleg Nesterov
  2015-03-17 19:25       ` [PATCH 1/2] hung_task: split for_each_process_thread() into for_each_process() + __for_each_thread() Oleg Nesterov
  2015-03-17 19:25       ` [PATCH 2/2] hung_task: improve the rcu_lock_break() logic Oleg Nesterov
  1 sibling, 2 replies; 9+ messages in thread
From: Oleg Nesterov @ 2015-03-17 19:24 UTC (permalink / raw)
  To: Aaron Tomlin; +Cc: akpm, rientjes, dwysocha, linux-kernel, Ingo Molnar

On 03/17, Oleg Nesterov wrote:
>
> On 03/17, Aaron Tomlin wrote:
> >
> > --- a/kernel/hung_task.c
> > +++ b/kernel/hung_task.c
> > @@ -169,7 +169,7 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
> >  		return;
> >
> >  	rcu_read_lock();
> > -	do_each_thread(g, t) {
> > +	for_each_process_thread(g, t) {
> >  		if (!max_count--)
> >  			goto unlock;
> >  		if (!--batch_count) {
> > @@ -180,7 +180,7 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
> >  		/* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */
> >  		if (t->state == TASK_UNINTERRUPTIBLE)
> >  			check_hung_task(t, timeout);
> > -	} while_each_thread(g, t);
> > +	}
>
>
> Acked-by: Oleg Nesterov <oleg@redhat.com>
>
>
>
> Perhaps it also makes sense to improve this rcu_lock_break a bit...
> For example, if 't' is dead but 'g' is alive we can continue the
> "for_each_process" part of this double loop. And if 't' is still
> alive then we can find the new leader and continue...
>
> But I agree, let's start with this fix, then we will see.

Something like this, on top of Aaron's change.

But actually I am not sure this really makes a lot of sense. But if
it does, we can do more. We can save g->start_time before rcu_lock_break(),
and then the "both dead" case can use for_each_process() to (try to) find
the first process which has a start_time greater than or equal to the saved one.

Oleg.



* [PATCH 1/2] hung_task: split for_each_process_thread() into for_each_process() + __for_each_thread()
  2015-03-17 19:24     ` [PATCH 0/2] hung_task: improve rcu_lock_break() logic Oleg Nesterov
@ 2015-03-17 19:25       ` Oleg Nesterov
  2015-03-20 16:55         ` Aaron Tomlin
  2015-03-17 19:25       ` [PATCH 2/2] hung_task: improve the rcu_lock_break() logic Oleg Nesterov
  1 sibling, 1 reply; 9+ messages in thread
From: Oleg Nesterov @ 2015-03-17 19:25 UTC (permalink / raw)
  To: Aaron Tomlin; +Cc: akpm, rientjes, dwysocha, linux-kernel, Ingo Molnar

Preparation. Change the main loop in check_hung_uninterruptible_tasks()
to use the nested for_each_process() + __for_each_thread() loops explicitly.
Note that we use __for_each_thread(), not for_each_thread(). This way it
is clear that the inner loop doesn't depend on 'g' after we read ->signal.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/hung_task.c |   23 ++++++++++++++---------
 1 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index e0f90c2..4735b99 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -169,17 +169,22 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
 		return;
 
 	rcu_read_lock();
-	for_each_process_thread(g, t) {
-		if (!max_count--)
-			goto unlock;
-		if (!--batch_count) {
-			batch_count = HUNG_TASK_BATCHING;
-			if (!rcu_lock_break(g, t))
+	for_each_process(g) {
+		struct signal_struct *sig = g->signal;
+
+		__for_each_thread(sig, t) {
+			if (!max_count--)
 				goto unlock;
+
+			if (!--batch_count) {
+				batch_count = HUNG_TASK_BATCHING;
+				if (!rcu_lock_break(g, t))
+					goto unlock;
+			}
+			/* use "==" to skip the TASK_KILLABLE tasks */
+			if (t->state == TASK_UNINTERRUPTIBLE)
+				check_hung_task(t, timeout);
 		}
-		/* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */
-		if (t->state == TASK_UNINTERRUPTIBLE)
-			check_hung_task(t, timeout);
 	}
  unlock:
 	rcu_read_unlock();
-- 
1.5.5.1




* [PATCH 2/2] hung_task: improve the rcu_lock_break() logic
  2015-03-17 19:24     ` [PATCH 0/2] hung_task: improve rcu_lock_break() logic Oleg Nesterov
  2015-03-17 19:25       ` [PATCH 1/2] hung_task: split for_each_process_thread() into for_each_process() + __for_each_thread() Oleg Nesterov
@ 2015-03-17 19:25       ` Oleg Nesterov
  2015-03-20 16:55         ` Aaron Tomlin
  1 sibling, 1 reply; 9+ messages in thread
From: Oleg Nesterov @ 2015-03-17 19:25 UTC (permalink / raw)
  To: Aaron Tomlin; +Cc: akpm, rientjes, dwysocha, linux-kernel, Ingo Molnar

check_hung_uninterruptible_tasks() stops after rcu_lock_break() if either
"t" or "g" exits; this is suboptimal.

If "t" is alive, we can always continue; t->group_leader can be used as the
new "g". We do not even bother to check g != NULL in this case.

If "g" is alive, we can at least continue the outer for_each_process() loop.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/hung_task.c |   29 ++++++++++++++++++++---------
 1 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 4735b99..f488059 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -134,20 +134,26 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
  * For preemptible RCU it is sufficient to call rcu_read_unlock in order
  * to exit the grace period. For classic RCU, a reschedule is required.
  */
-static bool rcu_lock_break(struct task_struct *g, struct task_struct *t)
+static void rcu_lock_break(struct task_struct **g, struct task_struct **t)
 {
-	bool can_cont;
+	bool alive;
+
+	get_task_struct(*g);
+	get_task_struct(*t);
 
-	get_task_struct(g);
-	get_task_struct(t);
 	rcu_read_unlock();
 	cond_resched();
 	rcu_read_lock();
-	can_cont = pid_alive(g) && pid_alive(t);
-	put_task_struct(t);
-	put_task_struct(g);
 
-	return can_cont;
+	alive = pid_alive(*g);
+	put_task_struct(*g);
+	if (!alive)
+		*g = NULL;
+
+	alive = pid_alive(*t);
+	put_task_struct(*t);
+	if (!alive)
+		*t = NULL;
 }
 
 /*
@@ -178,7 +184,12 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
 
 			if (!--batch_count) {
 				batch_count = HUNG_TASK_BATCHING;
-				if (!rcu_lock_break(g, t))
+				rcu_lock_break(&g, &t);
+				if (t)		/* in case g == NULL */
+					g = t->group_leader;
+				else if (g)	/* continue the outer loop */
+					break;
+				else		/* both dead */
 					goto unlock;
 			}
 			/* use "==" to skip the TASK_KILLABLE tasks */
-- 
1.5.5.1




* Re: [PATCH 1/2] hung_task: split for_each_process_thread() into for_each_process() + __for_each_thread()
  2015-03-17 19:25       ` [PATCH 1/2] hung_task: split for_each_process_thread() into for_each_process() + __for_each_thread() Oleg Nesterov
@ 2015-03-20 16:55         ` Aaron Tomlin
  0 siblings, 0 replies; 9+ messages in thread
From: Aaron Tomlin @ 2015-03-20 16:55 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: akpm, rientjes, dwysocha, linux-kernel, Ingo Molnar

On Tue 2015-03-17 20:25 +0100, Oleg Nesterov wrote:
> Preparation. Change the main loop in check_hung_uninterruptible_tasks()
> to use the nested for_each_process() + __for_each_thread() loops explicitly.
> Note that we use __for_each_thread(), not for_each_thread(). This way it
> is clear that the inner loop doesn't depend on 'g' after we read ->signal.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
>  kernel/hung_task.c |   23 ++++++++++++++---------
>  1 files changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index e0f90c2..4735b99 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -169,17 +169,22 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>  		return;
>  
>  	rcu_read_lock();
> -	for_each_process_thread(g, t) {
> -		if (!max_count--)
> -			goto unlock;
> -		if (!--batch_count) {
> -			batch_count = HUNG_TASK_BATCHING;
> -			if (!rcu_lock_break(g, t))
> +	for_each_process(g) {
> +		struct signal_struct *sig = g->signal;
> +
> +		__for_each_thread(sig, t) {
> +			if (!max_count--)
>  				goto unlock;
> +
> +			if (!--batch_count) {
> +				batch_count = HUNG_TASK_BATCHING;
> +				if (!rcu_lock_break(g, t))
> +					goto unlock;
> +			}
> +			/* use "==" to skip the TASK_KILLABLE tasks */
> +			if (t->state == TASK_UNINTERRUPTIBLE)
> +				check_hung_task(t, timeout);
>  		}
> -		/* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */
> -		if (t->state == TASK_UNINTERRUPTIBLE)
> -			check_hung_task(t, timeout);
>  	}
>   unlock:
>  	rcu_read_unlock();
> 

Acked-by: Aaron Tomlin <atomlin@redhat.com>


* Re: [PATCH 2/2] hung_task: improve the rcu_lock_break() logic
  2015-03-17 19:25       ` [PATCH 2/2] hung_task: improve the rcu_lock_break() logic Oleg Nesterov
@ 2015-03-20 16:55         ` Aaron Tomlin
  0 siblings, 0 replies; 9+ messages in thread
From: Aaron Tomlin @ 2015-03-20 16:55 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: akpm, rientjes, dwysocha, linux-kernel, Ingo Molnar

On Tue 2015-03-17 20:25 +0100, Oleg Nesterov wrote:
> check_hung_uninterruptible_tasks() stops after rcu_lock_break() if either
> "t" or "g" exits, this is suboptimal.
> 
> If "t" is alive, we can always continue, t->group_leader can be used as the
> new "g". We do not even bother to check g != NULL in this case.
> 
> If "g" is alive, we can at least continue the outer for_each_process() loop.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
>  kernel/hung_task.c |   29 ++++++++++++++++++++---------
>  1 files changed, 20 insertions(+), 9 deletions(-)
> 
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index 4735b99..f488059 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -134,20 +134,26 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
>   * For preemptible RCU it is sufficient to call rcu_read_unlock in order
>   * to exit the grace period. For classic RCU, a reschedule is required.
>   */
> -static bool rcu_lock_break(struct task_struct *g, struct task_struct *t)
> +static void rcu_lock_break(struct task_struct **g, struct task_struct **t)
>  {
> -	bool can_cont;
> +	bool alive;
> +
> +	get_task_struct(*g);
> +	get_task_struct(*t);
>  
> -	get_task_struct(g);
> -	get_task_struct(t);
>  	rcu_read_unlock();
>  	cond_resched();
>  	rcu_read_lock();
> -	can_cont = pid_alive(g) && pid_alive(t);
> -	put_task_struct(t);
> -	put_task_struct(g);
>  
> -	return can_cont;
> +	alive = pid_alive(*g);
> +	put_task_struct(*g);
> +	if (!alive)
> +		*g = NULL;
> +
> +	alive = pid_alive(*t);
> +	put_task_struct(*t);
> +	if (!alive)
> +		*t = NULL;
>  }
>  
>  /*
> @@ -178,7 +184,12 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>  
>  			if (!--batch_count) {
>  				batch_count = HUNG_TASK_BATCHING;
> -				if (!rcu_lock_break(g, t))
> +				rcu_lock_break(&g, &t);
> +				if (t)		/* in case g == NULL */
> +					g = t->group_leader;
> +				else if (g)	/* continue the outer loop */
> +					break;
> +				else		/* both dead */
>  					goto unlock;
>  			}
>  			/* use "==" to skip the TASK_KILLABLE tasks */

Looks good to me. Thanks.

Acked-by: Aaron Tomlin <atomlin@redhat.com>

