From: Petr Mladek <pmladek@suse.com>
To: akpm@linux-foundation.org, peterz@infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] kernel/hung_task: print real_parent->comm, pid in check_hung_task
Date: Mon, 28 Nov 2022 16:48:04 +0100
Message-ID: <Y4TYNF8jRSkGii/U@alley>
In-Reply-To: <20221124112526.GA21832@didi-ThinkCentre-M930t-N000>

On Thu 2022-11-24 19:25:26, Tio Zhang wrote:
> We can avoid a hung task by fixing the process who causes it.
> But sometimes it is difficult to find out which service 
> the bad process belongs to by only knowing its pid and comm.
> Since userspace tools to learn who launches the bad process
> do not always work when we get a hung task, 
> it is helpful printing the parent by kernel.

Could you please be more specific about how the information about
the parent helped to debug the problem?

Was it really important who started the process?
Was it related to some cgroup limits or permissions?

> Signed-off-by: Tio Zhang <tiozhang@didiglobal.com>
> ---
>  kernel/hung_task.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index c71889f3f3fc..33543d27bd5c 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -89,6 +89,7 @@ static struct notifier_block panic_block = {
>  
>  static void check_hung_task(struct task_struct *t, unsigned long timeout)
>  {
> +	struct task_struct *p = t->real_parent;

IMHO, this should be read using rcu_dereference(t->real_parent).

Note that check_hung_task() is already called under
rcu_read_lock() from check_hung_uninterruptible_tasks().
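
To illustrate the access pattern, here is a rough userspace model, not
kernel code. The struct task type, the format_parent() helper and the
three rcu_* macros below are stand-ins invented for this sketch; the real
primitives come from <linux/rcupdate.h>. The acquire load only
approximates what rcu_dereference() guarantees. In check_hung_task()
itself the read-side lock is already held by the caller, so only the
rcu_dereference() part would be needed there.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace stand-ins (assumptions for this sketch only). In the kernel,
 * the real primitives come from <linux/rcupdate.h>.
 */
#define rcu_read_lock()    ((void)0)
#define rcu_read_unlock()  ((void)0)
#define rcu_dereference(p) atomic_load_explicit(&(p), memory_order_acquire)

/* Hypothetical, cut-down model of struct task_struct. */
struct task {
	char comm[16];
	int pid;
	struct task *_Atomic real_parent;
};

/*
 * Write "comm:pid" of t's parent into buf.
 * Returns the snprintf() result.
 */
static int format_parent(struct task *t, char *buf, size_t len)
{
	struct task *p;
	int n;

	assert(buf != NULL && len > 0);

	rcu_read_lock();
	/* Load the pointer once; use only this snapshot inside the section. */
	p = rcu_dereference(t->real_parent);
	n = snprintf(buf, len, "%s:%d", p->comm, p->pid);
	rcu_read_unlock();

	return n;
}
```

The point of the pattern is that the parent pointer is read exactly once
through the accessor and only that local copy is used afterwards, instead
of dereferencing t->real_parent directly.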

>  	unsigned long switch_count = t->nvcsw + t->nivcsw;
>  
>  	/*
> @@ -129,8 +130,8 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
>  	if (sysctl_hung_task_warnings) {
>  		if (sysctl_hung_task_warnings > 0)
>  			sysctl_hung_task_warnings--;
> -		pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
> -		       t->comm, t->pid, (jiffies - t->last_switch_time) / HZ);
> +		pr_err("INFO: task %s:%d, parent %s:%d blocked for more than %ld seconds.\n",
> +		       t->comm, t->pid, p->comm, p->pid, (jiffies - t->last_switch_time) / HZ);

IMHO, this is the wrong place. The wording does more harm than
good. It might make people think that both processes are blocked,
or give the impression that the parent somehow caused the deadlock.

But if I get it correctly, the information about the parent is
needed only in special situations where only a particular parent
triggers the lockup.

>  		pr_err("      %s %s %.*s\n",
>  			print_tainted(), init_utsname()->release,
>  			(int)strcspn(init_utsname()->version, " "),

An alternative solution would be to print the parent in
sched_show_task(), which is called here as well.

sched_show_task() prints a lot of information that might
be useful for debugging, and the parent is just another
such piece of information.

sched_show_task() is also called in other situations where
this information might be useful.

Best Regards,
Petr
