From: "liwei (GF)" <liwei391@huawei.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: <mingo@redhat.com>, <peterz@infradead.org>,
	<juri.lelli@redhat.com>, <vincent.guittot@linaro.org>,
	<dietmar.eggemann@arm.com>, <bsegall@google.com>,
	<mgorman@suse.de>, <huawei.libin@huawei.com>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] sched/debug: Reset watchdog on all CPUs while processing sysrq-t
Date: Fri, 3 Jan 2020 09:54:44 +0800
Message-ID: <2a7a9e51-bb25-3ed1-2643-e293e3ce5188@huawei.com>
In-Reply-To: <20200102144514.646df101@gandalf.local.home>

Hi Steven,
Yes, it can be triggered on a Hi1620 system (128 cores) as follows:
stress-ng -c 50 &
stress-ng -m 50 &
stress-ng -i 20 &
echo 7 > /proc/sys/kernel/printk
echo t > /proc/sysrq-trigger

A soft lockup is then reported for the migration thread:
[39636.303531] watchdog: BUG: soft lockup - CPU#67 stuck for 23s! [migration/67:348]
That thread is waiting for the CPU handling sysrq-t to process stop_two_cpus.
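
For anyone reproducing this, a small filter along the following lines may help pick the affected CPU, stall time, task name and PID out of the kernel log. It is only a sketch: the function name parse_soft_lockup is made up for illustration, and the pattern is written against the single report line quoted above, so other kernel versions may format the message differently.

```sh
#!/bin/sh
# Hypothetical helper (not part of the patch): print "cpu seconds comm pid"
# for every soft lockup report read from stdin.
# The pattern is based only on the one log line quoted in this message.
parse_soft_lockup() {
    sed -n 's/.*watchdog: BUG: soft lockup - CPU#\([0-9][0-9]*\) stuck for \([0-9][0-9]*\)s! \[\(.*\):\([0-9][0-9]*\)].*/\1 \2 \3 \4/p'
}

# Usage (reading the kernel log usually needs root):
#   dmesg | parse_soft_lockup
```

Lines that do not match the report format produce no output, so the filter can be run over the whole log after the sysrq-t dump finishes.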

Thanks,
Wei

On 2020/1/3 3:45, Steven Rostedt wrote:
> On Thu, 26 Dec 2019 16:52:24 +0800
> Wei Li <liwei391@huawei.com> wrote:
> 
>> Lengthy output of sysrq-t may take a lot of time on slow serial console
>> with lots of processes and CPUs.
>>
>> So we need to reset NMI-watchdog to avoid spurious lockup messages, and
>> we also reset softlockup watchdogs on all other CPUs since another CPU
>> might be blocked waiting for us to process an IPI or stop_machine.
> 
> Have you had this triggered?
> 
>>
>> Add to sysrq_sched_debug_show() as what we did in show_state_filter().
>>
>> Signed-off-by: Wei Li <liwei391@huawei.com>
>> ---
>>  kernel/sched/debug.c | 11 +++++++++--
>>  1 file changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
>> index f7e4579e746c..879d3ccf3806 100644
>> --- a/kernel/sched/debug.c
>> +++ b/kernel/sched/debug.c
>> @@ -751,9 +751,16 @@ void sysrq_sched_debug_show(void)
>>  	int cpu;
>>  
>>  	sched_debug_header(NULL);
>> -	for_each_online_cpu(cpu)
>> +	for_each_online_cpu(cpu) {
>> +		/*
>> +		 * Need to reset softlockup watchdogs on all CPUs, because
>> +		 * another CPU might be blocked waiting for us to process
>> +		 * an IPI or stop_machine.
>> +		 */
>> +		touch_nmi_watchdog();
>> +		touch_all_softlockup_watchdogs();
> 
> This doesn't seem to hurt to add, thus.
> 
> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> 
> -- Steve
> 
>>  		print_cpu(NULL, cpu);
>> -
>> +	}
>>  }
>>  
>>  /*


Thread overview: 5+ messages
2019-12-26  8:52 [PATCH] sched/debug: Reset watchdog on all CPUs while processing sysrq-t Wei Li
2020-01-02 19:45 ` Steven Rostedt
2020-01-03  1:54   ` liwei (GF) [this message]
2020-01-07  9:32   ` Peter Zijlstra
2020-01-17 10:08 ` [tip: sched/core] " tip-bot2 for Wei Li
