From: Oliver Sang <oliver.sang@intel.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
John Stultz <john.stultz@linaro.org>,
Stephen Boyd <sboyd@kernel.org>, Jonathan Corbet <corbet@lwn.net>,
Mark Rutland <Mark.Rutland@arm.com>,
Marc Zyngier <maz@kernel.org>, Andi Kleen <ak@linux.intel.com>,
Feng Tang <feng.tang@intel.com>,
Xing Zhengjun <zhengjun.xing@linux.intel.com>,
Chris Mason <clm@fb.com>, LKML <linux-kernel@vger.kernel.org>,
Linux Memory Management List <linux-mm@kvack.org>,
lkp@lists.01.org, lkp@intel.com
Subject: Re: [clocksource] 8e614d5b58: WARNING:at_kernel/time/clocksource-wdtest.c:#wdtest_func.cold
Date: Tue, 25 May 2021 15:36:55 +0800 [thread overview]
Message-ID: <20210525073655.GE7744@xsang-OptiPlex-9020> (raw)
In-Reply-To: <20210507171259.GA236800@paulmck-ThinkPad-P17-Gen-1>
[-- Attachment #1: Type: text/plain, Size: 5368 bytes --]
hi Paul,
On Fri, May 07, 2021 at 10:12:59AM -0700, Paul E. McKenney wrote:
> On Wed, May 05, 2021 at 11:03:12AM -0700, Paul E. McKenney wrote:
> > On Wed, May 05, 2021 at 10:36:16PM +0800, kernel test robot wrote:
> > >
> > >
> > > Greeting,
> > >
> > > FYI, we noticed the following commit (built with gcc-9):
> > >
> > > commit: 8e614d5b58992e722f07de7c2426f2c44668092b ("clocksource: Provide kernel module to test clocksource watchdog")
> > > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> > >
> > >
> > > in testcase: boot
> > >
> > > on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> > >
> > > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> > >
> > >
> > > +-------------------------------------------------------------------------+------------+------------+
> > > | | bdbd9c673e | 8e614d5b58 |
> > > +-------------------------------------------------------------------------+------------+------------+
> > > | WARNING:at_kernel/time/clocksource-wdtest.c:#wdtest_func.cold | 0 | 11 |
> > > | RIP:wdtest_func.cold | 0 | 11 |
> > > +-------------------------------------------------------------------------+------------+------------+
> >
> > Might it be useful to address the lockdep issues that preceded this splat?
> >
> > Leaving that aside, the system appears to still be booting. There are
> > RCU CPU stall warning messages later on, and then the system hangs more
> > than six minutes while still booting, presumably due to the large number
> > of self-tests and debug options enabled.
> >
> > The intent is that the clocksource-wdtest tests run after boot has
> > completed. One approach would be to test it using modprobe after boot
> > has completed. In addition, the clocksource-wdtest module is not designed
> > to handle CPU overload conditions, and making it do so would reduce the
> > effectiveness of the test.
> >
> > I suggest setting clocksource-wdtest.holdoff=N, where "N" is in seconds
> > and is large enough that boot has completed. Alternatively, use modprobe
> > to activate this module from userspace after boot has completed.
> >
> > What I do is just set CONFIG_TEST_CLOCKSOURCE_WATCHDOG=y in an ordinary
> > rcutorture run, if that helps.
>
> All that aside, does the patch below help in your environment?
sorry for late.
I applied below patch to f4c6b34ee1 ("clocksource: Provide kernel module to test clocksource watchdog")
still found the issue on both f4c6b34ee1 and patched kernel.
attached dmesg.xz FYI
[ 157.028293] ------------[ cut here ]------------
[ 157.031055] WARNING: CPU: 0 PID: 207 at kernel/time/clocksource-wdtest.c:154 wdtest_func.cold+0x6a/0x117
[ 157.034366] Modules linked in:
[ 157.036812] CPU: 0 PID: 207 Comm: wdtest Tainted: G S W 5.13.0-rc1-00006-g2aaf396b0698 #1
[ 157.040112] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 157.043283] EIP: wdtest_func.cold+0x6a/0x117
[ 157.045833] Code: 00 69 55 00 a1 24 eb ea da 83 c4 10 85 c0 74 02 0f 0b a1 04 32 9a d9 39 1d b0 0f 9a d9 0f 93 c1 83 e0 40 0f 94 c2 38 d1 74 02 <0f> 0
b 85 c0 74 32 b8 c0 31 9a d9 e8 ca d7 3a fb b8 19 00 00 00 c7
[ 157.052700] EAX: 00000040 EBX: 00000000 ECX: 00000001 EDX: 00000000
[ 157.055738] ESI: d8abf847 EDI: d8abf836 EBP: c680ff84 ESP: c680ff78
[ 157.058770] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010202
[ 157.061854] CR0: 80050033 CR2: 00000000 CR3: 1a780000 CR4: 000406d0
[ 157.064864] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 157.067797] DR6: fffe0ff0 DR7: 00000400
[ 157.070414] Call Trace:
[ 157.072676] kthread+0xf6/0x140
[ 157.074915] ? wdtest_ktime_read+0x80/0x80
[ 157.077140] ? kthread_insert_work_sanity_check+0x60/0x60
[ 157.079472] ret_from_fork+0x1c/0x28
[ 157.081558] irq event stamp: 533
[ 157.083404] hardirqs last enabled at (541): [<d19c60e5>] console_unlock+0x4c5/0x600
[ 157.085717] hardirqs last disabled at (550): [<d19c6055>] console_unlock+0x435/0x600
[ 157.087921] softirqs last enabled at (368): [<d6bb1ff0>] __do_softirq+0x2f0/0x44b
[ 157.090071] softirqs last disabled at (567): [<d18ce2a5>] call_on_stack+0x45/0x60
[ 157.092111] ---[ end trace 0b325ef1942d2bda ]---
>
> If so, I can adjust so that my testing gets done quickly and yours
> avoids false-positive failures.
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> diff --git a/kernel/time/clocksource-wdtest.c b/kernel/time/clocksource-wdtest.c
> index 01df12395c0e..0d8542f8b1d2 100644
> --- a/kernel/time/clocksource-wdtest.c
> +++ b/kernel/time/clocksource-wdtest.c
> @@ -149,7 +149,7 @@ static int wdtest_func(void *arg)
> s = ", expect clock skew";
> pr_info("--- Watchdog with %dx error injection, %lu retries%s.\n", i, max_cswd_read_retries, s);
> WRITE_ONCE(wdtest_ktime_read_ndelays, i);
> - schedule_timeout_uninterruptible(2 * HZ);
> + schedule_timeout_uninterruptible(60 * HZ);
> WARN_ON_ONCE(READ_ONCE(wdtest_ktime_read_ndelays));
> WARN_ON_ONCE((i <= max_cswd_read_retries) !=
> !(clocksource_wdtest_ktime.flags & CLOCK_SOURCE_UNSTABLE));
[-- Attachment #2: dmesg.xz --]
[-- Type: application/x-xz, Size: 40020 bytes --]
next prev parent reply other threads:[~2021-05-25 7:20 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-05 14:36 [clocksource] 8e614d5b58: WARNING:at_kernel/time/clocksource-wdtest.c:#wdtest_func.cold kernel test robot
2021-05-05 18:03 ` Paul E. McKenney
2021-05-07 17:12 ` Paul E. McKenney
2021-05-25 7:36 ` Oliver Sang [this message]
2021-05-25 21:03 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210525073655.GE7744@xsang-OptiPlex-9020 \
--to=oliver.sang@intel.com \
--cc=Mark.Rutland@arm.com \
--cc=ak@linux.intel.com \
--cc=clm@fb.com \
--cc=corbet@lwn.net \
--cc=feng.tang@intel.com \
--cc=john.stultz@linaro.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lkp@intel.com \
--cc=lkp@lists.01.org \
--cc=maz@kernel.org \
--cc=paulmck@kernel.org \
--cc=sboyd@kernel.org \
--cc=tglx@linutronix.de \
--cc=zhengjun.xing@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).