From: Lecopzer Chen <lecopzer.chen@mediatek.com>
To: <pmladek@suse.com>
Cc: <acme@kernel.org>, <akpm@linux-foundation.org>,
<alexander.shishkin@linux.intel.com>, <catalin.marinas@arm.com>,
<davem@davemloft.net>, <jolsa@redhat.com>, <jthierry@redhat.com>,
<keescook@chromium.org>, <kernelfans@gmail.com>,
<lecopzer.chen@mediatek.com>,
<linux-arm-kernel@lists.infradead.org>,
<linux-kernel@vger.kernel.org>,
<linux-mediatek@lists.infradead.org>,
<linux-perf-users@vger.kernel.org>, <mark.rutland@arm.com>,
<masahiroy@kernel.org>, <matthias.bgg@gmail.com>,
<maz@kernel.org>, <mcgrof@kernel.org>, <mingo@redhat.com>,
<namhyung@kernel.org>, <nixiaoming@huawei.com>,
<peterz@infradead.org>, <sparclinux@vger.kernel.org>,
<sumit.garg@linaro.org>, <wangqing@vivo.com>, <will@kernel.org>,
<yj.chiang@mediatek.com>
Subject: Re: [PATCH 4/5] kernel/watchdog: Adapt the watchdog_hld interface for async model
Date: Tue, 1 Mar 2022 00:32:57 +0800 [thread overview]
Message-ID: <20220228163257.2411-1-lecopzer.chen@mediatek.com> (raw)
In-Reply-To: <YhygkafOHc6eeP9f@alley>
Yes, there is no race now, the condition is much like a verbose checking for
the state. I'll remove it.
> > I think it make sense to remove WARN now becasue it looks verbosely...
> > However, I would rather change the following printk to
> > "Delayed init for lockup detector failed."
>
> I would print both messages. The above message says what failed.
>
>
> > > > + pr_info("Perf NMI watchdog permanently disabled\n");
>
> And this message explains what is the result of the above failure.
> It is not obvious.
Yes, make sense, let's print both.
>
> > > > + }
> > > > +}
> > > > +
> > > > +/* Ensure the check is called after the initialization of PMU driver */
> > > > +static int __init lockup_detector_check(void)
> > > > +{
> > > > + if (detector_delay_init_state < DELAY_INIT_WAIT)
> > > > + return 0;
> > > > +
> > > > + if (WARN_ON(detector_delay_init_state == DELAY_INIT_WAIT)) {
> > >
> > > Again. Is WARN_ON() needed?
> > >
> > > Also the condition looks wrong. IMHO, this is the expected state.
> > >
> >
> > This does expected DELAY_INIT_READY here, which means,
> > every one who comes here to be checked should be READY and WARN if you're
> > still in WAIT state, and which means the previous lockup_detector_delay_init()
> > failed.
>
> No, DELAY_INIT_READY is set below. DELAY_INIT_WAIT is valid value here.
> It means that lockup_detector_delay_init() work is queued.
>
Sorry, I didn't describe clearly,
For the call flow:
kernel_init_freeable()
-> lockup_detector_init()
--> queue work(lockup_detector_delay_init) with state registering
to DELAY_INIT_WAIT.
---> lockup_detector_delay_init wait DELAY_INIT_READY that set
by armv8_pmu_driver_init().
----> device_initcall(armv8_pmu_driver_init),
set state to READY and wake_up the work. (in 5th patch)
-----> lockup_detector_delay_init recieves READY and calls
watchdog_nmi_probe() again.
------> late_initcall_sync(lockup_detector_check);
check if the state is READY? In other words, did the arch driver
finish probing watchdog between "queue work" and "late_initcall_sync()"?
If not, we forcely set state to READY and wake_up again.
>
> > IMO, either keeping or removing WARN is fine with me.
> >
> > I think I'll remove WARN and add
> > pr_info("Delayed init checking for lockup detector failed, retry for once.");
> > inside the `if (detector_delay_init_state == DELAY_INIT_WAIT)`
> >
> > Or would you have any other suggestion? thanks.
> >
> > > > + detector_delay_init_state = DELAY_INIT_READY;
> > > > + wake_up(&hld_detector_wait);
>
> I see another problem now. We should always call the wake up here
> when the work was queued. Otherwise, the worker will stay blocked
> forewer.
>
> The worker will also get blocked when the late_initcall is called
> before the work is proceed by a worker.
lockup_detector_check() is used to solve the blocking state.
As the description above, if state is WAIT when lockup_detector_check(),
we would forcely set state to READY can wake up the work for once.
After lockup_detector_check(), nobody cares about the state and the worker
also finishes its work.
>
> > > > + }
> > > > + flush_work(&detector_work);
> > > > + return 0;
> > > > +}
> > > > +late_initcall_sync(lockup_detector_check);
>
>
> OK, I think that the three states are too complicated. I suggest to
> use only a single bool. Something like:
>
> static bool lockup_detector_pending_init __initdata;
>
> struct wait_queue_head lockup_detector_wait __initdata =
> __WAIT_QUEUE_HEAD_INITIALIZER(lockup_detector_wait);
>
> static struct work_struct detector_work __initdata =
> __WORK_INITIALIZER(lockup_detector_work,
> lockup_detector_delay_init);
>
> static void __init lockup_detector_delay_init(struct work_struct *work)
> {
> int ret;
>
> wait_event(lockup_detector_wait, lockup_detector_pending_init == false);
>
> ret = watchdog_nmi_probe();
> if (ret) {
> pr_info("Delayed init of the lockup detector failed: %\n);
> pr_info("Perf NMI watchdog permanently disabled\n");
> return;
> }
>
> nmi_watchdog_available = true;
> lockup_detector_setup();
> }
>
> /* Trigger delayedEnsure the check is called after the initialization of PMU driver */
> static int __init lockup_detector_check(void)
> {
> if (!lockup_detector_pending_init)
> return;
>
> lockup_detector_pending_init = false;
> wake_up(&lockup_detector_wait);
> return 0;
> }
> late_initcall_sync(lockup_detector_check);
>
> void __init lockup_detector_init(void)
> {
> int ret;
>
> if (tick_nohz_full_enabled())
> pr_info("Disabling watchdog on nohz_full cores by default\n");
>
> cpumask_copy(&watchdog_cpumask,
> housekeeping_cpumask(HK_FLAG_TIMER));
>
> ret = watchdog_nmi_probe();
> if (!ret)
> nmi_watchdog_available = true;
> else if (ret == -EBUSY) {
> detector_delay_pending_init = true;
> /* Init must be done in a process context on a bound CPU. */
> queue_work_on(smp_processor_id(), system_wq,
> &lockup_detector_work);
> }
>
> lockup_detector_setup();
> watchdog_sysctl_init();
> }
>
> The result is that lockup_detector_work() will never stay blocked
> forever. There are two possibilities:
>
> 1. lockup_detector_work() called before lockup_detector_check().
> In this case, wait_event() will wait until lockup_detector_check()
> clears detector_delay_pending_init and calls wake_up().
>
> 2. lockup_detector_check() called before lockup_detector_work().
> In this case, wait_even() will immediately continue because
> it will see cleared detector_delay_pending_init.
>
Thanks, I think this logic is much simpler than three states for our use case now,
It also fits the call flow described above, I will revise it base on this
code.
Thanks a lot for your code and review!
BRs,
Lecopzer
next prev parent reply other threads:[~2022-02-28 16:33 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-12 10:43 [PATCH 0/5] Support hld based on Pseudo-NMI for arm64 Lecopzer Chen
2022-02-12 10:43 ` [PATCH 1/5] kernel/watchdog: remove WATCHDOG_DEFAULT Lecopzer Chen
2022-02-25 12:47 ` Petr Mladek
2022-02-26 9:52 ` Lecopzer Chen
2022-02-12 10:43 ` [PATCH 2/5] kernel/watchdog: change watchdog_nmi_enable() to void Lecopzer Chen
2022-02-25 12:50 ` Petr Mladek
2022-02-26 9:54 ` Lecopzer Chen
2022-02-12 10:43 ` [PATCH 3/5] kernel/watchdog_hld: Ensure CPU-bound context when creating hardlockup detector event Lecopzer Chen
2022-02-25 13:15 ` Petr Mladek
2022-02-12 10:43 ` [PATCH 4/5] kernel/watchdog: Adapt the watchdog_hld interface for async model Lecopzer Chen
2022-02-25 15:20 ` Petr Mladek
2022-02-26 10:52 ` Lecopzer Chen
2022-02-28 10:14 ` Petr Mladek
2022-02-28 16:32 ` Lecopzer Chen [this message]
2022-02-12 10:43 ` [PATCH 5/5] arm64: Enable perf events based hard lockup detector Lecopzer Chen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220228163257.2411-1-lecopzer.chen@mediatek.com \
--to=lecopzer.chen@mediatek.com \
--cc=acme@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=alexander.shishkin@linux.intel.com \
--cc=catalin.marinas@arm.com \
--cc=davem@davemloft.net \
--cc=jolsa@redhat.com \
--cc=jthierry@redhat.com \
--cc=keescook@chromium.org \
--cc=kernelfans@gmail.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mediatek@lists.infradead.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=masahiroy@kernel.org \
--cc=matthias.bgg@gmail.com \
--cc=maz@kernel.org \
--cc=mcgrof@kernel.org \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=nixiaoming@huawei.com \
--cc=peterz@infradead.org \
--cc=pmladek@suse.com \
--cc=sparclinux@vger.kernel.org \
--cc=sumit.garg@linaro.org \
--cc=wangqing@vivo.com \
--cc=will@kernel.org \
--cc=yj.chiang@mediatek.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).