linux-kernel.vger.kernel.org archive mirror
From: Lecopzer Chen <lecopzer.chen@mediatek.com>
To: <pmladek@suse.com>
Cc: <acme@kernel.org>, <akpm@linux-foundation.org>,
	<alexander.shishkin@linux.intel.com>, <catalin.marinas@arm.com>,
	<davem@davemloft.net>, <jolsa@redhat.com>, <jthierry@redhat.com>,
	<keescook@chromium.org>, <kernelfans@gmail.com>,
	<lecopzer.chen@mediatek.com>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-kernel@vger.kernel.org>,
	<linux-mediatek@lists.infradead.org>,
	<linux-perf-users@vger.kernel.org>, <mark.rutland@arm.com>,
	<masahiroy@kernel.org>, <matthias.bgg@gmail.com>,
	<maz@kernel.org>, <mcgrof@kernel.org>, <mingo@redhat.com>,
	<namhyung@kernel.org>, <nixiaoming@huawei.com>,
	<peterz@infradead.org>, <sparclinux@vger.kernel.org>,
	<sumit.garg@linaro.org>, <wangqing@vivo.com>, <will@kernel.org>,
	<yj.chiang@mediatek.com>
Subject: Re: [PATCH 4/5] kernel/watchdog: Adapt the watchdog_hld interface for async model
Date: Tue, 1 Mar 2022 00:32:57 +0800
Message-ID: <20220228163257.2411-1-lecopzer.chen@mediatek.com>
In-Reply-To: <YhygkafOHc6eeP9f@alley>

Yes, there is no race now; the condition is little more than a verbose check
of the state. I'll remove it.


> > I think it makes sense to remove the WARN now because it looks verbose...
> > However, I would rather change the following printk to
> > "Delayed init for lockup detector failed."
> 
> I would print both messages. The above message says what failed.
> 
> 
> > > > +		pr_info("Perf NMI watchdog permanently disabled\n");
> 
> And this message explains what is the result of the above failure.
> It is not obvious.

Yes, that makes sense. Let's print both.


> 
> > > > +	}
> > > > +}
> > > > +
> > > > +/* Ensure the check is called after the initialization of PMU driver */
> > > > +static int __init lockup_detector_check(void)
> > > > +{
> > > > +	if (detector_delay_init_state < DELAY_INIT_WAIT)
> > > > +		return 0;
> > > > +
> > > > +	if (WARN_ON(detector_delay_init_state == DELAY_INIT_WAIT)) {
> > > 
> > > Again. Is WARN_ON() needed?
> > > 
> > > Also the condition looks wrong. IMHO, this is the expected state.
> > > 
> > 
> > This does expect DELAY_INIT_READY here. That is, everyone who gets here
> > to be checked should already be READY. We WARN if the state is still
> > WAIT, which means the previous lockup_detector_delay_init() failed.
> 
> No, DELAY_INIT_READY is set below. DELAY_INIT_WAIT is valid value here.
> It means that lockup_detector_delay_init() work is queued.
> 

Sorry, I didn't describe it clearly.

The call flow is:

kernel_init_freeable()
-> lockup_detector_init()
--> queue work (lockup_detector_delay_init) with the state set
    to DELAY_INIT_WAIT.
---> lockup_detector_delay_init waits for DELAY_INIT_READY, which is set
     by armv8_pmu_driver_init().
----> device_initcall(armv8_pmu_driver_init)
      sets the state to READY and wakes up the work. (in 5th patch)
-----> lockup_detector_delay_init receives READY and calls
       watchdog_nmi_probe() again.
------> late_initcall_sync(lockup_detector_check);
        checks whether the state is READY. In other words, did the arch driver
        finish probing the watchdog between "queue work" and "late_initcall_sync()"?
        If not, we forcibly set the state to READY and wake up the work again.


> 
> > IMO, either keeping or removing WARN is fine with me.
> > 
> > I think I'll remove WARN and add
> > pr_info("Delayed init checking for lockup detector failed, retry for once.");
> > inside the `if (detector_delay_init_state == DELAY_INIT_WAIT)`
> > 
> > Or would you have any other suggestion? thanks.
> > 
> > > > +		detector_delay_init_state = DELAY_INIT_READY;
> > > > +		wake_up(&hld_detector_wait);
> 
> I see another problem now. We should always call the wake up here
> when the work was queued. Otherwise, the worker will stay blocked
> forever.
> 
> The worker will also get blocked when the late_initcall is called
> before the work is processed by a worker.

lockup_detector_check() is there to resolve the blocked state.
As described above, if the state is still WAIT when lockup_detector_check()
runs, we forcibly set the state to READY and wake up the work once.
After lockup_detector_check(), nobody cares about the state any more and the
worker finishes its work.
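
To make the forced transition concrete, here is a minimal, single-threaded
model of what the late check does to the state. It is only a sketch, not the
patch code: DELAY_INIT_NONE is an assumed name for the state below
DELAY_INIT_WAIT, late_check_model() is a hypothetical stand-in for
lockup_detector_check(), and the wake_up() and flush_work() calls are left out.

```c
#include <stdbool.h>

/*
 * Model of the three states. DELAY_INIT_NONE is an assumed name for
 * "no delayed init was queued"; only WAIT and READY appear in the thread.
 */
enum delay_init_state {
	DELAY_INIT_NONE,	/* assumption: nothing was queued */
	DELAY_INIT_WAIT,	/* work queued, waiting for the PMU driver */
	DELAY_INIT_READY,	/* PMU driver (or the late check) released it */
};

/*
 * Hypothetical stand-in for lockup_detector_check().
 * Returns true when the check had to force WAIT -> READY, i.e. the
 * arch driver did not release the worker before late_initcall_sync().
 */
static bool late_check_model(enum delay_init_state *state)
{
	if (*state < DELAY_INIT_WAIT)
		return false;		/* nothing was queued */

	if (*state == DELAY_INIT_WAIT) {
		*state = DELAY_INIT_READY;
		/* the kernel code would call wake_up() here */
		return true;
	}

	return false;			/* already READY */
}
```

Calling late_check_model() on a state that is still DELAY_INIT_WAIT returns
true and leaves the state at DELAY_INIT_READY; in the other two states it
changes nothing.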

> 
> > > > +	}
> > > > +	flush_work(&detector_work);
> > > > +	return 0;
> > > > +}
> > > > +late_initcall_sync(lockup_detector_check);
> 
> 
> OK, I think that the three states are too complicated. I suggest to
> use only a single bool. Something like:
> 
> static bool lockup_detector_pending_init __initdata;
> 
> struct wait_queue_head lockup_detector_wait __initdata =
> 		__WAIT_QUEUE_HEAD_INITIALIZER(lockup_detector_wait);
> 
> static struct work_struct lockup_detector_work __initdata =
> 		__WORK_INITIALIZER(lockup_detector_work,
> 				   lockup_detector_delay_init);
> 
> static void __init lockup_detector_delay_init(struct work_struct *work)
> {
> 	int ret;
> 
> 	wait_event(lockup_detector_wait, lockup_detector_pending_init == false);
> 
> 	ret = watchdog_nmi_probe();
> 	if (ret) {
> 		pr_info("Delayed init of the lockup detector failed: %d\n", ret);
> 		pr_info("Perf NMI watchdog permanently disabled\n");
> 		return;
> 	}
> 
> 	nmi_watchdog_available = true;
> 	lockup_detector_setup();
> }
> 
> /* Ensure the check is called after the initialization of PMU driver */
> static int __init lockup_detector_check(void)
> {
> 	if (!lockup_detector_pending_init)
> 		return 0;
> 
> 	lockup_detector_pending_init = false;
> 	wake_up(&lockup_detector_wait);
> 	return 0;
> }
> late_initcall_sync(lockup_detector_check);
> 
> void __init lockup_detector_init(void)
> {
> 	int ret;
> 
> 	if (tick_nohz_full_enabled())
> 		pr_info("Disabling watchdog on nohz_full cores by default\n");
> 
> 	cpumask_copy(&watchdog_cpumask,
> 		     housekeeping_cpumask(HK_FLAG_TIMER));
> 
> 	ret = watchdog_nmi_probe();
> 	if (!ret)
> 		nmi_watchdog_available = true;
> 	else if (ret == -EBUSY) {
> 		lockup_detector_pending_init = true;
> 		/* Init must be done in a process context on a bound CPU. */
> 		queue_work_on(smp_processor_id(), system_wq,
> 			      &lockup_detector_work);
> 	}
> 
> 	lockup_detector_setup();
> 	watchdog_sysctl_init();
> }
> 
> The result is that lockup_detector_delay_init() will never stay blocked
> forever. There are two possibilities:
> 
> 1. lockup_detector_delay_init() called before lockup_detector_check().
>    In this case, wait_event() will wait until lockup_detector_check()
>    clears lockup_detector_pending_init and calls wake_up().
> 
> 2. lockup_detector_check() called before lockup_detector_delay_init().
>    In this case, wait_event() will immediately continue because
>    it will see cleared lockup_detector_pending_init.
> 

Thanks, I think this logic is much simpler than the three states for our use
case now. It also fits the call flow described above. I will revise the patch
based on this code.
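
As a sanity check of the two orderings, the single-flag logic can be modelled
in user space. The sketch below is not kernel code: a pthread mutex and
condition variable stand in for wait_event()/wake_up(), a plain thread stands
in for the work item, and all names (struct delayed_init, delay_init_worker(),
late_check(), run_ordering()) are hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>

/* User-space stand-ins for the flag and the wait queue. */
struct delayed_init {
	pthread_mutex_t lock;
	pthread_cond_t wait;	/* models lockup_detector_wait */
	bool pending_init;	/* models lockup_detector_pending_init */
	bool probed;		/* set once the worker gets past the wait */
};

/* Models the worker: block while the flag is set, then "probe". */
static void *delay_init_worker(void *arg)
{
	struct delayed_init *d = arg;

	pthread_mutex_lock(&d->lock);
	while (d->pending_init)		/* like wait_event(..., !pending) */
		pthread_cond_wait(&d->wait, &d->lock);
	d->probed = true;		/* watchdog_nmi_probe() would run here */
	pthread_mutex_unlock(&d->lock);
	return NULL;
}

/* Models the late check: clear the flag and wake the worker. */
static void late_check(struct delayed_init *d)
{
	pthread_mutex_lock(&d->lock);
	if (d->pending_init) {
		d->pending_init = false;
		pthread_cond_broadcast(&d->wait);
	}
	pthread_mutex_unlock(&d->lock);
}

/*
 * Run one ordering and return true if the worker finished.
 * check_first == true:  the check runs before the worker starts.
 * check_first == false: the worker is started first; it may or may not
 *                       have blocked by the time the check runs.
 */
static bool run_ordering(bool check_first)
{
	struct delayed_init d = { .pending_init = true, .probed = false };
	pthread_t worker;
	bool done;

	pthread_mutex_init(&d.lock, NULL);
	pthread_cond_init(&d.wait, NULL);

	if (check_first)
		late_check(&d);
	if (pthread_create(&worker, NULL, delay_init_worker, &d) != 0)
		return false;
	if (!check_first)
		late_check(&d);

	pthread_join(worker, NULL);	/* would hang if the worker stayed blocked */
	done = d.probed;

	pthread_cond_destroy(&d.wait);
	pthread_mutex_destroy(&d.lock);
	return done;
}
```

Both run_ordering(true) and run_ordering(false) return true, because the
worker re-checks the flag under the lock and the check always clears it before
waking. Older toolchains may need -pthread when building this.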


Thanks a lot for your code and review!

BRs,
Lecopzer

Thread overview: 15+ messages
2022-02-12 10:43 [PATCH 0/5] Support hld based on Pseudo-NMI for arm64 Lecopzer Chen
2022-02-12 10:43 ` [PATCH 1/5] kernel/watchdog: remove WATCHDOG_DEFAULT Lecopzer Chen
2022-02-25 12:47   ` Petr Mladek
2022-02-26  9:52     ` Lecopzer Chen
2022-02-12 10:43 ` [PATCH 2/5] kernel/watchdog: change watchdog_nmi_enable() to void Lecopzer Chen
2022-02-25 12:50   ` Petr Mladek
2022-02-26  9:54     ` Lecopzer Chen
2022-02-12 10:43 ` [PATCH 3/5] kernel/watchdog_hld: Ensure CPU-bound context when creating hardlockup detector event Lecopzer Chen
2022-02-25 13:15   ` Petr Mladek
2022-02-12 10:43 ` [PATCH 4/5] kernel/watchdog: Adapt the watchdog_hld interface for async model Lecopzer Chen
2022-02-25 15:20   ` Petr Mladek
2022-02-26 10:52     ` Lecopzer Chen
2022-02-28 10:14       ` Petr Mladek
2022-02-28 16:32         ` Lecopzer Chen [this message]
2022-02-12 10:43 ` [PATCH 5/5] arm64: Enable perf events based hard lockup detector Lecopzer Chen
