From: Petr Mladek <pmladek@suse.com>
To: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: linux-kernel@vger.kernel.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
	davem@davemloft.net, Matthias Brugger <matthias.bgg@gmail.com>,
	Marc Zyngier <maz@kernel.org>,
	Julien Thierry <jthierry@redhat.com>,
	Kees Cook <keescook@chromium.org>,
	Masahiro Yamada <masahiroy@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Wang Qing <wangqing@vivo.com>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Xiaoming Ni <nixiaoming@huawei.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-perf-users@vger.kernel.org, sparclinux@vger.kernel.org,
	linux-mediatek@lists.infradead.org, sumit.garg@linaro.org,
	kernelfans@gmail.com, yj.chiang@mediatek.com
Subject: Re: [PATCH 4/5] kernel/watchdog: Adapt the watchdog_hld interface for async model
Date: Fri, 25 Feb 2022 16:20:15 +0100
Message-ID: <Yhjzr8geK7dTXXd2@alley>
In-Reply-To: <20220212104349.14266-5-lecopzer.chen@mediatek.com>

On Sat 2022-02-12 18:43:48, Lecopzer Chen wrote:
> From: Pingfan Liu <kernelfans@gmail.com>
> 
> from: Pingfan Liu <kernelfans@gmail.com>
> 
> When lockup_detector_init()->watchdog_nmi_probe(), PMU may be not ready
> yet. E.g. on arm64, PMU is not ready until
> device_initcall(armv8_pmu_driver_init).  And it is deeply integrated
> with the driver model and cpuhp. Hence it is hard to push this
> initialization before smp_init().
> 
> But it is easy to take an opposite approach by enabling watchdog_hld to
> get the capability of PMU async.
> 
> The async model is achieved by expanding watchdog_nmi_probe() with
> -EBUSY, and a re-initializing work_struct which waits on a wait_queue_head.
> 
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> Co-developed-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
> Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
> ---
>  kernel/watchdog.c | 56 +++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 54 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index b71d434cf648..fa8490cfeef8 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -839,16 +843,64 @@ static void __init watchdog_sysctl_init(void)
>  #define watchdog_sysctl_init() do { } while (0)
>  #endif /* CONFIG_SYSCTL */
>  
> +static void lockup_detector_delay_init(struct work_struct *work);
> +enum hld_detector_state detector_delay_init_state __initdata;

I would call this "lockup_detector_init_state" to use the same
naming scheme everywhere.

> +
> +struct wait_queue_head hld_detector_wait __initdata =
> +		__WAIT_QUEUE_HEAD_INITIALIZER(hld_detector_wait);
> +
> +static struct work_struct detector_work __initdata =

I would call this "lockup_detector_work" to use the same naming scheme
everywhere.

> +		__WORK_INITIALIZER(detector_work, lockup_detector_delay_init);
> +
> +static void __init lockup_detector_delay_init(struct work_struct *work)
> +{
> +	int ret;
> +
> +	wait_event(hld_detector_wait,
> +			detector_delay_init_state == DELAY_INIT_READY);

DELAY_INIT_READY is defined only in the 5th patch, so this patch does
not build on its own.

There are many other build errors for the same reason: this patch uses
things that are first defined in the 5th patch.

> +	ret = watchdog_nmi_probe();
> +	if (!ret) {
> +		nmi_watchdog_available = true;
> +		lockup_detector_setup();
> +	} else {
> +		WARN_ON(ret == -EBUSY);

Why WARN_ON(), please?

Note that it might cause panic() when the "panic_on_warn" command-line
parameter is used.

Also the backtrace will not help much. The context is well known.
This code is called from a workqueue worker.
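
To illustrate what I mean, here is a minimal userspace sketch of the
error path with a plain message instead of WARN_ON(). It is only a
model, not a tested kernel change: handle_delayed_probe() is a
hypothetical helper name, printf() stands in for pr_info(), and the
extra "Delayed init failed" wording is my assumption.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-in for the kernel flag of the same name. */
static bool nmi_watchdog_available;

/*
 * Hypothetical helper: handle the result of the delayed
 * watchdog_nmi_probe(). Returns true when the detector can be set up.
 * A failure, including -EBUSY, is only logged. There is no WARN_ON(),
 * so no backtrace is printed and panic_on_warn cannot escalate it.
 */
static bool handle_delayed_probe(int ret)
{
	if (!ret) {
		nmi_watchdog_available = true;
		return true;
	}

	/* In the kernel this would be pr_info(). */
	printf("Delayed init failed (%d). Perf NMI watchdog permanently disabled\n",
	       ret);
	return false;
}
```

The error code in the message should be enough to tell the -EBUSY case
apart from the other failures.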


> +		pr_info("Perf NMI watchdog permanently disabled\n");
> +	}
> +}
> +
> +/* Ensure the check is called after the initialization of PMU driver */
> +static int __init lockup_detector_check(void)
> +{
> +	if (detector_delay_init_state < DELAY_INIT_WAIT)
> +		return 0;
> +
> +	if (WARN_ON(detector_delay_init_state == DELAY_INIT_WAIT)) {

Again, is WARN_ON() needed?

Also the condition looks wrong. IMHO, this is the expected state.
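
Something like the following userspace model shows the logic I would
expect, treating DELAY_INIT_WAIT as the normal case. It is a sketch
under assumptions: the enum ordering and the DELAY_INIT_NOP name are
guessed here (the real definition is in the 5th patch),
lockup_detector_check_model() is a hypothetical name, and the return
value only models whether wake_up() would be called.

```c
#include <stdbool.h>

/* Assumed ordering and NOP name; the real enum is in the 5th patch. */
enum hld_detector_state {
	DELAY_INIT_NOP,
	DELAY_INIT_WAIT,
	DELAY_INIT_READY,
};

static enum hld_detector_state lockup_detector_init_state;

/*
 * Model of lockup_detector_check() without WARN_ON().
 * Returns true when the waiting worker would be woken up.
 */
static bool lockup_detector_check_model(void)
{
	/* Delayed init was never requested, nothing to do. */
	if (lockup_detector_init_state < DELAY_INIT_WAIT)
		return false;

	/* Expected state: the worker is still waiting for the PMU. */
	if (lockup_detector_init_state == DELAY_INIT_WAIT) {
		lockup_detector_init_state = DELAY_INIT_READY;
		return true;	/* kernel code would call wake_up() here */
	}

	/* Already READY: the kernel code would only flush the work. */
	return false;
}
```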

> +		detector_delay_init_state = DELAY_INIT_READY;
> +		wake_up(&hld_detector_wait);
> +	}
> +	flush_work(&detector_work);
> +	return 0;
> +}
> +late_initcall_sync(lockup_detector_check);

Otherwise, it makes sense.

Best Regards,
Petr

PS: I am not going to review the last patch because I am not familiar
    with arm. I reviewed just the changes in the generic watchdog
    code.
