Linux-EDAC Archive on lore.kernel.org
* [RFC PATCH] apei/ghes: fix ghes_poll_func by registering in non-deferrable mode
@ 2019-12-18  7:03 Bhaskar Upadhaya
  2020-01-02 10:49 ` Bhaskar Upadhaya
  2020-01-02 18:01 ` Borislav Petkov
  0 siblings, 2 replies; 8+ messages in thread
From: Bhaskar Upadhaya @ 2019-12-18  7:03 UTC (permalink / raw)
  To: linux-kernel, linux-acpi, linux-edac, lenb, rafael
  Cc: gkulkarni, rrichter, bhaskar.upadhaya.linux, Bhaskar Upadhaya

Currently Linux registers ghes_poll_func with the TIMER_DEFERRABLE flag,
so it is serviced only when the CPU eventually wakes up for a subsequent
non-deferrable timer, and not at the configured polling interval.

For polling mode, the polling interval configured by firmware should not
be exceeded as per the ACPI 6.3 spec (see Table 18-394), so the timer
needs to be configured in non-deferrable mode by removing the
TIMER_DEFERRABLE flag. With NO_HZ enabled and the timer callback
configured in non-deferrable mode, the callback is called once the
polling interval expires.

Impact of removing the TIMER_DEFERRABLE flag:
- With NO_HZ enabled, additional timer ticks and otherwise unnecessary
  CPU wakeups happen at every polling interval.

- If the polling interval is too small, the polling function will be
  called too frequently, which may stall the CPU.

Signed-off-by: Bhaskar Upadhaya <bupadhaya@marvell.com>
---
 drivers/acpi/apei/ghes.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 777f6f7122b4..c8f9230f69fb 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -1181,7 +1181,7 @@ static int ghes_probe(struct platform_device *ghes_dev)
 
 	switch (generic->notify.type) {
 	case ACPI_HEST_NOTIFY_POLLED:
-		timer_setup(&ghes->timer, ghes_poll_func, TIMER_DEFERRABLE);
+		timer_setup(&ghes->timer, ghes_poll_func, 0);
 		ghes_add_timer(ghes);
 		break;
 	case ACPI_HEST_NOTIFY_EXTERNAL:
-- 
2.17.1
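
To illustrate the difference the commit message describes, here is a
minimal userspace sketch (not kernel code) of when the poll callback
would run on an otherwise idle CPU. It assumes a simplified model in
which a deferrable timer is serviced only at the next unrelated wakeup,
and in which those wakeups occur at a fixed period. poll_schedule() and
its parameters are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace model, not kernel code.
 *
 * Fill out[] with the times (in ms) at which n consecutive polls run on
 * an otherwise idle CPU. Each poll re-arms the timer poll_ms after the
 * time it actually ran.
 *
 * deferrable     : nonzero models TIMER_DEFERRABLE, zero models flags == 0
 * poll_ms        : poll interval configured by firmware
 * wake_period_ms : assumed period of unrelated non-deferrable wakeups
 */
static void poll_schedule(int deferrable, unsigned long poll_ms,
                          unsigned long wake_period_ms,
                          unsigned long *out, size_t n)
{
    assert(poll_ms > 0 && wake_period_ms > 0);

    unsigned long now = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned long expiry = now + poll_ms;

        if (deferrable) {
            /* Serviced only at the first unrelated wakeup at or after expiry. */
            now = ((expiry + wake_period_ms - 1) / wake_period_ms)
                  * wake_period_ms;
        } else {
            /* A non-deferrable timer wakes the CPU at its own expiry. */
            now = expiry;
        }
        out[i] = now;
    }
}
```

With a 1000 ms poll interval and unrelated wakeups every 4000 ms, this
model runs the deferrable poll at 4000, 8000 and 12000 ms, and the
non-deferrable poll at 1000, 2000 and 3000 ms. Real timer behaviour also
depends on jiffies rounding and on which CPU the timer is queued, which
the sketch ignores.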



* Re: [RFC PATCH] apei/ghes: fix ghes_poll_func by registering in non-deferrable mode
  2019-12-18  7:03 [RFC PATCH] apei/ghes: fix ghes_poll_func by registering in non-deferrable mode Bhaskar Upadhaya
@ 2020-01-02 10:49 ` Bhaskar Upadhaya
  2020-01-02 18:01 ` Borislav Petkov
  1 sibling, 0 replies; 8+ messages in thread
From: Bhaskar Upadhaya @ 2020-01-02 10:49 UTC (permalink / raw)
  To: Bhaskar Upadhaya, rafael, lenb
  Cc: linux-kernel, linux-acpi, linux-edac, gkulkarni, rrichter

Hi Rafael, Len,
  I hope you can find time to look into this patch.
Regards
--Bhaskar

On Wed, Dec 18, 2019 at 12:34 PM Bhaskar Upadhaya <bupadhaya@marvell.com> wrote:
>
> Currently Linux registers ghes_poll_func with the TIMER_DEFERRABLE flag,
> so it is serviced only when the CPU eventually wakes up for a subsequent
> non-deferrable timer, and not at the configured polling interval.
>
> For polling mode, the polling interval configured by firmware should not
> be exceeded as per the ACPI 6.3 spec (see Table 18-394), so the timer
> needs to be configured in non-deferrable mode by removing the
> TIMER_DEFERRABLE flag. With NO_HZ enabled and the timer callback
> configured in non-deferrable mode, the callback is called once the
> polling interval expires.
>
> Impact of removing the TIMER_DEFERRABLE flag:
> - With NO_HZ enabled, additional timer ticks and otherwise unnecessary
>   CPU wakeups happen at every polling interval.
>
> - If the polling interval is too small, the polling function will be
>   called too frequently, which may stall the CPU.
>
> Signed-off-by: Bhaskar Upadhaya <bupadhaya@marvell.com>
> ---
>  drivers/acpi/apei/ghes.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 777f6f7122b4..c8f9230f69fb 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -1181,7 +1181,7 @@ static int ghes_probe(struct platform_device *ghes_dev)
>
>         switch (generic->notify.type) {
>         case ACPI_HEST_NOTIFY_POLLED:
> -               timer_setup(&ghes->timer, ghes_poll_func, TIMER_DEFERRABLE);
> +               timer_setup(&ghes->timer, ghes_poll_func, 0);
>                 ghes_add_timer(ghes);
>                 break;
>         case ACPI_HEST_NOTIFY_EXTERNAL:
> --
> 2.17.1
>


* Re: [RFC PATCH] apei/ghes: fix ghes_poll_func by registering in non-deferrable mode
  2019-12-18  7:03 [RFC PATCH] apei/ghes: fix ghes_poll_func by registering in non-deferrable mode Bhaskar Upadhaya
  2020-01-02 10:49 ` Bhaskar Upadhaya
@ 2020-01-02 18:01 ` Borislav Petkov
  2020-01-06 11:03   ` Bhaskar Upadhaya
  1 sibling, 1 reply; 8+ messages in thread
From: Borislav Petkov @ 2020-01-02 18:01 UTC (permalink / raw)
  To: Bhaskar Upadhaya
  Cc: linux-kernel, linux-acpi, linux-edac, lenb, rafael, gkulkarni,
	rrichter, bhaskar.upadhaya.linux

On Tue, Dec 17, 2019 at 11:03:38PM -0800, Bhaskar Upadhaya wrote:
> Currently Linux registers ghes_poll_func with the TIMER_DEFERRABLE flag,
> so it is serviced only when the CPU eventually wakes up for a subsequent
> non-deferrable timer, and not at the configured polling interval.
> 
> For polling mode, the polling interval configured by firmware should not
> be exceeded as per the ACPI 6.3 spec (see Table 18-394),

I see

"Table 18-394 Hardware Error Notification Structure"

where does it say that the interval should not be exceeded and what is
going to happen if it gets exceeded?

IOW, are you fixing something you're observing on some platform or
you're reading the spec only?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [RFC PATCH] apei/ghes: fix ghes_poll_func by registering in non-deferrable mode
  2020-01-02 18:01 ` Borislav Petkov
@ 2020-01-06 11:03   ` Bhaskar Upadhaya
  2020-01-06 13:09     ` Borislav Petkov
  0 siblings, 1 reply; 8+ messages in thread
From: Bhaskar Upadhaya @ 2020-01-06 11:03 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Bhaskar Upadhaya, linux-kernel, linux-acpi, linux-edac, lenb,
	rafael, gkulkarni, rrichter

On Thu, Jan 2, 2020 at 11:31 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Dec 17, 2019 at 11:03:38PM -0800, Bhaskar Upadhaya wrote:
> > Currently Linux registers ghes_poll_func with the TIMER_DEFERRABLE flag,
> > so it is serviced only when the CPU eventually wakes up for a subsequent
> > non-deferrable timer, and not at the configured polling interval.
> >
> > For polling mode, the polling interval configured by firmware should not
> > be exceeded as per the ACPI 6.3 spec (see Table 18-394),
>
> I see
>
> "Table 18-394 Hardware Error Notification Structure"
>
> where does it say that the interval should not be exceeded and what is
> going to happen if it gets exceeded?

Definition of the poll interval as per the spec (ACPI 6.3):
"Indicates the poll interval in milliseconds OSPM should use to
periodically check the error source for the presence of an error
condition."

This indicates that OSPM should check the error source at least once per
poll interval, but with the timer configured as TIMER_DEFERRABLE, the
timer callback is not called within the poll interval.
>
> IOW, are you fixing something you're observing on some platform or
> you're reading the spec only?

We are observing an issue on our ThunderX2 platforms wherein
ghes_poll_func is not called within the poll interval when the timer is
configured with the TIMER_DEFERRABLE flag (on a NO_HZ kernel), and hence
we are losing error records.
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
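
The effect reported above can be illustrated with a small userspace
model (not kernel code). It rests on an assumption the thread does not
state: that the error source holds a single pending record, so if
several errors occur between two polls, only one of them can be
retrieved. lost_records() and its parameters are hypothetical names used
only for this sketch; actual firmware behaviour is platform-specific.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace model, not kernel code.
 *
 * ASSUMPTION: the error source can hold only one pending record, so
 * each poll retrieves at most one record and any other errors that
 * occurred since the previous poll are lost.
 *
 * err_ms[]  : times at which errors occur, sorted ascending
 * poll_ms[] : times at which the poll callback runs, sorted ascending
 *
 * Returns the number of error records lost. Errors that occur after the
 * last poll are still pending and are not counted.
 */
static size_t lost_records(const unsigned long *err_ms, size_t n_err,
                           const unsigned long *poll_ms, size_t n_poll)
{
    assert(n_err == 0 || err_ms != NULL);
    assert(n_poll == 0 || poll_ms != NULL);

    size_t lost = 0;
    size_t e = 0;

    for (size_t p = 0; p < n_poll; p++) {
        size_t pending = 0;

        /* Count the errors that occurred since the previous poll. */
        while (e < n_err && err_ms[e] <= poll_ms[p]) {
            pending++;
            e++;
        }
        /* Only one of them can be read out under the assumption. */
        if (pending > 1)
            lost += pending - 1;
    }
    return lost;
}
```

With errors at 500, 1500, 2500 and 3500 ms, polling on time at 1000,
2000, 3000 and 4000 ms loses nothing in this model, whereas a single
deferred poll at 4000 ms finds four errors behind it and loses three.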


* Re: [RFC PATCH] apei/ghes: fix ghes_poll_func by registering in non-deferrable mode
  2020-01-06 11:03   ` Bhaskar Upadhaya
@ 2020-01-06 13:09     ` Borislav Petkov
  2020-01-07 11:03       ` Bhaskar Upadhaya
  0 siblings, 1 reply; 8+ messages in thread
From: Borislav Petkov @ 2020-01-06 13:09 UTC (permalink / raw)
  To: Bhaskar Upadhaya
  Cc: Bhaskar Upadhaya, linux-kernel, linux-acpi, linux-edac, lenb,
	rafael, gkulkarni, rrichter

On Mon, Jan 06, 2020 at 04:33:19PM +0530, Bhaskar Upadhaya wrote:
> Definition of the poll interval as per the spec (ACPI 6.3):
> "Indicates the poll interval in milliseconds OSPM should use to
> periodically check the error source for the presence of an error
> condition."

Please add that...

> We are observing an issue on our ThunderX2 platforms wherein
> ghes_poll_func is not called within the poll interval when the timer is
> configured with the TIMER_DEFERRABLE flag (on a NO_HZ kernel), and hence
> we are losing error records.

... and that to your commit message then, so that it is crystal clear
*why* you're making this change.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [RFC PATCH] apei/ghes: fix ghes_poll_func by registering in non-deferrable mode
  2020-01-06 13:09     ` Borislav Petkov
@ 2020-01-07 11:03       ` Bhaskar Upadhaya
  2020-01-07 13:04         ` Robert Richter
  0 siblings, 1 reply; 8+ messages in thread
From: Bhaskar Upadhaya @ 2020-01-07 11:03 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Bhaskar Upadhaya, linux-kernel, linux-acpi, linux-edac, lenb,
	rafael, gkulkarni, rrichter

On Mon, Jan 6, 2020 at 6:39 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Mon, Jan 06, 2020 at 04:33:19PM +0530, Bhaskar Upadhaya wrote:
> > Definition of the poll interval as per the spec (ACPI 6.3):
> > "Indicates the poll interval in milliseconds OSPM should use to
> > periodically check the error source for the presence of an error
> > condition."
>
> Please add that...
>
> > We are observing an issue on our ThunderX2 platforms wherein
> > ghes_poll_func is not called within the poll interval when the timer is
> > configured with the TIMER_DEFERRABLE flag (on a NO_HZ kernel), and hence
> > we are losing error records.
>
> ... and that to your commit message then, so that it is crystal clear
> *why* you're making this change.

Thanks Borislav, I will update the commit message with your comments in
the next version of the patch.
Can I get your Ack on the next version?

>
> Thx.
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette


* Re: [RFC PATCH] apei/ghes: fix ghes_poll_func by registering in non-deferrable mode
  2020-01-07 11:03       ` Bhaskar Upadhaya
@ 2020-01-07 13:04         ` Robert Richter
  2020-01-07 20:02           ` Borislav Petkov
  0 siblings, 1 reply; 8+ messages in thread
From: Robert Richter @ 2020-01-07 13:04 UTC (permalink / raw)
  To: Bhaskar Upadhaya
  Cc: Borislav Petkov, Bhaskar Upadhaya, linux-kernel, linux-acpi,
	linux-edac, lenb, rafael, Ganapatrao Prabhakerrao Kulkarni

On 07.01.20 16:33:24, Bhaskar Upadhaya wrote:
> On Mon, Jan 6, 2020 at 6:39 PM Borislav Petkov <bp@alien8.de> wrote:
> >
> > On Mon, Jan 06, 2020 at 04:33:19PM +0530, Bhaskar Upadhaya wrote:
> > > Definition of the poll interval as per the spec (ACPI 6.3):
> > > "Indicates the poll interval in milliseconds OSPM should use to
> > > periodically check the error source for the presence of an error
> > > condition."
> >
> > Please add that...
> >
> > > We are observing an issue on our ThunderX2 platforms wherein
> > > ghes_poll_func is not called within the poll interval when the timer is
> > > configured with the TIMER_DEFERRABLE flag (on a NO_HZ kernel), and hence
> > > we are losing error records.
> >
> > ... and that to your commit message then, so that it is crystal clear
> > *why* you're making this change.
> 
> Thanks Borislav, I will update the commit message with your comments in
> the next version of the patch.
> Can I get your Ack on the next version?

I guess Boris will apply the patch to his tree as maintainer, so no
need to ack it.

-Robert


* Re: [RFC PATCH] apei/ghes: fix ghes_poll_func by registering in non-deferrable mode
  2020-01-07 13:04         ` Robert Richter
@ 2020-01-07 20:02           ` Borislav Petkov
  0 siblings, 0 replies; 8+ messages in thread
From: Borislav Petkov @ 2020-01-07 20:02 UTC (permalink / raw)
  To: Robert Richter, Bhaskar Upadhaya
  Cc: Bhaskar Upadhaya, linux-kernel, linux-acpi, linux-edac, lenb,
	rafael, Ganapatrao Prabhakerrao Kulkarni

On Tue, Jan 07, 2020 at 01:04:29PM +0000, Robert Richter wrote:
> > Thanks Borislav, I will update the commit message with your comments in
> > the next version of the patch.
> > Can I get your Ack on the next version?

Acks are given when the new version arrives. Look at the LKML archives
for examples.

> I guess Boris will apply the patch to his tree as maintainer, so no
> need to ack it.

Nah, apei/ghes stuff goes through Rafael. I'm just a reviewer for the
APEI side.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
