linux-renesas-soc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] iopoll: Busy loop and timeout improvements
@ 2023-05-10 13:23 Geert Uytterhoeven
  2023-05-10 13:23 ` [PATCH v2 1/2] iopoll: Call cpu_relax() in busy loops Geert Uytterhoeven
  2023-05-10 13:23 ` [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout_atomic() Geert Uytterhoeven
  0 siblings, 2 replies; 17+ messages in thread
From: Geert Uytterhoeven @ 2023-05-10 13:23 UTC (permalink / raw)
  To: Stephen Boyd, Tomasz Figa, Sylwester Nawrocki, Will Deacon,
	Arnd Bergmann, Wolfram Sang, Dejin Zheng, Kai-Heng Feng,
	Nicholas Piggin, Heiko Carstens, Peter Zijlstra, Russell King,
	John Stultz, Thomas Gleixner, Tony Lindgren, Krzysztof Kozlowski,
	Tero Kristo, Ulf Hansson, Rafael J . Wysocki, Vincent Guittot
  Cc: linux-arm-kernel, linux-renesas-soc, linux-pm, linux-kernel,
	Geert Uytterhoeven

	Hi all,

When implementing a polling loop, a common review comment is to use one
of the read*_poll_timeout*() helpers.  Unfortunately this is not always
possible, or might introduce subtle bugs.  This patch series aims to
improve these helpers, so they can gain wider use.

  1. The first patch improves busy-looping behavior of both the atomic
     and non-atomic read*_poll_timeout*() helpers.
     The issue addressed by this patch was discussed before[1-2], but I
     am not aware of any patches moving forward.

  2. The second patch fixes timeout handling of the atomic variants.
     Some of the issues addressed by this patch were mitigated in
     various places[3-5], and some of these findings may of interest to
     the authors of [6-8].

The first patch was sent before, and already received some acks, but I
hadn't queued it yet as a depedency for more read*_poll_timeout*() use,
because I ran into the issue fixed by the second patch.

Changes compared to v1[9]:
  - Add Acked-by,
  - Add compiler barrier and inlining explanation (thanks, Peter!),
  - Add patch [2/2].

Thanks for your comments!

[1] "Re: [PATCH 6/7] clk: renesas: rcar-gen3: Add custom clock for PLLs"
    https://lore.kernel.org/all/CAMuHMdWUEhs=nwP+a0vO2jOzkq-7FEOqcJ+SsxAGNXX1PQ2KMA@mail.gmail.com
[2] "Re: [PATCH v2] clk: samsung: Prevent potential endless loop in the
    PLL set_rate ops"
    https://lore.kernel.org/all/20200811164628.GA7958@kozik-lap
[3] b3e9431854e8f305 ("bus: ti-sysc: Fix timekeeping_suspended warning
		       on resume")
[4] 44a9e78f9242872c ("clk: samsung: Prevent potential endless loop in
		       the PLL ops")
[5] 3d8598fb9c5a7783 ("clk: ti: clkctrl: use fallback udelay approach if
		      timekeeping is suspended")
[6] bd40cbb0e3b37a4d ("PM: domains: Move genpd's time-accounting to
		       ktime_get_mono_fast_ns()")
[7] c155f6499f9797f2 ("PM-runtime: Switch accounting over to
		       ktime_get_mono_fast_ns()")
[8] 15efb47dc560849d ("PM-runtime: Fix deadlock with ktime_get()")

[9] https://lore.kernel.org/r/8d492ee4a391bd089a01c218b0b4e05cf8ea593c.1674729407.git.geert+renesas@glider.be/

Geert Uytterhoeven (2):
  iopoll: Call cpu_relax() in busy loops
  iopoll: Do not use timekeeping in read_poll_timeout_atomic()

 include/linux/iopoll.h | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

-- 
2.34.1

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/2] iopoll: Call cpu_relax() in busy loops
  2023-05-10 13:23 [PATCH v2 0/2] iopoll: Busy loop and timeout improvements Geert Uytterhoeven
@ 2023-05-10 13:23 ` Geert Uytterhoeven
  2023-05-11  6:48   ` Tony Lindgren
  2023-05-11  9:48   ` Ulf Hansson
  2023-05-10 13:23 ` [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout_atomic() Geert Uytterhoeven
  1 sibling, 2 replies; 17+ messages in thread
From: Geert Uytterhoeven @ 2023-05-10 13:23 UTC (permalink / raw)
  To: Stephen Boyd, Tomasz Figa, Sylwester Nawrocki, Will Deacon,
	Arnd Bergmann, Wolfram Sang, Dejin Zheng, Kai-Heng Feng,
	Nicholas Piggin, Heiko Carstens, Peter Zijlstra, Russell King,
	John Stultz, Thomas Gleixner, Tony Lindgren, Krzysztof Kozlowski,
	Tero Kristo, Ulf Hansson, Rafael J . Wysocki, Vincent Guittot
  Cc: linux-arm-kernel, linux-renesas-soc, linux-pm, linux-kernel,
	Geert Uytterhoeven

It is considered good practice to call cpu_relax() in busy loops, see
Documentation/process/volatile-considered-harmful.rst.  This can not
only lower CPU power consumption or yield to a hyperthreaded twin
processor, but also allows an architecture to mitigate hardware issues
(e.g. ARM Erratum 754327 for Cortex-A9 prior to r2p0) in the
architecture-specific cpu_relax() implementation.

In addition, cpu_relax() is also a compiler barrier.  It is not
immediately obvious that the @op argument "function" will result in an
actual function call (e.g. in case of inlining).

Where a function call is a C sequence point, this is lost on inlining.
Therefore, with agressive enough optimization it might be possible for
the compiler to hoist the:

        (val) = op(args);

"load" out of the loop because it doesn't see the value changing. The
addition of cpu_relax() would inhibit this.

As the iopoll helpers lack calls to cpu_relax(), people are sometimes
reluctant to use them, and may fall back to open-coded polling loops
(including cpu_relax() calls) instead.

Fix this by adding calls to cpu_relax() to the iopoll helpers:
  - For the non-atomic case, it is sufficient to call cpu_relax() in
    case of a zero sleep-between-reads value, as a call to
    usleep_range() is a safe barrier otherwise.  However, it doesn't
    hurt to add the call regardless, for simplicity, and for similarity
    with the atomic case below.
  - For the atomic case, cpu_relax() must be called regardless of the
    sleep-between-reads value, as there is no guarantee all
    architecture-specific implementations of udelay() handle this.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
---
v2:
  - Add Acked-by,
  - Add compiler barrier and inlining explanation (thanks, Peter!).
---
 include/linux/iopoll.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h
index 2c8860e406bd8cae..0417360a6db9b0d6 100644
--- a/include/linux/iopoll.h
+++ b/include/linux/iopoll.h
@@ -53,6 +53,7 @@
 		} \
 		if (__sleep_us) \
 			usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
+		cpu_relax(); \
 	} \
 	(cond) ? 0 : -ETIMEDOUT; \
 })
@@ -95,6 +96,7 @@
 		} \
 		if (__delay_us) \
 			udelay(__delay_us); \
+		cpu_relax(); \
 	} \
 	(cond) ? 0 : -ETIMEDOUT; \
 })
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout_atomic()
  2023-05-10 13:23 [PATCH v2 0/2] iopoll: Busy loop and timeout improvements Geert Uytterhoeven
  2023-05-10 13:23 ` [PATCH v2 1/2] iopoll: Call cpu_relax() in busy loops Geert Uytterhoeven
@ 2023-05-10 13:23 ` Geert Uytterhoeven
  2023-05-10 13:35   ` Arnd Bergmann
                     ` (2 more replies)
  1 sibling, 3 replies; 17+ messages in thread
From: Geert Uytterhoeven @ 2023-05-10 13:23 UTC (permalink / raw)
  To: Stephen Boyd, Tomasz Figa, Sylwester Nawrocki, Will Deacon,
	Arnd Bergmann, Wolfram Sang, Dejin Zheng, Kai-Heng Feng,
	Nicholas Piggin, Heiko Carstens, Peter Zijlstra, Russell King,
	John Stultz, Thomas Gleixner, Tony Lindgren, Krzysztof Kozlowski,
	Tero Kristo, Ulf Hansson, Rafael J . Wysocki, Vincent Guittot
  Cc: linux-arm-kernel, linux-renesas-soc, linux-pm, linux-kernel,
	Geert Uytterhoeven

read_poll_timeout_atomic() uses ktime_get() to implement the timeout
feature, just like its non-atomic counterpart.  However, there are
several issues with this, due to its use in atomic contexts:

  1. When called in the s2ram path (as typically done by clock or PM
     domain drivers), timekeeping may be suspended, triggering the
     WARN_ON(timekeeping_suspended) in ktime_get():

	WARNING: CPU: 0 PID: 654 at kernel/time/timekeeping.c:843 ktime_get+0x28/0x78

     Calling ktime_get_mono_fast_ns() instead of ktime_get() would get
     rid of that warning.  However, that would break timeout handling,
     as (at least on systems with an ARM architectured timer), the time
     returned by ktime_get_mono_fast_ns() does not advance while
     timekeeping is suspended.
     Interestingly, (on the same ARM systems) the time returned by
     ktime_get() does advance while timekeeping is suspended, despite
     the warning.

  2. Depending on the actual clock source, and especially before a
     high-resolution clocksource (e.g. the ARM architectured timer)
     becomes available, time may not advance in atomic contexts, thus
     breaking timeout handling.

Fix this by abandoning the idea that one can rely on timekeeping to
implement timeout handling in all atomic contexts, and switch from a
global time-based to a locally-estimated timeout handling.  In most
(all?) cases the timeout condition is exceptional and an error
condition, hence any additional delays due to underestimating wall clock
time are irrelevant.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
Alternatively, one could use a mixed approach (use both
ktime_get_mono_fast_ns() and a local (under)estimate, and timeout on the
earliest occasion), but I think that would complicate things without
much gain.

v2:
  - New.
---
 include/linux/iopoll.h | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h
index 0417360a6db9b0d6..bb2e1d9117e96679 100644
--- a/include/linux/iopoll.h
+++ b/include/linux/iopoll.h
@@ -81,22 +81,30 @@
 					delay_before_read, args...) \
 ({ \
 	u64 __timeout_us = (timeout_us); \
+	s64 __left_ns = __timeout_us * NSEC_PER_USEC; \
 	unsigned long __delay_us = (delay_us); \
-	ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \
-	if (delay_before_read && __delay_us) \
+	u64 __delay_ns = __delay_us * NSEC_PER_USEC; \
+	if (delay_before_read && __delay_us) { \
 		udelay(__delay_us); \
+		if (__timeout_us) \
+			__left_ns -= __delay_ns; \
+	} \
 	for (;;) { \
 		(val) = op(args); \
 		if (cond) \
 			break; \
-		if (__timeout_us && \
-		    ktime_compare(ktime_get(), __timeout) > 0) { \
+		if (__timeout_us && __left_ns < 0) { \
 			(val) = op(args); \
 			break; \
 		} \
-		if (__delay_us) \
+		if (__delay_us) { \
 			udelay(__delay_us); \
+			if (__timeout_us) \
+				__left_ns -= __delay_ns; \
+		} \
 		cpu_relax(); \
+		if (__timeout_us) \
+			__left_ns--; \
 	} \
 	(cond) ? 0 : -ETIMEDOUT; \
 })
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout_atomic()
  2023-05-10 13:23 ` [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout_atomic() Geert Uytterhoeven
@ 2023-05-10 13:35   ` Arnd Bergmann
  2023-05-10 13:46     ` Geert Uytterhoeven
  2023-05-11  6:48   ` Tony Lindgren
  2023-05-11 10:26   ` Ulf Hansson
  2 siblings, 1 reply; 17+ messages in thread
From: Arnd Bergmann @ 2023-05-10 13:35 UTC (permalink / raw)
  To: Geert Uytterhoeven, Stephen Boyd, Tomasz Figa,
	Sylwester Nawrocki, Will Deacon, Wolfram Sang, Dejin Zheng,
	Kai-Heng Feng, Nicholas Piggin, Heiko Carstens, Peter Zijlstra,
	Russell King, John Stultz, Thomas Gleixner, Tony Lindgren,
	Krzysztof Kozlowski, Tero Kristo, Ulf Hansson, Rafael J. Wysocki,
	Vincent Guittot
  Cc: linux-arm-kernel, Linux-Renesas, linux-pm, linux-kernel

On Wed, May 10, 2023, at 15:23, Geert Uytterhoeven wrote:
> read_poll_timeout_atomic() uses ktime_get() to implement the timeout
> feature, just like its non-atomic counterpart.  However, there are
> several issues with this, due to its use in atomic contexts:
>
>   1. When called in the s2ram path (as typically done by clock or PM
>      domain drivers), timekeeping may be suspended, triggering the
>      WARN_ON(timekeeping_suspended) in ktime_get():
>
> 	WARNING: CPU: 0 PID: 654 at kernel/time/timekeeping.c:843 ktime_get+0x28/0x78
>
>      Calling ktime_get_mono_fast_ns() instead of ktime_get() would get
>      rid of that warning.  However, that would break timeout handling,
>      as (at least on systems with an ARM architectured timer), the time
>      returned by ktime_get_mono_fast_ns() does not advance while
>      timekeeping is suspended.
>      Interestingly, (on the same ARM systems) the time returned by
>      ktime_get() does advance while timekeeping is suspended, despite
>      the warning.
>
>   2. Depending on the actual clock source, and especially before a
>      high-resolution clocksource (e.g. the ARM architectured timer)
>      becomes available, time may not advance in atomic contexts, thus
>      breaking timeout handling.
>
> Fix this by abandoning the idea that one can rely on timekeeping to
> implement timeout handling in all atomic contexts, and switch from a
> global time-based to a locally-estimated timeout handling.  In most
> (all?) cases the timeout condition is exceptional and an error
> condition, hence any additional delays due to underestimating wall clock
> time are irrelevant.
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>

This looks reasonable to me,

Acked-by: Arnd Bergmann <arnd@arndb.de>

I assume you sent this because you ran into the bug on a
particular driver. It might help to be more specific about
how this can be reproduced.

> ---
> Alternatively, one could use a mixed approach (use both
> ktime_get_mono_fast_ns() and a local (under)estimate, and timeout on the
> earliest occasion), but I think that would complicate things without
> much gain.

Agreed.

     Arnd

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout_atomic()
  2023-05-10 13:35   ` Arnd Bergmann
@ 2023-05-10 13:46     ` Geert Uytterhoeven
  2023-05-10 13:56       ` Arnd Bergmann
  0 siblings, 1 reply; 17+ messages in thread
From: Geert Uytterhoeven @ 2023-05-10 13:46 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Stephen Boyd, Tomasz Figa, Sylwester Nawrocki, Will Deacon,
	Wolfram Sang, Dejin Zheng, Kai-Heng Feng, Nicholas Piggin,
	Heiko Carstens, Peter Zijlstra, Russell King, John Stultz,
	Thomas Gleixner, Tony Lindgren, Krzysztof Kozlowski, Tero Kristo,
	Ulf Hansson, Rafael J. Wysocki, Vincent Guittot,
	linux-arm-kernel, Linux-Renesas, linux-pm, linux-kernel

Hi Arnd,

On Wed, May 10, 2023 at 3:36 PM Arnd Bergmann <arnd@arndb.de> wrote:
> On Wed, May 10, 2023, at 15:23, Geert Uytterhoeven wrote:
> > read_poll_timeout_atomic() uses ktime_get() to implement the timeout
> > feature, just like its non-atomic counterpart.  However, there are
> > several issues with this, due to its use in atomic contexts:
> >
> >   1. When called in the s2ram path (as typically done by clock or PM
> >      domain drivers), timekeeping may be suspended, triggering the
> >      WARN_ON(timekeeping_suspended) in ktime_get():
> >
> >       WARNING: CPU: 0 PID: 654 at kernel/time/timekeeping.c:843 ktime_get+0x28/0x78
> >
> >      Calling ktime_get_mono_fast_ns() instead of ktime_get() would get
> >      rid of that warning.  However, that would break timeout handling,
> >      as (at least on systems with an ARM architectured timer), the time
> >      returned by ktime_get_mono_fast_ns() does not advance while
> >      timekeeping is suspended.
> >      Interestingly, (on the same ARM systems) the time returned by
> >      ktime_get() does advance while timekeeping is suspended, despite
> >      the warning.
> >
> >   2. Depending on the actual clock source, and especially before a
> >      high-resolution clocksource (e.g. the ARM architectured timer)
> >      becomes available, time may not advance in atomic contexts, thus
> >      breaking timeout handling.
> >
> > Fix this by abandoning the idea that one can rely on timekeeping to
> > implement timeout handling in all atomic contexts, and switch from a
> > global time-based to a locally-estimated timeout handling.  In most
> > (all?) cases the timeout condition is exceptional and an error
> > condition, hence any additional delays due to underestimating wall clock
> > time are irrelevant.
> >
> > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>
> This looks reasonable to me,
>
> Acked-by: Arnd Bergmann <arnd@arndb.de>

Thanks!

> I assume you sent this because you ran into the bug on a
> particular driver. It might help to be more specific about
> how this can be reproduced.

I first ran into it when converting open-coded loops to
read*_poll_timeout_atomic().
Later, I also saw the issue with the existing
read*_poll_timeout_atomic() calls in the R-Car SYSC driver, but only
after applying additional patches from the BSP that impact the moment
PM Domains are powered during s2ram.
The various pointers to existing mitigations in the cover letter should
give you other suggestions for how to reproduce...

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout_atomic()
  2023-05-10 13:46     ` Geert Uytterhoeven
@ 2023-05-10 13:56       ` Arnd Bergmann
  0 siblings, 0 replies; 17+ messages in thread
From: Arnd Bergmann @ 2023-05-10 13:56 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Stephen Boyd, Tomasz Figa, Sylwester Nawrocki, Will Deacon,
	Wolfram Sang, Dejin Zheng, Kai-Heng Feng, Nicholas Piggin,
	Heiko Carstens, Peter Zijlstra, Russell King, John Stultz,
	Thomas Gleixner, Tony Lindgren, Krzysztof Kozlowski, Tero Kristo,
	Ulf Hansson, Rafael J. Wysocki, Vincent Guittot,
	linux-arm-kernel, Linux-Renesas, linux-pm, linux-kernel

On Wed, May 10, 2023, at 15:46, Geert Uytterhoeven wrote:
> I first ran into it when converting open-coded loops to
> read*_poll_timeout_atomic().
> Later, I also saw the issue with the existing
> read*_poll_timeout_atomic() calls in the R-Car SYSC driver, but only
> after applying additional patches from the BSP that impact the moment
> PM Domains are powered during s2ram.

Ok

> The various pointers to existing mitigations in the cover letter should
> give you other suggestions for how to reproduce...

I see the cover letter now, I thought I had looked for it earlier but
didn't notice it.

     Arnd

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout_atomic()
  2023-05-10 13:23 ` [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout_atomic() Geert Uytterhoeven
  2023-05-10 13:35   ` Arnd Bergmann
@ 2023-05-11  6:48   ` Tony Lindgren
  2023-05-11 10:26   ` Ulf Hansson
  2 siblings, 0 replies; 17+ messages in thread
From: Tony Lindgren @ 2023-05-11  6:48 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Stephen Boyd, Tomasz Figa, Sylwester Nawrocki, Will Deacon,
	Arnd Bergmann, Wolfram Sang, Dejin Zheng, Kai-Heng Feng,
	Nicholas Piggin, Heiko Carstens, Peter Zijlstra, Russell King,
	John Stultz, Thomas Gleixner, Krzysztof Kozlowski, Tero Kristo,
	Ulf Hansson, Rafael J . Wysocki, Vincent Guittot,
	linux-arm-kernel, linux-renesas-soc, linux-pm, linux-kernel

Hi,

* Geert Uytterhoeven <geert+renesas@glider.be> [230510 13:23]:
> read_poll_timeout_atomic() uses ktime_get() to implement the timeout
> feature, just like its non-atomic counterpart.  However, there are
> several issues with this, due to its use in atomic contexts:
> 
>   1. When called in the s2ram path (as typically done by clock or PM
>      domain drivers), timekeeping may be suspended, triggering the
>      WARN_ON(timekeeping_suspended) in ktime_get():

Maybe add a comment to read_poll_timeout_atomic() saying it can be
used also with timekeeping_suspended?

Otherwise a few years later it might get broken when somebody goes
to patch it without testing it with timekeeping_suspended :)

Other than that looks good to me:

Reviewed-by: Tony Lindgren <tony@atomide.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] iopoll: Call cpu_relax() in busy loops
  2023-05-10 13:23 ` [PATCH v2 1/2] iopoll: Call cpu_relax() in busy loops Geert Uytterhoeven
@ 2023-05-11  6:48   ` Tony Lindgren
  2023-05-11 10:48     ` David Laight
  2023-05-11  9:48   ` Ulf Hansson
  1 sibling, 1 reply; 17+ messages in thread
From: Tony Lindgren @ 2023-05-11  6:48 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Stephen Boyd, Tomasz Figa, Sylwester Nawrocki, Will Deacon,
	Arnd Bergmann, Wolfram Sang, Dejin Zheng, Kai-Heng Feng,
	Nicholas Piggin, Heiko Carstens, Peter Zijlstra, Russell King,
	John Stultz, Thomas Gleixner, Krzysztof Kozlowski, Tero Kristo,
	Ulf Hansson, Rafael J . Wysocki, Vincent Guittot,
	linux-arm-kernel, linux-renesas-soc, linux-pm, linux-kernel

* Geert Uytterhoeven <geert+renesas@glider.be> [230510 13:23]:
> It is considered good practice to call cpu_relax() in busy loops, see
> Documentation/process/volatile-considered-harmful.rst.  This can not
> only lower CPU power consumption or yield to a hyperthreaded twin
> processor, but also allows an architecture to mitigate hardware issues
> (e.g. ARM Erratum 754327 for Cortex-A9 prior to r2p0) in the
> architecture-specific cpu_relax() implementation.
> 
> In addition, cpu_relax() is also a compiler barrier.  It is not
> immediately obvious that the @op argument "function" will result in an
> actual function call (e.g. in case of inlining).
> 
> Where a function call is a C sequence point, this is lost on inlining.
> Therefore, with agressive enough optimization it might be possible for
> the compiler to hoist the:
> 
>         (val) = op(args);
> 
> "load" out of the loop because it doesn't see the value changing. The
> addition of cpu_relax() would inhibit this.
> 
> As the iopoll helpers lack calls to cpu_relax(), people are sometimes
> reluctant to use them, and may fall back to open-coded polling loops
> (including cpu_relax() calls) instead.
> 
> Fix this by adding calls to cpu_relax() to the iopoll helpers:
>   - For the non-atomic case, it is sufficient to call cpu_relax() in
>     case of a zero sleep-between-reads value, as a call to
>     usleep_range() is a safe barrier otherwise.  However, it doesn't
>     hurt to add the call regardless, for simplicity, and for similarity
>     with the atomic case below.
>   - For the atomic case, cpu_relax() must be called regardless of the
>     sleep-between-reads value, as there is no guarantee all
>     architecture-specific implementations of udelay() handle this.

Reviewed-by: Tony Lindgren <tony@atomide.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] iopoll: Call cpu_relax() in busy loops
  2023-05-10 13:23 ` [PATCH v2 1/2] iopoll: Call cpu_relax() in busy loops Geert Uytterhoeven
  2023-05-11  6:48   ` Tony Lindgren
@ 2023-05-11  9:48   ` Ulf Hansson
  1 sibling, 0 replies; 17+ messages in thread
From: Ulf Hansson @ 2023-05-11  9:48 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Stephen Boyd, Tomasz Figa, Sylwester Nawrocki, Will Deacon,
	Arnd Bergmann, Wolfram Sang, Dejin Zheng, Kai-Heng Feng,
	Nicholas Piggin, Heiko Carstens, Peter Zijlstra, Russell King,
	John Stultz, Thomas Gleixner, Tony Lindgren, Krzysztof Kozlowski,
	Tero Kristo, Rafael J . Wysocki, Vincent Guittot,
	linux-arm-kernel, linux-renesas-soc, linux-pm, linux-kernel

On Wed, 10 May 2023 at 15:23, Geert Uytterhoeven
<geert+renesas@glider.be> wrote:
>
> It is considered good practice to call cpu_relax() in busy loops, see
> Documentation/process/volatile-considered-harmful.rst.  This can not
> only lower CPU power consumption or yield to a hyperthreaded twin
> processor, but also allows an architecture to mitigate hardware issues
> (e.g. ARM Erratum 754327 for Cortex-A9 prior to r2p0) in the
> architecture-specific cpu_relax() implementation.
>
> In addition, cpu_relax() is also a compiler barrier.  It is not
> immediately obvious that the @op argument "function" will result in an
> actual function call (e.g. in case of inlining).
>
> Where a function call is a C sequence point, this is lost on inlining.
> Therefore, with agressive enough optimization it might be possible for
> the compiler to hoist the:
>
>         (val) = op(args);
>
> "load" out of the loop because it doesn't see the value changing. The
> addition of cpu_relax() would inhibit this.
>
> As the iopoll helpers lack calls to cpu_relax(), people are sometimes
> reluctant to use them, and may fall back to open-coded polling loops
> (including cpu_relax() calls) instead.
>
> Fix this by adding calls to cpu_relax() to the iopoll helpers:
>   - For the non-atomic case, it is sufficient to call cpu_relax() in
>     case of a zero sleep-between-reads value, as a call to
>     usleep_range() is a safe barrier otherwise.  However, it doesn't
>     hurt to add the call regardless, for simplicity, and for similarity
>     with the atomic case below.
>   - For the atomic case, cpu_relax() must be called regardless of the
>     sleep-between-reads value, as there is no guarantee all
>     architecture-specific implementations of udelay() handle this.
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Acked-by: Arnd Bergmann <arnd@arndb.de>

Makes sense to me! Feel free to add:

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Kind regards
Uffe

> ---
> v2:
>   - Add Acked-by,
>   - Add compiler barrier and inlining explanation (thanks, Peter!).
> ---
>  include/linux/iopoll.h | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h
> index 2c8860e406bd8cae..0417360a6db9b0d6 100644
> --- a/include/linux/iopoll.h
> +++ b/include/linux/iopoll.h
> @@ -53,6 +53,7 @@
>                 } \
>                 if (__sleep_us) \
>                         usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
> +               cpu_relax(); \
>         } \
>         (cond) ? 0 : -ETIMEDOUT; \
>  })
> @@ -95,6 +96,7 @@
>                 } \
>                 if (__delay_us) \
>                         udelay(__delay_us); \
> +               cpu_relax(); \
>         } \
>         (cond) ? 0 : -ETIMEDOUT; \
>  })
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout_atomic()
  2023-05-10 13:23 ` [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout_atomic() Geert Uytterhoeven
  2023-05-10 13:35   ` Arnd Bergmann
  2023-05-11  6:48   ` Tony Lindgren
@ 2023-05-11 10:26   ` Ulf Hansson
  2023-05-11 12:44     ` Geert Uytterhoeven
  2 siblings, 1 reply; 17+ messages in thread
From: Ulf Hansson @ 2023-05-11 10:26 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Stephen Boyd, Tomasz Figa, Sylwester Nawrocki, Will Deacon,
	Arnd Bergmann, Wolfram Sang, Dejin Zheng, Kai-Heng Feng,
	Nicholas Piggin, Heiko Carstens, Peter Zijlstra, Russell King,
	John Stultz, Thomas Gleixner, Tony Lindgren, Krzysztof Kozlowski,
	Tero Kristo, Rafael J . Wysocki, Vincent Guittot,
	linux-arm-kernel, linux-renesas-soc, linux-pm, linux-kernel

On Wed, 10 May 2023 at 15:23, Geert Uytterhoeven
<geert+renesas@glider.be> wrote:
>
> read_poll_timeout_atomic() uses ktime_get() to implement the timeout
> feature, just like its non-atomic counterpart.  However, there are
> several issues with this, due to its use in atomic contexts:
>
>   1. When called in the s2ram path (as typically done by clock or PM
>      domain drivers), timekeeping may be suspended, triggering the
>      WARN_ON(timekeeping_suspended) in ktime_get():
>
>         WARNING: CPU: 0 PID: 654 at kernel/time/timekeeping.c:843 ktime_get+0x28/0x78
>
>      Calling ktime_get_mono_fast_ns() instead of ktime_get() would get
>      rid of that warning.  However, that would break timeout handling,
>      as (at least on systems with an ARM architectured timer), the time
>      returned by ktime_get_mono_fast_ns() does not advance while
>      timekeeping is suspended.
>      Interestingly, (on the same ARM systems) the time returned by
>      ktime_get() does advance while timekeeping is suspended, despite
>      the warning.

Interesting, looks like we should spend some time to further
investigate this behaviour.

>
>   2. Depending on the actual clock source, and especially before a
>      high-resolution clocksource (e.g. the ARM architectured timer)
>      becomes available, time may not advance in atomic contexts, thus
>      breaking timeout handling.
>
> Fix this by abandoning the idea that one can rely on timekeeping to
> implement timeout handling in all atomic contexts, and switch from a
> global time-based to a locally-estimated timeout handling.  In most
> (all?) cases the timeout condition is exceptional and an error
> condition, hence any additional delays due to underestimating wall clock
> time are irrelevant.

I wonder if this isn't an oversimplification of the situation. Don't
we have timeout-error-conditions that we expected to happen quite
frequently?

If so, in these cases, we really don't want to continue looping longer
than actually needed, as then we will remain in the atomic context
longer than necessary.

I guess some information about how big these additional delays could
be, would help to understand better. Of course, it's not entirely easy
to get that data, but did you run some tests to see how this changes?

>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> Alternatively, one could use a mixed approach (use both
> ktime_get_mono_fast_ns() and a local (under)estimate, and timeout on the
> earliest occasion), but I think that would complicate things without
> much gain.

Another option could be to provide two different polling APIs for the
atomic use-case.

One that keeps using ktime, which is more accurate and generally
favourable - and another, along the lines of what you propose, that
should be used by those that can't rely on timekeeping.

>
> v2:
>   - New.
> ---
>  include/linux/iopoll.h | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h
> index 0417360a6db9b0d6..bb2e1d9117e96679 100644
> --- a/include/linux/iopoll.h
> +++ b/include/linux/iopoll.h
> @@ -81,22 +81,30 @@
>                                         delay_before_read, args...) \
>  ({ \
>         u64 __timeout_us = (timeout_us); \
> +       s64 __left_ns = __timeout_us * NSEC_PER_USEC; \
>         unsigned long __delay_us = (delay_us); \
> -       ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \
> -       if (delay_before_read && __delay_us) \
> +       u64 __delay_ns = __delay_us * NSEC_PER_USEC; \
> +       if (delay_before_read && __delay_us) { \
>                 udelay(__delay_us); \
> +               if (__timeout_us) \
> +                       __left_ns -= __delay_ns; \
> +       } \
>         for (;;) { \
>                 (val) = op(args); \
>                 if (cond) \
>                         break; \
> -               if (__timeout_us && \
> -                   ktime_compare(ktime_get(), __timeout) > 0) { \
> +               if (__timeout_us && __left_ns < 0) { \
>                         (val) = op(args); \
>                         break; \
>                 } \
> -               if (__delay_us) \
> +               if (__delay_us) { \
>                         udelay(__delay_us); \
> +                       if (__timeout_us) \
> +                               __left_ns -= __delay_ns; \
> +               } \
>                 cpu_relax(); \
> +               if (__timeout_us) \
> +                       __left_ns--; \
>         } \
>         (cond) ? 0 : -ETIMEDOUT; \
>  })
> --
> 2.34.1
>

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH v2 1/2] iopoll: Call cpu_relax() in busy loops
  2023-05-11  6:48   ` Tony Lindgren
@ 2023-05-11 10:48     ` David Laight
  2023-05-23  7:29       ` Geert Uytterhoeven
  0 siblings, 1 reply; 17+ messages in thread
From: David Laight @ 2023-05-11 10:48 UTC (permalink / raw)
  To: 'Tony Lindgren', Geert Uytterhoeven
  Cc: Stephen Boyd, Tomasz Figa, Sylwester Nawrocki, Will Deacon,
	Arnd Bergmann, Wolfram Sang, Dejin Zheng, Kai-Heng Feng,
	Nicholas Piggin, Heiko Carstens, Peter Zijlstra, Russell King,
	John Stultz, Thomas Gleixner, Krzysztof Kozlowski, Tero Kristo,
	Ulf Hansson, Rafael J . Wysocki, Vincent Guittot,
	linux-arm-kernel, linux-renesas-soc, linux-pm, linux-kernel

From: Tony Lindgren
> Sent: 11 May 2023 07:49
> 
> * Geert Uytterhoeven <geert+renesas@glider.be> [230510 13:23]:
> > It is considered good practice to call cpu_relax() in busy loops, see
> > Documentation/process/volatile-considered-harmful.rst.  This can not
> > only lower CPU power consumption or yield to a hyperthreaded twin
> > processor, but also allows an architecture to mitigate hardware issues
> > (e.g. ARM Erratum 754327 for Cortex-A9 prior to r2p0) in the
> > architecture-specific cpu_relax() implementation.

Don't you also need to call cond_resched() (at least some times).
Otherwise the process can't be pre-empted and a RT process
that last ran on that cpu will never be scheduled.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout_atomic()
  2023-05-11 10:26   ` Ulf Hansson
@ 2023-05-11 12:44     ` Geert Uytterhoeven
  2023-05-12  7:53       ` Ulf Hansson
  0 siblings, 1 reply; 17+ messages in thread
From: Geert Uytterhoeven @ 2023-05-11 12:44 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Stephen Boyd, Tomasz Figa, Sylwester Nawrocki, Will Deacon,
	Arnd Bergmann, Wolfram Sang, Dejin Zheng, Kai-Heng Feng,
	Nicholas Piggin, Heiko Carstens, Peter Zijlstra, Russell King,
	John Stultz, Thomas Gleixner, Tony Lindgren, Krzysztof Kozlowski,
	Tero Kristo, Rafael J . Wysocki, Vincent Guittot,
	linux-arm-kernel, linux-renesas-soc, linux-pm, linux-kernel

Hi Ulf,

On Thu, May 11, 2023 at 12:27 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On Wed, 10 May 2023 at 15:23, Geert Uytterhoeven
> <geert+renesas@glider.be> wrote:
> > read_poll_timeout_atomic() uses ktime_get() to implement the timeout
> > feature, just like its non-atomic counterpart.  However, there are
> > several issues with this, due to its use in atomic contexts:
> >
> >   1. When called in the s2ram path (as typically done by clock or PM
> >      domain drivers), timekeeping may be suspended, triggering the
> >      WARN_ON(timekeeping_suspended) in ktime_get():
> >
> >         WARNING: CPU: 0 PID: 654 at kernel/time/timekeeping.c:843 ktime_get+0x28/0x78
> >
> >      Calling ktime_get_mono_fast_ns() instead of ktime_get() would get
> >      rid of that warning.  However, that would break timeout handling,
> >      as (at least on systems with an ARM architectured timer), the time
> >      returned by ktime_get_mono_fast_ns() does not advance while
> >      timekeeping is suspended.
> >      Interestingly, (on the same ARM systems) the time returned by
> >      ktime_get() does advance while timekeeping is suspended, despite
> >      the warning.
>
> Interesting, looks like we should spend some time to further
> investigate this behaviour.

Probably, I was a bit surprised by this behavior, too.

> >   2. Depending on the actual clock source, and especially before a
> >      high-resolution clocksource (e.g. the ARM architectured timer)
> >      becomes available, time may not advance in atomic contexts, thus
> >      breaking timeout handling.
> >
> > Fix this by abandoning the idea that one can rely on timekeeping to
> > implement timeout handling in all atomic contexts, and switch from a
> > global time-based to a locally-estimated timeout handling.  In most
> > (all?) cases the timeout condition is exceptional and an error
> > condition, hence any additional delays due to underestimating wall clock
> > time are irrelevant.
>
> I wonder if this isn't an oversimplification of the situation. Don't
> we have timeout-error-conditions that we expected to happen quite
> frequently?

We may have some.  But they definitely do not happen when time
does not advance, or they would have been mitigated long ago
(the loop would never terminate).

> If so, in these cases, we really don't want to continue looping longer
> than actually needed, as then we will remain in the atomic context
> longer than necessary.
>
> I guess some information about how big these additional delays could
> be, would help to understand better. Of course, it's not entirely easy
> to get that data, but did you run some tests to see how this changes?

I did some timings (when timekeeping is available), and the differences
are rather minor.  The delay and timeout parameters are in µs, and
1 µs is already a few orders of magnitude larger than the cycle time
of a contemporary CPU.

Under-estimates are due to the time spent in op() (depends on the
user, typical use is a hardware device register read), udelay()
(architecture/platform-dependent accuracy), and general loop overhead.

> > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> > ---
> > Alternatively, one could use a mixed approach (use both
> > ktime_get_mono_fast_ns() and a local (under)estimate, and timeout on the
> > earliest occasion), but I think that would complicate things without
> > much gain.
>
> Another option could be to provide two different polling APIs for the
> atomic use-case.
>
> One that keeps using ktime, which is more accurate and generally
> favourable - and another, along the lines of what you propose, that
> should be used by those that can't rely on timekeeping.

At the risk of people picking the wrong one, leading to hard to
find bugs?

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout_atomic()
  2023-05-11 12:44     ` Geert Uytterhoeven
@ 2023-05-12  7:53       ` Ulf Hansson
  2023-05-12  8:03         ` Geert Uytterhoeven
  0 siblings, 1 reply; 17+ messages in thread
From: Ulf Hansson @ 2023-05-12  7:53 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Stephen Boyd, Tomasz Figa, Sylwester Nawrocki, Will Deacon,
	Arnd Bergmann, Wolfram Sang, Dejin Zheng, Kai-Heng Feng,
	Nicholas Piggin, Heiko Carstens, Peter Zijlstra, Russell King,
	John Stultz, Thomas Gleixner, Tony Lindgren, Krzysztof Kozlowski,
	Tero Kristo, Rafael J . Wysocki, Vincent Guittot,
	linux-arm-kernel, linux-renesas-soc, linux-pm, linux-kernel

On Thu, 11 May 2023 at 14:44, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Ulf,
>
> On Thu, May 11, 2023 at 12:27 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > On Wed, 10 May 2023 at 15:23, Geert Uytterhoeven
> > <geert+renesas@glider.be> wrote:
> > > read_poll_timeout_atomic() uses ktime_get() to implement the timeout
> > > feature, just like its non-atomic counterpart.  However, there are
> > > several issues with this, due to its use in atomic contexts:
> > >
> > >   1. When called in the s2ram path (as typically done by clock or PM
> > >      domain drivers), timekeeping may be suspended, triggering the
> > >      WARN_ON(timekeeping_suspended) in ktime_get():
> > >
> > >         WARNING: CPU: 0 PID: 654 at kernel/time/timekeeping.c:843 ktime_get+0x28/0x78
> > >
> > >      Calling ktime_get_mono_fast_ns() instead of ktime_get() would get
> > >      rid of that warning.  However, that would break timeout handling,
> > >      as (at least on systems with an ARM architectured timer), the time
> > >      returned by ktime_get_mono_fast_ns() does not advance while
> > >      timekeeping is suspended.
> > >      Interestingly, (on the same ARM systems) the time returned by
> > >      ktime_get() does advance while timekeeping is suspended, despite
> > >      the warning.
> >
> > Interesting, looks like we should spend some time to further
> > investigate this behaviour.
>
> Probably, I was a bit surprised by this behavior, too.
>
> > >   2. Depending on the actual clock source, and especially before a
> > >      high-resolution clocksource (e.g. the ARM architectured timer)
> > >      becomes available, time may not advance in atomic contexts, thus
> > >      breaking timeout handling.
> > >
> > > Fix this by abandoning the idea that one can rely on timekeeping to
> > > implement timeout handling in all atomic contexts, and switch from a
> > > global time-based to a locally-estimated timeout handling.  In most
> > > (all?) cases the timeout condition is exceptional and an error
> > > condition, hence any additional delays due to underestimating wall clock
> > > time are irrelevant.
> >
> > I wonder if this isn't an oversimplification of the situation. Don't
> > we have timeout-error-conditions that we expected to happen quite
> > frequently?
>
> We may have some.  But they definitely do not happen when time
> does not advance, or they would have been mitigated long ago
> (the loop would never terminate).

Right, I was merely thinking of the case when ktime isn't suspended,
which of course is the most common case.

>
> > If so, in these cases, we really don't want to continue looping longer
> > than actually needed, as then we will remain in the atomic context
> > longer than necessary.
> >
> > I guess some information about how big these additional delays could
> > be, would help to understand better. Of course, it's not entirely easy
> > to get that data, but did you run some tests to see how this changes?
>
> I did some timings (when timekeeping is available), and the differences
> are rather minor.  The delay and timeout parameters are in µs, and
> 1 µs is already a few orders of magnitude larger than the cycle time
> of a contemporary CPU.

Ohh, I was certainly expecting a bigger spread. If it's in that
ballpark we should certainly be fine.

I will run some tests at my side too, as I am curious to see the
behaviour. I will let you know, whatever the result is, of course.

>
> Under-estimates are due to the time spent in op() (depends on the
> user, typical use is a hardware device register read), udelay()
> (architecture/platform-dependent accuracy), and general loop overhead.

Yes, you are right. My main concern is the accuracy of the udelay, but
I may be totally wrong here.

>
> > > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> > > ---
> > > Alternatively, one could use a mixed approach (use both
> > > ktime_get_mono_fast_ns() and a local (under)estimate, and timeout on the
> > > earliest occasion), but I think that would complicate things without
> > > much gain.
> >
> > Another option could be to provide two different polling APIs for the
> > atomic use-case.
> >
> > One that keeps using ktime, which is more accurate and generally
> > favourable - and another, along the lines of what you propose, that
> > should be used by those that can't rely on timekeeping.
>
> At the risk of people picking the wrong one, leading to hard to
> find bugs?

I agree, If we don't need two APIs, it's certainly better to stick with one.

My main point is that we should not sacrifice "performance" for the
most common case, just to keep things simple, right?

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout_atomic()
  2023-05-12  7:53       ` Ulf Hansson
@ 2023-05-12  8:03         ` Geert Uytterhoeven
  2023-05-15  9:26           ` Ulf Hansson
  0 siblings, 1 reply; 17+ messages in thread
From: Geert Uytterhoeven @ 2023-05-12  8:03 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Stephen Boyd, Tomasz Figa, Sylwester Nawrocki, Will Deacon,
	Arnd Bergmann, Wolfram Sang, Dejin Zheng, Kai-Heng Feng,
	Nicholas Piggin, Heiko Carstens, Peter Zijlstra, Russell King,
	John Stultz, Thomas Gleixner, Tony Lindgren, Krzysztof Kozlowski,
	Tero Kristo, Rafael J . Wysocki, Vincent Guittot,
	linux-arm-kernel, linux-renesas-soc, linux-pm, linux-kernel

Hi Ulf,

On Fri, May 12, 2023 at 9:54 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On Thu, 11 May 2023 at 14:44, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > On Thu, May 11, 2023 at 12:27 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > > On Wed, 10 May 2023 at 15:23, Geert Uytterhoeven
> > > <geert+renesas@glider.be> wrote:
> > > > read_poll_timeout_atomic() uses ktime_get() to implement the timeout
> > > > feature, just like its non-atomic counterpart.  However, there are
> > > > several issues with this, due to its use in atomic contexts:
> > > >
> > > >   1. When called in the s2ram path (as typically done by clock or PM
> > > >      domain drivers), timekeeping may be suspended, triggering the
> > > >      WARN_ON(timekeeping_suspended) in ktime_get():
> > > >
> > > >         WARNING: CPU: 0 PID: 654 at kernel/time/timekeeping.c:843 ktime_get+0x28/0x78
> > > >
> > > >      Calling ktime_get_mono_fast_ns() instead of ktime_get() would get
> > > >      rid of that warning.  However, that would break timeout handling,
> > > >      as (at least on systems with an ARM architectured timer), the time
> > > >      returned by ktime_get_mono_fast_ns() does not advance while
> > > >      timekeeping is suspended.
> > > >      Interestingly, (on the same ARM systems) the time returned by
> > > >      ktime_get() does advance while timekeeping is suspended, despite
> > > >      the warning.
> > >
> > > Interesting, looks like we should spend some time to further
> > > investigate this behaviour.
> >
> > Probably, I was a bit surprised by this behavior, too.
> >
> > > >   2. Depending on the actual clock source, and especially before a
> > > >      high-resolution clocksource (e.g. the ARM architectured timer)
> > > >      becomes available, time may not advance in atomic contexts, thus
> > > >      breaking timeout handling.
> > > >
> > > > Fix this by abandoning the idea that one can rely on timekeeping to
> > > > implement timeout handling in all atomic contexts, and switch from a
> > > > global time-based to a locally-estimated timeout handling.  In most
> > > > (all?) cases the timeout condition is exceptional and an error
> > > > condition, hence any additional delays due to underestimating wall clock
> > > > time are irrelevant.
> > >
> > > I wonder if this isn't an oversimplification of the situation. Don't
> > > we have timeout-error-conditions that we expected to happen quite
> > > frequently?
> >
> > We may have some.  But they definitely do not happen when time
> > does not advance, or they would have been mitigated long ago
> > (the loop would never terminate).
>
> Right, I was merely thinking of the case when ktime isn't suspended,
> which of course is the most common case.
>
> >
> > > If so, in these cases, we really don't want to continue looping longer
> > > than actually needed, as then we will remain in the atomic context
> > > longer than necessary.
> > >
> > > I guess some information about how big these additional delays could
> > > be, would help to understand better. Of course, it's not entirely easy
> > > to get that data, but did you run some tests to see how this changes?
> >
> > I did some timings (when timekeeping is available), and the differences
> > are rather minor.  The delay and timeout parameters are in µs, and
> > 1 µs is already a few orders of magnitude larger than the cycle time
> > of a contemporary CPU.
>
> Ohh, I was certainly expecting a bigger spread. If it's in that
> ballpark we should certainly be fine.
>
> I will run some tests at my side too, as I am curious to see the
> behaviour. I will let you know, whatever the result is, of course.
>
> >
> > Under-estimates are due to the time spent in op() (depends on the
> > user, typical use is a hardware device register read), udelay()
> > (architecture/platform-dependent accuracy), and general loop overhead.
>
> Yes, you are right. My main concern is the accuracy of the udelay, but
> I may be totally wrong here.
>
> >
> > > > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> > > > ---
> > > > Alternatively, one could use a mixed approach (use both
> > > > ktime_get_mono_fast_ns() and a local (under)estimate, and timeout on the
> > > > earliest occasion), but I think that would complicate things without
> > > > much gain.
> > >
> > > Another option could be to provide two different polling APIs for the
> > > atomic use-case.
> > >
> > > One that keeps using ktime, which is more accurate and generally
> > > favourable - and another, along the lines of what you propose, that
> > > should be used by those that can't rely on timekeeping.
> >
> > At the risk of people picking the wrong one, leading to hard to
> > find bugs?
>
> I agree, If we don't need two APIs, it's certainly better to stick with one.
>
> My main point is that we should not sacrifice "performance" for the
> most common case, just to keep things simple, right?

Most of these loops run just 1 or 2 cycles.
Performance mostly kicks in when timing out, but note that not
calling ktime_get() also reduces loop overhead...

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout_atomic()
  2023-05-12  8:03         ` Geert Uytterhoeven
@ 2023-05-15  9:26           ` Ulf Hansson
  0 siblings, 0 replies; 17+ messages in thread
From: Ulf Hansson @ 2023-05-15  9:26 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Stephen Boyd, Tomasz Figa, Sylwester Nawrocki, Will Deacon,
	Arnd Bergmann, Wolfram Sang, Dejin Zheng, Kai-Heng Feng,
	Nicholas Piggin, Heiko Carstens, Peter Zijlstra, Russell King,
	John Stultz, Thomas Gleixner, Tony Lindgren, Krzysztof Kozlowski,
	Tero Kristo, Rafael J . Wysocki, Vincent Guittot,
	linux-arm-kernel, linux-renesas-soc, linux-pm, linux-kernel

On Fri, 12 May 2023 at 10:03, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Ulf,
>
> On Fri, May 12, 2023 at 9:54 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > On Thu, 11 May 2023 at 14:44, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > On Thu, May 11, 2023 at 12:27 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > > > On Wed, 10 May 2023 at 15:23, Geert Uytterhoeven
> > > > <geert+renesas@glider.be> wrote:
> > > > > read_poll_timeout_atomic() uses ktime_get() to implement the timeout
> > > > > feature, just like its non-atomic counterpart.  However, there are
> > > > > several issues with this, due to its use in atomic contexts:
> > > > >
> > > > >   1. When called in the s2ram path (as typically done by clock or PM
> > > > >      domain drivers), timekeeping may be suspended, triggering the
> > > > >      WARN_ON(timekeeping_suspended) in ktime_get():
> > > > >
> > > > >         WARNING: CPU: 0 PID: 654 at kernel/time/timekeeping.c:843 ktime_get+0x28/0x78
> > > > >
> > > > >      Calling ktime_get_mono_fast_ns() instead of ktime_get() would get
> > > > >      rid of that warning.  However, that would break timeout handling,
> > > > >      as (at least on systems with an ARM architectured timer), the time
> > > > >      returned by ktime_get_mono_fast_ns() does not advance while
> > > > >      timekeeping is suspended.
> > > > >      Interestingly, (on the same ARM systems) the time returned by
> > > > >      ktime_get() does advance while timekeeping is suspended, despite
> > > > >      the warning.
> > > >
> > > > Interesting, looks like we should spend some time to further
> > > > investigate this behaviour.
> > >
> > > Probably, I was a bit surprised by this behavior, too.
> > >
> > > > >   2. Depending on the actual clock source, and especially before a
> > > > >      high-resolution clocksource (e.g. the ARM architectured timer)
> > > > >      becomes available, time may not advance in atomic contexts, thus
> > > > >      breaking timeout handling.
> > > > >
> > > > > Fix this by abandoning the idea that one can rely on timekeeping to
> > > > > implement timeout handling in all atomic contexts, and switch from a
> > > > > global time-based to a locally-estimated timeout handling.  In most
> > > > > (all?) cases the timeout condition is exceptional and an error
> > > > > condition, hence any additional delays due to underestimating wall clock
> > > > > time are irrelevant.
> > > >
> > > > I wonder if this isn't an oversimplification of the situation. Don't
> > > > we have timeout-error-conditions that we expected to happen quite
> > > > frequently?
> > >
> > > We may have some.  But they definitely do not happen when time
> > > does not advance, or they would have been mitigated long ago
> > > (the loop would never terminate).
> >
> > Right, I was merely thinking of the case when ktime isn't suspended,
> > which of course is the most common case.
> >
> > >
> > > > If so, in these cases, we really don't want to continue looping longer
> > > > than actually needed, as then we will remain in the atomic context
> > > > longer than necessary.
> > > >
> > > > I guess some information about how big these additional delays could
> > > > be, would help to understand better. Of course, it's not entirely easy
> > > > to get that data, but did you run some tests to see how this changes?
> > >
> > > I did some timings (when timekeeping is available), and the differences
> > > are rather minor.  The delay and timeout parameters are in µs, and
> > > 1 µs is already a few orders of magnitude larger than the cycle time
> > > of a contemporary CPU.
> >
> > Ohh, I was certainly expecting a bigger spread. If it's in that
> > ballpark we should certainly be fine.
> >
> > I will run some tests at my side too, as I am curious to see the
> > behaviour. I will let you know, whatever the result is, of course.
> >
> > >
> > > Under-estimates are due to the time spent in op() (depends on the
> > > user, typical use is a hardware device register read), udelay()
> > > (architecture/platform-dependent accuracy), and general loop overhead.
> >
> > Yes, you are right. My main concern is the accuracy of the udelay, but
> > I may be totally wrong here.
> >
> > >
> > > > > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> > > > > ---
> > > > > Alternatively, one could use a mixed approach (use both
> > > > > ktime_get_mono_fast_ns() and a local (under)estimate, and timeout on the
> > > > > earliest occasion), but I think that would complicate things without
> > > > > much gain.
> > > >
> > > > Another option could be to provide two different polling APIs for the
> > > > atomic use-case.
> > > >
> > > > One that keeps using ktime, which is more accurate and generally
> > > > favourable - and another, along the lines of what you propose, that
> > > > should be used by those that can't rely on timekeeping.
> > >
> > > At the risk of people picking the wrong one, leading to hard to
> > > find bugs?
> >
> > I agree, If we don't need two APIs, it's certainly better to stick with one.
> >
> > My main point is that we should not sacrifice "performance" for the
> > most common case, just to keep things simple, right?
>
> Most of these loops run just 1 or 2 cycles.
> Performance mostly kicks in when timing out, but note that not
> calling ktime_get() also reduces loop overhead...

That's a good point too!

It sure sounds like the benefits are superior to the potential
downside. Let me not stand in the way of getting this applied.
Instead, feel free to add my:

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/2] iopoll: Call cpu_relax() in busy loops
  2023-05-11 10:48     ` David Laight
@ 2023-05-23  7:29       ` Geert Uytterhoeven
  2023-05-23  8:55         ` David Laight
  0 siblings, 1 reply; 17+ messages in thread
From: Geert Uytterhoeven @ 2023-05-23  7:29 UTC (permalink / raw)
  To: David Laight
  Cc: Tony Lindgren, Stephen Boyd, Tomasz Figa, Sylwester Nawrocki,
	Will Deacon, Arnd Bergmann, Wolfram Sang, Dejin Zheng,
	Kai-Heng Feng, Nicholas Piggin, Heiko Carstens, Peter Zijlstra,
	Russell King, John Stultz, Thomas Gleixner, Krzysztof Kozlowski,
	Tero Kristo, Ulf Hansson, Rafael J . Wysocki, Vincent Guittot,
	linux-arm-kernel, linux-renesas-soc, linux-pm, linux-kernel

Hi David,

On Thu, May 11, 2023 at 12:49 PM David Laight <David.Laight@aculab.com> wrote:
> > * Geert Uytterhoeven <geert+renesas@glider.be> [230510 13:23]:
> > > It is considered good practice to call cpu_relax() in busy loops, see
> > > Documentation/process/volatile-considered-harmful.rst.  This can not
> > > only lower CPU power consumption or yield to a hyperthreaded twin
> > > processor, but also allows an architecture to mitigate hardware issues
> > > (e.g. ARM Erratum 754327 for Cortex-A9 prior to r2p0) in the
> > > architecture-specific cpu_relax() implementation.
>
> Don't you also need to call cond_resched() (at least some times).
> Otherwise the process can't be pre-empted and a RT process
> that last ran on that cpu will never be scheduled.

According to [1], cond_resched() must be called at least once per few
tens of milliseconds.

read_poll_timeout() uses usleep_range(), which calls schedule_hrtimeout_range().
read_poll_timeout_atomic() should not be used with multi-ms timeouts anyway.

So I guess we're OK?

[1] https://elixir.bootlin.com/linux/latest/source/Documentation/RCU/Design/Requirements/Requirements.rst#L2348

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH v2 1/2] iopoll: Call cpu_relax() in busy loops
  2023-05-23  7:29       ` Geert Uytterhoeven
@ 2023-05-23  8:55         ` David Laight
  0 siblings, 0 replies; 17+ messages in thread
From: David Laight @ 2023-05-23  8:55 UTC (permalink / raw)
  To: 'Geert Uytterhoeven'
  Cc: Tony Lindgren, Stephen Boyd, Tomasz Figa, Sylwester Nawrocki,
	Will Deacon, Arnd Bergmann, Wolfram Sang, Dejin Zheng,
	Kai-Heng Feng, Nicholas Piggin, Heiko Carstens, Peter Zijlstra,
	Russell King, John Stultz, Thomas Gleixner, Krzysztof Kozlowski,
	Tero Kristo, Ulf Hansson, Rafael J . Wysocki, Vincent Guittot,
	linux-arm-kernel, linux-renesas-soc, linux-pm, linux-kernel

From: Geert Uytterhoeven
> Sent: 23 May 2023 08:30
> 
> Hi David,
> 
> On Thu, May 11, 2023 at 12:49 PM David Laight <David.Laight@aculab.com> wrote:
> > > * Geert Uytterhoeven <geert+renesas@glider.be> [230510 13:23]:
> > > > It is considered good practice to call cpu_relax() in busy loops, see
> > > > Documentation/process/volatile-considered-harmful.rst.  This can not
> > > > only lower CPU power consumption or yield to a hyperthreaded twin
> > > > processor, but also allows an architecture to mitigate hardware issues
> > > > (e.g. ARM Erratum 754327 for Cortex-A9 prior to r2p0) in the
> > > > architecture-specific cpu_relax() implementation.
> >
> > Don't you also need to call cond_resched() (at least some times).
> > Otherwise the process can't be pre-empted and a RT process
> > that last ran on that cpu will never be scheduled.
> 
> According to [1], cond_resched() must be called at least once per few
> tens of milliseconds.

Hmmm.... tens of milliseconds is really much too long for RT threads.
A limit nearer 1ms would be barely acceptable.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2023-05-23  8:55 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-10 13:23 [PATCH v2 0/2] iopoll: Busy loop and timeout improvements Geert Uytterhoeven
2023-05-10 13:23 ` [PATCH v2 1/2] iopoll: Call cpu_relax() in busy loops Geert Uytterhoeven
2023-05-11  6:48   ` Tony Lindgren
2023-05-11 10:48     ` David Laight
2023-05-23  7:29       ` Geert Uytterhoeven
2023-05-23  8:55         ` David Laight
2023-05-11  9:48   ` Ulf Hansson
2023-05-10 13:23 ` [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout_atomic() Geert Uytterhoeven
2023-05-10 13:35   ` Arnd Bergmann
2023-05-10 13:46     ` Geert Uytterhoeven
2023-05-10 13:56       ` Arnd Bergmann
2023-05-11  6:48   ` Tony Lindgren
2023-05-11 10:26   ` Ulf Hansson
2023-05-11 12:44     ` Geert Uytterhoeven
2023-05-12  7:53       ` Ulf Hansson
2023-05-12  8:03         ` Geert Uytterhoeven
2023-05-15  9:26           ` Ulf Hansson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).