* [PATCH] debug: More properly delay for secondary CPUs
@ 2016-10-14 18:41 Douglas Anderson
  2016-10-18  9:48 ` Daniel Thompson
  2016-10-20  0:14 ` Andrew Morton
  0 siblings, 2 replies; 3+ messages in thread
From: Douglas Anderson @ 2016-10-14 18:41 UTC (permalink / raw)
  To: Jason Wessel
  Cc: Daniel Thompson, Andrew Morton, briannorris, Douglas Anderson,
	kgdb-bugreport, linux-kernel

We've got a delay loop waiting for secondary CPUs.  That loop uses
loops_per_jiffy.  However, loops_per_jiffy doesn't actually mean how
many tight loops make up a jiffy on all architectures.  It is quite
common to see things like this in the boot log:
  Calibrating delay loop (skipped), value calculated using timer
  frequency.. 48.00 BogoMIPS (lpj=24000)

In my case I was often seeing other CPUs time out while entering the
debugger, only to print their stack crawls shortly after the kdb>
prompt was written.

It appears that other code with similar loops (like __spin_lock_debug)
adds an extra __delay(1) in there which makes it work better.
Presumably the __delay(1) is very safe.  At least on modern ARM/ARM64
systems it will just do a CP15 instruction, which should be safe.  On
older ARM systems it will fall back to an actual delay loop, or perhaps
another memory-mapped timer.  On other platforms it must be safe too or
it wouldn't be used in __spin_lock_debug.

Note that we use __delay(100) instead of __delay(1) so we can get a
little closer to a more accurate timeout on systems where __delay() is
backed by a timer.  It's better to have a more accurate timeout and the
only penalty is that we might wait an extra 99 "loops" before we enter
the debugger.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
 kernel/debug/debug_core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 0874e2edd275..454150d98dbc 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -598,11 +598,11 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
 	/*
 	 * Wait for the other CPUs to be notified and be waiting for us:
 	 */
-	time_left = loops_per_jiffy * HZ;
+	time_left = DIV_ROUND_UP(loops_per_jiffy * HZ, 100);
 	while (kgdb_do_roundup && --time_left &&
 	       (atomic_read(&masters_in_kgdb) + atomic_read(&slaves_in_kgdb)) !=
 		   online_cpus)
-		cpu_relax();
+		__delay(100);
 	if (!time_left)
 		pr_crit("Timed out waiting for secondary CPUs.\n");
 
-- 
2.8.0.rc3.226.g39d4020
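
For a sense of scale, the timeout arithmetic in the patch above can be
sketched in userspace C. This is not part of the patch. It assumes
HZ=1000, which is what lpj=24000 and 48.00 BogoMIPS in the quoted boot
log imply if BogoMIPS is printed as lpj * HZ / 500000, and it assumes a
timer-backed __delay(n) that waits n ticks of a 24 MHz timer. The
function names are hypothetical and chosen only for illustration.

```c
/* Same rounding helper the kernel provides as DIV_ROUND_UP(). */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Loop budget before the patch: one cpu_relax() per count. */
static unsigned long budget_before(unsigned long lpj, unsigned long hz)
{
	return lpj * hz;
}

/* Loop budget after the patch: one __delay(100) per count. */
static unsigned long budget_after(unsigned long lpj, unsigned long hz)
{
	return DIV_ROUND_UP(lpj * hz, 100);
}

/*
 * Total wait in microseconds, ASSUMING each loop pass waits exactly
 * ticks_per_pass ticks of a timer running at timer_hz. Real hardware
 * adds the cost of the loop condition on top of this.
 */
static unsigned long long timeout_us(unsigned long passes,
				     unsigned long ticks_per_pass,
				     unsigned long timer_hz)
{
	return (unsigned long long)passes * ticks_per_pass * 1000000ULL /
	       timer_hz;
}
```

With lpj=24000 and the assumed HZ=1000, budget_before() gives 24,000,000
passes and budget_after() gives 240,000. At 100 timer ticks per pass and
a 24 MHz timer, timeout_us() comes out at 1,000,000, i.e. the intended
one second. The old loop made 24,000,000 passes of cpu_relax(), which
only adds up to a second if each pass takes about 41.7 ns.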


* Re: [PATCH] debug: More properly delay for secondary CPUs
  2016-10-14 18:41 [PATCH] debug: More properly delay for secondary CPUs Douglas Anderson
@ 2016-10-18  9:48 ` Daniel Thompson
  2016-10-20  0:14 ` Andrew Morton
  1 sibling, 0 replies; 3+ messages in thread
From: Daniel Thompson @ 2016-10-18  9:48 UTC (permalink / raw)
  To: Douglas Anderson, Jason Wessel
  Cc: Andrew Morton, briannorris, kgdb-bugreport, linux-kernel

On 14/10/16 19:41, Douglas Anderson wrote:
> We've got a delay loop waiting for secondary CPUs.  That loop uses
> loops_per_jiffy.  However, loops_per_jiffy doesn't actually mean how
> many tight loops make up a jiffy on all architectures.  It is quite
> common to see things like this in the boot log:
>   Calibrating delay loop (skipped), value calculated using timer
>   frequency.. 48.00 BogoMIPS (lpj=24000)
>
> In my case I was often seeing other CPUs time out while entering the
> debugger, only to print their stack crawls shortly after the kdb>
> prompt was written.
>
> It appears that other code with similar loops (like __spin_lock_debug)
> adds an extra __delay(1) in there which makes it work better.
> Presumably the __delay(1) is very safe.  At least on modern ARM/ARM64
> systems it will just do a CP15 instruction, which should be safe.  On
> older ARM systems it will fall back to an actual delay loop, or perhaps
> another memory-mapped timer.  On other platforms it must be safe too or
> it wouldn't be used in __spin_lock_debug.
>
> Note that we use __delay(100) instead of __delay(1) so we can get a
> little closer to a more accurate timeout on systems where __delay() is
> backed by a timer.  It's better to have a more accurate timeout and the
> only penalty is that we might wait an extra 99 "loops" before we enter
> the debugger.

It would probably be better to switch this code fully over to udelay(10) 
instead (and forget about loops_per_jiffy entirely). Even udelay(10) is 
still plenty fast enough not to be human detectable when bringing up the 
debug prompt.

Note udelay() is already used internally to kgdb so there should be 
little risk introducing it here.
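
A rough userspace sketch of what such a udelay(10)-based wait might look
like follows. It is not code from this thread or from the kernel: the
udelay() here is a stub that advances a simulated clock, all_cpus_in()
stands in for the masters_in_kgdb + slaves_in_kgdb == online_cpus check,
and the one-second budget, the constants and the helper names are all
assumptions made for illustration.

```c
#include <stdbool.h>

/* Simulated state (hypothetical; stands in for real kernel state). */
static unsigned long sim_now_us;     /* simulated clock, microseconds */
static unsigned long sim_arrival_us; /* when the last CPU checks in */

/* Reset the simulation so the last CPU arrives at arrival_us. */
static void sim_reset(unsigned long arrival_us)
{
	sim_now_us = 0;
	sim_arrival_us = arrival_us;
}

/* Stub for the kernel's udelay(): just advances the simulated clock. */
static void udelay(unsigned long usecs)
{
	sim_now_us += usecs;
}

/* Stand-in for "all online CPUs have entered the debugger". */
static bool all_cpus_in(void)
{
	return sim_now_us >= sim_arrival_us;
}

#define ROUNDUP_TIMEOUT_US 1000000UL /* assumed budget: about 1 second */
#define ROUNDUP_POLL_US    10UL      /* the udelay(10) suggested above */

/* Returns true if every CPU checked in, false on timeout. */
static bool wait_for_secondary_cpus(void)
{
	unsigned long time_left = ROUNDUP_TIMEOUT_US / ROUNDUP_POLL_US;

	/* Same shape as the original loop, minus loops_per_jiffy. */
	while (--time_left && !all_cpus_in())
		udelay(ROUNDUP_POLL_US);

	return time_left != 0;
}
```

Because the budget is expressed in microseconds rather than in
loops_per_jiffy, the timeout in this sketch no longer depends on how an
architecture calibrates its delay loop, only on udelay() being roughly
accurate.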


Daniel.


>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
>  kernel/debug/debug_core.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> index 0874e2edd275..454150d98dbc 100644
> --- a/kernel/debug/debug_core.c
> +++ b/kernel/debug/debug_core.c
> @@ -598,11 +598,11 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
>  	/*
>  	 * Wait for the other CPUs to be notified and be waiting for us:
>  	 */
> -	time_left = loops_per_jiffy * HZ;
> +	time_left = DIV_ROUND_UP(loops_per_jiffy * HZ, 100);
>  	while (kgdb_do_roundup && --time_left &&
>  	       (atomic_read(&masters_in_kgdb) + atomic_read(&slaves_in_kgdb)) !=
>  		   online_cpus)
> -		cpu_relax();
> +		__delay(100);
>  	if (!time_left)
>  		pr_crit("Timed out waiting for secondary CPUs.\n");
>
>


* Re: [PATCH] debug: More properly delay for secondary CPUs
  2016-10-14 18:41 [PATCH] debug: More properly delay for secondary CPUs Douglas Anderson
  2016-10-18  9:48 ` Daniel Thompson
@ 2016-10-20  0:14 ` Andrew Morton
  1 sibling, 0 replies; 3+ messages in thread
From: Andrew Morton @ 2016-10-20  0:14 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: Jason Wessel, Daniel Thompson, briannorris, kgdb-bugreport, linux-kernel

On Fri, 14 Oct 2016 11:41:21 -0700 Douglas Anderson <dianders@chromium.org> wrote:

> We've got a delay loop waiting for secondary CPUs.  That loop uses
> loops_per_jiffy.  However, loops_per_jiffy doesn't actually mean how
> many tight loops make up a jiffy on all architectures.  It is quite
> common to see things like this in the boot log:
>   Calibrating delay loop (skipped), value calculated using timer
>   frequency.. 48.00 BogoMIPS (lpj=24000)
> 
> In my case I was often seeing other CPUs time out while entering the
> debugger, only to print their stack crawls shortly after the kdb>
> prompt was written.
> 
> It appears that other code with similar loops (like __spin_lock_debug)
> adds an extra __delay(1) in there which makes it work better.
> Presumably the __delay(1) is very safe.  At least on modern ARM/ARM64
> systems it will just do a CP15 instruction, which should be safe.  On
> older ARM systems it will fall back to an actual delay loop, or perhaps
> another memory-mapped timer.  On other platforms it must be safe too or
> it wouldn't be used in __spin_lock_debug.
> 
> Note that we use __delay(100) instead of __delay(1) so we can get a
> little closer to a more accurate timeout on systems where __delay() is
> backed by a timer.  It's better to have a more accurate timeout and the
> only penalty is that we might wait an extra 99 "loops" before we enter
> the debugger.
> 
> --- a/kernel/debug/debug_core.c
> +++ b/kernel/debug/debug_core.c
> @@ -598,11 +598,11 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
>  	/*
>  	 * Wait for the other CPUs to be notified and be waiting for us:
>  	 */
> -	time_left = loops_per_jiffy * HZ;
> +	time_left = DIV_ROUND_UP(loops_per_jiffy * HZ, 100);
>  	while (kgdb_do_roundup && --time_left &&
>  	       (atomic_read(&masters_in_kgdb) + atomic_read(&slaves_in_kgdb)) !=
>  		   online_cpus)
> -		cpu_relax();
> +		__delay(100);
>  	if (!time_left)
>  		pr_crit("Timed out waiting for secondary CPUs.\n");
>  

This is all rather vague, isn't it?

Can the code be redone using ndelay() or udelay()?  That way we should
be able to get predictable, arch-independent, cpu-freq-independent
delay periods.

