All of lore.kernel.org
 help / color / mirror / Atom feed
* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-12 14:32 ` Mason
  0 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-12 14:32 UTC (permalink / raw)
  To: Linux ARM; +Cc: cpufreq, Linux PM, Rafael J. Wysocki, Viresh Kumar

[-- Attachment #1: Type: text/plain, Size: 2593 bytes --]

Hello,

I'm working on a Cortex A9 based platform.

I have a basic clock tree, and a very basic cpufreq driver using
mostly generic driver glue:

static struct cpufreq_driver tangox_cpufreq_driver = {
	.name		= "tangox-cpufreq",
	.init		= tangox_cpu_init,
	.verify		= cpufreq_generic_frequency_table_verify,
	.target_index	= tangox_target,
	.get		= cpufreq_generic_get,
	.exit		= cpufreq_generic_exit,
	.attr		= cpufreq_generic_attr,
};

My target_index function is trivial:

static int tangox_target(struct cpufreq_policy *policy, unsigned int idx)
{
	return clk_set_rate(policy->clk, freq_table[idx].frequency * 1000);
}

I was testing an unrelated driver at low frequencies, with the nominal
frequency (999 MHz) divided by 54 (i.e. freq = 18.5 MHz) and I noticed
that when the driver calls

    schedule_timeout(HZ);

the thread sleeps 54 seconds instead of 1.

[ 1107.827279] Sleep for 300 jiffies.
[ 1161.838513] Done sleeping.

(I have HZ set to 300)

I enabled DEBUG in the generic cpufreq driver to see what happens
when I request a frequency change:

# cd /sys/devices/system/cpu/cpu0/cpufreq
# cat scaling_governor
performance
# cat scaling_available_frequencies
999000 499500 333000 199800 111000 18500
# echo 18500 > scaling_max_freq
[   19.597257] cpufreq: setting new policy for CPU 0: 18500 - 18500 kHz
[   19.604017] cpufreq: new min and max freqs are 18500 - 18500 kHz
[   19.610345] cpufreq: governor: change or update limits
[   19.615596] cpufreq: __cpufreq_governor for CPU 0, event 3
[   19.621186] cpufreq: target for CPU 0: 18500 kHz, relation 1, requested 18500 kHz
[   19.628783] cpufreq: __cpufreq_driver_target: cpu: 0, oldfreq: 999000, new freq: 18500
[   19.636818] cpufreq: notification 0 of frequency transition to 18500 kHz
[   19.643625] cpufreq: notification 0 of frequency transition to 18500 kHz
[   19.650454] NEW RATE=9250000
[   19.653644] NEW RATE=9250000
[   19.657091] cpufreq: notification 1 of frequency transition to 18500 kHz
[   19.664176] cpufreq: FREQ: 18500 - CPU: 0
[   19.668412] cpufreq: notification 1 of frequency transition to 18500 kHz
[   19.675648] cpufreq: FREQ: 18500 - CPU: 1

The "NEW RATE" traces are ones I added in smp_twd.c:twd_update_frequency()
(my clockevent source) to check whether the CPU frequency change propagated
to smp_twd.

I must have made an obvious mistake. Could someone point it out to me?
(I have attached the output of /proc/timer_list to this message)

Looking more closely at schedule_timeout(), I suppose the work happens
in __mod_timer()... Hmm, that one is over my head.

What am I missing?

Regards.

[-- Attachment #2: timer_list --]
[-- Type: text/plain, Size: 3403 bytes --]

Timer List Version: v0.7
HRTIMER_MAX_CLOCK_BASES: 4
now at 1996236245543 nsecs

cpu: 0
 clock 0:
  .base:       e7ae51e8
  .index:      0
  .resolution: 3333333 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: sched_clock_timer, sched_clock_poll, S:01
 # expires at 2087831315985-2087831315985 nsecs [in 91595070442 to 91595070442 nsecs]
 clock 1:
  .base:       e7ae5220
  .index:      1
  .resolution: 3333333 nsecs
  .get_time:   ktime_get_real
  .offset:     0 nsecs
active timers:
 clock 2:
  .base:       e7ae5258
  .index:      2
  .resolution: 3333333 nsecs
  .get_time:   ktime_get_boottime
  .offset:     0 nsecs
active timers:
 clock 3:
  .base:       e7ae5290
  .index:      3
  .resolution: 3333333 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     0 nsecs
active timers:
  .expires_next   : 9223372036854775807 nsecs
  .hres_active    : 0
  .nr_events      : 0
  .nr_retries     : 0
  .nr_hangs       : 0
  .max_hang_time  : 0 nsecs
  .nohz_mode      : 0
  .last_tick      : 0 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 0
  .idle_calls     : 0
  .idle_sleeps    : 0
  .idle_entrytime : 1996229420769 nsecs
  .idle_waketime  : 0 nsecs
  .idle_exittime  : 0 nsecs
  .idle_sleeptime : 1985388003747 nsecs
  .iowait_sleeptime: 10995516 nsecs
  .last_jiffies   : 0
  .next_jiffies   : 0
  .idle_expires   : 0 nsecs
jiffies: 4294894041

cpu: 1
 clock 0:
  .base:       e7aed1e8
  .index:      0
  .resolution: 3333333 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 clock 1:
  .base:       e7aed220
  .index:      1
  .resolution: 3333333 nsecs
  .get_time:   ktime_get_real
  .offset:     0 nsecs
active timers:
 clock 2:
  .base:       e7aed258
  .index:      2
  .resolution: 3333333 nsecs
  .get_time:   ktime_get_boottime
  .offset:     0 nsecs
active timers:
 clock 3:
  .base:       e7aed290
  .index:      3
  .resolution: 3333333 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     0 nsecs
active timers:
  .expires_next   : 9223372036854775807 nsecs
  .hres_active    : 0
  .nr_events      : 0
  .nr_retries     : 0
  .nr_hangs       : 0
  .max_hang_time  : 0 nsecs
  .nohz_mode      : 0
  .last_tick      : 0 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 0
  .idle_calls     : 0
  .idle_sleeps    : 0
  .idle_entrytime : 1996231929568 nsecs
  .idle_waketime  : 0 nsecs
  .idle_exittime  : 0 nsecs
  .idle_sleeptime : 1992784761637 nsecs
  .iowait_sleeptime: 40375788 nsecs
  .last_jiffies   : 0
  .next_jiffies   : 0
  .idle_expires   : 0 nsecs
jiffies: 4294894041

Tick Device: mode:     0
Broadcast device
Clock Event Device: <NULL>
tick_broadcast_mask: 00000000
tick_broadcast_oneshot_mask: 00000000

Tick Device: mode:     0
Per CPU device: 0
Clock Event Device: local_timer
 max_delta_ns:   464320782665
 min_delta_ns:   1622
 mult:           19864224
 shift:          31
 mode:           2
 next_event:     9223372036854775807 nsecs
 set_next_event: twd_set_next_event
 set_mode:       twd_set_mode
 event_handler:  tick_handle_periodic
 retries:        0

Tick Device: mode:     0
Per CPU device: 1
Clock Event Device: local_timer
 max_delta_ns:   464320782665
 min_delta_ns:   1622
 mult:           19864224
 shift:          31
 mode:           2
 next_event:     9223372036854775807 nsecs
 set_next_event: twd_set_next_event
 set_mode:       twd_set_mode
 event_handler:  tick_handle_periodic
 retries:        0


^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-12 14:32 ` Mason
  0 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-12 14:32 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

I'm working on a Cortex A9 based platform.

I have a basic clock tree, and a very basic cpufreq driver using
mostly generic driver glue:

static struct cpufreq_driver tangox_cpufreq_driver = {
	.name		= "tangox-cpufreq",
	.init		= tangox_cpu_init,
	.verify		= cpufreq_generic_frequency_table_verify,
	.target_index	= tangox_target,
	.get		= cpufreq_generic_get,
	.exit		= cpufreq_generic_exit,
	.attr		= cpufreq_generic_attr,
};

My target_index function is trivial:

static int tangox_target(struct cpufreq_policy *policy, unsigned int idx)
{
	return clk_set_rate(policy->clk, freq_table[idx].frequency * 1000);
}

I was testing an unrelated driver at low frequencies, with the nominal
frequency (999 MHz) divided by 54 (i.e. freq = 18.5 MHz) and I noticed
that when the driver calls

    schedule_timeout(HZ);

the thread sleeps 54 seconds instead of 1.

[ 1107.827279] Sleep for 300 jiffies.
[ 1161.838513] Done sleeping.

(I have HZ set to 300)

I enabled DEBUG in the generic cpufreq driver to see what happens
when I request a frequency change:

# cd /sys/devices/system/cpu/cpu0/cpufreq
# cat scaling_governor
performance
# cat scaling_available_frequencies
999000 499500 333000 199800 111000 18500
# echo 18500 > scaling_max_freq
[   19.597257] cpufreq: setting new policy for CPU 0: 18500 - 18500 kHz
[   19.604017] cpufreq: new min and max freqs are 18500 - 18500 kHz
[   19.610345] cpufreq: governor: change or update limits
[   19.615596] cpufreq: __cpufreq_governor for CPU 0, event 3
[   19.621186] cpufreq: target for CPU 0: 18500 kHz, relation 1, requested 18500 kHz
[   19.628783] cpufreq: __cpufreq_driver_target: cpu: 0, oldfreq: 999000, new freq: 18500
[   19.636818] cpufreq: notification 0 of frequency transition to 18500 kHz
[   19.643625] cpufreq: notification 0 of frequency transition to 18500 kHz
[   19.650454] NEW RATE=9250000
[   19.653644] NEW RATE=9250000
[   19.657091] cpufreq: notification 1 of frequency transition to 18500 kHz
[   19.664176] cpufreq: FREQ: 18500 - CPU: 0
[   19.668412] cpufreq: notification 1 of frequency transition to 18500 kHz
[   19.675648] cpufreq: FREQ: 18500 - CPU: 1

The "NEW RATE" traces are ones I added in smp_twd.c:twd_update_frequency()
(my clockevent source) to check whether the CPU frequency change propagated
to smp_twd.

I must have made an obvious mistake. Could someone point it out to me?
(I have attached the output of /proc/timer_list to this message)

Looking more closely at schedule_timeout(), I suppose the work happens
in __mod_timer()... Hmm, that one is over my head.

What am I missing?

Regards.
-------------- next part --------------
Timer List Version: v0.7
HRTIMER_MAX_CLOCK_BASES: 4
now at 1996236245543 nsecs

cpu: 0
 clock 0:
  .base:       e7ae51e8
  .index:      0
  .resolution: 3333333 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: sched_clock_timer, sched_clock_poll, S:01
 # expires at 2087831315985-2087831315985 nsecs [in 91595070442 to 91595070442 nsecs]
 clock 1:
  .base:       e7ae5220
  .index:      1
  .resolution: 3333333 nsecs
  .get_time:   ktime_get_real
  .offset:     0 nsecs
active timers:
 clock 2:
  .base:       e7ae5258
  .index:      2
  .resolution: 3333333 nsecs
  .get_time:   ktime_get_boottime
  .offset:     0 nsecs
active timers:
 clock 3:
  .base:       e7ae5290
  .index:      3
  .resolution: 3333333 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     0 nsecs
active timers:
  .expires_next   : 9223372036854775807 nsecs
  .hres_active    : 0
  .nr_events      : 0
  .nr_retries     : 0
  .nr_hangs       : 0
  .max_hang_time  : 0 nsecs
  .nohz_mode      : 0
  .last_tick      : 0 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 0
  .idle_calls     : 0
  .idle_sleeps    : 0
  .idle_entrytime : 1996229420769 nsecs
  .idle_waketime  : 0 nsecs
  .idle_exittime  : 0 nsecs
  .idle_sleeptime : 1985388003747 nsecs
  .iowait_sleeptime: 10995516 nsecs
  .last_jiffies   : 0
  .next_jiffies   : 0
  .idle_expires   : 0 nsecs
jiffies: 4294894041

cpu: 1
 clock 0:
  .base:       e7aed1e8
  .index:      0
  .resolution: 3333333 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 clock 1:
  .base:       e7aed220
  .index:      1
  .resolution: 3333333 nsecs
  .get_time:   ktime_get_real
  .offset:     0 nsecs
active timers:
 clock 2:
  .base:       e7aed258
  .index:      2
  .resolution: 3333333 nsecs
  .get_time:   ktime_get_boottime
  .offset:     0 nsecs
active timers:
 clock 3:
  .base:       e7aed290
  .index:      3
  .resolution: 3333333 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     0 nsecs
active timers:
  .expires_next   : 9223372036854775807 nsecs
  .hres_active    : 0
  .nr_events      : 0
  .nr_retries     : 0
  .nr_hangs       : 0
  .max_hang_time  : 0 nsecs
  .nohz_mode      : 0
  .last_tick      : 0 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 0
  .idle_calls     : 0
  .idle_sleeps    : 0
  .idle_entrytime : 1996231929568 nsecs
  .idle_waketime  : 0 nsecs
  .idle_exittime  : 0 nsecs
  .idle_sleeptime : 1992784761637 nsecs
  .iowait_sleeptime: 40375788 nsecs
  .last_jiffies   : 0
  .next_jiffies   : 0
  .idle_expires   : 0 nsecs
jiffies: 4294894041

Tick Device: mode:     0
Broadcast device
Clock Event Device: <NULL>
tick_broadcast_mask: 00000000
tick_broadcast_oneshot_mask: 00000000

Tick Device: mode:     0
Per CPU device: 0
Clock Event Device: local_timer
 max_delta_ns:   464320782665
 min_delta_ns:   1622
 mult:           19864224
 shift:          31
 mode:           2
 next_event:     9223372036854775807 nsecs
 set_next_event: twd_set_next_event
 set_mode:       twd_set_mode
 event_handler:  tick_handle_periodic
 retries:        0

Tick Device: mode:     0
Per CPU device: 1
Clock Event Device: local_timer
 max_delta_ns:   464320782665
 min_delta_ns:   1622
 mult:           19864224
 shift:          31
 mode:           2
 next_event:     9223372036854775807 nsecs
 set_next_event: twd_set_next_event
 set_mode:       twd_set_mode
 event_handler:  tick_handle_periodic
 retries:        0

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-12 14:32 ` Mason
@ 2015-05-12 14:46   ` Viresh Kumar
  -1 siblings, 0 replies; 77+ messages in thread
From: Viresh Kumar @ 2015-05-12 14:46 UTC (permalink / raw)
  To: Mason, Daniel Lezcano; +Cc: Linux ARM, cpufreq, Linux PM, Rafael J. Wysocki

CC'ing Daniel..

On 12 May 2015 at 20:02, Mason <slash.tmp@free.fr> wrote:
> Hello,
>
> I'm working on a Cortex A9 based platform.
>
> I have a basic clock tree, and a very basic cpufreq driver using
> mostly generic driver glue:
>
> static struct cpufreq_driver tangox_cpufreq_driver = {
>         .name           = "tangox-cpufreq",
>         .init           = tangox_cpu_init,
>         .verify         = cpufreq_generic_frequency_table_verify,
>         .target_index   = tangox_target,
>         .get            = cpufreq_generic_get,
>         .exit           = cpufreq_generic_exit,
>         .attr           = cpufreq_generic_attr,
> };
>
> My target_index function is trivial:
>
> static int tangox_target(struct cpufreq_policy *policy, unsigned int idx)
> {
>         return clk_set_rate(policy->clk, freq_table[idx].frequency * 1000);
> }
>
> I was testing an unrelated driver at low frequencies, with the nominal
> frequency (999 MHz) divided by 54 (i.e. freq = 18.5 MHz) and I noticed
> that when the driver calls
>
>     schedule_timeout(HZ);
>
> the thread sleeps 54 seconds instead of 1.
>
> [ 1107.827279] Sleep for 300 jiffies.
> [ 1161.838513] Done sleeping.
>
> (I have HZ set to 300)
>
> I enabled DEBUG in the generic cpufreq driver to see what happens
> when I request a frequency change:
>
> # cd /sys/devices/system/cpu/cpu0/cpufreq
> # cat scaling_governor
> performance
> # cat scaling_available_frequencies
> 999000 499500 333000 199800 111000 18500
> # echo 18500 > scaling_max_freq
> [   19.597257] cpufreq: setting new policy for CPU 0: 18500 - 18500 kHz
> [   19.604017] cpufreq: new min and max freqs are 18500 - 18500 kHz
> [   19.610345] cpufreq: governor: change or update limits
> [   19.615596] cpufreq: __cpufreq_governor for CPU 0, event 3
> [   19.621186] cpufreq: target for CPU 0: 18500 kHz, relation 1, requested 18500 kHz
> [   19.628783] cpufreq: __cpufreq_driver_target: cpu: 0, oldfreq: 999000, new freq: 18500
> [   19.636818] cpufreq: notification 0 of frequency transition to 18500 kHz
> [   19.643625] cpufreq: notification 0 of frequency transition to 18500 kHz
> [   19.650454] NEW RATE=9250000
> [   19.653644] NEW RATE=9250000
> [   19.657091] cpufreq: notification 1 of frequency transition to 18500 kHz
> [   19.664176] cpufreq: FREQ: 18500 - CPU: 0
> [   19.668412] cpufreq: notification 1 of frequency transition to 18500 kHz
> [   19.675648] cpufreq: FREQ: 18500 - CPU: 1
>
> The "NEW RATE" traces are ones I added in smp_twd.c:twd_update_frequency()
> (my clockevent source) to check whether the CPU frequency change propagated
> to smp_twd.
>
> I must have made an obvious mistake. Could someone point it out to me?
> (I have attached the output of /proc/timer_list to this message)
>
> Looking more closely at schedule_timeout(), I suppose the work happens
> in __mod_timer()... Hmm, that one is over my head.
>
> What am I missing?

A Broadcast device ? So you have two CPUs with a local timer each, whose
expires-next is set to KTIME_MAX (infinite).... So you will get interrupted
once the counter has overflown, probably that's what 54 seconds is all about..

Probably your CPUs are going into idle and no one is waking them up and that's
why you need a broadcast device, i.e. another global timer on your SoC, outside
of the CPU subsystem.

It doesn't have anything to do with cpufreq AFAICT ..

--
viresh

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-12 14:46   ` Viresh Kumar
  0 siblings, 0 replies; 77+ messages in thread
From: Viresh Kumar @ 2015-05-12 14:46 UTC (permalink / raw)
  To: linux-arm-kernel

CC'ing Daniel..

On 12 May 2015 at 20:02, Mason <slash.tmp@free.fr> wrote:
> Hello,
>
> I'm working on a Cortex A9 based platform.
>
> I have a basic clock tree, and a very basic cpufreq driver using
> mostly generic driver glue:
>
> static struct cpufreq_driver tangox_cpufreq_driver = {
>         .name           = "tangox-cpufreq",
>         .init           = tangox_cpu_init,
>         .verify         = cpufreq_generic_frequency_table_verify,
>         .target_index   = tangox_target,
>         .get            = cpufreq_generic_get,
>         .exit           = cpufreq_generic_exit,
>         .attr           = cpufreq_generic_attr,
> };
>
> My target_index function is trivial:
>
> static int tangox_target(struct cpufreq_policy *policy, unsigned int idx)
> {
>         return clk_set_rate(policy->clk, freq_table[idx].frequency * 1000);
> }
>
> I was testing an unrelated driver at low frequencies, with the nominal
> frequency (999 MHz) divided by 54 (i.e. freq = 18.5 MHz) and I noticed
> that when the driver calls
>
>     schedule_timeout(HZ);
>
> the thread sleeps 54 seconds instead of 1.
>
> [ 1107.827279] Sleep for 300 jiffies.
> [ 1161.838513] Done sleeping.
>
> (I have HZ set to 300)
>
> I enabled DEBUG in the generic cpufreq driver to see what happens
> when I request a frequency change:
>
> # cd /sys/devices/system/cpu/cpu0/cpufreq
> # cat scaling_governor
> performance
> # cat scaling_available_frequencies
> 999000 499500 333000 199800 111000 18500
> # echo 18500 > scaling_max_freq
> [   19.597257] cpufreq: setting new policy for CPU 0: 18500 - 18500 kHz
> [   19.604017] cpufreq: new min and max freqs are 18500 - 18500 kHz
> [   19.610345] cpufreq: governor: change or update limits
> [   19.615596] cpufreq: __cpufreq_governor for CPU 0, event 3
> [   19.621186] cpufreq: target for CPU 0: 18500 kHz, relation 1, requested 18500 kHz
> [   19.628783] cpufreq: __cpufreq_driver_target: cpu: 0, oldfreq: 999000, new freq: 18500
> [   19.636818] cpufreq: notification 0 of frequency transition to 18500 kHz
> [   19.643625] cpufreq: notification 0 of frequency transition to 18500 kHz
> [   19.650454] NEW RATE=9250000
> [   19.653644] NEW RATE=9250000
> [   19.657091] cpufreq: notification 1 of frequency transition to 18500 kHz
> [   19.664176] cpufreq: FREQ: 18500 - CPU: 0
> [   19.668412] cpufreq: notification 1 of frequency transition to 18500 kHz
> [   19.675648] cpufreq: FREQ: 18500 - CPU: 1
>
> The "NEW RATE" traces are ones I added in smp_twd.c:twd_update_frequency()
> (my clockevent source) to check whether the CPU frequency change propagated
> to smp_twd.
>
> I must have made an obvious mistake. Could someone point it out to me?
> (I have attached the output of /proc/timer_list to this message)
>
> Looking more closely at schedule_timeout(), I suppose the work happens
> in __mod_timer()... Hmm, that one is over my head.
>
> What am I missing?

A Broadcast device ? So you have two CPUs with a local timer each, whose
expires-next is set to KTIME_MAX (infinite).... So you will get interrupted
once the counter has overflown, probably that's what 54 seconds is all about..

Probably your CPUs are going into idle and no one is waking them up and that's
why you need a broadcast device, i.e. another global timer on your SoC, outside
of the CPU subsystem.

It doesn't have anything to do with cpufreq AFAICT ..

--
viresh

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-12 14:46   ` Viresh Kumar
@ 2015-05-12 15:14     ` Mason
  -1 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-12 15:14 UTC (permalink / raw)
  To: Viresh Kumar, Daniel Lezcano
  Cc: Linux ARM, cpufreq, Linux PM, Rafael J. Wysocki

On 12/05/2015 16:46, Viresh Kumar wrote:

> On 12 May 2015 at 20:02, Mason wrote:
>
>> I'm working on a Cortex A9 based platform.
>>
>> I have a basic clock tree, and a very basic cpufreq driver using
>> mostly generic driver glue:
>>
>> static struct cpufreq_driver tangox_cpufreq_driver = {
>>         .name           = "tangox-cpufreq",
>>         .init           = tangox_cpu_init,
>>         .verify         = cpufreq_generic_frequency_table_verify,
>>         .target_index   = tangox_target,
>>         .get            = cpufreq_generic_get,
>>         .exit           = cpufreq_generic_exit,
>>         .attr           = cpufreq_generic_attr,
>> };
>>
>> My target_index function is trivial:
>>
>> static int tangox_target(struct cpufreq_policy *policy, unsigned int idx)
>> {
>>         return clk_set_rate(policy->clk, freq_table[idx].frequency * 1000);
>> }
>>
>> I was testing an unrelated driver at low frequencies, with the nominal
>> frequency (999 MHz) divided by 54 (i.e. freq = 18.5 MHz) and I noticed
>> that when the driver calls
>>
>>     schedule_timeout(HZ);
>>
>> the thread sleeps 54 seconds instead of 1.
>>
>> [ 1107.827279] Sleep for 300 jiffies.
>> [ 1161.838513] Done sleeping.
>>
>> (I have HZ set to 300)
>>
>> I enabled DEBUG in the generic cpufreq driver to see what happens
>> when I request a frequency change:
>>
>> # cd /sys/devices/system/cpu/cpu0/cpufreq
>> # cat scaling_governor
>> performance
>> # cat scaling_available_frequencies
>> 999000 499500 333000 199800 111000 18500
>> # echo 18500 > scaling_max_freq
>> [   19.597257] cpufreq: setting new policy for CPU 0: 18500 - 18500 kHz
>> [   19.604017] cpufreq: new min and max freqs are 18500 - 18500 kHz
>> [   19.610345] cpufreq: governor: change or update limits
>> [   19.615596] cpufreq: __cpufreq_governor for CPU 0, event 3
>> [   19.621186] cpufreq: target for CPU 0: 18500 kHz, relation 1, requested 18500 kHz
>> [   19.628783] cpufreq: __cpufreq_driver_target: cpu: 0, oldfreq: 999000, new freq: 18500
>> [   19.636818] cpufreq: notification 0 of frequency transition to 18500 kHz
>> [   19.643625] cpufreq: notification 0 of frequency transition to 18500 kHz
>> [   19.650454] NEW RATE=9250000
>> [   19.653644] NEW RATE=9250000
>> [   19.657091] cpufreq: notification 1 of frequency transition to 18500 kHz
>> [   19.664176] cpufreq: FREQ: 18500 - CPU: 0
>> [   19.668412] cpufreq: notification 1 of frequency transition to 18500 kHz
>> [   19.675648] cpufreq: FREQ: 18500 - CPU: 1
>>
>> The "NEW RATE" traces are ones I added in smp_twd.c:twd_update_frequency()
>> (my clockevent source) to check whether the CPU frequency change propagated
>> to smp_twd.
>>
>> I must have made an obvious mistake. Could someone point it out to me?
>> (I have attached the output of /proc/timer_list to this message)
>>
>> Looking more closely at schedule_timeout(), I suppose the work happens
>> in __mod_timer()... Hmm, that one is over my head.
>>
>> What am I missing?
> 
> A Broadcast device ? So you have two CPUs with a local timer each, whose
> expires-next is set to KTIME_MAX (infinite).... So you will get interrupted
> once the counter has overflown, probably that's what 54 seconds is all about..

If I divide the CPU clock by 54, then
schedule_timeout(HZ) sleeps 54 seconds (instead of 1)
schedule_timeout(HZ/10) sleeps 5.4 seconds (instead of 0.1)

If I divide the CPU clock by 9, then
schedule_timeout(HZ) sleeps 9 seconds (instead of 1)
schedule_timeout(HZ/10) sleeps 0.9 seconds (instead of 0.1)

I didn't test other dividers, but I am convinced that if I divide
the CPU clock by N, schedule_timeout(HZ) will sleep N seconds.

It looks like my system is calculating the number of cycles to wait
using the nominal (higher) frequency, and so the system waits N times
too long, because the actual clock is N times slower than nominal.

> Probably your CPUs are going into idle and no one is waking them up and that's
> why you need a broadcast device, i.e. another global timer on your SoC, outside
> of the CPU subsystem.

This ties in to another thread I started in LAKML:
("High-resolution timers not supported when using smp_twd on Cortex A9")

$ git show 5388a6b2 arch/arm/kernel/smp_twd.c
commit 5388a6b266e9c3357353332ba0cd5549082887f1
Author: Russell King <rmk+kernel@arm.linux.org.uk>
Date:   Mon Jul 26 13:19:43 2010 +0100

    ARM: SMP: Always enable clock event broadcast support
    
    The TWD local timers are unable to wake up the CPU when it is placed
    into a low power mode, eg. C3.  Therefore, we need to adapt things
    such that the TWD code can cope with this.
    
    We do this by always providing a broadcast tick function, and marking
    the fact that the TWD local timer will stop in low power modes.  This
    means that when the CPU is placed into a low power mode, the core
    timer code marks this fact, and allows an IPI to be given to the core.

This mentions a "broadcast tick function" (of which I know nothing).
Is this what you're referring to?

> It doesn't have anything to do with cpufreq AFAICT ..

I'm sure it's not in the cpufreq driver, if that's what you mean.
It's probably something I did, or didn't do. But it happens when
I change the frequency of the CPU, so I hoped someone on cpufreq
would spot my blunder.

Regards.


^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-12 15:14     ` Mason
  0 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-12 15:14 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/05/2015 16:46, Viresh Kumar wrote:

> On 12 May 2015 at 20:02, Mason wrote:
>
>> I'm working on a Cortex A9 based platform.
>>
>> I have a basic clock tree, and a very basic cpufreq driver using
>> mostly generic driver glue:
>>
>> static struct cpufreq_driver tangox_cpufreq_driver = {
>>         .name           = "tangox-cpufreq",
>>         .init           = tangox_cpu_init,
>>         .verify         = cpufreq_generic_frequency_table_verify,
>>         .target_index   = tangox_target,
>>         .get            = cpufreq_generic_get,
>>         .exit           = cpufreq_generic_exit,
>>         .attr           = cpufreq_generic_attr,
>> };
>>
>> My target_index function is trivial:
>>
>> static int tangox_target(struct cpufreq_policy *policy, unsigned int idx)
>> {
>>         return clk_set_rate(policy->clk, freq_table[idx].frequency * 1000);
>> }
>>
>> I was testing an unrelated driver at low frequencies, with the nominal
>> frequency (999 MHz) divided by 54 (i.e. freq = 18.5 MHz) and I noticed
>> that when the driver calls
>>
>>     schedule_timeout(HZ);
>>
>> the thread sleeps 54 seconds instead of 1.
>>
>> [ 1107.827279] Sleep for 300 jiffies.
>> [ 1161.838513] Done sleeping.
>>
>> (I have HZ set to 300)
>>
>> I enabled DEBUG in the generic cpufreq driver to see what happens
>> when I request a frequency change:
>>
>> # cd /sys/devices/system/cpu/cpu0/cpufreq
>> # cat scaling_governor
>> performance
>> # cat scaling_available_frequencies
>> 999000 499500 333000 199800 111000 18500
>> # echo 18500 > scaling_max_freq
>> [   19.597257] cpufreq: setting new policy for CPU 0: 18500 - 18500 kHz
>> [   19.604017] cpufreq: new min and max freqs are 18500 - 18500 kHz
>> [   19.610345] cpufreq: governor: change or update limits
>> [   19.615596] cpufreq: __cpufreq_governor for CPU 0, event 3
>> [   19.621186] cpufreq: target for CPU 0: 18500 kHz, relation 1, requested 18500 kHz
>> [   19.628783] cpufreq: __cpufreq_driver_target: cpu: 0, oldfreq: 999000, new freq: 18500
>> [   19.636818] cpufreq: notification 0 of frequency transition to 18500 kHz
>> [   19.643625] cpufreq: notification 0 of frequency transition to 18500 kHz
>> [   19.650454] NEW RATE=9250000
>> [   19.653644] NEW RATE=9250000
>> [   19.657091] cpufreq: notification 1 of frequency transition to 18500 kHz
>> [   19.664176] cpufreq: FREQ: 18500 - CPU: 0
>> [   19.668412] cpufreq: notification 1 of frequency transition to 18500 kHz
>> [   19.675648] cpufreq: FREQ: 18500 - CPU: 1
>>
>> The "NEW RATE" traces are ones I added in smp_twd.c:twd_update_frequency()
>> (my clockevent source) to check whether the CPU frequency change propagated
>> to smp_twd.
>>
>> I must have made an obvious mistake. Could someone point it out to me?
>> (I have attached the output of /proc/timer_list to this message)
>>
>> Looking more closely at schedule_timeout(), I suppose the work happens
>> in __mod_timer()... Hmm, that one is over my head.
>>
>> What am I missing?
> 
> A Broadcast device ? So you have two CPUs with a local timer each, whose
> expires-next is set to KTIME_MAX (infinite).... So you will get interrupted
> once the counter has overflown, probably that's what 54 seconds is all about..

If I divide the CPU clock by 54, then
schedule_timeout(HZ) sleeps 54 seconds (instead of 1)
schedule_timeout(HZ/10) sleeps 5.4 seconds (instead of 0.1)

If I divide the CPU clock by 9, then
schedule_timeout(HZ) sleeps 9 seconds (instead of 1)
schedule_timeout(HZ/10) sleeps 0.9 seconds (instead of 0.1)

I didn't test other dividers, but I am convinced that if I divide
the CPU clock by N, schedule_timeout(HZ) will sleep N seconds.

It looks like my system is calculating the number of cycles to wait
using the nominal (higher) frequency, and so the system waits N times
too long, because the actual clock is N times slower than nominal.

> Probably your CPUs are going into idle and no one is waking them up and that's
> why you need a broadcast device, i.e. another global timer on your SoC, outside
> of the CPU subsystem.

This ties in to another thread I started in LAKML:
("High-resolution timers not supported when using smp_twd on Cortex A9")

$ git show 5388a6b2 arch/arm/kernel/smp_twd.c
commit 5388a6b266e9c3357353332ba0cd5549082887f1
Author: Russell King <rmk+kernel@arm.linux.org.uk>
Date:   Mon Jul 26 13:19:43 2010 +0100

    ARM: SMP: Always enable clock event broadcast support
    
    The TWD local timers are unable to wake up the CPU when it is placed
    into a low power mode, eg. C3.  Therefore, we need to adapt things
    such that the TWD code can cope with this.
    
    We do this by always providing a broadcast tick function, and marking
    the fact that the TWD local timer will stop in low power modes.  This
    means that when the CPU is placed into a low power mode, the core
    timer code marks this fact, and allows an IPI to be given to the core.

This mentions a "broadcast tick function" (of which I know nothing).
Is this what you're referring to?

> It doesn't have anything to do with cpufreq AFAICT ..

I'm sure it's not in the cpufreq driver, if that's what you mean.
It's probably something I did, or didn't do. But it happens when
I change the frequency of the CPU, so I hoped someone on cpufreq
would spot my blunder.

Regards.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-12 14:32 ` Mason
@ 2015-05-12 15:23   ` Russell King - ARM Linux
  -1 siblings, 0 replies; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-12 15:23 UTC (permalink / raw)
  To: Mason; +Cc: Linux ARM, Viresh Kumar, Rafael J. Wysocki, cpufreq, Linux PM

On Tue, May 12, 2015 at 04:32:47PM +0200, Mason wrote:
> Hello,
> 
> I'm working on a Cortex A9 based platform.
> 
> I have a basic clock tree, and a very basic cpufreq driver using
> mostly generic driver glue:
> 
> static struct cpufreq_driver tangox_cpufreq_driver = {
> 	.name		= "tangox-cpufreq",
> 	.init		= tangox_cpu_init,
> 	.verify		= cpufreq_generic_frequency_table_verify,
> 	.target_index	= tangox_target,
> 	.get		= cpufreq_generic_get,
> 	.exit		= cpufreq_generic_exit,
> 	.attr		= cpufreq_generic_attr,
> };
> 
> My target_index function is trivial:
> 
> static int tangox_target(struct cpufreq_policy *policy, unsigned int idx)
> {
> 	return clk_set_rate(policy->clk, freq_table[idx].frequency * 1000);
> }
> 
> I was testing an unrelated driver at low frequencies, with the nominal
> frequency (999 MHz) divided by 54 (i.e. freq = 18.5 MHz) and I noticed
> that when the driver calls
> 
>     schedule_timeout(HZ);
> 
> the thread sleeps 54 seconds instead of 1.

I'm guessing that this will be because your local timer changes frequency
with the CPU, which means that the clockevent which was set for one second
ends up timing out after 54 seconds.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-12 15:23   ` Russell King - ARM Linux
  0 siblings, 0 replies; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-12 15:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, May 12, 2015 at 04:32:47PM +0200, Mason wrote:
> Hello,
> 
> I'm working on a Cortex A9 based platform.
> 
> I have a basic clock tree, and a very basic cpufreq driver using
> mostly generic driver glue:
> 
> static struct cpufreq_driver tangox_cpufreq_driver = {
> 	.name		= "tangox-cpufreq",
> 	.init		= tangox_cpu_init,
> 	.verify		= cpufreq_generic_frequency_table_verify,
> 	.target_index	= tangox_target,
> 	.get		= cpufreq_generic_get,
> 	.exit		= cpufreq_generic_exit,
> 	.attr		= cpufreq_generic_attr,
> };
> 
> My target_index function is trivial:
> 
> static int tangox_target(struct cpufreq_policy *policy, unsigned int idx)
> {
> 	return clk_set_rate(policy->clk, freq_table[idx].frequency * 1000);
> }
> 
> I was testing an unrelated driver at low frequencies, with the nominal
> frequency (999 MHz) divided by 54 (i.e. freq = 18.5 MHz) and I noticed
> that when the driver calls
> 
>     schedule_timeout(HZ);
> 
> the thread sleeps 54 seconds instead of 1.

I'm guessing that this will be because your local timer changes frequency
with the CPU, which means that the clockevent which was set for one second
ends up timing out after 54 seconds.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-12 15:14     ` Mason
@ 2015-05-12 15:50       ` Russell King - ARM Linux
  -1 siblings, 0 replies; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-12 15:50 UTC (permalink / raw)
  To: Mason
  Cc: Viresh Kumar, Daniel Lezcano, Rafael J. Wysocki, cpufreq,
	Linux ARM, Linux PM

On Tue, May 12, 2015 at 05:14:15PM +0200, Mason wrote:
> This ties in to another thread I started in LAKML:
> ("High-resolution timers not supported when using smp_twd on Cortex A9")
> 
> $ git show 5388a6b2 arch/arm/kernel/smp_twd.c
> commit 5388a6b266e9c3357353332ba0cd5549082887f1
> Author: Russell King <rmk+kernel@arm.linux.org.uk>
> Date:   Mon Jul 26 13:19:43 2010 +0100
> 
>     ARM: SMP: Always enable clock event broadcast support
>     
>     The TWD local timers are unable to wake up the CPU when it is placed
>     into a low power mode, eg. C3.  Therefore, we need to adapt things
>     such that the TWD code can cope with this.
>     
>     We do this by always providing a broadcast tick function, and marking
>     the fact that the TWD local timer will stop in low power modes.  This
>     means that when the CPU is placed into a low power mode, the core
>     timer code marks this fact, and allows an IPI to be given to the core.
> 
> This mentions a "broadcast tick function" (of which I know nothing).
> Is this what you're referring to?

No.  This has nothing to do with low power modes.

How this works depends on how your kernel is configured, but essentially
it's something like this:

* The CPU which will be idling sets its local timer to wake up after N
  counter cycles, where N is calculated from the timer frequency.

* When the local timer fires, the CPU is kicked out of the idle loop, and
  it reads the current system time.  If the current system time indicates
  that the software timer set in schedule_timeout() has fired, that
  software timer fires.

If the local timer changes frequency without the idling CPU being woken
up, then the problem you're referring to can happen.

As you're not giving much information about your system (including
indicating where we might see some source code) we're not able to help
more than providing above descriptions.  Maybe if you posted your
patches so far to support the project you're working on, we could
provide better answers.

Thanks.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-12 15:50       ` Russell King - ARM Linux
  0 siblings, 0 replies; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-12 15:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, May 12, 2015 at 05:14:15PM +0200, Mason wrote:
> This ties in to another thread I started in LAKML:
> ("High-resolution timers not supported when using smp_twd on Cortex A9")
> 
> $ git show 5388a6b2 arch/arm/kernel/smp_twd.c
> commit 5388a6b266e9c3357353332ba0cd5549082887f1
> Author: Russell King <rmk+kernel@arm.linux.org.uk>
> Date:   Mon Jul 26 13:19:43 2010 +0100
> 
>     ARM: SMP: Always enable clock event broadcast support
>     
>     The TWD local timers are unable to wake up the CPU when it is placed
>     into a low power mode, eg. C3.  Therefore, we need to adapt things
>     such that the TWD code can cope with this.
>     
>     We do this by always providing a broadcast tick function, and marking
>     the fact that the TWD local timer will stop in low power modes.  This
>     means that when the CPU is placed into a low power mode, the core
>     timer code marks this fact, and allows an IPI to be given to the core.
> 
> This mentions a "broadcast tick function" (of which I know nothing).
> Is this what you're referring to?

No.  This has nothing to do with low power modes.

How this works depends on how your kernel is configured, but essentially
it's something like this:

* The CPU which will be idling sets its local timer to wake up after N
  counter cycles, where N is calculated from the timer frequency.

* When the local timer fires, the CPU is kicked out of the idle loop, and
  it reads the current system time.  If the current system time indicates
  that the software timer set in schedule_timeout() has fired, that
  software timer fires.

If the local timer changes frequency without the idling CPU being woken
up, then the problem you're referring to can happen.

As you're not giving much information about your system (including
indicating where we might see some source code) we're not able to help
more than providing above descriptions.  Maybe if you posted your
patches so far to support the project you're working on, we could
provide better answers.

Thanks.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-12 15:23   ` Russell King - ARM Linux
@ 2015-05-12 16:03     ` Mason
  -1 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-12 16:03 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Linux ARM, Viresh Kumar, Rafael J. Wysocki, cpufreq, Linux PM,
	Daniel Lezcano

On 12/05/2015 17:23, Russell King - ARM Linux wrote:

> On Tue, May 12, 2015 at 04:32:47PM +0200, Mason wrote:
> 
>> I'm working on a Cortex A9 based platform.
>>
>> I have a basic clock tree, and a very basic cpufreq driver using
>> mostly generic driver glue:
>>
>> static struct cpufreq_driver tangox_cpufreq_driver = {
>> 	.name		= "tangox-cpufreq",
>> 	.init		= tangox_cpu_init,
>> 	.verify		= cpufreq_generic_frequency_table_verify,
>> 	.target_index	= tangox_target,
>> 	.get		= cpufreq_generic_get,
>> 	.exit		= cpufreq_generic_exit,
>> 	.attr		= cpufreq_generic_attr,
>> };
>>
>> My target_index function is trivial:
>>
>> static int tangox_target(struct cpufreq_policy *policy, unsigned int idx)
>> {
>> 	return clk_set_rate(policy->clk, freq_table[idx].frequency * 1000);
>> }
>>
>> I was testing an unrelated driver at low frequencies, with the nominal
>> frequency (999 MHz) divided by 54 (i.e. freq = 18.5 MHz) and I noticed
>> that when the driver calls
>>
>>     schedule_timeout(HZ);
>>
>> the thread sleeps 54 seconds instead of 1.
> 
> I'm guessing that this will be because your local timer changes frequency
> with the CPU, which means that the clockevent which was set for one second
> ends up timing out after 54 seconds.

That's the first thing I suspected, but smp_twd.c registers a clk_notifier
to be notified of CPU frequency changes:

static struct notifier_block twd_clk_nb = {
	.notifier_call = twd_rate_change,
};

static int twd_clk_init(void)
{
	if (twd_evt && __this_cpu_ptr(twd_evt) && !IS_ERR(twd_clk))
		return clk_notifier_register(twd_clk, &twd_clk_nb);

	return 0;
}

And I instrumented twd_update_frequency() to check it was being called.

Regards.


^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-12 16:03     ` Mason
  0 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-12 16:03 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/05/2015 17:23, Russell King - ARM Linux wrote:

> On Tue, May 12, 2015 at 04:32:47PM +0200, Mason wrote:
> 
>> I'm working on a Cortex A9 based platform.
>>
>> I have a basic clock tree, and a very basic cpufreq driver using
>> mostly generic driver glue:
>>
>> static struct cpufreq_driver tangox_cpufreq_driver = {
>> 	.name		= "tangox-cpufreq",
>> 	.init		= tangox_cpu_init,
>> 	.verify		= cpufreq_generic_frequency_table_verify,
>> 	.target_index	= tangox_target,
>> 	.get		= cpufreq_generic_get,
>> 	.exit		= cpufreq_generic_exit,
>> 	.attr		= cpufreq_generic_attr,
>> };
>>
>> My target_index function is trivial:
>>
>> static int tangox_target(struct cpufreq_policy *policy, unsigned int idx)
>> {
>> 	return clk_set_rate(policy->clk, freq_table[idx].frequency * 1000);
>> }
>>
>> I was testing an unrelated driver at low frequencies, with the nominal
>> frequency (999 MHz) divided by 54 (i.e. freq = 18.5 MHz) and I noticed
>> that when the driver calls
>>
>>     schedule_timeout(HZ);
>>
>> the thread sleeps 54 seconds instead of 1.
> 
> I'm guessing that this will be because your local timer changes frequency
> with the CPU, which means that the clockevent which was set for one second
> ends up timing out after 54 seconds.

That's the first thing I suspected, but smp_twd.c registers a clk_notifier
to be notified of CPU frequency changes:

static struct notifier_block twd_clk_nb = {
	.notifier_call = twd_rate_change,
};

static int twd_clk_init(void)
{
	if (twd_evt && __this_cpu_ptr(twd_evt) && !IS_ERR(twd_clk))
		return clk_notifier_register(twd_clk, &twd_clk_nb);

	return 0;
}

And I instrumented twd_update_frequency() to check it was being called.

Regards.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-12 15:50       ` Russell King - ARM Linux
@ 2015-05-12 16:14         ` Mason
  -1 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-12 16:14 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Viresh Kumar, Daniel Lezcano, Rafael J. Wysocki, cpufreq,
	Linux ARM, Linux PM

[-- Attachment #1: Type: text/plain, Size: 2384 bytes --]

On 12/05/2015 17:50, Russell King - ARM Linux wrote:
> On Tue, May 12, 2015 at 05:14:15PM +0200, Mason wrote:
>> This ties in to another thread I started in LAKML:
>> ("High-resolution timers not supported when using smp_twd on Cortex A9")
>>
>> $ git show 5388a6b2 arch/arm/kernel/smp_twd.c
>> commit 5388a6b266e9c3357353332ba0cd5549082887f1
>> Author: Russell King <rmk+kernel@arm.linux.org.uk>
>> Date:   Mon Jul 26 13:19:43 2010 +0100
>>
>>     ARM: SMP: Always enable clock event broadcast support
>>     
>>     The TWD local timers are unable to wake up the CPU when it is placed
>>     into a low power mode, eg. C3.  Therefore, we need to adapt things
>>     such that the TWD code can cope with this.
>>     
>>     We do this by always providing a broadcast tick function, and marking
>>     the fact that the TWD local timer will stop in low power modes.  This
>>     means that when the CPU is placed into a low power mode, the core
>>     timer code marks this fact, and allows an IPI to be given to the core.
>>
>> This mentions a "broadcast tick function" (of which I know nothing).
>> Is this what you're referring to?
> 
> No.  This has nothing to do with low power modes.
> 
> How this works depends on how your kernel is configured, but essentially
> it's something like this:
> 
> * The CPU which will be idling sets its local timer to wake up after N
>   counter cycles, where N is calculated from the timer frequency.
> 
> * When the local timer fires, the CPU is kicked out of the idle loop, and
>   it reads the current system time.  If the current system time indicates
>   that the software timer set in schedule_timeout() has fired, that
>   software timer fires.
> 
> If the local timer changes frequency without the idling CPU being woken
> up, then the problem you're referring to can happen.
> 
> As you're not giving much information about your system (including
> indicating where we might see some source code) we're not able to help
> more than providing above descriptions.  Maybe if you posted your
> patches so far to support the project you're working on, we could
> provide better answers.

Hello Russell,

I am using as much generic code as I can.

I've attached my clock registration code and cpufreq platform
driver to this message.

I plan to set up a github repo in order to share my progress
while I try to mainline the port.

Regards.


[-- Attachment #2: clock-tangox.c --]
[-- Type: text/x-csrc, Size: 4167 bytes --]

#include <linux/clocksource.h>	/* clocksource_register_khz	*/
#include <linux/sched_clock.h>	/* sched_clock_register		*/
#include <linux/clk-provider.h>	/* clk_register_fixed_rate	*/
#include <linux/clkdev.h>	/* clk_register_clkdev		*/
#include <linux/delay.h>	/* register_current_timer_delay	*/
#include <asm/smp_twd.h>	/* twd_local_timer_register	*/
#include <asm/smp_scu.h>	/* scu_a9_get_base		*/

#define FAST_RAMP		1
#define XTAL_FREQ		27000000 /* in Hz */
#define CLKGEN_BASE		0x10000
#define SYS_clkgen0_pll		(clkgen_base + 0x00)
#define SYS_cpuclk_div_ctrl	(clkgen_base + 0x24)
#define SYS_xtal_in_cnt		(clkgen_base + 0x48)

static void __iomem *clkgen_base;

static unsigned long read_xtal_counter(void)
{
	return readl_relaxed(SYS_xtal_in_cnt);
}

static u64 read_sched_clock(void)
{
	return read_xtal_counter();
}

static cycle_t read_clocksource(struct clocksource *cs)
{
	return read_xtal_counter();
}

static struct clocksource tangox_xtal = {
	.name	= "tangox_xtal",
	.rating	= 300,
	.read	= read_clocksource,
	.mask	= CLOCKSOURCE_MASK(32),
	.flags	= CLOCK_SOURCE_IS_CONTINUOUS,
};

static struct delay_timer delay_timer = { read_xtal_counter, XTAL_FREQ };

#define pi_alert(format, ...) do {				\
	static char fmt[] __initdata = KERN_ALERT format;	\
	printk(fmt, ## __VA_ARGS__);				\
} while (0)

static int __init wrap_local_timer_register(void)
{
	unsigned long twd_base = scu_a9_get_base() + 0x600;
	struct twd_local_timer tangox_twd = {{
		DEFINE_RES_MEM(twd_base, 16), DEFINE_RES_IRQ(29)
	}};
	return twd_local_timer_register(&tangox_twd);
}

#define REG(name, ...) union name { struct { u32 __VA_ARGS__; }; u32 val; }

REG(SYS_clkgen_pll, N:7, :6, K:3, M:3, :5, Isel:3, :3, T:1, B:1);
/*
 * CG0, CG1, CG2, CG3 PLL Control:
 * -------------------------------
 *
 * |    Byte 3     |    Byte 2     |    Byte 1     |    Byte 0     |
 * |3 3 2 2 2 2 2 2|2 2 2 2 1 1 1 1|1 1 1 1 1 1    |               |
 * |1 0 9 8 7 6 5 4|3 2 1 0 9 8 7 6|5 4 3 2 1 0 9 8|7 6 5 4 3 2 1 0|
 * |-|-|-----|-----|---------|-----|-----|---------|-|-------------|
 * |B|T|xxxxx|Isel |xxxxxxxxx|  M  |  K  |xxxxxxxxx|x|      N      |
 * |-|-|-----|-----|---------|-----|-----|---------|-|-------------|
 *
 * These registers are used to configure the PLL parameters:
 *
 * Bits  6 to  0: N[6:0]. Default = 29
 * Bits 15 to 13: K[2:0]. Default = 1
 * Bit  18 to 16: M[2:0]. Default = 0
 * Bits 26 to 24: Isel[2:0] (PLL Input Select). Default = 1
 * Bits 30      : T (PLL Test). Default = 0
 * Bits 31      : B (PLL Bypass). Default = 0
 *
 * PLL0 : Out = In * (N+1) / (M+1) / 2^K
 * PLL1 : Same as PLL0
 * PLL2 : Same as PLL0
 * Default values : All PLLs configured to output 405MHz.
 */
static void __init tangox_clock_tree_register(void)
{
	struct clk *clk;
	unsigned int mul, div;
	union SYS_clkgen_pll pll;

	pll.val = readl_relaxed(SYS_clkgen0_pll);
	mul = pll.N + 1; div = (pll.M + 1) << pll.K;
	if (pll.Isel != 1) pi_alert("PLL0 source is not XTAL_IN!\n");

	clk = clk_register_fixed_rate(0, "XTAL", 0, CLK_IS_ROOT, XTAL_FREQ);
	if (!clk) pi_alert("Failed to register %s clk!\n", "XTAL");

	clk = clk_register_fixed_factor(0, "PLL0", "XTAL", 0, mul, div);
	if (!clk) pi_alert("Failed to register %s clk!\n", "PLL0");

	clk = clk_register_divider(0, "CPU_CLK", "PLL0", 0, SYS_cpuclk_div_ctrl, 8, 8, CLK_DIVIDER_ONE_BASED, 0);
	if (!clk) pi_alert("Failed to register %s clk!\n", "CPU_CLK");
	clk_register_clkdev(clk, NULL, "cpu_clk");

	clk = clk_register_fixed_factor(0, "PERIPHCLK", "CPU_CLK", 0, 1, 2);
	if (!clk) pi_alert("Failed to register %s clk!\n", "PERIPHCLK");
	clk_register_clkdev(clk, NULL, "smp_twd");

	writel_relaxed(FAST_RAMP << 21 | 1 << 8, SYS_cpuclk_div_ctrl);
}

void __init tangox_timer_init(void)
{
	int err;

	clkgen_base = ioremap(CLKGEN_BASE, 0x100);
	if (clkgen_base == NULL) return;

	register_current_timer_delay(&delay_timer);
	sched_clock_register(read_sched_clock, 32, XTAL_FREQ);

	err = clocksource_register_hz(&tangox_xtal, XTAL_FREQ);
	if (err) pi_alert("Failed to register tangox_xtal clocksource!\n");

	tangox_clock_tree_register();

	err = wrap_local_timer_register();
	if (err) pi_alert("Failed to register local timer!\n");
}

[-- Attachment #3: cpufreq.c --]
[-- Type: text/x-csrc, Size: 1906 bytes --]

/*
 * Copyright 2015 Sigma Designs
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 */
#include <linux/module.h>
#include <linux/cpufreq.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Sigma Designs");
MODULE_DESCRIPTION("cpufreq driver for Tangox");

static struct cpufreq_frequency_table freq_table[] = {
	{ .driver_data = 1 },
	{ .driver_data = 2 },
	{ .driver_data = 3 },
	{ .driver_data = 5 },
	{ .driver_data = 9 },
	{ .driver_data = 54 },
	{ .frequency = CPUFREQ_TABLE_END }
};

static int tangox_target(struct cpufreq_policy *policy, unsigned int idx)
{
	// TODO: MUST CHECK FOR IDLE BEFORE CALLING clk_set_rate()
	return clk_set_rate(policy->clk, freq_table[idx].frequency * 1000);
}

#define FAST_RAMP_SPEED 15 /* in kHz per nanosecond */

static int tangox_cpu_init(struct cpufreq_policy *policy)
{
	struct cpufreq_frequency_table *p;
	unsigned int freq, transition_latency;

	policy->clk = clk_get_sys("cpu_clk", NULL);
	freq = clk_get_rate(policy->clk) / 1000;
	transition_latency = freq / FAST_RAMP_SPEED;

	for (p = freq_table; p->frequency != CPUFREQ_TABLE_END; ++p) {
		p->frequency = freq / p->driver_data;
	}

	return cpufreq_generic_init(policy, freq_table, transition_latency);
}

static struct cpufreq_driver tangox_cpufreq_driver = {
	.name		= "tangox-cpufreq",
	.init		= tangox_cpu_init,
	.verify		= cpufreq_generic_frequency_table_verify,
	.target_index	= tangox_target,
	.get		= cpufreq_generic_get,
	.exit		= cpufreq_generic_exit,
	.attr		= cpufreq_generic_attr,
};

static int __init tangox_cpufreq_init(void)
{
	return cpufreq_register_driver(&tangox_cpufreq_driver);
}

static void __exit tangox_cpufreq_exit(void)
{
	cpufreq_unregister_driver(&tangox_cpufreq_driver);
}

module_init(tangox_cpufreq_init);
module_exit(tangox_cpufreq_exit);

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-12 16:14         ` Mason
  0 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-12 16:14 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/05/2015 17:50, Russell King - ARM Linux wrote:
> On Tue, May 12, 2015 at 05:14:15PM +0200, Mason wrote:
>> This ties in to another thread I started in LAKML:
>> ("High-resolution timers not supported when using smp_twd on Cortex A9")
>>
>> $ git show 5388a6b2 arch/arm/kernel/smp_twd.c
>> commit 5388a6b266e9c3357353332ba0cd5549082887f1
>> Author: Russell King <rmk+kernel@arm.linux.org.uk>
>> Date:   Mon Jul 26 13:19:43 2010 +0100
>>
>>     ARM: SMP: Always enable clock event broadcast support
>>     
>>     The TWD local timers are unable to wake up the CPU when it is placed
>>     into a low power mode, eg. C3.  Therefore, we need to adapt things
>>     such that the TWD code can cope with this.
>>     
>>     We do this by always providing a broadcast tick function, and marking
>>     the fact that the TWD local timer will stop in low power modes.  This
>>     means that when the CPU is placed into a low power mode, the core
>>     timer code marks this fact, and allows an IPI to be given to the core.
>>
>> This mentions a "broadcast tick function" (of which I know nothing).
>> Is this what you're referring to?
> 
> No.  This has nothing to do with low power modes.
> 
> How this works depends on how your kernel is configured, but essentially
> it's something like this:
> 
> * The CPU which will be idling sets its local timer to wake up after N
>   counter cycles, where N is calculated from the timer frequency.
> 
> * When the local timer fires, the CPU is kicked out of the idle loop, and
>   it reads the current system time.  If the current system time indicates
>   that the software timer set in schedule_timeout() has fired, that
>   software timer fires.
> 
> If the local timer changes frequency without the idling CPU being woken
> up, then the problem you're referring to can happen.
> 
> As you're not giving much information about your system (including
> indicating where we might see some source code) we're not able to help
> more than providing above descriptions.  Maybe if you posted your
> patches so far to support the project you're working on, we could
> provide better answers.

Hello Russell,

I am using as much generic code as I can.

I've attached my clock registration code and cpufreq platform
driver to this message.

I plan to set up a github repo in order to share my progress
while I try to mainline the port.

Regards.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: clock-tangox.c
Type: text/x-csrc
Size: 4167 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20150512/a4eee71b/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cpufreq.c
Type: text/x-csrc
Size: 1906 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20150512/a4eee71b/attachment-0003.bin>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-12 15:50       ` Russell King - ARM Linux
@ 2015-05-13 16:51         ` Mason
  -1 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-13 16:51 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Viresh Kumar, Daniel Lezcano, Rafael J. Wysocki, Mans Rullgard,
	Linux ARM, Linux PM, cpufreq

[-- Attachment #1: Type: text/plain, Size: 2722 bytes --]

On 12/05/2015 17:50, Russell King - ARM Linux wrote:
> On Tue, May 12, 2015 at 05:14:15PM +0200, Mason wrote:
>> This ties in to another thread I started in LAKML:
>> ("High-resolution timers not supported when using smp_twd on Cortex A9")
>>
>> $ git show 5388a6b2 arch/arm/kernel/smp_twd.c
>> commit 5388a6b266e9c3357353332ba0cd5549082887f1
>> Author: Russell King <rmk+kernel@arm.linux.org.uk>
>> Date:   Mon Jul 26 13:19:43 2010 +0100
>>
>>     ARM: SMP: Always enable clock event broadcast support
>>     
>>     The TWD local timers are unable to wake up the CPU when it is placed
>>     into a low power mode, eg. C3.  Therefore, we need to adapt things
>>     such that the TWD code can cope with this.
>>     
>>     We do this by always providing a broadcast tick function, and marking
>>     the fact that the TWD local timer will stop in low power modes.  This
>>     means that when the CPU is placed into a low power mode, the core
>>     timer code marks this fact, and allows an IPI to be given to the core.
>>
>> This mentions a "broadcast tick function" (of which I know nothing).
>> Is this what you're referring to?
> 
> No.  This has nothing to do with low power modes.
> 
> How this works depends on how your kernel is configured, but essentially
> it's something like this:
> 
> * The CPU which will be idling sets its local timer to wake up after N
>   counter cycles, where N is calculated from the timer frequency.
> 
> * When the local timer fires, the CPU is kicked out of the idle loop, and
>   it reads the current system time.  If the current system time indicates
>   that the software timer set in schedule_timeout() has fired, that
>   software timer fires.
> 
> If the local timer changes frequency without the idling CPU being woken
> up, then the problem you're referring to can happen.
> 
> As you're not giving much information about your system (including
> indicating where we might see some source code) we're not able to help
> more than providing above descriptions.  Maybe if you posted your
> patches so far to support the project you're working on, we could
> provide better answers.

$ git diff v3.14.41 HEAD >tango.patch && xz tango.patch

I don't understand the IRQ-related part yet
( arch/arm/mach-tangox/irq.c and drivers/irqchip/irq-gic.c )

If anyone spots the problem, that would make my day.

I tested with a loadable module whose init function is

static int __init ts_init(void)
{
	long res;
	printk("ABOUT TO SLEEP\n");
	set_current_state(TASK_INTERRUPTIBLE);
	res = schedule_timeout(HZ);
	printk("WAKE UP res=%ld\n", res);
	return 0;
}

Loading the module, with cpufreq divided by 9, prints:
[ 1738.962982] ABOUT TO SLEEP
[ 1747.956191] WAKE UP res=0

Regards.


[-- Attachment #2: tango.patch.xz --]
[-- Type: application/force-download, Size: 30112 bytes --]

[-- Attachment #3: dot.config.xz --]
[-- Type: application/force-download, Size: 7924 bytes --]

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-13 16:51         ` Mason
  0 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-13 16:51 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/05/2015 17:50, Russell King - ARM Linux wrote:
> On Tue, May 12, 2015 at 05:14:15PM +0200, Mason wrote:
>> This ties in to another thread I started in LAKML:
>> ("High-resolution timers not supported when using smp_twd on Cortex A9")
>>
>> $ git show 5388a6b2 arch/arm/kernel/smp_twd.c
>> commit 5388a6b266e9c3357353332ba0cd5549082887f1
>> Author: Russell King <rmk+kernel@arm.linux.org.uk>
>> Date:   Mon Jul 26 13:19:43 2010 +0100
>>
>>     ARM: SMP: Always enable clock event broadcast support
>>     
>>     The TWD local timers are unable to wake up the CPU when it is placed
>>     into a low power mode, eg. C3.  Therefore, we need to adapt things
>>     such that the TWD code can cope with this.
>>     
>>     We do this by always providing a broadcast tick function, and marking
>>     the fact that the TWD local timer will stop in low power modes.  This
>>     means that when the CPU is placed into a low power mode, the core
>>     timer code marks this fact, and allows an IPI to be given to the core.
>>
>> This mentions a "broadcast tick function" (of which I know nothing).
>> Is this what you're referring to?
> 
> No.  This has nothing to do with low power modes.
> 
> How this works depends on how your kernel is configured, but essentially
> it's something like this:
> 
> * The CPU which will be idling sets its local timer to wake up after N
>   counter cycles, where N is calculated from the timer frequency.
> 
> * When the local timer fires, the CPU is kicked out of the idle loop, and
>   it reads the current system time.  If the current system time indicates
>   that the software timer set in schedule_timeout() has fired, that
>   software timer fires.
> 
> If the local timer changes frequency without the idling CPU being woken
> up, then the problem you're referring to can happen.
> 
> As you're not giving much information about your system (including
> indicating where we might see some source code) we're not able to help
> more than providing above descriptions.  Maybe if you posted your
> patches so far to support the project you're working on, we could
> provide better answers.

$ git diff v3.14.41 HEAD >tango.patch && xz tango.patch

I don't understand the IRQ-related part yet
( arch/arm/mach-tangox/irq.c and drivers/irqchip/irq-gic.c )

If anyone spots the problem, that would make my day.

I tested with a loadable module whose init function is

static int __init ts_init(void)
{
	long res;
	printk("ABOUT TO SLEEP\n");
	set_current_state(TASK_INTERRUPTIBLE);
	res = schedule_timeout(HZ);
	printk("WAKE UP res=%ld\n", res);
	return 0;
}

Loading the module, with cpufreq divided by 9, prints:
[ 1738.962982] ABOUT TO SLEEP
[ 1747.956191] WAKE UP res=0

Regards.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: tango.patch.xz
Type: application/force-download
Size: 30112 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20150513/bcc44c93/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dot.config.xz
Type: application/force-download
Size: 7924 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20150513/bcc44c93/attachment-0003.bin>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-13 16:51         ` Mason
@ 2015-05-14  2:13           ` Viresh Kumar
  -1 siblings, 0 replies; 77+ messages in thread
From: Viresh Kumar @ 2015-05-14  2:13 UTC (permalink / raw)
  To: Mason
  Cc: Russell King - ARM Linux, Daniel Lezcano, Rafael J. Wysocki,
	Mans Rullgard, Linux ARM, Linux PM, cpufreq

On Wed, May 13, 2015 at 10:21 PM, Mason <slash.tmp@free.fr> wrote:
> $ git diff v3.14.41 HEAD >tango.patch && xz tango.patch
>
> I don't understand the IRQ-related part yet
> ( arch/arm/mach-tangox/irq.c and drivers/irqchip/irq-gic.c )
>
> If anyone spots the problem, that would make my day.
>
> I tested with a loadable module whose init function is
>
> static int __init ts_init(void)
> {
>         long res;
>         printk("ABOUT TO SLEEP\n");
>         set_current_state(TASK_INTERRUPTIBLE);
>         res = schedule_timeout(HZ);
>         printk("WAKE UP res=%ld\n", res);
>         return 0;
> }
>
> Loading the module, with cpufreq divided by 9, prints:
> [ 1738.962982] ABOUT TO SLEEP
> [ 1747.956191] WAKE UP res=0

I hope you are also matching this time with a physical watch,
to make sure 38->47 is really 9 seconds :)

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-14  2:13           ` Viresh Kumar
  0 siblings, 0 replies; 77+ messages in thread
From: Viresh Kumar @ 2015-05-14  2:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, May 13, 2015 at 10:21 PM, Mason <slash.tmp@free.fr> wrote:
> $ git diff v3.14.41 HEAD >tango.patch && xz tango.patch
>
> I don't understand the IRQ-related part yet
> ( arch/arm/mach-tangox/irq.c and drivers/irqchip/irq-gic.c )
>
> If anyone spots the problem, that would make my day.
>
> I tested with a loadable module whose init function is
>
> static int __init ts_init(void)
> {
>         long res;
>         printk("ABOUT TO SLEEP\n");
>         set_current_state(TASK_INTERRUPTIBLE);
>         res = schedule_timeout(HZ);
>         printk("WAKE UP res=%ld\n", res);
>         return 0;
> }
>
> Loading the module, with cpufreq divided by 9, prints:
> [ 1738.962982] ABOUT TO SLEEP
> [ 1747.956191] WAKE UP res=0

I hope you are also matching this time with a physical watch,
to make sure 38->47 is really 9 seconds :)

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-14  2:13           ` Viresh Kumar
@ 2015-05-14 11:22             ` Mason
  -1 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-14 11:22 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Russell King - ARM Linux, Daniel Lezcano, Rafael J. Wysocki,
	Mans Rullgard, Linux ARM, Linux PM, cpufreq

On 14/05/2015 04:13, Viresh Kumar wrote:
> On Wed, May 13, 2015 at 10:21 PM, Mason <slash.tmp@free.fr> wrote:
>> $ git diff v3.14.41 HEAD >tango.patch && xz tango.patch
>>
>> I don't understand the IRQ-related part yet
>> ( arch/arm/mach-tangox/irq.c and drivers/irqchip/irq-gic.c )
>>
>> If anyone spots the problem, that would make my day.
>>
>> I tested with a loadable module whose init function is
>>
>> static int __init ts_init(void)
>> {
>>         long res;
>>         printk("ABOUT TO SLEEP\n");
>>         set_current_state(TASK_INTERRUPTIBLE);
>>         res = schedule_timeout(HZ);
>>         printk("WAKE UP res=%ld\n", res);
>>         return 0;
>> }
>>
>> Loading the module, with cpufreq divided by 9, prints:
>> [ 1738.962982] ABOUT TO SLEEP
>> [ 1747.956191] WAKE UP res=0
> 
> I hope you are also matching this time with a physical watch,
> to make sure 38->47 is really 9 seconds :)

Hello Viresh,

I didn't /literally/ take a stop-watch to verify it was EXACTLY
9 seconds, but I was staring at the prompt, and it /felt/ like
9 seconds. And the 54 second case felt like a minute.

I'm using a 27 MHz crystal as clocksource. This is independent
of the CPU frequency. However, I'm using the ARM TWD as the
system's clockevent source, and the TWD's clock is tied to
the CPU clock (PERIPHCLK = CPUCLK / 2 on this SoC).

I'm wondering if there's another standard clockevent source
I could try (it would be great if it supported high-resolution
timers).

Regards.


^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-14 11:22             ` Mason
  0 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-14 11:22 UTC (permalink / raw)
  To: linux-arm-kernel

On 14/05/2015 04:13, Viresh Kumar wrote:
> On Wed, May 13, 2015 at 10:21 PM, Mason <slash.tmp@free.fr> wrote:
>> $ git diff v3.14.41 HEAD >tango.patch && xz tango.patch
>>
>> I don't understand the IRQ-related part yet
>> ( arch/arm/mach-tangox/irq.c and drivers/irqchip/irq-gic.c )
>>
>> If anyone spots the problem, that would make my day.
>>
>> I tested with a loadable module whose init function is
>>
>> static int __init ts_init(void)
>> {
>>         long res;
>>         printk("ABOUT TO SLEEP\n");
>>         set_current_state(TASK_INTERRUPTIBLE);
>>         res = schedule_timeout(HZ);
>>         printk("WAKE UP res=%ld\n", res);
>>         return 0;
>> }
>>
>> Loading the module, with cpufreq divided by 9, prints:
>> [ 1738.962982] ABOUT TO SLEEP
>> [ 1747.956191] WAKE UP res=0
> 
> I hope you are also matching this time with a physical watch,
> to make sure 38->47 is really 9 seconds :)

Hello Viresh,

I didn't /literally/ take a stop-watch to verify it was EXACTLY
9 seconds, but I was staring at the prompt, and it /felt/ like
9 seconds. And the 54 second case felt like a minute.

I'm using a 27 MHz crystal as clocksource. This is independent
of the CPU frequency. However, I'm using the ARM TWD as the
system's clockevent source, and the TWD's clock is tied to
the CPU clock (PERIPHCLK = CPUCLK / 2 on this SoC).

I'm wondering if there's another standard clockevent source
I could try (it would be great if it supported high-resolution
timers).

Regards.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-14 11:22             ` Mason
@ 2015-05-14 11:54               ` Viresh Kumar
  -1 siblings, 0 replies; 77+ messages in thread
From: Viresh Kumar @ 2015-05-14 11:54 UTC (permalink / raw)
  To: Mason
  Cc: Russell King - ARM Linux, Daniel Lezcano, Rafael J. Wysocki,
	Mans Rullgard, Linux ARM, Linux PM, cpufreq

On 14-05-15, 13:22, Mason wrote:
> I didn't /literally/ take a stop-watch to verify it was EXACTLY
> 9 seconds, but I was staring at the prompt, and it /felt/ like
> 9 seconds. And the 54 second case felt like a minute.

:)

> I'm using a 27 MHz crystal as clocksource. This is independent
> of the CPU frequency. However, I'm using the ARM TWD as the
> system's clockevent source, and the TWD's clock is tied to
> the CPU clock (PERIPHCLK = CPUCLK / 2 on this SoC).

The only (very straight forward) problem is that we aren't propagating
the freq update to clockevents core and you need to debug a bit there.

Also I wanted to see the source of your print message:
[   19.650454] NEW RATE=9250000
[   19.653644] NEW RATE=9250000

What's this rate ? Old/new ? Because you are atleast printing the old
rate here, and the function by default gets the new rate.

> I'm wondering if there's another standard clockevent source
> I could try (it would be great if it supported high-resolution
> timers).

I hope you have some platform general-purpose-timers.

-- 
viresh

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-14 11:54               ` Viresh Kumar
  0 siblings, 0 replies; 77+ messages in thread
From: Viresh Kumar @ 2015-05-14 11:54 UTC (permalink / raw)
  To: linux-arm-kernel

On 14-05-15, 13:22, Mason wrote:
> I didn't /literally/ take a stop-watch to verify it was EXACTLY
> 9 seconds, but I was staring at the prompt, and it /felt/ like
> 9 seconds. And the 54 second case felt like a minute.

:)

> I'm using a 27 MHz crystal as clocksource. This is independent
> of the CPU frequency. However, I'm using the ARM TWD as the
> system's clockevent source, and the TWD's clock is tied to
> the CPU clock (PERIPHCLK = CPUCLK / 2 on this SoC).

The only (very straight forward) problem is that we aren't propagating
the freq update to clockevents core and you need to debug a bit there.

Also I wanted to see the source of your print message:
[   19.650454] NEW RATE=9250000
[   19.653644] NEW RATE=9250000

What's this rate ? Old/new ? Because you are atleast printing the old
rate here, and the function by default gets the new rate.

> I'm wondering if there's another standard clockevent source
> I could try (it would be great if it supported high-resolution
> timers).

I hope you have some platform general-purpose-timers.

-- 
viresh

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-14 11:54               ` Viresh Kumar
@ 2015-05-14 13:06                 ` Mason
  -1 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-14 13:06 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Russell King - ARM Linux, Daniel Lezcano, Rafael J. Wysocki,
	Mans Rullgard, Linux ARM, Linux PM, cpufreq

On 14/05/2015 13:54, Viresh Kumar wrote:

> Mason wrote:
> 
>> I'm using a 27 MHz crystal as clocksource. This is independent
>> of the CPU frequency. However, I'm using the ARM TWD as the
>> system's clockevent source, and the TWD's clock is tied to
>> the CPU clock (PERIPHCLK = CPUCLK / 2 on this SoC).
> 
> The only (very straight forward) problem is that we aren't propagating
> the freq update to clockevents core and you need to debug a bit there.

I had the same thought, which is why I added the "NEW RATE" trace.

> Also I wanted to see the source of your print message:
> [   19.650454] NEW RATE=9250000
> [   19.653644] NEW RATE=9250000
> 
> What's this rate ? Old/new ? Because you are at least printing the old
> rate here, and the function by default gets the new rate.

I added a printk inside twd_update_frequency().

http://lxr.free-electrons.com/source/arch/arm/kernel/smp_twd.c?v=3.14#L107

I inserted printk("NEW RATE=%lu\n", twd_timer_rate);
right before the call to clockevents_update_freq().

When I execute "echo 18500 > scaling_max_freq"
the system is supposed to change the CPU frequency to 18.5 MHz
(I might have a bug lurking there) and PERIPHCLK is 1/2 of that,
i.e 9.25 MHz.

twd_update_frequency() is called twice: once for each CPU.
(The timers are local to each CPU.)

>> I'm wondering if there's another standard clockevent source
>> I could try (it would be great if it supported high-resolution
>> timers).
> 
> I hope you have some platform general-purpose-timers.

Yes, I do, but I was trying to use as much generic code as
possible to limit the chances of introducing bugs.

I'll take a fresh look at the ARM GLOBAL TIMER, but last I
checked, it didn't seem to handle frequency scaling.

Regards.


^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-14 13:06                 ` Mason
  0 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-14 13:06 UTC (permalink / raw)
  To: linux-arm-kernel

On 14/05/2015 13:54, Viresh Kumar wrote:

> Mason wrote:
> 
>> I'm using a 27 MHz crystal as clocksource. This is independent
>> of the CPU frequency. However, I'm using the ARM TWD as the
>> system's clockevent source, and the TWD's clock is tied to
>> the CPU clock (PERIPHCLK = CPUCLK / 2 on this SoC).
> 
> The only (very straight forward) problem is that we aren't propagating
> the freq update to clockevents core and you need to debug a bit there.

I had the same thought, which is why I added the "NEW RATE" trace.

> Also I wanted to see the source of your print message:
> [   19.650454] NEW RATE=9250000
> [   19.653644] NEW RATE=9250000
> 
> What's this rate ? Old/new ? Because you are at least printing the old
> rate here, and the function by default gets the new rate.

I added a printk inside twd_update_frequency().

http://lxr.free-electrons.com/source/arch/arm/kernel/smp_twd.c?v=3.14#L107

I inserted printk("NEW RATE=%lu\n", twd_timer_rate);
right before the call to clockevents_update_freq().

When I execute "echo 18500 > scaling_max_freq"
the system is supposed to change the CPU frequency to 18.5 MHz
(I might have a bug lurking there) and PERIPHCLK is 1/2 of that,
i.e 9.25 MHz.

twd_update_frequency() is called twice: once for each CPU.
(The timers are local to each CPU.)

>> I'm wondering if there's another standard clockevent source
>> I could try (it would be great if it supported high-resolution
>> timers).
> 
> I hope you have some platform general-purpose-timers.

Yes, I do, but I was trying to use as much generic code as
possible to limit the chances of introducing bugs.

I'll take a fresh look at the ARM GLOBAL TIMER, but last I
checked, it didn't seem to handle frequency scaling.

Regards.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-14 13:06                 ` Mason
@ 2015-05-14 13:53                   ` Russell King - ARM Linux
  -1 siblings, 0 replies; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-14 13:53 UTC (permalink / raw)
  To: Mason
  Cc: Viresh Kumar, Daniel Lezcano, Rafael J. Wysocki, Mans Rullgard,
	Linux ARM, Linux PM, cpufreq

On Thu, May 14, 2015 at 03:06:54PM +0200, Mason wrote:
> I added a printk inside twd_update_frequency().
> 
> http://lxr.free-electrons.com/source/arch/arm/kernel/smp_twd.c?v=3.14#L107
> 
> I inserted printk("NEW RATE=%lu\n", twd_timer_rate);
> right before the call to clockevents_update_freq().
> 
> When I execute "echo 18500 > scaling_max_freq"
> the system is supposed to change the CPU frequency to 18.5 MHz
> (I might have a bug lurking there) and PERIPHCLK is 1/2 of that,
> i.e 9.25 MHz.
> 
> twd_update_frequency() is called twice: once for each CPU.
> (The timers are local to each CPU.)

That's expected.

I wonder if clockevents_update_freq() is returning an error - which we
don't check, and don't print a warning.  It would probably be a very
good idea to print a warning as a failure to reconfigure the clock
events code for a different frequency will lead to this kind of issue.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-14 13:53                   ` Russell King - ARM Linux
  0 siblings, 0 replies; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-14 13:53 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, May 14, 2015 at 03:06:54PM +0200, Mason wrote:
> I added a printk inside twd_update_frequency().
> 
> http://lxr.free-electrons.com/source/arch/arm/kernel/smp_twd.c?v=3.14#L107
> 
> I inserted printk("NEW RATE=%lu\n", twd_timer_rate);
> right before the call to clockevents_update_freq().
> 
> When I execute "echo 18500 > scaling_max_freq"
> the system is supposed to change the CPU frequency to 18.5 MHz
> (I might have a bug lurking there) and PERIPHCLK is 1/2 of that,
> i.e 9.25 MHz.
> 
> twd_update_frequency() is called twice: once for each CPU.
> (The timers are local to each CPU.)

That's expected.

I wonder if clockevents_update_freq() is returning an error - which we
don't check, and don't print a warning.  It would probably be a very
good idea to print a warning as a failure to reconfigure the clock
events code for a different frequency will lead to this kind of issue.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-14 13:06                 ` Mason
  (?)
@ 2015-05-14 13:59                   ` Viresh Kumar
  -1 siblings, 0 replies; 77+ messages in thread
From: Viresh Kumar @ 2015-05-14 13:59 UTC (permalink / raw)
  To: Mason
  Cc: Russell King - ARM Linux, Daniel Lezcano, Rafael J. Wysocki,
	Mans Rullgard, Linux ARM, Linux PM, cpufreq

On 14 May 2015 at 18:36, Mason <slash.tmp@free.fr> wrote:
> When I execute "echo 18500 > scaling_max_freq"
> the system is supposed to change the CPU frequency to 18.5 MHz
> (I might have a bug lurking there) and PERIPHCLK is 1/2 of that,
> i.e 9.25 MHz.

So at least we are on the right path. But it looks to me that this
call is not getting propagated well.

>From the attachment you gave initially, the event handler for
twd-timers is: tick_handle_periodic(). i.e. you are running in
periodic mode and not onshot...

why ?

> Yes, I do, but I was trying to use as much generic code as
> possible to limit the chances of introducing bugs.

Hmm..

> I'll take a fresh look at the ARM GLOBAL TIMER, but last I
> checked, it didn't seem to handle frequency scaling.

why is that required? Why will you change its freq ?
The same timer is probably used for SPEAr (the platform
I used to work on):

http://lxr.free-electrons.com/source/arch/arm/mach-spear/time.c?v=3.14

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-14 13:59                   ` Viresh Kumar
  0 siblings, 0 replies; 77+ messages in thread
From: Viresh Kumar @ 2015-05-14 13:59 UTC (permalink / raw)
  To: linux-arm-kernel

On 14 May 2015 at 18:36, Mason <slash.tmp@free.fr> wrote:
> When I execute "echo 18500 > scaling_max_freq"
> the system is supposed to change the CPU frequency to 18.5 MHz
> (I might have a bug lurking there) and PERIPHCLK is 1/2 of that,
> i.e 9.25 MHz.

So at least we are on the right path. But it looks to me that this
call is not getting propagated well.

>From the attachment you gave initially, the event handler for
twd-timers is: tick_handle_periodic(). i.e. you are running in
periodic mode and not onshot...

why ?

> Yes, I do, but I was trying to use as much generic code as
> possible to limit the chances of introducing bugs.

Hmm..

> I'll take a fresh look at the ARM GLOBAL TIMER, but last I
> checked, it didn't seem to handle frequency scaling.

why is that required? Why will you change its freq ?
The same timer is probably used for SPEAr (the platform
I used to work on):

http://lxr.free-electrons.com/source/arch/arm/mach-spear/time.c?v=3.14

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-14 13:59                   ` Viresh Kumar
  0 siblings, 0 replies; 77+ messages in thread
From: Viresh Kumar @ 2015-05-14 13:59 UTC (permalink / raw)
  To: Mason
  Cc: Russell King - ARM Linux, Daniel Lezcano, Rafael J. Wysocki,
	Mans Rullgard, Linux ARM, Linux PM, cpufreq

On 14 May 2015 at 18:36, Mason <slash.tmp@free.fr> wrote:
> When I execute "echo 18500 > scaling_max_freq"
> the system is supposed to change the CPU frequency to 18.5 MHz
> (I might have a bug lurking there) and PERIPHCLK is 1/2 of that,
> i.e 9.25 MHz.

So at least we are on the right path. But it looks to me that this
call is not getting propagated well.

From the attachment you gave initially, the event handler for
twd-timers is: tick_handle_periodic(). i.e. you are running in
periodic mode and not onshot...

why ?

> Yes, I do, but I was trying to use as much generic code as
> possible to limit the chances of introducing bugs.

Hmm..

> I'll take a fresh look at the ARM GLOBAL TIMER, but last I
> checked, it didn't seem to handle frequency scaling.

why is that required? Why will you change its freq ?
The same timer is probably used for SPEAr (the platform
I used to work on):

http://lxr.free-electrons.com/source/arch/arm/mach-spear/time.c?v=3.14

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-14 13:59                   ` Viresh Kumar
@ 2015-05-14 14:38                     ` Viresh Kumar
  -1 siblings, 0 replies; 77+ messages in thread
From: Viresh Kumar @ 2015-05-14 14:38 UTC (permalink / raw)
  To: Mason
  Cc: Russell King - ARM Linux, Daniel Lezcano, Rafael J. Wysocki,
	Mans Rullgard, Linux ARM, Linux PM, cpufreq

On 14 May 2015 at 19:29, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 14 May 2015 at 18:36, Mason <slash.tmp@free.fr> wrote:
>> When I execute "echo 18500 > scaling_max_freq"
>> the system is supposed to change the CPU frequency to 18.5 MHz
>> (I might have a bug lurking there) and PERIPHCLK is 1/2 of that,
>> i.e 9.25 MHz.
>
> So at least we are on the right path. But it looks to me that this
> call is not getting propagated well.
>
> From the attachment you gave initially, the event handler for
> twd-timers is: tick_handle_periodic(). i.e. you are running in
> periodic mode and not onshot...
>
> why ?
>
>> Yes, I do, but I was trying to use as much generic code as
>> possible to limit the chances of introducing bugs.
>
> Hmm..
>
>> I'll take a fresh look at the ARM GLOBAL TIMER, but last I
>> checked, it didn't seem to handle frequency scaling.
>
> why is that required? Why will you change its freq ?
> The same timer is probably used for SPEAr (the platform
> I used to work on):
>
> http://lxr.free-electrons.com/source/arch/arm/mach-spear/time.c?v=3.14

And this might be the problem you are facing:

http://lists.infradead.org/pipermail/linux-arm-kernel/2012-May/098564.html

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-14 14:38                     ` Viresh Kumar
  0 siblings, 0 replies; 77+ messages in thread
From: Viresh Kumar @ 2015-05-14 14:38 UTC (permalink / raw)
  To: linux-arm-kernel

On 14 May 2015 at 19:29, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 14 May 2015 at 18:36, Mason <slash.tmp@free.fr> wrote:
>> When I execute "echo 18500 > scaling_max_freq"
>> the system is supposed to change the CPU frequency to 18.5 MHz
>> (I might have a bug lurking there) and PERIPHCLK is 1/2 of that,
>> i.e 9.25 MHz.
>
> So at least we are on the right path. But it looks to me that this
> call is not getting propagated well.
>
> From the attachment you gave initially, the event handler for
> twd-timers is: tick_handle_periodic(). i.e. you are running in
> periodic mode and not onshot...
>
> why ?
>
>> Yes, I do, but I was trying to use as much generic code as
>> possible to limit the chances of introducing bugs.
>
> Hmm..
>
>> I'll take a fresh look at the ARM GLOBAL TIMER, but last I
>> checked, it didn't seem to handle frequency scaling.
>
> why is that required? Why will you change its freq ?
> The same timer is probably used for SPEAr (the platform
> I used to work on):
>
> http://lxr.free-electrons.com/source/arch/arm/mach-spear/time.c?v=3.14

And this might be the problem you are facing:

http://lists.infradead.org/pipermail/linux-arm-kernel/2012-May/098564.html

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-14 13:59                   ` Viresh Kumar
@ 2015-05-14 14:42                     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-14 14:42 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Mason, Daniel Lezcano, Rafael J. Wysocki, Mans Rullgard,
	Linux ARM, Linux PM, cpufreq

On Thu, May 14, 2015 at 07:29:38PM +0530, Viresh Kumar wrote:
> On 14 May 2015 at 18:36, Mason <slash.tmp@free.fr> wrote:
> > When I execute "echo 18500 > scaling_max_freq"
> > the system is supposed to change the CPU frequency to 18.5 MHz
> > (I might have a bug lurking there) and PERIPHCLK is 1/2 of that,
> > i.e 9.25 MHz.
> 
> So at least we are on the right path. But it looks to me that this
> call is not getting propagated well.
> 
> >From the attachment you gave initially, the event handler for
> twd-timers is: tick_handle_periodic(). i.e. you are running in
> periodic mode and not onshot...
> 
> why ?

If it's in periodic mode, the update should still be propagated to the
hardware, assuming the generic time keeping code doesn't produce an
error.

twd_update_frequency
`-clockevents_update_freq
  `-__clockevents_update_freq
    `-__clockevents_set_state(, CLOCK_EVT_STATE_PERIODIC)
      `-dev->set_mode (twd_set_mode)

That re-writes the TWD_TIMER_LOAD register based on twd_timer_rate,
which would have been updated by twd_update_frequency().

The question I posed earlier remains: is clockevents_update_freq()
failing?  We don't know, because we never check its return value.

Another thing to look at is whether we reach twd_set_mode().

Lastly, printing the values of the TWD_TIMER_LOAD and TWD_TIMER_COUNTER
after twd_set_mode() has written TWD_TIMER_LOAD might provide some
hints as to what's going on.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-14 14:42                     ` Russell King - ARM Linux
  0 siblings, 0 replies; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-14 14:42 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, May 14, 2015 at 07:29:38PM +0530, Viresh Kumar wrote:
> On 14 May 2015 at 18:36, Mason <slash.tmp@free.fr> wrote:
> > When I execute "echo 18500 > scaling_max_freq"
> > the system is supposed to change the CPU frequency to 18.5 MHz
> > (I might have a bug lurking there) and PERIPHCLK is 1/2 of that,
> > i.e 9.25 MHz.
> 
> So at least we are on the right path. But it looks to me that this
> call is not getting propagated well.
> 
> >From the attachment you gave initially, the event handler for
> twd-timers is: tick_handle_periodic(). i.e. you are running in
> periodic mode and not onshot...
> 
> why ?

If it's in periodic mode, the update should still be propagated to the
hardware, assuming the generic time keeping code doesn't produce an
error.

twd_update_frequency
`-clockevents_update_freq
  `-__clockevents_update_freq
    `-__clockevents_set_state(, CLOCK_EVT_STATE_PERIODIC)
      `-dev->set_mode (twd_set_mode)

That re-writes the TWD_TIMER_LOAD register based on twd_timer_rate,
which would have been updated by twd_update_frequency().

The question I posed earlier remains: is clockevents_update_freq()
failing?  We don't know, because we never check its return value.

Another thing to look at is whether we reach twd_set_mode().

Lastly, printing the values of the TWD_TIMER_LOAD and TWD_TIMER_COUNTER
after twd_set_mode() has written TWD_TIMER_LOAD might provide some
hints as to what's going on.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-14 13:59                   ` Viresh Kumar
@ 2015-05-14 14:48                     ` Mason
  -1 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-14 14:48 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Russell King - ARM Linux, Daniel Lezcano, Rafael J. Wysocki,
	Mans Rullgard, Linux ARM, Linux PM, cpufreq

On 14/05/2015 15:59, Viresh Kumar wrote:
> On 14 May 2015 at 18:36, Mason wrote:
>> When I execute "echo 18500 > scaling_max_freq"
>> the system is supposed to change the CPU frequency to 18.5 MHz
>> (I might have a bug lurking there) and PERIPHCLK is 1/2 of that,
>> i.e 9.25 MHz.
> 
> So at least we are on the right path. But it looks to me that this
> call is not getting propagated well.
> 
> From the attachment you gave initially, the event handler for
> twd-timers is: tick_handle_periodic(). i.e. you are running in
> periodic mode and not one-shot... why ?

I don't know. Is it not obvious (to someone who knows what to
look for) from reading the smp_twd.c source?

How do I run the TWD in one-shot mode?

>> Yes, I do, but I was trying to use as much generic code as
>> possible to limit the chances of introducing bugs.
> 
> Hmm..
> 
>> I'll take a fresh look at the ARM GLOBAL TIMER, but last I
>> checked, it didn't seem to handle frequency scaling.
> 
> why is that required? Why will you change its freq ?

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0407g/CIHGECHJ.html

"The global timer is clocked by PERIPHCLK."

PERIPHCLK = CPUCLK/2

Change the CPUCLK, change the PERIPHCLK.

> The same timer is probably used for SPEAr (the platform
> I used to work on):
> 
> http://lxr.free-electrons.com/source/arch/arm/mach-spear/time.c?v=3.14

Did that platform use cpufreq DFS?

Regards.


^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-14 14:48                     ` Mason
  0 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-14 14:48 UTC (permalink / raw)
  To: linux-arm-kernel

On 14/05/2015 15:59, Viresh Kumar wrote:
> On 14 May 2015 at 18:36, Mason wrote:
>> When I execute "echo 18500 > scaling_max_freq"
>> the system is supposed to change the CPU frequency to 18.5 MHz
>> (I might have a bug lurking there) and PERIPHCLK is 1/2 of that,
>> i.e 9.25 MHz.
> 
> So at least we are on the right path. But it looks to me that this
> call is not getting propagated well.
> 
> From the attachment you gave initially, the event handler for
> twd-timers is: tick_handle_periodic(). i.e. you are running in
> periodic mode and not one-shot... why ?

I don't know. Is it not obvious (to someone who knows what to
look for) from reading the smp_twd.c source?

How do I run the TWD in one-shot mode?

>> Yes, I do, but I was trying to use as much generic code as
>> possible to limit the chances of introducing bugs.
> 
> Hmm..
> 
>> I'll take a fresh look at the ARM GLOBAL TIMER, but last I
>> checked, it didn't seem to handle frequency scaling.
> 
> why is that required? Why will you change its freq ?

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0407g/CIHGECHJ.html

"The global timer is clocked by PERIPHCLK."

PERIPHCLK = CPUCLK/2

Change the CPUCLK, change the PERIPHCLK.

> The same timer is probably used for SPEAr (the platform
> I used to work on):
> 
> http://lxr.free-electrons.com/source/arch/arm/mach-spear/time.c?v=3.14

Did that platform use cpufreq DFS?

Regards.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-14 13:53                   ` Russell King - ARM Linux
@ 2015-05-14 14:51                     ` Mason
  -1 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-14 14:51 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Viresh Kumar, Daniel Lezcano, Rafael J. Wysocki, Mans Rullgard,
	Linux ARM, Linux PM, cpufreq

On 14/05/2015 15:53, Russell King - ARM Linux wrote:

> I wonder if clockevents_update_freq() is returning an error - which we
> don't check, and don't print a warning.  It would probably be a very
> good idea to print a warning as a failure to reconfigure the clock
> events code for a different frequency will lead to this kind of issue.

I'll add code to test the return value, and report back here.

Regards.


^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-14 14:51                     ` Mason
  0 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-14 14:51 UTC (permalink / raw)
  To: linux-arm-kernel

On 14/05/2015 15:53, Russell King - ARM Linux wrote:

> I wonder if clockevents_update_freq() is returning an error - which we
> don't check, and don't print a warning.  It would probably be a very
> good idea to print a warning as a failure to reconfigure the clock
> events code for a different frequency will lead to this kind of issue.

I'll add code to test the return value, and report back here.

Regards.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-14 14:48                     ` Mason
@ 2015-05-15  4:16                       ` Viresh Kumar
  -1 siblings, 0 replies; 77+ messages in thread
From: Viresh Kumar @ 2015-05-15  4:16 UTC (permalink / raw)
  To: Mason
  Cc: Russell King - ARM Linux, Daniel Lezcano, Rafael J. Wysocki,
	Mans Rullgard, Linux ARM, Linux PM, cpufreq

On 14-05-15, 16:48, Mason wrote:
> I don't know. Is it not obvious (to someone who knows what to
> look for) from reading the smp_twd.c source?
> 
> How do I run the TWD in one-shot mode?

I haven't refreshed my memory for the earlier reply, but I have tried
to go through the code again.

The logic is this:
- Oneshot mode is only useful if we are going to support NO_HZ mode.
  i.e. we can disable the clkevt device during CPU idle ..
- For that to work, or in other words for the CPU to wake up from the
  idle state, we need a broadcast timer. Which will wake up the idle
  cpu and its clkevt device.

  You don't have this 'broadcast' timer in your case and so we aren't
  able to switch to oneshot mode.

  Even if you do that, you will be stuck in LOW resolution mode. And
  so you probably leave this timer as is and use a out-of-CPU-domain
  timer for this work, which can support high resolution mode.

> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0407g/CIHGECHJ.html

What core does your platform has? Cortex A9? Doesn't that uses this
timer instead: drivers/clocksource/arm_arch_timer.c ?

> "The global timer is clocked by PERIPHCLK."
> 
> PERIPHCLK = CPUCLK/2
> 
> Change the CPUCLK, change the PERIPHCLK.

I am not sure if GPT on SPEAr was an ARM provided timer. But it
definitely doesn't depend on cpu-clk and so no cpufreq support in
that.

> > The same timer is probably used for SPEAr (the platform
> > I used to work on):
> > 
> > http://lxr.free-electrons.com/source/arch/arm/mach-spear/time.c?v=3.14
> 
> Did that platform use cpufreq DFS?

Yes, but the clock to that timer was independent of CPU's clock.

-- 
viresh

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-15  4:16                       ` Viresh Kumar
  0 siblings, 0 replies; 77+ messages in thread
From: Viresh Kumar @ 2015-05-15  4:16 UTC (permalink / raw)
  To: linux-arm-kernel

On 14-05-15, 16:48, Mason wrote:
> I don't know. Is it not obvious (to someone who knows what to
> look for) from reading the smp_twd.c source?
> 
> How do I run the TWD in one-shot mode?

I haven't refreshed my memory for the earlier reply, but I have tried
to go through the code again.

The logic is this:
- Oneshot mode is only useful if we are going to support NO_HZ mode.
  i.e. we can disable the clkevt device during CPU idle ..
- For that to work, or in other words for the CPU to wake up from the
  idle state, we need a broadcast timer. Which will wake up the idle
  cpu and its clkevt device.

  You don't have this 'broadcast' timer in your case and so we aren't
  able to switch to oneshot mode.

  Even if you do that, you will be stuck in LOW resolution mode. And
  so you probably leave this timer as is and use a out-of-CPU-domain
  timer for this work, which can support high resolution mode.

> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0407g/CIHGECHJ.html

What core does your platform has? Cortex A9? Doesn't that uses this
timer instead: drivers/clocksource/arm_arch_timer.c ?

> "The global timer is clocked by PERIPHCLK."
> 
> PERIPHCLK = CPUCLK/2
> 
> Change the CPUCLK, change the PERIPHCLK.

I am not sure if GPT on SPEAr was an ARM provided timer. But it
definitely doesn't depend on cpu-clk and so no cpufreq support in
that.

> > The same timer is probably used for SPEAr (the platform
> > I used to work on):
> > 
> > http://lxr.free-electrons.com/source/arch/arm/mach-spear/time.c?v=3.14
> 
> Did that platform use cpufreq DFS?

Yes, but the clock to that timer was independent of CPU's clock.

-- 
viresh

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-15  4:16                       ` Viresh Kumar
@ 2015-05-15  5:07                         ` Viresh Kumar
  -1 siblings, 0 replies; 77+ messages in thread
From: Viresh Kumar @ 2015-05-15  5:07 UTC (permalink / raw)
  To: Mason
  Cc: Russell King - ARM Linux, Daniel Lezcano, Rafael J. Wysocki,
	Mans Rullgard, Linux ARM, Linux PM, cpufreq

On 15-05-15, 09:46, Viresh Kumar wrote:
> The logic is this:
> - Oneshot mode is only useful if we are going to support NO_HZ mode.
>   i.e. we can disable the clkevt device during CPU idle ..

Minor correction. The last line talks about NO_HZ_IDLE instead. NO_HZ
is about disabling to tick when not required.

-- 
viresh

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-15  5:07                         ` Viresh Kumar
  0 siblings, 0 replies; 77+ messages in thread
From: Viresh Kumar @ 2015-05-15  5:07 UTC (permalink / raw)
  To: linux-arm-kernel

On 15-05-15, 09:46, Viresh Kumar wrote:
> The logic is this:
> - Oneshot mode is only useful if we are going to support NO_HZ mode.
>   i.e. we can disable the clkevt device during CPU idle ..

Minor correction. The last line talks about NO_HZ_IDLE instead. NO_HZ
is about disabling to tick when not required.

-- 
viresh

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-15  4:16                       ` Viresh Kumar
@ 2015-05-15  9:00                         ` Russell King - ARM Linux
  -1 siblings, 0 replies; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-15  9:00 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Mason, Daniel Lezcano, Rafael J. Wysocki, Mans Rullgard,
	Linux ARM, Linux PM, cpufreq

On Fri, May 15, 2015 at 09:46:32AM +0530, Viresh Kumar wrote:
> What core does your platform has? Cortex A9? Doesn't that uses this
> timer instead: drivers/clocksource/arm_arch_timer.c ?

Cortex A9 doesn't always have the architected timer.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-15  9:00                         ` Russell King - ARM Linux
  0 siblings, 0 replies; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-15  9:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, May 15, 2015 at 09:46:32AM +0530, Viresh Kumar wrote:
> What core does your platform has? Cortex A9? Doesn't that uses this
> timer instead: drivers/clocksource/arm_arch_timer.c ?

Cortex A9 doesn't always have the architected timer.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-15  4:16                       ` Viresh Kumar
@ 2015-05-15  9:21                         ` Mason
  -1 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-15  9:21 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Russell King - ARM Linux, Daniel Lezcano, Rafael J. Wysocki,
	Mans Rullgard, Linux ARM, Linux PM, cpufreq

On 15/05/2015 06:16, Viresh Kumar wrote:

> What core does your platform have? Cortex A9? Doesn't that uses this
> timer instead: drivers/clocksource/arm_arch_timer.c ?

Cortex-A9 MPCore r3p0

Generic Timer (aka Architected system timer) is not available
in Cortex-A9.

https://community.arm.com/message/8775

"Generic Timer is not an option on the Cortex-A9 (or A8 or A5)
processors. The Generic Timer is an optional part of the ARMv7
architecture, and is not implemented by those three processors."

Regards.


^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-15  9:21                         ` Mason
  0 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-15  9:21 UTC (permalink / raw)
  To: linux-arm-kernel

On 15/05/2015 06:16, Viresh Kumar wrote:

> What core does your platform have? Cortex A9? Doesn't that uses this
> timer instead: drivers/clocksource/arm_arch_timer.c ?

Cortex-A9 MPCore r3p0

Generic Timer (aka Architected system timer) is not available
in Cortex-A9.

https://community.arm.com/message/8775

"Generic Timer is not an option on the Cortex-A9 (or A8 or A5)
processors. The Generic Timer is an optional part of the ARMv7
architecture, and is not implemented by those three processors."

Regards.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-14 14:42                     ` Russell King - ARM Linux
@ 2015-05-15  9:29                       ` Mason
  -1 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-15  9:29 UTC (permalink / raw)
  To: Russell King - ARM Linux, Viresh Kumar
  Cc: Daniel Lezcano, Rafael J. Wysocki, Mans Rullgard, Linux ARM,
	Linux PM, cpufreq

On 14/05/2015 16:42, Russell King - ARM Linux wrote:

> If it's in periodic mode, the update should still be propagated to the
> hardware, assuming the generic time keeping code doesn't produce an
> error.
> 
> twd_update_frequency
> `-clockevents_update_freq
>   `-__clockevents_update_freq
>     `-__clockevents_set_state(, CLOCK_EVT_STATE_PERIODIC)
>       `-dev->set_mode (twd_set_mode)
> 
> That re-writes the TWD_TIMER_LOAD register based on twd_timer_rate,
> which would have been updated by twd_update_frequency().
> 
> The question I posed earlier remains: is clockevents_update_freq()
> failing?  We don't know, because we never check its return value.
> 
> Another thing to look at is whether we reach twd_set_mode().
> 
> Lastly, printing the values of the TWD_TIMER_LOAD and TWD_TIMER_COUNTER
> after twd_set_mode() has written TWD_TIMER_LOAD might provide some
> hints as to what's going on.

I'll have access to the board on Monday.

I'll add printk in strategic places, and report back ASAP.

(I'm considering dropping TWD, and using platform timers.)

Regards.


^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-15  9:29                       ` Mason
  0 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-15  9:29 UTC (permalink / raw)
  To: linux-arm-kernel

On 14/05/2015 16:42, Russell King - ARM Linux wrote:

> If it's in periodic mode, the update should still be propagated to the
> hardware, assuming the generic time keeping code doesn't produce an
> error.
> 
> twd_update_frequency
> `-clockevents_update_freq
>   `-__clockevents_update_freq
>     `-__clockevents_set_state(, CLOCK_EVT_STATE_PERIODIC)
>       `-dev->set_mode (twd_set_mode)
> 
> That re-writes the TWD_TIMER_LOAD register based on twd_timer_rate,
> which would have been updated by twd_update_frequency().
> 
> The question I posed earlier remains: is clockevents_update_freq()
> failing?  We don't know, because we never check its return value.
> 
> Another thing to look at is whether we reach twd_set_mode().
> 
> Lastly, printing the values of the TWD_TIMER_LOAD and TWD_TIMER_COUNTER
> after twd_set_mode() has written TWD_TIMER_LOAD might provide some
> hints as to what's going on.

I'll have access to the board on Monday.

I'll add printk in strategic places, and report back ASAP.

(I'm considering dropping TWD, and using platform timers.)

Regards.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-15  9:29                       ` Mason
@ 2015-05-15  9:51                         ` Russell King - ARM Linux
  -1 siblings, 0 replies; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-15  9:51 UTC (permalink / raw)
  To: Mason
  Cc: Viresh Kumar, Daniel Lezcano, Rafael J. Wysocki, Mans Rullgard,
	Linux ARM, Linux PM, cpufreq

On Fri, May 15, 2015 at 11:29:34AM +0200, Mason wrote:
> On 14/05/2015 16:42, Russell King - ARM Linux wrote:
> 
> > If it's in periodic mode, the update should still be propagated to the
> > hardware, assuming the generic time keeping code doesn't produce an
> > error.
> > 
> > twd_update_frequency
> > `-clockevents_update_freq
> >   `-__clockevents_update_freq
> >     `-__clockevents_set_state(, CLOCK_EVT_STATE_PERIODIC)
> >       `-dev->set_mode (twd_set_mode)
> > 
> > That re-writes the TWD_TIMER_LOAD register based on twd_timer_rate,
> > which would have been updated by twd_update_frequency().
> > 
> > The question I posed earlier remains: is clockevents_update_freq()
> > failing?  We don't know, because we never check its return value.
> > 
> > Another thing to look at is whether we reach twd_set_mode().
> > 
> > Lastly, printing the values of the TWD_TIMER_LOAD and TWD_TIMER_COUNTER
> > after twd_set_mode() has written TWD_TIMER_LOAD might provide some
> > hints as to what's going on.
> 
> I'll have access to the board on Monday.
> 
> I'll add printk in strategic places, and report back ASAP.

That would be good.

> (I'm considering dropping TWD, and using platform timers.)

As you don't say which kernel version you're using, for all we know, you
might be using a version which omits some fixes in this area, such as
this one which you really must have if your timer is operating in
period mode:

fe79a9ba1196 clockevents: Adjust timer interval when frequency changes



-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-15  9:51                         ` Russell King - ARM Linux
  0 siblings, 0 replies; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-15  9:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, May 15, 2015 at 11:29:34AM +0200, Mason wrote:
> On 14/05/2015 16:42, Russell King - ARM Linux wrote:
> 
> > If it's in periodic mode, the update should still be propagated to the
> > hardware, assuming the generic time keeping code doesn't produce an
> > error.
> > 
> > twd_update_frequency
> > `-clockevents_update_freq
> >   `-__clockevents_update_freq
> >     `-__clockevents_set_state(, CLOCK_EVT_STATE_PERIODIC)
> >       `-dev->set_mode (twd_set_mode)
> > 
> > That re-writes the TWD_TIMER_LOAD register based on twd_timer_rate,
> > which would have been updated by twd_update_frequency().
> > 
> > The question I posed earlier remains: is clockevents_update_freq()
> > failing?  We don't know, because we never check its return value.
> > 
> > Another thing to look at is whether we reach twd_set_mode().
> > 
> > Lastly, printing the values of the TWD_TIMER_LOAD and TWD_TIMER_COUNTER
> > after twd_set_mode() has written TWD_TIMER_LOAD might provide some
> > hints as to what's going on.
> 
> I'll have access to the board on Monday.
> 
> I'll add printk in strategic places, and report back ASAP.

That would be good.

> (I'm considering dropping TWD, and using platform timers.)

As you don't say which kernel version you're using, for all we know, you
might be using a version which omits some fixes in this area, such as
this one which you really must have if your timer is operating in
period mode:

fe79a9ba1196 clockevents: Adjust timer interval when frequency changes



-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-15  9:51                         ` Russell King - ARM Linux
@ 2015-05-15 10:01                           ` Viresh Kumar
  -1 siblings, 0 replies; 77+ messages in thread
From: Viresh Kumar @ 2015-05-15 10:01 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Mason, Daniel Lezcano, Rafael J. Wysocki, Mans Rullgard,
	Linux ARM, Linux PM, cpufreq

On 15-05-15, 10:51, Russell King - ARM Linux wrote:
> As you don't say which kernel version you're using, for all we know, you
> might be using a version which omits some fixes in this area, such as
> this one which you really must have if your timer is operating in
> period mode:
> 
> fe79a9ba1196 clockevents: Adjust timer interval when frequency changes

Atleast the link he gave was for 3.14 and this patch got commited in
3.15 :)

If that's the Mason, this patch is all you need.

-- 
viresh

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-15 10:01                           ` Viresh Kumar
  0 siblings, 0 replies; 77+ messages in thread
From: Viresh Kumar @ 2015-05-15 10:01 UTC (permalink / raw)
  To: linux-arm-kernel

On 15-05-15, 10:51, Russell King - ARM Linux wrote:
> As you don't say which kernel version you're using, for all we know, you
> might be using a version which omits some fixes in this area, such as
> this one which you really must have if your timer is operating in
> period mode:
> 
> fe79a9ba1196 clockevents: Adjust timer interval when frequency changes

Atleast the link he gave was for 3.14 and this patch got commited in
3.15 :)

If that's the Mason, this patch is all you need.

-- 
viresh

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-15  4:16                       ` Viresh Kumar
@ 2015-05-15 10:11                         ` Mason
  -1 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-15 10:11 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Russell King - ARM Linux, Daniel Lezcano, Rafael J. Wysocki,
	Mans Rullgard, Linux ARM, Linux PM, cpufreq

On 15/05/2015 06:16, Viresh Kumar wrote:

> On 14-05-15, 16:48, Mason wrote:
>
>> How do I run the TWD in one-shot mode?
> 
> I haven't refreshed my memory for the earlier reply, but I have tried
> to go through the code again.
> 
> The logic is this:
> - Oneshot mode is only useful if we are going to support NO_HZ mode.
>   i.e. we can disable the clkevt device during CPU idle ..
> - For that to work, or in other words for the CPU to wake up from the
>   idle state, we need a broadcast timer. Which will wake up the idle
>   cpu and its clkevt device.
> 
>   You don't have this 'broadcast' timer in your case and so we aren't
>   able to switch to oneshot mode.
> 
>   Even if you do that, you will be stuck in LOW resolution mode. And
>   so you probably leave this timer as is and use a out-of-CPU-domain
>   timer for this work, which can support high resolution mode.

For the record, my kernel config contains:
(There are some BROADCAST-related options enabled.)

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_HARDIRQS_SW_RESEND=y
CONFIG_IRQ_DOMAIN=y
# CONFIG_IRQ_DOMAIN_DEBUG is not set
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_KTIME_SCALAR=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_ARCH_HAS_TICK_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
# CONFIG_IRQ_TIME_ACCOUNTING is not set
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set


Full config (and platform patch) provided in <555380F8.5050306@free.fr>
http://article.gmane.org/gmane.linux.power-management.general/60164

Regards.


^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-15 10:11                         ` Mason
  0 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-15 10:11 UTC (permalink / raw)
  To: linux-arm-kernel

On 15/05/2015 06:16, Viresh Kumar wrote:

> On 14-05-15, 16:48, Mason wrote:
>
>> How do I run the TWD in one-shot mode?
> 
> I haven't refreshed my memory for the earlier reply, but I have tried
> to go through the code again.
> 
> The logic is this:
> - Oneshot mode is only useful if we are going to support NO_HZ mode.
>   i.e. we can disable the clkevt device during CPU idle ..
> - For that to work, or in other words for the CPU to wake up from the
>   idle state, we need a broadcast timer. Which will wake up the idle
>   cpu and its clkevt device.
> 
>   You don't have this 'broadcast' timer in your case and so we aren't
>   able to switch to oneshot mode.
> 
>   Even if you do that, you will be stuck in LOW resolution mode. And
>   so you probably leave this timer as is and use a out-of-CPU-domain
>   timer for this work, which can support high resolution mode.

For the record, my kernel config contains:
(There are some BROADCAST-related options enabled.)

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_HARDIRQS_SW_RESEND=y
CONFIG_IRQ_DOMAIN=y
# CONFIG_IRQ_DOMAIN_DEBUG is not set
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_KTIME_SCALAR=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_ARCH_HAS_TICK_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
# CONFIG_IRQ_TIME_ACCOUNTING is not set
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set


Full config (and platform patch) provided in <555380F8.5050306@free.fr>
http://article.gmane.org/gmane.linux.power-management.general/60164

Regards.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-15  9:51                         ` Russell King - ARM Linux
@ 2015-05-15 10:36                           ` Mason
  -1 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-15 10:36 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Viresh Kumar, Daniel Lezcano, Rafael J. Wysocki, Mans Rullgard,
	Linux ARM, Linux PM, cpufreq

On 15/05/2015 11:51, Russell King - ARM Linux wrote:

> As you don't say which kernel version you're using,

Sorry about that, I thought the patch-generating command
I posted conveyed the information.

$ git diff v3.14.41 HEAD >tango.patch && xz tango.patch

I'm using the 3.14.y branch from linux-stable.

> for all we know, you might be using a version which omits some fixes
> in this area, such as this one which you really must have if your
> timer is operating in period mode:
> 
> fe79a9ba1196 clockevents: Adjust timer interval when frequency changes

Hmmm, according to git log, this patch was accepted during
the 3.14-rc2 merge window. Why didn't it make it for 3.14?

I'll cherry-pick that commit, and report back along with
the other tests.

By any chance, is there a patch that would allow the platform
to get high-resolution timers when using smp_twd?

Regards.


^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-15 10:36                           ` Mason
  0 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-15 10:36 UTC (permalink / raw)
  To: linux-arm-kernel

On 15/05/2015 11:51, Russell King - ARM Linux wrote:

> As you don't say which kernel version you're using,

Sorry about that, I thought the patch-generating command
I posted conveyed the information.

$ git diff v3.14.41 HEAD >tango.patch && xz tango.patch

I'm using the 3.14.y branch from linux-stable.

> for all we know, you might be using a version which omits some fixes
> in this area, such as this one which you really must have if your
> timer is operating in period mode:
> 
> fe79a9ba1196 clockevents: Adjust timer interval when frequency changes

Hmmm, according to git log, this patch was accepted during
the 3.14-rc2 merge window. Why didn't it make it for 3.14?

I'll cherry-pick that commit, and report back along with
the other tests.

By any chance, is there a patch that would allow the platform
to get high-resolution timers when using smp_twd?

Regards.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-15 10:36                           ` Mason
@ 2015-05-15 11:58                             ` Russell King - ARM Linux
  -1 siblings, 0 replies; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-15 11:58 UTC (permalink / raw)
  To: Mason
  Cc: Viresh Kumar, Daniel Lezcano, Rafael J. Wysocki, Mans Rullgard,
	Linux ARM, Linux PM, cpufreq

On Fri, May 15, 2015 at 12:36:10PM +0200, Mason wrote:
> On 15/05/2015 11:51, Russell King - ARM Linux wrote:
> 
> > As you don't say which kernel version you're using,
> 
> Sorry about that, I thought the patch-generating command
> I posted conveyed the information.
> 
> $ git diff v3.14.41 HEAD >tango.patch && xz tango.patch
> 
> I'm using the 3.14.y branch from linux-stable.

So you're using a kernel over a year old...

> > for all we know, you might be using a version which omits some fixes
> > in this area, such as this one which you really must have if your
> > timer is operating in period mode:
> > 
> > fe79a9ba1196 clockevents: Adjust timer interval when frequency changes
> 
> Hmmm, according to git log, this patch was accepted during
> the 3.14-rc2 merge window. Why didn't it make it for 3.14?

No it wasn't:

$ git describe --contains fe79a9ba1196
v3.15-rc1~123^2~12

It was merged during the merge window immediately preceding v3.15-rc1.

> By any chance, is there a patch that would allow the platform
> to get high-resolution timers when using smp_twd?

I think you'll have to do the research for that yourself... what
I can say is that TWD is capable of one-shot mode, and is capable
of running as a high-res timer:

Per CPU device: 0
Clock Event Device: local_timer
 max_delta_ns:   21691754031
 min_delta_ns:   1000
 mult:           425201762
 shift:          31
 mode:           3			<== CLOCK_EVT_MODE_ONESHOT
 next_event:     595603944000000 nsecs
 set_next_event: twd_set_next_event
 set_mode:       twd_set_mode
 event_handler:  hrtimer_interrupt	<== indicates that it's switched
					to high-res mode
 retries:        0

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-15 11:58                             ` Russell King - ARM Linux
  0 siblings, 0 replies; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-15 11:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, May 15, 2015 at 12:36:10PM +0200, Mason wrote:
> On 15/05/2015 11:51, Russell King - ARM Linux wrote:
> 
> > As you don't say which kernel version you're using,
> 
> Sorry about that, I thought the patch-generating command
> I posted conveyed the information.
> 
> $ git diff v3.14.41 HEAD >tango.patch && xz tango.patch
> 
> I'm using the 3.14.y branch from linux-stable.

So you're using a kernel over a year old...

> > for all we know, you might be using a version which omits some fixes
> > in this area, such as this one which you really must have if your
> > timer is operating in period mode:
> > 
> > fe79a9ba1196 clockevents: Adjust timer interval when frequency changes
> 
> Hmmm, according to git log, this patch was accepted during
> the 3.14-rc2 merge window. Why didn't it make it for 3.14?

No it wasn't:

$ git describe --contains fe79a9ba1196
v3.15-rc1~123^2~12

It was merged during the merge window immediately preceding v3.15-rc1.

> By any chance, is there a patch that would allow the platform
> to get high-resolution timers when using smp_twd?

I think you'll have to do the research for that yourself... what
I can say is that TWD is capable of one-shot mode, and is capable
of running as a high-res timer:

Per CPU device: 0
Clock Event Device: local_timer
 max_delta_ns:   21691754031
 min_delta_ns:   1000
 mult:           425201762
 shift:          31
 mode:           3			<== CLOCK_EVT_MODE_ONESHOT
 next_event:     595603944000000 nsecs
 set_next_event: twd_set_next_event
 set_mode:       twd_set_mode
 event_handler:  hrtimer_interrupt	<== indicates that it's switched
					to high-res mode
 retries:        0

-- 
FTTC broadband for 0.8mile line: currently@10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-15 11:58                             ` Russell King - ARM Linux
@ 2015-05-15 12:45                               ` Mason
  -1 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-15 12:45 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Viresh Kumar, Daniel Lezcano, Rafael J. Wysocki, Mans Rullgard,
	Linux ARM, Linux PM, cpufreq

On 15/05/2015 13:58, Russell King - ARM Linux wrote:
> On Fri, May 15, 2015 at 12:36:10PM +0200, Mason wrote:
>> On 15/05/2015 11:51, Russell King - ARM Linux wrote:
>>
>>> As you don't say which kernel version you're using,
>>
>> Sorry about that, I thought the patch-generating command
>> I posted conveyed the information.
>>
>> $ git diff v3.14.41 HEAD >tango.patch && xz tango.patch
>>
>> I'm using the 3.14.y branch from linux-stable.
> 
> So you're using a kernel over a year old...

The sad part is: I'm working on the "new" kernel.
The current "official" kernel is 3.4 (and I'm afraid
there's a 2.6.32 still kicking around somewhere).

I'm trying to convince management that one of the advantages
of mainlining the port is that it is easier to switch to newer
kernels. Do you agree with this assertion?

>>> for all we know, you might be using a version which omits some fixes
>>> in this area, such as this one which you really must have if your
>>> timer is operating in period mode:
>>>
>>> fe79a9ba1196 clockevents: Adjust timer interval when frequency changes
>>
>> Hmmm, according to git log, this patch was accepted during
>> the 3.14-rc2 merge window. Why didn't it make it for 3.14?
> 
> No it wasn't:
> 
> $ git describe --contains fe79a9ba1196
> v3.15-rc1~123^2~12
> 
> It was merged during the merge window immediately preceding v3.15-rc1.
> 
>> By any chance, is there a patch that would allow the platform
>> to get high-resolution timers when using smp_twd?
> 
> I think you'll have to do the research for that yourself... what
> I can say is that TWD is capable of one-shot mode,

Is one-shot mode preferred over periodic mode?
(Seems counter-intuitive, as one-shot mode needs to be
periodically reprogrammed, whereas periodic mode just
keeps firing.)

> and is capable of running as a high-res timer:
> 
> Per CPU device: 0
> Clock Event Device: local_timer
>  max_delta_ns:   21691754031
>  min_delta_ns:   1000
>  mult:           425201762
>  shift:          31
>  mode:           3			<== CLOCK_EVT_MODE_ONESHOT
>  next_event:     595603944000000 nsecs
>  set_next_event: twd_set_next_event
>  set_mode:       twd_set_mode
>  event_handler:  hrtimer_interrupt	<== indicates that it's switched
> 					to high-res mode
>  retries:        0

I'm confused.

In the "High-resolution timers not supported when using
smp_twd on Cortex A9" thread, you seemed to agree that
TWD could not be used as a hrtimer...

> Yes - because the TWD stops in low power modes, which makes it
> unsuitable as a high-resolution timer.  These kinds of issues
> are annoying, but it's the way things are.

So it looks like TWD can be used as a hrtimer, but something
on my platform is making it choose otherwise, and it is not
the CLOCK_EVT_FEAT_C3STOP flag?

Regards.


^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-15 12:45                               ` Mason
  0 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-15 12:45 UTC (permalink / raw)
  To: linux-arm-kernel

On 15/05/2015 13:58, Russell King - ARM Linux wrote:
> On Fri, May 15, 2015 at 12:36:10PM +0200, Mason wrote:
>> On 15/05/2015 11:51, Russell King - ARM Linux wrote:
>>
>>> As you don't say which kernel version you're using,
>>
>> Sorry about that, I thought the patch-generating command
>> I posted conveyed the information.
>>
>> $ git diff v3.14.41 HEAD >tango.patch && xz tango.patch
>>
>> I'm using the 3.14.y branch from linux-stable.
> 
> So you're using a kernel over a year old...

The sad part is: I'm working on the "new" kernel.
The current "official" kernel is 3.4 (and I'm afraid
there's a 2.6.32 still kicking around somewhere).

I'm trying to convince management that one of the advantages
of mainlining the port is that it is easier to switch to newer
kernels. Do you agree with this assertion?

>>> for all we know, you might be using a version which omits some fixes
>>> in this area, such as this one which you really must have if your
>>> timer is operating in period mode:
>>>
>>> fe79a9ba1196 clockevents: Adjust timer interval when frequency changes
>>
>> Hmmm, according to git log, this patch was accepted during
>> the 3.14-rc2 merge window. Why didn't it make it for 3.14?
> 
> No it wasn't:
> 
> $ git describe --contains fe79a9ba1196
> v3.15-rc1~123^2~12
> 
> It was merged during the merge window immediately preceding v3.15-rc1.
> 
>> By any chance, is there a patch that would allow the platform
>> to get high-resolution timers when using smp_twd?
> 
> I think you'll have to do the research for that yourself... what
> I can say is that TWD is capable of one-shot mode,

Is one-shot mode preferred over periodic mode?
(Seems counter-intuitive, as one-shot mode needs to be
periodically reprogrammed, whereas periodic mode just
keeps firing.)

> and is capable of running as a high-res timer:
> 
> Per CPU device: 0
> Clock Event Device: local_timer
>  max_delta_ns:   21691754031
>  min_delta_ns:   1000
>  mult:           425201762
>  shift:          31
>  mode:           3			<== CLOCK_EVT_MODE_ONESHOT
>  next_event:     595603944000000 nsecs
>  set_next_event: twd_set_next_event
>  set_mode:       twd_set_mode
>  event_handler:  hrtimer_interrupt	<== indicates that it's switched
> 					to high-res mode
>  retries:        0

I'm confused.

In the "High-resolution timers not supported when using
smp_twd on Cortex A9" thread, you seemed to agree that
TWD could not be used as a hrtimer...

> Yes - because the TWD stops in low power modes, which makes it
> unsuitable as a high-resolution timer.  These kinds of issues
> are annoying, but it's the way things are.

So it looks like TWD can be used as a hrtimer, but something
on my platform is making it choose otherwise, and it is not
the CLOCK_EVT_FEAT_C3STOP flag?

Regards.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-15 12:45                               ` Mason
@ 2015-05-15 13:15                                 ` Russell King - ARM Linux
  -1 siblings, 0 replies; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-15 13:15 UTC (permalink / raw)
  To: Mason
  Cc: Viresh Kumar, Daniel Lezcano, Rafael J. Wysocki, Mans Rullgard,
	Linux ARM, Linux PM, cpufreq

On Fri, May 15, 2015 at 02:45:28PM +0200, Mason wrote:
> The sad part is: I'm working on the "new" kernel.
> The current "official" kernel is 3.4 (and I'm afraid
> there's a 2.6.32 still kicking around somewhere).
> 
> I'm trying to convince management that one of the advantages
> of mainlining the port is that it is easier to switch to newer
> kernels. Do you agree with this assertion?

Yes and no.  If you can get stuff upstream, it should make things
easier.

The difficult bit - and time consuming bit - is getting code upstream.
Even in my position, I'd suggest allowing about five years for that
activity, and even then don't have expectations of getting everything
upstream.

Frankly now, my advice to people would be to just not bother, it's
*way* too much effort to get everything to an acceptable state,
especially if the SoC has no support what so ever.

For example, we're still banging on over the iMX6 SoCs - just getting
the display stuff sorted took more than a year.  Ethernet keeps becoming
unstable (people keep reporting problems with it) and it became far too
difficult for me to get my patch set for that upstream, so I dumped it.

> > I think you'll have to do the research for that yourself... what
> > I can say is that TWD is capable of one-shot mode,
> 
> Is one-shot mode preferred over periodic mode?
> (Seems counter-intuitive, as one-shot mode needs to be
> periodically reprogrammed, whereas periodic mode just
> keeps firing.)

Think about it in terms of high-resolution timers.  How can something
that's programmed to fire HZ times a second provide you with a high
resolution timer?  Something that you can program for an arbitary
period depending on your needs obviously allows higher resolution
intervals, up to the minimum timer count interval.  It can also provide
a longer interval, up to the maximum timer count interval.

> > and is capable of running as a high-res timer:
> > 
> > Per CPU device: 0
> > Clock Event Device: local_timer
> >  max_delta_ns:   21691754031
> >  min_delta_ns:   1000
> >  mult:           425201762
> >  shift:          31
> >  mode:           3			<== CLOCK_EVT_MODE_ONESHOT
> >  next_event:     595603944000000 nsecs
> >  set_next_event: twd_set_next_event
> >  set_mode:       twd_set_mode
> >  event_handler:  hrtimer_interrupt	<== indicates that it's switched
> > 					to high-res mode
> >  retries:        0
> 
> I'm confused.
> 
> In the "High-resolution timers not supported when using
> smp_twd on Cortex A9" thread, you seemed to agree that
> TWD could not be used as a hrtimer...

I guess I was wrong.  I'm no expert on the kernel's time code.  That's
why I've suggested to you several times that you do the research
yourself, rather than me having to do that research for you and then
write an email about it.  It's kind of wasting my time, and you're not
paying me for this kind of support service...

> > Yes - because the TWD stops in low power modes, which makes it
> > unsuitable as a high-resolution timer.  These kinds of issues
> > are annoying, but it's the way things are.
> 
> So it looks like TWD can be used as a hrtimer, but something
> on my platform is making it choose otherwise, and it is not
> the CLOCK_EVT_FEAT_C3STOP flag?

I've no idea.  I don't have the 3.14 code in front of me to research.
I _could_ wind my tree back to 3.14, and read that code, but I don't
see why I should go to all that effort... especially as I have other
things I need to be getting on with, and other problems to be taking
care of.

What I'm saying is that I have a platform here running a modern kernel
which _does_ use the TWD, and it _does_ appear to be running in high-res
mode.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
@ 2015-05-15 13:15                                 ` Russell King - ARM Linux
  0 siblings, 0 replies; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-15 13:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, May 15, 2015 at 02:45:28PM +0200, Mason wrote:
> The sad part is: I'm working on the "new" kernel.
> The current "official" kernel is 3.4 (and I'm afraid
> there's a 2.6.32 still kicking around somewhere).
> 
> I'm trying to convince management that one of the advantages
> of mainlining the port is that it is easier to switch to newer
> kernels. Do you agree with this assertion?

Yes and no.  If you can get stuff upstream, it should make things
easier.

The difficult bit - and time consuming bit - is getting code upstream.
Even in my position, I'd suggest allowing about five years for that
activity, and even then don't have expectations of getting everything
upstream.

Frankly now, my advice to people would be to just not bother, it's
*way* too much effort to get everything to an acceptable state,
especially if the SoC has no support what so ever.

For example, we're still banging on over the iMX6 SoCs - just getting
the display stuff sorted took more than a year.  Ethernet keeps becoming
unstable (people keep reporting problems with it) and it became far too
difficult for me to get my patch set for that upstream, so I dumped it.

> > I think you'll have to do the research for that yourself... what
> > I can say is that TWD is capable of one-shot mode,
> 
> Is one-shot mode preferred over periodic mode?
> (Seems counter-intuitive, as one-shot mode needs to be
> periodically reprogrammed, whereas periodic mode just
> keeps firing.)

Think about it in terms of high-resolution timers.  How can something
that's programmed to fire HZ times a second provide you with a high
resolution timer?  Something that you can program for an arbitary
period depending on your needs obviously allows higher resolution
intervals, up to the minimum timer count interval.  It can also provide
a longer interval, up to the maximum timer count interval.

> > and is capable of running as a high-res timer:
> > 
> > Per CPU device: 0
> > Clock Event Device: local_timer
> >  max_delta_ns:   21691754031
> >  min_delta_ns:   1000
> >  mult:           425201762
> >  shift:          31
> >  mode:           3			<== CLOCK_EVT_MODE_ONESHOT
> >  next_event:     595603944000000 nsecs
> >  set_next_event: twd_set_next_event
> >  set_mode:       twd_set_mode
> >  event_handler:  hrtimer_interrupt	<== indicates that it's switched
> > 					to high-res mode
> >  retries:        0
> 
> I'm confused.
> 
> In the "High-resolution timers not supported when using
> smp_twd on Cortex A9" thread, you seemed to agree that
> TWD could not be used as a hrtimer...

I guess I was wrong.  I'm no expert on the kernel's time code.  That's
why I've suggested to you several times that you do the research
yourself, rather than me having to do that research for you and then
write an email about it.  It's kind of wasting my time, and you're not
paying me for this kind of support service...

> > Yes - because the TWD stops in low power modes, which makes it
> > unsuitable as a high-resolution timer.  These kinds of issues
> > are annoying, but it's the way things are.
> 
> So it looks like TWD can be used as a hrtimer, but something
> on my platform is making it choose otherwise, and it is not
> the CLOCK_EVT_FEAT_C3STOP flag?

I've no idea.  I don't have the 3.14 code in front of me to research.
I _could_ wind my tree back to 3.14, and read that code, but I don't
see why I should go to all that effort... especially as I have other
things I need to be getting on with, and other problems to be taking
care of.

What I'm saying is that I have a platform here running a modern kernel
which _does_ use the TWD, and it _does_ appear to be running in high-res
mode.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-15 13:15                                 ` Russell King - ARM Linux
  (?)
@ 2015-05-15 13:58                                 ` Mason
  2015-05-15 18:35                                   ` Mason
  -1 siblings, 1 reply; 77+ messages in thread
From: Mason @ 2015-05-15 13:58 UTC (permalink / raw)
  To: linux-arm-kernel

[ Dropping cpufreq and linux-pm ]

On 15/05/2015 15:15, Russell King - ARM Linux wrote:

> I guess I was wrong.  I'm no expert on the kernel's time code.  That's
> why I've suggested to you several times that you do the research
> yourself, rather than me having to do that research for you and then
> write an email about it.  It's kind of wasting my time, and you're not
> paying me for this kind of support service...

Sorry if my questions appeared to be asking you to do
my research for me. I was hoping some of the ARM guys
might eventually chime in, as they wrote most of the
TWD code.

Anyway, thanks to you and Viresh for pointing out the
most likely culprit for my problem. I'll take a hard
look at clockevents source code next week.

> What I'm saying is that I have a platform here running a modern kernel
> which _does_ use the TWD, and it _does_ appear to be running in high-res
> mode.

Thanks for providing evidence that it does work. I am
now hopeful to get it working. I'll start by applying
my patch to a more recent kernel.

For the record, if I remove the CLOCK_EVT_FEAT_C3STOP
flag, then the system does provide hrtimers, but this
seems to be a red herring.

Regards.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-15 13:58                                 ` Mason
@ 2015-05-15 18:35                                   ` Mason
  2015-05-18 11:24                                     ` Mason
  0 siblings, 1 reply; 77+ messages in thread
From: Mason @ 2015-05-15 18:35 UTC (permalink / raw)
  To: linux-arm-kernel

On 15/05/2015 15:58, Mason wrote:

> Thanks for providing evidence that it does work. I am
> now hopeful to get it working. I'll start by applying
> my patch to a more recent kernel.

Could you provide the full output of /proc/timer_list for
the platform where TWD is running in one-shot mode?

Also what kernel version is that platform running?

Regards.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-15 18:35                                   ` Mason
@ 2015-05-18 11:24                                     ` Mason
  2015-05-18 11:54                                       ` Russell King - ARM Linux
  0 siblings, 1 reply; 77+ messages in thread
From: Mason @ 2015-05-18 11:24 UTC (permalink / raw)
  To: linux-arm-kernel

On 15/05/2015 20:35, Mason wrote:

> On 15/05/2015 15:58, Mason wrote:
> 
>> Thanks for providing evidence that it does work. I am
>> now hopeful to get it working. I'll start by applying
>> my patch to a more recent kernel.
> 
> Could you provide the full output of /proc/timer_list for
> the platform where TWD is running in one-shot mode?
> 
> Also what kernel version is that platform running?

Hello Russell,

On your platform, where TWD works in one-shot mode with hrtimers
enabled, I assume you registered a tick broadcast device?

(AFAIU, one is required since TWD is tagged CLOCK_EVT_FEAT_C3STOP.)
https://lwn.net/Articles/574962/

I didn't register a tick broadcast device, and this may be the root
of my problems.

Here's the relevant part of /proc/timer_list

Tick Device: mode:     0
Broadcast device
Clock Event Device: <NULL>
tick_broadcast_mask: 00000000
tick_broadcast_oneshot_mask: 00000000

What does your timer_list show?


Questions for the ARM devs:

Is there an architected / standard tick broadcast device on ARM,
equivalent to x86's HPET?

Russell's 2010 patch (5388a6b266) states

  The TWD local timers are unable to wake up the CPU when it is placed
  into a low power mode, eg. C3.  Therefore, we need to adapt things
  such that the TWD code can cope with this.

What exactly is meant by "low power modes".
Does the idle loop calling WFI send the system into low power mode?
Are there several low power modes?
How are these modes entered and exited?
Is this documented in the ARM reference manual?
or in the Cortex A9 technical manual?

The technical manual states

5.3.1  Individual Cortex-A9 processor power management
[...]
Each Cortex-A9 processor can be in one of the following modes:
Run mode     = Everything is clocked and powered-up
Standby mode = The CPU clock is stopped. Only logic required for wake-up is still active.
Dormant mode = Everything is powered off except RAM arrays that are in retention mode.
Shutdown     = Everything is powered-off.
[...]
Standby modes
WFI and WFE Standby modes disable most of the clocks in a processor, while keeping its logic
powered up. This reduces the power drawn to the static leakage current, leaving a tiny clock
power overhead requirement to enable the device to wake up.
[...]
Dormant mode
Dormant mode is designed to enable the Cortex-A9 processor to be powered down, while
leaving the caches powered up and maintaining their state.

Does the "low power mode" in Russell's commit message refer to Dormant mode?
Because it seems the TWD should still work in Standby mode, right?


Regards.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-18 11:24                                     ` Mason
@ 2015-05-18 11:54                                       ` Russell King - ARM Linux
  2015-05-20 16:21                                         ` Mason
  0 siblings, 1 reply; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-18 11:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, May 18, 2015 at 01:24:49PM +0200, Mason wrote:
> On your platform, where TWD works in one-shot mode with hrtimers
> enabled, I assume you registered a tick broadcast device?

It is iMX6, and yes, there is a broadcast device, which appears to be
disabled:

Tick Device: mode:     1
Broadcast device
Clock Event Device: mxc_timer1
 max_delta_ns:   1431655752223
 min_delta_ns:   85000
 mult:           6442451
 shift:          31
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: v2_set_next_event
 set_mode:       mxc_set_mode
 event_handler:  tick_handle_oneshot_broadcast
 retries:        0

tick_broadcast_mask: 00000000
tick_broadcast_oneshot_mask: 00000000

> I didn't register a tick broadcast device, and this may be the root
> of my problems.

Possibly, I wouldn't know without reading the code.

> Questions for the ARM devs:
> 
> Is there an architected / standard tick broadcast device on ARM,
> equivalent to x86's HPET?
> 
> Russell's 2010 patch (5388a6b266) states
> 
>   The TWD local timers are unable to wake up the CPU when it is placed
>   into a low power mode, eg. C3.  Therefore, we need to adapt things
>   such that the TWD code can cope with this.
> 
> What exactly is meant by "low power modes".
> Does the idle loop calling WFI send the system into low power mode?

That depends on your hardware!  Some SoCs program the low power mode
via a register, and then you need to issue a WFI to enter that power
mode.

Low power modes are not really part of the architecture.  The
architecture defines what low power modes are available at CPU level,
but does not completely define how those low power modes are to be
tnetered.  It's up to the SoC vendor to determine what low power modes
are available, and how they are entered, and how fine-grained they are.

For a SoC where WFI is not programmed to cause anything other than the
architecture specified dormant behaviour, WFI will not cause the TWD
to stop.

> Are there several low power modes?

There can be.  Up to the SoC vendor to define.

> How are these modes entered and exited?

Up to the SoC vendor to define.

> Is this documented in the ARM reference manual?

No.  This documents the power modes that are (possibly) provided by the
CPU architecture for use by the SoC.

> or in the Cortex A9 technical manual?

No.

To give an example, a SoC vendor _may_ decide that CPU states below
"dormant" are entered by writing to a SoC specific power mode register,
and then executing a WFI instruction.

Another SoC vendor may decode that a CPU can be completely powered down
by merely writing to a register, which causes the hardware to assert a
CPU core specific reset and then remove power from the CPU core.

It's entirely up to the SoC vendor.  Therefore, you need to read the SoC
specific documentation.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-18 11:54                                       ` Russell King - ARM Linux
@ 2015-05-20 16:21                                         ` Mason
  2015-05-20 18:50                                           ` Arnd Bergmann
  0 siblings, 1 reply; 77+ messages in thread
From: Mason @ 2015-05-20 16:21 UTC (permalink / raw)
  To: linux-arm-kernel

Hello Russell,

Thanks for your detailed answer.

On 18/05/2015 13:54, Russell King - ARM Linux wrote:

> For a SoC where WFI is not programmed to cause anything other than
> the architecture specified dormant behaviour, WFI will not cause the
> TWD to stop.

According to the hardware engineers, my SoC does not support any
low-power modes.

But I didn't see the "clean" way to make the kernel aware of this.
Is this an acceptable patch? (I have my doubts.)


diff --git a/arch/arm/kernel/smp_twd.c b/arch/arm/kernel/smp_twd.c
index 6591e26..300f13a 100644
--- a/arch/arm/kernel/smp_twd.c
+++ b/arch/arm/kernel/smp_twd.c
@@ -295,6 +295,10 @@ static void twd_timer_setup(void)
        clk->name = "local_timer";
        clk->features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT |
                        CLOCK_EVT_FEAT_C3STOP;
+#ifdef CONFIG_TANGOX
+       /*** Tango does not implement low power modes ***/
+       clk->features &= ~CLOCK_EVT_FEAT_C3STOP;
+#endif
        clk->rating = 350;
        clk->set_mode = twd_set_mode;
        clk->set_next_event = twd_set_next_event;


Regards.

^ permalink raw reply related	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-20 16:21                                         ` Mason
@ 2015-05-20 18:50                                           ` Arnd Bergmann
  2015-05-20 19:34                                             ` Mason
  0 siblings, 1 reply; 77+ messages in thread
From: Arnd Bergmann @ 2015-05-20 18:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 20 May 2015 18:21:45 Mason wrote:
> 
> On 18/05/2015 13:54, Russell King - ARM Linux wrote:
> 
> > For a SoC where WFI is not programmed to cause anything other than
> > the architecture specified dormant behaviour, WFI will not cause the
> > TWD to stop.
> 
> According to the hardware engineers, my SoC does not support any
> low-power modes.
> 
> But I didn't see the "clean" way to make the kernel aware of this.
> Is this an acceptable patch? (I have my doubts.)

No, this is clearly broken.

> diff --git a/arch/arm/kernel/smp_twd.c b/arch/arm/kernel/smp_twd.c
> index 6591e26..300f13a 100644
> --- a/arch/arm/kernel/smp_twd.c
> +++ b/arch/arm/kernel/smp_twd.c
> @@ -295,6 +295,10 @@ static void twd_timer_setup(void)
>         clk->name = "local_timer";
>         clk->features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT |
>                         CLOCK_EVT_FEAT_C3STOP;
> +#ifdef CONFIG_TANGOX
> +       /*** Tango does not implement low power modes ***/
> +       clk->features &= ~CLOCK_EVT_FEAT_C3STOP;
> +#endif
>         clk->rating = 350;
>         clk->set_mode = twd_set_mode;
>         clk->set_next_event = twd_set_next_event;

This will disable the feature on all machines that are configured 
in the kernel. You have to make a run-time decision instead, e.g.
based on a boolean DT property of the twd node.

	Arnd

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-20 18:50                                           ` Arnd Bergmann
@ 2015-05-20 19:34                                             ` Mason
  2015-05-20 20:14                                               ` Russell King - ARM Linux
  0 siblings, 1 reply; 77+ messages in thread
From: Mason @ 2015-05-20 19:34 UTC (permalink / raw)
  To: linux-arm-kernel

On 20/05/2015 20:50, Arnd Bergmann wrote:

> On Wednesday 20 May 2015 18:21:45 Mason wrote:
>>
>> On 18/05/2015 13:54, Russell King - ARM Linux wrote:
>>
>>> For a SoC where WFI is not programmed to cause anything other than
>>> the architecture specified dormant behaviour, WFI will not cause the
>>> TWD to stop.
>>
>> According to the hardware engineers, my SoC does not support any
>> low-power modes.
>>
>> But I didn't see the "clean" way to make the kernel aware of this.
>> Is this an acceptable patch? (I have my doubts.)
> 
> No, this is clearly broken.

I don't see it. Could you be more explicit?

>> diff --git a/arch/arm/kernel/smp_twd.c b/arch/arm/kernel/smp_twd.c
>> index 6591e26..300f13a 100644
>> --- a/arch/arm/kernel/smp_twd.c
>> +++ b/arch/arm/kernel/smp_twd.c
>> @@ -295,6 +295,10 @@ static void twd_timer_setup(void)
>>         clk->name = "local_timer";
>>         clk->features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT |
>>                         CLOCK_EVT_FEAT_C3STOP;
>> +#ifdef CONFIG_TANGOX
>> +       /*** Tango does not implement low power modes ***/
>> +       clk->features &= ~CLOCK_EVT_FEAT_C3STOP;
>> +#endif
>>         clk->rating = 350;
>>         clk->set_mode = twd_set_mode;
>>         clk->set_next_event = twd_set_next_event;
> 
> This will disable the feature on all machines that are configured 
> in the kernel.

What do you mean, "disable the feature"?

My proposed patch doesn't change the default, which is to set
CLOCK_EVT_FEAT_C3STOP unconditionally for all machines.

And then only for some platforms (in this case only TANGOX)
CLOCK_EVT_FEAT_C3STOP is *removed* from the features list.

Regards.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-20 19:34                                             ` Mason
@ 2015-05-20 20:14                                               ` Russell King - ARM Linux
  2015-05-20 20:41                                                 ` Mason
  0 siblings, 1 reply; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-20 20:14 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, May 20, 2015 at 09:34:16PM +0200, Mason wrote:
> On 20/05/2015 20:50, Arnd Bergmann wrote:
> >> diff --git a/arch/arm/kernel/smp_twd.c b/arch/arm/kernel/smp_twd.c
> >> index 6591e26..300f13a 100644
> >> --- a/arch/arm/kernel/smp_twd.c
> >> +++ b/arch/arm/kernel/smp_twd.c
> >> @@ -295,6 +295,10 @@ static void twd_timer_setup(void)
> >>         clk->name = "local_timer";
> >>         clk->features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT |
> >>                         CLOCK_EVT_FEAT_C3STOP;
> >> +#ifdef CONFIG_TANGOX
> >> +       /*** Tango does not implement low power modes ***/
> >> +       clk->features &= ~CLOCK_EVT_FEAT_C3STOP;
> >> +#endif
> >>         clk->rating = 350;
> >>         clk->set_mode = twd_set_mode;
> >>         clk->set_next_event = twd_set_next_event;
> > 
> > This will disable the feature on all machines that are configured 
> > in the kernel.
> 
> What do you mean, "disable the feature"?
> 
> My proposed patch doesn't change the default, which is to set
> CLOCK_EVT_FEAT_C3STOP unconditionally for all machines.
> 
> And then only for some platforms (in this case only TANGOX)
> CLOCK_EVT_FEAT_C3STOP is *removed* from the features list.

So we build a kernel containing both tangox and OMAP together, and
OMAP starts misbehaving because CLOCK_EVT_FEAT_C3STOP is no longer
set.  That's a regression caused by your change, even if you didn't
intend for anyone else to enable your option.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-20 20:14                                               ` Russell King - ARM Linux
@ 2015-05-20 20:41                                                 ` Mason
  2015-05-20 20:52                                                   ` Arnd Bergmann
  0 siblings, 1 reply; 77+ messages in thread
From: Mason @ 2015-05-20 20:41 UTC (permalink / raw)
  To: linux-arm-kernel

On 20/05/2015 22:14, Russell King - ARM Linux wrote:
> On Wed, May 20, 2015 at 09:34:16PM +0200, Mason wrote:
>> On 20/05/2015 20:50, Arnd Bergmann wrote:
>>>> diff --git a/arch/arm/kernel/smp_twd.c b/arch/arm/kernel/smp_twd.c
>>>> index 6591e26..300f13a 100644
>>>> --- a/arch/arm/kernel/smp_twd.c
>>>> +++ b/arch/arm/kernel/smp_twd.c
>>>> @@ -295,6 +295,10 @@ static void twd_timer_setup(void)
>>>>         clk->name = "local_timer";
>>>>         clk->features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT |
>>>>                         CLOCK_EVT_FEAT_C3STOP;
>>>> +#ifdef CONFIG_TANGOX
>>>> +       /*** Tango does not implement low power modes ***/
>>>> +       clk->features &= ~CLOCK_EVT_FEAT_C3STOP;
>>>> +#endif
>>>>         clk->rating = 350;
>>>>         clk->set_mode = twd_set_mode;
>>>>         clk->set_next_event = twd_set_next_event;
>>>
>>> This will disable the feature on all machines that are configured 
>>> in the kernel.
>>
>> What do you mean, "disable the feature"?
>>
>> My proposed patch doesn't change the default, which is to set
>> CLOCK_EVT_FEAT_C3STOP unconditionally for all machines.
>>
>> And then only for some platforms (in this case only TANGOX)
>> CLOCK_EVT_FEAT_C3STOP is *removed* from the features list.
> 
> So we build a kernel containing both tangox and OMAP together, and
> OMAP starts misbehaving because CLOCK_EVT_FEAT_C3STOP is no longer
> set.  That's a regression caused by your change, even if you didn't
> intend for anyone else to enable your option.

Oh... I didn't even think it made sense (and was supported)
to select more than one machine...

Is this related to CONFIG_ARCH_MULTIPLATFORM?

Regards.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-20 20:41                                                 ` Mason
@ 2015-05-20 20:52                                                   ` Arnd Bergmann
  2015-05-20 21:56                                                     ` Mason
  0 siblings, 1 reply; 77+ messages in thread
From: Arnd Bergmann @ 2015-05-20 20:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 20 May 2015 22:41:33 Mason wrote:
> 
> Oh... I didn't even think it made sense (and was supported)
> to select more than one machine...
> 
> Is this related to CONFIG_ARCH_MULTIPLATFORM?

Yes. In the old days, each platform had its own entry in the top-level
choice statement, which meant they were mutually exclusive. Most of
them are converted now and can be put into a single kernel with
CONFIG_ARCH_MULTIPLATFORM, and we stopped accepting new ones that
don't do this a few years ago.

I have a patch series that converts all remaining ARMv6 and ARMv7
platforms that don't are still standalone to use CONFIG_ARCH_MULTIPLATFORM
as well.

	Arnd

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-20 20:52                                                   ` Arnd Bergmann
@ 2015-05-20 21:56                                                     ` Mason
  2015-05-20 22:18                                                       ` Arnd Bergmann
  2015-05-20 23:14                                                       ` Russell King - ARM Linux
  0 siblings, 2 replies; 77+ messages in thread
From: Mason @ 2015-05-20 21:56 UTC (permalink / raw)
  To: linux-arm-kernel

On 20/05/2015 22:52, Arnd Bergmann wrote:

> On Wednesday 20 May 2015 22:41:33 Mason wrote:
>
>> Oh... I didn't even think it made sense (and was supported)
>> to select more than one machine...
>>
>> Is this related to CONFIG_ARCH_MULTIPLATFORM?
> 
> Yes. In the old days, each platform had its own entry in the top-level
> choice statement, which meant they were mutually exclusive. Most of
> them are converted now and can be put into a single kernel with
> CONFIG_ARCH_MULTIPLATFORM, and we stopped accepting new ones that
> don't do this a few years ago.

Is the following patch (more) acceptable:

diff --git a/arch/arm/kernel/smp_twd.c b/arch/arm/kernel/smp_twd.c
index 6591e26fc13f..3e867b2f067f 100644
--- a/arch/arm/kernel/smp_twd.c
+++ b/arch/arm/kernel/smp_twd.c
@@ -32,6 +32,7 @@ static void __iomem *twd_base;
 static struct clk *twd_clk;
 static unsigned long twd_timer_rate;
 static DEFINE_PER_CPU(bool, percpu_setup_called);
+static int feat_c3stop = CLOCK_EVT_FEAT_C3STOP;
 
 static struct clock_event_device __percpu *twd_evt;
 static int twd_ppi;
@@ -294,7 +295,7 @@ static void twd_timer_setup(void)
 
        clk->name = "local_timer";
        clk->features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT |
-                       CLOCK_EVT_FEAT_C3STOP;
+                       feat_c3stop;
        clk->rating = 350;
        clk->set_mode = twd_set_mode;
        clk->set_next_event = twd_set_next_event;
@@ -346,6 +347,7 @@ static int __init twd_local_timer_common_register(struct device_node *np)
                goto out_irq;
 
        twd_get_clock(np);
+       if (of_get_property(np, "twd_never_stops", NULL)) feat_c3stop = 0;
 
        /*
         * Immediately configure the timer on the boot CPU, unless we need


Or perhaps the other way around?
i.e. feat_c3stop initialized to 0 (thus in the bss section)
and set to FEAT_C3STOP if "twd_never_stops" doesn't exist...

Russell, when you added the FEAT_C3STOP flag unconditionally in
commit 5388a6b266, didn't that potentially break platforms that
didn't expect the flag to be set?

Regards.

^ permalink raw reply related	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-20 21:56                                                     ` Mason
@ 2015-05-20 22:18                                                       ` Arnd Bergmann
  2015-05-21 12:35                                                         ` Mason
  2015-05-20 23:14                                                       ` Russell King - ARM Linux
  1 sibling, 1 reply; 77+ messages in thread
From: Arnd Bergmann @ 2015-05-20 22:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 20 May 2015 23:56:31 Mason wrote:
> diff --git a/arch/arm/kernel/smp_twd.c b/arch/arm/kernel/smp_twd.c
> index 6591e26fc13f..3e867b2f067f 100644
> --- a/arch/arm/kernel/smp_twd.c
> +++ b/arch/arm/kernel/smp_twd.c
> @@ -32,6 +32,7 @@ static void __iomem *twd_base;
>  static struct clk *twd_clk;
>  static unsigned long twd_timer_rate;
>  static DEFINE_PER_CPU(bool, percpu_setup_called);
> +static int feat_c3stop = CLOCK_EVT_FEAT_C3STOP;
>  
>  static struct clock_event_device __percpu *twd_evt;
>  static int twd_ppi;
> @@ -294,7 +295,7 @@ static void twd_timer_setup(void)
>  
>         clk->name = "local_timer";
>         clk->features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT |
> -                       CLOCK_EVT_FEAT_C3STOP;
> +                       feat_c3stop;
>         clk->rating = 350;
>         clk->set_mode = twd_set_mode;
>         clk->set_next_event = twd_set_next_event;
> @@ -346,6 +347,7 @@ static int __init twd_local_timer_common_register(struct device_node *np)
>                 goto out_irq;
>  
>         twd_get_clock(np);
> +       if (of_get_property(np, "twd_never_stops", NULL)) feat_c3stop = 0;

of_property_read_bool(), and newline between the condition and the
assignment.

>         /*
>          * Immediately configure the timer on the boot CPU, unless we need
> 
> 
> Or perhaps the other way around?
> i.e. feat_c3stop initialized to 0 (thus in the bss section)
> and set to FEAT_C3STOP if "twd_never_stops" doesn't exist...

yes.

> Russell, when you added the FEAT_C3STOP flag unconditionally in
> commit 5388a6b266, didn't that potentially break platforms that
> didn't expect the flag to be set?

To take a step back, you should first figure out whether clearing
this flag is actually the correct behavior for your hardware, or
just happens to work by accident.

	Arnd

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-20 21:56                                                     ` Mason
  2015-05-20 22:18                                                       ` Arnd Bergmann
@ 2015-05-20 23:14                                                       ` Russell King - ARM Linux
  2015-05-21  9:56                                                         ` Mason
  1 sibling, 1 reply; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-20 23:14 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, May 20, 2015 at 11:56:31PM +0200, Mason wrote:
> Russell, when you added the FEAT_C3STOP flag unconditionally in
> commit 5388a6b266, didn't that potentially break platforms that
> didn't expect the flag to be set?

Why would it break any platforms at the time it was merged?

You're only having problems because you don't provide a global timer
to the kernel, which the C3STOP feature does require - and that's
because your platform appears to be very limited in what it can
provide.  All other platforms at the time provided a global timer,
so the scenario you're facing never existed.

Maybe you also should read the discussion around this which is over 5
years old.  You can find some initial discussion via these message IDs:

alpine.LFD.2.00.1004171152080.3625 at localhost.localdomain

And the thread "SMP Local timer and power management" from May 2010.

Yes, making it conditional depending on the platform was mooted, but
it seemed unnecessary.  The fact that no one until now has had a
problem with it is testament to the approach being correct for the
hardware that was in use over the last five years.  No bug reports
means no problem.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-20 23:14                                                       ` Russell King - ARM Linux
@ 2015-05-21  9:56                                                         ` Mason
  2015-05-21 10:20                                                           ` Russell King - ARM Linux
  0 siblings, 1 reply; 77+ messages in thread
From: Mason @ 2015-05-21  9:56 UTC (permalink / raw)
  To: linux-arm-kernel

On 21/05/2015 01:14, Russell King - ARM Linux wrote:

Do you ever sleep? :-)

> On Wed, May 20, 2015 at 11:56:31PM +0200, Mason wrote:
>
>> Russell, when you added the FEAT_C3STOP flag unconditionally in
>> commit 5388a6b266, didn't that potentially break platforms that
>> didn't expect the flag to be set?
> 
> Why would it break any platforms at the time it was merged?
> 
> You're only having problems because you don't provide a global timer
> to the kernel, which the C3STOP feature does require - and that's
> because your platform appears to be very limited in what it can
> provide.

For the record, I don't think it is the platform that is limited;
it is my understanding of Linux internals and interrupt management
that is insufficient. (I'm working on it.)

AFAICT, most of the Linux know-how is not explicitly spelled out in
Documentation, it's often bits-and-pieces hidden in long ML threads.
(Sites like LWN are a blessing for grunts like me. I await eagerly
the release of LDD4.)

> All other platforms at the time provided a global timer, so the
> scenario you're facing never existed.

Sorry, my limitations are showing... Take mach-ux500 for example.
Did it provide a global timer at the time?

$ git checkout 5388a6b266
$ git grep clock.*event arch/arm/mach-ux500
arch/arm/mach-ux500/localtimer.c: * Setup the local clock events for a CPU.
arch/arm/mach-ux500/localtimer.c:void __cpuinit local_timer_setup(struct clock_event_device *evt)

> Maybe you also should read the discussion around this which is over 5
> years old.  You can find some initial discussion via these message IDs:
> 
> alpine.LFD.2.00.1004171152080.3625 at localhost.localdomain
> And the thread "SMP Local timer and power management" from May 2010.

I downloaded gmane's NNTP archive for linux-arm-kernel (415000 message),
going back to 2002.

$ grep alpine.LFD.2.00.1004171152080.3625@ gmane.linux.ports.arm.kernel.msf
$ grep -i "SMP Local timer and power management" gmane.linux.ports.arm.kernel.msf

No hits using gmane's search interface either.

> Yes, making it conditional depending on the platform was mooted, but
> it seemed unnecessary.  The fact that no one until now has had a
> problem with it is testament to the approach being correct for the
> hardware that was in use over the last five years.  No bug reports
> means no problem.

I agree with this pragmatic stance.

Regards.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-21  9:56                                                         ` Mason
@ 2015-05-21 10:20                                                           ` Russell King - ARM Linux
  0 siblings, 0 replies; 77+ messages in thread
From: Russell King - ARM Linux @ 2015-05-21 10:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, May 21, 2015 at 11:56:20AM +0200, Mason wrote:
> On 21/05/2015 01:14, Russell King - ARM Linux wrote:
> 
> Do you ever sleep? :-)

Yes, normally between about midnight-1am until 9am London time.  All
other hours are normally spent trying to keep up with the email deluge
(which gives me no satisfication what so ever, it feels like I haven't
achieved anything) or trying to get some real practical work done.

> Sorry, my limitations are showing... Take mach-ux500 for example.
> Did it provide a global timer at the time?

Yes.

> $ git checkout 5388a6b266
> $ git grep clock.*event arch/arm/mach-ux500
> arch/arm/mach-ux500/localtimer.c: * Setup the local clock events for a CPU.
> arch/arm/mach-ux500/localtimer.c:void __cpuinit local_timer_setup(struct clock_event_device *evt)

I have to wonder wtf you are trying to prove me wrong... in any case,
your "example" is too short-sighted to be meaningful.  If you want to
try and find an example, you have to read the code, and follow the code
paths.  Let's take your example of ux500.

$ git diff -u 5388a6b266.. -- arch/arm/mach-ux500 arch/arm/plat-nomadik

and search the output of that.  What you'll find is that ux500 has:

-MACHINE_START(U8500, "ST-Ericsson U5500 Platform")
-       .timer          = &ux500_timer,

-struct sys_timer ux500_timer = {
-       .init   = ux500_timer_init,
-};

and tracing this code...

-static void __init ux500_timer_init(void)
 {
-#ifdef CONFIG_LOCAL_TIMERS
-       /* Setup the local timer base */
-       twd_base = __io_address(UX500_TWD_BASE);
-#endif
-       /* Setup the MTU base */
-       if (cpu_is_u8500ed())
-               mtu_base = __io_address(U8500_MTU0_BASE_ED);
-       else
-               mtu_base = __io_address(UX500_MTU0_BASE);
-
-       nmdk_timer_init();
 }

Now, nmdk_timer_init() is in arch/arm/plat-nomadik/timer.c, and that
sets up the clocksouce for timekeeping, and the global clock event.
I'll let you read that code.

> I downloaded gmane's NNTP archive for linux-arm-kernel (415000 message),
> going back to 2002.
> 
> $ grep alpine.LFD.2.00.1004171152080.3625@ gmane.linux.ports.arm.kernel.msf
> $ grep -i "SMP Local timer and power management" gmane.linux.ports.arm.kernel.msf
> 
> No hits using gmane's search interface either.

Yea, sorry, those messages weren't on a mailing list, but were a private
discussion between Thomas, Catalin, Santosh and others.  As it's private,
it can't be published without the consent of everyone involved.  As people
would have been talking on behalf of their companies, and they have moved
on, it's highly unlikely that the people could give consent now.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* schedule_timeout sleeps too long after dividing CPU frequency
  2015-05-20 22:18                                                       ` Arnd Bergmann
@ 2015-05-21 12:35                                                         ` Mason
  0 siblings, 0 replies; 77+ messages in thread
From: Mason @ 2015-05-21 12:35 UTC (permalink / raw)
  To: linux-arm-kernel

On 21/05/2015 00:18, Arnd Bergmann wrote:

> On Wednesday 20 May 2015 23:56:31 Mason wrote:
>
>> Or perhaps the other way around?
>> i.e. feat_c3stop initialized to 0 (thus in the bss section)
>> and set to FEAT_C3STOP if "twd_never_stops" doesn't exist...
> 
> yes.

IIUC, something along these lines:

diff --git a/arch/arm/kernel/smp_twd.c b/arch/arm/kernel/smp_twd.c
index 172c6a05..e10a388 100644
--- a/arch/arm/kernel/smp_twd.c
+++ b/arch/arm/kernel/smp_twd.c
@@ -34,6 +34,7 @@ static unsigned long twd_timer_rate;
 static DEFINE_PER_CPU(bool, percpu_setup_called);
 
 static struct clock_event_device __percpu *twd_evt;
+static int feat_c3stop;
 static int twd_ppi;
 
 static void twd_set_mode(enum clock_event_mode mode,
@@ -294,7 +295,7 @@ static void twd_timer_setup(void)
 
 	clk->name = "local_timer";
 	clk->features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT |
-			CLOCK_EVT_FEAT_C3STOP;
+			feat_c3stop;
 	clk->rating = 350;
 	clk->set_mode = twd_set_mode;
 	clk->set_next_event = twd_set_next_event;
@@ -346,6 +347,8 @@ static int __init twd_local_timer_common_register(struct device_node *np)
 		goto out_irq;
 
 	twd_get_clock(np);
+	if (!of_property_read_bool(np, "twd_never_stops"))
+		feat_c3stop = CLOCK_EVT_FEAT_C3STOP;
 
 	/*
 	 * Immediately configure the timer on the boot CPU, unless we need


>> Russell, when you added the FEAT_C3STOP flag unconditionally in
>> commit 5388a6b266, didn't that potentially break platforms that
>> didn't expect the flag to be set?
> 
> To take a step back, you should first figure out whether clearing
> this flag is actually the correct behavior for your hardware, or
> just happens to work by accident.

According to my (limited) understanding of the clockevents core,
a "broadcast device" is required if and only if the local timers
have CLOCK_EVT_FEAT_C3STOP. I could be wrong.

Additional reference:
https://lwn.net/Articles/574962/

Regards.

^ permalink raw reply related	[flat|nested] 77+ messages in thread

end of thread, other threads:[~2015-05-21 12:35 UTC | newest]

Thread overview: 77+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-12 14:32 schedule_timeout sleeps too long after dividing CPU frequency Mason
2015-05-12 14:32 ` Mason
2015-05-12 14:46 ` Viresh Kumar
2015-05-12 14:46   ` Viresh Kumar
2015-05-12 15:14   ` Mason
2015-05-12 15:14     ` Mason
2015-05-12 15:50     ` Russell King - ARM Linux
2015-05-12 15:50       ` Russell King - ARM Linux
2015-05-12 16:14       ` Mason
2015-05-12 16:14         ` Mason
2015-05-13 16:51       ` Mason
2015-05-13 16:51         ` Mason
2015-05-14  2:13         ` Viresh Kumar
2015-05-14  2:13           ` Viresh Kumar
2015-05-14 11:22           ` Mason
2015-05-14 11:22             ` Mason
2015-05-14 11:54             ` Viresh Kumar
2015-05-14 11:54               ` Viresh Kumar
2015-05-14 13:06               ` Mason
2015-05-14 13:06                 ` Mason
2015-05-14 13:53                 ` Russell King - ARM Linux
2015-05-14 13:53                   ` Russell King - ARM Linux
2015-05-14 14:51                   ` Mason
2015-05-14 14:51                     ` Mason
2015-05-14 13:59                 ` Viresh Kumar
2015-05-14 13:59                   ` Viresh Kumar
2015-05-14 13:59                   ` Viresh Kumar
2015-05-14 14:38                   ` Viresh Kumar
2015-05-14 14:38                     ` Viresh Kumar
2015-05-14 14:42                   ` Russell King - ARM Linux
2015-05-14 14:42                     ` Russell King - ARM Linux
2015-05-15  9:29                     ` Mason
2015-05-15  9:29                       ` Mason
2015-05-15  9:51                       ` Russell King - ARM Linux
2015-05-15  9:51                         ` Russell King - ARM Linux
2015-05-15 10:01                         ` Viresh Kumar
2015-05-15 10:01                           ` Viresh Kumar
2015-05-15 10:36                         ` Mason
2015-05-15 10:36                           ` Mason
2015-05-15 11:58                           ` Russell King - ARM Linux
2015-05-15 11:58                             ` Russell King - ARM Linux
2015-05-15 12:45                             ` Mason
2015-05-15 12:45                               ` Mason
2015-05-15 13:15                               ` Russell King - ARM Linux
2015-05-15 13:15                                 ` Russell King - ARM Linux
2015-05-15 13:58                                 ` Mason
2015-05-15 18:35                                   ` Mason
2015-05-18 11:24                                     ` Mason
2015-05-18 11:54                                       ` Russell King - ARM Linux
2015-05-20 16:21                                         ` Mason
2015-05-20 18:50                                           ` Arnd Bergmann
2015-05-20 19:34                                             ` Mason
2015-05-20 20:14                                               ` Russell King - ARM Linux
2015-05-20 20:41                                                 ` Mason
2015-05-20 20:52                                                   ` Arnd Bergmann
2015-05-20 21:56                                                     ` Mason
2015-05-20 22:18                                                       ` Arnd Bergmann
2015-05-21 12:35                                                         ` Mason
2015-05-20 23:14                                                       ` Russell King - ARM Linux
2015-05-21  9:56                                                         ` Mason
2015-05-21 10:20                                                           ` Russell King - ARM Linux
2015-05-14 14:48                   ` Mason
2015-05-14 14:48                     ` Mason
2015-05-15  4:16                     ` Viresh Kumar
2015-05-15  4:16                       ` Viresh Kumar
2015-05-15  5:07                       ` Viresh Kumar
2015-05-15  5:07                         ` Viresh Kumar
2015-05-15  9:00                       ` Russell King - ARM Linux
2015-05-15  9:00                         ` Russell King - ARM Linux
2015-05-15  9:21                       ` Mason
2015-05-15  9:21                         ` Mason
2015-05-15 10:11                       ` Mason
2015-05-15 10:11                         ` Mason
2015-05-12 15:23 ` Russell King - ARM Linux
2015-05-12 15:23   ` Russell King - ARM Linux
2015-05-12 16:03   ` Mason
2015-05-12 16:03     ` Mason

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.