All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mason <slash.tmp@free.fr>
To: Viresh Kumar <viresh.kumar@linaro.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Linux ARM <linux-arm-kernel@lists.infradead.org>,
	cpufreq <cpufreq@vger.kernel.org>,
	Linux PM <linux-pm@vger.kernel.org>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>
Subject: Re: schedule_timeout sleeps too long after dividing CPU frequency
Date: Tue, 12 May 2015 17:14:15 +0200	[thread overview]
Message-ID: <555218C7.5050602@free.fr> (raw)
In-Reply-To: <CAKohpo=eMA6L9=jK=bAnbFuDxcdtpQNub0WFnSy6MOqB3jOdjA@mail.gmail.com>

On 12/05/2015 16:46, Viresh Kumar wrote:

> On 12 May 2015 at 20:02, Mason wrote:
>
>> I'm working on a Cortex A9 based platform.
>>
>> I have a basic clock tree, and a very basic cpufreq driver using
>> mostly generic driver glue:
>>
>> static struct cpufreq_driver tangox_cpufreq_driver = {
>>         .name           = "tangox-cpufreq",
>>         .init           = tangox_cpu_init,
>>         .verify         = cpufreq_generic_frequency_table_verify,
>>         .target_index   = tangox_target,
>>         .get            = cpufreq_generic_get,
>>         .exit           = cpufreq_generic_exit,
>>         .attr           = cpufreq_generic_attr,
>> };
>>
>> My target_index function is trivial:
>>
>> static int tangox_target(struct cpufreq_policy *policy, unsigned int idx)
>> {
>>         return clk_set_rate(policy->clk, freq_table[idx].frequency * 1000);
>> }
>>
>> I was testing an unrelated driver at low frequencies, with the nominal
>> frequency (999 MHz) divided by 54 (i.e. freq = 18.5 MHz) and I noticed
>> that when the driver calls
>>
>>     schedule_timeout(HZ);
>>
>> the thread sleeps 54 seconds instead of 1.
>>
>> [ 1107.827279] Sleep for 300 jiffies.
>> [ 1161.838513] Done sleeping.
>>
>> (I have HZ set to 300)
>>
>> I enabled DEBUG in the generic cpufreq driver to see what happens
>> when I request a frequency change:
>>
>> # cd /sys/devices/system/cpu/cpu0/cpufreq
>> # cat scaling_governor
>> performance
>> # cat scaling_available_frequencies
>> 999000 499500 333000 199800 111000 18500
>> # echo 18500 > scaling_max_freq
>> [   19.597257] cpufreq: setting new policy for CPU 0: 18500 - 18500 kHz
>> [   19.604017] cpufreq: new min and max freqs are 18500 - 18500 kHz
>> [   19.610345] cpufreq: governor: change or update limits
>> [   19.615596] cpufreq: __cpufreq_governor for CPU 0, event 3
>> [   19.621186] cpufreq: target for CPU 0: 18500 kHz, relation 1, requested 18500 kHz
>> [   19.628783] cpufreq: __cpufreq_driver_target: cpu: 0, oldfreq: 999000, new freq: 18500
>> [   19.636818] cpufreq: notification 0 of frequency transition to 18500 kHz
>> [   19.643625] cpufreq: notification 0 of frequency transition to 18500 kHz
>> [   19.650454] NEW RATE=9250000
>> [   19.653644] NEW RATE=9250000
>> [   19.657091] cpufreq: notification 1 of frequency transition to 18500 kHz
>> [   19.664176] cpufreq: FREQ: 18500 - CPU: 0
>> [   19.668412] cpufreq: notification 1 of frequency transition to 18500 kHz
>> [   19.675648] cpufreq: FREQ: 18500 - CPU: 1
>>
>> The "NEW RATE" traces are ones I added in smp_twd.c:twd_update_frequency()
>> (my clockevent source) to check whether the CPU frequency change propagated
>> to smp_twd.
>>
>> I must have made an obvious mistake. Could someone point it out to me?
>> (I have attached the output of /proc/timer_list to this message)
>>
>> Looking more closely at schedule_timeout(), I suppose the work happens
>> in __mod_timer()... Hmm, that one is over my head.
>>
>> What am I missing?
> 
> A Broadcast device ? So you have two CPUs with a local timer each, whose
> expires-next is set to KTIME_MAX (infinite).... So you will get interrupted
> once the counter has overflown, probably that's what 54 seconds is all about..

If I divide the CPU clock by 54, then
schedule_timeout(HZ) sleeps 54 seconds (instead of 1)
schedule_timeout(HZ/10) sleeps 5.4 seconds (instead of 0.1)

If I divide the CPU clock by 9, then
schedule_timeout(HZ) sleeps 9 seconds (instead of 1)
schedule_timeout(HZ/10) sleeps 0.9 seconds (instead of 0.1)

I didn't test other dividers, but I am convinced that if I divide
the CPU clock by N, schedule_timeout(HZ) will sleep N seconds.

It looks like my system is calculating the number of cycles to wait
using the nominal (higher) frequency, and so the system waits N times
too long, because the actual clock is N times slower than nominal.

> Probably your CPUs are going into idle and no one is waking them up and that's
> why you need a broadcast device, i.e. another global timer on your SoC, outside
> of the CPU subsystem.

This ties in to another thread I started in LAKML:
("High-resolution timers not supported when using smp_twd on Cortex A9")

$ git show 5388a6b2 arch/arm/kernel/smp_twd.c
commit 5388a6b266e9c3357353332ba0cd5549082887f1
Author: Russell King <rmk+kernel@arm.linux.org.uk>
Date:   Mon Jul 26 13:19:43 2010 +0100

    ARM: SMP: Always enable clock event broadcast support
    
    The TWD local timers are unable to wake up the CPU when it is placed
    into a low power mode, eg. C3.  Therefore, we need to adapt things
    such that the TWD code can cope with this.
    
    We do this by always providing a broadcast tick function, and marking
    the fact that the TWD local timer will stop in low power modes.  This
    means that when the CPU is placed into a low power mode, the core
    timer code marks this fact, and allows an IPI to be given to the core.

This mentions a "broadcast tick function" (of which I know nothing).
Is this what you're referring to?

> It doesn't have anything to do with cpufreq AFAICT ..

I'm sure it's not in the cpufreq driver, if that's what you mean.
It's probably something I did, or didn't do. But it happens when
I change the frequency of the CPU, so I hoped someone on cpufreq
would spot my blunder.

Regards.


WARNING: multiple messages have this Message-ID (diff)
From: slash.tmp@free.fr (Mason)
To: linux-arm-kernel@lists.infradead.org
Subject: schedule_timeout sleeps too long after dividing CPU frequency
Date: Tue, 12 May 2015 17:14:15 +0200	[thread overview]
Message-ID: <555218C7.5050602@free.fr> (raw)
In-Reply-To: <CAKohpo=eMA6L9=jK=bAnbFuDxcdtpQNub0WFnSy6MOqB3jOdjA@mail.gmail.com>

On 12/05/2015 16:46, Viresh Kumar wrote:

> On 12 May 2015 at 20:02, Mason wrote:
>
>> I'm working on a Cortex A9 based platform.
>>
>> I have a basic clock tree, and a very basic cpufreq driver using
>> mostly generic driver glue:
>>
>> static struct cpufreq_driver tangox_cpufreq_driver = {
>>         .name           = "tangox-cpufreq",
>>         .init           = tangox_cpu_init,
>>         .verify         = cpufreq_generic_frequency_table_verify,
>>         .target_index   = tangox_target,
>>         .get            = cpufreq_generic_get,
>>         .exit           = cpufreq_generic_exit,
>>         .attr           = cpufreq_generic_attr,
>> };
>>
>> My target_index function is trivial:
>>
>> static int tangox_target(struct cpufreq_policy *policy, unsigned int idx)
>> {
>>         return clk_set_rate(policy->clk, freq_table[idx].frequency * 1000);
>> }
>>
>> I was testing an unrelated driver at low frequencies, with the nominal
>> frequency (999 MHz) divided by 54 (i.e. freq = 18.5 MHz) and I noticed
>> that when the driver calls
>>
>>     schedule_timeout(HZ);
>>
>> the thread sleeps 54 seconds instead of 1.
>>
>> [ 1107.827279] Sleep for 300 jiffies.
>> [ 1161.838513] Done sleeping.
>>
>> (I have HZ set to 300)
>>
>> I enabled DEBUG in the generic cpufreq driver to see what happens
>> when I request a frequency change:
>>
>> # cd /sys/devices/system/cpu/cpu0/cpufreq
>> # cat scaling_governor
>> performance
>> # cat scaling_available_frequencies
>> 999000 499500 333000 199800 111000 18500
>> # echo 18500 > scaling_max_freq
>> [   19.597257] cpufreq: setting new policy for CPU 0: 18500 - 18500 kHz
>> [   19.604017] cpufreq: new min and max freqs are 18500 - 18500 kHz
>> [   19.610345] cpufreq: governor: change or update limits
>> [   19.615596] cpufreq: __cpufreq_governor for CPU 0, event 3
>> [   19.621186] cpufreq: target for CPU 0: 18500 kHz, relation 1, requested 18500 kHz
>> [   19.628783] cpufreq: __cpufreq_driver_target: cpu: 0, oldfreq: 999000, new freq: 18500
>> [   19.636818] cpufreq: notification 0 of frequency transition to 18500 kHz
>> [   19.643625] cpufreq: notification 0 of frequency transition to 18500 kHz
>> [   19.650454] NEW RATE=9250000
>> [   19.653644] NEW RATE=9250000
>> [   19.657091] cpufreq: notification 1 of frequency transition to 18500 kHz
>> [   19.664176] cpufreq: FREQ: 18500 - CPU: 0
>> [   19.668412] cpufreq: notification 1 of frequency transition to 18500 kHz
>> [   19.675648] cpufreq: FREQ: 18500 - CPU: 1
>>
>> The "NEW RATE" traces are ones I added in smp_twd.c:twd_update_frequency()
>> (my clockevent source) to check whether the CPU frequency change propagated
>> to smp_twd.
>>
>> I must have made an obvious mistake. Could someone point it out to me?
>> (I have attached the output of /proc/timer_list to this message)
>>
>> Looking more closely at schedule_timeout(), I suppose the work happens
>> in __mod_timer()... Hmm, that one is over my head.
>>
>> What am I missing?
> 
> A Broadcast device ? So you have two CPUs with a local timer each, whose
> expires-next is set to KTIME_MAX (infinite).... So you will get interrupted
> once the counter has overflown, probably that's what 54 seconds is all about..

If I divide the CPU clock by 54, then
schedule_timeout(HZ) sleeps 54 seconds (instead of 1)
schedule_timeout(HZ/10) sleeps 5.4 seconds (instead of 0.1)

If I divide the CPU clock by 9, then
schedule_timeout(HZ) sleeps 9 seconds (instead of 1)
schedule_timeout(HZ/10) sleeps 0.9 seconds (instead of 0.1)

I didn't test other dividers, but I am convinced that if I divide
the CPU clock by N, schedule_timeout(HZ) will sleep N seconds.

It looks like my system is calculating the number of cycles to wait
using the nominal (higher) frequency, and so the system waits N times
too long, because the actual clock is N times slower than nominal.

> Probably your CPUs are going into idle and no one is waking them up and that's
> why you need a broadcast device, i.e. another global timer on your SoC, outside
> of the CPU subsystem.

This ties in to another thread I started in LAKML:
("High-resolution timers not supported when using smp_twd on Cortex A9")

$ git show 5388a6b2 arch/arm/kernel/smp_twd.c
commit 5388a6b266e9c3357353332ba0cd5549082887f1
Author: Russell King <rmk+kernel@arm.linux.org.uk>
Date:   Mon Jul 26 13:19:43 2010 +0100

    ARM: SMP: Always enable clock event broadcast support
    
    The TWD local timers are unable to wake up the CPU when it is placed
    into a low power mode, eg. C3.  Therefore, we need to adapt things
    such that the TWD code can cope with this.
    
    We do this by always providing a broadcast tick function, and marking
    the fact that the TWD local timer will stop in low power modes.  This
    means that when the CPU is placed into a low power mode, the core
    timer code marks this fact, and allows an IPI to be given to the core.

This mentions a "broadcast tick function" (of which I know nothing).
Is this what you're referring to?

> It doesn't have anything to do with cpufreq AFAICT ..

I'm sure it's not in the cpufreq driver, if that's what you mean.
It's probably something I did, or didn't do. But it happens when
I change the frequency of the CPU, so I hoped someone on cpufreq
would spot my blunder.

Regards.

  reply	other threads:[~2015-05-12 15:14 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-12 14:32 schedule_timeout sleeps too long after dividing CPU frequency Mason
2015-05-12 14:32 ` Mason
2015-05-12 14:46 ` Viresh Kumar
2015-05-12 14:46   ` Viresh Kumar
2015-05-12 15:14   ` Mason [this message]
2015-05-12 15:14     ` Mason
2015-05-12 15:50     ` Russell King - ARM Linux
2015-05-12 15:50       ` Russell King - ARM Linux
2015-05-12 16:14       ` Mason
2015-05-12 16:14         ` Mason
2015-05-13 16:51       ` Mason
2015-05-13 16:51         ` Mason
2015-05-14  2:13         ` Viresh Kumar
2015-05-14  2:13           ` Viresh Kumar
2015-05-14 11:22           ` Mason
2015-05-14 11:22             ` Mason
2015-05-14 11:54             ` Viresh Kumar
2015-05-14 11:54               ` Viresh Kumar
2015-05-14 13:06               ` Mason
2015-05-14 13:06                 ` Mason
2015-05-14 13:53                 ` Russell King - ARM Linux
2015-05-14 13:53                   ` Russell King - ARM Linux
2015-05-14 14:51                   ` Mason
2015-05-14 14:51                     ` Mason
2015-05-14 13:59                 ` Viresh Kumar
2015-05-14 13:59                   ` Viresh Kumar
2015-05-14 13:59                   ` Viresh Kumar
2015-05-14 14:38                   ` Viresh Kumar
2015-05-14 14:38                     ` Viresh Kumar
2015-05-14 14:42                   ` Russell King - ARM Linux
2015-05-14 14:42                     ` Russell King - ARM Linux
2015-05-15  9:29                     ` Mason
2015-05-15  9:29                       ` Mason
2015-05-15  9:51                       ` Russell King - ARM Linux
2015-05-15  9:51                         ` Russell King - ARM Linux
2015-05-15 10:01                         ` Viresh Kumar
2015-05-15 10:01                           ` Viresh Kumar
2015-05-15 10:36                         ` Mason
2015-05-15 10:36                           ` Mason
2015-05-15 11:58                           ` Russell King - ARM Linux
2015-05-15 11:58                             ` Russell King - ARM Linux
2015-05-15 12:45                             ` Mason
2015-05-15 12:45                               ` Mason
2015-05-15 13:15                               ` Russell King - ARM Linux
2015-05-15 13:15                                 ` Russell King - ARM Linux
2015-05-15 13:58                                 ` Mason
2015-05-15 18:35                                   ` Mason
2015-05-18 11:24                                     ` Mason
2015-05-18 11:54                                       ` Russell King - ARM Linux
2015-05-20 16:21                                         ` Mason
2015-05-20 18:50                                           ` Arnd Bergmann
2015-05-20 19:34                                             ` Mason
2015-05-20 20:14                                               ` Russell King - ARM Linux
2015-05-20 20:41                                                 ` Mason
2015-05-20 20:52                                                   ` Arnd Bergmann
2015-05-20 21:56                                                     ` Mason
2015-05-20 22:18                                                       ` Arnd Bergmann
2015-05-21 12:35                                                         ` Mason
2015-05-20 23:14                                                       ` Russell King - ARM Linux
2015-05-21  9:56                                                         ` Mason
2015-05-21 10:20                                                           ` Russell King - ARM Linux
2015-05-14 14:48                   ` Mason
2015-05-14 14:48                     ` Mason
2015-05-15  4:16                     ` Viresh Kumar
2015-05-15  4:16                       ` Viresh Kumar
2015-05-15  5:07                       ` Viresh Kumar
2015-05-15  5:07                         ` Viresh Kumar
2015-05-15  9:00                       ` Russell King - ARM Linux
2015-05-15  9:00                         ` Russell King - ARM Linux
2015-05-15  9:21                       ` Mason
2015-05-15  9:21                         ` Mason
2015-05-15 10:11                       ` Mason
2015-05-15 10:11                         ` Mason
2015-05-12 15:23 ` Russell King - ARM Linux
2015-05-12 15:23   ` Russell King - ARM Linux
2015-05-12 16:03   ` Mason
2015-05-12 16:03     ` Mason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=555218C7.5050602@free.fr \
    --to=slash.tmp@free.fr \
    --cc=cpufreq@vger.kernel.org \
    --cc=daniel.lezcano@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=rjw@rjwysocki.net \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.