* Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
From: Rafael J. Wysocki @ 2016-02-15 18:41 UTC
To: linux-arm-kernel
On Mon, Feb 15, 2016 at 6:05 PM, Guenter Roeck <linux@roeck-us.net> wrote:
> Rafael,
Hi,
Thanks for the report!
> I see crashes in various arm qemu tests due to 'cpufreq: governor: Replace
> timers with utilization update callbacks' with next-20160215. An example
> crash log and bisect results are attached below.
>
> Please let me know if there is anything I can do to help tracking down
> the problem.
It looks like we've uncovered some nastiness in the arch ARM code (see below).
[cut]
> [ 1.340000] Unable to handle kernel NULL pointer dereference at virtual address 00000000
> [ 1.340000] pgd = c0204000
> [ 1.340000] [00000000] *pgd=00000000
> [ 1.340000] Internal error: Oops: 80000005 [#1] SMP ARM
> [ 1.340000] Modules linked in:
> [ 1.340000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc4-next-20160215 #1
> [ 1.340000] Hardware name: Generic OMAP3-GP (Flattened Device Tree)
> [ 1.340000] task: cb060000 ti: cb05a000 task.ti: cb05a000
> [ 1.340000] PC is at 0x0
> [ 1.340000] LR is at arch_send_call_function_single_ipi+0x34/0x38
Since this is ARM, arch_send_call_function_single_ipi() looks like this:
void arch_send_call_function_single_ipi(int cpu)
{
	smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC_SINGLE);
}
so I'm not sure how the NULL pointer deref is possible even.
The only thing coming to mind would be that cpumask_of(cpu) triggers
this, but I'm not sure how exactly that can happen.
I need help from somebody who knows how this low-level stuff works on ARM.
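For reference, a PC of 0x0 with LR pointing into the caller generally indicates a branch through a NULL function pointer rather than a data access through a NULL pointer. The sketch below is a user-space C model of that failure mode. It assumes (this is a guess that the log does not confirm) that the IPI path goes through a hook that platform code is expected to register. All names in it are hypothetical and are not the kernel's.

```c
#include <stddef.h>

/* Hypothetical names throughout; this is not the kernel's code. */
typedef void (*cross_call_fn)(int cpu, unsigned int ipi);

/* Stays NULL until some platform code registers a backend. */
static cross_call_fn cross_call_hook;

/* Records the last CPU the demo backend was asked to signal. */
static int last_cpu = -1;

static void demo_backend(int cpu, unsigned int ipi)
{
	(void)ipi;
	last_cpu = cpu;
}

/* Only the first registration wins; later ones are ignored. */
static void register_cross_call(cross_call_fn fn)
{
	if (!cross_call_hook)
		cross_call_hook = fn;
}

/*
 * Returns 0 on success, -1 if no backend has been registered.
 * Without the NULL check, the indirect call below would branch
 * to address 0 and leave the return address in this function.
 */
static int send_single_ipi(int cpu, unsigned int ipi)
{
	if (!cross_call_hook)
		return -1;
	cross_call_hook(cpu, ipi);
	return 0;
}
```

If the guess holds, the unguarded version of send_single_ipi() called before any registration would produce exactly the "PC is at 0x0" pattern, with LR inside the calling function.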
> [ 1.340000] pc : [<00000000>] lr : [<c030de78>] psr: 20000193
> [ 1.340000] sp : cb05b7c0 ip : 00000000 fp : cb05b83c
> [ 1.340000] r10: cfb8c0c0 r9 : 00000000 r8 : cb18a4c0
> [ 1.340000] r7 : 00000005 r6 : 00000005 r5 : cb5c0334 r4 : 00000000
> [ 1.340000] r3 : 00000000 r2 : c0c06a7c r1 : 00000003 r0 : c0c06a7c
> [ 1.340000] Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none
> [ 1.340000] Control: 10c5387d Table: 80204059 DAC: 00000051
> [ 1.340000] Process swapper/0 (pid: 1, stack limit = 0xcb05a220)
> [ 1.340000] Stack: (0xcb05b7c0 to 0xcb05c000)
> [ 1.340000] b7c0: 00000000 c03b3350 4fdec700 00000000 00000005 c0959a84 ffffffff 00000000
> [ 1.340000] b7e0: ffffffff cb18a4c0 cfb8c0c0 c03732d8 4c4b4000 cb18a4c0 cfb8c0c0 cfb8c0c0
> [ 1.340000] b800: 0e979000 cb18a4c0 cfb8c0c0 00000005 0e979000 c12130c0 00000000 cfb8c0c0
> [ 1.340000] b820: cb05b83c c0360d28 00000000 cb18a4c0 cfb8c0c0 60000193 cb05b84c c0360fc0
> [ 1.340000] b840: cb18a4c0 cb18a8b4 cb05b87c c0361b74 cfb8c100 00000141 cb05b934 cb1c1cc0
> [ 1.340000] b860: 00000002 00000000 00000000 00000048 c1416d0c cb0096c0 00000001 c0381de0
> [ 1.340000] b880: c1416080 cfb8c100 00000400 cb0096c0 cb009720 00000000 00000038 cb003000
> [ 1.340000] b8a0: 00000000 cb05b9c4 00000a28 c0381ea4 cb0096c0 cb0096d0 00000000 c0385150
> [ 1.340000] b8c0: c03850ac c1211518 00000000 c038168c 00000155 c0381788 c0932830 20000013
> [ 1.340000] b8e0: ffffffff cb05b924 00000000 c030bad4 00000001 00000009 00000002 fa070024
> [ 1.340000] b900: cb127c10 00009401 cb05b9b8 c1302100 00000000 00000000 cb05b9c4 00000a28
> [ 1.340000] b920: 00000000 cb05b940 00009601 c0932830 20000013 ffffffff 00000051 c093261c
> [ 1.340000] b940: 00000014 cb127c58 00000002 00000001 000f4240 cb127c10 1443fd00 00000001
> [ 1.340000] b960: c1302100 cb127c58 cb05b9b8 00000002 c145d438 ffff16ac 00000001 c0928358
> [ 1.340000] b980: cb127c74 cb127c58 00000002 cb05b9b8 cb05ba97 00000001 cb05ba97 00000001
> [ 1.340000] b9a0: 00000001 c0928538 00000000 cb518000 cb513740 c07726c4 0000004b cfb80001
> [ 1.340000] b9c0: cb513740 0001004b 017d0001 cb05ba97 00000000 c076dc30 00000001 00000000
> [ 1.340000] b9e0: 00000004 000000b9 000000ba cb518000 000000ba 000000b9 00000001 c076dd70
> [ 1.340000] ba00: 00000000 00000000 cfb8c100 cb518000 000000ba 00000001 00000001 cb05ba97
> [ 1.340000] ba20: 00000001 000000b9 00000000 c076dfcc c099a208 cb59d048 00000001 c1336dd0
> [ 1.340000] ba40: a0000113 00000000 00000001 cb05ba97 0000005e 00000004 00000001 00000000
> [ 1.340000] ba60: 00000000 000ee098 000ee098 c077fd34 0000000d c09e51f0 c09e51d0 cb51f400
> [ 1.340000] ba80: ffffffff 000ee098 000ee098 c068cb48 00000000 c09c157c cb019180 c067887c
> [ 1.340000] baa0: cb51f400 c067a700 000ee098 c09c160c cb015780 00000000 3b9aca00 cb5bdcc0
> [ 1.340000] bac0: cb51f400 00000000 00000000 00000000 000ee098 c067ab5c 000ee098 000ee098
> [ 1.340000] bae0: cb5bdcc0 000ee098 000ee098 000ee098 cfb87050 00000000 000ee098 c067c614
> [ 1.340000] bb00: cb5bdcc0 000ee098 000ee098 c0765ad4 1dcd6500 cb5bdc80 00000000 07735940
> [ 1.340000] bb20: cb5bdc80 cfb87050 cb5bdcc0 00000000 000ee098 c076660c 000ee098 cb5c11d0
> [ 1.340000] bb40: cb05bb90 00124f80 00124f80 00124f80 07735940 1dcd6500 ffffffff cb5c1100
> [ 1.340000] bb60: 00000000 00000000 c145dc8c cb5c0280 00000000 00000001 cb05bb90 c0958e78
> [ 1.340000] bb80: cb05bb8c c13cb404 00000000 00000000 00000010 0007a120 0001e848 00000021
> [ 1.340000] bba0: ffffffff ee222d90 00000000 00000000 00000000 00000010 cfb8b598 c13cb310
> [ 1.340000] bbc0: c1302578 c095ca58 c1302578 00000000 cb5c1100 00000000 000927c0 cb5bdfc0
> [ 1.340000] bbe0: c120e300 00000000 ee32cf60 00000000 c13cb310 cb5c1100 00000000 cb5c0304
> [ 1.340000] bc00: 00000010 c145dc8c c1302578 cb5c11b4 cb5c1108 c095cd04 c145dc8c 00000001
> [ 1.340000] bc20: cb5c1100 cb5c1100 00000000 c145dc8c c1302578 00000003 cb5c1100 00000000
> [ 1.340000] bc40: 00000010 c145dc8c c1302578 cb5c11b4 cb5c1108 c0959c5c cb5c1100 00000000
> [ 1.340000] bc60: 00000000 c095a2dc c0c0df58 00000001 0000ffff 00000001 00000000 00000000
> [ 1.340000] bc80: cb5bdc00 000927c0 0001e848 000493e0 0001e848 000927c0 0007a120 00000000
> [ 1.340000] bca0: 00000000 00000000 00000000 c13cb310 00000000 00000000 00000000 00000000
> [ 1.340000] bcc0: 00000000 00000000 ffffffe0 cb5c1160 cb5c1160 c095abf4 0001e848 000927c0
> [ 1.340000] bce0: cb5c0280 c13cb0a8 c13cb0a8 cb5bdf00 cb5c1184 cb5c1184 cb11e600 00000000
> [ 1.340000] bd00: c13cb128 cb5bf460 00000001 00000003 00000000 00000000 cb5c11ac cb5c11ac
> [ 1.340000] bd20: ffff0001 cb5c11b8 cb5c11b8 00000000 00000000 cb060000 00000000 00000000
> [ 1.340000] bd40: 00000000 cb5c11d8 cb5c11d8 00000000 cb5bdf80 cb5bdec0 cb5c1100 c095a5f0
> [ 1.340000] bd60: 00000000 cb11e600 00000000 c1212594 60000013 00000001 00000000 c13cb110
> [ 1.340000] bd80: c13acc68 c13cb0a8 c13cb440 c13cb440 00000000 00000000 00000000 c075674c
> [ 1.340000] bda0: c13cb440 cb00cc5c cb169db4 00000000 c1334248 c13cb488 c145dc8c c0959764
> [ 1.340000] bdc0: ffffffed cfb87050 cb5e2600 c095d670 ffffffed cb5e2610 fffffdfb c0758e48
> [ 1.340000] bde0: c0758df8 cb5e2610 c1459090 c1459098 00000000 c07577b0 00000000 00000000
> [ 1.340000] be00: cb05be30 c0757a68 00000001 c145906c 00000000 c0755d3c cb00cb70 cb5938b8
> [ 1.340000] be20: cb5e2610 cb5e2644 c13aca58 c0757534 cb5e2610 00000001 00000000 cb5e2610
> [ 1.340000] be40: cb5e2610 c13aca58 c13acaa8 c0756bc0 cb5e2610 00000000 cb5e2618 c07550c0
> [ 1.340000] be60: 00000000 c0587884 cb05beb8 cb5e2600 00000000 cb5e2600 cb5e2610 c1419000
> [ 1.340000] be80: c110362c c11a183c 00000000 c0758fdc 00000000 cb05beb8 cb5e2600 cb5bdb00
> [ 1.340000] bea0: c1419000 c07597a8 c0ead2ac c1306788 c1306788 c1112510 00000000 00000000
> [ 1.340000] bec0: c0ead2ac 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [ 1.340000] bee0: 00000000 00000000 00000000 c110f828 c110fabc c110fac4 c110fabc c1103648
> [ 1.340000] bf00: c1306788 c0301d28 0000006f cb05bf28 c035a8bc c035a8cc 60000013 ffffffff
> [ 1.340000] bf20: 00000051 c058b428 c0ff5b24 c0c1da88 0000011a c035ab48 c11a183c c0ea7034
> [ 1.340000] bf40: c0ff451c 00000000 00000007 00000007 c1335704 cfb96300 c120de7c 00000007
> [ 1.340000] bf60: c11a1834 c1419000 0000011a c11a183c c1100598 c1100dc4 00000007 00000007
> [ 1.340000] bf80: 00000000 c1100598 00000000 c0b0bcfc 00000000 00000000 00000000 00000000
> [ 1.340000] bfa0: 00000000 c0b0bd04 00000000 c0307e78 00000000 00000000 00000000 00000000
> [ 1.340000] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [ 1.340000] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
> [ 1.340000] [<c030de78>] (arch_send_call_function_single_ipi) from [<c03b3350>] (irq_work_queue_on+0x90/0x100)
> [ 1.340000] [<c03b3350>] (irq_work_queue_on) from [<c0959a84>] (cpufreq_update_util+0x40/0x4c)
> [ 1.340000] [<c0959a84>] (cpufreq_update_util) from [<c03732d8>] (enqueue_task_rt+0x28/0x26c)
> [ 1.340000] [<c03732d8>] (enqueue_task_rt) from [<c0360d28>] (activate_task+0x60/0x64)
> [ 1.340000] [<c0360d28>] (activate_task) from [<c0360fc0>] (ttwu_do_activate.constprop.13+0x34/0x68)
> [ 1.340000] [<c0360fc0>] (ttwu_do_activate.constprop.13) from [<c0361b74>] (try_to_wake_up+0x1a0/0x318)
> [ 1.340000] [<c0361b74>] (try_to_wake_up) from [<c0381de0>] (handle_irq_event_percpu+0xdc/0x15c)
> [ 1.340000] [<c0381de0>] (handle_irq_event_percpu) from [<c0381ea4>] (handle_irq_event+0x44/0x68)
> [ 1.340000] [<c0381ea4>] (handle_irq_event) from [<c0385150>] (handle_level_irq+0xa4/0x13c)
> [ 1.340000] [<c0385150>] (handle_level_irq) from [<c038168c>] (generic_handle_irq+0x18/0x28)
> [ 1.340000] [<c038168c>] (generic_handle_irq) from [<c0381788>] (__handle_domain_irq+0x54/0xb0)
> [ 1.340000] [<c0381788>] (__handle_domain_irq) from [<c030bad4>] (__irq_svc+0x54/0x70)
> [ 1.340000] [<c030bad4>] (__irq_svc) from [<c0932830>] (omap_i2c_xfer+0x320/0x5a0)
It looks like we got an interrupt in the middle of an i2c transaction
that was changing the CPU OPP. The handler for that interrupt woke up an
RT task, and enqueuing it led to a cpufreq update that in turn triggered
the crash. This happens during cpufreq_online(), so it looks like
something might not be fully set up yet somewhere.
> [ 1.340000] [<c0932830>] (omap_i2c_xfer) from [<c0928358>] (__i2c_transfer+0x140/0x29c)
> [ 1.340000] [<c0928358>] (__i2c_transfer) from [<c0928538>] (i2c_transfer+0x84/0xd4)
> [ 1.340000] [<c0928538>] (i2c_transfer) from [<c07726c4>] (regmap_i2c_read+0x48/0x64)
> [ 1.340000] [<c07726c4>] (regmap_i2c_read) from [<c076dc30>] (_regmap_raw_read+0xa4/0x110)
> [ 1.340000] [<c076dc30>] (_regmap_raw_read) from [<c076dd70>] (regmap_raw_read+0xd4/0x170)
> [ 1.340000] [<c076dd70>] (regmap_raw_read) from [<c076dfcc>] (regmap_bulk_read+0x1c0/0x2b0)
> [ 1.340000] [<c076dfcc>] (regmap_bulk_read) from [<c077fd34>] (twl_i2c_read+0x48/0x8c)
> [ 1.340000] [<c077fd34>] (twl_i2c_read) from [<c068cb48>] (twl4030smps_get_voltage+0x44/0x60)
> [ 1.340000] [<c068cb48>] (twl4030smps_get_voltage) from [<c067887c>] (_regulator_get_voltage+0x68/0xb8)
> [ 1.340000] [<c067887c>] (_regulator_get_voltage) from [<c067a700>] (_regulator_do_set_voltage+0x48/0x320)
> [ 1.340000] [<c067a700>] (_regulator_do_set_voltage) from [<c067ab5c>] (regulator_set_voltage_unlocked+0xcc/0x220)
> [ 1.340000] [<c067ab5c>] (regulator_set_voltage_unlocked) from [<c067c614>] (regulator_set_voltage+0x28/0x54)
> [ 1.340000] [<c067c614>] (regulator_set_voltage) from [<c0765ad4>] (_set_opp_voltage+0x34/0x90)
> [ 1.340000] [<c0765ad4>] (_set_opp_voltage) from [<c076660c>] (dev_pm_opp_set_rate+0x19c/0x288)
> [ 1.340000] [<c076660c>] (dev_pm_opp_set_rate) from [<c0958e78>] (__cpufreq_driver_target+0x180/0x2a0)
> [ 1.340000] [<c0958e78>] (__cpufreq_driver_target) from [<c095ca58>] (dbs_check_cpu+0x1ac/0x1e8)
> [ 1.340000] [<c095ca58>] (dbs_check_cpu) from [<c095cd04>] (cpufreq_governor_dbs+0x1fc/0x608)
> [ 1.340000] [<c095cd04>] (cpufreq_governor_dbs) from [<c0959c5c>] (__cpufreq_governor+0x1a8/0x204)
> [ 1.340000] [<c0959c5c>] (__cpufreq_governor) from [<c095a2dc>] (cpufreq_init_policy+0x60/0x8c)
> [ 1.340000] [<c095a2dc>] (cpufreq_init_policy) from [<c095a5f0>] (cpufreq_online+0x2e8/0x708)
> [ 1.340000] [<c095a5f0>] (cpufreq_online) from [<c075674c>] (subsys_interface_register+0x80/0xc4)
> [ 1.340000] [<c075674c>] (subsys_interface_register) from [<c0959764>] (cpufreq_register_driver+0x144/0x1a0)
This is the registration of the cpufreq driver (cpufreq-dt in this case).
The call chain is cpufreq_online() -> cpufreq_init_policy() -> __cpufreq_governor() -> cpufreq_governor_dbs() -> dbs_check_cpu().
The only way that can happen is if cpufreq_set_policy() finds that
the "old" and the "new" policies use the same governor, in which case it
calls __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS), but at the moment
I don't see how that is possible during initialization.
Viresh, any ideas?
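To make the puzzle concrete, here is a deliberately simplified C model of the decision described above. It is only a sketch: the types, the enum, and set_policy() are hypothetical stand-ins and do not reproduce the real cpufreq_set_policy(). The point is that the "limits" branch is reachable only when the policy already has the same governor attached.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's definitions. */
enum gov_event { GOV_START, GOV_LIMITS };

struct governor {
	const char *name;
};

struct policy {
	const struct governor *governor;	/* NULL until one is attached */
};

/*
 * If the governor is unchanged, only the limits are re-evaluated.
 * Otherwise the new governor is attached and started.
 */
static enum gov_event set_policy(struct policy *cur,
				 const struct governor *new_gov)
{
	if (cur->governor == new_gov)
		return GOV_LIMITS;

	cur->governor = new_gov;
	return GOV_START;
}
```

In this model a fresh policy takes the GOV_START branch on the first call and GOV_LIMITS only on a later call with the same governor, which is why seeing the limits path during initial registration looks surprising.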
> [ 1.340000] [<c0959764>] (cpufreq_register_driver) from [<c095d670>] (dt_cpufreq_probe+0x64/0xe8)
> [ 1.340000] [<c095d670>] (dt_cpufreq_probe) from [<c0758e48>] (platform_drv_probe+0x50/0xb0)
> [ 1.340000] [<c0758e48>] (platform_drv_probe) from [<c07577b0>] (driver_probe_device+0x1f4/0x2b0)
> [ 1.340000] [<c07577b0>] (driver_probe_device) from [<c0755d3c>] (bus_for_each_drv+0x44/0x8c)
> [ 1.340000] [<c0755d3c>] (bus_for_each_drv) from [<c0757534>] (__device_attach+0x9c/0x100)
> [ 1.340000] [<c0757534>] (__device_attach) from [<c0756bc0>] (bus_probe_device+0x84/0x8c)
> [ 1.340000] [<c0756bc0>] (bus_probe_device) from [<c07550c0>] (device_add+0x33c/0x528)
> [ 1.340000] [<c07550c0>] (device_add) from [<c0758fdc>] (platform_device_add+0xa8/0x20c)
> [ 1.340000] [<c0758fdc>] (platform_device_add) from [<c07597a8>] (platform_device_register_full+0xe0/0x108)
> [ 1.340000] [<c07597a8>] (platform_device_register_full) from [<c1112510>] (omap2_common_pm_late_init+0xc8/0x10c)
> [ 1.340000] [<c1112510>] (omap2_common_pm_late_init) from [<c110f828>] (omap_common_late_init+0xc/0x14)
> [ 1.340000] [<c110f828>] (omap_common_late_init) from [<c110fac4>] (omap3_init_late+0x8/0x14)
> [ 1.340000] [<c110fac4>] (omap3_init_late) from [<c1103648>] (init_machine_late+0x1c/0x90)
> [ 1.340000] [<c1103648>] (init_machine_late) from [<c0301d28>] (do_one_initcall+0x84/0x1d4)
> [ 1.340000] [<c0301d28>] (do_one_initcall) from [<c1100dc4>] (kernel_init_freeable+0x120/0x1ec)
> [ 1.340000] [<c1100dc4>] (kernel_init_freeable) from [<c0b0bd04>] (kernel_init+0x8/0xec)
> [ 1.340000] [<c0b0bd04>] (kernel_init) from [<c0307e78>] (ret_from_fork+0x14/0x3c)
> [ 1.340000] Code: bad PC value
> [ 1.340000] ---[ end trace 384223760a5ee799 ]---
> [ 1.340000] Kernel panic - not syncing: Fatal exception in interrupt
> [ 1.340000] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
From: Rafael J. Wysocki @ 2016-02-15 18:49 UTC
To: Guenter Roeck
Cc: Viresh Kumar, Rafael J. Wysocki, linux-next,
Linux Kernel Mailing List, linux-arm-kernel, linux-pm,
Peter Zijlstra
On Mon, Feb 15, 2016 at 7:41 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Mon, Feb 15, 2016 at 6:05 PM, Guenter Roeck <linux@roeck-us.net> wrote:
>> Rafael,
>
> Hi,
>
> Thanks for the report!
>
>> I see crashes in various arm qemu tests due to 'cpufreq: governor: Replace
>> timers with utilization update callbacks' with next-20160215. An example
>> crash log and bisect results are attached below.
>>
>> Please let me know if there is anything I can do to help tracking down
>> the problem.
>
> It looks like we've uncovered some nastiness in the arch ARM code (see below).
>
> [cut]
>
>> [ 1.340000] Unable to handle kernel NULL pointer dereference at virtual address 00000000
>> [ 1.340000] pgd = c0204000
>> [ 1.340000] [00000000] *pgd=00000000
>> [ 1.340000] Internal error: Oops: 80000005 [#1] SMP ARM
>> [ 1.340000] Modules linked in:
>> [ 1.340000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc4-next-20160215 #1
>> [ 1.340000] Hardware name: Generic OMAP3-GP (Flattened Device Tree)
>> [ 1.340000] task: cb060000 ti: cb05a000 task.ti: cb05a000
>> [ 1.340000] PC is at 0x0
>> [ 1.340000] LR is at arch_send_call_function_single_ipi+0x34/0x38
>
> Since this is ARM, arch_send_call_function_single_ipi() looks like this:
>
> void arch_send_call_function_single_ipi(int cpu)
> {
> 	smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC_SINGLE);
> }
>
> so I'm not sure how the NULL pointer deref is possible even.
>
> The only thing coming to mind would be that cpumask_of(cpu) triggers
> this, but I'm not sure how exactly that can happen.
>
> I need help from somebody who knows how this low-level stuff works on ARM.
Well, could there be a problem with sending an IPI to the same CPU
that's sending it?
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 18:41 ` Rafael J. Wysocki
(?)
@ 2016-02-15 18:49 ` Marc Zyngier
-1 siblings, 0 replies; 81+ messages in thread
From: Marc Zyngier @ 2016-02-15 18:49 UTC (permalink / raw)
To: Rafael J. Wysocki, Guenter Roeck, Viresh Kumar
Cc: Rafael J. Wysocki, linux-next, Linux Kernel Mailing List,
linux-arm-kernel, linux-pm, Peter Zijlstra
On 15/02/16 18:41, Rafael J. Wysocki wrote:
> On Mon, Feb 15, 2016 at 6:05 PM, Guenter Roeck <linux@roeck-us.net> wrote:
>> Rafael,
>
> Hi,
>
> Thanks for the report!
>
>> I see crashes in various arm qemu tests due to 'cpufreq: governor: Replace
>> timers with utilization update callbacks' with next-20160215. An example
>> crash log and bisect results are attached below.
>>
>> Please let me know if there is anything I can do to help tracking down
>> the problem.
>
> It looks like we've uncovered some nastiness in the arch ARM code (see below).
>
> [cut]
>
>> [ 1.340000] Unable to handle kernel NULL pointer dereference at virtual address 00000000
>> [ 1.340000] pgd = c0204000
>> [ 1.340000] [00000000] *pgd=00000000
>> [ 1.340000] Internal error: Oops: 80000005 [#1] SMP ARM
>> [ 1.340000] Modules linked in:
>> [ 1.340000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc4-next-20160215 #1
>> [ 1.340000] Hardware name: Generic OMAP3-GP (Flattened Device Tree)
>> [ 1.340000] task: cb060000 ti: cb05a000 task.ti: cb05a000
>> [ 1.340000] PC is at 0x0
>> [ 1.340000] LR is at arch_send_call_function_single_ipi+0x34/0x38
>
> Since this is ARM, arch_send_call_function_single_ipi() looks like this:
>
> void arch_send_call_function_single_ipi(int cpu)
> {
> smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC_SINGLE);
> }
>
> so I'm not sure how the NULL pointer deref is possible even.
>
> The only thing coming to mind would be that cpumask_of(cpu) triggers
> this, but I'm not sure how exactly that can happen.
>
> I need help from somebody who knows how this low-level stuff works on ARM.
Given that OMAP3 is a UP system, there is zero chance that it has
registered the magic hook that delivers IPIs (its interrupt controller
is not even capable of doing so).
I don't really know the context, but IPIs on a UP system seem at best odd.
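The failure mode described above can be modelled in plain userspace C. This is only a sketch under stated assumptions: cross_call_hook, register_cross_call(), send_ipi_checked() and fake_raise_ipi() are hypothetical names invented for illustration, not the kernel's actual symbols. The point is that a hook which nobody registered stays NULL, and calling through it unguarded branches to address 0, which would be consistent with the "PC is at 0x0" line in the oops.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the hook an interrupt controller driver
 * would register in order to deliver IPIs. */
typedef void (*cross_call_fn)(unsigned int cpu, unsigned int ipi_nr);

cross_call_fn cross_call_hook;  /* stays NULL if no driver registers one */
unsigned int ipis_sent;         /* counts deliveries made by the fake hook */

/* Fake delivery routine used only by this model. */
void fake_raise_ipi(unsigned int cpu, unsigned int ipi_nr)
{
    (void)cpu;
    (void)ipi_nr;
    ipis_sent++;
}

/* The first registration wins; later ones are ignored. */
void register_cross_call(cross_call_fn fn)
{
    if (cross_call_hook == NULL)
        cross_call_hook = fn;
}

/* An unguarded version would simply do cross_call_hook(cpu, ipi_nr),
 * jumping to address 0 when nothing was registered. This variant
 * reports the missing hook to the caller instead. */
bool send_ipi_checked(unsigned int cpu, unsigned int ipi_nr)
{
    if (cross_call_hook == NULL)
        return false;
    cross_call_hook(cpu, ipi_nr);
    return true;
}
```

On a modelled UP machine no driver ever calls register_cross_call(), so send_ipi_checked() returns false instead of crashing.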
Thanks,
M.
--
Jazz is not dead. It just smells funny...
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 18:49 ` Marc Zyngier
(?)
@ 2016-02-15 18:54 ` Rafael J. Wysocki
-1 siblings, 0 replies; 81+ messages in thread
From: Rafael J. Wysocki @ 2016-02-15 18:54 UTC (permalink / raw)
To: Marc Zyngier
Cc: Rafael J. Wysocki, Guenter Roeck, Viresh Kumar,
Rafael J. Wysocki, linux-next, Linux Kernel Mailing List,
linux-arm-kernel, linux-pm, Peter Zijlstra
On Mon, Feb 15, 2016 at 7:49 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 15/02/16 18:41, Rafael J. Wysocki wrote:
>> On Mon, Feb 15, 2016 at 6:05 PM, Guenter Roeck <linux@roeck-us.net> wrote:
>>> Rafael,
>>
>> Hi,
>>
>> Thanks for the report!
>>
>>> I see crashes in various arm qemu tests due to 'cpufreq: governor: Replace
>>> timers with utilization update callbacks' with next-20160215. An example
>>> crash log and bisect results are attached below.
>>>
>>> Please let me know if there is anything I can do to help tracking down
>>> the problem.
>>
>> It looks like we've uncovered some nastiness in the arch ARM code (see below).
>>
>> [cut]
>>
>>> [ 1.340000] Unable to handle kernel NULL pointer dereference at virtual address 00000000
>>> [ 1.340000] pgd = c0204000
>>> [ 1.340000] [00000000] *pgd=00000000
>>> [ 1.340000] Internal error: Oops: 80000005 [#1] SMP ARM
>>> [ 1.340000] Modules linked in:
>>> [ 1.340000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc4-next-20160215 #1
>>> [ 1.340000] Hardware name: Generic OMAP3-GP (Flattened Device Tree)
>>> [ 1.340000] task: cb060000 ti: cb05a000 task.ti: cb05a000
>>> [ 1.340000] PC is at 0x0
>>> [ 1.340000] LR is at arch_send_call_function_single_ipi+0x34/0x38
>>
>> Since this is ARM, arch_send_call_function_single_ipi() looks like this:
>>
>> void arch_send_call_function_single_ipi(int cpu)
>> {
>> smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC_SINGLE);
>> }
>>
>> so I'm not sure how the NULL pointer deref is possible even.
>>
>> The only thing coming to mind would be that cpumask_of(cpu) triggers
>> this, but I'm not sure how exactly that can happen.
>>
>> I need help from somebody who knows how this low-level stuff works on ARM.
>
> Given that OMAP3 is a UP system, there is zero chance that it has
> registered the magic hook that delivers IPIs (its interrupt controller
> is not even capable of doing so).
>
> I don't really know the context, but IPIs on a UP system seem at best odd.
That would explain it, thanks.
So it looks like we should always use irq_work_queue() on UP even if
CONFIG_SMP is set, shouldn't we?
Thanks,
Rafael
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 18:54 ` Rafael J. Wysocki
(?)
@ 2016-02-15 19:03 ` Marc Zyngier
-1 siblings, 0 replies; 81+ messages in thread
From: Marc Zyngier @ 2016-02-15 19:03 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Guenter Roeck, Viresh Kumar, Rafael J. Wysocki, linux-next,
Linux Kernel Mailing List, linux-arm-kernel, linux-pm,
Peter Zijlstra
On 15/02/16 18:54, Rafael J. Wysocki wrote:
> On Mon, Feb 15, 2016 at 7:49 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> On 15/02/16 18:41, Rafael J. Wysocki wrote:
>>> On Mon, Feb 15, 2016 at 6:05 PM, Guenter Roeck <linux@roeck-us.net> wrote:
>>>> Rafael,
>>>
>>> Hi,
>>>
>>> Thanks for the report!
>>>
>>>> I see crashes in various arm qemu tests due to 'cpufreq: governor: Replace
>>>> timers with utilization update callbacks' with next-20160215. An example
>>>> crash log and bisect results are attached below.
>>>>
>>>> Please let me know if there is anything I can do to help tracking down
>>>> the problem.
>>>
>>> It looks like we've uncovered some nastiness in the arch ARM code (see below).
>>>
>>> [cut]
>>>
>>>> [ 1.340000] Unable to handle kernel NULL pointer dereference at virtual address 00000000
>>>> [ 1.340000] pgd = c0204000
>>>> [ 1.340000] [00000000] *pgd=00000000
>>>> [ 1.340000] Internal error: Oops: 80000005 [#1] SMP ARM
>>>> [ 1.340000] Modules linked in:
>>>> [ 1.340000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc4-next-20160215 #1
>>>> [ 1.340000] Hardware name: Generic OMAP3-GP (Flattened Device Tree)
>>>> [ 1.340000] task: cb060000 ti: cb05a000 task.ti: cb05a000
>>>> [ 1.340000] PC is at 0x0
>>>> [ 1.340000] LR is at arch_send_call_function_single_ipi+0x34/0x38
>>>
>>> Since this is ARM, arch_send_call_function_single_ipi() looks like this:
>>>
>>> void arch_send_call_function_single_ipi(int cpu)
>>> {
>>> smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC_SINGLE);
>>> }
>>>
>>> so I'm not sure how the NULL pointer deref is possible even.
>>>
>>> The only thing coming to mind would be that cpumask_of(cpu) triggers
>>> this, but I'm not sure how exactly that can happen.
>>>
>>> I need help from somebody who knows how this low-level stuff works on ARM.
>>
>> Given that OMAP3 is a UP system, there is zero chance that it has
>> registered the magic hook that delivers IPIs (its interrupt controller
>> is not even capable of doing so).
>>
>> I don't really know the context, but IPIs on a UP system seem at best odd.
>
> That would explain it, thanks.
>
> So it looks like we should always use irq_work_queue() on UP even if
> CONFIG_SMP is set, shouldn't we?
Something like that, yes. CONFIG_SMP is not an indication of an SMP
system anymore (we've even dropped the config option on arm64).
Hopefully num_possible_cpus() is reliable enough to let you do the right
thing...
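A minimal userspace sketch of that idea, assuming the decision is made at run time from the CPU count rather than from a build option. choose_kick_method() and the enum are hypothetical names for illustration, and possible_cpus stands in for whatever num_possible_cpus() would report. The build configuration is deliberately not an input, since an SMP-enabled kernel may still be running on a UP machine.

```c
/* How to trigger the deferred work: on the local CPU, or via an IPI
 * sent to another CPU. */
enum kick_method {
    KICK_LOCAL_IRQ_WORK,
    KICK_REMOTE_IPI
};

/* Decide from run-time facts only. possible_cpus models the value
 * num_possible_cpus() would return (an assumption of this sketch). */
enum kick_method choose_kick_method(unsigned int possible_cpus,
                                    unsigned int this_cpu,
                                    unsigned int target_cpu)
{
    /* A single possible CPU means there is nobody to send an IPI to. */
    if (possible_cpus <= 1)
        return KICK_LOCAL_IRQ_WORK;

    /* Targeting the current CPU needs no cross-CPU signalling either. */
    if (target_cpu == this_cpu)
        return KICK_LOCAL_IRQ_WORK;

    return KICK_REMOTE_IPI;
}
```

With possible_cpus equal to 1, as on the OMAP3 board in the report, the sketch always picks the local path, whatever the target CPU number is.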
Thanks,
M.
--
Jazz is not dead. It just smells funny...
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 19:03 ` Marc Zyngier
(?)
@ 2016-02-15 19:12 ` Rafael J. Wysocki
-1 siblings, 0 replies; 81+ messages in thread
From: Rafael J. Wysocki @ 2016-02-15 19:12 UTC (permalink / raw)
To: Marc Zyngier
Cc: Rafael J. Wysocki, Guenter Roeck, Viresh Kumar,
Rafael J. Wysocki, linux-next, Linux Kernel Mailing List,
linux-arm-kernel, linux-pm, Peter Zijlstra
On Mon, Feb 15, 2016 at 8:03 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 15/02/16 18:54, Rafael J. Wysocki wrote:
>> On Mon, Feb 15, 2016 at 7:49 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>> On 15/02/16 18:41, Rafael J. Wysocki wrote:
>>>> On Mon, Feb 15, 2016 at 6:05 PM, Guenter Roeck <linux@roeck-us.net> wrote:
>>>>> Rafael,
>>>>
>>>> Hi,
>>>>
>>>> Thanks for the report!
>>>>
>>>>> I see crashes in various arm qemu tests due to 'cpufreq: governor: Replace
>>>>> timers with utilization update callbacks' with next-20160215. An example
>>>>> crash log and bisect results are attached below.
>>>>>
>>>>> Please let me know if there is anything I can do to help tracking down
>>>>> the problem.
>>>>
>>>> It looks like we've uncovered some nastiness in the arch ARM code (see below).
>>>>
>>>> [cut]
>>>>
>>>>> [ 1.340000] Unable to handle kernel NULL pointer dereference at virtual address 00000000
>>>>> [ 1.340000] pgd = c0204000
>>>>> [ 1.340000] [00000000] *pgd=00000000
>>>>> [ 1.340000] Internal error: Oops: 80000005 [#1] SMP ARM
>>>>> [ 1.340000] Modules linked in:
>>>>> [ 1.340000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc4-next-20160215 #1
>>>>> [ 1.340000] Hardware name: Generic OMAP3-GP (Flattened Device Tree)
>>>>> [ 1.340000] task: cb060000 ti: cb05a000 task.ti: cb05a000
>>>>> [ 1.340000] PC is at 0x0
>>>>> [ 1.340000] LR is at arch_send_call_function_single_ipi+0x34/0x38
>>>>
>>>> Since this is ARM, arch_send_call_function_single_ipi() looks like this:
>>>>
>>>> void arch_send_call_function_single_ipi(int cpu)
>>>> {
>>>> smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC_SINGLE);
>>>> }
>>>>
>>>> so I'm not sure how the NULL pointer deref is possible even.
>>>>
>>>> The only thing coming to mind would be that cpumask_of(cpu) triggers
>>>> this, but I'm not sure how exactly that can happen.
>>>>
>>>> I need help from somebody who knows how this low-level stuff works on ARM.
>>>
>>> Given that OMAP3 is a UP system, there is zero chance that it has
>>> registered the magic hook that delivers IPIs (its interrupt controller
>>> is not even capable of doing so).
>>>
>>> I don't really know the context, but IPIs on a UP system seem at best odd.
>>
>> That would explain it, thanks.
>>
>> So it looks like we should always use irq_work_queue() on UP even if
>> CONFIG_SMP is set, shouldn't we?
>
> Something like that, yes. CONFIG_SMP is not an indication of an SMP
> system anymore (we've even dropped the config option on arm64).
>
> Hopefully num_possible_cpus() is reliable enough to let you do the right
> thing...
Well, in fact I can always use irq_work_queue() in there at least for
the time being.
Let me prepare a patch.
Thanks,
Rafael
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
@ 2016-02-15 19:12 ` Rafael J. Wysocki
0 siblings, 0 replies; 81+ messages in thread
From: Rafael J. Wysocki @ 2016-02-15 19:12 UTC (permalink / raw)
To: Marc Zyngier
Cc: Rafael J. Wysocki, Guenter Roeck, Viresh Kumar,
Rafael J. Wysocki, linux-next, Linux Kernel Mailing List,
linux-arm-kernel, linux-pm, Peter Zijlstra
On Mon, Feb 15, 2016 at 8:03 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 15/02/16 18:54, Rafael J. Wysocki wrote:
>> On Mon, Feb 15, 2016 at 7:49 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>> On 15/02/16 18:41, Rafael J. Wysocki wrote:
>>>> On Mon, Feb 15, 2016 at 6:05 PM, Guenter Roeck <linux@roeck-us.net> wrote:
>>>>> Rafael,
>>>>
>>>> Hi,
>>>>
>>>> Thanks for the report!
>>>>
>>>>> I see crashes in various arm qemu tests due to 'cpufreq: governor: Replace
>>>>> timers with utilization update callbacks' with next-20160215. An example
>>>>> crash log and bisect results are attached below.
>>>>>
>>>>> Please let me know if there is anything I can do to help tracking down
>>>>> the problem.
>>>>
>>>> It looks like we've uncovered some nastiness in the arch ARM code (see below).
>>>>
>>>> [cut]
>>>>
>>>>> [ 1.340000] Unable to handle kernel NULL pointer dereference at virtual address 00000000
>>>>> [ 1.340000] pgd = c0204000
>>>>> [ 1.340000] [00000000] *pgd=00000000
>>>>> [ 1.340000] Internal error: Oops: 80000005 [#1] SMP ARM
>>>>> [ 1.340000] Modules linked in:
>>>>> [ 1.340000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc4-next-20160215 #1
>>>>> [ 1.340000] Hardware name: Generic OMAP3-GP (Flattened Device Tree)
>>>>> [ 1.340000] task: cb060000 ti: cb05a000 task.ti: cb05a000
>>>>> [ 1.340000] PC is at 0x0
>>>>> [ 1.340000] LR is at arch_send_call_function_single_ipi+0x34/0x38
>>>>
>>>> Since this is ARM, arch_send_call_function_single_ipi() looks like this:
>>>>
>>>> void arch_send_call_function_single_ipi(int cpu)
>>>> {
>>>> smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC_SINGLE);
>>>> }
>>>>
>>>> so I'm not sure how the NULL pointer deref is possible even.
>>>>
>>>> The only thing coming to mind would be that cpumask_of(cpu) triggers
>>>> this, but I'm not sure how exactly that can happen.
>>>>
>>>> I need help from somebody who knows how this low-level stuff works on ARM.
>>>
>>> Given that OMAP3 is a UP system, there is zero chance that it has
>>> registered the magic hook that delivers IPIs (its interrupt controller
>>> is not even capable of doing so).
>>>
>>> I don't really know the context, but IPIs on a UP system seem at best odd.
>>
>> That would explain it, thanks.
>>
>> So it looks like we should always use irq_work_queue() on UP even if
>> CONFIG_SMP is set, shouldn't we?
>
> Something like that, yes. CONFIG_SMP is not an indication of an SMP
> system anymore (we've even dropped the config option on arm64).
>
> Hopefully num_possible_cpus() is reliable enough to let you do the right
> thing...
Well, in fact I can always use irq_work_queue() in there at least for
the time being.
Let me prepare a patch.
Thanks,
Rafael
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 19:12 ` Rafael J. Wysocki
(?)
@ 2016-02-15 19:28 ` Rafael J. Wysocki
-1 siblings, 0 replies; 81+ messages in thread
From: Rafael J. Wysocki @ 2016-02-15 19:28 UTC (permalink / raw)
To: Guenter Roeck, Tony Lindgren
Cc: Marc Zyngier, Viresh Kumar, Rafael J. Wysocki, linux-next,
Linux Kernel Mailing List, linux-arm-kernel, linux-pm,
Peter Zijlstra
On Monday, February 15, 2016 08:12:33 PM Rafael J. Wysocki wrote:
> On Mon, Feb 15, 2016 at 8:03 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> > On 15/02/16 18:54, Rafael J. Wysocki wrote:
> >> On Mon, Feb 15, 2016 at 7:49 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >>> On 15/02/16 18:41, Rafael J. Wysocki wrote:
> >>>> On Mon, Feb 15, 2016 at 6:05 PM, Guenter Roeck <linux@roeck-us.net> wrote:
> >>>>> Rafael,
> >>>>
> >>>> Hi,
> >>>>
> >>>> Thanks for the report!
> >>>>
> >>>>> I see crashes in various arm qemu tests due to 'cpufreq: governor: Replace
> >>>>> timers with utilization update callbacks' with next-20160215. An example
> >>>>> crash log and bisect results are attached below.
> >>>>>
> >>>>> Please let me know if there is anything I can do to help tracking down
> >>>>> the problem.
> >>>>
> >>>> It looks like we've uncovered some nastiness in the arch ARM code (see below).
> >>>>
> >>>> [cut]
> >>>>
> >>>>> [ 1.340000] Unable to handle kernel NULL pointer dereference at virtual address 00000000
> >>>>> [ 1.340000] pgd = c0204000
> >>>>> [ 1.340000] [00000000] *pgd=00000000
> >>>>> [ 1.340000] Internal error: Oops: 80000005 [#1] SMP ARM
> >>>>> [ 1.340000] Modules linked in:
> >>>>> [ 1.340000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc4-next-20160215 #1
> >>>>> [ 1.340000] Hardware name: Generic OMAP3-GP (Flattened Device Tree)
> >>>>> [ 1.340000] task: cb060000 ti: cb05a000 task.ti: cb05a000
> >>>>> [ 1.340000] PC is at 0x0
> >>>>> [ 1.340000] LR is at arch_send_call_function_single_ipi+0x34/0x38
> >>>>
> >>>> Since this is ARM, arch_send_call_function_single_ipi() looks like this:
> >>>>
> >>>> void arch_send_call_function_single_ipi(int cpu)
> >>>> {
> >>>> smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC_SINGLE);
> >>>> }
> >>>>
> >>>> so I'm not sure how the NULL pointer deref is possible even.
> >>>>
> >>>> The only thing coming to mind would be that cpumask_of(cpu) triggers
> >>>> this, but I'm not sure how exactly that can happen.
> >>>>
> >>>> I need help from somebody who knows how this low-level stuff works on ARM.
> >>>
> >>> Given that OMAP3 is a UP system, there is zero chance that it has
> >>> registered the magic hook that delivers IPIs (its interrupt controller
> >>> is not even capable of doing so).
> >>>
> >>> I don't really know the context, but IPIs on a UP system seem at best odd.
> >>
> >> That would explain it, thanks.
> >>
> >> So it looks like we should always use irq_work_queue() on UP even if
> >> CONFIG_SMP is set, shouldn't we?
> >
> > Something like that, yes. CONFIG_SMP is not an indication of an SMP
> > system anymore (we've even dropped the config option on arm64).
> >
> > Hopefully num_possible_cpus() is reliable enough to let you do the right
> > thing...
>
> Well, in fact I can always use irq_work_queue() in there at least for
> the time being.
>
> Let me prepare a patch.
Guenter, Tony,
Below is a patch to try, on top of linux-next.
Please let me know if the problem is still around with that patch applied.
Thanks,
Rafael
---
drivers/cpufreq/cpufreq_governor.c | 11 +----------
1 file changed, 1 insertion(+), 10 deletions(-)
Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -350,15 +350,6 @@ static void dbs_irq_work(struct irq_work
schedule_work(&policy_dbs->work);
}
-static inline void gov_queue_irq_work(struct policy_dbs_info *policy_dbs)
-{
-#ifdef CONFIG_SMP
- irq_work_queue_on(&policy_dbs->irq_work, smp_processor_id());
-#else
- irq_work_queue(&policy_dbs->irq_work);
-#endif
-}
-
static void dbs_update_util_handler(struct update_util_data *data, u64 time,
unsigned long util, unsigned long max)
{
@@ -378,7 +369,7 @@ static void dbs_update_util_handler(stru
delta_ns = time - policy_dbs->last_sample_time;
if ((s64)delta_ns >= policy_dbs->sample_delay_ns) {
policy_dbs->last_sample_time = time;
- gov_queue_irq_work(policy_dbs);
+ irq_work_queue(&policy_dbs->irq_work);
return;
}
}
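Editorial illustration of the failure mode discussed above, as a minimal user-space sketch rather than kernel code. It assumes, following Marc's description, that IPI delivery goes through a hook that an interrupt controller driver has to register, and that on the UP machine nobody registers it. Every name here (ipi_hook, register_ipi_hook(), queue_work_on_cpu(), fake_ipi_send()) is hypothetical and exists only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an IPI-sending hook installed by an
 * interrupt controller driver; it stays NULL if nothing registers. */
typedef void (*ipi_send_fn)(unsigned int cpu);

static ipi_send_fn ipi_hook;          /* NULL on the modelled UP machine */
static unsigned int last_ipi_target;  /* recorded by the fake sender below */

static void register_ipi_hook(ipi_send_fn fn)
{
	assert(fn != NULL);
	ipi_hook = fn;
}

/* Fake sender, used only to observe which CPU would have been signalled. */
static void fake_ipi_send(unsigned int cpu)
{
	last_ipi_target = cpu;
}

enum queue_path { QUEUED_LOCALLY, QUEUED_VIA_IPI };

/*
 * Queue work for target_cpu on behalf of this_cpu.
 *
 * Work for the current CPU never needs an IPI. In this model a missing
 * hook also means there is no other CPU to signal, so the local path is
 * taken instead of calling through a NULL function pointer (which is
 * what "PC is at 0x0" in the oops looks like).
 */
static enum queue_path queue_work_on_cpu(unsigned int this_cpu,
					 unsigned int target_cpu)
{
	if (target_cpu == this_cpu || ipi_hook == NULL)
		return QUEUED_LOCALLY;

	ipi_hook(target_cpu);
	return QUEUED_VIA_IPI;
}
```

In terms of the patch, always calling irq_work_queue() corresponds to always taking the local path, which is sufficient here because the work was only ever queued for the CPU that was already running the handler.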
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 19:28 ` Rafael J. Wysocki
(?)
@ 2016-02-15 19:42 ` Tony Lindgren
-1 siblings, 0 replies; 81+ messages in thread
From: Tony Lindgren @ 2016-02-15 19:42 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Guenter Roeck, Marc Zyngier, Viresh Kumar, Rafael J. Wysocki,
linux-next, Linux Kernel Mailing List, linux-arm-kernel,
linux-pm, Peter Zijlstra
* Rafael J. Wysocki <rjw@rjwysocki.net> [160215 11:28]:
>
> Guenter, Tony,
>
> Below is a patch to try, on top of linux-next.
Fixes the issue on UP for me:
Tested-by: Tony Lindgren <tony@atomide.com>
> Please let me know if the problem is still around with that patch applied.
It seems we still have another issue with SMP systems, see below.
Regards,
Tony
8< ------------------
Unable to handle kernel NULL pointer dereference at virtual address 00000030
pgd = c0204000
[00000030] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc4-next-20160215-00002-g08cd608 #895
Hardware name: Generic OMAP4 (Flattened Device Tree)
task: ee870000 ti: ee85e000 task.ti: ee85e000
PC is at regulator_set_voltage+0x10/0x54
LR is at _set_opp_voltage+0x30/0x98
pc : [<c0684270>] lr : [<c0774900>] psr: 00000113
sp : ee85fb20 ip : 00000001 fp : 000fa3e8
r10: 000fa3e8 r9 : 000fa3e8 r8 : 00000000
r7 : ef7ab050 r6 : 000fa3e8 r5 : 000fa3e8 r4 : 00000000
r3 : 000fa3e8 r2 : 000fa3e8 r1 : 000fa3e8 r0 : 00000000
Flags: nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 10c5387d Table: 8020404a DAC: 00000051
Process swapper/0 (pid: 1, stack limit = 0xee85e220)
Stack: (0xee85fb20 to 0xee860000)
fb20: 00000000 000fa3e8 000fa3e8 c0774900 eedc8500 11e1a300 00000000 11e1a300
fb40: ef7ab050 eedc8500 00000000 ef7ab050 eedc8540 c0775488 000fa3e8 00000000
fb60: 00000000 00124f80 00124f80 00124f80 11e1a300 23c34600 00000000 00000000
fb80: eea82e00 c144e250 eedc86c0 00000000 00000000 00000000 00000000 c096d8ec
fba0: ee85fbac ee85fbf8 00000001 00000000 00000010 000927c0 000493e0 00000021
fbc0: 00000010 00000000 c13bc9c4 c1302574 ef7bc598 0000001e eea82e00 c0971684
fbe0: 000927c0 eedc87c0 c1211598 00000000 c120d300 c1302670 001ef19f 00000000
fc00: c1302574 00000000 eea82e00 00000003 eedcbe04 c144e250 00000010 eea82eb4
fc20: c1302574 c0971ab0 c144e250 eea82e00 eea82e00 00000001 00000000 c144e250
fc40: 00000010 eea82e00 00000000 00000003 c13bc750 c144e250 00000010 eea82eb4
fc60: c1302574 c096eb20 eea82e00 00000000 eea82e08 c096f344 eedcca00 00000003
fc80: 0000ffff 00000003 00000000 00000000 eedc8440 000f6180 000493e0 000493e0
fca0: 000493e0 000f6180 000927c0 00000000 00000000 00000000 00000000 c13bc9c4
fcc0: 00000000 00000000 00000000 00000000 00000000 00000000 ffffffe0 eea82e60
fce0: eea82e60 c096f188 000493e0 000f6180 eedc86c0 c13bc750 c13bc750 eedc8700
fd00: eea82e84 eea82e84 ee9357c0 00000000 c13bc7d0 eedcc4b0 00000001 00000003
fd20: 00000000 00000000 eea82eac eea82eac ffff0001 eea82eb8 eea82eb8 00000000
fd40: 00000000 ee870000 00000000 00000000 00000000 eea82ed8 eea82ed8 00000000
fd60: eedc8780 eedc8680 eea82e00 c096fa00 00000001 60000113 eea82e04 00000000
fd80: ee85fdac c13bc7a4 c139e468 c13bc750 fffffdfb 00000000 00000000 00000000
fda0: 00000000 c0764dd0 c144e904 ee82fc5c ee99e4b4 00000000 c1334208 c13bcb30
fdc0: c144e250 c096e690 eedc8440 ef7ab050 eee32200 c0972368 eee32210 eee32210
fde0: c13bcae8 c0767e5c eee32210 c1449eac c1449eb4 c13bcae8 00000000 c07666c0
fe00: 00000000 ee85fe38 c07667fc 00000001 c1449e88 00000000 00000000 c0764ab4
fe20: ee82fb70 eedf3338 eee32210 eee32244 c139e3e8 c07663cc eee32210 00000001
fe40: eee32218 eee32218 eee32210 c139e3e8 00000000 c07658ac eee32218 eee32210
fe60: c139e260 c0763bfc c120ce1c c058e688 ee85fec0 eee32200 00000000 eee32200
fe80: eee32210 c1103670 00000000 c120ce1c 0000011a c0767bbc ee85fec0 eee32200
fea0: eedc8340 c1103670 00000000 c07685a8 c144e908 c1306810 eedc8340 c11122e0
fec0: 00000000 00000000 c0ec230c 00000000 00000000 00000000 00000000 00000000
fee0: 00000000 00000000 00000000 00000000 c1306810 c110f738 c1306810 c110fc30
ff00: c1306810 c1103690 c1306810 c0301d5c 00000000 c0463578 00000000 ee842b80
ff20: 00000000 c13356dc efffc0bf 0000011a c0c1d73c c035aac0 00000000 c0ebc080
ff40: c10095f8 00000000 00000007 00000007 c13356c4 00000007 c140a000 c140a000
ff60: 00000007 c140a000 c140a000 c11a1838 c11a183c c1100e14 00000007 00000007
ff80: 00000000 c1100594 00000000 c0b26878 00000000 00000000 00000000 00000000
ffa0: 00000000 c0b26880 00000000 c0307d78 00000000 00000000 00000000 00000000
ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 a1718b7d a59ff7d9
[<c0684270>] (regulator_set_voltage) from [<c0774900>] (_set_opp_voltage+0x30/0x98)
[<c0774900>] (_set_opp_voltage) from [<c0775488>] (dev_pm_opp_set_rate+0x170/0x28c)
[<c0775488>] (dev_pm_opp_set_rate) from [<c096d8ec>] (__cpufreq_driver_target+0x180/0x2b4)
[<c096d8ec>] (__cpufreq_driver_target) from [<c0971684>] (dbs_check_cpu+0x19c/0x1d0)
[<c0971684>] (dbs_check_cpu) from [<c0971ab0>] (cpufreq_governor_dbs+0x274/0x620)
[<c0971ab0>] (cpufreq_governor_dbs) from [<c096eb20>] (__cpufreq_governor+0xf0/0x1a4)
[<c096eb20>] (__cpufreq_governor) from [<c096f344>] (cpufreq_init_policy+0x64/0x8c)
[<c096f344>] (cpufreq_init_policy) from [<c096fa00>] (cpufreq_online+0x2f8/0x714)
[<c096fa00>] (cpufreq_online) from [<c0764dd0>] (subsys_interface_register+0x94/0xd8)
[<c0764dd0>] (subsys_interface_register) from [<c096e690>] (cpufreq_register_driver+0x14c/0x19c)
[<c096e690>] (cpufreq_register_driver) from [<c0972368>] (dt_cpufreq_probe+0x70/0xec)
[<c0972368>] (dt_cpufreq_probe) from [<c0767e5c>] (platform_drv_probe+0x4c/0xb0)
[<c0767e5c>] (platform_drv_probe) from [<c07666c0>] (driver_probe_device+0x214/0x2c0)
[<c07666c0>] (driver_probe_device) from [<c0764ab4>] (bus_for_each_drv+0x60/0x94)
[<c0764ab4>] (bus_for_each_drv) from [<c07663cc>] (__device_attach+0xb0/0x114)
[<c07663cc>] (__device_attach) from [<c07658ac>] (bus_probe_device+0x84/0x8c)
[<c07658ac>] (bus_probe_device) from [<c0763bfc>] (device_add+0x370/0x56c)
[<c0763bfc>] (device_add) from [<c0767bbc>] (platform_device_add+0xfc/0x224)
[<c0767bbc>] (platform_device_add) from [<c07685a8>] (platform_device_register_full+0xf8/0x120)
[<c07685a8>] (platform_device_register_full) from [<c11122e0>] (omap2_common_pm_late_init+0x108/0x114)
[<c11122e0>] (omap2_common_pm_late_init) from [<c110f738>] (omap_common_late_init+0xc/0x14)
[<c110f738>] (omap_common_late_init) from [<c110fc30>] (dra7xx_init_late+0x8/0x14)
[<c110fc30>] (dra7xx_init_late) from [<c1103690>] (init_machine_late+0x20/0x98)
[<c1103690>] (init_machine_late) from [<c0301d5c>] (do_one_initcall+0x90/0x1d8)
[<c0301d5c>] (do_one_initcall) from [<c1100e14>] (kernel_init_freeable+0x15c/0x1fc)
[<c1100e14>] (kernel_init_freeable) from [<c0b26880>] (kernel_init+0x8/0xf0)
[<c0b26880>] (kernel_init) from [<c0307d78>] (ret_from_fork+0x14/0x3c)
Code: e92d4070 e1a04000 e1a05001 e1a06002 (e5900030)
---[ end trace d0b8b8949b1b4202 ]---
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
CPU1: stopping
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G D 4.5.0-rc4-next-20160215-00002-g08cd608 #895
Hardware name: Generic OMAP4 (Flattened Device Tree)
[<c0310290>] (unwind_backtrace) from [<c030b98c>] (show_stack+0x10/0x14)
[<c030b98c>] (show_stack) from [<c058c174>] (dump_stack+0x90/0xa4)
[<c058c174>] (dump_stack) from [<c030ea58>] (handle_IPI+0x174/0x194)
[<c030ea58>] (handle_IPI) from [<c030175c>] (gic_handle_irq+0x90/0x94)
[<c030175c>] (gic_handle_irq) from [<c030c4d4>] (__irq_svc+0x54/0x70)
Exception stack(0xee895eb0 to 0xee895ef8)
5ea0: 00200040 c140cb80 00000001 00000000
5ec0: 00000082 00000000 ee894000 00000001 c1302080 fa241100 ee895fe0 c1302504
5ee0: 00000001 ee895f00 c0344a8c c0344668 60000113 ffffffff
[<c030c4d4>] (__irq_svc) from [<c0344668>] (__do_softirq+0x90/0x214)
[<c0344668>] (__do_softirq) from [<c0344a8c>] (irq_exit+0xb0/0x118)
[<c0344a8c>] (irq_exit) from [<c0382f88>] (__handle_domain_irq+0x60/0xb4)
[<c0382f88>] (__handle_domain_irq) from [<c0301720>] (gic_handle_irq+0x54/0x94)
[<c0301720>] (gic_handle_irq) from [<c030c4d4>] (__irq_svc+0x54/0x70)
Exception stack(0xee895f88 to 0xee895fd0)
5f80: 00000001 00000000 00000000 c031af20 ee894000 c13024a4
5fa0: 00000000 00000000 c120d3a8 c12115d8 ee895fe0 c1302504 00000000 ee895fd8
5fc0: c030878c c0308790 60000113 ffffffff
[<c030c4d4>] (__irq_svc) from [<c0308790>] (arch_cpu_idle+0x38/0x3c)
[<c0308790>] (arch_cpu_idle) from [<c0377808>] (cpu_startup_entry+0x1e4/0x240)
[<c0377808>] (cpu_startup_entry) from [<80301b6c>] (0x80301b6c)
---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
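Editorial note on reading the oops above: the "Code:" line puts the faulting instruction in parentheses (e5900030), the register dump shows r0 = 00000000, and the reported fault address is 00000030. The sketch below decodes that word as an A32 load/store with an immediate offset to show how the three fit together. The helper names (decode_arm_ldst_imm(), effective_address()) are made up for this example, and the decoder deliberately handles only the plain offset form (P = 1, W = 0).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct arm_ldst_imm {
	int is_load;      /* 1 for LDR, 0 for STR */
	int is_byte;      /* 1 for a byte access, 0 for a word access */
	unsigned int rn;  /* base register number */
	unsigned int rd;  /* transfer register number */
	int32_t offset;   /* signed immediate offset */
};

/*
 * Decode an A32 LDR/STR with immediate offset and no writeback
 * (bits 27:25 == 010, P == 1, W == 0).
 * Returns 0 on success, -1 for any other instruction word.
 */
static int decode_arm_ldst_imm(uint32_t insn, struct arm_ldst_imm *out)
{
	assert(out != NULL);

	if (((insn >> 25) & 0x7u) != 0x2u)
		return -1;
	if (((insn >> 24) & 1u) != 1u || ((insn >> 21) & 1u) != 0u)
		return -1;

	out->is_load = (int)((insn >> 20) & 1u);
	out->is_byte = (int)((insn >> 22) & 1u);
	out->rn = (insn >> 16) & 0xfu;
	out->rd = (insn >> 12) & 0xfu;
	out->offset = (int32_t)(insn & 0xfffu);
	if (((insn >> 23) & 1u) == 0u)	/* U == 0: offset is subtracted */
		out->offset = -out->offset;
	return 0;
}

/* Address accessed when the base register holds the value 'base'. */
static uint32_t effective_address(uint32_t base, const struct arm_ldst_imm *d)
{
	return base + (uint32_t)d->offset;
}
```

Decoding 0xe5900030 this way gives a word load into r0 from [r0, #0x30]; with r0 = 0 the accessed address is 0x30, matching the fault address. That is consistent with regulator_set_voltage() having been handed a NULL pointer and reading a member at offset 0x30, although the trace alone does not say why the pointer was NULL.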
* Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
@ 2016-02-15 19:42 ` Tony Lindgren
0 siblings, 0 replies; 81+ messages in thread
From: Tony Lindgren @ 2016-02-15 19:42 UTC (permalink / raw)
To: linux-arm-kernel
* Rafael J. Wysocki <rjw@rjwysocki.net> [160215 11:28]:
>
> Guenter, Tony,
>
> Below is a patch to try, on top of linux-next.
Fixes the issue on UP for me:
Tested-by: Tony Lindgren <tony@atomide.com>
> Please let me know if the problem is still around with that patch applied.
It seems we still have another issue with SMP systems, see below.
Regards,
Tony
8< ------------------
Unable to handle kernel NULL pointer dereference at virtual address 00000030
pgd = c0204000
[00000030] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc4-next-20160215-00002-g08cd608 #895
Hardware name: Generic OMAP4 (Flattened Device Tree)
task: ee870000 ti: ee85e000 task.ti: ee85e000
PC is at regulator_set_voltage+0x10/0x54
LR is at _set_opp_voltage+0x30/0x98
pc : [<c0684270>] lr : [<c0774900>] psr: 00000113
sp : ee85fb20 ip : 00000001 fp : 000fa3e8
r10: 000fa3e8 r9 : 000fa3e8 r8 : 00000000
r7 : ef7ab050 r6 : 000fa3e8 r5 : 000fa3e8 r4 : 00000000
r3 : 000fa3e8 r2 : 000fa3e8 r1 : 000fa3e8 r0 : 00000000
Flags: nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 10c5387d Table: 8020404a DAC: 00000051
Process swapper/0 (pid: 1, stack limit = 0xee85e220)
Stack: (0xee85fb20 to 0xee860000)
fb20: 00000000 000fa3e8 000fa3e8 c0774900 eedc8500 11e1a300 00000000 11e1a300
fb40: ef7ab050 eedc8500 00000000 ef7ab050 eedc8540 c0775488 000fa3e8 00000000
fb60: 00000000 00124f80 00124f80 00124f80 11e1a300 23c34600 00000000 00000000
fb80: eea82e00 c144e250 eedc86c0 00000000 00000000 00000000 00000000 c096d8ec
fba0: ee85fbac ee85fbf8 00000001 00000000 00000010 000927c0 000493e0 00000021
fbc0: 00000010 00000000 c13bc9c4 c1302574 ef7bc598 0000001e eea82e00 c0971684
fbe0: 000927c0 eedc87c0 c1211598 00000000 c120d300 c1302670 001ef19f 00000000
fc00: c1302574 00000000 eea82e00 00000003 eedcbe04 c144e250 00000010 eea82eb4
fc20: c1302574 c0971ab0 c144e250 eea82e00 eea82e00 00000001 00000000 c144e250
fc40: 00000010 eea82e00 00000000 00000003 c13bc750 c144e250 00000010 eea82eb4
fc60: c1302574 c096eb20 eea82e00 00000000 eea82e08 c096f344 eedcca00 00000003
fc80: 0000ffff 00000003 00000000 00000000 eedc8440 000f6180 000493e0 000493e0
fca0: 000493e0 000f6180 000927c0 00000000 00000000 00000000 00000000 c13bc9c4
fcc0: 00000000 00000000 00000000 00000000 00000000 00000000 ffffffe0 eea82e60
fce0: eea82e60 c096f188 000493e0 000f6180 eedc86c0 c13bc750 c13bc750 eedc8700
fd00: eea82e84 eea82e84 ee9357c0 00000000 c13bc7d0 eedcc4b0 00000001 00000003
fd20: 00000000 00000000 eea82eac eea82eac ffff0001 eea82eb8 eea82eb8 00000000
fd40: 00000000 ee870000 00000000 00000000 00000000 eea82ed8 eea82ed8 00000000
fd60: eedc8780 eedc8680 eea82e00 c096fa00 00000001 60000113 eea82e04 00000000
fd80: ee85fdac c13bc7a4 c139e468 c13bc750 fffffdfb 00000000 00000000 00000000
fda0: 00000000 c0764dd0 c144e904 ee82fc5c ee99e4b4 00000000 c1334208 c13bcb30
fdc0: c144e250 c096e690 eedc8440 ef7ab050 eee32200 c0972368 eee32210 eee32210
fde0: c13bcae8 c0767e5c eee32210 c1449eac c1449eb4 c13bcae8 00000000 c07666c0
fe00: 00000000 ee85fe38 c07667fc 00000001 c1449e88 00000000 00000000 c0764ab4
fe20: ee82fb70 eedf3338 eee32210 eee32244 c139e3e8 c07663cc eee32210 00000001
fe40: eee32218 eee32218 eee32210 c139e3e8 00000000 c07658ac eee32218 eee32210
fe60: c139e260 c0763bfc c120ce1c c058e688 ee85fec0 eee32200 00000000 eee32200
fe80: eee32210 c1103670 00000000 c120ce1c 0000011a c0767bbc ee85fec0 eee32200
fea0: eedc8340 c1103670 00000000 c07685a8 c144e908 c1306810 eedc8340 c11122e0
fec0: 00000000 00000000 c0ec230c 00000000 00000000 00000000 00000000 00000000
fee0: 00000000 00000000 00000000 00000000 c1306810 c110f738 c1306810 c110fc30
ff00: c1306810 c1103690 c1306810 c0301d5c 00000000 c0463578 00000000 ee842b80
ff20: 00000000 c13356dc efffc0bf 0000011a c0c1d73c c035aac0 00000000 c0ebc080
ff40: c10095f8 00000000 00000007 00000007 c13356c4 00000007 c140a000 c140a000
ff60: 00000007 c140a000 c140a000 c11a1838 c11a183c c1100e14 00000007 00000007
ff80: 00000000 c1100594 00000000 c0b26878 00000000 00000000 00000000 00000000
ffa0: 00000000 c0b26880 00000000 c0307d78 00000000 00000000 00000000 00000000
ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 a1718b7d a59ff7d9
[<c0684270>] (regulator_set_voltage) from [<c0774900>] (_set_opp_voltage+0x30/0x98)
[<c0774900>] (_set_opp_voltage) from [<c0775488>] (dev_pm_opp_set_rate+0x170/0x28c)
[<c0775488>] (dev_pm_opp_set_rate) from [<c096d8ec>] (__cpufreq_driver_target+0x180/0x2b4)
[<c096d8ec>] (__cpufreq_driver_target) from [<c0971684>] (dbs_check_cpu+0x19c/0x1d0)
[<c0971684>] (dbs_check_cpu) from [<c0971ab0>] (cpufreq_governor_dbs+0x274/0x620)
[<c0971ab0>] (cpufreq_governor_dbs) from [<c096eb20>] (__cpufreq_governor+0xf0/0x1a4)
[<c096eb20>] (__cpufreq_governor) from [<c096f344>] (cpufreq_init_policy+0x64/0x8c)
[<c096f344>] (cpufreq_init_policy) from [<c096fa00>] (cpufreq_online+0x2f8/0x714)
[<c096fa00>] (cpufreq_online) from [<c0764dd0>] (subsys_interface_register+0x94/0xd8)
[<c0764dd0>] (subsys_interface_register) from [<c096e690>] (cpufreq_register_driver+0x14c/0x19c)
[<c096e690>] (cpufreq_register_driver) from [<c0972368>] (dt_cpufreq_probe+0x70/0xec)
[<c0972368>] (dt_cpufreq_probe) from [<c0767e5c>] (platform_drv_probe+0x4c/0xb0)
[<c0767e5c>] (platform_drv_probe) from [<c07666c0>] (driver_probe_device+0x214/0x2c0)
[<c07666c0>] (driver_probe_device) from [<c0764ab4>] (bus_for_each_drv+0x60/0x94)
[<c0764ab4>] (bus_for_each_drv) from [<c07663cc>] (__device_attach+0xb0/0x114)
[<c07663cc>] (__device_attach) from [<c07658ac>] (bus_probe_device+0x84/0x8c)
[<c07658ac>] (bus_probe_device) from [<c0763bfc>] (device_add+0x370/0x56c)
[<c0763bfc>] (device_add) from [<c0767bbc>] (platform_device_add+0xfc/0x224)
[<c0767bbc>] (platform_device_add) from [<c07685a8>] (platform_device_register_full+0xf8/0x120)
[<c07685a8>] (platform_device_register_full) from [<c11122e0>] (omap2_common_pm_late_init+0x108/0x114)
[<c11122e0>] (omap2_common_pm_late_init) from [<c110f738>] (omap_common_late_init+0xc/0x14)
[<c110f738>] (omap_common_late_init) from [<c110fc30>] (dra7xx_init_late+0x8/0x14)
[<c110fc30>] (dra7xx_init_late) from [<c1103690>] (init_machine_late+0x20/0x98)
[<c1103690>] (init_machine_late) from [<c0301d5c>] (do_one_initcall+0x90/0x1d8)
[<c0301d5c>] (do_one_initcall) from [<c1100e14>] (kernel_init_freeable+0x15c/0x1fc)
[<c1100e14>] (kernel_init_freeable) from [<c0b26880>] (kernel_init+0x8/0xf0)
[<c0b26880>] (kernel_init) from [<c0307d78>] (ret_from_fork+0x14/0x3c)
Code: e92d4070 e1a04000 e1a05001 e1a06002 (e5900030)
---[ end trace d0b8b8949b1b4202 ]---
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
CPU1: stopping
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G D 4.5.0-rc4-next-20160215-00002-g08cd608 #895
Hardware name: Generic OMAP4 (Flattened Device Tree)
[<c0310290>] (unwind_backtrace) from [<c030b98c>] (show_stack+0x10/0x14)
[<c030b98c>] (show_stack) from [<c058c174>] (dump_stack+0x90/0xa4)
[<c058c174>] (dump_stack) from [<c030ea58>] (handle_IPI+0x174/0x194)
[<c030ea58>] (handle_IPI) from [<c030175c>] (gic_handle_irq+0x90/0x94)
[<c030175c>] (gic_handle_irq) from [<c030c4d4>] (__irq_svc+0x54/0x70)
Exception stack(0xee895eb0 to 0xee895ef8)
5ea0: 00200040 c140cb80 00000001 00000000
5ec0: 00000082 00000000 ee894000 00000001 c1302080 fa241100 ee895fe0 c1302504
5ee0: 00000001 ee895f00 c0344a8c c0344668 60000113 ffffffff
[<c030c4d4>] (__irq_svc) from [<c0344668>] (__do_softirq+0x90/0x214)
[<c0344668>] (__do_softirq) from [<c0344a8c>] (irq_exit+0xb0/0x118)
[<c0344a8c>] (irq_exit) from [<c0382f88>] (__handle_domain_irq+0x60/0xb4)
[<c0382f88>] (__handle_domain_irq) from [<c0301720>] (gic_handle_irq+0x54/0x94)
[<c0301720>] (gic_handle_irq) from [<c030c4d4>] (__irq_svc+0x54/0x70)
Exception stack(0xee895f88 to 0xee895fd0)
5f80: 00000001 00000000 00000000 c031af20 ee894000 c13024a4
5fa0: 00000000 00000000 c120d3a8 c12115d8 ee895fe0 c1302504 00000000 ee895fd8
5fc0: c030878c c0308790 60000113 ffffffff
[<c030c4d4>] (__irq_svc) from [<c0308790>] (arch_cpu_idle+0x38/0x3c)
[<c0308790>] (arch_cpu_idle) from [<c0377808>] (cpu_startup_entry+0x1e4/0x240)
[<c0377808>] (cpu_startup_entry) from [<80301b6c>] (0x80301b6c)
---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
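The fault address in the oops above can be cross-checked against the bracketed word in the "Code:" line. What follows is a minimal sketch, not kernel code: a userspace C decoder for the A32 "LDR (immediate, offset addressing)" form, with hypothetical names (struct ldr_imm, decode_ldr_imm, ldr_effective_address) chosen for illustration. It assumes the bracketed word e5900030 is the faulting instruction, as is usual in ARM oops output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields of an A32 word load. */
struct ldr_imm {
	unsigned int rd;   /* destination register number */
	unsigned int rn;   /* base register number */
	uint32_t offset;   /* 12-bit immediate offset */
	bool add;          /* U bit: true = base + offset, false = base - offset */
};

/*
 * Decode "LDR Rd, [Rn, #+/-imm12]" (word load, offset addressing, no
 * writeback). Returns false for any other instruction form. The condition
 * field (bits 31:28) is ignored.
 */
bool decode_ldr_imm(uint32_t insn, struct ldr_imm *out)
{
	assert(out != NULL);

	/* bits 27:25 = 010, P (24) = 1, B (22) = 0, W (21) = 0, L (20) = 1 */
	if ((insn & 0x0f700000u) != 0x05100000u)
		return false;

	out->add = ((insn >> 23) & 1u) != 0;
	out->rn = (insn >> 16) & 0xfu;
	out->rd = (insn >> 12) & 0xfu;
	out->offset = insn & 0xfffu;
	return true;
}

/* Address the load would touch, given the value held in the base register. */
uint32_t ldr_effective_address(const struct ldr_imm *d, uint32_t rn_value)
{
	return d->add ? rn_value + d->offset : rn_value - d->offset;
}
```

Fed 0xe5900030, this decodes to a load of r0 from [r0, #0x30]. With r0 = 00000000 in the register dump, the effective address is 0x30, which matches the reported "virtual address 00000030" and suggests that the first argument of regulator_set_voltage() was NULL.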
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 19:42 ` Tony Lindgren
(?)
@ 2016-02-15 19:46 ` Guenter Roeck
-1 siblings, 0 replies; 81+ messages in thread
From: Guenter Roeck @ 2016-02-15 19:46 UTC (permalink / raw)
To: Tony Lindgren
Cc: Rafael J. Wysocki, Marc Zyngier, Viresh Kumar, Rafael J. Wysocki,
linux-next, Linux Kernel Mailing List, linux-arm-kernel,
linux-pm, Peter Zijlstra
On Mon, Feb 15, 2016 at 11:42:27AM -0800, Tony Lindgren wrote:
> * Rafael J. Wysocki <rjw@rjwysocki.net> [160215 11:28]:
> >
> > Guenter, Tony,
> >
> > Below is a patch to try, on top of linux-next.
>
> Fixes the issue on UP for me:
>
> Tested-by: Tony Lindgren <tony@atomide.com>
>
> > Please let me know if the problem is still around with that patch applied.
>
> It seems we still have another issue with SMP systems, see below.
>
Try https://patchwork.kernel.org/patch/8318221
Guenter
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 19:46 ` Guenter Roeck
(?)
@ 2016-02-15 19:57 ` Tony Lindgren
-1 siblings, 0 replies; 81+ messages in thread
From: Tony Lindgren @ 2016-02-15 19:57 UTC (permalink / raw)
To: Guenter Roeck
Cc: Rafael J. Wysocki, Marc Zyngier, Viresh Kumar, Rafael J. Wysocki,
linux-next, Linux Kernel Mailing List, linux-arm-kernel,
linux-pm, Peter Zijlstra
* Guenter Roeck <linux@roeck-us.net> [160215 11:47]:
> On Mon, Feb 15, 2016 at 11:42:27AM -0800, Tony Lindgren wrote:
> > * Rafael J. Wysocki <rjw@rjwysocki.net> [160215 11:28]:
> > >
> > > Guenter, Tony,
> > >
> > > Below is a patch to try, on top of linux-next.
> >
> > Fixes the issue on UP for me:
> >
> > Tested-by: Tony Lindgren <tony@atomide.com>
> >
> > > Please let me know if the problem is still around with that patch applied.
> >
> > It seems we still have another issue with SMP systems, see below.
> >
> Try https://patchwork.kernel.org/patch/8318221
Great, that one fixes the SMP issue for me. So for patchwork
patch 8318221, here's a cross-thread Tested-by, as it looks like
I was not on Cc for it:
Tested-by: Tony Lindgren <tony@atomide.com>
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 19:03 ` Marc Zyngier
(?)
@ 2016-02-15 19:23 ` Russell King - ARM Linux
-1 siblings, 0 replies; 81+ messages in thread
From: Russell King - ARM Linux @ 2016-02-15 19:23 UTC (permalink / raw)
To: Marc Zyngier
Cc: Rafael J. Wysocki, linux-pm, Peter Zijlstra, Viresh Kumar,
Rafael J. Wysocki, Linux Kernel Mailing List, linux-next,
Guenter Roeck, linux-arm-kernel
On Mon, Feb 15, 2016 at 07:03:33PM +0000, Marc Zyngier wrote:
> On 15/02/16 18:54, Rafael J. Wysocki wrote:
> > That would explain it, thanks.
> >
> > So it looks like we should always use irq_work_queue() on UP even if
> > CONFIG_SMP is set, shouldn't we?
>
> Something like that, yes. CONFIG_SMP is not an indication of an SMP
> system anymore (we've even dropped the config option on arm64).
>
> Hopefully num_possible_cpus() is reliable enough to let you do the right
> thing...
CONFIG_SMP just says whether to include support for SMP. It doesn't
mandate running on an SMP system. :)
I've been looking around the usages of irq_work_queue_on in kernel/
in -rc4, and some places seem to check for "this CPU":
/*
* It is possible that a restart caused this CPU to be
* chosen again. Don't bother with an IPI, just see if we
* have more to push.
*/
if (unlikely(cpu == rq->cpu))
goto again;
/* Try the next RT overloaded CPU */
irq_work_queue_on(&rt_rq->push_work, cpu);
I'm not sure about tell_cpu_to_push().
It's also called via tick_nohz_full_kick_cpu(), and the core scheduler
avoids calling this for the current CPU:
if (tick_nohz_full_cpu(cpu)) {
if (cpu != smp_processor_id() ||
tick_nohz_tick_stopped())
tick_nohz_full_kick_cpu(cpu);
I'm not sure about add_nr_running() in kernel/sched/sched.h - I think
that _could_ be a problem even without Rafael's cpufreq change.
So... the question is what do we do with irq_work_queue_on() in general
when called on non-SMP systems.
--
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 19:23 ` Russell King - ARM Linux
(?)
@ 2016-02-15 20:41 ` Rafael J. Wysocki
-1 siblings, 0 replies; 81+ messages in thread
From: Rafael J. Wysocki @ 2016-02-15 20:41 UTC (permalink / raw)
To: Russell King - ARM Linux
Cc: Marc Zyngier, Rafael J. Wysocki, linux-pm, Peter Zijlstra,
Viresh Kumar, Rafael J. Wysocki, Linux Kernel Mailing List,
linux-next, Guenter Roeck, linux-arm-kernel
On Mon, Feb 15, 2016 at 8:23 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Mon, Feb 15, 2016 at 07:03:33PM +0000, Marc Zyngier wrote:
>> On 15/02/16 18:54, Rafael J. Wysocki wrote:
>> > That would explain it, thanks.
>> >
>> > So it looks like we should always use irq_work_queue() on UP even if
>> > CONFIG_SMP is set, shouldn't we?
>>
>> Something like that, yes. CONFIG_SMP is not an indication of an SMP
>> system anymore (we've even dropped the config option on arm64).
>>
>> Hopefully num_possible_cpus() is reliable enough to let you do the right
>> thing...
>
> CONFIG_SMP just says whether to include support for SMP. It doesn't
> mandate running on a SMP system. :)
>
> I've been looking around the usages of irq_work_queue_on in kernel/
> in -rc4, and some places seem to check for "this CPU":
>
> /*
> * It is possible that a restart caused this CPU to be
> * chosen again. Don't bother with an IPI, just see if we
> * have more to push.
> */
> if (unlikely(cpu == rq->cpu))
> goto again;
>
> /* Try the next RT overloaded CPU */
> irq_work_queue_on(&rt_rq->push_work, cpu);
>
> I'm not sure about tell_cpu_to_push().
>
> It's also called via tick_nohz_full_kick_cpu(), and the core scheduler
> avoids calling this for the current CPU:
>
> if (tick_nohz_full_cpu(cpu)) {
> if (cpu != smp_processor_id() ||
> tick_nohz_tick_stopped())
> tick_nohz_full_kick_cpu(cpu);
>
> I'm not sure about add_nr_running() in kernel/sched/sched.h - I think
> that _could_ be a problem even without Rafael's cpufreq change.
>
> So... the question is what do we do with irq_work_queue_on() in general
> when called on non-SMP systems.
I guess it might fall back to arch_irq_work_raise() when asked to
queue on the same CPU, so long as that will always do the right thing
(i.e. actually queue on the same one).
Thanks,
Rafael
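The fallback idea discussed above can be sketched as a small decision function. This is a userspace C model under stated assumptions, not the kernel's irq_work code: pick_raise_path() and enum raise_path are hypothetical names, and the three parameters stand in for the target CPU, the result of smp_processor_id(), and the result of num_possible_cpus().

```c
#include <assert.h>

/* Hypothetical: which mechanism a queue-on-CPU request would end up using. */
enum raise_path {
	RAISE_LOCAL,	/* raise the work on the current CPU, no IPI involved */
	RAISE_IPI	/* send a cross-CPU interrupt to the target */
};

/*
 * Model of the proposed fallback: never send an IPI when the machine has
 * only one possible CPU, or when the target is the CPU already running.
 */
enum raise_path pick_raise_path(unsigned int target_cpu,
				unsigned int this_cpu,
				unsigned int nr_possible_cpus)
{
	/* A running system always has at least one possible CPU. */
	assert(nr_possible_cpus >= 1);

	if (nr_possible_cpus == 1 || target_cpu == this_cpu)
		return RAISE_LOCAL;

	return RAISE_IPI;
}
```

In this model a kernel built with CONFIG_SMP but booted on a UP machine always takes the local path, so the IPI hook that the platform never registered is not reached.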
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 18:54 ` Rafael J. Wysocki
(?)
@ 2016-02-15 19:07 ` Russell King - ARM Linux
-1 siblings, 0 replies; 81+ messages in thread
From: Russell King - ARM Linux @ 2016-02-15 19:07 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Marc Zyngier, Peter Zijlstra, Viresh Kumar, linux-pm,
Rafael J. Wysocki, Linux Kernel Mailing List, linux-next,
linux-arm-kernel, Guenter Roeck
On Mon, Feb 15, 2016 at 07:54:26PM +0100, Rafael J. Wysocki wrote:
> On Mon, Feb 15, 2016 at 7:49 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> > Given that OMAP3 is a UP system, there is zero chance that it has
> > registered the magic hook that delivers IPIs (its interrupt controller
> > is not even capable of doing so).
> >
> > I don't really know the context, but IPIs on a UP system seem at best odd.
>
> That would explain it, thanks.
>
> So it looks like we should always use irq_work_queue() on UP even if
> CONFIG_SMP is set, shouldn't we?
irq_work_queue_on() doesn't check whether 'cpu' is the CPU that we're
running on. This is a problem where we want to be able to run a kernel
built for SMP on a UP system.
I guess the question is whether irq_work_queue_on() is buggy, or whether
our implementation of arch_send_call_function_single_ipi() is buggy.
Should arch_send_call_function_single_ipi() do something on UP systems,
and if so, what?
We don't have IPIs on UP systems, so we can't raise any interrupts.
So, should we call generic_smp_call_function_interrupt() directly
from it?
Some clues would be good...
--
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
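
The failure mode and the question raised above can be sketched in
userspace C. This is only a model under assumptions: the model_ names are
hypothetical, the IPI hook is a plain function pointer that stays NULL
until something registers it, and "calling the handler directly" is shown
purely as control flow, not as a claim that it is safe in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an IPI delivery hook; not the kernel's types. */
typedef void (*model_ipi_hook_t)(int cpu);

static model_ipi_hook_t model_cross_call; /* NULL until a hook is registered */
static int model_hook_calls;              /* IPIs handed to the hook */
static int model_handler_runs;            /* direct handler invocations */

/* An SMP-capable interrupt controller would register a hook like this. */
static void model_register_ipi_hook(model_ipi_hook_t fn)
{
	model_cross_call = fn;
}

/* Example hook that only counts how often it was asked to send an IPI. */
static void model_counting_hook(int cpu)
{
	(void)cpu;
	model_hook_calls++;
}

/* Stand-in for the work the IPI handler would do on the target CPU. */
static void model_call_function_handler(void)
{
	model_handler_runs++;
}

/*
 * Without the NULL check, calling through an unregistered hook jumps to
 * address 0, which matches "PC is at 0x0" in the oops. Here the handler
 * is run directly instead. Returns true if a real IPI was requested.
 */
static bool model_send_call_function_single_ipi(int cpu)
{
	if (model_cross_call == NULL) {
		model_call_function_handler();
		return false;
	}
	model_cross_call(cpu);
	return true;
}
```

In this model a UP machine never registers a hook, so every request ends
up running the handler locally rather than dereferencing a NULL pointer.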
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 18:41 ` Rafael J. Wysocki
(?)
@ 2016-02-15 19:01 ` Tony Lindgren
-1 siblings, 0 replies; 81+ messages in thread
From: Tony Lindgren @ 2016-02-15 19:01 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Guenter Roeck, Viresh Kumar, linux-pm, Peter Zijlstra,
Rafael J. Wysocki, Linux Kernel Mailing List, linux-next,
linux-arm-kernel
* Rafael J. Wysocki <rafael@kernel.org> [160215 10:44]:
> On Mon, Feb 15, 2016 at 6:05 PM, Guenter Roeck <linux@roeck-us.net> wrote:
> > Rafael,
>
> Hi,
>
> Thanks for the report!
>
> > I see crashes in various arm qemu tests due to 'cpufreq: governor: Replace
> > timers with utilization update callbacks' with next-20160215. An example
> > crash log and bisect results are attached below.
> >
> > Please let me know if there is anything I can do to help tracking down
> > the problem.
>
> It looks like we've uncovered some nastiness in the arch ARM code (see below).
>
> [cut]
>
> > [ 1.340000] Unable to handle kernel NULL pointer dereference at virtual address 00000000
> > [ 1.340000] pgd = c0204000
> > [ 1.340000] [00000000] *pgd=00000000
> > [ 1.340000] Internal error: Oops: 80000005 [#1] SMP ARM
> > [ 1.340000] Modules linked in:
> > [ 1.340000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc4-next-20160215 #1
> > [ 1.340000] Hardware name: Generic OMAP3-GP (Flattened Device Tree)
> > [ 1.340000] task: cb060000 ti: cb05a000 task.ti: cb05a000
> > [ 1.340000] PC is at 0x0
> > [ 1.340000] LR is at arch_send_call_function_single_ipi+0x34/0x38
>
> Since this is ARM, arch_send_call_function_single_ipi() looks like this:
>
> void arch_send_call_function_single_ipi(int cpu)
> {
>         smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC_SINGLE);
> }
>
> so I'm not sure how the NULL pointer deref is possible even.
>
> The only thing coming to mind would be that cpumask_of(cpu) triggers
> this, but I'm not sure how exactly that can happen.
>
> I need help from somebody who knows how this low-level stuff works on ARM.
That's not even an SMP machine? I suspect a bunch of the
65 boot failures here are related to this:
https://kernelci.org/boot/all/job/next/kernel/next-20160215/
The SMP ones seem to fail with some regulator issues?
Regards,
Tony
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 19:01 ` Tony Lindgren
(?)
@ 2016-02-15 19:40 ` Guenter Roeck
-1 siblings, 0 replies; 81+ messages in thread
From: Guenter Roeck @ 2016-02-15 19:40 UTC (permalink / raw)
To: Tony Lindgren, Rafael J. Wysocki
Cc: Viresh Kumar, linux-pm, Peter Zijlstra, Rafael J. Wysocki,
Linux Kernel Mailing List, linux-next, linux-arm-kernel
On 02/15/2016 11:01 AM, Tony Lindgren wrote:
> * Rafael J. Wysocki <rafael@kernel.org> [160215 10:44]:
>> On Mon, Feb 15, 2016 at 6:05 PM, Guenter Roeck <linux@roeck-us.net> wrote:
>>> Rafael,
>>
>> Hi,
>>
>> Thanks for the report!
>>
>>> I see crashes in various arm qemu tests due to 'cpufreq: governor: Replace
>>> timers with utilization update callbacks' with next-20160215. An example
>>> crash log and bisect results are attached below.
>>>
>>> Please let me know if there is anything I can do to help tracking down
>>> the problem.
>>
>> It looks like we've uncovered some nastiness in the arch ARM code (see below).
>>
>> [cut]
>>
>>> [ 1.340000] Unable to handle kernel NULL pointer dereference at virtual address 00000000
>>> [ 1.340000] pgd = c0204000
>>> [ 1.340000] [00000000] *pgd=00000000
>>> [ 1.340000] Internal error: Oops: 80000005 [#1] SMP ARM
>>> [ 1.340000] Modules linked in:
>>> [ 1.340000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc4-next-20160215 #1
>>> [ 1.340000] Hardware name: Generic OMAP3-GP (Flattened Device Tree)
>>> [ 1.340000] task: cb060000 ti: cb05a000 task.ti: cb05a000
>>> [ 1.340000] PC is at 0x0
>>> [ 1.340000] LR is at arch_send_call_function_single_ipi+0x34/0x38
>>
>> Since this is ARM, arch_send_call_function_single_ipi() looks like this:
>>
>> void arch_send_call_function_single_ipi(int cpu)
>> {
>>         smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC_SINGLE);
>> }
>>
>> so I'm not sure how the NULL pointer deref is possible even.
>>
>> The only thing coming to mind would be that cpumask_of(cpu) triggers
>> this, but I'm not sure how exactly that can happen.
>>
>> I need help from somebody who knows how this low-level stuff works on ARM.
>
> That's not even an SMP machine? I suspect a bunch of out of the
> 65 boot failures here are related to this:
>
> https://kernelci.org/boot/all/job/next/kernel/next-20160215/
>
> The SMP ones seem to fail with some regulator issues?
>
There is another problem, introduced with 6a0712f6f199e ("PM / OPP: Add
dev_pm_opp_set_rate()"). The kernelci boot logs for next-20160212:omap3-overo-tobi
and others show that problem.
Essentially, the code now assumes that a CPU clock always has a voltage
regulator attached to it, which is not correct. I sent out a patch to fix
that problem a minute ago.
Guenter
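
As an illustration of the assumption described above, here is a minimal
userspace sketch in which the regulator is optional. It is not the patch
mentioned in this message and not the OPP API: the model_ structures and
model_set_rate() are hypothetical, and the real ordering of voltage and
frequency changes is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not kernel structures. */
struct model_regulator {
	int microvolts;
};

struct model_cpu_dev {
	long rate_hz;
	struct model_regulator *reg; /* NULL when the CPU clock has no regulator */
};

/*
 * Change the clock rate, touching the regulator only if one exists.
 * Code that assumes dev->reg is always valid would fail on boards
 * without a voltage regulator for the CPU clock.
 */
static int model_set_rate(struct model_cpu_dev *dev, long rate_hz,
			  int microvolts)
{
	assert(dev != NULL);

	if (dev->reg != NULL)
		dev->reg->microvolts = microvolts;

	dev->rate_hz = rate_hz;
	return 0;
}
```

A device without a regulator simply skips the voltage step in this model,
while a device with one gets both the voltage and the rate updated.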
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 19:40 ` Guenter Roeck
(?)
@ 2016-02-15 19:58 ` Tony Lindgren
-1 siblings, 0 replies; 81+ messages in thread
From: Tony Lindgren @ 2016-02-15 19:58 UTC (permalink / raw)
To: Guenter Roeck
Cc: Rafael J. Wysocki, Viresh Kumar, linux-pm, Peter Zijlstra,
Rafael J. Wysocki, Linux Kernel Mailing List, linux-next,
linux-arm-kernel
* Guenter Roeck <linux@roeck-us.net> [160215 11:41]:
> On 02/15/2016 11:01 AM, Tony Lindgren wrote:
> >
> >https://kernelci.org/boot/all/job/next/kernel/next-20160215/
> >
> >The SMP ones seem to fail with some regulator issues?
> >
>
> There is another problem, introduced with 6a0712f6f199e ("PM / OPP: Add
> dev_pm_opp_set_rate()"). The kernelci boot log for next-20160212:omap3-overo-tobi
> and others experience that problem.
>
> Essentially, the code now assumes that a CPU clock always has a voltage
> regulator attached to it, which is not correct. I sent out a patch to fix
> that problem a minute ago.
Yes, that fixed it, thanks.
Tony
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 19:58 ` Tony Lindgren
(?)
@ 2016-02-15 20:09 ` Guenter Roeck
-1 siblings, 0 replies; 81+ messages in thread
From: Guenter Roeck @ 2016-02-15 20:09 UTC (permalink / raw)
To: Tony Lindgren
Cc: Rafael J. Wysocki, Viresh Kumar, linux-pm, Peter Zijlstra,
Rafael J. Wysocki, Linux Kernel Mailing List, linux-next,
linux-arm-kernel
On 02/15/2016 11:58 AM, Tony Lindgren wrote:
> * Guenter Roeck <linux@roeck-us.net> [160215 11:41]:
>> On 02/15/2016 11:01 AM, Tony Lindgren wrote:
>>>
>>> https://kernelci.org/boot/all/job/next/kernel/next-20160215/
>>>
>>> The SMP ones seem to fail with some regulator issues?
>>>
>>
>> There is another problem, introduced with 6a0712f6f199e ("PM / OPP: Add
>> dev_pm_opp_set_rate()"). The kernelci boot log for next-20160212:omap3-overo-tobi
>> and others experience that problem.
>>
>> Essentially, the code now assumes that a CPU clock always has a voltage
>> regulator attached to it, which is not correct. I sent out a patch to fix
>> that problem a minute ago.
>
> Yes that fixed it thanks.
>
Confirmed. With this patch plus mine, all arm qemu tests are again passing for me.
Thanks,
Guenter
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 20:09 ` Guenter Roeck
(?)
@ 2016-02-15 20:38 ` Rafael J. Wysocki
-1 siblings, 0 replies; 81+ messages in thread
From: Rafael J. Wysocki @ 2016-02-15 20:38 UTC (permalink / raw)
To: Guenter Roeck
Cc: Tony Lindgren, Rafael J. Wysocki, Viresh Kumar, linux-pm,
Peter Zijlstra, Rafael J. Wysocki, Linux Kernel Mailing List,
linux-next, linux-arm-kernel
On Mon, Feb 15, 2016 at 9:09 PM, Guenter Roeck <linux@roeck-us.net> wrote:
> On 02/15/2016 11:58 AM, Tony Lindgren wrote:
>>
>> * Guenter Roeck <linux@roeck-us.net> [160215 11:41]:
>>>
>>> On 02/15/2016 11:01 AM, Tony Lindgren wrote:
>>>>
>>>>
>>>> https://kernelci.org/boot/all/job/next/kernel/next-20160215/
>>>>
>>>> The SMP ones seem to fail with some regulator issues?
>>>>
>>>
>>> There is another problem, introduced with 6a0712f6f199e ("PM / OPP: Add
>>> dev_pm_opp_set_rate()"). The kernelci boot log for
>>> next-20160212:omap3-overo-tobi
>>> and others experience that problem.
>>>
>>> Essentially, the code now assumes that a CPU clock always has a voltage
>>> regulator attached to it, which is not correct. I sent out a patch to fix
>>> that problem a minute ago.
>>
>>
>> Yes that fixed it thanks.
>>
>
> Confirmed. With this patch plus mine, all arm qemu tests are again passing
> for me.
OK, I'll add it to the governor changes branch.
Thanks,
Rafael
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 19:58 ` Tony Lindgren
(?)
@ 2016-02-15 20:37 ` Rafael J. Wysocki
-1 siblings, 0 replies; 81+ messages in thread
From: Rafael J. Wysocki @ 2016-02-15 20:37 UTC (permalink / raw)
To: Tony Lindgren, Guenter Roeck
Cc: Viresh Kumar, linux-pm, Peter Zijlstra, Rafael J. Wysocki,
Linux Kernel Mailing List, linux-next, linux-arm-kernel
On Mon, Feb 15, 2016 at 8:58 PM, Tony Lindgren <tony@atomide.com> wrote:
> * Guenter Roeck <linux@roeck-us.net> [160215 11:41]:
>> On 02/15/2016 11:01 AM, Tony Lindgren wrote:
>> >
>> >https://kernelci.org/boot/all/job/next/kernel/next-20160215/
>> >
>> >The SMP ones seem to fail with some regulator issues?
>> >
>>
>> There is another problem, introduced with 6a0712f6f199e ("PM / OPP: Add
>> dev_pm_opp_set_rate()"). The kernelci boot log for next-20160212:omap3-overo-tobi
>> and others experience that problem.
>>
>> Essentially, the code now assumes that a CPU clock always has a voltage
>> regulator attached to it, which is not correct. I sent out a patch to fix
>> that problem a minute ago.
>
> Yes that fixed it thanks.
Can you please also check if this alternative fix from Viresh works:
https://patchwork.kernel.org/patch/8316611/
?
Thanks,
Rafael
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 20:37 ` Rafael J. Wysocki
(?)
@ 2016-02-15 21:36 ` Tony Lindgren
-1 siblings, 0 replies; 81+ messages in thread
From: Tony Lindgren @ 2016-02-15 21:36 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Guenter Roeck, Viresh Kumar, linux-pm, Peter Zijlstra,
Rafael J. Wysocki, Linux Kernel Mailing List, linux-next,
linux-arm-kernel
* Rafael J. Wysocki <rafael@kernel.org> [160215 12:39]:
> On Mon, Feb 15, 2016 at 8:58 PM, Tony Lindgren <tony@atomide.com> wrote:
> > * Guenter Roeck <linux@roeck-us.net> [160215 11:41]:
> >> On 02/15/2016 11:01 AM, Tony Lindgren wrote:
> >> >
> >> >https://kernelci.org/boot/all/job/next/kernel/next-20160215/
> >> >
> >> >The SMP ones seem to fail with some regulator issues?
> >> >
> >>
> >> There is another problem, introduced with 6a0712f6f199e ("PM / OPP: Add
> >> dev_pm_opp_set_rate()"). The kernelci boot log for next-20160212:omap3-overo-tobi
> >> and others experience that problem.
> >>
> >> Essentially, the code now assumes that a CPU clock always has a voltage
> >> regulator attached to it, which is not correct. I sent out a patch to fix
> >> that problem a minute ago.
> >
> > Yes that fixed it thanks.
>
> Can you please also check if this alternative fix from Viresh works:
>
> https://patchwork.kernel.org/patch/8316611/
Yes, that one too seems to fix the issue on SMP systems for me:
Tested-by: Tony Lindgren <tony@atomide.com>
Regards,
Tony
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 21:36 ` Tony Lindgren
(?)
@ 2016-02-16 1:38 ` Guenter Roeck
-1 siblings, 0 replies; 81+ messages in thread
From: Guenter Roeck @ 2016-02-16 1:38 UTC (permalink / raw)
To: Tony Lindgren, Rafael J. Wysocki
Cc: Viresh Kumar, linux-pm, Peter Zijlstra, Rafael J. Wysocki,
Linux Kernel Mailing List, linux-next, linux-arm-kernel
On 02/15/2016 01:36 PM, Tony Lindgren wrote:
> * Rafael J. Wysocki <rafael@kernel.org> [160215 12:39]:
>> On Mon, Feb 15, 2016 at 8:58 PM, Tony Lindgren <tony@atomide.com> wrote:
>>> * Guenter Roeck <linux@roeck-us.net> [160215 11:41]:
>>>> On 02/15/2016 11:01 AM, Tony Lindgren wrote:
>>>>>
>>>>> https://kernelci.org/boot/all/job/next/kernel/next-20160215/
>>>>>
>>>>> The SMP ones seem to fail with some regulator issues?
>>>>>
>>>>
>>>> There is another problem, introduced with 6a0712f6f199e ("PM / OPP: Add
>>>> dev_pm_opp_set_rate()"). The kernelci boot log for next-20160212:omap3-overo-tobi
>>>> and others experience that problem.
>>>>
>>>> Essentially, the code now assumes that a CPU clock always has a voltage
>>>> regulator attached to it, which is not correct. I sent out a patch to fix
>>>> that problem a minute ago.
>>>
>>> Yes that fixed it thanks.
>>
>> Can you please also check if this alternative fix from Viresh works:
>>
>> https://patchwork.kernel.org/patch/8316611/
>
> Yes that one too seems to fix the issue on SMP systems for
> me:
>
> Tested-by: Tony Lindgren <tony@atomide.com>
>
Same here.
Tested-by: Guenter Roeck <linux@roeck-us.net>
Guenter
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 18:41 ` Rafael J. Wysocki
(?)
@ 2016-02-15 19:02 ` Russell King - ARM Linux
-1 siblings, 0 replies; 81+ messages in thread
From: Russell King - ARM Linux @ 2016-02-15 19:02 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Guenter Roeck, Viresh Kumar, linux-pm, Peter Zijlstra,
Rafael J. Wysocki, Linux Kernel Mailing List, linux-next,
linux-arm-kernel
On Mon, Feb 15, 2016 at 07:41:21PM +0100, Rafael J. Wysocki wrote:
> Since this is ARM, arch_send_call_function_single_ipi() looks like this:
>
> void arch_send_call_function_single_ipi(int cpu)
> {
> smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC_SINGLE);
> }
>
> so I'm not sure how the NULL pointer deref is possible even.
smp_cross_call() is a function pointer, and the hint is:
> I need help from somebody who knows how this low-level stuff works on ARM.
>
> > [ 1.340000] pc : [<00000000>] lr : [<c030de78>] psr: 20000193
here that the PC is zero. It's initialised via set_smp_cross_call(),
which should be happening in drivers/irqchip/irq-gic.c for SMP
capable systems.
However, looking at this, this is an OMAP34xx-based Beagle board, which
is a single-CPU SoC. There are no other CPUs to send IPIs to.
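
[Editorial sketch] The failure mode described above, a call through a function pointer that was never initialised, can be modelled in user space as follows. This is a minimal illustration and not the ARM kernel code. The names cross_call, set_cross_call, fake_gic_raise and send_ipi are hypothetical; only set_smp_cross_call() and smp_cross_call() come from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the IPI hook: it stays NULL until an
 * interrupt-controller driver registers a handler. */
typedef void (*cross_call_fn)(unsigned int cpu, unsigned int ipi);

static cross_call_fn cross_call;   /* zero-initialised, i.e. NULL */
static unsigned int ipis_sent;

/* Models set_smp_cross_call(): only SMP-capable platforms would call it. */
static void set_cross_call(cross_call_fn fn)
{
    assert(fn != NULL);
    cross_call = fn;
}

/* Hypothetical handler that just counts the IPIs it is asked to raise. */
static void fake_gic_raise(unsigned int cpu, unsigned int ipi)
{
    (void)cpu;
    (void)ipi;
    ipis_sent++;
}

/* Guarded caller: returns 0 on success, -1 if nothing was registered.
 * An unguarded call through the NULL pointer would jump to address 0,
 * which is what "PC is at 0x0" in the oops corresponds to. */
static int send_ipi(unsigned int cpu, unsigned int ipi)
{
    if (cross_call == NULL)
        return -1;
    cross_call(cpu, ipi);
    return 0;
}
```

On a platform that never registers a handler, send_ipi() here reports failure, whereas an unguarded call would branch to address zero.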
> > [ 1.340000] sp : cb05b7c0 ip : 00000000 fp : cb05b83c
> > [ 1.340000] r10: cfb8c0c0 r9 : 00000000 r8 : cb18a4c0
> > [ 1.340000] r7 : 00000005 r6 : 00000005 r5 : cb5c0334 r4 : 00000000
> > [ 1.340000] r3 : 00000000 r2 : c0c06a7c r1 : 00000003 r0 : c0c06a7c
> > [ 1.340000] Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none
> > [ 1.340000] Control: 10c5387d Table: 80204059 DAC: 00000051
> > [ 1.340000] Process swapper/0 (pid: 1, stack limit = 0xcb05a220)
> > [ 1.340000] Stack: (0xcb05b7c0 to 0xcb05c000)
> > [ 1.340000] b7c0: 00000000 c03b3350 4fdec700 00000000 00000005 c0959a84 ffffffff 00000000
> > [ 1.340000] b7e0: ffffffff cb18a4c0 cfb8c0c0 c03732d8 4c4b4000 cb18a4c0 cfb8c0c0 cfb8c0c0
> > [ 1.340000] b800: 0e979000 cb18a4c0 cfb8c0c0 00000005 0e979000 c12130c0 00000000 cfb8c0c0
> > [ 1.340000] b820: cb05b83c c0360d28 00000000 cb18a4c0 cfb8c0c0 60000193 cb05b84c c0360fc0
> > [ 1.340000] b840: cb18a4c0 cb18a8b4 cb05b87c c0361b74 cfb8c100 00000141 cb05b934 cb1c1cc0
> > [ 1.340000] b860: 00000002 00000000 00000000 00000048 c1416d0c cb0096c0 00000001 c0381de0
> > [ 1.340000] b880: c1416080 cfb8c100 00000400 cb0096c0 cb009720 00000000 00000038 cb003000
> > [ 1.340000] b8a0: 00000000 cb05b9c4 00000a28 c0381ea4 cb0096c0 cb0096d0 00000000 c0385150
> > [ 1.340000] b8c0: c03850ac c1211518 00000000 c038168c 00000155 c0381788 c0932830 20000013
> > [ 1.340000] b8e0: ffffffff cb05b924 00000000 c030bad4 00000001 00000009 00000002 fa070024
> > [ 1.340000] b900: cb127c10 00009401 cb05b9b8 c1302100 00000000 00000000 cb05b9c4 00000a28
> > [ 1.340000] b920: 00000000 cb05b940 00009601 c0932830 20000013 ffffffff 00000051 c093261c
> > [ 1.340000] b940: 00000014 cb127c58 00000002 00000001 000f4240 cb127c10 1443fd00 00000001
> > [ 1.340000] b960: c1302100 cb127c58 cb05b9b8 00000002 c145d438 ffff16ac 00000001 c0928358
> > [ 1.340000] b980: cb127c74 cb127c58 00000002 cb05b9b8 cb05ba97 00000001 cb05ba97 00000001
> > [ 1.340000] b9a0: 00000001 c0928538 00000000 cb518000 cb513740 c07726c4 0000004b cfb80001
> > [ 1.340000] b9c0: cb513740 0001004b 017d0001 cb05ba97 00000000 c076dc30 00000001 00000000
> > [ 1.340000] b9e0: 00000004 000000b9 000000ba cb518000 000000ba 000000b9 00000001 c076dd70
> > [ 1.340000] ba00: 00000000 00000000 cfb8c100 cb518000 000000ba 00000001 00000001 cb05ba97
> > [ 1.340000] ba20: 00000001 000000b9 00000000 c076dfcc c099a208 cb59d048 00000001 c1336dd0
> > [ 1.340000] ba40: a0000113 00000000 00000001 cb05ba97 0000005e 00000004 00000001 00000000
> > [ 1.340000] ba60: 00000000 000ee098 000ee098 c077fd34 0000000d c09e51f0 c09e51d0 cb51f400
> > [ 1.340000] ba80: ffffffff 000ee098 000ee098 c068cb48 00000000 c09c157c cb019180 c067887c
> > [ 1.340000] baa0: cb51f400 c067a700 000ee098 c09c160c cb015780 00000000 3b9aca00 cb5bdcc0
> > [ 1.340000] bac0: cb51f400 00000000 00000000 00000000 000ee098 c067ab5c 000ee098 000ee098
> > [ 1.340000] bae0: cb5bdcc0 000ee098 000ee098 000ee098 cfb87050 00000000 000ee098 c067c614
> > [ 1.340000] bb00: cb5bdcc0 000ee098 000ee098 c0765ad4 1dcd6500 cb5bdc80 00000000 07735940
> > [ 1.340000] bb20: cb5bdc80 cfb87050 cb5bdcc0 00000000 000ee098 c076660c 000ee098 cb5c11d0
> > [ 1.340000] bb40: cb05bb90 00124f80 00124f80 00124f80 07735940 1dcd6500 ffffffff cb5c1100
> > [ 1.340000] bb60: 00000000 00000000 c145dc8c cb5c0280 00000000 00000001 cb05bb90 c0958e78
> > [ 1.340000] bb80: cb05bb8c c13cb404 00000000 00000000 00000010 0007a120 0001e848 00000021
> > [ 1.340000] bba0: ffffffff ee222d90 00000000 00000000 00000000 00000010 cfb8b598 c13cb310
> > [ 1.340000] bbc0: c1302578 c095ca58 c1302578 00000000 cb5c1100 00000000 000927c0 cb5bdfc0
> > [ 1.340000] bbe0: c120e300 00000000 ee32cf60 00000000 c13cb310 cb5c1100 00000000 cb5c0304
> > [ 1.340000] bc00: 00000010 c145dc8c c1302578 cb5c11b4 cb5c1108 c095cd04 c145dc8c 00000001
> > [ 1.340000] bc20: cb5c1100 cb5c1100 00000000 c145dc8c c1302578 00000003 cb5c1100 00000000
> > [ 1.340000] bc40: 00000010 c145dc8c c1302578 cb5c11b4 cb5c1108 c0959c5c cb5c1100 00000000
> > [ 1.340000] bc60: 00000000 c095a2dc c0c0df58 00000001 0000ffff 00000001 00000000 00000000
> > [ 1.340000] bc80: cb5bdc00 000927c0 0001e848 000493e0 0001e848 000927c0 0007a120 00000000
> > [ 1.340000] bca0: 00000000 00000000 00000000 c13cb310 00000000 00000000 00000000 00000000
> > [ 1.340000] bcc0: 00000000 00000000 ffffffe0 cb5c1160 cb5c1160 c095abf4 0001e848 000927c0
> > [ 1.340000] bce0: cb5c0280 c13cb0a8 c13cb0a8 cb5bdf00 cb5c1184 cb5c1184 cb11e600 00000000
> > [ 1.340000] bd00: c13cb128 cb5bf460 00000001 00000003 00000000 00000000 cb5c11ac cb5c11ac
> > [ 1.340000] bd20: ffff0001 cb5c11b8 cb5c11b8 00000000 00000000 cb060000 00000000 00000000
> > [ 1.340000] bd40: 00000000 cb5c11d8 cb5c11d8 00000000 cb5bdf80 cb5bdec0 cb5c1100 c095a5f0
> > [ 1.340000] bd60: 00000000 cb11e600 00000000 c1212594 60000013 00000001 00000000 c13cb110
> > [ 1.340000] bd80: c13acc68 c13cb0a8 c13cb440 c13cb440 00000000 00000000 00000000 c075674c
> > [ 1.340000] bda0: c13cb440 cb00cc5c cb169db4 00000000 c1334248 c13cb488 c145dc8c c0959764
> > [ 1.340000] bdc0: ffffffed cfb87050 cb5e2600 c095d670 ffffffed cb5e2610 fffffdfb c0758e48
> > [ 1.340000] bde0: c0758df8 cb5e2610 c1459090 c1459098 00000000 c07577b0 00000000 00000000
> > [ 1.340000] be00: cb05be30 c0757a68 00000001 c145906c 00000000 c0755d3c cb00cb70 cb5938b8
> > [ 1.340000] be20: cb5e2610 cb5e2644 c13aca58 c0757534 cb5e2610 00000001 00000000 cb5e2610
> > [ 1.340000] be40: cb5e2610 c13aca58 c13acaa8 c0756bc0 cb5e2610 00000000 cb5e2618 c07550c0
> > [ 1.340000] be60: 00000000 c0587884 cb05beb8 cb5e2600 00000000 cb5e2600 cb5e2610 c1419000
> > [ 1.340000] be80: c110362c c11a183c 00000000 c0758fdc 00000000 cb05beb8 cb5e2600 cb5bdb00
> > [ 1.340000] bea0: c1419000 c07597a8 c0ead2ac c1306788 c1306788 c1112510 00000000 00000000
> > [ 1.340000] bec0: c0ead2ac 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > [ 1.340000] bee0: 00000000 00000000 00000000 c110f828 c110fabc c110fac4 c110fabc c1103648
> > [ 1.340000] bf00: c1306788 c0301d28 0000006f cb05bf28 c035a8bc c035a8cc 60000013 ffffffff
> > [ 1.340000] bf20: 00000051 c058b428 c0ff5b24 c0c1da88 0000011a c035ab48 c11a183c c0ea7034
> > [ 1.340000] bf40: c0ff451c 00000000 00000007 00000007 c1335704 cfb96300 c120de7c 00000007
> > [ 1.340000] bf60: c11a1834 c1419000 0000011a c11a183c c1100598 c1100dc4 00000007 00000007
> > [ 1.340000] bf80: 00000000 c1100598 00000000 c0b0bcfc 00000000 00000000 00000000 00000000
> > [ 1.340000] bfa0: 00000000 c0b0bd04 00000000 c0307e78 00000000 00000000 00000000 00000000
> > [ 1.340000] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > [ 1.340000] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
> > [ 1.340000] [<c030de78>] (arch_send_call_function_single_ipi) from [<c03b3350>] (irq_work_queue_on+0x90/0x100)
> > [ 1.340000] [<c03b3350>] (irq_work_queue_on) from [<c0959a84>] (cpufreq_update_util+0x40/0x4c)
> > [ 1.340000] [<c0959a84>] (cpufreq_update_util) from [<c03732d8>] (enqueue_task_rt+0x28/0x26c)
> > [ 1.340000] [<c03732d8>] (enqueue_task_rt) from [<c0360d28>] (activate_task+0x60/0x64)
> > [ 1.340000] [<c0360d28>] (activate_task) from [<c0360fc0>] (ttwu_do_activate.constprop.13+0x34/0x68)
> > [ 1.340000] [<c0360fc0>] (ttwu_do_activate.constprop.13) from [<c0361b74>] (try_to_wake_up+0x1a0/0x318)
> > [ 1.340000] [<c0361b74>] (try_to_wake_up) from [<c0381de0>] (handle_irq_event_percpu+0xdc/0x15c)
> > [ 1.340000] [<c0381de0>] (handle_irq_event_percpu) from [<c0381ea4>] (handle_irq_event+0x44/0x68)
> > [ 1.340000] [<c0381ea4>] (handle_irq_event) from [<c0385150>] (handle_level_irq+0xa4/0x13c)
> > [ 1.340000] [<c0385150>] (handle_level_irq) from [<c038168c>] (generic_handle_irq+0x18/0x28)
> > [ 1.340000] [<c038168c>] (generic_handle_irq) from [<c0381788>] (__handle_domain_irq+0x54/0xb0)
> > [ 1.340000] [<c0381788>] (__handle_domain_irq) from [<c030bad4>] (__irq_svc+0x54/0x70)
> > [ 1.340000] [<c030bad4>] (__irq_svc) from [<c0932830>] (omap_i2c_xfer+0x320/0x5a0)
>
> It looks like we got an interrupt in the middle of an i2c transaction
> changing the CPU OPP. The handler of that tried to enqueue an RT task
> and that led to a cpufreq update that in turn triggered the crash.
I think the question here is around cpufreq_update_util() calling
irq_work_queue_on() for the same CPU... from an IRQ handler.
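
[Editorial sketch] The question raised here, whether work aimed at the CPU we are already running on needs a cross-CPU call at all, can be illustrated with a small decision function. This is a user-space model under stated assumptions, not the irq_work implementation and not the fix that was eventually applied. The enum and pick_queue_path() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes for queueing deferred work on a target CPU. */
enum queue_path {
    QUEUE_LOCAL,       /* run on the current CPU, no IPI involved */
    QUEUE_REMOTE_IPI,  /* ask another CPU via an IPI */
    QUEUE_FAILED       /* remote target requested but no IPI mechanism */
};

/* ipi_available models whether an IPI handler was ever registered;
 * on a uniprocessor platform it would be false. */
static enum queue_path pick_queue_path(int this_cpu, int target_cpu,
                                       bool ipi_available)
{
    assert(this_cpu >= 0 && target_cpu >= 0);

    if (target_cpu == this_cpu)
        return QUEUE_LOCAL;        /* same CPU: an IPI is unnecessary */
    if (!ipi_available)
        return QUEUE_FAILED;       /* avoid calling a missing handler */
    return QUEUE_REMOTE_IPI;
}
```

With this shape, a same-CPU request never reaches the IPI path, so a uniprocessor system without an IPI handler is not affected.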
--
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-15 18:41 ` Rafael J. Wysocki
(?)
@ 2016-02-16 1:13 ` Viresh Kumar
-1 siblings, 0 replies; 81+ messages in thread
From: Viresh Kumar @ 2016-02-16 1:13 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Guenter Roeck, Rafael J. Wysocki, linux-next,
Linux Kernel Mailing List, linux-arm-kernel, linux-pm,
Peter Zijlstra
On 15-02-16, 19:41, Rafael J. Wysocki wrote:
> On Mon, Feb 15, 2016 at 6:05 PM, Guenter Roeck <linux@roeck-us.net> wrote:
> > [ 1.340000] [<c0958e78>] (__cpufreq_driver_target) from [<c095ca58>] (dbs_check_cpu+0x1ac/0x1e8)
> > [ 1.340000] [<c095ca58>] (dbs_check_cpu) from [<c095cd04>] (cpufreq_governor_dbs+0x1fc/0x608)
> > [ 1.340000] [<c095cd04>] (cpufreq_governor_dbs) from [<c0959c5c>] (__cpufreq_governor+0x1a8/0x204)
> > [ 1.340000] [<c0959c5c>] (__cpufreq_governor) from [<c095a2dc>] (cpufreq_init_policy+0x60/0x8c)
> > [ 1.340000] [<c095a2dc>] (cpufreq_init_policy) from [<c095a5f0>] (cpufreq_online+0x2e8/0x708)
> > [ 1.340000] [<c095a5f0>] (cpufreq_online) from [<c075674c>] (subsys_interface_register+0x80/0xc4)
> > [ 1.340000] [<c075674c>] (subsys_interface_register) from [<c0959764>] (cpufreq_register_driver+0x144/0x1a0)
>
> This is the registration of the cpufreq driver (cpufreq-dt in this case).
>
> It does cpufreq_online()->cpufreq_init_policy()->__cpufreq_governor()->cpufreq_governor_dbs()->dbs_check_cpu().
>
> The only way that can happen is when cpufreq_set_policy() finds that
> the "old" and the "new" policies use the same governor, so it goes and
> calls __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS), but I'm not sure
> how this is possible during the initialization ATM.
>
> Viresh, any ideas?
You probably misread the trace.
During init, policy->gov is NULL and new_policy->gov is set to the
default one, probably ondemand/conservative. In that case, we do:
- INIT
- START
- LIMITS
So the above sequence is in fact guaranteed to happen.
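
The sequence above can be sketched as a small userspace model. It assumes, per the discussion in this thread, that an unchanged governor only gets LIMITS while a new governor gets INIT, START and then LIMITS. The enum values, struct event_log and model_set_governor() are hypothetical names for illustration, not the cpufreq core's identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event names modelled on the INIT/START/LIMITS steps. */
enum gov_event { EV_EXIT_OLD, EV_INIT, EV_START, EV_LIMITS };

#define MAX_EVENTS 8

struct event_log {
    enum gov_event ev[MAX_EVENTS];
    size_t count;
};

static void log_event(struct event_log *log, enum gov_event e)
{
    assert(log->count < MAX_EVENTS);
    log->ev[log->count++] = e;
}

/*
 * Model of a governor switch. old_gov is NULL on first initialisation.
 * Governors are compared by pointer identity, as objects rather than names.
 * Returns the number of events recorded in log.
 */
static size_t model_set_governor(const char *old_gov, const char *new_gov,
                                 struct event_log *log)
{
    log->count = 0;
    if (old_gov == new_gov) {
        log_event(log, EV_LIMITS);       /* same governor: limits only */
        return log->count;
    }
    if (old_gov != NULL)
        log_event(log, EV_EXIT_OLD);     /* stands in for tearing down the old one */
    log_event(log, EV_INIT);
    log_event(log, EV_START);
    log_event(log, EV_LIMITS);
    return log->count;
}
```

With old_gov == NULL the model records exactly INIT, START, LIMITS, which is the init-time case described above; passing the same pointer twice records LIMITS alone.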
--
viresh
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-16 1:13 ` Viresh Kumar
(?)
@ 2016-02-16 1:27 ` Rafael J. Wysocki
-1 siblings, 0 replies; 81+ messages in thread
From: Rafael J. Wysocki @ 2016-02-16 1:27 UTC (permalink / raw)
To: Viresh Kumar
Cc: Rafael J. Wysocki, Guenter Roeck, Rafael J. Wysocki, linux-next,
Linux Kernel Mailing List, linux-arm-kernel, linux-pm,
Peter Zijlstra
On Tuesday, February 16, 2016 06:43:35 AM Viresh Kumar wrote:
> On 15-02-16, 19:41, Rafael J. Wysocki wrote:
> > On Mon, Feb 15, 2016 at 6:05 PM, Guenter Roeck <linux@roeck-us.net> wrote:
> > > [ 1.340000] [<c0958e78>] (__cpufreq_driver_target) from [<c095ca58>] (dbs_check_cpu+0x1ac/0x1e8)
> > > [ 1.340000] [<c095ca58>] (dbs_check_cpu) from [<c095cd04>] (cpufreq_governor_dbs+0x1fc/0x608)
> > > [ 1.340000] [<c095cd04>] (cpufreq_governor_dbs) from [<c0959c5c>] (__cpufreq_governor+0x1a8/0x204)
> > > [ 1.340000] [<c0959c5c>] (__cpufreq_governor) from [<c095a2dc>] (cpufreq_init_policy+0x60/0x8c)
> > > [ 1.340000] [<c095a2dc>] (cpufreq_init_policy) from [<c095a5f0>] (cpufreq_online+0x2e8/0x708)
> > > [ 1.340000] [<c095a5f0>] (cpufreq_online) from [<c075674c>] (subsys_interface_register+0x80/0xc4)
> > > [ 1.340000] [<c075674c>] (subsys_interface_register) from [<c0959764>] (cpufreq_register_driver+0x144/0x1a0)
> >
> > This is the registration of the cpufreq driver (cpufreq-dt in this case).
> >
> > It does cpufreq_online()->cpufreq_init_policy()->__cpufreq_governor()->cpufreq_governor_dbs()->dbs_check_cpu().
> >
> > The only way that can happen is when cpufreq_set_policy() finds that
> > the "old" and the "new" policies use the same governor, so it goes and
> > calls __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS), but I'm not sure
> > how this is possible during the initialization ATM.
> >
> > Viresh, any ideas?
>
> You misread probably.
>
> During init, policy->gov is NULL and new_policy->gov is set to the
> default one, probably ondemand/conservative. And in that case, we do:
> - INIT
> - START
> - LIMITS
Yes, that's what we should be doing, but it seemed to me that we didn't.
Or maybe the trace just contained the last one, because that's when the
crash happened.
Thanks,
Rafael
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Crashes in arm qemu emulations due to 'cpufreq: governor: Replace timers with utilization ...'
2016-02-16 1:27 ` Rafael J. Wysocki
(?)
@ 2016-02-16 1:36 ` Viresh Kumar
-1 siblings, 0 replies; 81+ messages in thread
From: Viresh Kumar @ 2016-02-16 1:36 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Rafael J. Wysocki, Guenter Roeck, Rafael J. Wysocki, linux-next,
Linux Kernel Mailing List, linux-arm-kernel, linux-pm,
Peter Zijlstra
On 16-02-16, 02:27, Rafael J. Wysocki wrote:
> Yes, that's what we should be doing, but it seemed to me that we didn't.
>
> Or maybe the trace just contained the last one, because that's when the
> crash happened.
Of course, it wouldn't mention the function calls that have already
finished :)
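
A toy illustration of that point, assuming a call stack modelled as an array of names: frames are pushed on entry and popped on return, so a snapshot taken at the crash only contains calls that are still in progress. The helpers enter(), leave(), on_stack() and run_until_crash() are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 8

/* Hypothetical model of a call stack: names of functions still executing. */
static const char *frames[MAX_DEPTH];
static size_t depth;

static void enter(const char *name)
{
    assert(depth < MAX_DEPTH);
    frames[depth++] = name;
}

static void leave(void)
{
    assert(depth > 0);
    depth--;
}

/* Returns 1 if name is on the modelled stack right now, else 0. */
static int on_stack(const char *name)
{
    for (size_t i = 0; i < depth; i++)
        if (strcmp(frames[i], name) == 0)
            return 1;
    return 0;
}

/* Replays INIT and START to completion, then stops inside LIMITS. */
static void run_until_crash(void)
{
    depth = 0;
    enter("governor_core");
    enter("INIT");
    leave();
    enter("START");
    leave();
    enter("LIMITS");   /* the modelled crash happens here, before leave() */
}
```

After run_until_crash() only "governor_core" and "LIMITS" remain on the modelled stack, even though INIT and START both ran.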
--
viresh
^ permalink raw reply [flat|nested] 81+ messages in thread