We recently discovered a call path which takes a mutex from the low level
secondary CPU bringup code and wondered why this was not caught by
might_sleep().

The reason is that both the might_sleep() and the smp_processor_id() debug
checks depend on system_state == SYSTEM_RUNNING, which is set after init
memory is freed.

That means that SMP bootup and builtin driver initialization are not
covered by these checks at all.

The patch series addresses this by adding an intermediate state which
enables both debug features right when scheduling starts, i.e. when the
boot CPU idle task schedules for the first time.

Changes since V1:

  - Use only one new state

  - Enable both debug facilities right before scheduling starts

  - Add more commentary about state ordering and placement of the state
    switch

  - CC ACPI folks on the relevant patch and amend changelog.

  - Collected acks/reviewed-by's

Thanks,

	tglx
[-- Attachment #1: init--Pin-init-task-to-boot-cpu-initially.patch --] [-- Type: text/plain, Size: 2087 bytes --] Some of the boot code in kernel_init_freeable() which runs before SMP bringup assumes (rightfully) that it runs on the boot cpu and therefore can use smp_processor_id() in preemptible context. That works so far because the smp_processor_id() check starts to be effective after smp bringup. That's just wrong. Starting with SMP bringup and the ability to move threads around, smp_processor_id() in preemptible context is broken. Aside from that it does not make sense to allow init to run on all cpus before sched_init_smp() has been run. Pin the init to the boot cpu so the existing code can continue to use smp_processor_id() without triggering the checks when the enabling of those checks starts earlier. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- init/main.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) --- a/init/main.c +++ b/init/main.c @@ -389,6 +389,7 @@ static __initdata DECLARE_COMPLETION(kth static noinline void __ref rest_init(void) { + struct task_struct *tsk; int pid; rcu_scheduler_starting(); @@ -397,7 +398,17 @@ static noinline void __ref rest_init(voi * the init task will end up wanting to create kthreads, which, if * we schedule it before we create kthreadd, will OOPS. */ - kernel_thread(kernel_init, NULL, CLONE_FS); + pid = kernel_thread(kernel_init, NULL, CLONE_FS); + /* + * Pin init on the boot cpu. Task migration is not properly working + * until sched_init_smp() has been run. It will set the allowed + * cpus for init to the non isolated cpus.
+ */ + rcu_read_lock(); + tsk = find_task_by_pid_ns(pid, &init_pid_ns); + set_cpus_allowed_ptr(tsk, cpumask_of(smp_processor_id())); + rcu_read_unlock(); + numa_default_policy(); pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES); rcu_read_lock(); @@ -1015,10 +1026,6 @@ static noinline void __init kernel_init_ * init can allocate pages on any node */ set_mems_allowed(node_states[N_MEMORY]); - /* - * init can run on any cpu. - */ - set_cpus_allowed_ptr(current, cpu_all_mask); cad_pid = task_pid(current);
[-- Attachment #1: arm--Adjust-system_state-check.patch --] [-- Type: text/plain, Size: 818 bytes --] To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in ipi_cpu_stop() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Russell King <linux@armlinux.org.uk> Cc: linux-arm-kernel@lists.infradead.org --- arch/arm/kernel/smp.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -555,8 +555,7 @@ static DEFINE_RAW_SPINLOCK(stop_lock); */ static void ipi_cpu_stop(unsigned int cpu) { - if (system_state == SYSTEM_BOOTING || - system_state == SYSTEM_RUNNING) { + if (system_state <= SYSTEM_RUNNING) { raw_spin_lock(&stop_lock); pr_crit("CPU%u: stopping\n", cpu); dump_stack();
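The reason the explicit equality tests have to go can be shown with a small user-space sketch. This is not kernel code: the enum below only mirrors the state list as it looks once the series has added its intermediate state (SYSTEM_SCHEDULING), and the helper names stop_message_old() and stop_message_new() are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed copy of the state list after the intermediate state is added. */
enum system_states {
	SYSTEM_BOOTING,
	SYSTEM_SCHEDULING,
	SYSTEM_RUNNING,
	SYSTEM_HALT,
	SYSTEM_POWER_OFF,
	/* later states omitted in this sketch */
};

/* Hypothetical helper: the condition as it was before the patch. */
static bool stop_message_old(enum system_states state)
{
	return state == SYSTEM_BOOTING || state == SYSTEM_RUNNING;
}

/* Hypothetical helper: the condition as it is after the patch. */
static bool stop_message_new(enum system_states state)
{
	return state <= SYSTEM_RUNNING;
}

/* The two conditions disagree only for the new intermediate state. */
static void compare_conditions(void)
{
	assert(stop_message_old(SYSTEM_BOOTING) == stop_message_new(SYSTEM_BOOTING));
	assert(stop_message_old(SYSTEM_RUNNING) == stop_message_new(SYSTEM_RUNNING));
	assert(stop_message_old(SYSTEM_HALT) == stop_message_new(SYSTEM_HALT));
	assert(stop_message_old(SYSTEM_SCHEDULING) != stop_message_new(SYSTEM_SCHEDULING));
}
```

With the equality form, a stop IPI arriving in the intermediate state would silently skip the message and the stack dump; the range form covers every state up to and including SYSTEM_RUNNING.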
[-- Attachment #1: arm64--Adjust-system_state-check.patch --] [-- Type: text/plain, Size: 939 bytes --] To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in smp_send_stop() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org --- arch/arm64/kernel/smp.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) --- a/arch/arm64/kernel/smp.c +++ b/arch/arm64/kernel/smp.c @@ -961,8 +961,7 @@ void smp_send_stop(void) cpumask_copy(&mask, cpu_online_mask); cpumask_clear_cpu(smp_processor_id(), &mask); - if (system_state == SYSTEM_BOOTING || - system_state == SYSTEM_RUNNING) + if (system_state <= SYSTEM_RUNNING) pr_crit("SMP: stopping secondary CPUs\n"); smp_cross_call(&mask, IPI_CPU_STOP); }
[-- Attachment #1: x86-smp--Adjust-system_state-check.patch --] [-- Type: text/plain, Size: 785 bytes --] To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in announce_cpu() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> --- arch/x86/kernel/smpboot.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -863,7 +863,7 @@ static void announce_cpu(int cpu, int ap if (cpu == 1) printk(KERN_INFO "x86: Booting SMP configuration:\n"); - if (system_state == SYSTEM_BOOTING) { + if (system_state < SYSTEM_RUNNING) { if (node != current_node) { if (current_node > (-1)) pr_cont("\n");
[-- Attachment #1: metag--Adjust-system_state-check.patch --] [-- Type: text/plain, Size: 805 bytes --] To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in stop_this_cpu() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: James Hogan <james.hogan@imgtec.com> Cc: linux-metag@vger.kernel.org --- arch/metag/kernel/smp.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) --- a/arch/metag/kernel/smp.c +++ b/arch/metag/kernel/smp.c @@ -567,8 +567,7 @@ static void stop_this_cpu(void *data) { unsigned int cpu = smp_processor_id(); - if (system_state == SYSTEM_BOOTING || - system_state == SYSTEM_RUNNING) { + if (system_state <= SYSTEM_RUNNING) { spin_lock(&stop_lock); pr_crit("CPU%u: stopping\n", cpu); dump_stack();
[-- Attachment #1: powerpc--Adjust-system_state-check.patch --] [-- Type: text/plain, Size: 1023 bytes --] To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in smp_generic_cpu_bootable() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org --- arch/powerpc/kernel/smp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/arch/powerpc/kernel/smp.c +++ b/arch/powerpc/kernel/smp.c @@ -97,7 +97,7 @@ int smp_generic_cpu_bootable(unsigned in /* Special case - we inhibit secondary thread startup * during boot if the user requests it. */ - if (system_state == SYSTEM_BOOTING && cpu_has_feature(CPU_FTR_SMT)) { + if (system_state < SYSTEM_RUNNING && cpu_has_feature(CPU_FTR_SMT)) { if (!smt_enabled_at_boot && cpu_thread_in_core(nr) != 0) return 0; if (smt_enabled_at_boot
[-- Attachment #1: ACPI--Adjust-system_state-check.patch --] [-- Type: text/plain, Size: 999 bytes --] To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Make the decision whether a pci root is hotplugged depend on SYSTEM_RUNNING instead of !SYSTEM_BOOTING. It makes no sense to cover states greater than SYSTEM_RUNNING as there are not hotplug events on reboot and poweroff. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: linux-acpi@vger.kernel.org --- drivers/acpi/pci_root.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/acpi/pci_root.c +++ b/drivers/acpi/pci_root.c @@ -523,7 +523,7 @@ static int acpi_pci_root_add(struct acpi struct acpi_pci_root *root; acpi_handle handle = device->handle; int no_aspm = 0; - bool hotadd = system_state != SYSTEM_BOOTING; + bool hotadd = system_state == SYSTEM_RUNNING; root = kzalloc(sizeof(struct acpi_pci_root), GFP_KERNEL); if (!root)
[-- Attachment #1: mm--Adjust-system_state-check.patch --] [-- Type: text/plain, Size: 1079 bytes --] To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. get_nid_for_pfn() checks for system_state == BOOTING to decide whether to use early_pfn_to_nid() when CONFIG_DEFERRED_STRUCT_PAGE_INIT=y. That check is dubious, because the switch to state RUNNING happens way after page_alloc_init_late() has been invoked. Change the check to less than RUNNING state so it covers the new intermediate states as well. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mel Gorman <mgorman@techsingularity.net> --- drivers/base/node.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -377,7 +377,7 @@ static int __ref get_nid_for_pfn(unsigne if (!pfn_valid_within(pfn)) return -1; #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT - if (system_state == SYSTEM_BOOTING) + if (system_state < SYSTEM_RUNNING) return early_pfn_to_nid(pfn); #endif page = pfn_to_page(pfn);
[-- Attachment #1: cpufreq-pasemi--Adjust-system_state-check.patch --] [-- Type: text/plain, Size: 875 bytes --] To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in pas_cpufreq_cpu_exit() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: linuxppc-dev@lists.ozlabs.org --- drivers/cpufreq/pasemi-cpufreq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/cpufreq/pasemi-cpufreq.c +++ b/drivers/cpufreq/pasemi-cpufreq.c @@ -226,7 +226,7 @@ static int pas_cpufreq_cpu_exit(struct c * We don't support CPU hotplug. Don't unmap after the system * has already made it to a running state. */ - if (system_state != SYSTEM_BOOTING) + if (system_state >= SYSTEM_RUNNING) return 0; if (sdcasr_mapbase)
[-- Attachment #1: iommu-vt-d--Adjust-system_state-check.patch --] [-- Type: text/plain, Size: 1260 bytes --] To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state checks in dmar_parse_one_atsr() and dmar_iommu_notify_scope_dev() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <joro@8bytes.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: iommu@lists.linux-foundation.org --- drivers/iommu/intel-iommu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -4312,7 +4312,7 @@ int dmar_parse_one_atsr(struct acpi_dmar struct acpi_dmar_atsr *atsr; struct dmar_atsr_unit *atsru; - if (system_state != SYSTEM_BOOTING && !intel_iommu_enabled) + if (system_state >= SYSTEM_RUNNING && !intel_iommu_enabled) return 0; atsr = container_of(hdr, struct acpi_dmar_atsr, header); @@ -4562,7 +4562,7 @@ int dmar_iommu_notify_scope_dev(struct d struct acpi_dmar_atsr *atsr; struct acpi_dmar_reserved_memory *rmrr; - if (!intel_iommu_enabled && system_state != SYSTEM_BOOTING) + if (!intel_iommu_enabled && system_state >= SYSTEM_RUNNING) return 0; list_for_each_entry(rmrru, &dmar_rmrr_units, list) {
[-- Attachment #1: iommu-of--Adjust-system_state-check.patch --] [-- Type: text/plain, Size: 902 bytes --] To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in of_iommu_driver_present() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <joro@8bytes.org> Acked-by: Robin Murphy <robin.murphy@arm.com> Cc: iommu@lists.linux-foundation.org --- drivers/iommu/of_iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/iommu/of_iommu.c +++ b/drivers/iommu/of_iommu.c @@ -103,7 +103,7 @@ static bool of_iommu_driver_present(stru * it never will be. We don't want to defer indefinitely, nor attempt * to dereference __iommu_of_table after it's been freed. */ - if (system_state > SYSTEM_BOOTING) + if (system_state >= SYSTEM_RUNNING) return false; return of_match_node(&__iommu_of_table, np);
[-- Attachment #1: async--Adjust-system_state-checks.patch --] [-- Type: text/plain, Size: 1783 bytes --] To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in async_run_entry_fn() and async_synchronize_cookie_domain() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> --- kernel/async.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) --- a/kernel/async.c +++ b/kernel/async.c @@ -114,14 +114,14 @@ static void async_run_entry_fn(struct wo ktime_t uninitialized_var(calltime), delta, rettime; /* 1) run (and print duration) */ - if (initcall_debug && system_state == SYSTEM_BOOTING) { + if (initcall_debug && system_state < SYSTEM_RUNNING) { pr_debug("calling %lli_%pF @ %i\n", (long long)entry->cookie, entry->func, task_pid_nr(current)); calltime = ktime_get(); } entry->func(entry->data, entry->cookie); - if (initcall_debug && system_state == SYSTEM_BOOTING) { + if (initcall_debug && system_state < SYSTEM_RUNNING) { rettime = ktime_get(); delta = ktime_sub(rettime, calltime); pr_debug("initcall %lli_%pF returned 0 after %lld usecs\n", @@ -284,14 +284,14 @@ void async_synchronize_cookie_domain(asy { ktime_t uninitialized_var(starttime), delta, endtime; - if (initcall_debug && system_state == SYSTEM_BOOTING) { + if (initcall_debug && system_state < SYSTEM_RUNNING) { pr_debug("async_waiting @ %i\n", task_pid_nr(current)); starttime = ktime_get(); } wait_event(async_done, lowest_in_progress(domain) >= cookie); - if (initcall_debug && system_state == SYSTEM_BOOTING) { + if (initcall_debug && system_state < SYSTEM_RUNNING) { endtime = ktime_get(); delta = ktime_sub(endtime, starttime);
[-- Attachment #1: extable--Adjust-system_state-checks.patch --] [-- Type: text/plain, Size: 794 bytes --] To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in core_kernel_text() to handle the extra states, i.e. to cover init text up to the point where the system switches to state RUNNING. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> --- kernel/extable.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/kernel/extable.c +++ b/kernel/extable.c @@ -75,7 +75,7 @@ int core_kernel_text(unsigned long addr) addr < (unsigned long)_etext) return 1; - if (system_state == SYSTEM_BOOTING && + if (system_state < SYSTEM_RUNNING && init_kernel_text(addr)) return 1; return 0;
[-- Attachment #1: printk--Adjust-system_state-checks.patch --] [-- Type: text/plain, Size: 768 bytes --] To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in boot_delay_msec() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> --- kernel/printk/printk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1176,7 +1176,7 @@ static void boot_delay_msec(int level) unsigned long long k; unsigned long timeout; - if ((boot_delay == 0 || system_state != SYSTEM_BOOTING) + if ((boot_delay == 0 || system_state >= SYSTEM_RUNNING) || suppress_message_printing(level)) { return; }
[-- Attachment #1: mm-vmscan--Adjust-system_state-checks.patch --] [-- Type: text/plain, Size: 1044 bytes --] To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in kswapd_run() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: linux-mm@kvack.org --- mm/vmscan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -3643,7 +3643,7 @@ int kswapd_run(int nid) pgdat->kswapd = kthread_run(kswapd, pgdat, "kswapd%d", nid); if (IS_ERR(pgdat->kswapd)) { /* failure at boot is fatal */ - BUG_ON(system_state == SYSTEM_BOOTING); + BUG_ON(system_state < SYSTEM_RUNNING); pr_err("Failed to start kswapd on node %d\n", nid); ret = PTR_ERR(pgdat->kswapd); pgdat->kswapd = NULL;
[-- Attachment #1: init--Introduce-SYSTEM_SCHEDULING-state.patch --] [-- Type: text/plain, Size: 1507 bytes --] might_sleep() debugging and smp_processor_id() debugging should be active right after the scheduler starts working. The init task can invoke smp_processor_id() from preemptible context as it is pinned on the boot cpu until sched_init_smp() removes the pinning and lets it schedule on all non isolated cpus. Add a new state which allows those checks to be enabled earlier and add it to the xen do_poweroff() function. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> --- V2: Use only one intermediate state and document that state order matters. drivers/xen/manage.c | 1 + include/linux/kernel.h | 6 +++++- 2 files changed, 6 insertions(+), 1 deletion(-) --- a/drivers/xen/manage.c +++ b/drivers/xen/manage.c @@ -190,6 +190,7 @@ static void do_poweroff(void) { switch (system_state) { case SYSTEM_BOOTING: + case SYSTEM_SCHEDULING: orderly_poweroff(true); break; case SYSTEM_RUNNING: --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -490,9 +490,13 @@ extern int root_mountflags; extern bool early_boot_irqs_disabled; -/* Values used for system_state */ +/* + * Values used for system_state. Ordering of the states must not be changed + * as code checks for <, <=, >, >= STATE. + */ extern enum system_states { SYSTEM_BOOTING, + SYSTEM_SCHEDULING, SYSTEM_RUNNING, SYSTEM_HALT, SYSTEM_POWER_OFF,
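The ordering requirement and the extra case label can be illustrated outside the kernel. The following is only a user-space sketch under assumptions: the enum repeats the states visible in the hunk above, the compile-time assertions are an illustration and not part of the patch, and still_booting() is a hypothetical stand-in for a switch such as the one in do_poweroff().

```c
#include <assert.h>
#include <stdbool.h>

/* States as listed in the hunk; anything after POWER_OFF is left out. */
enum system_states {
	SYSTEM_BOOTING,
	SYSTEM_SCHEDULING,
	SYSTEM_RUNNING,
	SYSTEM_HALT,
	SYSTEM_POWER_OFF,
};

/* Range checks rely on this order, so pin it down at compile time. */
static_assert(SYSTEM_BOOTING < SYSTEM_SCHEDULING, "BOOTING must come first");
static_assert(SYSTEM_SCHEDULING < SYSTEM_RUNNING, "SCHEDULING precedes RUNNING");
static_assert(SYSTEM_RUNNING < SYSTEM_HALT, "shutdown states follow RUNNING");

/*
 * Hypothetical helper: unlike a range comparison, a switch does not pick
 * up a new state on its own, so the case label has to be added by hand.
 */
static bool still_booting(enum system_states state)
{
	switch (state) {
	case SYSTEM_BOOTING:
	case SYSTEM_SCHEDULING:
		return true;
	default:
		return false;
	}
}
```

The design point is that range comparisons keep working when a state is inserted in the right place, while switch statements and equality tests need to be audited one by one.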
[-- Attachment #1: sched--Enable-might_sleep---checks-early.patch --] [-- Type: text/plain, Size: 1894 bytes --] might_sleep() and smp_processor_id() checks are enabled after the boot process is done. That hides bugs in the smp bringup and driver initialization code. Enable it right when the scheduler starts working, i.e. when init task and kthreadd have been created and right before the idle task enables preemption. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- init/main.c | 10 ++++++++++ kernel/sched/core.c | 4 +++- lib/smp_processor_id.c | 2 +- 3 files changed, 14 insertions(+), 2 deletions(-) --- a/init/main.c +++ b/init/main.c @@ -414,6 +414,16 @@ static noinline void __ref rest_init(voi rcu_read_lock(); kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns); rcu_read_unlock(); + + /* + * Enable might_sleep() and smp_processor_id() checks. + * They cannot be enabled earlier because with CONFIG_PRREMPT=y + * kernel_thread() would trigger might_sleep() splats. With + * CONFIG_PREEMPT_VOLUNTARY=y the init task might have scheduled + * already, but it's stuck on the kthreadd_done completion. + */ + system_state = SYSTEM_SCHEDULING; + complete(&kthreadd_done); /* --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -6226,8 +6226,10 @@ void ___might_sleep(const char *file, in if ((preempt_count_equals(preempt_offset) && !irqs_disabled() && !is_idle_task(current)) || - system_state != SYSTEM_RUNNING || oops_in_progress) + system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING || + oops_in_progress) return; + if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy) return; prev_jiffy = jiffies; --- a/lib/smp_processor_id.c +++ b/lib/smp_processor_id.c @@ -28,7 +28,7 @@ notrace static unsigned int check_preemp /* * It is valid to assume CPU-locality during early bootup: */ - if (system_state != SYSTEM_RUNNING) + if (system_state < SYSTEM_SCHEDULING) goto out; /*
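How the two checks are gated before and after this patch can be modelled in a few lines of user-space C. This is a sketch under assumptions: it models only the system_state term of each condition (the real functions also look at preempt count, interrupt state, oops_in_progress and so on), and the helper names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* States as used by the series; anything after POWER_OFF is left out. */
enum system_states {
	SYSTEM_BOOTING,
	SYSTEM_SCHEDULING,
	SYSTEM_RUNNING,
	SYSTEM_HALT,
	SYSTEM_POWER_OFF,
};

/* Old gating shared by both checks: active only in SYSTEM_RUNNING. */
static bool checks_active_old(enum system_states state)
{
	return state == SYSTEM_RUNNING;
}

/* New might_sleep() term: skipped while BOOTING and after RUNNING. */
static bool might_sleep_active(enum system_states state)
{
	return !(state == SYSTEM_BOOTING || state > SYSTEM_RUNNING);
}

/* New smp_processor_id() term: skipped only before SCHEDULING. */
static bool preempt_check_active(enum system_states state)
{
	return !(state < SYSTEM_SCHEDULING);
}

/* Both checks now cover the window between scheduler start and RUNNING. */
static void show_new_coverage(void)
{
	assert(!checks_active_old(SYSTEM_SCHEDULING));
	assert(might_sleep_active(SYSTEM_SCHEDULING));
	assert(preempt_check_active(SYSTEM_SCHEDULING));
}
```

In this model the smp_processor_id() term stays active in the shutdown states, while the might_sleep() term is switched off again once the state moves past SYSTEM_RUNNING.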
On Tue, 16 May 2017 20:42:32 +0200 Thomas Gleixner <tglx@linutronix.de> wrote: > Some of the boot code in init_kernel_freeable() which runs before SMP > bringup assumes (rightfully) that it runs on the boot cpu and therefor can > use smp_processor_id() in preemptible context. > > That works so far because the smp_processor_id() check starts to be > effective after smp bringup. That's just wrong. Starting with SMP bringup > and the ability to move threads around, smp_processor_id() in preemptible > context is broken. > > Aside of that it does not make sense to allow init to run on all cpus > before sched_smp_init() has been run. > > Pin the init to the boot cpu so the existing code can continue to use > smp_processor_id() without triggering the checks when the enabling of those > checks starts earlier. > > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > --- > init/main.c | 17 ++++++++++++----- > 1 file changed, 12 insertions(+), 5 deletions(-) > > --- a/init/main.c > +++ b/init/main.c > @@ -389,6 +389,7 @@ static __initdata DECLARE_COMPLETION(kth > > static noinline void __ref rest_init(void) > { > + struct task_struct *tsk; > int pid; > > rcu_scheduler_starting(); > @@ -397,7 +398,17 @@ static noinline void __ref rest_init(voi > * the init task will end up wanting to create kthreads, which, if > * we schedule it before we create kthreadd, will OOPS. > */ > - kernel_thread(kernel_init, NULL, CLONE_FS); > + pid = kernel_thread(kernel_init, NULL, CLONE_FS); > + /* > + * Pin init on the boot cpu. Task migration is not properly working > + * until sched_init_smp() has been run. It will set the allowed > + * cpus for init to the non isolated cpus. > + */ > + rcu_read_lock(); > + tsk = find_task_by_pid_ns(pid, &init_pid_ns); Should we have a: BUG_ON(!tsk); Just to be paranoid? 
-- Steve > + set_cpus_allowed_ptr(tsk, cpumask_of(smp_processor_id())); > + rcu_read_unlock(); > + > numa_default_policy(); > pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES); > rcu_read_lock(); > @@ -1015,10 +1026,6 @@ static noinline void __init kernel_init_ > * init can allocate pages on any node > */ > set_mems_allowed(node_states[N_MEMORY]); > - /* > - * init can run on any cpu. > - */ > - set_cpus_allowed_ptr(current, cpu_all_mask); > > cad_pid = task_pid(current); > >
On Tue, 16 May 2017 20:42:38 +0200 Thomas Gleixner <tglx@linutronix.de> wrote: > To enable smp_processor_id() and might_sleep() debug checks earlier, it's > required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. > > Make the decision whether a pci root is hotplugged depend on SYSTEM_RUNNING > instead of !SYSTEM_BOOTING. It makes no sense to cover states greater than > SYSTEM_RUNNING as there are not hotplug events on reboot and poweroff. > > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> -- Steve > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> > Cc: Len Brown <lenb@kernel.org> > Cc: linux-acpi@vger.kernel.org > --- > drivers/acpi/pci_root.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > --- a/drivers/acpi/pci_root.c > +++ b/drivers/acpi/pci_root.c > @@ -523,7 +523,7 @@ static int acpi_pci_root_add(struct acpi > struct acpi_pci_root *root; > acpi_handle handle = device->handle; > int no_aspm = 0; > - bool hotadd = system_state != SYSTEM_BOOTING; > + bool hotadd = system_state == SYSTEM_RUNNING; > > root = kzalloc(sizeof(struct acpi_pci_root), GFP_KERNEL); > if (!root) >
On Tue, 16 May 2017 20:42:48 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:

> might_sleep() and smp_processor_id() checks are enabled after the boot
> process is done. That hides bugs in the smp bringup and driver
> initialization code.
>
> Enable it right when the scheduler starts working, i.e. when init task and
> kthreadd have been created and right before the idle task enables
> preemption.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  init/main.c | 10 ++++++++++
>  kernel/sched/core.c | 4 +++-
>  lib/smp_processor_id.c | 2 +-
>  3 files changed, 14 insertions(+), 2 deletions(-)
>
> --- a/init/main.c
> +++ b/init/main.c
> @@ -414,6 +414,16 @@ static noinline void __ref rest_init(voi
>  	rcu_read_lock();
>  	kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns);
>  	rcu_read_unlock();
> +
> +	/*
> +	 * Enable might_sleep() and smp_processor_id() checks.
> +	 * They cannot be enabled earlier because with CONFIG_PRREMPT=y

My cat's version of CONFIG_PREEMPT, it's CONFIG_ PRR EMPT!

> +	 * kernel_thread() would trigger might_sleep() splats. With
> +	 * CONFIG_PREEMPT_VOLUNTARY=y the init task might have scheduled
> +	 * already, but it's stuck on the kthreadd_done completion.
> +	 */
> +	system_state = SYSTEM_SCHEDULING;
> +
>  	complete(&kthreadd_done);
>
>  	/*
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6226,8 +6226,10 @@ void ___might_sleep(const char *file, in
>
>  	if ((preempt_count_equals(preempt_offset) && !irqs_disabled() &&
>  	     !is_idle_task(current)) ||
> -	    system_state != SYSTEM_RUNNING || oops_in_progress)
> +	    system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING ||
> +	    oops_in_progress)
>  		return;
> +
>  	if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
>  		return;
>  	prev_jiffy = jiffies;
> --- a/lib/smp_processor_id.c
> +++ b/lib/smp_processor_id.c
> @@ -28,7 +28,7 @@ notrace static unsigned int check_preemp
>  	/*
>  	 * It is valid to assume CPU-locality during early bootup:
>  	 */
> -	if (system_state != SYSTEM_RUNNING)
> +	if (system_state < SYSTEM_SCHEDULING)

Do we want to ignore halting or rebooting too?

-- Steve

>  		goto out;
>
>  	/*
>
On Tue, 16 May 2017, Steven Rostedt wrote:
> On Tue, 16 May 2017 20:42:48 +0200
> Thomas Gleixner <tglx@linutronix.de> wrote:
> > +
> > +	/*
> > +	 * Enable might_sleep() and smp_processor_id() checks.
> > +	 * They cannot be enabled earlier because with CONFIG_PRREMPT=y
>
> My cat's version of CONFIG_PREEMPT, it's CONFIG_ PRR EMPT!

I don't have a cat and I don't need one for creating typos. I'm perfectly
able to do that myself :)

> > +	 * kernel_thread() would trigger might_sleep() splats. With
> > +	 * CONFIG_PREEMPT_VOLUNTARY=y the init task might have scheduled
> > +	 * already, but it's stuck on the kthreadd_done completion.
> > +	 */
> > +	system_state = SYSTEM_SCHEDULING;
> > +
> >  	complete(&kthreadd_done);
> >
> >  	/*
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -6226,8 +6226,10 @@ void ___might_sleep(const char *file, in
> >
> >  	if ((preempt_count_equals(preempt_offset) && !irqs_disabled() &&
> >  	     !is_idle_task(current)) ||
> > -	    system_state != SYSTEM_RUNNING || oops_in_progress)
> > +	    system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING ||
> > +	    oops_in_progress)
> >  		return;
> > +
> >  	if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
> >  		return;
> >  	prev_jiffy = jiffies;
> > --- a/lib/smp_processor_id.c
> > +++ b/lib/smp_processor_id.c
> > @@ -28,7 +28,7 @@ notrace static unsigned int check_preemp
> >  	/*
> >  	 * It is valid to assume CPU-locality during early bootup:
> >  	 */
> > -	if (system_state != SYSTEM_RUNNING)
> > +	if (system_state < SYSTEM_SCHEDULING)
>
> Do we want to ignore halting or rebooting too?

I don't think so. After setting those states, interesting stuff like
device_shutdown() gets invoked. We want the coverage there.

Thanks,

	tglx
On Wed, 17 May 2017 00:46:37 +0200 (CEST)
Thomas Gleixner <tglx@linutronix.de> wrote:
> > > --- a/lib/smp_processor_id.c
> > > +++ b/lib/smp_processor_id.c
> > > @@ -28,7 +28,7 @@ notrace static unsigned int check_preemp
> > > /*
> > > * It is valid to assume CPU-locality during early bootup:
> > > */
> > > - if (system_state != SYSTEM_RUNNING)
> > > + if (system_state < SYSTEM_SCHEDULING)
> >
> > Do we want to ignore halting or rebooting too?
>
> I don't think so. After setting those states, interesting stuff like
> device_shutdown() gets invoked. We want the coverage there.
Then I'd suggest that you update the change log, as this patch also
adds coverage to those states as well.
-- Steve
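To make the coverage question in this sub-thread concrete, here is a minimal userspace sketch. It is not kernel code: it models only the system_state part of the two gates quoted from kernel/sched/core.c and lib/smp_processor_id.c, and leaves out the preempt count, irq and oops_in_progress conditions. The helper names are made up for illustration, and the enum lists only the states visible in the include/linux/kernel.h hunk of this series.

```c
#include <stdbool.h>

/* States as shown in the quoted kernel.h hunk; states after
 * SYSTEM_POWER_OFF are cut off in that hunk and omitted here. */
enum system_states {
	SYSTEM_BOOTING,
	SYSTEM_SCHEDULING,
	SYSTEM_RUNNING,
	SYSTEM_HALT,
	SYSTEM_POWER_OFF,
};

/* Hypothetical helper: old smp_processor_id() gate.
 * The check bailed out when system_state != SYSTEM_RUNNING. */
static bool smp_id_check_active_old(enum system_states s)
{
	return s == SYSTEM_RUNNING;
}

/* Hypothetical helper: new smp_processor_id() gate.
 * The check bails out only when system_state < SYSTEM_SCHEDULING. */
static bool smp_id_check_active_new(enum system_states s)
{
	return !(s < SYSTEM_SCHEDULING);
}

/* Hypothetical helper: old might_sleep() gate, same condition as above. */
static bool might_sleep_active_old(enum system_states s)
{
	return s == SYSTEM_RUNNING;
}

/* Hypothetical helper: new might_sleep() gate.
 * Bails out when BOOTING or when the state is past RUNNING. */
static bool might_sleep_active_new(enum system_states s)
{
	return !(s == SYSTEM_BOOTING || s > SYSTEM_RUNNING);
}
```

Under these assumptions both checks become active in SYSTEM_SCHEDULING, but only the smp_processor_id() check additionally becomes active in the halt and power-off states, which is the extra coverage the changelog request above refers to.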
On 05/16/2017 08:42 PM, Thomas Gleixner wrote:
> To enable smp_processor_id() and might_sleep() debug checks earlier, it's
> required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.
>
> Adjust the system_state check in kswapd_run() to handle the extra states.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: linux-mm@kvack.org

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/vmscan.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3643,7 +3643,7 @@ int kswapd_run(int nid)
>  	pgdat->kswapd = kthread_run(kswapd, pgdat, "kswapd%d", nid);
>  	if (IS_ERR(pgdat->kswapd)) {
>  		/* failure at boot is fatal */
> -		BUG_ON(system_state == SYSTEM_BOOTING);
> +		BUG_ON(system_state < SYSTEM_RUNNING);
>  		pr_err("Failed to start kswapd on node %d\n", nid);
>  		ret = PTR_ERR(pgdat->kswapd);
>  		pgdat->kswapd = NULL;
>
>
On Tue, May 16, 2017 at 08:42:34PM +0200, Thomas Gleixner wrote:
> To enable smp_processor_id() and might_sleep() debug checks earlier, it's
> required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.
>
> Adjust the system_state check in smp_send_stop() to handle the extra states.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: linux-arm-kernel@lists.infradead.org
> ---
>  arch/arm64/kernel/smp.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)

FWIW:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

>
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -961,8 +961,7 @@ void smp_send_stop(void)
>  		cpumask_copy(&mask, cpu_online_mask);
>  		cpumask_clear_cpu(smp_processor_id(), &mask);
>
> -		if (system_state == SYSTEM_BOOTING ||
> -		    system_state == SYSTEM_RUNNING)
> +		if (system_state <= SYSTEM_RUNNING)
>  			pr_crit("SMP: stopping secondary CPUs\n");
>  		smp_cross_call(&mask, IPI_CPU_STOP);
>  	}
>
>
Hi,

On Tue, May 16, 2017 at 08:42:31PM +0200, Thomas Gleixner wrote:
> We recentlty discovered a call path which takes a mutex from the low level
> secondary CPU bringup code and wondered why this was not caught by
> might_sleep().
>
> The reason is that both debug facilities depend on system_state ==
> SYSTEM_RUNNING, which is set after init memory is freed.
>
> That means that SMP bootup and builtin driver initialization are not
> covered by these checks at all.
>
> The patch series addresses this by adding an intermediate state which
> enables both debug features right when scheduling starts, i.e. the boot CPU
> idle task schedules the first time.

Thanks again for attacking this.

I gave this a spin atop of v4.12-rc1 on an ARM Juno platform. It picks
up the mutex issue, and I see no other new warnings. With a fix [1] for
the mutex issue applied, I see no warnings.

Feel free to add my Tested-by for the arm64 and common bits.

Thanks,
Mark.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-May/506558.html
On Tue, May 16, 2017 at 08:42:47PM +0200, Thomas Gleixner wrote:
> might_sleep() debugging and smp_processor_id() debugging should be active
> right after the scheduler starts working. The init task can invoke
> smp_processor_id() from preemptible context as it is pinned on the boot cpu
> until sched_smp_init() removes the pinning and lets it schedule on all non
> isolated cpus.
>
> Add a new state which allows to enable those checks earlier and add it to
> the xen do_poweroff() function.
>
> No functional change.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Juergen Gross <jgross@suse.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Makes sense to me. FWIW:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>
> V2: Use only one intermediate state and document that state order matters.
>
>  drivers/xen/manage.c | 1 +
>  include/linux/kernel.h | 6 +++++-
>  2 files changed, 6 insertions(+), 1 deletion(-)
>
> --- a/drivers/xen/manage.c
> +++ b/drivers/xen/manage.c
> @@ -190,6 +190,7 @@ static void do_poweroff(void)
>  {
>  	switch (system_state) {
>  	case SYSTEM_BOOTING:
> +	case SYSTEM_SCHEDULING:
>  		orderly_poweroff(true);
>  		break;
>  	case SYSTEM_RUNNING:
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -490,9 +490,13 @@ extern int root_mountflags;
>
>  extern bool early_boot_irqs_disabled;
>
> -/* Values used for system_state */
> +/*
> + * Values used for system_state. Ordering of the states must not be changed
> + * as code checks for <, <=, >, >= STATE.
> + */
>  extern enum system_states {
>  	SYSTEM_BOOTING,
> +	SYSTEM_SCHEDULING,
>  	SYSTEM_RUNNING,
>  	SYSTEM_HALT,
>  	SYSTEM_POWER_OFF,
>
>
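The new kernel.h comment says that the ordering of the states matters because code now uses range comparisons. A small userspace sketch of what that means, using the arm64 smp_send_stop() condition from this series as the example: the function names below are hypothetical, and the enum contains only the states visible in the hunk above.

```c
#include <stdbool.h>

/* States as shown in the quoted kernel.h hunk; later states are
 * cut off there and omitted here. */
enum system_states {
	SYSTEM_BOOTING,
	SYSTEM_SCHEDULING,
	SYSTEM_RUNNING,
	SYSTEM_HALT,
	SYSTEM_POWER_OFF,
};

/* Hypothetical helper: the condition before the series, which
 * enumerates the matching states explicitly. */
static bool stop_message_old(enum system_states s)
{
	return s == SYSTEM_BOOTING || s == SYSTEM_RUNNING;
}

/* Hypothetical helper: the condition after the series, a range
 * check that relies on the enum order BOOTING < SCHEDULING < RUNNING. */
static bool stop_message_new(enum system_states s)
{
	return s <= SYSTEM_RUNNING;
}

/* Count the states for which the two conditions disagree. */
static int count_mismatches(void)
{
	int mismatches = 0;

	for (int s = SYSTEM_BOOTING; s <= SYSTEM_POWER_OFF; s++) {
		enum system_states st = (enum system_states)s;

		if (stop_message_old(st) != stop_message_new(st))
			mismatches++;
	}
	return mismatches;
}
```

In this model the two conditions differ only for SYSTEM_SCHEDULING, which the explicit list silently misses and the range check picks up. The range check is only correct as long as nobody reorders the enum, hence the comment added by the patch.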
On Tue, May 16, 2017 at 08:42:48PM +0200, Thomas Gleixner wrote:
> might_sleep() and smp_processor_id() checks are enabled after the boot
> process is done. That hides bugs in the smp bringup and driver
> initialization code.
>
> Enable it right when the scheduler starts working, i.e. when init task and
> kthreadd have been created and right before the idle task enables
> preemption.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Modulo Steve's comments, FWIW:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  init/main.c | 10 ++++++++++
>  kernel/sched/core.c | 4 +++-
>  lib/smp_processor_id.c | 2 +-
>  3 files changed, 14 insertions(+), 2 deletions(-)
>
> --- a/init/main.c
> +++ b/init/main.c
> @@ -414,6 +414,16 @@ static noinline void __ref rest_init(voi
>  	rcu_read_lock();
>  	kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns);
>  	rcu_read_unlock();
> +
> +	/*
> +	 * Enable might_sleep() and smp_processor_id() checks.
> +	 * They cannot be enabled earlier because with CONFIG_PRREMPT=y
> +	 * kernel_thread() would trigger might_sleep() splats. With
> +	 * CONFIG_PREEMPT_VOLUNTARY=y the init task might have scheduled
> +	 * already, but it's stuck on the kthreadd_done completion.
> +	 */
> +	system_state = SYSTEM_SCHEDULING;
> +
>  	complete(&kthreadd_done);
>
>  	/*
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6226,8 +6226,10 @@ void ___might_sleep(const char *file, in
>
>  	if ((preempt_count_equals(preempt_offset) && !irqs_disabled() &&
>  	     !is_idle_task(current)) ||
> -	    system_state != SYSTEM_RUNNING || oops_in_progress)
> +	    system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING ||
> +	    oops_in_progress)
>  		return;
> +
>  	if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
>  		return;
>  	prev_jiffy = jiffies;
> --- a/lib/smp_processor_id.c
> +++ b/lib/smp_processor_id.c
> @@ -28,7 +28,7 @@ notrace static unsigned int check_preemp
>  	/*
>  	 * It is valid to assume CPU-locality during early bootup:
>  	 */
> -	if (system_state != SYSTEM_RUNNING)
> +	if (system_state < SYSTEM_SCHEDULING)
>  		goto out;
>
>  	/*
>
>
On Tue, May 16, 2017 at 08:42:34PM +0200, Thomas Gleixner wrote:
> To enable smp_processor_id() and might_sleep() debug checks earlier, it's
> required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.
>
> Adjust the system_state check in smp_send_stop() to handle the extra states.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
On 05/16/2017 02:42 PM, Thomas Gleixner wrote:
> might_sleep() debugging and smp_processor_id() debugging should be active
> right after the scheduler starts working. The init task can invoke
> smp_processor_id() from preemptible context as it is pinned on the boot cpu
> until sched_smp_init() removes the pinning and lets it schedule on all non
> isolated cpus.
>
> Add a new state which allows to enable those checks earlier and add it to
> the xen do_poweroff() function.
>
> No functional change.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Juergen Gross <jgross@suse.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Commit-ID: 8fb12156b8db61af3d49f3e5e104568494581d1f Gitweb: http://git.kernel.org/tip/8fb12156b8db61af3d49f3e5e104568494581d1f Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 16 May 2017 20:42:32 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Tue, 23 May 2017 10:01:34 +0200 init: Pin init task to the boot CPU, initially Some of the boot code in init_kernel_freeable() which runs before SMP bringup assumes (rightfully) that it runs on the boot CPU and therefore can use smp_processor_id() in preemptible context. That works so far because the smp_processor_id() check starts to be effective after smp bringup. That's just wrong. Starting with SMP bringup and the ability to move threads around, smp_processor_id() in preemptible context is broken. Aside of that it does not make sense to allow init to run on all CPUs before sched_smp_init() has been run. Pin the init to the boot CPU so the existing code can continue to use smp_processor_id() without triggering the checks when the enabling of those checks starts earlier. 
Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170516184734.943149935@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> --- init/main.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/init/main.c b/init/main.c index f866510..badae3b 100644 --- a/init/main.c +++ b/init/main.c @@ -389,6 +389,7 @@ static __initdata DECLARE_COMPLETION(kthreadd_done); static noinline void __ref rest_init(void) { + struct task_struct *tsk; int pid; rcu_scheduler_starting(); @@ -397,7 +398,17 @@ static noinline void __ref rest_init(void) * the init task will end up wanting to create kthreads, which, if * we schedule it before we create kthreadd, will OOPS. */ - kernel_thread(kernel_init, NULL, CLONE_FS); + pid = kernel_thread(kernel_init, NULL, CLONE_FS); + /* + * Pin init on the boot CPU. Task migration is not properly working + * until sched_init_smp() has been run. It will set the allowed + * CPUs for init to the non isolated CPUs. + */ + rcu_read_lock(); + tsk = find_task_by_pid_ns(pid, &init_pid_ns); + set_cpus_allowed_ptr(tsk, cpumask_of(smp_processor_id())); + rcu_read_unlock(); + numa_default_policy(); pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES); rcu_read_lock(); @@ -1015,10 +1026,6 @@ static noinline void __init kernel_init_freeable(void) * init can allocate pages on any node */ set_mems_allowed(node_states[N_MEMORY]); - /* - * init can run on any cpu. - */ - set_cpus_allowed_ptr(current, cpu_all_mask); cad_pid = task_pid(current);
Commit-ID: 5976a66913a8bf42465d96776fd37fb5631edc19 Gitweb: http://git.kernel.org/tip/5976a66913a8bf42465d96776fd37fb5631edc19 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 16 May 2017 20:42:33 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Tue, 23 May 2017 10:01:35 +0200 arm: Adjust system_state check To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in ipi_cpu_stop() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20170516184735.020718977@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/arm/kernel/smp.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 572a8df..c9a0a52 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -555,8 +555,7 @@ static DEFINE_RAW_SPINLOCK(stop_lock); */ static void ipi_cpu_stop(unsigned int cpu) { - if (system_state == SYSTEM_BOOTING || - system_state == SYSTEM_RUNNING) { + if (system_state <= SYSTEM_RUNNING) { raw_spin_lock(&stop_lock); pr_crit("CPU%u: stopping\n", cpu); dump_stack();
Commit-ID: ef284f5ca5f102bf855e599305c0c16d6e844635 Gitweb: http://git.kernel.org/tip/ef284f5ca5f102bf855e599305c0c16d6e844635 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 16 May 2017 20:42:34 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Tue, 23 May 2017 10:01:35 +0200 arm64: Adjust system_state check To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in smp_send_stop() to handle the extra states. Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20170516184735.112589728@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/arm64/kernel/smp.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c index 6e0e16a..3211198 100644 --- a/arch/arm64/kernel/smp.c +++ b/arch/arm64/kernel/smp.c @@ -961,8 +961,7 @@ void smp_send_stop(void) cpumask_copy(&mask, cpu_online_mask); cpumask_clear_cpu(smp_processor_id(), &mask); - if (system_state == SYSTEM_BOOTING || - system_state == SYSTEM_RUNNING) + if (system_state <= SYSTEM_RUNNING) pr_crit("SMP: stopping secondary CPUs\n"); smp_cross_call(&mask, IPI_CPU_STOP); }
Commit-ID: 719b3680d1f789c1e3054e3fcb26bfff07c3c623 Gitweb: http://git.kernel.org/tip/719b3680d1f789c1e3054e3fcb26bfff07c3c623 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 16 May 2017 20:42:35 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Tue, 23 May 2017 10:01:35 +0200 x86/smp: Adjust system_state check To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in announce_cpu() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170516184735.191715856@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/x86/kernel/smpboot.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index f04479a..045e4f9 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -863,7 +863,7 @@ static void announce_cpu(int cpu, int apicid) if (cpu == 1) printk(KERN_INFO "x86: Booting SMP configuration:\n"); - if (system_state == SYSTEM_BOOTING) { + if (system_state < SYSTEM_RUNNING) { if (node != current_node) { if (current_node > (-1)) pr_cont("\n");
Commit-ID: dcd2e4734b428709984e2fa35ebbd6cccc246d47 Gitweb: http://git.kernel.org/tip/dcd2e4734b428709984e2fa35ebbd6cccc246d47 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 16 May 2017 20:42:36 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Tue, 23 May 2017 10:01:35 +0200 metag: Adjust system_state check To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in stop_this_cpu() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: James Hogan <james.hogan@imgtec.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170516184735.283420315@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/metag/kernel/smp.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/metag/kernel/smp.c b/arch/metag/kernel/smp.c index 232a12b..2dbbb7c 100644 --- a/arch/metag/kernel/smp.c +++ b/arch/metag/kernel/smp.c @@ -567,8 +567,7 @@ static void stop_this_cpu(void *data) { unsigned int cpu = smp_processor_id(); - if (system_state == SYSTEM_BOOTING || - system_state == SYSTEM_RUNNING) { + if (system_state <= SYSTEM_RUNNING) { spin_lock(&stop_lock); pr_crit("CPU%u: stopping\n", cpu); dump_stack();
Commit-ID: a8fcfc1917681ba1ccc23a429543a67aad8bfd00 Gitweb: http://git.kernel.org/tip/a8fcfc1917681ba1ccc23a429543a67aad8bfd00 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 16 May 2017 20:42:37 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Tue, 23 May 2017 10:01:35 +0200 powerpc: Adjust system_state check To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in smp_generic_cpu_bootable() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20170516184735.359536998@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/powerpc/kernel/smp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c index df2a416..1069f74 100644 --- a/arch/powerpc/kernel/smp.c +++ b/arch/powerpc/kernel/smp.c @@ -97,7 +97,7 @@ int smp_generic_cpu_bootable(unsigned int nr) /* Special case - we inhibit secondary thread startup * during boot if the user requests it. */ - if (system_state == SYSTEM_BOOTING && cpu_has_feature(CPU_FTR_SMT)) { + if (system_state < SYSTEM_RUNNING && cpu_has_feature(CPU_FTR_SMT)) { if (!smt_enabled_at_boot && cpu_thread_in_core(nr) != 0) return 0; if (smt_enabled_at_boot
Commit-ID: 9762b33dc31c67e34b36ba4e787e64084b3136ff Gitweb: http://git.kernel.org/tip/9762b33dc31c67e34b36ba4e787e64084b3136ff Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 16 May 2017 20:42:38 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Tue, 23 May 2017 10:01:36 +0200 ACPI: Adjust system_state check To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Make the decision whether a pci root is hotplugged depend on SYSTEM_RUNNING instead of !SYSTEM_BOOTING. It makes no sense to cover states greater than SYSTEM_RUNNING as there are not hotplug events on reboot and poweroff. Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Link: http://lkml.kernel.org/r/20170516184735.446455652@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> --- drivers/acpi/pci_root.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c index 919be0a..2405442 100644 --- a/drivers/acpi/pci_root.c +++ b/drivers/acpi/pci_root.c @@ -523,7 +523,7 @@ static int acpi_pci_root_add(struct acpi_device *device, struct acpi_pci_root *root; acpi_handle handle = device->handle; int no_aspm = 0; - bool hotadd = system_state != SYSTEM_BOOTING; + bool hotadd = system_state == SYSTEM_RUNNING; root = kzalloc(sizeof(struct acpi_pci_root), GFP_KERNEL); if (!root)
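The ACPI change above is the one place in the series where the new condition is an equality rather than a range. A rough userspace sketch of why, modelling only the `hotadd` initializer from acpi_pci_root_add(): the helper names are invented for illustration, and the enum lists only the states that appear in the kernel.h hunk of this series.

```c
#include <stdbool.h>

/* States as shown in the kernel.h hunk of this series; later
 * states are not visible there and are omitted. */
enum system_states {
	SYSTEM_BOOTING,
	SYSTEM_SCHEDULING,
	SYSTEM_RUNNING,
	SYSTEM_HALT,
	SYSTEM_POWER_OFF,
};

/* Hypothetical helper: old initializer, "anything but BOOTING". */
static bool hotadd_old(enum system_states s)
{
	return s != SYSTEM_BOOTING;
}

/* Hypothetical helper: new initializer, "exactly RUNNING". */
static bool hotadd_new(enum system_states s)
{
	return s == SYSTEM_RUNNING;
}
```

With the intermediate state in place, the old expression would classify a root bridge that is added in SYSTEM_SCHEDULING as hot-added. The new expression does not, and per the changelog it also deliberately stops covering the states after SYSTEM_RUNNING, since no hotplug events are expected on reboot or poweroff.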
Commit-ID: 8cdde385c7a33afbe13fd71351da0968540fa566 Gitweb: http://git.kernel.org/tip/8cdde385c7a33afbe13fd71351da0968540fa566 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 16 May 2017 20:42:39 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Tue, 23 May 2017 10:01:36 +0200 mm: Adjust system_state check To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. get_nid_for_pfn() checks for system_state == BOOTING to decide whether to use early_pfn_to_nid() when CONFIG_DEFERRED_STRUCT_PAGE_INIT=y. That check is dubious, because the switch to state RUNNING happes way after page_alloc_init_late() has been invoked. Change the check to less than RUNNING state so it covers the new intermediate states as well. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170516184735.528279534@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> --- drivers/base/node.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/base/node.c b/drivers/base/node.c index 5548f96..0440d95 100644 --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -377,7 +377,7 @@ static int __ref get_nid_for_pfn(unsigned long pfn) if (!pfn_valid_within(pfn)) return -1; #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT - if (system_state == SYSTEM_BOOTING) + if (system_state < SYSTEM_RUNNING) return early_pfn_to_nid(pfn); #endif page = pfn_to_page(pfn);
Commit-ID: d04e31a23c3c828456cb5613f391ce4ac4e5765f Gitweb: http://git.kernel.org/tip/d04e31a23c3c828456cb5613f391ce4ac4e5765f Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 16 May 2017 20:42:40 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Tue, 23 May 2017 10:01:36 +0200 cpufreq/pasemi: Adjust system_state check To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in pas_cpufreq_cpu_exit() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20170516184735.620023128@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> --- drivers/cpufreq/pasemi-cpufreq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/cpufreq/pasemi-cpufreq.c b/drivers/cpufreq/pasemi-cpufreq.c index 35dd4d7..b257fc7 100644 --- a/drivers/cpufreq/pasemi-cpufreq.c +++ b/drivers/cpufreq/pasemi-cpufreq.c @@ -226,7 +226,7 @@ static int pas_cpufreq_cpu_exit(struct cpufreq_policy *policy) * We don't support CPU hotplug. Don't unmap after the system * has already made it to a running state. */ - if (system_state != SYSTEM_BOOTING) + if (system_state >= SYSTEM_RUNNING) return 0; if (sdcasr_mapbase)
Commit-ID: b608fe356fe8328665445a26ec75dfac918c8c5d Gitweb: http://git.kernel.org/tip/b608fe356fe8328665445a26ec75dfac918c8c5d Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 16 May 2017 20:42:41 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Tue, 23 May 2017 10:01:36 +0200 iommu/vt-d: Adjust system_state checks To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state checks in dmar_parse_one_atsr() and dmar_iommu_notify_scope_dev() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Joerg Roedel <joro@8bytes.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20170516184735.712365947@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> --- drivers/iommu/intel-iommu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index fc2765c..8500ded 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -4315,7 +4315,7 @@ int dmar_parse_one_atsr(struct acpi_dmar_header *hdr, void *arg) struct acpi_dmar_atsr *atsr; struct dmar_atsr_unit *atsru; - if (system_state != SYSTEM_BOOTING && !intel_iommu_enabled) + if (system_state >= SYSTEM_RUNNING && !intel_iommu_enabled) return 0; atsr = container_of(hdr, struct acpi_dmar_atsr, header); @@ -4565,7 +4565,7 @@ int dmar_iommu_notify_scope_dev(struct dmar_pci_notify_info *info) struct acpi_dmar_atsr *atsr; struct acpi_dmar_reserved_memory *rmrr; - if (!intel_iommu_enabled && system_state != SYSTEM_BOOTING) + 
if (!intel_iommu_enabled && system_state >= SYSTEM_RUNNING) return 0; list_for_each_entry(rmrru, &dmar_rmrr_units, list) {
Commit-ID: b903dfb277c09e53d499480e9670557dcce36fbd Gitweb: http://git.kernel.org/tip/b903dfb277c09e53d499480e9670557dcce36fbd Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 16 May 2017 20:42:42 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Tue, 23 May 2017 10:01:37 +0200 iommu/of: Adjust system_state check To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in of_iommu_driver_present() to handle the extra states. Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Joerg Roedel <joro@8bytes.org> Acked-by: Robin Murphy <robin.murphy@arm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20170516184735.788023442@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> --- drivers/iommu/of_iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c index 9f44ee8..b8dcf44 100644 --- a/drivers/iommu/of_iommu.c +++ b/drivers/iommu/of_iommu.c @@ -103,7 +103,7 @@ static bool of_iommu_driver_present(struct device_node *np) * it never will be. We don't want to defer indefinitely, nor attempt * to dereference __iommu_of_table after it's been freed. */ - if (system_state > SYSTEM_BOOTING) + if (system_state >= SYSTEM_RUNNING) return false; return of_match_node(&__iommu_of_table, np);
Commit-ID: b4def42724594cd399cfee365221f5b38639711d Gitweb: http://git.kernel.org/tip/b4def42724594cd399cfee365221f5b38639711d Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 16 May 2017 20:42:43 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Tue, 23 May 2017 10:01:37 +0200 async: Adjust system_state checks To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in async_run_entry_fn() and async_synchronize_cookie_domain() to handle the extra states. Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170516184735.865155020@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> --- kernel/async.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/kernel/async.c b/kernel/async.c index d2edd6e..2cbd3dd 100644 --- a/kernel/async.c +++ b/kernel/async.c @@ -114,14 +114,14 @@ static void async_run_entry_fn(struct work_struct *work) ktime_t uninitialized_var(calltime), delta, rettime; /* 1) run (and print duration) */ - if (initcall_debug && system_state == SYSTEM_BOOTING) { + if (initcall_debug && system_state < SYSTEM_RUNNING) { pr_debug("calling %lli_%pF @ %i\n", (long long)entry->cookie, entry->func, task_pid_nr(current)); calltime = ktime_get(); } entry->func(entry->data, entry->cookie); - if (initcall_debug && system_state == SYSTEM_BOOTING) { + if (initcall_debug && system_state < SYSTEM_RUNNING) { rettime = ktime_get(); delta = ktime_sub(rettime, calltime); pr_debug("initcall %lli_%pF returned 0 after %lld usecs\n", @@ 
-284,14 +284,14 @@ void async_synchronize_cookie_domain(async_cookie_t cookie, struct async_domain { ktime_t uninitialized_var(starttime), delta, endtime; - if (initcall_debug && system_state == SYSTEM_BOOTING) { + if (initcall_debug && system_state < SYSTEM_RUNNING) { pr_debug("async_waiting @ %i\n", task_pid_nr(current)); starttime = ktime_get(); } wait_event(async_done, lowest_in_progress(domain) >= cookie); - if (initcall_debug && system_state == SYSTEM_BOOTING) { + if (initcall_debug && system_state < SYSTEM_RUNNING) { endtime = ktime_get(); delta = ktime_sub(endtime, starttime);
Commit-ID: 0594729c24d846889408a07057b5cc9e8d931419 Gitweb: http://git.kernel.org/tip/0594729c24d846889408a07057b5cc9e8d931419 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 16 May 2017 20:42:44 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Tue, 23 May 2017 10:01:37 +0200 extable: Adjust system_state checks To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in core_kernel_text() to handle the extra states, i.e. to cover init text up to the point where the system switches to state RUNNING. Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170516184735.949992741@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> --- kernel/extable.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/extable.c b/kernel/extable.c index 2676d7f..0fbdd85 100644 --- a/kernel/extable.c +++ b/kernel/extable.c @@ -75,7 +75,7 @@ int core_kernel_text(unsigned long addr) addr < (unsigned long)_etext) return 1; - if (system_state == SYSTEM_BOOTING && + if (system_state < SYSTEM_RUNNING && init_kernel_text(addr)) return 1; return 0;
Commit-ID: ff48cd26fc4889b9deb5f9333d3c61746e450b7f Gitweb: http://git.kernel.org/tip/ff48cd26fc4889b9deb5f9333d3c61746e450b7f Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 16 May 2017 20:42:45 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Tue, 23 May 2017 10:01:37 +0200 printk: Adjust system_state checks To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in boot_delay_msec() to handle the extra states. Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170516184736.027534895@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> --- kernel/printk/printk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index a1aecf4..32fac39 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1176,7 +1176,7 @@ static void boot_delay_msec(int level) unsigned long long k; unsigned long timeout; - if ((boot_delay == 0 || system_state != SYSTEM_BOOTING) + if ((boot_delay == 0 || system_state >= SYSTEM_RUNNING) || suppress_message_printing(level)) { return; }
Commit-ID: c6202adf3a0969514299cf10ff07376a84ad09bb Gitweb: http://git.kernel.org/tip/c6202adf3a0969514299cf10ff07376a84ad09bb Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 16 May 2017 20:42:46 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Tue, 23 May 2017 10:01:37 +0200 mm/vmscan: Adjust system_state checks To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state check in kswapd_run() to handle the extra states. Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170516184736.119158930@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> --- mm/vmscan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 8ad39bb..c3c1c6a 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -3652,7 +3652,7 @@ int kswapd_run(int nid) pgdat->kswapd = kthread_run(kswapd, pgdat, "kswapd%d", nid); if (IS_ERR(pgdat->kswapd)) { /* failure at boot is fatal */ - BUG_ON(system_state == SYSTEM_BOOTING); + BUG_ON(system_state < SYSTEM_RUNNING); pr_err("Failed to start kswapd on node %d\n", nid); ret = PTR_ERR(pgdat->kswapd); pgdat->kswapd = NULL;
Commit-ID: 69a78ff226fe0241ab6cb9dd961667be477e3cf7 Gitweb: http://git.kernel.org/tip/69a78ff226fe0241ab6cb9dd961667be477e3cf7 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 16 May 2017 20:42:47 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Tue, 23 May 2017 10:01:38 +0200 init: Introduce SYSTEM_SCHEDULING state might_sleep() debugging and smp_processor_id() debugging should be active right after the scheduler starts working. The init task can invoke smp_processor_id() from preemptible context as it is pinned on the boot cpu until sched_init_smp() removes the pinning and lets it schedule on all non isolated cpus. Add a new state which allows to enable those checks earlier and add it to the xen do_poweroff() function. No functional change. Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170516184736.196214622@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> --- drivers/xen/manage.c | 1 + include/linux/kernel.h | 6 +++++- 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c index c1ec8ee..9e35032 100644 --- a/drivers/xen/manage.c +++ b/drivers/xen/manage.c @@ -190,6 +190,7 @@ static void do_poweroff(void) { switch (system_state) { case SYSTEM_BOOTING: + case SYSTEM_SCHEDULING: orderly_poweroff(true); break; case SYSTEM_RUNNING: diff --git a/include/linux/kernel.h b/include/linux/kernel.h index 13bc08a..1c91f26 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -490,9 +490,13 @@ extern int
root_mountflags; extern bool early_boot_irqs_disabled; -/* Values used for system_state */ +/* + * Values used for system_state. Ordering of the states must not be changed + * as code checks for <, <=, >, >= STATE. + */ extern enum system_states { SYSTEM_BOOTING, + SYSTEM_SCHEDULING, SYSTEM_RUNNING, SYSTEM_HALT, SYSTEM_POWER_OFF,
Commit-ID: 1c3c5eab171590f86edd8d31389d61dd1efe3037 Gitweb: http://git.kernel.org/tip/1c3c5eab171590f86edd8d31389d61dd1efe3037 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 16 May 2017 20:42:48 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Tue, 23 May 2017 10:01:38 +0200 sched/core: Enable might_sleep() and smp_processor_id() checks early might_sleep() and smp_processor_id() checks are enabled after the boot process is done. That hides bugs in the SMP bringup and driver initialization code. Enable it right when the scheduler starts working, i.e. when init task and kthreadd have been created and right before the idle task enables preemption. Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170516184736.272225698@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> --- init/main.c | 10 ++++++++++ kernel/sched/core.c | 4 +++- lib/smp_processor_id.c | 2 +- 3 files changed, 14 insertions(+), 2 deletions(-) diff --git a/init/main.c b/init/main.c index badae3b..df58a41 100644 --- a/init/main.c +++ b/init/main.c @@ -414,6 +414,16 @@ static noinline void __ref rest_init(void) rcu_read_lock(); kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns); rcu_read_unlock(); + + /* + * Enable might_sleep() and smp_processor_id() checks. + * They cannot be enabled earlier because with CONFIG_PRREMPT=y + * kernel_thread() would trigger might_sleep() splats. With + * CONFIG_PREEMPT_VOLUNTARY=y the init task might have scheduled + * already, but it's stuck on the kthreadd_done completion. 
+ */ + system_state = SYSTEM_SCHEDULING; + complete(&kthreadd_done); /* diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 877241e..c3e50ca 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -6238,8 +6238,10 @@ void ___might_sleep(const char *file, int line, int preempt_offset) if ((preempt_count_equals(preempt_offset) && !irqs_disabled() && !is_idle_task(current)) || - system_state != SYSTEM_RUNNING || oops_in_progress) + system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING || + oops_in_progress) return; + if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy) return; prev_jiffy = jiffies; diff --git a/lib/smp_processor_id.c b/lib/smp_processor_id.c index 690d75b..2fb007b 100644 --- a/lib/smp_processor_id.c +++ b/lib/smp_processor_id.c @@ -28,7 +28,7 @@ notrace static unsigned int check_preemption_disabled(const char *what1, /* * It is valid to assume CPU-locality during early bootup: */ - if (system_state != SYSTEM_RUNNING) + if (system_state < SYSTEM_SCHEDULING) goto out; /*