All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: Phil Auld <pauld@redhat.com>, linux-kernel@vger.kernel.org
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH] arch/arm64: Fix topology initialization for core scheduling
Date: Tue, 29 Mar 2022 16:02:22 +0200	[thread overview]
Message-ID: <1a546197-872b-7762-68ac-d5e6bb6d19aa@arm.com> (raw)
In-Reply-To: <20220322160304.26229-1-pauld@redhat.com>

On 22/03/2022 17:03, Phil Auld wrote:
> Some arm64 rely on store_cpu_topology() to setup the real topology.
> This needs to be done before the call to notify_cpu_starting() which
> tell the scheduler about the cpu otherwise the core scheduling data
> structures are setup in a way that does not match the actual topology.
> 
> Without this change stress-ng (which enables core scheduling in its prctl 
> tests) causes a warning and then a crash (trimmed for legibility):
> 
> [ 1853.805168] ------------[ cut here ]------------
> [ 1853.809784] task_rq(b)->core != rq->core
> [ 1853.809792] WARNING: CPU: 117 PID: 0 at kernel/sched/fair.c:11102 cfs_prio_less+0x1b4/0x1c4
> ...
> [ 1854.015210] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
> ...
> [ 1854.231256] Call trace:
> [ 1854.233689]  pick_next_task+0x3dc/0x81c
> [ 1854.237512]  __schedule+0x10c/0x4cc
> [ 1854.240988]  schedule_idle+0x34/0x54
> 
> Fixes: 9edeaea1bc45 ("sched: Core-wide rq->lock")
> Signed-off-by: Phil Auld <pauld@redhat.com>
> ---
> This is a similar issue to 
>   f2703def339c ("MIPS: smp: fill in sibling and core maps earlier") 
> which fixed it for MIPS.

I assume this is for a machine which relies on MPIDR-based setup
(package_id == -1)? I.e. it doesn't have proper ACPI/(DT) data for
topology setup.

Tried on a ThunderX2 by disabling parse_acpi_topology() but then I end
up with a machine w/o SMT, so `stress-ng --prctl N` doesn't show this issue.

Which machine were you using?

>  arch/arm64/kernel/smp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 27df5c1e6baa..3b46041f2b97 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -234,6 +234,7 @@ asmlinkage notrace void secondary_start_kernel(void)
>  	 * Log the CPU info before it is marked online and might get read.
>  	 */
>  	cpuinfo_store_cpu();
> +	store_cpu_topology(cpu);
>  
>  	/*
>  	 * Enable GIC and timers.
> @@ -242,7 +243,6 @@ asmlinkage notrace void secondary_start_kernel(void)
>  
>  	ipi_setup(cpu);
>  
> -	store_cpu_topology(cpu);
>  	numa_add_cpu(cpu);
>  
>  	/*


WARNING: multiple messages have this Message-ID (diff)
From: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: Phil Auld <pauld@redhat.com>, linux-kernel@vger.kernel.org
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	 Mark Rutland <mark.rutland@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH] arch/arm64: Fix topology initialization for core scheduling
Date: Tue, 29 Mar 2022 16:02:22 +0200	[thread overview]
Message-ID: <1a546197-872b-7762-68ac-d5e6bb6d19aa@arm.com> (raw)
In-Reply-To: <20220322160304.26229-1-pauld@redhat.com>

On 22/03/2022 17:03, Phil Auld wrote:
> Some arm64 rely on store_cpu_topology() to setup the real topology.
> This needs to be done before the call to notify_cpu_starting() which
> tell the scheduler about the cpu otherwise the core scheduling data
> structures are setup in a way that does not match the actual topology.
> 
> Without this change stress-ng (which enables core scheduling in its prctl 
> tests) causes a warning and then a crash (trimmed for legibility):
> 
> [ 1853.805168] ------------[ cut here ]------------
> [ 1853.809784] task_rq(b)->core != rq->core
> [ 1853.809792] WARNING: CPU: 117 PID: 0 at kernel/sched/fair.c:11102 cfs_prio_less+0x1b4/0x1c4
> ...
> [ 1854.015210] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
> ...
> [ 1854.231256] Call trace:
> [ 1854.233689]  pick_next_task+0x3dc/0x81c
> [ 1854.237512]  __schedule+0x10c/0x4cc
> [ 1854.240988]  schedule_idle+0x34/0x54
> 
> Fixes: 9edeaea1bc45 ("sched: Core-wide rq->lock")
> Signed-off-by: Phil Auld <pauld@redhat.com>
> ---
> This is a similar issue to 
>   f2703def339c ("MIPS: smp: fill in sibling and core maps earlier") 
> which fixed it for MIPS.

I assume this is for a machine which relies on MPIDR-based setup
(package_id == -1)? I.e. it doesn't have proper ACPI/(DT) data for
topology setup.

Tried on a ThunderX2 by disabling parse_acpi_topology() but then I end
up with a machine w/o SMT, so `stress-ng --prctl N` doesn't show this issue.

Which machine were you using?

>  arch/arm64/kernel/smp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 27df5c1e6baa..3b46041f2b97 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -234,6 +234,7 @@ asmlinkage notrace void secondary_start_kernel(void)
>  	 * Log the CPU info before it is marked online and might get read.
>  	 */
>  	cpuinfo_store_cpu();
> +	store_cpu_topology(cpu);
>  
>  	/*
>  	 * Enable GIC and timers.
> @@ -242,7 +243,6 @@ asmlinkage notrace void secondary_start_kernel(void)
>  
>  	ipi_setup(cpu);
>  
> -	store_cpu_topology(cpu);
>  	numa_add_cpu(cpu);
>  
>  	/*


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2022-03-29 14:02 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-22 16:03 [PATCH] arch/arm64: Fix topology initialization for core scheduling Phil Auld
2022-03-22 16:03 ` Phil Auld
2022-03-29 14:02 ` Dietmar Eggemann [this message]
2022-03-29 14:02   ` Dietmar Eggemann
2022-03-29 15:20   ` Phil Auld
2022-03-29 15:20     ` Phil Auld
2022-03-29 18:55     ` Dietmar Eggemann
2022-03-29 18:55       ` Dietmar Eggemann
2022-03-29 19:50       ` Phil Auld
2022-03-29 19:50         ` Phil Auld
2022-03-30 15:48         ` Dietmar Eggemann
2022-03-30 15:48           ` Dietmar Eggemann
2022-03-30 15:52           ` Phil Auld
2022-03-30 15:52             ` Phil Auld
2022-03-30 16:07           ` Phil Auld
2022-03-30 16:07             ` Phil Auld

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1a546197-872b-7762-68ac-d5e6bb6d19aa@arm.com \
    --to=dietmar.eggemann@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=pauld@redhat.com \
    --cc=peterz@infradead.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.