All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sudeep Holla <sudeep.holla@arm.com>
To: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Atish Patra <atish.patra@wdc.com>,
	Russell King <linux@armlinux.org.uk>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Lukasz Luba <lukasz.luba@arm.com>,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	Sudeep Holla <sudeep.holla@arm.com>
Subject: Re: [PATCH] arm: Fix topology setup in case of CPU hotplug for CONFIG_SCHED_MC
Date: Wed, 20 Nov 2019 11:09:11 +0000	[thread overview]
Message-ID: <20191120110911.GA31600@bogus> (raw)
In-Reply-To: <20191120104212.14791-1-dietmar.eggemann@arm.com>

Hi Dietmar, Lukasz,

Thanks for digging this bug and fixing it.

On Wed, Nov 20, 2019 at 10:42:12AM +0000, Dietmar Eggemann wrote:
> Commit ca74b316df96 ("arm: Use common cpu_topology structure and
> functions.") changed cpu_coregroup_mask() from the ARM32 specific
> implementation in arch/arm/include/asm/topology.h to the one shared
> with ARM64 and RISCV in drivers/base/arch_topology.c.
>
> Currently on Arm32 (TC2 w/ CONFIG_SCHED_MC) the task scheduler setup
> code (w/ CONFIG_SCHED_DEBUG) shows this during CPU hotplug:
>
>   ERROR: groups don't span domain->span
>
> It happens to CPUs of the cluster of the CPU which gets hot-plugged
> out on scheduler domain MC.
>
> Turns out that the shared cpu_coregroup_mask() requires that the
> hot-plugged CPU is removed from the core_sibling mask via
> remove_cpu_topology(). Otherwise the 'is core_sibling subset of
> cpumask_of_node()' doesn't work. In this case the task scheduler has to
> deal with cpumask_of_node instead of core_sibling which is wrong on
> scheduler domain MC.
>
> e.g. CPU3 hot-plugged out on TC2 [cluster0: 0,3-4 cluster1: 1-2]:
>
>   cpu_coregroup_mask(): CPU3 cpumask_of_node=0-2,4 core_sibling=0,3-4
>                                                                   ^
> should be:
>
>   cpu_coregroup_mask(): CPU3 cpumask_of_node=0-2,4 core_sibling=0,4
>
> Add remove_cpu_topology() to __cpu_disable() to remove the CPU from the
> topology masks in case of a CPU hotplug out operation.
>
> At the same time tweak store_cpu_topology() slightly so it will call
> update_siblings_masks() in case of CPU hotplug in operation via
> secondary_start_kernel()->smp_store_cpu_info().
>
> This aligns the Arm32 implementation with the Arm64 one.
>
> Fixes: ca74b316df96 ("arm: Use common cpu_topology structure and functions")

Apart from the minor nit below,

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>

> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> ---
>  arch/arm/kernel/smp.c      |  2 ++
>  arch/arm/kernel/topology.c | 15 +++++++--------
>  2 files changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 4b0bab2607e4..139c0d98fa29 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -240,6 +240,8 @@ int __cpu_disable(void)
>  	if (ret)
>  		return ret;
>
> +	remove_cpu_topology(cpu);
> +
>  	/*
>  	 * Take this CPU offline.  Once we clear this, we can't return,
>  	 * and we must not schedule until we're ready to give up the cpu.
> diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
> index 5b9faba03afb..b37b0a340991 100644
> --- a/arch/arm/kernel/topology.c
> +++ b/arch/arm/kernel/topology.c
> @@ -196,9 +196,8 @@ void store_cpu_topology(unsigned int cpuid)
>  	struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
>  	unsigned int mpidr;
>
> -	/* If the cpu topology has been already set, just return */
> -	if (cpuid_topo->core_id != -1)
> -		return;
> +	if (cpuid_topo->package_id != -1)
> +		goto topology_populated;
>
>  	mpidr = read_cpuid_mpidr();
>
> @@ -231,14 +230,14 @@ void store_cpu_topology(unsigned int cpuid)
>  		cpuid_topo->package_id = -1;
>  	}
>
> -	update_siblings_masks(cpuid);
> -
>  	update_cpu_capacity(cpuid);
>
>  	pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
> -		cpuid, cpu_topology[cpuid].thread_id,
> -		cpu_topology[cpuid].core_id,
> -		cpu_topology[cpuid].package_id, mpidr);
> +		cpuid, cpuid_topo->thread_id, cpuid_topo->core_id,
> +		cpuid_topo->package_id, mpidr);
> +

[nit] The above is a clearly cosmetic cleanup and shouldn't be part of
this fix as they will be backported automatically. So I prefer to drop
this or make it separate patch if required.

--
Regards,
Sudeep

WARNING: multiple messages have this Message-ID (diff)
From: Sudeep Holla <sudeep.holla@arm.com>
To: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: linux-kernel@vger.kernel.org,
	Russell King <linux@armlinux.org.uk>,
	Atish Patra <atish.patra@wdc.com>,
	Lukasz Luba <lukasz.luba@arm.com>,
	Sudeep Holla <sudeep.holla@arm.com>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH] arm: Fix topology setup in case of CPU hotplug for CONFIG_SCHED_MC
Date: Wed, 20 Nov 2019 11:09:11 +0000	[thread overview]
Message-ID: <20191120110911.GA31600@bogus> (raw)
In-Reply-To: <20191120104212.14791-1-dietmar.eggemann@arm.com>

Hi Dietmar, Lukasz,

Thanks for digging this bug and fixing it.

On Wed, Nov 20, 2019 at 10:42:12AM +0000, Dietmar Eggemann wrote:
> Commit ca74b316df96 ("arm: Use common cpu_topology structure and
> functions.") changed cpu_coregroup_mask() from the ARM32 specific
> implementation in arch/arm/include/asm/topology.h to the one shared
> with ARM64 and RISCV in drivers/base/arch_topology.c.
>
> Currently on Arm32 (TC2 w/ CONFIG_SCHED_MC) the task scheduler setup
> code (w/ CONFIG_SCHED_DEBUG) shows this during CPU hotplug:
>
>   ERROR: groups don't span domain->span
>
> It happens to CPUs of the cluster of the CPU which gets hot-plugged
> out on scheduler domain MC.
>
> Turns out that the shared cpu_coregroup_mask() requires that the
> hot-plugged CPU is removed from the core_sibling mask via
> remove_cpu_topology(). Otherwise the 'is core_sibling subset of
> cpumask_of_node()' doesn't work. In this case the task scheduler has to
> deal with cpumask_of_node instead of core_sibling which is wrong on
> scheduler domain MC.
>
> e.g. CPU3 hot-plugged out on TC2 [cluster0: 0,3-4 cluster1: 1-2]:
>
>   cpu_coregroup_mask(): CPU3 cpumask_of_node=0-2,4 core_sibling=0,3-4
>                                                                   ^
> should be:
>
>   cpu_coregroup_mask(): CPU3 cpumask_of_node=0-2,4 core_sibling=0,4
>
> Add remove_cpu_topology() to __cpu_disable() to remove the CPU from the
> topology masks in case of a CPU hotplug out operation.
>
> At the same time tweak store_cpu_topology() slightly so it will call
> update_siblings_masks() in case of CPU hotplug in operation via
> secondary_start_kernel()->smp_store_cpu_info().
>
> This aligns the Arm32 implementation with the Arm64 one.
>
> Fixes: ca74b316df96 ("arm: Use common cpu_topology structure and functions")

Apart from the minor nit below,

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>

> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> ---
>  arch/arm/kernel/smp.c      |  2 ++
>  arch/arm/kernel/topology.c | 15 +++++++--------
>  2 files changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 4b0bab2607e4..139c0d98fa29 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -240,6 +240,8 @@ int __cpu_disable(void)
>  	if (ret)
>  		return ret;
>
> +	remove_cpu_topology(cpu);
> +
>  	/*
>  	 * Take this CPU offline.  Once we clear this, we can't return,
>  	 * and we must not schedule until we're ready to give up the cpu.
> diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
> index 5b9faba03afb..b37b0a340991 100644
> --- a/arch/arm/kernel/topology.c
> +++ b/arch/arm/kernel/topology.c
> @@ -196,9 +196,8 @@ void store_cpu_topology(unsigned int cpuid)
>  	struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
>  	unsigned int mpidr;
>
> -	/* If the cpu topology has been already set, just return */
> -	if (cpuid_topo->core_id != -1)
> -		return;
> +	if (cpuid_topo->package_id != -1)
> +		goto topology_populated;
>
>  	mpidr = read_cpuid_mpidr();
>
> @@ -231,14 +230,14 @@ void store_cpu_topology(unsigned int cpuid)
>  		cpuid_topo->package_id = -1;
>  	}
>
> -	update_siblings_masks(cpuid);
> -
>  	update_cpu_capacity(cpuid);
>
>  	pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
> -		cpuid, cpu_topology[cpuid].thread_id,
> -		cpu_topology[cpuid].core_id,
> -		cpu_topology[cpuid].package_id, mpidr);
> +		cpuid, cpuid_topo->thread_id, cpuid_topo->core_id,
> +		cpuid_topo->package_id, mpidr);
> +

[nit] The above is a clearly cosmetic cleanup and shouldn't be part of
this fix as they will be backported automatically. So I prefer to drop
this or make it separate patch if required.

--
Regards,
Sudeep

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2019-11-20 11:09 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-20 10:42 [PATCH] arm: Fix topology setup in case of CPU hotplug for CONFIG_SCHED_MC Dietmar Eggemann
2019-11-20 10:42 ` Dietmar Eggemann
2019-11-20 11:09 ` Sudeep Holla [this message]
2019-11-20 11:09   ` Sudeep Holla
2019-11-20 12:30   ` Dietmar Eggemann
2019-11-20 12:30     ` Dietmar Eggemann
2019-11-20 11:16 ` Lukasz Luba
2019-11-20 11:16   ` Lukasz Luba
2019-11-24 21:47 ` Ondřej Jirman
2019-11-24 21:47   ` Ondřej Jirman
2019-11-26 10:42   ` Dietmar Eggemann
2019-11-26 10:42     ` Dietmar Eggemann
2019-11-26 21:45     ` Ondřej Jirman
2019-11-26 21:45       ` Ondřej Jirman
2019-11-27 14:31       ` Dietmar Eggemann
2019-11-27 14:31         ` Dietmar Eggemann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191120110911.GA31600@bogus \
    --to=sudeep.holla@arm.com \
    --cc=atish.patra@wdc.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=lukasz.luba@arm.com \
    --cc=morten.rasmussen@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.