* [PATCH] Drop CONFIG_SCHED_MC
@ 2022-05-31 18:02 Helge Deller
  2022-06-01 13:42 ` Mikulas Patocka
  0 siblings, 1 reply; 7+ messages in thread
From: Helge Deller @ 2022-05-31 18:02 UTC (permalink / raw)
  To: linux-parisc, James Bottomley, John David Anglin, Mikulas Patocka

Mikulas noticed that the parisc kernel crashes in sd_init() if
CONFIG_SCHED_MC is enabled.
Multicore-scheduling is probably not very useful on parisc, so simply
drop this option.

Signed-off-by: Helge Deller <deller@gmx.de>
Noticed-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18

diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index bd22578859d0..34591a981cb7 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -281,14 +281,6 @@ config SMP

 	  If you don't know what to do here, say N.

-config SCHED_MC
-	bool "Multi-core scheduler support"
-	depends on GENERIC_ARCH_TOPOLOGY && PA8X00
-	help
-	  Multi-core scheduler support improves the CPU scheduler's decision
-	  making when dealing with multi-core CPU chips at a cost of slightly
-	  increased overhead in some places. If unsure say N here.
-
 config IRQSTACKS
 	bool "Use separate kernel stacks when processing interrupts"
 	default y
diff --git a/arch/parisc/kernel/topology.c b/arch/parisc/kernel/topology.c
index 9696e3cb6a2a..71a678ceb33a 100644
--- a/arch/parisc/kernel/topology.c
+++ b/arch/parisc/kernel/topology.c
@@ -81,10 +81,6 @@ void store_cpu_topology(unsigned int cpuid)
 }

 static struct sched_domain_topology_level parisc_mc_topology[] = {
-#ifdef CONFIG_SCHED_MC
-	{ cpu_coregroup_mask, cpu_core_flags, SD_INIT_NAME(MC) },
-#endif
-
 	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
 	{ NULL, },
 };


* Re: [PATCH] Drop CONFIG_SCHED_MC
  2022-05-31 18:02 [PATCH] Drop CONFIG_SCHED_MC Helge Deller
@ 2022-06-01 13:42 ` Mikulas Patocka
  2022-06-01 14:54   ` Mikulas Patocka
  0 siblings, 1 reply; 7+ messages in thread
From: Mikulas Patocka @ 2022-06-01 13:42 UTC (permalink / raw)
  To: Helge Deller; +Cc: linux-parisc, James Bottomley, John David Anglin



On Tue, 31 May 2022, Helge Deller wrote:

> Mikulas noticed that the parisc kernel crashes in sd_init() if
> CONFIG_SCHED_MC is enabled.
> Multicore-scheduling is probably not very useful on parisc, so simply
> drop this option.
> 
> Signed-off-by: Helge Deller <deller@gmx.de>
> Noticed-by: Mikulas Patocka <mpatocka@redhat.com>
> Cc: <stable@vger.kernel.org> # 5.18

Hi

I think that we should fix the root cause instead of trying to treat the 
symptoms.

Some more testing showed that:

in sd_init: tl->mask(cpu) returns an empty mask
tl->mask is cpu_coregroup_mask
in cpu_coregroup_mask: cpu_topology[cpu].core_sibling is an empty mask, 
that gets returned to sd_init

In arch/parisc/kernel/topology.c:
init_cpu_topology is called before store_cpu_topology, but it depends on 
the variable dualcores_found being set by store_cpu_topology. Thus, it is 
not set.

store_cpu_topology returns early if cpuid_topo->core_id != -1, but during
boot, store_cpu_topology is called before reset_cpu_topology. Thus the member
"core_id" is still zero (never set to -1) and store_cpu_topology does nothing.

If these issues are addressed, multicore scheduling will work.
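
To illustrate the ordering problem, here is a minimal userspace sketch in C.
It is not the kernel code: struct topo_sketch, reset_topo_sketch() and
store_topo_sketch() are hypothetical stand-ins, and it assumes that
reset_cpu_topology sets core_id to -1 and puts each CPU into its own sibling
mask.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_NR_CPUS 4

/* Simplified stand-in for one entry of the kernel's cpu_topology[] array. */
struct topo_sketch {
	int core_id;
	unsigned long core_sibling;	/* one bit per CPU */
};

/* Models reset_cpu_topology() (assumed behaviour): mark each CPU as
 * "not yet stored" (-1) and put the CPU into its own sibling mask. */
static void reset_topo_sketch(struct topo_sketch *t)
{
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		t[cpu].core_id = -1;
		t[cpu].core_sibling = 1UL << cpu;
	}
}

/* Models only the early-return check in store_cpu_topology().
 * Returns 1 if the topology was stored, 0 if the call was a no-op. */
static int store_topo_sketch(struct topo_sketch *t, int cpu)
{
	if (t[cpu].core_id != -1)
		return 0;
	t[cpu].core_id = 0;
	return 1;
}

/* Boot order as described above: store runs on zeroed memory first. */
static int demo_broken_order(void)
{
	struct topo_sketch t[SKETCH_NR_CPUS];

	memset(t, 0, sizeof(t));		/* zeroed, like static kernel data */
	int stored = store_topo_sketch(t, 0);	/* no-op: core_id is 0, not -1 */
	assert(t[0].core_sibling == 0);		/* sibling mask is still empty */
	return stored;
}

/* Same calls with the reset moved first. */
static int demo_fixed_order(void)
{
	struct topo_sketch t[SKETCH_NR_CPUS];

	memset(t, 0, sizeof(t));
	reset_topo_sketch(t);
	int stored = store_topo_sketch(t, 0);
	assert(t[0].core_sibling == 1UL);	/* CPU 0 is in its own mask */
	return stored;
}
```

In this model demo_broken_order() reports that nothing was stored and leaves
the sibling mask empty, while demo_fixed_order() stores the topology once the
reset has run first.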

Mikulas



* Re: [PATCH] Drop CONFIG_SCHED_MC
  2022-06-01 13:42 ` Mikulas Patocka
@ 2022-06-01 14:54   ` Mikulas Patocka
  2022-06-01 15:54     ` [PATCH] parisc: fix a crash with multicore scheduler Mikulas Patocka
  0 siblings, 1 reply; 7+ messages in thread
From: Mikulas Patocka @ 2022-06-01 14:54 UTC (permalink / raw)
  To: Helge Deller; +Cc: linux-parisc, James Bottomley, John David Anglin



On Wed, 1 Jun 2022, Mikulas Patocka wrote:

> 
> 
> On Tue, 31 May 2022, Helge Deller wrote:
> 
> > Mikulas noticed that the parisc kernel crashes in sd_init() if
> > CONFIG_SCHED_MC is enabled.
> > Multicore-scheduling is probably not very useful on parisc, so simply
> > drop this option.
> > 
> > Signed-off-by: Helge Deller <deller@gmx.de>
> > Noticed-by: Mikulas Patocka <mpatocka@redhat.com>
> > Cc: <stable@vger.kernel.org> # 5.18
> 
> Hi
> 
> I think that we should fix the root cause instead of trying to treat the 
> symptoms.
> 
> Some more testing showed that:
> 
> in sd_init: tl->mask(cpu) returns an empty mask
> tl->mask is cpu_coregroup_mask
> in cpu_coregroup_mask: cpu_topology[cpu].core_sibling is an empty mask, 
> that gets returned to sd_init
> 
> In arch/parisc/kernel/topology.c:
> init_cpu_topology is called before store_cpu_topology, but it depends on 
> the variable dualcores_found being set by store_cpu_topology. Thus, it is 
> not set.
> 
> store_cpu_topology returns early if cpuid_topo->core_id != -1, but during
> boot, store_cpu_topology is called before reset_cpu_topology. Thus the member
> "core_id" is still zero (never set to -1) and store_cpu_topology does nothing.
> 
> If these issues are addressed, multicore scheduling will work.

I've found that this fixes it.

Mikulas


---
 arch/parisc/kernel/topology.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Index: linux-2.6/arch/parisc/kernel/topology.c
===================================================================
--- linux-2.6.orig/arch/parisc/kernel/topology.c	2022-06-01 15:32:59.000000000 +0200
+++ linux-2.6/arch/parisc/kernel/topology.c	2022-06-01 16:47:36.000000000 +0200
@@ -95,7 +95,8 @@ static struct sched_domain_topology_leve
  */
 void __init init_cpu_topology(void)
 {
+	reset_cpu_topology();
 	/* Set scheduler topology descriptor */
-	if (dualcores_found)
+	/*if (dualcores_found)*/
 		set_sched_topology(parisc_mc_topology);
 }



* [PATCH] parisc: fix a crash with multicore scheduler
  2022-06-01 14:54   ` Mikulas Patocka
@ 2022-06-01 15:54     ` Mikulas Patocka
  2022-06-01 17:18       ` [PATCH v2] " Mikulas Patocka
  0 siblings, 1 reply; 7+ messages in thread
From: Mikulas Patocka @ 2022-06-01 15:54 UTC (permalink / raw)
  To: Helge Deller; +Cc: linux-parisc, James Bottomley, John David Anglin

With the kernel 5.18, the system will hang on boot if it is compiled with
CONFIG_SCHED_MC. The last printed message is "Brought up 1 node, 1 CPU".

The crash happens in sd_init():
tl->mask (which is cpu_coregroup_mask) returns an empty mask. This happens
	because cpu_topology[0].core_sibling is empty.
Consequently, sd_span is set to an empty mask
sd_id = cpumask_first(sd_span) sets sd_id == NR_CPUS (because the mask is
	empty)
sd->shared = *per_cpu_ptr(sdd->sds, sd_id); sets sd->shared to NULL
	because sd_id is out of range
atomic_inc(&sd->shared->ref); crashes without printing anything
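
The chain above can be sketched in userspace C. This is only an illustration
under stated assumptions: first_cpu_sketch() mimics the documented behaviour
of cpumask_first() on an empty mask, struct shared_sketch and
lookup_shared_sketch() are hypothetical, and the bounds check exists only so
that the sketch can show the out-of-range id safely. The kernel code has no
such check, and the patch fixes the empty mask rather than adding one.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4

/* Hypothetical stand-in for the per-CPU shared domain data. */
struct shared_sketch {
	int ref;
};

/* Stand-in for cpumask_first(): lowest set bit, or SKETCH_NR_CPUS if none. */
static unsigned int first_cpu_sketch(unsigned long mask)
{
	for (unsigned int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if (mask & (1UL << cpu))
			return cpu;
	return SKETCH_NR_CPUS;
}

/* Bounds-checked lookup; NULL means "id is out of range". */
static struct shared_sketch *lookup_shared_sketch(struct shared_sketch *sds,
						  unsigned long span)
{
	unsigned int id = first_cpu_sketch(span);

	return id < SKETCH_NR_CPUS ? &sds[id] : NULL;
}

/* Walks the chain for an empty and a non-empty span; returns the final ref. */
static int demo_span_lookup(void)
{
	struct shared_sketch sds[SKETCH_NR_CPUS] = { { 0 } };
	struct shared_sketch *shared;

	/* Empty span: the id equals SKETCH_NR_CPUS, nothing valid to reference. */
	assert(first_cpu_sketch(0UL) == SKETCH_NR_CPUS);
	assert(lookup_shared_sketch(sds, 0UL) == NULL);

	/* Span holding CPU 0, as after a topology reset: the lookup succeeds. */
	shared = lookup_shared_sketch(sds, 1UL << 0);
	assert(shared == &sds[0]);
	shared->ref++;		/* plays the role of atomic_inc(&sd->shared->ref) */
	return sds[0].ref;
}
```

With an empty span the id lands one past the last valid CPU, which is the
condition that leads to the NULL sd->shared described above. Once CPU 0 is in
its own sibling mask, the same lookup returns a valid entry.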

We can fix it by calling reset_cpu_topology() from init_cpu_topology() - 
this will initialize the sibling masks on CPUs, so that they're not empty.

This patch also removes the variable "dualcores_found"; it is useless
because during boot, init_cpu_topology is called before
store_cpu_topology. Thus, set_sched_topology(parisc_mc_topology) is never
called. We don't need to call it at all because default_topology in
kernel/sched/topology.c contains the same items as parisc_mc_topology.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org	# v5.18

---
 arch/parisc/kernel/topology.c |   16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

Index: linux-2.6/arch/parisc/kernel/topology.c
===================================================================
--- linux-2.6.orig/arch/parisc/kernel/topology.c	2022-06-01 15:32:59.000000000 +0200
+++ linux-2.6/arch/parisc/kernel/topology.c	2022-06-01 17:04:09.000000000 +0200
@@ -20,8 +20,6 @@
 
 static DEFINE_PER_CPU(struct cpu, cpu_devices);
 
-static int dualcores_found;
-
 /*
  * store_cpu_topology is called at boot when only one cpu is running
  * and with the mutex cpu_hotplug.lock locked, when several cpus have booted,
@@ -60,7 +58,6 @@ void store_cpu_topology(unsigned int cpu
 			if (p->cpu_loc) {
 				cpuid_topo->core_id++;
 				cpuid_topo->package_id = cpu_topology[cpu].package_id;
-				dualcores_found = 1;
 				continue;
 			}
 		}
@@ -80,22 +77,11 @@ void store_cpu_topology(unsigned int cpu
 		cpu_topology[cpuid].package_id);
 }
 
-static struct sched_domain_topology_level parisc_mc_topology[] = {
-#ifdef CONFIG_SCHED_MC
-	{ cpu_coregroup_mask, cpu_core_flags, SD_INIT_NAME(MC) },
-#endif
-
-	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
-	{ NULL, },
-};
-
 /*
  * init_cpu_topology is called at boot when only one cpu is running
  * which prevent simultaneous write access to cpu_topology array
  */
 void __init init_cpu_topology(void)
 {
-	/* Set scheduler topology descriptor */
-	if (dualcores_found)
-		set_sched_topology(parisc_mc_topology);
+	reset_cpu_topology();
 }



* [PATCH v2] parisc: fix a crash with multicore scheduler
  2022-06-01 15:54     ` [PATCH] parisc: fix a crash with multicore scheduler Mikulas Patocka
@ 2022-06-01 17:18       ` Mikulas Patocka
  2022-06-02 20:52         ` Helge Deller
  0 siblings, 1 reply; 7+ messages in thread
From: Mikulas Patocka @ 2022-06-01 17:18 UTC (permalink / raw)
  To: Helge Deller; +Cc: linux-parisc, James Bottomley, John David Anglin

With the kernel 5.18, the system will hang on boot if it is compiled with
CONFIG_SCHED_MC. The last printed message is "Brought up 1 node, 1 CPU".

The crash happens in sd_init():
tl->mask (which is cpu_coregroup_mask) returns an empty mask. This happens
	because cpu_topology[0].core_sibling is empty.
Consequently, sd_span is set to an empty mask
sd_id = cpumask_first(sd_span) sets sd_id == NR_CPUS (because the mask is
	empty)
sd->shared = *per_cpu_ptr(sdd->sds, sd_id); sets sd->shared to NULL
	because sd_id is out of range
atomic_inc(&sd->shared->ref); crashes without printing anything

We can fix it by calling reset_cpu_topology() from init_cpu_topology() -
this will initialize the sibling masks on CPUs, so that they're not empty.

This patch also removes the variable "dualcores_found"; it is useless
because during boot, init_cpu_topology is called before
store_cpu_topology. Thus, set_sched_topology(parisc_mc_topology) is never
called. We don't need to call it at all because default_topology in
kernel/sched/topology.c contains the same items as parisc_mc_topology.

Note that we should not call store_cpu_topology() from init_per_cpu()
because it is called too early in the kernel initialization process and it
results in the message "Failure to register CPU0 device". Before this
patch, store_cpu_topology() would exit immediately because
cpuid_topo->core_id was uninitialized (still 0, not -1).

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org	# v5.18

---
 arch/parisc/kernel/processor.c |    2 --
 arch/parisc/kernel/topology.c  |   16 +---------------
 2 files changed, 1 insertion(+), 17 deletions(-)

Index: linux-2.6/arch/parisc/kernel/topology.c
===================================================================
--- linux-2.6.orig/arch/parisc/kernel/topology.c	2022-06-01 15:32:59.000000000 +0200
+++ linux-2.6/arch/parisc/kernel/topology.c	2022-06-01 18:37:37.000000000 +0200
@@ -20,8 +20,6 @@
 
 static DEFINE_PER_CPU(struct cpu, cpu_devices);
 
-static int dualcores_found;
-
 /*
  * store_cpu_topology is called at boot when only one cpu is running
  * and with the mutex cpu_hotplug.lock locked, when several cpus have booted,
@@ -60,7 +58,6 @@ void store_cpu_topology(unsigned int cpu
 			if (p->cpu_loc) {
 				cpuid_topo->core_id++;
 				cpuid_topo->package_id = cpu_topology[cpu].package_id;
-				dualcores_found = 1;
 				continue;
 			}
 		}
@@ -80,22 +77,11 @@ void store_cpu_topology(unsigned int cpu
 		cpu_topology[cpuid].package_id);
 }
 
-static struct sched_domain_topology_level parisc_mc_topology[] = {
-#ifdef CONFIG_SCHED_MC
-	{ cpu_coregroup_mask, cpu_core_flags, SD_INIT_NAME(MC) },
-#endif
-
-	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
-	{ NULL, },
-};
-
 /*
  * init_cpu_topology is called at boot when only one cpu is running
  * which prevent simultaneous write access to cpu_topology array
  */
 void __init init_cpu_topology(void)
 {
-	/* Set scheduler topology descriptor */
-	if (dualcores_found)
-		set_sched_topology(parisc_mc_topology);
+	reset_cpu_topology();
 }
Index: linux-2.6/arch/parisc/kernel/processor.c
===================================================================
--- linux-2.6.orig/arch/parisc/kernel/processor.c	2022-06-01 15:32:59.000000000 +0200
+++ linux-2.6/arch/parisc/kernel/processor.c	2022-06-01 18:35:12.000000000 +0200
@@ -327,8 +327,6 @@ int init_per_cpu(int cpunum)
 	set_firmware_width();
 	ret = pdc_coproc_cfg(&coproc_cfg);
 
-	store_cpu_topology(cpunum);
-
 	if(ret >= 0 && coproc_cfg.ccr_functional) {
 		mtctl(coproc_cfg.ccr_functional, 10);  /* 10 == Coprocessor Control Reg */
 



* Re: [PATCH v2] parisc: fix a crash with multicore scheduler
  2022-06-01 17:18       ` [PATCH v2] " Mikulas Patocka
@ 2022-06-02 20:52         ` Helge Deller
  2022-06-03  7:40           ` Mikulas Patocka
  0 siblings, 1 reply; 7+ messages in thread
From: Helge Deller @ 2022-06-02 20:52 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: linux-parisc, James Bottomley, John David Anglin

Hi Mikulas,

On 6/1/22 19:18, Mikulas Patocka wrote:
> With the kernel 5.18, the system will hang on boot if it is compiled with
> CONFIG_SCHED_MC. The last printed message is "Brought up 1 node, 1 CPU".
>
> The crash happens in sd_init():
> tl->mask (which is cpu_coregroup_mask) returns an empty mask. This happens
> 	because cpu_topology[0].core_sibling is empty.
> Consequently, sd_span is set to an empty mask
> sd_id = cpumask_first(sd_span) sets sd_id == NR_CPUS (because the mask is
> 	empty)
> sd->shared = *per_cpu_ptr(sdd->sds, sd_id); sets sd->shared to NULL
> 	because sd_id is out of range
> atomic_inc(&sd->shared->ref); crashes without printing anything
>
> We can fix it by calling reset_cpu_topology() from init_cpu_topology() -
> this will initialize the sibling masks on CPUs, so that they're not empty.
>
> This patch also removes the variable "dualcores_found"; it is useless
> because during boot, init_cpu_topology is called before
> store_cpu_topology. Thus, set_sched_topology(parisc_mc_topology) is never
> called. We don't need to call it at all because default_topology in
> kernel/sched/topology.c contains the same items as parisc_mc_topology.
>
> Note that we should not call store_cpu_topology() from init_per_cpu()
> because it is called too early in the kernel initialization process and it
> results in the message "Failure to register CPU0 device". Before this
> patch, store_cpu_topology() would exit immediately because
> cpuid_topo->core_id was uninitialized (still 0, not -1).
>
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Cc: stable@vger.kernel.org	# v5.18

Thanks a lot !!!

It took me some time to test it, but it looks good and boots on
all of my machines so far. I was curious if 32-bit kernels still
work since that was one of the issues with the older patches...

With your patch we can drop the "config SCHED_MC" entry from
arch/parisc/Kconfig as well.
Will you respin, or should I simply add this to your patch?

Helge





* Re: [PATCH v2] parisc: fix a crash with multicore scheduler
  2022-06-02 20:52         ` Helge Deller
@ 2022-06-03  7:40           ` Mikulas Patocka
  0 siblings, 0 replies; 7+ messages in thread
From: Mikulas Patocka @ 2022-06-03  7:40 UTC (permalink / raw)
  To: Helge Deller; +Cc: linux-parisc, James Bottomley, John David Anglin



On Thu, 2 Jun 2022, Helge Deller wrote:

> Hi Mikulas,
> 
> Thanks a lot !!!
> 
> It took me some time to test it, but it looks good and boots on
> all of my machines so far. I was curious if 32-bit kernels still
> work since that was one of the issues with the older patches...
> 
> With your patch we can drop the "config SCHED_MC" entry from
> arch/parisc/Kconfig as well.
> Will you respin, or should I simply add this to your patch?
> 
> Helge

I think that we don't have to drop "config SCHED_MC". It is used in 
kernel/sched/topology.c to select the multicore-aware scheduler. There is 
no reason why the multicore scheduler would not work on parisc.

Mikulas


