linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Online new threads according to the current SMT level
@ 2023-03-31 15:39 Laurent Dufour
  2023-03-31 15:39 ` [PATCH 1/2] pseries/smp: export the smt level in the SYS FS Laurent Dufour
  2023-03-31 15:39 ` [PATCH 2/2] powerpc/pseries/cpuhp: respect current SMT when adding new CPU Laurent Dufour
  0 siblings, 2 replies; 10+ messages in thread
From: Laurent Dufour @ 2023-03-31 15:39 UTC (permalink / raw)
  To: mpe, npiggin, christophe.leroy
  Cc: msuchanek, nathanl, linux-kernel, linuxppc-dev

When a new CPU is added, the kernel is activating all its threads. This
leads to weird, but functional, result when adding CPU on a SMT 4 system
for instance.

Here the newly added CPU 1 has 8 threads while the other one has 4 threads
active (system has been booted with the 'smt-enabled=4' kernel option):

ltcden3-lp12:~ # ppc64_cpu --info
Core   0:    0*    1*    2*    3*    4     5     6     7
Core   1:    8*    9*   10*   11*   12*   13*   14*   15*

This mixed SMT level is confusing end users and some application like
lparstat are reporting wrong values.

There is no SMT level recorded in the kernel, neither in user space. Such a
level could be helpful when adding new CPU or when optimizing the energy
efficiency. This series introduce a new SYS FS entry named 'pseries_smt' to
store the current SMT level.

The SMT level is provided in best effort, writing a new value into that
entry is only recording it into the kernel. This way, it can be used when
new CPU are onlined for instance. There is no real change to the CPU setup
when a value is written, no CPU are onlined or offlined.

At boot time `pseries_smt` is loaded with smt_enabled_at_boot which is
containing the SMT level set at boot time, even if no kernel option is
specified.

The change is specific to pseries since CPU hot-plug is only provided for
this platform.

The second patch of this series is implementing the change to online only
the right number of threads when a new CPU is added.

Laurent Dufour (2):
  pseries/smp: export the smt level in the SYS FS.
  powerpc/pseries/cpuhp: respect current SMT when adding new CPU

 arch/powerpc/platforms/pseries/hotplug-cpu.c | 18 ++++++---
 arch/powerpc/platforms/pseries/pseries.h     |  3 ++
 arch/powerpc/platforms/pseries/smp.c         | 39 ++++++++++++++++++++
 3 files changed, 55 insertions(+), 5 deletions(-)

-- 
2.40.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] pseries/smp: export the smt level in the SYS FS.
  2023-03-31 15:39 [PATCH 0/2] Online new threads according to the current SMT level Laurent Dufour
@ 2023-03-31 15:39 ` Laurent Dufour
  2023-03-31 16:05   ` Michal Suchánek
  2023-04-13 13:37   ` Michael Ellerman
  2023-03-31 15:39 ` [PATCH 2/2] powerpc/pseries/cpuhp: respect current SMT when adding new CPU Laurent Dufour
  1 sibling, 2 replies; 10+ messages in thread
From: Laurent Dufour @ 2023-03-31 15:39 UTC (permalink / raw)
  To: mpe, npiggin, christophe.leroy
  Cc: msuchanek, nathanl, linux-kernel, linuxppc-dev, Srikar Dronamraju

There is no SMT level recorded in the kernel neither in user space.
Indeed there is no real constraint about that and mixed SMT levels are
allowed and system is working fine this way.

However when new CPU are added, the kernel is onlining all the threads
which is leading to mixed SMT levels and confuse end user a bit.

To prevent this exports a SMT level from the kernel so user space
application like the energy daemon, could read it to adjust their settings.
There is no action unless recording the value when a SMT value is written
into the new sysfs entry. User space applications like ppc64_cpu should
update the sysfs when changing the SMT level to keep the system consistent.

Suggested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
---
 arch/powerpc/platforms/pseries/pseries.h |  3 ++
 arch/powerpc/platforms/pseries/smp.c     | 39 ++++++++++++++++++++++++
 2 files changed, 42 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h
index f8bce40ebd0c..af0a145af98f 100644
--- a/arch/powerpc/platforms/pseries/pseries.h
+++ b/arch/powerpc/platforms/pseries/pseries.h
@@ -23,7 +23,9 @@ extern int pSeries_machine_check_exception(struct pt_regs *regs);
 extern long pseries_machine_check_realmode(struct pt_regs *regs);
 void pSeries_machine_check_log_err(void);
 
+
 #ifdef CONFIG_SMP
+extern int pseries_smt;
 extern void smp_init_pseries(void);
 
 /* Get state of physical CPU from query_cpu_stopped */
@@ -34,6 +36,7 @@ int smp_query_cpu_stopped(unsigned int pcpu);
 #define QCSS_HARDWARE_ERROR -1
 #define QCSS_HARDWARE_BUSY -2
 #else
+#define pseries_smt 1
 static inline void smp_init_pseries(void) { }
 #endif
 
diff --git a/arch/powerpc/platforms/pseries/smp.c b/arch/powerpc/platforms/pseries/smp.c
index c597711ef20a..6c382922f8f3 100644
--- a/arch/powerpc/platforms/pseries/smp.c
+++ b/arch/powerpc/platforms/pseries/smp.c
@@ -21,6 +21,7 @@
 #include <linux/device.h>
 #include <linux/cpu.h>
 #include <linux/pgtable.h>
+#include <linux/sysfs.h>
 
 #include <asm/ptrace.h>
 #include <linux/atomic.h>
@@ -45,6 +46,8 @@
 
 #include "pseries.h"
 
+int pseries_smt;
+
 /*
  * The Primary thread of each non-boot processor was started from the OF client
  * interface by prom_hold_cpus and is spinning on secondary_hold_spinloop.
@@ -280,3 +283,39 @@ void __init smp_init_pseries(void)
 
 	pr_debug(" <- smp_init_pSeries()\n");
 }
+
+static ssize_t pseries_smt_store(struct class *class,
+			 struct class_attribute *attr,
+			 const char *buf, size_t count)
+{
+	int smt;
+
+	if (kstrtou32(buf, 0, &smt) || !smt || smt > (u32) threads_per_core) {
+		pr_err("Invalid pseries_smt specified.\n");
+		return -EINVAL;
+	}
+
+	pseries_smt = smt;
+
+	return count;
+}
+
+static ssize_t pseries_smt_show(struct class *class, struct class_attribute *attr,
+			  char *buf)
+{
+	return sysfs_emit(buf, "%d\n", pseries_smt);
+}
+
+static CLASS_ATTR_RW(pseries_smt);
+
+static int __init pseries_smt_init(void)
+{
+	int rc;
+
+	pseries_smt = smt_enabled_at_boot;
+	rc = sysfs_create_file(kernel_kobj, &class_attr_pseries_smt.attr);
+	if (rc)
+		pr_err("Can't create pseries_smt sysfs/kernel entry.\n");
+	return rc;
+}
+machine_device_initcall(pseries, pseries_smt_init);
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] powerpc/pseries/cpuhp: respect current SMT when adding new CPU
  2023-03-31 15:39 [PATCH 0/2] Online new threads according to the current SMT level Laurent Dufour
  2023-03-31 15:39 ` [PATCH 1/2] pseries/smp: export the smt level in the SYS FS Laurent Dufour
@ 2023-03-31 15:39 ` Laurent Dufour
  1 sibling, 0 replies; 10+ messages in thread
From: Laurent Dufour @ 2023-03-31 15:39 UTC (permalink / raw)
  To: mpe, npiggin, christophe.leroy
  Cc: msuchanek, nathanl, linux-kernel, linuxppc-dev, Srikar Dronamraju

When a new CPU is added, the kernel is activating all its threads. This
leads to weird, but functional, result when adding CPU on a SMT 4 system
for instance.

Here the newly added CPU 1 has 8 threads while the other one has 4 threads
active (system has been booted with the 'smt-enabled=4' kernel option):

ltcden3-lp12:~ # ppc64_cpu --info
Core   0:    0*    1*    2*    3*    4     5     6     7
Core   1:    8*    9*   10*   11*   12*   13*   14*   15*

We rely on the newly pseries_smt value which should be updated when
changing the SMT level by ppc64_cpu --smt=x and at boot time using the
smt-enabled kernel option.

This way on a LPAR running in SMT=4, newly added CPU will be running 4
threads, which is what a end user would expect.

Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
---
 arch/powerpc/platforms/pseries/hotplug-cpu.c | 18 +++++++++++++-----
 arch/powerpc/platforms/pseries/smp.c         |  2 +-
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 1a3cb313976a..e623ed8649b3 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -382,7 +382,7 @@ static int dlpar_online_cpu(struct device_node *dn)
 {
 	int rc = 0;
 	unsigned int cpu;
-	int len, nthreads, i;
+	int len, nthreads, i, smt;
 	const __be32 *intserv;
 	u32 thread;
 
@@ -392,6 +392,11 @@ static int dlpar_online_cpu(struct device_node *dn)
 
 	nthreads = len / sizeof(u32);
 
+	smt = READ_ONCE(pseries_smt);
+	/* We should online at least one thread */
+	if (smt < 1)
+		smt = 1;
+
 	cpu_maps_update_begin();
 	for (i = 0; i < nthreads; i++) {
 		thread = be32_to_cpu(intserv[i]);
@@ -400,10 +405,13 @@ static int dlpar_online_cpu(struct device_node *dn)
 				continue;
 			cpu_maps_update_done();
 			find_and_update_cpu_nid(cpu);
-			rc = device_online(get_cpu_device(cpu));
-			if (rc) {
-				dlpar_offline_cpu(dn);
-				goto out;
+			/* Don't active CPU over the current SMT setting */
+			if (smt-- > 0) {
+				rc = device_online(get_cpu_device(cpu));
+				if (rc) {
+					dlpar_offline_cpu(dn);
+					goto out;
+				}
 			}
 			cpu_maps_update_begin();
 
diff --git a/arch/powerpc/platforms/pseries/smp.c b/arch/powerpc/platforms/pseries/smp.c
index 6c382922f8f3..ef8070651846 100644
--- a/arch/powerpc/platforms/pseries/smp.c
+++ b/arch/powerpc/platforms/pseries/smp.c
@@ -295,7 +295,7 @@ static ssize_t pseries_smt_store(struct class *class,
 		return -EINVAL;
 	}
 
-	pseries_smt = smt;
+	WRITE_ONCE(pseries_smt, smt);
 
 	return count;
 }
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] pseries/smp: export the smt level in the SYS FS.
  2023-03-31 15:39 ` [PATCH 1/2] pseries/smp: export the smt level in the SYS FS Laurent Dufour
@ 2023-03-31 16:05   ` Michal Suchánek
  2023-04-03  8:20     ` Laurent Dufour
  2023-04-13 13:37   ` Michael Ellerman
  1 sibling, 1 reply; 10+ messages in thread
From: Michal Suchánek @ 2023-03-31 16:05 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: mpe, npiggin, christophe.leroy, nathanl, linux-kernel,
	linuxppc-dev, Srikar Dronamraju

Hello,

On Fri, Mar 31, 2023 at 05:39:04PM +0200, Laurent Dufour wrote:
> There is no SMT level recorded in the kernel neither in user space.
> Indeed there is no real constraint about that and mixed SMT levels are
> allowed and system is working fine this way.
> 
> However when new CPU are added, the kernel is onlining all the threads
> which is leading to mixed SMT levels and confuse end user a bit.
> 
> To prevent this exports a SMT level from the kernel so user space
> application like the energy daemon, could read it to adjust their settings.
> There is no action unless recording the value when a SMT value is written
> into the new sysfs entry. User space applications like ppc64_cpu should
> update the sysfs when changing the SMT level to keep the system consistent.
> 
> Suggested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/pseries.h |  3 ++
>  arch/powerpc/platforms/pseries/smp.c     | 39 ++++++++++++++++++++++++
>  2 files changed, 42 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h
> index f8bce40ebd0c..af0a145af98f 100644
> --- a/arch/powerpc/platforms/pseries/pseries.h
> +++ b/arch/powerpc/platforms/pseries/pseries.h
> @@ -23,7 +23,9 @@ extern int pSeries_machine_check_exception(struct pt_regs *regs);
>  extern long pseries_machine_check_realmode(struct pt_regs *regs);
>  void pSeries_machine_check_log_err(void);
>  
> +
>  #ifdef CONFIG_SMP
> +extern int pseries_smt;
>  extern void smp_init_pseries(void);
>  
>  /* Get state of physical CPU from query_cpu_stopped */
> @@ -34,6 +36,7 @@ int smp_query_cpu_stopped(unsigned int pcpu);
>  #define QCSS_HARDWARE_ERROR -1
>  #define QCSS_HARDWARE_BUSY -2
>  #else
> +#define pseries_smt 1

Is this really needed for anything?

The code using pseries_smt would not compile with a define, and would be
only compiled with SMP enabled anyway so we should not need this.

Thanks

Michal

>  static inline void smp_init_pseries(void) { }
>  #endif
>  
> diff --git a/arch/powerpc/platforms/pseries/smp.c b/arch/powerpc/platforms/pseries/smp.c
> index c597711ef20a..6c382922f8f3 100644
> --- a/arch/powerpc/platforms/pseries/smp.c
> +++ b/arch/powerpc/platforms/pseries/smp.c
> @@ -21,6 +21,7 @@
>  #include <linux/device.h>
>  #include <linux/cpu.h>
>  #include <linux/pgtable.h>
> +#include <linux/sysfs.h>
>  
>  #include <asm/ptrace.h>
>  #include <linux/atomic.h>
> @@ -45,6 +46,8 @@
>  
>  #include "pseries.h"
>  
> +int pseries_smt;
> +
>  /*
>   * The Primary thread of each non-boot processor was started from the OF client
>   * interface by prom_hold_cpus and is spinning on secondary_hold_spinloop.
> @@ -280,3 +283,39 @@ void __init smp_init_pseries(void)
>  
>  	pr_debug(" <- smp_init_pSeries()\n");
>  }
> +
> +static ssize_t pseries_smt_store(struct class *class,
> +			 struct class_attribute *attr,
> +			 const char *buf, size_t count)
> +{
> +	int smt;
> +
> +	if (kstrtou32(buf, 0, &smt) || !smt || smt > (u32) threads_per_core) {
> +		pr_err("Invalid pseries_smt specified.\n");
> +		return -EINVAL;
> +	}
> +
> +	pseries_smt = smt;
> +
> +	return count;
> +}
> +
> +static ssize_t pseries_smt_show(struct class *class, struct class_attribute *attr,
> +			  char *buf)
> +{
> +	return sysfs_emit(buf, "%d\n", pseries_smt);
> +}
> +
> +static CLASS_ATTR_RW(pseries_smt);
> +
> +static int __init pseries_smt_init(void)
> +{
> +	int rc;
> +
> +	pseries_smt = smt_enabled_at_boot;
> +	rc = sysfs_create_file(kernel_kobj, &class_attr_pseries_smt.attr);
> +	if (rc)
> +		pr_err("Can't create pseries_smt sysfs/kernel entry.\n");
> +	return rc;
> +}
> +machine_device_initcall(pseries, pseries_smt_init);
> -- 
> 2.40.0
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] pseries/smp: export the smt level in the SYS FS.
  2023-03-31 16:05   ` Michal Suchánek
@ 2023-04-03  8:20     ` Laurent Dufour
  0 siblings, 0 replies; 10+ messages in thread
From: Laurent Dufour @ 2023-04-03  8:20 UTC (permalink / raw)
  To: Michal Suchánek, mpe
  Cc: npiggin, christophe.leroy, nathanl, linux-kernel, linuxppc-dev,
	Srikar Dronamraju

On 31/03/2023 18:05:27, Michal Suchánek wrote:
> Hello,
> 
> On Fri, Mar 31, 2023 at 05:39:04PM +0200, Laurent Dufour wrote:
>> There is no SMT level recorded in the kernel neither in user space.
>> Indeed there is no real constraint about that and mixed SMT levels are
>> allowed and system is working fine this way.
>>
>> However when new CPU are added, the kernel is onlining all the threads
>> which is leading to mixed SMT levels and confuse end user a bit.
>>
>> To prevent this exports a SMT level from the kernel so user space
>> application like the energy daemon, could read it to adjust their settings.
>> There is no action unless recording the value when a SMT value is written
>> into the new sysfs entry. User space applications like ppc64_cpu should
>> update the sysfs when changing the SMT level to keep the system consistent.
>>
>> Suggested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
>> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
>> ---
>>  arch/powerpc/platforms/pseries/pseries.h |  3 ++
>>  arch/powerpc/platforms/pseries/smp.c     | 39 ++++++++++++++++++++++++
>>  2 files changed, 42 insertions(+)
>>
>> diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h
>> index f8bce40ebd0c..af0a145af98f 100644
>> --- a/arch/powerpc/platforms/pseries/pseries.h
>> +++ b/arch/powerpc/platforms/pseries/pseries.h
>> @@ -23,7 +23,9 @@ extern int pSeries_machine_check_exception(struct pt_regs *regs);
>>  extern long pseries_machine_check_realmode(struct pt_regs *regs);
>>  void pSeries_machine_check_log_err(void);
>>  
>> +
>>  #ifdef CONFIG_SMP
>> +extern int pseries_smt;
>>  extern void smp_init_pseries(void);
>>  
>>  /* Get state of physical CPU from query_cpu_stopped */
>> @@ -34,6 +36,7 @@ int smp_query_cpu_stopped(unsigned int pcpu);
>>  #define QCSS_HARDWARE_ERROR -1
>>  #define QCSS_HARDWARE_BUSY -2
>>  #else
>> +#define pseries_smt 1
> 
> Is this really needed for anything?
> 
> The code using pseries_smt would not compile with a define, and would be
> only compiled with SMP enabled anyway so we should not need this.
> 

Hi Michal,

I do agree, the pseries code is implying SMP.

When writing that code, I found that SMP conditional block and just add
this define to be sure the code will compile in the case SMP is not
defined, but that's probably useless.

Instead of resending a new series, Michael, could you please remove that
line when applying the patch to your tree?

Thanks,
Laurent.

> Thanks
> 
> Michal
> 
>>  static inline void smp_init_pseries(void) { }
>>  #endif
>>  
>> diff --git a/arch/powerpc/platforms/pseries/smp.c b/arch/powerpc/platforms/pseries/smp.c
>> index c597711ef20a..6c382922f8f3 100644
>> --- a/arch/powerpc/platforms/pseries/smp.c
>> +++ b/arch/powerpc/platforms/pseries/smp.c
>> @@ -21,6 +21,7 @@
>>  #include <linux/device.h>
>>  #include <linux/cpu.h>
>>  #include <linux/pgtable.h>
>> +#include <linux/sysfs.h>
>>  
>>  #include <asm/ptrace.h>
>>  #include <linux/atomic.h>
>> @@ -45,6 +46,8 @@
>>  
>>  #include "pseries.h"
>>  
>> +int pseries_smt;
>> +
>>  /*
>>   * The Primary thread of each non-boot processor was started from the OF client
>>   * interface by prom_hold_cpus and is spinning on secondary_hold_spinloop.
>> @@ -280,3 +283,39 @@ void __init smp_init_pseries(void)
>>  
>>  	pr_debug(" <- smp_init_pSeries()\n");
>>  }
>> +
>> +static ssize_t pseries_smt_store(struct class *class,
>> +			 struct class_attribute *attr,
>> +			 const char *buf, size_t count)
>> +{
>> +	int smt;
>> +
>> +	if (kstrtou32(buf, 0, &smt) || !smt || smt > (u32) threads_per_core) {
>> +		pr_err("Invalid pseries_smt specified.\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	pseries_smt = smt;
>> +
>> +	return count;
>> +}
>> +
>> +static ssize_t pseries_smt_show(struct class *class, struct class_attribute *attr,
>> +			  char *buf)
>> +{
>> +	return sysfs_emit(buf, "%d\n", pseries_smt);
>> +}
>> +
>> +static CLASS_ATTR_RW(pseries_smt);
>> +
>> +static int __init pseries_smt_init(void)
>> +{
>> +	int rc;
>> +
>> +	pseries_smt = smt_enabled_at_boot;
>> +	rc = sysfs_create_file(kernel_kobj, &class_attr_pseries_smt.attr);
>> +	if (rc)
>> +		pr_err("Can't create pseries_smt sysfs/kernel entry.\n");
>> +	return rc;
>> +}
>> +machine_device_initcall(pseries, pseries_smt_init);
>> -- 
>> 2.40.0
>>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] pseries/smp: export the smt level in the SYS FS.
  2023-03-31 15:39 ` [PATCH 1/2] pseries/smp: export the smt level in the SYS FS Laurent Dufour
  2023-03-31 16:05   ` Michal Suchánek
@ 2023-04-13 13:37   ` Michael Ellerman
  2023-04-13 15:38     ` Laurent Dufour
  1 sibling, 1 reply; 10+ messages in thread
From: Michael Ellerman @ 2023-04-13 13:37 UTC (permalink / raw)
  To: Laurent Dufour, npiggin, christophe.leroy
  Cc: msuchanek, nathanl, linux-kernel, linuxppc-dev, Srikar Dronamraju

Hi Laurent,

Laurent Dufour <ldufour@linux.ibm.com> writes:
> There is no SMT level recorded in the kernel neither in user space.
> Indeed there is no real constraint about that and mixed SMT levels are
> allowed and system is working fine this way.
>
> However when new CPU are added, the kernel is onlining all the threads
> which is leading to mixed SMT levels and confuse end user a bit.
>
> To prevent this exports a SMT level from the kernel so user space
> application like the energy daemon, could read it to adjust their settings.
> There is no action unless recording the value when a SMT value is written
> into the new sysfs entry. User space applications like ppc64_cpu should
> update the sysfs when changing the SMT level to keep the system consistent.
>
> Suggested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/pseries.h |  3 ++
>  arch/powerpc/platforms/pseries/smp.c     | 39 ++++++++++++++++++++++++
>  2 files changed, 42 insertions(+)

There is a generic sysfs interface for smt in /sys/devices/system/cpu/smt

I think we should be enabling that on powerpc and then adapting it to
our needs, rather than adding a pseries specific file.

Currently the generic code is only aware of SMT on/off, so it would need
to be taught about SMT4 and 8 at least.

There are already hooks in the generic code to check the SMT level when
bringing CPUs up, see cpu_smt_allowed(), they may work for the pseries
hotplug case too, though maybe we need some additional logic.

Wiring up the basic support is pretty straight forward, something like
the diff below.

cheers


diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 0f123f1f62a1..a48576f1c579 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -260,6 +260,7 @@ config PPC
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_VIRT_CPU_ACCOUNTING
 	select HAVE_VIRT_CPU_ACCOUNTING_GEN
+	select HOTPLUG_SMT			if HOTPLUG_CPU
 	select HUGETLB_PAGE_SIZE_VARIABLE	if PPC_BOOK3S_64 && HUGETLB_PAGE
 	select IOMMU_HELPER			if PPC64
 	select IRQ_DOMAIN
diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index 8a4d4f4d9749..bd23ba716d23 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -143,5 +143,8 @@ static inline int cpu_to_coregroup_id(int cpu)
 #endif
 #endif

+bool topology_is_primary_thread(unsigned int cpu);
+bool topology_smt_supported(void);
+
 #endif /* __KERNEL__ */
 #endif	/* _ASM_POWERPC_TOPOLOGY_H */
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 265801a3e94c..8619609809d5 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1769,4 +1769,20 @@ void __noreturn arch_cpu_idle_dead(void)
 	start_secondary_resume();
 }

+/**
+ * topology_is_primary_thread - Check whether CPU is the primary SMT thread
+ * @cpu:	CPU to check
+ */
+bool topology_is_primary_thread(unsigned int cpu)
+{
+	return cpu == cpu_first_thread_sibling(cpu);
+}
+
+/**
+ * topology_smt_supported - Check whether SMT is supported by the CPUs
+ */
+bool topology_smt_supported(void)
+{
+	return threads_per_core > 1;
+}
 #endif

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] pseries/smp: export the smt level in the SYS FS.
  2023-04-13 13:37   ` Michael Ellerman
@ 2023-04-13 15:38     ` Laurent Dufour
  2023-04-14 12:11       ` Michael Ellerman
  2023-04-18 17:25       ` Srikar Dronamraju
  0 siblings, 2 replies; 10+ messages in thread
From: Laurent Dufour @ 2023-04-13 15:38 UTC (permalink / raw)
  To: Michael Ellerman, npiggin, christophe.leroy
  Cc: msuchanek, nathanl, linux-kernel, linuxppc-dev, Srikar Dronamraju

On 13/04/2023 15:37:59, Michael Ellerman wrote:
> Hi Laurent,
> 
> Laurent Dufour <ldufour@linux.ibm.com> writes:
>> There is no SMT level recorded in the kernel neither in user space.
>> Indeed there is no real constraint about that and mixed SMT levels are
>> allowed and system is working fine this way.
>>
>> However when new CPU are added, the kernel is onlining all the threads
>> which is leading to mixed SMT levels and confuse end user a bit.
>>
>> To prevent this exports a SMT level from the kernel so user space
>> application like the energy daemon, could read it to adjust their settings.
>> There is no action unless recording the value when a SMT value is written
>> into the new sysfs entry. User space applications like ppc64_cpu should
>> update the sysfs when changing the SMT level to keep the system consistent.
>>
>> Suggested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
>> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
>> ---
>>  arch/powerpc/platforms/pseries/pseries.h |  3 ++
>>  arch/powerpc/platforms/pseries/smp.c     | 39 ++++++++++++++++++++++++
>>  2 files changed, 42 insertions(+)
> 
> There is a generic sysfs interface for smt in /sys/devices/system/cpu/smt
> 
> I think we should be enabling that on powerpc and then adapting it to
> our needs, rather than adding a pseries specific file.

Thanks Michael, I was not aware of this sysfs interface.

> Currently the generic code is only aware of SMT on/off, so it would need
> to be taught about SMT4 and 8 at least.

Do you think we should limit our support to SMT4 and SMT8 only?

> There are already hooks in the generic code to check the SMT level when
> bringing CPUs up, see cpu_smt_allowed(), they may work for the pseries
> hotplug case too, though maybe we need some additional logic.
> 
> Wiring up the basic support is pretty straight forward, something like
> the diff below.

I'll look into how to wire this up.
Thanks a lot!

> cheers
> 
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 0f123f1f62a1..a48576f1c579 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -260,6 +260,7 @@ config PPC
>  	select HAVE_SYSCALL_TRACEPOINTS
>  	select HAVE_VIRT_CPU_ACCOUNTING
>  	select HAVE_VIRT_CPU_ACCOUNTING_GEN
> +	select HOTPLUG_SMT			if HOTPLUG_CPU
>  	select HUGETLB_PAGE_SIZE_VARIABLE	if PPC_BOOK3S_64 && HUGETLB_PAGE
>  	select IOMMU_HELPER			if PPC64
>  	select IRQ_DOMAIN
> diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
> index 8a4d4f4d9749..bd23ba716d23 100644
> --- a/arch/powerpc/include/asm/topology.h
> +++ b/arch/powerpc/include/asm/topology.h
> @@ -143,5 +143,8 @@ static inline int cpu_to_coregroup_id(int cpu)
>  #endif
>  #endif
> 
> +bool topology_is_primary_thread(unsigned int cpu);
> +bool topology_smt_supported(void);
> +
>  #endif /* __KERNEL__ */
>  #endif	/* _ASM_POWERPC_TOPOLOGY_H */
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 265801a3e94c..8619609809d5 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -1769,4 +1769,20 @@ void __noreturn arch_cpu_idle_dead(void)
>  	start_secondary_resume();
>  }
> 
> +/**
> + * topology_is_primary_thread - Check whether CPU is the primary SMT thread
> + * @cpu:	CPU to check
> + */
> +bool topology_is_primary_thread(unsigned int cpu)
> +{
> +	return cpu == cpu_first_thread_sibling(cpu);
> +}
> +
> +/**
> + * topology_smt_supported - Check whether SMT is supported by the CPUs
> + */
> +bool topology_smt_supported(void)
> +{
> +	return threads_per_core > 1;
> +}
>  #endif


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] pseries/smp: export the smt level in the SYS FS.
  2023-04-13 15:38     ` Laurent Dufour
@ 2023-04-14 12:11       ` Michael Ellerman
  2023-04-14 14:38         ` Michal Suchánek
  2023-04-18 17:25       ` Srikar Dronamraju
  1 sibling, 1 reply; 10+ messages in thread
From: Michael Ellerman @ 2023-04-14 12:11 UTC (permalink / raw)
  To: Laurent Dufour, npiggin, christophe.leroy
  Cc: msuchanek, nathanl, linux-kernel, linuxppc-dev, Srikar Dronamraju

Laurent Dufour <ldufour@linux.ibm.com> writes:
> On 13/04/2023 15:37:59, Michael Ellerman wrote:
>> Laurent Dufour <ldufour@linux.ibm.com> writes:
>>> There is no SMT level recorded in the kernel neither in user space.
>>> Indeed there is no real constraint about that and mixed SMT levels are
>>> allowed and system is working fine this way.
>>>
>>> However when new CPU are added, the kernel is onlining all the threads
>>> which is leading to mixed SMT levels and confuse end user a bit.
>>>
>>> To prevent this exports a SMT level from the kernel so user space
>>> application like the energy daemon, could read it to adjust their settings.
>>> There is no action unless recording the value when a SMT value is written
>>> into the new sysfs entry. User space applications like ppc64_cpu should
>>> update the sysfs when changing the SMT level to keep the system consistent.
>>>
>>> Suggested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
>>> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
>>> ---
>>>  arch/powerpc/platforms/pseries/pseries.h |  3 ++
>>>  arch/powerpc/platforms/pseries/smp.c     | 39 ++++++++++++++++++++++++
>>>  2 files changed, 42 insertions(+)
>>
>> There is a generic sysfs interface for smt in /sys/devices/system/cpu/smt
>>
>> I think we should be enabling that on powerpc and then adapting it to
>> our needs, rather than adding a pseries specific file.
>
> Thanks Michael, I was not aware of this sysfs interface.
>
>> Currently the generic code is only aware of SMT on/off, so it would need
>> to be taught about SMT4 and 8 at least.
>
> Do you think we should limit our support to SMT4 and SMT8 only?

Possibly? Currently the SMT state is represented by an enum:

enum cpuhp_smt_control {
	CPU_SMT_ENABLED,
	CPU_SMT_DISABLED,
	CPU_SMT_FORCE_DISABLED,
	CPU_SMT_NOT_SUPPORTED,
	CPU_SMT_NOT_IMPLEMENTED,
};

Adding two states for SMT4 and SMT8 seeems like it might be acceptable.

On the other hand if we want to support artbitrary SMT values from 3 to
8 then it might be better to store that value separately from the state
enum.

TBH I'm not sure whether we want to support values other than 1/2/4/8
via this interface.

A user who wants some odd numbered SMT value can always configure that
manually using the existing tools.

But maybe it's less confusing if this interface supports all values?
Even if they're unlikely to get much usage.

>> There are already hooks in the generic code to check the SMT level when
>> bringing CPUs up, see cpu_smt_allowed(), they may work for the pseries
>> hotplug case too, though maybe we need some additional logic.
>>
>> Wiring up the basic support is pretty straight forward, something like
>> the diff below.
>
> I'll look into how to wire this up.
> Thanks a lot!

Thanks.

cheers

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] pseries/smp: export the smt level in the SYS FS.
  2023-04-14 12:11       ` Michael Ellerman
@ 2023-04-14 14:38         ` Michal Suchánek
  0 siblings, 0 replies; 10+ messages in thread
From: Michal Suchánek @ 2023-04-14 14:38 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Laurent Dufour, npiggin, christophe.leroy, nathanl, linux-kernel,
	linuxppc-dev, Srikar Dronamraju

Hello,

On Fri, Apr 14, 2023 at 10:11:24PM +1000, Michael Ellerman wrote:
> Laurent Dufour <ldufour@linux.ibm.com> writes:
> > On 13/04/2023 15:37:59, Michael Ellerman wrote:
> >> Laurent Dufour <ldufour@linux.ibm.com> writes:
> >>> There is no SMT level recorded in the kernel neither in user space.
> >>> Indeed there is no real constraint about that and mixed SMT levels are
> >>> allowed and system is working fine this way.
> >>>
> >>> However when new CPU are added, the kernel is onlining all the threads
> >>> which is leading to mixed SMT levels and confuse end user a bit.
> >>>
> >>> To prevent this exports a SMT level from the kernel so user space
> >>> application like the energy daemon, could read it to adjust their settings.
> >>> There is no action unless recording the value when a SMT value is written
> >>> into the new sysfs entry. User space applications like ppc64_cpu should
> >>> update the sysfs when changing the SMT level to keep the system consistent.
> >>>
> >>> Suggested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> >>> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
> >>> ---
> >>>  arch/powerpc/platforms/pseries/pseries.h |  3 ++
> >>>  arch/powerpc/platforms/pseries/smp.c     | 39 ++++++++++++++++++++++++
> >>>  2 files changed, 42 insertions(+)
> >>
> >> There is a generic sysfs interface for smt in /sys/devices/system/cpu/smt
> >>
> >> I think we should be enabling that on powerpc and then adapting it to
> >> our needs, rather than adding a pseries specific file.
> >
> > Thanks Michael, I was not aware of this sysfs interface.
> >
> >> Currently the generic code is only aware of SMT on/off, so it would need
> >> to be taught about SMT4 and 8 at least.
> >
> > Do you think we should limit our support to SMT4 and SMT8 only?
> 
> Possibly? Currently the SMT state is represented by an enum:
> 
> enum cpuhp_smt_control {
> 	CPU_SMT_ENABLED,
> 	CPU_SMT_DISABLED,
> 	CPU_SMT_FORCE_DISABLED,
> 	CPU_SMT_NOT_SUPPORTED,
> 	CPU_SMT_NOT_IMPLEMENTED,
> };
> 
> Adding two states for SMT4 and SMT8 seeems like it might be acceptable.
> 
> On the other hand if we want to support artbitrary SMT values from 3 to
> 8 then it might be better to store that value separately from the state
> enum.
> 
> TBH I'm not sure whether we want to support values other than 1/2/4/8
> via this interface.
> 
> A user who wants some odd numbered SMT value can always configure that
> manually using the existing tools.
> 
> But maybe it's less confusing if this interface supports all values?
> Even if they're unlikely to get much usage.

It looks like ppc64_cpu simply enables first n threads of the CPU where
n is the smt value without any interleaving hoping that the architecture
does the right thing. Under this implementation smt=3 is well-defined.

For the dual cluster P9 CPUs that have two clusters of four this might
work out well for some workloads, and others might want that
interleaving. With that the odd smt values are not well-definedd
anymore.

Nonetheless, if the kernel does support some smt=n parameter whatever
the semantic this should be also supported by the runtime knob.

If it's too difficult to get right there is always that option to not
enable any thread by default, and let the userspace to implement
arbitrarily complex schemes :)

Thanks

Michal

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] pseries/smp: export the smt level in the SYS FS.
  2023-04-13 15:38     ` Laurent Dufour
  2023-04-14 12:11       ` Michael Ellerman
@ 2023-04-18 17:25       ` Srikar Dronamraju
  1 sibling, 0 replies; 10+ messages in thread
From: Srikar Dronamraju @ 2023-04-18 17:25 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: Michael Ellerman, npiggin, christophe.leroy, msuchanek, nathanl,
	linux-kernel, linuxppc-dev

* Laurent Dufour <ldufour@linux.ibm.com> [2023-04-13 17:38:51]:

> On 13/04/2023 15:37:59, Michael Ellerman wrote:
> > Hi Laurent,
> > 
> > Laurent Dufour <ldufour@linux.ibm.com> writes:
> >> There is no SMT level recorded in the kernel neither in user space.
> >> Indeed there is no real constraint about that and mixed SMT levels are
> >> allowed and system is working fine this way.
> >>
> >> However when new CPU are added, the kernel is onlining all the threads
> >> which is leading to mixed SMT levels and confuse end user a bit.
> >>
> >> To prevent this exports a SMT level from the kernel so user space
> >> application like the energy daemon, could read it to adjust their settings.
> >> There is no action unless recording the value when a SMT value is written
> >> into the new sysfs entry. User space applications like ppc64_cpu should
> >> update the sysfs when changing the SMT level to keep the system consistent.
> >>
> >> Suggested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> >> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
> >> ---
> >>  arch/powerpc/platforms/pseries/pseries.h |  3 ++
> >>  arch/powerpc/platforms/pseries/smp.c     | 39 ++++++++++++++++++++++++
> >>  2 files changed, 42 insertions(+)
> > 
> > There is a generic sysfs interface for smt in /sys/devices/system/cpu/smt
> > 
> > I think we should be enabling that on powerpc and then adapting it to
> > our needs, rather than adding a pseries specific file.
> 
> Thanks Michael, I was not aware of this sysfs interface.
> 
> > Currently the generic code is only aware of SMT on/off, so it would need
> > to be taught about SMT4 and 8 at least.
> 
> Do you think we should limit our support to SMT4 and SMT8 only?

smt2 is also a valid already supported configuration and we are evaluating 
smt6 mode based on some inputs from ISV teams. 

So I believe having a value for all modes would be good. 

> 
> > There are already hooks in the generic code to check the SMT level when
> > bringing CPUs up, see cpu_smt_allowed(), they may work for the pseries
> > hotplug case too, though maybe we need some additional logic.
> > 
> > Wiring up the basic support is pretty straight forward, something like
> > the diff below.
> 

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-04-18 17:25 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-31 15:39 [PATCH 0/2] Online new threads according to the current SMT level Laurent Dufour
2023-03-31 15:39 ` [PATCH 1/2] pseries/smp: export the smt level in the SYS FS Laurent Dufour
2023-03-31 16:05   ` Michal Suchánek
2023-04-03  8:20     ` Laurent Dufour
2023-04-13 13:37   ` Michael Ellerman
2023-04-13 15:38     ` Laurent Dufour
2023-04-14 12:11       ` Michael Ellerman
2023-04-14 14:38         ` Michal Suchánek
2023-04-18 17:25       ` Srikar Dronamraju
2023-03-31 15:39 ` [PATCH 2/2] powerpc/pseries/cpuhp: respect current SMT when adding new CPU Laurent Dufour

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).