All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7 RESEND] turbostat: Fix AMD output by making turbostat aware of nodes
@ 2017-07-28 12:06 Prarit Bhargava
  2017-07-28 12:06 ` [PATCH v2] turbostat: Running on virtual machine is not supported Prarit Bhargava
                   ` (8 more replies)
  0 siblings, 9 replies; 12+ messages in thread
From: Prarit Bhargava @ 2017-07-28 12:06 UTC (permalink / raw)
  To: linux-pm; +Cc: lenb, Prarit Bhargava, Len Brown

[Sending to linux-pm instead of linux-kernel ...]

AMD family processors do not show all cores in the output of turbostat.  This
occurs because AMD has multiple nodes per socket and enumerates cores
within each node from 0.  For example, socket 0 may have two nodes (0 and 1)
and those nodes both have cores enumerated from 0 through 7.  turbostat cannot
handle this configuration, and as a result only shows 1/2 the cores in its
output.

This patchset makes turbostate aware of nodes.  It has been tested on
various AMD and Intel systems and no issues have been found.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Len Brown <len.brown@intel.com>

Prarit Bhargava (7):
  turbostat: set max_num_cpus equal to the cpumask length
  turbostat: Fix node and siblings lookup data
  turbostat: Calculate additional node information for a package
  turbostat: track thread ID in cpu_topology
  turbostat: rename num_cores_per_pkg to num_cores_per_node
  turbostat: remove num_ from cpu_topology struct
  turbostat: add node information into turbostat calculations

 tools/power/x86/turbostat/turbostat.c | 419 +++++++++++++++++++++-------------
 1 file changed, 259 insertions(+), 160 deletions(-)

-- 
1.8.5.5

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2] turbostat: Running on virtual machine is not supported
  2017-07-28 12:06 [PATCH 0/7 RESEND] turbostat: Fix AMD output by making turbostat aware of nodes Prarit Bhargava
@ 2017-07-28 12:06 ` Prarit Bhargava
  2017-07-28 12:06 ` [PATCH 1/7] turbostat: set max_num_cpus equal to the cpumask length Prarit Bhargava
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Prarit Bhargava @ 2017-07-28 12:06 UTC (permalink / raw)
  To: linux-pm; +Cc: lenb, Prarit Bhargava, Len Brown, Henrique de Moraes Holschuh

When running turbostat on a virtual machine the error

turbostat: msr 0 offset 0xe2 read failed: Input/output error

is output to the user.

/dev/msr and perf do not work on a virtual machine.  turbostat is
dependent on that support so turbostat does not work either.

When an error is returned from an msr read, turbostat must check to see if
the system is a virtual machine and return an error to the user.

[v2]: Only check for VM if msr read fails

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
---
 tools/power/x86/turbostat/turbostat.c | 57 +++++++++++++++++++++++++++--------
 1 file changed, 45 insertions(+), 12 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 0dafba2c1e7d..97c51bdce2ef 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -333,14 +333,58 @@ int get_msr_fd(int cpu)
 	return fd;
 }
 
+/*
+ * Open a file, and exit on failure
+ */
+FILE *fopen_or_die(const char *path, const char *mode)
+{
+	FILE *filep = fopen(path, mode);
+
+	if (!filep)
+		err(1, "%s: open failed", path);
+	return filep;
+}
+
+void err_on_hypervisor(void)
+{
+	FILE *cpuinfo;
+	char *flags, *hypervisor;
+	char *buffer;
+
+	/* On VMs /proc/cpuinfo contains a "flags" entry for hypervisor */
+	cpuinfo = fopen_or_die("/proc/cpuinfo", "ro");
+
+	buffer = malloc(4096);
+	if (!buffer)
+		err(-ENOMEM, "buffer malloc fail");
+
+	fread(buffer, 1024, 1, cpuinfo);
+
+	flags = strstr(buffer, "flags");
+	rewind(cpuinfo);
+	fseek(cpuinfo, flags - buffer, SEEK_SET);
+	fgets(buffer, 4096, cpuinfo);
+	fclose(cpuinfo);
+
+	hypervisor = strstr(buffer, "hypervisor");
+
+	free(buffer);
+
+	if (hypervisor)
+		err(-1,
+		    "turbostat is not supported on this virtual machine.");
+}
+
 int get_msr(int cpu, off_t offset, unsigned long long *msr)
 {
 	ssize_t retval;
 
 	retval = pread(get_msr_fd(cpu), msr, sizeof(*msr), offset);
 
-	if (retval != sizeof *msr)
+	if (retval != sizeof *msr) {
+		err_on_hypervisor();
 		err(-1, "cpu%d: msr offset 0x%llx read failed", cpu, (unsigned long long)offset);
+	}
 
 	return 0;
 }
@@ -1453,17 +1497,6 @@ static unsigned long long rdtsc(void)
 }
 
 /*
- * Open a file, and exit on failure
- */
-FILE *fopen_or_die(const char *path, const char *mode)
-{
-	FILE *filep = fopen(path, mode);
-
-	if (!filep)
-		err(1, "%s: open failed", path);
-	return filep;
-}
-/*
  * snapshot_sysfs_counter()
  *
  * return snapshot of given counter
-- 
1.8.5.5

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 1/7] turbostat: set max_num_cpus equal to the cpumask length
  2017-07-28 12:06 [PATCH 0/7 RESEND] turbostat: Fix AMD output by making turbostat aware of nodes Prarit Bhargava
  2017-07-28 12:06 ` [PATCH v2] turbostat: Running on virtual machine is not supported Prarit Bhargava
@ 2017-07-28 12:06 ` Prarit Bhargava
  2017-07-28 12:06 ` [PATCH 2/7] turbostat: Fix node and siblings lookup data Prarit Bhargava
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Prarit Bhargava @ 2017-07-28 12:06 UTC (permalink / raw)
  To: linux-pm; +Cc: lenb, Prarit Bhargava, Len Brown

Future fixes will use sysfs files that contain cpumask output.  The code
needs to know the length of the cpumask in order to determine which cpus
are set in a cpumask.  Currently topo.max_cpu_num is the maximum cpu
number.  It can be increased the the maximum value of cpus represented in
cpumasks.

Set max_num_cpus to the length of a cpumask.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 0dafba2c1e7d..6d368e744d27 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -2413,6 +2413,20 @@ void re_initialize(void)
 	printf("turbostat: re-initialized with num_cpus %d\n", topo.num_cpus);
 }
 
+void set_max_cpu_num(void)
+{
+	FILE *filep;
+	unsigned long dummy;
+
+	topo.max_cpu_num = 0;
+	filep = fopen_or_die(
+			"/sys/devices/system/cpu/cpu0/topology/thread_siblings",
+			"r");
+	while (fscanf(filep, "%lx,", &dummy) == 1)
+		topo.max_cpu_num+=32;
+	fclose(filep);
+	topo.max_cpu_num--; /* 0 based */
+}
 
 /*
  * count_cpus()
@@ -2420,10 +2434,7 @@ void re_initialize(void)
  */
 int count_cpus(int cpu)
 {
-	if (topo.max_cpu_num < cpu)
-		topo.max_cpu_num = cpu;
-
-	topo.num_cpus += 1;
+	topo.num_cpus++;
 	return 0;
 }
 int mark_cpu_present(int cpu)
@@ -4320,8 +4331,8 @@ void topology_probe()
 	} *cpus;
 
 	/* Initialize num_cpus, max_cpu_num */
+	set_max_cpu_num();
 	topo.num_cpus = 0;
-	topo.max_cpu_num = 0;
 	for_all_proc_cpus(count_cpus);
 	if (!summary_only && topo.num_cpus > 1)
 		BIC_PRESENT(BIC_CPU);
-- 
1.8.5.5

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/7] turbostat: Fix node and siblings lookup data
  2017-07-28 12:06 [PATCH 0/7 RESEND] turbostat: Fix AMD output by making turbostat aware of nodes Prarit Bhargava
  2017-07-28 12:06 ` [PATCH v2] turbostat: Running on virtual machine is not supported Prarit Bhargava
  2017-07-28 12:06 ` [PATCH 1/7] turbostat: set max_num_cpus equal to the cpumask length Prarit Bhargava
@ 2017-07-28 12:06 ` Prarit Bhargava
  2017-07-28 12:06 ` [PATCH 3/7] turbostat: Calculate additional node information for a package Prarit Bhargava
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Prarit Bhargava @ 2017-07-28 12:06 UTC (permalink / raw)
  To: linux-pm; +Cc: lenb, Prarit Bhargava, Len Brown

The turbostat code only looks at thread_siblings_list to determine if
processing units/threads are on the same the core.  This works well on
Intel systems which have a shared L1 instruction and data cache.  This
does not work on AMD systems which have shared L1 instruction cache but
separate L1 data caches.  Other utilities also check sibling's core ID
to determine if the processing unit shares the same core.

Additionally, the cpu_topology *cpus list used in topology_probe() can
be used elsewhere in the code to simplify things.

Export *cpus to the entire turbostat code, and add Processing Unit/Thread
IDs information to each cpu_topology struct.  Confirm that the thread
is on the same core as indicated by thread_siblings_list.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.c | 112 +++++++++++++++++++++++-----------
 1 file changed, 77 insertions(+), 35 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 6d368e744d27..5f66988b0303 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -149,6 +149,7 @@
 cpu_set_t *cpu_present_set, *cpu_affinity_set, *cpu_subset;
 size_t cpu_present_setsize, cpu_affinity_setsize, cpu_subset_size;
 #define MAX_ADDED_COUNTERS 16
+#define BITMASK_SIZE 32
 
 struct thread_data {
 	struct timeval tv_begin;
@@ -245,6 +246,13 @@ struct system_summary {
 	struct pkg_data packages;
 } average;
 
+struct cpu_topology {
+	int physical_package_id;
+	int logical_cpu_id;
+	int node_id;
+	int physical_core_id;
+	cpu_set_t *put_ids; /* Processing Unit/Thread IDs */
+} *cpus;
 
 struct topo_params {
 	int num_packages;
@@ -2188,6 +2196,8 @@ void free_fd_percpu(void)
 
 void free_all_buffers(void)
 {
+	int i;
+
 	CPU_FREE(cpu_present_set);
 	cpu_present_set = NULL;
 	cpu_present_setsize = 0;
@@ -2220,6 +2230,12 @@ void free_all_buffers(void)
 
 	free(irq_column_2_cpu);
 	free(irqs_per_cpu);
+
+	for (i = 0; i <= topo.max_cpu_num; ++i) {
+		if (cpus[i].put_ids)
+			CPU_FREE(cpus[i].put_ids);
+	}
+	free(cpus);
 }
 
 
@@ -2300,35 +2316,55 @@ int get_core_id(int cpu)
 	return parse_int_file("/sys/devices/system/cpu/cpu%d/topology/core_id", cpu);
 }
 
-int get_num_ht_siblings(int cpu)
+int get_node_id(struct cpu_topology *thiscpu)
 {
 	char path[80];
 	FILE *filep;
-	int sib1;
-	int matches = 0;
-	char character;
-	char str[100];
-	char *ch;
+	int i;
+	int cpu = thiscpu->logical_cpu_id;
+	for (i = 0; i <= topo.max_cpu_num; i++) {
+		sprintf(path, "/sys/devices/system/cpu/cpu%d/node%i/cpulist",
+			cpu, i);
+		filep = fopen(path, "r");
+		if (!filep)
+			continue;
+		fclose(filep);
+		return i;
+	}
+	return -1;
+}
 
-	sprintf(path, "/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list", cpu);
-	filep = fopen_or_die(path, "r");
+int get_thread_siblings(struct cpu_topology *thiscpu)
+{
+	char path[80], character;
+	FILE *filep;
+	unsigned long map;
+	int shift, sib_core;
+	int cpu = thiscpu->logical_cpu_id;
+	int offset = topo.max_cpu_num + 1;
 
-	/*
-	 * file format:
-	 * A ',' separated or '-' separated set of numbers
-	 * (eg 1-2 or 1,3,4,5)
-	 */
-	fscanf(filep, "%d%c\n", &sib1, &character);
-	fseek(filep, 0, SEEK_SET);
-	fgets(str, 100, filep);
-	ch = strchr(str, character);
-	while (ch != NULL) {
-		matches++;
-		ch = strchr(ch+1, character);
-	}
+	thiscpu->put_ids = CPU_ALLOC((topo.max_cpu_num + 1));
+	if (!thiscpu->put_ids)
+		return -1;
+	CPU_ZERO(thiscpu->put_ids);
 
+	sprintf(path, "/sys/devices/system/cpu/cpu%d/topology/thread_siblings", cpu);
+	filep = fopen_or_die(path, "r");
+	do {
+		offset-=BITMASK_SIZE;
+		fscanf(filep, "%lx%c", &map, &character);
+		for (shift = 0; shift < BITMASK_SIZE; shift++) {
+			if ((map >> shift) & 0x1) {
+				sib_core = get_core_id(shift + offset);
+				if (sib_core == thiscpu->physical_core_id)
+					CPU_SET(shift + offset,
+						thiscpu->put_ids);
+			}
+		}
+	} while (!strncmp(&character, ",", 1));
 	fclose(filep);
-	return matches+1;
+
+	return CPU_COUNT(thiscpu->put_ids);
 }
 
 /*
@@ -2423,7 +2459,7 @@ void set_max_cpu_num(void)
 			"/sys/devices/system/cpu/cpu0/topology/thread_siblings",
 			"r");
 	while (fscanf(filep, "%lx,", &dummy) == 1)
-		topo.max_cpu_num+=32;
+		topo.max_cpu_num+=BITMASK_SIZE;
 	fclose(filep);
 	topo.max_cpu_num--; /* 0 based */
 }
@@ -4325,10 +4361,6 @@ void topology_probe()
 	int max_core_id = 0;
 	int max_package_id = 0;
 	int max_siblings = 0;
-	struct cpu_topology {
-		int core_id;
-		int physical_package_id;
-	} *cpus;
 
 	/* Initialize num_cpus, max_cpu_num */
 	set_max_cpu_num();
@@ -4385,20 +4417,32 @@ void topology_probe()
 				fprintf(outf, "cpu%d NOT PRESENT\n", i);
 			continue;
 		}
-		cpus[i].core_id = get_core_id(i);
-		if (cpus[i].core_id > max_core_id)
-			max_core_id = cpus[i].core_id;
 
+		cpus[i].logical_cpu_id = i;
+
+		/* get package information */
 		cpus[i].physical_package_id = get_physical_package_id(i);
 		if (cpus[i].physical_package_id > max_package_id)
 			max_package_id = cpus[i].physical_package_id;
 
-		siblings = get_num_ht_siblings(i);
+		/* get numa node information */
+		cpus[i].node_id = get_node_id(&cpus[i]);
+
+		/* get core information */
+		cpus[i].physical_core_id = get_core_id(i);
+		if (cpus[i].physical_core_id > max_core_id)
+			max_core_id = cpus[i].physical_core_id;
+
+		/* get thread information */
+		siblings = get_thread_siblings(&cpus[i]);
 		if (siblings > max_siblings)
 			max_siblings = siblings;
+
 		if (debug > 1)
-			fprintf(outf, "cpu %d pkg %d core %d\n",
-				i, cpus[i].physical_package_id, cpus[i].core_id);
+			fprintf(outf, "cpu %d pkg %d node %d core %d\n",
+				i, cpus[i].physical_package_id,
+				cpus[i].node_id,
+				cpus[i].physical_core_id);
 	}
 	topo.num_cores_per_pkg = max_core_id + 1;
 	if (debug > 1)
@@ -4417,8 +4461,6 @@ void topology_probe()
 	topo.num_threads_per_core = max_siblings;
 	if (debug > 1)
 		fprintf(outf, "max_siblings %d\n", max_siblings);
-
-	free(cpus);
 }
 
 void
-- 
1.8.5.5

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/7] turbostat: Calculate additional node information for a package
  2017-07-28 12:06 [PATCH 0/7 RESEND] turbostat: Fix AMD output by making turbostat aware of nodes Prarit Bhargava
                   ` (2 preceding siblings ...)
  2017-07-28 12:06 ` [PATCH 2/7] turbostat: Fix node and siblings lookup data Prarit Bhargava
@ 2017-07-28 12:06 ` Prarit Bhargava
  2017-07-28 12:06 ` [PATCH 4/7] turbostat: track thread ID in cpu_topology Prarit Bhargava
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Prarit Bhargava @ 2017-07-28 12:06 UTC (permalink / raw)
  To: linux-pm; +Cc: lenb, Prarit Bhargava, Len Brown

The code currently assumes each package has exactly one node.  This is not
the case for AMD systems and Intel systems with COD.  AMD systems also
may re-enumerate each node's core IDs starting at 0 (for example, an AMD
processor may have two nodes, each with core IDs from 0 to 7).  In order
to properly enumerate the cores we need to track both the physical and
logical node IDs.

Add physical_node_id to track the node ID assigned by the kernel, and
logical_node_id used by turbostat to track the nodes per package ie) a 0-based
count within the package.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.c | 66 ++++++++++++++++++++++++++++++++---
 1 file changed, 62 insertions(+), 4 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 5f66988b0303..696ced1974c5 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -249,7 +249,8 @@ struct system_summary {
 struct cpu_topology {
 	int physical_package_id;
 	int logical_cpu_id;
-	int node_id;
+	int physical_node_id;
+	int logical_node_id;	/* 0-based count within the package */
 	int physical_core_id;
 	cpu_set_t *put_ids; /* Processing Unit/Thread IDs */
 } *cpus;
@@ -259,6 +260,8 @@ struct topo_params {
 	int num_cpus;
 	int num_cores;
 	int max_cpu_num;
+	int max_node_num;
+	int num_nodes_per_pkg;
 	int num_cores_per_pkg;
 	int num_threads_per_core;
 } topo;
@@ -2316,7 +2319,54 @@ int get_core_id(int cpu)
 	return parse_int_file("/sys/devices/system/cpu/cpu%d/topology/core_id", cpu);
 }
 
-int get_node_id(struct cpu_topology *thiscpu)
+void set_node_data(void)
+{
+	char path[80];
+	FILE *filep;
+	int pkg, node, cpu;
+
+	struct pkg_node_info {
+		int count;
+		int min;
+	} *pni;
+
+	pni = calloc(topo.num_packages, sizeof(struct pkg_node_info));
+	if (!pni)
+		err(1, "calloc pkg_node_count");
+
+	for (pkg = 0; pkg < topo.num_packages; pkg++)
+		pni[pkg].min = topo.num_cpus;
+
+	for (node = 0; node <= topo.max_node_num; node++) {
+		/* find the "first" cpu in the node */
+		sprintf(path, "/sys/bus/node/devices/node%d/cpulist", node);
+		filep = fopen(path, "r");
+		if (!filep)
+			continue;
+		fscanf(filep, "%d", &cpu);
+		fclose(filep);
+
+		pkg = cpus[cpu].physical_package_id;
+		pni[pkg].count++;
+
+		if (node < pni[pkg].min)
+			pni[pkg].min = node;
+	}
+
+	for (pkg = 0; pkg < topo.num_packages; pkg++)
+		if (pni[pkg].count > topo.num_nodes_per_pkg)
+			topo.num_nodes_per_pkg = pni[0].count;
+
+	for (cpu = 0; cpu < topo.num_cpus; cpu++) {
+		pkg = cpus[cpu].physical_package_id;
+		node = cpus[cpu].physical_node_id;
+		cpus[cpu].logical_node_id = node - pni[pkg].min;
+	}
+	free(pni);
+
+}
+
+int get_physical_node_id(struct cpu_topology *thiscpu)
 {
 	char path[80];
 	FILE *filep;
@@ -4361,6 +4411,7 @@ void topology_probe()
 	int max_core_id = 0;
 	int max_package_id = 0;
 	int max_siblings = 0;
+	int max_physical_node_id = 0;
 
 	/* Initialize num_cpus, max_cpu_num */
 	set_max_cpu_num();
@@ -4426,7 +4477,9 @@ void topology_probe()
 			max_package_id = cpus[i].physical_package_id;
 
 		/* get numa node information */
-		cpus[i].node_id = get_node_id(&cpus[i]);
+		cpus[i].physical_node_id = get_physical_node_id(&cpus[i]);
+		if (cpus[i].physical_node_id > max_physical_node_id)
+			topo.max_node_num = cpus[i].physical_node_id;
 
 		/* get core information */
 		cpus[i].physical_core_id = get_core_id(i);
@@ -4441,9 +4494,10 @@ void topology_probe()
 		if (debug > 1)
 			fprintf(outf, "cpu %d pkg %d node %d core %d\n",
 				i, cpus[i].physical_package_id,
-				cpus[i].node_id,
+				cpus[i].physical_node_id,
 				cpus[i].physical_core_id);
 	}
+
 	topo.num_cores_per_pkg = max_core_id + 1;
 	if (debug > 1)
 		fprintf(outf, "max_core_id %d, sizing for %d cores per package\n",
@@ -4458,6 +4512,10 @@ void topology_probe()
 	if (!summary_only && topo.num_packages > 1)
 		BIC_PRESENT(BIC_Package);
 
+	set_node_data();
+	if (debug > 1)
+		fprintf(outf, "num_nodes_per_pkg %d\n", topo.num_nodes_per_pkg);
+
 	topo.num_threads_per_core = max_siblings;
 	if (debug > 1)
 		fprintf(outf, "max_siblings %d\n", max_siblings);
-- 
1.8.5.5

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/7] turbostat: track thread ID in cpu_topology
  2017-07-28 12:06 [PATCH 0/7 RESEND] turbostat: Fix AMD output by making turbostat aware of nodes Prarit Bhargava
                   ` (3 preceding siblings ...)
  2017-07-28 12:06 ` [PATCH 3/7] turbostat: Calculate additional node information for a package Prarit Bhargava
@ 2017-07-28 12:06 ` Prarit Bhargava
  2017-07-28 12:06 ` [PATCH 5/7] turbostat: rename num_cores_per_pkg to num_cores_per_node Prarit Bhargava
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Prarit Bhargava @ 2017-07-28 12:06 UTC (permalink / raw)
  To: linux-pm; +Cc: lenb, Prarit Bhargava, Len Brown

The code can be simplified if the cpu_topology *cpus tracks the thread
IDs.  This removes an additional file lookup and simplifies the counter
initialization code.

Add thread ID to cpu_topology information and cleanup the counter
initialization code.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.c | 81 +++++++++--------------------------
 1 file changed, 21 insertions(+), 60 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 696ced1974c5..124d9fbbaab6 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -252,6 +252,7 @@ struct cpu_topology {
 	int physical_node_id;
 	int logical_node_id;	/* 0-based count within the package */
 	int physical_core_id;
+	int thread_id;
 	cpu_set_t *put_ids; /* Processing Unit/Thread IDs */
 } *cpus;
 
@@ -2263,44 +2264,6 @@ int parse_int_file(const char *fmt, ...)
 }
 
 /*
- * get_cpu_position_in_core(cpu)
- * return the position of the CPU among its HT siblings in the core
- * return -1 if the sibling is not in list
- */
-int get_cpu_position_in_core(int cpu)
-{
-	char path[64];
-	FILE *filep;
-	int this_cpu;
-	char character;
-	int i;
-
-	sprintf(path,
-		"/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list",
-		cpu);
-	filep = fopen(path, "r");
-	if (filep == NULL) {
-		perror(path);
-		exit(1);
-	}
-
-	for (i = 0; i < topo.num_threads_per_core; i++) {
-		fscanf(filep, "%d", &this_cpu);
-		if (this_cpu == cpu) {
-			fclose(filep);
-			return i;
-		}
-
-		/* Account for no separator after last thread*/
-		if (i != (topo.num_threads_per_core - 1))
-			fscanf(filep, "%c", &character);
-	}
-
-	fclose(filep);
-	return -1;
-}
-
-/*
  * cpu_is_first_core_in_package(cpu)
  * return 1 if given CPU is 1st core in package
  */
@@ -2392,8 +2355,10 @@ int get_thread_siblings(struct cpu_topology *thiscpu)
 	int shift, sib_core;
 	int cpu = thiscpu->logical_cpu_id;
 	int offset = topo.max_cpu_num + 1;
+	int thread_id = 0;
 
 	thiscpu->put_ids = CPU_ALLOC((topo.max_cpu_num + 1));
+	thiscpu->thread_id = thread_id++;
 	if (!thiscpu->put_ids)
 		return -1;
 	CPU_ZERO(thiscpu->put_ids);
@@ -2406,9 +2371,12 @@ int get_thread_siblings(struct cpu_topology *thiscpu)
 		for (shift = 0; shift < BITMASK_SIZE; shift++) {
 			if ((map >> shift) & 0x1) {
 				sib_core = get_core_id(shift + offset);
-				if (sib_core == thiscpu->physical_core_id)
+				if (sib_core == thiscpu->physical_core_id) {
 					CPU_SET(shift + offset,
 						thiscpu->put_ids);
+					if ((shift + offset) != cpu)
+						cpus[shift + offset].thread_id = thread_id++;
+				}
 			}
 		}
 	} while (!strncmp(&character, ",", 1));
@@ -4485,6 +4453,8 @@ void topology_probe()
 		cpus[i].physical_core_id = get_core_id(i);
 		if (cpus[i].physical_core_id > max_core_id)
 			max_core_id = cpus[i].physical_core_id;
+		if (!cpus[i].thread_id)
+			topo.num_cores++;
 
 		/* get thread information */
 		siblings = get_thread_siblings(&cpus[i]);
@@ -4557,47 +4527,38 @@ void topology_probe()
 /*
  * init_counter()
  *
- * set cpu_id, core_num, pkg_num
  * set FIRST_THREAD_IN_CORE and FIRST_CORE_IN_PACKAGE
- *
- * increment topo.num_cores when 1st core in pkg seen
  */
 void init_counter(struct thread_data *thread_base, struct core_data *core_base,
-	struct pkg_data *pkg_base, int thread_num, int core_num,
-	int pkg_num, int cpu_id)
+	struct pkg_data *pkg_base, int cpu_id)
 {
+	int pkg_id = cpus[cpu_id].physical_package_id;
+	int core_id = cpus[cpu_id].physical_core_id;
+	int thread_id = cpus[cpu_id].thread_id;
 	struct thread_data *t;
 	struct core_data *c;
 	struct pkg_data *p;
 
-	t = GET_THREAD(thread_base, thread_num, core_num, pkg_num);
-	c = GET_CORE(core_base, core_num, pkg_num);
-	p = GET_PKG(pkg_base, pkg_num);
+	t = GET_THREAD(thread_base, thread_id, core_id, pkg_id);
+	c = GET_CORE(core_base, core_id, pkg_id);
+	p = GET_PKG(pkg_base, pkg_id);
 
 	t->cpu_id = cpu_id;
-	if (thread_num == 0) {
+	if (thread_id == 0) {
 		t->flags |= CPU_IS_FIRST_THREAD_IN_CORE;
 		if (cpu_is_first_core_in_package(cpu_id))
 			t->flags |= CPU_IS_FIRST_CORE_IN_PACKAGE;
 	}
 
-	c->core_id = core_num;
-	p->package_id = pkg_num;
+	c->core_id = core_id;
+	p->package_id = pkg_id;
 }
 
 
 int initialize_counters(int cpu_id)
 {
-	int my_thread_id, my_core_id, my_package_id;
-
-	my_package_id = get_physical_package_id(cpu_id);
-	my_core_id = get_core_id(cpu_id);
-	my_thread_id = get_cpu_position_in_core(cpu_id);
-	if (!my_thread_id)
-		topo.num_cores++;
-
-	init_counter(EVEN_COUNTERS, my_thread_id, my_core_id, my_package_id, cpu_id);
-	init_counter(ODD_COUNTERS, my_thread_id, my_core_id, my_package_id, cpu_id);
+	init_counter(EVEN_COUNTERS, cpu_id);
+	init_counter(ODD_COUNTERS, cpu_id);
 	return 0;
 }
 
-- 
1.8.5.5

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/7] turbostat: rename num_cores_per_pkg to num_cores_per_node
  2017-07-28 12:06 [PATCH 0/7 RESEND] turbostat: Fix AMD output by making turbostat aware of nodes Prarit Bhargava
                   ` (4 preceding siblings ...)
  2017-07-28 12:06 ` [PATCH 4/7] turbostat: track thread ID in cpu_topology Prarit Bhargava
@ 2017-07-28 12:06 ` Prarit Bhargava
  2017-07-28 12:06 ` [PATCH 6/7] turbostat: remove num_ from cpu_topology struct Prarit Bhargava
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Prarit Bhargava @ 2017-07-28 12:06 UTC (permalink / raw)
  To: linux-pm; +Cc: lenb, Prarit Bhargava, Len Brown

turbostat incorrectly assumes that there is one node per package.  As a
result num_cores_per_pkg is not correctly named and is actually
num_cores_per_node.

Rename num_cores_per_pkg to num_cores_per_node.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 124d9fbbaab6..db690ef4bfb0 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -206,11 +206,11 @@ struct pkg_data {
 #define EVEN_COUNTERS thread_even, core_even, package_even
 
 #define GET_THREAD(thread_base, thread_no, core_no, pkg_no) \
-	(thread_base + (pkg_no) * topo.num_cores_per_pkg * \
+	(thread_base + (pkg_no) * topo.num_cores_per_node * \
 		topo.num_threads_per_core + \
 		(core_no) * topo.num_threads_per_core + (thread_no))
 #define GET_CORE(core_base, core_no, pkg_no) \
-	(core_base + (pkg_no) * topo.num_cores_per_pkg + (core_no))
+	(core_base + (pkg_no) * topo.num_cores_per_node + (core_no))
 #define GET_PKG(pkg_base, pkg_no) (pkg_base + pkg_no)
 
 enum counter_scope {SCOPE_CPU, SCOPE_CORE, SCOPE_PACKAGE};
@@ -263,7 +263,7 @@ struct topo_params {
 	int max_cpu_num;
 	int max_node_num;
 	int num_nodes_per_pkg;
-	int num_cores_per_pkg;
+	int num_cores_per_node;
 	int num_threads_per_core;
 } topo;
 
@@ -289,7 +289,7 @@ int for_all_cpus(int (func)(struct thread_data *, struct core_data *, struct pkg
 	int retval, pkg_no, core_no, thread_no;
 
 	for (pkg_no = 0; pkg_no < topo.num_packages; ++pkg_no) {
-		for (core_no = 0; core_no < topo.num_cores_per_pkg; ++core_no) {
+		for (core_no = 0; core_no < topo.num_cores_per_node; ++core_no) {
 			for (thread_no = 0; thread_no <
 				topo.num_threads_per_core; ++thread_no) {
 				struct thread_data *t;
@@ -2400,7 +2400,7 @@ int for_all_cpus_2(int (func)(struct thread_data *, struct core_data *,
 	int retval, pkg_no, core_no, thread_no;
 
 	for (pkg_no = 0; pkg_no < topo.num_packages; ++pkg_no) {
-		for (core_no = 0; core_no < topo.num_cores_per_pkg; ++core_no) {
+		for (core_no = 0; core_no < topo.num_cores_per_node; ++core_no) {
 			for (thread_no = 0; thread_no <
 				topo.num_threads_per_core; ++thread_no) {
 				struct thread_data *t, *t2;
@@ -4468,11 +4468,11 @@ void topology_probe()
 				cpus[i].physical_core_id);
 	}
 
-	topo.num_cores_per_pkg = max_core_id + 1;
+	topo.num_cores_per_node = max_core_id + 1;
 	if (debug > 1)
 		fprintf(outf, "max_core_id %d, sizing for %d cores per package\n",
-			max_core_id, topo.num_cores_per_pkg);
-	if (!summary_only && topo.num_cores_per_pkg > 1)
+			max_core_id, topo.num_cores_per_node);
+	if (!summary_only && topo.num_cores_per_node > 1)
 		BIC_PRESENT(BIC_Core);
 
 	topo.num_packages = max_package_id + 1;
@@ -4496,21 +4496,21 @@ void topology_probe()
 {
 	int i;
 
-	*t = calloc(topo.num_threads_per_core * topo.num_cores_per_pkg *
+	*t = calloc(topo.num_threads_per_core * topo.num_cores_per_node *
 		topo.num_packages, sizeof(struct thread_data));
 	if (*t == NULL)
 		goto error;
 
 	for (i = 0; i < topo.num_threads_per_core *
-		topo.num_cores_per_pkg * topo.num_packages; i++)
+		topo.num_cores_per_node * topo.num_packages; i++)
 		(*t)[i].cpu_id = -1;
 
-	*c = calloc(topo.num_cores_per_pkg * topo.num_packages,
+	*c = calloc(topo.num_cores_per_node * topo.num_packages,
 		sizeof(struct core_data));
 	if (*c == NULL)
 		goto error;
 
-	for (i = 0; i < topo.num_cores_per_pkg * topo.num_packages; i++)
+	for (i = 0; i < topo.num_cores_per_node * topo.num_packages; i++)
 		(*c)[i].core_id = -1;
 
 	*p = calloc(topo.num_packages, sizeof(struct pkg_data));
-- 
1.8.5.5

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 6/7] turbostat: remove num_ from cpu_topology struct
  2017-07-28 12:06 [PATCH 0/7 RESEND] turbostat: Fix AMD output by making turbostat aware of nodes Prarit Bhargava
                   ` (5 preceding siblings ...)
  2017-07-28 12:06 ` [PATCH 5/7] turbostat: rename num_cores_per_pkg to num_cores_per_node Prarit Bhargava
@ 2017-07-28 12:06 ` Prarit Bhargava
  2017-07-28 12:06 ` [PATCH 7/7] turbostat: add node information into turbostat calculations Prarit Bhargava
  2017-08-05  7:06 ` [PATCH 0/7 RESEND] turbostat: Fix AMD output by making turbostat aware of nodes Len Brown
  8 siblings, 0 replies; 12+ messages in thread
From: Prarit Bhargava @ 2017-07-28 12:06 UTC (permalink / raw)
  To: linux-pm; +Cc: lenb, Prarit Bhargava, Len Brown

Cleanup, remove num_ from num_nodes_per_pkg, num_cores_per_node, and
num_threads_per_node.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.c | 46 +++++++++++++++++------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index db690ef4bfb0..537f879ea90a 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -206,11 +206,11 @@ struct pkg_data {
 #define EVEN_COUNTERS thread_even, core_even, package_even
 
 #define GET_THREAD(thread_base, thread_no, core_no, pkg_no) \
-	(thread_base + (pkg_no) * topo.num_cores_per_node * \
-		topo.num_threads_per_core + \
-		(core_no) * topo.num_threads_per_core + (thread_no))
+	(thread_base + (pkg_no) * topo.cores_per_node * \
+		topo.threads_per_core + \
+		(core_no) * topo.threads_per_core + (thread_no))
 #define GET_CORE(core_base, core_no, pkg_no) \
-	(core_base + (pkg_no) * topo.num_cores_per_node + (core_no))
+	(core_base + (pkg_no) * topo.cores_per_node + (core_no))
 #define GET_PKG(pkg_base, pkg_no) (pkg_base + pkg_no)
 
 enum counter_scope {SCOPE_CPU, SCOPE_CORE, SCOPE_PACKAGE};
@@ -262,9 +262,9 @@ struct topo_params {
 	int num_cores;
 	int max_cpu_num;
 	int max_node_num;
-	int num_nodes_per_pkg;
-	int num_cores_per_node;
-	int num_threads_per_core;
+	int nodes_per_pkg;
+	int cores_per_node;
+	int threads_per_core;
 } topo;
 
 struct timeval tv_even, tv_odd, tv_delta;
@@ -289,9 +289,9 @@ int for_all_cpus(int (func)(struct thread_data *, struct core_data *, struct pkg
 	int retval, pkg_no, core_no, thread_no;
 
 	for (pkg_no = 0; pkg_no < topo.num_packages; ++pkg_no) {
-		for (core_no = 0; core_no < topo.num_cores_per_node; ++core_no) {
+		for (core_no = 0; core_no < topo.cores_per_node; ++core_no) {
 			for (thread_no = 0; thread_no <
-				topo.num_threads_per_core; ++thread_no) {
+				topo.threads_per_core; ++thread_no) {
 				struct thread_data *t;
 				struct core_data *c;
 				struct pkg_data *p;
@@ -2317,8 +2317,8 @@ void set_node_data(void)
 	}
 
 	for (pkg = 0; pkg < topo.num_packages; pkg++)
-		if (pni[pkg].count > topo.num_nodes_per_pkg)
-			topo.num_nodes_per_pkg = pni[0].count;
+		if (pni[pkg].count > topo.nodes_per_pkg)
+			topo.nodes_per_pkg = pni[0].count;
 
 	for (cpu = 0; cpu < topo.num_cpus; cpu++) {
 		pkg = cpus[cpu].physical_package_id;
@@ -2400,9 +2400,9 @@ int for_all_cpus_2(int (func)(struct thread_data *, struct core_data *,
 	int retval, pkg_no, core_no, thread_no;
 
 	for (pkg_no = 0; pkg_no < topo.num_packages; ++pkg_no) {
-		for (core_no = 0; core_no < topo.num_cores_per_node; ++core_no) {
+		for (core_no = 0; core_no < topo.cores_per_node; ++core_no) {
 			for (thread_no = 0; thread_no <
-				topo.num_threads_per_core; ++thread_no) {
+				topo.threads_per_core; ++thread_no) {
 				struct thread_data *t, *t2;
 				struct core_data *c, *c2;
 				struct pkg_data *p, *p2;
@@ -4468,11 +4468,11 @@ void topology_probe()
 				cpus[i].physical_core_id);
 	}
 
-	topo.num_cores_per_node = max_core_id + 1;
+	topo.cores_per_node = max_core_id + 1;
 	if (debug > 1)
 		fprintf(outf, "max_core_id %d, sizing for %d cores per package\n",
-			max_core_id, topo.num_cores_per_node);
-	if (!summary_only && topo.num_cores_per_node > 1)
+			max_core_id, topo.cores_per_node);
+	if (!summary_only && topo.cores_per_node > 1)
 		BIC_PRESENT(BIC_Core);
 
 	topo.num_packages = max_package_id + 1;
@@ -4484,9 +4484,9 @@ void topology_probe()
 
 	set_node_data();
 	if (debug > 1)
-		fprintf(outf, "num_nodes_per_pkg %d\n", topo.num_nodes_per_pkg);
+		fprintf(outf, "nodes_per_pkg %d\n", topo.nodes_per_pkg);
 
-	topo.num_threads_per_core = max_siblings;
+	topo.threads_per_core = max_siblings;
 	if (debug > 1)
 		fprintf(outf, "max_siblings %d\n", max_siblings);
 }
@@ -4496,21 +4496,21 @@ void topology_probe()
 {
 	int i;
 
-	*t = calloc(topo.num_threads_per_core * topo.num_cores_per_node *
+	*t = calloc(topo.threads_per_core * topo.cores_per_node *
 		topo.num_packages, sizeof(struct thread_data));
 	if (*t == NULL)
 		goto error;
 
-	for (i = 0; i < topo.num_threads_per_core *
-		topo.num_cores_per_node * topo.num_packages; i++)
+	for (i = 0; i < topo.threads_per_core *
+		topo.cores_per_node * topo.num_packages; i++)
 		(*t)[i].cpu_id = -1;
 
-	*c = calloc(topo.num_cores_per_node * topo.num_packages,
+	*c = calloc(topo.cores_per_node * topo.num_packages,
 		sizeof(struct core_data));
 	if (*c == NULL)
 		goto error;
 
-	for (i = 0; i < topo.num_cores_per_node * topo.num_packages; i++)
+	for (i = 0; i < topo.cores_per_node * topo.num_packages; i++)
 		(*c)[i].core_id = -1;
 
 	*p = calloc(topo.num_packages, sizeof(struct pkg_data));
-- 
1.8.5.5

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 7/7] turbostat: add node information into turbostat calculations
  2017-07-28 12:06 [PATCH 0/7 RESEND] turbostat: Fix AMD output by making turbostat aware of nodes Prarit Bhargava
                   ` (6 preceding siblings ...)
  2017-07-28 12:06 ` [PATCH 6/7] turbostat: remove num_ from cpu_topology struct Prarit Bhargava
@ 2017-07-28 12:06 ` Prarit Bhargava
  2017-08-05  7:06 ` [PATCH 0/7 RESEND] turbostat: Fix AMD output by making turbostat aware of nodes Len Brown
  8 siblings, 0 replies; 12+ messages in thread
From: Prarit Bhargava @ 2017-07-28 12:06 UTC (permalink / raw)
  To: linux-pm; +Cc: lenb, Prarit Bhargava, Len Brown

The previous patches have added node information to turbostat, but the
counters code does not take it into account.

Add node information from cpu_topology calculations to turbostat
counters.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.c | 143 ++++++++++++++++++++--------------
 1 file changed, 85 insertions(+), 58 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 537f879ea90a..22c845a2ea00 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -205,12 +205,21 @@ struct pkg_data {
 #define ODD_COUNTERS thread_odd, core_odd, package_odd
 #define EVEN_COUNTERS thread_even, core_even, package_even
 
-#define GET_THREAD(thread_base, thread_no, core_no, pkg_no) \
-	(thread_base + (pkg_no) * topo.cores_per_node * \
-		topo.threads_per_core + \
-		(core_no) * topo.threads_per_core + (thread_no))
-#define GET_CORE(core_base, core_no, pkg_no) \
-	(core_base + (pkg_no) * topo.cores_per_node + (core_no))
+#define GET_THREAD(thread_base, thread_no, core_no, node_no, pkg_no)	      \
+	((thread_base) + 						      \
+	 ((pkg_no) * 							      \
+	  topo.nodes_per_pkg * topo.cores_per_node * topo.threads_per_core) + \
+	 ((node_no) * topo.cores_per_node * topo.threads_per_core) + 	      \
+	 ((core_no) * topo.threads_per_core) + 				      \
+	 (thread_no))
+
+#define GET_CORE(core_base, core_no, node_no, pkg_no) 			\
+	((core_base) +							\
+	 ((pkg_no) *  topo.nodes_per_pkg * topo.cores_per_node) +	\
+	 ((node_no) * topo.cores_per_node) +				\
+	 (core_no));
+
+
 #define GET_PKG(pkg_base, pkg_no) (pkg_base + pkg_no)
 
 enum counter_scope {SCOPE_CPU, SCOPE_CORE, SCOPE_PACKAGE};
@@ -286,27 +295,33 @@ int cpu_is_not_present(int cpu)
 int for_all_cpus(int (func)(struct thread_data *, struct core_data *, struct pkg_data *),
 	struct thread_data *thread_base, struct core_data *core_base, struct pkg_data *pkg_base)
 {
-	int retval, pkg_no, core_no, thread_no;
+	int retval, pkg_no, core_no, thread_no, node_no;
 
 	for (pkg_no = 0; pkg_no < topo.num_packages; ++pkg_no) {
 		for (core_no = 0; core_no < topo.cores_per_node; ++core_no) {
-			for (thread_no = 0; thread_no <
-				topo.threads_per_core; ++thread_no) {
-				struct thread_data *t;
-				struct core_data *c;
-				struct pkg_data *p;
-
-				t = GET_THREAD(thread_base, thread_no, core_no, pkg_no);
-
-				if (cpu_is_not_present(t->cpu_id))
-					continue;
-
-				c = GET_CORE(core_base, core_no, pkg_no);
-				p = GET_PKG(pkg_base, pkg_no);
-
-				retval = func(t, c, p);
-				if (retval)
-					return retval;
+			for (node_no = 0; node_no < topo.nodes_per_pkg;
+			     node_no++) {
+				for (thread_no = 0; thread_no <
+					topo.threads_per_core; ++thread_no) {
+					struct thread_data *t;
+					struct core_data *c;
+					struct pkg_data *p;
+
+					t = GET_THREAD(thread_base, thread_no,
+						       core_no, node_no,
+						       pkg_no);
+
+					if (cpu_is_not_present(t->cpu_id))
+						continue;
+
+					c = GET_CORE(core_base, core_no,
+						     node_no, pkg_no);
+					p = GET_PKG(pkg_base, pkg_no);
+
+					retval = func(t, c, p);
+					if (retval)
+						return retval;
+				}
 			}
 		}
 	}
@@ -2397,32 +2412,42 @@ int for_all_cpus_2(int (func)(struct thread_data *, struct core_data *,
 	struct thread_data *thread_base2, struct core_data *core_base2,
 	struct pkg_data *pkg_base2)
 {
-	int retval, pkg_no, core_no, thread_no;
+	int retval, pkg_no, node_no, core_no, thread_no;
 
 	for (pkg_no = 0; pkg_no < topo.num_packages; ++pkg_no) {
-		for (core_no = 0; core_no < topo.cores_per_node; ++core_no) {
-			for (thread_no = 0; thread_no <
-				topo.threads_per_core; ++thread_no) {
-				struct thread_data *t, *t2;
-				struct core_data *c, *c2;
-				struct pkg_data *p, *p2;
-
-				t = GET_THREAD(thread_base, thread_no, core_no, pkg_no);
-
-				if (cpu_is_not_present(t->cpu_id))
-					continue;
-
-				t2 = GET_THREAD(thread_base2, thread_no, core_no, pkg_no);
-
-				c = GET_CORE(core_base, core_no, pkg_no);
-				c2 = GET_CORE(core_base2, core_no, pkg_no);
-
-				p = GET_PKG(pkg_base, pkg_no);
-				p2 = GET_PKG(pkg_base2, pkg_no);
-
-				retval = func(t, c, p, t2, c2, p2);
-				if (retval)
-					return retval;
+		for (node_no = 0; node_no < topo.nodes_per_pkg; ++node_no) {
+			for (core_no = 0; core_no < topo.cores_per_node;
+			     ++core_no) {
+				for (thread_no = 0; thread_no <
+					topo.threads_per_core; ++thread_no) {
+					struct thread_data *t, *t2;
+					struct core_data *c, *c2;
+					struct pkg_data *p, *p2;
+
+					t = GET_THREAD(thread_base, thread_no,
+						       core_no, node_no,
+						       pkg_no);
+
+					if (cpu_is_not_present(t->cpu_id))
+						continue;
+
+					t2 = GET_THREAD(thread_base2, thread_no,
+							core_no, node_no,
+							pkg_no);
+
+					c = GET_CORE(core_base, core_no,
+						     node_no, pkg_no);
+					c2 = GET_CORE(core_base2, core_no,
+						      node_no,
+						      pkg_no);
+
+					p = GET_PKG(pkg_base, pkg_no);
+					p2 = GET_PKG(pkg_base2, pkg_no);
+
+					retval = func(t, c, p, t2, c2, p2);
+					if (retval)
+						return retval;
+				}
 			}
 		}
 	}
@@ -4492,25 +4517,26 @@ void topology_probe()
 }
 
 void
-allocate_counters(struct thread_data **t, struct core_data **c, struct pkg_data **p)
+allocate_counters(struct thread_data **t, struct core_data **c,
+		  struct pkg_data **p)
 {
 	int i;
+	int num_cores = topo.cores_per_node * topo.nodes_per_pkg *
+			topo.num_packages;
+	int num_threads = topo.threads_per_core * num_cores;
 
-	*t = calloc(topo.threads_per_core * topo.cores_per_node *
-		topo.num_packages, sizeof(struct thread_data));
+	*t = calloc(num_threads, sizeof(struct thread_data));
 	if (*t == NULL)
 		goto error;
 
-	for (i = 0; i < topo.threads_per_core *
-		topo.cores_per_node * topo.num_packages; i++)
+	for (i = 0; i < num_threads; i++)
 		(*t)[i].cpu_id = -1;
 
-	*c = calloc(topo.cores_per_node * topo.num_packages,
-		sizeof(struct core_data));
+	*c = calloc(num_cores, sizeof(struct core_data));
 	if (*c == NULL)
 		goto error;
 
-	for (i = 0; i < topo.cores_per_node * topo.num_packages; i++)
+	for (i = 0; i < num_cores; i++)
 		(*c)[i].core_id = -1;
 
 	*p = calloc(topo.num_packages, sizeof(struct pkg_data));
@@ -4533,14 +4559,15 @@ void init_counter(struct thread_data *thread_base, struct core_data *core_base,
 	struct pkg_data *pkg_base, int cpu_id)
 {
 	int pkg_id = cpus[cpu_id].physical_package_id;
+	int node_id = cpus[cpu_id].logical_node_id;
 	int core_id = cpus[cpu_id].physical_core_id;
 	int thread_id = cpus[cpu_id].thread_id;
 	struct thread_data *t;
 	struct core_data *c;
 	struct pkg_data *p;
 
-	t = GET_THREAD(thread_base, thread_id, core_id, pkg_id);
-	c = GET_CORE(core_base, core_id, pkg_id);
+	t = GET_THREAD(thread_base, thread_id, core_id, node_id, pkg_id);
+	c = GET_CORE(core_base, core_id, node_id, pkg_id);
 	p = GET_PKG(pkg_base, pkg_id);
 
 	t->cpu_id = cpu_id;
-- 
1.8.5.5

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/7 RESEND] turbostat: Fix AMD output by making turbostat aware of nodes
  2017-07-28 12:06 [PATCH 0/7 RESEND] turbostat: Fix AMD output by making turbostat aware of nodes Prarit Bhargava
                   ` (7 preceding siblings ...)
  2017-07-28 12:06 ` [PATCH 7/7] turbostat: add node information into turbostat calculations Prarit Bhargava
@ 2017-08-05  7:06 ` Len Brown
  2017-08-17 13:29   ` Prarit Bhargava
  8 siblings, 1 reply; 12+ messages in thread
From: Len Brown @ 2017-08-05  7:06 UTC (permalink / raw)
  To: Prarit Bhargava; +Cc: Linux PM list, Len Brown

Hi Prarit,
Thanks for re-sending to linux-pm -- much easier to track patches there:-)

three suggestions.

the package,core,cpu columns in turbostat are helpful for
understanding topology.
Now that you've taught turbostat about nodes, I think it would make sense
to expose that knowledge in an additional "Node" column.

In testing this patch series, I found that it now visits HT siblings
in reverse order.
I would prefer if the patch series did not change the displayed CPU
order on existing systems,
since there are existing data-sets and it would be good to be able to
still be able to compare them...
(also, I think it reads better to continue visit the lower numbered HT
sibling before the higher numbered, rather than the reverse)

please re-send after the patch series is checkpatch.pl clean

thanks,
Len Brown, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/7 RESEND] turbostat: Fix AMD output by making turbostat aware of nodes
  2017-08-05  7:06 ` [PATCH 0/7 RESEND] turbostat: Fix AMD output by making turbostat aware of nodes Len Brown
@ 2017-08-17 13:29   ` Prarit Bhargava
  0 siblings, 0 replies; 12+ messages in thread
From: Prarit Bhargava @ 2017-08-17 13:29 UTC (permalink / raw)
  To: Len Brown; +Cc: Linux PM list, Len Brown



On 08/05/2017 03:06 AM, Len Brown wrote:
> Hi Prarit,
> Thanks for re-sending to linux-pm -- much easier to track patches there:-)
> 
> three suggestions.
> 
> the package,core,cpu columns in turbostat are helpful for
> understanding topology.
> Now that you've taught turbostat about nodes, I think it would make sense
> to expose that knowledge in an additional "Node" column.

Will do.

> 
> In testing this patch series, I found that it now visits HT siblings
> in reverse order.

Interesting, I had not noticed that.  I'll fix that.

P.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 0/7 RESEND] turbostat: Fix AMD output by making turbostat aware of nodes
@ 2017-07-25 12:24 Prarit Bhargava
  0 siblings, 0 replies; 12+ messages in thread
From: Prarit Bhargava @ 2017-07-25 12:24 UTC (permalink / raw)
  To: linux-kernel; +Cc: lenb, Prarit Bhargava, Len Brown

AMD family processors do not show all cores in the output of turbostat.  This
occurs because AMD has multiple nodes per socket and enumerates cores
within each node from 0.  For example, socket 0 may have two nodes (0 and 1)
and those nodes both have cores enumerated from 0 through 7.  turbostat cannot
handle this configuration, and as a result only shows 1/2 the cores in its
output.

This patchset makes turbostate aware of nodes.  It has been tested on
various AMD and Intel systems and no issues have been found.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Len Brown <len.brown@intel.com>

Prarit Bhargava (7):
  turbostat: set max_num_cpus equal to the cpumask length
  turbostat: Fix node and siblings lookup data
  turbostat: Calculate additional node information for a package
  turbostat: track thread ID in cpu_topology
  turbostat: rename num_cores_per_pkg to num_cores_per_node
  turbostat: remove num_ from cpu_topology struct
  turbostat: add node information into turbostat calculations

 tools/power/x86/turbostat/turbostat.c | 419 +++++++++++++++++++++-------------
 1 file changed, 259 insertions(+), 160 deletions(-)

-- 
1.8.5.5

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2017-08-17 13:29 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-28 12:06 [PATCH 0/7 RESEND] turbostat: Fix AMD output by making turbostat aware of nodes Prarit Bhargava
2017-07-28 12:06 ` [PATCH v2] turbostat: Running on virtual machine is not supported Prarit Bhargava
2017-07-28 12:06 ` [PATCH 1/7] turbostat: set max_num_cpus equal to the cpumask length Prarit Bhargava
2017-07-28 12:06 ` [PATCH 2/7] turbostat: Fix node and siblings lookup data Prarit Bhargava
2017-07-28 12:06 ` [PATCH 3/7] turbostat: Calculate additional node information for a package Prarit Bhargava
2017-07-28 12:06 ` [PATCH 4/7] turbostat: track thread ID in cpu_topology Prarit Bhargava
2017-07-28 12:06 ` [PATCH 5/7] turbostat: rename num_cores_per_pkg to num_cores_per_node Prarit Bhargava
2017-07-28 12:06 ` [PATCH 6/7] turbostat: remove num_ from cpu_topology struct Prarit Bhargava
2017-07-28 12:06 ` [PATCH 7/7] turbostat: add node information into turbostat calculations Prarit Bhargava
2017-08-05  7:06 ` [PATCH 0/7 RESEND] turbostat: Fix AMD output by making turbostat aware of nodes Len Brown
2017-08-17 13:29   ` Prarit Bhargava
  -- strict thread matches above, loose matches on Subject: below --
2017-07-25 12:24 Prarit Bhargava

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.