linux-kernel.vger.kernel.org archive mirror
* [PATCH v1 0/3] perf record: adapt NUMA awareness to machines with #CPUs > 1K
@ 2019-11-20  9:33 Alexey Budankov
  2019-11-20  9:36 ` [PATCH v1 1/3] tools bitmap: implement bitmap_equal() operation at bitmap API Alexey Budankov
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Alexey Budankov @ 2019-11-20  9:33 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, Andi Kleen, linux-kernel


The current glibc implementation of the cpu_set_t type limits the CPU
mask to 1024 CPUs. On machines with more CPUs, this limit confines the
NUMA awareness of the Perf tool in record mode, enabled through the
--affinity option, to the first 1024 CPUs.

This patch set lets the Perf tool overcome the 1024-CPU limit by
using a dedicated struct mmap_cpu_mask type and applying the tools bitmap
API operations to manipulate the affinity masks of the tool's thread and
the mmaped data buffers.

The tools bitmap API has been extended with a bitmap_equal() operation
whose implementation is derived from the kernel's.

---
Alexey Budankov (3):
  tools bitmap: extend bitmap API with bitmap_equal()
  perf mmap: declare type for cpu mask of arbitrary length
  perf record: adapt affinity to machines with #CPUs > 1K

 tools/include/linux/bitmap.h | 21 +++++++++++++++++++++
 tools/lib/bitmap.c           | 15 +++++++++++++++
 tools/perf/builtin-record.c  | 28 ++++++++++++++++++++++------
 tools/perf/util/mmap.c       | 28 ++++++++++++++++++++++------
 tools/perf/util/mmap.h       | 11 ++++++++++-
 5 files changed, 90 insertions(+), 13 deletions(-)

---
Testing:

  $ tools/perf/perf record -v --affinity=cpu -- ls
  thread mask[8]: empty
  Using CPUID GenuineIntel-6-5E-3
  ...
  mmap size 528384B
  0x7f95f8f85010: mmap mask[8]: 0
  0x7f95f8f950d8: mmap mask[8]: 1
  0x7f95f8fa51a0: mmap mask[8]: 2
  0x7f95f8fb5268: mmap mask[8]: 3
  0x7f95f8fc5330: mmap mask[8]: 4
  0x7f95f8fd53f8: mmap mask[8]: 5
  0x7f95f8fe54c0: mmap mask[8]: 6
  0x7f95f8ff5588: mmap mask[8]: 7
  ...
  thread mask[8]: 0
  thread mask[8]: 1
  thread mask[8]: 2
  thread mask[8]: 3
  arch			      copy     Documentation  init     kernel	 MAINTAINERS	  modules.builtin.modinfo  perf.data	  scripts   System.map	vmlinux
  block			      COPYING  drivers	      ipc      lbuild	 Makefile	  modules.order		   perf.data.old  security  tools	vmlinux.o
  certs			      CREDITS  fs	      Kbuild   lib	 mm		  Module.symvers	   README	  sound     usr
  config-5.2.7-100.fc29.x86_64  crypto   include	      Kconfig  LICENSES  modules.builtin  net			   samples	  stdio     virt
  thread mask[8]: 4
  thread mask[8]: 5
  thread mask[8]: 6
  thread mask[8]: 7
  thread mask[8]: 0
  thread mask[8]: 1
  thread mask[8]: 2
  thread mask[8]: 3
  thread mask[8]: 4
  thread mask[8]: 5
  thread mask[8]: 6
  thread mask[8]: 7
  [ perf record: Woken up 0 times to write data ]
  thread mask[8]: 0
  thread mask[8]: 1
  thread mask[8]: 2
  thread mask[8]: 3
  thread mask[8]: 4
  thread mask[8]: 5
  thread mask[8]: 6
  thread mask[8]: 7
  ...
  [ perf record: Captured and wrote 0.014 MB perf.data (11 samples) ]

-- 
2.20.1
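
The 1024-CPU limit described in the cover letter can be observed directly
from user space. The following is a minimal sketch, assuming a glibc-style
<sched.h>; the sketch_ function names are hypothetical and are not part of
perf or of this series. It compares the capacity of the fixed-size
cpu_set_t with the storage that a dynamically sized set would need.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for cpu_set_t and the CPU_* macros in <sched.h> */
#endif
#include <assert.h>
#include <limits.h>
#include <sched.h>
#include <stddef.h>

/* Number of CPUs that the fixed-size cpu_set_t type can describe. */
static size_t sketch_fixed_set_capacity(void)
{
	return sizeof(cpu_set_t) * CHAR_BIT;
}

/* Bytes needed for a dynamically sized CPU set that covers ncpus CPUs. */
static size_t sketch_dynamic_set_bytes(int ncpus)
{
	assert(ncpus > 0);
	return CPU_ALLOC_SIZE(ncpus);
}
```

A mask for 4096 CPUs needs more bytes than sizeof(cpu_set_t) provides,
which is why the series moves to a separately allocated bitmap.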


* [PATCH v1 1/3] tools bitmap: implement bitmap_equal() operation at bitmap API
  2019-11-20  9:33 [PATCH v1 0/3] perf record: adapt NUMA awareness to machines with #CPUs > 1K Alexey Budankov
@ 2019-11-20  9:36 ` Alexey Budankov
  2019-11-20  9:37 ` [PATCH v1 2/3] perf mmap: declare type for cpu mask of arbitrary length Alexey Budankov
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Alexey Budankov @ 2019-11-20  9:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, Andi Kleen, linux-kernel


Extend the tools bitmap API with a bitmap_equal() implementation
derived from the kernel's.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/include/linux/bitmap.h | 21 +++++++++++++++++++++
 tools/lib/bitmap.c           | 15 +++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/tools/include/linux/bitmap.h b/tools/include/linux/bitmap.h
index 05dca5c203f3..362f22fd3c00 100644
--- a/tools/include/linux/bitmap.h
+++ b/tools/include/linux/bitmap.h
@@ -15,6 +15,8 @@ void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
 		 const unsigned long *bitmap2, int bits);
 int __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
 		 const unsigned long *bitmap2, unsigned int bits);
+int __bitmap_equal(const unsigned long *bitmap1,
+		   const unsigned long *bitmap2, unsigned int bits);
 void bitmap_clear(unsigned long *map, unsigned int start, int len);
 
 #define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
@@ -148,4 +150,23 @@ static inline int bitmap_and(unsigned long *dst, const unsigned long *src1,
 	return __bitmap_and(dst, src1, src2, nbits);
 }
 
+#ifdef __LITTLE_ENDIAN
+#define BITMAP_MEM_ALIGNMENT 8
+#else
+#define BITMAP_MEM_ALIGNMENT (8 * sizeof(unsigned long))
+#endif
+#define BITMAP_MEM_MASK (BITMAP_MEM_ALIGNMENT - 1)
+#define IS_ALIGNED(x, a) (((x) & ((typeof(x))(a) - 1)) == 0)
+
+static inline int bitmap_equal(const unsigned long *src1,
+			const unsigned long *src2, unsigned int nbits)
+{
+	if (small_const_nbits(nbits))
+		return !((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits));
+	if (__builtin_constant_p(nbits & BITMAP_MEM_MASK) &&
+	    IS_ALIGNED(nbits, BITMAP_MEM_ALIGNMENT))
+		return !memcmp(src1, src2, nbits / 8);
+	return __bitmap_equal(src1, src2, nbits);
+}
+
 #endif /* _PERF_BITOPS_H */
diff --git a/tools/lib/bitmap.c b/tools/lib/bitmap.c
index 38494782be06..5043747ef6c5 100644
--- a/tools/lib/bitmap.c
+++ b/tools/lib/bitmap.c
@@ -71,3 +71,18 @@ int __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
 			   BITMAP_LAST_WORD_MASK(bits));
 	return result != 0;
 }
+
+int __bitmap_equal(const unsigned long *bitmap1,
+		const unsigned long *bitmap2, unsigned int bits)
+{
+	unsigned int k, lim = bits/BITS_PER_LONG;
+	for (k = 0; k < lim; ++k)
+		if (bitmap1[k] != bitmap2[k])
+			return 0;
+
+	if (bits % BITS_PER_LONG)
+		if ((bitmap1[k] ^ bitmap2[k]) & BITMAP_LAST_WORD_MASK(bits))
+			return 0;
+
+	return 1;
+}
-- 
2.20.1
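
For readers following the out-of-line path of the patch above, the
comparison can be sketched standalone as below. This is an illustration
only, not the tools/lib code: the sketch_ names and the
SKETCH_BITS_PER_LONG macro are made up here so that the snippet builds
without the tools headers. Full words are compared directly, and only
the valid low bits of a partial last word take part in the comparison.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Mask that selects the valid low bits in the last word of an nbits-wide bitmap. */
static unsigned long sketch_last_word_mask(unsigned int nbits)
{
	unsigned int rem = nbits % SKETCH_BITS_PER_LONG;

	return rem ? (1UL << rem) - 1 : ~0UL;
}

/* Return 1 if the first nbits bits of a and b are equal, 0 otherwise. */
static int sketch_bitmap_equal(const unsigned long *a, const unsigned long *b,
			       unsigned int nbits)
{
	unsigned int k, lim = nbits / SKETCH_BITS_PER_LONG;

	assert(a && b);

	/* Compare all complete words. */
	for (k = 0; k < lim; k++)
		if (a[k] != b[k])
			return 0;

	/* Compare the partial tail word, ignoring bits beyond nbits. */
	if (nbits % SKETCH_BITS_PER_LONG)
		if ((a[k] ^ b[k]) & sketch_last_word_mask(nbits))
			return 0;

	return 1;
}
```

Two bitmaps that differ only in bits at or above nbits compare as equal,
which matters when the tail of the last word was never initialized to a
common value.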



* [PATCH v1 2/3] perf mmap: declare type for cpu mask of arbitrary length
  2019-11-20  9:33 [PATCH v1 0/3] perf record: adapt NUMA awareness to machines with #CPUs > 1K Alexey Budankov
  2019-11-20  9:36 ` [PATCH v1 1/3] tools bitmap: implement bitmap_equal() operation at bitmap API Alexey Budankov
@ 2019-11-20  9:37 ` Alexey Budankov
  2019-11-22 13:20   ` Jiri Olsa
  2019-11-20  9:38 ` [PATCH v1 3/3] perf record: adapt affinity to machines with #CPUs > 1K Alexey Budankov
  2019-11-22 13:26 ` [PATCH v1 0/3] perf record: adapt NUMA awareness " Jiri Olsa
  3 siblings, 1 reply; 11+ messages in thread
From: Alexey Budankov @ 2019-11-20  9:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, Andi Kleen, linux-kernel


Declare a dedicated struct mmap_cpu_mask type for CPU masks of
arbitrary length. The mask is available through the bits pointer, and
the mask length is kept in the nbits field. The mmap_cpu_mask_bytes()
macro returns the mask storage size in bytes.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/util/mmap.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
index bee4e83f7109..a218a0eb1466 100644
--- a/tools/perf/util/mmap.h
+++ b/tools/perf/util/mmap.h
@@ -15,6 +15,15 @@
 #include "event.h"
 
 struct aiocb;
+
+struct mmap_cpu_mask {
+	unsigned long *bits;
+	size_t nbits;
+};
+
+#define mmap_cpu_mask_bytes(m) \
+	(BITS_TO_LONGS(((struct mmap_cpu_mask *)m)->nbits) * sizeof(unsigned long))
+
 /**
  * struct mmap - perf's ring buffer mmap details
  *
-- 
2.20.1
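
How a mask of this shape might be allocated and used can be sketched as
follows. This is a standalone illustration under assumptions: the
struct sketch_cpu_mask type and all sketch_ helpers are hypothetical
stand-ins, and calloc() takes the place of the tools bitmap_alloc() that
the series uses.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_BITS_TO_LONGS(n) \
	(((n) + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Hypothetical stand-in for the mask type: storage pointer plus length in bits. */
struct sketch_cpu_mask {
	unsigned long *bits;
	size_t nbits;
};

/* Allocate a zeroed mask of nbits bits; returns 0 on success, -1 on failure. */
static int sketch_mask_init(struct sketch_cpu_mask *m, size_t nbits)
{
	m->bits = calloc(SKETCH_BITS_TO_LONGS(nbits), sizeof(unsigned long));
	if (!m->bits)
		return -1;
	m->nbits = nbits;
	return 0;
}

/* Mark one CPU in the mask. */
static void sketch_mask_set(struct sketch_cpu_mask *m, size_t cpu)
{
	assert(cpu < m->nbits);
	m->bits[cpu / SKETCH_BITS_PER_LONG] |= 1UL << (cpu % SKETCH_BITS_PER_LONG);
}

/* Return 1 if the CPU is marked in the mask, 0 otherwise. */
static int sketch_mask_test(const struct sketch_cpu_mask *m, size_t cpu)
{
	return (int)((m->bits[cpu / SKETCH_BITS_PER_LONG] >>
		      (cpu % SKETCH_BITS_PER_LONG)) & 1UL);
}

/* Allocate, set one CPU, read it back, free; returns 1 if the bit was seen. */
static int sketch_mask_roundtrip(size_t nbits, size_t cpu)
{
	struct sketch_cpu_mask m;
	int ok;

	if (cpu >= nbits || sketch_mask_init(&m, nbits) < 0)
		return 0;
	sketch_mask_set(&m, cpu);
	ok = sketch_mask_test(&m, cpu);
	free(m.bits);
	return ok;
}
```

Because the length travels with the pointer, a CPU index such as 2000
works the same way as index 0, with no compile-time bound on the mask.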


* [PATCH v1 3/3] perf record: adapt affinity to machines with #CPUs > 1K
  2019-11-20  9:33 [PATCH v1 0/3] perf record: adapt NUMA awareness to machines with #CPUs > 1K Alexey Budankov
  2019-11-20  9:36 ` [PATCH v1 1/3] tools bitmap: implement bitmap_equal() operation at bitmap API Alexey Budankov
  2019-11-20  9:37 ` [PATCH v1 2/3] perf mmap: declare type for cpu mask of arbitrary length Alexey Budankov
@ 2019-11-20  9:38 ` Alexey Budankov
  2019-11-22 13:22   ` Jiri Olsa
  2019-11-22 13:26 ` [PATCH v1 0/3] perf record: adapt NUMA awareness " Jiri Olsa
  3 siblings, 1 reply; 11+ messages in thread
From: Alexey Budankov @ 2019-11-20  9:38 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, Andi Kleen, linux-kernel


Use the struct mmap_cpu_mask type for the tool's thread and mmap data
buffers to overcome the current 1024-CPU mask size limit of the glibc
cpu_set_t type.

The tools bitmap API is used to manipulate objects of the struct
mmap_cpu_mask type.

Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/builtin-record.c | 28 ++++++++++++++++++++++------
 tools/perf/util/mmap.c      | 28 ++++++++++++++++++++++------
 tools/perf/util/mmap.h      |  2 +-
 3 files changed, 45 insertions(+), 13 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index b95c000c1ed9..238688516bc8 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -62,6 +62,7 @@
 #include <linux/string.h>
 #include <linux/time64.h>
 #include <linux/zalloc.h>
+#include <linux/bitmap.h>
 
 struct switch_output {
 	bool		 enabled;
@@ -93,7 +94,7 @@ struct record {
 	bool			timestamp_boundary;
 	struct switch_output	switch_output;
 	unsigned long long	samples;
-	cpu_set_t		affinity_mask;
+	struct mmap_cpu_mask	affinity_mask;
 	unsigned long		output_max_size;	/* = 0: unlimited */
 };
 
@@ -951,13 +952,21 @@ static struct perf_event_header finished_round_event = {
 	.type = PERF_RECORD_FINISHED_ROUND,
 };
 
+#define MASK_SIZE 1023
 static void record__adjust_affinity(struct record *rec, struct mmap *map)
 {
+	char mask[MASK_SIZE + 1] = {0};
+
 	if (rec->opts.affinity != PERF_AFFINITY_SYS &&
-	    !CPU_EQUAL(&rec->affinity_mask, &map->affinity_mask)) {
-		CPU_ZERO(&rec->affinity_mask);
-		CPU_OR(&rec->affinity_mask, &rec->affinity_mask, &map->affinity_mask);
-		sched_setaffinity(0, sizeof(rec->affinity_mask), &rec->affinity_mask);
+	    !bitmap_equal(rec->affinity_mask.bits, map->affinity_mask.bits,
+			  rec->affinity_mask.nbits)) {
+		bitmap_zero(rec->affinity_mask.bits, rec->affinity_mask.nbits);
+		bitmap_or(rec->affinity_mask.bits, rec->affinity_mask.bits,
+			  map->affinity_mask.bits, rec->affinity_mask.nbits);
+		sched_setaffinity(0, mmap_cpu_mask_bytes(&rec->affinity_mask),
+				  (cpu_set_t *)rec->affinity_mask.bits);
+		bitmap_scnprintf(rec->affinity_mask.bits, rec->affinity_mask.nbits, mask, MASK_SIZE);
+		pr_debug("thread mask[%ld]: %s\n", rec->affinity_mask.nbits, mask);
 	}
 }
 
@@ -2389,7 +2398,6 @@ int cmd_record(int argc, const char **argv)
 # undef REASON
 #endif
 
-	CPU_ZERO(&rec->affinity_mask);
 	rec->opts.affinity = PERF_AFFINITY_SYS;
 
 	rec->evlist = evlist__new();
@@ -2405,6 +2413,14 @@ int cmd_record(int argc, const char **argv)
 	if (quiet)
 		perf_quiet_option();
 
+	rec->affinity_mask.nbits = cpu__max_cpu();
+	rec->affinity_mask.bits = bitmap_alloc(rec->affinity_mask.nbits);
+	if (!rec->affinity_mask.bits) {
+		pr_err("Failed to allocate thread mask for %ld cpus\n", rec->affinity_mask.nbits);
+		return -ENOMEM;
+	}
+	pr_debug("thread mask[%ld]: empty\n", rec->affinity_mask.nbits);
+
 	/* Make system wide (-a) the default target. */
 	if (!argc && target__none(&rec->opts.target))
 		rec->opts.target.system_wide = true;
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 063d1b93c53d..070b1873cd45 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -23,6 +23,7 @@
 #include "mmap.h"
 #include "../perf.h"
 #include <internal/lib.h> /* page_size */
+#include <linux/bitmap.h>
 
 size_t mmap__mmap_len(struct mmap *map)
 {
@@ -215,7 +216,7 @@ void mmap__munmap(struct mmap *map)
 	auxtrace_mmap__munmap(&map->auxtrace_mmap);
 }
 
-static void build_node_mask(int node, cpu_set_t *mask)
+static void build_node_mask(int node, struct mmap_cpu_mask *mask)
 {
 	int c, cpu, nr_cpus;
 	const struct perf_cpu_map *cpu_map = NULL;
@@ -228,28 +229,43 @@ static void build_node_mask(int node, cpu_set_t *mask)
 	for (c = 0; c < nr_cpus; c++) {
 		cpu = cpu_map->map[c]; /* map c index to online cpu index */
 		if (cpu__get_node(cpu) == node)
-			CPU_SET(cpu, mask);
+			set_bit(cpu, mask->bits);
 	}
 }
 
-static void perf_mmap__setup_affinity_mask(struct mmap *map, struct mmap_params *mp)
+static int perf_mmap__setup_affinity_mask(struct mmap *map, struct mmap_params *mp)
 {
-	CPU_ZERO(&map->affinity_mask);
+	map->affinity_mask.nbits = cpu__max_cpu();
+	map->affinity_mask.bits = bitmap_alloc(map->affinity_mask.nbits);
+	if (!map->affinity_mask.bits)
+		return 1;
+
 	if (mp->affinity == PERF_AFFINITY_NODE && cpu__max_node() > 1)
 		build_node_mask(cpu__get_node(map->core.cpu), &map->affinity_mask);
 	else if (mp->affinity == PERF_AFFINITY_CPU)
-		CPU_SET(map->core.cpu, &map->affinity_mask);
+		set_bit(map->core.cpu, map->affinity_mask.bits);
+
+	return 0;
 }
 
+#define MASK_SIZE 1023
 int mmap__mmap(struct mmap *map, struct mmap_params *mp, int fd, int cpu)
 {
+	char mask[MASK_SIZE + 1] = {0};
+
 	if (perf_mmap__mmap(&map->core, &mp->core, fd, cpu)) {
 		pr_debug2("failed to mmap perf event ring buffer, error %d\n",
 			  errno);
 		return -1;
 	}
 
-	perf_mmap__setup_affinity_mask(map, mp);
+	if (perf_mmap__setup_affinity_mask(map, mp)) {
+		pr_debug2("failed to alloc mmap affinity mask, error %d\n",
+			  errno);
+		return -1;
+	}
+	bitmap_scnprintf(map->affinity_mask.bits, map->affinity_mask.nbits, mask, MASK_SIZE);
+	pr_debug("%p: mmap mask[%ld]: %s\n", map, map->affinity_mask.nbits, mask);
 
 	map->core.flush = mp->flush;
 
diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
index a218a0eb1466..0a644e112ac6 100644
--- a/tools/perf/util/mmap.h
+++ b/tools/perf/util/mmap.h
@@ -40,7 +40,7 @@ struct mmap {
 		int		 nr_cblocks;
 	} aio;
 #endif
-	cpu_set_t	affinity_mask;
+	struct mmap_cpu_mask	affinity_mask;
 	void		*data;
 	int		comp_level;
 };
-- 
2.20.1
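
The patch above hands a plain unsigned long bitmap to
sched_setaffinity() together with its size in bytes. The same
(size, pointer) calling pattern can be tried without changing any
affinity by using sched_getaffinity(). The sketch below is an assumption
laden illustration: SKETCH_NBITS is an arbitrary bound chosen here, where
the series takes the length from cpu__max_cpu(), and the function name is
hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for sched_getaffinity() and cpu_set_t */
#endif
#include <assert.h>
#include <limits.h>
#include <sched.h>
#include <stdlib.h>

#define SKETCH_NBITS 4096	/* assumed upper bound on CPU numbers, larger than 1024 */

/* Return the number of CPUs in the calling thread's affinity mask, or -1 on error. */
static int sketch_count_affinity_cpus(void)
{
	const size_t bpl = sizeof(unsigned long) * CHAR_BIT;
	const size_t nlongs = (SKETCH_NBITS + bpl - 1) / bpl;
	const size_t bytes = nlongs * sizeof(unsigned long);
	unsigned long *bits = calloc(nlongs, sizeof(unsigned long));
	int count = 0;
	size_t i;

	if (!bits)
		return -1;

	/* The buffer is wider than cpu_set_t; its size is passed explicitly. */
	if (sched_getaffinity(0, bytes, (cpu_set_t *)bits)) {
		free(bits);
		return -1;
	}

	for (i = 0; i < SKETCH_NBITS; i++)
		if ((bits[i / bpl] >> (i % bpl)) & 1UL)
			count++;

	free(bits);
	return count;
}
```

The cast to cpu_set_t * only satisfies the prototype; the byte count
argument is what tells the kernel how much of the buffer is valid.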



* Re: [PATCH v1 2/3] perf mmap: declare type for cpu mask of arbitrary length
  2019-11-20  9:37 ` [PATCH v1 2/3] perf mmap: declare type for cpu mask of arbitrary length Alexey Budankov
@ 2019-11-22 13:20   ` Jiri Olsa
  2019-11-22 13:58     ` Alexey Budankov
  0 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2019-11-22 13:20 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On Wed, Nov 20, 2019 at 12:37:48PM +0300, Alexey Budankov wrote:
> 
> Declare a dedicated struct map_cpu_mask type for cpu masks of 
> arbitrary length. Mask is available thru bits pointer and the 
> mask length is kept in nbits field. mmap_cpu_mask_bytes() macro 
> returns mask storage size in bytes.
> 
> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
> ---
>  tools/perf/util/mmap.h | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
> index bee4e83f7109..a218a0eb1466 100644
> --- a/tools/perf/util/mmap.h
> +++ b/tools/perf/util/mmap.h
> @@ -15,6 +15,15 @@
>  #include "event.h"
>  
>  struct aiocb;
> +
> +struct mmap_cpu_mask {
> +	unsigned long *bits;
> +	size_t nbits;
> +};
> +
> +#define mmap_cpu_mask_bytes(m) \

we try to have all macros upper case

> +	(BITS_TO_LONGS(((struct mmap_cpu_mask *)m)->nbits) * sizeof(unsigned long))

we have BITS_TO_BYTES

thanks,
jirka

> +
>  /**
>   * struct mmap - perf's ring buffer mmap details
>   *
> -- 
> 2.20.1
> 



* Re: [PATCH v1 3/3] perf record: adapt affinity to machines with #CPUs > 1K
  2019-11-20  9:38 ` [PATCH v1 3/3] perf record: adapt affinity to machines with #CPUs > 1K Alexey Budankov
@ 2019-11-22 13:22   ` Jiri Olsa
  2019-11-22 13:55     ` Alexey Budankov
  0 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2019-11-22 13:22 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On Wed, Nov 20, 2019 at 12:38:57PM +0300, Alexey Budankov wrote:

SNIP

> diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
> index 063d1b93c53d..070b1873cd45 100644
> --- a/tools/perf/util/mmap.c
> +++ b/tools/perf/util/mmap.c
> @@ -23,6 +23,7 @@
>  #include "mmap.h"
>  #include "../perf.h"
>  #include <internal/lib.h> /* page_size */
> +#include <linux/bitmap.h>
>  
>  size_t mmap__mmap_len(struct mmap *map)
>  {
> @@ -215,7 +216,7 @@ void mmap__munmap(struct mmap *map)
>  	auxtrace_mmap__munmap(&map->auxtrace_mmap);
>  }
>  
> -static void build_node_mask(int node, cpu_set_t *mask)
> +static void build_node_mask(int node, struct mmap_cpu_mask *mask)
>  {
>  	int c, cpu, nr_cpus;
>  	const struct perf_cpu_map *cpu_map = NULL;
> @@ -228,28 +229,43 @@ static void build_node_mask(int node, cpu_set_t *mask)
>  	for (c = 0; c < nr_cpus; c++) {
>  		cpu = cpu_map->map[c]; /* map c index to online cpu index */
>  		if (cpu__get_node(cpu) == node)
> -			CPU_SET(cpu, mask);
> +			set_bit(cpu, mask->bits);
>  	}
>  }
>  
> -static void perf_mmap__setup_affinity_mask(struct mmap *map, struct mmap_params *mp)
> +static int perf_mmap__setup_affinity_mask(struct mmap *map, struct mmap_params *mp)
>  {
> -	CPU_ZERO(&map->affinity_mask);
> +	map->affinity_mask.nbits = cpu__max_cpu();
> +	map->affinity_mask.bits = bitmap_alloc(map->affinity_mask.nbits);
> +	if (!map->affinity_mask.bits)
> +		return 1;

I guess this works, but please return < 0 on error

thanks,
jirka
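
The convention requested in the review above, returning a negative value
on error, is commonly implemented with negated errno codes. A minimal
sketch of that pattern follows; the function names are hypothetical and
this is not the v2 code, which is not shown in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Allocate a zeroed bitmap of nlongs words; 0 on success, negative errno on failure. */
static int sketch_alloc_bits(unsigned long **bits, size_t nlongs)
{
	if (!bits || nlongs == 0)
		return -EINVAL;

	*bits = calloc(nlongs, sizeof(unsigned long));
	if (!*bits)
		return -ENOMEM;

	return 0;
}

/* Caller side: a single "< 0" check covers every failure and keeps the cause. */
static int sketch_setup_mask(size_t nlongs)
{
	unsigned long *bits = NULL;
	int err = sketch_alloc_bits(&bits, nlongs);

	if (err < 0)
		return err;

	free(bits);
	return 0;
}
```

Compared with returning 1, the negative code lets the caller both test
for failure and report why it failed.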



* Re: [PATCH v1 0/3] perf record: adapt NUMA awareness to machines with #CPUs > 1K
  2019-11-20  9:33 [PATCH v1 0/3] perf record: adapt NUMA awareness to machines with #CPUs > 1K Alexey Budankov
                   ` (2 preceding siblings ...)
  2019-11-20  9:38 ` [PATCH v1 3/3] perf record: adapt affinity to machines with #CPUs > 1K Alexey Budankov
@ 2019-11-22 13:26 ` Jiri Olsa
  2019-11-22 14:00   ` Alexey Budankov
  3 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2019-11-22 13:26 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On Wed, Nov 20, 2019 at 12:33:10PM +0300, Alexey Budankov wrote:
> 
> Current implementation of cpu_set_t type by glibc has internal cpu
> mask size limitation of no more than 1024 CPUs. This limitation confines
> NUMA awareness of Perf tool in record mode, thru --affinity option,
> to the first 1024 CPUs on machines with larger amount of CPUs.
> 
> This patch set enables Perf tool to overcome 1024 CPUs limitation by
> using a dedicated struct mmap_cpu_mask type and applying tool's bitmap
> API operations to manipulate affinity masks of the tool's thread and
> the mmaped data buffers.
> 
> tools bitmap API has been extended with bitmap_equal() operation
> and its implementation is derived from the kernel one.
> 
> ---
> Alexey Budankov (3):
>   tools bitmap: extend bitmap API with bitmap_equal()
>   perf mmap: declare type for cpu mask of arbitrary length
>   perf record: adapt affinity to machines with #CPUs > 1K

looks good to me, I sent some minor comments

> 
>  tools/include/linux/bitmap.h | 21 +++++++++++++++++++++
>  tools/lib/bitmap.c           | 15 +++++++++++++++
>  tools/perf/builtin-record.c  | 28 ++++++++++++++++++++++------
>  tools/perf/util/mmap.c       | 28 ++++++++++++++++++++++------
>  tools/perf/util/mmap.h       | 11 ++++++++++-
>  5 files changed, 90 insertions(+), 13 deletions(-)
> 
> ---
> Testing:
> 
>   $ tools/perf/perf record -v --affinity=cpu -- ls
>   thread mask[8]: empty
>   Using CPUID GenuineIntel-6-5E-3
>   ...
>   mmap size 528384B
>   0x7f95f8f85010: mmap mask[8]: 0
>   0x7f95f8f950d8: mmap mask[8]: 1
>   0x7f95f8fa51a0: mmap mask[8]: 2
>   0x7f95f8fb5268: mmap mask[8]: 3
>   0x7f95f8fc5330: mmap mask[8]: 4
>   0x7f95f8fd53f8: mmap mask[8]: 5
>   0x7f95f8fe54c0: mmap mask[8]: 6
>   0x7f95f8ff5588: mmap mask[8]: 7

could we add this to -vv? -v is polluted already

perhaps we should make some effort and try to consolidate -v output
for some really basic verbose, the rest would be under -vv or specialized
--debug variable .. not in scope of this patchset of course ;-)

thanks,
jirka



* Re: [PATCH v1 3/3] perf record: adapt affinity to machines with #CPUs > 1K
  2019-11-22 13:22   ` Jiri Olsa
@ 2019-11-22 13:55     ` Alexey Budankov
  0 siblings, 0 replies; 11+ messages in thread
From: Alexey Budankov @ 2019-11-22 13:55 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On 22.11.2019 16:22, Jiri Olsa wrote:
> On Wed, Nov 20, 2019 at 12:38:57PM +0300, Alexey Budankov wrote:
> 
> SNIP
> 
>> diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
<SNIP>
>> -static void perf_mmap__setup_affinity_mask(struct mmap *map, struct mmap_params *mp)
>> +static int perf_mmap__setup_affinity_mask(struct mmap *map, struct mmap_params *mp)
>>  {
>> -	CPU_ZERO(&map->affinity_mask);
>> +	map->affinity_mask.nbits = cpu__max_cpu();
>> +	map->affinity_mask.bits = bitmap_alloc(map->affinity_mask.nbits);
>> +	if (!map->affinity_mask.bits)
>> +		return 1;
> 
> I guess this works, but please return < 0 on error

In v2.

> 
> thanks,
> jirka
> 
> 

~Alexey



* Re: [PATCH v1 2/3] perf mmap: declare type for cpu mask of arbitrary length
  2019-11-22 13:20   ` Jiri Olsa
@ 2019-11-22 13:58     ` Alexey Budankov
  2019-11-22 16:01       ` Alexey Budankov
  0 siblings, 1 reply; 11+ messages in thread
From: Alexey Budankov @ 2019-11-22 13:58 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On 22.11.2019 16:20, Jiri Olsa wrote:
> On Wed, Nov 20, 2019 at 12:37:48PM +0300, Alexey Budankov wrote:
>>
>> Declare a dedicated struct map_cpu_mask type for cpu masks of 
>> arbitrary length. Mask is available thru bits pointer and the 
>> mask length is kept in nbits field. mmap_cpu_mask_bytes() macro 
>> returns mask storage size in bytes.
>>
>> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
>> ---
>>  tools/perf/util/mmap.h | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
>> index bee4e83f7109..a218a0eb1466 100644
>> --- a/tools/perf/util/mmap.h
>> +++ b/tools/perf/util/mmap.h
>> @@ -15,6 +15,15 @@
>>  #include "event.h"
>>  
>>  struct aiocb;
>> +
>> +struct mmap_cpu_mask {
>> +	unsigned long *bits;
>> +	size_t nbits;
>> +};
>> +
>> +#define mmap_cpu_mask_bytes(m) \
> 
> we try to have all macros upper case

In v2.

> 
>> +	(BITS_TO_LONGS(((struct mmap_cpu_mask *)m)->nbits) * sizeof(unsigned long))
> 
> we have BITS_TO_BYTES

In v2.

> 
> thanks,
> jirka
> 

~Alexey

>> +
>>  /**
>>   * struct mmap - perf's ring buffer mmap details
>>   *
>> -- 
>> 2.20.1
>>
> 
> 


* Re: [PATCH v1 0/3] perf record: adapt NUMA awareness to machines with #CPUs > 1K
  2019-11-22 13:26 ` [PATCH v1 0/3] perf record: adapt NUMA awareness " Jiri Olsa
@ 2019-11-22 14:00   ` Alexey Budankov
  0 siblings, 0 replies; 11+ messages in thread
From: Alexey Budankov @ 2019-11-22 14:00 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On 22.11.2019 16:26, Jiri Olsa wrote:
> On Wed, Nov 20, 2019 at 12:33:10PM +0300, Alexey Budankov wrote:
>>
>> Current implementation of cpu_set_t type by glibc has internal cpu
>> mask size limitation of no more than 1024 CPUs. This limitation confines
>> NUMA awareness of Perf tool in record mode, thru --affinity option,
>> to the first 1024 CPUs on machines with larger amount of CPUs.
>>
>> This patch set enables Perf tool to overcome 1024 CPUs limitation by
>> using a dedicated struct mmap_cpu_mask type and applying tool's bitmap
>> API operations to manipulate affinity masks of the tool's thread and
>> the mmaped data buffers.
>>
>> tools bitmap API has been extended with bitmap_equal() operation
>> and its implementation is derived from the kernel one.
>>
>> ---
>> Alexey Budankov (3):
>>   tools bitmap: extend bitmap API with bitmap_equal()
>>   perf mmap: declare type for cpu mask of arbitrary length
>>   perf record: adapt affinity to machines with #CPUs > 1K
> 
> looks good to me, I sent some minor comments
> 
>>
>>  tools/include/linux/bitmap.h | 21 +++++++++++++++++++++
>>  tools/lib/bitmap.c           | 15 +++++++++++++++
>>  tools/perf/builtin-record.c  | 28 ++++++++++++++++++++++------
>>  tools/perf/util/mmap.c       | 28 ++++++++++++++++++++++------
>>  tools/perf/util/mmap.h       | 11 ++++++++++-
>>  5 files changed, 90 insertions(+), 13 deletions(-)
>>
>> ---
>> Testing:
>>
>>   $ tools/perf/perf record -v --affinity=cpu -- ls
>>   thread mask[8]: empty
>>   Using CPUID GenuineIntel-6-5E-3
>>   ...
>>   mmap size 528384B
>>   0x7f95f8f85010: mmap mask[8]: 0
>>   0x7f95f8f950d8: mmap mask[8]: 1
>>   0x7f95f8fa51a0: mmap mask[8]: 2
>>   0x7f95f8fb5268: mmap mask[8]: 3
>>   0x7f95f8fc5330: mmap mask[8]: 4
>>   0x7f95f8fd53f8: mmap mask[8]: 5
>>   0x7f95f8fe54c0: mmap mask[8]: 6
>>   0x7f95f8ff5588: mmap mask[8]: 7
> 
> could we add this to -vv? -v is polluted already

In v2.

Thanks,
Alexey

> 
> perhaps we should make some effort and try to consolidate -v output
> for some really basic verbose, the rest would be under -vv or specialized
> --debug variable .. not in scope of this patchset of course ;-)
> 
> thanks,
> jirka
> 
> 


* Re: [PATCH v1 2/3] perf mmap: declare type for cpu mask of arbitrary length
  2019-11-22 13:58     ` Alexey Budankov
@ 2019-11-22 16:01       ` Alexey Budankov
  0 siblings, 0 replies; 11+ messages in thread
From: Alexey Budankov @ 2019-11-22 16:01 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On 22.11.2019 16:58, Alexey Budankov wrote:
> On 22.11.2019 16:20, Jiri Olsa wrote:
>> On Wed, Nov 20, 2019 at 12:37:48PM +0300, Alexey Budankov wrote:
>>>

<SNIP>

>>> +#define mmap_cpu_mask_bytes(m) \
>>
>> we try to have all macros upper case
> 
> In v2.
> 
>>
>>> +	(BITS_TO_LONGS(((struct mmap_cpu_mask *)m)->nbits) * sizeof(unsigned long))
>>
>> we have BITS_TO_BYTES
> 
> In v2.

I avoided BITS_TO_BYTES() in this particular case because
the storage is allocated in unsigned long chunks, so the
returned size should also account for the possibly unused
tail bytes of the last chunk. This also keeps compatibility
with cpu_set_t, which is allocated in unsigned longs as well.

~Alexey
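
The sizing argument above can be checked with a small sketch. The two
helpers below are hypothetical; they mirror the arithmetic of rounding a
bit count up to whole bytes versus rounding it up to whole unsigned long
words, and do not use the tools macros themselves.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Size in bytes when the bit count is rounded up to whole bytes. */
static size_t sketch_bytes_rounded_to_bytes(size_t nbits)
{
	return (nbits + CHAR_BIT - 1) / CHAR_BIT;
}

/* Size in bytes when the bit count is rounded up to whole unsigned long words. */
static size_t sketch_bytes_rounded_to_longs(size_t nbits)
{
	const size_t bpl = sizeof(unsigned long) * CHAR_BIT;

	assert(nbits > 0);
	return ((nbits + bpl - 1) / bpl) * sizeof(unsigned long);
}
```

For the 8-CPU machine in the test log, byte rounding gives 1 byte while
word rounding gives sizeof(unsigned long) bytes, which matches the
storage actually allocated. The two agree only when nbits is a multiple
of the word size, for example 1024.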

