* [PATCH 0/3] perf tools: Speedup DWARF unwind
@ 2014-04-17 17:39 Jiri Olsa
2014-04-17 17:39 ` [PATCH 1/3] perf tools: Cache register accesses for unwind processing Jiri Olsa
` (5 more replies)
0 siblings, 6 replies; 24+ messages in thread
From: Jiri Olsa @ 2014-04-17 17:39 UTC
To: linux-kernel
Cc: Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
Namhyung Kim, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet, Jiri Olsa
hi,
trying to speed up the DWARF unwind report code by refactoring
related code:
- caching sample's registers access
- keep dso data file descriptor open for the
life of the dso object
- replace dso cache code by mapping dso data file
directly for the life of the dso object
The speedup is mainly for the libunwind unwinder. libdw will benefit
mainly from the cached register access, because it handles dso data
accesses by itself.. and anyway it's still faster ;-).
Also reachable in here:
git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
perf/core_unwind_speedup
thanks,
jirka
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jean Pihet <jean.pihet@linaro.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
---
Jiri Olsa (3):
perf tools: Cache register accesses for unwind processing
perf tools: Cache dso data file descriptor
perf tools: Replace dso data cache with mapped data
tools/perf/tests/dso-data.c | 7 ++++
tools/perf/util/dso.c | 200 +++++++++++++++++++++++++++---------------------------------------------------------------------
tools/perf/util/dso.h | 14 ++-----
tools/perf/util/event.h | 5 +++
tools/perf/util/perf_regs.c | 10 ++++-
tools/perf/util/perf_regs.h | 4 +-
tools/perf/util/unwind-libunwind.c | 2 -
7 files changed, 83 insertions(+), 159 deletions(-)
* [PATCH 1/3] perf tools: Cache register accesses for unwind processing
2014-04-17 17:39 [PATCH 0/3] perf tools: Speedup DWARF unwind Jiri Olsa
@ 2014-04-17 17:39 ` Jiri Olsa
2014-04-27 14:29 ` Namhyung Kim
2014-04-28 10:39 ` Christian Borntraeger
2014-04-17 17:39 ` [PATCH 2/3] perf tools: Cache dso data file descriptor Jiri Olsa
` (4 subsequent siblings)
5 siblings, 2 replies; 24+ messages in thread
From: Jiri Olsa @ 2014-04-17 17:39 UTC
To: linux-kernel
Cc: Jiri Olsa, Corey Ashford, David Ahern, Frederic Weisbecker,
Ingo Molnar, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
Cache register values in an array. The share of perf_reg_value
in the profile drops by about 4.5 percentage points when the
report command processes DWARF unwind stacks.
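A minimal standalone sketch of the caching idea, not the perf code itself: the type `regs_dump_sketch`, the bound `REGS_MAX` and the function `reg_value` are hypothetical names, and the slow-path loop is an assumption about how the packed index is derived from the mask. It uses `1ULL << id` so that the shift is done in 64 bits, whereas the patch as posted shifts a plain int.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define REGS_MAX 64 /* hypothetical bound, stands in for PERF_REGS_MAX */

struct regs_dump_sketch {
	uint64_t mask;                 /* which register ids were sampled */
	const uint64_t *regs;          /* packed values, one per set bit in mask */
	uint64_t cache_regs[REGS_MAX]; /* values indexed directly by register id */
	uint64_t cache_mask;           /* ids already copied into cache_regs */
};

static int reg_value(uint64_t *valp, struct regs_dump_sketch *r, int id)
{
	int i, idx = 0;

	assert(valp != NULL && r != NULL);

	if (id < 0 || id >= REGS_MAX)
		return -EINVAL;

	/* Fast path: this id was resolved by an earlier call. */
	if (r->cache_mask & (1ULL << id)) {
		*valp = r->cache_regs[id];
		return 0;
	}

	if (!(r->mask & (1ULL << id)))
		return -EINVAL;

	/* Slow path (assumed): count the sampled registers below id. */
	for (i = 0; i < id; i++) {
		if (r->mask & (1ULL << i))
			idx++;
	}

	r->cache_mask |= 1ULL << id;
	r->cache_regs[id] = r->regs[idx];
	*valp = r->cache_regs[id];
	return 0;
}
```

With mask 0x29 (ids 0, 3 and 5 sampled) and packed values {0x10, 0x20, 0x30}, the first lookup of id 3 walks the mask once and yields 0x20; a second lookup of id 3 takes the fast path.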
Output from report over 1.5 GB data with DWARF unwind stacks:
(TODO fix perf diff)
current code:
6.81% perf.old perf.old [.] perf_reg_value
change:
2.24% perf perf [.] perf_reg_value
And a little bit of overall speedup:
Performance counter stats for './perf.old report -i perf-test.data --stdio':
134,664,011,577 cycles:u # 2.472 GHz
189,677,227,475 instructions:u # 1.41 insns per cycle
54465.096050 task-clock (msec) # 0.998 CPUs utilized
54.598339009 seconds time elapsed
Performance counter stats for './perf report -i perf-test.data --stdio':
124,478,681,672 cycles:u # 2.466 GHz
168,998,379,866 instructions:u # 1.36 insns per cycle
50487.110482 task-clock (msec) # 0.997 CPUs utilized
50.635824229 seconds time elapsed
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jean Pihet <jean.pihet@linaro.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
---
tools/perf/util/event.h | 5 +++++
tools/perf/util/perf_regs.c | 10 +++++++++-
tools/perf/util/perf_regs.h | 4 +++-
3 files changed, 17 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 38457d4..970d4eb 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -7,6 +7,7 @@
#include "../perf.h"
#include "map.h"
#include "build-id.h"
+#include "perf_regs.h"
struct mmap_event {
struct perf_event_header header;
@@ -87,6 +88,10 @@ struct regs_dump {
u64 abi;
u64 mask;
u64 *regs;
+
+ /* Cached values/mask filled by first register access. */
+ u64 cache_regs[PERF_REGS_MAX];
+ u64 cache_mask;
};
struct stack_dump {
diff --git a/tools/perf/util/perf_regs.c b/tools/perf/util/perf_regs.c
index a3539ef..43168fb 100644
--- a/tools/perf/util/perf_regs.c
+++ b/tools/perf/util/perf_regs.c
@@ -1,11 +1,15 @@
#include <errno.h>
#include "perf_regs.h"
+#include "event.h"
int perf_reg_value(u64 *valp, struct regs_dump *regs, int id)
{
int i, idx = 0;
u64 mask = regs->mask;
+ if (regs->cache_mask & (1 << id))
+ goto out;
+
if (!(mask & (1 << id)))
return -EINVAL;
@@ -14,6 +18,10 @@ int perf_reg_value(u64 *valp, struct regs_dump *regs, int id)
idx++;
}
- *valp = regs->regs[idx];
+ regs->cache_mask |= (1 << id);
+ regs->cache_regs[id] = regs->regs[idx];
+
+out:
+ *valp = regs->cache_regs[id];
return 0;
}
diff --git a/tools/perf/util/perf_regs.h b/tools/perf/util/perf_regs.h
index d6e8b6a..80d8ab1 100644
--- a/tools/perf/util/perf_regs.h
+++ b/tools/perf/util/perf_regs.h
@@ -2,15 +2,17 @@
#define __PERF_REGS_H
#include "types.h"
-#include "event.h"
#ifdef HAVE_PERF_REGS_SUPPORT
#include <perf_regs.h>
+struct regs_dump;
+
int perf_reg_value(u64 *valp, struct regs_dump *regs, int id);
#else
#define PERF_REGS_MASK 0
+#define PERF_REGS_MAX 0
static inline const char *perf_reg_name(int id __maybe_unused)
{
--
1.8.3.1
* [PATCH 2/3] perf tools: Cache dso data file descriptor
2014-04-17 17:39 [PATCH 0/3] perf tools: Speedup DWARF unwind Jiri Olsa
2014-04-17 17:39 ` [PATCH 1/3] perf tools: Cache register accesses for unwind processing Jiri Olsa
@ 2014-04-17 17:39 ` Jiri Olsa
2014-04-27 14:36 ` Namhyung Kim
2014-04-17 17:39 ` [PATCH 3/3] perf tools: Replace dso data cache with mapped data Jiri Olsa
` (3 subsequent siblings)
5 siblings, 1 reply; 24+ messages in thread
From: Jiri Olsa @ 2014-04-17 17:39 UTC
To: linux-kernel
Cc: Jiri Olsa, Corey Ashford, David Ahern, Frederic Weisbecker,
Ingo Molnar, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
Keeping the data file descriptor open for the whole life
of the dso object.
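The idea can be sketched in isolation as follows. This is a simplified illustration under assumed names (`dso_sketch`, `dso_sketch_init`, `dso_sketch_fd` and `dso_sketch_close` are hypothetical), not the perf implementation: the descriptor is opened on first use, handed back on every later call, and closed once when the object goes away.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

struct dso_sketch {
	const char *path;
	int data_fd; /* -1 until the first successful open */
};

static void dso_sketch_init(struct dso_sketch *d, const char *path)
{
	d->path = path;
	d->data_fd = -1;
}

/* Return the cached descriptor, opening the file only on first use. */
static int dso_sketch_fd(struct dso_sketch *d)
{
	assert(d != NULL);

	if (d->data_fd >= 0)
		return d->data_fd;

	/* open() returns -1 on failure, so a failed open is retried next time. */
	d->data_fd = open(d->path, O_RDONLY);
	return d->data_fd;
}

static void dso_sketch_close(struct dso_sketch *d)
{
	if (d->data_fd >= 0) {
		close(d->data_fd);
		/* Reset so a later call reopens instead of reusing a stale fd. */
		d->data_fd = -1;
	}
}
```

Callers must no longer close the descriptor they get back, which is why the patch drops the close(fd) calls from the unwind code.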
The report shows just a small speedup in the dso__data_fd function
when the report command processes DWARF unwind stacks.
Output from report over 1.5 GB data with DWARF unwind stacks:
(TODO fix perf diff)
current code:
0.22% perf.old perf.old [.] dso__data_fd
change:
0.15% perf perf [.] dso__data_fd
But a bigger overall speedup:
Performance counter stats for './perf.old report -i perf-test.data --stdio':
126,055,895,573 cycles:u # 2.463 GHz
168,964,795,208 instructions:u # 1.34 insns per cycle
51174.366434 task-clock (msec) # 0.997 CPUs utilized
51.306236943 seconds time elapsed
Performance counter stats for './perf report -i perf-test.data --stdio':
112,531,906,656 cycles:u # 2.680 GHz
163,466,037,207 instructions:u # 1.45 insns per cycle
41991.297576 task-clock (msec) # 1.000 CPUs utilized
41.985142753 seconds time elapsed
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jean Pihet <jean.pihet@linaro.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
---
tools/perf/util/dso.c | 15 +++++++++++----
tools/perf/util/dso.h | 1 +
tools/perf/util/unwind-libunwind.c | 2 --
3 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 64453d6..0dca5d6 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -159,6 +159,12 @@ static int open_dso(struct dso *dso, struct machine *machine)
return fd;
}
+static void dso__data_close(struct dso *dso)
+{
+ if (dso->data_fd >= 0)
+ close(dso->data_fd);
+}
+
int dso__data_fd(struct dso *dso, struct machine *machine)
{
enum dso_binary_type binary_type_data[] = {
@@ -168,8 +174,8 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
};
int i = 0;
- if (dso->binary_type != DSO_BINARY_TYPE__NOT_FOUND)
- return open_dso(dso, machine);
+ if (dso->data_fd >= 0)
+ return dso->data_fd;
do {
int fd;
@@ -178,7 +184,7 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
fd = open_dso(dso, machine);
if (fd >= 0)
- return fd;
+ return dso->data_fd = fd;
} while (dso->binary_type != DSO_BINARY_TYPE__NOT_FOUND);
@@ -301,7 +307,6 @@ dso_cache__read(struct dso *dso, struct machine *machine,
if (ret <= 0)
free(cache);
- close(fd);
return ret;
}
@@ -485,6 +490,7 @@ struct dso *dso__new(const char *name)
dso->kernel = DSO_TYPE_USER;
dso->needs_swap = DSO_SWAP__UNSET;
INIT_LIST_HEAD(&dso->node);
+ dso->data_fd = -1;
}
return dso;
@@ -506,6 +512,7 @@ void dso__delete(struct dso *dso)
dso->long_name_allocated = false;
}
+ dso__data_close(dso);
dso_cache__free(&dso->cache);
dso__free_a2l(dso);
zfree(&dso->symsrc_filename);
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index ab06f1c..6e48cdc 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -99,6 +99,7 @@ struct dso {
const char *long_name;
u16 long_name_len;
u16 short_name_len;
+ int data_fd;
char name[0];
};
diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index bd5768d..25578b9 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -250,7 +250,6 @@ static int read_unwind_spec_eh_frame(struct dso *dso, struct machine *machine,
/* Check the .eh_frame section for unwinding info */
offset = elf_section_offset(fd, ".eh_frame_hdr");
- close(fd);
if (offset)
ret = unwind_spec_ehframe(dso, machine, offset,
@@ -271,7 +270,6 @@ static int read_unwind_spec_debug_frame(struct dso *dso,
/* Check the .debug_frame section for unwinding info */
*offset = elf_section_offset(fd, ".debug_frame");
- close(fd);
if (*offset)
return 0;
--
1.8.3.1
* [PATCH 3/3] perf tools: Replace dso data cache with mapped data
2014-04-17 17:39 [PATCH 0/3] perf tools: Speedup DWARF unwind Jiri Olsa
2014-04-17 17:39 ` [PATCH 1/3] perf tools: Cache register accesses for unwind processing Jiri Olsa
2014-04-17 17:39 ` [PATCH 2/3] perf tools: Cache dso data file descriptor Jiri Olsa
@ 2014-04-17 17:39 ` Jiri Olsa
2014-04-18 7:51 ` [PATCH 0/3] perf tools: Speedup DWARF unwind Ingo Molnar
` (2 subsequent siblings)
5 siblings, 0 replies; 24+ messages in thread
From: Jiri Olsa @ 2014-04-17 17:39 UTC
To: linux-kernel
Cc: Jiri Olsa, Corey Ashford, David Ahern, Frederic Weisbecker,
Ingo Molnar, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
Remove the dso data cache processing and map the
whole dso data file instead when it is requested.
The share of dso__data_read_offset in the profile drops by about
13 percentage points when the report command processes DWARF
unwind stacks.
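A rough standalone sketch of an mmap-backed read with the same offset clamping. The names `mapped_file`, `mapped_file_open`, `mapped_file_read` and `mapped_file_close` are hypothetical, and the sketch deviates from the patch in a few ways that are assumptions of this example: it uses MAP_PRIVATE, passes the exact file size to mmap, and closes the descriptor right after mapping instead of keeping it in the object.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

struct mapped_file {
	char *data;  /* NULL until mapped */
	size_t size; /* file size in bytes */
};

static int mapped_file_open(struct mapped_file *mf, const char *path)
{
	struct stat st;
	void *m;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	/* A zero-length mapping is invalid, so treat empty files as errors. */
	if (fstat(fd, &st) || st.st_size == 0) {
		close(fd);
		return -1;
	}

	m = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd); /* the mapping stays valid after the descriptor is closed */
	if (m == MAP_FAILED)
		return -1;

	mf->data = m;
	mf->size = (size_t)st.st_size;
	return 0;
}

/* Copy up to size bytes from offset, returning a short count at EOF. */
static ssize_t mapped_file_read(const struct mapped_file *mf, uint64_t offset,
				void *buf, size_t size)
{
	assert(mf->data != NULL);

	if (offset > mf->size)
		return -1;

	/* Clamp by subtraction, so offset + size can never overflow. */
	if (size > mf->size - offset)
		size = mf->size - offset;

	memcpy(buf, mf->data + offset, size);
	return (ssize_t)size;
}

static void mapped_file_close(struct mapped_file *mf)
{
	if (mf->data) {
		munmap(mf->data, mf->size);
		mf->data = NULL;
		mf->size = 0;
	}
}
```

Comparing `size` against `mf->size - offset` after the `offset > mf->size` check gives the same short-read behaviour as the patch without needing a separate overflow test.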
Output from report over 1.5 GB data with DWARF unwind stacks:
(TODO fix perf diff)
13.63% perf.old perf.old [.] dso__data_read_offset
0.32% perf perf [.] dso__data_read_offset
And overall speedup:
Performance counter stats for './perf.old report -i perf-test.data --stdio':
113,076,591,004 cycles:u # 2.675 GHz
163,353,590,494 instructions:u # 1.44 insns per cycle
42269.774797 task-clock (msec) # 1.000 CPUs utilized
42.267550053 seconds time elapsed
Performance counter stats for './perf report -i perf-test.data --stdio':
92,953,167,072 cycles:u # 2.534 GHz
132,967,448,023 instructions:u # 1.43 insns per cycle
36683.242639 task-clock (msec) # 1.000 CPUs utilized
36.682799394 seconds time elapsed
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jean Pihet <jean.pihet@linaro.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
---
tools/perf/tests/dso-data.c | 7 ++
tools/perf/util/dso.c | 185 +++++++++++---------------------------------
tools/perf/util/dso.h | 13 +---
3 files changed, 54 insertions(+), 151 deletions(-)
diff --git a/tools/perf/tests/dso-data.c b/tools/perf/tests/dso-data.c
index 9cc81a3..024c15f 100644
--- a/tools/perf/tests/dso-data.c
+++ b/tools/perf/tests/dso-data.c
@@ -40,6 +40,13 @@ static char *test_file(int size)
return templ;
}
+/*
+ * The data access is now a pure memory map of the file,
+ * so we don't need DSO__DATA_CACHE_SIZE anymore.
+ * Anyway keeping it for the sake of this test to
+ * ensure dso__data_read_offset interface works.
+ */
+#define DSO__DATA_CACHE_SIZE 4096
#define TEST_FILE_SIZE (DSO__DATA_CACHE_SIZE * 20)
struct test_data_offset {
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 0dca5d6..f274c85 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -1,3 +1,5 @@
+#include <sys/mman.h>
+
#include "symbol.h"
#include "dso.h"
#include "machine.h"
@@ -161,6 +163,14 @@ static int open_dso(struct dso *dso, struct machine *machine)
static void dso__data_close(struct dso *dso)
{
+ if (dso->data_mmap) {
+ size_t size = PERF_ALIGN(dso->data_size, page_size);
+
+ if (munmap(dso->data_mmap, size))
+ pr_err("dso mmap failed, munmap: %s\n",
+ strerror(errno));
+ }
+
if (dso->data_fd >= 0)
close(dso->data_fd);
}
@@ -191,164 +201,61 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
return -EINVAL;
}
-static void
-dso_cache__free(struct rb_root *root)
-{
- struct rb_node *next = rb_first(root);
-
- while (next) {
- struct dso_cache *cache;
-
- cache = rb_entry(next, struct dso_cache, rb_node);
- next = rb_next(&cache->rb_node);
- rb_erase(&cache->rb_node, root);
- free(cache);
- }
-}
-
-static struct dso_cache *dso_cache__find(const struct rb_root *root, u64 offset)
+static int dso__data_mmap(struct dso *dso, struct machine *machine, char **ptr)
{
- struct rb_node * const *p = &root->rb_node;
- const struct rb_node *parent = NULL;
- struct dso_cache *cache;
-
- while (*p != NULL) {
- u64 end;
-
- parent = *p;
- cache = rb_entry(parent, struct dso_cache, rb_node);
- end = cache->offset + DSO__DATA_CACHE_SIZE;
-
- if (offset < cache->offset)
- p = &(*p)->rb_left;
- else if (offset >= end)
- p = &(*p)->rb_right;
- else
- return cache;
- }
- return NULL;
-}
-
-static void
-dso_cache__insert(struct rb_root *root, struct dso_cache *new)
-{
- struct rb_node **p = &root->rb_node;
- struct rb_node *parent = NULL;
- struct dso_cache *cache;
- u64 offset = new->offset;
-
- while (*p != NULL) {
- u64 end;
-
- parent = *p;
- cache = rb_entry(parent, struct dso_cache, rb_node);
- end = cache->offset + DSO__DATA_CACHE_SIZE;
-
- if (offset < cache->offset)
- p = &(*p)->rb_left;
- else if (offset >= end)
- p = &(*p)->rb_right;
- }
-
- rb_link_node(&new->rb_node, parent, p);
- rb_insert_color(&new->rb_node, root);
-}
-
-static ssize_t
-dso_cache__memcpy(struct dso_cache *cache, u64 offset,
- u8 *data, u64 size)
-{
- u64 cache_offset = offset - cache->offset;
- u64 cache_size = min(cache->size - cache_offset, size);
-
- memcpy(data, cache->data + cache_offset, cache_size);
- return cache_size;
-}
-
-static ssize_t
-dso_cache__read(struct dso *dso, struct machine *machine,
- u64 offset, u8 *data, ssize_t size)
-{
- struct dso_cache *cache;
- ssize_t ret;
+ struct stat st;
int fd;
+ char *m;
+
+ if (dso->data_mmap)
+ goto out;
fd = dso__data_fd(dso, machine);
if (fd < 0)
- return -1;
-
- do {
- u64 cache_offset;
-
- ret = -ENOMEM;
-
- cache = zalloc(sizeof(*cache) + DSO__DATA_CACHE_SIZE);
- if (!cache)
- break;
-
- cache_offset = offset & DSO__DATA_CACHE_MASK;
- ret = -EINVAL;
-
- if (-1 == lseek(fd, cache_offset, SEEK_SET))
- break;
+ return fd;
- ret = read(fd, cache->data, DSO__DATA_CACHE_SIZE);
- if (ret <= 0)
- break;
-
- cache->offset = cache_offset;
- cache->size = ret;
- dso_cache__insert(&dso->cache, cache);
-
- ret = dso_cache__memcpy(cache, offset, data, size);
-
- } while (0);
+ if (fstat(fd, &st)) {
+ pr_err("dso mmap failed, fstat: %s\n", strerror(errno));
+ return -1;
+ }
- if (ret <= 0)
- free(cache);
+ dso->data_size = st.st_size;
- return ret;
-}
+ m = mmap(0, PERF_ALIGN(dso->data_size, page_size),
+ PROT_READ, MAP_SHARED, fd, 0);
+ if (m == MAP_FAILED) {
+ pr_err("dso mmap failed, mmap: %s\n", strerror(errno));
+ return -1;
+ }
-static ssize_t dso_cache_read(struct dso *dso, struct machine *machine,
- u64 offset, u8 *data, ssize_t size)
-{
- struct dso_cache *cache;
+ dso->data_mmap = m;
- cache = dso_cache__find(&dso->cache, offset);
- if (cache)
- return dso_cache__memcpy(cache, offset, data, size);
- else
- return dso_cache__read(dso, machine, offset, data, size);
+out:
+ *ptr = dso->data_mmap;
+ return 0;
}
ssize_t dso__data_read_offset(struct dso *dso, struct machine *machine,
u64 offset, u8 *data, ssize_t size)
{
- ssize_t r = 0;
- u8 *p = data;
+ ssize_t rsize = size;
+ char *m;
- do {
- ssize_t ret;
-
- ret = dso_cache_read(dso, machine, offset, p, size);
- if (ret < 0)
- return ret;
-
- /* Reached EOF, return what we have. */
- if (!ret)
- break;
+ if (dso__data_mmap(dso, machine, &m))
+ return -1;
- BUG_ON(ret > size);
+ if (offset > dso->data_size)
+ return -1;
- r += ret;
- p += ret;
- offset += ret;
- size -= ret;
+ /* unlikely, but anyway.. check overflow ;-) */
+ if (offset + size < offset)
+ return -1;
- } while (size);
+ if (offset + size > dso->data_size)
+ rsize = dso->data_size - offset;
- return r;
+ memcpy(data, m + offset, rsize);
+ return rsize;
}
ssize_t dso__data_read_addr(struct dso *dso, struct map *map,
@@ -478,7 +385,6 @@ struct dso *dso__new(const char *name)
dso__set_short_name(dso, dso->name, false);
for (i = 0; i < MAP__NR_TYPES; ++i)
dso->symbols[i] = dso->symbol_names[i] = RB_ROOT;
- dso->cache = RB_ROOT;
dso->symtab_type = DSO_BINARY_TYPE__NOT_FOUND;
dso->binary_type = DSO_BINARY_TYPE__NOT_FOUND;
dso->loaded = 0;
@@ -513,7 +419,6 @@ void dso__delete(struct dso *dso)
}
dso__data_close(dso);
- dso_cache__free(&dso->cache);
dso__free_a2l(dso);
zfree(&dso->symsrc_filename);
free(dso);
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 6e48cdc..fe4e4aa 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -62,21 +62,10 @@ enum dso_swap_type {
____r; \
})
-#define DSO__DATA_CACHE_SIZE 4096
-#define DSO__DATA_CACHE_MASK ~(DSO__DATA_CACHE_SIZE - 1)
-
-struct dso_cache {
- struct rb_node rb_node;
- u64 offset;
- u64 size;
- char data[0];
-};
-
struct dso {
struct list_head node;
struct rb_root symbols[MAP__NR_TYPES];
struct rb_root symbol_names[MAP__NR_TYPES];
- struct rb_root cache;
void *a2l;
char *symsrc_filename;
unsigned int a2l_fails;
@@ -100,6 +89,8 @@ struct dso {
u16 long_name_len;
u16 short_name_len;
int data_fd;
+ size_t data_size;
+ char *data_mmap;
char name[0];
};
--
1.8.3.1
* Re: [PATCH 0/3] perf tools: Speedup DWARF unwind
2014-04-17 17:39 [PATCH 0/3] perf tools: Speedup DWARF unwind Jiri Olsa
` (2 preceding siblings ...)
2014-04-17 17:39 ` [PATCH 3/3] perf tools: Replace dso data cache with mapped data Jiri Olsa
@ 2014-04-18 7:51 ` Ingo Molnar
2014-04-18 7:55 ` Ingo Molnar
2014-04-23 20:16 ` Jiri Olsa
2014-04-25 13:08 ` Jiri Olsa
5 siblings, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2014-04-18 7:51 UTC
To: Jiri Olsa
Cc: linux-kernel, Corey Ashford, David Ahern, Frederic Weisbecker,
Namhyung Kim, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
* Jiri Olsa <jolsa@redhat.com> wrote:
> hi,
> trying to speedup DWARF unwind report code by factoring
> related code:
> - caching sample's registers access
> - keep dso data file descriptor open for the
> life of the dso object
> - replace dso cache code by mapping dso data file
> directly for the life of the dso object
>
> The speedup is mainly for libunwind unwind. The libdw will benefit
> mainly from cached registers access, because it handles dso data
> accesses by itself.. and anyway it's still faster ;-).
Just curious: do you have any numbers about how much faster it got in
practice?
Thanks,
Ingo
* Re: [PATCH 0/3] perf tools: Speedup DWARF unwind
2014-04-18 7:51 ` [PATCH 0/3] perf tools: Speedup DWARF unwind Ingo Molnar
@ 2014-04-18 7:55 ` Ingo Molnar
2014-04-18 9:35 ` Jiri Olsa
0 siblings, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2014-04-18 7:55 UTC
To: Jiri Olsa
Cc: linux-kernel, Corey Ashford, David Ahern, Frederic Weisbecker,
Namhyung Kim, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
* Ingo Molnar <mingo@kernel.org> wrote:
>
> * Jiri Olsa <jolsa@redhat.com> wrote:
>
> > hi,
> > trying to speedup DWARF unwind report code by factoring
> > related code:
> > - caching sample's registers access
> > - keep dso data file descriptor open for the
> > life of the dso object
> > - replace dso cache code by mapping dso data file
> > directly for the life of the dso object
> >
> > The speedup is mainly for libunwind unwind. The libdw will benefit
> > mainly from cached registers access, because it handles dso data
> > accesses by itself.. and anyway it's still faster ;-).
>
> Just curious: do you have any numbers about how much faster it got in
> practice?
Oh, the numbers are all in the changelogs, never mind!
So in your test workload it went from 54.6 seconds to 36.7, a 48%
speedup :-)
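As a quick check of that figure, a small sketch that separates the two ways of stating the gain. The helper names `speedup_ratio` and `time_saved` are made up for this example.

```c
#include <assert.h>

/* How many times faster the new run is than the old one. */
static double speedup_ratio(double old_seconds, double new_seconds)
{
	assert(new_seconds > 0.0);
	return old_seconds / new_seconds;
}

/* Fraction of the original wall-clock time that was saved. */
static double time_saved(double old_seconds, double new_seconds)
{
	assert(old_seconds > 0.0);
	return (old_seconds - new_seconds) / old_seconds;
}
```

With the elapsed times from the changelogs (54.598339009 s before the series, 36.682799394 s after), the ratio is about 1.49, i.e. roughly the 48% speedup quoted here, which corresponds to about 33% less wall-clock time.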
Thanks,
Ingo
* Re: [PATCH 0/3] perf tools: Speedup DWARF unwind
2014-04-18 7:55 ` Ingo Molnar
@ 2014-04-18 9:35 ` Jiri Olsa
0 siblings, 0 replies; 24+ messages in thread
From: Jiri Olsa @ 2014-04-18 9:35 UTC
To: Ingo Molnar
Cc: linux-kernel, Corey Ashford, David Ahern, Frederic Weisbecker,
Namhyung Kim, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
On Fri, Apr 18, 2014 at 09:55:25AM +0200, Ingo Molnar wrote:
>
> * Ingo Molnar <mingo@kernel.org> wrote:
>
> >
> > * Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > > hi,
> > > trying to speedup DWARF unwind report code by factoring
> > > related code:
> > > - caching sample's registers access
> > > - keep dso data file descriptor open for the
> > > life of the dso object
> > > - replace dso cache code by mapping dso data file
> > > directly for the life of the dso object
> > >
> > > The speedup is mainly for libunwind unwind. The libdw will benefit
> > > mainly from cached registers access, because it handles dso data
> > > accesses by itself.. and anyway it's still faster ;-).
> >
> > Just curious: do you have any numbers about how much faster it got in
> > practice?
>
> Oh, the numbers are all in the changelogs, never mind!
>
> So in your test workload it went from 54.6 seconds to 36.7, a 48%
> speedup :-)
>
yep, I should have put it in here as well.. also the current
libdw unwind time on this workload is 26 seconds.. 10 more
seconds to go ;-)
jirka
* Re: [PATCH 0/3] perf tools: Speedup DWARF unwind
2014-04-17 17:39 [PATCH 0/3] perf tools: Speedup DWARF unwind Jiri Olsa
` (3 preceding siblings ...)
2014-04-18 7:51 ` [PATCH 0/3] perf tools: Speedup DWARF unwind Ingo Molnar
@ 2014-04-23 20:16 ` Jiri Olsa
2014-04-25 13:08 ` Jiri Olsa
5 siblings, 0 replies; 24+ messages in thread
From: Jiri Olsa @ 2014-04-23 20:16 UTC
To: linux-kernel
Cc: Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
Namhyung Kim, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
On Thu, Apr 17, 2014 at 07:39:09PM +0200, Jiri Olsa wrote:
> hi,
> trying to speedup DWARF unwind report code by factoring
> related code:
> - caching sample's registers access
> - keep dso data file descriptor open for the
> life of the dso object
> - replace dso cache code by mapping dso data file
> directly for the life of the dso object
>
> The speedup is mainly for libunwind unwind. The libdw will benefit
> mainly from cached registers access, because it handles dso data
> accesses by itself.. and anyway it's still faster ;-).
>
> Also reachable in here:
> git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
> perf/core_unwind_speedup
rebased to latest tip perf/core, review appreciated ;-)
jirka
* Re: [PATCH 0/3] perf tools: Speedup DWARF unwind
2014-04-17 17:39 [PATCH 0/3] perf tools: Speedup DWARF unwind Jiri Olsa
` (4 preceding siblings ...)
2014-04-23 20:16 ` Jiri Olsa
@ 2014-04-25 13:08 ` Jiri Olsa
5 siblings, 0 replies; 24+ messages in thread
From: Jiri Olsa @ 2014-04-25 13:08 UTC
To: linux-kernel
Cc: Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
Namhyung Kim, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
ping, any feedback?
thanks,
jirka
On Thu, Apr 17, 2014 at 07:39:09PM +0200, Jiri Olsa wrote:
> hi,
> trying to speedup DWARF unwind report code by factoring
> related code:
> - caching sample's registers access
> - keep dso data file descriptor open for the
> life of the dso object
> - replace dso cache code by mapping dso data file
> directly for the life of the dso object
>
> The speedup is mainly for libunwind unwind. The libdw will benefit
> mainly from cached registers access, because it handles dso data
> accesses by itself.. and anyway it's still faster ;-).
>
> Also reachable in here:
> git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
> perf/core_unwind_speedup
>
> thanks,
> jirka
>
> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
> Cc: Jean Pihet <jean.pihet@linaro.org>
> Signed-off-by: Jiri Olsa <jolsa@redhat.com>
> ---
> Jiri Olsa (3):
> perf tools: Cache register accesses for unwind processing
> perf tools: Cache dso data file descriptor
> perf tools: Replace dso data cache with mapped data
>
> tools/perf/tests/dso-data.c | 7 ++++
> tools/perf/util/dso.c | 200 +++++++++++++++++++++++++++---------------------------------------------------------------------
> tools/perf/util/dso.h | 14 ++-----
> tools/perf/util/event.h | 5 +++
> tools/perf/util/perf_regs.c | 10 ++++-
> tools/perf/util/perf_regs.h | 4 +-
> tools/perf/util/unwind-libunwind.c | 2 -
> 7 files changed, 83 insertions(+), 159 deletions(-)
* Re: [PATCH 1/3] perf tools: Cache register accesses for unwind processing
2014-04-17 17:39 ` [PATCH 1/3] perf tools: Cache register accesses for unwind processing Jiri Olsa
@ 2014-04-27 14:29 ` Namhyung Kim
2014-04-28 9:48 ` Jiri Olsa
2014-04-28 10:39 ` Christian Borntraeger
1 sibling, 1 reply; 24+ messages in thread
From: Namhyung Kim @ 2014-04-27 14:29 UTC
To: Jiri Olsa
Cc: linux-kernel, Corey Ashford, David Ahern, Frederic Weisbecker,
Ingo Molnar, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
Hi Jiri,
2014-04-17 (Thu), 19:39 +0200, Jiri Olsa:
> Cache register values in an array. The share of perf_reg_value
> in the profile drops by about 4.5 percentage points when the
> report command processes DWARF unwind stacks.
I'm not familiar with the code base, so these are probably silly questions:
where does the speedup come from? IOW I don't know what the difference is
between regs->regs and regs->cache_regs. And does cache_regs
contain the correct register values for each frame?
Thanks,
Namhyung
>
> Output from report over 1.5 GB data with DWARF unwind stacks:
> (TODO fix perf diff)
>
> current code:
> 6.81% perf.old perf.old [.] perf_reg_value
>
> change:
> 2.24% perf perf [.] perf_reg_value
>
> And little bit of speed up:
>
> Performance counter stats for './perf.old report -i perf-test.data --stdio':
>
> 134,664,011,577 cycles:u # 2.472 GHz
> 189,677,227,475 instructions:u # 1.41 insns per cycle
> 54465.096050 task-clock (msec) # 0.998 CPUs utilized
>
> 54.598339009 seconds time elapsed
>
> Performance counter stats for './perf report -i perf-test.data --stdio':
>
> 124,478,681,672 cycles:u # 2.466 GHz
> 168,998,379,866 instructions:u # 1.36 insns per cycle
> 50487.110482 task-clock (msec) # 0.997 CPUs utilized
>
> 50.635824229 seconds time elapsed
>
> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
> Cc: Jean Pihet <jean.pihet@linaro.org>
> Signed-off-by: Jiri Olsa <jolsa@redhat.com>
> ---
> tools/perf/util/event.h | 5 +++++
> tools/perf/util/perf_regs.c | 10 +++++++++-
> tools/perf/util/perf_regs.h | 4 +++-
> 3 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
> index 38457d4..970d4eb 100644
> --- a/tools/perf/util/event.h
> +++ b/tools/perf/util/event.h
> @@ -7,6 +7,7 @@
> #include "../perf.h"
> #include "map.h"
> #include "build-id.h"
> +#include "perf_regs.h"
>
> struct mmap_event {
> struct perf_event_header header;
> @@ -87,6 +88,10 @@ struct regs_dump {
> u64 abi;
> u64 mask;
> u64 *regs;
> +
> + /* Cached values/mask filled by first register access. */
> + u64 cache_regs[PERF_REGS_MAX];
> + u64 cache_mask;
> };
>
> struct stack_dump {
> diff --git a/tools/perf/util/perf_regs.c b/tools/perf/util/perf_regs.c
> index a3539ef..43168fb 100644
> --- a/tools/perf/util/perf_regs.c
> +++ b/tools/perf/util/perf_regs.c
> @@ -1,11 +1,15 @@
> #include <errno.h>
> #include "perf_regs.h"
> +#include "event.h"
>
> int perf_reg_value(u64 *valp, struct regs_dump *regs, int id)
> {
> int i, idx = 0;
> u64 mask = regs->mask;
>
> + if (regs->cache_mask & (1 << id))
> + goto out;
> +
> if (!(mask & (1 << id)))
> return -EINVAL;
>
> @@ -14,6 +18,10 @@ int perf_reg_value(u64 *valp, struct regs_dump *regs, int id)
> idx++;
> }
>
> - *valp = regs->regs[idx];
> + regs->cache_mask |= (1 << id);
> + regs->cache_regs[id] = regs->regs[idx];
> +
> +out:
> + *valp = regs->cache_regs[id];
> return 0;
> }
> diff --git a/tools/perf/util/perf_regs.h b/tools/perf/util/perf_regs.h
> index d6e8b6a..80d8ab1 100644
> --- a/tools/perf/util/perf_regs.h
> +++ b/tools/perf/util/perf_regs.h
> @@ -2,15 +2,17 @@
> #define __PERF_REGS_H
>
> #include "types.h"
> -#include "event.h"
>
> #ifdef HAVE_PERF_REGS_SUPPORT
> #include <perf_regs.h>
>
> +struct regs_dump;
> +
> int perf_reg_value(u64 *valp, struct regs_dump *regs, int id);
>
> #else
> #define PERF_REGS_MASK 0
> +#define PERF_REGS_MAX 0
>
> static inline const char *perf_reg_name(int id __maybe_unused)
> {
* Re: [PATCH 2/3] perf tools: Cache dso data file descriptor
2014-04-17 17:39 ` [PATCH 2/3] perf tools: Cache dso data file descriptor Jiri Olsa
@ 2014-04-27 14:36 ` Namhyung Kim
2014-04-28 10:01 ` Jiri Olsa
0 siblings, 1 reply; 24+ messages in thread
From: Namhyung Kim @ 2014-04-27 14:36 UTC
To: Jiri Olsa
Cc: linux-kernel, Corey Ashford, David Ahern, Frederic Weisbecker,
Ingo Molnar, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
2014-04-17 (Thu), 19:39 +0200, Jiri Olsa:
> Keeping the data file descriptor open for the whole life
> of the dso object.
I suspect there might be an issue when reporting a very large data file
with this approach - like the open file limit?
[SNIP]
> @@ -168,8 +174,8 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
> };
> int i = 0;
>
> - if (dso->binary_type != DSO_BINARY_TYPE__NOT_FOUND)
> - return open_dso(dso, machine);
Why did you remove this line?
Thanks,
Namhyung
> + if (dso->data_fd >= 0)
> + return dso->data_fd;
>
> do {
> int fd;
> @@ -178,7 +184,7 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
>
> fd = open_dso(dso, machine);
> if (fd >= 0)
> - return fd;
> + return dso->data_fd = fd;
>
> } while (dso->binary_type != DSO_BINARY_TYPE__NOT_FOUND);
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] perf tools: Cache register accesses for unwind processing
2014-04-27 14:29 ` Namhyung Kim
@ 2014-04-28 9:48 ` Jiri Olsa
2014-04-28 13:02 ` Namhyung Kim
0 siblings, 1 reply; 24+ messages in thread
From: Jiri Olsa @ 2014-04-28 9:48 UTC (permalink / raw)
To: Namhyung Kim
Cc: linux-kernel, Corey Ashford, David Ahern, Frederic Weisbecker,
Ingo Molnar, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
On Sun, Apr 27, 2014 at 11:29:21PM +0900, Namhyung Kim wrote:
> Hi Jiri,
>
> 2014-04-17 (Thu), 19:39 +0200, Jiri Olsa:
> > Caching registers value into an array. Got about 4% speed up
> > of perf_reg_value function for report command processing
> > dwarf unwind stacks.
>
> I'm not familiar with the code base, so probably silly questions: Where
> does the speed up come from? IOW I don't know what's the difference
> between the regs->regs and regs->cached_regs. And does the cached_regs
> contain correct values of registers for each frame?
the current way a register's value is accessed is to get its
index in the sample's regs array.. based on the register's id
and the registers mask
so each time you want a register value you traverse the registers
mask and count the reg's index for the sample regs array
this patch does this only once for each register (at the time it's
first accessed) and caches its value in the array (cache_regs). The
cache_mask is used to identify which regs are already cached.
jirka
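The lookup described above can be sketched as a standalone function. This is a simplified illustration, not the patch itself: `struct regs_dump_sketch`, `reg_value_sketch` and `SKETCH_REGS_MAX` are hypothetical names, and the sketch shifts `1ULL` rather than the `1 << id` of the posted diff, on the assumption that ids of 31 and above should stay well defined against a 64-bit mask. It also assumes the caller clears `cache_mask` for every new sample.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_REGS_MAX 64	/* hypothetical bound; the patch uses PERF_REGS_MAX */

/* Simplified stand-in for perf's struct regs_dump. */
struct regs_dump_sketch {
	uint64_t mask;				/* ids of the sampled registers */
	const uint64_t *regs;			/* packed values, one per set bit in mask */
	uint64_t cache_regs[SKETCH_REGS_MAX];	/* values indexed directly by id */
	uint64_t cache_mask;			/* ids already copied into cache_regs */
};

int reg_value_sketch(uint64_t *valp, struct regs_dump_sketch *regs, int id)
{
	int i, idx = 0;

	assert(id >= 0 && id < SKETCH_REGS_MAX);

	/* Fast path: an earlier call already resolved this id. */
	if (regs->cache_mask & (1ULL << id))
		goto out;

	if (!(regs->mask & (1ULL << id)))
		return -EINVAL;

	/* Slow path: count the set bits below id to find the packed index. */
	for (i = 0; i < id; i++) {
		if (regs->mask & (1ULL << i))
			idx++;
	}

	regs->cache_mask |= 1ULL << id;
	regs->cache_regs[id] = regs->regs[idx];
out:
	*valp = regs->cache_regs[id];
	return 0;
}
```

The mask walk then runs at most once per register per sample; every later access for the same id is a single mask test and an array read.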
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/3] perf tools: Cache dso data file descriptor
2014-04-27 14:36 ` Namhyung Kim
@ 2014-04-28 10:01 ` Jiri Olsa
2014-04-28 13:16 ` Namhyung Kim
2014-05-07 19:01 ` Ingo Molnar
0 siblings, 2 replies; 24+ messages in thread
From: Jiri Olsa @ 2014-04-28 10:01 UTC (permalink / raw)
To: Namhyung Kim
Cc: linux-kernel, Corey Ashford, David Ahern, Frederic Weisbecker,
Ingo Molnar, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
On Sun, Apr 27, 2014 at 11:36:35PM +0900, Namhyung Kim wrote:
> 2014-04-17 (Thu), 19:39 +0200, Jiri Olsa:
> > Keeping the data file descriptor open for the whole life
> > of the dso object.
>
> I suspect there might be an issue when reporting a very large data file
> with this approach - like the open file limit?
I've got as high as ~200 opened file descriptors for
~2GB data of system wide monitoring
but right, that could be an issue.. I wonder if we could
work around this somehow, because the speedup is quite
noticeable
how about we monitor the number of opened dso file descriptors
and once we cross some limit we close some portion of them
or something along those lines ;-)
>
>
> [SNIP]
> > @@ -168,8 +174,8 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
> > };
> > int i = 0;
> >
> > - if (dso->binary_type != DSO_BINARY_TYPE__NOT_FOUND)
> > - return open_dso(dso, machine);
>
> Why did you remove this line?
that code reopens an already opened (and closed) file..
instead I return the (not closed) descriptor from the previous open
thanks,
jirka
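One way to read the "monitor and close some" idea is a small bounded list of open descriptors. The sketch below is only an illustration under assumptions: `struct dso_sketch`, `dso_sketch_fd`, `close_oldest` and the fixed `SKETCH_FD_LIMIT` are hypothetical, and it closes the descriptor that was opened first rather than tracking real usage.

```c
#include <fcntl.h>
#include <unistd.h>

#define SKETCH_FD_LIMIT 4	/* hypothetical cap on cached descriptors */

/* Simplified stand-in for perf's struct dso. */
struct dso_sketch {
	const char *path;
	int data_fd;		/* -1 while closed */
};

static struct dso_sketch *open_list[SKETCH_FD_LIMIT];	/* oldest first */
static int open_cnt;

/* Close the descriptor that was opened first and drop it from the list. */
void close_oldest(void)
{
	struct dso_sketch *old = open_list[0];
	int i;

	close(old->data_fd);
	old->data_fd = -1;
	for (i = 1; i < open_cnt; i++)
		open_list[i - 1] = open_list[i];
	open_cnt--;
}

/* Return the cached descriptor, opening the file (and evicting) if needed. */
int dso_sketch_fd(struct dso_sketch *dso)
{
	if (dso->data_fd >= 0)
		return dso->data_fd;

	if (open_cnt == SKETCH_FD_LIMIT)
		close_oldest();

	dso->data_fd = open(dso->path, O_RDONLY);
	if (dso->data_fd >= 0)
		open_list[open_cnt++] = dso;
	return dso->data_fd;
}
```

A dso whose descriptor was evicted simply reopens its file on the next `dso_sketch_fd()` call, so the cap trades some repeated open() calls for a bounded descriptor count.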
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] perf tools: Cache register accesses for unwind processing
2014-04-17 17:39 ` [PATCH 1/3] perf tools: Cache register accesses for unwind processing Jiri Olsa
2014-04-27 14:29 ` Namhyung Kim
@ 2014-04-28 10:39 ` Christian Borntraeger
2014-04-28 11:00 ` Jiri Olsa
1 sibling, 1 reply; 24+ messages in thread
From: Christian Borntraeger @ 2014-04-28 10:39 UTC (permalink / raw)
To: Jiri Olsa
Cc: linux-kernel, Corey Ashford, David Ahern, Frederic Weisbecker,
Ingo Molnar, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
On 17/04/14 19:39, Jiri Olsa wrote:
> Caching registers value into an array. Got about 4% speed up
> of perf_reg_value function for report command processing
> dwarf unwind stacks.
>
> Output from report over 1.5 GB data with DWARF unwind stacks:
> (TODO fix perf diff)
>
> current code:
> 6.81% perf.old perf.old [.] perf_reg_value
>
> change:
> 2.24% perf perf [.] perf_reg_value
>
> And little bit of speed up:
>
> Performance counter stats for './perf.old report -i perf-test.data --stdio':
>
> 134,664,011,577 cycles:u # 2.472 GHz
> 189,677,227,475 instructions:u # 1.41 insns per cycle
> 54465.096050 task-clock (msec) # 0.998 CPUs utilized
>
> 54.598339009 seconds time elapsed
>
> Performance counter stats for './perf report -i perf-test.data --stdio':
>
> 124,478,681,672 cycles:u # 2.466 GHz
> 168,998,379,866 instructions:u # 1.36 insns per cycle
> 50487.110482 task-clock (msec) # 0.997 CPUs utilized
>
> 50.635824229 seconds time elapsed
>
> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
> Cc: Jean Pihet <jean.pihet@linaro.org>
> Signed-off-by: Jiri Olsa <jolsa@redhat.com>
> ---
> tools/perf/util/event.h | 5 +++++
> tools/perf/util/perf_regs.c | 10 +++++++++-
> tools/perf/util/perf_regs.h | 4 +++-
> 3 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
> index 38457d4..970d4eb 100644
> --- a/tools/perf/util/event.h
> +++ b/tools/perf/util/event.h
> @@ -7,6 +7,7 @@
> #include "../perf.h"
> #include "map.h"
> #include "build-id.h"
> +#include "perf_regs.h"
>
> struct mmap_event {
> struct perf_event_header header;
> @@ -87,6 +88,10 @@ struct regs_dump {
> u64 abi;
> u64 mask;
> u64 *regs;
> +
> + /* Cached values/mask filled by first register access. */
> + u64 cache_regs[PERF_REGS_MAX];
> + u64 cache_mask;
> };
>
> struct stack_dump {
> diff --git a/tools/perf/util/perf_regs.c b/tools/perf/util/perf_regs.c
> index a3539ef..43168fb 100644
> --- a/tools/perf/util/perf_regs.c
> +++ b/tools/perf/util/perf_regs.c
> @@ -1,11 +1,15 @@
> #include <errno.h>
> #include "perf_regs.h"
> +#include "event.h"
>
> int perf_reg_value(u64 *valp, struct regs_dump *regs, int id)
> {
> int i, idx = 0;
> u64 mask = regs->mask;
>
> + if (regs->cache_mask & (1 << id))
> + goto out;
> +
> if (!(mask & (1 << id)))
> return -EINVAL;
>
> @@ -14,6 +18,10 @@ int perf_reg_value(u64 *valp, struct regs_dump *regs, int id)
> idx++;
> }
>
> - *valp = regs->regs[idx];
> + regs->cache_mask |= (1 << id);
> + regs->cache_regs[id] = regs->regs[idx];
> +
> +out:
> + *valp = regs->cache_regs[id];
> return 0;
> }
> diff --git a/tools/perf/util/perf_regs.h b/tools/perf/util/perf_regs.h
> index d6e8b6a..80d8ab1 100644
> --- a/tools/perf/util/perf_regs.h
> +++ b/tools/perf/util/perf_regs.h
> @@ -2,15 +2,17 @@
> #define __PERF_REGS_H
>
> #include "types.h"
> -#include "event.h"
>
> #ifdef HAVE_PERF_REGS_SUPPORT
> #include <perf_regs.h>
>
> +struct regs_dump;
> +
> int perf_reg_value(u64 *valp, struct regs_dump *regs, int id);
>
> #else
> #define PERF_REGS_MASK 0
> +#define PERF_REGS_MAX 0
>
> static inline const char *perf_reg_name(int id __maybe_unused)
> {
>
Want such a speedup,
but it does not compile on my s390x system:
CC util/top.o
In file included from util/event.h:10:0,
from util/event.c:2:
util/perf_regs.h:24:6: error: ‘struct regs_dump’ declared inside parameter list [-Werror]
util/perf_regs.h:24:6: error: its scope is only this definition or declaration, which is probably not what you want [-Werror]
In file included from util/event.h:10:0,
from util/callchain.h:7,
from util/hist.h:6,
from util/evsel.h:11,
from util/evsel.c:18:
util/perf_regs.h:24:6: error: ‘struct regs_dump’ declared inside parameter list [-Werror]
util/perf_regs.h:24:6: error: its scope is only this definition or declaration, which is probably not what you want [-Werror]
In file included from /home/cborntra/REPOS/linux/tools/perf/util/event.h:10:0,
from /home/cborntra/REPOS/linux/tools/perf/util/debug.h:6,
from util/cpumap.h:8,
from util/top.c:9:
/home/cborntra/REPOS/linux/tools/perf/util/perf_regs.h:24:6: error: ‘struct regs_dump’ declared inside parameter list [-Werror]
/home/cborntra/REPOS/linux/tools/perf/util/perf_regs.h:24:6: error: its scope is only this definition or declaration, which is probably not what you want [-Werror]
In file included from /home/cborntra/REPOS/linux/tools/perf/util/event.h:10:0,
from /home/cborntra/REPOS/linux/tools/perf/util/debug.h:6,
from util/cpumap.h:8,
from util/evlist.c:12:
/home/cborntra/REPOS/linux/tools/perf/util/perf_regs.h:24:6: error: ‘struct regs_dump’ declared inside parameter list [-Werror]
/home/cborntra/REPOS/linux/tools/perf/util/perf_regs.h:24:6: error: its scope is only this definition or declaration, which is probably not what you want [-Werror]
CC util/usage.o
CC util/wrapper.o
CC util/sigchain.o
In file included from util/event.h:10:0,
from util/header.h:8,
from util/parse-options.c:4:
util/perf_regs.h:24:6: error: ‘struct regs_dump’ declared inside parameter list [-Werror]
util/perf_regs.h:24:6: error: its scope is only this definition or declaration, which is probably not what you want [-Werror]
cc1: all warnings being treated as errors
make[1]: *** [util/parse-options.o] Error 1
make[1]: *** Waiting for unfinished jobs....
cc1: all warnings being treated as errors
cc1: all warnings being treated as errors
make[1]: *** [util/event.o] Error 1
make[1]: *** [util/evsel.o] Error 1
In file included from util/event.h:10:0,
from util/debug.h:6,
from util/usage.c:10:
util/perf_regs.h:24:6: error: ‘struct regs_dump’ declared inside parameter list [-Werror]
util/perf_regs.h:24:6: error: its scope is only this definition or declaration, which is probably not what you want [-Werror]
cc1: all warnings being treated as errors
make[1]: *** [util/usage.o] Error 1
cc1: all warnings being treated as errors
make[1]: *** [util/evlist.o] Error 1
cc1: all warnings being treated as errors
make[1]: *** [util/top.o] Error 1
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] perf tools: Cache register accesses for unwind processing
2014-04-28 10:39 ` Christian Borntraeger
@ 2014-04-28 11:00 ` Jiri Olsa
0 siblings, 0 replies; 24+ messages in thread
From: Jiri Olsa @ 2014-04-28 11:00 UTC (permalink / raw)
To: Christian Borntraeger
Cc: linux-kernel, Corey Ashford, David Ahern, Frederic Weisbecker,
Ingo Molnar, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
On Mon, Apr 28, 2014 at 12:39:24PM +0200, Christian Borntraeger wrote:
SNIP
> > {
> >
>
> Want such a speedup,
> but it does not compile on my s390x system:
the speedup is for the DWARF unwind report, which is not yet
supported on s390x perf.. still it should compile ;-)
I'll try to get access to an s390x machine and make a fix
thanks,
jirka
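The error text suggests that, when HAVE_PERF_REGS_SUPPORT is not defined, a declaration at perf_regs.h line 24 names `struct regs_dump` before any declaration of that type is visible, since the patch adds the forward declaration only inside the `#ifdef` branch. The thread does not show the eventual fix; a minimal sketch of one plausible arrangement, with the stub body being an assumption, is:

```c
#include <stdint.h>

/* Forward declaration placed before the #ifdef, so both branches see it. */
struct regs_dump;

#ifdef HAVE_PERF_REGS_SUPPORT
int perf_reg_value(uint64_t *valp, struct regs_dump *regs, int id);
#else
/* Assumed stub for architectures without perf_regs support. */
static inline int perf_reg_value(uint64_t *valp, struct regs_dump *regs, int id)
{
	(void)valp;
	(void)regs;
	(void)id;
	return 0;
}
#endif

/* The full definition may follow later, as event.h does after the include. */
struct regs_dump {
	uint64_t abi;
	uint64_t mask;
	uint64_t *regs;
};
```

With the declaration at file scope, the parameter in either branch refers to the same type that is defined later, instead of a new type local to the parameter list.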
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] perf tools: Cache register accesses for unwind processing
2014-04-28 9:48 ` Jiri Olsa
@ 2014-04-28 13:02 ` Namhyung Kim
2014-04-28 13:24 ` Jiri Olsa
0 siblings, 1 reply; 24+ messages in thread
From: Namhyung Kim @ 2014-04-28 13:02 UTC (permalink / raw)
To: Jiri Olsa
Cc: linux-kernel, Corey Ashford, David Ahern, Frederic Weisbecker,
Ingo Molnar, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
Hi Jiri,
2014-04-28 (Mon), 11:48 +0200, Jiri Olsa:
> On Sun, Apr 27, 2014 at 11:29:21PM +0900, Namhyung Kim wrote:
> > Hi Jiri,
> >
> > 2014-04-17 (Thu), 19:39 +0200, Jiri Olsa:
> > > Caching registers value into an array. Got about 4% speed up
> > > of perf_reg_value function for report command processing
> > > dwarf unwind stacks.
> >
> > I'm not familiar with the code base, so probably silly questions: Where
> > does the speed up come from? IOW I don't know what's the difference
> > between the regs->regs and regs->cached_regs. And does the cached_regs
> > contain correct values of registers for each frame?
>
> the current way a register's value is accessed is to get its
> index in the sample's regs array.. based on the register's id
> and the registers mask
>
> so each time you want a register value you traverse the registers
> mask and count the reg's index for the sample regs array
>
> this patch does this only once for each register (at the time it's
> first accessed) and caches its value in the array (cache_regs). The
> cache_mask is used to identify which regs are already cached.
That means it'll get the same value every time it accesses a register in
frames in a sample?
Thanks,
Namhyung
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/3] perf tools: Cache dso data file descriptor
2014-04-28 10:01 ` Jiri Olsa
@ 2014-04-28 13:16 ` Namhyung Kim
2014-04-28 13:34 ` Jiri Olsa
2014-04-28 14:57 ` David Ahern
2014-05-07 19:01 ` Ingo Molnar
1 sibling, 2 replies; 24+ messages in thread
From: Namhyung Kim @ 2014-04-28 13:16 UTC (permalink / raw)
To: Jiri Olsa
Cc: linux-kernel, Corey Ashford, David Ahern, Frederic Weisbecker,
Ingo Molnar, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
2014-04-28 (Mon), 12:01 +0200, Jiri Olsa:
> On Sun, Apr 27, 2014 at 11:36:35PM +0900, Namhyung Kim wrote:
> > 2014-04-17 (Thu), 19:39 +0200, Jiri Olsa:
> > > Keeping the data file descriptor open for the whole life
> > > of the dso object.
> >
> > I suspect there might be an issue when reporting a very large data file
> > with this approach - like the open file limit?
>
> I've got as high as ~200 opened file descriptors for
> ~2GB data of system wide monitoring
>
> but right, that could be an issue.. I wonder if we could
> work around this somehow, because the speedup is quite
> noticeable
>
> how about we monitor the number of opened dso file descriptors
> and once we cross some limit we close some portion of them
>
> or something along those lines ;-)
Yeah, we'll need some way to control those eventually.
>
> >
> >
> > [SNIP]
> > > @@ -168,8 +174,8 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
> > > };
> > > int i = 0;
> > >
> > > - if (dso->binary_type != DSO_BINARY_TYPE__NOT_FOUND)
> > > - return open_dso(dso, machine);
> >
> > Why did you remove this line?
>
> that code reopens an already opened (and closed) file..
> instead I return the (not closed) descriptor from the previous open
But it'll overwrite the dso->binary_type then. What about this?
  if (dso->data_fd >= 0)
          return dso->data_fd;

  if (dso->binary_type != DSO_BINARY_TYPE__NOT_FOUND) {
          dso->data_fd = open_dso(dso, machine);
          return dso->data_fd;
  }
Thanks,
Namhyung
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] perf tools: Cache register accesses for unwind processing
2014-04-28 13:02 ` Namhyung Kim
@ 2014-04-28 13:24 ` Jiri Olsa
2014-04-29 0:36 ` Namhyung Kim
0 siblings, 1 reply; 24+ messages in thread
From: Jiri Olsa @ 2014-04-28 13:24 UTC (permalink / raw)
To: Namhyung Kim
Cc: linux-kernel, Corey Ashford, David Ahern, Frederic Weisbecker,
Ingo Molnar, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
On Mon, Apr 28, 2014 at 10:02:55PM +0900, Namhyung Kim wrote:
> Hi Jiri,
>
> 2014-04-28 (Mon), 11:48 +0200, Jiri Olsa:
> > On Sun, Apr 27, 2014 at 11:29:21PM +0900, Namhyung Kim wrote:
> > > Hi Jiri,
> > >
> > > 2014-04-17 (Thu), 19:39 +0200, Jiri Olsa:
> > > > Caching registers value into an array. Got about 4% speed up
> > > > of perf_reg_value function for report command processing
> > > > dwarf unwind stacks.
> > >
> > > I'm not familiar with the code base, so probably silly questions: Where
> > > does the speed up come from? IOW I don't know what's the difference
> > > between the regs->regs and regs->cached_regs. And does the cached_regs
> > > contain correct values of registers for each frame?
> >
> > the current way a register's value is accessed is to get its
> > index in the sample's regs array.. based on the register's id
> > and the registers mask
> >
> > so each time you want a register value you traverse the registers
> > mask and count the reg's index for the sample regs array
> >
> > this patch does this only once for each register (at the time it's
> > first accessed) and caches its value in the array (cache_regs). The
> > cache_mask is used to identify which regs are already cached.
>
> That means it'll get the same value every time it accesses a register in
> frames in a sample?
right..
jirka
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/3] perf tools: Cache dso data file descriptor
2014-04-28 13:16 ` Namhyung Kim
@ 2014-04-28 13:34 ` Jiri Olsa
2014-04-28 14:57 ` David Ahern
1 sibling, 0 replies; 24+ messages in thread
From: Jiri Olsa @ 2014-04-28 13:34 UTC (permalink / raw)
To: Namhyung Kim
Cc: linux-kernel, Corey Ashford, David Ahern, Frederic Weisbecker,
Ingo Molnar, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
On Mon, Apr 28, 2014 at 10:16:34PM +0900, Namhyung Kim wrote:
> 2014-04-28 (Mon), 12:01 +0200, Jiri Olsa:
> > On Sun, Apr 27, 2014 at 11:36:35PM +0900, Namhyung Kim wrote:
> > > 2014-04-17 (Thu), 19:39 +0200, Jiri Olsa:
> > > > Keeping the data file descriptor open for the whole life
> > > > of the dso object.
> > >
> > > I suspect there might be an issue when reporting a very large data file
> > > with this approach - like the open file limit?
> >
> > I've got as high as ~200 opened file descriptors for
> > ~2GB data of system wide monitoring
> >
> > but right, that could be an issue.. I wonder if we could
> > work around this somehow, because the speedup is quite
> > noticeable
> >
> > how about we monitor the number of opened dso file descriptors
> > and once we cross some limit we close some portion of them
> >
> > or something along those lines ;-)
>
> Yeah, we'll need some way to control those eventually.
>
> >
> > >
> > >
> > > [SNIP]
> > > > @@ -168,8 +174,8 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
> > > > };
> > > > int i = 0;
> > > >
> > > > - if (dso->binary_type != DSO_BINARY_TYPE__NOT_FOUND)
> > > > - return open_dso(dso, machine);
> > >
> > > Why did you remove this line?
> >
> > that code reopens an already opened (and closed) file..
> > instead I return the (not closed) descriptor from the previous open
>
> But it'll overwrite the dso->binary_type then. What about this?
>
>   if (dso->data_fd >= 0)
>           return dso->data_fd;
>
>   if (dso->binary_type != DSO_BINARY_TYPE__NOT_FOUND) {
>           dso->data_fd = open_dso(dso, machine);
>           return dso->data_fd;
>   }
right, makes sense.. I'll add it together with the control code for the
number of opened descriptors
thanks,
jirka
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/3] perf tools: Cache dso data file descriptor
2014-04-28 13:16 ` Namhyung Kim
2014-04-28 13:34 ` Jiri Olsa
@ 2014-04-28 14:57 ` David Ahern
2014-04-29 0:41 ` Namhyung Kim
1 sibling, 1 reply; 24+ messages in thread
From: David Ahern @ 2014-04-28 14:57 UTC (permalink / raw)
To: Namhyung Kim, Jiri Olsa
Cc: linux-kernel, Corey Ashford, Frederic Weisbecker, Ingo Molnar,
Paul Mackerras, Peter Zijlstra, Arnaldo Carvalho de Melo,
Jean Pihet
On 4/28/14, 7:16 AM, Namhyung Kim wrote:
> 2014-04-28 (Mon), 12:01 +0200, Jiri Olsa:
>> On Sun, Apr 27, 2014 at 11:36:35PM +0900, Namhyung Kim wrote:
>>> 2014-04-17 (Thu), 19:39 +0200, Jiri Olsa:
>>>> Keeping the data file descriptor open for the whole life
>>>> of the dso object.
>>>
>>> I suspect there might be an issue when reporting a very large data file
>>> with this approach - like the open file limit?
>>
>> I've got as high as ~200 opened file descriptors for
>> ~2GB data of system wide monitoring
>>
>> but right, that could be an issue.. I wonder if we could
>> work around this somehow, because the speedup is quite
>> noticeable
>>
>> how about we monitor the number of opened dso file descriptors
>> and once we cross some limit we close some portion of them
>>
>> or something along those lines ;-)
>
> Yeah, we'll need some way to control those eventually.
Handle EMFILE failures. Find an "old" one and close it to let the new
one succeed.
David
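The EMFILE suggestion can be sketched as a retry loop around open(). This is an illustration under assumptions, not code from the series: `open_cached`, `evict_oldest` and the fixed-size `cached_fds` array are hypothetical, and "old" is taken to mean the descriptor that was opened first.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#define SKETCH_CACHE_MAX 64	/* hypothetical capacity of the tracking array */

static int cached_fds[SKETCH_CACHE_MAX];	/* oldest first */
static int cached_cnt;

/* Close the oldest cached descriptor; returns 0, or -1 if none is left. */
int evict_oldest(void)
{
	int i;

	if (cached_cnt == 0)
		return -1;
	close(cached_fds[0]);
	for (i = 1; i < cached_cnt; i++)
		cached_fds[i - 1] = cached_fds[i];
	cached_cnt--;
	return 0;
}

/* Open path read-only; on EMFILE, evict one cached descriptor and retry. */
int open_cached(const char *path)
{
	int fd;

	while ((fd = open(path, O_RDONLY)) < 0) {
		if (errno != EMFILE || evict_oldest() != 0)
			return -1;
	}

	/* Keep the tracking array bounded as well. */
	if (cached_cnt == SKETCH_CACHE_MAX)
		evict_oldest();
	cached_fds[cached_cnt++] = fd;
	return fd;
}
```

Errors other than EMFILE are returned to the caller unchanged, so only the per-process descriptor limit triggers an eviction.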
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] perf tools: Cache register accesses for unwind processing
2014-04-28 13:24 ` Jiri Olsa
@ 2014-04-29 0:36 ` Namhyung Kim
2014-04-30 12:12 ` Jiri Olsa
0 siblings, 1 reply; 24+ messages in thread
From: Namhyung Kim @ 2014-04-29 0:36 UTC (permalink / raw)
To: Jiri Olsa
Cc: linux-kernel, Corey Ashford, David Ahern, Frederic Weisbecker,
Ingo Molnar, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
On Mon, 28 Apr 2014 15:24:20 +0200, Jiri Olsa wrote:
> On Mon, Apr 28, 2014 at 10:02:55PM +0900, Namhyung Kim wrote:
>> Hi Jiri,
>>
>> 2014-04-28 (Mon), 11:48 +0200, Jiri Olsa:
>> > On Sun, Apr 27, 2014 at 11:29:21PM +0900, Namhyung Kim wrote:
>> > > Hi Jiri,
>> > >
>> > > 2014-04-17 (Thu), 19:39 +0200, Jiri Olsa:
>> > > > Caching registers value into an array. Got about 4% speed up
>> > > > of perf_reg_value function for report command processing
>> > > > dwarf unwind stacks.
>> > >
>> > > I'm not familiar with the code base, so probably silly questions: Where
>> > > does the speed up come from? IOW I don't know what's the difference
>> > > between the regs->regs and regs->cached_regs. And does the cached_regs
>> > > contain correct values of registers for each frame?
>> >
>> > the current way a register's value is accessed is to get its
>> > index in the sample's regs array.. based on the register's id
>> > and the registers mask
>> >
>> > so each time you want a register value you traverse the registers
>> > mask and count the reg's index for the sample regs array
>> >
>> > this patch does this only once for each register (at the time it's
>> > first accessed) and caches its value in the array (cache_regs). The
>> > cache_mask is used to identify which regs are already cached.
>>
>> That means it'll get the same value every time it accesses a register in
>> frames in a sample?
>
> right..
Hmm.. I thought it'd be changed somehow as it unwinds frames.
Thanks,
Namhyung
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/3] perf tools: Cache dso data file descriptor
2014-04-28 14:57 ` David Ahern
@ 2014-04-29 0:41 ` Namhyung Kim
0 siblings, 0 replies; 24+ messages in thread
From: Namhyung Kim @ 2014-04-29 0:41 UTC (permalink / raw)
To: David Ahern
Cc: Jiri Olsa, linux-kernel, Corey Ashford, Frederic Weisbecker,
Ingo Molnar, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
Hi David,
On Mon, 28 Apr 2014 08:57:49 -0600, David Ahern wrote:
> On 4/28/14, 7:16 AM, Namhyung Kim wrote:
>> 2014-04-28 (Mon), 12:01 +0200, Jiri Olsa:
>>> On Sun, Apr 27, 2014 at 11:36:35PM +0900, Namhyung Kim wrote:
>>>> 2014-04-17 (Thu), 19:39 +0200, Jiri Olsa:
>>>>> Keeping the data file descriptor open for the whole life
>>>>> of the dso object.
>>>>
>>>> I suspect there might be an issue when reporting a very large data file
>>>> with this approach - like the open file limit?
>>>
>>> I've got as high as ~200 opened file descriptors for
>>> ~2GB data of system wide monitoring
>>>
>>> but right, that could be an issue.. I wonder if we could
>>> work around this somehow, because the speedup is quite
>>> noticeable
>>>
>>> how about we monitor the number of opened dso file descriptors
>>> and once we cross some limit we close some portion of them
>>>
>>> or something along those lines ;-)
>>
>> Yeah, we'll need some way to control those eventually.
>
> Handle EMFILE failures. Find an "old" one and close it to let the new
> one succeed.
But other open() calls, if any, would still fail in the meantime.. So I'd
rather limit the dso cache to a reasonable size.
Thanks,
Namhyung
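A "reasonable size" could, for instance, be derived from the process's own descriptor limit. The sketch below assumes a hypothetical policy of letting cached dso descriptors use at most half of the soft RLIMIT_NOFILE value; `dso_fd_budget` is an illustrative name and the thread does not settle on any particular number.

```c
#include <limits.h>
#include <sys/resource.h>

/*
 * Hypothetical policy: allow cached dso descriptors to use at most half
 * of the soft open-file limit, leaving the rest for other open() calls.
 */
long dso_fd_budget(void)
{
	struct rlimit rl;
	rlim_t half;

	if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
		return 1;		/* conservative fallback */
	if (rl.rlim_cur == RLIM_INFINITY)
		return LONG_MAX;	/* no soft limit configured */

	half = rl.rlim_cur / 2;
	if (half < 1)
		return 1;
	if (half > (rlim_t)LONG_MAX)
		return LONG_MAX;
	return (long)half;
}
```

Keeping part of the limit in reserve addresses the concern that unrelated open() calls would otherwise hit EMFILE before the cache reacts.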
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/3] perf tools: Cache register accesses for unwind processing
2014-04-29 0:36 ` Namhyung Kim
@ 2014-04-30 12:12 ` Jiri Olsa
0 siblings, 0 replies; 24+ messages in thread
From: Jiri Olsa @ 2014-04-30 12:12 UTC (permalink / raw)
To: Namhyung Kim
Cc: linux-kernel, Corey Ashford, David Ahern, Frederic Weisbecker,
Ingo Molnar, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
On Tue, Apr 29, 2014 at 09:36:19AM +0900, Namhyung Kim wrote:
> On Mon, 28 Apr 2014 15:24:20 +0200, Jiri Olsa wrote:
> > On Mon, Apr 28, 2014 at 10:02:55PM +0900, Namhyung Kim wrote:
> >> Hi Jiri,
> >>
> >> 2014-04-28 (Mon), 11:48 +0200, Jiri Olsa:
> >> > On Sun, Apr 27, 2014 at 11:29:21PM +0900, Namhyung Kim wrote:
> >> > > Hi Jiri,
> >> > >
> >> > > 2014-04-17 (Thu), 19:39 +0200, Jiri Olsa:
> >> > > > Caching registers value into an array. Got about 4% speed up
> >> > > > of perf_reg_value function for report command processing
> >> > > > dwarf unwind stacks.
> >> > >
> >> > > I'm not familiar with the code base, so probably silly questions: Where
> >> > > does the speed up come from? IOW I don't know what's the difference
> >> > > between the regs->regs and regs->cached_regs. And does the cached_regs
> >> > > contain correct values of registers for each frame?
> >> >
> >> > the current way a register's value is accessed is to get its
> >> > index in the sample's regs array.. based on the register's id
> >> > and the registers mask
> >> >
> >> > so each time you want a register value you traverse the registers
> >> > mask and count the reg's index for the sample regs array
> >> >
> >> > this patch does this only once for each register (at the time it's
> >> > first accessed) and caches its value in the array (cache_regs). The
> >> > cache_mask is used to identify which regs are already cached.
> >>
> >> That means it'll get the same value every time it accesses a register in
> >> frames in a sample?
> >
> > right..
>
> Hmm.. I thought it'd be changed somehow as it unwinds frames.
nope, these are just the sample's user space register values from the
time the sample was taken
both the libunwind and libdw unwinders keep the register state
internally while unwinding through the frames
jirka
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/3] perf tools: Cache dso data file descriptor
2014-04-28 10:01 ` Jiri Olsa
2014-04-28 13:16 ` Namhyung Kim
@ 2014-05-07 19:01 ` Ingo Molnar
1 sibling, 0 replies; 24+ messages in thread
From: Ingo Molnar @ 2014-05-07 19:01 UTC (permalink / raw)
To: Jiri Olsa
Cc: Namhyung Kim, linux-kernel, Corey Ashford, David Ahern,
Frederic Weisbecker, Paul Mackerras, Peter Zijlstra,
Arnaldo Carvalho de Melo, Jean Pihet
* Jiri Olsa <jolsa@redhat.com> wrote:
> On Sun, Apr 27, 2014 at 11:36:35PM +0900, Namhyung Kim wrote:
> > 2014-04-17 (Thu), 19:39 +0200, Jiri Olsa:
> > > Keeping the data file descriptor open for the whole life
> > > of the dso object.
> >
> > I suspect there might be an issue when reporting a very large data file
> > with this approach - like the open file limit?
>
> I've got as high as ~200 opened file descriptors for
> ~2GB data of system wide monitoring
Note that 200 open file descriptors in themselves are not a
scalability problem on Linux, as long as perf doesn't walk them
linearly anywhere.
I think we are reasonably fast even with a million open files in a
single process, or so.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2014-05-07 19:01 UTC | newest]
Thread overview: 24+ messages
2014-04-17 17:39 [PATCH 0/3] perf tools: Speedup DWARF unwind Jiri Olsa
2014-04-17 17:39 ` [PATCH 1/3] perf tools: Cache register accesses for unwind processing Jiri Olsa
2014-04-27 14:29 ` Namhyung Kim
2014-04-28 9:48 ` Jiri Olsa
2014-04-28 13:02 ` Namhyung Kim
2014-04-28 13:24 ` Jiri Olsa
2014-04-29 0:36 ` Namhyung Kim
2014-04-30 12:12 ` Jiri Olsa
2014-04-28 10:39 ` Christian Borntraeger
2014-04-28 11:00 ` Jiri Olsa
2014-04-17 17:39 ` [PATCH 2/3] perf tools: Cache dso data file descriptor Jiri Olsa
2014-04-27 14:36 ` Namhyung Kim
2014-04-28 10:01 ` Jiri Olsa
2014-04-28 13:16 ` Namhyung Kim
2014-04-28 13:34 ` Jiri Olsa
2014-04-28 14:57 ` David Ahern
2014-04-29 0:41 ` Namhyung Kim
2014-05-07 19:01 ` Ingo Molnar
2014-04-17 17:39 ` [PATCH 3/3] perf tools: Replace dso data cache with mapped data Jiri Olsa
2014-04-18 7:51 ` [PATCH 0/3] perf tools: Speedup DWARF unwind Ingo Molnar
2014-04-18 7:55 ` Ingo Molnar
2014-04-18 9:35 ` Jiri Olsa
2014-04-23 20:16 ` Jiri Olsa
2014-04-25 13:08 ` Jiri Olsa