linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC 00/14] perf pollfd v3
@ 2014-09-10 14:08 Arnaldo Carvalho de Melo
  2014-09-10 14:08 ` [PATCH 01/14] perf evlist: Introduce perf_evlist__filter_pollfd method Arnaldo Carvalho de Melo
                   ` (14 more replies)
  0 siblings, 15 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-10 14:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Borislav Petkov, Corey Ashford, David Ahern, Don Zickus,
	Frederic Weisbecker, Ingo Molnar, Jean Pihet, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Hi,

	Main different to what was discussed in v2[1]:

	Poll all event file descriptors, not just the ones mmaped.

	This is a bug that was present before, noticed by Adrian Hunter while
reviewing this patchkit, i.e. we were polling just the mmaped ones, because
then we would look at all the ring buffers when just one of the polled
descriptors was ready to read.

	But this could lead to delays in polling when one of the mmaped fds was
closed and we would then not be notified when another, associated to it by
means of PERF_EVENT_IOC_SET_OUTPUT, had events available.

	Also refcount the mmaps associated by means of PERF_EVENT_IOC_SET_OUTPUT,
so that we can properly unmap a ring buffer to which all its file descriptors
had been closed, for longer running apps.

	It is all available on my tree:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/fdarray.v2

	I kept the v3 for the series as I initially called it pollfd, but since the
class that grew out of it was named fdarray, thus the fdarray.v2 branch name.

	Please let me know if there are still any other problems, now back to
fixing up my last perf/core pull req wrt those external proggies problems.

- Arnaldo

[1] http://lkml.kernel.org/r/1409781604-16778-1-git-send-email-acme@kernel.org

Arnaldo Carvalho de Melo (14):
  perf evlist: Introduce perf_evlist__filter_pollfd method
  perf tests: Add test for perf_evlist__filter_pollfd()
  perf evlist: Monitor POLLERR and POLLHUP events too
  perf evlist: We need to poll all event file descriptors
  perf record: Filter out POLLHUP'ed file descriptors
  perf trace: Filter out POLLHUP'ed file descriptors
  perf evlist: Allow growing pollfd on add method
  perf tests: Add pollfd growing test
  perf kvm stat live: Use perf_evlist__add_pollfd() instead of local equivalent
  perf evlist: Introduce poll method for common code idiom
  tools lib api: Adopt fdarray class from perf's evlist
  perf evlist: Refcount mmaps
  tools lib fd array: Allow associating an integer cookie with each entry
  perf evlist: Unmap ring buffer when fd is nuked

 tools/lib/api/Makefile                    |   7 +-
 tools/lib/api/fd/array.c                  | 128 ++++++++++++++++++
 tools/lib/api/fd/array.h                  |  34 +++++
 tools/perf/Makefile.perf                  |   1 +
 tools/perf/builtin-kvm.c                  |  24 ++--
 tools/perf/builtin-record.c               |   5 +-
 tools/perf/builtin-top.c                  |   4 +-
 tools/perf/builtin-trace.c                |   3 +-
 tools/perf/tests/builtin-test.c           |   8 ++
 tools/perf/tests/evlist.c                 | 217 ++++++++++++++++++++++++++++++
 tools/perf/tests/open-syscall-tp-fields.c |   2 +-
 tools/perf/tests/perf-record.c            |   2 +-
 tools/perf/tests/task-exit.c              |   2 +-
 tools/perf/tests/tests.h                  |   2 +
 tools/perf/util/evlist.c                  |  77 +++++++++--
 tools/perf/util/evlist.h                  |  16 ++-
 tools/perf/util/python.c                  |   6 +-
 17 files changed, 499 insertions(+), 39 deletions(-)
 create mode 100644 tools/lib/api/fd/array.c
 create mode 100644 tools/lib/api/fd/array.h
 create mode 100644 tools/perf/tests/evlist.c

-- 
1.9.3


^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 01/14] perf evlist: Introduce perf_evlist__filter_pollfd method
  2014-09-10 14:08 [RFC 00/14] perf pollfd v3 Arnaldo Carvalho de Melo
@ 2014-09-10 14:08 ` Arnaldo Carvalho de Melo
  2014-09-10 14:08 ` [PATCH 02/14] perf tests: Add test for perf_evlist__filter_pollfd() Arnaldo Carvalho de Melo
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-10 14:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	David Ahern, Don Zickus, Frederic Weisbecker, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

From: Arnaldo Carvalho de Melo <acme@redhat.com>

To remove all entries in evlist->pollfd[] that have revents matching at
least one of the bits in the specified mask.

It'll adjust evlist->nr_fds to the number of unfiltered fds and will
return this value, as a convenience and to avoid requiring direct access
to internal state of perf_evlist objects.

This will be used after polling the evlist fds so that we remove fds
that were closed by the kernel.

Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-y2sca7z3wicvvy40a50lozwm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 21 +++++++++++++++++++++
 tools/perf/util/evlist.h |  2 ++
 2 files changed, 23 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index a3e28b49128a..023bc3873ae9 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -428,6 +428,27 @@ void perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 	evlist->nr_fds++;
 }
 
+int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mask)
+{
+	int fd, nr_fds = 0;
+
+	if (evlist->nr_fds == 0)
+		return 0;
+
+	for (fd = 0; fd < evlist->nr_fds; ++fd) {
+		if (evlist->pollfd[fd].revents & revents_and_mask)
+			continue;
+
+		if (fd != nr_fds)
+			evlist->pollfd[nr_fds] = evlist->pollfd[fd];
+
+		++nr_fds;
+	}
+
+	evlist->nr_fds = nr_fds;
+	return nr_fds;
+}
+
 static void perf_evlist__id_hash(struct perf_evlist *evlist,
 				 struct perf_evsel *evsel,
 				 int cpu, int thread, u64 id)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 106de53a6a74..1082420951f9 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -84,6 +84,8 @@ void perf_evlist__id_add(struct perf_evlist *evlist, struct perf_evsel *evsel,
 
 void perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd);
 
+int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mask);
+
 struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist *evlist, u64 id);
 
 struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id);
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 02/14] perf tests: Add test for perf_evlist__filter_pollfd()
  2014-09-10 14:08 [RFC 00/14] perf pollfd v3 Arnaldo Carvalho de Melo
  2014-09-10 14:08 ` [PATCH 01/14] perf evlist: Introduce perf_evlist__filter_pollfd method Arnaldo Carvalho de Melo
@ 2014-09-10 14:08 ` Arnaldo Carvalho de Melo
  2014-09-10 14:08 ` [PATCH 03/14] perf evlist: Monitor POLLERR and POLLHUP events too Arnaldo Carvalho de Melo
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-10 14:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	David Ahern, Don Zickus, Frederic Weisbecker, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

From: Arnaldo Carvalho de Melo <acme@redhat.com>

That will use a synthetic evlist with just what is touched by this new
method to check that it works as expected.

Output in verbose mode:

  $ perf test -v pollfd
  33: Filter fds with revents mask in a pollfd array         :
  --- start ---
  filtering all but pollfd[2]:
  before:   5 [ 5, 4, 3, 2, 1 ]
   after:   1 [ 3 ]
  filtering all but (pollfd[0], pollfd[3]):
  before:   5 [ 5, 4, 3, 2, 1 ]
   after:   2 [ 5, 2 ]
  test child finished with 0
  ---- end ----
  Filter fds with revents mask in a pollfd array: Ok
  $

Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-x7c8liszdvc3ocmanf2cet8p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Makefile.perf        |   1 +
 tools/perf/tests/builtin-test.c |   4 ++
 tools/perf/tests/evlist.c       | 103 ++++++++++++++++++++++++++++++++++++++++
 tools/perf/tests/tests.h        |   1 +
 4 files changed, 109 insertions(+)
 create mode 100644 tools/perf/tests/evlist.c

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 95e832b1bc3c..9ce194fc00a0 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -399,6 +399,7 @@ LIB_OBJS += $(OUTPUT)tests/open-syscall-tp-fields.o
 LIB_OBJS += $(OUTPUT)tests/mmap-basic.o
 LIB_OBJS += $(OUTPUT)tests/perf-record.o
 LIB_OBJS += $(OUTPUT)tests/rdpmc.o
+LIB_OBJS += $(OUTPUT)tests/evlist.o
 LIB_OBJS += $(OUTPUT)tests/evsel-roundtrip-name.o
 LIB_OBJS += $(OUTPUT)tests/evsel-tp-sched.o
 LIB_OBJS += $(OUTPUT)tests/pmu.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 6a4145e5ad2c..41e556edbe02 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -158,6 +158,10 @@ static struct test {
 		.func = test__switch_tracking,
 	},
 	{
+		.desc = "Filter fds with revents mask in a pollfd array",
+		.func = test__perf_evlist__filter_pollfd,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/evlist.c b/tools/perf/tests/evlist.c
new file mode 100644
index 000000000000..77579158f4d9
--- /dev/null
+++ b/tools/perf/tests/evlist.c
@@ -0,0 +1,103 @@
+#include "util/evlist.h"
+#include "util/debug.h"
+#include "tests/tests.h"
+
+static void perf_evlist__init_pollfd(struct perf_evlist *evlist,
+				     int nr_fds_alloc, short revents)
+{
+	int fd;
+
+	evlist->nr_fds = nr_fds_alloc;
+
+	for (fd = 0; fd < nr_fds_alloc; ++fd) {
+		evlist->pollfd[fd].fd	   = nr_fds_alloc - fd;
+		evlist->pollfd[fd].revents = revents;
+	}
+}
+
+static int perf_evlist__fprintf_pollfd(struct perf_evlist *evlist,
+				       const char *prefix, FILE *fp)
+{
+	int printed = 0, fd;
+
+	if (!verbose)
+		return 0;
+
+	printed += fprintf(fp, "\n%s: %3d [ ", prefix, evlist->nr_fds);
+	for (fd = 0; fd < evlist->nr_fds; ++fd)
+		printed += fprintf(fp, "%s%d", fd ? ", " : "", evlist->pollfd[fd].fd);
+	printed += fprintf(fp, " ]");
+	return printed;
+}
+
+int test__perf_evlist__filter_pollfd(void)
+{
+	const int nr_fds_alloc = 5;
+	int nr_fds, expected_fd[2], fd;
+	struct pollfd pollfd[nr_fds_alloc];
+	struct perf_evlist evlist_alloc = {
+		.pollfd	= pollfd,
+	}, *evlist = &evlist_alloc;
+
+	perf_evlist__init_pollfd(evlist, nr_fds_alloc, POLLIN);
+	nr_fds = perf_evlist__filter_pollfd(evlist, POLLHUP);
+	if (nr_fds != nr_fds_alloc) {
+		pr_debug("\nperf_evlist__filter_pollfd()=%d != %d shouldn't have filtered anything",
+			 nr_fds, nr_fds_alloc);
+		return TEST_FAIL;
+	}
+
+	perf_evlist__init_pollfd(evlist, nr_fds_alloc, POLLHUP);
+	nr_fds = perf_evlist__filter_pollfd(evlist, POLLHUP);
+	if (nr_fds != 0) {
+		pr_debug("\nperf_evlist__filter_pollfd()=%d != %d, should have filtered all fds",
+			 nr_fds, nr_fds_alloc);
+		return TEST_FAIL;
+	}
+
+	perf_evlist__init_pollfd(evlist, nr_fds_alloc, POLLHUP);
+	pollfd[2].revents = POLLIN;
+	expected_fd[0] = pollfd[2].fd;
+
+	pr_debug("\nfiltering all but pollfd[2]:");
+	perf_evlist__fprintf_pollfd(evlist, "before", stderr);
+	nr_fds = perf_evlist__filter_pollfd(evlist, POLLHUP);
+	perf_evlist__fprintf_pollfd(evlist, " after", stderr);
+	if (nr_fds != 1) {
+		pr_debug("\nperf_evlist__filter_pollfd()=%d != 1, should have left just one event",
+			 nr_fds);
+		return TEST_FAIL;
+	}
+
+	if (pollfd[0].fd != expected_fd[0]) {
+		pr_debug("\npollfd[0].fd=%d != %d\n", pollfd[0].fd, expected_fd[0]);
+		return TEST_FAIL;
+	}
+
+	perf_evlist__init_pollfd(evlist, nr_fds_alloc, POLLHUP);
+	pollfd[0].revents = POLLIN;
+	expected_fd[0] = pollfd[0].fd;
+	pollfd[3].revents = POLLIN;
+	expected_fd[1] = pollfd[3].fd;
+
+	pr_debug("\nfiltering all but (pollfd[0], pollfd[3]):");
+	perf_evlist__fprintf_pollfd(evlist, "before", stderr);
+	nr_fds = perf_evlist__filter_pollfd(evlist, POLLHUP);
+	perf_evlist__fprintf_pollfd(evlist, " after", stderr);
+	if (nr_fds != 2) {
+		pr_debug("\nperf_evlist__filter_pollfd()=%d != 2, should have left just two events",
+			 nr_fds);
+		return TEST_FAIL;
+	}
+
+	for (fd = 0; fd < 2; ++fd) {
+		if (pollfd[fd].fd != expected_fd[fd]) {
+			pr_debug("\npollfd[%d].fd=%d != %d\n", fd, pollfd[fd].fd, expected_fd[fd]);
+			return TEST_FAIL;
+		}
+	}
+
+	pr_debug("\n");
+
+	return 0;
+}
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index be8be10e3957..72c4c039bd80 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -49,6 +49,7 @@ int test__thread_mg_share(void);
 int test__hists_output(void);
 int test__hists_cumulate(void);
 int test__switch_tracking(void);
+int test__perf_evlist__filter_pollfd(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 03/14] perf evlist: Monitor POLLERR and POLLHUP events too
  2014-09-10 14:08 [RFC 00/14] perf pollfd v3 Arnaldo Carvalho de Melo
  2014-09-10 14:08 ` [PATCH 01/14] perf evlist: Introduce perf_evlist__filter_pollfd method Arnaldo Carvalho de Melo
  2014-09-10 14:08 ` [PATCH 02/14] perf tests: Add test for perf_evlist__filter_pollfd() Arnaldo Carvalho de Melo
@ 2014-09-10 14:08 ` Arnaldo Carvalho de Melo
  2014-09-10 14:08 ` [PATCH 04/14] perf evlist: We need to poll all event file descriptors Arnaldo Carvalho de Melo
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-10 14:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	David Ahern, Don Zickus, Frederic Weisbecker, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

From: Arnaldo Carvalho de Melo <acme@redhat.com>

We want to know when the fd went away, like when a monitored thread
exits.

If we do not monitor such events, then the tools will wait forever on
events from a vanished thread, like when running:

 $ sleep 5s &
 $ perf record -p `pidof sleep`

This builds upon the kernel patch by Jiri Olsa that actually makes a
poll on those file descriptors to return POLLHUP.

It is also needed to change the tools to use
perf_evlist__filter_pollfd() to check if there are remainings fds to
monitor or if all are gone, in which case they will exit the
poll/mmap/read loop.

Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-a4fslwspov0bs69nj825hqpq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 023bc3873ae9..502cd11ab17e 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -424,7 +424,7 @@ void perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 {
 	fcntl(fd, F_SETFL, O_NONBLOCK);
 	evlist->pollfd[evlist->nr_fds].fd = fd;
-	evlist->pollfd[evlist->nr_fds].events = POLLIN;
+	evlist->pollfd[evlist->nr_fds].events = POLLIN | POLLERR | POLLHUP;
 	evlist->nr_fds++;
 }
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 04/14] perf evlist: We need to poll all event file descriptors
  2014-09-10 14:08 [RFC 00/14] perf pollfd v3 Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2014-09-10 14:08 ` [PATCH 03/14] perf evlist: Monitor POLLERR and POLLHUP events too Arnaldo Carvalho de Melo
@ 2014-09-10 14:08 ` Arnaldo Carvalho de Melo
  2014-09-10 14:08 ` [PATCH 05/14] perf record: Filter out POLLHUP'ed " Arnaldo Carvalho de Melo
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-10 14:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	David Ahern, Don Zickus, Frederic Weisbecker, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Because we want to notice when they get POLLHUP'ed, so that we can
figure out when all threads exited in a workload being monitored.

We can't just monitor the fds that were mmaped, we need to notice when
all the fds that were PERF_EVENT_IOC_SET_OUTPUT'ed too, because the mmap
stays even after the fd that originally was used to do the mmap call
went away, its only when all the set-output fds for a mmap are gone that
the mmap is.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20140908151016.GH17728@krava.brq.redhat.com
Link: http://lkml.kernel.org/n/tip-24omlq5asrfg4uo3muuzn2bl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 502cd11ab17e..6b13bfa7ac2c 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -717,8 +717,6 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 		evlist->mmap[idx].base = NULL;
 		return -1;
 	}
-
-	perf_evlist__add_pollfd(evlist, fd);
 	return 0;
 }
 
@@ -745,6 +743,8 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				return -1;
 		}
 
+		perf_evlist__add_pollfd(evlist, fd);
+
 		if ((evsel->attr.read_format & PERF_FORMAT_ID) &&
 		    perf_evlist__id_add_fd(evlist, evsel, cpu, thread, fd) < 0)
 			return -1;
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 05/14] perf record: Filter out POLLHUP'ed file descriptors
  2014-09-10 14:08 [RFC 00/14] perf pollfd v3 Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2014-09-10 14:08 ` [PATCH 04/14] perf evlist: We need to poll all event file descriptors Arnaldo Carvalho de Melo
@ 2014-09-10 14:08 ` Arnaldo Carvalho de Melo
  2014-09-10 14:08 ` [PATCH 06/14] perf trace: " Arnaldo Carvalho de Melo
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-10 14:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	David Ahern, Don Zickus, Frederic Weisbecker, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

From: Arnaldo Carvalho de Melo <acme@redhat.com>

So that we don't continue polling on vanished file descriptors, i.e.
file descriptors for events monitoring threads that exited.

I.e. the following 'perf record' command now exits as expected, instead
of staying in an eternal loop:

  $ sleep 5s &
  $ perf record -p `pidof sleep`

Reported-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ftg46awsyfjc3axb0xm63oig@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 87e28a4e33ba..b87708ce09c9 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -467,6 +467,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 			if (err > 0 || (err < 0 && errno == EINTR))
 				err = 0;
 			waking++;
+
+			if (perf_evlist__filter_pollfd(rec->evlist, POLLERR | POLLHUP) == 0)
+				done = 1;
 		}
 
 		/*
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 06/14] perf trace: Filter out POLLHUP'ed file descriptors
  2014-09-10 14:08 [RFC 00/14] perf pollfd v3 Arnaldo Carvalho de Melo
                   ` (4 preceding siblings ...)
  2014-09-10 14:08 ` [PATCH 05/14] perf record: Filter out POLLHUP'ed " Arnaldo Carvalho de Melo
@ 2014-09-10 14:08 ` Arnaldo Carvalho de Melo
  2014-09-10 14:08 ` [PATCH 07/14] perf evlist: Allow growing pollfd on add method Arnaldo Carvalho de Melo
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-10 14:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	David Ahern, Don Zickus, Frederic Weisbecker, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

From: Arnaldo Carvalho de Melo <acme@redhat.com>

So that we don't continue polling on vanished file descriptors, i.e.
file descriptors for events monitoring threads that exited.

I.e. the following 'trace' command now exits as expected, instead
of staying in an eternal loop:

      $ sleep 5s &
      $ trace -p `pidof sleep`

Reported-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-fhn2pyf8sqhoiwu42hxq9yq2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index a9e96ff49c7f..09d4bc44215e 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2171,7 +2171,8 @@ next_event:
 	if (trace->nr_events == before) {
 		int timeout = done ? 100 : -1;
 
-		if (poll(evlist->pollfd, evlist->nr_fds, timeout) > 0)
+		if (poll(evlist->pollfd, evlist->nr_fds, timeout) > 0 &&
+		    perf_evlist__filter_pollfd(evlist, POLLERR | POLLHUP) > 0)
 			goto again;
 	} else {
 		goto again;
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 07/14] perf evlist: Allow growing pollfd on add method
  2014-09-10 14:08 [RFC 00/14] perf pollfd v3 Arnaldo Carvalho de Melo
                   ` (5 preceding siblings ...)
  2014-09-10 14:08 ` [PATCH 06/14] perf trace: " Arnaldo Carvalho de Melo
@ 2014-09-10 14:08 ` Arnaldo Carvalho de Melo
  2014-09-10 14:08 ` [PATCH 08/14] perf tests: Add pollfd growing test Arnaldo Carvalho de Melo
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-10 14:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jean Pihet, Jiri Olsa, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

From: Arnaldo Carvalho de Melo <acme@redhat.com>

This way we will be able to add more file descriptors to be polled,
like stdin or some timer fd.

At this point we might as well yank the pollfd class from evlist so that
it can be used in other places.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-o2mzsjl7taumsoc35ryol00i@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 38 +++++++++++++++++++++++++++++++++-----
 tools/perf/util/evlist.h |  5 +++--
 2 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 6b13bfa7ac2c..bcf157c8a9da 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -402,7 +402,21 @@ int perf_evlist__enable_event_idx(struct perf_evlist *evlist,
 		return perf_evlist__enable_event_thread(evlist, evsel, idx);
 }
 
-static int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
+static int perf_evlist__grow_pollfd(struct perf_evlist *evlist, int hint)
+{
+	int nr_fds_alloc = evlist->nr_fds_alloc + hint;
+	size_t size = sizeof(struct pollfd) * nr_fds_alloc;
+	struct pollfd *pollfd = realloc(evlist->pollfd, size);
+
+	if (pollfd == NULL)
+		return -ENOMEM;
+
+	evlist->nr_fds_alloc = nr_fds_alloc;
+	evlist->pollfd	     = pollfd;
+	return 0;
+}
+
+int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 {
 	int nr_cpus = cpu_map__nr(evlist->cpus);
 	int nr_threads = thread_map__nr(evlist->threads);
@@ -416,16 +430,28 @@ static int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 			nfds += nr_cpus * nr_threads;
 	}
 
-	evlist->pollfd = malloc(sizeof(struct pollfd) * nfds);
-	return evlist->pollfd != NULL ? 0 : -ENOMEM;
+	if (evlist->nr_fds_alloc - evlist->nr_fds < nfds &&
+	    perf_evlist__grow_pollfd(evlist, nfds) < 0)
+		return -ENOMEM;
+
+	return 0;
 }
 
-void perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
+int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 {
+	/*
+	 * XXX: 64 is arbitrary, just not to call realloc at each fd.
+	 *	Find a better autogrowing heuristic
+	 */
+	if (evlist->nr_fds == evlist->nr_fds_alloc &&
+	    perf_evlist__grow_pollfd(evlist, 64) < 0)
+		return -ENOMEM;
+
 	fcntl(fd, F_SETFL, O_NONBLOCK);
 	evlist->pollfd[evlist->nr_fds].fd = fd;
 	evlist->pollfd[evlist->nr_fds].events = POLLIN | POLLERR | POLLHUP;
 	evlist->nr_fds++;
+	return 0;
 }
 
 int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mask)
@@ -717,6 +743,7 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 		evlist->mmap[idx].base = NULL;
 		return -1;
 	}
+
 	return 0;
 }
 
@@ -743,7 +770,8 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				return -1;
 		}
 
-		perf_evlist__add_pollfd(evlist, fd);
+		if (perf_evlist__add_pollfd(evlist, fd) < 0)
+			return -1;
 
 		if ((evsel->attr.read_format & PERF_FORMAT_ID) &&
 		    perf_evlist__id_add_fd(evlist, evsel, cpu, thread, fd) < 0)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 1082420951f9..bbc2fd01b5c5 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -30,6 +30,7 @@ struct perf_evlist {
 	int		 nr_entries;
 	int		 nr_groups;
 	int		 nr_fds;
+	int		 nr_fds_alloc;
 	int		 nr_mmaps;
 	size_t		 mmap_len;
 	int		 id_pos;
@@ -82,8 +83,8 @@ perf_evlist__find_tracepoint_by_name(struct perf_evlist *evlist,
 void perf_evlist__id_add(struct perf_evlist *evlist, struct perf_evsel *evsel,
 			 int cpu, int thread, u64 id);
 
-void perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd);
-
+int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd);
+int perf_evlist__alloc_pollfd(struct perf_evlist *evlist);
 int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mask);
 
 struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist *evlist, u64 id);
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 08/14] perf tests: Add pollfd growing test
  2014-09-10 14:08 [RFC 00/14] perf pollfd v3 Arnaldo Carvalho de Melo
                   ` (6 preceding siblings ...)
  2014-09-10 14:08 ` [PATCH 07/14] perf evlist: Allow growing pollfd on add method Arnaldo Carvalho de Melo
@ 2014-09-10 14:08 ` Arnaldo Carvalho de Melo
  2014-09-10 14:08 ` [PATCH 09/14] perf kvm stat live: Use perf_evlist__add_pollfd() instead of local equivalent Arnaldo Carvalho de Melo
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-10 14:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jean Pihet, Jiri Olsa, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

From: Arnaldo Carvalho de Melo <acme@redhat.com>

  [acme@ssdandy linux]$ perf test "Add fd"
  34: Add fd to pollfd array, making it autogrow             : Ok
  [acme@ssdandy linux]$ perf test -v "Add fd"
  34: Add fd to pollfd array, making it autogrow             :
  --- start ---
  test child forked, pid 19817

  before growing array:   2 [ 1, 2 ]
  after 3rd add_pollfd:   3 [ 1, 2, 35 ]
  after 4th add_pollfd:   4 [ 1, 2, 35, 88 ]
  test child finished with 0
  ---- end ----
  Add fd to pollfd array, making it autogrow: Ok
  [acme@ssdandy linux]$

Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-smflpyta146bzog7z0effjss@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/builtin-test.c |   4 ++
 tools/perf/tests/evlist.c       | 111 ++++++++++++++++++++++++++++++++++++++++
 tools/perf/tests/tests.h        |   1 +
 3 files changed, 116 insertions(+)

diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 41e556edbe02..174c3ffc5713 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -162,6 +162,10 @@ static struct test {
 		.func = test__perf_evlist__filter_pollfd,
 	},
 	{
+		.desc = "Add fd to pollfd array, making it autogrow",
+		.func = test__perf_evlist__add_pollfd,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/evlist.c b/tools/perf/tests/evlist.c
index 77579158f4d9..99d7dfd4e20a 100644
--- a/tools/perf/tests/evlist.c
+++ b/tools/perf/tests/evlist.c
@@ -1,5 +1,7 @@
 #include "util/evlist.h"
 #include "util/debug.h"
+#include "util/thread_map.h"
+#include "util/cpumap.h"
 #include "tests/tests.h"
 
 static void perf_evlist__init_pollfd(struct perf_evlist *evlist,
@@ -101,3 +103,112 @@ int test__perf_evlist__filter_pollfd(void)
 
 	return 0;
 }
+
+int test__perf_evlist__add_pollfd(void)
+{
+	struct perf_evsel evsel = {
+		.system_wide = false,
+	};
+	struct thread_map threads = {
+		.nr = 2,
+	};
+	struct perf_evlist evlist_alloc = {
+		.pollfd	 = NULL,
+		.threads = &threads,
+	}, *evlist = &evlist_alloc;
+
+	INIT_LIST_HEAD(&evlist->entries);
+	list_add(&evsel.node, &evlist->entries);
+
+	if (perf_evlist__alloc_pollfd(evlist) < 0) {
+		pr_debug("\nperf_evlist__alloc_pollfd(evlist) failed!");
+		return TEST_FAIL;
+	}
+
+	if (evlist->nr_fds_alloc != threads.nr) {
+		pr_debug("\n_evlist__alloc_pollfd: nr_fds_alloc=%d != (threads->nr(%d) * cpu_map->nr(%d))=%d",
+			 evlist->nr_fds_alloc, thread_map__nr(evlist->threads), cpu_map__nr(evlist->cpus),
+			 thread_map__nr(evlist->threads) * cpu_map__nr(evlist->cpus));
+		return TEST_FAIL;
+	}
+
+	if (perf_evlist__add_pollfd(evlist, 1) < 0) {
+		pr_debug("\nperf_evlist__add_pollfd(evlist, 1) failed!");
+		return TEST_FAIL;
+	}
+
+	if (evlist->nr_fds != 1) {
+		pr_debug("\nperf_evlist__add_pollfd(evlist, 1)=%d != 1", evlist->nr_fds);
+		return TEST_FAIL;
+	}
+
+	if (perf_evlist__add_pollfd(evlist, 2) < 0) {
+		pr_debug("\nperf_evlist__add_pollfd(evlist, 2) failed!");
+		return TEST_FAIL;
+	}
+
+	if (evlist->nr_fds != 2) {
+		pr_debug("\nperf_evlist__add_pollfd(evlist, 2)=%d != 2", evlist->nr_fds);
+		return TEST_FAIL;
+	}
+
+	perf_evlist__fprintf_pollfd(evlist, "before growing array", stderr);
+
+	if (perf_evlist__add_pollfd(evlist, 35) < 0) {
+		pr_debug("\nperf_evlist__add_pollfd(evlist, 35) failed!");
+		return TEST_FAIL;
+	}
+
+	if (evlist->nr_fds != 3) {
+		pr_debug("\nperf_evlist__add_pollfd(evlist, 35)=%d != 3", evlist->nr_fds);
+		return TEST_FAIL;
+	}
+
+	if (evlist->pollfd == NULL) {
+		pr_debug("\nperf_evlist__add_pollfd(evlist, 35) should have allocated evlist->pollfd!");
+		return TEST_FAIL;
+	}
+
+	perf_evlist__fprintf_pollfd(evlist, "after 3rd add_pollfd", stderr);
+
+	if (evlist->pollfd[2].fd != 35) {
+		pr_debug("\nevlist->pollfd[2](%d) != 35!", evlist->pollfd[2].fd);
+		return TEST_FAIL;
+	}
+
+	if (perf_evlist__add_pollfd(evlist, 88) < 0) {
+		pr_debug("\nperf_evlist__add_pollfd(evlist, 88) failed!");
+		return TEST_FAIL;
+	}
+
+	if (evlist->nr_fds != 4) {
+		pr_debug("\nperf_evlist__add_pollfd(evlist, 88)=%d != 2", evlist->nr_fds);
+		return TEST_FAIL;
+	}
+
+	perf_evlist__fprintf_pollfd(evlist, "after 4th add_pollfd", stderr);
+
+	if (evlist->pollfd[0].fd != 1) {
+		pr_debug("\nevlist->pollfd[0](%d) != 1!", evlist->pollfd[0].fd);
+		return TEST_FAIL;
+	}
+
+	if (evlist->pollfd[1].fd != 2) {
+		pr_debug("\nevlist->pollfd[1](%d) != 2!", evlist->pollfd[1].fd);
+		return TEST_FAIL;
+	}
+
+	if (evlist->pollfd[2].fd != 35) {
+		pr_debug("\nevlist->pollfd[2](%d) != 35!", evlist->pollfd[2].fd);
+		return TEST_FAIL;
+	}
+
+	if (evlist->pollfd[3].fd != 88) {
+		pr_debug("\nevlist->pollfd[3](%d) != 88!", evlist->pollfd[3].fd);
+		return TEST_FAIL;
+	}
+
+	pr_debug("\n");
+
+	return 0;
+}
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 72c4c039bd80..699d4bb61dc7 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -50,6 +50,7 @@ int test__hists_output(void);
 int test__hists_cumulate(void);
 int test__switch_tracking(void);
 int test__perf_evlist__filter_pollfd(void);
+int test__perf_evlist__add_pollfd(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 09/14] perf kvm stat live: Use perf_evlist__add_pollfd() instead of local equivalent
  2014-09-10 14:08 [RFC 00/14] perf pollfd v3 Arnaldo Carvalho de Melo
                   ` (7 preceding siblings ...)
  2014-09-10 14:08 ` [PATCH 08/14] perf tests: Add pollfd growing test Arnaldo Carvalho de Melo
@ 2014-09-10 14:08 ` Arnaldo Carvalho de Melo
  2014-09-10 14:08 ` [PATCH 10/14] perf evlist: Introduce poll method for common code idiom Arnaldo Carvalho de Melo
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-10 14:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jean Pihet, Jiri Olsa, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Since we can add file descriptors to the evlist pollfd and it will
autogrow, no need to copy all events to a local pollfd array, just add
the timer and stdin file descriptors.

Reviewed-by: David Ahern <dsahern@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-2hvp9iromiheh6rl4oaa08x5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-kvm.c | 22 +++++++++-------------
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index f5d3ae483110..a440219b0be0 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -919,15 +919,8 @@ static int kvm_events_live_report(struct perf_kvm_stat *kvm)
 	signal(SIGINT, sig_handler);
 	signal(SIGTERM, sig_handler);
 
-	/* copy pollfds -- need to add timerfd and stdin */
+	/* use pollfds -- need to add timerfd and stdin */
 	nr_fds = kvm->evlist->nr_fds;
-	pollfds = zalloc(sizeof(struct pollfd) * (nr_fds + 2));
-	if (!pollfds) {
-		err = -ENOMEM;
-		goto out;
-	}
-	memcpy(pollfds, kvm->evlist->pollfd,
-		sizeof(struct pollfd) * kvm->evlist->nr_fds);
 
 	/* add timer fd */
 	if (perf_kvm__timerfd_create(kvm) < 0) {
@@ -935,17 +928,21 @@ static int kvm_events_live_report(struct perf_kvm_stat *kvm)
 		goto out;
 	}
 
-	pollfds[nr_fds].fd = kvm->timerfd;
-	pollfds[nr_fds].events = POLLIN;
+	if (perf_evlist__add_pollfd(kvm->evlist, kvm->timerfd))
+		goto out;
+
 	nr_fds++;
 
-	pollfds[nr_fds].fd = fileno(stdin);
-	pollfds[nr_fds].events = POLLIN;
+	if (perf_evlist__add_pollfd(kvm->evlist, fileno(stdin)))
+		goto out;
+
 	nr_stdin = nr_fds;
 	nr_fds++;
 	if (fd_set_nonblock(fileno(stdin)) != 0)
 		goto out;
 
+	pollfds	 = kvm->evlist->pollfd;
+
 	/* everything is good - enable the events and process */
 	perf_evlist__enable(kvm->evlist);
 
@@ -979,7 +976,6 @@ out:
 		close(kvm->timerfd);
 
 	tcsetattr(0, TCSAFLUSH, &save);
-	free(pollfds);
 	return err;
 }
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 10/14] perf evlist: Introduce poll method for common code idiom
  2014-09-10 14:08 [RFC 00/14] perf pollfd v3 Arnaldo Carvalho de Melo
                   ` (8 preceding siblings ...)
  2014-09-10 14:08 ` [PATCH 09/14] perf kvm stat live: Use perf_evlist__add_pollfd() instead of local equivalent Arnaldo Carvalho de Melo
@ 2014-09-10 14:08 ` Arnaldo Carvalho de Melo
  2014-09-10 14:08 ` [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist Arnaldo Carvalho de Melo
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-10 14:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jean Pihet, Jiri Olsa, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Since we have access two evlist members in all these poll calls, provide
a helper.

This will also help to make the patch introducing the pollfd class more
clear, as the evlist specific uses will be hiden away
perf_evlist__poll().

Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-hizx410uqouefj52jujubv2y@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c               | 2 +-
 tools/perf/builtin-top.c                  | 4 ++--
 tools/perf/builtin-trace.c                | 2 +-
 tools/perf/tests/open-syscall-tp-fields.c | 2 +-
 tools/perf/tests/perf-record.c            | 2 +-
 tools/perf/tests/task-exit.c              | 2 +-
 tools/perf/util/evlist.c                  | 5 +++++
 tools/perf/util/evlist.h                  | 2 ++
 tools/perf/util/python.c                  | 2 +-
 9 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index b87708ce09c9..aac86600b89e 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -459,7 +459,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		if (hits == rec->samples) {
 			if (done)
 				break;
-			err = poll(rec->evlist->pollfd, rec->evlist->nr_fds, -1);
+			err = perf_evlist__poll(rec->evlist, -1);
 			/*
 			 * Propagate error, only if there's any. Ignore positive
 			 * number of returned events and interrupt error.
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 9848e270b92c..ef7c808c8ce2 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -964,7 +964,7 @@ static int __cmd_top(struct perf_top *top)
                 perf_evlist__enable(top->evlist);
 
 	/* Wait for a minimal set of events before starting the snapshot */
-	poll(top->evlist->pollfd, top->evlist->nr_fds, 100);
+	perf_evlist__poll(top->evlist, 100);
 
 	perf_top__mmap_read(top);
 
@@ -991,7 +991,7 @@ static int __cmd_top(struct perf_top *top)
 		perf_top__mmap_read(top);
 
 		if (hits == top->samples)
-			ret = poll(top->evlist->pollfd, top->evlist->nr_fds, 100);
+			ret = perf_evlist__poll(top->evlist, 100);
 	}
 
 	ret = 0;
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 09d4bc44215e..970aa0297697 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2171,7 +2171,7 @@ next_event:
 	if (trace->nr_events == before) {
 		int timeout = done ? 100 : -1;
 
-		if (poll(evlist->pollfd, evlist->nr_fds, timeout) > 0 &&
+		if (perf_evlist__poll(evlist, timeout) > 0 &&
 		    perf_evlist__filter_pollfd(evlist, POLLERR | POLLHUP) > 0)
 			goto again;
 	} else {
diff --git a/tools/perf/tests/open-syscall-tp-fields.c b/tools/perf/tests/open-syscall-tp-fields.c
index 922bdb627950..127dcae0b760 100644
--- a/tools/perf/tests/open-syscall-tp-fields.c
+++ b/tools/perf/tests/open-syscall-tp-fields.c
@@ -105,7 +105,7 @@ int test__syscall_open_tp_fields(void)
 		}
 
 		if (nr_events == before)
-			poll(evlist->pollfd, evlist->nr_fds, 10);
+			perf_evlist__poll(evlist, 10);
 
 		if (++nr_polls > 5) {
 			pr_debug("%s: no events!\n", __func__);
diff --git a/tools/perf/tests/perf-record.c b/tools/perf/tests/perf-record.c
index 2ce753c1db63..7a228a2a070b 100644
--- a/tools/perf/tests/perf-record.c
+++ b/tools/perf/tests/perf-record.c
@@ -268,7 +268,7 @@ int test__PERF_RECORD(void)
 		 * perf_event_attr.wakeup_events, just PERF_EVENT_SAMPLE does.
 		 */
 		if (total_events == before && false)
-			poll(evlist->pollfd, evlist->nr_fds, -1);
+			perf_evlist__poll(evlist, -1);
 
 		sleep(1);
 		if (++wakeups > 5) {
diff --git a/tools/perf/tests/task-exit.c b/tools/perf/tests/task-exit.c
index 87522f01c7ad..3a8fedef83bc 100644
--- a/tools/perf/tests/task-exit.c
+++ b/tools/perf/tests/task-exit.c
@@ -105,7 +105,7 @@ retry:
 	}
 
 	if (!exited || !nr_exit) {
-		poll(evlist->pollfd, evlist->nr_fds, -1);
+		perf_evlist__poll(evlist, -1);
 		goto retry;
 	}
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index bcf157c8a9da..5ff3c667542f 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -475,6 +475,11 @@ int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mas
 	return nr_fds;
 }
 
+int perf_evlist__poll(struct perf_evlist *evlist, int timeout)
+{
+	return poll(evlist->pollfd, evlist->nr_fds, timeout);
+}
+
 static void perf_evlist__id_hash(struct perf_evlist *evlist,
 				 struct perf_evsel *evsel,
 				 int cpu, int thread, u64 id)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index bbc2fd01b5c5..d7e99b67c94f 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -87,6 +87,8 @@ int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd);
 int perf_evlist__alloc_pollfd(struct perf_evlist *evlist);
 int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mask);
 
+int perf_evlist__poll(struct perf_evlist *evlist, int timeout);
+
 struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist *evlist, u64 id);
 
 struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id);
diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index 12aa9b0d0ba1..4472f8be8e35 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -736,7 +736,7 @@ static PyObject *pyrf_evlist__poll(struct pyrf_evlist *pevlist,
 	if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|i", kwlist, &timeout))
 		return NULL;
 
-	n = poll(evlist->pollfd, evlist->nr_fds, timeout);
+	n = perf_evlist__poll(evlist, timeout);
 	if (n < 0) {
 		PyErr_SetFromErrno(PyExc_OSError);
 		return NULL;
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist
  2014-09-10 14:08 [RFC 00/14] perf pollfd v3 Arnaldo Carvalho de Melo
                   ` (9 preceding siblings ...)
  2014-09-10 14:08 ` [PATCH 10/14] perf evlist: Introduce poll method for common code idiom Arnaldo Carvalho de Melo
@ 2014-09-10 14:08 ` Arnaldo Carvalho de Melo
  2014-09-11 15:09   ` [PATCH] tools lib fd array: Do not set fd as non blocking evlist Jiri Olsa
                     ` (2 more replies)
  2014-09-10 14:08 ` [PATCH 12/14] perf evlist: Refcount mmaps Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  14 siblings, 3 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-10 14:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Borislav Petkov, Corey Ashford, David Ahern, Frederic Weisbecker,
	Ingo Molnar, Jean Pihet, Jiri Olsa, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

From: Arnaldo Carvalho de Melo <acme@redhat.com>

The extensible file description array that grew in the perf_evlist class
can be useful for other tools, as it is not something that only evlists
need, so move it to tools/lib/api/fd to ease sharing it.

v2: Don't use {} like in:

 libapi_dirs:
	$(QUIET_MKDIR)mkdir -p $(OUTPUT){fs,fd}/

in Makefiles, as it will not work in some systems, as in ubuntu13.10.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-tvgqy1jubh6t2epumwgey3yo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/api/Makefile    |   7 ++-
 tools/lib/api/fd/array.c  | 108 ++++++++++++++++++++++++++++++++++++++++++++++
 tools/lib/api/fd/array.h  |  32 ++++++++++++++
 tools/perf/builtin-kvm.c  |   4 +-
 tools/perf/tests/evlist.c |  67 ++++++++++++++--------------
 tools/perf/util/evlist.c  |  57 ++++--------------------
 tools/perf/util/evlist.h  |   5 +--
 tools/perf/util/python.c  |   4 +-
 8 files changed, 195 insertions(+), 89 deletions(-)
 create mode 100644 tools/lib/api/fd/array.c
 create mode 100644 tools/lib/api/fd/array.h

diff --git a/tools/lib/api/Makefile b/tools/lib/api/Makefile
index ce00f7ee6455..36c08b1f4afb 100644
--- a/tools/lib/api/Makefile
+++ b/tools/lib/api/Makefile
@@ -10,9 +10,14 @@ LIB_OBJS=
 
 LIB_H += fs/debugfs.h
 LIB_H += fs/fs.h
+# See comment below about piggybacking...
+LIB_H += fd/array.h
 
 LIB_OBJS += $(OUTPUT)fs/debugfs.o
 LIB_OBJS += $(OUTPUT)fs/fs.o
+# XXX piggybacking here, need to introduce libapikfd, or rename this
+# to plain libapik.a and make it have it all api goodies
+LIB_OBJS += $(OUTPUT)fd/array.o
 
 LIBFILE = libapikfs.a
 
@@ -29,7 +34,7 @@ $(LIBFILE): $(LIB_OBJS)
 $(LIB_OBJS): $(LIB_H)
 
 libapi_dirs:
-	$(QUIET_MKDIR)mkdir -p $(OUTPUT)fs/
+	$(QUIET_MKDIR)mkdir -p $(OUTPUT)fd $(OUTPUT)fs
 
 $(OUTPUT)%.o: %.c libapi_dirs
 	$(QUIET_CC)$(CC) -o $@ -c $(ALL_CFLAGS) $<
diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
new file mode 100644
index 000000000000..bd923d41b450
--- /dev/null
+++ b/tools/lib/api/fd/array.c
@@ -0,0 +1,108 @@
+/*
+ * Copyright (C) 2014, Red Hat Inc, Arnaldo Carvalho de Melo <acme@redhat.com>
+ *
+ * Released under the GPL v2. (and only v2, not any later version)
+ */
+#include "array.h"
+#include <errno.h>
+#include <fcntl.h>
+#include <poll.h>
+#include <stdlib.h>
+#include <unistd.h>
+
+void fdarray__init(struct fdarray *fda, int nr_autogrow)
+{
+	fda->entries	 = NULL;
+	fda->nr		 = fda->nr_alloc = 0;
+	fda->nr_autogrow = nr_autogrow;
+}
+
+int fdarray__grow(struct fdarray *fda, int nr)
+{
+	int nr_alloc = fda->nr_alloc + nr;
+	size_t size  = sizeof(struct pollfd) * nr_alloc;
+	struct pollfd *entries = realloc(fda->entries, size);
+
+	if (entries == NULL)
+		return -ENOMEM;
+
+	fda->nr_alloc = nr_alloc;
+	fda->entries  = entries;
+	return 0;
+}
+
+struct fdarray *fdarray__new(int nr_alloc, int nr_autogrow)
+{
+	struct fdarray *fda = calloc(1, sizeof(*fda));
+
+	if (fda != NULL) {
+		if (fdarray__grow(fda, nr_alloc)) {
+			free(fda);
+			fda = NULL;
+		} else {
+			fda->nr_autogrow = nr_autogrow;
+		}
+	}
+
+	return fda;
+}
+
+void fdarray__exit(struct fdarray *fda)
+{
+	free(fda->entries);
+	fdarray__init(fda, 0);
+}
+
+void fdarray__delete(struct fdarray *fda)
+{
+	fdarray__exit(fda);
+	free(fda);
+}
+
+int fdarray__add(struct fdarray *fda, int fd, short revents)
+{
+	if (fda->nr == fda->nr_alloc &&
+	    fdarray__grow(fda, fda->nr_autogrow) < 0)
+		return -ENOMEM;
+
+	fcntl(fd, F_SETFL, O_NONBLOCK);
+	fda->entries[fda->nr].fd     = fd;
+	fda->entries[fda->nr].events = revents;
+	fda->nr++;
+	return 0;
+}
+
+int fdarray__filter(struct fdarray *fda, short revents)
+{
+	int fd, nr = 0;
+
+	if (fda->nr == 0)
+		return 0;
+
+	for (fd = 0; fd < fda->nr; ++fd) {
+		if (fda->entries[fd].revents & revents)
+			continue;
+
+		if (fd != nr)
+			fda->entries[nr] = fda->entries[fd];
+
+		++nr;
+	}
+
+	return fda->nr = nr;
+}
+
+int fdarray__poll(struct fdarray *fda, int timeout)
+{
+	return poll(fda->entries, fda->nr, timeout);
+}
+
+int fdarray__fprintf(struct fdarray *fda, FILE *fp)
+{
+	int fd, printed = fprintf(fp, "%d [ ", fda->nr);
+
+	for (fd = 0; fd < fda->nr; ++fd)
+		printed += fprintf(fp, "%s%d", fd ? ", " : "", fda->entries[fd].fd);
+
+	return printed + fprintf(fp, " ]");
+}
diff --git a/tools/lib/api/fd/array.h b/tools/lib/api/fd/array.h
new file mode 100644
index 000000000000..de38361ba69e
--- /dev/null
+++ b/tools/lib/api/fd/array.h
@@ -0,0 +1,32 @@
+#ifndef __API_FD_ARRAY__
+#define __API_FD_ARRAY__
+
+#include <stdio.h>
+
+struct pollfd;
+
+struct fdarray {
+	int	       nr;
+	int	       nr_alloc;
+	int	       nr_autogrow;
+	struct pollfd *entries;
+};
+
+void fdarray__init(struct fdarray *fda, int nr_autogrow);
+void fdarray__exit(struct fdarray *fda);
+
+struct fdarray *fdarray__new(int nr_alloc, int nr_autogrow);
+void fdarray__delete(struct fdarray *fda);
+
+int fdarray__add(struct fdarray *fda, int fd, short revents);
+int fdarray__poll(struct fdarray *fda, int timeout);
+int fdarray__filter(struct fdarray *fda, short revents);
+int fdarray__grow(struct fdarray *fda, int extra);
+int fdarray__fprintf(struct fdarray *fda, FILE *fp);
+
+static inline int fdarray__available_entries(struct fdarray *fda)
+{
+	return fda->nr_alloc - fda->nr;
+}
+
+#endif /* __API_FD_ARRAY__ */
diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index a440219b0be0..1e639d6265cc 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -920,7 +920,7 @@ static int kvm_events_live_report(struct perf_kvm_stat *kvm)
 	signal(SIGTERM, sig_handler);
 
 	/* use pollfds -- need to add timerfd and stdin */
-	nr_fds = kvm->evlist->nr_fds;
+	nr_fds = kvm->evlist->pollfd.nr;
 
 	/* add timer fd */
 	if (perf_kvm__timerfd_create(kvm) < 0) {
@@ -941,7 +941,7 @@ static int kvm_events_live_report(struct perf_kvm_stat *kvm)
 	if (fd_set_nonblock(fileno(stdin)) != 0)
 		goto out;
 
-	pollfds	 = kvm->evlist->pollfd;
+	pollfds	 = kvm->evlist->pollfd.entries;
 
 	/* everything is good - enable the events and process */
 	perf_evlist__enable(kvm->evlist);
diff --git a/tools/perf/tests/evlist.c b/tools/perf/tests/evlist.c
index 99d7dfd4e20a..fe3e118b25bd 100644
--- a/tools/perf/tests/evlist.c
+++ b/tools/perf/tests/evlist.c
@@ -8,28 +8,26 @@ static void perf_evlist__init_pollfd(struct perf_evlist *evlist,
 				     int nr_fds_alloc, short revents)
 {
 	int fd;
+	struct fdarray *fda = &evlist->pollfd;
 
-	evlist->nr_fds = nr_fds_alloc;
+	fda->nr = nr_fds_alloc;
 
 	for (fd = 0; fd < nr_fds_alloc; ++fd) {
-		evlist->pollfd[fd].fd	   = nr_fds_alloc - fd;
-		evlist->pollfd[fd].revents = revents;
+		fda->entries[fd].fd	 = nr_fds_alloc - fd;
+		fda->entries[fd].revents = revents;
 	}
 }
 
 static int perf_evlist__fprintf_pollfd(struct perf_evlist *evlist,
 				       const char *prefix, FILE *fp)
 {
-	int printed = 0, fd;
+	int printed = 0;
 
 	if (!verbose)
 		return 0;
 
-	printed += fprintf(fp, "\n%s: %3d [ ", prefix, evlist->nr_fds);
-	for (fd = 0; fd < evlist->nr_fds; ++fd)
-		printed += fprintf(fp, "%s%d", fd ? ", " : "", evlist->pollfd[fd].fd);
-	printed += fprintf(fp, " ]");
-	return printed;
+	printed += fprintf(fp, "\n%s: ", prefix);
+	return fdarray__fprintf(&evlist->pollfd, fp);
 }
 
 int test__perf_evlist__filter_pollfd(void)
@@ -38,7 +36,9 @@ int test__perf_evlist__filter_pollfd(void)
 	int nr_fds, expected_fd[2], fd;
 	struct pollfd pollfd[nr_fds_alloc];
 	struct perf_evlist evlist_alloc = {
-		.pollfd	= pollfd,
+		.pollfd	= {
+			.entries = pollfd,
+		},
 	}, *evlist = &evlist_alloc;
 
 	perf_evlist__init_pollfd(evlist, nr_fds_alloc, POLLIN);
@@ -113,9 +113,12 @@ int test__perf_evlist__add_pollfd(void)
 		.nr = 2,
 	};
 	struct perf_evlist evlist_alloc = {
-		.pollfd	 = NULL,
+		.pollfd	 = {
+			.entries = NULL,
+		},
 		.threads = &threads,
 	}, *evlist = &evlist_alloc;
+	struct fdarray *fda = &evlist->pollfd;
 
 	INIT_LIST_HEAD(&evlist->entries);
 	list_add(&evsel.node, &evlist->entries);
@@ -125,9 +128,9 @@ int test__perf_evlist__add_pollfd(void)
 		return TEST_FAIL;
 	}
 
-	if (evlist->nr_fds_alloc != threads.nr) {
+	if (fda->nr_alloc != threads.nr) {
 		pr_debug("\n_evlist__alloc_pollfd: nr_fds_alloc=%d != (threads->nr(%d) * cpu_map->nr(%d))=%d",
-			 evlist->nr_fds_alloc, thread_map__nr(evlist->threads), cpu_map__nr(evlist->cpus),
+			 fda->nr_alloc, thread_map__nr(evlist->threads), cpu_map__nr(evlist->cpus),
 			 thread_map__nr(evlist->threads) * cpu_map__nr(evlist->cpus));
 		return TEST_FAIL;
 	}
@@ -137,8 +140,8 @@ int test__perf_evlist__add_pollfd(void)
 		return TEST_FAIL;
 	}
 
-	if (evlist->nr_fds != 1) {
-		pr_debug("\nperf_evlist__add_pollfd(evlist, 1)=%d != 1", evlist->nr_fds);
+	if (fda->nr != 1) {
+		pr_debug("\nperf_evlist__add_pollfd(evlist, 1)=%d != 1", fda->nr);
 		return TEST_FAIL;
 	}
 
@@ -147,8 +150,8 @@ int test__perf_evlist__add_pollfd(void)
 		return TEST_FAIL;
 	}
 
-	if (evlist->nr_fds != 2) {
-		pr_debug("\nperf_evlist__add_pollfd(evlist, 2)=%d != 2", evlist->nr_fds);
+	if (fda->nr != 2) {
+		pr_debug("\nperf_evlist__add_pollfd(evlist, 2)=%d != 2", fda->nr);
 		return TEST_FAIL;
 	}
 
@@ -159,20 +162,20 @@ int test__perf_evlist__add_pollfd(void)
 		return TEST_FAIL;
 	}
 
-	if (evlist->nr_fds != 3) {
-		pr_debug("\nperf_evlist__add_pollfd(evlist, 35)=%d != 3", evlist->nr_fds);
+	if (fda->nr != 3) {
+		pr_debug("\nperf_evlist__add_pollfd(evlist, 35)=%d != 3", fda->nr);
 		return TEST_FAIL;
 	}
 
-	if (evlist->pollfd == NULL) {
+	if (fda->entries == NULL) {
 		pr_debug("\nperf_evlist__add_pollfd(evlist, 35) should have allocated evlist->pollfd!");
 		return TEST_FAIL;
 	}
 
 	perf_evlist__fprintf_pollfd(evlist, "after 3rd add_pollfd", stderr);
 
-	if (evlist->pollfd[2].fd != 35) {
-		pr_debug("\nevlist->pollfd[2](%d) != 35!", evlist->pollfd[2].fd);
+	if (fda->entries[2].fd != 35) {
+		pr_debug("\nfda->entries[2](%d) != 35!", fda->entries[2].fd);
 		return TEST_FAIL;
 	}
 
@@ -181,30 +184,30 @@ int test__perf_evlist__add_pollfd(void)
 		return TEST_FAIL;
 	}
 
-	if (evlist->nr_fds != 4) {
-		pr_debug("\nperf_evlist__add_pollfd(evlist, 88)=%d != 2", evlist->nr_fds);
+	if (fda->nr != 4) {
+		pr_debug("\nperf_evlist__add_pollfd(evlist, 88)=%d != 2", fda->nr);
 		return TEST_FAIL;
 	}
 
 	perf_evlist__fprintf_pollfd(evlist, "after 4th add_pollfd", stderr);
 
-	if (evlist->pollfd[0].fd != 1) {
-		pr_debug("\nevlist->pollfd[0](%d) != 1!", evlist->pollfd[0].fd);
+	if (fda->entries[0].fd != 1) {
+		pr_debug("\nfda->entries[0](%d) != 1!", fda->entries[0].fd);
 		return TEST_FAIL;
 	}
 
-	if (evlist->pollfd[1].fd != 2) {
-		pr_debug("\nevlist->pollfd[1](%d) != 2!", evlist->pollfd[1].fd);
+	if (fda->entries[1].fd != 2) {
+		pr_debug("\nfda->entries[1](%d) != 2!", fda->entries[1].fd);
 		return TEST_FAIL;
 	}
 
-	if (evlist->pollfd[2].fd != 35) {
-		pr_debug("\nevlist->pollfd[2](%d) != 35!", evlist->pollfd[2].fd);
+	if (fda->entries[2].fd != 35) {
+		pr_debug("\nfda->entries[2](%d) != 35!", fda->entries[2].fd);
 		return TEST_FAIL;
 	}
 
-	if (evlist->pollfd[3].fd != 88) {
-		pr_debug("\nevlist->pollfd[3](%d) != 88!", evlist->pollfd[3].fd);
+	if (fda->entries[3].fd != 88) {
+		pr_debug("\nfda->entries[3](%d) != 88!", fda->entries[3].fd);
 		return TEST_FAIL;
 	}
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 5ff3c667542f..477ede5367ee 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -37,6 +37,7 @@ void perf_evlist__init(struct perf_evlist *evlist, struct cpu_map *cpus,
 		INIT_HLIST_HEAD(&evlist->heads[i]);
 	INIT_LIST_HEAD(&evlist->entries);
 	perf_evlist__set_maps(evlist, cpus, threads);
+	fdarray__init(&evlist->pollfd, 64);
 	evlist->workload.pid = -1;
 }
 
@@ -102,7 +103,7 @@ static void perf_evlist__purge(struct perf_evlist *evlist)
 void perf_evlist__exit(struct perf_evlist *evlist)
 {
 	zfree(&evlist->mmap);
-	zfree(&evlist->pollfd);
+	fdarray__exit(&evlist->pollfd);
 }
 
 void perf_evlist__delete(struct perf_evlist *evlist)
@@ -402,20 +403,6 @@ int perf_evlist__enable_event_idx(struct perf_evlist *evlist,
 		return perf_evlist__enable_event_thread(evlist, evsel, idx);
 }
 
-static int perf_evlist__grow_pollfd(struct perf_evlist *evlist, int hint)
-{
-	int nr_fds_alloc = evlist->nr_fds_alloc + hint;
-	size_t size = sizeof(struct pollfd) * nr_fds_alloc;
-	struct pollfd *pollfd = realloc(evlist->pollfd, size);
-
-	if (pollfd == NULL)
-		return -ENOMEM;
-
-	evlist->nr_fds_alloc = nr_fds_alloc;
-	evlist->pollfd	     = pollfd;
-	return 0;
-}
-
 int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 {
 	int nr_cpus = cpu_map__nr(evlist->cpus);
@@ -430,8 +417,8 @@ int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 			nfds += nr_cpus * nr_threads;
 	}
 
-	if (evlist->nr_fds_alloc - evlist->nr_fds < nfds &&
-	    perf_evlist__grow_pollfd(evlist, nfds) < 0)
+	if (fdarray__available_entries(&evlist->pollfd) < nfds &&
+	    fdarray__grow(&evlist->pollfd, nfds) < 0)
 		return -ENOMEM;
 
 	return 0;
@@ -439,45 +426,17 @@ int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 
 int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 {
-	/*
-	 * XXX: 64 is arbitrary, just not to call realloc at each fd.
-	 *	Find a better autogrowing heuristic
-	 */
-	if (evlist->nr_fds == evlist->nr_fds_alloc &&
-	    perf_evlist__grow_pollfd(evlist, 64) < 0)
-		return -ENOMEM;
-
-	fcntl(fd, F_SETFL, O_NONBLOCK);
-	evlist->pollfd[evlist->nr_fds].fd = fd;
-	evlist->pollfd[evlist->nr_fds].events = POLLIN | POLLERR | POLLHUP;
-	evlist->nr_fds++;
-	return 0;
+	return fdarray__add(&evlist->pollfd, fd, POLLIN | POLLERR | POLLHUP);
 }
 
 int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mask)
 {
-	int fd, nr_fds = 0;
-
-	if (evlist->nr_fds == 0)
-		return 0;
-
-	for (fd = 0; fd < evlist->nr_fds; ++fd) {
-		if (evlist->pollfd[fd].revents & revents_and_mask)
-			continue;
-
-		if (fd != nr_fds)
-			evlist->pollfd[nr_fds] = evlist->pollfd[fd];
-
-		++nr_fds;
-	}
-
-	evlist->nr_fds = nr_fds;
-	return nr_fds;
+	return fdarray__filter(&evlist->pollfd, revents_and_mask);
 }
 
 int perf_evlist__poll(struct perf_evlist *evlist, int timeout)
 {
-	return poll(evlist->pollfd, evlist->nr_fds, timeout);
+	return fdarray__poll(&evlist->pollfd, timeout);
 }
 
 static void perf_evlist__id_hash(struct perf_evlist *evlist,
@@ -935,7 +894,7 @@ int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist) < 0)
 		return -ENOMEM;
 
-	if (evlist->pollfd == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
+	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
 		return -ENOMEM;
 
 	evlist->overwrite = overwrite;
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index d7e99b67c94f..fc013704d903 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -2,6 +2,7 @@
 #define __PERF_EVLIST_H 1
 
 #include <linux/list.h>
+#include <api/fd/array.h>
 #include <stdio.h>
 #include "../perf.h"
 #include "event.h"
@@ -29,8 +30,6 @@ struct perf_evlist {
 	struct hlist_head heads[PERF_EVLIST__HLIST_SIZE];
 	int		 nr_entries;
 	int		 nr_groups;
-	int		 nr_fds;
-	int		 nr_fds_alloc;
 	int		 nr_mmaps;
 	size_t		 mmap_len;
 	int		 id_pos;
@@ -41,8 +40,8 @@ struct perf_evlist {
 		pid_t	pid;
 	} workload;
 	bool		 overwrite;
+	struct fdarray	 pollfd;
 	struct perf_mmap *mmap;
-	struct pollfd	 *pollfd;
 	struct thread_map *threads;
 	struct cpu_map	  *cpus;
 	struct perf_evsel *selected;
diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index 4472f8be8e35..3dda85ca50c1 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -753,9 +753,9 @@ static PyObject *pyrf_evlist__get_pollfd(struct pyrf_evlist *pevlist,
         PyObject *list = PyList_New(0);
 	int i;
 
-	for (i = 0; i < evlist->nr_fds; ++i) {
+	for (i = 0; i < evlist->pollfd.nr; ++i) {
 		PyObject *file;
-		FILE *fp = fdopen(evlist->pollfd[i].fd, "r");
+		FILE *fp = fdopen(evlist->pollfd.entries[i].fd, "r");
 
 		if (fp == NULL)
 			goto free_list;
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 12/14] perf evlist: Refcount mmaps
  2014-09-10 14:08 [RFC 00/14] perf pollfd v3 Arnaldo Carvalho de Melo
                   ` (10 preceding siblings ...)
  2014-09-10 14:08 ` [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist Arnaldo Carvalho de Melo
@ 2014-09-10 14:08 ` Arnaldo Carvalho de Melo
  2014-09-10 14:08 ` [PATCH 13/14] tools lib fd array: Allow associating an integer cookie with each entry Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-10 14:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Borislav Petkov, Corey Ashford, David Ahern, Frederic Weisbecker,
	Ingo Molnar, Jean Pihet, Jiri Olsa, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

From: Arnaldo Carvalho de Melo <acme@redhat.com>

We need to know how many fds are using a perf mmap via
PERF_EVENT_IOC_SET_OUTPUT, so that we can know when to ditch an mmap,
refcount it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20140908153824.GG2773@kernel.org
Link: http://lkml.kernel.org/n/tip-9ymfcxmk52sb7u1g0i3aiovh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 21 ++++++++++++++++++++-
 tools/perf/util/evlist.h |  6 ++++++
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 477ede5367ee..662057a44126 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -664,6 +664,7 @@ static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
 	if (evlist->mmap[idx].base != NULL) {
 		munmap(evlist->mmap[idx].base, evlist->mmap_len);
 		evlist->mmap[idx].base = NULL;
+		evlist->mmap[idx].nfds = 0;
 	}
 }
 
@@ -697,6 +698,7 @@ struct mmap_params {
 static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 			       struct mmap_params *mp, int fd)
 {
+	evlist->mmap[idx].nfds = 1;
 	evlist->mmap[idx].prev = 0;
 	evlist->mmap[idx].mask = mp->mask;
 	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, mp->prot,
@@ -711,6 +713,19 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	return 0;
 }
 
+static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
+{
+	++evlist->mmap[idx].nfds;
+}
+
+static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
+{
+	BUG_ON(evlist->mmap[idx].nfds == 0);
+
+	if (--evlist->mmap[idx].nfds == 0)
+		__perf_evlist__munmap(evlist, idx);
+}
+
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu,
 				       int thread, int *output)
@@ -732,10 +747,14 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		} else {
 			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
 				return -1;
+
+			perf_evlist__mmap_get(evlist, idx);
 		}
 
-		if (perf_evlist__add_pollfd(evlist, fd) < 0)
+		if (perf_evlist__add_pollfd(evlist, fd) < 0) {
+			perf_evlist__mmap_put(evlist, idx);
 			return -1;
+		}
 
 		if ((evsel->attr.read_format & PERF_FORMAT_ID) &&
 		    perf_evlist__id_add_fd(evlist, evsel, cpu, thread, fd) < 0)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index fc013704d903..19bbe0410827 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -18,9 +18,15 @@ struct record_opts;
 #define PERF_EVLIST__HLIST_BITS 8
 #define PERF_EVLIST__HLIST_SIZE (1 << PERF_EVLIST__HLIST_BITS)
 
+/**
+ * struct perf_mmap - perf's ring buffer mmap details
+ *
+ * @nfds - file descriptors pointing here via PERF_EVENT_IOC_SET_OUTPUT
+ */
 struct perf_mmap {
 	void		 *base;
 	int		 mask;
+	int		 nfds;
 	unsigned int	 prev;
 	char		 event_copy[PERF_SAMPLE_MAX_SIZE];
 };
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 13/14] tools lib fd array: Allow associating an integer cookie with each entry
  2014-09-10 14:08 [RFC 00/14] perf pollfd v3 Arnaldo Carvalho de Melo
                   ` (11 preceding siblings ...)
  2014-09-10 14:08 ` [PATCH 12/14] perf evlist: Refcount mmaps Arnaldo Carvalho de Melo
@ 2014-09-10 14:08 ` Arnaldo Carvalho de Melo
  2014-09-11 10:33   ` Jiri Olsa
  2014-09-10 14:08 ` [PATCH 14/14] perf evlist: Unmap ring buffer when fd is nuked Arnaldo Carvalho de Melo
  2014-09-11 11:33 ` [RFC 00/14] perf pollfd v3 Jiri Olsa
  14 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-10 14:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Borislav Petkov, Corey Ashford, David Ahern, Frederic Weisbecker,
	Ingo Molnar, Jean Pihet, Jiri Olsa, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

From: Arnaldo Carvalho de Melo <acme@redhat.com>

We will use this in perf's evlist class so that it can, at
fdarray__filter() time, to unmap the associated ring buffer.

We may need to have further info associated with each fdarray entry, in
that case we'll make that int array a 'union fdarray_priv' one and put a
pointer there so that users can stash whatever they want there. For now,
an int is enough tho.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-ugusyqex32mc86hqh7cjdy3i@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/api/fd/array.c | 28 ++++++++++++++++++++++++----
 tools/lib/api/fd/array.h |  4 +++-
 tools/perf/util/evlist.c |  2 +-
 3 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
index bd923d41b450..3f6d1a03d195 100644
--- a/tools/lib/api/fd/array.c
+++ b/tools/lib/api/fd/array.c
@@ -13,21 +13,31 @@
 void fdarray__init(struct fdarray *fda, int nr_autogrow)
 {
 	fda->entries	 = NULL;
+	fda->priv	 = NULL;
 	fda->nr		 = fda->nr_alloc = 0;
 	fda->nr_autogrow = nr_autogrow;
 }
 
 int fdarray__grow(struct fdarray *fda, int nr)
 {
+	void *priv;
 	int nr_alloc = fda->nr_alloc + nr;
+	size_t psize = sizeof(fda->priv[0]) * nr_alloc;
 	size_t size  = sizeof(struct pollfd) * nr_alloc;
 	struct pollfd *entries = realloc(fda->entries, size);
 
 	if (entries == NULL)
 		return -ENOMEM;
 
+	priv = realloc(fda->priv, psize);
+	if (priv == NULL) {
+		free(entries);
+		return -ENOMEM;
+	}
+
 	fda->nr_alloc = nr_alloc;
 	fda->entries  = entries;
+	fda->priv     = priv;
 	return 0;
 }
 
@@ -50,6 +60,7 @@ struct fdarray *fdarray__new(int nr_alloc, int nr_autogrow)
 void fdarray__exit(struct fdarray *fda)
 {
 	free(fda->entries);
+	free(fda->priv);
 	fdarray__init(fda, 0);
 }
 
@@ -61,6 +72,8 @@ void fdarray__delete(struct fdarray *fda)
 
 int fdarray__add(struct fdarray *fda, int fd, short revents)
 {
+	int pos = fda->nr;
+
 	if (fda->nr == fda->nr_alloc &&
 	    fdarray__grow(fda, fda->nr_autogrow) < 0)
 		return -ENOMEM;
@@ -69,10 +82,11 @@ int fdarray__add(struct fdarray *fda, int fd, short revents)
 	fda->entries[fda->nr].fd     = fd;
 	fda->entries[fda->nr].events = revents;
 	fda->nr++;
-	return 0;
+	return pos;
 }
 
-int fdarray__filter(struct fdarray *fda, short revents)
+int fdarray__filter(struct fdarray *fda, short revents,
+		    void (*entry_destructor)(struct fdarray *fda, int fd))
 {
 	int fd, nr = 0;
 
@@ -80,11 +94,17 @@ int fdarray__filter(struct fdarray *fda, short revents)
 		return 0;
 
 	for (fd = 0; fd < fda->nr; ++fd) {
-		if (fda->entries[fd].revents & revents)
+		if (fda->entries[fd].revents & revents) {
+			if (entry_destructor)
+				entry_destructor(fda, fd);
+
 			continue;
+		}
 
-		if (fd != nr)
+		if (fd != nr) {
 			fda->entries[nr] = fda->entries[fd];
+			fda->priv[nr]	 = fda->priv[fd];
+		}
 
 		++nr;
 	}
diff --git a/tools/lib/api/fd/array.h b/tools/lib/api/fd/array.h
index de38361ba69e..7b2870a96898 100644
--- a/tools/lib/api/fd/array.h
+++ b/tools/lib/api/fd/array.h
@@ -10,6 +10,7 @@ struct fdarray {
 	int	       nr_alloc;
 	int	       nr_autogrow;
 	struct pollfd *entries;
+	int	      *priv;
 };
 
 void fdarray__init(struct fdarray *fda, int nr_autogrow);
@@ -20,7 +21,8 @@ void fdarray__delete(struct fdarray *fda);
 
 int fdarray__add(struct fdarray *fda, int fd, short revents);
 int fdarray__poll(struct fdarray *fda, int timeout);
-int fdarray__filter(struct fdarray *fda, short revents);
+int fdarray__filter(struct fdarray *fda, short revents,
+		    void (*entry_destructor)(struct fdarray *fda, int fd));
 int fdarray__grow(struct fdarray *fda, int extra);
 int fdarray__fprintf(struct fdarray *fda, FILE *fp);
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 662057a44126..6d2499cfd789 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -431,7 +431,7 @@ int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 
 int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mask)
 {
-	return fdarray__filter(&evlist->pollfd, revents_and_mask);
+	return fdarray__filter(&evlist->pollfd, revents_and_mask, NULL);
 }
 
 int perf_evlist__poll(struct perf_evlist *evlist, int timeout)
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 14/14] perf evlist: Unmap ring buffer when fd is nuked
  2014-09-10 14:08 [RFC 00/14] perf pollfd v3 Arnaldo Carvalho de Melo
                   ` (12 preceding siblings ...)
  2014-09-10 14:08 ` [PATCH 13/14] tools lib fd array: Allow associating an integer cookie with each entry Arnaldo Carvalho de Melo
@ 2014-09-10 14:08 ` Arnaldo Carvalho de Melo
  2014-09-11 12:27   ` Jiri Olsa
  2014-09-11 11:33 ` [RFC 00/14] perf pollfd v3 Jiri Olsa
  14 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-10 14:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Borislav Petkov, Corey Ashford, David Ahern, Frederic Weisbecker,
	Ingo Molnar, Jean Pihet, Jiri Olsa, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

From: Arnaldo Carvalho de Melo <acme@redhat.com>

As noticed by receiving a POLLHUP for its pollfd entry.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-4p3ledwt3ra9ak8e5yu3y928@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 6d2499cfd789..fdb755f6e299 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -25,6 +25,8 @@
 #include <linux/bitops.h>
 #include <linux/hash.h>
 
+static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
+
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
 #define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
 
@@ -424,14 +426,35 @@ int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 	return 0;
 }
 
+static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx)
+{
+	int pos = fdarray__add(&evlist->pollfd, fd, POLLIN | POLLERR | POLLHUP);
+	/*
+	 * Save the idx so that when we filter out fds POLLHUP'ed we can
+	 * close the associated evlist->mmap[] entry.
+	 */
+	if (pos >= 0)
+		evlist->pollfd.priv[pos] = idx;
+
+	return pos;
+}
+
 int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 {
-	return fdarray__add(&evlist->pollfd, fd, POLLIN | POLLERR | POLLHUP);
+	return __perf_evlist__add_pollfd(evlist, fd, -1);
+}
+
+static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd)
+{
+	struct perf_evlist *evlist = container_of(fda, struct perf_evlist, pollfd);
+
+	perf_evlist__mmap_put(evlist, fda->priv[fd]);
 }
 
 int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mask)
 {
-	return fdarray__filter(&evlist->pollfd, revents_and_mask, NULL);
+	return fdarray__filter(&evlist->pollfd, revents_and_mask,
+			       perf_evlist__munmap_filtered);
 }
 
 int perf_evlist__poll(struct perf_evlist *evlist, int timeout)
@@ -751,7 +774,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 			perf_evlist__mmap_get(evlist, idx);
 		}
 
-		if (perf_evlist__add_pollfd(evlist, fd) < 0) {
+		if (__perf_evlist__add_pollfd(evlist, fd, idx) < 0) {
 			perf_evlist__mmap_put(evlist, idx);
 			return -1;
 		}
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 13/14] tools lib fd array: Allow associating an integer cookie with each entry
  2014-09-10 14:08 ` [PATCH 13/14] tools lib fd array: Allow associating an integer cookie with each entry Arnaldo Carvalho de Melo
@ 2014-09-11 10:33   ` Jiri Olsa
  2014-09-11 13:29     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 48+ messages in thread
From: Jiri Olsa @ 2014-09-11 10:33 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Borislav Petkov, Corey Ashford, David Ahern, Frederic Weisbecker,
	Ingo Molnar, Jean Pihet, Jiri Olsa, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

On Wed, Sep 10, 2014 at 11:08:48AM -0300, Arnaldo Carvalho de Melo wrote:

SNIP

>  }
>  
> -int fdarray__filter(struct fdarray *fda, short revents)
> +int fdarray__filter(struct fdarray *fda, short revents,
> +		    void (*entry_destructor)(struct fdarray *fda, int fd))
>  {
>  	int fd, nr = 0;
>  
> @@ -80,11 +94,17 @@ int fdarray__filter(struct fdarray *fda, short revents)
>  		return 0;
>  
>  	for (fd = 0; fd < fda->nr; ++fd) {
> -		if (fda->entries[fd].revents & revents)
> +		if (fda->entries[fd].revents & revents) {
> +			if (entry_destructor)
> +				entry_destructor(fda, fd);

the desctructor callback could have the 'priv' as an argument
so we dont need to touch fdarray internals in upper layer

> +
>  			continue;
> +		}
>  
> -		if (fd != nr)
> +		if (fd != nr) {
>  			fda->entries[nr] = fda->entries[fd];
> +			fda->priv[nr]	 = fda->priv[fd];
> +		}
>  
>  		++nr;
>  	}
> diff --git a/tools/lib/api/fd/array.h b/tools/lib/api/fd/array.h
> index de38361ba69e..7b2870a96898 100644
> --- a/tools/lib/api/fd/array.h
> +++ b/tools/lib/api/fd/array.h
> @@ -10,6 +10,7 @@ struct fdarray {
>  	int	       nr_alloc;
>  	int	       nr_autogrow;
>  	struct pollfd *entries;
> +	int	      *priv;

I like the idea of private pointer for each fd, but I think
it should be 'void*' to keep the library generic. The 'int*'
is related only to the evlist usage of this.

jirka

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC 00/14] perf pollfd v3
  2014-09-10 14:08 [RFC 00/14] perf pollfd v3 Arnaldo Carvalho de Melo
                   ` (13 preceding siblings ...)
  2014-09-10 14:08 ` [PATCH 14/14] perf evlist: Unmap ring buffer when fd is nuked Arnaldo Carvalho de Melo
@ 2014-09-11 11:33 ` Jiri Olsa
  2014-09-11 11:48   ` Jiri Olsa
  14 siblings, 1 reply; 48+ messages in thread
From: Jiri Olsa @ 2014-09-11 11:33 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Borislav Petkov, Corey Ashford, David Ahern, Don Zickus,
	Frederic Weisbecker, Ingo Molnar, Jean Pihet, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

On Wed, Sep 10, 2014 at 11:08:35AM -0300, Arnaldo Carvalho de Melo wrote:
> From: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> Hi,
> 
> 	Main different to what was discussed in v2[1]:
> 
> 	Poll all event file descriptors, not just the ones mmaped.
> 
> 	This is a bug that was present before, noticed by Adrian Hunter while
> reviewing this patchkit, i.e. we were polling just the mmaped ones, because
> then we would look at all the ring buffers when just one of the polled
> descriptors was ready to read.
> 
> 	But this could lead to delays in polling when one of the mmaped fds was
> closed and we would then not be notified when another, associated to it by
> means of PERF_EVENT_IOC_SET_OUTPUT, had events available.
> 
> 	Also refcount the mmaps associated by means of PERF_EVENT_IOC_SET_OUTPUT,
> so that we can properly unmap a ring buffer to which all its file descriptors
> had been closed, for longer running apps.
> 
> 	It is all available on my tree:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/fdarray.v2
> 
> 	I kept the v3 for the series as I initially called it pollfd, but since the
> class that grew out of it was named fdarray, thus the fdarray.v2 branch name.
> 
> 	Please let me know if there are still any other problems, now back to
> fixing up my last perf/core pull req wrt those external proggies problems.
> 

hum got this one:

[jolsa@krava perf]$ ./perf record ls
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles).
/bin/dmesg may provide additional information.
No CONFIG_PERF_EVENTS=y kernel support configured?

*** Error in `./perf': free(): invalid next size (fast): 0x00000000020d2290 ***
*** Error in `./perf': malloc(): memory corruption: 0x00000000020d22b0 ***

[jolsa@krava perf]$ pstack 20706
#0  pthread_once () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S:94
#1  0x0000003408109d1c in __GI___backtrace (array=array@entry=0x7fff5c073d60, size=size@entry=64) at ../sysdeps/x86_64/backtrace.c:103
#2  0x0000003408075d34 in __libc_message (do_abort=2, fmt=fmt@entry=0x340817e568 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:176
#3  0x000000340807e21c in malloc_printerr (ptr=0x20d22b0, str=0x340817bcbf "malloc(): memory corruption", action=<optimized out>) at malloc.c:4937
#4  _int_malloc (av=0x34083ba780 <main_arena>, bytes=56) at malloc.c:3411
#5  0x000000340808002c in __GI___libc_malloc (bytes=56) at malloc.c:2863
#6  0x0000003407c0d379 in _dl_map_object_deps (map=map@entry=0x7fe140d0ba18, preloads=preloads@entry=0x0, npreloads=npreloads@entry=0, trace_mode=trace_mode@entry=0, open_mode=open_mode@entry=-2147483648) at dl-deps.c:515
#7  0x0000003407c138bc in dl_open_worker (a=a@entry=0x7fff5c0748f8) at dl-open.c:265
#8  0x0000003407c0f304 in _dl_catch_error (objname=objname@entry=0x7fff5c0748e8, errstring=errstring@entry=0x7fff5c0748f0, mallocedp=mallocedp@entry=0x7fff5c0748e0, operate=operate@entry=0x3407c13740 <dl_open_worker>, args=args@entry=0x7fff5c0748f8) at dl-error.c:177
#9  0x0000003407c131eb in _dl_open (file=0x340817aaa6 "libgcc_s.so.1", mode=-2147483647, caller_dlopen=0x3408109c05 <init+21>, nsid=-2, argc=3, argv=<optimized out>, env=0x20ae470) at dl-open.c:656
#10 0x00000034081305d2 in do_dlopen (ptr=ptr@entry=0x7fff5c074b00) at dl-libc.c:87
#11 0x0000003407c0f304 in _dl_catch_error (objname=0x7fff5c074ae0, errstring=0x7fff5c074af0, mallocedp=0x7fff5c074ad0, operate=0x3408130590 <do_dlopen>, args=0x7fff5c074b00) at dl-error.c:177
#12 0x0000003408130692 in dlerror_run (args=0x7fff5c074b00, operate=0x3408130590 <do_dlopen>) at dl-libc.c:46
#13 __GI___libc_dlopen_mode (name=name@entry=0x340817aaa6 "libgcc_s.so.1", mode=mode@entry=-2147483647) at dl-libc.c:163
#14 0x0000003408109c05 in init () at ../sysdeps/x86_64/backtrace.c:52
#15 0x0000003408c0ca40 in pthread_once () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S:103
#16 0x0000003408109d1c in __GI___backtrace (array=array@entry=0x7fff5c074dc0, size=size@entry=64) at ../sysdeps/x86_64/backtrace.c:103
#17 0x0000003408075d34 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x340817e568 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:176
#18 0x000000340807d0b8 in malloc_printerr (ptr=<optimized out>, str=0x340817e690 "free(): invalid next size (fast)", action=3) at malloc.c:4937
#19 _int_free (av=0x34083ba780 <main_arena>, p=<optimized out>, have_lock=0) at malloc.c:3789
#20 0x0000000000465e72 in perf_evsel__free_fd (evsel=0x20af190) at util/evsel.c:786
#21 perf_evsel__close (evsel=evsel@entry=0x20af190, ncpus=<optimized out>, nthreads=nthreads@entry=1) at util/evsel.c:1139
#22 0x000000000045f77d in perf_evlist__close (evlist=0x20ae8b0) at util/evlist.c:1148
#23 perf_evlist__delete (evlist=0x20ae8b0) at util/evlist.c:114
#24 0x000000000042b878 in cmd_record (argc=<optimized out>, argv=<optimized out>, prefix=<optimized out>) at builtin-record.c:967
#25 0x000000000041c455 in run_builtin (p=p@entry=0x815e70 <commands+144>, argc=argc@entry=2, argv=argv@entry=0x7fff5c077c50) at perf.c:331
#26 0x000000000041bc70 in handle_internal_command (argv=0x7fff5c077c50, argc=2) at perf.c:390
#27 run_argv (argv=0x7fff5c0779d0, argcp=0x7fff5c0779dc) at perf.c:434
#28 main (argc=2, argv=0x7fff5c077c50) at perf.c:549

jirka

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC 00/14] perf pollfd v3
  2014-09-11 11:33 ` [RFC 00/14] perf pollfd v3 Jiri Olsa
@ 2014-09-11 11:48   ` Jiri Olsa
  2014-09-11 13:30     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 48+ messages in thread
From: Jiri Olsa @ 2014-09-11 11:48 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Borislav Petkov, Corey Ashford, David Ahern, Don Zickus,
	Frederic Weisbecker, Ingo Molnar, Jean Pihet, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

On Thu, Sep 11, 2014 at 01:33:36PM +0200, Jiri Olsa wrote:

SNIP

> #20 0x0000000000465e72 in perf_evsel__free_fd (evsel=0x20af190) at util/evsel.c:786
> #21 perf_evsel__close (evsel=evsel@entry=0x20af190, ncpus=<optimized out>, nthreads=nthreads@entry=1) at util/evsel.c:1139
> #22 0x000000000045f77d in perf_evlist__close (evlist=0x20ae8b0) at util/evlist.c:1148
> #23 perf_evlist__delete (evlist=0x20ae8b0) at util/evlist.c:114
> #24 0x000000000042b878 in cmd_record (argc=<optimized out>, argv=<optimized out>, prefix=<optimized out>) at builtin-record.c:967
> #25 0x000000000041c455 in run_builtin (p=p@entry=0x815e70 <commands+144>, argc=argc@entry=2, argv=argv@entry=0x7fff5c077c50) at perf.c:331
> #26 0x000000000041bc70 in handle_internal_command (argv=0x7fff5c077c50, argc=2) at perf.c:390
> #27 run_argv (argv=0x7fff5c0779d0, argcp=0x7fff5c0779dc) at perf.c:434
> #28 main (argc=2, argv=0x7fff5c077c50) at perf.c:549

so the reason was that my fd lib stuff did not get rebuilt..

you probably want to add attached change, before there's the
fix for the apik library

jirka


---
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 9ce194fc00a0..726a31a18125 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -769,7 +769,7 @@ $(LIBTRACEEVENT)-clean:
 install-traceevent-plugins: $(LIBTRACEEVENT)
 	$(QUIET_SUBDIR0)$(TRACE_EVENT_DIR) $(LIBTRACEEVENT_FLAGS) install_plugins
 
-LIBAPIKFS_SOURCES = $(wildcard $(LIB_PATH)fs/*.[ch])
+LIBAPIKFS_SOURCES = $(wildcard $(LIB_PATH)fs/*.[ch] $(LIB_PATH)fd/*.[ch])
 
 # if subdir is set, we've been called from above so target has been built
 # already

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 14/14] perf evlist: Unmap ring buffer when fd is nuked
  2014-09-10 14:08 ` [PATCH 14/14] perf evlist: Unmap ring buffer when fd is nuked Arnaldo Carvalho de Melo
@ 2014-09-11 12:27   ` Jiri Olsa
  2014-09-11 13:40     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 48+ messages in thread
From: Jiri Olsa @ 2014-09-11 12:27 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Borislav Petkov, Corey Ashford, David Ahern, Frederic Weisbecker,
	Ingo Molnar, Jean Pihet, Jiri Olsa, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

On Wed, Sep 10, 2014 at 11:08:49AM -0300, Arnaldo Carvalho de Melo wrote:

SNIP

> +}
> +
>  int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
>  {
> -	return fdarray__add(&evlist->pollfd, fd, POLLIN | POLLERR | POLLHUP);
> +	return __perf_evlist__add_pollfd(evlist, fd, -1);
> +}
> +
> +static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd)
> +{
> +	struct perf_evlist *evlist = container_of(fda, struct perf_evlist, pollfd);
> +
> +	perf_evlist__mmap_put(evlist, fda->priv[fd]);

we cannot do this.. because of the way we read the map

getting error or hup does not mean the mmap is empty,
it can still have data, which we loose if we unmap it

following test will get data only with attached patch:

---
term1:
  $ cat

term2:
  $ cat perf record -p `pgrep cat`

term1:
  ^D
---

we get poll READ notification based on the wattermart settings,
which by default is half size of the ring buffer.. so for small
amount of perf data we dont get the poll read notification

I think we need to handle this in the record command context
and read out the mmap before we unmap it

jirka



>  }
>  
>  int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mask)
>  {
> -	return fdarray__filter(&evlist->pollfd, revents_and_mask, NULL);
> +	return fdarray__filter(&evlist->pollfd, revents_and_mask,
> +			       perf_evlist__munmap_filtered);
>  }
>  
>  int perf_evlist__poll(struct perf_evlist *evlist, int timeout)
> @@ -751,7 +774,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
>  			perf_evlist__mmap_get(evlist, idx);
>  		}
>  
> -		if (perf_evlist__add_pollfd(evlist, fd) < 0) {
> +		if (__perf_evlist__add_pollfd(evlist, fd, idx) < 0) {
>  			perf_evlist__mmap_put(evlist, idx);
>  			return -1;
>  		}
> -- 
> 1.9.3
> 

---
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index fdb755f..9e71a47 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -448,6 +448,7 @@ static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd)
 {
 	struct perf_evlist *evlist = container_of(fda, struct perf_evlist, pollfd);
 
+if (0)
 	perf_evlist__mmap_put(evlist, fda->priv[fd]);
 }
 

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 13/14] tools lib fd array: Allow associating an integer cookie with each entry
  2014-09-11 10:33   ` Jiri Olsa
@ 2014-09-11 13:29     ` Arnaldo Carvalho de Melo
  2014-09-11 14:59       ` Jiri Olsa
  0 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-11 13:29 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jean Pihet,
	Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra

Em Thu, Sep 11, 2014 at 12:33:19PM +0200, Jiri Olsa escreveu:
> On Wed, Sep 10, 2014 at 11:08:48AM -0300, Arnaldo Carvalho de Melo wrote:
 
SNIP

> > @@ -80,11 +94,17 @@ int fdarray__filter(struct fdarray *fda, short revents)
> >  	for (fd = 0; fd < fda->nr; ++fd) {
> > -		if (fda->entries[fd].revents & revents)
> > +		if (fda->entries[fd].revents & revents) {
> > +			if (entry_destructor)
> > +				entry_destructor(fda, fd);
> 
> the desctructor callback could have the 'priv' as an argument
> so we dont need to touch fdarray internals in upper layer

I thought about that, but then I would have to pass as well the pollfd
entry, that is not in priv, i.e. the callback may want to know exactly
what were the POLL{RD,WR,HUP,ER) in fdarray->entryes[fd].revents.

Yeah, sometimes it is good to avoid users touching the internals, but
that doesn't need to be something set in stone, i.e. we can consider
fda->revents and priv->priv to "public" :-)

SNIP
 
> > +++ b/tools/lib/api/fd/array.h
> > @@ -10,6 +10,7 @@ struct fdarray {
> >  	struct pollfd *entries;
> > +	int	      *priv;
 
> I like the idea of private pointer for each fd, but I think it should
> be 'void*' to keep the library generic. The 'int*' is related only to
> the evlist usage of this.

Right, from the changelog comment for this patch:

------------------------------
    We may need to have further info associated with each fdarray entry,
in that case we'll make that int array a 'union fdarray_priv' one and
put a pointer there so that users can stash whatever they want there.
For now, an int is enough tho.
------------------------------

I had it as:

	struct fdarray {
	...
		union {
			int 	idx;
			void	*ptr;
		} priv;
	...
	}

But then I had no use whatsoever for the ->ptr one at this point, so I
just nuked it, to keep _just_ what is used _right now_, and added the
comment to the changelog :-)

Heck, even the comment mentions "union fdarray_priv", because at one
point I was passing that to the destructor, as you suggested, but then I
removed it to an unnamed union, because passing just the index would
allow access to ->priv[] and ->entries[].

- Arnaldo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC 00/14] perf pollfd v3
  2014-09-11 11:48   ` Jiri Olsa
@ 2014-09-11 13:30     ` Arnaldo Carvalho de Melo
  2014-09-11 21:36       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-11 13:30 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Don Zickus, Frederic Weisbecker, Ingo Molnar,
	Jean Pihet, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

Em Thu, Sep 11, 2014 at 01:48:25PM +0200, Jiri Olsa escreveu:
> On Thu, Sep 11, 2014 at 01:33:36PM +0200, Jiri Olsa wrote:
> 
> SNIP
> 
> > #20 0x0000000000465e72 in perf_evsel__free_fd (evsel=0x20af190) at util/evsel.c:786
> > #21 perf_evsel__close (evsel=evsel@entry=0x20af190, ncpus=<optimized out>, nthreads=nthreads@entry=1) at util/evsel.c:1139
> > #22 0x000000000045f77d in perf_evlist__close (evlist=0x20ae8b0) at util/evlist.c:1148
> > #23 perf_evlist__delete (evlist=0x20ae8b0) at util/evlist.c:114
> > #24 0x000000000042b878 in cmd_record (argc=<optimized out>, argv=<optimized out>, prefix=<optimized out>) at builtin-record.c:967
> > #25 0x000000000041c455 in run_builtin (p=p@entry=0x815e70 <commands+144>, argc=argc@entry=2, argv=argv@entry=0x7fff5c077c50) at perf.c:331
> > #26 0x000000000041bc70 in handle_internal_command (argv=0x7fff5c077c50, argc=2) at perf.c:390
> > #27 run_argv (argv=0x7fff5c0779d0, argcp=0x7fff5c0779dc) at perf.c:434
> > #28 main (argc=2, argv=0x7fff5c077c50) at perf.c:549
> 
> so the reason was that my fd lib stuff did not get rebuilt..

Thanks a lot! I missed that one, will fold it into the patch that
introduces fdarray and add a v2: comment attributing credit to you, so
that bisection works.

- Arnaldo
 
> you probably want to add attached change, before there's the
> fix for the apik library
> 
> jirka
> 
> 
> ---
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index 9ce194fc00a0..726a31a18125 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -769,7 +769,7 @@ $(LIBTRACEEVENT)-clean:
>  install-traceevent-plugins: $(LIBTRACEEVENT)
>  	$(QUIET_SUBDIR0)$(TRACE_EVENT_DIR) $(LIBTRACEEVENT_FLAGS) install_plugins
>  
> -LIBAPIKFS_SOURCES = $(wildcard $(LIB_PATH)fs/*.[ch])
> +LIBAPIKFS_SOURCES = $(wildcard $(LIB_PATH)fs/*.[ch] $(LIB_PATH)fd/*.[ch])
>  
>  # if subdir is set, we've been called from above so target has been built
>  # already

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 14/14] perf evlist: Unmap ring buffer when fd is nuked
  2014-09-11 12:27   ` Jiri Olsa
@ 2014-09-11 13:40     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-11 13:40 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jean Pihet,
	Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra

Em Thu, Sep 11, 2014 at 02:27:51PM +0200, Jiri Olsa escreveu:
> On Wed, Sep 10, 2014 at 11:08:49AM -0300, Arnaldo Carvalho de Melo wrote:
 
> SNIP
 
> >  int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
> >  {
> > -	return fdarray__add(&evlist->pollfd, fd, POLLIN | POLLERR | POLLHUP);
> > +	return __perf_evlist__add_pollfd(evlist, fd, -1);
> > +}

> > +static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd)
> > +{
> > +	struct perf_evlist *evlist = container_of(fda, struct perf_evlist, pollfd);
> > +
> > +	perf_evlist__mmap_put(evlist, fda->priv[fd]);
 
> we cannot do this.. because of the way we read the map
 
> getting error or hup does not mean the mmap is empty,
> it can still have data, which we loose if we unmap it

Understood, good catch, so I think that since associating with the mmap
is done automagically by the evlist class, it should continue, as I did,
doing the mmap_put's to have it in pairs, and then we have two choices:

1. the user, i.e. the tool, does an extra perf_evlist__mmap_get() to
make sure that it exits the loop with the mmaps in place, since it will
have that extra refcount, and then drain the buffers one last time, then
to the final put, or leave it there to be reaped unconditionally at
perf_evlist__delete()

2. Do this all automagically in the evlist layer(), i.e. start the
perf_mmap->nr_fds (that gets renamed, as this is no longer the number of
fds, but just a generic refcount) at 2, and when confirming the mmap
read, in perf_evlist__mmap_consume(), check if the refcount is 1 and if
there are no more events, do the final mmap_put.

I think #2 is better, no?

I.e. tools remain as they are, just doing the filtering, that could even
be renamed from perf_evlist__filter_pollfd() to perf_evlist__eof(), to
use some well known TLA for "no more things to read". :-)

I.e. the less the tools are _required_ to know about how events are laid
out and dealt with, concentrating just on _consuming_ those events, the
better.

- Arnaldo
 
> following test will get data only with attached patch:
> 
> ---
> term1:
>   $ cat
> 
> term2:
>   $ cat perf record -p `pgrep cat`
> 
> term1:
>   ^D
> ---
> 
> we get poll READ notification based on the wattermart settings,
> which by default is half size of the ring buffer.. so for small
> amount of perf data we dont get the poll read notification
> 
> I think we need to handle this in the record command context
> and read out the mmap before we unmap it
> 
> jirka
> 
> 
> 
> >  }
> >  
> >  int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mask)
> >  {
> > -	return fdarray__filter(&evlist->pollfd, revents_and_mask, NULL);
> > +	return fdarray__filter(&evlist->pollfd, revents_and_mask,
> > +			       perf_evlist__munmap_filtered);
> >  }
> >  
> >  int perf_evlist__poll(struct perf_evlist *evlist, int timeout)
> > @@ -751,7 +774,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
> >  			perf_evlist__mmap_get(evlist, idx);
> >  		}
> >  
> > -		if (perf_evlist__add_pollfd(evlist, fd) < 0) {
> > +		if (__perf_evlist__add_pollfd(evlist, fd, idx) < 0) {
> >  			perf_evlist__mmap_put(evlist, idx);
> >  			return -1;
> >  		}
> > -- 
> > 1.9.3
> > 
> 
> ---
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index fdb755f..9e71a47 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -448,6 +448,7 @@ static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd)
>  {
>  	struct perf_evlist *evlist = container_of(fda, struct perf_evlist, pollfd);
>  
> +if (0)
>  	perf_evlist__mmap_put(evlist, fda->priv[fd]);
>  }
>  

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 13/14] tools lib fd array: Allow associating an integer cookie with each entry
  2014-09-11 13:29     ` Arnaldo Carvalho de Melo
@ 2014-09-11 14:59       ` Jiri Olsa
  2014-09-11 15:23         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 48+ messages in thread
From: Jiri Olsa @ 2014-09-11 14:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jean Pihet,
	Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra

On Thu, Sep 11, 2014 at 10:29:11AM -0300, Arnaldo Carvalho de Melo wrote:

SNIP

> > I like the idea of private pointer for each fd, but I think it should
> > be 'void*' to keep the library generic. The 'int*' is related only to
> > the evlist usage of this.
> 
> Right, from the changelog comment for this patch:
> 
> ------------------------------
>     We may need to have further info associated with each fdarray entry,
> in that case we'll make that int array a 'union fdarray_priv' one and
> put a pointer there so that users can stash whatever they want there.
> For now, an int is enough tho.
> ------------------------------
> 
> I had it as:
> 
> 	struct fdarray {
> 	...
> 		union {
> 			int 	idx;
> 			void	*ptr;
> 		} priv;
> 	...
> 	}
> 
> But then I had no use whatsoever for the ->ptr one at this point, so I
> just nuked it, to keep _just_ what is used _right now_, and added the
> comment to the changelog :-)

if we are treeting tools/lib/api as 'external' lib, I think we should
use 'void *' for priv and let the user retype it to whatever he wants

but I dont care/insist here too much.. it just seems strange to me ;-)

jirka

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH] tools lib fd array: Do not set fd as non blocking evlist
  2014-09-10 14:08 ` [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist Arnaldo Carvalho de Melo
@ 2014-09-11 15:09   ` Jiri Olsa
  2014-09-11 15:27     ` Arnaldo Carvalho de Melo
  2014-09-12 12:58   ` [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist Jiri Olsa
  2014-09-22 12:29   ` Jiri Olsa
  2 siblings, 1 reply; 48+ messages in thread
From: Jiri Olsa @ 2014-09-11 15:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Borislav Petkov, Corey Ashford, David Ahern, Frederic Weisbecker,
	Ingo Molnar, Jean Pihet, Jiri Olsa, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

On Wed, Sep 10, 2014 at 11:08:46AM -0300, Arnaldo Carvalho de Melo wrote:

SNIP

> +}
> +
> +void fdarray__exit(struct fdarray *fda)
> +{
> +	free(fda->entries);
> +	fdarray__init(fda, 0);
> +}
> +
> +void fdarray__delete(struct fdarray *fda)
> +{
> +	fdarray__exit(fda);
> +	free(fda);
> +}
> +
> +int fdarray__add(struct fdarray *fda, int fd, short revents)
> +{
> +	if (fda->nr == fda->nr_alloc &&
> +	    fdarray__grow(fda, fda->nr_autogrow) < 0)
> +		return -ENOMEM;
> +
> +	fcntl(fd, F_SETFL, O_NONBLOCK);

also I spot this one and couldn't think of reason for it, attached
patch makes no behaviour difference for me..

I might be missing something, but I dont see any blocking operation
in perf related data reads. The git log history says it was there
since early days.

jirka


---
There's no reason for the current user of this API
to have non blocking fds. Also it should be up to
user to set this, not this object.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/lib/api/fd/array.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
index 3f6d1a0..0e636c4 100644
--- a/tools/lib/api/fd/array.c
+++ b/tools/lib/api/fd/array.c
@@ -78,7 +78,6 @@ int fdarray__add(struct fdarray *fda, int fd, short revents)
 	    fdarray__grow(fda, fda->nr_autogrow) < 0)
 		return -ENOMEM;
 
-	fcntl(fd, F_SETFL, O_NONBLOCK);
 	fda->entries[fda->nr].fd     = fd;
 	fda->entries[fda->nr].events = revents;
 	fda->nr++;
-- 
1.7.7.6



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 13/14] tools lib fd array: Allow associating an integer cookie with each entry
  2014-09-11 14:59       ` Jiri Olsa
@ 2014-09-11 15:23         ` Arnaldo Carvalho de Melo
  2014-09-11 15:35           ` Jiri Olsa
  0 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-11 15:23 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jean Pihet,
	Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra

Em Thu, Sep 11, 2014 at 04:59:45PM +0200, Jiri Olsa escreveu:
> On Thu, Sep 11, 2014 at 10:29:11AM -0300, Arnaldo Carvalho de Melo wrote:
> > But then I had no use whatsoever for the ->ptr one at this point, so I
> > just nuked it, to keep _just_ what is used _right now_, and added the
> > comment to the changelog :-)
> 
> if we are treeting tools/lib/api as 'external' lib, I think we should
> use 'void *' for priv and let the user retype it to whatever he wants
 
> but I dont care/insist here too much.. it just seems strange to me ;-)

Ok, so perhaps it should be:

	struct fdarray {
		...
		union {
			int idx;
		} *priv;
		...
}

And then perf's evlist will use:

	fda->priv[fd].idx;

	Then, when we get the need for a pointer, we can go there and
add it to the union.

	This will make tools/lib/api/fd/ a little bit more friendly to
use outside the kernel sources, since then a new version will be source
code compatible (but not necessarily binary compatible).

	If we commit to that we can't just go on and change all its
users as we can now do for things like 'struct sk_buff', 'struct sock'
and any other kernel data structures.

	There are costs in making it 'external' in the sense you're
implying. I want to make it a bit less 'external' (in your sense), at
least while we initially implement new stuff like this one.

	My preference is for us to move things to tools/lib/ and
initially support use of it for tools that live in the kernel sources.

	If and when people find it useful for things outside the kernel
sources, if we think that the code is perfect (ha ha) we can add some
comment saying that that particular source file can't be changed in a
way that will break out-of-the-kernel-tree sources.

- Arnaldo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] tools lib fd array: Do not set fd as non blocking evlist
  2014-09-11 15:09   ` [PATCH] tools lib fd array: Do not set fd as non blocking evlist Jiri Olsa
@ 2014-09-11 15:27     ` Arnaldo Carvalho de Melo
  2014-09-11 15:53       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-11 15:27 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jean Pihet,
	Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra

Em Thu, Sep 11, 2014 at 05:09:43PM +0200, Jiri Olsa escreveu:
> On Wed, Sep 10, 2014 at 11:08:46AM -0300, Arnaldo Carvalho de Melo wrote:
> 
> SNIP
> 
> > +}
> > +
> > +void fdarray__exit(struct fdarray *fda)
> > +{
> > +	free(fda->entries);
> > +	fdarray__init(fda, 0);
> > +}
> > +
> > +void fdarray__delete(struct fdarray *fda)
> > +{
> > +	fdarray__exit(fda);
> > +	free(fda);
> > +}
> > +
> > +int fdarray__add(struct fdarray *fda, int fd, short revents)
> > +{
> > +	if (fda->nr == fda->nr_alloc &&
> > +	    fdarray__grow(fda, fda->nr_autogrow) < 0)
> > +		return -ENOMEM;
> > +
> > +	fcntl(fd, F_SETFL, O_NONBLOCK);
> 
> also I spot this one and couldn't think of reason for it, attached
> patch makes no behaviour difference for me..

Have to look why it is there, perhaps there is some changeset
specifically made for this, will do some research...
 
> I might be missing something, but I dont see any blocking operation
> in perf related data reads. The git log history says it was there
> since early days.

Oops, you did that research already, have you followed the history all
the way to when this code lived in Documentation/ ?

But yes, I agree with your comment that this is something up to the
users to do, not for a general purpose class as fdarray is set out to
be.

I think the way to do this is to not move this fcntl when introducing
the fdarray class, but instead leave it at the perf_evlist__add_pollfd()
function, then, in a later patch, if we determine that it is not needed
at all, nuke it, ok?

- Arnaldo
 
> jirka
> 
> 
> ---
> There's no reason for the current user of this API
> to have non blocking fds. Also it should be up to
> user to set this, not this object.
> 
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Jean Pihet <jean.pihet@linaro.org>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  tools/lib/api/fd/array.c |    1 -
>  1 files changed, 0 insertions(+), 1 deletions(-)
> 
> diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
> index 3f6d1a0..0e636c4 100644
> --- a/tools/lib/api/fd/array.c
> +++ b/tools/lib/api/fd/array.c
> @@ -78,7 +78,6 @@ int fdarray__add(struct fdarray *fda, int fd, short revents)
>  	    fdarray__grow(fda, fda->nr_autogrow) < 0)
>  		return -ENOMEM;
>  
> -	fcntl(fd, F_SETFL, O_NONBLOCK);
>  	fda->entries[fda->nr].fd     = fd;
>  	fda->entries[fda->nr].events = revents;
>  	fda->nr++;
> -- 
> 1.7.7.6
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 13/14] tools lib fd array: Allow associating an integer cookie with each entry
  2014-09-11 15:23         ` Arnaldo Carvalho de Melo
@ 2014-09-11 15:35           ` Jiri Olsa
  2014-09-11 15:49             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 48+ messages in thread
From: Jiri Olsa @ 2014-09-11 15:35 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jean Pihet,
	Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra

On Thu, Sep 11, 2014 at 12:23:30PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Sep 11, 2014 at 04:59:45PM +0200, Jiri Olsa escreveu:
> > On Thu, Sep 11, 2014 at 10:29:11AM -0300, Arnaldo Carvalho de Melo wrote:
> > > But then I had no use whatsoever for the ->ptr one at this point, so I
> > > just nuked it, to keep _just_ what is used _right now_, and added the
> > > comment to the changelog :-)
> > 
> > if we are treeting tools/lib/api as 'external' lib, I think we should
> > use 'void *' for priv and let the user retype it to whatever he wants
>  
> > but I dont care/insist here too much.. it just seems strange to me ;-)
> 
> Ok, so perhaps it should be:
> 
> 	struct fdarray {
> 		...
> 		union {
> 			int idx;
> 		} *priv;
> 		...
> }
> 
> And then perf's evlist will use:
> 
> 	fda->priv[fd].idx;
> 
> 	Then, when we get the need for a pointer, we can go there and
> add it to the union.
> 
> 	This will make tools/lib/api/fd/ a little bit more friendly to
> use outside the kernel sources, since then a new version will be source
> code compatible (but not necessarily binary compatible).

you lost me ;-)

could you please make an example of the !compatibility you mentioned?

> 	If we commit to that we can't just go on and change all its
> users as we can now do for things like 'struct sk_buff', 'struct sock'
> and any other kernel data structures.
> 	There are costs in making it 'external' in the sense you're
> implying. I want to make it a bit less 'external' (in your sense), at
> least while we initially implement new stuff like this one.

and the costs?


also I did not mean that union above, what I meant was:

struct fdarray {
	...
	void *priv;
	...
}


jirka

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 13/14] tools lib fd array: Allow associating an integer cookie with each entry
  2014-09-11 15:35           ` Jiri Olsa
@ 2014-09-11 15:49             ` Arnaldo Carvalho de Melo
  2014-09-11 16:07               ` Jiri Olsa
  0 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-11 15:49 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jean Pihet,
	Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra

Em Thu, Sep 11, 2014 at 05:35:27PM +0200, Jiri Olsa escreveu:
> On Thu, Sep 11, 2014 at 12:23:30PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Sep 11, 2014 at 04:59:45PM +0200, Jiri Olsa escreveu:
> > > On Thu, Sep 11, 2014 at 10:29:11AM -0300, Arnaldo Carvalho de Melo wrote:
> > > > But then I had no use whatsoever for the ->ptr one at this point, so I
> > > > just nuked it, to keep _just_ what is used _right now_, and added the
> > > > comment to the changelog :-)

> > > if we are treeting tools/lib/api as 'external' lib, I think we should
> > > use 'void *' for priv and let the user retype it to whatever he wants

> > > but I dont care/insist here too much.. it just seems strange to me ;-)
> > 
> > Ok, so perhaps it should be:
> > 
> > 	struct fdarray {
> > 		...
> > 		union {
> > 			int idx;
> > 		} *priv;
> > 		...
> > }
> > 
> > And then perf's evlist will use:
> > 
> > 	fda->priv[fd].idx;
> > 
> > 	Then, when we get the need for a pointer, we can go there and
> > add it to the union.
> > 
> > 	This will make tools/lib/api/fd/ a little bit more friendly to
> > use outside the kernel sources, since then a new version will be source
> > code compatible (but not necessarily binary compatible).
> 
> you lost me ;-)
> 
> could you please make an example of the !compatibility you mentioned?

If I leave it as is now, when I add that void * pointer to each entry in
fdarray->priv[], I will have to change fdarray's users, i.e. right now I
will have to go and change evlist.h, i.e. every place that has:

	fda->priv[fd];

	will have to be changed to:

	fda->priv[fd].idx;

If I instead change it to:

 	struct fdarray {
 		...
 		union {
 			int idx;
 		} *priv;
 		...
	}

Then I will not have to change anything when I add that void * pointer
to the union.
 
> > 	If we commit to that we can't just go on and change all its
> > users as we can now do for things like 'struct sk_buff', 'struct sock'
> > and any other kernel data structures.
> > 	There are costs in making it 'external' in the sense you're
> > implying. I want to make it a bit less 'external' (in your sense), at
> > least while we initially implement new stuff like this one.
 
> and the costs?
 
> also I did not mean that union above, what I meant was:
 
> struct fdarray {
> 	...
> 	void *priv;
> 	...
> }

Yeah, that is what I am trying to avoid. This will make it use
sizeof(void *) for each entry, and I have no use for that right now. All
I need is sizeof(int) for each entry.

Remember, this is an array, so its not just a matter of changing:

	int *priv;

to:

	void *priv;

But since it is an array, it will use, in this case, double the space,
on 64-bit, for no gain _right now_.

- Arnaldo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] tools lib fd array: Do not set fd as non blocking evlist
  2014-09-11 15:27     ` Arnaldo Carvalho de Melo
@ 2014-09-11 15:53       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-11 15:53 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jean Pihet,
	Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra

Em Thu, Sep 11, 2014 at 12:27:26PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Thu, Sep 11, 2014 at 05:09:43PM +0200, Jiri Olsa escreveu:
> > > +	fcntl(fd, F_SETFL, O_NONBLOCK);
> > 
> > also I spot this one and couldn't think of reason for it, attached
> > patch makes no behaviour difference for me..
> 
> Have to look why it is there, perhaps there is some changeset
> specifically made for this, will do some research...
>  
> > I might be missing something, but I dont see any blocking operation
> > in perf related data reads. The git log history says it was there
> > since early days.
> 
> Oops, you did that research already, have you followed the history all
> the way to when this code lived in Documentation/ ?

Since day one, found using:

$ git log -p --follow tools/perf/builtin-top.c

commit e0143bad9dbf2a8fad4c5430562bceba196b66ea
Author: Ingo Molnar <mingo@elte.hu>
Date:   Mon Mar 23 21:29:59 2009 +0100

    perf_counter: add sample user-space to Documentation/perf_counter/
    
    Initial version of kerneltop.c and perfstat.c.
    
    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

diff --git a/Documentation/perf_counter/kerneltop.c b/Documentation/perf_counter/kerneltop.c


Ingo, do we really need that O_NONBLOCK thing there?

- Arnaldo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 13/14] tools lib fd array: Allow associating an integer cookie with each entry
  2014-09-11 15:49             ` Arnaldo Carvalho de Melo
@ 2014-09-11 16:07               ` Jiri Olsa
  0 siblings, 0 replies; 48+ messages in thread
From: Jiri Olsa @ 2014-09-11 16:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jean Pihet,
	Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra

On Thu, Sep 11, 2014 at 12:49:22PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Sep 11, 2014 at 05:35:27PM +0200, Jiri Olsa escreveu:
> > On Thu, Sep 11, 2014 at 12:23:30PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Thu, Sep 11, 2014 at 04:59:45PM +0200, Jiri Olsa escreveu:
> > > > On Thu, Sep 11, 2014 at 10:29:11AM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > But then I had no use whatsoever for the ->ptr one at this point, so I
> > > > > just nuked it, to keep _just_ what is used _right now_, and added the
> > > > > comment to the changelog :-)
> 
> > > > if we are treeting tools/lib/api as 'external' lib, I think we should
> > > > use 'void *' for priv and let the user retype it to whatever he wants
> 
> > > > but I dont care/insist here too much.. it just seems strange to me ;-)
> > > 
> > > Ok, so perhaps it should be:
> > > 
> > > 	struct fdarray {
> > > 		...
> > > 		union {
> > > 			int idx;
> > > 		} *priv;
> > > 		...
> > > }
> > > 
> > > And then perf's evlist will use:
> > > 
> > > 	fda->priv[fd].idx;
> > > 
> > > 	Then, when we get the need for a pointer, we can go there and
> > > add it to the union.
> > > 
> > > 	This will make tools/lib/api/fd/ a little bit more friendly to
> > > use outside the kernel sources, since then a new version will be source
> > > code compatible (but not necessarily binary compatible).
> > 
> > you lost me ;-)
> > 
> > could you please make an example of the !compatibility you mentioned?
> 
> If I leave it as is now, when I add that void * pointer to each entry in
> fdarray->priv[], I will have to change fdarray's users, i.e. right now I
> will have to go and change evlist.h, i.e. every place that has:
> 
> 	fda->priv[fd];
> 
> 	will have to be changed to:
> 
> 	fda->priv[fd].idx;
> 
> If I instead change it to:
> 
>  	struct fdarray {
>  		...
>  		union {
>  			int idx;
>  		} *priv;
>  		...
> 	}
> 
> Then I will not have to change anything when I add that void * pointer
> to the union.

I dont think there's any trouble like that if you keep
just void* priv pointer and let users to store there
whatever they like

>  
> > > 	If we commit to that we can't just go on and change all its
> > > users as we can now do for things like 'struct sk_buff', 'struct sock'
> > > and any other kernel data structures.
> > > 	There are costs in making it 'external' in the sense you're
> > > implying. I want to make it a bit less 'external' (in your sense), at
> > > least while we initially implement new stuff like this one.
>  
> > and the costs?
>  
> > also I did not mean that union above, what I meant was:
>  
> > struct fdarray {
> > 	...
> > 	void *priv;
> > 	...
> > }
> 
> Yeah, that is what I am trying to avoid. This will make it use
> sizeof(void *) for each entry, and I have no use for that right now. All
> I need is sizeof(int) for each entry.

hm? but you're using pointer (int*) not int .. size will be sizeof(unsigned long) anyway

jirka

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC 00/14] perf pollfd v3
  2014-09-11 13:30     ` Arnaldo Carvalho de Melo
@ 2014-09-11 21:36       ` Arnaldo Carvalho de Melo
  2014-09-18 16:04         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-11 21:36 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Don Zickus, Frederic Weisbecker, Ingo Molnar,
	Jean Pihet, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

Em Thu, Sep 11, 2014 at 10:30:29AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Thu, Sep 11, 2014 at 01:48:25PM +0200, Jiri Olsa escreveu:
> > so the reason was that my fd lib stuff did not get rebuilt..
> 
> Thanks a lot! I missed that one, will fold it into the patch that
> introduces fdarray and add a v2: comment attributing credit to you, so
> that bisection works.
 
> > you probably want to add attached change, before there's the
> > fix for the apik library

Done, need just to move that fcntl O_NONBLOCK from fdarray__add() to
perf_evlist__add_pollfd(), what I have now is at:

https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=perf/fdarray.v3

In case you want to see if itaddresses your concerns.

The last cset should address what you pointed out about unmapping only
when the events gets completely drained from the ring buffer.

I'll do the fcntl() fix, test your drain-last-events scenario and
repost.

- Arnaldo



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist
  2014-09-10 14:08 ` [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist Arnaldo Carvalho de Melo
  2014-09-11 15:09   ` [PATCH] tools lib fd array: Do not set fd as non blocking evlist Jiri Olsa
@ 2014-09-12 12:58   ` Jiri Olsa
  2014-09-12 13:44     ` Arnaldo Carvalho de Melo
  2014-09-22 12:29   ` Jiri Olsa
  2 siblings, 1 reply; 48+ messages in thread
From: Jiri Olsa @ 2014-09-12 12:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Borislav Petkov, Corey Ashford, David Ahern, Frederic Weisbecker,
	Ingo Molnar, Jean Pihet, Jiri Olsa, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

On Wed, Sep 10, 2014 at 11:08:46AM -0300, Arnaldo Carvalho de Melo wrote:

SNIP

> +	if (entries == NULL)
> +		return -ENOMEM;
> +
> +	fda->nr_alloc = nr_alloc;
> +	fda->entries  = entries;
> +	return 0;
> +}
> +
> +struct fdarray *fdarray__new(int nr_alloc, int nr_autogrow)
> +{
> +	struct fdarray *fda = calloc(1, sizeof(*fda));
> +
> +	if (fda != NULL) {
> +		if (fdarray__grow(fda, nr_alloc)) {
> +			free(fda);
> +			fda = NULL;

you could use zfree
jirka

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist
  2014-09-12 12:58   ` [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist Jiri Olsa
@ 2014-09-12 13:44     ` Arnaldo Carvalho de Melo
  2014-09-12 14:16       ` Jiri Olsa
  0 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-12 13:44 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jean Pihet,
	Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra

Em Fri, Sep 12, 2014 at 02:58:22PM +0200, Jiri Olsa escreveu:
> On Wed, Sep 10, 2014 at 11:08:46AM -0300, Arnaldo Carvalho de Melo wrote:
 
> SNIP
 
> > +struct fdarray *fdarray__new(int nr_alloc, int nr_autogrow)
> > +{
> > +	struct fdarray *fda = calloc(1, sizeof(*fda));

> > +	if (fda != NULL) {
> > +		if (fdarray__grow(fda, nr_alloc)) {
> > +			free(fda);
> > +			fda = NULL;
> 
> you could use zfree

Yeah, we could, if zfree had moved from tools/perf/util/ to tools/lib/,
which is has not yet.

I thought about doing it now, but when doing some grepping in the kernel
sources I realized that zfree is already taken by... zlib, so I'll leave
this for later, when we figure out how to properly export this, perhaps
renaming it to freez()?

- Arnaldo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist
  2014-09-12 13:44     ` Arnaldo Carvalho de Melo
@ 2014-09-12 14:16       ` Jiri Olsa
  2014-09-12 14:22         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 48+ messages in thread
From: Jiri Olsa @ 2014-09-12 14:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jean Pihet,
	Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra

On Fri, Sep 12, 2014 at 10:44:29AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Fri, Sep 12, 2014 at 02:58:22PM +0200, Jiri Olsa escreveu:
> > On Wed, Sep 10, 2014 at 11:08:46AM -0300, Arnaldo Carvalho de Melo wrote:
>  
> > SNIP
>  
> > > +struct fdarray *fdarray__new(int nr_alloc, int nr_autogrow)
> > > +{
> > > +	struct fdarray *fda = calloc(1, sizeof(*fda));
> 
> > > +	if (fda != NULL) {
> > > +		if (fdarray__grow(fda, nr_alloc)) {
> > > +			free(fda);
> > > +			fda = NULL;
> > 
> > you could use zfree
> 
> Yeah, we could, if zfree had moved from tools/perf/util/ to tools/lib/,
> which is has not yet.
> 
> I thought about doing it now, but when doing some grepping in the kernel
> sources I realized that zfree is already taken by... zlib, so I'll leave
> this for later, when we figure out how to properly export this, perhaps
> renaming it to freez()?

aargh.. haven't realized it's not in libs ;-) freez sounds ok

jirka

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist
  2014-09-12 14:16       ` Jiri Olsa
@ 2014-09-12 14:22         ` Arnaldo Carvalho de Melo
  2014-09-12 16:54           ` Borislav Petkov
  0 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-12 14:22 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jean Pihet,
	Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra

Em Fri, Sep 12, 2014 at 04:16:32PM +0200, Jiri Olsa escreveu:
> On Fri, Sep 12, 2014 at 10:44:29AM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Fri, Sep 12, 2014 at 02:58:22PM +0200, Jiri Olsa escreveu:
> > > On Wed, Sep 10, 2014 at 11:08:46AM -0300, Arnaldo Carvalho de Melo wrote:
> > > > +			free(fda);
> > > > +			fda = NULL;

> > > you could use zfree

> > Yeah, we could, if zfree had moved from tools/perf/util/ to tools/lib/,
> > which is has not yet.

> > I thought about doing it now, but when doing some grepping in the kernel
> > sources I realized that zfree is already taken by... zlib, so I'll leave
> > this for later, when we figure out how to properly export this, perhaps
> > renaming it to freez()?
 
> aargh.. haven't realized it's not in libs ;-) freez sounds ok

Yeah, to me as well, but then zfree() was such a nice pick and looked ok
at the time :-)

I'm leaving this for later, different patchkit that would sweep thru
tools/ and find this pattern, make it use freez() (or some other name),
etc.

- Arnaldo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist
  2014-09-12 14:22         ` Arnaldo Carvalho de Melo
@ 2014-09-12 16:54           ` Borislav Petkov
  2014-09-12 20:48             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 48+ messages in thread
From: Borislav Petkov @ 2014-09-12 16:54 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, linux-kernel, Adrian Hunter, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jean Pihet,
	Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra

On Fri, Sep 12, 2014 at 11:22:25AM -0300, Arnaldo Carvalho de Melo wrote:
> Yeah, to me as well, but then zfree() was such a nice pick and looked ok
> at the time :-)
> 
> I'm leaving this for later, different patchkit that would sweep thru
> tools/ and find this pattern, make it use freez() (or some other name),
> etc.

freez() sounds like something gets frozen :-P and not freed.

-- 
Regards/Gruss,
    Boris.
--

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist
  2014-09-12 16:54           ` Borislav Petkov
@ 2014-09-12 20:48             ` Arnaldo Carvalho de Melo
  2014-09-12 22:12               ` Borislav Petkov
  0 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-12 20:48 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Jiri Olsa, linux-kernel, Adrian Hunter, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jean Pihet,
	Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra

Em Fri, Sep 12, 2014 at 06:54:07PM +0200, Borislav Petkov escreveu:
> On Fri, Sep 12, 2014 at 11:22:25AM -0300, Arnaldo Carvalho de Melo wrote:
> > Yeah, to me as well, but then zfree() was such a nice pick and looked ok
> > at the time :-)
> > 
> > I'm leaving this for later, different patchkit that would sweep thru
> > tools/ and find this pattern, make it use freez() (or some other name),
> > etc.
> 
> freez() sounds like something gets frozen :-P and not freed.

ok, note taken.

I liked it when trying something different from zfree().

I'm not that successfull at landgrabs recently, people at Fedora don't
want to use the 'trace' alias for 'perf trace', even it not being used
by anybody so far.

Ideas of how to name zfree()?

- Arnaldo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist
  2014-09-12 20:48             ` Arnaldo Carvalho de Melo
@ 2014-09-12 22:12               ` Borislav Petkov
  0 siblings, 0 replies; 48+ messages in thread
From: Borislav Petkov @ 2014-09-12 22:12 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, linux-kernel, Adrian Hunter, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jean Pihet,
	Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra

On Fri, Sep 12, 2014 at 05:48:29PM -0300, Arnaldo Carvalho de Melo wrote:
> I'm not that successfull at landgrabs recently, people at Fedora don't
> want to use the 'trace' alias for 'perf trace', even it not being used
> by anybody so far.

WTF, it's just a name...

> Ideas of how to name zfree()?

zo_free() for "free and zero out" maybe. I always try for the naming to
still be small rather than spelled out.

-- 
Regards/Gruss,
    Boris.
--

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC 00/14] perf pollfd v3
  2014-09-11 21:36       ` Arnaldo Carvalho de Melo
@ 2014-09-18 16:04         ` Arnaldo Carvalho de Melo
  2014-09-22 13:35           ` Jiri Olsa
  0 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-18 16:04 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Don Zickus, Frederic Weisbecker, Ingo Molnar,
	Jean Pihet, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

Em Thu, Sep 11, 2014 at 06:36:38PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Thu, Sep 11, 2014 at 10:30:29AM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Thu, Sep 11, 2014 at 01:48:25PM +0200, Jiri Olsa escreveu:
> > > so the reason was that my fd lib stuff did not get rebuilt..
> > 
> > Thanks a lot! I missed that one, will fold it into the patch that
> > introduces fdarray and add a v2: comment attributing credit to you, so
> > that bisection works.
>  
> > > you probably want to add attached change, before there's the
> > > fix for the apik library
> 
> Done, need just to move that fcntl O_NONBLOCK from fdarray__add() to
> perf_evlist__add_pollfd(), what I have now is at:
> 
> https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=perf/fdarray.v3
> 
> In case you want to see if itaddresses your concerns.
> 
> The last cset should address what you pointed out about unmapping only
> when the events gets completely drained from the ring buffer.
> 
> I'll do the fcntl() fix, test your drain-last-events scenario and
> repost.

Done, updated it there:

https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=perf/fdarray.v4

Its on top of my last perf/core branch, i.e. with builtin-record.c using
perf_evlist__mmap_consume() and it and builtin-trace.c doing one last
mmap_read loop to consume what is left after all fds for a mmap are
closed.

I'll wait a bit before reposting, probably Jiri will not be able to
comment this week, but I would like to at least post the URL for this
latest v4 kit.

Adrian, if you could take a look at it, would be really great :-)

- Arnaldo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist
  2014-09-10 14:08 ` [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist Arnaldo Carvalho de Melo
  2014-09-11 15:09   ` [PATCH] tools lib fd array: Do not set fd as non blocking evlist Jiri Olsa
  2014-09-12 12:58   ` [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist Jiri Olsa
@ 2014-09-22 12:29   ` Jiri Olsa
  2 siblings, 0 replies; 48+ messages in thread
From: Jiri Olsa @ 2014-09-22 12:29 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Borislav Petkov, Corey Ashford, David Ahern, Frederic Weisbecker,
	Ingo Molnar, Jean Pihet, Jiri Olsa, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

On Wed, Sep 10, 2014 at 11:08:46AM -0300, Arnaldo Carvalho de Melo wrote:

SNIP

> +	if (entries == NULL)
> +		return -ENOMEM;
> +
> +	fda->nr_alloc = nr_alloc;
> +	fda->entries  = entries;
> +	return 0;
> +}
> +
> +struct fdarray *fdarray__new(int nr_alloc, int nr_autogrow)
> +{
> +	struct fdarray *fda = calloc(1, sizeof(*fda));
> +
> +	if (fda != NULL) {
> +		if (fdarray__grow(fda, nr_alloc)) {
> +			free(fda);
> +			fda = NULL;
> +		} else {
> +			fda->nr_autogrow = nr_autogrow;
> +		}
> +	}
> +
> +	return fda;
> +}

fdarray__new is never used..

jirka

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC 00/14] perf pollfd v3
  2014-09-18 16:04         ` Arnaldo Carvalho de Melo
@ 2014-09-22 13:35           ` Jiri Olsa
  2014-09-22 14:49             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 48+ messages in thread
From: Jiri Olsa @ 2014-09-22 13:35 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Don Zickus, Frederic Weisbecker, Ingo Molnar,
	Jean Pihet, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

On Thu, Sep 18, 2014 at 01:04:55PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Sep 11, 2014 at 06:36:38PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Thu, Sep 11, 2014 at 10:30:29AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > Em Thu, Sep 11, 2014 at 01:48:25PM +0200, Jiri Olsa escreveu:
> > > > so the reason was that my fd lib stuff did not get rebuilt..
> > > 
> > > Thanks a lot! I missed that one, will fold it into the patch that
> > > introduces fdarray and add a v2: comment attributing credit to you, so
> > > that bisection works.
> >  
> > > > you probably want to add attached change, before there's the
> > > > fix for the apik library
> > 
> > Done, need just to move that fcntl O_NONBLOCK from fdarray__add() to
> > perf_evlist__add_pollfd(), what I have now is at:
> > 
> > https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=perf/fdarray.v3
> > 
> > In case you want to see if itaddresses your concerns.
> > 
> > The last cset should address what you pointed out about unmapping only
> > when the events gets completely drained from the ring buffer.
> > 
> > I'll do the fcntl() fix, test your drain-last-events scenario and
> > repost.
> 
> Done, updated it there:
> 
> https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=perf/fdarray.v4
> 
> Its on top of my last perf/core branch, i.e. with builtin-record.c using
> perf_evlist__mmap_consume() and it and builtin-trace.c doing one last
> mmap_read loop to consume what is left after all fds for a mmap are
> closed.
> 
> I'll wait a bit before reposting, probably Jiri will not be able to
> comment this week, but I would like to at least post the URL for this
> latest v4 kit.
> 
> Adrian, if you could take a look at it, would be really great :-)

I checked the branch.. the last patch brakes the functionality
of perf record for me

I tried attached change instead.. but then I realized we dont
actually need the final unmap for record command, because
we always open 1 mmap for each CPU and then do the output ioctl
for any other process we monitor for each CPU.. so all the mmaps
stay until last process is dead or we exit

maybe we could have this code just to be complete,
or I missed something ;-)

jirka


---
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index aac8660..96a4cf0 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -244,11 +244,15 @@ static int record__mmap_read_all(struct record *rec)
 	int rc = 0;
 
 	for (i = 0; i < rec->evlist->nr_mmaps; i++) {
-		if (rec->evlist->mmap[i].base) {
-			if (record__mmap_read(rec, &rec->evlist->mmap[i]) != 0) {
+		struct perf_mmap *md = &rec->evlist->mmap[i];
+
+		if (md->base) {
+			if (record__mmap_read(rec, md) != 0) {
 				rc = -1;
 				goto out;
 			}
+			if (md->refcnt == 1 && perf_mmap__empty(md))
+				perf_evlist__mmap_put(rec->evlist, i);
 		}
 	}
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 3cebc9a..82bab81 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -25,7 +25,6 @@
 #include <linux/bitops.h>
 #include <linux/hash.h>
 
-static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
 static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx);
 
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
@@ -676,17 +675,12 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
 	return event;
 }
 
-static bool perf_mmap__empty(struct perf_mmap *md)
-{
-	return perf_mmap__read_head(md) != md->prev;
-}
-
 static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
 {
 	++evlist->mmap[idx].refcnt;
 }
 
-static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
+void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
 {
 	BUG_ON(evlist->mmap[idx].refcnt == 0);
 
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index bd312b0..fdd395a 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -204,6 +204,13 @@ static inline void perf_mmap__write_tail(struct perf_mmap *md,
 	pc->data_tail = tail;
 }
 
+static inline bool perf_mmap__empty(struct perf_mmap *md)
+{
+	return perf_mmap__read_head(md) != md->prev;
+}
+
+void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
+
 bool perf_evlist__can_select_event(struct perf_evlist *evlist, const char *str);
 void perf_evlist__to_front(struct perf_evlist *evlist,
 			   struct perf_evsel *move_evsel);

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [RFC 00/14] perf pollfd v3
  2014-09-22 13:35           ` Jiri Olsa
@ 2014-09-22 14:49             ` Arnaldo Carvalho de Melo
  2014-09-22 14:51               ` Jiri Olsa
  0 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-22 14:49 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Don Zickus, Frederic Weisbecker, Ingo Molnar,
	Jean Pihet, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

Em Mon, Sep 22, 2014 at 03:35:26PM +0200, Jiri Olsa escreveu:
> On Thu, Sep 18, 2014 at 01:04:55PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Sep 11, 2014 at 06:36:38PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Done, updated it there:
> > 
> > https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=perf/fdarray.v4
> > 
> > Its on top of my last perf/core branch, i.e. with builtin-record.c using
> > perf_evlist__mmap_consume() and it and builtin-trace.c doing one last
> > mmap_read loop to consume what is left after all fds for a mmap are
> > closed.
> > 
> > I'll wait a bit before reposting, probably Jiri will not be able to
> > comment this week, but I would like to at least post the URL for this
> > latest v4 kit.
> > 
> > Adrian, if you could take a look at it, would be really great :-)
> 
> I checked the branch.. the last patch brakes the functionality
> of perf record for me

Breaks in what sense?
 
> I tried attached change instead.. but then I realized we dont
> actually need the final unmap for record command, because
> we always open 1 mmap for each CPU and then do the output ioctl
> for any other process we monitor for each CPU.. so all the mmaps
> stay until last process is dead or we exit
> 
> maybe we could have this code just to be complete,

This is to be complete, trying to find she simples setup + event loop
processing + teardown API.

> or I missed something ;-)

We'll both find it, eventually ;-)

- Arnaldo
 
> +++ b/tools/perf/builtin-record.c
> @@ -244,11 +244,15 @@ static int record__mmap_read_all(struct record *rec)
>  	int rc = 0;
>  
>  	for (i = 0; i < rec->evlist->nr_mmaps; i++) {
> -		if (rec->evlist->mmap[i].base) {
> -			if (record__mmap_read(rec, &rec->evlist->mmap[i]) != 0) {
> +		struct perf_mmap *md = &rec->evlist->mmap[i];
> +
> +		if (md->base) {
> +			if (record__mmap_read(rec, md) != 0) {
>  				rc = -1;
>  				goto out;
>  			}
> +			if (md->refcnt == 1 && perf_mmap__empty(md))
> +				perf_evlist__mmap_put(rec->evlist, i);

Humm, this should go into perf_evlist__mmap_consume(), is it not there?
I.e. record__mmap_read() will have called it, nope? Humm, it seems I
haven't rebased this on top of tip/perf/core, that has this at the end of
record__mmap_read():

        md->prev = old;
        perf_evlist__mmap_consume(rec->evlist, idx);

Sorry about that, but at least it matches what you did, no?

I just pushed a perf/fdarray.v5 with it.

There is one more thing I realised, that is we can't exit the loop like
I do in my series at these places:

  commit 593ee89818b3ab3493a9f368c1eb6b2e4c4a9512
    perf trace: Filter out POLLHUP'ed file descriptors

  commit 2478af9c299f4f7950b8d5d4bca2d2c71d661018
    perf record: Filter out POLLHUP'ed file descriptors


I need to reorder the patches so that this draining gets merged with the
test for entries in the poll fdarray to drop to zero.

The end result should be the equivalent to what is in perf/fdarray.v5,
but if we test right at those two last mentioned patches, we may lose
the last few events in a record or trace session.

- Arnaldo

> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 3cebc9a..82bab81 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -25,7 +25,6 @@
>  #include <linux/bitops.h>
>  #include <linux/hash.h>
>  
> -static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
>  static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx);
>  
>  #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
> @@ -676,17 +675,12 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
>  	return event;
>  }
>  
> -static bool perf_mmap__empty(struct perf_mmap *md)
> -{
> -	return perf_mmap__read_head(md) != md->prev;
> -}
> -
>  static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
>  {
>  	++evlist->mmap[idx].refcnt;
>  }
>  
> -static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
> +void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
>  {
>  	BUG_ON(evlist->mmap[idx].refcnt == 0);
>  
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index bd312b0..fdd395a 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -204,6 +204,13 @@ static inline void perf_mmap__write_tail(struct perf_mmap *md,
>  	pc->data_tail = tail;
>  }
>  
> +static inline bool perf_mmap__empty(struct perf_mmap *md)
> +{
> +	return perf_mmap__read_head(md) != md->prev;
> +}
> +
> +void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
> +
>  bool perf_evlist__can_select_event(struct perf_evlist *evlist, const char *str);
>  void perf_evlist__to_front(struct perf_evlist *evlist,
>  			   struct perf_evsel *move_evsel);

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC 00/14] perf pollfd v3
  2014-09-22 14:49             ` Arnaldo Carvalho de Melo
@ 2014-09-22 14:51               ` Jiri Olsa
  2014-09-22 21:10                 ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 48+ messages in thread
From: Jiri Olsa @ 2014-09-22 14:51 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Don Zickus, Frederic Weisbecker, Ingo Molnar,
	Jean Pihet, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

On Mon, Sep 22, 2014 at 11:49:49AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, Sep 22, 2014 at 03:35:26PM +0200, Jiri Olsa escreveu:
> > On Thu, Sep 18, 2014 at 01:04:55PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Thu, Sep 11, 2014 at 06:36:38PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > Done, updated it there:
> > > 
> > > https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=perf/fdarray.v4
> > > 
> > > Its on top of my last perf/core branch, i.e. with builtin-record.c using
> > > perf_evlist__mmap_consume() and it and builtin-trace.c doing one last
> > > mmap_read loop to consume what is left after all fds for a mmap are
> > > closed.
> > > 
> > > I'll wait a bit before reposting, probably Jiri will not be able to
> > > comment this week, but I would like to at least post the URL for this
> > > latest v4 kit.
> > > 
> > > Adrian, if you could take a look at it, would be really great :-)
> > 
> > I checked the branch.. the last patch brakes the functionality
> > of perf record for me
> 
> Breaks in what sense?

perf record -p pid

and killing the pid won't finish perf record

jirka

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC 00/14] perf pollfd v3
  2014-09-22 14:51               ` Jiri Olsa
@ 2014-09-22 21:10                 ` Arnaldo Carvalho de Melo
  2014-09-23  9:26                   ` Jiri Olsa
  0 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-22 21:10 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Don Zickus, Frederic Weisbecker, Ingo Molnar,
	Jean Pihet, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

Em Mon, Sep 22, 2014 at 04:51:33PM +0200, Jiri Olsa escreveu:
> On Mon, Sep 22, 2014 at 11:49:49AM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Sep 22, 2014 at 03:35:26PM +0200, Jiri Olsa escreveu:
> > > On Thu, Sep 18, 2014 at 01:04:55PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > Em Thu, Sep 11, 2014 at 06:36:38PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=perf/fdarray.v4

> > > > Its on top of my last perf/core branch, i.e. with builtin-record.c using
> > > > perf_evlist__mmap_consume() and it and builtin-trace.c doing one last
> > > > mmap_read loop to consume what is left after all fds for a mmap are
> > > > closed.

> > > > I'll wait a bit before reposting, probably Jiri will not be able to
> > > > comment this week, but I would like to at least post the URL for this
> > > > latest v4 kit.

> > > > Adrian, if you could take a look at it, would be really great :-)

> > > I checked the branch.. the last patch brakes the functionality
> > > of perf record for me

> > Breaks in what sense?

> perf record -p pid

> and killing the pid won't finish perf record

Should be ok now, after making sure record uses
perf_evlist__mmap_consume(), which does what you did in your proposed
fix, tried both killing it and letting it exit normally, i.e. a 'sleep 5s'
target.

Pushed
https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=perf/fdarray.v6
after moving the changesets for trace and record to be done last,
waiting for all fds to be POLLHUP'ed and doing one last draining
mmap_read loop over the mmaps.

- Arnaldo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC 00/14] perf pollfd v3
  2014-09-22 21:10                 ` Arnaldo Carvalho de Melo
@ 2014-09-23  9:26                   ` Jiri Olsa
  2014-09-23 12:46                     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 48+ messages in thread
From: Jiri Olsa @ 2014-09-23  9:26 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Don Zickus, Frederic Weisbecker, Ingo Molnar,
	Jean Pihet, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

On Mon, Sep 22, 2014 at 06:10:27PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, Sep 22, 2014 at 04:51:33PM +0200, Jiri Olsa escreveu:
> > On Mon, Sep 22, 2014 at 11:49:49AM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Mon, Sep 22, 2014 at 03:35:26PM +0200, Jiri Olsa escreveu:
> > > > On Thu, Sep 18, 2014 at 01:04:55PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > Em Thu, Sep 11, 2014 at 06:36:38PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=perf/fdarray.v4
> 
> > > > > Its on top of my last perf/core branch, i.e. with builtin-record.c using
> > > > > perf_evlist__mmap_consume() and it and builtin-trace.c doing one last
> > > > > mmap_read loop to consume what is left after all fds for a mmap are
> > > > > closed.
> 
> > > > > I'll wait a bit before reposting, probably Jiri will not be able to
> > > > > comment this week, but I would like to at least post the URL for this
> > > > > latest v4 kit.
> 
> > > > > Adrian, if you could take a look at it, would be really great :-)
> 
> > > > I checked the branch.. the last patch brakes the functionality
> > > > of perf record for me
> 
> > > Breaks in what sense?
> 
> > perf record -p pid
> 
> > and killing the pid won't finish perf record
> 
> Should be ok now, after making sure record uses
> perf_evlist__mmap_consume(), which does what you did in your proposed
> fix, tried both killing it and letting it exit normally, i.e. a 'sleep 5s'
> target.
> 
> Pushed
> https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=perf/fdarray.v6
> after moving the changesets for trace and record to be done last,
> waiting for all fds to be POLLHUP'ed and doing one last draining
> mmap_read loop over the mmaps.

One more thing is I noticed the test 34 is segfaulting, because
perf_evlist__munmap_filtered assumes priv pointer setup, which
is not done in the test.

otherwise it looks gut

jirka

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC 00/14] perf pollfd v3
  2014-09-23  9:26                   ` Jiri Olsa
@ 2014-09-23 12:46                     ` Arnaldo Carvalho de Melo
  2014-09-23 12:52                       ` Jiri Olsa
  0 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-23 12:46 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Don Zickus, Frederic Weisbecker, Ingo Molnar,
	Jean Pihet, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

Em Tue, Sep 23, 2014 at 11:26:45AM +0200, Jiri Olsa escreveu:
> On Mon, Sep 22, 2014 at 06:10:27PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Sep 22, 2014 at 04:51:33PM +0200, Jiri Olsa escreveu:
> > > On Mon, Sep 22, 2014 at 11:49:49AM -0300, Arnaldo Carvalho de Melo wrote:
> > > > Em Mon, Sep 22, 2014 at 03:35:26PM +0200, Jiri Olsa escreveu:
> > > > > On Thu, Sep 18, 2014 at 01:04:55PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > Em Thu, Sep 11, 2014 at 06:36:38PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > > https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=perf/fdarray.v4
> > 
> > > > > > Its on top of my last perf/core branch, i.e. with builtin-record.c using
> > > > > > perf_evlist__mmap_consume() and it and builtin-trace.c doing one last
> > > > > > mmap_read loop to consume what is left after all fds for a mmap are
> > > > > > closed.
> > 
> > > > > > I'll wait a bit before reposting, probably Jiri will not be able to
> > > > > > comment this week, but I would like to at least post the URL for this
> > > > > > latest v4 kit.
> > 
> > > > > > Adrian, if you could take a look at it, would be really great :-)
> > 
> > > > > I checked the branch.. the last patch brakes the functionality
> > > > > of perf record for me
> > 
> > > > Breaks in what sense?
> > 
> > > perf record -p pid
> > 
> > > and killing the pid won't finish perf record
> > 
> > Should be ok now, after making sure record uses
> > perf_evlist__mmap_consume(), which does what you did in your proposed
> > fix, tried both killing it and letting it exit normally, i.e. a 'sleep 5s'
> > target.
> > 
> > Pushed
> > https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=perf/fdarray.v6
> > after moving the changesets for trace and record to be done last,
> > waiting for all fds to be POLLHUP'ed and doing one last draining
> > mmap_read loop over the mmaps.
> 
> One more thing is I noticed the test 34 is segfaulting, because
> perf_evlist__munmap_filtered assumes priv pointer setup, which
> is not done in the test.

Yeah, forgot to update it wrt mmap refcounting.

But this brought to my attention the difficulty in debugging perf test
entries these days, because we do a fork + exec there, I tried using gdb
and I must be missing something, because I couldn't add a breakpoint
inside the tests :-\

I'll try with 'perf probe' instead of with gdb, probably will be a good
idea to try eating our own dog food, and overal, for just inserting some
printfs, better than using a full-blown debugger like gdb.

- Arnaldo
 
> otherwise it looks gut

Danke!

- Arnaldo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC 00/14] perf pollfd v3
  2014-09-23 12:46                     ` Arnaldo Carvalho de Melo
@ 2014-09-23 12:52                       ` Jiri Olsa
  0 siblings, 0 replies; 48+ messages in thread
From: Jiri Olsa @ 2014-09-23 12:52 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Adrian Hunter, Borislav Petkov, Corey Ashford,
	David Ahern, Don Zickus, Frederic Weisbecker, Ingo Molnar,
	Jean Pihet, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

On Tue, Sep 23, 2014 at 09:46:40AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Sep 23, 2014 at 11:26:45AM +0200, Jiri Olsa escreveu:
> > On Mon, Sep 22, 2014 at 06:10:27PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Mon, Sep 22, 2014 at 04:51:33PM +0200, Jiri Olsa escreveu:
> > > > On Mon, Sep 22, 2014 at 11:49:49AM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > Em Mon, Sep 22, 2014 at 03:35:26PM +0200, Jiri Olsa escreveu:
> > > > > > On Thu, Sep 18, 2014 at 01:04:55PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > > Em Thu, Sep 11, 2014 at 06:36:38PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > > > https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=perf/fdarray.v4
> > > 
> > > > > > > Its on top of my last perf/core branch, i.e. with builtin-record.c using
> > > > > > > perf_evlist__mmap_consume() and it and builtin-trace.c doing one last
> > > > > > > mmap_read loop to consume what is left after all fds for a mmap are
> > > > > > > closed.
> > > 
> > > > > > > I'll wait a bit before reposting, probably Jiri will not be able to
> > > > > > > comment this week, but I would like to at least post the URL for this
> > > > > > > latest v4 kit.
> > > 
> > > > > > > Adrian, if you could take a look at it, would be really great :-)
> > > 
> > > > > > I checked the branch.. the last patch brakes the functionality
> > > > > > of perf record for me
> > > 
> > > > > Breaks in what sense?
> > > 
> > > > perf record -p pid
> > > 
> > > > and killing the pid won't finish perf record
> > > 
> > > Should be ok now, after making sure record uses
> > > perf_evlist__mmap_consume(), which does what you did in your proposed
> > > fix, tried both killing it and letting it exit normally, i.e. a 'sleep 5s'
> > > target.
> > > 
> > > Pushed
> > > https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=perf/fdarray.v6
> > > after moving the changesets for trace and record to be done last,
> > > waiting for all fds to be POLLHUP'ed and doing one last draining
> > > mmap_read loop over the mmaps.
> > 
> > One more thing is I noticed the test 34 is segfaulting, because
> > perf_evlist__munmap_filtered assumes priv pointer setup, which
> > is not done in the test.
> 
> Yeah, forgot to update it wrt mmap refcounting.
> 
> But this brought to my attention the difficulty in debugging perf test
> entries these days, because we do a fork + exec there, I tried using gdb
> and I must be missing something, because I couldn't add a breakpoint
> inside the tests :-\

you need to run following gdb command:

(gdb) set follow-fork-mode child

jirka

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 01/14] perf evlist: Introduce perf_evlist__filter_pollfd method
  2014-09-25 21:57 [GIT PULL 00/14] perf tools polling fixes Arnaldo Carvalho de Melo
@ 2014-09-25 21:57 ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-09-25 21:57 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	David Ahern, Don Zickus, Frederic Weisbecker, Jiri Olsa,
	Mike Galbraith, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
	Stephane Eranian

From: Arnaldo Carvalho de Melo <acme@redhat.com>

To remove all entries in evlist->pollfd[] that have revents matching at
least one of the bits in the specified mask.

It'll adjust evlist->nr_fds to the number of unfiltered fds and will
return this value, as a convenience and to avoid requiring direct access
to internal state of perf_evlist objects.

This will be used after polling the evlist fds so that we remove fds
that were closed by the kernel.

Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-y2sca7z3wicvvy40a50lozwm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 21 +++++++++++++++++++++
 tools/perf/util/evlist.h |  2 ++
 2 files changed, 23 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index a3e28b49128a..023bc3873ae9 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -428,6 +428,27 @@ void perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 	evlist->nr_fds++;
 }
 
+int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mask)
+{
+	int fd, nr_fds = 0;
+
+	if (evlist->nr_fds == 0)
+		return 0;
+
+	for (fd = 0; fd < evlist->nr_fds; ++fd) {
+		if (evlist->pollfd[fd].revents & revents_and_mask)
+			continue;
+
+		if (fd != nr_fds)
+			evlist->pollfd[nr_fds] = evlist->pollfd[fd];
+
+		++nr_fds;
+	}
+
+	evlist->nr_fds = nr_fds;
+	return nr_fds;
+}
+
 static void perf_evlist__id_hash(struct perf_evlist *evlist,
 				 struct perf_evsel *evsel,
 				 int cpu, int thread, u64 id)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 106de53a6a74..1082420951f9 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -84,6 +84,8 @@ void perf_evlist__id_add(struct perf_evlist *evlist, struct perf_evsel *evsel,
 
 void perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd);
 
+int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mask);
+
 struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist *evlist, u64 id);
 
 struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id);
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2014-09-25 21:58 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-10 14:08 [RFC 00/14] perf pollfd v3 Arnaldo Carvalho de Melo
2014-09-10 14:08 ` [PATCH 01/14] perf evlist: Introduce perf_evlist__filter_pollfd method Arnaldo Carvalho de Melo
2014-09-10 14:08 ` [PATCH 02/14] perf tests: Add test for perf_evlist__filter_pollfd() Arnaldo Carvalho de Melo
2014-09-10 14:08 ` [PATCH 03/14] perf evlist: Monitor POLLERR and POLLHUP events too Arnaldo Carvalho de Melo
2014-09-10 14:08 ` [PATCH 04/14] perf evlist: We need to poll all event file descriptors Arnaldo Carvalho de Melo
2014-09-10 14:08 ` [PATCH 05/14] perf record: Filter out POLLHUP'ed " Arnaldo Carvalho de Melo
2014-09-10 14:08 ` [PATCH 06/14] perf trace: " Arnaldo Carvalho de Melo
2014-09-10 14:08 ` [PATCH 07/14] perf evlist: Allow growing pollfd on add method Arnaldo Carvalho de Melo
2014-09-10 14:08 ` [PATCH 08/14] perf tests: Add pollfd growing test Arnaldo Carvalho de Melo
2014-09-10 14:08 ` [PATCH 09/14] perf kvm stat live: Use perf_evlist__add_pollfd() instead of local equivalent Arnaldo Carvalho de Melo
2014-09-10 14:08 ` [PATCH 10/14] perf evlist: Introduce poll method for common code idiom Arnaldo Carvalho de Melo
2014-09-10 14:08 ` [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist Arnaldo Carvalho de Melo
2014-09-11 15:09   ` [PATCH] tools lib fd array: Do not set fd as non blocking evlist Jiri Olsa
2014-09-11 15:27     ` Arnaldo Carvalho de Melo
2014-09-11 15:53       ` Arnaldo Carvalho de Melo
2014-09-12 12:58   ` [PATCH 11/14] tools lib api: Adopt fdarray class from perf's evlist Jiri Olsa
2014-09-12 13:44     ` Arnaldo Carvalho de Melo
2014-09-12 14:16       ` Jiri Olsa
2014-09-12 14:22         ` Arnaldo Carvalho de Melo
2014-09-12 16:54           ` Borislav Petkov
2014-09-12 20:48             ` Arnaldo Carvalho de Melo
2014-09-12 22:12               ` Borislav Petkov
2014-09-22 12:29   ` Jiri Olsa
2014-09-10 14:08 ` [PATCH 12/14] perf evlist: Refcount mmaps Arnaldo Carvalho de Melo
2014-09-10 14:08 ` [PATCH 13/14] tools lib fd array: Allow associating an integer cookie with each entry Arnaldo Carvalho de Melo
2014-09-11 10:33   ` Jiri Olsa
2014-09-11 13:29     ` Arnaldo Carvalho de Melo
2014-09-11 14:59       ` Jiri Olsa
2014-09-11 15:23         ` Arnaldo Carvalho de Melo
2014-09-11 15:35           ` Jiri Olsa
2014-09-11 15:49             ` Arnaldo Carvalho de Melo
2014-09-11 16:07               ` Jiri Olsa
2014-09-10 14:08 ` [PATCH 14/14] perf evlist: Unmap ring buffer when fd is nuked Arnaldo Carvalho de Melo
2014-09-11 12:27   ` Jiri Olsa
2014-09-11 13:40     ` Arnaldo Carvalho de Melo
2014-09-11 11:33 ` [RFC 00/14] perf pollfd v3 Jiri Olsa
2014-09-11 11:48   ` Jiri Olsa
2014-09-11 13:30     ` Arnaldo Carvalho de Melo
2014-09-11 21:36       ` Arnaldo Carvalho de Melo
2014-09-18 16:04         ` Arnaldo Carvalho de Melo
2014-09-22 13:35           ` Jiri Olsa
2014-09-22 14:49             ` Arnaldo Carvalho de Melo
2014-09-22 14:51               ` Jiri Olsa
2014-09-22 21:10                 ` Arnaldo Carvalho de Melo
2014-09-23  9:26                   ` Jiri Olsa
2014-09-23 12:46                     ` Arnaldo Carvalho de Melo
2014-09-23 12:52                       ` Jiri Olsa
2014-09-25 21:57 [GIT PULL 00/14] perf tools polling fixes Arnaldo Carvalho de Melo
2014-09-25 21:57 ` [PATCH 01/14] perf evlist: Introduce perf_evlist__filter_pollfd method Arnaldo Carvalho de Melo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).