linux-kernel.vger.kernel.org archive mirror
* [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes
@ 2020-06-03 15:47 Alexey Budankov
  2020-06-03 15:52 ` [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors Alexey Budankov
                   ` (13 more replies)
  0 siblings, 14 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-03 15:47 UTC
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, Andi Kleen, linux-kernel


Changes in v7:
- added missing perf-record.txt changes
- adjusted docs wording for --ctl-fd,--ctl-fd-ack options
  to additionally mention the effect of --delay=-1

v6: https://lore.kernel.org/lkml/f8e3a714-d9b1-4647-e1d2-9981cbaa83ec@linux.intel.com/

Changes in v6:
- split re-factoring of events handling loops for stat mode
  into smaller incremental parts
- added parts missing in v5
- corrected v5 runtime issues

v5: https://lore.kernel.org/lkml/e5cac8dd-7aa4-ec7c-671c-07756907acba@linux.intel.com/

Changes in v5:
- split re-factoring of events handling loops for stat mode
  into smaller incremental parts

v4: https://lore.kernel.org/lkml/653fe5f3-c986-a841-1ed8-0a7d2fa24c00@linux.intel.com/

Changes in v4:
- made checking of ctlfd state unconditional in record trace streaming loop
- introduced static poll fds to keep evlist__filter_pollfd() unaffected
- handled return code of evlist__initialize_ctlfd() where needed
- renamed and structured handle_events() function
- applied anonymous structs where needed

v3: https://lore.kernel.org/lkml/eb38e9e5-754f-d410-1d9b-e26b702d51b7@linux.intel.com/

Changes in v3:
- renamed functions and types from perf_evlist_ to evlist_ to avoid
  clash with libperf code;
- extended commands to be strings of variable length consisting of
  command name and also possibly including command specific data;
- merged docs update with the code changes;
- updated docs for -D,--delay=-1 option for stat and record modes;

v2: https://lore.kernel.org/lkml/d582cc3d-2302-c7e2-70d3-bc7ab6f628c3@linux.intel.com/

Changes in v2:
- renamed resume and pause commands to enable and disable ones, renamed
  CTL_CMD_RESUME and CTL_CMD_PAUSE to CTL_CMD_ENABLE and CTL_CMD_DISABLE
  to fit to the appropriate ioctls and avoid mixing up with PAUSE_OUTPUT
  ioctl;
- factored out event handling loop into a handle_events() for stat mode;
- separated -D,--delay=-1 into separate patches for stat and record modes;

v1: https://lore.kernel.org/lkml/825a5132-b58d-c0b6-b050-5a6040386ec7@linux.intel.com/

repo: tip of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core

The patch set implements handling of 'start disabled', 'enable' and 'disable'
external control commands, which can be provided to the tool in stat and
record modes by an external controlling process. The 'start disabled' command
postpones enabling of events at the beginning of a monitoring session. The
'enable' and 'disable' commands enable and disable events, respectively, at
any time after the start of the session.

The 'start disabled', 'enable' and 'disable' external control commands can be
used to focus measurement on selected time intervals of workload execution.
Because measurement and data capture happen only during the intervals of
interest, focused measurement reduces tool intrusion and influence on
workload behavior, reduces distortion and the amount of collected and stored
data, and mitigates loss of data accuracy.

A controlling process can be a bash shell script [1], a native executable or
a program in any other language that can work directly with file descriptors,
e.g. pipes [2], and spawn a process, in particular the tool itself.

The -D,--delay <val> option is extended with the value -1, which skips
enabling of events at the beginning of a monitoring session (the 'start
disabled' command). The --ctl-fd and --ctl-fd-ack command line options are
introduced to provide the tool with a pair of file descriptors: one to listen
on for control commands, and one to reply on to the controlling process when
a received command has completed.

The tool reads a control command message from the ctl-fd descriptor, handles
the command and, if the ctl-fd-ack descriptor is specified on the command
line, replies with an acknowledgement message on it. The 'enable' command is
recognized as the string 'enable' and the 'disable' command as the string
'disable', both received from the ctl-fd descriptor. The completion message
is 'ack\n' and is sent to the ctl-fd-ack descriptor.
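
For a native controlling process, the exchange described above could look
roughly like the following C sketch. send_ctl_cmd() is a hypothetical helper,
not part of this patch set. It assumes both descriptors are blocking pipe or
FIFO ends and that a command fits in a single write. It skips NUL bytes in the
reply because evlist__ctlfd_ack() in patch 03 writes
sizeof(EVLIST_CTL_CMD_ACK_TAG) bytes, which includes the string terminator.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch set): send one control
 * command to perf over ctl_fd and wait for the acknowledgement on ack_fd.
 * Returns 0 once an "ack" line has been read, -1 on I/O error, EOF or an
 * unexpected reply.
 */
static int send_ctl_cmd(int ctl_fd, int ack_fd, const char *cmd)
{
	char line[16];
	size_t len = 0;
	char c;

	assert(cmd != NULL);

	/* Commands are newline-terminated strings, as with echo in bash. */
	if (write(ctl_fd, cmd, strlen(cmd)) < 0 || write(ctl_fd, "\n", 1) < 0)
		return -1;

	/* Read the reply byte by byte up to the terminating newline. */
	while (len < sizeof(line) - 1) {
		ssize_t n = read(ack_fd, &c, 1);

		if (n <= 0)
			return -1;	/* error, or the writer closed the pipe */
		if (c == '\0')
			continue;	/* tolerate NUL bytes in the stream */
		if (c == '\n')
			break;
		line[len++] = c;
	}
	line[len] = '\0';

	return strcmp(line, "ack") == 0 ? 0 : -1;
}
```

A caller would open the two FIFOs, start perf with --ctl-fd and --ctl-fd-ack
pointing at them, and then call send_ctl_cmd(ctl_fd, ack_fd, "enable") and
send_ctl_cmd(ctl_fd, ack_fd, "disable") around the interval of interest.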

An example bash script demonstrating a simple use case follows:

#!/bin/bash

ctl_dir=/tmp/

ctl_fifo=${ctl_dir}perf_ctl.fifo
test -p ${ctl_fifo} && unlink ${ctl_fifo}
mkfifo ${ctl_fifo} && exec {ctl_fd}<>${ctl_fifo}

ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
mkfifo ${ctl_ack_fifo} && exec {ctl_fd_ack}<>${ctl_ack_fifo}

perf stat -D -1 -e cpu-cycles -a -I 1000                \
          --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
          -- sleep 40 &
perf_pid=$!

sleep 5  && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
sleep 5  && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e2 && echo "enabled(${e2})"
sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d2 && echo "disabled(${d2})"

exec {ctl_fd_ack}>&- && unlink ${ctl_ack_fifo}
exec {ctl_fd}>&- && unlink ${ctl_fifo}

wait -n ${perf_pid}
exit $?


Script output:

[root@host dir]# example
Events disabled
#           time             counts unit events
     1.001101062      <not counted>      cpu-cycles                                                  
     2.002994944      <not counted>      cpu-cycles                                                  
     3.004864340      <not counted>      cpu-cycles                                                  
     4.006727177      <not counted>      cpu-cycles                                                  
Events enabled
enabled(ack)
     4.993808464          3,124,246      cpu-cycles                                                  
     5.008597004          3,325,624      cpu-cycles                                                  
     6.010387483         83,472,992      cpu-cycles                                                  
     7.012266598         55,877,621      cpu-cycles                                                  
     8.014175695         97,892,729      cpu-cycles                                                  
     9.016056093         68,461,242      cpu-cycles                                                  
    10.017937507         55,449,643      cpu-cycles                                                  
    11.019830154         68,938,167      cpu-cycles                                                  
    12.021719952         55,164,101      cpu-cycles                                                  
    13.023627550         70,535,720      cpu-cycles                                                  
    14.025580995         53,240,125      cpu-cycles                                                  
disabled(ack)
    14.997518260         53,558,068      cpu-cycles                                                  
Events disabled
    15.027216416      <not counted>      cpu-cycles                                                  
    16.029052729      <not counted>      cpu-cycles                                                  
    17.030904762      <not counted>      cpu-cycles                                                  
    18.032073424      <not counted>      cpu-cycles                                                  
    19.033805074      <not counted>      cpu-cycles                                                  
Events enabled
enabled(ack)
    20.001279097          3,021,022      cpu-cycles                                                  
    20.035044381          6,434,367      cpu-cycles                                                  
    21.036923813         89,358,251      cpu-cycles                                                  
    22.038825169         72,516,351      cpu-cycles                                                  
#           time             counts unit events
    23.040715596         55,046,157      cpu-cycles                                                  
    24.042643757         78,128,649      cpu-cycles                                                  
    25.044558535         61,052,428      cpu-cycles                                                  
    26.046452785         62,142,806      cpu-cycles                                                  
    27.048353021         74,477,971      cpu-cycles                                                  
    28.050241286         61,001,623      cpu-cycles                                                  
    29.052149961         61,653,502      cpu-cycles                                                  
disabled(ack)
    30.004980264         82,729,640      cpu-cycles                                                  
Events disabled
    30.053516176      <not counted>      cpu-cycles                                                  
    31.055348366      <not counted>      cpu-cycles                                                  
    32.057202097      <not counted>      cpu-cycles                                                  
    33.059040702      <not counted>      cpu-cycles                                                  
    34.060843288      <not counted>      cpu-cycles                                                  
    35.000888624      <not counted>      cpu-cycles                                                  
[root@host dir]# 

[1] http://man7.org/linux/man-pages/man1/bash.1.html
[2] http://man7.org/linux/man-pages/man2/pipe.2.html

---
Alexey Budankov (13):
  tools/libperf: introduce notion of static polled file descriptors
  perf evlist: introduce control file descriptors
  perf evlist: implement control command handling functions
  perf stat: factor out body of event handling loop for system wide
  perf stat: move target check to loop control statement
  perf stat: factor out body of event handling loop for fork case
  perf stat: factor out event handling loop into dispatch_events()
  perf stat: extend -D,--delay option with -1 value
  perf stat: implement control commands handling
  perf stat: introduce --ctl-fd[-ack] options
  perf record: extend -D,--delay option with -1 value
  perf record: implement control commands handling
  perf record: introduce --ctl-fd[-ack] options

 tools/lib/api/fd/array.c                 |  42 ++++++-
 tools/lib/api/fd/array.h                 |   7 ++
 tools/lib/perf/evlist.c                  |  11 ++
 tools/lib/perf/include/internal/evlist.h |   2 +
 tools/perf/Documentation/perf-record.txt |  45 ++++++-
 tools/perf/Documentation/perf-stat.txt   |  45 ++++++-
 tools/perf/builtin-record.c              |  38 +++++-
 tools/perf/builtin-stat.c                | 149 +++++++++++++++++------
 tools/perf/builtin-trace.c               |   2 +-
 tools/perf/util/evlist.c                 | 131 ++++++++++++++++++++
 tools/perf/util/evlist.h                 |  25 ++++
 tools/perf/util/record.h                 |   4 +-
 tools/perf/util/stat.h                   |   4 +-
 13 files changed, 455 insertions(+), 50 deletions(-)

-- 
2.24.1



* [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-03 15:47 [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes Alexey Budankov
@ 2020-06-03 15:52 ` Alexey Budankov
  2020-06-05 10:50   ` Jiri Olsa
  2020-06-03 15:53 ` [PATCH v7 02/13] perf evlist: introduce control " Alexey Budankov
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 44+ messages in thread
From: Alexey Budankov @ 2020-06-03 15:52 UTC
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, Andi Kleen, linux-kernel


Implement fdarray__add_stat(), which adds file descriptors to the
fixed-size (currently 1) stat_entries array located in struct fdarray.
During an fdarray__poll() call, append the added descriptors to the
array passed to the poll() syscall, then copy the poll() results for
those descriptors back to stat_entries for later analysis.
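
The append, poll and copy-back sequence can be illustrated with a
simplified, self-contained model. This is only a sketch: struct
mini_fdarray and the mini_*() functions are hypothetical stand-ins for
struct fdarray and its helpers, with fixed-size arrays instead of the
growable entries array and a local index map instead of priv[].idx.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

#define DYN_MAX  8	/* capacity for regular entries (illustrative) */
#define STAT_MAX 1	/* mirrors FDARRAY__STAT_ENTRIES_MAX in the patch */

/* Simplified stand-in for struct fdarray; all names are hypothetical. */
struct mini_fdarray {
	int nr;					/* regular entries in use */
	struct pollfd entries[DYN_MAX + STAT_MAX];
	int nr_stat;				/* static entries in use */
	struct pollfd stat_entries[STAT_MAX];
};

static void mini_init(struct mini_fdarray *fda)
{
	int i;

	fda->nr = 0;
	fda->nr_stat = 0;
	for (i = 0; i < STAT_MAX; i++) {
		fda->stat_entries[i].fd = -1;
		fda->stat_entries[i].events = 0;
		fda->stat_entries[i].revents = 0;
	}
}

static int mini_add_stat(struct mini_fdarray *fda, int fd, short events)
{
	int pos;

	assert(fda != NULL);
	pos = fda->nr_stat;
	if (pos >= STAT_MAX)
		return -1;	/* no free static slot */

	fda->stat_entries[pos].fd = fd;
	fda->stat_entries[pos].events = events;
	fda->stat_entries[pos].revents = 0;
	fda->nr_stat++;
	return pos;
}

static int mini_poll(struct mini_fdarray *fda, int timeout)
{
	int nr = fda->nr, i, res;
	int src[STAT_MAX];	/* src[k]: static slot behind appended entry k */

	assert(nr <= DYN_MAX);

	/* Temporarily append the active static entries after the regular ones. */
	for (i = 0; i < fda->nr_stat; i++) {
		if (fda->stat_entries[i].fd == -1)
			continue;
		src[fda->nr - nr] = i;
		fda->entries[fda->nr++] = fda->stat_entries[i];
	}

	res = poll(fda->entries, fda->nr, timeout);

	/* Copy results (including revents) back, then drop the appended tail. */
	for (i = nr; i < fda->nr; i++)
		fda->stat_entries[src[i - nr]] = fda->entries[i];
	fda->nr = nr;

	return res;
}

/* Demo: returns 1 if a readable pipe registered as static reports POLLIN. */
static int mini_demo(void)
{
	struct mini_fdarray fda;
	int p[2], ok;

	mini_init(&fda);
	if (pipe(p) || mini_add_stat(&fda, p[0], POLLIN) < 0)
		return -1;
	if (write(p[1], "x", 1) != 1)
		return -1;

	ok = mini_poll(&fda, 0) == 1 &&
	     (fda.stat_entries[0].revents & POLLIN) && fda.nr == 0;
	close(p[0]);
	close(p[1]);
	return ok;
}
```

Because the static entries live outside the regular array and are only
appended for the duration of the poll() call, code that filters or
compacts the regular entries never sees them.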

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/lib/api/fd/array.c                 | 42 +++++++++++++++++++++++-
 tools/lib/api/fd/array.h                 |  7 ++++
 tools/lib/perf/evlist.c                  | 11 +++++++
 tools/lib/perf/include/internal/evlist.h |  2 ++
 4 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
index 58d44d5eee31..b0027f2169c7 100644
--- a/tools/lib/api/fd/array.c
+++ b/tools/lib/api/fd/array.c
@@ -11,10 +11,16 @@
 
 void fdarray__init(struct fdarray *fda, int nr_autogrow)
 {
+	int i;
+
 	fda->entries	 = NULL;
 	fda->priv	 = NULL;
 	fda->nr		 = fda->nr_alloc = 0;
 	fda->nr_autogrow = nr_autogrow;
+
+	fda->nr_stat = 0;
+	for (i = 0; i < FDARRAY__STAT_ENTRIES_MAX; i++)
+		fda->stat_entries[i].fd = -1;
 }
 
 int fdarray__grow(struct fdarray *fda, int nr)
@@ -83,6 +89,20 @@ int fdarray__add(struct fdarray *fda, int fd, short revents)
 	return pos;
 }
 
+int fdarray__add_stat(struct fdarray *fda, int fd, short revents)
+{
+	int pos = fda->nr_stat;
+
+	if (pos >= FDARRAY__STAT_ENTRIES_MAX)
+		return -1;
+
+	fda->stat_entries[pos].fd = fd;
+	fda->stat_entries[pos].events = revents;
+	fda->nr_stat++;
+
+	return pos;
+}
+
 int fdarray__filter(struct fdarray *fda, short revents,
 		    void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
 		    void *arg)
@@ -113,7 +133,27 @@ int fdarray__filter(struct fdarray *fda, short revents,
 
 int fdarray__poll(struct fdarray *fda, int timeout)
 {
-	return poll(fda->entries, fda->nr, timeout);
+	int nr, i, pos, res;
+
+	nr = fda->nr;
+
+	for (i = 0; i < fda->nr_stat; i++) {
+		if (fda->stat_entries[i].fd != -1) {
+			pos = fdarray__add(fda, fda->stat_entries[i].fd,
+					   fda->stat_entries[i].events);
+			if (pos >= 0)
+				fda->priv[pos].idx = i;
+		}
+	}
+
+	res = poll(fda->entries, fda->nr, timeout);
+
+	for (i = nr; i < fda->nr; i++)
+		fda->stat_entries[fda->priv[i].idx] = fda->entries[i];
+
+	fda->nr = nr;
+
+	return res;
 }
 
 int fdarray__fprintf(struct fdarray *fda, FILE *fp)
diff --git a/tools/lib/api/fd/array.h b/tools/lib/api/fd/array.h
index b39557d1a88f..9bca72e80b09 100644
--- a/tools/lib/api/fd/array.h
+++ b/tools/lib/api/fd/array.h
@@ -3,6 +3,7 @@
 #define __API_FD_ARRAY__
 
 #include <stdio.h>
+#include <poll.h>
 
 struct pollfd;
 
@@ -16,6 +17,9 @@ struct pollfd;
  *	  I.e. using 'fda->priv[N].idx = * value' where N < fda->nr is ok,
  *	  but doing 'fda->priv = malloc(M)' is not allowed.
  */
+
+#define FDARRAY__STAT_ENTRIES_MAX	1
+
 struct fdarray {
 	int	       nr;
 	int	       nr_alloc;
@@ -25,6 +29,8 @@ struct fdarray {
 		int    idx;
 		void   *ptr;
 	} *priv;
+	int	       nr_stat;
+	struct pollfd  stat_entries[FDARRAY__STAT_ENTRIES_MAX];
 };
 
 void fdarray__init(struct fdarray *fda, int nr_autogrow);
@@ -34,6 +40,7 @@ struct fdarray *fdarray__new(int nr_alloc, int nr_autogrow);
 void fdarray__delete(struct fdarray *fda);
 
 int fdarray__add(struct fdarray *fda, int fd, short revents);
+int fdarray__add_stat(struct fdarray *fda, int fd, short revents);
 int fdarray__poll(struct fdarray *fda, int timeout);
 int fdarray__filter(struct fdarray *fda, short revents,
 		    void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 6a875a0f01bb..e68e4c08e7c2 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -317,6 +317,17 @@ int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd,
 	return pos;
 }
 
+int perf_evlist__add_pollfd_stat(struct perf_evlist *evlist, int fd,
+			         short revent)
+{
+	int pos = fdarray__add_stat(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
+
+	if (pos >= 0)
+		fcntl(fd, F_SETFL, O_NONBLOCK);
+
+	return pos;
+}
+
 static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd,
 					 void *arg __maybe_unused)
 {
diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
index 74dc8c3f0b66..2b3b4518c05e 100644
--- a/tools/lib/perf/include/internal/evlist.h
+++ b/tools/lib/perf/include/internal/evlist.h
@@ -46,6 +46,8 @@ struct perf_evlist_mmap_ops {
 int perf_evlist__alloc_pollfd(struct perf_evlist *evlist);
 int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd,
 			    void *ptr, short revent);
+int perf_evlist__add_pollfd_stat(struct perf_evlist *evlist, int fd,
+			         short revent);
 
 int perf_evlist__mmap_ops(struct perf_evlist *evlist,
 			  struct perf_evlist_mmap_ops *ops,
-- 
2.24.1



* [PATCH v7 02/13] perf evlist: introduce control file descriptors
  2020-06-03 15:47 [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes Alexey Budankov
  2020-06-03 15:52 ` [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors Alexey Budankov
@ 2020-06-03 15:53 ` Alexey Budankov
  2020-06-03 15:54 ` [PATCH v7 03/13] perf evlist: implement control command handling functions Alexey Budankov
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-03 15:53 UTC
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, Andi Kleen, linux-kernel


Define and initialize control file descriptors.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/util/evlist.c | 3 +++
 tools/perf/util/evlist.h | 5 +++++
 2 files changed, 8 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 173b4f0e0e6e..47541b5cab46 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -63,6 +63,9 @@ void evlist__init(struct evlist *evlist, struct perf_cpu_map *cpus,
 	perf_evlist__set_maps(&evlist->core, cpus, threads);
 	evlist->workload.pid = -1;
 	evlist->bkw_mmap_state = BKW_MMAP_NOTREADY;
+	evlist->ctl_fd.fd = -1;
+	evlist->ctl_fd.ack = -1;
+	evlist->ctl_fd.pos = -1;
 }
 
 struct evlist *evlist__new(void)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index b6f325dfb4d2..0d8b361f1c8e 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -74,6 +74,11 @@ struct evlist {
 		pthread_t		th;
 		volatile int		done;
 	} thread;
+	struct {
+		int	fd;
+		int	ack;
+		int	pos;
+	} ctl_fd;
 };
 
 struct evsel_str_handler {
-- 
2.24.1



* [PATCH v7 03/13] perf evlist: implement control command handling functions
  2020-06-03 15:47 [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes Alexey Budankov
  2020-06-03 15:52 ` [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors Alexey Budankov
  2020-06-03 15:53 ` [PATCH v7 02/13] perf evlist: introduce control " Alexey Budankov
@ 2020-06-03 15:54 ` Alexey Budankov
  2020-06-23 14:54   ` Jiri Olsa
  2020-06-03 15:55 ` [PATCH v7 04/13] perf stat: factor out body of event handling loop for system wide Alexey Budankov
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 44+ messages in thread
From: Alexey Budankov @ 2020-06-03 15:54 UTC
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, Andi Kleen, linux-kernel


Implement functions for initialization, finalization and processing
of control command messages coming from control file descriptors.
Register the control file descriptor as a static descriptor in the
pollfd fdarray of evsel_list using perf_evlist__add_pollfd_stat().
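
The received message is matched against the command tags by prefix
comparison, so a command name followed by command specific data is still
recognized. A minimal sketch of that matching step follows, using
hypothetical names (enum ctl_cmd, parse_ctl_cmd()) rather than the ones
in the patch. Returning the unsupported value explicitly when nothing
matches is a choice made for this sketch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical command codes for this sketch only. */
enum ctl_cmd {
	CTL_CMD_UNSUPPORTED = 0,
	CTL_CMD_ENABLE,
	CTL_CMD_DISABLE
};

/* Map a received message (newline already stripped) to a command code. */
static enum ctl_cmd parse_ctl_cmd(const char *msg)
{
	assert(msg != NULL);

	/* Prefix match: anything after the tag is treated as command data. */
	if (!strncmp(msg, "enable", strlen("enable")))
		return CTL_CMD_ENABLE;
	if (!strncmp(msg, "disable", strlen("disable")))
		return CTL_CMD_DISABLE;

	return CTL_CMD_UNSUPPORTED;
}
```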

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/util/evlist.c | 128 +++++++++++++++++++++++++++++++++++++++
 tools/perf/util/evlist.h |  17 ++++++
 2 files changed, 145 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 47541b5cab46..fbd98f741af9 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1718,3 +1718,131 @@ struct evsel *perf_evlist__reset_weak_group(struct evlist *evsel_list,
 	}
 	return leader;
 }
+
+int evlist__initialize_ctlfd(struct evlist *evlist, int fd, int ack)
+{
+	if (fd == -1) {
+		pr_debug("Control descriptor is not initialized\n");
+		return 0;
+	}
+
+	evlist->ctl_fd.pos = perf_evlist__add_pollfd_stat(&evlist->core, fd, POLLIN);
+	if (evlist->ctl_fd.pos < 0) {
+		evlist->ctl_fd.pos = -1;
+		pr_err("Failed to add ctl fd entry: %m\n");
+		return -1;
+	}
+
+	evlist->ctl_fd.fd = fd;
+	evlist->ctl_fd.ack = ack;
+
+	return 0;
+}
+
+int evlist__finalize_ctlfd(struct evlist *evlist)
+{
+	if (evlist->ctl_fd.pos == -1)
+		return 0;
+
+	evlist->core.pollfd.stat_entries[evlist->ctl_fd.pos].fd = -1;
+	evlist->ctl_fd.pos = -1;
+	evlist->ctl_fd.ack = -1;
+	evlist->ctl_fd.fd = -1;
+
+	return 0;
+}
+
+static int evlist__ctlfd_recv(struct evlist *evlist, enum evlist_ctl_cmd *cmd,
+			      char *cmd_data, size_t data_size)
+{
+	int err;
+	char c;
+	size_t bytes_read = 0;
+
+	memset(cmd_data, 0, data_size--);
+
+	do {
+		err = read(evlist->ctl_fd.fd, &c, 1);
+		if (err > 0) {
+			if (c == '\n' || c == '\0')
+				break;
+			cmd_data[bytes_read++] = c;
+			if (bytes_read == data_size)
+				break;
+		} else {
+			if (err == -1)
+				pr_err("Failed to read from ctlfd %d: %m\n", evlist->ctl_fd.fd);
+			break;
+		}
+	} while (1);
+
+	pr_debug("Message from ctl_fd: \"%s%s\"\n", cmd_data,
+		 bytes_read == data_size ? "" : c == '\n' ? "\\n" : "\\0");
+
+	if (err > 0) {
+		if (!strncmp(cmd_data, EVLIST_CTL_CMD_ENABLE_TAG,
+			     strlen(EVLIST_CTL_CMD_ENABLE_TAG))) {
+			*cmd = EVLIST_CTL_CMD_ENABLE;
+		} else if (!strncmp(cmd_data, EVLIST_CTL_CMD_DISABLE_TAG,
+				    strlen(EVLIST_CTL_CMD_DISABLE_TAG))) {
+			*cmd = EVLIST_CTL_CMD_DISABLE;
+		}
+	}
+
+	return err;
+}
+
+static int evlist__ctlfd_ack(struct evlist *evlist)
+{
+	int err;
+
+	if (evlist->ctl_fd.ack == -1)
+		return 0;
+
+	err = write(evlist->ctl_fd.ack, EVLIST_CTL_CMD_ACK_TAG,
+		    sizeof(EVLIST_CTL_CMD_ACK_TAG));
+	if (err == -1)
+		pr_err("failed to write to ctl_ack_fd %d: %m\n", evlist->ctl_fd.ack);
+
+	return err;
+}
+
+int evlist__ctlfd_process(struct evlist *evlist, enum evlist_ctl_cmd *cmd)
+{
+	int err = 0;
+	char cmd_data[EVLIST_CTL_CMD_MAX_LEN];
+	int ctlfd_pos = evlist->ctl_fd.pos;
+	struct pollfd *stat_entries = evlist->core.pollfd.stat_entries;
+
+	if (ctlfd_pos == -1 || !stat_entries[ctlfd_pos].revents)
+		return 0;
+
+	if (stat_entries[ctlfd_pos].revents & POLLIN) {
+		err = evlist__ctlfd_recv(evlist, cmd, cmd_data,
+					 EVLIST_CTL_CMD_MAX_LEN);
+		if (err > 0) {
+			switch (*cmd) {
+			case EVLIST_CTL_CMD_ENABLE:
+				evlist__enable(evlist);
+				break;
+			case EVLIST_CTL_CMD_DISABLE:
+				evlist__disable(evlist);
+				break;
+			case EVLIST_CTL_CMD_ACK:
+			case EVLIST_CTL_CMD_UNSUPPORTED:
+			default:
+				pr_debug("ctlfd: unsupported %d\n", *cmd);
+				break;
+			}
+			if (!(*cmd == EVLIST_CTL_CMD_ACK || *cmd == EVLIST_CTL_CMD_UNSUPPORTED))
+				evlist__ctlfd_ack(evlist);
+		}
+	}
+
+	if (stat_entries[ctlfd_pos].revents & (POLLHUP | POLLERR))
+		evlist__finalize_ctlfd(evlist);
+	else
+		stat_entries[ctlfd_pos].revents = 0;
+
+	return err;
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 0d8b361f1c8e..bccf0a970371 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -360,4 +360,21 @@ void perf_evlist__force_leader(struct evlist *evlist);
 struct evsel *perf_evlist__reset_weak_group(struct evlist *evlist,
 						 struct evsel *evsel,
 						bool close);
+#define EVLIST_CTL_CMD_ENABLE_TAG  "enable"
+#define EVLIST_CTL_CMD_DISABLE_TAG "disable"
+#define EVLIST_CTL_CMD_ACK_TAG     "ack\n"
+
+#define EVLIST_CTL_CMD_MAX_LEN 64
+
+enum evlist_ctl_cmd {
+	EVLIST_CTL_CMD_UNSUPPORTED = 0,
+	EVLIST_CTL_CMD_ENABLE,
+	EVLIST_CTL_CMD_DISABLE,
+	EVLIST_CTL_CMD_ACK
+};
+
+int evlist__initialize_ctlfd(struct evlist *evlist, int ctl_fd, int ctl_fd_ack);
+int evlist__finalize_ctlfd(struct evlist *evlist);
+int evlist__ctlfd_process(struct evlist *evlist, enum evlist_ctl_cmd *cmd);
+
 #endif /* __PERF_EVLIST_H */
-- 
2.24.1



* [PATCH v7 04/13] perf stat: factor out body of event handling loop for system wide
  2020-06-03 15:47 [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes Alexey Budankov
                   ` (2 preceding siblings ...)
  2020-06-03 15:54 ` [PATCH v7 03/13] perf evlist: implement control command handling functions Alexey Budankov
@ 2020-06-03 15:55 ` Alexey Budankov
  2020-06-03 15:56 ` [PATCH v7 05/13] perf stat: move target check to loop control statement Alexey Budankov
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-03 15:55 UTC
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, Andi Kleen, linux-kernel


Introduce print_interval() and process_timeout() functions that
factor out the body of the event handling loop for the attach and
system wide monitoring use cases.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/builtin-stat.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9be020e0098a..31f7ccf9537b 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -475,6 +475,23 @@ static void process_interval(void)
 	print_counters(&rs, 0, NULL);
 }
 
+static bool print_interval(unsigned int interval, int *times)
+{
+	if (interval) {
+		process_interval();
+		if (interval_count && !(--(*times)))
+			return true;
+	}
+	return false;
+}
+
+static bool process_timeout(int timeout, unsigned int interval, int *times)
+{
+	if (timeout)
+		return true;
+	return print_interval(interval, times);
+}
+
 static void enable_counters(void)
 {
 	if (stat_config.initial_delay)
@@ -611,6 +628,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	struct affinity affinity;
 	int i, cpu;
 	bool second_pass = false;
+	bool stop = false;
 
 	if (interval) {
 		ts.tv_sec  = interval / USEC_PER_MSEC;
@@ -805,17 +823,11 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 			psignal(WTERMSIG(status), argv[0]);
 	} else {
 		enable_counters();
-		while (!done) {
+		while (!done && !stop) {
 			nanosleep(&ts, NULL);
 			if (!is_target_alive(&target, evsel_list->core.threads))
 				break;
-			if (timeout)
-				break;
-			if (interval) {
-				process_interval();
-				if (interval_count && !(--times))
-					break;
-			}
+			stop = process_timeout(timeout, interval, &times);
 		}
 	}
 
-- 
2.24.1




* [PATCH v7 05/13] perf stat: move target check to loop control statement
  2020-06-03 15:47 [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes Alexey Budankov
                   ` (3 preceding siblings ...)
  2020-06-03 15:55 ` [PATCH v7 04/13] perf stat: factor out body of event handling loop for system wide Alexey Budankov
@ 2020-06-03 15:56 ` Alexey Budankov
  2020-06-03 15:57 ` [PATCH v7 06/13] perf stat: factor out body of event handling loop for fork case Alexey Budankov
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-03 15:56 UTC
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, Andi Kleen, linux-kernel


Check for target existence in the loop control statement, together with
the 'stop' indicator derived from command line values and the external
asynchronous 'done' signal.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/builtin-stat.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 31f7ccf9537b..62bad2df13ba 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -823,10 +823,8 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 			psignal(WTERMSIG(status), argv[0]);
 	} else {
 		enable_counters();
-		while (!done && !stop) {
+		while (!done && !stop && is_target_alive(&target, evsel_list->core.threads)) {
 			nanosleep(&ts, NULL);
-			if (!is_target_alive(&target, evsel_list->core.threads))
-				break;
 			stop = process_timeout(timeout, interval, &times);
 		}
 	}
-- 
2.24.1




* [PATCH v7 06/13] perf stat: factor out body of event handling loop for fork case
  2020-06-03 15:47 [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes Alexey Budankov
                   ` (4 preceding siblings ...)
  2020-06-03 15:56 ` [PATCH v7 05/13] perf stat: move target check to loop control statement Alexey Budankov
@ 2020-06-03 15:57 ` Alexey Budankov
  2020-06-03 15:57 ` [PATCH v7 07/13] perf stat: factor out event handling loop into dispatch_events() Alexey Budankov
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-03 15:57 UTC
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, Andi Kleen, linux-kernel


Factor out the body of the event handling loop for the fork case by
reusing the process_timeout() function.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/builtin-stat.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 62bad2df13ba..3bc538576607 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -798,13 +798,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 		enable_counters();
 
 		if (interval || timeout) {
-			while (!waitpid(child_pid, &status, WNOHANG)) {
+			while (!stop && !waitpid(child_pid, &status, WNOHANG)) {
 				nanosleep(&ts, NULL);
-				if (timeout)
-					break;
-				process_interval();
-				if (interval_count && !(--times))
-					break;
+				stop = process_timeout(timeout, interval, &times);
 			}
 		}
 		if (child_pid != -1) {
-- 
2.24.1




* [PATCH v7 07/13] perf stat: factor out event handling loop into dispatch_events()
  2020-06-03 15:47 [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes Alexey Budankov
                   ` (5 preceding siblings ...)
  2020-06-03 15:57 ` [PATCH v7 06/13] perf stat: factor out body of event handling loop for fork case Alexey Budankov
@ 2020-06-03 15:57 ` Alexey Budankov
  2020-06-03 15:58 ` [PATCH v7 08/13] perf stat: extend -D,--delay option with -1 value Alexey Budankov
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-03 15:57 UTC
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, Andi Kleen, linux-kernel


Consolidate the event dispatching loops for the fork, attach and system
wide monitoring use cases into a common dispatch_events() function.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/builtin-stat.c | 35 ++++++++++++++++++++++++-----------
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 3bc538576607..39749c290508 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -557,6 +557,27 @@ static bool is_target_alive(struct target *_target,
 	return false;
 }
 
+static int dispatch_events(bool forks, int timeout, int interval, int *times, struct timespec *ts)
+{
+	bool stop = false;
+	int child = 0, status = 0;
+
+	while (1) {
+		if (forks)
+			child = waitpid(child_pid, &status, WNOHANG);
+		else
+			child = !is_target_alive(&target, evsel_list->core.threads) ? 1 : 0;
+
+		if (done || stop || child)
+			break;
+
+		nanosleep(ts, NULL);
+		stop = process_timeout(timeout, interval, times);
+	}
+
+	return status;
+}
+
 enum counter_recovery {
 	COUNTER_SKIP,
 	COUNTER_RETRY,
@@ -628,7 +649,6 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	struct affinity affinity;
 	int i, cpu;
 	bool second_pass = false;
-	bool stop = false;
 
 	if (interval) {
 		ts.tv_sec  = interval / USEC_PER_MSEC;
@@ -797,12 +817,8 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 		perf_evlist__start_workload(evsel_list);
 		enable_counters();
 
-		if (interval || timeout) {
-			while (!stop && !waitpid(child_pid, &status, WNOHANG)) {
-				nanosleep(&ts, NULL);
-				stop = process_timeout(timeout, interval, &times);
-			}
-		}
+		if (interval || timeout)
+			status = dispatch_events(forks, timeout, interval, &times, &ts);
 		if (child_pid != -1) {
 			if (timeout)
 				kill(child_pid, SIGTERM);
@@ -819,10 +835,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 			psignal(WTERMSIG(status), argv[0]);
 	} else {
 		enable_counters();
-		while (!done && !stop && is_target_alive(&target, evsel_list->core.threads)) {
-			nanosleep(&ts, NULL);
-			stop = process_timeout(timeout, interval, &times);
-		}
+		dispatch_events(forks, timeout, interval, &times, &ts);
 	}
 
 	disable_counters();
-- 
2.24.1




* [PATCH v7 08/13] perf stat: extend -D,--delay option with -1 value
  2020-06-03 15:47 [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes Alexey Budankov
                   ` (6 preceding siblings ...)
  2020-06-03 15:57 ` [PATCH v7 07/13] perf stat: factor out event handling loop into dispatch_events() Alexey Budankov
@ 2020-06-03 15:58 ` Alexey Budankov
  2020-06-03 15:59 ` [PATCH v7 09/13] perf stat: implement control commands handling Alexey Budankov
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-03 15:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, Andi Kleen, linux-kernel


Extend the -D,--delay option with a -1 value to start monitoring with
events disabled. The events can be enabled later by an 'enable' command
provided via the control file descriptor.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/Documentation/perf-stat.txt |  5 +++--
 tools/perf/builtin-stat.c              | 18 ++++++++++++++----
 tools/perf/util/evlist.h               |  3 +++
 tools/perf/util/stat.h                 |  2 +-
 4 files changed, 21 insertions(+), 7 deletions(-)
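
For illustration only, the three cases that enable_counters() tells apart
after this change can be sketched as a standalone helper. The names
delay_action, classify_delay() and delay_to_usec() are hypothetical and do
not appear in the patch, and the sketch leaves out the target__none() check.
Note that the patched code tests for initial_delay < 0, so any negative
value, not only -1, would start with events disabled.

```c
#include <assert.h>

#define USEC_PER_MSEC	1000U

/* Hypothetical names, for illustration only; not part of the patch. */
enum delay_action {
	DELAY_NONE,			/* 0: no delay was requested */
	DELAY_SLEEP_THEN_ENABLE,	/* > 0: wait that many ms, then enable events */
	DELAY_START_DISABLED,		/* < 0: leave events off until 'enable' arrives */
};

static enum delay_action classify_delay(int initial_delay_ms)
{
	if (initial_delay_ms < 0)
		return DELAY_START_DISABLED;
	if (initial_delay_ms > 0)
		return DELAY_SLEEP_THEN_ENABLE;
	return DELAY_NONE;
}

/* Microseconds to hand to usleep(); only meaningful for a positive delay. */
static unsigned int delay_to_usec(int initial_delay_ms)
{
	assert(initial_delay_ms > 0);
	return (unsigned int)initial_delay_ms * USEC_PER_MSEC;
}
```

A caller would sleep for delay_to_usec() only in the
DELAY_SLEEP_THEN_ENABLE case and skip enabling entirely in the
DELAY_START_DISABLED case.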

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index b029ee728a0b..9f32f6cd558d 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -238,8 +238,9 @@ mode, use --per-node in addition to -a. (system-wide).
 
 -D msecs::
 --delay msecs::
-After starting the program, wait msecs before measuring. This is useful to
-filter out the startup phase of the program, which is often very different.
+After starting the program, wait msecs before measuring (-1: start with events
+disabled). This is useful to filter out the startup phase of the program,
+which is often very different.
 
 -T::
 --transaction::
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 39749c290508..f88d5ee55022 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -494,16 +494,26 @@ static bool process_timeout(int timeout, unsigned int interval, int *times)
 
 static void enable_counters(void)
 {
-	if (stat_config.initial_delay)
+	if (stat_config.initial_delay < 0) {
+		pr_info(EVLIST_DISABLED_MSG);
+		return;
+	}
+
+	if (stat_config.initial_delay > 0) {
+		pr_info(EVLIST_DISABLED_MSG);
 		usleep(stat_config.initial_delay * USEC_PER_MSEC);
+	}
 
 	/*
 	 * We need to enable counters only if:
 	 * - we don't have tracee (attaching to task or cpu)
 	 * - we have initial delay configured
 	 */
-	if (!target__none(&target) || stat_config.initial_delay)
+	if (!target__none(&target) || stat_config.initial_delay) {
 		evlist__enable(evsel_list);
+		if (stat_config.initial_delay > 0)
+			pr_info(EVLIST_ENABLED_MSG);
+	}
 }
 
 static void disable_counters(void)
@@ -1060,8 +1070,8 @@ static struct option stat_options[] = {
 		     "aggregate counts per thread", AGGR_THREAD),
 	OPT_SET_UINT(0, "per-node", &stat_config.aggr_mode,
 		     "aggregate counts per numa node", AGGR_NODE),
-	OPT_UINTEGER('D', "delay", &stat_config.initial_delay,
-		     "ms to wait before starting measurement after program start"),
+	OPT_INTEGER('D', "delay", &stat_config.initial_delay,
+		    "ms to wait before starting measurement after program start (-1: start with events disabled)"),
 	OPT_CALLBACK_NOOPT(0, "metric-only", &stat_config.metric_only, NULL,
 			"Only print computed metrics. No raw values", enable_metric_only),
 	OPT_BOOLEAN(0, "metric-no-group", &stat_config.metric_no_group,
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index bccf0a970371..7c3726a685f5 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -377,4 +377,7 @@ int evlist__initialize_ctlfd(struct evlist *evlist, int ctl_fd, int ctl_fd_ack);
 int evlist__finalize_ctlfd(struct evlist *evlist);
 int evlist__ctlfd_process(struct evlist *evlist, enum evlist_ctl_cmd *cmd);
 
+#define EVLIST_ENABLED_MSG "Events enabled\n"
+#define EVLIST_DISABLED_MSG "Events disabled\n"
+
 #endif /* __PERF_EVLIST_H */
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index f75ae679eb28..626421ef35c2 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -116,7 +116,7 @@ struct perf_stat_config {
 	FILE			*output;
 	unsigned int		 interval;
 	unsigned int		 timeout;
-	unsigned int		 initial_delay;
+	int			 initial_delay;
 	unsigned int		 unit_width;
 	unsigned int		 metric_only_len;
 	int			 times;
-- 
2.24.1




* [PATCH v7 09/13] perf stat: implement control commands handling
  2020-06-03 15:47 [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes Alexey Budankov
                   ` (7 preceding siblings ...)
  2020-06-03 15:58 ` [PATCH v7 08/13] perf stat: extend -D,--delay option with -1 value Alexey Budankov
@ 2020-06-03 15:59 ` Alexey Budankov
  2020-06-03 15:59 ` [PATCH v7 10/13] perf stat: introduce --ctl-fd[-ack] options Alexey Budankov
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-03 15:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, Andi Kleen, linux-kernel


Implement handling of 'enable' and 'disable' control commands
coming from the control file descriptor. The process_evlist() function
checks for events on the static fds and performs the required operations.
If a poll event interrupts a timeout interval in progress, the remainder
is calculated and waited for in the following poll() syscall.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/builtin-stat.c | 67 +++++++++++++++++++++++++++++----------
 1 file changed, 50 insertions(+), 17 deletions(-)
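
The remainder calculation described above can be sketched standalone as
follows. This is a minimal sketch and not the patch code: the helper name
remaining_sleep_ms() is hypothetical, and clamping the result at zero is a
choice made here, motivated by the fact that poll() treats a negative
timeout as "wait indefinitely".

```c
#include <assert.h>
#include <time.h>

#define MSEC_PER_SEC	1000L
#define NSEC_PER_MSEC	1000000L
#define NSEC_PER_SEC	1000000000L

/*
 * Hypothetical helper, not part of the patch: return how many milliseconds
 * of a time_to_sleep_ms budget are left once the span from *start to *stop
 * has elapsed. The result never drops below zero.
 */
static int remaining_sleep_ms(int time_to_sleep_ms,
			      const struct timespec *start,
			      const struct timespec *stop)
{
	long sec = (long)(stop->tv_sec - start->tv_sec);
	long nsec = stop->tv_nsec - start->tv_nsec;
	long elapsed_ms;

	/* Borrow one second when the nanosecond field underflows. */
	if (nsec < 0) {
		sec -= 1;
		nsec += NSEC_PER_SEC;
	}
	assert(sec >= 0);	/* stop must not precede start */

	elapsed_ms = sec * MSEC_PER_SEC + nsec / NSEC_PER_MSEC;
	if (elapsed_ms >= time_to_sleep_ms)
		return 0;
	return time_to_sleep_ms - (int)elapsed_ms;
}
```

With a 1000 ms interval and a control command handled 400 ms into it, the
helper yields 600 ms for the next poll() call.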

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f88d5ee55022..cc56d71a3ed5 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -492,6 +492,31 @@ static bool process_timeout(int timeout, unsigned int interval, int *times)
 	return print_interval(interval, times);
 }
 
+static bool process_evlist(struct evlist *evlist, unsigned int interval, int *times)
+{
+	bool stop = false;
+	enum evlist_ctl_cmd cmd = EVLIST_CTL_CMD_UNSUPPORTED;
+
+	if (evlist__ctlfd_process(evlist, &cmd) > 0) {
+		switch (cmd) {
+		case EVLIST_CTL_CMD_ENABLE:
+			pr_info(EVLIST_ENABLED_MSG);
+			stop = print_interval(interval, times);
+			break;
+		case EVLIST_CTL_CMD_DISABLE:
+			stop = print_interval(interval, times);
+			pr_info(EVLIST_DISABLED_MSG);
+			break;
+		case EVLIST_CTL_CMD_ACK:
+		case EVLIST_CTL_CMD_UNSUPPORTED:
+		default:
+			break;
+		}
+	}
+
+	return stop;
+}
+
 static void enable_counters(void)
 {
 	if (stat_config.initial_delay < 0) {
@@ -567,10 +592,21 @@ static bool is_target_alive(struct target *_target,
 	return false;
 }
 
-static int dispatch_events(bool forks, int timeout, int interval, int *times, struct timespec *ts)
+static int dispatch_events(bool forks, int timeout, int interval, int *times)
 {
 	bool stop = false;
 	int child = 0, status = 0;
+	int time_to_sleep, sleep_time;
+	struct timespec time_start, time_stop, time_diff;
+
+	if (interval)
+		sleep_time = interval;
+	else if (timeout)
+		sleep_time = timeout;
+	else
+		sleep_time = 1000;
+
+	time_to_sleep = sleep_time;
 
 	while (1) {
 		if (forks)
@@ -581,8 +617,17 @@ static int dispatch_events(bool forks, int timeout, int interval, int *times, st
 		if (done || stop || child)
 			break;
 
-		nanosleep(ts, NULL);
-		stop = process_timeout(timeout, interval, times);
+		clock_gettime(CLOCK_MONOTONIC, &time_start);
+		if (!(evlist__poll(evsel_list, time_to_sleep) > 0)) { /* poll timeout or EINTR */
+			stop = process_timeout(timeout, interval, times);
+			time_to_sleep = sleep_time;
+		} else { /* fd revent */
+			stop = process_evlist(evsel_list, interval, times);
+			clock_gettime(CLOCK_MONOTONIC, &time_stop);
+			diff_timespec(&time_diff, &time_stop, &time_start);
+			time_to_sleep -= time_diff.tv_sec * MSEC_PER_SEC +
+					time_diff.tv_nsec / NSEC_PER_MSEC;
+		}
 	}
 
 	return status;
@@ -651,7 +696,6 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	char msg[BUFSIZ];
 	unsigned long long t0, t1;
 	struct evsel *counter;
-	struct timespec ts;
 	size_t l;
 	int status = 0;
 	const bool forks = (argc > 0);
@@ -660,17 +704,6 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	int i, cpu;
 	bool second_pass = false;
 
-	if (interval) {
-		ts.tv_sec  = interval / USEC_PER_MSEC;
-		ts.tv_nsec = (interval % USEC_PER_MSEC) * NSEC_PER_MSEC;
-	} else if (timeout) {
-		ts.tv_sec  = timeout / USEC_PER_MSEC;
-		ts.tv_nsec = (timeout % USEC_PER_MSEC) * NSEC_PER_MSEC;
-	} else {
-		ts.tv_sec  = 1;
-		ts.tv_nsec = 0;
-	}
-
 	if (forks) {
 		if (perf_evlist__prepare_workload(evsel_list, &target, argv, is_pipe,
 						  workload_exec_failed_signal) < 0) {
@@ -828,7 +861,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 		enable_counters();
 
 		if (interval || timeout)
-			status = dispatch_events(forks, timeout, interval, &times, &ts);
+			status = dispatch_events(forks, timeout, interval, &times);
 		if (child_pid != -1) {
 			if (timeout)
 				kill(child_pid, SIGTERM);
@@ -845,7 +878,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 			psignal(WTERMSIG(status), argv[0]);
 	} else {
 		enable_counters();
-		dispatch_events(forks, timeout, interval, &times, &ts);
+		dispatch_events(forks, timeout, interval, &times);
 	}
 
 	disable_counters();
-- 
2.24.1




* [PATCH v7 10/13] perf stat: introduce --ctl-fd[-ack] options
  2020-06-03 15:47 [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes Alexey Budankov
                   ` (8 preceding siblings ...)
  2020-06-03 15:59 ` [PATCH v7 09/13] perf stat: implement control commands handling Alexey Budankov
@ 2020-06-03 15:59 ` Alexey Budankov
  2020-06-03 16:00 ` [PATCH v7 11/13] perf record: extend -D,--delay option with -1 value Alexey Budankov
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-03 15:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, Andi Kleen, linux-kernel


Introduce --ctl-fd[-ack] options to pass the numbers of open file
descriptors on the command line. Extend the perf-stat.txt file with a
description of the --ctl-fd[-ack] options. Document a possible usage model
for the --ctl-fd[-ack] options by providing an example bash shell script.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/Documentation/perf-stat.txt | 40 ++++++++++++++++++++++++++
 tools/perf/builtin-stat.c              | 11 +++++++
 tools/perf/util/stat.h                 |  2 ++
 3 files changed, 53 insertions(+)
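
The controlling side need not be a shell script. As a rough sketch, assuming
the message format used by the example script in this patch (the command
name followed by a newline, answered with 'ack\n'), a C controller could
issue one command as shown below. The helper name send_ctl_cmd() is
hypothetical and is not part of perf; the sketch treats an interrupted
read() as a failure instead of retrying.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical controller-side helper, not part of the patch: write one
 * command line to ctl_fd, then block until "ack\n" is read from ack_fd.
 * Returns 0 on success, -1 on an I/O error or an unexpected reply.
 */
static int send_ctl_cmd(int ctl_fd, int ack_fd, const char *cmd)
{
	static const char ack[] = "ack\n";
	char reply[sizeof(ack) - 1];
	char msg[64];
	size_t got = 0;
	int len;

	assert(cmd != NULL);

	/* Mimic `echo 'enable' >&${ctl_fd}`: name plus newline, one write. */
	len = snprintf(msg, sizeof(msg), "%s\n", cmd);
	if (len < 0 || (size_t)len >= sizeof(msg))
		return -1;
	if (write(ctl_fd, msg, (size_t)len) != (ssize_t)len)
		return -1;

	/* A FIFO may deliver the reply in pieces, so read until complete. */
	while (got < sizeof(reply)) {
		ssize_t n = read(ack_fd, reply + got, sizeof(reply) - got);

		if (n <= 0)
			return -1;
		got += (size_t)n;
	}

	return memcmp(reply, ack, sizeof(reply)) == 0 ? 0 : -1;
}
```

A controller would call send_ctl_cmd(ctl_fd, ack_fd, "enable") with the same
descriptors whose numbers were passed to perf stat via --ctl-fd and
--ctl-fd-ack.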

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 9f32f6cd558d..029afa009865 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -176,6 +176,46 @@ with it.  --append may be used here.  Examples:
      3>results  perf stat --log-fd 3          -- $cmd
      3>>results perf stat --log-fd 3 --append -- $cmd
 
+--ctl-fd::
+--ctl-fd-ack::
+Listen on ctl-fd descriptor for command to control measurement ('enable': enable events,
+'disable': disable events). Measurements can be started with events disabled using
+--delay=-1 option. Optionally send control command completion ('ack\n') to fd-ack descriptor
+to synchronize with the controlling process. Example of bash shell script to enable and
+disable events during measurements:
+
+#!/bin/bash
+
+ctl_dir=/tmp/
+
+ctl_fifo=${ctl_dir}perf_ctl.fifo
+test -p ${ctl_fifo} && unlink ${ctl_fifo}
+mkfifo ${ctl_fifo}
+exec {ctl_fd}<>${ctl_fifo}
+
+ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
+test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
+mkfifo ${ctl_ack_fifo}
+exec {ctl_fd_ack}<>${ctl_ack_fifo}
+
+perf stat -D -1 -e cpu-cycles -a -I 1000                \
+          --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
+          -- sleep 30 &
+perf_pid=$!
+
+sleep 5  && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
+sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
+
+exec {ctl_fd_ack}>&-
+unlink ${ctl_ack_fifo}
+
+exec {ctl_fd}>&-
+unlink ${ctl_fifo}
+
+wait -n ${perf_pid}
+exit $?
+
+
 --pre::
 --post::
 	Pre and post measurement hooks, e.g.:
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index cc56d71a3ed5..8d79ba54dbf9 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -188,6 +188,8 @@ static struct perf_stat_config stat_config = {
 	.metric_only_len	= METRIC_ONLY_LEN,
 	.walltime_nsecs_stats	= &walltime_nsecs_stats,
 	.big_num		= true,
+	.ctl_fd			= -1,
+	.ctl_fd_ack		= -1
 };
 
 static bool cpus_map_matched(struct evsel *a, struct evsel *b)
@@ -1133,6 +1135,10 @@ static struct option stat_options[] = {
 		"libpfm4 event selector. use 'perf list' to list available events",
 		parse_libpfm_events_option),
 #endif
+	OPT_INTEGER(0, "ctl-fd", &stat_config.ctl_fd,
+		    "Listen on fd descriptor for command to control measurement ('enable': enable events, 'disable': disable events)"),
+	OPT_INTEGER(0, "ctl-fd-ack", &stat_config.ctl_fd_ack,
+		    "Send control command completion ('ack') to fd ack descriptor"),
 	OPT_END()
 };
 
@@ -2304,6 +2310,9 @@ int cmd_stat(int argc, const char **argv)
 	signal(SIGALRM, skip_signal);
 	signal(SIGABRT, skip_signal);
 
+	if (evlist__initialize_ctlfd(evsel_list, stat_config.ctl_fd, stat_config.ctl_fd_ack))
+		goto out;
+
 	status = 0;
 	for (run_idx = 0; forever || run_idx < stat_config.run_count; run_idx++) {
 		if (stat_config.run_count != 1 && verbose > 0)
@@ -2323,6 +2332,8 @@ int cmd_stat(int argc, const char **argv)
 	if (!forever && status != -1 && (!interval || stat_config.summary))
 		print_counters(NULL, argc, argv);
 
+	evlist__finalize_ctlfd(evsel_list);
+
 	if (STAT_RECORD) {
 		/*
 		 * We synthesize the kernel mmap record just so that older tools
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 626421ef35c2..06f0baabe775 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -133,6 +133,8 @@ struct perf_stat_config {
 	struct perf_cpu_map		*cpus_aggr_map;
 	u64			*walltime_run;
 	struct rblist		 metric_events;
+	int			 ctl_fd;
+	int			 ctl_fd_ack;
 };
 
 void perf_stat__set_big_num(int set);
-- 
2.24.1




* [PATCH v7 11/13] perf record: extend -D,--delay option with -1 value
  2020-06-03 15:47 [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes Alexey Budankov
                   ` (9 preceding siblings ...)
  2020-06-03 15:59 ` [PATCH v7 10/13] perf stat: introduce --ctl-fd[-ack] options Alexey Budankov
@ 2020-06-03 16:00 ` Alexey Budankov
  2020-06-03 16:01 ` [PATCH v7 12/13] perf record: implement control commands handling Alexey Budankov
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-03 16:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, Andi Kleen, linux-kernel


Extend the -D,--delay option with a -1 value to start collection with
events disabled. The events can be enabled later by an 'enable' command
provided via the control file descriptor.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/Documentation/perf-record.txt |  5 +++--
 tools/perf/builtin-record.c              | 12 ++++++++----
 tools/perf/builtin-trace.c               |  2 +-
 tools/perf/util/record.h                 |  2 +-
 4 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index fa8a5fcd27ab..a84376605805 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -407,8 +407,9 @@ if combined with -a or -C options.
 
 -D::
 --delay=::
-After starting the program, wait msecs before measuring. This is useful to
-filter out the startup phase of the program, which is often very different.
+After starting the program, wait msecs before measuring (-1: start with events
+disabled). This is useful to filter out the startup phase of the program, which
+is often very different.
 
 -I::
 --intr-regs::
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index e108d90ae2ed..d0b29a1070a0 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1749,8 +1749,12 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	}
 
 	if (opts->initial_delay) {
-		usleep(opts->initial_delay * USEC_PER_MSEC);
-		evlist__enable(rec->evlist);
+		pr_info(EVLIST_DISABLED_MSG);
+		if (opts->initial_delay > 0) {
+			usleep(opts->initial_delay * USEC_PER_MSEC);
+			evlist__enable(rec->evlist);
+			pr_info(EVLIST_ENABLED_MSG);
+		}
 	}
 
 	trigger_ready(&auxtrace_snapshot_trigger);
@@ -2462,8 +2466,8 @@ static struct option __record_options[] = {
 	OPT_CALLBACK('G', "cgroup", &record.evlist, "name",
 		     "monitor event in cgroup name only",
 		     parse_cgroups),
-	OPT_UINTEGER('D', "delay", &record.opts.initial_delay,
-		  "ms to wait before starting measurement after program start"),
+	OPT_INTEGER('D', "delay", &record.opts.initial_delay,
+		  "ms to wait before starting measurement after program start (-1: start with events disabled)"),
 	OPT_BOOLEAN(0, "kcore", &record.opts.kcore, "copy /proc/kcore"),
 	OPT_STRING('u', "uid", &record.opts.target.uid_str, "user",
 		   "user to profile"),
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 4cbb64edc998..290149b1b3b5 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -4813,7 +4813,7 @@ int cmd_trace(int argc, const char **argv)
 			"per thread proc mmap processing timeout in ms"),
 	OPT_CALLBACK('G', "cgroup", &trace, "name", "monitor event in cgroup name only",
 		     trace__parse_cgroups),
-	OPT_UINTEGER('D', "delay", &trace.opts.initial_delay,
+	OPT_INTEGER('D', "delay", &trace.opts.initial_delay,
 		     "ms to wait before starting measurement after program "
 		     "start"),
 	OPTS_EVSWITCH(&trace.evswitch),
diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
index 39d1de4b2a36..da138dcb4d34 100644
--- a/tools/perf/util/record.h
+++ b/tools/perf/util/record.h
@@ -61,7 +61,7 @@ struct record_opts {
 	const char    *auxtrace_snapshot_opts;
 	const char    *auxtrace_sample_opts;
 	bool	      sample_transaction;
-	unsigned      initial_delay;
+	int	      initial_delay;
 	bool	      use_clockid;
 	clockid_t     clockid;
 	u64	      clockid_res_ns;
-- 
2.24.1




* [PATCH v7 12/13] perf record: implement control commands handling
  2020-06-03 15:47 [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes Alexey Budankov
                   ` (10 preceding siblings ...)
  2020-06-03 16:00 ` [PATCH v7 11/13] perf record: extend -D,--delay option with -1 value Alexey Budankov
@ 2020-06-03 16:01 ` Alexey Budankov
  2020-06-03 16:02 ` [PATCH v7 13/13] perf record: introduce --ctl-fd[-ack] options Alexey Budankov
  2020-06-05  7:47 ` [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes Alexey Budankov
  13 siblings, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-03 16:01 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, Andi Kleen, linux-kernel


Implement handling of 'enable' and 'disable' control commands
coming from control file descriptor.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/builtin-record.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index d0b29a1070a0..0394e068dde8 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1527,6 +1527,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	bool disabled = false, draining = false;
 	int fd;
 	float ratio = 0;
+	enum evlist_ctl_cmd cmd = EVLIST_CTL_CMD_UNSUPPORTED;
 
 	atexit(record__sig_exit);
 	signal(SIGCHLD, sig_handler);
@@ -1830,6 +1831,21 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 				alarm(rec->switch_output.time);
 		}
 
+		if (evlist__ctlfd_process(rec->evlist, &cmd) > 0) {
+			switch (cmd) {
+			case EVLIST_CTL_CMD_ENABLE:
+				pr_info(EVLIST_ENABLED_MSG);
+				break;
+			case EVLIST_CTL_CMD_DISABLE:
+				pr_info(EVLIST_DISABLED_MSG);
+				break;
+			case EVLIST_CTL_CMD_ACK:
+			case EVLIST_CTL_CMD_UNSUPPORTED:
+			default:
+				break;
+			}
+		}
+
 		if (hits == rec->samples) {
 			if (done || draining)
 				break;
-- 
2.24.1




* [PATCH v7 13/13] perf record: introduce --ctl-fd[-ack] options
  2020-06-03 15:47 [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes Alexey Budankov
                   ` (11 preceding siblings ...)
  2020-06-03 16:01 ` [PATCH v7 12/13] perf record: implement control commands handling Alexey Budankov
@ 2020-06-03 16:02 ` Alexey Budankov
  2020-06-05  7:47 ` [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes Alexey Budankov
  13 siblings, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-03 16:02 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, Andi Kleen, linux-kernel


Introduce --ctl-fd[-ack] options to pass the numbers of open file
descriptors on the command line. Extend the perf-record.txt file with a
description of the --ctl-fd[-ack] options. Document a possible usage model
for the --ctl-fd[-ack] options by providing an example bash shell script.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/Documentation/perf-record.txt | 40 ++++++++++++++++++++++++
 tools/perf/builtin-record.c              | 10 ++++++
 tools/perf/util/record.h                 |  2 ++
 3 files changed, 52 insertions(+)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index a84376605805..b0e9b7a1761e 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -627,6 +627,46 @@ option. The -e option and this one can be mixed and matched.  Events
 can be grouped using the {} notation.
 endif::HAVE_LIBPFM[]
 
+--ctl-fd::
+--ctl-fd-ack::
+Listen on ctl-fd descriptor for command to control measurement ('enable': enable events,
+'disable': disable events). Measurements can be started with events disabled using
+--delay=-1 option. Optionally send control command completion ('ack\n') to fd-ack descriptor
+to synchronize with the controlling process. Example of bash shell script to enable and
+disable events during measurements:
+
+#!/bin/bash
+
+ctl_dir=/tmp/
+
+ctl_fifo=${ctl_dir}perf_ctl.fifo
+test -p ${ctl_fifo} && unlink ${ctl_fifo}
+mkfifo ${ctl_fifo}
+exec {ctl_fd}<>${ctl_fifo}
+
+ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
+test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
+mkfifo ${ctl_ack_fifo}
+exec {ctl_fd_ack}<>${ctl_ack_fifo}
+
+perf record -D -1 -e cpu-cycles -a                        \
+            --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
+            -- sleep 30 &
+perf_pid=$!
+
+sleep 5  && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
+sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
+
+exec {ctl_fd_ack}>&-
+unlink ${ctl_ack_fifo}
+
+exec {ctl_fd}>&-
+unlink ${ctl_fifo}
+
+wait -n ${perf_pid}
+exit $?
+
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-list[1], linkperf:perf-intel-pt[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 0394e068dde8..8494ce964738 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1749,6 +1749,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		perf_evlist__start_workload(rec->evlist);
 	}
 
+	if (evlist__initialize_ctlfd(rec->evlist, opts->ctl_fd, opts->ctl_fd_ack))
+		goto out_child;
+
 	if (opts->initial_delay) {
 		pr_info(EVLIST_DISABLED_MSG);
 		if (opts->initial_delay > 0) {
@@ -1895,6 +1898,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		record__synthesize_workload(rec, true);
 
 out_child:
+	evlist__finalize_ctlfd(rec->evlist);
 	record__mmap_read_all(rec, true);
 	record__aio_mmap_read_sync(rec);
 
@@ -2380,6 +2384,8 @@ static struct record record = {
 		},
 		.mmap_flush          = MMAP_FLUSH_DEFAULT,
 		.nr_threads_synthesize = 1,
+		.ctl_fd              = -1,
+		.ctl_fd_ack          = -1,
 	},
 	.tool = {
 		.sample		= process_sample_event,
@@ -2581,6 +2587,10 @@ static struct option __record_options[] = {
 		"libpfm4 event selector. use 'perf list' to list available events",
 		parse_libpfm_events_option),
 #endif
+	OPT_INTEGER(0, "ctl-fd", &record.opts.ctl_fd,
+		    "Listen on fd descriptor for command to control measurement ('enable': enable events, 'disable': disable events)"),
+	OPT_INTEGER(0, "ctl-fd-ack", &record.opts.ctl_fd_ack,
+		   "Send control command completion ('ack') to fd ack descriptor"),
 	OPT_END()
 };
 
diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
index da138dcb4d34..4cb72a478af1 100644
--- a/tools/perf/util/record.h
+++ b/tools/perf/util/record.h
@@ -70,6 +70,8 @@ struct record_opts {
 	int	      mmap_flush;
 	unsigned int  comp_level;
 	unsigned int  nr_threads_synthesize;
+	int	      ctl_fd;
+	int	      ctl_fd_ack;
 };
 
 extern const char * const *record_usage;
-- 
2.24.1




* Re: [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes
  2020-06-03 15:47 [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes Alexey Budankov
                   ` (12 preceding siblings ...)
  2020-06-03 16:02 ` [PATCH v7 13/13] perf record: introduce --ctl-fd[-ack] options Alexey Budankov
@ 2020-06-05  7:47 ` Alexey Budankov
  13 siblings, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-05  7:47 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Namhyung Kim, Alexander Shishkin, Peter Zijlstra, Ingo Molnar,
	Andi Kleen, linux-kernel

Friendly reminder.

~Alexey

On 03.06.2020 18:47, Alexey Budankov wrote:
> 
> Changes in v7:
> - added missing perf-record.txt changes 
> - adjusted docs wording for --ctl-fd,ctl-fd-ack options 
>   to additionally mention --delay=-1 effect
> 
> v6: https://lore.kernel.org/lkml/f8e3a714-d9b1-4647-e1d2-9981cbaa83ec@linux.intel.com/
> 
> Changes in v6:
> - split re-factoring of events handling loops for stat mode
>   into smaller incremental parts
> - added parts missing at v5
> - corrected v5 runtime issues
> 
> v5: https://lore.kernel.org/lkml/e5cac8dd-7aa4-ec7c-671c-07756907acba@linux.intel.com/
> 
> Changes in v5:
> - split re-factoring of events handling loops for stat mode
>   into smaller incremental parts
> 
> v4: https://lore.kernel.org/lkml/653fe5f3-c986-a841-1ed8-0a7d2fa24c00@linux.intel.com/
> 
> Changes in v4:
> - made checking of ctlfd state unconditional in record trace streaming loop
> - introduced static poll fds to keep evlist__filter_pollfd() unaffected
> - handled ret code of evlist__initialize_ctlfd() where needed
> - renamed and structured handle_events() function
> - applied anonymous structs where needed
> 
> v3: https://lore.kernel.org/lkml/eb38e9e5-754f-d410-1d9b-e26b702d51b7@linux.intel.com/
> 
> Changes in v3:
> - renamed functions and types from perf_evlist_ to evlist_ to avoid
>   clash with libperf code;
> - extended commands to be strings of variable length consisting of
>   command name and also possibly including command specific data;
> - merged docs update with the code changes;
> - updated docs for -D,--delay=-1 option for stat and record modes;
> 
> v2: https://lore.kernel.org/lkml/d582cc3d-2302-c7e2-70d3-bc7ab6f628c3@linux.intel.com/
> 
> Changes in v2:
> - renamed resume and pause commands to enable and disable ones, renamed
>   CTL_CMD_RESUME and CTL_CMD_PAUSE to CTL_CMD_ENABLE and CTL_CMD_DISABLE
>   to fit to the appropriate ioctls and avoid mixing up with PAUSE_OUTPUT
>   ioctl;
> - factored out event handling loop into a handle_events() for stat mode;
> - separated -D,--delay=-1 into separate patches for stat and record modes;
> 
> v1: https://lore.kernel.org/lkml/825a5132-b58d-c0b6-b050-5a6040386ec7@linux.intel.com/
> 
> repo: tip of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core
> 
> The patch set implements handling of 'start disabled', 'enable' and 'disable'
> external control commands which can be provided for stat and record modes
> of the tool from an external controlling process. 'start disabled' command
> can be used to postpone enabling of events in the beginning of a monitoring
> session. 'enable' and 'disable' commands can be used to enable and disable
> events correspondingly any time after the start of the session.
> 
> The 'start disabled', 'enable' and 'disable' external control commands can be
> used to focus measurement on selected time intervals of workload execution.
> Focused measurement reduces tool intrusion and influence on workload behavior,
> reduces distortion and the amount of collected and stored data, and mitigates
> data accuracy loss, because measurement and data capturing happen only during
> the intervals of interest.
> 
> A controlling process can be a bash shell script [1], a native executable or
> a program in any other language that can work directly with file descriptors,
> e.g. pipes [2], and spawn a process, in particular the tool itself.
> 
> The -D,--delay <val> option is extended with the -1 value to skip enabling of
> events at the beginning of a monitoring session ('start disabled' command).
> The --ctl-fd and --ctl-fd-ack command line options are introduced to provide the
> tool with a pair of file descriptors to listen for control commands and to reply
> to the controlling process on completion of received commands.
> 
> The tool reads a control command message from the ctl-fd descriptor, handles the
> command and optionally replies with an acknowledgement message on the ctl-fd-ack
> descriptor, if it is specified on the command line. The 'enable' command is
> recognized as the 'enable' string message and the 'disable' command as the
> 'disable' string message, both received from the ctl-fd descriptor. The
> completion message is 'ack\n' and is sent to the ctl-fd-ack descriptor.
> 
> An example bash script demonstrating a simple use case follows:
> 
> #!/bin/bash
> 
> ctl_dir=/tmp/
> 
> ctl_fifo=${ctl_dir}perf_ctl.fifo
> test -p ${ctl_fifo} && unlink ${ctl_fifo}
> mkfifo ${ctl_fifo} && exec {ctl_fd}<>${ctl_fifo}
> 
> ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
> test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
> mkfifo ${ctl_ack_fifo} && exec {ctl_fd_ack}<>${ctl_ack_fifo}
> 
> perf stat -D -1 -e cpu-cycles -a -I 1000                \
>           --ctl-fd ${ctl_fd} --ctl-fd-ack ${ctl_fd_ack} \
>           -- sleep 40 &
> perf_pid=$!
> 
> sleep 5  && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
> sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
> sleep 5  && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e2 && echo "enabled(${e2})"
> sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d2 && echo "disabled(${d2})"
> 
> exec {ctl_fd_ack}>&- && unlink ${ctl_ack_fifo}
> exec {ctl_fd}>&- && unlink ${ctl_fifo}
> 
> wait -n ${perf_pid}
> exit $?
> 
> 
> Script output:
> 
> [root@host dir] example
> Events disabled
> #           time             counts unit events
>      1.001101062      <not counted>      cpu-cycles                                                  
>      2.002994944      <not counted>      cpu-cycles                                                  
>      3.004864340      <not counted>      cpu-cycles                                                  
>      4.006727177      <not counted>      cpu-cycles                                                  
> Events enabled
> enabled(ack)
>      4.993808464          3,124,246      cpu-cycles                                                  
>      5.008597004          3,325,624      cpu-cycles                                                  
>      6.010387483         83,472,992      cpu-cycles                                                  
>      7.012266598         55,877,621      cpu-cycles                                                  
>      8.014175695         97,892,729      cpu-cycles                                                  
>      9.016056093         68,461,242      cpu-cycles                                                  
>     10.017937507         55,449,643      cpu-cycles                                                  
>     11.019830154         68,938,167      cpu-cycles                                                  
>     12.021719952         55,164,101      cpu-cycles                                                  
>     13.023627550         70,535,720      cpu-cycles                                                  
>     14.025580995         53,240,125      cpu-cycles                                                  
> disabled(ack)
>     14.997518260         53,558,068      cpu-cycles                                                  
> Events disabled
>     15.027216416      <not counted>      cpu-cycles                                                  
>     16.029052729      <not counted>      cpu-cycles                                                  
>     17.030904762      <not counted>      cpu-cycles                                                  
>     18.032073424      <not counted>      cpu-cycles                                                  
>     19.033805074      <not counted>      cpu-cycles                                                  
> Events enabled
> enabled(ack)
>     20.001279097          3,021,022      cpu-cycles                                                  
>     20.035044381          6,434,367      cpu-cycles                                                  
>     21.036923813         89,358,251      cpu-cycles                                                  
>     22.038825169         72,516,351      cpu-cycles                                                  
> #           time             counts unit events
>     23.040715596         55,046,157      cpu-cycles                                                  
>     24.042643757         78,128,649      cpu-cycles                                                  
>     25.044558535         61,052,428      cpu-cycles                                                  
>     26.046452785         62,142,806      cpu-cycles                                                  
>     27.048353021         74,477,971      cpu-cycles                                                  
>     28.050241286         61,001,623      cpu-cycles                                                  
>     29.052149961         61,653,502      cpu-cycles                                                  
> disabled(ack)
>     30.004980264         82,729,640      cpu-cycles                                                  
> Events disabled
>     30.053516176      <not counted>      cpu-cycles                                                  
>     31.055348366      <not counted>      cpu-cycles                                                  
>     32.057202097      <not counted>      cpu-cycles                                                  
>     33.059040702      <not counted>      cpu-cycles                                                  
>     34.060843288      <not counted>      cpu-cycles                                                  
>     35.000888624      <not counted>      cpu-cycles                                                  
> [root@host dir]# 
> 
> [1] http://man7.org/linux/man-pages/man1/bash.1.html
> [2] http://man7.org/linux/man-pages/man2/pipe.2.html
> 
> ---
> Alexey Budankov (13):
>   tools/libperf: introduce notion of static polled file descriptors
>   perf evlist: introduce control file descriptors
>   perf evlist: implement control command handling functions
>   perf stat: factor out body of event handling loop for system wide
>   perf stat: move target check to loop control statement
>   perf stat: factor out body of event handling loop for fork case
>   perf stat: factor out event handling loop into dispatch_events()
>   perf stat: extend -D,--delay option with -1 value
>   perf stat: implement control commands handling
>   perf stat: introduce --ctl-fd[-ack] options
>   perf record: extend -D,--delay option with -1 value
>   perf record: implement control commands handling
>   perf record: introduce --ctl-fd[-ack] options
> 
>  tools/lib/api/fd/array.c                 |  42 ++++++-
>  tools/lib/api/fd/array.h                 |   7 ++
>  tools/lib/perf/evlist.c                  |  11 ++
>  tools/lib/perf/include/internal/evlist.h |   2 +
>  tools/perf/Documentation/perf-record.txt |  45 ++++++-
>  tools/perf/Documentation/perf-stat.txt   |  45 ++++++-
>  tools/perf/builtin-record.c              |  38 +++++-
>  tools/perf/builtin-stat.c                | 149 +++++++++++++++++------
>  tools/perf/builtin-trace.c               |   2 +-
>  tools/perf/util/evlist.c                 | 131 ++++++++++++++++++++
>  tools/perf/util/evlist.h                 |  25 ++++
>  tools/perf/util/record.h                 |   4 +-
>  tools/perf/util/stat.h                   |   4 +-
>  13 files changed, 455 insertions(+), 50 deletions(-)
> 


* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-03 15:52 ` [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors Alexey Budankov
@ 2020-06-05 10:50   ` Jiri Olsa
  2020-06-05 11:38     ` Jiri Olsa
  2020-06-05 11:50     ` Alexey Budankov
  0 siblings, 2 replies; 44+ messages in thread
From: Jiri Olsa @ 2020-06-05 10:50 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On Wed, Jun 03, 2020 at 06:52:59PM +0300, Alexey Budankov wrote:
> 
> Implement adding of file descriptors by fdarray__add_stat() to
> fixed-size (currently 1) stat_entries array located in struct fdarray.
> Append added file descriptors to the array used by poll() syscall
> during fdarray__poll() call. Copy poll() result of the added
> descriptors from the array back to the storage for analysis.
> 
> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
> ---
>  tools/lib/api/fd/array.c                 | 42 +++++++++++++++++++++++-
>  tools/lib/api/fd/array.h                 |  7 ++++
>  tools/lib/perf/evlist.c                  | 11 +++++++
>  tools/lib/perf/include/internal/evlist.h |  2 ++
>  4 files changed, 61 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
> index 58d44d5eee31..b0027f2169c7 100644
> --- a/tools/lib/api/fd/array.c
> +++ b/tools/lib/api/fd/array.c
> @@ -11,10 +11,16 @@
>  
>  void fdarray__init(struct fdarray *fda, int nr_autogrow)
>  {
> +	int i;
> +
>  	fda->entries	 = NULL;
>  	fda->priv	 = NULL;
>  	fda->nr		 = fda->nr_alloc = 0;
>  	fda->nr_autogrow = nr_autogrow;
> +
> +	fda->nr_stat = 0;
> +	for (i = 0; i < FDARRAY__STAT_ENTRIES_MAX; i++)
> +		fda->stat_entries[i].fd = -1;
>  }
>  
>  int fdarray__grow(struct fdarray *fda, int nr)
> @@ -83,6 +89,20 @@ int fdarray__add(struct fdarray *fda, int fd, short revents)
>  	return pos;
>  }
>  
> +int fdarray__add_stat(struct fdarray *fda, int fd, short revents)
> +{
> +	int pos = fda->nr_stat;
> +
> +	if (pos >= FDARRAY__STAT_ENTRIES_MAX)
> +		return -1;
> +
> +	fda->stat_entries[pos].fd = fd;
> +	fda->stat_entries[pos].events = revents;
> +	fda->nr_stat++;
> +
> +	return pos;
> +}
> +
>  int fdarray__filter(struct fdarray *fda, short revents,
>  		    void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
>  		    void *arg)
> @@ -113,7 +133,27 @@ int fdarray__filter(struct fdarray *fda, short revents,
>  
>  int fdarray__poll(struct fdarray *fda, int timeout)
>  {
> -	return poll(fda->entries, fda->nr, timeout);
> +	int nr, i, pos, res;
> +
> +	nr = fda->nr;
> +
> +	for (i = 0; i < fda->nr_stat; i++) {
> +		if (fda->stat_entries[i].fd != -1) {
> +			pos = fdarray__add(fda, fda->stat_entries[i].fd,
> +					   fda->stat_entries[i].events);

so every call to fdarray__poll will add whatever is
in stat_entries to entries? how is it removed?

I think you should either follow what Adrian said
and put 'static' descriptors early and check for
filter number to match it as a 'quick fix'

or we should fix it for real and make it generic

so currently the interface is like this:

  pos1 = fdarray__add(a, fd1 ... );
  pos2 = fdarray__add(a, fd2 ... );
  pos3 = fdarray__add(a, fd3 ... );

  fdarray__poll(a);

  num = fdarray__filter(a, revents, destructor, arg);

when fdarray__filter removes some of the fds, the 'pos1,pos2,pos3'
indexes are no longer valid
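
To illustrate the stale index problem, here is a minimal sketch.
filter_compact() is a hypothetical, simplified stand-in for
fdarray__filter() (no priv array, no destructor callback), and the
fd numbers are made up:

```c
#include <poll.h>

/*
 * Hypothetical, simplified stand-in for fdarray__filter(): drops every
 * entry whose revents matches the mask and moves the survivors down.
 * Returns the number of entries that are left.
 */
static int filter_compact(struct pollfd *entries, int *nr, short mask)
{
	int i, kept = 0;

	for (i = 0; i < *nr; i++) {
		if (entries[i].revents & mask)
			continue;
		if (i != kept)
			entries[kept] = entries[i];
		kept++;
	}
	*nr = kept;
	return kept;
}

/* Returns the fd found at index pos after the first entry hung up. */
static int fd_at_pos_after_hup(int pos)
{
	struct pollfd entries[3] = {
		{ .fd = 10, .events = POLLIN, .revents = POLLHUP },
		{ .fd = 11, .events = POLLIN, .revents = 0 },
		{ .fd = 12, .events = POLLIN, .revents = 0 },
	};
	int nr = 3;

	filter_compact(entries, &nr, POLLHUP);
	return pos < nr ? entries[pos].fd : -1;
}
```

With the first entry removed, index 0 refers to the descriptor that was
added second, so a pos saved from the add call silently points at a
different fd, and the last saved pos is out of range.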

how about we make the 'pos indexes' stable by allocating a
separate object for each added descriptor; each poll call
would create the pollfd array from the current objects, and each
entry would keep a pointer to its pollfd entry

  struct fdentry {
       int              fd;
       int              events;
       struct pollfd   *pollfd;
  };

  entry1 = fdarray__add(a, fd1 ...);
  entry2 = fdarray__add(a, fd2 ...);
  entry3 = fdarray__add(a, fd3 ...);

  fdarray__poll(a);

  struct pollfd *fdarray__entry_pollfd(a, entry1);

or something like that ;-)
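
The idea above could be sketched roughly as follows, as a standalone toy
rather than the real fdarray code. struct fdset, fdset_add(), fdset_del(),
fdset_poll() and fdentry_revents() are hypothetical names used only for
illustration:

```c
#include <poll.h>
#include <stdlib.h>
#include <unistd.h>

struct fdentry {
	int		 fd;
	short		 events;	/* events requested from poll() */
	struct pollfd	*pollfd;	/* slot used by the last poll, or NULL */
};

struct fdset {
	struct fdentry	**entries;
	struct pollfd	 *pollfd;
	int		  nr;
};

/* Returns a handle that stays valid until fdset_del() is called on it. */
static struct fdentry *fdset_add(struct fdset *s, int fd, short events)
{
	struct fdentry **entries, *entry;

	entries = realloc(s->entries, sizeof(*entries) * (s->nr + 1));
	if (!entries)
		return NULL;
	s->entries = entries;

	entry = malloc(sizeof(*entry));
	if (!entry)
		return NULL;

	entry->fd = fd;
	entry->events = events;
	entry->pollfd = NULL;
	s->entries[s->nr++] = entry;
	return entry;
}

/* Removes an entry that belongs to s; other handles are unaffected. */
static void fdset_del(struct fdset *s, struct fdentry *entry)
{
	int i, kept = 0;

	for (i = 0; i < s->nr; i++)
		if (s->entries[i] != entry)
			s->entries[kept++] = s->entries[i];
	s->nr = kept;
	free(entry);
}

/* Rebuilds the pollfd array from the current entries, then polls. */
static int fdset_poll(struct fdset *s, int timeout)
{
	struct pollfd *pollfd;
	int i;

	pollfd = realloc(s->pollfd, sizeof(*pollfd) * (s->nr ? s->nr : 1));
	if (!pollfd)
		return -1;
	s->pollfd = pollfd;

	for (i = 0; i < s->nr; i++) {
		pollfd[i].fd = s->entries[i]->fd;
		pollfd[i].events = s->entries[i]->events;
		pollfd[i].revents = 0;
		s->entries[i]->pollfd = &pollfd[i];
	}
	return poll(pollfd, s->nr, timeout);
}

static short fdentry_revents(const struct fdentry *entry)
{
	return entry->pollfd ? entry->pollfd->revents : 0;
}

/* Returns 1 if handle b still reports POLLIN after a was removed. */
static int demo_stable_handle(void)
{
	struct fdset s = { 0 };
	struct fdentry *a, *b;
	int p1[2], p2[2], ok;

	if (pipe(p1) || pipe(p2))
		return -1;
	a = fdset_add(&s, p1[0], POLLIN);
	b = fdset_add(&s, p2[0], POLLIN);
	if (!a || !b || write(p2[1], "x", 1) != 1)
		return -1;

	fdset_del(&s, a);	/* b moves from index 1 to index 0 */
	ok = fdset_poll(&s, 0) == 1 && (fdentry_revents(b) & POLLIN);

	fdset_del(&s, b);
	free(s.entries);
	free(s.pollfd);
	close(p1[0]); close(p1[1]); close(p2[0]); close(p2[1]);
	return ok;
}
```

The caller keeps the entry pointer instead of an index, so removing other
descriptors does not invalidate it. The cached pollfd pointer is only
meaningful until the next fdset_poll() or fdset_del() call.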

jirka



* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-05 10:50   ` Jiri Olsa
@ 2020-06-05 11:38     ` Jiri Olsa
  2020-06-05 16:15       ` Alexey Budankov
  2020-06-05 11:50     ` Alexey Budankov
  1 sibling, 1 reply; 44+ messages in thread
From: Jiri Olsa @ 2020-06-05 11:38 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On Fri, Jun 05, 2020 at 12:50:54PM +0200, Jiri Olsa wrote:
> On Wed, Jun 03, 2020 at 06:52:59PM +0300, Alexey Budankov wrote:
> > 
> > Implement adding of file descriptors by fdarray__add_stat() to
> > fixed-size (currently 1) stat_entries array located in struct fdarray.
> > Append added file descriptors to the array used by poll() syscall
> > during fdarray__poll() call. Copy poll() result of the added
> > descriptors from the array back to the storage for analysis.
> > 
> > Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
> > ---
> >  tools/lib/api/fd/array.c                 | 42 +++++++++++++++++++++++-
> >  tools/lib/api/fd/array.h                 |  7 ++++
> >  tools/lib/perf/evlist.c                  | 11 +++++++
> >  tools/lib/perf/include/internal/evlist.h |  2 ++
> >  4 files changed, 61 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
> > index 58d44d5eee31..b0027f2169c7 100644
> > --- a/tools/lib/api/fd/array.c
> > +++ b/tools/lib/api/fd/array.c
> > @@ -11,10 +11,16 @@
> >  
> >  void fdarray__init(struct fdarray *fda, int nr_autogrow)
> >  {
> > +	int i;
> > +
> >  	fda->entries	 = NULL;
> >  	fda->priv	 = NULL;
> >  	fda->nr		 = fda->nr_alloc = 0;
> >  	fda->nr_autogrow = nr_autogrow;
> > +
> > +	fda->nr_stat = 0;
> > +	for (i = 0; i < FDARRAY__STAT_ENTRIES_MAX; i++)
> > +		fda->stat_entries[i].fd = -1;
> >  }
> >  
> >  int fdarray__grow(struct fdarray *fda, int nr)
> > @@ -83,6 +89,20 @@ int fdarray__add(struct fdarray *fda, int fd, short revents)
> >  	return pos;
> >  }
> >  
> > +int fdarray__add_stat(struct fdarray *fda, int fd, short revents)
> > +{
> > +	int pos = fda->nr_stat;
> > +
> > +	if (pos >= FDARRAY__STAT_ENTRIES_MAX)
> > +		return -1;
> > +
> > +	fda->stat_entries[pos].fd = fd;
> > +	fda->stat_entries[pos].events = revents;
> > +	fda->nr_stat++;
> > +
> > +	return pos;
> > +}
> > +
> >  int fdarray__filter(struct fdarray *fda, short revents,
> >  		    void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
> >  		    void *arg)
> > @@ -113,7 +133,27 @@ int fdarray__filter(struct fdarray *fda, short revents,
> >  
> >  int fdarray__poll(struct fdarray *fda, int timeout)
> >  {
> > -	return poll(fda->entries, fda->nr, timeout);
> > +	int nr, i, pos, res;
> > +
> > +	nr = fda->nr;
> > +
> > +	for (i = 0; i < fda->nr_stat; i++) {
> > +		if (fda->stat_entries[i].fd != -1) {
> > +			pos = fdarray__add(fda, fda->stat_entries[i].fd,
> > +					   fda->stat_entries[i].events);
> 
> so every call to fdarray__poll will add whatever is
> in stat_entries to entries? how is it removed?
> 
> I think you should either follow what Adrian said
> and put 'static' descriptors early and check for
> filter number to match it as a 'quick fix'
> 
> or we should fix it for real and make it generic
> 
> so currently the interface is like this:
> 
>   pos1 = fdarray__add(a, fd1 ... );
>   pos2 = fdarray__add(a, fd2 ... );
>   pos3 = fdarray__add(a, fd3 ... );
> 
>   fdarray__poll(a);
> 
>   num = fdarray__filter(a, revents, destructor, arg);
> 
> when fdarray__filter removes some of the fds the 'pos1,pos2,pos3'
> indexes are not relevant anymore
> 
> how about we make the 'pos indexes' being stable by allocating
> separate object for each added descriptor and each poll call
> would create pollfd array from current objects, and entries
> would keep pointer to its pollfd entry
> 
>   struct fdentry *entry {
>        int              fd;
>        int              events;
>        struct pollfd   *pollfd;
>   }
> 
>   entry1 = fdarray__add(a, fd1 ...);
>   entry2 = fdarray__add(a, fd2 ...);
>   entry3 = fdarray__add(a, fd3 ...);
> 
>   fdarray__poll(a);
> 
>   struct pollfd *fdarray__entry_pollfd(a, entry1);
> 
> or something like that ;-)

maybe something like below (only compile tested)

jirka


---
diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
index 58d44d5eee31..f1effed3dde1 100644
--- a/tools/lib/api/fd/array.c
+++ b/tools/lib/api/fd/array.c
@@ -22,8 +22,8 @@ int fdarray__grow(struct fdarray *fda, int nr)
 	void *priv;
 	int nr_alloc = fda->nr_alloc + nr;
 	size_t psize = sizeof(fda->priv[0]) * nr_alloc;
-	size_t size  = sizeof(struct pollfd) * nr_alloc;
-	struct pollfd *entries = realloc(fda->entries, size);
+	size_t size  = sizeof(struct fdentry *) * nr_alloc;
+	struct fdentry **entries = realloc(fda->entries, size);
 
 	if (entries == NULL)
 		return -ENOMEM;
@@ -58,7 +58,12 @@ struct fdarray *fdarray__new(int nr_alloc, int nr_autogrow)
 
 void fdarray__exit(struct fdarray *fda)
 {
+	int i;
+
+	for (i = 0; i < fda->nr; i++)
+		free(fda->entries[i]);
 	free(fda->entries);
+	free(fda->pollfd);
 	free(fda->priv);
 	fdarray__init(fda, 0);
 }
@@ -69,18 +74,25 @@ void fdarray__delete(struct fdarray *fda)
 	free(fda);
 }
 
-int fdarray__add(struct fdarray *fda, int fd, short revents)
+struct fdentry *fdarray__add(struct fdarray *fda, int fd, short revents)
 {
-	int pos = fda->nr;
+	struct fdentry *entry;
 
 	if (fda->nr == fda->nr_alloc &&
 	    fdarray__grow(fda, fda->nr_autogrow) < 0)
-		return -ENOMEM;
+		return NULL;
+
+	entry = malloc(sizeof(*entry));
+	if (!entry)
+		return NULL;
+
+	entry->fd = fd;
+	entry->revents = revents;
+	entry->pollfd = NULL;
 
-	fda->entries[fda->nr].fd     = fd;
-	fda->entries[fda->nr].events = revents;
+	fda->entries[fda->nr] = entry;
 	fda->nr++;
-	return pos;
+	return entry;
 }
 
 int fdarray__filter(struct fdarray *fda, short revents,
@@ -93,7 +105,7 @@ int fdarray__filter(struct fdarray *fda, short revents,
 		return 0;
 
 	for (fd = 0; fd < fda->nr; ++fd) {
-		if (fda->entries[fd].revents & revents) {
+		if (fda->entries[fd]->revents & revents) {
 			if (entry_destructor)
 				entry_destructor(fda, fd, arg);
 
@@ -113,7 +125,22 @@ int fdarray__filter(struct fdarray *fda, short revents,
 
 int fdarray__poll(struct fdarray *fda, int timeout)
 {
-	return poll(fda->entries, fda->nr, timeout);
+	struct pollfd *pollfd = fda->pollfd;
+	int i;
+
+	pollfd = realloc(pollfd, sizeof(*pollfd) * fda->nr);
+	if (!pollfd)
+		return -ENOMEM;
+
+	fda->pollfd = pollfd;
+
+	for (i = 0; i < fda->nr; i++) {
+		pollfd[i].fd = fda->entries[i]->fd;
+		pollfd[i].events = fda->entries[i]->revents;
+		fda->entries[i]->pollfd = &pollfd[i];
+	}
+
+	return poll(pollfd, fda->nr, timeout);
 }
 
 int fdarray__fprintf(struct fdarray *fda, FILE *fp)
@@ -121,7 +148,12 @@ int fdarray__fprintf(struct fdarray *fda, FILE *fp)
 	int fd, printed = fprintf(fp, "%d [ ", fda->nr);
 
 	for (fd = 0; fd < fda->nr; ++fd)
-		printed += fprintf(fp, "%s%d", fd ? ", " : "", fda->entries[fd].fd);
+		printed += fprintf(fp, "%s%d", fd ? ", " : "", fda->entries[fd]->fd);
 
 	return printed + fprintf(fp, " ]");
 }
+
+int fdentry__events(struct fdentry *entry)
+{
+	return entry->pollfd->revents;
+}
diff --git a/tools/lib/api/fd/array.h b/tools/lib/api/fd/array.h
index b39557d1a88f..5231ce047f2e 100644
--- a/tools/lib/api/fd/array.h
+++ b/tools/lib/api/fd/array.h
@@ -6,6 +6,12 @@
 
 struct pollfd;
 
+struct fdentry {
+	int		 fd;
+	int		 revents;
+	struct pollfd	*pollfd;
+};
+
 /**
  * struct fdarray: Array of file descriptors
  *
@@ -20,7 +26,10 @@ struct fdarray {
 	int	       nr;
 	int	       nr_alloc;
 	int	       nr_autogrow;
-	struct pollfd *entries;
+
+	struct fdentry	**entries;
+	struct pollfd	 *pollfd;
+
 	union {
 		int    idx;
 		void   *ptr;
@@ -33,7 +42,7 @@ void fdarray__exit(struct fdarray *fda);
 struct fdarray *fdarray__new(int nr_alloc, int nr_autogrow);
 void fdarray__delete(struct fdarray *fda);
 
-int fdarray__add(struct fdarray *fda, int fd, short revents);
+struct fdentry *fdarray__add(struct fdarray *fda, int fd, short revents);
 int fdarray__poll(struct fdarray *fda, int timeout);
 int fdarray__filter(struct fdarray *fda, short revents,
 		    void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
@@ -41,6 +50,8 @@ int fdarray__filter(struct fdarray *fda, short revents,
 int fdarray__grow(struct fdarray *fda, int extra);
 int fdarray__fprintf(struct fdarray *fda, FILE *fp);
 
+int fdentry__events(struct fdentry *entry);
+
 static inline int fdarray__available_entries(struct fdarray *fda)
 {
 	return fda->nr_alloc - fda->nr;



* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-05 10:50   ` Jiri Olsa
  2020-06-05 11:38     ` Jiri Olsa
@ 2020-06-05 11:50     ` Alexey Budankov
  1 sibling, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-05 11:50 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel


On 05.06.2020 13:50, Jiri Olsa wrote:
> On Wed, Jun 03, 2020 at 06:52:59PM +0300, Alexey Budankov wrote:
>>
>> Implement adding of file descriptors by fdarray__add_stat() to
>> fixed-size (currently 1) stat_entries array located in struct fdarray.
>> Append added file descriptors to the array used by poll() syscall
>> during fdarray__poll() call. Copy poll() result of the added
>> descriptors from the array back to the storage for analysis.
>>
>> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
>> ---
>>  tools/lib/api/fd/array.c                 | 42 +++++++++++++++++++++++-
>>  tools/lib/api/fd/array.h                 |  7 ++++
>>  tools/lib/perf/evlist.c                  | 11 +++++++
>>  tools/lib/perf/include/internal/evlist.h |  2 ++
>>  4 files changed, 61 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
>> index 58d44d5eee31..b0027f2169c7 100644
>> --- a/tools/lib/api/fd/array.c
>> +++ b/tools/lib/api/fd/array.c
>> @@ -11,10 +11,16 @@
>>  
>>  void fdarray__init(struct fdarray *fda, int nr_autogrow)
>>  {
>> +	int i;
>> +
>>  	fda->entries	 = NULL;
>>  	fda->priv	 = NULL;
>>  	fda->nr		 = fda->nr_alloc = 0;
>>  	fda->nr_autogrow = nr_autogrow;
>> +
>> +	fda->nr_stat = 0;
>> +	for (i = 0; i < FDARRAY__STAT_ENTRIES_MAX; i++)
>> +		fda->stat_entries[i].fd = -1;
>>  }
>>  
>>  int fdarray__grow(struct fdarray *fda, int nr)
>> @@ -83,6 +89,20 @@ int fdarray__add(struct fdarray *fda, int fd, short revents)
>>  	return pos;
>>  }
>>  
>> +int fdarray__add_stat(struct fdarray *fda, int fd, short revents)
>> +{
>> +	int pos = fda->nr_stat;
>> +
>> +	if (pos >= FDARRAY__STAT_ENTRIES_MAX)
>> +		return -1;
>> +
>> +	fda->stat_entries[pos].fd = fd;
>> +	fda->stat_entries[pos].events = revents;
>> +	fda->nr_stat++;
>> +
>> +	return pos;
>> +}
>> +
>>  int fdarray__filter(struct fdarray *fda, short revents,
>>  		    void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
>>  		    void *arg)
>> @@ -113,7 +133,27 @@ int fdarray__filter(struct fdarray *fda, short revents,
>>  
>>  int fdarray__poll(struct fdarray *fda, int timeout)
>>  {
>> -	return poll(fda->entries, fda->nr, timeout);
>> +	int nr, i, pos, res;
>> +
>> +	nr = fda->nr;
>> +
>> +	for (i = 0; i < fda->nr_stat; i++) {
>> +		if (fda->stat_entries[i].fd != -1) {
>> +			pos = fdarray__add(fda, fda->stat_entries[i].fd,
>> +					   fda->stat_entries[i].events);
> 
> so every call to fdarray__poll will add whatever is
> in stat_entries to entries? how is it removed?

Every static fd that is not -1 is added. If a static fd is -1, it is skipped for the
poll() call. A completed static slot (fd == -1) isn't expected to be reused, but reuse
could be supported by a simple change in fdarray__add_stat(), which would probably
bring the required generality.
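
A rough standalone sketch of that scheme follows. It assumes the appended
entries are dropped again once poll() returns (the part of the patch that
would do this is not quoted above). struct fdtable, fdtable_poll() and
demo_static_poll() are hypothetical names, not the fdarray API:

```c
#include <poll.h>
#include <unistd.h>

#define DYN_MAX  8	/* room for filterable entries */
#define STAT_MAX 1	/* room for static entries */

struct fdtable {
	struct pollfd entries[DYN_MAX + STAT_MAX];
	int	      nr;			/* filterable entries in use */
	struct pollfd stat_entries[STAT_MAX];	/* fd == -1: slot is skipped */
	int	      nr_stat;			/* must not exceed STAT_MAX */
};

static int fdtable_poll(struct fdtable *t, int timeout)
{
	int pos[STAT_MAX];
	int i, res, nr = t->nr;

	/* Append the live static entries behind the filterable ones. */
	for (i = 0; i < t->nr_stat; i++) {
		pos[i] = -1;
		if (t->stat_entries[i].fd == -1)
			continue;
		t->entries[nr] = t->stat_entries[i];
		t->entries[nr].revents = 0;
		pos[i] = nr++;
	}

	res = poll(t->entries, nr, timeout);

	/*
	 * Copy the results back to the static storage. t->nr was never
	 * changed, so the appended entries are not seen by later filtering.
	 */
	for (i = 0; i < t->nr_stat; i++)
		t->stat_entries[i].revents =
			(res > 0 && pos[i] != -1) ? t->entries[pos[i]].revents : 0;

	return res;
}

/* Returns 1 if the static fd reports POLLIN and t.nr does not grow. */
static int demo_static_poll(void)
{
	struct fdtable t = { .nr = 0, .nr_stat = 1 };
	int p[2], ok;

	if (pipe(p))
		return -1;
	t.stat_entries[0].fd = p[0];
	t.stat_entries[0].events = POLLIN;

	if (write(p[1], "x", 1) != 1)
		return -1;

	/* Polling twice must not accumulate copies of the static fd. */
	ok = fdtable_poll(&t, 0) == 1 && fdtable_poll(&t, 0) == 1;
	ok = ok && t.nr == 0 && (t.stat_entries[0].revents & POLLIN);

	close(p[0]);
	close(p[1]);
	return ok;
}
```

The point of the sketch is that the count of filterable entries is left
untouched, so repeated poll calls do not pile up copies of the static fd.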

> 
> I think you should either follow what Adrian said
> and put 'static' descriptors early and check for
> filter number to match it as a 'quick fix'
> 
> or we should fix it for real and make it generic

It would complicate things without a reason. If it really matters, I would add the
possibility to realloc stat entries in fdarray__add_stat(). That would make stat entries
usage more similar to the filterable ones without a dramatic change and risk of regressions.

~Alexey

> 
> so currently the interface is like this:
> 
>   pos1 = fdarray__add(a, fd1 ... );
>   pos2 = fdarray__add(a, fd2 ... );
>   pos3 = fdarray__add(a, fd3 ... );
> 
>   fdarray__poll(a);
> 
>   num = fdarray__filter(a, revents, destructor, arg);
> 
> when fdarray__filter removes some of the fds the 'pos1,pos2,pos3'
> indexes are not relevant anymore
> 
> how about we make the 'pos indexes' being stable by allocating
> separate object for each added descriptor and each poll call
> would create pollfd array from current objects, and entries
> would keep pointer to its pollfd entry
> 
>   struct fdentry *entry {
>        int              fd;
>        int              events;
>        struct pollfd   *pollfd;
>   }
> 
>   entry1 = fdarray__add(a, fd1 ...);
>   entry2 = fdarray__add(a, fd2 ...);
>   entry3 = fdarray__add(a, fd3 ...);
> 
>   fdarray__poll(a);
> 
>   struct pollfd *fdarray__entry_pollfd(a, entry1);
> 
> or something like that ;-)
> 
> jirka
> 


* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-05 11:38     ` Jiri Olsa
@ 2020-06-05 16:15       ` Alexey Budankov
  2020-06-08  8:08         ` Alexey Budankov
  0 siblings, 1 reply; 44+ messages in thread
From: Alexey Budankov @ 2020-06-05 16:15 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel


On 05.06.2020 14:38, Jiri Olsa wrote:
> On Fri, Jun 05, 2020 at 12:50:54PM +0200, Jiri Olsa wrote:
>> On Wed, Jun 03, 2020 at 06:52:59PM +0300, Alexey Budankov wrote:
>>>
>>> Implement adding of file descriptors by fdarray__add_stat() to
>>> fixed-size (currently 1) stat_entries array located in struct fdarray.
>>> Append added file descriptors to the array used by poll() syscall
>>> during fdarray__poll() call. Copy poll() result of the added
>>> descriptors from the array back to the storage for analysis.
>>>
>>> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
>>> ---
>>>  tools/lib/api/fd/array.c                 | 42 +++++++++++++++++++++++-
>>>  tools/lib/api/fd/array.h                 |  7 ++++
>>>  tools/lib/perf/evlist.c                  | 11 +++++++
>>>  tools/lib/perf/include/internal/evlist.h |  2 ++
>>>  4 files changed, 61 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
>>> index 58d44d5eee31..b0027f2169c7 100644
>>> --- a/tools/lib/api/fd/array.c
>>> +++ b/tools/lib/api/fd/array.c
>>> @@ -11,10 +11,16 @@
>>>  
>>>  void fdarray__init(struct fdarray *fda, int nr_autogrow)
>>>  {
>>> +	int i;
>>> +
>>>  	fda->entries	 = NULL;
>>>  	fda->priv	 = NULL;
>>>  	fda->nr		 = fda->nr_alloc = 0;
>>>  	fda->nr_autogrow = nr_autogrow;
>>> +
>>> +	fda->nr_stat = 0;
>>> +	for (i = 0; i < FDARRAY__STAT_ENTRIES_MAX; i++)
>>> +		fda->stat_entries[i].fd = -1;
>>>  }
>>>  
>>>  int fdarray__grow(struct fdarray *fda, int nr)
>>> @@ -83,6 +89,20 @@ int fdarray__add(struct fdarray *fda, int fd, short revents)
>>>  	return pos;
>>>  }
>>>  
>>> +int fdarray__add_stat(struct fdarray *fda, int fd, short revents)
>>> +{
>>> +	int pos = fda->nr_stat;
>>> +
>>> +	if (pos >= FDARRAY__STAT_ENTRIES_MAX)
>>> +		return -1;
>>> +
>>> +	fda->stat_entries[pos].fd = fd;
>>> +	fda->stat_entries[pos].events = revents;
>>> +	fda->nr_stat++;
>>> +
>>> +	return pos;
>>> +}
>>> +
>>>  int fdarray__filter(struct fdarray *fda, short revents,
>>>  		    void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
>>>  		    void *arg)
>>> @@ -113,7 +133,27 @@ int fdarray__filter(struct fdarray *fda, short revents,
>>>  
>>>  int fdarray__poll(struct fdarray *fda, int timeout)
>>>  {
>>> -	return poll(fda->entries, fda->nr, timeout);
>>> +	int nr, i, pos, res;
>>> +
>>> +	nr = fda->nr;
>>> +
>>> +	for (i = 0; i < fda->nr_stat; i++) {
>>> +		if (fda->stat_entries[i].fd != -1) {
>>> +			pos = fdarray__add(fda, fda->stat_entries[i].fd,
>>> +					   fda->stat_entries[i].events);
>>
>> so every call to fdarray__poll will add whatever is
>> in stat_entries to entries? how is it removed?
>>
>> I think you should either follow what Adrian said
>> and put 'static' descriptors early and check for
>> filter number to match it as a 'quick fix'
>>
>> or we should fix it for real and make it generic
>>
>> so currently the interface is like this:
>>
>>   pos1 = fdarray__add(a, fd1 ... );
>>   pos2 = fdarray__add(a, fd2 ... );
>>   pos3 = fdarray__add(a, fd3 ... );
>>
>>   fdarray__poll(a);
>>
>>   num = fdarray__filter(a, revents, destructor, arg);
>>
>> when fdarray__filter removes some of the fds the 'pos1,pos2,pos3'
>> indexes are not relevant anymore

and that is why the return value of fdarray__add() should be converted
to bool (added/not added). Currently the calling code uses the return
value only as a bool everywhere.

fdarray__add_fixed() brings the notion of an fd with a fixed pos, which stays
valid after the fdarray__add_fixed() call, so the pos can be used to access
the poll status of that fd after the poll() call.

pos = fdarray__add_fixed(array, fd);
fdarray__poll(array);
revents = fdarray_fixed_revents(array, pos);
fdarray__del(array, pos);

or fdarray__add() could be extended with a 'fixed' attribute to avoid a separate call:
int fdarray__add(struct fdarray *fda, int fd, short revents, bool fixed)
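
One way a fixed pos could stay valid is to stop compacting the array in the
filter and instead retire removed slots by setting fd to -1, which poll()
ignores. This is only a sketch under that assumption. slot_filter() and the
separate 'fixed' flag array are hypothetical, not the fdarray API:

```c
#include <poll.h>
#include <stdbool.h>

#define NR_SLOTS 3

/*
 * Hypothetical filter that keeps positions stable: a matching entry is
 * retired in place by setting fd to -1 (poll() skips negative fds and
 * reports revents == 0 for them), and entries marked fixed are never
 * retired. Returns the number of entries that are still live.
 */
static int slot_filter(struct pollfd *entries, const bool *fixed, int nr,
		       short mask)
{
	int i, live = 0;

	for (i = 0; i < nr; i++) {
		if (entries[i].fd < 0)
			continue;
		if (!fixed[i] && (entries[i].revents & mask)) {
			entries[i].fd = -1;
			entries[i].revents = 0;
			continue;
		}
		live++;
	}
	return live;
}

/* fd stored at pos after a hang-up was reported on slots 0 and 2. */
static int fd_at_pos_after_filter(int pos)
{
	struct pollfd entries[NR_SLOTS] = {
		{ .fd = 10, .events = POLLIN, .revents = POLLHUP },
		{ .fd = 11, .events = POLLIN, .revents = 0 },
		{ .fd = 12, .events = POLLIN, .revents = POLLHUP }, /* fixed */
	};
	const bool fixed[NR_SLOTS] = { false, false, true };

	slot_filter(entries, fixed, NR_SLOTS, POLLHUP);
	return entries[pos].fd;
}
```

Slot 0 is retired, while slots 1 and 2 keep both their position and their
fd, so a pos returned at add time remains usable after filtering.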

~Alexey

>>
>> how about we make the 'pos indexes' being stable by allocating
>> separate object for each added descriptor and each poll call
>> would create pollfd array from current objects, and entries
>> would keep pointer to its pollfd entry
>>
>>   struct fdentry *entry {
>>        int              fd;
>>        int              events;
>>        struct pollfd   *pollfd;
>>   }
>>
>>   entry1 = fdarray__add(a, fd1 ...);
>>   entry2 = fdarray__add(a, fd2 ...);
>>   entry3 = fdarray__add(a, fd3 ...);
>>
>>   fdarray__poll(a);
>>
>>   struct pollfd *fdarray__entry_pollfd(a, entry1);
>>
>> or something like that ;-)
> 
> maybe something like below (only compile tested)
> 
> jirka
> 
> 
> ---
> diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
> index 58d44d5eee31..f1effed3dde1 100644
> --- a/tools/lib/api/fd/array.c
> +++ b/tools/lib/api/fd/array.c
> @@ -22,8 +22,8 @@ int fdarray__grow(struct fdarray *fda, int nr)
>  	void *priv;
>  	int nr_alloc = fda->nr_alloc + nr;
>  	size_t psize = sizeof(fda->priv[0]) * nr_alloc;
> -	size_t size  = sizeof(struct pollfd) * nr_alloc;
> -	struct pollfd *entries = realloc(fda->entries, size);
> +	size_t size  = sizeof(struct fdentry *) * nr_alloc;
> +	struct fdentry **entries = realloc(fda->entries, size);
>  
>  	if (entries == NULL)
>  		return -ENOMEM;
> @@ -58,7 +58,12 @@ struct fdarray *fdarray__new(int nr_alloc, int nr_autogrow)
>  
>  void fdarray__exit(struct fdarray *fda)
>  {
> +	int i;
> +
> +	for (i = 0; i < fda->nr; i++)
> +		free(fda->entries[i]);
>  	free(fda->entries);
> +	free(fda->pollfd);
>  	free(fda->priv);
>  	fdarray__init(fda, 0);
>  }
> @@ -69,18 +74,25 @@ void fdarray__delete(struct fdarray *fda)
>  	free(fda);
>  }
>  
> -int fdarray__add(struct fdarray *fda, int fd, short revents)
> +struct fdentry *fdarray__add(struct fdarray *fda, int fd, short revents)
>  {
> -	int pos = fda->nr;
> +	struct fdentry *entry;
>  
>  	if (fda->nr == fda->nr_alloc &&
>  	    fdarray__grow(fda, fda->nr_autogrow) < 0)
> -		return -ENOMEM;
> +		return NULL;
> +
> +	entry = malloc(sizeof(*entry));
> +	if (!entry)
> +		return NULL;
> +
> +	entry->fd = fd;
> +	entry->revents = revents;
> +	entry->pollfd = NULL;
>  
> -	fda->entries[fda->nr].fd     = fd;
> -	fda->entries[fda->nr].events = revents;
> +	fda->entries[fda->nr] = entry;
>  	fda->nr++;
> -	return pos;
> +	return entry;
>  }
>  
>  int fdarray__filter(struct fdarray *fda, short revents,
> @@ -93,7 +105,7 @@ int fdarray__filter(struct fdarray *fda, short revents,
>  		return 0;
>  
>  	for (fd = 0; fd < fda->nr; ++fd) {
> -		if (fda->entries[fd].revents & revents) {
> +		if (fda->entries[fd]->revents & revents) {
>  			if (entry_destructor)
>  				entry_destructor(fda, fd, arg);
>  
> @@ -113,7 +125,22 @@ int fdarray__filter(struct fdarray *fda, short revents,
>  
>  int fdarray__poll(struct fdarray *fda, int timeout)
>  {
> -	return poll(fda->entries, fda->nr, timeout);
> +	struct pollfd *pollfd = fda->pollfd;
> +	int i;
> +
> +	pollfd = realloc(pollfd, sizeof(*pollfd) * fda->nr);
> +	if (!pollfd)
> +		return -ENOMEM;
> +
> +	fda->pollfd = pollfd;
> +
> +	for (i = 0; i < fda->nr; i++) {
> +		pollfd[i].fd = fda->entries[i]->fd;
> +		pollfd[i].events = fda->entries[i]->revents;
> +		fda->entries[i]->pollfd = &pollfd[i];
> +	}
> +
> +	return poll(pollfd, fda->nr, timeout);
>  }
>  
>  int fdarray__fprintf(struct fdarray *fda, FILE *fp)
> @@ -121,7 +148,12 @@ int fdarray__fprintf(struct fdarray *fda, FILE *fp)
>  	int fd, printed = fprintf(fp, "%d [ ", fda->nr);
>  
>  	for (fd = 0; fd < fda->nr; ++fd)
> -		printed += fprintf(fp, "%s%d", fd ? ", " : "", fda->entries[fd].fd);
> +		printed += fprintf(fp, "%s%d", fd ? ", " : "", fda->entries[fd]->fd);
>  
>  	return printed + fprintf(fp, " ]");
>  }
> +
> +int fdentry__events(struct fdentry *entry)
> +{
> +	return entry->pollfd->revents;
> +}
> diff --git a/tools/lib/api/fd/array.h b/tools/lib/api/fd/array.h
> index b39557d1a88f..5231ce047f2e 100644
> --- a/tools/lib/api/fd/array.h
> +++ b/tools/lib/api/fd/array.h
> @@ -6,6 +6,12 @@
>  
>  struct pollfd;
>  
> +struct fdentry {
> +	int		 fd;
> +	int		 revents;
> +	struct pollfd	*pollfd;
> +};
> +
>  /**
>   * struct fdarray: Array of file descriptors
>   *
> @@ -20,7 +26,10 @@ struct fdarray {
>  	int	       nr;
>  	int	       nr_alloc;
>  	int	       nr_autogrow;
> -	struct pollfd *entries;
> +
> +	struct fdentry	**entries;
> +	struct pollfd	 *pollfd;
> +
>  	union {
>  		int    idx;
>  		void   *ptr;
> @@ -33,7 +42,7 @@ void fdarray__exit(struct fdarray *fda);
>  struct fdarray *fdarray__new(int nr_alloc, int nr_autogrow);
>  void fdarray__delete(struct fdarray *fda);
>  
> -int fdarray__add(struct fdarray *fda, int fd, short revents);
> +struct fdentry *fdarray__add(struct fdarray *fda, int fd, short revents);
>  int fdarray__poll(struct fdarray *fda, int timeout);
>  int fdarray__filter(struct fdarray *fda, short revents,
>  		    void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
> @@ -41,6 +50,8 @@ int fdarray__filter(struct fdarray *fda, short revents,
>  int fdarray__grow(struct fdarray *fda, int extra);
>  int fdarray__fprintf(struct fdarray *fda, FILE *fp);
>  
> +int fdentry__events(struct fdentry *entry);
> +
>  static inline int fdarray__available_entries(struct fdarray *fda)
>  {
>  	return fda->nr_alloc - fda->nr;
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-05 16:15       ` Alexey Budankov
@ 2020-06-08  8:08         ` Alexey Budankov
  2020-06-08  8:43           ` Jiri Olsa
  0 siblings, 1 reply; 44+ messages in thread
From: Alexey Budankov @ 2020-06-08  8:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel


On 05.06.2020 19:15, Alexey Budankov wrote:
> 
> On 05.06.2020 14:38, Jiri Olsa wrote:
>> On Fri, Jun 05, 2020 at 12:50:54PM +0200, Jiri Olsa wrote:
>>> On Wed, Jun 03, 2020 at 06:52:59PM +0300, Alexey Budankov wrote:
>>>>
>>>> Implement adding of file descriptors by fdarray__add_stat() to
>>>> fix-sized (currently 1) stat_entries array located at struct fdarray.
>>>> Append added file descriptors to the array used by poll() syscall
>>>> during fdarray__poll() call. Copy poll() result of the added
>>>> descriptors from the array back to the storage for analysis.
>>>>
>>>> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
>>>> ---
>>>>  tools/lib/api/fd/array.c                 | 42 +++++++++++++++++++++++-
>>>>  tools/lib/api/fd/array.h                 |  7 ++++
>>>>  tools/lib/perf/evlist.c                  | 11 +++++++
>>>>  tools/lib/perf/include/internal/evlist.h |  2 ++
>>>>  4 files changed, 61 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
>>>> index 58d44d5eee31..b0027f2169c7 100644
>>>> --- a/tools/lib/api/fd/array.c
>>>> +++ b/tools/lib/api/fd/array.c
>>>> @@ -11,10 +11,16 @@
>>>>  
>>>>  void fdarray__init(struct fdarray *fda, int nr_autogrow)
>>>>  {
>>>> +	int i;
>>>> +
>>>>  	fda->entries	 = NULL;
>>>>  	fda->priv	 = NULL;
>>>>  	fda->nr		 = fda->nr_alloc = 0;
>>>>  	fda->nr_autogrow = nr_autogrow;
>>>> +
>>>> +	fda->nr_stat = 0;
>>>> +	for (i = 0; i < FDARRAY__STAT_ENTRIES_MAX; i++)
>>>> +		fda->stat_entries[i].fd = -1;
>>>>  }
>>>>  
>>>>  int fdarray__grow(struct fdarray *fda, int nr)
>>>> @@ -83,6 +89,20 @@ int fdarray__add(struct fdarray *fda, int fd, short revents)
>>>>  	return pos;
>>>>  }
>>>>  
>>>> +int fdarray__add_stat(struct fdarray *fda, int fd, short revents)
>>>> +{
>>>> +	int pos = fda->nr_stat;
>>>> +
>>>> +	if (pos >= FDARRAY__STAT_ENTRIES_MAX)
>>>> +		return -1;
>>>> +
>>>> +	fda->stat_entries[pos].fd = fd;
>>>> +	fda->stat_entries[pos].events = revents;
>>>> +	fda->nr_stat++;
>>>> +
>>>> +	return pos;
>>>> +}
>>>> +
>>>>  int fdarray__filter(struct fdarray *fda, short revents,
>>>>  		    void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
>>>>  		    void *arg)
>>>> @@ -113,7 +133,27 @@ int fdarray__filter(struct fdarray *fda, short revents,
>>>>  
>>>>  int fdarray__poll(struct fdarray *fda, int timeout)
>>>>  {
>>>> -	return poll(fda->entries, fda->nr, timeout);
>>>> +	int nr, i, pos, res;
>>>> +
>>>> +	nr = fda->nr;
>>>> +
>>>> +	for (i = 0; i < fda->nr_stat; i++) {
>>>> +		if (fda->stat_entries[i].fd != -1) {
>>>> +			pos = fdarray__add(fda, fda->stat_entries[i].fd,
>>>> +					   fda->stat_entries[i].events);
>>>
>>> so every call to fdarray__poll will add whatever is
>>> in stat_entries to entries? how is it removed?
>>>
>>> I think you should either follow what Adrian said
>>> and put 'static' descriptors early and check for
>>> filter number to match it as an 'quick fix'
>>>
>>> or we should fix it for real and make it generic
>>>
>>> so currently the interface is like this:
>>>
>>>   pos1 = fdarray__add(a, fd1 ... );
>>>   pos2 = fdarray__add(a, fd2 ... );
>>>   pos3 = fdarray__add(a, fd2 ... );
>>>
>>>   fdarray__poll(a);
>>>
>>>   num = fdarray__filter(a, revents, destructor, arg);
>>>
>>> when fdarray__filter removes some of the fds the 'pos1,pos2,pos3'
>>> indexes are not relevant anymore
> 
> and that is why the return value of fdarray__add() should be converted
> to bool (added/not added). Currently the return value is used as bool
> only all over the calling code.
> 
> fdarray__add_fixed() brings the notion of fd with fixed pos which is
> valid after fdarray__add_fixed() call so the pos could be used to access
> pos fd poll status after poll() call.
> 
> pos = fdarray__add_fixed(array, fd);
> fdarray_poll(array);
> revents = fdarray_fixed_revents(array, pos);
> fdarray__del(array, pos);

So how about just adding _revents() and _del() for fixed fds, and
changing the return value of fdarray__add() to bool?

~Alexey

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-08  8:08         ` Alexey Budankov
@ 2020-06-08  8:43           ` Jiri Olsa
  2020-06-08  9:54             ` Alexey Budankov
  0 siblings, 1 reply; 44+ messages in thread
From: Jiri Olsa @ 2020-06-08  8:43 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On Mon, Jun 08, 2020 at 11:08:56AM +0300, Alexey Budankov wrote:
> 
> On 05.06.2020 19:15, Alexey Budankov wrote:
> > 
> > On 05.06.2020 14:38, Jiri Olsa wrote:
> >> On Fri, Jun 05, 2020 at 12:50:54PM +0200, Jiri Olsa wrote:
> >>> On Wed, Jun 03, 2020 at 06:52:59PM +0300, Alexey Budankov wrote:
> >>>>
> >>>> Implement adding of file descriptors by fdarray__add_stat() to
> >>>> fix-sized (currently 1) stat_entries array located at struct fdarray.
> >>>> Append added file descriptors to the array used by poll() syscall
> >>>> during fdarray__poll() call. Copy poll() result of the added
> >>>> descriptors from the array back to the storage for analysis.
> >>>>
> >>>> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
> >>>> ---
> >>>>  tools/lib/api/fd/array.c                 | 42 +++++++++++++++++++++++-
> >>>>  tools/lib/api/fd/array.h                 |  7 ++++
> >>>>  tools/lib/perf/evlist.c                  | 11 +++++++
> >>>>  tools/lib/perf/include/internal/evlist.h |  2 ++
> >>>>  4 files changed, 61 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
> >>>> index 58d44d5eee31..b0027f2169c7 100644
> >>>> --- a/tools/lib/api/fd/array.c
> >>>> +++ b/tools/lib/api/fd/array.c
> >>>> @@ -11,10 +11,16 @@
> >>>>  
> >>>>  void fdarray__init(struct fdarray *fda, int nr_autogrow)
> >>>>  {
> >>>> +	int i;
> >>>> +
> >>>>  	fda->entries	 = NULL;
> >>>>  	fda->priv	 = NULL;
> >>>>  	fda->nr		 = fda->nr_alloc = 0;
> >>>>  	fda->nr_autogrow = nr_autogrow;
> >>>> +
> >>>> +	fda->nr_stat = 0;
> >>>> +	for (i = 0; i < FDARRAY__STAT_ENTRIES_MAX; i++)
> >>>> +		fda->stat_entries[i].fd = -1;
> >>>>  }
> >>>>  
> >>>>  int fdarray__grow(struct fdarray *fda, int nr)
> >>>> @@ -83,6 +89,20 @@ int fdarray__add(struct fdarray *fda, int fd, short revents)
> >>>>  	return pos;
> >>>>  }
> >>>>  
> >>>> +int fdarray__add_stat(struct fdarray *fda, int fd, short revents)
> >>>> +{
> >>>> +	int pos = fda->nr_stat;
> >>>> +
> >>>> +	if (pos >= FDARRAY__STAT_ENTRIES_MAX)
> >>>> +		return -1;
> >>>> +
> >>>> +	fda->stat_entries[pos].fd = fd;
> >>>> +	fda->stat_entries[pos].events = revents;
> >>>> +	fda->nr_stat++;
> >>>> +
> >>>> +	return pos;
> >>>> +}
> >>>> +
> >>>>  int fdarray__filter(struct fdarray *fda, short revents,
> >>>>  		    void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
> >>>>  		    void *arg)
> >>>> @@ -113,7 +133,27 @@ int fdarray__filter(struct fdarray *fda, short revents,
> >>>>  
> >>>>  int fdarray__poll(struct fdarray *fda, int timeout)
> >>>>  {
> >>>> -	return poll(fda->entries, fda->nr, timeout);
> >>>> +	int nr, i, pos, res;
> >>>> +
> >>>> +	nr = fda->nr;
> >>>> +
> >>>> +	for (i = 0; i < fda->nr_stat; i++) {
> >>>> +		if (fda->stat_entries[i].fd != -1) {
> >>>> +			pos = fdarray__add(fda, fda->stat_entries[i].fd,
> >>>> +					   fda->stat_entries[i].events);
> >>>
> >>> so every call to fdarray__poll will add whatever is
> >>> in stat_entries to entries? how is it removed?
> >>>
> >>> I think you should either follow what Adrian said
> >>> and put 'static' descriptors early and check for
> >>> filter number to match it as an 'quick fix'
> >>>
> >>> or we should fix it for real and make it generic
> >>>
> >>> so currently the interface is like this:
> >>>
> >>>   pos1 = fdarray__add(a, fd1 ... );
> >>>   pos2 = fdarray__add(a, fd2 ... );
> >>>   pos3 = fdarray__add(a, fd2 ... );
> >>>
> >>>   fdarray__poll(a);
> >>>
> >>>   num = fdarray__filter(a, revents, destructor, arg);
> >>>
> >>> when fdarray__filter removes some of the fds the 'pos1,pos2,pos3'
> >>> indexes are not relevant anymore
> > 
> > and that is why the return value of fdarray__add() should be converted
> > to bool (added/not added). Currently the return value is used as bool
> > only all over the calling code.
> > 
> > fdarray__add_fixed() brings the notion of fd with fixed pos which is
> > valid after fdarray__add_fixed() call so the pos could be used to access
> > pos fd poll status after poll() call.
> > 
> > pos = fdarray__add_fixed(array, fd);
> > fdarray_poll(array);
> > revents = fdarray_fixed_revents(array, pos);
> > fdarray__del(array, pos);
> 
> So how is it about just adding _revents() and _del() for fixed fds with
> correction of retval to bool for fdarray__add()?

I don't like the separation of fixed and non-fixed fds,
why can't we make it generic?

jirka


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-08  8:43           ` Jiri Olsa
@ 2020-06-08  9:54             ` Alexey Budankov
  2020-06-08 15:05               ` Alexey Budankov
  2020-06-08 16:07               ` Jiri Olsa
  0 siblings, 2 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-08  9:54 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel


On 08.06.2020 11:43, Jiri Olsa wrote:
> On Mon, Jun 08, 2020 at 11:08:56AM +0300, Alexey Budankov wrote:
>>
>> On 05.06.2020 19:15, Alexey Budankov wrote:
>>>
>>> On 05.06.2020 14:38, Jiri Olsa wrote:
>>>> On Fri, Jun 05, 2020 at 12:50:54PM +0200, Jiri Olsa wrote:
>>>>> On Wed, Jun 03, 2020 at 06:52:59PM +0300, Alexey Budankov wrote:
>>>>>>
>>>>>> Implement adding of file descriptors by fdarray__add_stat() to
>>>>>> fix-sized (currently 1) stat_entries array located at struct fdarray.
>>>>>> Append added file descriptors to the array used by poll() syscall
>>>>>> during fdarray__poll() call. Copy poll() result of the added
>>>>>> descriptors from the array back to the storage for analysis.
>>>>>>
>>>>>> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
>>>>>> ---
>>>>>>  tools/lib/api/fd/array.c                 | 42 +++++++++++++++++++++++-
>>>>>>  tools/lib/api/fd/array.h                 |  7 ++++
>>>>>>  tools/lib/perf/evlist.c                  | 11 +++++++
>>>>>>  tools/lib/perf/include/internal/evlist.h |  2 ++
>>>>>>  4 files changed, 61 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
>>>>>> index 58d44d5eee31..b0027f2169c7 100644
>>>>>> --- a/tools/lib/api/fd/array.c
>>>>>> +++ b/tools/lib/api/fd/array.c
>>>>>> @@ -11,10 +11,16 @@
>>>>>>  
>>>>>>  void fdarray__init(struct fdarray *fda, int nr_autogrow)
>>>>>>  {
>>>>>> +	int i;
>>>>>> +
>>>>>>  	fda->entries	 = NULL;
>>>>>>  	fda->priv	 = NULL;
>>>>>>  	fda->nr		 = fda->nr_alloc = 0;
>>>>>>  	fda->nr_autogrow = nr_autogrow;
>>>>>> +
>>>>>> +	fda->nr_stat = 0;
>>>>>> +	for (i = 0; i < FDARRAY__STAT_ENTRIES_MAX; i++)
>>>>>> +		fda->stat_entries[i].fd = -1;
>>>>>>  }
>>>>>>  
>>>>>>  int fdarray__grow(struct fdarray *fda, int nr)
>>>>>> @@ -83,6 +89,20 @@ int fdarray__add(struct fdarray *fda, int fd, short revents)
>>>>>>  	return pos;
>>>>>>  }
>>>>>>  
>>>>>> +int fdarray__add_stat(struct fdarray *fda, int fd, short revents)
>>>>>> +{
>>>>>> +	int pos = fda->nr_stat;
>>>>>> +
>>>>>> +	if (pos >= FDARRAY__STAT_ENTRIES_MAX)
>>>>>> +		return -1;
>>>>>> +
>>>>>> +	fda->stat_entries[pos].fd = fd;
>>>>>> +	fda->stat_entries[pos].events = revents;
>>>>>> +	fda->nr_stat++;
>>>>>> +
>>>>>> +	return pos;
>>>>>> +}
>>>>>> +
>>>>>>  int fdarray__filter(struct fdarray *fda, short revents,
>>>>>>  		    void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
>>>>>>  		    void *arg)
>>>>>> @@ -113,7 +133,27 @@ int fdarray__filter(struct fdarray *fda, short revents,
>>>>>>  
>>>>>>  int fdarray__poll(struct fdarray *fda, int timeout)
>>>>>>  {
>>>>>> -	return poll(fda->entries, fda->nr, timeout);
>>>>>> +	int nr, i, pos, res;
>>>>>> +
>>>>>> +	nr = fda->nr;
>>>>>> +
>>>>>> +	for (i = 0; i < fda->nr_stat; i++) {
>>>>>> +		if (fda->stat_entries[i].fd != -1) {
>>>>>> +			pos = fdarray__add(fda, fda->stat_entries[i].fd,
>>>>>> +					   fda->stat_entries[i].events);
>>>>>
>>>>> so every call to fdarray__poll will add whatever is
>>>>> in stat_entries to entries? how is it removed?
>>>>>
>>>>> I think you should either follow what Adrian said
>>>>> and put 'static' descriptors early and check for
>>>>> filter number to match it as an 'quick fix'
>>>>>
>>>>> or we should fix it for real and make it generic
>>>>>
>>>>> so currently the interface is like this:
>>>>>
>>>>>   pos1 = fdarray__add(a, fd1 ... );
>>>>>   pos2 = fdarray__add(a, fd2 ... );
>>>>>   pos3 = fdarray__add(a, fd2 ... );
>>>>>
>>>>>   fdarray__poll(a);
>>>>>
>>>>>   num = fdarray__filter(a, revents, destructor, arg);
>>>>>
>>>>> when fdarray__filter removes some of the fds the 'pos1,pos2,pos3'
>>>>> indexes are not relevant anymore
>>>
>>> and that is why the return value of fdarray__add() should be converted
>>> to bool (added/not added). Currently the return value is used as bool
>>> only all over the calling code.
>>>
>>> fdarray__add_fixed() brings the notion of fd with fixed pos which is
>>> valid after fdarray__add_fixed() call so the pos could be used to access
>>> pos fd poll status after poll() call.
>>>
>>> pos = fdarray__add_fixed(array, fd);
>>> fdarray_poll(array);
>>> revents = fdarray_fixed_revents(array, pos);
>>> fdarray__del(array, pos);
>>
>> So how is it about just adding _revents() and _del() for fixed fds with
>> correction of retval to bool for fdarray__add()?
> 
> I don't like the separation for fixed and non-fixed fds,
> why can't we make generic?

The usage models are different, but both kinds of fds still need to be part
of the same class for atomic poll(). The distinction is filterable vs.
non-filterable, and it should be expressed somehow in the API. The options are:
1. expose separate API calls like __add_nonfilterable(), __del_nonfilterable();
   use the nonfilterable quality in __filter() and __poll() and, perhaps, other internals;
2. extend fdarray__add(, nonfilterable) with the nonfilterable quality;
   use the type in __filter() and __poll() and, perhaps, other internals;
   expose fewer API calls than option 1

The pos returned for filterable fds should be converted to bool, since currently
the returned pos can become stale and the API offers no way to check its state.
So it could look like this:

fdkey = fdarray__add(array, fd, events, type)
type: filterable, nonfilterable, something else
revents = fdarray__get_revents(fdkey);
fdarray__del(array, fdkey);
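
A rough sketch of how such a key-based interface might behave, under the
assumption that slots are retired in place rather than compacted, so a key
stays valid until it is deleted. All keyed_* names are hypothetical, the
key is a plain slot index here, and unlike the outline above the revents
getter also takes the array:

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>

#define KEYED_CAP 8

enum keyed_type { KEYED_FILTERABLE, KEYED_NONFILTERABLE };

/* Hypothetical container; slots are never moved once handed out. */
struct keyed_fdarray {
	int nr;
	struct pollfd entries[KEYED_CAP];
	enum keyed_type type[KEYED_CAP];
};

/* Returns a key (slot index) that stays valid until keyed_del(), or -1. */
static int keyed_add(struct keyed_fdarray *a, int fd, short events,
		     enum keyed_type type)
{
	assert(fd >= 0);	/* a negative fd marks a retired slot */
	if (a->nr == KEYED_CAP)
		return -1;
	a->entries[a->nr] = (struct pollfd){ .fd = fd, .events = events };
	a->type[a->nr] = type;
	return a->nr++;
}

/* A key is live while it is in range and its slot has not been retired. */
static bool keyed_valid(const struct keyed_fdarray *a, int key)
{
	return key >= 0 && key < a->nr && a->entries[key].fd >= 0;
}

/* revents stored in the slot, or 0 for a retired or invalid key. */
static short keyed_get_revents(const struct keyed_fdarray *a, int key)
{
	return keyed_valid(a, key) ? a->entries[key].revents : 0;
}

/* Retire the slot in place; poll() ignores entries with a negative fd. */
static void keyed_del(struct keyed_fdarray *a, int key)
{
	if (keyed_valid(a, key)) {
		a->entries[key].fd = -1;
		a->entries[key].revents = 0;
	}
}

/*
 * Retire filterable entries whose revents match; nonfilterable entries are
 * left alone. Returns the number of live filterable entries that remain.
 */
static int keyed_filter(struct keyed_fdarray *a, short revents)
{
	int i, nr = 0;

	for (i = 0; i < a->nr; i++) {
		if (!keyed_valid(a, i) || a->type[i] == KEYED_NONFILTERABLE)
			continue;
		if (a->entries[i].revents & revents)
			keyed_del(a, i);
		else
			nr++;
	}
	return nr;
}
```

With this shape a control fd added as KEYED_NONFILTERABLE keeps its key and
its revents across keyed_filter(), while event fds that hung up are retired
and their keys report as invalid instead of silently pointing elsewhere.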

~Alexey

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-08  9:54             ` Alexey Budankov
@ 2020-06-08 15:05               ` Alexey Budankov
  2020-06-08 16:07               ` Jiri Olsa
  1 sibling, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-08 15:05 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel


On 08.06.2020 12:54, Alexey Budankov wrote:
> 
> On 08.06.2020 11:43, Jiri Olsa wrote:
>> On Mon, Jun 08, 2020 at 11:08:56AM +0300, Alexey Budankov wrote:
>>>
>>> On 05.06.2020 19:15, Alexey Budankov wrote:
>>>>
>>>> On 05.06.2020 14:38, Jiri Olsa wrote:
>>>>> On Fri, Jun 05, 2020 at 12:50:54PM +0200, Jiri Olsa wrote:
>>>>>> On Wed, Jun 03, 2020 at 06:52:59PM +0300, Alexey Budankov wrote:
>>>>>>>
>>>>>>> Implement adding of file descriptors by fdarray__add_stat() to
>>>>>>> fix-sized (currently 1) stat_entries array located at struct fdarray.
>>>>>>> Append added file descriptors to the array used by poll() syscall
>>>>>>> during fdarray__poll() call. Copy poll() result of the added
>>>>>>> descriptors from the array back to the storage for analysis.
>>>>>>>
>>>>>>> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
>>>>>>> ---
<SNIP>
>>>>>>> +					   fda->stat_entries[i].events);
>>>>>>
>>>>>> so every call to fdarray__poll will add whatever is
>>>>>> in stat_entries to entries? how is it removed?
>>>>>>
>>>>>> I think you should either follow what Adrian said
>>>>>> and put 'static' descriptors early and check for
>>>>>> filter number to match it as an 'quick fix'
>>>>>>
>>>>>> or we should fix it for real and make it generic
>>>>>>
>>>>>> so currently the interface is like this:
>>>>>>
>>>>>>   pos1 = fdarray__add(a, fd1 ... );
>>>>>>   pos2 = fdarray__add(a, fd2 ... );
>>>>>>   pos3 = fdarray__add(a, fd2 ... );
>>>>>>
>>>>>>   fdarray__poll(a);
>>>>>>
>>>>>>   num = fdarray__filter(a, revents, destructor, arg);
>>>>>>
>>>>>> when fdarray__filter removes some of the fds the 'pos1,pos2,pos3'
>>>>>> indexes are not relevant anymore
>>>>
>>>> and that is why the return value of fdarray__add() should be converted
>>>> to bool (added/not added). Currently the return value is used as bool
>>>> only all over the calling code.
>>>>
>>>> fdarray__add_fixed() brings the notion of fd with fixed pos which is
>>>> valid after fdarray__add_fixed() call so the pos could be used to access
>>>> pos fd poll status after poll() call.
>>>>
>>>> pos = fdarray__add_fixed(array, fd);
>>>> fdarray_poll(array);
>>>> revents = fdarray_fixed_revents(array, pos);
>>>> fdarray__del(array, pos);
>>>
>>> So how is it about just adding _revents() and _del() for fixed fds with
>>> correction of retval to bool for fdarray__add()?
>>
>> I don't like the separation for fixed and non-fixed fds,
>> why can't we make generic?
> 
> Usage models are different but they want still to be parts of the same class
> for atomic poll(). The distinction is filterable vs. not filterable.
> The distinction should be somehow provided in API. Options are:
> 1. expose separate API calls like __add_nonfilterable(), __del_nonfilterable();
>    use nonfilterable quality in __filter() and __poll() and, perhaps, other internals;
> 2. extend fdarray__add(, nonfilterable) with the nonfilterable quality
>    use the type in __filter() and __poll() and, perhaps, other internals;
>    expose less API calls in comparison with option 1
> 
> Exposure of pos for filterable fds should be converted to bool since currently
> the returned pos can become stale and there is no way in API to check its state.
> So it could look like this:
> 
> fdkey = fdarray__add(array, fd, events, type)
> type: filterable, nonfilterable, something else
> revents = fdarray__get_revents(fdkey);
> fdarray__del(array, fdkey);

Are there any thoughts regarding this?

~Alexey

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-08  9:54             ` Alexey Budankov
  2020-06-08 15:05               ` Alexey Budankov
@ 2020-06-08 16:07               ` Jiri Olsa
  2020-06-08 16:43                 ` Alexey Budankov
  2020-06-15  5:20                 ` Alexey Budankov
  1 sibling, 2 replies; 44+ messages in thread
From: Jiri Olsa @ 2020-06-08 16:07 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On Mon, Jun 08, 2020 at 12:54:31PM +0300, Alexey Budankov wrote:
> 
> On 08.06.2020 11:43, Jiri Olsa wrote:
> > On Mon, Jun 08, 2020 at 11:08:56AM +0300, Alexey Budankov wrote:
> >>
> >> On 05.06.2020 19:15, Alexey Budankov wrote:
> >>>
> >>> On 05.06.2020 14:38, Jiri Olsa wrote:
> >>>> On Fri, Jun 05, 2020 at 12:50:54PM +0200, Jiri Olsa wrote:
> >>>>> On Wed, Jun 03, 2020 at 06:52:59PM +0300, Alexey Budankov wrote:
> >>>>>>
> >>>>>> Implement adding of file descriptors by fdarray__add_stat() to
> >>>>>> fix-sized (currently 1) stat_entries array located at struct fdarray.
> >>>>>> Append added file descriptors to the array used by poll() syscall
> >>>>>> during fdarray__poll() call. Copy poll() result of the added
> >>>>>> descriptors from the array back to the storage for analysis.
> >>>>>>
> >>>>>> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
> >>>>>> ---
> >>>>>>  tools/lib/api/fd/array.c                 | 42 +++++++++++++++++++++++-
> >>>>>>  tools/lib/api/fd/array.h                 |  7 ++++
> >>>>>>  tools/lib/perf/evlist.c                  | 11 +++++++
> >>>>>>  tools/lib/perf/include/internal/evlist.h |  2 ++
> >>>>>>  4 files changed, 61 insertions(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
> >>>>>> index 58d44d5eee31..b0027f2169c7 100644
> >>>>>> --- a/tools/lib/api/fd/array.c
> >>>>>> +++ b/tools/lib/api/fd/array.c
> >>>>>> @@ -11,10 +11,16 @@
> >>>>>>  
> >>>>>>  void fdarray__init(struct fdarray *fda, int nr_autogrow)
> >>>>>>  {
> >>>>>> +	int i;
> >>>>>> +
> >>>>>>  	fda->entries	 = NULL;
> >>>>>>  	fda->priv	 = NULL;
> >>>>>>  	fda->nr		 = fda->nr_alloc = 0;
> >>>>>>  	fda->nr_autogrow = nr_autogrow;
> >>>>>> +
> >>>>>> +	fda->nr_stat = 0;
> >>>>>> +	for (i = 0; i < FDARRAY__STAT_ENTRIES_MAX; i++)
> >>>>>> +		fda->stat_entries[i].fd = -1;
> >>>>>>  }
> >>>>>>  
> >>>>>>  int fdarray__grow(struct fdarray *fda, int nr)
> >>>>>> @@ -83,6 +89,20 @@ int fdarray__add(struct fdarray *fda, int fd, short revents)
> >>>>>>  	return pos;
> >>>>>>  }
> >>>>>>  
> >>>>>> +int fdarray__add_stat(struct fdarray *fda, int fd, short revents)
> >>>>>> +{
> >>>>>> +	int pos = fda->nr_stat;
> >>>>>> +
> >>>>>> +	if (pos >= FDARRAY__STAT_ENTRIES_MAX)
> >>>>>> +		return -1;
> >>>>>> +
> >>>>>> +	fda->stat_entries[pos].fd = fd;
> >>>>>> +	fda->stat_entries[pos].events = revents;
> >>>>>> +	fda->nr_stat++;
> >>>>>> +
> >>>>>> +	return pos;
> >>>>>> +}
> >>>>>> +
> >>>>>>  int fdarray__filter(struct fdarray *fda, short revents,
> >>>>>>  		    void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
> >>>>>>  		    void *arg)
> >>>>>> @@ -113,7 +133,27 @@ int fdarray__filter(struct fdarray *fda, short revents,
> >>>>>>  
> >>>>>>  int fdarray__poll(struct fdarray *fda, int timeout)
> >>>>>>  {
> >>>>>> -	return poll(fda->entries, fda->nr, timeout);
> >>>>>> +	int nr, i, pos, res;
> >>>>>> +
> >>>>>> +	nr = fda->nr;
> >>>>>> +
> >>>>>> +	for (i = 0; i < fda->nr_stat; i++) {
> >>>>>> +		if (fda->stat_entries[i].fd != -1) {
> >>>>>> +			pos = fdarray__add(fda, fda->stat_entries[i].fd,
> >>>>>> +					   fda->stat_entries[i].events);
> >>>>>
> >>>>> so every call to fdarray__poll will add whatever is
> >>>>> in stat_entries to entries? how is it removed?
> >>>>>
> >>>>> I think you should either follow what Adrian said
> >>>>> and put 'static' descriptors early and check for
> >>>>> filter number to match it as an 'quick fix'
> >>>>>
> >>>>> or we should fix it for real and make it generic
> >>>>>
> >>>>> so currently the interface is like this:
> >>>>>
> >>>>>   pos1 = fdarray__add(a, fd1 ... );
> >>>>>   pos2 = fdarray__add(a, fd2 ... );
> >>>>>   pos3 = fdarray__add(a, fd2 ... );
> >>>>>
> >>>>>   fdarray__poll(a);
> >>>>>
> >>>>>   num = fdarray__filter(a, revents, destructor, arg);
> >>>>>
> >>>>> when fdarray__filter removes some of the fds the 'pos1,pos2,pos3'
> >>>>> indexes are not relevant anymore
> >>>
> >>> and that is why the return value of fdarray__add() should be converted
> >>> to bool (added/not added). Currently the return value is used as bool
> >>> only all over the calling code.
> >>>
> >>> fdarray__add_fixed() brings the notion of fd with fixed pos which is
> >>> valid after fdarray__add_fixed() call so the pos could be used to access
> >>> pos fd poll status after poll() call.
> >>>
> >>> pos = fdarray__add_fixed(array, fd);
> >>> fdarray_poll(array);
> >>> revents = fdarray_fixed_revents(array, pos);
> >>> fdarray__del(array, pos);
> >>
> >> So how is it about just adding _revents() and _del() for fixed fds with
> >> correction of retval to bool for fdarray__add()?
> > 
> > I don't like the separation for fixed and non-fixed fds,
> > why can't we make generic?
> 
> Usage models are different but they want still to be parts of the same class
> for atomic poll(). The distinction is filterable vs. not filterable.
> The distinction should be somehow provided in API. Options are:
> 1. expose separate API calls like __add_nonfilterable(), __del_nonfilterable();
>    use nonfilterable quality in __filter() and __poll() and, perhaps, other internals;
> 2. extend fdarray__add(, nonfilterable) with the nonfilterable quality
>    use the type in __filter() and __poll() and, perhaps, other internals;
>    expose less API calls in comparison with option 1
> 
> Exposure of pos for filterable fds should be converted to bool since currently
> the returned pos can become stale and there is no way in API to check its state.
> So it could look like this:
> 
> fdkey = fdarray__add(array, fd, events, type)
> > type: filterable, nonfilterable, something else
> revents = fdarray__get_revents(fdkey);
> fdarray__del(array, fdkey);

I think there's a solution without having a filterable type,
I'm not sure why you think this is needed

I'm busy with other things this week, but I think I can
come up with a patch early next week if needed

jirka


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-08 16:07               ` Jiri Olsa
@ 2020-06-08 16:43                 ` Alexey Budankov
  2020-06-08 17:18                   ` Alexey Budankov
  2020-06-15  5:20                 ` Alexey Budankov
  1 sibling, 1 reply; 44+ messages in thread
From: Alexey Budankov @ 2020-06-08 16:43 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel


On 08.06.2020 19:07, Jiri Olsa wrote:
> On Mon, Jun 08, 2020 at 12:54:31PM +0300, Alexey Budankov wrote:
>>
>> On 08.06.2020 11:43, Jiri Olsa wrote:
>>> On Mon, Jun 08, 2020 at 11:08:56AM +0300, Alexey Budankov wrote:
>>>>
>>>> On 05.06.2020 19:15, Alexey Budankov wrote:
>>>>>
>>>>> On 05.06.2020 14:38, Jiri Olsa wrote:
>>>>>> On Fri, Jun 05, 2020 at 12:50:54PM +0200, Jiri Olsa wrote:
>>>>>>> On Wed, Jun 03, 2020 at 06:52:59PM +0300, Alexey Budankov wrote:
>>>>>>>>
>>>>>>>> Implement adding of file descriptors by fdarray__add_stat() to
>>>>>>>> fix-sized (currently 1) stat_entries array located at struct fdarray.
>>>>>>>> Append added file descriptors to the array used by poll() syscall
>>>>>>>> during fdarray__poll() call. Copy poll() result of the added
>>>>>>>> descriptors from the array back to the storage for analysis.
>>>>>>>>
>>>>>>>> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
>>>>>>>> ---
>>>>>>>>  tools/lib/api/fd/array.c                 | 42 +++++++++++++++++++++++-
>>>>>>>>  tools/lib/api/fd/array.h                 |  7 ++++
>>>>>>>>  tools/lib/perf/evlist.c                  | 11 +++++++
>>>>>>>>  tools/lib/perf/include/internal/evlist.h |  2 ++
>>>>>>>>  4 files changed, 61 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
>>>>>>>> index 58d44d5eee31..b0027f2169c7 100644
>>>>>>>> --- a/tools/lib/api/fd/array.c
>>>>>>>> +++ b/tools/lib/api/fd/array.c
>>>>>>>> @@ -11,10 +11,16 @@
>>>>>>>>  
>>>>>>>>  void fdarray__init(struct fdarray *fda, int nr_autogrow)
>>>>>>>>  {
>>>>>>>> +	int i;
>>>>>>>> +
>>>>>>>>  	fda->entries	 = NULL;
>>>>>>>>  	fda->priv	 = NULL;
>>>>>>>>  	fda->nr		 = fda->nr_alloc = 0;
>>>>>>>>  	fda->nr_autogrow = nr_autogrow;
>>>>>>>> +
>>>>>>>> +	fda->nr_stat = 0;
>>>>>>>> +	for (i = 0; i < FDARRAY__STAT_ENTRIES_MAX; i++)
>>>>>>>> +		fda->stat_entries[i].fd = -1;
>>>>>>>>  }
>>>>>>>>  
>>>>>>>>  int fdarray__grow(struct fdarray *fda, int nr)
>>>>>>>> @@ -83,6 +89,20 @@ int fdarray__add(struct fdarray *fda, int fd, short revents)
>>>>>>>>  	return pos;
>>>>>>>>  }
>>>>>>>>  
>>>>>>>> +int fdarray__add_stat(struct fdarray *fda, int fd, short revents)
>>>>>>>> +{
>>>>>>>> +	int pos = fda->nr_stat;
>>>>>>>> +
>>>>>>>> +	if (pos >= FDARRAY__STAT_ENTRIES_MAX)
>>>>>>>> +		return -1;
>>>>>>>> +
>>>>>>>> +	fda->stat_entries[pos].fd = fd;
>>>>>>>> +	fda->stat_entries[pos].events = revents;
>>>>>>>> +	fda->nr_stat++;
>>>>>>>> +
>>>>>>>> +	return pos;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>  int fdarray__filter(struct fdarray *fda, short revents,
>>>>>>>>  		    void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
>>>>>>>>  		    void *arg)
>>>>>>>> @@ -113,7 +133,27 @@ int fdarray__filter(struct fdarray *fda, short revents,
>>>>>>>>  
>>>>>>>>  int fdarray__poll(struct fdarray *fda, int timeout)
>>>>>>>>  {
>>>>>>>> -	return poll(fda->entries, fda->nr, timeout);
>>>>>>>> +	int nr, i, pos, res;
>>>>>>>> +
>>>>>>>> +	nr = fda->nr;
>>>>>>>> +
>>>>>>>> +	for (i = 0; i < fda->nr_stat; i++) {
>>>>>>>> +		if (fda->stat_entries[i].fd != -1) {
>>>>>>>> +			pos = fdarray__add(fda, fda->stat_entries[i].fd,
>>>>>>>> +					   fda->stat_entries[i].events);
>>>>>>>
>>>>>>> so every call to fdarray__poll will add whatever is
>>>>>>> in stat_entries to entries? how is it removed?
>>>>>>>
>>>>>>> I think you should either follow what Adrian said
>>>>>>> and put 'static' descriptors early and check for
>>>>>>> filter number to match it as an 'quick fix'
>>>>>>>
>>>>>>> or we should fix it for real and make it generic
>>>>>>>
>>>>>>> so currently the interface is like this:
>>>>>>>
>>>>>>>   pos1 = fdarray__add(a, fd1 ... );
>>>>>>>   pos2 = fdarray__add(a, fd2 ... );
>>>>>>>   pos3 = fdarray__add(a, fd3 ... );
>>>>>>>
>>>>>>>   fdarray__poll(a);
>>>>>>>
>>>>>>>   num = fdarray__filter(a, revents, destructor, arg);
>>>>>>>
>>>>>>> when fdarray__filter removes some of the fds the 'pos1,pos2,pos3'
>>>>>>> indexes are not relevant anymore
>>>>>
>>>>> and that is why the return value of fdarray__add() should be converted
>>>>> to bool (added/not added). Currently the return value is used as bool
>>>>> only all over the calling code.
>>>>>
>>>>> fdarray__add_fixed() brings the notion of fd with fixed pos which is
>>>>> valid after fdarray__add_fixed() call so the pos could be used to access
>>>>> pos fd poll status after poll() call.
>>>>>
>>>>> pos = fdarray__add_fixed(array, fd);
>>>>> fdarray_poll(array);
>>>>> revents = fdarray_fixed_revents(array, pos);
>>>>> fdarray__del(array, pos);
>>>>
>>>> So how is it about just adding _revents() and _del() for fixed fds with
>>>> correction of retval to bool for fdarray__add()?
>>>
>>> I don't like the separation for fixed and non-fixed fds,
>>> why can't we make generic?
>>
>> Usage models are different but they want still to be parts of the same class
>> for atomic poll(). The distinction is filterable vs. not filterable.
>> The distinction should be somehow provided in API. Options are:
>> 1. expose separate API calls like __add_nonfilterable(), __del_nonfilterable();
>>    use nonfilterable quality in __filter() and __poll() and, perhaps, other internals;
>> 2. extend fdarray__add(, nonfilterable) with the nonfilterable quality
>>    use the type in __filter() and __poll() and, perhaps, other internals;
>>    expose less API calls in comparison with option 1
>>
>> Exposure of pos for filterable fds should be converted to bool since currently
>> the returned pos can become stale and there is no way in API to check its state.
>> So it could look like this:
>>
>> fdkey = fdarray__add(array, fd, events, type)
>> type: filterable, nonfilterable, something else
>> revents = fdarray__get_revents(fdkey);
>> fdarray__del(array, fdkey);
> 
> I think there's solution without having filterable type,

and still keep fdarray__poll() atomic?

> I'm not sure why you think this is needed

In order to keep changes to the existing code minimal,
both in libperf and in the tool.

~Alexey

> 
> I'm busy with other things this week, but I think I can
> come up with some patch early next week if needed
> 
> jirka
> 


* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-08 16:43                 ` Alexey Budankov
@ 2020-06-08 17:18                   ` Alexey Budankov
  2020-06-09 14:56                     ` Jiri Olsa
  0 siblings, 1 reply; 44+ messages in thread
From: Alexey Budankov @ 2020-06-08 17:18 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel


On 08.06.2020 19:43, Alexey Budankov wrote:
> 
> On 08.06.2020 19:07, Jiri Olsa wrote:
>> On Mon, Jun 08, 2020 at 12:54:31PM +0300, Alexey Budankov wrote:
>>>
>>> On 08.06.2020 11:43, Jiri Olsa wrote:
>>>> On Mon, Jun 08, 2020 at 11:08:56AM +0300, Alexey Budankov wrote:
>>>>>
>>>>> On 05.06.2020 19:15, Alexey Budankov wrote:
>>>>>>
>>>>>> On 05.06.2020 14:38, Jiri Olsa wrote:
>>>>>>> On Fri, Jun 05, 2020 at 12:50:54PM +0200, Jiri Olsa wrote:
>>>>>>>> On Wed, Jun 03, 2020 at 06:52:59PM +0300, Alexey Budankov wrote:
>>>>>>>>>
>>>>>>>>> Implement adding of file descriptors by fdarray__add_stat() to
>>>>>>>>> fix-sized (currently 1) stat_entries array located at struct fdarray.
>>>>>>>>> Append added file descriptors to the array used by poll() syscall
>>>>>>>>> during fdarray__poll() call. Copy poll() result of the added
>>>>>>>>> descriptors from the array back to the storage for analysis.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
>>>>>>>>> ---
>>>>>>>>>  tools/lib/api/fd/array.c                 | 42 +++++++++++++++++++++++-
>>>>>>>>>  tools/lib/api/fd/array.h                 |  7 ++++
>>>>>>>>>  tools/lib/perf/evlist.c                  | 11 +++++++
>>>>>>>>>  tools/lib/perf/include/internal/evlist.h |  2 ++
>>>>>>>>>  4 files changed, 61 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
>>>>>>>>> index 58d44d5eee31..b0027f2169c7 100644
>>>>>>>>> --- a/tools/lib/api/fd/array.c
>>>>>>>>> +++ b/tools/lib/api/fd/array.c
>>>>>>>>> @@ -11,10 +11,16 @@
>>>>>>>>>  
>>>>>>>>>  void fdarray__init(struct fdarray *fda, int nr_autogrow)
>>>>>>>>>  {
>>>>>>>>> +	int i;
>>>>>>>>> +
>>>>>>>>>  	fda->entries	 = NULL;
>>>>>>>>>  	fda->priv	 = NULL;
>>>>>>>>>  	fda->nr		 = fda->nr_alloc = 0;
>>>>>>>>>  	fda->nr_autogrow = nr_autogrow;
>>>>>>>>> +
>>>>>>>>> +	fda->nr_stat = 0;
>>>>>>>>> +	for (i = 0; i < FDARRAY__STAT_ENTRIES_MAX; i++)
>>>>>>>>> +		fda->stat_entries[i].fd = -1;
>>>>>>>>>  }
>>>>>>>>>  
>>>>>>>>>  int fdarray__grow(struct fdarray *fda, int nr)
>>>>>>>>> @@ -83,6 +89,20 @@ int fdarray__add(struct fdarray *fda, int fd, short revents)
>>>>>>>>>  	return pos;
>>>>>>>>>  }
>>>>>>>>>  
>>>>>>>>> +int fdarray__add_stat(struct fdarray *fda, int fd, short revents)
>>>>>>>>> +{
>>>>>>>>> +	int pos = fda->nr_stat;
>>>>>>>>> +
>>>>>>>>> +	if (pos >= FDARRAY__STAT_ENTRIES_MAX)
>>>>>>>>> +		return -1;
>>>>>>>>> +
>>>>>>>>> +	fda->stat_entries[pos].fd = fd;
>>>>>>>>> +	fda->stat_entries[pos].events = revents;
>>>>>>>>> +	fda->nr_stat++;
>>>>>>>>> +
>>>>>>>>> +	return pos;
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>>  int fdarray__filter(struct fdarray *fda, short revents,
>>>>>>>>>  		    void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
>>>>>>>>>  		    void *arg)
>>>>>>>>> @@ -113,7 +133,27 @@ int fdarray__filter(struct fdarray *fda, short revents,
>>>>>>>>>  
>>>>>>>>>  int fdarray__poll(struct fdarray *fda, int timeout)
>>>>>>>>>  {
>>>>>>>>> -	return poll(fda->entries, fda->nr, timeout);
>>>>>>>>> +	int nr, i, pos, res;
>>>>>>>>> +
>>>>>>>>> +	nr = fda->nr;
>>>>>>>>> +
>>>>>>>>> +	for (i = 0; i < fda->nr_stat; i++) {
>>>>>>>>> +		if (fda->stat_entries[i].fd != -1) {
>>>>>>>>> +			pos = fdarray__add(fda, fda->stat_entries[i].fd,
>>>>>>>>> +					   fda->stat_entries[i].events);
>>>>>>>>
>>>>>>>> so every call to fdarray__poll will add whatever is
>>>>>>>> in stat_entries to entries? how is it removed?
>>>>>>>>
>>>>>>>> I think you should either follow what Adrian said
>>>>>>>> and put 'static' descriptors early and check for
>>>>>>>> filter number to match it as an 'quick fix'
>>>>>>>>
>>>>>>>> or we should fix it for real and make it generic
>>>>>>>>
>>>>>>>> so currently the interface is like this:
>>>>>>>>
>>>>>>>>   pos1 = fdarray__add(a, fd1 ... );
>>>>>>>>   pos2 = fdarray__add(a, fd2 ... );
>>>>>>>>   pos3 = fdarray__add(a, fd3 ... );
>>>>>>>>
>>>>>>>>   fdarray__poll(a);
>>>>>>>>
>>>>>>>>   num = fdarray__filter(a, revents, destructor, arg);
>>>>>>>>
>>>>>>>> when fdarray__filter removes some of the fds the 'pos1,pos2,pos3'
>>>>>>>> indexes are not relevant anymore
>>>>>>
>>>>>> and that is why the return value of fdarray__add() should be converted
>>>>>> to bool (added/not added). Currently the return value is used as bool
>>>>>> only all over the calling code.
>>>>>>
>>>>>> fdarray__add_fixed() brings the notion of fd with fixed pos which is
>>>>>> valid after fdarray__add_fixed() call so the pos could be used to access
>>>>>> pos fd poll status after poll() call.
>>>>>>
>>>>>> pos = fdarray__add_fixed(array, fd);
>>>>>> fdarray_poll(array);
>>>>>> revents = fdarray_fixed_revents(array, pos);
>>>>>> fdarray__del(array, pos);
>>>>>
>>>>> So how is it about just adding _revents() and _del() for fixed fds with
>>>>> correction of retval to bool for fdarray__add()?
>>>>
>>>> I don't like the separation for fixed and non-fixed fds,
>>>> why can't we make generic?
>>>
>>> Usage models are different but they want still to be parts of the same class
>>> for atomic poll(). The distinction is filterable vs. not filterable.
>>> The distinction should be somehow provided in API. Options are:
>>> 1. expose separate API calls like __add_nonfilterable(), __del_nonfilterable();
>>>    use nonfilterable quality in __filter() and __poll() and, perhaps, other internals;
>>> 2. extend fdarray__add(, nonfilterable) with the nonfilterable quality
>>>    use the type in __filter() and __poll() and, perhaps, other internals;
>>>    expose less API calls in comparison with option 1
>>>
>>> Exposure of pos for filterable fds should be converted to bool since currently
>>> the returned pos can become stale and there is no way in API to check its state.
>>> So it could look like this:
>>>
>>> fdkey = fdarray__add(array, fd, events, type)
>>> type: filterable, nonfilterable, something else
>>> revents = fdarray__get_revents(fdkey);
>>> fdarray__del(array, fdkey);
>>
>> I think there's solution without having filterable type,
> 
> and still making the atomic fdarray__poll()?

How about a design like this? Extend the current

    int fdarray__poll(struct fdarray *fda, int timeout)

with an additional external array of fds to poll() on simultaneously:

    int fdarray__poll(struct fdarray *fda, int timeout,
                      int *fds, size_t fds_size)

The fds would be added to the array just prior to the poll() call.

That would not extend the fdarray class with fd types and would not change
fdarray__filter(), keeping the type processing outside of fdarray and libperf.

The tool could adopt the new fdarray__poll() signature via a macro (or an
inline function):

#define fdarray__poll(fda, timeout) \
	fdarray__poll_ext(fda, timeout, NULL, 0)
int fdarray__poll_ext(struct fdarray *fda, int timeout,
                      int *fds, size_t fds_size);
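
A minimal sketch of how such an extended poll could behave, under stated
assumptions: struct fda_sketch is a hypothetical stand-in for struct fdarray
(only the two fields used here), and the external descriptors are passed as
struct pollfd rather than plain ints so that their revents can be handed back
to the caller. Neither name comes from the patch set; this only illustrates the
"append, poll once, copy results back" idea.

```c
#include <assert.h>
#include <poll.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for struct fdarray; not the libapi definition. */
struct fda_sketch {
	struct pollfd *entries;
	int nr;
};

/*
 * Poll the array's own entries together with 'ext_nr' external entries in
 * one poll() call, then copy revents back to both the array and 'ext'.
 */
static int fda_sketch__poll_ext(struct fda_sketch *fda, int timeout,
				struct pollfd *ext, size_t ext_nr)
{
	size_t nr = (size_t)fda->nr, total = nr + ext_nr;
	struct pollfd *all;
	int res;

	if (ext_nr == 0)	/* nothing extra: behave like the plain poll */
		return poll(fda->entries, nr, timeout);

	all = calloc(total, sizeof(*all));
	if (!all)
		return -1;

	if (nr)
		memcpy(all, fda->entries, nr * sizeof(*all));
	memcpy(all + nr, ext, ext_nr * sizeof(*all));

	res = poll(all, total, timeout);

	/* Write results back so each owner sees its own revents. */
	if (nr)
		memcpy(fda->entries, all, nr * sizeof(*all));
	memcpy(ext, all + nr, ext_nr * sizeof(*all));

	free(all);
	return res;
}

/* Demo: one idle "event" pipe in the array, one readable "control" pipe outside. */
int fda_sketch_demo(void)
{
	int evt[2], ctl[2], res;
	struct pollfd own, ext;
	struct fda_sketch fda = { .entries = &own, .nr = 1 };

	if (pipe(evt) || pipe(ctl))
		return -1;

	own = (struct pollfd){ .fd = evt[0], .events = POLLIN };
	ext = (struct pollfd){ .fd = ctl[0], .events = POLLIN };

	if (write(ctl[1], "x", 1) != 1)
		return -1;

	res = fda_sketch__poll_ext(&fda, 0, &ext, 1);
	assert(res == 1);			/* only the control pipe is ready */
	assert(ext.revents & POLLIN);
	assert(!(own.revents & POLLIN));

	close(evt[0]); close(evt[1]);
	close(ctl[0]); close(ctl[1]);
	return 0;
}
```

The temporary allocation and the two copies on every call are the cost of this
approach; the array itself and its filter logic stay untouched.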

~Alexey


* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-08 17:18                   ` Alexey Budankov
@ 2020-06-09 14:56                     ` Jiri Olsa
  2020-06-09 18:51                       ` Alexey Budankov
                                         ` (2 more replies)
  0 siblings, 3 replies; 44+ messages in thread
From: Jiri Olsa @ 2020-06-09 14:56 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On Mon, Jun 08, 2020 at 08:18:20PM +0300, Alexey Budankov wrote:

SNIP

> >>>>> So how is it about just adding _revents() and _del() for fixed fds with
> >>>>> correction of retval to bool for fdarray__add()?
> >>>>
> >>>> I don't like the separation for fixed and non-fixed fds,
> >>>> why can't we make generic?
> >>>
> >>> Usage models are different but they want still to be parts of the same class
> >>> for atomic poll(). The distinction is filterable vs. not filterable.
> >>> The distinction should be somehow provided in API. Options are:
> >>> 1. expose separate API calls like __add_nonfilterable(), __del_nonfilterable();
> >>>    use nonfilterable quality in __filter() and __poll() and, perhaps, other internals;
> >>> 2. extend fdarray__add(, nonfilterable) with the nonfilterable quality
> >>>    use the type in __filter() and __poll() and, perhaps, other internals;
> >>>    expose less API calls in comparison with option 1
> >>>
> >>> Exposure of pos for filterable fds should be converted to bool since currently
> >>> the returned pos can become stale and there is no way in API to check its state.
> >>> So it could look like this:
> >>>
> >>> fdkey = fdarray__add(array, fd, events, type)
> >>> type: filterable, nonfilterable, something else
> >>> revents = fdarray__get_revents(fdkey);
> >>> fdarray__del(array, fdkey);
> >>
> >> I think there's solution without having filterable type,

so with the changes I proposed it could no longer be called fdarray ;-)
which I think was the idea at the beginning.. just an array of fds

I'd like to have a fully fledged events object.. but that's a bigger change

> > 
> > and still making the atomic fdarray__poll()?
> 
> How is it about design like this?
> 
>     int fdarray__poll(struct fdarray *fda, int timeout)
> 
> with additional external array of fds to simultaneously poll() on:
> 
>     int fdarray__poll(struct fdarray *fda, int timeout,
>                       int *fds, size_t fds_size)
> 
> fds would be added to array just prior poll() call.

yep, I was considering something like this, having:

  fdarray__poll2(fda1, fda2)
  fdarray__pollx(fda, ...)

but it would need to create a pollfd array and write
the poll results back to the arrays.. might be expensive

another idea is to stop filter from rearranging the array
and have it return only the remaining number, like below

jirka


---
diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
index 58d44d5eee31..89f9a2193c2d 100644
--- a/tools/lib/api/fd/array.c
+++ b/tools/lib/api/fd/array.c
@@ -93,22 +93,21 @@ int fdarray__filter(struct fdarray *fda, short revents,
 		return 0;
 
 	for (fd = 0; fd < fda->nr; ++fd) {
+		if (!fda->entries[fd].events)
+			continue;
+
 		if (fda->entries[fd].revents & revents) {
 			if (entry_destructor)
 				entry_destructor(fda, fd, arg);
 
+			fda->entries[fd].revents = fda->entries[fd].events = 0;
 			continue;
 		}
 
-		if (fd != nr) {
-			fda->entries[nr] = fda->entries[fd];
-			fda->priv[nr]	 = fda->priv[fd];
-		}
-
 		++nr;
 	}
 
-	return fda->nr = nr;
+	return nr;
 }
 
 int fdarray__poll(struct fdarray *fda, int timeout)
diff --git a/tools/perf/tests/fdarray.c b/tools/perf/tests/fdarray.c
index c7c81c4a5b2b..d0c8a05aab2f 100644
--- a/tools/perf/tests/fdarray.c
+++ b/tools/perf/tests/fdarray.c
@@ -12,6 +12,7 @@ static void fdarray__init_revents(struct fdarray *fda, short revents)
 
 	for (fd = 0; fd < fda->nr; ++fd) {
 		fda->entries[fd].fd	 = fda->nr - fd;
+		fda->entries[fd].events  = revents;
 		fda->entries[fd].revents = revents;
 	}
 }
@@ -29,7 +30,7 @@ static int fdarray__fprintf_prefix(struct fdarray *fda, const char *prefix, FILE
 
 int test__fdarray__filter(struct test *test __maybe_unused, int subtest __maybe_unused)
 {
-	int nr_fds, expected_fd[2], fd, err = TEST_FAIL;
+	int nr_fds, err = TEST_FAIL;
 	struct fdarray *fda = fdarray__new(5, 5);
 
 	if (fda == NULL) {
@@ -55,7 +56,6 @@ int test__fdarray__filter(struct test *test __maybe_unused, int subtest __maybe_
 
 	fdarray__init_revents(fda, POLLHUP);
 	fda->entries[2].revents = POLLIN;
-	expected_fd[0] = fda->entries[2].fd;
 
 	pr_debug("\nfiltering all but fda->entries[2]:");
 	fdarray__fprintf_prefix(fda, "before", stderr);
@@ -66,17 +66,9 @@ int test__fdarray__filter(struct test *test __maybe_unused, int subtest __maybe_
 		goto out_delete;
 	}
 
-	if (fda->entries[0].fd != expected_fd[0]) {
-		pr_debug("\nfda->entries[0].fd=%d != %d\n",
-			 fda->entries[0].fd, expected_fd[0]);
-		goto out_delete;
-	}
-
 	fdarray__init_revents(fda, POLLHUP);
 	fda->entries[0].revents = POLLIN;
-	expected_fd[0] = fda->entries[0].fd;
 	fda->entries[3].revents = POLLIN;
-	expected_fd[1] = fda->entries[3].fd;
 
 	pr_debug("\nfiltering all but (fda->entries[0], fda->entries[3]):");
 	fdarray__fprintf_prefix(fda, "before", stderr);
@@ -88,14 +80,6 @@ int test__fdarray__filter(struct test *test __maybe_unused, int subtest __maybe_
 		goto out_delete;
 	}
 
-	for (fd = 0; fd < 2; ++fd) {
-		if (fda->entries[fd].fd != expected_fd[fd]) {
-			pr_debug("\nfda->entries[%d].fd=%d != %d\n", fd,
-				 fda->entries[fd].fd, expected_fd[fd]);
-			goto out_delete;
-		}
-	}
-
 	pr_debug("\n");
 
 	err = 0;
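
The effect of the fdarray__filter() change above can be sketched on a bare
pollfd array. This is an illustration only: filter_in_place() is a hypothetical
helper, not the libapi function, and it leaves out the destructor callback and
the priv array. It shows that matched slots are retired by zeroing events and
revents instead of being compacted away, so positions returned earlier by an
add call keep pointing at the same descriptor, and that the return value counts
every slot that is still active, whatever kind of fd it holds.

```c
#include <assert.h>
#include <poll.h>

/* Hypothetical helper mirroring the non-compacting filter idea. */
int filter_in_place(struct pollfd *entries, int nr_entries, short revents)
{
	int i, nr = 0;

	for (i = 0; i < nr_entries; i++) {
		if (!entries[i].events)		/* slot retired by an earlier pass */
			continue;

		if (entries[i].revents & revents) {
			/* Retire the slot but keep its index stable. */
			entries[i].revents = entries[i].events = 0;
			continue;
		}

		nr++;				/* still active */
	}

	return nr;
}

int filter_in_place_demo(void)
{
	struct pollfd e[3] = {
		{ .fd = 10, .events = POLLIN, .revents = POLLHUP },
		{ .fd = 11, .events = POLLIN, .revents = POLLIN  },
		{ .fd = 12, .events = POLLIN, .revents = POLLHUP },
	};

	/* Two slots hung up, one remains. */
	assert(filter_in_place(e, 3, POLLHUP) == 1);

	/* The survivor is still at index 1; nothing was moved. */
	assert(e[1].fd == 11 && e[1].events == POLLIN);
	assert(e[0].events == 0 && e[2].events == 0);

	/* A second pass skips the retired slots and reports the same count. */
	assert(filter_in_place(e, 3, POLLHUP) == 1);
	return 0;
}
```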



* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-09 14:56                     ` Jiri Olsa
@ 2020-06-09 18:51                       ` Alexey Budankov
  2020-06-15 13:13                       ` Alexey Budankov
  2020-06-15 17:38                       ` Alexey Budankov
  2 siblings, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-09 18:51 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel


On 09.06.2020 17:56, Jiri Olsa wrote:
> On Mon, Jun 08, 2020 at 08:18:20PM +0300, Alexey Budankov wrote:
> 
> SNIP
> 
>>>>>>> So how is it about just adding _revents() and _del() for fixed fds with
>>>>>>> correction of retval to bool for fdarray__add()?
>>>>>>
>>>>>> I don't like the separation for fixed and non-fixed fds,
>>>>>> why can't we make generic?
>>>>>
>>>>> Usage models are different but they want still to be parts of the same class
>>>>> for atomic poll(). The distinction is filterable vs. not filterable.
>>>>> The distinction should be somehow provided in API. Options are:
>>>>> 1. expose separate API calls like __add_nonfilterable(), __del_nonfilterable();
>>>>>    use nonfilterable quality in __filter() and __poll() and, perhaps, other internals;
>>>>> 2. extend fdarray__add(, nonfilterable) with the nonfilterable quality
>>>>>    use the type in __filter() and __poll() and, perhaps, other internals;
>>>>>    expose less API calls in comparison with option 1
>>>>>
>>>>> Exposure of pos for filterable fds should be converted to bool since currently
>>>>> the returned pos can become stale and there is no way in API to check its state.
>>>>> So it could look like this:
>>>>>
>>>>> fdkey = fdarray__add(array, fd, events, type)
>>>>> type: filterable, nonfilterable, something else
>>>>> revents = fdarray__get_revents(fdkey);
>>>>> fdarray__del(array, fdkey);
>>>>
>>>> I think there's solution without having filterable type,
> 
> so with the changes I proposed it could no longer be called fdarray ;-)
> which I think was the idea at the beginning.. just an array of fds
> 
> I'd like to have a fully fledged events object.. but that's a bigger change
> 
>>>
>>> and still making the atomic fdarray__poll()?
>>
>> How is it about design like this?
>>
>>     int fdarray__poll(struct fdarray *fda, int timeout)
>>
>> with additional external array of fds to simultaneously poll() on:
>>
>>     int fdarray__poll(struct fdarray *fda, int timeout,
>>                       int *fds, size_t fds_size)
>>
>> fds would be added to array just prior poll() call.
> 
> yep, I was considering something like this, having:
> 
>   fdarray__poll2(fda1, fda2)
>   fdarray__pollx(fda, ...)
> 
> but it would need to create an pollfd array and write
> the poll results back to arrays.. might be expensive

This is quite similar to the currently proposed design options.
When you say expensive, how do you estimate the cost?

> 
> another idea is to forbid filter to screw the array
> and return only remaining number, like below

The important thing in all this is to have an atomic poll() on
all fds signalling to the tool, in order to avoid tricky races
and the complicated design needed to cover them.

Signals can also be consumed via an fd, and the recent pidfd kernel
extension could be covered by this general atomic poll() as well.

So ideally it should look like this:
fdarray__poll(events_fds + ctl_fds + pid_fd + signal_fds)

~Alexey



* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-08 16:07               ` Jiri Olsa
  2020-06-08 16:43                 ` Alexey Budankov
@ 2020-06-15  5:20                 ` Alexey Budankov
  2020-06-15 12:30                   ` Jiri Olsa
  1 sibling, 1 reply; 44+ messages in thread
From: Alexey Budankov @ 2020-06-15  5:20 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel


On 08.06.2020 19:07, Jiri Olsa wrote:
> On Mon, Jun 08, 2020 at 12:54:31PM +0300, Alexey Budankov wrote:
>>
>> On 08.06.2020 11:43, Jiri Olsa wrote:
>>> On Mon, Jun 08, 2020 at 11:08:56AM +0300, Alexey Budankov wrote:
>>>>
>>>> On 05.06.2020 19:15, Alexey Budankov wrote:
>>>>>
>>>>> On 05.06.2020 14:38, Jiri Olsa wrote:
<SNIP>
>>>>> revents = fdarray_fixed_revents(array, pos);
>>>>> fdarray__del(array, pos);
>>>>
>>>> So how is it about just adding _revents() and _del() for fixed fds with
>>>> correction of retval to bool for fdarray__add()?
>>>
>>> I don't like the separation for fixed and non-fixed fds,
>>> why can't we make generic?
>>
>> Usage models are different but they want still to be parts of the same class
>> for atomic poll(). The distinction is filterable vs. not filterable.
>> The distinction should be somehow provided in API. Options are:
>> 1. expose separate API calls like __add_nonfilterable(), __del_nonfilterable();
>>    use nonfilterable quality in __filter() and __poll() and, perhaps, other internals;
>> 2. extend fdarray__add(, nonfilterable) with the nonfilterable quality
>>    use the type in __filter() and __poll() and, perhaps, other internals;
>>    expose less API calls in comparison with option 1
>>
>> Exposure of pos for filterable fds should be converted to bool since currently
>> the returned pos can become stale and there is no way in API to check its state.
>> So it could look like this:
>>
>> fdkey = fdarray__add(array, fd, events, type)
>> type: filterable, nonfilterable, something else
>> revents = fdarray__get_revents(fdkey);
>> fdarray__del(array, fdkey);
> 
> I think there's solution without having filterable type,
> I'm not sure why you think this is needed
> 
> I'm busy with other things this week, but I think I can
> come up with some patch early next week if needed

Friendly reminder.

Thanks,
Alexey



* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-15  5:20                 ` Alexey Budankov
@ 2020-06-15 12:30                   ` Jiri Olsa
  2020-06-15 14:37                     ` Alexey Budankov
  0 siblings, 1 reply; 44+ messages in thread
From: Jiri Olsa @ 2020-06-15 12:30 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On Mon, Jun 15, 2020 at 08:20:38AM +0300, Alexey Budankov wrote:
> 
> On 08.06.2020 19:07, Jiri Olsa wrote:
> > On Mon, Jun 08, 2020 at 12:54:31PM +0300, Alexey Budankov wrote:
> >>
> >> On 08.06.2020 11:43, Jiri Olsa wrote:
> >>> On Mon, Jun 08, 2020 at 11:08:56AM +0300, Alexey Budankov wrote:
> >>>>
> >>>> On 05.06.2020 19:15, Alexey Budankov wrote:
> >>>>>
> >>>>> On 05.06.2020 14:38, Jiri Olsa wrote:
> <SNIP>
> >>>>> revents = fdarray_fixed_revents(array, pos);
> >>>>> fdarray__del(array, pos);
> >>>>
> >>>> So how is it about just adding _revents() and _del() for fixed fds with
> >>>> correction of retval to bool for fdarray__add()?
> >>>
> >>> I don't like the separation for fixed and non-fixed fds,
> >>> why can't we make generic?
> >>
> >> Usage models are different but they want still to be parts of the same class
> >> for atomic poll(). The distinction is filterable vs. not filterable.
> >> The distinction should be somehow provided in API. Options are:
> >> 1. expose separate API calls like __add_nonfilterable(), __del_nonfilterable();
> >>    use nonfilterable quality in __filter() and __poll() and, perhaps, other internals;
> >> 2. extend fdarray__add(, nonfilterable) with the nonfilterable quality
> >>    use the type in __filter() and __poll() and, perhaps, other internals;
> >>    expose less API calls in comparison with option 1
> >>
> >> Exposure of pos for filterable fds should be converted to bool since currently
> >> the returned pos can become stale and there is no way in API to check its state.
> >> So it could look like this:
> >>
> >> fdkey = fdarray__add(array, fd, events, type)
> >> type: filterable, nonfilterable, something else
> >> revents = fdarray__get_revents(fdkey);
> >> fdarray__del(array, fdkey);
> > 
> > I think there's solution without having filterable type,
> > I'm not sure why you think this is needed
> > 
> > I'm busy with other things this week, but I think I can
> > come up with some patch early next week if needed
> 
> Friendly reminder.

hm? I believe we discussed this in here:
  https://lore.kernel.org/lkml/20200609145611.GI1558310@krava/

jirka



* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-09 14:56                     ` Jiri Olsa
  2020-06-09 18:51                       ` Alexey Budankov
@ 2020-06-15 13:13                       ` Alexey Budankov
  2020-06-15 17:38                       ` Alexey Budankov
  2 siblings, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-15 13:13 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel


On 09.06.2020 17:56, Jiri Olsa wrote:
> On Mon, Jun 08, 2020 at 08:18:20PM +0300, Alexey Budankov wrote:
> 
> SNIP
> 
>>>>>>> So how is it about just adding _revents() and _del() for fixed fds with
>>>>>>> correction of retval to bool for fdarray__add()?
>>>>>>
>>>>>> I don't like the separation for fixed and non-fixed fds,
>>>>>> why can't we make generic?
>>>>>
>>>>> Usage models are different but they want still to be parts of the same class
>>>>> for atomic poll(). The distinction is filterable vs. not filterable.
>>>>> The distinction should be somehow provided in API. Options are:
>>>>> 1. expose separate API calls like __add_nonfilterable(), __del_nonfilterable();
>>>>>    use nonfilterable quality in __filter() and __poll() and, perhaps, other internals;
>>>>> 2. extend fdarray__add(, nonfilterable) with the nonfilterable quality
>>>>>    use the type in __filter() and __poll() and, perhaps, other internals;
>>>>>    expose less API calls in comparison with option 1
>>>>>
>>>>> Exposure of pos for filterable fds should be converted to bool since currently
>>>>> the returned pos can become stale and there is no way in API to check its state.
>>>>> So it could look like this:
>>>>>
>>>>> fdkey = fdarray__add(array, fd, events, type)
>>>>> type: filterable, nonfilterable, something else
>>>>> revents = fdarray__get_revents(fdkey);
>>>>> fdarray__del(array, fdkey);
>>>>
>>>> I think there's solution without having filterable type,
> 
> so with the changes I proposed it could no longer be called fdarray ;-)
> which I think was the idea at the beginning.. just an array of fds
> 
> I'd like to have a fully fledged events object.. but that's a bigger change
> 
>>>
>>> and still making the atomic fdarray__poll()?
>>
>> How is it about design like this?
>>
>>     int fdarray__poll(struct fdarray *fda, int timeout)
>>
>> with additional external array of fds to simultaneously poll() on:
>>
>>     int fdarray__poll(struct fdarray *fda, int timeout,
>>                       int *fds, size_t fds_size)
>>
>> fds would be added to array just prior poll() call.
> 
> yep, I was considering something like this, having:
> 
>   fdarray__poll2(fda1, fda2)
>   fdarray__pollx(fda, ...)
> 
> but it would need to create an pollfd array and write
> the poll results back to arrays.. might be expensive
> 
> another idea is to forbid filter to screw the array
> and return only remaining number, like below

Which option should it be? The one below?
If so, how does it distinguish event fds from the others?
That is required to return the correct value from fdarray__filter().
Could you please clarify?

Thanks,
Alexey

> 
> jirka
> 
> 
> ---
> diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
> index 58d44d5eee31..89f9a2193c2d 100644
> --- a/tools/lib/api/fd/array.c
> +++ b/tools/lib/api/fd/array.c
> @@ -93,22 +93,21 @@ int fdarray__filter(struct fdarray *fda, short revents,
>  		return 0;
>  
>  	for (fd = 0; fd < fda->nr; ++fd) {
> +		if (!fda->entries[fd].events)
> +			continue;
> +
>  		if (fda->entries[fd].revents & revents) {
>  			if (entry_destructor)
>  				entry_destructor(fda, fd, arg);
>  
> +			fda->entries[fd].revents = fda->entries[fd].events = 0;
>  			continue;
>  		}
>  
> -		if (fd != nr) {
> -			fda->entries[nr] = fda->entries[fd];
> -			fda->priv[nr]	 = fda->priv[fd];
> -		}
> -
>  		++nr;
>  	}
>  
> -	return fda->nr = nr;
> +	return nr;
>  }
>  
>  int fdarray__poll(struct fdarray *fda, int timeout)
> diff --git a/tools/perf/tests/fdarray.c b/tools/perf/tests/fdarray.c
> index c7c81c4a5b2b..d0c8a05aab2f 100644
> --- a/tools/perf/tests/fdarray.c
> +++ b/tools/perf/tests/fdarray.c
> @@ -12,6 +12,7 @@ static void fdarray__init_revents(struct fdarray *fda, short revents)
>  
>  	for (fd = 0; fd < fda->nr; ++fd) {
>  		fda->entries[fd].fd	 = fda->nr - fd;
> +		fda->entries[fd].events  = revents;
>  		fda->entries[fd].revents = revents;
>  	}
>  }
> @@ -29,7 +30,7 @@ static int fdarray__fprintf_prefix(struct fdarray *fda, const char *prefix, FILE
>  
>  int test__fdarray__filter(struct test *test __maybe_unused, int subtest __maybe_unused)
>  {
> -	int nr_fds, expected_fd[2], fd, err = TEST_FAIL;
> +	int nr_fds, err = TEST_FAIL;
>  	struct fdarray *fda = fdarray__new(5, 5);
>  
>  	if (fda == NULL) {
> @@ -55,7 +56,6 @@ int test__fdarray__filter(struct test *test __maybe_unused, int subtest __maybe_
>  
>  	fdarray__init_revents(fda, POLLHUP);
>  	fda->entries[2].revents = POLLIN;
> -	expected_fd[0] = fda->entries[2].fd;
>  
>  	pr_debug("\nfiltering all but fda->entries[2]:");
>  	fdarray__fprintf_prefix(fda, "before", stderr);
> @@ -66,17 +66,9 @@ int test__fdarray__filter(struct test *test __maybe_unused, int subtest __maybe_
>  		goto out_delete;
>  	}
>  
> -	if (fda->entries[0].fd != expected_fd[0]) {
> -		pr_debug("\nfda->entries[0].fd=%d != %d\n",
> -			 fda->entries[0].fd, expected_fd[0]);
> -		goto out_delete;
> -	}
> -
>  	fdarray__init_revents(fda, POLLHUP);
>  	fda->entries[0].revents = POLLIN;
> -	expected_fd[0] = fda->entries[0].fd;
>  	fda->entries[3].revents = POLLIN;
> -	expected_fd[1] = fda->entries[3].fd;
>  
>  	pr_debug("\nfiltering all but (fda->entries[0], fda->entries[3]):");
>  	fdarray__fprintf_prefix(fda, "before", stderr);
> @@ -88,14 +80,6 @@ int test__fdarray__filter(struct test *test __maybe_unused, int subtest __maybe_
>  		goto out_delete;
>  	}
>  
> -	for (fd = 0; fd < 2; ++fd) {
> -		if (fda->entries[fd].fd != expected_fd[fd]) {
> -			pr_debug("\nfda->entries[%d].fd=%d != %d\n", fd,
> -				 fda->entries[fd].fd, expected_fd[fd]);
> -			goto out_delete;
> -		}
> -	}
> -
>  	pr_debug("\n");
>  
>  	err = 0;
> 


* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-15 12:30                   ` Jiri Olsa
@ 2020-06-15 14:37                     ` Alexey Budankov
  2020-06-15 16:58                       ` Jiri Olsa
  0 siblings, 1 reply; 44+ messages in thread
From: Alexey Budankov @ 2020-06-15 14:37 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel


On 15.06.2020 15:30, Jiri Olsa wrote:
> On Mon, Jun 15, 2020 at 08:20:38AM +0300, Alexey Budankov wrote:
>>
>> On 08.06.2020 19:07, Jiri Olsa wrote:
>>> On Mon, Jun 08, 2020 at 12:54:31PM +0300, Alexey Budankov wrote:
>>>>
>>>> On 08.06.2020 11:43, Jiri Olsa wrote:
>>>>> On Mon, Jun 08, 2020 at 11:08:56AM +0300, Alexey Budankov wrote:
>>>>>>
>>>>>> On 05.06.2020 19:15, Alexey Budankov wrote:
>>>>>>>
>>>>>>> On 05.06.2020 14:38, Jiri Olsa wrote:
>> <SNIP>
>>>>>>> revents = fdarray_fixed_revents(array, pos);
>>>>>>> fdarray__del(array, pos);
>>>>>>
>>>>>> So how about just adding _revents() and _del() for fixed fds and
>>>>>> changing the return value of fdarray__add() to bool?
>>>>>
>>>>> I don't like the separation for fixed and non-fixed fds,
>>>>> why can't we make it generic?
>>>>
>>>> The usage models are different, but they still need to be parts of the same
>>>> class so that one atomic poll() covers them. The distinction is filterable
>>>> vs. not filterable, and it has to be expressed in the API somehow. Options are:
>>>> 1. expose separate API calls like __add_nonfilterable(), __del_nonfilterable();
>>>>    use the nonfilterable quality in __filter() and __poll() and, perhaps, other internals;
>>>> 2. extend fdarray__add(, nonfilterable) with the nonfilterable quality;
>>>>    use the type in __filter() and __poll() and, perhaps, other internals;
>>>>    expose fewer API calls in comparison with option 1
>>>>
>>>> The pos returned for filterable fds should be converted to bool, since currently
>>>> the returned pos can become stale and there is no way in the API to check its state.
>>>> So it could look like this:
>>>>
>>>> fdkey = fdarray__add(array, fd, events, type)
>>>> type: filterable, nonfilterable, something else
>>>> revents = fdarray__get_revents(fdkey);
>>>> fdarray__del(array, fdkey);
>>>
>>> I think there's solution without having filterable type,
>>> I'm not sure why you think this is needed
>>>
>>> I'm busy with other things this week, but I think I can
>>> come up with some patch early next week if needed
>>
>> Friendly reminder.
> 
> hm? I believe we discussed this in here:
>   https://lore.kernel.org/lkml/20200609145611.GI1558310@krava/

Do you want it to be implemented like in the patch posted by the link?

~Alexey

> 
> jirka
> 


* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-15 14:37                     ` Alexey Budankov
@ 2020-06-15 16:58                       ` Jiri Olsa
  2020-06-17  9:27                         ` Jiri Olsa
  2020-06-22  9:47                         ` Alexey Budankov
  0 siblings, 2 replies; 44+ messages in thread
From: Jiri Olsa @ 2020-06-15 16:58 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On Mon, Jun 15, 2020 at 05:37:53PM +0300, Alexey Budankov wrote:
> 
> On 15.06.2020 15:30, Jiri Olsa wrote:
> > On Mon, Jun 15, 2020 at 08:20:38AM +0300, Alexey Budankov wrote:
> >>
> >> On 08.06.2020 19:07, Jiri Olsa wrote:
> >>> On Mon, Jun 08, 2020 at 12:54:31PM +0300, Alexey Budankov wrote:
> >>>>
> >>>> On 08.06.2020 11:43, Jiri Olsa wrote:
> >>>>> On Mon, Jun 08, 2020 at 11:08:56AM +0300, Alexey Budankov wrote:
> >>>>>>
> >>>>>> On 05.06.2020 19:15, Alexey Budankov wrote:
> >>>>>>>
> >>>>>>> On 05.06.2020 14:38, Jiri Olsa wrote:
> >> <SNIP>
> >>>>>>> revents = fdarray_fixed_revents(array, pos);
> >>>>>>> fdarray__del(array, pos);
> >>>>>>
> >>>>>> So how about just adding _revents() and _del() for fixed fds and
> >>>>>> changing the return value of fdarray__add() to bool?
> >>>>>
> >>>>> I don't like the separation for fixed and non-fixed fds,
> >>>>> why can't we make it generic?
> >>>>
> >>>> The usage models are different, but they still need to be parts of the same
> >>>> class so that one atomic poll() covers them. The distinction is filterable
> >>>> vs. not filterable, and it has to be expressed in the API somehow. Options are:
> >>>> 1. expose separate API calls like __add_nonfilterable(), __del_nonfilterable();
> >>>>    use the nonfilterable quality in __filter() and __poll() and, perhaps, other internals;
> >>>> 2. extend fdarray__add(, nonfilterable) with the nonfilterable quality;
> >>>>    use the type in __filter() and __poll() and, perhaps, other internals;
> >>>>    expose fewer API calls in comparison with option 1
> >>>>
> >>>> The pos returned for filterable fds should be converted to bool, since currently
> >>>> the returned pos can become stale and there is no way in the API to check its state.
> >>>> So it could look like this:
> >>>>
> >>>> fdkey = fdarray__add(array, fd, events, type)
> >>>> type: filterable, nonfilterable, something else
> >>>> revents = fdarray__get_revents(fdkey);
> >>>> fdarray__del(array, fdkey);
> >>>
> >>> I think there's solution without having filterable type,
> >>> I'm not sure why you think this is needed
> >>>
> >>> I'm busy with other things this week, but I think I can
> >>> come up with some patch early next week if needed
> >>
> >> Friendly reminder.
> > 
> > hm? I believe we discussed this in here:
> >   https://lore.kernel.org/lkml/20200609145611.GI1558310@krava/
> 
> Do you want it to be implemented like in the patch posted by the link?

no idea.. looking for good solution ;-)

how about switching completely to epoll? I tried and it
does not look that bad

there might be some loose ends (interface change), but
I think this would solve our problems with fdarray

I'll be able to get back to it by the end of the week,
but if you want to check/finish this patch earlier go ahead
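
For illustration only, and separate from the patch below: a minimal sketch of
the epoll idea, where each registered fd carries a small record in data.ptr and
readiness is looked up by fd rather than by array position. The names
struct watch, watch__add(), watch__is_ready() and watch__demo() are
hypothetical and are not part of libperf; only the epoll(7) calls are real.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-descriptor record; epoll hands it back via data.ptr. */
struct watch {
	int   fd;
	void *priv;
};

/* Register w->fd for the given events; returns 0, or -1 with errno set. */
static int watch__add(int epfd, struct watch *w, unsigned int events)
{
	struct epoll_event ev = {
		.events   = events | EPOLLERR | EPOLLHUP,
		.data.ptr = w,
	};

	assert(w != NULL);
	return epoll_ctl(epfd, EPOLL_CTL_ADD, w->fd, &ev);
}

/* True if fd is among the n events filled in by the last epoll_wait(). */
static bool watch__is_ready(const struct epoll_event *evs, int n, int fd)
{
	int i;

	for (i = 0; i < n; i++) {
		const struct watch *w = evs[i].data.ptr;

		if (w->fd == fd)
			return true;
	}
	return false;
}

/*
 * Usage: watch the read end of a pipe, write one byte, wait.
 * Returns 1 if the read end was reported, 0 if not, -1 on setup error.
 */
static int watch__demo(void)
{
	struct epoll_event evs[4];
	struct watch w = { .fd = -1, .priv = NULL };
	int pfd[2], epfd, n, ret = -1;

	if (pipe(pfd) < 0)
		return -1;

	epfd = epoll_create1(EPOLL_CLOEXEC);
	if (epfd < 0)		/* failure is -1; 0 is a valid descriptor */
		goto out_pipe;

	w.fd = pfd[0];
	if (watch__add(epfd, &w, EPOLLIN) < 0)
		goto out_epoll;

	if (write(pfd[1], "x", 1) != 1)
		goto out_epoll;

	n = epoll_wait(epfd, evs, 4, 1000);
	if (n >= 0)
		ret = watch__is_ready(evs, n, pfd[0]);

out_epoll:
	close(epfd);
out_pipe:
	close(pfd[0]);
	close(pfd[1]);
	return ret;
}
```

Since the lookup goes through the record stored in data.ptr, nothing in the
caller depends on a position that a later filter pass could invalidate.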

jirka


---
 tools/lib/perf/evlist.c                  | 134 +++++++++++++++++------
 tools/lib/perf/include/internal/evlist.h |   9 +-
 tools/perf/builtin-kvm.c                 |   8 +-
 tools/perf/builtin-record.c              |  14 ++-
 4 files changed, 120 insertions(+), 45 deletions(-)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 6a875a0f01bb..8569cdd8bbd8 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -23,6 +23,7 @@
 #include <perf/cpumap.h>
 #include <perf/threadmap.h>
 #include <api/fd/array.h>
+#include <sys/epoll.h>
 
 void perf_evlist__init(struct perf_evlist *evlist)
 {
@@ -32,7 +33,10 @@ void perf_evlist__init(struct perf_evlist *evlist)
 		INIT_HLIST_HEAD(&evlist->heads[i]);
 	INIT_LIST_HEAD(&evlist->entries);
 	evlist->nr_entries = 0;
-	fdarray__init(&evlist->pollfd, 64);
+	INIT_LIST_HEAD(&evlist->poll_data);
+	evlist->poll_cnt = 0;
+	evlist->poll_act = 0;
+	evlist->poll_fd = -1;
 }
 
 static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
@@ -120,6 +124,23 @@ static void perf_evlist__purge(struct perf_evlist *evlist)
 	evlist->nr_entries = 0;
 }
 
+struct poll_data {
+	int		  fd;
+	void		 *ptr;
+	struct list_head  list;
+};
+
+static void perf_evlist__exit_pollfd(struct perf_evlist *evlist)
+{
+	struct poll_data *data, *tmp;
+
+	if (evlist->poll_fd != -1)
+		close(evlist->poll_fd);
+
+	list_for_each_entry_safe(data, tmp, &evlist->poll_data, list)
+		free(data);
+}
+
 void perf_evlist__exit(struct perf_evlist *evlist)
 {
 	perf_cpu_map__put(evlist->cpus);
@@ -128,7 +149,7 @@ void perf_evlist__exit(struct perf_evlist *evlist)
 	evlist->cpus = NULL;
 	evlist->all_cpus = NULL;
 	evlist->threads = NULL;
-	fdarray__exit(&evlist->pollfd);
+	perf_evlist__exit_pollfd(evlist);
 }
 
 void perf_evlist__delete(struct perf_evlist *evlist)
@@ -285,56 +306,105 @@ int perf_evlist__id_add_fd(struct perf_evlist *evlist,
 
 int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 {
-	int nr_cpus = perf_cpu_map__nr(evlist->cpus);
-	int nr_threads = perf_thread_map__nr(evlist->threads);
-	int nfds = 0;
-	struct perf_evsel *evsel;
-
-	perf_evlist__for_each_entry(evlist, evsel) {
-		if (evsel->system_wide)
-			nfds += nr_cpus;
-		else
-			nfds += nr_cpus * nr_threads;
-	}
+	int poll_fd;
 
-	if (fdarray__available_entries(&evlist->pollfd) < nfds &&
-	    fdarray__grow(&evlist->pollfd, nfds) < 0)
-		return -ENOMEM;
+	poll_fd = epoll_create1(EPOLL_CLOEXEC);
+	if (poll_fd < 0)
+		return -1;
 
+	evlist->poll_fd = poll_fd;
 	return 0;
 }
 
-int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd,
-			    void *ptr, short revent)
+static int __perf_evlist__add_pollfd(struct perf_evlist *evlist,
+				     struct poll_data *data,
+				     short revent)
 {
-	int pos = fdarray__add(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
+	struct epoll_event *events, ev = {
+		.data.ptr = data,
+		.events   = revent | EPOLLERR | EPOLLHUP,
+	};
+	int err;
+
+	err = epoll_ctl(evlist->poll_fd, EPOLL_CTL_ADD, data->fd, &ev);
+	if (err)
+		return err;
 
-	if (pos >= 0) {
-		evlist->pollfd.priv[pos].ptr = ptr;
-		fcntl(fd, F_SETFL, O_NONBLOCK);
+	events = realloc(evlist->poll_events, sizeof(ev) * (evlist->poll_cnt + 1));
+	if (events) {
+		evlist->poll_events = events;
+		evlist->poll_cnt++;
 	}
 
-	return pos;
+	return events ? 0 : -ENOMEM;
 }
 
-static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd,
-					 void *arg __maybe_unused)
+int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd,
+			    void *ptr, short revent)
 {
-	struct perf_mmap *map = fda->priv[fd].ptr;
+	struct poll_data *data = zalloc(sizeof(*data));
+	int err;
 
-	if (map)
-		perf_mmap__put(map);
+	if (!data)
+		return -ENOMEM;
+
+	data->fd  = fd;
+	data->ptr = ptr;
+
+	err = __perf_evlist__add_pollfd(evlist, data, revent);
+	if (!err)
+		list_add_tail(&data->list, &evlist->poll_data);
+
+	return err;
 }
 
 int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mask)
 {
-	return fdarray__filter(&evlist->pollfd, revents_and_mask,
-			       perf_evlist__munmap_filtered, NULL);
+	struct epoll_event *events = evlist->poll_events;
+	int i, removed = 0;
+
+	for (i = 0; i < evlist->poll_act; i++) {
+		if (events[i].events & revents_and_mask) {
+			struct poll_data *data = events[i].data.ptr;
+
+			if (data->ptr)
+				perf_mmap__put(data->ptr);
+
+			epoll_ctl(evlist->poll_fd, EPOLL_CTL_DEL, data->fd, &events[i]);
+
+			list_del(&data->list);
+			free(data);
+			removed++;
+		}
+	}
+
+	return evlist->poll_cnt -= removed;
+}
+
+bool perf_evlist__pollfd_data(struct perf_evlist *evlist, int fd)
+{
+	int i;
+
+	if (evlist->poll_act < 0)
+		return false;
+
+	for (i = 0; i < evlist->poll_act; i++) {
+		struct poll_data *data = evlist->poll_events[i].data.ptr;
+
+		if (data->fd == fd)
+			return true;
+	}
+
+	return false;
 }
 
 int perf_evlist__poll(struct perf_evlist *evlist, int timeout)
 {
-	return fdarray__poll(&evlist->pollfd, timeout);
+	evlist->poll_act = epoll_wait(evlist->poll_fd,
+				      evlist->poll_events,
+				      evlist->poll_cnt,
+				      timeout);
+	return evlist->poll_act;
 }
 
 static struct perf_mmap* perf_evlist__alloc_mmap(struct perf_evlist *evlist, bool overwrite)
@@ -593,7 +663,7 @@ int perf_evlist__mmap_ops(struct perf_evlist *evlist,
 			return -ENOMEM;
 	}
 
-	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
+	if (evlist->poll_fd == -1 && perf_evlist__alloc_pollfd(evlist) < 0)
 		return -ENOMEM;
 
 	if (perf_cpu_map__empty(cpus))
diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
index 74dc8c3f0b66..39b08a04b992 100644
--- a/tools/lib/perf/include/internal/evlist.h
+++ b/tools/lib/perf/include/internal/evlist.h
@@ -3,7 +3,6 @@
 #define __LIBPERF_INTERNAL_EVLIST_H
 
 #include <linux/list.h>
-#include <api/fd/array.h>
 #include <internal/evsel.h>
 
 #define PERF_EVLIST__HLIST_BITS 8
@@ -12,6 +11,7 @@
 struct perf_cpu_map;
 struct perf_thread_map;
 struct perf_mmap_param;
+struct epoll_event;
 
 struct perf_evlist {
 	struct list_head	 entries;
@@ -22,7 +22,11 @@ struct perf_evlist {
 	struct perf_thread_map	*threads;
 	int			 nr_mmaps;
 	size_t			 mmap_len;
-	struct fdarray		 pollfd;
+	int			 poll_fd;
+	int			 poll_cnt;
+	int			 poll_act;
+	struct epoll_event	*poll_events;
+	struct list_head	 poll_data;
 	struct hlist_head	 heads[PERF_EVLIST__HLIST_SIZE];
 	struct perf_mmap	*mmap;
 	struct perf_mmap	*mmap_ovw;
@@ -124,4 +128,5 @@ int perf_evlist__id_add_fd(struct perf_evlist *evlist,
 			   struct perf_evsel *evsel,
 			   int cpu, int thread, int fd);
 
+bool perf_evlist__pollfd_data(struct perf_evlist *evlist, int fd);
 #endif /* __LIBPERF_INTERNAL_EVLIST_H */
diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 95a77058023e..decc75745395 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -940,7 +940,7 @@ static int perf_kvm__handle_stdin(void)
 
 static int kvm_events_live_report(struct perf_kvm_stat *kvm)
 {
-	int nr_stdin, ret, err = -EINVAL;
+	int ret, err = -EINVAL;
 	struct termios save;
 
 	/* live flag must be set first */
@@ -971,8 +971,7 @@ static int kvm_events_live_report(struct perf_kvm_stat *kvm)
 	if (evlist__add_pollfd(kvm->evlist, kvm->timerfd) < 0)
 		goto out;
 
-	nr_stdin = evlist__add_pollfd(kvm->evlist, fileno(stdin));
-	if (nr_stdin < 0)
+	if (evlist__add_pollfd(kvm->evlist, fileno(stdin)))
 		goto out;
 
 	if (fd_set_nonblock(fileno(stdin)) != 0)
@@ -982,7 +981,6 @@ static int kvm_events_live_report(struct perf_kvm_stat *kvm)
 	evlist__enable(kvm->evlist);
 
 	while (!done) {
-		struct fdarray *fda = &kvm->evlist->core.pollfd;
 		int rc;
 
 		rc = perf_kvm__mmap_read(kvm);
@@ -993,7 +991,7 @@ static int kvm_events_live_report(struct perf_kvm_stat *kvm)
 		if (err)
 			goto out;
 
-		if (fda->entries[nr_stdin].revents & POLLIN)
+		if (perf_evlist__pollfd_data(&kvm->evlist->core, fileno(stdin)))
 			done = perf_kvm__handle_stdin();
 
 		if (!rc && !done)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index e108d90ae2ed..a49bf4186aab 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1576,12 +1576,6 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		status = -1;
 		goto out_delete_session;
 	}
-	err = evlist__add_pollfd(rec->evlist, done_fd);
-	if (err < 0) {
-		pr_err("Failed to add wakeup eventfd to poll list\n");
-		status = err;
-		goto out_delete_session;
-	}
 #endif // HAVE_EVENTFD_SUPPORT
 
 	session->header.env.comp_type  = PERF_COMP_ZSTD;
@@ -1624,6 +1618,14 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	}
 	session->header.env.comp_mmap_len = session->evlist->core.mmap_len;
 
+#ifdef HAVE_EVENTFD_SUPPORT
+	err = evlist__add_pollfd(rec->evlist, done_fd);
+	if (err < 0) {
+		pr_err("Failed to add wakeup eventfd to poll list\n");
+		goto out_child;
+	}
+#endif // HAVE_EVENTFD_SUPPORT
+
 	if (rec->opts.kcore) {
 		err = record__kcore_copy(&session->machines.host, data);
 		if (err) {
-- 
2.25.4



* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-09 14:56                     ` Jiri Olsa
  2020-06-09 18:51                       ` Alexey Budankov
  2020-06-15 13:13                       ` Alexey Budankov
@ 2020-06-15 17:38                       ` Alexey Budankov
  2 siblings, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-15 17:38 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel


On 09.06.2020 17:56, Jiri Olsa wrote:
> On Mon, Jun 08, 2020 at 08:18:20PM +0300, Alexey Budankov wrote:
> 
> SNIP
> 

<SNIP>

> 
> another idea is to forbid filter to screw the array
> and return only remaining number, like below
> 
> jirka
> 
> 
> ---
> diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
> index 58d44d5eee31..89f9a2193c2d 100644
> --- a/tools/lib/api/fd/array.c
> +++ b/tools/lib/api/fd/array.c
> @@ -93,22 +93,21 @@ int fdarray__filter(struct fdarray *fda, short revents,
>  		return 0;
>  
>  	for (fd = 0; fd < fda->nr; ++fd) {
> +		if (!fda->entries[fd].events)

If we change it to
		if (!fda->entries[fd].revents)
and keep the indices returned by fdarray__add() fixed across fdarray__filter()
calls, then it is possible to check an fd's status by its pos, process it, and
zero its revents prior to fdarray__filter(). In this case fdarray__filter()
would count the number of fds, skipping the ones with zeroed revents.

This looks like it solves the current task while avoiding explicit fd typing,
and it even provides some more flexibility for fds other than the event and ctl
ones. I will try to come up with a version implementing this approach.
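
A minimal sketch of that variant, for illustration only. struct fd_table and
fd_table__filter() are hypothetical stand-ins, not the real fdarray API, and
the destructor callback is left out. The point shown is that entries are
retired in place and never moved, so a pos obtained at add time stays valid.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical stand-in for struct fdarray; only the fields used here. */
struct fd_table {
	int            nr;
	struct pollfd *entries;
};

/*
 * Count entries with pending events that do not match the mask, without
 * compacting the array. Entries with zero revents are skipped, whether the
 * caller zeroed them after handling or they simply had nothing pending.
 */
static int fd_table__filter(struct fd_table *t, short revents)
{
	int pos, nr = 0;

	assert(t != NULL);

	for (pos = 0; pos < t->nr; ++pos) {
		if (!t->entries[pos].revents)
			continue;

		if (t->entries[pos].revents & revents) {
			/* retire in place, as the quoted patch does */
			t->entries[pos].revents = 0;
			t->entries[pos].events = 0;
			continue;
		}

		++nr;
	}

	return nr;
}
```

A caller that got pos from the add call can then test entries[pos].revents,
handle the fd, and zero revents before calling the filter.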

~Alexey

> +			continue;
> +
>  		if (fda->entries[fd].revents & revents) {
>  			if (entry_destructor)
>  				entry_destructor(fda, fd, arg);
>  
> +			fda->entries[fd].revents = fda->entries[fd].events = 0;
>  			continue;
>  		}
>  
> -		if (fd != nr) {
> -			fda->entries[nr] = fda->entries[fd];
> -			fda->priv[nr]	 = fda->priv[fd];
> -		}
> -
>  		++nr;
>  	}
>  
> -	return fda->nr = nr;
> +	return nr;
>  }
>  
>  int fdarray__poll(struct fdarray *fda, int timeout)
> diff --git a/tools/perf/tests/fdarray.c b/tools/perf/tests/fdarray.c
> index c7c81c4a5b2b..d0c8a05aab2f 100644
> --- a/tools/perf/tests/fdarray.c
> +++ b/tools/perf/tests/fdarray.c
> @@ -12,6 +12,7 @@ static void fdarray__init_revents(struct fdarray *fda, short revents)
>  
>  	for (fd = 0; fd < fda->nr; ++fd) {
>  		fda->entries[fd].fd	 = fda->nr - fd;
> +		fda->entries[fd].events  = revents;
>  		fda->entries[fd].revents = revents;
>  	}
>  }
> @@ -29,7 +30,7 @@ static int fdarray__fprintf_prefix(struct fdarray *fda, const char *prefix, FILE
>  
>  int test__fdarray__filter(struct test *test __maybe_unused, int subtest __maybe_unused)
>  {
> -	int nr_fds, expected_fd[2], fd, err = TEST_FAIL;
> +	int nr_fds, err = TEST_FAIL;
>  	struct fdarray *fda = fdarray__new(5, 5);
>  
>  	if (fda == NULL) {
> @@ -55,7 +56,6 @@ int test__fdarray__filter(struct test *test __maybe_unused, int subtest __maybe_
>  
>  	fdarray__init_revents(fda, POLLHUP);
>  	fda->entries[2].revents = POLLIN;
> -	expected_fd[0] = fda->entries[2].fd;
>  
>  	pr_debug("\nfiltering all but fda->entries[2]:");
>  	fdarray__fprintf_prefix(fda, "before", stderr);
> @@ -66,17 +66,9 @@ int test__fdarray__filter(struct test *test __maybe_unused, int subtest __maybe_
>  		goto out_delete;
>  	}
>  
> -	if (fda->entries[0].fd != expected_fd[0]) {
> -		pr_debug("\nfda->entries[0].fd=%d != %d\n",
> -			 fda->entries[0].fd, expected_fd[0]);
> -		goto out_delete;
> -	}
> -
>  	fdarray__init_revents(fda, POLLHUP);
>  	fda->entries[0].revents = POLLIN;
> -	expected_fd[0] = fda->entries[0].fd;
>  	fda->entries[3].revents = POLLIN;
> -	expected_fd[1] = fda->entries[3].fd;
>  
>  	pr_debug("\nfiltering all but (fda->entries[0], fda->entries[3]):");
>  	fdarray__fprintf_prefix(fda, "before", stderr);
> @@ -88,14 +80,6 @@ int test__fdarray__filter(struct test *test __maybe_unused, int subtest __maybe_
>  		goto out_delete;
>  	}
>  
> -	for (fd = 0; fd < 2; ++fd) {
> -		if (fda->entries[fd].fd != expected_fd[fd]) {
> -			pr_debug("\nfda->entries[%d].fd=%d != %d\n", fd,
> -				 fda->entries[fd].fd, expected_fd[fd]);
> -			goto out_delete;
> -		}
> -	}
> -
>  	pr_debug("\n");
>  
>  	err = 0;
> 


* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-15 16:58                       ` Jiri Olsa
@ 2020-06-17  9:27                         ` Jiri Olsa
  2020-06-17  9:39                           ` Alexey Budankov
  2020-06-22  9:47                         ` Alexey Budankov
  1 sibling, 1 reply; 44+ messages in thread
From: Jiri Olsa @ 2020-06-17  9:27 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On Mon, Jun 15, 2020 at 06:58:04PM +0200, Jiri Olsa wrote:
> On Mon, Jun 15, 2020 at 05:37:53PM +0300, Alexey Budankov wrote:
> > 
> > On 15.06.2020 15:30, Jiri Olsa wrote:
> > > On Mon, Jun 15, 2020 at 08:20:38AM +0300, Alexey Budankov wrote:
> > >>
> > >> On 08.06.2020 19:07, Jiri Olsa wrote:
> > >>> On Mon, Jun 08, 2020 at 12:54:31PM +0300, Alexey Budankov wrote:
> > >>>>
> > >>>> On 08.06.2020 11:43, Jiri Olsa wrote:
> > >>>>> On Mon, Jun 08, 2020 at 11:08:56AM +0300, Alexey Budankov wrote:
> > >>>>>>
> > >>>>>> On 05.06.2020 19:15, Alexey Budankov wrote:
> > >>>>>>>
> > >>>>>>> On 05.06.2020 14:38, Jiri Olsa wrote:
> > >> <SNIP>
> > >>>>>>> revents = fdarray_fixed_revents(array, pos);
> > >>>>>>> fdarray__del(array, pos);
> > >>>>>>
> > >>>>>> So how about just adding _revents() and _del() for fixed fds and
> > >>>>>> changing the return value of fdarray__add() to bool?
> > >>>>>
> > >>>>> I don't like the separation for fixed and non-fixed fds,
> > >>>>> why can't we make it generic?
> > >>>>
> > >>>> The usage models are different, but they still need to be parts of the same
> > >>>> class so that one atomic poll() covers them. The distinction is filterable
> > >>>> vs. not filterable, and it has to be expressed in the API somehow. Options are:
> > >>>> 1. expose separate API calls like __add_nonfilterable(), __del_nonfilterable();
> > >>>>    use the nonfilterable quality in __filter() and __poll() and, perhaps, other internals;
> > >>>> 2. extend fdarray__add(, nonfilterable) with the nonfilterable quality;
> > >>>>    use the type in __filter() and __poll() and, perhaps, other internals;
> > >>>>    expose fewer API calls in comparison with option 1
> > >>>>
> > >>>> The pos returned for filterable fds should be converted to bool, since currently
> > >>>> the returned pos can become stale and there is no way in the API to check its state.
> > >>>> So it could look like this:
> > >>>>
> > >>>> fdkey = fdarray__add(array, fd, events, type)
> > >>>> type: filterable, nonfilterable, something else
> > >>>> revents = fdarray__get_revents(fdkey);
> > >>>> fdarray__del(array, fdkey);
> > >>>
> > >>> I think there's solution without having filterable type,
> > >>> I'm not sure why you think this is needed
> > >>>
> > >>> I'm busy with other things this week, but I think I can
> > >>> come up with some patch early next week if needed
> > >>
> > >> Friendly reminder.
> > > 
> > > hm? I believe we discussed this in here:
> > >   https://lore.kernel.org/lkml/20200609145611.GI1558310@krava/
> > 
> > Do you want it to be implemented like in the patch posted by the link?
> 
> no idea.. looking for good solution ;-)

Friendly reminder.

jirka

> 
> how about switching completely to epoll? I tried and it
> does not look that bad
> 
> there might be some loose ends (interface change), but
> I think this would solve our problems with fdarray
> 
> I'll be able to get back to it by the end of the week,
> but if you want to check/finish this patch earlier go ahead
> 
> jirka
> 
> 
> ---
>  tools/lib/perf/evlist.c                  | 134 +++++++++++++++++------
>  tools/lib/perf/include/internal/evlist.h |   9 +-
>  tools/perf/builtin-kvm.c                 |   8 +-
>  tools/perf/builtin-record.c              |  14 ++-
>  4 files changed, 120 insertions(+), 45 deletions(-)
> 
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 6a875a0f01bb..8569cdd8bbd8 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -23,6 +23,7 @@
>  #include <perf/cpumap.h>
>  #include <perf/threadmap.h>
>  #include <api/fd/array.h>
> +#include <sys/epoll.h>
>  
>  void perf_evlist__init(struct perf_evlist *evlist)
>  {
> @@ -32,7 +33,10 @@ void perf_evlist__init(struct perf_evlist *evlist)
>  		INIT_HLIST_HEAD(&evlist->heads[i]);
>  	INIT_LIST_HEAD(&evlist->entries);
>  	evlist->nr_entries = 0;
> -	fdarray__init(&evlist->pollfd, 64);
> +	INIT_LIST_HEAD(&evlist->poll_data);
> +	evlist->poll_cnt = 0;
> +	evlist->poll_act = 0;
> +	evlist->poll_fd = -1;
>  }
>  
>  static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> @@ -120,6 +124,23 @@ static void perf_evlist__purge(struct perf_evlist *evlist)
>  	evlist->nr_entries = 0;
>  }
>  
> +struct poll_data {
> +	int		  fd;
> +	void		 *ptr;
> +	struct list_head  list;
> +};
> +
> +static void perf_evlist__exit_pollfd(struct perf_evlist *evlist)
> +{
> +	struct poll_data *data, *tmp;
> +
> +	if (evlist->poll_fd != -1)
> +		close(evlist->poll_fd);
> +
> +	list_for_each_entry_safe(data, tmp, &evlist->poll_data, list)
> +		free(data);
> +}
> +
>  void perf_evlist__exit(struct perf_evlist *evlist)
>  {
>  	perf_cpu_map__put(evlist->cpus);
> @@ -128,7 +149,7 @@ void perf_evlist__exit(struct perf_evlist *evlist)
>  	evlist->cpus = NULL;
>  	evlist->all_cpus = NULL;
>  	evlist->threads = NULL;
> -	fdarray__exit(&evlist->pollfd);
> +	perf_evlist__exit_pollfd(evlist);
>  }
>  
>  void perf_evlist__delete(struct perf_evlist *evlist)
> @@ -285,56 +306,105 @@ int perf_evlist__id_add_fd(struct perf_evlist *evlist,
>  
>  int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
>  {
> -	int nr_cpus = perf_cpu_map__nr(evlist->cpus);
> -	int nr_threads = perf_thread_map__nr(evlist->threads);
> -	int nfds = 0;
> -	struct perf_evsel *evsel;
> -
> -	perf_evlist__for_each_entry(evlist, evsel) {
> -		if (evsel->system_wide)
> -			nfds += nr_cpus;
> -		else
> -			nfds += nr_cpus * nr_threads;
> -	}
> +	int poll_fd;
>  
> -	if (fdarray__available_entries(&evlist->pollfd) < nfds &&
> -	    fdarray__grow(&evlist->pollfd, nfds) < 0)
> -		return -ENOMEM;
> +	poll_fd = epoll_create1(EPOLL_CLOEXEC);
> +	if (poll_fd < 0)
> +		return -1;
>  
> +	evlist->poll_fd = poll_fd;
>  	return 0;
>  }
>  
> -int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd,
> -			    void *ptr, short revent)
> +static int __perf_evlist__add_pollfd(struct perf_evlist *evlist,
> +				     struct poll_data *data,
> +				     short revent)
>  {
> -	int pos = fdarray__add(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
> +	struct epoll_event *events, ev = {
> +		.data.ptr = data,
> +		.events   = revent | EPOLLERR | EPOLLHUP,
> +	};
> +	int err;
> +
> +	err = epoll_ctl(evlist->poll_fd, EPOLL_CTL_ADD, data->fd, &ev);
> +	if (err)
> +		return err;
>  
> -	if (pos >= 0) {
> -		evlist->pollfd.priv[pos].ptr = ptr;
> -		fcntl(fd, F_SETFL, O_NONBLOCK);
> +	events = realloc(evlist->poll_events, sizeof(ev) * (evlist->poll_cnt + 1));
> +	if (events) {
> +		evlist->poll_events = events;
> +		evlist->poll_cnt++;
>  	}
>  
> -	return pos;
> +	return events ? 0 : -ENOMEM;
>  }
>  
> -static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd,
> -					 void *arg __maybe_unused)
> +int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd,
> +			    void *ptr, short revent)
>  {
> -	struct perf_mmap *map = fda->priv[fd].ptr;
> +	struct poll_data *data = zalloc(sizeof(*data));
> +	int err;
>  
> -	if (map)
> -		perf_mmap__put(map);
> +	if (!data)
> +		return -ENOMEM;
> +
> +	data->fd  = fd;
> +	data->ptr = ptr;
> +
> +	err = __perf_evlist__add_pollfd(evlist, data, revent);
> +	if (!err)
> +		list_add_tail(&data->list, &evlist->poll_data);
> +
> +	return err;
>  }
>  
>  int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mask)
>  {
> -	return fdarray__filter(&evlist->pollfd, revents_and_mask,
> -			       perf_evlist__munmap_filtered, NULL);
> +	struct epoll_event *events = evlist->poll_events;
> +	int i, removed = 0;
> +
> +	for (i = 0; i < evlist->poll_act; i++) {
> +		if (events[i].events & revents_and_mask) {
> +			struct poll_data *data = events[i].data.ptr;
> +
> +			if (data->ptr)
> +				perf_mmap__put(data->ptr);
> +
> +			epoll_ctl(evlist->poll_fd, EPOLL_CTL_DEL, data->fd, &events[i]);
> +
> +			list_del(&data->list);
> +			free(data);
> +			removed++;
> +		}
> +	}
> +
> +	return evlist->poll_cnt -= removed;
> +}
> +
> +bool perf_evlist__pollfd_data(struct perf_evlist *evlist, int fd)
> +{
> +	int i;
> +
> +	if (evlist->poll_act < 0)
> +		return false;
> +
> +	for (i = 0; i < evlist->poll_act; i++) {
> +		struct poll_data *data = evlist->poll_events[i].data.ptr;
> +
> +		if (data->fd == fd)
> +			return true;
> +	}
> +
> +	return false;
>  }
>  
>  int perf_evlist__poll(struct perf_evlist *evlist, int timeout)
>  {
> -	return fdarray__poll(&evlist->pollfd, timeout);
> +	evlist->poll_act = epoll_wait(evlist->poll_fd,
> +				      evlist->poll_events,
> +				      evlist->poll_cnt,
> +				      timeout);
> +	return evlist->poll_act;
>  }
>  
>  static struct perf_mmap* perf_evlist__alloc_mmap(struct perf_evlist *evlist, bool overwrite)
> @@ -593,7 +663,7 @@ int perf_evlist__mmap_ops(struct perf_evlist *evlist,
>  			return -ENOMEM;
>  	}
>  
> -	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
> +	if (evlist->poll_fd == -1 && perf_evlist__alloc_pollfd(evlist) < 0)
>  		return -ENOMEM;
>  
>  	if (perf_cpu_map__empty(cpus))
> diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
> index 74dc8c3f0b66..39b08a04b992 100644
> --- a/tools/lib/perf/include/internal/evlist.h
> +++ b/tools/lib/perf/include/internal/evlist.h
> @@ -3,7 +3,6 @@
>  #define __LIBPERF_INTERNAL_EVLIST_H
>  
>  #include <linux/list.h>
> -#include <api/fd/array.h>
>  #include <internal/evsel.h>
>  
>  #define PERF_EVLIST__HLIST_BITS 8
> @@ -12,6 +11,7 @@
>  struct perf_cpu_map;
>  struct perf_thread_map;
>  struct perf_mmap_param;
> +struct epoll_event;
>  
>  struct perf_evlist {
>  	struct list_head	 entries;
> @@ -22,7 +22,11 @@ struct perf_evlist {
>  	struct perf_thread_map	*threads;
>  	int			 nr_mmaps;
>  	size_t			 mmap_len;
> -	struct fdarray		 pollfd;
> +	int			 poll_fd;
> +	int			 poll_cnt;
> +	int			 poll_act;
> +	struct epoll_event	*poll_events;
> +	struct list_head	 poll_data;
>  	struct hlist_head	 heads[PERF_EVLIST__HLIST_SIZE];
>  	struct perf_mmap	*mmap;
>  	struct perf_mmap	*mmap_ovw;
> @@ -124,4 +128,5 @@ int perf_evlist__id_add_fd(struct perf_evlist *evlist,
>  			   struct perf_evsel *evsel,
>  			   int cpu, int thread, int fd);
>  
> +bool perf_evlist__pollfd_data(struct perf_evlist *evlist, int fd);
>  #endif /* __LIBPERF_INTERNAL_EVLIST_H */
> diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
> index 95a77058023e..decc75745395 100644
> --- a/tools/perf/builtin-kvm.c
> +++ b/tools/perf/builtin-kvm.c
> @@ -940,7 +940,7 @@ static int perf_kvm__handle_stdin(void)
>  
>  static int kvm_events_live_report(struct perf_kvm_stat *kvm)
>  {
> -	int nr_stdin, ret, err = -EINVAL;
> +	int ret, err = -EINVAL;
>  	struct termios save;
>  
>  	/* live flag must be set first */
> @@ -971,8 +971,7 @@ static int kvm_events_live_report(struct perf_kvm_stat *kvm)
>  	if (evlist__add_pollfd(kvm->evlist, kvm->timerfd) < 0)
>  		goto out;
>  
> -	nr_stdin = evlist__add_pollfd(kvm->evlist, fileno(stdin));
> -	if (nr_stdin < 0)
> +	if (evlist__add_pollfd(kvm->evlist, fileno(stdin)))
>  		goto out;
>  
>  	if (fd_set_nonblock(fileno(stdin)) != 0)
> @@ -982,7 +981,6 @@ static int kvm_events_live_report(struct perf_kvm_stat *kvm)
>  	evlist__enable(kvm->evlist);
>  
>  	while (!done) {
> -		struct fdarray *fda = &kvm->evlist->core.pollfd;
>  		int rc;
>  
>  		rc = perf_kvm__mmap_read(kvm);
> @@ -993,7 +991,7 @@ static int kvm_events_live_report(struct perf_kvm_stat *kvm)
>  		if (err)
>  			goto out;
>  
> -		if (fda->entries[nr_stdin].revents & POLLIN)
> +		if (perf_evlist__pollfd_data(&kvm->evlist->core, fileno(stdin)))
>  			done = perf_kvm__handle_stdin();
>  
>  		if (!rc && !done)
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index e108d90ae2ed..a49bf4186aab 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -1576,12 +1576,6 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>  		status = -1;
>  		goto out_delete_session;
>  	}
> -	err = evlist__add_pollfd(rec->evlist, done_fd);
> -	if (err < 0) {
> -		pr_err("Failed to add wakeup eventfd to poll list\n");
> -		status = err;
> -		goto out_delete_session;
> -	}
>  #endif // HAVE_EVENTFD_SUPPORT
>  
>  	session->header.env.comp_type  = PERF_COMP_ZSTD;
> @@ -1624,6 +1618,14 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>  	}
>  	session->header.env.comp_mmap_len = session->evlist->core.mmap_len;
>  
> +#ifdef HAVE_EVENTFD_SUPPORT
> +	err = evlist__add_pollfd(rec->evlist, done_fd);
> +	if (err < 0) {
> +		pr_err("Failed to add wakeup eventfd to poll list\n");
> +		goto out_child;
> +	}
> +#endif // HAVE_EVENTFD_SUPPORT
> +
>  	if (rec->opts.kcore) {
>  		err = record__kcore_copy(&session->machines.host, data);
>  		if (err) {
> -- 
> 2.25.4
> 


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-17  9:27                         ` Jiri Olsa
@ 2020-06-17  9:39                           ` Alexey Budankov
  0 siblings, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-17  9:39 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel


On 17.06.2020 12:27, Jiri Olsa wrote:
> On Mon, Jun 15, 2020 at 06:58:04PM +0200, Jiri Olsa wrote:
>> On Mon, Jun 15, 2020 at 05:37:53PM +0300, Alexey Budankov wrote:
>>>
>>> On 15.06.2020 15:30, Jiri Olsa wrote:
>>>> On Mon, Jun 15, 2020 at 08:20:38AM +0300, Alexey Budankov wrote:
>>>>>
>>>>> On 08.06.2020 19:07, Jiri Olsa wrote:
>>>>>> On Mon, Jun 08, 2020 at 12:54:31PM +0300, Alexey Budankov wrote:
>>>>>>>
>>>>>>> On 08.06.2020 11:43, Jiri Olsa wrote:
>>>>>>>> On Mon, Jun 08, 2020 at 11:08:56AM +0300, Alexey Budankov wrote:
>>>>>>>>>
>>>>>>>>> On 05.06.2020 19:15, Alexey Budankov wrote:
>>>>>>>>>>
>>>>>>>>>> On 05.06.2020 14:38, Jiri Olsa wrote:
>>>>> <SNIP>
>>>>>>>>>> revents = fdarray_fixed_revents(array, pos);
>>>>>>>>>> fdarray__del(array, pos);
>>>>>>>>>
>>>>>>>>> So how is it about just adding _revents() and _del() for fixed fds with
>>>>>>>>> correction of retval to bool for fdarray__add()?
>>>>>>>>
>>>>>>>> I don't like the separation for fixed and non-fixed fds,
>>>>>>>> why can't we make generic?
>>>>>>>
>>>>>>> Usage models are different but they want still to be parts of the same class
>>>>>>> for atomic poll(). The distinction is filterable vs. not filterable.
>>>>>>> The distinction should be somehow provided in API. Options are:
>>>>>>> 1. expose separate API calls like __add_nonfilterable(), __del_nonfilterable();
>>>>>>>    use nonfilterable quality in __filter() and __poll() and, perhaps, other internals;
>>>>>>> 2. extend fdarray__add(, nonfilterable) with the nonfilterable quality
>>>>>>>    use the type in __filter() and __poll() and, perhaps, other internals;
>>>>>>>    expose less API calls in comparison with option 1
>>>>>>>
>>>>>>> Exposure of pos for filterable fds should be converted to bool since currently
>>>>>>> the returned pos can become stale and there is no way in API to check its state.
>>>>>>> So it could look like this:
>>>>>>>
>>>>>>> fdkey = fdarray__add(array, fd, events, type)
>>>>>>> type: filterable, nonfilterable, somthing else
>>>>>>> revents = fdarray__get_revents(fdkey);
>>>>>>> fdarray__del(array, fdkey);
>>>>>>
>>>>>> I think there's solution without having filterable type,
>>>>>> I'm not sure why you think this is needed
>>>>>>
>>>>>> I'm busy with other things this week, but I think I can
>>>>>> come up with some patch early next week if needed
>>>>>
>>>>> Friendly reminder.
>>>>
>>>> hm? I believe we discussed this in here:
>>>>   https://lore.kernel.org/lkml/20200609145611.GI1558310@krava/
>>>
>>> Do you want it to be implemented like in the patch posted by the link?
>>
>> no idea.. looking for good solution ;-)
> 
> Friendly reminder.

Please see v8: https://lore.kernel.org/lkml/0781a077-aa82-5b4a-273e-c17372a72b93@linux.intel.com/

~Alexey

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-15 16:58                       ` Jiri Olsa
  2020-06-17  9:27                         ` Jiri Olsa
@ 2020-06-22  9:47                         ` Alexey Budankov
  2020-06-22 10:21                           ` Jiri Olsa
  1 sibling, 1 reply; 44+ messages in thread
From: Alexey Budankov @ 2020-06-22  9:47 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On 15.06.2020 19:58, Jiri Olsa wrote:
> On Mon, Jun 15, 2020 at 05:37:53PM +0300, Alexey Budankov wrote:
>>
>> On 15.06.2020 15:30, Jiri Olsa wrote:
>>> On Mon, Jun 15, 2020 at 08:20:38AM +0300, Alexey Budankov wrote:
>>>>
>>>> On 08.06.2020 19:07, Jiri Olsa wrote:
>>>>> On Mon, Jun 08, 2020 at 12:54:31PM +0300, Alexey Budankov wrote:
>>>>>>
>>>>>> On 08.06.2020 11:43, Jiri Olsa wrote:
>>>>>>> On Mon, Jun 08, 2020 at 11:08:56AM +0300, Alexey Budankov wrote:
>>>>>>>>
>>>>>>>> On 05.06.2020 19:15, Alexey Budankov wrote:
>>>>>>>>>
>>>>>>>>> On 05.06.2020 14:38, Jiri Olsa wrote:
>>>> <SNIP>
>>>>>>>>> revents = fdarray_fixed_revents(array, pos);
>>>>>>>>> fdarray__del(array, pos);
>>>>>>>>
>>>>>>>> So how is it about just adding _revents() and _del() for fixed fds with
>>>>>>>> correction of retval to bool for fdarray__add()?
>>>>>>>
>>>>>>> I don't like the separation for fixed and non-fixed fds,
>>>>>>> why can't we make generic?
>>>>>>
>>>>>> Usage models are different but they want still to be parts of the same class
>>>>>> for atomic poll(). The distinction is filterable vs. not filterable.
>>>>>> The distinction should be somehow provided in API. Options are:
>>>>>> 1. expose separate API calls like __add_nonfilterable(), __del_nonfilterable();
>>>>>>    use nonfilterable quality in __filter() and __poll() and, perhaps, other internals;
>>>>>> 2. extend fdarray__add(, nonfilterable) with the nonfilterable quality
>>>>>>    use the type in __filter() and __poll() and, perhaps, other internals;
>>>>>>    expose less API calls in comparison with option 1
>>>>>>
>>>>>> Exposure of pos for filterable fds should be converted to bool since currently
>>>>>> the returned pos can become stale and there is no way in API to check its state.
>>>>>> So it could look like this:
>>>>>>
>>>>>> fdkey = fdarray__add(array, fd, events, type)
>>>>>> type: filterable, nonfilterable, somthing else
>>>>>> revents = fdarray__get_revents(fdkey);
>>>>>> fdarray__del(array, fdkey);
>>>>>
>>>>> I think there's solution without having filterable type,
>>>>> I'm not sure why you think this is needed
>>>>>
>>>>> I'm busy with other things this week, but I think I can
>>>>> come up with some patch early next week if needed
>>>>
>>>> Friendly reminder.
>>>
>>> hm? I believe we discussed this in here:
>>>   https://lore.kernel.org/lkml/20200609145611.GI1558310@krava/
>>
>> Do you want it to be implemented like in the patch posted by the link?
> 
> no idea.. looking for good solution ;-)
> 
> how about switching completely to epoll? I tried and it
> does not look that bad

Well, epoll() is perhaps possible but why does it want switching to epoll()?
What are the benefits and/or specific task being solved by this switch? 

> 
> there might be some loose ends (interface change), but
> I think this would solve our problems with fdarray

Your first patch accommodated in v8 actually avoids fds typing
and solves pos (=fdarray__add()) staleness issue with fdarray.
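
To make the "pos staleness" point concrete, here is a minimal sketch, assuming a hypothetical miniature fd array (`mini_fda`, `mini_fda__add()`, `mini_fda__filter()` and `mini_fda__find()` are illustrative names, not the tools/lib/api/fd/array.c code and not the v8 change). It only shows why a position returned by an add call can stop pointing at the same fd once a filter pass compacts the array:

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical miniature fd array, for illustration only. */
#define MINI_FDA_MAX 8

struct mini_fda {
	int		nr;
	struct pollfd	entries[MINI_FDA_MAX];
};

/* Append fd and return its position, or -1 when the array is full. */
static int mini_fda__add(struct mini_fda *fda, int fd, short events)
{
	assert(fda != NULL);

	if (fda->nr == MINI_FDA_MAX)
		return -1;

	fda->entries[fda->nr].fd      = fd;
	fda->entries[fda->nr].events  = events;
	fda->entries[fda->nr].revents = 0;
	return fda->nr++;
}

/* Drop entries whose revents match mask and compact the survivors. */
static int mini_fda__filter(struct mini_fda *fda, short mask)
{
	int i, nr = 0;

	for (i = 0; i < fda->nr; i++) {
		if (fda->entries[i].revents & mask)
			continue;
		if (i != nr)
			fda->entries[nr] = fda->entries[i];
		nr++;
	}

	fda->nr = nr;
	return nr;
}

/* Look an fd up by value instead of trusting a previously saved position. */
static int mini_fda__find(const struct mini_fda *fda, int fd)
{
	int i;

	for (i = 0; i < fda->nr; i++) {
		if (fda->entries[i].fd == fd)
			return i;
	}
	return -1;
}
```

If two fds are added at positions 0 and 1 and the first one is filtered out after POLLHUP, the second fd moves to position 0, so a caller still holding position 1 reads past the live entries. Looking up by fd is one way around that; keeping some entries out of the filter pass is another.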

Thanks,
Alexey

> 
> I'll be able to get back to it by the end of the week,
> but if you want to check/finish this patch earlier go ahead
> 
> jirka
> 
> 
> ---
>  tools/lib/perf/evlist.c                  | 134 +++++++++++++++++------
>  tools/lib/perf/include/internal/evlist.h |   9 +-
>  tools/perf/builtin-kvm.c                 |   8 +-
>  tools/perf/builtin-record.c              |  14 ++-
>  4 files changed, 120 insertions(+), 45 deletions(-)
> 
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 6a875a0f01bb..8569cdd8bbd8 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -23,6 +23,7 @@
>  #include <perf/cpumap.h>
>  #include <perf/threadmap.h>
>  #include <api/fd/array.h>
> +#include <sys/epoll.h>
>  
>  void perf_evlist__init(struct perf_evlist *evlist)
>  {
> @@ -32,7 +33,10 @@ void perf_evlist__init(struct perf_evlist *evlist)
>  		INIT_HLIST_HEAD(&evlist->heads[i]);
>  	INIT_LIST_HEAD(&evlist->entries);
>  	evlist->nr_entries = 0;
> -	fdarray__init(&evlist->pollfd, 64);
> +	INIT_LIST_HEAD(&evlist->poll_data);
> +	evlist->poll_cnt = 0;
> +	evlist->poll_act = 0;
> +	evlist->poll_fd = -1;
>  }
>  
>  static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> @@ -120,6 +124,23 @@ static void perf_evlist__purge(struct perf_evlist *evlist)
>  	evlist->nr_entries = 0;
>  }
>  
> +struct poll_data {
> +	int		  fd;
> +	void		 *ptr;
> +	struct list_head  list;
> +};
> +
> +static void perf_evlist__exit_pollfd(struct perf_evlist *evlist)
> +{
> +	struct poll_data *data, *tmp;
> +
> +	if (evlist->poll_fd != -1)
> +		close(evlist->poll_fd);
> +
> +	list_for_each_entry_safe(data, tmp, &evlist->poll_data, list)
> +		free(data);
> +}
> +
>  void perf_evlist__exit(struct perf_evlist *evlist)
>  {
>  	perf_cpu_map__put(evlist->cpus);
> @@ -128,7 +149,7 @@ void perf_evlist__exit(struct perf_evlist *evlist)
>  	evlist->cpus = NULL;
>  	evlist->all_cpus = NULL;
>  	evlist->threads = NULL;
> -	fdarray__exit(&evlist->pollfd);
> +	perf_evlist__exit_pollfd(evlist);
>  }
>  
>  void perf_evlist__delete(struct perf_evlist *evlist)
> @@ -285,56 +306,105 @@ int perf_evlist__id_add_fd(struct perf_evlist *evlist,
>  
>  int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
>  {
> -	int nr_cpus = perf_cpu_map__nr(evlist->cpus);
> -	int nr_threads = perf_thread_map__nr(evlist->threads);
> -	int nfds = 0;
> -	struct perf_evsel *evsel;
> -
> -	perf_evlist__for_each_entry(evlist, evsel) {
> -		if (evsel->system_wide)
> -			nfds += nr_cpus;
> -		else
> -			nfds += nr_cpus * nr_threads;
> -	}
> +	int poll_fd;
>  
> -	if (fdarray__available_entries(&evlist->pollfd) < nfds &&
> -	    fdarray__grow(&evlist->pollfd, nfds) < 0)
> -		return -ENOMEM;
> +	poll_fd = epoll_create1(EPOLL_CLOEXEC);
> +	if (poll_fd < 0)
> +		return -1;
>  
> +	evlist->poll_fd = poll_fd;
>  	return 0;
>  }
>  
> -int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd,
> -			    void *ptr, short revent)
> +static int __perf_evlist__add_pollfd(struct perf_evlist *evlist,
> +				     struct poll_data *data,
> +				     short revent)
>  {
> -	int pos = fdarray__add(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
> +	struct epoll_event *events, ev = {
> +		.data.ptr = data,
> +		.events   = revent | EPOLLERR | EPOLLHUP,
> +	};
> +	int err;
> +
> +	err = epoll_ctl(evlist->poll_fd, EPOLL_CTL_ADD, data->fd, &ev);
> +	if (err)
> +		return err;
>  
> -	if (pos >= 0) {
> -		evlist->pollfd.priv[pos].ptr = ptr;
> -		fcntl(fd, F_SETFL, O_NONBLOCK);
> +	events = realloc(evlist->poll_events, sizeof(ev) * (evlist->poll_cnt + 1));
> +	if (events) {
> +		evlist->poll_events = events;
> +		evlist->poll_cnt++;
>  	}
>  
> -	return pos;
> +	return events ? 0 : -ENOMEM;
>  }
>  
> -static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd,
> -					 void *arg __maybe_unused)
> +int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd,
> +			    void *ptr, short revent)
>  {
> -	struct perf_mmap *map = fda->priv[fd].ptr;
> +	struct poll_data *data = zalloc(sizeof(*data));
> +	int err;
>  
> -	if (map)
> -		perf_mmap__put(map);
> +	if (!data)
> +		return -ENOMEM;
> +
> +	data->fd  = fd;
> +	data->ptr = ptr;
> +
> +	err = __perf_evlist__add_pollfd(evlist, data, revent);
> +	if (!err)
> +		list_add_tail(&data->list, &evlist->poll_data);
> +
> +	return err;
>  }
>  
>  int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mask)
>  {
> -	return fdarray__filter(&evlist->pollfd, revents_and_mask,
> -			       perf_evlist__munmap_filtered, NULL);
> +	struct epoll_event *events = evlist->poll_events;
> +	int i, removed = 0;
> +
> +	for (i = 0; i < evlist->poll_act; i++) {
> +		if (events[i].events & revents_and_mask) {
> +			struct poll_data *data = events[i].data.ptr;
> +
> +			if (data->ptr)
> +				perf_mmap__put(data->ptr);
> +
> +			epoll_ctl(evlist->poll_fd, EPOLL_CTL_DEL, data->fd, &events[i]);
> +
> +			list_del(&data->list);
> +			free(data);
> +			removed++;
> +		}
> +	}
> +
> +	return evlist->poll_cnt -= removed;
> +}
> +
> +bool perf_evlist__pollfd_data(struct perf_evlist *evlist, int fd)
> +{
> +	int i;
> +
> +	if (evlist->poll_act < 0)
> +		return false;
> +
> +	for (i = 0; i < evlist->poll_act; i++) {
> +		struct poll_data *data = evlist->poll_events[i].data.ptr;
> +
> +		if (data->fd == fd)
> +			return true;
> +	}
> +
> +	return false;
>  }
>  
>  int perf_evlist__poll(struct perf_evlist *evlist, int timeout)
>  {
> -	return fdarray__poll(&evlist->pollfd, timeout);
> +	evlist->poll_act = epoll_wait(evlist->poll_fd,
> +				      evlist->poll_events,
> +				      evlist->poll_cnt,
> +				      timeout);
> +	return evlist->poll_act;
>  }
>  
>  static struct perf_mmap* perf_evlist__alloc_mmap(struct perf_evlist *evlist, bool overwrite)
> @@ -593,7 +663,7 @@ int perf_evlist__mmap_ops(struct perf_evlist *evlist,
>  			return -ENOMEM;
>  	}
>  
> -	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
> +	if (evlist->poll_fd == -1 && perf_evlist__alloc_pollfd(evlist) < 0)
>  		return -ENOMEM;
>  
>  	if (perf_cpu_map__empty(cpus))
> diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
> index 74dc8c3f0b66..39b08a04b992 100644
> --- a/tools/lib/perf/include/internal/evlist.h
> +++ b/tools/lib/perf/include/internal/evlist.h
> @@ -3,7 +3,6 @@
>  #define __LIBPERF_INTERNAL_EVLIST_H
>  
>  #include <linux/list.h>
> -#include <api/fd/array.h>
>  #include <internal/evsel.h>
>  
>  #define PERF_EVLIST__HLIST_BITS 8
> @@ -12,6 +11,7 @@
>  struct perf_cpu_map;
>  struct perf_thread_map;
>  struct perf_mmap_param;
> +struct epoll_event;
>  
>  struct perf_evlist {
>  	struct list_head	 entries;
> @@ -22,7 +22,11 @@ struct perf_evlist {
>  	struct perf_thread_map	*threads;
>  	int			 nr_mmaps;
>  	size_t			 mmap_len;
> -	struct fdarray		 pollfd;
> +	int			 poll_fd;
> +	int			 poll_cnt;
> +	int			 poll_act;
> +	struct epoll_event	*poll_events;
> +	struct list_head	 poll_data;
>  	struct hlist_head	 heads[PERF_EVLIST__HLIST_SIZE];
>  	struct perf_mmap	*mmap;
>  	struct perf_mmap	*mmap_ovw;
> @@ -124,4 +128,5 @@ int perf_evlist__id_add_fd(struct perf_evlist *evlist,
>  			   struct perf_evsel *evsel,
>  			   int cpu, int thread, int fd);
>  
> +bool perf_evlist__pollfd_data(struct perf_evlist *evlist, int fd);
>  #endif /* __LIBPERF_INTERNAL_EVLIST_H */
> diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
> index 95a77058023e..decc75745395 100644
> --- a/tools/perf/builtin-kvm.c
> +++ b/tools/perf/builtin-kvm.c
> @@ -940,7 +940,7 @@ static int perf_kvm__handle_stdin(void)
>  
>  static int kvm_events_live_report(struct perf_kvm_stat *kvm)
>  {
> -	int nr_stdin, ret, err = -EINVAL;
> +	int ret, err = -EINVAL;
>  	struct termios save;
>  
>  	/* live flag must be set first */
> @@ -971,8 +971,7 @@ static int kvm_events_live_report(struct perf_kvm_stat *kvm)
>  	if (evlist__add_pollfd(kvm->evlist, kvm->timerfd) < 0)
>  		goto out;
>  
> -	nr_stdin = evlist__add_pollfd(kvm->evlist, fileno(stdin));
> -	if (nr_stdin < 0)
> +	if (evlist__add_pollfd(kvm->evlist, fileno(stdin)))
>  		goto out;
>  
>  	if (fd_set_nonblock(fileno(stdin)) != 0)
> @@ -982,7 +981,6 @@ static int kvm_events_live_report(struct perf_kvm_stat *kvm)
>  	evlist__enable(kvm->evlist);
>  
>  	while (!done) {
> -		struct fdarray *fda = &kvm->evlist->core.pollfd;
>  		int rc;
>  
>  		rc = perf_kvm__mmap_read(kvm);
> @@ -993,7 +991,7 @@ static int kvm_events_live_report(struct perf_kvm_stat *kvm)
>  		if (err)
>  			goto out;
>  
> -		if (fda->entries[nr_stdin].revents & POLLIN)
> +		if (perf_evlist__pollfd_data(&kvm->evlist->core, fileno(stdin)))
>  			done = perf_kvm__handle_stdin();
>  
>  		if (!rc && !done)
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index e108d90ae2ed..a49bf4186aab 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -1576,12 +1576,6 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>  		status = -1;
>  		goto out_delete_session;
>  	}
> -	err = evlist__add_pollfd(rec->evlist, done_fd);
> -	if (err < 0) {
> -		pr_err("Failed to add wakeup eventfd to poll list\n");
> -		status = err;
> -		goto out_delete_session;
> -	}
>  #endif // HAVE_EVENTFD_SUPPORT
>  
>  	session->header.env.comp_type  = PERF_COMP_ZSTD;
> @@ -1624,6 +1618,14 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>  	}
>  	session->header.env.comp_mmap_len = session->evlist->core.mmap_len;
>  
> +#ifdef HAVE_EVENTFD_SUPPORT
> +	err = evlist__add_pollfd(rec->evlist, done_fd);
> +	if (err < 0) {
> +		pr_err("Failed to add wakeup eventfd to poll list\n");
> +		goto out_child;
> +	}
> +#endif // HAVE_EVENTFD_SUPPORT
> +
>  	if (rec->opts.kcore) {
>  		err = record__kcore_copy(&session->machines.host, data);
>  		if (err) {
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-22  9:47                         ` Alexey Budankov
@ 2020-06-22 10:21                           ` Jiri Olsa
  2020-06-22 10:50                             ` Alexey Budankov
  0 siblings, 1 reply; 44+ messages in thread
From: Jiri Olsa @ 2020-06-22 10:21 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On Mon, Jun 22, 2020 at 12:47:19PM +0300, Alexey Budankov wrote:

SNIP

> >>>>>> fdarray__del(array, fdkey);
> >>>>>
> >>>>> I think there's solution without having filterable type,
> >>>>> I'm not sure why you think this is needed
> >>>>>
> >>>>> I'm busy with other things this week, but I think I can
> >>>>> come up with some patch early next week if needed
> >>>>
> >>>> Friendly reminder.
> >>>
> >>> hm? I believe we discussed this in here:
> >>>   https://lore.kernel.org/lkml/20200609145611.GI1558310@krava/
> >>
> >> Do you want it to be implemented like in the patch posted by the link?
> > 
> > no idea.. looking for good solution ;-)
> > 
> > how about switching completely to epoll? I tried and it
> > does not look that bad
> 
> Well, epoll() is perhaps possible but why does it want switching to epoll()?
> What are the benefits and/or specific task being solved by this switch? 

epoll change fixes the same issues as the patch you took in v8

on top of it it's not a hack and will make polling more user
friendly because of the clear interface
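
As a rough illustration of the interface being argued for here, the sketch below registers an fd with epoll and carries a per-fd pointer in `epoll_event.data.ptr`, so the caller asks "is this fd ready?" instead of indexing an array by a saved position. `struct watch`, `watch_add()` and `watch_ready()` are hypothetical names for this example and are not part of libperf. Note that `epoll_create1()` reports failure with -1, so its result should be tested with `< 0`.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-fd record handed back by epoll via data.ptr. */
struct watch {
	int	 fd;
	void	*priv;	/* caller data, e.g. a mmap to release later */
};

/* Register w->fd for input events; returns 0 on success, -1 on error. */
static int watch_add(int epfd, struct watch *w)
{
	struct epoll_event ev = {
		.events   = EPOLLIN,
		.data.ptr = w,
	};

	return epoll_ctl(epfd, EPOLL_CTL_ADD, w->fd, &ev);
}

/* Wait up to timeout_ms and report whether fd is among the ready events. */
static bool watch_ready(int epfd, int fd, int timeout_ms)
{
	struct epoll_event events[8];
	int i, n = epoll_wait(epfd, events, 8, timeout_ms);

	/* n is -1 on error and 0 on timeout; the loop is skipped in both cases. */
	for (i = 0; i < n; i++) {
		struct watch *w = events[i].data.ptr;

		if (w->fd == fd && (events[i].events & EPOLLIN))
			return true;
	}
	return false;
}
```

A caller would create the instance with `epoll_create1(EPOLL_CLOEXEC)`, call `watch_add()` once per fd, and then poll with `watch_ready()`. Removing an fd with `EPOLL_CTL_DEL` does not renumber anything, which is the property the position-based array lacks.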

> 
> > 
> > there might be some loose ends (interface change), but
> > I think this would solve our problems with fdarray
> 
> > Your first patch accommodated in v8 actually avoids fds typing
> and solves pos (=fdarray__add()) staleness issue with fdarray.

yea, it was a change meant for discussion (which never happened),
and I considered it to be more a hack than a solution

I suppose we can live with that for a while, but I'd like to
have clean solution for polling as well

jirka


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-22 10:21                           ` Jiri Olsa
@ 2020-06-22 10:50                             ` Alexey Budankov
  2020-06-22 12:11                               ` Jiri Olsa
  0 siblings, 1 reply; 44+ messages in thread
From: Alexey Budankov @ 2020-06-22 10:50 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel


On 22.06.2020 13:21, Jiri Olsa wrote:
> On Mon, Jun 22, 2020 at 12:47:19PM +0300, Alexey Budankov wrote:
> 
> SNIP
> 
>>>>>>>> fdarray__del(array, fdkey);
>>>>>>>
>>>>>>> I think there's solution without having filterable type,
>>>>>>> I'm not sure why you think this is needed
>>>>>>>
>>>>>>> I'm busy with other things this week, but I think I can
>>>>>>> come up with some patch early next week if needed
>>>>>>
>>>>>> Friendly reminder.
>>>>>
>>>>> hm? I believe we discussed this in here:
>>>>>   https://lore.kernel.org/lkml/20200609145611.GI1558310@krava/
>>>>
>>>> Do you want it to be implemented like in the patch posted by the link?
>>>
>>> no idea.. looking for good solution ;-)
>>>
>>> how about switching completely to epoll? I tried and it
>>> does not look that bad
>>
>> Well, epoll() is perhaps possible but why does it want switching to epoll()?
>> What are the benefits and/or specific task being solved by this switch? 
> 
> epoll change fixes the same issues as the patch you took in v8
> 
> on top of it it's not a hack and will make polling more user
> friendly because of the clear interface

Clear. The downside is the /proc/sys/fs/epoll/max_user_watches limit, which
will affect Perf tool usage in addition to the current per-process limit on
the number of simultaneously open file descriptors (ulimit -n). So a move to
epoll() will impose one more limit that can affect Perf tool scalability.
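
For reference, both limits mentioned above can be read from a program. The following is a small sketch, assuming a Linux system where the procfs file is readable; the helper names `read_max_user_watches()` and `read_nofile_limit()` are made up for this example:

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <sys/resource.h>

/* Per-user epoll watch limit, or -1 if the procfs file cannot be read. */
static long read_max_user_watches(void)
{
	FILE *f = fopen("/proc/sys/fs/epoll/max_user_watches", "r");
	long val = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}

/* Soft RLIMIT_NOFILE (what "ulimit -n" prints), or -1 on error. */
static long read_nofile_limit(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
		return -1;
	if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > (rlim_t)LONG_MAX)
		return LONG_MAX;
	return (long)rl.rlim_cur;
}
```

Comparing the two values on a given machine shows which limit a tool opening many event fds would reach first.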

> 
>>
>>>
>>> there might be some loose ends (interface change), but
>>> I think this would solve our problems with fdarray
>>
>> Your first patch accommodated in v8 actually avoids fds typing
>> and solves pos (=fdarray__add()) staleness issue with fdarray.
> 
> yea, it was a change meant for discussion (which never happened),
> and I considered it to be more a hack than a solution
> 
> I suppose we can live with that for a while, but I'd like to
> have clean solution for polling as well

I wouldn't treat it as a hack but more as a fix because returned
pos is now a part of interface that can be safely used in callers.
Can we go with this fix for the patch set?

Thanks,
Alexey

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-22 10:50                             ` Alexey Budankov
@ 2020-06-22 12:11                               ` Jiri Olsa
  2020-06-22 14:04                                 ` Alexey Budankov
  0 siblings, 1 reply; 44+ messages in thread
From: Jiri Olsa @ 2020-06-22 12:11 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On Mon, Jun 22, 2020 at 01:50:03PM +0300, Alexey Budankov wrote:
> 
> On 22.06.2020 13:21, Jiri Olsa wrote:
> > On Mon, Jun 22, 2020 at 12:47:19PM +0300, Alexey Budankov wrote:
> > 
> > SNIP
> > 
> >>>>>>>> fdarray__del(array, fdkey);
> >>>>>>>
> >>>>>>> I think there's solution without having filterable type,
> >>>>>>> I'm not sure why you think this is needed
> >>>>>>>
> >>>>>>> I'm busy with other things this week, but I think I can
> >>>>>>> come up with some patch early next week if needed
> >>>>>>
> >>>>>> Friendly reminder.
> >>>>>
> >>>>> hm? I believe we discussed this in here:
> >>>>>   https://lore.kernel.org/lkml/20200609145611.GI1558310@krava/
> >>>>
> >>>> Do you want it to be implemented like in the patch posted by the link?
> >>>
> >>> no idea.. looking for good solution ;-)
> >>>
> >>> how about switching completely to epoll? I tried and it
> >>> does not look that bad
> >>
> >> Well, epoll() is perhaps possible but why does it want switching to epoll()?
> >> What are the benefits and/or specific task being solved by this switch? 
> > 
> > epoll change fixes the same issues as the patch you took in v8
> > 
> > on top of it it's not a hack and will make polling more user
> > friendly because of the clear interface
> 
> > Clear. The downside is the /proc/sys/fs/epoll/max_user_watches limit, which
> > will affect Perf tool usage in addition to the current per-process limit on
> > the number of simultaneously open file descriptors (ulimit -n). So a move to
> > epoll() will impose one more limit that can affect Perf tool scalability.

hum, I don't think this will be a problem:

    Allowing top 4% of low memory (per user) to be allocated in epoll watches,
    we have:

    LOMEM    MAX_WATCHES (per user)
    512MB    ~178000
    1GB      ~356000
    2GB      ~712000

my laptop has 19841945 allowed watches per user
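
The table above can be roughly reproduced with simple arithmetic. The sketch below assumes a per-watch cost of about 120 bytes, a value inferred here from the quoted figures (4% of 512MB divided by ~178000) and not taken from the kernel sources; `ASSUMED_WATCH_COST` and `estimate_max_watches()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per watch; inferred from the quoted table, not a kernel constant. */
#define ASSUMED_WATCH_COST 120

/* 4% of low memory (lowmem / 25) divided by the assumed per-watch cost. */
static uint64_t estimate_max_watches(uint64_t lowmem_bytes)
{
	return lowmem_bytes / 25 / ASSUMED_WATCH_COST;
}
```

With that assumption, 512MB, 1GB and 2GB of low memory give estimates within about 1% of the ~178000, ~356000 and ~712000 figures quoted above.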

> 
> > 
> >>
> >>>
> >>> there might be some loose ends (interface change), but
> >>> I think this would solve our problems with fdarray
> >>
> >> Your first patch accommodated in v8 actually avoids fds typing
> >> and solves pos (=fdarray__add()) staleness issue with fdarray.
> > 
> > yea, it was a change meant for discussion (which never happened),
> > and I considered it to be more a hack than a solution
> > 
> > I suppose we can live with that for a while, but I'd like to
> > have clean solution for polling as well
> 
> I wouldn't treat it as a hack but more as a fix because returned
> pos is now a part of interface that can be safely used in callers.
> Can we go with this fix for the patch set?

apart from this one I still have a problem with that stat factoring
having 1 complicated function deal with both fork and no fork processing,
which I already commented on, but you ignored ;-)

I'll try to go through that once more, and post some comments

jirka


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-22 12:11                               ` Jiri Olsa
@ 2020-06-22 14:04                                 ` Alexey Budankov
  2020-06-23 14:54                                   ` Jiri Olsa
  0 siblings, 1 reply; 44+ messages in thread
From: Alexey Budankov @ 2020-06-22 14:04 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On 22.06.2020 15:11, Jiri Olsa wrote:
> On Mon, Jun 22, 2020 at 01:50:03PM +0300, Alexey Budankov wrote:
>>
>> On 22.06.2020 13:21, Jiri Olsa wrote:
>>> On Mon, Jun 22, 2020 at 12:47:19PM +0300, Alexey Budankov wrote:
>>>
>>> SNIP
>>>
>>>>>>>>>> fdarray__del(array, fdkey);
>>>>>>>>>
>>>>>>>>> I think there's solution without having filterable type,
>>>>>>>>> I'm not sure why you think this is needed
>>>>>>>>>
>>>>>>>>> I'm busy with other things this week, but I think I can
>>>>>>>>> come up with some patch early next week if needed
>>>>>>>>
>>>>>>>> Friendly reminder.
>>>>>>>
>>>>>>> hm? I believe we discussed this in here:
>>>>>>>   https://lore.kernel.org/lkml/20200609145611.GI1558310@krava/
>>>>>>
>>>>>> Do you want it to be implemented like in the patch posted by the link?
>>>>>
>>>>> no idea.. looking for good solution ;-)
>>>>>
>>>>> how about switching completely to epoll? I tried and it
>>>>> does not look that bad
>>>>
>>>> Well, epoll() is perhaps possible but why does it want switching to epoll()?
>>>> What are the benefits and/or specific task being solved by this switch? 
>>>
>>> epoll change fixes the same issues as the patch you took in v8
>>>
>>> on top of it it's not a hack and will make polling more user
>>> friendly because of the clear interface
>>
>> Clear. The downside is the /proc/sys/fs/epoll/max_user_watches limit, which
>> will affect Perf tool usage in addition to the current per-process limit on
>> the number of simultaneously open file descriptors (ulimit -n). So a move to
>> epoll() will impose one more limit that can affect Perf tool scalability.
> 
> hum, I don't think this will be a problem:
> 
>     Allowing top 4% of low memory (per user) to be allocated in epoll watches,
>     we have:
> 
>     LOMEM    MAX_WATCHES (per user)
>     512MB    ~178000
>     1GB      ~356000
>     2GB      ~712000
> 
> my laptop has 19841945 allowed watches per user
> 
>>
>>>
>>>>
>>>>>
>>>>> there might be some loose ends (interface change), but
>>>>> I think this would solve our problems with fdarray
>>>>
>>>> Your first patch accommodated in v8 actually avoids fds typing
>>>> and solves pos (=fdarray__add()) staleness issue with fdarray.
>>>
>>> yea, it was a change meant for discussion (which never happened),
>>> and I considered it to be more a hack than a solution
>>>
>>> I suppose we can live with that for a while, but I'd like to
>>> have clean solution for polling as well
>>
>> I wouldn't treat it as a hack but more as a fix because returned
>> pos is now a part of interface that can be safely used in callers.
>> Can we go with this fix for the patch set?
> 
> apart from this one I still have a problem with that stat factoring
> having 1 complicated function deal with both fork and no fork processing,
> which I already commented on, but you ignored ;-)

Not an issue at all, let's split that func, dispatch_events() I suppose,
as you see it.

> 
> I'll try to go through that once more, and post some comments
> 
> jirka
> 

Thanks,
Alexey


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors
  2020-06-22 14:04                                 ` Alexey Budankov
@ 2020-06-23 14:54                                   ` Jiri Olsa
  0 siblings, 0 replies; 44+ messages in thread
From: Jiri Olsa @ 2020-06-23 14:54 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On Mon, Jun 22, 2020 at 05:04:21PM +0300, Alexey Budankov wrote:

SNIP

> >>>>> there might be some loose ends (interface change), but
> >>>>> I think this would solve our problems with fdarray
> >>>>
> >>>> Your first patch accommodated in v8 actually avoids fds typing
> >>>> and solves pos (=fdarray__add()) staleness issue with fdarray.
> >>>
> >>> yea, it was a change meant for discussion (which never happened),
> >>> and I considered it to be more a hack than a solution
> >>>
> >>> I suppose we can live with that for a while, but I'd like to
> >>> have clean solution for polling as well
> >>
> >> I wouldn't treat it as a hack but more as a fix because returned
> >> pos is now a part of interface that can be safely used in callers.
> >> Can we go with this fix for the patch set?
> > 
> > apart from this one I still have a problem with that stat factoring
> > having 1 complicated function deal with both fork and no fork processing,
> > which I already commented on, but you ignored ;-)
> 
> > Not an issue at all, let's split that func, dispatch_events() I suppose,
> as you see it.

ok, I checked it one more time and perhaps the function naming
was confusing for me.. but maybe we can give it another try,
I'm sending some comments

thanks,
jirka



* Re: [PATCH v7 03/13] perf evlist: implement control command handling functions
  2020-06-03 15:54 ` [PATCH v7 03/13] perf evlist: implement control command handling functions Alexey Budankov
@ 2020-06-23 14:54   ` Jiri Olsa
  2020-06-24 11:48     ` Alexey Budankov
  0 siblings, 1 reply; 44+ messages in thread
From: Jiri Olsa @ 2020-06-23 14:54 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On Wed, Jun 03, 2020 at 06:54:47PM +0300, Alexey Budankov wrote:

SNIP

> +			case EVLIST_CTL_CMD_ACK:
> +			case EVLIST_CTL_CMD_UNSUPPORTED:
> +			default:
> +				pr_debug("ctlfd: unsupported %d\n", *cmd);
> +				break;
> +			}
> +			if (!(*cmd == EVLIST_CTL_CMD_ACK || *cmd == EVLIST_CTL_CMD_UNSUPPORTED))
> +				evlist__ctlfd_ack(evlist);
> +		}
> +	}
> +
> +	if (stat_entries[ctlfd_pos].revents & (POLLHUP | POLLERR))
> +		evlist__finalize_ctlfd(evlist);
> +	else
> +		stat_entries[ctlfd_pos].revents = 0;
> +
> +	return err;
> +}
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index 0d8b361f1c8e..bccf0a970371 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -360,4 +360,21 @@ void perf_evlist__force_leader(struct evlist *evlist);
>  struct evsel *perf_evlist__reset_weak_group(struct evlist *evlist,
>  						 struct evsel *evsel,
>  						bool close);
> +#define EVLIST_CTL_CMD_ENABLE_TAG  "enable"
> +#define EVLIST_CTL_CMD_DISABLE_TAG "disable"
> +#define EVLIST_CTL_CMD_ACK_TAG     "ack\n"

why the \n at the end of ack?

jirka



* Re: [PATCH v7 03/13] perf evlist: implement control command handling functions
  2020-06-23 14:54   ` Jiri Olsa
@ 2020-06-24 11:48     ` Alexey Budankov
  0 siblings, 0 replies; 44+ messages in thread
From: Alexey Budankov @ 2020-06-24 11:48 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Ingo Molnar, Andi Kleen, linux-kernel

On 23.06.2020 17:54, Jiri Olsa wrote:
> On Wed, Jun 03, 2020 at 06:54:47PM +0300, Alexey Budankov wrote:
> 
> SNIP
> 
>> +			case EVLIST_CTL_CMD_ACK:
>> +			case EVLIST_CTL_CMD_UNSUPPORTED:
>> +			default:
>> +				pr_debug("ctlfd: unsupported %d\n", *cmd);
>> +				break;
>> +			}
>> +			if (!(*cmd == EVLIST_CTL_CMD_ACK || *cmd == EVLIST_CTL_CMD_UNSUPPORTED))
>> +				evlist__ctlfd_ack(evlist);
>> +		}
>> +	}
>> +
>> +	if (stat_entries[ctlfd_pos].revents & (POLLHUP | POLLERR))
>> +		evlist__finalize_ctlfd(evlist);
>> +	else
>> +		stat_entries[ctlfd_pos].revents = 0;
>> +
>> +	return err;
>> +}
>> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
>> index 0d8b361f1c8e..bccf0a970371 100644
>> --- a/tools/perf/util/evlist.h
>> +++ b/tools/perf/util/evlist.h
>> @@ -360,4 +360,21 @@ void perf_evlist__force_leader(struct evlist *evlist);
>>  struct evsel *perf_evlist__reset_weak_group(struct evlist *evlist,
>>  						 struct evsel *evsel,
>>  						bool close);
>> +#define EVLIST_CTL_CMD_ENABLE_TAG  "enable"
>> +#define EVLIST_CTL_CMD_DISABLE_TAG "disable"
>> +#define EVLIST_CTL_CMD_ACK_TAG     "ack\n"
> 
> why the \n at the end of ack?

The trailing \n terminates the shell's read command when a script
reads the ack. The \n could be dropped from the ack if the script
used "read -n 3 -u $fd res" instead of just "read -u $fd res".
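
To illustrate the difference between the two read forms, here is a
minimal bash sketch. It is not taken from the patch set and does not
involve perf: process substitution stands in for whatever writes the
ack, and the descriptor numbers (7, 8, 9) and variable names are made
up for the example.

```shell
#!/usr/bin/env bash
# Sketch only: shows how bash's `read` behaves with and without a
# trailing newline in the ack string. No perf process is involved;
# printf plays the role of the ack writer (an assumption for the demo).

# Case 1: writer sends "ack\n". Plain `read -u` returns at the newline.
exec 9< <(printf 'ack\n')
read -u 9 res_line
status_line=$?
exec 9<&-

# Case 2: writer sends "ack" with no newline. `read -n 3` returns as
# soon as three characters have arrived, so no newline is needed.
exec 8< <(printf 'ack')
read -n 3 -u 8 res_nchars
status_nchars=$?
exec 8<&-

# Case 3: writer sends "ack" with no newline and the script uses plain
# `read -u`. Here the writer exits, so read hits EOF and reports a
# non-zero status. With a descriptor that stays open, read would
# instead keep waiting for a newline.
exec 7< <(printf 'ack')
read -u 7 res_noeol
status_noeol=$?
exec 7<&-

echo "newline ack, plain read:    '$res_line' (status $status_line)"
echo "no newline, read -n 3:      '$res_nchars' (status $status_nchars)"
echo "no newline, plain read:     '$res_noeol' (status $status_noeol)"
```

The third case is the one the "\n" in the ack tag guards against: a
script using plain read on a long-lived descriptor would not get
control back until a newline shows up.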

~Alexey

> 
> jirka
> 


end of thread, other threads:[~2020-06-24 11:48 UTC | newest]

Thread overview: 44+ messages
2020-06-03 15:47 [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes Alexey Budankov
2020-06-03 15:52 ` [PATCH v7 01/13] tools/libperf: introduce notion of static polled file descriptors Alexey Budankov
2020-06-05 10:50   ` Jiri Olsa
2020-06-05 11:38     ` Jiri Olsa
2020-06-05 16:15       ` Alexey Budankov
2020-06-08  8:08         ` Alexey Budankov
2020-06-08  8:43           ` Jiri Olsa
2020-06-08  9:54             ` Alexey Budankov
2020-06-08 15:05               ` Alexey Budankov
2020-06-08 16:07               ` Jiri Olsa
2020-06-08 16:43                 ` Alexey Budankov
2020-06-08 17:18                   ` Alexey Budankov
2020-06-09 14:56                     ` Jiri Olsa
2020-06-09 18:51                       ` Alexey Budankov
2020-06-15 13:13                       ` Alexey Budankov
2020-06-15 17:38                       ` Alexey Budankov
2020-06-15  5:20                 ` Alexey Budankov
2020-06-15 12:30                   ` Jiri Olsa
2020-06-15 14:37                     ` Alexey Budankov
2020-06-15 16:58                       ` Jiri Olsa
2020-06-17  9:27                         ` Jiri Olsa
2020-06-17  9:39                           ` Alexey Budankov
2020-06-22  9:47                         ` Alexey Budankov
2020-06-22 10:21                           ` Jiri Olsa
2020-06-22 10:50                             ` Alexey Budankov
2020-06-22 12:11                               ` Jiri Olsa
2020-06-22 14:04                                 ` Alexey Budankov
2020-06-23 14:54                                   ` Jiri Olsa
2020-06-05 11:50     ` Alexey Budankov
2020-06-03 15:53 ` [PATCH v7 02/13] perf evlist: introduce control " Alexey Budankov
2020-06-03 15:54 ` [PATCH v7 03/13] perf evlist: implement control command handling functions Alexey Budankov
2020-06-23 14:54   ` Jiri Olsa
2020-06-24 11:48     ` Alexey Budankov
2020-06-03 15:55 ` [PATCH v7 04/13] perf stat: factor out body of event handling loop for system wide Alexey Budankov
2020-06-03 15:56 ` [PATCH v7 05/13] perf stat: move target check to loop control statement Alexey Budankov
2020-06-03 15:57 ` [PATCH v7 06/13] perf stat: factor out body of event handling loop for fork case Alexey Budankov
2020-06-03 15:57 ` [PATCH v7 07/13] perf stat: factor out event handling loop into dispatch_events() Alexey Budankov
2020-06-03 15:58 ` [PATCH v7 08/13] perf stat: extend -D,--delay option with -1 value Alexey Budankov
2020-06-03 15:59 ` [PATCH v7 09/13] perf stat: implement control commands handling Alexey Budankov
2020-06-03 15:59 ` [PATCH v7 10/13] perf stat: introduce --ctl-fd[-ack] options Alexey Budankov
2020-06-03 16:00 ` [PATCH v7 11/13] perf record: extend -D,--delay option with -1 value Alexey Budankov
2020-06-03 16:01 ` [PATCH v7 12/13] perf record: implement control commands handling Alexey Budankov
2020-06-03 16:02 ` [PATCH v7 13/13] perf record: introduce --ctl-fd[-ack] options Alexey Budankov
2020-06-05  7:47 ` [PATCH v7 00/13] perf: support enable and disable commands in stat and record modes Alexey Budankov
