LKML Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH] perf record: Use an eventfd to wakeup when done
@ 2020-05-08  4:56 Anand K Mistry
  2020-05-11 11:28 ` Jiri Olsa
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Anand K Mistry @ 2020-05-08  4:56 UTC (permalink / raw)
  To: linux-perf-users
  Cc: Anand K Mistry, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar, Jiri Olsa, Mark Rutland, Namhyung Kim,
	Peter Zijlstra, linux-kernel

The setting and checking of 'done' contains a rare race where the signal
handler setting 'done' is run after checking to break the loop, but
before waiting in evlist__poll(). In this case, the main loop won't wake
up until either another signal is sent, or the perf data fd causes a
wake up.

The following simple script can trigger this condition (but you might
need to run it for several hours):
for ((i = 0; i >= 0; i++)) ; do
  echo "Loop $i"
  delay=$(echo "scale=4; 0.1 * $RANDOM/32768" | bc)
  ./perf record -- sleep 30000000 >/dev/null&
  pid=$!
  sleep $delay
  kill -TERM $pid
  echo "PID $pid"
  wait $pid
done

At some point, the loop will stall. Adding logging, even though perf has
received the SIGTERM and set 'done = 1', perf will remain sleeping until
a second signal is sent.

Signed-off-by: Anand K Mistry <amistry@google.com>

---

 tools/perf/builtin-record.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 1ab349abe90469..ce5fc3860131d2 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -53,6 +53,7 @@
 #include <unistd.h>
 #include <sched.h>
 #include <signal.h>
+#include <sys/eventfd.h>
 #include <sys/mman.h>
 #include <sys/wait.h>
 #include <sys/types.h>
@@ -518,15 +519,19 @@ static int record__pushfn(struct mmap *map, void *to, void *bf, size_t size)
 
 static volatile int signr = -1;
 static volatile int child_finished;
+static int done_fd = -1;
 
 static void sig_handler(int sig)
 {
+	u64 tmp = 1;
 	if (sig == SIGCHLD)
 		child_finished = 1;
 	else
 		signr = sig;
 
 	done = 1;
+	if (write(done_fd, &tmp, sizeof(tmp)) < 0)
+		pr_err("failed to signal wakeup fd\n");
 }
 
 static void sigsegv_handler(int sig)
@@ -1424,6 +1429,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	int fd;
 	float ratio = 0;
 
+	done_fd = eventfd(0, EFD_NONBLOCK);
+	evlist__add_pollfd(rec->evlist, done_fd);
+
 	atexit(record__sig_exit);
 	signal(SIGCHLD, sig_handler);
 	signal(SIGINT, sig_handler);
-- 
2.26.2.645.ge9eca65c58-goog


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf record: Use an eventfd to wakeup when done
  2020-05-08  4:56 [PATCH] perf record: Use an eventfd to wakeup when done Anand K Mistry
@ 2020-05-11 11:28 ` Jiri Olsa
  2020-05-12  4:59 ` Anand K Mistry
  2020-05-13  2:20 ` [PATCH v3] " Anand K Mistry
  2 siblings, 0 replies; 12+ messages in thread
From: Jiri Olsa @ 2020-05-11 11:28 UTC (permalink / raw)
  To: Anand K Mistry
  Cc: linux-perf-users, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar, Mark Rutland, Namhyung Kim, Peter Zijlstra,
	linux-kernel

On Fri, May 08, 2020 at 02:56:43PM +1000, Anand K Mistry wrote:
> The setting and checking of 'done' contains a rare race where the signal
> handler setting 'done' is run after checking to break the loop, but
> before waiting in evlist__poll(). In this case, the main loop won't wake
> up until either another signal is sent, or the perf data fd causes a
> wake up.
> 
> The following simple script can trigger this condition (but you might
> need to run it for several hours):
> for ((i = 0; i >= 0; i++)) ; do
>   echo "Loop $i"
>   delay=$(echo "scale=4; 0.1 * $RANDOM/32768" | bc)
>   ./perf record -- sleep 30000000 >/dev/null&
>   pid=$!
>   sleep $delay
>   kill -TERM $pid
>   echo "PID $pid"
>   wait $pid
> done
> 
> At some point, the loop will stall. Adding logging, even though perf has
> received the SIGTERM and set 'done = 1', perf will remain sleeping until
> a second signal is sent.

so it's just few instructions in between the check and the evlist__poll

         if (done || draining)
               break;
         err = evlist__poll(rec->evlist, -1);

nice catch!

SNIP

> @@ -518,15 +519,19 @@ static int record__pushfn(struct mmap *map, void *to, void *bf, size_t size)
>  
>  static volatile int signr = -1;
>  static volatile int child_finished;
> +static int done_fd = -1;
>  
>  static void sig_handler(int sig)
>  {
> +	u64 tmp = 1;
>  	if (sig == SIGCHLD)
>  		child_finished = 1;
>  	else
>  		signr = sig;
>  
>  	done = 1;

could you please put some explaining comment in here,
so we are not confused by this in few months ;-)

> +	if (write(done_fd, &tmp, sizeof(tmp)) < 0)
> +		pr_err("failed to signal wakeup fd\n");
>  }
>  
>  static void sigsegv_handler(int sig)
> @@ -1424,6 +1429,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>  	int fd;
>  	float ratio = 0;
>  
> +	done_fd = eventfd(0, EFD_NONBLOCK);
> +	evlist__add_pollfd(rec->evlist, done_fd);

both of those can fail, please check the return values

thanks,
jirka

> +
>  	atexit(record__sig_exit);
>  	signal(SIGCHLD, sig_handler);
>  	signal(SIGINT, sig_handler);
> -- 
> 2.26.2.645.ge9eca65c58-goog
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] perf record: Use an eventfd to wakeup when done
  2020-05-08  4:56 [PATCH] perf record: Use an eventfd to wakeup when done Anand K Mistry
  2020-05-11 11:28 ` Jiri Olsa
@ 2020-05-12  4:59 ` Anand K Mistry
  2020-05-12 12:12   ` Jiri Olsa
  2020-05-13  2:20 ` [PATCH v3] " Anand K Mistry
  2 siblings, 1 reply; 12+ messages in thread
From: Anand K Mistry @ 2020-05-12  4:59 UTC (permalink / raw)
  To: linux-perf-users
  Cc: Anand K Mistry, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar, Jiri Olsa, Mark Rutland, Namhyung Kim,
	Peter Zijlstra, linux-kernel

The setting and checking of 'done' contains a rare race where the signal
handler setting 'done' is run after checking to break the loop, but
before waiting in evlist__poll(). In this case, the main loop won't wake
up until either another signal is sent, or the perf data fd causes a
wake up.

The following simple script can trigger this condition (but you might
need to run it for several hours):
for ((i = 0; i >= 0; i++)) ; do
  echo "Loop $i"
  delay=$(echo "scale=4; 0.1 * $RANDOM/32768" | bc)
  ./perf record -- sleep 30000000 >/dev/null&
  pid=$!
  sleep $delay
  kill -TERM $pid
  echo "PID $pid"
  wait $pid
done

At some point, the loop will stall. Adding logging, even though perf has
received the SIGTERM and set 'done = 1', perf will remain sleeping until
a second signal is sent.

Signed-off-by: Anand K Mistry <amistry@google.com>

---

 tools/perf/builtin-record.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 1ab349abe90469..099ecaa66732a2 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -53,6 +53,7 @@
 #include <unistd.h>
 #include <sched.h>
 #include <signal.h>
+#include <sys/eventfd.h>
 #include <sys/mman.h>
 #include <sys/wait.h>
 #include <sys/types.h>
@@ -518,15 +519,28 @@ static int record__pushfn(struct mmap *map, void *to, void *bf, size_t size)
 
 static volatile int signr = -1;
 static volatile int child_finished;
+static int done_fd = -1;
 
 static void sig_handler(int sig)
 {
+	u64 tmp = 1;
 	if (sig == SIGCHLD)
 		child_finished = 1;
 	else
 		signr = sig;
 
 	done = 1;
+
+	/*
+	 * It is possible for this signal handler to run after done is checked
+	 * in the main loop, but before the perf counter fds are polled. If this
+	 * happens, the poll() will continue to wait even though done is set,
+	 * and will only break out if either another signal is received, or the
+	 * counters are ready for read. To ensure the poll() doesn't sleep when
+	 * done is set, use an eventfd (done_fd) to wake up the poll().
+	 */
+	if (write(done_fd, &tmp, sizeof(tmp)) < 0)
+		pr_err("failed to signal wakeup fd\n");
 }
 
 static void sigsegv_handler(int sig)
@@ -1424,6 +1438,17 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	int fd;
 	float ratio = 0;
 
+	done_fd = eventfd(0, EFD_NONBLOCK);
+	if (done_fd < 0) {
+		pr_err("Failed to create wakeup eventfd, error: %m\n");
+		return -1;
+	}
+	err = evlist__add_pollfd(rec->evlist, done_fd);
+	if (err < 0) {
+		pr_err("Failed to add wakeup eventfd to poll list\n");
+		return -1;
+	}
+
 	atexit(record__sig_exit);
 	signal(SIGCHLD, sig_handler);
 	signal(SIGINT, sig_handler);
-- 
2.26.2.645.ge9eca65c58-goog


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf record: Use an eventfd to wakeup when done
  2020-05-12  4:59 ` Anand K Mistry
@ 2020-05-12 12:12   ` Jiri Olsa
  2020-05-12 14:12     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 12+ messages in thread
From: Jiri Olsa @ 2020-05-12 12:12 UTC (permalink / raw)
  To: Anand K Mistry
  Cc: linux-perf-users, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar, Mark Rutland, Namhyung Kim, Peter Zijlstra,
	linux-kernel

On Tue, May 12, 2020 at 02:59:36PM +1000, Anand K Mistry wrote:

SNIP

> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 1ab349abe90469..099ecaa66732a2 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -53,6 +53,7 @@
>  #include <unistd.h>
>  #include <sched.h>
>  #include <signal.h>
> +#include <sys/eventfd.h>
>  #include <sys/mman.h>
>  #include <sys/wait.h>
>  #include <sys/types.h>
> @@ -518,15 +519,28 @@ static int record__pushfn(struct mmap *map, void *to, void *bf, size_t size)
>  
>  static volatile int signr = -1;
>  static volatile int child_finished;
> +static int done_fd = -1;
>  
>  static void sig_handler(int sig)
>  {
> +	u64 tmp = 1;
>  	if (sig == SIGCHLD)
>  		child_finished = 1;
>  	else
>  		signr = sig;
>  
>  	done = 1;
> +
> +	/*
> +	 * It is possible for this signal handler to run after done is checked
> +	 * in the main loop, but before the perf counter fds are polled. If this
> +	 * happens, the poll() will continue to wait even though done is set,
> +	 * and will only break out if either another signal is received, or the
> +	 * counters are ready for read. To ensure the poll() doesn't sleep when
> +	 * done is set, use an eventfd (done_fd) to wake up the poll().
> +	 */
> +	if (write(done_fd, &tmp, sizeof(tmp)) < 0)
> +		pr_err("failed to signal wakeup fd\n");
>  }
>  
>  static void sigsegv_handler(int sig)
> @@ -1424,6 +1438,17 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>  	int fd;
>  	float ratio = 0;
>  
> +	done_fd = eventfd(0, EFD_NONBLOCK);
> +	if (done_fd < 0) {
> +		pr_err("Failed to create wakeup eventfd, error: %m\n");
> +		return -1;
> +	}
> +	err = evlist__add_pollfd(rec->evlist, done_fd);
> +	if (err < 0) {
> +		pr_err("Failed to add wakeup eventfd to poll list\n");
> +		return -1;
> +	}

sorry I did not notice before, but I think we also
need to close done_fd descriptor on the exit path

also please change subject to PATCHv3 for the next version

thanks,
jirka

> +
>  	atexit(record__sig_exit);
>  	signal(SIGCHLD, sig_handler);
>  	signal(SIGINT, sig_handler);
> -- 
> 2.26.2.645.ge9eca65c58-goog
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf record: Use an eventfd to wakeup when done
  2020-05-12 12:12   ` Jiri Olsa
@ 2020-05-12 14:12     ` Arnaldo Carvalho de Melo
  2020-05-13  2:30       ` Anand K. Mistry
  0 siblings, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-05-12 14:12 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Anand K Mistry, linux-perf-users, Alexander Shishkin,
	Ingo Molnar, Mark Rutland, Namhyung Kim, Peter Zijlstra,
	linux-kernel

Em Tue, May 12, 2020 at 02:12:32PM +0200, Jiri Olsa escreveu:
> On Tue, May 12, 2020 at 02:59:36PM +1000, Anand K Mistry wrote:
> 
> SNIP
> 
> > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> > index 1ab349abe90469..099ecaa66732a2 100644
> > --- a/tools/perf/builtin-record.c
> > +++ b/tools/perf/builtin-record.c
> > @@ -53,6 +53,7 @@
> >  #include <unistd.h>
> >  #include <sched.h>
> >  #include <signal.h>
> > +#include <sys/eventfd.h>
> >  #include <sys/mman.h>
> >  #include <sys/wait.h>
> >  #include <sys/types.h>
> > @@ -518,15 +519,28 @@ static int record__pushfn(struct mmap *map, void *to, void *bf, size_t size)
> >  
> >  static volatile int signr = -1;
> >  static volatile int child_finished;
> > +static int done_fd = -1;
> >  
> >  static void sig_handler(int sig)
> >  {
> > +	u64 tmp = 1;
> >  	if (sig == SIGCHLD)
> >  		child_finished = 1;
> >  	else
> >  		signr = sig;
> >  
> >  	done = 1;
> > +
> > +	/*
> > +	 * It is possible for this signal handler to run after done is checked
> > +	 * in the main loop, but before the perf counter fds are polled. If this
> > +	 * happens, the poll() will continue to wait even though done is set,
> > +	 * and will only break out if either another signal is received, or the
> > +	 * counters are ready for read. To ensure the poll() doesn't sleep when
> > +	 * done is set, use an eventfd (done_fd) to wake up the poll().
> > +	 */
> > +	if (write(done_fd, &tmp, sizeof(tmp)) < 0)
> > +		pr_err("failed to signal wakeup fd\n");
> >  }
> >  
> >  static void sigsegv_handler(int sig)
> > @@ -1424,6 +1438,17 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
> >  	int fd;
> >  	float ratio = 0;
> >  
> > +	done_fd = eventfd(0, EFD_NONBLOCK);
> > +	if (done_fd < 0) {
> > +		pr_err("Failed to create wakeup eventfd, error: %m\n");
> > +		return -1;
> > +	}
> > +	err = evlist__add_pollfd(rec->evlist, done_fd);
> > +	if (err < 0) {
> > +		pr_err("Failed to add wakeup eventfd to poll list\n");
> > +		return -1;
> > +	}
> 
> sorry I did not notice before, but I think we also
> need to close done_fd descriptor on the exit path
> 
> also please change subject to PATCHv3 for the next version

Yeah, and, and don't take this as a requirement for this patch to be
processed, this can be made as a follow up patch by you or someone else
(me, maybe :)), that maybe tools/perf/builtin-top.c and
tools/perf/builtin-trace.c have the same issue?

Could you please take a look there as well?

- Arnaldo
 
> thanks,
> jirka
> 
> > +
> >  	atexit(record__sig_exit);
> >  	signal(SIGCHLD, sig_handler);
> >  	signal(SIGINT, sig_handler);
> > -- 
> > 2.26.2.645.ge9eca65c58-goog
> > 
> 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v3] perf record: Use an eventfd to wakeup when done
  2020-05-08  4:56 [PATCH] perf record: Use an eventfd to wakeup when done Anand K Mistry
  2020-05-11 11:28 ` Jiri Olsa
  2020-05-12  4:59 ` Anand K Mistry
@ 2020-05-13  2:20 ` Anand K Mistry
  2020-05-13 11:39   ` Jiri Olsa
  2020-05-23 13:34   ` Andi Kleen
  2 siblings, 2 replies; 12+ messages in thread
From: Anand K Mistry @ 2020-05-13  2:20 UTC (permalink / raw)
  To: linux-perf-users
  Cc: Anand K Mistry, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar, Jiri Olsa, Mark Rutland, Namhyung Kim,
	Peter Zijlstra, linux-kernel

The setting and checking of 'done' contains a rare race where the signal
handler setting 'done' is run after checking to break the loop, but
before waiting in evlist__poll(). In this case, the main loop won't wake
up until either another signal is sent, or the perf data fd causes a
wake up.

The following simple script can trigger this condition (but you might
need to run it for several hours):
for ((i = 0; i >= 0; i++)) ; do
  echo "Loop $i"
  delay=$(echo "scale=4; 0.1 * $RANDOM/32768" | bc)
  ./perf record -- sleep 30000000 >/dev/null&
  pid=$!
  sleep $delay
  kill -TERM $pid
  echo "PID $pid"
  wait $pid
done

At some point, the loop will stall. Adding logging, even though perf has
received the SIGTERM and set 'done = 1', perf will remain sleeping until
a second signal is sent.

Signed-off-by: Anand K Mistry <amistry@google.com>

---

Changes in v3:
- Move done_fd creation to below session initialisation
- Close done_fd on exit
- Log errno when write(done_fd) fails

Changes in v2:
- Added comment to signal handler explaining why the eventfd is added
- Added error handling when creating done_fd

 tools/perf/builtin-record.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 1ab349abe90469..a1af6857f24748 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -53,6 +53,7 @@
 #include <unistd.h>
 #include <sched.h>
 #include <signal.h>
+#include <sys/eventfd.h>
 #include <sys/mman.h>
 #include <sys/wait.h>
 #include <sys/types.h>
@@ -518,15 +519,28 @@ static int record__pushfn(struct mmap *map, void *to, void *bf, size_t size)
 
 static volatile int signr = -1;
 static volatile int child_finished;
+static int done_fd = -1;
 
 static void sig_handler(int sig)
 {
+	u64 tmp = 1;
 	if (sig == SIGCHLD)
 		child_finished = 1;
 	else
 		signr = sig;
 
 	done = 1;
+
+	/*
+	 * It is possible for this signal handler to run after done is checked
+	 * in the main loop, but before the perf counter fds are polled. If this
+	 * happens, the poll() will continue to wait even though done is set,
+	 * and will only break out if either another signal is received, or the
+	 * counters are ready for read. To ensure the poll() doesn't sleep when
+	 * done is set, use an eventfd (done_fd) to wake up the poll().
+	 */
+	if (write(done_fd, &tmp, sizeof(tmp)) < 0)
+		pr_err("failed to signal wakeup fd, error: %m\n");
 }
 
 static void sigsegv_handler(int sig)
@@ -1466,6 +1480,19 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		return -1;
 	}
 
+	done_fd = eventfd(0, EFD_NONBLOCK);
+	if (done_fd < 0) {
+		pr_err("Failed to create wakeup eventfd, error: %m\n");
+		status = -1;
+		goto out_delete_session;
+	}
+	err = evlist__add_pollfd(rec->evlist, done_fd);
+	if (err < 0) {
+		pr_err("Failed to add wakeup eventfd to poll list\n");
+		status = err;
+		goto out_delete_session;
+	}
+
 	session->header.env.comp_type  = PERF_COMP_ZSTD;
 	session->header.env.comp_level = rec->opts.comp_level;
 
@@ -1827,6 +1854,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	}
 
 out_delete_session:
+	if (done_fd >= 0)
+		close(done_fd);
 	zstd_fini(&session->zstd_data);
 	perf_session__delete(session);
 
-- 
2.26.2.645.ge9eca65c58-goog


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf record: Use an eventfd to wakeup when done
  2020-05-12 14:12     ` Arnaldo Carvalho de Melo
@ 2020-05-13  2:30       ` Anand K. Mistry
  0 siblings, 0 replies; 12+ messages in thread
From: Anand K. Mistry @ 2020-05-13  2:30 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, linux-perf-users, Alexander Shishkin, Ingo Molnar,
	Mark Rutland, Namhyung Kim, Peter Zijlstra, linux-kernel

On Wed, 13 May 2020 at 00:12, Arnaldo Carvalho de Melo
<arnaldo.melo@gmail.com> wrote:
>
> Em Tue, May 12, 2020 at 02:12:32PM +0200, Jiri Olsa escreveu:
> > On Tue, May 12, 2020 at 02:59:36PM +1000, Anand K Mistry wrote:
> >
> > SNIP
> >
> > > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> > > index 1ab349abe90469..099ecaa66732a2 100644
> > > --- a/tools/perf/builtin-record.c
> > > +++ b/tools/perf/builtin-record.c
> > > @@ -53,6 +53,7 @@
> > >  #include <unistd.h>
> > >  #include <sched.h>
> > >  #include <signal.h>
> > > +#include <sys/eventfd.h>
> > >  #include <sys/mman.h>
> > >  #include <sys/wait.h>
> > >  #include <sys/types.h>
> > > @@ -518,15 +519,28 @@ static int record__pushfn(struct mmap *map, void *to, void *bf, size_t size)
> > >
> > >  static volatile int signr = -1;
> > >  static volatile int child_finished;
> > > +static int done_fd = -1;
> > >
> > >  static void sig_handler(int sig)
> > >  {
> > > +   u64 tmp = 1;
> > >     if (sig == SIGCHLD)
> > >             child_finished = 1;
> > >     else
> > >             signr = sig;
> > >
> > >     done = 1;
> > > +
> > > +   /*
> > > +    * It is possible for this signal handler to run after done is checked
> > > +    * in the main loop, but before the perf counter fds are polled. If this
> > > +    * happens, the poll() will continue to wait even though done is set,
> > > +    * and will only break out if either another signal is received, or the
> > > +    * counters are ready for read. To ensure the poll() doesn't sleep when
> > > +    * done is set, use an eventfd (done_fd) to wake up the poll().
> > > +    */
> > > +   if (write(done_fd, &tmp, sizeof(tmp)) < 0)
> > > +           pr_err("failed to signal wakeup fd\n");
> > >  }
> > >
> > >  static void sigsegv_handler(int sig)
> > > @@ -1424,6 +1438,17 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
> > >     int fd;
> > >     float ratio = 0;
> > >
> > > +   done_fd = eventfd(0, EFD_NONBLOCK);
> > > +   if (done_fd < 0) {
> > > +           pr_err("Failed to create wakeup eventfd, error: %m\n");
> > > +           return -1;
> > > +   }
> > > +   err = evlist__add_pollfd(rec->evlist, done_fd);
> > > +   if (err < 0) {
> > > +           pr_err("Failed to add wakeup eventfd to poll list\n");
> > > +           return -1;
> > > +   }
> >
> > sorry I did not notice before, but I think we also
> > need to close done_fd descriptor on the exit path
> >
> > also please change subject to PATCHv3 for the next version

Apologies. I'm still getting the hang of this.

>
> Yeah, and, and don't take this as a requirement for this patch to be
> processed, this can be made as a follow up patch by you or someone else
> (me, maybe :)), that maybe tools/perf/builtin-top.c and
> tools/perf/builtin-trace.c have the same issue?
>
> Could you please take a look there as well?

I looked at 'top', 'trace', and 'kvm'. kvm doesn't really have this
issue because
the poll() has a 100ms timeout. Even though it's technically affected,
the timeout
will make it unnoticeable (just delaying the exit for 100ms). top is
in the same boat
(uses a timeout).

trace is the affected one because it has the following code:
int timeout = done ? 100 : -1;
if (!draining && evlist__poll(evlist, timeout) > 0) {

Different logic, but still a gap and an indefinite timeout.

>
> - Arnaldo
>
> > thanks,
> > jirka
> >
> > > +
> > >     atexit(record__sig_exit);
> > >     signal(SIGCHLD, sig_handler);
> > >     signal(SIGINT, sig_handler);
> > > --
> > > 2.26.2.645.ge9eca65c58-goog
> > >
> >
>
> --
>
> - Arnaldo



-- 
Anand K. Mistry
Software Engineer
Google Australia

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3] perf record: Use an eventfd to wakeup when done
  2020-05-13  2:20 ` [PATCH v3] " Anand K Mistry
@ 2020-05-13 11:39   ` Jiri Olsa
  2020-05-13 14:03     ` Arnaldo Carvalho de Melo
  2020-05-20 15:47     ` Arnaldo Carvalho de Melo
  2020-05-23 13:34   ` Andi Kleen
  1 sibling, 2 replies; 12+ messages in thread
From: Jiri Olsa @ 2020-05-13 11:39 UTC (permalink / raw)
  To: Anand K Mistry
  Cc: linux-perf-users, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar, Mark Rutland, Namhyung Kim, Peter Zijlstra,
	linux-kernel

On Wed, May 13, 2020 at 12:20:23PM +1000, Anand K Mistry wrote:
> The setting and checking of 'done' contains a rare race where the signal
> handler setting 'done' is run after checking to break the loop, but
> before waiting in evlist__poll(). In this case, the main loop won't wake
> up until either another signal is sent, or the perf data fd causes a
> wake up.
> 
> The following simple script can trigger this condition (but you might
> need to run it for several hours):
> for ((i = 0; i >= 0; i++)) ; do
>   echo "Loop $i"
>   delay=$(echo "scale=4; 0.1 * $RANDOM/32768" | bc)
>   ./perf record -- sleep 30000000 >/dev/null&
>   pid=$!
>   sleep $delay
>   kill -TERM $pid
>   echo "PID $pid"
>   wait $pid
> done
> 
> At some point, the loop will stall. Adding logging, even though perf has
> received the SIGTERM and set 'done = 1', perf will remain sleeping until
> a second signal is sent.
> 
> Signed-off-by: Anand K Mistry <amistry@google.com>
> 
> ---
> 
> Changes in v3:
> - Move done_fd creation to below session initialisation
> - Close done_fd on exit
> - Log errno when write(done_fd) fails

Acked-by: Jiri Olsa <jolsa@redhat.com>

thanks,
jirka


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3] perf record: Use an eventfd to wakeup when done
  2020-05-13 11:39   ` Jiri Olsa
@ 2020-05-13 14:03     ` Arnaldo Carvalho de Melo
  2020-05-20 15:47     ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-05-13 14:03 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Anand K Mistry, linux-perf-users, Alexander Shishkin,
	Ingo Molnar, Mark Rutland, Namhyung Kim, Peter Zijlstra,
	linux-kernel

Em Wed, May 13, 2020 at 01:39:41PM +0200, Jiri Olsa escreveu:
> On Wed, May 13, 2020 at 12:20:23PM +1000, Anand K Mistry wrote:
> > The setting and checking of 'done' contains a rare race where the signal
> > handler setting 'done' is run after checking to break the loop, but
> > before waiting in evlist__poll(). In this case, the main loop won't wake
> > up until either another signal is sent, or the perf data fd causes a
> > wake up.
> > 
> > The following simple script can trigger this condition (but you might
> > need to run it for several hours):
> > for ((i = 0; i >= 0; i++)) ; do
> >   echo "Loop $i"
> >   delay=$(echo "scale=4; 0.1 * $RANDOM/32768" | bc)
> >   ./perf record -- sleep 30000000 >/dev/null&
> >   pid=$!
> >   sleep $delay
> >   kill -TERM $pid
> >   echo "PID $pid"
> >   wait $pid
> > done
> > 
> > At some point, the loop will stall. Adding logging, even though perf has
> > received the SIGTERM and set 'done = 1', perf will remain sleeping until
> > a second signal is sent.
> > 
> > Signed-off-by: Anand K Mistry <amistry@google.com>
> > 
> > ---
> > 
> > Changes in v3:
> > - Move done_fd creation to below session initialisation
> > - Close done_fd on exit
> > - Log errno when write(done_fd) fails
> 
> Acked-by: Jiri Olsa <jolsa@redhat.com>

Thanks, applied.

- Arnaldo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3] perf record: Use an eventfd to wakeup when done
  2020-05-13 11:39   ` Jiri Olsa
  2020-05-13 14:03     ` Arnaldo Carvalho de Melo
@ 2020-05-20 15:47     ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-05-20 15:47 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Anand K Mistry, linux-perf-users, Alexander Shishkin,
	Ingo Molnar, Mark Rutland, Namhyung Kim, Peter Zijlstra,
	linux-kernel

Em Wed, May 13, 2020 at 01:39:41PM +0200, Jiri Olsa escreveu:
> On Wed, May 13, 2020 at 12:20:23PM +1000, Anand K Mistry wrote:
> > The setting and checking of 'done' contains a rare race where the signal
> > handler setting 'done' is run after checking to break the loop, but
> > before waiting in evlist__poll(). In this case, the main loop won't wake
> > up until either another signal is sent, or the perf data fd causes a
> > wake up.
> > 
> > The following simple script can trigger this condition (but you might
> > need to run it for several hours):
> > for ((i = 0; i >= 0; i++)) ; do
> >   echo "Loop $i"
> >   delay=$(echo "scale=4; 0.1 * $RANDOM/32768" | bc)
> >   ./perf record -- sleep 30000000 >/dev/null&
> >   pid=$!
> >   sleep $delay
> >   kill -TERM $pid
> >   echo "PID $pid"
> >   wait $pid
> > done
> > 
> > At some point, the loop will stall. Adding logging, even though perf has
> > received the SIGTERM and set 'done = 1', perf will remain sleeping until
> > a second signal is sent.
> > 
> > Signed-off-by: Anand K Mistry <amistry@google.com>
> > 
> > ---
> > 
> > Changes in v3:
> > - Move done_fd creation to below session initialisation
> > - Close done_fd on exit
> > - Log errno when write(done_fd) fails
> 
> Acked-by: Jiri Olsa <jolsa@redhat.com>

I've made this dependent on HAVE_EVENTFD_SUPPORT, so that it continues
building on older systems,

- Arnaldo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3] perf record: Use an eventfd to wakeup when done
  2020-05-13  2:20 ` [PATCH v3] " Anand K Mistry
  2020-05-13 11:39   ` Jiri Olsa
@ 2020-05-23 13:34   ` Andi Kleen
  2020-05-25  1:43     ` Anand K. Mistry
  1 sibling, 1 reply; 12+ messages in thread
From: Andi Kleen @ 2020-05-23 13:34 UTC (permalink / raw)
  To: Anand K Mistry
  Cc: linux-perf-users, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar, Jiri Olsa, Mark Rutland, Namhyung Kim,
	Peter Zijlstra, linux-kernel

Anand K Mistry <amistry@google.com> writes:
>  	}
>  
> +	done_fd = eventfd(0, EFD_NONBLOCK);

This will make perf depend on a recent glibc or other library
that implements eventfd. Wouldn't surprise me if some kind
of build time check is needed for this to pass all of Arnaldo's
built tests.


-Andi

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3] perf record: Use an eventfd to wakeup when done
  2020-05-23 13:34   ` Andi Kleen
@ 2020-05-25  1:43     ` Anand K. Mistry
  0 siblings, 0 replies; 12+ messages in thread
From: Anand K. Mistry @ 2020-05-25  1:43 UTC (permalink / raw)
  To: Andi Kleen
  Cc: linux-perf-users, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar, Jiri Olsa, Mark Rutland, Namhyung Kim,
	Peter Zijlstra, linux-kernel

On Sat, 23 May 2020 at 23:35, Andi Kleen <ak@linux.intel.com> wrote:
>
> Anand K Mistry <amistry@google.com> writes:
> >       }
> >
> > +     done_fd = eventfd(0, EFD_NONBLOCK);
>
> This will make perf depend on a recent glibc or other library
> that implements eventfd. Wouldn't surprise me if some kind
> of build time check is needed for this to pass all of Arnaldo's
> built tests.

Looks like Arnaldo made that change when merging:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=perf/core&id=e9db221d37f91409040cf7f3fbed08b44e055ae9

This makes me curious. How old a kernel should modern tools support?
From the man page, eventfd was added in 2.6.22 (and eventfd2 in
2.6.27), which was 2007 (or 2008 for eventfd2) and glibc-2.8 which was
2008. I understand the kernel's policy of never breaking userspace,
but what about userspace tools?

>
>
> -Andi



-- 
Anand K. Mistry
Software Engineer
Google Australia

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, back to index

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-08  4:56 [PATCH] perf record: Use an eventfd to wakeup when done Anand K Mistry
2020-05-11 11:28 ` Jiri Olsa
2020-05-12  4:59 ` Anand K Mistry
2020-05-12 12:12   ` Jiri Olsa
2020-05-12 14:12     ` Arnaldo Carvalho de Melo
2020-05-13  2:30       ` Anand K. Mistry
2020-05-13  2:20 ` [PATCH v3] " Anand K Mistry
2020-05-13 11:39   ` Jiri Olsa
2020-05-13 14:03     ` Arnaldo Carvalho de Melo
2020-05-20 15:47     ` Arnaldo Carvalho de Melo
2020-05-23 13:34   ` Andi Kleen
2020-05-25  1:43     ` Anand K. Mistry

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git