perf_evlist__filter_pollfd() in trace__run()

* perf_evlist__filter_pollfd() in trace__run()
@ 2015-01-29 23:55 Sukadev Bhattiprolu
  2015-01-30 15:22 ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 4+ messages in thread
From: Sukadev Bhattiprolu @ 2015-01-29 23:55 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel, zhlbj

Arnaldo,

On one of our systems we are seeing an intermittent SIGSEGV with

	perf trace sleep 1

and I have question about the 'draining' flag below:
| 
| From 46fb3c21d20415dd2693570c33d0ea6eb8745e04 Mon Sep 17 00:00:00 2001
| From: Arnaldo Carvalho de Melo <acme@redhat.com>
| Date: Mon, 22 Sep 2014 14:39:48 -0300
| Subject: [PATCH 1/1] perf trace: Filter out POLLHUP'ed file descriptors
| 
| So that we don't continue polling on vanished file descriptors, i.e.
| file descriptors for events monitoring threads that exited.
| 
| I.e. the following 'trace' command now exits as expected, instead
| of staying in an eternal loop:
| 
|       $ sleep 5s &
|       $ trace -p `pidof sleep`
| 
| Reported-by: Jiri Olsa <jolsa@redhat.com>
| Cc: Adrian Hunter <adrian.hunter@intel.com>
| Cc: David Ahern <dsahern@gmail.com>
| Cc: Don Zickus <dzickus@redhat.com>
| Cc: Frederic Weisbecker <fweisbec@gmail.com>
| Cc: Jiri Olsa <jolsa@redhat.com>
| Cc: Mike Galbraith <efault@gmx.de>
| Cc: Namhyung Kim <namhyung@kernel.org>
| Cc: Paul Mackerras <paulus@samba.org>
| Cc: Peter Zijlstra <peterz@infradead.org>
| Cc: Stephane Eranian <eranian@google.com>
| Link: http://lkml.kernel.org/n/tip-6qegv786zbf6i8us6t4rxug9@git.kernel.org
| Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| ---
|  tools/perf/builtin-trace.c | 7 ++++++-
|  1 file changed, 6 insertions(+), 1 deletion(-)
| 
| diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
| index b8fedf3..fe39dc6 100644
| --- a/tools/perf/builtin-trace.c
| +++ b/tools/perf/builtin-trace.c
| @@ -2044,6 +2044,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
|  	int err = -1, i;
|  	unsigned long before;
|  	const bool forks = argc > 0;
| +	bool draining = false;
|  	char sbuf[STRERR_BUFSIZE];
|  
|  	trace->live = true;
| @@ -2171,8 +2172,12 @@ next_event:
|  	if (trace->nr_events == before) {
|  		int timeout = done ? 100 : -1;
|  
| -		if (perf_evlist__poll(evlist, timeout) > 0)
| +		if (!draining && perf_evlist__poll(evlist, timeout) > 0) {
| +			if (perf_evlist__filter_pollfd(evlist, POLLERR | POLLHUP) == 0)
| +				draining = true;
| +

If an fd gets into POLLHUP state, perf_evlist__filter_pollfd() removes
("puts") the mmap for the fd. We are seeing that sometimes (frequently)
_all_ fds are in the POLLHUP state and hence their mmap->base are set
to NULL.


|  			goto again;

Now with this goto, we go back and call perf_evlist__mmap_read() which
tries to access the freed mmaps.

Should there be another check to before reading the mmap again ?

I must add that I don't get the SIGSEGV on recent perf-core and the
system where we get the crash, first runs into the following
errors that we are still looking into (maybe related to "ppc64le"
architecture).

	Problems reading syscall 45 information
	Problems reading syscall 5 information
	Problems reading syscall 5 information
	Problems reading syscall 108 information
	Problems reading syscall 108 information
	Problems reading syscall 90 information
	Problems reading syscall 90 information
	Problems reading syscall 6 information

Unlike the SIGSEGV, these errors occur always.

| +		}
|  	} else {
|  		goto again;
|  	}
| -- 
| 1.8.3.1
| 

Following hack seems to fix the SIGSEGV, but then we completely ignore
'draining' flag.

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index fb12645..ac25e16 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2173,8 +2173,10 @@ next_event:
                int timeout = done ? 100 : -1;
 
                if (!draining && perf_evlist__poll(evlist, timeout) > 0) {
-                       if (perf_evlist__filter_pollfd(evlist, POLLERR | POLLHUP) == 0)
+                       if (perf_evlist__filter_pollfd(evlist, POLLERR | POLLHUP) == 0) {
                                draining = true;
+                               goto out_disable;
+                       }
 
                        goto again;
                }


^ permalink raw reply related	[flat|nested] 4+ messages in thread