Re: ping Re: [PATCH] perf script: Add stackcollapse.py script

From: Paolo Bonzini <pbonzini@redhat.com>
To: Jiri Olsa <jolsa@redhat.com>
Cc: linux-kernel@vger.kernel.org, Jiri Olsa <jolsa@kernel.org>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Brendan Gregg <bgregg@netflix.com>
Subject: Re: ping Re: [PATCH] perf script: Add stackcollapse.py script
Date: Fri, 20 May 2016 13:01:41 +0200
Message-ID: <14a4eaa4-6b69-6c8e-6ae7-3a14ee7bb616@redhat.com>
In-Reply-To: <20160415105739.GA8595@krava.redhat.com>

On 15/04/2016 12:57, Jiri Olsa wrote:
> On Fri, Apr 15, 2016 at 07:20:48AM +0200, Paolo Bonzini wrote:
>> On 12/04/2016 15:26, Paolo Bonzini wrote:
>>> Add stackcollapse.py script as an example of parsing call chains, and
>>> also of using optparse to access command line options.
>>>
>>> The flame graph tools include a set of scripts that parse output from
>>> various tools (including "perf script"), remove the offsets in the
>>> function and collapse each stack to a single line.  The website also says
>>> "perf report could have a report style [...] that output folded stacks
>>> directly, obviating the need for stackcollapse-perf.pl", so here it is.
>>>
>>> This script is a Python rewrite of stackcollapse-perf.pl, using the perf
>>> scripting interface to access the perf data directly from Python.
>>>
>>> Cc: Jiri Olsa <jolsa@kernel.org>
>>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>>> Cc: Brendan Gregg <bgregg@netflix.com>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>
>> Sorry for the very early ping, I'm going on vacation and I'm afraid the
>> next time I'd be able to ping would be too late for 4.7. :)
>>
>> Paolo
>>
>>> ---
>>>  tools/perf/scripts/python/bin/stackcollapse-record |   8 ++
>>>  tools/perf/scripts/python/bin/stackcollapse-report |   3 +
>>>  tools/perf/scripts/python/stackcollapse.py         | 127 +++++++++++++++++++++
>>>  3 files changed, 138 insertions(+)
>>>  create mode 100755 tools/perf/scripts/python/bin/stackcollapse-record
>>>  create mode 100755 tools/perf/scripts/python/bin/stackcollapse-report
>>>  create mode 100755 tools/perf/scripts/python/stackcollapse.py
>>>
>>> diff --git a/tools/perf/scripts/python/bin/stackcollapse-record b/tools/perf/scripts/python/bin/stackcollapse-record
>>> new file mode 100755
>>> index 000000000000..9d8f9f0f3a17
>>> --- /dev/null
>>> +++ b/tools/perf/scripts/python/bin/stackcollapse-record
>>> @@ -0,0 +1,8 @@
>>> +#!/bin/sh
>>> +
>>> +#
>>> +# stackcollapse.py can cover all types of perf samples including
>>> +# the tracepoints, so no special record requirements, just record what
>>> +# you want to analyze.
>>> +#
>>> +perf record "$@"
>>> diff --git a/tools/perf/scripts/python/bin/stackcollapse-report b/tools/perf/scripts/python/bin/stackcollapse-report
>>> new file mode 100755
>>> index 000000000000..356b9656393d
>>> --- /dev/null
>>> +++ b/tools/perf/scripts/python/bin/stackcollapse-report
>>> @@ -0,0 +1,3 @@
>>> +#!/bin/sh
>>> +# description: produce callgraphs in short form for scripting use
>>> +perf script -s "$PERF_EXEC_PATH"/scripts/python/stackcollapse.py -- "$@"
>>> diff --git a/tools/perf/scripts/python/stackcollapse.py b/tools/perf/scripts/python/stackcollapse.py
>>> new file mode 100755
>>> index 000000000000..a2dfcda41ae6
>>> --- /dev/null
>>> +++ b/tools/perf/scripts/python/stackcollapse.py
>>> @@ -0,0 +1,127 @@
>>> +#!/usr/bin/env python
>>> +#
>>> +# stackcollapse.py - format perf samples with one line per distinct call stack
>>> +#
>>> +# This script's output has two space-separated fields.  The first is a semicolon
>>> +# separated stack including the program name (from the "comm" field) and the
>>> +# function names from the call stack.  The second is a count:
>>> +#
>>> +#  swapper;start_kernel;rest_init;cpu_idle;default_idle;native_safe_halt 2
>>> +#
>>> +# The file is sorted according to the first field.
>>> +#
>>> +# Input may be created and processed using:
>>> +#
>>> +#  perf record -a -g -F 99 sleep 60
>>> +#  perf script report stackcollapse > out.stacks-folded
>>> +#
>>> +# (perf script record stackcollapse works too).
> 
> IIRC Namhyung added -g folded option recently for report
> so you could do:
> 
> perf report -g folded --stdio
> 
> however we don't seem to have it for perf script, so this might
> be useful until we add the --call-graph support into perf script

While "perf report -g folded" is indeed similar in spirit, it doesn't
provide exactly the same output as expected by the flame graph tools.
The point of this patch is to talk directly to them, and to provide an
example of looking at call stacks from Python.
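
As a rough illustration of the folded format the flame graph tools
consume (one line per distinct stack, root first, followed by a count),
here is a minimal standalone sketch.  It does not use the perf scripting
interface; fold_stacks() and the sample data are made up for the
example, and call chains are assumed to arrive leaf first, as they do in
the quoted script.

```python
from collections import Counter

def fold_stacks(samples):
    """Collapse (comm, callchain) samples into sorted 'stack count' lines.

    Hypothetical helper: each callchain is assumed to be ordered leaf
    first, so it is reversed to put the root right after the comm.
    """
    counts = Counter()
    for comm, callchain in samples:
        # Spaces in comm would break the two-field output format.
        frames = [comm.replace(' ', '_')] + list(reversed(callchain))
        counts[';'.join(frames)] += 1
    return ["%s %d" % (stack, counts[stack]) for stack in sorted(counts)]

# Made-up samples, for illustration only.
samples = [
    ("swapper", ["native_safe_halt", "default_idle", "cpu_idle"]),
    ("swapper", ["native_safe_halt", "default_idle", "cpu_idle"]),
    ("my app", ["main"]),
]

for line in fold_stacks(samples):
    print(line)
```

Identical stacks are merged into one line with a summed count, which is
the aggregation step that "perf script" output alone does not perform.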

Thanks,

Paolo

> jirka
> 
>>> +#
>>> +# Written by Paolo Bonzini <pbonzini@redhat.com>
>>> +# Based on Brendan Gregg's stackcollapse-perf.pl script.
>>> +
>>> +import os
>>> +import sys
>>> +from collections import defaultdict
>>> +from optparse import OptionParser, make_option
>>> +
>>> +sys.path.append(os.environ['PERF_EXEC_PATH'] + \
>>> +                '/scripts/python/Perf-Trace-Util/lib/Perf/Trace')
>>> +
>>> +from perf_trace_context import *
>>> +from Core import *
>>> +from EventClass import *
>>> +
>>> +# command line parsing
>>> +
>>> +option_list = [
>>> +    # formatting options for the bottom entry of the stack
>>> +    make_option("--include-tid", dest="include_tid",
>>> +                 action="store_true", default=False,
>>> +                 help="include thread id in stack"),
>>> +    make_option("--include-pid", dest="include_pid",
>>> +                 action="store_true", default=False,
>>> +                 help="include process id in stack"),
>>> +    make_option("--no-comm", dest="include_comm",
>>> +                 action="store_false", default=True,
>>> +                 help="do not separate stacks according to comm"),
>>> +    make_option("--tidy-java", dest="tidy_java",
>>> +                 action="store_true", default=False,
>>> +                 help="beautify Java signatures"),
>>> +    make_option("--kernel", dest="annotate_kernel",
>>> +                 action="store_true", default=False,
>>> +                 help="annotate kernel functions with _[k]")
>>> +]
>>> +
>>> +parser = OptionParser(option_list=option_list)
>>> +(opts, args) = parser.parse_args()
>>> +
>>> +if len(args) != 0:
>>> +    parser.error("unexpected command line argument")
>>> +if opts.include_tid and not opts.include_comm:
>>> +    parser.error("requesting tid but not comm is invalid")
>>> +if opts.include_pid and not opts.include_comm:
>>> +    parser.error("requesting pid but not comm is invalid")
>>> +
>>> +# event handlers
>>> +
>>> +lines = defaultdict(lambda: 0)
>>> +
>>> +def process_event(param_dict):
>>> +    def tidy_function_name(sym, dso):
>>> +        if sym is None:
>>> +            sym = '[unknown]'
>>> +
>>> +        sym = sym.replace(';', ':')
>>> +        if opts.tidy_java:
>>> +            # the original stackcollapse-perf.pl script gives the
>>> +            # example of converting this:
>>> +            #    Lorg/mozilla/javascript/MemberBox;.<init>(Ljava/lang/reflect/Method;)V
>>> +            # to this:
>>> +            #    org/mozilla/javascript/MemberBox:.init
>>> +            sym = sym.replace('<', '')
>>> +            sym = sym.replace('>', '')
>>> +            if sym.startswith('L') and '/' in sym:
>>> +                sym = sym[1:]
>>> +            try:
>>> +                sym = sym[:sym.index('(')]
>>> +            except ValueError:
>>> +                pass
>>> +
>>> +        if opts.annotate_kernel and dso == '[kernel.kallsyms]':
>>> +            return sym + '_[k]'
>>> +        else:
>>> +            return sym
>>> +
>>> +    stack = list()
>>> +    if 'callchain' in param_dict:
>>> +        for entry in param_dict['callchain']:
>>> +            entry.setdefault('sym', dict())
>>> +            entry['sym'].setdefault('name', None)
>>> +            entry.setdefault('dso', None)
>>> +            stack.append(tidy_function_name(entry['sym']['name'],
>>> +                                            entry['dso']))
>>> +    else:
>>> +        param_dict.setdefault('symbol', None)
>>> +        param_dict.setdefault('dso', None)
>>> +        stack.append(tidy_function_name(param_dict['symbol'],
>>> +                                        param_dict['dso']))
>>> +
>>> +    if opts.include_comm:
>>> +        comm = param_dict["comm"].replace(' ', '_')
>>> +        sep = "-"
>>> +        if opts.include_pid:
>>> +            comm = comm + sep + str(param_dict['sample']['pid'])
>>> +            sep = "/"
>>> +        if opts.include_tid:
>>> +            comm = comm + sep + str(param_dict['sample']['tid'])
>>> +        stack.append(comm)
>>> +
>>> +    stack_string = ';'.join(reversed(stack))
>>> +    lines[stack_string] = lines[stack_string] + 1
>>> +
>>> +def trace_end():
>>> +    # Emit one "stack count" line per distinct stack, sorted by stack.
>>> +    stacks = sorted(lines)
>>> +    for stack in stacks:
>>> +        print("%s %d" % (stack, lines[stack]))
>>>
>
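
The --tidy-java handling quoted above can be tried in isolation.  The
following is only a sketch of that step: tidy_java() is a hypothetical
standalone name, not part of the patch, and it tests for '/' with "in"
rather than with str.find(), which returns -1 (a true value) when the
character is absent.

```python
def tidy_java(sym):
    """Shorten a JVM-style signature the way the quoted script describes.

    Hypothetical standalone version of the --tidy-java branch.
    """
    # ';' is the stack separator in the folded output, so replace it.
    sym = sym.replace(';', ':')
    sym = sym.replace('<', '').replace('>', '')
    # Drop the leading 'L' of a class descriptor such as Lorg/foo/Bar.
    if sym.startswith('L') and '/' in sym:
        sym = sym[1:]
    # Cut the argument list and return type, if any.
    paren = sym.find('(')
    if paren != -1:
        sym = sym[:paren]
    return sym

print(tidy_java(
    "Lorg/mozilla/javascript/MemberBox;.<init>(Ljava/lang/reflect/Method;)V"))
# prints: org/mozilla/javascript/MemberBox:.init
```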