* [PATCH 0/3] bb-perf: scripting to plot buildstats data
@ 2016-11-16 23:05 ` leonardo.sandoval.gonzalez
  0 siblings, 0 replies; 12+ messages in thread
From: leonardo.sandoval.gonzalez @ 2016-11-15 21:19 UTC (permalink / raw)
  To: poky

From: Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>

Buildstats data has been mostly unexplored, mainly due to the lack of tools
to digest it. The script buildstats.sh has been redesigned to be much more
flexible, and the new script buildstats-plot.sh uses it to produce data to
be consumed by gnuplot. The tools used are datamash and gnuplot, so both
must be installed before running the scripts.

Some plots created by buildstats-plot.sh can be found at [1]

[1] https://wiki.yoctoproject.org/wiki/MortyBuildstats


The following changes since commit dc8508f609974cc99606b9042bfa7f870ce80228:

  build-applance-image: Fix to use the release branch for morty (2016-10-26 11:11:10 +0100)

are available in the git repository at:

  git://git.yoctoproject.org/poky-contrib lsandov1/buildstats-plot
  http://git.yoctoproject.org/cgit.cgi/poky-contrib/log/?h=lsandov1/buildstats-plot

Leonardo Sandoval (3):
  buildstats: Place 'Elapsed Time' stat into a single line
  scripts: Specify the stats to take into account
  bb-perf: plot histograms base on buildstats data

 meta/classes/buildstats.bbclass            |   4 +-
 scripts/contrib/bb-perf/buildstats-plot.sh | 157 +++++++++++++++++++++++++++++
 scripts/contrib/bb-perf/buildstats.sh      |  99 ++++++++++++++----
 3 files changed, 241 insertions(+), 19 deletions(-)
 create mode 100755 scripts/contrib/bb-perf/buildstats-plot.sh

-- 
2.1.4




* [PATCH 1/3] buildstats: Place 'Elapsed Time' stat into a single line
  2016-11-16 23:05 ` leonardo.sandoval.gonzalez
@ 2016-11-16 23:05   ` leonardo.sandoval.gonzalez
  -1 siblings, 0 replies; 12+ messages in thread
From: leonardo.sandoval.gonzalez @ 2016-11-15 21:19 UTC (permalink / raw)
  To: poky

From: Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>

All lines except one (the one containing 'Elapsed time') follow the format
'stat: value'. Fix that so post-processing the stats is simpler.
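
As a rough illustration of why the uniform format helps: once every line has
the form 'stat: value', a single sed expression can pull out any stat. The
helper name get_stat and the sample file contents below are made up for this
example and are not part of the patch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of one stat from a buildstats task file.
get_stat() {
    local stat="$1" file="$2"
    sed -n -e "s/^${stat}: \(.*\)/\1/p" "$file"
}

# Made-up sample in the one-stat-per-line layout
sample=$(mktemp)
printf '%s\n' 'busybox-1.24.1-r0: do_compile' \
              'Elapsed time: 12.34 seconds' \
              'utime: 1017' \
              'stime: 220' > "$sample"

get_stat 'utime' "$sample"          # prints 1017
get_stat 'Elapsed time' "$sample"   # prints 12.34 seconds
rm -f "$sample"
```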

Signed-off-by: Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>
---
 meta/classes/buildstats.bbclass | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/meta/classes/buildstats.bbclass b/meta/classes/buildstats.bbclass
index 599a219..57ecc8f 100644
--- a/meta/classes/buildstats.bbclass
+++ b/meta/classes/buildstats.bbclass
@@ -80,8 +80,8 @@ def write_task_data(status, logfile, e, d):
     with open(os.path.join(logfile), "a") as f:
         elapsedtime = get_timedata("__timedata_task", d, e.time)
         if elapsedtime:
-            f.write(d.expand("${PF}: %s: Elapsed time: %0.2f seconds \n" %
-                                    (e.task, elapsedtime)))
+            f.write(d.expand("${PF}: %s\n" % e.task))
+            f.write(d.expand("Elapsed time: %0.2f seconds\n" % elapsedtime))
             cpu, iostats, resources, childres = get_process_cputime(os.getpid())
             if cpu:
                 f.write("utime: %s\n" % cpu['utime'])
-- 
2.1.4




* [PATCH 2/3] scripts: Specify the stats to take into account
  2016-11-16 23:05 ` leonardo.sandoval.gonzalez
@ 2016-11-16 23:05   ` leonardo.sandoval.gonzalez
  -1 siblings, 0 replies; 12+ messages in thread
From: leonardo.sandoval.gonzalez @ 2016-11-15 21:19 UTC (permalink / raw)
  To: poky

From: Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>

There are many more stats in buildstats than 'Elapsed time', so make the script
more flexible to support all stats. A command-line example:

$ buildstats.sh -s 'utime'

Buildstats data covers process stats in different areas, including CPU times,
IO, program system resources and child program system resources. To print
the values of each of these sets from the command line, one can use the
following:

$ buildstats.sh -H -s 'TIME' | less

$ buildstats.sh -H -s 'IO' | less

Likewise, use 'RUSAGE' and 'CHILD_RUSAGE' for the program's and the program's
children's system resources.
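
The set names can be thought of as shorthands that expand into colon-separated
stat lists. The sketch below shows one way such an expansion could work; the
helper name expand_stats is hypothetical, the IO set is shortened for brevity,
and unknown names are simply passed through as individual stats.

```shell
#!/usr/bin/env bash
# Stat sets in the style of this patch (IO shortened for the example)
TIME="utime:stime:cutime:cstime"
IO="IO wchar:IO write_bytes"

# Hypothetical helper: expand set names inside a colon-separated list
expand_stats() {
    local out="" stat
    local IFS=":"          # split on colons only, so 'IO wchar' stays whole
    for stat in $1; do
        case "$stat" in
            TIME) out="${out}:${TIME}" ;;
            IO)   out="${out}:${IO}" ;;
            *)    out="${out}:${stat}" ;;
        esac
    done
    # drop the leading colon
    printf '%s\n' "${out#:}"
}

expand_stats 'TIME'               # prints utime:stime:cutime:cstime
expand_stats 'IO:Elapsed time'    # prints IO wchar:IO write_bytes:Elapsed time
```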

Signed-off-by: Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>
---
 scripts/contrib/bb-perf/buildstats.sh | 99 +++++++++++++++++++++++++++++------
 1 file changed, 82 insertions(+), 17 deletions(-)

diff --git a/scripts/contrib/bb-perf/buildstats.sh b/scripts/contrib/bb-perf/buildstats.sh
index 96158a9..8d7e248 100755
--- a/scripts/contrib/bb-perf/buildstats.sh
+++ b/scripts/contrib/bb-perf/buildstats.sh
@@ -18,24 +18,40 @@
 # Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 #
 # DESCRIPTION
-# Given a 'buildstats' path (created by bitbake when setting
-# USER_CLASSES ?= "buildstats" on local.conf) and task names, outputs
-# '<task> <recipe> <elapsed time>' for all recipes. Elapsed times are in
-# seconds, and task should be given without the 'do_' prefix.
+# Given 'buildstats' data (generated by bitbake when setting
+# USER_CLASSES ?= "buildstats" in local.conf), task names and stat names
+# (these are the ones present in the buildstats files), outputs
+# '<task> <recipe> <value_1> <value_2> ... <value_n>'. The units are the ones
+# defined by buildstats, which in turn takes data from /proc/[pid] files
 #
 # Some useful pipelines
 #
-# 1. Tasks with largest elapsed times
-# $ buildstats.sh -b <buildstats> | sort -k3 -n -r | head
+# 1. Tasks with largest stime (Amount of time that this process has been scheduled
+#    in kernel mode) values
+# $ buildstats.sh -b <buildstats> -s stime | sort -k3 -n -r | head
 #
-# 2. Min, max, sum per task (in needs GNU datamash)
-# $ buildstats.sh -b <buildstats> | datamash -t' ' -g1 min 3 max 3 sum 3 | sort -k4 -n -r
+# 2. Min, max, sum utime (Amount of time that this process has been scheduled
+#    in user mode) per task (it needs GNU datamash)
+# $ buildstats.sh -b <buildstats> -s utime | datamash -t' ' -g1 min 3 max 3 sum 3 | sort -k4 -n -r
 #
 # AUTHORS
 # Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>
 #
+
+# Stats, by type
+TIME="utime:stime:cutime:cstime"
+IO="IO wchar:IO write_bytes:IO syscr:IO read_bytes:IO rchar:IO syscw:IO cancelled_write_bytes"
+RUSAGE="rusage ru_utime:rusage ru_stime:rusage ru_maxrss:rusage ru_minflt:rusage ru_majflt:\
+rusage ru_inblock:rusage ru_oublock:rusage ru_nvcsw:rusage ru_nivcsw"
+
+CHILD_RUSAGE="Child rusage ru_utime:Child rusage ru_stime:Child rusage ru_maxrss:Child rusage ru_minflt:\
+Child rusage ru_majflt:Child rusage ru_inblock:Child rusage ru_oublock:Child rusage ru_nvcsw:\
+Child rusage ru_nivcsw"
+
 BS_DIR="tmp/buildstats"
 TASKS="compile:configure:fetch:install:patch:populate_lic:populate_sysroot:unpack"
+STATS="$TIME"
+HEADER="" # No header by default
 
 function usage {
 CMD=$(basename $0)
@@ -45,12 +61,20 @@ Usage: $CMD [-b buildstats_dir] [-t do_task]
                 (default: "$BS_DIR")
   -t tasks      The tasks to be computed
                 (default: "$TASKS")
+  -s stats      The stats to be matched. Options: TIME, IO, RUSAGE, CHILD_RUSAGE
+                or any other defined buildstat separated by colons, i.e. stime:utime
+                (default: "$STATS")
+                Default stat sets:
+                    TIME=$TIME
+                    IO=$IO
+                    RUSAGE=$RUSAGE
+                    CHILD_RUSAGE=$CHILD_RUSAGE
   -h            Display this help message
 EOM
 }
 
 # Parse and validate arguments
-while getopts "b:t:h" OPT; do
+while getopts "b:t:s:Hh" OPT; do
 	case $OPT in
 	b)
 		BS_DIR="$OPTARG"
@@ -58,6 +82,12 @@ while getopts "b:t:h" OPT; do
 	t)
 		TASKS="$OPTARG"
 		;;
+	s)
+		STATS="$OPTARG"
+		;;
+	H)
+	        HEADER="y"
+	        ;;
 	h)
 		usage
 		exit 0
@@ -76,15 +106,50 @@ if [ ! -d "$BS_DIR" ]; then
 	exit 1
 fi
 
-RECIPE_FIELD=1
-TIME_FIELD=4
+stats=""
+IFS=":"
+for stat in ${STATS}; do
+	case $stat in
+	    TIME)
+		stats="${stats}:${TIME}"
+		;;
+	    IO)
+		stats="${stats}:${IO}"
+		;;
+	    RUSAGE)
+		stats="${stats}:${RUSAGE}"
+		;;
+	    CHILD_RUSAGE)
+		stats="${stats}:${CHILD_RUSAGE}"
+		;;
+	    *)
+		stats="${stats}:${stat}"
+	esac
+done
+
+# remove possible colon at the beginning
+stats="$(echo "$stats" | sed -e 's/^://1')"
+
+# Provide a header if required by the user
+[ -n "$HEADER" ] && { echo "task:recipe:$stats"; }
 
-tasks=(${TASKS//:/ })
-for task in "${tasks[@]}"; do
+for task in ${TASKS}; do
     task="do_${task}"
-    for file in $(find ${BS_DIR} -type f -name ${task}); do
-        recipe=$(sed -n -e "/$task/p" ${file} | cut -d ':' -f${RECIPE_FIELD})
-        time=$(sed -n -e "/$task/p" ${file} | cut -d ':' -f${TIME_FIELD} | cut -d ' ' -f2)
-        echo "${task} ${recipe} ${time}"
+    for file in $(find ${BS_DIR} -type f -name ${task} | awk 'BEGIN{ ORS=""; OFS=":" } { print $0,"" }'); do
+        recipe="$(basename $(dirname $file))"
+	times=""
+	for stat in ${stats}; do
+	    [ -z "$stat" ] && { echo "empty stats"; }
+	    time=$(sed -n -e "s/^\($stat\): \\(.*\\)/\\2/p" $file)
+	    # in case the stat is not present, set the value as NA
+	    [ -z "$time" ] && { time="NA"; }
+	    # Append it to times
+	    if [ -z "$times" ]; then
+		times="${time}"
+	    else
+		times="${times} ${time}"
+	    fi
+	done
+        echo "${task} ${recipe} ${times}"
     done
 done
-- 
2.1.4




* [PATCH 3/3] bb-perf: plot histograms base on buildstats data
  2016-11-16 23:05 ` leonardo.sandoval.gonzalez
@ 2016-11-16 23:05   ` leonardo.sandoval.gonzalez
  -1 siblings, 0 replies; 12+ messages in thread
From: leonardo.sandoval.gonzalez @ 2016-11-15 21:19 UTC (permalink / raw)
  To: poky

From: Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>

Add a script that produces data to be consumed by gnuplot.
There are two possible plots, depending on whether the
-S parameter is present:

    * without -S: Produces a histogram listing the top N recipes/tasks versus
      stats. The first stat given in the -s parameter is the one used
      for ranking.
    * -S: Produces a histogram listing tasks versus stats. In this case,
      the value of each stat is its sum over all recipes found.
      Values are in descending order of the first stat given in -s.
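
The ranking used without -S amounts to a numeric sort on the first stat column
followed by a cut-off at N rows. A minimal sketch, assuming rows in the
'<task> <recipe> <value_1> ... <value_n>' layout printed by buildstats.sh; the
helper name top_n and the sample rows are invented for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep the N rows with the largest first stat (column 3).
top_n() {
    sort -k3,3 -n -r | head -n "$1"
}

# Made-up rows, in the layout buildstats.sh -s utime:stime would print
rows='do_compile busybox-1.24.1-r0 1017 220
do_configure busybox-1.24.1-r0 35 12
do_compile zlib-1.2.8-r0 310 40'

printf '%s\n' "$rows" | top_n 2
# prints:
# do_compile busybox-1.24.1-r0 1017 220
# do_compile zlib-1.2.8-r0 310 40
```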

EXAMPLES

1. Top recipes' tasks taking into account utime

    $ buildstats-plot.sh -s utime | gnuplot -p

2. Tasks versus utime:stime

    $ buildstats-plot.sh -s utime:stime -S | gnuplot -p

3. Tasks versus IO write_bytes:IO read_bytes

    $ buildstats-plot.sh -s 'IO write_bytes:IO read_bytes' -S | gnuplot -p

Signed-off-by: Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>
---
 scripts/contrib/bb-perf/buildstats-plot.sh | 157 +++++++++++++++++++++++++++++
 1 file changed, 157 insertions(+)
 create mode 100755 scripts/contrib/bb-perf/buildstats-plot.sh

diff --git a/scripts/contrib/bb-perf/buildstats-plot.sh b/scripts/contrib/bb-perf/buildstats-plot.sh
new file mode 100755
index 0000000..7e8ae04
--- /dev/null
+++ b/scripts/contrib/bb-perf/buildstats-plot.sh
@@ -0,0 +1,157 @@
+#!/usr/bin/env bash
+#
+# Copyright (c) 2011, Intel Corporation.
+# All rights reserved.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+#
+# DESCRIPTION
+#
+# Produces script data to be consumed by gnuplot. There are two possible plots
+# depending if either the -S parameter is present or not:
+#
+#     * without -S: Produces a histogram listing top N recipes/tasks versus
+#       stats. The first stat defined in the -s parameter is the one taken
+#       into account for ranking
+#     * -S: Produces a histogram listing tasks versus stats.  In this case,
+#       the value of each stat is the sum for that particular stat in all recipes found.
+#       Stats values  are in descending order defined by the first stat defined on -s
+#
+# EXAMPLES
+#
+# 1. Top recipes' tasks taking into account utime
+#
+#     $ buildstats-plot.sh -s utime | gnuplot -p
+#
+# 2. Tasks versus utime:stime
+#
+#     $ buildstats-plot.sh -s utime:stime -S | gnuplot -p
+#
+# 3. Tasks versus IO write_bytes:IO read_bytes
+#
+#     $ buildstats-plot.sh -s 'IO write_bytes:IO read_bytes' -S | gnuplot -p
+#
+# AUTHORS
+# Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>
+#
+
+set -o nounset
+set -o errexit
+
+BS_DIR="tmp/buildstats"
+N=10
+STATS="utime"
+SUM=""
+OUTDATA_FILE="$PWD/buildstats-plot.out"
+
+function usage {
+    CMD=$(basename $0)
+    cat <<EOM
+Usage: $CMD [-b buildstats_dir] [-n N] [-s stats] [-S] [-o outfile]
+  -b buildstats The path where the folder resides
+                (default: "$BS_DIR")
+  -n N          Top N recipes to display. Ignored if -S is present
+                (default: "$N")
+  -s stats      The stats to be matched. If more than one stat is given,
+                units should be the same because data is plotted as a
+                histogram. Accepts the stat sets listed by buildstats.sh -h
+                or any other buildstat, separated by colons, e.g. stime:utime
+                (default: "$STATS")
+  -S            Sum values for a particular stat for found recipes
+  -o            Output data file.
+                (default: "$OUTDATA_FILE")
+  -h            Display this help message
+EOM
+}
+
+# Parse and validate arguments
+while getopts "b:n:s:o:Sh" OPT; do
+	case $OPT in
+	b)
+		BS_DIR="$OPTARG"
+		;;
+	n)
+		N="$OPTARG"
+		;;
+	s)
+	        STATS="$OPTARG"
+	        ;;
+	S)
+	        SUM="y"
+	        ;;
+	o)
+	        OUTDATA_FILE="$OPTARG"
+	        ;;
+	h)
+		usage
+		exit 0
+		;;
+	*)
+		usage
+		exit 1
+		;;
+	esac
+done
+
+# Get number of stats
+IFS=':'; statsarray=(${STATS}); unset IFS
+nstats=${#statsarray[@]}
+
+# Get script folder, use to run buildstats.sh
+CD=$(dirname $0)
+
+# Parse buildstats recipes to produce a single table
+OUTBUILDSTATS="$PWD/buildstats.log"
+$CD/buildstats.sh -b "$BS_DIR" -s "$STATS" -H > $OUTBUILDSTATS
+
+# Get headers
+HEADERS=$(cat $OUTBUILDSTATS | sed -n -e '1s/ /-/g' -e '1s/:/ /gp')
+
+echo -e "set boxwidth 0.9 relative"
+echo -e "set style data histograms"
+echo -e "set style fill solid 1.0 border lt -1"
+echo -e "set xtics rotate by 45 right"
+
+# Get output data
+if [ -z "$SUM" ]; then
+    cat $OUTBUILDSTATS | sed -e '1d' | sort -k3 -n -r | head -$N > $OUTDATA_FILE
+    # include task at recipe column
+    sed -i -e "1i\
+${HEADERS}" $OUTDATA_FILE
+    echo -e "set title \"Top task/recipes\""
+    echo -e "plot for [COL=3:`expr 3 + ${nstats} - 1`] '${OUTDATA_FILE}' using COL:xtic(stringcolumn(1).' '.stringcolumn(2)) title columnheader(COL)"
+else
+
+    # Construct datamash sum arguments (sum 3 sum 4 ...)
+    declare -a sumargs
+    j=0
+    for i in `seq $nstats`; do
+	sumargs[j]=sum; j=$(( $j + 1 ))
+	sumargs[j]=`expr 3 + $i - 1`;  j=$(( $j + 1 ))
+    done
+
+    # Do the processing with datamash
+    cat $OUTBUILDSTATS | sed -e '1d' | datamash -t ' ' -g1 ${sumargs[*]} | sort -k2 -n -r > $OUTDATA_FILE
+
+    # Include headers into resulted file, so we can include gnuplot xtics
+    HEADERS=$(echo $HEADERS | sed -e 's/recipe//1')
+    sed -i -e "1i\
+${HEADERS}" $OUTDATA_FILE
+
+    # Plot
+    echo -e "set title \"Sum stats values per task for all recipes\""
+    echo -e "plot for [COL=2:`expr 2 + ${nstats} - 1`] '${OUTDATA_FILE}' using COL:xtic(1) title columnheader(COL)"
+fi
+
-- 
2.1.4




* Re: [PATCH 3/3] bb-perf: plot histograms base on buildstats data
  2016-11-16 23:05   ` leonardo.sandoval.gonzalez
  (?)
@ 2016-11-16 10:25   ` Markus Lehtonen
  2016-11-16 22:44     ` Leonardo Sandoval
  -1 siblings, 1 reply; 12+ messages in thread
From: Markus Lehtonen @ 2016-11-16 10:25 UTC (permalink / raw)
  To: leonardo.sandoval.gonzalez, poky

On Tue, 2016-11-15 at 15:19 -0600,
leonardo.sandoval.gonzalez@linux.intel.com wrote:
> From: Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>
> 
> Scripts that produces script data to be consumed by gnuplot.
> There are two possible plots depending if either the
> -S parameter is present or not:
> 
>     * without -S: Produces a histogram listing top N recipes/tasks versus
>       stats. The first stat defined in the -s parameter is the one taken
>       into account for ranking
>     * -S: Produces a histogram listing tasks versus stats.  In this case,
>       the value of each stat is the sum for that particular stat in all
> recipes found.
>       Stats values  are in descending order defined by the first stat
> defined on -s
> 
> EXAMPLES
> 
> 1. Top recipes' tasks taking into account utime
> 
>     $ buildstats-plot.sh -s utime | gnuplot -p
> 
> 2. Tasks versus utime:stime
> 
>     $ buildstats-plot.sh -s utime:stime -S | gnuplot -p
> 
> 3. Tasks versus IO write_bytes:IO read_bytes
> 
>     $ buildstats-plot.sh -s 'IO write_bytes:IO read_bytes' -S | gnuplot 
> -p

One problem (or problematic restriction) I see is that the script relies on
the new buildstats format introduced by PATCH 2/3 in this patchset, making
the script incompatible with older buildstats and not being able to render
them.

Another problem for me is the dependency on datamash which I wasn't able to
find for my distro (openSUSE Leap 42.1). How hard would it be to ditch the
dependency on datamash?


Thanks,
  Markus




* Re: [PATCH 0/3] bb-perf: scripting to plot buildstats data
  2016-11-16 23:05 ` leonardo.sandoval.gonzalez
                   ` (3 preceding siblings ...)
  (?)
@ 2016-11-16 11:53 ` Richard Purdie
  2016-11-16 22:48   ` Leonardo Sandoval
  -1 siblings, 1 reply; 12+ messages in thread
From: Richard Purdie @ 2016-11-16 11:53 UTC (permalink / raw)
  To: leonardo.sandoval.gonzalez, poky

On Tue, 2016-11-15 at 15:19 -0600,
leonardo.sandoval.gonzalez@linux.intel.com wrote:
> From: Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>
> 
> buildstats data has been mostly unexplored mainly due to the lack of
> tools
> to digest this data. The script buildstats.sh has been re-designed to
> be much more flexible and the new script buildstats-plot.sh uses the
> latter
> to produce data to be consumed by gnuplot. The tools used are
> datamash and
> gnuplot, so these must be present before running them.
> 
> Some plots created by buildstats-plot.sh can be found at [1]
> 
> [1] https://wiki.yoctoproject.org/wiki/MortyBuildstats
> 

This should go to the OE-Core list?

Cheers,

Richard



* Re: [PATCH 3/3] bb-perf: plot histograms base on buildstats data
  2016-11-16 10:25   ` Markus Lehtonen
@ 2016-11-16 22:44     ` Leonardo Sandoval
  0 siblings, 0 replies; 12+ messages in thread
From: Leonardo Sandoval @ 2016-11-16 22:44 UTC (permalink / raw)
  To: Markus Lehtonen, poky



On 11/16/2016 04:25 AM, Markus Lehtonen wrote:
> On Tue, 2016-11-15 at 15:19 -0600,
> leonardo.sandoval.gonzalez@linux.intel.com wrote:
>> From: Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>
>>
>> Scripts that produces script data to be consumed by gnuplot.
>> There are two possible plots depending if either the
>> -S parameter is present or not:
>>
>>      * without -S: Produces a histogram listing top N recipes/tasks versus
>>        stats. The first stat defined in the -s parameter is the one taken
>>        into account for ranking
>>      * -S: Produces a histogram listing tasks versus stats.  In this case,
>>        the value of each stat is the sum for that particular stat in all
>> recipes found.
>>        Stats values  are in descending order defined by the first stat
>> defined on -s
>>
>> EXAMPLES
>>
>> 1. Top recipes' tasks taking into account utime
>>
>>      $ buildstats-plot.sh -s utime | gnuplot -p
>>
>> 2. Tasks versus utime:stime
>>
>>      $ buildstats-plot.sh -s utime:stime -S | gnuplot -p
>>
>> 3. Tasks versus IO write_bytes:IO read_bytes
>>
>>      $ buildstats-plot.sh -s 'IO write_bytes:IO read_bytes' -S | gnuplot
>> -p
> One problem (or problematic restriction) I see is that the script relies on
> the new buildstats format introduced by PATCH 2/3 in this patchset, making
> the script incompatible with older buildstats and not being able to render
> them.

The proposed buildstats.sh script has the same behavior as the old one
if -s 'Elapsed time' is passed as an argument. The new feature is that
it can parse other stats besides 'Elapsed time', so buildstats-plot.sh
can make use of them. The main reason for the 2/3 change is that elapsed
time does not give us much information, because it is wall-clock time,
not CPU time.

> Another problem for me is the dependency on datamash which I wasn't able to
> find for my distro (openSUSE Leap 42.1). How hard would it be to ditch the
> dependency on datamash?
datamash is pretty good for basic statistics, and in this case we are just
summing columns by group. The same thing can be done with awk, of course.
Looking at https://www.gnu.org/software/datamash/download/, there is
no package ready for openSUSE, so the only option there is to get the
tarball and 'make' it.
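
For reference, a minimal awk sketch of that per-task summing, as one possible
way to avoid datamash. The helper name sum_by_task is hypothetical and this is
not code from the patch set; it assumes space-separated rows of the form
'<task> <recipe> <value_1> ... <value_n>' and treats non-numeric values such
as NA as 0.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum every stat column (3..NF) per task (column 1),
# roughly what 'datamash -t " " -g1 sum 3 sum 4 ...' does.
sum_by_task() {
    awk '
        NF > ncols { ncols = NF }
        {
            # remember tasks in order of first appearance
            if (!($1 in seen)) { seen[$1] = 1; order[++n] = $1 }
            for (c = 3; c <= NF; c++) sum[$1, c] += $c
        }
        END {
            for (i = 1; i <= n; i++) {
                line = order[i]
                for (c = 3; c <= ncols; c++)
                    line = line " " (sum[order[i], c] + 0)
                print line
            }
        }'
}

printf '%s\n' 'do_compile a 10 1' 'do_install a 2 3' 'do_compile b 5 4' | sum_by_task
# prints:
# do_compile 15 5
# do_install 2 3
```

The result could then be piped through 'sort -k2,2 -n -r' to get the
descending order by the first stat.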


> Thanks,
>    Markus
>
>




* Re: [PATCH 0/3] bb-perf: scripting to plot buildstats data
  2016-11-16 11:53 ` [PATCH 0/3] bb-perf: scripting to plot " Richard Purdie
@ 2016-11-16 22:48   ` Leonardo Sandoval
  0 siblings, 0 replies; 12+ messages in thread
From: Leonardo Sandoval @ 2016-11-16 22:48 UTC (permalink / raw)
  To: Richard Purdie, poky



On 11/16/2016 05:53 AM, Richard Purdie wrote:
> On Tue, 2016-11-15 at 15:19 -0600,
> leonardo.sandoval.gonzalez@linux.intel.com wrote:
>> From: Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>
>>
>> buildstats data has been mostly unexplored mainly due to the lack of
>> tools
>> to digest this data. The script buildstats.sh has been re-designed to
>> be much more flexible and the new script buildstats-plot.sh uses the
>> latter
>> to produce data to be consumed by gnuplot. The tools used are
>> datamash and
>> gnuplot, so these must be present before running them.
>>
>> Some plots created by buildstats-plot.sh can be found at [1]
>>
>> [1] https://wiki.yoctoproject.org/wiki/MortyBuildstats
>>
> This should go to the OE-Core list?
You are right.  buildstats data is produced by a class maintained in 
oe-core, so these scripts should go into the same project.

>
> Cheers,
>
> Richard
>




* [PATCH 0/3] bb-perf: scripting to plot buildstats data
@ 2016-11-16 23:05 ` leonardo.sandoval.gonzalez
  0 siblings, 0 replies; 12+ messages in thread
From: leonardo.sandoval.gonzalez @ 2016-11-16 23:05 UTC (permalink / raw)
  To: openembedded-core

From: Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>

Buildstats data has been mostly unexplored, mainly due to the lack of tools
to digest it. The script buildstats.sh has been redesigned to be much more
flexible, and the new script buildstats-plot.sh uses it to produce data to
be consumed by gnuplot. The tools used are datamash and gnuplot, so both
must be installed before running the scripts. (A datamash package is not
available for openSUSE at least, so there it needs to be compiled and
installed from source.)

Some plots created by buildstats-plot.sh can be found at [1]

[1] https://wiki.yoctoproject.org/wiki/MortyBuildstats


The following changes since commit dc8508f609974cc99606b9042bfa7f870ce80228:

  build-applance-image: Fix to use the release branch for morty (2016-10-26 11:11:10 +0100)

are available in the git repository at:

  git://git.yoctoproject.org/poky-contrib lsandov1/buildstats-plot
  http://git.yoctoproject.org/cgit.cgi/poky-contrib/log/?h=lsandov1/buildstats-plot

Leonardo Sandoval (3):
  buildstats: Place 'Elapsed Time' stat into a single line
  scripts: Specify the stats to take into account
  bb-perf: plot histograms base on buildstats data

 meta/classes/buildstats.bbclass            |   4 +-
 scripts/contrib/bb-perf/buildstats-plot.sh | 157 +++++++++++++++++++++++++++++
 scripts/contrib/bb-perf/buildstats.sh      |  99 ++++++++++++++----
 3 files changed, 241 insertions(+), 19 deletions(-)
 create mode 100755 scripts/contrib/bb-perf/buildstats-plot.sh

-- 
2.1.4




* [PATCH 1/3] buildstats: Place 'Elapsed Time' stat into a single line
@ 2016-11-16 23:05   ` leonardo.sandoval.gonzalez
  0 siblings, 0 replies; 12+ messages in thread
From: leonardo.sandoval.gonzalez @ 2016-11-16 23:05 UTC (permalink / raw)
  To: openembedded-core

From: Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>

All lines except one (the one containing 'Elapsed time') follow the format
'stat: value'. Fix that so post-processing the stats is simpler.

Signed-off-by: Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>
---
 meta/classes/buildstats.bbclass | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/meta/classes/buildstats.bbclass b/meta/classes/buildstats.bbclass
index 599a219..57ecc8f 100644
--- a/meta/classes/buildstats.bbclass
+++ b/meta/classes/buildstats.bbclass
@@ -80,8 +80,8 @@ def write_task_data(status, logfile, e, d):
     with open(os.path.join(logfile), "a") as f:
         elapsedtime = get_timedata("__timedata_task", d, e.time)
         if elapsedtime:
-            f.write(d.expand("${PF}: %s: Elapsed time: %0.2f seconds \n" %
-                                    (e.task, elapsedtime)))
+            f.write(d.expand("${PF}: %s\n" % e.task))
+            f.write(d.expand("Elapsed time: %0.2f seconds\n" % elapsedtime))
             cpu, iostats, resources, childres = get_process_cputime(os.getpid())
             if cpu:
                 f.write("utime: %s\n" % cpu['utime'])
-- 
2.1.4




* [PATCH 2/3] scripts: Specify the stats to take into account
@ 2016-11-16 23:05   ` leonardo.sandoval.gonzalez
  0 siblings, 0 replies; 12+ messages in thread
From: leonardo.sandoval.gonzalez @ 2016-11-16 23:05 UTC (permalink / raw)
  To: openembedded-core

From: Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>

There are many more stats in buildstats than 'Elapsed time', so make the script
more flexible to support all stats. A command-line example:

$ buildstats.sh -s 'utime'

Buildstats data covers process stats in different areas, including CPU times,
IO, program system resources and child program system resources. To print
the values of each of these sets from the command line, one can use the
following:

$ buildstats.sh -H -s 'TIME' | less

$ buildstats.sh -H -s 'IO' | less

Likewise, use 'RUSAGE' and 'CHILD_RUSAGE' for the program's and the program's
children's system resources.

Note that the new version gives the same output as the old one when
'Elapsed time' is given as the stat parameter:

$ buildstats.sh -s 'Elapsed time'

Signed-off-by: Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>
---
 scripts/contrib/bb-perf/buildstats.sh | 99 +++++++++++++++++++++++++++++------
 1 file changed, 82 insertions(+), 17 deletions(-)

diff --git a/scripts/contrib/bb-perf/buildstats.sh b/scripts/contrib/bb-perf/buildstats.sh
index 96158a9..8d7e248 100755
--- a/scripts/contrib/bb-perf/buildstats.sh
+++ b/scripts/contrib/bb-perf/buildstats.sh
@@ -18,24 +18,40 @@
 # Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 #
 # DESCRIPTION
-# Given a 'buildstats' path (created by bitbake when setting
-# USER_CLASSES ?= "buildstats" on local.conf) and task names, outputs
-# '<task> <recipe> <elapsed time>' for all recipes. Elapsed times are in
-# seconds, and task should be given without the 'do_' prefix.
+# Given 'buildstats' data (generated by bitbake when setting
+# USER_CLASSES ?= "buildstats" in local.conf), task names and stat names
+# (these are the ones present in the buildstats files), outputs
+# '<task> <recipe> <value_1> <value_2> ... <value_n>'. The units are the ones
+# defined by buildstats, which in turn takes data from /proc/[pid] files
 #
 # Some useful pipelines
 #
-# 1. Tasks with largest elapsed times
-# $ buildstats.sh -b <buildstats> | sort -k3 -n -r | head
+# 1. Tasks with largest stime (Amount of time that this process has been scheduled
+#    in kernel mode) values
+# $ buildstats.sh -b <buildstats> -s stime | sort -k3 -n -r | head
 #
-# 2. Min, max, sum per task (in needs GNU datamash)
-# $ buildstats.sh -b <buildstats> | datamash -t' ' -g1 min 3 max 3 sum 3 | sort -k4 -n -r
+# 2. Min, max, sum utime (Amount of time that this process has been scheduled
+#    in user mode) per task (it needs GNU datamash)
+# $ buildstats.sh -b <buildstats> -s utime | datamash -t' ' -g1 min 3 max 3 sum 3 | sort -k4 -n -r
 #
 # AUTHORS
 # Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>
 #
+
+# Stats, by type
+TIME="utime:stime:cutime:cstime"
+IO="IO wchar:IO write_bytes:IO syscr:IO read_bytes:IO rchar:IO syscw:IO cancelled_write_bytes"
+RUSAGE="rusage ru_utime:rusage ru_stime:rusage ru_maxrss:rusage ru_minflt:rusage ru_majflt:\
+rusage ru_inblock:rusage ru_oublock:rusage ru_nvcsw:rusage ru_nivcsw"
+
+CHILD_RUSAGE="Child rusage ru_utime:Child rusage ru_stime:Child rusage ru_maxrss:Child rusage ru_minflt:\
+Child rusage ru_majflt:Child rusage ru_inblock:Child rusage ru_oublock:Child rusage ru_nvcsw:\
+Child rusage ru_nivcsw"
+
 BS_DIR="tmp/buildstats"
 TASKS="compile:configure:fetch:install:patch:populate_lic:populate_sysroot:unpack"
+STATS="$TIME"
+HEADER="" # No header by default
 
 function usage {
 CMD=$(basename $0)
@@ -45,12 +61,20 @@ Usage: $CMD [-b buildstats_dir] [-t do_task]
                 (default: "$BS_DIR")
   -t tasks      The tasks to be computed
                 (default: "$TASKS")
+  -s stats      The stats to be matched. Options: TIME, IO, RUSAGE, CHILD_RUSAGE
+                or any other defined buildstat separated by colons, i.e. stime:utime
+                (default: "$STATS")
+                Default stat sets:
+                    TIME=$TIME
+                    IO=$IO
+                    RUSAGE=$RUSAGE
+                    CHILD_RUSAGE=$CHILD_RUSAGE
   -h            Display this help message
 EOM
 }
 
 # Parse and validate arguments
-while getopts "b:t:h" OPT; do
+while getopts "b:t:s:Hh" OPT; do
 	case $OPT in
 	b)
 		BS_DIR="$OPTARG"
@@ -58,6 +82,12 @@ while getopts "b:t:h" OPT; do
 	t)
 		TASKS="$OPTARG"
 		;;
+	s)
+		STATS="$OPTARG"
+		;;
+	H)
+	        HEADER="y"
+	        ;;
 	h)
 		usage
 		exit 0
@@ -76,15 +106,50 @@ if [ ! -d "$BS_DIR" ]; then
 	exit 1
 fi
 
-RECIPE_FIELD=1
-TIME_FIELD=4
+stats=""
+IFS=":"
+for stat in ${STATS}; do
+	case $stat in
+	    TIME)
+		stats="${stats}:${TIME}"
+		;;
+	    IO)
+		stats="${stats}:${IO}"
+		;;
+	    RUSAGE)
+		stats="${stats}:${RUSAGE}"
+		;;
+	    CHILD_RUSAGE)
+		stats="${stats}:${CHILD_RUSAGE}"
+		;;
+	    *)
+		stats="${stats}:${stat}"
+	esac
+done
+
+# remove possible colon at the beginning
+stats="$(echo "$stats" | sed -e 's/^://1')"
+
+# Provide a header if required by the user
+[ -n "$HEADER" ] && { echo "task:recipe:$stats"; }
 
-tasks=(${TASKS//:/ })
-for task in "${tasks[@]}"; do
+for task in ${TASKS}; do
     task="do_${task}"
-    for file in $(find ${BS_DIR} -type f -name ${task}); do
-        recipe=$(sed -n -e "/$task/p" ${file} | cut -d ':' -f${RECIPE_FIELD})
-        time=$(sed -n -e "/$task/p" ${file} | cut -d ':' -f${TIME_FIELD} | cut -d ' ' -f2)
-        echo "${task} ${recipe} ${time}"
+    for file in $(find ${BS_DIR} -type f -name ${task} | awk 'BEGIN{ ORS=""; OFS=":" } { print $0,"" }'); do
+        recipe="$(basename $(dirname $file))"
+	times=""
+	for stat in ${stats}; do
+	    [ -z "$stat" ] && { echo "empty stats"; }
+	    time=$(sed -n -e "s/^\($stat\): \(.*\)/\2/p" $file)
+	    # in case the stat is not present, set the value as NA
+	    [ -z "$time" ] && { time="NA"; }
+	    # Append it to times
+	    if [ -z "$times" ]; then
+		times="${time}"
+	    else
+		times="${times} ${time}"
+	    fi
+	done
+        echo "${task} ${recipe} ${times}"
     done
 done
-- 
2.1.4
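
For readers following the new per-stat lookup in patch 2/3, here is a minimal
sketch run against a made-up task file. The get_stat helper and the sample
file contents are assumptions for illustration only; just the sed expression
and the NA fallback mirror what the diff does.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the patch): print the value of one stat
# from a buildstats task file, or NA when the stat is absent.
get_stat() {
    local stat="$1" file="$2" value
    # Same substitution as in the patch: match "<stat>: <value>" at the start
    # of a line and print only the value.
    value=$(sed -n -e "s/^\($stat\): \(.*\)/\2/p" "$file")
    [ -z "$value" ] && value="NA"
    echo "$value"
}

# Made-up sample; real buildstats task files carry more fields.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Event: TaskStarted
utime: 120
stime: 35
IO write_bytes: 4096
EOF

get_stat "utime" "$sample"            # stat present
get_stat "IO write_bytes" "$sample"   # stat name containing a space
get_stat "cutime" "$sample"           # stat missing, falls back to NA
rm -f "$sample"
```

Because the pattern is anchored at the start of the line, asking for utime
does not pick up a line that begins with cutime.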




* [PATCH 3/3] bb-perf: plot histograms based on buildstats data
@ 2016-11-16 23:05   ` leonardo.sandoval.gonzalez
  0 siblings, 0 replies; 12+ messages in thread
From: leonardo.sandoval.gonzalez @ 2016-11-16 23:05 UTC (permalink / raw)
  To: openembedded-core

From: Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>

Script that produces script data to be consumed by gnuplot.
There are two possible plots, depending on whether the
-S parameter is present:

    * without -S: Produces a histogram listing the top N recipes/tasks versus
      stats. The first stat given in the -s parameter is the one used
      for ranking.
    * with -S: Produces a histogram listing tasks versus stats. In this case,
      the value of each stat is the sum of that stat over all recipes found.
      Stat values are sorted in descending order by the first stat given in -s.

EXAMPLES

1. Top recipe tasks ranked by utime

    $ buildstats-plot.sh -s utime | gnuplot -p

2. Tasks versus utime:stime

    $ buildstats-plot.sh -s utime:stime -S | gnuplot -p

3. Tasks versus IO write_bytes:IO read_bytes

    $ buildstats-plot.sh -s 'IO write_bytes:IO read_bytes' -S | gnuplot -p

Signed-off-by: Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>
---
 scripts/contrib/bb-perf/buildstats-plot.sh | 157 +++++++++++++++++++++++++++++
 1 file changed, 157 insertions(+)
 create mode 100755 scripts/contrib/bb-perf/buildstats-plot.sh
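
The data preparation behind the two plot modes described above can be sketched
on a few made-up rows. This is only an illustration under stated assumptions:
the rows, the top_n and sum_per_task helper names, and the use of awk in place
of datamash are not part of the patch.

```shell
#!/usr/bin/env bash
# Made-up rows in the "task recipe stat" layout printed by buildstats.sh.
rows='do_compile glibc 300
do_compile busybox 120
do_configure glibc 40
do_install busybox 15
do_configure busybox 25'

# Without -S: rank task/recipe pairs by the first stat (column 3) and keep
# the top N rows.
top_n() {
    sort -k3 -n -r | head -n "$1"
}

# With -S: sum column 3 per task, then sort by the sum in descending order.
# awk stands in here for the datamash call used by the script, so the sketch
# has no extra dependency.
sum_per_task() {
    awk '{ sum[$1] += $3 } END { for (t in sum) print t, sum[t] }' | sort -k2 -n -r
}

echo "$rows" | top_n 2
echo "$rows" | sum_per_task
```

With several stats, the script repeats the sum for each stat column, while the
ranking still uses only the first one.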

diff --git a/scripts/contrib/bb-perf/buildstats-plot.sh b/scripts/contrib/bb-perf/buildstats-plot.sh
new file mode 100755
index 0000000..7e8ae04
--- /dev/null
+++ b/scripts/contrib/bb-perf/buildstats-plot.sh
@@ -0,0 +1,157 @@
+#!/usr/bin/env bash
+#
+# Copyright (c) 2011, Intel Corporation.
+# All rights reserved.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+#
+# DESCRIPTION
+#
+# Produces script data to be consumed by gnuplot. There are two possible plots,
+# depending on whether the -S parameter is present:
+#
+#     * without -S: Produces a histogram listing the top N recipes/tasks versus
+#       stats. The first stat given in the -s parameter is the one used
+#       for ranking.
+#     * with -S: Produces a histogram listing tasks versus stats. In this case,
+#       the value of each stat is the sum of that stat over all recipes found.
+#       Stat values are sorted in descending order by the first stat given in -s.
+#
+# EXAMPLES
+#
+# 1. Top recipe tasks ranked by utime
+#
+#     $ buildstats-plot.sh -s utime | gnuplot -p
+#
+# 2. Tasks versus utime:stime
+#
+#     $ buildstats-plot.sh -s utime:stime -S | gnuplot -p
+#
+# 3. Tasks versus IO write_bytes:IO read_bytes
+#
+#     $ buildstats-plot.sh -s 'IO write_bytes:IO read_bytes' -S | gnuplot -p
+#
+# AUTHORS
+# Leonardo Sandoval <leonardo.sandoval.gonzalez@linux.intel.com>
+#
+
+set -o nounset
+set -o errexit
+
+BS_DIR="tmp/buildstats"
+N=10
+STATS="utime"
+SUM=""
+OUTDATA_FILE="$PWD/buildstats-plot.out"
+
+function usage {
+    CMD=$(basename $0)
+    cat <<EOM
+Usage: $CMD [-b buildstats_dir] [-n N] [-s stats] [-S] [-o outdata_file]
+  -b buildstats The path where the folder resides
+                (default: "$BS_DIR")
+  -n N          Top N recipes to display. Ignored if -S is present
+                (default: "$N")
+  -s stats      The stats to be matched. If more than one stat is given, units
+                should be the same because data is plotted as a histogram.
+                Accepts the stat sets listed by buildstats.sh -h or any other
+                defined (build)stat, separated by colons, e.g. stime:utime
+                (default: "$STATS")
+  -S            Sum values for a particular stat for found recipes
+  -o            Output data file.
+                (default: "$OUTDATA_FILE")
+  -h            Display this help message
+EOM
+}
+
+# Parse and validate arguments
+while getopts "b:n:s:o:Sh" OPT; do
+	case $OPT in
+	b)
+		BS_DIR="$OPTARG"
+		;;
+	n)
+		N="$OPTARG"
+		;;
+	s)
+	        STATS="$OPTARG"
+	        ;;
+	S)
+	        SUM="y"
+	        ;;
+	o)
+	        OUTDATA_FILE="$OPTARG"
+	        ;;
+	h)
+		usage
+		exit 0
+		;;
+	*)
+		usage
+		exit 1
+		;;
+	esac
+done
+
+# Get number of stats
+IFS=':'; statsarray=(${STATS}); unset IFS
+nstats=${#statsarray[@]}
+
+# Get script folder, use to run buildstats.sh
+CD=$(dirname $0)
+
+# Parse buildstats recipes to produce a single table
+OUTBUILDSTATS="$PWD/buildstats.log"
+$CD/buildstats.sh -b "$BS_DIR" -s "$STATS" -H > $OUTBUILDSTATS
+
+# Get headers
+HEADERS=$(cat $OUTBUILDSTATS | sed -n -e '1s/ /-/g' -e '1s/:/ /gp')
+
+echo -e "set boxwidth 0.9 relative"
+echo -e "set style data histograms"
+echo -e "set style fill solid 1.0 border lt -1"
+echo -e "set xtics rotate by 45 right"
+
+# Get output data
+if [ -z "$SUM" ]; then
+    cat $OUTBUILDSTATS | sed -e '1d' | sort -k3 -n -r | head -n "$N" > $OUTDATA_FILE
+    # Insert the headers as the first line of the data file
+    sed -i -e "1i\
+${HEADERS}" $OUTDATA_FILE
+    echo -e "set title \"Top task/recipes\""
+    echo -e "plot for [COL=3:`expr 3 + ${nstats} - 1`] '${OUTDATA_FILE}' using COL:xtic(stringcolumn(1).' '.stringcolumn(2)) title columnheader(COL)"
+else
+
+    # Construct datamash sum argument (sum 3 sum 4 ...)
+    declare -a sumargs
+    j=0
+    for i in `seq $nstats`; do
+	sumargs[j]=sum; j=$(( $j + 1 ))
+	sumargs[j]=`expr 3 + $i - 1`;  j=$(( $j + 1 ))
+    done
+
+    # Do the processing with datamash
+    cat $OUTBUILDSTATS | sed -e '1d' | datamash -t ' ' -g1 ${sumargs[*]} | sort -k2 -n -r > $OUTDATA_FILE
+
+    # Include headers in the resulting file, so gnuplot can read them as column headers
+    HEADERS=$(echo $HEADERS | sed -e 's/recipe//1')
+    sed -i -e "1i\
+${HEADERS}" $OUTDATA_FILE
+
+    # Plot
+    echo -e "set title \"Sum stats values per task for all recipes\""
+    echo -e "plot for [COL=2:`expr 2 + ${nstats} - 1`] '${OUTDATA_FILE}' using COL:xtic(1) title columnheader(COL)"
+fi
+
-- 
2.1.4




Thread overview: 12+ messages
2016-11-15 21:19 [PATCH 0/3] bb-perf: scripting to plot buildstats data leonardo.sandoval.gonzalez
2016-11-16 23:05 ` leonardo.sandoval.gonzalez
2016-11-15 21:19 ` [PATCH 1/3] buildstats: Place 'Elapsed Time' stat into a single line leonardo.sandoval.gonzalez
2016-11-16 23:05   ` leonardo.sandoval.gonzalez
2016-11-15 21:19 ` [PATCH 2/3] scripts: Specify the stats to take into account leonardo.sandoval.gonzalez
2016-11-16 23:05   ` leonardo.sandoval.gonzalez
2016-11-15 21:19 ` [PATCH 3/3] bb-perf: plot histograms base on buildstats data leonardo.sandoval.gonzalez
2016-11-16 23:05   ` leonardo.sandoval.gonzalez
2016-11-16 10:25   ` Markus Lehtonen
2016-11-16 22:44     ` Leonardo Sandoval
2016-11-16 11:53 ` [PATCH 0/3] bb-perf: scripting to plot " Richard Purdie
2016-11-16 22:48   ` Leonardo Sandoval
