* [PATCH v4 0/3] Add Scripts for Finding Top 25 Executed Functions
@ 2020-06-26 16:45 Ahmed Karaman
  2020-06-26 16:45 ` [PATCH v4 1/3] scripts/performance: Add topN_perf.py script Ahmed Karaman
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Ahmed Karaman @ 2020-06-26 16:45 UTC (permalink / raw)
  To: qemu-devel, aleksandar.qemu.devel, alex.bennee, rth, eblake,
	ldoktor, ehabkost, crosa
  Cc: Ahmed Karaman

Greetings,

As a part of the TCG Continuous Benchmarking project for GSoC this
year, detailed reports discussing different performance measurement
methodologies and analysis results will be sent here on the mailing
list.

The project's first report was published on the mailing list on the
22nd of June:
https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg06692.html

A section in this report deals with measuring the top 25 executed
functions when running QEMU. It includes two Python scripts that
automatically perform this task.

This series adds these two scripts to a new performance directory
created under the scripts directory. It also adds a new
"Miscellaneous" section to the end of the MAINTAINERS file with a
"Performance Tools and Tests" subsection.

Previous versions of the series:
v3:
https://lists.nongnu.org/archive/html/qemu-devel/2020-06/msg07856.html
v2:
https://lists.nongnu.org/archive/html/qemu-devel/2020-06/msg06147.html
v1:
https://lists.nongnu.org/archive/html/qemu-devel/2020-06/msg04868.html

Best regards,
Ahmed Karaman

v3->v4:
- Save all intermediate files generated by the scripts in the '/tmp'
  directory instead of the current working directory of the user.
- Use more descriptive variable names and table headers.
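
As an illustration of the '/tmp' change above, here is a minimal sketch of
keeping an intermediate file in the system temporary directory and removing
it afterwards. It is not the code from this series: the scripts use fixed
paths such as /tmp/perf.data, whereas this sketch assumes tempfile.mkstemp()
for a unique name, and the helper name run_with_scratch_file is hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def run_with_scratch_file(command):
    """Run `command`, capture its stdout in a scratch file, return the text."""
    # mkstemp() creates a uniquely named file in the temporary directory,
    # so concurrent runs do not overwrite each other's intermediate data.
    # (Assumption: the scripts in this series use fixed /tmp paths instead.)
    fd, path = tempfile.mkstemp(prefix="topN_", suffix=".out")
    try:
        with os.fdopen(fd, "w") as output:
            subprocess.run(command, stdout=output, check=True)
        with open(path, "r") as data:
            return data.read()
    finally:
        # Remove the intermediate file whether or not the command succeeded.
        os.unlink(path)


if __name__ == "__main__":
    text = run_with_scratch_file([sys.executable, "-c", "print('hello')"])
    print(text, end="")
```

Doing the cleanup in a "finally" block means the scratch file is removed on
both the success and the failure path.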

v2->v3:
- Use a clearer "Syntax" and "Example of usage" in the script comment
  and commit message.
- Manually specify the instructions required to run Perf instead of
  relying on the stderr produced by Perf.
- Use more descriptive variable names.

v1->v2:
- Add an empty line at the end of the MAINTAINERS file.
- Move MAINTAINERS patch to be the last in the series.
- Allow custom number of top functions to be specified.
- Check for valgrind and perf before executing the scripts.
- Ensure sufficient permissions when running the topN_perf script.
- Use subprocess instead of os.system.
- Use os.unlink() for deleting intermediate files.
- Spread out the data extraction steps.
- Enable execution permission for the scripts.
- Add script example output in the commit message.
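
The tool check and the move from os.system to subprocess listed above can be
sketched as follows. This is only an illustration under stated assumptions:
the scripts in this series call "which" through subprocess.run(), while the
sketch uses shutil.which() as an alternative, and the helper names
require_tool and run_quietly are hypothetical.

```python
import shutil
import subprocess
import sys


def require_tool(name):
    """Return the full path of `name`, or exit with a hint if it is missing."""
    # shutil.which() searches PATH without spawning a child process.
    # (Assumption: swapped in here for the "which" call used by the scripts.)
    path = shutil.which(name)
    if path is None:
        sys.exit("Please install {} before running the script!".format(name))
    return path


def run_quietly(command):
    """Run `command` as an argument list (no shell) and return its exit code."""
    result = subprocess.run(command,
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode


if __name__ == "__main__":
    require_tool("perf")
    # A non-zero exit code here usually points at a permission problem.
    print(run_quietly(["perf", "stat", "ls", "/"]))
```

Passing the command as a list avoids shell quoting issues, which is one
reason to prefer subprocess over os.system.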


Ahmed Karaman (3):
  scripts/performance: Add topN_perf.py script
  scripts/performance: Add topN_callgrind.py script
  MAINTAINERS: Add 'Performance Tools and Tests'subsection

 MAINTAINERS                           |   7 ++
 scripts/performance/topN_callgrind.py | 140 ++++++++++++++++++++++++
 scripts/performance/topN_perf.py      | 149 ++++++++++++++++++++++++++
 3 files changed, 296 insertions(+)
 create mode 100755 scripts/performance/topN_callgrind.py
 create mode 100755 scripts/performance/topN_perf.py

-- 
2.17.1




* [PATCH v4 1/3] scripts/performance: Add topN_perf.py script
  2020-06-26 16:45 [PATCH v4 0/3] Add Scripts for Finding Top 25 Executed Functions Ahmed Karaman
@ 2020-06-26 16:45 ` Ahmed Karaman
  2020-06-27 17:58   ` Aleksandar Markovic
  2020-06-26 16:45 ` [PATCH v4 2/3] scripts/performance: Add topN_callgrind.py script Ahmed Karaman
  2020-06-26 16:45 ` [PATCH v4 3/3] MAINTAINERS: Add 'Performance Tools and Tests'subsection Ahmed Karaman
  2 siblings, 1 reply; 7+ messages in thread
From: Ahmed Karaman @ 2020-06-26 16:45 UTC (permalink / raw)
  To: qemu-devel, aleksandar.qemu.devel, alex.bennee, rth, eblake,
	ldoktor, ehabkost, crosa
  Cc: Ahmed Karaman

Python script that prints the top N most executed functions in QEMU
using perf.

Syntax:
topN_perf.py [-h] [-n] <number of displayed top functions>  -- \
                 <qemu executable> [<qemu executable options>] \
                 <target executable> [<target executable options>]

[-h] - Print the script arguments help message.
[-n] - Specify the number of top functions to print.
     - If this flag is not specified, the tool defaults to 25.

Example of usage:
topN_perf.py -n 20 -- qemu-arm coulomb_double-arm

Example Output:
 No.  Percentage  Name                       Invoked by
----  ----------  -------------------------  -------------------------
   1      16.25%  float64_mul                qemu-x86_64
   2      12.01%  float64_sub                qemu-x86_64
   3      11.99%  float64_add                qemu-x86_64
   4       5.69%  helper_mulsd               qemu-x86_64
   5       4.68%  helper_addsd               qemu-x86_64
   6       4.43%  helper_lookup_tb_ptr       qemu-x86_64
   7       4.28%  helper_subsd               qemu-x86_64
   8       2.71%  f64_compare                qemu-x86_64
   9       2.71%  helper_ucomisd             qemu-x86_64
  10       1.04%  helper_pand_xmm            qemu-x86_64
  11       0.71%  float64_div                qemu-x86_64
  12       0.63%  helper_pxor_xmm            qemu-x86_64
  13       0.50%  0x00007f7b7004ef95         [JIT] tid 491
  14       0.50%  0x00007f7b70044e83         [JIT] tid 491
  15       0.36%  helper_por_xmm             qemu-x86_64
  16       0.32%  helper_cc_compute_all      qemu-x86_64
  17       0.30%  0x00007f7b700433f0         [JIT] tid 491
  18       0.30%  float64_compare_quiet      qemu-x86_64
  19       0.27%  soft_f64_addsub            qemu-x86_64
  20       0.26%  round_to_int               qemu-x86_64
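
For readers who want to see how a row of the table above can be derived, here
is a minimal sketch of the field extraction. It assumes that each data line of
"perf report --stdio" has the layout "<overhead> <command> <shared object>
[.] <symbol>"; the sample lines in the code are made up to match that assumed
layout, and the function names are hypothetical.

```python
def keep_data_lines(lines):
    """Drop blank lines and perf comment lines (those starting with '#')."""
    return [line for line in lines
            if line.strip() and not line.startswith("#")]


def parse_perf_line(line):
    """Split one report line into (percentage, function name, invoker)."""
    fields = line.split()
    percentage = fields[0]
    name = fields[-1]
    # Everything between the command and the "[.]" marker is the shared
    # object, which may contain spaces (for example "[JIT] tid 491").
    invoker = " ".join(fields[2:-2])
    return percentage, name, invoker


if __name__ == "__main__":
    # Hypothetical sample input in the assumed layout.
    report = [
        "# Overhead  Command  Shared Object  Symbol\n",
        "\n",
        "    16.25%  qemu-x86_64  qemu-x86_64  [.] float64_mul\n",
        "     0.50%  qemu-x86_64  [JIT] tid 491  [.] 0x00007f7b7004ef95\n",
    ]
    for index, line in enumerate(keep_data_lines(report), start=1):
        print("{:>4}  {:>10}  {:<30}  {}".format(index,
                                                 *parse_perf_line(line)))
```

Joining the middle fields rather than taking a single one is what keeps
multi-word shared object names such as "[JIT] tid 491" intact.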

Signed-off-by: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
---
 scripts/performance/topN_perf.py | 149 +++++++++++++++++++++++++++++++
 1 file changed, 149 insertions(+)
 create mode 100755 scripts/performance/topN_perf.py

diff --git a/scripts/performance/topN_perf.py b/scripts/performance/topN_perf.py
new file mode 100755
index 0000000000..07be195fc8
--- /dev/null
+++ b/scripts/performance/topN_perf.py
@@ -0,0 +1,149 @@
+#!/usr/bin/env python3
+
+#  Print the top N most executed functions in QEMU using perf.
+#  Syntax:
+#  topN_perf.py [-h] [-n] <number of displayed top functions>  -- \
+#           <qemu executable> [<qemu executable options>] \
+#           <target executable> [<target executable options>]
+#
+#  [-h] - Print the script arguments help message.
+#  [-n] - Specify the number of top functions to print.
+#       - If this flag is not specified, the tool defaults to 25.
+#
+#  Example of usage:
+#  topN_perf.py -n 20 -- qemu-arm coulomb_double-arm
+#
+#  This file is a part of the project "TCG Continuous Benchmarking".
+#
+#  Copyright (C) 2020  Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
+#  Copyright (C) 2020  Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
+#
+#  This program is free software: you can redistribute it and/or modify
+#  it under the terms of the GNU General Public License as published by
+#  the Free Software Foundation, either version 2 of the License, or
+#  (at your option) any later version.
+#
+#  This program is distributed in the hope that it will be useful,
+#  but WITHOUT ANY WARRANTY; without even the implied warranty of
+#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+#  GNU General Public License for more details.
+#
+#  You should have received a copy of the GNU General Public License
+#  along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+import argparse
+import os
+import subprocess
+import sys
+
+
+# Parse the command line arguments
+parser = argparse.ArgumentParser(
+    usage='topN_perf.py [-h] [-n] <number of displayed top functions>  -- '
+          '<qemu executable> [<qemu executable options>] '
+          '<target executable> [<target executable options>]')
+
+parser.add_argument('-n', dest='top', type=int, default=25,
+                    help='Specify the number of top functions to print.')
+
+parser.add_argument('command', type=str, nargs='+', help=argparse.SUPPRESS)
+
+args = parser.parse_args()
+
+# Extract the needed variables from the args
+command = args.command
+top = args.top
+
+# Ensure that perf is installed
+check_perf_presence = subprocess.run(["which", "perf"],
+                                     stdout=subprocess.DEVNULL)
+if check_perf_presence.returncode:
+    sys.exit("Please install perf before running the script!")
+
+# Ensure the user has the privilege to run perf
+check_perf_executability = subprocess.run(["perf", "stat", "ls", "/"],
+                                          stdout=subprocess.DEVNULL,
+                                          stderr=subprocess.DEVNULL)
+if check_perf_executability.returncode:
+    sys.exit(
+"""
+Error:
+You may not have permission to collect stats.
+
+Consider tweaking /proc/sys/kernel/perf_event_paranoid,
+which controls use of the performance events system by
+unprivileged users (without CAP_SYS_ADMIN).
+
+  -1: Allow use of (almost) all events by all users
+      Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
+   0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN
+      Disallow raw tracepoint access by users without CAP_SYS_ADMIN
+   1: Disallow CPU event access by users without CAP_SYS_ADMIN
+   2: Disallow kernel profiling by users without CAP_SYS_ADMIN
+
+To make this setting permanent, edit /etc/sysctl.conf too, e.g.:
+   kernel.perf_event_paranoid = -1
+
+* Alternatively, you can run this script under sudo privileges.
+"""
+)
+
+# Run perf record
+perf_record = subprocess.run((["perf", "record", "--output=/tmp/perf.data"] +
+                              command),
+                             stdout=subprocess.DEVNULL,
+                             stderr=subprocess.PIPE)
+if perf_record.returncode:
+    os.unlink('/tmp/perf.data')
+    sys.exit(perf_record.stderr.decode("utf-8"))
+
+# Save perf report output to /tmp/perf_report.out
+with open("/tmp/perf_report.out", "w") as output:
+    perf_report = subprocess.run(
+        ["perf", "report", "--input=/tmp/perf.data", "--stdio"],
+        stdout=output,
+        stderr=subprocess.PIPE)
+    if perf_report.returncode:
+        os.unlink('/tmp/perf.data')
+        output.close()
+        os.unlink('/tmp/perf_report.out')
+        sys.exit(perf_report.stderr.decode("utf-8"))
+
+# Read the reported data to functions[]
+functions = []
+with open("/tmp/perf_report.out", "r") as data:
+    # Only read lines that are not comments (comments start with #)
+    # Only read lines that are not empty
+    functions = [line for line in data.readlines() if line and line[0]
+                 != '#' and line[0] != "\n"]
+
+# Limit the number of top functions to "top"
+number_of_top_functions = min(top, len(functions))
+
+# Store the data of the top functions in top_functions[]
+top_functions = functions[:number_of_top_functions]
+
+# Print table header
+print('{:>4}  {:>10}  {:<30}  {}\n{}  {}  {}  {}'.format('No.',
+                                                         'Percentage',
+                                                         'Name',
+                                                         'Invoked by',
+                                                         '-' * 4,
+                                                         '-' * 10,
+                                                         '-' * 30,
+                                                         '-' * 25))
+
+# Print top N functions
+for (index, function) in enumerate(top_functions, start=1):
+    function_data = function.split()
+    function_percentage = function_data[0]
+    function_name = function_data[-1]
+    function_invoker = ' '.join(function_data[2:-2])
+    print('{:>4}  {:>10}  {:<30}  {}'.format(index,
+                                             function_percentage,
+                                             function_name,
+                                             function_invoker))
+
+# Remove intermediate files
+os.unlink('/tmp/perf.data')
+os.unlink('/tmp/perf_report.out')
-- 
2.17.1




* [PATCH v4 2/3] scripts/performance: Add topN_callgrind.py script
  2020-06-26 16:45 [PATCH v4 0/3] Add Scripts for Finding Top 25 Executed Functions Ahmed Karaman
  2020-06-26 16:45 ` [PATCH v4 1/3] scripts/performance: Add topN_perf.py script Ahmed Karaman
@ 2020-06-26 16:45 ` Ahmed Karaman
  2020-06-27 17:57   ` Aleksandar Markovic
  2020-06-26 16:45 ` [PATCH v4 3/3] MAINTAINERS: Add 'Performance Tools and Tests'subsection Ahmed Karaman
  2 siblings, 1 reply; 7+ messages in thread
From: Ahmed Karaman @ 2020-06-26 16:45 UTC (permalink / raw)
  To: qemu-devel, aleksandar.qemu.devel, alex.bennee, rth, eblake,
	ldoktor, ehabkost, crosa
  Cc: Ahmed Karaman

Python script that prints the top N most executed functions in QEMU
using callgrind.

Syntax:
topN_callgrind.py [-h] [-n] <number of displayed top functions>  -- \
                      <qemu executable> [<qemu executable options>] \
                      <target executable> [<target executable options>]

[-h] - Print the script arguments help message.
[-n] - Specify the number of top functions to print.
     - If this flag is not specified, the tool defaults to 25.

Example of usage:
topN_callgrind.py -n 20 -- qemu-arm coulomb_double-arm

Example Output:
No.  Percentage Function Name         Source File
----  --------- ------------------    ------------------------------
   1    24.577% 0x00000000082db000    ???
   2    20.467% float64_mul           <qemu>/fpu/softfloat.c
   3    14.720% float64_sub           <qemu>/fpu/softfloat.c
   4    13.864% float64_add           <qemu>/fpu/softfloat.c
   5     4.876% helper_mulsd          <qemu>/target/i386/ops_sse.h
   6     3.767% helper_subsd          <qemu>/target/i386/ops_sse.h
   7     3.549% helper_addsd          <qemu>/target/i386/ops_sse.h
   8     2.185% helper_ucomisd        <qemu>/target/i386/ops_sse.h
   9     1.667% helper_lookup_tb_ptr  <qemu>/include/exec/tb-lookup.h
  10     1.662% f64_compare           <qemu>/fpu/softfloat.c
  11     1.509% helper_lookup_tb_ptr  <qemu>/accel/tcg/tcg-runtime.c
  12     0.635% helper_lookup_tb_ptr  <qemu>/include/exec/exec-all.h
  13     0.616% float64_div           <qemu>/fpu/softfloat.c
  14     0.502% helper_pand_xmm       <qemu>/target/i386/ops_sse.h
  15     0.502% float64_mul           <qemu>/include/fpu/softfloat.h
  16     0.476% helper_lookup_tb_ptr  <qemu>/target/i386/cpu.h
  17     0.437% float64_compare_quiet <qemu>/fpu/softfloat.c
  18     0.414% helper_pxor_xmm       <qemu>/target/i386/ops_sse.h
  19     0.353% round_to_int          <qemu>/fpu/softfloat.c
  20     0.347% helper_cc_compute_all <qemu>/target/i386/cc_helper.c
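
The percentages in the table above are instruction counts divided by the
program total. Here is a minimal sketch of that calculation, assuming each
function line of callgrind_annotate has the layout "<instruction count>
<source file>:<function name> [<object>]". The layout may differ between
valgrind versions, and the sample numbers and the function name
parse_callgrind_line are hypothetical.

```python
def parse_callgrind_line(line, total_instructions):
    """Return (percentage, function name, source file) for one line."""
    fields = line.split()
    # Instruction counts are printed with thousands separators.
    instructions = int(fields[0].replace(",", ""))
    # Split only on the first ':' so the function name is kept whole.
    source_file, name = fields[1].split(":", 1)
    percentage = instructions / total_instructions * 100
    return percentage, name, source_file


if __name__ == "__main__":
    # Hypothetical sample: 2,000 of 8,000 instructions in float64_mul.
    sample = "2,000  <qemu>/fpu/softfloat.c:float64_mul [qemu-x86_64]"
    percentage, name, source_file = parse_callgrind_line(sample, 8000)
    print("{:>9.3f}%  {:<30}  {}".format(percentage, name, source_file))
```

The same function can appear more than once in the table because callgrind
attributes inlined code to the source file it came from, as the repeated
helper_lookup_tb_ptr rows show.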

Signed-off-by: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
---
 scripts/performance/topN_callgrind.py | 140 ++++++++++++++++++++++++++
 1 file changed, 140 insertions(+)
 create mode 100755 scripts/performance/topN_callgrind.py

diff --git a/scripts/performance/topN_callgrind.py b/scripts/performance/topN_callgrind.py
new file mode 100755
index 0000000000..67c59197af
--- /dev/null
+++ b/scripts/performance/topN_callgrind.py
@@ -0,0 +1,140 @@
+#!/usr/bin/env python3
+
+#  Print the top N most executed functions in QEMU using callgrind.
+#  Syntax:
+#  topN_callgrind.py [-h] [-n] <number of displayed top functions>  -- \
+#           <qemu executable> [<qemu executable options>] \
+#           <target executable> [<target executable options>]
+#
+#  [-h] - Print the script arguments help message.
+#  [-n] - Specify the number of top functions to print.
+#       - If this flag is not specified, the tool defaults to 25.
+#
+#  Example of usage:
+#  topN_callgrind.py -n 20 -- qemu-arm coulomb_double-arm
+#
+#  This file is a part of the project "TCG Continuous Benchmarking".
+#
+#  Copyright (C) 2020  Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
+#  Copyright (C) 2020  Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
+#
+#  This program is free software: you can redistribute it and/or modify
+#  it under the terms of the GNU General Public License as published by
+#  the Free Software Foundation, either version 2 of the License, or
+#  (at your option) any later version.
+#
+#  This program is distributed in the hope that it will be useful,
+#  but WITHOUT ANY WARRANTY; without even the implied warranty of
+#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+#  GNU General Public License for more details.
+#
+#  You should have received a copy of the GNU General Public License
+#  along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+import argparse
+import os
+import subprocess
+import sys
+
+
+# Parse the command line arguments
+parser = argparse.ArgumentParser(
+    usage='topN_callgrind.py [-h] [-n] <number of displayed top functions>  -- '
+          '<qemu executable> [<qemu executable options>] '
+          '<target executable> [<target executable options>]')
+
+parser.add_argument('-n', dest='top', type=int, default=25,
+                    help='Specify the number of top functions to print.')
+
+parser.add_argument('command', type=str, nargs='+', help=argparse.SUPPRESS)
+
+args = parser.parse_args()
+
+# Extract the needed variables from the args
+command = args.command
+top = args.top
+
+# Ensure that valgrind is installed
+check_valgrind_presence = subprocess.run(["which", "valgrind"],
+                                         stdout=subprocess.DEVNULL)
+if check_valgrind_presence.returncode:
+    sys.exit("Please install valgrind before running the script!")
+
+# Run callgrind
+callgrind = subprocess.run((
+    ["valgrind", "--tool=callgrind", "--callgrind-out-file=/tmp/callgrind.data"]
+    + command),
+    stdout=subprocess.DEVNULL,
+    stderr=subprocess.PIPE)
+if callgrind.returncode:
+    sys.exit(callgrind.stderr.decode("utf-8"))
+
+# Save callgrind_annotate output to /tmp/callgrind_annotate.out
+with open("/tmp/callgrind_annotate.out", "w") as output:
+    callgrind_annotate = subprocess.run(["callgrind_annotate",
+                                         "/tmp/callgrind.data"],
+                                        stdout=output,
+                                        stderr=subprocess.PIPE)
+    if callgrind_annotate.returncode:
+        os.unlink('/tmp/callgrind.data')
+        output.close()
+        os.unlink('/tmp/callgrind_annotate.out')
+        sys.exit(callgrind_annotate.stderr.decode("utf-8"))
+
+# Read the callgrind_annotate output to callgrind_data[]
+callgrind_data = []
+with open('/tmp/callgrind_annotate.out', 'r') as data:
+    callgrind_data = data.readlines()
+
+# Line number with the total number of instructions
+total_instructions_line_number = 20
+
+# Get the total number of instructions
+total_instructions_line_data = callgrind_data[total_instructions_line_number]
+total_number_of_instructions = total_instructions_line_data.split(' ')[0]
+total_number_of_instructions = int(
+    total_number_of_instructions.replace(',', ''))
+
+# Line number with the top function
+first_func_line = 25
+
+# Number of functions recorded by callgrind, last two lines are always empty
+number_of_functions = len(callgrind_data) - first_func_line - 2
+
+# Limit the number of top functions to "top"
+number_of_top_functions = (top if number_of_functions >
+                           top else number_of_functions)
+
+# Store the data of the top functions in top_functions[]
+top_functions = callgrind_data[first_func_line:
+                               first_func_line + number_of_top_functions]
+
+# Print table header
+print('{:>4}  {:>10}  {:<30}  {}\n{}  {}  {}  {}'.format('No.',
+                                                         'Percentage',
+                                                         'Function Name',
+                                                         'Source File',
+                                                         '-' * 4,
+                                                         '-' * 10,
+                                                         '-' * 30,
+                                                         '-' * 30,
+                                                         ))
+
+# Print top N functions
+for (index, function) in enumerate(top_functions, start=1):
+    function_data = function.split()
+    # Calculate function percentage
+    function_instructions = float(function_data[0].replace(',', ''))
+    function_percentage = (function_instructions /
+                           total_number_of_instructions)*100
+    # Get function name and source files path
+    function_source_file, function_name = function_data[1].split(':')
+    # Print extracted data
+    print('{:>4}  {:>9.3f}%  {:<30}  {}'.format(index,
+                                                round(function_percentage, 3),
+                                                function_name,
+                                                function_source_file))
+
+# Remove intermediate files
+os.unlink('/tmp/callgrind.data')
+os.unlink('/tmp/callgrind_annotate.out')
-- 
2.17.1




* [PATCH v4 3/3] MAINTAINERS: Add 'Performance Tools and Tests'subsection
  2020-06-26 16:45 [PATCH v4 0/3] Add Scripts for Finding Top 25 Executed Functions Ahmed Karaman
  2020-06-26 16:45 ` [PATCH v4 1/3] scripts/performance: Add topN_perf.py script Ahmed Karaman
  2020-06-26 16:45 ` [PATCH v4 2/3] scripts/performance: Add topN_callgrind.py script Ahmed Karaman
@ 2020-06-26 16:45 ` Ahmed Karaman
  2020-06-27 17:58   ` Aleksandar Markovic
  2 siblings, 1 reply; 7+ messages in thread
From: Ahmed Karaman @ 2020-06-26 16:45 UTC (permalink / raw)
  To: qemu-devel, aleksandar.qemu.devel, alex.bennee, rth, eblake,
	ldoktor, ehabkost, crosa
  Cc: Ahmed Karaman

This commit creates a new 'Miscellaneous' section which hosts a new
'Performance Tools and Tests' subsection.
The subsection will contain the performance scripts and benchmarks
written as a part of the 'TCG Continuous Benchmarking' project.

Signed-off-by: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
---
 MAINTAINERS | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 1b40446c73..c510c942ac 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3019,3 +3019,10 @@ M: Peter Maydell <peter.maydell@linaro.org>
 S: Maintained
 F: docs/conf.py
 F: docs/*/conf.py
+
+Miscellaneous
+-------------
+Performance Tools and Tests
+M: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
+S: Maintained
+F: scripts/performance/
-- 
2.17.1




* Re: [PATCH v4 2/3] scripts/performance: Add topN_callgrind.py script
  2020-06-26 16:45 ` [PATCH v4 2/3] scripts/performance: Add topN_callgrind.py script Ahmed Karaman
@ 2020-06-27 17:57   ` Aleksandar Markovic
  0 siblings, 0 replies; 7+ messages in thread
From: Aleksandar Markovic @ 2020-06-27 17:57 UTC (permalink / raw)
  To: Ahmed Karaman
  Cc: Lukáš Doktor, Eduardo Habkost, Alex Bennée,
	QEMU Developers, Aleksandar Markovic, Cleber Rosa,
	Richard Henderson

On Fri, Jun 26, 2020 at 6:59 PM Ahmed Karaman
<ahmedkhaledkaraman@gmail.com> wrote:
>
> Python script that prints the top N most executed functions in QEMU
> using callgrind.
>
> Syntax:
> topN_callgrind.py [-h] [-n] <number of displayed top functions>  -- \
>                       <qemu executable> [<qemu executable options>] \
>                       <target executable> [<target executable options>]
>
> [-h] - Print the script arguments help message.
> [-n] - Specify the number of top functions to print.
>      - If this flag is not specified, the tool defaults to 25.
>
> Example of usage:
> topN_callgrind.py -n 20 -- qemu-arm coulomb_double-arm
>
> Example Output:
> No.  Percentage Function Name         Source File
> ----  --------- ------------------    ------------------------------
>    1    24.577% 0x00000000082db000    ???
>    2    20.467% float64_mul           <qemu>/fpu/softfloat.c
>    3    14.720% float64_sub           <qemu>/fpu/softfloat.c
>    4    13.864% float64_add           <qemu>/fpu/softfloat.c
>    5     4.876% helper_mulsd          <qemu>/target/i386/ops_sse.h
>    6     3.767% helper_subsd          <qemu>/target/i386/ops_sse.h
>    7     3.549% helper_addsd          <qemu>/target/i386/ops_sse.h
>    8     2.185% helper_ucomisd        <qemu>/target/i386/ops_sse.h
>    9     1.667% helper_lookup_tb_ptr  <qemu>/include/exec/tb-lookup.h
>   10     1.662% f64_compare           <qemu>/fpu/softfloat.c
>   11     1.509% helper_lookup_tb_ptr  <qemu>/accel/tcg/tcg-runtime.c
>   12     0.635% helper_lookup_tb_ptr  <qemu>/include/exec/exec-all.h
>   13     0.616% float64_div           <qemu>/fpu/softfloat.c
>   14     0.502% helper_pand_xmm       <qemu>/target/i386/ops_sse.h
>   15     0.502% float64_mul           <qemu>/include/fpu/softfloat.h
>   16     0.476% helper_lookup_tb_ptr  <qemu>/target/i386/cpu.h
>   17     0.437% float64_compare_quiet <qemu>/fpu/softfloat.c
>   18     0.414% helper_pxor_xmm       <qemu>/target/i386/ops_sse.h
>   19     0.353% round_to_int          <qemu>/fpu/softfloat.c
>   20     0.347% helper_cc_compute_all <qemu>/target/i386/cc_helper.c
>
> Signed-off-by: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
> ---

Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>

Applied to "TCG Continuous Benchmarking" queue.

>  scripts/performance/topN_callgrind.py | 140 ++++++++++++++++++++++++++
>  1 file changed, 140 insertions(+)
>  create mode 100755 scripts/performance/topN_callgrind.py
>
> diff --git a/scripts/performance/topN_callgrind.py b/scripts/performance/topN_callgrind.py
> new file mode 100755
> index 0000000000..67c59197af
> --- /dev/null
> +++ b/scripts/performance/topN_callgrind.py
> @@ -0,0 +1,140 @@
> +#!/usr/bin/env python3
> +
> +#  Print the top N most executed functions in QEMU using callgrind.
> +#  Syntax:
> +#  topN_callgrind.py [-h] [-n] <number of displayed top functions>  -- \
> +#           <qemu executable> [<qemu executable options>] \
> +#           <target executable> [<target executable options>]
> +#
> +#  [-h] - Print the script arguments help message.
> +#  [-n] - Specify the number of top functions to print.
> +#       - If this flag is not specified, the tool defaults to 25.
> +#
> +#  Example of usage:
> +#  topN_callgrind.py -n 20 -- qemu-arm coulomb_double-arm
> +#
> +#  This file is a part of the project "TCG Continuous Benchmarking".
> +#
> +#  Copyright (C) 2020  Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
> +#  Copyright (C) 2020  Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
> +#
> +#  This program is free software: you can redistribute it and/or modify
> +#  it under the terms of the GNU General Public License as published by
> +#  the Free Software Foundation, either version 2 of the License, or
> +#  (at your option) any later version.
> +#
> +#  This program is distributed in the hope that it will be useful,
> +#  but WITHOUT ANY WARRANTY; without even the implied warranty of
> +#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +#  GNU General Public License for more details.
> +#
> +#  You should have received a copy of the GNU General Public License
> +#  along with this program. If not, see <https://www.gnu.org/licenses/>.
> +
> +import argparse
> +import os
> +import subprocess
> +import sys
> +
> +
> +# Parse the command line arguments
> +parser = argparse.ArgumentParser(
> +    usage='topN_callgrind.py [-h] [-n] <number of displayed top functions>  -- '
> +          '<qemu executable> [<qemu executable options>] '
> +          '<target executable> [<target executable options>]')
> +
> +parser.add_argument('-n', dest='top', type=int, default=25,
> +                    help='Specify the number of top functions to print.')
> +
> +parser.add_argument('command', type=str, nargs='+', help=argparse.SUPPRESS)
> +
> +args = parser.parse_args()
> +
> +# Extract the needed variables from the args
> +command = args.command
> +top = args.top
> +
> +# Ensure that valgrind is installed
> +check_valgrind_presence = subprocess.run(["which", "valgrind"],
> +                                         stdout=subprocess.DEVNULL)
> +if check_valgrind_presence.returncode:
> +    sys.exit("Please install valgrind before running the script!")
> +
> +# Run callgrind
> +callgrind = subprocess.run((
> +    ["valgrind", "--tool=callgrind", "--callgrind-out-file=/tmp/callgrind.data"]
> +    + command),
> +    stdout=subprocess.DEVNULL,
> +    stderr=subprocess.PIPE)
> +if callgrind.returncode:
> +    sys.exit(callgrind.stderr.decode("utf-8"))
> +
> +# Save callgrind_annotate output to /tmp/callgrind_annotate.out
> +with open("/tmp/callgrind_annotate.out", "w") as output:
> +    callgrind_annotate = subprocess.run(["callgrind_annotate",
> +                                         "/tmp/callgrind.data"],
> +                                        stdout=output,
> +                                        stderr=subprocess.PIPE)
> +    if callgrind_annotate.returncode:
> +        os.unlink('/tmp/callgrind.data')
> +        output.close()
> +        os.unlink('/tmp/callgrind_annotate.out')
> +        sys.exit(callgrind_annotate.stderr.decode("utf-8"))
> +
> +# Read the callgrind_annotate output to callgrind_data[]
> +callgrind_data = []
> +with open('/tmp/callgrind_annotate.out', 'r') as data:
> +    callgrind_data = data.readlines()
> +
> +# Line number with the total number of instructions
> +total_instructions_line_number = 20
> +
> +# Get the total number of instructions
> +total_instructions_line_data = callgrind_data[total_instructions_line_number]
> +total_number_of_instructions = total_instructions_line_data.split(' ')[0]
> +total_number_of_instructions = int(
> +    total_number_of_instructions.replace(',', ''))
> +
> +# Line number with the top function
> +first_func_line = 25
> +
> +# Number of functions recorded by callgrind, last two lines are always empty
> +number_of_functions = len(callgrind_data) - first_func_line - 2
> +
> +# Limit the number of top functions to "top"
> +number_of_top_functions = (top if number_of_functions >
> +                           top else number_of_functions)
> +
> +# Store the data of the top functions in top_functions[]
> +top_functions = callgrind_data[first_func_line:
> +                               first_func_line + number_of_top_functions]
> +
> +# Print table header
> +print('{:>4}  {:>10}  {:<30}  {}\n{}  {}  {}  {}'.format('No.',
> +                                                         'Percentage',
> +                                                         'Function Name',
> +                                                         'Source File',
> +                                                         '-' * 4,
> +                                                         '-' * 10,
> +                                                         '-' * 30,
> +                                                         '-' * 30,
> +                                                         ))
> +
> +# Print top N functions
> +for (index, function) in enumerate(top_functions, start=1):
> +    function_data = function.split()
> +    # Calculate function percentage
> +    function_instructions = float(function_data[0].replace(',', ''))
> +    function_percentage = (function_instructions /
> +                           total_number_of_instructions)*100
> +    # Get function name and source files path
> +    function_source_file, function_name = function_data[1].split(':')
> +    # Print extracted data
> +    print('{:>4}  {:>9.3f}%  {:<30}  {}'.format(index,
> +                                                round(function_percentage, 3),
> +                                                function_name,
> +                                                function_source_file))
> +
> +# Remove intermediate files
> +os.unlink('/tmp/callgrind.data')
> +os.unlink('/tmp/callgrind_annotate.out')
> --
> 2.17.1
>
>
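
For reference, the row handling at the end of the script can be tried in
isolation. This is only a minimal sketch, assuming each callgrind_annotate
row has the form "<instruction count> <source file>:<function> [<object>]".
The helper name parse_callgrind_row, the sample row and the total of 6000
instructions are hypothetical and are not part of the patch.

```python
def parse_callgrind_row(row, total_instructions):
    """Return (percentage, function name, source file) for one row."""
    fields = row.split()
    # The instruction count uses thousands separators, e.g. "1,500".
    instructions = float(fields[0].replace(',', ''))
    percentage = instructions / total_instructions * 100
    # Split on the first ':' only, so a second ':' later in the field
    # does not raise ValueError during unpacking (the patch itself uses
    # a plain split(':')).
    source_file, function_name = fields[1].split(':', 1)
    return percentage, function_name, source_file


# Hypothetical sample row, used for illustration only.
row = '1,500  /src/fpu/softfloat.c:float64_mul [/usr/bin/qemu-x86_64]'
percentage, name, source = parse_callgrind_row(row, 6000)
print('{:>9.3f}%  {:<30}  {}'.format(percentage, name, source))
```

Passing the percentage straight to the '{:>9.3f}' format already rounds it
to three decimals, so the extra round() call in the patch is harmless but
probably not required.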



* Re: [PATCH v4 1/3] scripts/performance: Add topN_perf.py script
  2020-06-26 16:45 ` [PATCH v4 1/3] scripts/performance: Add topN_perf.py script Ahmed Karaman
@ 2020-06-27 17:58   ` Aleksandar Markovic
  0 siblings, 0 replies; 7+ messages in thread
From: Aleksandar Markovic @ 2020-06-27 17:58 UTC (permalink / raw)
  To: Ahmed Karaman
  Cc: Lukáš Doktor, Eduardo Habkost, Alex Bennée,
	QEMU Developers, Aleksandar Markovic, Cleber Rosa,
	Richard Henderson

On Fri, Jun 26, 2020 at 7:00 PM Ahmed Karaman
<ahmedkhaledkaraman@gmail.com> wrote:
>
> Syntax:
> topN_perf.py [-h] [-n] <number of displayed top functions>  -- \
>                  <qemu executable> [<qemu executable options>] \
>                  <target executable> [<target executable options>]
>
> [-h] - Print the script arguments help message.
> [-n] - Specify the number of top functions to print.
>      - If this flag is not specified, the tool defaults to 25.
>
> Example of usage:
> topN_perf.py -n 20 -- qemu-arm coulomb_double-arm
>
> Example Output:
>  No.  Percentage  Name                       Invoked by
> ----  ----------  -------------------------  -------------------------
>    1      16.25%  float64_mul                qemu-x86_64
>    2      12.01%  float64_sub                qemu-x86_64
>    3      11.99%  float64_add                qemu-x86_64
>    4       5.69%  helper_mulsd               qemu-x86_64
>    5       4.68%  helper_addsd               qemu-x86_64
>    6       4.43%  helper_lookup_tb_ptr       qemu-x86_64
>    7       4.28%  helper_subsd               qemu-x86_64
>    8       2.71%  f64_compare                qemu-x86_64
>    9       2.71%  helper_ucomisd             qemu-x86_64
>   10       1.04%  helper_pand_xmm            qemu-x86_64
>   11       0.71%  float64_div                qemu-x86_64
>   12       0.63%  helper_pxor_xmm            qemu-x86_64
>   13       0.50%  0x00007f7b7004ef95         [JIT] tid 491
>   14       0.50%  0x00007f7b70044e83         [JIT] tid 491
>   15       0.36%  helper_por_xmm             qemu-x86_64
>   16       0.32%  helper_cc_compute_all      qemu-x86_64
>   17       0.30%  0x00007f7b700433f0         [JIT] tid 491
>   18       0.30%  float64_compare_quiet      qemu-x86_64
>   19       0.27%  soft_f64_addsub            qemu-x86_64
>   20       0.26%  round_to_int               qemu-x86_64
>
> Signed-off-by: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
> ---

Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>

Applied to "TCG Continuous Benchmarking" queue.
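
For readers following the syntax in the commit message, here is a minimal
sketch of how argparse treats the "--" separator with the two arguments the
script declares. The argument lists passed to parse_args() are illustrative
only, and the help and usage strings from the patch are omitted.

```python
import argparse

# Same two arguments as in the script, without the help/usage strings.
parser = argparse.ArgumentParser()
parser.add_argument('-n', dest='top', type=int, default=25)
parser.add_argument('command', type=str, nargs='+')

# "--" marks the end of the script's own options; everything after it
# is collected into the positional "command" list.
args = parser.parse_args(['-n', '20', '--', 'qemu-arm', 'coulomb_double-arm'])
print(args.top, args.command)

# Without -n, the default of 25 applies.
defaults = parser.parse_args(['--', 'qemu-arm', 'coulomb_double-arm'])
print(defaults.top, defaults.command)
```

The separator matters once QEMU options are passed: without it, argparse
would try to interpret an option meant for QEMU as one of its own.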

>  scripts/performance/topN_perf.py | 149 +++++++++++++++++++++++++++++++
>  1 file changed, 149 insertions(+)
>  create mode 100755 scripts/performance/topN_perf.py
>
> diff --git a/scripts/performance/topN_perf.py b/scripts/performance/topN_perf.py
> new file mode 100755
> index 0000000000..07be195fc8
> --- /dev/null
> +++ b/scripts/performance/topN_perf.py
> @@ -0,0 +1,149 @@
> +#!/usr/bin/env python3
> +
> +#  Print the top N most executed functions in QEMU using perf.
> +#  Syntax:
> +#  topN_perf.py [-h] [-n] <number of displayed top functions>  -- \
> +#           <qemu executable> [<qemu executable options>] \
> +#           <target executable> [<target executable options>]
> +#
> +#  [-h] - Print the script arguments help message.
> +#  [-n] - Specify the number of top functions to print.
> +#       - If this flag is not specified, the tool defaults to 25.
> +#
> +#  Example of usage:
> +#  topN_perf.py -n 20 -- qemu-arm coulomb_double-arm
> +#
> +#  This file is a part of the project "TCG Continuous Benchmarking".
> +#
> +#  Copyright (C) 2020  Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
> +#  Copyright (C) 2020  Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
> +#
> +#  This program is free software: you can redistribute it and/or modify
> +#  it under the terms of the GNU General Public License as published by
> +#  the Free Software Foundation, either version 2 of the License, or
> +#  (at your option) any later version.
> +#
> +#  This program is distributed in the hope that it will be useful,
> +#  but WITHOUT ANY WARRANTY; without even the implied warranty of
> +#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +#  GNU General Public License for more details.
> +#
> +#  You should have received a copy of the GNU General Public License
> +#  along with this program. If not, see <https://www.gnu.org/licenses/>.
> +
> +import argparse
> +import os
> +import subprocess
> +import sys
> +
> +
> +# Parse the command line arguments
> +parser = argparse.ArgumentParser(
> +    usage='topN_perf.py [-h] [-n] <number of displayed top functions>  -- '
> +          '<qemu executable> [<qemu executable options>] '
> +          '<target executable> [<target executable options>]')
> +
> +parser.add_argument('-n', dest='top', type=int, default=25,
> +                    help='Specify the number of top functions to print.')
> +
> +parser.add_argument('command', type=str, nargs='+', help=argparse.SUPPRESS)
> +
> +args = parser.parse_args()
> +
> +# Extract the needed variables from the args
> +command = args.command
> +top = args.top
> +
> +# Ensure that perf is installed
> +check_perf_presence = subprocess.run(["which", "perf"],
> +                                     stdout=subprocess.DEVNULL)
> +if check_perf_presence.returncode:
> +    sys.exit("Please install perf before running the script!")
> +
> +# Ensure the user has the privileges to run perf
> +check_perf_executability = subprocess.run(["perf", "stat", "ls", "/"],
> +                                          stdout=subprocess.DEVNULL,
> +                                          stderr=subprocess.DEVNULL)
> +if check_perf_executability.returncode:
> +    sys.exit(
> +"""
> +Error:
> +You may not have permission to collect stats.
> +
> +Consider tweaking /proc/sys/kernel/perf_event_paranoid,
> +which controls use of the performance events system by
> +unprivileged users (without CAP_SYS_ADMIN).
> +
> +  -1: Allow use of (almost) all events by all users
> +      Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
> +   0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN
> +      Disallow raw tracepoint access by users without CAP_SYS_ADMIN
> +   1: Disallow CPU event access by users without CAP_SYS_ADMIN
> +   2: Disallow kernel profiling by users without CAP_SYS_ADMIN
> +
> +To make this setting permanent, edit /etc/sysctl.conf too, e.g.:
> +   kernel.perf_event_paranoid = -1
> +
> +* Alternatively, you can run this script under sudo privileges.
> +"""
> +)
> +
> +# Run perf record
> +perf_record = subprocess.run((["perf", "record", "--output=/tmp/perf.data"] +
> +                              command),
> +                             stdout=subprocess.DEVNULL,
> +                             stderr=subprocess.PIPE)
> +if perf_record.returncode:
> +    os.unlink('/tmp/perf.data')
> +    sys.exit(perf_record.stderr.decode("utf-8"))
> +
> +# Save perf report output to /tmp/perf_report.out
> +with open("/tmp/perf_report.out", "w") as output:
> +    perf_report = subprocess.run(
> +        ["perf", "report", "--input=/tmp/perf.data", "--stdio"],
> +        stdout=output,
> +        stderr=subprocess.PIPE)
> +    if perf_report.returncode:
> +        os.unlink('/tmp/perf.data')
> +        output.close()
> +        os.unlink('/tmp/perf_report.out')
> +        sys.exit(perf_report.stderr.decode("utf-8"))
> +
> +# Read the reported data to functions[]
> +functions = []
> +with open("/tmp/perf_report.out", "r") as data:
> +    # Only read lines that are not comments (comments start with #)
> +    # Only read lines that are not empty
> +    functions = [line for line in data.readlines()
> +                 if line and line[0] != '#' and line[0] != "\n"]
> +
> +# Limit the number of top functions to "top"
> +number_of_top_functions = top if len(functions) > top else len(functions)
> +
> +# Store the data of the top functions in top_functions[]
> +top_functions = functions[:number_of_top_functions]
> +
> +# Print table header
> +print('{:>4}  {:>10}  {:<30}  {}\n{}  {}  {}  {}'.format('No.',
> +                                                         'Percentage',
> +                                                         'Name',
> +                                                         'Invoked by',
> +                                                         '-' * 4,
> +                                                         '-' * 10,
> +                                                         '-' * 30,
> +                                                         '-' * 25))
> +
> +# Print top N functions
> +for (index, function) in enumerate(top_functions, start=1):
> +    function_data = function.split()
> +    function_percentage = function_data[0]
> +    function_name = function_data[-1]
> +    function_invoker = ' '.join(function_data[2:-2])
> +    print('{:>4}  {:>10}  {:<30}  {}'.format(index,
> +                                             function_percentage,
> +                                             function_name,
> +                                             function_invoker))
> +
> +# Remove intermediate files
> +os.unlink('/tmp/perf.data')
> +os.unlink('/tmp/perf_report.out')
> --
> 2.17.1
>
>
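
For reference, the field extraction in the final loop can be checked against
a couple of rows. This is a minimal sketch, assuming "perf report --stdio"
prints "<overhead> <command> <shared object> [.] <symbol>" rows preceded by
'#' comment lines. The helper name parse_perf_report_row and the sample
report text are hypothetical and are not taken from a real perf run.

```python
def parse_perf_report_row(row):
    """Return (percentage, function name, invoker) for one report row."""
    fields = row.split()
    percentage = fields[0]
    name = fields[-1]
    # Skip the command column (index 1) and the "[.]" marker (index -2);
    # what remains is the shared object, which may contain spaces.
    invoker = ' '.join(fields[2:-2])
    return percentage, name, invoker


# Hypothetical report excerpt, used for illustration only.
report = """\
# Overhead  Command      Shared Object  Symbol
# ........  ...........  .............  ......

    16.25%  qemu-x86_64  qemu-x86_64    [.] float64_mul
     0.50%  qemu-x86_64  [JIT] tid 491  [.] 0x00007f7b7004ef95
"""

# Drop comment lines and blank lines, as the script does.
rows = [line for line in report.splitlines()
        if line.strip() and not line.startswith('#')]
parsed = [parse_perf_report_row(row) for row in rows]

for index, (percentage, name, invoker) in enumerate(parsed, start=1):
    print('{:>4}  {:>10}  {:<30}  {}'.format(index, percentage,
                                             name, invoker))
```

Joining fields[2:-2] is what keeps a multi-word shared object such as
"[JIT] tid 491" in a single "Invoked by" column.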



* Re: [PATCH v4 3/3] MAINTAINERS: Add 'Performance Tools and Tests'subsection
  2020-06-26 16:45 ` [PATCH v4 3/3] MAINTAINERS: Add 'Performance Tools and Tests'subsection Ahmed Karaman
@ 2020-06-27 17:58   ` Aleksandar Markovic
  0 siblings, 0 replies; 7+ messages in thread
From: Aleksandar Markovic @ 2020-06-27 17:58 UTC (permalink / raw)
  To: Ahmed Karaman
  Cc: Lukáš Doktor, Eduardo Habkost, Alex Bennée,
	QEMU Developers, Aleksandar Markovic, Cleber Rosa,
	Richard Henderson

On Fri, Jun 26, 2020 at 7:02 PM Ahmed Karaman
<ahmedkhaledkaraman@gmail.com> wrote:
>
> This commit creates a new 'Miscellaneous' section which hosts a new
> 'Performance Tools and Tests' subsection.
> The subsection will contain the performance scripts and benchmarks
> written as a part of the 'TCG Continuous Benchmarking' project.
>
> Signed-off-by: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
> ---

Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>

Applied to "TCG Continuous Benchmarking" queue.

>  MAINTAINERS | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1b40446c73..c510c942ac 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3019,3 +3019,10 @@ M: Peter Maydell <peter.maydell@linaro.org>
>  S: Maintained
>  F: docs/conf.py
>  F: docs/*/conf.py
> +
> +Miscellaneous
> +-------------
> +Performance Tools and Tests
> +M: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
> +S: Maintained
> +F: scripts/performance/
> --
> 2.17.1
>
>



Thread overview: 7+ messages
2020-06-26 16:45 [PATCH v4 0/3] Add Scripts for Finding Top 25 Executed Functions Ahmed Karaman
2020-06-26 16:45 ` [PATCH v4 1/3] scripts/performance: Add topN_perf.py script Ahmed Karaman
2020-06-27 17:58   ` Aleksandar Markovic
2020-06-26 16:45 ` [PATCH v4 2/3] scripts/performance: Add topN_callgrind.py script Ahmed Karaman
2020-06-27 17:57   ` Aleksandar Markovic
2020-06-26 16:45 ` [PATCH v4 3/3] MAINTAINERS: Add 'Performance Tools and Tests'subsection Ahmed Karaman
2020-06-27 17:58   ` Aleksandar Markovic
