* [PATCH v4 0/3] Add Scripts for Finding Top 25 Executed Functions
@ 2020-06-26 16:45 Ahmed Karaman
2020-06-26 16:45 ` [PATCH v4 1/3] scripts/performance: Add topN_perf.py script Ahmed Karaman
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Ahmed Karaman @ 2020-06-26 16:45 UTC (permalink / raw)
To: qemu-devel, aleksandar.qemu.devel, alex.bennee, rth, eblake,
ldoktor, ehabkost, crosa
Cc: Ahmed Karaman
Greetings,
As a part of the TCG Continuous Benchmarking project for GSoC this
year, detailed reports discussing different performance measurement
methodologies and analysis results will be sent here on the mailing
list.
The project's first report was published on the mailing list on the
22nd of June:
https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg06692.html
A section in this report deals with measuring the top 25 executed
functions when running QEMU. It includes two Python scripts that
automatically perform this task.
This series adds these two scripts to a new performance directory
created under the scripts directory. It also adds a new
"Miscellaneous" section to the end of the MAINTAINERS file with a
"Performance Tools and Tests" subsection.
Previous versions of the series:
v3:
https://lists.nongnu.org/archive/html/qemu-devel/2020-06/msg07856.html
v2:
https://lists.nongnu.org/archive/html/qemu-devel/2020-06/msg06147.html
v1:
https://lists.nongnu.org/archive/html/qemu-devel/2020-06/msg04868.html
Best regards,
Ahmed Karaman
v3->v4:
- Save all intermediate files generated by the scripts in the '/tmp'
directory instead of the current working directory of the user.
- Use more descriptive variable names and table headers.
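A note on the '/tmp' change: the scripts in this version write to fixed
file names such as /tmp/perf.data, so two runs started at the same time
would overwrite each other's files. Below is a minimal sketch of one
possible alternative, assuming a private per-run directory is acceptable.
It uses tempfile.TemporaryDirectory() from the standard library, which
removes the directory when the block exits. The helper name
run_with_scratch_dir, the build_command callback and the file name
profile.data are hypothetical and are not part of this series.

```python
import os
import subprocess
import sys
import tempfile


def run_with_scratch_dir(build_command):
    """Run a command that writes its output into a private scratch directory.

    build_command is a hypothetical callback: it receives the path of the
    output file and returns the argv list to execute.
    """
    with tempfile.TemporaryDirectory(prefix="topN_") as scratch:
        out_file = os.path.join(scratch, "profile.data")
        result = subprocess.run(build_command(out_file),
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.PIPE)
        if result.returncode:
            # The scratch directory is still removed when SystemExit
            # propagates out of the "with" block.
            sys.exit(result.stderr.decode("utf-8"))
        with open(out_file, "rb") as data:
            return data.read()
```

With something like this, the explicit os.unlink() calls at the end of
each script would no longer be needed, at the cost of reading the data
before the directory disappears.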
v2->v3:
- Use a clearer "Syntax" and "Example of usage" in the script comment
and commit message.
- Manually specify the instructions required to run Perf instead of
relying on the stderr produced by Perf.
- Use more descriptive variable names.
v1->v2:
- Add an empty line at the end of the MAINTAINERS file.
- Move MAINTAINERS patch to be the last in the series.
- Allow custom number of top functions to be specified.
- Check for valgrind and perf before executing the scripts.
- Ensure sufficient permissions when running the topN_perf script.
- Use subprocess instead of os.system
- Use os.unlink() for deleting intermediate files.
- Spread out the data extraction steps.
- Enable execution permission for the scripts.
- Add script example output in the commit message.
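For readers following the v1->v2 items on tool checks and file cleanup,
here is a small sketch of how those two steps could also be written. It
is not what the patches do: the scripts run "which" through
subprocess.run() and call os.unlink() directly. The sketch assumes
shutil.which() is an acceptable substitute for the external "which"
command, and the helper names require_tool and remove_if_present are
hypothetical.

```python
import os
import shutil
import sys


def require_tool(name):
    """Return the path of `name` found on PATH, or exit with an install hint."""
    path = shutil.which(name)
    if path is None:
        sys.exit("Please install {} before running the script!".format(name))
    return path


def remove_if_present(path):
    """Delete an intermediate file; do nothing if it was never created."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass
```

The second helper covers the case where the profiling tool fails before
it has written its output file, in which a plain os.unlink() would raise
FileNotFoundError instead of showing the tool's own error message.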
Ahmed Karaman (3):
scripts/performance: Add topN_perf.py script
scripts/performance: Add topN_callgrind.py script
MAINTAINERS: Add 'Performance Tools and Tests'subsection
MAINTAINERS | 7 ++
scripts/performance/topN_callgrind.py | 140 ++++++++++++++++++++++++
scripts/performance/topN_perf.py | 149 ++++++++++++++++++++++++++
3 files changed, 296 insertions(+)
create mode 100755 scripts/performance/topN_callgrind.py
create mode 100755 scripts/performance/topN_perf.py
--
2.17.1
* [PATCH v4 1/3] scripts/performance: Add topN_perf.py script
2020-06-26 16:45 [PATCH v4 0/3] Add Scripts for Finding Top 25 Executed Functions Ahmed Karaman
@ 2020-06-26 16:45 ` Ahmed Karaman
2020-06-27 17:58 ` Aleksandar Markovic
2020-06-26 16:45 ` [PATCH v4 2/3] scripts/performance: Add topN_callgrind.py script Ahmed Karaman
2020-06-26 16:45 ` [PATCH v4 3/3] MAINTAINERS: Add 'Performance Tools and Tests'subsection Ahmed Karaman
2 siblings, 1 reply; 7+ messages in thread
From: Ahmed Karaman @ 2020-06-26 16:45 UTC (permalink / raw)
To: qemu-devel, aleksandar.qemu.devel, alex.bennee, rth, eblake,
ldoktor, ehabkost, crosa
Cc: Ahmed Karaman
Syntax:
topN_perf.py [-h] [-n] <number of displayed top functions> -- \
<qemu executable> [<qemu executable options>] \
<target executable> [<target executable options>]
[-h] - Print the script arguments help message.
[-n] - Specify the number of top functions to print.
- If this flag is not specified, the tool defaults to 25.
Example of usage:
topN_perf.py -n 20 -- qemu-arm coulomb_double-arm
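To illustrate how the "--" separator in the syntax above behaves, here is
a minimal sketch that builds a parser with the same two arguments as the
script (the usage string and help texts are left out) and feeds it
argument lists directly. The function name build_parser and the
"-L /usr/arm-linux-gnueabihf" pair are only illustrative.

```python
import argparse


def build_parser():
    # Same two arguments as the script: an optional -n and the command to run.
    parser = argparse.ArgumentParser()
    parser.add_argument('-n', dest='top', type=int, default=25)
    parser.add_argument('command', type=str, nargs='+')
    return parser


parser = build_parser()

# Equivalent to: topN_perf.py -n 20 -- qemu-arm coulomb_double-arm
args = parser.parse_args(['-n', '20', '--', 'qemu-arm', 'coulomb_double-arm'])
print(args.top, args.command)

# Everything after "--" is treated as part of the command, so options
# meant for QEMU are not mistaken for options of the script itself.
args = parser.parse_args(['--', 'qemu-arm', '-L', '/usr/arm-linux-gnueabihf',
                          'coulomb_double-arm'])
print(args.top, args.command)
```

Without the "--", argparse would try to interpret an option such as -L
as one of its own and reject it.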
Example Output:
No. Percentage Name Invoked by
---- ---------- ------------------------- -------------------------
1 16.25% float64_mul qemu-x86_64
2 12.01% float64_sub qemu-x86_64
3 11.99% float64_add qemu-x86_64
4 5.69% helper_mulsd qemu-x86_64
5 4.68% helper_addsd qemu-x86_64
6 4.43% helper_lookup_tb_ptr qemu-x86_64
7 4.28% helper_subsd qemu-x86_64
8 2.71% f64_compare qemu-x86_64
9 2.71% helper_ucomisd qemu-x86_64
10 1.04% helper_pand_xmm qemu-x86_64
11 0.71% float64_div qemu-x86_64
12 0.63% helper_pxor_xmm qemu-x86_64
13 0.50% 0x00007f7b7004ef95 [JIT] tid 491
14 0.50% 0x00007f7b70044e83 [JIT] tid 491
15 0.36% helper_por_xmm qemu-x86_64
16 0.32% helper_cc_compute_all qemu-x86_64
17 0.30% 0x00007f7b700433f0 [JIT] tid 491
18 0.30% float64_compare_quiet qemu-x86_64
19 0.27% soft_f64_addsub qemu-x86_64
20 0.26% round_to_int qemu-x86_64
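The rows of the table above come from splitting each line of the
"perf report --stdio" output on whitespace. The sketch below repeats the
field selection used in the script on two sample rows. The rows are
written by hand to resemble the assumed column layout (overhead, command,
shared object, symbol) and are not captured output. The function name
parse_perf_line is hypothetical.

```python
def parse_perf_line(line):
    """Split one report row into (percentage, function name, invoker)."""
    fields = line.split()
    percentage = fields[0]
    name = fields[-1]
    # Skip the command column and the "[.]" marker before the symbol; what
    # remains in between may be several words, e.g. "[JIT] tid 491".
    invoker = ' '.join(fields[2:-2])
    return percentage, name, invoker


# Hand-written sample rows (assumed layout, not real perf output).
sample_rows = [
    "    16.25%  qemu-x86_64  qemu-x86_64     [.] float64_mul",
    "     0.50%  qemu-x86_64  [JIT] tid 491   [.] 0x00007f7b7004ef95",
]

for index, row in enumerate(sample_rows, start=1):
    print('{:>4}  {:>10}  {:<30}  {}'.format(index, *parse_perf_line(row)))
```

Joining the middle fields is what keeps the multi-word "[JIT] tid 491"
entry in a single column.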
Signed-off-by: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
---
scripts/performance/topN_perf.py | 149 +++++++++++++++++++++++++++++++
1 file changed, 149 insertions(+)
create mode 100755 scripts/performance/topN_perf.py
diff --git a/scripts/performance/topN_perf.py b/scripts/performance/topN_perf.py
new file mode 100755
index 0000000000..07be195fc8
--- /dev/null
+++ b/scripts/performance/topN_perf.py
@@ -0,0 +1,149 @@
+#!/usr/bin/env python3
+
+# Print the top N most executed functions in QEMU using perf.
+# Syntax:
+# topN_perf.py [-h] [-n] <number of displayed top functions> -- \
+# <qemu executable> [<qemu executable options>] \
+# <target executable> [<target executable options>]
+#
+# [-h] - Print the script arguments help message.
+# [-n] - Specify the number of top functions to print.
+# - If this flag is not specified, the tool defaults to 25.
+#
+# Example of usage:
+# topN_perf.py -n 20 -- qemu-arm coulomb_double-arm
+#
+# This file is a part of the project "TCG Continuous Benchmarking".
+#
+# Copyright (C) 2020 Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
+# Copyright (C) 2020 Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+import argparse
+import os
+import subprocess
+import sys
+
+
+# Parse the command line arguments
+parser = argparse.ArgumentParser(
+ usage='topN_perf.py [-h] [-n] <number of displayed top functions> -- '
+ '<qemu executable> [<qemu executable options>] '
+ '<target executable> [<target executable options>]')
+
+parser.add_argument('-n', dest='top', type=int, default=25,
+ help='Specify the number of top functions to print.')
+
+parser.add_argument('command', type=str, nargs='+', help=argparse.SUPPRESS)
+
+args = parser.parse_args()
+
+# Extract the needed variables from the args
+command = args.command
+top = args.top
+
+# Ensure that perf is installed
+check_perf_presence = subprocess.run(["which", "perf"],
+ stdout=subprocess.DEVNULL)
+if check_perf_presence.returncode:
+ sys.exit("Please install perf before running the script!")
+
+# Ensure the user has the privilege to run perf
+check_perf_executability = subprocess.run(["perf", "stat", "ls", "/"],
+ stdout=subprocess.DEVNULL,
+ stderr=subprocess.DEVNULL)
+if check_perf_executability.returncode:
+ sys.exit(
+"""
+Error:
+You may not have permission to collect stats.
+
+Consider tweaking /proc/sys/kernel/perf_event_paranoid,
+which controls use of the performance events system by
+unprivileged users (without CAP_SYS_ADMIN).
+
+ -1: Allow use of (almost) all events by all users
+ Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
+ 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN
+ Disallow raw tracepoint access by users without CAP_SYS_ADMIN
+ 1: Disallow CPU event access by users without CAP_SYS_ADMIN
+ 2: Disallow kernel profiling by users without CAP_SYS_ADMIN
+
+To make this setting permanent, edit /etc/sysctl.conf too, e.g.:
+ kernel.perf_event_paranoid = -1
+
+* Alternatively, you can run this script under sudo privileges.
+"""
+)
+
+# Run perf record
+perf_record = subprocess.run((["perf", "record", "--output=/tmp/perf.data"] +
+ command),
+ stdout=subprocess.DEVNULL,
+ stderr=subprocess.PIPE)
+if perf_record.returncode:
+ os.unlink('/tmp/perf.data')
+ sys.exit(perf_record.stderr.decode("utf-8"))
+
+# Save perf report output to /tmp/perf_report.out
+with open("/tmp/perf_report.out", "w") as output:
+ perf_report = subprocess.run(
+ ["perf", "report", "--input=/tmp/perf.data", "--stdio"],
+ stdout=output,
+ stderr=subprocess.PIPE)
+ if perf_report.returncode:
+ os.unlink('/tmp/perf.data')
+ output.close()
+ os.unlink('/tmp/perf_report.out')
+ sys.exit(perf_report.stderr.decode("utf-8"))
+
+# Read the reported data to functions[]
+functions = []
+with open("/tmp/perf_report.out", "r") as data:
+ # Only read lines that are not comments (comments start with #)
+ # Only read lines that are not empty
+ functions = [line for line in data.readlines() if line and line[0]
+ != '#' and line[0] != "\n"]
+
+# Limit the number of top functions to "top"
+number_of_top_functions = top if len(functions) > top else len(functions)
+
+# Store the data of the top functions in top_functions[]
+top_functions = functions[:number_of_top_functions]
+
+# Print table header
+print('{:>4} {:>10} {:<30} {}\n{} {} {} {}'.format('No.',
+ 'Percentage',
+ 'Name',
+ 'Invoked by',
+ '-' * 4,
+ '-' * 10,
+ '-' * 30,
+ '-' * 25))
+
+# Print top N functions
+for (index, function) in enumerate(top_functions, start=1):
+ function_data = function.split()
+ function_percentage = function_data[0]
+ function_name = function_data[-1]
+ function_invoker = ' '.join(function_data[2:-2])
+ print('{:>4} {:>10} {:<30} {}'.format(index,
+ function_percentage,
+ function_name,
+ function_invoker))
+
+# Remove intermediate files
+os.unlink('/tmp/perf.data')
+os.unlink('/tmp/perf_report.out')
--
2.17.1
* [PATCH v4 2/3] scripts/performance: Add topN_callgrind.py script
2020-06-26 16:45 [PATCH v4 0/3] Add Scripts for Finding Top 25 Executed Functions Ahmed Karaman
2020-06-26 16:45 ` [PATCH v4 1/3] scripts/performance: Add topN_perf.py script Ahmed Karaman
@ 2020-06-26 16:45 ` Ahmed Karaman
2020-06-27 17:57 ` Aleksandar Markovic
2020-06-26 16:45 ` [PATCH v4 3/3] MAINTAINERS: Add 'Performance Tools and Tests'subsection Ahmed Karaman
2 siblings, 1 reply; 7+ messages in thread
From: Ahmed Karaman @ 2020-06-26 16:45 UTC (permalink / raw)
To: qemu-devel, aleksandar.qemu.devel, alex.bennee, rth, eblake,
ldoktor, ehabkost, crosa
Cc: Ahmed Karaman
Python script that prints the top N most executed functions in QEMU
using callgrind.
Syntax:
topN_callgrind.py [-h] [-n] <number of displayed top functions> -- \
<qemu executable> [<qemu executable options>] \
<target executable> [<target executable options>]
[-h] - Print the script arguments help message.
[-n] - Specify the number of top functions to print.
- If this flag is not specified, the tool defaults to 25.
Example of usage:
topN_callgrind.py -n 20 -- qemu-arm coulomb_double-arm
Example Output:
No. Percentage Function Name Source File
---- --------- ------------------ ------------------------------
1 24.577% 0x00000000082db000 ???
2 20.467% float64_mul <qemu>/fpu/softfloat.c
3 14.720% float64_sub <qemu>/fpu/softfloat.c
4 13.864% float64_add <qemu>/fpu/softfloat.c
5 4.876% helper_mulsd <qemu>/target/i386/ops_sse.h
6 3.767% helper_subsd <qemu>/target/i386/ops_sse.h
7 3.549% helper_addsd <qemu>/target/i386/ops_sse.h
8 2.185% helper_ucomisd <qemu>/target/i386/ops_sse.h
9 1.667% helper_lookup_tb_ptr <qemu>/include/exec/tb-lookup.h
10 1.662% f64_compare <qemu>/fpu/softfloat.c
11 1.509% helper_lookup_tb_ptr <qemu>/accel/tcg/tcg-runtime.c
12 0.635% helper_lookup_tb_ptr <qemu>/include/exec/exec-all.h
13 0.616% float64_div <qemu>/fpu/softfloat.c
14 0.502% helper_pand_xmm <qemu>/target/i386/ops_sse.h
15 0.502% float64_mul <qemu>/include/fpu/softfloat.h
16 0.476% helper_lookup_tb_ptr <qemu>/target/i386/cpu.h
17 0.437% float64_compare_quiet <qemu>/fpu/softfloat.c
18 0.414% helper_pxor_xmm <qemu>/target/i386/ops_sse.h
19 0.353% round_to_int <qemu>/fpu/softfloat.c
20 0.347% helper_cc_compute_all <qemu>/target/i386/cc_helper.c
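The percentages in the table above are each function's instruction count
divided by the program total. The sketch below shows that arithmetic and
the "file:function" split on two sample rows. The rows, the total and the
bracketed object names are made up for illustration and only assume that
a callgrind_annotate row starts with a comma-grouped count followed by
"file:function". The function name parse_callgrind_row is hypothetical.

```python
def parse_callgrind_row(row, total_instructions):
    """Return (percentage, function name, source file) for one row."""
    fields = row.split()
    # Counts are printed with thousands separators, e.g. "2,046,700".
    instructions = int(fields[0].replace(',', ''))
    # Same split as in the script: "file:function".
    source_file, name = fields[1].split(':')
    percentage = instructions / total_instructions * 100
    return percentage, name, source_file


# Made-up numbers, chosen so that the results resemble the table above.
sample_total = 10000000
sample_rows = [
    "2,457,700  ???:0x00000000082db000 [???]",
    "2,046,700  <qemu>/fpu/softfloat.c:float64_mul [<qemu>/qemu-x86_64]",
]

for index, row in enumerate(sample_rows, start=1):
    percentage, name, source_file = parse_callgrind_row(row, sample_total)
    print('{:>4}  {:>9.3f}%  {:<30}  {}'.format(index, percentage, name,
                                                source_file))
```

Because the same function can be attributed to several source files
(inlined code from headers, for example), one name may appear on more
than one row, as helper_lookup_tb_ptr does in the example output.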
Signed-off-by: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
---
scripts/performance/topN_callgrind.py | 140 ++++++++++++++++++++++++++
1 file changed, 140 insertions(+)
create mode 100755 scripts/performance/topN_callgrind.py
diff --git a/scripts/performance/topN_callgrind.py b/scripts/performance/topN_callgrind.py
new file mode 100755
index 0000000000..67c59197af
--- /dev/null
+++ b/scripts/performance/topN_callgrind.py
@@ -0,0 +1,140 @@
+#!/usr/bin/env python3
+
+# Print the top N most executed functions in QEMU using callgrind.
+# Syntax:
+# topN_callgrind.py [-h] [-n] <number of displayed top functions> -- \
+# <qemu executable> [<qemu executable options>] \
+# <target executable> [<target executable options>]
+#
+# [-h] - Print the script arguments help message.
+# [-n] - Specify the number of top functions to print.
+# - If this flag is not specified, the tool defaults to 25.
+#
+# Example of usage:
+# topN_callgrind.py -n 20 -- qemu-arm coulomb_double-arm
+#
+# This file is a part of the project "TCG Continuous Benchmarking".
+#
+# Copyright (C) 2020 Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
+# Copyright (C) 2020 Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+import argparse
+import os
+import subprocess
+import sys
+
+
+# Parse the command line arguments
+parser = argparse.ArgumentParser(
+ usage='topN_callgrind.py [-h] [-n] <number of displayed top functions> -- '
+ '<qemu executable> [<qemu executable options>] '
+ '<target executable> [<target executable options>]')
+
+parser.add_argument('-n', dest='top', type=int, default=25,
+ help='Specify the number of top functions to print.')
+
+parser.add_argument('command', type=str, nargs='+', help=argparse.SUPPRESS)
+
+args = parser.parse_args()
+
+# Extract the needed variables from the args
+command = args.command
+top = args.top
+
+# Ensure that valgrind is installed
+check_valgrind_presence = subprocess.run(["which", "valgrind"],
+ stdout=subprocess.DEVNULL)
+if check_valgrind_presence.returncode:
+ sys.exit("Please install valgrind before running the script!")
+
+# Run callgrind
+callgrind = subprocess.run((
+ ["valgrind", "--tool=callgrind", "--callgrind-out-file=/tmp/callgrind.data"]
+ + command),
+ stdout=subprocess.DEVNULL,
+ stderr=subprocess.PIPE)
+if callgrind.returncode:
+ sys.exit(callgrind.stderr.decode("utf-8"))
+
+# Save callgrind_annotate output to /tmp/callgrind_annotate.out
+with open("/tmp/callgrind_annotate.out", "w") as output:
+ callgrind_annotate = subprocess.run(["callgrind_annotate",
+ "/tmp/callgrind.data"],
+ stdout=output,
+ stderr=subprocess.PIPE)
+ if callgrind_annotate.returncode:
+ os.unlink('/tmp/callgrind.data')
+ output.close()
+ os.unlink('/tmp/callgrind_annotate.out')
+ sys.exit(callgrind_annotate.stderr.decode("utf-8"))
+
+# Read the callgrind_annotate output to callgrind_data[]
+callgrind_data = []
+with open('/tmp/callgrind_annotate.out', 'r') as data:
+ callgrind_data = data.readlines()
+
+# Line number with the total number of instructions
+total_instructions_line_number = 20
+
+# Get the total number of instructions
+total_instructions_line_data = callgrind_data[total_instructions_line_number]
+total_number_of_instructions = total_instructions_line_data.split(' ')[0]
+total_number_of_instructions = int(
+ total_number_of_instructions.replace(',', ''))
+
+# Line number with the top function
+first_func_line = 25
+
+# Number of functions recorded by callgrind, last two lines are always empty
+number_of_functions = len(callgrind_data) - first_func_line - 2
+
+# Limit the number of top functions to "top"
+number_of_top_functions = (top if number_of_functions >
+ top else number_of_functions)
+
+# Store the data of the top functions in top_functions[]
+top_functions = callgrind_data[first_func_line:
+ first_func_line + number_of_top_functions]
+
+# Print table header
+print('{:>4} {:>10} {:<30} {}\n{} {} {} {}'.format('No.',
+ 'Percentage',
+ 'Function Name',
+ 'Source File',
+ '-' * 4,
+ '-' * 10,
+ '-' * 30,
+ '-' * 30,
+ ))
+
+# Print top N functions
+for (index, function) in enumerate(top_functions, start=1):
+ function_data = function.split()
+ # Calculate function percentage
+ function_instructions = float(function_data[0].replace(',', ''))
+ function_percentage = (function_instructions /
+ total_number_of_instructions)*100
+ # Get function name and source files path
+ function_source_file, function_name = function_data[1].split(':')
+ # Print extracted data
+ print('{:>4} {:>9.3f}% {:<30} {}'.format(index,
+ round(function_percentage, 3),
+ function_name,
+ function_source_file))
+
+# Remove intermediate files
+os.unlink('/tmp/callgrind.data')
+os.unlink('/tmp/callgrind_annotate.out')
--
2.17.1
* [PATCH v4 3/3] MAINTAINERS: Add 'Performance Tools and Tests'subsection
2020-06-26 16:45 [PATCH v4 0/3] Add Scripts for Finding Top 25 Executed Functions Ahmed Karaman
2020-06-26 16:45 ` [PATCH v4 1/3] scripts/performance: Add topN_perf.py script Ahmed Karaman
2020-06-26 16:45 ` [PATCH v4 2/3] scripts/performance: Add topN_callgrind.py script Ahmed Karaman
@ 2020-06-26 16:45 ` Ahmed Karaman
2020-06-27 17:58 ` Aleksandar Markovic
2 siblings, 1 reply; 7+ messages in thread
From: Ahmed Karaman @ 2020-06-26 16:45 UTC (permalink / raw)
To: qemu-devel, aleksandar.qemu.devel, alex.bennee, rth, eblake,
ldoktor, ehabkost, crosa
Cc: Ahmed Karaman
This commit creates a new 'Miscellaneous' section which hosts a new
'Performance Tools and Tests' subsection.
The subsection will contain the performance scripts and benchmarks
written as a part of the 'TCG Continuous Benchmarking' project.
Signed-off-by: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
---
MAINTAINERS | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 1b40446c73..c510c942ac 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3019,3 +3019,10 @@ M: Peter Maydell <peter.maydell@linaro.org>
S: Maintained
F: docs/conf.py
F: docs/*/conf.py
+
+Miscellaneous
+-------------
+Performance Tools and Tests
+M: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
+S: Maintained
+F: scripts/performance/
--
2.17.1
* Re: [PATCH v4 2/3] scripts/performance: Add topN_callgrind.py script
2020-06-26 16:45 ` [PATCH v4 2/3] scripts/performance: Add topN_callgrind.py script Ahmed Karaman
@ 2020-06-27 17:57 ` Aleksandar Markovic
0 siblings, 0 replies; 7+ messages in thread
From: Aleksandar Markovic @ 2020-06-27 17:57 UTC (permalink / raw)
To: Ahmed Karaman
Cc: Lukáš Doktor, Eduardo Habkost, Alex Bennée,
QEMU Developers, Aleksandar Markovic, Cleber Rosa,
Richard Henderson
On Fri, Jun 26, 2020 at 6:59 PM Ahmed Karaman
<ahmedkhaledkaraman@gmail.com> wrote:
>
> Python script that prints the top N most executed functions in QEMU
> using callgrind.
>
> Syntax:
> topN_callgrind.py [-h] [-n] <number of displayed top functions> -- \
> <qemu executable> [<qemu executable options>] \
> <target executable> [<target executable options>]
>
> [-h] - Print the script arguments help message.
> [-n] - Specify the number of top functions to print.
> - If this flag is not specified, the tool defaults to 25.
>
> Example of usage:
> topN_callgrind.py -n 20 -- qemu-arm coulomb_double-arm
>
> Example Output:
> No. Percentage Function Name Source File
> ---- --------- ------------------ ------------------------------
> 1 24.577% 0x00000000082db000 ???
> 2 20.467% float64_mul <qemu>/fpu/softfloat.c
> 3 14.720% float64_sub <qemu>/fpu/softfloat.c
> 4 13.864% float64_add <qemu>/fpu/softfloat.c
> 5 4.876% helper_mulsd <qemu>/target/i386/ops_sse.h
> 6 3.767% helper_subsd <qemu>/target/i386/ops_sse.h
> 7 3.549% helper_addsd <qemu>/target/i386/ops_sse.h
> 8 2.185% helper_ucomisd <qemu>/target/i386/ops_sse.h
> 9 1.667% helper_lookup_tb_ptr <qemu>/include/exec/tb-lookup.h
> 10 1.662% f64_compare <qemu>/fpu/softfloat.c
> 11 1.509% helper_lookup_tb_ptr <qemu>/accel/tcg/tcg-runtime.c
> 12 0.635% helper_lookup_tb_ptr <qemu>/include/exec/exec-all.h
> 13 0.616% float64_div <qemu>/fpu/softfloat.c
> 14 0.502% helper_pand_xmm <qemu>/target/i386/ops_sse.h
> 15 0.502% float64_mul <qemu>/include/fpu/softfloat.h
> 16 0.476% helper_lookup_tb_ptr <qemu>/target/i386/cpu.h
> 17 0.437% float64_compare_quiet <qemu>/fpu/softfloat.c
> 18 0.414% helper_pxor_xmm <qemu>/target/i386/ops_sse.h
> 19 0.353% round_to_int <qemu>/fpu/softfloat.c
> 20 0.347% helper_cc_compute_all <qemu>/target/i386/cc_helper.c
>
> Signed-off-by: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
> ---
Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
Applied to "TCG Continuous Benchmarking" queue.
> scripts/performance/topN_callgrind.py | 140 ++++++++++++++++++++++++++
> 1 file changed, 140 insertions(+)
> create mode 100755 scripts/performance/topN_callgrind.py
>
> diff --git a/scripts/performance/topN_callgrind.py b/scripts/performance/topN_callgrind.py
> new file mode 100755
> index 0000000000..67c59197af
> --- /dev/null
> +++ b/scripts/performance/topN_callgrind.py
> @@ -0,0 +1,140 @@
> +#!/usr/bin/env python3
> +
> +# Print the top N most executed functions in QEMU using callgrind.
> +# Syntax:
> +# topN_callgrind.py [-h] [-n] <number of displayed top functions> -- \
> +# <qemu executable> [<qemu executable options>] \
> +# <target executable> [<target executable options>]
> +#
> +# [-h] - Print the script arguments help message.
> +# [-n] - Specify the number of top functions to print.
> +# - If this flag is not specified, the tool defaults to 25.
> +#
> +# Example of usage:
> +# topN_callgrind.py -n 20 -- qemu-arm coulomb_double-arm
> +#
> +# This file is a part of the project "TCG Continuous Benchmarking".
> +#
> +# Copyright (C) 2020 Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
> +# Copyright (C) 2020 Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
> +#
> +# This program is free software: you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation, either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program. If not, see <https://www.gnu.org/licenses/>.
> +
> +import argparse
> +import os
> +import subprocess
> +import sys
> +
> +
> +# Parse the command line arguments
> +parser = argparse.ArgumentParser(
> + usage='topN_callgrind.py [-h] [-n] <number of displayed top functions> -- '
> + '<qemu executable> [<qemu executable options>] '
> + '<target executable> [<target executable options>]')
> +
> +parser.add_argument('-n', dest='top', type=int, default=25,
> + help='Specify the number of top functions to print.')
> +
> +parser.add_argument('command', type=str, nargs='+', help=argparse.SUPPRESS)
> +
> +args = parser.parse_args()
> +
> +# Extract the needed variables from the args
> +command = args.command
> +top = args.top
> +
> +# Ensure that valgrind is installed
> +check_valgrind_presence = subprocess.run(["which", "valgrind"],
> + stdout=subprocess.DEVNULL)
> +if check_valgrind_presence.returncode:
> + sys.exit("Please install valgrind before running the script!")
> +
> +# Run callgrind
> +callgrind = subprocess.run((
> + ["valgrind", "--tool=callgrind", "--callgrind-out-file=/tmp/callgrind.data"]
> + + command),
> + stdout=subprocess.DEVNULL,
> + stderr=subprocess.PIPE)
> +if callgrind.returncode:
> + sys.exit(callgrind.stderr.decode("utf-8"))
> +
> +# Save callgrind_annotate output to /tmp/callgrind_annotate.out
> +with open("/tmp/callgrind_annotate.out", "w") as output:
> + callgrind_annotate = subprocess.run(["callgrind_annotate",
> + "/tmp/callgrind.data"],
> + stdout=output,
> + stderr=subprocess.PIPE)
> + if callgrind_annotate.returncode:
> + os.unlink('/tmp/callgrind.data')
> + output.close()
> + os.unlink('/tmp/callgrind_annotate.out')
> + sys.exit(callgrind_annotate.stderr.decode("utf-8"))
> +
> +# Read the callgrind_annotate output to callgrind_data[]
> +callgrind_data = []
> +with open('/tmp/callgrind_annotate.out', 'r') as data:
> + callgrind_data = data.readlines()
> +
> +# Line number with the total number of instructions
> +total_instructions_line_number = 20
> +
> +# Get the total number of instructions
> +total_instructions_line_data = callgrind_data[total_instructions_line_number]
> +total_number_of_instructions = total_instructions_line_data.split(' ')[0]
> +total_number_of_instructions = int(
> + total_number_of_instructions.replace(',', ''))
> +
> +# Line number with the top function
> +first_func_line = 25
> +
> +# Number of functions recorded by callgrind, last two lines are always empty
> +number_of_functions = len(callgrind_data) - first_func_line - 2
> +
> +# Limit the number of top functions to "top"
> +number_of_top_functions = (top if number_of_functions >
> + top else number_of_functions)
> +
> +# Store the data of the top functions in top_functions[]
> +top_functions = callgrind_data[first_func_line:
> + first_func_line + number_of_top_functions]
> +
> +# Print table header
> +print('{:>4} {:>10} {:<30} {}\n{} {} {} {}'.format('No.',
> + 'Percentage',
> + 'Function Name',
> + 'Source File',
> + '-' * 4,
> + '-' * 10,
> + '-' * 30,
> + '-' * 30,
> + ))
> +
> +# Print top N functions
> +for (index, function) in enumerate(top_functions, start=1):
> + function_data = function.split()
> + # Calculate function percentage
> + function_instructions = float(function_data[0].replace(',', ''))
> + function_percentage = (function_instructions /
> + total_number_of_instructions)*100
> + # Get function name and source files path
> + function_source_file, function_name = function_data[1].split(':')
> + # Print extracted data
> + print('{:>4} {:>9.3f}% {:<30} {}'.format(index,
> + round(function_percentage, 3),
> + function_name,
> + function_source_file))
> +
> +# Remove intermediate files
> +os.unlink('/tmp/callgrind.data')
> +os.unlink('/tmp/callgrind_annotate.out')
> --
> 2.17.1
>
>
* Re: [PATCH v4 1/3] scripts/performance: Add topN_perf.py script
2020-06-26 16:45 ` [PATCH v4 1/3] scripts/performance: Add topN_perf.py script Ahmed Karaman
@ 2020-06-27 17:58 ` Aleksandar Markovic
0 siblings, 0 replies; 7+ messages in thread
From: Aleksandar Markovic @ 2020-06-27 17:58 UTC (permalink / raw)
To: Ahmed Karaman
Cc: Lukáš Doktor, Eduardo Habkost, Alex Bennée,
QEMU Developers, Aleksandar Markovic, Cleber Rosa,
Richard Henderson
On Fri, Jun 26, 2020 at 7:00 PM Ahmed Karaman
<ahmedkhaledkaraman@gmail.com> wrote:
>
> Syntax:
> topN_perf.py [-h] [-n] <number of displayed top functions> -- \
> <qemu executable> [<qemu executable options>] \
> <target executable> [<target executable options>]
>
> [-h] - Print the script arguments help message.
> [-n] - Specify the number of top functions to print.
> - If this flag is not specified, the tool defaults to 25.
>
> Example of usage:
> topN_perf.py -n 20 -- qemu-arm coulomb_double-arm
>
> Example Output:
> No. Percentage Name Invoked by
> ---- ---------- ------------------------- -------------------------
> 1 16.25% float64_mul qemu-x86_64
> 2 12.01% float64_sub qemu-x86_64
> 3 11.99% float64_add qemu-x86_64
> 4 5.69% helper_mulsd qemu-x86_64
> 5 4.68% helper_addsd qemu-x86_64
> 6 4.43% helper_lookup_tb_ptr qemu-x86_64
> 7 4.28% helper_subsd qemu-x86_64
> 8 2.71% f64_compare qemu-x86_64
> 9 2.71% helper_ucomisd qemu-x86_64
> 10 1.04% helper_pand_xmm qemu-x86_64
> 11 0.71% float64_div qemu-x86_64
> 12 0.63% helper_pxor_xmm qemu-x86_64
> 13 0.50% 0x00007f7b7004ef95 [JIT] tid 491
> 14 0.50% 0x00007f7b70044e83 [JIT] tid 491
> 15 0.36% helper_por_xmm qemu-x86_64
> 16 0.32% helper_cc_compute_all qemu-x86_64
> 17 0.30% 0x00007f7b700433f0 [JIT] tid 491
> 18 0.30% float64_compare_quiet qemu-x86_64
> 19 0.27% soft_f64_addsub qemu-x86_64
> 20 0.26% round_to_int qemu-x86_64
>
> Signed-off-by: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
> ---
Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
Applied to "TCG Continuous Benchmarking" queue.
> scripts/performance/topN_perf.py | 149 +++++++++++++++++++++++++++++++
> 1 file changed, 149 insertions(+)
> create mode 100755 scripts/performance/topN_perf.py
>
> diff --git a/scripts/performance/topN_perf.py b/scripts/performance/topN_perf.py
> new file mode 100755
> index 0000000000..07be195fc8
> --- /dev/null
> +++ b/scripts/performance/topN_perf.py
> @@ -0,0 +1,149 @@
> +#!/usr/bin/env python3
> +
> +# Print the top N most executed functions in QEMU using perf.
> +# Syntax:
> +# topN_perf.py [-h] [-n] <number of displayed top functions> -- \
> +# <qemu executable> [<qemu executable options>] \
> +# <target executable> [<target executable options>]
> +#
> +# [-h] - Print the script arguments help message.
> +# [-n] - Specify the number of top functions to print.
> +# - If this flag is not specified, the tool defaults to 25.
> +#
> +# Example of usage:
> +# topN_perf.py -n 20 -- qemu-arm coulomb_double-arm
> +#
> +# This file is a part of the project "TCG Continuous Benchmarking".
> +#
> +# Copyright (C) 2020 Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
> +# Copyright (C) 2020 Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
> +#
> +# This program is free software: you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation, either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program. If not, see <https://www.gnu.org/licenses/>.
> +
> +import argparse
> +import os
> +import subprocess
> +import sys
> +
> +
> +# Parse the command line arguments
> +parser = argparse.ArgumentParser(
> + usage='topN_perf.py [-h] [-n] <number of displayed top functions> -- '
> + '<qemu executable> [<qemu executable options>] '
> + '<target executable> [<target executable options>]')
> +
> +parser.add_argument('-n', dest='top', type=int, default=25,
> + help='Specify the number of top functions to print.')
> +
> +parser.add_argument('command', type=str, nargs='+', help=argparse.SUPPRESS)
> +
> +args = parser.parse_args()
> +
> +# Extract the needed variables from the args
> +command = args.command
> +top = args.top
> +
> +# Ensure that perf is installed
> +check_perf_presence = subprocess.run(["which", "perf"],
> + stdout=subprocess.DEVNULL)
> +if check_perf_presence.returncode:
> + sys.exit("Please install perf before running the script!")
> +
> +# Ensure the user has the privilege to run perf
> +check_perf_executability = subprocess.run(["perf", "stat", "ls", "/"],
> + stdout=subprocess.DEVNULL,
> + stderr=subprocess.DEVNULL)
> +if check_perf_executability.returncode:
> + sys.exit(
> +"""
> +Error:
> +You may not have permission to collect stats.
> +
> +Consider tweaking /proc/sys/kernel/perf_event_paranoid,
> +which controls use of the performance events system by
> +unprivileged users (without CAP_SYS_ADMIN).
> +
> + -1: Allow use of (almost) all events by all users
> + Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
> + 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN
> + Disallow raw tracepoint access by users without CAP_SYS_ADMIN
> + 1: Disallow CPU event access by users without CAP_SYS_ADMIN
> + 2: Disallow kernel profiling by users without CAP_SYS_ADMIN
> +
> +To make this setting permanent, edit /etc/sysctl.conf too, e.g.:
> + kernel.perf_event_paranoid = -1
> +
> +* Alternatively, you can run this script under sudo privileges.
> +"""
> +)
> +
> +# Run perf record
> +perf_record = subprocess.run((["perf", "record", "--output=/tmp/perf.data"] +
> + command),
> + stdout=subprocess.DEVNULL,
> + stderr=subprocess.PIPE)
> +if perf_record.returncode:
> + os.unlink('/tmp/perf.data')
> + sys.exit(perf_record.stderr.decode("utf-8"))
> +
> +# Save perf report output to /tmp/perf_report.out
> +with open("/tmp/perf_report.out", "w") as output:
> + perf_report = subprocess.run(
> + ["perf", "report", "--input=/tmp/perf.data", "--stdio"],
> + stdout=output,
> + stderr=subprocess.PIPE)
> + if perf_report.returncode:
> + os.unlink('/tmp/perf.data')
> + output.close()
> + os.unlink('/tmp/perf_report.out')
> + sys.exit(perf_report.stderr.decode("utf-8"))
> +
> +# Read the reported data to functions[]
> +functions = []
> +with open("/tmp/perf_report.out", "r") as data:
> + # Only read lines that are not comments (comments start with #)
> + # Only read lines that are not empty
> + functions = [line for line in data.readlines() if line and line[0]
> + != '#' and line[0] != "\n"]
> +
> +# Limit the number of top functions to "top"
> +number_of_top_functions = top if len(functions) > top else len(functions)
> +
> +# Store the data of the top functions in top_functions[]
> +top_functions = functions[:number_of_top_functions]
> +
> +# Print table header
> +print('{:>4} {:>10} {:<30} {}\n{} {} {} {}'.format('No.',
> + 'Percentage',
> + 'Name',
> + 'Invoked by',
> + '-' * 4,
> + '-' * 10,
> + '-' * 30,
> + '-' * 25))
> +
> +# Print top N functions
> +for (index, function) in enumerate(top_functions, start=1):
> + function_data = function.split()
> + function_percentage = function_data[0]
> + function_name = function_data[-1]
> + function_invoker = ' '.join(function_data[2:-2])
> + print('{:>4} {:>10} {:<30} {}'.format(index,
> + function_percentage,
> + function_name,
> + function_invoker))
> +
> +# Remove intermediate files
> +os.unlink('/tmp/perf.data')
> +os.unlink('/tmp/perf_report.out')
> --
> 2.17.1
>
>
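The filtering and field-splitting steps of the script above can be sketched in isolation as follows. This is only a minimal illustration, assuming that each data row of "perf report --stdio" has the form "<percentage> <command> <shared object> [.] <symbol>". The helper names filter_report_lines() and parse_perf_line() and the sample rows are hypothetical; they are not part of the patch, and the sample values are made up, not measured.

```python
# Hypothetical helpers mirroring the parsing logic of topN_perf.py.
# The sample rows below are invented for illustration only.


def filter_report_lines(lines):
    """Keep only data rows: drop blank lines and '#' comment lines."""
    return [line for line in lines
            if line.strip() and not line.startswith('#')]


def parse_perf_line(line):
    """Split one data row into (percentage, function name, invoker)."""
    fields = line.split()
    percentage = fields[0]
    name = fields[-1]
    # Everything between the command column and the '[.]' marker
    invoker = ' '.join(fields[2:-2])
    return percentage, name, invoker


sample_report = [
    "# Overhead  Command   Shared Object  Symbol\n",
    "\n",
    "    21.15%  qemu-arm  qemu-arm       [.] example_function_a\n",
    "     9.42%  qemu-arm  libexample.so  [.] example_function_b\n",
]

top = 25
rows = filter_report_lines(sample_report)[:top]
for index, row in enumerate(rows, start=1):
    percentage, name, invoker = parse_perf_line(row)
    print('{:>4}  {:>10}  {:<30}  {}'.format(index, percentage,
                                             name, invoker))
```

Slicing with [:top] already caps the result at the number of available rows, so no explicit length comparison is needed in this sketch.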
* Re: [PATCH v4 3/3] MAINTAINERS: Add 'Performance Tools and Tests' subsection
2020-06-26 16:45 ` [PATCH v4 3/3] MAINTAINERS: Add 'Performance Tools and Tests' subsection Ahmed Karaman
@ 2020-06-27 17:58 ` Aleksandar Markovic
0 siblings, 0 replies; 7+ messages in thread
From: Aleksandar Markovic @ 2020-06-27 17:58 UTC (permalink / raw)
To: Ahmed Karaman
Cc: Lukáš Doktor, Eduardo Habkost, Alex Bennée,
QEMU Developers, Aleksandar Markovic, Cleber Rosa,
Richard Henderson
On Fri, Jun 26, 2020 at 7:02 PM Ahmed Karaman
<ahmedkhaledkaraman@gmail.com> wrote:
>
> This commit creates a new 'Miscellaneous' section which hosts a new
> 'Performance Tools and Tests' subsection.
> The subsection will contain the performance scripts and benchmarks
> written as a part of the 'TCG Continuous Benchmarking' project.
>
> Signed-off-by: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
> ---
Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
Applied to "TCG Continuous Benchmarking" queue.
> MAINTAINERS | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1b40446c73..c510c942ac 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3019,3 +3019,10 @@ M: Peter Maydell <peter.maydell@linaro.org>
> S: Maintained
> F: docs/conf.py
> F: docs/*/conf.py
> +
> +Miscellaneous
> +-------------
> +Performance Tools and Tests
> +M: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
> +S: Maintained
> +F: scripts/performance/
> --
> 2.17.1
>
>
end of thread, other threads:[~2020-06-27 17:59 UTC | newest]
Thread overview: 7+ messages
2020-06-26 16:45 [PATCH v4 0/3] Add Scripts for Finding Top 25 Executed Functions Ahmed Karaman
2020-06-26 16:45 ` [PATCH v4 1/3] scripts/performance: Add topN_perf.py script Ahmed Karaman
2020-06-27 17:58 ` Aleksandar Markovic
2020-06-26 16:45 ` [PATCH v4 2/3] scripts/performance: Add topN_callgrind.py script Ahmed Karaman
2020-06-27 17:57 ` Aleksandar Markovic
2020-06-26 16:45 ` [PATCH v4 3/3] MAINTAINERS: Add 'Performance Tools and Tests' subsection Ahmed Karaman
2020-06-27 17:58 ` Aleksandar Markovic