On Tuesday, July 28, 2020, Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> wrote:

> On Tuesday, July 28, 2020, Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> wrote:
>
>> On Thursday, July 16, 2020, Ahmed Karaman wrote:
>>
>>> Python script that prints executed helpers of a QEMU invocation.
>>
>> Hi, Ahmed.
>>
>> You outlined the envisioned user workflow regarding this script in your
>> report. As I understand it, it generally goes like this:
>>
>> 1) The user first discovers helpers, and their performance data.
>> 2) The user examines the callees of a particular helper of choice
>>    (usually, the most instruction-consuming helper).
>> 3) The user perhaps further examines a callee of a particular callee of
>>    the particular helper.
>> 4) The user continues this way until a conclusion can be drawn, or the
>>    maximal depth is reached.
>>
>> The procedure might be time consuming since each step requires running an
>> emulation of the test program.
>>
>> This makes me think that a faster and easier tool for the user (but,
>> to some modest extent, harder for you) would be an improved
>> list_helpers.py (and list_fn_callees.py) that provides a list of all
>> callees for all helpers, in tree form (so, callees of callees, callees
>> of callees of callees, etc.), rather than providing just a list of
>> immediate callees, like it currently does.
>>
>> I think you can provide such functionality relatively easily using
>> recursion. See, let's say:
>>
>> https://realpython.com/python-thinking-recursively/
>
> For printing trees like this:
>
> foo
> ├── bar
> │   ├── a
> │   └── b
> ├── baz
> └── qux
>     └── c⏎
>         d
>
> you can potentially use the tree-format library:
>
> https://pypi.org/project/tree-format/ .

Aah, probably you can't - license incompatibility.
However, you can write your own function for tree-like outputting. It is
really not that difficult and, in that case, you have full control over
the output. Maybe that is the best approach.

Thanks,
Aleksandar

>> Perhaps you can have a switch (let's say, --tree) that specifies
>> whether the script outputs just the immediate callee list, or the
>> entire callee tree.
>>
>> Thanks,
>> Aleksandar
>>
>>> Syntax:
>>> list_helpers.py [-h] -- \
>>>     [] \
>>>     []
>>>
>>> [-h] - Print the script arguments help message.
>>>
>>> Example of usage:
>>> list_helpers.py -- qemu-mips coulomb_double-mips -n10
>>>
>>> Example output:
>>> Total number of instructions: 108,933,695
>>>
>>> Executed QEMU Helpers:
>>>
>>> No.     Ins Percent  Calls Ins/Call Helper Name             Source File
>>> --- ------- ------- ------ -------- --------------------    ---------------
>>>   1 183,021  0.168%  1,305      140 helper_float_sub_d      /target/mips/fpu_helper.c
>>>   2 177,111  0.163%    770      230 helper_float_madd_d     /target/mips/fpu_helper.c
>>>   3 171,537  0.157%  1,014      169 helper_float_mul_d      /target/mips/fpu_helper.c
>>>   4 157,298  0.144%  2,443       64 helper_lookup_tb_ptr    /accel/tcg/tcg-runtime.c
>>>   5 138,123  0.127%    897      153 helper_float_add_d      /target/mips/fpu_helper.c
>>>   6  47,083  0.043%    207      227 helper_float_msub_d     /target/mips/fpu_helper.c
>>>   7  24,062  0.022%    487       49 helper_cmp_d_lt         /target/mips/fpu_helper.c
>>>   8  22,910  0.021%    150      152 helper_float_div_d      /target/mips/fpu_helper.c
>>>   9  15,497  0.014%    321       48 helper_cmp_d_eq         /target/mips/fpu_helper.c
>>>  10   9,100  0.008%     52      175 helper_float_trunc_w_d  /target/mips/fpu_helper.c
>>>  11   7,059  0.006%     10      705 helper_float_sqrt_d     /target/mips/fpu_helper.c
>>>  12   3,000  0.003%     40       75 helper_cmp_d_ule        /target/mips/fpu_helper.c
>>>  13   2,720  0.002%     20      136 helper_float_cvtd_w     /target/mips/fpu_helper.c
>>>  14   2,477  0.002%     27       91 helper_swl              /target/mips/op_helper.c
>>>  15   2,000  0.002%     40       50 helper_cmp_d_le         /target/mips/fpu_helper.c
>>>  16   1,800  0.002%     40       45 helper_cmp_d_un         /target/mips/fpu_helper.c
>>>  17   1,164  0.001%     12       97 helper_raise_exception_ /target/mips/op_helper.c
>>>  18     720  0.001%     10       72 helper_cmp_d_ult        /target/mips/fpu_helper.c
>>>  19     560  0.001%    140        4 helper_cfc1             /target/mips/fpu_helper.c
>>>
>>> Signed-off-by: Ahmed Karaman
>>> ---
>>>  scripts/performance/list_helpers.py | 207 ++++++++++++++++++++++++++++
>>>  1 file changed, 207 insertions(+)
>>>  create mode 100755 scripts/performance/list_helpers.py
>>>
>>> diff --git a/scripts/performance/list_helpers.py b/scripts/performance/list_helpers.py
>>> new file mode 100755
>>> index 0000000000..a97c7ed4fe
>>> --- /dev/null
>>> +++ b/scripts/performance/list_helpers.py
>>> @@ -0,0 +1,207 @@
>>> +#!/usr/bin/env python3
>>> +
>>> +# Print the executed helpers of a QEMU invocation.
>>> +#
>>> +# Syntax:
>>> +# list_helpers.py [-h] -- \
>>> +#     [] \
>>> +#     []
>>> +#
>>> +# [-h] - Print the script arguments help message.
>>> +#
>>> +# Example of usage:
>>> +# list_helpers.py -- qemu-mips coulomb_double-mips
>>> +#
>>> +# This file is a part of the project "TCG Continuous Benchmarking".
>>> +#
>>> +# Copyright (C) 2020 Ahmed Karaman
>>> +# Copyright (C) 2020 Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
>>> +#
>>> +# This program is free software: you can redistribute it and/or modify
>>> +# it under the terms of the GNU General Public License as published by
>>> +# the Free Software Foundation, either version 2 of the License, or
>>> +# (at your option) any later version.
>>> +#
>>> +# This program is distributed in the hope that it will be useful,
>>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>>> +# GNU General Public License for more details.
>>> +#
>>> +# You should have received a copy of the GNU General Public License
>>> +# along with this program. If not, see .
>>> +
>>> +import argparse
>>> +import os
>>> +import subprocess
>>> +import sys
>>> +import tempfile
>>> +
>>> +
>>> +def find_JIT_line(callgrind_data):
>>> +    """
>>> +    Search for the line with the JIT call in the callgrind_annotate
>>> +    output when run using --tree=calling.
>>> +    All the helpers should be listed after that line.
>>> +
>>> +    Parameters:
>>> +    callgrind_data (list): callgrind_annotate output
>>> +
>>> +    Returns:
>>> +    (int): Line number of JIT call
>>> +    """
>>> +    line = -1
>>> +    for i in range(len(callgrind_data)):
>>> +        split_line = callgrind_data[i].split()
>>> +        if len(split_line) > 2 and \
>>> +                split_line[1] == "*" and \
>>> +                split_line[-1] == "[???]":
>>> +            line = i
>>> +            break
>>> +    return line
>>> +
>>> +
>>> +def get_helpers(JIT_line, callgrind_data):
>>> +    """
>>> +    Get all helpers data given the line number of the JIT call.
>>> +
>>> +    Parameters:
>>> +    JIT_line (int): Line number of the JIT call
>>> +    callgrind_data (list): callgrind_annotate output
>>> +
>>> +    Returns:
>>> +    (list):[[number_of_instructions(int), helper_name(str),
>>> +             number_of_calls(int), source_file(str)]]
>>> +    """
>>> +    helpers = []
>>> +    next_helper = JIT_line + 1
>>> +    while (callgrind_data[next_helper] != "\n"):
>>> +        split_line = callgrind_data[next_helper].split()
>>> +        number_of_instructions = int(split_line[0].replace(",", ""))
>>> +        source_file = split_line[2].split(":")[0]
>>> +        callee_name = split_line[2].split(":")[1]
>>> +        number_of_calls = int(split_line[3][1:-2])
>>> +        helpers.append([number_of_instructions, callee_name,
>>> +                        number_of_calls, source_file])
>>> +        next_helper += 1
>>> +    return sorted(helpers, reverse=True)
>>> +
>>> +
>>> +def main():
>>> +    # Parse the command line arguments
>>> +    parser = argparse.ArgumentParser(
>>> +        usage="list_helpers.py [-h] -- "
>>> +        " [] "
>>> +        " []")
>>> +
>>> +    parser.add_argument("command", type=str, nargs="+", help=argparse.SUPPRESS)
>>> +
>>> +    args = parser.parse_args()
>>> +
>>> +    # Extract the needed variables from the args
>>> +    command = args.command
>>> +
>>> +    # Ensure that valgrind is installed
>>> +    check_valgrind = subprocess.run(
>>> +        ["which", "valgrind"], stdout=subprocess.DEVNULL)
>>> +    if check_valgrind.returncode:
>>> +        sys.exit("Please install valgrind before running the script.")
>>> +
>>> +    # Save all intermediate files in a temporary directory
>>> +    with tempfile.TemporaryDirectory() as tmpdirname:
>>> +        # callgrind output file path
>>> +        data_path = os.path.join(tmpdirname, "callgrind.data")
>>> +        # callgrind_annotate output file path
>>> +        annotate_out_path = os.path.join(tmpdirname, "callgrind_annotate.out")
>>> +
>>> +        # Run callgrind
>>> +        callgrind = subprocess.run((["valgrind",
>>> +                                     "--tool=callgrind",
>>> +                                     "--callgrind-out-file=" + data_path]
>>> +                                    + command),
>>> +                                   stdout=subprocess.DEVNULL,
>>> +                                   stderr=subprocess.PIPE)
>>> +        if callgrind.returncode:
>>> +            sys.exit(callgrind.stderr.decode("utf-8"))
>>> +
>>> +        # Save callgrind_annotate output
>>> +        with open(annotate_out_path, "w") as output:
>>> +            callgrind_annotate = subprocess.run(
>>> +                ["callgrind_annotate", data_path,
>>> +                 "--threshold=100", "--tree=calling"],
>>> +                stdout=output,
>>> +                stderr=subprocess.PIPE)
>>> +            if callgrind_annotate.returncode:
>>> +                sys.exit(callgrind_annotate.stderr.decode("utf-8"))
>>> +
>>> +        # Read the callgrind_annotate output to callgrind_data[]
>>> +        callgrind_data = []
>>> +        with open(annotate_out_path, "r") as data:
>>> +            callgrind_data = data.readlines()
>>> +
>>> +        # Line number with the total number of instructions
>>> +        total_instructions_line_number = 20
>>> +        # Get the total number of instructions
>>> +        total_instructions_line_data = \
>>> +            callgrind_data[total_instructions_line_number]
>>> +        total_instructions = total_instructions_line_data.split()[0]
>>> +
>>> +        print("Total number of instructions: {}\n".format(total_instructions))
>>> +
>>> +        # Remove commas and convert to int
>>> +        total_instructions = int(total_instructions.replace(",", ""))
>>> +
>>> +        # Line number with the JIT call
>>> +        JIT_line = find_JIT_line(callgrind_data)
>>> +        if JIT_line == -1:
>>> +            sys.exit("Couldn't locate the JIT call ... Exiting")
>>> +
>>> +        # Get helpers
>>> +        helpers = get_helpers(JIT_line, callgrind_data)
>>> +
>>> +        print("Executed QEMU Helpers:\n")
>>> +
>>> +        # Print table header
>>> +        print("{:>4} {:>15} {:>10} {:>15} {:>10} {:<25} {}".
>>> +              format(
>>> +                  "No.",
>>> +                  "Instructions",
>>> +                  "Percentage",
>>> +                  "Calls",
>>> +                  "Ins/Call",
>>> +                  "Helper Name",
>>> +                  "Source File")
>>> +              )
>>> +
>>> +        print("{:>4} {:>15} {:>10} {:>15} {:>10} {:<25} {}".
>>> +              format(
>>> +                  "-" * 4,
>>> +                  "-" * 15,
>>> +                  "-" * 10,
>>> +                  "-" * 15,
>>> +                  "-" * 10,
>>> +                  "-" * 25,
>>> +                  "-" * 30)
>>> +              )
>>> +
>>> +        for (index, callee) in enumerate(helpers, start=1):
>>> +            instructions = callee[0]
>>> +            percentage = (callee[0] / total_instructions) * 100
>>> +            calls = callee[2]
>>> +            instruction_per_call = int(callee[0] / callee[2])
>>> +            helper_name = callee[1]
>>> +            source_file = callee[3]
>>> +            # Print extracted data
>>> +            print("{:>4} {:>15} {:>9.3f}% {:>15} {:>10} {:<25} {}".
>>> +                  format(
>>> +                      index,
>>> +                      format(instructions, ","),
>>> +                      round(percentage, 3),
>>> +                      format(calls, ","),
>>> +                      format(instruction_per_call, ","),
>>> +                      helper_name,
>>> +                      source_file)
>>> +                  )
>>> +
>>> +
>>> +if __name__ == "__main__":
>>> +    main()
>>> --
>>> 2.17.1
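
To make the hand-written tree output suggested earlier in this thread a bit more concrete, here is a minimal sketch of a recursive printer. It is only an illustration, not part of the patch: the function name tree_lines, the max_depth parameter, and the callees dictionary (a map from a function name to the names of its immediate callees) are all hypothetical, and the script would still have to build such a map from the callgrind_annotate output.

```python
def tree_lines(root, callees, max_depth=None):
    """Return the lines of a box-drawing tree for root and its callees.

    callees is a hypothetical dict: function name -> list of callee names.
    max_depth (also hypothetical) limits how many levels are expanded.
    """
    lines = [root]

    def walk(node, prefix, depth, path):
        if max_depth is not None and depth >= max_depth:
            return
        children = callees.get(node, [])
        for i, child in enumerate(children):
            last = i == len(children) - 1
            lines.append(prefix + ("└── " if last else "├── ") + child)
            if child in path:
                # The child is already on the current call chain
                # (recursion): list it, but do not expand it again.
                continue
            walk(child, prefix + ("    " if last else "│   "),
                 depth + 1, path | {child})

    walk(root, "", 0, {root})
    return lines


# Hypothetical callee map, reusing the names from the example tree
# earlier in this thread.
callees = {
    "foo": ["bar", "baz", "qux"],
    "bar": ["a", "b"],
    "qux": ["c"],
}
print("\n".join(tree_lines("foo", callees)))
```

The prefix string carries the vertical bars of the ancestors down the recursion, so each level only has to decide between the "├── " and "└── " connectors. Tracking the current call chain in path keeps a recursive function from expanding forever.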