From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To: <20200716103921.6605-3-ahmedkhaledkaraman@gmail.com>
References: <20200716103921.6605-1-ahmedkhaledkaraman@gmail.com>
 <20200716103921.6605-3-ahmedkhaledkaraman@gmail.com>
From: Aleksandar Markovic
Date: Tue, 28 Jul 2020 12:30:52 +0200
Subject: [PATCH v2 2/2] scripts/performance: Add list_helpers.py script
To: Ahmed Karaman
Content-Type: multipart/alternative; boundary="000000000000026e4905ab7def25"
Cc: "ldoktor@redhat.com", "ehabkost@redhat.com", "philmd@redhat.com",
 "qemu-devel@nongnu.org", "crosa@redhat.com", "alex.bennee@linaro.org",
 "rth@twiddle.net"

--000000000000026e4905ab7def25
Content-Type: text/plain; charset="UTF-8"

On Thursday, July 16, 2020, Ahmed Karaman <ahmedkhaledkaraman@gmail.com> wrote:

> Python script that prints executed helpers of a QEMU invocation.
>
>
Hi, Ahmed.

You outlined the envisioned user workflow regarding this script in your report. As I understand it, it generally goes like this:

1) The user first discovers the helpers and their performance data.
2) The user examines the callees of a particular helper of choice (usually, the most instruction-consuming helper).
3) The user perhaps further examines a callee of a particular callee of that helper.
4) The user continues this way until a conclusion can be drawn, or the maximal depth is reached.
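
Steps 2) to 4) amount to a depth-first walk over the call graph. A minimal sketch of such a walk, assuming the caller-to-callee relation has already been extracted into a plain dictionary; the `call_graph` data, the `format_callee_tree` function and the `max_depth` parameter are hypothetical names for illustration and are not part of list_helpers.py:

```python
def format_callee_tree(call_graph, function, max_depth=3, depth=0, path=()):
    """Return indented lines listing `function` and, recursively, its callees."""
    lines = ["    " * depth + function]
    # Stop at the depth limit, and do not descend into a function that is
    # already on the current path (a call cycle would never terminate).
    if depth == max_depth or function in path:
        return lines
    for callee in call_graph.get(function, []):
        lines += format_callee_tree(call_graph, callee, max_depth,
                                    depth + 1, path + (function,))
    return lines


# Hypothetical call graph, for illustration only (not real profiling data).
call_graph = {
    "helper_a": ["callee_b", "callee_c"],
    "callee_b": ["callee_d"],
    "callee_d": ["callee_b"],   # cycle: callee_b <-> callee_d
}

print("\n".join(format_callee_tree(call_graph, "helper_a")))
```

Each nesting level is indented by four spaces, and the walk ends either at `max_depth` or when a call cycle is detected.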

The procedure might be time-consuming, since each step requires running an emulation of the test program.

This makes me think that a faster and easier tool for the user (though, to some not-so-great extent, harder for you) would be an improved list_helpers.py (and list_fn_callees.py) that provides a list of all callees for all helpers, in tree form (so, callees of callees, callees of callees of callees, etc.), rather than just a list of immediate callees, as it currently does.

I think you can provide such functionality relatively easily using recursion. See, for example:

https://realpython.com/python-thinking-recursively/

Perhaps you can have a switch (let's say, --tree <yes|no>) that specifies whether the script outputs just the immediate callee list or the entire callee tree.

Thanks,
Aleksandar

> Syntax:
> list_helpers.py [-h] -- \
>                <qemu executable> [<qemu executable options>] \
>                <target executable> [<target executable options>]
>
> [-h] - Print the script arguments help message.
>
> Example of usage:
> list_helpers.py -- qemu-mips coulomb_double-mips -n10
>
> Example output:
>  Total number of instructions: 108,933,695
>
>  Executed QEMU Helpers:
>
>  No. Ins     Percent  Calls Ins/Call Helper Name             Source File
>  --- ------- ------- ------ -------- --------------------    ---------------
>    1 183,021  0.168%  1,305      140 helper_float_sub_d      <qemu>/target/mips/fpu_helper.c
>    2 177,111  0.163%    770      230 helper_float_madd_d     <qemu>/target/mips/fpu_helper.c
>    3 171,537  0.157%  1,014      169 helper_float_mul_d      <qemu>/target/mips/fpu_helper.c
>    4 157,298  0.144%  2,443       64 helper_lookup_tb_ptr    <qemu>/accel/tcg/tcg-runtime.c
>    5 138,123  0.127%    897      153 helper_float_add_d      <qemu>/target/mips/fpu_helper.c
>    6  47,083  0.043%    207      227 helper_float_msub_d     <qemu>/target/mips/fpu_helper.c
>    7  24,062  0.022%    487       49 helper_cmp_d_lt         <qemu>/target/mips/fpu_helper.c
>    8  22,910  0.021%    150      152 helper_float_div_d      <qemu>/target/mips/fpu_helper.c
>    9  15,497  0.014%    321       48 helper_cmp_d_eq         <qemu>/target/mips/fpu_helper.c
>   10   9,100  0.008%     52      175 helper_float_trunc_w_d  <qemu>/target/mips/fpu_helper.c
>   11   7,059  0.006%     10      705 helper_float_sqrt_d     <qemu>/target/mips/fpu_helper.c
>   12   3,000  0.003%     40       75 helper_cmp_d_ule        <qemu>/target/mips/fpu_helper.c
>   13   2,720  0.002%     20      136 helper_float_cvtd_w     <qemu>/target/mips/fpu_helper.c
>   14   2,477  0.002%     27       91 helper_swl              <qemu>/target/mips/op_helper.c
>   15   2,000  0.002%     40       50 helper_cmp_d_le         <qemu>/target/mips/fpu_helper.c
>   16   1,800  0.002%     40       45 helper_cmp_d_un         <qemu>/target/mips/fpu_helper.c
>   17   1,164  0.001%     12       97 helper_raise_exception_ <qemu>/target/mips/op_helper.c
>   18     720  0.001%     10       72 helper_cmp_d_ult        <qemu>/target/mips/fpu_helper.c
>   19     560  0.001%    140        4 helper_cfc1             <qemu>/target/mips/fpu_helper.c
>
> Signed-off-by: Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
> ---
>  scripts/performance/list_helpers.py | 207 ++++++++++++++++++++++++++++
>  1 file changed, 207 insertions(+)
>  create mode 100755 scripts/performance/list_helpers.py
>
> diff --git a/scripts/performance/list_helpers.py b/scripts/performance/list_helpers.py
> new file mode 100755
> index 0000000000..a97c7ed4fe
> --- /dev/null
> +++ b/scripts/performance/list_helpers.py
> @@ -0,0 +1,207 @@
> +#!/usr/bin/env python3
> +
> +#  Print the executed helpers of a QEMU invocation.
> +#
> +#  Syntax:
> +#  list_helpers.py [-h] -- \
> +#                 <qemu executable> [<qemu executable options>] \
> +#                 <target executable> [<target executable options>]
> +#
> +#  [-h] - Print the script arguments help message.
> +#
> +#  Example of usage:
> +#  list_helpers.py -- qemu-mips coulomb_double-mips
> +#
> +#  This file is a part of the project "TCG Continuous Benchmarking".
> +#
> +#  Copyright (C) 2020  Ahmed Karaman <ahmedkhaledkaraman@gmail.com>
> +#  Copyright (C) 2020  Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
> +#
> +#  This program is free software: you can redistribute it and/or modify
> +#  it under the terms of the GNU General Public License as published by
> +#  the Free Software Foundation, either version 2 of the License, or
> +#  (at your option) any later version.
> +#
> +#  This program is distributed in the hope that it will be useful,
> +#  but WITHOUT ANY WARRANTY; without even the implied warranty of
> +#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +#  GNU General Public License for more details.
> +#
> +#  You should have received a copy of the GNU General Public License
> +#  along with this program. If not, see <https://www.gnu.org/licenses/>.
> +
> +import argparse
> +import os
> +import subprocess
> +import sys
> +import tempfile
> +
> +
> +def find_JIT_line(callgrind_data):
> +    """
> +    Search for the line with the JIT call in the callgrind_annotate
> +    output when run using --tree=calling.
> +    All the helpers should be listed after that line.
> +
> +    Parameters:
> +    callgrind_data (list): callgrind_annotate output
> +
> +    Returns:
> +    (int): Line number of JIT call
> +    """
> +    line = -1
> +    for i in range(len(callgrind_data)):
> +        split_line = callgrind_data[i].split()
> +        if len(split_line) > 2 and \
> +                split_line[1] == "*" and \
> +                split_line[-1] == "[???]":
> +            line = i
> +            break
> +    return line
> +
> +
> +def get_helpers(JIT_line, callgrind_data):
> +    """
> +    Get all helpers data given the line number of the JIT call.
> +
> +    Parameters:
> +    JIT_line (int): Line number of the JIT call
> +    callgrind_data (list): callgrind_annotate output
> +
> +    Returns:
> +    (list):[[number_of_instructions(int), helper_name(str),
> +             number_of_calls(int), source_file(str)]]
> +    """
> +    helpers = []
> +    next_helper = JIT_line + 1
> +    while (callgrind_data[next_helper] != "\n"):
> +        split_line = callgrind_data[next_helper].split()
> +        number_of_instructions = int(split_line[0].replace(",", ""))
> +        source_file = split_line[2].split(":")[0]
> +        callee_name = split_line[2].split(":")[1]
> +        number_of_calls = int(split_line[3][1:-2])
> +        helpers.append([number_of_instructions, callee_name,
> +                        number_of_calls, source_file])
> +        next_helper += 1
> +    return sorted(helpers, reverse=True)
> +
> +
> +def main():
> +    # Parse the command line arguments
> +    parser = argparse.ArgumentParser(
> +        usage="list_helpers.py [-h] -- "
> +        "<qemu executable> [<qemu executable options>] "
> +        "<target executable> [<target executable options>]")
> +
> +    parser.add_argument("command", type=str, nargs="+", help=argparse.SUPPRESS)
> +
> +    args = parser.parse_args()
> +
> +    # Extract the needed variables from the args
> +    command = args.command
> +
> +    # Ensure that valgrind is installed
> +    check_valgrind = subprocess.run(
> +        ["which", "valgrind"], stdout=subprocess.DEVNULL)
> +    if check_valgrind.returncode:
> +        sys.exit("Please install valgrind before running the script.")
> +
> +    # Save all intermediate files in a temporary directory
> +    with tempfile.TemporaryDirectory() as tmpdirname:
> +        # callgrind output file path
> +        data_path = os.path.join(tmpdirname, "callgrind.data")
> +        # callgrind_annotate output file path
> +        annotate_out_path = os.path.join(tmpdirname, "callgrind_annotate.out")
> +
> +        # Run callgrind
> +        callgrind = subprocess.run((["valgrind",
> +                                     "--tool=callgrind",
> +                                     "--callgrind-out-file=" + data_path]
> +                                    + command),
> +                                   stdout=subprocess.DEVNULL,
> +                                   stderr=subprocess.PIPE)
> +        if callgrind.returncode:
> +            sys.exit(callgrind.stderr.decode("utf-8"))
> +
> +        # Save callgrind_annotate output
> +        with open(annotate_out_path, "w") as output:
> +            callgrind_annotate = subprocess.run(
> +                ["callgrind_annotate", data_path,
> +                    "--threshold=100", "--tree=calling"],
> +                stdout=output,
> +                stderr=subprocess.PIPE)
> +            if callgrind_annotate.returncode:
> +                sys.exit(callgrind_annotate.stderr.decode("utf-8"))
> +
> +        # Read the callgrind_annotate output to callgrind_data[]
> +        callgrind_data = []
> +        with open(annotate_out_path, "r") as data:
> +            callgrind_data = data.readlines()
> +
> +        # Line number with the total number of instructions
> +        total_instructions_line_number = 20
> +        # Get the total number of instructions
> +        total_instructions_line_data = \
> +            callgrind_data[total_instructions_line_number]
> +        total_instructions = total_instructions_line_data.split()[0]
> +
> +        print("Total number of instructions: {}\n".format(total_instructions))
> +
> +        # Remove commas and convert to int
> +        total_instructions = int(total_instructions.replace(",", ""))
> +
> +        # Line number with the JIT call
> +        JIT_line = find_JIT_line(callgrind_data)
> +        if JIT_line == -1:
> +            sys.exit("Couldn't locate the JIT call ... Exiting")
> +
> +        # Get helpers
> +        helpers = get_helpers(JIT_line, callgrind_data)
> +
> +        print("Executed QEMU Helpers:\n")
> +
> +        # Print table header
> +        print("{:>4}  {:>15}  {:>10}  {:>15}  {:>10}  {:<25}  {}".
> +              format(
> +                  "No.",
> +                  "Instructions",
> +                  "Percentage",
> +                  "Calls",
> +                  "Ins/Call",
> +                  "Helper Name",
> +                  "Source File")
> +              )
> +
> +        print("{:>4}  {:>15}  {:>10}  {:>15}  {:>10}  {:<25}  {}".
> +              format(
> +                  "-" * 4,
> +                  "-" * 15,
> +                  "-" * 10,
> +                  "-" * 15,
> +                  "-" * 10,
> +                  "-" * 25,
> +                  "-" * 30)
> +              )
> +
> +        for (index, callee) in enumerate(helpers, start=1):
> +            instructions = callee[0]
> +            percentage = (callee[0] / total_instructions) * 100
> +            calls = callee[2]
> +            instruction_per_call = int(callee[0] / callee[2])
> +            helper_name = callee[1]
> +            source_file = callee[3]
> +            # Print extracted data
> +            print("{:>4}  {:>15}  {:>9.3f}%  {:>15}  {:>10}  {:<25}  {}".
> +                  format(
> +                      index,
> +                      format(instructions, ","),
> +                      round(percentage, 3),
> +                      format(calls, ","),
> +                      format(instruction_per_call, ","),
> +                      helper_name,
> +                      source_file)
> +                  )
> +
> +
> +if __name__ == "__main__":
> +    main()
> --
> 2.17.1
>
>

--000000000000026e4905ab7def25
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

--000000000000026e4905ab7def25--