From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6C1E2C4332F for ; Tue, 4 Oct 2022 21:15:19 +0000 (UTC) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4Mhr853tWkz3c4h for ; Wed, 5 Oct 2022 08:15:17 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20210112 header.b=gpPsRLmn; dkim-atps=neutral Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=google.com (client-ip=2a00:1450:4864:20::434; helo=mail-wr1-x434.google.com; envelope-from=irogers@google.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20210112 header.b=gpPsRLmn; dkim-atps=neutral Received: from mail-wr1-x434.google.com (mail-wr1-x434.google.com [IPv6:2a00:1450:4864:20::434]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4MhgbB3Q74z2xH6 for ; Wed, 5 Oct 2022 01:49:41 +1100 (AEDT) Received: by mail-wr1-x434.google.com with SMTP id bq9so21698529wrb.4 for ; Tue, 04 Oct 2022 07:49:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date; bh=KQKTLP2+V4k64zxBWdwqPH676jZVEuNhWrlkFhJXiW0=; b=gpPsRLmn338R+XE8oPiVvgHR1hud+jmqE5oNoFQ4lV6AfrX08KwlRKQ5KdgB5mpfhb A2X9f/EDS7ElgA7PpXuRLzN7ht4nLK6xV8/+2tVbLtTybbZ2FzNIqw+UsEm43VF/SBI+ A8KFW6254TFZv8tZMleC9/ZjdijMxBHiBZ0F3Fd/xFNVEtBKttXKRFnL8BsIblCeQ54e ns2RGI0aMOEl0kBPqzvytobJDkhQ9AL+fEa1aVEnOjeG4DJkHnC1IKxaRQovToUaFRPz UHVPEKnnpB1y+dB8DwtHhyrcDVEsUNXDLIs2Bbd09eUlDRXAuDLVSPSsuEoQ9ITLLXK4 Nc5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date; bh=KQKTLP2+V4k64zxBWdwqPH676jZVEuNhWrlkFhJXiW0=; b=yRz05S2r7kyPikLzw6Ba8b7gNwBwnQyV5Ts6Kr5wRLLhqUzCxiWXC9AhPQ9BX6C0t+ AV99DgoN2R1ZUxlsSO69Il1bwRDUeiegxqSLjXB8jjLRT3kjCq+O5FULNrjqioTIVxEg BaTPQ1pHdfXdeEQt64vg18saXN3JNoH5r2Rb9DFRdvn1rKNM8qca+TRcNlbR/5DWBPUN Asd0gX+ZuyiB3tdT2Fm6+TB5YSBmcd92j2tTwIogILvFpnHphA49Yv8QEKoWWJ4yZexk Oy0bkdneE74S3j2c+44Y3FygoFzJX5VP7aA4NZyTVCmoMO3Ky7T1hch1pxTOGJX+t5Ks wpbA== X-Gm-Message-State: ACrzQf1FIPbxKQoM6Dh+kdKVzlPkrqmuEYB+igjiCkpZRf3682r7dBxO hon55apwd1YDqomsvIOpmUBq0ksj38t8yy/cL3jdcA== X-Google-Smtp-Source: AMsMyM6R2y9dkdoLtklwqyeIfdXfYj/vxo1jenp6H3oM+aWlHB9lppTUmTOBCsJGLFweJsgIMS4gZqpsRzNzE6d4kHo= X-Received: by 2002:adf:fd50:0:b0:22e:5503:9c4c with SMTP id h16-20020adffd50000000b0022e55039c4cmr1731173wrs.375.1664894973586; Tue, 04 Oct 2022 07:49:33 -0700 (PDT) MIME-Version: 1.0 References: <20220913115717.36191-1-atrajeev@linux.vnet.ibm.com> <4792ef3a-8790-a0d4-50f2-4917874d81d6@arm.com> <82D5587E-593A-43A7-92D7-7E095E2BE9A9@linux.vnet.ibm.com> <6627ae9a-91e3-0923-1234-54e0fc3f4916@arm.com> <993a1391ee931e859d972c460644d171@imap.linux.ibm.com> <6A5D0603-CF66-43B4-A13F-0308CF01967A@linux.vnet.ibm.com> In-Reply-To: <6A5D0603-CF66-43B4-A13F-0308CF01967A@linux.vnet.ibm.com> From: Ian Rogers Date: Tue, 4 Oct 2022 07:49:21 -0700 Message-ID: Subject: Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value To: Athira Rajeev Content-Type: multipart/alternative; boundary="0000000000007de18005ea369141" X-Mailman-Approved-At: Wed, 05 Oct 2022 08:14:24 +1100 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: maddy@linux.vnet.ibm.com, Nageswara Sastry , Kajol Jain , Arnaldo Carvalho de Melo , linux-perf-users , James Clark , Jiri Olsa , atrajeev , Disha Goel , linuxppc-dev@lists.ozlabs.org Errors-To: linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Sender: "Linuxppc-dev" --0000000000007de18005ea369141 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Oct 4, 2022, 12:06 AM Athira Rajeev wrote: > > > > On 04-Oct-2022, at 12:21 AM, Ian Rogers wrote: > > > > On Mon, Oct 3, 2022 at 7:03 AM atrajeev > wrote: > >> > >> On 2022-10-02 05:17, Ian Rogers wrote: > >>> On Thu, Sep 29, 2022 at 5:56 AM James Clark > >>> wrote: > >>>> > >>>> > >>>> > >>>> On 29/09/2022 09:49, Athira Rajeev wrote: > >>>>> > >>>>> > >>>>>> On 28-Sep-2022, at 9:05 PM, James Clark > wrote: > >>>>>> > >>>>>> > >>>>>> > >>>>> > >>>>> Hi James, > >>>>> > >>>>> Thanks for looking at the patch and sharing review comments. > >>>>> > >>>>>> On 13/09/2022 12:57, Athira Rajeev wrote: > >>>>>>> perf stat includes option to specify aggr_mode to display > >>>>>>> per-socket, per-core, per-die, per-node counter details. > >>>>>>> Also there is option -A ( AGGR_NONE, -no-aggr ), where the > >>>>>>> counter values are displayed for each cpu along with "CPU" > >>>>>>> value in one field of the output. > >>>>>>> > >>>>>>> Each of the aggregate mode uses the information fetched > >>>>>>> from "/sys/devices/system/cpu/cpuX/topology" like core_id, > >>>>>> > >>>>>> I thought that this wouldn't apply to the cpu field because cpu is > >>>>>> basically interchangeable as an index in cpumap, rather than > anything > >>>>>> being read from the topology file. > >>>>> > >>>>> The cpu value is filled in this function: > >>>>> > >>>>> Function : aggr_cpu_id__cpu > >>>>> Code: util/cpumap.c > >>>>> > >>>>>> > >>>>>>> physical_package_id. Utility functions in "cpumap.c" fetches > >>>>>>> this information and populates the socket id, core id, cpu etc. > >>>>>>> If the platform does not expose the topology information, > >>>>>>> these values will be set to -1. Example, in case of powerpc, > >>>>>>> details like physical_package_id is restricted to be exposed > >>>>>>> in pSeries platform. So id.socket, id.core, id.cpu all will > >>>>>>> be set as -1. > >>>>>>> > >>>>>>> In case of displaying socket or die value, there is no check > >>>>>>> done in the "aggr_printout" function to see if it points to > >>>>>>> valid socket id or die. But for displaying "cpu" value, there > >>>>>>> is a check for "if (id.core > -1)". In case of powerpc pSeries > >>>>>>> where detail like physical_package_id is restricted to be > >>>>>>> exposed, id.core will be set to -1. Hence the column or field > >>>>>>> itself for CPU won't be displayed in the output. > >>>>>>> > >>>>>>> Result for per-socket: > >>>>>>> > >>>>>>> <<>> > >>>>>>> perf stat -e branches --per-socket -a true > >>>>>>> > >>>>>>> Performance counter stats for 'system wide': > >>>>>>> > >>>>>>> S-1 32 416,851 branches > >>>>>>> <<>> > >>>>>>> > >>>>>>> Here S has -1 in above result. But with -A option which also > >>>>>>> expects CPU in one column in the result, below is observed. > >>>>>>> > >>>>>>> <<>> > >>>>>>> /bin/perf stat -e instructions -A -a true > >>>>>>> > >>>>>>> Performance counter stats for 'system wide': > >>>>>>> > >>>>>>> 47,146 instructions > >>>>>>> 45,226 instructions > >>>>>>> 43,354 instructions > >>>>>>> 45,184 instructions > >>>>>>> <<>> > >>>>>>> > >>>>>>> If the cpu id value is pointing to -1 also, it makes sense > >>>>>>> to display the column in the output to replicate the behaviour > >>>>>>> or to be in precedence with other aggr options(like per-socket, > >>>>>>> per-core). Remove the check "id.core" so that CPU field gets > >>>>>>> displayed in the output. > >>>>>> > >>>>>> Why would you want to print -1 out? Seems like the if statement wa= s > a > >>>>>> good one to me, otherwise the output looks a bit broken to users. > Are > >>>>>> the other aggregation modes even working if -1 is set for socket a= nd > >>>>>> die? Maybe we need to not print -1 in those cases or exit earlier > with a > >>>>>> failure. > >>>>>> > >>>>>> The -1 value has a specific internal meaning which is "to not > >>>>>> aggregate". It doesn't mean "not set". > >>>>> > >>>>> Currently, this check is done only for printing cpu value. > >>>>> For socket/die/core values, this check is not done. Pasting an > >>>>> example snippet from a powerpc system ( specifically from pseries > platform where > >>>>> the value is set to -1 ) > >>>>> > >>>>> ./perf stat --per-core -a -C 1 true > >>>>> > >>>>> Performance counter stats for 'system wide': > >>>>> > >>>>> S-1-D-1-C-1 1 1.06 msec cpu-clock > # 1.018 CPUs utilized > >>>>> S-1-D-1-C-1 1 2 context-switches > # 1.879 K/sec > >>>>> S-1-D-1-C-1 1 0 cpu-migrations > # 0.000 /sec > >>>>> > >>>>> Here though the value is -1, we are displaying it. Where as in case > of cpu, the first column will be > >>>>> empty since we do a check before printing. > >>>>> > >>>>> Example: > >>>>> > >>>>> ./perf stat --per-core -A -C 1 true > >>>>> > >>>>> Performance counter stats for 'CPU(s) 1': > >>>>> > >>>>> 0.88 msec cpu-clock # 1.022 > CPUs utilized > >>>>> 2 context-switches > >>>>> 0 cpu-migrations > >>>>> > >>>>> > >>>>> No sure, whether there are scripts out there, which consume the > current format and > >>>>> not displaying -1 may break it. That is why we tried with change to > remove check for cpu, similar to > >>>>> other modes like socket, die, core etc. > >>>> > >>>> I wouldn't worry about that because there are json and CSV modes whi= ch > >>>> are machine readable, and -1 is already not always displayed. If > >>>> anything this change here is also likely to break parsing by adding = -1 > >>>> where it wasn't before. > >>>> > >>>>> > >>>>> Also perf code ie =E2=80=9Caggr_cpu_id__empty=E2=80=9D in util/cpum= ap.c initialises > the > >>>>> values to -1 . I was checking to see where we are mapping -1 to =E2= =80=9Cto > not aggregate=E2=80=9D. > >>>>> What I could find is AGGR_NONE ( which is for no-aggr ) has value a= s > zero. > >>>>> > >>>>> Reference: defined in util/stat.h > >>>>> > >>>>> enum aggr_mode { > >>>>> AGGR_NONE, > >>>>> > >>>> > >>>> That enum is never written to any of the cpumap members, that define= s > >>>> the mode of how to fill the cpu map instead. 0 is a valid value, for > >>>> example "CPU 0". -1 is used as a special case and shouldn't be > >>>> displayed > >>>> IMO. > >>>> > >>>> Did you see my comment in the code below about the bad merge? Could > >>>> that > >>>> not be related to your issue? > >>> > >>> I'm suspicious of this too. In Claire's patch: > >>> > >>> case AGGR_NONE: > >>> - if (evsel->percore && !config->percore_show_thread) { > >>> - fprintf(config->output, "S%d-D%d-C%*d%s", > >>> - id.socket, > >>> - id.die, > >>> - config->csv_output ? 0 : -3, > >>> - id.core, config->csv_sep); > >>> - } else if (id.cpu.cpu > -1) { > >>> - fprintf(config->output, "CPU%*d%s", > >>> - config->csv_output ? 0 : -7, > >>> - id.cpu.cpu, config->csv_sep); > >>> + if (config->json_output) { > >>> + if (evsel->percore && > >>> !config->percore_show_thread) { > >>> + fprintf(config->output, "\"core\" : > >>> \"S%d-D%d-C%d\"", > >>> + id.socket, > >>> + id.die, > >>> + id.core); > >>> + } else if (id.core > -1) { > >>> + fprintf(config->output, "\"cpu\" : > >>> \"%d\", ", > >>> + id.cpu.cpu); > >>> + } > >>> + } else { > >>> + if (evsel->percore && > >>> !config->percore_show_thread) { > >>> + fprintf(config->output, > >>> "S%d-D%d-C%*d%s", > >>> + id.socket, > >>> + id.die, > >>> + config->csv_output ? 0 : -3, > >>> + id.core, config->csv_sep); > >>> + } else if (id.core > -1) { > >>> + fprintf(config->output, "CPU%*d%s", > >>> + config->csv_output ? 0 : -7, > >>> + id.cpu.cpu, config->csv_sep); > >>> + } > >>> } > >>> break; > >>> > >>> The old code was using "id.cpu.cpu > -1" while the new code is > >>> "id.core > -1". The value printed is id.cpu.cpu and so testing id.cor= e > >>> makes less sense to me. Going back to the original patch: > >>> > >>> > https://lore.kernel.org/lkml/20210811224317.1811618-1-cjense@google.com/ > >>> case AGGR_NONE: > >>> - if (evsel->percore && !config->percore_show_thread) { > >>> - fprintf(config->output, "S%d-D%d-C%*d%s", > >>> - id.socket, > >>> - id.die, > >>> - config->csv_output ? 0 : -3, > >>> - id.core, config->csv_sep); > >>> + if (config->json_output) { > >>> + if (evsel->percore && !config->percore_show_thread) { > >>> + fprintf(config->output, "\"core\" : \"S%d-D%d-C%d\"", > >>> + id.socket, > >>> + id.die, > >>> + id.core); > >>> + } else if (id.core > -1) { > >>> + fprintf(config->output, "\"cpu\" : \"%d\", ", > >>> + evsel__cpus(evsel)->map[id.core]); > >>> + } > >>> + } else { > >>> + if (evsel->percore && !config->percore_show_thread) { > >>> + fprintf(config->output, "S%d-D%d-C%*d%s", > >>> + id.socket, > >>> + id.die, > >>> + config->csv_output ? 0 : -3, > >>> + id.core, config->csv_sep); > >>> } else if (id.core > -1) { > >>> fprintf(config->output, "CPU%*d%s", > >>> config->csv_output ? 0 : -7, > >>> evsel__cpus(evsel)->map[id.core], > >>> config->csv_sep); > >>> - } > >>> + } > >>> + } > >>> + > >>> break; > >>> > >>> So testing the id.core isn't a bad index makes sense. However, we > >>> changed from core to CPU here: > >>> > https://lore.kernel.org/all/20220105061351.120843-26-irogers@google.com/ > >>> and that was because of: > >>> https://lore.kernel.org/r/20220105061351.120843-25-irogers@google.com > >>> > >>> So I think the code needs to test CPU and not core. Whether that is > >>> addressing the Power test failures is another matter, as James said w= e > >>> may need a fix in the tests for that. > >>> > >> > >> Hi Ian, James > >> > >> Thanks for the reviews and suggestions. > >> > >> After checking through the original commits for id.core vs cpu check, > >> sharing patch below to test CPU and not core. > >> > >> From 4dd98d953940deb2f85176cb6b4ecbfd18dbdbf9 Mon Sep 17 00:00:00 2001 > >> From: Athira Rajeev > >> Date: Mon, 3 Oct 2022 15:47:27 +0530 > >> Subject: [PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in > >> aggr_printout > >> > >> perf stat has options to aggregate the counts in different > >> modes like per socket, per core etc. The function "aggr_printout" > >> in util/stat-display.c which is used to print the aggregates, > >> has a check for cpu in case of AGGR_NONE. This check was > >> originally using condition : "if (id.cpu.cpu > -1)". But > >> this got changed after commit df936cadfb58 ("perf stat: Add > >> JSON output option"), which added option to output json format > >> for different aggregation modes. After this commit, the > >> check in "aggr_printout" is using "if (id.core > -1)". > >> > >> The old code was using "id.cpu.cpu > -1" while the new code > >> is using "id.core > -1". But since the value printed is > >> id.cpu.cpu, fix this check to use cpu and not core. > >> > >> Signed-off-by: Athira Rajeev > >> Suggested-by: James Clark > >> Suggested-by: Ian Rogers > > > > The change below works on my dual socket SkylakeX: > > .. > > 85: perf stat CSV output linter : > > Ok > > 86: perf stat csv summary test : O= k > > 87: perf stat JSON output linter : O= k > > .. > > I don't see anything else out of the ordinary. > > > > Thanks, > > Ian > > > > Hi Ian, > Thanks for helping with testing. Can I add your Tested-by for the patch ? > Yep. Tested-by: Ian Rogers Thanks, Ian Arnaldo, > Please suggest if I have to send as separate patch for the cpu check fix > patch pasted above: > "tools/perf: Fix cpu check to use id.cpu.cpu in aggr_printout=E2=80=9D > > Thanks > Athira > >> --- > >> tools/perf/util/stat-display.c | 4 ++-- > >> 1 file changed, 2 insertions(+), 2 deletions(-) > >> > >> diff --git a/tools/perf/util/stat-display.c > >> b/tools/perf/util/stat-display.c > >> index b82844cb0ce7..cf28020798ec 100644 > >> --- a/tools/perf/util/stat-display.c > >> +++ b/tools/perf/util/stat-display.c > >> @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config > >> *config, > >> id.socket, > >> id.die, > >> id.core); > >> - } else if (id.core > -1) { > >> + } else if (id.cpu.cpu > -1) { > >> fprintf(config->output, "\"cpu\" : > \"%d\", ", > >> id.cpu.cpu); > >> } > >> @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config > >> *config, > >> id.die, > >> config->csv_output ? 0 : -3, > >> id.core, config->csv_sep); > >> - } else if (id.core > -1) { > >> + } else if (id.cpu.cpu > -1) { > >> fprintf(config->output, "CPU%*d%s", > >> config->csv_output ? 0 : -7, > >> id.cpu.cpu, config->csv_sep); > >> -- > >> 2.31.1 > >> > >> Can you suggest or help to test this patch change. > >> > >> To address the test failure, as James suggested, I will handle fix in > >> testcases and post them > >> as a separate patch. Plan is to add a sanity check in the tests to see > >> if the "physical_packagge_id" ( ie socket id ) in topology points to -= 1 > >> and if so skip the test. Also in parallel, checking to see how we can > >> handle the aggregation modes to work incase of "-1" value for socket o= r > >> die > >> > >> Thanks > >> Athira > >> > >>> Thanks, > >>> Ian > >>> > >>>> Or the one about fixing it in the test instead? Or failing early if > >>>> the > >>>> topology can't be read? > >>>> > >>>> I'm still not convinced that any of the modes where -1 is printed ar= e > >>>> even working properly so it might be best to fix that rather than ju= st > >>>> the printout. > >>>> > >>>>> James, can you point me to reference for that meaning if I have > missed anything. > >>>> > >>>> It's here: > >>>> > >>>> /** Identify where counts are aggregated, -1 implies not to > >>>> aggregate. */ > >>>> struct aggr_cpu_id { > >>>> > >>>>> > >>>>> Thanks > >>>>> Athira > >>>>> > >>>>>> > >>>>>>> > >>>>>>> After the fix: > >>>>>>> > >>>>>>> <<>> > >>>>>>> perf stat -e instructions -A -a true > >>>>>>> > >>>>>>> Performance counter stats for 'system wide': > >>>>>>> > >>>>>>> CPU-1 64,034 instructions > >>>>>>> CPU-1 68,941 instructions > >>>>>>> CPU-1 59,418 instructions > >>>>>>> CPU-1 70,478 instructions > >>>>>>> CPU-1 65,201 instructions > >>>>>>> CPU-1 63,704 instructions > >>>>>>> <<>> > >>>>>>> > >>>>>>> This is caught while running "perf test" for > >>>>>>> "stat+json_output.sh" and "stat+csv_output.sh". > >>>>>> > >>>>>> Is it possible to fix the issue by making the tests cope with the > lack > >>>>>> of the CPU id? > >>>>>> > >>>>>>> > >>>>>>> Reported-by: Disha Goel > >>>>>>> Signed-off-by: Athira Rajeev > >>>>>>> --- > >>>>>>> tools/perf/util/stat-display.c | 6 ++---- > >>>>>>> 1 file changed, 2 insertions(+), 4 deletions(-) > >>>>>>> > >>>>>>> diff --git a/tools/perf/util/stat-display.c > b/tools/perf/util/stat-display.c > >>>>>>> index b82844cb0ce7..1b751a730271 100644 > >>>>>>> --- a/tools/perf/util/stat-display.c > >>>>>>> +++ b/tools/perf/util/stat-display.c > >>>>>>> @@ -168,10 +168,9 @@ static void aggr_printout(struct > perf_stat_config *config, > >>>>>>> id.socket, > >>>>>>> id.die, > >>>>>>> id.core); > >>>>>>> - } else if (id.core > -1) { > >>>>>>> + } else > >>>>>> > >>>>>> This should have been "id.cpu.cpu > -1". Looks like it was changed > by > >>>>>> some kind of bad merge or rebase in df936cadfb because there is no > >>>>>> obvious justification for the change to .core in that commit. > >>>>> > >>>>>> > >>>>>>> fprintf(config->output, "\"cpu\" : > \"%d\", ", > >>>>>>> id.cpu.cpu); > >>>>>>> - } > >>>>>>> } else { > >>>>>>> if (evsel->percore && > !config->percore_show_thread) { > >>>>>>> fprintf(config->output, > "S%d-D%d-C%*d%s", > >>>>>>> @@ -179,11 +178,10 @@ static void aggr_printout(struct > perf_stat_config *config, > >>>>>>> id.die, > >>>>>>> config->csv_output ? 0 : -3, > >>>>>>> id.core, config->csv_sep); > >>>>>>> - } else if (id.core > -1) { > >>>>>>> + } else > >>>>>>> fprintf(config->output, "CPU%*d%s", > >>>>>>> config->csv_output ? 0 : -7, > >>>>>>> id.cpu.cpu, config->csv_sep); > >>>>>>> - } > >>>>>>> } > >>>>>>> break; > >>>>>>> case AGGR_THREAD: > >>>>> > > --0000000000007de18005ea369141 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


On Tue, Oct 4, 2022, 12:06 AM Athira Rajeev <atrajeev@linux.vnet.ibm.com>= ; wrote:


> On 04-Oct-2022, at 12:21 AM, Ian Rogers <irogers@google.com>= wrote:
>
> On Mon, Oct 3, 2022 at 7:03 AM atrajeev <atrajeev@imap.lin= ux.ibm.com> wrote:
>>
>> On 2022-10-02 05:17, Ian Rogers wrote:
>>> On Thu, Sep 29, 2022 at 5:56 AM James Clark <james.clark@a= rm.com>
>>> wrote:
>>>>
>>>>
>>>>
>>>> On 29/09/2022 09:49, Athira Rajeev wrote:
>>>>>
>>>>>
>>>>>> On 28-Sep-2022, at 9:05 PM, James Clark <james= .clark@arm.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> Hi James,
>>>>>
>>>>> Thanks for looking at the patch and sharing review com= ments.
>>>>>
>>>>>> On 13/09/2022 12:57, Athira Rajeev wrote:
>>>>>>> perf stat includes option to specify aggr_mode= to display
>>>>>>> per-socket, per-core, per-die, per-node counte= r details.
>>>>>>> Also there is option -A ( AGGR_NONE, -no-aggr = ), where the
>>>>>>> counter values are displayed for each cpu alon= g with "CPU"
>>>>>>> value in one field of the output.
>>>>>>>
>>>>>>> Each of the aggregate mode uses the informatio= n fetched
>>>>>>> from "/sys/devices/system/cpu/cpuX/topolo= gy" like core_id,
>>>>>>
>>>>>> I thought that this wouldn't apply to the cpu = field because cpu is
>>>>>> basically interchangeable as an index in cpumap, r= ather than anything
>>>>>> being read from the topology file.
>>>>>
>>>>> The cpu value is filled in this function:
>>>>>
>>>>> Function : aggr_cpu_id__cpu
>>>>> Code: util/cpumap.c
>>>>>
>>>>>>
>>>>>>> physical_package_id. Utility functions in &quo= t;cpumap.c" fetches
>>>>>>> this information and populates the socket id, = core id, cpu etc.
>>>>>>> If the platform does not expose the topology i= nformation,
>>>>>>> these values will be set to -1. Example, in ca= se of powerpc,
>>>>>>> details like physical_package_id is restricted= to be exposed
>>>>>>> in pSeries platform. So id.socket, id.core, id= .cpu all will
>>>>>>> be set as -1.
>>>>>>>
>>>>>>> In case of displaying socket or die value, the= re is no check
>>>>>>> done in the "aggr_printout" function= to see if it points to
>>>>>>> valid socket id or die. But for displaying &qu= ot;cpu" value, there
>>>>>>> is a check for "if (id.core > -1)"= ;. In case of powerpc pSeries
>>>>>>> where detail like physical_package_id is restr= icted to be
>>>>>>> exposed, id.core will be set to -1. Hence the = column or field
>>>>>>> itself for CPU won't be displayed in the o= utput.
>>>>>>>
>>>>>>> Result for per-socket:
>>>>>>>
>>>>>>> <<>>
>>>>>>> perf stat -e branches --per-socket -a true
>>>>>>>
>>>>>>> Performance counter stats for 'system wide= ':
>>>>>>>
>>>>>>> S-1=C2=A0 =C2=A0 =C2=A0 32=C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 416,851=C2=A0 =C2=A0 =C2=A0 branches
>>>>>>> <<>>
>>>>>>>
>>>>>>> Here S has -1 in above result. But with -A opt= ion which also
>>>>>>> expects CPU in one column in the result, below= is observed.
>>>>>>>
>>>>>>> <<>>
>>>>>>> /bin/perf stat -e instructions -A -a true
>>>>>>>
>>>>>>> Performance counter stats for 'system wide= ':
>>>>>>>
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A047,146= =C2=A0 =C2=A0 =C2=A0 instructions
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A045,226= =C2=A0 =C2=A0 =C2=A0 instructions
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A043,354= =C2=A0 =C2=A0 =C2=A0 instructions
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A045,184= =C2=A0 =C2=A0 =C2=A0 instructions
>>>>>>> <<>>
>>>>>>>
>>>>>>> If the cpu id value is pointing to -1 also, it= makes sense
>>>>>>> to display the column in the output to replica= te the behaviour
>>>>>>> or to be in precedence with other aggr options= (like per-socket,
>>>>>>> per-core). Remove the check "id.core"= ; so that CPU field gets
>>>>>>> displayed in the output.
>>>>>>
>>>>>> Why would you want to print -1 out? Seems like the= if statement was a
>>>>>> good one to me, otherwise the output looks a bit b= roken to users. Are
>>>>>> the other aggregation modes even working if -1 is = set for socket and
>>>>>> die? Maybe we need to not print -1 in those cases = or exit earlier with a
>>>>>> failure.
>>>>>>
>>>>>> The -1 value has a specific internal meaning which= is "to not
>>>>>> aggregate". It doesn't mean "not set= ".
>>>>>
>>>>> Currently, this check is done only for printing cpu va= lue.
>>>>> For socket/die/core values, this check is not done. Pa= sting an
>>>>> example snippet from a powerpc system ( specifically f= rom pseries platform where
>>>>> the value is set to -1 )
>>>>>
>>>>> ./perf stat --per-core -a -C 1 true
>>>>>
>>>>> Performance counter stats for 'system wide': >>>>>
>>>>> S-1-D-1-C-1=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A01.06 msec cpu-clock=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 #=C2=A0 =C2=A0 1.018 CPUs utilized
>>>>> S-1-D-1-C-1=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 2=C2=A0 =C2=A0 =C2= =A0 context-switches=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0#=C2=A0 =C2=A0 1.879 K/sec
>>>>> S-1-D-1-C-1=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 1=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 0=C2=A0 =C2=A0 =C2= =A0 cpu-migrations=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0#=C2=A0 =C2=A0 0.000 /sec
>>>>>
>>>>> Here though the value is -1, we are displaying it. Whe= re as in case of cpu, the first column will be
>>>>> empty since we do a check before printing.
>>>>>
>>>>> Example:
>>>>>
>>>>> ./perf stat --per-core -A -C 1 true
>>>>>
>>>>> Performance counter stats for 'CPU(s) 1':
>>>>>
>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 0.88 m= sec cpu-clock=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 #=C2=A0 =C2=A0 1.022 CPUs utilized
>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A02=C2=A0 =C2=A0 =C2=A0 context-switches
>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A00=C2=A0 =C2=A0 =C2=A0 cpu-migrations
>>>>>
>>>>>
>>>>> No sure, whether there are scripts out there, which co= nsume the current format and
>>>>> not displaying -1 may break it. That is why we tried w= ith change to remove check for cpu, similar to
>>>>> other modes like socket, die, core etc.
>>>>
>>>> I wouldn't worry about that because there are json and= CSV modes which
>>>> are machine readable, and -1 is already not always display= ed. If
>>>> anything this change here is also likely to break parsing = by adding -1
>>>> where it wasn't before.
>>>>
>>>>>
>>>>> Also perf code ie =E2=80=9Caggr_cpu_id__empty=E2=80=9D= in util/cpumap.c initialises the
>>>>> values to -1 . I was checking to see where we are mapp= ing -1 to =E2=80=9Cto not aggregate=E2=80=9D.
>>>>> What I could find is AGGR_NONE ( which is for no-aggr = ) has value as zero.
>>>>>
>>>>> Reference: defined in util/stat.h
>>>>>
>>>>> enum aggr_mode {
>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 AGGR_NONE,
>>>>>
>>>>
>>>> That enum is never written to any of the cpumap members, t= hat defines
>>>> the mode of how to fill the cpu map instead. 0 is a valid = value, for
>>>> example "CPU 0". -1 is used as a special case an= d shouldn't be
>>>> displayed
>>>> IMO.
>>>>
>>>> Did you see my comment in the code below about the bad mer= ge? Could
>>>> that
>>>> not be related to your issue?
>>>
>>> I'm suspicious of this too. In Claire's patch:
>>>
>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 case AGGR_NONE:
>>> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if (ev= sel->percore && !config->percore_show_thread) {
>>> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0fprintf(config->output, "S%d-D%d-C%*d%s&quo= t;,
>>> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0id.socket,
>>> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0id.die,
>>> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0config->csv_output ?= 0 : -3,
>>> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0id.core, config->csv= _sep);
>>> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0} else= if (id.cpu.cpu > -1) {
>>> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0fprintf(config->output, "CPU%*d%s", >>> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0config->csv_output ?= 0 : -7,
>>> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0id.cpu.cpu, config->= csv_sep);
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if (co= nfig->json_output) {
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0if (evsel->percore &&
>>> !config->percore_show_thread) {
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0fprintf(config->outp= ut, "\"core\" :
>>> \"S%d-D%d-C%d\"",
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0id.socket,
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0id.die,
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0id.core);
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0} else if (id.core > -1) {
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0fprintf(config->outp= ut, "\"cpu\" :
>>> \"%d\", ",
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0id.cpu.cpu);
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0}
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0} else= {
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0if (evsel->percore &&
>>> !config->percore_show_thread) {
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0fprintf(config->outp= ut,
>>> "S%d-D%d-C%*d%s",
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0id.socket,
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0id.die,
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0config->csv_output ? 0 : -3,
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0id.core, config->csv_sep);
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0} else if (id.core > -1) {
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0fprintf(config->outp= ut, "CPU%*d%s",
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0config->csv_output ? 0 : -7,
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0id.cpu.cpu, config->csv_sep);
>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0}
>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0}
>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0break; >>>
>>> The old code was using "id.cpu.cpu > -1" while th= e new code is
>>> "id.core > -1". The value printed is id.cpu.cpu a= nd so testing id.core
>>> makes less sense to me. Going back to the original patch:
>>>
>>> http= s://lore.kernel.org/lkml/20210811224317.1811618-1-cjense@google.com/ >>>=C2=A0 case AGGR_NONE:
>>> - if (evsel->percore && !config->percore_show_th= read) {
>>> - fprintf(config->output, "S%d-D%d-C%*d%s",
>>> - id.socket,
>>> - id.die,
>>> - config->csv_output ? 0 : -3,
>>> - id.core, config->csv_sep);
>>> + if (config->json_output) {
>>> + if (evsel->percore && !config->percore_show_th= read) {
>>> + fprintf(config->output, "\"core\" : \"= ;S%d-D%d-C%d\"",
>>> + id.socket,
>>> + id.die,
>>> + id.core);
>>> + } else if (id.core > -1) {
>>> + fprintf(config->output, "\"cpu\" : \"= %d\", ",
>>> + evsel__cpus(evsel)->map[id.core]);
>>> + }
>>> + } else {
>>> + if (evsel->percore && !config->percore_show_th= read) {
>>> + fprintf(config->output, "S%d-D%d-C%*d%s",
>>> + id.socket,
>>> + id.die,
>>> + config->csv_output ? 0 : -3,
>>> + id.core, config->csv_sep);
>>>=C2=A0 } else if (id.core > -1) {
>>>=C2=A0 fprintf(config->output, "CPU%*d%s",
>>>=C2=A0 config->csv_output ? 0 : -7,
>>>=C2=A0 evsel__cpus(evsel)->map[id.core],
>>>=C2=A0 config->csv_sep);
>>> - }
>>> + }
>>> + }
>>> +
>>>=C2=A0 break;
>>>
>>> So testing the id.core isn't a bad index makes sense. Howe= ver, we
>>> changed from core to CPU here:
>>> http= s://lore.kernel.org/all/20220105061351.120843-26-irogers@google.com/ >>> and that was because of:
>>> https:/= /lore.kernel.org/r/20220105061351.120843-25-irogers@google.com
>>>
>>> So I think the code needs to test CPU and not core. Whether th= at is
>>> addressing the Power test failures is another matter, as James= said we
>>> may need a fix in the tests for that.
>>>
>>
>> Hi Ian, James
>>
>> Thanks for the reviews and suggestions.
>>
>> After checking through the original commits for id.core vs cpu che= ck,
>> sharing patch below to test CPU and not core.
>>
>> From 4dd98d953940deb2f85176cb6b4ecbfd18dbdbf9 Mon Sep 17 00:00:00 = 2001
>> From: Athira Rajeev <atrajeev@linux.vnet.ibm.com&g= t;
>> Date: Mon, 3 Oct 2022 15:47:27 +0530
>> Subject: [PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in >> aggr_printout
>>
>> perf stat has options to aggregate the counts in different
>> modes like per socket, per core etc. The function "aggr_print= out"
>> in util/stat-display.c which is used to print the aggregates,
>> has a check for cpu in case of AGGR_NONE. This check was
>> originally using condition : "if (id.cpu.cpu > -1)". = But
>> this got changed after commit df936cadfb58 ("perf stat: Add >> JSON output option"), which added option to output json forma= t
>> for different aggregation modes. After this commit, the
>> check in "aggr_printout" is using "if (id.core >= -1)".
>>
>> The old code was using "id.cpu.cpu > -1" while the ne= w code
>> is using "id.core > -1". But since the value printed = is
>> id.cpu.cpu, fix this check to use cpu and not core.
>>
>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.= com>
>> Suggested-by: James Clark <james.clark@arm.com>
>> Suggested-by: Ian Rogers <irogers@google.com>
>
> The change below works on my dual socket SkylakeX:
> ..
> 85: perf stat CSV output linter=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0:
> Ok
> 86: perf stat csv summary test=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 : Ok
> 87: perf stat JSON output linter=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 : Ok
> ..
> I don't see anything else out of the ordinary.
>
> Thanks,
> Ian
>

Hi Ian,
Thanks for helping with testing. Can I add your Tested-by for the patch ?

Ye= p.

Tested-by: Ian Rogers= <irogers@google.com>
=

Thanks,
Ian

Arnaldo,
Please suggest if I have to send as separate patch for the cpu check fix pa= tch pasted above:
=C2=A0"tools/perf: Fix cpu check to use id.cpu.cpu in aggr_printout=E2= =80=9D

Thanks
Athira
>> ---
>>=C2=A0 tools/perf/util/stat-display.c | 4 ++--
>>=C2=A0 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/util/stat-display.c
>> b/tools/perf/util/stat-display.c
>> index b82844cb0ce7..cf28020798ec 100644
>> --- a/tools/perf/util/stat-display.c
>> +++ b/tools/perf/util/stat-display.c
>> @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_con= fig
>> *config,
>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 i= d.socket,
>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 i= d.die,
>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 i= d.core);
>> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0} else if (id.core > -1) {
>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0} else if (id.cpu.cpu > -1) {
>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 fprintf(config->output, &q= uot;\"cpu\" : \"%d\", ",
>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 i= d.cpu.cpu);
>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 }
>> @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_con= fig
>> *config,
>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 i= d.die,
>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 c= onfig->csv_output ? 0 : -3,
>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 i= d.core, config->csv_sep);
>> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0} else if (id.core > -1) {
>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0} else if (id.cpu.cpu > -1) {
>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 fprintf(config->output, &q= uot;CPU%*d%s",
>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 c= onfig->csv_output ? 0 : -7,
>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 i= d.cpu.cpu, config->csv_sep);
>> --
>> 2.31.1
>>
>> Can you suggest or help to test this patch change.
>>
>> To address the test failure, as James suggested, I will handle fix= in
>> testcases and post them
>> as a separate patch. Plan is to add a sanity check in the tests to= see
>> if the "physical_packagge_id" ( ie socket id ) in topolo= gy points to -1
>> and if so skip the test. Also in parallel, checking to see how we = can
>> handle the aggregation modes to work incase of "-1" valu= e for socket or
>> die
>>
>> Thanks
>> Athira
>>
>>> Thanks,
>>> Ian
>>>
>>>> Or the one about fixing it in the test instead? Or failing= early if
>>>> the
>>>> topology can't be read?
>>>>
>>>> I'm still not convinced that any of the modes where -1= is printed are
>>>> even working properly so it might be best to fix that rath= er than just
>>>> the printout.
>>>>
>>>>> James, can you point me to reference for that meaning = if I have missed anything.
>>>>
>>>> It's here:
>>>>
>>>>=C2=A0 /** Identify where counts are aggregated, -1 implies= not to
>>>> aggregate. */
>>>>=C2=A0 struct aggr_cpu_id {
>>>>
>>>>>
>>>>> Thanks
>>>>> Athira
>>>>>
>>>>>>
>>>>>>>
>>>>>>> After the fix:
>>>>>>>
>>>>>>> <<>>
>>>>>>> perf stat -e instructions -A -a true
>>>>>>>
>>>>>>> Performance counter stats for 'system wide= ':
>>>>>>>
>>>>>>> CPU-1=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 64,034=C2=A0 =C2=A0 =C2=A0 instructions
>>>>>>> CPU-1=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 68,941=C2=A0 =C2=A0 =C2=A0 instructions
>>>>>>> CPU-1=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 59,418=C2=A0 =C2=A0 =C2=A0 instructions
>>>>>>> CPU-1=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 70,478=C2=A0 =C2=A0 =C2=A0 instructions
>>>>>>> CPU-1=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 65,201=C2=A0 =C2=A0 =C2=A0 instructions
>>>>>>> CPU-1=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 63,704=C2=A0 =C2=A0 =C2=A0 instructions
>>>>>>> <<>>
>>>>>>>
>>>>>>> This is caught while running "perf test&q= uot; for
>>>>>>> "stat+json_output.sh" and "stat= +csv_output.sh".
>>>>>>
>>>>>> Is it possible to fix the issue by making the test= s cope with the lack
>>>>>> of the CPU id?
>>>>>>
>>>>>>>
>>>>>>> Reported-by: Disha Goel <disgoel@li= nux.vnet.ibm.com>
>>>>>>> Signed-off-by: Athira Rajeev <atra= jeev@linux.vnet.ibm.com>
>>>>>>> ---
>>>>>>> tools/perf/util/stat-display.c | 6 ++----
>>>>>>> 1 file changed, 2 insertions(+), 4 deletions(-= )
>>>>>>>
>>>>>>> diff --git a/tools/perf/util/stat-display.c b/= tools/perf/util/stat-display.c
>>>>>>> index b82844cb0ce7..1b751a730271 100644
>>>>>>> --- a/tools/perf/util/stat-display.c
>>>>>>> +++ b/tools/perf/util/stat-display.c
>>>>>>> @@ -168,10 +168,9 @@ static void aggr_printout= (struct perf_stat_config *config,
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 id.socket,
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 id.die,
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 id.core);
>>>>>>> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0} else if (id.core > -1) {
>>>>>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0} else
>>>>>>
>>>>>> This should have been "id.cpu.cpu > -1&quo= t;. Looks like it was changed by
>>>>>> some kind of bad merge or rebase in df936cadfb bec= ause there is no
>>>>>> obvious justification for the change to .core in t= hat commit.
>>>>>
>>>>>>
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 fprintf(config->out= put, "\"cpu\" : \"%d\", ",
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 id.cpu.cpu);
>>>>>>> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0}
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 } els= e {
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 if (evsel->percore && !config->perco= re_show_thread) {
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 fprintf(config->out= put, "S%d-D%d-C%*d%s",
>>>>>>> @@ -179,11 +178,10 @@ static void aggr_printou= t(struct perf_stat_config *config,
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 id.die,
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 config->csv_output ? 0 : -3,
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 id.core, config->csv_sep);
>>>>>>> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0} else if (id.core > -1) {
>>>>>>> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0} else
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 fprintf(config->out= put, "CPU%*d%s",
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 config->csv_output ? 0 : -7,
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 id.cpu.cpu, config->csv_sep);
>>>>>>> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0}
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 }
>>>>>>>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 break= ;
>>>>>>>=C2=A0 =C2=A0 case AGGR_THREAD:
>>>>>

--0000000000007de18005ea369141--