From: "Jin, Yao" <yao.jin@linux.intel.com>
To: Arnaldo Carvalho de Melo <acme@kernel.org>, kan.liang@linux.intel.com
Cc: peterz@infradead.org, mingo@kernel.org,
	linux-kernel@vger.kernel.org, tglx@linutronix.de, bp@alien8.de,
	namhyung@kernel.org, jolsa@redhat.com, ak@linux.intel.com,
	alexander.shishkin@linux.intel.com, adrian.hunter@intel.com,
	"Jin, Yao" <yao.jin@intel.com>
Subject: Re: [PATCH 43/49] perf stat: Add default hybrid events
Date: Tue, 9 Feb 2021 08:36:36 +0800
Message-ID: <1c87bd51-949c-cc3e-2726-8da5d504eb16@linux.intel.com>
In-Reply-To: <20210208191011.GO920417@kernel.org>

Hi Arnaldo,

On 2/9/2021 3:10 AM, Arnaldo Carvalho de Melo wrote:
> Em Mon, Feb 08, 2021 at 07:25:40AM -0800, kan.liang@linux.intel.com escreveu:
>> From: Jin Yao <yao.jin@linux.intel.com>
>>
>> Currently, if '-e' is not specified, perf stat adds some software
>> events and hardware events to the evlist by default.
>>
>> root@otcpl-adl-s-2:~# ./perf stat  -- ./triad_loop
>>
>>   Performance counter stats for './triad_loop':
>>
>>              109.43 msec task-clock                #    0.993 CPUs utilized
>>                   1      context-switches          #    0.009 K/sec
>>                   0      cpu-migrations            #    0.000 K/sec
>>                 105      page-faults               #    0.960 K/sec
>>         401,161,982      cycles                    #    3.666 GHz
>>       1,601,216,357      instructions              #    3.99  insn per cycle
>>         200,217,751      branches                  # 1829.686 M/sec
>>              14,555      branch-misses             #    0.01% of all branches
>>
>>         0.110176860 seconds time elapsed
>>
>> Among the events, cycles, instructions, branches and branch-misses
>> are hardware events.
>>
>> On a hybrid platform, two events are created for each hardware event:
>>
>> core cycles,
>> atom cycles,
>> core instructions,
>> atom instructions,
>> core branches,
>> atom branches,
>> core branch-misses,
>> atom branch-misses
>>
>> These events are added to the evlist in this order on a hybrid
>> platform if '-e' is not set.
>>
>> Since parse_events() already supports creating two hardware events
>> for one event on a hybrid platform, we just use parse_events(evlist,
>> "cycles,instructions,branches,branch-misses") to create the default
>> events and add them to the evlist.
>>
>> After:
>> root@otcpl-adl-s-2:~# ./perf stat -vv -- taskset -c 16 ./triad_loop
>> ...
>> ------------------------------------------------------------
>> perf_event_attr:
>>    type                             1
>>    size                             120
>>    config                           0x1
>>    sample_type                      IDENTIFIER
>>    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>>    disabled                         1
>>    inherit                          1
>>    enable_on_exec                   1
>>    exclude_guest                    1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954  cpu -1  group_fd -1  flags 0x8 = 3
>> ------------------------------------------------------------
>> perf_event_attr:
>>    type                             1
>>    size                             120
>>    config                           0x3
>>    sample_type                      IDENTIFIER
>>    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>>    disabled                         1
>>    inherit                          1
>>    enable_on_exec                   1
>>    exclude_guest                    1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954  cpu -1  group_fd -1  flags 0x8 = 4
>> ------------------------------------------------------------
>> perf_event_attr:
>>    type                             1
>>    size                             120
>>    config                           0x4
>>    sample_type                      IDENTIFIER
>>    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>>    disabled                         1
>>    inherit                          1
>>    enable_on_exec                   1
>>    exclude_guest                    1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954  cpu -1  group_fd -1  flags 0x8 = 5
>> ------------------------------------------------------------
>> perf_event_attr:
>>    type                             1
>>    size                             120
>>    config                           0x2
>>    sample_type                      IDENTIFIER
>>    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>>    disabled                         1
>>    inherit                          1
>>    enable_on_exec                   1
>>    exclude_guest                    1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954  cpu -1  group_fd -1  flags 0x8 = 7
>> ------------------------------------------------------------
>> perf_event_attr:
>>    type                             6
>>    size                             120
>>    config                           0x400000000
>>    sample_type                      IDENTIFIER
>>    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>>    disabled                         1
>>    inherit                          1
>>    enable_on_exec                   1
>>    exclude_guest                    1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954  cpu -1  group_fd -1  flags 0x8 = 8
>> ------------------------------------------------------------
>> perf_event_attr:
>>    type                             6
>>    size                             120
>>    config                           0xa00000000
>>    sample_type                      IDENTIFIER
>>    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>>    disabled                         1
>>    inherit                          1
>>    enable_on_exec                   1
>>    exclude_guest                    1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954  cpu -1  group_fd -1  flags 0x8 = 9
>> ------------------------------------------------------------
>> perf_event_attr:
>>    type                             6
>>    size                             120
>>    config                           0x400000001
>>    sample_type                      IDENTIFIER
>>    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>>    disabled                         1
>>    inherit                          1
>>    enable_on_exec                   1
>>    exclude_guest                    1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954  cpu -1  group_fd -1  flags 0x8 = 10
>> ------------------------------------------------------------
>> perf_event_attr:
>>    type                             6
>>    size                             120
>>    config                           0xa00000001
>>    sample_type                      IDENTIFIER
>>    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>>    disabled                         1
>>    inherit                          1
>>    enable_on_exec                   1
>>    exclude_guest                    1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954  cpu -1  group_fd -1  flags 0x8 = 11
>> ------------------------------------------------------------
>> perf_event_attr:
>>    type                             6
>>    size                             120
>>    config                           0x400000004
>>    sample_type                      IDENTIFIER
>>    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>>    disabled                         1
>>    inherit                          1
>>    enable_on_exec                   1
>>    exclude_guest                    1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954  cpu -1  group_fd -1  flags 0x8 = 12
>> ------------------------------------------------------------
>> perf_event_attr:
>>    type                             6
>>    size                             120
>>    config                           0xa00000004
>>    sample_type                      IDENTIFIER
>>    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>>    disabled                         1
>>    inherit                          1
>>    enable_on_exec                   1
>>    exclude_guest                    1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954  cpu -1  group_fd -1  flags 0x8 = 13
>> ------------------------------------------------------------
>> perf_event_attr:
>>    type                             6
>>    size                             120
>>    config                           0x400000005
>>    sample_type                      IDENTIFIER
>>    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>>    disabled                         1
>>    inherit                          1
>>    enable_on_exec                   1
>>    exclude_guest                    1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954  cpu -1  group_fd -1  flags 0x8 = 14
>> ------------------------------------------------------------
>> perf_event_attr:
>>    type                             6
>>    size                             120
>>    config                           0xa00000005
>>    sample_type                      IDENTIFIER
>>    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>>    disabled                         1
>>    inherit                          1
>>    enable_on_exec                   1
>>    exclude_guest                    1
>> ------------------------------------------------------------
>> ...
>>
>>   Performance counter stats for 'taskset -c 16 ./triad_loop':
>>
>>              201.31 msec task-clock                #    0.997 CPUs utilized
>>                   1      context-switches          #    0.005 K/sec
>>                   1      cpu-migrations            #    0.005 K/sec
>>                 166      page-faults               #    0.825 K/sec
>>         623,267,134      cycles                    # 3096.043 M/sec                    (0.16%)
>>         603,082,383      cycles                    # 2995.777 M/sec                    (99.84%)
>>         406,410,481      instructions              # 2018.820 M/sec                    (0.16%)
>>       1,604,213,375      instructions              # 7968.837 M/sec                    (99.84%)
>>          81,444,171      branches                  #  404.569 M/sec                    (0.16%)
>>         200,616,430      branches                  #  996.550 M/sec                    (99.84%)
>>           3,769,856      branch-misses             #   18.727 M/sec                    (0.16%)
>>              16,111      branch-misses             #    0.080 M/sec                    (99.84%)
>>
>>         0.201895853 seconds time elapsed
>>
>> We can see that two events are created for each hardware event.
>> The first one is the core event and the second one is the atom event.
> 
> Can we have that (core/atom) as a prefix or in the comment area?
>

The next patch, "perf stat: Uniquify hybrid event name", tells the user which PMU each event
belongs to.

For example, I run triad_loop on a core CPU:

root@ssp-pwrt-002:# ./perf stat -- taskset -c 0 ./triad_loop

  Performance counter stats for 'taskset -c 0 ./triad_loop':

             287.87 msec task-clock                #    0.990 CPUs utilized
                 30      context-switches          #    0.104 K/sec
                  1      cpu-migrations            #    0.003 K/sec
                168      page-faults               #    0.584 K/sec
        450,089,808      cycles [cpu_core]         # 1563.496 M/sec
      <not counted>      cycles [cpu_atom]                                             (0.00%)
      1,602,536,074      instructions [cpu_core]   # 5566.797 M/sec
      <not counted>      instructions [cpu_atom]                                       (0.00%)
        200,474,560      branches [cpu_core]       #  696.397 M/sec
      <not counted>      branches [cpu_atom]                                           (0.00%)
             23,002      branch-misses [cpu_core]  #    0.080 M/sec
      <not counted>      branch-misses [cpu_atom]                                      (0.00%)

We can see that the cpu_atom events are not counted.
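
As a side note on telling the two PMUs apart in the -vv output quoted above: the hybrid events use
type 6 and a config such as 0x400000000 or 0xa00000000. Judging from those values, the upper 32 bits
appear to carry the PMU type id and the low bits the generic hardware event id. The sketch below
decodes a config under that assumption. The helper names are hypothetical (they are not taken from
the patch), and the PMU type ids 4 and 10 are only what this machine reported, since PMU type ids
are assigned dynamically.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed layout, inferred from the -vv output in this thread:
 *   config = (pmu_type << 32) | hw_event_id
 * HYBRID_PMU_SHIFT and the helpers below are illustrative names only.
 */
#define HYBRID_PMU_SHIFT 32

/* Upper 32 bits: the PMU type id (4 and 10 in the quoted output). */
static inline uint32_t hybrid_pmu_type(uint64_t config)
{
	return (uint32_t)(config >> HYBRID_PMU_SHIFT);
}

/* Lower 32 bits: the generic hardware event id. */
static inline uint32_t hybrid_hw_id(uint64_t config)
{
	return (uint32_t)(config & 0xffffffffULL);
}

/* Build a config from a PMU type id and a hardware event id. */
static inline uint64_t hybrid_make_config(uint32_t pmu_type, uint64_t hw_id)
{
	assert(hw_id <= UINT32_MAX); /* the id must fit in the low 32 bits */
	return ((uint64_t)pmu_type << HYBRID_PMU_SHIFT) | hw_id;
}

/*
 * Names of the four default hardware events. The ids match the
 * PERF_COUNT_HW_* values in linux/perf_event.h.
 */
static inline const char *hybrid_hw_name(uint32_t hw_id)
{
	switch (hw_id) {
	case 0: return "cycles";        /* PERF_COUNT_HW_CPU_CYCLES */
	case 1: return "instructions";  /* PERF_COUNT_HW_INSTRUCTIONS */
	case 4: return "branches";      /* PERF_COUNT_HW_BRANCH_INSTRUCTIONS */
	case 5: return "branch-misses"; /* PERF_COUNT_HW_BRANCH_MISSES */
	default: return "unknown";
	}
}
```

With the values from the output, 0x400000001 decodes to PMU type 4 and event id 1 (instructions),
and 0xa00000005 decodes to PMU type 10 and event id 5 (branch-misses).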

Thanks
Jin Yao

>> One thing is that the shadow stats look a bit different; now they
>> are just 'M/sec'.
>>
>> perf_stat__update_shadow_stats and perf_stat__print_shadow_stats
>> need to be improved in the future if we want to get the original
>> shadow stats back.
>>
>> Reviewed-by: Andi Kleen <ak@linux.intel.com>
>> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>> ---
>>   tools/perf/builtin-stat.c | 22 ++++++++++++++++++++++
>>   1 file changed, 22 insertions(+)
>>
>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>> index 44d1a5f..0b08665 100644
>> --- a/tools/perf/builtin-stat.c
>> +++ b/tools/perf/builtin-stat.c
>> @@ -1137,6 +1137,13 @@ static int parse_hybrid_type(const struct option *opt,
>>   	return 0;
>>   }
>>   
>> +static int add_default_hybrid_events(struct evlist *evlist)
>> +{
>> +	struct parse_events_error err;
>> +
>> +	return parse_events(evlist, "cycles,instructions,branches,branch-misses", &err);
>> +}
>> +
>>   static struct option stat_options[] = {
>>   	OPT_BOOLEAN('T', "transaction", &transaction_run,
>>   		    "hardware transaction statistics"),
>> @@ -1613,6 +1620,12 @@ static int add_default_attributes(void)
>>     { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES		},
>>   
>>   };
>> +	struct perf_event_attr default_sw_attrs[] = {
>> +  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK		},
>> +  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES	},
>> +  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS		},
>> +  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS		},
>> +};
>>   
>>   /*
>>    * Detailed stats (-d), covering the L1 and last level data caches:
>> @@ -1849,6 +1862,15 @@ static int add_default_attributes(void)
>>   	}
>>   
>>   	if (!evsel_list->core.nr_entries) {
>> +		perf_pmu__scan(NULL);
>> +		if (perf_pmu__hybrid_exist()) {
>> +			if (evlist__add_default_attrs(evsel_list,
>> +						      default_sw_attrs) < 0) {
>> +				return -1;
>> +			}
>> +			return add_default_hybrid_events(evsel_list);
>> +		}
>> +
>>   		if (target__has_cpu(&target))
>>   			default_attrs0[0].config = PERF_COUNT_SW_CPU_CLOCK;
>>   
>> -- 
>> 2.7.4
>>
> 
