From: John Garry <john.garry@huawei.com>
To: Ian Rogers <irogers@google.com>, Will Deacon <will@kernel.org>,
	"James Clark" <james.clark@arm.com>,
	Mike Leach <mike.leach@linaro.org>, Leo Yan <leo.yan@linaro.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,
	"Namhyung Kim" <namhyung@kernel.org>,
	Andi Kleen <ak@linux.intel.com>,
	Zhengjun Xing <zhengjun.xing@linux.intel.com>,
	Ravi Bangoria <ravi.bangoria@amd.com>,
	"Kan Liang" <kan.liang@linux.intel.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	<linux-kernel@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-perf-users@vger.kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Subject: Re: [PATCH v3 00/17] Compress the pmu_event tables
Date: Fri, 29 Jul 2022 16:03:36 +0100
Message-ID: <d8356ddc-56e7-7324-5330-ff2bd54bcba4@huawei.com>
In-Reply-To: <20220729074351.138260-1-irogers@google.com>

On 29/07/2022 08:43, Ian Rogers wrote:
> jevents.py creates a number of large arrays from the json events. The
> arrays contain pointers to strings that need relocating. The
> relocations have file size, run time and memory costs. These changes
> refactor the pmu_events API so that the storage of the pmu_event
> struct isn't exposed. The format is then changed to an offset within a
> combined big string, with adjacent pmu_event struct variables being
> next to each other in the string separated by \0 - meaning only the
> first variable of the struct needs its offset recording.
> 
> Some related fixes are contained with the patches. The architecture
> jevents.py creates tables for can now be set by the JEVENTS_ARCH make
> variable, with a new 'all' that generates the events and metrics for
> all architectures.

Hi Ian,

I am going through this series currently.

But I just wanted to mention my idea again on how to compress the 
tables. Maybe you thought that there was no value in my idea or didn't 
get it, but I'll mention it again just in case...

Background:
There is much duplication in events between cores. And currently we have 
something like this:

pmu-events/pmu-events.c:
struct pmu_event core0[] = {
{
	.name = "event0",
	.event = "event=0x0",
},
{
	.name = "event1",
	.event = "event=0x1",
},
{
	.name = "event2",
	.event = "event=0x2",
	.desc = "event2 common desc",
},
...
};

struct pmu_event core1[] = {
{
	.name = "event0",
	.event = "event=0x0",
},
{
	.name = "event1",
	.event = "event=0x1",
},
{
	.name = "event2",
	.event = "event=0x2",
	.desc = "event2 desc for core1",
},
...
};


struct pmu_events_map map[] = {
{
	.cpuid = "0000",
	.table = core0,
},
{
	.cpuid = "0001",
	.table = core1,
},
...
};

If you check the broadwell and broadwellde frontend.json files, you will
notice that they are identical, which is an extreme example of duplication.

Proposal for change:
Make each entry in the per-core pmu event table point to a common event.
Each common event is unique, and each per-core entry points to one of
them. So if 2x cores have the same event but with a small difference (e.g.
a different description), then there would still be 2x common events.

pmu-events/pmu-events.c:
struct pmu_event common_events[] = {
{
	.name = "event0",
	.event = "event=0x0",
},
{
	.name = "event1",
	.event = "event=0x1",
},
{
	.name = "event2",
	.event = "event=0x2",
	.desc = "event2 common desc",
},
{
	.name = "event2",
	.event = "event=0x2",
	.desc = "event2 desc for core1",
},
...
};

struct pmu_event_ptr {
	struct pmu_event *pmu_event;
};

struct pmu_event_ptr core0[] = {
{
	.pmu_event = &common_events[0],
},
{
	.pmu_event = &common_events[1],
},
{
	.pmu_event = &common_events[2],
},
...
};

struct pmu_event_ptr core1[] = {
{
	.pmu_event = &common_events[0],
},
{
	.pmu_event = &common_events[1],
},
{
	.pmu_event = &common_events[3],
},
...
};

struct pmu_events_map map[] = {
{
	.cpuid = "0000",
	.table = core0,
},
{
	.cpuid = "0001",
	.table = core1,
},
...
};

For x86, the first step in JSON parsing would be to go through the JSON
files and compile a list of unique events. The second step would be to
process each per-core JSON to create the pmu events table, using the
common events. Using a per-common-event hash would make the lookup quicker.
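
To illustrate the two steps, here is a minimal Python sketch. It is not
taken from jevents.py: the FIELDS tuple, event_key() and build_tables()
are hypothetical names, only three of the pmu_event members are modelled,
and the per-core events are given inline instead of being read from the
JSON files. A dict keyed on the full event tuple plays the role of the
per-common-event hash.

```python
from typing import Dict, List, Tuple

# Hypothetical subset of pmu_event members; the real struct has more.
FIELDS = ("name", "event", "desc")

EventKey = Tuple[str, ...]


def event_key(ev: dict) -> EventKey:
    """Two events share a common entry only if every field matches."""
    return tuple(ev.get(f, "") for f in FIELDS)


def build_tables(per_core: Dict[str, List[dict]]):
    common: List[EventKey] = []       # unique events, in first-seen order
    index: Dict[EventKey, int] = {}   # hash lookup: event -> slot in common

    # Step 1: compile the list of unique events across all cores.
    for events in per_core.values():
        for ev in events:
            key = event_key(ev)
            if key not in index:
                index[key] = len(common)
                common.append(key)

    # Step 2: each per-core table only stores indices into common.
    tables = {
        core: [index[event_key(ev)] for ev in events]
        for core, events in per_core.items()
    }
    return common, tables


# The core0/core1 example from above.
cores = {
    "core0": [
        {"name": "event0", "event": "event=0x0"},
        {"name": "event1", "event": "event=0x1"},
        {"name": "event2", "event": "event=0x2", "desc": "event2 common desc"},
    ],
    "core1": [
        {"name": "event0", "event": "event=0x0"},
        {"name": "event1", "event": "event=0x1"},
        {"name": "event2", "event": "event=0x2", "desc": "event2 desc for core1"},
    ],
}

common, tables = build_tables(cores)
for core, slots in tables.items():
    print(core, slots)
```

For the example above this gives four common events, with core0 and core1
sharing the entries for event0 and event1 and differing only in which
event2 entry they reference. Emitting indices rather than pointers is a
choice made for the sketch; it would also avoid relocations for the
per-core tables.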

I'm not sure what you think. From the figures below you seem to be saving
~20% at best - I would guess (with a capital G) that my method could
save a lot more.

As sketched, this implementation would require core pmu.c to be changed,
but there are ways that this could be done without needing to change core
pmu.c.

Thanks,
John

> 
> An example of the improvement to the file size on x86 is:
> no jevents - the same 19,788,464bytes
> x86 jevents - ~16.7% file size saving 23,744,288bytes vs 28,502,632bytes
> all jevents - ~19.5% file size saving 24,469,056bytes vs 30,379,920bytes
> default build options plus NO_LIBBFD=1.
> 
> I originally suggested fixing this problem in:
> https://lore.kernel.org/linux-perf-users/CAP-5=fVB8G4bdb9T=FncRTh9oBVKCS=+=eowAO+YSgAhab+Dtg@mail.gmail.com/
> 
> v3. Fix an ARM build issue with a missed weak symbol. Perform some
>      pytype clean up.
> v2. Split the substring folding optimization to its own patch and
>      comment tweaks as suggested by Namhyung Kim
>      <namhyung@kernel.org>. Recompute the file size savings with the
>      latest json events and metrics.
> 
> Ian Rogers (17):
>    perf jevents: Clean up pytype warnings
>    perf jevents: Simplify generation of C-string
>    perf jevents: Add JEVENTS_ARCH make option
>    perf jevent: Add an 'all' architecture argument
>    perf jevents: Remove the type/version variables
>    perf jevents: Provide path to json file on error
>    perf jevents: Sort json files entries
>    perf pmu-events: Hide pmu_sys_event_tables
>    perf pmu-events: Avoid passing pmu_events_map
>    perf pmu-events: Hide pmu_events_map
>    perf test: Use full metric resolution
>    perf pmu-events: Move test events/metrics to json
>    perf pmu-events: Don't assume pmu_event is an array
>    perf pmu-events: Hide the pmu_events
>    perf metrics: Copy entire pmu_event in find metric
>    perf jevents: Compress the pmu_events_table
>    perf jevents: Fold strings optimization
> 
>   tools/perf/arch/arm64/util/pmu.c              |   4 +-
>   tools/perf/pmu-events/Build                   |   6 +-
>   .../arch/test/test_soc/cpu/metrics.json       |  64 +++
>   tools/perf/pmu-events/empty-pmu-events.c      | 204 +++++++-
>   tools/perf/pmu-events/jevents.py              | 495 ++++++++++++++----
>   tools/perf/pmu-events/pmu-events.h            |  40 +-
>   tools/perf/tests/expand-cgroup.c              |  25 +-
>   tools/perf/tests/parse-metric.c               |  77 +--
>   tools/perf/tests/pmu-events.c                 | 466 +++++++----------
>   tools/perf/util/metricgroup.c                 | 275 ++++++----
>   tools/perf/util/metricgroup.h                 |   5 +-
>   tools/perf/util/pmu.c                         | 139 ++---
>   tools/perf/util/pmu.h                         |   8 +-
>   tools/perf/util/s390-sample-raw.c             |  50 +-
>   14 files changed, 1140 insertions(+), 718 deletions(-)
>   create mode 100644 tools/perf/pmu-events/arch/test/test_soc/cpu/metrics.json
> 

