From: Alistair Francis <alistair23@gmail.com>
To: Atish Patra <atishp@rivosinc.com>
Cc: Alistair Francis <alistair.francis@wdc.com>,
	Bin Meng <bin.meng@windriver.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	"qemu-devel@nongnu.org Developers" <qemu-devel@nongnu.org>,
	"open list:RISC-V" <qemu-riscv@nongnu.org>
Subject: Re: [PATCH v5 00/12] Improve PMU support
Date: Thu, 3 Mar 2022 13:20:58 +1000
Message-ID: <CAKmqyKONw4O7Wf+uV34vbAhzMRY7oOfNWHhYoAGxYEXsy+Ju=A@mail.gmail.com>
In-Reply-To: <20220219002518.1936806-1-atishp@rivosinc.com>

On Sat, Feb 19, 2022 at 10:26 AM Atish Patra <atishp@rivosinc.com> wrote:
>
> The latest version of the SBI specification includes a Performance Monitoring
> Unit (PMU) extension[1] which allows the supervisor to start/stop/configure
> various PMU events. The Sscofpmf ('Ss' for Privileged arch and Supervisor-level
> extensions, and 'cofpmf' for Count OverFlow and Privilege Mode Filtering)
> extension[2] allows perf-like tools to handle counter overflow interrupts and
> to filter events by privilege mode.
>
> This series implements the full PMU infrastructure needed to support the
> PMU in the virt machine. This will allow us to add more PMU events in the future.
>
> Currently, this series enables the following PMU events:
> 1. cycle count
> 2. instruction count
> 3. DTLB load/store miss
> 4. ITLB prefetch miss
>
> The first two are computed using host ticks while the last three are counted during
> cpu_tlb_fill. Both sampling and counting work from guest userspace.
> This series has been tested on both RV64 and RV32. Both Linux[3] and OpenSBI[4]
> patches are required to get perf working.
>
> Here is the output of perf stat/report while running hackbench with the OpenSBI & Linux
> kernel patches applied [3].
>
> Perf stat:
> ==========
> [root@fedora-riscv ~]# perf stat -e cycles -e instructions -e dTLB-load-misses -e dTLB-store-misses -e iTLB-load-misses \
> > perf bench sched messaging -g 1 -l 10
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 1 groups == 40 processes run
>
>      Total time: 0.265 [sec]
>
>  Performance counter stats for 'perf bench sched messaging -g 1 -l 10':
>
>      4,167,825,362      cycles
>      4,166,609,256      instructions              #    1.00  insn per cycle
>          3,092,026      dTLB-load-misses
>            258,280      dTLB-store-misses
>          2,068,966      iTLB-load-misses
>
>        0.585791767 seconds time elapsed
>
>        0.373802000 seconds user
>        1.042359000 seconds sys
>
> Perf record:
> ============
> [root@fedora-riscv ~]# perf record -e cycles -e instructions \
> > -e dTLB-load-misses -e dTLB-store-misses -e iTLB-load-misses -c 10000 \
> > perf bench sched messaging -g 1 -l 10
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 1 groups == 40 processes run
>
>      Total time: 1.397 [sec]
> [ perf record: Woken up 10 times to write data ]
> Check IO/CPU overload!
> [ perf record: Captured and wrote 8.211 MB perf.data (214486 samples) ]
>
> [root@fedora-riscv riscv]# perf report
> Available samples
> 107K cycles                                                                    ◆
> 107K instructions                                                              ▒
> 250 dTLB-load-misses                                                           ▒
> 13 dTLB-store-misses                                                           ▒
> 172 iTLB-load-misses
> ..
>
> Changes from v4->v5:
> 1. Rebased on top of -next with the following patches:
>    - isa extension
>    - priv 1.12 spec
> 2. Addressed all the comments on v4
> 3. Removed additional isa-ext DT node in favor of riscv,isa string update
>
> Changes from v3->v4:
> 1. Removed the dummy events from pmu DT node.
> 2. Fixed pmu_avail_counters mask generation.
> 3. Added a patch to simplify the predicate function for counters.
>
> Changes from v2->v3:
> 1. Addressed all the comments on PATCH1-4.
> 2. Split patch1 into two separate patches.
> 3. Added explicit comments to explain the event types in DT node.
> 4. Rebased on the latest QEMU.
>
> Changes from v1->v2:
> 1. Dropped the Acks from v1 as significant changes happened after v1.
> 2. sscofpmf support.
> 3. A generic counter management framework.
>
> [1] https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/riscv-sbi.adoc
> [2] https://drive.google.com/file/d/171j4jFjIkKdj5LWcExphq4xG_2sihbfd/edit
> [3] https://github.com/atishp04/linux/tree/riscv_pmu_v6
> [4] https://github.com/atishp04/qemu/tree/riscv_pmu_v5
>
> Atish Patra (12):
> target/riscv: Fix PMU CSR predicate function
> target/riscv: Implement PMU CSR predicate function for S-mode
> target/riscv: pmu: Rename the counters extension to pmu

I have applied the first 3 patches

Alistair

> target/riscv: pmu: Make number of counters configurable
> target/riscv: Implement mcountinhibit CSR
> target/riscv: Add support for hpmcounters/hpmevents
> target/riscv: Support mcycle/minstret write operation
> target/riscv: Add sscofpmf extension support
> target/riscv: Simplify counter predicate function
> target/riscv: Add few cache related PMU events
> hw/riscv: virt: Add PMU DT node to the device tree
> target/riscv: Update the privilege field for sscofpmf CSRs
>
> hw/riscv/virt.c           |  28 ++
> target/riscv/cpu.c        |  15 +-
> target/riscv/cpu.h        |  49 ++-
> target/riscv/cpu_bits.h   |  59 +++
> target/riscv/cpu_helper.c |  26 ++
> target/riscv/csr.c        | 862 ++++++++++++++++++++++++++++----------
> target/riscv/machine.c    |  25 ++
> target/riscv/meson.build  |   1 +
> target/riscv/pmu.c        | 431 +++++++++++++++++++
> target/riscv/pmu.h        |  37 ++
> 10 files changed, 1303 insertions(+), 230 deletions(-)
> create mode 100644 target/riscv/pmu.c
> create mode 100644 target/riscv/pmu.h
>
> --
> 2.30.2
>
>
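
As a rough cross-check of the perf output quoted above (this is not part of the series), the sketch below recomputes the "insn per cycle" figure and a couple of derived rates from the quoted counter values. The function names are made up for illustration. The event estimate assumes that, with `perf record -c 10000`, each sample stands for about 10000 events. Since the stat and record runs were separate, the estimates are only expected to be of the same order as the stat counts.

```python
# Counter values copied from the quoted `perf stat` output.
PERF_STAT = {
    "cycles": 4_167_825_362,
    "instructions": 4_166_609_256,
    "dTLB-load-misses": 3_092_026,
    "dTLB-store-misses": 258_280,
    "iTLB-load-misses": 2_068_966,
}

# Sample counts copied from the quoted `perf report` summary.
# The "107K" entries are rounded in the report, so they are left out here.
PERF_REPORT_SAMPLES = {
    "dTLB-load-misses": 250,
    "dTLB-store-misses": 13,
    "iTLB-load-misses": 172,
}

SAMPLE_PERIOD = 10_000  # from `perf record ... -c 10000`


def insn_per_cycle(stats):
    """Instructions retired per cycle, as shown next to 'instructions'."""
    return stats["instructions"] / stats["cycles"]


def misses_per_kilo_insn(stats, event):
    """Number of `event` occurrences per 1000 instructions."""
    return 1000.0 * stats[event] / stats["instructions"]


def estimated_events(samples, period):
    """Approximate event count behind `samples` taken every `period` events."""
    return samples * period


if __name__ == "__main__":
    print(f"insn per cycle: {insn_per_cycle(PERF_STAT):.2f}")
    for event, samples in PERF_REPORT_SAMPLES.items():
        mpki = misses_per_kilo_insn(PERF_STAT, event)
        estimate = estimated_events(samples, SAMPLE_PERIOD)
        print(f"{event}: {mpki:.3f} per 1000 insns, "
              f"~{estimate} events in the record run")
```

The near-1.00 ratio is consistent with the cover letter's statement that both the cycle and instruction counts are derived from host ticks.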

